segment-ology

Success With Small Segments

Featured

Posted on July 19, 2026 by Jim Bartlett

Time out for an update on small segments. I have a little over 100,000 Matches at AncestryDNA; and 9441 rows in my spreadsheet of Ancestry Matches with Common Ancestors – it’s roughly 9%, so far. Of the 9441, 4281 are in the 6cM to 10cM range; and another 2,434 are in the 11cM to 15cM range. So… depending on how you characterize “small” segments, from 45% to 71% of my “success” has been with cousins in the small segment range.
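For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The variable names are mine, and 100,000 is only a stand-in for "a little over 100,000" Matches.

```python
# Counts quoted in the paragraph above.
total_matches = 100_000   # assumption: stand-in for "a little over 100,000"
rows_with_ca = 9441       # spreadsheet rows with a Common Ancestor
in_6_to_10 = 4281         # Matches sharing 6cM to 10cM
in_11_to_15 = 2434        # Matches sharing 11cM to 15cM


def pct(part, whole):
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


share_with_ca = pct(rows_with_ca, total_matches)
small_narrow = pct(in_6_to_10, rows_with_ca)               # "small" = 6-10cM
small_wide = pct(in_6_to_10 + in_11_to_15, rows_with_ca)   # "small" = 6-15cM

print(f"Matches with a CA: about {share_with_ca}%")
print(f"Small segments: {small_narrow}% to {small_wide}%")
```

Swap in your own counts to see where your "success" sits.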

These cousins have been “proved” by me using traditional genealogical methods – not necessarily to GPS standards, but with my 50 years of genealogy experience. It matters NOT to me that the small segments may be real or poison – in each case, they led me to a cousin. And in my spreadsheet, I also note most are closely related to other Matches per Pro Tools.

If your hair is on fire about this, just pretend that I found and linked these folks (Matches) to my Tree without any DNA evidence… It’s OK, I did that for 36 years before atDNA came along.

I am *still* drinking through a firehose at Ancestry and MyHeritage with Matches of Shared Matches showing relationships to known cousins. And I am more convinced than ever that a large share of our DNA Matches (at all the companies) are true genealogy cousins within a genealogy time frame, say back into the early 1700s in Colonial America. The limiting factor is not the DNA, it is the genealogy records. Pro Tools is demonstrating that many of our Matches are closely related – they are NOT 10 to 20 (or more) generations away. Focus on the positive viewpoint, and dig in on the genealogy research!

That’s my take…

[06I] Segment-ology: Success With Small Segments; by Jim Bartlett 20260719

Collaboration

Featured

Posted on July 17, 2026 by Jim Bartlett

Genealogy, and Genetic Genealogy, are largely lonely, individual hobbies/projects. I think about the many thousands of messages and emails I’ve sent out over the years, and the few hundred responses. And that’s further reduced to a few tens of folks who really want to trade info – or work together to solve a problem. Maybe it’s just me… or maybe most of us are in the same boat. This applies to both our ancestral families as well as our DNA relationships.

Recently I posted about a collaboration concept that involved each of us contributing to a world family Tree. Some of us do that on a genealogy level, but very few do it on a genetic level. There is a lot of power in knowing the cM relationship with a Match AND between our Matches (I am *still* drinking through that firehose at Ancestry and MyHeritage). But I can only dream of collaboration with Matches, including with folks who, gasp, have tested but don’t Match! As I’ve said before, the vast majority of our true cousins will not share enough DNA with us to wind up as a Match. But all those cousins who have tested can stitch together their Trees just as we build our own – and, particularly on a true genealogy level, our overlapping families should be exactly the same… Think about that.

But I don’t harbor illusions that we are going to start writing/calling/Zooming each other. Perhaps the easiest way to stay in isolation, but help each other is contributing our researched information to the several “world Trees”.

However, another thought came to mind…. I call it The 1810 Census Project. I looked it up. The 1810 US Census counted 7,239,881 individuals (1,191,362 were enslaved). This was for 587 counties in 17 states and 6 territories (and of course, some info has been lost). It is estimated there were about 1.2 million households – that’s roughly 2,000 families per county. Many counties have genealogy societies… Suppose many of these genealogy societies took this on as a project. Document each family and, somehow, list and link DNA test takers to each family. Of course, each of us would probably go back to a number of counties extant in 1810; and the membership of the local county societies would not necessarily have roots in their own county. However, in this digital age, each of us, individually, could connect; and the societies could, nevertheless, build and track the genealogy (and genetic) data… For example, I still have a Tree: “Northern Neck of Virginia Families”, with over 100 Editors who have entered 13,000 people rooted in the Northern Neck.

The US has almost entirely been a country of immigrants – people came into the US; few moved out. So the 7 million in 1810 were a combination of descendants of earlier immigrants, and some new immigrants. AND virtually all of their descendants were in succeeding census records (plus some new immigrants). For the most part, your Ancestors go back to the 1810 census or to a more recent immigrant. For me, the focus has been to push back in America to the original immigrants (while each genealogist has their own objectives.)

And, with an 1810 Census Project, think of the possibilities for identifying and sharing DNA relationships. Think of the Y-DNA and mtDNA threads… I’m thinking of that right now!! I’m working on a family that has three of my lines intertwined: Benjamin WELCH born c1775 VA married 1797 VA to Mary BARTLETT born c1778 (they were 1C, on my WELCH, CARROLL and BARTLETT lines). The 1810, 1820, and 1830 census indicates they had 5 daughters; the on-line genealogies have a dozen different given names, with very little circumstantial proof for most of them. So far we have 4 Test Takers with J2a1a1 mtDNA from probable intersecting lines. What we need are female lines from each of the wannabe daughter lines down to someone we can test. That *should* weed out a few wannabes. I’m sure many of you have gone through this drill of tracking down families and begging them to take a free mtDNA test.

Well, anyway… I’m still thinking of ways to stimulate collaboration among the active genetic genealogists… I dream of documenting the interconnecting DNA of our roots. We are each documenting little pieces – we need a way to build a bigger picture.

[22DN] Segment-ology: Collaboration; by Jim Bartlett 20260717

Where Is Your Ancestor in a Floating Branch?

Featured

Posted on June 29, 2026 by Jim Bartlett

BLUF (Bottom Line Up Front) – you may already know this: The average cMs of all your Matches in a Floating Branch back to the Common Ancestor (CA) is a strong clue about your relationship (use judgment to pick the CA and cull out outliers); and the highest average among the children of that generation would often point to the child you descend from.

This is an insight based on collected data. First, we’ll set up the situation:

A Floating Branch is a group of your Matches who all descend from a CA who is not one of your known Ancestors. Typically, a Floating Branch can contain a lot of Matches (I’ve seen 20, 50, over 100). It’s frustrating – their CA is almost certainly your CA, but the link eludes you. It’s probably one of your Brick Walls, or beyond.

Brick Walls – to me, there are at least two types of Brick Walls – Ancestors in “plain view” and Ancestors “hiding under a rock” (or abducted by aliens). The “in plain view” Brick Walls are out there, with records, but we just haven’t made the genealogy connection. But, in my opinion, most of our hard Brick Walls persist for a reason: there is little to no information about them. My favorite example is a tic mark on the 1840 census (perhaps a young man who impregnated a young woman; and went “west” in 1849 and died along the way). The parents of the tic mark are known, but no other info on the son. There are many other reasons why a person may not be in the records. If they had left records, they probably wouldn’t wind up as Brick Walls.

Think about these two concepts (Floating Branch & Brick Wall)… To my thinking, they should “link up”. But how?

Well… After we’ve done an exhaustive review of the Floating Branch (fill out the descendants of the CA; ethnicity; geography; records, etc.), we probably have at least two things we can do with the collected DNA data. One is to average the cMs of all the Matches (except obvious outliers) and use the Shared cM Project to make an educated guess at the probable relationship (and generation) of our connection to the Branch. Ask: what Brick Walls do we have at that generation? Yes, it gets harder and harder as we go back in time.

However, the gist of this post is to figure out where we tie into the Floating Branch. It’s where the highest cMs are! Probably obvious to Segmentologists…

Take a known Ancestor with, say, 8 children; we would usually see Matches from most of the children having one average cM amount, while the Matches descending from the child who is our Ancestor will have a higher cM average. Again, easier to detect with close relationships. In theory, the difference is a factor of 4. If most of the Matches are 2C averaging about 229cM, we find the 1C Matches with the child who is our Ancestor to be about 866cM. See this table for different generations:

Matches with most of the children vs Matches with just your Ancestor child:

Great grandparents 229cM for 2C vs 866cM for 1C – pretty close to 1:4

2xG grandparents 73cM for 3C vs 229cM for 2C – a little more than 1:3

3xG grandparents 35cM for 4C vs 73cM for 3C – about 1:2

4xG grandparents 25cM for 5C vs 35cM for 4C – still more, on average

5xG grandparents 18cM for 6C vs 25cM for 5C – still more, on average

In other words, the floating branch should have an array of Match cMs just like any of your known CAs has an array of descendant cousins – roughly like the Shared cM table shows.

The point is that if you had a Floating Branch, and didn’t know where you tied in, averaging the cMs for each child at the same generational level, might provide a clue. I have two cases where the probable link sticks out like a sore thumb; and 2 others where the averages are not as clear as I thought they should be…
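Here is a rough sketch, in Python, of the averaging described above. Everything in it is hypothetical: the child names, the cM values, and the helper names `branch_averages` and `likely_tie_in` are made up for illustration. It assumes you have already limited the list to Matches at the same generational level and want to drop obvious outliers above a cutoff you choose.

```python
from statistics import mean


def branch_averages(matches, outlier_cm=None):
    """Average shared cM per child of the Common Ancestor.

    matches: list of (child_name, shared_cm) pairs, one per Match.
    outlier_cm: optional cutoff; Matches above it are left out.
    """
    by_child = {}
    for child, cm in matches:
        if outlier_cm is not None and cm > outlier_cm:
            continue  # cull obvious outliers (use judgment on the cutoff)
        by_child.setdefault(child, []).append(cm)
    return {child: mean(cms) for child, cms in by_child.items()}


def likely_tie_in(averages):
    """Return the child whose descendants have the highest average cM."""
    return max(averages, key=averages.get)


# Made-up Floating Branch: three children of the CA, Matches at one level.
sample = [
    ("Child A", 30), ("Child A", 35), ("Child A", 40),
    ("Child B", 70), ("Child B", 75), ("Child B", 74),
    ("Child C", 33), ("Child C", 37),
]

averages = branch_averages(sample)
for child, avg in sorted(averages.items()):
    print(f"{child}: {avg:.0f}cM average")
print("Highest average:", likely_tie_in(averages))
```

In this invented data one child stands out at roughly double the others, like the 4C vs 3C row in the table. Real data will be noisier, and a high average is only a clue, not proof.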

I’ve had this thought before – see here; and here. The more data (Matches), the better.

I’d be grateful for feedback from anyone who works with Floating Branches and has enough data to indicate whether one path sticks out like a sore thumb… And/or if you have any thoughts, pro or con, about this kind of analysis to squeeze a little more out of a Floating Branch…

[22DM] Segment-ology: Where Is Your Ancestor in a Floating Branch? By Jim Bartlett 20260629

A Concept Project for Segmentologists

Featured

Posted on June 22, 2026 by Jim Bartlett

As I noted in my last post (Drinking Through A Fire Hose), I have over 10,000 DNA Matches with pretty solid genealogy paths back to our Common Ancestors. I’m about 1/3 of the way through entering and Tagging these Matches and their paths back to our CAs in my main Tree at Ancestry.

Let’s look at an example. My Ancestor [A0856] John HIGGINBOTHAM b 1695; married 1713 in Amherst Co, VA to Frances RILEY. I have identified 827 DNA Match/cousins who descend from them.

Note 1: most of my Matches do not have a Tree back to John – I and ThruLines determined most of the paths.

Note 2: search Public Member Trees for John [drumroll….]: 10,323 Trees! WOW! Take a guess at how many Ancestry members actually descend from John and Frances, but don’t have Trees that reach 9 generations back… – 100 thousand? A million?

Note 3: take a guess at how many DNA test takers there are in addition to the 827 folks I have already documented back to John and Frances. I’m sure it’s a LOT!!

Suppose we were all working within one Tree…

Over the years there have been several attempts to establish one family tree: OneWorldTree (Ancestry); World Family Tree (Geni); WorldConnect (RootsWeb); WikiTree; FamilySearch Family Tree. IMO, there are a lot of issues within these attempts, as individuals interpret records differently, or worse, enter names and relationships without any documentation, etc, etc. Many NPEs are never discovered…

Concept: suppose we started Tagging ourselves, our DNA Matches, the DNA Connections, and the DNA Common Ancestors in WikiTree or FamilySearch.

As I wrote about in “Advanced Genetic Genealogy, Techniques and Case Studies”, I had identified three separate Triangulated Groups from John HIGGINBOTHAM and Frances RILEY [my 7XG grandparents – Ahnentafel 856]. In other words there were three finite segments in my maternal DNA that were in each of my Ancestors going back to John or Frances; and a shared/overlapping DNA segment [part of my segment] in each of my 827 Matches, and in their Ancestors in a path of descendants from John or Frances down to and including each Match. [Remember each Match overlapped some of my DNA segment in the full Triangulated Group.]
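For readers who don't use Ahnentafel numbers every day, here is a small Python sketch showing how a number like 856 maps to "7xG grandparent" on the maternal side. It relies only on the standard Ahnentafel rule (you are 1; a person's father is twice their number, and the mother is twice plus one). The function names are my own, made up for this illustration.

```python
def generations_back(ahnentafel):
    """Generations above the test taker: 1 -> 0, 2-3 -> 1, 4-7 -> 2, ..."""
    if ahnentafel < 1:
        raise ValueError("Ahnentafel numbers start at 1")
    return ahnentafel.bit_length() - 1


def ancestor_label(ahnentafel):
    """Plain-English label such as 'grandparent' or '7xG grandparent'."""
    g = generations_back(ahnentafel)
    if g == 0:
        return "self"
    if g == 1:
        return "parent"
    if g == 2:
        return "grandparent"
    if g == 3:
        return "great grandparent"
    return f"{g - 2}xG grandparent"


def side(ahnentafel):
    """'paternal' if the line runs through the father (2), else 'maternal'."""
    g = generations_back(ahnentafel)
    if g == 0:
        return "self"
    parent = ahnentafel >> (g - 1)  # strip the number back to 2 or 3
    return "paternal" if parent == 2 else "maternal"


print(856, ancestor_label(856), side(856))
```

For 856 this gives a 7xG grandparent on the maternal side, which agrees with the description above.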

Note that, on average, each of my Matches probably also had about 3 segments that went back to John and Frances. And I am not the center of the universe – all the other DNA test takers have their own, independent, experience. Certainly, there were many other descendants of John and Frances who had different DNA segments.

So what benefits would accrue to this concept…

1. Our accumulated Tags would provide a consensus that the paths we shared back to an Ancestor, passed through true genetic genealogy paths. This would be evidence that was independent of the genealogy analysis.

2. Perhaps a TG segment was actually from a different Ancestor. In the grand scheme of DNA, there would be certain distinct DNA segments that would be passed down from each Ancestor to various living test takers. It seems to me that, in general, these would wind up in multiple descendants. Note: we know that a given DNA segment could possibly be from a range of Ancestors, but when a number of DNA test takers all have the same [overlapping] DNA segment, it must surely be from one Common Ancestor.

3. So, this concept would help weed out which DNA segments came from which Ancestors.

4. Also, we’d start to accumulate specific DNA segments that came from specific Ancestors. We’d have the accumulated data to “paint” some of the Ancestor’s DNA.

I’m looking for feedback on this post. Pros and cons… Additional ideas… Is this already being tried? Shouldn’t we Segmentologists be working on a way to share and combine our DNA data to benefit each other? I’m looking for better language to articulate the possibilities and/or drawbacks of this concept.

[22DL] Segment-ology: A Concept Project for Segmentologists; by Jim Bartlett 20260622

Drinking Through a Fire Hose!

Featured

Posted on June 16, 2026 by Jim Bartlett

I have incorporated all of my AncestryDNA ThruLines and MyHeritage Theory of Family Relativity Matches into my Common Ancestor spreadsheet (see Chapter 7 of the free book, Segmentology Fundamentals, at ISOGG). Here is a tabulation of Matches with Common Ancestors (CAs) at all companies:

23andMe 167

Ancestry 10,435

FTDNA 239

GEDmatch 170 [I’ve not been able to find these at the other companies]

MyHeritage 261 [this includes 113 from Theory of Family Relativity]

Total 11,272

Clearly AncestryDNA leads the pack; but note that at the other companies, all the Matches have known shared DNA segments in specific Triangulated Groups (TGs).

Here is a breakdown of Ancestry by category:

ThruLines 8,516 [includes 144 wrong but fixed; plus 104 which are now gone]

No Tree 175 [determined by Pro Tools]

Private Tree 27 [determined by Pro Tools]

Unlisted Tree 531 [Note the large number of Matches; not found by ThruLines]

Found in Tree 1,186 [Just searching]

Total 10,435

Of the above 10,435 Matches at Ancestry

There are 1,078 (roughly 10%) that I have tagged as incorrect – and I move those out of the active part of the spreadsheet
There are 741 Matches with known shared DNA segment in TGs [plus 328 additional Matches for whom I don’t know the CA]
There are 6,885 from 5xG grandparents or closer (nominal 6Cs) – this reflects the power of ThruLines to find them.
There are 2,077 Matches from 6xG grandparents; 1,232 Matches from 7xG grandparents; and 181 Matches from more distant Ancestors.

Note – some of the Matches are listed more than once when they are related to me multiple ways; some because they tested at multiple companies.

So what’s the point here?

#1 is that I’m drinking through a fire hose! Granted that I’m retired and can spend time on genetic genealogy…

#2 is that the data is out there – or rather, the data is here, within the reach of the major DNA companies…

My morning routine includes seeing if there are any new ThruLines at Ancestry or any new Matches at GEDmatch (particularly Ancestry ones). Often I cannot get through that chore before I have to break for other responsibilities.

If I have time, my next task is working down my Ancestry Matches in my Common Ancestor spreadsheet and evaluating their shared Matches with Pro Tools. I didn’t have Pro Tools when I first developed the CA spreadsheet, so there is a lot of catching up to do.

It’s a two pronged approach – enter the Match in my Tree [tagging them: “DNA Match” AND using a special Dot in their profile when I do so] and entering their path to our CA [tagging each person: “DNA Connection”] in my Tree; and then evaluating the shared DNA Matches using Pro Tools [sorted on the Match’s relationship]. I’ve done this through my 4xG grandparents and now have 3,368 Tagged Matches. Still thousands to go, plus all of the new Matches with CAs I find with Pro Tools. Drinking through a fire hose.

The point is that I’m building a large family of DNA-linked descendants for each Ancestor – easy to review in the CA spreadsheet and in my Tree. AND, as I find more and more Matches with segment information, the consensus builds for Chromosome Mapping info.

[22DK] Segment-ology: Drinking Through a Fire Hose; by Jim Bartlett 20260616

Free: Segmentology Fundamentals eBook available for download at ISOGG/Wiki

Clustering vs Triangulation

Featured

Posted on April 7, 2026 by Jim Bartlett

Recently, I was called out for abandoning Segment Triangulation. Let me set the record straight. I like, and use, both Clustering and Triangulation – virtually every day! Each one has some unique strengths and some drawbacks. A (very) short review….

Triangulation. When I started Segmentology, in 2015, I was the Johnny Appleseed of Segment Triangulation. I don’t claim to have invented it, but I was definitely a fan, and, I think, took it to new levels. It has been called the “Gold Standard” in genetic genealogy. Your DNA is composed of specific DNA segments from specific Ancestors – Triangulation helps you to determine these segments, group your Matches by shared segments (and therefore ancestral lines), and develop (or “paint”) a Chromosome Map of segments. However, the critical step of determining the ancestral line is often hard (at least for me) because the companies providing segment data, in general, don’t have any/many good Trees.

Clustering. In late 2018, auto-Clustering was introduced, and we could easily get many Clusters depending on the range of cMs we used. With this tool, you could determine these families of cousins, group your Matches by shared Matches, and try to determine the Common Ancestor. This was a tool we could use with AncestryDNA (either auto- or manual Clustering), where there tended to be many more Trees and genealogy tools. It worked for me… However, at AncestryDNA we cannot get the segment data to tie Matches to DNA segments.

By 2020 I had finished my Triangulation and had 372 Triangulated Group (TG) Segments. I was growing frustrated because I wasn’t getting very far finding Common Ancestors. On the other hand, I was finding a lot of Clusters with pretty solid Common Ancestors. So, I shifted my focus and have mainly been using Clusters, ever since. I still look for close Matches with larger segments for Triangulation. But my focus is on confirming more distant Ancestors and working on Brick Walls – mostly with Shared Match Clustering.

Bottom Lines:

1. Comprehensive Triangulation is a lot of hard work; but it can be a good tool for specific segments that appear to come from/through a Brick Wall.

2. Clustering is somewhat easier and focuses more directly on the genealogy. I think it’s more fun; and a better tool for many hobby genealogists. A spreadsheet is not required (but it is helpful to track everything).

3. Again: I use both, every day!

[22DL] Segment-ology: Clustering vs Triangulation; by Jim Bartlett 20260407

A Calculated Guess Is Great

Featured

Posted on April 3, 2026 by Jim Bartlett

In genetic genealogy, DNA is a tool. It helps us, in many different ways, to determine Ancestors and confirm cousins. My point in this blogpost, is that we don’t need to know precisely how we are related to each cousin – a calculated guess is fine. In fact, I encourage it.

Here is an example. I have determined that XYZ is a 7C. This is based on genealogy. We share my Ancestor couple METZGER/KEIFFER (aka Ahnentafel 352). This Ancestor couple’s full names, dates, places, life story are not important to Segmentology. Match XYZ happens to share 15cM, also not too important, but well within Shared cM range. XYZ is on my Paternal side, as is our MRCA. XYZ and I have over 30 other Matches on our Shared Match list – the top 20 Shared Matches share over 100cM with XYZ. I can tell you that this will show up as a strong, solid Cluster. You get the picture; this is pretty solid…

So now I notice on my Shared Match list with XYZ that, per ProTools, ABC is a 1C to XYZ, sharing 852cM. ABC has NO Tree. However, ABC has a long Shared Match list with over 30 Matches who are known cousins to me through Ahnentafel 352. #A0352P is the first thing in many of my Shared Match Notes, which is all I need to know…

There is no Tree for ABC, but as a 1C to XYZ, I don’t need a Tree. I know XYZ’s grandparents must also be ABC’s grandparents for them to be 1C. So, I confidently add ABC to my Common Ancestor spreadsheet and copy the line of descent I already have for XYZ, and change ABC’s parent to UNK. Done!

I’ve now added ABC to the Shared Match spreadsheet, and can enter a Note for ABC which starts #A0352P. Which Note is now visible to all other Shared Match lists (and Clusters) that include Match ABC. This helps me find even more Matches to evaluate and add.

Emboldened by this logic, I am sure I can also add a proposed 2C to my spreadsheet (with UNK for both parent and grandparent). This will “tuck” many more Matches into my spreadsheet of known cousins, even though I don’t know their parents or grandparents. And those Matches will often highlight other Matches who can be added. For A0352P, I now have 151 confirmed Matches!

Note: It’s important that these potential additional Matches be vetted (as above). They should also be part of an appropriate Cluster of Shared Matches. Even the parent of a known Match can be on the wrong side. That is to say, for instance, my line from XYZ goes through her mother – so her tested father could well show up as a close match to XYZ, and to me, but his path to a CA would be different (ie NOT through his wife to A0352P!).

In conclusion, there are two options:

1. Leave this “Match with no Tree” out of my Tree – disavow them as a cousin because *I* don’t know their parent’s name.

2. Accept this Match as a 7C to me; just as much as I accepted XYZ as my 7C.

To me, as a lifelong genealogist, I’d choose Option 2 in a heartbeat. I’d hug this new Match just as strongly at a family reunion. I’d probably ask their parents’ names… But the rest of their line would already be in my spreadsheet. The UNK parent/grandparent wouldn’t make any difference. And, that Match will now help me document other Matches!

AND, often this new cousin Match doesn’t know much of their Ancestry – be a helpful genealogist, and send them a message about the ancestry you are sure they have… Add their line to your Tree, and ask if they’d like you to add names in place of the UNKs!

This is also a plug for the Common Ancestor spreadsheet – a valuable tool for recording found cousins, and for easily seeing how other Matches fit in. Hard work, but it sure highlights strong branches of my family Tree (as well as weak branches).

[35BCa] Segment-ology: A Calculated Guess Is Great; by Jim Bartlett 20260403

From Waterfalls to the Sea

Featured

Posted on March 29, 2026 by Jim Bartlett

This is an analogy about your DNA.

Recently, a Kona low flooded a lot of Oahu, HI – particularly the North Shore. I watched many videos of torrential rains and swollen waterfalls (one after another); and videos of flooded areas as all that water made its way to the sea.

Close your eyes and think of each waterfall as one of your Ancestors and then imagine each waterfall representing part of your DNA. The water flows in branches to rivers and finally into the Pacific Ocean. Each waterfall is like your DNA, flowing from a distant Ancestor and combining with DNA “flows” from other Ancestors to your parent, and then to you (you are the Pacific Ocean in this analogy). Many different paths over time and geography winding up with you. And the same is happening on the other side of the mountain, also flowing to the Pacific – representing the DNA from your other parent…

I think it’s a good visual analogy – many Ancestor sources of parts of your DNA flowing and combining until it finally reaches you.

[22DK] Segment-ology: From Waterfalls to the Sea by Jim Bartlett 20260329

ProTools Part 27

Featured

Posted on March 27, 2026 by Jim Bartlett

Shared Match Relationships

Setup: Whenever I add a Match to my Tree (usually a ThruLines hint that I agree with), I then check the Shared Matches, sorted by ProTools by the closest relationships. I first scroll down the list to confirm that, indeed, several of them have the same MRCA (the first thing in the Notes field). I then usually look at each one (usually down to 100cM) to see if I can link them to the base Match and/or place them in my Tree and add them to my Common Ancestor spreadsheet. Usually this is done at AncestryDNA, but sometimes at MyHeritage.

Topic: In my spreadsheet I have a column for the relationship to one of the closest Matches. Format: 209cM/1C2R: Match Name. This is strong, additional, evidence that this branch of my Tree is “fluffing out” correctly. Some observations about this relationship:

1. Usually the relationship is exactly right.

2. Usually AncestryDNA offers two alternative relationships. One is a “full” relationship, like 1C2R; and the other is a “half” relationship, like half great granduncle. These are equivalent from a DNA (cM) “math” standpoint – they would have the same cMs on average – the DNA alone couldn’t tell the difference. But relatively few of your Matches will be “half” (indicating their MRCA is one person with two different mates). You can usually tell them apart by how they fit in your Tree, or by their ages, or by a consensus among their own shared Matches. Bottom line – it’s usually the full relationship.

3. However… in a few cases the relationship doesn’t mesh with where I think they go in my Tree. There usually are other equivalent relationships; and a simple click on the Shared Matches estimate at AncestryDNA will quickly bring up a list. In this case, 2C1R was on the list and that agreed with the genealogy. An alternative is to keep the DNA Painter Shared cM Project tool handy – just type in the cM amount to see the equivalent relationship and other relationships that are found almost as frequently.

4. If I cannot find a reasonable close relationship, I force myself to dig a little deeper… Sometimes a Match’s Tree skips a generation or adds an extra one; infrequently the Match has shifted the test taker to a parent or grandparent (the test taker appears to be the child of someone born in 1880…). There are several ThruLines Trees that “skip” a generation in order to generate a Match within 6C range. Sometimes, I can figure it out and put the “corrected” version in my Tree; other times I just set it aside, and do NOT include that line in my linked Tree, and highlight it as probably wrong in my spreadsheet and in the Match Notes (so I don’t stumble over it again).

Bottom line: with Shared Matches larger than 100cM (or so – use your judgment), the AncestryDNA relationships are pretty accurate; but occasionally we need to use one of the other, equivalent, relationships. This relationship is a pretty good Quality Control check.

[22DK] Segment-ology: Pro Tools 27 – Shared Match Relationships; by Jim Bartlett 20260327

Musing…

Featured

Posted on March 25, 2026 by Jim Bartlett

Volume 1 of Segmentology is done. Fundamentals. What to do next? Some musings – waddayathink?

1. Continue this blog. Volume 1 incorporated many of the 200 blogposts so far, but I have perhaps 100 more in various stages – from title or concept to full drafts not yet published (same “scatter-shot” range of topics…). And, as always, I encourage you to request topics.

2. Focus on Volume 2. Something like “Using the Fundamentals”. At the top of my list would be chapters on Finding Bio Ancestors; Walking The Clusters Back; Compilation of ThruLines TIDBITs; Tying in Floating Branches; Creating Your Personal Shared cM Chart; Some Core Objective Statements… What would be your catchy titles?

3. A Segmentology Collaboration platform or Forum. Some method where we could share our collective experience, insights, objectives, wish lists…. I feel there is a lot of collected wisdom among practicing “Segmentologists” – how can we capture, focus, and build on that? Your ideas are encouraged.

I’m not going to add an “all of the above” category, but that’s where this might go…

This blog has helped – actually “forced” – me to think through, and research, and document Segmentology related concepts – to put them in plain English as best I can. I encourage you to comment on our future path. In the near term I’m going to “unload” some of my backlogged posts. I turn 83 this week, and I’m just not done with this Segmentology journey…

[99F] Segment-ology: Musing… by Jim Bartlett 20260325

Segmentology Fundamentals

Featured

Posted on March 22, 2026 by Jim Bartlett

Segmentology Fundamentals

A Segmentology eBook is now available for free download at ISOGG Wiki: https://isogg.org/wiki/Segmentology_Fundamentals

10 Chapters in 3 GROUPS (Segments, Groups, Tools) and a robust Glossary

Special thanks to all of you who have provided so much valuable feedback and encouragement on this Segmentology journey.

Sincerely,

Jim Bartlett

[99E] Segment-ology: Segmentology Fundamentals; by Jim Bartlett 20260322

Insights into Matches

Featured

Posted on January 25, 2026 by Jim Bartlett

In my last post I outlined two insights from analysis of my 8700 Matches at AncestryDNA with confirmed Common Ancestors (CAs): the number of Matches increases dramatically with each generation going back to the 6C level (where ThruLines ferrets out a lot of my cousins); and the average cMs flattens out in the mid-teens beyond the 4C level.

For this post I analyzed the Matches to see the distribution based on shared cMs.

Shown and not shown are 1491 Matches over 20cM, about 17% of the total. But the insight is that 83% of the Matches are from 6 to 20cM. And you can easily see the spike at 9cM. You’ll also notice the Matches at 6 and 7cM which I saved just before the AncestryDNA change in the lower threshold several years ago. I’m not sure why there is a drop at 8cM – maybe because I haven’t found a lot of Matches at the 7C level and beyond.

At this point, as a life-long genealogist, I want to reiterate that cousins are where you find them and by far most are under 15cM (what we usually call small segments). And this is just the tip of the iceberg, because most of our true cousins beyond 4C (who have taken a DNA test) do not show up as DNA Matches. Most of my under-15cM Matches are also part of interrelated family groups (per ProTools), and their lines usually agree with standard genealogy research. A small percentage don’t and I remove them from the spreadsheet and this analysis.

Everyone has their own objectives in genetic genealogy. I encourage you to think about yours and write them down. Collecting cousins is not my objective but documenting interrelated cousins in family groups (with ProTools), and building evidence for each Ancestor is. This includes finding a few Ancestors that don’t “look” right and turn out to be NPEs. Or using Triangulated Groups or Clusters or Floating Branches to build evidence to break through Brick Walls/NPEs.

Clearly this is genealogy “big picture”. It forces me to treat all lines and Ancestors equally (yes, after I’ve spent a lot of time on my favorites). However, some of these insights will also help with “targeted” objectives into specific areas of our genealogy.

[06H] Segment-ology: Insights into Matches; by Jim Bartlett 20260125

Insights into cM Patterns

Featured

Posted on January 23, 2026 by Jim Bartlett

I now have over 8,700 Matches at AncestryDNA with a confirmed Common Ancestor (CA) with me between 2C and 8C. See my Common Ancestor Spreadsheet post here. That’s a lot of data, so I thought I’d do some analysis. In 2024 I posted (here) my averages for 3C to 8C which roughly agreed with the Shared cM project.

Below is a table summarizing all of my data (including full cousins, half cousins and removed cousins). For each relationship there are columns for the number of Matches, the average cMs, the lowest cM, the highest cM; plus the number of generations (meiosis events), and average cMs for each. The table is then repeated with a sort based on meiosis events.

A word about meiosis events. They are the count from me up to the CA and then back down to the Match. Like generations… A 1C is 4 events (two up to the grandparent (the CA) plus two back down to the Match). The number of meiosis events with a 1C2R is 6 (two up and four down). A half relationship adds one to the meiosis events – e.g. a 4C1R is 11 events; and 4C1Rh is 12 events. These are important because in a mathematical simulation, each event reduces the cM by half. From the Shared cM Project, the 1C (4 events) average is 866cM compared to 122cM for a 2C1R (7 events), which is roughly 866 halved three times. Remember, it’s an order of magnitude thing. And, as we shall see, it generally works for close relationships (like 1C and 2C), but drifts away for more distant relationships (like 4C and beyond). Important: this is not biology’s fault, it’s the math’s fault. It’s because we have a LOT of true distant cousins that do NOT share matching DNA with us; and they are not reflected in the averages. This (lack of a normal curve) is highlighted in the second sort (by meiosis numbers) below. This is also reflected in the DNA Painter Shared cM Project tool which shows different groups of Matches for a given input cM value. For example at DNA Painter, plug in 55cM… the 29% group of 3Ch, 3C1R, 2C3R and 2C2R half are all 9 meiosis events; and the second group of 4C, 3C1Rh, and 3C2R are all 10 meiosis events. This also demonstrates that by the time we get down to 3C and 4C levels there is a lot of overlap.
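The counting rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the label format (such as "4C1Rh") are assumptions taken from the way relationships are written in this post, and the halving estimate is the simple mathematical model described above, not the way any company computes cM.

```python
import re

def meiosis_events(label: str) -> int:
    """Count meiosis events for a label such as '1C', '2C1R' or '4C1Rh'."""
    m = re.fullmatch(r"(\d+)C(?:(\d+)R)?(h?)", label)
    if m is None:
        raise ValueError(f"unrecognized relationship label: {label!r}")
    degree = int(m.group(1))          # the N in "NC"
    removed = int(m.group(2) or 0)    # the M in "MR", 0 if absent
    half = 1 if m.group(3) else 0     # a trailing "h" marks a half relationship
    # N+1 events up to the CA, N+1 back down, plus removals, plus one for half
    return 2 * (degree + 1) + removed + half

def halved_estimate(label: str, base_label: str = "1C", base_cm: float = 866.0) -> float:
    """Halve the base average once for every event beyond the base relationship."""
    steps = meiosis_events(label) - meiosis_events(base_label)
    return base_cm / (2 ** steps)

for rel in ["1C", "1C2R", "2C1R", "4C1R", "4C1Rh"]:
    print(rel, meiosis_events(rel), round(halved_estimate(rel)))
```

For a 2C1R this gives 866 halved three times, a little over 108cM, which is the same order of magnitude as the 122cM average quoted above.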

For this first table, the takeaway is that the number of Matches with CAs increased dramatically with each generation. [Note I combine full cousin with cousin 1R because at my age, most Matches will be a generation younger than me] 3C & 3C1R: 196 Matches; 4C & 4C1R: 662 Matches; 5C & 5C1R: 1,606 Matches; 6C & 6C1R: 3,425 Matches. WOW, what an increase in the number of Match cousins. And then we have 7C & 7C1R: 584 Matches; 8C & 8C1R: 363 Matches. What happened? Why the steep decrease in numbers? Well, IMO, the major factor is that AncestryDNA’s ThruLines quits at 6C – ThruLines can “see” into private Trees (I cannot); and it roots out MRCAs with the smallest of Trees (I don’t have that time). I can only dream of how many ThruLines I’d get at the 7C and 8C levels. Some of the ones I have now were found/recorded when we had Circles at Ancestry.

The point is: there are LOTS of cousins still waiting to be determined. ProTools is helping.

Table 1: 8,799 AncestryDNA Matches Summarized by Relationship

MRCA	#Matches	avg cM	low cM	high cM	meiosis
1C2Rh	3	138	78	200	7
2C	1	269			6
2C1R	14	127	34	220	7
2C2R	8	47	39	162	8
2C3R	2	34	22	140	9
3C	57	63	13	208	8
3Ch	5	20	16	95	9
3C1R	139	28	6	148	9
3C1Rh	26	28	7	111	10
3C2R	106	22	6	68	10
3C2Rh	20	20	6	92	11
3C3R	34	22	6	58	11
3C3Rh	12	23	8	40	12
3C4R	1	20			12
3C4Rh	2	10	8	12	13
4C	128	24	6	220	10
4Ch	7	12	6	19	11
4C1R	534	20	6	114	11
4C1Rh	33	16	6	30	12
4C2R	267	16	7	92	12
4C2Rh	12	12	6	39	13
4C3R	27	16	6	44	13
4C4R	1	17	17	17	14
5C	469	16	6	62	12
5Ch	29	17	6	27	14
5C1R	1137	14	6	60	13
5C1Rh	7	14	6	60	14
5C2R	300	14	6	41	14
5C2Rh	9	14	7	27	15
5C3R	75	14	6	40	15
5C3Rh	1	18	18	18	16
5C4R	2	10	10	10	16
6C	1922	12	6	56	14
6Ch	97	11	6	25	15
6C1R	1503	12	6	52	15
6C1Rh	58	10	6	22	16
6C2R	618	12	6	44	16
6C2Rh	47	12	6	29	17
6C3R	12	15	6	30	17
7C	262	13	6	41	16
7Ch	10	15	6	39	17
7C1R	322	12	6	43	17
7C1Rh	7	17	6	43	18
7C2R	17	16	6	25	18
7C3R	5	15	6	18	19
8C	310	12	6	35	18
8Ch	6	10	7	17	19
8C1R	53	16	6	37	19
8C2R	12	17	8	19	20
8C3R	7	10	6	13	21
9C	63	14	6	24	20
Total	8799
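The combined "full cousin plus 1R" counts can be recomputed directly from the rows of Table 1. A minimal sketch, with the counts copied from the table and a helper name (`combined`) that is purely illustrative:

```python
# Number of Matches per relationship, copied from Table 1
counts = {
    "3C": 57, "3C1R": 139,
    "4C": 128, "4C1R": 534,
    "5C": 469, "5C1R": 1137,
    "6C": 1922, "6C1R": 1503,
    "7C": 262, "7C1R": 322,
    "8C": 310, "8C1R": 53,
}

def combined(degree: int) -> int:
    """Add the full-cousin count to the once-removed count for one cousin level."""
    return counts[f"{degree}C"] + counts[f"{degree}C1R"]

for degree in range(3, 9):
    print(f"{degree}C & {degree}C1R: {combined(degree)} Matches")
```

The same approach works for any other grouping of rows, such as adding in the half and 2R relationships.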

For the second table, the takeaway is that the average cMs track pretty close to each other at the same meiosis numbers. And after meiosis level 9, which averages 27cM, the “curve” quickly “flatlines” in the mid-teens. This is reflected at DNA Painter with many relationships all in play under 20cM.

Table 2: 8,799 AncestryDNA Matches Summarized by Meiosis Events

MRCA	#M	avg cM	low cM	high cM	meiosis	avg cM per meiosis level
2C	1	269			6	269
1C2Rh	3	138	78	200	7
2C1R	14	127	34	220	7	132
2C2R	8	47	39	162	8
3C	57	63	13	208	8	55
2C3R	2	34	22	140	9
3Ch	5	20	16	95	9	27
3C1R	139	28	6	148	9
3C1Rh	26	28	7	111	10
3C2R	106	22	6	68	10	25
4C	128	24	6	220	10
3C2Rh	20	20	6	92	11
3C3R	34	22	6	58	11	18
4Ch	7	12	6	19	11
4C1R	534	20	6	114	11
3C3Rh	12	23	8	40	12
3C4R	1	20			12
4C1Rh	33	16	6	30	12	18
4C2R	267	16	7	92	12
5C	469	16	6	62	12
3C4Rh	2	10	8	12	13
4C2Rh	12	12	6	39	13	13
5C1R	1137	14	6	60	13
4C3R	27	16	6	44	13
4C4R	1	17	17	17	14
5Ch	29	17	6	27	14
5C1Rh	7	14	6	60	14	15
5C2R	300	14	6	41	14
6C	1922	12	6	56	14
5C2Rh	9	14	7	27	15
5C3R	75	14	6	40	15	13
6Ch	97	11	6	25	15
6C1R	1503	12	6	52	15
5C3Rh	1	18	18	18	16
5C4R	2	10	10	10	16	13
6C1Rh	58	10	6	22	16
6C2R	618	12	6	44	16
7C	262	13	6	41	16
6C2Rh	47	12	6	29	17
6C3R	12	15	6	30	17	13
7Ch	10	15	6	39	17
7C1R	322	12	6	43	17
7C1Rh	7	17	6	43	18
7C2R	17	16	6	25	18	15
8C	310	12	6	35	18
7C3R	5	15	6	18	19
8Ch	6	10	7	17	19	13
8C1R	53	16	6	37	19
8C2R	12	17	8	19	20	15
9C	63	14	6	24	20
8C3R	7	10	6	13	21
Total	8799
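The post does not say exactly how the per-level averages in the last column were calculated. As a rough sketch, the code below groups a few rows by meiosis level and computes two candidates: a simple mean of the relationship averages, and a mean weighted by the number of Matches. The weighted version is an assumption added here for comparison, and only the rows for levels 9 and 10 are included.

```python
from collections import defaultdict

# (relationship, matches, average cM, meiosis events) for two levels of the table
rows = [
    ("2C3R", 2, 34, 9), ("3Ch", 5, 20, 9), ("3C1R", 139, 28, 9),
    ("3C1Rh", 26, 28, 10), ("3C2R", 106, 22, 10), ("4C", 128, 24, 10),
]

groups = defaultdict(list)
for rel, n, avg_cm, events in rows:
    groups[events].append((n, avg_cm))

summary = {}
for events, members in sorted(groups.items()):
    # simple mean: every relationship counts equally
    simple = sum(avg for _, avg in members) / len(members)
    # weighted mean: relationships with more Matches count for more
    weighted = sum(n * avg for n, avg in members) / sum(n for n, _ in members)
    summary[events] = (simple, weighted)
    print(events, round(simple, 1), round(weighted, 1))
```

For these two levels the simple mean rounds to the 27 and 25 shown in the table. With full data, a weighted mean would lean toward the relationships that have the most Matches.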

Sidebar – this evaluation also acts as a Quality Control indicator. Watch for data points way outside the norms. I had three Matches who skewed one of the numbers. I went back to them – they were close to each other and I was sure they were from an NPE. Upon reevaluation, they needed to be a generation closer to our CA. I made the shift, and all the numbers fell back into the norm.

These insights are helping me with a new review of Walking The Clusters Back, wherein I need to use judgment when imputing relationships and CAs.

[06G] Segment-ology: Insights into cM Patterns; by Jim Bartlett 20260122

How Old Are Your Segments?

Featured

Posted on December 18, 2025 by Jim Bartlett

Well, it depends… Your chromosomes are very large segments, which are not very old at all. On the other hand, I have some small DNA segments from Neanderthal Ancestors – pretty old. In general, the smaller the segment, the older it is. But let’s think about this for a moment.

This discussion will be about your DNA segments – large segments from close relatives to ever smaller segments from more and more distant relatives. They are all part of the DNA you inherited from your Ancestors. Segments are formed at the moment of conception – when sperm meets egg – about nine months before you were born. They don’t change until you pass them on – after recombination and new crossovers – to the next generation. So our unit of “age” measurement is a generation.

So, let’s start with the largest “segments” – your 44 autosomes passed to you by your parents. How old are these 44 chromosomes? Well, they are 0 generations old. You are the first person to ever have each of these specific – full chromosome – segments.*

Then let’s look at your grandparent segments that make up your chromosomes. On average you have 22 chromosomes, subdivided by 34 crossovers, for 56 grandparent segments per Side. These were each part of full, new, chromosomes passed from your grandparents to your parents; and then, one generation later, passed to you by a parent – they are 1 generation old. Again, due to random recombination for every child, you are the first person to ever have these specific segments.*

Similarly, your great grandparents, passed new chromosomes to your grandparents, who passed segments to your parents who passed segments to you, which would be unique and 2 generations old.*

You get the picture. The unique segments in each of your Ancestors are recombined into new segments and passed down – generation after generation. Your segments are “imbedded” in the chromosomes and large segments they passed down. And knowing the genealogy of each segment, we can count the generations to find their age – always one less than the number of Ancestor generations back.*

* So what’s with that pesky asterisk? In short, “sticky” segments. Some segments are passed down intact – they are exactly the same segment in an Ancestor and their child (who is also your Ancestor) – they were not subjected to a recombination crossover. More likely than not, one of your smaller chromosomes (Chr 18 to 22) was passed from a parent to you intact. So, in that particular case, its age is 1 generation (not 0 generations like all the other chromosomes). And this happens to some of the other segments passed down at each generation. Above we noted that you got about 56 grandparent segments from one parent. When you pass these to your children, recombination will create about 34 new crossovers. In general, they will be subdivisions of 34 of the 56 grandparent segments passed down to you – leaving 22 grandparent segments intact. You only pass half of your DNA to each child, but that still includes about 11 grandparent segments which are now 2 generations old!
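The rough arithmetic behind those numbers can be written out as a short sketch. It uses the averages quoted in this post and the same simplification, namely that each new crossover falls in a different grandparent segment; the variable names are only illustrative.

```python
CHROMOSOMES_PER_SIDE = 22      # autosomes received from one parent
CROSSOVERS_PER_MEIOSIS = 34    # average number of crossovers used in this post

# Each crossover splits one segment into two, so it adds one segment.
grandparent_segments = CHROMOSOMES_PER_SIDE + CROSSOVERS_PER_MEIOSIS

# Simplification: every new crossover lands in a different grandparent segment.
intact_after_next_meiosis = grandparent_segments - CROSSOVERS_PER_MEIOSIS

# Only half of your DNA goes to any one child.
intact_passed_to_child = intact_after_next_meiosis // 2

print(grandparent_segments, intact_after_next_meiosis, intact_passed_to_child)
```

These are averages, so the counts for any one person or any one child will vary.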

It gets complicated real quick!

This is one of the reasons that as segments get smaller, the range of possible relationships increases. A given segment may have persisted for several generations, or not.

Chromosome Mapping of segments with MRCAs lets us figure this out. Even if our Map is not complete, at least some areas of our chromosomes can be figured out. Someday… it will be interesting to try to determine a Shared cM Chart which figures in the age of the segment. I’ll bet the ranges would be somewhat smaller…

[O5H] Segment-ology: How Old Are Your Segments? by Jim Bartlett 20251218

Shared Segment Spreadsheet Incarnations

Featured

Posted on November 2, 2025 by Jim Bartlett

For me, the Shared Segment Spreadsheet is a critical tool, which evolves through four incarnations.

1. It starts as a collection of all your shared DNA segments – from each company. This also means a collection of all your Matches (except AncestryDNA), some with multiple shared segments. It can be searched and sorted.

2. Use as a segment Triangulation tool. Sort on: Company + Chromosome + Segment Start to arrange all the shared segments (within a company) into Chromosomes. And within a Chromosome they are arranged so that overlapping segments are close to each other. With this “view” each segment is Triangulated with other overlapping segments, or not. Maternal and Paternal Triangulated groups are formed*. Some of the under-15cM segments will not Triangulate and are labeled “false” and deleted or moved out of this spreadsheet – “everybody’s got to be somewhere.” This process is repeated for each company.

3. Form/identify Triangulated Group (TG) segments. Sort on: Side + Chromosome + Segment Start to separate the maternal and paternal segments and sort them in order within each chromosome. Since this spreadsheet is comparing all these shared segments with your own DNA segments, the shared segments from different companies will “break” into TG segments that align with your own segments. However, this phase of the process requires some judgment – the data is a little fuzzy and the ends of TGs will not be precise. You have to make a call. In general, to align with your DNA segments, each TG will end at the same Mbp as the next one starts. Make those calls and assign a TG Identification (TG ID)** for each segment. Make a TG segment header row for each one (I have 372 TG segments) that locks in the overall TG start and end positions and TG ID. TIP: make the TG header start location 0.01Mbp less than the first shared segment in the TG – so it sorts on top of the individual segments. Remember that every Match in a TG is related to you on your line back to a specific Common Ancestor (CA). Note: some small segments in a TG may go back further.

4. Use these TG groups to do the genealogy! Among the Matches, find the consensus path to the CA.
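The sort in step 3 and the header-row TIP can be sketched outside a spreadsheet as well. The rows, Match names and the TG ID "TG-P1" below are all hypothetical, and the column names are assumptions; the point is only to show why a header start that is 0.01Mbp lower sorts on top of its segments.

```python
# Hypothetical shared-segment rows: side, chromosome, start/end in Mbp, Match, TG ID
segments = [
    {"side": "P", "chr": 4, "start": 36.2, "end": 51.0, "match": "Match B", "tg": "TG-P1"},
    {"side": "M", "chr": 4, "start": 12.5, "end": 30.1, "match": "Match D", "tg": "TG-M1"},
    {"side": "P", "chr": 4, "start": 35.4, "end": 48.9, "match": "Match A", "tg": "TG-P1"},
    {"side": "P", "chr": 4, "start": 38.0, "end": 52.3, "match": "Match C", "tg": "TG-P1"},
]

def tg_header(rows, tg_id):
    """Build a header row that locks in the overall start and end of one TG."""
    members = [r for r in rows if r["tg"] == tg_id]
    return {
        "side": members[0]["side"],
        "chr": members[0]["chr"],
        # 0.01 Mbp below the first shared segment, so the header sorts on top
        "start": round(min(r["start"] for r in members) - 0.01, 2),
        "end": max(r["end"] for r in members),
        "match": f"TG header {tg_id}",
        "tg": tg_id,
    }

rows = segments + [tg_header(segments, "TG-P1")]
# Step 3 sort order: Side + Chromosome + Segment Start
rows.sort(key=lambda r: (r["side"], r["chr"], r["start"]))

for r in rows:
    print(r["side"], r["chr"], r["start"], r["end"], r["match"])
```

In a real spreadsheet the start and end of the header would be judgment calls on fuzzy data, not a mechanical minimum and maximum.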

Summary: A shared segment spreadsheet has several uses – collection > Triangulation > TG ID > genealogy. The TG segment is your DNA segment. This covers all of your genetic genealogy, but you can always focus on one or more individual TGs, if you don’t want to eat the whole elephant at one time.

*I’ve covered the Triangulation process in other blogposts, and won’t repeat that here – this blogpost is about the four incarnations of the spreadsheet.

**I’ve covered TG IDs in other blogposts.

[35BBa] Segment-ology: Shared Segment Spreadsheet Incarnations by Jim Bartlett 20251102

Walk The Clusters Back with AncestryDNA

Featured

Posted on October 17, 2025 by Jim Bartlett

AncestryDNA has just rolled out enhancements to their Clustering Program that let you “Create custom clusters”. At AncestryDNA > DNA > Matches > By Clusters/Pro > Create custom clusters. You must have the additional subscription for ProTools to access this program. I have not run it through its paces yet, but I wanted to review the Walk The Clusters Back (WTCB) concept, and ask for feedback on your experience with it.

The concept of WTCB is to adjust the cM range to focus on two generations at a time. The idea is to “solve” the Clusters for close relatives and then adjust the range down to include Matches in the next generation back, and then see where the Clusters separate into more distant Clusters. Start easy with a range of 90-400cM which is the recommendation for the Leeds Method to determine four groups. This would be roughly four Clusters with each one focused on a separate grandparent. Tag (by Dots or by Notes) every Match to the appropriate grandparent. Then drop the range to, say, 70-200cM to get mostly Clusters that include Matches who are 1C on a grandparent, and 2C on a Great grandparent. I don’t know of anyone who has found a “sweet-spot” range for each generation, and I suspect it might be different for each of us. The last time I did this WTCB I had to “fiddle” with the ranges – and never could find any range that gave me only Clusters with Matches from only two generations in each Cluster. So, get used to that.

The point is to notice when some Matches you’ve tagged to an Ancestor then show up in different Clusters based on a new range – and then determine which sides are represented by the new Clusters. Then tag all of those new Matches appropriately. Example: you have a Cluster with 20 Matches that is focused on a Great grandparent. Tag all the Matches with that Great grandparent (if not already tagged as a closer Ancestor). Adjust the range to add more Matches. Look in the new Clusters for the previously Tagged Matches – hopefully there are two new Clusters, but maybe three. From my experience there may be two Clusters with 15 to 25 Matches, each of which include some of the 20 Matches from the previous run. These new Clusters would represent the next generation back and the focus would only be one of the two parents of the previous Cluster.
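One way to keep track of this between runs is to compare the tags from the previous run against the membership of the new Clusters. A minimal sketch, assuming you have exported or typed up both lists yourself; the Match names, tags and Cluster IDs here are hypothetical, and nothing below calls any AncestryDNA feature.

```python
from collections import Counter

# Hypothetical tags from the previous run: Match -> Ancestor tag
previous_tags = {"M1": "GGP-A", "M2": "GGP-A", "M3": "GGP-A", "M4": "GGP-B"}

# Hypothetical Clusters from the new, lower cM range
new_clusters = {
    "C1": ["M1", "M2", "M7", "M8"],
    "C2": ["M3", "M9"],
    "C3": ["M4", "M10", "M11"],
}

def carry_over(previous_tags, new_clusters):
    """For each new Cluster, count the earlier tags found in it and list untagged Matches."""
    report = {}
    for cluster_id, members in new_clusters.items():
        tagged = Counter(previous_tags[m] for m in members if m in previous_tags)
        untagged = [m for m in members if m not in previous_tags]
        report[cluster_id] = {"tags": dict(tagged), "untagged": untagged}
    return report

report = carry_over(previous_tags, new_clusters)
for cluster_id, info in report.items():
    print(cluster_id, info["tags"], info["untagged"])
```

In this made-up example the Matches tagged GGP-A split across Clusters C1 and C2, which is the pattern to look for: two new Clusters going back from one previously solved Ancestor. The untagged Matches are the ones to work on next.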

Yes, it gets harder and harder with each new generation. The good news is that a Cluster with known Matches from one generation can only morph into Clusters going back from that one Ancestor. This reduces the genealogy effort. If you’ve reviewed all of your ThruLines (and used ProTools to add even more Matches), you have likely tagged a lot of Matches out to 6C. So as the 4C and 5C and 6C Clusters start to form (as you reduce the cM range), you may already see the Ancestor for the new Clusters by looking at the Notes.

Use your judgment, and fiddle with the cM ranges. Please report back on your experience, and/or if you find a sweet spot for some range. Note that the sweet spot should include two generations – the one you’ve figured out and the next one you are working on.

[19P] Segment-ology: Walk The Clusters Back at AncestryDNA by Jim Bartlett 20251017

Boundaries of a Triangulated Segment Part 2

Featured

Posted on September 15, 2025 by Jim Bartlett

Thanks to all for your responses to my last blogpost. All of them are a good read.

I had always thought a TG segment was crystal clear… WRONG. Per the classic refrain from the Legal Genealogist, Judy Russell: “It depends!” My second ever blogpost on 9 May 2015 (Benefits of Triangulation) stated 16 benefits, including: Organizing most Matches into TGs; All Matches in a TG have the same Common Ancestor; the TGs define crossovers and a Chromosome Map; TGs are equivalent to Phased data. What I didn’t say explicitly is that each TG represents a segment of my DNA.

The elephant in the room is: who was the first Ancestor to pass down that segment (as part of a full chromosome passed to a child who is my Ancestor)? In other words, in what earliest generation did that full segment first exist in my line? There may be a different such “elephant” for each Match… but that’s another story.

So back to “it depends”…. For me there are 3 objectives:

1. “See” my DNA segments. Divide up my chromosomes into discrete segments, each one of which came from a specific Ancestor.

2. Determine the Ancestor for each segment.

3. Determine my Chromosome Map of segments – each segment being adjacent to another segment from the beginning to the end of each of my 45 chromosomes.

When I started forming Triangulated Groups, I only worked with known cousin Matches. It created a patchwork of TGs. One day I decided to bite the bullet and Triangulate all of my segments, a company at a time (FTDNA, 23andMe and MyHeritage). It took months without many of the tools we have today. And the three versions meshed virtually exactly! That was as expected since all comparisons were against my DNA. I was using the “full” version of a TG, plus some judgment for large segments from close relatives that spanned more than one TG. This brings me to a significant factor in Triangulation: Judgment.

Judgment: It’s easy to compare yourself to another Match and “see” an exact shared DNA segment. But what would happen if Match 3 in the last blogpost only overlapped Match 1 by 5cM? Would we then call this a 5cM TG (against the rules) and throw the whole thing out? Would we discard Match 3 (even if they had a robust Tree that included a CA)?

Judgment: Sometimes there is a close relative, Match 5, who overlaps much more than me and Matches 2 and 4. Experience (and judgment) tells me that this somewhat larger segment is probably from a close relative whose Common Ancestor with me includes a father/mother more distant – with one of them being the CA for the full TG.

As I read over the comments of the previous blogpost, several words pop into my mind: context, messy, complex, judgment, imprecise, etc., as well as “we’re making this up as we go”.

Messy – yes Triangulating all of our Match segments against our own can be messy – and judgment is needed. Given the random nature of recombination, I do see some curve balls from time to time. Triangulation usually identifies false (IBS) segments, which should be discarded. If I find a shared segment that really messes things up, I’ll also discard it (or at least highlight it as weird). As I’ve blogged before, the raw data is sometimes messy – or fuzzy – sometimes reporting a shared DNA segment that runs longer than it should. Although my parents are not related (per GEDmatch), I do have one area of my DNA where my two parents combined have all of the most common SNPs, and so I get a “zigzag” pileup of many Matches with false segments there. I’ve identified this area and then toss out those Match segments (<10cM). Pedigree collapse and endogamy also create messy areas. To the extent possible, identify these specific locations with a dummy segment to highlight the potential issue.

Context – in developing my Chromosome Map, the segments will be adjacent to each other. I look for the previous and the following TGs to the one I am working on. Ideally (and actually) each of my segments will “crossover” to the next segment which is from a different Ancestor of mine. Note – that “next” Ancestor may involve a different grandparent, or a different 3xG grandparent. We have to fill out the Chromosome Map to figure that out, but it is important to remember that the next TG will have a different CA. So if I accept the conservative TG (a part of the Match 3 shared segment), what different Ancestor can I find for all of the “leftover” shared DNA segment pieces of my DNA?

Complex – One complex part of this analysis is what about the parts of true segments from Match 1 and 2 and 4 that are not in the full TG I show in blue? I focus on my DNA, but I think every true Segmentologist should try this experiment with say Match 2 at GEDmatch. Use Segment Search to find other Matches who share the same segment and build the TG for Match 2 – it “will” be different than my (or your) TG. A little different or a lot different? If Match 2 is a known cousin, the same MRCA would almost always apply. By doing this with other Matches in a TG, many of us (working together) are building a larger segment of the CA.

Imprecise – I’ve blogged about fuzzy data. I counter this with judgment. I look at all the segment data for a TG (all my segments are in one spreadsheet). Among the TG fuzzy start data I decide on a specific Mbp start location. Then I decide on a Mbp start location for the next (adjacent) TG. Often some shared segments from the initial TG will “spill over”, past the start of the next TG. The small amounts of spillover, I just ignore: fuzzy data. If there is a large spillover, I’ll consider if the second TG is potentially closely related to the first TG, or not.

Imprecise – This also describes the fact that all your shared DNA segments may not “cover” all of your DNA neatly, or uniformly, or even completely. The shared DNA segments are independent and random – they are not at our beck and call… They don’t necessarily help us fill the gaps perfectly. They are what they are – they are clues we must use as best we can.

All of the above is to indicate that all IBD shared segments should have a home in a TG, and that all the TG segments should cover all of your chromosomes, IMO. Remember, at each generation, all of your segments from that generation must add up to all your chromosomes!

Another aspect of this which I muse about is the SNPs – thousands of them in a unique arrangement in my DNA. Let’s say Match 1 shares 2,000 SNPs with me. Alone we would say the shared DNA segment between us (green) came from a Common Ancestor. Similarly we would say the 3,000 SNPs in the shared segment with Match 2 were from a CA. I don’t see how we could argue that these two CAs were somehow different. I think it is much more likely that the CA is the same, and Match 1 just didn’t get the full segment that I did and Match 2 did. Match 3 is in the middle of all these SNPs – surely Match 3 got the same SNPs from the overlapping locations. By comparing the SNP values of all 4 Matches, I’m confident that we’d find the same values at each SNP location.

Note: all of these Matches and evaluations are based on separated cousins. Of course close relatives could have the same segments and SNPs – the whole concept of segment Triangulation depends on an analysis of more distant relationships.

My summary:

The TG Group of Matches should all look for the same Common Ancestor – and hopefully help each other toward that goal.

The full TG segment (blue) is my DNA segment, which I can use as part of my Chromosome Map. It defines my crossover points. Also I can contribute my SNPs to any larger study of my Ancestor’s DNA.

I must be careful to not state that my Matches have this TG segment. Matches will have their own different, but overlapping, TG segment.

The Common Ancestor almost certainly passed down a larger DNA segment, through at least some of their children, which different descendants (including some of my Matches) got. Note: there may be other descendants who have DNA tested who may share with the TG Matches, but not me (I am not the center of the universe…)

[08Ab] Segment-ology: Boundaries of a Triangulated Segment Part 2 by Jim Bartlett 20250915

Boundaries of a Triangulated Segment

Featured

Posted on September 15, 2025 by Jim Bartlett

I presented “More Segmentology” today at the East Coast Genetic Genealogy Conference. I was questioned on a slide grouping segments into a Triangulated Group, and it appears there is a debate about this. I’d like to have your input on this.

Here is my slide:

I show 4 Matches with overlapping Shared Segments with me on one side (parent). This is the definition of a Triangulated Group, which I showed in the bottom Chromosome – in green. What we can “see” is only the Shared Segments from Matches 1 to 4 in green. I contend that Matches will rarely have segments that are exactly the same as my segment. So for the purpose of illustration, I guessed that their segments from our Common Ancestor were almost always different – that sometimes their segments started to the left of mine, sometimes to the right of mine; and sometimes the same ending, and other possibilities shown. In fact, I have tested this at GEDmatch where I could Triangulate with each Match as the base, and sure enough, they had their own, different, Triangulated segment. I went on to claim that my segment (from our Common Ancestor) started where the Shared Segments had their earliest start; and ended where the Shared Segments had their latest end – as shown in the green Triangulated Group segment above. The start and end of the TG defined my segment. Some others contended that the Triangulated Group segment should be shown as only the green that was common to all 4 Matches – like the space between the two vertical blue lines.
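The two competing boundary definitions are easy to state precisely. In this sketch the segment positions are made up for illustration (they are not the values on the slide), and the function names are hypothetical: `full_extent` is the earliest-start-to-latest-end view, and `common_core` is the region shared by every Match.

```python
# Hypothetical shared segments (start Mbp, end Mbp) for Matches 1 to 4,
# all assumed to be on the same side and already Triangulated
shared = {
    "Match 1": (40.0, 58.0),
    "Match 2": (36.0, 66.0),
    "Match 3": (45.0, 62.0),
    "Match 4": (38.0, 60.0),
}

def full_extent(segments):
    """Earliest start to latest end across all shared segments."""
    segments = list(segments)
    return min(s for s, _ in segments), max(e for _, e in segments)

def common_core(segments):
    """The region common to every shared segment, or None if there is none."""
    segments = list(segments)
    start = max(s for s, _ in segments)
    end = min(e for _, e in segments)
    return (start, end) if start < end else None

full = full_extent(shared.values())
core = common_core(shared.values())
print("full extent:", full)
print("common core:", core)
```

With these numbers the full extent is noticeably longer than the common core, which is the heart of the debate: the first treats the TG as my segment, the second only as the piece every Match demonstrably shares.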

I don’t know of any Scientific Paper that defines the boundaries of a Triangulated segment. So I am interested in your perspective, and why.

[08Aa] Segment-ology: Boundaries of a Triangulated Segment by Jim Bartlett 20250914

A New Cluster on the Block

Featured

Posted on July 25, 2025 by Jim Bartlett

AncestryDNA has rolled out an “auto” Cluster program. I tried it and got 8 Clusters, ranging from 3 to 9 Matches in each one. That is a total of 40 of my 60 Matches above 65cM. The other 20 Matches were not included because they didn’t form a Cluster of at least 3 Matches. I know the Common Ancestors for each of the 40 Matches and the program clustered them 100% correctly. I’d give AncestryDNA an A+ for this new program. I’m impressed and anxious to have the ability to adjust the cM ranges downward to get more Clusters.

Some additional input on auto-Clustering.

It began in late 2018, with Genetic Affairs (by EJ Blom), and soon we also had Shared Clustering (by Jonathan Brecher) and DNAGedcom Client (by Rob Warthen). I tried all three. I had already done segment Triangulation on all my Matches at FamilyTreeDNA, and I worked with Jonathan Brecher and we Clustered those same Matches. There was over 90% concurrence between the hundreds of Clusters and the hundreds of Triangulated Groups. Not enough to say the two processes were equivalent (they are not), but certainly this analysis showed a strong tendency of Clusters to point to a Common Ancestor between me and all the Matches in each Cluster. A very strong clue in each case.

I then Clustered all of my Matches at AncestryDNA – down to about 18cM. Many of the Clusters had a Common Ancestor consensus (easily seen in the Match Notes I had previously entered – many from ThruLines). So, I imputed that Common Ancestor to the rest of the Matches in each Cluster. I used Ahnentafel numbers to represent my Ancestors and developed a tagging code: e.g. #A0020. The #A means a confirmed Common Ancestor with a Match, and 20 is Ahnentafel for William MITCHELL 1824-1895. This code is the first thing in the Notes field. When I impute a Common Ancestor to a Match from a Cluster consensus, I use #L0020 – which means the Match is highly Likely to have that Common Ancestor with me. With a #A or a #L, I tagged almost all my Ancestry Matches over 20cM and many below that. This was in the 2019-21 time frame.

Recently, with ProTools, I’ve been able to determine how many more Matches fit into my Tree – and thus our Common Ancestor. For well over 90% of all these new Match cousins, the #L tag turned out to be correct – I only needed to change the L to A.
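If the Notes are ever exported to a spreadsheet or text file, the tagging code is simple to read and update with a small script. A sketch, assuming the tag is always the first thing in the note and always has four digits as in the #A0020 example; the function names are illustrative.

```python
import re

# "#A" = confirmed Common Ancestor, "#L" = Likely; four-digit Ahnentafel assumed
TAG = re.compile(r"^#([AL])(\d{4})\b")

def parse_tag(note):
    """Return (status, ahnentafel) from the start of a note, or None if untagged."""
    m = TAG.match(note)
    if m is None:
        return None
    return m.group(1), int(m.group(2))

def confirm(note):
    """Change a leading #Lnnnn tag to #Annnn, leaving the rest of the note alone."""
    return re.sub(r"^#L(\d{4})\b", r"#A\1", note)

print(parse_tag("#A0020 William MITCHELL 1824-1895"))
print(confirm("#L0020 imputed from Cluster consensus"))
```

Because the tag sits at the start of the note, sorting or filtering on the Notes field groups Matches by status and then by Ancestor.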

Bottom line 1: I am a big fan of Clustering at AncestryDNA and really look forward to expanding the coverage to more Matches.

Bottom line 2: Use ProTools with Clustered Matches to really nail down Common Ancestors to Matches.

[22DI] Segment-ology: A New Cluster on the Block by Jim Bartlett 20250725

Segment Triangulation Insight

Featured

Posted on May 25, 2025 by Jim Bartlett

Your DNA segments are from your Ancestors. They are adjacent to each other and fill up (or “cover” or paint) each of your Chromosomes. You have shared DNA segments with your Matches. With a browser, you can see your shared DNA on a chromosome – visually as a bar and by the start and end points in the data. Segment Triangulation lets us group overlapping segments and identify your full segment from an Ancestor. It also places each Triangulated segment where it belongs on one of your 46 chromosomes. Genealogy helps you decide if each segment is on a maternal or paternal chromosome. Once you do that, it’s then relatively easy to “fit” the Triangulated segments along each chromosome.

Three key elements of Segment Triangulation:

1. A browser to give you the data – where is each segment on a chromosome.

2. Determine that the segments are on the same chromosome (you have two of each chromosome – one maternal and one paternal). Several ways to do this…

3. Determine where one of your segments stops and another starts – i.e. the crossover points. A judgment call based on the consensus of the data.

A fourth key element is determining the MRCA for the Triangulated segment, and the path the segment took from the MRCA down a line of your Ancestors to a parent to you. This is mainly a genealogy task, working with your Matches and their Trees to build a consensus.

I hope this “insight” provides a clearer picture of what Segment Triangulation is all about and why it is a worthwhile process – for specific segments or all of your DNA.

[08F] Segment-ology: Segment Triangulation Insight by Jim Bartlett 20250525

Half-Identical Region (HIR)

Featured

Posted on May 21, 2025 by Jim Bartlett

Your DNA segments (that make up the 23 Chromosomes passed down to you from a parent) are not the same as shared DNA segments with a Match (as described by a chromosome browser) aka a Half Identical Region (HIR). All of your DNA is real, down to any size you want to analyze. This is not necessarily so for a shared DNA segment (or HIR)!

From the ISOGG Wiki: A half-identical region (HIR) is a region of two paired chromosomes where at least one of the two alleles from one person’s pair of chromosomes matches at least one of the two alleles from a different person’s pair of chromosomes throughout the entire region. A half-identical region may be either identical by descent (IBD) or identical by state (IBS).

In my words, for genetic genealogy, a computer compares your DNA test to a potential Match’s DNA test. The computer compares the two raw DNA data files – about 600,000 SNPs with two values (alleles) for each SNP. The two values are one from the DNA passed down from the father and one from the mother. The computer is looking for a long string of matching SNPs, which are then reported as a shared DNA segment. This meets the HIR definition above – at least one value is the same at each SNP in the shared segment. The theory is that, although much of our DNA will be the same, there is some variation, and a long enough string of matching SNPs will indicate this segment of DNA is from a Common Ancestor. This also implies that the long string is on one side – on one chromosome from our mother OR our father. A lot of reported genetic data indicates that such an HIR is true when it’s at least 15cM.

But why aren’t all shared DNA segments true? Because the computer algorithm blindly looks at *both* values at each SNP for you and the potential Match. The computer may create a string of your SNPs that agree with your potential Match’s SNPs, but some are from your father and some from your mother. Clearly this “zig-zag” result, using SNPs from both your parents’ DNA, is not a representation of your DNA on one chromosome. It’s not a DNA segment passed down from one of your parents to you. It’s a false segment! Or this might have happened with your potential Match’s data, or with both of you. Bottom line: wherever the “zig-zag” occurred, the shared DNA segment is false.
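The “zig-zag” effect can be illustrated with a toy sketch. This is only an illustration under simplifying assumptions, not any company’s actual matching algorithm: each person is modeled as two known parental strands of 8 SNPs (real comparisons use far more), and the function names `genotype`, `half_identical`, and `shares_haplotype` are hypothetical.

```python
def genotype(hap1, hap2):
    """Unphased test data: the unordered pair of alleles at each SNP."""
    return [frozenset(pair) for pair in zip(hap1, hap2)]


def half_identical(geno_a, geno_b):
    """HIR test: at least one allele in common at every SNP."""
    return all(a & b for a, b in zip(geno_a, geno_b))


def shares_haplotype(haps_a, haps_b):
    """True-segment test: one entire parental strand is the same in both people."""
    return any(ha == hb for ha in haps_a for hb in haps_b)


# Toy data (assumption): 8 SNPs, each with allele A or G.
me = ("AAGGAAGG", "GGAAGGAA")           # (paternal strand, maternal strand)
true_match = ("AAGGAAGG", "GAGAGAGA")   # carries my paternal strand intact
false_match = ("AGAGAGAG", "AGAGAGAG")  # carries neither of my strands

for name, other in [("true_match", true_match), ("false_match", false_match)]:
    hir = half_identical(genotype(*me), genotype(*other))
    real = shares_haplotype(me, other)
    print(name, "| half-identical:", hir, "| same strand:", real)
```

Both comparisons pass the half-identical test, but only the first reflects a strand that was passed down intact. In the second, the “match” is assembled by alternating between my paternal and maternal alleles, which is the false segment described above.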

The good news is that this “zig-zag” result doesn’t occur with long enough segments – over 15cM. And it occurs very infrequently with 14cM shared DNA segments. Below that, there is a rough distribution curve – probably different for each of us – along which the share of false segments rises until about half of our 7cM segments are false. And most shared DNA segments are false below 7cM – which is why they are generally not used. Some of the companies use other, proprietary, algorithms to discard (not report) some of these false Matches. Also, as I’ve blogged before, Triangulated Groups are very good at culling out the false segments.

This also ties into the ISOGG terms: Identical By Descent (IBD) and Identical By State (IBS), noted above. IBD would apply to true shared DNA segments – you and your DNA Match got the shared DNA segment from a Common Ancestor. IBS means the computer found a “match”, but IBS is usually used in genetic genealogy to indicate the false segments. I usually just stick to “true” and “false” shared DNA segments (or HIRs).

Another quirk in this discussion is using the term HIR to refer to a shared DNA segment. This is proper and OK. But, an HIR only refers to a shared DNA segment between you and one particular Match. We virtually never find exactly the same HIR with two Matches (although it’s possible with Matches who are closely related to each other.) When we look at segment Triangulation, the Triangulated Group is comprised of different HIRs. So HIR should not be used to refer to a TG. A TG represents a segment of your DNA (from a specific Ancestor) – there are many different HIRs in a TG. And each Match in a TG would have a different (but overlapping) segment from the Common Ancestor, with different HIRs. Because the whole process is so random, we just don’t get the same segments from our Common Ancestors that our Matches get.

Bottom Line: A shared DNA segment is also an HIR – formed by a computer by comparing raw DNA test data (about 600,000 SNPs) with two values (alleles) for each SNP. Shared DNA segments over 15cM are all true (IBD); below 15cM some are false (IBS). A shared DNA segment (aka HIR) is usually unique to a specific Match.

[22DH] Segment-ology: Half Identical Region by Jim Bartlett 20250521

HAPPY 10TH ANNIVERSARY

Featured

Posted on May 7, 2025 by Jim Bartlett

10 years ago, I blogged: “What is a segment?”, and noted the difference between an ancestral segment (your DNA segment) – passed down from an Ancestor to you; and a shared segment (created by a computer algorithm) which usually indicates a Common Ancestor for both you and your Match.

This is still the fundamental concept that is key to genetic genealogy.

We’ve looked at a lot of twists and turns based on this concept…

– How segments are measured

– Why the data is a little fuzzy, but that doesn’t negate its power

– How our DNA is passed down in identifiable segments from our Ancestors

– How each generation of our Ancestors contributes two full genomes (46 Chr) to us

– Why some of our segments must be sticky (persistent) for multiple generations

– How we “see” our own segments through shared segments

– How we can map (or paint) our segments on our chromosomes

– How shared segment “size” predicts relationships

– How we can group Matches by segment Triangulation or shared Match Clusters

– How we can use groups to solve brick walls, NPEs, Bio-Ancestors, unknowns

– Which ancestors always, or sometimes, or never have shared Matches

– Why all of our shared segments (6cM and up) may be important to us

– How to Walk Ancestors, Clusters, Segments back in our genealogy

– How spreadsheets can help us collect, arrange, analyze, QC, and use data

– How to use new tools: autoClustering, DNA Painter, browsers, ProTools, etc.

You have all been part of this journey of learning – as in fact, we are all learning from each other. I very much value your feedback and suggestions.

As some of you know, I also host a DNA Special Interest Group (SIG), through the Washington DC Family Search Center. It was in person/local until Covid. We are now international via Zoom – 2nd Wednesday of each month 7-9pm ET. This is now an Advanced DNA SIG, and members are encouraged to participate and/or present (learn from each other). If you’d like to join, please email me at jim4bartletts@verizon.net

Happy Anniversary – your suggestions/observations/comments are “gifts” to us all.

[99F] Segment-ology: Happy 10th Anniversary by Jim Bartlett 20250507

SPECIAL ANNIVERSARY COMING UP

Featured

Posted on April 22, 2025 by Jim Bartlett

My first real Segmentology blog post was on 7 May 2015 – so an anniversary is coming up soon. I’m looking to consolidate and re-package the approximately 200 posts in Segmentology. If you would like any new or revised topics included, please feel free to use the comments or email me at jim4bartletts@verizon.net. NOTE: The Table of Contents (Outline in the header bar) has been updated, and all the posts are hyperlinked.

[99E] Segment-ology: Special Anniversary Coming Up by Jim Bartlett 20250422

MITx Class on DNA is Free

Featured

Posted on February 28, 2025 by Jim Bartlett

MITx offers a wide range of free, on-line, self-paced semester-long courses to anyone in the world. Coming up next week is Introduction to Biology – The Secret of Life. I’ve taken this course (actually twice). It’s taught by Professor Eric Lander – the founding director of the Broad Institute and a principal leader of the Human Genome Project – and a fantastic instructor (his course is fun). This course is targeted at non-biology students. This is not about genealogy, it’s about DNA. Anecdote: I was about halfway through the course, and one night my wife called out: “Jim, what are you doing – it’s 3 AM.” My reply: “I’m in a lab, folding proteins to capture a virus”. If you are into DNA and Segment-ology, this is a great opportunity to get a firm grounding. As a side note, I think MITx is a great undertaking and am a regular donor to that program. Free, world-wide MIT classes…

Here is a link: https://www.edx.org/learn/biology/massachusetts-institute-of-technology-introduction-to-biology-the-secret-of-life

Click on the short YouTube video… Enjoy.

[99D] Segment-ology: MITx Class on DNA is Free by Jim Bartlett 20250228

ProTools Part 25

Featured

Posted on February 22, 2025 by Jim Bartlett

The Path Is Key

This may be an extension of my “genealogy sacrilege” outlook or rant.

But before I begin, to each their own – you get to choose your objectives.

My two main objectives are to get my genealogy right; and to get the Chromosome Map of segments from my Ancestors at each generation right. My objectives do not include finding all of the descendants of all of my Ancestors. However, I do think that documenting how my DNA Matches interrelate to me and each other is very helpful in achieving my two objectives – and this swells my Tree somewhat. I’m finding: Match paper trail paths (and ThruLines clues) that are impossible, given the DNA evidence; and DNA evidence that has revealed genealogy paths I never would have otherwise found (not just limited to breaking through brick walls).

So, a lot of work to do to document what will be over 10,000 Matches… Time is precious…

When documenting DNA Matches and their line of descent from our MRCA to them, the “Path Is Key”. Dotting all of the “i”s and crossing all the “t”s is NOT! The DNA segments do not “know” their hosts’ names (or dates, or places), just that the segments are passed along. We genealogists document what we can about each of these Match ancestor DNA hosts. It helps us keep track – in time and place. But how much effort do we need to put into documenting our Matches’ lines? My opinion is: not much! We need to be sure of the path. We don’t need to know the full names, or pet names, or titles. It’s nice to know the birth/death years, but how much digging should we do to find the complete birth date or place? What do we do when several different descendants insist on different given names … I could go on and on, but I’ve decided it’s not my job to adjudicate their family “wars” – my objective is to be clear of the path.

Therefore, I’m now using terms like Pvt, Unknown, GUESS, sibling of XYZ, etc. to describe Match Ancestors – particularly those close to the Match. I don’t really care about their parents’ or grandparents’ names or genealogy info – just the path that must exist for a DNA segment. [NB: proving a specific genealogy-DNA link is a separate issue; a potential path is not a proven path.]

I am still documenting the child and grandchild of the MRCA (given name and birth year at least). But, IMO, the further down the path from the MRCA to the Match, the less precise this info needs to be. The Key Is the Path. I don’t want to introduce incorrect info, so I’m introducing “other” terms in the name field when it is unclear, in debate, or might take days to research and resolve. I note the “path” that has to be and move on. This allows me to get as many DNA Matches as possible into the spreadsheet. Then the interrelationships can be better evaluated.

SUMMARY: Don’t worry about “fully” documenting the MRCA-to-Match path; just that the path does exist, and no incorrect info is introduced (unless your Tree is private). And, of course, it’s up to your own judgment as to if/how much of this recommendation to follow. My plan is to get as many Matches as possible into MRCA family groups in a spreadsheet, and then study the interrelationships with ProTools. Get Matches in my Tree and my Common Ancestor spreadsheet, but “do no harm”.

[22DG] Segment-ology: ProTools 25 – The Path Is Key by Jim Bartlett 20250222

ProTools Part 24

Featured

Posted on February 21, 2025 by Jim Bartlett

Small Segment Stats

Ancestry DNA Matches who share 6-7cM and have a known MRCA with me: 1,160.

Total Ancestry DNA Matches at any cM level: 7450.

About 15% of my DNA Matches with a known MRCA share only 6-7cM.

This is NOT a statement linking DNA and Ancestors.

This IS a statement about the many true cousins we will not see in our Match lists because the current threshold at AncestryDNA is 8cM.

I’m glad I Dotted and saved some of my 6-7cM Matches when Ancestry made the threshold change – it was a fraction of the total. I wish I’d have saved them all…

To end on a higher note – I still have 2,600 other 6-7cM Matches to work with – many of them are being determined as close cousins to known MRCA Matches by using ProTools.

[22DF] Segment-ology: ProTools Part 24 – Small Segment Stats by Jim Bartlett 20250221

ProTools Part 23

Featured

Posted on January 31, 2025 by Jim Bartlett

Integrating With Genealogy

ProTools is a powerful tool. But it has its limits. 1C and closer relationships are very accurate, in my experience. Beyond that, the range of possibilities grows quickly as the cMs fall below the 1C range. But think about what that means… A 1C relationship takes us back to our grandparent level. Think of a 20 year old genealogist with a 50 year old parent, and 80 year old grandparents. Those grandparents would be in the 1950 census. And the census is a pretty good tool back to 1850 – another few generations. You might argue that the census is not rock solid in every case. There may be adoptions, NPEs, etc. That is true, but those individuals will not show up as DNA Matches – for the most part.

Yes, there are still a few situations that may slip through. But on the plus side, the census and ProTools will sort out a high percentage of false relationships, and/or incorrect genealogy “research”.

Used together, the census and ProTools can pretty accurately cover the past 175 years.

[22DE] Segment-ology: ProTools 23 – Integrating With Genealogy by Jim Bartlett 20250131

ProTools Part 22

Featured

Posted on January 19, 2025 by Jim Bartlett

A Rant about Relationships

I praise Ancestry for ProTools – just about everything about it is great. I have often reported how accurate the close Relationship Estimates are. I rely almost 100% on 1C and closer relationships; and have found many 2C relationships to be correct. I worked for several days on a 3C relationship – knowing the Trees of the two Matches pretty well – to no avail. This is becoming a regular occurrence.

I’ve noted over the past year, Ancestry has tightened up their Relationship Estimates – all are now within 4C. We can tag a Match at 4C or closer, or Distant. A far cry from the Circles where Ancestry showed us how we were related out to 8C; or even the current ThruLines out to 6C. Will they change again, tomorrow, to only showing Matches related within 4C or closer? I am long since past that threshold…

So I decided to take a deeper dive, under their hood, to see what they predicted for small cM Matches. I randomly selected a 6cM Match that I had saved. She was predicted to be Half 3C1R or 4C – evidently their deepest estimate. So I clicked on that estimate to get their more in depth analysis. Here are two screenshots of their analysis [sarcasm: based on results from their 27 million testers?]:

It seems to me they have adopted the “Cinderella Principle” – push hard to fit the data into a desired result. Are they really claiming that 99% of all Matches at the 6cM level are a 4C or closer? The Ancestry folks are much smarter than that… They know better, and, for some reason, AncestryDNA is distorting the truth! SHAME! Our tens of thousands of small cM Matches do not fit into a size 4C Cinderella slipper!!

Bottom lines: still rely on 1C or closer relationships for analysis with ProTools; IMO, beyond 2C, treat the estimates as garbage; let me/us know if you have some insight that I’m missing (other than something related to greed).

[22DD] Segment-ology: ProTools 22 – A Rant About Relationships by Jim Bartlett 20250119

Pro Tools Part 21

Featured

Posted on January 9, 2025 by Jim Bartlett

Adding a GUESS

Setup

gk (Match1) is known 5C1R – with grandmother: Anetta b 1926 m SURNAME1 > father: private Male > gk; AND gk has 10 known 2C to Anetta’s father (in the line going back to our MRCA).

Justin (Match2) shares 898cM (estimated 1C) to gk; and has a very small Tree of Private Ancestors.

Analysis

To be a 1C to gk, Justin would need to share grandparents with gk – either gk’s paternal grandparents or gk’s maternal grandparents. From the setup (above), we know the maternal grandparents are SURNAME1 and Anetta b 1926; we don’t know (but can often find) gk’s paternal grandparents. In this case there wasn’t enough info in Justin’s Tree to help.

However, there is another way to determine which set of grandparents Justin descends from. If he descends from Anetta’s side, Justin would also be 2C to the 10 known 2C that gk has (NB: all 2C match each other). If Justin descends from the other grandparents of gk, it is highly likely that Justin will NOT share any of the 10 known 2C to gk. A quick look at Justin’s Shared Match list, shows he matches ALL of the same 2C that gk has. Justin is clearly a 1C to gk on gk’s maternal side – which is the side back to the MRCA with me!

Therefore, I am very confident in adding Justin to my Tree with UNKNOWN parent and KNOWN grandparents: SURNAME1 and Anetta b 1926. The rest of the path gk has back to our MRCA is already in my Tree.
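The Shared Match check used in this analysis is essentially a set comparison, and can be sketched as below. This is a minimal illustration, not a Pro Tools feature: the function name and the Match labels are hypothetical, and the Shared Match lists are assumed to have been copied out by hand.

```python
def fraction_of_known_cousins_shared(shared_match_list, known_cousins):
    """Share of the known cousins that also appear in a candidate's Shared Match list."""
    known = set(known_cousins)
    if not known:
        raise ValueError("need at least one known cousin to compare against")
    return len(known & set(shared_match_list)) / len(known)


# Hypothetical labels standing in for the 10 known 2C on the maternal side.
known_2c = [f"2C_{i:02d}" for i in range(1, 11)]

# A candidate 1C whose Shared Match list contains all 10 (plus unrelated others).
same_side = known_2c + ["other_1", "other_2"]
# A candidate 1C on the other grandparents' side would be expected to share none.
other_side = ["other_3", "other_4"]

print(fraction_of_known_cousins_shared(same_side, known_2c))
print(fraction_of_known_cousins_shared(other_side, known_2c))
```

A result at or near 1 points to the same grandparent couple; a result at or near 0 points to the other side. Anything in between calls for judgment (endogamy, pedigree collapse, or a wrong 1C estimate).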

This places another Match into my Common Ancestor spreadsheet and into my Tree. It takes this Match off the list of unknown (aka Mystery) Matches. In Shared Match lists, Justin will now show up as a known (Dotted) Match – reinforcing Clusters. I don’t know if Justin’s addition to my Tree will help AncestryDNA with future ThruLines evaluations, but I hope so. I *know* it will help me.

A similar analysis can be made for a Pro Tools estimate of 1C1R or a 2C, but it gets less reliable with each additional degree of separation. There is also a higher degree of difficulty in the analysis, because the certainty of the cousinship estimate is not as assured and the number of possible alternatives that need to be addressed increases. It’s often not impossible, but it is harder. A strong factor is whether a *candidate* Match shares a lot of the same Shared Matches. In other words, if the candidate Match clusters with a lot of the same Shared Matches (which can be observed in the Shared Match list), to me that is a strong indication that candidate Match has the same MRCA. This needs to be tempered with endogamy or pedigree collapse – judgment is needed in those cases.

[22DC] Segment-ology: Pro Tools Part 21 – Adding a GUESS by Jim Bartlett 20250109

Pro Tools Part 20a

Featured

Posted on January 1, 2025 by Jim Bartlett

A Plan and some TIPS (corrected)

At the end of 2024, I wanted to review my Plan for using Pro Tools (and filing in a Common Ancestor Spreadsheet) and highlight some TIPS.

For the long haul – addressing all of your genealogy using Pro Tools – make a Plan! Perhaps a New Year’s Resolution…

I now think the best plan is to start with the closest Ancestors and work back a generation at a time.

That is, start with your grandparents – two grandparent couples [Ahnentafels 4 and 6]. The Matches at this level would nominally be 1C to you – maybe some “removed” – like a 1C1R or 1C2R – particularly as we get older:>( There are only two groups at this generation – one on the paternal side and one on the maternal side. So, two CA-Couple headers in the CA Spreadsheet. For each row under a header row, enter the Match information (name, cM, # segments, cousinship) and then the child of the CA and their birth year, and then the path to the Match.

TIP1: for each, and every, Match I list, I use Pro Tools to show *their* closest Matches – these are often close Shared Matches to them that can be figured out even if the SM has no Tree.

TIP2: for each Match I list, I add them (and their path to the CA) into my Tree (and apply the DNA-connection and/or DNA-Match Tags). I don’t know if the Tags help AncestryDNA build Trees or determine ThruLines; but it does help me when I run across them days/weeks/months/years later. Not necessarily a *certification*, but at least a reminder that I’ve reviewed the path before.

TIP3: Fill in some Notes for the Match – I always start with my CA code – example: #A0064P [the A means I’m satisfied the Ancestor is correct; the # is a holdover from the days we searched for unique strings; and the 0064P is Ahnentafel 64 on my Paternal side]. [In a DNAGedcom Client Spreadsheet Report, I can sort on the Notes column, and they will group in order.]

TIP4: I Star & MRCA Dot & Tagged-in-my-Spreadsheet Dot each Match – this unique Star-Dot-Dot “trio” clearly highlights Shared Matches who are already in my CA Spreadsheet. In a Shared Match list they help identify a Cluster.

TIP5: Each of the Matches under an MRCA Couple at this generation should match each other. They are 1C, 1C1R, 1C2R to you and each other, and all should Match. A Quality Control Check. [NB: I am tempted to add in any Aunt or Uncle Matches to my Spreadsheet; but they may be close to the Match, but not on the path to my Ancestor – when that happens they won’t have close cM ties to the other Matches.]

TIP5: I have a separate column in my CA Spreadsheet to indicate I’ve done all of the above. I’ve got about 8,000 Match rows in my spreadsheet and I’m reviewing each one to make sure I’ve covered all of the above and then check it off. As it turns out, some have changed their Trees, some have dropped out of Ancestry, Ancestry continues to update ThruLines, etc., etc. This checkoff indicates a fresh update.
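The CA code in TIP3 sorts well because the Ahnentafel number is zero-padded. Here is a small sketch of how such a code could be split apart and sorted. It assumes the layout shown in the example (#, one status letter, four digits, one side letter), and it assumes M would be the Maternal counterpart of P, which is not stated above; `parse_ca_code` is a hypothetical helper.

```python
import re

# Assumed layout: '#', status letter, 4-digit Ahnentafel, side letter (P, or M by assumption).
CA_CODE = re.compile(r"^#(?P<status>[A-Z])(?P<ahnentafel>\d{4})(?P<side>[PM])")


def parse_ca_code(note):
    """Return the parts of a leading CA code in a Note, or None if there is none."""
    m = CA_CODE.match(note)
    if m is None:
        return None
    return {
        "status": m["status"],
        "ahnentafel": int(m["ahnentafel"]),
        "side": m["side"],
    }


notes = ["#A0128P via son John", "#A0016P", "#A0064P 5C1R"]  # made-up Notes

print(parse_ca_code("#A0064P"))
print(sorted(notes))  # zero-padding makes a plain text sort follow Ahnentafel order
```

Without the padding, a text sort would put #A128 ahead of #A16, so the four-digit form is what keeps the Notes column grouped in Ancestor order.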

Time now to tackle the four Great grandparent couples [Ahnentafels 8, 10, 12, & 14].

Repeat the steps above for each of your Ancestor couples.

Note that TIP5 still applies – under each couple the Matches are 2C (or 2C1R, 2C2R, or maybe 1C1R) with you. These nominal 2C should all be close cousins to each other (sharing large amounts of cMs).

At any point in this process, take a break and chase down a rabbit hole or two. But then come back to this methodical process.

TIP6: Using this process makes us treat all of our Ancestors equally. I tend toward favorite Ancestor lines, and this process forces me to grind through all of the Ancestors and Matches. It’s a good thing.

A slight change occurs at the next generation [eight 2xG grandparent couples; 3C level; A 16 – A 30] At this level, TIP5 breaks down a little.

TIP7: Reminder – 2C-100%; 3C-90%; 4C-50%; 5C-15%; 6C-5% – (roughly)… This is the “curve” indicating how often true cousins will be a DNA Match to each other. ALL true 2C will be a DNA Match to each other. Of 10 true 3C, each one will usually have a DNA match with only about 9 of the others; but each of the 10 will have about 9 of the others matching – so these 10 would still form a pretty strong Cluster… Among a group of 4C, each one will only match about half of the total; and they may not all form one, strong/compact Cluster. And it gets worse, at the 5C and 6C levels… – some interconnecting cousin Matches, but not strong Clusters. However, now with Pro Tools we can find groups of strongly interconnected (closely related) Matches – strong ties to each other, but perhaps their strong subgroup is 5C to 8C with you.

At the 4C level, I see interconnected groups around the children of each grandparent couple; and sometimes a few interconnections between children. At the 5C level, as expected, I’m seeing groups (Clusters) form on the grandchildren.
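The effect of the TIP7 curve on Clusters can be put into rough numbers. The sketch below takes the percentages in TIP7 literally (they are only rough) and treats each pair of cousins as matching independently, which is a simplifying assumption; the function names are hypothetical.

```python
# Rough chance that two true cousins of a given degree are a DNA Match (from TIP7).
MATCH_RATE = {2: 1.00, 3: 0.90, 4: 0.50, 5: 0.15, 6: 0.05}


def expected_matches_per_cousin(degree, group_size):
    """Average number of the other cousins in the group that one cousin matches."""
    return MATCH_RATE[degree] * (group_size - 1)


def expected_links(degree, group_size):
    """Average number of matching pairs among all possible pairs in the group."""
    pairs = group_size * (group_size - 1) // 2
    return MATCH_RATE[degree] * pairs


for degree in sorted(MATCH_RATE):
    per_cousin = expected_matches_per_cousin(degree, 10)
    links = expected_links(degree, 10)
    print(f"{degree}C: each of 10 cousins matches about {per_cousin:.1f} "
          f"of the other 9; about {links:.1f} of 45 possible links")
```

For a group of 10, the 3C row stays close to fully connected, the 4C row keeps about half of its links, and the 5C and 6C rows are left with only a few scattered links, which is why strong Clusters stop forming at those levels.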

Additional TIPS

TIP8: multiple marriages; non-marriages: IF you and a Match only share DNA through one Ancestor, then your relationship is “Half”. Pro Tools often includes cMs for Half relationships, but these only apply when you share only one Common Ancestor.

TIP9: Some Matches may be related to you multiple ways – give them a separate row (and Ahnentafel #) for each relationship. NB: If you are 3C on A16 and on A18 the odds are equal – with one segment, it could be either; with multiple segments, it could be both… However, if you are 3C on A16 and 4C on A38, with one segment, the odds are 4 to 1 that the DNA came from A16; and if you are 3C on A16 and 5C on A76, the odds are 16 to 1 that the DNA came from A16. This is because *shared DNA* is divided by 4 with each generation, on average. If you have shared DNA with a Match, it’s much more likely to be from the closest relationship.
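The odds in TIP9 follow from the divide-by-4 rule, and can be worked out from the Ahnentafel numbers alone. This is a minimal sketch of that arithmetic for a single shared segment; the function names are hypothetical, and the cousin level is the nominal one for each Ancestor generation (A4-A7 is 1C, A8-A15 is 2C, A16-A31 is 3C, and so on).

```python
def cousin_degree(ahnentafel):
    """Nominal cousin level for a Common Ancestor's Ahnentafel number."""
    if ahnentafel < 4:
        raise ValueError("cousin levels start at the grandparents (Ahnentafel 4)")
    # bit_length() - 1 is the generation (4-7 -> 2, 8-15 -> 3, ...); cousin level is one less.
    return ahnentafel.bit_length() - 2


def odds_for_closer(ahn_a, ahn_b):
    """Odds (N to 1) that one shared segment came via the closer relationship."""
    gap = abs(cousin_degree(ahn_a) - cousin_degree(ahn_b))
    return 4 ** gap  # shared DNA drops by about 4x per cousin level


for other in (18, 38, 76):
    print(f"A16 ({cousin_degree(16)}C) vs A{other} ({cousin_degree(other)}C): "
          f"{odds_for_closer(16, other)} to 1")
```

This reproduces the three cases above: even odds for A16 vs A18, 4 to 1 for A16 vs A38, and 16 to 1 for A16 vs A76.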

Please post in the comments if you have good TIPs that would help us all.

Happy New Year!

[I fixed the error in Tip 7, and reposted]

[22DB] Segment-ology: Pro Tools Part 20a – A Plan and some TIPS by Jim Bartlett 20250101

Pro Tools Part 19

Featured

Posted on December 11, 2024 by Jim Bartlett

Comments on Sacrilegious Genetic Genealogy

I thought these comments were excellent and wanted to share them.

Guest Post from Terry Butcher dated 11 Dec 2024

In regards to your Pro Tools Part 16 Sacrilegious Genetic Genealogy post, I would like to share some thoughts on the topic.

While I appreciate the power that various DNA analysis techniques offer in identifying clusters of matches to specific common ancestors, my primary focus has always been about the genealogy side of the effort.

I feel that I need to connect my tree to each match to really have anything of value. I already accept that I am related to my matches (within the parameters you have described related to cM size). Being able to document the relationship and share it with my matches is my reward for investing the time and effort in researching them.

I try to make a connection with each match and approach each one as an opportunity to learn something new. Each match that I find a common ancestor for in essence validates that specific branch of my tree by having both a paper trail and a DNA match.

I add my match’s tree into my tree as I research them. I start by adding them as an unrelated person in my tree and start working back along their tree picking up all of their branches until I either find a common ancestor, hit a dead end or believe there is no longer any possibility because the location has gone back to Europe. It usually doesn’t take long to find most CAs. While researching a match, I usually only add parents and the child, ignoring the other siblings to save effort. However, if I am successful in finding our CA, I will usually go back and pick up the other siblings for several of the most recent generations.

I have been systematically working my way through my matches starting with closest related and have made it down to the 41 cM matches (about 2,000 so far). If the match has useful information in their tree, I have been successful about 90-95% of the time. In the past, I would contact matches without trees and offer assistance. Now with Shared Matches Pro I am able to find their close matches with trees and sometimes find a CA. This is a much welcomed capability that changes what is possible in my research. I have a total of 132k matches now with 11,500 marked as 4th cousin or closer. It would take me many, many years to even get through the 4th cousins and closer matches so I am not worried about running out of matches to research that I have an excellent chance of finding a CA.

For the 5-10% of my matches that I build their tree but can not find a CA, I suspect they may be either connected with 2 brick walls that I have at 3rd GGF or some unknown adoption or incorrect parent in my tree. Several of these unsolved CA matches now tie together in their trees and I am hopeful they will eventually result in solutions.

By working through my matches and incorporating their trees into my tree, I have expanded my tree significantly to over 222k people now. As nearly all of my ancestors have lived in WV since the early 1800’s, my tree is heavily weighted with WV families. I typically don’t have to add but a generation or two until I find my CA.

I am not concerned about having floating tree branches as I believe they will eventually connect into my overall tree. Anytime I encounter a common surname in my research, I chase it back until it connects with other members of that family which strengthens the connections in my tree.

I value the ability to generate family tree reports showing the relationship path between my match and myself and always share the typically one-page report with my match by saving it to my Dropbox folder and sharing a link in the message I send them.

Any match that I can connect to my tree to a CA has over 10k ancestors (and their descendants) with many up to 40k.

My approach over my 30 years of genealogy as a hobby has evolved as it has for most I suspect. As I research, I pick up as much information as I can including photos, obituaries and sometimes other records like draft registration documents, marriage and death certificates. All of these documents are incorporated into the detailed reports I generate whenever the person is included in the report which makes for some very interesting reading for my matches when I share reports with them. I find that Ancestry provides 98% of my information with a bit of help from the other sites whenever I hit a dead end in Ancestry.

[22DA] Segment-ology: Pro Tools Part 19 – Comments on Sacrilegious Genetic Genealogy by Terry Butcher 20241211

Pro Tools Part 18

Featured

Posted on December 10, 2024 by Jim Bartlett

Family Group Sheets

One of the key features of my Common Ancestor Spreadsheet (see post here) is that it offers an arrangement like a traditional genealogy Family Group Sheet (FGS). The FGS has an Ancestor couple at the top of the sheet, with a list of their children down the page with birth, death, marriage dates and places. If we are going to create an inventory of our DNA Matches with known links to an MRCA, this FGS spreadsheet format would be a great way to do that. It also turns out to be a handy tool when working with Pro Tools.

The Common Ancestor spreadsheet for Match cousins is actually a “nested” FGS. By sorting on Ancestor Ahnentafel Numbers, all the Matches connected to one Ancestor are grouped together. By also sorting on the birth year of the Ancestor’s children, this “FGS sort” results with Matches grouped under each child. By adding sorts on birth years for grandchildren and great grandchildren, we get a “nested” FGS. I regularly use my entire spreadsheet sorted by these four columns.
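The four-column “FGS sort” can be sketched outside a spreadsheet as well. The example below is only an illustration: the column names and the rows are made up, and unknown birth years are assumed to sort after known ones within their group.

```python
# Made-up rows; the column names are hypothetical stand-ins for spreadsheet columns.
rows = [
    {"match": "Match A", "ahnentafel": 38, "child_b": 1808, "grandchild_b": 1835, "ggchild_b": 1860},
    {"match": "Match B", "ahnentafel": 16, "child_b": 1850, "grandchild_b": 1880, "ggchild_b": None},
    {"match": "Match C", "ahnentafel": 38, "child_b": 1815, "grandchild_b": 1840, "ggchild_b": 1868},
    {"match": "Match D", "ahnentafel": 38, "child_b": 1808, "grandchild_b": 1830, "ggchild_b": 1855},
]


def fgs_key(row):
    """Sort by Ancestor, then by birth year of child, grandchild, great grandchild."""
    years = (row["child_b"], row["grandchild_b"], row["ggchild_b"])
    # (is_unknown, year) puts rows with an unknown year after the known ones.
    return (row["ahnentafel"],) + tuple((y is None, y or 0) for y in years)


nested = sorted(rows, key=fgs_key)
for row in nested:
    print(row["ahnentafel"], row["child_b"], row["grandchild_b"], row["ggchild_b"], row["match"])
```

After the sort, all Matches under one Ancestor sit together, and within that Ancestor the Matches under each child (and then each grandchild) sit together, which is the nested Family Group Sheet layout.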

This arrangement has several advantages when using Pro Tools…

1. When Pro Tools indicates a parent/child or sibling relationship to an existing Match (already entered into the spreadsheet), I can create a new row and copy most of the info and just adjust one column – a real time saver. And this works even with new Matches with No Tree, Private Tree, Unlinked Tree, Scrawny Tree, even small cMs – Pro Tools has already provided all the relationship information needed.

2. When Pro Tools indicates a (full) 1C relationship to an existing Match, this limits the relationship possibilities to only two. [In my experience, 1C estimates are highly accurate.] Analysis: the new Match is connected to the existing Match (already in the spreadsheet) on (1) the same side I am on, or (2) on the other side. Be aware of this! If the new Match is on the “other” side, they are NOT part of this Ancestor (Ahnentafel) line. If the new Match has any info in a Tree, this “side” issue can usually be figured out and the spreadsheet cells filled out (mostly by copying from the existing Match). If there is no Tree info, the “side” can usually be determined by looking at the Shared Matches of the new Match (sorted on new Match’s cMs). There should be a clear consensus (at/near the top of that list) of the same Ancestor line as the existing Match. If not, then skip this new Match. If so, I add a row for the new Match, copy data from the existing Match, and enter GUESS for the new Match parent (as a sibling of the existing Match parent), and then the new Match [NB: to save typing, I indicate each “terminal” Match as an asterisk (*) because they are already spelled out in the Match column near the beginning of the row.]

Analysis summary: A) look at their Tree; and/or B) look at their closest SMOMs.

3. For a 1C1R or 1C2R the estimates are still very good, and the process above can be used. Use available info or judgement to shift the new Match to the right or left per the “removes”. Where the individuals are not known, just put Unknown or Private in the cell. The complete path down to the Match is not critical, IMO.

4. When Pro Tools indicates Aunt/Uncle or Niece/Nephew, that too is highly accurate, as are the genders. Similar to the above, there is usually enough information to place them in the spreadsheet (which is like a horizontal Tree).

5. Pro Tools often includes a Half relationship in its estimate. This is based on tables showing that the two relationships listed have almost exactly the same cM range. Although technically correct, it is much more likely, IMO, that the relationships are standard (NOT Half). But a few will be Half, so watch for that situation. Remember these Pro Tools cMs are between your Match and the Shared Match (not affected by whether or not you have a Half relationship with the Ancestor).

6. Adding a hitherto unknown child branch – best described by a recent example I had. In looking for my A38 (ALLEN ancestor) cousins, I found a bunch descending from four well documented children of A38 – 56 Match cousins (4C, 4C1R, 4C2R and 4C3R) with an average of 18cM. There appear to be more than four children in the 1810-30 Virginia census records. And there was an old story about this family, that a son named William went west. So when some known Matches had some SMOMs with ancestor William H ALLEN born 1815 in VA and living in IL, I took notice – it seemed to fit. As I pushed it with Pro Tools I found (so far) 10 Matches descending from William H ALLEN averaging 20cM. But more importantly, those Matches also had Shared Matches with 12 of the 56 Matches from other children from this A38. It sure looked like a Cluster with gray cells to other Clusters! I’d really like to determine William’s Y-DNA; and/or some DNA segment data… But, in the meantime, I’ve got two of William’s descendants checking their Matches for links to my A38 ALLEN. There are 147 Trees at Ancestry for William H ALLEN – not a one has any good clue to his ancestry, except that he was born in VA. Not my Brick Wall, but I think there will be 147 happy campers.

A key point in this long story is that the DNA has no sense of geography. The facts that four children stayed in VA (and were well known) and one child moved far away made no difference to the DNA. From each descendant’s viewpoint, all the lines were equal – and a pretty even distribution of Matches showed up for all 5 children. The DNA is like blind justice.

7. Equality – a final thought is that this spreadsheet is a lot like the DNA – it’s relatively equal over all the Ancestors and descendants. This spreadsheet encourages me to treat all of my Ancestors equally (they each have an Ahnentafel placeholder row). I still have my “favorite” Ancestors, but as I methodically go through the spreadsheet, I’m spending time on each one. This includes the Ancestors that have issues… This spreadsheet also highlights the Brick Wall holes, to be plugged with floating family branches. This is a good thing.

To me, the key points in doing this spreadsheet work also include:

1. An inventory of Matches who have MRCAs with me. Separate from my on-line Tree. Saved in the cloud and/or archived – available to my heirs or selected genealogy archives someday.

2. Family Group Sheets – of sorts* – this is a standard genealogy tool.

3. A Quality Control check on the accuracy of name spelling and birth years; and the FGS itself. This QC review often reveals “quirks” (as a kinder word) that folks have in their Trees…

4. With Ancestor second marriages, this FGS listing will show the demarcation between full cousins and Half cousins. [I add “INSIGHT” rows with marriage years that will sort and separate the children to the different parent couples.] Half cousins for me only occur at the children level in my spreadsheet. Half cousins between Matches and Shared Matches can occur anywhere.

5. A re-sort by Match name highlights multiple relationships. Since shared DNA is divided by 4 (on average) going back each generation, the closer relationships are much more likely. I’ve found some Matches with MRCAs on both sides of my Tree. With single shared segments, the DNA can only come from one Ancestor. With multiple shared segments, there may be a segment for each line.

* I used “of sorts” in 2 above, because this FGS will not usually be a complete list of all Ancestor children, grandchildren, etc. It includes only the ones who provided a DNA path down to our Matches. Which in turn depends on family sizes and who did DNA tests – there can be wide variations on both.
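Item 5 above leans on the rule of thumb that shared DNA drops by about a factor of 4 with each cousin level. Here is a minimal sketch of that arithmetic. The 3,500cM parent/child baseline is a round-number assumption of mine (not an Ancestry figure), and the function name is hypothetical. These are theoretical averages only; real Matches vary widely, and many true cousins beyond 3C share no detectable DNA at all.

```python
# Rough expected shared cM for full cousins, assuming (hypothetically) that a
# parent and child share about 3,500 cM. Real Matches vary widely around these.
PARENT_CHILD_CM = 3500.0  # assumed round-number baseline, not an Ancestry figure


def expected_cousin_cm(degree: int, parent_child_cm: float = PARENT_CHILD_CM) -> float:
    """Average cM expected between full cousins of the given degree (1 = 1C)."""
    if degree < 1:
        raise ValueError("degree must be 1 or higher")
    # Full 1C share 1/4 of the parent/child amount on average (12.5% vs 50%);
    # each further cousin level divides by 4 again.
    return parent_child_cm / (4 ** degree)


for degree in range(1, 9):
    print(f"{degree}C: about {expected_cousin_cm(degree):.1f} cM on average")
```

The steep drop is why a Match with MRCAs on both sides of a Tree is most likely sharing DNA through the closer of the two relationships.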

Note: If I were starting over, I’d probably add name & birth year columns for 9 generations – out to 8C level; and then a catch-all column for any additional info. This would provide a handy way to evaluate the cousinship levels. Reminder: I only list the given name and one initial for males; and the given name, initial and married surname for females. I try to keep it as easy and simple as possible.

Bottom line: An FGS spreadsheet offers an easy way to add new Matches which have been identified by Pro Tools as closely related to known Matches. This adds independent, genealogy triangulation and tight Clusters to an inventory of known Matches. It will be an outstanding adjunct to an auto-Clustering program.

Also – you don’t have to use a spreadsheet to benefit from most of the concepts imbedded above.

[22CZ] Segment-ology: Pro Tools Part 18 – Family Group Sheets by Jim Bartlett 20241209

Pro Tools Part 17

Featured

Posted on December 8, 2024 by Jim Bartlett

NPEs

If we just consider our own ancestral line, we may miss some NPEs. We may have an NPE as an Ancestor, IF we haven’t explored the whole family.

Way back, NPE was Non-Paternal Event, but we’ve seen non-Maternal events, too. So we changed it to Not the Parent Expected. The whole issue centers around the expectation of a family with two “expected” parents. Important: an NPE is usually for one child – perhaps your Ancestor; perhaps a different child in the family. We “expect” all the children in a family to be from the husband and wife. So “usually” an NPE is a one-off event. But life unfolds in many different ways…

A man and a woman create a child – sometimes one of them is not married (i.e. living with their parents, or on their own) – or perhaps this is the case for both of them. Sometimes they are both married to someone else. Sometimes the man is not (and never becomes) aware the woman got pregnant. Again – in life, there are many variations to this. The point is the NPE does not apply to a family – it applies to a child. This is important to DNA analysis, and how we use Pro Tools.

I have this case for one of my Ancestors. The pregnant woman was an unmarried child in a family who raised her and her son, giving him their surname (which has confused genealogists to this day). It appears the father was not yet married either, but he went on to marry and have children. I know because I got some DNA from him (through the NPE child) and have Matches who descend from him through his other children (half cousins), and through her children by her later marriage (half cousins). [NB: Challenging in my Common Ancestor spreadsheet.]

Getting back to Pro Tools – the DNA truth-teller/helper. In general, the higher-cM SMOM interrelationships lead to one generational level in my Tree – to one MRCA couple. They may be cousins 1 or 2 or 3 times removed (because I’m old), but usually all go back to one MRCA. Then, as I scroll down the SMOM list, I often find SMOMs who descend from one generation further back. This is normal and expected. These would be a generation more distant to us, and should have appropriately smaller cMs, on average. In fact, if this doesn’t happen, we should be suspicious.

NB: Alternatively, some highest-cM Matches may be tied to a closer generation (which should be, on average, a higher-cM relationship). If these higher-cM Matches are at the same generation level, it may be due to multiple segments and, perhaps, additional relationships (with Colonial Virginia ancestry, I sometimes find multiple relationships with some Matches).

Finally, back to NPEs… If one of the Ancestors in an MRCA couple is an NPE, you wouldn’t get any Matches to that couple (just like with an only child; an exception would be if they had more than one child together). So, instead, look to see that *some* of the Matches are from each bio-parent. This is how I solved a Brick Wall. I had many Matches to my A36 (4C level) Ancestors [Thomas NEWLON & unknown wife]. As I kept looking at the Shared Matches, I found some smaller-cM Matches to my A72 (5C level) couple [Thomas NEWLON’s parents] who had been well researched. Analysis of “other” Shared Matches revealed many had the CUMMINGS surname (now my A74; 5C level ancestor).

The point is that if Pro Tools points to a group of higher-cM Matches to a 3C, 4C, etc MRCA, the lower-cM Matches should point to groups for the next two MRCAs back. This is true whether these MRCAs are well known or an NPE or a Brick Wall. If you find a consensus Ancestor among these smaller-cM Matches you may have found GOLD.

Bottom Line: When dealing with an NPE, think carefully about what that means to Pro Tools, and target your “rabbit holes” appropriately;>j

[22CY] Segment-ology: Pro Tools Part 17 – NPEs by Jim Bartlett 20241208

Pro Tools Part 16

Featured

Posted on December 5, 2024 by Jim Bartlett

Sacrilegious Genetic Genealogy

For this post I want to explore a deviation from the normal genealogy and DNA research “requirements”.

Do we need to do comprehensive research on each cousin Match? Do I really need to find the complete link between each Match and our Common Ancestor? The sacrilege: do I care about all my distant cousins – to the extent that I must develop their complete link to me? Do I really care how much DNA they share with me? Must I link the DNA to the Common Ancestor? Or, is it enough to determine that they are on a specific branch of my Tree? I think so!

My standard mantra: our bio-Ancestors and DNA segments are set! We compare each Match to our Tree and DNA to find a Common Ancestor. I’m very close to finding out how 10% of my 100,000 Matches (at Ancestry) are related to my bio-Ancestors.

My experience with Pro Tools indicates many more can be easily found. I acknowledge that some shared DNA segments under 15cM will be false – but that doesn’t mean those Matches aren’t related to me. Most of our true cousins beyond 3C will not share any DNA with us, so is the cM amount beyond 3C meaningful? I acknowledge that some Matches will be related beyond a genealogy timeframe.

However, given these negative factors, I believe a lot more of my Matches are related to me within 9 generations back [8C level] – perhaps somewhat more than 20% of my total Matches. It’s taken me 14 years to “collect” and document approximately 10% of my Matches as cousins. It’s daunting to think what time and effort I’d need to double that.

My sacrilege is to give up on full genealogy research for each Match. Using Pro Tools I’m finding lots of 6-10cM (small segment) Matches (to me) that are children, nieces/nephews, or 1C to strong higher-cM Matches that I have placed in my Tree. Clearly, these Matches are part of a family group well within a genealogy time frame.

I’m inclined to just quickly:

1. Add these small-segment Matches to my Common Ancestor spreadsheet

2. Add a Match Note (at Ancestry) to indicate the Common Ancestor and/or Ahnentafel [e.g. #A0062]

3. Give them my standard star and MRCA Dot; but not the Dot indicating a linked Match

4. Use a new Dot to indicate “Likely” in a family group under the MRCA; but not complete research [I could always filter on that Dot later, and do the research, some day…]

5. Add a shorthand note like: SMOM: 3,442cM/son of “Match Name” [SMOM: Shared Matches of Match – the cM between them]
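The Note shorthand in the steps above is regular enough to generate rather than type. A minimal sketch follows, assuming the Ahnentafel tag from step 2 and the SMOM shorthand from step 5 are combined into one Note line. That combination, the function name, and the parameter names are my own illustration, not a format Ancestry requires.

```python
def smom_note(cm: int, relationship: str, known_match: str, ahnentafel: int) -> str:
    """Build a Match Note in the shorthand style described above.

    cm           -- cM shared between the new Match and the known Match
    relationship -- how the new Match relates to the known Match (e.g. "son")
    known_match  -- name of the Match already placed in the spreadsheet
    ahnentafel   -- Ahnentafel number of the Common Ancestor
    """
    return f"#A{ahnentafel:04d} SMOM: {cm:,}cM/{relationship} of {known_match}"


print(smom_note(3442, "son", "Match Name", 62))
```

Keeping the Ahnentafel tag zero-padded (as in #A0062) means the Notes sort and filter consistently.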

I’m looking for a more efficient way to group Matches into known family lines.

There are several points here:

1. Identify additional Matches within a genealogy timeframe (is it over 50% of all Matches?)

2. Group Matches under my Ancestor Couples – often under a specific child or grandchild (why would I need to dig deeper – unless the Match had a robust Tree with many records…)

3. Build a firm interrelated framework for later research on each extended “twig” of my Tree. Get some confidence of my Ancestors and their children and grandchildren.

4. Identify Brick Walls through clear absence of interconnected Matches. My spreadsheet has an Ahnentafel header for each of my Ancestors back to the 8C level – some of them have no known Matches, or what is clearly a small mess of non-interconnecting Matches. These are a judgment call, but with many more Matches involved, these few “problems” become more and more obvious.

5. Connect Floating branches – I now have several strong “clumps” of interconnected Matches, under a single MRCA couple, that I cannot link to my Tree. This is a strong hint in light of #4 above. I plan to explore this more in a separate blogpost.

For DNAGedCom, Genetic Affairs, DNA Painter: Any way to automate the Clusters/Groups to include only those Matches who interrelate, say, over 90cM (and make that threshold adjustable)?
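To make the request concrete, here is a minimal sketch of what such a filter could look like. It is not how DNAGedCom, Genetic Affairs, or DNA Painter work; it simply keeps the Match-to-Match links at or above an adjustable cM threshold and groups whoever is connected through those links. The input format (a dictionary of Match pairs to shared cM), the function name, and the sample data are all assumptions for illustration.

```python
from collections import defaultdict


def cluster_matches(shared_cm, threshold=90):
    """Group Matches that interrelate at or above `threshold` cM.

    shared_cm: dict mapping a pair (match_a, match_b) to the cM they share.
    Returns a list of sorted groups (connected components), largest first.
    Matches with no link at or above the threshold are left out.
    """
    # Keep only the links strong enough to count.
    links = defaultdict(set)
    for (a, b), cm in shared_cm.items():
        if cm >= threshold:
            links[a].add(b)
            links[b].add(a)

    groups, seen = [], set()
    for start in sorted(links):
        if start in seen:
            continue
        # Walk outward from this Match to everyone reachable via strong links.
        group, todo = set(), [start]
        while todo:
            name = todo.pop()
            if name in group:
                continue
            group.add(name)
            todo.extend(links[name] - group)
        seen |= group
        groups.append(sorted(group))
    return sorted(groups, key=len, reverse=True)


# Made-up example data (hypothetical names and cM values).
example = {
    ("Ann", "Bob"): 1750,
    ("Bob", "Cal"): 210,
    ("Cal", "Dee"): 45,
    ("Dee", "Eve"): 3489,
}
print(cluster_matches(example, threshold=90))
print(cluster_matches(example, threshold=40))
```

Lowering the threshold merges groups (here the 45cM link joins the two families), which is the trade-off an adjustable setting would let each user explore.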

Bottom line: I think many more, if not most, of our Matches will turn out to be real cousins within a genealogy timeframe (out through 8C level). This includes Matches with no Trees, Private Trees, UnLinked Trees and scrawny Trees – all of these are now put into the mix through Pro Tools. For me, compiling data from my 100,000 Ancestry Matches will be a way to bound (if not counter) the continued warnings that many of our Matches are false and/or distant. Some are, some are not – what can we learn?

As usual, I value your feedback – on the sacrilege of adding Matches to Tree branches based on strong interrelationships, but without fully documenting the genealogy; as well as the bigger picture of possibly linking Floating branches to “bare spots” in our Trees.

[22CX] Segment-ology: Pro Tools Part 16 – Sacrilegious Genetic Genealogy by Jim Bartlett 20241205

Pro Tools Part 15

Featured

Posted on November 25, 2024 by Jim Bartlett

Shared Match Cluster Hints

I’ve written in this Pro Tools series about the power of Shared Matches. They form manual Clusters of Matches. Like all Clusters, they *tend* to point to a Common Ancestor. Each individual Match has their own ancestry, and they may relate to us in several different ways (particularly with my Colonial Virginia ancestry). With auto-Clustering this is displayed by placing the Match in a Cluster with the strongest ties to other Shared Matches – and using gray-cells to indicate ties to other Clusters. This shows up in a Shared Match list with a mix of Shared Matches tied to one Common Ancestor, along with other Shared Matches who may be related in different ways, and even some Shared Matches who might not be interrelated at all.

So, to make a point: Shared Match Clusters (or concentrations in Shared Match Lists) should be considered as a Hint. The stronger the consensus, the stronger the Hint. The chore that still remains is tracing the genealogy from the Match to a Common Ancestor(s).

I find that consensus is a judgment call. But when I make that call, I usually find other Matches with a genealogy link as expected. But not always…

Segment Triangulation is fairly precise – each of our DNA segments came to us from one particular ancestral path. Shared Matches (aka In Common With, aka Relatives in Common, etc) are not equivalent to Triangulation. When Shared Matches form a Cluster, it’s a strong Hint. And a 20×20 Cluster is much stronger than a 3×3 Cluster. And a 20×20 Cluster where each Match matches almost all of the other Matches is very strong, compared to a 20×20 Cluster where each Match only matches, say, half of the others… I have found large, strong Clusters (beyond close cousins) usually turn out to include one TG (maybe two), but there is no hard rule.
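One way to put a number on "each Match matches almost all of the other Matches" is the share of possible Match pairs inside a Cluster that actually show up as Shared Matches. The sketch below is only an illustration of that idea. The function name and input format are assumptions, and a high score is still a Hint, not a Triangulation.

```python
from itertools import combinations


def cluster_density(members, shared_pairs):
    """Fraction of possible Match-to-Match pairs in a Cluster that actually match.

    members      -- names of the Matches in the Cluster
    shared_pairs -- iterable of (name_a, name_b) pairs known to share DNA
    """
    members = sorted(set(members))
    possible = len(members) * (len(members) - 1) // 2
    if possible == 0:
        return 0.0
    pairs = {frozenset(p) for p in shared_pairs}
    actual = sum(1 for a, b in combinations(members, 2) if frozenset((a, b)) in pairs)
    return actual / possible


# Hypothetical 4-Match Cluster where only half of the 6 possible pairs match.
loose = cluster_density(["A", "B", "C", "D"], [("A", "B"), ("A", "C"), ("B", "C")])
print(f"density: {loose:.2f}")
```

A 20x20 Cluster has 190 possible pairs, so a density near 1.0 there is far harder to reach by chance than in a 3x3 Cluster with only 3 pairs.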

Summary: Shared Matches can be grouped into Clusters. Clusters are not the same as Triangulated Groups (TGs), but they can be good pointers and helpful Hints.

[22CW] Segment-ology: Pro Tools Part 15: Shared Match Cluster Hints by Jim Bartlett 20241125

Pro Tools Part 14

Featured

Posted on November 24, 2024 by Jim Bartlett

Jigsaw Puzzles

Our genetic genealogy is very much like a jigsaw puzzle. Our Ancestors and our DNA segments are both pieces of a large jigsaw picture (ourselves). Soon after the moment of conception – when sperm meets egg – our DNA segments and crossover points are determined. And, of course, our Ancestors, each with 2 biological parents, are determined. There may be lots we don’t know, but those configurations (DNA and Ancestors) are fixed – waiting for us to discover them. Just like a box of jigsaw puzzle pieces, all the pieces are there – and they only go together one way (like our DNA segments and our Ancestors).

Now think about our DNA Matches – perhaps 100,000 of them – as we open our list… The overarching concept is that a Match sharing at least 15cM with us is always a true (Identical By Descent or IBD) relative; and over half of the remaining Matches will also be IBD and a true relative. Of course, some of these Match-relatives will be distant cousins.

Based on my deep dive with Pro Tools, I’m now convinced at least 20% of my DNA Matches at Ancestry are relatives within a genealogy time-frame. I’ll go out on a limb and say 8C or closer!

So, to the point of this blog post… 20,000 of my 100,000 Matches are probably 8C or closer. Each one of them is a jigsaw puzzle piece. Each one interlocks with me (sometimes in multiple ways) and very often with other Matches (look at *their* Shared Match list). In many cases they form interlocking relationships with each other, from siblings to parent/child to 1C and 2C and 3C interrelationships. Just like a jigsaw puzzle. Some will be like the jigsaw lake, or forest, or barn or road – all of which “clumps” of the puzzle will eventually integrate – only one way – into the grand picture….

With Pro Tools’ new Sort feature (the Shared Matches’ *close to distant* Sort), it’s a whole lot easier to form small branches. Think of it this way…. You have 1,000 Matches, and you can easily find links that result in 500 pairs…. In a flash, you’ve cut your workload in half. And as you form larger clumps of Matches – all of your Matches in that clump must lead back to you! Put another way, look at the clump and see where all of your Matches have a Common Ancestral line – out of the clump and directly into your Tree – somehow…

The jigsaw puzzles:

The Ancestors must interlock in pairs and form an entire “Tree” jigsaw picture
The DNA segments must array adjacently and form a Chromosome Map picture
Our Matches will interlock with us; each other; and our Ancestor Tree.

[22CV] Segment-ology: Pro Tools Part 14: Jigsaw Puzzles by Jim Bartlett 20241124

Pro Tools Part 13

Featured

Posted on November 17, 2024 by Jim Bartlett

Status of Common Ancestor Spreadsheet

I have a spreadsheet of all Matches with Common Ancestors with me. It includes my Ancestors and their children down to each Match. See more at https://segmentology.org/2021/12/19/segmentology-common-ancestor-spreadsheet/ . It’s a lot of work, so feel free to adapt it to suit your needs.

I have been reviewing all of these Matches and adding a LOT more using Pro Tools. I posted various ways to do this here, and I’ve gone down all those rabbit holes. I’m now on a march to review these Matches methodically – from closest Ancestors to more distant. I’ve found that it’s essential to have “known” Matches highlighted in Shared Match lists to speed the process of determining new Matches with CAs and forming family groups. So I’m adopting a two phase process. First: Recheck all Matches for firm relationships and having a clear set of Dots that will spotlight them in a Shared Match List – probably out to 5C level; Then: I’ll go back and use Pro Tools to tease out new Matches to add in.

Toward this end, I’m going to paste a Table below that shows my progress to date; and later I’ll update the Table to show the effect of Pro Tools. I’ve used Ahnentafel numbers (male of an ancestral couple) – their names are not needed for this exercise, although I did use given names for children for the first two generations. The comment column gives some reasons why the cMs deviate from the averages as when there are double Cousins or half Cousins, or Ancestors out of the US. You may also note the high number of Matches for Ahnentafel 70 – it’s because I jumped to that Ancestor, and used Pro Tools to find several key Matches to help with a burning question.

Here is where I stand now:

Note that this summary has 2477 Matches, through the 5C level (4XG grandparents). I have another 6,070 Matches in the 6C to 8C group. My total is 8,547 Matches from AncestryDNA, out of about 100,000 total – I wanted to see what impact Pro Tools will have. We’ll see how far I can get…

[22CU] Segment-ology: Pro Tools Part 13 – Status of Common Ancestor Spreadsheet by Jim Bartlett 20241117

Pro Tools Part 12

Featured

Posted on October 28, 2024 by Jim Bartlett

The joke’s on me… heads up!

In my last post I noted that the Pro Tools cM relatedness was pretty accurate! Today I found two Matches who were 1C – their parents were brothers. But the SMOM said 1,637cM, so they had to be half siblings. I checked with DNA Painter – 1,637cM is 100% half siblings (for same generation relationship). Back to the drawing board… Did the two brothers marry (or have children with) the same wife? Maybe one brother died, and the other married the widow… Nope. Checking some more – the two brothers married two sisters! They were double 1C! Not in the DNA Painter range of options, but spot on for twice the 1C cMs. All is OK, but it had me scratching my head for a few minutes.
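The arithmetic behind the head-scratcher can be sketched in a few lines. The 3,500cM parent/child baseline is a round-number assumption for illustration, and the function name is hypothetical. The point is only that a double 1C (two separate 1C links) has the same expected total as a half sibling, so cMs alone cannot tell them apart.

```python
# Expected averages relative to an assumed parent/child baseline of 3,500 cM.
FRACTION_OF_PARENT_CHILD = {
    "1C": 0.25,            # 12.5% of the genome vs 50% for parent/child
    "half sibling": 0.5,   # 25% of the genome
    "double 1C": 0.5,      # two independent 1C links: 2 x 12.5%
}


def expected_cm(relationship: str, parent_child_cm: float = 3500.0) -> float:
    """Theoretical average cM for a relationship (real values vary widely)."""
    return parent_child_cm * FRACTION_OF_PARENT_CHILD[relationship]


observed = 1637  # the SMOM value from the example above
for rel in FRACTION_OF_PARENT_CHILD:
    ratio = observed / expected_cm(rel)
    print(f"{rel}: expected about {expected_cm(rel):.0f} cM, observed/expected = {ratio:.2f}")
```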

[22CT] Segment-ology: Pro Tools Part 12 – The Jokes on Me by Jim Bartlett 20241028

Pro Tools Part 11

Featured

Posted on October 28, 2024 by Jim Bartlett

Ways to analyze Shared Matches Of Matches (SMOM) cMs.

Pro Tools gives us a LOT of new information. Not quite segment Triangulation, but very powerful data.

For example, a Match shares 8cM with me and does not have a Tree. However, a SMOM shares 3,489cM with the Match, and Ancestry (with insider info) says the SMOM is the mother of the Match; the SMOM also shares 17cM with me. As it turns out, I know the SMOM is a 3C1R with me on a particular Ancestor couple. It’s easy to 1. add the Match to my Tree; 2. add the Match to my Common Ancestor Spreadsheet; and 3. add a synopsis of this info (as a 3C2R) to the Match’s Notes. Of course this doesn’t happen every time, but it does happen some of the time.
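The step from "mother is a 3C1R" to "child is a 3C2R" can be checked by counting generations down from the Common Ancestor couple. This is a minimal sketch for full (not Half) relationships. The function name is hypothetical, and it assumes, as in the example, that the SMOM is one generation further from the couple than I am (if she were one generation closer, her child would instead be at my level).

```python
def cousin_label(gen_me: int, gen_match: int) -> str:
    """Label a relationship from generations below the Common Ancestor couple.

    1 = child of the couple, 2 = grandchild, and so on. Both people must be
    at least grandchildren (2) for a cousin label to apply.
    """
    if gen_me < 2 or gen_match < 2:
        raise ValueError("cousins must be at least grandchildren of the couple")
    degree = min(gen_me, gen_match) - 1   # 3C means the closer person is 4 generations down
    removed = abs(gen_me - gen_match)     # generation difference = times removed
    return f"{degree}C" if removed == 0 else f"{degree}C{removed}R"


me, mother = 4, 5  # assumed depths that make the SMOM (the mother) a 3C1R to me
print(cousin_label(me, mother))      # the known SMOM
print(cousin_label(me, mother + 1))  # her child, the new Match
```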

The above example is a parent/child relationship, and Ancestry usually knows if it’s a son or daughter and a mother or father. Ancestry usually knows niece/nephew and aunt/uncle.

But the thrust of this blog post is about a family group and their interrelationships. I’ve tried several methods to document and analyze new Match/SMOM cMs. All methods utilize my Common Ancestor Spreadsheet which is arranged by family groups [I sort by Ahnentafel of the Common Ancestor; and the birth years of children, grandchildren and great grandchildren.] This CA spreadsheet is my foundation of “known” cousins – I’m looking at their Shared Matches to see if I can determine how we are related and add them to the spreadsheet; and checking to see that the existing cousins are interrelated to each other as expected.

First try was to add about 10 blank columns to the spreadsheet. I’d then type an asterisk [*] for a Match in a column, and enter the shared cMs with the other Matches in the spreadsheet in the same column. It was sort of like a Cluster matrix; and anyone who had a faulty genealogy was easily highlighted. But three issues: 1. It was a lot of work for a family group; 2. some of the Matches were in fact related up or down a generation [not physically close on the spreadsheet]; and 3. it was difficult (for me, anyway) to determine how an unknown Match would fit in… [someday I’ll try DNA Painter or BanyanDNA…]

The second try was just one new column, and I would type in the highest cM found among all the Shared Matches; the suggested relationship [almost always accurate for high cMs]; the Match name; and any known info. Issues: again, a lot of work; and some Matches don’t have any high cM SMOM with me. I still add these when they are the only evidence I have for adding a new Match to my Common Ancestor spreadsheet.

Third/current try involves about 3 new columns and I color in a column where Matches match most of the others. Sort of like LEEDs column-coloring. This is somewhat easier to do, without a lot of typing. And the colored “stripes” are comforting to see (and to highlight Matches who may not “belong” and/or need further research.)

Also, I’m hopping around some these days, working on specific issues (Brick Walls, questionable genealogy, trying to link in (or out) selected Matches). It appears that the closer generations have one stripy column and as I work on more distant Ancestors, the number of colored columns grows.

I’m still fiddling with good/efficient ways to use/display SMOM cMs; or even if I need to at all. I’ve worked on about 10% of my Matches in the Common Ancestor Spreadsheet. At every turn, Pro Tools is helping me find more and more Matches for whom I can determine our relationship. So still a long way to go – and I’m sure there are many more Matches to add to my spreadsheet.

You are encouraged to post in the comments any insights, tricks or hacks you’ve developed for using SMOM cMs…

[22CS] Segment-ology: Pro Tools 11 – Ways to Analyze SMOM cMs by Jim Bartlett 20241027

Pro Tools Part 10

Featured

Posted on August 13, 2024 by Jim Bartlett

Branch Groups

I’m methodically working my way through my Ancestors and Matches using Pro Tools. My main tool is my Common Ancestor Spreadsheet, which is now growing very rapidly. I’m not really in it for the bulk, but for the advantages of Branch Groups. What I call Branch Groups are groups of my DNA Matches under one child or grandchild of one of my Ancestors – these Matches are on the same Tree “branch”. Such Matches are closer to each other (than to me) and tend to share more DNA with each other. They stand out with DNA shares over 90cM; and I take notice. I can often “fit” them into a Branch Group. On the other hand, I’ve found some Matches that have the right genealogy for a Branch Group, but they don’t share much DNA with others in the Group – more on this below.

Here are some thoughts and observations:

SMOM – Shared Matches of Matches aka “Rabbit Holes” – haha. When you select a Match and click on the Shared Matches button – you get a list of all the Matches you both have in your respective Match lists. These are your Shared Matches (SMs) with that Match. Each of these SMs shares some DNA with you that you both got from the same Common Ancestor (CA). And, with Pro Tools, you know how much DNA each of these SMs also shares with the “base” Match that *they* got from some CA. Often these two CAs are the same (or one is ancestral to the other); but sometimes the CAs are completely different (*their* CA could be unrelated to you or related to you on a different line – see Outliers below). When we’ve done our homework and entered Notes for many Matches, we can usually look down the SM list and easily see if there is a consensus, or not – see Birds of a Feather below. Like with auto-Clustering, a consensus indicates a group of Matches that mostly match each other, indicating a Common Ancestor among them. Usually, their CA is also one of your Ancestors – BINGO! This is a Branch Group. Sometimes their CA is unknown to you – this could be a random happenstance. Or it could be a Floating Branch Group – see below.

Branch Group aka Cluster. When you find SMOMs who share high levels of shared DNA (cMs) with each other they usually form a Branch Group. By “high levels” I mean at least 90cM; but I often drop down to around 50cM as the group grows larger. I consider 20-25cM as “in the noise”, and usually not worth the trip down a rabbit hole. [For your own situation, experiment to find a threshold that usually gives you efficient results.] Sometimes you can get 5-10 (or more) of these SMOMs which link under a child or grandchild or Great grandchild of one of your Ancestors. And then it’s easier to find other SMOMs that fit into the Branch Group. Use an SMOM in a Branch Group to make a new Shared Match list, invariably with new SMOMs… the clues (or rabbit holes) are everywhere! As it turns out in a Branch Group, not all Match descendants will Match all of the other Matches in the group. Remember: at the 4C level, roughly 50% of true 4C won’t show up as matches to each other.
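The thresholds in this paragraph can be written down as a simple triage rule. This is only a sketch of my reading of the numbers above: the tier names are made up, the 25 to 50cM band is not characterized in the text so it is labeled a judgment call, and the cutoffs are parameters precisely because each person should experiment with their own.

```python
def smom_tier(cm: float, strong: float = 90, moderate: float = 50, noise: float = 25) -> str:
    """Triage a Match-to-Match cM value using adjustable cutoffs."""
    if cm >= strong:
        return "strong"          # at least 90cM: likely Branch Group material
    if cm >= moderate:
        return "worth a look"    # around 50cM: useful as the group grows
    if cm <= noise:
        return "in the noise"    # 20-25cM and below: usually skip the rabbit hole
    return "judgment call"       # between the noise and moderate cutoffs


for cm in (3489, 120, 60, 35, 22):
    print(cm, smom_tier(cm))
```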

Birds of a Feather. On many Shared Match lists, a scan of the Notes indicates a clear consensus – most SMs have Notes indicating the same CA; and some are from the same line (up or down a generation). These are birds of a feather – they cluster together. And Pro Tools shows them to be close relatives – these are a Branch Group. In these cases, I’m much more likely to review Matches not yet linked in, and to build their Trees back to find the link. As a quick check, click on a Match and see *their* SMs with you – are they indeed Birds of a Feather? Or not? For some Shared Match lists, a quick scan of existing Notes may indicate they are all over the place – on both sides; on different branches – so, it’s difficult to determine a consensus. Move on…
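The "scan the Notes for a consensus" step can be approximated by counting Common Ancestor labels in a Shared Match list. A minimal sketch, with loud caveats: the 60% cutoff is an arbitrary assumption (consensus is really a judgment call), the labels are hypothetical, and this simple count does not give credit for Notes that are one generation up or down the same line.

```python
from collections import Counter


def consensus_ancestor(notes, min_share=0.6):
    """Return the most common CA label if it reaches min_share, else None.

    notes: one CA label per Shared Match (None or "" for Matches with no Note yet).
    Only Matches that have a Note are counted toward the share.
    """
    labelled = [n for n in notes if n]
    if not labelled:
        return None
    label, count = Counter(labelled).most_common(1)[0]
    return label if count / len(labelled) >= min_share else None


# Hypothetical Shared Match lists.
print(consensus_ancestor(["A38", "A38", "A38", None, "A76", "A38"]))  # clear consensus
print(consensus_ancestor(["A20", "A38", "A52", "A70"]))               # all over the place
```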

Outliers – linked by genealogy, but not linked by shared DNA. I’ve now run into a very few cases of DNA Matches who are clearly genealogy relatives (in my Common Ancestors spreadsheet) under Ancestor XYZ, but they do not share DNA with other close cousin Matches under XYZ. In each case, so far, they are also related to me in another way, and they do share DNA with their other cousins. Thinking about multiple segments and/or multiple relationships leads me to Triangulated segments, but I’ll put that discussion off for a future blog post. Just be aware that a Match with one shared segment can only be genetically related one way. Pro Tools may help determine which one.

Collateral SURNAMES in Branch Groups. Less than 1% of my Matches have the same SURNAME as the CA we share [Y and mt lines are pretty rare]. This means my Common Ancestor spreadsheet (tracking the lines of descent down to Matches) includes Collateral SURNAMEs. As I’m working on an MRCA Branch Group in my spreadsheet, I’m reviewing each of my Match cousins, and reviewing all of the SMOM shared cMs, and checking the Trees of those over 90cM (and glancing at some down to 50cM). Often there is enough to tie those Matches to my Tree (even some with no Trees). It really helps to review the Collateral SURNAMEs already recorded in my spreadsheet for that Branch – that’s usually where I’m going to find a link. And it means I don’t have to build a tree back for each Match – I can usually copy the line of descent of an existing Match in the spreadsheet, and just change the last few generations. A big time saver – in searching and typing… Recognizing a Collateral SURNAME in a Match’s scrawny Tree is helpful. Sometimes I’ll filter a long Shared Match list by a Collateral SURNAME…

Floating Branch Groups. A few times I’ve found a Branch Group that I cannot link to my Tree. They usually include parent/children, siblings, aunt/uncle/niece/nephew, and maybe some 1C or 2C, all in a tight family group. All the interrelationship cMs are on target. But, other than being on a Shared Match list with some known Matches in a Branch Group, I cannot find a link. In most cases this has happened “near” a Brick Wall (or “iffy”) Ancestor of mine. So I’ve created a Floating Branch in my Tree, so I can link other Matches to it. I need to do a study of closest known Matches to see where this Floating Branch is headed – another rabbit hole. Such a Floating Branch could just be a mirage (not really linked to me), or I might find some “tendril” Matches (maybe through a Collateral SURNAME filter) that help find the link. I operate under the belief that ALL Matches over 15cM (and many under 15cM) are true cousins, and many are within a genealogy timeframe and should fit in my Tree somewhere.

I am now convinced of two things: A) A lot more of our under-20cM Matches are well within our genealogy timeframe than I originally thought; and B) our Brick Walls (out to at least 8C level) have plenty of Matches forming Branch Groups. With each generation going back, it’s harder and harder to figure them out, but Pro Tools can often provide new insights. This helps offset the fact that many Matches have NO Trees or very scrawny Trees. There is hope! But it takes work!

[22CR] Segment-ology: Pro Tools Part 10 Branch Groups by Jim Bartlett 20240812

Pro Tools Part 9

Featured

Posted on July 25, 2024 by Jim Bartlett

Build A Foundation

I feel like I’ve been drinking through a fire hose – there are just so many good clues in the Shared Match cM lists. I’ve tried all of the four Plans of Action I previously laid out – and I’ve found myself still jumping from one to another – good clues are just too hard to pass up. And a parent/child, sibling, aunt/uncle/niece/nephew and even a 1C will suck me in like a magnet – particularly when one has NO Tree and another has a good Tree. AND, if I’m working in a small sub-branch so I know many of the collateral SURNAMES and the geography, I’ve got to capture that info before I move on…

Observation 1: As I scroll through hundreds of Shared Match lists, I see lots of Shared Matches with the same MRCA I’m researching [almost all of my over-20cM Matches have a Note indicating a validated MRCA, or a likely/imputed one]. And I see lots of Shared Matches one generation up or one generation down. For instance, I’m working on my MRCA couple 40P, and I see Shared Matches that are also 40P, and Matches who are 20P and 80P and 82P. I shouldn’t be surprised, because we are all on the same ancestral line; AND a 20P Match who is a 3C (or 3C1R) with me, is also related to my Matches on 40P – maybe as 4C or 3C1R, etc. This is very comforting to see a Match with Shared Matches up and down one of my lines.
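The "one generation up or one generation down" pattern falls straight out of Ahnentafel arithmetic. Below is a minimal Python sketch, assuming each MRCA couple is labelled by the husband's (even) Ahnentafel number and that the P/M suffix marks the paternal or maternal side of the Tree. The function names are hypothetical, chosen only for illustration.

```python
def couple_of(n: int) -> int:
    """Even Ahnentafel number labelling the couple that contains ancestor n."""
    return n - (n % 2)


def child_couple(couple: int) -> int:
    """Couple one generation down: the couple that includes this couple's child."""
    child = couple // 2  # ancestors 2k and 2k+1 are the parents of ancestor k
    return couple_of(child)


def parent_couples(couple: int) -> tuple[int, int]:
    """Couples one generation up: the husband's parents and the wife's parents."""
    husband, wife = couple, couple + 1
    return 2 * husband, 2 * wife


def side(n: int) -> str:
    """'P' if the line runs through the father (2), 'M' if through the mother (3)."""
    if n < 2:
        raise ValueError("side is only defined for ancestors (n >= 2)")
    while n > 3:
        n //= 2
    return "P" if n == 2 else "M"


def label(n: int) -> str:
    """Ahnentafel number with its side suffix, e.g. 40 -> '40P'."""
    return f"{n}{side(n)}"


# Working on couple 40: one generation down is 20, one generation up is 80 and 82.
print(label(child_couple(40)), [label(c) for c in parent_couples(40)])
```

With couple 40 as the starting point, this prints 20P and then 80P and 82P, the same neighbours named in the observation above.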

Observation 2: Each of the MRCAs that I focus on – usually for a few days – has seen a significant increase in the number of Matches that I can verify exactly how we are related. Plus, if they are closely related to a known Match AND have a bunch of Shared Matches with me along this same line, I can add them to my Common Ancestor Spreadsheet anyway, with confidence they are on the same sub-branch. In any case, I’m winding up with a lot of Matches under each MRCA; and a lot of new Notes for them.

Recommendation/Tip: Combining 1 and 2 above, I now think the best path forward is to build the foundations and then work back in our Ancestry. I have no 1C, so this means starting with my great grandparent MRCA couples and, using Pro Tools, teasing out as many Matches as possible for each one of my 4 MRCA couples [8P, 10P, 12M, 14M] – and adding their info into the Match Notes. Then, as I move to the next generation further back, I will see many of these Notes in the Shared Match lists for Match-cousins back to MRCAs 16P to 30M. In general, the Shared Matches to these MRCAs will “stay in their lane,” and that is a strong indication. Remember, some Shared Matches may match you one way and match the base Match another way – those Matches will usually have a shorter, random list of Shared Matches – I skip over those quickly and move on.

Bottom Line: If we start with our closest MRCA couples and “Note” all the Matches we can, we’ve built a strong foundation for when we get to the next generation. This will become more and more valuable as we work out through more distant generations. I think such a foundation will be essential when we get to 4C and beyond.

[22CQ] Segment-ology: Pro Tools Part 9 Build A Foundation; by Jim Bartlett 20240724

Pro Tools Part 8

Featured

Posted on July 19, 2024 by Jim Bartlett

Group Process

Here is (sorta) my process for working with a Match and their Shared Match list with me.

1. Pick a Match with an MRCA (I don’t have good criteria yet, but I like one with a good Tree; and it’s helpful to know that they have several close cousins in my Common Ancestor spreadsheet).

2. First pass: look through all of their Shared Matches for Notes that indicate they share the same MRCA. [sometimes I note the shared cM in a new column; sometimes I just use a highlight color in a column – in either case to indicate a group with the Match in #1]

3. I’ll stop at any Match who is very close – parent/child; sibling; even aunt/uncle/niece/nephew. If not in my spreadsheet, I add them in and add appropriate Notes to their profile to highlight the MRCA and relationship to me – e.g. #A0038P/4C1R: ALLEN/Elizabeth.

4. Then I make another pass through the Shared Match list – opening the Matches who share above 90cM (generally within about 2C to the original Match in #1 above). From my spreadsheet I know of other SURNAMES the other Matches have in their path back to the MRCA – so I’m looking for those surnames in addition to the MRCA surnames. For example: MOESZINGER led me to 4 other new Matches from my ALLEN MRCA.

5. Repeat #4 (a third pass) looking at above 50cM or so – digging a little more (and by this time, I usually have additional Matches with helpful Notes to play off of).

6. Now, start at the top of the #1 Shared Match list (a fourth pass), and open each Match who does share the MRCA with you. Look down each such Match’s Shared Match list with you, using the #4 process above. The idea here is that not every cousin will share with every other cousin (remember only 50% of all your true 4C will share DNA with you; 50% will not!). So using this step usually adds a few more Matches to the group. [If you use a highlight color column, all of the Matches in a part of a Common Ancestor spreadsheet should get colored in.]

7. If you’re working on a Brick Wall (or NPE or bio-Ancestor, etc), go through the remaining Matches who have Trees and jot down the SURNAMEs in their Trees. Look for a Common Ancestor among those (usually more distant) Matches, who would be a good potential for an Ancestor at or beyond the Brick Wall.

In each case above, I add new Matches to my Common Ancestor Spreadsheet (now about 7,000 from Ancestry), and add them (and their path) to my Ancestry Tree (they are always living and private).
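The passes above can be sketched as a simple triage of one Shared Match list. This is a minimal Python sketch, not a description of any Ancestry feature: it assumes the list has been transcribed by hand, the class and function names are hypothetical, and the cutoff for "very close" relatives (`close_cm`) is left to the caller because the steps above give no number for it. The 90cM and 50cM thresholds come from steps 4 and 5.

```python
from dataclasses import dataclass


@dataclass
class SharedMatch:
    name: str
    cm_with_me: float
    cm_with_base: float  # shared cM between this Match and the Match picked in step 1
    note: str = ""


def plan_passes(shared, mrca_tag, *, close_cm, second_cm=90.0, third_cm=50.0):
    """Sort Shared Matches into the order they would be reviewed.

    close_cm is an assumption supplied by the caller (no value is given in the post).
    """
    buckets = {
        "noted_same_mrca": [],  # step 2: Note already shows the same MRCA
        "very_close": [],       # step 3: parent/child, sibling, aunt/uncle...
        "over_90": [],          # step 4: second pass
        "over_50": [],          # step 5: third pass
        "remaining": [],        # step 7: surname harvesting for Brick Walls
    }
    for m in shared:
        if mrca_tag in m.note:
            key = "noted_same_mrca"
        elif m.cm_with_base >= close_cm:
            key = "very_close"
        elif m.cm_with_base > second_cm:
            key = "over_90"
        elif m.cm_with_base > third_cm:
            key = "over_50"
        else:
            key = "remaining"
        buckets[key].append(m.name)
    return buckets
```

Step 6 would repeat the same triage on the Shared Match list of each Match that ends up in the first bucket.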

Sidebar: My Common Ancestor spreadsheet is a good tool for each family group based on an MRCA. I haven’t found a good way, yet, to analyze the Shared Matches who are related to me through the children or ancestors of the MRCA. However, I do note that they show up in the Shared Match list. For instance, I’m now working on my MRCA – 38P (this is the Ahnentafel representing Joseph ALLEN, along with his wife Elizabeth [39P on her own] – maiden name unknown). It’s comforting to note that many of the Shared Matches have Notes starting with #A0018P (an MRCA representing my Ancestor who married A19P, the daughter of 38P) and some close Matches with #A0008P, an even closer descendant of 38P. Normally, I would have some smaller cM Matches back to 76P and 78P (representing the two MRCA couples who are ancestors of 38P and 39P), but both 38P and 39P are brick walls… So Group Process #7 above is next on my list.

The above is a classic example of the iterative nature of genetic genealogy, and the importance of having a good Note system that lets you see the key elements in a Shared Match list. It all comes back to doing the homework of keeping good, visible, Notes at Ancestry. Tip: I now add a Match’s SURNAMEs to the Notes if I don’t have any other clues – I can then see these SURNAMEs in the Notes fields in a Shared Match list…
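A consistent Note format is what makes these Notes scannable. Here is a minimal Python sketch for building and reading Notes shaped like the examples in this post (#A0038P/4C1R: ALLEN/Elizabeth, #A0018P). The exact grammar is an assumption inferred from those examples only: "#A", a four-digit zero-padded Ahnentafel number, a P or M side letter, then an optional cousinship and free-text detail. The function names are hypothetical.

```python
import re

# Assumed grammar, inferred from the examples in the post.
NOTE_RE = re.compile(
    r"#A(?P<ahn>\d{4})(?P<side>[PM])"
    r"(?:/(?P<cousin>[0-9A-Za-z]+))?"
    r"(?::\s*(?P<detail>.*))?"
)


def format_note(ahnentafel, side, cousinship, detail):
    """Build a Note such as '#A0038P/4C1R: ALLEN/Elizabeth'."""
    return f"#A{ahnentafel:04d}{side}/{cousinship}: {detail}"


def parse_note(note):
    """Return the parts of a Note, or None if it does not start with an MRCA tag."""
    m = NOTE_RE.match(note)
    if m is None:
        return None
    detail = m.group("detail")
    return {
        "ahnentafel": int(m.group("ahn")),
        "side": m.group("side"),
        "cousinship": m.group("cousin"),
        "detail": detail.strip() if detail is not None else None,
    }


print(parse_note("#A0038P/4C1R: ALLEN/Elizabeth"))
```

One likely benefit of the zero padding is that Notes sort as text in Ahnentafel order, although the post does not say that is the reason for it.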

Bottom Line: I think the Pro Tools Shared cM feature needs an iterative process of reviewing Shared Matches to add in as many new Matches as possible under our MRCA groups. This also includes noting Shared Matches closer and more distant to each MRCA group; and analyzing remaining (usually smaller cM) Matches to break through more distant Brick Walls. Lots to do….

[22CP] Segment-ology: Pro Tools Part 8 Group Process; by Jim Bartlett 20240719

Pro Tools Part 7

Featured

Posted on July 17, 2024 by Jim Bartlett

What Is Your Plan of Action?

The Pro Tools feature that lets us see the amount of DNA (in cMs) between our Shared Matches is a significant tool. It allows us to “stitch together” families, to include Matches with skimpy, or even no, Trees. This could potentially impact all my 96,000+ Matches. That’s a lot of ground to cover…

So what’s the game plan – how do we most efficiently use this new cM data? What is the Plan of Action (POA)?

I see four different POAs – and I’m seeking your input on any insights you’ve found so far.

1. Work down our Match list. Start at the top, and methodically work on each Match that we haven’t placed in our Tree. The advantage here is that the top Matches (most shared cM) are usually the easiest to figure out. With Pro Tools we can see their top Matches, potentially ones with good Trees, and often tease out their place in our Tree. At the least, even if we cannot find the exact relationship, we can figure out which sub-branch of our Tree they are on (which is all we really need to know for them to be helpful forming a tight group).

2. Confirm each MRCA couple group. I’ve been working on this method for a while, using my Common Ancestor Spreadsheet. The focus is on all the Matches who have the same MRCA couple – does each one share an appropriate amount of cMs with the others? I must take care when some Matches have multiple relationships with me (colonial Virginia ancestry) and/or multiple segments – these could throw off a one-to-one analysis. But the main point here is: does each Match “fit”? I’ve found 2 so far (out of hundreds), who really don’t “fit” within all the shared cMs – indicating incorrect genealogy or an NPE. Bottom line: it is very comforting to see a large list of Matches under an MRCA that all “fit” each other (well within the Shared cM Project ranges). Each such MRCA couple at one generation, then is a strong foundation when working on the next generation – many Matches will be related to each other across two (or more) generations. More “comfort”…

3. Focus on specific problems. Work on an unknown bio-Ancestor/NPE/Brick Wall. Build an appropriate group, and then re-review the Shared Match list for highest-cM Matches that may be helpful – and then look at their Shared Matches for more clues. This POA may foster a lot of “rabbit holes” and “blind alleys”. But the main point here is: build a group of interlocking Matches – they will often lead to insights.

4. Hit-n-Miss. Have fun chasing random leads. These sometimes result in a floating branch of your Tree. I have two of these – many Matches which apparently form a large (several hundred) list of Matches from one person – probably an Ancestor of mine, but no known paper trail link. Pro Tools will confirm, or not, if this is an “interlocking” group. If so, then I will look for Matches in that huge group, who have shared Matches with some other, known, MRCA group(s) of mine – hopefully there will be a strong consensus – there should be…

[Side bar: early in my Navy career (1971), I needed a system to track a lot of projects (before PCs) – we had a table of milestones. I called it: the Hectic Input and Tabulation of Numerous Milestones in our Sacred System (HITNMISS) – my boss was not impressed; so I changed the name to the Simplified Work Input and Follow Through (SWIFT) – I got promoted:>j]

I think the unifying theme above is to form interlocking Matches into a family group based on shared cM – it’s right up our alley as genealogists. And each such group is a very valuable foundation, with important links in different generations (up and down our Tree).

Are you using any of the above POAs? Have you developed a different POA that you’ve found to be particularly effective and efficient, or not? What’s the best way to incorporate all this new data? Please share.

[22CO] Segment-ology: Pro Tools Part 7 What Is Your Plan of Action; by Jim Bartlett 20240717

Pro Tools Part 6

Featured

Posted on July 11, 2024 by Jim Bartlett

Watch Out…

BLUF: Do not rely strongly on Ancestry’s suggested relationships – I find the true relationship is rarely the top one in Ancestry’s long list of possibilities; and it’s usually down their list somewhat. The cMs with my Matches are always within the ranges in the Shared cM Project and at DNA Painter. But, again, they are rarely at the average.

I’m reviewing all my Matches at the 3C level: 79 Matches with A16 MRCA couple; 92 with A18; 43 with A20, so far. None have been found to be outside the range of inter-relationships (perhaps 50% sampling). All are inside the appropriate ranges. BUT, two siblings may show vastly different values – one somewhat higher and one somewhat lower than the average. My engineer brain wants two siblings to have very close cMs, but the data is truly random (within the ranges of the Shared cM Project).

Bottom line: be careful, and don’t try to force a fit. Expect the values to be in the range for the relationship; but accept that they may be all over that range. And looking at it the other way, starting with a shared cM value, the relationship [of a Match without a Tree] will NOT necessarily (or even probably) be in a small set of Ancestry suggestions (although almost always on their long list – in the “Tree” view of a Match profile).
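The "in the range, but anywhere in the range" test can be written down in a few lines. This is a minimal Python sketch under loud assumptions: the range table is supplied by the caller (for example, copied from the Shared cM Project), and the numbers below are placeholders invented for illustration, NOT real Shared cM Project values. The function names are hypothetical.

```python
# PLACEHOLDER values for illustration only; replace with ranges you trust.
PLACEHOLDER_RANGES = {
    "REL_X": (100.0, 300.0),
}


def in_range(relationship, shared_cm, ranges):
    """True if shared_cm falls inside the (low, high) range for the relationship."""
    low, high = ranges[relationship]
    return low <= shared_cm <= high


def position_in_range(relationship, shared_cm, ranges):
    """Where the value sits in the range: 0.0 = bottom, 1.0 = top."""
    low, high = ranges[relationship]
    return (shared_cm - low) / (high - low)


# Two relatives of the same degree can both "fit" and still land far apart.
for cm in (150.0, 280.0):
    print(cm, in_range("REL_X", cm, PLACEHOLDER_RANGES),
          round(position_in_range("REL_X", cm, PLACEHOLDER_RANGES), 2))
```

The point of reporting the position as well as the pass/fail result is that a value near either end of the range is still a fit, so there is no reason to force it toward the average.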

[22CN] Segment-ology: Pro Tools Part 6 Watch Out; by Jim Bartlett 20240711

Pro Tools Part 5

Featured

Posted on July 7, 2024 by Jim Bartlett

Small Segments At Work

I have Match A at 25cM with no Tree; but a nephew at 1773cM, Match B who has a Tree. Match B is a 3C3R on my Ancestor 16P. Analysis of the 1950 census and his grandmother’s obit, gave me the same name as Match A and a place in my Tree. Match B is 9cM to me. Match B has another uncle at 1771cM, Match C. Match C is also listed by name in the grandmother’s obit. Match C is 8cM with me. And, sure enough Match A and Match C share 2315cM [siblings] with each other [corrected 7/9/2024].

This is about as solid as it gets. Clearly the 9cM Match B and the 8cM Match C are true cousins to me, per genealogy. Each of these Matches shares one DNA segment with me. Although this data doesn’t “prove” these 3 segments are the same and linked back to our MRCA 16P, I’d be willing to wager that an upload to GEDmatch would show these segments would Triangulate; and match many other segments from MRCA 16P. In a genealogy sense it doesn’t matter: these Matches belong in my Tree – with or without a DNA link.

I find this example compelling. The old saw: when you hear hoofbeats, think horses. Yes, zebras are a possibility, but the odds in the USA are way in favor of horses. These individuals show up as DNA Matches to me – they share a segment of DNA with me. Some segments are small, some are large. When they come from such a tight fit in one part of my Tree, I’m inclined to believe that they are the same segment. It is “possible” that they each got a randomly different segment, or even false segments, but the logical reasoning is that they share part of the same segment from an MRCA. Why not just accept that for now? Perhaps, someday, some alternative will come up – even so, it would not change the genealogy backed up by records.

Icing on the cake – in reviewing Match C’s shared Matches, Match D (8cM to me) is 3476cM (a daughter) of Match C – another add to my Common Ancestor spreadsheet and to my Tree.

Bottom Line: ProTools is providing a lot of great bread crumbs to follow; and linking a lot of small cM Matches to my Tree. Be sure to scroll to the bottom of a ProTools Shared Match list, looking for high cM interrelationships! Don’t discard genealogy “finds”, just because they share small cMs.

[22CM] Segment-ology: ProTools Part 5 Small Segments At Work; by Jim Bartlett 20240707

Pro Tools Part 4

Featured

Posted on July 5, 2024 by Jim Bartlett

The Spreadsheet

By popular request, below is a section of my Common Ancestor Spreadsheet. Shown are most of the essential columns. In order to fit the space I have in this blog, I’ve deleted a number of columns that I use to record emails, TGs, Notes, Y or mtDNA possibilities, etc. – they are not pertinent to the point of this post. On the far right are 3 columns for cMs between a Match and the *Match for that column.

Common Ancestor Spreadsheet with columns for Shared cM between Matches

Note this part of the spreadsheet is for DNA cousins on my Ancestor John H BARTLETT b 1804 (married to Sarah FLEMING). For each Match, I have their Name, any Admin, cM (with me), # segs, Ahnentafel of MRCA couple (all are 16 in this section), Cousinship; and then the given name and birth year of the child of the MRCA through which they descend; same for grandchild; and Great grandchild; and then a column for more descendants if desired (all in one cell – and I usually run this out – down to the Match). The ** in green means that Match (and the path) is in my Tree. The next columns are for entering an * for a Key *Match and the amount of shared cM between the other Matches and that *Match. You’ll note near the bottom of the spreadsheet, child James b 1836 is listed – he is the child that I descend from (and cousins on him would be under Ahnentafel 8). Hundreds of other MRCA couples, thousands of Match cousins, are in other sections – all sorted by Ahnentafel # and birth year columns.

To do a perfect matrix, I’d need to have 67 columns to show all of the pair-wise relationships. I think I can get a pretty good picture from only one Match per grandchild. And, of course, as I find Shared cMs over about 100, I usually go down each of those rabbit holes and wind up adding most of those Matches to my spreadsheet.
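To see why a full matrix gets out of hand, here is a minimal Python sketch of the pair-wise bookkeeping. It assumes the 67 columns correspond to 67 Matches in this section; the function names and the sample names are hypothetical. With n Matches there are n(n-1)/2 distinct pairs to look up.

```python
from itertools import combinations


def pair_count(n):
    """Number of distinct pairs among n Matches."""
    return n * (n - 1) // 2


def build_matrix(names, pair_cm):
    """Symmetric lookup table from recorded pairs.

    pair_cm maps frozenset({a, b}) -> shared cM. A value of None in the result
    means 'nothing recorded', which could be an unchecked pair or a non-match.
    """
    matrix = {a: {b: None for b in names if b != a} for a in names}
    for pair, cm in pair_cm.items():
        a, b = tuple(pair)
        matrix[a][b] = cm
        matrix[b][a] = cm
    return matrix


def unrecorded_pairs(names, pair_cm):
    """Pairs that still have no shared cM recorded."""
    return [(a, b) for a, b in combinations(names, 2)
            if frozenset((a, b)) not in pair_cm]


print(pair_count(67))  # 2211 pairs for a 67-Match section
```

Recording only the pairs for a few *Match columns, as described above, fills a small slice of those 2,211 cells, which is the trade-off being made.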

Please feel free to use as much of this format as you like, AND to add/delete/shift columns to suit your own style of research and analysis.

[22CL] Segment-ology: ProTools Part 4 The Spreadsheet; by Jim Bartlett 20240705a

Pro Tools Part 3

Featured

Posted on July 5, 2024 by Jim Bartlett

BLUF – The matrix which can be created by all the shared cM relationships is also showing the range of cousins who don’t Match each other.

I have now shifted to using my Common Ancestor Spreadsheet to analyze the cMs between my Matches. This spreadsheet lists about 9,000 Matches who are known cousins on specific Ancestors (a small percentage of Matches share multiple Common Ancestors with me). The backbone of this spreadsheet is a list of all my Ancestor couples out to 8C level (and some beyond), with columns for their Ahnentafel number (e.g. 16); and husband’s birth year. Under that goes a row for each Match with that Ancestor as a Common Ancestor, with the Ahnentafel number and cousinship (e.g. 3C1R), and the given name and birth year of the child of the CA couple from whom the Match descends. The next two columns to the right are the Match’s Ancestor who is the grandchild of the CA, and their birth year; etc. With this setup, I can sort on Ahnentafel Number and the first birth year column and then the second birth year column and the whole spreadsheet sorts into family groups.
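The three-level sort is easy to reproduce outside a spreadsheet. Here is a minimal Python sketch, assuming each row is a small dictionary; the field names and all rows other than John H BARTLETT b 1804 and his child James b 1836 are invented for illustration. It also assumes the husband's birth year sits in the first birth year column, which is why the couple's own row floats to the top of its group.

```python
def sort_key(row):
    """Ahnentafel number, then first birth year, then second birth year."""
    return (row["ahn"], row["year1"], row["year2"] or 0)


rows = [
    {"ahn": 18, "year1": 1840, "year2": 1865, "who": "Match 5 (invented)"},
    {"ahn": 16, "year1": 1836, "year2": 1870, "who": "Match 3 via James b 1836"},
    {"ahn": 16, "year1": 1830, "year2": 1858, "who": "Match 2 (invented)"},
    {"ahn": 16, "year1": 1804, "year2": None, "who": "COUPLE 16: John H BARTLETT b 1804"},
    {"ahn": 16, "year1": 1830, "year2": 1852, "who": "Match 1 (invented)"},
    {"ahn": 18, "year1": 1810, "year2": None, "who": "COUPLE 18 (invented year)"},
]

for row in sorted(rows, key=sort_key):
    print(row["ahn"], row["year1"], row["year2"], row["who"])
```

Matches descending from the same child and the same grandchild end up on adjacent rows, which is what makes the shared cM values in a * column readable as family groups.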

I am now selecting a Match and entering a * in a new column; and then, in that column, the cM of their closest Matches already in the spreadsheet. [NB: As previously reported, I’m also finding Matches who are very close relatives to the *Match (sometimes a parent or child or 1C), which causes me to go down that rabbit hole – which, in turn, frequently results in a new known cousin Match added to the spreadsheet – it’s like drinking through a fire hose.]

Anyway, as I now look down the amount of Shared cM between Matches (in a * column), I can clearly see the parents/children, siblings, aunts/uncles/nieces/nephews and close 1C and 2C in close rows of the spreadsheet. The Shared cMs get smaller and smaller up and down the spreadsheet – in fairly predictable order as the spreadsheet has different “layers” of relationships – it’s very comforting to see this pattern. Mind you, it’s not a straightforward “curve” – there is the same “jumble” that is reflected in the Shared cM Project cMs – the overlap of possible ranges among different cousinships.

The other thing that is showing up under a *Match, is that not all the 3C or 4C or 5C are showing up as Matches. This is expected. Remember the rough estimates that true 3C only match 90% of the time; and 4C only match about 50% of the time; etc. I would need to have 9,000 columns, to perform a full analysis, and that probably isn’t in the cards. Perhaps one of the 3rd party programmers can come up with an automated program to do this…

Bottom line: for now, it appears the concept of “true cousins don’t always match each other” is alive and well in the Shared cM data…

[22CL] Segment-ology: ProTools Part 3 TIDBIT; by Jim Bartlett 20240705

Pro Tools Part 2

Featured

Posted on July 2, 2024 by Jim Bartlett

A ProTools Epiphany…

As I walk down my hitherto unknown Matches, I’m setting up a small spreadsheet for each group.

Important: Starting with a “base” Match, scroll through *all* the Shared Matches – looking for those who share lots of shared cM with each other. Generally 90cM is a good threshold – these Shared Matches (with each other) would generally be 1C or 2C to each other. If I drop down to about 50cM, I get 3C & 4C too. Feedback from the LEEDS method indicates these over-90cM Matches tend to share the same grandparent. This can occur between two Matches who share much less cM with you. Your Matches may be fairly distant; but among themselves they are closely related. Often, some of these Matches are known to you – either through a good Tree or a ThruLines clue (with reference material). Looking through *all* of the Shared Matches, and then through *their* Shared Matches, I’ve usually found a group of Matches who are closely related to each other on some branch of my Tree.
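One way to picture "Matches who are closely related to each other" is as chains of pairs at or above the threshold. Below is a minimal Python sketch that groups Matches this way (simple linking with a union-find structure). This is an illustration, not a ProTools feature or the Leeds method: the function name is hypothetical, the pair list is assumed to be transcribed by hand, and only the 90cM threshold comes from the text above.

```python
def group_by_shared_cm(pairs, threshold=90.0):
    """Group Matches linked by any chain of pairs sharing >= threshold cM.

    pairs: iterable of (match_a, match_b, shared_cm_between_them).
    Returns a list of sorted name lists, largest group first.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b, cm in pairs:
        root_a, root_b = find(a), find(b)
        if cm >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for name in list(parent):
        groups.setdefault(find(name), []).append(name)
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


# Invented pairs for illustration.
pairs = [("A", "B", 1773), ("A", "C", 2315), ("C", "D", 3476),
         ("E", "F", 60), ("E", "A", 25)]
print(group_by_shared_cm(pairs))  # [['A', 'B', 'C', 'D'], ['E'], ['F']]
```

A caution on the design: chaining can merge two families through a single Match who belongs to both, so a large group is a prompt to look closer rather than a conclusion.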

Epiphany: At this point it is not critical, or even necessary, to “pin the tail on the donkey” precisely. These Matches may well be 3C or 4C or 5C or more to you, but they collectively anchor a sub-branch of your Tree. The fact that they share high-cMs with each other, is a very strong indication that their bond is strong and correct [classic genealogy triangulation]. And, even though they are more distantly related to you, their *grouping* is a strong indication that they are related to you through your Common Ancestor to that sub-branch.

Each of the Matches in this sub-branch (including those without Trees of their own), becomes a strong “tell-tale” that tracks a Shared Match Cluster and/or a Triangulated Segment of your DNA.

There is so much new ground to cover here, that I’m now shifting my focus. Instead of trying to fit each Match into a specific place in my Tree, with detailed genealogy research, I’m just highlighting the groups who are clearly descended from a specific person in my Tree. This specific person may be a child, or grandchild, or great grandchild of my Ancestor. At this point, it doesn’t add any more value to my Tree as a whole to know exactly how they relate to each other – just that they do closely relate to each other. Five or ten or twenty of my unknown Matches are now under a grandson of one of my specific Ancestors – although, they may all be around 5C to me. Other Matches cannot be in that sub-branch, unless they share an appropriate amount of DNA with others in that group. So, using inverse logic, we must find a different sub-branch for these other Matches.

This process remains a hoot, and a game changer at Ancestry. I really think a large percentage of our Matches can now be correctly put into sub-branches of our Trees. This also highlights Match groups which will be helpful in getting through Brick Walls. Every IBD Match has to tie into some part of our Tree – ProTools is looking like a great tool to help place those Matches, and perhaps identify some small-cM false Matches. For me this clearly helps identify small cM Matches who are related within a genealogy timeframe (as opposed to being very distantly related).

[22CK] Segment-ology: ProTools Part 2 TIDBIT; by Jim Bartlett 20240702

My Take on Ancestry Pro Tools

Featured

Posted on June 29, 2024 by Jim Bartlett

A Segment-ology TIDBIT

Ancestry ProTools includes several features. ProTools costs $10/month extra on an existing Ancestry account. This post is focused on the cMs between Shared Matches. I’ve fiddled with it for a few days, and, of course, have come up with a helpful spreadsheet.

One method: Focus on a “base” Match of interest to you.

Start with a Match of interest to you (often a high-cM Match with an unknown, or iffy, link to your Tree). I call this the “base” Match. Click on Shared Matches (to use ProTools or subscribe to it).

The resulting Shared Match list (with ProTools) took me a while to get used to. It is essentially a list of the Matches that you and your selected Match have in common. This is a fundamental building block of Shared Match Clustering, and Matches who appear on each other’s Shared Match lists tend to all have the same Common Ancestor. These Clusters can include Matches with Trees (where you can search for a Common Ancestor among them); as well as Matches with Unlinked Trees, Private Trees and NO Trees.

However, the ProTools Shared Match list also reveals the cMs shared between your “base” Match and each of the Shared Matches on the list. These cMs may, or may not, be significant information. So far, the list is only arranged by cMs shared between you and the Matches. I’ve found my “go to” process is to scroll down the right hand list and check the cMs shared between your “base” Match and each of the Shared Matches. This is often a wide range of cM values – from 20cM up to some real surprises. These surprises may be on the last page of Shared Matches – so, for me, it’s well worth the time to look at all the pages of Shared Matches (20 Matches per page). There is a rumor that Ancestry is working on a way to let us sort on this value. I have found several cases of a relatively small Match to me, who is a parent, or child, or sibling, or other very close relationship to the “base” Match. This is often a “BINGO” for me – particularly when one or the other of this duo doesn’t have a Tree. These close relationships can also be game changers – 1C, 2C or even 3C can show a family group in one “sub-branch” of your Tree – importantly, separated from other branches.
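Until a sort on that right-hand value is available, a transcribed list can be re-ranked by hand or in code. Here is a minimal Python sketch, assuming the Shared Match list has been copied out in its on-screen order as (name, cM with me, cM with the base Match). The function name and the sample rows are hypothetical; the page size of 20 comes from the text above.

```python
PAGE_SIZE = 20  # Matches per page, as reported above


def rank_by_base_cm(rows):
    """Re-rank a Shared Match list by cM shared with the base Match.

    rows: list of (name, cm_with_me, cm_with_base) in on-screen order.
    Returns (name, cm_with_me, cm_with_base, page) tuples, highest base cM
    first, where page is the page the Match originally appeared on.
    """
    tagged = [(name, mine, base, i // PAGE_SIZE + 1)
              for i, (name, mine, base) in enumerate(rows)]
    return sorted(tagged, key=lambda r: r[2], reverse=True)


# Invented list of 45 Shared Matches; the surprise sits at the very bottom.
rows = [(f"Match {i}", 200.0 - i, 30.0) for i in range(45)]
rows[44] = ("Match 44", 156.0, 1773.0)

print(rank_by_base_cm(rows)[0])  # ('Match 44', 156.0, 1773.0, 3)
```

Keeping the original page number alongside each row makes it easy to go back and find the Match on the site.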

Inverse Logic: If you are pretty sure of some Matches who descend from one child of a particular Ancestor, and a group of Matches (among themselves), appear to be on the same line, but their cMs with you are somewhat smaller than the other Matches from that Ancestor, then this is a strong clue they are related another generation back, or so.

In any case, this info can be very valuable in conjunction with a WATO analysis at DNAPainter.

Another method: Work on your top Matches on one branch of your Tree.

Of course I tried several spreadsheet methods. The one that works best for me is a list of my top Matches on one branch of my Tree. I determine these Matches from my Notes (derived from ThruLines; Clusters; Unlinked Trees; blind luck; etc). Almost all are captured in my Common Ancestor Spreadsheet. Since I know how most of my Matches relate at the grandparent level, I focused on the Great Grandparent groups and/or 2xG Grandparents who were on my paternal side. In other words, on known, or suspected, Ahnentafels: 8P, or 16P and 18P, or occasionally one more generation back (32P-39P).

I walked down my Paternal List of Matches and selected the ones I had Notes for that indicated they were from my targeted branch, or, based on previous Clustering, who were Likely to be on that targeted branch (Likely Matches were labeled with an “L”, and usually had NO or very small Trees.) I listed the Match Name, cM, Relationship (e.g. 8P/2C1R), and sometimes the Child the Match descended from. Feel free to add any columns that might be helpful to your analysis – columns can always be moved or deleted or hidden. Out of this list I selected a Key Match (often unknown) and put an asterisk (*) adjacent to them in a new column. I then clicked on the Key Match’s Shared Matches and reviewed that list – on the right side was the shared cM with each Match. Initially I went from top to bottom of that list and put the shared cM amount in the column under the * and in the row for the match – creating a matrix of sorts. After a few iterations, I limited this to shared cM amounts over about 50cM and highlighted amounts over 90cM. As indicated above, I sometimes found very large cMs, indicating very close relationships – clearly on one particular branch twig in my Tree; and sometimes one Match had a full Tree and the others did not (very useful, bringing Matches with little to no info into play). One vexing Match has a father born the same year as me, so I can assume a 1R relationship (and her 162cM is 2C1R 53% of the time per DNA Painter). AND I note she shares 1883cM with another Match who is highly suspected of having an NPE bio-parent in my Tree – the clues are adding up.

The method above is also creating a sub-branch, that could very well be from an unknown wife/mother (39P) for whom I have very few Matches so far. In these cases, I’m creating additional * columns for the highest cM Match in that group and looking at their Shared Matches – looking for one of their closer Matches who might have a Tree; or looking for other Shared Matches who might provide Trees or other insights – all in all: looking for a Cluster that might go back to 39P…

As I’m playing with this method, and adding more * columns (creating a matrix), I’m basically identifying all my Matches on my 8P/9P MRCA branch, and subdividing them into sub-branches. This will get me to a good Cluster from Matches back through 8P/9P MRCA to 18P/19P to 38P/39P and ultimately to the 78P/79P MRCA who are parents of my unknown wife/mother: 39P.

Traditional Clustering methods can do this alone, but knowing the cM relationship between the Matches helps a lot.

Clearly I’ll be spending time with this new spreadsheet. I can add new Matches that are close to my key Matches but may be under 50cM, or even at 20cM, with me, but with helpful Trees and/or Unlinked Trees. At any rate, it’s easy to sort the spreadsheet on an * column, and easily see Matches who should be grouped on a sub-Branch. And, at any time, I can easily use DNAPainter’s WATO tool to focus on likely Branches. It’s a whole lot easier to find a link by building a Match’s small tree back, when I have good intel on the Surnames and geography and timeframes.

ProTools identification of shared cMs between Matches is a strong addition – well worth $10 for a trial month, IMO.

Please feel free to post your own methods of squeezing out more info using this feature of ProTools.

[22CJ] Segment-ology: My Take on Ancestry ProTools TIDBIT; by Jim BARTLETT 20240629

Shared Segments for Small Segments

Featured

Posted on June 5, 2024 by Jim Bartlett

The Shared cM Project is an important and powerful tool for genetic genealogy – particularly with its integration with the DNA Painter tools. Over 60,000 submissions is impressive.

Two observations on the Shared cM Project – a very high percentage of the submissions were for the closer relationships; and the data was from many different users (perhaps with varying degrees of accuracy).

I now have over 9,000 entries in my Common Ancestors spreadsheet [see my blogpost about this spreadsheet tool]. I’ve curated these down to 7,800 entries from 1C to 8C, that I am pretty confident are correct. Also, my analysis is that there is a high probability, based on Trees and Shared Match Clusters, that each shared segment is from the Common Ancestor.

So I decided to compile cM statistics from my own curated data. I also wanted to see how the small cM relationships played out.

My data is not nearly as robust as the Shared cM Project was for 4C and closer relationships. However, in the 5C range my data was closer; and in the 6C to 8C range I generally had more data points than the Shared cM Project. This reflects my emphasis on all Ancestors out to 8C range.

Overall, there were no big surprises. In general, for 6C to 8C my data was in a tighter range; and I had some data for distant relationships that weren’t in the Shared cM Project (but, no surprises).
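For anyone wanting to compile the same kind of statistics from their own curated list, here is a minimal Python sketch. It assumes each entry is just a (cousinship, shared cM) pair; the function name is hypothetical and the sample values are invented for illustration, not taken from my spreadsheet or from the Shared cM Project.

```python
from collections import defaultdict
from statistics import mean


def cm_stats(entries):
    """Count, low, average and high shared cM for each cousinship.

    entries: iterable of (cousinship, shared_cm) pairs.
    """
    by_rel = defaultdict(list)
    for rel, cm in entries:
        by_rel[rel].append(cm)
    return {
        rel: {
            "n": len(values),
            "min": min(values),
            "mean": round(mean(values), 1),
            "max": max(values),
        }
        for rel, values in sorted(by_rel.items())
    }


# Invented sample entries.
sample = [("6C", 8.0), ("6C", 12.0), ("6C", 10.0), ("7C1R", 7.0)]
for rel, stats in cm_stats(sample).items():
    print(rel, stats)
```

The count column matters as much as the range: a tight range built on a handful of entries says less than a broader one built on hundreds.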

Bottom Line: In my opinion the ranges in the Shared cM Project are a little broad – probably a reflection of data from so many sources. I think the broader ranges give folks more wiggle room with low percentage probabilities, when they should really be looking for other possibilities.

Here is my table comparing my data with the Shared cM Project data – the top row indicates the full cousinship; once removed (1R) and twice removed (2R):

The significant increase in data points at the 6C level reflects the power of ThruLines to build Trees back (subject to my review); but only out to 6C.

As always, feedback is welcomed.

[06F] Segment-ology: Shared cMs for Small Segments; by Jim Bartlett 20240605