About Jim Bartlett

I've been a genealogist since 1974; and started my first Y-DNA surname project in 2002. Autosomal DNA is a powerful tool, and I encourage all genealogists to take a DNA test.

Clustering vs Triangulation

Recently, I was called out for abandoning Segment Triangulation.  Let me set the record straight. I like, and use, both Clustering and Triangulation – virtually every day! Each one has some unique strengths and some drawbacks. A (very) short review….

Triangulation. When I started Segmentology, in 2015, I was the Johnny Appleseed of Segment Triangulation. I don’t claim to have invented it, but I was definitely a fan, and, I think, took it to new levels. It has been called the “Gold Standard” in genetic genealogy. Your DNA is composed of specific DNA segments from specific Ancestors – Triangulation helps you to determine these segments, group your Matches by shared segments (and therefore ancestral lines), and develop (or “paint”) a Chromosome Map of segments. However, the critical step of determining the ancestral line is often hard (at least for me) because the companies providing segment data, in general, don’t have any/many good Trees.

Clustering. In late 2018, auto-Clustering was introduced, and we could easily get many Clusters depending on the range of cMs we used. With this tool, you could determine these families of cousins, group your Matches by shared Matches, and try to determine the Common Ancestor. This was a tool we could use with AncestryDNA (either auto- or manual Clustering), where there tended to be many more Trees and genealogy tools. It worked for me… However, at AncestryDNA we cannot get the segment data to tie Matches to DNA segments.

By 2020 I had finished my Triangulation and had 372 Triangulated Group (TG) Segments. I was growing frustrated because I wasn’t getting very far finding Common Ancestors. On the other hand, I was finding a lot of Clusters with pretty solid Common Ancestors. So, I shifted my focus and have mainly been using Clusters, ever since. I still look for close Matches with larger segments for Triangulation. But my focus is on confirming more distant Ancestors and working on Brick Walls – mostly with Shared Match Clustering.

Bottom Lines:

1. Comprehensive Triangulation is a lot of hard work; but it can be a good tool for specific segments that appear to come from/through a Brick Wall.

2. Clustering is somewhat easier and focuses more directly on the genealogy. I think it’s more fun; and a better tool for many hobby genealogists. A spreadsheet is not required (but it is helpful to track everything).

3. Again: I use both, every day!

[22DL] Segment-ology: Clustering vs Triangulation; by Jim Bartlett 20260407

A Calculated Guess Is Great

In genetic genealogy, DNA is a tool. It helps us, in many different ways, to determine Ancestors and confirm cousins. My point in this blogpost is that we don’t need to know precisely how we are related to each cousin – a calculated guess is fine. In fact, I encourage it.

Here is an example. I have determined that XYZ is a 7C. This is based on genealogy. We share my Ancestor couple METZGER/KEIFFER (aka Ahnentafel 352). This Ancestor couple’s full names, dates, places, life story are not important to Segmentology. Match XYZ happens to share 15cM, also not too important, but well within Shared cM range. XYZ is on my Paternal side, as is our MRCA. XYZ and I have over 30 other Matches on our Shared Match list – the top 20 Shared Matches share over 100cM with XYZ. I can tell you that this will show up as a strong, solid Cluster. You get the picture; this is pretty solid…

So now I notice on my Shared Match list with XYZ that, per ProTools, ABC is a 1C to XYZ, sharing 852cM. ABC has NO Tree. However, ABC has a long Shared Match list with over 30 Matches who are known cousins to me through Ahnentafel 352. #A0352P is the first thing in many of my Shared Match Notes, which is all I need to know…

There is no Tree for ABC, but as a 1C to XYZ, I don’t need a Tree. I know XYZ’s grandparents must also be ABC’s grandparents for them to be 1C. So, I confidently add ABC to my Common Ancestor spreadsheet, copy the line of descent I already have for XYZ, and change ABC’s parent to UNK. Done!

I’ve now added ABC to the Shared Match spreadsheet, and can enter a Note for ABC which starts #A0352P. That Note is now visible in all other Shared Match lists (and Clusters) that include Match ABC. This helps me find even more Matches to evaluate and add.

Emboldened by this logic, I am sure I can also add a proposed 2C to my spreadsheet (with  UNK for both parent and grandparent). This will “tuck” many more Matches into my spreadsheet of known cousins, even though I don’t know their parents or grandparents. And those Matches will often highlight other Matches who can be added. For A0352P, I now have 151 confirmed Matches!
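For readers who track lines of descent in code rather than a spreadsheet, here is a minimal Python sketch of the "copy the line and fill in UNK" step described above. The function name `impute_line` and the placeholder generation names (G1, G2, …) are illustrative assumptions, not part of my spreadsheet.

```python
def impute_line(known_line, new_match, unknown_generations):
    """Copy a known line of descent and mark the generations we cannot name.

    known_line: names from the Common Ancestor down to the known Match.
    unknown_generations: 1 for a 1C of the known Match (parent unknown),
                         2 for a 2C (parent and grandparent unknown).
    """
    # Drop the known Match plus the ancestors the two Matches do NOT share.
    shared_part = known_line[: len(known_line) - 1 - unknown_generations]
    return shared_part + ["UNK"] * unknown_generations + [new_match]


# Hypothetical line for a 7C: the CA couple plus eight generations down to XYZ.
xyz_line = ["A0352", "G1", "G2", "G3", "G4", "G5", "grandparent", "parent", "XYZ"]

abc_line = impute_line(xyz_line, "ABC", 1)       # ABC is a 1C to XYZ
second_cousin_line = impute_line(xyz_line, "DEF", 2)  # DEF is a hypothetical 2C to XYZ
```

The new line keeps the same length as the known one, so the imputed Match lands at the same cousin level. The vetting described in the Note still applies before anything is added.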

Note: It’s important that these potential additional Matches be vetted (as above). They should also be part of an appropriate Cluster of Shared Matches. Even the parent of a known Match can be on the wrong side. That is to say, for instance, my line from XYZ goes through her mother – so her tested father could well show up as a close match to XYZ, and to me, but his path to a CA would be different (ie NOT through his wife to A0352P!).

In conclusion, there are two options:

1. Leave this “Match with no Tree” out of my Tree – disavow them as a cousin because *I* don’t know their parent’s name.

2. Accept this Match as a 7C to me; just as much as I accepted XYZ as my 7C.

To me, as a lifelong genealogist, I’d choose Option 2 in a heartbeat. I’d hug this new Match just as strongly at a family reunion. I’d probably ask their parents’ names… But the rest of their line would already be in my spreadsheet. The UNK parent/grandparent wouldn’t make any difference. And, that Match will now help me document other Matches!

AND, often this new cousin Match doesn’t know much of their Ancestry – be a helpful genealogist, and send them a message about the ancestry you are sure they have… Add their line to your Tree, and ask if they’d like you to add names in place of the UNKs!

This is also a plug for the Common Ancestor spreadsheet – a valuable tool for recording found cousins, and for easily seeing how other Matches fit in. Hard work, but it sure highlights strong branches of my family Tree (as well as weak branches).

[35BCa] Segment-ology: A Calculated Guess Is Great; by Jim Bartlett 20260403

From Waterfalls to the Sea

This is an analogy about your DNA.

Recently, a Kona low flooded a lot of Oahu, HI – particularly the North Shore. I watched many videos of torrential rains and swollen waterfalls (one after another); and videos of flooded areas as all that water made its way to the sea.

Close your eyes and think of each waterfall as one of your Ancestors and then imagine each waterfall representing part of your DNA. The water flows in branches to rivers and finally into the Pacific Ocean. Each waterfall is like your DNA, flowing from a distant Ancestor and combining with DNA “flows” from other Ancestors to your parent, and then to you (you are the Pacific Ocean in this analogy).  Many different paths over time and geography winding up with you. And the same is happening on the other side of the mountain, also flowing to the Pacific – representing the DNA from your other parent…

I think it’s a good visual analogy – many Ancestor sources of parts of your DNA flowing and combining until it finally reaches you.

[22DK] Segment-ology: From Waterfalls to the Sea by Jim Bartlett 20260329

ProTools Part 27

Shared Match Relationships

Setup: Whenever I add a Match to my Tree (usually a ThruLines hint that I agree with), I then check the Shared Matches, sorted by ProTools by the closest relationships.  I first scroll down the list to confirm that, indeed, several of them have the same MRCA (the first thing in the Notes field). I then usually look at each one (usually down to 100cM) to see if I can link them to the base Match and/or place them in my Tree and add them to my Common Ancestor spreadsheet. Usually this is done at AncestryDNA, but sometimes at MyHeritage.

Topic: In my spreadsheet I have a column for the relationship to one of the closest Matches. Format: 209cM/1C2R: Match Name.  This is strong, additional, evidence that this branch of my Tree is “fluffing out” correctly. Some observations about this relationship:

1. Usually the relationship is exactly right.

2. Usually AncestryDNA offers two alternative relationships. One is a “full” relationship, like 1C2R; and the other is a “half” relationship, like half great granduncle. These are equivalent from a DNA (cM) “math” standpoint – they would have the same cMs on average – the DNA alone couldn’t tell the difference. But relatively few of your Matches will be “half” (indicating their MRCA is one person with two different mates). You can usually tell them apart by how they fit in your Tree, or by their ages, or by a consensus among their own shared Matches. Bottom line – it’s usually the full relationship.

3. However… in a few cases the relationship doesn’t mesh with where I think they go in my Tree. There usually are other equivalent relationships; and a simple click on the Shared Matches estimate at AncestryDNA will quickly bring up a list. In one such case, 2C1R was on the list and that agreed with the genealogy.  An alternative is to keep the DNA Painter Shared cM Project tool handy – just type in the cM amount to see the equivalent relationship and other relationships that are found almost as frequently.

4. If I cannot find a reasonable close relationship, I force myself to dig a little deeper… Sometimes a Match’s Tree skips a generation or adds an extra one; infrequently the Match has shifted the test taker to a parent or grandparent (the test taker appears to be the child of someone born in 1880…). There are several ThruLines Trees that “skip” a generation in order to generate a Match within 6C range. Sometimes, I can figure it out and put the “corrected” version in my Tree; other times I just set it aside, do NOT include that line in my linked Tree, and highlight it as probably wrong in my spreadsheet and in the Match Notes (so I don’t stumble over it again).

Bottom line: with Shared Matches larger than 100cM (or so – use your judgment), the AncestryDNA relationships are pretty accurate; but occasionally we need to use one of the other, equivalent, relationships. This relationship is a pretty good Quality Control check.
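If you export the spreadsheet, the relationship column format above (209cM/1C2R: Match Name) is easy to split apart in code. This is only a sketch: the function name, the field names and the default 100cM cutoff are my assumptions for illustration, taken from the rule of thumb in this post.

```python
import re

# Matches text like "209cM/1C2R: Match Name"
NOTE_PATTERN = re.compile(r"^(?P<cm>\d+)cM/(?P<rel>[^:]+):\s*(?P<name>.+)$")


def parse_relationship_note(text, threshold_cm=100):
    """Split a relationship note into cM, relationship and Match name.

    Returns None when the text does not follow the expected format.
    """
    found = NOTE_PATTERN.match(text.strip())
    if found is None:
        return None
    cm = int(found.group("cm"))
    return {
        "cm": cm,
        "relationship": found.group("rel").strip(),
        "match": found.group("name").strip(),
        # Rule of thumb from the post: estimates above ~100cM are pretty accurate.
        "above_threshold": cm > threshold_cm,
    }


example = parse_relationship_note("209cM/1C2R: Match Name")
```

Entries that return None, or that fall under the threshold, are the ones worth a second look by hand.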

[22DK] Segment-ology: Pro Tools 27 – Shared Match Relationships; by Jim Bartlett 20260327

Musing…

Volume 1 of Segmentology is done. Fundamentals. What to do next? Some musings – waddayathink?

1. Continue this blog. Volume 1 incorporated many of the 200 blogposts so far, but I have perhaps 100 more in various stages – from title or concept to full drafts not yet published (same “scatter-shot” range of topics…). And, as always, I encourage you to request topics.

2. Focus on Volume 2. Something like “Using the Fundamentals”. At the top of my list would be chapters on Finding Bio Ancestors; Walking The Clusters Back; Compilation of ThruLines TIDBITs; Tying in Floating Branches; Creating Your Personal Shared cM Chart; Some Core Objective Statements… What would be your catchy titles?

3. A Segmentology Collaboration platform or Forum. Some method where we could share our collective experience, insights, objectives, wish lists…. I feel there is a lot of collected wisdom among practicing “Segmentologists” – how can we capture, focus, and build on that? Your ideas are encouraged.

I’m not going to add an “all of the above” category, but that’s where this might go…

This blog has helped – actually “forced” – me to think through, and research, and document Segmentology related concepts – to put them in plain English as best I can. I encourage you to comment on our future path. In the near term I’m going to “unload” some of my backlogged posts. I turn 83 this week, and I’m just not done with this Segmentology journey…

[99F] Segment-ology: Musing… by Jim Bartlett 20260325

Segmentology Fundamentals

A Segmentology eBook is now available for free download at ISOGG Wiki:  https://isogg.org/wiki/Segmentology_Fundamentals

10 Chapters in 3 GROUPS (Segments, Groups, Tools) and a robust Glossary

Special thanks to all of you who have provided so much valuable feedback and encouragement on this Segmentology journey.

Sincerely,

Jim Bartlett

[99E] Segment-ology: Segmentology Fundamentals; by Jim Bartlett 20260322

Insights into Matches

In my last post I outlined two insights from analysis of my 8700 Matches at AncestryDNA with confirmed Common Ancestors (CAs): the number of Matches increases dramatically with each generation going back to the 6C level (where ThruLines ferrets out a lot of my cousins); and the average cMs flattens out in the mid-teens beyond the 4C level.

For this post I analyzed the Matches to see the distribution based on shared cMs.

Shown and not shown are 1491 Matches over 20cM, about 17% of the total. But the insight is that 83% of the Matches are from 6 to 20cM. And you can easily see the spike at 9cM. You’ll also notice the Matches at 6 and 7cM which I saved just before the AncestryDNA change in the lower threshold several years ago. I’m not sure why there is a drop at 8cM – maybe because I haven’t found a lot of Matches at the 7C level and beyond.
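The percentages above are simple to reproduce. Here is a small Python sketch, assuming the round figure of 8700 Matches quoted at the top of this post. The helper `share_above` and its sample list are hypothetical, just to show how you could run the same check on your own cM column.

```python
def share_above(cm_values, threshold):
    """Return the fraction of Matches sharing more than `threshold` cM."""
    if not cm_values:
        return 0.0
    return sum(1 for cm in cm_values if cm > threshold) / len(cm_values)


# Figures quoted in the post: 1491 Matches over 20cM out of roughly 8700.
over_20 = 1491
total = 8700
pct_over = round(100 * over_20 / total)
pct_under = 100 - pct_over

# Hypothetical cM column from a spreadsheet export.
sample_share = share_above([8, 9, 12, 25], 20)
```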

At this point, as a life-long genealogist, I want to reiterate that cousins are where you find them and by far most are under 15cM (what we usually call small segments). And this is just the tip of the iceberg, because most of our true cousins beyond 4C (who have taken a DNA test) do not show up as DNA Matches. Most of my under-15cM Matches are also part of interrelated family groups (per ProTools), and their lines usually agree with standard genealogy research. A small percentage don’t and I remove them from the spreadsheet and this analysis.

Everyone has their own objectives in genetic genealogy. I encourage you to think about yours and write them down. Collecting cousins is not my objective, but documenting interrelated cousins in family groups (with ProTools) and building evidence for each Ancestor is. This includes finding a few Ancestors that don’t “look” right and turn out to be NPEs. Or using Triangulated Groups or Clusters or Floating Branches to build evidence to break through Brick Walls/NPEs.

Clearly this is genealogy “big picture”. It forces me to treat all lines and Ancestors equally (yes, after I’ve spent a lot of time on my favorites). However, some of these insights will also help with “targeted” objectives into specific areas of our genealogy.

[06H] Segment-ology: Insights into Matches; by Jim Bartlett 20260125

Insights into cM Patterns

I now have over 8,700 Matches at AncestryDNA with a confirmed Common Ancestor (CA) with me between 2C and 8C. See my Common Ancestor Spreadsheet post here. That’s a lot of data, so I thought I’d do some analysis. In 2024 I posted (here) my averages for 3C to 8C which roughly agreed with the Shared cM project.

Below is a table summarizing all of my data (including full cousins, half cousins and removed cousins). For each relationship there are columns for the number of Matches, the average cMs, the lowest cM, the highest cM; plus the number of generations (meiosis events), and average cMs for each. The table is then repeated with a sort based on meiosis events.

A word about meiosis events. They are the count from me up to the CA and then back down to the Match. Like generations… A 1C is 4 events (two up to the grandparent (the CA), plus two back down to the Match). The number of meiosis events with a 1C2R is 6 (two up and four down). A half relationship adds one to the meiosis events – e.g. a 4C1R is 11 events; and a 4C1Rh is 12 events. These are important because in a mathematical simulation, each event reduces the cM by half. From the Shared cM Project, a 1C (4 events) averages 866cM, compared to a 2C1R (7 events) at 122cM, which is roughly 866 halved three times (about 108cM). Remember, it’s an order of magnitude thing. And, as we shall see, it generally works for close relationships (like 1C and 2C), but drifts away for more distant relationships (like 4C and beyond).

Important: this is not biology’s fault, it’s the math’s fault. It’s because we have a LOT of true distant cousins that do NOT share matching DNA with us; and they are not reflected in the averages. This (lack of a normal curve) is highlighted in the second sort (by meiosis numbers) below. This is also reflected in the DNA Painter Shared cM Project tool which shows different groups of Matches for a given input cM value. For example at DNA Painter, plug in 55cM… the 29% group of 3Ch, 3C1R, 2C3R and 2C2Rh are all 9 meiosis events; and the second group of 4C, 3C1Rh, and 3C2R are all 10 meiosis events. This also demonstrates that by the time we get down to 3C and 4C levels there is a lot of overlap.
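The counting rule above can be written down as a short Python sketch. It assumes the relationship codes used in my tables (for example 1C, 2C1R, 4C1Rh, with a trailing "h" for half). The function names are illustrative, and the halving estimate is only the simple "math" model described above, not a prediction of what any one Match will share.

```python
import re

# Codes like "1C", "2C1R", "4C1Rh": cousin degree, optional removes, optional half.
REL_PATTERN = re.compile(r"^(\d+)C(?:(\d+)R)?(h)?$")


def meiosis_events(relationship):
    """Count meiosis events: up from me to the CA, then down to the Match."""
    found = REL_PATTERN.match(relationship)
    if found is None:
        raise ValueError(f"Unrecognized relationship code: {relationship}")
    degree = int(found.group(1))
    removed = int(found.group(2) or 0)
    half = 1 if found.group(3) else 0
    # A full kC is (k + 1) events up and (k + 1) down; each remove adds one more
    # event on one side, and a half relationship adds one.
    return 2 * (degree + 1) + removed + half


def halved_estimate(base_cm, base_events, events):
    """Halve the base cM once for each additional meiosis event."""
    return base_cm / 2 ** (events - base_events)


# 1C average of 866cM (4 events) carried out to a 2C1R (7 events).
estimate_2c1r = halved_estimate(866, meiosis_events("1C"), meiosis_events("2C1R"))
```

The estimate for 2C1R comes out a little above 108cM, in the same ballpark as the 122cM average quoted above, which is the order-of-magnitude point.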

For this first table, the takeaway is that the number of Matches with CAs increased dramatically with each generation. [Note I combine full cousin with cousin 1R because at my age, most Matches will be a generation younger than me] 3C & 3C1R: 196 Matches; 4C & 4C1R: 662 Matches; 5C & 5C1R: 1,606 Matches; 6C & 6C1R: 3,425 Matches. WOW, what an increase in the number of Match cousins. And then we have 7C & 7C1R: 584 Matches; 8C & 8C1R: 363 Matches. What happened? Why the steep decrease in numbers? Well, IMO, the major factor is that AncestryDNA’s ThruLines quits at 6C – ThruLines can “see” into private Trees (I cannot); and it roots out MRCAs with the smallest of Trees (I don’t have that time). I can only dream of how many ThruLines I’d get at the 7C and 8C levels. Some of the ones I have now were found/recorded when we had Circles at Ancestry.

The point is: there are LOTS of cousins still waiting to be determined. ProTools is helping.

Table 1: 8,799 AncestryDNA Matches Summarized by Relationship

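| MRCA | # Matches | avg cM | low cM | high cM | meiosis |
|---|---|---|---|---|---|
| 1C2Rh | 3 | 138 | 78 | 200 | 7 |
| 2C | 1 | 269 | | | 6 |
| 2C1R | 14 | 127 | 34 | 220 | 7 |
| 2C2R | 8 | 47 | 39 | 162 | 8 |
| 2C3R | 2 | 34 | 22 | 140 | 9 |
| 3C | 57 | 63 | 13 | 208 | 8 |
| 3Ch | 5 | 20 | 16 | 95 | 9 |
| 3C1R | 139 | 28 | 6 | 148 | 9 |
| 3C1Rh | 26 | 28 | 7 | 111 | 10 |
| 3C2R | 106 | 22 | 6 | 68 | 10 |
| 3C2Rh | 20 | 20 | 6 | 92 | 11 |
| 3C3R | 34 | 22 | 6 | 58 | 11 |
| 3C3Rh | 12 | 23 | 8 | 40 | 12 |
| 3C4R | 1 | 20 | | | 12 |
| 3C4Rh | 2 | 10 | 8 | 12 | 13 |
| 4C | 128 | 24 | 6 | 220 | 10 |
| 4Ch | 7 | 12 | 6 | 19 | 11 |
| 4C1R | 534 | 20 | 6 | 114 | 11 |
| 4C1Rh | 33 | 16 | 6 | 30 | 12 |
| 4C2R | 267 | 16 | 7 | 92 | 12 |
| 4C2Rh | 12 | 12 | 6 | 39 | 13 |
| 4C3R | 27 | 16 | 6 | 44 | 13 |
| 4C4R | 1 | 17 | 17 | 17 | 14 |
| 5C | 469 | 16 | 6 | 62 | 12 |
| 5Ch | 29 | 17 | 6 | 27 | 14 |
| 5C1R | 1137 | 14 | 6 | 60 | 13 |
| 5C1Rh | 7 | 14 | 6 | 60 | 14 |
| 5C2R | 300 | 14 | 6 | 41 | 14 |
| 5C2Rh | 9 | 14 | 7 | 27 | 15 |
| 5C3R | 75 | 14 | 6 | 40 | 15 |
| 5C3Rh | 1 | 18 | 18 | 18 | 16 |
| 5C4R | 2 | 10 | 10 | 10 | 16 |
| 6C | 1922 | 12 | 6 | 56 | 14 |
| 6Ch | 97 | 11 | 6 | 25 | 15 |
| 6C1R | 1503 | 12 | 6 | 52 | 15 |
| 6C1Rh | 58 | 10 | 6 | 22 | 16 |
| 6C2R | 618 | 12 | 6 | 44 | 16 |
| 6C2Rh | 47 | 12 | 6 | 29 | 17 |
| 6C3R | 12 | 15 | 6 | 30 | 17 |
| 7C | 262 | 13 | 6 | 41 | 16 |
| 7Ch | 10 | 15 | 6 | 39 | 17 |
| 7C1R | 322 | 12 | 6 | 43 | 17 |
| 7C1Rh | 7 | 17 | 6 | 43 | 18 |
| 7C2R | 17 | 16 | 6 | 25 | 18 |
| 7C3R | 5 | 15 | 6 | 18 | 19 |
| 8C | 310 | 12 | 6 | 35 | 18 |
| 8Ch | 6 | 10 | 7 | 17 | 19 |
| 8C1R | 53 | 16 | 6 | 37 | 19 |
| 8C2R | 12 | 17 | 8 | 19 | 20 |
| 8C3R | 7 | 10 | 6 | 13 | 21 |
| 9C | 63 | 14 | 6 | 24 | 20 |
| Total | 8799 | | | | |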

For the second table, the takeaway is that the average cMs track pretty close to each other at the same meiosis numbers. And after meiosis level 9, which averages 27cM, the “curve” quickly “flatlines” in the mid-teens. This is reflected at DNA Painter with many relationships all in play under 20cM.

Table 2: 8,799 AncestryDNA Matches Summarized by Meiosis Events

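| MRCA | # Matches | avg cM | low cM | high cM | meiosis | avg cM (per meiosis level) |
|---|---|---|---|---|---|---|
| 2C | 1 | 269 | | | 6 | 269 |
| 1C2Rh | 3 | 138 | 78 | 200 | 7 | |
| 2C1R | 14 | 127 | 34 | 220 | 7 | 132 |
| 2C2R | 8 | 47 | 39 | 162 | 8 | |
| 3C | 57 | 63 | 13 | 208 | 8 | 55 |
| 2C3R | 2 | 34 | 22 | 140 | 9 | |
| 3Ch | 5 | 20 | 16 | 95 | 9 | 27 |
| 3C1R | 139 | 28 | 6 | 148 | 9 | |
| 3C1Rh | 26 | 28 | 7 | 111 | 10 | |
| 3C2R | 106 | 22 | 6 | 68 | 10 | 25 |
| 4C | 128 | 24 | 6 | 220 | 10 | |
| 3C2Rh | 20 | 20 | 6 | 92 | 11 | |
| 3C3R | 34 | 22 | 6 | 58 | 11 | 18 |
| 4Ch | 7 | 12 | 6 | 19 | 11 | |
| 4C1R | 534 | 20 | 6 | 114 | 11 | |
| 3C3Rh | 12 | 23 | 8 | 40 | 12 | |
| 3C4R | 1 | 20 | | | 12 | |
| 4C1Rh | 33 | 16 | 6 | 30 | 12 | 18 |
| 4C2R | 267 | 16 | 7 | 92 | 12 | |
| 5C | 469 | 16 | 6 | 62 | 12 | |
| 3C4Rh | 2 | 10 | 8 | 12 | 13 | |
| 4C2Rh | 12 | 12 | 6 | 39 | 13 | 13 |
| 5C1R | 1137 | 14 | 6 | 60 | 13 | |
| 4C3R | 27 | 16 | 6 | 44 | 13 | |
| 4C4R | 1 | 17 | 17 | 17 | 14 | |
| 5Ch | 29 | 17 | 6 | 27 | 14 | |
| 5C1Rh | 7 | 14 | 6 | 60 | 14 | 15 |
| 5C2R | 300 | 14 | 6 | 41 | 14 | |
| 6C | 1922 | 12 | 6 | 56 | 14 | |
| 5C2Rh | 9 | 14 | 7 | 27 | 15 | |
| 5C3R | 75 | 14 | 6 | 40 | 15 | 13 |
| 6Ch | 97 | 11 | 6 | 25 | 15 | |
| 6C1R | 1503 | 12 | 6 | 52 | 15 | |
| 5C3Rh | 1 | 18 | 18 | 18 | 16 | |
| 5C4R | 2 | 10 | 10 | 10 | 16 | 13 |
| 6C1Rh | 58 | 10 | 6 | 22 | 16 | |
| 6C2R | 618 | 12 | 6 | 44 | 16 | |
| 7C | 262 | 13 | 6 | 41 | 16 | |
| 6C2Rh | 47 | 12 | 6 | 29 | 17 | |
| 6C3R | 12 | 15 | 6 | 30 | 17 | 13 |
| 7Ch | 10 | 15 | 6 | 39 | 17 | |
| 7C1R | 322 | 12 | 6 | 43 | 17 | |
| 7C1Rh | 7 | 17 | 6 | 43 | 18 | |
| 7C2R | 17 | 16 | 6 | 25 | 18 | 15 |
| 8C | 310 | 12 | 6 | 35 | 18 | |
| 7C3R | 5 | 15 | 6 | 18 | 19 | |
| 8Ch | 6 | 10 | 7 | 17 | 19 | 13 |
| 8C1R | 53 | 16 | 6 | 37 | 19 | |
| 8C2R | 12 | 17 | 8 | 19 | 20 | 15 |
| 9C | 63 | 14 | 6 | 24 | 20 | |
| 8C3R | 7 | 10 | 6 | 13 | 21 | |
| Total | 8799 | | | | | |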
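The last column of Table 2 can be approximated with a few lines of Python. This sketch assumes the per-level figure is a plain (unweighted) mean of the row averages at each meiosis level. That is an inference from the numbers, and because the row averages shown are already rounded, a result can differ from the table by 1cM. Only the rows for meiosis levels 9 and 10 are included here.

```python
from collections import defaultdict

# (relationship, row average cM, meiosis events) copied from Table 2
rows = [
    ("2C3R", 34, 9),
    ("3Ch", 20, 9),
    ("3C1R", 28, 9),
    ("3C1Rh", 28, 10),
    ("3C2R", 22, 10),
    ("4C", 24, 10),
]


def average_by_meiosis(table_rows):
    """Group row averages by meiosis level and take the mean of each group."""
    groups = defaultdict(list)
    for _relationship, avg_cm, events in table_rows:
        groups[events].append(avg_cm)
    return {events: sum(values) / len(values) for events, values in sorted(groups.items())}


level_averages = average_by_meiosis(rows)
rounded = {events: round(avg) for events, avg in level_averages.items()}
```

Running the same grouping on a full spreadsheet export is also a quick way to spot the outliers mentioned in the Sidebar.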

Sidebar – this evaluation also acts as a Quality Control indicator. Watch for data points way outside the norms. I had three Matches who skewed one of the numbers. I went back to them – they were close to each other and I was sure they were from an NPE. Upon reevaluation, they needed to be a generation closer to our CA. I made the shift, and all the numbers fell back into the norm.

These insights are helping me with a new review of Walking The Clusters Back, wherein I need to use judgment when imputing relationships and CAs.

[06G] Segment-ology: Insights into cM Patterns; by Jim Bartlett 20260122

How Old Are Your Segments?

Well, it depends… Your chromosomes are very large segments, which are not very old at all.  On the other hand, I have some small DNA segments from Neanderthal Ancestors – pretty old. In general, the smaller the segment, the older it is. But let’s think about this for a moment.

 This discussion will be about your DNA segments – large segments from close relatives to ever smaller segments from more and more distant relatives. They are all part of the DNA you inherited from your Ancestors. Segments are formed at the moment of conception – when sperm meets egg – about nine months before you were born. They don’t change until you pass them on – after recombination and new crossovers – to the next generation. So our unit of “age” measurement is a generation.

So, let’s start with the largest “segments” – your 44 autosomes passed to you by your parents. How old are these 44 chromosomes? Well, they are 0 generations old. You are the first person to ever have each of these specific – full chromosome – segments.*

Then let’s look at your grandparent segments that make up your chromosomes. On average you have 22 chromosomes, subdivided by 34 crossovers, for 56 grandparent segments per Side. These were each part of full, new, chromosomes passed from your grandparents to your parents; and then, one generation later, passed to you by a parent – they are 1 generation old. Again, due to random recombination for every child, you are the first person to ever have these specific segments.*

Similarly, your great grandparents, passed new chromosomes to your grandparents, who passed segments to your parents who passed segments to you, which would be unique and 2 generations old.*

You get the picture. The unique segments in each of your Ancestors are recombined into new segments and passed down – generation after generation. Your segments are “imbedded” in the chromosomes and large segments they passed down. And knowing the genealogy of each segment, we can count the generations to find their age – always one less than the number of Ancestor generations back.*

* So what’s with that pesky asterisk? In short, “sticky” segments. Some segments are passed down intact – they are exactly the same segment in an Ancestor and their child (who is also your Ancestor) – they were not subjected to a recombination crossover. More likely than not, one of your smaller chromosomes (Chr 18 to 22) was passed from a parent to you intact. So, in that particular case, its age is 1 generation (not 0 generations like all the other chromosomes). And this happens to some of the other segments passed down at each generation. Above we noted that you got about 56 grandparent segments from one parent. When you pass these to your children, recombination will create about 34 new crossovers. In general, they will be subdivisions of 34 of the 56 grandparent segments passed down to you – leaving 22 grandparent segments intact. You only pass half of your DNA to each child, but that still includes about 11 grandparent segments which are now 2 generations old!
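The arithmetic behind those numbers can be laid out in a short Python sketch. It uses the averages quoted above (22 chromosomes and about 34 crossovers per Side) and the simplifying assumption that each new crossover lands in a different grandparent segment. The variable and function names are mine, for illustration only.

```python
CHROMOSOMES_PER_SIDE = 22
AVG_CROSSOVERS = 34  # average crossovers per Side, as quoted in the post


def segments_after_crossovers(starting_segments, crossovers):
    """Each crossover splits one segment in two, so it adds one segment."""
    return starting_segments + crossovers


# 22 chromosomes cut by 34 crossovers gives the grandparent segments per Side.
grandparent_segments = segments_after_crossovers(CHROMOSOMES_PER_SIDE, AVG_CROSSOVERS)

# Next generation: assume each of the 34 new crossovers hits a different segment.
intact_segments = grandparent_segments - AVG_CROSSOVERS

# Only half of the DNA is passed to each child.
intact_passed_to_child = intact_segments // 2
```

These are averages. Any one person's actual counts will wander around them, which is part of why it gets complicated so quickly.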

It gets complicated real quick!

This is one of the reasons that as segments get smaller, the range of possible relationships increases. A given segment may have persisted for several generations, or not.

Chromosome Mapping of segments with MRCAs lets us figure this out. Even if our Map is not complete, at least some areas of our chromosomes can be figured out. Someday… it will be interesting to try to determine a Shared cM Chart which figures in the age of the segment. I’ll bet the ranges would be somewhat smaller…

[O5H] Segment-ology: How Old Are Your Segments? by Jim Bartlett 20251218

Shared Segment Spreadsheet Incarnations

For me, the Shared Segment Spreadsheet is a critical tool, which evolves through four incarnations.

1. It starts as a collection of all your shared DNA segments – from each company. This also means a collection of all your Matches (except AncestryDNA), some with multiple shared segments. It can be searched and sorted.

2. Use as a segment Triangulation tool. Sort on: Company + Chromosome + Segment Start to arrange all the shared segments (within a company) into Chromosomes. And within a Chromosome they are arranged so that overlapping segments are close to each other.  With this “view” each segment is Triangulated with other overlapping segments, or not. Maternal and Paternal Triangulated groups are formed*. Some of the under-15cM segments will not Triangulate and are labeled “false” and deleted or moved out of this spreadsheet –  “everybody’s got to be somewhere.”  This process is repeated for each company.

3. Form/identify Triangulated Group (TG) segments. Sort on: Side + Chromosome + Segment Start to separate the maternal and paternal segments and sort them in order within each chromosome. Since this spreadsheet is comparing all these shared segments with your own DNA segments, the shared segments from different companies will “break” into TG segments that align with your own segments. However, this phase of the process requires some judgment – the data is a little fuzzy and the ends of TGs will not be precise. You have to make a call. In general, to align with your DNA segments, each TG will end at the same Mbp as the next one starts. Make those calls and assign a TG Identification (TG ID)** for each segment. Make a TG segment header row for each one (I have 372 TG segments) that locks in the overall TG start and end positions and TG ID. TIP: make the TG header start location 0.01Mbp less than the first shared segment in the TG – so it sorts on top of the individual segments. Remember that every Match in a TG is related to you on your line back to a specific Common Ancestor (CA). Note: some small segments in a TG may go back further.

4. Use these TG groups to do the genealogy! Among the Matches, find the consensus path to the CA.
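For anyone who keeps the spreadsheet as a CSV or in code, the sort and the header-row TIP from incarnation 3 can be sketched in Python as follows. The column names, the sample rows and the "TG-1" label are placeholders invented for this illustration. They are not my real data or my TG ID scheme. Sorting only lines segments up next to each other; deciding what actually Triangulates is still the judgment step described above.

```python
from operator import itemgetter

# Hypothetical shared-segment rows; start and end positions are in Mbp.
segments = [
    {"side": "P", "chr": 1, "start": 14.20, "end": 28.00, "label": "Match B"},
    {"side": "M", "chr": 1, "start": 5.00, "end": 20.00, "label": "Match C"},
    {"side": "P", "chr": 1, "start": 12.50, "end": 30.10, "label": "Match A"},
]


def sort_for_tg(rows):
    """Sort on Side + Chromosome + Segment Start."""
    return sorted(rows, key=itemgetter("side", "chr", "start"))


def make_tg_header(tg_id, members):
    """Build a header row that spans its members and sorts just above them."""
    first_start = min(row["start"] for row in members)
    return {
        "side": members[0]["side"],
        "chr": members[0]["chr"],
        "start": round(first_start - 0.01, 2),  # the 0.01 Mbp TIP
        "end": max(row["end"] for row in members),
        "label": f"{tg_id} header",
    }


paternal_chr1 = [r for r in segments if r["side"] == "P" and r["chr"] == 1]
header = make_tg_header("TG-1", paternal_chr1)  # "TG-1" is a placeholder ID
ordered = sort_for_tg(segments + [header])
```

After the sort, the header row sits directly on top of the Matches it covers, which is the whole point of subtracting 0.01 Mbp.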

Summary: A shared segment spreadsheet has several uses – collection > Triangulation > TG ID > genealogy. The TG segment is your DNA segment. This covers all of your genetic genealogy, but you can always focus on one or more individual TGs, if you don’t want to eat the whole elephant at one time.

*I’ve covered the Triangulation process in other blogposts, and won’t repeat that here – this blogpost is about the four incarnations of the spreadsheet.

**I’ve covered TG IDs in other blogposts.

[35BBa] Segment-ology: Shared Segment Spreadsheet Incarnations by Jim Bartlett 20251102