I now have over 8,700 Matches at AncestryDNA with a confirmed Common Ancestor (CA) with me between 2C and 8C. See my Common Ancestor Spreadsheet post here. That’s a lot of data, so I thought I’d do some analysis. In 2024 I posted (here) my averages for 3C to 8C which roughly agreed with the Shared cM project.
Below is a table summarizing all of my data (including full cousins, half cousins and removed cousins). For each relationship there are columns for the number of Matches, the average cMs, the lowest cM, the highest cM; plus the number of generations (meiosis events), and average cMs for each. The table is then repeated with a sort based on meiosis events.
A word about meiosis events. They are the count from me up to the CA and then back down to the Match, like generations. A 1C is 4 events: two up to the grandparent (the CA) plus two back down to the Match. A 1C2R is 6 events (two up and four down). A half relationship adds one to the meiosis events, e.g. a 4C1R is 11 events and a 4C1Rh is 12 events. These counts are important because, in a mathematical simulation, each event reduces the cM by half. From the Shared cM Project, the 1C (4 events) average is 866cM, compared to 122cM for a 2C1R (7 events), which is roughly 866 halved three times (about 108cM). Remember, it’s an order of magnitude thing. And, as we shall see, it generally works for close relationships (like 1C and 2C), but drifts away for more distant relationships (like 4C and beyond).
Important: this is not biology’s fault, it’s the math’s fault. It’s because we have a LOT of true distant cousins that do NOT share matching DNA with us, and they are not reflected in the averages. This (lack of a normal curve) is highlighted in the second sort (by meiosis numbers) below.
This is also reflected in the DNA Painter Shared cM Project tool, which shows different groups of Matches for a given input cM value. For example, at DNA Painter, plug in 55cM… the 29% group of 3Ch, 3C1R, 2C3R and 2C2Rh are all 9 meiosis events; and the second group of 4C, 3C1Rh and 3C2R are all 10 meiosis events. This also demonstrates that by the time we get down to the 3C and 4C levels there is a lot of overlap.
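The counting rule and the "halve it at each event" rule of thumb can be sketched in a few lines of Python. This is only an illustration: the function names and the label parser are my own assumptions, and the 866cM baseline is the Shared cM Project 1C average quoted above.

```python
import re

# Baseline quoted in the post: Shared cM Project average for a 1C (4 meiosis events).
BASELINE_CM = 866.0
BASELINE_EVENTS = 4

# Labels as used in this post: "1C", "2C1R", "3Ch", "4C1Rh", ...
_LABEL = re.compile(r"^(\d+)C(?:(\d+)R)?(h)?$")


def meiosis_events(label: str) -> int:
    """Count meiosis events for a relationship label such as '2C1R' or '4C1Rh'."""
    m = _LABEL.match(label)
    if m is None:
        raise ValueError(f"unrecognized relationship label: {label!r}")
    degree = int(m.group(1))        # the n in nC
    removed = int(m.group(2) or 0)  # the k in kR (0 if not removed)
    half = 1 if m.group(3) else 0   # a half relationship adds one event
    # An nC is n+1 generations up to the CA and n+1 back down;
    # each "removed" adds one more generation on the way down.
    return 2 * (degree + 1) + removed + half


def halving_estimate(label: str) -> float:
    """Naive expected cM if each event beyond a 1C halves the shared DNA."""
    return BASELINE_CM / 2 ** (meiosis_events(label) - BASELINE_EVENTS)


for label in ("1C", "1C2R", "2C1R", "4C1R", "4C1Rh"):
    print(label, meiosis_events(label), round(halving_estimate(label), 1))
```

The halving estimate is a simulation-style figure only. For distant relationships it falls well below the averages observed among actual Matches, because cousins who share no DNA never show up in the Match list.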
For this first table, the takeaway is that the number of Matches with CAs increases dramatically with each generation. [Note: I combine full cousins with cousins 1R because, at my age, most Matches will be a generation younger than me.] 3C & 3C1R: 196 Matches; 4C & 4C1R: 662 Matches; 5C & 5C1R: 1,606 Matches; 6C & 6C1R: 3,425 Matches. WOW, what an increase in the number of Match cousins. And then we have 7C & 7C1R: 584 Matches; 8C & 8C1R: 363 Matches. What happened? Why the steep decrease in numbers? Well, IMO, the major factor is that AncestryDNA’s ThruLines quits at 6C. ThruLines can “see” into private Trees (I cannot), and it roots out MRCAs with the smallest of Trees (I don’t have that time). I can only dream of how many ThruLines I’d get at the 7C and 8C levels. Some of the ones I have now were found/recorded when we had Circles at Ancestry.
The point is: there are LOTS of cousins still waiting to be determined. ProTools is helping.
Table 1: 8,799 AncestryDNA Matches Summarized by Relationship
| MRCA | # Matches | avg cM | low cM | high cM | meiosis |
| --- | --- | --- | --- | --- | --- |
| 1C2Rh | 3 | 138 | 78 | 200 | 7 |
| 2C | 1 | 269 | | | 6 |
| 2C1R | 14 | 127 | 34 | 220 | 7 |
| 2C2R | 8 | 47 | 39 | 162 | 8 |
| 2C3R | 2 | 34 | 22 | 140 | 9 |
| 3C | 57 | 63 | 13 | 208 | 8 |
| 3Ch | 5 | 20 | 16 | 95 | 9 |
| 3C1R | 139 | 28 | 6 | 148 | 9 |
| 3C1Rh | 26 | 28 | 7 | 111 | 10 |
| 3C2R | 106 | 22 | 6 | 68 | 10 |
| 3C2Rh | 20 | 20 | 6 | 92 | 11 |
| 3C3R | 34 | 22 | 6 | 58 | 11 |
| 3C3Rh | 12 | 23 | 8 | 40 | 12 |
| 3C4R | 1 | 20 | | | 12 |
| 3C4Rh | 2 | 10 | 8 | 12 | 13 |
| 4C | 128 | 24 | 6 | 220 | 10 |
| 4Ch | 7 | 12 | 6 | 19 | 11 |
| 4C1R | 534 | 20 | 6 | 114 | 11 |
| 4C1Rh | 33 | 16 | 6 | 30 | 12 |
| 4C2R | 267 | 16 | 7 | 92 | 12 |
| 4C2Rh | 12 | 12 | 6 | 39 | 13 |
| 4C3R | 27 | 16 | 6 | 44 | 13 |
| 4C4R | 1 | 17 | 17 | 17 | 14 |
| 5C | 469 | 16 | 6 | 62 | 12 |
| 5Ch | 29 | 17 | 6 | 27 | 14 |
| 5C1R | 1137 | 14 | 6 | 60 | 13 |
| 5C1Rh | 7 | 14 | 6 | 60 | 14 |
| 5C2R | 300 | 14 | 6 | 41 | 14 |
| 5C2Rh | 9 | 14 | 7 | 27 | 15 |
| 5C3R | 75 | 14 | 6 | 40 | 15 |
| 5C3Rh | 1 | 18 | 18 | 18 | 16 |
| 5C4R | 2 | 10 | 10 | 10 | 16 |
| 6C | 1922 | 12 | 6 | 56 | 14 |
| 6Ch | 97 | 11 | 6 | 25 | 15 |
| 6C1R | 1503 | 12 | 6 | 52 | 15 |
| 6C1Rh | 58 | 10 | 6 | 22 | 16 |
| 6C2R | 618 | 12 | 6 | 44 | 16 |
| 6C2Rh | 47 | 12 | 6 | 29 | 17 |
| 6C3R | 12 | 15 | 6 | 30 | 17 |
| 7C | 262 | 13 | 6 | 41 | 16 |
| 7Ch | 10 | 15 | 6 | 39 | 17 |
| 7C1R | 322 | 12 | 6 | 43 | 17 |
| 7C1Rh | 7 | 17 | 6 | 43 | 18 |
| 7C2R | 17 | 16 | 6 | 25 | 18 |
| 7C3R | 5 | 15 | 6 | 18 | 19 |
| 8C | 310 | 12 | 6 | 35 | 18 |
| 8Ch | 6 | 10 | 7 | 17 | 19 |
| 8C1R | 53 | 16 | 6 | 37 | 19 |
| 8C2R | 12 | 17 | 8 | 19 | 20 |
| 8C3R | 7 | 10 | 6 | 13 | 21 |
| 9C | 63 | 14 | 6 | 24 | 20 |
| Total | 8799 | | | | |
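The combined "nC & nC1R" counts discussed before Table 1 can be re-derived from its rows. A small sketch, with the counts copied from the table; the dictionary and function names are just illustrative assumptions:

```python
# Match counts copied from Table 1 for full cousins and cousins once removed.
COUNTS = {
    "3C": 57, "3C1R": 139,
    "4C": 128, "4C1R": 534,
    "5C": 469, "5C1R": 1137,
    "6C": 1922, "6C1R": 1503,
    "7C": 262, "7C1R": 322,
    "8C": 310, "8C1R": 53,
}


def combined(level: int) -> int:
    """Matches at the nC level plus those at nC1R (roughly one generation younger)."""
    return COUNTS[f"{level}C"] + COUNTS[f"{level}C1R"]


for level in range(3, 9):
    print(f"{level}C & {level}C1R: {combined(level):,} Matches")
```

Run over levels 3 through 8, this shows the climb through 6C and the drop after it, where ThruLines stops helping.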
For the second table, the takeaway is that the average cMs track pretty closely with each other at the same meiosis number. And after meiosis level 9, which averages 27cM, the “curve” quickly “flatlines” in the mid teens. This is reflected at DNA Painter, with many relationships all in play under 20cM.
Table 2: 8,799 AncestryDNA Matches Summarized by Meiosis Events
| MRCA | # Matches | avg cM | low cM | high cM | meiosis | avg cM per meiosis level |
| --- | --- | --- | --- | --- | --- | --- |
| 2C | 1 | 269 | | | 6 | 269 |
| 1C2Rh | 3 | 138 | 78 | 200 | 7 | |
| 2C1R | 14 | 127 | 34 | 220 | 7 | 132 |
| 2C2R | 8 | 47 | 39 | 162 | 8 | |
| 3C | 57 | 63 | 13 | 208 | 8 | 55 |
| 2C3R | 2 | 34 | 22 | 140 | 9 | |
| 3Ch | 5 | 20 | 16 | 95 | 9 | 27 |
| 3C1R | 139 | 28 | 6 | 148 | 9 | |
| 3C1Rh | 26 | 28 | 7 | 111 | 10 | |
| 3C2R | 106 | 22 | 6 | 68 | 10 | 25 |
| 4C | 128 | 24 | 6 | 220 | 10 | |
| 3C2Rh | 20 | 20 | 6 | 92 | 11 | |
| 3C3R | 34 | 22 | 6 | 58 | 11 | 18 |
| 4Ch | 7 | 12 | 6 | 19 | 11 | |
| 4C1R | 534 | 20 | 6 | 114 | 11 | |
| 3C3Rh | 12 | 23 | 8 | 40 | 12 | |
| 3C4R | 1 | 20 | | | 12 | |
| 4C1Rh | 33 | 16 | 6 | 30 | 12 | 18 |
| 4C2R | 267 | 16 | 7 | 92 | 12 | |
| 5C | 469 | 16 | 6 | 62 | 12 | |
| 3C4Rh | 2 | 10 | 8 | 12 | 13 | |
| 4C2Rh | 12 | 12 | 6 | 39 | 13 | 13 |
| 5C1R | 1137 | 14 | 6 | 60 | 13 | |
| 4C3R | 27 | 16 | 6 | 44 | 13 | |
| 4C4R | 1 | 17 | 17 | 17 | 14 | |
| 5Ch | 29 | 17 | 6 | 27 | 14 | |
| 5C1Rh | 7 | 14 | 6 | 60 | 14 | 15 |
| 5C2R | 300 | 14 | 6 | 41 | 14 | |
| 6C | 1922 | 12 | 6 | 56 | 14 | |
| 5C2Rh | 9 | 14 | 7 | 27 | 15 | |
| 5C3R | 75 | 14 | 6 | 40 | 15 | 13 |
| 6Ch | 97 | 11 | 6 | 25 | 15 | |
| 6C1R | 1503 | 12 | 6 | 52 | 15 | |
| 5C3Rh | 1 | 18 | 18 | 18 | 16 | |
| 5C4R | 2 | 10 | 10 | 10 | 16 | 13 |
| 6C1Rh | 58 | 10 | 6 | 22 | 16 | |
| 6C2R | 618 | 12 | 6 | 44 | 16 | |
| 7C | 262 | 13 | 6 | 41 | 16 | |
| 6C2Rh | 47 | 12 | 6 | 29 | 17 | |
| 6C3R | 12 | 15 | 6 | 30 | 17 | 13 |
| 7Ch | 10 | 15 | 6 | 39 | 17 | |
| 7C1R | 322 | 12 | 6 | 43 | 17 | |
| 7C1Rh | 7 | 17 | 6 | 43 | 18 | |
| 7C2R | 17 | 16 | 6 | 25 | 18 | 15 |
| 8C | 310 | 12 | 6 | 35 | 18 | |
| 7C3R | 5 | 15 | 6 | 18 | 19 | |
| 8Ch | 6 | 10 | 7 | 17 | 19 | 13 |
| 8C1R | 53 | 16 | 6 | 37 | 19 | |
| 8C2R | 12 | 17 | 8 | 19 | 20 | 15 |
| 9C | 63 | 14 | 6 | 24 | 20 | |
| 8C3R | 7 | 10 | 6 | 13 | 21 | |
| Total | 8799 | | | | | |
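For anyone who wants to repeat the grouping on their own spreadsheet, here is a minimal Python sketch that averages by meiosis level, using only the Table 2 rows for levels 7 through 10. The per-level figures in Table 2 appear consistent with a simple mean of the row averages (for example, (138 + 127) / 2 ≈ 132 at 7 events), but that is my reading, not a stated method. The sketch also shows a Match-weighted mean for comparison; the weighted figure is approximate because it is built from already-rounded row averages. All names in the code are illustrative assumptions.

```python
from collections import defaultdict

# (relationship, matches, avg_cM, meiosis_events): rows copied from Table 2.
ROWS = [
    ("1C2Rh", 3, 138, 7),
    ("2C1R", 14, 127, 7),
    ("2C2R", 8, 47, 8),
    ("3C", 57, 63, 8),
    ("2C3R", 2, 34, 9),
    ("3Ch", 5, 20, 9),
    ("3C1R", 139, 28, 9),
    ("3C1Rh", 26, 28, 10),
    ("3C2R", 106, 22, 10),
    ("4C", 128, 24, 10),
]


def averages_by_meiosis(rows):
    """Return {events: (simple_mean, match_weighted_mean)} of the row averages."""
    groups = defaultdict(list)
    for _label, matches, avg_cm, events in rows:
        groups[events].append((matches, avg_cm))
    result = {}
    for events, items in sorted(groups.items()):
        simple = sum(avg for _n, avg in items) / len(items)
        total_matches = sum(n for n, _avg in items)
        weighted = sum(n * avg for n, avg in items) / total_matches
        result[events] = (simple, weighted)
    return result


for events, (simple, weighted) in averages_by_meiosis(ROWS).items():
    print(f"{events} events: simple {simple:.1f} cM, weighted {weighted:.1f} cM")
```

Where one relationship dominates a level (for example 3C1R at 9 events), the weighted mean leans toward that row, so the two figures can differ by a few cM.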
Sidebar – this evaluation also acts as a Quality Control indicator. Watch for data points way outside the norms. I had three Matches who skewed one of the numbers. I went back to them – they were close to each other and I was sure they were from an NPE. Upon reevaluation, they needed to be a generation closer to our CA. I made the shift, and all the numbers fell back into the norm.
These insights are helping me with a new review of Walking The Clusters Back, wherein I need to use judgment when imputing relationships and CAs.
[06G] Segment-ology: Insights into cM Patterns; by Jim Bartlett 20260122
Jim – You say “I can only dream of how many ThruLines I’d get at the 7C and 8C levels.” As you likely know, you can fool AncestryDNA into providing some of these by making yourself a sibling to your parent (or grandparent) in the tree you are linked on. What had been your 6th GGPs are now your 5th GGPs and visible. I’m not sure how many would show up if you are now two or three generations older than your actual 7Cs – but I do it every few years and I find some interesting and real 7Cs. I posted this on Facebook years ago and three or four people thought it was the end of the world; that at that moment someone might be looking at my tree and get the wrong data. Meh. It takes maybe 36 hours to update, and I change it back a few days later.
Rich, I do use that tip, several times a year. And like you, I try to whiz through the new ThruLines (and put them in my spreadsheet), so that I can revert to the standard way quickly. Note that the ThruLines limit is the 6C level on both parties. So what you can get at the parent level is only 6C1R. You look like 7 generations back (but the CA is actually 8 generations back); and the Match has to be within 7 generations of that same, fairly distant, CA. It does add some true Matches to our list, but if Ancestry upped the level to 7C for ThruLines, we’d get a LOT more. In the meantime, we do what we can. Jim
Jim, WOW is right. Very impressive. I imagine you are shooting for 10,000 matches now to make statistics easy. I was wondering if you have correlated all these matches somehow with your TGs and if any insights resulted? I know much of this match data is from Ancestry, but with all the overlapping matches in multiple databases the segments often seem identifiable.
pdtbill, When I add in my data for other companies (not really apples to apples because of Timber), I have over 10,000. My goal was to confirm relationships with 10% of my Matches at AncestryDNA, which is hard (but ProTools are helping – not enough time). And yes, I’ve determined the TG with 1,065 AncestryDNA Matches, but not all of those Matches include MRCAs.
I am working on an improved/streamlined Walk The Clusters Back (WTCB) version, which lets me impute known TGs in a Cluster to other Matches (working down to 20cM Matches). This TGID then carries over to the data set in this post. My “expectation” there is that the TGs will be distributed, roughly, equally by generation. In other words, if I round my 372 TGs up to 400 TGs, I’d have 100 TGs “flowing” through each of my grandparents (with Matches at the nominal 1C level). Note: “flowing through” is not the same as “originating at”. Then 50 at each G grandparent; 25 per 2xG grandparent, etc., down to about 1 TG for each 7xG grandparent at the 8C level. In other words, at the 6C level (where I have the most Matches), I would expect to see about 3 TGs per 5xG grandparent (or actually 6 TGs for each couple). So if you track the TGs in the Common Ancestor spreadsheet (as I do), don’t expect them all to be the same in each family, but do expect a small number of them, probably confined to different branches in each 5xG grandparent/couple family.
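That expectation is just repeated halving, so it is easy to tabulate. A rough sketch, assuming the TGs split evenly across all ancestors in a generation (an idealization) and using the rounded figure of 400 TGs; the function name is only illustrative:

```python
def tgs_per_ancestor(total_tgs: float, cousin_level: int) -> float:
    """Expected TGs flowing through one ancestor at the nC level.

    The CA for an nC is n+1 generations back, and there are
    2 ** (n + 1) ancestors in that generation.
    """
    return total_tgs / 2 ** (cousin_level + 1)


TOTAL_TGS = 400  # 372 TGs rounded up, as in the comment above

for level in range(1, 9):
    per_ancestor = tgs_per_ancestor(TOTAL_TGS, level)
    print(f"{level}C: {per_ancestor:.1f} per ancestor, {2 * per_ancestor:.1f} per couple")
```

At the 6C level this gives about 3 TGs per 5xG grandparent and about 6 per couple, in line with the figures above.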
I check GEDmatch every week for new Matches from AncestryDNA kits… AND I put the TGID in the new Notes column at GEDmatch to track the ones I’ve done. Jim
Great, thank you. I’m always trying to learn & follow your lead here.
Jim, I received another Match, sharing 13.9cM in two segments. The largest segment, 7.1cM, triangulates with others and adds to the existing group of 18 Matches plus me on chromosome 16. From the parents’ surnames, it looks to me like this Match is somewhere between Roma and mixed Roma.
Kevin, It looks like you are making good progress with this TG on Chr 16. It’s now time to transition to genealogy research to try to determine how the TG Matches all interrelate with each other, to find your Common Ancestor. Jim
Jim, can you explain what this message from the MyHeritage team means, about the fact that I have these matches I am always telling you about?
Kevin, It’s all about the power of autosomal DNA to find distant cousins to you. Generally, the more distant they are, the smaller the shared cMs. You should use DNA Painter, and plug in the small cM amount and see the wide range of relationship possibilities. These small segment Matches may well be from your Ancestors well beyond a genealogy timeframe. The DNA gives you Matches and their cMs, which provide a rough range of possible relationships. The next step is to use genealogy to figure out how you are related to the Matches. Jim
Jim, how can I tell whether a triangulated TG comes from one child or from several children of the ancestors or ancestor?
Kevin, only with genealogy: finding a Common Ancestor with each Match. The TG segment of DNA you share with each Match was present in one of your parents, one of your grandparents, one of your Great grandparents, etc. back to the Common Ancestor you have with that Match. You may have different Common Ancestors with different Matches, but they must all be along one path to the most distant CA. In other words, with the Matches in a TG, you may be a 4C with one, a 5C with another and a 6C with a different Match, but the CAs would be along one path. As I explained in a recent post, when you get down to 15cM or below, that amount of DNA can be shared with a wide array of different relationships from 4C on back. For each of your Matches in the TG, you should look up the shared cM using DNA Painter to see the wide array of possibilities. That’s all the DNA can do! The rest has to be genealogy work: looking at Trees and records. Jim
Thank you for this explanation … I could feel “bits and pieces” of that story lurking in all we speak to in genetic genealogy but you have explained it in a way that helps make sense of it all to me !!! Thank you.
John, Thanks for your feedback. Jim
Jim, some amazing analysis here. Well done.
Jean, Thank you! Jim
If I had to describe what one of my worst nightmare situations would be, it would be having to live with this man.
Linda, Well… my wife is a PhD scientist and taught me a lot about DNA/biology. Her work with cancer researchers had a parallel with this post… be sure you understand when some data is missing (for a variety of reasons), because it skews the averages. I see many genetic genealogists trying to “sharpen a marshmallow”, trying to pinpoint a relationship using shared cMs… we can get a range, but not “the” relationship with cMs. At that point we have to include genealogy in the analysis. Jim