Insights into cM Patterns

I now have over 8,700 Matches at AncestryDNA with a confirmed Common Ancestor (CA) with me between 2C and 8C. See my Common Ancestor Spreadsheet post here. That’s a lot of data, so I thought I’d do some analysis. In 2024 I posted (here) my averages for 3C to 8C which roughly agreed with the Shared cM project.

Below is a table summarizing all of my data (including full cousins, half cousins and removed cousins). For each relationship there are columns for the number of Matches, the average cMs, the lowest cM, the highest cM; plus the number of generations (meiosis events), and average cMs for each. The table is then repeated with a sort based on meiosis events.

A word about meiosis events. They are the count from me up to the CA and then back down to the Match, like generations. A 1C is 4 events: two up to the grandparent (the CA) plus two back down to the Match. A 1C2R is 6 events (two up and four down). A half relationship adds one to the meiosis events, e.g. a 4C1R is 11 events and a 4C1Rh is 12 events. These counts are important because, in a mathematical simulation, each event reduces the cM by half. From the Shared cM Project, the 1C (4 events) average is 866cM, compared to 122cM for a 2C1R (7 events); 866 halved three times is about 108cM, which is in the same ballpark. Remember, it’s an order of magnitude thing.

And, as we shall see, the halving generally works for close relationships (like 1C and 2C), but drifts away for more distant relationships (like 4C and beyond). Important: this is not biology’s fault, it’s the math’s fault. We have a LOT of true distant cousins who do NOT share matching DNA with us, and they are not reflected in the averages. This (lack of a normal curve) is highlighted in the second sort (by meiosis numbers) below.

It is also reflected in the DNA Painter Shared cM Project tool, which shows different groups of relationships for a given input cM value. For example, at DNA Painter, plug in 55cM… the 29% group of 3Ch, 3C1R, 2C3R and 2C2Rh are all 9 meiosis events; and the second group of 4C, 3C1Rh, and 3C2R are all 10 meiosis events. This also demonstrates that by the time we get down to the 3C and 4C levels there is a lot of overlap.
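The counting rule described above can be sketched in a few lines of Python. This is only an illustration of the convention used in this post (a full nC is n+1 events up plus n+1 down, each "removed" adds one, and a half relationship adds one); the function names `meiosis_events` and `halved_cm` are made up for the example.

```python
import re


def meiosis_events(label: str) -> int:
    """Count meiosis events for labels such as '1C', '2C1R', '4C1Rh' or '3Ch'."""
    m = re.fullmatch(r"(\d+)C(?:(\d+)R)?(h?)", label)
    if m is None:
        raise ValueError(f"unrecognized relationship label: {label!r}")
    degree = int(m.group(1))        # the n in nC
    removed = int(m.group(2) or 0)  # the k in kR, 0 if absent
    half = 1 if m.group(3) else 0   # trailing 'h' marks a half relationship
    # Full nC: (n + 1) events up to the CA and (n + 1) back down to the Match.
    return 2 * (degree + 1) + removed + half


def halved_cm(base_cm: float, base_events: int, events: int) -> float:
    """Simulated cM if every extra meiosis event halves the shared amount."""
    return base_cm / 2 ** (events - base_events)


# 1C average of 866cM (4 events) carried down to a 2C1R (7 events):
simulated_2c1r = halved_cm(866, meiosis_events("1C"), meiosis_events("2C1R"))
# 866 / 2**3 = 108.25, versus the Shared cM Project average of 122cM
```

The simulated figure is only an order-of-magnitude guide; it says nothing about the many distant cousins who share no DNA at all.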

For this first table, the takeaway is that the number of Matches with CAs increased dramatically with each generation. [Note: I combine full cousin with cousin 1R because at my age, most Matches will be a generation younger than me.] 3C & 3C1R: 196 Matches; 4C & 4C1R: 662 Matches; 5C & 5C1R: 1,606 Matches; 6C & 6C1R: 3,425 Matches. WOW, what an increase in the number of Match cousins. And then we have 7C & 7C1R: 584 Matches; 8C & 8C1R: 363 Matches. What happened? Why the steep decrease in numbers? Well, IMO, the major factor is that AncestryDNA’s ThruLines quits at 6C – ThruLines can “see” into private Trees (I cannot); and it roots out MRCAs with the smallest of Trees (I don’t have that time). I can only dream of how many ThruLines I’d get at the 7C and 8C levels. Some of the ones I have now were found/recorded when we had Circles at Ancestry.

The point is: there are LOTS of cousins still waiting to be determined. ProTools is helping.

Table 1: 8,799 AncestryDNA Matches Summarized by Relationship

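| AncestryDNA MRCA | # Matches | Avg cM | Low cM | High cM | Meiosis events |
|---|---|---|---|---|---|
| 1C2Rh | 3 | 138 | 78 | 200 | 7 |
| 2C | 1 | 269 | | | 6 |
| 2C1R | 14 | 127 | 34 | 220 | 7 |
| 2C2R | 8 | 47 | 39 | 162 | 8 |
| 2C3R | 2 | 34 | 22 | 140 | 9 |
| 3C | 57 | 63 | 13 | 208 | 8 |
| 3Ch | 5 | 20 | 16 | 95 | 9 |
| 3C1R | 139 | 28 | 6 | 148 | 9 |
| 3C1Rh | 26 | 28 | 7 | 111 | 10 |
| 3C2R | 106 | 22 | 6 | 68 | 10 |
| 3C2Rh | 20 | 20 | 6 | 92 | 11 |
| 3C3R | 34 | 22 | 6 | 58 | 11 |
| 3C3Rh | 12 | 23 | 8 | 40 | 12 |
| 3C4R | 1 | 20 | | | 12 |
| 3C4Rh | 2 | 10 | 8 | 12 | 13 |
| 4C | 128 | 24 | 6 | 220 | 10 |
| 4Ch | 7 | 12 | 6 | 19 | 11 |
| 4C1R | 534 | 20 | 6 | 114 | 11 |
| 4C1Rh | 33 | 16 | 6 | 30 | 12 |
| 4C2R | 267 | 16 | 7 | 92 | 12 |
| 4C2Rh | 12 | 12 | 6 | 39 | 13 |
| 4C3R | 27 | 16 | 6 | 44 | 13 |
| 4C4R | 1 | 17 | 17 | 17 | 14 |
| 5C | 469 | 16 | 6 | 62 | 12 |
| 5Ch | 29 | 17 | 6 | 27 | 14 |
| 5C1R | 1137 | 14 | 6 | 60 | 13 |
| 5C1Rh | 7 | 14 | 6 | 60 | 14 |
| 5C2R | 300 | 14 | 6 | 41 | 14 |
| 5C2Rh | 9 | 14 | 7 | 27 | 15 |
| 5C3R | 75 | 14 | 6 | 40 | 15 |
| 5C3Rh | 1 | 18 | 18 | 18 | 16 |
| 5C4R | 2 | 10 | 10 | 10 | 16 |
| 6C | 1922 | 12 | 6 | 56 | 14 |
| 6Ch | 97 | 11 | 6 | 25 | 15 |
| 6C1R | 1503 | 12 | 6 | 52 | 15 |
| 6C1Rh | 58 | 10 | 6 | 22 | 16 |
| 6C2R | 618 | 12 | 6 | 44 | 16 |
| 6C2Rh | 47 | 12 | 6 | 29 | 17 |
| 6C3R | 12 | 15 | 6 | 30 | 17 |
| 7C | 262 | 13 | 6 | 41 | 16 |
| 7Ch | 10 | 15 | 6 | 39 | 17 |
| 7C1R | 322 | 12 | 6 | 43 | 17 |
| 7C1Rh | 7 | 17 | 6 | 43 | 18 |
| 7C2R | 17 | 16 | 6 | 25 | 18 |
| 7C3R | 5 | 15 | 6 | 18 | 19 |
| 8C | 310 | 12 | 6 | 35 | 18 |
| 8Ch | 6 | 10 | 7 | 17 | 19 |
| 8C1R | 53 | 16 | 6 | 37 | 19 |
| 8C2R | 12 | 17 | 8 | 19 | 20 |
| 8C3R | 7 | 10 | 6 | 13 | 21 |
| 9C | 63 | 14 | 6 | 24 | 20 |
| Total | 8799 | | | | |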

For the second table, the takeaway is that the average cMs track pretty closely with each other at the same meiosis number. And after meiosis level 9, which averages 27cM, the “curve” quickly “flatlines” in the mid teens. This is reflected at DNA Painter, with many relationships all in play under 20cM.

Table 2: 8,799 AncestryDNA Matches Summarized by Meiosis Events

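| AncestryDNA MRCA | # Matches | Avg cM | Low cM | High cM | Meiosis events | Avg cM per meiosis level |
|---|---|---|---|---|---|---|
| 2C | 1 | 269 | | | 6 | 269 |
| 1C2Rh | 3 | 138 | 78 | 200 | 7 | |
| 2C1R | 14 | 127 | 34 | 220 | 7 | 132 |
| 2C2R | 8 | 47 | 39 | 162 | 8 | |
| 3C | 57 | 63 | 13 | 208 | 8 | 55 |
| 2C3R | 2 | 34 | 22 | 140 | 9 | |
| 3Ch | 5 | 20 | 16 | 95 | 9 | 27 |
| 3C1R | 139 | 28 | 6 | 148 | 9 | |
| 3C1Rh | 26 | 28 | 7 | 111 | 10 | |
| 3C2R | 106 | 22 | 6 | 68 | 10 | 25 |
| 4C | 128 | 24 | 6 | 220 | 10 | |
| 3C2Rh | 20 | 20 | 6 | 92 | 11 | |
| 3C3R | 34 | 22 | 6 | 58 | 11 | 18 |
| 4Ch | 7 | 12 | 6 | 19 | 11 | |
| 4C1R | 534 | 20 | 6 | 114 | 11 | |
| 3C3Rh | 12 | 23 | 8 | 40 | 12 | |
| 3C4R | 1 | 20 | | | 12 | |
| 4C1Rh | 33 | 16 | 6 | 30 | 12 | 18 |
| 4C2R | 267 | 16 | 7 | 92 | 12 | |
| 5C | 469 | 16 | 6 | 62 | 12 | |
| 3C4Rh | 2 | 10 | 8 | 12 | 13 | |
| 4C2Rh | 12 | 12 | 6 | 39 | 13 | 13 |
| 5C1R | 1137 | 14 | 6 | 60 | 13 | |
| 4C3R | 27 | 16 | 6 | 44 | 13 | |
| 4C4R | 1 | 17 | 17 | 17 | 14 | |
| 5Ch | 29 | 17 | 6 | 27 | 14 | |
| 5C1Rh | 7 | 14 | 6 | 60 | 14 | 15 |
| 5C2R | 300 | 14 | 6 | 41 | 14 | |
| 6C | 1922 | 12 | 6 | 56 | 14 | |
| 5C2Rh | 9 | 14 | 7 | 27 | 15 | |
| 5C3R | 75 | 14 | 6 | 40 | 15 | 13 |
| 6Ch | 97 | 11 | 6 | 25 | 15 | |
| 6C1R | 1503 | 12 | 6 | 52 | 15 | |
| 5C3Rh | 1 | 18 | 18 | 18 | 16 | |
| 5C4R | 2 | 10 | 10 | 10 | 16 | 13 |
| 6C1Rh | 58 | 10 | 6 | 22 | 16 | |
| 6C2R | 618 | 12 | 6 | 44 | 16 | |
| 7C | 262 | 13 | 6 | 41 | 16 | |
| 6C2Rh | 47 | 12 | 6 | 29 | 17 | |
| 6C3R | 12 | 15 | 6 | 30 | 17 | 13 |
| 7Ch | 10 | 15 | 6 | 39 | 17 | |
| 7C1R | 322 | 12 | 6 | 43 | 17 | |
| 7C1Rh | 7 | 17 | 6 | 43 | 18 | |
| 7C2R | 17 | 16 | 6 | 25 | 18 | 15 |
| 8C | 310 | 12 | 6 | 35 | 18 | |
| 7C3R | 5 | 15 | 6 | 18 | 19 | |
| 8Ch | 6 | 10 | 7 | 17 | 19 | 13 |
| 8C1R | 53 | 16 | 6 | 37 | 19 | |
| 8C2R | 12 | 17 | 8 | 19 | 20 | 15 |
| 9C | 63 | 14 | 6 | 24 | 20 | |
| 8C3R | 7 | 10 | 6 | 13 | 21 | |
| Total | 8799 | | | | | |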
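For anyone wanting to build the same kind of summary from their own spreadsheet, here is a small Python sketch. It assumes (this is an inference from the numbers, not something stated in the post) that the last column is a simple mean of the per-relationship averages at each meiosis level. It also computes a mean weighted by the number of Matches for comparison. The rows are copied from a few lines of the table; the function name `averages_by_meiosis` is made up for the example.

```python
from collections import defaultdict

# (relationship, matches, avg_cm, meiosis_events) for a few rows of the table
rows = [
    ("1C2Rh", 3, 138, 7),
    ("2C1R", 14, 127, 7),
    ("2C3R", 2, 34, 9),
    ("3Ch", 5, 20, 9),
    ("3C1R", 139, 28, 9),
    ("3C1Rh", 26, 28, 10),
    ("3C2R", 106, 22, 10),
    ("4C", 128, 24, 10),
]


def averages_by_meiosis(rows):
    """Return {events: (simple_mean, match_weighted_mean)} of the avg cM values."""
    groups = defaultdict(list)
    for _label, matches, avg_cm, events in rows:
        groups[events].append((matches, avg_cm))
    result = {}
    for events, items in sorted(groups.items()):
        # Assumption: each relationship counts equally in the simple mean.
        simple = sum(avg for _, avg in items) / len(items)
        # Alternative: weight each relationship by its number of Matches.
        weighted = sum(n * avg for n, avg in items) / sum(n for n, _ in items)
        result[events] = (simple, weighted)
    return result


summary = averages_by_meiosis(rows)
```

With these rows the simple means come out at 132.5, about 27.3 and about 24.7 for 7, 9 and 10 events, in line with the 132, 27 and 25 shown in the table. Small differences are to be expected because the table averages are already rounded.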

Sidebar – this evaluation also acts as a Quality Control indicator. Watch for data points way outside the norms. I had three Matches who skewed one of the numbers. I went back to them – they were close to each other and I was sure they were from an NPE. Upon reevaluation, they needed to be a generation closer to our CA. I made the shift, and all the numbers fell back into the norm.
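A rough way to automate that kind of Quality Control check is sketched below in Python. It is only one possible approach, not the method used for this post: it flags any Match whose cM is more than a chosen factor above or below the median for its recorded relationship. The sample data, the function name `flag_outliers` and the factor of 3 are all hypothetical.

```python
from statistics import median


def flag_outliers(matches, factor=3.0):
    """matches: list of (name, relationship, cm) tuples.

    Returns the entries whose cM is more than `factor` times the median of
    their relationship group, or less than the median divided by `factor`.
    """
    by_rel = {}
    for _name, rel, cm in matches:
        by_rel.setdefault(rel, []).append(cm)
    flagged = []
    for name, rel, cm in matches:
        mid = median(by_rel[rel])
        if cm > factor * mid or cm < mid / factor:
            flagged.append((name, rel, cm))
    return flagged


# Hypothetical Matches, all recorded as 5C; "F" shares far more than the rest.
sample = [
    ("A", "5C", 14), ("B", "5C", 16), ("C", "5C", 12),
    ("D", "5C", 18), ("E", "5C", 15), ("F", "5C", 95),
]
suspects = flag_outliers(sample)  # only "F" is flagged
```

A flagged Match is not necessarily wrong; it is just a prompt to go back and recheck the genealogy.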

These insights are helping me with a new review of Walking The Clusters Back, wherein I need to use judgment when imputing relationships and CAs.

[06G] Segment-ology: Insights into cM Patterns; by Jim Bartlett 20260122

17 thoughts on “Insights into cM Patterns”

  1. Jim – You say “I can only dream of how many ThruLines I’d get at the 7C and 8C levels.” As you likely know, you can fool AncestryDNA into providing some of these by making yourself a sibling to your parent (or grandparent) in the tree you are linked on. What had been your 6th GGPs are now your 5th GGPs and visible. I’m not sure how many would show up if you are now two or three generations older than your actual 7Cs – but I do it every few years and I find some interesting and real 7Cs. I posted this on Facebook years ago and three or four people thought it was the end of the world; that at that moment someone might be looking at my tree and get the wrong data. Meh. It takes maybe 36 hours to update, and I change it back a few days later.

    • Rich, I do use that tip – several times a year. And like you, I try to whiz through the new ThruLines (and put them in my spreadsheet), so that I can revert to the standard way quickly. Note that the ThruLines limit is the 6C level for both parties. So what you can get at the parent level is only 6C1R. You look like 7 generations back (but the CA is actually 8 generations back); and the Match has to be within 7 generations of that same, fairly distant, CA. It does add some true Matches to our list, but if Ancestry upped the level to 7C for ThruLines, we’d get a LOT more. In the meantime, we do what we can. Jim

  2. Jim, WOW is right. Very impressive. I imagine you are shooting for 10,000 matches now to make statistics easy. I was wondering if you have correlated all these matches somehow with your TGs and if any insights resulted? I know much of this match data is from Ancestry, but with all the overlapping matches in multiple databases the segments often seem identifiable.

    • pdtbill, When I add in my data for other companies (not really apples to apples because of Timber), I have over 10,000. My goal was to confirm relationships with 10% of my Matches at AncestryDNA, which is hard (but ProTools are helping – not enough time). And yes, I’ve determined the TG with 1,065 AncestryDNA Matches, but not all of those Matches include MRCAs.
      I am working on an improved/streamlined Walk The Clusters Back (WTCB) version, which lets me impute known TGs in a Cluster to other Matches (working down to 20cM Matches). This TGID then carries over to the data set in this post. My “expectation” there is that the TGs will be distributed, roughly, equally by generation. In other words, if I round my 372 TGs up to 400 TGs, I’d have 100 TGs “flowing” through each of my grandparents (with Matches at the nominal 1C level). Note: “flowing through” is not the same as “originating at”. Then 50 at each G grandparent; 25 per 2xG grandparent, etc. to about 1 TG for each 7xG grandparent at 7C level. In other words, at the 6C level (where I have the most Matches), I would expect to see about 3 TGs (or actually 6 TGs for each couple). So if you track the TGs in the Common Ancestor spreadsheet (as I do), don’t expect them all to be the same in each family, but do expect a small number of them, probably confined to different branches in each 5xG grandparent/couple family.
      I check GEDmatch every week for new Matches from AncestryDNA kits… AND I put the TGID in the new Notes column at GEDmatch to track the ones I’ve done. Jim

  3. Jim, I have received another match sharing 13.9cM in two segments; the largest, 7.1cM, triangulates with others and adds to the existing group of 18 Matches plus me on chromosome 16. From the surnames of the parents, it seems to me to be between Roma and mixed Roma.

    • Kevin, It looks like you are making good progress with this TG on Chr 16. It’s now time to transition to genealogy research to try to determine how the TG Matches all interrelate to each other – to find your Common Ancestor. Jim

      • Jim, can you explain what this message from the MyHeritage team means, about the fact that I have these matches I keep telling you about?

      • Kevin, It’s all about the power of autosomal DNA to find distant cousins to you. Generally, the more distant they are the smaller the shared cMs. You should use DNA Painter, and plug in the small cM amount and see the wide range of relationship possibilities. These small segment Matches may well be from your Ancestors well beyond a genealogy timeframe. The DNA gives you Matches and their cMs, which provide a rough range of possible relations. The next step is to use genealogy to figure out how you are related to the Matches. Jim

      • Jim, how can I tell whether a triangulated TG comes from a child, or from the children of the ancestors, or from an ancestor?

      • Kevin – only with genealogy – finding a Common Ancestor with each Match. The TG segment of DNA you share with each Match was present in one of your parents, one of your grandparents, one of your Great grandparents, etc. back to the Common Ancestor you have with that Match. You may have different Common Ancestors with different Matches, but they must all be along one path to the most distant CA. In other words, with the Matches in a TG, you may be a 4C with one, and 5C with another and 6C with a different Match – but the CAs would be along one path. As I explained in a recent post, when you get down to 15cM or below, that amount of DNA can be shared with a wide array of different relationships from 4C on back. For each of your Matches in the TG, you should look up the shared cM using DNA Painter to see the wide array of possibilities. That’s all the DNA can do! The rest has to be genealogy work – looking at Trees and records. Jim

  4. Thank you for this explanation … I could feel “bits and pieces” of that story lurking in all we speak to in genetic genealogy but you have explained it in a way that helps make sense of it all to me !!! Thank you.

    • Linda, Well… my wife is a PhD scientist and taught me a lot about DNA/biology. Her work with cancer researchers had a parallel with this post… be sure you understand when some data is missing (for a variety of reasons), because it skews the averages. I see many genetic genealogists trying to “sharpen a marshmallow” – trying to pinpoint a relationship using shared cMs… we can get a range, but not “the” relationship with cMs. At that point we have to include genealogy in the analysis. Jim
