Walking the Match Clusters Back

A Segment-ology TIDBIT

It appears to me that the next step for Clusters is “Walking the Clusters Back.”

By this I mean: start with the original Leeds Method, using 2nd cousins (2C) and 3C, which tends to result in 4 Clusters – one for each grandparent. Often, particularly with known 2C and 3C, you will be able to determine the grandparent for each Cluster.

Then adjust the shared segment cM threshold to focus on 3C and 4C and try to get 8 Clusters. This may take some fine-tuning of the threshold, but if you get plus or minus one or two Clusters, that’s OK – just work around it. Now, if you can identify the Matches from the 4 Cluster Matrix who repeat in this nominal 8 Cluster Matrix, you know which two Clusters belong to each of the 4 grandparents. Then, if you can figure out the great grandparent for one of the two Clusters under each grandparent, the other Cluster should be for the other great grandparent.
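The linking step above, finding which Matches repeat between the 4 Cluster run and the 8 Cluster run, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: each Cluster is written as a plain set of Match names, and the function names, Cluster labels and Match names are all hypothetical, not output from any clustering tool.

```python
from collections import defaultdict


def link_clusters(parents, children):
    """Map each child Cluster id to the parent Cluster id it shares the most Matches with."""
    links = {}
    for child_id, child_members in children.items():
        best_id, best_overlap = None, 0
        for parent_id, parent_members in parents.items():
            overlap = len(child_members & parent_members)
            # Strict ">" means a tie keeps the first parent seen.
            if overlap > best_overlap:
                best_id, best_overlap = parent_id, overlap
        links[child_id] = best_id  # None means no repeating Match was found
    return links


def group_by_parent(links):
    """Invert the links: parent Cluster id -> list of child Cluster ids."""
    grouped = defaultdict(list)
    for child_id, parent_id in links.items():
        grouped[parent_id].append(child_id)
    return dict(grouped)


# Hypothetical Match names, for illustration only (two of the four grandparents shown).
four_clusters = {
    "paternal grandfather": {"Ann", "Bob"},
    "paternal grandmother": {"Cal", "Dee"},
}
eight_clusters = {
    1: {"Ann", "Gus"},
    2: {"Bob", "Hal"},
    3: {"Cal", "Dee", "Ivy"},
    4: {"Jo"},
}

links = link_clusters(four_clusters, eight_clusters)
print(group_by_parent(links))
```

A Cluster that lands under None (Cluster 4 here) has no repeating Match, so it has to be placed by other means, which is part of the "work around it" above.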

Once you do what you can with the 8 great grandparent Clusters, adjust the cM thresholds, and rerun a Cluster Matrix to shoot for 16 Clusters and repeat the process.
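The target at each step is simply the number of Ancestors in that generation: 4, 8, 16, and so on. Here is a rough Python sketch of choosing among a few trial runs. It assumes you have already run your clustering tool at several thresholds and written down how many Clusters each run produced; the threshold values and counts below are made up, and the tolerance of 2 is just the "plus or minus one or two" mentioned above.

```python
def target_clusters(generations_back):
    """Grandparents are 2 generations back (4 Clusters), great grandparents 3 (8), and so on."""
    return 2 ** generations_back


def pick_threshold(trial_runs, generations_back, tolerance=2):
    """Return the trial threshold whose Cluster count is closest to the target.

    trial_runs maps a cM threshold to the number of Clusters that run produced.
    Returns None when even the closest run misses the target by more than the tolerance.
    """
    target = target_clusters(generations_back)
    best = min(trial_runs, key=lambda cm: abs(trial_runs[cm] - target))
    if abs(trial_runs[best] - target) > tolerance:
        return None
    return best


# Hypothetical trial runs: cM threshold -> number of Clusters produced.
trial_runs = {50: 9, 40: 13, 35: 17, 30: 24}

print(pick_threshold(trial_runs, 3))  # aiming for 8 Clusters
print(pick_threshold(trial_runs, 4))  # aiming for 16 Clusters
```

If the function returns None, none of the recorded runs is close enough and another round of threshold adjustment is needed.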

This would be Walking the Clusters Back. And, in the long run, it might be more efficient and accurate than trying to start with a small cM threshold and getting a large number of Clusters – 128 to 512 Clusters. As the number of Clusters grows, more and more Matches will be conflicting; and more distant Matches may well share more than one Common Ancestor with you. It just gets more complicated to sort out at the larger Matrix levels. Walking the Clusters Back will make this process easier.

And the absolutely great news – a huge benefit of Clusters – is that Shared Matches will cluster when they are Private, or have little or no Tree, or even when they have a robust Tree, but you cannot find any Common Ancestor. In other words, neither genealogy nor TGs, for that matter, are required to place a Match in a Cluster. AncestryDNA Matches who share less than 20cM can also be manually added to a Cluster, based on their Shared Matches. This is bringing “into the fold” Matches which normally would not be grouped. And putting these Matches into Clusters at any level really helps when it comes to building parts of their Tree out to meet yours.
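Manually placing a Match by their Shared Matches amounts to a simple vote: see which Cluster most of the Shared Matches already sit in. A minimal Python sketch, assuming you keep a lookup of Match name to Cluster number; the function name and all the data are hypothetical, and refusing to guess on a tie is my own choice, not a rule from any tool.

```python
from collections import Counter


def suggest_cluster(shared_matches, cluster_of):
    """Suggest a Cluster for an unclustered Match from the Clusters of their Shared Matches.

    shared_matches: names of the Shared Matches listed for the new Match.
    cluster_of: Match name -> Cluster number, for Matches already clustered.
    Returns None when no Shared Match is clustered, or when the top two Clusters tie.
    """
    votes = Counter(cluster_of[name] for name in shared_matches if name in cluster_of)
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # conflicting evidence: leave for manual review
    return ranked[0][0]


# Hypothetical data: three Matches already clustered.
cluster_of = {"Ann": 1, "Bob": 1, "Cal": 2}

print(suggest_cluster(["Ann", "Bob", "Cal", "Zed"], cluster_of))  # two votes for Cluster 1
print(suggest_cluster(["Ann", "Cal"], cluster_of))                # tie, so no suggestion
```

A suggestion from this kind of vote is only a starting point. A Match whose Shared Matches split between Clusters may well share more than one Common Ancestor with you.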

Match Clusters really fine-tune our data. Happy dance… [HT: Dana Leeds]


[22AB] Segment-ology: Walking the Clusters Back TIDBIT by Jim Bartlett 20190214

7 thoughts on “Walking the Match Clusters Back”

  1. Thanks Jim for all the great posts over the last few days on this. It’s very timely as our DNA Discussion group at the Society of Australian Genealogists plans to discuss this very topic on 9th March!

    Love all your posts. Thanks for sharing.


  2. Interesting posts, and comments, in the past few days. So much of this clustering is so dependent on one’s own family history and when they migrated to the US (if they did) and how many in the family had children. Of my mom’s 4 grandparents, ahnentafel 4 and 5 were immigrants from Italy. #5 came alone; her kin is still back in Italy and resistant to DNA testing. #4 came with brothers; some descendants have tested. They cluster with all of us who descend from 4 AND 5, so I cannot break out further at this time. #6 was a 1st generation American; his father’s siblings had no kids living to adulthood; his mother’s siblings were more “fruitful” so I can distinguish that cluster, but cannot break out between #6’s parents (12 and 13). #7, thankfully, is the gold mine and probably accounts for 95% of my mom’s Ancestry matches. She was born in USA, as were all 4 of her grandparents. I can easily cluster my mother’s matches representing those 4 grandparents of #7 (ahnentafels 28, 29, 30, 31) and I can even tentatively distinguish clusters for each parent of 29 and of 30. Further back is still too hypothetical, but is at 3G level for my mom.


    • cathmary, That is great. If you look at my most recent post – Match Cluster Report 1 – you can look down the list and find only one Cluster, #124, which has #7 Ancestors. They are all descended from 2 brothers from Scotland (CAMPBELL) who married 2 sisters from Germany (WEHRLE), who immigrated in the 1850s and wound up in Charleston, WV. I get very few Matches on these lines, and all that I get are from this one branch. Clustering may give me some more likely Matches, but trying to find CAMPBELL CAs in Scotland is a nightmare; and same for WEHRLEs in Germany. So out of 86 Clusters, only 1 Cluster to represent my mother’s mother’s line…. Jim


    • I understand, separating becomes easier when there aren’t many offspring. Oma had one child, my 2xg-gpa, George.

      My paternal side came to the US from Germany in the late 1840’s. Very few matches on Ancestry.com on that side for me.

      My maternal side has been easy to sort, also. The North vs. the South. 🙂

      The unknown family of Oma…many have tried fruitlessly for 50 years to find the surname! I thought DNA would make it easier. Now, with all the WILLIAM’s family with complete trees on Ancestry that link to Mr. and Mrs…it really looks as if WILLIAM is the surname.

      Thanks for your input!
      Mary T


  3. Thank you for replying so quickly!

    I am on the correct path! One of Mr. William’s 5th great-grandsons shares most often with most of us. 😊

    I have begun to arrange the information on a spreadsheet.
    I have created a column for each of my known cousins (1c, 2c and 3c that I have access to) that match with family #2. I am only using known cousins for whom I have determined the MRCA.
    Then, entered known information:
    The DNA match of the Mr. William’s family to my family.
    1. Relationship (MRCA);
    2. Shared DNA amounts and segments;
    3. Number of generations to Mr. William.
    4. and the names in each generation to Mr. William. At this point, Mr. William’s sons: James, Seth and William share most often with my family.

    I have one more piece of circumstantial evidence to the puzzle that leads me in this direction…the surname ‘William’. William was the maiden name of the 2nd wife to my 3xg-pa [in other words…when Oma died…he married a sister…maybe?]

    Thank you for your time,
    Mary T


    • Again, I think you are on the right track. One of the benefits of Clustering is the groups (Clusters) that are formed do not need to have Trees or DNA info – what you get are family groups, just because most of them all match each other a lot. Jim

