Use Clusters!

Clusters form on a Common Ancestor (CA). We don’t have proof of this but a) it makes sense (why else would our Matches match each other in a Cluster?); and b) it sure seems to work (I’ve found many new CAs with Matches, just by focusing on the CA in a Cluster).

So, with this concept in mind, let’s use our Clusters!

  1. Known CA – If you *know*, or even suspect, the CA of a Cluster, search other Matches in that Cluster for that CA or location or a collateral line. If a Cluster Match has a good Tree, there’s a good chance you’ll find the CA in their Tree. There’s a good chance multiple Matches in a Cluster will all have the same CA. Armed with a known CA, I’ve often been able to build out a Match’s Tree to that CA.
  2. Unknown CA – If you don’t have a clue to the Cluster CA, find the most likely CA among the Matches – whether you have that surname or not. Let the Matches tell you the Cluster CA – per this blogpost. This is also effective for Brick Walls and unknown parentage.
  3. Suspect CA – If someone on the internet proposes an Ancestor for one of your lines without proof, or if you are suspicious of their “proof”, test out that Ancestor. Look for that surname among the Matches in appropriate Clusters. “Appropriate” means these Clusters are probably on that line. Try the Unknown CA process and see if this same surname comes up. Clearly, if many people have bought into this Suspect CA, this process won’t work, because their Trees will already show that surname (in that case, running this process with the Suspect CA’s mother’s surname may be helpful). Example: During 40 years of research on my NEWLON line, many had heard the claim by one researcher that a spouse was “Martha JANNEY”, but without proof, few used that information. So I decided to test it. Virtually none of my Cluster Matches had the JANNEY surname, but many had the CUMMIN/GS surname. In fact, searching all of my DNA Matches (over 125,000 of them) turned up 17 Matches (down to 6cM) with the JANNEY surname in Loudoun Co, VA – none in any of my “appropriate” Clusters.
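
The Unknown CA and Suspect CA steps both come down to counting how many Matches in a Cluster carry a given surname in their Tree. Here is a minimal Python sketch of that tally, assuming you have collected the surnames from each Match's Tree by hand. The function name `surname_counts` and the sample data are hypothetical, made up for illustration; they are not real Matches.

```python
from collections import Counter

def surname_counts(cluster_trees):
    """Count how many Matches in a Cluster have each surname in their Tree.

    cluster_trees maps a Match name to the list of surnames found in that
    Match's Tree. A surname is counted once per Match, ignoring case.
    """
    counts = Counter()
    for surnames in cluster_trees.values():
        counts.update({s.upper() for s in surnames})
    return counts

# Hypothetical Cluster, for illustration only.
cluster = {
    "Match A": ["Smith", "Jones", "Clark"],
    "Match B": ["Smith", "Brown"],
    "Match C": ["smith", "Jones", "Smith"],  # duplicates count once
    "Match D": ["Brown"],
}

counts = surname_counts(cluster)

# Unknown CA: list surnames from most to least common in the Cluster.
for surname, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{surname}: {n} of {len(cluster)} Matches")

# Suspect CA: check how often the proposed surname appears (0 if absent).
print("TAYLOR:", counts["TAYLOR"])
```

A surname that shows up in several Matches is a candidate for the Cluster CA; a proposed surname that shows up in none of them deserves suspicion. The count is only a pointer, and each candidate still needs to be checked in the Trees.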

Bottom line: Use the concept that Clusters form on a CA. Use it to find CAs with more Matches; use it to break through Brick Walls or explore Clusters without a CA; use it to *test* likely or suspicious surnames in selected Clusters – if the CA is correct, it should show up in multiple Matches in a Cluster.


[19J] Segment-ology: Use Clusters! by Jim Bartlett 20200705

Easy Manual Clustering at AncestryDNA

Auto-Clustering at AncestryDNA is in a pause mode now. But we can still look at and analyze our own Matches any way we want. We can even form our own Clusters. Here is a modest process that may produce Clusters that are very helpful to us. AncestryDNA does not provide segment information that would allow grouping by Triangulated Groups, so Clustering is the best way to group Matches. And there are several advantages to using Clusters.

Manual Clustering Process at AncestryDNA

  1. Start with your ThruLines. These are Matches who share a Common Ancestor (usually a couple) with us. The ThruLines process looks for obvious CAs; it looks in Private (but searchable) Trees for CAs; it sometimes “fills in the blanks” with information from other (even non-DNA Match) Trees to create a link between you and a DNA Match back to a CA. This “fill in the blanks” process may be applied to your Tree or your Match’s Tree or both. The ThruLines process works out to 5xG grandparents on both sides – if either side is more than 7 generations back, it will not be reported. In any case, you should review the information provided by Ancestry and decide if the ThruLines CA is correct, or not.
  2. Enter the CA information in your Match’s Note box [see “Add note”] – I use a combination of Ahnentafel Number/side; relationship; and surnames. Example: A0140P/6C: WELCH/SPENCE – the Match and I are 6C, sharing ancestors Sylvester WELCH Jr and Anne SPENCE; Sylvester WELCH is my Ahnentafel Number 140 on my Paternal side. I like using Ahnentafel numbers as they are easy to compare and determine relationships. Just divide by 2 (dropping any remainder) to get 140>70>35>17>8>4>2>1 (me), so A0008P/2C: BARTLETT/NEWLON is on this same ancestral line. Do this for all your valid or suspected* ThruLines CAs. NB: Some Matches will share more than one CA with you – enter all of them. [* I include suspected ThruLines CAs – if they are incorrect, they almost never Cluster and can thus be culled out.] Anyway – use whatever system works for you, just enter something in the Note box. You’ll be looking at these Notes of Shared Matches to form Clusters, and you want to know who shares the same CA.
  3. After going through one or all of the ThruLines CAs, call up one of these Matches and review their Shared Matches. I count the number of SMs and the number on the same ancestral line, and record this in the Note box. Example: SM: 17/25xA0140P. This means that out of 25 total Shared Matches, 17 of them had a Note indicating the A0140P CA. NB: a Shared Match with a Note indicating A0070P would be included. An SM with A0034P would also be included because 34P is really a shortcut for 34P/35P, and 35P is in the same ancestral line. Likewise, 8P is in the same ancestral line and would be counted as also having A0140P ancestry. Repeat for all ThruLines Matches. It doesn’t take that long for such a powerful tool as Clustering.
  4. Use judgement to decide who is in a Cluster. In some cases, it’s crystal clear – virtually every Shared Match has an “SM: note” with the same CA. Other cases are not so clear, so you need to decide if there is sufficient evidence to include a Match in a Cluster. In some cases, a Match with a ThruLines CA will actually have several Shared Matches with a “different CA” – the Clustering process dictates such a Match be Clustered with the Matches with a “different CA”. And I would certainly review that Match again to see if there isn’t some clue that indicates the “different CA” is in their Tree, too.
  5. Cluster ID – you can use any system you want to name your Clusters. One way is CL001 to CL200. Another way is to use the CA – Example: CL0140P1. This is the Ahnentafel Number preceded by CL. NB: I added a 1 at the end because some of your Ancestors may be linked to more than one Cluster. [I have Ancestor A0556M, a 7xG grandparent couple, who are in three large Clusters.] Add this Cluster ID to the Match notes. Example: A0140P-CL047/6C: WELCH/SPENCE. Or use whatever system you want.
  6. Once you have determined Clusters based on ThruLines Matches and CAs, you can go back and look at a Match in a Cluster and look at his/her Shared Matches who aren’t in a Cluster. Do some have several Shared Matches themselves who are in a Cluster? If so, add these Shared Matches to the Cluster. NB: You can also look at Matches under 20cM – many of them have Shared Matches. If several Shared Matches are in one particular Cluster, add the under 20cM Match to the Cluster.
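
The Ahnentafel arithmetic, the Shared Match count, and the check for adding an unclustered Match can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Notes are typed by hand in the A0140P style described above, the function names are hypothetical, the sample Notes are made up, and the `minimum=3` threshold is my stand-in for “several”. Nothing here reads data from AncestryDNA.

```python
import re
from collections import Counter

def ancestral_line(number):
    """Halve an Ahnentafel number (dropping remainders) down to 1 (you)."""
    line = [number]
    while number > 1:
        number //= 2
        line.append(number)
    return line

def note_numbers(note):
    """Pull Ahnentafel numbers such as A0140P out of a Note.

    The lookbehind skips the number inside an 'SM: 17/25xA0140P' tally.
    """
    return [int(n) for n in re.findall(r"(?<!x)A(\d+)[PM]", note)]

def same_line(note_number, target):
    """True if the couple at note_number is on target's ancestral line.

    A number stands for the couple, so 34 is shorthand for 34/35.
    """
    even = note_number - (note_number % 2)
    return bool({even, even + 1} & set(ancestral_line(target)))

def sm_summary(shared_notes, target, side="P"):
    """Build an 'SM: hits/total' tally from the Notes of Shared Matches."""
    hits = sum(
        any(same_line(n, target) for n in note_numbers(note))
        for note in shared_notes
    )
    return f"SM: {hits}/{len(shared_notes)}xA{target:04d}{side}"

def suggest_cluster(shared_match_clusters, minimum=3):
    """Suggest a Cluster for an unclustered Match.

    shared_match_clusters holds the Cluster ID (or None) of each of that
    Match's Shared Matches. 'minimum' is an assumed cutoff for 'several'.
    """
    counts = Counter(c for c in shared_match_clusters if c)
    if not counts:
        return None
    cluster_id, n = counts.most_common(1)[0]
    return cluster_id if n >= minimum else None

# Hypothetical Notes of six Shared Matches, for illustration only.
notes = [
    "A0140P/6C: WELCH/SPENCE",
    "A0070P/5C",
    "A0034P/4C",
    "A0008P/2C: BARTLETT/NEWLON",
    "A0036P/4C",   # a different line: 36 > 18 > 9 > 4
    "",            # no Note yet
]

print(ancestral_line(140))
print(sm_summary(notes, 140))
print(suggest_cluster(["CL047", "CL047", None, "CL047", "CL012"]))
```

The tally and the suggestion are aids to judgement, not a replacement for it: a Match with a borderline count still needs the kind of review described in the steps above.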

Clusters are one of the best tools I’ve found for grouping AncestryDNA Matches and finding more CAs.


[19I] Segment-ology: Easy Manual Clustering at AncestryDNA by Jim Bartlett 20200701