Easy Manual Clustering at AncestryDNA

Auto-Clustering at AncestryDNA is in a pause mode now. But we can still look at and analyze our own Matches any way we want. We can even form our own Clusters. Here is a modest process that may produce Clusters that are very helpful to us. AncestryDNA does not provide segment information that would allow grouping by Triangulated Groups, so Clustering is the best way to group Matches. And there are several advantages to using Clusters.

Manual Clustering Process at AncestryDNA

Start with your ThruLines. These are Matches who share a Common Ancestor (CA, usually a couple) with us. The ThruLines process looks for obvious CAs; it looks in Private (but searchable) Trees for CAs; it sometimes “fills in the blanks” with information from other (even non-DNA Match) Trees to create a link between you and a DNA Match back to a CA. This “fill in the blanks” process may apply to your Tree, your Match’s Tree, or both. The ThruLines process works out to 5xG grandparents (7 generations back) on both sides – if the CA is more than 7 generations back on either side, it will not be reported. In any case, you should review the information provided by Ancestry and decide if the ThruLines CA is correct, or not.
Enter the CA information in your Match’s Note box [see “Add note”] – I use a combination of Ahnentafel Number/side; relationship; and surnames. Example: A0140P/6C: WELCH/SPENCE – the Match and I are 6C, sharing ancestors Sylvester WELCH Jr and Anne SPENCE; Sylvester WELCH is my Ahnentafel Number 140 on my Paternal side. I like using Ahnentafel numbers as they are easy to compare and determine relationships. Just divide by 2 to get 140>70>35>17>8>4>2>1 (me), so A0008P/2C: BARTLETT/NEWLON is on this same ancestral line. Do this for all your valid or suspected* ThruLines CAs. NB: Some Matches will share more than one CA with you – enter them both. [* I include suspected ThruLines CAs – if they are incorrect, they almost never Cluster and can thus be culled out.] Anyway – use whatever system works for you, just enter something in the Note box. You’ll be looking at these Notes of Shared Matches to form Clusters, and you want to know who shares the same CA.
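The divide-by-2 arithmetic and the Note format above can be sketched in a few lines of Python. This is only an illustration, not part of anything Ancestry provides: the function names `ancestral_line` and `format_note` are made up here, and the Note layout simply mirrors the example A0140P/6C: WELCH/SPENCE.

```python
def ancestral_line(ahnentafel: int) -> list[int]:
    """Return the Ahnentafel numbers from an ancestor down to 1 (the tester).

    Each step is an integer divide by 2, dropping any remainder,
    so 35 -> 17 and 17 -> 8.
    """
    if ahnentafel < 1:
        raise ValueError("Ahnentafel numbers start at 1")
    line = []
    n = ahnentafel
    while n >= 1:
        line.append(n)
        n //= 2
    return line


def format_note(number: int, side: str, relationship: str, surnames: str) -> str:
    """Build a Note string in the style used above (hypothetical helper).

    side is "P" (Paternal) or "M" (Maternal); the number is zero-padded
    to four digits so that Notes sort and compare easily.
    """
    return f"A{number:04d}{side}/{relationship}: {surnames}"


print(ancestral_line(140))  # [140, 70, 35, 17, 8, 4, 2, 1]
print(format_note(140, "P", "6C", "WELCH/SPENCE"))  # A0140P/6C: WELCH/SPENCE
print(8 in ancestral_line(140))  # True: A0008P is on the same ancestral line
```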

After going through one or all of the ThruLines CAs, call up one of these Matches and review their Shared Matches (SMs). I count the total number of SMs and the number on the same ancestral line, and record this in the Note box. Example: SM: 17/25xA0140P. This means that out of 25 total Shared Matches, 17 of them had a Note indicating A0140P ancestry. NB: a Shared Match with a Note indicating A0070P would be included. An SM with A0034P would also be included, because 34P is really a shortcut for the couple 34P/35P, and 35P is in the same ancestral line. Likewise, 8P is in the same ancestral line and would be counted as also having A0140P ancestry. Repeat for all ThruLines Matches. It doesn’t take that long for such a powerful tool as Clustering.
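If you keep your Notes in a spreadsheet or text file, the counting rule above can be sketched in Python. Everything here is an assumption for illustration: Ancestry has no such function, the names `on_same_line` and `sm_note` are invented, and the Notes are assumed to follow the A0140P style. A Note number is treated as a couple (34 stands for 34/35), and it counts when either member of the couple falls on the target's divide-by-2 line and the side (P or M) agrees.

```python
import re

# Matches CA entries such as A0140P; the lookbehind skips the "xA0140P"
# that appears inside an existing "SM: 17/25xA0140P" tally.
CA_PATTERN = re.compile(r"(?<!x)A(\d+)([PM])")


def ancestral_line(ahnentafel: int) -> list[int]:
    """Ahnentafel numbers from an ancestor down to 1, halving each step."""
    line = []
    n = ahnentafel
    while n >= 1:
        line.append(n)
        n //= 2
    return line


def on_same_line(target: int, other: int) -> bool:
    """True if 'other' (or its spouse) is on the target's ancestral line."""
    line = set(ancestral_line(target))
    # A couple is an even number and the next odd number, e.g. 34/35.
    couple = {other, other ^ 1} if other > 1 else {other}
    return bool(line & couple)


def sm_note(target: str, shared_notes: list[str]) -> str:
    """Build an 'SM: hits/total x target' string from Shared Match Notes."""
    m = CA_PATTERN.fullmatch(target)
    if m is None:
        raise ValueError(f"not a CA id: {target!r}")
    t_num, t_side = int(m.group(1)), m.group(2)
    hits = 0
    for note in shared_notes:
        # A Match may list more than one CA; count the Match once.
        for num, side in CA_PATTERN.findall(note):
            if side == t_side and on_same_line(t_num, int(num)):
                hits += 1
                break
    return f"SM: {hits}/{len(shared_notes)}x{target}"


notes = [
    "A0140P/6C: WELCH/SPENCE",
    "A0070P/5C: WELCH",
    "A0034P/4C",                   # couple 34/35; 35 is on the line
    "A0008P/2C: BARTLETT/NEWLON",
    "A0556M/8C",                   # maternal side, not counted
    "",                            # Shared Match with no Note yet
]
print(sm_note("A0140P", notes))  # SM: 4/6xA0140P
```

This sketch only counts Notes at or below the target on the line, as in the examples above; whether to also count Notes for more distant ancestors of the target is a judgement call left to you.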
Use judgement to decide who is in a Cluster. In some cases, it’s crystal clear – virtually every Shared Match has an “SM: note” with the same CA. Other cases are not so clear, so you need to decide if there is sufficient evidence to include a Match in a Cluster. In some cases, a Match with a ThruLines CA will actually have several Shared Matches with a “different CA” – the Clustering process dictates such a Match be Clustered with the Matches with a “different CA”. And I would certainly review that Match again to see if there isn’t some clue that indicates the “different CA” is in their tree, too.
Cluster ID – you can use any system you want to name your Clusters. One way is CL001 to CL200. Another way is to use the CA – Example: CL0140P1. This is the Ahnentafel Number preceded by CL. NB: I added a 1 at the end because some of your Ancestors may be linked to more than one Cluster. [I have Ancestor A0556M, a 7xG grandparent couple, who are in three large Clusters.] Add this Cluster ID to the Match notes. Example: A0140P-CL047/6C: WELCH/SPENCE. Or use whatever system you want.
Once you have determined Clusters based on ThruLines Matches and CAs, you can go back and look at a Match in a Cluster and look at his/her Shared Matches who aren’t in a Cluster. Do some have several Shared Matches themselves who are in a Cluster? If so, add these Shared Matches to the Cluster. NB: You can also look at Matches under 20cM – many of them have Shared Matches. If several Shared Matches are in one particular Cluster, add the under 20cM Match to the Cluster.
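The "add them if several Shared Matches are in one Cluster" step can also be sketched in Python. This is a rough illustration under stated assumptions: the function name `suggest_cluster`, the Match identifiers, and especially the threshold `min_hits=3` (one possible reading of "several") are all made up here, and the result is only a suggestion for you to review, not a decision.

```python
from collections import Counter
from typing import Optional


def suggest_cluster(
    shared_match_ids: list[str],
    cluster_of: dict[str, str],
    min_hits: int = 3,  # assumed meaning of "several"; adjust to taste
) -> Optional[str]:
    """Suggest a Cluster for an unclustered Match, or None.

    shared_match_ids: the Shared Matches of the Match being reviewed.
    cluster_of: maps already-Clustered Matches to their Cluster ID.
    """
    counts = Counter(
        cluster_of[m] for m in shared_match_ids if m in cluster_of
    )
    if not counts:
        return None
    # most_common(1) returns the Cluster with the most Shared Matches;
    # on a tie it returns the one seen first.
    cluster, hits = counts.most_common(1)[0]
    return cluster if hits >= min_hits else None


# Hypothetical data: four Matches already placed in Clusters.
cluster_of = {"m1": "CL047", "m2": "CL047", "m3": "CL047", "m4": "CL012"}

print(suggest_cluster(["m1", "m2", "m3", "m4", "m9"], cluster_of))  # CL047
print(suggest_cluster(["m1", "m4"], cluster_of))  # None
```

The same function works for a Match under 20cM: pass in that Match's Shared Matches and see whether enough of them already sit in one Cluster.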

Clusters are one of the best tools I’ve found for grouping AncestryDNA Matches and finding more CAs.

[19I] Segment-ology: Easy Manual Clustering at AncestryDNA by Jim Bartlett 20200701

7 thoughts on “Easy Manual Clustering at AncestryDNA”

jim4bartletts on July 2, 2020 at 9:51 pm said:

Barb, maybe. I would certainly follow up. First, a true 6C Match and a shared DNA segment are just the start of a journey. I have some very solid Triangulated Groups (TGs) with Matches in them that range from 4C to 8C on several different lines – they each have a solid paper trail to our CA, and the TG tells me the segments are solid, too. But they cannot all be right. There is only one ancestral line for each of our segments. These Matches share the same DNA segment with me and each other, yet they are cousins on different lines. This is possible and normal. It just means that some of them also relate to me a different way – such that all these TG Matches will have the same CA with me (and it may be a CA that I don’t have in my Tree yet).
The first thing I would look for is whether some of the Cluster Matches share a different CA with each other. And/or can you build the Matches’ Trees back to the CA you share with your 6C?
As I’ve said, this is just the (promising) start of a journey…. Jim

- Barb LaFara on July 2, 2020 at 11:07 pm said:
  
  Wow! I really appreciate you taking the time to explain this to me. I understand but don’t think the Ancestry DNA site allows me to explore these possible connections unless I build my tree forward from my 7th great grandparents. It seems daunting. If Ancestry DNA had some of the tools found on GEDmatch, or FTDNA, then it would be much easier.
  
  - jim4bartletts on July 2, 2020 at 11:48 pm said:
    
    Barb – Two different issues. Clustering doesn’t depend on your Tree. Even if you had no Tree, you could still Cluster (group) your Matches. And ThruLines doesn’t depend on you having your Tree built out to 7xG grandparents – ThruLines will try to “fill in” the Ancestors you are missing. But, make no mistake about it, building your Tree out to 7xG grandparents will help you in both areas – knowing your Ancestors will generate more ThruLines CAs (Ancestry cannot fill in everything); and determining the CA of a Cluster is also much easier when you have a robust Tree. Bottom line: you are on the right track to build your Tree out as far as you can. I think segment Triangulation is the best form of grouping (we still have to do the genealogy), but without segment info at AncestryDNA, Clustering is a great method there.
    
Barb LaFara on July 1, 2020 at 10:02 pm said:

I just tried this technique and have observed something interesting, I think… I have a 6th cousin match with a good tree and am confident in the Thru Lines connection. When I look at our common matches (18), nearly all are 28-31 cM over 2 segments. Could this be evidence of some “sticky” DNA associated with our common ancestor?

Pat on July 1, 2020 at 3:56 pm said:

Thank you for this thought-provoking treatise! I do most of this – especially extensive use of the Notes field to highlight MRCAs – but maybe in a different fashion. But, you’ve added a couple of tweaks that I plan to use.

- jim4bartletts on July 8, 2020 at 1:49 pm said:
  
  Thanks, Pat. You are encouraged to add any tips you use – we are all in this genetic genealogy hobby together.
  Jim
  