Let the Matches Tell Us the Cluster Common Ancestor

Using a 20cM threshold at AncestryDNA, I got 156 Clusters. That’s roughly one Cluster for each of my 128 5xG grandparents – or two Clusters per 5xG grandparent couple – often with valuable Common Ancestor (CA) hints from ThruLines. I don’t know 50 of my 128 5xG grandparents (they are brick walled) – so I would expect (50×156/128=) 61 of my 156 Clusters to be blank. What’s a body to do?
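The arithmetic behind that estimate can be written out as a small sketch. It assumes (as the estimate itself does) that Clusters are spread evenly across the 5xG grandparents; the function name is just for illustration.

```python
def expected_blank_clusters(total_clusters, total_ancestors, unknown_ancestors):
    """Estimate how many Clusters have no known CA.

    Assumes Clusters are spread evenly across all Ancestors in the
    generation, which is a simplification.
    """
    return total_clusters * unknown_ancestors / total_ancestors


# Figures from the paragraph above: 156 Clusters, 128 5xG grandparents, 50 unknown.
blank = expected_blank_clusters(156, 128, 50)
print(round(blank))  # 61
```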

Well… in the first place the above calculation is based on finding a CA at the 5xG grandparent level. ThruLines provides clues for all the Ancestors I know – but, clearly, they cannot help with Clusters (or TGs) beyond a brick wall. For almost all of the Clusters, I know the parent; and for roughly 80% I know the grandparent; and for many I know the CA out to the brick wall. So I’ve got a start. But, for many of my Clusters, there is very little otherwise to go on – just a lot of Matches in a Cluster. What’s a body to do?

As I’ve said before, let’s think about lemonade…  In my last post (Using a Group Ancestor), I noted that grouping (segment Triangulation and Shared Match Clustering) results in a group of Matches with the same Common Ancestor (CA). This is the concept, even if we don’t have any clue as to who the CA is. But let’s make “the certainty that there is a CA” work for us… Let’s have the Matches tell us who the CA is for a Cluster. Seems like lemonade to me.

Here is a process for AncestryDNA: [I hope you’ve saved your last Cluster report]

  1. Select a large Cluster for which you have no known CAs (or only a few which are in conflict with each other).
  2. Make a spreadsheet with three columns: Match Name, Surname, and Notes.
  3. Select a Match in the Cluster who has a Tree with more than 99 people.
  4. Type the Match name in the spreadsheet.
  5. Go to that Match in AncestryDNA (either from the URL in the Cluster; or by searching AncestryDNA).
  6. Type the surnames for that Match (both Shared Surnames & Match’s Tree Only) in Surname column.
  7. Copy the Match name down the spreadsheet for each surname.
  8. Repeat for each Match in the Cluster with a Tree over 99 people.
  9. Sort the spreadsheet on the Surname column.
  10. Scroll down the list and highlight likely Surname groups [it would be great to find a clear winner – repeated multiple times. If not, pick the top few surnames].
  11. Go back to the Matches with most likely surname(s) and put in the Notes column the Patriarch or any other identifying information (birth, location, ethnicity, etc). The expectation (hope) is that you’ll find a Common Ancestor or two in this process.
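For anyone who would rather let a script do the sorting and highlighting in steps 9 and 10, here is a minimal sketch. It assumes the spreadsheet has been reduced to (Match Name, Surname) pairs; the Match names and rows below are made-up placeholders, not my real data, and the function name is hypothetical.

```python
from collections import defaultdict


def tally_surnames(rows):
    """Group Match names by surname, most widely shared surname first.

    rows: iterable of (match_name, surname) pairs, one per spreadsheet row.
    Returns a list of (SURNAME, [match names]) sorted by number of
    distinct Matches (descending), then alphabetically.
    """
    by_surname = defaultdict(set)
    for match_name, surname in rows:
        key = surname.strip().upper()  # treat "Adams" and "ADAMS" as the same
        if key:
            by_surname[key].add(match_name)
    return sorted(
        ((surname, sorted(matches)) for surname, matches in by_surname.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )


# Placeholder rows standing in for the typed spreadsheet.
rows = [
    ("Match A", "Adams"), ("Match A", "Craft"), ("Match A", "Smith"),
    ("Match B", "ADAMS"), ("Match B", "Caudill"),
    ("Match C", "Craft"), ("Match C", "Adams"), ("Match C", "Jones"),
]

ranked = tally_surnames(rows)
for surname, matches in ranked:
    if len(matches) > 1:  # only surnames that appear in more than one Match's Tree
        print(surname, len(matches), matches)
```

Counting distinct Matches per surname (rather than raw rows) keeps one Match with a big Tree from dominating the list. The surnames at the top are only candidates; step 11 is still done by hand.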

I can almost hear the collective groan at step #6. Yes, it’s an onerous task. I sat down with a favorite beverage and typed non-stop the 660 surnames for Matches in one Cluster; 750 in another Cluster. But, think about this another way: would you spend a half-day of work to find a new Ancestor? That would be a nice glass of lemonade.

In my first Cluster try, I found three Surnames (ADAMS, CAUDILL and CRAFT) repeated several times. A quick and dirty Tree showed that John ADAMS married Nancy CAUDILL in 1769 in Loudoun Co, VA; and their daughter, Elizabeth, married Archelous CRAFT – and 5 of my Matches in the Cluster descended from these two couples!! I already had some clues that this Cluster was on my father’s father’s side. This includes my NEWLON line which had a brick wall born c1774 Loudoun Co, VA which I determined was Susan CUMMINGS – blogpost here. Her father is strongly suspected to be John CUMMINGS born c1746, but nothing is known about John’s first wife, the mother of Susan CUMMINGS and my Ancestor – a new brick wall. If John’s first wife was an ADAMS, all of this would fall into place as a hypothesis.

By the nature of Shared Match Clustering, this Cluster must have a CA. With five widely separated Matches agreeing on the same CA (and no other surnames turning out any hints at all), I think this is a strong clue. But, more research is needed.

The other Cluster had several repeated surnames, but none that I have been able to link together, yet. I may drop down and look at the surnames of Matches with Trees in the 50-99 people range… maybe another hour of typing… If I find a clue it will all be worthwhile.

Bottom Line: A Cluster (or a TG) has a CA. The Matches in a Cluster should all share this CA. Let the Matches Tell Us the Cluster Common Ancestor.  The process above is one way to do this. A particular advantage to me is that this process is comprehensive, and with no bias – the data from the Matches is treated evenly.

Post Script: By its nature genealogy is an ego-centric hobby. We tend to focus on ourselves as the center of the universe. Or, if we are professionals, we treat the Client as the center of the universe. Everything revolves around our Ancestors and what we can find out about them. But each of us is a small part of the human race, and our Matches – our cousins – are part of this larger picture. They fit in, too. They are an interlocking part of the whole jigsaw puzzle, and in some (many?) cases, some of them know more than I do. The process above draws on the data they have provided. Often, they have clues to the solutions we seek. Often, they know what’s on the other side of our brick walls.

Edit 6/22/20: I’ve been asked to add a photo of my spreadsheet. Here it is – showing the top two surnames.

[Image: Spreadsheet of Cluster Common Ancestors]

The 3rd column is Match Names and it has been narrowed for Match privacy. When I started, I had columns for Company and Where (the name of the Cluster run – 20cMCL63: Cluster 63 of the Shared Match run using a 20cM threshold), but it turns out this is a Quick and Dirty spreadsheet, and I didn’t need those columns. The objective is to get started on a Quick and Dirty Tree, and work from there. As soon as I saw the last line – a CRAFT married to an ADAMS – I started the Q&D Tree and found the five Matches who all tied together. Since then, I’ve used the previous blogpost on Searching and have found over a dozen more Matches who descend from this same line. All of the Cluster Matches were over 20cM. However, now knowing what I’m looking for, the Search process let me drop below 20cM and find many more – and most of them have above-20cM Shared Matches from the same Cluster. This is added evidence that I tie into this line somehow.

[19H] Segment-ology: Let the Matches Tell Us the Cluster Common Ancestor by Jim Bartlett 20200620

Using a Group Common Ancestor

A Triangulation (and grouping) Concept

We have spent a lot of time and effort to describe *how* to group our Matches: segment Triangulation, DNA Painting, Shared Match Clustering. Each of these processes results in a group of Matches that should have a Common Ancestor (CA). This is an important concept.

But the main thing is to *use* this concept – to use the information found in these groups. If a group is formed around a CA, then all of the Matches in the group should share a CA. Once a CA is found, each Match in the group should also have that group CA, or be a closer cousin with an MRCA that descends from the group CA, or have a more distant MRCA which is ancestral to the group CA. In other words, all the Matches in a group should have the same distant CA.

So… if we find a CA for a group, the other Matches in the group should have the same CA line. This is a powerful focus – let’s *use* it. We should be able to look at other Matches in the group (who have Trees) and find that CA – either directly through a search, or indirectly by building out their Tree.

I illustrated this in Case 3 of Chapter 1 (Lessons Learned from Triangulating a Genome) of “Advanced Genetic Genealogy: Techniques and Case Studies” – here or here. This was all about one of my TGs which I call [04P36]. At Ancestry, I found a few cousins (who had uploaded to GEDmatch) in that TG who shared my HIGGINBOTHAM ancestry. Armed with that hint, I searched for HIGGINBOTHAMs in other Matches (in that TG) who had trees. I also contacted Matches from FTDNA, 23andMe and MyHeritage – and several replied that they had the same HIGGINBOTHAM Ancestry. In the end I found 14 different Matches ranging from 4C to 8C on this HIGGINBOTHAM line in TG [04P36].

Because TG [04P36] came down a line of descent with the HIGGINBOTHAM surname in 5 generations, this case was an easier example – searching for one distinct surname. If a group represents a CA with a male-female zig-zag line of descent to me, it will be harder – the surname will change often. However, each line of descent (from a given Ancestor) is fixed – and we may find Match cousins with MRCAs of different surnames, but they will all be on the same ancestral line. This is akin to “Genealogy Triangulation” – getting an alignment of multiple cousins on one line.
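The zig-zag case can be sketched as a lookup against the whole ancestral line rather than a single surname. This is only an illustration under stated assumptions: the ancestor labels and Match Trees below are invented placeholders, the function name is hypothetical, and real Trees would need fuzzier matching on names, dates and places.

```python
def matches_on_line(ancestral_line, match_trees):
    """Find Matches whose Trees include anyone on a given line of descent.

    ancestral_line: names of the Ancestors on one line, from the group CA
                    down toward me (surnames may change each generation).
    match_trees:    dict of match name -> list of people in that Match's Tree.
    Returns a dict of match name -> sorted list of line Ancestors found.
    """
    line = {name.upper() for name in ancestral_line}
    found = {}
    for match, people in match_trees.items():
        shared = sorted(line & {person.upper() for person in people})
        if shared:
            found[match] = shared
    return found


# Invented placeholders: one line of descent with a different surname each generation.
line_of_descent = ["Ann ALPHA b1720", "Mary BRAVO b1745", "Sarah CHARLIE b1770"]
trees = {
    "Match A": ["Ann Alpha b1720", "Tom Other b1718"],
    "Match B": ["Sarah Charlie b1770", "Mary Bravo b1745"],
    "Match C": ["Unrelated Person b1800"],
}

on_line = matches_on_line(line_of_descent, trees)
for match, ancestors in on_line.items():
    print(match, ancestors)
```

Match A and Match B would report MRCAs with different surnames, yet both land on the same line – which is the alignment we are looking for.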

Finding one Match with a CA in a group is not the end of the story – it’s a clue to the beginning of more research. If we find a CA for a group, but no other Match seems to have that CA, maybe we need to look for a different CA. The “correct” CA for each group should lead to Genealogy Triangulation – agreement by other Matches on the same ancestral line. If you find a CA in a group, *use* it to find more Matches on that same line. Seek CA agreement among Matches in each group.


[08D] Segment-ology: Using a Group Common Ancestor Concept by Jim Bartlett 20200620

Using Ethnicity to Identify a Cluster

A Segmentology TIDBIT

My Ancestor 14M was John William CAMPBELL, born 1856 NY; died 1916 WV. His parents were Samuel CAMPBELL and Ann CLARK who were married 1851 in Scotland and immigrated to the US in 1853. This 1/8 of my ancestry is the only known part to come from Scotland. Several cousins have done Y-DNA testing and the CAMPBELL line is the Argyll CAMPBELLs.

I have over 125,000 Matches at AncestryDNA. I have identified Common Ancestors with over 4,500 Matches – only 5 of them are on my CAMPBELL line. About 12.5% of my DNA is from my CAMPBELL line, and, all other things being equal, about 12.5% of my Matches should come from my CAMPBELL line.  But all things are not equal – this CAMPBELL line is relatively small, and there are no known Ancestors before 1850, and there are no known links to any Ancestors in Scotland.

This doesn’t mean that none of my other Matches are cousins from this CAMPBELL line. However, it does result in me not being able to find any more links. I have tens of thousands of Matches with no Trees; I’ve even found some with a CAMPBELL surname – but no way to determine if I am related to them (other than the few who have matching Y-DNA at FamilyTreeDNA).

So, I drop back and relook at the big picture: exactly 1/8 of my Ancestry came from Scotland (well, maybe not going way back, but probably within a genealogy timeframe); roughly 1/8 of my DNA came from/through Scotland; and if not 1/8, perhaps 10,000 of my Matches should be on this part of my Ancestry – certainly more than the five close cousins I already knew about.

I decided to turn this lemon into lemonade. The lemon is a recent Scottish immigrant ancestor – the lemonade is Scotland ethnicity. If this is the only part of my ancestry from Scotland, maybe I could use that information. When I Cluster my AncestryDNA Matches at the 20cM Threshold (the lowest cM amount with Shared Matches to each other) I get about 160 Clusters. 1/8 of those is 20 Clusters – a manageable number. So when I see some solid looking Clusters without any hints of other ancestry, maybe they are from my Scotland line.

Here is one such Cluster. I clicked on the link for each Match and checked their ethnicity:

Every Match in this Cluster has 14% to 62% Scotland ethnicity. A few scattered Matches with Scotland ethnicity might be expected randomly, but for all of them to have significant amounts of Scotland ethnicity is a strong clue.
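The same check can be sketched in a few lines once the ethnicity percentages are written down for each Match. The Match names and percentages below are invented placeholders (only the 14% and 62% endpoints echo the range I saw), and the 10% cutoff for a “significant” amount is my assumption for illustration, not an Ancestry rule.

```python
def region_concordance(cluster, region="Scotland", min_pct=10):
    """Fraction of Matches in a Cluster with at least min_pct of a region.

    cluster: dict of match name -> dict of region name -> ethnicity percent.
    min_pct: assumed cutoff for a 'significant' amount (hypothetical value).
    """
    if not cluster:
        return 0.0
    hits = sum(
        1 for estimate in cluster.values() if estimate.get(region, 0) >= min_pct
    )
    return hits / len(cluster)


# Invented placeholder Clusters.
scotland_cluster = {
    "Match A": {"Scotland": 14, "England": 60},
    "Match B": {"Scotland": 62},
    "Match C": {"Scotland": 33, "Ireland": 20},
}
mixed_cluster = {
    "Match D": {"Scotland": 40},
    "Match E": {"England": 80},
    "Match F": {"Scotland": 3, "Germanic Europe": 50},
    "Match G": {"Scotland": 25},
}

print(region_concordance(scotland_cluster))  # 1.0
print(region_concordance(mixed_cluster))     # 0.5
```

A Cluster scoring at or near 1.0 is the kind of “picture” I am looking for; a middling score says little either way.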

I think I can safely assume this CL149/14/[Scotland…] Cluster represents my Ancestor, John CAMPBELL – Ahnentafel 14M. If I knew the DNA segment, I could Paint this Cluster. I have several others that also show a pretty clear Cluster “picture”. Next I’ll be looking at some other Clusters which may even have a ThruLines Common Ancestor in them, but also have a lot of Scotland ethnicity – the ThruLines CA may be the outlier… With only one ThruLines CA I don’t have a high confidence that it’s right. But with high concordance of Scottish ethnicity, that’s a strong clue the Cluster is on my CAMPBELL line.

The next step is studying any Trees in these Scotland Clusters to see if those Matches have some Common Ancestors among themselves… That will be the sweetest lemonade of all.


[22AU] Segment-ology: Using Ethnicity to Identify a Cluster TIDBIT by Jim Bartlett 20200612