Let the Matches Tell Us the Cluster Common Ancestor

Using a 20cM threshold at AncestryDNA, I got 156 Clusters. That’s roughly one Cluster for each of my 128 5xG grandparents – or two Clusters per 5xG grandparent couples – often with valuable Common Ancestor (CA) hints from ThruLines. I don’t know 50 of my 128 5xG grandparents (they are brick walled) – so I would expect (50×156/128=) 61 of my 156 clusters to be blank. What’s a body to do?

Well… in the first place the above calculation is based on finding a CA at the 5xG grandparent level. ThruLines provides clues for all the Ancestors I know – but, clearly, they cannot help with Clusters (or TGs) beyond a brick wall. For almost all of the Clusters, I know the parent; and for roughly 80% I know the grandparent; and for many I know the CA out to the brick wall. So I’ve got a start. But, for many of my Clusters, there is very little otherwise to go on – just a lot of Matches in a Cluster. What’s a body to do?

As I’ve said before, let’s think about lemonade…  In my last post (Using a Group Ancestor), I noted that grouping (segment Triangulation and Shared Match Clustering) results in a group of Matches with the same Common Ancestor (CA). This is the concept, even if we don’t have any clue as to who the CA is. But let’s make “the certainty that there is a CA” work for us… Let’s have the Matches tell us who the CA is for a Cluster. Seems like lemonade to me.

Here is a process for AncestryDNA: [I hope you’ve saved your last Cluster report]

  1. Select a large Cluster for which you have no known CAs (or only a few which are in conflict with each other).
  2. Make a spreadsheet with three columns: Match Name and Surnames and Notes.
  3. Select a Match in the Cluster who has a Tree with more than 99 people.
  4. Type the Match name in the spreadsheet.
  5. Go to that Match in AncestryDNA (either from the URL in the Cluster; or by searching AncestryDNA).
  6. Type the surnames for that Match (both Shared Surnames & Match’s Tree Only) in Surname column.
  7. Copy the Match name down the spreadsheet for each surname.
  8. Repeat for each Match in the Cluster with a Tree over 99 people.
  9. Sort the spreadsheet on the Surname column.
  10. Scroll down the list and highlight likely Surname groups [it would be great to find a clear winner – repeated multiple times. If not pick the top few surnames].
  11. Go back to the Matches with most likely surname(s) and put in the Notes column the Patriarch or any other identifying information (birth, location, ethnicity, etc). The expectation (hope) is that you’ll find a Common Ancestor or two in this process.

I can almost hear the collective groan at step #6. Yes, it’s an onerous task. I sat down with a favorite beverage and typed non-stop the 660 surnames for Matches in one Cluster; 750 in another Cluster. But, think about this another way: would you spend a half-day of work to find a new Ancestor? That would be a nice glass of lemonade.

In my first Cluster try, I found three Surnames (ADAMS, CAUDILL and CRAFT) repeated several times. A quick and dirty Tree quickly determined John ADAMS married 1769 Loudoun Co, VA Nancy CAUDILL; and their daughter, Elizabeth married Archelous CRAFT – and 5 of my Matches in the Cluster descended from these two couples!! I already had some clues that this Cluster was on my father’s father’s side. This includes my NEWLON line which had a brick wall born c1774 Loudon Co, VA which I determined was Susan CUMMINGS – blogpost here. Her father is strongly suspected to be John CUMMINGS born c1746, but nothing is known about John’s first wife, the mother of Susan CUMMINGS and my Ancestor – a new brick wall. If John’s first wife was an ADAMS, all of this would fall into place as a hypothesis.

By the nature of Shared Match Clustering, this Cluster must have a CA. With five widely separated Matches agreeing on the same CA (and no other surnames turning out any hints at all), I think this is a strong clue. But, more research is needed.

The other Cluster had several repeated surnames, but none that I have been able to link together, yet. I may drop down and look at the surnames of Matches with Trees in the 50-99 people range… maybe another hour of typing… If I find a clue it will all be worth while.

Bottom Line: A Cluster (or a TG) has a CA. The Matches in a Cluster should all share this CA. Let the Matches Tell Us the Cluster Common Ancestor.  The process above is one way to do this. A particular advantage to me is that this process is comprehensive, and with no bias – the data from the Matches is treated evenly.

Post Script: By it’s nature genealogy is an ego-centric hobby. We tend to focus on ourselves as the center of the universe. Or, if we are professionals, we treat the Client as the center of the universe. Everything revolves around our Ancestors and what we can find out about them. But each of us is a small part of the human race, and our Matches – our cousins – are part of this larger picture. They fit in, too. They are an interlocking part of the whole jigsaw puzzle, and in some (many?) cases, some of them know more than I do . The process above draws on the data they have provided. Often, they have clues to the solutions we seek. Often, they know what’s on the other side of our brick walls.

Edit 6/22/20: I’ve been asked to add a photo of my spreadsheet. Here it is – showing the top two surnames.

Spreadsheet of Cluster Common Ancestors

The 3rd column is Match Names and it has been narrowed for Match privacy. When I started, I had columns for Company and Where (the name of the Cluster run – 20cMCL63: Cluster 63 of the Shared Match run using a 20cM threshold), but it turns out this is a Quick and Dirty spreadsheet, and I didn’t need those columns. The objective is to get started on a Quick and Dirty Tree, and work from there. As soon as I saw the last line – a CRAFT married to an ADAMS, I started the Q&D Tree and found the five Matches who all tied together. Since then, I’ve used the previous blogpost on Searching and have found over a dozen more Matches who descend from this same line. All of the Cluster Matches were over 20cM. However, now knowing what I’m looking for, the Search process let me drop below 20cM and find many more – and most of them have above-20cM Shared Matches from the same Cluster. This is added evidence that I tie into this line some how.

[19H] Segment-ology: Let the Matches Tell Us the Cluster Common Ancestor by Jim Bartlett 20200620

19 thoughts on “Let the Matches Tell Us the Cluster Common Ancestor

  1. Pingback: Use Clusters! | segment-ology

  2. Jim, this is a great resource and i’m trying to get myself going on building out my TG’s. I’ve got my Ancestry AT test and my mothers up at GedMatch. I know that i will have “about” 34 big segments from my mom and dad. How do i identify those big segments without looking at all of the thousands of matches? I’m missing something as it would seem logical to break the segments first into the big areas as we then know if it’s maternal or paternal.

    Like

    • David, thanks for your feedback and question. You’ll actually get about 68 big segments from your 4 grandparents. You can identify some of them with 1C, but most of us don’t have enough close cousins to figure it out. You can start by identifying all of your maternal segments (shared also with your mother) – the rest of your shared segments over 15cM are probably from your father. I found most of my 372 TGs at 23andMe, MyHeritage and FTDNA. Each costs about $50 (on sale) and they each have ways to determine the TGs. When I did it, it was by comparing them one by one – it’s much easier now. For me an Excel spreadsheet was the best tool to capture, sort and compare shared segments. Do your maternal side TGs first – it will be easier to identify those segments. Start with largest segments and work down. What’s left is your fathers side. It does take time, but then you have a fixed map to work with – filling in the CAs.

      Like

      • OK sorry for being dense, but “You can start by identifying all of your maternal segments (shared also with your mother) ” is what i was trying to do 🙂
        Are you saying that I only identify those segments by looking at GedMatch or MyHeritage and where maternal and me shared a segment with someone else that’s how i do it? I’d love to see and article as to why, if you have yours and a parents atDNA why you don’t see the “34” segments that came from both parents. I fully understand that those segments really came from the ancestors, and we have that ~350-400 TGs from way back, but those are constrained by the 34 from our parents.

        Like

      • David,
        Maybe an analogy. Since I’m an engineer, I’ll use building a house. Your resources are 2x4s, nails, plywood, etc. You cannot just start with the walls, you have to build it up from the components you have. This isn’t a good analogy from all angles, but the takeaway is that our shared segments are the resources for us to build a chromosome map. Our DNA does not include any “signposts” that say: “here is where a grandparent’s segment starts”. We have to build up the TGs (from shared segments) to identify where the crossover points are. Then the genealogy of Common Ancestors with Matches tells us which ancestor each TG comes from. From that info we can tell which TGs are from which grandparents. If you have two siblings who hava also tested, you can try Visual Phasing (looking at the large segments of 3 siblings at GEDmatch let’s you determine the grandparent crossover points). The key to determine large grandparent segments directly is with very close Matches who share with you all or most of those large segments. Otherwise, we have to build up TGs with smaller segments.

        Like

      • Thank you Jim for spending the time to help. I’m getting close but i’m missing something. Let me try another example and hopefully i can figure out where i’m so obviously wrong. I too am an engineer and i think that’s getting in the way 🙂
        Let’s look at a single chromosome which has 200 cm’s in it. Father gives the first 50cm and mom gives the remaining 150 cm. There is therefore one crossover and from the 50cm approximately 25 comes from paternal GF and 25 from paternal GM (same in the 150).
        Now I have atDNA tests for both me and mother, but not father. Why can’t i look at the chromosome and see that me an mom only match on the 150cm and automatically deduce that the 50 came from the father? Why do i need other tests to tell me where that break is? That 50 doesn’t match mom so it has to be from dad. What am i missing?

        Like

      • I got through my schooling without any biology, so I’ve really had to study hard since I started with DNA in 2002. The best was a free semester online from MITx – I should blog about it. Your mother and father each gave you 22 autosomes (which is why you got exactly the same amount of atDNA from each parent. You have two of each chromosome 01-22. Think of 44 individual worms. Your mother passes that 200cM chromosome to you which is a combination of 50cM from her father and 150cM from her mother – but 100% from your mother. Your father passes a separate 200cM chromosome to you with, say, a 125cM segment from his mother and a 75cM segment from his father. When you share a 20cM segment on the 200cM Chr with a Match, we don’t know which of you two 200cM chromosomes it’s on. However, if that Match also share most of the same segment with your mother, we assume it must be on the chromosome your mother passed to you. In one sense, you can “Paint” 20cM of your maternal 200cM Chr with that shared segment from a Match. As you “paint” more and more shared segments from Matches who also share each segment with your mother, you’ll start to notice a break point about 50cM along the Chr. No shared segment overlaps that point. Because to do so they would have to be descended from both your mother’s father and mother. This doesn’t happen with cousin Matches, so with enough Painting the recombination/crossover point between your mother’s parents is revealed. Hope this helps.

        Jim Bartlett Sent from my iPhone DNA blog: http://www.segmentology.org

        >

        Like

      • The penny drops. You have TWO full of each chromosome. Not one merged one. So you match 100% of your father and 100% of your mother. The matches let you distinguish which of the pair you are looking at. Duh. I hate being dense sometimes. I’ve seen that written so many times, but i just wasnt processing it. I can blame this on …. um age, yeah that’s right age, i’ll go with that right now. No wait don’t want to admit to getting older so it must have been inadequate teaching in the past, yep that’s it bad teachers 🙂

        Like

  3. Great method. I copy and paste the surnames from the AncestryDNA Match page into a Google sheets spreadsheet using “paste plain text” so each surname goes into it’s own row. Then I do basically the same thing sorting by surname, looking for repeated surnames, and looking for pedigree intersection among the matches in the cluster!

    Like

      • Nicole – I figured it out – nice. There are two lists: Shared Surnames & [Match’s] Tree Only. So highlight one list. Go to spreadsheet and right-click on a cell and select “text”, then enter. Repeat with second list. What is copied and pasted is the Surname and the number of people in the Tree with that name. This will still sort nicely on surname. A good time/accuracy saver. Thanks again, Jim

        Like

      • Glad you found the list of the match’s surnames only. Yes, it does save time! I just used this method over the weekend to find a common ancestor for a cluster related to my grandmother’s (the test taker) 2nd great grandfather, Jacob Rose, whose early life and parents are unknown. Walking the clusters back, I was able to identify the CA of some of the more distant clusters. They were Jacob’s wife’s ancestors. What was left were a couple other clusters with no common ancestor hints. After using this method for analyzing the surnames from the matches in the unknown cluster with trees, the common ancestral couple of one of them appears to be a Douglas family in Virginia. I now have a hypothesis for Jacob Rose’s family. Since I use Airtable database/spreadsheets to keep track of all my DNA matches and research, I copied the surname table I made in Google sheets into Airtable which has “group by” function within the spreadsheet. (First, I used the Google Sheets function to separate data in cells so that the surname was in it’s own column). Copying the surnames into Airtable and grouping them made it even easier to see which surnames had more than one match with that surname in their tree. I was working with the trees of 8 matches in the cluster. There were several surnames that appeared twice, but Douglas appeared in 6 of the trees. After finding those and plotting them in Lucidchart, I was able to find that the other two match’s trees could also be traced back to the Douglas couple. Now my next phase of research will focus on tracing the Douglas descendants through traditional records.

        Like

      • Nicole, Thanks again for the tip. I just did another Cluster, but a common surname is not jumping out with this one. Nevertheless, I think this process of looking for a Common Ancestor among the Matches in a Cluster is a good one – your method shortened it up somewhat. Thanks again. Jim

        Liked by 1 person

  4. Hello Jim!
    I have a question for you. Could you share your spreadsheet in a photo? I am visual learner.

    Thank you in advance,
    Maria

    Like

  5. Thanks Jim for another new way to approach the cluster reports. My Mum has two problem paternal lines, luckily I ran the report for her Ancestry matches very recently so was able to use that. She only has 78 clusters up to 20cMs. I recently spent some weeks working through the ‘Walking back the clusters’ approach a month or two back and thought I had researched all the clusters thoroughly looking for shared matches using GMP to assist searching for common names. In the first cluster I looked at there were 12 trees, but only 5 of any use. It’s a tedious process typing all the names but something I’m afraid we are going to have to get used to in the absence of any downloading tools and no pedigree thief! I got a bit excited when I started getting some commons names but in the end just found another pair of 1C1R’s that I’d missed on my first review. One step closer I guess, and one more clue! I have a few in this cluster and no doubt I just have to keep amassing them until they all join up! I’ll start on another cluster today!

    Liked by 1 person

    • I’m the same – sometimes I do the analysis and nothing really falls into place; but other times I hit the jackpot. Every time it even partially works, that’s progress – and encouragement to keep trying. With this new process, I’ve got a lot of Clusters in the queue, so it will keep me busy for a while;>j Jim

      Liked by 2 people

  6. Cathy, Yes! And once upon a time the Chrome AncestryHelper would list and help gather the surnames. I hope AncestryDNA is working on more tools. One nice tool would be extending ThruLines out to 8th cousins (which we used to get with Circles. I am now convinced that our DNA can be linked back (chromosome mapped) to our 8th cousins to our 7xG grandparents. That’s getting close to the fringes of genealogy based on multiple records. But, even with a few records Walking the Ancestor Back narrows our possibilities down to only two parent for each Ancestor we “know”. I’m now up to nine separated Matches in one Cluster who all descend from Archelous CRAFT and Elizabeth ADAMS – This *has* to tie into my Tree somehow… Jim

    Liked by 1 person

  7. I have a similar process that doesn’t require all the typing in. Unfortunately, one step in my process (getting the match’s tree and/or building the Q&D tree) has become more complicated due to the [I hope you’ve saved your last Cluster report] problem. What’s a body to do? 😉

    Liked by 1 person

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.