Using Ethnicity to Identify a Cluster

A Segmentology TIDBIT

My Ancestor 14M was John William CAMPBELL, born 1856 NY; died 1916 WV. His parents were Samuel CAMPBELL and Ann CLARK, who were married in 1851 in Scotland and immigrated to the US in 1853. This 1/8 of my ancestry is the only known part to come from Scotland. Several cousins have done Y-DNA testing, and the CAMPBELL line is the Argyll CAMPBELLs.

I have over 125,000 Matches at AncestryDNA. I have identified Common Ancestors with over 4,500 Matches – only 5 of them are on my CAMPBELL line. About 12.5% of my DNA is from my CAMPBELL line, and, all other things being equal, about 12.5% of my Matches should come from my CAMPBELL line.  But all things are not equal – this CAMPBELL line is relatively small, and there are no known Ancestors before 1850, and there are no known links to any Ancestors in Scotland.

This doesn’t mean that none of my other Matches are cousins from this CAMPBELL line. However, it does result in me not being able to find any more links. I have tens of thousands of Matches with no Trees; I’ve even found some with a CAMPBELL surname – but no way to determine if I am related to them (other than the few who have matching Y-DNA at FamilyTreeDNA).

So, I drop back and relook at the big picture: exactly 1/8 of my Ancestry came from Scotland (well, maybe not going way back, but probably within a genealogy timeframe); roughly 1/8 of my DNA came from/through Scotland; and if not a full 1/8 of my Matches, perhaps 10,000 of them should be on this part of my Ancestry – certainly more than the five close cousins I already knew about.
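The back-of-the-envelope arithmetic behind these numbers can be sketched in a few lines of Python. The figures are the ones quoted in this post; the function name is hypothetical, and the calculation assumes Matches are spread across ancestral lines in proportion to ancestry, which (as noted above) is not really true for a small, recently immigrated line.

```python
def expected_matches(total_matches: int, ancestry_fraction: float) -> float:
    """Expected Matches on one line IF Matches were spread evenly in
    proportion to ancestry (a simplifying assumption, not a rule)."""
    return total_matches * ancestry_fraction


CAMPBELL_FRACTION = 1 / 8      # the 1/8 of ancestry quoted in the post

all_matches = 125_000          # "over 125,000 Matches at AncestryDNA"
identified_matches = 4_500     # Matches with an identified Common Ancestor
found_on_campbell_line = 5     # identified Matches on the CAMPBELL line

expected_all = expected_matches(all_matches, CAMPBELL_FRACTION)
expected_identified = expected_matches(identified_matches, CAMPBELL_FRACTION)

print(f"Expected among all Matches:        {expected_all:,.0f}")
print(f"Expected among identified Matches: {expected_identified:,.1f}")
print(f"Actually found so far:             {found_on_campbell_line}")
```

Under that even-spread assumption the first figure is 15,625, so the "perhaps 10,000" above is a deliberately conservative round number; either way the gap to five known cousins is very large.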

I decided to turn this lemon into lemonade. The lemon is a recent Scottish immigrant ancestor – the lemonade is Scotland ethnicity. If this is the only part of my ancestry from Scotland, maybe I could use that information. When I Cluster my AncestryDNA Matches at the 20cM Threshold (the lowest cM amount with Shared Matches to each other) I get about 160 Clusters. 1/8 of those is 20 Clusters – a manageable number. So when I see some solid-looking Clusters without any hints of other ancestry, maybe they are from my Scotland line.

Here is one such Cluster. I clicked on the link for each Match and checked their ethnicity:

Every Match in this Cluster has 14% to 62% Scotland ethnicity. A few scattered Matches with Scotland ethnicity might be expected randomly, but for all of them to have significant amounts of Scotland ethnicity is a strong clue.
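The same check could be applied across all Clusters once the ethnicity percentages have been copied into a spreadsheet. The sketch below is only an illustration: the Cluster labels and percentages are invented, the 10% cutoff is an arbitrary assumption, and the function names are hypothetical. AncestryDNA does not provide this as a download, so the data would have to be entered by hand.

```python
from typing import Dict, List

# Hypothetical data: Cluster label -> Scotland ethnicity % for each Match,
# as hand-copied from each Match's page. All values here are invented.
clusters: Dict[str, List[float]] = {
    "cluster_A": [14, 22, 35, 62, 18, 41],
    "cluster_B": [0, 3, 25, 0, 8],
    "cluster_C": [0, 0, 2, 0],
}


def scotland_concordance(percents: List[float], min_percent: float = 10.0) -> float:
    """Fraction of Matches with at least min_percent Scotland ethnicity."""
    if not percents:
        return 0.0
    hits = sum(1 for p in percents if p >= min_percent)
    return hits / len(percents)


def candidate_clusters(
    clusters: Dict[str, List[float]],
    min_percent: float = 10.0,      # assumed cutoff for "significant"
    min_concordance: float = 1.0,   # 1.0 means every Match must qualify
) -> List[str]:
    """Labels of Clusters whose Matches consistently show Scotland ethnicity."""
    return [
        label
        for label, percents in clusters.items()
        if scotland_concordance(percents, min_percent) >= min_concordance
    ]


print(candidate_clusters(clusters))
```

With the invented data above, only `cluster_A` passes, because every one of its Matches is at or above the cutoff; a Cluster with one or two Scottish-looking Matches (like `cluster_B`) is treated as random scatter. Lowering `min_concordance` would loosen the test for Clusters that may contain an outlier.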

I think I can safely assume this CL149/14/[Scotland…] Cluster represents my Ancestor, John CAMPBELL – Ahnentafel 14M. If I knew the DNA segment, I could Paint this Cluster. I have several others that also show a pretty clear Cluster “picture”. Next I’ll be looking at some other Clusters which may even have a ThruLines Common Ancestor in them, but also have a lot of Scotland ethnicity – the ThruLines CA may be the outlier… With only one ThruLines CA I don’t have high confidence that it’s right. But a high concordance of Scottish ethnicity is a strong clue that the Cluster is on my CAMPBELL line.

The next step is studying any Trees in these Scotland Clusters to see if those Matches have some Common Ancestors among themselves… That will be the sweetest lemonade of all.


[22AU] Segment-ology: Using Ethnicity to Identify a Cluster TIDBIT by Jim Bartlett 20200612

6 thoughts on “Using Ethnicity to Identify a Cluster”

  1. Jim, thanks for sharing another clever way to investigate, within the bounds of the information Ancestry shares. Have you done genealogical research at ScotlandsPeople? I have found it to be a very interesting site – it was very helpful to me, in addition to a few family archives, record research online and in state and local archives, and collaboration with other researchers and DNA matches in Scotland, Canada and Australia. It tipped the balance for me in walking back my g-grandfather, with a common Scottish name, who emigrated from Glasgow, Scotland to Canada in 1865 and then to Oregon, USA in 1866.

    ScotlandsPeople has a fabulous collection of records and document images. They flesh out the information you can get even with an Ancestry world membership, which only has indexes for some of these records. It also has documents clearly describing Scottish history, customs and legal matters, which really helps clarify some of the information you find in the records – many things were quite different from what we are used to in America, for instance land ownership and inheritance. You can also search by locality in ScotlandsPlaces. You can do searches for free but have to buy credits to be able to see most of the images. It might seem expensive, but I have been pretty successful at narrowing down my searches by a variety of means before I decide to use credits. I have hit the target most of the time, so I have stayed within my limited budget, and it has been well worth the cost.


  2. Jim – I’m similar – 12.5% from Scotland – Orkney in my case. I map my genome, so with this many 20cM matches, I would do what I could to get the segment data (I’m sure you’ve done that). I would run Shared Clustering at the 6cM level, and then check every name against my master spreadsheet from 23, FT, MyHeritage and Gedmatch for this cluster. On average in my case about 1 in 30 has tested elsewhere. So if I have 120 names for this cluster, I’m expecting four to have segment data. I double check the shared matches on Ancestry and then label the segment. Shared Clustering is disabled for now, but I have my downloads.


    • Rich, Thanks for your process. I do basically the same. Our DNA is fixed, our Ancestors who passed this DNA down are fixed; and we are just trying to tease out the “picture”. As you indicated, I have done all my Triangulated Groups – the Start and End locations of 372 TGs have not changed in my spreadsheet in several years. I know the grandparents of almost all of these, and am working on Walking The Ancestor Back on most of them. Now the process is focused on Shared Clusters from AncestryDNA Matches – I’ve done several downloads over the past 6 months – my 4/18/20 download with 6cM threshold took several days. Clustering on 20cM results in 4787 Matches and 156 Clusters. It’s time-consuming but I’m working my way through that spreadsheet, which includes just over 200 known TGs – about a 4% rate. As a “clue”, I impute the TGs and MRCAs to all other Matches in the Cluster (when there is good concordance). These will help when I then run the Shared Cluster report on all Matches down to 6cM – there are many more MRCAs and TGs in those 120,000 additional Matches. If these additional Matches have Shared Matches (many of them do), then they are grouped with the appropriate Clusters. Another review of the whole list shows if the TG/MRCA concordance remains or if an adjustment is needed. Because the “truth” is locked in my DNA, these Clusters will tend to align. Not 100% absolute, but enough for focused analysis. Jim


  3. Nicely done, Jim! Good Lemonade!

    My DNA is on GEDmatch and FTDNA, but I’ve only recently tested again under the auspices of Ancestry.

    Though most of my correspondence with you has been about Bartletts, I do remember that I have a match with your son (also named Jim Bartlett?) that I don’t share with you. I remember discussing with you that it would be from your son’s mother, not you. Are your son’s results and tree on Ancestry? If so I would like to explore and see where my connection to him would be.

    Always nice to learn what you are up to. I’m still hoping for a breakthrough with the Bartletts!

    Best regards, Lou Mongan


    • Lou,

      Good to see you here! I sent you an email with a link to my son’s mother’s Tree. He is on FTDNA, 23andMe and MyHeritage, but not AncestryDNA.

      I, too, am hopeful that we can find a link from the VA/TN BARTLETTs back to the Mayflower BARTLETTS. The “trick” in this blogpost is similar to what we need for your BARTLETT line. Find Clusters that go back to John or Joseph or Nathan or Joshua and then look carefully in the Trees of all the Matches in those Clusters for BARTLETTs from the “other side”. Every match I find with a Tree I always look for BARTLETTs – and I’ve found several who have BARTLETTs from Plymouth, MA. That’s random, of course, but for each of you it would be an important clue.

      Here’s another “trick” I haven’t written up yet: Open your page with all DNA Matches and click on the Search button to get 3 search fields. Under surname list BARTLETT, under birth location list Plymouth County, MA – I just tried it and got over 50 DNA Matches, all with a BARTLETT ancestor from Plymouth, MA. See if there is a thread among them… Jim

