How To Detect False Matches Theory

Small-segment Matches (<15cM) with conflicting Shared Matches (no consensus) are probably false.

This Observation/Theory comes from two directions:

1. Science and observation tell us that (A) almost all Shared DNA Matches (Matches) >15cM are true; and (B) to varying degrees, some Matches below 15cM are false. There is a distribution curve: a small percentage of 14cM Matches are false, about half of 7cM Matches are false, and even larger percentages of smaller segments are false. It’s hard to sharpen this marshmallow of data, so please grasp the overall concept: as the cMs decrease below 15cM, the percentage of false Matches increases, down to sizes where most Matches are false. And don’t be confused – you can have a true Common Ancestor with a false Match – it’s just that there is no DNA link (or “proof”). In fact, most of our true 4th cousins (4C) and greater will not be DNA Matches at all. Another point is that there is no such thing as “partly true or false” – the shared DNA segment is all true (from a Common Ancestor) or all false (not from a Common Ancestor). Please don’t go down the “rabbit hole” that *part* of a small segment may be true – we are already down to small segments, and even tinier segments aren’t a step in the right direction, unless you have a very, very special case. That’s not the thrust of this blogpost!

2. My own observations of Matches with under-20cM shared segments, as I search/analyze this huge group (80,000) for Matches with a BROWN Ancestor – also a huge subset (thousands) of my under-20cM Matches.

Background 1: I’ve done the “homework”. Using the Walking-The-Clusters-Back process (starting here), I was able to identify hundreds of Clusters to specific parts of my Ancestry. Almost all to a parent; 98% to a grandparent; roughly 90% to a Great grandparent; etc. out to some that were tagged to 7xG grandparents (8C level). Overall, I have in my AncestryDNA Notes for almost all over-20cM Matches, *some* indication of their line in my Ancestry. Most of these have also been tagged to specific Triangulated Groups (TGs). Part of my analysis of the under-20cM Matches is a check of their over-20cM Shared Matches. Most have some Shared Matches. Sometimes, an under-20cM Match will have many Shared Matches in consensus – most in the same Cluster or TG. Sometimes they will not.

Background 2: My BROWN Project. For this project, I filter my Match List by Maternal (the side for my BROWN line); a cM range (e.g., 12 to 13cM); and the BROWN surname. I then look at each Match, and put something in the Notes box…

Background 3: Keeping Track.

[“Good” Matches] I’ll add (i.e., impute) the Cluster/TG to the Notes for each consensus Match. Note: some of these will be a consensus on the BROWN line; some will be a consensus on some other line. I also add a Note about the oldest BROWN I’ve found for each of these Matches, if it’s likely to tie to my BROWN line – a judgment call.

[When I’m finally done ferreting out all of these lines in a consensus BROWN Cluster and/or a good probability of a link to my Tree, I’ll look at the number of Matches in each family group, and really dig into the research – this is sort of like collecting clues in Quick-and-Dirty Trees to see what’s worth pursuing.]

[“Iffy” Matches] Some Matches don’t neatly fall into the Good or Bad category – I look at them and usually build their Tree back far enough to determine if a tie-in to my Tree is possible – judgment call.

[“Bad” Matches] I add a Note to Matches with no Shared Matches: “SM:0”. And I add a Note to Matches with various, conflicting Shared Matches (often on both sides): “SM – var”.
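
The Good/Bad tagging above can be sketched in a few lines of Python. This is only an illustration, not the actual workflow (which is done by hand in the AncestryDNA Notes box). The function name, the 75% consensus cutoff, the minimum of two Shared Matches, and the Cluster labels are all assumptions made up for the example; the tag is written with a plain hyphen (“SM - var”) to keep the code ASCII.

```python
from collections import Counter

def shared_match_note(sm_clusters, consensus_share=0.75, min_shared=2):
    """Suggest a Notes tag for one Match.

    sm_clusters: Cluster/TG labels of the Match's over-20cM Shared Matches.
    consensus_share, min_shared: hypothetical cutoffs, not from the post.
    """
    # Too few Shared Matches to judge: record the count as a warning.
    if len(sm_clusters) < min_shared:
        return f"SM:{len(sm_clusters)}"
    # Find the most common Cluster/TG among the Shared Matches.
    label, count = Counter(sm_clusters).most_common(1)[0]
    if count / len(sm_clusters) >= consensus_share:
        return label          # consensus: impute this Cluster/TG
    return "SM - var"         # various, conflicting Shared Matches

# Example with made-up Cluster labels:
print(shared_match_note([]))                        # no Shared Matches
print(shared_match_note(["C1", "C1", "C1", "C2"]))  # mostly one Cluster
print(shared_match_note(["C1", "C2", "C3"]))        # conflicting
```

The “Iffy” category is deliberately left out: that one stays a judgment call.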

Observation 1: Research/Tree building for Matches with “SM:0” and “SM – var” Notes almost always went nowhere. They had a BROWN Ancestor (a search filter), but their BROWN Ancestor was from England or Scotland or New England, etc. Clearly a very low probability of linking to my BROWN line in Colonial VA. For these Matches I added “X BROWN” in the Notes to remind me they’ve been looked at and discarded.

Observation 2: Finding a Match with consensus Shared Matches that had known/suspected BROWN Clusters/TGs was a BINGO! They *almost always* had BROWN Ancestry that linked to mine – whether it was in their Tree or in the Q&D Tree I built out for them. Some of the iffy Matches which had a few Shared Matches with favorable BROWN Clusters/TGs, but not a consensus, turned out to link to my BROWNs. Some did not.

In my BROWN project, I’m done with the 11cM to 19cM DNA Matches and am part way through the 10cM Matches – a lot still to do. I wanted to record my observations that DNA Matches who had a BROWN Ancestor pretty easily fell into probable/possible vs “no way” categories. And my growing belief is that Matches with “SM – var” are probably FALSE Matches – not going to be a genetic cousin on any line.

BOTTOM LINE: Matches under 15cM with various conflicting Shared Matches are probably FALSE. Certainly, they should be culled out to focus on Matches who have a clear consensus of Shared Matches on one line. This is *not* a guarantee, nor does it mean all other Matches are true. But given the sheer number of BROWN Matches to go through, I’m going to begin using this theory as a “TRIAGE” method. It’s the only way I can get through this BROWN Project.
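
As a rough sketch of the triage rule, assuming each Match has been exported as a (name, cM, note) tuple (a hypothetical layout; AncestryDNA does not provide this directly), the cull could look like this in Python. Only the bottom-line rule is applied: under 15cM and a conflicting “SM – var” Note. Everything else is kept for review.

```python
def triage(matches, small_cm=15.0):
    """Split (name, cM, note) tuples into keep and cull lists.

    A Match is culled only when it is under small_cm AND its Note
    starts with the conflicting-Shared-Matches tag.
    """
    keep, cull = [], []
    for name, cm, note in matches:
        # Accept the tag typed with either an en dash or a plain hyphen.
        tag = note.replace("\u2013", "-")
        conflicting = tag.startswith("SM - var")
        if cm < small_cm and conflicting:
            cull.append(name)   # probably false: set aside
        else:
            keep.append(name)   # still worth a look
    return keep, cull

# Made-up sample rows for illustration only.
sample = [
    ("Match A", 12.5, "SM - var"),
    ("Match B", 12.0, "C1"),
    ("Match C", 18.0, "SM - var"),
    ("Match D", 9.0, "SM:0"),
]
keep, cull = triage(sample)
print("keep:", keep)
print("cull:", cull)
```

This is a sorting aid, not a verdict: a culled Match is only “probably” false.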

[22BQ] Segment-ology: How To Detect False Matches Theory by Jim Bartlett 20230506

8 thoughts on “How To Detect False Matches Theory”

  1. What do you put in the notes box for unidentified Shared Matches that have a Brown in the family tree? I’ve been putting shared cM, number of segments, the name of at least one identified shared match, and the lineage of the surname of interest (e.g. Brown) in the tree of the unidentified Shared Match. Am I missing something? Is there a good way to abbreviate this information?

    • Emily, IMO there are two tricks: 1. be consistent; 2. do something that will make sense several months/years from now… I try to keep these in mind, but often stray…. My first Note is an MRCA Note if I have one. If not, I start with a Shared Match Note: SM: 0 or SM: 1 to indicate there is no help in this area (this is a warning to me, because almost all true cousins will show up with similar cousins as Shared Matches). SMs are my canary in the mine. Almost always on the next line I’ll start with X BROWN – to show I actually checked for a BROWN MRCA, and there was nothing that was likely. For those that have some potential (very subjective, involving time and geography, usually at the county level), I enter the name/dates for 2 or 3 generations in a Word document (over the past 5 months my BROWN doc has grown 105pp! with part being validated cousins, and the major part being potential cousins – I’m trying to see if large groups form around certain lines – I have 73 Matches to one line from Thomas BROWN b 1773 VA).
      An AncestryDNA Note entry might be: BROWN, David b 1750 Rowan Co, NC #O – where the #O indicates more info is in the Word doc. In this doc I have several Matches who descend from David, and/or are from Rowan Co, NC. When I’m done with the BROWN review at AncestryDNA, I’ll check for the largest family groups – I’ll also check the BROWN Y-DNA project at FTDNA to see if any of these BROWN lines have been tagged into Groups (my BROWN line is Group 40 with 16 kits in about 11 groups – I think the new info we’ve uncovered will tie most of these together under one Patriarch). For now I’m focused on seeing how under-15cM Matches will (or won’t) group and indicate related lines. Jim

  2. Thank you for these tips, I always take something new away! I will now start using your suggested shorthand for SMs. Do you have a suggestion on how to maximize the use of the colored dots in AncestryDNA?

    • Barb; I used to dot for maternal and paternal – but Ancestry has “sides” now. I dot a Match with a Common Ancestor when I’ve entered the Match and their line back to the CA into my Tree (I have several thousand to do, and this lets me keep track). I dot Matches with a TG (from one of the other companies or GEDmatch); I dot Matches with a “confirmed” TG. I dot Matches with incorrect ThruLines, to alert me to that fact. I’ve used dots for particular Projects (I dotted all HIGGINBOTHAMs; and am now dotting BROWN Y-Group 40 Matches). I dot Matches that may have a direct Y or mt path, so I can ask them about taking those tests. These dots let me work on specific lists and also provide a total number for each dot. I do NOT try to dot for most ancestors (there aren’t enough colors) – instead I use Ahnentafel numbers at the beginning of my Notes – prefaced by #A when I’ve confirmed the Common Ancestor couple; and by #L when I’ve imputed them from a consensus of Shared Matches.
      Your methods/processes may be different – so choose what *works* for you. Jim
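
      For readers new to Ahnentafel numbers: you are 1, your father is 2, your mother is 3, and in general a person numbered n has father 2n and mother 2n+1. A small Python sketch (the function name is made up for this example) shows how one number at the start of a Note carries both the generation and the side:

```python
def ahnentafel_info(n):
    """Return (generations_back, side) for Ahnentafel number n >= 1."""
    if n < 1:
        raise ValueError("Ahnentafel numbers start at 1")
    # Each generation back doubles the numbers, so the bit length
    # of n gives the generation (1 = self, 2-3 = parents, 4-7 = grandparents).
    generations = n.bit_length() - 1
    if n == 1:
        return 0, "self"
    # The bit just below the leading bit says which parent the line
    # starts with: 0 means father's side, 1 means mother's side.
    second_bit = (n >> (generations - 1)) & 1
    side = "maternal" if second_bit else "paternal"
    return generations, side

for number in (2, 3, 6, 13):
    print(number, ahnentafel_info(number))
```

      For example, 6 is the mother's father: two generations back, on the maternal side.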

  3. Thanks Jim for these thoughts. I spend most of my time reviewing matches under 20 cM. I plot them on DNA Painter and try to reconcile them with my Visual Phasing project which is probably 80% done. I have many conflicts. It is hard to write off any matches as totally “false”. Ancestry matches can’t be plotted, but they do have SideView predictions of side. Many of these smaller DNA matches on Ancestry have shared matches which are a mix of people labeled paternal and others labeled maternal. The new match is typically labeled by Ancestry as unassigned. I am coming to believe that I have enough mixture between my paternal and maternal lines that segments tagged to one side actually get passed through the other side. It gets confusing. I do not have classic endogamy. I get lots of well defined clusters, but also lots of grey squares between the clusters. As I continue to puzzle over my review and tagging of matches, I will keep in mind your 15 cM guideline.

    • Greg, In my Colonial Virginia ancestry, I know of 14 cases of Ancestors marrying Ancestors (only one is a close cousin marriage). My parents are distant cousins to each other. However, two specific distant cousins rarely share any measurable DNA. Nevertheless, I went to GEDmatch and used their utility: Are your parents related. It looks for any part of one of your chromosomes that matches the same part on the other side. Mine showed none. That means that Matches to my DNA are on one side or the other – there is no way a single segment could match on both sides of my DNA. I *do* have a number of Matches who share multiple DNA segments with me and we share a paternal Ancestor on one segment and a maternal Ancestor on the other – not too unusual for Colonial Virginia. But the segments (and the Common Ancestors) can be treated independently.
      And, yes, in the under-15cM *arena* (where most of our Matches are), almost anything can *look* possible with a False segment. Jim

  4. Great that you are documenting your conclusions about this Jim. You have such a great number of matches to work with, the data you are collecting will be a great resource for all of us. It’s also a great reminder about the richness of clues that lie beneath that 20cM threshold for those willing to keep digging in a systematic way!

    • Thank you! Of my 90,000 Matches at Ancestry, only about 6,000 are over-20cM – 93% are under 20cM. Even if half of those were false, it would still leave over 40,000 true genetic Matches under 20cM. There’s still a lot of gold to be mined…. Jim
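
      The arithmetic behind those round numbers, as a quick Python check (the 50% false rate is only the “even if half” assumption from the reply, not a measured figure):

```python
total_matches = 90_000   # approximate Ancestry Match count from the reply
over_20cm = 6_000        # approximate count of over-20cM Matches

under_20cm = total_matches - over_20cm        # Matches under 20cM
share_under_20cm = under_20cm / total_matches # fraction of the whole list
true_if_half_false = under_20cm // 2          # "even if half were false"

print(f"under 20cM: {under_20cm:,}")
print(f"share under 20cM: {share_under_20cm:.0%}")
print(f"true Matches if half are false: {true_if_half_false:,}")
```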
