Bad Segments – Good Segments

Posted on September 30, 2021 by Jim Bartlett

A Segment-ology TIDBIT

There are two ways of looking at small segments. But first, please remember that ALL of your own DNA is true; even the very smallest part of your DNA came from a parent as a true segment. What we are talking about when we discuss “small segments” are small shared DNA segments with a Match – segments which are determined by a computer algorithm comparing your (true) DNA with a Match’s (true) DNA. Below about 15cM some of those comparisons report a false shared DNA segment. The smaller the segment, the more likely that it is false. The false-reporting curve starts at about 0% at about 15cM, rises to about 50% at 6-7cM, and rises fairly dramatically below that.
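As a rough illustration of the curve described above, the sketch below draws a straight line between the two anchor points given in this post (about 0% false at 15cM, about 50% false at 6-7cM). The straight-line shape, the helper name `approx_false_rate`, and the use of 7cM as the lower anchor are assumptions made only for illustration; the real curve is not given here, and nothing is estimated below 7cM.

```python
from typing import Optional

# Anchor points taken from the post; the linear shape between them is an assumption.
UPPER_CM, UPPER_FALSE = 15.0, 0.0   # about 0% false at about 15cM
LOWER_CM, LOWER_FALSE = 7.0, 0.5    # about 50% false at 6-7cM (7cM used here)


def approx_false_rate(cm: float) -> Optional[float]:
    """Return a rough false-segment rate for a shared segment of `cm` centimorgans.

    Hypothetical helper: returns None below 7cM, where the post only says the
    false rate climbs steeply and gives no figure.
    """
    if cm >= UPPER_CM:
        return UPPER_FALSE
    if cm < LOWER_CM:
        return None
    # Straight line from (7cM, 0.5) to (15cM, 0.0).
    fraction = (UPPER_CM - cm) / (UPPER_CM - LOWER_CM)
    return LOWER_FALSE * fraction
```

Under these assumptions an 11cM shared segment lands halfway between the anchors, at roughly a 25% chance of being false. Treat that as a way to picture the trend, not as a measured value.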

In this post “segment” means a computer generated shared DNA segment.

1. Bad Segments: Small segments have a high probability of being false, and there is no easy way to tell whether one is a valid shared segment or not. And, perhaps, even if it’s a true segment, it’s probably from a very distant Ancestor – probably beyond your genealogy. These small segments get called names and are referred to as POISON – DO NOT USE! However, in this derogatory sense we are talking about NOT using these small segments as evidence; NOT as the basis of a hypothesis; NOT as part of a “proof”. These segments may not be worthless…

2. Good Segments: Shared segments are used by each company to identify DNA Matches, and report them to us. As noted above, the small segments may be true or false. But what if they lead us to a person who is really related to us, a cousin? If the “Match” has a Tree we can check it out. We can look at the information presented. Finding a Common Ancestor is only part of the possibilities. Maybe this Match-cousin has more information about our Common Ancestor than we do. Maybe they’ve found records we don’t have, written an interesting story, uploaded pictures we didn’t have. Maybe we can establish a dialog (message, email, phone, in person…). I have made lasting friendships with some of my Matches – with some of whom I still don’t know how we are related. The possibilities and opportunities are endless.

At AncestryDNA, ThruLines finds cousins with a Common Ancestor, down to 8cM (they used to go down to 6cM). I checked every one of them, and often found new information. With each DNA Match, keep your genealogy cap on. A small segment may in fact be false, but that doesn’t mean there isn’t a true relationship. Remember, about half of your true 4th cousins won’t share any DNA with you. My advice: don’t ignore a true cousin just because you share a small segment. Genetic Genealogists, myself included, have long stated that a Match with a Common Ancestor and a shared DNA segment does not necessarily mean the shared DNA segment came from the Common Ancestor. By the same logic, a relative who shares a Common Ancestor with you may or may not have a true shared DNA segment from that Common Ancestor.

If you are trying to prove a bio-ancestor, or a brick wall Ancestor, or some other relationship using DNA, don’t use small segments. If they cannot be proved to be true segments, they must be ignored as part of a proof. But, on the other hand, don’t ignore a paper-trail relationship just because you share a small segment. Learn what you can from a genealogy perspective and ignore the DNA.

Just my perspective as a long-time genealogist…

[22BD] Segment-ology: Bad Segments – Good Segments TIDBIT by Jim Bartlett 20210930

12 thoughts on “Bad Segments – Good Segments”

RH on October 31, 2021 at 9:31 pm said:

Is it possible for a 6-7cM match that triangulates with one or more of your matches to be false?

- jim4bartletts on November 2, 2021 at 2:13 am said:
  
  The short answer is yes. A longer answer is that your own DNA is always true, so if you can triangulate the 6-7cM segment with several other Matches (each matching all the others), it would indicate to me that the 6-7cM segment came from the same parent and ancestral line as the Triangulated Group is from – and it would be true. Actually, that’s not a good way to say it. The 6-7cM segment would almost certainly come from the same parent, grandparent, and great grandparent as the TG, but somewhere it might split the TG into parts (a small part and a large part – with the small part going back many generations to a CA). Your own DNA has to come from somewhere. Jim
  
HAK on October 31, 2021 at 2:44 pm said:

I recently noticed a few interesting things about “closer” matches at 23andMe. In one case I figured out how I match someone with whom I share a mere 5.5 cM of autosomal DNA. He was high on my list due to a shared 20-25 cM segment on the X chromosome. I dug for records and found the trail. Later I went back to look at this guy’s DNA against a known 3rd cousin who also descends from this line. (I love that feature.) Yikes, they share a 41 cM segment on one non-sex chromosome. (These 2 guys would be unable to share any true matching X chromosome segment, due to their specific pathway to the relevant pair of MRCAs.) I love that you caution us to be wary of matches <15cM but not to ignore them outright.

I am trying to figure out distant Jewish ancestry, AJ and/or SJ on up to 3 separate lines, apparently paternal and maternal. Fortunately I do have some 100% AJ matches who share 15-30 cM with me. I can have more confidence now in those matches at least. Thanks!

(One more thing. For potential matches involving SJ, for someone whose family has this Sephardic ancestry in their more distant past… you will see many, many scattered matches that you can GUESS (?) must be through common SJ. But most of these matches will be <15 cM. I match people who don't seem to share any of my other ethnic background (German, Frisian, maybe some Scandinavian and maybe some Polish/Sorbian or Czechian… and maybe French from a known 17th century guy named Moulin). And they are in Brazil or Cuba or Mexico or Turkey or Syria (a generation or two back)… who match each other and me… and other Europeans in places like the Netherlands, France, Italy, Hungary, Germany and England and Ireland… and a couple in Spain even… also a few Russians (who maybe now live in the USA) and Israelites. I try very hard to look for evidence of a match through a German in Brazil, for example. It's just never there… and they do match each other… not a bunch of other Germans with me. I'd sure like to figure this out one day. The DNA trail is pointing in a certain direction… but I KNOW that's all it's doing.)

- jim4bartletts on November 2, 2021 at 2:06 am said:
  
  HAK, Thanks for your feedback and comments. Remember that if someone shares two DNA segments with you, they could be from two different Common Ancestors – so the 41cM segment on a non-sex chromosome may be from one Ancestor, and the X chromosome segment may be from a different Ancestor (in fact it would have to be, if the segments were IBD). Jim
  
Fred on October 1, 2021 at 6:32 pm said:

I would like to present data on my short segment (6 cM & 7 cM) matches from Ancestry, which also have ThruLines (TL), along with my ideas about this. The results tabulated below are for all of my apparently valid TL matches, as presented with four different categories of shared matches, if any. The categories are (from left to right): 1. Total number of TL matches; 2. Number of TL matches with no shared matches; 3. Number of TL matches with one or more shared matches, but the majority of the shared matches seem to be from a different ancestral line than the TL match (referred to as misleading); and 4. At least one shared match, and 50% or more of the shared matches seem to be from the same ancestral line as the TL match (potentially useful).
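
Number of TL matches, by shared-match category:

| cM | Total | None | Misleading | Useful |
|----|-------|------|------------|--------|
| 6  | 32    | 7    | 3          | 22     |
| 7  | 49    | 8    | 6          | 35     |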

My primary genealogical goal here is to use the information from short segment TL matches to help identify long segment matches that are part of a cluster. The far right column (Useful) could help, although in some instances the genealogy would be complex; but the column just to the left (Misleading) would yield bad or mostly bad information for my purpose. The Useful column outnumbered the Misleading column by 6.3:1 in my case. I think that the 6 cM and 7 cM TL matches help me reach my genealogical goal. The reason the 6 cM row in the above table has fewer matches than the 7 cM row is mainly because of the 6.00 cM bar (as previously used by Ancestry instead of a 5.5 cM bar). Shared matches are with the 20 cM bar used by Ancestry (lower bar would be better for my purpose).
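The 6.3:1 figure can be reproduced from the tabulated counts. The short sketch below is only an arithmetic check; the dictionary layout and variable names are illustrative assumptions, and the numbers are copied from the table above.

```python
# Counts copied from the table above: cM -> (total, none, misleading, useful).
tl_matches = {
    6: (32, 7, 3, 22),
    7: (49, 8, 6, 35),
}

# Sanity check: each row's three categories should add up to its total.
for cm, (total, none, mis, use) in tl_matches.items():
    assert none + mis + use == total, f"{cm} cM row does not sum to its total"

# Sum the Useful and Misleading columns across both rows.
useful = sum(row[3] for row in tl_matches.values())
misleading = sum(row[2] for row in tl_matches.values())
ratio = useful / misleading

print(f"Useful:Misleading = {useful}:{misleading} = {ratio:.1f}:1")
```

Summed over both rows this gives 57 Useful against 9 Misleading, which rounds to the 6.3:1 quoted above.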

The trend in the cM data listed above, although limited, indicates that still shorter segments would probably be useful. The shorter segments would be expected to rely more on indirect detection of shared matches, which can occur on the same chromosome or on a different chromosome than the short segment match. Since the indirect detection of shared matches (previously introduced) can occur with an invalid match (or a valid match), it should even be possible with a 0 cM match, although the relative usefulness of the shared matches of 0 cM TL matches cannot be estimated from the limited data above. Conventional concepts in genetic genealogy (e.g., triangulation), or consideration of short segment matches in isolation, are not as valuable for evaluating the potential usefulness of the shared matches of short or very short segment TL matches.

Ancestry is unmatched in data size, quantity of trees, and their TL matches, but they do not have a chromosome browser. Use of small segment TL matches is an alternative way to help extract information about long segment matches, and Ancestry is in the best position to do this. Management of data size due to the large quantity of small segment matches (or maybe very small segment matches) would seem to be feasible by periodically filtering out those that do not have TL matches.

- jim4bartletts on October 2, 2021 at 7:29 pm said:
  
  Fred, I’m glad to present your data and analysis. As a genealogist of over 45 years, I’ve worked most of that time without DNA. So small segments are not a problem for me (I mostly ignore their size and focus on the Match’s Tree and records). However, I also focus on the Shared Matches – SMs are a strong indicator, and a valuable tool. When I find a small-segment Match who appears to be a 5C, I think of several things: a 5C has a fairly wide range of observed cMs down to 0, and including 6 and 7cM – so it’s in the range. I then look at the SMs – in some cases a clear majority of the SMs are from the same line as the 5C CA – this is very good news; in some cases a clear majority of the SMs are from a very different CA line (go back to the Match and look for that line – including extending their Tree to do so); and in some cases the SMs are all over the place – on both sides of my Ancestry. In each of these cases, I type the SM indication in the Notes box – usually with an X next to the 5C CA as probably not correct – and I use an Orange Dot (my way of “filtering out” this Match until I have more data). Jim
  
Perry Streeter on September 30, 2021 at 6:10 pm said:

Thank you for the statistics and insights!

With respect to your comments above that “I have concluded that 7cM shared segments which Triangulate are usually true. I’m pretty confident that 7cM segments which don’t Triangulate are false,” what statistics or further insights do you have regarding segments, as reported from the GEDmatch Tier 1 ($) Segment Triangulation tool, below 7 cM?

7 cM is the current default for general use of the Triangulation tool, but lower values can be used via Multiple Kit Analysis. I understand from above that one-to-one matches at 6-7 cM are about 50% false, but are triangulated segment matches at 6-7 cM also about 50% false? Stated another way, at what cM range are triangulated segments usually true?

- jim4bartletts on September 30, 2021 at 6:35 pm said:
  
  Perry, I have not done any simulations, or tried to pin it down. As you know, DNA is fairly random. Just like trying to decide on an umbrella when there is a 50% chance of rain – in the end, it either rains or it doesn’t. I will say: with larger segments creating a solid TG, the underlying SNPs are equivalent to phased data – they have to line up with the SNPs on one side of your DNA. *If* a 6-7cM shared segment shows as a Match to all the other overlapping segments in the TG at GEDmatch, then I would conclude it has the phased SNPs and is true. But I have no *proof* of that. At some point (and I don’t know that point), there are true small segments that are the same in most humans – they often show up as pile-ups – they could come from many different Ancestors. In this case we shouldn’t call them IBD (Identical because they are from a CA, because they may not be). I have some small segments from Neanderthals… So a word of caution – there is no absolute, and the further you deviate from solid footing, the more reinforcing evidence you need.
  And always check – one-to-one – any results from the Multiple Kit Analysis…
  Jim
  
  - Perry Streeter on October 1, 2021 at 4:05 pm said:
    
    Thank you!
    
cryptoref on September 30, 2021 at 2:36 pm said:

A related question. Suppose you have a match with multiple segments: a 40 cM segment and a 7.4 cM segment. Does the 40 cM segment change the odds that the 7.4 cM segment is IBS? Or are the odds independent of any other match?
Additionally, how does it affect the determination of a good versus bad match if the 7.4 cM segment has no other matches, or if it forms a group of matches?

- jim4bartletts on September 30, 2021 at 4:19 pm said:
  
  Cryptoref, the events are independent! Think of it from the DNA’s point of view – the DNA goes through the recombination process with no “knowledge” of the host, or potential Matches or anything else. And what happens on Chr 06 is physically separated from Chr 05. Each segment has an independent journey down to you. And the “segments” we are talking about here are really *shared* segments as determined by a computer algorithm which compares your DNA (which is all true) with a Match’s DNA (which also is all true), and finds overlaps which are the *shared* DNA. The process is completely independent at each event.
  I have worked with a lot of 7cM shared segments – some Triangulate with other shared segments, some do not. We know that about 50% of 7cM shared segments are true, 50% false. So I have concluded that 7cM shared segments which Triangulate are usually true. I’m pretty confident that 7cM segments which don’t Triangulate are false. NB: As with almost all segments – having a Match with a CA and a shared segment (or TG) is OK, but it does NOT necessarily mean the shared segment (TG) came from *that* CA – the shared segment could come from almost any CA – you cannot tell by the two facts (CA, TG) alone – you also need other evidence… Jim
  
  - cryptoref on September 30, 2021 at 10:33 pm said:
    
    That’s pretty much what I figured, but it’s always good to ask to make sure.
    