I am sometimes asked if Triangulation “always” works. And with that question, there are always the follow-up questions: with 15cM shared segments, 10cM segments, 7cM segments, 5cM segments, any size segments?
There are many things about DNA in genetic genealogy that are based on a distribution curve. And the distribution curve is based on experience. The classic example is IBD vs IBC segments. No one has reported, yet, any shared segment over 15cM that has proved to be not IBD. Very few examples exist for shared segments in the 10-15cM range. As we lower the range to 7cM, experience indicates that the percent IBD drops to about 50% range, give or take. The point is that there is a distribution curve, and we cannot say for certain that a shared segment below 15cM is IBD or not, just by the cMs.
Well, Triangulation is formed from shared segments. If we knew for a fact that the shared segments in a Triangulation were all IBD, then it would be an easy call. Given three IBD shared segments on the same segment area, from widely separated Matches, Triangulation should always work. But notice the qualifiers in that statement: three, IBD, same segment area, widely separated and should. This would be a very “tight” Triangulated Group, and we still need to say “should” because DNA is random and does not necessarily follow a set of rules like geometry.
In Triangulation we start with the shared segments reported by the various companies (23andMe, FTDNA and GEDmatch). The fact that three widely separated Matches match each other on the same segment significantly increases the probability that the shared segments are IBD. So the IBD distribution curve based on cMs for a single shared segment is shifted somewhat for Triangulation. Of course the question is how much.
In my experience, there have been a very few shared segments in the 10-15cM range that did not Triangulate. There are also some shared segments in the 7-10cM range that do not triangulate – the percentage goes up as you drop down to 7cM. This appears to be roughly in line with the non-IBD rate we see for shared segments. I have used a 7cM threshold for Triangulation for the past two years. I have not found any discrepancy, yet. About the end of 2014, I added shared segments in the 5-7cM range to my spreadsheet. Most of them did not Triangulate and were thus classified as IBS/IBC – this was expected. Some of them could not be categorized as there was no way to compare them (most comparisons at this level need to be made at GEDmatch). Some did Triangulate, and, so far, they have all “fit” in the TGs. A few of these have been very helpful.
I am comfortable saying Triangulation almost always works down to 5cM. The caveats include widely separated Matches, and an overlap of 5cM (estimated) for all segments. The TG “should” have one Common Ancestor. Eventually all TGs will be subdivided into even smaller TGs, and this will split the CAs between husband and wife – but that is another blog post, someday.
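As a rough illustration of the overlap caveat, here is a minimal Python sketch. It assumes each shared segment is recorded as a chromosome plus start and end positions already expressed in cM (company reports usually give base-pair positions and a cM length, so this is a simplification). The names `Segment`, `common_overlap_cm` and `is_candidate_tg` are hypothetical, and the 5cM default simply mirrors the estimate above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One shared segment between two people (cM positions are an assumption)."""
    person_a: str
    person_b: str
    chromosome: int
    start_cm: float
    end_cm: float


def common_overlap_cm(segments):
    """Return the length in cM of the region common to all segments (0.0 if none)."""
    if not segments:
        return 0.0
    if len({s.chromosome for s in segments}) != 1:
        return 0.0  # segments on different chromosomes cannot overlap
    start = max(s.start_cm for s in segments)
    end = min(s.end_cm for s in segments)
    return max(0.0, end - start)


def is_candidate_tg(segments, min_overlap_cm=5.0):
    """True when all segments cover a common region of at least min_overlap_cm."""
    return common_overlap_cm(segments) >= min_overlap_cm


# Hypothetical example: me vs B, me vs C, and B vs C on chromosome 7.
example = [
    Segment("me", "B", 7, 20.0, 32.0),
    Segment("me", "C", 7, 24.0, 40.0),
    Segment("B", "C", 7, 22.0, 31.0),
]
print(common_overlap_cm(example), is_candidate_tg(example))
```

Passing all three pairwise segments (me vs B, me vs C, B vs C) is what makes this a Triangulation check rather than a simple overlap check. It says nothing about whether the Matches are widely separated, which still has to come from the genealogy.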
However, the question remains – what is the IBD distribution curve for TGs? At some point, as we reduce the cMs for shared segments in a TG, there will be IBC TGs. We still have the issue that algorithms can create IBC shared segments, so it’s reasonable to expect IBC TGs over a distribution range. There is no report I know of that addresses this distribution, yet. We may need to have completed chromosome maps for such an analysis.
11A Segmentology: Does Triangulation Always Work; Jim Bartlett 20150510
Was there a follow up on triangulation and endogamy? Sorry, but I couldn’t find another blog entry when I did a search for “endogamy” and “triangulation”.
MY HOPEFULLY RELEVANT QUESTION IN CONTEXT
I’m matching a lot of people on the same stretch of the same chromosome. I can see this with the triangulation tool at GEDmatch and FTDNA. For a few reasons, I know that these are people who are Jewish. I can also do a search on my matches to single out a sub-set of Ashkenazi at MyHeritage, and many of them also match me on the same bit of DNA. At FTDNA I can use the chromosome browser and set it at 10+ cM and they’re still there. Many of these matches often share a second bit of DNA too. I think it has to be more than just random chance (IBC). But here’s the thing, I’m German. So this is a bit odd or unexpected. Also, if I do have a distant relative who was Jewish, would he or she actually be a much more distant ancestor due to the effect of endogamy? I’m trying to focus my efforts to find relevant records. Trickier if you are German. Should I look for an ancestor going back to the 1700’s or much, much earlier given the potential effect of endogamy? I’d think that there would be almost zero chance of finding any documentation that predates the 17th century. Can I at least conclude that it’s highly probable that I did have a Jewish ancestor, albeit very distant?
Sorry to be so late in responding. I would say, yes, you have a Jewish ancestor, and because of endogamy, he/she is farther back than the shared segments would indicate. My advice is to work Matches one segment at a time. Try to form Triangulated Groups, then work with the Matches in each TG. Often you can tell from the surnames when you have a Triangulated Group of Matches. Good luck, Jim
Wonderful information, and I have a question. I fully realize that the GEDmatch Tier1 Triangulation tool is not the only means of arriving at a valid triangulated match, but for me it seems to be the easiest. In these reports [Tier1 Triangulation], GEDmatch indicates that they do not show segments under 7cM. So for all those who believe in small segments, their arrival at a valid match will NOT include results from this tool. Is that another point for the unreliability of small segments: that the best triangulation tool cannot be used? M E Ray
Yes, the best Triangulation tool cannot be used. Sorry. I’m also sorry that small segments are, in fact, unreliable. We cannot change a known unreliability by using a different tool. You can use GEDmatch’s tools to compare two random people with a 3cM threshold, and often get several small shared segments. How do you know if they are IBD or not? Particularly when the statistics tell us that a high percentage is not IBD.
Another way to look at this is to understand that you and I have over 98% matching DNA, and we have over 90% matching DNA with a chimp. So we share a Common Ancestor with a chimp. But that’s not genealogy. The SNPs in atDNA are selected because they tend to vary the most, but individually each SNP can only have one of 4 values, and each one has the same value most of the time. If we just looked at a few SNPs in a row, most of us would match. And indeed, any two of us have a Common Ancestor – but again, not necessarily in a genealogy timeframe. So we rely on longer strings of identical SNPs (in shared segments) to narrow down our matching closer to a genealogy timeframe.
And from another viewpoint, suppose we used a different tool, and found a 3cM shared segment. The “average” Common Ancestor at that level is probably more than 20 generations back. Just because you find a Common Ancestor back 8 generations, even with several Matches, how can we be confident that that’s the correct genetic Common Ancestor? We cannot do that with a simple GEDmatch tool.
The issue is Mathematics. The best explanation of these issues is in a roughly 40 page white paper that hits rather difficult levels of calculus and is available from Ancestry.com. The white paper does walk you step by step through the equations. Still, a reasonable knowledge of calculus is required to understand the explanations.
If you have taken College level Statistics, you can more or less understand the issues. With all of that said, Jim gave a very reasonable explanation. Sadly, any better answer than Jim gave requires higher Mathematics.
Thanks, Sam! There are several probability distributions at play here, working in different directions. It’s hard to condense it all into words, or even diagrams. Jim
To somewhat change the subject to a less complicated area, a 6cM match at Ancestry.com is rated by Ancestry with a 15% chance of being accurate. If one did a summation, it would take a lot of matches to reach 50%, assuming that doing a summation is a valid tool in this case.
Pingback: Does Triangulation Work? | segment-ology
“As we lower the range to 7cM, experience indicates that the percent IBD drops to about 50% range, give or take. The point is that there is a distribution curve, and we cannot say for certain that a shared segment below 15cM is IBD or not, just by the cMs.”
First, my non-contagious exposure to calculus was in class 1966-3 of the USN Nuclear Power School. I also took business-level statistics in the late 1970’s at Woodbury University.
I have a problem with the idea that “triangulation” is the one and only way to determine matches. A 7cM match represents roughly 2% of the possible matches and discards the other 98% as IBS.
For privacy reasons I’ve not given the ID of the matching tree. I’ve substituted numbers instead.
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 04-sam.png
Barnes_Brinsley-1715 none AncestryDNA 06-sam.png
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 07-sam.png
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 08-sam.png
Barnes_Brinsley-1715 none AncestryDNA 10-sam-N.png
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 13-sam.png
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 14-sam.png
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 15-sam.png
Barnes_Brinsley-1715 none AncestryDNA 16-sam.png
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 17-sam.png
Barnes_William-1678 none AncestryDNA 19-sibling-sam.png father of Barnes_Brinsley-1715
Barnes_Brinsley-1715 none AncestryDNA 19-sibling-sam.png
Barnes_Brinsley-1715 none AncestryDNA 21-sam-N.png
My 2nd-1r cousin
Barnes_Thomas-1662 none AncestryDNA 01-BA.png Uncle of Barnes_Brinsley-1715
Barnes_Brinsley-1715 none AncestryDNA 02-BA.png
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 03-BA-N.png
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 04-BA-N.png
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 09_1-BA.png
Barnes_Brinsley-1715 none AncestryDNA 11-BA.png
Barnes_Thomas-1662 none AncestryDNA 12-BA.pdf Uncle of Barnes_Brinsley-1715
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 18-BA-N.png
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 18-BA-N.png
Barnes_Brinsley-1715 none AncestryDNA 20-BA.png
Barnes_Brinsley-1715 Lindley_Sarah_Elizabeth AncestryDNA 05-BA-N.png
Barnes_Brinsley-1715 none AncestryDNA 20-BA.png
This is 22 ancestral matches plus 2 Uncles. Going back into the 1600’s, I suspect that the 2 uncles are a result of errors in the tree and in fact are the grandfather with the wrong info. But let’s drop those two as questionable, leaving 20 matches.
I will use your 50% probability
Summation = (50%)+(50%*.5)+(25%*.5)+(12.5%*.5)+(6.25%*.5) . . .
=50+25+12.5+6.25+3.125 . . .
=96.875% . . .
Conclusion: if ANY 5 of the 20 matches (from AncestryDNA) are over 7cM, then the chance that Brinsley Barnes-1715 is the ancestor of B A and me is in the 90% plus range.
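The arithmetic in this comment can be reproduced with a short Python sketch. It takes the commenter's premise at face value: each match is treated as an independent piece of evidence with a 50% chance of being IBD. That independence is an assumption, and it is disputed in the reply that follows. The function name `cumulative_chance` is hypothetical.

```python
def cumulative_chance(n_matches, p_single=0.5):
    """Chance that at least one of n independent matches is IBD.

    Assumes the matches are independent, which is the commenter's premise
    and not an established property of shared segments.
    """
    return 1.0 - (1.0 - p_single) ** n_matches


# Running total for 1 to 5 matches: 50 + 25 + 12.5 + 6.25 + 3.125 ...
for n in range(1, 6):
    print(n, f"{cumulative_chance(n) * 100:.3f}%")
```

With five matches the sketch gives 96.875%. Under these assumptions the figure is the chance that at least one of the segments is IBD; on its own it does not identify which ancestor the segment came from.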
It doesn’t work that way. The point is many of the 7cM shared segments are IBS. You cannot add a bunch of them together to change that. By your alternative process, we could take a pile-up area with hundreds of small shared segments and “prove” this clearly IBS area was an IBD. This is clearly not right – we cannot take a bunch of IBS segments (that don’t triangulate in one small area), and work the math to prove they are IBD. A shared segment is either IBD (on one chromosome, from an ancestor), or not. Triangulation is one way to sort this out. Phasing is another way.
Mr Barnes may be an ancestor of you 3. The fact that you all share some matching segments indicates that. It’s the summation of probabilities to “prove” it, that I object to.
Never claimed that I “Proved” anything. I do claim that a matching segment that meets the 50% threshold with one atDNA match has its probability increased as greater numbers of 50% threshold matches for non-close relatives are found. The Zeno’s Paradox equation that I used is very basic Calculus and can also be used statistically. What it can never do is prove certainty. But, it can get rather close.
I definitely agree that pile-up areas need to be treated differently than other areas. That is why I specify a specific percentage threshold in my response, rather than a cM threshold as I included in my first post. Note, as part of our final exam in the base (Academic) portion of USN Nuclear Power School class of 1966/3, we had to derive the Schrödinger equation from memory. I am much more comfortable with deriving the units of statistical analysis expressed as imaginary units of area called “barns” using Schrödinger’s equation than I am (from what I have seen) with the way that the units of statistical analysis expressed as units of imaginary length called centiMorgans are calculated.
In fact, I believe that your objection is to my using such an imprecise measurement as the centiMorgan, not to the Equation. Yes, I agree that 7.0cM segments have decidedly less than a 50% probability in pileup regions. I also agree that using your statement, “As we lower the range to 7cM, experience indicates that the percent IBD drops to about 50% range, give or take.” without going into the problems with the cM derivation and accuracy was a mistake on my part.
cMs are derived empirically, and are in lookup tables. As the companies report them, they are not precise (I call them fuzzy). But they are the best measure we have for shared segments (see a previous blog post), and for genealogy (and Triangulation), they work very well. For shared segments below about 10cM, they provide a good warning flag that the shared segment may be IBS.
“- – – they are the best measure we have for shared segments” Absolutely agree. Calculating probability of interactions in Nuclear Physics, is much less complex and has been done for 90 years. The equations that I’ve seen for deriving cMs are much more complex and much newer. As retired military, we both understand that using the least bad data or procedure is frequently the best that we can do.
Sam – As I’ve outlined in the blog, the DNA tests are extremely accurate (all 3 companies), and the reported shared segment data is entirely adequate for determining Triangulated Groups (which provide a cadre of Matches who share a Common Ancestor), and mapping our genomes. This whole process is much more accurate than your average obituary, Will, marriage bond, census record, etc. DNA is a great tool, and a boon to serious genealogists.
Still, the equation that I used is a well proven Classic. So, the possibilities are that my 70 year old mind is failing, cMs are a rather imperfect measurement, my original conclusion is reasonably valid or some combination of the three options.
Are there any statistics out there that relate probability of IBD to matching segment size in cM? I’m curious about what “very high at 10cM” actually means. Is it 75%, which can be seen as halfway between the 50% at 7cM and virtually 100% at 15cM, or is it upwards of 90%, making the curve much steeper?
Thanks for your great posts,
Try this site at ISOGG: http://www.isogg.org/wiki/Identical_by_descent
I think the IBD rate at 10cM is over 90%. We don’t have a lot of posted data to go on. The curve is steeper between about 5 and 10cM and then levels off at 15cM.
“As we lower the range to 7cM, experience indicates that the percent IBD drops to about 50% range, give or take.” I am curious to know your thoughts regarding the major testing companies’ definitions of “matches,” and what efforts they have made and should make to reduce the percentage of “false positive” matches? For instance, my FTDNA matches each have at least a single segment of 7.69cM or longer, and a total of segments of 20cM or longer. Thanks!
Robert, As you note, there is a curve for IBD vs cM that we have determined by experience. It’s virtually 100% at 15cM, very high at 10cM, dropping ever more until it’s around 50% at 7cM and dropping ever more quickly beyond that. However, we know there are some 1cM, 3cM and 5cM segments that are IBD – they are just very hard to find. All three companies have algorithms to determine shared or “matching” segments. Since there are always some 8cM shared segments which are IBD and some which are IBS, the companies try to strike a balance. By most accounts, FTDNA has the most stringent algorithm (they should have the lowest IBS rate); followed by 23andMe; and then AncestryDNA (IMO). Ancestry claims to offset its 5cM threshold by using population phasing (a weak substitute for phasing with your parents). You can set the threshold at GEDmatch wherever you want – the default is 7cM and 700 SNPs.
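To make the shape of that curve a little more concrete, here is a rough Python sketch. The anchor points come from this discussion (about 50% at 7cM, "over 90%" at 10cM taken here as 0.90, virtually 100% at 15cM); the straight-line interpolation between them is purely an assumption, not measured data, and the function names are hypothetical. Nothing is returned below 7cM because no figures are given for that range.

```python
# Anchor points from the discussion; the linear interpolation is an assumption.
ANCHORS = [(7.0, 0.50), (10.0, 0.90), (15.0, 1.00)]


def rough_ibd_estimate(cm):
    """Return a rough IBD probability for a segment of `cm`, or None below 7cM."""
    if cm < ANCHORS[0][0]:
        return None  # no figures are given below 7cM, so none are invented here
    if cm >= ANCHORS[-1][0]:
        return ANCHORS[-1][1]
    for (x0, y0), (x1, y1) in zip(ANCHORS, ANCHORS[1:]):
        if x0 <= cm <= x1:
            return y0 + (y1 - y0) * (cm - x0) / (x1 - x0)


def passes_default_threshold(cm, snps, min_cm=7.0, min_snps=700):
    """Apply the 7cM / 700 SNP default mentioned above (inclusive bounds assumed)."""
    return cm >= min_cm and snps >= min_snps
```

Both functions are only a way to think about the numbers quoted here; a real curve would need the posted data that, as noted, we do not yet have.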
Jim; Thanks for the blog. Can you clear up a confusion for me? If three of us from different families triangulate on a segment assigned to Mr Jones Jr. and he got his segment from his father, how do I know that my segment didn’t come down to me through another child of Mr. Jones Sr.? Would I have to rely on the genealogy and time frame to determine which Mr. Jones I descend from?
Every Match with overlapping segments in a TG will have a Common Ancestor – an ancestor who passed the DNA segment to each of them. Which ancestor you all share is determined by genealogy.
Pingback: Segmentology.org by Jim Bartlett | DNAeXplained – Genetic Genealogy
Israel, Thanks for the kind words. The short answer is that from a genetic viewpoint, each Ancestor is different (even though different spaces in the Tree are occupied by the same individual). Given the randomness of DNA, it is extremely unlikely that exactly the same segment would come down different paths. So from the DNA’s perspective, each ancestor block is independent, and the TGs will be different. More later…
Israel, you are correct, and I must be careful. I am trying to write this blog for the majority of genealogists. I want to present things that many folks can use successfully. With atDNA, and the wide variety of our ancestries, not to mention the randomness of DNA itself, as well as the objectives of each person, there are many different facets to genetic genealogy.
For most I think it’s a good idea to learn Triangulation using larger segments at first. Learn the process, get the “feel” of how the segments overlap and group. As for myself, I didn’t explore any segments below 7cM until I had over 80% of my chromosomes mapped. Then it was relatively easy to see where the smaller segments “fit”, and compare to Matches in the TGs.
To me, the distribution curve for IBD segments vs cM is important. It tells us what to expect for various segments. I’d like a similar curve for IBD segments which Triangulate vs cM. Can we really expect almost all 7cM, or even 5cM, segments in TGs to be IBD? Is this curve the same as the IBD vs cM curve, or not? Some genealogists like to play it safe; others are willing to work on the fringes, with an understanding of the probabilities.
Endogamous populations don’t really factor into Triangulation. Triangulation is a pretty mechanical process which is not affected by who your Ancestors are. Endogamous issues do come into play with assigning TGs to a side and, particularly, in determining the correct Common Ancestor. I hope to address this in a later blog post…
You do very important work.
I shall wait until you address the issue of endogamy and triangulations and make my comments then, as necessary.
I am learning so much by reading (and re-reading) your blog posts. Thanks so much for sharing your knowledge and experience! I am particularly eager for the post where you plan to discuss how to determine that you have identified the correct common ancestor when said ancestor is part of an endogamous population.
I recently chanced upon an interesting discussion on the Facebook ISOGG page regarding a “gateway ancestor” identification service where a link was given to this article: http://ourpuzzlingpast.com/geneblog/2015/04/19/genealogy-and-autosomal-dna-matches-common-errors-in-proving-an-ancestor-and-the-allure-of-easy-gateway-ancestors/
I think I mostly (although not entirely) understand the criticism of the gateway ancestor service (I had my suspicions that this service was hype even before reading the criticisms) but the post also comments upon difficulties of determining the correct common ancestor in endogamous populations and concludes with the following:
*** Trying to make genealogical proof arguments out of single segment DNA matches is a challenge for which I believe few are prepared. To unravel these deep connections will take much time and money as large groups of people will need be tested, and uniparental DNA (Y and mitochondrial) testing may be required to rule out possible lines of descent. Given large enough data sets of tested individuals some innovations may eventually be accepted as “proof”, such as AncestryDNA’s “Circles”. However, the AncestryDNA Circles, and even more so AncestryDNA’s recently introduced New Ancestor Discoveries, are still in the early stages of being user tested and for now cannot alone be used as “proof” in genealogy. ***
I’d definitely be interested in your take on this post by Puzzled.
Peggy – As with most things – I agree with some of Puzzled’s post, but not all. I agree with the last sentence. I agree that we cannot “prove” anything with a single segment match – there are too many variables and alternatives. Such a match is just one piece of evidence. I don’t subscribe to the concept that you have to test many of your close relatives. It helps, yes, but it’s not required. Just see what Adoptees are doing with no ancestry (including close relatives), to start with. And, although Y-DNA and mtDNA are powerful tools, they have very limited use across our ancestry.
At this time the number of folks getting an atDNA test is about doubling every 12 months – the number of random 3rd, 4th, 5th, etc cousin-matches you have today, will double in the next 12 months (don’t get behind – there are more Matches coming all the time). To me, this means many of us can form Triangulated Groups (TGs) that cover most of our Ancestry (yes it does take some time and effort). And these TGs will double in size (number of Matches) every 12 months. The TGs will be fixed over much of your DNA. And with all the new Matches, there is a growing probability that some of them in each TG will help determine some Common Ancestors.
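The doubling claim is simple compound growth, sketched below in Python. It assumes the 12-month doubling rate stays constant, which the post only describes as the situation "at this time". The function name and the starting count of 6 Matches are hypothetical.

```python
def projected_matches(current, months, doubling_months=12):
    """Project a match count assuming it keeps doubling every `doubling_months`."""
    return current * 2 ** (months / doubling_months)


# Hypothetical starting point: a TG with 6 Matches today.
for months in (12, 24, 36):
    print(months, round(projected_matches(6, months)))
```

Under that assumption a TG of 6 Matches would grow to 12, 24 and 48 Matches after one, two and three years.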
So the “proof” will be in the form of preponderance of evidence. Although each Match may share several CAs from an endogamous population, all of the various Matches in a TG will not have that same group of CAs with you. In other words, when several, widely separated (distant cousins to each other) Matches in a TG all agree on the same CA, we then have Genealogy Triangulation (GT). The more Matches in a GT the stronger the evidence. This, too, is hard work, and takes time. But to me, the “proof” will be 4 or more Matches with the same CA (i.e. GT) in a TG. Chromosome Mapping will provide a Quality Assurance feature to the result, as we check the crossover points by generation. All to be discussed in future posts…
By the way, I am a fan of AncestryDNA’s Circles as a Hint. They do some of the work for you, but the Matches in a Circle could be from different TGs and different GTs. Although I note the Circles, and study each person in them (I am still a genealogist), I cannot assume anything dealing with DNA, until I’ve seen the shared DNA segments. No comment on NADs yet.
We must be careful – each in his own work – when thinking in terms of a probability curve. It is too easy to say that low probability is automatically to be ignored. Twenty percent is not the same as “never.” Twenty percent in fact means that once in five times, it is likely to be valid. If you automatically chop off the low probability segments, you are throwing out good data some non-negligible percent of the time.
But of course we must be conservative, especially when dealing with endogamous populations.
This is not an NFL injury report where “probable” is supposed to mean 75% but in fact means “Don’t be silly. Certainly he will play Sunday.”
I match 3 siblings (full siblings) on a shared segment, but I match each one to a different degree thanks to the random nature of autosomal DNA, ranging from more than 15cM down to slightly more than 7cM. I do not match the mother, so we know the match is on the paternal side and we know exactly where we match. We are 6th cousins. FTDNA estimated that the one I have the largest match with is a likely 3rd cousin, the next a 4th cousin, and the smallest segment match a 5th cousin. Obviously, since I know they are siblings, they are all the same relationship to me: 6th cousins. This gives us a good example of why we call these “estimates.”
I won’t remove comments here that I disagree with. I misread your comment to apply to everyone. You’ve now clarified that you are only speaking about your own situation. I’d prefer that we limit comments on this blog to the substance of the blog posts. Thank you.
Kanali – you are entitled to your opinion. My opinion is that we’ll find most Common Ancestors of our TGs within 4-5 centuries, and most of those within the past 2-3 centuries – since 1700.
This isn’t my opinion, I was stating a fact with my personal encounter. I wasn’t saying anything negative. I think you’re just misinterpreting what I am saying.
I was simply stating that in my case, being a Polynesian, I can have a lot of triangulation with Polynesians of other island nations, yet our common ancestor goes back centuries, not recent times. I share 300-400cM with Maoris, and my mother shares in the 600s and 700s of cM with Maoris, on FTDNA. But Maoris and Hawaiians have been separated since around 1200AD, with no contact between them until the 1800s, but I’m not talking migration.
This wasn’t meant to offend you, but feel free to remove my comments.
My issue is sharing the same segments with multiple people, yet cannot find a common ancestor because it goes back centuries. 🙂
Some TGs may go back centuries (I have found Common Ancestors with many 7th thru 10th cousins); other TGs may be 2-4th cousins (I’ve found those, too). What Common Ancestors you find depends a lot on how much of your Tree you’ve documented, and how much your Match has.
Definitely true! If only our trees could go back at least 8 centuries ago when our ancestors with these triangulated matches lived.
An excellent blog Jim 🙂
It leaves me wondering what you mean by “widely separated” AND
Do you think the statistical probability curve alters its characteristics the more occupants there are within an overlapping cluster, that are deemed to be ICW / matching each other?
To me, MORE signifies greater statistical confidence.
Graham, “widely separated” means the three Matches are distant cousins; as opposed to two of the Matches being parent and child – which isn’t really Triangulation (although they both can be in a TG). For your other issue, I’ll address that in a separate blog post – but the short answer is yes.
My name is Valerie and I believe we have a DNA match. I found information indicating this on GEDmatch, but suddenly the information is no longer there. I don’t know if it was just an error on the part of the website, but I would very much like to find out if we are indeed related. I sent you an email via Gmail a few days ago but am not sure if you check that often. Hope to hear back from you. If you check your Gmail account, you should be able to find my email and reply if you would like.