Small Segments and Triangulation

How small can we go with triangulation?

Anecdotal evidence indicates that “almost all” shared segments above 15cM are Identical By Descent (IBD). The distribution curve of random events always has a tail, so we cannot say 100%.

From my experience (mapping over 90% of my chromosomes), I am confident that triangulation can tighten the distribution curve so that “almost all” segments down to 7cM in a Triangulated Group (TG) are IBD. I say this because I find some 7-10cM shared segments which do not triangulate with the TG on either the maternal or the paternal side. Although several segments in each TG triangulate with each other, some shared segments with the same “address” do not. To me this is proof positive that the shared segments which do not triangulate must be Identical By State (IBS), meaning, in this case, not IBD. And the number of such 7-10cM shared segments which don’t triangulate, and are thus IBS, seems to agree generally with the percent IBS reported in the ISOGG Wiki.

However, the fact that the segments in a TG do triangulate does not, in my mind, provide a 100% guarantee that they are all IBD. The same is true for a random shared segment in the 10-15cM range – most, but not all, are IBD. But in the aggregate, when we have, say, 20 shared segments in a TG, usually of various cMs, this pretty much defines that area of the chromosome as coming from an ancestor. If 1 or 2 of those triangulated shared segments turn out to be IBS, it’s not harmful in the grand scheme – we are looking for a Common Ancestor (CA) for the TG, and generally find only a few Matches in the TG who have a robust enough Ancestral Tree to help with this goal. We are looking for several such Matches to confirm the same CA. Having a close cousin in the TG increases our confidence in the CA. As our Match list doubles over the next 12 months, so too should the number of Matches in each TG, adding to the preponderance of evidence for both the TG and the CA. The key is that several distant cousins all agree on the same CA for the TG – this, too, adds to our confidence level.

My chromosome mapping has resulted in about 350 defined TGs which are adjacent to each other (“heel and toe”) – covering long stretches of each of my 45 chromosomes, with only a few bare spots over 10cM. All new Matches have shared segments which easily “fit” into, and triangulate with, existing TGs – except a small percentage in the 7-10cM range which don’t and are then labeled IBS. This has also added to my confidence that triangulation, down to 7cM shared segments, is a good process. The outline of my chromosome map is coming into sharper focus, with fairly well defined crossover points, and ambiguities are fading away.
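
As a rough illustration of what “fits into, and triangulates with” means, here is a minimal Python sketch. The record layout (Segment), the function names, the Match names and all positions are hypothetical, made up for this example; they are not GEDmatch output or any tool’s API. The rule it applies is the one described above: two Matches share overlapping segments with me at the same “address”, and they also share a segment with each other over that overlap.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Segment:
    """One shared segment between two people (hypothetical layout)."""
    person_a: str
    person_b: str
    chromosome: str
    start: int   # start position, base pairs
    end: int     # end position, base pairs
    cm: float    # segment length in centimorgans


def overlap(a, b):
    """Return the (start, end) overlap of two segments, or None."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return (start, end) if start < end else None


def triangulated_pairs(my_segments, cross_segments, min_cm=7.0):
    """Pairs of Matches whose segments with me overlap and who also
    match each other somewhere inside that overlap.

    In my_segments, person_b is assumed to be the Match.
    """
    pairs = []
    usable = [s for s in my_segments if s.cm >= min_cm]
    for s1, s2 in combinations(usable, 2):
        region = overlap(s1, s2)
        if region is None:
            continue
        m1, m2 = s1.person_b, s2.person_b
        for c in cross_segments:
            same_people = {c.person_a, c.person_b} == {m1, m2}
            touches_region = (c.chromosome == s1.chromosome
                              and c.start < region[1]
                              and c.end > region[0])
            if same_people and touches_region:
                pairs.append((m1, m2))
                break
    return pairs


# Made-up data: three Matches at the same "address" on chromosome 5.
my_segments = [
    Segment("me", "Ann", "5", 10_000_000, 25_000_000, 12.0),
    Segment("me", "Bob", "5", 12_000_000, 24_000_000, 9.0),
    Segment("me", "Cal", "5", 11_000_000, 20_000_000, 7.5),
]
# Ann and Bob also match each other there; Cal matches neither.
cross_segments = [
    Segment("Ann", "Bob", "5", 12_000_000, 24_000_000, 9.0),
]

print(triangulated_pairs(my_segments, cross_segments))
```

In this made-up case only Ann and Bob triangulate. Cal’s segment sits at the same address but does not triangulate with this group; it would be checked against the TG on the other parent’s side before being labeled IBS.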

With this “success”, I’ve been including shared segments in my analysis down to 500 SNPs and 5cM by adjusting the thresholds at GEDmatch. Almost all of the 5-7cM segments do NOT triangulate, and are thus IBS. A few do triangulate – my guess is about 5-10%. This seems reasonable to me, as there are 5cM shared segments which are IBD. I’m adding these into my TGs, but color coding the small cM value to highlight it. To date I cannot recall any which have resulted in a confirming CA. Most of these 5-7cM IBD segments may well be from an even more distant CA. I also include shared segments down to 5cM from close, known cousins. Most are also IBS, but a few of them, so far, agree with the TG CA, and are probably IBD.
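
The bookkeeping described above could be sketched like this in Python. This is only an illustration under assumptions: the function name and the label strings are invented here, and the “unresolved” label for non-triangulating segments of 10cM and up is my placeholder, since the text only labels the smaller non-triangulating segments as IBS. The 500 SNP and 5cM thresholds and the 7cM flagging point come from the text.

```python
def label_segment(cm, snps, triangulates, min_cm=5.0, min_snps=500):
    """Label one shared segment (hypothetical labels, for illustration).

    cm, snps      -- size of the shared segment
    triangulates  -- True if it triangulates with an existing TG
    """
    # Below the analysis thresholds: not looked at.
    if cm < min_cm or snps < min_snps:
        return "excluded"
    if triangulates:
        # 5-7cM segments go into the TG but are highlighted.
        return "TG (small, flagged)" if cm < 7.0 else "TG"
    # Small segments that do not triangulate are treated as IBS.
    # Larger ones are left open here (an assumption, see the lead-in).
    return "IBS" if cm < 10.0 else "unresolved"


for cm, snps, tri in [(6.2, 650, True), (6.2, 650, False), (8.4, 900, True)]:
    print(cm, snps, tri, "->", label_segment(cm, snps, tri))
```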

The problem is we don’t have a good test for IBD vs IBS. Some have used results from phased data to develop rough percentages for IBD/IBS ratios vs cMs for shared segments. I’ve seen no distribution “curve” yet. We don’t have such data for triangulated segments, so we really don’t know what effect triangulation has. Triangulation depends, in part, on using long shared segments. This, coupled with widely separated cousins who got exactly the same long segment, increases the odds that the shared segments are IBD. These two factors (length of segment and a match) combine to increase the probability of IBD. But as we decrease the shared segment size, we reduce that factor. We don’t know, yet, by how much this affects the curve.

Clearly, very small segments (under 5cM) are much easier to match, although most are IBS. Also, many of these very small segments will also triangulate. Triangulation is not a guarantee of IBD. We cannot use triangulation to prove triangulation. In other words, if segment length is a key factor in triangulation, we cannot say that triangulation itself proves smaller shared segments are IBD – it’s a circular argument. We need more corroborating data.

I am hesitant about establishing “rules” for segment sizes for triangulation. We are dealing with distribution curves – with tails. We have not yet drawn these curves, but at some point, as the segment size is reduced, false positives will occur, even with triangulation. I am confident that triangulation shifts the IBD/IBS-vs-cM distribution curve “to the left”. Triangulation definitely culls out many (most?) IBS segments in the 7-10cM range. Thus the IBD/IBS ratio for a given cM must increase. To what extent is yet to be determined.
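
The arithmetic behind “the ratio must increase” can be shown with a small Python sketch. Every number in it is invented for illustration; none is a measured IBD/IBS rate, and the function name is hypothetical. It shows only that if triangulation removes a larger share of the IBS segments than of the IBD segments, the IBD share of what remains goes up.

```python
def ibd_share_after_culling(n_ibd, n_ibs, ibd_kept, ibs_kept):
    """IBD share among the segments that survive culling.

    n_ibd, n_ibs        -- counts before culling (made-up inputs)
    ibd_kept, ibs_kept  -- fraction of each kind that survives
    """
    kept_ibd = n_ibd * ibd_kept
    kept_ibs = n_ibs * ibs_kept
    return kept_ibd / (kept_ibd + kept_ibs)


# Hypothetical pool of 100 segments in the 7-10cM range.
n_ibd, n_ibs = 60, 40
before = n_ibd / (n_ibd + n_ibs)

# Assume triangulation keeps 95% of the IBD and 10% of the IBS segments.
after = ibd_share_after_culling(n_ibd, n_ibs, ibd_kept=0.95, ibs_kept=0.10)

print(f"IBD share before: {before:.0%}, after: {after:.0%}")
```

With these invented inputs the IBD share rises from 60% to about 93%. How large the real shift is at each cM value is exactly what has yet to be determined.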

Triangulation is a tool. Use judgment when using it.

For me, shared segments below 5cM are uncharted territory for triangulation. I am confident of a Triangulation “guideline” for shared segments down to 7cM. Based on my experience with most segments in the 5-7cM range being IBS, I’m now fairly confident that triangulation also works down to 5cM. At the least, triangulation culls out most of the IBS shared segments. I think most of the few remaining 5-7cM shared segments which triangulate are IBD. For me, it’s at least worth the chance to include them in a TG and enlist the help of those Matches in finding the CA.


13 Segmentology: Small Segments and Triangulation by Jim Bartlett 20150930