Does Triangulation Work?

Sure it does! Triangulation is a tool to use with autosomal DNA. Let’s see how it might work:

  1. Does it work in grouping your shared segments?
  2. Does it work in culling out IBS segments?
  3. Does it work to define and map your ancestral segments?
  4. Does it work to ensure that all Matches in a Triangulated Group have an IBD segment?
  5. Does it work in identifying Matches who all share the same Common Ancestor?
  6. Does it work for any size segments? – see more at: Does Triangulation Always Work?

The Big Picture

Let’s start with the Big Picture. We take an atDNA test and the company reports a list of our Matches. We can also get Matches by uploading our raw DNA data to GEDmatch. Each company compares our raw DNA data to that of everyone else in its database, and uses its proprietary matching algorithm to generate a list of Matches. 23andMe, FTDNA and GEDmatch also provide the shared segment information (Chromosome, Start Location, End Location, cMs, and SNPs) for each shared segment. For this discussion I’m only going to be talking about segments over 7cM, just to avoid any debate about smaller segments. Each company’s matching algorithm has its pluses and minuses, but we are going to go with the list of Matches they provide to us.

So this is the data we want to work with using the Triangulation tool.

Ancestral vs Shared segments

Please re-read “What Is a Segment?” to recall there are ancestral segments – ones you get from an ancestor – located completely on one of your chromosomes; and there are shared segments – ones that the computer algorithm determines by comparing data on both your chromosomes with data on both chromosomes of another person.

Shared segments are either IBD or not-IBD (IBS)

Most of these shared segments are IBD – meaning they come from a Common Ancestor – common to you and your Match. Some of the shared segments are IBS – meaning they don’t come from a Common Ancestor; they are segments made up by the computer algorithm. We cannot tell which is which by just looking at the one shared segment. ISOGG has a very good wiki article about IBD and non-IBD (IBS) segments. The bottom lines are:

  1. Shared segments (also called matching segments or Half-Identical Regions (HIRs)) 15cM and greater are IBD virtually 100% of the time.
  2. Shared segments under 5cMs should generally not be used in genealogical analyses [and in this post we are not considering shared segments under 7cM].

So for this blog post we will focus on shared segments from 7cM to 15cM as reported by the companies. And note that each of these segments is either IBD or IBS.

Triangulation Criteria

For Triangulation we find three sets of shared segments which match each other. This usually means you and two Matches have shared segments which overlap at least 7cM, AND the two Matches share a segment which overlaps the same area at least 7cM. This means all three of you have the same, long string of SNPs in the same location. This is Triangulation.
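The three-way check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Segment` tuple and both helper functions are hypothetical names, and positions are stored directly in cM for simplicity (the companies report base-pair locations plus a cM length, so a real spreadsheet would need a genetic map to convert).

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    """One shared segment; cM positions are an assumption for illustration."""
    chrom: int
    start_cm: float
    end_cm: float


def common_overlap_cm(*segments: Segment) -> float:
    """Length in cM of the region covered by every segment (0 if none)."""
    if len({s.chrom for s in segments}) != 1:
        return 0.0
    start = max(s.start_cm for s in segments)
    end = min(s.end_cm for s in segments)
    return max(0.0, end - start)


def triangulates(me_a: Segment, me_b: Segment, a_b: Optional[Segment],
                 min_cm: float = 7.0) -> bool:
    """True when all three pairwise segments share at least min_cm."""
    if a_b is None:  # Match A and Match B do not match each other here
        return False
    return common_overlap_cm(me_a, me_b, a_b) >= min_cm


me_a = Segment(5, 10.0, 25.0)  # me vs Match A
me_b = Segment(5, 12.0, 30.0)  # me vs Match B
a_b = Segment(5, 11.0, 24.0)   # Match A vs Match B
print(triangulates(me_a, me_b, a_b))   # common region runs from 12 to 24 cM
print(triangulates(me_a, me_b, None))  # no A-B segment, so no Triangulation
```

The 7cM default simply mirrors the threshold used in this post; it is a parameter so that a stricter cutoff can be tried.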

Usually Triangulated Groups (TGs) include more than just you and two Matches. They may include 5, 10, 20 or more Matches. Each TG includes all of the shared segments, and these triangulated segments determine the start and end locations of the TG, such that the TG includes them all.
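As a minimal sketch of how a TG's boundaries follow from its member segments, the hypothetical helper below takes the earliest start and the latest end among the triangulated segments. The function name and the plain `(start_bp, end_bp)` tuples are assumptions made for illustration, and the segments are assumed to have already been confirmed as triangulating on the same chromosome and the same side.

```python
def tg_bounds(segments):
    """Return (start_bp, end_bp) spanning every triangulated segment.

    segments: list of (start_bp, end_bp) tuples for one TG.
    """
    if not segments:
        raise ValueError("a TG needs at least one segment")
    starts = [start for start, _end in segments]
    ends = [end for _start, end in segments]
    return min(starts), max(ends)


example_tg = [
    (5_800_000, 8_700_000),
    (6_000_000, 9_500_000),
    (5_500_000, 7_000_000),
]
print(tg_bounds(example_tg))
```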

My Experience with Triangulated Groups

I have over 5,000 different Matches in my spreadsheet, with perhaps 6,000 separate shared segments over 7cM. As a result of the Triangulation process, these shared segments have been placed into 4 categories:

  1. A Triangulated Group on my Dad’s side.
  2. A Triangulated Group on my Mom’s side.
  3. An IBS group (these segments overlap, but do not match, TGs on either side)
  4. Undetermined as yet

The TGs above cover 90% of my 45 chromosomes, and define 340 separate TG segments on my DNA. Most of the TGs are heel-and-toe (adjacent) to each other on each chromosome, with only a few gaps. All of my shared segments either “fit” into (overlap within) one of these TGs or they are IBS (or they are undetermined).

TGs Form a Chromosome Map

The key point here is that these TGs map my chromosomes into specific segments. Each of these specific segments comes from an Ancestor. Similarly your chromosomes are divided into specific segments, defined by crossover points from each generation. Re-read Bottom-up and Top-Down for a refresher on how crossover points and segments are formed. Within each such ancestral segment, defined by start and end locations, each of us will have a continuous string of SNPs – usually thousands of them. And each such ancestral segment comes down a specific path from a specific ancestor to us. On this point endogamy does not matter. In the bizarre extreme, all of your ancestors in one generation could be the same man and woman, but each of your ancestral segments came from only one place in your Tree, and down one path to you.

IBS Segments

Most of your shared segments will be IBD and will form TGs. But some shared segments will overlap the segments of a TG without matching any of them, on either side. These shared segments are clearly IBS. If they were IBD – from an ancestor – they would match overlapping segments in a TG.
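The sorting rule implied here (paternal TG, maternal TG, IBS, or undetermined) might be sketched as follows. Everything in this example is an assumption for illustration: the function name, the dictionary of TG members, and the set of known matching pairs are hypothetical stand-ins for what a spreadsheet would hold.

```python
def classify_segment(match_id, tg_members, known_pairs):
    """Place one overlapping shared segment into a category.

    tg_members:  {"paternal": set of Match ids, "maternal": set of Match ids}
                 for the TGs that overlap this segment.
    known_pairs: set of frozensets {id1, id2} for Matches known to share
                 a segment with each other in this area.
    """
    for side in ("paternal", "maternal"):
        for member in tg_members.get(side, set()):
            if frozenset((match_id, member)) in known_pairs:
                return side
    # It overlaps a TG on both sides and matches neither: treat as IBS.
    if tg_members.get("paternal") and tg_members.get("maternal"):
        return "IBS"
    # A TG is missing on one side, so the segment could still belong there.
    return "undetermined"


tgs = {"paternal": {"P1", "P2"}, "maternal": {"M1"}}
pairs = {frozenset(("X", "P2"))}
print(classify_segment("X", tgs, pairs))  # X matches P2 in the paternal TG
print(classify_segment("Y", tgs, pairs))  # Y matches nobody on either side
```

The "undetermined" branch is a deliberately cautious choice: a segment is only called IBS once there is a TG on both sides for it to fail against.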

Are All Segments in a TG IBD?

So one of the arguments for Triangulated Groups is that if three people all match each other on the same segment, the shared segments must be IBD. We have three pairs of matches, each pair with the same long string of SNPs. We have all three companies with proprietary algorithms that try to ensure their Matches are IBD. We have TGs that are mapped on our chromosomes, and know that some ancestor provided that segment. It sure looks like the shared segments in these TGs have the same SNPs that our ancestors passed down to us. This is even more compelling when there are several, or more, shared segments which Triangulate and form a TG. But are we sure every segment in a TG is IBD? Read on for some possible exceptions.

Some Areas to Look Out for:

  1. If you have only one Match (Match1) who matches a number of your close relatives in an apparent TG: You and Match1 might share an IBS segment (based on Match1’s segment being false); and then Match1 may well match all of your close relatives who have the same ancestral segment you have. For this reason observe the caution that TGs should be formed with widely separated cousins – the wider the better. Another test is to find other Matches (not closely related to Match1) who Triangulate with you and see if they match Match1. All Matches in a TG, who overlap enough, should match each other. If they do not, then an analysis should be done to weed out any potential IBS segments.
  2. You match several other Matches who are closely related to each other in an apparent TG: you might have a false segment and may well match all of the Matches who share the same good ancestral segment. As in the previous paragraph, it’s important to form a TG with widely separated cousins. The test here is to look for other overlapping Matches for this segment area. If this is an IBS TG, the other Matches will not also match the Match family. Also, you do have a true ancestral segment for each area of your chromosomes. If several related Matches all match you in one segment area, and your segment is false with them, you should be able to form two other TGs (one from each parent) based on your ancestral segments compared with other Matches.
  3. Another argument used to debunk Triangulation is endogamy. The theory here is that due to endogamy – some of our ancestors being the same person – the same ancestral segments are floating around and TGs may be formed with different ancestors. In theory, this is possible – in practice it is improbable. In the first place, endogamy means the two ancestors who are the same person actually had a Common Ancestor. So in fact the TG shared segment really did come from a Common Ancestor, several more generations back. With each generation going back, the probability of a match is divided by 4, or 16 for the two generations involved in a first cousin endogamy. Clearly it is much more likely that our Matches in a TG are from a closer cousinship.
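The divide-by-4 arithmetic in the endogamy point can be written out as a tiny calculation. This sketch only restates the rule of thumb given above (a factor of 4 per generation); it is not an independent probability model, and the function name is hypothetical.

```python
def relative_match_probability(extra_generations: int, factor: int = 4) -> float:
    """Chance of a match relative to the closer cousinship,
    using the rule of thumb of dividing by `factor` per generation."""
    return 1 / factor ** extra_generations


for generations in range(0, 4):
    print(generations, relative_match_probability(generations))
```

Two extra generations give 1/16, which is the figure quoted for a first cousin endogamy.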

Also, based on my chromosome map, the ancestral segment I got for each TG is from a specific ancestor, down a specific line of descent to me. It has a specific string of SNPs that we generally think of as unique. Is it possible, in the 7-15cM range, for a Match to have exactly the same string of SNPs from a different ancestor? With random DNA almost anything is possible, but the premise of autosomal DNA is that this would be very rare. If it did occur, the shared segment would technically be IBS, because it was not identical because of descent from a Common Ancestor. But, we might have to leave the door open for this possibility.

Back to a Big Picture Thought

The number of people taking an atDNA test is about doubling every 12 months. If this continues, I’ll have 10,000 Matches, with shared segments, by this time next year; and 20,000 Matches by the end of 2017. My chromosome map has pretty much been determined (I am now focused on determining the correct Common Ancestor for each TG). A doubling of Matches means a doubling of each TG every year. The point is that if we assume 80-90% of our Match segments are IBD (I actually believe it’s closer to 95%), all of those IBD segments are being added to my existing TGs. Couple this with the fact that most of our Matches are beyond 5th cousins (I believe most of our Matches are actually 6-8th cousins, and some beyond). Even if a few of the Matches in our TGs turn out to be IBS, we are still getting a great influx of true cousins into our TGs.

So to summarize:

  1. Do TGs work to group your Matches? Sure! Instead of the long list of miscellaneous Matches reported by the companies, you can form Triangulated Groups. See Benefits of Triangulation.
  2. Do TGs work to cull out IBS segments? Sure! Many of your 7-15cM segments will not triangulate with any overlapping TG, indicating those “shared” segments are probably IBS. As noted above, not all of the IBS segments may be identified this way, but many (I think most) will. This is progress – it’s an improvement over the list you get from the companies.
  3. Do TGs work to define and map your ancestral segments? Absolutely! It’s hard work, but an easy mechanical process to define the TGs with start/end locations; and only a little genealogy with known relatives is needed to assign them to maternal and paternal sides.
  4. Do TGs work in ensuring all Matches in a TG have an IBD segment? Almost all of the time, and there are ways to find and test suspicious shared segments.
  5. Do TGs work in ensuring all the Matches in a TG share the same Common Ancestor? This is a tough one because it’s not possible to rule out some outliers. As noted above, if you carefully form the TGs, the Matches should come from the same Common Ancestor. We have lots of examples of Matches in TGs who do share the same CA. It’s very hard to prove that an IBD segment is really from a different ancestor; and I haven’t seen a single case of it so far.

Your ancestral segment in each TG does come from a specific ancestor of yours, and your cousins from that Ancestor with that segment will match you on it in that TG. As several of us have suggested, to determine the true Ancestor for a TG, you need to “walk the segment back.” This means finding cousins at various levels in each TG – a 2nd cousin, a 4th cousin, and a 6th cousin who all have the same segment and ancestral line. This is often hard, but the number of people taking an atDNA test is doubling annually, and more of these intermediate cousins will gradually show up in our Match lists and TGs.

Bottom line for me: Triangulation is a powerful tool.


11B Segment-ology: Does Triangulation Work? by Jim Bartlett 20151019


What’s all the buzz about “pile-ups”?  In my mind there are three kinds of pile-ups: small, medium and large. They are different, so it’s important to understand each one. In this case Goldilocks should prefer the large pile-ups, but let me go through my views of all three kinds.

Alert: This post contains my opinions about small pile-ups and AncestryDNA (based on my own experience) so you should make your own judgments.


I think the two keys to success with autosomal DNA lie in a robust Tree (as many ancestors out to 13 generations as possible) and as many Match-segments as possible (including as many close relatives as you can get). I spent about a year expanding my Tree as best as I could, and then posted that GEDCOM in several places. I’ve tested at all three companies and use GEDmatch. I put every single shared segment I can find over 7cM into my spreadsheet, and I periodically run a Quality Control check against a fresh download to pick up any missed Matches or segments. I currently have 5,000 different individuals with segment data in my spreadsheet, and have determined a Common Ancestor (CA) with 309 of them.

I have compared virtually every segment against other overlapping segments, and formed Triangulated Groups (TGs) that cover over 90% of my 45 chromosomes. It is now rare for me to get a new shared segment that changes my chromosome map in any way. This process has provided some insights on medium and large pile-ups.


My definition of pile-up sizes:

  1. Small is smaller than 5cM
  2. Medium is 5-10cM
  3. Large is greater than 10cM
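These three size bands translate directly into a small helper. The function name is hypothetical; the boundaries follow the list above, with exactly 5cM and exactly 10cM both falling in the medium band.

```python
def pileup_size(segment_cm: float) -> str:
    """Classify a shared segment by the pile-up size bands defined above."""
    if segment_cm < 5:
        return "small"
    if segment_cm <= 10:
        return "medium"
    return "large"


for cm in (3.2, 5.0, 7.8, 10.0, 12.5):
    print(cm, pileup_size(cm))
```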

Small pile-ups – by my definition, these pile-ups are composed almost entirely of IBS shared segments. When AncestryDNA first rolled out their autosomal DNA test, their threshold was 5Mbp. This threshold included many shared segments well below 5cM, and resulted in many thousands of bogus Matches. To their credit, they provided a caution about these. When AncestryDNA revised their threshold to 5cM, many of these Matches went away. Part of their explanation was the elimination of “pile-ups”. I agree that these “small pile-ups” should be eliminated, and resetting the threshold to 5cM should have taken care of that problem. However, their explanations continue to stress the elimination of “pile-ups”. I just hope they don’t also toss out Matches in larger pile-ups – throwing the baby out with the bath water.

Medium pile-ups – 5-10cM range. As I gathered as many segments over 5cM as I could and sorted them in my spreadsheet, I noticed a few areas that had many such segments, all in a very narrow chromosome area. Very clearly a pile-up! Virtually none of them matched each other, although they had almost the same segment start/end locations. And there were a lot of them – many more than in large TGs.

In discussions on various email lists, we compared notes, and found that most of these areas were unique to our own experience. In general they were not due to some common feature of most human genomes. A notable exception to this blanket statement is the HLA Region on Chromosome 6 – roughly from 29.8 to 33.1Mbp.
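A quick way to flag segments that touch the HLA Region is a simple interval test. This is a sketch only: the constant and function names are hypothetical, and the boundaries are the rough figures quoted above (29.8 to 33.1Mbp on Chromosome 6), which depend on the genome build in use.

```python
HLA_CHROM = 6
HLA_START_BP = 29_800_000  # approximate, as quoted in the text
HLA_END_BP = 33_100_000    # approximate, as quoted in the text


def overlaps_hla(chrom: int, start_bp: int, end_bp: int) -> bool:
    """True if the segment shares any positions with the HLA Region."""
    return (chrom == HLA_CHROM
            and start_bp <= HLA_END_BP
            and end_bp >= HLA_START_BP)


print(overlaps_hla(6, 28_000_000, 30_500_000))  # reaches into the region
print(overlaps_hla(6, 34_000_000, 40_000_000))  # starts after the region
```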

However, most of the other areas were not tied to known issues like the HLA Region. In my analysis, it was not possible for me to link these to one parental side or the other. The fact that these areas include so many IBC (Identical by Chance) segments indicates to me that it’s the combination of both of my chromosomes (maternal and paternal) that allows the “matches”. It’s the unique combination of alleles in these small stretches of DNA that makes matching much easier. And this unique combination is only in my genome. On chromosome 18, I have 307 segments in the 7 to 11 cM range. They are all in a very tight area: from location 5,800,000 to 8,700,000bp. Very few of them triangulate.

Sometimes the pile-up area has been documented. On chromosome 15, I have 281 segments in the 7 to 10cM range. They are at: 24,000,000 to 28,000,000 bp. This area partly overlaps a known pile-up area (20,100,000 to 25,200,000). But the known pile-up area is only partly the cause in my case. See the 14 small pile-up areas found by Li et al (2014), listed at the ISOGG Wiki. These medium pile-up areas, and a few others in my experience, are characterized by a very tall pile-up of many segments about the same size in a narrow area just a little larger than the segments. The Li et al (2014) article refers to “regions where excess IBD is detected…” Virtually all of the segments I have noted above are IBS/IBC – they do NOT triangulate with the other segments. A few segments in these regions do triangulate with known close relatives, and each other. I’ve kept those segments in maternal and paternal TGs, as appropriate, covering that area. After all, both my mother and father gave me those areas, and they in turn got them from their parents, etc. It is very probable that these segments are IBD and come from a CA.

My experience is that these are areas with a lot of shared segments in the 7-10cM range that are in a tight area, usually just 10cM wide, and a very high proportion of these segments are IBS/IBC.  A few segments in these areas will be IBD, but they will tend to be larger than the 7-10cM segments.
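One rough way to spot candidate medium pile-ups in a spreadsheet export is to count 7-10cM segments per chromosome region and flag the crowded ones. The sketch below is an assumption-laden illustration: the function name, the fixed 5Mbp bins, and the threshold of 50 segments are all hypothetical choices, and binning by start position can split a pile-up that straddles a bin edge.

```python
from collections import Counter


def pileup_candidates(segments, min_cm=7.0, max_cm=10.0,
                      bin_bp=5_000_000, min_count=50):
    """Count medium-size segments per (chromosome, bin) and keep crowded bins.

    segments: iterable of (chrom, start_bp, end_bp, cm) tuples.
    Returns {(chrom, bin_start_bp): count} for bins with >= min_count.
    """
    counts = Counter()
    for chrom, start_bp, _end_bp, cm in segments:
        if min_cm <= cm <= max_cm:
            bin_start = (start_bp // bin_bp) * bin_bp
            counts[(chrom, bin_start)] += 1
    return {key: n for key, n in counts.items() if n >= min_count}


# Made-up data: 60 similar segments stacked near 5.8Mbp on chromosome 18,
# plus a few unrelated segments elsewhere.
stack = [(18, 5_800_000 + i * 1_000, 8_700_000, 8.0) for i in range(60)]
others = [(2, 100_000_000, 110_000_000, 9.0)] * 3
print(pileup_candidates(stack + others))
```

A flagged bin is only a prompt to look closer; whether the segments triangulate still has to be checked by hand.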

My bottom line for these pile-ups: Unless you have a lot of free time, skip over these areas – particularly the shared segments under 10cM. Concentrate on triangulating any larger segments in these areas and then move on to other areas.

Large pile-ups – these are my favorites. Larger shared segments (over 10cM) that spread out and overlap each other over wider areas.  These segments tend to triangulate with each other, forming TGs on both sides.  I have some of these TGs which include over 50 shared segments.  Since the shared segments triangulate with each other, this is a good pile-up. These TGs are large because more people have these shared segments – probably because the Common Ancestors had large families in Colonial America, leaving us with many, many cousins. Another reason could be a more distant Common Ancestor, who would also leave us a large number of cousins.

In some cases we can use this observation to our advantage. I have a 2nd cousin, on his mother’s side, who is also an 8th cousin, on his father’s side. Our close Common Ancestor was an immigrant to the US in the mid-1800s, and I get relatively few Matches on the segments I share with him. However, on one segment, we have many Matches – it turns out our Common Ancestor is on his father’s side. The tip-off should have been the size of the TG (measured by the number of Matches).

Another observation about large pile-ups…. They will get larger. The number of folks taking an atDNA test is about doubling every 12 months. A consequence of this is that all of our TGs will also double in the next 12 months. So, if you have pile-ups now, they will about double by this time next year. Use these larger TGs to your advantage – work with the Matches to investigate place/time matches, if a Common Ancestor is not easily determined.


  1. In general, don’t work with shared segments below 5cM. Most are IBS/IBC – even if they appear to triangulate. We don’t have a good test below 5cM to indicate IBD.
  2. Watch for, and avoid, pile-ups in the 5-10cM range. These are characterized by many shared segments in the 5-10cM range in a very tight location, usually only 10 or 11cM wide. Move on to larger shared segments in other locations.
  3. Embrace the large pile-ups. They may come from Common Ancestors with large families and/or more distant Common Ancestors. In either case, work with the Matches in these TGs as a Team to determine the Common Ancestor.

18 Segment-ology: Pile-ups by Jim Bartlett 20151007

Anatomy of an IBS segment

This is a guest blog post by Dr. Ann Turner, who has been a great mentor for me.

 Ann Turner

October 1, 2015

Jim Bartlett, my host for this blog post, shares a 7.8 cM segment at 23andMe with my nephew Larry. This was a serendipitous find, for Jim broke down a brick wall for me with records from an orphan’s court. In turn, I provided a solution to a minor mystery for Jim – where did John Henry go when he disappeared from Frederick County, Virginia?

That discovery was back in 2011, before we had developed much in the way of techniques to analyze segment data. There was one troubling aspect:  Jim did not match my sister (or her husband, either). This could be explained away if there was a false negative in my sister. Fast forward to 2015. Jim’s intensive work on triangulated segments has filled in the section containing Larry’s segment with more cousins. Larry did not match anyone on either one of Jim’s chromosomes.

Is it possible that this match was not Identical by Descent (IBD), but just Identical by State (IBS)?

A Terminology Detour

The terms “Identical by Descent” and “Identical by State” predate their application to segmentology, Jim’s felicitous term for analyzing autosomal DNA. The glossary in Human Evolutionary Genetics[1] contrasts the two phrases:

Identity by Descent: Property of alleles in an individual or in two people that are identical because they were inherited from a common ancestor; as opposed to identity by state

Identity by State: Property of alleles in an individual or in two people that are identical because of coincidental mutational processes, and not because they were inherited from a common ancestor (identity by descent)

In effect, “identical” is the more general word, and the phrase describes two mutually exclusive ways of achieving identity – BY state or BY descent.

Also, the definition is about alleles, alternative versions of a single marker. There are examples in genetic genealogy when we look at the type of DNA that follows one line, the straight paternal line or the straight maternal line.

For the Y chromosome, the ancestral haplotype may sometimes be deduced from multiple lines of descent. The question then becomes whether a variation on the theme marks a specific line: does the fact that two individuals both share a one-step difference from the ancestral haplotype on DYS19 mean that they have identified a branch tag to a more recent common ancestor (the mutation is identical by descent), or did the mutation occur independently in two different lines of descent (the mutation is identical by state)? The mutation rate is high enough that either explanation could hold true.

For mtDNA, there are certain hotspots where a mutation is not a reliable indicator for defining haplogroups or even genealogical relationships. A mutation 16519C has occurred independently hundreds of times in different haplogroup subclades, and insertions at 309.1C (and 309.2C) are frequent enough that even siblings are known to differ.

Adapting the two terms IBS and IBD for segmentology stretches the original context to include regions of the genome, not just single markers. Furthermore, the mutation rate for autosomal DNA is orders of magnitude lower than Y-STRs or mtDNA. Differences in two autosomal markers are not likely to be due to a recent mutation.

With this shift to testing multiple autosomal markers, some authors began to employ the phrase Identical by State as the broader concept. Then some, but not all, IBS regions would also be Identical by Descent. That leaves a vacuum – what should we call regions that are IBS but not IBD? Charles Brenner created his own term, which is not particularly evocative but illustrates the frustrating dilemma:

“Identical by state” (IBS) as used here is synonymous with “identical”, an umbrella meaning in that IBS  thus includes IBD as a subset. Adopting the umbrella definition for IBS means some other term may be needed to mean IBS but not IBD and for this purpose I use the word “strict.”[2]

Indeed, it appears that many technical articles avoid the term IBS entirely. A search of Google Scholar  yields 17,100 citations for Identical (or Identity)  by Descent but only 4,700 citations for Identical (or Identity) by State. Scanning a small sample of those articles reveals that they often describe a segment as IBD or “not IBD”, period.

My personal preference is to hew to the original concept, where identity is the broader, more general term. It avoids the awkward need for a special term to describe IBS but not IBD. Plus in the future, when we can do whole genome sequencing, reserving IBS for accidental identity due to a parallel mutation may become more relevant. In spite of the low mutation rate, the vast number of loci and (perhaps) the large number of tested people will result in a certain number of recurrent mutations. We are already seeing this with more comprehensive sequencing of the Y chromosome.

I have no objections to those who prefer IBS for the more general term, but for the purposes of this blog post, I mean Identical “just/merely/only” by State. For further clarity, we need to emphasize that we are speaking of HALF identity, where at least one of the two alleles in one party’s genotype matches at least one allele in the other party. Leon Kull coined the acronym HIR for Half-Identical Region. That obviously leaves a lot of wiggle room, as shown in the next section.
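Half identity at a single SNP can be expressed as a one-line set test. The sketch below assumes clean two-letter genotype strings such as "CT" with no no-calls, and biallelic SNPs; both function names are hypothetical. Under those assumptions the only way two genotypes fail to be half-identical is for them to be homozygous for different alleles.

```python
def half_identical(genotype_a: str, genotype_b: str) -> bool:
    """True if at least one allele in A matches at least one allele in B."""
    return bool(set(genotype_a) & set(genotype_b))


def opposite_homozygotes(genotype_a: str, genotype_b: str) -> bool:
    """True if both are homozygous but for different alleles (e.g. CC vs GG)."""
    a_hom = genotype_a[0] == genotype_a[1]
    b_hom = genotype_b[0] == genotype_b[1]
    return a_hom and b_hom and genotype_a[0] != genotype_b[0]


for pair in (("CT", "CC"), ("AG", "AG"), ("CC", "GG")):
    print(pair, half_identical(*pair), opposite_homozygotes(*pair))
```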

Dissecting the Segment

Jim graciously shared his raw data with me so I could use Excel to view each and every one of the 850 SNPs in the segment. (See Supplemental data file.) The segment boundaries are opposite homozygotes (e.g. CC and GG) – they do not match at all. Figure 1 shows some of the column headers in the spreadsheet with a few sample rows of data.

Columns A, B, and C give the chromosome number, chromosome position, and SNP ID as found in the raw data download. They are redacted here for privacy reasons, but the column labels are preserved for those who would like to use the spreadsheet as a template for their own analyses.

Column D is for Jim’s genotype data. If Jim is homozygous for a marker (e.g. CC), then he obviously received a C from his father and a C from his mother. If Jim is heterozygous (e.g. CT), the two alleles are listed in an arbitrary order (often alphabetical): the C allele could have come from his mother and the T allele from his father, or vice versa. Columns E, F, and G are genotype data for Larry, his mother, and his father.

I also phased Larry’s data so I could tell which allele came from which parent, using David Pike’s utility Phase a Child when given data for child and both parents for the calculations. In a separate step (not shown here), I reformatted the results and loaded them into the spreadsheet so the rows aligned with Jim’s results. Column H has the maternal allele (from my sister) and Column I has the paternal allele. The results could not be phased in cases where all three parties were heterozygous; in those cases the genotype is retained. A heterozygous result is a universal match – no matter what Jim’s genotype is, at least one of Larry’s alleles will match at least one of Jim’s alleles, because each SNP has only two possible versions.[3] The full spreadsheet can be seen at this link.
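The logic of phasing a child against both parents can be illustrated with a short function. To be clear, this is not the code of David Pike’s utility, just a simplified sketch of the same idea: it assumes biallelic SNPs, two-letter genotype strings with no no-calls, and it does not check for Mendelian inconsistencies. The function name is hypothetical.

```python
def phase_child(child: str, mother: str, father: str):
    """Return (maternal_allele, paternal_allele), or None if unphasable.

    A homozygous child is trivially phased. A heterozygous child can be
    phased when at least one parent is homozygous; if all three are
    heterozygous, the parental origin cannot be determined.
    """
    a, b = child
    if a == b:
        return (a, a)
    if mother[0] == mother[1] and mother[0] in child:
        maternal = mother[0]
        return (maternal, b if maternal == a else a)
    if father[0] == father[1] and father[0] in child:
        paternal = father[0]
        return (b if paternal == a else a, paternal)
    return None


print(phase_child("CT", "CC", "CT"))  # mother can only give C
print(phase_child("CT", "CT", "TT"))  # father can only give T
print(phase_child("CT", "CT", "CT"))  # all heterozygous: cannot phase
```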

Figure 1

Columns J and K use Excel formulas to show whether Jim matches the maternal allele and/or the paternal allele (coded with a “1”). Conditional formatting shows pink for a maternal match and blue for a paternal match. It’s readily apparent that a mismatch in the maternal side is filled in by a match in the paternal side, and vice versa. Figure 2 shows this pink and blue pattern horizontally (similar to a chromosome browser) for a somewhat longer stretch of 31 SNPs.

Figure 2
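The kind of comparison held in Columns J and K can be mimicked outside Excel. The sketch below is an assumption about the general approach, not a copy of the spreadsheet formulas: for each SNP it records a 1 when the phased maternal allele appears in the unphased genotype, and likewise for the paternal allele. The function name and the sample data are made up.

```python
def allele_matches(unphased_genotypes, phased_alleles):
    """Return a list of (maternal_flag, paternal_flag) per SNP.

    unphased_genotypes: list of two-letter strings, e.g. "CT".
    phased_alleles:     list of (maternal_allele, paternal_allele) tuples.
    """
    rows = []
    for genotype, (maternal, paternal) in zip(unphased_genotypes, phased_alleles):
        rows.append((int(maternal in genotype), int(paternal in genotype)))
    return rows


jim = ["CC", "CT", "GG", "AA"]                                 # made-up genotypes
larry = [("C", "T"), ("T", "T"), ("A", "G"), ("A", "G")]       # made-up phased data
for row in allele_matches(jim, larry):
    print(row)
```

In this toy data every row has at least one 1, so the stretch looks half-identical even though neither the maternal nor the paternal column matches all the way through.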

The remaining columns in the spreadsheet (L through T) contain calculations used to generate some summary statistics:

1) The apparent long run of 850 half-identical SNPs is broken up into 61 shorter runs on the maternal side and 31 shorter runs on the paternal side. It is entirely possible that these runs would be fragmented even further if Jim also had phased data.

2) Jim and Larry are both homozygous for the same allele for 368 of the SNPs. If Jim inherited the same allele from his father AND his mother, and ditto for Larry, it seems likely that the allele is rather common in the general population. That makes for easy pickings.

3) Jim is heterozygous for 310 SNPs and Larry for 311 SNPs, about 36%. There are 482 SNPs where at least one party is heterozygous (57%). These are universal matches.
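Summary statistics of this kind are straightforward to compute from two lists of genotypes and a column of match flags. The following is a hedged sketch with hypothetical function names and made-up data; it assumes two-letter genotype strings and biallelic SNPs, and it is not the calculation used in the spreadsheet.

```python
def count_runs(flags):
    """Count separate runs of consecutive 1s in a list of 0/1 flags."""
    runs = 0
    previous = 0
    for flag in flags:
        if flag and not previous:
            runs += 1
        previous = flag
    return runs


def summarize(genotypes_a, genotypes_b):
    """Tally homozygous-identical and heterozygous SNPs for two people."""
    def is_het(genotype):
        return genotype[0] != genotype[1]

    pairs = list(zip(genotypes_a, genotypes_b))
    return {
        "both_homozygous_same": sum(
            1 for a, b in pairs if not is_het(a) and not is_het(b) and a == b),
        "a_heterozygous": sum(1 for a, _b in pairs if is_het(a)),
        "b_heterozygous": sum(1 for _a, b in pairs if is_het(b)),
        "either_heterozygous": sum(1 for a, b in pairs if is_het(a) or is_het(b)),
    }


print(count_runs([1, 1, 0, 1, 0, 0, 1]))
print(summarize(["CC", "CT", "GG", "AG", "TT"], ["CC", "CC", "AG", "AG", "TT"]))
```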

Most segments of this length will actually be IBD.[4] This example is somewhat exceptional, deliberately chosen to dramatize the possible pitfalls and serve as a warning about smaller segments. One explanation may be that the 36% level of heterozygosity happened to be particularly high for this one region. The overall average for Jim and Larry was 28.2% and 30.7% respectively.

Red flags were waving for this segment: the lack of triangulation and the lack of a match for both of Larry’s parents. Is the converse true? Can triangulation or a match in a parent prove IBD? No, many counter-examples can be found, especially at shorter segment lengths.[5]

Phasing is our most pressing need, yet it is not always available.[6] Any alternative methodology for claiming that certain short HIRs are IBD must be able to demonstrate that the segment survives in test cases where the phase is known.

One more moral of the story: a genealogical connection can be made without DNA!

[1] M.A. Jobling et al, Human Evolutionary Genetics: Origins, Peoples & Disease, Garland Science, 2004.

[2] Brenner CH. Understanding Y haplotype matching probability. Forensic Sci Int Genet. 2014 Jan;8(1):233-43.

[3] It is possible to have three or four alleles (A/C/G/T) for a SNP, but these are rare and SNP chips tend to avoid them.

[4] According to 23andMe’s simulations “IBD segment lengths [i.e. HIRs] greater than 7 cM were observed 90% of the time in at least one parent. Preliminary data suggest that 7 cM segments shared between a distant cousin and child that were not observed in the parents were due to false negatives in the parents.” Henn BM et al, “Cryptic distant relatives are common in both isolated and cosmopolitan genetic samples.” PLoS One. 2012;7(4):e34267.

[5] See my blog post for details on how an experimental phased data file eliminated a large number of small segments reported by Family Tree DNA.

[6] AncestryDNA phases data for its internal calculations, but the raw data download shows genotypes with the alleles in an arbitrary order.