The Attributes of a TG

This article will describe the various attributes of a Triangulated Group (TG). Some have noted that I use the term TG to describe both a Group of Matches as well as an ancestral segment. Well… yes, I do. Read on.

Once established, each TG has certain attributes which can be used to describe and/or define the TG:

A1. A TG is a group of shared segments from Matches. We often think about the Matches in a TG. They have a Common Ancestor. They can be contacted and encouraged to collaborate on finding the common ancestry. So in this sense a TG is a group of Matches. However, note that any of these Matches could, potentially, also share the same or a different ancestor with you on another segment (in a different TG).

A2. A TG occupies a specific physical space on a chromosome. It is, in effect, a segment in its own right – a segment from one of your ancestors. The TG is on one chromosome with a start location and an end location. These start and end locations are determined by the matching, overlapping, shared segments (from Matches) within the TG. Please review: Anatomy of a TG.

A3. As a segment, a TG has a specific string of SNPs on one chromosome. These would be the same SNPs in a segment on your chromosome, and on the segment each one of your Matches shares with you in the TG. All the SNPs would be the same. The SNPs have to be the same for IBD segments to match.

A4. A TG is the equivalent of phased data. The TG represents an ancestral segment on a chromosome. All of the SNP values (alleles) are on one chromosome, and are the same SNP values you got from your mother or father (depending on the chromosome). What you have in a TG (segment) is exactly what you would have with phased data.

  1. We don’t see the actual ACGT values in a TG that we would get with true phasing (with a child-parents trio), but they are the same values in the TG. The TG segment represents part of one of your chromosomes – it must have the same ACGT values that your parent passed to you on that chromosome.
  2. We can treat the TG as phased data, and any other shared IBD segment which Triangulates with the TG will have the same SNP values.
  3. This is true, even if you have formed a TG (with matching, overlaying, shared segments from Matches) and do not have the genealogy to determine which side it is on. You can be confident that the TG exactly matches the DNA on one of your two chromosomes. In this case the TG is not entirely equivalent to a true phased segment. But, if you had the phased information, you’d already know which side the TG was on. And very often you can determine the side of a TG by imputation – by determining which side it’s not on; or by the admixture of the segment.

A5. Technically, each TG has a cM value. However, it usually takes a lookup table to determine the cM value for a segment on a given chromosome, between two points. This is what the testing companies and GEDmatch do for each shared segment they report. It’s a lot of work for genetic genealogists – and, in general, our TGs will morph over time as new shared segments are added, and the TG cMs will need to be adjusted. However, we can fairly easily make rough estimates of the TG cMs, which are plenty good enough for genealogy:

  1. Subtract the TG start location from the end location to get the number of base pairs (bps), divide by 1,000,000 to get Mbp, which is roughly equal to the number of cMs.
  2. Or, eyeball the cMs of the larger shared segments in a TG and extrapolate to the full TG (if you’re lucky, you may have a shared segment which nearly fills the TG)
  3. Note that the cM is a fuzzy value anyway – it’s empirically derived (an average of many observations), and it’s an average of the female and male averages. So don’t go to too much trouble, and use round numbers. AND note that there is a wide range of possibilities when trying to use cMs to determine approximate cousinships. See Blaine Bettinger’s chart for the ranges of cMs vs cousinships.
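The rough estimate in item 1 can be sketched in a few lines of Python. This is only an illustration of the rule of thumb (1 Mbp is roughly 1 cM); the function name and the example locations are made up for the sketch, and the result is rounded to a whole number because the cM is a fuzzy value anyway.

```python
def estimate_tg_cm(start_bp: int, end_bp: int) -> int:
    """Rough TG size: length in Mbp, treated as roughly equal to cM."""
    if end_bp < start_bp:
        raise ValueError("end_bp must not be smaller than start_bp")
    mbp = (end_bp - start_bp) / 1_000_000  # base pairs -> Mbp
    return round(mbp)                      # round numbers are good enough


# Hypothetical TG running from 20.5 Mbp to 38.2 Mbp: 17.7 Mbp, so about 18 cM
print(estimate_tg_cm(20_500_000, 38_200_000))
```

A lookup table (as the companies use) will give a different number in regions where recombination is unusually high or low, so treat this only as a ballpark figure.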

A6. TGs have fuzzy ends. There is no “signpost” in our DNA to identify crossover points, or where a shared segment starts or ends. The company algorithms estimate shared segments by looking for areas of DNA that are identical, and they then continue until the DNA is not identical. This results in some, usually small, amount of overrun (a longer segment than is actually there from a Common Ancestor). So my convention is to use the start location of the first shared segment in a TG (the one with the lowest start in bp). This is the start location of the TG. The end location is determined several ways:

  1. If there is no overlap with the next TG on that chromosome, then use the largest end location of all the shared segments in the TG – the shared segment which runs the farthest. This often is not the last shared segment in a sorted spreadsheet – you need to look at them all.
  2. If there is a small, fuzzy overlap of 1-2Mbp with the next TG, I use the start location of the next TG, and accept the fact that there is a fuzzy overlap. We don’t need to be real precise for genealogy. Each TG represents a large block of DNA from an Ancestor – the fact that the edges of the block may be fuzzy should not obscure the big picture: the main TG segment came from an Ancestor!
  3. If there is a large overlap with the next obvious TG (almost always from a large shared segment with a close cousin, which probably spans more than one TG), I start a new TG at the obvious point dictated by the next group of shared segments, and use the same point as the end location of the first TG. This involves judgment – there is no hard rule – and the data will usually indicate where to start a new TG. Just accept that close cousins may share large segments which span more than one TG.
  4. If the shared segments in the TG all end a “few” Mbp before the next TG starts, I will just round up, and use the start location of the next TG as the end location. Again, use judgment.
  5. If there appears to be a large enough gap between two known TGs, I create a “dummy” TG to fill the gap. And then I keep looking for some Matches with shared segments to fill that gap. At this date my dummy TGs are about 7% of my DNA.
  6. Using these conventions will result in TGs that are adjacent to each other over all your chromosomes. Even with some “dummy“ TGs, this process will organize all of your IBD Match segments into TGs over all of your DNA. When done, this is a happy day! You can then focus on TGs that should link to specific ancestors. And all new Match segments will generally fit easily into existing TGs.
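For those who keep their segments in a script rather than a spreadsheet, the start/end conventions above might look something like the sketch below. It is only a sketch under stated assumptions: locations are in Mbp, the start of the next TG has already been chosen by judgment, and the `small_gap_mbp` threshold of 3 Mbp is a made-up stand-in for "a few Mbp". The judgment calls themselves (where a new TG obviously starts, when a gap is big enough for a "dummy" TG) are not something the code decides.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float]  # (start, end) of one shared segment, in Mbp


def tg_bounds(segments: List[Segment],
              next_tg_start: Optional[float] = None,
              small_gap_mbp: float = 3.0) -> Tuple[float, float]:
    """Return (start, end) of a TG from its shared segments.

    small_gap_mbp is a hypothetical threshold for "a few Mbp".
    """
    start = min(s for s, _ in segments)  # lowest start of any shared segment
    end = max(e for _, e in segments)    # the segment that runs the farthest
    if next_tg_start is not None:
        overlaps = end > next_tg_start                         # items 2 and 3
        small_gap = 0 <= next_tg_start - end <= small_gap_mbp  # item 4
        if overlaps or small_gap:
            end = next_tg_start  # make the two TGs adjacent
    return start, end


segments = [(20.0, 31.0), (22.0, 35.5), (21.0, 30.0)]
print(tg_bounds(segments))                      # no next TG known
print(tg_bounds(segments, next_tg_start=34.0))  # fuzzy overlap with next TG
print(tg_bounds(segments, next_tg_start=37.0))  # small gap, round up
print(tg_bounds(segments, next_tg_start=50.0))  # large gap, left for a dummy TG
```

In the last call the end is left alone; that 35.5 to 50 Mbp stretch is where a "dummy" TG would go until Matches turn up to fill it.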

A7. As you work this process of forming TGs and assigning them to sides using genealogy, you are creating a chromosome map. As new Matches are posted, their shared segments may adjust the start and/or end locations of the TGs. When you get lucky, you’ll find a new shared segment that fills a “dummy” TG.

A8. Naming TGs. I label each TG (and all the shared segments in it) with a short code – like 07C25. The 07 means Chr 7; C means the TG starts within the third group of 10Mbp – in this case between 20-30Mbp; 25 means this TG is on my father’s (2), mother’s (5) side – using Ahnentafel numbers. If I were starting over I’d use 07.027PM to indicate Chr 7; start 27Mbp; on Paternal, and his Maternal ancestry. Note: before you do any assigning, the label might be 07.027, then add P or M when determined. I usually add this code in the subject line of emails and messages – it just helps me keep organized.
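As an illustration, here is a small Python sketch that builds labels in both styles. The function names are invented for this example, the start is truncated (not rounded) to whole Mbp, which is an assumption, and only numbered chromosomes are handled.

```python
def tg_label(chromosome: int, start_bp: int, side: str = "") -> str:
    """Label like 07.027PM: chromosome, start in Mbp, optional side letters."""
    start_mbp = start_bp // 1_000_000  # assumption: truncate to whole Mbp
    return f"{chromosome:02d}.{start_mbp:03d}{side}"


def tg_block_letter(start_bp: int) -> str:
    """Letter for the 10Mbp group a TG starts in: A = 0-10, B = 10-20, C = 20-30."""
    return chr(ord("A") + start_bp // 10_000_000)


print(tg_label(7, 27_400_000))        # before any side is assigned
print(tg_label(7, 27_400_000, "PM"))  # paternal side, his maternal ancestry
print("07" + tg_block_letter(27_400_000) + "25")  # older style with Ahnentafel digits
```

Because the chromosome and the Mbp start are zero-padded, these labels sort correctly as plain text in a spreadsheet.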

A9. Also note that each TG has many genes within it. Each of your 22,000 or so genes will have a specific location on your DNA. If you become curious about any particular gene, you can look up where it is located (chromosome and location). You will have two of each gene, one on your maternal chromosome and one on your paternal chromosome. You can then look at your chromosome map of TGs and see which maternal TG and which paternal TG it’s in. If you’ve determined a Common Ancestor for those TGs, you’ll know which ancestor passed that gene down to you. You can also add a gene to your spreadsheet, so that it sorts with all the segments and TGs. Examples: Short Sleepers gene BHLHE41 would be 12.026M and 12.026P (very close to LRRK2 (Parkinson’s) at 12.031). Also my Neanderthal segment is 10.130 (I don’t know which side).

Have fun with your TGs!


15B Segment-ology: The Attributes of a TG by Jim Bartlett 210919

Understanding and Using TGs

Each Triangulated Group represents a segment of DNA that comes from a specific ancestor in your ancestry. It comes from that ancestor, down one specific path, to you. Even if you have the same ancestor in your ancestry more than once, a specific TG segment comes from only one instance of that ancestor, down one specific path to you.

Review Ancestral Segments

Let’s review ancestral segments. Please also refer to the posts and figures for Segments: Bottom-Up and Segments: Top-Down. At each generation of ancestors, each one passes down specific segments to you. In other words, your entire DNA is made up of segments from your ancestors of each generation. All 4 of your grandparents passed down various specific segments to you that make up all of the DNA on all of your chromosomes – all of your DNA came, in various segments, from those 4 ancestors. The same statement can be made of all 8 of your Great grandparents – you have different segments from each of your 8 G grandparents, which, in total, provide all of your DNA on every chromosome. Some Great grandparents may not contribute to each and every chromosome, but all of your chromosomes will be a patch-work-looking quilt of segments from your Great grandparents. The same is true for every generation: your DNA is made up from segments from your ancestors in that generation. As you go back 6 or 7 or 8 generations, some of the ancestors of that generation will drop out. That is you don’t get DNA from every one of them. They are still your ancestors, you just didn’t inherit DNA from them. Review The Porcupine Chart.

Another way to put it is that segments from distant ancestors are combined to make larger segments in closer ancestors. The smaller segments are still there, it’s just that the closer ancestor passed down larger segments which are made up of the smaller segments from their ancestors. Your parent passes complete chromosomes (very large segments) to you, which are made up of smaller segments from his/her ancestors. See the graphics in Top-Down.

Mapping with Triangulated Groups

Now the hard part of chromosome mapping with Triangulated Groups (TGs) is that the TG segments don’t generally create a map for any specific generation. The TG segments could come from ancestors of different generations. It all depends on how your TGs are laid out, and on the shared segments you get with Matches. You will only “see” TG segments that are made up of shared segments from Matches. So some TG segments may be from closer generations than others. But, in general, your TG segments will be adjacent to each other from the start to the end of each chromosome.

Cousinships with Matches in a TG

A TG is made up of Matches with shared segments. Review: The Anatomy of a TG. The shared segments (with Matches) in a TG will generally fall into 3 categories:

  1. Shared segments with Matches who are cousins on the Common Ancestor who created your TG segment. For a TG segment created by a 6G grandparent, this would be 7th cousins. Note: it would be very unlikely for you to have more than three exact 7th cousins (each one from a different child) from a 6G grandparent on one segment (i.e. in a TG). More on this later.
  2. Shared segments with Matches who are closer cousins – they share a Common Ancestor with you who is closer than the 6G grandparent who created your TG segment. Note: it would be very unlikely for you to have more than three exact 4th or 5th or 6th cousins (each one from a different child) from the same xG grandparent on one segment (i.e. in a TG). More on this later. These closer cousins would share a Most Recent Common Ancestor (MRCA) who is in the direct path of descent from the 6G grandparent (who created the TG segment) down to you. Because of this they also share the 6G grandparent ancestor with you. But, in this case, you will both descend from the same child of the 6G grandparent, and that does not count against the “limit” of three 7th cousins from a specific ancestor.
  3. Shared segments with Matches who are more distant cousins – they share a Common Ancestor with you who is ancestral to your 6G grandparent who created your TG segment. This frequently happens when the shared segment is less than the full TG segment from the 6G grandparent (for example). So these Matches could be 8th or 9th or 10th cousins or even more distant. They descend from a smaller “sticky” segment which is passed down to them (along a specific path) and down through your 6G grandparent to you, along the same path as categories 1 and 2 above. So you might be 9th cousins on an 8G grandparent, who is a grandparent of your 6G grandparent. In fact, when your shared segment (in a TG) is in the 7 to 10cM range, the odds are roughly 80% that the Common Ancestor will be more than 10 generations (9th cousins) back. That’s the bad news – most of these smaller shared segments will be at the far reaches, or beyond, our genealogies. The good news is that roughly 20% of the Matches (sharing 7 to 10cM segments) will be closer – spread among 1st to 8th cousins. For shared segments in the 10 to 20cM range, the odds are roughly 60% that the Common Ancestor will be more than 10 generations back – which means 40% of those Matches will be closer cousins. In fact, with 100 shared segments in a TG in the 10-20cM range, you’d probably have two or three 3rd cousins, two or three 4th cousins, two or three 5th cousins, two or three 6th cousins, etc. You can see a graph of these probabilities in this article at ISOGG.

100 Matches on a TG!

So the next issue is: can we have 100 Matching cousins in a TG? The answer is absolutely!  The odds are almost nil that more than 4 children will pass down the same matching segment to cousin descendants. Hence the caution that we almost certainly cannot have more than 4 cousins, from different children of a specific ancestor, passing down matching shared segments to cousins in one TG (i.e. on one segment). Each child inherits a different mix of DNA from the parents – rarely will more than four of them get the same DNA, and much more rarely would their descendants wind up sharing that DNA on the same segment (i.e. in a TG).

However – 100 Matches can easily descend from an ancestor 10 generations back without violating that concept. You and a sibling could share the same segment with two siblings who are 1st cousins from a grandparent. Similarly 4 Matches could be 2nd cousins with no more than two children from each ancestor. And 8 more Matches at 3rd cousin level – up to 512 Matches at 9th cousin level (10 generations back). And you could double that number by adding a once-removed cousin at each level. All without using more than two children at each generation. And this is all within a genealogical timeframe of 10 generations. If you consider that 80% of the Matches sharing 7-10cM will be well beyond the 10 generation level, the potential number of Matches in a TG becomes quite large. Of course this means the Matches are not all exactly at the same cousinship level – they are very likely to be spread over a range.
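The doubling in the paragraph above is easy to check with a few lines of Python. This is just the arithmetic of that paragraph (2 Matches at 1st cousin level, doubling at each level, no more than two children used per generation); the function name is made up for the sketch.

```python
def max_matches_at_level(cousin_level: int) -> int:
    """Potential Matches at a cousin level when the count doubles each level."""
    return 2 ** cousin_level  # 2 at 1C, 4 at 2C, 8 at 3C, ...


for level in range(1, 10):
    print(f"{level}C: up to {max_matches_at_level(level)} Matches")

total = sum(max_matches_at_level(level) for level in range(1, 10))
print("1C through 9C combined:", total)
```

The 9th cousin line shows 512, and the combined total for 1C through 9C is 1,022, before adding any once-removed cousins.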

Large numbers of Matches in a TG are not a pile-up!

So clearly a large number of Matches in a TG should not sound an alarm, arbitrarily define the TG as a pileup area, or require that all the Matches be discarded. Nonsense! Large TGs, with Matches matching each other on overlapping shared segments, just indicate that many of the Matches will be more distant cousins. Some Matches will be beyond the genealogies of some – it depends on each genealogist and how far back their Tree goes, and the same for their Matches. But, statistically, there will be some closer cousins in the mix of Matches in the TG. We won’t know which are the closer cousins until we share with them.

The ancestor who “created” your TG segment

In the above discussion I note the ancestor who “created” the TG shared segment. This means the most distant ancestor who had that full segment. His or her ancestors did not have that full segment. His or her parent created (at least) that full segment from their parents. And the segment you inherited from that ancestor is all or, probably more likely, a subset of the segment that ancestor passed down, and eventually got to you. We are talking about the unique segment you inherited from that ancestor, and that segment did not exist in any ancestor further back. The TG segment is unique to you and your ancestor.

Your TG/Segment is unique to you

The ancestor who created your TG segment probably passed down various overlapping segments to your Matches. These segments may be larger or smaller than the segment you got. What you “see” in a shared segment is the part your Match got that overlaps the segment you got. So from a Match’s perspective, there is almost certainly a different segment from the Common Ancestor than you got – the Match would have a different TG than you from this same ancestor. Or some of your Matches may get segments from a more distant ancestor. These segments would be smaller and, when combined with other DNA, would result in the DNA segment that your ancestor passed down to you. Thus, those other Matches would be more distant cousins, on a smaller segment than what you got. There are many scenarios that would result in a Match (sharing a segment with you) being a more distant cousin – refer back to alternative 3 above.

The name of the atDNA game is sharing and collaboration

This accounts for some Match cousinships which can be pretty distant.  I have to emphasize that, although some of the TG Matches will be fairly distant (and often beyond your genealogy), some Matches in each TG are probably within a genealogical timeframe. This brings up two important points when using TGs:

  1. Contact every Match you can – you never know which ones may have a Common Ancestor you can identify.
  2. Encourage Matches within a TG to share Trees among the group. Some Matches in a TG may be closer cousins to each other than they are to you, and they may in fact have a Common Ancestor who is ancestral to the ancestor of your TG. Sharing and collaboration are the name of this game.


15A Segment-ology: Understanding and using TGs by Jim Bartlett 20160917


Anatomy of a TG

This is another blog post that gives you some idea of what to expect with autosomal DNA and your segments. In this post we’ll look at the formation of a TG. We’ll walk through the steps:

  1. Start with overlapping segment data
  2. Simplify the data by rounding
  3. Sort by Chromosome and Start location
  4. Then Triangulate the segments (no genealogy required)
  5. Highlight one of the two resulting TGs
  6. Show this data graphically – like you’d see in a chromosome browser
  7. Overlay the total TG
  8. Then use our imagination and x-ray vision (or GEDmatch) to show what the ancestral segments of the Matches might look like
  9. Do some analysis…


Figure 1. Some overlapping segment data

10B Figure 1

Letters represent Match names – data is taken from my spreadsheet.

Figure 2 – divide the Start/End locations by 1000000

10B Figure 2

It’s much easier to read the Start/End locations in Mbp; and it’s just as accurate for genealogy.

Figure 3 – the data is sorted by Chromosome and Start location

10B Figure 3

This makes it much easier to see overlapping segments.
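If your segment data lives in a script instead of a spreadsheet, the rounding and sorting steps shown in Figures 2 and 3 could be sketched as follows. The rows here are invented for illustration (they are not the data in the figures), the field names are hypothetical, and rounding to one decimal place of Mbp is an assumption.

```python
from typing import Dict, List


def simplify_and_sort(rows: List[Dict]) -> List[Dict]:
    """Convert Start/End from bp to Mbp, then sort by Chromosome and Start."""
    simplified = [
        {**row,
         "start": round(row["start"] / 1_000_000, 1),
         "end": round(row["end"] / 1_000_000, 1)}
        for row in rows
    ]
    return sorted(simplified, key=lambda row: (row["chr"], row["start"]))


# Made-up example rows: Match name, chromosome, Start/End in base pairs
rows = [
    {"match": "B", "chr": 16, "start": 53_912_345, "end": 61_004_321},
    {"match": "A", "chr": 16, "start": 20_145_678, "end": 31_987_654},
    {"match": "C", "chr": 7, "start": 27_400_000, "end": 39_950_000},
]

for row in simplify_and_sort(rows):
    print(row["match"], row["chr"], row["start"], row["end"])
```

After sorting, overlapping segments sit next to each other, which is what makes the next step (Triangulation) practical.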

Figure 4 – this shows the results of Triangulation into groups 16A and 16B.

10B Figure 4

No genealogy was involved in this process – it’s purely a matter of comparing segments at 23andMe or GEDmatch; or looking for ICW Matches in this list and each ICW list at FTDNA. Again, this is real data from my spreadsheet. Often there is more mixture between the two TGs, but I hope you get the idea.
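To make the grouping idea concrete, here is a simplified Python sketch. It assumes you have already recorded which pairs of Matches match each other on the overlapping segment (from comparing them at 23andMe or GEDmatch, or from ICW lists); that information is the `matching_pairs` input, and the names and data are invented. The sketch puts two segments in the same group when they overlap and the two Matches are known to match each other. It is a simplification: it chains pairs together rather than checking every pair in a group, so it is not a substitute for actually doing the comparisons.

```python
from itertools import combinations
from typing import FrozenSet, List, Set, Tuple

Segment = Tuple[str, float, float]  # (Match name, start Mbp, end Mbp)


def triangulate(segments: List[Segment],
                matching_pairs: Set[FrozenSet[str]]) -> List[List[str]]:
    """Group overlapping segments whose Matches also match each other."""
    parent = {name: name for name, _, _ in segments}

    def find(name: str) -> str:
        # Follow parent links to the representative of this name's group
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for (a, a_start, a_end), (b, b_start, b_end) in combinations(segments, 2):
        overlap = min(a_end, b_end) - max(a_start, b_start)
        if overlap > 0 and frozenset((a, b)) in matching_pairs:
            parent[find(a)] = find(b)  # same TG

    groups = {}
    for name, _, _ in segments:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Four overlapping segments on one chromosome (made-up data)
segments = [("A", 20.0, 30.0), ("B", 22.0, 31.0),
            ("C", 21.0, 29.0), ("D", 23.0, 32.0)]
matching_pairs = {frozenset(("A", "B")), frozenset(("C", "D"))}
print(triangulate(segments, matching_pairs))
```

All four segments overlap, but A and B match only each other, and likewise C and D, so the result is two groups, one for each of the two chromosomes in the pair.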

Figure 5 – Here is only TG 16B data

10B Figure 5

It’s still arranged by Chromosome and Start location.

Figure 6 – Same data and the shared segments displayed graphically

10B Figure 6

This is how you’d see the data in a chromosome browser. Note the top 11 bars will all match each other. The bottom bars will usually all match each other too, and they’ll usually also match the top 8 bars, but maybe R and N will not match at the 7cM level at GEDmatch. Just lower the level to 500 SNPs and 5cM and you’ll find there is enough for a Match. Let’s see what the TG for this data looks like…

Figure 7 – Now the fun begins…

10B Figure 7

Usually the TG is pretty clear cut, but I’ve intentionally selected one with two kinds of ambiguity. In almost all cases the ends of the segments are fuzzy. You can read about Fuzzy Data in my blog post here.

Judgment is needed at this point. I’ve shown the “guaranteed” TG in red, with orange tips where the data looks fuzzy. I want to emphasize that this is NOT a problem for genealogy – the TG (wherever the true crossover points are that define the TG Start and End locations – somewhere in the orange areas) represents an ancestral segment from one of your ancestors. The fuzzy ends are not an issue. Your Matches will share a Common Ancestor with you – and that’s where the focus should be. The crossovers defining the real TG will be somewhere in the fuzzy orange tips.

You can also see that this data indicates a probable more distant crossover point – around say 51Mbp. In this case the top 8 Matches and Match S are probably closer cousins sharing a larger, closer segment with you. At this point you might want to review Crossovers by Generation here. Going back one or more generations we may see the large red TG being subdivided into two smaller ancestral segments – each with its own Common Ancestor, each one of which is ancestral to the Common Ancestor for the red segment. In this case the last 8 Matches (T, J, Q, L, M, I, A and K) will have a different, more distant CA than Matches G and H. The main, red, TG may be from a 5G grandparent, and the smaller, green and purple segments may be from a 6G or 7G grandparent. Actually the purple segment, as an example, may pass intact through several generations, and you could share this same segment with 7C and 8C…

Again, most of your TGs will be tighter and from a single CA, but I wanted to take this opportunity to show what sometimes happens. You can avoid any conflict by watching for this situation and just declaring two TGs in this case – see the green and purple bars. Then it’s like the case where a close cousin spans more than one TG – the close cousin will help you define a larger segment from a closer ancestor, and the close cousin, along with different groups of Matches in different TGs will share more distant Common Ancestors with those TGs, but those more distant CAs will be ancestral to the MRCA the close cousin shares with you.

So now let’s use our imagination a little (or we could actually Triangulate this area from the perspective of some of our Matches). In this next Figure 8, I’ve guessed at what the ancestral segments might be from the Common Ancestor down to the different Matches – as shown in green.

Figure 8 – showing ancestral segments for all Matches

10B Figure 8

In some cases our Matches have somewhat larger ancestral segments than we have, or they might have segments that extend over one end of our ancestral segment, or the other. In all cases the blue represents the overlap between each Match’s ancestral segment and our own ancestral segment. And the data is not exact, so the ends often don’t line up vertically. The long green bar at the bottom of Figure 8 is a segment an ancestor passed down to living people – you got part of it, the red part.

At GEDmatch it’s often fun, and instructive, to compare two Matches to each other. Sometimes they turn out to be parent/child, or siblings. This is the exercise you’ll want to do if you are trying to map an ancestor.

So, again, if you have collected a lot of shared segments from FTDNA, 23andMe and GEDmatch, they have to go somewhere. It’s not hard to compare them to each other and see where they Triangulate. If they are IBD, they have to go on one chromosome or the other. When you do this you’ll find there are natural break points where the crossover points are located (often the precise location is a little fuzzy). Just look at the data above.


05D Segment-ology: Anatomy of a TG by Jim Bartlett 20160204

Crossovers by Generation

There has been some amount of discussion about segment size, triangulation, and the number of cousins who can share a Triangulated Group. The discussion often uses terms like extremely rare, small segments, distant ancestors, etc. without using specific examples. The arguments go from it’s OK to triangulate with close relatives, to it’s virtually impossible with distant relatives – and there is no discussion of any middle ground. The odds do diminish as you go back in ancestry, but there is no artificial dividing line: closer works, distant doesn’t work. There is always a gradation – shades of gray, if you will. Let’s see if we can put boundaries on it.

In my mind, one way to try to see the forest, and the trees, is to really take a look at an average genome (23 chromosomes, 3 billion base pairs), and see what kind of segments we might see at each generational level. Most of us know that we get pretty large segments from our grandparents, and the size drops down with each generation as we work our way back/up our ancestry.  So let’s develop a table and take it back and see what we have.

The average number of crossovers per generation is 34. Yes, the average for males (fathers) is 27, and the average for females (mothers) is 41 (per ). But this difference (with respect to the total number of crossovers in a genome) fades after just a few generations – so we’ll use the average, 34.

Crossover Points in One Generation

Let’s start with a parent and 23 pairs of chromosomes. In passing a genome to a child, this parent adds 34 crossovers, which results in 23+34 = 57 segments. Here is Figure 1 showing 34 crossovers and the 57 segments in one genome:

05D Figure 1

These are generally large segments from the grandparents. On average, these segments will be 3,400 cM divided by 57 segments or about 60cM per segment. But clearly some are larger and some are smaller. Sometimes a chromosome is passed intact – see Chr 21 above. You can try this at home, on a sheet of paper – just make 23 horizontal lines and put 34 vertical tic marks on them. You can put a few more or less tic marks, but the overall picture of relatively large segments from your grandparents will be the same.

The important observation here is that you have these ancestral segments on your chromosomes – they are fixed between fixed crossover points created when your parent passed these chromosomes to you.  Of course you don’t know where they are at first, but as you determine Triangulated Groups (TGs) with various cousins, you’ll find that none of the shared segments span across one of these crossover points. And in fact, with enough shared segments you will start to see these crossover points firm up, with separate TGs (from the other grandparent) on either side of them. This chromosome mapping, with shared segments, identifies the crossover points for your ancestral segments. The shared segments with Matches usually only overlap part of your ancestral segment from a Common Ancestor – in this case a grandparent.

Crossover Points in Two Generations

Adding 34 tic marks per generation is a good exercise to carry out for several generations and get the feel for how this works. Let’s try another 34 vertical tic marks. We’ll add the tic marks to show the crossover points that were formed when grandparents passed the chromosomes (which they got from their parents) down to your parents. In effect this takes the 57 segments we had in Figure 1, and (with 34 more crossovers) creates 91 total segments as shown in the genome in Figure 2:

05D Figure 2

We still have fairly large segments. On average now, these ancestral segments are 3,400/91 = about 37cM per segment. Again – some will be larger, some smaller. Each of these segments in Figure 2 (between tic marks – both old and new) is from a great grandparent. These segments fill up each and every chromosome in this genome. You may note that some of the grandparent segments were not subdivided. This is not unusual. In fact it has to happen. We started with 57 ancestral segments and added 34 new tic marks (crossover points) – so 34 segments got subdivided and 23 segments did not.

Crossover Points for 13 Generations

In the next generation back, we would add 34 more new tic marks (crossovers) which would subdivide only 34 of the 91 ancestral segments creating a total of 125 ancestral segments from 2G grandparents, and leaving 57 segments untouched (no subdivision). Here is a table in Figure 3 carrying this math out for 13 generations:

05D Figure 3
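The arithmetic behind this kind of table can be sketched in Python. The segment counts and average ancestral segment sizes follow directly from the numbers used in this post (23 chromosomes, 34 crossovers per generation, 3,400 cM for one genome). The average shared cM with an nth cousin is computed here as 3,400 / (2 × 4^n); that formula is my reading of the values quoted in the discussion of the table (106, 27, 6.6, 1.7 and 0.4 cM), so treat it as an assumption, and the function names are made up for the sketch.

```python
TOTAL_CM = 3400          # cM in one genome (one parent's contribution)
CHROMOSOMES = 23         # segments at Gen 0
CROSSOVERS_PER_GEN = 34  # average crossovers added per generation


def segments_at(gen: int) -> int:
    """Total ancestral segments in one genome at a generation."""
    return CHROMOSOMES + CROSSOVERS_PER_GEN * gen


def avg_segment_cm(gen: int) -> float:
    """Average ancestral segment size (cM) at a generation."""
    return TOTAL_CM / segments_at(gen)


def avg_shared_cm(cousin_level: int) -> float:
    """Average cM shared with an nth cousin (assumed formula, see lead-in)."""
    return TOTAL_CM / (2 * 4 ** cousin_level)


print("Gen  Ancestors  Segments  Avg cM  Seg/Ancestor  Cousin  Shared cM")
for gen in range(0, 14):
    ancestors = 2 ** gen  # ancestors on one side at this generation
    shared = f"{avg_shared_cm(gen):9.2f}" if gen >= 2 else "        -"
    print(f"{gen:3d}  {ancestors:9d}  {segments_at(gen):8d}  "
          f"{avg_segment_cm(gen):6.1f}  {segments_at(gen) / ancestors:12.2f}  "
          f"{str(gen) + 'C':>6}  {shared}")
```

All of these are averages; in actual practice there is a lot of variation around every number in the output.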

Discussion of Figure 3:

Note: This is a table with various values, depending on which generation you are focused on. So successively, pretend you are at a particular generation and read across to see the statistics. Cousins are abbreviated: 2nd cousin is 2C; 2nd cousin once removed is 2C1R.

– Gen 0: You have 23 chromosomes from a parent (we are only working on one genome, so the number of ancestors is 1). Your parent gave you 23 very large segments (which are chromosomes).

– Gen 1: You get DNA contributions from your 2 grandparents. This is in 57 segments spread over one genome. At this level of your ancestry you would see Matches with 1Cs. Review this in Figure 1.

– Gen 2: You get DNA contributions from your 4 Great grandparents on one side. Now you have 91 ancestral segments spread over 23 chromosomes, and each segment averages about 37cM. Some of these ancestral segments are larger, and some are smaller; and they all add up to 23 complete chromosomes (one full genome). This is the generation that you usually share with 2Cs – review Figure 2. In Figure 3 I also show the calculated shared segment values for the various cousins. With a 2C, you would normally share a total of 106cM (from one side). But the average size of the segments from the Great grandparents is only 37cM. This reflects the fact that you will probably share multiple segments with a 2C – perhaps on average three 37cM segments totaling 111cM… Remember these are averages and in actual practice there is a LOT of variation.

– Gen 3: This shows an average ancestral segment size of 27cM from your 2G grandparents – spread over 125 total segments. The total shared segment for a particular 3C is about 27cM – so you might expect a single segment from a 3C (again, this is just an average, but it might reflect what you often see). I’ve underlined ancestral segment (what you actually got from an ancestor), and shared segment which is the overlap between you and a Match. This overlap is rarely exactly the same ancestral segment in both you and your Match – one or both of you probably has somewhat more in the full segment you got from the Common Ancestor.

NB: this overlapping (shared) segment vs ancestral segment difference may be the root cause of some math calculations which have been touted as proving that exact matches among more than 3C are very rare.  Several cousins having the exact same ancestral segment may be fairly rare, but experience with Triangulated Groups shows that overlaps are not that rare.

– Gen 4: Ancestral segments (averaging 21 cM) from your 16 3G grandparents are spread over about 159 segments. So you would see, on average, an ancestral segment from each 3G grandparent in roughly 10 different segments spread over the chromosomes in that genome. Most of your Matches would be 4C (or 3C1R or 4C1R). The shared segments would average 6.6cM, but another way to look at this is that roughly half of them would be over 7cM. However, experience shows that a relatively small percentage of our Matches are 4C and closer relatives. So there are not many such Matches to cover all the segments in our genome.

-Gen 5: Our 32 4G grandparents still give us fairly large 17cM ancestral segments (on average) spread out over 193 segments. We would still see most of our 4G grandparents in multiple segments. Our 5C Matches only share, on average, 1.7cM. So only some of them, on the tails of the distribution curves, will share 7cM or more. The offset is that we have so many 5Cs, that we still get plenty of IBD matches with them. However, the key point here is that while we may have a 17cM ancestral segment from a 4G grandparent, a 5C is only likely to share part of that with us. It would take several 5Cs, each with a 7-10cM segment, partially overlapping our own ancestral segment, to “cover” our 17cM ancestral segment.  In practice we often get 5C Matches with above average segments, but usually not as large as 17cM.

-Gen 6: Our 64 5G grandparents pass down ancestral segments to us that average about 15cM. They pass these down to an average of 227 segments; and each 5G grandparent will pass down DNA to 3 or 4 different segments, on average. Perhaps some of our 5G grandparents won’t have DNA that reaches us at all, while others may pass down 5 or more segments – roughly, it usually averages out. At this level most of our Matches will be 6C, give or take a little. A 6C, on average, only shares 0.4cM of DNA with us. But there are long tails on these distribution curves, AND we have a LOT of 6Cs. The result is that we do have many 6Cs who share IBD segments with us over 7cM. Yes, the probability of a specific 6C shared segment is one fourth the probability of a 5C, but we have so many more 6Cs than 5Cs that we actually get more Matches with 6Cs. This means more 6C Matches are out there with a shared segment over 7cM than there are for 5C. Again, it will normally take several of them to “cover” an ancestral segment (a TG).

-Gen 8: Skipping a generation to the 256 7G grandparents. At this point there are an average of 295 segments, or about one segment per 7G grandparent. Clearly by this time some of the 7G grandparents do not contribute to your DNA, and some 7G grandparents contribute to several ancestral segments. Your ancestral segments are in the 11-12cM range, on average. And despite the fact that 8Cs only share a small amount of DNA on average, there will still be many 8C with shared segments above a 7cM threshold.


All through this analysis, the number of ancestral segments has increased by a constant 34 with each generation; the average segment size starts off large and decreases with each generation, but even after 13 generations, the average ancestral segment is still over 7cM; the number of ancestors continues to double with each generation (and at some point duplicates will start to appear, but as I’ve outlined in Endogamy I and II, each duplicate really acts like a separate ancestor); and the average size of shared segments decreases by a factor of 4 with each generation, but we still see many Matches with shared segments over 7cM.

To expand on this last point, I have over 10,400 “phased” Matches at AncestryDNA, with all the pile-ups and IBS already culled out. About 400 of these Matches are 4C or closer, leaving over 10,000 Matches in the 5C or more distant range. The distribution of these is spread out among 5C, 6C, 7C, 8C, etc. It is, so far, unclear how far back these go, but clearly there are many in the 5C-8C range. And AncestryDNA claims their “phasing” program has less than a 1% error rate. So 99% of these are IBD shared segments, probably most in the 6C-to-8C range. To my thinking, this means most of them must line up somewhere on our chromosomes. If we assume half, or 5,000, of these Matches are for each genome, on average, then these 5,000 Matches must be on 300 to 400 of my ancestral segments – or over 10 Matches in the 5C-8C range on every segment, on average. Some ancestral segments (TGs) may have more, some may have less, but the 5,000 IBD Matches have to go somewhere. I’ve picked on AncestryDNA here, because they pooh-pooh Triangulation (I think they don’t really understand it), and because they have equations that some have used to argue that we cannot have multiple 4C or above in TGs.
But the same analysis is true using 23andMe and FTDNA data – they each report many Matches, they each claim a small IBS rate (under 5%), and by their own estimates, most of our Matches are beyond 4C. All of these IBD Matches have to be on our chromosomes somewhere. And, in 14 months (by my estimation), we will have twice as many Matches as we have now – we’ll have over 20 Matches per ancestral segment (TG)!
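
The matches-per-segment arithmetic above can be sketched in a few lines. This is only a rough illustration using the estimates quoted in the text (about 10,000 distant Matches, half on each genome, 300 to 400 ancestral segments per genome); the function name is made up for this example.

```python
def matches_per_segment(total_matches, segments_per_genome, genomes=2):
    """Average number of Matches landing on each ancestral segment (TG).

    Assumes the Matches split evenly between the two genomes
    (Mom's side and Dad's side), as the text does.
    """
    return total_matches / genomes / segments_per_genome


distant_matches = 10_000  # Matches in the 5C-or-more-distant range (from the text)

for segments in (300, 400):  # estimated ancestral segments per genome
    now = matches_per_segment(distant_matches, segments)
    doubled = matches_per_segment(2 * distant_matches, segments)
    print(f"{segments} segments: {now:.1f} per segment now, "
          f"{doubled:.1f} if the Match list doubles")
```

With 400 segments this gives 12.5 Matches per segment, which is where "over 10 Matches on every segment" comes from; doubling the Match list doubles that figure.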

NOTE: the number of crossovers per generation will average out. So the number of segments created by each generation is fairly accurate – there is much less variation in these numbers than you might find in the average cM for an ancestral segment (which has a somewhat wider range) or a shared segment (which appears to have a much wider range).
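
The generation-by-generation numbers quoted above follow a simple pattern, and a short sketch reproduces them. Note the assumptions: the 3400cM per genome is inferred here because it reproduces the averages in the text (37cM, 27cM, 21cM and so on) and is not stated in the post; the 23 starting chromosomes and 34 new segments per generation do come from the text. The function names are hypothetical.

```python
GENOME_CM = 3400         # ASSUMPTION: cM in one genome (one parent's side)
BASE_SEGMENTS = 23       # 23 whole chromosomes before any crossovers
SEGMENTS_PER_GEN = 34    # segments added by each generation (from the text)


def ancestral_segments(gen):
    """Number of ancestral segments in one genome at generation `gen`."""
    return BASE_SEGMENTS + SEGMENTS_PER_GEN * gen


def avg_ancestral_cm(gen):
    """Average size (cM) of an ancestral segment at generation `gen`."""
    return GENOME_CM / ancestral_segments(gen)


def avg_shared_cm(gen):
    """Average cM shared on one side with a cousin at generation `gen`
    (Gen 2 = 2C, Gen 3 = 3C, ...); it drops by a factor of 4 per generation."""
    return GENOME_CM / (2 * 4 ** gen)


print("Gen  ancestors  segments  avg segment cM  avg shared cM")
for gen in (2, 3, 4, 5, 6, 8, 13):
    print(f"{gen:>3}  {2 ** gen:>9}  {ancestral_segments(gen):>8}  "
          f"{avg_ancestral_cm(gen):>14.1f}  {avg_shared_cm(gen):>13.2f}")
```

The segment counts grow in a straight line while the number of ancestors doubles, which is why by Gen 8 there is only about one segment per ancestor, yet the average ancestral segment stays above 7cM out to Gen 13.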

“the main thing is to keep the main thing the main thing”

  1. Your genome (chromosomes) is divided into segments by crossover points.
  2. These are your ancestral segments, and each one is from a specific ancestor.
  3. Each Match will have his/her own crossover points and ancestral segments from specific ancestors.
  4. When you share an IBD segment with a Match this segment comes from a Common Ancestor (CA).
  5. A shared segment means your ancestral segment and your Match’s ancestral segment overlap.
  6. Your Match may have a small ancestral segment, which falls within your ancestral segment; a large ancestral segment, which includes your ancestral segment; or, usually, any size ancestral segment which overlaps a portion of your ancestral segment.
  7. The overlapping amount may be relatively small (say 7cM), or as large as your ancestral segment.
  8. The odds are very small that you and a Match would get exactly the same segment from a CA. And certainly the odds would be extremely small that you and several Matches would get exactly the same ancestral segment from a Common Ancestor.
  9. However, from the numbers of IBD shared segments we are getting from Matches, compared to the number of ancestral segments, it is highly probable that multiple Matches can and do have ancestral segments which overlap your ancestral segments.


Note: A full Triangulated Group (TG) is equivalent to one of your ancestral segments. Which ancestral segment the TG represents depends on the shared (overlapping) segments you have with your Matches.  Several Matches with overlapping segments in a TG will tend to “wall paper” your ancestral segment – with enough of the right Match/segments your TG will cover the whole ancestral segment. Some TGs may be from a closer ancestor (say a great grandparent), some may be somewhat more distant (say a 7G grandparent). From my experience, most TGs will be in the 10-40cM range. This does create a hodge-podge effect (with TGs from different generations), but the TGs tend to be adjacent to each other from one end of each chromosome to the other. Alternatively, you can try to map to a specific generation – perhaps starting with grandparents (and determine those crossovers), and then determine which of those segments are subdivided into smaller segments from the great grandparents, and which segments remain intact going back that one generation. And then continue in this fashion with each additional generation. The drawback to this process is that you need many close relatives to take DNA tests to determine all the crossover points at each generation.
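
The “wall paper” idea can be illustrated with a small interval calculation. This is a sketch under simplifying assumptions: segments are given as (start, end) positions measured in cM along one chromosome, and the example values are invented to mirror the 17cM ancestral segment discussed under Gen 5. The function name is hypothetical.

```python
def covered_cm(ancestral, shared_segments):
    """How much of the ancestral segment is covered by the union of the
    shared (overlapping) segments from Matches.

    `ancestral` and each shared segment are (start, end) tuples.
    """
    a_start, a_end = ancestral
    # Clip every shared segment to the ancestral segment; drop any that miss it.
    clipped = sorted(
        (max(s, a_start), min(e, a_end))
        for s, e in shared_segments
        if min(e, a_end) > max(s, a_start)
    )
    covered = 0.0
    reach = a_start  # furthest position covered so far
    for s, e in clipped:
        if e > reach:
            covered += e - max(s, reach)
            reach = e
    return covered


ancestral = (0.0, 17.0)                              # a 17cM ancestral segment
matches = [(0.0, 8.0), (5.0, 13.0), (10.0, 17.0)]    # three overlapping 7-8cM shares

print(covered_cm(ancestral, matches))        # full coverage: 17.0
print(covered_cm(ancestral, matches[:1]))    # one Match alone covers 8.0
```

No single Match in this example shares more than 8cM, yet together the three shared segments cover the whole ancestral segment.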


A final word of caution: don’t get too lost in the details or the math. Generally, you will have many Matches and IBD segments. Because they are IBD segments, they have to go somewhere on Mom’s side or Dad’s side. 23andMe and FTDNA have developed algorithms to help ensure that most of your Match segments over 7cM are IBD, and from experience we know that almost all of the shared segments over 10cM are IBD, and well over half of the 7-10cM segments are IBD. So if you are reading this blog, you are probably into utilizing segments, along with your genealogy, to improve your family Tree. You should also upload to GEDmatch to find other Matches (from all 3 testing companies) with segments. When segments over 7cM Triangulate, it’s a very strong indication that those segments are IBD and the resulting TGs are from a Common Ancestor. You have an ancestral segment at the location of each TG, and your Matches share part of that ancestral segment with you. Each ancestral segment (TG) came from one of your parents and one of your grandparents, etc. Match/segments in that TG have to come from a distant ancestor who is ancestral to that grandparent. There is no cutoff to this process. We cannot say that only our large ancestral segments are valid. All of our ancestral segments came from a specific ancestor. Our ancestral segments have their own ancestral “Tree”. You may be more confident about a TG including a first or second cousin, but you probably don’t have enough tested cousins to cover every TG over all of your chromosomes. That doesn’t mean these other TGs are not valid, it just means you don’t have a close cousin to validate them. You have to use the closest cousin you can find to validate each TG. Your ancestral segments are real! They are part of you, from your ancestors. And Matches who share those segments also share their ancestry – no matter how far back the Common Ancestor is.
Note from Figures 1 and 2 that segments from more distant ancestors are “nested” within larger segments from closer ancestors. So if you cannot determine the most distant Common Ancestor, look for the closer Common Ancestor who provided the larger ancestral segment.


05D Segment-ology: Crossovers by Generation by Jim Bartlett 20160201

Endogamy PART II

Endogamy Part II – One Segment from One Ancestor


In Endogamy Part I (Shared DNA), we found that the total cM shared between you and a Match is multiplied by the number of times you had the Common Ancestor (CA) in your Tree. So if you and your Match were 5th cousins (5C), you would normally share 3.4cM. If your CA (between you and your Match) was in your Tree five times you would tend to share, on average, 5 x 3.4cM = 17cM. If your Match has that CA in her Tree three times, the total you would tend to share, on average, would be 3 x 5 x 3.4cM or 51cM. For close-cousin Matches (say 1C, 2C, 3C), this makes a big difference. For distant-cousin Matches (say 6C-8C or more), where you would probably not match at all most of the time, the endogamy may increase the total cM enough that you’ll actually get an above-threshold Match. Note: a 7C would normally share 0.1cM, so even with an Endogamy factor of E15, you’d only share 1.5cM, on average, and you’d need to be on the tail of the distribution curve to exceed a 7cM threshold for a Match.

In this post – Endogamy PART II – we’ll look at what happens to an individual segment.

Ground Rules

Shared Segment means an IBD segment, from an Ancestor. I usually consider all shared segments over 7cM in a Triangulated Group to be IBD segments.

One Ancestor or one CA means one Ancestral line. See CA and MRCA for a discussion of the CA for a shared segment. Note: at different cousinship levels, there may be intermediate CAs, or MRCAs, which are all in one Ancestral line. The shared segment comes from the most distant CA of that Ancestral line. In other words, all the Matches with shared segments (think a Triangulated Group), will have a CA with you on one Ancestral line – you and they will all descend from one CA.

Endogamy means a CA is in our Tree multiple times. So Ancestor A can be represented by A1, A2, A3, etc. for each time that Ancestor is in our Tree. As you read on, you’ll note that it’s important to treat each one (A1, A2, A3, etc.) as a separate Ancestor (even though they are all the same individual).

Assumption: Each Match will have one CA with you. I know in many cases a Match may share multiple Ancestors with you. But for the purposes of this blog post, we will only look at the effects of one CA. We have to build up, one concept at a time. Learn the concept in this post.

Shared Segments from Duplicate Ancestors – analysis

Let’s look at a shared segment between you and a Match. It could be any shared segment. Just to add some reality to this discussion, let’s say it’s on Chr 10 from 8 to 20Mbp with 23cM. We’ll call this SEG-1.

In this example, the CA for you and a Match is in your Tree twice: A1 and A2. Both A1 and A2 are the same person, so both A1 and A2 have the same SEG-1. Let’s look at Figure 1 and see how far SEG-1 can descend toward you.

16E Figure 1

Analysis of Figure 1:

In Generation 6 (G6), YOU and a MATCH are 4C, and share SEG-1.  SEG-1 came from Common Ancestor A. Red indicates that person has SEG-1.

In G1, your 2 ancestors, A1 and A2, have SEG-1; and your Match’s ancestor, A, has SEG-1. These are all the same individual – so naturally she has SEG-1, no matter where she is in your Tree or your Match’s Tree.

SEG-1 is passed down from A to the Match through one ancestral line.

In G2, SEG-1 is passed from A1 to her son; and from A2 to her daughter.

In G3 A1’s son passes SEG-1 to the paternal chromosome in his son; and A2’s daughter passes SEG-1 to the maternal chromosome in her son. This G3 son now has two copies of SEG-1, one on his paternal Chr 10, and one on his maternal Chr 10. This is indicated by **.

In G3, the ** father recombines his two Chr 10s to make one Chr 10 to pass to his son. Only one of the two SEG-1s can be passed on. The G4 son will only get one SEG-1. This is indicated by *. We don’t know whether the maternal or paternal SEG-1 is passed on – it’s a 50/50 chance for either. If you need a refresher on recombination and how only one area of two chromosomes can be passed down, please review Segments: Top-Down.

In G4 and G5, SEG-1 will be passed down to you, and you will share this segment with your 4C Match.

Starting with G1, there are lots of possibilities, but for you and your Match to both share SEG-1, it has to start in a CA (in this case A, A1 and A2) and be passed down through each generation to you and your Match.

One Segment from One Ancestor

This is a fundamental concept of genetic genealogy. Each shared segment can come from only one Ancestor. This means from only one of several Ancestors (A1, A2, A3, etc.) in Trees with endogamy. This can be extended to Triangulated Groups: each TG can come from only one Ancestor. A corollary is that a different shared segment (or TG) can come from a different Ancestor. With an Ancestor in your Tree 5 times, it is possible for you to have a different segment from each one. It’s also possible for one Ancestor to pass down several different shared segments (in different TGs). So although we can say “One Segment is from One Ancestor”, the reverse is not true. We have to say “One Ancestor can pass down Multiple Segments” (or no segment).

Shared Segments from Multiple Ancestors – analysis

We can apply this same concept – One Segment from One Ancestor – to even greater endogamy. See Figure 2.

16E Figure 2

Analysis of Figure 2:

In this case we have E5: the Common Ancestor is in your Tree five times. When cousins (descending from the CA) marry, their child may get two SEG-1s – one on each chromosome. This is indicated by the double **. The next generation gets only one SEG-1 from that line. As noted with the A3 line, the son in G4 could have double ** – one from A1 or A2 (paternal chromosome) and one from A3 (maternal chromosome). The daughter in G5 got a paternal SEG-1 from A1, A2 or A3, and a maternal SEG-1 from A4 or A5. You got a SEG-1 from A1 or A2 or A3 or A4 or A5. At this point we cannot tell which of your Ancestors passed down the SEG-1, but we do know it could only have come from one of them.

In this discussion, I’ve used the “worst case” scenario – each of your A Ancestors passed SEG-1 down as far as she could. In fact, SEG-1 is subject to the 50/50 rule – half the time it will be passed down, half the time it won’t. However, the fact that we share SEG-1 with a Match, and have a Common Ancestor A with that Match, means at least one of your Ancestors A had to pass it all the way down to you and the Match.

We are all well aware that you and a Match probably have multiple Common Ancestors. Again, this analysis does not sort out which Ancestor is the correct Common Ancestor for SEG-1, nor does it sort out which one of the multiple Common Ancestors it is. This analysis just establishes the point that:

One Segment comes from One Ancestor

A very unusual exception

In the case where your parents are cousins, it is possible (but not very probable) that you would carry two SEG-1s (from this example) – one paternal, one maternal. This would be the case in Figure 2 if you were the daughter ** in G5. At GEDmatch, any such segment areas would be highlighted with their “Are Your Parents Related?” tool. So it’s easy to check for such segments, and be aware of where they exist (Chromosome and Start/End locations). In all other cases you don’t need to worry about this issue. For any shared segments meeting this very unusual criterion (exactly the same segment on both chromosomes), you wouldn’t know which of your two ancestors it came from. If there were any difference in these two shared segments (they were not exactly the same), then chromosome mapping would usually tell you which one was which.


Each shared segment comes from one Ancestor. In the case of endogamy with multiple identical Ancestors, each shared segment comes from only one of them.

It’s possible to have ten identical ancestors in your Tree, and to get a different shared segment (as in a different TG) from each of them.


16E Segment-ology: Endogamy PART II – One Segment from One Ancestor by Jim Bartlett 20160104



Shared IBD segments come from a Common Ancestor (CA). Matching & overlapping IBD segments form Triangulated Groups (TGs). Every Match in a TG with significantly overlapping shared segments will have the same CA! And closer Matches (cousins), will also have a closer CA. So how can we have a close CA and a distant CA when they are in the same TG?  When they are all in the same ancestral line!

Let’s start with a distant cousin (Match) and look at the Common Ancestor.

05C Figure 1

Some notes about Figure 1

– With atDNA the path from the CA can go through males (boxes) and/or females (circles) in any order – it does not matter.

– The CA is one of the two parents above – the DNA that passed down from the CA to you and your 7th cousin (7C) came from one person. In this example, I’ve assumed the mother just to illustrate that it is just one parent. In most cases we don’t know which parent the DNA is from.

– The CA has at least two children: one is an ancestor of your 7C Match (M); and one is an ancestor of you (U).

– In this case the CA is also an MRCA (Most Recent Common Ancestor) – you and your Match don’t relate any closer on this line. However, in genetic genealogy, we tend to call this the CA, rather than the MRCA.

– You and your Match (M), will also share all of the Ancestors of the CA.

– This Figure 1 assumes the CA shown is the correct CA – the one who passed down the shared DNA segment to you and your Match.  We don’t really know if this CA is correct, until we find corroborating evidence – read on.

So how do we confirm that this CA line (either the mother or the father) is the one who passed down the segment you and the Match (M) share (as opposed to some other ancestral line)? One way is by Triangulation. When several people share the same segment, and all have paper trails to the same CA, we assume this CA must be correct. Another method is by “walking the ancestry back”. That is through closer cousins who also share this segment (in a Triangulated Group). We are generally pretty comfortable (when a close cousin shares a lot of DNA with us) that the closer CA is correct. In other words, when a 2nd cousin (2C) shares 220cM with us, and has large individual shared segments with us, we assume the CA is the known Great grandparent. And with large segments this is almost always true. Then if a known 4C shares a good sized segment with us, we also assume the known 3G grandparent is the CA. If all of these occur in the same TG, we need to call the intermediate CAs, MRCAs (Most Recent Common Ancestors) to distinguish them from the CA of the TG. Let’s see how this looks in Figure 2.

05C Figure 2

Some notes on Figure 2:

– The Tree for your Match (7C) and you is the same as in Figure 1.

– A matching 2C on the same segment will have an MRCA with you on the G grandparent.

– Matches with 4C and 6C are also shown with MRCAs on the ancestral line from you to the CA.

– Everyone in Figure 2 descends from the CA.

This scenario, with intermediate MRCAs, adds a lot of confidence to the CA being the Ancestor who passed down the DNA that all of you (Match M, you, 2C, 4C and 6C) share.

Note that the intermediate MRCAs could have just as easily been on the Match’s line. And/or you and the Match may both have intermediate MRCAs. The key point is that the MRCAs are in the ancestral line to the CA.

This concept applies equally to TGs. Each TG really represents a segment from an Ancestor to you. A “tight” TG – one with significantly overlapping segments among all the Matches in the TG – will have a CA just like a shared segment does. And all the Matches in the TG will share that CA. A “wide” TG – with “cascading” segments such that one at the beginning of the TG doesn’t overlap one at the end of the TG – may well turn out to be two TGs, with two CAs… more on that in a different post.
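
One way to picture the difference between a “tight” and a “wide” TG is to ask whether there is a stretch that every shared segment in the group covers. The sketch below is only an illustration: the (start, end) positions are invented, the function names are hypothetical, and the `min_overlap` cutoff stands in for “significantly overlapping”, which the post does not quantify.

```python
def common_overlap(segments):
    """Length of the region covered by ALL segments in the group
    (0.0 if there is no such region)."""
    latest_start = max(start for start, _ in segments)
    earliest_end = min(end for _, end in segments)
    return max(0.0, earliest_end - latest_start)


def is_tight(segments, min_overlap=0.0):
    """True when every segment overlaps every other by more than
    `min_overlap` (an assumed cutoff, not a value from the post)."""
    return common_overlap(segments) > min_overlap


tight_tg = [(10.0, 30.0), (12.0, 28.0), (15.0, 35.0)]
wide_tg = [(10.0, 20.0), (18.0, 30.0), (28.0, 40.0)]   # "cascading" segments

print(is_tight(tight_tg))   # True: all three share the 15-28 stretch
print(is_tight(wide_tg))    # False: the first and last segments never overlap
```

In the cascading example each segment overlaps its neighbor, but the first and last do not overlap at all, which is the situation that may turn out to be two TGs with two CAs.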

So there is always an Ancestor who is the most distant MRCA of your TG, for a given threshold. That means that any Ancestor who is more distant would not show up as a Match (using the given threshold, say 7cM), because the segment from the more distant Ancestor down to you would be too small at that distance to match anyone. For example, you may have gotten only 6cM from a 7G grandparent. In that case you would never get any Matches who were cousins on that 7G grandparent, using a 7cM match threshold. Others may get large enough segments from that same 7G grandparent, and maybe get some Matches, but you would not.

It appears this most distant MRCA of a TG may be fairly deep in our Trees in many cases. As a result we are having a hard time finding them. Our best tactic then is testing close cousins, and finding intermediate cousins among all of our Matches. This means testing at all companies and uploading to GEDmatch to get the most Matches you can. We never know when the key intermediate Match will show up – they won’t always have significantly larger segments. And lowering the threshold at GEDmatch, in general, will only result in even more distant cousins and more distant CAs.


A shared segment is from a Common Ancestor (CA) with a Match (cousin).

Closer cousins would have MRCAs with you who are descendants of the CA. Your Matches may also have closer cousins with MRCAs who also descend from the CA.

These “intermediate” MRCAs increase the probability that the CA passed down the shared segments.

We still do not know if the CA is the mother or the father, but we can be very confident that this is the correct ancestral line, and not some different or alternative ancestral line.



05C Segment-ology: CA and MRCA by Jim Bartlett 20160101

Endogamy PART I

Endogamy PART I – Shared DNA

This blogpost looks at the amount of shared DNA from endogamy. It does not address the genealogy of endogamy, but instead establishes some terminology and reference material.

First let’s define endogamy: the custom of marrying within the limits of a local community, clan or tribe [Oxford Dictionaries online].

This means cousins marry each other; and those two cousins have at least one ancestor who is the same. In other words an ancestor is in our tree more than once. The same individual occupies two (or more) blocks (or positions) in our tree, and their respective descendants (cousins) marry each other.

Classic examples of endogamous populations include Ashkenazi Jews and Low German Mennonites. In genealogy, endogamy is also used to describe multiple cousin marriages in a limited population area, such as those found in various areas of Colonial America [cf. ISOGG wiki].

Let’s take a more in-depth look at how DNA is passed down, how much DNA is shared between cousins, and examine the impact of endogamy. How does endogamy affect the total amount of DNA shared between cousins and the size of the shared segments?

Ground Rules

 Use average cMs. DNA is very random and there is a wide range of possible values of segment cMs passed down from ancestors, as well as the amount of shared cMs between cousins. For this article, I will consistently use the calculated average values. In practice we see values above and below these average values, but with large data they should average out to the calculated values. By using the average cMs we should all come to the same results.

Use 7040cM as the total cMs in one person. Each company tracks the cMs a little differently. I picked this value because it’s roughly right*, it divides easily, and it complements my notional Segment Size Chart here. We want to stay focused on the big picture and keep things in good perspective, rather than get into a debate about which company has the best total. I’ll use 7040 as the “base”, and also show the percentage that is passed down and shared. You can use a different base if you want. It’s the relative values we are after here, so it really doesn’t make much difference which base you use. The takeaway should be a general understanding of the effects of endogamy.

Use A to designate an ancestor who is in a pedigree more than once. A1 and A2 would be the same individual (A) in two different positions in a pedigree.

Use one Ancestor (A). We usually note a couple as the Common Ancestor because we don’t know which one passed the shared DNA segment down to you and your Match. But only one Ancestor of this couple had that DNA, and I use only one Ancestor in this analysis.

 Base Chart [E1]

For this discussion we will use average values, and each descendant will get exactly half of their parent’s DNA. Also the shared amount decreases by a factor of 4 with each generation. This gives us the following Base Chart:

07D Fig 1

Explanation of Figure 1:

Values under You and Match are in cM. 4C means 4th Cousin; and 4C1R means 4th Cousin once removed. This will be similar in other figures.

Column 1 shows a Common Ancestor (A) at the top of the chart (with a total of 7040cM of DNA). The list of descendants is noted by Gen 1, Gen 2, etc. Note with atDNA, the descendants could be male or female.

Column 2 shows the total amount of DNA passed down from the Common Ancestor (A) to the descendants in each Gen. For the purposes of this article, I used one half of the ancestor’s DNA in each succeeding descendant. Usually this column represents you.

Column 3 shows the relationship between the descendants on your line vs. the descendants of a Match’s line in Column 4.

Column 4 shows the total amount of DNA passed down from the same Common Ancestor (A) to the descendants in each Gen. Again, I used one half of that ancestor’s DNA in each succeeding descendant. Usually this column represents your Match.

Column 5 shows the total amount of DNA that would be shared between you and your Match at each generation. Note that the amount decreases by a factor of 4 in each generation. [Sidenote: In the case of a half cousin, the amount of shared DNA is halved. Example 4C = 13.75cM shared; 4C1R = 6.875cM shared; 5C = 3.438cM shared.] Note that in Gen 6 (5C level) the share is 3.44cM, which is well below a matching threshold of 7cM. Clearly the average 5C would not show up as a Match. However, we know we have many 5C Matches above 7cM, so those Matches which are reported are well into the upper “tail” of the 5C distribution curve – see cM notional distribution curves here.

Column 6 shows these shared cMs as a percentage of the base [7040cM]

Column 7 is a little trick – it shows years inversely spaced at 30 year intervals, starting with a genealogist born about 1950. This allows you to either 1) look at a year of interest to you and see the probable cousins you’d have with ancestors of that time period, or 2) look at the cousinship of a Match and see approximately when the Common Ancestor lived. Of course it’s a very rough approximation, AND you should feel free to use different years that roughly work with your pedigree. This one works pretty well for me…

Column 8 is another little trick – it shows the number of ancestors you would have at each Gen going back – another inversion list. For example: if you and your Match are 8C, you would each have 512 ancestors at your Common Ancestor level. In other words the CA is 1 of 512 ancestors. It’s a handy lookup feature of Figure 1.

Endogamy factor – I have noted this chart as Endogamy 1 [E1], meaning both you and your Match only have the CA in your ancestry once. More on this later.
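
Since Figure 1 is not reproduced here, a short sketch can regenerate the main columns of the Base Chart from the rules in the text: a 7040cM base, each descendant getting half of the parent's DNA, and the shared amount dropping by a factor of 4 per generation. The function names are hypothetical, and the formulas are fitted to the values quoted in the text (220cM at Gen 5, 13.75cM for 4C, 6.875cM for 4C1R, 3.438cM for 5C, 512 ancestors for 8C) rather than copied from the chart itself.

```python
BASE_CM = 7040  # the "base" total cM in one person, from the Ground Rules


def dna_from_ancestor(gen):
    """cM from Common Ancestor A carried by a descendant `gen` generations down."""
    return BASE_CM / 2 ** gen


def shared_cm(cousin_degree, removed=0):
    """Average cM shared by nth cousins with an E1 relationship;
    each 'removed' halves the amount."""
    return BASE_CM / 2 ** (2 * cousin_degree + 1) / 2 ** removed


def ancestors_at_ca_level(cousin_degree):
    """Number of ancestors each cousin has at the Common Ancestor's generation."""
    return 2 ** (cousin_degree + 1)


print("Gen  cousin  DNA from A  shared cM  shared %  ancestors")
for degree in range(1, 9):
    gen = degree + 1  # an nth cousin sits at Gen n+1 below the CA
    cm = shared_cm(degree)
    print(f"{gen:>3}  {str(degree) + 'C':>6}  {dna_from_ancestor(gen):>10.2f}  "
          f"{cm:>9.3f}  {100 * cm / BASE_CM:>8.4f}  {ancestors_at_ca_level(degree):>9}")
```

Swapping in a different base only rescales the cM columns; the percentages and the factor-of-4 pattern stay the same.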

Modified Base Chart (Cousin Ancestors) [E2]

Now let’s modify the Base Chart and show you having two of the same Common Ancestor (A1 and A2) whose Great grandchildren married each other.

07D Fig 2

Explanation of Figure 2.

Columns 1-4 are similar to those columns in Figure 1 with three important differences: (1) they are both on your side, (2) the two 2C at Gen 3 marry each other, and (3) in Gen 4 the 440cM which was passed down from Gen 3 for each of A1 and A2 are shown, as well as that amount being combined into a total of 880cM for the single descendant (child) in Gen 4. In succeeding generations the DNA is halved at each generation.

Column 5 shows the net (combined) amount of DNA from A (A1 + A2) for the descendants of the Gen 3 marriage, starting in Gen 4. The net DNA is now twice as much as it was in Column 2 for Gen 4 in Figure 1.

Columns 6-7 are the same as Columns 3-4 in Figure 1.

Columns 8-9 have twice the values at each Gen compared to Figure 1. The shared DNA is now twice as much (by total and percentage).

Endogamy factor – With 2 identical Common Ancestors in your Tree, we have E2.

Important Note: When the DNA is passed from the Gen 3 parents (A1 and A2) to the Gen 4 child, the Gen 4 child gets the total DNA from A1 in various segments on one set of chromosomes (say the paternal side), and the DNA from A2 on the other set of chromosomes (the maternal side). There is no mix at this point. The various segments are subdivided, or not, and passed down normally. In the next generation, the Gen 4 child will recombine both chromosomes and pass the DNA to the Gen 5 child. There is a small probability that some segments from ancestors A1 and A2 may be exactly the same, but they would be on opposing chromosomes in Gen 4 and only one segment area could be passed on to Gen 5 child. There is a very small probability that separate, but adjacent, segments from A1 and A2 (on opposing chromosomes) could wind up adjacent again in Gen 5 child, and be “stitched together” to form a larger segment in Gen 5 from ancestor A than there was in Gen 4. Note that this very small probability can only happen in this one generation (the generation of a child with cousin parents passing DNA to his/her child; in this case Gen 4 to Gen5). In succeeding generations, all the segments for ancestor A are on one side, and can only be subdivided.

Key Findings

Total DNA – As it turns out, no matter where in your ancestry the cousins marry each other, their descendants will have twice the DNA from the Common Ancestor. It doesn’t matter if first cousins or fifth cousins marry, their descendants will carry twice the total Common Ancestor’s DNA (on average). And it doesn’t matter if cousins married recently or 6 generations back, their descendants will carry twice the Common Ancestor’s DNA. This simplifies the analysis a lot!

Shared DNA – The amount of shared DNA will double (with this E2 scenario). An E1 5C = 3.438cM (see Fig 1); an E2 5C = 6.875cM (see Fig 2).

Net effect – With E2 the shared DNA is equivalent to an additional “once removed” in the cousinship. With E2, a true 5C Match (normally sharing 3.438cM with E1) would look like a 4C1R (6.875cM).

Segment Sizes – Although, on average, the total DNA will be doubled, the various segments will not be larger, in general. For sure, the segment sizes are not doubled!
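
The doubling in these findings is simple arithmetic. Here is a minimal sketch in Python, assuming the average values quoted from Figure 1 (13.75cM for a 4C, and 3.4375cM for a 5C, which rounds to the 3.438cM used above). The names E1_SHARED_CM and shared_cm are only for illustration.

```python
# Average shared cM with no endogamy (E1), as quoted from Figure 1.
# 3.4375 is the unrounded form of the 3.438cM cited for a 5C.
E1_SHARED_CM = {"4C": 13.75, "5C": 3.4375}


def shared_cm(relationship, endogamy_factor=1):
    """Average shared cM for a relationship, scaled by the Endogamy factor."""
    return E1_SHARED_CM[relationship] * endogamy_factor


print(shared_cm("5C"))     # 3.4375 (E1 5C)
print(shared_cm("5C", 2))  # 6.875  (E2 5C, the same total as a 4C1R)
```

This scales only the total shared DNA. It says nothing about segment sizes, which are not doubled.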

Modified Base Chart (3 Identical Common Ancestors) [E3]

Suppose you have three identical Common Ancestors (A1, A2 and A3) in your Tree. Usually this means two cousin marriages involving the same ancestor.

07D Fig 3

Explanation of Figure 3.

The columns are similar in function to that of Figure 2.

In Gen 3 two 2nd cousins, the highlighted descendants of A1 and A2, marry. Then in Gen 4, a child from this marriage marries a descendant of A3, also highlighted.

Columns 2, 4 and 5 show the “half-amount” of DNA from ancestors A1, A2 and A3 that continues to add up in each generation (see Column 6). Note this is always the sum of respective portions from A1, A2 and A3, AND in Column 6 the net amount is halved in each succeeding generation.

Columns 9 and 10 show three times the total shared cM and total percent shared.

Endogamy factor – With 3 identical CAs in your Tree we have E3.

Modified Base Chart (2 Identical CAs plus 2 Identical CAs) [E4]

Let’s try an example with cousins in your Tree and cousins in your Match’s Tree. The process should be familiar now.

07D Fig 4

Explanation of Figure 4.

See previous Figures for explanations of the Columns.

As before, in Gen 3 two 2nd cousins in your Tree marry, and all succeeding total DNA is doubled.

In Gen 4 two 3rd cousins in your Match’s Tree marry, and all succeeding total DNA is doubled.

To get the shared DNA at Gen 5 we take the A1 DNA (220cM) compared to A3 DNA (220cM), and from Figure 1 we know this is 13.75cM. We then compare A1 to A4 and get 13.75cM, and the same for A2 to A3 and A2 to A4. So we have a total of 4 times 13.75cM or 55.0cM total shared. Here we have E2 on your side and E2 on your Match’s side.

Endogamy factor – E2 x E2 is E4.

Modified Base Chart (3 Identical CAs plus 2 Identical CAs) [E6]

So you might ask in the previous chart, do we add (E2 + E2 = E4) or multiply (E2 x E2 = E4)? Let’s resolve this in the following figure.

07D Fig 5

Explanation of Figure 5.

This is the reason why I continue to separately show the total contribution of DNA from each of the Ancestors (A1, A2, A3, A4, and A5 in this case). I don’t know how to compare 660cM and 440cM in Gen 5 to get the shared cM. But comparing these 5 ancestors in separate pairs means we can use shared values we already know from Figure 1. In this case, compare at 220cM for A1-A4, A1-A5, A2-A4, A2-A5, A3-A4 and A3-A5 – a total of 6 sharing comparisons. So we use E3 x E2 = E6.

Endogamy factor is E6; and we can multiply the 220cM-220cM share (13.75cM from Figure 1) by 6. Or 13.75cM x 6 = 82.5cM.
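
The pairwise comparisons in Figures 4 and 5 can be counted mechanically. Here is a minimal sketch in Python, assuming each pairing of one copy of the ancestor on your side with one copy on your Match's side contributes the 13.75cM average from Figure 1. The function name shared_from_pairs is hypothetical.

```python
from itertools import product

# Average cM shared between two Gen 5 descendants who each carry 220cM
# from one copy of the ancestor (the Figure 1 value used in the text).
CM_PER_PAIR = 13.75


def shared_from_pairs(your_copies, match_copies):
    """Count every your-copy/match-copy pairing and total the shared cM."""
    pairs = list(product(your_copies, match_copies))
    return len(pairs), len(pairs) * CM_PER_PAIR


# Figure 4: two copies on your side, two on your Match's side.
print(shared_from_pairs(["A1", "A2"], ["A3", "A4"]))        # (4, 55.0)
# Figure 5: three copies on your side, two on your Match's side.
print(shared_from_pairs(["A1", "A2", "A3"], ["A4", "A5"]))  # (6, 82.5)
```

Adding the factors for Figure 5 would give 5, but there are 6 pairings, which is why the Endogamy factors multiply.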

Common Ancestor is in each Tree only once [E1]

What happens if we have lots of endogamy in our ancestry, but the Common Ancestor with a Match is not repeated in either Tree? Well, we would not have any effects of endogamy. The Endogamy factor would be E1, and we’d use Figure 1. The multiplying effect of endogamy on shared DNA only comes into play when the Common Ancestor between you and a Match is repeated in your Tree or in your Match’s Tree.

Modified Base Chart (Common Ancestor is below Endogamy) [E1]

What happens if you and your Match have a Common Ancestor with lots of endogamy? In other words the Common Ancestor is the descendant of endogamy. The analysis of shared DNA is always done by starting with the Common Ancestor’s total DNA [7040cM, or 100%], and working down from there.

07D Fig 6

Explanation of Figure 6.

You can put as many identical Ancestors as you want in this chart (like A1 and A2 above, or the example in Figure 5). But to determine the shared DNA from a Common Ancestor, you must start with that ancestor – noted as B in Figure 6. In this example, ancestor B is only in your Tree once and your Match’s Tree once, notwithstanding the fact that B has multiple A ancestors. B is a separate, individual ancestor and the shared DNA from this B ancestor must be calculated with B as the base.

Endogamy factor is E1 in this case. There is no change in the amount or percentage of shared DNA with any cousin on Common Ancestor B in this case.

Summary Findings:

Total DNA in descendants of multiple Common Ancestors is multiplied by the number of CAs. It doesn’t matter how distant the marrying cousins are or where they are in your Tree. The number of Common Ancestors in a Tree determines the Endogamy factor – a CA in a Tree three times is E3, for example.

Shared DNA with a Cousin is multiplied by the Endogamy factors of you and your Match.

Endogamy only affects the shared DNA from the Common Ancestor between you and a Match.

  • General endogamy, or “population endogamy”, does not affect the shared DNA calculation, except as it applies to the specific CA.
  • Specific endogamy on Ancestors other than the Common Ancestor does not affect the shared DNA calculation.
  • Endogamy ancestral to the Common Ancestor with a Match does not affect the shared DNA calculation.
  • If you know all 8 of your Great grandparents are different, and/or all 16 of your 2xGreat grandparents are all different, and/or can be sure (say by geography, ethnicity, etc.) that none of your 32 3xGreat grandparents are repeated as your ancestors, then your Endogamy factor would be E1 (use Figure 1) with any Match who is a cousin from one of these ancestors. If you are positive that any other more distant ancestor was in your Tree only once, the Endogamy factor is E1. However, you also need to consider the Endogamy factor of your Match.

Endogamy must be considered for both you and your Match.

  • Use an Endogamy factor, E, for each time the Common Ancestor is in your Tree and/or your Match’s Tree.
  • If the Common Ancestor is in a Tree only once the Endogamy factor is E1; twice E2; three times E3, etc.
  • Multiply to combine Endogamy factors from you and your Match. Examples: E1 x E1 = E1 (no endogamy); E4 x E2 = E8, and the total amount of shared DNA in Figure 1 for that Gen is multiplied by 8. An E8 5C would share 8 x 3.438 = 27.5cM, which would look like a 3C1R.
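
The rules in this list can be sketched in a few lines of Python. This is only an illustration: the ancestor lists below are made up, with one entry per ancestor slot in a Tree, and a repeated name meaning the same person fills more than one slot. The function names are hypothetical.

```python
# Hypothetical ancestor slots for each Tree. A repeated name means that
# person appears in the Tree more than once (through a cousin marriage).
your_slots = ["CA", "CA", "X1", "X2"]   # CA appears twice: E2
match_slots = ["CA", "Y1", "Y2", "Y3"]  # CA appears once: E1


def endogamy_factor(slots, common_ancestor):
    """Number of times the Common Ancestor appears in one Tree."""
    return slots.count(common_ancestor)


def combined_factor(yours, matches, common_ancestor):
    """Multiply the two per-Tree factors, as in E4 x E2 = E8."""
    return endogamy_factor(yours, common_ancestor) * endogamy_factor(
        matches, common_ancestor
    )


print(combined_factor(your_slots, match_slots, "CA"))  # 2
```

Other repeated ancestors in either list (endogamy not involving the Common Ancestor) would not change the result, which matches the point that only repeats of the CA matter.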

Perceived effect of endogamy is the equivalent of one additional “once-removed” for each doubling of the Endogamy factor. So a true 4C (usually sharing 13.75cM) would share 27.5cM with E2 and look like a 3C1R, or 55cM with E4 and look like a 3C. Referring to Figure 1 at the 4C level, we have 32 ancestors, and so does our Match. So to reach E4, both you and the Match would need to have the 3xGreat grandparent (CA) in your respective Trees twice, for example.
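
The “looks like” examples amount to walking up a ladder of relationships, one rung for each doubling of the Endogamy factor. Here is a rough sketch in Python, assuming the ladder of averages quoted in this article (3C = 55cM, 3C1R = 27.5cM, 4C = 13.75cM, 4C1R = 6.875cM, 5C = 3.438cM). The names LADDER and apparent_relationship are hypothetical, and the sketch only handles factors that are powers of 2.

```python
# Relationships ordered so that each step to the left doubles the average
# shared cM (3C = 55, 3C1R = 27.5, 4C = 13.75, 4C1R = 6.875, 5C = 3.4375).
LADDER = ["3C", "3C1R", "4C", "4C1R", "5C"]


def apparent_relationship(true_relationship, endogamy_factor):
    """Relationship a Match would look like, for factors that are powers of 2."""
    if endogamy_factor < 1 or endogamy_factor & (endogamy_factor - 1):
        raise ValueError("this sketch only handles E1, E2, E4, E8, ...")
    steps = endogamy_factor.bit_length() - 1  # number of doublings
    index = LADDER.index(true_relationship) - steps
    if index < 0:
        raise ValueError("result is closer than the ladder covers")
    return LADDER[index]


print(apparent_relationship("4C", 2))  # 3C1R
print(apparent_relationship("4C", 4))  # 3C
print(apparent_relationship("5C", 8))  # 3C1R
```

Factors such as E3 or E6 fall between two rungs, so the shared total would sit between the averages for two adjacent relationships.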

If all or much of your ancestry is in one “pool” of endogamy, the opportunity for large Endogamy factors is great. If various branches of your ancestry come from very different geographic areas or ethnicities, etc., the Endogamy factors will be smaller. You might want to examine various parts of your ancestry to see where endogamy might play a role. Endogamy means more shared DNA, which will also mean more Matches.

The size of shared DNA segments is not, generally, changed by endogamy. Certainly, endogamy does not double the size of shared segments.

Summary Thoughts

This has been an interesting drill for me (I’m sorry for all the tables and numbers).

This article is based on the calculated averages – “your results may vary”. I am certain that many of our Matches are in the 6th to 8th cousin range, and our shared DNA is based on both endogamy and the long “tails” on the cM distribution curves.

I hope this blogpost will help facilitate further discussion of endogamy in genetic genealogy.


07D Segment-ology: Endogamy I – Shared DNA by Jim Bartlett 20151202

* The atDNA totals are 6769cM at FTDNA; 7174cM at GEDmatch and 7075cM at 23andMe; and ISOGG uses 6800cM. Other sources have different totals.