Crossover and Segment Formation

A Segment-ology TIDBIT…

Crossovers and segments are formed by random DNA biology.  They are formed at conception in each of our ancestors and in ourselves. They are at fixed, permanent locations in each of us. They are not affected by family size, geography, wealth, status, intelligence, etc. For each of us, they are a fixed structure of our chromosomes – like a picture or jigsaw puzzle – which is different for each person.


22A Segment-ology: Crossover and Segment Formation TIDBIT by Jim Bartlett 20170101

A Targeted Process at AncestryDNA

AncestryDNA does not provide segment info. This is a problem for Segmentologists who want to Triangulate – like me. Triangulation has worked well in grouping my many thousands of segments into specific groups representing specific ancestral lines. These Triangulated Groups (TGs) now cover 94% of my 45 chromosomes, and identify specific maternal and paternal segments of DNA from my ancestors. Over 75% of my 45 chromosomes have some identified Common Ancestor(s) beyond my parents. Are they all correct? Of course not. But many are linked to first, second, third and fourth cousins, which are walking back the Common Ancestors of those segments. These are reinforcing and validating more distant Common Ancestors shared by fifth through ninth cousins.

I have over 400 Hints at AncestryDNA, and very few of them can be linked to GEDmatch kits. It’s frustrating to know so many Triangulated Groups on one hand; and so many Common Ancestors at AncestryDNA on the other hand; and not be able to merge this information. Knowing the segment data for each of the AncestryDNA Hints would significantly expand my chromosome map. Grrrrrrr:-(

So I’ve tried a new process at AncestryDNA, and am having some amount of success with it – I thought I’d share it in this blog.

I selected a surname – HIGGINBOTHAM – on my mother’s side. This surname is in my Tree 6, 7, 8 and 9 generations back. The patriarch (9 generations back) is thought to be John HIGGINBOTHAM, but there is some controversy about his given name. However, there is general agreement on several of his children, and I have several AncestryDNA Hints at the 6C, 7C and 8C levels going back on this HIGGINBOTHAM line.

So I searched on this surname at my AncestryDNA Results page and got over 100 Matches. I looked at each one and selected 46 which were Hints or had strong Colonial Virginia ties to Amherst Co, VA (where my HIGGINBOTHAMs were for 3 generations) or to nearby counties.

Note that at 9 generations back we have 256 ancestors on our mother’s side. My chromosome mapping currently shows just over 200 TGs on my mother’s side. So I should reasonably expect about one, maybe two or three, of my TGs to be from my HIGGINBOTHAM line. Maybe there are a few Matches at the 6C level which actually then branch off on the HIGGINBOTHAM’s wife’s ancestry.
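The arithmetic behind that expectation can be sketched in a few lines of Python. The function name is hypothetical, and spreading the TGs evenly across the ancestors of one generation is a simplifying assumption (real inheritance is uneven, and some distant ancestors contribute no DNA at all).

```python
def ancestors_on_one_side(generations_back: int) -> int:
    """Ancestors on one parent's side at a given generation.

    Total ancestors double each generation (2**n); half are on each side.
    """
    return 2 ** generations_back // 2


maternal_ancestors = ancestors_on_one_side(9)  # 9 generations back
maternal_tgs = 200                             # "just over 200" TGs, rounded down

# Assumption: TGs are spread evenly across the ancestors of that generation.
tgs_per_ancestor = maternal_tgs / maternal_ancestors

print(maternal_ancestors)
print(round(tgs_per_ancestor, 2))
```

That comes out a little under one TG per ancestor at 9 generations back, which fits the expectation of about one TG, maybe two or three, for a single line.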

I was surprised at the number of HIGGINBOTHAM cousins I had – particularly since I was pretty sure they shouldn’t be spread out over many different TGs. And, on the other hand, I don’t believe that many cousins should all pile up on the same ancestor – unless, of course, some of them were related to each other more closely (some Matches were from the same Admin), or they only descended from two or three of the Patriarch’s children at each TG.

I needed to find out the actual segment data each match shared with me….

So I drafted a standard message:

You and I have a DNA match at AncestryDNA, and have a common HIGGINBOTHAM line. I have 46 such Matches at AncestryDNA! I am mapping my DNA (linking Ancestors to specific DNA segments), and would very much like to determine the segment for each of these lines. Most of us are 8th cousins, and I did not expect to get HIGGINBOTHAM DNA from more than one or two different segments. To sort this out, I am requesting that you upload your raw DNA file to GEDmatch. It’s a free site – easy to register – with complete instructions on their home page. My GEDmatch ID is Mxxxxxx. You will get many more Matches at GEDmatch, with emails. No medical or health info. Please let me know your GEDmatch kit number if you upload at GEDmatch.

I will provide you feedback on what I find, including other cousins who share the same segments. This will help all of us.

FYI, I’m having success with atDNA and would be happy to help you. See my How to Succeed list at: It has some good links at the end. My  DNA blog is at – written for genealogists in plain language.

Hope to hear from you. Jim Bartlett

end of message

Note – do not send such a standard message to 46 Matches at one sitting. Ancestry looks for “spam-like” messages, and will block your messaging ability (you have to phone in and talk them out of it, to get reinstated). I sent 5 or 6 a night for a week or so.

Well… this worked out much better than previous attempts to randomly beg Matches to upload to GEDmatch. I now have 23 GEDmatch kit numbers out of this group. And several others are still working on it. The results have been in several categories:

  1. At GEDmatch I can also compare with my deceased father’s kit uploaded from FTDNA. It turns out several of my HIGGINBOTHAM cousins share a DNA segment with both me and my father. So although we may well be genealogy cousins on HIGGINBOTHAM, the shared DNA that formed our match at AncestryDNA is from my father’s side – not from a HIGGINBOTHAM line.
  2. Several TGs are getting most of the shared segments. TG [04P36] has six Matches on two children of the Patriarch (some of them are 5C or 6C to each other from one child of the Patriarch). TG [10I36] has 3 Matches; and TG [04B36] has two Matches. [NB my TG naming system starts with the Chr (04 and 10); the letter roughly represents the 10Mbp block where the segment starts; and 3-6 are Ahnentafel for my mother’s father’s side – where my HIGGINBOTHAM ancestry is]
  3. The rest of the Matches are spread out over different TGs – most of these TGs include Matches with other Common Ancestors. So it is entirely possible, indeed probable, that most of these TGs will come from different Common Ancestors. As time allows, I will investigate the several surnames from each of these TGs to see if there is consensus among the Matches.

This process would probably be too overwhelming for common surnames like JONES, SMITH, JOHNSON, etc. And your own surname might not be very helpful in determining a few TGs – your father’s or mother’s surname could be sprinkled all over your chromosomes – so it would be harder to form groups. Since we probably have the majority of our Matches in the sixth to eighth cousin range, I’m thinking that would be a good place to select surnames.

The main point here is that by using a more personal message, I’m getting more cooperation from my AncestryDNA Matches. By selecting a surname, and doing some homework to make sure we have the same Patriarch, the message is targeted. By promising to provide feedback to each Match who uploads to GEDmatch, I’m helping my Matches. The Matches don’t have to understand Triangulation or Segmentology – all they need to do is upload to GEDmatch. It seems to be working…

This process also works for testing educated guesses on new Surnames. It takes advantage of the more than 23,000 Matches I now have at AncestryDNA. I can search for a particular SURNAME and see if it pops up. Out of 23,000 Matches, each of my Ancestral surnames should be shared by some of my AncestryDNA Matches.

For example, I’ve looked for years for the maiden name of the wife of my ancestor Thomas BARTLETT c1705-1783 of Richmond Co, VA. At one point he owned a piece of land between two EIDSON brothers, so I thought perhaps his wife’s maiden name was EIDSON (and I’ve collected a lot of EIDSON records in Richmond Co, VA trying to find the connection – to no avail). So I searched my AncestryDNA Results for the Surname: EIDSON. I only got one EIDSON hit, and that clearly was not a link to Richmond Co, VA in the early 1700s. This means to me that that surname is probably unlikely as my ancestor. I’ll try some other surnames from my FAN list for Thomas BARTLETT. And in a year or two, when I have twice as many Matches at AncestryDNA, I may try EIDSON again. And revisit some of the other surnames…

For me, this targeted approach is turning out to be a good way to get uploads to GEDmatch and to find Triangulated Groups with several Matches who share the same ancestry with me.


15G Segment-ology: A Targeted Process at AncestryDNA by Jim Bartlett 20161020

The Attributes of a TG

This article will describe the various attributes of a Triangulated Group (TG). Some have noted that I use the term TG to describe both a Group of Matches as well as an ancestral segment. Well… yes, I do. Read on.

Once established, each TG has certain attributes which can be used to describe and/or define the TG:

A1. A TG is a group of shared segments from Matches. We often think about the Matches in a TG. They have a Common Ancestor. They can be contacted and encouraged to collaborate on finding the common ancestry. So in this sense a TG is a group of Matches. However, note that any of these Matches could, potentially, also share the same or a different ancestor with you on another segment (in a different TG).

A2. A TG occupies a specific physical space on a chromosome. It is, in effect, a segment in its own right – a segment from one of your ancestors. The TG is on one chromosome with a start location and an end location. These start and end locations are determined by the matching, overlapping, shared segments (from Matches) within the TG. Please review: Anatomy of a TG.

A3. As a segment, a TG has a specific string of SNPs on one chromosome. These would be the same SNPs in a segment on your chromosome, and on the segment each one of your Matches shares with you in the TG. All the SNPs would be the same. The SNPs have to be the same for IBD segments to match.

A4. A TG is the equivalent of phased data. The TG represents an ancestral segment on a chromosome. All of the SNP values (alleles) are on one chromosome, and are the same SNP values you got from your mother or father (depending on the chromosome). What you have in a TG (segment) is exactly what you would have with phased data.

  1. We don’t see the actual ACGT values in a TG that we would get with true phasing (with a child-parents trio), but they are the same values in the TG. The TG segment represents part of one of your chromosomes – it must have the same ACGT values that your parent passed to you on that chromosome.
  2. We can treat the TG as phased data, and any other shared IBD segment which Triangulates with the TG will have the same SNP values.
  3. This is true, even if you have formed a TG (with matching, overlaying, shared segments from Matches) and do not have the genealogy to determine which side it is on. You can be confident that the TG exactly matches the DNA on one of your two chromosomes. In this case the TG is not entirely equivalent to a true phased segment. But, if you had the phased information, you’d already know which side the TG was on. And very often you can determine the side of a TG by imputation – by determining which side it’s not on; or by the admixture of the segment.

A5. Technically, each TG has a cM value. However, it usually takes a lookup table to determine the cM value for a segment on a given chromosome, between two points. This is what the testing companies and GEDmatch do for each shared segment they report. It’s a lot of work for genetic genealogists – and, in general, our TGs will morph over time as new shared segments are added, and the TG cMs will need to be adjusted. However, we can fairly easily make rough estimates of the TG cMs, which are plenty good enough for genealogy:

  1. Subtract the TG start location from the end location to get the number of base pairs (bps), divide by 1,000,000 to get Mbp, which is roughly equal to the number of cMs.
  2. Or, eyeball the cMs of the larger shared segments in a TG and extrapolate to the full TG (if you’re lucky, you may have a shared segment which nearly fills the TG)
  3. Note that the cM is a fuzzy value anyway – it’s empirically derived (an average of many observations), and it’s an average of the female and male averages. So don’t go to too much trouble, and use round numbers. AND note that there is a wide range of possibilities when trying to use cMs to determine approximate cousinships. See Blaine Bettinger’s chart for the ranges of cMs vs cousinships.
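The first rough estimate above (end minus start, in Mbp) can be sketched as a tiny Python helper. The function name and the example locations are hypothetical, and the 1 Mbp ≈ 1 cM rule of thumb is the only thing it relies on; it does not use a real lookup table.

```python
def estimate_tg_cm(start_bp: int, end_bp: int) -> int:
    """Rough cM estimate for a TG: its length in Mbp as a round number.

    Uses the rule of thumb that 1 Mbp is roughly 1 cM. A true cM value
    needs a lookup table (genetic map), which this sketch does not use.
    """
    if end_bp < start_bp:
        raise ValueError("end_bp must not be smaller than start_bp")
    mbp = (end_bp - start_bp) / 1_000_000
    return round(mbp)


# Hypothetical TG running from 27.3 Mbp to 52.1 Mbp.
print(estimate_tg_cm(27_300_000, 52_100_000))
```

The result is deliberately a round number, in keeping with the point that the cM is a fuzzy value anyway.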

A6. TGs have fuzzy ends. There is no “signpost” in our DNA to identify crossover points, or where a shared segment starts or ends. The company algorithms estimate shared segments by looking for areas of DNA that are identical, and they then continue until the DNA is not identical. This results in some, usually small, amount of overrun (a longer segment than is actually there from a Common Ancestor). So my convention is to use the start location of the first shared segment in a TG (the one with the lowest start in bp). This is the start location of the TG. The end location is determined several ways:

  1. If there is no overlap with the next TG on that chromosome, then use the largest end location of all the shared segments in the TG – the shared segment which runs the farthest. This often is not the last shared segment in a sorted spreadsheet – you need to look at them all.
  2. If there is a small, fuzzy overlap of 1-2Mbp with the next TG, I use the start location of the next TG, and accept the fact that there is a fuzzy overlap. We don’t need to be real precise for genealogy. Each TG represents a large block of DNA from an Ancestor – the fact that the edges of the block may be fuzzy should not obscure the big picture: the main TG segment came from an Ancestor!
  3. If there is a large overlap with the next obvious TG (almost always from a large shared segment with a close cousin, which probably spans more than one TG), I start a new TG at the obvious point dictated by the next group of shared segments, and use the same point as the end location of the first TG. This involves judgment – there is no hard rule – and the data will usually indicate where to start a new TG. Just accept that close cousins may share large segments which span more than one TG.
  4. If the shared segments in the TG all end a “few” Mbp before the next TG starts, I will just round up, and use the start location of the next TG as the end location. Again, use judgment.
  5. If there appears to be a large enough gap between two known TGs, I create a “dummy” TG to fill the gap. And then I keep looking for some Matches with shared segments to fill that gap. At this date my dummy TGs are about 7% of my DNA.
  6. Using these conventions will result in TGs that are adjacent to each other over all your chromosomes. Even with some “dummy” TGs, this process will organize all of your IBD Match segments into TGs over all of your DNA. When done, this is a happy day! You can then focus on TGs that should link to specific ancestors. And all new Match segments will generally fit easily into existing TGs.
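The end-location conventions above can be sketched in Python roughly as follows. This is only an illustration: the `TG` class, the `assign_ends` function and the 3 Mbp cut-off for a “few” Mbp are all assumptions, and the judgment calls described above (especially where to split around a large close-cousin segment) are not something a simple rule can replace.

```python
from dataclasses import dataclass

# Assumption: a "few" Mbp is taken as 3 Mbp here; adjust to your own judgment.
SMALL_GAP_MBP = 3.0


@dataclass
class TG:
    name: str
    start: float    # lowest start of any shared segment in the TG (Mbp)
    max_end: float  # farthest end of any shared segment in the TG (Mbp)


def assign_ends(tgs: list[TG]) -> list[tuple[str, float, float]]:
    """Return (name, start, end) rows for one chromosome, laid end to end."""
    ordered = sorted(tgs, key=lambda t: t.start)
    rows = []
    for i, tg in enumerate(ordered):
        if i == len(ordered) - 1:
            # Last TG on the chromosome: use the farthest-running segment.
            rows.append((tg.name, tg.start, tg.max_end))
            continue
        next_start = ordered[i + 1].start
        gap = next_start - tg.max_end
        if gap <= SMALL_GAP_MBP:
            # Overlap (negative gap) or a small gap: end at the next TG's start.
            rows.append((tg.name, tg.start, next_start))
        else:
            # Large gap: keep the farthest end and fill the gap with a dummy TG.
            rows.append((tg.name, tg.start, tg.max_end))
            rows.append(("dummy", tg.max_end, next_start))
    return rows


# Hypothetical TGs on one chromosome (locations in Mbp).
example_rows = assign_ends([
    TG("A", 0.0, 21.0),
    TG("B", 20.0, 48.0),
    TG("C", 50.0, 70.0),
    TG("D", 85.0, 110.0),
])
for row in example_rows:
    print(row)
```

In the example, A overlaps B slightly and ends where B starts, B is rounded up across a 2 Mbp gap, and the 15 Mbp gap after C gets a dummy TG.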

A7. As you work this process of forming TGs and assigning them to sides using genealogy, you are creating a chromosome map. As new Matches are posted, their shared segments may adjust the start and/or end locations of the TGs. When you get lucky, you’ll find a new shared segment that fills a “dummy” TG.

A8. Naming TGs. I label each TG (and all the shared segments in it) with a short code – like 07C25. The 07 means Chr 7; C means the TG starts within the third group of 10Mbp – in this case between 20-30Mbp; 25 means this TG is on my father’s (2), mother’s (5) side – using Ahnentafel numbers. If I were starting over I’d use 07.027PM to indicate Chr 7; start 27Mbp; on Paternal, and his Maternal ancestry. Note: before you do any assigning, the label might be 07.027, then add P or M when determined. I usually add this code in the subject line of emails and messages – it just helps me keep organized.
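Both labeling schemes are easy to generate in a spreadsheet or a script. Here is a minimal Python sketch; the function names are hypothetical, and it assumes the start location is already in Mbp.

```python
def old_style_label(chromosome: int, start_mbp: float, ahnentafel: str) -> str:
    """Label like 07C25: Chr, letter for the 10 Mbp block, Ahnentafel digits."""
    block = int(start_mbp // 10)    # 0 for 0-10 Mbp, 1 for 10-20 Mbp, ...
    letter = chr(ord("A") + block)  # A, B, C, ...
    return f"{chromosome:02d}{letter}{ahnentafel}"


def new_style_label(chromosome: int, start_mbp: float, sides: str = "") -> str:
    """Label like 07.027PM: Chr, start in Mbp, then P/M letters once known."""
    return f"{chromosome:02d}.{int(start_mbp):03d}{sides}"


print(old_style_label(7, 27, "25"))
print(new_style_label(7, 27))        # before any side is assigned
print(new_style_label(7, 27, "PM"))  # Paternal, then his Maternal side
```

Zero-padding the chromosome and the start location keeps the labels sorting in chromosome order in a spreadsheet.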

A9. Also note that each TG has many genes within it. Each of your 22,000 or so genes will have a specific location on your DNA. If you become curious about any particular gene, you can look up where it is located (chromosome and location). You will have two of each gene, one on your maternal chromosome and one on your paternal chromosome. You can then look at your chromosome map of TGs and see which maternal TG and which paternal TG it’s in. If you’ve determined a Common Ancestor for those TGs, you’ll know which ancestor passed that gene down to you. You can also add a gene to your spreadsheet, so that it sorts with all the segments and TGs. Examples: Short Sleepers gene BHLHE41 would be 12.026M and 12.026P (very close to LRRK2 (Parkinson’s) at 12.031). Also my Neanderthal segment is 10.130 (I don’t know which side).

Have fun with your TGs!


15B Segment-ology: The Attributes of a TG by Jim Bartlett 210919

Understanding and Using TGs

Each Triangulated Group represents a segment of DNA that comes from a specific ancestor in your ancestry. It comes from that ancestor, down one specific path, to you. Even if you have the same ancestor in your ancestry more than once, a specific TG segment comes from only one instance of that ancestor, down one specific path to you.

Review Ancestral Segments

Let’s review ancestral segments. Please also refer to the posts and figures for Segments: Bottom-Up and Segments: Top-Down. At each generation of ancestors, each one passes down specific segments to you. In other words, your entire DNA is made up of segments from your ancestors of each generation. All 4 of your grandparents passed down various specific segments to you that make up all of the DNA on all of your chromosomes – all of your DNA came, in various segments, from those 4 ancestors. The same statement can be made of all 8 of your Great grandparents – you have different segments from each of your 8 G grandparents, which, in total, provide all of your DNA on every chromosome. Some Great grandparents may not contribute to each and every chromosome, but all of your chromosomes will be a patch-work-looking quilt of segments from your Great grandparents. The same is true for every generation: your DNA is made up from segments from your ancestors in that generation. As you go back 6 or 7 or 8 generations, some of the ancestors of that generation will drop out. That is, you don’t get DNA from every one of them. They are still your ancestors; you just didn’t inherit DNA from them. Review The Porcupine Chart.

Another way to put it is that segments from distant ancestors are combined to make larger segments in closer ancestors. The smaller segments are still there, it’s just that the closer ancestor passed down larger segments which are made up of the smaller segments from their ancestors. Your parent passes complete chromosomes (very large segments) to you, which are made up of smaller segments from his/her ancestors. See the graphics in Top-Down.

Mapping with Triangulated Groups

Now the hard part of chromosome mapping with Triangulated Groups (TGs) is that the TG segments don’t generally create a map for any specific generation. The TG segments could come from ancestors of different generations. It all depends on how your TGs are laid out, and on the shared segments you get with Matches. You will only “see” TG segments that are made up of shared segments from Matches. So some TG segments may be from closer generations than others. But, in general, your TG segments will be adjacent to each other from the start to the end of each chromosome.

Cousinships with Matches in a TG

A TG is made up of Matches with shared segments. Review: The Anatomy of a TG. The shared segments (with Matches) in a TG will generally fall into 3 categories:

  1. Shared segments with Matches who are cousins on the Common Ancestor who created your TG segment. For a TG segment created by a 6G grandparent, this would be 7th cousins. Note: it would be very unlikely for you to have more than three exact 7th cousins (each one from a different child) from a 6G grandparent on one segment (i.e. in a TG). More on this later.
  2. Shared segments with Matches who are closer cousins – they share a Common Ancestor with you who is closer than the 6G grandparent who created your TG segment. Note: it would be very unlikely for you to have more than three exact 4th or 5th or 6th cousins (each one from a different child) from the same xG grandparent on one segment (i.e. in a TG). More on this later. These closer cousins would share a Most Recent Common Ancestor (MRCA) who is in the direct path of descent from the 6G grandparent (who created the TG segment) down to you. Because of this they also share the 6G grandparent ancestor with you. But, in this case, you will both descend from the same child of the 6G grandparent, and that does not count against the “limit” of three 7th cousins from a specific ancestor.
  3. Shared segments with Matches who are more distant cousins – they share a Common Ancestor with you who is ancestral to your 6G grandparent who created your TG segment. This frequently happens when the shared segment is less than the full TG segment from the 6G grandparent (for example). So these Matches could be 8th or 9th or 10th cousins or even more distant. They descend from a smaller “sticky” segment which is passed down to them (along a specific path) and down through your 6G grandparent to you, along the same path as categories 1 and 2 above. So you might be 9th cousins on an 8G grandparent, who is a grandparent of your 6G grandparent. In fact, when your shared segment (in a TG) is in the 7 to 10cM range, the odds are roughly 80% that the Common Ancestor will be more than 10 generations (9th cousins) back. That’s the bad news – most of these smaller shared segments will be at the far reaches, or beyond, our genealogies. The good news is that roughly 20% of the Matches (sharing 7 to 10cM segments) will be closer – spread among 1st to 8th cousins. For shared segments in the 10 to 20cM range, the odds are roughly 60% that the Common Ancestor will be more than 10 generations back – which means 40% of those Matches will be closer cousins. In fact, with 100 shared segments in a TG in the 10-20cM range, you’d probably have two or three 3rd cousins, two or three 4th cousins, two or three 5th cousins, two or three 6th cousins, etc. You can see a graph of these probabilities in this article at ISOGG.

100 Matches on a TG!

So the next issue is: can we have 100 Matching cousins in a TG? The answer is absolutely!  The odds are almost nil that more than 4 children will pass down the same matching segment to cousin descendants. Hence the caution that we almost certainly cannot have more than 4 cousins, from different children of a specific ancestor, passing down matching shared segments to cousins in one TG (i.e. on one segment). Each child inherits a different mix of DNA from the parents – rarely will more than four of them get the same DNA, and much more rarely would their descendants wind up sharing that DNA on the same segment (i.e. in a TG).

However – 100 Matches can easily descend from an ancestor 10 generations back without violating that concept. You and a sibling could share the same segment with two siblings who are 1st cousins from a grandparent. Similarly 4 Matches could be 2nd cousins with no more than two children from each ancestor. And 8 more Matches at 3rd cousin level – up to 512 Matches at 9th cousin level (10 generations back). And you could double that number by adding a once-removed cousin at each level. All without using more than two children at each generation. And this is all within a genealogical timeframe of 10 generations. If you consider that 80% of the Matches sharing 7-10cM will be well beyond the 10 generation level, the potential number of Matches in a TG becomes quite large. Of course this means the Matches are not all exactly at the same cousinship level – they are very likely to be spread over a range.
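The doubling described above is easy to tabulate. This short Python sketch assumes exactly two children at every generation, as in the paragraph; the function name is hypothetical, and the figures are an upper bound on what is possible, not a prediction of what any TG will actually hold.

```python
def matches_at_cousin_level(level: int) -> int:
    """Possible Matches at a given cousin level, using two children
    at each generation: 2 first cousins, 4 second cousins, and so on."""
    return 2 ** level


per_level = {level: matches_at_cousin_level(level) for level in range(1, 10)}
total_through_9c = sum(per_level.values())

for level, count in per_level.items():
    print(f"{level}C: {count}")
print("Total, 1C through 9C:", total_through_9c)
# Adding a once-removed cousin at each level doubles the total.
print("With once-removed cousins:", 2 * total_through_9c)
```

Even under this two-children limit, the possible number of Matches within 10 generations runs well past 100.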

Large numbers of Matches in a TG are not a pile-up!

So clearly a large number of Matches in a TG should not sound an alarm. Nor should it arbitrarily define the TG as a pileup area, or require that all the Matches be discarded. Nonsense! Large TGs, with Matches matching each other on overlapping shared segments, just indicate that many of the Matches will be more distant cousins. Some Matches will be beyond the genealogies of some – it depends on each genealogist and how far back their Tree goes, and the same for their Matches. But, statistically, there will be some closer cousins in the mix of Matches in the TG. We won’t know which are the closer cousins until we share with them.

The ancestor who “created” your TG segment

In the above discussion I note the ancestor who “created” the TG shared segment. This means the most distant ancestor who had that full segment. His or her ancestors did not have that full segment. His or her parent created (at least) that full segment from their parents. And the segment you inherited from that ancestor is all or, probably more likely, a subset of the segment that ancestor passed down, and eventually got to you. We are talking about the unique segment you inherited from that ancestor, and that segment did not exist in any ancestor further back. The TG segment is unique to you and your ancestor.

Your TG/Segment is unique to you

The ancestor who created your TG segment probably passed down various overlapping segments to your Matches. These segments may be larger or smaller than the segment you got. What you “see” in a shared segment is the part your Match got that overlaps the segment you got. So from a Match’s perspective, there is almost certainly a different segment from the Common Ancestor than you got – the Match would have a different TG than you from this same ancestor. Or some of your Matches may get segments from a more distant ancestor. These segments would be smaller and, when combined with other DNA, would result in the DNA segment that your ancestor passed down to you. Thus, those other Matches would be more distant cousins, on a smaller segment than what you got. There are many scenarios that would result in a Match (sharing a segment with you) being a more distant cousin – refer back to category 3 above.

The name of the atDNA game is sharing and collaboration

This accounts for some Match cousinships which can be pretty distant.  I have to emphasize that, although some of the TG Matches will be fairly distant (and often beyond your genealogy), some Matches in each TG are probably within a genealogical timeframe. This brings up two important points when using TGs:

  1. Contact every Match you can – you never know which ones may have a Common Ancestor you can identify.
  2. Encourage Matches within a TG to share Trees among the group. Some Matches in a TG may be closer cousins to each other than they are to you, and they may in fact have a Common Ancestor who is ancestral to the ancestor of your TG. Sharing and collaboration are the name of this game.


15A Segment-ology: Understanding and using TGs by Jim Bartlett 20160917


Anatomy of a TG

This is another blog post that gives you some idea of what to expect with autosomal DNA and your segments. In this post we’ll look at the formation of a TG. We’ll walk through the steps:

  1. Start with overlapping segment data
  2. Simplify the data by rounding
  3. Sort by Chromosome and Start location
  4. Then Triangulate the segments (no genealogy required)
  5. Highlight one of the two resulting TGs
  6. Show this data graphically – like you’d see in a chromosome browser
  7. Overlay the total TG
  8. Then use our imagination and x-ray vision (or GEDmatch) to show what the ancestral segments of the Matches might look like
  9. Do some analysis…


Figure 1. Some overlapping segment data

10B Figure 1

Letters represent Match names – data is taken from my spreadsheet.

Figure 2 – divide the Start/End locations by 1000000

10B Figure 2

It’s much easier to read the Start/End locations in Mbp; and it’s just as accurate for genealogy.

Figure 3 – the data is sorted by Chromosome and Start location

10B Figure 3

This makes it much easier to see overlapping segments.
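The rounding and sorting steps (Figures 2 and 3) can be sketched in Python as follows. The segment rows here are made up for illustration and are not the data in the figures; rounding to one decimal place is also an assumption. Note that the overlap check only flags candidates: segments Triangulate only if the Matches also match each other on that segment.

```python
# Hypothetical shared-segment rows: (match, chromosome, start_bp, end_bp).
segments = [
    ("B", 16, 55_214_887, 78_902_113),
    ("A", 16, 12_480_332, 30_115_904),
    ("C", 16, 14_902_771, 52_660_018),
    ("D", 5, 101_337_450, 119_864_209),
]


def to_mbp(bp: int) -> float:
    """Convert a base-pair location to Mbp, rounded to one decimal place."""
    return round(bp / 1_000_000, 1)


# Step 2: simplify the data by rounding to Mbp.
rounded = [(m, c, to_mbp(s), to_mbp(e)) for m, c, s, e in segments]

# Step 3: sort by Chromosome, then by Start location.
ordered = sorted(rounded, key=lambda row: (row[1], row[2]))


def overlaps(a: tuple, b: tuple) -> bool:
    """True if two rows are on the same chromosome and their ranges overlap."""
    return a[1] == b[1] and a[2] < b[3] and b[2] < a[3]


for row in ordered:
    print(row)
print("A and C overlap:", overlaps(ordered[1], ordered[2]))
print("C and B overlap:", overlaps(ordered[2], ordered[3]))
```

Once the rows are sorted this way, overlapping segments sit next to each other, which is what makes the Triangulation step in Figure 4 manageable.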

Figure 4 – this shows the results of Triangulation into groups 16A and 16B.

10B Figure 4

No genealogy was involved in this process – it’s purely a matter of comparing segments at 23andMe or GEDmatch; or looking for ICW Matches in this list and each ICW list at FTDNA. Again, this is real data from my spreadsheet. Often there is more mixture between the two TGs, but I hope you get the idea.

Figure 5 – Here is only TG 16B data

10B Figure 5

It’s still arranged by Chromosome and Start location.

Figure 6 – Same data and the shared segments displayed graphically

10B Figure 6

This is how you’d see the data in a chromosome browser. Note the top 11 bars will all match each other. The bottom bars will usually all match each other too, and they’ll usually also match the top 8 bars, but maybe R and N will not match at the 7cM level at GEDmatch. Just lower the level to 500 SNPs and 5cM and you’ll find there is enough for a Match. Let’s see what the TG for this data looks like…

Figure 7 – Now the fun begins…

10B Figure 7

Usually the TG is pretty clear cut, but I’ve intentionally selected one with two kinds of ambiguity. In almost all cases the ends of the segments are fuzzy. You can read about Fuzzy Data in my blog post here.

Judgment is needed at this point. I’ve shown the “guaranteed” TG in red, with orange tips where the data looks fuzzy. I want to emphasize that this is NOT a problem for genealogy – the TG (wherever the true crossover points are that define the TG Start and End locations – somewhere in the orange areas) represents an ancestral segment from one of your ancestors. The fuzzy ends are not an issue. Your Matches will share a Common Ancestor with you – and that’s where the focus should be. The crossovers defining the real TG will be somewhere in the fuzzy orange tips.

You can also see that this data indicates a probable more distant crossover point – around say 51Mbp. In this case the top 8 Matches and Match S are probably closer cousins sharing a larger, closer segment with you. At this point you might want to review Crossovers by Generation here. Going back one or more generations we may see the large red TG being subdivided into two smaller ancestral segments – each with its own Common Ancestor, each one of which is ancestral to the Common Ancestor for the red segment. In this case the last 8 Matches (T, J, Q, L, M, I, A and K) will have a different, more distant CA than Matches G and H. The main, red, TG may be from a 5G grandparent, and the smaller, green and purple segments may be from a 6G or 7G grandparent. Actually the purple segment, as an example, may pass intact through several generations, and you could share this same segment with 7C and 8C…

Again, most of your TGs will be tighter and from a single CA, but I wanted to take this opportunity to show what sometimes happens. You can avoid any conflict by watching for this situation and just declaring two TGs in this case – see the green and purple bars. Then it’s like the case where a close cousin spans more than one TG – the close cousin will help you define a larger segment from a closer ancestor, and the close cousin, along with different groups of Matches in different TGs will share more distant Common Ancestors with those TGs, but those more distant CAs will be ancestral to the MRCA the close cousin shares with you.

So now let’s use our imagination a little (or we could actually Triangulate this area from the perspective of some of our Matches). In this next Figure 8, I’ve guessed at what the ancestral segments might be from the Common Ancestor down to the different Matches – as shown in green.

Figure 8 – showing ancestral segments for all Matches

10B Figure 8

In some cases our Matches have somewhat larger ancestral segments than we have, or they might have segments that extend over one end of our ancestral segment, or the other. In all cases the blue represents the overlap between each Match’s ancestral segment and our own ancestral segment. And the data is not exact, so the ends often don’t line up vertically. The long green bar at the bottom of Figure 8 is a segment an ancestor passed down to living people – you got part of it, the red part.
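The overlap idea can be sketched in a few lines of Python. This is only an illustration: the function name and the Mbp positions below are hypothetical, not values read from Figure 8.

```python
def overlap(seg_a, seg_b):
    """Return the (start, end) shared by two segments on the same
    chromosome, or None if they do not overlap."""
    start = max(seg_a[0], seg_b[0])
    end = min(seg_a[1], seg_b[1])
    return (start, end) if start < end else None

# Hypothetical positions in Mbp, for illustration only.
our_segment = (40.0, 60.0)     # our ancestral segment (a red bar)
match_segment = (50.0, 75.0)   # a Match's ancestral segment (a green bar)

shared = overlap(our_segment, match_segment)        # the blue part
no_share = overlap(our_segment, (65.0, 80.0))       # segments that miss each other

print(shared)     # (50.0, 60.0)
print(no_share)   # None
```

The shared (blue) part is never longer than either ancestral segment, which is why a shared segment usually covers only part of what each person got from the Common Ancestor.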

At GEDmatch it’s often fun, and instructive, to compare two Matches to each other. Sometimes they turn out to be parent/child, or siblings. This is the exercise you’ll want to do if you are trying to map an ancestor.

So, again, if you have collected a lot of shared segments from FTDNA, 23andMe and GEDmatch, they have to go somewhere. It’s not hard to compare them to each other and see where they Triangulate. If they are IBD, they have to go on one chromosome or the other. When you do this you’ll find there are natural break points where the crossover points are located (often the precise location is a little fuzzy). Just look at the data above.


05D Segment-ology: Anatomy of a TG by Jim Bartlett 20160204

Crossovers by Generation

There has been some amount of discussion about segment size, triangulation, and the number of cousins who can share a Triangulated Group. The discussion often uses terms like extremely rare, small segments, distant ancestors, etc. without using specific examples. The arguments go from it’s OK to triangulate with close relatives, to it’s virtually impossible with distant relatives – and there is no discussion of any middle ground. The odds do diminish as you go back in ancestry, but there is no artificial dividing line: closer works, distant doesn’t work. There is always a gradation – shades of gray, if you will. Let’s see if we can put boundaries on it.

In my mind, one way to try to see the forest, and the trees, is to really take a look at an average genome (23 chromosomes, 3 billion base pairs), and see what kind of segments we might see at each generational level. Most of us know that we get pretty large segments from our grandparents, and the size drops down with each generation as we work our way back/up our ancestry.  So let’s develop a table and take it back and see what we have.

The average number of crossovers per generation is 34. Yes, the average for males (fathers) is 27, and the average for females (mothers) is 41 (per ). But this difference (with respect to the total number of crossovers in a genome) fades after just a few generations – so we’ll use the average, 34.

Crossover Points in One Generation

Let’s start with a parent and 23 pairs of chromosomes. In passing a genome to a child, this parent adds 34 crossovers, which results in 23+34 = 57 segments. Here is Figure 1 showing 34 crossovers and the 57 segments in one genome:

05D Figure 1

These are generally large segments from the grandparents. On average, these segments will be 3,400 cM divided by 57 segments or about 60cM per segment. But clearly some are larger and some are smaller. Sometimes a chromosome is passed intact – see Chr 21 above. You can try this at home, on a sheet of paper – just make 23 horizontal lines and put 34 vertical tic marks on them. You can put a few more or less tic marks, but the overall picture of relatively large segments from your grandparents will be the same.

The important observation here is that you have these ancestral segments on your chromosomes – they are fixed between fixed crossover points created when your parent passed these chromosomes to you.  Of course you don’t know where they are at first, but as you determine Triangulated Groups (TGs) with various cousins, you’ll find that none of the shared segments span across one of these crossover points. And in fact, with enough shared segments you will start to see these crossover points firm up, with separate TGs (from the other grandparent) on either side of them. This chromosome mapping, with shared segments, identifies the crossover points for your ancestral segments. The shared segments with Matches usually only overlap part of your ancestral segment from a Common Ancestor – in this case a grandparent.

Crossover Points in Two Generations

Adding 34 tic marks per generation is a good exercise to carry out for several generations and get the feel for how this works. Let’s try another 34 vertical tic marks. We’ll add the tic marks to show the crossover points that were formed when grandparents passed the chromosomes (which they got from their parents) down to your parents. In effect this takes the 57 segments we had in Figure 1, and (with 34 more crossovers) creates 91 total segments as shown in the genome in Figure 2:

05D Figure 2

We still have fairly large segments. On average now, these ancestral segments are 3,400/91 = about 37cM per segment. Again – some will be larger, some smaller. Each of these segments in Figure 2 (between tic marks – both old and new) is from a great grandparent. These segments fill up each and every chromosome in this genome. You may note that some of the grandparent segments were not subdivided. This is not unusual. In fact it has to happen. We started with 57 ancestral segments and added 34 new tic marks (crossover points) – so at most 34 segments got subdivided and at least 23 segments did not.
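The pencil-and-paper exercise can also be sketched in Python. This is a simplified illustration, not a model of real recombination: it assumes 23 equal-length chromosomes and crossover positions drawn uniformly at random, and the function names are hypothetical.

```python
import bisect
import random

CHROMOSOMES = 23
CROSSOVERS_PER_GEN = 34   # the average used above

def count_segments(marks):
    """Each chromosome has one more segment than it has tic marks."""
    return sum(len(chrom_marks) + 1 for chrom_marks in marks)

def add_generation(marks, rng, n=CROSSOVERS_PER_GEN):
    """Add n tic marks in place; return how many of the segments that
    existed before this generation were subdivided."""
    old = [list(chrom_marks) for chrom_marks in marks]
    touched = set()
    for _ in range(n):
        chrom = rng.randrange(len(marks))   # simplification: equal-length chromosomes
        pos = rng.random()                  # position along the chromosome, 0 to 1
        touched.add((chrom, bisect.bisect(old[chrom], pos)))
        bisect.insort(marks[chrom], pos)
    return len(touched)

rng = random.Random(1)
marks = [[] for _ in range(CHROMOSOMES)]

add_generation(marks, rng)                  # parent to child
segments_gen1 = count_segments(marks)       # 23 + 34 = 57

split_gen2 = add_generation(marks, rng)     # grandparents to parent
segments_gen2 = count_segments(marks)       # 57 + 34 = 91

print(segments_gen1, segments_gen2)
print("grandparent segments subdivided:", split_gen2)
print("grandparent segments untouched:", segments_gen1 - split_gen2)
```

The segment totals are always 57 and then 91, whatever the random positions. The number of grandparent segments that get subdivided can come out a little under 34, because two new tic marks sometimes land in the same segment.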

Crossover Points for 13 Generations

In the next generation back, we would add 34 more new tic marks (crossovers) which would subdivide at most 34 of the 91 ancestral segments, creating a total of 125 ancestral segments from 2G grandparents, and leaving at least 57 segments untouched (no subdivision). Here is a table in Figure 3 carrying this math out for 13 generations:

05D Figure 3

Discussion of Figure 3:

Note: This is a table with various values, depending on which generation you are focused on. So successively, pretend you are at a particular generation and read across to see the statistics. Cousins are abbreviated: 2nd cousin is 2C; 2nd cousin once removed is 2C1R.

– Gen 0: You have 23 chromosomes from a parent (we are only working on one genome, so the number of ancestors is 1). Your parent gave you 23 very large segments (which are chromosomes).

– Gen 1: You get DNA contributions from your 2 grandparents. This is in 57 segments spread over one genome. At this level of your ancestry you would see Matches with 1Cs. Review this in Figure 1.

– Gen 2: You get DNA contributions from your 4 Great grandparents on one side. Now you have 91 ancestral segments spread over 23 chromosomes, and each segment averages about 37cM. Some of these ancestral segments are larger, and some are smaller; and they all add up to 23 complete chromosomes (one full genome). This is the generation that you usually share with 2Cs – review Figure 2. In Figure 3 I also show the calculated shared segment values for the various cousins. With a 2C, you would normally share a total of 106cM (from one side). But the average size of the segments from the Great grandparents is only 37cM. This reflects the fact that you will probably share multiple segments with a 2C – perhaps on average three 37cM segments totaling 111cM… Remember these are averages and in actual practice there is a LOT of variation.

-Gen 3: This shows an average ancestral segment size of 27cM from your 2G grandparents – spread over 125 total segments. The total shared segment for a particular 3C is about 27cM – so you might expect a single segment from a 3C (again, this is just an average, but it might reflect what you often see). I’ve underlined ancestral segment (what you actually got from an ancestor), and shared segment which is the overlap between you and a Match. This overlap is rarely exactly the same ancestral segment in both you and your Match – one or both of you probably has somewhat more in the full segment you got from the Common Ancestor.

NB: this overlapping (shared) segment vs ancestral segment difference may be the root cause of some math calculations which have been touted as proving that exact matches among more than 3C are very rare.  Several cousins having the exact same ancestral segment may be fairly rare, but experience with Triangulated Groups shows that overlaps are not that rare.

-Gen 4: Ancestral segments (averaging 21 cM) from your 16 3G grandparents are spread over about 159 segments. So you would see, on average, an ancestral segment from each 3G grandparent in roughly 10 different segments spread over the chromosomes in that genome. Most of your Matches would be 4C (or 3C1R or 4C1R). The shared segments would average 6.6cM, but another way to look at this is that roughly half of them would be over 7cM. However, experience shows that a relatively small percentage of our Matches are 4C and closer relatives. So there are not many such Matches to cover all the segments in our genome.

-Gen 5: Our 32 4G grandparents still give us fairly large 17cM ancestral segments (on average) spread out over 193 segments. We would still see most of our 4G grandparents in multiple segments. Our 5C Matches only share, on average, 1.7cM. So only some of them, on the tails of the distribution curves, will share 7cM or more. The offset is that we have so many 5Cs, that we still get plenty of IBD matches with them. However, the key point here is that while we may have a 17cM ancestral segment from a 4G grandparent, a 5C is only likely to share part of that with us. It would take several 5Cs, each with a 7-10cM segment, partially overlapping our own ancestral segment, to “cover” our 17cM ancestral segment.  In practice we often get 5C Matches with above average segments, but usually not as large as 17cM.

-Gen 6: Our 64 5G grandparents pass down ancestral segments to us that average about 15cM. They pass these down to an average of 227 segments; and each 5G grandparent will pass down DNA to 3 or 4 different segments, on average. Perhaps some of our 5G grandparents won’t have DNA that reaches us at all, while others may pass down 5 or more segments – roughly, it usually averages out. At this level most of our Matches will be 6C, give or take a little. A 6C, on average, only shares 0.4cM of DNA with us. But there are long tails on these distribution curves, AND we have a LOT of 6Cs. The result is that we do have many 6Cs who share IBD segments with us over 7cM. Yes, the probability of a specific 6C shared segment is one fourth the probability of a 5C, but we have so many more 6Cs than 5Cs that we actually get more Matches with 6Cs. This means more 6C Matches are out there with a shared segment over 7cM than there are for 5C. Again, it will normally take several of them to “cover” an ancestral segment (a TG).

-Gen 8: Skipping a generation to the 256 7G grandparents. At this point there are an average of 295 segments, or about one segment per 7G grandparent. Clearly by this time some of the 7G grandparents do not contribute to your DNA, and some 7G grandparents contribute to several ancestral segments. Your ancestral segments are in the 11-12cM range, on average. And despite the fact that 8Cs only share a small amount of DNA on average, there will still be many 8C with shared segments above a 7cM threshold.
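The averages quoted in this discussion can be recomputed with a short Python sketch. The segment and ancestor counts follow directly from the arithmetic above (23 chromosomes, 34 new crossovers per generation, 3,400cM in one genome). The shared-cM formula is an assumption inferred from the quoted values (106cM for a 2C, dropping by a factor of 4 per generation); it is not taken from Figure 3 itself, and the function name is hypothetical.

```python
GENOME_CM = 3400          # cM in one genome, as used above
CHROMOSOMES = 23
CROSSOVERS_PER_GEN = 34   # the average used above

def generation_row(gen):
    """Averages for one genome at a given generation (Gen 1 = grandparents)."""
    ancestors = 2 ** gen
    segments = CHROMOSOMES + CROSSOVERS_PER_GEN * gen
    row = {
        "gen": gen,
        "ancestors": ancestors,
        "segments": segments,
        "avg_ancestral_cm": GENOME_CM / segments,
        "segments_per_ancestor": segments / ancestors,
    }
    # Assumption: shared cM with a cousin at this level falls by a factor
    # of 4 per generation, scaled to match the 106cM quoted for a 2C.
    row["avg_shared_cm"] = GENOME_CM / (2 * 4 ** gen) if gen >= 1 else None
    return row

print("gen ancestors segments avg_cM seg/anc shared_cM")
for gen in range(0, 14):
    r = generation_row(gen)
    shared = "" if r["avg_shared_cm"] is None else f"{r['avg_shared_cm']:.1f}"
    print(f"{r['gen']:>3} {r['ancestors']:>9} {r['segments']:>8} "
          f"{r['avg_ancestral_cm']:>6.1f} {r['segments_per_ancestor']:>7.1f} {shared:>9}")
```

Reading across the Gen 13 row shows the point made below: with 465 segments in one genome, the average ancestral segment is still a little over 7cM.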


All through this analysis, the number of ancestral segments has increased by a constant 34 with each generation; the average segment size starts off large and decreases with each generation, but even after 13 generations, the average ancestral segment is still over 7cM; the number of ancestors continues to double with each generation (and at some point duplicates will start to appear, but as I’ve outlined in Endogamy I and II, each duplicate really acts like a separate ancestor); and the average size of shared segments decreases by a factor of 4 with each generation, but we still see many Matches with shared segments over 7cM.

To expand on this last point, I have over 10,400 “phased” Matches at AncestryDNA, with all the pile-ups and IBS already culled out. About 400 of these Matches are 4C or closer, leaving over 10,000 Matches in the 5C or more distant range. The distribution of these is spread out among 5C, 6C, 7C, 8C, etc. It is, so far, unclear how far back these go, but clearly there are many in the 5C-8C range. And AncestryDNA claims their “phasing” program has less than a 1% error rate. So 99% of these are IBD shared segments, probably most in the 6C-to-8C range. To my thinking, this means most of them must line up somewhere on our chromosomes. If we assume half, or 5,000, of these Matches are for each genome, on average, then these 5,000 Matches must be on 300 to 400 of my ancestral segments – or over 10 Matches in the 5C-8C range on every segment, on average. Some ancestral segments (TGs) may have more, some may have less, but the 5,000 IBD Matches have to go somewhere.

I’ve picked on AncestryDNA here, because they poo-poo Triangulation (I think they don’t really understand it), and because they have equations that some have used to argue that we cannot have multiple 4C or above in TGs. But the same analysis is true using 23andMe and FTDNA data – they each report many Matches, they each claim a small IBS rate (under 5%), and by their own estimates, most of our Matches are beyond 4C. All of these IBD Matches have to be on our chromosomes somewhere. And, in 14 months (by my estimation), we will have twice as many Matches as we have now – we’ll have over 20 Matches per ancestral segment (TG)!

NOTE: the number of crossovers per generation will average out. So the number of segments created by each generation is fairly accurate – there is much less variation in these numbers than you might find in the average cM for an ancestral segment (which has a somewhat wider range) or a shared segment (which appears to have a much wider range).

“the main thing is to keep the main thing the main thing”

  1. Your genome (chromosomes) is divided into segments by crossover points.
  2. These are your ancestral segments, and each one is from a specific ancestor.
  3. Each Match will have his/her own crossover points and ancestral segments from specific ancestors.
  4. When you share an IBD segment with a Match this segment comes from a Common Ancestor (CA).
  5. A shared segment means your ancestral segment and your Match’s ancestral segment overlap.
  6. Your Match may have a small ancestral segment, which falls within your ancestral segment; a large ancestral segment, which includes your ancestral segment; or, usually, any size ancestral segment which overlaps a portion of your ancestral segment.
  7. The overlapping amount may be relatively small (say 7cM), or as large as your ancestral segment.
  8. The odds are very small that you and a Match would get exactly the same segment from a CA. And certainly the odds would be extremely small that you and several Matches would get exactly the same ancestral segment from a Common Ancestor.
  9. However, from the numbers of IBD shared segments we are getting from Matches, compared to the number of ancestral segments, it is highly probable that multiple Matches can and do have ancestral segments which overlap your ancestral segments.


Note: A full Triangulated Group (TG) is equivalent to one of your ancestral segments. Which ancestral segment the TG represents depends on the shared (overlapping) segments you have with your Matches.  Several Matches with overlapping segments in a TG will tend to “wall paper” your ancestral segment – with enough of the right Match/segments your TG will cover the whole ancestral segment. Some TGs may be from a closer ancestor (say a great grandparent), some may be somewhat more distant (say a 7G grandparent). From my experience, most TGs will be in the 10-40cM range. This does create a hodge-podge effect (with TGs from different generations), but the TGs tend to be adjacent to each other from one end of each chromosome to the other. Alternatively, you can try to map to a specific generation – perhaps starting with grandparents (and determine those crossovers), and then determine which of those segments are subdivided into smaller segments from the great grandparents, and which segments remain intact going back that one generation. And then continue in this fashion with each additional generation. The drawback to this process is that you need many close relatives to take DNA tests to determine all the crossover points at each generation.


A final word of caution: don’t get too lost in the details or the math. Generally, you will have many Matches and IBD segments. Because they are IBD segments, they have to go somewhere on Mom’s side or Dad’s side. 23andMe and FTDNA have developed algorithms to help ensure that most of your Match segments over 7cM are IBD, and from experience we know that almost all of the shared segments over 10cM are IBD, and well over half of the 7-10cM segments are IBD. So if you are reading this blog, you are probably into utilizing segments, along with your genealogy, to improve your family Tree. You should also upload to GEDmatch to find other Matches (from all 3 testing companies) with segments. When segments over 7cM Triangulate, it’s a very strong indication that those segments are IBD and the resulting TGs are from a Common Ancestor. You have an ancestral segment at the location of each TG, and your Matches share part of that ancestral segment with you. Each ancestral segment (TG) came from one of your parents and one of your grandparents, etc. Match/segments in that TG have to come from a distant ancestor who is ancestral to that grandparent. There is no cutoff to this process. We cannot say that only our large ancestral segments are valid. All of our ancestral segments came from a specific ancestor. Our ancestral segments have their own ancestral “Tree”. You may be more confident about a TG including a first or second cousin, but you probably don’t have enough tested cousins to cover every TG over all of your chromosomes. That doesn’t mean these other TGs are not valid, it just means you don’t have a close cousin to validate them. You have to use the closest cousin you can find to validate each TG.  Your ancestral segments are real! They are part of you, from your ancestors. And Matches who share those segments, also share their ancestry – no matter how far back the Common Ancestor is.
Note from Figures 1 and 2 that segments from more distant ancestors are “nested” within larger segments from closer ancestors. So if you cannot determine the most distant Common Ancestor, look for the closer Common Ancestor who provided the larger ancestral segment.


05D Segment-ology: Crossovers by Generation by Jim Bartlett 20160201

Endogamy PART II

Endogamy Part II – One Segment from One Ancestor


In Endogamy Part I (Shared DNA), we found that the total cM shared between you and a Match is multiplied by the number of times you had the Common Ancestor (CA) in your Tree. So if you and your Match were 5th cousins (5C), you would normally share 3.4cM. If your CA (between you and your Match) was in your Tree five times you would tend to share, on average, 5 x 3.4cM = 17cM. If your Match has that CA in her Tree three times, the total you would tend to share, on average, would be 3 x 5 x 3.4cM or 51cM. For close-cousin Matches (say 1C, 2C, 3C), this makes a big difference. For distant-cousin Matches (say 6C-8C or more), where you would probably not match at all most of the time, the endogamy may increase the total cM enough that you’ll actually get an above-threshold Match. Note: a 7C would normally share 0.1cM, so even with an Endogamy factor of E15, you’d only share 1.5cM, on average, and you’d need to be on the tail of the distribution curve to exceed a 7cM threshold for a Match.
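The multiplication in that recap can be written as a one-line Python function. This is a minimal sketch; the function and parameter names are hypothetical, and the base cM values are simply the ones quoted above.

```python
def expected_shared_cm(base_cm, times_in_your_tree=1, times_in_match_tree=1):
    """Average total shared cM when the Common Ancestor appears several
    times in your Tree and/or your Match's Tree."""
    return base_cm * times_in_your_tree * times_in_match_tree

# 5C with the CA in your Tree five times
five_times = expected_shared_cm(3.4, times_in_your_tree=5)
# same, with the CA also in the Match's Tree three times
five_by_three = expected_shared_cm(3.4, times_in_your_tree=5, times_in_match_tree=3)
# 7C with an overall Endogamy factor of E15
seventh_cousin = expected_shared_cm(0.1, times_in_your_tree=15)

print(round(five_times, 1), round(five_by_three, 1), round(seventh_cousin, 1))
```

These are averages only. Whether a particular Match clears a 7cM threshold still depends on where that Match falls on the distribution curve.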

In this post – Endogamy PART II – we’ll look at what happens to an individual segment.

Ground Rules

Shared Segment means an IBD segment, from an Ancestor. I usually consider all shared segments over 7cM in a Triangulated Group to be IBD segments.

One Ancestor or one CA means one Ancestral line. See CA and MRCA for a discussion of the CA for a shared segment. Note: at different cousinship levels, there may be intermediate CAs, or MRCAs, which are all in one Ancestral line. The shared segment comes from the most distant CA of that Ancestral line. In other words, all the Matches with shared segments (think a Triangulated Group), will have a CA with you on one Ancestral line – you and they will all descend from one CA.

Endogamy means a CA is in our Tree multiple times. So Ancestor A can be represented by A1, A2, A3, etc. for each time that Ancestor is in our Tree. As you read on, you’ll note that it’s important to treat each one (A1, A2, A3, etc.) as a separate Ancestor (even though they are all the same individual).

Assumption: Each Match will have one CA with you. I know in many cases a Match may share multiple Ancestors with you. But for the purposes of this blog post, we will only look at the effects of one CA. We have to build up, one concept at a time. Learn the concept in this post.

Shared Segments from Duplicate Ancestors – analysis

Let’s look at a shared segment between you and a Match. It could be any shared segment. Just to add some reality to this discussion, let’s say it’s on Chr 10 from 8 to 20Mbp with 23cM. We’ll call this SEG-1.

In this example, the CA for you and a Match is in your Tree twice: A1 and A2. Both A1 and A2 are the same person, so both A1 and A2 have the same SEG-1. Let’s look at Figure 1 and see how far SEG-1 can descend toward you.

16E Figure 1

Analysis of Figure 1:

In Generation 6 (G6), YOU and a MATCH are 4C, and share SEG-1.  SEG-1 came from Common Ancestor A. Red indicates that person has SEG-1.

In G1, your 2 ancestors, A1 and A2, have SEG-1; and your Match’s ancestor, A, has SEG-1. These are all the same individual – so naturally she has SEG-1, no matter where she is in your Tree or your Match’s Tree.

SEG-1 is passed down from A to the Match through one ancestral line.

In G2, SEG-1 is passed from A1 to her son; and from A2 to her daughter.

In G3 A1’s son passes SEG-1 to the paternal chromosome in his son; and A2’s daughter passes SEG-1 to the maternal chromosome in her son. This G3 son now has two copies of SEG-1, one on his paternal Chr 10, and one on his maternal Chr 10. This is indicated by **.

In G3, the ** father recombines his two Chr 10s to make one Chr 10 to pass to his son. Only one of the two SEG-1s can be passed on. The G4 son will only get one SEG-1. This is indicated by *. We don’t know whether the maternal or paternal SEG-1 is passed on – it’s a 50/50 chance for either. If you need a refresher on recombination and how only one area of two chromosomes can be passed down, please review Segments: Top-Down.

In G4 and G5, SEG-1 will be passed down to you, and you will share this segment with your 4C Match.

Starting with G1, there are lots of possibilities, but for you and your Match to both share SEG-1, it has to start in a CA (in this case A, A1 and A2) and be passed down through each generation to you and your Match.

One Segment from One Ancestor

This is a fundamental concept of genetic genealogy. Each shared segment can come from only one Ancestor. This means from only one of several Ancestors (A1, A2, A3, etc.) in Trees with endogamy. This can be extended: each Triangulated Group can come from only one Ancestor. A corollary is that a different shared segment (or TG) can come from a different Ancestor. With an Ancestor in your Tree 5 times, it is possible for you to have a different segment from each one. It’s also possible for one Ancestor to pass down several different shared segments (in different TGs). So although we can say “One Segment is from One Ancestor”, the reverse is not true. We have to say “One Ancestor can pass down Multiple Segments” (or no segment).

Shared Segments from Multiple Ancestors – analysis

We can apply this same concept – One Segment from One Ancestor – to even greater endogamy. See Figure 2.

16E Figure 2

Analysis of Figure 2:

In this case we have E5: the Common Ancestor is in your Tree five times. When cousins (descending from the CA) marry, their child may get two SEG-1s – one on each chromosome. This is indicated by the double **. The next generation gets only one SEG-1 from that line. As noted with the A3 line, the son in G4 could have double ** – one from A1 or A2 (paternal chromosome) and one from A3 (maternal chromosome). The daughter in G5 got a paternal SEG-1 from A1, A2 or A3, and a maternal SEG-1 from A4 or A5. You got a SEG-1 from A1 or A2 or A3 or A4 or A5. At this point we cannot tell which of your Ancestors passed down the SEG-1, but we do know it could only have come from one of them.

In this discussion, I’ve used the “worst case” scenario – each of your A Ancestors passed SEG-1 down as far as she could. In fact, SEG-1 is subject to the 50/50 rule – half the time it will be passed down, half the time it won’t. However, the fact that we share SEG-1 with a Match, and have a Common Ancestor A with that Match, means at least one of your Ancestors A had to pass it all the way down to you and the Match.
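The 50/50 rule can be turned into a small calculation for the simpler two-line pedigree of Figure 1. This is a sketch under simplifying assumptions that are not spelled out in the discussion above: each Ancestor A carries SEG-1 on exactly one of her two copies of Chr 10, every parent passes one of his or her two copies at random, the spouses marrying into the line do not carry SEG-1, and the segment is never broken up by a crossover. The function name is hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)   # the 50/50 rule

def child_chance(p_paternal, p_maternal):
    """Chance a child receives SEG-1, given the chances that the parent's
    paternal and maternal copies carry it."""
    return HALF * p_paternal + HALF * p_maternal

# G1 -> G2: A1 passes to her son, A2 to her daughter.
g2_son = child_chance(1, 0)
g2_daughter = child_chance(1, 0)

# G2 -> G3: the ** son gets a paternal copy from A1's son
# and a maternal copy from A2's daughter.
g3_paternal = child_chance(0, g2_son)
g3_maternal = child_chance(0, g2_daughter)

# G3 -> G4: only one of his two copies is passed on.
g4 = child_chance(g3_paternal, g3_maternal)

# G4 -> G5 -> G6 (you): ordinary single-line descent.
g5 = child_chance(g4, 0)
you_two_lines = child_chance(g5, 0)

# Comparison: the same five transmissions with A in the Tree only once.
you_one_line = Fraction(1)
for _ in range(5):
    you_one_line = child_chance(you_one_line, 0)

print(you_two_lines, you_one_line)   # 1/16 1/32
```

Under these assumptions, having A in the Tree twice doubles the chance that SEG-1 reaches you, yet whatever copy arrives still came down only one of the two lines.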

We are all well aware that you and a Match probably have multiple Common Ancestors. Again, this analysis does not sort out which Ancestor is the correct Common Ancestor for SEG-1, nor does it sort out which one of the multiple Common Ancestors it is. This analysis just establishes the point that:

One Segment comes from One Ancestor

A very unusual exception

In the case where your parents are cousins, it is possible (but not very probable), that you would carry two SEG-1s (from this example) – one paternal, one maternal. This would be the case in Figure 2 if you were the daughter ** in G5. At GEDmatch, any such segment areas would be highlighted with their “Are Your Parents Related?” tool. So it’s easy to check for such segments, and be aware of where they exist (Chromosome and Start/End locations). In all other cases you don’t need to worry about this issue. For any shared segments meeting this very unusual criterion (exactly the same segment on both chromosomes), you wouldn’t know which of your two ancestors it came from. If there were any difference in these two shared segments (they were not exactly the same), then chromosome mapping would usually tell you which one was which.


Each shared segment comes from one Ancestor. In the case of endogamy with multiple identical Ancestors, each shared segment comes from only one of them.

It’s possible to have ten identical ancestors in your Tree, and to get a different shared segment (as in a different TG) from each of them.


16E Segment-ology: Endogamy PART II – One Segment from One Ancestor by Jim Bartlett 20160104