Roughly Right is OK for Genealogy

A Segment-ology TIDBIT

A Chromosome Map divides each chromosome into segments [or TGs] from specific ancestral lines. The data we get is fuzzy (see Fuzzy Segments), so the size and boundaries of these ancestral segments are a little fuzzy. But so what! For genealogy purposes these ancestral segments are large “targets”. Most shared segments with Matches easily hit these targets. The ancestral line “owns” its target segment. For genetic genealogy, that’s what matters – linking Matches and Common Ancestors with segments. It doesn’t make any difference if the shape or size of the target segment (TG) is a little blurry.

 

[22C] Segment-ology: Roughly Right is OK for Genealogy TIDBIT by Jim Bartlett 20170102

Number of Matches on a Segment (or TG)

A Segment-ology TIDBIT

Segments from Ancestors with large families will generally have more Matches (in general, the larger the families, the more cousins you have). Segments from more distant Ancestors will generally have more Matches (in general, the more distant the ancestor, the more cousins you have). Segments from Ancestors in Colonial America will generally have more Matches (in general, more Americans have taken the DNA tests). Segments from endogamous Ancestors will have more Matches (because of endogamy, there is more matching DNA).

 

22B Segment-ology: Number of Matches on a Segment (or TG) TIDBIT by Jim Bartlett 20170101

 

Crossover and Segment Formation

A Segment-ology TIDBIT…

Crossovers and segments are formed by random DNA biology.  They are formed at conception in each of our ancestors and in ourselves. They are at fixed, permanent locations in each of us. They are not affected by family size, geography, wealth, status, intelligence, etc. For each of us, they are a fixed structure of our chromosomes – like a picture or jigsaw puzzle – which is different for each person.

 

22A Segment-ology: Crossover and Segment Formation TIDBIT by Jim Bartlett 20170101

A Targeted Process at AncestryDNA

AncestryDNA does not provide segment info. This is a problem for Segmentologists who want to Triangulate – like me. Triangulation has worked well in grouping my many thousands of segments into specific groups representing specific ancestral lines. These Triangulated Groups (TGs) now cover 94% of my 45 chromosomes, and identify specific maternal and paternal segments of DNA from my ancestors. Over 75% of my 45 chromosomes have some identified Common Ancestor(s) beyond my parents. Are they all correct? Of course not. But many are linked to first, second, third and fourth cousins, which are walking back the Common Ancestors of those segments. These are reinforcing and validating more distant Common Ancestors shared by fifth through ninth cousins.

I have over 400 Hints at AncestryDNA, and very few of them can be linked to GEDmatch kits. It’s frustrating to know so many Triangulated Groups on one hand; and so many Common Ancestors at AncestryDNA on the other hand; and not be able to merge this information. Knowing the segment data for each of the AncestryDNA Hints would significantly expand my chromosome map. Grrrrrrr:-(

So I’ve tried a new process at AncestryDNA, and am having some amount of success with it – I thought I’d share it in this blog.

I selected a surname – HIGGINBOTHAM – on my mother’s side. This surname is in my Tree 6, 7, 8 and 9 generations back. The patriarch (9 generations back) is thought to be John HIGGINBOTHAM, but there is some controversy about his given name. However, there is general agreement on several of his children, and I have several AncestryDNA Hints at the 6C, 7C and 8C levels going back on this HIGGINBOTHAM line.

So I searched on this surname at my AncestryDNA Results page and got over 100 Matches. I looked at each one and selected 46 which were Hints or had strong Colonial Virginia ties to Amherst Co, VA (where my HIGGINBOTHAMs were for 3 generations) or to nearby counties.

Note that at 9 generations back we have 256 ancestors on our mother’s side. My chromosome mapping currently shows just over 200 TGs on my mother’s side. So I should reasonably expect about one, maybe two or three, of my TGs to be from my HIGGINBOTHAM line. Maybe there are a few Matches at the 6C level which actually then branch off on the HIGGINBOTHAM’s wife’s ancestry.
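The arithmetic behind that expectation can be sketched in a few lines of Python. This is only a rough illustration: it assumes TGs are spread evenly over the ancestors of one generation, which real DNA does not do, and the function names are made up for this example.

```python
def ancestors_per_side(generations_back: int) -> int:
    """Ancestors on one parent's side (parents count as generation 1)."""
    return 2 ** (generations_back - 1)


def expected_tgs_per_ancestor(tgs_on_side: int, generations_back: int) -> float:
    """Average TGs per ancestor, assuming an even spread (a simplification)."""
    return tgs_on_side / ancestors_per_side(generations_back)


print(ancestors_per_side(9))                         # 256
print(round(expected_tgs_per_ancestor(200, 9), 2))   # 0.78
```

With about 200 maternal TGs and 256 maternal ancestors at 9 generations, the average is a little under one TG per ancestor, which is why one, or maybe two or three, TGs for a single line is a reasonable expectation.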

I was surprised at the number of HIGGINBOTHAM cousins I had – particularly since I was pretty sure they shouldn’t be spread out over many different TGs. And, on the other hand, I don’t believe that many cousins should all pile up on the same ancestor – unless, of course, some of them were related to each other more closely (some Matches were from the same Admin), or they only descended from two or three of the Patriarch’s children at each TG.

I needed to find out the actual segment data each match shared with me….

So I drafted a standard message:

You and I have a DNA match at AncestryDNA, and have a common HIGGINBOTHAM line. I have 46 such Matches at AncestryDNA! I am mapping my DNA (linking Ancestors to specific DNA segments), and would very much like to determine the segment for each of these lines. Most of us are 8th cousins, and I did not expect to get HIGGINBOTHAM DNA from more than one or two different segments. To sort this out, I am requesting that you upload your raw DNA file to www.GEDmatch.com. It’s a free site – easy to register – with complete instructions on their home page. My GEDmatch ID is Mxxxxxx. You will get many more Matches at GEDmatch, with emails. No medical or health info. Please let me know your GEDmatch kit number if you upload at GEDmatch.

I will provide you feedback on what I find, including other cousins who share the same segments. This will help all of us.

FYI, I’m having success with atDNA and would be happy to help you. See my How to Succeed list at: http://boards.rootsweb.com/topics.dnaresearch.autosomal/301/mb.ashx It has some good links at the end. My DNA blog is at www.segmentology.org – written for genealogists in plain language.

Hope to hear from you. Jim Bartlett jim4bartletts@verizon.net

end of message

Note – do not send such a standard message to 46 Matches at one sitting. Ancestry looks for “spam-like” messages, and will block your messaging ability (you have to phone in and talk them out of it, to get reinstated). I sent 5 or 6 a night for a week or so.

Well… this worked out much better than previous attempts to randomly beg Matches to upload to GEDmatch. I now have 23 GEDmatch kit numbers out of this group. And several others are still working on it. The results have been in several categories:

  1. At GEDmatch I can also compare with my deceased father’s kit uploaded from FTDNA. It turns out several of my HIGGINBOTHAM cousins share a DNA segment with both me and my father. So although we may well be genealogy cousins on HIGGINBOTHAM, the shared DNA that formed our match at AncestryDNA is from my father’s side – not from a HIGGINBOTHAM line.
  2. Several TGs are getting most of the shared segments. TG [04P36] has six Matches on two children of the Patriarch (some of them are 5C or 6C to each other from one child of the Patriarch). TG [10I36] has 3 Matches; and TG [04B36] has two Matches. [NB my TG naming system starts with the Chr (04 and 10); the letter roughly represents the 10Mbp block where the segment starts; and 3-6 are Ahnentafel for my mother’s father’s side – where my HIGGINBOTHAM ancestry is]
  3. The rest of the Matches are spread out over different TGs – most of these TGs include Matches with other Common Ancestors. So it is entirely possible, indeed probable, that most of these TGs will come from different Common Ancestors. As time allows, I will investigate the several surnames from each of these TGs to see if there is consensus among the Matches.

This process would probably be too overwhelming for common surnames like JONES, SMITH, JOHNSON, etc. And your own surname might not be very helpful in determining a few TGs – your father’s or mother’s surname could be sprinkled all over your chromosomes – so it would be harder to form groups. Since we probably have the majority of our Matches in the sixth to eighth cousin range, I’m thinking that would be a good place to select surnames.

The main point here is that by using a more personal message, I’m getting more cooperation from my AncestryDNA Matches. By selecting a surname, and doing some homework to make sure we have the same Patriarch, the message is targeted. By promising to provide feedback to each Match who uploads to GEDmatch, I’m helping my Matches. The Matches don’t have to understand Triangulation or Segmentology – all they need to do is upload to GEDmatch. It seems to be working…

This process also works for testing educated guesses on new Surnames. It takes advantage of the more than 23,000 Matches I now have at AncestryDNA. I can search for a particular SURNAME and see if it pops up. Out of 23,000 Matches, each of my Ancestral surnames should be shared by some of my AncestryDNA Matches.

For example, I’ve looked for years for the maiden name of the wife of my ancestor Thomas BARTLETT c1705-1783 of Richmond Co, VA. At one point he owned a piece of land between two EIDSON brothers, so I thought perhaps his wife’s maiden name was EIDSON (and I’ve collected a lot of EIDSON records in Richmond Co, VA trying to find the connection – to no avail). So I searched my AncestryDNA Results for the Surname: EIDSON. I only got one EIDSON hit, and that clearly was not a link to Richmond Co, VA in the early 1700s. This suggests to me that EIDSON is unlikely to be the surname of my ancestor. I’ll try some other surnames from my FAN list for Thomas BARTLETT. And in a year or two, when I have twice as many Matches at AncestryDNA, I may try EIDSON again. And revisit some of the other surnames…

For me, this targeted approach is turning out to be a good way to get uploads to GEDmatch and to find Triangulated Groups with several Matches who share the same ancestry with me.

 

15G Segment-ology: A Targeted Process at AncestryDNA by Jim Bartlett 20161020

The Attributes of a TG

This article will describe the various attributes of a Triangulated Group (TG). Some have noted that I use the term TG to describe both a Group of Matches as well as an ancestral segment. Well… yes, I do. Read on.

Once established, each TG has certain attributes which can be used to describe and/or define the TG:

A1. A TG is a group of shared segments from Matches. We often think about the Matches in a TG. They have a Common Ancestor. They can be contacted and encouraged to collaborate on finding the common ancestry. So in this sense a TG is a group of Matches. However, note that any of these Matches could, potentially, also share the same or a different ancestor with you on another segment (in a different TG)

A2. A TG occupies a specific physical space on a chromosome. It is, in effect, a segment in its own right – a segment from one of your ancestors. The TG is on one chromosome with a start location and an end location. These start and end locations are determined by the matching, overlapping, shared segments (from Matches) within the TG. Please review: Anatomy of a TG.

A3. As a segment, a TG has a specific string of SNPs on one chromosome. These would be the same SNPs in a segment on your chromosome, and on the segment each one of your Matches shares with you in the TG. All the SNPs would be the same. The SNPs have to be the same for IBD segments to match.

A4. A TG is the equivalent of phased data. The TG represents an ancestral segment on a chromosome. All of the SNP values (alleles) are on one chromosome, and are the same SNP values you got from your mother or father (depending on the chromosome). What you have in a TG (segment) is exactly what you would have with phased data.

  1. We don’t see the actual ACGT values in a TG that we would get with true phasing (with a child-parents trio), but they are the same values in the TG. The TG segment represents part of one of your chromosomes – it must have the same ACGT values that your parent passed to you on that chromosome.
  2. We can treat the TG as phased data, and any other shared IBD segment which Triangulates with the TG will have the same SNP values.
  3. This is true, even if you have formed a TG (with matching, overlaying, shared segments from Matches) and do not have the genealogy to determine which side it is on. You can be confident that the TG exactly matches the DNA on one of your two chromosomes. In this case the TG is not entirely equivalent to a true phased segment. But, if you had the phased information, you’d already know which side the TG was on. And very often you can determine the side of a TG by imputation – by determining which side it’s not on; or by the admixture of the segment.

A5. Technically, each TG has a cM value. However, it usually takes a lookup table to determine the cM value for a segment on a given chromosome, between two points. This is what the testing companies and GEDmatch do for each shared segment they report. It’s a lot of work for genetic genealogists – and, in general, our TGs will morph over time as new shared segments are added, and the TG cMs will need to be adjusted. However, we can fairly easily make rough estimates of the TG cMs, which are plenty good enough for genealogy:

  1. Subtract the TG start location from the end location to get the number of base pairs (bps), divide by 1,000,000 to get Mbp, which is roughly equal to the number of cMs.
  2. Or, eyeball the cMs of the larger shared segments in a TG and extrapolate to the full TG (if you’re lucky, you may have a shared segment which nearly fills the TG)
  3. Note that the cM is a fuzzy value anyway – it’s empirically derived (an average of many observations), and it’s an average of the female and male averages. So don’t go to too much trouble, and use round numbers. AND note that there is a wide range of possibilities when trying to use cMs to determine approximate cousinships. See Blaine Bettinger’s chart for the ranges of cMs vs cousinships.
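The first rough estimate above (Mbp taken as roughly equal to cM) can be sketched in Python as follows. The function name and the sample locations are made up for illustration, and the result is only the round-number approximation described here, not a lookup-table value.

```python
def estimate_tg_cm(start_bp: int, end_bp: int) -> int:
    """Rough TG size in cM, treating 1 Mbp as about 1 cM and rounding."""
    if end_bp < start_bp:
        raise ValueError("end_bp must not be less than start_bp")
    mbp = (end_bp - start_bp) / 1_000_000
    return round(mbp)


# Hypothetical TG running from 27.0 Mbp to 45.2 Mbp
print(estimate_tg_cm(27_000_000, 45_200_000))  # 18
```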

A6. TGs have fuzzy ends. There is no “signpost” in our DNA to identify crossover points, or where a shared segment starts or ends. The company algorithms estimate shared segments by looking for areas of DNA that are identical, and then continuing until the DNA is not identical. This results in some, usually small, amount of overrun (a longer segment than is actually there from a Common Ancestor). So my convention is to use the start location of the first shared segment in a TG (the one with the lowest start in bp). This is the start location of the TG. The end location is determined several ways:

  1. If there is no overlap with the next TG on that chromosome, then use the largest end location of all the shared segments in the TG – the shared segment which runs the farthest. This often is not the last shared segment in a sorted spreadsheet – you need to look at them all.
  2. If there is a small, fuzzy overlap of 1-2Mbp with the next TG, I use the start location of the next TG, and accept the fact that there is a fuzzy overlap. We don’t need to be real precise for genealogy. Each TG represents a large block of DNA from an Ancestor – the fact that the edges of the block may be fuzzy should not obscure the big picture: the main TG segment came from an Ancestor!
  3. If there is a large overlap with the next obvious TG (almost always from a large shared segment with a close cousin, which probably spans more than one TG), I start a new TG at the obvious point dictated by the next group of shared segments, and use the same point as the end location of the first TG. This involves judgment – there is no hard rule – and the data will usually indicate where to start a new TG. Just accept that close cousins may share large segments which span more than one TG.
  4. If the shared segments in the TG all end a “few” Mbp before the next TG starts, I will just round up, and use the start location of the next TG as the end location. Again, use judgment.
  5. If there appears to be a large enough gap between two known TGs, I create a “dummy” TG to fill the gap. And then I keep looking for some Matches with shared segments to fill that gap. At this date my dummy TGs are about 7% of my DNA.
  6. Using these conventions will result in TGs that are adjacent to each other over all your chromosomes. Even with some “dummy“ TGs, this process will organize all of your IBD Match segments into TGs over all of your DNA. When done, this is a happy day! You can then focus on TGs that should link to specific ancestors. And all new Match segments will generally fit easily into existing TGs.
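These end-location conventions can be sketched in Python. This is a simplified illustration, not a replacement for judgment: it assumes the segments are already sorted into TGs on one chromosome, works in Mbp, and uses a made-up threshold (small_gap) for what counts as a “few” Mbp. All names and sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TG:
    label: str
    start: float  # Mbp
    end: float    # Mbp
    dummy: bool = False


def lay_out_tgs(groups, small_gap=3.0):
    """groups: list of (label, [(start, end), ...]) for one chromosome."""
    # Each TG starts at its lowest segment start; note its farthest end too.
    bounds = sorted(
        ((label, min(s for s, _ in segs), max(e for _, e in segs))
         for label, segs in groups),
        key=lambda b: b[1],
    )
    laid_out = []
    for i, (label, start, far_end) in enumerate(bounds):
        next_start = bounds[i + 1][1] if i + 1 < len(bounds) else None
        if next_start is None:
            # No following TG: use the farthest end of any segment.
            laid_out.append(TG(label, start, far_end))
        elif next_start - far_end > small_gap:
            # Large gap: keep the farthest end and fill the gap with a dummy TG.
            laid_out.append(TG(label, start, far_end))
            laid_out.append(TG("dummy", far_end, next_start, dummy=True))
        else:
            # Overlap or small gap: end this TG where the next one starts.
            laid_out.append(TG(label, start, next_start))
    return laid_out


groups = [
    ("A", [(10, 24), (12, 31)]),
    ("B", [(30, 48), (33, 50)]),
    ("C", [(62, 80)]),
]
for tg in lay_out_tgs(groups):
    print(tg.label, tg.start, tg.end)
```

In this made-up example TG A ends at 30 (the start of B, accepting a fuzzy 1Mbp overlap), a dummy TG fills 50 to 62, and TG C runs to its farthest segment end at 80.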

A7. As you work this process of forming TGs and assigning them to sides using genealogy, you are creating a chromosome map. As new Matches are posted, their shared segments may adjust the start and/or end locations of the TGs. When you get lucky, you’ll find a new shared segment that fills a “dummy” TG.

A8. Naming TGs. I label each TG (and all the shared segments in it) with a short code – like 07C25. The 07 means Chr 7; C means the TG starts within the third group of 10Mbp – in this case between 20-30Mbp; 25 means this TG is on my father’s (2), mother’s (5) side – using Ahnentafel numbers. If I were starting over I’d use 07.027PM to indicate Chr 7; start 27Mbp; on Paternal, and his Maternal ancestry. Note: before you do any assigning, the label might be 07.027, then add P or M when determined. I usually add this code in the subject line of emails and messages – it just helps me keep organized.
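Both labeling styles are easy to generate in a spreadsheet or a short script. Here is a minimal Python sketch; the function names are made up for this example, and the Ahnentafel digits or P/M letters are simply passed in as text.

```python
import string


def tg_code_letter(chrom: int, start_bp: int, ahnentafel: str) -> str:
    """Letter style: Chr, letter for the 10Mbp block (A = 0-10Mbp), Ahnentafel digits."""
    block = start_bp // 10_000_000
    return f"{chrom:02d}{string.ascii_uppercase[block]}{ahnentafel}"


def tg_code_dotted(chrom: int, start_bp: int, sides: str = "") -> str:
    """Dotted style: Chr, start in whole Mbp, then P/M letters once determined."""
    return f"{chrom:02d}.{start_bp // 1_000_000:03d}{sides}"


print(tg_code_letter(7, 27_000_000, "25"))  # 07C25
print(tg_code_dotted(7, 27_000_000, "PM"))  # 07.027PM
print(tg_code_dotted(7, 27_000_000))        # 07.027
```

Either code sorts naturally by chromosome and start location, which is the main reason to use it.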

A9. Also note that each TG has many genes within it. Each of your 22,000 or so genes will have a specific location on your DNA. If you become curious about any particular gene, you can look up where it is located (chromosome and location). You will have two of each gene, one on your maternal chromosome and one on your paternal chromosome. You can then look at your chromosome map of TGs and see which maternal TG and which paternal TG it’s in. If you’ve determined a Common Ancestor for those TGs, you’ll know which ancestor passed that gene down to you. You can also add a gene to your spreadsheet, so that it sorts with all the segments and TGs. Examples: Short Sleepers gene BHLHE41 would be 12.026M and 12.026P (very close to LRRK2 (Parkinson’s) at 12.031). Also my Neanderthal segment is 10.130 (I don’t know which side).

Have fun with your TGs!

 

15B Segment-ology: The Attributes of a TG by Jim Bartlett 210919

Understanding and Using TGs

Each Triangulated Group represents a segment of DNA that comes from a specific ancestor in your ancestry. It comes from that ancestor, down one specific path, to you. Even if you have the same ancestor in your ancestry more than once, a specific TG segment comes from only one instance of that ancestor, down one specific path to you.

Review Ancestral Segments

Let’s review ancestral segments. Please also refer to the posts and figures for Segments: Bottom-Up and Segments: Top-Down. At each generation of ancestors, each one passes down specific segments to you. In other words, your entire DNA is made up of segments from your ancestors of each generation. All 4 of your grandparents passed down various specific segments to you that make up all of the DNA on all of your chromosomes – all of your DNA came, in various segments, from those 4 ancestors. The same statement can be made of all 8 of your Great grandparents – you have different segments from each of your 8 G grandparents, which, in total, provide all of your DNA on every chromosome. Some Great grandparents may not contribute to each and every chromosome, but all of your chromosomes will be a patch-work-looking quilt of segments from your Great grandparents. The same is true for every generation: your DNA is made up from segments from your ancestors in that generation. As you go back 6 or 7 or 8 generations, some of the ancestors of that generation will drop out. That is you don’t get DNA from every one of them. They are still your ancestors, you just didn’t inherit DNA from them. Review The Porcupine Chart.

Another way to put it is that segments from distant ancestors are combined to make larger segments in closer ancestors. The smaller segments are still there, it’s just that the closer ancestor passed down larger segments which are made up of the smaller segments from their ancestors. Your parent passes complete chromosomes (very large segments) to you, which are made up of smaller segments from his/her ancestors. See the graphics in Top-Down.

Mapping with Triangulated Groups

Now the hard part of chromosome mapping with Triangulated Groups (TGs) is that the TG segments don’t generally create a map for any specific generation. The TG segments could come from ancestors of different generations. It all depends on how your TGs are laid out, and on the shared segments you get with Matches. You will only “see” TG segments that are made up of shared segments from Matches. So some TG segments may be from closer generations than others. But, in general, your TG segments will be adjacent to each other from the start to the end of each chromosome.

Cousinships with Matches in a TG

A TG is made up of Matches with shared segments. Review: The Anatomy of a TG. The shared segments (with Matches) in a TG will generally fall into 3 categories:

  1. Shared segments with Matches who are cousins on the Common Ancestor who created your TG segment. For a TG segment created by a 6G grandparent, this would be 7th cousins. Note: it would be very unlikely for you to have more than three exact 7th cousins (each one from a different child) from a 6G grandparent on one segment (i.e. in a TG). More on this later.
  2. Shared segments with Matches who are closer cousins – they share a Common Ancestor with you who is closer than the 6G grandparent who created your TG segment. Note: it would be very unlikely for you to have more than three exact 4th or 5th or 6th cousins (each one from a different child) from the same xG grandparent on one segment (i.e. in a TG). More on this later. These closer cousins would share a Most Recent Common Ancestor (MRCA) who is in the direct path of descent from the 6G grandparent (who created the TG segment) down to you. Because of this they also share the 6G grandparent ancestor with you. But, in this case, you will both descend from the same child of the 6G grandparent, and that does not count against the “limit” of three 7th cousins from a specific ancestor.
  3. Shared segments with Matches who are more distant cousins – they share a Common Ancestor with you who is ancestral to your 6G grandparent who created your TG segment. This frequently happens when the shared segment is less than the full TG segment from the 6G grandparent (for example). So these Matches could be 8th or 9th or 10th cousins or even more distant. They descend from a smaller “sticky” segment which is passed down to them (along a specific path) and down through your 6G grandparent to you, along the same path as categories 1 and 2 above. So you might be 9th cousins on an 8G grandparent, who is a grandparent of your 6G grandparent. In fact, when your shared segment (in a TG) is in the 7 to 10cM range, the odds are roughly 80% that the Common Ancestor will be more than 10 generations (9th cousins) back. That’s the bad news – most of these smaller shared segments will be at the far reaches, or beyond, our genealogies. The good news is that roughly 20% of the Matches (sharing 7 to 10cM segments) will be closer – spread among 1st to 8th cousins. For shared segments in the 10 to 20cM range, the odds are roughly 60% that the Common Ancestor will be more than 10 generations back – which means 40% of those Matches will be closer cousins. In fact, with 100 shared segments in a TG in the 10-20cM range, you’d probably have two or three 3rd cousins, two or three 4th cousins, two or three 5th cousins, two or three 6th cousins, etc. You can see a graph of these probabilities in this article at ISOGG.
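To put the rough odds from category 3 into numbers, here is a small Python sketch. It uses only the two approximate percentages quoted above; the dictionary and function names are made up for illustration, and the results are ballpark figures, not predictions.

```python
# Rough share of Matches whose Common Ancestor is more than 10 generations
# back, by shared-segment size (approximate figures quoted in the text).
BEYOND_10_GENERATIONS = {"7-10cM": 0.80, "10-20cM": 0.60}


def expected_closer_matches(n_matches: int, size_range: str) -> int:
    """Approximate number of Matches within 10 generations."""
    return round(n_matches * (1 - BEYOND_10_GENERATIONS[size_range]))


print(expected_closer_matches(100, "7-10cM"))   # 20
print(expected_closer_matches(100, "10-20cM"))  # 40
```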

100 Matches on a TG!

So the next issue is: can we have 100 Matching cousins in a TG? The answer is absolutely!  The odds are almost nil that more than 4 children will pass down the same matching segment to cousin descendants. Hence the caution that we almost certainly cannot have more than 4 cousins, from different children of a specific ancestor, passing down matching shared segments to cousins in one TG (i.e. on one segment). Each child inherits a different mix of DNA from the parents – rarely will more than four of them get the same DNA, and much more rarely would their descendants wind up sharing that DNA on the same segment (i.e. in a TG).

However – 100 Matches can easily descend from an ancestor 10 generations back without violating that concept. You and a sibling could share the same segment with two siblings who are 1st cousins from a grandparent. Similarly 4 Matches could be 2nd cousins with no more than two children from each ancestor. And 8 more Matches at 3rd cousin level – up to 512 Matches at 9th cousin level (10 generations back). And you could double that number by adding a once-removed cousin at each level. All without using more than two children at each generation. And this is all within a genealogical timeframe of 10 generations. If you consider that 80% of the Matches sharing 7-10cM will be well beyond the 10 generation level, the potential number of Matches in a TG becomes quite large. Of course this means the Matches are not all exactly at the same cousinship level – they are very likely to be spread over a range.
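The doubling in that paragraph can be checked with a few lines of Python. This only reproduces the illustrative count (two children at each generation, so the possible Matches double at each cousin level); it is not a model of how DNA is actually inherited, and the function name is made up.

```python
def max_matches_at_level(cousin_level: int) -> int:
    """Possible Matches at a cousin level, doubling each generation."""
    return 2 ** cousin_level


per_level = {n: max_matches_at_level(n) for n in range(1, 10)}  # 1C to 9C
print(per_level[1], per_level[2], per_level[3])  # 2 4 8
print(per_level[9])                              # 512
print(sum(per_level.values()))                   # 1022
print(2 * sum(per_level.values()))               # 2044, adding once-removed cousins
```

So 100 Matches in one TG is well within what two children per generation could produce over 10 generations.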

A large number of Matches in a TG is not a pile-up!

So clearly a large number of Matches in a TG should not sound an alarm. Nor should it arbitrarily define the TG as a pileup area, or require that all the Matches be discarded. Nonsense! Large TGs, with Matches matching each other on overlapping shared segments, just indicate that many of the Matches will be more distant cousins. Some Matches will be beyond the genealogies of some – it depends on each genealogist and how far back their Tree goes, and the same for their Matches. But, statistically, there will be some closer cousins in the mix of Matches in the TG. We won’t know which are the closer cousins until we share with them.

The ancestor who “created” your TG segment

In the above discussion I note the ancestor who “created” the TG shared segment. This means the most distant ancestor who had that full segment. His or her ancestors did not have that full segment. His or her parent created (at least) that full segment from their parents. And the segment you inherited from that ancestor is all or, probably more likely, a subset of the segment that ancestor passed down, and eventually got to you. We are talking about the unique segment you inherited from that ancestor, and that segment did not exist in any ancestor further back. The TG segment is unique to you and your ancestor.

Your TG/Segment is unique to you

The ancestor who created your TG segment, probably passed down various overlapping segments to your Matches. These segments may be larger or smaller than the segment you got. What you “see” in a shared segment, is the part your Match got that overlaps the segment you got. So from a Match’s perspective, there is almost certainly a different segment from the Common Ancestor than you got – the Match would have a different TG than you from this same ancestor. Or some of your Matches may get segments from a more distant ancestor. These segments would be smaller and, when combined with other DNA, would result in the DNA segment that your ancestor passed down to you. Thus, those other Matches would be more distant cousins, on a smaller segment than what you got. There are many scenarios that would result in a Match (sharing a segment with you) being a more distant cousin – refer back to alternative 3 above.

The name of the atDNA game is sharing and collaboration

This accounts for some Match cousinships which can be pretty distant.  I have to emphasize that, although some of the TG Matches will be fairly distant (and often beyond your genealogy), some Matches in each TG are probably within a genealogical timeframe. This brings up two important points when using TGs:

  1. Contact every Match you can – you never know which ones may have a Common Ancestor you can identify.
  2. Encourage Matches within a TG to share Trees among the group. Some Matches in a TG may be closer cousins to each other than they are to you, and they may in fact have a Common Ancestor who is ancestral to the ancestor of your TG. Sharing and collaboration are the name of this game.

 

15A Segment-ology: Understanding and using TGs by Jim Bartlett 20160917

 

Anatomy of a TG

This is another blog post that gives you some idea of what to expect with autosomal DNA and your segments. In this post we’ll look at the formation of a TG. We’ll walk through the steps:

  1. Start with overlapping segment data
  2. Simplify the data by rounding
  3. Sort by Chromosome and Start location
  4. Then Triangulate the segments (no genealogy required)
  5. Highlight one of the two resulting TGs
  6. Show this data graphically – like you’d see in a chromosome browser
  7. Overlay the total TG
  8. Then use our imagination and x-ray vision (or GEDmatch) to show what the ancestral segments of the Matches might look like
  9. Do some analysis…

Ready?

Figure 1. Some overlapping segment data

10B Figure 1

Letters represent Match names – data is taken from my spreadsheet.

Figure 2 – divide the Start/End locations by 1000000

10B Figure 2

It’s much easier to read the Start/End locations in Mbp; and it’s just as accurate for genealogy.

Figure 3 – the data is sorted by Chromosome and Start location

10B Figure 3

This makes it much easier to see overlapping segments.
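If you keep your segment data outside a spreadsheet, the rounding and sorting steps in Figures 2 and 3 look like this in Python. The rows are made-up sample data, the field names are hypothetical, and rounding to one decimal place is an assumption made for this sketch.

```python
# Made-up sample rows: Match name, chromosome, start and end in base pairs.
segments = [
    {"match": "A", "chrom": 16, "start_bp": 52_345_678, "end_bp": 61_234_567},
    {"match": "B", "chrom": 5, "start_bp": 10_900_000, "end_bp": 25_400_000},
    {"match": "C", "chrom": 16, "start_bp": 8_120_000, "end_bp": 19_870_000},
]


def to_mbp(bp: int) -> float:
    """Divide by 1,000,000 and round to one decimal place."""
    return round(bp / 1_000_000, 1)


rows = sorted(
    (
        {
            "match": s["match"],
            "chrom": s["chrom"],
            "start": to_mbp(s["start_bp"]),
            "end": to_mbp(s["end_bp"]),
        }
        for s in segments
    ),
    key=lambda r: (r["chrom"], r["start"]),  # Chromosome first, then Start
)

for r in rows:
    print(r["chrom"], r["start"], r["end"], r["match"])
```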

Figure 4 – this shows the results of Triangulation into groups 16A and 16B.

10B Figure 4

No genealogy was involved in this process – it’s purely a matter of comparing segments at 23andMe or GEDmatch; or looking for ICW Matches in this list and each ICW list at FTDNA. Again, this is real data from my spreadsheet. Often there is more mixture between the two TGs, but I hope you get the idea.
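The grouping step can be sketched in Python. This is a simplified illustration under stated assumptions: the segment rows and the set of Match pairs who also match each other (icw_pairs) are made-up inputs that you would have to collect yourself from the comparisons described above, and the function names are hypothetical. A real Triangulation should confirm that the Matches share the same overlapping segment with each other, which this sketch takes on trust.

```python
def overlaps(a, b):
    """True if two segments are on the same chromosome and overlap."""
    return (a["chrom"] == b["chrom"]
            and min(a["end"], b["end"]) > max(a["start"], b["start"]))


def triangulate(segments, icw_pairs):
    """Group overlapping segments whose Matches also match each other."""
    groups = []
    for seg in sorted(segments, key=lambda s: (s["chrom"], s["start"])):
        for group in groups:
            if any(overlaps(seg, other)
                   and frozenset((seg["match"], other["match"])) in icw_pairs
                   for other in group):
                group.append(seg)
                break
        else:
            # No existing group fits, so this segment starts a new one.
            groups.append([seg])
    return groups


# Made-up data: four overlapping segments on Chr 16 (locations in Mbp).
segments = [
    {"match": "G", "chrom": 16, "start": 40, "end": 60},
    {"match": "H", "chrom": 16, "start": 42, "end": 58},
    {"match": "T", "chrom": 16, "start": 41, "end": 50},
    {"match": "J", "chrom": 16, "start": 43, "end": 52},
]
icw_pairs = {frozenset(("G", "H")), frozenset(("T", "J"))}

groups = triangulate(segments, icw_pairs)
for group in groups:
    print([s["match"] for s in group])
```

Here G and H land in one group and T and J in the other, even though all four segments overlap, because the two pairs do not match each other. That is the two-TG result (one per chromosome of the pair) that Figure 4 illustrates.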

Figure 5 – Here is only TG 16B data

10B Figure 5

It’s still arranged by Chromosome and Start location.

Figure 6 – Same data and the shared segments displayed graphically

10B Figure 6

This is how you’d see the data in a chromosome browser. Note the top 11 bars will all match each other. The bottom bars will usually all match each other too, and they’ll usually also match the top 8 bars, but maybe R and N will not match at the 7cM level at GEDmatch. Just lower the level to 500 SNPs and 5cM and you’ll find there is enough for a Match. Let’s see what the TG for this data looks like…

Figure 7 – Now the fun begins…

10B Figure 7

Usually the TG is pretty clear cut, but I’ve intentionally selected one with two kinds of ambiguity. In almost all cases the ends of the segments are fuzzy. You can read about Fuzzy Data in my blog post here.

Judgment is needed at this point. I’ve shown the “guaranteed” TG in red, with orange tips where the data looks fuzzy. I want to emphasize that this is NOT a problem for genealogy – the TG (wherever the true crossover points are that define the TG Start and End locations – somewhere in the orange areas) represents an ancestral segment from one of your ancestors. The fuzzy ends are not an issue. Your Matches will share a Common Ancestor with you – and that’s where the focus should be. The crossovers defining the real TG will be somewhere in the fuzzy orange tips.

You can also see that this data indicates a probable more distant crossover point – around say 51Mbp. In this case the top 8 Matches and Match S are probably closer cousins sharing a larger, closer segment with you. At this point you might want to review Crossovers by Generation here. Going back one or more generations we may see the large red TG being subdivided into two smaller ancestral segments – each with its own Common Ancestor, each of which is ancestral to the Common Ancestor for the red segment. In this case the last 8 Matches (T, J, Q, L, M, I, A and K) will have a different, more distant CA than Matches G and H. The main, red, TG may be from a 5G grandparent, and the smaller, green and purple segments may be from a 6G or 7G grandparent. Actually the purple segment, as an example, may pass intact through several generations, and you could share this same segment with 7C and 8C…

Again, most of your TGs will be tighter and from a single CA, but I wanted to take this opportunity to show what sometimes happens. You can avoid any conflict by watching for this situation and just declaring two TGs in this case – see the green and purple bars. Then it’s like the case where a close cousin spans more than one TG – the close cousin will help you define a larger segment from a closer ancestor, and the close cousin, along with different groups of Matches in different TGs will share more distant Common Ancestors with those TGs, but those more distant CAs will be ancestral to the MRCA the close cousin shares with you.

So now let’s use our imagination a little (or we could actually Triangulate this area from the perspective of some of our Matches). In this next Figure 8, I’ve guessed at what the ancestral segments might be from the Common Ancestor down to the different Matches – as shown in green.

Figure 8 – showing ancestral segments for all Matches

10B Figure 8

In some cases our Matches have somewhat larger ancestral segments than we have, or they might have segments that extend over one end of our ancestral segment, or the other. In all cases the blue represents the overlap between each Match’s ancestral segment and our own ancestral segment. And the data is not exact, so the ends often don’t line up vertically. The long green bar at the bottom of Figure 8 is a segment an ancestor passed down to living people – you got part of it, the red part.

At GEDmatch it’s often fun, and instructive, to compare two Matches to each other. Sometimes they turn out to be parent/child, or siblings. This is the exercise you’ll want to do if you are trying to map an ancestor.

So, again, if you have collected a lot of shared segments from FTDNA, 23andMe and GEDmatch, they have to go somewhere. It’s not hard to compare them to each other and see where they Triangulate. If they are IBD, they have to go on one chromosome or the other. When you do this you’ll find there are natural break points where the crossover points are located (often the precise location is a little fuzzy). Just look at the data above.

 

05D Segment-ology: Anatomy of a TG by Jim Bartlett 20160204