Here is a 3-step process for Triangulation: Collect, Arrange, Compare/Group.
- Collect all the Match-segments you can. I recommend testing at all three companies (23andMe, FTDNA, and AncestryDNA), and using GEDmatch. But, wherever you test, get all of your segments into a spreadsheet. If you are using more than one company, you need to download the data from each and then arrange it in the same format as your spreadsheet. Downloading/arranging is best when starting a new spreadsheet. Downloading avoids typing errors, but direct typing is sometimes easier for updates. I recommend deleting all segments under 7cM – most of them will be IBC/IBS (false segments) anyway, and even the ones which may be IBD are very difficult to confirm as such. You are much better off doing as much Triangulation as you can with segments over 7cM (or use a 10cM threshold if you wish), and then adding smaller segments back in later, if you want to analyze them. NB: Some of your closer Matches will share multiple segments with you – each segment must be entered as a separate row in your spreadsheet. The minimum requirement for Triangulation with a spreadsheet is columns for MatchName, Chromosome, SegmentStartLocation, SegmentEndLocation, cMs and TG. Most of us also have columns for SNPs, company, testee, and any other information of interest. Perhaps I need a separate blog post about spreadsheets… ;>j
- Arrange the segments by sorting the entire spreadsheet (Ctrl-A) by Chromosome and SegmentStartLocation. This is one sort with two levels – the Chromosome column is the first level. This puts all of your segments in order – from the first one on Chromosome 1 to the last one on Chromosome 23 (for sorting purposes I recommend changing Chromosome X to 23 or 23X so it will sort after 22). This puts overlapping segments close to each other in the spreadsheet, where they are easy to compare.
- Compare/Group overlapping segments. All of these segments are shared segments with you. So with segments that overlap each other, you want to know if they match each other at this location. If so, this is Triangulation. This comparison is done a little differently at each company, but the goal is the same: two segments either match each other, or they don’t (or there isn’t enough overlapping segment information to determine a match). All the Matches who match each other will form a Triangulated Group, on one chromosome – call this TG A (or any other name you want). Go through the same process with the segments that didn’t match TG A. They will often match each other and will form a second, overlapping TG, on the other chromosome – call this TG B. [Remember you have two of each numbered chromosome.] So to review, and put it all a different way: All of your segments (every row of your spreadsheet) will go into one of 4 categories:
- – TG A [the first one with segments which match each other]
- – TG B [the other, overlapping, one with segments which match each other]
- – IBC/IBS [the segments don’t match either TG A or TG B]
- – Undetermined [there are not enough segments to form both TG A and TG B and/or there isn’t enough overlapping data to determine a match.]
- NB: None of the segments in TG A should match any of the segments in TG B.
- At GEDmatch – the comparisons are easy. Just compare two kit numbers using the one-to-one utility to see if they match each other on the appropriate segment. The ones that do are Triangulated. You may also use the Tier1 Triangulation utility or the Segment utility. I prefer using the one-to-one utility and Chrome.
- At 23andMe you have several different utilities:
- – Family Inheritance: Advanced lets you compare up to 5 Matches at a time. You may also request a spreadsheet of all your shared segments; sort that by chromosome and SegmentStart, and check to see if two of your Matches match each other. The ones that do are Triangulated.
- – Countries of Ancestry: Sort a Match’s spreadsheet by chromosome and SegmentStart, search for your own name, and highlight the overlapping segments. The Matches on this highlighted list who are also on overlapping segments in your spreadsheet are Triangulated (the CoA spreadsheet confirms the match between two of your Matches)
- At FTDNA it’s a little trickier, because they don’t have a utility to compare two of your Matches. So the most positive method is to contact the Matches and ask them to confirm whether or not they match your overlapping Matches. The ones that do are Triangulated. An almost-as-good alternative is to use the InCommonWith utility. Look for the 2-squiggly-arrows icon next to a Match’s name, click that, and select In Common With to get a list of your Matches who also match the Match you started with. Compare that list of Matches with the list of Matches with overlapping segments in your spreadsheet. Matches on both lists are considered to be Triangulated. Although this is not a foolproof method, it works most of the time. And if you find three or four ICW Matches in the same TG, the odds are much closer to 100%. Remember, every segment in your spreadsheet must go in one TG or the other, or be IBC/IBS, or be undetermined. If a particular Match, in one TG, is critical to your analysis, then try hard to confirm the Triangulation by contacting the Matches.
- AncestryDNA has no DNA analysis utilities. You need to convince your Matches to upload their raw data to GEDmatch (for free) or FTDNA (for a fee), and see the paragraphs above.
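For readers who keep their segments in a script rather than a spreadsheet, the Collect and Arrange steps, plus a first pass at Compare, might be sketched as below. This is only an illustration: the `Segment` fields mirror the minimum spreadsheet columns listed above, while the function names and the sample rows are made up for the example. The overlapping pairs it finds are only candidates; each pair still has to be compared at the testing company or GEDmatch to see whether the two Matches match each other.

```python
from dataclasses import dataclass
from itertools import combinations

MIN_CM = 7.0  # threshold recommended above; raise to 10.0 if preferred


@dataclass
class Segment:
    match_name: str
    chromosome: str  # "1".."22", or "X"
    start: int       # SegmentStartLocation
    end: int         # SegmentEndLocation
    cm: float
    tg: str = ""     # filled in once the segment is Triangulated


def chrom_key(chromosome: str) -> int:
    """Treat Chromosome X as 23 so it sorts after 22."""
    return 23 if chromosome.upper() in ("X", "23", "23X") else int(chromosome)


def collect_and_arrange(segments):
    """Drop segments under MIN_CM, then sort by Chromosome and StartLocation."""
    kept = [s for s in segments if s.cm >= MIN_CM]
    return sorted(kept, key=lambda s: (chrom_key(s.chromosome), s.start))


def overlapping_pairs(segments):
    """Yield pairs of different Matches whose segments overlap on one chromosome."""
    for a, b in combinations(segments, 2):
        same_chrom = chrom_key(a.chromosome) == chrom_key(b.chromosome)
        different_match = a.match_name != b.match_name
        if same_chrom and different_match and a.start < b.end and b.start < a.end:
            yield a, b


# Made-up rows for illustration only.
rows = [
    Segment("Match C", "X", 5_000_000, 20_000_000, 12.0),
    Segment("Match A", "3", 20_000_000, 45_000_000, 25.1),
    Segment("Match B", "3", 30_000_000, 50_000_000, 18.4),
    Segment("Match D", "3", 31_000_000, 33_000_000, 4.2),  # under 7cM: dropped
    Segment("Match E", "1", 1_000_000, 9_000_000, 9.5),
]

arranged = collect_and_arrange(rows)
candidates = [(a.match_name, b.match_name) for a, b in overlapping_pairs(arranged)]
print([s.match_name for s in arranged])
print(candidates)
```

With these rows, Match A and Match B are the only pair worth checking against each other; Match D has been set aside as a likely false segment.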
Comments to improve this blog post are welcomed.
10 Segmentology: How to Triangulate; by Jim Bartlett 20150511
Nice blog post
Harley – thanks for the kind feedback. Jim
Jim
I am having some trouble triangulating on Chr16. The provider is MyHeritage, to which I uploaded all 6 kits–uncle and 5 cousins. I’m trying to make grandparents, so I expect 4 lines. It is easy to work out siblings; if they have different matches from the same parent at locA, then one is grandmother, the other grandfather. But working out crossovers in the same grandparent seems not so straightforward.
How much overlap–in either cM or Mbp–do you consider a triangulation? I am asking about leading or trailing edge overlaps in this case.
Also, I regularly get matches that “triangulate” using the MyH browser on a subsection of their depicted overlap: front end, back end, middle, anywhere, sometimes 2 small ones. It could be two pieces each reported as 30 cM, entirely overlapped, but partially triangulated. So, I gave up using the browser, at least as my first pass–now, I try to find matches that match another member of the family in that spot. Then I try to backfill gaps using MyH triangulation. But still I have some irregularities pop up. It’s making my head buzz.
Can you reduce this to a simple decision making process for me? Should I map only the pieces that show as triangulated in MyH, instead of the bigger reported segment? (I am not speaking of extra length fore or aft, in this case.) Should I set the standard for a triangulated segment higher than the default 2 cM?
And, do we have access to all our matches on MyH? I’d think most matches of any significant length would fit within the 10K or so matches we each have…but I’d like to rule this out as a source of the problem.
Thanks!
Kathryn – MyHeritage has a little glitch. They allow uploading from several different companies which use different testing chips – so the 700,000 SNPs that are tested are not a one-for-one overlap. So they use imputation to “fill in” in order to create longer shared matches. So the Shared DNA Segments tend to be a little longer on both ends. But they stick to what is actually measured (in SNPs) when they determine segment Triangulation overlaps. That’s why you’ll often see the Triangulation “loop” smaller than the Shared DNA Segments. This happens on a smaller scale at the other companies because there is no “signpost” in our DNA to signal where one Ancestor’s DNA stops and another Ancestor’s DNA starts. The process of matching SNPs just keeps going until there is no more match – but that may include some amount of zig-zag between the SNPs from both chromosomes. So the ends are fuzzy. For TGs, we should use the TG segments (or just accept that the ends are fuzzy). The important thing is a long string of SNPs (a long segment) that comes from an Ancestor. I usually decide on the start location that incorporates all of the segment (including some fuzziness), and let the end locations overlap the start of the next TG. Unless we are trying to pinpoint a specific medical SNP near a crossover, the actual crossover points are not critical. What’s critical is a big hunk of DNA from one Ancestor, followed on the chromosome by a DNA segment from a different Ancestor (who may be close or distant).
Generally, grandparent segments in your DNA will be large – about 34 of them on each side to cover 23 chromosomes. So, generally, the crossovers in you will be pretty spread out. Your sibling will get their own 34 segments on each side, and the crossovers will be randomly different – some may be close to yours and some won’t.
Absolutely set the triangulation threshold above 2cM!! I’d use 8cM as a minimum (or even 10 or 15cM). All of your Shared DNA Segments at MyHeritage are available to you. You can easily download a spreadsheet with all of the Match names and shared segments – on your home DNA Match page, click on the 3-dot ellipsis on the right just over the Match list.
Jim
Thank you very much, Jim. I suspected the SNP comparison might be at the root of the problem, and am very glad to have my suspicions confirmed with fact.
I am not just creating my grandparents as its own exercise. I am trying to identify one of my 2g-grandmothers–my uncle’s paternal g-grandmother. Her name is lost in the mists of time…so finding those crossovers seems important. Based on the DNAPainter tool, with all those kits I have about 25% of Rebecca UNK French’s to locate.
Here’s a specific conundrum I was puzzling: Three of us have a long segment crossing the centromere called K; one of us has a 6.6 cM at the very front of the segment (also K). Should I believe the short segment is IBS? Seems rather unlikely to me, but if I set my threshold to 7, that little conundrum goes away…
Again, thank you so much for all your help. It’s very good to have someone with loads of experience to check in with.
Kathryn,
First – a 6cM shared segment is very likely a false segment, and over the centromere is iffy (not many SNPs there). Whenever I get a “problem child” segment, I set it aside, and focus on all the other segments – time is too precious….
Second – there are two approaches to finding a bio-Ancestor: 1. segment Triangulation to group Matches; ID likely TGs, and then look at the Trees for the Matches in those TGs – the problem being that most of the Matches with segment info, don’t have good Trees; 2. use Shared Match Clustering at AncestryDNA where there are more good Trees. I’ve had more luck with the latter, because in the end we need Matches (in a group) with Trees we can review for a consensus bio-Ancestor.
Jim
Kathryn, Note that, roughly, 1/16 of your Matches will be from each of your 2xG grandparents – so there should be a lot of info to go by. I’d almost start with a Leeds grouping of 4 grandparents (using 90-300cM Matches), and then gradually add more, slightly smaller Matches, again focusing on the one relevant grandparent, and then continue with even more Matches to fill out groups (Clusters) – one of which would be for your target 2xG grandparent. Jim
Hi Jim,
I am trying to create TGs on Chr 17 and assigning them to GP or GF. I am using my brother’s and our 1C’s kits at the moment. I know they share a segment in a certain area, but I cannot find any matches in common there. Not sure what is going on–I could assume there are no common matches in that zone, but…
When grouping with FTDNA, I discovered an interesting result, depending on which screen I use. If I choose ICW “Leo Martin” from my 1C’s match page and search for my brother, I get no result. (Which might be okay, considering my brother does not have “Leo Martin” in his own match list.) BUT–if I do the same ICW “Leo Martin” search from the chromosome browser, my brother shows up in the ICW list. (This could be quite handy for sorting matches, if true.)
Not sure what to make of this.
Thanks!
I would not use the Chromosome Browser to find ICW Matches – just because the browser shows an overlapping segment does NOT mean the two segments are on the same chromosome – they could be on the same Chr # but one on the paternal and the other on the maternal side.
I have a few, relatively small, DNA areas where I don’t have any shared DNA from a Match. And I have a number of TGs with no known Common Ancestor (and even if I had one CA, I cannot be sure that it’s correct – we need multiple CAs on the same line to be confident). Jim
Hi Jim
I’m not sure I understand the distinction between using ICW from the matches page and using ICW from the chromosome browser page. The chromosome browser appears to offer the same function, even if you ignore the visual representation. That is, I can still filter my match list by ICW. I already know they appear to overlap–I started with a dump into DNAPainter.
But, in any case, I think I should not get hung up on this little segment. I still cannot work out very reliably which pieces go to GP and GF. I might need to go away and come back later…
Kathryn,
ICW is not based on DNA segments and is not Triangulation. Several Matches who are mostly all on each other’s ICW lists form a Cluster and have a Common Ancestor. If these are also all on the same/overlapping segments, they would be a Triangulated Group.
Overlapping segments, by themselves, are not a TG – we have to determine whether the Matches also match each other on that segment (verify that third side of the triangle). Can’t do that at FTDNA, unless you ask Match A if they match Match B on the same segment. However, if you have overlapping segments AND the Matches are on each other’s ICW lists, there is a very high probability that the shared segments are Triangulated.
This is a corrected post – the original was a garble from my iPhone. Jim
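The "overlapping segments AND on each other's ICW list" rule of thumb amounts to intersecting two lists. A minimal sketch, with made-up Match names and a hypothetical helper called `likely_triangulated`; neither comes from FTDNA, and the result is a strong hint rather than proof:

```python
def likely_triangulated(icw_names, overlapping_names):
    """Return Matches that are on the ICW list AND have an overlapping segment.

    ICW only says two Matches match each other somewhere, not necessarily
    on this segment, so treat the result as probable, not confirmed.
    """
    return sorted(set(icw_names) & set(overlapping_names))


# Hypothetical data for one starting Match ("Match A"):
icw_with_match_a = ["Match B", "Match F", "Match G"]   # from the ICW list
overlaps_match_a = ["Match B", "Match C", "Match G"]   # from the spreadsheet

probable = likely_triangulated(icw_with_match_a, overlaps_match_a)
print(probable)  # ['Match B', 'Match G']
```

Match C overlaps but is not ICW, so it is a candidate for the other chromosome (or a false segment); Match F is ICW but shares DNA somewhere else.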
Hi Jim,
I referred to some weirdness at the tail end of my and L’s shared X-DNA.
Overall, all the X maps appear to have 3 large TGs.
1) I have a known 2C1R on GM’s side. She shares a small segment at 105-119 Mbp. My 3rd paternal TG starts at 110 Mbp. But she will not triangulate with that match or any match. It seems improbable to me that she belongs to my maternal X.
2) Here is my share with uncle L. This looks out of sync with the overall picture, to my eye.
K / L X 134537928 150557918 33.02 2016 Half
K / L X 90697269 109065690 15.32 1413 Half
3) Introduction to J (male again)
GP –> son (L)
GP –> son (C) –> dau –> J (son)
Here is the X-DNA share:
J / L X 134537928 152503241 36.65 2262 Half
J / L X 134069784 152877997 37.83 2351 Complete
I am suspicious that GM had a common segment at the tail end of her two X-chromosomes. (GM’s two sets of grandparents had the same surnames, although I have not been able to establish their relationships to one another.) But what do you think is going on in this situation?
Kathryn, Although Chr X has some unique characteristics, the fact that a male X always passes intact means that any Common Ancestors are more distant than they would be on other chromosomes. And Chr X cannot be passed down with two males in a row. Other than that, Chr X is much like other chromosomes. For these reasons I wouldn’t put too much analysis into Chr X segments. If your GM had two identical segments on Chr X (in other words, her DNA there is a Full Identical Region (FIR) instead of the Half Identical Regions (HIR) we normally see between Matches), this usually means some of her ancestors married each other. Jim
I understand. I have to think about your comment about the CA being farther back on the male side. I was particularly interested in GF’s X because it might give me a hint as to my missing 2g-GM. She has a given name and no surname.
Jim,
It looks like half my questions never made it to post. The earlier one I referred to was about some strange goings on at the tail end of uncle L’s X. When I get my wind up to write it all out again, I’ll ask it again… I suspect my GM had the same segment on both X’s in that location…
I’ve tried to answer each one… Jim
No no, I think the problem is in the transmission. I have to get smarter about WordPress or something. I write then log in when prompted. But I think I should try it the other way round.
Hi Jim,
Might have been my use of Firefox. Just tried with Chrome and no issue.
Hi Jim,
I am monopolizing your time, I fear…
I need an external head shake, please. Please check my logic…
My father and L are brothers. They each got a recombined version of my GM’s X chromosome. They will not be the same but they could share some regions. My dad had only one X chromosome, so I got it unrecombined. As luck would have it, I share almost nothing on X with uncle L (excepting the weird stuff from my earlier question), so, between us, we have the full set of matches on both of GM’s X. That’s check #1.
Now, my cousin T, whose MOTHER, R, was my Dad’s sister, also got one X. It is a recombination of R’s 2 chromosomes, one of which is a straight copy of GF’s only X. So, if T shares nothing with uncle L and me, then did he inherit a complete X chromosome from GF? Am I really that lucky?? Check #2.
Kathryn – So your father got one X which was your GM rX1; L got one X which was your GM rX2; and R got 2 Xs: one was your GM rX3 and the other your GF X. Now T (who must be male to get only one X) got r(GF X & GM rX3). When compared with you and L, we can discard the component R got from GF, because neither you nor L could have gotten that. So you and L don’t share any X, so you must have gotten completely different halves from GM. The only thing to compare with T is the rX3 portion from GM. If there is no match with GM’s X then R must have passed 0 of rX3 from GM to T, meaning that R must have passed the full GF X to T. Technically this is possible; but the probability is low. Jim
I’m a little afraid I have again been guilty of asking questions you have addressed elsewhere…
Kathryn; Don’t worry about it – I usually develop an answer from scratch, which may have a different twist on the question… and… I’ve sometimes developed better explanations. Jim
Hi Jim, I know I’m tacking onto a post from a few years ago, but here is as good as any I suppose. I grasp the concept of tracing cousins with shared segments back to a MRCA “couple”, but how do we decide whether the male or the female or the pair is the CA? I am supposing that it takes finding cousins further out (2nd, 3rd, 4th, etc.) to show which line a segment follows. A 1st cousin really isn’t enough to show much more than which pair of grandparents we are looking at, and not necessarily that it is from grandpa vs grandma, correct?
You have talked about “pointers” before and if I remember right they were the 1st cousins, aunts and uncles, etc that at least pointed to one side or the other of our grandparents. Should we not really pay too much attention until we really start pointing out great grandparents and further back?
I hope I have asked this so that it’s understandable.
You are correct that we don’t know who in an MRCA passed down the DNA. A match with a more distant cousin will tell you.
Pointers are where you find them. My maternal uncle provides a Pointer to the maternal side; close cousins provide Pointers out to the MRCA couple. Beyond 3C or 4C – be careful: the cousin might be related more than one way, giving more than one Pointer, only one of which might be correct.
Teresa – my typo: GA should have been TG. My TG ID system is as follows. Example 03C24 – this means chromosome 3; starting somewhere from 20 to 30Mbp [that’s the C – a letter for each 10Mbp]; 2 is ahnentafel for P (my father); 4 is ahnentafel for his father. Some people, instead of my 24, 25, 36 and 37, use PP, PM, MP and MM. When you first start out and form 2 overlapping TGs you might not know the side – you could call them 03Ca and 03Cb, until you figure it out. When I send an email or message, I often include the TG ID – particularly if there is some substance to the correspondence – I can then save a copy in that folder, or Evernote, etc. This collects all the notes about that TG under the same ID. I have some TG correspondence that spans 5 years…
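For anyone who wants to generate IDs in this style automatically, here is a small sketch. The function name `tg_id` and its parameters are invented for the example, and treating a start of exactly 10Mbp as B (rather than A) is an assumption, since the description does not say which side the boundary falls on.

```python
def tg_id(chromosome: int, start_bp: int, side: str) -> str:
    """Build a TG ID such as '03C24'.

    chromosome -- chromosome number (use 23 for X)
    start_bp   -- TG start location in base pairs
    side       -- ahnentafel code such as '24' (father, then his father),
                  or a placeholder like 'a' / 'b' while the side is unknown
    """
    # One letter per 10Mbp: A covers 0-10Mbp, B covers 10-20Mbp, C covers 20-30Mbp...
    letter = chr(ord("A") + start_bp // 10_000_000)
    return f"{chromosome:02d}{letter}{side}"


print(tg_id(3, 24_500_000, "24"))  # 03C24
print(tg_id(3, 24_500_000, "a"))   # 03Ca
```

The two-digit chromosome prefix keeps the IDs in chromosome order when a folder or spreadsheet sorts them as text.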
Thank you!
Teresa – I’ll try; but I’m not sure I’ve found the same Examples you have. They have tried to develop a unique ID number for each GA. The examples I saw had the Chromosome number first as in C8; and then the total cM (not just the cM for this TG). The numbers on the bottom line in the boxes are just a programming tool to place the boxes. I recommend a letter code for the start of each TG (any start location from 1 to 10Mbp is A; 10-20Mbp is B; etc.). It’s very difficult to do this by computer algorithm, because close cousins have segments that span more than one TG. I use the TG tool to check my TGs (as in Quality Control). I compare the GEDmatch report against my spreadsheet, to ensure there are no anomalies. IMO, there is not much from the GEDmatch report that you can use, except the fact that the kits Triangulate. Jim
Thank you! So you do generate a unique number for each – not sure what GA stands for? How do you handle the cousins with segments spanning more than one TG?
Can you explain what the numbering system is on the Gedmatch triangulation group report: specifically, what From: refers to? Example: From O28 TG V2; From P29 TG: V2…. Thanks in advance!
Just curious, how do I determine which chromosome is paternal and which is maternal? Reading the above, it dawned on me why, in a chromosome browser, I have two folks in a row who look like they’re overlapping but don’t actually match each other. Any help appreciated; it will most likely be very revealing for me. Thx
Rich,
The primary way is through genealogy – finding a known cousin in a Triangulated Group. This lets you know which side (chromosome from a parent) the TG is on. If your parents have very different ethnicities (like Jewish and non-Jewish), you can often tell by the Matches. My stepson is half Jewish/Poland and half British – I can often tell by the Match surnames in each TG which is which….
Jim,
I am a novice here but think I have a fairly good grasp on the basics of triangulation and its benefits, i.e. significant overlap and ICW, help identifying maternal or paternal segments, etc. I am getting hung up on doing the TG A, TG B, IBC, U groupings. Does a match have to match every other match in a TG A or TG B? If not, is it sufficient to match just one other match, or do there need to be some other interconnections other than their matching you? I am thinking that in a large group there would be groups of matches who have a common MRCA, with these groups converging at some point on a CA. Am I heading in the right direction here, or do I need to rethink all of this?
Tammy
Tammy – you are on the right track. If a segment overlaps another by the threshold amount, they should match every such segment. If your segments are “staggered” over the TG, the overlap of some may not be big enough to form a match. So you have to use judgment here. If two segments look like they should match, but don’t, you have to determine why not. It’s your own DNA and puzzle – don’t try to force it, or try to fool yourself. When in doubt, I look for other clues.
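One way to apply this rule systematically is to list the pairs in a proposed TG that overlap enough to be expected to match, but for which no match has been confirmed. The sketch below is an illustration under stated assumptions: the function names and sample data are invented, and the overlap is measured in base pairs with `MIN_OVERLAP_BP` as a crude stand-in for a cM threshold, because converting positions to cM needs a genetic map that a simple spreadsheet does not carry.

```python
from itertools import combinations

MIN_OVERLAP_BP = 7_000_000  # assumption: rough proxy for a 7cM overlap


def overlap_bp(a, b):
    """Length in bp shared by two (start, end) ranges; 0 if they do not overlap."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def unexplained_pairs(tg_segments, confirmed):
    """Pairs that overlap enough to be expected to match but are not confirmed.

    tg_segments -- {match_name: (start, end)} for one proposed TG
    confirmed   -- set of frozenset({name1, name2}) pairs known to match each other
    """
    problems = []
    for (n1, s1), (n2, s2) in combinations(sorted(tg_segments.items()), 2):
        if overlap_bp(s1, s2) >= MIN_OVERLAP_BP and frozenset((n1, n2)) not in confirmed:
            problems.append((n1, n2))
    return problems


# Made-up TG with "staggered" segments.
tg_a = {
    "Match A": (20_000_000, 45_000_000),
    "Match B": (30_000_000, 50_000_000),
    "Match C": (42_000_000, 60_000_000),  # overlaps Match A by only 3Mbp
}
confirmed = {frozenset(("Match A", "Match B"))}

print(unexplained_pairs(tg_a, confirmed))  # [('Match B', 'Match C')]
```

Here Match A and Match C are staggered too far apart to be expected to match, so they are not flagged; Match B and Match C are the pair that needs a closer look.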
For FTDNA data, the easiest way to find and compare matching segments is by using the ADSA (Autosomal DNA Segment Analyzer) report at dnagedcom.com. An online manual explains how to use the tool: https://www.dnagedcom.com/adsa/adsamanual.html.php
grahcom – I prefer my own spreadsheet, and haven’t used ADSA. But for those who do, that may be a good alternative method. NB: ADSA cannot compare two of your Matches by segment, it also uses the ICW process, which is correct most of the time. If it is a critical Triangulation, I’d recommend getting a confirmation from one of the two Matches that they match the other Match on the same segment. We know from ICW that they match the other Match somewhere, but for Triangulation you may want to confirm that it’s on the correct segment.
You’ve touched on this in a few previous comments, but are you able to quantify what constitutes overlapping? Is there a standard value that is accepted? When trying to form a TG you said that the start/end points don’t need to match exactly, but how much of an overlap is required to consider the segments part of the same TG: 75%, 50%, 10%? Obviously GEDmatch’s triangulation utility uses some threshold, but I’m curious what you use.
Thanks for your blog. I’m finding it very interesting.
Brian – I’ve used a 7cM threshold for a while. And it is not hard to find most matches that overlap by 7cM. But it’s not really critical – if they don’t overlap enough, they won’t triangulate. So the overlap is really a screening process.
The Match 1 segment is between you and Match 1. The Match 1 and 2 overlap is what all 3 of you share.
I understand that. My point is that the output of the Gedmatch triangulation utility doesn’t show where I match Match 1 or Match 2. My kit does not appear in the output list at all. Do I need to merge the triangulation output with my segment match output? Something else?
June – I now see your point. The Triangulation format has changed a little since I last used it. (I use a spreadsheet and compare all new segments, one by one. From time to time I check using the GEDmatch Triangulation utility, but have not found anything different with that utility in quite a while. It, too, is just a tool.) Yes, to find your segments with Matches, you need to run the Segment utility for many of them, and then check the rest of your GEDmatch Matches on One-to-Many using the One-to-One utility.
I’m still having trouble assigning TGs. When you run a Gedmatch triangulation it shows the pairwise segment matches of your best 400 matches, but it doesn’t show the segment match of you to them. In other words, it shows Match 1 to Match 2 but it doesn’t show Match 1 to you or Match 2 to you. Presumably they each match you in the same general area, but is the Match 1 to Match 2 result sufficient to assign them to the same TG?
Jim, can you talk a bit more about the best way to keep track of your TGs. Should I add a TG column to the spreadsheet and fill in the appropriate notation after running the 3way comparisons? I’m also not sure of the spreadsheet implications for this comment “Also be sure to keep track of which Matches match each other in larger TGs, which may wind up splitting”
The best way is to group them by start location, and review the overlaps each time another Match segment is added. You can always add a column with notations. Watch for several blogs on spreadsheets.
Jim
Do segment beginning and ending points have to match exactly, or is overlap still a match? For example, AP and I match on a segment 1-2-3-4-5-6-7-8-9. CK and I match on that segment 3-4-5-6-7-8-9. SP and I match on the same segment 1-2-3-4-5-6. Are these all matches?
Thank you for all your help!
Erica,
Segments should overlap – they do not need to start and/or end at the same location. They only need to match each other (or be In Common With each other at FTDNA)
May I check my understanding of point 3 please? On the same chromosome number … …
1. If two or more people with segments that have exactly the same start and end locations, they go into TGA.
2. If other people have segments with different (to TGA) start & end locations, but overlapping or contained within the TGA start & end locations, they go into TGB, as they must be on the opposite (Paternal vs Maternal) of the chromosome pairs?
3. What about people who share the same start, but not end, location? Are they related to TGA? Some have the same surname as someone in TGA, so wondering if they have a smaller segment because it’s been ‘diced & spliced’ in a successive generation?
4. People with segments with identical start & end locations further down the same chromosome form TGC, but in reality could be the same people as in TGA (or TGB)?
Thanks for your blog posts , they are extremely helpful as this is my first foray into using atDNA to search for ‘unknown’ fathers in two successive generations.
Cheers,
Bev
Bev
Nowhere is there a requirement for shared segments to start and/or end at the same location. They need to overlap. AND they need to match each other for Triangulation. If you have two Matches with segments that overlap, you need to determine if the two Matches have overlapping segments which overlap your two segments with them. In other words the three pairs (you and Match 1; you and Match 2; and Match 1 and Match 2) must all have overlapping shared segments. It is not sufficient that both Matches have overlapping segments with you; they must also have overlapping segments with each other (or be ICW each other at FTDNA).
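The three-pair requirement can be written as a tiny check. This is a sketch only: the function names are invented, the positions use the simplified 1-to-15 style of numbering from these comments, and in practice you would also want the common range to meet your cM threshold.

```python
def common_overlap(*segments):
    """Return the (start, end) range common to all segments, or None."""
    start = max(s[0] for s in segments)
    end = min(s[1] for s in segments)
    return (start, end) if start < end else None


def is_triangulated(you_m1, you_m2, m1_m2):
    """True when all three pairwise shared segments cover a common range.

    m1_m2 is None when Match 1 and Match 2 do not match each other here.
    """
    if m1_m2 is None:
        return False
    return common_overlap(you_m1, you_m2, m1_m2) is not None


you_m1 = (1, 9)  # you vs Match 1
you_m2 = (3, 9)  # you vs Match 2
m1_m2 = (2, 8)   # Match 1 vs Match 2, e.g. from a one-to-one comparison

print(common_overlap(you_m1, you_m2, m1_m2))   # (3, 8)
print(is_triangulated(you_m1, you_m2, m1_m2))  # True
print(is_triangulated(you_m1, you_m2, None))   # False
```

The last call is the case to watch for: both Matches overlap with you, but without the third side of the triangle they are not Triangulated.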
Thanks very much Jim, that increases the size of TGs significantly 🙂 Also have some where start & end locations (in simplified terms), are 1-10, 1-8, 5-8, 5-10, 8-10, 8-15, 10-15. How would you place these in TGs?
Bev
Nominally, they go into a TG from 1-15. But it is likely to split into two TGs 1-10 and 10-15 (being a husband & wife), with the 8-15 being a closer cousin who spans both TGs. This is an evolving concept. The other option might be that some of the start/end points in the 8-10 region are fuzzy. I’m working on a fuzzy data/fuzzy segment blog…
Also be sure to keep track of which Matches match each other in larger TGs, which may wind up splitting.
Jim
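Using the simplified start and end locations from the question above, the "nominal" TG span and the segments that straddle a suspected split can be worked out as follows. This is a sketch with invented function names, and it only looks at positions: it does not confirm that the Matches match each other, which still has to be checked.

```python
def nominal_tg_spans(segments):
    """Merge overlapping (start, end) segments into nominal TG spans."""
    spans = []
    for start, end in sorted(segments):
        if spans and start < spans[-1][1]:  # overlaps the span being built
            spans[-1][1] = max(spans[-1][1], end)
        else:
            spans.append([start, end])
    return [tuple(s) for s in spans]


def spanning(segments, split):
    """Segments that extend on both sides of a suspected split point."""
    return [(s, e) for s, e in segments if s < split < e]


segs = [(1, 10), (1, 8), (5, 8), (5, 10), (8, 10), (8, 15), (10, 15)]

print(nominal_tg_spans(segs))  # [(1, 15)]
print(spanning(segs, 10))      # [(8, 15)]
```

The single 8-15 segment is what ties 1-10 and 10-15 into one nominal span, which fits the idea of a closer cousin spanning two TGs.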
Thanks Jim. Would love a blog post on searching for ‘unknown’ paternity events when you have time. My older family member has two successive generations of illegitimacy, so I have very few names to search for her! Thanks, Bev
Bev
Brick Walls is on the list. And the key is forming and working with your TGs! The DNA segments come from your ancestral lines, whether you know their names or not. The Matches in your TGs may well know CAs among themselves, that you don’t know – that you cannot get to because of a brick wall. The short answer is to encourage your TG Matches to collaborate with each other as well as you – and grasp any CA they come up with that you *don’t* have – particularly if it can be tied to the place/time of your Brick Wall.
Thanks Jim for your prompt reply. I’m eagerly awaiting each of your new blog posts, as the timing is perfect for me.
I’ve sorted the FTDNA start & end location data into Excel & at present am transferring across all the GEDcom values. It’s a slow laborious task!
Thanks again, Bev
Try downloading the Tier 1 Matching Segment file, copying it to a separate spreadsheet, rearranging the columns as appropriate and then adding it to your spreadsheet.
Jim
Thanks Jim, that was worth every cent of the $10! Also ran the triangulation tool successfully. Now to send more emails.
Is there anything else I can do with the data?
Cheers Bev
Bev
Triangulate every segment in your spreadsheet. That will form many TGs and help map most of your DNA. That task will keep you busy for a while. But the result will be a foundation for finding Common Ancestors.
Jim
Thanks Jim.
I’ve highlighted in colour / sorted all the shared segments into TGs. I also ran the chromosome browser in GEDmatch. How do you ‘map’ your DNA? You can tell I’m new to all this!
So far we’ve only been able to go back 3 generations on a quarter of her tree, and 4-5 on another quarter, with the assumption that I’ve selected the right person as her grandfather. The paternal part of her tree only consists of a conception window and location (which fortunately is an island, so is a discrete location – perhaps in both senses of the word!)
Cheers Bev
Bev
There are two types of mapping TGs. One is to physically lay them out and determine which side each one is on – maternal or paternal (this usually requires some genealogy). Two is to link each one to the Common Ancestor (requires a lot of genealogy – usually with your cousins in each TG). See also my Segments: Bottom Up blog post for a link to Kitty’s mapping program.
Jim
Mmm, any suggestions for if you have hardly any ‘paper’ genealogy? I don’t think I have enough information for either of your suggested methods.
We know the person’s mother, maternal grandparents & maternal aunts & uncles, and an educated guess at maternal grandfather (& his ancestors if my sleuthing is correct). That’s it!
I read your Bottom up blog, & will take another look at Kitty’s program.
Cheers Bev
Bev, you’re getting ahead of me. I’ll answer in a future post.
Jim
Thanks 🙂
Great blog Jim!
A question I have meant to ask is this:
You say sort on Chromosome and Start Point.
Why not sort on End Point as well?
You may – won’t hurt, but it adds very little to the grouping. It takes an extra step.
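For readers who keep their segments in a script rather than a spreadsheet, the two-level sort (with End Point as an optional third level) could be sketched in Python as below. The sample rows, field order, and the helper name `chromosome_number` are all made up for illustration; the X-to-23 mapping follows the suggestion in the post.

```python
# Hypothetical rows: (match_name, chromosome, start, end, cM)
segments = [
    ("Match C", "X", 5_000_000, 20_000_000, 15.2),
    ("Match A", "2", 40_000_000, 55_000_000, 12.1),
    ("Match B", "1", 10_000_000, 30_000_000, 22.4),
    ("Match D", "1", 10_000_000, 25_000_000, 18.0),
]


def chromosome_number(label):
    """Map '1'..'22' to integers and 'X' (or '23X') to 23 so X sorts after 22."""
    return 23 if label.upper() in ("X", "23X") else int(label)


def sort_key(row):
    name, chrom, start, end, cm = row
    # Level 1: chromosome, level 2: start, level 3 (optional): end
    return (chromosome_number(chrom), start, end)


ordered = sorted(segments, key=sort_key)
for row in ordered:
    print(row)
```

The third level only changes the order of rows that share the same chromosome and start location (Match D and Match B here), which is why it adds little to the grouping.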
Thanks Jim. I tried this eons ago and did not connect it to your comment. Time to go play
k
No. 5 Countries of Ancestry:
How are you generating the list you are talking about?
Could you explain this in more detail? I use an Excel spreadsheet for all my matches and understand it quite well. I have downloaded the Country of Ancestry from 23andMe. However, as the underlying person to all the matches my name does not come up in the list.
Karen,
Everyone you Share Genomes with at 23andMe (as well as others who have also taken the survey) has a spreadsheet at CoA. Your name is the default. Click on your name to get a drop-down list of your other Matches at CoA, and look at their spreadsheets – you should be in every one of them.
Thanks, you just taught me something new!
Katherine
Using the Tier 1 utility at GEDmatch for Segments, you get a large table of segment data which you can copy into a separate spreadsheet. Rearrange the columns to match your FTDNA spreadsheet, then copy them and add to your FTDNA spreadsheet.
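If you would rather not rearrange columns by hand, the same step can be sketched in Python. Everything named here is an assumption for illustration: the master column names, the GEDmatch header names in `GEDMATCH_TO_MASTER`, and the sample rows. Check the headers in your own files and adjust the mapping before using anything like this.

```python
import csv
import io

# Column order of the master spreadsheet (names are assumptions).
MASTER_COLUMNS = ["MatchName", "Chromosome", "Start", "End", "cM", "SNPs", "Source"]

# Hypothetical header names for a pasted GEDmatch segment table,
# mapped to the master spreadsheet's column names.
GEDMATCH_TO_MASTER = {
    "Name": "MatchName",
    "Chr": "Chromosome",
    "Start Position": "Start",
    "End Position": "End",
    "cM": "cM",
    "SNPs": "SNPs",
}


def convert_rows(csv_text, column_map, source_code):
    """Rename columns to the master names and tag each row with its source."""
    reader = csv.DictReader(io.StringIO(csv_text))
    converted = []
    for row in reader:
        new_row = {master: row[original] for original, master in column_map.items()}
        new_row["Source"] = source_code
        converted.append(new_row)
    return converted


sample = """Chr,Start Position,End Position,cM,SNPs,Name
1,10000000,30000000,22.4,4500,Match B
2,40000000,55000000,12.1,2600,Match A
"""
rows = convert_rows(sample, GEDMATCH_TO_MASTER, "Gm")

# Write the rows out in the master column order, ready to append.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=MASTER_COLUMNS, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Because the output is written in the master column order, the converted rows can be pasted or appended under the existing FTDNA rows and then re-sorted by Chromosome and Start.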
Hello – Thanks for the informative blog. I’ve been trying to educate myself – took a webinar from Blaine Bettinger and recently attended a talk by Cece Moore. The more times I hear information, the better I can understand it. Back up at the top of this post, what do you mean by “Collect all the Match-segments you can.” In other words, I’m sitting at my computer ready to start, and exactly what do I do? Thanks much. Katharine Ott
I think my main question is, am I doing this for every single person who matches me, or just in groups that seem to have shared segments?
You can work on every Match, or you can select certain ones of interest – it’s up to you.
My recommendation is to test at all 3 companies (each one has a database of their customers, and you can “collect” all that you match); and upload to GEDmatch where you can “collect” even more Matches and segments which are below the thresholds at the three companies.
Hi Jim – Thanks for your patient response. I have experimented some now and have a better handle on the procedure. I’ll be interested to see your Excel discussion though, as I’m just doing the basics. When I merge results from GEDmatch into the spreadsheet that ftDNA made for me, am I doing those by hand individually? — Katharine
I have at least two dozen full base matches that have exceptionally high SNPs. Compared at Gedmatch on one to one (the block shows green on the same chrome). Please tell me if this is normal.
Jim, on 27 April you wrote on genealogy-dna-request@rootsweb.com “Triangulation of segments over 5cM means they are IBD, virtually all of the time.” How do you square this with deleting everything under 7 cM even before you start?
So you know we are paying attention. LOL!
This is normal. To see if they Triangulate, you must compare them with each other.
Actually, if this is a “Full” match, then it is unusual. GEDmatch had problems, and is down now. Please confirm your matches after GEDmatch is back up.
Jim, so glad you are sharing your knowledge with us! I have taken some of the classes but still get lost in the spreadsheet application! I use Excel and certainly could use a blog post detailing how to set up the sheets! In Excel we used to have two views, list view and table view; whatever happened to that switch?
Great Blog, easy to understand!
Thank you
Margo
Margo, Thanks for the kind words and encouragement. Spreadsheets are on my short list – what data to include, how to get that data, updating the data, and tips on using the spreadsheet.
pm I hope you’ve also read the Benefits of Triangulation…
If two kits both match me on a particular chromosome, same segment beginning and ending, don’t they therefore necessarily also match each other? Or should I compare those kits to each other to see all the chromosome segments they match each other on?
Erica
Not necessarily. One of these segments may be on the chromosome you got from your father (and thus be from an Ancestor on his side) and the other segment may be on the chromosome you got from your mother (and thus be from an ancestor on her side). IF they match each other, they have to be on the same chromosome – that is Triangulation.
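To make the logic concrete, here is a minimal Python sketch of the check. It assumes you have already compared the Matches with each other (for example one-to-one at GEDmatch) and recorded which pairs match each other at this location; the names, positions (in Mbp for brevity), and the `match_each_other` set are invented. The grouping rule used here, that a segment must match every member already in a group, is a simplification chosen for the example.

```python
def overlaps(a, b):
    """True when two segments on the same chromosome share any positions."""
    return a["chrom"] == b["chrom"] and a["start"] < b["end"] and b["start"] < a["end"]


# Three Matches who all share an overlapping segment with you on chromosome 5.
segments = [
    {"name": "Ann", "chrom": 5, "start": 10, "end": 40},
    {"name": "Bob", "chrom": 5, "start": 12, "end": 38},
    {"name": "Cal", "chrom": 5, "start": 11, "end": 39},
]

# Hypothetical results of comparing the Matches with each other:
# only Ann and Bob match each other at this location.
match_each_other = {frozenset(("Ann", "Bob"))}


def triangulates(a, b):
    """Overlapping segments triangulate only if the two Matches also match each other."""
    return overlaps(a, b) and frozenset((a["name"], b["name"])) in match_each_other


groups = []
for seg in segments:
    for group in groups:
        if all(triangulates(seg, member) for member in group):
            group.append(seg)
            break
    else:
        groups.append([seg])  # starts a new group, e.g. on the other chromosome

group_names = [[s["name"] for s in g] for g in groups]
print(group_names)
```

Cal overlaps Ann and Bob on paper but does not match either of them, so Cal lands in a separate group, which is the situation described above where one segment is on the paternal chromosome and the other on the maternal one.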
I have downloaded the file of chromosome segment start & end locations for my matches, which is an option in the FTDNA chromosome browser. Then I sorted by chromosome and by segment start & end. For the matches that are an exact match or overlapping match on each segment, I then use the matrix to determine if they triangulate. I’m only working on segments over 10cM as I have just started.
Thanks so much for posting the details on this Jim as it validated my approach which at times had me wondering what the heck I was doing!
Looks like I just found another problem with Calc. Before I was just building a spreadsheet of my matches. Now that I know how to easily add a Source column, I went for the whole enchilada and downloaded the chromosome browser file. But when I opened it (Open Office Calc is selected automatically) I got an error message that I had exceeded the maximum number of lines and the excess would be deleted. I ended up with 85,502 lines. I have over 40 lines just for my segments. Forty segments times over 4200 matches suggests I could need on the order of 168,000 lines, twice what Calc seems to allow. Is Excel going to have a problem with Ashkenazi-size files too?
I then tried changing the minimum segment length in the browser to 10cM before I downloaded but it looks like I got exactly the same result. So I looked more carefully at what had been included and I’m not sure it really lost anything after all as the last match name shown was the same as the last name that appears in the browser list – in Hebrew (two before that were Cyrillic). I’d still like to know if there’s an easy way to shorten the file to include only segments of 10cM and more just to get a smaller file to work with for now. Can the short segments be filtered out somehow on the spreadsheet side as apparently they can’t be eliminated from the download to start with?
Any other suggestions for dealing with huge numbers of matches?
June
Most Matches will average about 5 to 10 segments. I don’t know a way to filter the full download. Excel should handle it all. Alternatively you can select and download 5 Matches at a time, and filter them in big batches.
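One possible workaround outside the spreadsheet, offered only as a sketch: a few lines of Python can drop the short segments from the downloaded .csv before it is ever opened in Calc or Excel. The column name `CENTIMORGANS` and the sample rows are assumptions; use whatever header your download actually has.

```python
import csv
import io


def keep_long_segments(csv_text, cm_column, min_cm):
    """Return the header and only the rows whose cM value is at least min_cm."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [row for row in reader if float(row[cm_column]) >= min_cm]
    return reader.fieldnames, kept


# Made-up sample in the general shape of a segment download.
sample = """MATCHNAME,CHROMOSOME,START LOCATION,END LOCATION,CENTIMORGANS
Match A,1,10000000,30000000,22.4
Match A,3,5000000,6000000,1.8
Match B,7,40000000,52000000,10.0
Match C,9,1000000,4000000,6.9
"""
header, long_rows = keep_long_segments(sample, "CENTIMORGANS", 10.0)
print(len(long_rows), "of 4 segments kept")
```

With a real file you would read the text from disk instead of the `sample` string and write `long_rows` back out with `csv.DictWriter`, giving a much smaller file to open.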
Jim, I realized after my last post that there was an easy way to do what I wanted. I simply sorted on Matching cM in descending order, selected all matches down to 20 cM, and then copied and pasted them to a new spreadsheet where I re-sorted things back to the usual order. I was originally going to go down to 10cM but that started running into thousands of rows so I thought I’d start out a bit more conservatively while I’m still feeling my way around.
It easily identified several other matches that have overlapping large segments with two people I already knew I share a 35+ cM segment with. Seems like the next thing I might want to do is add a column for calculated matching segment. I presume I will want that for triangulation or does the Gedmatch Triangulation tool do that part automatically?
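As an illustration of what such a calculated column might hold, the region two segments have in common is just the later of the two starts and the earlier of the two ends. This is a sketch with made-up positions, and the function name `overlap` is hypothetical. Note that an overlap on paper does not by itself show that the two Matches match each other; that still has to be checked.

```python
def overlap(seg_a, seg_b):
    """Return (start, end) of the region shared by two (start, end) segments,
    or None when they do not overlap. Both must be on the same chromosome."""
    start = max(seg_a[0], seg_b[0])
    end = min(seg_a[1], seg_b[1])
    return (start, end) if start < end else None


# Positions in Mbp for brevity (illustrative values only).
shared = overlap((10, 40), (25, 60))
print(shared)
print(overlap((10, 20), (30, 40)))
```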
At FTDNA you can use the matrix tool to see if people are matches to each other, though you cannot see if they match on a particular segment. It is easier than doing ICW for each one.
Hello,
For FTDNA, I thought their matrix option provided the triangulation function, since it indicates which of the 10 match inputs match each other. Am I missing something?
We thought the same thing at the same time!
June, I type FF in the first cell. Then I copy that cell to the clipboard (Ctrl-C). Then I highlight the rest of the column, and paste (Ctrl-V). Two ways to highlight: 1 is to scroll down; 2 is to select the first cell you want, then navigate [page down, etc] until you can see the last cell, then hold the Shift key down while you click on that last cell. The whole column should now be highlighted and you can then paste (Ctrl-V) in what’s in the clipboard (FF).
I hit reply under Jim’s message but for some reason it wants to call it a reply to Ann.
Thanks, Jim. I had forgotten about page down. It’s something I usually don’t need as before DNA I never had large files to deal with. But I fear either there is still something flaky with Calc or your instructions were slightly in error.
Doing what you suggested didn’t work. As soon as I hit the PGDN key it changed the range from the selected box to just one box on the new page. But I played around and found that if I did a Shift-PGDN it correctly updated the range from the selected box through the box that it had previously selected with just PGDN. So with continued Shift-PGDNs I was able to add the source column after all. If this is a Calc quirk I think I’d better break down and buy Excel if for no other reason than that I’d know I was speaking the same language as all you other folks who use it.
Thanks, Jim,
I got it all except the little phrase where you say you prefer one to one at GEDMATCH and Chrome. What did you mean by ” and Chrome”?
Chrome is the free Google browser. It works very well with GEDmatch.
June, I’m not at all familiar with Calc. In Excel, I can Insert a formal Table, which is automatically limited to the number of rows with data. If I add a new column (as for source), the Table automatically expands sideways but keeps the same number of rows. Then I can put FF in the first row of the column and hit Enter. Highlight the cell with FF, hit Ctrl-C, Shift-End-Down, Ctrl-V to fill the column. (These are Windows commands. I assume Macs have equivalents.)
If you can’t create a Table in Calc, then you can create your own boundary. Go to the bottom of a column with data (End-Down), then move left or right to the column you want to fill. Type some random characters like xxx. Then do the same steps as in the first paragraph. Shift-End-Down will stop at the cell with xxx.
The thing I really love about Tables is that the header row becomes a drop-down list for filtering. So for example, I can filter the rows for Chromosome 12, or for all records with a certain match name, etc. If Calc can’t do this, I’d say it’s almost worth purchasing Excel for that feature alone.
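If neither Tables nor fill-down cooperates, the source column can also be added before the file ever reaches the spreadsheet. The following is only a sketch in Python: the function name `add_source_column`, the header names, and the sample rows are invented, and "FF" follows the Family Finder code suggested elsewhere in these comments. It fills the new column for exactly the rows that contain data and nothing beyond them.

```python
import csv
import io


def add_source_column(csv_text, source_code, column_name="Source"):
    """Append a column holding source_code to every data row, and no further."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    out_rows = [header + [column_name]]
    for row in reader:
        if any(cell.strip() for cell in row):  # skip completely blank lines
            out_rows.append(row + [source_code])
    return out_rows


# Made-up sample with a stray blank line at the end.
sample = """MatchName,Chromosome,Start,End,cM
Match A,1,10000000,30000000,22.4
Match B,7,40000000,52000000,10.0

"""
tagged = add_source_column(sample, "FF")
for row in tagged:
    print(",".join(row))
```

The result has one header row plus one tagged row per segment, so there is no risk of filling 100,000 empty lines.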
Excellent, Jim! So glad you are going this.
Oops! So glad you are DOING this.
Yes you do need a separate post about spreadsheets. While I understand the principles involved I’m having a LOT of trouble building an initial spreadsheet. Part of the problem may be that, since it has in the past been sufficient for my needs, I use Open Office Calc rather than Excel. Is Excel necessary for DNA research?
Also as an Ashkenazi Jew I have over 4000 matches from my initial testing at Ancestry and which I subsequently transferred to FTDNA. I later transferred it to Gedmatch. Obviously when I download from FTDNA to a spread sheet the first thing I want to do is add a column to identify the source of the data. At least with Calc I haven’t found an easy way to do this. Even to use fill down I have to sit there holding down a button until it selects all of the column I want to fill. Would this sort of thing be easier with Excel?
Any tips on working with DNA spread sheets would be greatly appreciated.
June, I think any spreadsheet software will do, as long as it has a sort function. At FTDNA, go to Chromosome Browser; at top right click on the download link to open a .csv file. You should be able to either save this spreadsheet to your Calc format OR highlight and copy the whole spreadsheet and then paste it into a Calc spreadsheet. Each company shows your Matches who are in their database, so each Match list will be different. Absolutely add a column to indicate which company the Match is from. I use FF for Family Finder, Gm for GEDmatch, 23 for 23andMe and AD for AncestryDNA (although the only way you’ll see AncestryDNA Matches is through GEDmatch).
I am able to download the spreadsheet and open it with Calc. The problem is that there doesn’t seem to be any easy way to add the source column containing all the same values. In the past I’ve tried putting in the column head and then below that “FF” but the only way to fill the rest of the relevant locations then was to page down while selecting those rows – over 4000 of them. And then I usually overshoot the bottom and have trouble getting back to where I really want to end the fill.
Today I tried entering FF where the column heading should be and selecting the whole column. It did then let me fill down but it didn’t stop at the point where the rest of the data ended and filled over 100,000 lines with FF. Would this be the same way Excel would have handled this? In other words, do I just not know enough about spreadsheets in general, or does it indicate a limitation in Calc? How would you go about adding and filling in FF for just those lines in the spreadsheet that already have the data downloaded from FTDNA?
Jim,
I’ve just done a cram course on your blog, devouring as much and as fast as I can. I just created a spreadsheet per your recommendations for triangulation and utilized (as a shortcut??) GEDMATCH’s new Beta Triangulation Group utility. I’m concerned. When comparing the Triangulation Group data to the One to One results for any given cousin, there are dramatic differences in the cM and especially in the SNP counts for the exact same individual on the exact same chromosomal segment. The segment ranges sometimes vary, but only slightly between the two utilities. Have you any experience with the Beta Triangulation Group Utility and/or have you heard of any such reports previously? I could post examples, but you get the idea. In terms of mapping, since the segments are essentially identical in each utility for a given match, perhaps I shouldn’t be so concerned, but cMs and SNPs probably matter to some extent. Any thoughts are welcomed. Great Blog! You’ve done a ton of work which will help many of us amateur genetic genealogists! Thanks.
Rick
The algorithms are slightly different and produce slightly different results. The data is fuzzy. Triangulated Groups are relatively broad areas and are usually not affected by these differences. I also note that GEDmatch and the companies often “update” (change) their data from time to time.