Getting Started with GEDmatch

Many of us ask Matches to upload to GEDmatch. Some do. Most are bewildered by what they see – I sure was, when I started! It’s particularly daunting with AncestryDNA Matches who generally don’t have any prior experience with DNA segments. Well, just a little noodling around can go a long way. This blog post will suggest some easy steps for anyone who has just uploaded to GEDmatch.

First the login. Your GEDmatch page is anchored on the email you used to sign up and a password you provided. If you forget your password, just enter your email and click on forgot password.

When your GEDmatch home page opens up – stop for a moment and look around.

– Messages from the GEDmatch Admin are at the top.

Look at each of the big boxes:

Information – your profile info.

File Uploads – links to upload raw DNA data files on the left and links to upload a GEDCOM (Tree info) on the right (it pays to do both).

Learn More – Several links to learn more about GEDmatch – not just yet, but soon, click on each link and look it over.

Analyze Your Data – Just read the title of each of the Utilities for now; I’ll come back to some of them later.

Your DNA Resources – a list of your GEDmatch kits (you can upload more kits). Note the EDIT or DELETE link.

Your GEDCOM Resources – the GEDCOMs (Trees) you’ve uploaded.

Tier 1 Utilities – some advanced utilities for users who pay a subscription (see bottom of your home page); see how much you’ll use GEDmatch before subscribing.

 

JUMP IN – Get your feet wet – take a utility on a trial run and see what it’s like! Start with the ‘One-to-one’ compare – just click on that line. Enter your GEDmatch kit number in the first box, and someone else’s kit number in box 2. Usually someone has asked you to upload, and they should have given you their kit number. Then hit enter, or click on the Submit button. You’ll get a table of the DNA shared segment(s) with Chromosome, Start and End Locations, cM, SNP – this is the physical information about the DNA you share with a Match – see my blogpost here for more info. Under the chart are some other data, including “Estimated number of generations to MRCA” – please take this number with a grain of salt! It’s a calculated number. 1.0 means parent/child. As the number gets larger, it’s actually more of an average than anything specific. Don’t put a lot of stock in anything over 4.

Now – use your browser back arrow to get back to the Comparison Entry Form – it should have the two kit numbers still there. Select the button for Graphics and Positions to get a colorful display of your 22 chromosomes compared to your Match. Read the legend at the top with particular attention to the red (no match) and yellow (half match) – when the yellow (and green) is long enough, the utility will show a shared segment with a blue bar. And you’ll see the same table info you saw before. This view helps put the whole DNA matching thing in perspective. For most of your cousin Matches, the colorful chromosome bars will be alternating red and yellow (maybe a little green). This indicates that we match on a lot of our DNA, but just not in long stretches. Generally, you and a cousin Match will have only one stretch of matching yellow (and green) that is long enough to call a Match. As you read through this blog, you’ll learn that when the shared segments of different Matches overlap and match each other, you have a Triangulated Group, which can be very helpful.

 

NEXT STEP – Now that you’ve tried the basic one-to-one compare, you’re ready to try the ‘One-to-many’ matches. Go back to your home page and click on that utility. Now just enter your GEDmatch kit number and click on Display Results. Please read the explanatory material at the top – all of it is important. Then scroll down to see your closest 2,000 Matches, arranged by closeness. For each one you’ll see the Kit Number; Autosomal and X-DNA data; each Match’s name/alias, and their email. Maybe you’ll recognize some of the top Matches from your testing company… Now click on the hyperlinked A in the Autosomal/Details column – you’ll see the one-to-one comparison page come up with your two kit numbers already filled in – just click on Submit to “see” the shared segment(s), as described above.

Put on your genealogist hat and email any of your Matches and share Trees and info to discover how you are related. To fully use the DNA data, read my blogposts about Triangulation. It takes a while to get up to speed on the DNA analysis, so I highly recommend wearing your genealogy hat for a while and getting to know your cousins….

If you are trying to relate cM values to cousinship, there is a wide range of possibilities. Check out the August 2017 chart at ISOGG here.

 

ADMIXTURE – Try a test run. On your home page, click on “Admixture (heritage)”. Select, say, Eurogenes, and click on Continue. Enter your kit number in the box; and click on Continue. Look over your results. Go back and try different parameters – each one will give different results. Such is the nature of admixture analysis – different utilities have different reference populations and algorithms. Don’t take any of them as gospel. Have a little fun and try the “Archaic DNA matches” to see how close you are to some ancient people, like Clovis man. Or try the “Are your parents related?” utility.

 

Comments to improve this post are welcomed.

 

Permission is granted to anyone who wants to include a link to this blogpost in their message to Matches just starting with GEDmatch.

 

[21B] Segment-ology: Getting Started with GEDmatch; by Jim Bartlett 20170919

MRCA Knothole Guidelines

A Triangulation CONCEPT

In the MRCA Knothole post, we looked at a funnel anchored to a genetic MRCA. In actual practice, we may not know the genetic MRCA for sure. We may well have determined several genealogy MRCAs in a TG, each from a different line (the lines don’t descend from each other). In this case you might think of multiple potential knotholes and funnels associated with each line. But ONLY ONE line can be correct. With only one or two Matches, we cannot draw firm conclusions. But as you and more Matches determine your MRCAs, a pattern usually emerges which gives you more confidence that one of the MRCAs is probably right. It may be that several Matches in a TG all agree on the same MRCA – my rule of thumb is the number of such Matches should be at least the number of Greats in the MRCA (e.g. four with a 4G grandparent MRCA (5C level)). You’d want at least four cousins in the TG who are all at least 1C apart from each other (not a parent, 2 children and an uncle), who are all descendants of the same 4G grandparents*. With multiple MRCAs “walk the ancestor back” – my rule of thumb is at least a 2C or 3C (with the appropriate amount of total shared DNA) to anchor the funnel stem; and skip no more than one generation in the other Match cousins, on the walk “back”. Or use combinations of the above; or use judgment. This is your hobby, your genealogy – use your judgment! The above are my recommended guidelines to be reasonably safe – but always be prepared to adjust as new information comes in.

MY MRCA GUIDELINE SUMMARY:

  1. For Matches who are cousins on the same MRCA: have at least as many such separated cousins as the number of Greats in the MRCA (e.g. five 6C)
  2. For Matches who are cousins with various MRCAs, then walk the ancestor back: have a 2C or 3C anchor, with other cousins at different levels back to the MRCA (e.g. a 2C, 4C, and 6C)
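The two guidelines above can be turned into a quick check. This is only a minimal sketch in Python, not a tool I use: the function names are made up for illustration, and the test for guideline 2 is one reading of the rule of thumb (closest cousin is a 3C or closer, most distant cousin is at the MRCA level, and no step skips more than one cousin level). It relies on the fact that an nC shares (n-1)G grandparents, so a 5C goes with 4G grandparents.

```python
def greats_for_cousin_level(cousin_level: int) -> int:
    """An nC shares (n-1)G grandparents, e.g. a 5C shares 4G grandparents."""
    if cousin_level < 1:
        raise ValueError("cousin level must be 1 or higher")
    return cousin_level - 1


def meets_guideline_1(cousin_level: int, separated_cousins: int) -> bool:
    """Guideline 1: at least as many separated cousins as Greats in the MRCA."""
    return separated_cousins >= greats_for_cousin_level(cousin_level)


def meets_guideline_2(cousin_levels, mrca_level: int) -> bool:
    """Guideline 2 (one interpretation): walk the ancestor back.

    Needs an anchor at 3C or closer, a cousin at the MRCA level,
    and no more than one cousin level skipped between steps.
    """
    levels = sorted(set(cousin_levels))
    if not levels or levels[0] > 3 or levels[-1] != mrca_level:
        return False
    return all(later - earlier <= 2 for earlier, later in zip(levels, levels[1:]))


print(meets_guideline_1(5, 4))          # four 5C on a 4G grandparent MRCA -> True
print(meets_guideline_1(6, 3))          # only three 6C -> False
print(meets_guideline_2([2, 4, 6], 6))  # a 2C, 4C and 6C -> True
print(meets_guideline_2([2, 6], 6))     # skips 3C, 4C and 5C -> False
```

A failed check does not mean the MRCA is wrong. It only means the evidence is thinner than these rules of thumb suggest, so judgment still applies.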

[*NB: having four cousins from a 4G grandparent does not mean they all came from 4 different children of that grandparent. In fact that scenario is very, very unlikely. But the four cousins may be 1C or 2C or 3C to each other (not on your line). It’s possible they all came from only one child of the 4G grandparent.]

08C Segment-ology: The MRCA Knothole Guidelines! Concept by Jim Bartlett 20170910

The MRCA Knothole!

A Triangulation Concept

An MRCA* in a Triangulated Group (TG) creates a point in your ancestry Tree that the DNA from an Ancestor has to pass through to get to you – a knot-hole of sorts. A DNA segment from your Ancestor passes down a specific line of descent to you. When you and a Match determine a genetic MRCA, that MRCA has to be on that specific line of descent. Draw a mental picture (or see the Figures below) of a funnel, with the MRCA and the line of descent to you represented by the narrow stem; and all of the ancestors of the MRCA represented by the V-shaped funnel itself. Any other genetic MRCA for this TG will be 1) within the funnel area (the ancestors of the MRCA), or 2) somewhere on the stem (descendants down to you). The MRCA Knothole is where the stem meets the funnel. Suppose the genetic MRCA is with a 3C on a 2G grandparent couple (one of 8 such couples in your Tree). The MRCA Knothole is this couple; the stem includes their child/your Great grandparent, your grandparent, your parent and you; the funnel includes only four of your 3G grandparents, eight 4G grandparents, etc. This funnel now eliminates 7/8 of your ancestry from contention for this TG segment! Now suppose you find a 5C with an MRCA in this TG, and the MRCA is in the funnel! BINGO!! If this, too, is a genetic MRCA, you’ve just shifted the MRCA Knothole! The funnel moves back two generations and now excludes 31/32 of your ancestry from contention, and really narrows down the possibilities for more distant MRCAs for this TG. This is the concept of “walking the ancestor back” – see blog post here.

[*MRCA (Most Recent Common Ancestor) here means a genetic MRCA, one in the path of the shared DNA.]
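The 7/8 and 31/32 figures come from counting ancestral couples. Here is a small sketch of that arithmetic in Python; the function names are made up for this example. It assumes a full Tree with no pedigree collapse, so there are 2 grandparent couples, 4 Great grandparent couples, 8 couples at the 2G level, and so on.

```python
from fractions import Fraction


def couples_at(greats: int) -> int:
    """Number of ancestral couples at the kG grandparent level (0 = grandparents)."""
    return 2 ** (greats + 1)


def fraction_excluded(greats: int) -> Fraction:
    """Share of your ancestry at that level that lies outside the funnel."""
    return 1 - Fraction(1, couples_at(greats))


print(couples_at(2), fraction_excluded(2))  # 3C on a 2G grandparent couple: 8 7/8
print(couples_at(4), fraction_excluded(4))  # 5C on a 4G grandparent couple: 32 31/32
```

Each generation the Knothole moves back doubles the number of couples, which is why every confirmed step of walking the ancestor back narrows the search so quickly.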

The following Figures show the MRCA Knothole for a 2C and then shifted for a 3C. The funnel for the 3C has only half the Ancestors still in contention for the TG of the Matches.

Figure 1. MRCA Knothole and Funnel for 2C

Figure 2. MRCA Knothole and Funnel for 3C

08B Segment-ology: The MRCA Knothole! Concept by Jim Bartlett 20170909

Using AncestryDNA Notes

A Segment-ology TIDBIT

In a previous post, I outlined a Format for AncestryDNA Notes. I have found using the Notes feature at AncestryDNA and some standard format (like the one I outlined) together, provide a very valuable tool. Here are several reasons:

  1. This summarizes what you know and learn about each Match. I now have over 46,000 Matches at AncestryDNA, so I’ve decided to focus on all the Matches with Hints, all the Matches which are 4C (4th cousin) or closer, any who have uploaded to GEDmatch, and selected other Matches I find doing specific surname searches. At this writing (9/9/17) I have 713 Shared Ancestry Hints and 1,860 4C or closer Matches. This relatively smaller group helps me focus on the lower hanging fruit, keep track of them, and summarize the info for each Match of interest to me. The Notes field shows up as a small, handy, “page” icon adjacent to the Match’s name in various lists at AncestryDNA.
  2. When using the AncestryDNA Helper in Chrome, you can get a download of all Matches – 46,000 in my case. This download includes what I’ve entered into the Notes field. I have now modified my Notes format to include the Ahnentafel number and side where known (including all Hints at least). So instead of 6C1R: BUTCHER/BUSH in my blog post example; I now use 176P/6C1R; BUTCHER/BUSH. This tells me the Common Ancestor is on my Paternal side – and the Ahnentafel number often comes in handy. So I can now sort my download by the Notes field, and all the Matches with Ahnentafel 176 are grouped together. In this example, 176 is at the 6C level, and per Figure 3 of this post, they should be grouped in roughly 4 different TGs [see 7th column: avg segs/anc for one side]. This just gives you a rough idea of what to expect in a chromosome map.
  3. But probably the most exciting aspect of Notes is their availability in Shared Matches. Only 4C or closer will show up as a Shared Match – however, the “4C or closer” designation is applied fairly loosely and may in fact be given to a 5C or 6C or more. Not all of my Hints are in this category, but many are. In any case, whenever I’m looking at a new Match, I check the Shared Match list – and look for those with the “page” icon. Line [4] of my format includes info on the Shared Matches – so I copy lines [1] and [2] from the opened Note of a Shared Match and paste it into Line [4] of the new Match. It sounds much more complex than it really is. I’m just copying key (top line) info from Shared Match Notes into the Notes for new Matches. When I find two (or more) Shared Matches with the same Common Ancestor, rather than copying that info a second time into the new Note, I just put a 2x (or 3x, etc) in front of the existing version. I’m finding that sometimes I have 4, 5, 6 or more Shared Matches with the same ancestry. This is pretty powerful stuff! This is strong evidence that this new Match has that ancestry, too. Not a guarantee, but certainly the first place to look. More and more, this is becoming helpful, even predictive. And it’s helping me and my Match. It is a big help with folks who have small, no, or Private Trees. When I can correctly predict that a certain Ancestor is probably in a Private Tree, I have a somewhat higher response rate.
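If you work with the download outside a spreadsheet, the Ahnentafel prefix in the Notes is easy to pull apart. The following is a rough Python sketch, assuming the first line of each Note looks like 176P/6C1R; BUTCHER/BUSH. The pattern, the function names, the "M" letter for the Maternal side and the sample Notes (other than the 176P example) are my own assumptions for illustration. It uses the standard Ahnentafel rule that a father is 2n and a mother is 2n+1, so the number alone gives both the generation and the side.

```python
import re
from collections import defaultdict

# Assumed first-line format: <Ahnentafel><P or M>/<cousinship>; <SURNAMES>
NOTE_PATTERN = re.compile(r"^(\d+)([PM])/([^;:\s]+)[;:]\s*(.+)$")


def parse_note(first_line):
    """Split a Note's first line into its parts; None if it does not fit the format."""
    m = NOTE_PATTERN.match(first_line.strip())
    if m is None:
        return None
    return {
        "ahnentafel": int(m.group(1)),
        "side": m.group(2),
        "cousinship": m.group(3),
        "surnames": m.group(4),
    }


def generations_back(ahnentafel):
    """1 = you (0), 2-3 = parents (1), 4-7 = grandparents (2), and so on."""
    return ahnentafel.bit_length() - 1


def side_of(ahnentafel):
    """Ancestors of 2 (father) are Paternal, ancestors of 3 (mother) are Maternal."""
    if ahnentafel < 2:
        raise ValueError("Ahnentafel 1 is you; no side applies")
    return "P" if bin(ahnentafel)[3] == "0" else "M"


# Hypothetical sample Notes keyed by Match name.
notes = {
    "Match 1": "176P/6C1R; BUTCHER/BUSH",
    "Match 2": "52M/4C; EXAMPLE/SAMPLE",
    "Match 3": "176P/6C; BUTCHER/BUSH",
    "Match 4": "no standard note yet",
}

groups = defaultdict(list)
for name, note in notes.items():
    parsed = parse_note(note)
    if parsed is not None:
        groups[parsed["ahnentafel"]].append(name)

for ahnentafel in sorted(groups):
    print(ahnentafel, side_of(ahnentafel), generations_back(ahnentafel), groups[ahnentafel])
```

For Ahnentafel 176 this gives 7 generations back, which is the 5G grandparent (6C) level, and the Paternal side, in agreement with the 176P example. A side letter in a Note that disagrees with side_of() would be worth a second look.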

There is some amount of work to filling out the Notes on all these Matches. But if I didn’t type it in the Notes box, I’d be writing it long hand on paper or in a notebook, and probably repeating this effort each time that Match came up. The method I’m describing provides a standardized process that goes pretty fast – particularly with practice;>j

I “Star” each Match with a Common Ancestor (most Hints), AND the Matches I can link to FTDNA or 23andMe or GEDmatch accounts – which include segment info and a TG identification. The Starred Matches (of 4C or closer) are also highlighted in the Shared Match lists, and easy to spot. I just made my 1,000th Starred Match. Oh, happy dance!! I have the Notes filled in for all 713 Match Hints. And the stars are proving to be very helpful beacons as I plow through the rest of my 1,860 4C or closer Matches. The groupings are becoming more obvious – and many of them are now showing up with Shared Matches with the same TG identifications. This is another incentive I can offer AncestryDNA Matches for uploading to GEDmatch. Of course, I also promise to do the DNA analysis and report back to any Match who uploads to GEDmatch (or FTDNA). This is starting to really pay off.

Bottom line: utilize the Notes boxes at AncestryDNA!

 

[22P] Segment-ology: Using AncestryDNA Notes TIDBITS by Jim Bartlett 20170909

Your TGs are pretty unique!

A Triangulation Concept

I often get questions along the lines of: “do my Matches have the same TGs?” or “can I form TGs for my 11 kits in one spreadsheet!” The answers are an emphatic: “NO!” and “NO!” Most of us are pretty ego-centric with our DNA analysis – and this is good! And while we are the center of our own universe of DNA segments, each of our Matches is, likewise, the center of their universe of DNA segments. Each of us gets random segments of DNA from our ancestors – random size segment(s) and random placement somewhere on our chromosomes. Once we get past the 2C (2nd cousin) or 3C level, it is quite amazing that we share DNA with more distant cousins at all. But we have many, many cousins, and many of them beat the odds and we share a DNA segment.  However, this does NOT mean that we and a Match both got the exact same DNA segment from a Common Ancestor (CA) – that very rarely happens.  We get a segment, and our Match gets a segment, and what we “see” in a Chromosome Browser is the portion of our individual segments from the CA that overlaps – the shared part of our segments. When we form a TG with various Match-segments (most matching each other), there are usually fairly well defined start and end locations to the TG*. Of the shared segments in a TG: some will start at the start of the TG, and end before the end of the TG; some will end at the end of the TG; some will “float” within the TG; and some, particularly with closer cousins, will be larger than the TG. These are all normal and expected [see an example in Figure 6 here]. The point of this concept is that each of the Matches in the TG will have their own unique TG – representing the full segment they got from the CA. If your shared segment with a Match, aligns with the start location of a TG, there is a very good chance that the Match’s segment from the CA began before yours did, and the Match’s TG has an earlier start location. 
Try this experiment: Take two (or more) Matches at GEDmatch with shared segments that start at the start of one of your TGs, and compare them to each other. Often their overlapping segment will start before your TG. In fact, using the Shared Segment search utility at GEDmatch, you can probably find other Matches that match in a Match’s TG, that don’t appear in your TG (and those Matches have segments that don’t overlap enough with your DNA to form a shared segment). The bottom lines for this concept are that Triangulation should be done on one “base” person (usually you) at a time; and there is more to the DNA passed down from an Ancestor than what is shown by any one descendant’s TG. Each of our TGs is pretty unique, and our Matches will not have the same TGs or chromosome map.
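The experiment is easier to picture with numbers. Below is a minimal Python sketch with made-up positions (think of them as Mbp on one chromosome): each person got their own segment from the CA, and a shared segment is just the overlap of two of those segments. The names and values are hypothetical.

```python
def overlap(seg_a, seg_b):
    """Return the overlapping part of two (start, end) segments, or None."""
    start = max(seg_a[0], seg_b[0])
    end = min(seg_a[1], seg_b[1])
    return (start, end) if start < end else None


# Hypothetical full segments each person received from the same Common Ancestor.
you = (40, 70)
match_b = (30, 60)
match_c = (25, 65)

print(overlap(you, match_b))      # what you see with B: (40, 60)
print(overlap(you, match_c))      # what you see with C: (40, 65)
print(overlap(match_b, match_c))  # what B sees with C: (30, 60)
```

Both of your shared segments start at 40, the start of your TG, but B and C match each other from 30. Their TGs start before yours, which is the point of this concept.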

[*Sometimes the TG start and end locations are a little fuzzy (see here), but our focus should be on the bulk of the TG.]

08A Segment-ology: Your TGs are pretty unique! Concept by Jim Bartlett 20170907

A Triangulation Overview

A Segment-ology TIDBIT

Triangulation is a tool. It’s a process that can help us with our genealogy. It is not the only tool in our kit bag – there are many other tools that also utilize DNA, including InCommonWith Lists, Matching Segment Lists, Matrix displays, Shared Matches, Clustering, Circles, etc, etc. This blogpost is an overview of Triangulation.

With atDNA we have been using Triangulation to mean two different things:

Segment Triangulation of shared segments (a focus of this blog), and

Ancestry Triangulation (having at least 3 Matches in a Triangulated Group (TG) all match on the same ancestral line, sharing a Common Ancestor (CA) on that line).

In the atDNA community we often conflate these two concepts, and they are very much intertwined. I tend to think first of forming a TG and then looking at the genealogy to determine the side (maternal or paternal) and then finding various MRCAs. But some start with the genealogy and look for Triangulation to add evidence that a CA is correct. Both ways will work, they are intertwined in genetic genealogy, so in this overview I will also conflate them. Here are some overview points about Triangulation:

We look for at least 3 Matches. Much of our work as genealogists involves one-on-one – finding a Common Ancestor with a Match – that’s OK, but it’s not Triangulation.

We look at overlapping DNA segments. ICW and other tools don’t require overlapping segments – that’s OK, but they are not Triangulation.

We look for 3 “segment” legs. This means the 3 people (usually you and two Matches) that form a Triangulated Group are not closely related. But once a TG of 3 cousins is formed, other close relatives can be added to the TG. It’s the TG forming that needs 3 strong legs. So 3 siblings and their parent do not form a TG, but they can be in one.

The shared segments that form a TG must be IBD. From experience we’ve found that:

  • “all” shared segments over 15cM are IBD;
  • shared segments under about 7cM are false most of the time; and
  • the process of comparing overlapping shared segments in a TG will cull out many in the 7 to 15cM range which do not match – I consider these to be false segments.
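As a rough illustration, the three rules of thumb above can be written as a first-pass sort of a segment list. This is only a sketch in Python: the function name and labels are made up, and the cutoffs are the approximate ones from experience given above, not hard limits.

```python
def first_pass_label(cm: float) -> str:
    """Label a shared segment by size, using the rules of thumb above."""
    if cm > 15:
        return "treat as IBD"
    if cm < 7:
        return "probably false"
    return "check by Triangulation"


for cm in (22.4, 9.8, 5.1):
    print(cm, first_pass_label(cm))
```

The middle label is the important one: segments in the 7 to 15cM range are neither accepted nor rejected by size alone, and only comparing them with overlapping segments in a TG settles the question.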

Blaine Bettinger is working to define Triangulation – not to preclude the use of other tools – but to help us better understand Triangulation as a tool. I use Triangulation as a tool to primarily sort and group all of my IBD segments. I’ve formed about 400 separate TGs over my 45 chromosomes. New Matches always fall into one of these TGs (close Matches may span two or more TGs – it’s OK). This is segment Triangulation. With close relatives, I’ve been able to determine the side for these 400 TGs. This is a huge benefit because new Matches almost always Triangulate with other Matches already in a TG; and I then know which side our Common Ancestor must be on. This is an excellent use of the Triangulation tool.

Ancestry Triangulation does not preclude me from also using the information of Circles, or ICW lists, or ethnic makeup, or even genealogy records or Trees or discussions with Matches to determine CAs.

Within a TG we may find a CA with a Match. As we have pointed out many times: shared DNA plus a CA does NOT mean that the shared DNA came from that CA, or that the CA is somehow “proved” because there is also a shared segment – maybe, but also maybe not. But, by finding 3 Matches in a TG who all share the same CA (Ancestry Triangulation), we increase our confidence (not “prove”) that the CA is linked to the shared segment; and with more Ancestry Triangulation (and/or walking the ancestor back), we increase our confidence even more.

If Triangulation leads to a conclusion that a CA is not linked by the shared DNA, we can still be cousins on that CA, and we can still use ICW, Circles, etc. to pursue a genealogy goal. But we should not say that DNA supports that cousinship conclusion.

IMO, a TG has characteristics that help us in our genealogy goals. Triangulation is a strong tool that takes advantage of our shared DNA with Matches.

I applaud Blaine’s effort to try to define Triangulation and provide some standards for its use.

The above is adapted from my recent post to the Genetic Genealogy Tips and Techniques Facebook Group.

 

[22N] Segment-ology: A Triangulation Overview TIDBIT; by Jim Bartlett 20170728

Triangulation at 23andMe

A Segment-ology TIDBIT

23andMe has a great feature. It starts out as a standard In Common With (ICW) list for each Match you are sharing with (old Shared Genomes Matches and new Open Sharing Matches). This ICW list is near the bottom of your Match’s page. But the difference with other ICW lists is the Shared DNA column. The Matches marked with a “Yes” have overlapping segments – and over 99% of the time they form a Triangulated Group (TG).

So go to your DNA Relatives page and scroll to the bottom and click on the “Download aggregate data” link. You’ll get a spreadsheet of all your matches and most of the 23andMe data. Sort the spreadsheet, and delete the ones with no segment data. Then sort on Chromosome Number and Chromosome Start to put them in a particular order. Add a column called “TG ID”. Now you’re all set to begin Triangulating.
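If you would rather prepare the download with a script than in a spreadsheet, the same steps look roughly like this in Python. Treat it as a sketch under assumptions: the column headers "Chromosome Number" and "Chromosome Start" follow the wording above, "Display Name" and the sample rows are hypothetical, and the real download may use different headers.

```python
import csv
import io

# Hypothetical stand-in for the downloaded file.
SAMPLE = """Display Name,Chromosome Number,Chromosome Start
Match C,2,5000000
Match A,1,10000000
Match D,,
Match B,1,3000000
Match E,X,7000000
"""


def chromosome_sort_key(value):
    """Numbered chromosomes in numeric order, anything else (such as X) after them."""
    return (0, int(value)) if value.isdigit() else (1, 0)


def prepare_rows(csv_text):
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Drop the Matches with no segment data.
    with_segments = [r for r in rows if r["Chromosome Number"] and r["Chromosome Start"]]
    # Sort on Chromosome Number, then Chromosome Start.
    with_segments.sort(
        key=lambda r: (chromosome_sort_key(r["Chromosome Number"]), int(r["Chromosome Start"]))
    )
    # Add the empty "TG ID" column, ready for Triangulating.
    for r in with_segments:
        r["TG ID"] = ""
    return with_segments


for row in prepare_rows(SAMPLE):
    print(row["Chromosome Number"], row["Chromosome Start"], row["Display Name"])
```

Sorting the chromosome as a number rather than as text keeps Chr 2 ahead of Chr 10, which a plain text sort would get wrong.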

Start with the first Match in the spreadsheet (let’s call it A). Click on the hyperlink* that takes you to A’s page, scroll down to the ICW list and note in your spreadsheet Match A and each Match with a “Yes”. Since you are starting on Chr 01, call this TG: 01A, and put 01A in the TG ID column for A and each “Yes” Match. This pretty much identifies all the other 23andMe Matches that are in a TG with (A). The whole TG 01A (of 23andMe Matches) is created through one Match! There may be a few that don’t overlap A enough to form a shared segment at 23andMe, but all you have to do is go down your spreadsheet list of 23andMe Matches and select the next Match (B) that is not already in a TG; click on B’s hyperlink and look at their ICW list for “Yes” Matches with (B) – some will either overlap with (A) (call all of them 01A, too); or they all form a new TG (say 01B) – one or the other. Then continue with the next Match not already in a TG.

One could probably go through their entire 23andMe list of shared Matches in a few hours, creating TGs for all of them. There may be some with no ICW “Yes” Matches – give them their own TG; and move on. Be careful with Matches with more than one shared segment – make sure to treat each segment individually – this may take a little extra analysis.
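The walk through the list described above is mechanical enough to sketch in Python. This is an illustration only, with several assumptions: the "Yes" lists are ones you have already noted by hand from each Match's ICW list (the names and data here are hypothetical), each Match has just one shared segment, there are fewer than 26 TGs per chromosome, and only numbered chromosomes are used.

```python
from string import ascii_uppercase


def assign_tg_ids(segments, yes_lists):
    """segments: list of (match, chromosome, start).
    yes_lists: match -> set of Matches marked "Yes" on that Match's ICW list.
    Returns match -> TG ID such as "01A".
    """
    chrom_of = {match: chrom for match, chrom, _ in segments}
    tg_of = {}
    tg_count = {}  # chromosome -> number of TGs created so far
    for match, chrom, _ in sorted(segments, key=lambda s: (s[1], s[2])):
        if match in tg_of:
            continue  # already placed through an earlier Match's "Yes" list
        yes_here = sorted(m for m in yes_lists.get(match, set()) if chrom_of.get(m) == chrom)
        # If a "Yes" Match is already in a TG, join it; otherwise start a new TG.
        tg_id = next((tg_of[m] for m in yes_here if m in tg_of), None)
        if tg_id is None:
            n = tg_count.get(chrom, 0)
            tg_id = f"{chrom:02d}{ascii_uppercase[n]}"
            tg_count[chrom] = n + 1
        tg_of[match] = tg_id
        for m in yes_here:
            tg_of.setdefault(m, tg_id)
    return tg_of


segments = [("A", 1, 10), ("C", 1, 12), ("F", 1, 30), ("B", 1, 50), ("D", 1, 55), ("E", 2, 5)]
yes_lists = {"A": {"C"}, "F": {"C"}, "B": {"D"}, "E": set()}
print(assign_tg_ids(segments, yes_lists))
```

In this made-up data, A and C form 01A; F does not show on A's list but has a "Yes" with C, so it joins 01A too; B and D form 01B; and E, with no "Yes" Matches, gets its own TG, 02A. Matches with more than one shared segment need the extra analysis mentioned above and are not handled by this sketch.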

Remember, TGs represent segments (from an ancestor) on one of your chromosomes. They are equivalent to phased data. I consider all shared segments at 23andMe which Triangulate to be IBD. All of them should be in a TG on one side (parent), or the other.

If you have known relatives in any of the TGs you can assign those TGs as Paternal or Maternal. This often allows other, overlapping TGs to also be assigned to a side, using logic.

*Note1: clicking on the 23andMe hyperlink (Link to Compare View) is a little tricky – I usually just copy (Ctrl-C) the spreadsheet URL, and paste (Ctrl-V) it into the URL bar of any open 23andMe page – hit Enter. It goes pretty fast.

Note2: feel free to use any TG ID numbering system you want. I think it’s wise to start with the Chromosome number. But you can name your TGs Bill, or Bob, or Sue if you want. You are creating groups that will tie to ancestral lines.

ARE YOU READING THIS FTDNA? ALL YOU NEED TO ADD IS A YES!! And AncestryDNA could add a similar feature, and hell might freeze over, too.

Enjoy easy Triangulation at 23andMe…

 

[22M] Segment-ology: Triangulation at 23andMe TIDBIT; by Jim Bartlett 20170720