Confusion about Base Pairs

A Segment-ology TIDBIT

Let’s sort this out. A Chromosome is a long string of DNA – which has the form of the famous double helix. If we flattened out the double helix it would look like a ladder, with two sides connected by lots of rungs. On each end of every rung is a molecule we call a base – called A, C, G or T for short. The two ends of each rung are always paired, with A on one end and T on the other end, or C on one end and G on the other. That’s because in chemistry, the A molecule bonds much more readily with a T; and a C bonds easily with a G. They form what is called a base pair. And if you know one end of each rung, you know the other end. 23 chromosomes make up a genome, and a genome has about 3 billion of these base pairs*.
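The "know one end of each rung, know the other" rule can be written down as a tiny lookup. This is only an illustrative sketch: the function name `other_side` and the sample strand are made up for the example.

```python
# Base pairing on each rung: A pairs with T, and C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def other_side(strand: str) -> str:
    """Given the bases along one side of the ladder, return the base at the
    opposite end of each rung, rung by rung (hypothetical helper)."""
    return "".join(COMPLEMENT[base] for base in strand)


# A made-up stretch of six rungs, read along one side of the ladder.
print(other_side("ACGTTG"))  # TGCAAC
```

Applying the lookup twice gets you back to the original side, which is another way of saying the two ends of a rung always go together.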

As we look at one side of the chromosome “ladder” we see one of these molecules at every rung. Important: There is no hard and fast rule about the order of the ACGTs along one side of the ladder.

In our bodies we have two genomes – one set of chromosomes from the father and one set from the mother.

For atDNA testing a laboratory looks at, say, 600,000 specific base pairs called SNPs (pronounced snips). Each of these SNPs is at a specific location on a chromosome, and the lab looks at one side (the “forward” side) and determines if it is an A, C, G, or T. Because we have two of each chromosome, they actually get two values (called alleles), one from the paternal chromosome, and one from the maternal chromosome. Because all these SNPs are floating around in a soup, we don’t know which one is from Mom and which one is from Dad. One convention is to list them alphabetically, resulting in ten possibilities: AA, AC, AG, AT, CC, CG, CT, GG, GT, and TT. You can see that in these “pairs” the A is not necessarily paired with T. That’s because the DNA from each parent came to you from very different, and usually very distant, paths – they don’t touch or interact with each other. And the SNP base pairs were chosen for an atDNA test because they offer variability.
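The count of ten comes from choosing two alleles out of A, C, G and T when the order doesn't matter (because we can't tell Mom's from Dad's). A minimal sketch, with hypothetical function names, that lists them and applies the alphabetical convention:

```python
from itertools import combinations_with_replacement


def unordered_genotypes(bases: str = "ACGT") -> list[str]:
    """List every two-allele result when the order of the alleles is unknown."""
    return ["".join(pair) for pair in combinations_with_replacement(bases, 2)]


def normalize(allele_1: str, allele_2: str) -> str:
    """Write two alleles in alphabetical order, so 'T' + 'C' is reported as 'CT'."""
    return "".join(sorted((allele_1, allele_2)))


print(unordered_genotypes())
# ['AA', 'AC', 'AG', 'AT', 'CC', 'CG', 'CT', 'GG', 'GT', 'TT']
print(normalize("T", "C"))  # CT
```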

A shared DNA segment between you and a Match consists of a long string of SNPs (usually 1,000 or more) where you have at least one of your two alleles match at least one of your Match’s two alleles. The longer the shared segment, the greater the probability that it had to come from a Common Ancestor.
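To make the "at least one of your two alleles" idea concrete, here is a simplified sketch that walks along a string of SNPs and finds the longest unbroken run where two kits share an allele. It is an assumption-laden toy: real matching tools also apply cM thresholds and other rules not described here, and the genotypes below are invented.

```python
def shares_allele(genotype_a: str, genotype_b: str) -> bool:
    """True if at least one allele in one genotype also appears in the other."""
    return bool(set(genotype_a) & set(genotype_b))


def longest_matching_run(kit_a: list[str], kit_b: list[str]) -> int:
    """Length of the longest unbroken run of SNPs where the two kits share an allele."""
    best = current = 0
    for genotype_a, genotype_b in zip(kit_a, kit_b):
        current = current + 1 if shares_allele(genotype_a, genotype_b) else 0
        best = max(best, current)
    return best


# Invented genotypes at five consecutive SNPs (a real segment needs far more).
you = ["AG", "CC", "AT", "GG", "CT"]
match = ["AA", "CT", "TT", "AA", "CC"]
print(longest_matching_run(you, match))  # 3
```

The fourth SNP (GG versus AA) has no allele in common, so it breaks the run.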

BOTTOM LINE – As genetic genealogists we are not concerned with the “base pairs” on each end of a rung, we are very much interested in the two SNP alleles we got from our two parents – not called “base pairs.”

*[An experiment you can do at home: at GEDmatch compare your kit to your kit in the one-to-one utility – you’ll match on 22 chromosomes, from start to finish. Add up the “End Locations” and see how close to 3 billion you come – add in about 155 million for Chr X to get a full genome].


[22U] Segment-ology: Confusion about Base Pairs TIDBIT by Jim Bartlett 20180502

First Time at GEDmatch

A Segment-ology TIDBIT

I’ll try to answer the question: “I just uploaded to GEDmatch, it’s so complex, what do I do now?”

FIRST – take a deep breath, and exhale slowly…….. GEDmatch is a powerful tool, with a lot of features. In many ways it is complex. But it also has some very simple and helpful utilities – I’ll point them out.

SECOND – take a few minutes to look over your GEDmatch homepage – there are several broad areas. The main tools are in the blue box on the right side titled: Analyze Your Data – I’ll come back to that and focus on some of the tools. Other information is in gray boxes – please review the titles before launching into the tools.

On the left side:

Your Log-in Profile – a box with information about you, and how to change it.

Learn More – a box with several links to more information about GEDmatch – you might want to click on each one to get a feel for the resources that are there to help you.

Your DNA Resources – a box with a handy list of your DNA kits at GEDmatch (in the beginning you’ll only have the one you just uploaded, but you can upload more kits whenever you want.)  At the bottom is a link to “EDIT or DELETE” any of your listed DNA Resources – you can open this up to just look.

Your GEDCOM Resources – a box which lists any GEDCOMs you’ve uploaded, and a link to manage them.

On the right side:

File Uploads – a box with tools to upload your raw DNA data files and tools to upload GEDCOMs (a GEDCOM is a file of your genealogy, not your DNA)

Analyze Your Data – a box with tools to utilize your DNA data and a box to utilize GEDCOMS – more below

Tier 1 Utilities – a box with more tools to utilize your data for folks who pay a subscription (wait to see how much you utilize GEDmatch before you subscribe…)

Genesis Beta – a box for certain kits (23andMe V5 kits – since July 2017; and LivingDNA kits)

THIRD – So, after this overview, where to start first?

I’d start with the first tool: One-to-many Matches – Click on this link and use the little “down triangle” to see your kit number and select it (or just type in your kit number – the form is like: A123456). Leave everything else alone and click on Display Results. After a second or two, you’ll get a table of your top 2,000 Matches – with their name, email, kit number, and other info. The default is to list the closest Matches first. Please read the information above the table, before we jump into the fun part. It’s important to know this info and that it’s here as a reference (many folks just look immediately at the Matches and then ask questions which are answered in this introductory material).  Next read each of the column titles – some won’t make sense, yet, but note that some folks list Haplogroups; some have linked genealogies [under GED/WikiTree column]; and there is summary data on the Autosomal and X data. The chart is sorted on Total cM, but notice the small blue triangles that let you sort on most columns.

In one sense this is similar to your results at AncestryDNA – a list of Matches with summary DNA info, and sometimes a link to a Tree.

To see the specific DNA segment(s) you share with a Match, click on the A link (under Autosomal Details). This takes you to the “one-to-one” utility, with your kit and the Match’s kit already filled in. Just click on the Submit button to see the DNA segment data. Or, for a more colorful version, go back to the one-to-one page and click on the Graphics and Positions button and then click on Submit. Again, please read the legend at the top first – then scroll down the page to see all 22 chromosomes – a solid blue bar indicates a shared segment. This view puts the shared DNA with a Match into perspective.

Just like with any other DNA site, you still have to work with your Matches and/or their posted Tree to determine a Common Ancestor. The emails at GEDmatch give you a way to do this.


If you know someone’s GEDmatch kit number, you can click on the “One-to-one compare” utility on your home page and fill in your kit number and your Match’s kit number to see the shared segment data.


Try the GEDCOM tools. Click on “GEDCOM + DNA Matches” and fill in your kit number to get a list of your DNA Matches with linked GEDCOMs (Trees). Note a Match (under DNA Name) and click on the link under the GEDCOM ID column to get a summary box. Then click on Pedigree, and adjust the number of generations to suit your search.

If you’ve uploaded your own GEDCOM, my favorite utility is “2 GEDCOMs”. Fill in your GEDCOM number (remember it’s listed on the left side of your GEDmatch Homepage, after it’s uploaded) and your Match’s GEDCOM number (from the GEDCOM + DNA Matches list) and hit “Compare”. This amazing utility will list every ancestor who is the same in both GEDCOMs. Finding a Common Ancestor doesn’t get easier than this!


Try any of the other utilities in the Analyze Your Data box (some are fun, some are helpful):

-Admixture (there are several different utilities from different scientists – noodle around)

-Are your parents related?

-Archaic DNA matches – compare your DNA to DNA from the “Clovis” Man or “Kennewick Man” or “Altai Neanderthal” and other ancient people whose DNA has been extracted. Includes a link to an ancient DNA site. Try the kits in the one-to-one utility.

You cannot “hurt” anything here – so click on anything that interests you and noodle around.


GEDmatch lets you compare with folks who have tested at other companies – to find close relatives who tested, but not at the same company you used. Find new relatives – particularly close ones.

A main feature is the ability to “see” the shared segments with your Matches – including between different companies. This feature is essential for Triangulation and/or Chromosome Mapping.

Note: I often find Matches who tested at one of the companies, but for some reason don’t show up as a Match at that company.  I’ve tested at all the companies, and still find many new Matches at GEDmatch.

The 2 GEDCOMs utility gets you straight to Common Ancestors (you have to have uploaded your GEDCOM too).

In the end, it’s still up to you to work with your Matches (and their info) to find Common Ancestors. GEDmatch provides some good tools to help.


[22T] Segment-ology: First Time at GEDmatch TIDBIT by Jim Bartlett 20180501

Contact Your Matches Soon!

A Segment-ology TIDBIT

Contact your Matches soon! This is an imperative!

Over 12 million atDNA kits have been sold. Two of the largest companies are AncestryDNA and MyHeritage, who really push for hefty, annual subscriptions. Because of this, and several other reasons, our Matches are dropping out in droves. If you don’t act quickly, many Matches may never see your message. NB: AncestryDNA, 23andMe and MyHeritage all use a messaging system – if your Match pulls the plug, you may miss out on the chance to ever make contact.


-Make up a standard message to include your real name and email, and perhaps a link to your Tree. Express your desire to determine your Common Ancestor(s), and promise to help them through the DNA maze (if you’re reading this blog, you already know more than most Newbies.)

-Send this message to your newest Matches – while they are still logging on to see their ethnicity and closest Matches – before they decide to drop out.

-Give them a way to contact you later (maybe after they lose interest in DNA)

-You will STAND OUT among thousands of other Matches, as someone who cares enough to make contact, and as someone who may be able to help them.

-As always, the recommendation is to start with your closest Matches and work your way down the list, as your time allows. Include as many new Matches as possible.


[22S] Segment-ology: Contact Your Matches Soon TIDBIT; by Jim Bartlett 20180415

When a Plan Comes Together…

A Segment-ology TIDBIT

I love it. My prime objective with atDNA has been to map my genome to the most distant Most Recent Common Ancestors (MRCAs) that I can. The two essential ingredients for a Chromosome Map are Segments and Common Ancestors. So my basic game plan is to collect and Triangulate as many shared segments over 7cM as I can, and find as many MRCAs as I can. I have basically completed Triangulating the shared segments with all of my Matches (culling out many Identical By Chance (IBC) or false segments along the way), and now have 360 Triangulated Groups (TGs) covering 97% of my 45 chromosomes.
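As a rough sketch of the first step in that game plan, the snippet below keeps shared segments over 7cM and flags pairs that overlap on the same chromosome. The `Segment` fields, the Match labels and the numbers are all hypothetical, and overlap alone is not Triangulation: the overlapping Matches must also match each other before they belong in the same TG.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    match_name: str  # hypothetical label for the Match
    chromosome: int
    start: int       # start location of the shared segment
    end: int         # end location of the shared segment
    cm: float        # segment size in cM


def overlaps(a: Segment, b: Segment) -> bool:
    """True if two segments are on the same chromosome and their ranges overlap."""
    return a.chromosome == b.chromosome and a.start < b.end and b.start < a.end


def overlap_candidates(segments: list[Segment], min_cm: float = 7.0) -> list[tuple[str, str]]:
    """Pairs of Matches whose segments (over min_cm) overlap; candidates only."""
    kept = [s for s in segments if s.cm > min_cm]
    return [
        (a.match_name, b.match_name)
        for i, a in enumerate(kept)
        for b in kept[i + 1:]
        if overlaps(a, b)
    ]


# Invented spreadsheet rows, for illustration only.
rows = [
    Segment("M1", 5, 10_000_000, 30_000_000, 18.2),
    Segment("M2", 5, 25_000_000, 40_000_000, 12.0),
    Segment("M3", 5, 50_000_000, 60_000_000, 9.1),
    Segment("M4", 5, 12_000_000, 20_000_000, 5.5),   # dropped: not over 7cM
    Segment("M5", 7, 10_000_000, 30_000_000, 15.0),  # different chromosome
]
print(overlap_candidates(rows))  # [('M1', 'M2')]
```

Each flagged pair still has to be compared kit-to-kit to confirm the two Matches share the segment with each other.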

It’s now a full-court-press to find MRCAs with the Matches in these TGs. Of course, not all MRCAs will be correct, but the more I can find in each TG, the more data I have to develop and test possibilities.

Two ways to find MRCAs with segment data:

1. Start with MRCAs and get the Matches to test/upload to GEDmatch to determine shared segments [see Shared Ancestry Hints below], and

2. Start with Match segments and review their Trees (including getting them to share private Trees) [alas, so many Matches have no Tree.]

One process that has worked pretty well for me, focuses on AncestryDNA Hints. I have 830 Shared Ancestor Hints (SAHs), and I’ve sent a message to every one of them. It’s a standard message saying I agree with the Hint, but note that we might have other Common Ancestors, too. For that reason, and because I’m mapping my DNA segments to specific ancestral lines, I’d like for them to upload to GEDmatch so we can see the shared segment. It’s easy, and I will do the DNA analysis and give them a report back.  About 5-10% of these SAH Matches upload.
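For a sense of scale, the 5-10% response rate on 830 Hints works out as follows. This is a quick arithmetic sketch using only the figures above; the function name is made up.

```python
def expected_range(total_hints: int, low_pct: int, high_pct: int) -> tuple[int, int]:
    """Return (low, high) whole-number counts for a percentage range."""
    return total_hints * low_pct // 100, total_hints * high_pct // 100


low, high = expected_range(830, 5, 10)
print(f"Roughly {low} to {high} uploads")  # Roughly 41 to 83 uploads
```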

Today, in response to my request 7 months ago, I got a message with a GEDmatch kit# for a SAH who is a 5th cousin (5C). At GEDmatch I found our shared segment, typed the info into my Master spreadsheet, and Triangulated with other overlapping segments. The new segment was in one of my few remaining TGs with no known MRCA. So, from the Hint, I now had an MRCA! And it “fit” at the grandparent level with adjacent TGs. I then checked our Shared Matches – there were only 3 – one Private, one No Tree, and one with 57 people. Well, the 57 people Tree had just the barest of a clue – a maiden name without dates or locations. But I knew where to look, and quickly determined it was the same line as my new GEDmatch kit. Wow! Identify an MRCA in a TG, and get another cousin with the same MRCA line in that TG at the same time.

I have over 600 Matches with MRCAs that “fit” at the grandparent level. And it’s becoming easier every day to find and rule potential MRCAs in or out of a TG.

Communicating with Matches to find MRCAs is the key. Sometimes it literally takes years… You’ll only get a small percentage of responses, but the more emails and messages you send out, the more you’ll get back.


[edited to identify abbreviations]


[22R] Segment-ology: When a Plan Comes Together TIDBIT; by Jim Bartlett 20180316

Make a Resolution to Contact Your Matches

A Segment-ology TIDBIT

Genetic Genealogists will get the most out of our atDNA tests when we contact our Matches!

We have a lot of issues and hurdles with atDNA:

– Many of our Matches have no Trees, very small Trees, and/or incorrect Trees.

– Many of our Matches are unresponsive – for a wide variety of reasons.

– Many of our Matches will have Common Ancestors with us beyond our genealogy horizon.

– Some of our Matches will have multiple Common Ancestors with us, and it’s difficult to sort out which one, if any, is the genetic Common Ancestor.

– Some of our Matches with shared segments smaller than 15cM will be false – they are not a true genetic relative. It’s often hard to tell which such shared segments are true and which are false.

BUT – many of our Matches are what I call intermediate cousins in the 4th to 6th cousin range. By that I mean cousins with a good chance of having a Common Ancestor (CA) in our Tree, or right on the fringes, and within reach of available records/research. These are cousins who can help us assign a side in a Triangulated Group (TG); or determine a more distant CA to a TG (move the “knothole”). These are cousins who can get us closer to resolving brick walls. They are out there….

At the beginning of 2018 there are probably about 10,000,000 people who have taken an autosomal DNA Test at 23andMe, Family Tree DNA, AncestryDNA, MyHeritage, LivingDNA, etc. It appears our Match lists are STILL about doubling every year – twice the Matches; twice the 3rd cousins; twice the number of segments in each Triangulated Group; twice the chance for a close Match; twice the number of intermediate cousins; twice the chance for a breakthrough Match – every year.

We have a lot of things which are out of our control (see above). But there is one thing which is very much within the control of every genetic genealogist – contacting Matches. It appears our natural tendency is to look at their Trees, look at their ICW list or Shared Matches, analyze Matrix info, and even Triangulate their shared segments – anything and everything we can do, except contact them. As an example – I have had my brother’s DNA at FTDNA and 23andMe for 5 years, and I’ve received fewer than two dozen emails or messages from his Matches. And I have two sons, a maternal uncle and several close cousins who have tested with the same pitiful result – virtually no contact. On the other hand, I have sent out several thousand emails and messages to Matches. And I’ve worked out CAs with over 600 Matches (NB: I don’t claim all are genetic CAs – that is to be determined – but they are all important clues). Yes, many of these CAs I found by looking at my Matches’ Trees, but many were found by my Matches after I contacted them. Some I found by extending a Match’s Tree, but I always try to get an agreement from the Match. Some of my Matches, once I contacted them, turned out to have a treasure trove of information about my more distant (lesser researched) Ancestors. Many Matches have additional information, not in their Trees. Communicating with Matches has great potential.

And while I’m on this topic, when a Match contacts you, be sure to respond! …in the most positive, helpful, way you can (or have time for). Your Matches are your cousins – treat them like you would any relative at a family reunion… Try not to be dismissive, or to treat them like a salesperson – they are kin, hoping to work on some of your genealogy, too.

Each person has their own objectives in genetic genealogy. And we all usually have a limited time for this hobby. So my advice here is to start with closest Matches and work down. Or work with Matches on a particular line, or within a TG of interest to you. The point is to develop a plan to contact some of your Matches. Although most may not respond, work with the ones who do.

One process that may help is standardized text. If you find yourself writing essentially the same email or message several times, save a copy to a Word (or text) document. Then you can copy and paste it to the next Match. Over time, I try to improve my standard texts – use BLUF (Bottom Line Up Front) to get their attention quickly; be as brief as you can; offer to help; promise feedback; provide your email and/or a link to your Tree.

Make a resolution to contact your Matches! You’ll be glad you did.


[22Q] Segment-ology: Make a Resolution to Contact Your Matches TIDBITS by Jim Bartlett 20180107

Getting Started with GEDmatch

Many of us ask Matches to upload to GEDmatch. Some do. Most are bewildered by what they see – I sure was, when I started! It’s particularly daunting with AncestryDNA Matches who generally don’t have any prior experience with DNA segments. Well, just a little noodling around can go a long way. This blog post will suggest some easy steps for anyone who has just uploaded to GEDmatch.

First the login. Your GEDmatch page is anchored on the email you used to sign up and a password you provided. If you forget your password, just enter your email and click on forgot password.

When your GEDmatch home page opens up – stop for a moment and look around.

– Messages from the GEDmatch Admin are at the top.

Look at each of the big boxes:

Information – your profile info.

File Uploads – links to upload raw DNA data files on the left and links to upload a GEDCOM (Tree info) on the right (it pays to do both).

Learn More – Several links to learn more about GEDmatch – not just yet, but soon, click on each link and look it over.

Analyze Your Data – Just read the title of each of the Utilities for now; I’ll come back to some of them later.

Your DNA Resources – a list of your GEDmatch kits (you can upload more kits). Note the EDIT or DELETE link.

Your GEDCOM Resources – the GEDCOMs (Trees) you’ve uploaded.

Tier 1 Utilities – some advanced utilities for users who pay a subscription (see bottom of your home page); see how much you’ll use GEDmatch before subscribing.


JUMP IN – Get your feet wet – take a utility on a trial run and see what it’s like! Start with the ‘One-to-one’ compare – just click on that line. Enter your GEDmatch kit number in the first box; and someone else’s kit number in box 2. Usually someone has asked you to upload, and they should have given you their kit number. Then hit enter, or click on the Submit button. You’ll get a table of the DNA shared segment(s) with Chromosome, Start and End Locations, cM, SNP – this is the physical information about the DNA you share with a Match – see my blogpost here for more info. Under the chart are some other data, including “Estimated number of generations to MRCA” – please take this number with a grain of salt! It’s a calculated number. 1.0 means parent/child. As the number gets larger, it’s actually more of an average than anything specific. Don’t put a lot of stock into anything over 4.

Now – use your browser back arrow to get back to the Comparison Entry Form – it should have the two kit numbers still there. Select the button for Graphics and Positions to get a colorful display of your 22 chromosomes compared to your Match. Read the legend at the top with particular attention to the red (no match) and yellow (half match) – when the yellow (and green) is long enough, the utility will show a shared segment with a blue bar. And you’ll see the same table info you saw before. This view helps put the whole DNA matching thing in perspective. For most of your cousin Matches, the colorful chromosome bars will be alternating red and yellow (maybe a little green). This indicates that we match on a lot of our DNA, but just not in long stretches. Generally, you and a cousin Match will have only one stretch of matching yellow (and green) that is long enough to call a Match. As you read through this blog, you’ll learn that when the shared segments of different Matches overlap and match each other, you have a Triangulated Group, which can be very helpful.
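The color legend can be mimicked with a few lines of code. This is a sketch of the idea only, not GEDmatch's actual algorithm: it assumes green means both alleles match, yellow means a half match, and red means no match, and the genotypes are invented.

```python
def snp_color(genotype_a: str, genotype_b: str) -> str:
    """Classify one SNP comparison using the legend's colors (assumed meanings)."""
    if sorted(genotype_a) == sorted(genotype_b):
        return "green"   # both alleles match (allele order is ignored)
    if set(genotype_a) & set(genotype_b):
        return "yellow"  # half match: at least one allele in common
    return "red"         # no match


def color_strip(kit_a: list[str], kit_b: list[str]) -> list[str]:
    """Color every SNP along a stretch of chromosome, in order."""
    return [snp_color(a, b) for a, b in zip(kit_a, kit_b)]


# Three invented SNPs compared between two kits.
print(color_strip(["AG", "CC", "TT"], ["GA", "CT", "AA"]))
# ['green', 'yellow', 'red']
```

A blue bar would then correspond to a stretch with no red that is long enough to count as a shared segment.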


NEXT STEP – Now that you’ve tried the basic one-to-one compare, you’re ready to try the ‘One-to-many’ matches. Go back to your home page and click on that utility. Now just enter your GEDmatch kit number and click on Display Results. Please read the explanatory material at the top – all of it is important. Then scroll down to see your closest 2,000 Matches, arranged by closeness. For each one you’ll see the Kit Number; Autosomal and X-DNA data; each Match’s name/alias, and their email. Maybe you’ll recognize some of the top Matches from your testing company… Now click on the hyperlinked A in the Autosomal/Details column – you’ll see the one-to-one comparison page come up with your two kit numbers already filled in – just click on Submit to “see” the shared segment(s), as described above.

Put on your genealogist hat and email any of your Matches and share Trees and info to discover how you are related. To fully use the DNA data, read my blogposts about Triangulation. It takes a while to get up to speed on the DNA analysis, so I highly recommend using your genealogy hat for a while and get to know your cousins….

If you are trying to relate cM values to cousinship, there is a wide range of possibilities. Check out the August 2017 chart at ISOGG here.


ADMIXTURE – Try a test run. On your home page, click on “Admixture (heritage)”. Select, say, Eurogenes, and click on Continue. Enter your kit number in the box; and click on Continue. Look over your results. Go back and try different parameters – each one will give different results. Such is the nature of admixture analysis – different utilities have different reference populations and algorithms. Don’t take any of them as gospel. Have a little fun and try the “Archaic DNA matches” to see how close you are to some ancient people, like Clovis man. Or try the “Are your parents related?” utility.


Comments to improve this post are welcomed.


Permission is granted to anyone who wants to include a link to this blogpost in their message to Matches just starting with GEDmatch.


[21B] Segment-ology: Getting Started with GEDmatch; by Jim Bartlett 20170919

MRCA Knothole Guidelines

A Triangulation CONCEPT

In the MRCA Knothole post, we looked at a funnel anchored to a genetic MRCA. In actual practice, we may not know the genetic MRCA for sure. We may well have determined several genealogy MRCAs in a TG, each from a different line (the lines don’t descend from each other). In this case you might think of multiple potential knotholes and funnels associated with each line. But ONLY ONE line can be correct. With only one or two Matches, we cannot draw firm conclusions. But as you and more Matches determine your MRCAs, a pattern usually emerges which gives you more confidence that one of the MRCAs is probably right. It may be that several Matches in a TG all agree on the same MRCA – my rule of thumb is the number of such Matches should be at least the number of Greats in the MRCA (e.g. four with a 4G grandparent MRCA (5C level)). You’d want at least four cousins in the TG who are all at least 1C apart from each other (not a parent, 2 children and an uncle), who are all descendants of the same 4G grandparents*. With multiple MRCAs, “walk the ancestor back” – my rule of thumb is at least a 2C or 3C (with the appropriate amount of total shared DNA) to anchor the funnel stem; and skip no more than one generation in the other Match cousins, on the walk “back”. Or use combinations of the above; or use judgment. This is your hobby, your genealogy – use your judgment! The above are my recommended guidelines to be reasonably safe – but always be prepared to adjust as new information comes in.


  1. For Matches who are cousins on the same MRCA: have at least as many such separated cousins as the number of Greats in the MRCA (e.g. five 6C)
  2. For Matches who are cousins with various MRCAs, then walk the ancestor back: have a 2C or 3C anchor, with other cousins at different levels back to the MRCA (e.g. a 2C, 4C, and 6C)
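The two guidelines can be sketched as simple checks. The function names are hypothetical, and a few readings are my assumptions: an nC cousin shares an (n-1)G grandparent MRCA (as in 5C with 4G grandparents), an anchor closer than 2C is also accepted, and "skip no more than one generation" is taken to mean consecutive cousin levels differ by at most two. The total shared DNA for the anchor is not checked here.

```python
def greats_in_mrca(cousin_level: int) -> int:
    """Number of Greats in the MRCA for an nC cousin (e.g. 5C -> 4G grandparents)."""
    return cousin_level - 1


def enough_same_mrca_cousins(cousin_level: int, separated_cousins: int) -> bool:
    """Guideline 1: at least as many separated cousins as Greats in the MRCA."""
    return separated_cousins >= greats_in_mrca(cousin_level)


def walks_ancestor_back(cousin_levels: list[int], mrca_level: int) -> bool:
    """Guideline 2: a 2C or 3C anchor (or closer, by assumption), reaching the
    MRCA level while skipping no more than one generation between cousins."""
    levels = sorted(set(cousin_levels))
    if not levels or levels[0] > 3 or levels[-1] != mrca_level:
        return False
    return all(later - earlier <= 2 for earlier, later in zip(levels, levels[1:]))


print(enough_same_mrca_cousins(6, 5))     # True: five 6C for a 5G grandparent MRCA
print(walks_ancestor_back([2, 4, 6], 6))  # True: 2C anchor, then 4C, then 6C
print(walks_ancestor_back([2, 6], 6))     # False: jumps from 2C straight to 6C
```

These checks only encode the rules of thumb; judgment still applies, as the guidelines say.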

[*NB: the four cousins from a 4G grandparent does not mean they all came from 4 different children of that grandparent. In fact that scenario is very, very unlikely. But the four cousins may be 1C or 2C or 3C to each other (not on your line). It’s possible they all came from only one child of the 4G grandparent.]

[08C] Segment-ology: The MRCA Knothole Guidelines! Concept by Jim Bartlett 20170910