Endogamy – Get the Facts!

Endogamy is sort of like a scare word for genetic genealogy. It raises a specter of something terrible that will cloud your Ancestry or your DNA. We often see Cluster diagrams with few, big blobs of color that are not very helpful. Oh, woe is me, I have endogamy…

I say: just the facts, Ma’am…  [which Joe Friday never really said]

Endogamy (and/or Pedigree Collapse) affects each person differently, but in a very specific way, in each case. Our DNA is fixed at conception – each of us has very precise crossover points in our DNA, separating contributions from various Ancestors. We have a very specific set of DNA segments from our Ancestors. Any endogamy has already had its effect, if any, on our DNA. Let’s just find out – let’s find out the facts!

1. Upload to GEDmatch.

2. Click on the free program: Are your parents related?

3. Read the intro paragraph

4. Enter your Kit#

5. Submit

6. View the results – green or red (match or no match)

Did GEDmatch find any substantial (over 7cM) green segments in your DNA?

NO? Then, even if you had some endogamy or pedigree collapse in your Ancestry, you did not get any identical DNA segments from your parents. This means there is no effect when comparing with your Matches – every shared DNA segment with a Match will be from your father’s DNA or your mother’s DNA – no confusion!

YES? Then you do have some effect of endogamy or pedigree collapse in your DNA. BUT, it is confined specifically to the green segments. Write those segments down! Remember those segments! Put those segments into your master segment spreadsheet (two copies: one for each chromosome)! In DNA Painter: add those segments (one on each side) to highlight these areas.

All the rest of your DNA is free from the effects of endogamy or pedigree collapse. All other shared DNA segments with a Match will be from your father’s DNA or your mother’s DNA – no confusion!
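
For those who keep their segments in a spreadsheet or script, the check above boils down to an overlap test. Here is a minimal Python sketch of that idea. The Segment record, the function names and the example values are all made up for illustration; this is not GEDmatch output or any GEDmatch interface.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    chrom: str   # chromosome label, e.g. "7"
    start: int   # start position
    end: int     # end position
    cm: float    # segment length in cM

MIN_CM = 7.0  # the "substantial" threshold used in this post

def substantial(segments, min_cm=MIN_CM):
    """Keep only the green segments over the cM threshold."""
    return [s for s in segments if s.cm > min_cm]

def overlaps(a, b):
    """True if two segments are on the same chromosome and share any positions."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end

def affected_by_endogamy(shared, green_segments):
    """True if a shared segment with a Match touches a substantial green segment."""
    return any(overlaps(shared, g) for g in substantial(green_segments))

# Hypothetical values, for illustration only
green = [
    Segment("7", 20_000_000, 35_000_000, 12.4),
    Segment("11", 5_000_000, 8_000_000, 3.1),   # under 7 cM, so ignored
]
match_a = Segment("7", 30_000_000, 50_000_000, 18.0)
match_b = Segment("11", 6_000_000, 20_000_000, 15.0)

print(affected_by_endogamy(match_a, green))  # True
print(affected_by_endogamy(match_b, green))  # False
```

Any shared segment that comes back False is in the "no confusion" part of your DNA.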

Get the facts for your DNA! Free up as much of your DNA as you can.

IMPORTANT EPILOGUE – This blog post addressed the genetic, or DNA, part of endogamy. TGs are more definitive than Clusters, for example. But, it did not necessarily lighten the load on the genealogy side of our hobby. The issues of which Ancestor to link to which DNA segment (TG) remain.

[16F] Segment-ology: Endogamy – Get the Facts! by Jim Bartlett 20220425 EPILOGUE added 20220426

Phasing Your Ancestors’ DNA?

Actually, it’s pretty easy to phase some of your Ancestors’ DNA. If you’ve ever formed a Triangulated Group (TG), you’ve already done it!

I’ve posted before (here) that TGs are phased data. We don’t really care what the SNP values are (A or C or G or T), just that they are “the ones” on one chromosome. We know this because all of our Matches in a TG have the same SNP values. That’s how we get the Shared DNA Segments and a TG. We are all in agreement about the segment represented by our matching SNPs.

One of the outcomes of Triangulation is to determine an Ancestral Line that passed down the TG segment to you. This is usually accomplished by determining multiple Match-cousins in the TG with Common Ancestors (CAs) to you on one Ancestral Line. This pretty much confirms the DNA came down through those Ancestors.

So, given a TG with a confirmed Ancestral line, each of your Ancestors in that line had to have the TG DNA segment in their DNA. They had to have that same phased DNA on one of their Chromosomes, in order to pass it down to you and a Match.

If you are just dying to know the ACGTs of this phased DNA, you’d need to collaborate with some of your Matches in the TG (it doesn’t necessarily need to be the ones with the CAs – they ALL should have the same phased SNPs). From your and your Match’s raw DNA data, compare all of the SNP pairs from the start to the end of the TG, and, for each position, list the single SNP that is the same.
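
As a rough sketch of that comparison in Python: assume each person's raw data for one chromosome has already been loaded into a dictionary of position to two-letter genotype (that layout, and the function names, are my assumptions for illustration, not any testing company's file format). Where both people are heterozygous for the same two letters, the shared SNP cannot be told from this pair alone, so the sketch skips that position.

```python
def shared_allele(mine, theirs):
    """Return the one allele common to both genotypes, or None if undetermined.

    Each genotype is an unphased two-letter string such as "AG".
    No-calls and anything other than A, C, G, T are ignored.
    """
    common = set(mine) & set(theirs) & set("ACGT")
    if len(common) == 1:
        return next(iter(common))
    return None  # no common allele, or both are the same heterozygous pair

def phase_tg(my_snps, match_snps, start, end):
    """List (position, allele) inside the TG where the shared allele is unambiguous."""
    phased = []
    for pos in sorted(my_snps):
        if start <= pos <= end and pos in match_snps:
            allele = shared_allele(my_snps[pos], match_snps[pos])
            if allele is not None:
                phased.append((pos, allele))
    return phased

# Hypothetical genotypes, for illustration only
me = {100: "AG", 200: "CC", 300: "AG", 400: "TT", 500: "--"}
match = {100: "AA", 200: "CT", 300: "AG", 400: "TT", 500: "CC"}

print(phase_tg(me, match, 100, 450))  # [(100, 'A'), (200, 'C'), (400, 'T')]
```

Adding a second or third Match from the same TG would usually settle the positions that one pair leaves open.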

Most of us won’t do this, but we can be content knowing that the data is phased in the TG and is identified by the Matches with specific shared DNA segments. We know the Chromosome, the side, the start position and the end position – the phased data is locked in.

But that’s not all…

Each of our Matches in a TG almost certainly got a different DNA segment down their Ancestral line – often starting sooner (on the Chromosome) or ending later. Their TG would be different – usually adding some Matches to their TG and not including some of the Matches I had in my TG – and many of the same Matches will still be included. In effect their TG is “offset” some from mine. And their TG is also phased data. And each such Match TG may add more to the phased data of the Most Distant Common Ancestor, and often some of the intermediate Ancestors, depending on where the Match ties into your line.
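
One way to picture the “offset” TGs adding up is to merge their start and end positions. The short Python sketch below does that under a big assumption: that every TG listed is on the same Chromosome and has already been tied to the same Ancestor. The positions are invented for illustration.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) spans into a sorted list of separate spans."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous span: extend it if this one ends later
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Hypothetical positions: my TG, plus two Matches' offset TGs
my_tg = (40_000_000, 55_000_000)
match_tgs = [(36_000_000, 50_000_000), (45_000_000, 61_000_000)]

print(merge_intervals([my_tg, *match_tgs]))  # [(36000000, 61000000)]
```

In this made-up case the Ancestor's documented phased stretch grows from my 15 Mbp to 25 Mbp.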

And that’s not all, either…

Most of our Ancestors are linked to multiple TGs. I have 372 TGs that cover my DNA. That means, on average, about 1/4, or 93, of my TGs come through each of my grandparents. Put another way, my TGs would “cover” about 1/4 of the DNA of each of my grandparents – with my data alone, I could determine phased data for 1/4 of each of my grandparents. Even each 3xG grandparent would average 11 or 12 TGs (372 TGs spread over 32 Ancestors).
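
The averages are simple division: the number of TGs spread evenly over the number of Ancestors in a generation. A small Python sketch (the function name is mine, and real inheritance is never this even):

```python
def avg_tgs_per_ancestor(total_tgs, generations_back):
    """Average TGs per Ancestor, assuming TGs are spread evenly over 2**generations Ancestors."""
    return total_tgs / 2 ** generations_back

total = 372
for label, gens in [("grandparent", 2), ("3xG grandparent", 5)]:
    print(f"{label}: {2 ** gens} Ancestors, about {avg_tgs_per_ancestor(total, gens):.1f} TGs each")
# grandparent: 4 Ancestors, about 93.0 TGs each
# 3xG grandparent: 32 Ancestors, about 11.6 TGs each
```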

And there is more…

My siblings and cousins have DNA that I don’t have, from each of my Ancestors. Their TGs could document more phased data in my Ancestors.

There is a limit…

Generally, a parent, or any Ancestor, does not pass down all their DNA to their children – some is lost, forever.

So, quite a journey…  And like most journeys, we need to take it a step at a time. The first step is to Triangulate your own genome. And then work on linking each TG back to a CA (and thus to the Ancestral Line down to yourself).

[14A] Segment-ology: Phasing Your Ancestors’ DNA? by Jim Bartlett 20220420

Insights on Clustering vs Triangulation

A Segmentology TIDBIT

Triangulated Groups will cover an Ancestral line; Clusters tend to focus on a Common Ancestor.

Think about this for a moment. A Triangulated Group is formed around one segment of your DNA. This segment of DNA was passed down to you from your mother or your father. This segment of DNA was first formed in one of your Ancestors – as a part of their DNA which was passed down through a line of your Ancestors to you. Let’s call this Ancestor your “first” Ancestor, with respect to the DNA represented by the TG segment. Generally:  this TG segment probably started as part of a somewhat larger segment of DNA in that “first” Ancestor; it probably got whittled down by recombination along its journey down to you; and portions of the larger segment were also passed down through several children of this “first” Ancestor to other people who became your Matches (because they shared this DNA with you). The bottom line is that you may have a first cousin (1C) who shares part (or all) of this TG segment of DNA with you. You may also have a 3C or a 5C or an 8C who shares part of this TG segment of DNA with you. The point is that within a TG, you may well have Matches who are cousins over a wide range – back to any of your Ancestors between your parent and the “first” Ancestor to have the TG segment DNA. In fact, among your TG Matches there may be cousins beyond your “first” Ancestor – these Match-cousins would share smaller pieces of the TG segment that came from Ancestors of your “first” Ancestor.

Now let’s shift and think about Clusters. Clusters are formed from Matches who are Shared Matches with each other. The Shared Matches in a Cluster *tend* to match most of the other Shared Matches in the Cluster. That’s why we see Cluster diagrams with squares which are almost solid – most Matches match most of the other Matches. This usually happens when the Matches have the same Common Ancestor. Think about the LEEDS method – the focus is on four Clusters, each one representing a different one of your four grandparents. As the lower cM threshold is reduced, more Matches are included in the analysis, and more Clusters are formed. These Clusters *tend* to drift away from grandparents and form around more distant Ancestors. Although it is not a “rule” or “requirement”, it does seem that each Cluster is centered on a specific Ancestor. However, sometimes a Match in a Cluster may be related through an Ancestor a generation closer or farther than most of the Matches. This is because the range of relationships is not rigidly tied to cMs – the smaller the cMs in a Cluster, the larger the range of possibilities. This is also due to the fact that a close Match will be included in one of the Clusters – unless the upper cM limit on Clustering is lowered to preclude close cousins. Beyond the 4-generation LEEDS Clusters, the Clusters with smaller and smaller cMs get more and more “messy”, with more and more exceptions to the one Ancestor per Cluster concept. But the “tendency” remains: the Clusters “tend” to form with Matches who have the same Common Ancestor. NB: if you want the Clusters to point to one Common Ancestor, you should either adjust the upper cM limit, or manually cull out Matches who are clearly closer cousins.

A few years ago, I Clustered all of my FTDNA Matches (roughly 8,000 of them). I had already Triangulated them into about 370 TGs. I got about 350 Clusters. In both cases there were about 5% of the Matches who didn’t Cluster or Triangulate – these Matches were all under 15cM (most under 10cM) and were the same Matches in both cases – they were false Matches. There was very close to 100% correlation between the Cluster and the TGs (in other words the Matches in each Cluster had the same TG). My conclusion was/is that the Cluster Common Ancestor was the same as the “first” Ancestor for the TGs [I only wish I knew, for sure, who that CA was…]
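
If you want to try the same comparison on your own data, here is one hedged way to score it in Python. It assumes you have two lookups, Match to Cluster and Match to TG (the names and IDs below are invented), and reports for each Cluster what fraction of its Matches fall in that Cluster's most common TG. This is only a sketch of the idea, not the procedure I used at the time.

```python
from collections import Counter, defaultdict

def cluster_tg_agreement(cluster_of, tg_of):
    """For each Cluster, return (most common TG, fraction of its Matches in that TG)."""
    tgs_by_cluster = defaultdict(list)
    for match, cluster in cluster_of.items():
        if match in tg_of:  # skip Matches that did not Triangulate
            tgs_by_cluster[cluster].append(tg_of[match])
    result = {}
    for cluster, tgs in tgs_by_cluster.items():
        tg, count = Counter(tgs).most_common(1)[0]
        result[cluster] = (tg, count / len(tgs))
    return result

# Hypothetical Matches, Clusters and TG IDs, for illustration only
cluster_of = {"M1": "C1", "M2": "C1", "M3": "C1", "M4": "C2", "M5": "C2"}
tg_of = {"M1": "01A", "M2": "01A", "M3": "01A", "M4": "05C", "M5": "07B"}

for cluster, (tg, share) in cluster_tg_agreement(cluster_of, tg_of).items():
    print(cluster, tg, f"{share:.0%}")
```

A share near 100% for every Cluster is the kind of agreement described above; a low share flags a Cluster that spans more than one TG.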

Bottom line: you should be able to “Walk the Ancestor Back” with different Matches in a TG; and you should see most of your Matches in a Cluster as cousins with the same Common Ancestor (with maybe a few Matches being a little closer or farther cousins).

[22BE] Segment-ology: Insights on Clustering vs Triangulation TIDBIT by Jim Bartlett 20220413