This post is about the distribution of our DNA segments (as represented by TGs) among our Ancestors. It’s gotten long and convoluted, so I am going to post it in pieces. This is Part 1.
I currently have over 110,000 DNA Matches – they are mostly spread over 3/4 of my Ancestry from Colonial America – mostly Virginia. [1/4 of my Ancestry is from my maternal grandmother whose parents were immigrants in the 1860s – and I get relatively few DNA Matches on these Ancestors]. So I am thinking about the distribution of say 90,000 Matches over three of my grandparents from Colonial Virginia, and the chore of finding Common Ancestors (CAs) linked to DNA segments – represented by Triangulated Groups (TGs). I’m musing about the Big Picture – the distribution of TGs in our Ancestry. From a macro view can we learn something? Can we predict something?
These Matches represent a lot of different DNA segments passing down to me (and to my Matches) from my Ancestors. However, I do know something about these “different DNA segments” – I have 372 TGs – each one representing a segment of my DNA from one parent or the other. These 372 TGs are the equivalent of phased data that “cover” all of my DNA – they are arranged, adjacent to each other, from one end of each chromosome to the other. That’s an average of 186 TG segments on each side.
When I look back at my blog post that used simple math to estimate the number of DNA segments we typically get at each generation – on one side – it shows:
Ancestry used to provide Circles back to the 8C level… They currently limit ThruLines to the 6C level, but at one point clearly acknowledged that the shared DNA segments could come from at least the 8C level, and the Circles they presented showed that plenty of folks had Trees with CAs at that level. I have a lot of evidence that indicates atDNA “works” back to at least the 8C level.
At this point in my musing, I’d like to reflect on several “rules”, or guidelines, or pretty valid assumptions about autosomal DNA.
1. Although atDNA is random, the larger the sample size, the closer to the calculated averages we tend to come. This means that we may find some instances of outliers, but with enough data, it averages out. We see this in several instances – a sibling may share a large part of one chromosome with us, but, on average, the total share will be closer to 50% – some chromosomes may be passed to us from a grandparent intact (no grandparent crossover points), but, on average, the total grandparent crossovers will be closer to the reported averages.
2. Science indicates about 34 crossovers occur per generation on each side. In other words, the biology says our DNA is not pureed into mush and then passed down to the next generation in many little pieces. This means at each generation there are relatively few new subdivisions of the parent’s DNA that is passed to a child (average 34), and over several generations many previous segments are not subdivided by crossovers. [NB: in the closest generation – the grandparents’ DNA passed to us by our parents – the crossovers may be closer to 27 from the father and 41 from the mother, but this difference damps out as we go back in generations. For the big picture, I’m using an average of 34.]
3. Shared DNA follows a 1/4 “rule” with each generation (rather than a 1/2 “rule”). We see this in the average of 880cM with a 1C, and 220cM with a 2C, and 55cM with a 3C, etc. This means the shared DNA drops off quickly with more distant generations. It’s also the reason why we only share DNA with about 1/2 of our 4C and maybe 1/10 of our 5C. However, even given this steep drop off, we have so many 6C to 9C that we still get shared DNA with some of them. Our DNA Match lists are filled with folks who are probably distant cousins…
4. We will almost never see Matches who descend from more than two different children of a distant Ancestor, who share the same (overlapping) DNA segment with us. This one is hard to explain, but it starts with the very low probability that a 7C shares a segment of DNA with us that we each got from the same 6xG grandparent – each of us descending from a different child in that family. Now add to that another Match who shares the same DNA segment with both of us and they descend from a third child in that family. You and a sibling will share about 1/4 of a parent’s DNA. If you add another sibling into the mix, the three of you will only share about 1/8 of a parent’s DNA. Considering the 50/50 chance that a segment will get passed along in the next generation, it gets to be very long odds that you and two Matches will share DNA from a 6xG grandparent’s three children. The science says it’s virtually impossible to add a descendant of a fourth child into a Triangulated Group. I discussed such a scenario in Chapter 1 of “Advanced Genetic Genealogy” and concluded that some Matches with the same TG who descended from 3 different children of the CA probably had mistakes in their genealogy. NB: this “rule” does not preclude multiple Matches in a TG from a distant Ancestor – I have several valid examples of 10 to 20 (or more) Matches from a distant Ancestor. Some of them can be closer cousins to each other, and actually descend from the same child of the distant Ancestor. Also, some of them share different TGs. Bottom line, we won’t see Matches with the same TG descending from more than two children of a CA. I’ll use this “rule” later…
5. The DNA doesn’t play favorites. The DNA process of recombination and crossovers does not have any knowledge of a person’s status, religion, wealth, family size, health, surname, endogamy, etc. etc. The process of recombination is carried out at the cell level, without regard to any external factors. So, from the DNA’s perspective, our Ancestors are equivalent. We should expect the same number of segments (TGs) from large or small families; from recent immigrants or Colonial Ancestors; from endogamous family branches or branches without endogamy; etc. However, we can expect a wide range in the number of Matches based on these kinds of factors. But they will not affect the distribution of DNA segments among our Ancestors. The TGs will be distributed randomly – roughly in line with the table above.
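The arithmetic behind rules 3 and 4 above can be sketched in a few lines of Python. This is only an illustration of the averages quoted in the text, not a genetic model: the function names are made up for this example, the 880cM starting point is the 1C figure from rule 3, and the sibling calculation treats each spot on the chromosome as an independent 50/50 draw, which ignores the fact that DNA is passed down in long linked segments.

```python
from fractions import Fraction

# Average shared cM with a first cousin, as quoted in rule 3.
FIRST_COUSIN_CM = Fraction(880)


def expected_shared_cm(cousin_degree: int) -> Fraction:
    """Average shared cM with an nth cousin under the 1/4 'rule'.

    Each additional cousin degree divides the average by 4.
    """
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    return FIRST_COUSIN_CM / 4 ** (cousin_degree - 1)


def fraction_shared_by_all(children: int) -> Fraction:
    """Fraction of a parent's total DNA that ALL of the given children
    inherit in common, assuming (hypothetically) an independent 50/50
    draw at every location. One child gets 1/2, two share 1/4, etc.
    """
    if children < 1:
        raise ValueError("children must be 1 or greater")
    return Fraction(1, 2) ** children


if __name__ == "__main__":
    for degree in range(1, 9):
        cm = float(expected_shared_cm(degree))
        print(f"{degree}C average: {cm:.3f} cM")
    for kids in (2, 3, 4):
        print(f"{kids} children share {fraction_shared_by_all(kids)} of a parent's DNA")
```

Keep in mind that the averages for distant cousins include the many cousins who share no DNA at all, so a cousin who does show up as a Match will share far more than the average suggests.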
So back to my musings….
I believe my TGs come, on average, from my Ancestors around the 5C level – 4xG grandparent couples. Clearly there will be some type of bell-shaped distribution curve, and some will be a little closer and some more distant. At that level, all of my 16 paternal 4xG grandparent couples would pass down about 193 TG segments, averaging about 12 per couple (Note the 193 in the chart above). Probably an average of 6 on the paternal side and 6 on the maternal side. These TGs have to come from somewhere – meaning 6 TGs, average, from each of the 5xG grandparent couples – either as intact segments and/or through recombination. Each of our results may vary some; but we should see these orders of magnitude (which are not large). The total contribution of DNA segments from all the 5xG grandparents, on one side, will add up to a full set of autosomal chromosomes, on one side.
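The per-couple averages in the paragraph above are simple division, and a short Python sketch makes the numbers easy to check or to rerun with your own TG count. The function names are invented for this illustration; the 193 is the figure quoted above for one side, and, following the paragraph, the same total is spread over the couples at either level.

```python
def couples_per_side(g_level: int) -> int:
    """Number of Ancestor couples on ONE side (paternal or maternal)
    at the kxG grandparent level: 1 couple of grandparents (k=0),
    2 couples of great-grandparents (k=1), doubling each generation.
    """
    if g_level < 0:
        raise ValueError("g_level must be 0 or greater")
    return 2 ** g_level


def tgs_per_couple(total_tgs: int, g_level: int) -> float:
    """Average number of TG segments per couple if total_tgs segments
    are spread evenly over the couples at the given level."""
    return total_tgs / couples_per_side(g_level)


# Figure quoted in the post for one side; swap in your own count.
TGS_ONE_SIDE = 193

if __name__ == "__main__":
    for level in (4, 5):
        couples = couples_per_side(level)
        average = tgs_per_couple(TGS_ONE_SIDE, level)
        print(f"{level}xG level: {couples} couples, about {average:.1f} TGs per couple")
```

An even spread is only the average case; as noted above, the real distribution will be some kind of bell-shaped curve around these values.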
This is the end of Part 1. In Part 2 I’ll muse more about where TGs are formed and their journey from the Ancestors down to us and our Matches. Part 3 will look at conclusions (what can we learn from all of this), and propose a new spreadsheet to track TGs through other children of our Ancestors (what we can do!).
[15H] Segment-ology: Distribution of TGs – Part 1 by Jim Bartlett 20211001
Jim: I’ve been thinking about your Rule #4 in a different way, and I wonder if I’m missing what you are saying. In your example, you talk about 7C’s who descend from children of the same 6xG grandparent and who all share a relatively small, specific segment of DNA. That DNA segment is the basis for a TG, so at least a few 7C’s are involved. I’m one of the 7C’s, and let’s say I descend from child A. Rule #4 says that probably all the other 7C’s are descendants of child B and it is highly unlikely that any members of the TG descend from child C or child D. But, let’s say the 6xG grandparent had 8 children. And let’s say that the specific DNA segment that we’re talking about got passed down to 4 of his 8 children – children A, B, C and D all have the segment and children E, F, G and H do not have the segment. As I said, I’m a descendant of child A. But it seems to me that since children B, C and D all have the segment in question, members of the TG could be descendants of any of these three children with about equal probability. Does this make sense to you? Am I missing the point of your Rule #4?
John, Rule #4 is about 2 children for Matches (in addition to the one for me). We are very lucky when we find a 7C DNA Match – the math says the probability is less than 1 percent. In other words, out of 100 true 7C who have tested, we match only one of them, and sometimes none. I need to find the scientific article with the math that says getting three 7C (sharing the same DNA segment) from three different children of the CA is impossible. I believe it. This is not to say we cannot have 7C Matches from 4 or 5 children of the CA (I see that regularly), it’s just that they cannot have the same TG. Does this clear it up?
Thanks Jim – I too would be interested in this scientific article with the maths of there being three 7Cs sharing the same DNA segment from three different children of the common ancestor – when this emerges. Thank you.
Louis – the short answer is no! I think the randomness of DNA swamps any precision in this area. IF you are only working on the grandparent segments, as in Virtual Phasing or curiosity or mapping that generation only, then I would expect to see this kind of imbalance. But among those segments from your mother or your father there is a roughly 50/50 mix of DNA from their parents, and the difference damps out. Even if you worked on an all-male or all-female line, it has a contribution from the other side at every generation. In my mind this is like predicting the cM of a known 3C – there is a huge range… My point in all of this is to highlight an order of magnitude, not a predictive tool. Think about it and let me know your thoughts… Jim
That does make sense, Jim. Thanks. I guess it also might make more of a difference when you’re working top down, determining the segments passed from an ancestor down to his/her descendants. But not, as you say, when working bottom up from your own DNA’s point of view.
I hope it makes more sense when I muse my way through Part 2…
Jim: Do you think the crossovers of 27 from the father and 41 from the mother may make a difference if you’re looking at the patrilineal or matrilineal lines?