Pro Tools Part 6

Watch Out…

BLUF: Do not rely heavily on Ancestry’s suggested relationships – I find the true relationship is rarely the top one in Ancestry’s long list of possibilities; it’s usually somewhat further down their list. The cMs I share with my Matches are always within the ranges in the Shared cM Project and at DNA Painter, but they are rarely at the average.

I’m reviewing all my Matches at the 3C level: 79 Matches with A16 MRCA couple; 92 with A18; 43 with A20, so far. None have been found to be outside the range of inter-relationships (perhaps 50% sampling); all are inside the appropriate ranges. BUT, two siblings may show vastly different values – one somewhat higher and one somewhat lower than the average. My engineer brain wants two siblings to have very close cMs, but the data is truly random (within the ranges of the Shared cM Project).

Bottom line: be careful, and don’t try to force a fit. Expect the values to be in the range for the relationship; but accept that they may be all over that range. And looking at it the other way, starting with a shared cM value, the relationship [of a Match without a Tree] will NOT necessarily (or even probably) be in a small set of Ancestry suggestions (although almost always on their long list – in the “Tree” view of a Match profile).

[22CN] Segment-ology: Pro Tools Part 6 Watch Out; by Jim Bartlett 20240711

11 thoughts on “Pro Tools Part 6”

  1. I have a controversial suggestion about the two siblings sharing vastly different amounts of DNA. One thing to check before chalking it up to random DNA inheritance is whether the siblings are male and female. If they are, subtract the female’s shared amount from the male’s shared amount – you’ll have a delta. That delta *might* be shared on the X chromosome and not reported by Ancestry. Default to the relationship indicated at Ancestry for the male. The amount of centimorgans shared on X can be an indication of the exact relationship path.

    Imagine a situation where you share two hundred something with the sister and three hundred something with the brother at Ancestry, but at 23andMe you’d share about the brother’s total with both of them, only with 45 or 60 or whatever of the sister’s amount being on X.
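The subtraction suggested in this comment can be sketched in a few lines. This is only an illustration of the commenter's arithmetic, not an established method: the function name and the cM values are hypothetical, and whether the delta really sits on the X chromosome would still have to be checked at a company that reports X-DNA.

```python
def sibling_delta(brother_cm: float, sister_cm: float) -> float:
    """Difference between the cM shared with a brother and with his full sister."""
    return brother_cm - sister_cm


# Hypothetical values, for illustration only (not taken from any real Match).
delta = sibling_delta(brother_cm=310.0, sister_cm=250.0)
print(f"Delta to compare against reported X-DNA elsewhere: {delta:.0f} cM")
```

A delta by itself proves nothing, since ordinary random inheritance between siblings can produce the same gap.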


      • Jim,

        Have you seen instances that support Marilynn’s theory that there’s a relationship between xDNA and aDNA inheritance in the many DNA matches you’ve studied? It’s not as though she’s done a scientific study of this, and yet she discusses this as though it is fact, not just her supposition.

        Thank you.


      • Mary, I think it’s an observation – and I think it should be considered when comparing among testing companies that include cMs from Chr X and those that do not. Personally, I don’t think as a genealogist that I should rely too much on an estimated relationship based on cMs. Our relationships are fixed – except for close relations (usually with multiple segments) there is generally a broad range of possibilities – I spend my time trying to figure out the genealogy. Jim


      • I want to encourage Mary to keep an open mind. I am not a scientist; I’m just looking logically at patterns in the various reporting methods of the testing platforms. I tend to think of the highest cM for any category at the Shared cM Project as likely being from a 23andMe test that includes X cM when present, and the lowest as an Ancestry result that excludes X cM when present. Blaine has cautioned me against that simplification, but I offer as an example the parent/child category at the Shared cM Project, where the highest values submitted are up between 3800 and 3900 cM. I think those results came from 23andMe and include 180ish cM on X when present, because the alternative would be to think that the reported amount of 3900 might be missing 180ish cM on X. Alternatively, if the lowest amount recorded, 2376, came from 23andMe and included 180ish cM on X, then if those two testers were tested at Ancestry their total shared autosomal would be 180ish lower than 2376. I don’t think it is illogical at all to suggest that the lowest recorded amounts for any category at the Shared cM Project are from Ancestry, or from a testing company that deducts X DNA from the total amount shared. If that is true, then it might also be true that results at the low end of the range for any category might be from an X path in that category, and that is super helpful information.


      • Marilyn, I’m an engineer, so I decided to run some tests. I’ve done an original DNA test at each of the 4 major DNA companies. And I’ve uploaded them all to GEDmatch. To see the difference from each of the companies, I compared my 23andMe kit to my 23andMe kit > 3588cM; and exactly the same result with my FTDNA, MyHeritage and AncestryDNA kits > 3588cM. Both tests were 22 chromosomes. I then did the same comparison on the X Chromosome: all were 190cM, except 23andMe was 192cM. So the raw data is virtually the same. You are correct that some companies treat the X chromosome differently, but IMO, it’s unfair to extrapolate the parent child results to the rest of our Matches who by and large share only one segment with us. About 5 years ago, I had every single shared segment from these 4 companies in one spreadsheet – I was Triangulating all of my segments (and deleting the fraction below 15cM that didn’t Triangulate as false segments). The spreadsheet was about 20,000 rows. I sorted it on Chr and segment start location. I had a number of Matches who had tested at two or more companies. Virtually every one of those multi-company Matches showed up in my spreadsheet within a few rows of each other. There were some very, very slight differences, so they were often not adjacent to each other, but they were close. I then sorted the spreadsheet by name – not a perfect method because the Matches sometimes use different names – but I highlighted the ones that were duplicates, and re-sorted by Chr and start so I could find them more easily. There were a handful (under 10) that were a Match at one company and not a Match at another company – they were all small cM Matches. But my takeaway was that all the companies were pretty consistent. Because they imputed some values, the MyHeritage Matches tended to be slightly different – but they would still Triangulate properly.
        A year or so ago I ran my own shared cM project based on about 10,000 Matches who I had determined were cousins of mine. As opposed to Blaine’s data, I felt my data was better curated/more accurate. Nevertheless, my averages for various cousinships were in the range of his results, except for the more distant cousins (which I had more of than he did), where my average cM was a little higher, and the closer 3C, where my average cM was a little lower. No biggie.
        Bottom line is IMO the Shared cM Project can be used as intended as a general clue to relationships. When I look at any cM value at DNA Painter, I see a lot of various possibilities for most average cM values – and I’ve seen many of them in my own work.
        So that’s my take – don’t try to tie yourself to any specific relationship (except very close family) based on cMs…. Jim
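The sort-by-chromosome-and-start check described in this comment could be sketched roughly as follows. This is not Jim's actual spreadsheet procedure: the row layout, the field names (`match`, `company`, `chr`, `start_mbp`), the tolerance, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict


def find_cross_company_duplicates(segments, tolerance_mbp=1.0):
    """Pair segments from different companies on the same chromosome with nearly the same start."""
    by_chr = defaultdict(list)
    for seg in segments:
        by_chr[seg["chr"]].append(seg)

    pairs = []
    for chr_segments in by_chr.values():
        chr_segments.sort(key=lambda s: s["start_mbp"])
        for i, a in enumerate(chr_segments):
            for b in chr_segments[i + 1:]:
                if b["start_mbp"] - a["start_mbp"] > tolerance_mbp:
                    break  # rows are sorted by start, so later rows are farther away
                if a["company"] != b["company"]:
                    pairs.append((a["match"], a["company"], b["match"], b["company"]))
    return pairs


# Hypothetical rows, for illustration only.
rows = [
    {"match": "J. Smith", "company": "FTDNA", "chr": 5, "start_mbp": 101.2},
    {"match": "Jane Smith", "company": "23andMe", "chr": 5, "start_mbp": 101.4},
    {"match": "A. Jones", "company": "MyHeritage", "chr": 5, "start_mbp": 140.0},
]
print(find_cross_company_duplicates(rows))
```

The pairs are only candidates: as the comment notes, names differ between companies, so each pair still needs a manual look.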


      • Thank you for taking the time to write about the test that you did. Your test and looking at segments is much more sophisticated than my use of the patterns I see. You acknowledged that how companies report X could come into play, and although you did not see anything significant in your comparison, remember that there are relationship paths where 180ish is still possible to share all the way into the 5th degree of distance, where 90ish is possible to share all the way into the 6th degree, 45ish is possible to share all the way into the 7th degree, and whoops, there it is: the 7th degree is where the category of 3rd cousin resides. The 7th degree averages .78% (1 out of 128, or 1/128 on average). The range around that average is .59% to 1.17% regardless of the testing company. At Ancestry, where 3300 is an average of 50%, I estimate that mathematical range to be 38.67 to 77.33. At 23andMe, where 3700 is an average of 50%, I estimate that mathematical range to be 43.36 to 86.71. So, for me, according to that logic, 3C can reasonably be expected anywhere from 38.67 to 86.71. But I know that the 86.71 would be inclusive, or mostly inclusive, of a reasonable expectation of up to 45 centimorgans on X, were 23andMe still showing the shared X cM that it includes in its totals. The vast majority of shared results at the Shared cM Project for the category of 3C are between 38.67 and 86.71. There are a good number of oddly high 3rd cousins that I do not have a suggestion to deal with, but there are 485 shared results from 0 to 25 that I think have a high likelihood of not being 23andMe results, and that just might be Ancestry results on an X path that actually share more centimorgans than Ancestry reports.
        So, speaking to the topic of this post: if you were trying to figure out how you are related to a set of full siblings, and you match the brother at 50 cM but only share like 7 cM with his sister, my program will give you some exact relationship paths to research where that situation is expected. Your mother’s father’s mother’s sister’s daughter’s son’s daughter and son would be something practical you could research, and that situation might happen if you shared 22.5ish to 45ish on X. I am not saying that this is the answer to every strange situation, but I saw this pattern frequently enough that I took the time to calc it all out and write the formulas in advance, in order to be able to quickly isolate the relationship paths in each category that might potentially share additional centimorgans not being reported. I really do think the lowest shared amounts for every category at the Shared cM Project could potentially be autosomal-only results for relationships where X cM exist but are unreported. The alternative would be that the lowest amounts are 23andMe results that include cM on X, which means the autosomal-only amount could be even lower than the lowest shared amount reported for the category. All of that, of course, depends on whether the people sharing the results actually knew for a fact the person was their 3rd cousin vs just turning in their relative match list to the project and calling it a day. I love your blog and thank you for entertaining my comments.
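The 38.67 to 77.33 and 43.36 to 86.71 figures in this comment can be approximately reproduced with a short calculation. The sketch below rests on two assumptions that the comment does not state outright: that the totals are 6600 and 7400 cM (twice the 3300 and 3700 quoted as 50%), and that the band runs from 0.75x to 1.5x the 1/128 average, i.e. the midpoints to the neighboring degrees. The function name is hypothetical, and this is not the commenter's own program.

```python
def degree_range_cm(degree: int, total_cm: float) -> tuple[float, float, float]:
    """Return (low, average, high) cM for a relationship degree.

    Assumption: the average share is total_cm / 2**degree, and the band runs
    from 0.75x to 1.5x that average (midpoints to the adjacent degrees).
    """
    average = total_cm / 2 ** degree
    return 0.75 * average, average, 1.5 * average


# Totals inferred from the comment: 3300 cM and 3700 cM each described as 50%.
for company, total in (("Ancestry", 6600.0), ("23andMe", 7400.0)):
    low, avg, high = degree_range_cm(7, total)
    print(f"{company}: {low:.2f} to {high:.2f} cM (average {avg:.2f})")
```

Under those assumptions the results land within about 0.01 cM of the quoted figures, which suggests (but does not confirm) that this is how they were derived.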


  2. Jim, you’ve been doing all this “DNA Stuff” a lot longer and a lot more than I have, but I agree on how the “engineer brain” wants to see more consistency as far as relationships go. But it gets even more frustrating when you’re doing the work with 3 sets of sibling DNA. I have an older and a younger sister, with me in the middle. Our total shared matches, in order, are: 102,261; 117,566; and 115,157. And the Close matches, in order, are: 7,815; 10,139; and 10,057.

    It amazes me how many relations the 3 of us don’t share or share with such varied segment sizes. It certainly has me interested in what I’ll find when I begin trying this new feature (I’m in the middle of a different project, right now).


    • Doug – for two siblings, I use the “rule” of 1/4 – Sib1 got 1/4 of each parent’s DNA which Sib2 did not get; Sib1 got 1/4 of each parent’s DNA which Sib2 also got; Sib2 got 1/4 of each parent’s DNA which Sib1 did not get; and 1/4 of each parent’s DNA neither one got. For three siblings it’s the “rule” of 1/8 – each sib gets 1/8 the others didn’t get; 1/8 of each parent no one got; 1/8 of each parent is shared by all 3; and each pair of sibs shares 1/8 which the other sib didn’t get. Of course, the biology doesn’t follow the “engineer rules”, and there are small deviations (as you see in the number of Matches). Jim
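The “rule” of 1/4 and 1/8 falls out of simply listing every way a given piece of a parent's DNA can be passed to each sibling or not. The sketch below assumes the idealized model behind the rule (each sibling independently has a 50% chance of receiving any given piece); the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import product


def inheritance_patterns(n_siblings: int) -> dict[tuple[bool, ...], Fraction]:
    """Map each pattern (which siblings got a given piece of a parent's DNA) to its expected share."""
    patterns = list(product([True, False], repeat=n_siblings))
    share = Fraction(1, len(patterns))  # all 2**n patterns are equally likely
    return {pattern: share for pattern in patterns}


def describe(pattern: tuple[bool, ...]) -> str:
    """Name the siblings who received the piece, or 'no one'."""
    holders = [f"Sib{i + 1}" for i, got in enumerate(pattern) if got]
    return " + ".join(holders) if holders else "no one"


for n in (2, 3):
    print(f"{n} siblings:")
    for pattern, share in inheritance_patterns(n).items():
        print(f"  {share} of each parent's DNA went to: {describe(pattern)}")
```

These are expected values only; real siblings deviate from them, as the differing Match counts above show.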


  3. One benefit of Ancestry’s proposed relationships is that it is theoretically taking into account the age of the testers (while keeping the actual ages private to our eyes), and this lets them propose solutions that are age appropriate, i.e., uncle instead of nephew.

    My current project has me mostly looking only at MoM > 400 cM, and there are fewer options to be wrong with at that level. Even then, I usually take whatever they suggest (like 1C1R) and then, if I have data on the MoM’s tree, I’ll copy the tree path notation assuming it’s a 2C, just to be safe. It sure would be nice if Ancestry let us sort on the strength of the MoM; scrolling through 20+ pages of data takes time and it’s easy to miss things.

    Once you get somewhere below 75 cM (maybe higher), I wouldn’t give a nickel for the estimate from Ancestry – there are just too many possibilities for what it could be. I think their reasoning for providing the proposal is to simplify things for their customers – but it often, as you point out, oversimplifies and guesses wrong – and people who don’t understand the Shared cM Project think that whatever it says is right, and it just gets everyone confused.

    Regards
    Brian Schuck

