segment-ology

Pro Tools Part 5

Featured

Posted on July 7, 2024 by Jim Bartlett

Small Segments At Work

I have Match A at 25cM with no Tree, but with a nephew at 1773cM: Match B, who has a Tree. Match B is a 3C3R on my Ancestor 16P. Analysis of the 1950 census and his grandmother’s obit gave me the same name as Match A and a place in my Tree. Match B is 9cM to me. Match B has another uncle at 1771cM, Match C. Match C is also listed by name in the grandmother’s obit. Match C is 8cM with me. And, sure enough, Match A and Match C share 2315cM [siblings] with each other [corrected 7/9/2024].

This is about as solid as it gets. Clearly the 9cM Match B and the 8cM Match C are true cousins to me, per genealogy. Each of these Matches shares one DNA segment with me. Although this data doesn’t “prove” these 3 segments are the same and linked back to our MRCA 16P, I’d be willing to wager that an upload to GEDmatch would show that these segments Triangulate, and match many other segments from MRCA 16P. In a genealogy sense it doesn’t matter: these Matches belong in my Tree – with or without a DNA link.

I find this example compelling. The old saw: when you hear hoofbeats, think horses. Yes, zebras are a possibility, but the odds in the USA are way in favor of horses. These individuals show up as DNA Matches to me – they share a segment of DNA with me. Some segments are small, some are large. When they come from such a tight fit in one part of my Tree, I’m inclined to believe that they are the same segment. It is “possible” that they each got a randomly different segment, or even false segments, but the logical reasoning is that they share part of the same segment from an MRCA. Why not just accept that for now? Perhaps, someday, some alternative will come up – even so, it would not change the genealogy backed up by records.

Icing on the cake – in reviewing Match C’s shared Matches, Match D (8cM to me) is 3476cM (a daughter) of Match C – another add to my Common Ancestor spreadsheet and to my Tree.

Bottom Line: ProTools is providing a lot of great bread crumbs to follow; and linking a lot of small cM Matches to my Tree. Be sure to scroll to the bottom of a ProTools Shared Match list, looking for high cM interrelationships! Don’t discard genealogy “finds”, just because they share small cMs.

[22CM] Segment-ology: ProTools Part 5 Small Segments At Work; by Jim Bartlett 20240707

Pro Tools Part 4

Posted on July 5, 2024 by Jim Bartlett

The Spreadsheet

By popular request, below is a section of my Common Ancestor Spreadsheet. Shown are most of the essential columns. In order to fit the space I have in this blog, I’ve deleted a number of columns that I use to record emails, TGs, Notes, Y or mtDNA possibilities, etc. – they are not pertinent to the point of this post. On the far right are 3 columns for cMs between a Match and the *Match for that column.

Common Ancestor Spreadsheet with columns for Shared cM between Matches

Note this part of the spreadsheet is for DNA cousins on my Ancestor John H BARTLETT b 1804 (married to Sarah FLEMING). For each Match, I have their Name, any Admin, cM (with me), # segs, Ahnentafel of MRCA couple (all are 16 in this section), Cousinship; and then the given name and birth year of the child of the MRCA through which they descend; same for grandchild; and Great grandchild; and then a column for more descendants if desired (all in one cell – and I usually run this out – down to the Match). The ** in green means that Match (and the path) is in my Tree. The next columns are for entering an * for a Key *Match and the amount of shared cM between the other Matches and that *Match. You’ll note near the bottom of the spreadsheet, child James b 1836 is listed – he is the child that I descend from (and cousins on him would be under Ahnentafel 8). Hundreds of other MRCA couples, and thousands of Match cousins, are in other sections – all sorted by Ahnentafel # and birth year columns.

To do a perfect matrix, I’d need to have 67 columns to show all of the pair-wise relationships. I think I can get a pretty good picture from only one Match per grandchild. And, of course, as I find Shared cMs over about 100, I usually go down each of those rabbit holes and wind up adding most of those Matches to my spreadsheet.

Please feel free to use as much of this format as you like, AND to add/delete/shift columns to suit your own style of research and analysis.

[22CL] Segment-ology: ProTools Part 4 The Spreadsheet; by Jim Bartlett 20240705a

Pro Tools Part 3

Posted on July 5, 2024 by Jim Bartlett

BLUF – The matrix which can be created by all the shared cM relationships is also showing the range of cousins who don’t Match each other.

I have now shifted to using my Common Ancestor Spreadsheet to analyze the cMs between my Matches. This spreadsheet lists about 9,000 Matches who are known cousins on specific Ancestors (a small percentage of Matches share multiple Common Ancestors with me). The backbone of this spreadsheet is a list of all my Ancestor couples out to 8C level (and some beyond), with columns for their Ahnentafel number (e.g. 16) and the husband’s birth year. Under that goes a row for each Match with that Ancestor as a Common Ancestor, with the Ahnentafel number, the cousinship (e.g. 3C1R), and the given name and birth year of the child of the CA couple through whom the Match descends. The next two columns to the right are the Match’s Ancestor who is the grandchild of the CA, and their birth year; etc. With this setup, I can sort on Ahnentafel Number and the first birth year column and then the second birth year column and the whole spreadsheet sorts into family groups.
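The three-level sort described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`ahnentafel`, `child_birth`, `grandchild_birth`) and the sample Matches are hypothetical and are not taken from the actual spreadsheet.

```python
from typing import NamedTuple

class MatchRow(NamedTuple):
    name: str
    ahnentafel: int        # Ahnentafel number of the Common Ancestor couple
    cousinship: str        # e.g. "3C1R"
    child_birth: int       # birth year of the CA couple's child on the Match's line
    grandchild_birth: int  # birth year of the grandchild on the Match's line

# Hypothetical rows, deliberately out of order.
rows = [
    MatchRow("Match Q", 18, "4C", 1840, 1868),
    MatchRow("Match R", 16, "3C1R", 1836, 1861),
    MatchRow("Match S", 16, "4C", 1830, 1859),
    MatchRow("Match T", 16, "3C1R", 1836, 1858),
]

def sort_into_family_groups(rows):
    """Sort on Ahnentafel, then the first birth year column, then the second."""
    return sorted(rows, key=lambda r: (r.ahnentafel, r.child_birth, r.grandchild_birth))

sorted_rows = sort_into_family_groups(rows)
for r in sorted_rows:
    print(r.ahnentafel, r.child_birth, r.grandchild_birth, r.name)
```

Sorting on a tuple key keeps descendants of the same child, and then of the same grandchild, in adjacent rows, which is what makes the family groups visible.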

I am now selecting a Match and entering a * in a new column; and then, in that column, the cM of their closest Matches already in the spreadsheet. [NB: As previously reported, I’m also finding Matches who are very close relatives to the *Match (sometimes a parent or child or 1C), which causes me to go down that rabbit hole – which, in turn, frequently results in a new known cousin Match added to the spreadsheet – it’s like drinking through a fire hose.]

Anyway, as I now look down the amount of Shared cM between Matches (in a * column), I can clearly see the parents/children, siblings, aunts/uncles/nieces/nephews and close 1C and 2C in close rows of the spreadsheet. The Shared cMs get smaller and smaller up and down the spreadsheet – in fairly predictable order as the spreadsheet has different “layers” of relationships – it’s very comforting to see this pattern. Mind you, it’s not a straightforward “curve” – there is the same “jumble” that is reflected in the Shared cM Project cMs – the overlap of possible ranges among different cousinships.

The other thing that is showing up under a *Match, is that not all the 3C or 4C or 5C are showing up as Matches. This is expected. Remember the rough estimates that true 3C only match 90% of the time; and 4C only match about 50% of the time; etc. I would need to have 9,000 columns, to perform a full analysis, and that probably isn’t in the cards. Perhaps one of the 3rd party programmers can come up with an automated program to do this…
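To put rough numbers on this, here is a small Python sketch. It assumes only the figures quoted above (about 9,000 Matches, and rough match rates of 90% for 3C and 50% for 4C); the function names and the sample pair counts are hypothetical.

```python
from math import comb

# Rough match rates quoted in the text; other cousin levels are left out
# because the post does not give figures for them.
MATCH_RATE = {"3C": 0.90, "4C": 0.50}

def pairwise_comparisons(n_matches: int) -> int:
    """Number of distinct Match-to-Match pairs in a full matrix."""
    return comb(n_matches, 2)

def expected_matching_pairs(pairs_by_level: dict) -> float:
    """Expected count of pairs that actually show up as Matches to each other."""
    return sum(MATCH_RATE[level] * count for level, count in pairs_by_level.items())

print(pairwise_comparisons(9000))                       # pairs in a full 9,000-Match matrix
print(expected_matching_pairs({"3C": 10, "4C": 10}))    # hypothetical: 10 pairs at each level
```

Even with every cousinship correct, about 6 of those 20 hypothetical pairs would be expected not to match each other, so blanks in a * column are not by themselves a sign of an error.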

Bottom line: for now, it appears the concept of “true cousins don’t always match each other” is alive and well in the Shared cM data…

[22CL] Segment-ology: ProTools Part 3 TIDBIT; by Jim Bartlett 20240705

Pro Tools Part 2

Posted on July 2, 2024 by Jim Bartlett

A ProTools Epiphany…

As I walk down my hitherto unknown Matches, I’m setting up a small spreadsheet for each group.

Important: Starting with a “base” Match, scroll through *all* the Shared Matches – looking for those who share lots of shared cM with each other. Generally 90cM is a good threshold – these Shared Matches (with each other) would generally be 1C or 2C to each other. If I drop down to about 50cM, I get 3C & 4C too. Feedback from the LEEDS method indicates these over-90cM Matches tend to share the same grandparent. This can occur between two Matches who share much less cM with you. Your Matches may be fairly distant; but among themselves they are closely related. Often, some of these Matches are known to you – either through a good Tree or a ThruLines clue (with reference material). Looking through *all* of the Shared Matches, and then through *their* Shared Matches, I’ve usually found a group of Matches who are closely related to each other on some branch of my Tree.
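One way to picture this grouping is to treat every pair of Shared Matches at or above the threshold as linked, and then collect the linked Matches into groups. The Python sketch below does that. The names and cM values are hypothetical, and linking by simple connectivity is an assumption made for illustration, not a description of any Ancestry feature.

```python
# Hypothetical cM shared between pairs of Shared Matches, typed in by hand.
pair_cm = {
    ("Ann", "Bob"): 212,
    ("Bob", "Cal"): 95,
    ("Cal", "Dee"): 61,
    ("Eve", "Fay"): 130,
    ("Ann", "Eve"): 22,
}

def group_matches(pair_cm, threshold):
    """Group Matches connected by at least one pair sharing >= threshold cM."""
    neighbors = {}
    for (a, b), cm in pair_cm.items():
        neighbors.setdefault(a, set())
        neighbors.setdefault(b, set())
        if cm >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    groups, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Depth-first walk over the links to collect one group.
        stack, group = [start], set()
        while stack:
            person = stack.pop()
            if person in group:
                continue
            group.add(person)
            stack.extend(neighbors[person] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

print(group_matches(pair_cm, 90))  # tighter groups (roughly 1C/2C to each other)
print(group_matches(pair_cm, 50))  # looser groups (3C and 4C start to join)
```

Dropping the threshold from 90 to 50 pulls Dee into the first group, which mirrors the observation that the lower cutoff brings in 3C and 4C as well.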

Epiphany: At this point it is not critical, or even necessary, to “pin the tail on the donkey” precisely. These Matches may well be 3C or 4C or 5C or more to you, but they collectively anchor a sub-branch of your Tree. The fact that they share high-cMs with each other, is a very strong indication that their bond is strong and correct [classic genealogy triangulation]. And, even though they are more distantly related to you, their *grouping* is a strong indication that they are related to you through your Common Ancestor to that sub-branch.

Each of the Matches in this sub-branch (including those without Trees of their own), becomes a strong “tell-tale” that tracks a Shared Match Cluster and/or a Triangulated Segment of your DNA.

There is so much new ground to cover here, that I’m now shifting my focus. Instead of trying to fit each Match into a specific place in my Tree, with detailed genealogy research, I’m just highlighting the groups who are clearly descended from a specific person in my Tree. This specific person may be a child, or grandchild, or great grandchild of my Ancestor. At this point, it doesn’t add any more value to my Tree as a whole to know exactly how they relate to each other – just that they do closely relate to each other. Five or ten or twenty of my unknown Matches are now under a grandson of one of my specific Ancestors – although, they may all be around 5C to me. Other Matches cannot be in that sub-branch, unless they share an appropriate amount of DNA with others in that group. So, using inverse logic, we must find a different sub-branch for these other Matches.

This process remains a hoot, and a game changer at Ancestry. I really think a large percentage of our Matches can now be correctly put into sub-branches of our Trees. This also highlights Match groups which will be helpful in getting through Brick Walls. Every IBD Match has to tie into some part of our Tree – ProTools is looking like a great tool to help place those Matches, and perhaps identify some small-cM false Matches. For me this clearly helps identify small cM Matches who are related within a genealogy timeframe (as opposed to being very distantly related).

[22CK] Segment-ology: ProTools Part 2 TIDBIT; by Jim Bartlett 20240702

My Take on Ancestry Pro Tools

Posted on June 29, 2024 by Jim Bartlett

A Segment-ology TIDBIT

Ancestry ProTools includes several features. ProTools costs $10/month extra on an existing Ancestry account. This post is focused on the cMs between Shared Matches. I’ve fiddled with it for a few days, and, of course, have come up with a helpful spreadsheet.

One method: Focus on a “base” Match of interest to you.

Start with a Match of interest to you (often a high-cM Match with an unknown, or iffy, link to your Tree). I call this the “base” Match. Click on Shared Matches (to use ProTools or subscribe to it).

The resulting Shared Match list (with ProTools) took me a while to get used to. It is essentially a list of the Matches that you and your selected Match have in common. This is a fundamental building block of Shared Match Clustering, and Matches who appear on each other’s Shared Match lists tend to all have the same Common Ancestor. These Clusters can include Matches with Trees (where you can search for a Common Ancestor among them), as well as Matches with Unlinked Trees, Private Trees and NO Trees.

However, the ProTools Shared Match list also reveals the cMs shared between your “base” Match and each of the Shared Matches on the list. These cMs may, or may not, be significant information. So far, the list is only arranged by cMs shared between you and the Matches. I’ve found my “go to” process is to scroll down the right hand list and check the cMs shared between your “base” Match and each of the Shared Matches. This is often a wide range of cM values – from 20cM up to some real surprises. These surprises may be on the last page of Shared Matches – so, for me, it’s well worth the time to look at all the pages of Shared Matches (20 Matches per page). There is a rumor that Ancestry is working on a way to let us sort on this value. I have found several cases of a relatively small Match to me, who is a parent, or child, or sibling, or other very close relationship to the “base” Match. This is often a “BINGO” for me – particularly when one or the other of this duo doesn’t have a Tree. These close relationships can also be game changers – 1C, 2C or even 3C can show a family group in one “sub-branch” of your Tree – importantly, separated from other branches.
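Until such a sort is available, the same effect can be had by copying the two cM values into a list and sorting it yourself. The Python sketch below assumes the values have been typed in by hand; the Match names and numbers are hypothetical, and the 50cM and 90cM cutoffs used to flag “surprises” are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class SharedMatch:
    name: str
    cm_with_me: float    # cM this Shared Match shares with you
    cm_with_base: float  # cM this Shared Match shares with the "base" Match

# Hypothetical values copied by hand from a Shared Match list.
shared_matches = [
    SharedMatch("Match W", 310, 45),
    SharedMatch("Match X", 88, 20),
    SharedMatch("Match Y", 24, 1771),
    SharedMatch("Match Z", 21, 96),
]

def sort_by_base_cm(matches):
    """Order the list by cM shared with the base Match, largest first."""
    return sorted(matches, key=lambda m: m.cm_with_base, reverse=True)

def surprises(matches, small_to_me=50, close_to_base=90):
    """Matches that are small to you but close to the base Match."""
    return [m.name for m in sort_by_base_cm(matches)
            if m.cm_with_me < small_to_me and m.cm_with_base >= close_to_base]

for m in sort_by_base_cm(shared_matches):
    print(f"{m.name}: {m.cm_with_base} cM with base, {m.cm_with_me} cM with me")
print("Surprises:", surprises(shared_matches))
```

In this made-up list the two smallest Matches to you rise to the top once the list is ordered by the base Match's cM, which is exactly the pattern that hides on the last page of the unsorted list.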

Inverse Logic: If you are pretty sure of some Matches who descend from one child of a particular Ancestor, and a group of Matches (among themselves), appear to be on the same line, but their cMs with you are somewhat smaller than the other Matches from that Ancestor, then this is a strong clue they are related another generation back, or so.

In any case, this info can be very valuable in conjunction with a WATO analysis at DNAPainter.

Another method: Work on your top Matches on one branch of your Tree.

Of course I tried several spreadsheet methods. The one that works best for me is a list of my top Matches on one branch of my Tree. I determine these Matches from my Notes (derived from ThruLines; Clusters; Unlinked Trees; blind luck; etc). Almost all are captured in my Common Ancestor Spreadsheet. Since I know how most of my Matches relate at the grandparent level, I focused on the Great Grandparent groups and/or 2xG Grandparents who were on my paternal side. In other words, on known, or suspected, Ahnentafels: 8P, or 16P and 18P, or occasionally one more generation back (32P-39P).

I walked down my Paternal List of Matches and selected the ones I had Notes for that indicated they were from my targeted branch, or, based on previous Clustering, who were Likely to be on that targeted branch (Likely Matches were labeled with an “L”, and usually had NO or very small Trees.) I listed the Match Name, cM, Relationship (e.g. 8P/2C1R), and sometimes the Child the Match descended from. Feel free to add any columns that might be helpful to your analysis – columns can always be moved or deleted or hidden. Out of this list I selected a Key Match (often unknown) and put an asterisk (*) adjacent to them in a new column. I then clicked on the Key Match’s Shared Matches and reviewed that list – on the right side was the shared cM with each Match. Initially I went from top to bottom of that list and put the shared cM amount in the column under the * and in the row for the match – creating a matrix of sorts. After a few iterations, I limited this to shared cM amounts over about 50cM and highlighted amounts over 90cM. As indicated above, I sometimes found very large cMs, indicating very close relationships – clearly on one particular branch twig in my Tree; and sometimes one Match had a full Tree and the others did not (very useful, bringing Matches with little to no info into play). One vexing Match has a father born the same year as me, so I can assume a 1R relationship (and her 162cM is 2C1R 53% of the time per DNA Painter). AND I note she shares 1883cM with another Match who is highly suspected of having an NPE bio-parent in my Tree – the clues are adding up.
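The * column described above can be sketched in Python as a function that fills one column of the matrix for a chosen Key Match. The Match names and cM values are hypothetical, the “!” suffix is just a stand-in for the spreadsheet highlight, and the cutoffs are set at 50cM and 90cM as approximations of “over about 50cM” and “over 90cM”.

```python
def star_column(row_names, key_match, shared_cm, min_cm=50, highlight_cm=90):
    """Build one * column: '*' for the Key Match, shared cM for close Matches.

    shared_cm maps a Match name to the cM it shares with the Key Match.
    Amounts below min_cm are left blank; amounts at or above highlight_cm
    get a '!' suffix in place of a cell highlight.
    """
    column = {}
    for name in row_names:
        if name == key_match:
            column[name] = "*"
            continue
        cm = shared_cm.get(name)
        if cm is None or cm < min_cm:
            column[name] = ""
        elif cm >= highlight_cm:
            column[name] = f"{cm}!"
        else:
            column[name] = str(cm)
    return column

# Hypothetical rows and one hypothetical Key Match's Shared Match values.
row_names = ["Match H", "Match J", "Match K", "Match L", "Match M"]
column = star_column(row_names, "Match J",
                     {"Match H": 162, "Match K": 64, "Match L": 31})
for name in row_names:
    print(f"{name:8} {column[name]}")
```

Repeating the call with a different Key Match adds another column, and a handful of columns is usually enough to show which rows belong together.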

The method above is also creating a sub-branch, that could very well be from an unknown wife/mother (39P) for whom I have very few Matches so far. In these cases, I’m creating additional * columns for the highest cM Match in that group and looking at their Shared Matches – looking for one of their closer Matches who might have a Tree; or looking for other Shared Matches who might provide Trees or other insights – all in all: looking for a Cluster that might go back to 39P…

As I’m playing with this method, and adding more * columns (creating a matrix), I’m basically identifying all my Matches on my 8P/9P MRCA branch, and subdividing them into sub-branches. This will get me to a good Cluster from Matches back through 8P/9P MRCA to 18P/19P to 38P/39P and ultimately to the 78P/79P MRCA who are parents of my unknown wife/mother: 39P.
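The chain of couples above follows directly from standard Ahnentafel arithmetic: the father of person n is 2n and the mother is 2n+1, so the child of person n is n divided by 2, rounded down. A short Python sketch makes the chain explicit. The function names are hypothetical, and the code uses plain integers without the P (paternal side) suffix used in the text.

```python
def parents(n: int) -> tuple:
    """Ahnentafel numbers of the father and mother of person n."""
    return (2 * n, 2 * n + 1)

def child(n: int) -> int:
    """Ahnentafel number of the child of person n on the line down to person 1."""
    return n // 2

def couple_of(n: int) -> tuple:
    """The (husband, wife) pair that person n belongs to."""
    husband = n if n % 2 == 0 else n - 1
    return (husband, husband + 1)

def couples_down_to(ancestor: int, stop: int) -> list:
    """Couples on the line from an ancestor's parents down to the couple containing stop."""
    chain = [parents(ancestor)]
    n = ancestor
    while True:
        chain.append(couple_of(n))
        if stop in couple_of(n):
            return chain
        if n <= 1:
            raise ValueError("stop is not on this ancestor's line of descent")
        n = child(n)

print(couples_down_to(39, 8))
```

Reading the printed list from right to left gives the same path described above: from the 8/9 couple back through 18/19 and 38/39 to 78/79, the parents of 39.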

Traditional Clustering methods can do this alone, but knowing the cM relationship between the Matches helps a lot.

Clearly I’ll be spending time with this new spreadsheet. I can add new Matches that are close to my key Matches but may be under 50cM, or even at 20cM, with me, but with helpful Trees and/or Unlinked Trees. At any rate, it’s easy to sort the spreadsheet on an * column, and easily see Matches who should be grouped on a sub-Branch. And, at any time, I can easily use DNAPainter’s WATO tool to focus on likely Branches. It’s a whole lot easier to find a link by building a Match’s small tree back, when I have good intel on the Surnames and geography and timeframes.

ProTools identification of shared cMs between Matches is a strong addition – well worth $10 for a trial month, IMO.

Please feel free to post your own methods of squeezing out more info using this feature of ProTools.

[22CJ] Segment-ology: My Take on Ancestry ProTools TIDBIT; by Jim BARTLETT 20240629

Shared Segments for Small Segments

Posted on June 5, 2024 by Jim Bartlett

The Shared cM Project is an important and powerful tool for genetic genealogy – particularly with its integration with the DNA Painter tools. Over 60,000 submissions is impressive.

Two observations on the Shared cM Project – a very high percentage of the submissions were for the closer relationships; and the data was from many different users (perhaps with varying degrees of accuracy).

I now have over 9,000 entries in my Common Ancestors spreadsheet [see my blogpost about this spreadsheet tool]. I’ve curated these down to 7,800 entries from 1C to 8C, that I am pretty confident are correct. Also, my analysis is that there is a high probability, based on Trees and Shared Match Clusters, that each shared segment is from the Common Ancestor.

So I decided to compile cM statistics from my own curated data. I also wanted to see how the small cM relationships played out.
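Compiling statistics like these amounts to grouping the entries by cousinship and taking the count, range, and average of each group. The Python sketch below shows that calculation. The sample entries are made up for illustration and are not the curated data; the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (cousinship, shared cM) entries.
entries = [
    ("5C", 12.0), ("5C", 20.0), ("5C", 31.0),
    ("6C", 8.0), ("6C", 14.0),
    ("6C1R", 9.0),
]

def summarize_by_cousinship(entries):
    """Return count, min, max and average cM for each cousinship."""
    by_rel = defaultdict(list)
    for rel, cm in entries:
        by_rel[rel].append(cm)
    return {
        rel: {
            "n": len(values),
            "min": min(values),
            "max": max(values),
            "avg": round(mean(values), 1),
        }
        for rel, values in by_rel.items()
    }

summary = summarize_by_cousinship(entries)
for rel, stats in summary.items():
    print(rel, stats)
```

Keeping the count next to each range matters, because a range built from a handful of entries deserves less weight than one built from hundreds.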

My data is not nearly as robust as the Shared cM Project was for 4C and closer relationships. However, in the 5C range my data was closer; and in the 6C to 8C range I generally had more data points than the Shared cM Project. This reflects my emphasis on all Ancestors out to 8C range.

Overall, there were no big surprises. In general, for 6C to 8C my data was in a tighter range; and I had some data for distant relationships that weren’t in the Shared cM Project (but, no surprises).

Bottom Line: In my opinion the ranges in the Shared cM Project are a little broad – probably a reflection of data from so many sources. I think the broader ranges give folks more wiggle room with low percentage probabilities, when they should really be looking for other possibilities.

Here is my table comparing my data with the Shared cM Project data – the top row indicates the full cousinship; once removed (1R) and twice removed (2R):

The significant increase in data points at the 6C level reflects the power of ThruLines to build Trees back (subject to my review); but only out to 6C.

As always, feedback is welcomed.

[06F] Segment-ology: Shared cMs for Small Segments; by Jim Bartlett 20240605

Which Sibling Is the Bio-Ancestor?

Posted on April 3, 2024 by Jim Bartlett

A Segment-ology TIDBIT

Up Front – it’s the one with the highest average cM among Match cousins.

Setup: You’ve pretty much determined a particular couple are bio-Ancestors to yourself (or someone else) – often by a consensus of Match Trees in a group (usually a Cluster). However, this bio-couple had a number of children. Which one of them was the bio-Ancestor? It gets harder and harder the more generations back you are researching.

Process: I’ve had good outcomes by determining as many DNA Match cousins as possible for the bio-Ancestor couple. Line up the DNA Matches and the shared DNA cMs under each of the children, and then determine the average cM for each child. In general, one of the averages will be somewhat more than the others – even when you don’t know the link. That’s because you are a closer cousin with Matches who descend from the same child as you do. For instance, you may be a 5C through most of the children – sharing an average of 25cM with those Matches; and you would be 4C with the Matches who descend from the one child who is your Ancestor – sharing an average of 35cM with them. Of course, our results may vary somewhat from the Shared cM Project, but it’s the concept we are focused on here.

When I do this analysis, I drop down into the smaller segments, in order to get a fair comparison among all the cousins I can find. The more Matches we use, the more it averages out to the Shared cM Project and the correct bio-Ancestor child.
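The comparison can be sketched in Python as an average per child followed by picking the largest. The children's names and the cM values below are hypothetical, chosen to reproduce the 25cM versus 35cM example above; the function names are also made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical Matches: (child of the bio-couple they descend from, cM shared with you).
matches = [
    ("John", 22), ("John", 28),
    ("Mary", 31), ("Mary", 39), ("Mary", 35),
    ("William", 20), ("William", 30),
]

def average_cm_by_child(matches):
    """Average shared cM of the Matches lined up under each child."""
    by_child = defaultdict(list)
    for child_name, cm in matches:
        by_child[child_name].append(cm)
    return {child_name: mean(values) for child_name, values in by_child.items()}

def likely_bio_ancestor(matches):
    """Child with the highest average cM, plus all the averages for review."""
    averages = average_cm_by_child(matches)
    return max(averages, key=averages.get), averages

best, averages = likely_bio_ancestor(matches)
print(averages)
print("Highest average:", best)
```

With only two or three Matches per child, as in this toy data, a single outlier can swing an average, which is why the analysis works better the more Matches are included.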

[22CI] Segment-ology: Which Sibling Is the Bio-Ancestor? TIDBIT by Jim Bartlett 20240403

Celebrating the First 25 years of Genetic Genealogy

Posted on February 29, 2024 by Jim Bartlett

Free eBook: Genetic Genealogy: The First 25 Years – 82 pages – the reflections of 34 Contributors – compiled and edited by Diahan Southard. This is a fascinating read from cover to cover. And it’s free to download here: https://diy.yourdnaguide.com/so-far

I am honored and humbled to be included in this project. And a grateful hat-tip to Diahan who conceived this project; herded the cats to gather the various perspectives; curated and edited the inputs and got it ready before RootsTech 2024. And made it free to everyone!

Thanks, Diahan Southard.

[99C] Segment-ology: Celebrating the First 25 years of Genetic Genealogy by Jim Bartlett 20240229

ThruLines Is Quick – Really Quick!!

Posted on February 29, 2024 by Jim Bartlett

A Segment-ology TIDBIT

My previous post noted that ThruLines quickly adapted when I changed my Tree.

Setup: I have looked at every one of my ThruLines Matches. If you are not sure, just open your DNA Matches list and select the Filters: Unviewed AND Common Ancestors. If you’ve looked at them all (and hopefully added appropriate information in the Notes box for each one), after a minute or two you’ll get a message: No matches match the selected filter. You’re now ready to take advantage of this status.

I have a pesky female Ancestor. I’m not really positive where she fits in a larger part of my Tree (or to any of several floating branches). So I called up her profile; clicked on Edit (top right); clicked on Edit relationships; and clicked on the parent “X”s (to separate, not delete, them). I now went to the Father box and clicked on Add father; and typed in a name I wanted to test as a parent. I then closed the Edit relationships page and went back to my DNA Matches List and filtered on Unviewed AND Common Ancestors…. and ThruLines immediately populated appropriate new Matches who would be cousins through that parent. In the one to two minutes it takes ThruLines to search my 93,000 Matches, it found and listed Matches with ThruLines. Since I had already opened all previously known ThruLines, this new listing was only Matches who were related through the change I had just made. I quickly took notes and reset the original pesky Ancestor. Ready for the next trial. In and out very quickly.

There is more to this story for a later blogpost. The point for this blogpost is twofold:

1. AncestryDNA must have most of these relationships already worked out, just waiting for me to ask the right question (do you have cousins for “this” relationship?)

2. There is no waiting days for a “refresh” – ThruLines reports as fast as it can scan my Match list (down to 6cM). Just WOW!

Both of these are pretty amazing, IMO.

[22CH] Segment-ology: Thru-Lines is Quick – Really Quick!! TIDBIT by Jim Bartlett 20240228

ThruLines is Quick!

Posted on February 25, 2024 by Jim Bartlett

A Segment-ology TIDBIT

I was entering a ThruLines line of descent into my Common Ancestor Spreadsheet, when I noted an error in the Match’s Tree. The Tree and ThruLines were at 6C. When I inserted the missing generation in my Tree, the relationship changed to 6C1R. As soon as I clicked back to the Match, the ThruLines was gone! AncestryDNA now *knows* the correct relationship, and since it was beyond 7 generations for one of us, they won’t show it.
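The change from 6C to 6C1R is simple arithmetic on the number of generations each person sits below the Common Ancestor couple. The Python sketch below shows that arithmetic only; it is not a description of how ThruLines works internally, and the function name is hypothetical.

```python
def cousinship(gens_a: int, gens_b: int) -> str:
    """Cousinship from generations below the Common Ancestor couple.

    A child of the couple is 1 generation down, a grandchild is 2, and so on.
    Both people must be at least grandchildren (2 generations down) to be cousins.
    """
    if gens_a < 2 or gens_b < 2:
        raise ValueError("cousins must each be at least 2 generations below the couple")
    degree = min(gens_a, gens_b) - 1   # 1C share grandparents, 2C great grandparents, ...
    removed = abs(gens_a - gens_b)     # difference in generations
    return f"{degree}C" if removed == 0 else f"{degree}C{removed}R"

print(cousinship(7, 7))  # both 7 generations down
print(cousinship(7, 8))  # one line gains a generation
```

Two people who are each 7 generations below the couple are 6C; inserting a missing generation into one line makes that person 8 generations down, and the relationship becomes 6C1R.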

Heads up. Copy or screen-shot before you lose the ThruLines link. I guess in a pinch, I could go back to my tree, take out the generation I added, and “reincarnate” the ThruLines link. Sometimes you have to think like a computer…

[22CG] Segment-ology: ThruLines is Quick! TIDBIT by Jim Bartlett 20240225