My Take on Ancestry ProTools

A Segment-ology TIDBIT

Ancestry ProTools includes several features. ProTools costs $10/month extra on an existing Ancestry account. This post is focused on the cMs between Shared Matches. I’ve fiddled with it for a few days, and, of course, have come up with a helpful spreadsheet.

One method: Focus on a “base” Match of interest to you.

Start with a Match of interest to you (often a high-cM Match with an unknown, or iffy, link to your Tree). I call this the “base” Match. Click on Shared Matches (to use ProTools or subscribe to it).

The resulting Shared Match list (with ProTools) took me a while to get used to. It is essentially a list of the Matches that you and your selected Match have in common. This is a fundamental building block of Shared Match Clustering, and Matches who appear on each other’s Shared Match lists tend to all have the same Common Ancestor. These Clusters can include Matches with Trees (where you can search for a Common Ancestor among them), as well as Matches with Unlinked Trees, Private Trees and NO Trees.

However, the ProTools Shared Match list also reveals the cMs shared between your “base” Match and each of the Shared Matches on the list. These cMs may, or may not, be significant information. So far, the list is only arranged by cMs shared between you and the Matches. I’ve found my “go to” process is to scroll down the right-hand list and check the cMs shared between the “base” Match and each of the Shared Matches. This is often a wide range of cM values – from 20cM up to some real surprises. These surprises may be on the last page of Shared Matches – so, for me, it’s well worth the time to look at all the pages of Shared Matches (20 Matches per page). There is a rumor that Ancestry is working on a way to let us sort on this value. I have found several cases of a relatively small Match to me, who is a parent, or child, or sibling, or other very close relationship to the “base” Match. This is often a “BINGO” for me – particularly when one or the other of this duo doesn’t have a Tree. Less close relationships can also be game changers – 1C, 2C or even 3C can show a family group in one “sub-branch” of your Tree – importantly, separated from other branches.
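Until a sort option appears, one workaround is to copy the two cM values into a small table and sort it yourself. Here is a minimal Python sketch of that idea. It assumes the rows were typed in by hand from the Shared Match pages; the field names (`cm_with_me`, `cm_with_base`) and all the names and numbers are invented for illustration.

```python
# Hypothetical rows typed in by hand from one "base" Match's Shared Match list.
# Names and cM values are invented for illustration only.
shared_matches = [
    {"name": "Match A", "cm_with_me": 212, "cm_with_base": 48},
    {"name": "Match B", "cm_with_me": 95, "cm_with_base": 260},
    {"name": "Match C", "cm_with_me": 44, "cm_with_base": 1658},
    {"name": "Match D", "cm_with_me": 31, "cm_with_base": 22},
    {"name": "Match E", "cm_with_me": 27, "cm_with_base": 3473},
]


def sort_by_base_cm(rows):
    """Return the rows ordered by cM shared with the base Match, largest first."""
    return sorted(rows, key=lambda row: row["cm_with_base"], reverse=True)


ranked = sort_by_base_cm(shared_matches)
for row in ranked:
    print(f'{row["name"]:<8} me: {row["cm_with_me"]:>4} cM   base: {row["cm_with_base"]:>5} cM')
```

Sorted this way, the small-to-me Matches who are very close to the “base” Match rise to the top instead of hiding on the last page.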

Inverse Logic: If you are pretty sure of some Matches who descend from one child of a particular Ancestor, and a group of Matches (among themselves) appear to be on the same line, but their cMs with you are somewhat smaller than those of the other Matches from that Ancestor, then this is a strong clue they are related another generation back, or so.

In any case, this info can be very valuable in conjunction with a WATO analysis at DNAPainter.

Another method: Work on your top Matches on one branch of your Tree.

Of course I tried several spreadsheet methods. The one that works best for me is a list of my top Matches on one branch of my Tree. I determine these Matches from my Notes (derived from ThruLines; Clusters; Unlinked Trees; blind luck; etc.). Almost all are captured in my Common Ancestor Spreadsheet. Since I know how most of my Matches relate at the grandparent level, I focused on the Great Grandparent groups and/or 2xG Grandparents who were on my paternal side. In other words, on known, or suspected, Ahnentafels: 8P, or 16P and 18P, or occasionally one more generation back (32P-39P).

I walked down my Paternal List of Matches and selected the ones I had Notes for that indicated they were from my targeted branch, or, based on previous Clustering, who were Likely to be on that targeted branch (Likely Matches were labeled with an “L”, and usually had NO or very small Trees). I listed the Match Name, cM, Relationship (e.g. 8P/2C1R), and sometimes the Child the Match descended from. Feel free to add any columns that might be helpful to your analysis – columns can always be moved or deleted or hidden. Out of this list I selected a Key Match (often unknown) and put an asterisk (*) adjacent to them in a new column. I then clicked on the Key Match’s Shared Matches and reviewed that list – on the right side was the shared cM with each Match. Initially I went from top to bottom of that list and put the shared cM amount in the column under the * and in the row for the Match – creating a matrix of sorts. After a few iterations, I limited this to shared cM amounts over about 50cM and highlighted amounts over 90cM. As indicated above, I sometimes found very large cMs, indicating very close relationships – clearly on one particular twig in my Tree; and sometimes one Match had a full Tree and the others did not (very useful, bringing Matches with little to no info into play). One vexing Match has a father born the same year as me, so I can assume a 1R relationship (and her 162cM is 2C1R 53% of the time per DNAPainter). AND I note she shares 1883cM with another Match who is highly suspected of having an NPE bio-parent in my Tree – the clues are adding up.
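For anyone who prefers a script to a spreadsheet, the * column matrix described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the column label format and every Match name and cM value are made up, and the 50cM and 90cM cut-offs are simply the ones I mention above.

```python
RECORD_MIN_CM = 50     # only record shared cM amounts over about 50cM
HIGHLIGHT_MIN_CM = 90  # highlight amounts over 90cM


def build_matrix(my_matches, key_shared):
    """Build one row per Match and one '*' column per Key Match.

    my_matches: list of (name, cM shared with me, relationship note)
    key_shared: {key match name: {shared match name: cM between the two}}
    A cell holds "*" for the Key Match itself, the shared cM if it is over
    RECORD_MIN_CM, or None for a blank cell.
    """
    rows = []
    for name, cm, relationship in my_matches:
        row = {"Match": name, "cM": cm, "Relationship": relationship}
        for key, shared in key_shared.items():
            column = f"* {key}"
            value = shared.get(name)
            if name == key:
                row[column] = "*"
            elif value is not None and value > RECORD_MIN_CM:
                row[column] = value
            else:
                row[column] = None
        rows.append(row)
    return rows


def is_highlighted(cell):
    """True for numeric cells above the highlight threshold."""
    return isinstance(cell, int) and cell > HIGHLIGHT_MIN_CM


def sort_on_column(rows, column):
    """Sort on one '*' column: Key Match first, then largest shared cM, blanks last."""
    def rank(row):
        cell = row[column]
        if cell == "*":
            return float("inf")
        return cell if cell is not None else -1
    return sorted(rows, key=rank, reverse=True)


# Invented example data: four Matches on one branch and one Key Match.
my_matches = [
    ("Key1", 162, "8P/2C1R?"),
    ("M2", 95, "8P/2C1R"),
    ("M3", 60, "L"),
    ("M4", 33, "L"),
]
key_shared = {"Key1": {"M2": 40, "M3": 1883, "M4": 75}}

matrix = build_matrix(my_matches, key_shared)
ordered = sort_on_column(matrix, "* Key1")
for row in ordered:
    cell = row["* Key1"]
    flag = " <-- highlight" if is_highlighted(cell) else ""
    print(row["Match"], row["cM"], row["Relationship"], cell, flag)
```

Adding another Key Match is just another entry in `key_shared`, which becomes another * column.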

The method above is also creating a sub-branch, that could very well be from an unknown wife/mother (39P) for whom I have very few Matches so far. In these cases, I’m creating additional * columns for the highest cM Match in that group and looking at their Shared Matches – looking for one of their closer Matches who might have a Tree; or looking for other Shared Matches who might provide Trees or other insights – all in all: looking for a Cluster that might go back to 39P…

As I’m playing with this method, and adding more * columns (creating a matrix), I’m basically identifying all my Matches on my 8P/9P MRCA branch, and subdividing them into sub-branches. This will get me to a good Cluster from Matches back through 8P/9P MRCA to 18P/19P to 38P/39P and ultimately to the 78P/79P MRCA who are parents of my unknown wife/mother: 39P.

Traditional Clustering methods can do this alone, but knowing the cM relationship between the Matches helps a lot.  

Clearly I’ll be spending time with this new spreadsheet. I can add new Matches that are close to my key Matches but may be under 50cM, or even at 20cM, with me, but with helpful Trees and/or Unlinked Trees. At any rate, it’s easy to sort the spreadsheet on an * column, and easily see Matches who should be grouped on a sub-Branch. And, at any time, I can easily use DNAPainter’s WATO tool to focus on likely Branches. It’s a whole lot easier to find a link by building a Match’s small tree back, when I have good intel on the Surnames and geography and timeframes.

ProTools identification of shared cMs between Matches is a strong addition – well worth $10 for a trial month, IMO.

Please feel free to post your own methods of squeezing out more info using this feature of ProTools.

[22CJ] Segment-ology: My Take on Ancestry ProTools TIDBIT; by Jim BARTLETT 20240629

21 thoughts on “My Take on Ancestry ProTools”

  1. Jim, I haven’t personally tried this new feature on Ancestry, but I’ve seen the output from a number of sources, and one thing struck me. It seems to provide the same information that I’ve used in MyHeritage’s Review process, except of course for the triangulation icon and their chromosome browser. Did you have this same sense when you first looked at it? Just curious about your reaction. Does Ancestry provide anything else, other than what we’ve grown to expect from MH?

    Thanks!

    • Doug, Yes, it has some similarities with MyHeritage. The ProTools has a good advantage in that it’s easy to scroll down the list of Shared Matches and quickly see the significant cM shares. I’m working on one now who shares 129cM with me – as I scroll down to my 60cM Match who is a sister (2608cM), then down to my 53cM Match who is a daughter (3473cM); and a 44cM Match who is a nephew (1658cM) – clicking on this last nephew, I see his several shares at 860-1,000cM but as low as 31cM to me. It’s much easier to navigate and find literally all the Matches in that sub-Branch of my Tree. On the other hand, MyHeritage gives the actual segments, and lets you build a solid Triangulated Group. ProTools is a good tool for AncestryDNA Matches… (which is where the most Matches and genealogy are, IMO). Jim

  2. Thanks Jim,

    I always find your posts helpful, even if it takes me a few readings to capture everything. After following this step by step I find this is very like my approaches to pro-tools as well. While certainly far short of the desired segment/chromosome details we all crave, this is a huge step forward. I have previously identified what I call (ABC’s, for Anchor Best Cousin matches). These being the highest cM matches (in each database) that share Only one branch, be it grandparent, great grandparent, 2X or 3X. I use these ABC’s to follow your “walking-the-DNA-back” approaches, particularly segment by segment. Using these with this new Ancestry shared matches of matches information is greatly aiding both the validation and expansion of this effort. Linking these segments to networks/clusters of smaller matches is moving these segments further generations back. I still struggle with connecting many 15-30cM multiple segment matches (multiple shared ancestors very often), but lots of the tiniest single segment matches are really falling into line with this approach.

    • James, Thanks for your kind feedback. I don’t always hit the nail on the head squarely the first time, but I often sort out the real gist in the comments. If I ever turn this blog into a booklet, I’ll need to re-read all the comments.
      And thank you for ABCs – that’s a GREAT idea – exactly the concept I was searching for when applying the *’s in my shared cM spreadsheet. I’m going to create an ABC Dot for those Matches. I, too, have some pedigree collapse several generations back (in Colonial Virginia) and agree some Matches are a real challenge. ABCing selected “Anchor” Matches is a MasterClass idea – it will really help in Shared Match Clustering.
      Thanks, again, for your feedback. Jim

  3. Thank you Jim!

    I have only just been using it to find errors etc. You always distill things down and find the nuggets. I will look from a different vantage point now.

    Your time, efforts and knowledge are much appreciated.

    Ellen

  4. While I dislike having to pay an extra $120 a year, Pro tools is amazing. I solved two years-old mysteries! I knew both women came from my maternal line but had no proof. Their respective cMs showed 1/2 sibling or aunt, allowing me to zero in on the fathers. So Excited!

    • Bonnie – A great success story! I am somewhat bummed about the accumulating cost of Ancestry’s various tools. I used the World sub for a short time, and then reverted to US only, where most of my Ancestry is (they now block me on any hint that involves an out of US record). I plan to focus on the ProTools this summer, and then cancel – I should have enough clues to see me through the winter. One of the comments below offered a great idea: he Dotted a lot of over-50cM Matches that were not connected to his Tree; with Shared cMs he has already significantly reduced that list. At any point we can unsubscribe, and ProTools stops at the end of that month’s subscription. Jim

  5. Kevin, Glad my explanations are helping. The key is to use your mother’s test to resolve which side each TG is on; and then use genealogy to figure out the Common Ancestors. It’s often a lot of work. The DNA is a helpful tool, but in the end you have to use the genealogy. Jim

  6. As a first impression it sounds like MyHeritage without the chromosome browser and triangulation feature. That’s very disappointing. I was hoping that they were finally offering something useful.

    • Ray, Yes, ProTools is well shy of identifying actual segments, but it is an important upgrade for us. I am seeing reports of a *lot* of new discoveries based on seeing the shared cM between Matches. I, too, am getting closer on several brick walls because I can more finely divide my Clusters. Jim

  7. Great little article. I’m having fun doing similar things. I had a special colored dot for all matches above 50 cM that I wasn’t able to clearly identify a genetic path for. When I started with pro tools less than a week ago – I had 120. It’s down below 60 now and still declining.

    Having this information is also causing me to modify how I annotate Ancestry Matches in the notes field.

    • Brian, Thanks for your interesting feedback. I, too, use a special Dot for Matches I want to dig into more and find their relationship. My top one now appears it might be a double-NPE…

      I’m curious about your Note annotation method. Mine starts with #A0016P/2C1R: #A means an Ancestor link; 16P means Ahnentafel 16 (on Paternal side); and 2C1R is the relationship. After the colon, I list the line of the Match’s descent (or use #CA to reference my Common Ancestor spreadsheet with that same info – now with about 9,000 rows.) This Noting system really helps when viewing Shared Matches. When I find a consensus, I use #L0016P in the Notes of Shared Matches; and they then help in other lists (#L meaning Likely). Jim
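      Because the tag is so regular, it can be picked apart mechanically, for example when searching a downloaded list of Notes. Below is a rough Python sketch under stated assumptions: `parse_note` and the field names are hypothetical, and the “M” letter for the maternal side is assumed (only “P” is shown above).

```python
import re

# Pattern for the tag at the start of a Note, e.g. "#A0016P/2C1R".
# "M" for the maternal side is an assumption; only "P" is shown in the text.
NOTE_TAG = re.compile(
    r"#(?P<kind>[AL])"              # A = Ancestor link, L = Likely
    r"(?P<ahnentafel>\d{4})"        # zero-padded Ahnentafel number
    r"(?P<side>[PM])"               # P = paternal (M = maternal, assumed)
    r"(?:/(?P<relationship>\w+))?"  # optional relationship such as 2C1R
)


def parse_note(note):
    """Return the tag fields from a Note as a dict, or None if no tag is found."""
    found = NOTE_TAG.search(note)
    if found is None:
        return None
    fields = found.groupdict()
    fields["ahnentafel"] = int(fields["ahnentafel"])
    return fields


print(parse_note("#A0016P/2C1R: line of descent goes here"))
print(parse_note("#L0016P"))
```

      With the Ahnentafel number pulled out as an integer, Notes can be grouped or sorted by Ancestor couple.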

      • I have a detailed and consistent system I use in Ancestry which is where I play 90+% of the time, though I’ll say it was created as I went along, and if I were to start over knowing what I know, I would probably change a thing or two.

        As I started typing this, I realized it was going to become as long as your blogpost, so what I’ll do is post a general version below, and then email you (not today, as it’s full already), a more detailed description.

        Each great grandparent gets a different colored dot

        If the DNA path from the test taker to the shared match goes through that great grandparent they get that colored dot.

        If I “know” the relationship – the Ancestry Note starts with “Documented” (in retrospect I should have used a different term – but not going to change tens of thousands of notes now). I then type in the connection path.. ie Documented Josiah Shuck and Elizabeth Samuels common ancestors. Sarah Jones>Mike Jones>Allen Shuck>Martin Shuck>Shuck/Samuels. If I were to start this process over, I would do the DNA path in reverse. Shuck/Samuels<Martin Shuck< etc. because relevant information would be more easily visible at a glance that way.

        If I don’t know the relationship, the Ancestry Note starts with. GreatGrandparent name “ancestor”. Followed by additional information in parenthesis if I know path beyond great grandparent. Followed by quality of the tree. Basically I set myself up for future research. Example: Martha Vannice ancestor (Vannice side). Small Tree.

        Then there are all sorts of modifiers.. like if I know their gedmatch number I type that in. If I know where on the Chromosome they match I’ll type that in ie.. #Chr8(6M-11M). There are many more modifiers.

        I then cross post this exact note to all kits that also match. This is one reason that I specifically do not use ahnentafel notation, as I would have to change it for each kit. The other reason being, I just haven’t used ahnentafel notation enough to be extremely familiar with it.

        Another modifier I think I got from one of your posts – assuming you’re the one who discussed the funnel concept. If I see shared matches in the same cluster with the same names in their tree, but I don’t know how they connect to the DNA kit, I’ll type something like Keele/Usry funnel – indicating in future research to start looking at those names in the tree.

        It gets more complicated when NPEs in the DNA kit’s line are involved – as I’m hesitant to use the word documented at the start of my annotation for those.

        I type quickly and so the goal is to annotate all matches above 20 cM, but sometimes I go until it gets muddled, or for those more closely related to me, I’ll go as low as 15 cM.

        Then using DNAGedcom and Shared Clustering with frequent refreshing of data in the notes fields, I just run everything through the spreadsheet generated by Shared Clustering. Makes it VERY easy to see the path.

        Now, with protools, when I run into a match that I’m not sure of, but who has first cousins that have tested whose path I know.. I’m messing up my system and copying as I go the most likely path. Here’s an example of one I’m working on right now:

        Joseph Peterson ancestor. Tree. Swedish names, but connection not found. >Mary Rosenberg>Magnus Peterson>Peterson/Andersson.

        The note above tells me the path will go through the names above. Or sometimes I’ll type in the note field “John Jones = 873cm”. Which just is a scratchpad note to look at them next time I’m coming through and analyzing and I don’t want to scroll through 300 shared matches.

        This was still quite long – but, explains generally what I do. There’s a lot more I do with modifiers and extra colored dots. I use shared clustering extensively while I analyze the kit’s matches, and will work cluster by cluster.

        While I have other spreadsheets and tools I use to keep track of information, I don’t have a centralized spreadsheet. Ancestry notes in connection with downloads becomes my centralized repository of information.

      • Great feedback, Brian. The keys are “detailed” and “consistent” – they may not seem important at first, but as we go through tens of thousands of Matches, they are invaluable. I cannot remember 9,000 Matches linked to MRCAs, much less the many more Matches that clearly Cluster (and probably share an Ancestor) with them. Reading through your whole process, I’m struck by how similar we are. I started mine when an app came out long ago (now defunct) which relied on a “#” tag. So I use #A to lead off my Note for a Match with an MRCA (and I have an MRCA Dot). Although I have my father’s, brother’s, uncle’s DNA, I’m focused on a goal to develop as good a Chromosome Map linked to Ancestors as I can – for me (the others may follow). So my #A is followed by a 4-digit Ahnentafel number, so I don’t have to keep repeating long name strings – AND I can easily see relationships among Shared Matches. Ex: #A0036P which represents my Thomas NEWLON/Susan CUMMINGS MRCA. I lucked out and just happened to start adding the descendants down to the Match. Ex: > Cecelia 1793 m McPHERSON > Susannah F 1814 m WHITE > Joseph M 1851 > Ruth B 1892 m COMSTOCK > Edith F 1923 > Pvt > * [* being the Match]. I did this for many Notes until I started keeping my Common Ancestor spreadsheet with the same information – so now I usually just indicate the child and maybe the grandchild in the Notes and include a #CA to indicate the full info is in the spreadsheet (completely under my control – some of my Matches have disappeared at Ancestry). I am now back tracking and adding this line of descent into my Tree – I think it does help beget additional ThruLines.
        And, like you, I add scratch notes – anything I realize at the time might be important later – someday – when I return to this Match as a Shared Match. These include #G as a preface to a GEDmatch kit and #T as a preface to a Triangulated Group (which I note as, e.g., 01S24 – meaning on Chr 01; S=19th letter, so a segment starting *about* 190Mbp on Chr 01; and 24 meaning I’ve confirmed it at least on my father’s (2), father’s (4) line). I have 372 of these TGs, each with a unique ID – and at the end of the day, all true Match segments will fall into one of these and be part of my Chr Map.
        And, like you, I have special Dots – one for Matches who are in my Tree (goal is to get all of them in and dotted); those with segment data and thus TGs; those with real, or close, all-male or all-female lines (to be pestered regularly to take a Y or mt Test); and Wrong ThruLines (so I don’t cycle through them again). And I’ll be adding a Dot for high-cM, but unknown relationship, Matches so I can focus on them (in my spare time?).
        And like you, I am a fan of DGC (I miss Jonathan Brecher’s Shared Matching program) – the last time that worked, I had a complete download, and Walked The Clusters Back for every Match down to 20cM, and had that result (#A or #L – Likely because of Clustering) in the Notes – they help me every day, but there are a lot of new Matches in the backlog now. The DGC download report is an invaluable, easily searchable, spreadsheet.
        Thanks again for your feedback – Jim

  8. Jim – I have a spreadsheet of interesting unsolved matches to me and my family, and this new tool is a major step forward in solving them.  It solves some of these mysteries instantly, and I can do deep dives in clusters, looking for close (100+ cM) second order matches. I’m debating about whether to add these new second-order unsolved matches to the spreadsheet.  Pure speculation here, but the pro tools arrangement might allow Ancestry to someday provide segment data like the other sites do. Lower customer service cost for sure, and likely lower legal risk as well.  Rich

    • Rich, Love your feedback. I’d absolutely add the second-order Matches to the spreadsheet AND look at *their* Shared Matches for any clues that pop up – if so, add a new column and * for that Match. Let me know how it goes… Jim

  9. Hi Jim. I sent you messages a week ago. I wanted to tell you that the unknown group triangulates on Chr 16; I also have a group of 3 or 4 people that triangulates on Chr 16. Then I have a Match with a good segment that is adjacent to both of these groups, the unknown one and the other, let’s say known, group. But when I put these two groups together they do not triangulate; each one triangulates on its own, yet both are adjacent to this Match, again on Chr 16. However, some Matches in the unknown group have the same start and end rsID numbers as some in the known group. What does that mean?

    • Each segment of your DNA is independent. When you form Triangulated Groups, each group represents one segment of your DNA on one chromosome (you have to use genealogy to determine which parent the chromosome is from). An adjacent Triangulated Group represents a different DNA segment, from a different Ancestor. The junction between two TGs on one chromosome is a “crossover point” where one of your ancestors recombined their maternal and paternal chromosomes (which means different ancestors). This crossover could be occurring at almost any generation going back – you have to use genealogy to figure out the MRCA of the adjacent TG segment. Jim

    • Also, if one TG segment has an overlapped TG segment (the Matches in one TG do not match the Matches in the other TG), then one TG is on your paternal side and the other TG is on your maternal side. If you can use genealogy to correctly figure out one TG, then the other TG is from your other parent. Jim

      • Kev. Jim, now that I have tested my mother, only the paternal side is left; those groups did not show up for my mother. I have one TG on Chr 16 that triangulates, and then another TG that also triangulates on Chr 16, but each on its own. These two groups, the unknown one I have told you about before and the, let’s say, known group, are both adjacent to a Match that is also paternal. Some members of the unknown group have the same start and end rsID numbers as that known group, yet the two groups, the unknown and the known, do not triangulate with each other. Still, some in the unknown TG share these same numbers with some in the known TG, as well as with me, and both groups are adjacent to another Match, again on the paternal side. I had my mother tested, as I told you before. What does this mean? Thanks, and sorry for the bother. I read your posts a lot and I am understanding many things.
