Pro Tools Part 19

Featured

Comments on Sacrilegious Genetic Genealogy

I thought these comments were excellent and wanted to share them.

Guest Post from Terry Butcher dated 11 Dec 2024

In regard to your Pro Tools Part 16 Sacrilegious Genetic Genealogy post, I would like to share some thoughts on the topic.

While I appreciate the power that various DNA analysis techniques offer in identifying clusters of matches to specific common ancestors, my primary focus has always been on the genealogy side of the effort.

I feel that I need to connect my tree to each match to really have anything of value. I already accept that I am related to my matches (within the parameters you have described related to cM size). Being able to document the relationship and share it with my matches is my reward for investing the time and effort in researching them.

I try to make a connection with each match and approach each one as an opportunity to learn something new. Each match that I find a common ancestor for in essence validates that specific branch of my tree by having both a paper trail and a DNA match.

I add my match’s tree into my tree as I research them. I start by adding them as an unrelated person in my tree and work back along their tree, picking up all of their branches until I either find a common ancestor (CA), hit a dead end, or believe there is no longer any possibility because the location has gone back to Europe. It usually doesn’t take long to find most CAs. While researching a match, I usually only add parents and the child, ignoring the other siblings to save effort. However, if I am successful in finding our CA, I will usually go back and pick up the other siblings for several of the most recent generations.

I have been systematically working my way through my matches, starting with the closest related, and have made it down to the 41 cM matches (about 2,000 so far). If the match has useful information in their tree, I have been successful about 90-95% of the time. In the past, I would contact matches without trees and offer assistance. Now with Shared Matches Pro I am able to find their close matches with trees and sometimes find a CA. This is a much-welcomed capability that changes what is possible in my research. I have a total of 132k matches now, with 11,500 marked as 4th cousin or closer. It would take me many, many years to even get through the 4th cousin and closer matches, so I am not worried about running out of matches to research for which I have an excellent chance of finding a CA.

For the 5-10% of my matches whose tree I build but cannot find a CA for, I suspect they may be connected either with 2 brick walls that I have at 3rd GGF or with some unknown adoption or incorrect parent in my tree. Several of these unsolved CA matches now tie together in their trees and I am hopeful they will eventually result in solutions.

By working through my matches and incorporating their trees into my tree, I have expanded my tree significantly to over 222k people now. As nearly all of my ancestors have lived in WV since the early 1800s, my tree is heavily weighted with WV families. I typically don’t have to add but a generation or two until I find my CA.

I am not concerned about having floating tree branches as I believe they will eventually connect into my overall tree.  Anytime I encounter a common surname in my research, I chase it back until it connects with other members of that family which strengthens the connections in my tree.

I value the ability to generate family tree reports showing the relationship path between my match and myself and always share the typically one-page report with my match by saving it to my Dropbox folder and sharing a link in the message I send them.

Any match that I can connect to a CA in my tree has over 10k ancestors (and their descendants), with many up to 40k.

My approach over my 30 years of genealogy as a hobby has evolved, as it has for most, I suspect. As I research, I pick up as much information as I can, including photos, obituaries and sometimes other records like draft registration documents, marriage and death certificates. All of these documents are incorporated into the detailed reports I generate whenever the person is included in the report, which makes for some very interesting reading for my matches when I share reports with them. I find that Ancestry provides 98% of my information, with a bit of help from the other sites whenever I hit a dead end in Ancestry.

[22DA] Segment-ology: Pro Tools Part 19 – Comments on Sacrilegious Genetic Genealogy by Terry Butcher 20241211

Pro Tools Part 18

Family Group Sheets

One of the key features of my Common Ancestor Spreadsheet (see post here) is that it offers an arrangement like a traditional genealogy Family Group Sheet (FGS). The FGS has an Ancestor couple at the top of the sheet, with a list of their children down the page with birth, death, marriage dates and places. If we are going to create an inventory of our DNA Matches with known links to an MRCA, this FGS spreadsheet format would be a great way to do that. It also turns out to be a handy tool when working with Pro Tools.

The Common Ancestor spreadsheet for Match cousins is actually a “nested” FGS. By sorting on Ancestor Ahnentafel Numbers, all the Matches connected to one Ancestor are grouped together. By also sorting on the birth year of the Ancestor’s children, this “FGS sort” results in Matches grouped under each child. By adding sorts on birth years for grandchildren and great grandchildren, we get a “nested” FGS. I regularly use my entire spreadsheet sorted by these four columns.
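For readers who keep this kind of inventory as data rather than in a spreadsheet program, the four-column “FGS sort” might be sketched in Python roughly as below. The column names, Match names and birth years are hypothetical placeholders, not the actual spreadsheet layout.

```python
# Hypothetical rows: one dict per Match row. Column names and years are
# illustrative assumptions, not the real spreadsheet.
rows = [
    {"match": "Match C", "ahnentafel": 38, "child_born": 1815,
     "grandchild_born": 1840, "ggchild_born": 1870},
    {"match": "Match A", "ahnentafel": 36, "child_born": 1822,
     "grandchild_born": 1850, "ggchild_born": 1880},
    {"match": "Match D", "ahnentafel": 38, "child_born": 1810,
     "grandchild_born": 1835, "ggchild_born": None},
    {"match": "Match B", "ahnentafel": 38, "child_born": 1815,
     "grandchild_born": 1838, "ggchild_born": 1865},
]


def year_key(value):
    # Rows with an unknown birth year (None) sort after the known ones.
    return (value is None, value or 0)


def fgs_sort(rows):
    # Sort on Ancestor Ahnentafel number, then on the birth years of the
    # child, grandchild and great grandchild: the "nested" FGS order.
    return sorted(
        rows,
        key=lambda r: (
            r["ahnentafel"],
            year_key(r["child_born"]),
            year_key(r["grandchild_born"]),
            year_key(r["ggchild_born"]),
        ),
    )


for row in fgs_sort(rows):
    print(row["ahnentafel"], row["child_born"],
          row["grandchild_born"], row["ggchild_born"], row["match"])
```

Sorting on a tuple key does the nesting in one pass: rows are grouped by Ancestor first, and ties are broken by each later column in turn.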

This arrangement has several advantages when using Pro Tools…

1. When Pro Tools indicates a parent/child or sibling relationship to an existing Match (already entered into the spreadsheet), I can create a new row and copy most of the info and just adjust one column – a real time saver. And this works even with new Matches with No Tree, Private Tree, Unlinked Tree, Scrawny Tree, even small cMs – Pro Tools has already provided all the relationship information needed.

2. When Pro Tools indicates a (full) 1C relationship to an existing Match, this limits the relationship possibilities to only two. [In my experience, 1C estimates are highly accurate.] Analysis: the new Match is connected to the existing Match (already in the spreadsheet) on (1) the same side I am on, or (2) on the other side. Be aware of this! If the new Match is on the “other” side, they are NOT part of this Ancestor (Ahnentafel) line. If the new Match has any info in a Tree, this “side” issue can usually be figured out and the spreadsheet cells filled out (mostly by copying from the existing Match). If there is no Tree info, the “side” can usually be determined by looking at the Shared Matches of the new Match (sorted on new Match’s cMs). There should be a clear consensus (at/near the top of that list) of the same Ancestor line as the existing Match. If not, then skip this new Match. If so, I add a row for the new Match, copy data from the existing Match, and enter GUESS for the new Match parent (as a sibling of the existing Match parent), and then the new Match. [NB: to save typing, I indicate each “terminal” Match as an asterisk (*) because they are already spelled out in the Match column near the beginning of the row.]

Analysis summary: A) look at their Tree; and/or B) look at their closest SMOMs.

3. For a 1C1R or 1C2R the estimates are still very good, and the process above can be used. Use available info or judgement to shift the new Match to the right or left per the “removes”. Where the individuals are not known, just put Unknown or Private in the cell. The complete path down to the Match is not critical, IMO.

4. When Pro Tools indicates Aunt/Uncle or Niece/Nephew, that too is highly accurate, as are the genders. Similar to the above, there is usually enough information to place them in the spreadsheet (which is like a horizontal Tree).

5. Pro Tools often includes a Half relationship in their estimate. This is based on tables that indicate the two estimates shown have almost exactly the same cM range. Although technically correct, it is much more likely, IMO, that the relationships are standard (NOT Half). But a few will be Half, so watch for that situation. Remember these Pro Tools cMs are between your Match and the Shared Match (not affected by whether or not you have a Half relationship with the Ancestor).

6. Adding a hitherto unknown child branch – best described by a recent example I had. In looking for my A38 (ALLEN ancestor) cousins, I found a bunch descending from four well documented children of A38 – 56 Match cousins (4C, 4C1R, 4C2R and 4C3R) with an average of 18cM. There appear to be more than four children in the 1810-30 Virginia census records. And there was an old story about this family, that a son named William went west. So when some known Matches had some SMOMs with ancestor William H ALLEN born 1815 in VA and living in IL, I took notice – it seemed to fit. As I pushed it with Pro Tools I found (so far) 10 Matches descending from William H ALLEN averaging 20cM. But more importantly, those Matches also had Shared Matches with 12 of the 56 Matches from other children from this A38. It sure looked like a Cluster with gray cells to other Clusters! I’d really like to determine William’s Y-DNA; and/or some DNA segment data… But, in the meantime, I’ve got two of William’s descendants checking their Matches for links to my A38 ALLEN. There are 147 Trees at Ancestry for William H ALLEN – not a one has any good clue to his ancestry, except that he was born in VA. Not my Brick Wall, but I think there will be 147 happy campers.

A key point in this long story is that the DNA has no sense of geography. The facts that four children stayed in VA (and were well known) and one child moved far away made no difference to the DNA. From each descendant’s viewpoint, all the lines were equal – and a pretty even distribution of Matches showed up for all 5 children. The DNA is like blind justice.

7. Equality – a final thought is that this spreadsheet is a lot like the DNA – it’s relatively equal over all the Ancestors and descendants. This spreadsheet encourages me to treat all of my Ancestors equally (they each have an Ahnentafel placeholder row). I still have my “favorite” Ancestors, but as I methodically go through the spreadsheet, I’m spending time on each one. This includes the Ancestors that have issues… This spreadsheet also highlights the Brick Wall holes, to be plugged with floating family branches. This is a good thing.
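The first two items in the list above (copy an existing row for a close relative, and check which “side” a new 1C is on by looking for a consensus among their top Shared Matches) could be sketched like this. It is only an illustration: the column names, the `top_n` cutoff and the `min_share` fraction are my assumptions for what “a clear consensus at/near the top of the list” might mean, not rules from Pro Tools.

```python
from collections import Counter

# Hypothetical existing row; column names are assumptions for illustration.
existing = {
    "match": "Existing Match",
    "ahnentafel": 38,
    "child": "John A",
    "grandchild": "Mary B Smith",
    "terminal": "*",
}


def add_close_relative(existing_row, new_match, **changes):
    # Copy most of the info from the known Match and adjust only what differs.
    new_row = dict(existing_row)
    new_row["match"] = new_match
    new_row.update(changes)
    return new_row


def consensus_line(shared_matches, top_n=10, min_share=0.6):
    """Return the Ahnentafel line most of the top Shared Matches agree on.

    shared_matches: list of (cM, ahnentafel or None) for the new Match.
    Returns None when there is no clear consensus (skip that Match).
    """
    top = sorted(shared_matches, key=lambda sm: sm[0], reverse=True)[:top_n]
    known = [line for _, line in top if line is not None]
    if not known:
        return None
    line, count = Counter(known).most_common(1)[0]
    return line if count / len(known) >= min_share else None


# A new 1C whose highest-cM Shared Matches mostly sit on the A38 line.
shared = [(250, 38), (180, 38), (120, None), (90, 38), (60, 12)]
if consensus_line(shared) == existing["ahnentafel"]:
    new_row = add_close_relative(existing, "New Match", grandchild="GUESS")
    print(new_row)
```

Any threshold like this remains a judgment call; the sketch just makes the “same side or other side” test explicit before a row is copied.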

To me, the key points in doing this spreadsheet work also include:

1. An inventory of Matches who have MRCAs with me. Separate from my on-line Tree. Saved in the cloud and/or archived – available to my heirs or selected genealogy archives someday.

2. Family Group Sheets – of sorts* – this is a standard genealogy tool.

3. A Quality Control check on the accuracy of name spelling and birth years; and the FGS itself. This QC review often reveals “quirks” (as a kinder word) that folks have in their Trees…

4. With Ancestor second marriages, this FGS listing will show the demarcation between full cousins and Half cousins. [I add “INSIGHT” rows with marriage years that will sort and separate the children to the different parent couples.] Half cousins for me only occur at the children level in my spreadsheet. Half cousins between Matches and Shared Matches can occur anywhere.

5. A re-sort by Match name highlights multiple relationships. Since shared DNA is divided by 4 (on average) going back each generation, the closer relationships are much more likely. I’ve found some Matches with MRCAs on both sides of my Tree. With single shared segments, the DNA can only come from one Ancestor. With multiple shared segments, there may be a segment for each line.

* I used “of sorts” in 2 above, because this FGS will not usually be a complete list of all Ancestor children, grandchildren, etc. It includes only the ones who provided a DNA path down to our Matches. Which in turn depends on family sizes and who did DNA tests – there can be wide variations on both.
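As a rough illustration of point 5 above, the “divide by 4 for each generation back” rule and the re-sort by Match name can be sketched as follows. The 850cM starting value for a first cousin is my assumption for the example, not a figure from this post, and the result is an average over all cousins at that level. Many distant cousins share no DNA at all, so the Matches that do show up will tend to share more than this average.

```python
from collections import defaultdict

FIRST_COUSIN_CM = 850.0  # assumed average for a full 1C; illustrative only


def expected_cm(cousin_level, first_cousin_cm=FIRST_COUSIN_CM):
    # Each generation further back to the MRCA divides the average by 4.
    return first_cousin_cm / 4 ** (cousin_level - 1)


def multiple_relationships(rows):
    # Group rows by Match name; keep names found under more than one Ancestor.
    lines = defaultdict(set)
    for row in rows:
        lines[row["match"]].add(row["ahnentafel"])
    return {name: sorted(a) for name, a in lines.items() if len(a) > 1}


for level in range(1, 7):
    print(f"{level}C: about {expected_cm(level):.1f} cM on average")

# Hypothetical rows: Match X appears under two different Ancestor lines.
sample = [
    {"match": "Match X", "ahnentafel": 36},
    {"match": "Match Y", "ahnentafel": 38},
    {"match": "Match X", "ahnentafel": 74},
]
print(multiple_relationships(sample))
```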

Note: If I were starting over, I’d probably add name & birth year columns for 9 generations – out to 8C level; and then a catch-all column for any additional info. This would provide a handy way to evaluate the cousinship levels. Reminder: I only list the given name and one initial for males; and the given name, initial and married surname for females. I try to keep it as easy and simple as possible.

Bottom line: An FGS spreadsheet offers an easy way to add new Matches which have been identified by Pro Tools as closely related to known Matches. This adds independent, genealogy triangulation and tight Clusters to an inventory of known Matches. It will be an outstanding adjunct to an auto-Clustering program.

Also – you don’t have to use a spreadsheet to benefit from most of the concepts imbedded above.

[22CZ] Segment-ology: Pro Tools Part 18 – Family Group Sheets by Jim Bartlett 20241209

Pro Tools Part 17

NPEs

If we just consider our own ancestral line, we may miss some NPEs. We may have an NPE as an Ancestor, IF we haven’t explored the whole family.

Way back, NPE was Non-Paternal Event, but we’ve seen non-Maternal events, too. So we changed it to Not the Parent Expected. The whole issue centers around the expectation of a family with two “expected” parents. Important: an NPE is usually for one child – perhaps your Ancestor; perhaps a different child in the family. We “expect” all the children in a family to be from the husband and wife.  So “usually” an NPE is a one-off event. But life unfolds in many different ways…

A man and a woman create a child – sometimes one of them is not married (i.e. living with their parents, or on their own) – or perhaps this is the case for both of them. Sometimes they are both married to someone else. Sometimes the man is not (or is never) aware the woman got pregnant. Again – in life, there are many variations to this. The point is the NPE does not apply to a family – it applies to a child. This is important to DNA analysis, and how we use Pro Tools.

I have this case for one of my Ancestors. The pregnant woman was an unmarried child in a family who raised her and her son, giving him their surname (which has confused genealogists to this day). It appears the father was not yet married either, but he went on to marry and have children. I know because I got some DNA from him (through the NPE child) and have Matches who descend from him through his other children (half cousins), and through her children by her later marriage (half cousins). [NB: Challenging in my Common Ancestor spreadsheet.]

Getting back to Pro Tools – the DNA truth-teller/helper. In general, the higher-cM SMOM interrelationships lead to one generational level in my Tree – to one MRCA couple. They may be cousins 1 or 2 or 3 times removed (because I’m old), but usually all go back to one MRCA. Then, as I scroll down the SMOM list, I often find SMOMs who descend from one generation further back. This is normal and expected. These would be a generation more distant to us, and should have appropriately smaller cMs, on average. In fact, if this doesn’t happen, we should be suspicious.

NB: Alternatively, some highest-cM Matches may be tied to a closer generation (which should be, on average, a higher-cM relationship). If these higher-cM Matches are at the same generation level, it may be due to multiple segments and, perhaps, additional relationships (with Colonial Virginia ancestry, I sometimes find multiple relationships with some Matches).

Finally, back to NPEs… If one of the Ancestors in an MRCA couple is an NPE, you wouldn’t get any Matches to that couple (just like with an only child; an exception would be if they had more than one child together).  So, instead, look to see that *some* of the Matches are from each bio-parent.  This is how I solved a Brick Wall. I had many Matches to my A36 (4C level) Ancestors [Thomas NEWLON & unknown wife]. As I kept looking at the Shared Matches, I found some smaller-cM Matches to my A72 (5C level) couple [Thomas NEWLON’s parents] who had been well researched. Analysis of “other” Shared Matches revealed many had the CUMMINGS surname (now my A74; 5C level ancestor).

The point is that if Pro Tools points to a group of higher-cM Matches to a 3C, 4C, etc. MRCA, the lower-cM Matches should point to groups for the next two MRCAs back. This is true whether these MRCAs are well known or an NPE or a Brick Wall. If you find a consensus Ancestor among these smaller-cM Matches you may have found GOLD.
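The “next two MRCAs back” follow directly from Ahnentafel numbering, where the father of ancestor n is 2n and the mother is 2n+1. A small sketch of that arithmetic, using the A36 example from this post (function names are mine, for illustration only):

```python
def parents(n):
    # Standard Ahnentafel rule: father is 2n, mother is 2n + 1.
    return 2 * n, 2 * n + 1


def cousin_level(n):
    # Generations back from the base person (number 1): 2-3 are 1 back,
    # 4-7 are 2 back, and so on. Cousins through an ancestor g generations
    # back are (g - 1)th cousins, so this is meaningful for n >= 4.
    generations_back = n.bit_length() - 1
    return generations_back - 1


def next_couples_back(husband):
    # A couple is an even number (husband) and the next odd number (wife);
    # return the parent couple of each.
    wife = husband + 1
    return [parents(husband), parents(wife)]


print(cousin_level(36))        # A36 couple: 4C level
print(next_couples_back(36))   # parents of A36 and of A37
print(cousin_level(72), cousin_level(74))
```

So for the A36/A37 couple, the smaller-cM Matches should sort into the A72/A73 couple (Thomas NEWLON’s parents) and the A74/A75 couple (his wife’s parents), both at the 5C level.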

Bottom Line: When dealing with an NPE, think carefully about what that means to Pro Tools, and target your “rabbit holes” appropriately;>j

[22CY] Segment-ology: Pro Tools Part 17 – NPEs by Jim Bartlett 20241208

Pro Tools Part 16

Sacrilegious Genetic Genealogy

For this post I want to explore a deviation from the normal genealogy and DNA research “requirements”.

Do we need to do comprehensive research on each cousin Match? Do I really need to find the complete link between each Match and our Common Ancestor? The sacrilege: do I care about all my distant cousins – to the extent that I must develop their complete link to me? Do I really care how much DNA they share with me? Must I link the DNA to the Common Ancestor? Or, is it enough to determine that they are on a specific branch of my Tree? I think so!

My standard mantra: our bio-Ancestors and DNA segments are set! We compare each Match to our Tree and DNA to find a Common Ancestor. I’m very close to finding out how 10% of my 100,000 Matches (at Ancestry) are related to my bio-Ancestors.

My experience with Pro Tools indicates many more can be easily found. I acknowledge that some shared DNA segments under 15cM will be false – but that doesn’t mean those Matches aren’t related to me.  Most of our true cousins beyond 3C will not share any DNA with us, so is the cM amount beyond 3C meaningful?  I acknowledge that some Matches will be related beyond a genealogy timeframe.

However, given these negative factors, I believe a lot more of my Matches are related to me within 9 generations back [8C level] – perhaps somewhat more than 20% of my total Matches. It’s taken me 14 years to “collect” and document approximately 10% of my Matches as cousins.  It’s daunting to think what time and effort I’d need to double that.

My sacrilege is to give up on full genealogy research for each Match. Using Pro Tools I’m finding lots of 6-10cM (small segment) Matches (to me) that are children, nieces/nephews, or 1C to strong higher-cM Matches that I have placed in my Tree. Clearly, these Matches are part of a family group well within a genealogy time frame.

I’m inclined to just quickly:

1. Add these small-segment Matches to my Common Ancestor spreadsheet

2. Add a Match Note (at Ancestry) to indicate the Common Ancestor and/or Ahnentafel [e.g. #A0062]

3. Give them my standard star and MRCA Dot; but not the Dot indicating a linked Match

4. Use a new Dot to indicate “Likely” in a family group under the MRCA; but not complete research [I could always filter on that Dot later, and do the research, some day…]

5. Add a shorthand note like:  SMOM: 3,442cM/son of “Match Name” [SMOM: Shared Matches of Match – the cM between them]
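If the notes are typed many times over, a tiny helper keeps the format consistent. This is just a sketch of the two formats shown in steps 2 and 5 (the function names are mine, and it uses plain straight quotes around the Match name):

```python
def ahnentafel_tag(number):
    # Zero-padded to four digits, e.g. 62 -> "#A0062".
    return f"#A{number:04d}"


def smom_note(cm, relationship, match_name):
    # cM between the two Matches, then how the new Match relates to the
    # known one, e.g. SMOM: 3,442cM/son of "Match Name".
    return f'SMOM: {cm:,}cM/{relationship} of "{match_name}"'


print(ahnentafel_tag(62))
print(smom_note(3442, "son", "Match Name"))
```

A fixed-width tag like #A0062 also sorts and filters cleanly in the Notes field.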

I’m looking for a more efficient way to group Matches into known family lines.

There are several points here:

1. Identify additional Matches within a genealogy timeframe (is it over 50% of all Matches?)

2. Group Matches under my Ancestor Couples – often under a specific child or grandchild (why would I need to dig deeper – unless the Match had a robust Tree with many records…)

3. Build a firm interrelated framework for later research on each extended “twig” of my Tree. Get some confidence of my Ancestors and their children and grandchildren.

4. Identify Brick Walls through clear absence of interconnected Matches. My spreadsheet has an Ahnentafel header for each of my Ancestors back to the 8C level – some of them have no known Matches, or what is clearly a small mess of non-interconnecting Matches. These are a judgment call, but with many more Matches involved, these few “problems” become more and more obvious.

5. Connect Floating branches – I now have several strong “clumps” of interconnected Matches, under a single MRCA couple, that I cannot link to my Tree. This is a strong hint in light of #4 above. I plan to explore this more in a separate blogpost.  

For DNAGedCom, Genetic Affairs, DNA Painter: Any way to automate the Clusters/Groups to include only those Matches who interrelate, say, over 90cM (and make that threshold adjustable)?
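To make the request concrete, here is one way such a grouping could work in principle: keep only the Match-to-Match links above an adjustable cM threshold and group whatever stays connected. This is my own minimal sketch (simple connected components), not how DNAGedCom, Genetic Affairs or DNA Painter actually cluster, and the sample names and cM values are made up.

```python
from collections import defaultdict


def clusters(shared_cm, threshold=90):
    """Group Matches linked by more than `threshold` cM to each other.

    shared_cm: dict mapping (match_a, match_b) -> cM shared between them.
    Matches with no link above the threshold are left out entirely.
    """
    graph = defaultdict(set)
    for (a, b), cm in shared_cm.items():
        if cm > threshold:
            graph[a].add(b)
            graph[b].add(a)

    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        result.append(sorted(group))
    return result


# Hypothetical cM values between pairs of Matches.
links = {
    ("A", "B"): 3442,  # parent/child
    ("B", "C"): 850,
    ("C", "D"): 45,    # too small at the default threshold
    ("D", "E"): 120,
}
print(clusters(links))                # default 90cM threshold
print(clusters(links, threshold=40))  # looser threshold merges the groups
```

Raising the threshold splits loose groups into tight family units; lowering it merges them, which is why an adjustable setting would be useful.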

Bottom line: I think many more, if not most, of our Matches will turn out to be real cousins within a genealogy timeframe (out through 8C level). This includes Matches with no Trees, Private Trees, UnLinked Trees and scrawny Trees – all of these are now put into the mix through Pro Tools. For me, compiling data from my 100,000 Ancestry Matches will be a way to bound (if not counter) the continued warnings that many of our Matches are false and/or distant. Some are, some are not – what can we learn?

As usual, I value your feedback – on the sacrilege of adding Matches to Tree branches based on strong interrelationships, but without fully documenting the genealogy; as well as the bigger picture of possibly linking Floating branches to “bare spots” in our Trees.

[22CX] Segment-ology: Pro Tools Part 16 – Sacrilegious Genetic Genealogy by Jim Bartlett 20241205