What’s all the buzz about “pile-ups”?  In my mind there are three kinds of pile-ups: small, medium and large. They are different, so it’s important to understand each one. In this case Goldilocks should prefer the large pile-ups, but let me go through my views of all three kinds.

Alert: This post contains my opinions about small pile-ups and AncestryDNA (based on my own experience) so you should make your own judgments.


I think the two keys to success with autosomal DNA lie in a robust Tree (as many ancestors out to 13 generations as possible) and as many Match-segments as possible (including as many close relatives as you can get). I spent about a year expanding my Tree as best I could, and then posted that GEDCOM in several places. I’ve tested at all three companies and use GEDmatch. I put every shared segment over 7cM that I can find into my spreadsheet, and I periodically run a Quality Control check against a fresh download to pick up any missed Matches or segments. I currently have 5,000 different individuals with segment data in my spreadsheet, and have determined a Common Ancestor (CA) with 309 of them.

I have compared virtually every segment against other overlapping segments, and formed Triangulated Groups (TGs) that cover over 90% of my 45 chromosomes. It is now rare for me to get a new shared segment that changes my chromosome map in any way. This process has provided some insights on medium and large pile-ups.


My definition of pile-up sizes:

  1. Small is smaller than 5cM
  2. Medium is 5-10cM
  3. Large is greater than 10cM
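
As a rough sketch, the three size bands above could be written as a small Python function for sorting spreadsheet rows. The function name `classify_pile_up` is hypothetical, and placing a segment of exactly 10cM in the medium band is an assumption based on "Large is greater than 10cM".

```python
def classify_pile_up(cm: float) -> str:
    """Return the pile-up size band for a shared segment length in cM.

    Bands follow the definitions above: small < 5cM, medium 5-10cM,
    large > 10cM. Treating exactly 10cM as medium is an assumption.
    """
    if cm < 5:
        return "small"
    if cm <= 10:
        return "medium"
    return "large"


# Example: label a few segment lengths (in cM).
for length in (4.2, 7.5, 12.0):
    print(length, classify_pile_up(length))
```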

Small pile-ups – by my definition, these pile-ups are composed almost entirely of IBS shared segments. When AncestryDNA first rolled out their autosomal DNA test, their threshold was 5Mbp. This threshold included many shared segments well below 5cM, and resulted in many thousands of bogus Matches. To their credit, they provided a caution about these. When AncestryDNA revised their threshold to 5cM, many of these Matches went away. Part of their explanation was the elimination of “pile-ups”. I agree that these “small pile-ups” should be eliminated, and resetting the threshold to 5cM should have eliminated the problem. However, their explanations continue to stress the elimination of “pile-ups”. I just hope they don’t also toss out Matches in larger pile-ups – throwing the baby out with the bath water.

Medium pile-ups – 5-10cM range. As I gathered as many segments over 5cM as I could and sorted them in my spreadsheet, I noticed a few areas that had many such segments, all in a very narrow chromosome area. Very clearly a pile-up! Virtually none of them matched each other, although they had almost the same segment start/end locations. And there were a lot of them – many more than in large TGs.

In discussions on various email lists, we compared notes, and found that most of these areas were unique to our own experience. In general they were not due to some common feature of most human genomes. A notable exception to this blanket statement is the HLA Region on Chromosome 6 – roughly from 29.8 to 33.1Mbp.

However, most of the other areas were not tied to known issues like the HLA Region. In my analysis, it was not possible for me to link these to one parental side or the other. The fact that these areas include so many IBC segments indicates to me that it’s the combination of both of my chromosomes (maternal and paternal) that allows the “matches”. It’s the unique combination of alleles in these small stretches of DNA that makes matching much easier. And this unique combination is only in my genome. On chromosome 18, I have 307 segments in the 7 to 11cM range. They are all in a very tight area: from location 5,800,000 to 8,700,000bp. Very few of them triangulate.

Sometimes the pile-up area has been documented. On chromosome 15, I have 281 segments in the 7 to 10cM range. They are at 24,000,000 to 28,000,000bp. This area partly overlaps a known pile-up area (20,100,000 to 25,200,000bp), but the known pile-up area is only partly the cause in my case. See the 14 small pile-up areas found by Li et al (2014), listed at the ISOGG Wiki (http://www.isogg.org/wiki/Identical_by_descent). These medium pile-up areas, and a few others in my experience, are characterized by a very tall pile-up of many segments of about the same size, in a narrow area just a little larger than the segments. The Li et al (2014) article refers to “regions where excess IBD is detected…” Virtually all of the segments I have noted above are IBS/IBC – they do NOT triangulate with the other segments. A few segments in these regions do triangulate with known close relatives, and with each other. I’ve kept those segments in maternal and paternal TGs, as appropriate, covering that area. After all, both my mother and father gave me those areas, and they in turn got them from their parents, etc. It is very probable that these triangulating segments are IBD and come from a CA.

My experience is that these are areas with a lot of shared segments in the 7-10cM range that are in a tight area, usually just 10cM wide, and a very high proportion of these segments are IBS/IBC.  A few segments in these areas will be IBD, but they will tend to be larger than the 7-10cM segments.
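
The pattern described above (many similar-sized segments stacked in one tight area) can be flagged roughly in code. This is a minimal sketch, not a description of any company's method: the tuple layout, the function name `find_medium_pile_ups`, and the `min_count=50` cutoff are all assumptions chosen for illustration. It only finds candidate regions, and whether a region is a "bad" pile-up still depends on whether its segments triangulate.

```python
from collections import defaultdict


def find_medium_pile_ups(segments, min_cm=5.0, max_cm=10.0, min_count=50):
    """Flag regions where many medium-sized segments stack up.

    segments: iterable of (chromosome, start_bp, end_bp, cm) tuples
              (a hypothetical spreadsheet export layout).
    Returns a list of (chromosome, region_start_bp, region_end_bp, count).
    """
    # Keep only segments in the medium size band, grouped by chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end, cm in segments:
        if min_cm <= cm <= max_cm:
            by_chrom[chrom].append((start, end))

    regions = []
    for chrom, spans in by_chrom.items():
        spans.sort()
        cur_start, cur_end = spans[0]
        count = 1
        for start, end in spans[1:]:
            if start <= cur_end:
                # Overlaps the current region: extend it and count the segment.
                cur_end = max(cur_end, end)
                count += 1
            else:
                # Gap found: close the current region and start a new one.
                if count >= min_count:
                    regions.append((chrom, cur_start, cur_end, count))
                cur_start, cur_end, count = start, end, 1
        if count >= min_count:
            regions.append((chrom, cur_start, cur_end, count))
    return regions
```

A region that comes back with a high count but a span only slightly wider than its segments is a candidate to skip over, apart from any larger segments in it that triangulate.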

My bottom line for these pile-ups: Unless you have a lot of free time, skip over these areas – particularly the shared segments under 10cM. Concentrate on triangulating any larger segments in these areas and then move on to other areas.

Large pile-ups – these are my favorites. Larger shared segments (over 10cM) that spread out and overlap each other over wider areas.  These segments tend to triangulate with each other, forming TGs on both sides.  I have some of these TGs which include over 50 shared segments.  Since the shared segments triangulate with each other, this is a good pile-up. These TGs are large because more people have these shared segments – probably because the Common Ancestors had large families in Colonial America, leaving us with many, many cousins. Another reason could be a more distant Common Ancestor, who would also leave us a large number of cousins.
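
The check behind "triangulate with each other" can be sketched as follows, assuming you have start/end positions for the segment you share with each of two Matches and the segments those two Matches share with each other (for example, from a one-to-one comparison). The function names and the `min_overlap_bp` parameter are hypothetical. A real workflow would normally also apply a minimum cM threshold, which this sketch leaves out.

```python
def overlap_bp(a, b):
    """Length in bp of the overlap between two (start, end) spans; 0 if none."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def triangulates(my_seg_a, my_seg_b, a_b_segments, min_overlap_bp=1):
    """Return True if Matches A and B triangulate with me on a shared region.

    my_seg_a, my_seg_b: (chromosome, start_bp, end_bp) for the segment
                        I share with A and with B.
    a_b_segments: list of (chromosome, start_bp, end_bp) segments that
                  A and B share with each other.
    """
    chrom_a, start_a, end_a = my_seg_a
    chrom_b, start_b, end_b = my_seg_b
    if chrom_a != chrom_b:
        return False

    # Region where my segment with A and my segment with B overlap.
    lo, hi = max(start_a, start_b), min(end_a, end_b)
    if hi - lo < min_overlap_bp:
        return False

    # A and B must also match each other somewhere inside that region.
    return any(
        chrom == chrom_a and overlap_bp((lo, hi), (start, end)) >= min_overlap_bp
        for chrom, start, end in a_b_segments
    )
```

Segments that merely overlap on the chromosome without passing the last check are the ones that end up in the IBS/IBC pile rather than in a TG.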

In some cases we can use this observation to our advantage. I have a 2nd cousin, on his mother’s side, who is also an 8th cousin, on his father’s side. Our close Common Ancestor was an immigrant to the US in the mid-1800s, and I get relatively few Matches on the segments I share with him. However, on one segment, we have many Matches – it turns out our Common Ancestor is on his father’s side. The tip-off should have been the size of the TG (measured by the number of Matches).

Another observation about large pile-ups…. They will get larger. The number of folks taking an atDNA test is about doubling every 12 months. A consequence of this is that all of our TGs will also double in the next 12 months. So, if you have pile-ups now, they will about double by this time next year. Use these larger TGs to your advantage – work with the Matches to investigate place/time matches, if a Common Ancestor is not easily determined.
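
The doubling arithmetic above can be written out as a short sketch. It assumes, as the paragraph does, that a TG grows in step with the number of testers, and the function name and the 12-month default are illustrative only.

```python
def projected_tg_size(current_matches: int, months: float,
                      doubling_months: float = 12.0) -> float:
    """Project the number of Matches in a TG after `months`,
    assuming the count doubles every `doubling_months`."""
    return current_matches * 2 ** (months / doubling_months)


# A TG with 50 Matches today, projected one and two years out.
print(projected_tg_size(50, 12))
print(projected_tg_size(50, 24))
```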


Summary:

  1. In general, don’t work with shared segments below 5cM. Most are IBS/IBC – even if they appear to triangulate. We don’t have a good test below 5cM to indicate IBD.
  2. Watch for, and avoid, pile-ups in the 5-10cM range. These are characterized by many shared segments in the 5-10cM range in a very tight location – usually only 10 or 11cM wide. Move on to larger shared segments in other locations.
  3. Embrace the large pile-ups. They may come from Common Ancestors with large families and/or more distant Common Ancestors. In either case, work with the Matches in these TGs as a Team to determine the Common Ancestor.

18 Segment-ology: Pile-ups by Jim Bartlett 20151007

31 thoughts on “Pile-ups”

  1. Excellent work and very informative. A very nice 71st birthday present as well.

    In reference to Colonial Ancestry and your suggestion that it skews results: one issue with my tree is that almost all of my ancestors are Colonial American. Most of the rest are Canadian or are not identified yet. I believe that my newest immigrant ancestor arrived in 1829. Probably half of my ancestors were in the Americas pre 1700. A large number of my AncestryDNA matches show multiple, indicated by clickable arrows, up to 5 separate and different matches. What impact do you think this sort of tree might have? The overwhelming number of Ancestors go back to the 1600’s.

    Also note, my tree is an 80,000-plus-person “data mining tool”. Questionable new data is allowed. But errors are diligently searched for and, when detected, ruthlessly corrected. atDNA has been very useful in finding and correcting errors. I’ve bought atDNA kits for 8 people and work with a number of others who administer a similar number of kits.


    • Sam – Thanks and Happy Birthday! It looks like you are well on your way with atDNA. I’m working on a blog post about endogamy. I personally believe it’s an overblown issue – some impact, but not much – but I’m crunching the numbers to give a fair look at it.


      • Have you written anything on working with endogamous groups? Or can you recommend some reading that could help? The lines on my paternal side are very tangled. The group that I’m working with has about 60 kits on gedmatch. I match 88% of this group. The highest kit matches 93% of the group with the lowest matching 50%.


      • Jeannette,
        I’ll start with the effect of one set of first cousins as ancestors and then extrapolate to endogamy – we’ll see how the numbers shake out.


  2. Dear Jim, Thanks for sharing your work and your conclusions. I appreciate same and I have learned a lot from you these past few years. Linda McKee


  3. Thanx again for very timely and informative article. I believe that Ancestry.com has thrown lots of babies out with the bath water. I can point to 4 matches quickly on my list with segments of 18.4 to 19.7 cM that disappeared with Ancestry’s first purge. The same 4 matches remain on the list of my 2nd cousin and I believe all of my segments are larger. I’m sure that there are many more. My confidence in their process is waning!


    • Jeanette,

      Thanks for the encouragement. I’ve heard a number of stories of large segments which dropped out, as well as Ancestry kits found at GEDmatch with large IBD segments that AncestryDNA didn’t report. That’s why I like GEDmatch so much – yes, I get more IBS segments, but with their comparison tools, it’s easy to cull them out.


  4. Very good article! I agree 100% that just calling the large pile-ups “pile-ups” is misleading because they are completely valid and the ones you WANT to focus on. John Abbott.

    Your success with triangulation is very encouraging to the rest of us –thanks!


    • John, That’s exactly why I wrote the blog post. “Pile-ups” was turning into a bad name; and folks were beginning to believe that all pile-ups were bad news. As in many things, we need to look at the details and sort it out.
      Triangulation, I admit, is hard work. So I recommend just starting on some areas that are interesting to you. Gradually you’ll build out the framework of your segments on both sides.


  5. Hi Jim
    Do you have a sample spreadsheet that you use for chromosome painting and triangularization that you would be willing to share? I would like to start building a chromosome map and would appreciate starting with a tool that has been exercised and is set up fit for the purpose…

    Thanks for your blogs!


    • Douglas, I’m working on several blog posts about spreadsheets. Mine has evolved, a lot, over the past 5 years. One caution: whatever you want to see in a spreadsheet, is what you have to enter and maintain – it’s literally a two-edged sword. I’ve got way too much data in mine. Way more than is needed for just Triangulation. But, I also use my spreadsheet as my master atDNA tool, diary and repository – it keeps everything in one place. I emailed you the format – watch for the blog post.


  6. Thanks for your marvelous posting! I truly enjoyed reading it; you could be a great author. I will ensure that I bookmark your blog and will often come back down the road.

    I want to encourage you to continue your great job. Have a nice day!


  7. I decided to use the GEDMatch chromosome browser to look at the top hundred or so of my X-chromosome matches. As you know, you can’t use X chromosomes to prove much if anything about distant genealogical relationships, but I wanted to look for patterns in what chromosomal regions are matched. I notice that there are definite patterns with strong pileups on the X chromosome, with several distinct regions, so it basically looks like 3 stripes on that chromosome, and on a region in chromosome 12 in particular. Some chromosomes look like scattered noise, but not those.
    However – I know that Bayes and the birthday paradox may both play a role in these patterns showing up. I need to do more sophisticated analysis. Has anyone tried anything similar?


    • Lois,
      I don’t know of such a study. IMO, each segment of our DNA came from a specific ancestor. All the DNA on a chromosome, including Chr X, came from a parent, who got it from their parent, etc. Somewhere going up our ancestry, that segment is a recombination from 2 ancestors, and it stops being an IBD segment. Drop back one generation and that ancestor is the last one to pass that entire segment to you. Without chromosome mapping we can’t usually tell which ancestor is the ultimate one. See my post on the porcupine chart. With Matching cousins, we can “walk each segment back”, and find this out. Somebody had to pass down each segment of Chr X. Yes, smaller segments may be from distant ancestors, so we need to look for the larger segments with closer cousins.


  8. Echoing other replies here over a year after this post: this is extremely helpful information as are all your posts. Thank you for sharing and explaining your experiences. This seemed like it was too hard till I started reading and re-reading your blog.


  9. Thank you for this write-up. I have some lingering confusion that I hope you can help clear up for me. I just discovered about a dozen FTDNA matches for my father that are all tagged as Maternal. They all have roughly the same start and end locations (of about 9.5 to 10.1 cM) on the same chromosome. I’ve run them all through the FTDNA Matrix tool and they all show as matching each other. Is this a medium pile-up or a TG? Sorry for being dense about it, but how are we to tell the difference between the two? If they do not match in the Matrix, then is it a pile-up? A known 3rd cousin has this same segment, plus several others. How can I tell if this is an actual IBD segment and not an IBS segment that the 3C also just happens to have? I am new to looking for TGs, so I don’t want to start off on a wild goose chase! I’d like the first one to be legit. 😀


    • Matt, Triangulation generally ensures that shared segments are IBD. The FTDNA Matrix tool is an InCommonWith tool, and an InCommonWith result does not necessarily mean Triangulation, but when the Matches overlap on the same chromosome, they are almost always Triangulated – so I think you are OK. I would not call 12 Matches a pile-up. I have many TGs with 20-50 Matches, and new Matches pour in every day. A pile-up can occur in a TG – it’s just a lot of segments. A bad pile-up is usually a lot of shared segments in the 7-9cM range, all in a very tight area on a chromosome (maybe an area of only 10cM). Your best bet is to upload your raw DNA data to GEDmatch, where you can test your Matches’ segments against each other to ensure Triangulation. A lot of Matches in a TG usually means the Common Ancestor of the TG is further back in history, but often some of the Matches will be closer cousins – like your 3rd cousin.


  10. This article was very helpful. I was quickly able to identify that my matches near the end of chromosome 10 that I was wondering about fell into the large pile-up category. I started with 70 matches in that location and the list grew. By the time I added my mother’s results, there were well over 100 people who more or less all matched with one another. Several of our closest matches were over 20 and a few were over 30 cM. Through connections with some of these matches, I have now found that the roads all lead to County Mayo, Ireland before 1790. We have not yet found the common ancestor, but we are narrowing down the locations within Mayo and identifying a collection of surnames that pop-up frequently.


    • Tetley, Thanks for the feedback. To my thinking, the only bad pile-ups are those with many segments under 10cM, in a tight group, which actually don’t match each other. The larger segments, particularly any over 15cM, are going to be IBD (from a Common Ancestor). If there are a lot of Matches in a TG, that generally means the Common Ancestor is further back in your ancestry; but some of the larger shared segments will be with closer cousins.


  11. Thirteen generations is not possible for those of us with early Southern ancestors, where courthouse fires destroyed vital records in the 1800’s. For some of us, 1850 is an unsurmountable wall. Proving parents of poor Tennessee and Kentucky ancestors is often not possible. Then there’s the early 1900’s immigrant from a European country, where records burned during both world wars. Mid-1800’s immigrants from European countries that don’t have online records. The vast majority of my matches are distant, making DNA less helpful than it might be for others. Maybe a decade from now there will be enough tests to make connections more clear. I’d be thrilled if more people tested for matches instead of ethnicity matches, and added public trees with a minimum of four generations.


  12. Pingback: Comparing DNA using GEDmatch – Knight Family of Seattle

  13. Jim some follow up questions on your post: Summary 1: I have 2 siblings and my mother tested, as well as my father’s sister. If all three siblings and our mother, or in another case all 3 siblings and our aunt, have the exact same 4cM segment, what am I to make of this? It is obviously IBD for at least 1 or 2 generations… but could the original segment from my mother (or aunt) be IBS? So is the segment IBS or IBD? Summary 2: so is there an explanation as to why one would have several 9cM segment match pileups that are not IBD? Summary 3: Yes large cM segment matches are great… but I already know my full tree back several generations. It is the 7-15 generation range that I need to investigate… yet your post seems to imply don’t pay much attention to segments 12cM? Could you please post on how the tedious effort of triangularization will lead to 17th or even 16th Century genealogical value? Thanks.


    • Jason; these overlapping segments are obviously overlapping. They may be true or not – that is the problem, we don’t have an easy way to tell. Using phased data is one way. But even if the overlapping segments are IBD, the highest probability is that they are from a very distant ancestor. You know of a close Common Ancestor, and they may be from that, but the segments could also come from a very distant ancestor. In general, we cannot rule out all distant ancestors. When we share a large segment – say 220cM – it’s fairly easy to rule in, or out, all the relatives who could share that amount of DNA. With 4cM segments the cadre of possibilities is just too much. We cannot just assume the 4cM comes from the closest Common Ancestor, or our favorite Common Ancestor, or a famous Common Ancestor, etc. Too many folks start with what they are trying to prove, and only look at data that supports that conclusion, without considering the other possibilities. Summary 2 – yes – sometimes our own DNA has an area on one chromosome that includes a unique combination of SNPs from both parents that permits a zig-zag IBC match. Many of us have found these under 10cM (very easy at GEDmatch) – in general this doesn’t happen above 10cM. Summary 3: Utilize combinations of MRCAs on the same line – either several, widely separated, Matches all agreeing on the same CA; or several, widely separated, Matches with MRCAs which are all on the same ancestral line for you – in other words, a 2C, a 3C, a 5C and a 7C all on the same line – called walking the Ancestor back. It is not sufficient to just have the 7C and an MRCA (unless you can positively eliminate all the other possible MRCAs for that segment – a virtually impossible task). Jim


    • With about 2 million people having already taken an atDNA test, we are going to see our Triangulated Groups growing a lot. Each TG represents a DNA segment you got from a specific ancestral line. If there are a bunch of smaller shared segments in the TG – say 7-12cM – that is a very strong indication to me that they are from fairly distant ancestors – say 10-20 generations back – on average. So my focus would be on trying to find a Common Ancestor with the Matches with larger shared segments – they will tend to be closer cousins. Jim

