Why Do I Share Different Amounts of DNA?


– [Blaine] Hi, my name
is Blaine Bettinger. I wanted to talk to you
today about the issue of sharing different amounts of DNA with the same cousin at
different testing companies. Now, if you have the same
match at different companies, you’ve probably experienced
something like this where they might share 85 centimorgans with you at Family Tree DNA, but they share only 70
centimorgans at GEDmatch and maybe 55 centimorgans at AncestryDNA. Why do we see those different amounts? Shouldn’t they all be
the same exact amount? If it’s the same cousin
that’s being compared, their DNA to your DNA, shouldn’t they all be the same amount? Well, let’s look at this
from a real-life example. This tree is artificial, but
the match with Luciana is a real match. Luciana and I are approximately fifth cousins, and we have both tested at
four different companies and transferred our DNA to GEDmatch. So, what we’re gonna do
is we’re gonna compare how much DNA we share
at 23andMe, AncestryDNA, Family Tree DNA, GEDmatch, and MyHeritage. Will the amounts be the same, close to each other, or different? And if they are different,
what are some of the reasons why they might actually be different? First, we’ll talk about 23andMe. Luciana at 23andMe,
she is, as you can see, a predicted third to fourth cousin. She shares 0.81%, which translates to 60.13
centimorgans at 23andMe. And as you can see, she
shares four segments with me. If we go to the chromosome
browser at 23andMe, we can see the actual segments. So those four segments
break down like this. There are two segments
very close to each other on chromosome five. One is 17 centimorgans. One is 15. There’s a very small
segment on chromosome seven, and there’s a large
segment on chromosome 21. So those are the four segments that 23andMe identifies Luciana
and I sharing in common, and that adds up to 60 centimorgans. Okay, well, now let’s look at Ancestry. At Ancestry, we’re predicted
to be fourth cousins, in the range of fourth to sixth cousins. And as you can see, now we
have a substantial difference. Now we have 48 centimorgans instead of the 60 we had at 23andMe. We are still at four segments, however. Now, as for the AncestryDNA
segment information, as you probably know, we don’t get segment
data from AncestryDNA. So we don’t know how
these segments compare to the segments we have
from 23andMe, for example. Okay, at Family Tree
DNA, Luciana is predicted to be a second cousin to a fourth cousin, and now we have
significantly more shared DNA. We have 76 centimorgans that
we are predicted to share. And when we look at
the chromosome browser, that works out to three segments, although there’s a little
bit of a hitch there that we’ll talk about. So at a threshold of six
centimorgans or greater, we share actually 48 centimorgans. So that’s a big difference, right? If we go back a slide, we see
we share 76 centimorgans here. But when we actually look
at the chromosome browser, these are the only
large segments we share. And actually if you use
the chromosome browser at Family Tree DNA, what you’ll see is that
Family Tree DNA includes a lot of the smaller segments
that boost up that number. So if I included all the segments of, say, one, two, three, four, five,
and so on in centimorgans, it bumps it up from where it is now, 48 centimorgans, up to
a total of 76 centimorgans. So one of the major
reasons we see differences from one company to the next
is the thresholds they use. Okay, let’s go to GEDmatch. Now, at GEDmatch, using
the one-to-many comparison, Luciana and I share 72.9 centimorgans. That’s the one-to-many. This is the fishing tool at GEDmatch to find all of your matches. Here is the one-to-one. So, for example, we could
click on the little A there that’s hyperlinked, and that will bring us to the
one-to-one, Luciana and I. When we do that, we get a
total of 63.4 centimorgans when we add up these four segments. So the differences there, they’re due to the different thresholds and different analyses that are done. But what we can see is
that 63.4, for example, is very similar to what we got at 23andMe. The segments are pretty similar too. We have the two segments
close to each other in chromosome five, the small
segment on chromosome seven, and the large segment on chromosome 21. Now, at MyHeritage, we are predicted to be third to fourth cousins. We share 66.7 centimorgans or 0.9%, and that’s across three segments. Now, if you’ve been keeping
track of the segment sizes, you can see it says the
largest segment here is 37.6 centimorgans, and that’s much larger than
the segments we’ve seen in the other ones, and we’ll see why that
is in just a moment. So when we look at the segment
information from MyHeritage, remember, it said there
are three segments. And what we see is that
what MyHeritage has done is combined the two
segments on chromosome five. So instead of having
those two smaller segments very close to each other, which very likely could
be a single segment. Here they’ve combined it
into a single segment. So we have now a large
segment on chromosome five, that small segment on chromosome seven, and that large segment on chromosome 21. But that adds up to the total of 66.7 centimorgans from MyHeritage. Okay, well, now what we
can do is we can look at the different amounts from the different testing companies. So we have a total of 60
centimorgans from 23andMe, 48 from AncestryDNA, 76 from Family Tree DNA or 48 if we subtract
out the small segments. At GEDmatch, we have either 72 or 63 depending on which tool we’re using. And at MyHeritage, we have 66.7. So as you can see, we are getting some
very different numbers. Now, some of the reasons
why they’re different, for example, is AncestryDNA
can underestimate the total due to an algorithm called Timber that will downweight some segments. So, for example, Timber, what it does is it looks at segments that are overrepresented, meaning you match more people
than you realistically should on that segment, and so it
will downweight that segment. So what that has the effect of
doing is lowering the amount of total centimorgans that
you share with another match. And as we can see here,
indeed, the 48 centimorgans at AncestryDNA is among the lowest. Also at Family Tree DNA, they were among the highest
with 76 centimorgans due to the fact that those
small segments were included. So there are various reasons. One of the other reasons
why we see differences is that, for example, MyHeritage uses something
called imputation. Imputation is where it
uses population genetics to sort of fill in the gaps
in between tested SNPs. Sometimes that can have the effect of bumping up the numbers,
usually making them higher, but sometimes causing
breaks and other issues. So there are a bunch of reasons. Another reason is that
starts and stops are fuzzy. Because the SNP chips that are
used are sampling the genome rather than sequencing all of our DNA, it’s really hard for the
company to tell exactly where a shared segment of
DNA will start or stop. They have their own algorithms for determining where a
segment starts and stops, and if there’s a little
bit of difference there, then that can result in a
change in the segment size. That’s usually going to
have a much smaller effect than some of the other things like Timber or small segments or
imputation, for example. One of the most common questions I hear is which of these numbers is right? Which should we use for our analysis? And do these differences matter? As you can see here, again, we are talking about a pretty big range, from a low of 48 centimorgans
to a high of 76 centimorgans. That’s a really big difference. The question of which we should use is in fact an important question. Well, does it even matter? Is this an important thing to consider, or is it something we
should largely ignore? Well, it can make a difference if the differences are large enough, meaning if we are seeing a big difference from the lowest to the
highest, for example, it can make a difference. So let’s look at relationship predictions using these numbers. All right, so here, for example, what we’ve done is we’ve
gone to DNAPainter.com and we have used these
relationship probability tools to give us an idea of
the various relationships using the smallest number. So the smallest number
was from AncestryDNA, probably as a result of that downweighting from the Timber algorithm we talked about. So what we can see here is it gives us about a 30% probability of relationships like half third cousin,
third cousin once removed, half second cousin twice removed, second cousin three
times removed, and so on. It also gives us about a 25% probability that it’s a more distant relationship on the order of a fourth and beyond. All right, well, that’s
the smallest number. Let’s look at the largest number,
which was 76 centimorgans. Now, we really shouldn’t be
using that number at all, because we know we need to
take out those small segments. Even if we used just 66, which was the number from
MyHeritage, for example, the result would be very similar to this. And as you can see, we
get a 32% probability of relationships like third cousin, half second cousin once removed. So what has happened is
that using the larger number has pushed some of the relationships up to a higher probability. So looking at third cousin
on the left, for example, we had a 15% probability. On the right, it’s now a 32% probability. Now, in reality, that probably seems like a bigger difference than
it really is in practice. What I mean by that is once we get down in the range of 75 to
50 centimorgans or so, these fit into a lot of
different categories. In fact, it fits into all
these categories we see on the screen. And as a result, while some
relationships are going to be more likely than others, we still need to consider all these relationships as possibilities. And that’s one of the problems
of relationship predictions. So although we do want to try to use a number we think
is the most accurate, maybe in the middle of the range if you’re actually able to
test at multiple companies, at the same time we need to realize that it is not going to
give us a level of accuracy that will allow us to do more
than we can do with it anyway, meaning we still have
to do all the research. We have to do the genealogical
research and add the DNA in. It’s not as if the amount of DNA at this range is gonna make
us that much more confident in one relationship or another. It’s still gonna be within
a lot of different ranges. So hopefully that gives
you some very quick ideas of why you might be
seeing different amounts at different companies, see how it can vary from
one company to the next, and gives you a feel for what you might do and understand with
those different amounts. Thank you very much for joining me.
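
The totals quoted in this video can be lined up in a few lines of Python. This is only a minimal sketch: the company totals are the ones mentioned above, while the segment list and the `total_above_threshold` helper are hypothetical, made up to illustrate how a minimum-segment threshold changes a total. They are not the real segments for this match, and no company's actual matching algorithm is reproduced here.

```python
from statistics import median

# Shared-DNA totals (in centimorgans) quoted in the transcript for the same match.
reported_cm = {
    "23andMe": 60.13,
    "AncestryDNA": 48.0,
    "Family Tree DNA": 76.0,
    "GEDmatch one-to-many": 72.9,
    "GEDmatch one-to-one": 63.4,
    "MyHeritage": 66.7,
}


def total_above_threshold(segments_cm, min_cm):
    """Sum only the segments whose length is at least min_cm (hypothetical helper)."""
    return sum(cm for cm in segments_cm if cm >= min_cm)


# Lowest and highest reported totals, the spread between them, and the middle value.
low = min(reported_cm, key=reported_cm.get)
high = max(reported_cm, key=reported_cm.get)
spread = reported_cm[high] - reported_cm[low]
mid = median(reported_cm.values())

print(f"lowest:  {low} ({reported_cm[low]} cM)")
print(f"highest: {high} ({reported_cm[high]} cM)")
print(f"spread:  {spread:.1f} cM, median: {mid:.2f} cM")

# HYPOTHETICAL segment lengths, invented for illustration only.
example_segments = [20.0, 15.0, 10.0, 4.0, 3.0, 2.0, 1.5]

# Counting every small segment inflates the total compared with a 6 cM cutoff.
print("all segments >= 1 cM:", total_above_threshold(example_segments, 1.0))
print("only segments >= 6 cM:", total_above_threshold(example_segments, 6.0))
```

Taking the median here is just one way to pick "a number in the middle of the range"; it does not replace the genealogical research that still has to be done.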
