DNA Ancestry Testing: Are Their Reports Rooted in Reality?
Last updated in May 2019
23andMe’s current “Meet Your Genes” ad campaign promises “More ways to discover what makes you, you.” AncestryDNA teases that “Millions of people have uncovered something new about themselves. You can too.” MyHeritage exclaims, “Be amazed…Uncover your ethnic origins and find new relatives with our simple DNA test.” HomeDNA, with its “GPS Origins” report, says it can “Pinpoint your ancestry” with “DNA results with a level of specificity no other test can replicate.”
This marketing is wildly successful: More than 15 million people worldwide have succumbed to it by submitting their DNA samples; AncestryDNA alone has processed 10 million.
Unfortunately, what these companies deliver often falls well short of advertised promises. Some promise to pinpoint ancestral origins. They can’t really do that; there will be no pinpointing. Many suggest their methods, reference data, and algorithms are superior to their competition’s. The only way to test those claims is for the companies to share their data and algorithms with the scientific community, and they don’t allow that.
Most of us won’t learn much about our roots from these services. While they can determine whether our genes include traits from specific continents, and perhaps narrow down our origins to broad regions on those continents, estimating ancestral ties to smaller regions involves a lot of guesswork.
Three of Us Sent Samples to Eight Ancestry Companies—and Learned Little
At the bottom of this article, we share the ancestry estimates Checkbook received from eight services for me and two other genetic guinea pigs.
Each service labels various regions and ethnicities differently. One might lump together all the British Isles and Ireland with Northern Europe; another might break them apart into four sub-regions. They also often use different labels for various ethnicities or regions (“South and West European” vs. “Iberian” or listing Italian, Spanish, and Portuguese separately).
In the summaries, we consolidated and renamed some of the company-defined regions to make comparisons easier.
As you can see, the reports generally agreed with one another on ancestry at the continental level. But once companies tried to express their estimates at regional levels, their reports were less consistent.
Some of these companies fail to adequately explain to their customers why their estimates are imprecise. Many obstacles stand in the way of accurately assigning customers to their ethnic ancestry.
How They Try to Measure Ancestry
About 99.5 percent of all human DNA is exactly the same from person to person. When labs examine DNA, they don’t waste time looking at all of it; they just look for little differences—variants or mutations—passed from generation to generation. Their software scans for patterns in these variants known to be associated with an ethnic group (or a congenital medical condition).
Scientists map out these patterns by building reference datasets using DNA they’ve collected from people globally. Studying how variant patterns are shared by people living in different parts of the world helps them build systems to categorize DNA by ethnicity and create algorithms that analyze new DNA data and estimate the ethnicity associated with it.
Each company has its own reference dataset and algorithm, but all probably partially rely on publicly available reference datasets. We say “probably” because these companies don’t disclose their proprietary formulas and data. “These processes are not validated by the scientific community,” says Sheldon Krimsky, Professor of Humanities and Social Sciences at Tufts University and board chair of the Council for Responsible Genetics. All that secrecy makes it impossible to judge which companies provide the best ancestral estimates—and they all claim theirs is tops.
Why It’s Difficult to Do
Even if companies made their reference datasets and algorithms public, it would still be difficult to test them for accuracy. There are several obstacles to estimating ethnicity precisely.
With each generation, lots gets lost.
With each generation, a lot of traits—and info labs can look for in DNA—don’t get passed on to offspring. When analyzing our DNA, these companies can connect us to many of our ancestors, but they can’t connect us to family with whom genetic connections are lost. For some of us, what’s left is a highly incomplete ancestral picture.
The math gets hard when you look back 1,000+ years.
As you trace your family back across generations, the farther back you look the less each of your ancestors contributes to your genome. Say one of your great-grandparents was Japanese and the rest of your ancestors were of a different ethnicity. Because you have eight great-grandparents, that one great-grandparent contributed about one-eighth of your genes on average, which makes you roughly 12.5 percent Japanese.
Now trace your lineage back 1,000 years. Your family tree would consist of many thousands of ancestors, the earliest of them each contributing only a tiny fraction of your genes. With so many ancestors, it’s impossible to take your DNA strand and trace it back to one ethnic source, or even to a handful of them. There are too many paths to follow.
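For readers who want to see the arithmetic, here is a minimal sketch of the halving described above. It assumes every person has two parents and that each generation spans 25 years; that generation length is our illustrative assumption, not a figure from any testing company. The function names are made up for this example. Note that it counts positions in the family tree, not distinct people: going back far enough, the same ancestors fill many positions.

```python
# Assumed generation length in years (illustrative only; not from any company).
YEARS_PER_GENERATION = 25


def ancestor_slots(generations: int) -> int:
    """Positions in the family tree at this depth (two parents per person)."""
    return 2 ** generations


def expected_share(generations: int) -> float:
    """Average fraction of your genome inherited from one ancestor at this depth."""
    return 1 / 2 ** generations


# Great-grandparents are three generations back.
print(ancestor_slots(3), expected_share(3))  # 8 0.125

# About 1,000 years back, under the assumed generation length.
generations_back = 1000 // YEARS_PER_GENERATION  # 40 generations
print(ancestor_slots(generations_back))  # 1099511627776
print(expected_share(generations_back))  # a vanishingly small fraction
```

These shares are averages. Because of the way DNA is shuffled and passed down, the actual contribution from any one distant ancestor can be larger, smaller, or nothing at all.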
While DNA ancestry services can guess about your ancestry using mappings of variants (found by studying lots of DNA), they likely can’t guess very precisely. Yet some of the eight companies we tried sent us highly detailed regional ancestral ethnic percentages, down to one-tenth of a percentage point. They really can only pretend to possess enough knowledge for that kind of precision.
One thing we like about AncestryDNA’s reports is that they show statistical confidence intervals when users click on each estimate. LivingDNA’s reports let us view our ancestry estimates at different confidence levels; 23andMe also lets you do that, but you really have to dig into your online report to find that feature. Since many companies calculate their estimates using a 50 percent confidence level, we wish all the services clearly provided the context AncestryDNA provides. At such low confidence levels, if a report says you are 12.2 percent Scandinavian…well, that indicates its algorithm says maybe you’re as much as 20 percent Scandinavian, but maybe you’re not Scandinavian at all. It’s very unlikely you’re exactly 12.2 percent Scandinavian, and it’s silly to report such a precise statistic.
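To illustrate the kind of presentation we would prefer, here is a small sketch that reports an estimate as a rounded figure plus its plausible range instead of a single number with a decimal point. The function name is hypothetical, the 0-to-20 range comes from the Scandinavian example above, and nothing here reflects how any company actually formats its reports.

```python
def describe_estimate(label: str, point: float, low: float, high: float) -> str:
    """Show a rounded estimate with its range rather than a falsely precise figure."""
    return (
        f"{label}: about {round(point)}% "
        f"(plausible range {round(low)}-{round(high)}%)"
    )


# Figures taken from the Scandinavian example in the text.
print(describe_estimate("Scandinavian", 12.2, 0, 20))
# Scandinavian: about 12% (plausible range 0-20%)
```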
Ethnicity is ambiguous.
Throughout history, humans wandered around a lot or were forced to move. As people migrated, they often didn’t replace those already living in the places where they settled. Instead, these movements usually resulted in mergers. Anna, our Colombia-born volunteer, was surprised that some of her reports indicated North African and Middle Eastern ancestry. But she quickly realized why: Many of her ancestors migrated to South America from Spain, and the Moors conquered most of Iberia and held parts of it for more than 700 years.
All that mixing makes ancestry hard to trace. Often the little differences scientists look for in our DNA aren’t unique to what we today consider distinct ethnic groups. For example, they can’t see enough patterns in DNA to isolate Italians from other Southern European groups.
But the dilemma for DNA ancestry companies is that their customers expect a lot of detail, as did I and our other two volunteers. Receiving a report saying we hail from a broad geographic region (“Southern Europe,” “West & Central Africa”) isn’t nearly as satisfying as learning we’re “Italian” or “Gambian.” Consequently, many DNA ancestry services report ethnic estimates their customers can identify with culturally or find on a modern map, even if these connections aren’t very precise or historically realistic. “It’s entertainment. Most of the time they give people what they want or expect, and that makes them happy,” says Krimsky.
While some companies overreach when calculating their estimates, one makes things too complicated. HomeDNA’s reports try to give customers their “deep ancestry.” But our testees weren’t nearly as interested in its reports as in the others we ordered. My report indicated I am nine percent “Tuvan.” My reaction: “What can I do with that? I don’t even know where that is.”
But the conservative assessments sent by a few companies seemed too vague to our three volunteers and Checkbook’s research staff. We found National Geographic’s reports on how people’s genes migrated through the ages informative, but its ethnicity estimates were so broad they seemed like a waste of money.
Most of the collected genetic data is Eurocentric.
Because most genetic research to date was done by European and North American researchers using DNA collected from European and North American populations, the reference datasets DNA ancestry services use aren’t very diverse. They especially lack data from African and Asian populations. We suspect one reason companies’ estimates for Anna and Nicole differed so much is that they simply haven’t worked hard enough to serve those with Native American and African roots. If geneticists and DNA ancestry companies continue to add to their knowledge by analyzing DNA from different parts of the world, these answers may get better.
However, there is room for improvement.
In addition to improving their services for a more ethnically diverse customer base, companies should continue to hone their methods and algorithms. After we submitted our volunteers’ samples, AncestryDNA changed its algorithm, which affected its estimates for most customers. It was quite interesting to see how its reports shifted based on its new analysis and, of course, also confusing. “When I first got tested by AncestryDNA, the results said I was 30 percent Irish,” says our editorial director Jenn. But a few months later, post-rejiggering, her Emerald Isle genes were measured at just 19 percent. “But I’m still going to visit Dublin,” she insists.
You Might Learn More by Doing Old-school Genealogical Research
Most of us can build detailed family trees for many generations with a little research. Click here for a list of several available genealogical resources, many free.
But many consumers view genealogical research as too much work. Even a few hours mapping out a family tree seems like a dreaded homework assignment. Sending in our DNA to a company that promises to apply cutting-edge science to pinpoint ancestry is new, easy, and fun.
But while the DNA ancestry services can provide only very general info about our roots, a little time spent on genealogical research often provides rich details about our families. Some of us can easily identify specific relatives—and where they were born and lived—going back hundreds of years. Even if your research leads to a dead end after a few generations, there’s usually still much to learn. On Ancestry.com, I unearthed extensive notes from an oral history my grandmother gave to a relative. And I quickly and easily traced my maternal line to the first births in New Amsterdam. You won’t find these kinds of concrete facts and rich histories via any DNA reports.