Wesley Johnston's DNA GEDmatch Recommendation Page
Oh, what a tangled web we weave, when first we practice to conceive.


I first did a genetic genealogy test in 2009. In my many years of doing genetic genealogy, the most common hurdle is working with people who do an autosomal DNA test and do not realize how much more benefit their test results could provide them than their testing company has to offer. The testing companies -- Ancestry, most of all -- give the illusion that they have all you need, so that people are not even aware that there is so much more -- not aware that they are receiving pennies of value for the dollars that they spent. The one key thing that everyone who does an autosomal DNA test should do is upload their DNA results from their testing company to the free GEDmatch website (www.gedmatch.com).

GEDmatch provides two enormous benefits that vastly enrich your genetic genealogy results.

  1. GEDmatch lets you compare your kit to anyone else who has tested at any other company -- and uploaded their results to GEDmatch. No matter how large the database of testers at the company where you tested, it is only one company. There are now five major testing companies (alphabetically): Ancestry, Family Tree DNA, Living DNA, MyHeritage, 23andMe. In addition, people have tested at the National Geographic Project and at more geographically-specific testing companies. All of these are there for you to match against on GEDmatch.

  2. GEDmatch has, by far, the best autosomal DNA analytical tools -- far better than any of the testing companies. These tools allow you to achieve the maximum benefit now possible with autosomal DNA results.
    (See Tim Janzen's 2019 RootsTech slides, AND keep in mind that GEDmatch adds new tools every year. For example, the very powerful Multiple Kit Analysis tool was not there when this presentation was made.)
The late Dutch astronomer Bart Bok once said "Some people come to the fountain of knowledge to drink, others just to gargle." If you have not uploaded your DNA results to GEDmatch, you will never drink fully of the knowledge that they contain.

The truest thing about genetic genealogy is that it is a very dynamic field. While there are some things that stay constant or change only slowly, a great deal in genetic genealogy changes in just a year or two. Writings and presentations in 2016 about genetic genealogy likely do not reflect the current state of the art in 2022 and in fact some statements made then have been proven wrong. Even information posted in 2019 can be flat out wrong in 2022 since it was about a situation in 2019 that has radically altered in 2022.

If someone recommends a presentation or publication about genetic genealogy to you, the first thing to do is check the date it was made, to see whether it really reflects the current state of the art or is about something that may now be obsolete or wrong.

And that goes for this web page just as much as for any of the others. What I published originally on this page or what you found here when you looked a year ago may very well be obsolete and wrong now. So, I am doing my best to keep it in sync with the reality of the current situation. So, the first thing you should do right now is to go to the bottom of this page and see the date of the most recent update.


GEDmatch Bashing

A new industry has grown up telling people how dangerous GEDmatch is and why you should not use it. Quite simply, there is only one valid reason for not using GEDmatch, and it is a reason that most people not only do not object to but support.

  • Some are simply unaware of just what GEDmatch has and fear that some kind of hack will lead to their raw DNA results being stolen. (This section updated May 2022.)
    GEDmatch's Terms of Service tell us that GEDmatch does have your raw DNA results on servers. And the GEDmatch tools do access that data online. BUT there is no function on GEDmatch that lets you see anyone's raw DNA results -- even your own. GEDmatch only displays summarized information: start and stop locations and counts of cMs and SNPs of matching regions. It never displays raw DNA results. This means hacking would have to come from within the company (Verogen). And there is no account that anyone's raw DNA results have ever been stolen from GEDmatch.

    There simply is nothing to be stolen or hacked on the GEDmatch website itself. The only things that are accessible on the website are the same things that anyone with a GEDmatch account can access. The only things that a hacker can hack are what everyone can already see -- summarized information and not raw DNA results.
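
To make the difference between raw DNA results and summarized match information concrete, here is a minimal sketch in Python. It is only an illustration: the class name, the field names and the numbers are all invented for this example, and this is not GEDmatch's actual data format or code. The point is that a match record of this kind carries only where a shared segment starts and stops and how large it is, with no genotype values in it at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchSegment:
    """One shared segment between two kits (hypothetical layout)."""
    chromosome: int
    start: int           # start position of the matching region
    end: int             # stop position of the matching region
    centimorgans: float  # size of the matching region in cM
    snps: int            # count of SNPs in the matching region


def summarize(segments):
    """Reduce a list of shared segments to the totals a match report shows."""
    total = sum(s.centimorgans for s in segments)
    largest = max((s.centimorgans for s in segments), default=0.0)
    return {
        "segments": len(segments),
        "total_cm": round(total, 1),
        "largest_cm": largest,
    }


# Invented example values, for illustration only.
example_segments = [
    MatchSegment(3, 36_000_000, 52_000_000, 14.2, 2100),
    MatchSegment(11, 5_000_000, 19_000_000, 21.5, 3300),
]
summary = summarize(example_segments)
print(summary)
```

Nothing in such a record would let a reader reconstruct the individual letters of anyone's raw DNA results.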

  • Some of these are people who have seen past problems with GEDmatch but fail to see that those problems not only no longer exist but have led to long-term solutions that allay the fears they had from the past problem.
    GEDmatch has had two major problems, both of which led to significant changes that assure those problems will not happen again.
    1. Identification of potential crime perpetrators
      The solving of the Golden State Killer cold case used DNA found at crime scenes to compare to people in GEDmatch and build a family tree to narrow down whose DNA that was. This was a problem because GEDmatch had no way of allowing users to opt out of allowing law enforcement to see their kits. It also led to some tragic errors resulting from law enforcement personnel who did not understand the process and gave out information that later had to be retracted but had by then ruined some people's lives. While errors by law enforcement personnel are out of the control of GEDmatch, GEDmatch now allows all users to opt in or out of visibility to law enforcement perpetrator searches. (See more on this below in the Verogen issue.) In addition, the Department of Justice and law enforcement agencies have now established standards for the use of DNA in the process. (It is important to note that NO ONE will ever be convicted based on the GEDmatch information. It is a clue that allows law enforcement to gather new DNA known to be from the suspect and then compare it to the original crime scene DNA.) And GEDmatch also now has a new standard for how law enforcement can use the database to view kits who opted to be visible to them in perpetrator searches. This was a problem once, but it is not a problem any more.
    2. Security hacks in July 2020 (This section updated May 2022.)
      Hackers were able to opt everyone in for law enforcement visibility. They also made visible kits that were designated as private research kits, not meant to be publicly visible. GEDmatch caught this problem and shut the entire site down for several days until they could build in security to prevent a repeat of the attack and could roll back the visibility changes made by the hacker. No one's raw DNA result set was stolen by the hackers. (That did not stop the news media from running headlines such as "A Security Breach Exposed More Than One Million DNA Profiles On A Major Genealogy Database".) The result is that there has been no further hacking of GEDmatch. This was a problem once, but it is not a problem any more. Anyone pointing you to an article about the situation during the 2020 hack is pointing you to obsolete information, and any claim or insinuation that this obsolete information is accurate for the current situation is simply false.

  • UC Davis researchers in 2019 found ways to identify most of the genotype of a target kit at sites that allow uploads. (New section added May 2022 - Thanks to Leah Larkin for highlighting this very well.)
    Leah Larkin wrote an excellent 22 Oct 2019 overview of this work. The full paper "Attacks on genetic privacy via uploads to genealogical databases" by UC Davis Center for Population Biology's Michael D. Edge and Graham Coop is available in the 19 Oct 2019 pre-print at bioRxiv and also in the 20 Jan 2020 published version at eLife.

    Essentially, the authors used three very clever ways (which they call IBS Tiling, IBS Probing and IBS Baiting) of uploading real and fabricated DNA kits to the websites that allow uploads (GEDmatch, Family Tree DNA, MyHeritage) in sufficient quantities to identify up to 98% of the base pairs (the raw DNA results) for a targeted kit. They notified those companies 90 days in advance of making the pre-print public so that the companies could take steps to address the 10 ways that the researchers found for preventing such abuse.

    GEDmatch may have addressed some or all of the researchers' 10 recommendations but not publicized it. What we do know is that in late 2019 (about 6 months after receiving advance notice of the UC Davis paper), GEDmatch's original founders, who were basically techies with little legal expertise, sought to put GEDmatch on solid legal footing for the future by selling it to Verogen, who could provide the legal umbrella and also deal with the legal issues of access to GEDmatch by law enforcement or "adversaries" (the UC Davis term). Within the first few months, Verogen implemented two policies that clearly addressed some of the issues that enabled the adversarial methods of the UC Davis researchers and also the law enforcement access issue. Every kit on GEDmatch had to give opt-in confirmation in early 2020. While an adversary could opt-in just like everyone else, this put a defined base under every kit. Verogen also enabled opt-in/opt-out control for every kit to be visible to law enforcement or not for potential crime perpetrator identification. (A separate issue is identification of deceased John and Jane Does, which GEDmatch does allow by comparing their DNA with that of anyone in the database.) So, there have been some visible steps taken by Verogen since the UC Davis researchers did their attacks.

    The question nearly 3 years later is whether the websites (GEDmatch in particular) have taken steps that would prevent this now: is the current situation vulnerable to such attacks? It would certainly be valuable research for these same researchers to try again with these methods and with others that they may have devised over those 3 years. (I do not know whether they are doing this or not.)

  • Some claim the 2019 sale of GEDmatch to Verogen is reason to abandon GEDmatch.
    GEDmatch began as the creation of two very talented computer-savvy genetic genealogists who operated it for years. But when use of GEDmatch in the Golden State Killer case raised the issue of use of the database by law enforcement, these two had no background in legal issues. The primary purpose of the sale of GEDmatch to Verogen was, as Kitty Cooper said a few months later at i4gg 2020, to provide a legal umbrella for GEDmatch to protect the database and assure that GEDmatch users were in control of whether or not their data was visible to law enforcement.

  • Some claim GEDmatch is of limited use because testing companies have far more than GEDmatch's 2 million kits.
    This really betrays a very narrow concept of what genetic genealogy is. It is not simply to find some matching relative who can reveal to you all the secrets you do not know. Quality of matches far outweighs quantity. When is a match really a match who wants to work with you? Ancestry has millions of people who tested for a colored ethnic origins map -- because that is what Ancestry advertised. Most of these people have no interest in pursuing genealogical connections or doing the work of creating a robust family tree. And most of them will not reply to you when you tell them you and they match. The quality of matches on GEDmatch is vastly higher than all those millions of colored map customers because the GEDmatch kit members have shown that they really want to gain the most bang for the bucks they spent on their DNA test and are willing to take steps to do some work on their own. GEDmatch members are far more likely to respond to you about a match than are the millions of colored map customers on testing company websites.

  • Some claim we no longer need GEDmatch since some testing companies now accept DNA uploads from other sites -- completely failing to take into account that GEDmatch has more powerful analytical tools -- by far -- than do any of the testing companies.
    Diahan Southard has staked out this position. She is very gifted at explaining complicated DNA issues in very understandable terms. And I have a great deal of respect for her within her area of experience. But she is plain and simply wrong about GEDmatch.
    • She claims that what the GEDmatch "fancy tools" can do is possible on testing company websites or that you just do not need what those tools can do. But she admitted in her 4 Dec 2020 Legacy webinar that she really does not use GEDmatch, with two minor exceptions. So, she has no real deep experience of what it is that she is telling people they no longer need.
      The reality is that there is no other website that comes anywhere close to providing the analytical power to do deep genetic genealogy that GEDmatch has. GEDmatch does allow you to compare your DNA results to other kits, no matter what company did those tests, and that is a very powerful feature. But GEDmatch tools provide vastly more analytical power than any of the testing companies. There simply is no viable comparison of what GEDmatch can do with any other web site, testing company or third party. (This is not to say that other websites, such as DNAPainter, do not have powerful tools. But most of what GEDmatch has does not exist anywhere else.)
    • In her 4 Dec 2020 Legacy Webinar, she casts the whole of genetic genealogy as simply transferring your DNA to find "the one" who knows something about your family history that you don't know about your family history: "That's it. That's the whole goal of transferring really from a genealogical perspective." And quantity of matches is key to her perspective. But she completely ignores the power of projects with GEDmatch multiple kits - the best of both quality and quantity.
      The reality is that there is information in DNA that no "the one" match knows. Genetic genealogy is not about finding "the one" but about finding what the DNA really has to tell us. When you have identified a specific region of chromosome 3 as the DNA of a specific ancestor, you know that anyone who exactly matches you on that region has a high probability of descending from the same ancestor. You can search the GEDmatch database for people who match you on that specific region -- something no testing company provides. And you can do far more, especially when you bring together the kits of many dozens of descendants of the same ancestral couple in a GEDmatch tag group using the Multiple Kit Analysis tool. NONE of this can be done at any testing company.
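
The idea of searching for people who match you on one specific region can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not GEDmatch's code or any GEDmatch interface, and the kit labels, positions, field names and the 80% coverage threshold are all hypothetical choices made for the example. It takes a list of shared segments and keeps the kits whose segment on the chosen chromosome covers most of the region of interest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedSegment:
    """A segment one kit shares with you (hypothetical layout)."""
    kit: str
    chromosome: int
    start: int
    end: int


def overlap_length(a_start, a_end, b_start, b_end):
    """Length of the overlap of two intervals, or 0 if they do not overlap."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def kits_matching_region(segments, chromosome, start, end, min_fraction=0.8):
    """Return kits whose shared segment covers at least min_fraction of the
    region [start, end) on the given chromosome.

    min_fraction is an assumed, adjustable threshold, not a standard value.
    """
    region_length = end - start
    hits = set()
    for seg in segments:
        if seg.chromosome != chromosome:
            continue
        covered = overlap_length(seg.start, seg.end, start, end)
        if covered / region_length >= min_fraction:
            hits.add(seg.kit)
    return sorted(hits)


# Invented example data: four kits and a region on chromosome 3.
example_segments = [
    SharedSegment("KIT_A", 3, 38_000_000, 52_000_000),  # covers whole region
    SharedSegment("KIT_B", 3, 47_000_000, 60_000_000),  # covers only 30%
    SharedSegment("KIT_C", 7, 40_000_000, 50_000_000),  # wrong chromosome
    SharedSegment("KIT_D", 3, 41_000_000, 50_000_000),  # covers 90%
]
region_hits = kits_matching_region(example_segments, 3, 40_000_000, 50_000_000)
print(region_hits)
```

A kit that turns up in such a search is a lead to investigate, in keeping with the "high probability" wording above, and is not proof of descent from the same ancestor.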

  • Some of these are journalists whose primary job is to sell their services, even if it means resorting to fear mongering.
    If a reporter is given the assignment to write an article about a child falling off their bike, the reporter will treat that story as the most important thing that happened in the world that day. And, for some reporters, if they can find some element that stirs up fear or anger, that element is known to sell newspapers, gain clicks on online sites and otherwise make sure that the reporter has readers and followers, regardless of the real facts.
    There is truthful, responsible reporting about GEDmatch. For example, here is an article about the July 2020 hack that does not pander to fear mongering but gives a full and clear picture of the reality of what happened -- and what did NOT happen.

There really is only one supportable concern that now exists that can lead some people to opt out of GEDmatch. GEDmatch is open to use for identifying unknown remains, usually murder victims, so if this is an overwhelming issue for you, then GEDmatch is not for you. But that really is the only supportable reason to not use GEDmatch.


Send E-mail to wwjohnston01@yahoo.com
Copyright © 2022 by Wesley Johnston
All rights reserved


Last updated May 15, 2022 - add update notice; update several sections to reflect current situation