It’s spring and privacy concerns are in the air. Between the recent revelations that Facebook let Cambridge Analytica capture data from 87 million of its users to be improperly used to influence the US presidential election, and news that California investigators cracked the long-cold case of the Golden State Killer by running a genetic profile collected from crime scene DNA through a public genealogy website, people are feeling a bit…spooked.
So it’s kind of a weird time to be asking a million people to voluntarily hand over decades of health records, along with dozens of test tubes filled with blood and urine (which, of course, contain DNA). But that’s exactly what the National Institutes of Health is doing today.
After more than three years of planning and piloting, the federal research organization is finally rolling out the massive precision health initiative President Obama first announced in 2015. Now renamed All of Us, the ambitious project aims to compile detailed health data from a representative sample of one million Americans so scientists can better understand the mechanisms of disease and move more quickly toward personalized treatments. Starting May 6, anyone over the age of 18 living in the US can enroll in this Grandest of Experiments and donate their data to the greater good.
So far, 45,000 people have already started the process. In May of 2017 All of Us began a beta phase, bringing its recruitment sites online one by one and making sure the systems were running smoothly. It’s got a lot of data to sync up—electronic health records, surveys about participants’ behaviors and environments, and eventually genetic reports and information from wearable fitness devices.
Building out the infrastructure necessary to collect so much data on such a huge cohort has taken time and some serious cash. Last year alone, the All of Us budget was $230 million. For the full project, which will run for a decade, Congress has authorized a whopping $1.455 billion. In addition to the 298 enrollment sites NIH hopes to launch by the end of this year (120 are online so far), that money will go toward a national biobank, run by the Mayo Clinic, where 35 blood and urine samples from each participant will one day be stored. To prepare for the national launch, Mayo doubled the size of its 35,000-square-foot facility in Minnesota and expanded a smaller bank in Florida, as a backup site to protect samples from any localized natural disasters.
Those samples contain the DNA that researchers will sequence, and in a rare first for a research project of this magnitude, they will also return the results to participants. But none of this will happen right away. The first sequencing will begin later this year, starting with a small, 20,000-person pilot. Before everyone else can get the same treatment, someone has got to build a lot more sequencing machines. “There’s not enough capacity in the US to even begin to do a million people,” says Eric Dishman, director of All of Us. In addition to genotyping (the technique companies like 23andMe use to create their limited health reports), All of Us will also be doing whole genome sequencing, which requires much more machinery. “It’d be like saying, ‘Hey, let’s all take a high-speed train trip across the US.’ There just aren’t enough of them right now to do that.”
To handle the digital architecture, Dishman built a team made up of folks from Vanderbilt, the Broad Institute, and Verily—Alphabet’s life science subsidiary. They’re creating a data and research support center to collect, curate, and store participants’ information in a secure cloud environment. They’re also building analytical tools to help researchers comb through the data, looking for connections that could lead to new discoveries.
Anyone will be able to access information from the project, but the levels of access will be tiered. General queries about the overall demographics of the All of Us cohort—data with a low risk of reidentification—will be open to the public. More sensitive data will be under tighter controls—some will be available to citizen scientists partnering with research organizations, and some will be only available to researchers with training in human subject studies who’ve taken an oath not to match up deidentified health data with names, addresses, and social security numbers.
That research portal will launch sometime in the first half of 2019. Researchers can expect to have access to most of the common computational tools already in wide use by the biomedical field, but they won’t be able to download any data to their own systems.
That safeguard will be combined with end-to-end encryption and certificates of confidentiality to keep people’s information safe. Those certificates prohibit the NIH and its partner organizations from sharing any data with law enforcement or any other federal, state, or local government agencies. One last protection: If that information were somehow obtained illegally, through a hack or some other breach, it would be inadmissible in court. Congress had All of Us in mind when it wrote those protections into law in 2016 with the 21st Century Cures Act, but the same safeguards extend to all federally funded human health research.
After news of the Golden State Killer broke, Dishman sent a note to All of Us partners tasked with doing outreach for enrollment, reminding them of the security measures the project is implementing. Of course, there are other ways personal health data could be abused if it were accessed by the wrong parties. A stolen medical identity can be used to falsify insurance claims, fraudulently acquire Medicare or Medicaid, or obtain prescriptions for opioids. While there isn’t currently much of a black market for genetic data, emerging technologies may create one in the future. And unlike a credit card number, you can’t change your DNA.
But the former Intel exec is clear-eyed about the risks he’s asking people to take. “Compared to most studies we are giving people a lot more information during the consent process about what they’re signing up for,” he says. “It might scare some people off, but it’s the right thing to do. That’s one of the ways you build trust. If we lose that trust then we lose the viability of the program.”
And it’s important to keep the potential benefits in perspective. Every day you give away data to Google and Facebook and Apple, and all you get are more ads trying to sell you stuff you don’t need. All of Us is offering something you might actually want: a future where, when you get sick, treatments will be available that are tailored to work just for you, but where hackers and identity thieves don’t have access to that priceless information at all.