Eve S. McCulloch
Genome sequencing coupled with medical and personal data holds enormous promise for unraveling the mysteries of the human body and advancing disease treatment. Increasingly, research projects are collecting data on large numbers of people to determine links among diseases, lifestyle, environment, and genes. The biobanks being created with these data raise questions about protecting the privacy of individuals whose DNA and medical records fuel research.
Repositories of human genetic material emerged more than a decade ago in Iceland with the company deCODE genetics. The United Kingdom has created a biobank with 500,000 enrolled volunteers. In the United States, researchers at Kaiser Permanente have revealed early findings based on a treasure trove of genetic and medical data collected from 100,000 Californians. This effort, establishing perhaps the largest biobank in the United States, has already shown new links between disease traits and genetic variants.
"One of the things that is becoming obvious is that genomes are far more variable from individual to individual than we thought even 5 years ago," said Joshua Meyer, a postdoctoral researcher at Oregon Health and Science University, where he works on chromosomal rearrangements with potential applications in cancer research. "Mapping those variations is really important if we are going to realize the medical dream of personal genomics."
The emergence of low-cost genome sequencing is opening doors across many fields of research, but that progress depends on access to data. "We need more and bigger databases to figure out how to help people, and yet, more and bigger databases mean greater threats to people's privacy," stated Meyer.
The Presidential Commission for the Study of Bioethical Issues sought to address this challenge with its October 2012 report, Privacy and Progress in Whole Genome Sequencing. The report called for formal policies to address ethical dilemmas raised by genome sequencing, particularly for policymakers to create privacy protections governing how genomic data can be collected, stored, and shared. Presently, these data are treated differently depending on who took the sample for sequencing and the state in which it was taken. In many places, they can be used without the patient's knowledge or consent.
"These are just a few discrepancies in public policy that can create confusion and uncertainty when it comes to understanding how to protect some of our most personal data," said Commission Chair Amy Gutmann. "Confusion and uncertainty tend to erode trust, and trust is the key to amassing the large number of genomic data sets needed to make powerful, life-saving discoveries."
A study published in January 2013 in the journal Science (Gymrek et al. 2013) highlighted additional complications. Using public data sets and online resources such as genealogy Web sites—and without violating present federal privacy regulations—Whitehead Institute geneticist Yaniv Erlich and his colleagues deduced the names of dozens of supposedly anonymous individuals who had contributed DNA for medical or scientific research. The report "calls into question whether the goal of [the] complete deidentification of many types of human data is realistic in today's information-rich society," stated an accompanying commentary (Rodriguez et al. 2013).
Privacy is a two-sided coin. In addition to issues regarding data access, there is also controversy surrounding incidental findings: unexpected but potentially important information discovered as a byproduct of research on some other topic. For example, genome sequencing may reveal that a donor faces a substantial risk of a serious but treatable health condition. The Presidential Commission for the Study of Bioethical Issues made no recommendations regarding the return of incidental findings, other than to advise that DNA contributors be informed that such findings are possible and whether they would be returned.
In contrast, a National Institutes of Health working group released explicit recommendations (Wolf et al. 2012). They called on biobanks to shoulder the responsibility for communicating pertinent findings to DNA donors, rather than leaving it to the researcher and associated institution that generated the result, as has traditionally been argued.
Some experts in the field, however, have called for flexibility. US biobanks—of which there are over 600—vary in a number of ways, including their size, their funding sources, and the types and sources of DNA samples held in them, according to a recent study (Henderson et al. 2013). "Given the diversity in biobank organizational characteristics identified in our survey, it's likely that management and governance policies will have to be tailored to fit the particular context," concluded Gail Henderson, lead author of the study and head of University of North Carolina's Center for Genomics and Society. "One-size policies will not fit all."
Resolving the issues associated with personal genetic data will not be easy. New regulations will likely be required, and the recommendations of the Presidential Commission for the Study of Bioethical Issues may provide a starting point for policymakers in the quest for genome privacy.
Gymrek M, McGuire AL, Golan D, Halperin E, Erlich Y. 2013. Identifying personal genomes by surname inference. Science 339: 321–324.
Henderson GE, Cadigan RJ, Edwards TP, Conlon I, Nelson AG, Evans JP, Davis AM, Zimmer C, Weiner BJ. 2013. Characterizing biobank organizations in the U.S.: Results from a national survey. Genome Medicine 5: 3.
Rodriguez LL, Brooks LD, Greenberg JH, Green ED. 2013. The complexities of genomic identifiability. Science 339: 275–276.
Wolf SM, et al. 2012. Managing incidental findings and research results in genomic research involving biobanks and archived data sets. Genetics in Medicine 14: 361–384.
BioScience 63: 333