Stanford University School of Medicine

New “genome cloaking” technique blocks private data, just as moon blocks sun during eclipse

Like many others this morning, I was boggled by what I could see of the solar eclipse. (And yes, I did use my eclipse glasses! I hope you did too.) Slowly the moon crept across the face of the sun, cloaking its brightness. While I was not in the path of totality, about 87 percent of the sun was covered at the eclipse's maximum. The effect was stunning. But still, a sliver of the sun shone out.

The experience of eclipse watching touches on an elegant solution developed by developmental biologist and computer scientist Gill Bejerano, PhD, who joined forces with Stanford cryptologist and computer scientist Dan Boneh, PhD. They were tackling a persistent problem in genome science: how to maintain patient privacy while also accumulating and analyzing the multitudes of DNA sequences necessary to determine which genetic mutations, or variants, are associated with the development of particular diseases. This "share vs. protect" conundrum has dogged the field for some time. Ideally, scientists would like to be able to shine a sliver of light on just those mutations that matter for the study at hand, while keeping the remaining DNA sequences hidden.

Just as the moon blocked out most of the sun during the eclipse and watchers like me could see only its edges, these researchers developed a method to mask most of the genome while looking at only a small bit of it.

The researchers, including graduate students Karthik Jagadeesh and David Wu, described their results in Science. Our release explains how they tested their method:

Using the technique, the researchers were able to identify the responsible gene mutations in groups of patients with four rare diseases; pinpoint the likely culprit of a genetic disease in a baby by comparing his DNA with that of his parents; and determine which out of hundreds of patients at two individual medical centers with similar symptoms also shared gene mutations. They did this all while keeping 97 percent or more of the participants’ unique genetic information completely hidden from anyone other than the individuals themselves.

Why should we care? Even though the Genetic Information Nondiscrimination Act, signed in 2008, prohibits discrimination in health insurance or employment based on an individual's genome sequence, there are still plenty of other instances (think of applying for a loan or life insurance, for example) in which it would be legal to discriminate against someone known to have a disease-associated gene mutation. Or maybe you just don't want your future in-laws to know you carry a gene variant associated with hereditary hearing loss, or your wealthy great-aunt Sally to guess that the only reason you can tolerate her famous broccoli hotdish is that you lack the gene variant that responds strongly to bitter, unpleasant tastes. Or maybe it's just nobody else's business.

Jagadeesh and Wu used a cryptographic technique known as Yao's protocol to tackle the problem of needing to both know and not know a study participant's genetic information. As I described in the release:

A key component of the technique is the involvement of the individual whose genome is to be studied. In particular, each individual encrypts their genome (with the help of a simple algorithm on their own computer or smart phone) into a linear series of values describing the presence or absence of the gene variants under study, without revealing any other information about their genetic sequence. The encrypted information is uploaded into the cloud and the researchers then use a secure, multi-party computation (a cryptographic technique that ensures the input data remain private) to conduct the analysis and reveal only those gene variants likely to be pertinent to the investigation.
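For readers who like to see ideas in code, here is a toy sketch of the two steps described above: turning a genome into a series of present/absent values for only the variants under study, and then computing on those values without any single party seeing them. To be clear, this is not the researchers' implementation and it is not Yao's protocol. It swaps in a much simpler building block, additive secret sharing, purely to illustrate how a group-level answer can be computed while individual inputs stay hidden. The variant names, the modulus and every function name are made up for the example.

```python
import secrets

# Assumption: an arbitrary large modulus for this toy sharing scheme.
MODULUS = 2**61 - 1


def encode_variants(patient_variants, study_variants):
    """Return a 0/1 list: 1 where the patient carries a variant under study.

    Variants the patient carries that are not in study_variants never
    appear in the output.
    """
    return [1 if v in patient_variants else 0 for v in study_variants]


def split_into_shares(vector, n_parties=2):
    """Split each value into n_parties random-looking shares.

    The shares of a value add up to that value modulo MODULUS.
    """
    random_shares = [
        [secrets.randbelow(MODULUS) for _ in vector]
        for _ in range(n_parties - 1)
    ]
    last_share = [
        (value - sum(column)) % MODULUS
        for value, column in zip(vector, zip(*random_shares))
    ]
    return random_shares + [last_share]


def add_vectors(vectors):
    """Add equal-length vectors entry by entry, modulo MODULUS."""
    return [sum(column) % MODULUS for column in zip(*vectors)]


def count_carriers(patients, study_variants, n_parties=2):
    """Count carriers of each study variant across all patients."""
    held_by_party = [[] for _ in range(n_parties)]
    for patient_variants in patients:
        encoded = encode_variants(patient_variants, study_variants)
        shares = split_into_shares(encoded, n_parties)
        for party, share in enumerate(shares):
            held_by_party[party].append(share)  # each party sees one share only
    # Each party adds up the shares it holds, then the totals are combined.
    party_totals = [add_vectors(held) for held in held_by_party]
    return add_vectors(party_totals)


# Hypothetical variant labels, for illustration only.
STUDY_VARIANTS = ["GENE_A:var1", "GENE_B:var2", "GENE_C:var3"]
PATIENTS = [
    {"GENE_A:var1", "GENE_C:var3", "GENE_Z:other"},
    {"GENE_A:var1"},
    {"GENE_A:var1", "GENE_B:var2"},
]

if __name__ == "__main__":
    print(count_carriers(PATIENTS, STUDY_VARIANTS))  # [3, 1, 1]
```

In this sketch each party holds only random-looking numbers, and only the combined totals reveal anything: how many participants carry each variant under study. The made-up "GENE_Z:other" variant is not on the study list, so it never enters the computation at all.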

The researchers hope that the privacy afforded by the technique will encourage more people to participate in the large-scale studies necessary to identify more disease-associated genes or to develop new diagnostic procedures.

As Bejerano said, "We now have the tools in hand to make certain that genomic discrimination doesn’t happen. There are ways to simultaneously share and protect this information. Now we can perform powerful genetic analyses while also completely protecting our participants’ privacy."

Previously: Locking the door on big-data risks to privacy, Genome testing for children: What parents should consider and Stanford patient on having her genome sequenced: "This is the right thing to do for our family"
Photo by Lucky Lynda
