A trove of genetic records from the UK Biobank was unleashed last July, and after mining the immense data set, scientists at Stanford have found strong evidence for 27 direct links between specific genetic mutations and a variety of human diseases.
Some of these mutations have never been associated with disease before, and some even seem to confer protection against disease, making them prime drug candidates.
The team's work, published in Nature Communications, focused on a particular type of genetic oddity that halts protein formation before it's complete. It's called a protein-truncating variant, or PTV. Manuel Rivas, PhD, a biomedical data scientist at Stanford, is the corresponding author.
Any time a gene is mutated into a shorter, incomplete version of itself, as is the case with PTVs, it has the potential to cause harm.
"The scientific community has appreciated that if you inhibit the function of a gene, it very well may end up playing a role in human disease," Rivas told me. "But this is among the first studies that really go deep into the precise characterization of how 'knockout' PTVs [which cause the gene to lose all function] contribute to diseases."
Drawing on more than 500,000 individual health records, the group used statistics-based algorithms to pick out which mutated genes had strong links to diseases such as cancer, asthma, Crohn's disease and heart disease, among others. Altogether, they found 27 different associations between 17 genes and 20 different diseases.
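The article doesn't spell out which statistical methods the team used, so the following is only an illustrative sketch of the general idea: testing whether carriers of a variant have a disease more or less often than non-carriers. It uses an odds ratio and a two-sided Fisher's exact test as a stand-in technique, not the study's actual pipeline. The function names and all counts are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: carriers with the disease      b: carriers without it
    c: non-carriers with the disease  d: non-carriers without it
    """
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the same 2x2 table."""
    carriers = a + b
    non_carriers = c + d
    cases = a + c
    total = carriers + non_carriers
    denom = comb(total, cases)

    def prob(x):
        # Hypergeometric probability that exactly x carriers are cases,
        # with the table's row and column totals held fixed.
        return comb(carriers, x) * comb(non_carriers, cases - x) / denom

    p_observed = prob(a)
    lowest = max(0, cases - non_carriers)
    highest = min(carriers, cases)
    # Sum the probabilities of every table at least as unlikely as the
    # observed one (small tolerance guards against float rounding).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts, invented purely for illustration:
# 100 carriers (3 with asthma), 1,000 non-carriers (120 with asthma).
example_or = odds_ratio(3, 97, 120, 880)
example_p = fisher_exact_two_sided(3, 97, 120, 880)
```

An odds ratio below 1 would point toward a protective association and one above 1 toward increased risk. A real analysis of this scale would also have to correct for testing many genes and diseases at once, and an association on its own does not establish causation.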
The study provides the most definitive evidence to date of the potency of PTVs in disease, and of the power that big datasets offer in the hunt for correlations and, eventually, causation.
What's more, Rivas saw that PTVs aren't all bad: they can protect against common ailments, too. In particular, Rivas pointed to mutations in two different genes (IL33 and GSDMB) that seemed to fend off asthma. In another case, one mutation in the IFIH1 gene protected against hypothyroidism.
Next steps, Rivas says, are to pursue those PTVs that seem promising for therapeutics. Perhaps there are ways to mimic the effect of the IL33 and GSDMB mutations that lend protection against asthma, for example.
While the UK Biobank was critical to the discoveries made in this work, Rivas says that the amalgamation of data is not without its own drawbacks.
"If you wanted to study diseases like schizophrenia, which is on the rarer end of disease prevalence, it would take tens of millions if not hundreds of millions of individuals to get to the point where you're actually able to say something statistically relevant about the disease," he says.
In addition, because the data was collected only in the U.K., the extent to which the findings are broadly applicable may be limited. "That's why it's important to have more than one of these large-scale biobanks," says Rivas. "It's critical to our ability to understand how disease is realized in different countries."
Photo by StanfordMedicineStaff