Today, diagnosing rare genetic diseases requires a slow process of educated guesswork. Stanford computer scientist and genomicist Gill Bejerano, PhD, is working to speed it up.
In a paper published recently in Genetics in Medicine, Bejerano and colleagues describe an algorithm they’ve developed that automates the most labor-intensive part of genetic diagnosis: matching a patient’s genetic sequence and symptoms to a disease described in the scientific literature. Without computer help, this matching process takes 20 to 40 hours per patient. The expert looks at a list of around 100 of the patient’s suspicious-looking mutations, makes an educated guess about which one might cause disease, checks the scientific literature, then moves on to the next one.
The algorithm developed by Bejerano’s team cuts the time needed by 90 percent.
“Clinicians’ time is expensive; computer time is cheap,” said Bejerano, who worked with pediatric geneticist Jon Bernstein, MD, PhD, and other experts in computer science and pediatrics to develop the new technique. “If I’m a busy clinician, before I even open a patient’s case, the computer needs to have done all it can to make my life easier.”
The algorithm’s name, Phrank, a mashup of “phenotype” and “rank,” gives a hint of how it works: Phrank compares a patient’s symptoms and gene data to a medical-literature knowledge base, generating a ranked list of which rare genetic diseases are most likely to be responsible for the symptoms. This gives the clinician a logical starting point for a diagnosis, which can be confirmed with one to four hours of effort per case instead of 20 to 40.
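To make the idea of phenotype-based ranking concrete, here is a minimal sketch in Python. It is not the scoring method from the paper: the knowledge base, the gene and disease names, and the rarity-based weighting are all assumptions invented for illustration. It only shows the general shape of the task, which is to score each candidate disease by how well its known symptoms match the patient’s, then sort.

```python
import math

# Hypothetical toy knowledge base: disease -> (causal gene, phenotype terms).
# All names here are made up for illustration.
KNOWLEDGE_BASE = {
    "Disease A": ("GENE1", {"seizures", "short stature", "hearing loss"}),
    "Disease B": ("GENE2", {"seizures", "cleft palate"}),
    "Disease C": ("GENE3", {"short stature", "heart defect"}),
}


def term_weights(kb):
    """Weight each phenotype by rarity: -log2(fraction of diseases showing it)."""
    counts = {}
    for _gene, terms in kb.values():
        for term in terms:
            counts[term] = counts.get(term, 0) + 1
    total = len(kb)
    return {term: -math.log2(count / total) for term, count in counts.items()}


def rank_diseases(patient_terms, candidate_genes, kb=KNOWLEDGE_BASE):
    """Return (disease, gene, score) tuples, best match first.

    Only diseases whose gene carries one of the patient's candidate
    mutations are considered (an assumed simplification).
    """
    weights = term_weights(kb)
    scored = []
    for disease, (gene, terms) in kb.items():
        if gene not in candidate_genes:
            continue
        # Sum the rarity weights of the symptoms shared with the patient.
        score = sum(weights[t] for t in patient_terms & terms)
        scored.append((disease, gene, score))
    return sorted(scored, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    ranking = rank_diseases({"seizures", "hearing loss"}, {"GENE1", "GENE2"})
    for disease, gene, score in ranking:
        print(f"{disease} ({gene}): {score:.2f}")
```

In this toy setup, a symptom shared by many diseases counts for less than a symptom seen in only one, so a rare, specific match pushes a disease toward the top of the list.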
The mathematical workings of Phrank aren’t tied to a specific database, a first for this type of algorithm. This makes it much more flexible to use.
Phrank also dramatically outperforms earlier algorithms that have tried to do the same thing, according to the paper. Bejerano’s team validated Phrank on medical and genetic data from 169 real patients, an important advance over earlier studies in the field. Prior studies had tested algorithms on made-up patients instead because real-patient data for this research is hard to come by.
“The problem is that this test [using synthetic patients] is just too easy,” Bejerano said. “Real patients don’t look exactly like a textbook description.” On data from real patients, one older algorithm ranked the patient’s true diagnosis 33rd, on average, on the list of potential diagnoses it generated; Phrank, on average, ranked the true diagnosis 4th.
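The “average rank of the true diagnosis” used in this comparison is simple to compute. The sketch below assumes each test case is a pair made up of an algorithm’s ranked list and the known correct diagnosis. The function names and the sample data are hypothetical and are not taken from the study.

```python
from statistics import mean


def rank_of_truth(ranked_diagnoses, true_diagnosis):
    """Return the 1-based position of the true diagnosis in a ranked list."""
    return ranked_diagnoses.index(true_diagnosis) + 1


def mean_rank(cases):
    """Average rank of the true diagnosis over (ranked_list, truth) pairs."""
    return mean(rank_of_truth(ranked, truth) for ranked, truth in cases)


if __name__ == "__main__":
    # Made-up example cases, not data from the paper.
    cases = [
        (["A", "B", "C"], "A"),  # true diagnosis ranked 1st
        (["B", "C", "A"], "A"),  # ranked 3rd
        (["C", "A", "B"], "A"),  # ranked 2nd
    ]
    print(mean_rank(cases))
```

A lower average is better: it means the clinician reaches the correct diagnosis after checking fewer candidates on the list.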
Phrank also holds potential for helping doctors identify new genetic diseases, Bejerano told me. For example, if a patient can’t be matched to any known human disease, the algorithm could check for clues in a broader knowledge base. “You might get the result that mouse experiments cause phenotypes similar to your patient, that you may have found the first human patient that suffers from this disease,” Bejerano said.
Ultimately, he added, “nobody is going to replace a clinician making a diagnosis.” But new technology will help experts use their time more efficiently, helping many more patients get diagnosed.