Mining medical discoveries from a mountain of ones and zeroes

Kilo, mega, giga, tera, peta, exa, zetta, yotta - that's a lotta. Each term in that list adds three more zeroes to the one before it, and, next thing you know, you've got a really big number.

Stanford systems-medicine chief Atul Butte, MD, PhD, an intrepid data miner, says the human species is putting out a couple of zettabytes of data annually, with the rate of output doubling each year.

After the Human Genome Project spread its wings in 1990, the biotechnology industry followed suit by serving up some fast-forward fireworks of its own. As a result, there's been an explosion in the generation of medical data piling up in free-access, public databases.

Tons of data. A mountain of data. Just sitting there, waiting to be analyzed by somebody.

Butte is that somebody. He is, as I wrote in my just-published Stanford Medicine article, "King of the Mountain":

. . . a walking window through which to watch the data revolution in bioscience unfold. A proven master at mining this medical data, he’s now throwing his considerable energy into persuading other scientists to try his approach. It’s the fastest, least costly, most effective path to improving people’s health that he knows.

From creating a medical-molecule matching algorithm (via which diseases are paired with drugs based on their opposing effects on gene expression) to fingering an immensely guilty-looking suspect in the hunt for genes that cause type-2 diabetes to out-and-out outsourcing his mouse experiments, Butte has been a pioneer in a new world of data-driven medical research.

But the trail-blazing, covered-wagon stage of bioscience's data revolution won't last very long. As patient data is increasingly captured - on a federally assisted national level - in increasingly interoperable, searchable electronic databases, blink, and when you open your eyes you'll see a ten-lane medical-information highway, paved with ones and zeroes.

Previously: Cheap Data! Stanford scientists' "opposites attract" algorithm plunders public databases, scores surprising drug-disease hook-ups and Newly identified type-2 diabetes gene's odds of being a false finding equal one in 1 followed by 19 zeroes
Photo by Sean MacEntee
