Big data strikes again — subdividing tumor types to predict patient outcome, personalized treatment

A Stanford team has developed an algorithm that uses data about tumors to identify new classifications that can provide information about patient outcomes

Krista Conger | October 31, 2018 | February 9, 2024

One of the first things I learned as a graduate student in cancer biology is that the idea of "cancer" as a single entity is a vast oversimplification of the disease.

It's kind of like referring to all four-wheeled motorized vehicles as cars, when in reality there are a nearly infinite number of shapes, sizes, colors, makes and models of that type of transport, from garbage trucks to golf carts. Some go fast, some pull large loads and some can lug massive amounts of sports equipment and small children to soccer field after soccer field. Each has a specific purpose.

When I was in school there were just a few ways to tell different types of cancers apart from one another or to distinguish the defining characteristics of individual tumors. What organ did it affect? How did it look under the microscope? What types of mutations did it harbor?

Although it doesn't feel like that long ago, it was practically the Dark Ages when one considers the volume of data and information that can now be gleaned about each patient's disease. But integrating that data into useful treatment plans can still be challenging.

Now, pathologists and postdoctoral scholars Avantika Lal, PhD, and Daniele Ramazzotti, PhD, along with Arend Sidow, PhD, professor of pathology and of genetics, have devised a computer algorithm that brings together several types of genetic data about a specific cancer, such as lung or breast cancer, and divides that cancer into many smaller groups that share common biological traits. This subdivision can yield important information about tumor behavior and patient survival, they found. (To extend the car analogy, I probably have more in common with other drivers of 2004 dark green Honda Pilots than I do with people who drive newer red Ferraris.)

Lal and Ramazzotti named their algorithm "Cancer Integration via Multikernel Learning," or CIMLR. They published their results last week in Nature Communications.

As Lal explained in an email to me:

"CIMLR allows us to classify individual cancers based on the underlying genetic changes in the tumor. Using CIMLR, we identified genomic subtypes for 36 types of cancer.

"Tumors belonging to the different subtypes have very different underlying biology and so behave differently. By classifying patients into these subtypes, we can predict their chance of survival and their likelihood of relapse after treatment."

CIMLR works by integrating many different types of data about a tumor, including changes in gene expression, genetic mutations, variations in gene copy number (a change that often occurs in cancers when stretches of DNA are repeated or deleted during cell division) and promoter methylation (a chemical tag that can trigger the expression or repression of downstream genes). Until now it's been difficult to combine these data and consider their effect on cancer growth as a whole.
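For readers who think in code, here is a rough sketch of the general idea of combining several data types into one measure of how alike two tumors are. It is not the researchers' implementation: CIMLR learns how much weight each data type deserves, while this toy version simply averages them and then groups patients whose combined similarity passes a cutoff. The patient values, the `sigma` settings, the `threshold` and all function names are made up for illustration.

```python
import math

def gaussian_kernel(rows, sigma):
    """Pairwise similarity exp(-d^2 / (2 * sigma^2)) between patients' feature rows."""
    n = len(rows)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
            kernel[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return kernel

def fuse_kernels(kernels):
    """Average several n x n kernels with equal weights (an assumption; CIMLR learns weights)."""
    n = len(kernels[0])
    return [[sum(k[i][j] for k in kernels) / len(kernels) for j in range(n)]
            for i in range(n)]

def group_by_similarity(fused, threshold):
    """Label connected groups of patients linked by fused similarity above the threshold."""
    n = len(fused)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and fused[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Hypothetical toy data: six patients, two data types measured on each.
expression = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3],
              [2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]
methylation = [[0.9], [1.0], [0.8], [0.1], [0.0], [0.2]]

kernels = [gaussian_kernel(expression, sigma=1.0),
           gaussian_kernel(methylation, sigma=0.5)]
fused = fuse_kernels(kernels)
subtypes = group_by_similarity(fused, threshold=0.5)
print(subtypes)  # [0, 0, 0, 1, 1, 1]
```

In this toy example the first three patients land in one group and the last three in another, because they resemble each other in both data types at once. Real tumor data are far messier, which is why learning the weights, as CIMLR does, matters.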

The researchers applied CIMLR to data from more than 6,000 people with 36 types of cancer, including liver, lung, kidney and breast cancers and melanoma. In 27 of the 36 cancer types, they found significant differences in patient survival among the various subcategories discovered by CIMLR.

Identifying patterns associated with poor patient outcomes could help clinicians tailor treatments to individual patients, the researchers said.

As Ramazzotti explained:

"By studying the biological characteristics that are defining different subtypes, we could potentially develop personalized treatments to improve the care of patients."

Lal and Ramazzotti found that CIMLR outperforms other currently available tools in its speed, accuracy, and ability to predict patient survival. They also point out that this type of "multi-omic" approach could be useful in other situations.

As Lal said:

"We are applying CIMLR to other cancer datasets, and making it easier to classify new patients into the subtypes we discover.

"But we also want to make CIMLR accessible for scientists working on other diseases. The approach that we applied to cancer can be used to identify subtypes of other complex diseases. For example, a European group has already used CIMLR to identify subtypes of Alzheimer's disease with different brain features and there is interest in using it to find subtypes of human gut microbiomes. We are excited to see our method advance other medical fields."

Photo by AOMSIN