For any tumor that's accessible by scalpel, physicians often take what's called a biopsy, snipping away a tiny portion of the mass.
From that scrap, doctors glean as much information about the tumor as they can through two distinct methods. Imaging captures the physical likeness of the tumor cells, which can help scientists understand disease severity, among other things. And through molecular analysis, scientists can analyze the tissue to find drivers of disease on the cellular level.
These two modes of investigation are usually conducted separately, but James Zou, PhD, an assistant professor of biomedical data science at Stanford, has laid the groundwork for doctors to learn a tumor's spatial and genomic properties simultaneously.
"We thought, 'How can we use machine learning to combine both of these, so that we make the most out of these biopsies?'" Zou told me.
"Insta-pathology"
He and his team, led by graduate student Bryan He, have created an algorithm for a concept he's dubbed "insta-pathology." In an image of an individual tumor, a computer recognizes and labels the likely genomic activity of groups of cells based on their appearance.
Zou likens this idea to an Instagram filter. But instead of overlaying a photo with a location tag or puppy dog features, the "filter" adds information about which genes are turned on -- and where -- in a biopsy image.
For instance, on an image with 1,000 cancer cells, the algorithm would annotate the photo based on the shapes, sizes and density of the cells, labeling regions to indicate which genes are likely causing specific physical characteristics of groups of cells. Cells with a smaller-than-normal nucleus, for one, are linked to activation of the gene AEBP1.
A paper describing the study was recently published in Nature Biomedical Engineering. Joakim Lundeberg, PhD, a pioneer of spatial genomics at the KTH Royal Institute of Technology, Stockholm, Sweden, co-led the research.
Teaching the algorithm
In developing the algorithm, Zou and his team used a dataset of breast cancer cells already labeled with genomic information to "teach" their new algorithm which genes correspond to specific morphologic features. After many training iterations, the algorithm learned that certain shapes or physical features correspond to a gene or sets of genes.
This information can be extremely helpful for clinicians as they seek to understand a patient's cancer, Zou said.
"The different morphologies and the architecture of cells, and how those actually are related with expression changes, is valuable for understanding how cancer cells interact with its surroundings, which can tell doctors how aggressive the cancer is, or help them decide what drugs to prescribe," he told me.
It's also important for understanding the immune response to cancer, as the image and algorithm can provide data about the number of immune cells likely infiltrating the tumor, exactly where they go, and which immune-modulating genes are active.
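To make the "teaching" step described above more concrete, here is a minimal sketch of supervised learning from labeled examples. It is not the team's method: it assumes each image patch has already been reduced to two hand-picked numbers (a relative nucleus size and a cell density), it uses a simple linear model fit by gradient descent, and every value in the synthetic dataset is invented for illustration. The function names are hypothetical.

```python
import random


def make_toy_patches(n, seed=0):
    """Build synthetic (features, expression) pairs.

    The relationship between morphology and expression here is made up;
    it only mimics the idea that smaller nuclei go with higher expression.
    """
    rng = random.Random(seed)
    patches = []
    for _ in range(n):
        nucleus_size = rng.uniform(0.5, 1.5)  # 1.0 stands in for a "normal" size
        cell_density = rng.uniform(0.0, 1.0)
        expression = (2.0 - 1.5 * nucleus_size + 0.5 * cell_density
                      + rng.gauss(0, 0.05))
        patches.append(((nucleus_size, cell_density), expression))
    return patches


def predict(w, b, features):
    """Linear prediction: weighted sum of features plus a bias."""
    return sum(wj * x for wj, x in zip(w, features)) + b


def train_linear_model(patches, epochs=1000, lr=0.5):
    """Fit weights by batch gradient descent on mean squared error."""
    n_features = len(patches[0][0])
    w = [0.0] * n_features
    b = 0.0
    n = len(patches)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, target in patches:
            error = predict(w, b, features) - target
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        # Step against the averaged gradient.
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


train_patches = make_toy_patches(200, seed=0)
w, b = train_linear_model(train_patches)

# After training, a patch with small nuclei should score higher than
# one with large nuclei, because that is how the toy data was built.
small_nucleus_score = predict(w, b, (0.6, 0.5))
large_nucleus_score = predict(w, b, (1.4, 0.5))
print(small_nucleus_score > large_nucleus_score)
```

The point of the sketch is the loop: repeated passes over labeled examples gradually adjust the model until visual features predict gene activity. A real system would learn its features directly from image pixels rather than from two hand-chosen measurements.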
Faster, more accessible information
Technically, there are other ways to obtain this sort of spatial genomic information, but they're far more time-intensive and expensive, said Zou, as other techniques require scientists to painstakingly parse the genomic activity in single cells. Using an algorithm to "read" the cellular morphologies in a sample of tumor tissue would ideally provide doctors with faster, more accessible information to understand patients' cancers and treat them.
The algorithm is still early in its development, said Zou, but the team is excited about how it could be used in the clinic. When tested on a preliminary dataset that was not used to train the algorithm, it accurately labeled 102 genes for at least 20 of 23 patients.
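The tally behind a statement like "accurate for at least 20 of 23 patients" can be sketched in a few lines. The article does not say how accuracy was judged for each patient, so this example assumes that decision has already been made and recorded as a true/false value per gene per patient. The gene names and results below are placeholders, not data from the study.

```python
def genes_consistent_across_patients(accuracy_by_gene, min_patients):
    """Return genes judged accurate in at least `min_patients` patients.

    accuracy_by_gene maps a gene name to a list of booleans, one per
    held-out patient. How "accurate" is decided is left to the caller.
    """
    return sorted(
        gene
        for gene, per_patient in accuracy_by_gene.items()
        if sum(per_patient) >= min_patients
    )


# Invented results for 23 held-out patients (placeholder gene names).
toy_results = {
    "GENE_A": [True] * 21 + [False] * 2,
    "GENE_B": [True] * 20 + [False] * 3,
    "GENE_C": [True] * 12 + [False] * 11,
}

consistent = genes_consistent_across_patients(toy_results, min_patients=20)
print(consistent)  # ['GENE_A', 'GENE_B']
```

Counting per patient, rather than pooling all cells together, checks that a gene's prediction holds up across different people and not just within one large sample.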
In the future, Zou sees this technique as broadly applicable to many types of cancer. What's more, he hopes that, beyond accurately showing which cancer-causing genes are active in a tumor sample, the algorithm will also help scientists discover new genes that contribute to the disease by revealing which genes are active in a given cancer sample.
"Ultimately, our goal is to have this become a computer filter that clinicians can use to get spatial genomics data in real time," said Zou.
Photo by James Zou