AI, medicine and race: Why ending 'structural racism' in health care now is crucial

As artificial intelligence changes the way medicine is practiced, humans become more beholden to algorithms -- making it crucial to get those machine-human collaborations correct at the outset.

For instance, health care providers must reckon with inherent race-based biases in medicine, which can reinforce false stereotypes in algorithms and lead to improper treatment recommendations or late diagnoses, according to Stanford Medicine researcher Tina Hernandez-Boussard.

Building guardrails that protect against this bias is the focus of Hernandez-Boussard, MD, PhD, a professor of medicine, biomedical data science and surgery. A more accurate approach to the consideration of race and ethnicity, Hernandez-Boussard contends, would incorporate social determinants of health: the non-medical factors that influence health outcomes, such as place of birth, food access and education.

"Many health differences in patients are not derived from race, but social determinants of health and structural racism in the health care system," Hernandez-Boussard said. "There's often not a genetic basis for this. There's more variation within a race than there is between races. It doesn't make sense to use race in these algorithms because it is a social construct not based on biology."

Hernandez-Boussard is the senior author of a paper published in Health Affairs on Oct. 2 that makes four recommendations for dismantling race-based medicine: increasing the diversity of clinical trial populations, broadening the focus of precision medicine beyond genetics, improving education about the complex factors shaping health outcomes, and developing new guidelines and policies to enable culturally responsive care.

Deconstructing algorithmic bias starts with the data used to train health care models. The U.S. is composed of a predominantly non-Hispanic white population with access to health care. Because researchers have the most data on this population, current algorithms are going to perform well for that population -- but they won't for many others.
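One way to see this kind of gap is to score a model separately for each group rather than only in aggregate. The following is a minimal sketch, not a method described by Hernandez-Boussard: the function name `sensitivity_by_group`, the group labels "A" and "B", and the toy records are all made up for illustration and do not come from any real dataset.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """Return {group: true-positive rate} for (group, y_true, y_pred) records.

    y_true and y_pred are 1 when the condition is present / predicted, else 0.
    Groups with no positive cases are left out of the result.
    """
    hits = defaultdict(int)
    positives = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 1:
            positives[group] += 1
            if y_pred == 1:
                hits[group] += 1
    return {group: hits[group] / positives[group] for group in positives}


# Hypothetical toy records: (group, actual label, model prediction).
toy_records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 1, 1), ("A", 1, 0), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 0),
]

rates = sensitivity_by_group(toy_records)
for group, rate in sorted(rates.items()):
    print(f"group {group}: sensitivity {rate:.2f}")
```

In this toy data the model catches a smaller share of true cases in the smaller group, a difference that a single pooled accuracy figure would hide.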

According to Hernandez-Boussard, it will require a deeper dive to hash out the factors that cause perceived racial differences, giving us a more complete picture of disease and other health contributors. She recently spoke about the opportunities to identify and mitigate these biases before they reach the point of care.

How did race-based medicine come into existence and what are some examples?

Here's an example that stems from misinformation two hundred years ago: Doctors believed that African American people had a smaller lung capacity. It wasn't true. The smaller lung capacity was a result of malnutrition. Any race can have malnutrition. Years later this persistent stereotype reduced African American people's eligibility for lung transplants because doctors believed that their lung capacity wasn't sufficient to support the new organ.

Race has always been a routinely collected demographic in health care. So, it's easy to include in clinical decision-making algorithms to attempt to adjust for health care disparities. This makes it deeply entrenched in almost all facets of health care.

There are two big examples of using race as a health factor that we reference in our paper.

The first is the algorithm that measures kidney function, called the estimated glomerular filtration rate, or eGFR. eGFR uses creatinine, a waste product whose blood level indicates how well the kidney is filtering waste, among other measurements, to estimate a person's kidney health. When eGFR was first implemented, doctors thought African American people generally had higher levels of creatinine, as creatinine is produced in muscles and there was an assumption that African American people had an overall higher muscle mass, compared to the rest of the population. That false assumption led to the underdiagnosis of kidney disease in this population.

The other is called the vaginal birth after cesarean section, or VBAC, algorithm. It's a measurement that estimates the likelihood of having an adverse pregnancy outcome from vaginal delivery after having a c-section. African American and Hispanic people were thought to have worse outcomes compared to white people, with higher costs and longer recovery times with c-sections. After exploring the differences in outcomes, however, scientists found that chronic hypertension was the culprit, not race.
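The eGFR example can be made concrete with a short calculation. The interview does not give a formula, so this sketch assumes the 2009 CKD-EPI creatinine equation, which multiplied the estimate by 1.159 for patients recorded as Black. The coefficients below are written out as that equation is commonly published and should be checked against a primary source before any real use. The patient values and the cutoff of 60 (a commonly cited level below which reduced kidney function is flagged) are illustrative assumptions, not figures from the paper.

```python
def egfr_ckd_epi_2009(scr_mg_dl, age_years, female, apply_race_coefficient):
    """Estimate GFR in mL/min/1.73 m^2 from serum creatinine.

    Assumed form: the 2009 CKD-EPI creatinine equation, including its
    optional race multiplier. Illustration only, not clinical software.
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = (
        141.0
        * min(ratio, 1.0) ** alpha
        * max(ratio, 1.0) ** -1.209
        * 0.993 ** age_years
    )
    if female:
        egfr *= 1.018
    if apply_race_coefficient:
        egfr *= 1.159  # race-based multiplier in the assumed 2009 equation
    return egfr


# Hypothetical patient: 60-year-old man, serum creatinine 1.4 mg/dL.
without_race = egfr_ckd_epi_2009(1.4, 60, female=False, apply_race_coefficient=False)
with_race = egfr_ckd_epi_2009(1.4, 60, female=False, apply_race_coefficient=True)

ASSUMED_CUTOFF = 60.0  # illustrative threshold for flagging reduced function
print(f"without race coefficient: {without_race:.1f}, flagged: {without_race < ASSUMED_CUTOFF}")
print(f"with race coefficient:    {with_race:.1f}, flagged: {with_race < ASSUMED_CUTOFF}")
```

With identical lab results, the multiplier raises the estimate by about 16 percent, enough in this hypothetical case to move the patient from below the cutoff to above it. That is the mechanism by which a race adjustment can delay a diagnosis.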

How much awareness is there within today's health care system that this is a problem?

Physicians and scientists largely know that AI algorithms can perpetuate bias if they're not adequately developed, designed and tested. There's a big push from government, from patient advocate groups and from medical societies to eliminate bias.

In 2020 the National Kidney Foundation and the American Society of Nephrology convened a task force to look at the disparities around kidney transplants. The task force found that measuring cystatin C -- a protein that slows the breakdown of other proteins -- instead of creatinine improved the accuracy of the eGFR algorithm. Since 2014, the algorithm has also considered time spent on dialysis, which supports patients during late-stage kidney disease.

African American populations often experience more severe disease as a result of delayed diagnoses, which leads to more time spent on dialysis. That often disqualifies them from transplants, as those who have spent less time on the list are thought to be healthier and therefore favored.

Updating the algorithm has helped accurately measure kidney function for all populations, leading to more equal transplant rates between African American and white groups.

In 2021 the Maternal Fetal Medicine Network also removed race as a risk factor for poor outcomes in the VBAC algorithm. Now, white people with chronic hypertension are better informed about their risk of complicated delivery and African American and Hispanic people undergo fewer unnecessary c-sections.

The American Academy of Pediatrics eliminated race as an input to all algorithms and guidelines for clinical practice. These are just a few examples.

How do we ensure that racial diversity is captured when creating AI algorithms?

There's a lot of research into both designing the questions that are analyzed by AI, and how to capture the data used to inform these questions. People who create health care algorithms are likely unaware of their innate biases. So, by having a more diverse team and having more input from the diverse public, we might be able to catch these misconceptions before they start causing patient harm.

One way to address this is to have more diverse datasets. The All of Us Research Program by the National Institutes of Health is a good example of enrolling diverse populations to increase representative data. The program aims to develop relationships with at least a million participants who reflect the country's diversity. I use the All of Us data in my research as well as Medicaid and other datasets to improve population representation.

What is the biggest remaining challenge?

We need to spend more time gathering complete patient data. It takes a lot of digging and strong research to understand the links between different characteristics and the health outcome we're looking at.

It's also about getting people to embrace the team approach with researchers from all different disciplines when gathering data. There's a lot of focus on genetics in precision medicine, for example. But precision care is about identifying the right treatment for the right patient. To do that, we also need to consider the patient's cultural values, their access to care, whether they're a caregiver, whether they have young kids at home, and whether they have access to transportation, among other factors.

We need a health care team that can see the patient holistically. When we start thinking about the patient holistically, we can better account for factors like social determinants of health, for which race has been an insufficient proxy in many algorithms.
