
The time ‘is now, in the beginning’: How do we ensure AI tools aren’t biased?

New artificial intelligence tools have the potential to revolutionize health care. But Stanford researchers argue that disparities could worsen without intervention now.

New artificial intelligence tools have the potential to revolutionize health care by providing quick and accurate responses to clinical questions, among many other possibilities.

However, the answers they give may also carry the racial, gender and other biases woven into the data that trained them, potentially perpetuating, or even amplifying, existing health care disparities. That's an overriding concern for Eleni Linos, MD, DrPH, the recently appointed director of Stanford Medicine's Center for Digital Health.

Linos, professor of medicine, dermatology and epidemiology, is the senior author of an investigation into how two different AI chatbots responded to clinical questions that had been put to human clinicians. The study, published in JAMA Network Open, was co-authored with a Stanford team including postdoctoral research scholars Jiyeong Kim, PhD, and Zhou Ran Cai, PhD, as well as medical student Michael Chen and epidemiology professor Julia Simard, PhD.

The study demonstrates that the AI chatbots tested -- ChatGPT and Bard -- gave different clinical advice depending on a patient's race, gender and socioeconomic status, a concerning finding if these differences exacerbate health disparities, Linos and her team concluded.

As the push to deploy AI medical tools rapidly expands and gives hope for advancing diagnostic accuracy, Linos spoke about the urgency of eliminating bias in their development. 

What does your study add to what we know about the dangers of AI perpetuating inequalities in medical treatment?

We compared several clinical scenarios for which the gender and race of a patient should not affect the medical treatment recommendation. However, when we put these into ChatGPT and Bard, we found that the responses differed by race, gender and ethnicity of the patient.  

For example, in one scenario describing a patient with coronary artery disease, one AI chatbot correctly identified this diagnosis in white men, Black men and white women, but not in Hispanic men, Hispanic women or Black women, and later recommended thrombolysis treatment only for white men. 

Of course, doctors aren't perfect either. There is substantial evidence about biases that human doctors carry when practicing medicine. The question we urgently need to ask is, "How can we apply these new tools carefully and wisely to help solve the problems of bias in medicine?" 

Our paper isn't suggesting these large language models or AI-based tools should not be used in medicine, although they did in some cases make some problematic recommendations. We are still at the very early stages. But we need to think about addressing these issues now and work with the companies developing these technologies to ensure that new tools are fair and accurate, and that they benefit everyone.

Health inequities are a huge problem in medicine. My hope is that these AI-based tools will actually be a key part of the solution.

How do you see these tools influencing medicine more generally?

These AI-based tools are rapidly being incorporated into all aspects of medical care, including tools to help make accurate medical diagnoses, decisions about which patients most need urgent care in the emergency room, or even the responses that patients get when they send messages to their doctors. Making sure these tools reduce health disparities and don't exacerbate them is going to be extremely important. 

Why is it important to get this right while this technology is still young? 

We're at an inflection point. Medicine -- actually, humanity as a whole -- is about to undergo huge transformations, and we are making a plea to the medical and technical communities to think carefully about bias and inclusion now, in the beginning. Once this technology is incorporated in every medical clinic, we want to make sure that it's on a trajectory toward accuracy, inclusion and equity -- that it helps make things better for everyone. 

What will help ensure that? 

As doctors and academics, we can't do this alone. The tech companies can't do it alone, either. We need to work together. We have different strengths, but ultimately, we have the same goal -- improving the lives of all patients. To get there we'll need the synergy that comes from having engineers, scientists, doctors and ethicists working at the same table.

You co-authored a study of how smartphone-based conversational agents responded to questions about suicide, depression, rape and domestic violence. Are the two studies related?

Yes, I think they are related. In the early days of these conversational AI assistants, if you asked Siri a question about depression, assault, domestic violence or suicide, it did not give nearly as good an answer as if you asked it a question about a broken leg or other physical ailment.

What was perhaps most surprising and important about that early study was how responsive the tech companies were in fixing this problem. I remember that within about a week of that paper's publication, Apple reprogrammed its iPhones to direct people to the right help lines all around the world. The speed at which these major tech companies can respond is remarkable, and it has important implications for public health globally.

How can you, as director of the Center for Digital Health, facilitate the kinds of collaboration you're talking about?

This is an incredible time to have the opportunity to lead the Stanford Center for Digital Health. The technological transformation continues to open up innovation and capabilities critical for solving health issues affecting humanity. Our center's mission is simple -- bring together the best and brightest minds across Stanford University, the Silicon Valley community and around the world with a single aim. We want to identify and address some of the most pressing questions at the intersection of health and technology in a way that is scientifically rigorous and ethically sound.

We are fortunate. Our mission resonates with the community at large. The team is growing, and our partners are incredible. Each brings a different perspective and expertise to the table. And we need them all. We are in the early stages and growing fast. Ultimately, my hope is that the conversations, connections and research we will support through the Center for Digital Health will help guide our society and our patients through this critical phase of digital transformation of medicine, and ultimately improve health for all.

Photo: 3rdtimeluckystudio
