Rayid Ghani is the first to admit that the problems he works on are depressing, subjects like lead poisoning in children, police misconduct, and the tight link between mental health disorders, homelessness and incarceration.
As director of the University of Chicago’s Center for Data Science and Public Policy, Ghani has tackled these issues using a combination of data analysis, problem-solving, communication and social sciences. He made the case for this approach during a recent talk hosted by Stanford Medicine’s Center for Population Health Sciences.
“The idea here is to combine interventions with prediction, because predictions by themselves aren’t doing anybody any good,” said Ghani, who is also director of the University of Chicago’s Data Science for Social Good Fellowship.
He provided examples of systems that his group designed to address social problems.
In one project, Ghani and his colleagues combined 15 years of data from home inspections and tests of blood lead levels in children in Chicago to create a model that predicts the risk of lead poisoning. The model is now integrated with one hospital system's electronic health records, and it flags pregnant patients who live at addresses with potentially hazardous lead levels.
"Now the health department has a few months to figure out how to allocate resources to fix this problem before the kid is born," Ghani said.
In another effort, his team reviewed a system that many large police departments have used to try to predict which officers are at risk of using unjustified force or committing other misconduct. Working with the Charlotte-Mecklenburg Police Department in North Carolina, a fellowship team found that the system had flagged more than half of the police force as potential problems.
The team examined 10 years of data and discovered a possible correlation between misconduct and, among other predictors, the number of times an officer had been dispatched to suicide attempts or domestic abuse cases, particularly those involving children. With new criteria, they reprogrammed the system to identify the highest-risk officers, who would then be assigned to an intervention, such as counseling or training, before a problem occurred. The new system was launched in November, and Ghani's team continues to gather data.
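The reworked approach wasn't spelled out in the talk, but the core change it describes is ranking officers by a predicted risk score rather than flagging everyone above a crude threshold. A rough sketch of that idea follows; the dispatch-count features, toy records and logistic-regression choice are assumptions for illustration, not the department's actual system.

```python
# A hypothetical sketch of risk-ranked early intervention: score every officer,
# then flag only the highest-risk few for supportive follow-up. All data and
# modeling choices here are illustrative assumptions.
import pandas as pd
from sklearn.linear_model import LogisticRegression

officers = pd.DataFrame({
    "officer_id":          [101, 102, 103, 104, 105, 106],
    "suicide_dispatches":  [4, 0, 7, 1, 2, 9],
    "domestic_dispatches": [6, 1, 8, 2, 3, 10],
    "prior_complaints":    [1, 0, 2, 0, 1, 3],
    # Label from historical records: a later sustained misconduct finding.
    "later_misconduct":    [0, 0, 1, 0, 0, 1],
})

features = ["suicide_dispatches", "domestic_dispatches", "prior_complaints"]
model = LogisticRegression().fit(officers[features], officers["later_misconduct"])

# Flag only the highest-risk officers for an intervention such as counseling
# or training, rather than half the force.
officers["risk"] = model.predict_proba(officers[features])[:, 1]
print(officers.nlargest(2, "risk")[["officer_id", "risk"]])
```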
“These types of [solutions] are happening, but are happening in pockets today,” Ghani said. “Not everybody is doing it. The question becomes why are they not doing it?”
The answer, he said, is that most machine-learning classes tend to focus on abstract problems, while people with expertise in social issues are often unfamiliar with how to find data-driven solutions.
It’s important for data models to keep humans in the loop, Ghani said, because if people don’t trust the system, they won’t use it. Even good systems are wrong in a significant share of cases, and human knowledge and judgment can correct for that. And people know which interventions can best help those identified by computer models, for example desk duty or counseling for a high-risk police officer.
“It’s really important to get this part right,” Ghani said, “except in the majority of the methods that are used in machine learning, that’s an afterthought.”
That’s not to say that human judgment alone is superior to computers, he clarified. His group built a tool to assess bias and fairness in their data models. Though many of the models came up short on those measures, all of the new systems proved less biased than the ones they replaced.
Ghani said, “When people start getting really upset about these biased systems in AI/machine learning, it’s important to tell them, yes, but today’s system is also biased. Not using these tools means that you’re using today’s system, which is much, much, much worse in most cases.”
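The talk didn't name the group's audit tool or its metrics. One common check such a tool might run is comparing error rates, for example false-positive rates, across demographic groups; the sketch below illustrates that idea on made-up data and is not a description of Ghani's tool.

```python
# A rough illustration of one fairness check: compare false-positive rates
# across groups and report each group's disparity relative to a reference
# group. The data and the choice of metric are assumptions for illustration.
import pandas as pd

audit = pd.DataFrame({
    "group":     ["A", "A", "A", "B", "B", "B", "B", "A"],
    "label":     [0, 0, 1, 0, 0, 1, 0, 0],   # true outcome
    "predicted": [1, 0, 1, 0, 0, 1, 1, 0],   # model's flag
})

# False-positive rate per group: among true negatives, how often the model flags.
negatives = audit[audit["label"] == 0]
fpr = negatives.groupby("group")["predicted"].mean()
disparity = fpr / fpr["A"]  # ratio relative to reference group "A"
print(pd.DataFrame({"fpr": fpr, "fpr_disparity": disparity}))
```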