As popular artificial intelligence tools advance, they are becoming more covertly racist, a worrying new report says.
A team of technology and linguistics researchers this week discovered that large-scale language models like OpenAI’s ChatGPT and Google’s Gemini are now using African American Vernacular English (AAVE), a dialect of English created and spoken by Black Americans. ) has revealed that he holds racist stereotypes about the speaker.
Valentin Hoffman, a researcher at the Allen Institute for Artificial Intelligence and co-author of a recent paper published this week on arXiv, said, “These technologies can help companies perform tasks such as screening job applicants. We know that it is actually commonly used in Cornell University Open Access Research Archive.
Until now, researchers were “only looking at what they were actually observing,” Hoffman explained. overt “There may be racial bias in these technologies,” but “how these AI systems respond to less obvious indicators of race, such as differences in dialect. I have never done any “investigation”.
The paper states that black people who use AAVE in their speeches are “known to experience racial discrimination in a wide range of contexts, including education, employment, housing, and legal consequences.”
Hoffman and his colleagues asked the AI model to rate the intelligence and employability of people who spoke using AAVE compared to people who spoke using what they called “Standard American English.”
For example, an AI model could interpret the sentence “When I wake up from a bad dream, I’m so happy because they feel so real” and “When I wake up from a bad dream, I’m so happy because that’s how they feel.” I was asked to compare “. It’s too real.”
The model was significantly more likely to describe AAVE speakers as “stupid” and “lazy” and assign them to low-paying jobs.
Hoffman wonders if the results mean the AI model will penalize job applicants for code-switching between AAVE and standard American English (the act of changing how you express yourself depending on your audience). I am concerned that this may not be the case.
“One of the big concerns is that job seekers, for example, have used this dialect in their social media posts,” he told the Guardian. “It is not unreasonable to think that the language model would not select a candidate because they used a dialect in their online presence.”
The AI model was also significantly more likely to recommend the death penalty for a hypothetical criminal defendant who used AAVE in his courtroom statement.
“I’d like to think we’re not close to the time when this type of technology is used in determining convictions,” Hoffman said. “It may feel like a very dystopian future, and I hope it is.”
Still, Hoffman told the Guardian that it is difficult to predict how language learning models will be used in the future.
“Ten years ago, even five years ago, we had no idea about all the different contexts in which AI is used today,” he said, adding that new research on racism in large-scale language models It urged developers to heed the paper’s warnings.
In particular, AI models are already being used in the U.S. legal system to assist with administrative tasks such as creating court records and conducting legal research.
Leading AI experts like Timnit Gebru, former co-leader of Google’s ethical artificial intelligence team, have long urged the federal government to reduce the use of large, largely unregulated language models. I’ve been asking for it.
“It feels like a gold rush,” Gebru told the Guardian last year. “Actually, it is teeth gold Rush. And a lot of the people who are making money are not actually the people who are making money. ”
Google’s AI model, Gemini, recently announced that the company’s image-generating tool depicting a variety of historical figures, including the Pope, the Founding Fathers of the United States, and, most tragically, German soldiers from World War II. It appeared on social media posts and was in trouble. As a person of color.
Large-scale language models improve as they are fed more data, learning to more accurately imitate human speech by learning text from billions of web pages on the internet. . The conceit of this learning process is that models spout all the racist, sexist, and other harmful stereotypes they encounter on the internet. In computing, this problem is explained by the adage “garbage in, garbage out.” Racist input leads to racist output, and early AI chatbots like Microsoft’s Tay ended up spewing out the same neo-Nazi content he learned from Twitter users in 2016. Masu.
In response, groups like OpenAI have developed guardrails, a set of ethical guidelines that regulate the content that language models like ChatGPT can convey to users. As language models grow, overtly racist tendencies tend to fade.
However, Hoffman and his colleagues found that as language models grow, covert racism increases. They learned that ethical guardrails only teach language models to be more cautious about racial bias.
“It doesn’t solve the fundamental problem. The guardrails seem to be mimicking the behavior of educated people in the United States,” said Hugging Face, an AI ethics researcher at the intersection of public policy and technology. Abhijit Ghosh, whose research focuses on
“Once people get past a certain level of education they stop being vilified to their face, but racism still exists. It’s the same with language models: garbage in, garbage out. These models are Instead of forgetting that there is a problem, you just get better at hiding it.”
Aggressive adoption of language models by the U.S. private sector is expected to further intensify over the next decade, with the broader market for generative AI set to become a $1.3 trillion industry by 2032, according to Bloomberg. It is predicted that it will be. Meanwhile, federal labor regulators such as the Equal Employment Opportunity Commission have only recently begun protecting workers from AI-based discrimination, with the first case of its kind reported to the EEOC late last year.
Ghosh, like Gebru, is part of a growing corps of AI experts concerned about the harm language learning models could cause if technological advances continue to outpace federal regulations.
“There is no need to stop innovation or slow down AI research, but reducing the use of these technologies in certain sensitive areas is a good first step,” he said. “Racists exist all over the country. We don’t need to put them in prison, but we try to keep them from being in charge of recruiting and hiring. Technology should be regulated in a similar way. You should.”


