Bias in Natural Language Processing (NLP)

Natural language processing (NLP) is one of the most active fields of machine learning research. Although existing language models reach numerically high performance on many language-understanding tasks, they are rarely optimized to minimize hidden biases. As NLP tools rise in popularity, it becomes increasingly vital to recognize the role they play in shaping societal biases and stereotypes.
June 28, 2022

What does machine learning bias look like? In essence, it occurs when machine learning algorithms display latent biases that frequently go undetected during testing, because the majority of publications evaluate their models purely for accuracy. Consider the following examples of deep learning algorithms displaying bias against women. Deep learning models have been found to indicate that:

  • "He is a doctor" is more likely than "She is a doctor."
  • Man is to woman as computer programmer is to homemaker.
  • Sentences containing female nouns tend to be rated as conveying anger more strongly.
  • Translating "He is a nurse. She is a doctor." from English to Hungarian and back to English yields "She is a nurse. He is a doctor."

These examples differ from an analogy like "man is to woman as king is to queen," where gender is part of each word's definition: kings are male and queens are female by definition. The statement "man is to woman as computer programmer is to homemaker" is prejudiced because neither computer programmers nor homemakers are defined as male or female. In these examples the algorithm is effectively reproducing stereotypes.
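The analogy arithmetic behind such statements can be sketched with a toy example. The three-dimensional vectors below are invented for illustration (real embeddings are learned from text and have hundreds of dimensions), and the helper names `cosine` and `complete_analogy` are hypothetical, not part of any library.

```python
import math

# Invented toy vectors: [gender, royalty, occupation]. Not real embeddings.
EMBEDDINGS = {
    "man":        [1.0, 0.0, 0.0],
    "woman":      [-1.0, 0.0, 0.0],
    "king":       [1.0, 1.0, 0.0],
    "queen":      [-1.0, 1.0, 0.0],
    "programmer": [0.8, 0.0, 1.0],   # assumed bias: leans toward "man"
    "homemaker":  [-0.8, 0.0, 1.0],  # assumed bias: leans toward "woman"
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def complete_analogy(a, b, c):
    """Return the word d that best completes 'a is to b as c is to d'."""
    target = [vb - va + vc for va, vb, vc in
              zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])]
    candidates = [w for w in EMBEDDINGS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, EMBEDDINGS[w]))

print(complete_analogy("man", "woman", "king"))
print(complete_analogy("man", "woman", "programmer"))
```

In this toy setup the first analogy is definitional, while the second comes out the way it does only because a gender component was placed in the occupation vectors.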

NLP models also show strong biases beyond gender prejudice. Here are some additional examples:

  • Black people are to crime what Caucasians are to law enforcement, according to machine learning models.
  • Machine learning models suggest that legal is to Christianity what terrorism is to Islam.
  • AI is more likely to mark tweets made by African Americans as offensive.

The key is that machine learning algorithms discover patterns in data. Let's assume that in our data, the words "doctor" and "nurse" typically appear alongside male and female pronouns, respectively. Our model will learn these patterns and conclude that, typically, nurses are women and doctors are men. Whoops. We have unintentionally taught our model to believe that nurses are female and doctors are male.
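A minimal sketch of how such a pattern sits in the data, assuming a tiny invented corpus. The sentences, the pronoun sets, and the function name `pronoun_counts` are all made up for illustration; a real model picks up the same kind of skew from far larger text collections.

```python
from collections import Counter

# Invented toy corpus; real training data is vastly larger.
CORPUS = [
    "the doctor said he would call",
    "the nurse said she would help",
    "the doctor finished his shift",
    "the nurse checked her notes",
    "the doctor said she was ready",
]
MALE = {"he", "him", "his"}
FEMALE = {"she", "her", "hers"}

def pronoun_counts(word):
    """Count gendered pronouns in sentences that mention `word`."""
    counts = Counter()
    for sentence in CORPUS:
        tokens = sentence.split()
        if word in tokens:
            counts["male"] += sum(t in MALE for t in tokens)
            counts["female"] += sum(t in FEMALE for t in tokens)
    return counts

for occupation in ("doctor", "nurse"):
    print(occupation, dict(pronoun_counts(occupation)))
```

A model trained on text with this skew has no way of knowing that the imbalance reflects the corpus rather than the meaning of the words.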

How Does Bias Affect NLP Models?

The majority of the deep learning models we employ are "black-box" models. To put it another way, we design a model, train it on certain data, and then use it to address a specific problem. As designers, we frequently stop there and don't look closely at the reasoning behind a model's choices. This doesn't necessarily imply that the fundamental ideas behind a model are unknown, though.

Unfortunately, much of NLP rests on unsupervised learning. Although the models used in natural language processing are primarily supervised learning models, the data they consume is produced by models that were trained in an unsupervised manner. We do this because we can't feed raw text directly into our models. Instead, we must first transform the text into language representations that our models can understand. These representations are called word embeddings.

Unsupervised models produce word embeddings, which are numerical representations of text data. An unsupervised model scans a large amount of text and generates vectors to represent the words in it. Unfortunately, because these models look for hidden patterns in the text and use them to build the embeddings, they are exposed to more than just semantic information. While digesting the text, they also absorb biases similar to those seen in human culture. The biases then spread to our supervised learning models, which are supposed to receive unbiased data so that they do not produce biased outputs.

Gender Bias

The Artificial Intelligence and Emerging Technology Initiative of The Brookings Institution discusses prejudice and NLP research in a 2021 essay that is a part of the "AI and Bias" series. The article mentions research the author carried out in 2017 at Princeton University's Center for Information Technology Policy.

The study found numerous examples of gender bias in language applications using machine learning. When the researchers examined how machines process word embeddings, they found that the machines had picked up human-like biases from the word associations in their training data. For instance, words like "kitchen" and "art" were more frequently used in phrases that included the word "woman". On the other hand, words like "science" and "technology" were more frequently found in sentences that included the word "man". These gender biases spread further down the chain as machine learning models pick up on the patterns: biased embeddings fed into supervised models result in biased output.
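One way to picture this kind of measurement is a simple association score: how much closer a word's vector sits to "man" than to "woman". The sketch below is only loosely in the spirit of embedding association tests; it is not the study's procedure, the two-dimensional vectors are invented, and the function names are hypothetical.

```python
import math

# Invented 2-D toy vectors: axis 0 loosely encodes gender, axis 1 topic.
VECTORS = {
    "man":        [1.0, 0.1],
    "woman":      [-1.0, 0.1],
    "science":    [0.5, 1.0],
    "technology": [0.6, 0.9],
    "kitchen":    [-0.5, 1.0],
    "art":        [-0.4, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(word, attr_a, attr_b):
    """Positive if `word` is closer to attr_a, negative if closer to attr_b."""
    w = VECTORS[word]
    return cosine(w, VECTORS[attr_a]) - cosine(w, VECTORS[attr_b])

for word in ("science", "technology", "kitchen", "art"):
    score = association(word, "man", "woman")
    print(f"{word:>10}: {score:+.3f}")
```

A score near zero would indicate no measurable gender association in these toy vectors; the sign shows which attribute word the target leans toward.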

A prominent recent illustration of gender bias in NLP involves GPT-3. OpenAI released the third-generation Generative Pre-trained Transformer, or GPT-3, in May 2020. The model was hailed as the greatest NLP model to date, producing language that was hard to distinguish from human writing. Even GPT-3 was not immune to gender prejudice, though. When asked to specify the gender of particular professions, GPT-3 responded that doctors are men and nurses are women.

Racial Bias

The 2017 Princeton study noted racial bias alongside gender bias. The research found that the model embeddings reflected online prejudices against African Americans and the Black community. When the researchers examined their model, they found that historically Black names were more strongly associated with words that had a negative connotation than traditionally White names were.

Evidence of this kind of racial bias existed well before 2017. The Brookings article also references a 2004 study which found that resumes with African American names received 50% fewer callbacks than resumes with white names.

How Do We Address It?

Like many other issues, bias in NLP can be addressed early or late in the pipeline. Debiasing the dataset happens early in the process, while debiasing the model happens afterwards.

  • Debiasing the datasets

For this to work, we must first identify the biased datasets among those currently in use. For instance, after discovering that a well-known computer vision dataset called Tiny Images contained racist, misogynistic, and degrading labels, MIT recently withdrew it. That doesn't imply such datasets can never be used; it means we need to take them out of circulation and make the necessary changes to account for bias. Likewise, bias must be taken into account when analyzing new datasets. Diversifying the dataset is currently the most widely accepted method for debiasing datasets. For instance, if a dataset constantly surrounds the word "nurse" with female pronouns, it can be debiased by including data for male nurses.
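Diversifying a text dataset can be sketched as adding a gender-swapped copy of each sentence. This is a simplified illustration under stated assumptions: the swap table and the names `swap_gender` and `augment` are hypothetical, and ambiguous words such as "her" (which can map to "him" or "his") are deliberately left out because they need grammatical context to swap correctly.

```python
import re

# Small, unambiguous swap table (an assumption for this sketch).
SWAPS = {
    "he": "she", "she": "he",
    "himself": "herself", "herself": "himself",
    "man": "woman", "woman": "man",
}
PATTERN = re.compile(r"\b(" + "|".join(SWAPS) + r")\b", flags=re.IGNORECASE)

def swap_gender(sentence):
    """Replace each gendered word with its counterpart, keeping capitalization."""
    def replace(match):
        word = match.group(0)
        swapped = SWAPS[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped
    return PATTERN.sub(replace, sentence)

def augment(dataset):
    """Return the dataset plus a swapped copy of every sentence that changes."""
    augmented = list(dataset)
    for sentence in dataset:
        swapped = swap_gender(sentence)
        if swapped != sentence and swapped not in augmented:
            augmented.append(swapped)
    return augmented

data = ["She is a nurse.", "He is a doctor.", "The clinic is open."]
for sentence in augment(data):
    print(sentence)
```

After augmentation, "nurse" and "doctor" each appear with both pronouns, so the toy dataset no longer ties either occupation to one gender.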

  • Debiasing the model

This is accomplished by altering the actual vector representations of words. For instance, the Hard Debias and Double-Hard Debias algorithms alter the vector representations to eliminate stereotypical information (such as the association between "receptionist" and "female") while preserving pertinent gender information (like the association between "queen" and "female"). These algorithms produce encouraging results and are a positive step in the fight against NLP bias.
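The core idea of removing stereotypical information from a vector can be sketched as a projection: estimate a gender direction and subtract a word's component along it. The code below is a simplified illustration of that one step, not a full implementation of Hard Debias or Double-Hard Debias; the vectors are invented, the direction is taken from a single assumed word pair, and the function names are hypothetical.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def gender_direction(male_vec, female_vec):
    """Unit vector pointing from the female word to the male word."""
    return normalize([m - f for m, f in zip(male_vec, female_vec)])

def neutralize(word_vec, direction):
    """Remove the component of word_vec that lies along direction."""
    projection = dot(word_vec, direction)
    return [w - projection * d for w, d in zip(word_vec, direction)]

# Invented toy vectors; axis 0 carries the gender signal.
he = [1.0, 0.2, 0.0]
she = [-1.0, 0.2, 0.0]
receptionist = [-0.7, 0.5, 0.4]  # assumed stereotypical lean toward "she"

g = gender_direction(he, she)
debiased = neutralize(receptionist, g)

print("before:", dot(receptionist, g))
print("after: ", dot(debiased, g))
```

Definitional words such as "queen" would be excluded from this step so that their legitimate gender information is preserved; only words that should be gender-neutral are neutralized.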

Even though the discipline of NLP has advanced quickly, it is never too late to correct bias in NLP models. However, we still need to deal with these bias problems as soon as we can, ideally before the models are used in a real-world scenario. Bias that goes undetected can lead to a biased model being deployed in real-world settings, with significant ramifications.


AI already plays a significant role in our lives, and that role keeps growing. The news we receive each day and the advertisements we see first each morning are selected by AI models. Ignoring the moral ramifications of letting something as influential as AI models run unchecked is a mistake that produces dubious outcomes like the biased examples in this article.

At times, it appears that censoring models is the only sensible course of action. But in my opinion, using AI as a scapegoat only conceals the fact that our culture is prejudiced. The models we train learn quickly and effectively from examples. Therefore, if we notice that our models are making biased judgments, we humans are probably at fault. Models serve as our mirror: because they can only replicate what they have been taught, they show us the defects in our own reflection that we would rather not see.

The only effective way to address prejudice and ethical issues in our models is to address those issues within ourselves. Fixing ourselves is what keeps our models both unbiased and accurate. Instead of trying to erase our past transgressions, now is the time to start setting an example for a better future using AI.

Written by Denny Fardian