Penn State: College of the Liberal Arts
Linguistic Diversity Across the Lifespan Graduate Research Traineeship Program

Linguistics Meets Technology (NRT) talk: Dr. Shomir Wilson

When: February 10, 2023
Where: 127 Moore Building (and virtual)

Friday, February 10, 2023, 9:00–10:30 a.m. EST, 127 Moore Building and virtually via Zoom

Dr. Shomir Wilson

Assistant Professor and Director of the Human Language Technologies Lab in the
College of Information Sciences and Technology at Penn State

“Sociodemographic Biases in Natural Language Processing: Two Case Studies”

Large language models (LLMs) are widely used in natural language processing (NLP) to
obtain high performance on a variety of tasks. However, the large corpora used to train
these models contain sociodemographic biases, and LLMs tend to inherit those biases, with
potentially harmful results. Shomir Wilson will present two case studies that reveal the
sociodemographic biases of select LLMs within the context of sentiment analysis, a
common NLP task. The first study shows that Word2Vec and GloVe exhibit negative
sentiment bias toward terms for people with disabilities. The second study shows that GPT-2
exhibits a range of sentiment biases for nationality demonyms, i.e., words that specify
national origins. Shomir will conclude with some thoughts on the significance of these
biases and the challenges to mitigating or eliminating them.