Historical representations of social groups across 200 years of word embeddings from Google Books

Tessa E.S. Charlesworth, Aylin Caliskan, Mahzarin R. Banaji*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

50 Scopus citations

Abstract

Using word embeddings from 850 billion words in English-language Google Books, we provide an extensive analysis of historical change and stability in social group representations (stereotypes) across a long timeframe (from 1800 to 1999), for a large number of social group targets (Black, White, Asian, Irish, Hispanic, Native American, Man, Woman, Old, Young, Fat, Thin, Rich, Poor), and their emergent, bottom-up associations with 14,000 words and a subset of 600 traits. The results provide a nuanced picture of change and persistence in stereotypes across 200 years. Change was observed in the top-associated words and traits: whether analyzing the top 10 or 50 associates, at least 50% of top associates changed across successive decades. Despite this changing content of top-associated words, the average valence (positivity/negativity) of these top stereotypes was generally persistent. Ultimately, through advances in the availability of historical word embeddings, this study offers a comprehensive characterization of both change and persistence in social group representations as revealed through books of the English-speaking world from 1800 to 1999.
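As a rough illustration of the kind of analysis the abstract describes, the sketch below ranks candidate words by cosine similarity to a group word within one decade's embedding, measures what fraction of the top associates is replaced in the next decade, and averages a valence score over the top associates. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the two-dimensional toy vectors, and the valence scores are all hypothetical, and the abstract does not specify which similarity measure or valence norms were used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_associates(embeddings, group_word, candidates, k):
    """Return the k candidate words most similar to group_word."""
    target = embeddings[group_word]
    ranked = sorted(
        candidates,
        key=lambda word: cosine(target, embeddings[word]),
        reverse=True,
    )
    return ranked[:k]


def turnover(top_earlier, top_later):
    """Fraction of the earlier top associates absent from the later list."""
    earlier = set(top_earlier)
    return len(earlier - set(top_later)) / len(earlier)


def mean_valence(words, valence):
    """Average valence score of a list of words."""
    return sum(valence[word] for word in words) / len(words)


# Hypothetical toy data: one embedding table per decade (assumption).
candidates = ["kind", "loud", "calm", "bold"]
decade_a = {
    "group": [1.0, 0.0],
    "kind": [1.0, 0.1],
    "loud": [1.0, 0.5],
    "calm": [0.2, 1.0],
    "bold": [0.0, 1.0],
}
decade_b = {
    "group": [1.0, 0.0],
    "kind": [1.0, 0.1],
    "loud": [0.1, 1.0],
    "calm": [1.0, 0.4],
    "bold": [0.0, 1.0],
}
# Hypothetical valence scores, not taken from any published norms.
valence = {"kind": 0.8, "loud": 0.2, "calm": 0.2, "bold": 0.5}

top_a = top_associates(decade_a, "group", candidates, k=2)
top_b = top_associates(decade_b, "group", candidates, k=2)
change = turnover(top_a, top_b)
valence_a = mean_valence(top_a, valence)
valence_b = mean_valence(top_b, valence)
```

In this toy setup one of the two top associates is replaced between decades while the mean valence of the top associates stays the same, which mirrors the pattern the abstract reports: the content of the top associates changes while their average valence persists.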

Original language: English (US)
Article number: e2121798119
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 119
Issue number: 28
State: Published - Jul 12 2022

Funding

ACKNOWLEDGMENTS. This research was supported by the Harvard Mind Brain Behavior Inter-Faculty Initiative, the Foundations of Human Behavior, and the Hao Family Inequality in America Support Fund awarded to M.R.B. and T.E.S.C. We are grateful to Wil Cunningham, Dan Hoyer, and Yoav Rabinovich for feedback on earlier versions of this manuscript.

Keywords

  • attitude change
  • natural language processing
  • stereotype change
  • word embeddings

ASJC Scopus subject areas

  • General
