TY - GEN
T1 - Addressing age-related bias in sentiment analysis
AU - Díaz, Mark
AU - Johnson, Isaac
AU - Lazar, Amanda
AU - Piper, Anne Marie
AU - Gergle, Darren
N1 - Funding Information:
This work was supported in part by NSF grant IIS-1551574. We thank the bloggers who made their discourse and experience with aging publicly available online.
Publisher Copyright:
© 2018 Copyright is held by the owner/author(s).
PY - 2018/4/20
Y1 - 2018/4/20
AB - Computational approaches to text analysis are useful in understanding aspects of online interaction, such as opinions and subjectivity in text. Yet, recent studies have identified various forms of bias in language-based models, raising concerns about the risk of propagating social biases against certain groups based on sociodemographic factors (e.g., gender, race, geography). In this study, we contribute a systematic examination of the application of language models to study discourse on aging. We analyze the treatment of age-related terms across 15 sentiment analysis models and 10 widely-used GloVe word embeddings and attempt to alleviate bias through a method of processing model training data. Our results demonstrate that significant age bias is encoded in the outputs of many sentiment analysis algorithms and word embeddings. We discuss the models' characteristics in relation to output bias and how these models might be best incorporated into research.
KW - Aging
KW - Older adults
KW - Algorithmic bias
KW - Sentiment analysis
UR - http://www.scopus.com/inward/record.url?scp=85046977477&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046977477&partnerID=8YFLogxK
U2 - 10.1145/3173574.3173986
DO - 10.1145/3173574.3173986
M3 - Conference contribution
AN - SCOPUS:85046977477
T3 - Conference on Human Factors in Computing Systems - Proceedings
BT - CHI 2018 - Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems
PB - Association for Computing Machinery
T2 - 2018 CHI Conference on Human Factors in Computing Systems, CHI 2018
Y2 - 21 April 2018 through 26 April 2018
ER -