TY - GEN
T1 - Context-specific language modeling for human trafficking detection from online advertisements
AU - Esfahani, Saeideh Shahrokh
AU - Cafarella, Michael J.
AU - Pouyan, Maziyar Baran
AU - DeAngelo, Gregory J.
AU - Eneva, Elena
AU - Fano, Andrew E.
N1 - Publisher Copyright:
© 2019 Association for Computational Linguistics
PY - 2020
Y1 - 2020
N2 - Human trafficking is a worldwide crisis. Traffickers exploit their victims by anonymously offering sexual services through online advertisements. These ads often contain clues that law enforcement can use to separate out potential trafficking cases from volunteer sex advertisements. The problem is that the sheer volume of ads is too overwhelming for manual processing. Ideally, a centralized semi-automated tool can be used to assist law enforcement agencies with this task. Here, we present an approach using natural language processing to identify trafficking ads on these websites. We propose a classifier by integrating multiple text feature sets, including the publicly available pre-trained textual language model Bi-directional Encoder Representation from transformers (BERT). In this paper, we demonstrate that a classifier using this composite feature set has significantly better performance compared to any single feature set alone.
AB - Human trafficking is a worldwide crisis. Traffickers exploit their victims by anonymously offering sexual services through online advertisements. These ads often contain clues that law enforcement can use to separate out potential trafficking cases from volunteer sex advertisements. The problem is that the sheer volume of ads is too overwhelming for manual processing. Ideally, a centralized semi-automated tool can be used to assist law enforcement agencies with this task. Here, we present an approach using natural language processing to identify trafficking ads on these websites. We propose a classifier by integrating multiple text feature sets, including the publicly available pre-trained textual language model Bi-directional Encoder Representation from transformers (BERT). In this paper, we demonstrate that a classifier using this composite feature set has significantly better performance compared to any single feature set alone.
UR - http://www.scopus.com/inward/record.url?scp=85084041788&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084041788&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85084041788
T3 - ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
SP - 1180
EP - 1184
BT - ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
T2 - 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019
Y2 - 28 July 2019 through 2 August 2019
ER -