Context-specific language modeling for human trafficking detection from online advertisements

Saeideh Shahrokh Esfahani, Michael J. Cafarella, Maziyar Baran Pouyan, Gregory J. DeAngelo, Elena Eneva, Andrew E. Fano

Research output: Chapter in Book/Report/Conference proceedingConference contribution

9 Scopus citations

Abstract

Human trafficking is a worldwide crisis. Traffickers exploit their victims by anonymously offering sexual services through online advertisements. These ads often contain clues that law enforcement can use to separate out potential trafficking cases from volunteer sex advertisements. The problem is that the sheer volume of ads is too overwhelming for manual processing. Ideally, a centralized semi-automated tool can be used to assist law enforcement agencies with this task. Here, we present an approach using natural language processing to identify trafficking ads on these websites. We propose a classifier by integrating multiple text feature sets, including the publicly available pre-trained textual language model Bi-directional Encoder Representation from transformers (BERT). In this paper, we demonstrate that a classifier using this composite feature set has significantly better performance compared to any single feature set alone.

Original languageEnglish (US)
Title of host publicationACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
PublisherAssociation for Computational Linguistics (ACL)
Pages1180-1184
Number of pages5
ISBN (Electronic)9781950737482
StatePublished - 2020
Externally publishedYes
Event57th Annual Meeting of the Association for Computational Linguistics, ACL 2019 - Florence, Italy
Duration: Jul 28 2019Aug 2 2019

Publication series

NameACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference57th Annual Meeting of the Association for Computational Linguistics, ACL 2019
Country/TerritoryItaly
CityFlorence
Period7/28/198/2/19

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science(all)
  • Linguistics and Language

Fingerprint

Dive into the research topics of 'Context-specific language modeling for human trafficking detection from online advertisements'. Together they form a unique fingerprint.

Cite this