MALTP: Parallel Prediction of Malicious Tweets

Eric Lancaster, Tanmoy Chakraborty, V. S. Subrahmanian*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

5 Scopus citations

Abstract

It has been reported that embedded URLs and multimodal content (images, video, and sound recordings) in tweets are increasingly used to seduce users into a 'wrong click,' leading to malware infection. In this paper, we predict whether a tweet is malicious by examining five classes of features. The first four are textual content including sentiment, paths emanating from a URL mentioned in the tweet, attributes associated with URLs, and multimodal content in the tweet. The fifth class first constructs a novel 'tweet graph' and then defines features by analyzing 'metapaths' contained in the tweet graph. Next, we propose a MALicious Tweets in Parallel (MALTP) collective classification algorithm that merges tweet graphs, metapaths, and collective classification proposed previously in the literature. We conduct detailed experiments using two data sets, Warningbird (WB) and KBA. We show that our metapath-based approach outperforms past efforts at identifying malicious tweets and further show that metapath-based features in conjunction with Alexa ranks and features from KBA yield very high predictive accuracy (over 0.98 on KBA and over 0.94 on WB), outperforming past work. More significantly, metapath features alone generate a predictive accuracy of 0.977 and 0.923, respectively, on the KBA and WB data sets, significantly outperforming the other methods in isolation. We conduct a further analysis to identify the most important features; surprisingly, our results show that the presence of multimodal content is not a major factor and that metapath-based features dominate in separating malicious from benign tweets.
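
As a rough illustration of what a metapath-based feature can look like, the sketch below counts walks in a tiny typed graph whose node types follow a given type sequence. It is not the paper's implementation: the node types (user, tweet, url), the toy edges, and the names `NODE_TYPE`, `EDGES`, `build_adjacency`, and `count_metapath_instances` are all assumptions made for this example, and the abstract does not specify how the tweet graph or its metapaths are actually defined.

```python
from collections import defaultdict

# Hypothetical toy "tweet graph": every node has a type (assumed schema).
NODE_TYPE = {
    "u1": "user", "u2": "user",
    "t1": "tweet", "t2": "tweet", "t3": "tweet",
    "url1": "url", "url2": "url",
}

# Undirected edges: user-posted-tweet and tweet-contains-url (assumed relations).
EDGES = [
    ("u1", "t1"), ("u1", "t2"), ("u2", "t3"),
    ("t1", "url1"), ("t2", "url1"), ("t3", "url1"), ("t3", "url2"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of node pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def count_metapath_instances(start, metapath, adj, node_type):
    """Count walks from `start` whose node types match `metapath` in order.

    Walks may revisit nodes, so a tweet-url-tweet walk can end at the
    tweet it started from.
    """
    if node_type[start] != metapath[0]:
        return 0
    frontier = {start: 1}  # node -> number of matching walks ending there
    for wanted_type in metapath[1:]:
        next_frontier = defaultdict(int)
        for node, walks in frontier.items():
            for neighbor in adj[node]:
                if node_type[neighbor] == wanted_type:
                    next_frontier[neighbor] += walks
        frontier = next_frontier
    return sum(frontier.values())


ADJ = build_adjacency(EDGES)

# One feature per metapath for tweet t1; a classifier would consume these.
features_t1 = {
    "tweet-url-tweet": count_metapath_instances(
        "t1", ("tweet", "url", "tweet"), ADJ, NODE_TYPE),
    "tweet-user-tweet": count_metapath_instances(
        "t1", ("tweet", "user", "tweet"), ADJ, NODE_TYPE),
}
print(features_t1)
```

In this toy setup each metapath yields one numeric feature per tweet; how such counts are normalized, combined with the other feature classes, or fed into collective classification is left open here because the abstract does not say.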

Original language: English (US)
Article number: 8472279
Pages (from-to): 1096-1108
Number of pages: 13
Journal: IEEE Transactions on Computational Social Systems
Volume: 5
Issue number: 4
State: Published - Dec 2018
Externally published: Yes

Keywords

  • Machine learning
  • Phishing
  • Predictive modeling
  • Security
  • Social media

ASJC Scopus subject areas

  • Modeling and Simulation
  • Social Sciences (miscellaneous)
  • Human-Computer Interaction
