Semi supervised image spam hunter: A regularized discriminant EM approach

Yan Gao*, Ming Yang, Alok Choudhary

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Scopus citations

Abstract

Image spam is a new trend in email spam. Image spam employs a variety of image processing techniques to create random noise. In this paper, we propose a semi-supervised approach, the regularized discriminant EM algorithm (RDEM), to detect image spam emails. It leverages a small amount of labeled data and a large amount of unlabeled data to identify spam and train a classification model simultaneously. Compared with fully supervised learning algorithms, the semi-supervised learning algorithm is better suited to adversarial classification problems, because spammers actively protect their work by constantly making changes to circumvent spam detection. This makes it too costly for fully supervised learning to frequently collect sufficient labeled data for training. Experimental results demonstrate that our approach achieves a detection rate of 91.66% with a false positive rate of less than 2.96%, while significantly reducing the labeling cost.

Original language: English (US)
Title of host publication: Advanced Data Mining and Applications - 5th International Conference, ADMA 2009, Proceedings
Pages: 152-164
Number of pages: 13
State: Published - 2009
Event: 5th International Conference on Advanced Data Mining and Applications, ADMA 2009 - Beijing, China
Duration: Aug 17 2009 to Aug 19 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5678 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 5th International Conference on Advanced Data Mining and Applications, ADMA 2009
Country/Territory: China
City: Beijing
Period: 8/17/09 to 8/19/09

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
