TY - GEN
T1 - Active learning image spam hunter
AU - Gao, Yan
AU - Choudhary, Alok
PY - 2009
Y1 - 2009
N2 - Image spam annoys email users around the world. Most previous work on image spam detection focuses on supervised learning approaches. However, it is costly to obtain enough trustworthy labels for learning, especially for an adversarial problem in which spammers constantly modify their patterns to evade the classifier. To address this issue, we employ the principle of active learning, in which the learner guides the user to label as few images as possible while maximizing classification accuracy. Active learning is better suited to online image spam filtering because it dramatically reduces labeling costs with negligible overhead while maintaining high recognition performance. We present and compare two active learning algorithms, based on an SVM and a Gaussian process classifier, respectively. To the best of our knowledge, we are the first to apply active learning to the task of spam image filtering. Experimental results demonstrate that our active-learning-based approaches quickly achieve a high detection rate (>99%) and a low false positive rate (<0.5%) with only a small number of labeled images.
AB - Image spam annoys email users around the world. Most previous work on image spam detection focuses on supervised learning approaches. However, it is costly to obtain enough trustworthy labels for learning, especially for an adversarial problem in which spammers constantly modify their patterns to evade the classifier. To address this issue, we employ the principle of active learning, in which the learner guides the user to label as few images as possible while maximizing classification accuracy. Active learning is better suited to online image spam filtering because it dramatically reduces labeling costs with negligible overhead while maintaining high recognition performance. We present and compare two active learning algorithms, based on an SVM and a Gaussian process classifier, respectively. To the best of our knowledge, we are the first to apply active learning to the task of spam image filtering. Experimental results demonstrate that our active-learning-based approaches quickly achieve a high detection rate (>99%) and a low false positive rate (<0.5%) with only a small number of labeled images.
UR - http://www.scopus.com/inward/record.url?scp=72549085329&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=72549085329&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-10520-3_27
DO - 10.1007/978-3-642-10520-3_27
M3 - Conference contribution
AN - SCOPUS:72549085329
SN - 364210519X
SN - 9783642105197
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 293
EP - 302
BT - Advances in Visual Computing - 5th International Symposium, ISVC 2009, Proceedings
T2 - 5th International Symposium on Advances in Visual Computing, ISVC 2009
Y2 - 30 November 2009 through 2 December 2009
ER -