Spam Filter by Using X2 Statistics and Support Vector Machines

Song Wook Lee

Vol. 17, No. 3, pp. 249-254, Jun. 2010

10.3745/KIPSTB.2010.17.3.249

Abstract

We propose an automatic spam filter for e-mail data using Support Vector Machines (SVM). We use the lexical form of each word and its part-of-speech (POS) tag as features and select features by chi-square (χ²) statistics. For the experiments, we represent each feature by TF (term frequency), TF-IDF, and binary weights. After the SVM is trained on the selected features, it classifies each e-mail as spam or non-spam. In the experiments, the selected features improved the performance of our system, and we achieved an overall accuracy of 98.9% on the TREC05-p1 spam corpus.
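
The feature selection and weighting steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (chi_square, select_features, weigh), the "word/POS" token format, and the toy corpus are all hypothetical, and the chi-square score uses the standard 2x2 term/class contingency formula, which is assumed here because the abstract does not give the exact variant.

```python
import math
from collections import Counter

def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: spam docs with the term      n10: non-spam docs with the term
    n01: spam docs without the term   n00: non-spam docs without the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0  # degenerate table: the term carries no class information
    return n * (n11 * n00 - n10 * n01) ** 2 / denom

def select_features(docs, labels, k):
    """Return the k terms with the highest chi-square score (ties broken by name)."""
    n_spam = sum(labels)
    n_ham = len(labels) - n_spam
    spam_df, ham_df = Counter(), Counter()
    for tokens, is_spam in zip(docs, labels):
        # Count each term once per document (document frequency).
        (spam_df if is_spam else ham_df).update(set(tokens))
    scores = {}
    for term in set(spam_df) | set(ham_df):
        n11, n10 = spam_df[term], ham_df[term]
        scores[term] = chi_square(n11, n10, n_spam - n11, n_ham - n10)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

def weigh(tokens, features, doc_freq, n_docs, scheme):
    """Represent one message over the selected features: 'binary', 'tf', or 'tfidf'."""
    tf = Counter(tokens)
    vec = []
    for term in features:
        if scheme == "binary":
            vec.append(1.0 if tf[term] else 0.0)
        elif scheme == "tf":
            vec.append(float(tf[term]))
        elif scheme == "tfidf":
            df = doc_freq.get(term, 0)
            vec.append(tf[term] * math.log(n_docs / df) if df else 0.0)
        else:
            raise ValueError(f"unknown scheme: {scheme}")
    return vec

# Hypothetical toy corpus: each token is a lexical form joined with its POS tag.
docs = [
    ["free/JJ", "money/NN", "now/RB"],
    ["free/JJ", "offer/NN"],
    ["meeting/NN", "now/RB"],
    ["project/NN", "meeting/NN"],
]
labels = [True, True, False, False]  # True marks spam

features = select_features(docs, labels, k=2)
doc_freq = Counter(term for tokens in docs for term in set(tokens))
message = ["free/JJ", "free/JJ", "meeting/NN"]
vectors = {s: weigh(message, features, doc_freq, len(docs), s)
           for s in ("binary", "tf", "tfidf")}
```

The resulting vectors are what would be passed to an SVM trainer; the classifier itself is left out of the sketch because the abstract does not state the kernel or parameters used.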

Cite this article

[IEEE Style]

S. W. Lee, "Spam Filter by Using X2 Statistics and Support Vector Machines," The KIPS Transactions: Part B, vol. 17, no. 3, pp. 249-254, 2010. DOI: 10.3745/KIPSTB.2010.17.3.249.

[ACM Style]

Song Wook Lee. 2010. Spam Filter by Using X2 Statistics and Support Vector Machines. The KIPS Transactions: Part B, 17, 3, (2010), 249-254. DOI: 10.3745/KIPSTB.2010.17.3.249.
