$100,000 prize jackpot. call now!: identifying the pertinent features of SMS spam

Henry Tan,Nazli Goharian,Micah Sherr

2012

Mobile SMS spam is a prevalent and growing problem. While recent work has shown that simple machine learning techniques can distinguish between ham and spam with high accuracy, this paper explores the individual contributions of various textual features to the classification process. Our results reveal the surprising finding that simple is better: using the largest SMS spam corpus of which we are aware, we find that simple textual features are sufficient to provide accuracy nearly identical to that achieved by the best known techniques, while achieving a twofold speedup.
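
To make the idea of classifying messages from simple textual features concrete, the following is a minimal sketch, not the authors' method. The abstract does not name the features or the classifier used, so the sketch assumes word unigrams as the features and a multinomial naive Bayes classifier with add-one smoothing. The names `tokenize`, `NaiveBayes`, and `TRAINING_DATA`, and the four toy messages, are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(message):
    """Lowercase a message and split it into alphanumeric tokens (unigrams)."""
    return re.findall(r"[a-z0-9]+", message.lower())


class NaiveBayes:
    """Multinomial naive Bayes over word unigrams with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.doc_counts = Counter()              # label -> number of messages
        self.vocab = set()                       # all tokens seen in training

    def fit(self, messages, labels):
        for message, label in zip(messages, labels):
            tokens = tokenize(message)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)

    def predict(self, message):
        tokens = tokenize(message)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior: fraction of training messages carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                # Add-one smoothing keeps unseen tokens from zeroing the score.
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, invented training data (assumption: not drawn from the paper's corpus).
TRAINING_DATA = [
    ("WIN a $100,000 prize jackpot. Call now!", "spam"),
    ("Free prize! Call now to claim", "spam"),
    ("Are we still meeting for lunch today?", "ham"),
    ("Call me when you get home", "ham"),
]

model = NaiveBayes()
model.fit([text for text, _ in TRAINING_DATA], [label for _, label in TRAINING_DATA])

print(model.predict("prize jackpot call now"))
print(model.predict("lunch at home today?"))
```

A model this small is cheap to train and to apply because it needs only token counts, which is the kind of trade-off between feature simplicity and speed that the abstract describes. Any accuracy or speedup figures would have to come from the paper itself, not from this toy example.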

Keywords:

Data mining
Feature selection
Information retrieval
Speedup
Spambot
Computer science
Simple machine
Machine learning
SMS spam
Artificial intelligence
