A Spam Detection Study of Tweets in Indian Healthcare

2016 
Twitter, one of the most rapidly growing social networks, has been infiltrated by large amounts of spam. Twitter has many potential applications across diverse areas; however, spam keeps the signal-to-noise ratio low, which is a major obstacle to meaningful analysis and action. Spam detection is a well-studied problem for email, but it is comparatively less researched for tweets. In this paper we set up a focused study of nearly 5000 tweets related to Indian healthcare. We conduct an extensive study in which six classifiers are evaluated and compared for spam detection. A simple term-frequency-based feature selection technique is shown to reduce model building time significantly. An ensemble method based on the top five classifiers improves both the accuracy and the stability of the results.
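
The two techniques named in the abstract, term-frequency-based feature selection and an ensemble over several classifiers, can be illustrated with a minimal sketch. The abstract does not give the selection criterion or the voting scheme, so the minimum-count threshold (`min_count`), the hard majority vote, the function names, and the toy tweets below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def select_terms_by_frequency(tokenized_tweets, min_count=2):
    """Keep terms whose total count across the corpus is at least min_count.

    Assumption: a plain frequency threshold; the paper may use another rule.
    """
    counts = Counter(term for tweet in tokenized_tweets for term in tweet)
    return {term for term, n in counts.items() if n >= min_count}


def to_features(tweet_tokens, vocabulary):
    """Bag-of-words counts restricted to the selected vocabulary."""
    return Counter(t for t in tweet_tokens if t in vocabulary)


def majority_vote(predictions):
    """Hard vote over the labels several classifiers assign to one tweet.

    Assumption: unweighted voting; the paper's ensemble rule is not stated.
    """
    return Counter(predictions).most_common(1)[0][0]


# Toy, made-up tweets (already tokenized) used only to exercise the functions.
tweets = [
    "free health checkup click link".split(),
    "government hospital opens new ward".split(),
    "click link win free medicine".split(),
]

vocab = select_terms_by_frequency(tweets, min_count=2)
features = to_features(tweets[0], vocab)

# Hypothetical labels from five classifiers for a single tweet.
label = majority_vote(["spam", "ham", "spam", "spam", "ham"])
```

Dropping rare terms shrinks the feature space each classifier has to fit, which is one plausible reason such a filter would shorten model building time. Using an odd number of voters avoids ties in a two-class hard vote.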