Multi-lingual Author Profiling: Predicting Gender and Age from Tweets!

2020 
This article describes how we built a multi-lingual classification system for author profiling. We used Twitter corpora in English, Dutch, Italian, and Spanish to build different models, each incorporating an SVM classifier, that predict the gender and age of an author. We evaluated each model using 3-fold cross-validation on the training dataset for each language. Under this cross-validation scheme, the highest average accuracy for gender classification was 81.3%, obtained for Spanish, and the highest accuracy for age classification was 70.3%, obtained for English. For the other languages, the results ranged from 64% to 76%.
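
The 3-fold cross-validation scheme mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the features, SVM settings, or fold assignment used, so everything here is an assumption: the helper names (`k_fold_indices`, `cross_val_accuracy`, `train_majority`) are hypothetical, the folds are simple contiguous splits, the toy tweets are invented, and the classifier is a majority-class stand-in rather than the SVM described above. Any trainer with the same call shape could be swapped in.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# A trained model maps one text to one predicted label.
Classifier = Callable[[str], str]
# A trainer takes training texts and labels and returns a trained model.
Trainer = Callable[[Sequence[str], Sequence[str]], Classifier]


def k_fold_indices(n: int, k: int = 3) -> List[List[int]]:
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_accuracy(
    texts: Sequence[str], labels: Sequence[str], train: Trainer, k: int = 3
) -> Tuple[List[float], float]:
    """Return the per-fold accuracies and their mean."""
    scores = []
    for test_idx in k_fold_indices(len(texts), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(texts)) if i not in held_out]
        model = train([texts[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct = sum(model(texts[i]) == labels[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores, sum(scores) / len(scores)


def train_majority(texts: Sequence[str], labels: Sequence[str]) -> Classifier:
    """Stand-in trainer (NOT an SVM): always predicts the most common label."""
    most_common = Counter(labels).most_common(1)[0][0]
    return lambda text: most_common


# Toy data, invented for illustration only.
texts = ["tweet a", "tweet b", "tweet c", "tweet d", "tweet e", "tweet f"]
labels = ["female", "female", "female", "male", "female", "female"]

scores, mean_accuracy = cross_val_accuracy(texts, labels, train_majority, k=3)
print(scores, mean_accuracy)
```

Averaging the per-fold scores, as done here, is one common way to arrive at a single "average accuracy" figure per language and task; whether the authors averaged in exactly this way is not stated.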