Classifying Emotions in Roman Urdu Posts using Machine Learning
2021
Emotion, a state of the human mind, has a significant impact on human behavior, social interactions and decision making. Some emotions in Carroll E. Izard's model, such as long-term fear, anger and sadness, can cause mental disorders, which can lead to deterioration of mental and physical health. To support mental well-being, earlier work extracted these emotions from psychological parameters such as voice, facial expressions and body movements. This paper focuses instead on verbal language for emotion classification. With the huge popularity of Web 3.0, people feel more comfortable sharing their emotions over social media because of the anonymity of the web. Researchers around the globe have recently become interested in classifying emotions from social media posts (e.g., tweets, comments or micro-blogs) written in resource-rich languages such as English. However, less attention has been paid to resource-poor languages such as Urdu in Roman script. According to a Gallup survey in Pakistan, Urdu in Roman script is widely used to transmit text. Detecting emotions in Roman Urdu text is a major challenge because of its script and morphology. This paper makes the following contributions: 1) A Roman Urdu corpus is developed for emotion classification. 2) We aim to identify users' mental states from textual posts on the social media platforms Instagram, Facebook, Twitter and YouTube. In particular, Natural Language Processing methods are combined with TF-IDF feature extraction and four supervised classifiers: Support Vector Machine (SVM), Naive Bayes, Random Forest and K-nearest Neighbor. With 80% of the data used for training and 20% for testing, Naive Bayes achieves 70.57% accuracy. 3) A suitability analysis is performed to identify the most suitable classifier in terms of accuracy, training time and testing time. The results show that Naive Bayes is the most suitable classifier for emotion classification.
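The pipeline described in the abstract (TF-IDF features, a supervised classifier, an 80%/20% train/test split) can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration: the toy posts are invented and are not taken from the paper's corpus, the whitespace tokenizer stands in for whatever preprocessing the authors used, the smoothed IDF formula and Laplace smoothing are common defaults rather than the paper's stated settings, and only a multinomial Naive Bayes classifier is shown rather than all four classifiers.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical toy posts, invented for illustration; NOT the paper's corpus.
POSTS = [
    ("mujhe bohat dar lag raha hai", "fear"),
    ("andhere se dar lagta hai", "fear"),
    ("imtihan ka khauf hai", "fear"),
    ("raat ko dar lagta hai", "fear"),
    ("mujhe bohat gussa aa raha hai", "anger"),
    ("us par gussa aata hai", "anger"),
    ("gussa mat dilao", "anger"),
    ("main bohat udaas hoon", "sadness"),
    ("dil udaas hai aaj", "sadness"),
    ("aaj bohat udaas lag raha hoon", "sadness"),
]

def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every token in the training docs."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

def tfidf(doc, idf):
    """Sparse TF-IDF vector; tokens unseen during fitting are dropped."""
    tf = Counter(tok for tok in tokenize(doc) if tok in idf)
    return {tok: count * idf[tok] for tok, count in tf.items()}

def train_nb(samples, alpha=1.0):
    """Fit multinomial Naive Bayes on (text, label) pairs using TF-IDF weights."""
    labels = [label for _, label in samples]
    idf = fit_idf([text for text, _ in samples])
    sums = defaultdict(Counter)  # per-class sum of TF-IDF weight per token
    for text, label in samples:
        for tok, weight in tfidf(text, idf).items():
            sums[label][tok] += weight
    classes = {}
    for label, n_c in Counter(labels).items():
        total = sum(sums[label].values()) + alpha * len(idf)
        classes[label] = (math.log(n_c / len(labels)), sums[label], total)
    return {"idf": idf, "alpha": alpha, "classes": classes}

def predict(model, text):
    """Return the class with the highest log-posterior score."""
    vec = tfidf(text, model["idf"])
    alpha = model["alpha"]

    def score(label):
        prior, sums, total = model["classes"][label]
        return prior + sum(
            w * math.log((sums[tok] + alpha) / total) for tok, w in vec.items()
        )

    return max(model["classes"], key=score)

def split_80_20(samples, seed=0):
    """Shuffle with a fixed seed, then use 80% for training and 20% for testing."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

train, test = split_80_20(POSTS)
model = train_nb(train)
accuracy = sum(predict(model, text) == label for text, label in test) / len(test)
print(f"toy accuracy on {len(test)} held-out posts: {accuracy:.2f}")
```

With a corpus this small the printed accuracy says nothing about real performance; the sketch is only meant to show how the feature extraction, training and held-out evaluation steps fit together.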