The Whole is Greater than the Sum of its Parts: Towards the Effectiveness of Voting Ensemble Classifiers for Complex Word Identification.

Nikhil Wani, Sandeep Mathias, Jayashree Aanand Gajjam, Pushpak Bhattacharyya

2018

In this paper, we present an effective system using voting ensemble classifiers to detect contextually complex words for non-native English speakers. To make the final decision, we combine a set of eight calibrated classifiers based on lexical, size, and vocabulary features, and train our model on annotated datasets collected from a mixture of native and non-native speakers. We then test our system on three datasets, namely News, WikiNews, and Wikipedia, and report competitive results, with F1-scores ranging from 0.777 to 0.855 across the datasets. Our system outperforms multiple other models and falls within 0.042 to 0.026 percent of the best-performing model’s score in the shared task.
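
To make the voting step concrete, here is a minimal sketch of how eight calibrated probability estimates for a single target word could be combined into one decision. The abstract does not say whether the system uses soft or hard voting, which classifiers are involved, or what threshold applies, so the function names `soft_vote` and `hard_vote`, the 0.5 threshold, the tie-breaking rule, and the example probabilities are all assumptions made for illustration only.

```python
from statistics import mean


def soft_vote(probabilities, threshold=0.5):
    """Soft voting: average each classifier's P(complex), then threshold the mean."""
    return int(mean(probabilities) >= threshold)


def hard_vote(probabilities, threshold=0.5):
    """Hard voting: threshold each classifier separately, then take a strict majority.

    Assumption: a tie (e.g. 4 vs. 4 with eight classifiers) counts as "not complex".
    """
    votes = [p >= threshold for p in probabilities]
    return int(sum(votes) * 2 > len(votes))


# Hypothetical calibrated outputs of eight classifiers for one word in context.
probs = [0.9, 0.8, 0.45, 0.4, 0.45, 0.4, 0.45, 0.95]

print("soft vote:", soft_vote(probs))  # 1 = complex, 0 = not complex
print("hard vote:", hard_vote(probs))
```

With these made-up numbers the two schemes disagree: three confident classifiers pull the mean probability above the threshold, while only three of the eight individual votes say "complex". This is one reason calibrated probabilities matter when an ensemble averages them.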

Keywords:

Machine learning
Computer science
Artificial intelligence
Voting
word identification
Natural language processing
