Automatic Classification of Foreign Language Accent

2021 
Automatic accent classification using a database developed with both L1 and L2 language data is proposed. Speech samples were collected from native Indian speakers speaking in their mother tongue (Kannada, Tamil, or Telugu) and from non-native English speakers with one of these as their first language. The study relies on vocal tract characteristics: MFCC features extracted from both native and non-native speech were extensively analyzed. Performance in Regional Nativity Identification was validated using both native South Indian speech and non-native English speech from speakers of the same linguistic groups. Regional identity is detected using MFCC features with GMM-UBM / i-vector modeling. The challenges of second-language speech recognition are addressed by leveraging both native and non-native speech, which produced an SVM classification score of 86.1%, with the Area Under Curve (AUC) well above 90% for all three languages.
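
The abstract reports an AUC per language, which suggests a one-vs-rest evaluation, although the abstract does not state how the AUC was computed. As a minimal sketch under that assumption, the following computes a per-class AUC from classifier scores using the rank-sum identity. The function names (`auc`, `one_vs_rest_auc`) and the toy scores are hypothetical and invented for illustration; they are not the authors' code or data.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    scores: classifier scores, higher means "more likely positive".
    labels: 1 for the target language, 0 for the other languages.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(score_rows, true_labels, classes):
    """Per-class AUC: each class in turn is treated as the positive class."""
    result = {}
    for k, cls in enumerate(classes):
        scores = [row[k] for row in score_rows]
        labels = [1 if t == cls else 0 for t in true_labels]
        result[cls] = auc(scores, labels)
    return result


# Toy scores, invented for illustration only (not data from the paper).
# Each row holds one utterance's scores in the order given by CLASSES.
CLASSES = ["Kannada", "Tamil", "Telugu"]
TOY_SCORES = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.3, 0.2],
    [0.1, 0.2, 0.7],
    [0.4, 0.1, 0.5],
]
TOY_LABELS = ["Kannada", "Kannada", "Tamil", "Tamil", "Telugu", "Telugu"]

if __name__ == "__main__":
    for cls, value in one_vs_rest_auc(TOY_SCORES, TOY_LABELS, CLASSES).items():
        print(f"{cls}: AUC = {value:.3f}")
```

The pairwise count is quadratic in the number of samples, which is acceptable for a small evaluation set; a sort-based rank computation would be the usual choice for larger ones.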