Feature Extraction Methods in Language Identification: A Survey

2019 
Language Identification (LI) is a rapidly emerging field within speech processing that aims to accurately identify the language of utterances in a database based on features of the speech signal. LI technologies have a wide range of applications owing to growing advances in artificial intelligence and machine learning. Feature extraction is one of the fundamental and most significant processes performed in LI. This review presents the main research paradigms in feature extraction methods, giving researchers deep insight into feature extraction techniques for future studies in LI. Broadly, this review summarizes and compares various feature extraction approaches with and without noise compensation techniques, as the current trend is towards a robust, universal Language Identification framework. This paper categorizes the feature extraction approaches on the basis of the type of feature, the human speech production system or peripheral auditory system, spectral or cepstral analysis, and lastly the transform used. Moreover, the different noise compensation-based feature extraction techniques are also covered. The paper also shows that Mel-Frequency Cepstral Coefficients (MFCCs) are the most popular approach. Results indicate that MFCCs fused with other feature vectors and cleansing approaches give improved performance compared with pure MFCC-based feature extraction approaches. This study also describes the different categories at the front end of the LI system from a research point of view.