Analysis of Unintelligible Speech for MLLR and MAP-Based Speaker Adaptation

2021 
Speech recognition is the process of translating the human voice into text, which in turn drives many applications, including Human-Computer Interaction (HCI). A recognizer uses an acoustic model to define rules for mapping sound signals to phonemes. This article presents a combined method that applies Maximum Likelihood Linear Regression (MLLR) and Maximum A Posteriori (MAP) adaptation to the acoustic model of a generic speech recognizer so that it can accept and transcribe the speech of people with speech impairments. In the first phase, the MLLR technique was applied to alter the acoustic model of a generic speech recognizer using feature vectors generated from the training data set. In the second phase, the parameters of the updated model were used as informative priors for MAP adaptation. This combined algorithm produced better results than a Speaker Independent (SI) recognizer and required less training effort than a Speaker Dependent (SD) recognizer. The system was tested on the UA-Speech database, and the combined algorithm improved recognition accuracy from 43% to 90% for medium to highly impaired speakers, demonstrating its applicability to speakers with higher degrees of speech disorder.
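The two-phase scheme in the abstract can be sketched numerically. In a typical MLLR step, a shared linear regression transform (A, b) is applied to the Gaussian mean vectors of the acoustic model; the transformed means then serve as informative priors for a MAP re-estimation that interpolates them with occupancy-weighted statistics from the adaptation data. The sketch below is a minimal illustration under these standard formulations, not the paper's implementation; the function names, the shared single transform, and the prior weight `tau` are assumptions for the example.

```python
import numpy as np

def mllr_transform(means, A, b):
    # Phase 1 (MLLR): apply a shared regression transform to each
    # Gaussian mean: mu' = A @ mu + b.
    return means @ A.T + b

def map_update(prior_means, stats_sum, stats_count, tau=10.0):
    # Phase 2 (MAP): interpolate the MLLR-adapted priors with
    # occupancy-weighted sums of adaptation-data feature vectors:
    # mu_map = (tau * mu_prior + sum_t gamma_t x_t) / (tau + sum_t gamma_t)
    return (tau * prior_means + stats_sum) / (tau + stats_count[:, None])

# Toy example: 3 Gaussians with 2-dimensional feature means.
rng = np.random.default_rng(0)
means = rng.normal(size=(3, 2))
A = np.eye(2) * 1.1          # hypothetical estimated transform
b = np.array([0.2, -0.1])
adapted = mllr_transform(means, A, b)

# Hypothetical sufficient statistics gathered from adaptation data
# (per-Gaussian occupancy counts and weighted observation sums).
stats_count = np.full(3, 5.0)
stats_sum = adapted * stats_count[:, None]
final = map_update(adapted, stats_sum, stats_count)
```

With sparse adaptation data (small occupancy counts) the MAP estimate stays close to the MLLR-adapted prior, which is why the MLLR-then-MAP cascade is attractive for impaired speakers who can provide only limited training speech.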