Automatic Speech Recognition in Different Languages using High-Density Surface Electromyography Sensors

2020 
Automatic speech recognition (ASR) based on surface electromyography (sEMG) sensors converts muscle electrical signals into computer-readable text, and can thereby overcome a limitation of acoustic sensors, which are easily contaminated by environmental noise. However, the placement of sEMG sensors currently depends mainly on the experimenter’s experience, which can miss important information about the major muscular activities and degrade classification performance. In this study, 120 closely spaced sEMG sensors were used to collect high-density (HD) sEMG signals for recognizing ten digits spoken in English and Chinese. A linear discriminant analysis classifier was used to classify the speaking tasks, and the sequential forward selection algorithm was used to determine the optimal sensor positions. The results showed that the HD sEMG energy maps could help visualize the dynamic muscle activities during speaking, and that significantly different muscular contraction patterns were observed for different speaking tasks. The classification accuracies obtained with the facial sensors were significantly lower than those obtained with the same number of sensors on the neck. Moreover, the classification rates could exceed 90% with only 15 optimally selected sensors, which were mainly distributed on the neck rather than the face. This study suggests that the neck muscles could be the main contributor, and that more sEMG sensors should be placed on the neck to improve ASR performance. These findings could provide valuable clues for the development of a practical sEMG-based speech recognition system, especially for patients with speaking disorders.
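The pairing of a linear discriminant analysis classifier with sequential forward selection can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one scalar feature per sensor channel (for example, signal energy), uses synthetic data in place of real HD sEMG recordings, and scores candidate sensors by cross-validated accuracy. All function names, the regularization constant, and the fold count are hypothetical choices made for illustration.

```python
import numpy as np

def fit_lda(X, y, reg=1e-3):
    """Fit LDA with a pooled covariance; reg is an assumed ridge term for stability."""
    classes = np.unique(y)
    n, d = X.shape
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    pooled = np.zeros((d, d))
    for c, m in zip(classes, means):
        diff = X[y == c] - m
        pooled += diff.T @ diff
    pooled = pooled / (n - len(classes)) + reg * np.eye(d)
    priors = np.array([np.mean(y == c) for c in classes])
    W = np.linalg.solve(pooled, means.T).T          # row k holds Sigma^-1 mu_k
    b = -0.5 * np.sum(means * W, axis=1) + np.log(priors)
    return classes, W, b

def predict_lda(model, X):
    """Assign each sample to the class with the largest linear discriminant score."""
    classes, W, b = model
    return classes[np.argmax(X @ W.T + b, axis=1)]

def cv_accuracy(X, y, n_folds=5, seed=0):
    """Accuracy pooled over the held-out folds of a shuffled k-fold split."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, n_folds)
    correct = 0
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        model = fit_lda(X[train], y[train])
        correct += np.sum(predict_lda(model, X[test]) == y[test])
    return correct / len(y)

def sequential_forward_selection(X, y, n_select):
    """Greedily add the channel that gives the best cross-validated accuracy."""
    selected, history = [], []
    remaining = list(range(X.shape[1]))
    while len(selected) < n_select and remaining:
        scores = [cv_accuracy(X[:, selected + [ch]], y) for ch in remaining]
        best = int(np.argmax(scores))
        selected.append(remaining.pop(best))
        history.append(float(scores[best]))
    return selected, history

# Synthetic stand-in for per-channel features: 10 digit classes, 12 channels,
# of which only channels 0-3 carry class information (an assumption of this demo).
rng = np.random.default_rng(1)
n_classes, n_per_class, n_channels = 10, 30, 12
y = np.repeat(np.arange(n_classes), n_per_class)
X = rng.normal(size=(n_classes * n_per_class, n_channels))
X[:, :4] += rng.normal(scale=3.0, size=(n_classes, 4))[y]

selected, history = sequential_forward_selection(X, y, n_select=4)
print("selected channels:", selected)
print("accuracy after each addition:", history)
```

With real recordings, `X` would hold the per-channel features of each utterance and `y` the spoken-digit labels. The order of `selected` then indicates which sensor positions contribute most, which is the kind of ranking the study uses to compare neck and facial sites.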