DNN-based Approach to Detect and Classify Pathological Voice

2018 
We participated in the FEMH 2018 Challenge, a big data subproject of the IEEE. The goal of the challenge is to detect pathological voice and to classify the underlying disorder as phonotrauma, neoplasm, or vocal paralysis. Results are evaluated using sensitivity, specificity, and unweighted average recall (UAR). The database consists of 50 normal voice samples and 150 samples of common voice disorders, recorded at a tertiary teaching hospital (Far Eastern Memorial Hospital, FEMH). This paper proposes an approach based on deep neural networks (DNNs) for the challenge. For data preprocessing we use Mel-Frequency Cepstral Coefficients (MFCCs), which also carry emotion-specific information. Gradual spectral variations are captured using 13 MFCCs extracted from the speech signal. For disease detection, we compare the performance of different DNN structures (i.e., the number of hidden layers and the number of neurons). For disease classification, we compare the performance of different batch sizes, with and without normalization. Among the tested DNN structures, the best results were obtained with 5 hidden layers and 200 neurons.
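The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function names, the label strings, and the toy predictions below are assumptions made for illustration only; they are not the challenge's scoring script, and the abstract does not state exactly how each metric is applied to the detection and classification tasks.

```python
def sensitivity_specificity(y_true, y_pred, positive="pathological"):
    """Binary detection metrics; `positive` is a hypothetical label name.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must occur in y_true")
    return tp / (tp + fn), tn / (tn + fp)


def uar(y_true, y_pred):
    """Unweighted average recall: the plain mean of the per-class recalls.

    Each class counts equally, regardless of how many samples it has.
    """
    recalls = []
    for label in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Toy labels, not challenge data.
    y_true = ["normal", "normal", "pathological", "pathological",
              "pathological", "pathological"]
    y_pred = ["normal", "pathological", "pathological", "pathological",
              "pathological", "normal"]
    sens, spec = sensitivity_specificity(y_true, y_pred)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f} "
          f"UAR={uar(y_true, y_pred):.3f}")
```

Because UAR averages recall over classes rather than over samples, it is not dominated by the larger class, which matters for an imbalanced set such as 50 normal versus 150 disordered samples.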