An Ensemble Approach for the Diagnosis of COVID-19 from Speech and Cough Sounds
2021
Detecting COVID-19 infections at scale has proven to be a very hard problem. In this work, we describe the systems we developed to diagnose COVID-19 cases from coughing sounds and speech. We propose a hybrid configuration that employs a Convolutional Neural Network (CNN), a Time Delay Neural Network (TDNN), and a Long Short-Term Memory (LSTM) network to extract embeddings from coughing sounds and speech. The proposed framework also uses SpecAugment-based on-the-fly data augmentation and multi-level statistics pooling to map frame-level information into an utterance-level embedding. For the final decision, we employ classical support vector machine, random forest, AdaBoost, decision tree, and logistic regression classifiers to determine whether a given embedding comes from a COVID-19 negative or positive patient. We also adopt an end-to-end approach that applies a ResNet model with a one-class softmax loss function to high-resolution hand-crafted features to make the positive-versus-negative decision. Experiments are carried out on two subsets of the Cambridge COVID-19 Sound database, denoted COVID-19 Speech Sounds (CSS) and COVID-19 Cough Sounds (CCS), and results are reported on the development and test sets of these subsets. Our approach outperforms the baselines provided by the challenge organizers on the development set. The results suggest that using speech to help remotely detect early COVID-19 infections, and eventually other respiratory diseases, is likely possible, which opens the way to a cheap and scalable pre-diagnosis tool for better handling pandemics.
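As an illustration of two of the components named above, the following is a minimal NumPy sketch of SpecAugment-style masking and of statistics pooling. It is not the authors' implementation: the function names, the mask sizes, and the reading of "multi-level" as "statistics concatenated across several layers' frame-level outputs" are all assumptions made for this example.

```python
import numpy as np


def spec_augment(spec, max_freq_mask=8, max_time_mask=20, rng=None):
    """Zero out one random frequency band and one random time span.

    spec has shape (num_frames, num_bins). Mask widths are hypothetical
    defaults, not values taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = spec.copy()  # leave the caller's array untouched
    num_frames, num_bins = out.shape

    # Frequency mask: width f, starting at bin f0.
    f = int(rng.integers(0, min(max_freq_mask, num_bins) + 1))
    f0 = int(rng.integers(0, num_bins - f + 1))
    out[:, f0:f0 + f] = 0.0

    # Time mask: width t, starting at frame t0.
    t = int(rng.integers(0, min(max_time_mask, num_frames) + 1))
    t0 = int(rng.integers(0, num_frames - t + 1))
    out[t0:t0 + t, :] = 0.0
    return out


def stats_pool(frames):
    """Map (num_frames, dim) frame-level features to a (2 * dim,) vector
    by concatenating the per-dimension mean and standard deviation."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])


def multi_level_stats_pool(layer_outputs):
    """Assumed interpretation of multi-level pooling: pool each layer's
    frame-level output separately and concatenate the results."""
    return np.concatenate([stats_pool(h) for h in layer_outputs])
```

Because the pooled vector has a fixed length regardless of the number of frames, recordings of different durations can be passed to the same downstream classifier.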