On Finding the Best Learning Model for Assessing Confidence in Speech

2020 
The human mind is naturally conditioned to assess the confidence of a speaker, so speaking with confidence is important for success in most domains and situations. Confidence in speech is a useful trait in interactions and discussions, and in the right amount it can sound pleasant or reassuring to the listener. For a person trying to sound more confident, however, a human evaluator who can give relevant feedback on tone and voice is not always available. Given the growing power of neural networks and other machine learning tools, a machine could serve as the evaluator, assessing the confidence in the user's speech and returning scores as feedback for improvement. In this paper, we present the descriptions, results, and analysis of our experiments in predicting the confidence of a speaker using machine learning and audio processing tools. The project involved building an unbiased dataset of audio recordings and scoring each recording on the confidence of the speaker. The audio clips were recorded by peers on campus and graded on clarity, modulation, pace, and volume. Three models were trained and tested on this dataset to predict the confidence of a speaker: a multilayer perceptron (MLP) neural network, a support vector machine (SVM), and a convolutional neural network (CNN). Our results show that the convolutional neural network produces scores with the highest accuracy, 86.3%, where accuracy is measured as the closeness of the predicted scores to the scores awarded by human assessors.
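The abstract does not state the exact formula behind "closeness to the scores awarded by human assessment". As a minimal sketch of one plausible reading, the Python function below averages a per-clip closeness of one minus the normalized absolute error. The function name `closeness_accuracy`, the 0 to 10 score scale, and the normalization are all assumptions made for illustration, not the authors' definition.

```python
def closeness_accuracy(predicted, human, max_score=10.0):
    """Mean closeness of predicted scores to human scores, as a percentage.

    Assumption: scores lie on a 0..max_score scale, and closeness for one
    clip is 1 - |predicted - human| / max_score (clipped at 0).
    This is a hypothetical metric, not the one defined in the paper.
    """
    if len(predicted) != len(human) or len(predicted) == 0:
        raise ValueError("score lists must be non-empty and of equal length")

    total = 0.0
    for p, h in zip(predicted, human):
        error = min(abs(p - h), max_score)  # cap the error at the scale width
        total += 1.0 - error / max_score    # 1.0 means a perfect match
    return 100.0 * total / len(predicted)


if __name__ == "__main__":
    # Hypothetical model outputs and human grades for three clips.
    model_scores = [4.0, 8.0, 6.5]
    human_scores = [5.0, 8.0, 6.0]
    print(f"{closeness_accuracy(model_scores, human_scores):.1f}%")
```

Under this reading, a model that reproduces every human score exactly would reach 100%, and larger disagreements lower the figure in proportion to their size.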