Speech Segmentation and Speaker Diarization using Time-Delay Neural Network

Mesut Toruk, Ahmet Serbes, Gokhan Bilgin

2019

In recent years, significant research has been carried out on speaker diarization, an important topic in speech processing. In particular, the i-vector method has brought significant improvements to the diarization problem; in parallel, current deep learning methods have been applied effectively in speech processing. As a result, the performance of speaker diarization systems has increased. This study first examines how various speech activity detection systems affect a speaker diarization system. To this end, methods based on a deep neural network, a boosted deep neural network, an adaptive context attention model, and a time-delay deep neural network are used. Then, the effect of i-vector and x-vector speaker representations on the diarization error rate is examined.
