Adapting Speaker Embeddings for Speaker Diarisation.

Youngki Kwon, Jee-weon Jung, Hee-Soo Heo, You-Jin Kim, Bong-Jin Lee, Joon Son Chung

2021

The goal of this paper is to adapt speaker embeddings to the task of speaker diarisation. The quality of speaker embeddings is paramount to the performance of speaker diarisation systems. Despite this, prior works in the field have directly used embeddings designed only to be effective on the speaker verification task. In this paper, we propose three techniques that better adapt speaker embeddings for diarisation: dimensionality reduction, attention-based embedding aggregation, and non-speech clustering. A wide range of experiments is performed on various challenging datasets. The results demonstrate that all three techniques contribute positively to the performance of the diarisation system, achieving an average relative improvement of 25.07% in diarisation error rate (DER) over the baseline.
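The headline figure is a relative improvement, not an absolute one, so it may help to spell out how such a number is usually computed. The sketch below is only an illustration: the function name `relative_improvement` and the DER values are hypothetical and are not taken from the paper, and the abstract does not state how the average is taken.

```python
def relative_improvement(baseline_der: float, adapted_der: float) -> float:
    """Relative reduction in DER, in percent, of an adapted system over a baseline."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return 100.0 * (baseline_der - adapted_der) / baseline_der


# Hypothetical DER values in percent (assumed for illustration, not from the paper).
print(relative_improvement(20.0, 15.0))  # prints 25.0
```

Under this definition, a drop from 20% to 15% DER is a 5-point absolute improvement but a 25% relative improvement.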

Keywords:

Embedding
Speaker diarisation
Field (computer science)
Speech recognition
Range (mathematics)
Computer science
Cluster analysis
Dimensionality reduction
Word error rate
task
