SPEAKER TRACKING AND DETECTION WITH MULTIPLE SPEAKERS

Larry P. Heck, Mitchel Weintraub

1998

We describe a speaker tracking and detection system for Switchboard conversations. The system uses a two-speaker and silence hidden Markov model (HMM) with a minimum state duration constraint and Gaussian mixture model (GMM) state distributions adapted from a single gender- and handset-independent imposter model distribution. Speaker tracking segments the speakers for detection, which is carried out by averaging the frame scores along the Viterbi path and normalizing them with a novel parameter-interpolation extension of HNORM that supports files of arbitrary length. We also introduce duration statistics that augment the acoustic scores through a nonlinear combination function. Results are reported on the NIST 1998 Multispeaker development evaluation dataset.
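
As an illustration of the tracking step described in the abstract, the following is a minimal sketch of Viterbi decoding with a minimum state duration constraint, followed by averaging frame scores along the decoded path. It is not the authors' implementation. It assumes three states (speaker A, speaker B, silence), precomputed per-frame log-likelihoods, and zero-cost transitions. The minimum duration is enforced by expanding each state into a chain of sub-states, which is one common way to do it and may differ from the paper's approach. The function names `min_duration_viterbi` and `mean_path_score` are hypothetical, and HNORM and the duration statistics are not modeled here.

```python
import numpy as np


def min_duration_viterbi(loglik, min_dur):
    """Best state sequence in which every segment lasts >= min_dur frames.

    loglik: array of shape (T, S), per-frame log-likelihood of each state
            (assumed precomputed, e.g. from adapted GMMs).
    Transition costs are assumed to be zero (an illustrative simplification).
    """
    loglik = np.asarray(loglik, dtype=float)
    T, S = loglik.shape
    D = int(min_dur)
    if S < 2 or D < 1 or T < D:
        raise ValueError("need S >= 2, min_dur >= 1 and T >= min_dur")

    # delta[s, k]: best score of a path ending in sub-state k of state s.
    # Sub-state k means "at least k + 1 frames spent in the current segment".
    delta = np.full((S, D), -np.inf)
    delta[:, 0] = loglik[0]
    back = np.zeros((T, S, D), dtype=np.int64)  # flat predecessor index s * D + k

    for t in range(1, T):
        new = np.full((S, D), -np.inf)
        for s in range(S):
            # Sub-state 0: a new segment starts, entered from another state
            # that has already satisfied its minimum duration.
            others = [r for r in range(S) if r != s]
            r_best = max(others, key=lambda r: delta[r, D - 1])
            new[s, 0] = delta[r_best, D - 1]
            back[t, s, 0] = r_best * D + (D - 1)
            # Intermediate sub-states: forced advance within the segment.
            for k in range(1, D):
                new[s, k] = delta[s, k - 1]
                back[t, s, k] = s * D + (k - 1)
            # Final sub-state: may also self-loop once min_dur is reached.
            if delta[s, D - 1] > new[s, D - 1]:
                new[s, D - 1] = delta[s, D - 1]
                back[t, s, D - 1] = s * D + (D - 1)
            new[s] += loglik[t, s]
        delta = new

    # Require the last segment to satisfy the minimum duration as well.
    flat = int(np.argmax(delta[:, D - 1])) * D + (D - 1)
    path = np.empty(T, dtype=np.int64)
    for t in range(T - 1, -1, -1):
        path[t] = flat // D
        if t > 0:
            flat = int(back[t].reshape(-1)[flat])
    return path


def mean_path_score(frame_scores, path, state):
    """Average the frame scores of `state` over frames assigned to it."""
    mask = path == state
    if not mask.any():
        return float("nan")
    return float(np.asarray(frame_scores, dtype=float)[mask, state].mean())
```

With a minimum duration of several frames, an isolated frame that favors another state is absorbed into the surrounding segment instead of producing a spurious speaker change. In a detection setting the averaged quantity would typically be a per-frame score for the target speaker against the imposter model; the sketch simply averages whatever per-frame scores are passed in.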

Keywords:

Artificial intelligence
Mixture model
Speech recognition
NIST
Speaker diarisation
Interpolation
Viterbi algorithm
Pattern recognition
Nonlinear system
Computer science
Hidden Markov model
speaker tracking
