LEAP System for SRE 2019 CTS Challenge - Improvements and Error Analysis
Abstract:
The NIST Speaker Recognition Evaluation - Conversational Telephone Speech (CTS) challenge 2019 was an open evaluation for the task of speaker verification in challenging conditions. In this paper, we provide a detailed account of the LEAP SRE system submitted to the CTS challenge focusing on the novel components in the back-end system modeling. All the systems used the time-delay neural network (TDNN) based x-vector embeddings. The x-vector system in our SRE19 submission used a large pool of training speakers (about 14k speakers). Following the x-vector extraction, we explored a neural network approach to backend score computation that was optimized for a speaker verification cost. The system combination of generative and neural PLDA models resulted in significant improvements for the SRE evaluation dataset. We also found additional gains for the SRE systems based on score normalization and calibration. Subsequent to the evaluations, we have performed a detailed analysis of the submitted systems. The analysis revealed the incremental gains obtained for different training dataset combinations as well as the modeling methods.
Keywords:
NIST
Normalization
Speaker Verification
Baseline (sea)
The purpose of this letter is to unify several of the state-of-the-art score normalization techniques applied to text-independent speaker verification systems. We propose a new framework for this purpose. The two well-known Z- and T-normalization techniques can be easily interpreted in this framework as different ways to estimate score distributions. This is useful because it clarifies the assumptions behind these well-known score normalization techniques and opens the door to more complex solutions. Finally, some experiments on the Switchboard database are performed to illustrate the validity of the newly proposed framework.
Normalization
Speaker Verification
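As a rough illustration of the Z- and T-normalization mentioned in the abstract above, the sketch below uses the commonly cited textbook form of both techniques: each subtracts a mean and divides by a standard deviation, and they differ only in which impostor scores supply those statistics. The function names and the toy scores are hypothetical and are not taken from the letter.

```python
from statistics import mean, stdev


def normalize(raw_score, impostor_scores):
    """Shift and scale a raw score by impostor score statistics."""
    mu = mean(impostor_scores)
    sigma = stdev(impostor_scores)  # sample standard deviation
    return (raw_score - mu) / sigma


def z_norm(raw_score, model_vs_impostor_utterances):
    """Z-norm: statistics come from scoring the target speaker model
    against a set of impostor utterances (can be computed offline)."""
    return normalize(raw_score, model_vs_impostor_utterances)


def t_norm(raw_score, test_vs_cohort_models):
    """T-norm: statistics come from scoring the test utterance
    against a cohort of impostor models (computed at test time)."""
    return normalize(raw_score, test_vs_cohort_models)


# Toy example with made-up scores.
raw = 2.0
print(z_norm(raw, [0.0, 1.0, 2.0]))
print(t_norm(raw, [1.0, 2.0, 3.0]))
```

Seen this way, the two techniques are two estimates of the impostor score distribution, which matches the interpretation the abstract describes.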
In this paper, we describe the speaker verification (SV) systems developed by the Indian Institute of Technology Guwahati (IITG) for the NIST 2012 speaker recognition evaluation. The primary submission consists of five gender-dependent SV systems combined at the score level. Of the five systems, two are based on sparse representation over learned and exemplar dictionaries, and the remaining three are based on the generic i-vector and its variants obtained by vowel and non-vowel conditioning. The exemplar-dictionary-based system in particular exploits the new evaluation rule allowing knowledge of all targets in each detection trial. The performance of the system is presented for the NIST SRE 2012 core task.
NIST
Speaker Verification
Representation
In this paper, we propose a novel approach to speaker verification. One of the problems in conventional speaker verification techniques is that constructing good speaker background models is difficult. Here, the background speakers are clustered into groups using a speaker clustering technique, and the background model is constructed based on those groups. Support-vector-machine-based speaker verification models are trained on the enrolled speaker and the background model. Preliminary results on text-independent speaker verification are provided to demonstrate the effectiveness of such systems.
Speaker Verification
Speaker diarisation
This paper describes the development, implementation and validation of an automatic speaker recognition system on an iPad tablet. A score normalization approach, referred to as Nearest Neighbor Normalization (3N), is applied in order to improve the baseline speaker verification system. The system is evaluated on the MOBIO corpus, and results show an absolute improvement in HTER of more than 4% when the score normalization is performed. A human-centered interface is implemented for the speaker recognition system, and a survey of 28 users is collected in order to evaluate the application. The results showed that users familiar with touchscreen interfaces found the application easy to learn and use.
Normalization
Touchscreen
Speaker Verification
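For readers unfamiliar with the HTER metric quoted in the abstract above, the half total error rate is conventionally the average of the false acceptance rate and the false rejection rate at a fixed threshold. The sketch below assumes that standard definition and the convention that a trial is accepted when its score is at or above the threshold; the function name and the scores are illustrative and do not come from the paper.

```python
def half_total_error_rate(genuine_scores, impostor_scores, threshold):
    """HTER = (FAR + FRR) / 2 at a fixed decision threshold.

    A trial is accepted when its score is >= threshold (assumed convention).
    """
    # False rejection rate: genuine trials that fall below the threshold.
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    # False acceptance rate: impostor trials that reach the threshold.
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return (far + frr) / 2.0


# Made-up scores: one of three genuine trials is rejected,
# one of four impostor trials is accepted.
genuine = [0.9, 0.8, 0.4]
impostor = [0.1, 0.5, 0.2, 0.3]
print(half_total_error_rate(genuine, impostor, threshold=0.45))
```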
For better decision-making in a speaker verification system, the threshold that minimizes the detection cost function (DCF) is determined. The parameters of the bimodal output score distribution differ from one target speaker model to another, so it is difficult to estimate a common threshold. This paper proposes a novel score normalization, TZ normalization, which combines the traditional zero normalization and test normalization. Then, to obtain better robustness, a new method is introduced to improve the estimated threshold. Text-independent speaker verification experiments on the telephony NIST speaker recognition evaluation corpus show that the new technique yields significant improvements over the traditional techniques.
Normalization
Speaker Verification
NIST
Robustness
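To make the idea of a minimum-DCF threshold in the abstract above concrete, the sketch below sweeps candidate thresholds and keeps the one with the lowest detection cost, using the standard form DCF = C_miss * P_miss * P_target + C_fa * P_fa * (1 - P_target). The cost parameters, function names and scores are illustrative assumptions, not values from the paper, and the sketch does not implement the paper's TZ normalization or its threshold estimation method.

```python
def detection_cost(threshold, target_scores, nontarget_scores,
                   c_miss=1.0, c_fa=1.0, p_target=0.01):
    """Detection cost at one threshold (accept when score >= threshold).

    c_miss, c_fa and p_target are illustrative defaults (assumption).
    """
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


def min_dcf_threshold(target_scores, nontarget_scores, **cost_params):
    """Return (threshold, cost) minimizing the detection cost over all
    observed scores, plus one threshold that rejects every trial."""
    candidates = sorted(set(target_scores) | set(nontarget_scores))
    candidates.append(candidates[-1] + 1.0)  # reject-all threshold
    best = min(
        candidates,
        key=lambda t: detection_cost(t, target_scores, nontarget_scores,
                                     **cost_params),
    )
    return best, detection_cost(best, target_scores, nontarget_scores,
                                **cost_params)


# Made-up, perfectly separable scores.
threshold, cost = min_dcf_threshold([2.0, 3.0], [0.0, 1.0])
print(threshold, cost)
```

When scores are normalized per speaker before this sweep, a single shared threshold becomes more reasonable, which is the motivation the abstract gives for score normalization.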