PITCH ESTIMATION IN AN AUTOMATED SPEAKER RECOGNITION SYSTEM FOR CRITICAL USE

2018 
The article proposes a method for pitch trend estimation that, unlike existing methods, uses a factorial hidden Markov model (FHMM) optimized with the junction tree algorithm. The model generalizes the outputs of pitch state detectors based on deep and recurrent neural networks. This makes it possible to predict the pitch trend accurately from long-term information in packets of speech frames, to describe the dynamics of the pitch in the time domain, and to reduce the influence of noise on the quality of the pitch estimates. Methods for estimating pitch states based on deep and recurrent neural networks, and a method for estimating the pitch trend based on the FHMM, are developed. A study was carried out to optimize the parameters of the proposed methods for use in an automated speaker recognition system for critical use (ASRSCU). In particular, the results make it possible to recommend power-normalized cepstral coefficients as the features for pitch estimation by the proposed methods, frame packets with a length of 10 frames, 1024 neurons in the hidden layers of the neural networks that implement the proposed methods, and 68 states to describe the pitch. Further studies examined how the quality of speaker recognition by the ASRSCU depends on the signal-to-noise ratio (SNR) of the input speech material and on the pitch estimates produced by the developed methods, with parameters optimized according to the results above. They showed that at all SNR levels the most accurate pitch estimate is provided by the FHMM method, with which the ASRSCU achieves a probability of correct speaker recognition of 96 to 99 % on the selected test sample.
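The idea of combining per-frame pitch state detections with a temporal model over a packet of frames can be illustrated with a much simpler stand-in. The sketch below is not the article's FHMM or the junction tree algorithm: it is plain Viterbi decoding over a single chain of pitch states. Only the 68 states and the 10-frame packet come from the abstract. The detector posteriors, the transition prior that favors small jumps, the `penalty` value, the use of posteriors directly as observation scores, and all function names are assumptions made for illustration.

```python
import math

N_STATES = 68   # number of pitch states recommended in the abstract
N_FRAMES = 10   # packet length in frames recommended in the abstract


def make_posterior(peaks, n_states=N_STATES):
    """Build one frame's posterior: given peaks, remaining mass spread evenly."""
    rest = (1.0 - sum(peaks.values())) / (n_states - len(peaks))
    return [peaks.get(s, rest) for s in range(n_states)]


def smooth_transitions(n_states=N_STATES, penalty=1.0):
    """Log transition matrix favoring small state jumps (an assumed prior)."""
    rows = []
    for p in range(n_states):
        weights = [math.exp(-penalty * abs(p - s)) for s in range(n_states)]
        total = sum(weights)
        rows.append([math.log(w / total) for w in weights])
    return rows


def viterbi(log_obs, log_trans, log_init):
    """Return the highest-scoring state path for one packet of frames."""
    n_states = len(log_init)
    score = [log_init[s] + log_obs[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(log_obs)):
        new_score, pointers = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[best] + log_trans[best][s] + log_obs[t][s])
            pointers.append(best)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Made-up detector output: a slowly rising pitch with one noisy frame
# (index 4) where a wrong state (60) narrowly beats the true one (32).
true_track = [30, 30, 31, 31, 32, 32, 33, 33, 34, 34]
posteriors = [make_posterior({s: 0.9}) for s in true_track]
posteriors[4] = make_posterior({60: 0.5, 32: 0.4})

log_obs = [[math.log(p) for p in frame] for frame in posteriors]
log_init = [math.log(1.0 / N_STATES)] * N_STATES

framewise = [frame.index(max(frame)) for frame in posteriors]  # no temporal model
decoded = viterbi(log_obs, smooth_transitions(), log_init)

print("frame-by-frame:", framewise)
print("decoded path:  ", decoded)
```

In this toy packet the frame-by-frame argmax jumps to state 60 at the noisy frame, while the decoded path stays on state 32, because a jump of almost 30 states costs far more under the assumed transition prior than the small difference in detector confidence.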