F0 estimation for noisy speech based on exploring local time-frequency segment

2015 
In this paper, we propose a fundamental frequency (F0) estimation algorithm for noisy speech based on exploring local time-frequency (TF) segments. Our algorithm is motivated by the fact that the full-band speech signal is redundant for pitch perception. We assume that, within a given time region, the TF segment least affected by noise interference is more reliable for F0 estimation than the full-band spectrum. Our algorithm consists of two main stages. First, the overall TF plane is divided into overlapping TF segments, and F0 candidates are estimated from each individual TF segment. Second, the optimal F0 value is selected from the F0 candidates based on signal-to-noise ratio (SNR) estimation and dynamic programming. The experimental results show that the proposed algorithm outperforms several non-parametric state-of-the-art F0 estimation techniques.
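The two-stage idea can be illustrated with a minimal single-frame sketch. It is not the paper's implementation: the abstract does not specify the segment layout, the per-segment estimator, or the SNR estimator, so everything below is an assumption made for illustration. The function names (band_f0_candidates, pick_f0, synth_frame), the 1000 Hz band width with 500 Hz hop, the band-limited autocorrelation used as the candidate estimator, and the normalized autocorrelation peak used as a stand-in reliability score are all hypothetical. The dynamic-programming step across frames is omitted.

```python
import numpy as np


def band_f0_candidates(frame, fs, band_width=1000.0, band_hop=500.0,
                       f_max=4000.0, f0_range=(60.0, 400.0)):
    """Return one (band_start_hz, f0_hz, score) tuple per overlapping band.

    Assumed estimator: autocorrelation of the band-limited frame, obtained
    by inverse-transforming the masked power spectrum.
    """
    n = len(frame)
    n_fft = 2 * n  # zero-padding gives a linear (non-circular) autocorrelation
    power = np.abs(np.fft.rfft(frame * np.hanning(n), n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    lag_min = int(fs / f0_range[1])
    lag_max = int(fs / f0_range[0])

    candidates = []
    lo = 0.0
    while lo + band_width <= f_max:
        mask = (freqs >= lo) & (freqs < lo + band_width)
        acf = np.fft.irfft(power * mask, n_fft)
        acf = acf / (acf[0] + 1e-12)  # normalize by the band energy
        lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
        # The normalized peak height is a crude reliability proxy (assumption),
        # used here in place of a real SNR estimate.
        candidates.append((lo, fs / lag, float(acf[lag])))
        lo += band_hop
    return candidates


def pick_f0(candidates):
    """Select the candidate from the band with the highest reliability score."""
    return max(candidates, key=lambda c: c[2])[1]


def synth_frame(fs=16000, n=1024, f0=200.0, seed=0):
    """Synthetic test frame: 20 harmonics plus strong noise below 1 kHz."""
    t = np.arange(n) / fs
    speech = sum(np.cos(2 * np.pi * k * f0 * t) for k in range(1, 21))
    white = np.random.default_rng(seed).standard_normal(n)
    spec = np.fft.rfft(white)
    spec[np.fft.rfftfreq(n, 1.0 / fs) >= 1000.0] = 0.0  # keep only < 1 kHz
    return speech + 20.0 * np.fft.irfft(spec, n)


if __name__ == "__main__":
    cands = band_f0_candidates(synth_frame(), 16000)
    for lo, f0, score in cands:
        print(f"band {lo:6.0f} Hz  f0 {f0:6.1f} Hz  score {score:.2f}")
    print("selected f0:", pick_f0(cands))
```

In this toy setup the noise is confined to the low bands, so the bands above 1 kHz receive the highest scores and the selection falls on one of them. A fuller version would replace the score with an explicit SNR estimate and smooth the per-frame choices over time, as the abstract describes.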