F0 estimation for noisy speech by exploring temporal harmonic structures in local time frequency spectrum segment

2016 
In this paper, we propose a noise-robust F0 estimation approach by exploring the temporal harmonic structures in local time-frequency (TF) spectrum segments. Because speech energy is sparsely distributed on the TF plane, the speech harmonic structures in TF segments with higher speech energy tend to dominate over noise. We therefore derive F0 from such high signal-to-noise ratio (SNR) TF segments rather than from the full-band signal. Our algorithm comprises two stages: i) F0 candidate estimation for a series of TF segments; ii) F0 tracking based on the acoustic features of each TF segment and on F0 temporal continuity constraints. Experimental results show that our approach outperforms the compared methods in terms of F0 estimation accuracy.
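
The two-stage idea can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the segment definition, the candidate estimator, or the tracking features, so the sketch substitutes simple stand-ins. It keeps only the highest-energy frequency bands of each frame as a proxy for high-SNR TF segments, scores candidates by plain harmonic summation, and tracks with a Viterbi search that penalises F0 jumps. All function names, the 500 Hz band width, the number of kept bands, and the jump cost are assumptions chosen for illustration.

```python
import numpy as np

def stft_mag(x, fs, frame_len=1024, hop=256):
    """Magnitude spectrogram (frames x bins) using a Hann window."""
    win = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * win
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def candidate_scores(mag, fs, frame_len, f0_grid,
                     band_hz=500.0, keep=3, n_harm=10):
    """Stage i (stand-in): harmonic-summation score for every F0 candidate,
    using only the `keep` highest-energy bands of each frame."""
    freqs = np.fft.rfftfreq(frame_len, 1.0 / fs)
    band_idx = (freqs // band_hz).astype(int)
    n_bands = band_idx.max() + 1
    scores = np.zeros((mag.shape[0], len(f0_grid)))
    for t, spec in enumerate(mag):
        energy = np.bincount(band_idx, weights=spec ** 2, minlength=n_bands)
        top = np.argsort(energy)[-keep:]          # assumed high-SNR bands
        masked = np.where(np.isin(band_idx, top), spec, 0.0)
        for k, f0 in enumerate(f0_grid):
            harm = f0 * np.arange(1, n_harm + 1)
            harm = harm[harm < fs / 2]
            bins = np.round(harm * frame_len / fs).astype(int)
            scores[t, k] = masked[bins].sum()
    return scores

def track_f0(scores, f0_grid, jump_cost=0.01):
    """Stage ii (stand-in): Viterbi path maximising the summed normalised
    scores minus a penalty proportional to frame-to-frame F0 jumps."""
    norm = scores / (scores.max(axis=1, keepdims=True) + 1e-12)
    penalty = jump_cost * np.abs(f0_grid[:, None] - f0_grid[None, :])
    n_frames, n_cand = norm.shape
    best = norm[0].copy()
    back = np.zeros((n_frames, n_cand), dtype=int)
    for t in range(1, n_frames):
        total = best[:, None] - penalty           # rows: previous, cols: current
        back[t] = total.argmax(axis=0)
        best = total.max(axis=0) + norm[t]
    path = np.empty(n_frames, dtype=int)
    path[-1] = best.argmax()
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return f0_grid[path]

def demo(true_f0=200.0, fs=16000, seed=0):
    """Synthetic harmonic tone in white noise (not real speech)."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(0.5 * fs)) / fs
    clean = sum(np.sin(2 * np.pi * h * true_f0 * t) / h for h in range(1, 11))
    noisy = clean + 0.3 * rng.standard_normal(t.size)
    frame_len = 1024
    f0_grid = np.arange(80.0, 400.0, 2.0)
    mag = stft_mag(noisy, fs, frame_len)
    scores = candidate_scores(mag, fs, frame_len, f0_grid)
    return track_f0(scores, f0_grid)

if __name__ == "__main__":
    print(np.median(demo()))
```

The demo uses a synthetic harmonic tone rather than speech, so it only shows the data flow from band selection to candidate scoring to tracking. It says nothing about the accuracy reported in the paper.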