An Improved Oscillator Method for Modeling Structured Speech

2011 
Modern speech coders use several models in sequence to encode a single frame. The typical sequence consists of a linear predictor for modeling short-scale structure, an adaptive codebook for mid- to long-scale structure, and an algebraic codebook for the remainder. We develop an alternative model, termed the Complete Oscillator Model (COM), which encodes structures on multiple time scales at once. Compared with the linear predictor and adaptive codebook of the Adaptive Multi-Rate standard, the COM yields better-quality models on average while using the same number of parameters. However, its performance is uneven across different types of phonemes, and notably in the transitions from unvoiced to voiced speech. We discuss how the derived performance relates to the fundamental oscillator properties and provide initial schemes for how the proposed method may be used in speech coders. All experiments are performed using sentences from the TIMIT database.