Denoising of musical audio using sparse linear regression and structured priors

2008 
This paper describes an audio denoising technique based on sparse linear regression with structured priors. The noisy signal is decomposed as a linear combination of atoms belonging to two MDCT bases, plus a residual part containing the noise. One MDCT basis uses long analysis windows, and thus has high frequency resolution, and is aimed at modeling the tonal parts of the signal; the other uses short analysis windows, and thus has fine time resolution, and is aimed at modeling the transient parts (such as note attacks). The problem is formulated within a Bayesian setting. Conditionally upon an indicator variable that is either 0 or 1, each expansion coefficient is either set to zero or given a hierarchical prior. Structured priors are employed for the indicator variables: using two types of Markov chains, persistency along the time axis is favored for the expansion coefficients of the tonal layer, while persistency along the frequency axis is favored for the expansion coefficients of the transient layer. Inference about the denoised signal and model parameters is performed using a Gibbs sampler, a standard MCMC sampling technique. We present results for denoising of a short glockenspiel excerpt and a long polyphonic music excerpt. Our approach is compared with unstructured sparse regression and with structured sparse regression in a single-resolution MDCT basis (no transient layer). The results show that better denoising is obtained, both in SNR measurements and by subjective criteria, when both a transient layer and a tonal layer are used in conjunction with the proposed structured prior framework.
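To make the structured prior more concrete, the following is a minimal sketch of how binary indicator maps with the two kinds of persistency could be drawn from two-state Markov chains. It illustrates the idea only and is not the paper's implementation: the function name `sample_markov_indicators`, the transition probabilities, and the grid sizes are all hypothetical, since the abstract does not specify how the chains are parameterized.

```python
import numpy as np


def sample_markov_indicators(n_chains, length, p_stay_off, p_stay_on, p_init_on, rng):
    """Draw n_chains independent two-state (0/1) Markov chains of the given length.

    All probabilities are illustrative assumptions, not values from the paper.
    """
    gamma = np.zeros((n_chains, length), dtype=int)
    gamma[:, 0] = rng.random(n_chains) < p_init_on
    for t in range(1, length):
        prev = gamma[:, t - 1]
        # Probability of being "on" at step t, given the previous state.
        p_on = np.where(prev == 1, p_stay_on, 1.0 - p_stay_off)
        gamma[:, t] = rng.random(n_chains) < p_on
    return gamma


rng = np.random.default_rng(0)

# Tonal layer (assumed grid: 64 frequency bins x 32 long frames).
# One chain per frequency bin, running along time, so active
# coefficients tend to persist across frames.
tonal = sample_markov_indicators(64, 32, 0.95, 0.9, 0.1, rng)

# Transient layer (assumed grid: 16 frequency bins x 128 short frames).
# One chain per frame, running along frequency, then transposed to
# (frequency, time) so active coefficients persist across bins.
transient = sample_markov_indicators(128, 16, 0.98, 0.9, 0.02, rng).T
```

In a full model, a coefficient whose indicator is 0 would be fixed to zero, while one whose indicator is 1 would receive the hierarchical prior; a Gibbs sampler would then alternate between updating indicators, coefficients, and hyperparameters.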