Cross-Channel Spectral Subtraction for meeting speech recognition

Yu Nasu, Koichi Shinoda, Sadaoki Furui

2011

We propose Cross-Channel Spectral Subtraction (CCSS), a source separation method for recognizing meeting speech in which each speaker has a dedicated microphone. The method adapts quickly to changes in transfer functions and uses spectral subtraction to suppress the speech of other speakers. Compared with conventional source separation methods based on independent component analysis (ICA) or binary masks, it has a lower computational cost and produces speech signals with less distortion. In a recognition task on computer-simulated, partially overlapped speech, CCSS improved word accuracy from 66.5% to 77.7%. It also significantly improved recognition accuracy on speech recorded in actual meetings.
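
The abstract does not give the CCSS equations, so the following is only a minimal sketch of the general idea of subtracting other speakers' spectra from a target microphone's spectrum, not the authors' algorithm. It assumes one frame of magnitude spectra per microphone and a fixed, hypothetical leakage gain per interfering channel. The function name, the `gains` values, and the `floor` parameter are all illustrative assumptions. In particular, the sketch does not model the adaptation to changing transfer functions that the abstract describes.

```python
def cross_channel_subtract(target_mag, other_mags, gains, floor=0.01):
    """Subtract scaled interfering-channel magnitudes from a target channel.

    target_mag: magnitude spectrum (one frame) from the target speaker's microphone.
    other_mags: list of magnitude spectra (same frame) from the other microphones.
    gains:      one leakage gain per other channel (hypothetical fixed values;
                a real system would estimate and update these over time).
    floor:      fraction of the original magnitude kept as a spectral floor,
                so that over-subtraction never yields a negative magnitude.
    """
    if len(other_mags) != len(gains):
        raise ValueError("need exactly one gain per interfering channel")
    if any(len(m) != len(target_mag) for m in other_mags):
        raise ValueError("all spectra must have the same number of bins")

    cleaned = []
    for k, x in enumerate(target_mag):
        # Estimated cross-talk in frequency bin k: sum of the scaled
        # magnitudes observed on the other speakers' microphones.
        leak = sum(g * mags[k] for g, mags in zip(gains, other_mags))
        # Spectral subtraction with flooring.
        cleaned.append(max(x - leak, floor * x))
    return cleaned


# Illustrative values only: one interfering channel with an assumed gain of 0.3.
target = [1.0, 0.5, 0.2]
others = [[0.0, 1.0, 1.0]]
result = cross_channel_subtract(target, others, gains=[0.3])
```

Bins where the interfering microphone is silent pass through unchanged, while bins where the estimated cross-talk exceeds the observed magnitude fall back to the floor instead of going negative.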

Keywords:

Distortion
Speech recognition
Speech enhancement
Speech processing
Independent component analysis
Voice activity detection
Subtraction
Artificial intelligence
Source separation
Pattern recognition
Microphone
Computer science
Binary number
