Discriminative training of acoustic models applied to domains with unreliable transcripts [speech recognition applications]
2005
Training automatic speech recognition (ASR) systems requires transcripts of the training speech data. Obtaining these transcripts is time-consuming and costly, especially in the medical domain. Medical reports, by contrast, are generated as a by-product of the normal medical transcription workflow and are readily available, but they only partially represent the acoustic data. In this paper, we present a method for automatically generating transcripts from these medical reports. In particular, we identify "reliable" regions in the transcript that can be used for training acoustic models. Experiments based on maximum likelihood (ML) training and lattice-based discriminative training with frame filtering are presented. Discriminative training yields word error rate (WER) reductions of 8-15% relative to the baseline.
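The abstract does not say how "reliable" regions are found. As an illustration only, one plausible approach is to align the words of a medical report against a recognizer hypothesis for the same dictation and keep long runs where the two agree. The sketch below assumes exactly that; the function name `reliable_regions`, the `min_len` threshold, and the sample sentences are hypothetical and are not taken from the paper.

```python
from difflib import SequenceMatcher


def reliable_regions(report_words, hyp_words, min_len=3):
    """Return (hyp_start, hyp_end, words) for each run of at least
    min_len consecutive words shared by the report and the hypothesis.

    Assumption: agreement between report and recognizer output is used
    as a proxy for transcript reliability. The paper may use a
    different criterion.
    """
    matcher = SequenceMatcher(a=report_words, b=hyp_words, autojunk=False)
    regions = []
    for block in matcher.get_matching_blocks():
        # block.a / block.b are start indices in report / hypothesis;
        # block.size is the length of the matching run.
        if block.size >= min_len:
            start, end = block.b, block.b + block.size
            regions.append((start, end, hyp_words[start:end]))
    return regions


# Hypothetical example: the report omits fillers and dictated punctuation.
report = "the patient was admitted with chest pain and shortness of breath".split()
hyp = ("uh the patient was admitted with um chest pain and "
       "shortness of breath period").split()

regions = reliable_regions(report, hyp)
for start, end, words in regions:
    print(start, end, " ".join(words))
```

Only the hypothesis spans returned here would be passed on as training material under this assumption; the fillers and the dictated "period", which the report does not contain, fall outside every region.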