A new language model using (n ≥ 4)-gram for broadcast news speech transcription

2005 
This paper describes a language model for broadcast news speech transcription that exploits the properties of the announcer's manuscript. In a news program, the announcer often reads a manuscript (the announcer's manuscript) in the studio. This manuscript is produced by hand-editing a draft written by a reporter on a word processor (the reporter's manuscript). Consequently, the announcer's manuscript often contains the same word sequences as the immediately preceding reporter's manuscript. In the proposed language model, this property is modeled as an (n ≥ 4)-gram, realizing adaptation to the immediately preceding reporter's manuscript, while the general language constraint is handled by 2,3-gram modeling. If all (n ≥ 4)-grams were simply memorized, the amount of data would grow with the number of words. The proposed model instead introduces the concept of a word position dictionary, suppressing the amount of data by storing the reporter's manuscript itself as procedural knowledge. In the 2,3-gram modeling, on the other hand, several years of reporter's manuscripts are adapted to the immediately preceding reporter's manuscript in order to reflect the current character of the news; overadaptation is avoided by reducing the linear mixing ratio. The proposed language model is applied to broadcast news and evaluated in terms of perplexity and speech recognition, with satisfactory results. © 2004 Wiley Periodicals, Inc. Syst Comp Jpn, 36(1): 58–67, 2005; Published online in Wiley InterScience. DOI 10.1002/scj.10385
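The two mechanisms the abstract describes, a word position dictionary that stores the manuscript itself instead of enumerating every (n ≥ 4)-gram, and linear interpolation with a reduced mixing ratio, can be sketched roughly as follows. This is a minimal Python illustration, not the paper's implementation; all function names, the toy manuscript, and the mixing ratio value are hypothetical.

```python
from collections import defaultdict

def build_position_dict(manuscript):
    """Index each word of the reporter's manuscript by its positions,
    so the manuscript itself serves as the (n >= 4)-gram memory."""
    positions = defaultdict(set)
    for i, word in enumerate(manuscript):
        positions[word].add(i)
    return positions

def ngram_in_manuscript(ngram, positions):
    """True if the n-gram occurs as a consecutive word sequence in the
    manuscript, looked up via the word position dictionary."""
    return any(all(start + k in positions.get(w, set())
                   for k, w in enumerate(ngram))
               for start in positions.get(ngram[0], set()))

def mix(p_general, p_recent, lam=0.2):
    """Linear interpolation of a general 2,3-gram probability with one
    adapted to recent manuscripts; keeping the mixing ratio lam small
    guards against overadaptation (the value here is illustrative)."""
    return (1 - lam) * p_general + lam * p_recent

# Toy usage: a 4-gram from the manuscript is found via position lookup.
manuscript = "the prime minister visited the city today".split()
pos = build_position_dict(manuscript)
print(ngram_in_manuscript(("prime", "minister", "visited", "the"), pos))
```

The point of the position index is that memory grows linearly with the manuscript length, while arbitrarily long n-gram matches can still be tested by checking for consecutive positions.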