Development of a large-scale Mandarin Radio Speech Corpus

2017 
The Taiwan Mandarin Radio Speech Corpus consists of roughly 300 hours (and growing) of audio recordings selected from the archive of Taiwan's National Education Radio (NER). The corpus includes speech from hundreds of speakers in various speech styles (spontaneous conversation and read news). It provides a rich resource for research in speech and automatic speech recognition (ASR). In this paper, we briefly introduce the corpus development approach and report two preliminary experimental results obtained with this corpus.