Development of a large-scale Mandarin Radio Speech Corpus

Yung hsiang Shawn Chang, Yuan-Fu Liao, Sheng-Ming Wang, Jenq-Haur Wang, Sing-yue Wang, Jhih-wei Chen, You-dian Chen

2017

The Taiwan Mandarin Radio Speech Corpus consists of roughly 300 hours (and growing) of audio recordings selected from the archive of Taiwan's National Education Radio (NER). The corpus includes speech from hundreds of speakers in various speech styles (spontaneous conversation and read news). It provides a rich resource for research in speech and automatic speech recognition (ASR). In this paper, we briefly describe how the corpus was developed and report two preliminary experimental results obtained with it.

Keywords:

Speech technology
VoxForge
Natural language processing
Audio mining
Acoustic model
Speech synthesis
Mandarin Chinese
Artificial neural network
Computer science
Speech recognition
Speech corpus
Artificial intelligence
National education
