Methods for eliciting, annotating, and analyzing databases for child speech development

2017 
Reviews methods and results from databases of day-long recordings of infants. Describes challenges of building large-scale databases of children's speech. Explicates relationship between ASR error rates and talker age for child speech. Surveys annotation schemas for child speech that do not assume adult segmental models. Illustrates acoustic measures to evaluate child-appropriate speech production models.
Methods from automatic speech recognition (ASR), such as segmentation and forced alignment, have facilitated the rapid annotation and analysis of very large adult speech databases and databases of caregiver-infant interaction, enabling advances in speech science that were unimaginable just a few decades ago. This paper centers on two main problems that must be addressed in order to have analogous resources for developing and exploiting databases of young children's speech. The first problem is to understand and appreciate the differences between adult and child speech that cause ASR models developed for adult speech to fail when applied to child speech. These differences include the fact that children's vocal tracts are smaller than those of adult males and are also changing rapidly in size and shape over the course of development, leading to between-talker variability across age groups that dwarfs the between-talker differences between adult men and women. Moreover, children do not achieve fully adult-like speech motor control until they are young adults, and their vocabularies and phonological proficiency are developing as well, leading to considerably more within-talker variability as well as more between-talker variability. The second problem, then, is to determine what annotation schemas and analysis techniques can most usefully capture relevant aspects of this variability.
Indeed, standard acoustic characterizations applied to child speech reveal that adult-centered annotation schemas fail to capture phenomena such as the emergence of covert contrasts in children's developing phonological systems, while also revealing children's nonuniform progression toward community speech norms as they acquire the phonological systems of their native languages. Both problems point to the need for more basic research into the growth and development of the articulatory system (as well as of the lexicon and phonological system) that is oriented explicitly toward the construction of age-appropriate computational models.