Challenges in Automating Maze Detection

Eric Morley, Anna Eva Hallin, Brian Roark

2014

SALT is a widely used annotation approach for analyzing natural language transcripts of children. Nine annotated corpora are distributed along with scoring software to provide norming data. We explore automatic identification of mazes, SALT’s version of disfluency annotations, and find that cross-corpus generalization is very poor. This surprising lack of cross-corpus generalization suggests substantial differences between the corpora. This is the first paper to investigate the SALT corpora through the lens of natural language processing, and to compare the utility of different corpora collected in a clinical setting for training an automatic annotation system.

Keywords:

Software
Natural language
Natural language processing
Data mining
Computer science
Artificial intelligence
Annotation
Information retrieval

