Effects of out of vocabulary words in spoken document retrieval.
2000
The effects of out-of-vocabulary (OOV) items in spoken document retrieval (SDR) are investigated. Several sets of transcriptions were created for the TREC-8 SDR task using a speech recognition system varying the vocabulary sizes and OOV rates, and the relative retrieval performance measured. The effects of OOV terms on a simple baseline IR system and on more sophisticated retrieval systems are described. The use of a parallel corpus for query and document expansion is found to be especially beneficial, and with this data set, good retrieval performance can be achieved even for fairly high OOV rates.