Computer Program to Aid in the Selection and Evaluation of Vocabularies for Word Recognizers
Citations: 0 · References: 0 · Related Papers: 10
Abstract:
The performance of a word recognizer depends upon the vocabulary that it attempts to recognize. A computer program has been written to aid in the selection and evaluation of vocabularies for word recognizers. The program operates by predicting the performance of a given recognizer on a given vocabulary. The words in the vocabulary are treated as strings of phonemes, and it is assumed that the word recognizer being evaluated operates in terms of such strings. The program searches the vocabulary for word pairs that would be confused with each other if the recognizer were unable to make certain discriminations. When a confusable pair is found, it may be added to a list of confusions for later printing, or one word of the pair may be deleted from the vocabulary. The latter mode of operation yields a reduced vocabulary on which the recognizer should perform well. Results are presented for a vocabulary of about 1000 common words.

A word has many senses, and each sense can be mapped to many target words. Therefore, to select a translation with the correct sense, the sense of a source word should be disambiguated before a target word is selected. Based on this observation, we propose a hybrid method for translation selection that combines disambiguation of the source word's sense with selection of a target word. Knowledge for translation selection is extracted from a bilingual dictionary and target-language corpora. By dividing translation selection into these two sub-problems, we make knowledge acquisition straightforward and select more appropriate target words.
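The two-stage method in the preceding related abstract might look like the following sketch. The data structures (sense_inventory, sense_lexicon) are hypothetical stand-ins for the knowledge extracted from a bilingual dictionary and target-language corpora, not the authors' actual representation:

```python
def select_translation(source_word, context, sense_inventory, sense_lexicon):
    """Two-stage translation selection (hypothetical data structures).

    sense_inventory: word -> {sense: set of context clue words}, built
    from a bilingual dictionary; sense_lexicon: (word, sense) -> list of
    (target_word, corpus_score) pairs from target-language corpora.
    """
    # Stage 1: disambiguate the source word by context-clue overlap.
    def overlap(sense):
        return len(sense_inventory[source_word][sense] & set(context))

    sense = max(sense_inventory[source_word], key=overlap)
    # Stage 2: pick the target word the corpora score highest for that sense.
    candidates = sense_lexicon[(source_word, sense)]
    return max(candidates, key=lambda tc: tc[1])[0]
```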
Topics: Word Sense Disambiguation, SemEval · Citations: 7
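Returning to the main abstract: its confusion search can be sketched as follows, assuming words arrive as strings of phoneme symbols and the recognizer's missing discriminations are modeled as a set of indistinguishable phoneme pairs. All names here are hypothetical, and only same-length words are compared, which is a simplification of the program described:

```python
# Phoneme pairs the hypothetical recognizer cannot tell apart.
INDISTINGUISHABLE = {frozenset(("m", "n")), frozenset(("p", "b"))}

def confusable(word_a, word_b):
    """True if two phoneme strings collapse to the same form once
    indistinguishable phonemes are merged."""
    if len(word_a) != len(word_b):
        return False
    return all(
        a == b or frozenset((a, b)) in INDISTINGUISHABLE
        for a, b in zip(word_a, word_b)
    )

def reduce_vocabulary(vocab):
    """Drop one word of every confusable pair, keeping earlier entries,
    yielding a reduced vocabulary the recognizer should handle well."""
    kept = []
    for word in vocab:
        if not any(confusable(word, k) for k in kept):
            kept.append(word)
    return kept

# Words as tuples of phoneme symbols (a toy vocabulary).
vocab = [("m", "ae", "n"), ("p", "ae", "n"), ("b", "ae", "n")]
print(reduce_vocabulary(vocab))  # "b ae n" is dropped: p/b are merged
```

The other mode described in the abstract, printing the confusions instead of deleting words, would simply collect the pairs found by confusable rather than filtering the list.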
Topics: Word list · Citations: 22
Recent work at Bell Laboratories has shown that statistical clustering techniques could be used to provide a reliable set of reference templates for a speaker-independent isolated-word recognition system. The vocabulary on which the system was tested consisted of the 26 letters of the alphabet, the 10 digits (0 to 9), and 3 command words. Since this vocabulary contained a large number of acoustically similar words (e.g., b, c, d, e, g, p, t, v, z), the recognition accuracy on the top candidate was only about 80 percent. In this paper, results are presented for a considerably less difficult 54-word vocabulary of computer terms. Recognition accuracies of 95 to 98 percent were obtained across a wide variety of talkers. These results tend to support the hypothesis that carefully trained speaker-independent word recognizers can perform essentially as well as casually trained speaker-independent systems.
Topics: Word error rate · Citations: 43
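The abstract above does not specify the clustering procedure, so the following is only a sketch of the general idea using plain k-means, assuming each training token of a word has been reduced to a fixed-length feature vector:

```python
import numpy as np

def cluster_templates(tokens, k=2, iters=20, seed=0):
    """Cluster training tokens of one word into k reference templates.

    tokens: (n_tokens, dim) array of fixed-length feature vectors from
    many talkers. Returns k centroids, used as speaker-independent
    reference templates for that word.
    """
    tokens = np.asarray(tokens, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = tokens[rng.choice(len(tokens), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each token to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(tokens[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty.
        for j in range(k):
            members = tokens[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids
```

At recognition time, an unknown token would be compared against every word's templates and assigned to the word owning the nearest one.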
Utterance verification is used in variable-vocabulary word recognition to reject words that are not in the vocabulary or are not correctly recognized. Utterance verification is very important for designing a user-friendly speech recognition system. We propose a new utterance verification algorithm that requires no additional training and is based on the minimum verification error. First, using a PBW (Phonetically Balanced Words) database of 445 words, we generate anti-phoneme models without additional training. Then, for OOV (Out-Of-Vocabulary) rejection, a new confidence measure is designed that uses the likelihood ratio between the phoneme model and the anti-phoneme model. Using the proposed anti-phoneme models and confidence measure, we achieve a significant performance improvement: CA (Correct Acceptance of in-vocabulary words) is about 89% and CR (Correct Rejection of OOV words) is about 90%, an improvement of about 15-21% in ERR (Error Reduction Rate) in the vocabulary-independent case.
Topics: Utterance, Word error rate · Citations: 2
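A minimal sketch of the likelihood-ratio confidence measure described above, assuming the recognizer already provides per-phoneme log-likelihoods under the phoneme models and the anti-phoneme models (the function names and the averaging choice are mine):

```python
def confidence(loglik_phoneme, loglik_anti):
    """Average per-phoneme log-likelihood ratio over the utterance.

    loglik_phoneme / loglik_anti: per-phoneme log-likelihoods under the
    phoneme models and the corresponding anti-phoneme models.
    """
    ratios = [lp - la for lp, la in zip(loglik_phoneme, loglik_anti)]
    return sum(ratios) / len(ratios)

def verify(loglik_phoneme, loglik_anti, threshold=0.0):
    """Accept the recognized word only if its confidence clears a
    threshold; the threshold would be tuned for the CA/CR trade-off."""
    return confidence(loglik_phoneme, loglik_anti) >= threshold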
Topics: Yardstick, British National Corpus · Citations: 16
This work was an initial effort in the use of voice data entry for information data handling. The objective was to develop the technology for a large-vocabulary (1000-word) isolated word recognition system capable of quick adaptation and high accuracy for a limited number of people. Techniques for word boundary detection, noise suppression, and frequency scaling were examined. Tests were conducted on a 1000-word and a 100-word unstructured vocabulary. Recognition accuracies of 30.5% and 66% were obtained for the untrained case, and 62.4% and 90% after training each word once.
Citations: 2
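Of the techniques listed above, word boundary detection is the easiest to illustrate. Below is a sketch of classic short-time-energy endpointing, not the report's actual method; the frame size and threshold are arbitrary assumptions:

```python
import numpy as np

def detect_word_boundaries(samples, rate=8000, frame_ms=20, threshold_db=-35.0):
    """Return (start, end) sample indices of the word via an energy gate.

    Frames whose energy, relative to the loudest frame, exceeds
    threshold_db are treated as speech; the word spans the first through
    the last such frame. Returns None if no frame clears the gate.
    """
    frame_len = int(rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    frames = np.asarray(samples[: n_frames * frame_len], dtype=float)
    frames = frames.reshape(n_frames, frame_len)
    energy = (frames ** 2).sum(axis=1)
    db = 10.0 * np.log10(energy / (energy.max() + 1e-12) + 1e-12)
    speech = np.nonzero(db > threshold_db)[0]
    if len(speech) == 0:
        return None
    return speech[0] * frame_len, (speech[-1] + 1) * frame_len
```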
The modifications made to a connected word speech recognition algorithm based on hidden Markov models (HMMs) which allow it to recognize words from a predefined vocabulary list spoken in an unconstrained fashion are described. The novelty of this approach is that statistical models of both the actual vocabulary words and the extraneous speech and background are created. An HMM-based connected word recognition system is then used to find the best sequence of background, extraneous speech, and vocabulary word models matching the actual input. Word recognition accuracies of 99.3% on purely isolated speech (i.e., only vocabulary items and background noise were present) and 95.1% when the vocabulary word was embedded in unconstrained extraneous speech were obtained for the five-word vocabulary using the proposed recognition algorithm.
Citations: 407
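The decoding idea above, finding the best sequence of filler and vocabulary word models, can be caricatured with a much simpler dynamic program. This sketch collapses each model to a per-frame log-likelihood and assumes exactly one keyword occurrence, which a real HMM Viterbi decoder would not require:

```python
import numpy as np

def best_keyword_span(ll_keyword, ll_filler):
    """Locate the vocabulary word inside a frame sequence.

    ll_keyword / ll_filler: per-frame log-likelihoods of each frame under
    the keyword model and under a filler model (extraneous speech plus
    background). Scores the segmentation filler | keyword | filler for
    every span and returns (start, end, score) of the best one.
    """
    n = len(ll_keyword)
    fill = np.concatenate(([0.0], np.cumsum(ll_filler)))   # filler prefix sums
    key = np.concatenate(([0.0], np.cumsum(ll_keyword)))   # keyword prefix sums
    best_score, best_span = -np.inf, (0, 0)
    for s in range(n):
        for e in range(s + 1, n + 1):
            score = fill[s] + (key[e] - key[s]) + (fill[n] - fill[e])
            if score > best_score:
                best_score, best_span = score, (s, e)
    return best_span[0], best_span[1], best_score
```

Running this for every vocabulary word and keeping the highest-scoring span gives a crude version of the recognizer's word-plus-filler decoding.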
In practical large-vocabulary speech recognition systems, it is nearly impossible for a speaker to remember which words are in the vocabulary, so the probability of the speaker using words outside the vocabulary can be quite high. When a speaker uses a new word, current systems will always recognize some other word within the vocabulary in its place, and the speaker will not know what the problem is. In this paper, we describe a preliminary investigation of techniques that automatically detect when the speaker has used a word that is not in the vocabulary. We developed a technique that uses a general model for the acoustics of any word to recognize the existence of new words. Using this general word model, we measure the correct detection of new words versus the false alarm rate. Experiments were run on the DARPA 1000-word Resource Management Database for continuous speech recognition, using the BBN BYBLOS continuous speech recognition system (Chow et al., 1987). The preliminary results indicate a detection rate of 74% with a false alarm rate of 3.4%.
Topics: Word error rate · Citations: 9
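The detection rule implied by the abstract above reduces to comparing the best in-vocabulary score against the general word model's score. A minimal sketch, assuming both scores are log-likelihoods (the margin parameter is my addition for tuning):

```python
def detect_new_word(vocab_scores, general_score, margin=0.0):
    """Flag a spoken word as out-of-vocabulary.

    vocab_scores: dict mapping each vocabulary word to its acoustic
    log-likelihood; general_score: log-likelihood under a general
    "any word" model. If the general model wins by more than margin,
    report a new word (None); otherwise return the best vocabulary word.
    """
    best_word = max(vocab_scores, key=vocab_scores.get)
    if general_score - vocab_scores[best_word] > margin:
        return None  # new (out-of-vocabulary) word detected
    return best_word
```

Sweeping margin over held-out data traces out the detection-rate versus false-alarm trade-off the abstract reports.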
Small-vocabulary isolated word recognition systems, as well as large-vocabulary continuous speech recognition systems, are applicable in many areas. For isolated word recognition systems to be deployable in actual applications, the ability to reject out-of-vocabulary words is required. This paper presents a rejection method that uses clustered phoneme models combined with postprocessing by likelihood ratio scoring. Our baseline speech recognizer was based on whole-word continuous HMMs. Six clustered phoneme models were generated from the 45 context-independent Korean phoneme models using a statistical monophone clustering algorithm; the phoneme models were trained on a phonetically balanced Korean speech database. The performance of this method was assessed in terms of the out-of-vocabulary rejection rate and the accuracy on the predefined vocabulary. A performance test on a speaker-independent isolated word recognition task over 22 section names shows that this method is superior to the conventional postprocessing method, which performs rejection according to the likelihood difference between the first and second candidates. Furthermore, these clustered phoneme models do not require retraining for other isolated word recognition systems with different vocabulary sets.
Topics: Word error rate · Citations: 0
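The paper's statistical monophone clustering is not spelled out in the abstract, so here is a generic greedy agglomerative sketch of reducing 45 phoneme models to 6 clusters; the distance function is a hypothetical placeholder for whatever model dissimilarity the authors used:

```python
def cluster_phonemes(models, distance, n_clusters=6):
    """Greedy agglomerative clustering of phoneme models.

    models: dict mapping phoneme name -> model parameters.
    distance: hypothetical function giving a dissimilarity between two
    clusters of models. Repeatedly merges the closest pair of clusters
    until only n_clusters remain.
    """
    clusters = [[name] for name in models]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j], models)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters
```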
This report presents our second-place solution to the ECCV 2022 challenge on Out-of-Vocabulary Scene Text Understanding (OOV-ST): Cropped Word Recognition. This challenge is held in the context of the ECCV 2022 workshop on Text in Everything (TiE), and aims to extract out-of-vocabulary words from natural scene images. In the competition, we first pre-train SCATTER on synthetic datasets, then fine-tune the model on the training set with data augmentation. Meanwhile, two additional models are trained specifically for long and vertical texts. Finally, we combine the outputs of models with different layers, different backbones, and different seeds as the final result. Our solution achieves a word accuracy of 59.45% when considering out-of-vocabulary words only.
Topics: Training set · Citations: 0
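The abstract above does not state the fusion rule, so the following is only a simple confidence-weighted vote over the per-model word hypotheses, one plausible way to combine outputs from models with different backbones and seeds:

```python
from collections import defaultdict

def combine_predictions(predictions):
    """Combine word predictions from several recognizers.

    predictions: list of (word, confidence) pairs, one per model.
    Confidences for identical hypotheses are summed and the
    highest-scoring word wins (a weighted vote, not necessarily the
    authors' exact fusion rule).
    """
    votes = defaultdict(float)
    for word, conf in predictions:
        votes[word] += conf
    return max(votes, key=votes.get)

# Example: two of three models agree on "hello".
print(combine_predictions([("hello", 0.9), ("he1lo", 0.6), ("hello", 0.7)]))
```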