A public domain speech-to-text system.

Mark Ordowski, Neeraj Deshmukh, Aravind Ganapathiraju, Jonathan Hamaker, Joseph Picone

1999

The lack of freely available state-of-the-art Speech-to-Text (STT) software has been a major hindrance to the development of new audio information processing technology. The high cost of the infrastructure required to conduct state-of-the-art speech recognition research prevents many small research groups from evaluating new ideas on large-scale tasks. In this paper, we present the core components of a freely available state-of-the-art STT system: an acoustic processor, which converts the speech signal into a sequence of feature vectors; a training module, which estimates the parameters of a Hidden Markov Model; a linguistic processor, which predicts the next word given a sequence of previously recognized words; and a search engine, which finds the most probable word sequence given a set of feature vectors.
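
As an illustration of how the linguistic processor and the search engine interact, the following is a minimal sketch and not the system's actual implementation. It assumes a toy setting in which each speech segment corresponds to exactly one word, the acoustic scores are already given as log-likelihoods, and the language model is a bigram. The function name `viterbi`, the three-word vocabulary, and all probabilities are hypothetical values chosen for the example; a real recognizer searches over HMM states frame by frame.

```python
import math


def viterbi(acoustic_scores, bigram_logp, start_logp):
    """Return the most probable word sequence and its log score.

    acoustic_scores: list of dicts, one per segment,
        mapping word -> log P(features | word)
    bigram_logp: dict mapping (prev_word, word) -> log P(word | prev_word)
    start_logp: dict mapping word -> log P(word is first)
    """
    words = list(start_logp)
    # best[w] = (log score of the best path ending in w, that path)
    best = {w: (start_logp[w] + acoustic_scores[0][w], [w]) for w in words}
    for scores in acoustic_scores[1:]:
        new_best = {}
        for w in words:
            # Pick the predecessor that maximizes path score plus transition.
            prev, (prev_score, prev_path) = max(
                best.items(),
                key=lambda item: item[1][0] + bigram_logp[(item[0], w)],
            )
            total = prev_score + bigram_logp[(prev, w)] + scores[w]
            new_best[w] = (total, prev_path + [w])
        best = new_best
    score, path = max(best.values(), key=lambda sp: sp[0])
    return path, score


# Hypothetical bigram language model: P(next word | previous word).
lm = {
    "the": {"the": 0.1, "speech": 0.6, "beach": 0.3},
    "speech": {"the": 0.5, "speech": 0.25, "beach": 0.25},
    "beach": {"the": 0.5, "speech": 0.25, "beach": 0.25},
}
bigram_logp = {(p, w): math.log(q) for p, row in lm.items() for w, q in row.items()}
start_logp = {"the": math.log(0.8), "speech": math.log(0.1), "beach": math.log(0.1)}

# Hypothetical acoustic likelihoods for two segments (made-up numbers).
acoustic_scores = [
    {"the": math.log(0.9), "speech": math.log(0.05), "beach": math.log(0.05)},
    {"the": math.log(0.02), "speech": math.log(0.45), "beach": math.log(0.53)},
]

path, score = viterbi(acoustic_scores, bigram_logp, start_logp)
print(path, math.exp(score))
```

In this toy input the second segment is acoustically slightly closer to "beach", but the language model's preference for "speech" after "the" outweighs it, so the search returns "the speech".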

Keywords:

Speech processing
Information processing
Speech recognition
Artificial intelligence
Acoustic model
Feature vector
Pattern recognition
Speech analytics
Language model
Computer science
Search engine
Hidden Markov model
Public domain
Software
Natural language processing
