Word Image Matching as a Technique for Degraded Text Recognition

Jonathan J. Hull, Siamak Khoubyari, Tin Kam Ho

1992

A technique is presented that determines equivalences between word images in a passage of text. A clustering procedure first groups visually similar word images. Initial hypotheses for the identities of the words are then generated by matching the word groups to language statistics that predict how frequently certain words occur. A recognition step then assigns identifications to the images in the clusters. This paper concentrates on the clustering algorithm: a clustering technique is presented and its performance on a running text of 1062 word images is determined. The clustering algorithm is shown to correctly locate groups of short function words at a rate better than 95 percent.
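The first two stages described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the image distance, the clustering rule, or the statistics used, so the pixel-mismatch distance, the greedy threshold clustering, the `threshold` value, and the rank-based pairing in `initial_hypotheses` are all assumptions chosen for brevity. Word images are assumed to be non-empty binary bitmaps given as lists of rows.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # binary bitmap: rows of 0/1 pixels (assumed format)


def distance(a: Image, b: Image) -> float:
    """Fraction of differing pixels; 1.0 if the bitmaps differ in size.

    Assumed measure for illustration only, not the paper's matching function.
    """
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return 1.0
    total = len(a) * len(a[0])
    diff = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return diff / total


def cluster(images: List[Image], threshold: float = 0.1) -> List[List[int]]:
    """Greedy clustering (hypothetical rule): an image joins the first cluster
    whose first member is within `threshold`, otherwise it starts a new one.
    Returns clusters as lists of image indices."""
    clusters: List[List[int]] = []
    for i, img in enumerate(images):
        for members in clusters:
            if distance(images[members[0]], img) <= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def initial_hypotheses(
    clusters: List[List[int]], ranked_words: List[str]
) -> List[Tuple[str, List[int]]]:
    """Pair the largest clusters with the most frequent words, in rank order.

    `ranked_words` is an assumed frequency-ordered word list, most frequent first.
    """
    by_size = sorted(clusters, key=len, reverse=True)
    return list(zip(ranked_words, by_size))
```

The pairing reflects the idea that short function words recur often in running text, so the largest clusters are plausible candidates for the most frequent words; a later recognition step would still have to confirm or revise each hypothesis.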

Keywords:

Fuzzy clustering
Cluster analysis
Computer science
Pattern recognition
Artificial intelligence
Text recognition
Speech recognition
Image matching
