Japanese document recognition and retrieval system using programmable SIMD processor

Sueharu Miyahara, Akira Suzuki, Shunkichi Tada, Takahiko Kawatani

1991

This paper describes a new, efficient information-filing system for large numbers of documents. The system is designed to recognize Japanese characters and perform full-text searches across a document database. The key components of the system are a small, fully programmable parallel processor for both recognition and retrieval, an image scanner for document input, and a personal computer as the operator console. The processor is built on a bit-serial single-instruction, multiple-data-stream (SIMD) architecture, and all components, including the 256 processor elements and 11 MB of RAM, are integrated on one board. The recognition process divides a document into text lines, isolates each character, extracts character pattern features, and then identifies character categories. The entire process is performed by a single microprogram package downloaded from the console. The recognition accuracy is more than 99.0% for about 3 printed Japanese characters at a speed of more than 14 characters per second. The processor can also be used for high-speed information retrieval by changing the downloaded microprogram package. The retrieval process finds sentences that contain the same information as an inquiry text in the database previously created through character recognition. Retrieval is fast: 20 million individual Japanese characters are examined each second when the database is stored in the processor's IC memory. It was confirmed that a high-performance but flexible and cost-effective document-information-processing system
