High-Speed Identification of Language and Script

2007 
Humans communicate with text in thousands of languages, in dozens of scripts, and in a wide variety of binary codes. The language, script, and code of this text must be identified to enable follow-on processing such as transcoding, translation, transliteration, routing, and prioritization. This paper describes the implementation of real-time language and script identification on high-speed hardware (principally a ternary content-addressable memory) capable of processing network data streams at several gigabits per second.