Method for identifying textual forms in a digital document, and associated method and system for determining contextual information

2014 
The present invention relates to a computer-implemented method (42) for identifying textual forms (2j) relating to at least one field in a digital document (3). The computer comprises at least one processor and at least one memory storing an application (18) which, when executed by said at least one processor, implements the method (42). The digital document (3) is stored within the computer and comprises text containing a set of characters in natural language. Each textual form is defined as a subset of characters associated with a same type, said type being related to a particular field and representing the general nature of the textual form in this field. The method comprises: • a step (44) of extracting the textual forms (2i) relating to all the fields from the text of the digital document (3), by applying a set of executable regular-language rules (29i) to the text of the digital document (3), and • a step (46) of selecting, from among the extracted textual forms (2i), the textual forms (2d) relating to a predetermined subset of said fields, by detecting, for each extracted textual form (2i), the type representative of said form (2i).
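The two steps of the abstract (extraction with regular-language rules, then selection by type) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and does not come from the patent: the rule set `RULES`, the type names, the `TYPE_TO_FIELD` mapping, and the function names `extract_forms` and `select_forms` are all hypothetical, and ordinary regular expressions stand in for the executable rules (29i).

```python
import re
from typing import NamedTuple


class TextualForm(NamedTuple):
    text: str       # the matched subset of characters
    form_type: str  # the type associated with the form
    start: int      # offset of the match in the document text


# Hypothetical rule set: one regular expression per type (illustrative only).
RULES = [
    ("DATE", re.compile(r"\b\d{4}-\d{2}-\d{2}\b")),
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("AMOUNT", re.compile(r"\b\d+(?:\.\d{2})? (?:EUR|USD)\b")),
]

# Hypothetical mapping from each type to the field it belongs to.
TYPE_TO_FIELD = {"DATE": "calendar", "EMAIL": "contact", "AMOUNT": "finance"}


def extract_forms(text: str) -> list[TextualForm]:
    """Extraction step: apply every rule to the text, whatever the field."""
    forms = [
        TextualForm(match.group(), form_type, match.start())
        for form_type, pattern in RULES
        for match in pattern.finditer(text)
    ]
    return sorted(forms, key=lambda form: form.start)


def select_forms(forms: list[TextualForm], fields: set[str]) -> list[TextualForm]:
    """Selection step: keep the forms whose type belongs to a chosen field."""
    return [form for form in forms if TYPE_TO_FIELD.get(form.form_type) in fields]


if __name__ == "__main__":
    document = "Invoice sent to jane.doe@example.com on 2014-03-05 for 120.50 EUR."
    extracted = extract_forms(document)
    for form in select_forms(extracted, {"calendar", "finance"}):
        print(form.form_type, form.text)
```

In this sketch the extraction step is deliberately field-agnostic: every rule runs over the whole text, and the restriction to a predetermined subset of fields happens only in the selection step, by looking up the type detected for each extracted form.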