InDUS: Incremental Document Understanding System Focus on Document Classification

2018 
Our objective is to propose a Document Understanding System for Digital Mailroom applications that copes with three challenges: (1) processing a full workflow with high accuracy under the constraint of partial training; (2) requiring minimal configuration work from expert users; and (3) incrementally adapting the system in quasi real time to continuously maximize recall. We describe an end-to-end system based on existing incremental algorithms for both document classification and field extraction. In this paper, however, we focus on document classification. The main contribution is an adaptation of the Incremental Growing Neural Gas (A2ING) with a dynamic incremental feature vector. In addition, a generic framework automatically selects textual descriptors based on their performance. A quality assessment step drives the convergence of A2ING and controls the system's accuracy.
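To make the idea of a dynamic incremental feature vector concrete, the following is a minimal sketch only. It is not A2ING and not the authors' implementation: it swaps in a much simpler nearest-prototype learner, and every name in it (`IncrementalPrototypeClassifier`, `distance_threshold`, the bag-of-words descriptor) is a hypothetical choice made for illustration. It shows one plausible way a classifier can learn one document at a time while its feature space grows as new terms appear.

```python
import math
from collections import Counter


class IncrementalPrototypeClassifier:
    """Toy nearest-prototype learner (an assumption, NOT the paper's A2ING)."""

    def __init__(self, distance_threshold=1.0):
        self.vocab = {}        # term -> feature index; grows as new terms arrive
        self.prototypes = []   # each entry is [vector, label, update_count]
        self.distance_threshold = distance_threshold  # hypothetical parameter

    def _vectorize(self, text, grow):
        """Return an L2-normalized bag-of-words vector over the current vocab."""
        counts = Counter(text.lower().split())
        if grow:
            for term in counts:
                self.vocab.setdefault(term, len(self.vocab))
        vec = [0.0] * len(self.vocab)
        for term, n in counts.items():
            if term in self.vocab:
                vec[self.vocab[term]] = float(n)
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]

    def _pad_prototypes(self):
        """Zero-pad stored prototypes so they match the grown feature space."""
        dim = len(self.vocab)
        for proto in self.prototypes:
            proto[0].extend([0.0] * (dim - len(proto[0])))

    @staticmethod
    def _distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def _nearest(self, vec):
        return min(self.prototypes,
                   key=lambda p: self._distance(p[0], vec),
                   default=None)

    def predict(self, text):
        """Label of the closest prototype, or None before any training."""
        if not self.prototypes:
            return None
        vec = self._vectorize(text, grow=False)
        return self._nearest(vec)[1]

    def learn(self, text, label):
        """Update the model with a single labeled document."""
        vec = self._vectorize(text, grow=True)
        self._pad_prototypes()
        nearest = self._nearest(vec)
        too_far = (nearest is not None
                   and self._distance(nearest[0], vec) > self.distance_threshold)
        if nearest is None or nearest[1] != label or too_far:
            # No suitable prototype yet: create a new one for this document.
            self.prototypes.append([vec, label, 1])
        else:
            # Move the matching prototype toward the document (running mean).
            nearest[2] += 1
            rate = 1.0 / nearest[2]
            nearest[0][:] = [p + rate * (v - p) for p, v in zip(nearest[0], vec)]


clf = IncrementalPrototypeClassifier()
clf.learn("invoice total amount due", "invoice")
clf.learn("dear sir complaint about service", "complaint")
print(clf.predict("amount due on invoice"))
print(clf.predict("complaint about the service"))
```

In this sketch, terms unseen at prediction time are simply ignored, and the vocabulary only grows during `learn`. A real system in the spirit of the abstract would also need the descriptor selection and quality assessment steps, which are not modeled here.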