JIGSAW: Structuring Text into Tables

2019 
We present JIGSAW, an end-to-end, query-driven system that efficiently generates structured tables from unstructured documents. First, we describe how to quickly retrieve sentences that support structured queries describing the table schema. Second, we describe how to estimate table cell values from document context when such values cannot be retrieved. Third, we describe how to link similar rows, then rank and diversify them, to generate high-quality tables. We show that JIGSAW can generate tables from 25 million documents within seconds.