Various Legal Factors Extraction Based on Machine Reading Comprehension

2021 
With the rapid growth in the number of legal cases, professionals are under pressure to read lengthy documents and pick out the informative passages in limited time. Most existing techniques focus on simple legal information extraction tasks, such as finding the name or address of the prosecutor or the defendant, which can be accomplished easily with handcrafted patterns or sequence labeling methods. Complicated texts, however, challenge both pattern-based methods and sequence labeling approaches: such texts may state the same facts or describe the same events without sharing common or similar patterns. In this paper, we design a unified framework to extract legal information in various formats, including information that can be extracted directly (a span of text) and information that needs to be deduced. The framework follows the methodology used to answer questions in machine reading comprehension (MRC) tasks. We treat the labels of the facts to be extracted as the counterpart of questions in an MRC task and propose several strategies to represent them. We construct several datasets covering different types of cases for training and testing. Our best strategy achieves an improvement of up to 4% in F1 score on each dataset compared to the MRC baseline.
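The idea of treating fact labels as MRC questions can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the paper's implementation: the `MRCExample` type, the two label-representation strategies (`keyword` and `question`), the question template, and the sample labels are all hypothetical. The sketch only shows how one case description can be paired with one query per label, so that a single MRC model could handle both span-type and deduced-type facts.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class MRCExample:
    """One (query, context) pair to be fed to an MRC model."""
    query: str
    context: str
    answer_type: str  # "span" (copied from the text) or "deduced" (inferred)


def label_as_keyword(label: str) -> str:
    # Hypothetical strategy 1: use the raw label text as the query.
    return label


def label_as_question(label: str) -> str:
    # Hypothetical strategy 2: wrap the label in a natural-language question.
    return f"What is the {label} in this case?"


# Registry of label-representation strategies (names are assumptions).
STRATEGIES: Dict[str, Callable[[str], str]] = {
    "keyword": label_as_keyword,
    "question": label_as_question,
}


def build_examples(
    labels: List[Tuple[str, str]], context: str, strategy: str = "question"
) -> List[MRCExample]:
    """Pair the case text with one query per (label, answer_type) entry."""
    to_query = STRATEGIES[strategy]
    return [MRCExample(to_query(name), context, kind) for name, kind in labels]


if __name__ == "__main__":
    case_text = "The defendant borrowed 5,000 yuan and has not repaid it."
    fact_labels = [("loan amount", "span"), ("repayment status", "deduced")]
    for example in build_examples(fact_labels, case_text):
        print(example.answer_type, "|", example.query)
```

Keeping the strategies in a small registry makes it straightforward to compare different label representations against the same contexts, which is the kind of comparison the abstract describes.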