Chemical-induced disease relation extraction with various linguistic features

2016 
Table 2 compares the performance of relation extraction at the intra-sentence and inter-sentence levels, as well as the final document-level results, on the development set using gold entity annotations. ‘LEX’, ‘DEP’ and ‘HF’ denote the lexical features, the dependency features and the hypernym filtering step described in Section ‘Methods’. When comparing the different levels of relations, the approach using only the lexical features is regarded as the baseline.

Table 2. Performance on the development dataset

Note that DEP was unavailable at the inter-sentence level, while HF and LEX could be applied at both levels. Post-processing was executed at the document level on the optimal feature combination after the relation merging step, i.e. ‘HF + LEX + DEP’. The table indicates the following:

■ Using only the lexical features, the final F-score already reached 55.3%, and the performance at the intra-sentence level was much higher than at the inter-sentence level. This suggests that lexical features are simple yet more effective for intra-sentence relations than for inter-sentence relations, probably because inter-sentence CID relations span several sentences and therefore have much more complex structures that traditional lexical features cannot capture effectively.

■ Although the performance with the dependency features was slightly lower than with the lexical features, the F-score still reached 60.2%, probably owing to their capability to represent the direct syntactic relationships between entity mentions in a sentence (a toy extraction of such dependency paths is sketched below).

■ On top of the lexical or dependency features, hypernym filtering significantly improved recall at both levels, yielding F-scores of 66.1% and 42.3% at the intra- and inter-sentence levels, respectively. This indicates that filtering the more general negative instances out of the training set allowed more true relation instances to be recalled, supporting our hypothesis in Section ‘Hypernym filtering for training instances’ (this step and the subsequent relation merging are sketched in the second code example below).

■ Combining HF, LEX and DEP, our system achieved its best relation-extraction performance. After merging relations from the mention level to the document level, the F-score reached 59.2%; after post-processing, it rose further to 60.4%. The minor decrease in recall may be caused by false annotations of relations involving more general entities.
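To make the dependency features concrete, the following is a minimal sketch of collecting the shortest dependency path between two entity mentions by walking from each mention to their lowest common ancestor in the parse tree. It uses spaCy with the en_core_web_sm model, which is an assumption for illustration, not the parser used in this work.

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed model; any English parser works

def path_to_root(token):
    """Token followed by its chain of syntactic heads up to the root."""
    chain = [token]
    while chain[-1].head.i != chain[-1].i:   # the root is its own head
        chain.append(chain[-1].head)
    return chain

def dependency_path(doc, i, j):
    """Word/relation labels on the shortest dependency path between
    tokens i and j, joined at their lowest common ancestor (LCA)."""
    up, down = path_to_root(doc[i]), path_to_root(doc[j])
    down_ids = [t.i for t in down]
    k = next(p for p, t in enumerate(up) if t.i in down_ids)   # LCA position
    left = up[:k + 1]                                          # i .. LCA
    right = list(reversed(down[:down_ids.index(up[k].i)]))     # below LCA .. j
    return [f"{t.text}/{t.dep_}" for t in left + right]

doc = nlp("Fatal haemorrhagic myocarditis secondary to cyclophosphamide therapy.")
# Path between 'myocarditis' (token 2) and 'cyclophosphamide' (token 5);
# the exact labels printed depend on the parser.
print(dependency_path(doc, 2, 5))
```

Such a path compactly links the two mentions while skipping the intervening tokens, which is why it can outperform surface lexical context on long sentences.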
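The hypernym-filtering and relation-merging steps can likewise be summarised in a short sketch. The hierarchy below is a toy stand-in for the MeSH parent relation, ‘D999999’ is a made-up placeholder ID, and the function names are invented for this example; the code illustrates the idea rather than our exact implementation.

```python
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]  # (chemical MeSH ID, disease MeSH ID)

def ancestors(concept: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """All transitive hypernyms of `concept` in the hierarchy `parents`."""
    seen: Set[str] = set()
    stack = list(parents.get(concept, ()))
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen

def hypernym_filter(negatives: Set[Pair], positives: Set[Pair],
                    parents: Dict[str, Set[str]]) -> Set[Pair]:
    """Drop negative training pairs whose chemical and disease are each
    identical to, or a hypernym of, those of some positive pair: such
    'negatives' are likely to be unannotated true relations."""
    def generalises(neg: Pair, pos: Pair) -> bool:
        (nc, nd), (pc, pd) = neg, pos
        return (nc == pc or nc in ancestors(pc, parents)) and \
               (nd == pd or nd in ancestors(pd, parents))
    return {n for n in negatives
            if not any(generalises(n, p) for p in positives)}

def merge_to_document(mention_preds: Iterable[Tuple[Pair, bool]]) -> Set[Pair]:
    """Mention level -> document level: keep a (chemical, disease) pair
    if any of its mention-level instances is predicted positive."""
    return {pair for pair, positive in mention_preds if positive}

# Toy data: D999999 is a made-up hypernym of D003645 ('sudden death').
parents = {"D003645": {"D999999"}}
positives = {("D003042", "D003645")}            # cocaine - sudden death
negatives = {("D003042", "D999999"),            # more general disease: filtered
             ("D008694", "D003645")}            # different chemical: kept
print(hypernym_filter(negatives, positives, parents))
```

The pair with the more general disease is removed from the negative set, which is exactly the mechanism credited above with the recall gains.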
To understand why the task is challenging, we closely examined the errors and grouped their causes as follows:

■ For the intra-sentence level:

Lexical sparsity: sentences that describe CID relations using rarely occurring words may not be captured effectively. For instance, in the sentence ‘Fatal haemorrhagic myocarditis secondary to cyclophosphamide therapy.’ (PMID: 11271907), the key clue ‘… secondary to …’ occurs infrequently in the corpus.

Complicated sentence structure: if a sentence has a complicated structure, our method may not extract the CID relations correctly. For instance, in the sentence ‘The epidemiologic findings are most consistent with the hypothesis that chronic cocaine use disrupts dopaminergic function and, when coupled with recent cocaine use, may precipitate agitation, delirium, aberrant thermoregulation, rhabdomyolysis, and sudden death.’ (PMID: 8988571), the relation between ‘cocaine’ (D003042) and ‘sudden death’ (D003645) is true, but the token distance between the mentions is long and conjunction structures intervene.

True relations missing from the annotation: a close analysis of the results shows that some of our false-positive predictions are actually true positives. For instance, in the sentence ‘This increase in aggressiveness was not secondary to METH-induced hyperactivity.’ (PMID: 16192988), our system extracted the relation between ‘METH’ (D008694) and ‘hyperactivity’ (D006948). This relation is not annotated in that document, yet it is annotated in the documents PMID: 15764424 and PMID: 10579464.

Inconsistent annotation: for the same entity, some relations are annotated while others are not. For instance, in the sentence ‘One patient group developed sinus tachycardias in the setting of a massive carbamazepine overdose.’ (PMID: 1728915), the relation between ‘carbamazepine’ (D002220) and ‘overdose’ (D062787) is not annotated; however, in the sentence ‘The possibility of choreoathetoid movements should be considered in patients presenting after pemoline overdose.’ (PMID: 9022662), the relation between ‘pemoline’ (D010389) and ‘overdose’ (D062787) is annotated.

■ For the inter-sentence level:

Discourse inference is needed: this is the most common error type at the inter-sentence level. Inter-sentence relations are expressed across multiple sentences, so discourse inference, including co-reference resolution, is needed to extract them. For instance, in the two sentences ‘Adverse events considered to be related to levofloxacin administration were reported by 29 patients (9%). The most common drug-related adverse events were diarrhea, flatulence, and nausea; most adverse events were mild to moderate in severity.’, the relation between ‘levofloxacin’ (D064704) and ‘flatulence’ (D005414) is true, with the phrase ‘adverse events’ acting as the anchor bridging the two entities (a toy bridging heuristic is sketched after this list).

Inconsistent annotation: for the same entity, some relations are annotated while others are not; this problem mirrors the one at the intra-sentence level.
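As a minimal illustration of why discourse-level cues matter, the sketch below pairs a chemical with diseases from other sentences only when a bridging anchor phrase appears in either sentence. The anchor list and helper names are invented for this example; a real system would need genuine co-reference resolution rather than this string-matching heuristic.

```python
from typing import List, Set, Tuple

# Illustrative bridging cues; a real inventory would be learned or curated.
ANCHORS = {"adverse events", "side effects", "toxicity"}

def inter_sentence_candidates(
        sentences: List[str],
        chemicals: List[Tuple[int, str]],   # (sentence index, MeSH ID)
        diseases: List[Tuple[int, str]]) -> Set[Tuple[str, str]]:
    """Pair a chemical with diseases from a different sentence when one of
    the two sentences contains a bridging anchor phrase."""
    cands = set()
    for ci, chem in chemicals:
        for di, dis in diseases:
            if ci == di:
                continue  # intra-sentence pairs are handled separately
            window = sentences[ci].lower() + " " + sentences[di].lower()
            if any(a in window for a in ANCHORS):
                cands.add((chem, dis))
    return cands

# The levofloxacin example from the error analysis above:
sents = ["Adverse events considered to be related to levofloxacin "
         "administration were reported by 29 patients (9%).",
         "The most common drug-related adverse events were diarrhea, "
         "flatulence, and nausea."]
chems = [(0, "D064704")]        # levofloxacin
dises = [(1, "D005414")]        # flatulence
print(inter_sentence_candidates(sents, chems, dises))
```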