Various criteria in the evaluation of biomedical named entity recognition

Richard Tzong-Han Tsai, Shih-Hung Wu, Wen-Chi Chou, Yu Chun Lin, Ding He, Jieh Hsiang, Ting-Yi Sung, Wen-Lian Hsu

2006

Background: Text mining in the biomedical domain is receiving increasing attention. A key component of this process is named entity recognition (NER). Two annotated corpora, GENIA and GENETAG, are most frequently used for training and testing biomedical named entity recognition (Bio-NER) systems. JNLPBA and BioCreAtIvE are two major Bio-NER tasks using these corpora. The two tasks take different approaches to corpus annotation and use different matching criteria to evaluate system performance. This paper details these differences and describes alternative criteria. We then examine the impact of different criteria and annotation schemes on system performance by retesting systems that participated in the above two tasks.
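
To illustrate how the choice of matching criterion can change a reported score, the following is a minimal sketch in Python. It is not the scoring code of JNLPBA or BioCreAtIvE, and the criteria shown (exact, left-boundary, right-boundary, and overlap matching) are generic examples of strict and relaxed span matching rather than the specific criteria defined in the paper. The span representation (token offsets with an exclusive end), the function names, and the greedy one-to-one pairing are all assumptions made for illustration; entity types are ignored.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: an entity is a (start, end) token span, end exclusive.
Span = Tuple[int, int]

def exact_match(gold: Span, pred: Span) -> bool:
    """Both boundaries must agree."""
    return gold == pred

def left_match(gold: Span, pred: Span) -> bool:
    """Only the left boundary must agree."""
    return gold[0] == pred[0]

def right_match(gold: Span, pred: Span) -> bool:
    """Only the right boundary must agree."""
    return gold[1] == pred[1]

def overlap_match(gold: Span, pred: Span) -> bool:
    """The two spans share at least one token."""
    return gold[0] < pred[1] and pred[0] < gold[1]

def score(gold: List[Span], pred: List[Span],
          match: Callable[[Span, Span], bool]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under the given matching criterion.

    Each gold span can be credited to at most one predicted span
    (greedy pairing in list order, an assumption of this sketch).
    """
    unmatched = list(gold)
    true_positives = 0
    for p in pred:
        for g in unmatched:
            if match(g, p):
                unmatched.remove(g)
                true_positives += 1
                break
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

if __name__ == "__main__":
    gold_spans = [(0, 3), (5, 7)]
    pred_spans = [(1, 3), (5, 7), (9, 10)]
    for criterion in (exact_match, left_match, right_match, overlap_match):
        p, r, f = score(gold_spans, pred_spans, criterion)
        print(f"{criterion.__name__}: P={p:.2f} R={r:.2f} F1={f:.2f}")
```

In this toy example the prediction (1, 3) misses the left boundary of the gold span (0, 3), so it counts as an error under exact and left-boundary matching but as a hit under right-boundary and overlap matching. The same system output therefore receives a noticeably different F1 depending on the criterion applied.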

Keywords:

Statistical hypothesis testing
Natural language processing
Annotation
Bioinformatics
Named-entity recognition
Data mining
Computer science
Text mining
Artificial intelligence
Information retrieval
exact match
vocabulary controlled
