Inter-Rater Reliability of Unstructured Text Labeling: Artificially vs. Naturally Intelligent Approaches.

G. V. Danilov, Alexandra V. Kosyrkova, Maria Shults, Semen Melchenko, Tatyana Tsukanova, Michael A. Shifrin, Alexander Potapov

2021

Technologies for labeling unstructured medical text are expected to be in high demand as interest in artificial intelligence and natural language processing grows in the medical domain. Our study aimed to assess the agreement between experts who retrospectively judged, based on electronic health records, whether pulmonary embolism (PE) had occurred in neurosurgical cases, and to assess the utility of a machine learning approach for automating this process. We observed moderate agreement between 3 independent raters on PE detection (Light's kappa = 0.568, p = 0). Labeling sentences with the method we proposed earlier might improve the machine learning results (accuracy = 0.97, ROC AUC = 0.98), even in cases on which the 3 independent raters could not agree. Medical text labeling techniques might be more efficient when strict rules and semi-automated approaches are implemented. Machine learning might be a good option for unstructured text labeling when the reliability of the textual data is properly addressed. This project was supported by RFBR grant 18-29-22085.
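
The agreement statistic reported above, Light's kappa, is conventionally defined as the mean of Cohen's kappa over all pairs of raters. The following is a minimal Python sketch of that definition, not the authors' code. The function names `cohen_kappa` and `light_kappa` and the three toy rating vectors are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters who labeled the same cases."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(a)
    # Observed agreement: share of cases with identical labels.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    labels = count_a.keys() | count_b.keys()
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def light_kappa(ratings):
    """Light's kappa: mean pairwise Cohen's kappa over all rater pairs."""
    pairs = list(combinations(ratings, 2))
    if not pairs:
        raise ValueError("at least two raters are required")
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy labels (1 = PE judged present, 0 = absent), one list per rater.
rater_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
rater_2 = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
rater_3 = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

print(round(light_kappa([rater_1, rater_2, rater_3]), 3))
```

This sketch computes only the point estimate. A p-value such as the one reported in the abstract would require a separate significance test, which is not shown here.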
