Artificial intelligence improves experts in reading pulmonary function tests

2018 
Introduction: The use of pulmonary function tests (PFTs) relies on expert opinion. PFT interpretation depends on pattern recognition but rarely leads to a specific respiratory disease diagnosis. We aimed to explore the accuracy and inter-rater variability of pulmonologists when (1) interpreting PFTs and (2) suggesting a respiratory disease diagnosis based on clinical information and PFTs. We compared their performance with that of artificial intelligence (AI)-based software developed on 1430 historical patient cases. Methods: 120 pulmonologists from 16 European hospitals made 6000 interpretations of complete PFTs (spirometry, body plethysmography and diffusing capacity) with clinical information, covering 50 patient cases. ATS/ERS guidelines were used as the gold standard for test interpretation. The gold standard diagnosis was derived from clinical history, PFTs and all additional tests, and was confirmed by an expert panel. The AI software examined the same cases. Results: The pulmonologists' interpretations (senior 73%, junior 27%) matched the guidelines in 74.4% (±5.9) of cases (range: 56-88%). An inter-rater variability of 0.67 pointed to general agreement among readers. Readers correctly identified the primary disease diagnosis in only 44.6% (±8.7) of cases (range: 24-62%). An inter-rater variability of 0.35 indicated general disagreement among readers. The AI-based software matched the ATS/ERS guideline interpretation in all cases (100%) and assigned the correct diagnosis in 82% of cases. Conclusions: PFT interpretation and primary respiratory disease diagnosis by expert clinicians are subject to discrepancy and error. AI-based software provides a powerful decision support tool to improve current clinical practice.