A Two-Stage Deep Learning Model for Fully Automated Pancreas Segmentation on Computed Tomography: Comparison with Intra-Reader and Inter-Reader Reliability at Full and Reduced Radiation Dose on an External Dataset.

Ananya Panda, Panagiotis Korfiatis, Garima Suman, Sushil K. Garg, Eric C. Polley, Dhruv P. Singh, Suresh T. Chari, Ajit H. Goenka

2021

PURPOSE: To develop a two-stage 3D convolutional neural network (3D-CNN) for fully automated volumetric segmentation of the pancreas on CT, and to evaluate its performance in the context of intra-reader and inter-reader reliability on full-dose and reduced-radiation-dose CTs from a public dataset.

METHODS: A dataset of 1994 abdominal CT scans (portal venous phase, slice thickness ≤ 3.75 mm, multiple CT vendors) was curated by two radiologists (R1 and R2), who excluded cases with pancreatic pathology, sub-optimal image quality, or image artifacts (n=77). The remaining 1917 CTs were allocated equally between R1 and R2 for volumetric pancreas segmentation, which served as ground truth (GT). This internal dataset was randomly divided into training (n=1380), validation (n=248), and test (n=289) sets to develop a two-stage 3D-CNN model, based on a modified U-Net architecture, for automated volumetric pancreas segmentation. The model's segmentation performance and the differences between model-predicted and GT pancreatic volumes were evaluated on the test set. Subsequently, an external dataset from The Cancer Imaging Archive (TCIA), containing CT scans acquired at standard radiation dose and the same scans reconstructed at a simulated 25% radiation dose, was curated (n=41). R1 and R2 independently performed volumetric pancreas segmentation on this TCIA dataset, first on the full-dose and then on the reduced-dose CT images. Intra-reader and inter-reader reliability, the model's segmentation performance, and the reliability between model-predicted pancreatic volumes at full versus reduced dose were measured. Finally, the model's performance was tested on the benchmark National Institutes of Health (NIH)-Pancreas CT (PCT) dataset.

RESULTS: The 3D-CNN had a mean (SD) Dice similarity coefficient (DSC) of 0.91 (0.03) and an average Hausdorff distance of 0.15 (0.09) mm on the test set.
The model's performance was equivalent between males and females (p=0.08) and across different CT slice thicknesses (p>0.05) based on non-inferiority statistical testing. There was no significant difference between model-predicted and GT pancreatic volumes [mean (SD) predicted volume 99 cc (31 cc); GT volume 101 cc (33 cc), p=0.33]. The mean pancreatic volume difference was -2.7 cc (percent difference: -2.4% of GT volume), with excellent correlation between model-predicted and GT volumes [concordance correlation coefficient (CCC)=0.97]. In the external TCIA dataset, the model had higher reliability than R1 and R2 on full-dose versus reduced-dose CT scans [model mean (SD) DSC: 0.96 (0.02), CCC=0.995, versus R1 DSC: 0.83 (0.07), CCC=0.89, and R2 DSC: 0.87 (0.04), CCC=0.97]. The DSC and volume concordance correlations for R1 versus R2 (inter-reader reliability) were 0.85 (0.07), CCC=0.90 on the full-dose dataset and 0.83 (0.07), CCC=0.96 on the reduced-dose dataset. There was good reliability between the model and R1 on both full-dose and reduced-dose CT [full dose: DSC 0.81 (0.07), CCC=0.83; reduced dose: DSC 0.81 (0.08), CCC=0.87]. Likewise, there was good reliability between the model and R2 on both full-dose and reduced-dose CT [full dose: DSC 0.84 (0.05), CCC=0.89; reduced dose: DSC 0.83 (0.06), CCC=0.89]. There was no significant difference between model-predicted and GT pancreatic volumes in the TCIA dataset [mean (SD) predicted volume 96 cc (33 cc); GT pancreatic volume 89 cc (30 cc), p=0.31]. The model had a mean (SD) DSC of 0.89 (0.04) (minimum-maximum DSC: 0.79-0.96) on the NIH-PCT dataset.

CONCLUSION: A 3D-CNN developed on the largest dataset of CTs is accurate for fully automated volumetric pancreas segmentation and is generalizable across a wide range of CT slice thicknesses, radiation doses, and patient gender. This 3D-CNN offers a scalable tool to leverage biomarkers from pancreas morphometrics and radiomics for pancreatic diseases, including early pancreatic cancer detection.
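
The two agreement measures reported throughout the abstract, the DSC for mask overlap and the CCC for volume agreement, can be illustrated with a minimal sketch. This uses the standard textbook definitions (DSC = 2|A∩B| / (|A| + |B|); Lin's CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²)); the abstract does not state which implementation the authors used. The function names, the representation of a mask as a set of voxel indices, and the toy numbers are all illustrative assumptions, not data from the study.

```python
from itertools import product


def dice_similarity(pred_voxels, truth_voxels):
    """Dice similarity coefficient between two segmentation masks.

    Each mask is given as an iterable of (z, y, x) foreground voxel indices
    (a representation assumed here for simplicity).
    """
    pred, truth = set(pred_voxels), set(truth_voxels)
    denom = len(pred) + len(truth)
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(pred & truth) / denom


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    x, y = list(x), list(y)
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Population (1/n) variances and covariance, as in Lin's definition.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Toy masks: an 8-voxel prediction fully contained in a 12-voxel ground truth.
pred_mask = set(product(range(1, 3), range(1, 3), range(1, 3)))
truth_mask = set(product(range(1, 3), range(1, 3), range(1, 4)))
toy_dsc = dice_similarity(pred_mask, truth_mask)  # 2 * 8 / (8 + 12)

# Toy volumes in cc (made up): a constant +10 cc bias lowers the CCC
# even though the two series are perfectly linearly correlated.
volumes_a = [90.0, 100.0, 110.0]
volumes_b = [100.0, 110.0, 120.0]
toy_ccc = concordance_correlation(volumes_a, volumes_b)
```

Unlike Pearson correlation, the CCC penalizes systematic offsets between the two series, which is why it suits comparisons of model-predicted against reader-derived volumes.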

