Comparison of 11 automated PET segmentation methods in lymphoma.

2020 
Background: Segmentation of lymphoma lesions in FDG PET/CT images is critical both for assessing individual lesions and for quantifying patient disease burden. Simple thresholding methods remain common despite the large heterogeneity in lymphoma lesion location, size, and contrast. Here, we assess 11 automated PET segmentation methods for their use in two scenarios: individual lesion segmentation and patient-level disease quantification in lymphoma.

Methods: Lesions on 18F-FDG PET/CT scans of 90 lymphoma patients were contoured by a nuclear medicine physician. Thresholding, active contours, clustering, adaptive region-growing, and convolutional neural network (CNN) methods were implemented on all physician-identified lesions. Lesion-level segmentation was evaluated using multiple segmentation performance metrics (Dice coefficient, Hausdorff distance). Patient-level quantification of total disease burden (SUVtotal) and metabolic tumor volume (MTV) was assessed using Spearman's correlation coefficients between the segmentation output and physician contours. Lesion segmentation and patient quantification performance were compared to inter-physician agreement in a subset of 20 patients segmented by a second nuclear medicine physician.

Results: In total, 1,223 lesions with a median tumor-to-background ratio of 4.0 and a median volume of 1.8 cm³ were evaluated. When assessed for lesion segmentation, a 3D CNN, DeepMedic, achieved the highest performance across all evaluation metrics. DeepMedic, clustering methods, and an iterative threshold method had lesion-level segmentation performance comparable to the degree of inter-physician agreement. For patient-level SUVtotal and MTV quantification, all methods except 40% and 50% SUVmax and adaptive region-growing achieved performance similar to the agreement between the two physicians.

Conclusions: Multiple methods, including a 3D CNN, clustering, and an iterative threshold method, achieved both good lesion-level segmentation and good patient-level quantification performance in a population of 90 lymphoma patients. These methods are thus recommended over thresholding methods such as 40% and 50% SUVmax, which were consistently found to be significantly outside the limits defined by inter-physician agreement.
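To make the evaluated quantities concrete, the following is a minimal illustrative sketch, not the authors' implementation, of a fixed-fraction SUVmax threshold (such as 40% SUVmax), the Dice coefficient between two binary masks, and MTV computed as segmented voxel count times voxel volume. The function names, the rectangular region of interest, and the toy voxel size are assumptions made for illustration only.

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / denom

def threshold_fraction_of_suvmax(suv, roi, fraction=0.4):
    """Keep voxels inside `roi` whose SUV is at least `fraction` of the ROI's SUVmax."""
    suv = np.asarray(suv, dtype=float)
    roi = np.asarray(roi, dtype=bool)
    if not roi.any():
        return np.zeros(roi.shape, dtype=bool)
    suv_max = suv[roi].max()
    return roi & (suv >= fraction * suv_max)

def metabolic_tumor_volume(mask, voxel_volume_cm3):
    """MTV in cm^3: number of segmented voxels times the volume of one voxel."""
    return float(np.asarray(mask, dtype=bool).sum()) * voxel_volume_cm3

# Toy single-slice example (values are made up, not data from the study).
suv = np.array([[[1.0, 2.0, 5.0],
                 [1.0, 4.5, 10.0],
                 [0.5, 3.0, 6.0]]])
roi = np.ones(suv.shape, dtype=bool)       # hypothetical bounding region
reference = suv >= 5.0                     # stand-in for a physician contour

auto = threshold_fraction_of_suvmax(suv, roi, fraction=0.4)
print("Dice:", dice(auto, reference))
print("MTV (cm^3):", metabolic_tumor_volume(auto, voxel_volume_cm3=0.064))
```

In this toy case the 40% threshold includes one voxel that the reference excludes, which lowers the Dice score and inflates the volume; in the study itself the per-patient totals from each method were compared against physician contours with Spearman's correlation.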