Multitask Outcome Prediction using Hybrid Machine Learning and PET-CT Fusion Radiomics

2021 
1424 Objectives: We employ hybrid machine learning systems (HMLSs), each pairing a feature selection algorithm (FSA) with a classifier, applied to data generated using multi-level fusion radiomics strategies that combine PET and CT information. In doing so, we aim to improve prediction of multiple outcomes, with specific application to head and neck (H&N) cancer. Methods: We studied 250 subjects with H&N cancer. Eleven datasets were considered: (1) PET images alone; (2-7) PET and CT images fused via wavelet-based fusion using CT weights of 0.2, 0.4, 0.6 and 0.8, gradient transfer fusion, and guided-filtering-based fusion; (8) fused matrices; (9, 10) fused features constructed via feature averaging and feature concatenation (2×127 features for the latter); and finally, (11) CT images alone. Three outcomes, overall survival (OS), distant metastasis (DM), and loco-regional recurrence (LR), were considered, each treated as a binary classification task (mean event-year as threshold). Different HMLSs constructed from 9 FSAs followed by 6 classifiers were employed to select optimal combinations of features and to predict outcomes. Some classifiers, such as the k-nearest neighbors (KNN) and random forest (RF) algorithms, were optimized via automated hyperparameter tuning. The results were evaluated using mean accuracy from 5-fold cross-validation. Results: For classification based on OS, we achieved an accuracy of 83% via: i) dataset 2+LLBCFS (local learning-based clustering by feature selection & kernel learning)+LDAC (linear discriminant analysis classifier), using 16 features (6 non-imaging features (NIFs)+10 imaging features (IFs)); ii) dataset 5+LLBCFS+LDAC, using 27 features (6 NIFs+21 IFs); and iii) dataset 6+ReliefA (relief algorithm)+KNNC, using 6 features (2 NIFs+4 IFs). Meanwhile, some combinations, such as ReliefA+KNN+datasets 4, 5, 6 and 7; ReliefA+RFC+datasets 4, 6, and 8; ReliefA+LDAC+datasets 6 and 9; and dataset 2+LLBCFS+LDAC, resulted in 82% accuracy.
For DM, multiple combinations of datasets and HMLSs obtained accuracy of ~90%; optimal performance was achieved by: i) dataset 1+FSASL (unsupervised feature selection algorithm with adaptive structure learning)+LDAC, using 30 features (6 NIFs+24 IFs); ii) dataset 2+ILFS (infinite latent feature selection algorithm)+LDAC, using 40 features (0 NIFs+40 IFs); and iii) dataset 10+UMCFS (unsupervised multi-class/cluster feature selection)+LDAC, using 28 features (0 NIFs+28 IFs). For LR, multiple combinations (with <10 features) of HMLSs with all datasets except dataset 4 obtained accuracy over 87%. In short, dataset 1+ILFS+KNN, dataset 2+ILFS+LDAC, dataset 5+ReliefA+RFC, dataset 6+CSFA (sorting features based on pairwise correlations)+KNN, and dataset 10+CSFA+RFA were selected as optimal trajectories for prediction of LR. Feature combinations selected from datasets 1, 2, 3, 7, 8, 9, 10 and 11 included only IFs. Conclusions: We demonstrated that HMLS frameworks combined with fusion radiomics modeling enabled improved prediction of OS, while features from PET or CT alone are sufficient for prediction of LR and DM in patients with H&N cancer. Moreover, optimal combinations for OS prediction included both NIFs and IFs, while some combinations with IFs alone were sufficient for prediction of DM and LR. In addition, dataset 2 (image-fusion) is very effective in prediction of all three outcomes, while datasets 1 (PET only) and 10 (feature-fusion) can be employed for prediction of DM and LR, and datasets 5 and 6 (image-fusion) can also be utilized for prediction of OS and LR.
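
The two feature-level fusion strategies named in the Methods (feature averaging and feature concatenation) and the evaluation metric (mean accuracy over 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names are hypothetical, the data are synthetic random numbers rather than radiomics features, the feature selection step is omitted, and a plain k-nearest-neighbors vote stands in for the tuned classifiers. The sketch also assumes that PET and CT feature vectors are matched position by position, which the abstract does not specify.

```python
import random
from statistics import mean


def fuse_average(pet, ct):
    """Feature averaging: element-wise mean of matched PET and CT features."""
    if len(pet) != len(ct):
        raise ValueError("PET and CT feature vectors must have equal length")
    return [(p + c) / 2.0 for p, c in zip(pet, ct)]


def fuse_concat(pet, ct):
    """Feature concatenation: PET features followed by CT features."""
    return list(pet) + list(ct)


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training samples (squared Euclidean)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    votes = [train_y[i] for i in order[:k]]
    return 1 if sum(votes) * 2 > len(votes) else 0


def kfold_mean_accuracy(xs, ys, n_folds=5, k=3, seed=0):
    """Mean accuracy over n_folds shuffled, non-overlapping folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        hits = sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return mean(accuracies)


# Synthetic stand-in data (assumption): 50 subjects, 127 features per modality,
# binary outcome labels, class-dependent feature means.
rng = random.Random(42)
N_SUBJECTS, N_FEATURES = 50, 127
labels = [i % 2 for i in range(N_SUBJECTS)]
pet = [[rng.gauss(2.0 * y, 1.0) for _ in range(N_FEATURES)] for y in labels]
ct = [[rng.gauss(2.0 * y, 1.0) for _ in range(N_FEATURES)] for y in labels]

averaged = [fuse_average(p, c) for p, c in zip(pet, ct)]      # 127 features
concatenated = [fuse_concat(p, c) for p, c in zip(pet, ct)]   # 2 x 127 features

acc_avg = kfold_mean_accuracy(averaged, labels)
acc_cat = kfold_mean_accuracy(concatenated, labels)
```

Averaging keeps the feature count at 127 per subject, whereas concatenation doubles it to 2×127, matching the count quoted in the Methods. In a fuller pipeline, a feature selection step would be fitted on the training portion of each fold before the classifier, so that held-out subjects do not influence which features are kept.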