Comparing the CORAL and random forest approaches for modelling the in vitro cytotoxicity of silica nanomaterials

2016 
Nanotechnology is one of the most important technological developments of the twenty-first century. In silico methods for predicting toxicity, such as quantitative structure-activity relationships (QSARs), promote the safe-by-design approach to developing new materials, including nanomaterials. In this study, a set of experimental cytotoxicity data comprising 19 data points for silica nanomaterials was investigated to compare the widely employed CORAL and Random Forest approaches in terms of their usefulness for developing so-called “nano-QSAR” models. “External” leave-one-out cross-validation (LOO) analysis was performed to validate the two approaches. Variable importance measures and signed feature contributions were analysed for both algorithms in order to interpret the models developed. CORAL showed a larger gap between the average coefficients of determination (R²) for training and LOO (0.83 and 0.65, respectively) than Random Forest (0.87 and 0.78 without bootstrap sampling, 0.90 and 0.78 with bootstrap sampling), which may be due to overfitting. Among the nanomaterials’ physico-chemical properties, the aspect ratio and zeta potential were found to be the two most important variables for the Random Forest, and the average feature contributions calculated for the corresponding descriptors were consistent with the clear trends observed in the dataset: less negative zeta potential values and lower aspect ratio values were associated with higher cytotoxicity. In contrast, CORAL failed to capture these trends.
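The comparison of training and LOO R² described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are made up, the `LineFit` stand-in model (one-descriptor least squares) and the helper names `r_squared` and `loo_predictions` are hypothetical, and the LOO R² is pooled over all held-out predictions, which may differ from the averaging used in the study. Any regressor exposing `fit` and `predict`, for example scikit-learn's `RandomForestRegressor`, could in principle be passed in place of `LineFit`.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


class LineFit:
    """Hypothetical stand-in model: least squares on the first descriptor."""

    def fit(self, X, y):
        xs = [row[0] for row in X]
        x_bar, y_bar = mean(xs), mean(y)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (t - y_bar) for x, t in zip(xs, y))
        self.slope = sxy / sxx
        self.intercept = y_bar - self.slope * x_bar
        return self

    def predict(self, X):
        return [self.intercept + self.slope * row[0] for row in X]


def loo_predictions(make_model, X, y):
    """Predict each point with a model refitted on all the other points."""
    preds = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = make_model().fit(X_train, y_train)
        preds.append(model.predict([X[i]])[0])
    return preds


# Made-up toy data (NOT the silica nanomaterial dataset).
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1.1, 1.9, 3.2, 3.8, 5.3, 5.9]

r2_train = r_squared(y, LineFit().fit(X, y).predict(X))
r2_loo = r_squared(y, loo_predictions(LineFit, X, y))
print(f"training R2 = {r2_train:.3f}, LOO R2 = {r2_loo:.3f}")
```

A large drop from the training R² to the LOO R² is the kind of gap the abstract reads as a possible sign of overfitting.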