Extended solvent-contact model approach to blind SAMPL5 prediction challenge for the distribution coefficients of drug-like molecules

2016 
The performance of the extended solvent-contact model has been addressed in the SAMPL5 blind prediction challenge for distribution coefficient (LogD) of drug-like molecules with respect to the cyclohexane/water partitioning system. All the atomic parameters defined for 41 atom types in the solvation free energy function were optimized by operating a standard genetic algorithm with respect to water and cyclohexane solvents. In the parameterizations for cyclohexane, the experimental solvation free energy (ΔG sol ) data of 15 molecules for 1-octanol were combined with those of 77 molecules for cyclohexane to construct a training set because ΔG sol values of the former were unavailable for cyclohexane in publicly accessible databases. Using this hybrid training set, we established the LogD prediction model with the correlation coefficient (R), average error (AE), and root mean square error (RMSE) of 0.55, 1.53, and 3.03, respectively, for the comparison of experimental and computational results for 53 SAMPL5 molecules. The modest accuracy in LogD prediction could be attributed to the incomplete optimization of atomic solvation parameters for cyclohexane. With respect to 31 SAMPL5 molecules containing the atom types for which experimental reference data for ΔG sol were available for both water and cyclohexane, the accuracy in LogD prediction increased remarkably with the R, AE, and RMSE values of 0.82, 0.89, and 1.60, respectively. This significant enhancement in performance stemmed from the better optimization of atomic solvation parameters by limiting the element of training set to the molecules with experimental ΔG sol data for cyclohexane. Due to the simplicity in model building and to low computational cost for parameterizations, the extended solvent-contact model is anticipated to serve as a valuable computational tool for LogD prediction upon the enrichment of experimental ΔG sol data for organic solvents.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    42
    References
    4
    Citations
    NaN
    KQI
    []