DG-GL: Differential geometry based geometric learning of molecular datasets.

2018 
Motivation: Despite its great success in physical modeling, differential geometry (DG) has rarely been used as a versatile tool for analyzing large, diverse and complex molecular and biomolecular datasets. The reason is a limited understanding of its potential power in dimensionality reduction and of its ability to encode essential chemical and biological information in differentiable manifolds.

Results: We put forward a differential geometry based geometric learning (DG-GL) hypothesis that the intrinsic physics of three-dimensional (3D) molecular structures lies on a family of low-dimensional manifolds embedded in a high-dimensional data space. We encode crucial chemical, physical and biological information into 2D element interactive manifolds, extracted from a high-dimensional structural data space via a multiscale discrete-to-continuum mapping using differentiable density estimators. Differential geometry apparatuses are utilized to construct element interactive curvatures (Gaussian curvature, mean curvature, minimum curvature and maximum curvature) in analytical forms for certain analytically differentiable density estimators. These low-dimensional differential geometry representations are paired with a robust machine learning algorithm to showcase their descriptive and predictive power for large, diverse and complex molecular and biomolecular datasets. Extensive numerical experiments are carried out to demonstrate that the proposed DG-GL strategy outperforms other advanced methods in the prediction of drug-discovery-related protein-ligand binding affinity, drug toxicity, and molecular solvation free energy.
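The discrete-to-continuum mapping and the analytical curvatures mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a plain Gaussian kernel exp(-|r - r_j|^2 / eta^2) as the differentiable density estimator, ignores element-specific pairing, and uses the standard curvature formulas for level sets of a scalar field. The names `density_derivatives`, `curvatures`, `atoms` and `eta` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def density_derivatives(point, atoms, eta):
    """Density, gradient and Hessian at `point` for a sum of Gaussian kernels.

    Assumption: rho(r) = sum_j exp(-|r - r_j|^2 / eta^2); the kernel choice
    and the scale parameter eta are illustrative, not taken from the paper.
    """
    d = point - atoms                                # (n, 3) displacements
    w = np.exp(-np.sum(d**2, axis=1) / eta**2)       # (n,) kernel values
    rho = w.sum()
    grad = (-2.0 / eta**2) * (w[:, None] * d).sum(axis=0)
    outer = np.einsum("n,ni,nj->ij", w, d, d)        # sum_j w_j d_j d_j^T
    hess = (4.0 / eta**4) * outer - (2.0 / eta**2) * rho * np.eye(3)
    return rho, grad, hess

def curvatures(grad, hess):
    """Gaussian, mean, minimum and maximum curvature of the level set."""
    g2 = grad @ grad
    tr = np.trace(hess)
    # Adjugate of a 3x3 matrix via the Cayley-Hamilton identity.
    adj = hess @ hess - tr * hess + 0.5 * (tr**2 - np.trace(hess @ hess)) * np.eye(3)
    K = (grad @ adj @ grad) / g2**2
    H = (grad @ hess @ grad - g2 * tr) / (2.0 * g2**1.5)
    disc = np.sqrt(max(H * H - K, 0.0))              # guard against round-off
    return K, H, H - disc, H + disc

# Sanity check: one atom at the origin gives spherical level sets, so at
# distance r = 2 we expect K close to 1/r^2 = 0.25 and H close to 1/r = 0.5.
atoms = np.array([[0.0, 0.0, 0.0]])
point = np.array([2.0, 0.0, 0.0])
_, grad, hess = density_derivatives(point, atoms, eta=1.5)
K, H, k_min, k_max = curvatures(grad, hess)
```

In a learning pipeline of the kind described, such curvature values would be evaluated at many points and for several values of `eta` (the multiscale aspect), then aggregated into fixed-length feature vectors for a machine learning model. How that aggregation is done is not specified here.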