Predicting changes in protein thermostability upon point mutation with deep 3D convolutional neural networks

2020 
Predicting mutation-induced changes in protein thermodynamic stability (ΔΔG) is of great interest in protein engineering, variant interpretation, and drug discovery. We introduce ThermoNet, a deep, 3D-convolutional neural network designed for structure-based prediction of the change in protein thermostability upon point mutation. To naturally leverage the image-processing power inherent in convolutional neural networks, we treat protein structures as if they were multi-channel 3D images. In particular, the inputs to ThermoNet are multi-channel voxel grids based on biophysical properties derived from raw atom coordinates; thus, the interpretability of its resulting predictions is improved compared to machine-learning approaches that take heterogeneous representations of predictive features as inputs. ThermoNet is trained with a data set balanced with direct and reverse mutations generated by symmetry-based data augmentation. It demonstrates improved performance compared to fifteen previously developed computational methods on a widely used blind test set. Unlike all other methods that exhibit a strong bias towards predicting destabilization, ThermoNet accurately predicts the effects of both stabilizing and destabilizing mutations. Finally, we demonstrate the practical utility of ThermoNet in predicting the ΔΔG landscape for two clinically relevant proteins, p53 and myoglobin, and all ClinVar missense variants. These predictions indicate that 85.1% of benign variants have effects on protein stability that are in the range expected to be neutral and that pathogenic variants are almost equally likely to be destabilizing as stabilizing (56.6% vs 43.4%, respectively). Overall, our results suggest that 3D convolutional neural networks can model the complex, non-linear interactions perturbed by mutations, directly from biophysical properties of atoms.
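The symmetry-based data augmentation mentioned above rests on the thermodynamic antisymmetry of ΔΔG: if a direct mutation (wild type to mutant) changes stability by ΔΔG, the reverse mutation (mutant back to wild type) changes it by -ΔΔG. The following is a minimal sketch of that idea only, not the ThermoNet pipeline. The `MutationRecord` type, its field names, the structure identifiers, and the ΔΔG values are all hypothetical and chosen for illustration; no sign convention is assumed, since the negation holds under either one.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MutationRecord:
    """One training example (hypothetical layout, not from the paper)."""
    start_structure: str  # identifier of the structure before the mutation
    end_structure: str    # identifier of the structure after the mutation
    ddg: float            # stability change in kcal/mol


def augment_with_reverse(records: List[MutationRecord]) -> List[MutationRecord]:
    """Pair every direct mutation with its reverse mutation.

    The reverse example swaps the two structures and negates ddg, so the
    augmented set holds equally many positive and negative ddg labels
    (provided no label is exactly zero).
    """
    augmented = []
    for rec in records:
        augmented.append(rec)
        augmented.append(
            MutationRecord(
                start_structure=rec.end_structure,
                end_structure=rec.start_structure,
                ddg=-rec.ddg,
            )
        )
    return augmented


# Illustrative values only; these are not measurements from any data set.
records = [
    MutationRecord("proteinA_wt", "proteinA_A10G", 1.5),
    MutationRecord("proteinB_wt", "proteinB_L45F", -0.5),
]
augmented = augment_with_reverse(records)
```

Because each direct example gains a mirror image with the opposite label, a model trained on the augmented set sees no built-in skew toward either sign of ΔΔG, which is the balance the abstract refers to.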