Improved prediction of protein-protein interactions using AlphaFold2 and extended multiple-sequence alignments

2021 
Predicting the structure of single-chain proteins is now close to being a solved problem due to the recent achievement of AlphaFold2 (AF2). However, predicting the structure of interacting protein chains is still a challenge. Here, we utilise AF2 to optimise a protocol for predicting the structure of heterodimeric protein complexes using only sequence information. We find that using the default AF2 protocol, 32% of the models in the Dockground test set can be modelled accurately. By tuning the input alignment and identifying the best model, we adjusted the performance to 43%. Our protocol uses MSAs generated by AF2 and MSAs paired on the organism level generated with HHblits. In a more extensive, more realistic, independent test set, the accuracy is 59%. In comparison, the alternative fold-and-dock method RoseTTAFold is only successful in 10% of the cases on this set and traditional docking methods 22%. However, for the traditional method, the performance would be lower if the bound form of both monomers was not known. The success is higher for bacterial protein pairs, pairs with large interaction areas consisting of helices or sheets, and many homologous sequences. We can distinguish acceptable (DockQ>0.23) from incorrect models with an AUC of 0.84 on the test set by analysing the predicted interfaces. At an error rate of 1%, 13% are acceptable (at a 10% error rate, 40% of the models are acceptable). All scripts and tools to run our protocol are freely available at: https://gitlab.com/ElofssonLab/FoldDock.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    56
    References
    10
    Citations
    NaN
    KQI
    []