A large number of protein–protein interactions (PPIs) are mediated by interactions between proteins and peptide segments of their binding partners, so characterizing protein–peptide interactions (PpIs) is crucial for elucidating important biological processes and for designing peptides or peptidomimetic drugs that can modulate PPIs. Molecular docking is a powerful computational tool that has been widely used to predict the binding structures of protein–peptide complexes. However, although a number of docking programs are available, no systematic assessment of their performance for PpIs has been reported. In this study, a benchmark data set called PepSet, consisting of 185 protein–peptide complexes with peptide lengths ranging from 5 to 20 residues, was employed to evaluate the performance of 14 docking programs, including three protein–protein docking programs (ZDOCK, FRODOCK, and HawkDock), three small-molecule docking programs (GOLD, Surflex-Dock, and AutoDock Vina), and eight protein–peptide docking programs (GalaxyPepDock, MDockPeP, HPEPDOCK, CABS-dock, pepATTRACT, DINC, AutoDock CrankPep (ADCP), and HADDOCK peptide docking). A new evaluation parameter, named IL_RMSD, was proposed to measure docking accuracy alongside fnat (the fraction of native contacts). In global docking, HPEPDOCK performs best on the entire data set, with success rates of 4.3%, 24.3%, and 55.7% at the top 1, 10, and 100 levels, respectively. In local docking, ADCP achieves the best overall predictions, with success rates of 11.9%, 37.3%, and 70.3% at the top 1, 10, and 100 levels, respectively. We expect this work to provide helpful insights into the selection of docking programs for PpIs and the development of improved ones. The benchmark data set is freely available at http://cadd.zju.edu.cn/pepset/.
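To illustrate the two kinds of metric mentioned above, the following minimal sketch computes fnat as the fraction of native contacts reproduced by a model, and a top-N success rate over ranked poses. The contact representation (pairs of residue labels), the function names, the toy data, and the fnat cutoff of 0.5 used to call a pose successful are all assumptions for illustration; the abstract does not state the success criteria used in the study.

```python
def fnat(native_contacts, model_contacts):
    """Fraction of native contacts that are also present in the model."""
    native = set(native_contacts)
    if not native:
        raise ValueError("native contact set is empty")
    return len(native & set(model_contacts)) / len(native)


def top_n_success_rate(ranked_fnat_per_complex, n, cutoff=0.5):
    """Share of complexes with at least one pose in the top n with fnat >= cutoff.

    ranked_fnat_per_complex: list of lists; each inner list holds the fnat
    values of one complex's poses, already sorted by the docking rank.
    The cutoff is a hypothetical success criterion, not the one from the study.
    """
    if not ranked_fnat_per_complex:
        raise ValueError("no complexes given")
    hits = sum(
        any(value >= cutoff for value in poses[:n])
        for poses in ranked_fnat_per_complex
    )
    return hits / len(ranked_fnat_per_complex)


# Toy data: contacts are (receptor_residue, peptide_residue) pairs.
native = {("A45", "P3"), ("A47", "P4"), ("A90", "P6"), ("A91", "P7")}
model = {("A45", "P3"), ("A47", "P4"), ("A12", "P1")}
example_fnat = fnat(native, model)  # 2 of 4 native contacts reproduced

# Three complexes, each with poses listed in rank order.
ranked = [[0.1, 0.6, 0.2], [0.7, 0.1], [0.0, 0.2, 0.3]]
top1 = top_n_success_rate(ranked, 1)
top3 = top_n_success_rate(ranked, 3)
```

Reporting the rate at several values of N, as the abstract does for the top 1, 10, and 100 levels, separates sampling quality (is a good pose generated at all?) from scoring quality (is it ranked first?).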
Protein–protein interactions (PPIs) play an important role in the different functions of cells, but accurate prediction of the three-dimensional structures of PPIs is still a notoriously difficult task. In this study, HawkDock, a free, open-access web server, was developed to predict and analyze the structures of PPIs. In the HawkDock server, the ATTRACT docking algorithm, the HawkRank scoring function developed in our group, and the MM/GBSA free energy decomposition analysis were seamlessly integrated into a multi-functional platform. The structures of PPIs were predicted by combining ATTRACT docking with HawkRank re-scoring, and the key residues for PPIs were highlighted by the MM/GBSA free energy decomposition. Molecular visualization was supported by 3Dmol.js. For the structural modeling of PPIs, HawkDock achieved better performance than ZDOCK 3.0.2 in the benchmark testing. For the prediction of key residues, the important residues that play an essential role in PPIs could be identified among the top 10 residues for ∼81.4% of the predicted models and ∼95.4% of the crystal structures in the benchmark dataset. In summary, the HawkDock server is a powerful tool to predict the binding structures and identify the key residues of PPIs. The HawkDock server is accessible free of charge at http://cadd.zju.edu.cn/hawkdock/.
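The key-residue evaluation described above can be sketched as follows: residues are ranked by their per-residue free energy contribution, and a prediction counts as a hit if a known key residue appears among the top k. This is only a sketch under stated assumptions. The dictionary format, the function names, the made-up energies, and the "at least one key residue in the top k" criterion are hypothetical and are not taken from the HawkDock server.

```python
def top_k_residues(per_residue_energy, k=10):
    """Return the k residues with the most favorable (most negative) contribution."""
    ranked = sorted(per_residue_energy.items(), key=lambda item: item[1])
    return [residue for residue, _ in ranked[:k]]


def hits_key_residue(per_residue_energy, known_key_residues, k=10):
    """True if at least one known key residue is among the top-k residues.

    The "at least one" criterion is an assumption made for this sketch.
    """
    top = set(top_k_residues(per_residue_energy, k))
    return bool(top & set(known_key_residues))


# Hypothetical decomposition output in kcal/mol (made-up numbers).
energies = {"ARG12": -6.2, "TYR33": -4.8, "GLU58": 1.3, "LEU71": -2.1, "ASP90": 0.4}
top2 = top_k_residues(energies, k=2)
found = hits_key_residue(energies, {"TYR33"}, k=2)
missed = hits_key_residue(energies, {"GLU58"}, k=2)
```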
Compound-protein interactions (CPI) play significant roles in drug development. To avoid side effects, it is also crucial to evaluate drug selectivity when binding to different targets. However, most selectivity prediction models are constructed for specific targets with limited data. In this study, we present a pretrained multi-functional model for compound-protein interaction prediction (PMF-CPI) and fine-tune it to assess drug selectivity. This model uses recurrent neural networks to process the protein embedding based on the pretrained language model TAPE, extracts molecular information from a graph encoder, and produces the output from dense layers. PMF-CPI obtained the best performance compared to outstanding approaches on both the binding affinity regression and CPI classification tasks. Meanwhile, we apply the model to analyzing drug selectivity after fine-tuning it on three datasets related to specific targets, including human cytochrome P450s. The study shows that PMF-CPI can accurately predict different drug affinities or opposite interactions toward similar targets, recognizing selective drugs for precise therapeutics.
Protein-protein interactions (PPIs) have been regarded as an attractive emerging class of therapeutic targets for the development of new treatments. Computational approaches, especially molecular docking, have been extensively employed to predict the binding structures of PPI-inhibitors or discover novel small molecule PPI inhibitors. However, due to the relatively 'undruggable' features of PPI interfaces, accurate predictions of the binding structures for ligands towards PPI targets are quite challenging for most docking algorithms. Here, we constructed a non-redundant pose ranking benchmark dataset for small-molecule PPI inhibitors, which contains 900 binding poses for 184 protein-ligand complexes. Then, we evaluated the performance of MM/PB(GB)SA approaches to identify the correct binding poses for PPI inhibitors, including two Prime MM/GBSA procedures from the Schrödinger suite and seven different MM/PB(GB)SA procedures from the Amber package. Our results showed that MM/PBSA outperformed the Glide SP scoring function (success rate of 58.6%) and MM/GBSA in most cases, especially the PB3 procedure which could achieve an overall success rate of ∼74%. Moreover, the GB6 procedure (success rate of 68.9%) performed much better than the other MM/GBSA procedures, highlighting the excellent potential of the GBNSR6 implicit solvation model for pose ranking. Finally, we developed the webserver of Fast Amber Rescoring for PPI Inhibitors (farPPI), which offers a freely available service to rescore the docking poses for PPI inhibitors by using the MM/PB(GB)SA methods. The farPPI web server is freely available at http://cadd.zju.edu.cn/farppi/. Supplementary data are available at Bioinformatics online.
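A pose-ranking success rate of the kind reported above can be computed roughly as in the sketch below: for each complex, the pose with the most favorable rescored energy is selected, and the complex counts as a success if that pose is close enough to the experimental binding mode. The 2.0 Å RMSD cutoff, the data layout, the function name, and the toy values are assumptions made for illustration; the abstract does not specify the criterion used.

```python
def pose_ranking_success_rate(complexes, rmsd_cutoff=2.0):
    """Fraction of complexes whose best-scored pose lies within rmsd_cutoff (in Å).

    complexes: dict mapping a complex ID to a list of (score, rmsd) tuples,
    where a lower score means a more favorable predicted binding energy.
    The default cutoff is a hypothetical choice for this sketch.
    """
    if not complexes:
        raise ValueError("no complexes given")
    successes = 0
    for poses in complexes.values():
        # Pick the pose that the scoring method ranks first.
        _, best_rmsd = min(poses, key=lambda pose: pose[0])
        if best_rmsd <= rmsd_cutoff:
            successes += 1
    return successes / len(complexes)


# Toy rescoring results: (energy in kcal/mol, RMSD to the crystal pose in Å).
toy = {
    "complex_a": [(-35.1, 1.2), (-30.4, 4.8), (-28.0, 6.1)],
    "complex_b": [(-22.7, 5.5), (-21.9, 0.9)],
    "complex_c": [(-40.2, 7.3), (-41.0, 1.8)],
    "complex_d": [(-18.3, 2.0)],
}
rate = pose_ranking_success_rate(toy)
```

In this toy set, complex_b fails because its near-native pose is not the best-scored one, which is exactly the kind of error that rescoring aims to reduce.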
Although enzymes have the advantage of efficient catalysis, natural enzymes often lack stability in industrial environments and may not even catalyze the required reactions. This creates an urgent need for the de novo design of new enzymes. Computational methods are a powerful strategy here: they can explore sequence space rapidly and efficiently and can support the design of new enzymes suited to specific conditions and requirements, which makes them highly beneficial for designing new industrial enzymes. Currently, there exists only one tool for enzyme generation, which exhibits suboptimal performance. We therefore selected several general protein sequence design tools and systematically evaluated their effectiveness when applied to specific industrial enzymes. We grouped the computational methods used for protein sequence generation into three categories: structure-conditional sequence generation, sequence generation without structural constraints, and co-generation of sequence and structure. To evaluate the ability of the six computational tools to generate enzyme sequences, we first constructed a luciferase dataset named Luc_64. We then assessed the quality of the enzyme sequences generated by these methods on this dataset, including amino acid distribution, EC number validation, etc. We also assessed sequences generated by structure-based methods on existing public datasets, using sequence recovery rate and root-mean-square deviation (RMSD) to cover both the sequence and the structure perspective. On the functionality dataset Luc_64, ABACUS-R and ProteinMPNN stood out for producing sequences with amino acid distributions and functionalities closely matching those of naturally occurring luciferase enzymes, suggesting their effectiveness in preserving essential enzymatic characteristics.
Across both benchmark datasets, ABACUS-R and ProteinMPNN also exhibited the highest sequence recovery rates, indicating their superior ability to generate sequences closely resembling those of the original enzymes. Our study provides a reference for researchers selecting appropriate enzyme sequence design tools, highlighting the strengths and limitations of each tool in generating accurate and functional enzyme sequences. ProteinMPNN and ABACUS-R emerged as the most effective tools in our evaluation, offering high accuracy in sequence recovery and RMSD and maintaining the functional integrity of enzymes through accurate amino acid distribution. In addition, our industrial enzyme benchmark provides a fair evaluation of how well general protein design tools transfer to specific industrial enzymes.
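For readers unfamiliar with the sequence recovery rate used above, a minimal sketch follows. It assumes the designed and native sequences have equal length and are compared position by position, as is usual when a sequence is designed onto a fixed backbone; the function name and the example sequences are made up for illustration.

```python
def sequence_recovery(native, designed):
    """Fraction of positions where the designed residue matches the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences are empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


# Made-up 10-residue example with two substitutions (T->S and I->L).
native_seq = "MKTAYIAKQR"
designed_seq = "MKSAYLAKQR"
recovery = sequence_recovery(native_seq, designed_seq)
```

A high recovery rate shows that a tool reproduces native-like sequences for a given structure. It does not by itself show that the designed enzyme is functional, which is why the study also reports amino acid distribution and EC number validation.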
Considerable attention has recently been devoted to active vibration control using intelligent materials as sensors and actuators. This paper presents results on active control schemes for vibration suppression of a flexible cantilever beam with bonded piezoelectric sensors and actuators. State equations are built with the generalized modal coordinates as state variables. With the piezoelectric elements surface-bonded at nearly the same position close to the fixed end of the beam, two active vibration control methods, Linear Quadratic Gaussian (LQG) optimal control and robust H∞ control, are investigated. Finally, simulation results are given to demonstrate the effectiveness of the control methods. Compared with LQG control, robust H∞ control is more robust to variations in the modal parameters, has good closed-loop dynamic performance, and suppresses vibration better when the system is subject to uncertainty.
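For orientation, a state-space model in generalized modal coordinates, together with the LQG cost, typically takes the generic textbook form below. This is a sketch under assumptions, not the paper's own equations: the symbols (modal coordinates $q_i$, damping ratios $\zeta_i$, natural frequencies $\omega_i$, input gains $b_i$, noise terms $w$ and $v$, weights $Q$ and $R$) are not given in the abstract and are introduced here for illustration.

```latex
% Equation of motion for mode i (i = 1, ..., n), driven by the actuator voltage u(t)
\[
\ddot{q}_i(t) + 2\zeta_i\omega_i\,\dot{q}_i(t) + \omega_i^2\,q_i(t) = b_i\,u(t)
\]

% State vector built from the modal coordinates and their rates
\[
x(t) = \begin{bmatrix} q(t) \\ \dot{q}(t) \end{bmatrix}, \qquad
\dot{x}(t) = A\,x(t) + B\,u(t) + w(t), \qquad
y(t) = C\,x(t) + v(t)
\]

\[
A = \begin{bmatrix} 0 & I \\ -\Omega^2 & -2Z\Omega \end{bmatrix}, \qquad
B = \begin{bmatrix} 0 \\ b \end{bmatrix}, \qquad
\Omega = \mathrm{diag}(\omega_i), \quad Z = \mathrm{diag}(\zeta_i)
\]

% LQG cost: expected time-averaged quadratic penalty on state and control effort
\[
J = \lim_{T\to\infty} \frac{1}{T}\,
E\!\left[ \int_0^T \left( x^{\top} Q\,x + u^{\top} R\,u \right) dt \right]
\]
```

In this form, uncertainty in the modal parameters enters through $\Omega$ and $Z$. LQG optimizes $J$ for the nominal $A$, whereas H∞ design bounds the worst-case gain from disturbances to the outputs, which is consistent with the robustness comparison reported above.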
Translation speed can affect the cotranslational folding of nascent peptides. Experimental observations have indicated that slowing down the translation rates of codons can increase the probability of cotranslational protein folding. Recently, a kinetic model indicated that fast translation can also increase the probability of cotranslational protein folding by avoiding misfolded intermediates. We show that the villin headpiece subdomain HP35 is an ideal model to demonstrate this phenomenon. We studied the cotranslational folding of HP35 at different fast translation speeds by all-atom molecular dynamics simulations and found that, in certain cases, HP35 can fold along a well-defined pathway that passes through the on-pathway intermediate but avoids the misfolded off-pathway intermediate. This greatly increases the probability of HP35 cotranslational folding, and the approximate mean first passage time of folding into the native state is about 1.67 μs. Because we also considered the spatial confinement effect of the ribosomal exit tunnel on cotranslational folding, our simulation results suggest an alternative mechanism by which fast translation increases the probability of cotranslational folding.
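A mean first passage time such as the one quoted above can be estimated from a set of trajectories roughly as sketched here: for each trajectory, record the first time at which an order parameter crosses into the folded state, then average over the trajectories that folded. The use of RMSD to the native structure as the order parameter, the 2.0 Å cutoff, the function names, and the toy data are assumptions for illustration and are not taken from the study.

```python
def first_passage_time(times, rmsd, folded_cutoff):
    """Return the first time at which rmsd <= folded_cutoff, or None if never."""
    for t, value in zip(times, rmsd):
        if value <= folded_cutoff:
            return t
    return None


def mean_first_passage_time(trajectories, folded_cutoff=2.0):
    """Average first passage time over the trajectories that reached the folded state.

    trajectories: list of (times, rmsd) pairs with times in microseconds.
    Returns (mean_time, number_of_folded_trajectories).
    Note: ignoring trajectories that never fold biases the estimate low.
    """
    passage_times = [
        first_passage_time(times, rmsd, folded_cutoff)
        for times, rmsd in trajectories
    ]
    reached = [t for t in passage_times if t is not None]
    if not reached:
        raise ValueError("no trajectory reached the folded state")
    return sum(reached) / len(reached), len(reached)


# Toy trajectories: (time in μs, RMSD to native in Å); values are made up.
toy_trajectories = [
    ([0.0, 0.5, 1.0, 1.5], [8.0, 5.0, 1.8, 1.5]),  # folds at 1.0 μs
    ([0.0, 1.0, 2.0], [9.0, 4.0, 1.9]),            # folds at 2.0 μs
    ([0.0, 1.0, 2.0], [9.0, 7.5, 6.0]),            # never folds
]
mfpt, n_folded = mean_first_passage_time(toy_trajectories)
```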
Enhanced sampling has been extensively used to capture the conformational transitions in protein folding, but it has attracted much less attention in studies of protein-protein recognition. In this study, we evaluated the impact of enhanced sampling methods and solute dielectric constants on the overall accuracy of the molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA) and molecular mechanics/generalized Born surface area (MM/GBSA) approaches for protein-protein binding free energy calculations. Two widely used enhanced sampling methods, accelerated molecular dynamics (aMD) and Gaussian accelerated molecular dynamics (GaMD), and conventional molecular dynamics (cMD) simulations with two AMBER force fields (ff03 and ff14SB) were used to sample the conformations of 21 protein-protein complexes. The MM/PBSA and MM/GBSA results show that the standard MM/GBSA based on the cMD simulations yields the best Pearson correlation (rp = -0.523) between the predicted binding affinities and the experimental data, which is much stronger than that given by MM/PBSA (rp = -0.212). The two enhanced sampling methods (aMD and GaMD) are indeed more efficient for conformational sampling, but they did not improve the binding affinity predictions for protein-protein systems, suggesting that aMD or GaMD sampling (at least in short-timescale simulations) may not be a good choice for MM/PBSA and MM/GBSA predictions of protein-protein complexes. A solute dielectric constant of 1.0 is recommended for MM/GBSA, but a higher solute dielectric constant is recommended for MM/PBSA, especially for systems with higher polarity at the protein-protein binding interface. Finally, a preliminary assessment of MM/GBSA calculations based on a variable dielectric generalized Born (VDGB) model was conducted. The results highlight the potential of VDGB in free energy predictions for protein-protein systems, but more thorough studies are needed.
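The evaluation described above combines two simple calculations, sketched here under stated assumptions: an end-point binding free energy averaged over snapshots as G(complex) − G(receptor) − G(ligand), and the Pearson correlation between predicted and experimental values across complexes. The function names, the data layout, and the numbers are made up for illustration; the per-snapshot energies would in practice come from an MM/PBSA or MM/GBSA program, which is not reproduced here.

```python
import math


def binding_free_energy(frames):
    """Average of G_complex - G_receptor - G_ligand over snapshots (kcal/mol)."""
    if not frames:
        raise ValueError("no snapshots given")
    deltas = [g_com - g_rec - g_lig for g_com, g_rec, g_lig in frames]
    return sum(deltas) / len(deltas)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


# Made-up per-snapshot energies: (complex, receptor, ligand) in kcal/mol.
snapshots = [(-100.0, -60.0, -30.0), (-104.0, -61.0, -31.0)]
delta_g = binding_free_energy(snapshots)

# Made-up predicted vs. reference values for three complexes.
r_positive = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_negative = pearson_r([1.0, 2.0, 3.0], [6.0, 4.0, 2.0])
```

Because the Pearson coefficient measures only linear association, comparing methods by it says nothing about absolute errors in the predicted free energies.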
Binding of different ligands to glucocorticoid receptor (GR) may induce different conformational changes and even trigger completely opposite biological functions. To understand the allosteric communication within the GR ligand binding domain, the folding pathway of helix 12 (H12) induced by the binding of the agonist dexamethasone (DEX), antagonist RU486, and modulator AZD9567 is explored by molecular dynamics simulations and Markov state model analysis. The ligands can regulate the volume of the activation function‐2 through the residues Phe737 and Gln738. Without ligand or with agonist binding, H12 swings from inward to outward to visit different folding positions. However, the binding of RU486 or AZD9567 perturbs the structural state, and the passive antagonist state appears more stable. Structure‐based virtual screening and in vitro bioassays are used to discover novel GR ligands that bias the conformational equilibria toward the passive antagonist state. HP‐19 exhibits the best anti‐inflammatory activity (IC50 = 0.041 ± 0.011 µM) in the nuclear factor‐kappa B signaling pathway, which is comparable to that of DEX. HP‐19 also does not induce adverse effect‐related transactivation functions of GR. The novel ligands discovered here may serve as promising starting points for the development of GR modulators.
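The core step of a Markov state model analysis like the one mentioned above can be sketched as follows: count transitions between discrete conformational states at a fixed lag time, row-normalize the counts into a transition matrix, and obtain the equilibrium populations as its stationary distribution. This is a simplified illustration under assumptions. The two-state toy trajectory, the function names, and the plain row normalization (which does not enforce detailed balance, unlike the estimators in dedicated MSM packages) are not taken from the study.

```python
def transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition matrix estimated from a discrete trajectory."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i in range(len(dtraj) - lag):
        counts[dtraj[i]][dtraj[i + lag]] += 1
    matrix = []
    for state, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            raise ValueError(f"state {state} has no outgoing transitions")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(matrix, n_iter=1000):
    """Stationary distribution by repeated left-multiplication (power iteration)."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return pi


# Toy trajectory over two hypothetical states (e.g. 0 = inward, 1 = outward).
dtraj = [0, 0, 1, 1, 1, 0, 0, 1]
T = transition_matrix(dtraj, n_states=2, lag=1)
populations = stationary_distribution(T)
```

Comparing stationary populations estimated separately for each ligand-bound system is one way to quantify how a ligand shifts the conformational equilibrium between states.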