phenix.refine is a program within the PHENIX package that supports crystallographic structure refinement against experimental data with a wide range of upper resolution limits using a large repertoire of model parameterizations. It has several automation features and is also highly flexible. Several hundred parameters enable extensive customizations for complex use cases. Multiple user-defined refinement strategies can be applied to specific parts of the model in a single refinement run. An intuitive graphical user interface is available to guide novice users and to assist advanced users in managing refinement projects. X-ray or neutron diffraction data can be used separately or jointly in refinement. phenix.refine is tightly integrated into the PHENIX suite, where it serves as a critical component in automated model building, final structure refinement, structure validation and deposition to the wwPDB. This paper presents an overview of the major phenix.refine features, with extensive literature references for readers interested in more detailed discussions of the methods.
The RNA-binding protein insulin-like growth factor 2 mRNA-binding protein 1 (IMP1) is overexpressed in colorectal cancer (CRC); however, evidence for a direct role for IMP1 in CRC metastasis is lacking. IMP1 is regulated by let-7 microRNA, which binds in the 3' untranslated region (UTR) of the transcript. The availability of binding sites is in part controlled by alternative polyadenylation, which determines 3' UTR length. Expression of the short 3' UTR transcript (lacking all microRNA sites) results in higher protein levels and is correlated with increased proliferation. We used in vitro and in vivo model systems to test the hypothesis that the short 3' UTR isoform of IMP1 promotes CRC metastasis. Herein we demonstrate that 3' UTR shortening increases IMP1 protein expression and that this in turn enhances the metastatic burden to the liver, whereas expression of the long isoform (full-length 3' UTR) does not. Increased tumor burden results from elevated tumor surface area driven by cell proliferation and cell survival mechanisms. These processes are independent of classical apoptosis pathways. Moreover, we demonstrate that shifts toward the short isoform are associated with metastasis in patient populations where IMP1-long expression predominates. Overall, our work demonstrates that different IMP1 expression levels result in different functional outcomes in CRC metastasis and that targeting IMP1 may reduce tumor progression in some patients.
One of the great challenges in refining macromolecular crystal structures is a low data-to-parameter ratio. Historically, knowledge from chemistry has been used to help improve this ratio. When a macromolecule crystallizes with more than one copy in the asymmetric unit, the noncrystallographic symmetry (NCS) relationships can be exploited to provide additional restraints when refining the working model. However, although globally similar, NCS-related chains often have local differences. To allow for local differences between NCS-related molecules, flexible torsion-based NCS restraints have been introduced, coupled with intelligent rotamer handling for protein chains, and are available in phenix.refine for refinement of models at all resolutions.
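The idea of restraining NCS-related torsion angles while tolerating genuine local differences can be illustrated with a minimal sketch. The abstract does not give the functional form used in phenix.refine, so everything here is an assumption made for illustration: the function names, the harmonic penalty, the `sigma` and `cutoff` values, and the hard cutoff itself (a real implementation may well taper the restraint smoothly instead).

```python
def angle_delta(a_deg, b_deg):
    """Signed smallest difference a - b in degrees, wrapped into [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0


def torsion_ncs_residual(torsions_a, torsions_b, sigma=2.5, cutoff=15.0):
    """Harmonic penalty over matched torsions of two NCS-related chains.

    Hypothetical scheme: pairs whose torsions differ by more than `cutoff`
    degrees are treated as genuine local differences and left unrestrained.
    Returns (total penalty, number of restrained pairs).
    """
    total = 0.0
    restrained = 0
    for a, b in zip(torsions_a, torsions_b):
        d = angle_delta(a, b)
        if abs(d) > cutoff:
            continue  # locally different: do not force the copies together
        total += (d / sigma) ** 2
        restrained += 1
    return total, restrained


# Toy torsion angles (degrees) for three matched positions in two chains.
chain_a = [60.0, 179.0, -65.0]
chain_b = [62.0, -179.0, 175.0]
penalty, n_restrained = torsion_ncs_residual(chain_a, chain_b, sigma=2.0)
print(penalty, n_restrained)
```

The wrap-around in `angle_delta` matters because torsions are periodic: 179° and -179° differ by 2°, not 358°. The third pair differs by 120° and is therefore skipped rather than penalized.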
Study Design. Retrospective administrative claims database analysis. Objective. Identify distinct presurgery health care resource utilization (HCRU) patterns among posterior lumbar spinal fusion patients and quantify their association with postsurgery costs. Summary of Background Data. Presurgical HCRU may be predictive of postsurgical economic outcomes and help health care providers to identify patients who may benefit from innovation in care pathways and/or surgical approach. Methods. Privately insured patients who received one- to two-level posterior lumbar spinal fusion between 2007 and 2016 were identified from a claims database. Agglomerative hierarchical clustering (HC), an unsupervised machine learning technique, was used to cluster patients by presurgery HCRU across 90 resource categories. A generalized linear model was used to compare 2-year postoperative costs across clusters controlling for age, levels fused, spinal diagnosis, posterolateral/interbody approach, and Elixhauser Comorbidity Index. Results. Among 18,770 patients, 56.1% were female, mean age was 51.3, 79.4% had one-level fusion, and 89.6% had inpatient surgery. Three patient clusters were identified: Clust1 (n = 13,987 [74.5%]), Clust2 (n = 4270 [22.7%]), Clust3 (n = 513 [2.7%]). The largest between-cluster differences were found in mean days supplied for antidepressants (Clust1: 97.1 days, Clust2: 175.2 days, Clust3: 287.1 days), opioids (Clust1: 76.7 days, Clust2: 166.9 days, Clust3: 129.7 days), and anticonvulsants (Clust1: 35.1 days, Clust2: 67.8 days, Clust3: 98.7 days). For mean medical visits, the largest between-cluster differences were for behavioral health (Clust1: 0.14, Clust2: 0.88, Clust3: 16.3) and nonthoracolumbar office visits (Clust1: 7.8, Clust2: 13.4, Clust3: 13.8). 
Mean (95% confidence interval) adjusted 2-year all-cause postoperative costs were lower for Clust1 ($34,048 [$33,265–$34,84]) versus both Clust2 ($52,505 [$50,306–$54,800]) and Clust3 ($48,452 [$43,007–$54,790]), P < 0.0001. Conclusion. Distinct presurgery HCRU clusters were characterized by greater utilization of antidepressants, opioids, and behavioral health services and these clusters were associated with significantly higher 2-year postsurgical costs. Level of Evidence: 3
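The clustering step described in the Methods can be sketched in a few lines of standard-library Python. This is a toy illustration under stated assumptions, not the study's pipeline: the abstract does not say which linkage criterion or distance metric was used, so average linkage with Euclidean distance is assumed here, and the two-feature "patients" below are invented values rather than study data.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def average_linkage(points, c1, c2):
    """Mean pairwise distance between members of two clusters (index lists)."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative_clusters(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with singletons
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(points, clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Invented two-feature vectors, e.g. (days supplied, visit count).
toy_patients = [(10.0, 0.0), (13.0, 0.1), (50.0, 1.0),
                (54.0, 4.0), (90.0, 10.0), (93.0, 11.0)]
print(agglomerative_clusters(toy_patients, 3))
```

With 90 resource categories on very different scales, features would normally be standardized before computing distances. This brute-force loop is only practical for small inputs, so a cohort of this size would call for an optimized library implementation.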
The fields of structural biology and soft matter have independently sought out fundamental principles to rationalize protein crystallization. Yet the conceptual differences and the limited overlap between the two disciplines have thus far prevented a comprehensive understanding of the phenomenon from emerging. We conduct a computational study of proteins from the rubredoxin family that bridges the two fields. Using atomistic simulations, we characterize their crystal contacts, and accordingly parameterize patchy particle models. Comparing the phase diagrams of these schematic models with experimental results enables us to critically examine the assumptions behind the two approaches. The study also reveals features of protein-protein interactions that can be leveraged to crystallize proteins more generally.
Refinement of macromolecular structures against low-resolution crystallographic data is limited by the ability of current methods to arrive at a high-quality structure with realistic geometry. We have developed a new method for crystallographic refinement that combines the Rosetta sampling methodology and all-atom energy function with likelihood-based reciprocal-space refinement in Phenix. On a test set of difficult low-resolution refinement cases, we find that models refined with the new method have significantly improved model geometry and, in most cases, lower free R factors and RMS deviation to the final model. Integration of the software packages additionally makes advanced sampling methods used in structure prediction and design available for crystallographic refinement and model-building, and also provides a strategy for improving the Rosetta force field for better agreement with experimental data.
A consensus classification and nomenclature are defined for RNA backbone structure using all of the backbone torsion angles. By a consensus of several independent analysis methods, 46 discrete conformers are identified as suitably clustered in a quality-filtered, multidimensional dihedral angle distribution. Most of these conformers represent identifiable features or roles within RNA structures. The conformers are given two-character names that reflect the seven-angle δεζαβγδ combinations empirically found favorable for the sugar-to-sugar "suite" unit within which the angle correlations are strongest (e.g., 1a for A-form, 5z for the start of S-motifs). Since the half-nucleotides are specified by a number for δεζ and a lowercase letter for αβγδ, this modular system can also be parsed to describe traditional nucleotide units (e.g., a1) or the dinucleotides (e.g., a1a1) that are especially useful at the level of crystallographic map fitting. This nomenclature can also be written as a string with two-character suite names between the uppercase letters of the base sequence (N1aG1gN1aR1aA1cN1a for a GNRA tetraloop), facilitating bioinformatic comparisons. Cluster means, standard deviations, coordinates, and examples are made available, as well as the Suitename software that assigns suite conformer names and conformer match quality (suiteness) from atomic coordinates. The RNA Ontology Consortium will combine this new backbone system with others that define base pairs, base-stacking, and hydrogen-bond relationships to provide a full description of RNA structural motifs.
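The string form of the nomenclature lends itself to simple parsing, as the following minimal sketch suggests. It is not part of the Suitename software. The function names are hypothetical, and the token pattern assumes only what the abstract states: uppercase base letters, and suite names made of a digit followed by a lowercase letter. Any other suite symbols the full nomenclature may use are not handled.

```python
import re

# One uppercase base letter, or a two-character suite name (digit + letter).
TOKEN = re.compile(r"[A-Z]|[0-9][a-z]")


def parse_suite_string(s):
    """Split a string such as 'N1aG1g...' into (bases, suite names)."""
    tokens = TOKEN.findall(s)
    if "".join(tokens) != s:
        raise ValueError("unrecognized characters in suite string: %r" % s)
    bases = [t for t in tokens if len(t) == 1]
    suites = [t for t in tokens if len(t) == 2]
    return bases, suites


def nucleotide_units(suites):
    """Re-pair half-nucleotides into nucleotide-style names such as 'a1'.

    The letter of one suite is joined with the number of the next suite,
    following the abstract's description of the letter and number halves.
    """
    return [left[1] + right[0] for left, right in zip(suites, suites[1:])]


bases, suites = parse_suite_string("N1aG1gN1aR1aA1cN1a")
print(bases)                     # ['N', 'G', 'N', 'R', 'A', 'N']
print(suites)                    # ['1a', '1g', '1a', '1a', '1c', '1a']
print(nucleotide_units(suites))  # ['a1', 'g1', 'a1', 'a1', 'c1']
```

Because each suite name is a fixed two characters, the sequence and the conformer string stay aligned position by position, which is what makes such strings easy to compare or search with ordinary text tools.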