In this article, we present (1) a feature selection algorithm based on nonlinear support vector machines (SVMs) for fault detection and diagnosis in continuous processes and (2) results for the Tennessee Eastman benchmark process. The feature selection algorithm is derived from a sensitivity analysis of the dual C-SVM objective function. This enables simultaneous modeling and feature selection, paving the way for simultaneous fault detection and diagnosis, where feature ranking guides fault diagnosis. We train fault-specific two-class SVM models to detect faulty operation, and we use the feature selection algorithm both to improve accuracy and to perform fault diagnosis. Our results show that the developed SVM models outperform those available in the literature in terms of both detection accuracy and latency. Moreover, we show that feature selection loses less information than feature extraction techniques such as principal component analysis (PCA), which facilitates a more accurate interpretation of the results.
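The idea of ranking features by the sensitivity of the dual C-SVM objective can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a Gaussian kernel, keeps the dual variables fixed, and scores each feature by how much the dual objective changes when that feature is dropped from the kernel (a removal-based criterion in the spirit of recursive feature elimination). The names `rbf`, `dual_objective`, and `rank_features`, the dual variables `alpha`, and the toy data are all hypothetical.

```python
import math

def rbf(u, v, gamma, skip=None):
    """Gaussian kernel exp(-gamma * ||u - v||^2), optionally ignoring one feature index."""
    sq = sum((a - b) ** 2 for k, (a, b) in enumerate(zip(u, v)) if k != skip)
    return math.exp(-gamma * sq)

def dual_objective(alpha, y, X, gamma, skip=None):
    """Dual C-SVM objective: sum_i alpha_i - 0.5 * sum_ij alpha_i alpha_j y_i y_j K(x_i, x_j)."""
    n = len(X)
    quad = sum(
        alpha[i] * alpha[j] * y[i] * y[j] * rbf(X[i], X[j], gamma, skip)
        for i in range(n)
        for j in range(n)
    )
    return sum(alpha) - 0.5 * quad

def rank_features(alpha, y, X, gamma):
    """Order feature indices from least to most influential on the dual objective."""
    base = dual_objective(alpha, y, X, gamma)
    change = [
        abs(dual_objective(alpha, y, X, gamma, skip=k) - base)
        for k in range(len(X[0]))
    ]
    return sorted(range(len(change)), key=lambda k: change[k]), change

# Toy data (hypothetical): feature 0 separates the classes, feature 1 is constant.
X = [[0.0, 5.0], [0.1, 5.0], [1.0, 5.0], [1.1, 5.0]]
y = [1, 1, -1, -1]
alpha = [1.0, 1.0, 1.0, 1.0]  # assumed dual variables, not the result of training
order, change = rank_features(alpha, y, X, gamma=1.0)
```

In this toy case, dropping the constant feature leaves the objective unchanged, so it is ranked as the first candidate for elimination. In a real workflow the dual variables would come from a trained SVM and would be re-estimated after each elimination step.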
Every two years, groups worldwide participate in the Critical Assessment of Protein Structure Prediction (CASP) experiment to blindly test the strengths and weaknesses of their computational methods. CASP has significantly advanced the field, but many hurdles remain, which may require new ideas and collaborations. In 2012, a web-based effort called WeFold was initiated to promote collaboration within the CASP community and to attract researchers from other fields to contribute new ideas to CASP. Members of the WeFold coopetition (cooperation and competition) participated in CASP as individual teams, but also shared components of their methods to create hybrid pipelines and actively contributed to this effort. We assert that the scale and diversity of integrative prediction pipelines could not have been achieved by any individual lab or even by any collaboration among a few partners. The models contributed by the participating groups and generated by the pipelines are publicly available at the WeFold website, providing a wealth of data that remains to be tapped. Here, we analyze the results of the 2014 and 2016 pipelines, showing improvements according to the CASP assessment as well as areas that require further adjustments and research.
Maintenance can improve the availability of aging production systems and prevent process safety incidents. However, because of system complexity, resource allocation is nontrivial. This research developed and applied a framework to obtain optimal production and maintenance schedules that are aware of future failures and conscious of safety. Ensembles of nonlinear support vector machine classification models were leveraged to predict the time and probability of future equipment failure from equipment condition data. Multiobjective optimization of expected profit and a safety metric was then used to determine optimal process and maintenance schedules. The ensemble models achieved an average accuracy and F1-score of 0.987 and were more accurate and sensitive than the individual classifiers by 3 percentage points. Pareto-optimal process and maintenance schedules were obtained, providing alternative solutions to the decision maker. This research described optimal resource allocation to help improve safety and system effectiveness.
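The notion of Pareto-optimal schedules can be made concrete with a small sketch. It assumes that each candidate schedule has already been scored with an expected profit and a safety metric, and that both are to be maximized (the direction of the safety metric is an assumption here). The function name `pareto_front` and the candidate values are hypothetical and are not taken from the study.

```python
def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates.

    Each candidate is (name, profit, safety); both objectives are maximized.
    A candidate is dominated if another one is at least as good in both
    objectives and strictly better in at least one.
    """
    front = []
    for name, profit, safety in candidates:
        dominated = any(
            p >= profit and s >= safety and (p > profit or s > safety)
            for _, p, s in candidates
        )
        if not dominated:
            front.append((name, profit, safety))
    return front

# Hypothetical schedules: (label, expected profit, safety metric).
schedules = [
    ("A", 100.0, 0.90),
    ("B", 120.0, 0.80),
    ("C", 90.0, 0.85),   # dominated by A
    ("D", 120.0, 0.70),  # dominated by B
]
front = pareto_front(schedules)
```

Schedules A and B remain on the front: neither is better than the other in both objectives, which is why the decision maker is left with alternatives rather than a single optimum.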
HIV-1 entry into host cells is mediated by interactions between the V3-loop of viral glycoprotein gp120 and chemokine receptor CCR5 or CXCR4, collectively known as HIV-1 coreceptors. Accurate genotypic prediction of coreceptor usage is of significant clinical interest, and determination of the factors driving tropism has been the focus of extensive study. We have developed a method based on nonlinear support vector machines to elucidate the interacting residue pairs driving coreceptor usage and to provide highly accurate coreceptor usage predictions. Our models utilize centroid-centroid interaction energies from computationally derived structures of the V3-loop:coreceptor complexes as primary features, while additional features based on established rules regarding V3-loop sequences are also investigated. We tested our method on 2455 V3-loop sequences of various lengths and subtypes and obtained a median area under the receiver operating characteristic curve of 0.977 based on 500 runs of 10-fold cross-validation. Our study is the first to elucidate a small set of specific interacting residue pairs between the V3-loop and coreceptors capable of predicting coreceptor usage with high accuracy across major HIV-1 subtypes. The developed method has been implemented as a web tool named CRUSH, CoReceptor USage prediction for HIV-1, which is available at http://ares.tamu.edu/CRUSH/.
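A summary statistic of the kind reported above, the median area under the ROC curve over repeated runs, can be sketched as follows. The AUC is computed from its rank interpretation (the probability that a randomly chosen positive scores higher than a randomly chosen negative). The function name `roc_auc`, the labels, the scores, and the per-run values are hypothetical illustrations, not data from the study.

```python
from statistics import median

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels are 1 (positive) or 0 (negative); ties between scores count as 0.5.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# One hypothetical evaluation: 3 of the 4 positive/negative pairs are ordered correctly.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])

# Hypothetical AUC values from three repeated runs, summarized by their median.
run_aucs = [auc, 1.0, 0.5]
median_auc = median(run_aucs)
```

Reporting the median over many repetitions makes the estimate less sensitive to how a particular random split happens to partition the sequences.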
Conjugation of Nedd8 (neddylation) to Cullins (Cul) in Cul-RING E3 ligases (CRLs) stimulates ubiquitination and polyubiquitination of protein substrates. CRL is made up of two Cul-flanked arms: one consists of the substrate-binding and adaptor proteins and the other consists of E2 and Ring-box protein (Rbx). Polyubiquitin chain length and topology determine the substrate fate. Here, we ask how polyubiquitin chains are accommodated in the limited space available between the two arms and what determines the polyubiquitin linkage topology. We focus on Cul5 and Rbx1 in three states: before Cul5 neddylation (closed state), after neddylation (open state), and after deneddylation, exploiting molecular dynamics simulations and the Gaussian Network Model. We observe that regulation of substrate ubiquitination and polyubiquitination takes place through Rbx1 rotations, which are controlled by Nedd8–Rbx1 allosteric communication. Allosteric propagation proceeds from Nedd8 via Cul5 dynamic hinges and hydrogen bonds between the C-terminal domain of Cul5 (Cul5CTD) and Rbx1 (Cul5CTD residues R538/R569 and Rbx1 residue E67, or Cul5CTD E474/E478/N491 and Rbx1 K105). Importantly, at each ubiquitination step (homogeneous or heterogeneous, linear or branched), the polyubiquitin linkages fit into the distances between the two arms, and these match the inherent CRL conformational tendencies. Hinge sites may constitute drug targets.
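For readers unfamiliar with the Gaussian Network Model, the following minimal sketch builds its connectivity (Kirchhoff) matrix from a set of node coordinates. It assumes one node per residue and a distance cutoff of 7.0 Å, a commonly used value rather than one stated above. The function name `kirchhoff` and the toy coordinates are hypothetical.

```python
import math

def kirchhoff(coords, cutoff=7.0):
    """Build the GNM Kirchhoff matrix for nodes at the given 3D coordinates.

    Off-diagonal entries are -1 for node pairs within the cutoff and 0 otherwise;
    each diagonal entry is the number of contacts of that node.
    """
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma

# Toy chain of four nodes spaced 3.8 Å apart along the x-axis (hypothetical).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
gamma = kirchhoff(coords)
```

With this spacing only consecutive nodes are in contact, so the matrix is tridiagonal and every row sums to zero. In a full analysis the slowest nonzero eigenmodes of this matrix would be examined, with hinge sites typically located where a slow-mode eigenvector changes sign. That step requires an eigenvalue solver and is omitted here.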