There is a growing body of work seeking to replicate the success of machine learning (ML) in domains like computer vision (CV) and natural language processing (NLP) in applications involving biophysical data. One of the key ingredients of prior successes in CV and NLP was the broad acceptance of difficult benchmarks that distilled key subproblems into approachable tasks that any junior researcher could investigate, but good benchmarks for biophysical domains are rare. This scarcity is partially due to a narrow focus on benchmarks which simulate biophysical data; we propose instead to carefully abstract biophysical problems into simpler ones with key geometric similarities. In particular, we propose a new class of closed-form test functions for biophysical sequence optimization, which we call Ehrlich functions. We provide empirical results demonstrating that these functions are interesting objects of study and can be non-trivial to solve with a standard genetic optimization baseline.
The general secretory or Sec system is the major route of export for proteins from the cytosol of Escherichia coli. The pathway through the membrane is provided by the translocon SecYEG. At the membrane, the ATPase SecA binds SecYEG and drives protein translocation. In this work we shed light on the Sec system using atomic force microscopy (AFM), a single-molecule technique that is well suited for studying membrane proteins in near-native conditions. AFM images provided direct visualizations of the dynamic loops of SecYEG, which orchestrate critical translocon functions, and allowed determination of the oligomeric state of SecYEG in lipid bilayers. Additionally, we studied structure‐function relationships of SecA‐SecYEG complexes. Our results show that SecA binds SecYEG in liposomes in two distinct modes. When SecA was added extraneously to liposomes containing SecYEG, the proteins showed a wide distribution of heights with no prominent peaks, indicating that SecA populates multiple binding states. In contrast to this broad distribution of heights, the sample of proteoliposomes in which SecY and SecA were co‐assembled had a predominant species centered at 40 Å in height above the lipid bilayer. Because these proteoliposomes contain approximately sixfold more active translocons than do the proteoliposomes to which SecA was added after assembly, we conclude that this structural state represents SecA bound to translocons that have been rendered active. Taken together, this work provides new functional and structural insights into the Sec system. More generally, the assays developed here are adaptable to other membrane protein systems. Grant Funding Source: NSF CAREER Award, Burroughs Wellcome Fund Career Award, NIH
We use density functional theory calculations to study a group of 2D materials known as MXenes toward the electrochemical nitrogen reduction reaction (NRR) to ammonia. So far, all computational studies have only considered the NRR chemistry on unfunctionalized (bare) MXenes. In this study, we investigate a total of 65 bare and functionalized MXenes. We establish free energy diagrams for the NRR on the basal planes of 55 different M2XTx MXenes (M = Ti, V, Zr, Nb, Mo, Ta, W; X = C, N) to span a large variety of possible chemistries. Energy trends with respect to the metal as well as nonmetal constituent of the MXenes are established for both bare and functionalized MXenes. We determine the limiting potentials and find that the potential limiting reaction step is the formation of NH3 from *NH2 for bare MXenes and the formation of *N2H for functionalized MXenes. We find several Mo-, W-, and V-based MXenes (Mo2C, Mo2N, W2N, W2NH2, and V2N) to have suitable theoretical overpotentials for the NRR. Importantly, however, calculated Pourbaix stability diagrams combined with selectivity analysis reveal that none of the bare MXenes are stable under relevant NRR operating conditions. The only functionalized MXene with the three minimum required properties, namely (i) having a low theoretical overpotential, (ii) being stable under NRR conditions, and (iii) having selectivity toward NRR rather than the parasitic HER, is W2CH2, which is an H-terminated MXene. Finally, on the basis of our findings, we explore other routes for improving the NRR chemistry by studying 10 additional MXenes with the chemical formula M3X2Tx and MXenes with other functional groups (Tx = S, F, Cl). This opens up a larger variety and tunability of MXenes to be considered for the NRR.
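As an illustration of how a limiting potential is read off a free energy diagram, the following is a minimal sketch assuming the standard computational hydrogen electrode convention, in which the limiting potential is the negative of the largest step free energy among the proton-electron transfer steps. The function names, the step energies, and the equilibrium potential passed in are hypothetical placeholders, not values from the study.

```python
def limiting_potential(step_free_energies_ev):
    """Return (U_L in volts, index of the potential limiting step).

    step_free_energies_ev: reaction free energy (eV) of each
    proton-electron transfer step at 0 V. Assumes one electron per
    step, so an energy in eV maps directly onto a potential in V.
    """
    if not step_free_energies_ev:
        raise ValueError("at least one reaction step is required")
    # The most uphill step sets the potential needed to make all steps downhill.
    worst = max(range(len(step_free_energies_ev)),
                key=lambda i: step_free_energies_ev[i])
    return -step_free_energies_ev[worst], worst


def theoretical_overpotential(u_limiting, u_equilibrium):
    """Overpotential as the gap between equilibrium and limiting potential."""
    return u_equilibrium - u_limiting


# Hypothetical six-step pathway (illustrative numbers only).
steps = [0.9, -0.3, 0.2, -0.5, -0.1, 0.4]
u_l, idx = limiting_potential(steps)
# The equilibrium potential is a caller-supplied assumption here.
eta = theoretical_overpotential(u_l, u_equilibrium=0.0)
```

In this toy pathway the first step is the most uphill, so it is the potential limiting step.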
We propose a framework using normalizing-flow based models, SELF-Referencing Embedded Strings, and multi-objective optimization that efficiently generates small molecules. With an initial training set of only 100 small molecules, FastFlows generates thousands of chemically valid molecules in seconds. Because of the efficient sampling, substructure filters can be applied as desired to eliminate compounds with unreasonable moieties. Using easily computable and learned metrics for druglikeness, synthetic accessibility, and synthetic complexity, we perform a multi-objective optimization to demonstrate how FastFlows functions in a high-throughput virtual screening context. Our model is significantly simpler and easier to train than autoregressive molecular generative models, and enables fast generation and identification of druglike, synthesizable molecules.
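One common way to carry out a multi-objective selection over generated molecules is to keep only the non-dominated (Pareto-optimal) candidates. The sketch below illustrates that idea; it is not necessarily the procedure used by FastFlows, and the score tuples are hypothetical. All objectives are assumed to be oriented so that larger is better (a cost-like score can be negated first).

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(scores):
    """Return indices of candidates that no other candidate dominates."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (druglikeness, negated synthetic difficulty) pairs.
scores = [(0.9, -2.0), (0.8, -1.5), (0.7, -3.0), (0.9, -2.5)]
front = pareto_front(scores)
```

Here candidates 2 and 3 are both dominated by candidate 0, so only candidates 0 and 1 remain on the front.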
The growth of business firms is an example of a system of complex interacting units that resembles complex interacting systems in nature such as earthquakes. Remarkably, work in econophysics has provided evidence that the statistical properties of the growth of business firms follow the same sorts of power laws that characterize physical systems near their critical points. Given how economies change over time, whether these statistical properties are persistent, robust, and universal like those of physical systems remains an open question. Here, we show that the scaling properties of firm growth previously demonstrated for publicly-traded U.S. manufacturing firms from 1974 to 1993 apply to the same sorts of firms from 1993 to 2015, to firms in other broad sectors (such as materials), and to firms in new sectors (such as Internet services). We measure virtually the same scaling exponent for manufacturing for the 1993 to 2015 period as for the 1974 to 1993 period and virtually the same scaling exponent for other sectors as for manufacturing. Furthermore, we show that fluctuations of the growth rate for new industries self-organize into a power law over relatively short time scales.
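A scaling exponent of this kind is typically estimated by a straight-line fit in log-log space. The sketch below assumes a relation of the form sigma(S) proportional to S^(-beta) between firm size S and the standard deviation of growth rates sigma; the abstract does not spell out the functional form, so this is an assumption, and the data are synthetic.

```python
import math


def fit_scaling_exponent(sizes, sigmas):
    """Estimate beta in sigma ~ S**(-beta) by least squares on log-log data."""
    if len(sizes) != len(sigmas) or len(sizes) < 2:
        raise ValueError("need at least two matching (size, sigma) pairs")
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in sigmas]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The log-log slope is -beta, so flip the sign.
    return -sxy / sxx


# Synthetic data generated with an exponent of 0.2 (illustrative only).
sizes = [10.0 ** k for k in range(1, 6)]
sigmas = [s ** -0.2 for s in sizes]
beta = fit_scaling_exponent(sizes, sigmas)
```

On real data the sizes would usually be binned first and sigma computed per bin, which this sketch leaves to the caller.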
In molecular discovery and drug design, structure-property relationships and activity landscapes are often qualitatively or quantitatively analyzed to guide the navigation of chemical space. The roughness (or smoothness) of these molecular property landscapes is one of their most studied geometric attributes, as it can characterize the presence of activity cliffs, with rougher landscapes generally expected to pose tougher optimization challenges. Here, we introduce a general, quantitative measure for describing the roughness of molecular property landscapes. The proposed roughness index (ROGI) is loosely inspired by the concept of fractal dimension and strongly correlates with the out-of-sample error achieved by machine learning models on numerous regression tasks.
We propose an automatic approach that extracts editing styles in a source video and applies the edits to matched footage for video creation. Our computer-vision-based techniques consider the framing, content type, playback speed, and lighting of each input video segment. By applying a combination of these features, we demonstrate an effective method that automatically transfers the visual and temporal styles from professionally edited videos to unseen raw footage. We evaluated our approach with real-world videos that contained a total of 3872 video shots in a variety of editing styles, including different subjects, camera motions, and lighting. We report feedback from survey participants who reviewed a set of our results.