Predicting prognosis in patients from large-scale genomic data is a fundamentally challenging problem in genomic medicine. However, the prognosis remains poor in many diseases. The poor prognosis may be caused by the high complexity of biological systems, where multiple biological components and their hierarchical relationships are involved. Moreover, it is challenging to develop robust computational solutions with high-dimension, low-sample size data. In this study, we propose a Pathway-Associated Sparse Deep Neural Network (PASNet) that not only predicts patients' prognoses but also describes complex biological processes regarding biological pathways for prognosis. PASNet models a multilayered, hierarchical biological system of genes and pathways to predict clinical outcomes by leveraging deep learning. The sparse solution of PASNet provides the model interpretability that most conventional fully-connected neural networks lack. We applied PASNet for long-term survival prediction in Glioblastoma multiforme (GBM), which is a primary brain cancer that shows poor prognostic performance. The predictive performance of PASNet was evaluated with multiple cross-validation experiments. PASNet showed a higher Area Under the Curve (AUC) and F1-score than previous long-term survival prediction classifiers, and the significance of PASNet's performance was assessed by the Wilcoxon signed-rank test. Furthermore, the biological pathways identified by PASNet have been reported as significant pathways in GBM in previous biological and medical research. PASNet can describe the different biological systems of clinical outcomes for prognostic prediction as well as predict prognosis more accurately than the current state-of-the-art methods. To the best of our knowledge, PASNet is the first pathway-based deep neural network that models hierarchical representations of genes and pathways and their nonlinear effects.
Additionally, PASNet would be promising due to its flexible model representation and interpretability, embodying the strengths of deep learning. The open-source code of PASNet is available at https://github.com/DataX-JieHao/PASNet .
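The idea of a sparse gene-to-pathway layer can be illustrated with a small sketch. This is not the PASNet implementation from the repository above; it is a toy example that assumes a binary membership mask (which gene belongs to which pathway) and a ReLU activation, and all names and numbers (`pathway_layer`, `mask`, `weights`, `genes`) are hypothetical.

```python
# Toy sketch of a pathway-masked layer. Assumptions (not from the abstract):
# a binary gene-pathway membership mask and a ReLU activation.

def pathway_layer(gene_values, weights, mask):
    """Compute one score per pathway from gene-level inputs.

    weights[p][g] is only used when mask[p][g] == 1, so each pathway node
    depends solely on the genes annotated to that pathway.
    """
    outputs = []
    for w_row, m_row in zip(weights, mask):
        # Masked weighted sum: genes outside the pathway contribute nothing.
        total = sum(w * m * x for w, m, x in zip(w_row, m_row, gene_values))
        outputs.append(max(0.0, total))  # ReLU (assumed activation)
    return outputs


# Hypothetical data: 4 genes, 2 pathways.
mask = [[1, 1, 0, 0],   # pathway 0 contains genes 0 and 1
        [0, 0, 1, 1]]   # pathway 1 contains genes 2 and 3
weights = [[0.5, -0.2, 0.9, 0.1],
           [0.3, 0.3, 0.4, -0.6]]
genes = [1.0, 2.0, 3.0, 4.0]

pathway_scores = pathway_layer(genes, weights, mask)
```

Because masked weights never reach the output, each pathway score can be traced back to a known set of genes, which is the kind of interpretability a fully-connected layer does not offer.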
Abstract Background: The chicken gut microbiota, as a reservoir of antibiotic resistance genes (ARGs), poses a high risk to humans and animals worldwide. Yet a comprehensive exploration of the chicken gut antibiotic resistomes remains incomplete. Results: In this study, we established the largest chicken gut resistance gene catalogue to date through metagenomic analysis of 629 chicken gut samples. We found a significantly higher abundance of ARGs in the Chinese chicken gut than in the European chicken gut. tetX, mcr, and blaNDM, genes conferring resistance to antibiotics of last resort for human and animal health, were frequently detected in the Chinese chicken gut. The abundance of ARGs was linearly correlated with that of mobile genetic elements (MGEs). The host-tracking analysis identified Escherichia, Enterococcus, Staphylococcus, Klebsiella, and Lactobacillus as the major ARG hosts. Notably, Lactobacillus, an intestinal probiotic, carried multiple drug resistance genes, and was proportional to ISLhe63, highlighting its potential risk in agricultural production processes. Conclusions: We first established a reference gene catalogue of chicken gut antibiotic resistomes. Our study helps to improve the knowledge and understanding of chicken antibiotic resistomes for knowledge-based sustainable chicken meat production.
The chicken gut microbiota, as a reservoir of antibiotic resistance genes (ARGs), poses a high risk to humans and animals worldwide. Yet a comprehensive exploration of the chicken gut antibiotic resistomes remains incomplete. In this study, we established the largest chicken gut resistance gene catalogue to date through metagenomic analysis of 629 chicken gut samples. We found a significantly higher abundance of ARGs in the Chinese chicken gut than in the European chicken gut. tetX, mcr, and blaNDM, genes conferring resistance to antibiotics of last resort for human and animal health, were detected in the Chinese chicken gut. The abundance of ARGs was linearly correlated with that of mobile genetic elements (MGEs). The host-tracking analysis identified Escherichia, Enterococcus, Staphylococcus, Klebsiella, and Lactobacillus as the major ARG hosts. Notably, Lactobacillus, an intestinal probiotic, carried multiple drug resistance genes, and was proportional to ISLhe63, highlighting its potential risk in agricultural production processes. We first established a reference gene catalogue of chicken gut antibiotic resistomes. Our study helps to improve the knowledge and understanding of chicken antibiotic resistomes for knowledge-based sustainable chicken meat production. IMPORTANCE The prevalence of antibiotic resistance genes in the chicken gut environment poses a serious threat to human health; however, we lack a comprehensive exploration of antibiotic resistomes and microbiomes in the chicken gut environment. The results of this study demonstrate the diversity and abundance of antibiotic resistance genes and flora in the chicken gut environment and identify a variety of potential hosts carrying antibiotic resistance genes. Further analysis showed that the abundance of mobile genetic elements was linearly correlated with that of antibiotic resistance genes, implying that we should pay attention to the role played by mobile genetic elements in the transmission of antibiotic resistance genes.
We established a reference genome of gut antibiotic resistance genes in chickens, which will help to rationalize the use of drugs in poultry farming.
Abstract The differentiable architecture search (DARTS) approach has made great progress in reducing the computational costs of neural architecture search. DARTS tries to discover an optimal architecture module, called a cell, from a predefined super network. However, the obtained cell is then repeatedly and simply stacked to build a target network, failing to extract layered features hidden in different network depths. Therefore, this target network cannot meet the requirements of practical applications. To address this problem, we propose an effective approach called Layered Feature Representation for Differentiable Architecture Search (LFR-DARTS). Specifically, we iteratively search for multiple cells with different architectures from shallow to deep layers of the super network. In each iteration, we optimize the architecture of a cell by gradient descent and prune weak connections from this cell. After obtaining the optimal architecture of this cell, we deepen the super network by increasing the number of copies of this cell, so as to create an adaptive network context in which to search for a deeper-adaptive cell in the next iteration. Thus, LFR-DARTS can discover the architecture of each cell at a specific and adaptive network depth, which embeds the ability of layered feature representation into each cell so that it can sufficiently extract layered features at different depths. Extensive experiments show that our algorithm achieves advanced performance on the CIFAR10, fashionMNIST and ImageNet datasets at low search cost.
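The step of pruning weak connections from a cell can be sketched in a few lines. The abstract does not state the pruning criterion used by LFR-DARTS, so the sketch below assumes a common DARTS-style rule: each edge holds one learned architecture parameter per candidate operation, the parameters are turned into weights with a softmax, and only the edges whose strongest operation has the highest weight are kept. The function names, edge names, operation names, and numbers are all hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def prune_cell(alphas, keep):
    """Keep the `keep` strongest edges and the best operation on each.

    alphas maps an edge (source, target) to {operation name: architecture
    parameter}. Edge strength is the largest softmax weight on that edge
    (an assumed criterion, not taken from the abstract).
    """
    scored = []
    for edge, params in alphas.items():
        names = list(params)
        probs = softmax([params[name] for name in names])
        best = max(range(len(names)), key=lambda i: probs[i])
        scored.append((probs[best], edge, names[best]))
    scored.sort(key=lambda item: item[0], reverse=True)
    return {edge: op for _, edge, op in scored[:keep]}


# Hypothetical architecture parameters for three edges of one cell.
alphas = {
    ("in0", "n0"): {"sep_conv_3x3": 1.2, "skip_connect": 0.1, "max_pool_3x3": -0.5},
    ("in1", "n0"): {"sep_conv_3x3": 0.0, "skip_connect": 0.2, "max_pool_3x3": 0.1},
    ("in0", "n1"): {"sep_conv_3x3": -0.3, "skip_connect": 0.4, "max_pool_3x3": 2.0},
}

pruned = prune_cell(alphas, keep=2)
```

In this toy cell the edge from `in1` to `n0` has nearly uniform operation weights, so it is the one that gets pruned.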
Cancer is a heterogeneous disease which has several subtypes that can be distinguished by molecular, histopathological, and clinical stages. Accurate diagnosis of cancer subtypes is vital to identify distinct disease states and develop effective personalized therapies. A number of unsupervised machine learning techniques have been applied to genomic data of tumor samples, where clusters of patients were formed to be associated with a clinical outcome such as the survival of patients. However, clustering methods based on distance (or similarity) between data often fail to cluster biological data, due to their nonlinearity. In this paper, we develop a PAthway-based Sparse deep CLustering (PASCL) method for the identification of cancer subtypes. PASCL incorporates prior biological knowledge from pathway databases to build a robust and biologically interpretable model. We evaluated the performance of PASCL by comparing it with several state-of-the-art clustering methods. PASCL outperformed the benchmarking methods with the lowest p-value in the logrank test, and its outstanding performance was statistically assessed. PASCL provides a solution not only to effectively identify subtypes using high-dimensional nonlinear genomic data, but also to biologically interpret the model at a pathway level.
Antimicrobial resistance has become a global problem that poses great threats to human health. Antimicrobials are widely used in broiler chicken production and consequently affect their gut microbiota and resistome. To better understand how continuous antimicrobial use in farm animals alters their microbial ecology, we used a metagenomic approach to investigate the effects of pulsed antimicrobial administration on the bacterial community, antibiotic resistance genes (ARGs), and their bacterial hosts in the feces of broiler chickens. Chickens received three 5-day courses of antimicrobials, alone or combined, including amoxicillin, chlortetracycline, and florfenicol. Florfenicol administration significantly increased the abundance of the mcr-1 gene along with the floR gene, while amoxicillin significantly increased the abundance of genes encoding the AcrAB-tolC multidrug efflux pump (marA, soxS, sdiA, rob, evgS and phoP). All three antimicrobials led to an increase in Proteobacteria. The increase in the ARG host Escherichia was mainly attributed to the β-lactam, chloramphenicol, and tetracycline resistance genes harbored by Escherichia under the pulsed antimicrobial treatment. These results indicated that pulsed antimicrobial administration with amoxicillin, chlortetracycline, florfenicol, or their mixture significantly increased the abundance of Proteobacteria and raised the abundance of particular ARGs. The ARG types were dominated by multidrug resistance genes and correlated significantly with the total ARGs in the antimicrobial-treated groups. We provide comprehensive insight into pulsed antimicrobial-mediated alteration of the chicken fecal microbiota and resistome.
An in-depth understanding of complex biological processes associated with patients' survival time at the cellular and molecular level is critical not only for developing new treatments for patients but also for accurate survival prediction. However, highly nonlinear and high-dimension, low-sample size (HDLSS) data cause computational challenges in survival analysis. We developed a novel pathway-based, sparse deep neural network, called Cox-PASNet, for survival analysis by integrating high-dimensional gene expression data and clinical data. Cox-PASNet is a biologically interpretable neural network model where nodes in the network correspond to specific genes and pathways, while capturing nonlinear and hierarchical effects of biological pathways on a patient's survival. We also provide a solution to train the deep neural network model with HDLSS data. Cox-PASNet was evaluated by comparing its performance with that of different cutting-edge survival methods such as Cox-nnet, SurvivalNet, and Cox elastic net (Cox-EN). Cox-PASNet significantly outperformed the benchmarking methods, and the outstanding performance was statistically assessed. We provide open-source software implemented in PyTorch (https://github.com/DataX-JieHao/Cox-PASNet) that enables automatic training, evaluation, and interpretation of Cox-PASNet.
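Cox-based survival networks are typically trained by minimizing the negative log partial likelihood of the Cox proportional hazards model. The sketch below is a plain-Python illustration of that general objective under simplifying assumptions (no special handling of tied event times, loss averaged over observed events); it is not the loss code from the Cox-PASNet repository, and the function name and example values are hypothetical.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Average negative log partial likelihood of a Cox model.

    risk_scores: model outputs (log hazard ratios), one per patient
    times:       observed survival or censoring times
    events:      1 if the event was observed, 0 if the patient was censored
    """
    n_events = sum(events)
    if n_events == 0:
        raise ValueError("at least one observed event is required")
    loss = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue  # censored patients only appear inside risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_sum = sum(
            math.exp(risk_scores[j])
            for j in range(len(times))
            if times[j] >= t_i
        )
        loss -= risk_scores[i] - math.log(risk_sum)
    return loss / n_events


# Hypothetical example: three patients, all with observed events.
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
concordant = cox_neg_log_partial_likelihood([2.0, 1.0, 0.0], times, events)
discordant = cox_neg_log_partial_likelihood([0.0, 1.0, 2.0], times, events)
```

Scores that rank short-lived patients as higher risk (`concordant`) give a smaller loss than scores that reverse the ranking (`discordant`), which is the behavior the training objective rewards.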
Abstract Background Chlortetracycline is widely used for disease treatment and prevention in animal production systems. However, the impact of chlortetracycline on the gastrointestinal tract microbial communities of growing chickens has not been fully explored. Results Chickens received a 5-day course of chlortetracycline at the actual therapeutic dose or at low doses. Using a 16S rRNA sequencing-based approach, we found that the predominant Firmicutes and Oscillospira significantly increased, while Shigella significantly decreased, in the therapeutic-dose group. The main responders to chlortetracycline at the phylum level were Proteobacteria in the therapeutic-dose group and Firmicutes in the low-dose group. The therapeutic dose of chlortetracycline significantly increased the α-diversity indices, including the Shannon diversity index, Chao1 index, and PD whole tree index. Both the therapeutic and low doses increased the weighted UniFrac distance. Conclusions The significantly changed bacterial community diversity indicated that chlortetracycline promoted differentiation of the bacterial community in the broiler chicken gut. We provide a comprehensive understanding of chlortetracycline-induced changes in the gastrointestinal tract microbial communities of growing chickens to optimize the use of antibiotics in health management programs in the broiler industry.
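For readers unfamiliar with the α-diversity measures named above, the Shannon diversity index can be computed directly from taxon counts as H = -Σ p_i ln p_i, where p_i is the relative abundance of taxon i. The sketch below assumes the natural logarithm (some pipelines report base 2 instead) and uses made-up counts; it does not reproduce the study's analysis.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return -sum(
        (c / total) * math.log(c / total)
        for c in counts
        if c > 0  # taxa with zero reads contribute nothing
    )


# Hypothetical read counts per taxon for two samples.
even_sample = [25, 25, 25, 25]    # four equally abundant taxa
skewed_sample = [97, 1, 1, 1]     # one dominant taxon

h_even = shannon_index(even_sample)
h_skewed = shannon_index(skewed_sample)
```

An even community reaches the maximum value ln(number of taxa), while a community dominated by a single taxon scores close to zero, so an increase in the index reflects more taxa, more even abundances, or both.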