Feixiong Cheng 1, Peilin Jia 1,2, Quan Wang 1, Zhongming Zhao 1,2,3,4
1 Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
2 Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, Tennessee, USA
3 Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
4 Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
Correspondence: Zhongming Zhao
Keywords: Kinome, kinase-substrate interaction, phosphorylation, interactome, resistance, systems biology
Received: March 7, 2014. Accepted: May 16, 2014. Published: May 18, 2014.
Abstract
The human kinome is gaining importance as a source of promising cancer therapeutic targets, yet no general model addressing kinase inhibitor resistance has emerged. Here, we constructed a systems biology-based framework to catalogue the human kinome, including 538 kinase genes, in the broader context of the human interactome. Specifically, we constructed three networks: a kinase-substrate interaction network containing 7,346 pairs connecting 379 kinases to 36,576 phosphorylation sites in 1,961 substrates, a protein-protein interaction network (PPIN) containing 92,699 pairs, and an atomic resolution PPIN containing 4,278 pairs. We identified the conserved regulatory phosphorylation motifs (e.g., Ser/Thr-Pro) using a sequence logo analysis. We found that the typical anticancer target selection strategy, which uses network hubs as drug targets, might lead to a high risk of adverse drug reactions. Furthermore, we found that the distinct network centrality of kinases creates a high risk of anticancer drug resistance through feedback or crosstalk mechanisms within cellular networks. This notion is supported by systematic network and pathway analyses showing that anticancer drug resistance genes are significantly enriched as hubs and participate heavily in multiple signaling pathways.
Collectively, this comprehensive human kinome interactome map sheds light on anticancer drug resistance mechanisms and provides an innovative resource for rational kinase inhibitor design.
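The hub-based reasoning in the abstract above can be illustrated with a minimal sketch. It computes node degree from an undirected edge list and labels the highest-degree nodes as hubs. The toy edge list, the function names, and the 20% cutoff are illustrative assumptions and are not the networks, code, or thresholds used in the study.

```python
from collections import defaultdict


def degree_by_node(edges):
    """Return {node: degree} for an undirected edge list.

    Self-loops and duplicate edges are ignored.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(nbrs) for node, nbrs in neighbors.items()}


def find_hubs(edges, top_fraction=0.2):
    """Return nodes whose degree ranks in the top `top_fraction`.

    The 20% default is a hypothetical cutoff chosen for illustration.
    """
    degrees = degree_by_node(edges)
    # Sort by descending degree, breaking ties by node name for determinism.
    ranked = sorted(degrees, key=lambda n: (-degrees[n], n))
    n_hubs = max(1, int(len(ranked) * top_fraction))
    return ranked[:n_hubs]


# Toy interaction pairs with made-up labels (not data from the study).
toy_edges = [("K1", "S1"), ("K1", "S2"), ("K1", "S3"), ("K2", "S3"), ("K2", "S4")]
toy_degrees = degree_by_node(toy_edges)
toy_hubs = find_hubs(toy_edges)
```

A fixed top fraction is only one common convention for defining hubs; an absolute degree threshold would work equally well in this sketch.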
Topic modeling can reveal the latent structure of text data and is useful for knowledge discovery, search relevance ranking, document classification, and so on. One of the major challenges in topic modeling is to deal with large datasets and large numbers of topics in real-world applications. In this paper, we investigate techniques for scaling up the non-probabilistic topic modeling approaches such as RLSI and NMF. We propose a general topic modeling method, referred to as Group Matrix Factorization (GMF), to enhance the scalability and efficiency of the non-probabilistic approaches. GMF assumes that the text documents have already been categorized into multiple semantic classes, and there exist class-specific topics for each of the classes as well as shared topics across all classes. Topic modeling is then formalized as a problem of minimizing a general objective function with regularizations and/or constraints on the class-specific topics and shared topics. In this way, the learning of class-specific topics can be conducted in parallel, and thus the scalability and efficiency can be greatly improved. We apply GMF to RLSI and NMF, obtaining Group RLSI (GRLSI) and Group NMF (GNMF) respectively. Experiments on a Wikipedia dataset and a real-world web dataset, each containing about 3 million documents, show that GRLSI and GNMF can greatly improve RLSI and NMF in terms of scalability and efficiency. The topics discovered by GRLSI and GNMF are coherent and have good readability. Further experiments on a search relevance dataset, containing 30,000 labeled queries, show that the use of topics learned by GRLSI and GNMF can significantly improve search relevance.
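One way to write the kind of objective the abstract describes is sketched below under assumed notation. Here $D_c$ is the term-document matrix of class $c$, $U_0$ holds the shared topics, $U_c$ holds the topics specific to class $c$, and $V_{0,c}$ and $V_c$ are the corresponding document representations. $R_U$ and $R_V$ stand for whichever regularizers or constraints a particular method imposes, and $\lambda_1$, $\lambda_2$ are their weights. None of these symbols come from the abstract itself, and the exact form used in the paper may differ.

```latex
\min_{U_0,\,\{U_c\},\,\{V_{0,c}\},\,\{V_c\}}
\sum_{c=1}^{C} \left\| D_c - U_0 V_{0,c} - U_c V_c \right\|_F^2
+ \lambda_1 \Big( R_U(U_0) + \sum_{c=1}^{C} R_U(U_c) \Big)
+ \lambda_2 \sum_{c=1}^{C} \Big( R_V(V_{0,c}) + R_V(V_c) \Big)
```

With $U_0$ held fixed, the terms for different classes share no variables, which is consistent with the abstract's statement that class-specific topics can be learned in parallel.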
Abstract
Motivation
Analysis of whole-genome sequencing (WGS) data in genetic studies remains challenging because accurate functional annotation of non-coding variants, especially rare ones, is lacking. As eQTLs have been extensively implicated in the genetics of human diseases, we hypothesize that rare non-coding variants discovered in WGS play a regulatory role in predisposing to disease.
Results
Using thousands of tissue- and cell-type-specific epigenomic features, we propose TVAR, a multi-label learning-based deep neural network that predicts the functionality of non-coding variants in the genome based on eQTLs across 49 human tissues in the GTEx project. TVAR learns the relationships between high-dimensional epigenomics and eQTLs across tissues, taking the correlation among tissues into account to capture shared and tissue-specific eQTL effects. As a result, TVAR outputs tissue-specific annotations, with an average AUROC of 0.77 across these tissues. We evaluate TVAR's tissue-specific annotations on four complex diseases (coronary artery disease, breast cancer, type 2 diabetes and schizophrenia) and observe superior performance in predicting functional variants, both common and rare, compared with five existing state-of-the-art tools. We further evaluate TVAR's G-score, a scoring scheme across all tissues, on ClinVar, fine-mapped GWAS loci and variants validated by massively parallel reporter assay (MPRA), and observe consistently better performance than competing tools.
Availability and implementation
The TVAR source code and its scores on the ClinVar catalog, fine-mapped GWAS loci, high-confidence eQTLs from the GTEx dataset, and MPRA-validated functional variants are available at https://github.com/haiyang1986/TVAR.
Supplementary information
Supplementary data are available at Bioinformatics online.
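The "average AUROC across tissues" reported in the Results can be illustrated with a small sketch. It computes AUROC per tissue with the rank-sum (Mann-Whitney U) identity and then takes the unweighted mean. The function names, the toy labels and scores, and the choice of an unweighted mean are assumptions made for illustration; this is not the TVAR evaluation code.

```python
def auroc(labels, scores):
    """AUROC via the rank-sum identity; tied scores receive average ranks."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the group of items sharing the same score.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_statistic = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


def mean_auroc(per_tissue):
    """Unweighted mean AUROC; `per_tissue` maps tissue name -> (labels, scores)."""
    values = [auroc(labels, scores) for labels, scores in per_tissue.values()]
    return sum(values) / len(values)


# Toy data with hypothetical tissue names (not GTEx results).
toy_tissues = {
    "tissue_a": ([1, 0, 1, 0], [0.9, 0.2, 0.7, 0.4]),
    "tissue_b": ([1, 0, 1, 0], [0.6, 0.8, 0.9, 0.1]),
}
toy_mean_auroc = mean_auroc(toy_tissues)
```

An unweighted mean treats every tissue equally regardless of how many variants it contributes; weighting by sample size would be an equally defensible choice.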
This study aimed to investigate the evolutionary profile (including diversity, activity, and abundance) of retrotransposons (RTNs) with long terminal repeats (LTRs) in ten species of Tetraodontiformes. These species, Arothron firmamentum, Lagocephalus sceleratus, Pao palembangensis, Takifugu bimaculatus, Takifugu flavidus, Takifugu ocellatus, Takifugu rubripes, Tetraodon nigroviridis, Mola mola, and Thamnaconus septentrionalis, are known for having the smallest genomes among vertebrates. Data mining revealed a high diversity and wide distribution of LTR retrotransposons (LTR-RTNs) in these compact vertebrate genomes, with varying abundances among species. A total of 819 full-length LTR-RTN sequences were identified across these genomes, categorized into nine families belonging to four different superfamilies: ERV (Orthoretrovirinae and Epsilon retrovirus), Copia, BEL-PAO, and Gypsy (Gmr, Mag, V-clade, CsRN1, and Barthez). The Gypsy superfamily exhibited the highest diversity. LTR family distribution varied among species, with Takifugu bimaculatus, Takifugu flavidus, Takifugu ocellatus, and Takifugu rubripes having the highest richness of LTR families and sequences. Additionally, evidence of recent invasions was observed in specific tetraodontiform genomes, suggesting potential transposition activity. This study provides insights into the evolution of LTR retrotransposons in Tetraodontiformes, enhancing our understanding of their impact on the structure and evolution of host genomes.
In the era of the Internet, with artificial intelligence (AI) and other emerging technologies leading the industrial revolution, this study establishes a comprehensive evaluation index system for medical information services on hospital websites. The index system draws on a comprehensive analysis of modern medical information services and of the content of selected hospital websites, takes the perspective of the public, and applies the expert scoring method. The indicators are quantified and assigned values with reference to the theory of information service quality. A comprehensive evaluation model of the function and quality of the medical information services on hospital websites was then constructed using the comprehensive scoring method, and an empirical study was carried out. Comparing the model simulation results with the actual data shows that the deviation between the two is not significant, ranging from 0.12% to 15.93%, which is within a reasonable range. We therefore have reason to believe that the model is an effective and reasonable application of the system dynamics method. On the basis of previous theoretical research and historical yearbook data, together with new theoretical assumptions, a model of the traditional medical information service system with Internet+healthcare embedded in it is constructed. This model can help clarify the role of Internet+healthcare and medical information services in the overall medical information service system.
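The comparison between simulated and actual data can be sketched as a percentage-deviation check. The abstract does not state how the deviation was computed, so the formula below (absolute difference divided by the actual value) is an assumption, as are the function names and the toy numbers.

```python
def relative_deviation_percent(actual, simulated):
    """Absolute deviation of `simulated` from `actual`, as a percentage of `actual`."""
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return abs(simulated - actual) / abs(actual) * 100.0


def within_tolerance(actual_series, simulated_series, max_percent):
    """True if every paired deviation is at or below `max_percent`."""
    return all(
        relative_deviation_percent(a, s) <= max_percent
        for a, s in zip(actual_series, simulated_series)
    )


# Toy series (made-up values, not the study's yearbook data).
toy_actual = [100.0, 200.0, 400.0]
toy_simulated = [101.0, 190.0, 440.0]
toy_deviations = [
    relative_deviation_percent(a, s) for a, s in zip(toy_actual, toy_simulated)
]
```

Calling `within_tolerance(toy_actual, toy_simulated, 15.93)` checks the toy series against the upper bound quoted in the abstract.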
Gene set-based analysis of genome-wide association study (GWAS) data has recently emerged as a useful approach to examine the joint effects of multiple risk loci in complex human diseases or phenotypes. Dental caries is a common, chronic, and complex disease leading to a decrease in quality of life worldwide. In this study, we applied gene set enrichment analysis to a major dental caries GWAS dataset, which consists of 537 cases and 605 controls. Using four complementary gene set analysis methods, we analyzed 1331 Gene Ontology (GO) terms collected from the Molecular Signatures Database (MSigDB). Setting the false discovery rate (FDR) threshold at 0.05, we identified 13 significantly associated GO terms. An additional 17 terms were included as marginally associated because they were top ranked by each method, although their FDR values were higher than 0.05. In total, we identified 30 promising GO terms, including 'Sphingoid metabolic process,' 'Ubiquitin protein ligase activity,' 'Regulation of cytokine secretion,' and 'Ceramide metabolic process.' These GO terms encompass broad functions that potentially interact and contribute to the oral immune response related to caries development, and they have not been reported in standard single marker-based analyses. Collectively, our gene set enrichment analysis provided complementary insights into the molecular mechanisms and polygenic interactions in dental caries, revealing promising association signals that could not be detected through single marker analysis of GWAS data.
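Selecting terms at an FDR threshold of 0.05 can be illustrated with the Benjamini-Hochberg procedure. The abstract does not say which FDR method each of the four tools used, so Benjamini-Hochberg is an assumption here, and the term names and p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_terms(term_p_values, fdr=0.05):
    """Return the terms whose adjusted p-value is at or below `fdr`."""
    terms = list(term_p_values)
    q_values = benjamini_hochberg([term_p_values[t] for t in terms])
    return [t for t, q in zip(terms, q_values) if q <= fdr]


# Hypothetical gene set p-values (not results from the study).
toy_p_values = {"term_a": 0.001, "term_b": 0.02, "term_c": 0.03, "term_d": 0.5}
toy_q_values = benjamini_hochberg(list(toy_p_values.values()))
toy_hits = significant_terms(toy_p_values)
```

Terms that fall just above the threshold, like the 17 marginal terms mentioned above, would be excluded by `significant_terms` and would need a separate ranking-based rule.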
Abstract
Cells govern biological actions through highly complex biological networks. Perturbations to the complex molecular network due to driver mutations may shift cells to new phenotypic states, e.g., tumorigenesis. Identifying how genetic lesions such as somatic mutations perturb these networks is a fundamental challenge in cancer biology. Recent TCGA studies revealed that a typical tumor contains two to eight driver mutations, while the numerous remaining somatic mutations are passenger mutations. So far, it remains largely unknown which evolutionary forces are involved and how genetic lesions disrupt the cancer interactome in ways that lead to tumorigenesis. In this study, we systematically investigated the relationship among the network topology, evolutionary rates, and evolutionary origins of disease genes driven by somatic and germline mutations in the broad context of protein-protein interaction (PPI) networks, using recently released extensive somatic mutation and gene annotation data. We aimed to address two fundamental questions. (1) Do cancer genes display a network topology distinct from that of Mendelian disease genes, and if so, why? (2) From a network biology perspective, how does the transition from a normal cell to a tumor cell occur when initiated by a few driver mutations? We collected the largest cancer gene list to date and five comprehensive networks. We found that evolutionary origin is the main determinant of the unique network centrality of cancer proteins. We further investigated the perturbations of network topology by somatic mutations identified from 3268 tumors across 12 cancer types in TCGA. We revealed that the network-attacking perturbation of central hubs of the cancer interactome by somatic mutations is a main feature of tumor emergence and evolution. This finding elucidates the high efficiency of the transition during tumorigenesis as initiated by a few driver mutations.
This work improves our understanding of dynamic network-attacking perturbation by somatic mutations during tumorigenesis and, in turn, has implications for both basic cancer biology and the development of personalized antitumor therapy.
Citation Format: Feixiong Cheng, Peilin Jia, Quan Wang, Chen-Ching Lin, Zhongming Zhao. Tumorigenesis: an investigation by network evolution and perturbations of somatic mutations in cancer interactome. [abstract]. In: Proceedings of the 105th Annual Meeting of the American Association for Cancer Research; 2014 Apr 5-9; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2014;74(19 Suppl):Abstract nr 367. doi:10.1158/1538-7445.AM2014-367
Although a correlation between immune cell phenotypes and inflammatory bowel disease (IBD) has been established, a causal relationship remains unestablished.
Five species and three varieties of Pinnularia (Bacillariophyta) are reported from China for the first time. They are Pinnularia borealis var. subislandica Krammer, P. divergentissima var. subrostrata Cleve-Euler, P. episcopalis Cleve, P. erratica Krammer, P. esoxiformis Fusey, P. spitsbergensis Cleve, P. undula (Schumann) Krammer, and P. undula var. mesoleptiformis Krammer. All were collected from Da'erbin Lake and the surrounding marshes in the Da Hinggan Ling Mountains, Nei Mongol Autonomous Region, China. These taxa were observed with LM and SEM, and their taxonomic characters and habitats are discussed.
Objective
To analyze the causes of death in children with confirmed influenza virus infection in order to improve diagnosis and treatment.
Methods
Fatal cases of critical influenza admitted to the Pediatric Intensive Care Unit of Beijing Children's Hospital, Capital Medical University, from November 2017 to April 2018 were collected. The clinical characteristics and causes of death were retrospectively analyzed by virus type.
Results
A total of 19 cases were included. Fifteen cases (78.95%) were less than 5 years old and 9 cases (47.37%) were less than 2 years old. On admission, the median pediatric index of mortality 2 score was 72.7%. There were 11 cases of influenza H1N1 and 8 cases of influenza B. Six cases had underlying diseases. All patients had fever, cough and dyspnea. Thirteen patients were in a coma. Seventeen cases had pneumonia, 11 cases had severe acute respiratory distress syndrome (ARDS), 3 cases had air leakage syndrome and 8 cases had influenza-associated encephalopathy (IAE). Ten cases (52.63%) died of severe ARDS, 7 cases (36.84%) died of IAE, 1 case (5.26%) died of multiple organ dysfunction, and 1 case (5.26%) died of severe myocarditis and cardiogenic shock. There was a statistically significant difference in the time from onset to death between the ARDS group and the IAE group [15 (4, 22) d vs. 3 (2, 8) d] (Z = -2.063, P = 0.039). Among the children who died of severe ARDS, most patients in the influenza H1N1 group were <2 years old, while those in the influenza B group were ≥2 years old. All children who died of IAE were ≥1 year old. Six cases (31.58%) had bacterial infection, mainly with gram-positive cocci. All patients were treated with neuraminidase inhibitors. The average time from onset to the first dose of medication was 5 days.
Conclusions
Severe ARDS and IAE are the main causes of death in children with influenza virus infection. Compared with children with ARDS, children with IAE deteriorated more rapidly.
Key words:
Influenza; Fatal case; Critical illness; Child