MoSBi: Automated signature mining for molecular stratification and subtyping
Tim Rose, Thibault Bechtler, Octavia-Andreea Ciora, Kim Anh Lilian Le, Florian Molnar, Nikolai Köhler, Jan Baumbach, Richard Röttger, Josch K. Pauling
6 citations, 42 references, 10 related papers
Abstract:
Improving access to growing amounts of biomedical data opens new opportunities for advanced patient stratification and disease subtyping strategies. This requires computational tools that produce uniformly robust results across highly heterogeneous molecular data. Unsupervised machine learning methodologies are able to discover de novo patterns in such data. Biclustering is especially well suited to this task because it simultaneously identifies sample groups and corresponding feature sets across heterogeneous omics data. The performance of available biclustering algorithms depends heavily on individual parameterization and varies with their application. Here, we developed MoSBi (molecular signature identification using biclustering), an automated multialgorithm ensemble approach that integrates results utilizing an error model-supported similarity network. We systematically evaluated the performance of 11 available and established biclustering algorithms together with MoSBi. For this, we used transcriptomics, proteomics, and metabolomics data, as well as synthetic datasets covering various data properties. Profiting from multialgorithm integration, MoSBi identified robust group and disease-specific signatures across all scenarios, overcoming single algorithm specificities. Furthermore, we developed a scalable network-based visualization of bicluster communities that supports biological hypothesis generation. MoSBi is available as an R package and web service to make automated biclustering analysis accessible for application in molecular sample stratification.
Keywords:
Subtyping
Biclustering
Profiling (computer programming)
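The ensemble idea described in the abstract above (pooling biclusters from several algorithms and linking similar ones in a network) can be illustrated with a minimal Python sketch. This is not MoSBi's implementation or API: the cell-level Jaccard similarity, the fixed `threshold`, the use of connected components as "communities", and all function names are assumptions made for illustration, and the error model mentioned in the abstract is not reproduced here.

```python
from itertools import combinations

def cell_jaccard(b1, b2):
    """Jaccard index of the matrix cells covered by two biclusters.

    Each bicluster is a (rows, cols) pair of sets (hypothetical representation).
    """
    r1, c1 = b1
    r2, c2 = b2
    inter = len(r1 & r2) * len(c1 & c2)
    union = len(r1) * len(c1) + len(r2) * len(c2) - inter
    return inter / union if union else 0.0

def communities(biclusters, threshold=0.5):
    """Group biclusters into connected components of the similarity network.

    Two biclusters are linked when their similarity is >= threshold
    (an assumed cut-off, standing in for an error-model-based one).
    """
    parent = list(range(len(biclusters)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(biclusters)), 2):
        if cell_jaccard(biclusters[i], biclusters[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(biclusters)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy results from three hypothetical algorithms (samples s*, features g*).
biclusters = [
    ({"s1", "s2", "s3"}, {"g1", "g2"}),
    ({"s1", "s2", "s3"}, {"g1", "g2", "g3"}),
    ({"s7", "s8"}, {"g9"}),
]
result = communities(biclusters, threshold=0.5)
print(result)
```

In this toy input the first two biclusters share most of their cells and end up in one community, while the third stays on its own. A community supported by several algorithms is the kind of signature the abstract calls robust.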
The purpose of subtyping is to differentiate bacterial isolates beyond the classification of species or subspecies. Subtyping methods can be grouped into two broad categories based on the cellular components targeted: (1) phenotypic subtyping methods that differentiate isolates by the enzymes, proteins, or other metabolites expressed by the cell, and (2) molecular subtyping methods that discriminate isolates based on interrogation of nucleic acid sequences. The two major types of molecular subtyping methods are band-based methods, which rely on fragment pattern data or DNA fingerprints, and methods that generate DNA sequence data. Molecular subtyping methods have shown that Listeria monocytogenes isolates can be classified into four genetic lineages or divisions. Although band-based molecular subtyping methods continue to serve as the gold standard for routine molecular subtyping of most clinically important foodborne pathogens, including L. monocytogenes, the explosion of recently completed and ongoing DNA sequencing projects, and thus of available DNA sequence data, has stimulated efforts to develop highly discriminatory and high-throughput DNA sequence-based subtyping methods for L. monocytogenes. L. monocytogenes is one of the most highly sequenced human pathogens; more than 20 genome sequences are currently available for this organism. This review provides an overview of the concepts behind subtyping and discusses the application of molecular subtyping methods, with an emphasis on DNA sequence-based subtyping methods, to characterize L. monocytogenes.
Subtyping
Citations (24)
Bacterial subtyping methods not only improve our ability to detect and track human listeriosis outbreaks, but also provide useful tools to track sources of L. monocytogenes contamination throughout the food system. Additionally, the use of subtyping methods provides an opportunity to better understand the population genetics, epidemiology, and ecology of L. monocytogenes. The last five years have seen tremendous advancements in the development of sensitive, rapid, automated, and increasingly easy-to-use molecular subtyping methods for L. monocytogenes. This review focuses on the different subtyping methods for L. monocytogenes and their applications.
Subtyping
Citations (0)
Subtyping
Mainstream
Citations (25)
Accurate subtyping or classification of breast cancer is important for ensuring proper treatment of patients and for understanding the molecular mechanisms driving this disease. While several gene signatures have been proposed in the literature to classify breast tumours, these signatures show very low overlap, differing classification performance, and little relevance to the underlying biology of these tumours. Here we evaluate DNA-damage response (DDR) and cell cycle pathways, which are critical pathways implicated in a considerable proportion of breast tumours, for their usefulness in breast tumour subtyping. We think that subtyping breast tumours based on these two pathways could lead to vital insights into the molecular mechanisms driving these tumours. We performed a systematic evaluation of DDR and cell cycle pathways for subtyping breast tumours into the five known intrinsic subtypes. The Homologous Recombination (HR) pathway showed the best performance in subtyping breast tumours, indicating that HR genes are strongly involved in all breast tumours. Comparisons of pathway-based signatures and two standard gene signatures supported the use of known pathways for breast tumour subtyping. Further, the evaluation of these standard gene signatures showed that breast tumour subtyping, prognosis, and survival estimation are all closely related. Finally, we constructed an all-inclusive super-signature by combining (taking the union of) all genes and performing a stringent feature selection, and found it to be reasonably accurate and robust in both classification and prognostic value. Adopting DDR and cell cycle pathways yielded robust and accurate breast tumour subtyping, and a super-signature containing a feature-selected mix of genes from these molecular pathways, together with clinical aspects, is valuable in clinical practice.
Subtyping
Biomarker Discovery
Molecular diagnostics
Classification scheme
Citations (71)
Consistent subtyping is employed in some gradual type systems to validate type conversions. The original definition by Siek and Taha serves as a guideline for designing gradual type systems with subtyping. Polymorphic types à la System F also induce a subtyping relation that relates polymorphic types to their instantiations. However, Siek and Taha's definition is not adequate for polymorphic subtyping. The first goal of this paper is to propose a generalization of consistent subtyping that is adequate for polymorphic subtyping and subsumes the original definition by Siek and Taha. The new definition of consistent subtyping provides novel insights with respect to previous polymorphic gradual type systems, which did not employ consistent subtyping. The second goal of this paper is to present a gradually typed calculus for implicit (higher-rank) polymorphism that uses our new notion of consistent subtyping. We develop both declarative and (bidirectional) algorithmic versions of the type system. We prove that the new calculus satisfies all static aspects of the refined criteria for gradual typing, which are mechanically formalized using the Coq proof assistant.
Subtyping
Rank (graph theory)
Type theory
Citations (16)
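To make the notion of consistent subtyping in the abstract above more concrete, here is a minimal Python sketch of a checker for simple types only (base types, function types, and the unknown type `?`). It follows the usual algorithmic reading of consistent subtyping for simple types and does not cover the polymorphic generalization that the paper proposes. The type encoding, the function name, and the base rule `int <: float` are assumptions chosen for illustration.

```python
UNKNOWN = "?"  # the unknown (dynamic) type of gradual typing

# Assumed base subtyping facts for this illustration only.
BASE_SUBTYPES = {("int", "float")}

def consistent_subtype(s, t):
    """Return True if type s is a consistent subtype of type t.

    Types are encoded (hypothetically) as:
      "int", "float", "bool"   base types
      "?"                      the unknown type
      ("fun", arg, ret)        function types
    """
    # The unknown type is related to every type, in both directions.
    if s == UNKNOWN or t == UNKNOWN:
        return True
    # Base types: equal, or related by an assumed base subtyping fact.
    if isinstance(s, str) and isinstance(t, str):
        return s == t or (s, t) in BASE_SUBTYPES
    # Functions: contravariant in the argument, covariant in the result.
    if isinstance(s, tuple) and isinstance(t, tuple):
        _, s_arg, s_ret = s
        _, t_arg, t_ret = t
        return consistent_subtype(t_arg, s_arg) and consistent_subtype(s_ret, t_ret)
    return False

# A function from float to int can stand in for one from int to float.
print(consistent_subtype(("fun", "float", "int"), ("fun", "int", "float")))
# Unknown parts are accepted wherever they occur.
print(consistent_subtype(("fun", "?", "int"), ("fun", "bool", "float")))
# The relation is not transitive: int to ? and ? to bool hold, int to bool does not.
print(consistent_subtype("int", "?"), consistent_subtype("?", "bool"),
      consistent_subtype("int", "bool"))
```

The last example shows why consistent subtyping is treated as its own relation rather than as ordinary subtyping with an extra top type: chaining through `?` would otherwise relate every pair of types.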
Conventional, phenotypic, and DNA-based subtyping methods allow differentiation of Listeria monocytogenes beyond the species and subspecies level. Bacterial subtyping methods not only improve our ability to detect and track human listeriosis outbreaks, but also provide tools to track sources of L. monocytogenes contamination throughout the food system. The use of subtyping methods also provides an opportunity to better understand the population genetics, epidemiology, and ecology of L. monocytogenes. The last 5 years have seen tremendous advancements in the development of sensitive, rapid, automated, and increasingly easy-to-use molecular subtyping methods for L. monocytogenes. This review highlights key aspects of different L. monocytogenes subtyping methods and provides examples of their application in public health, food safety, population genetics, and epidemiology. A significant focus is on the application of subtyping methods to define L. monocytogenes subtypes and clonal groups, which may differ in phenotypic characteristics and pathogenic potential.
Subtyping
Subspecies
Molecular Epidemiology
Citations (192)
This paper studies data mining techniques such as clustering, biclustering, and triclustering. A large number of clustering approaches have been proposed for the analysis of gene expression data. However, the results obtained with standard clustering methods are limited. For this reason, concurrent clustering methods such as biclustering are used to find sub-matrices, each consisting of a subset of rows and a subset of columns of a two-dimensional data set. More recently, triclustering techniques have been applied to cluster real 3D datasets. Triclusters are constructed from two datasets by selecting a subset of features from each dataset and one shared subset of rows from among all the rows. This study traces the development from clustering to triclustering of gene expression data for identifying the most promising gene clusters or groups.
Biclustering
Consensus clustering
Clustering high-dimensional data
Data set
Single-linkage clustering
Citations (7)
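The description of a bicluster in the abstract above, a sub-matrix given by a subset of rows and a subset of columns, can be sketched in a few lines of plain Python. The coherence score used here is the mean squared residue, one commonly used bicluster quality measure; the abstract does not say which score the surveyed methods use, so that choice, the toy matrix, and the function names are assumptions.

```python
def submatrix(matrix, rows, cols):
    """Select the sub-matrix defined by a subset of rows and of columns."""
    return [[matrix[i][j] for j in cols] for i in rows]

def mean_squared_residue(sub):
    """Mean squared residue of a sub-matrix (0 for a perfectly additive pattern)."""
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = sum(
        (sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )
    return total / (n_rows * n_cols)

# Toy expression matrix: rows are genes, columns are conditions.
data = [
    [1.0, 2.0, 9.0, 3.0],
    [5.0, 0.0, 1.0, 7.0],
    [2.0, 3.0, 4.0, 4.0],
    [4.0, 5.0, 0.0, 6.0],
]

# Rows 0, 2, 3 and columns 0, 1, 3 follow a shared additive pattern.
bicluster = submatrix(data, rows=[0, 2, 3], cols=[0, 1, 3])
print(bicluster)
print(mean_squared_residue(bicluster))
print(mean_squared_residue(data))
```

The selected rows and columns are not contiguous in the full matrix, which is why clustering rows or columns alone would not isolate this pattern. The full matrix scores much worse than the selected sub-matrix.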
Subtyping
Citations (0)
By-name subtyping (or user-defined subtyping) and structural subtyping each have their own strengths and weaknesses. By-name subtyping allows programmers to explicitly express design intent, and, when types are associated with run time tags, enables run-time "type" tests and external/multimethod dispatch. On the other hand, structural subtyping is flexible and compositional, allowing unanticipated reuse. To date, nearly all object-oriented languages fully support only one subtyping paradigm or the other.
Subtyping
Citations (2)
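The contrast between by-name and structural subtyping drawn in the abstract above can be illustrated in Python, which offers both styles: class inheritance with `isinstance` for declared subtypes, and `typing.Protocol` for structural ones. This is only an illustration of the two paradigms, not the language design the paper proposes; the class names are hypothetical.

```python
from typing import Protocol, runtime_checkable

class Shape:
    """Nominal base type: subtypes must declare the relationship."""
    def area(self) -> float:
        raise NotImplementedError

class Square(Shape):
    """Declared (by-name) subtype of Shape."""
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2

@runtime_checkable
class HasArea(Protocol):
    """Structural type: anything with an area() method conforms."""
    def area(self) -> float: ...

class Field:
    """Unrelated class that never mentions Shape or HasArea."""
    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height

square, field = Square(2.0), Field(2.0, 3.0)

# By-name: only the declared subtype passes the run-time tag test.
print(isinstance(square, Shape), isinstance(field, Shape))
# Structural: both conform, because both provide area().
print(isinstance(square, HasArea), isinstance(field, HasArea))
```

The by-name check expresses design intent and supports run-time type tests, while the structural check allows the unanticipated reuse of `Field`. Note that `runtime_checkable` only verifies that the method exists, not its signature.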
Breast cancer is a common and serious malignant tumor that threatens women's health. Molecular subtyping operates at the molecular level, provides a new method for the pathological classification of breast cancer, and offers important guidance for clinical treatment. At present, breast cancer is mainly divided into the following molecular subtypes: Luminal A, Luminal B, HER-2 overexpression, and triple-negative breast cancer. The molecular subtypes differ in treatment response, prognosis, and clinical application.
Key words:
Breast neoplasms; Molecular subtyping; Clinical treatment
Subtyping
Clinical Significance
Citations (0)