    Meta-analysis of fecal metagenomes reveals global viral signatures and its diagnostic potential for colorectal cancer and adenoma
    Citations: 0 | References: 41 | Related papers: 10
    Abstract:
    Introduction: The gut microbiome plays an important role in maintaining human health. Although mounting evidence has revealed the critical function of the gut bacteriome in the progression of colorectal cancer (CRC), the contribution of the gut viral community to CRC has rarely been studied.
    Objectives: This study aimed to reveal the gut virome signatures of colorectal adenoma patients and CRC patients and to identify potential viral markers for building clinical predictive models for diagnosis.
    Methods: A total of 1,282 available fecal metagenomes from 9 published CRC studies were collected. A new virus database was constructed using a reference-independent virome approach for further analysis. Viral markers were filtered by statistical methods and used to build machine learning models, such as Random Forest and Least Absolute Shrinkage and Selection Operator (LASSO), to distinguish patients from controls. New fecal samples were collected to validate the generalization of the predictive model.
    Results: The gut viral composition of CRC patients was drastically altered compared with that of healthy controls, as evidenced by changes in several Siphoviridae viruses and a reduction of Microviridae, whereas the virome variation in adenoma patients was relatively low. The viral markers included phages of Porphyromonas, Fusobacterium, Hungatella, and Ruminococcaceae. In the 9 cohorts and the independent validation cohorts, a random forest (RF) classifier and a LASSO model achieved optimal AUCs of 0.830 and 0.906, respectively. The gut virome analysis of adenoma patients identified 88 differential viruses and achieved an optimal AUC of 0.772 for discriminating patients from controls.
    Conclusion: Our findings demonstrate that the composition of the gut virome differs distinctly between healthy controls and CRC patients, and highlight the potential of viral markers for clinical diagnosis.
    Keywords:
    Human virome
    Faecalibacterium prausnitzii
    Siphoviridae
    Colorectal adenoma
    Citations (0)
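The AUC figures quoted in the abstract above (0.830, 0.906 and 0.772) are areas under the ROC curve. As a minimal sketch of what that number measures, the snippet below computes AUC as the fraction of patient/control pairs in which the patient receives the higher model score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study; the authors' own pipeline is not described at this level of detail.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney identity).

    labels: iterable of 1 (patient) or 0 (control)
    scores: iterable of classifier scores, higher means "more likely patient"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # patient ranked above control
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three patients, three controls.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values should be read.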
    Background: Viruses are central to microbial community structure in all environments. The ability to generate large metagenomic assemblies of mixed microbial and viral sequences provides the opportunity to tease apart complex microbiome dynamics, but these analyses are currently limited by the tools available for analyses of viral genomes and assessing their metabolic impacts on microbiomes.
    Design: Here we present VIBRANT, the first method to utilize a hybrid machine learning and protein similarity approach that is not reliant on sequence features for automated recovery and annotation of viruses, determination of genome quality and completeness, and characterization of virome function from metagenomic assemblies. VIBRANT uses neural networks of protein signatures and a novel v-score metric that circumvents traditional boundaries to maximize identification of lytic viral genomes and integrated proviruses, including highly diverse viruses. VIBRANT highlights viral auxiliary metabolic genes and metabolic pathways, thereby serving as a user-friendly platform for evaluating virome function. VIBRANT was trained and validated on reference virus datasets as well as microbiome and virome data.
    Results: VIBRANT showed superior performance in recovering higher quality viruses and concurrently reduced the false identification of non-viral genome fragments in comparison to other virus identification programs, specifically VirSorter and VirFinder. When applied to 120,834 metagenomically derived viral sequences representing several human and natural environments, VIBRANT recovered an average of 94.5% of the viruses, whereas VirFinder and VirSorter achieved less powerful performance, averaging 48.1% and 56.0%, respectively. Similarly, VIBRANT identified more total viral sequence and proteins when applied to real metagenomes. When compared to PHASTER and Prophage Hunter for the ability to extract integrated provirus regions from host scaffolds, VIBRANT performed comparably and even identified proviruses that the other programs did not. To demonstrate applications of VIBRANT, we studied viromes associated with Crohn’s Disease to show that specific viral groups, namely Enterobacteriales-like viruses, as well as putative dysbiosis associated viral proteins are more abundant compared to healthy individuals, providing a possible viral link to maintenance of diseased states.
    Conclusions: The ability to accurately recover viruses and explore viral impacts on microbial community metabolism will greatly advance our understanding of microbiomes, host-microbe interactions and ecosystem dynamics.
    Human virome
    Identification
    Human Microbiome Project
    Lytic cycle
    Citations (0)
    The human gut microbiota is a complex ecosystem with several functions integrated into the host organism (metabolic, immune, nutrient absorption, etc.). It is composed of bacteria, yeasts, fungi and, last but not least, viruses, whose composition has not been completely described. According to previous evidence on pathogenic viruses, the human gut harbours plant-derived viruses, giant viruses and, as recognized only recently, abundant bacteriophages. New metagenomic methods have made it possible to reconstruct entire viral genomes from the genetic material present in the human gut, opening new perspectives on the understanding of gut virome composition, the importance of the gut microbiome, and potential clinical applications. This review reports the latest evidence on human gut "virome" composition and function and on possible future therapeutic applications in human health in the context of the gut microbiota, and attempts to clarify the role of the gut "virome" in the larger microbial ecosystem.
    Human virome
    Bacterial virus
    Citations (283)
    Abstract Despite the accelerating number of uncultivated virus sequences discovered in metagenomics and their apparent importance for health and disease, the human gut virome and its interactions with bacteria in the gastrointestinal tract are not well understood. This is partly due to a paucity of whole-virome datasets and limitations in current approaches for identifying viral sequences in metagenomics data. Here, combining a deep-learning based metagenomics binning algorithm with paired metagenome and metavirome datasets, we develop Phages from Metagenomics Binning (PHAMB), an approach that allows the binning of thousands of viral genomes directly from bulk metagenomics data, while simultaneously enabling clustering of viral genomes into accurate taxonomic viral populations. When applied on the Human Microbiome Project 2 (HMP2) dataset, PHAMB recovered 6,077 high-quality genomes from 1,024 viral populations, and identified viral-microbial host interactions. PHAMB can be advantageously applied to existing and future metagenomes to illuminate viral ecological dynamics with other microbiome constituents.
    Human virome
    Human Microbiome Project
    Citations (75)
    The principles of massively parallel sequencing, also called next-generation sequencing (NGS), appeared in the early 2000s and have been implemented in dozens of NGS platforms. The high performance and speed of these platforms opened wide horizons for genomic research, including metagenomics, primarily for studying the structure of various microbial communities. Dozens of NGS-based studies of the microbiome and virome of various human biotopes in health and disease have appeared, forming novel concepts of the pathogenesis and epidemiology of various infectious diseases. The significant reduction in the cost of analysis is extending the application of NGS technologies from fundamental to applied microbiological studies, including the etiological diagnosis of infectious diseases. Because the number of infectious disease cases without a typical clinical presentation is increasing, the metagenomic approach is of particular importance, as it allows detection of a wide spectrum of causative agents of bacterial, viral and parasitic infections. This review presents the technological features of massively parallel sequencing platforms, the main methods of metagenomic studies, and the bioinformatics approaches used to analyze the resulting data. It describes studies of the human microbiome in health and disease and examines the possibilities and prospects of applying the metagenomic approach in the diagnosis and epidemiological control of infectious diseases.
    Human virome
    Human Microbiome Project
    Citations (1)
    Background: Viruses are central to microbial community structure in all environments. The ability to generate large metagenomic assemblies of mixed microbial and viral sequences provides the opportunity to tease apart complex microbiome dynamics, but these analyses are currently limited by the tools available for analyses of viral genomes and assessing their metabolic impacts on microbiomes.
    Design: Here we present VIBRANT, the first method to utilize a hybrid machine learning and protein similarity approach that is not reliant on sequence features for automated recovery and annotation of viruses, determination of genome quality and completeness, and characterization of virome function from metagenomic assemblies. VIBRANT uses neural networks of protein signatures and a novel v-score metric that circumvents traditional boundaries to maximize identification of lytic viral genomes and integrated proviruses, including highly diverse viruses. VIBRANT highlights viral auxiliary metabolic genes and metabolic pathways, thereby serving as a user-friendly platform for evaluating virome function. VIBRANT was trained and validated on reference virus datasets as well as microbiome and virome data.
    Results: VIBRANT showed superior performance in recovering higher quality viruses and concurrently reduced the false identification of non-viral genome fragments in comparison to other virus identification programs, specifically VirSorter and VirFinder. When applied to 120,834 metagenomically derived viral sequences representing several human and natural environments, VIBRANT recovered an average of 94.5% of the viruses, whereas VirFinder and VirSorter achieved less powerful performance, averaging 48.1% and 56.0%, respectively. Similarly, VIBRANT identified more total viral sequence and proteins when applied to real metagenomes. When compared to PHASTER and Prophage Hunter for the ability to extract integrated provirus regions from host scaffolds, VIBRANT performed comparably and even identified proviruses that the other programs did not. To demonstrate applications of VIBRANT, we studied viromes associated with Crohn’s Disease to show that specific viral groups, namely Enterobacteriales-like viruses, as well as putative dysbiosis associated viral proteins are more abundant compared to healthy individuals, providing a possible viral link to maintenance of diseased states.
    Conclusions: The ability to accurately recover viruses and explore viral impacts on microbial community metabolism will greatly advance our understanding of microbiomes, host-microbe interactions and ecosystem dynamics.
    Human virome
    Identification
    Human Microbiome Project
    Lytic cycle
    Citations (21)
    Abstract Despite the accelerating number of uncultivated virus sequences discovered in metagenomics and their apparent importance for health and disease, the human gut virome and its interactions with bacteria in the gastrointestinal tract are not well understood. In addition, a paucity of whole-virome datasets from subjects with gastrointestinal diseases is preventing a deeper understanding of the virome’s role in disease and in gastrointestinal ecology as a whole. By combining a deep-learning based metagenomics binning algorithm with paired metagenome and metavirome datasets, we developed the Phages from Metagenomics Binning (PHAMB) approach for binning thousands of viral genomes directly from bulk metagenomics data. Simultaneously, our methodology enables clustering of viral genomes into accurate taxonomic viral populations. We applied this methodology to the Human Microbiome Project 2 (HMP2) cohort, recovered 6,077 high-quality (HQ) genomes from 1,024 viral populations, and explored viral-host interactions. We show that binning can be advantageously applied to existing and future metagenomes to illuminate viral ecological dynamics with other microbiome constituents.
    Human virome
    Human Microbiome Project
    Citations (6)
    Viruses are central to microbial community structure in all environments. The ability to generate large metagenomic assemblies of mixed microbial and viral sequences provides the opportunity to tease apart complex microbiome dynamics, but these analyses are currently limited by the tools available for analyses of viral genomes and assessing their metabolic impacts on microbiomes. Here we present VIBRANT, the first method to utilize a hybrid machine learning and protein similarity approach that is not reliant on sequence features for automated recovery and annotation of viruses, determination of genome quality and completeness, and characterization of viral community function from metagenomic assemblies. VIBRANT uses neural networks of protein signatures and a newly developed v-score metric that circumvents traditional boundaries to maximize identification of lytic viral genomes and integrated proviruses, including highly diverse viruses. VIBRANT highlights viral auxiliary metabolic genes and metabolic pathways, thereby serving as a user-friendly platform for evaluating viral community function. VIBRANT was trained and validated on reference virus datasets as well as microbiome and virome data. VIBRANT showed superior performance in recovering higher quality viruses and concurrently reduced the false identification of non-viral genome fragments in comparison to other virus identification programs, specifically VirSorter, VirFinder, and MARVEL. When applied to 120,834 metagenome-derived viral sequences representing several human and natural environments, VIBRANT recovered an average of 94% of the viruses, whereas VirFinder, VirSorter, and MARVEL achieved less powerful performance, averaging 48%, 87%, and 71%, respectively. Similarly, VIBRANT identified more total viral sequence and proteins when applied to real metagenomes. When compared to PHASTER, Prophage Hunter, and VirSorter for the ability to extract integrated provirus regions from host scaffolds, VIBRANT performed comparably and even identified proviruses that the other programs did not. To demonstrate applications of VIBRANT, we studied viromes associated with Crohn's disease to show that specific viral groups, namely Enterobacteriales-like viruses, as well as putative dysbiosis associated viral proteins are more abundant compared to healthy individuals, providing a possible viral link to maintenance of diseased states. The ability to accurately recover viruses and explore viral impacts on microbial community metabolism will greatly advance our understanding of microbiomes, host-microbe interactions, and ecosystem dynamics.
    Medical microbiology
    Citations (738)
    High throughput sequencing has spurred the development of metagenomics, which involves the direct analysis of microbial communities in various environments such as soil, ocean water, and the human body. Many existing methods based on marker genes or k-mers have limited sensitivity or are too computationally demanding for many users. Additionally, most work in metagenomics has focused on bacteria and archaea, neglecting to study other key microbes such as viruses and eukaryotes. Here we present a method, MiCoP (Microbiome Community Profiling), that uses fast-mapping of reads to build a comprehensive reference database of full genomes from viruses and eukaryotes to achieve maximum read usage and enable the analysis of the virome and eukaryome in each sample. We demonstrate that mapping of metagenomic reads is feasible for the smaller viral and eukaryotic reference databases. We show that our method is accurate on simulated and mock community data and identifies many more viral and fungal species than previously-reported results on real data from the Human Microbiome Project. MiCoP is a mapping-based method that proves more effective than existing methods at abundance profiling of viruses and eukaryotes in metagenomic samples. MiCoP can be used to detect the full diversity of these communities. The code, data, and documentation are publicly available on GitHub at: https://github.com/smangul1/MiCoP
    Human virome
    Human Microbiome Project
    Profiling (computer programming)
    Citations (36)
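The MiCoP abstract above describes abundance profiling by mapping reads to full reference genomes. One common way to turn per-genome mapped read counts into relative abundances is to divide each count by genome length and renormalise; the sketch below assumes that approach. It is not MiCoP's actual implementation, and the function name `relative_abundance`, the genome names, and all numbers are hypothetical.

```python
def relative_abundance(read_counts, genome_lengths):
    """Length-normalised relative abundance from mapped read counts.

    read_counts:    dict of genome name -> number of reads mapped to it
    genome_lengths: dict of genome name -> genome length in base pairs
    Returns a dict of genome name -> fraction of the community (sums to 1
    unless no reads mapped at all, in which case every value is 0.0).
    """
    density = {}
    for name, count in read_counts.items():
        length = genome_lengths[name]
        if length <= 0:
            raise ValueError(f"genome length must be positive: {name}")
        # Reads per base, so that long genomes are not over-represented.
        density[name] = count / length

    total = sum(density.values())
    if total == 0:
        return {name: 0.0 for name in density}
    return {name: d / total for name, d in density.items()}


# Hypothetical example: two phages and one fungal genome.
toy_counts = {"phage_A": 500, "phage_B": 1000, "fungus_C": 20000}
toy_lengths = {"phage_A": 5_000, "phage_B": 50_000, "fungus_C": 2_000_000}
toy_profile = relative_abundance(toy_counts, toy_lengths)
for name, frac in toy_profile.items():
    print(f"{name}: {frac:.3f}")
```

In this toy case the fungal genome attracts the most reads but is not the most abundant organism once genome length is taken into account, which is why raw read counts are usually normalised before profiles are compared.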
    Background: High throughput sequencing has spurred the development of metagenomics, which involves the direct analysis of microbial communities in various environments such as soil, ocean water, and the human body. Many existing methods based on marker genes or k-mers have limited sensitivity or are too computationally demanding for many users. Additionally, most work in metagenomics has focused on bacteria and archaea, neglecting to study other key microbes such as viruses and eukaryotes.
    Results: Here we present a method, MiCoP (Microbiome Community Profiling), that uses fast-mapping of reads to build a comprehensive reference database of full genomes from viruses and eukaryotes to achieve maximum read usage and enable the analysis of the virome and eukaryome in each sample. We demonstrate that mapping of metagenomic reads is feasible for the smaller viral and eukaryotic reference databases. We show that our method is accurate on simulated and mock community data and identifies many more viral and fungal species than previously-reported results on real data from the Human Microbiome Project.
    Conclusions: MiCoP is a mapping-based method that proves more effective than existing methods at abundance profiling of viruses and eukaryotes in metagenomic samples. MiCoP can be used to detect the full diversity of these communities. The code, data, and documentation are publicly available on GitHub at: https://github.com/smangul1/MiCoP
    Human virome
    Human Microbiome Project
    Profiling (computer programming)
    Citations (1)