Genomic rearrangements uncovered by genome-wide co-evolution analysis of a major nosocomial pathogen Enterococcus faecium

2020 
Enterococcus faecium is a commensal of the gastrointestinal tract, but is also known as a nosocomial pathogen among hospitalized patients. Population genetics based on whole-genome sequencing has revealed that E. faecium strains from hospitalized patients form a distinct clade, designated clade A1, and that plasmids are major contributors to the emergence of nosocomial E. faecium. Here we further explored the adaptive evolution of E. faecium using a genome-wide co-evolution study (GWES) to identify co-evolving SNPs. We identified three genomic regions harboring large numbers of SNPs in tight linkage that are not proximal to each other on the completely assembled chromosome of the clade A1 reference hospital isolate AUS0004. Close examination of these regions revealed that they are located at the borders of four different types of large-scale genomic rearrangements, at the insertion sites of two different genomic islands and an IS30-like transposon. In non-clade A1 isolates, these regions are adjacent to each other and lack the insertions of the genomic islands and the IS30-like transposon. Additionally, among the clade A1 isolates there is one group of pet isolates lacking the genomic rearrangement and the insertion of the genomic islands, suggesting a distinct evolutionary trajectory. In silico analysis of the biological functions of the genes encoded in the three regions revealed a common link to a stress response. This suggests that these rearrangements may reflect adaptation to the stringent conditions in the hospital environment, such as antibiotics and detergents, to which bacteria are exposed. In conclusion, to our knowledge, this is the first study using GWES to identify genomic rearrangements, suggesting that there is considerable untapped potential to unravel hidden evolutionary signals from population genomic data.

Impact statement
Enterococcus faecium has emerged as an important nosocomial pathogen around the world. Population genetics revealed that clinical E. faecium strains form a distinct clade, designated clade A1, and that plasmids are major contributors to the emergence of nosocomial E. faecium. Here, the adaptive evolution of E. faecium was further explored using an unsupervised machine learning method (SuperDCA) to identify genome-wide co-evolving SNPs. We identified three genomic regions harboring large numbers of SNPs in tight linkage that are separated by a large chromosomal distance in a clinical clade A1 reference isolate, but are adjacent to each other in non-clade A1 isolates. We identified four different types of large-scale genomic rearrangements and, in all cases, found insertions of two different genomic islands and an insertion element at the border. In contrast, no genomic rearrangements or insertions were identified among a group of clade A1 pet isolates, suggesting a distinct evolutionary trajectory. Based on the in silico predicted biological functions, we found a common link to a stress response for the genes encoded in the three regions. This suggests that these rearrangements may reflect adaptation to the stringent conditions in the hospital environment, such as antibiotics and detergents, to which bacteria are exposed.

Data summary
Raw core-genome alignment (1.1 MB, Harvest suite v1.1.2), including the 1,644 clade A isolates and the complete E. faecium AUS0004 genome (accession number CP003351) as a reference, is available in the following GitLab repository: https://gitlab.com/sirarredondo/efm_gwes.
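
The signal described above, SNP pairs that are strongly linked yet far apart on the reference chromosome, can be illustrated with a minimal sketch. This is not the SuperDCA method used in the study: plain pairwise mutual information stands in as a simpler coupling score. The function names, the toy alignment columns, the genome length, and both thresholds are hypothetical values chosen only for illustration. Distance is measured around a circular chromosome, which is an assumption of this sketch.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two aligned SNP columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        mi += p_ab * log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi


def circular_distance(pos_i, pos_j, genome_length):
    """Shortest distance between two positions on a circular chromosome."""
    d = abs(pos_i - pos_j)
    return min(d, genome_length - d)


def distant_linked_pairs(columns, genome_length, min_mi, min_distance):
    """Return (pos_i, pos_j, mi) for SNP pairs that are strongly coupled
    but at least min_distance apart on the reference chromosome.

    columns maps a reference position to a string of alleles,
    one character per isolate, in the same isolate order.
    """
    hits = []
    for (pi, ci), (pj, cj) in combinations(sorted(columns.items()), 2):
        if circular_distance(pi, pj, genome_length) < min_distance:
            continue  # nearby SNPs are expected to be linked; skip them
        mi = mutual_information(ci, cj)
        if mi >= min_mi:
            hits.append((pi, pj, mi))
    return hits


# Hypothetical toy data: 4 SNP positions scored in 8 isolates.
toy_columns = {
    100: "AAAATTTT",
    150: "AAAATTTT",   # linked to 100, but only 50 bp away
    5000: "GGGGCCCC",  # linked to 100 and 150, and far away
    9000: "ACACACAC",  # independent of the others
}
hits = distant_linked_pairs(
    toy_columns, genome_length=10_000, min_mi=0.9, min_distance=1_000
)
for pos_i, pos_j, mi in hits:
    print(pos_i, pos_j, round(mi, 3))
```

In this toy example only the pairs involving position 5000 and the two positions near 100 pass both filters. In real data, pairs like these would be candidates for closer inspection of the flanking regions, for example for rearrangement borders or insertion sites.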