PlasmidHostFinder: Prediction of plasmid hosts using random forest

2021 
Plasmids play a major role in facilitating the spread of antimicrobial resistance between bacteria. Understanding the host range and dissemination trajectories of plasmids is critical for surveillance and prevention of antimicrobial resistance. Because plasmids are diverse and genetically plastic, identification of plasmid host ranges could be improved by using automated pattern detection methods rather than homology-based methods. In this study, we developed a method for predicting the host range of plasmids based on the random forest machine learning method. We trained the models, one set per taxonomic level, with 8,519 plasmids from 359 different bacterial species; the models achieved Matthews correlation coefficients of 0.662 and 0.867 at the species and order levels, respectively. Our results suggest that despite the diverse nature and genetic plasticity of plasmids, our random forest model can accurately distinguish between plasmid hosts. This tool can be used online through the Center for Genomic Epidemiology (https://cge.cbs.dtu.dk/services/PlasmidHostFinder/).

Importance: Antimicrobial resistance is a global health threat to humans and animals, causing high mortality and morbidity and effectively ending decades of success in fighting bacterial infections. Plasmids confer extra genetic capabilities on their host organisms through accessory genes, which can encode antimicrobial resistance and virulence factors. In addition to vertical inheritance, plasmids can be transferred horizontally between bacterial taxa. Therefore, detecting the host range of plasmids is crucial for understanding and predicting the dissemination trajectories of extrachromosomal genes and bacterial evolution, as well as for taking effective countermeasures against antimicrobial resistance.
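For readers unfamiliar with the Matthews correlation coefficient (MCC) reported in the abstract, the following is a minimal sketch of how it is computed for a single binary decision, for example "is this plasmid hosted by taxon X or not". The function name `mcc_binary` and the confusion-matrix counts are hypothetical and purely illustrative; the abstract does not state how the authors aggregated scores across taxa, so this should not be read as their evaluation code.

```python
import math


def mcc_binary(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns a value in [-1, 1]: 1 is perfect prediction, 0 is no better
    than chance, and -1 is total disagreement. By convention, 0.0 is
    returned when any marginal total is zero (the ratio is undefined).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Illustrative (made-up) counts for one hypothetical host taxon.
example_score = mcc_binary(tp=90, tn=80, fp=20, fn=10)
print(f"MCC = {example_score:.3f}")
```

Unlike plain accuracy, the MCC uses all four cells of the confusion matrix, which makes it a reasonable choice when host classes are unevenly represented among the training plasmids.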