Prediction of bacterial associations with plants using a supervised machine-learning approach.

2016 
Summary
Recent cases of fresh-produce contamination by human enteric pathogens have resulted in severe food-borne outbreaks, and a new paradigm has emerged stating that some human-associated bacteria can use plants as secondary hosts. As a consequence, there is growing concern in the scientific community about these interactions, which have not yet been elucidated. Because this is a relatively new area, strategies for addressing food-borne illness caused by the ingestion of fruits and vegetables are lacking. In the present study, we performed specific genome annotations to train a supervised machine-learning model that identifies plant-associated bacteria with a precision of ∼93%. Applying our method to approximately 9500 genomes predicted several previously unknown interactions between well-known human pathogens and plants, and it also confirmed several cases for which evidence has already been reported. Factors involved in adhesion, deconstruction of the plant cell wall and detoxifying activities emerged as the most predictive features. Applied to sequenced strains involved in food poisoning, our strategy can serve as a primary screening tool for determining the possible causes of contamination.
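For readers unfamiliar with the metric, the precision figure quoted in the summary is the fraction of genomes predicted to be plant-associated that truly are plant-associated, TP / (TP + FP). The following is a minimal sketch of that calculation only; the function name and the toy label lists are hypothetical illustrations and are not data or code from the study.

```python
def precision(true_labels, predicted_labels):
    """Return TP / (TP + FP) for binary labels (1 = plant-associated, 0 = not)."""
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: predicted plant-associated and actually plant-associated.
    tp = sum(1 for t, p in pairs if p == 1 and t == 1)
    # False positives: predicted plant-associated but actually not.
    fp = sum(1 for t, p in pairs if p == 1 and t == 0)
    if tp + fp == 0:
        raise ValueError("precision is undefined without positive predictions")
    return tp / (tp + fp)


# Hypothetical toy labels for eight genomes (not results from the study).
truth = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 1, 0, 0, 1, 1, 0, 1]

print(f"precision = {precision(truth, predicted):.2f}")
```

Precision says nothing about plant-associated genomes the model misses, so a screening tool would normally be judged on recall as well.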