Viral host prediction with Deep Learning

2019 
Zoonosis, the natural transmission of infections from animals to humans, is a far-reaching global problem. The recent outbreaks of Zika virus and Ebola virus are examples of viral zoonosis, which occurs more frequently due to globalization. In case of a virus outbreak, it is helpful to know which host organism was the original carrier of the virus. Once the reservoir or intermediate host is known, it can be isolated to prevent further spreading of the viral infection. Recent approaches aim to predict a viral host based on the viral genome, often in combination with the potential host genome and using arbitrarily selected features. These methods are clearly limited in either the number of different hosts they can predict or the accuracy of the prediction. Here, we present a fast and accurate deep learning approach for viral host prediction, which is based on the viral genome sequence only. To ensure a high prediction accuracy, we developed an effective selection approach for the training data that avoids biases due to a highly unbalanced number of known sequences per virus-host combination. We tested our deep neural network on three different virus species (influenza A virus, rabies lyssavirus, rotavirus A) and reached an AUC between 0.94 and 0.98 for each virus species, outperforming previous approaches and allowing highly accurate predictions while using only fractions of the viral genome sequences. We show that deep neural networks are suitable for predicting the host of a virus, even with a limited number of sequences and highly unbalanced available data. The deep neural networks trained for this approach form the core of the virus host prediction tool VIDHOP (VIrus Deep learning HOst Prediction).
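The abstract does not spell out how the training data are selected, so the following is only a generic sketch of one common way to counter a highly unbalanced number of sequences per host: drawing the same number of sequences from every host class. The function name `balanced_sample`, the `per_host` parameter, and the `(sequence, host)` record layout are assumptions made for illustration and are not taken from VIDHOP.

```python
import random
from collections import defaultdict


def balanced_sample(records, per_host, seed=0):
    """Draw `per_host` sequences for every host label (illustrative only).

    `records` is a list of (sequence, host) pairs. Hosts with enough
    sequences are subsampled without replacement; hosts with too few
    are sampled with replacement, so rare hosts are repeated.
    """
    rng = random.Random(seed)

    # Group sequences by their host label.
    by_host = defaultdict(list)
    for sequence, host in records:
        by_host[host].append(sequence)

    sample = []
    for host in sorted(by_host):  # sorted for reproducible ordering
        sequences = by_host[host]
        if len(sequences) >= per_host:
            chosen = rng.sample(sequences, per_host)
        else:
            chosen = [rng.choice(sequences) for _ in range(per_host)]
        sample.extend((sequence, host) for sequence in chosen)

    # Shuffle so that hosts are interleaved in the returned list.
    rng.shuffle(sample)
    return sample


if __name__ == "__main__":
    # Toy data: 50 "human" sequences versus only 3 "swine" sequences.
    toy = [("A" * i, "human") for i in range(1, 51)]
    toy += [("C" * i, "swine") for i in range(1, 4)]
    batch = balanced_sample(toy, per_host=10)
    print(len(batch))  # each host contributes exactly per_host entries
```

Sampling with replacement for rare hosts keeps every class equally represented at the cost of repeated examples; whether the published approach does this, or something different, cannot be determined from the abstract alone.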