Identification of the hyper-variable genomic hotspot for the novel coronavirus SARS-CoV-2

2020 
A recent study in this journal examined the genomes of the novel SARS-like coronavirus (SARS-CoV-2) in China and suggested that SARS-CoV-2 had undergone genetic recombination with SARS-related CoV [1]. By February 14, 2020, a total of 66,576 confirmed cases of COVID-19 (people infected with SARS-CoV-2) had been reported in China, with 1,524 deaths, according to the Chinese CDC (http://2019ncov.chinacdc.cn/2019-nCoV/). Several full genomic sequences of this virus have been released for the study of its evolutionary origin and molecular characteristics [2, 3, 4]. Here, we analyzed the mutations that may have arisen after the virus became epidemic among humans, as well as the mutations that may have resulted in human adaptation.
BetaCoV sequences were downloaded from the GISAID platform [5] on February 3, 2020. A total of 58 accessions were available, among which BetaCoV/bat/Yunnan/RaTG13/2013 is a known close relative of SARS-CoV-2. Four accessions, namely BetaCov/Italy/INM1/2020, BetaCov/Italy/INM2/2020, BetaCoV/Kanagawa/1/2020, and BetaCoV/USA/IL1/2020, were excluded because of truncated sequences or multiple ambiguous nucleotides. A total of 54 accessions (Supplementary table 1) isolated from humans were used in the following analysis. The gene sequences of SARS coronavirus (NC_004718.3) [6] were used to define the protein products of SARS-CoV-2. The protein sequences of the ORF1ab, S, E, M, and N genes were translated, and all loci without experimental evidence were excluded. First, the protein sequences of SARS-CoV-2 were compared with those of RaTG13, human SARS (NC_004718.3), bat SARS (DQ022305.2), and human MERS (NC_019843.3) by calculating the similarity within a sliding window (Figure 1A). The window size was set to 500 for ORF1ab and S, and to 50 for the E, M, and N proteins because of their short length.
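The sliding-window comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the text does not state the similarity measure, the alignment method, or the step size, so the sketch assumes that the two protein sequences are already aligned to equal length (gaps written as "-"), that similarity is the fraction of identical residues in each window, and that the window advances one position at a time. The function name `sliding_window_identity` and its parameters are hypothetical.

```python
from typing import List


def sliding_window_identity(ref: str, other: str, window: int, step: int = 1) -> List[float]:
    """Return the fraction of identical residues in each window.

    Assumption: `ref` and `other` are pre-aligned sequences of equal length,
    with gaps written as '-'. A gap never counts as a match.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if len(ref) < window:
        return []  # sequence shorter than one window: nothing to report

    # 1 where the aligned residues are identical (and not gaps), else 0
    matches = [1 if a == b and a != "-" else 0 for a, b in zip(ref, other)]

    scores = []
    for start in range(0, len(ref) - window + 1, step):
        scores.append(sum(matches[start:start + window]) / window)
    return scores


# Toy example with made-up sequences (not real viral proteins)
print(sliding_window_identity("AAAA", "AABB", window=2))  # [1.0, 0.5, 0.0]
```

Under these assumptions, the settings in the text would correspond to `window=500` for ORF1ab and S and `window=50` for E, M, and N. Plotting the returned values against window start position gives a similarity profile of the kind referred to as Figure 1A.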
SARS-CoV-2 was highly similar to RaTG13 isolated from bats, showing 96% identity based on the whole nucleotide sequences and 83% based on the protein sequences, suggesting a bat zoonotic origin of SARS-CoV-2. ORF1a and the head of S seemed to have diverged from other beta coronaviruses.