Comparative Genomics and Integrated Network Approach Unveiled Undirected Phylogeny Patterns, Co-mutational Hotspots, Functional Crosstalk and Regulatory Interactions in SARS-CoV-2

2021 
The SARS-CoV-2 pandemic resulted in 92 million cases in the span of one year. This study focuses on understanding the population-specific variations that contribute to its high rate of infection in specific geographical regions, particularly the USA. Rigorous phylogenomic network analysis of 245 complete SARS-CoV-2 genomes inferred five central clades, named a (ancestral), b, c, d and e (subtypes e1 and e2). Clades d and e2 were found to consist exclusively of genomes from the USA. The clades were distinguished by 10 co-mutational combinations in Nsp3, ORF8, Nsp13, S, Nsp12, Nsp2 and Nsp6. Our analysis revealed that only 67.46% of SNP mutations resulted in changes at the amino acid level. The T1103P mutation in Nsp3, predicted to increase protein stability, was found in 238 strains, all except the 6 strains marked as the ancestral type, whereas the co-mutation (P409L and Y446C) in Nsp13 was found in 64 genomes from the USA, highlighting its 100% co-occurrence. Docking showed that the D614G mutation reduced binding of the Spike protein to ACE2 but improved its interaction with TMPRSS2, contributing to high transmissibility among USA strains. We also found that the host proteins MYO5A, MYO5B and MYO5C had the maximum number of interactions with the viral proteins (N, S, M). Thus, blocking the internalization pathway by inhibiting MYO5 proteins could be an effective strategy for COVID-19 treatment. The functional annotations of the host-pathogen interaction (HPI) network were found to be closely associated with hypoxia and thrombotic conditions, confirming the vulnerability and severity of infection. We also screened for CpG islands in Nsp1 and N, which confer on SARS-CoV-2 the ability to enter the host cell and trigger ZAP activity.
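To make the notion of "100% co-occurrence" concrete, the following is a minimal sketch of one way such a figure could be computed from per-genome mutation calls. It is not the authors' pipeline: the function name `co_occurrence`, the `"Protein:Mutation"` label format, and the four toy strains are all assumptions made for illustration, and the definition used here (genomes carrying both mutations divided by genomes carrying at least one) is only one plausible reading of the term.

```python
def co_occurrence(genome_mutations, mut_a, mut_b):
    """Fraction of genomes carrying mut_a or mut_b that carry both.

    genome_mutations maps a genome ID to the set of mutation labels
    called in that genome. Returns 0.0 if neither mutation is observed.
    """
    # Keep only genomes in which at least one of the two mutations appears.
    carriers = [
        muts for muts in genome_mutations.values()
        if mut_a in muts or mut_b in muts
    ]
    if not carriers:
        return 0.0
    # Count the carriers in which both mutations appear together.
    both = sum(1 for muts in carriers if mut_a in muts and mut_b in muts)
    return both / len(carriers)


# Hypothetical toy data, NOT taken from the study's 245 genomes.
genomes = {
    "strain_1": {"Nsp13:P409L", "Nsp13:Y446C", "S:D614G"},
    "strain_2": {"Nsp13:P409L", "Nsp13:Y446C"},
    "strain_3": {"S:D614G", "Nsp3:T1103P"},
    "strain_4": {"Nsp3:T1103P"},
}

rate = co_occurrence(genomes, "Nsp13:P409L", "Nsp13:Y446C")
print(f"P409L/Y446C co-occurrence: {rate:.0%}")  # prints "P409L/Y446C co-occurrence: 100%"
```

In this toy set the two Nsp13 mutations never appear apart, so the ratio is 1.0; a pair such as `S:D614G` and `Nsp3:T1103P`, which appear together in only one of the three genomes carrying either, would score lower.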