The order Hypocreales (Ascomycota) is composed of ubiquitous and ecologically diverse fungi, including saprobes, biotrophs, and pathogens. Despite their phylogenetic relationship, these species exhibit high variability in biomolecule production, lifestyle, and fitness. Mitochondria play an important role in fungal biology, providing energy to the cells and regulating diverse processes, such as the immune response. Despite their importance, the mechanisms that shape fungal mitogenomes are still poorly understood. Herein, we investigated the variability and evolution of mitogenomes and their relationship with divergence time, using the order Hypocreales as a study model. We sequenced and annotated, for the first time, the Trichoderma harzianum mitochondrial genome (mtDNA) and compared it with the publicly available mtDNAs of 34 other species. Comparative analysis revealed substantial structural and size variation in non-coding mtDNA regions, despite the conservation of copy number, length, and structure of protein-coding elements. Interestingly, we observed a highly significant correlation between mitogenome length and the number and size of non-coding sequences in the mitochondrial genome. Among the non-coding elements, group I and II introns and homing endonuclease genes (HEGs) were the main contributors to discrepancies in mitogenome structure and length. Several intronic sequences displayed sequence similarity among species; some were conserved even in their position within genes and were present in the majority of mitogenomes, indicating an origin in a common ancestor. On the other hand, we also identified species-specific introns, which point to an origin by different mechanisms. Investigation of mitochondrial gene transfer to the nuclear genome revealed that nuclear copies of nad5 are the most frequent, whereas atp8, atp9, and cox3 could not be identified in any of the nuclear genomes analyzed.
Moreover, we estimated the divergence time of each species and investigated its relationship with coding and non-coding elements as well as with mitogenome length. Altogether, our results demonstrate that introns and HEGs are key elements in shaping mitogenomes and that their presence in fast-evolving mtDNAs can be mostly explained by divergence time, although the intron-sharing profile suggests the involvement of other mechanisms in mitochondrial genome evolution, such as horizontal transfer.
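A relationship such as the one reported between mitogenome length and non-coding content can be checked with a rank correlation. The sketch below is only an illustration: the abstract does not state which statistic was used, so Spearman's rho is an assumption here, and the five value pairs are hypothetical placeholders rather than data from the study.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical placeholder values (base pairs), one entry per mitogenome.
mitogenome_length_bp = [24_000, 28_000, 32_000, 60_000, 95_000]
noncoding_length_bp = [4_000, 9_000, 8_000, 35_000, 70_000]

rho = spearman(mitogenome_length_bp, noncoding_length_bp)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based statistic is used in this sketch because genome lengths are often skewed; a test that also accounts for shared ancestry among species would need a phylogeny-aware method, which is beyond this illustration.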
RNA processing is a highly conserved mechanism that serves as a pivotal regulator of gene expression. Alternative processing can generate transcripts that are still translated but yield potentially nonfunctional proteins. A plethora of respiratory viruses, including severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), strategically manipulate the host's RNA processing machinery to circumvent antiviral responses. We integrated publicly available omics datasets to systematically analyze isoform-level expression and delineate the nascent peptide landscape of SARS-CoV-2-infected human cells. Our findings explore a suggested but uncharacterized mechanism, whereby SARS-CoV-2 infection induces the predominant expression of unproductive splicing isoforms in key IFN signaling genes, interferon-stimulated genes (ISGs), class I MHC genes, and splicing machinery genes, including IRF7, HLA-B, and HNRNPH1. In stark contrast, cytokine and chemokine genes, such as IL6 and TNF, predominantly express productive (protein-coding) splicing isoforms in response to SARS-CoV-2 infection. We postulate that SARS-CoV-2 employs an unreported tactic of exploiting the host splicing machinery to bolster viral replication and subvert the immune response by selectively upregulating unproductive splicing isoforms from antigen presentation and antiviral response genes. Our study sheds new light on the molecular interplay between SARS-CoV-2 and the host immune system, offering a foundation for the development of novel therapeutic strategies to combat COVID-19.
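One simple way to quantify whether a gene predominantly expresses unproductive isoforms is the share of its total expression carried by isoforms flagged as unproductive. The sketch below assumes that isoform-level abundances (for example in TPM) and a productive/unproductive flag per isoform are already available. The function name, record layout, gene labels, and numbers are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def unproductive_fraction(isoforms):
    """Return, per gene, the fraction of expression from unproductive isoforms.

    `isoforms` is an iterable of (gene, isoform_id, tpm, is_productive) tuples.
    Genes with zero total expression are omitted from the result.
    """
    total = defaultdict(float)
    unproductive = defaultdict(float)
    for gene, _isoform_id, tpm, is_productive in isoforms:
        total[gene] += tpm
        if not is_productive:
            unproductive[gene] += tpm
    return {gene: unproductive[gene] / total[gene]
            for gene in total if total[gene] > 0}


# Hypothetical placeholder records: (gene, isoform_id, TPM, is_productive).
records = [
    ("GENE_A", "GENE_A-1", 20.0, True),
    ("GENE_A", "GENE_A-2", 60.0, False),
    ("GENE_B", "GENE_B-1", 90.0, True),
    ("GENE_B", "GENE_B-2", 10.0, False),
]

fractions = unproductive_fraction(records)
for gene, fraction in sorted(fractions.items()):
    label = "unproductive" if fraction > 0.5 else "productive"
    print(f"{gene}: {fraction:.2f} of expression is unproductive ({label}-dominant)")
```

The 0.5 cutoff for calling a gene unproductive-dominant is an arbitrary choice for this illustration; comparing the fraction between infected and uninfected samples would be closer to the contrast described above.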
The increasing availability of genomic, annotation, evolutionary and phenotypic data for species contrasts with the lack of studies that adequately integrate these heterogeneous data sources to produce biologically meaningful knowledge. Here, we present CALANGO, a phylogeny-aware comparative genomics tool that uncovers functional molecular convergences and homologous regions associated with quantitative genotypes and phenotypes across species, enabling the fast discovery of novel statistically sound, biologically relevant phenotype-genotype associations. We demonstrate the usefulness of CALANGO in two case studies. The first one unveils potential causal links between prophage density and the pathogenicity phenotype in Escherichia coli, and confidently demonstrates how CALANGO supports the investigation of basic causal relationships by enabling a level of counterfactual investigation of observed associations in the data. As a second case study, we used our tool to search for homologous regions associated with a complex phenotypic trait in a major group of eukaryotes: the evolution of maximum height in angiosperms. We confidently identify a previously unknown association between maximum plant height and the expansion of the self-incompatibility system, a molecular mechanism that prevents inbreeding and increases genetic diversity. Taller species also have lower rates of molecular evolution due to their longer generation times, a critical concern for their long-term viability. The new mechanism we report could counterbalance this fact, and have far-reaching consequences for fields as diverse as conservation biology and agriculture. CALANGO is provided as a fully operational R package that can be freely installed from CRAN.
SARS-CoV-2 infection depends on the binding of the viral Spike glycoprotein (S) to the human receptor Angiotensin Converting Enzyme 2 (ACE2) to induce virus–cell membrane fusion. The S protein has evolved diverse amino acid changes that are possibly linked to more efficient binding to human ACE2, which might explain part of the increase in frequency of SARS-CoV-2 Variants Of Concern (VOCs). In this work, we investigated the role of ACE2 protein variations that are naturally found in human populations and their binding affinity with the S protein from representative SARS-CoV-2 genotypes, based on a series of in silico approaches involving molecular modelling, docking and molecular dynamics simulations. Our results indicate that SARS-CoV-2 VOCs bind more efficiently to the human receptor ACE2 than the ancestral Wuhan genotype. Additionally, variations in the ACE2 protein can affect SARS-CoV-2 binding and protein-protein stability, mostly weakening the interaction and, in some cases, making it unstable. We show that some VOCs, such as B.1.1.7 and P.1, are much less sensitive to ACE2 variants, while others, like B.1.351, appear to be specifically optimized to bind to the widespread wild-type ACE2 protein. Communicated by Ramaswamy H. Sarma
The cell shape and morphology of plant tissues are intimately related to structural modifications in the primary cell wall that are associated with key processes in the regulation of cell growth and differentiation. The primary cell wall is composed mainly of cellulose immersed in a matrix of hemicellulose, pectin, lignin and some structural proteins. Xyloglucan is a hemicellulose polysaccharide present in the cell walls of all land plants (Embryophyta) and is the main hemicellulose in non-graminaceous angiosperms. In this work, we used a comparative genomic approach to obtain new insights into the evolution of the xyloglucan-related enzymatic machinery in green plants. Detailed phylogenetic analyses were done for enzymes involved in xyloglucan synthesis (β-(1→4)-glucan synthase, α-fucosyltransferases, β-galactosyltransferases and α-xylosyl transferase) and mobilization/degradation (xyloglucan transglycosylase/hydrolase, α-xylosidase, β-galactosidase, β-glucosidase and α-fucosidase) based on 12 fully sequenced genomes and expressed sequence tags from 29 species of green plants. Evidence from Chlorophyta and Streptophyta green algae indicated that part of the Embryophyta xyloglucan-related machinery evolved in an aquatic environment, before land colonization. Streptophyte algae have at least three enzymes of the xyloglucan machinery: xyloglucan transglycosylase/hydrolase, β-(1→4)-glucan synthase from the cellulose synthase-like C family, and α-xylosidase, which is also present in chlorophytes. Interestingly, gymnosperm sequences orthologous to xyloglucan transglycosylase/hydrolases with exclusively hydrolytic activity were also detected, suggesting that such activity must have emerged within the last common ancestor of spermatophytes. There was a positive correlation between the number of founder genes within each gene family and the complexity of the plant cell wall.
Our data support the idea that a primordial xyloglucan-like polymer emerged in streptophyte algae as a pre-adaptation that allowed plants to subsequently colonize terrestrial habitats. Our results also provide additional evidence that charophycean algae and land plants are sister groups.
Transcriptional regulation, led by transcription factors (TFs) such as those of the WRKY family, is a mechanism used by the organism to enhance or repress gene expression in response to stimuli. Here, we report a genome-wide analysis of the Theobroma cacao WRKY TF family and investigate the expression of WRKY genes in cacao infected by the fungus Moniliophthora perniciosa. In the cacao genome, 61 non-redundant WRKY sequences were found and classified into three groups (I to III) according to their WRKY and zinc-finger motif types. The 61 putative WRKY sequences were distributed across the 10 cacao chromosomes, and 24 of them originated from duplication events. The sequences were phylogenetically organized according to the general WRKY groups. The phylogenetic analysis revealed that subgroups IIa and IIb are sister groups that share a common ancestor, as are subgroups IId and IIe. The most divergent groups according to plant origin were IIc and III. Based on the phylogenetic analysis, 7 TcWRKY genes were selected and analyzed by RT-qPCR in susceptible and resistant cacao plants infected (or not) with M. perniciosa. Some TcWRKY genes presented interesting responses to M. perniciosa, such as Tc01_p014750/Tc06_p013130/AtWRKY28, Tc09_p001530/Tc06_p004420/AtWRKY40, Tc04_p016130/AtWRKY54 and Tc10_p016570/AtWRKY70. Our results can help to select appropriate candidate genes for further characterization in cacao or in other Theobroma species.
The abundance of plant genomic information caused by the decrease in sequencing costs contrasts with the lack of databases that integrate genome annotation, taxonomy and phenotypes to produce statistically sound, biologically meaningful knowledge. Here we present ARCADE (ARChaeplastida Annotation DatabasE), a database of 171 high-quality archaeplastidian non-redundant proteomes gathered from six primary genomic databases, together with proteome quality metrics and a growing number of associated metadata. As a case study to demonstrate the usefulness of ARCADE, we used it to investigate the expansion and contraction of protein domains associated with the evolution of genome size (hereafter GS). GS varies greatly among land plants and the synthesis of large genomes can be costly to cells. Although GS has been studied extensively for decades, the molecular mechanisms involved in the adaptation of plants to increases in GS are still poorly understood. We used the annotation and phylogenetic information available in ARCADE, together with estimated GS values available for 83 land plant species, to search for associations between the abundance of protein domain families in these species and GS variation through phylogeny-aware methods. Additionally, we estimated the GS for the ancestral nodes of the extant land plant species. GS seems to be decreasing along the course of evolution, except for a few branches that might have undergone independent GS increases. We found 7 Pfam domain families correlated with the variation in GS in land plants, mainly related to nucleotide metabolism, DNA repair and genome organization. We found larger genomes to have a greater frequency of the Histone 2A superfamily, responsible for diverse functions, including nucleosome formation and silencing of transposable elements.
The molecular functions we found to be correlated with GS variation may be associated with preserving genome stability in larger genomes, and might indicate the evolution of mechanisms to cope with the variation in GS in land plants. ARCADE is available at https://osf.io/2fkvh/.