The abundance of plant genomic information caused by the decrease in sequencing costs contrasts with the lack of databases that integrate genome annotation, taxonomy and phenotypes to produce statistically sound, biologically meaningful knowledge. Here we present ARCADE (ARChaeplastida Annotation DatabasE), a database of 171 high-quality archaeplastidian non-redundant proteomes gathered from six primary genomic databases, together with proteome quality metrics and a growing body of associated metadata. As a case study to demonstrate the usefulness of ARCADE, we used it to investigate the expansion and contraction of protein domains associated with the evolution of genome size (hereafter GS). GS varies greatly among land plants and the synthesis of large genomes can be costly to cells. Although GS has been studied extensively for decades, the molecular mechanisms involved in the adaptations of plants to the increase in GS are still poorly understood. We used the annotation and phylogenetic information available in ARCADE, together with estimated GS values available for 83 land plant species, to search for associations between the abundance of protein domain families in these species and GS variation through phylogeny-aware methods. Additionally, we estimated the GS for the ancestral nodes of the extant land plant species. GS appears to have decreased over the course of evolution, except for a few branches that might have undergone independent GS increases. We found seven Pfam domains correlated with the variation in GS in land plants, mainly related to nucleotide metabolism, DNA repair and genome organization. We found larger genomes to have a greater frequency of the Histone 2A superfamily, responsible for diverse functions, including nucleosome formation and silencing of transposable elements. The molecular functions we found to be correlated with GS variation may be associated with preserving genome stability in larger genomes, and might indicate the evolution of mechanisms to cope with the variation in GS in land plants. ARCADE is available at https://osf.io/2fkvh/.
The cell shape and morphology of plant tissues are intimately related to structural modifications in the primary cell wall that are associated with key processes in the regulation of cell growth and differentiation. The primary cell wall is composed mainly of cellulose immersed in a matrix of hemicellulose, pectin, lignin and some structural proteins. Xyloglucan is a hemicellulose polysaccharide present in the cell walls of all land plants (Embryophyta) and is the main hemicellulose in non-graminaceous angiosperms. In this work, we used a comparative genomic approach to obtain new insights into the evolution of the xyloglucan-related enzymatic machinery in green plants. Detailed phylogenetic analyses were done for enzymes involved in xyloglucan synthesis (β-(1→4)-glucan synthase, α-fucosyltransferases, β-galactosyltransferases and α-xylosyltransferase) and mobilization/degradation (xyloglucan transglycosylase/hydrolase, α-xylosidase, β-galactosidase, β-glucosidase and α-fucosidase) based on 12 fully sequenced genomes and expressed sequence tags from 29 species of green plants. Evidence from Chlorophyta and Streptophyta green algae indicated that part of the Embryophyta xyloglucan-related machinery evolved in an aquatic environment, before land colonization. Streptophyte algae have at least three enzymes of the xyloglucan machinery: xyloglucan transglycosylase/hydrolase, β-(1→4)-glucan synthase from the cellulose synthase-like C family, and α-xylosidase, which is also present in chlorophytes. Interestingly, gymnosperm sequences orthologous to xyloglucan transglycosylase/hydrolases with exclusively hydrolytic activity were also detected, suggesting that such activity must have emerged within the last common ancestor of spermatophytes. There was a positive correlation between the numbers of founder genes within each gene family and the complexity of the plant cell wall. Our data support the idea that a primordial xyloglucan-like polymer emerged in streptophyte algae as a pre-adaptation that allowed plants to subsequently colonize terrestrial habitats. Our results also provide additional evidence that charophycean algae and land plants are sister groups.
Living species vary significantly in phenotype and genomic content. Sophisticated statistical methods linking genes with phenotypes within a species have led to breakthroughs in complex genetic diseases and genetic breeding. Despite the abundance of genomic and phenotypic data available for thousands of species, finding genotype-phenotype associations across species is challenging due to the non-independence of species data resulting from common ancestry. To address this, we present CALANGO (comparative analysis with annotation-based genomic components), a phylogeny-aware comparative genomics tool to find homologous regions and biological roles associated with quantitative phenotypes across species. In two case studies, CALANGO identified both known and previously unidentified genotype-phenotype associations. The first study revealed unknown aspects of the ecological interaction between Escherichia coli, its integrated bacteriophages, and the pathogenicity phenotype. The second identified an association between maximum height in angiosperms and the expansion of a reproductive mechanism that prevents inbreeding and increases genetic diversity, with implications for conservation biology and agriculture.
Glucose modulates plant metabolism, growth, and development. In Arabidopsis (Arabidopsis thaliana), Hexokinase1 (HXK1) is a glucose sensor that may trigger abscisic acid (ABA) synthesis and sensitivity to mediate glucose-induced inhibition of seedling development. Here, we show that the intensity of short-term responses to glucose can vary with ABA activity. We report that the transient (2 h/4 h) repression by 2% glucose of AtbZIP63, a gene encoding a basic-leucine zipper (bZIP) transcription factor partially involved in the Snf1-related kinase KIN10-induced responses to energy limitation, is independent of HXK1 and is not mediated by changes in ABA levels. However, high-concentration (6%) glucose-mediated repression appears to be modulated by ABA, since full repression of AtbZIP63 requires a functional ABA biosynthetic pathway. Furthermore, the combination of glucose and ABA was able to trigger a synergistic repression of AtbZIP63 and its homologue AtbZIP3, revealing a shared regulatory feature consisting of the modulation of glucose sensitivity by ABA. The synergistic regulation of AtbZIP63 was not reproduced by an AtbZIP63 promoter-5′-untranslated region::β-glucuronidase fusion, thus suggesting possible posttranscriptional control. A transcriptional inhibition assay with cordycepin provided further evidence for the regulation of mRNA decay in response to glucose plus ABA. Overall, these results indicate that AtbZIP63 is an important node of the glucose-ABA interaction network. The mechanisms by which AtbZIP63 may participate in the fine-tuning of ABA-mediated abiotic stress responses according to sugar availability (i.e., energy status) are discussed.
Sucrose content is a highly desirable trait in sugarcane as the worldwide demand for cost-effective biofuels surges. Sugarcane cultivars differ in their capacity to accumulate sucrose and breeding programs routinely perform crosses to identify genotypes able to produce more sucrose. Sucrose content in the mature internodes reaches around 20% of the culm dry weight. Genotypes in the populations reflect their genetic program and may display contrasting growth, development, and physiology, all of which affect carbohydrate metabolism. Few studies have profiled gene expression related to sugarcane's sugar content. The identification of signal transduction components and transcription factors that might regulate sugar accumulation is highly desirable if we are to improve this characteristic of sugarcane plants. We have evaluated thirty genotypes that have different Brix (sugar) levels and identified genes differentially expressed in internodes using cDNA microarrays. These genes were compared to existing gene expression data for sugarcane plants subjected to diverse stress and hormone treatments. The comparisons revealed a strong overlap between the drought and sucrose-content datasets and a limited overlap with ABA signaling. Genes associated with sucrose content were extensively validated by qRT-PCR, which highlighted several protein kinases and transcription factors that are likely to be regulators of sucrose accumulation. The data also indicate that aquaporins, as well as lignin biosynthesis and cell wall metabolism genes, are strongly related to sucrose accumulation. Moreover, sucrose-associated genes were shown to be directly responsive to short-term sucrose stimuli, confirming their role in sugar-related pathways. Gene expression analysis of sugarcane populations contrasting for sucrose content indicated a possible overlap with drought and cell wall metabolism processes and suggested signaling and transcriptional regulators to be used as molecular markers in breeding programs. Transgenic research is necessary to further clarify the role of the genes and define targets useful for sugarcane improvement programs based on transgenic plants.
Splicing is a highly conserved, intricate mechanism intimately linked to transcription elongation, serving as a pivotal regulator of gene expression. Alternative splicing may generate specific transcripts incapable of undergoing translation into proteins, designated as unproductive. A plethora of respiratory viruses, including Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), strategically manipulate the host’s splicing machinery to circumvent antiviral responses. During the infection, SARS-CoV-2 effectively suppresses interferon (IFN) expression, leading to B cell and CD8+ T cell leukopenia, while simultaneously increasing the presence of macrophages and neutrophils in patients with severe COVID-19. In this study, we integrated publicly available omics datasets to systematically analyze transcripts at the isoform level and delineate the nascent-peptide translatome landscapes of SARS-CoV-2-infected human cells. Our findings reveal a hitherto uncharacterized mechanism whereby SARS-CoV-2 infection induces the predominant expression of unproductive splicing isoforms in key IFN signaling genes, interferon-stimulated genes (ISGs), class I MHC genes, and splicing machinery genes, including IRF7, OAS3, HLA-B, and HNRNPH1. In stark contrast, cytokine and chemokine genes, such as IL6, CXCL8, and TNF, predominantly express productive (protein-coding) splicing isoforms in response to SARS-CoV-2 infection. We postulate that SARS-CoV-2 employs a previously unreported tactic of exploiting the host splicing machinery to bolster viral replication and subvert the immune response by selectively upregulating unproductive splicing isoforms from antigen presentation and antiviral response genes. Our study sheds new light on the molecular interplay between SARS-CoV-2 and the host immune system, offering a foundation for the development of novel therapeutic strategies to combat COVID-19.
Ascorbate peroxidases (APX) are class I members of the Peroxidase-Catalase superfamily, a large group of evolutionarily related but rather divergent enzymes. Through mining in public databases, unusual subsets of APX homologs were identified, disclosing the existence of two as yet uncharacterized families of peroxidases named ascorbate peroxidase-related (APX-R) and ascorbate peroxidase-like (APX-L). Like APX, APX-R proteins harbor all catalytic residues required for peroxidatic activity. Nevertheless, proteins of this family do not contain residues known to be critical for ascorbate binding and therefore cannot use it as an electron donor. On the other hand, APX-L proteins lack not only ascorbate-binding residues but also every other residue known to be essential for peroxidase activity. Through a molecular phylogenetic analysis performed with sequences derived from basal Archaeplastida, the present study discloses the existence of hybrid proteins, which combine features of these three families. The results presented here show that the prevalence of hybrid proteins varies among distinct groups of organisms, accounting for up to 33% of total APX homologs in species of green algae. The analysis of this heterogeneous group of proteins sheds light on the origin of APX-R and APX-L and suggests the occurrence of a process characterized by the progressive deterioration of ascorbate-binding and catalytic sites towards neofunctionalization.