Background: Nitrogen (N) fertilization in crop production significantly impacts ecosystems, often disrupting natural plant-microbe-soil interactions and causing environmental pollution. Our research tested the hypothesis that phylogenetically related perennial grasses might preserve rhizosphere management strategies conducive to a sustainable N economy for crops. Method: We analyzed the N cycle in the rhizospheres of 36 Andropogoneae grass species related to maize and sorghum, investigating their impacts on N availability and losses. This assay was supplemented with the collection and comparison of native habitat environment data for ecological inference, as well as cross-species genomic and transcriptomic association analyses for candidate gene discovery. Result: Contrary to our hypothesis, all examined annual species, including sorghum and maize, functioned as N "Conservationists," reducing soil nitrification potential and conserving N. In contrast, some perennial species enhanced nitrification and leaching ("Leachers"). A few other species exhibited similar stimulation of nitrification but limited NO3- losses ("Nitrate Keepers"). We identified significant soil characteristics as influential factors in the eco-evolutionary dynamics of plant rhizospheres, and highlighted the crucial roles of a few transporter genes in soil N management and utilization. Conclusion: These findings serve as valuable guidelines for future breeding efforts for global sustainability.
ABSTRACT Poa pratensis, commonly known as Kentucky bluegrass, is a popular cool-season grass species used as turf in lawns and recreation areas globally. Despite its substantial economic value, a reference genome had not previously been assembled due to the genome’s relatively large size and biological complexity that includes apomixis, polyploidy, and interspecific hybridization. We report here a fortuitous de novo assembly and annotation of a P. pratensis genome. Instead of sequencing the genome of a C4 grass, we accidentally sampled and sequenced tissue from a weedy P. pratensis whose stolon was intertwined with that of the C4 grass. The draft assembly consists of 6.09 Gbp with an N50 scaffold length of 65.1 Mbp, and a total of 118 scaffolds, generated using PacBio long reads and Bionano optical map technology. We annotated 256K gene models and found 58% of the genome to be composed of transposable elements. To demonstrate the applicability of the reference genome, we evaluated population structure and estimated genetic diversity in P. pratensis collected from three North American prairies, two in Manitoba, Canada and one in Colorado, USA. Our results support previous studies that found high genetic diversity and population structure within the species. The reference genome and annotation will be an important resource for turfgrass breeding and study of bluegrasses.
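For readers unfamiliar with the N50 statistic quoted in the abstract above, the following is a minimal sketch of how it is conventionally computed from a list of scaffold lengths. The function name `n50` and the toy lengths are illustrative assumptions; this is not the authors' assembly pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example with made-up lengths (not the P. pratensis assembly).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

Applied to the real assembly, the input would be the 118 scaffold lengths, and the result would be expected to match the reported 65.1 Mbp.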
The genus Andropogon sensu lato is known to be polyphyletic. Accordingly, we here adjust part of the classification of the genus to reflect its evolutionary history and morphological diversity. A plastome phylogeny including 20 new plastome sequences confirms a well-supported clade of species broadly corresponding to Andropogon section Leptopogon. Morphological diversity was assessed across Andropogon sensu lato using specimens held at the K, MO, and A/GH herbaria, GrassBase, and photographs of spikelet pairs, with an emphasis on identifying members of this clade and their distinguishing features. The genus Anatherum is here reestablished, expanded to incorporate 45 of the 131 Andropogon sensu lato species worldwide, and described and illustrated. Five species names in Anatherum are reinstated and new combinations are made for 40 species and one subspecies. Anatherum is most common and diverse in the Americas but also commonly found across Africa. Few species occur in Europe or Asia. Anatherum inflorescences generally have 2 branches, linear and slender internodes and pedicels with long trichomes, small elliptic to lanceolate spikelets, and flat to concave 2-keeled lower glumes with no intercarinal veins visible. Generic circumscription in this group is complicated by its polyploid history and limited understanding of the relationship between genomic composition and key morphological characters. Five species of doubtful generic affiliation are listed for future analysis.
ABSTRACT Sorghum and its relatives in the grass tribe Andropogoneae bear their flowers in pairs of spikelets, in which one spikelet (seed-bearing, or SS) of the pair produces a seed and the other is sterile or male (staminate). This division of function does not occur in other major cereals such as wheat or rice. Additionally, one bract of the seed-bearing spikelet often produces a long extension, the awn, which is in the same position as but independently derived from that of wheat and rice. The function of the sterile spikelet is unknown and that of the awn has not been tested in Andropogoneae. We used radioactive and stable isotopes of carbon, as well as RNA-seq of metabolically important enzymes to show that the sterile spikelet assimilates carbon, which is translocated to the largely heterotrophic SS, thereby functioning as a nurse tissue. The awn shows no evidence of photosynthesis. These results apply to distantly related species of Andropogoneae. Thus, the sterile spikelet, but not the awn, could affect yield in the cultivated species and fitness in the wild ones.
Societal Impact Statement The current rate of global biodiversity loss creates a pressing need to increase efficiency and throughput of extinction risk assessments in plants. We must assess as many plant species as possible, working with imperfect knowledge, to address the habitat loss and extinction threats of the Anthropocene. Using the biodiversity database, Botanical Information and Ecology Network (BIEN), and the Andropogoneae grass tribe as a case study, we demonstrate that large‐scale, preliminary conservation assessments can play a fundamental role in accelerating plant conservation pipelines and setting priorities for more in‐depth investigations. Summary The International Union for the Conservation of Nature (IUCN) Red List criteria are widely used to determine extinction risks of plant and animal life. Here, we used The Red List's criterion B, Geographic Range Size, to provide preliminary conservation assessments of the members of a large tribe of grasses, the Andropogoneae, with ~1100 species, including maize, sorghum, and sugarcane and their wild relatives. We used georeferenced occurrence data from the Botanical Information and Ecology Network (BIEN) and automated individual species assessments using ConR to demonstrate efficacy and accuracy in using time‐saving tools for conservation research. We validated our results with those from the IUCN‐recommended assessment tool, GeoCAT. We discovered a remarkably large gap in digitized information, with slightly more than 50% of the Andropogoneae lacking sufficient information for assessment. ConR and GeoCAT largely agree on which taxa are of least concern (>90%) or possibly threatened (<10%), highlighting that automating assessments with ConR is a viable strategy for preliminary conservation assessments of large plant groups. Results for crop wild relatives are similar to those for the entire dataset. Increasing digitization and collection needs to be a high priority. Available rapid assessment tools can then be used to identify species that warrant more comprehensive investigation.
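To make the range-size idea behind criterion B concrete, here is a minimal sketch of one of its two metrics, area of occupancy (AOO), estimated by counting occupied 2 km x 2 km grid cells. It assumes occurrence points are already projected to metres in an equal-area coordinate system. The function names are hypothetical, and this is not the ConR or GeoCAT implementation: those tools also compute extent of occurrence, and a full criterion B listing additionally requires subcriteria (such as number of locations or continuing decline) that this sketch does not check.

```python
import math


def aoo_km2(points_m, cell_m=2000):
    """Estimate area of occupancy from projected (x, y) points in metres.

    Each point is assigned to a square grid cell of side cell_m;
    AOO is the number of distinct occupied cells times the cell area.
    A fixed grid origin at (0, 0) is assumed here for simplicity.
    """
    cells = {
        (math.floor(x / cell_m), math.floor(y / cell_m))
        for x, y in points_m
    }
    return len(cells) * (cell_m / 1000) ** 2


def preliminary_b2_category(aoo):
    """Map AOO (km^2) to the IUCN criterion B2 size thresholds only."""
    if aoo < 10:
        return "CR"
    if aoo < 500:
        return "EN"
    if aoo < 2000:
        return "VU"
    return "LC or NT"


# Toy occurrence points (made up): two fall in the same cell.
toy_points = [(0, 0), (100, 100), (2500, 0), (-1, 0)]
toy_aoo = aoo_km2(toy_points)
toy_category = preliminary_b2_category(toy_aoo)
```

Because the result depends on where the grid is anchored, a more careful estimate would try several grid origins and keep the smallest cell count.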
Abstract Non-coding regions of the genome are just as important as coding regions for understanding the mapping from genotype to phenotype. Interpreting deep learning models trained on RNA-seq is an emerging method to highlight functional sites within non-coding regions. Most of the work on RNA abundance models has been done within humans and mice, with little attention paid to plants. Here, we benchmark four genomic deep learning model architectures with genomes and RNA-seq data from 18 species closely related to maize and sorghum within the Andropogoneae. The Andropogoneae are a tribe of C4 grasses that have adapted to a wide range of environments worldwide since diverging 18 million years ago. Hundreds of millions of years of evolution across these species has produced a large, diverse pool of training alleles across species sharing a common physiology. As model input, we extracted 1,026 base pairs upstream of each gene’s translation start site. We held out maize as our test set and two closely related species as our validation set, training each architecture on the remaining Andropogoneae genomes. Within a panel of 26 maize lines, all architectures predict expression across genes moderately well but poorly across alleles. DanQ consistently ranked highest or second highest among all architectures yet performance was generally very similar across architectures despite orders of magnitude differences in size. This suggests that state-of-the-art supervised genomic deep learning models are able to generalize moderately well across related species but not sensitively separate alleles within species, the latter of which agrees with recent work within humans. We are releasing the preprocessed data and code for this work as a community benchmark to evaluate new architectures on our across-species and across-allele tasks.
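The input construction described in the abstract above (a fixed window of 1,026 bp upstream of each gene's translation start site) can be sketched as follows. The coordinate convention (0-based index of the first base of the start codon on the forward strand), the function names, and the choice to return a shorter window at chromosome ends are all assumptions made for illustration; the authors' released preprocessing code may differ.

```python
_COMPLEMENT = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_window(chrom_seq, start_pos, strand, width=1026):
    """Extract `width` bases immediately upstream of a start codon.

    chrom_seq : forward-strand chromosome sequence
    start_pos : 0-based forward-strand index of the first base of the
                start codon as read on the gene's own strand
    strand    : "+" or "-"
    The window is returned 5'->3' on the gene's strand and is shorter
    than `width` if it runs off the end of the chromosome.
    """
    if strand == "+":
        begin = max(0, start_pos - width)
        return chrom_seq[begin:start_pos]
    if strand == "-":
        # Upstream of a minus-strand gene lies to the right on the
        # forward strand, so slice there and reverse-complement.
        window = chrom_seq[start_pos + 1:start_pos + 1 + width]
        return reverse_complement(window)
    raise ValueError("strand must be '+' or '-'")


# Toy check: the same locus viewed from either strand gives one window.
forward = "TTTTGATGCC"            # ATG begins at index 5
plus_window = upstream_window(forward, 5, "+", width=3)
minus_window = upstream_window(reverse_complement(forward), 4, "-", width=3)
```

Returning the window in the gene's own orientation keeps promoter sequences comparable across strands, which matters when a single model is trained on genes from both.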
Abstract Assembled genomes and their associated annotations have transformed our study of gene function. However, each new assembly generates new gene models. Inconsistencies between annotations likely arise from biological and technical causes, including pseudogene misclassification, transposon activity, and intron retention from sequencing of unspliced transcripts. To evaluate gene model predictions, we developed reelGene, a pipeline of machine learning models focused on (1) transcription boundaries, (2) mRNA integrity, and (3) protein structure. The first two models leverage sequence characteristics and evolutionary conservation across related taxa to learn the grammar of conserved transcription boundaries and mRNA sequences, while the third uses conserved evolutionary grammar of protein sequences to predict whether a gene can produce a protein. Evaluating 1.8 million gene models in maize, reelGene found that 28% were incorrectly annotated or nonfunctional. By leveraging a large cohort of related species and through learning the conserved grammar of proteins, reelGene provides a tool for both evaluating gene model accuracy and genome biology.
Abstract Sorghum (Sorghum bicolor) and its relatives in the grass tribe Andropogoneae bear their flowers in pairs of spikelets in which one spikelet (seed-bearing or sessile spikelet [SS]) of the pair produces a seed and the other is sterile or male (staminate). This division of function does not occur in other major cereals such as wheat (Triticum aestivum) or rice (Oryza sativa). Additionally, one bract of the SS spikelet often produces a long extension, the awn, that is in the same position as, but independently derived from, that of wheat and rice. The function of the sterile spikelet is unknown and that of the awn has not been tested in Andropogoneae. We used radioactive and stable isotopes of carbon, RNA sequencing of metabolically important enzymes, and immunolocalization of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) to show that the sterile spikelet assimilates carbon, which is translocated to the largely heterotrophic SS. The awn shows no evidence of photosynthesis. These results apply to distantly related species of Andropogoneae. Removal of sterile spikelets in sorghum significantly decreases seed weight (yield) by ∼9%. Thus, the sterile spikelet, but not the awn, affects yield in the cultivated species and fitness in the wild species.
Abstract Over the last 20 million years, the Andropogoneae tribe of grasses has evolved to dominate 17% of global land area. Domestication of these grasses in the last 10,000 years has yielded our most productive crops, including maize, sugarcane, and sorghum. The majority of Andropogoneae species, including maize, show a history of polyploidy – a condition that, while offering the evolutionary advantage of multiple gene copies, poses challenges to basic cellular processes, gene expression, and epigenetic regulation. Genomic studies of polyploidy have been limited by sparse sampling of taxa in groups with multiple polyploidy events. Here, we present 33 genome assemblies from 27 species, including chromosome-scale assemblies of maize relatives Zea and Tripsacum. In maize, the after-effects of polyploidy have been widely studied, showing reduced chromosome number, biased fractionation of duplicate genes, and transposable element (TE) expansions. While we observe these patterns within the genus Zea, 12 other polyploidy events deviate significantly. Those tetraploids and hexaploids retain elevated chromosome number, maintain nearly complete complements of duplicate genes, and have only stochastic TE amplifications. These genomes reveal variable outcomes of polyploidy, challenging simple predictions and providing a foundation for understanding its evolutionary implications in an ecologically and economically important clade.
Summary The IUCN Red List criteria are widely used to determine extinction risks of plant and animal life. Here, we use The Red List’s criterion B, Geographic Range Size, to provide preliminary conservation assessments of the members of a large tribe of grasses, the Andropogoneae, with ∼1100 species, including maize, sorghum, and sugarcane and their wild relatives. We use georeferenced occurrence data from the Botanical Information and Ecology Network (BIEN) and automated individual species assessments using ConR to demonstrate efficacy and accuracy in using time-saving tools for conservation research. We validate our results with those from the IUCN-authorized assessment tool, GeoCAT. We discovered a remarkably large gap in digitized information, with slightly more than 50% of the Andropogoneae lacking sufficient information for assessment. ConR and GeoCAT largely agree on which taxa are of least concern (>90%) or possibly threatened (<10%), highlighting that automating assessments with ConR is a viable strategy for preliminary conservation assessments of large plant groups. Results for crop wild relatives are similar to those for the entire data set. Increasing digitization and collection needs to be a high priority. Available rapid assessment tools can then be used to identify species that warrant more comprehensive investigation. Societal Impact Statement The current rate of global biodiversity loss creates a pressing need to increase efficiency and throughput of extinction risk assessments in plants. We must assess as many plant species as possible, working with imperfect knowledge, to address the habitat loss and seemingly countless extinction threats of the Anthropocene. Large-scale, preliminary conservation assessments can play a fundamental role in setting priorities for more in-depth investigation.