Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome with a scaffold N50 of 88.8 kb that has a high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known genome rearrangements and identified one novel rearrangement. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.
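The scaffold N50 quoted above is the length of the scaffold at which a running total, taken from the longest scaffold downwards, first reaches half of the total assembly length. The following is a minimal sketch of that standard definition, not the authors' pipeline; the function name `n50` and the toy scaffold lengths are illustrative assumptions and do not come from the wheat assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Illustrative helper: sort lengths from longest to shortest and return
    the length at which the running total first reaches half of the sum.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths, invented for illustration only (not real assembly data).
toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_scaffolds))
```

Because N50 depends only on the length distribution, it measures contiguity rather than correctness, which is why fidelity to the input data is reported alongside it.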
The grass species Brachypodium distachyon (hereafter, Brachypodium) has been adopted as a model system for grasses. Here, we describe the development of a genetic linkage map of Brachypodium. The genetic linkage map was developed with an F2 population from a cross between the diploid Brachypodium lines Bd3-1 and Bd21. The map was populated with polymorphic simple sequence repeat (SSR) markers from Brachypodium expressed sequence tag (EST) and bacterial artificial chromosome (BAC) end sequences and conserved orthologous sequence (COS) markers from other grass species. The map is 1386 cM in length and consists of 139 marker loci distributed across 20 linkage groups. Five of the linkage groups exceed 100 cM in length, with the largest being 231 cM long. Assessment of colinearity between the Brachypodium linkage map and the rice genome sequence revealed significant regions of macrosynteny between the two genomes, as well as rearrangements similar to those reported in other grass comparative structural genomics studies. The Brachypodium genetic linkage map described here will serve as a new tool to pursue a range of molecular genetic analyses and other applications in this new model plant system.
Bread wheat (Triticum aestivum) is a globally important crop, accounting for 20 per cent of the calories consumed by humans. Major efforts are underway worldwide to increase wheat production by extending genetic diversity and analysing key traits, and genomic resources can accelerate progress. But so far the very large size and polyploid complexity of the bread wheat genome have been substantial barriers to genome analysis. Here we report the sequencing of its large, 17-gigabase-pair, hexaploid genome using 454 pyrosequencing, and comparison of this with the sequences of diploid ancestral and progenitor genomes. We identified between 94,000 and 96,000 genes, and assigned two-thirds to the three component genomes (A, B and D) of hexaploid wheat. High-resolution synteny maps identified many small disruptions to conserved gene order. We show that the hexaploid genome is highly dynamic, with significant loss of gene family members on polyploidization and domestication, and an abundance of gene fragments. Several classes of genes involved in energy harvesting, metabolism and growth are among expanded gene families that could be associated with crop productivity. Our analyses, coupled with the identification of extensive genetic variation, provide a resource for accelerating gene discovery and improving this major crop. With a global output of 681 million tonnes in 2011
Abstract Background The same species of plant can exhibit highly diverse sizes and shapes of organs that are genetically determined. Defining genetic variation underlying this morphological diversity is an important objective in evolutionary studies and it also helps identify the functions of genes influencing plant growth and development. Extensive screens of mutagenised Arabidopsis populations have identified multiple genes and mechanisms affecting organ size and shape, but relatively few studies have exploited the rich diversity of natural populations to identify genes involved in growth control. Results We screened a relatively well characterised collection of Arabidopsis thaliana ecotypes for variation in petal size. Association analyses identified sequence and gene expression variation on chromosome 4 that made a substantial contribution to differences in petal area. Variation in expression of At4g16850 (named KSK), encoding a hypothetical protein, had a substantial role in variation in organ size by influencing cell size. Over-expression of KSK led to larger petals with larger cells and promoted the formation of stamenoid features. The expression of auxin-responsive genes known to limit cell growth was reduced in response to KSK over-expression. ANT expression was also reduced in KSK over-expression lines, consistent with altered floral identities. Auxin availability was reduced in KSK over-expressing cells, consistent with changes in auxin-responsive gene expression. KSK may therefore influence auxin availability during petal development. Conclusions Understanding how genetic variation influences plant growth is important for both evolutionary and mechanistic studies. We used natural populations of Arabidopsis thaliana to identify sequence variation in a promoter region of Arabidopsis ecotypes that mediated differences in the expression of a previously uncharacterised membrane protein. This variation contributed to altered auxin availability and cell size during petal growth.
Abstract Background The accurate sequencing and assembly of very large, often polyploid, genomes remain a challenging task, limiting long-range sequence information and phased sequence variation for applications such as plant breeding. The 15 Gb hexaploid bread wheat genome has been particularly challenging to sequence, and several contending approaches recently generated accurate long-range assemblies. Understanding errors in these assemblies is important for optimising future sequencing and assembly approaches and for comparative genomics. Results Here we use a Fosill 38 kb jumping library to assess medium- and longer-range order of different publicly available wheat genome assemblies. Modifications to the Fosill protocol generated longer Illumina sequences and enabled comprehensive genome coverage. Analyses of two independent BAC-based chromosome-scale assemblies, two independent Illumina whole-genome shotgun assemblies, and a hybrid long-read (PacBio) and short-read (Illumina) assembly were carried out. We revealed a variety of discrepancies using Fosill mate-pair mapping and validated several of each class. In addition, Fosill mate-pairs were used to scaffold a whole-genome Illumina assembly, leading to a three-fold increase in N50 values. Conclusions Our analyses, using an independent means to validate different wheat genome assemblies, show that whole-genome shotgun assemblies are significantly more accurate by all measures than BAC-based chromosome-scale assemblies. Although current whole-genome assemblies are reasonably accurate and useful, additional steps will be needed for the rapid, cost-effective and complete sequencing and assembly of wheat genomes.
Abstract White blister rust, caused by the oomycete Albugo candida, is a widespread disease of Brassica crops. The Arabidopsis CSA1/DAR4 (also known as CSA1/CHS3) paired immune receptor carries an Integrated Domain (ID) with homology to the DA1 family of peptidases. Using domain swaps with DA1 family members, we show that the DAR4 ID acts as an integrated decoy for DAR3, which interacts with and inhibits the peptidase activities of DA1, DAR1 and DAR2 family members. Albugo infection rapidly lowered DAR3 levels and activated DA1 peptidase activity. This promotes endoreduplication of host tissues to support pathogen growth. We propose that DAR4/CSA1 senses the actions of a putative Albugo effector that reduces DAR3 levels and initiates defense.