Massive colonization of protein-coding exons by selfish genetic elements in Paramecium germline genomes

2020 
Ciliates are unicellular eukaryotes with both a germline genome and a somatic genome in the same cytoplasm. The macronucleus (MAC), responsible for gene expression, is not sexually transmitted but develops from a copy of the micronucleus (MIC) at each sexual generation. In the MIC genome of Paramecium tetraurelia, genes are interrupted by tens of thousands of unique intervening sequences, called Internal Eliminated Sequences (IESs), that have to be precisely excised during the development of the new MAC to restore functional genes. To understand the evolutionary origin of this peculiar genomic architecture, we sequenced the MIC genomes of nine Paramecium species (from ~100 Mb in P. aurelia species to > 1.5 Gb in P. caudatum). We detected several waves of IES gains, both in ancestral and in more recent lineages. Remarkably, we identified 24 families of mobile IESs that generated tens to thousands of new copies. The most active families show the signature of horizontal transfer. These examples illustrate how mobile elements can account for the massive proliferation of IESs in the germline genomes of Paramecium, both in non-coding regions and within exons. We also provide evidence that IESs represent a substantial burden for their host, presumably because of excision errors. Interestingly, we observe that IES excision pathways vary according to the age of IESs, and that older IESs tend to be more efficiently excised. This suggests that once fixed in the genome, the presence of IESs imposes a selective pressure on their host, both in cis (on the excision signals of each IES) and in trans (on the cellular excision machinery), to ensure efficient and precise removal. Finally, we identified 69 IESs that are under strong purifying selection across the P. aurelia clade, which indicates that a small fraction of IESs provide a function beneficial for their host. 
All these features highlight the major role played by selfish elements in shaping the complexity of gene expression processes and in driving genome architecture.