Full-length Genome of the Ogataea polymorpha strain HU-11 reveals large duplicated segments in subtelomeric regions

2021 
Background: Currently, methylotrophic yeasts (e.g., Pichia pastoris, Hansenula polymorpha, and Candida boidinii) are subjects of intense genomics studies in basic research and industrial applications. In the genus Ogataea, most research is focused on three basic O. polymorpha strains: CBS4732, NCYC495, and DL-1. However, these three strains are of independent origin, and their relationship is unclear. HU-11, a high-yield engineered O. polymorpha strain, can be regarded as identical to CBS4732, because the only difference between them is a 5-bp insertion.

Results: In the present study, we assembled the full-length genome of O. polymorpha HU-11 using high-depth PacBio and Illumina data. Long terminal repeat (LTR) retrotransposons, rDNA, 5′ and 3′ telomeric, subtelomeric, low-complexity, and other repeat regions were curated to improve the genome quality. We took advantage of the full-length HU-11 genome sequence for genome annotation and comparison. In particular, we determined the exact locations of the rDNA genes and LTR retrotransposons in the seven chromosomes and detected large duplicated segments in the subtelomeric regions. Three novel findings are: (1) O. polymorpha NCYC495 is so phylogenetically similar to HU-11 that nearly 100% of their genomes is covered by their syntenic regions, while NCYC495 is significantly distinct from DL-1; (2) large segment duplication in subtelomeric regions is the main reason for genome expansion in yeasts; and (3) the duplicated segments in subtelomeric regions may be integrated at telomeric tandem repeats (TRs) through a molecular mechanism, which can be used to develop a simple and highly efficient genome editing system to integrate or cleave large segments at telomeric TRs.

Conclusions: Our findings provide new opportunities for an in-depth understanding of genome evolution in methylotrophic yeasts and lay the foundations for the industrial applications of O. polymorpha CBS4732 and HU-11. The full-length genome of the O. polymorpha strain HU-11 should be included in the NCBI RefSeq database for future studies of O. polymorpha CBS4732 and its derivatives LR9, RB11, and HU-11.