Using transcriptome sequencing and pooled exome capture to study local adaptation in the giga‐genome of Pinus cembra

2019 
Despite decreasing sequencing costs, whole-genome sequencing for population-based genome scans for selection is still prohibitively expensive for organisms with large genomes. Moreover, the repetitive nature of large genomes often represents a challenge in bioinformatic and downstream analyses. Here we use in-depth transcriptome sequencing to design probes for exome capture in Swiss stone pine (Pinus cembra), a conifer with an estimated genome size of 29.3 Gbp and no reference genome available. We successfully applied around 55,000 self-designed probes, targeting 25,000 contigs, to DNA pools of seven populations from the Swiss Alps and identified >140,000 SNPs in around 13,000 contigs. The probes performed equally well in pools of the closely related species Pinus sibirica; in both species, more than 70% of the targeted contigs were sequenced at a depth ≥ 40x, i.e. the number of haplotypes in the pool. However, a thorough analysis of individually sequenced P. cembra samples indicated that a majority of the contigs (63%) represented multi-copy genes. We therefore removed paralogous contigs based on heterozygote excess and deviation from allele balance. Without putatively paralogous contigs, allele frequencies of population pools represented accurate estimates of individually determined allele frequencies. Using population genetic and landscape genomic methods, we show that inferences of neutral and adaptive genetic variation may be biased when not accounting for such multi-copy genes. Future studies should therefore put more emphasis on identifying paralogous loci, which will be facilitated by the establishment of additional high-quality reference genomes.
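
The paralog filter described above (heterozygote excess plus deviation from allele balance) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Genotype` record, the function names, and both thresholds (`max_het_excess`, `max_balance_dev`) are hypothetical choices made for illustration. The idea is that reads from two gene copies collapsing onto one contig tend to produce too many apparent heterozygotes relative to Hardy-Weinberg expectations, and read ratios in those heterozygotes that drift away from 0.5.

```python
from dataclasses import dataclass


@dataclass
class Genotype:
    """One individual's call at one SNP (hypothetical record layout)."""
    ref_reads: int
    alt_reads: int
    call: str  # one of "hom_ref", "het", "hom_alt"


def heterozygote_excess(genotypes):
    """Observed minus Hardy-Weinberg expected heterozygosity at one SNP."""
    n = len(genotypes)
    if n == 0:
        raise ValueError("need at least one genotype")
    n_het = sum(g.call == "het" for g in genotypes)
    n_hom_alt = sum(g.call == "hom_alt" for g in genotypes)
    p_alt = (2 * n_hom_alt + n_het) / (2 * n)  # alternate allele frequency
    h_obs = n_het / n
    h_exp = 2 * p_alt * (1 - p_alt)
    return h_obs - h_exp


def mean_allele_balance(genotypes):
    """Mean alt-read fraction over heterozygous calls; None if there are none."""
    ratios = [
        g.alt_reads / (g.ref_reads + g.alt_reads)
        for g in genotypes
        if g.call == "het" and (g.ref_reads + g.alt_reads) > 0
    ]
    return sum(ratios) / len(ratios) if ratios else None


def is_putative_paralog(genotypes, max_het_excess=0.2, max_balance_dev=0.15):
    """Flag a SNP if either criterion is violated (thresholds are assumptions)."""
    excess = heterozygote_excess(genotypes)
    balance = mean_allele_balance(genotypes)
    unbalanced = balance is not None and abs(balance - 0.5) > max_balance_dev
    return excess > max_het_excess or unbalanced
```

In practice a contig would be discarded if any, or some chosen fraction, of its SNPs were flagged; that aggregation rule is likewise an assumption and would need to be set for the data at hand.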