How low can you go: sex identification from sequencing data of species lacking assembled sex chromosomes

2021 
Accurate sex identification is crucial for elucidating the biology of a species. Here, we present SeXY, a sex-identification pipeline for very low-coverage shotgun sequencing data from a single individual. The method does not require a conspecific sex-chromosome assembly as reference. SeXY was specifically designed to utilise low-effort screening data for sexing, but can also be applied to samples of higher-effort sequencing. We assess how the accuracy of our pipeline depends on data quantity by downsampling sequencing data from 100,000 to 1,000 mapped reads, and on the choice of reference by mapping to reference genomes of varying quality and phylogenetic distance. We show that when mapping to a high-quality (highly contiguous, N50 > 30 Mb in our case, or chromosome-level) conspecific genome, our method is 100% accurate even down to 1,000 mapped reads. For lower-quality reference assemblies (N50