arcasHLA: high-resolution HLA typing from RNAseq

2018 
The human leukocyte antigen (HLA) locus makes up the major histocompatibility complex (MHC) and plays a critical role in host response to disease, including cancers and autoimmune disorders. In the clinical setting, HLA typing is necessary for determining tissue compatibility. Recent improvements in the quality and accessibility of next-generation sequencing have made HLA typing from standard short-read data practical. However, this task remains challenging given the high level of polymorphism and homology between the HLA genes. HLA typing from RNA sequencing is further complicated by post-transcriptional splicing and bias due to amplification. Here, we present arcasHLA: a fast and accurate in silico tool that infers HLA genotypes from RNA sequencing data. Our tool outperforms established tools on the gold-standard benchmark dataset for HLA typing in terms of both accuracy and speed, with an accuracy rate of 100% at two-field precision for MHC class I genes, and over 99.7% for MHC class II. Importantly, arcasHLA takes pre-aligned BAM files as its input and outputs genotypes at three-field resolution for all HLA genes in less than 2 minutes. Finally, we evaluate the performance of our tool on a new biological dataset of 447 single-end total RNA samples from nasopharyngeal swabs, and establish the applicability of arcasHLA in metatranscriptome studies. arcasHLA is available at https://github.com/RabadanLab/arcasHLA.
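
To make "two-field" and "three-field" resolution concrete, the following is a minimal Python sketch of how allele names could be trimmed to a given number of fields and how allele-level accuracy could then be scored. It is not taken from arcasHLA: the function names `truncate_allele` and `allele_accuracy`, the dictionary layout, and the example alleles are all illustrative assumptions, and the scoring rule shown (multiset overlap per gene) may differ from the one used in the paper's benchmark.

```python
from collections import Counter


def truncate_allele(allele: str, fields: int = 2) -> str:
    """Trim an HLA allele name (e.g. 'A*02:01:01:01') to its first `fields` fields.

    Hypothetical helper for illustration; not part of arcasHLA.
    """
    gene, _, rest = allele.partition("*")
    kept = rest.split(":")[:fields]
    return f"{gene}*{':'.join(kept)}"


def allele_accuracy(predicted: dict, truth: dict, fields: int = 2) -> float:
    """Fraction of true alleles recovered at the requested field resolution.

    `predicted` and `truth` map a gene name to a list of its called alleles
    (assumed layout). Alleles are compared as multisets per gene, so the order
    of the two alleles does not matter and homozygous calls are counted twice.
    """
    correct = 0
    total = 0
    for gene, true_alleles in truth.items():
        pred = Counter(truncate_allele(a, fields) for a in predicted.get(gene, []))
        true = Counter(truncate_allele(a, fields) for a in true_alleles)
        correct += sum((pred & true).values())  # multiset intersection
        total += sum(true.values())
    return correct / total if total else 0.0


# Made-up example: the first allele differs only in its third field.
predicted = {"A": ["A*02:01:01", "A*03:01:01"]}
truth = {"A": ["A*02:01:05", "A*03:01:01"]}

two_field = allele_accuracy(predicted, truth, fields=2)
three_field = allele_accuracy(predicted, truth, fields=3)
```

In this made-up example both alleles agree once trimmed to two fields, while only one of the two agrees at three fields, which is why accuracy figures are always quoted together with the resolution at which they were measured.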