Maximal Segmental Score Method for Localizing Recessive Disease Variants Based on Sequence Data

2020 
Background: Because whole-genome sequencing has become affordable, genetic association designs can now address rare diseases. However, some common statistical association methods consider only homozygosity mapping and require several predefined criteria, such as a sliding window of a given size and a statistical significance threshold (e.g., p-value < 0.05), to achieve good power in detecting rare disease associations.
Method: Our region-specific method, called expanded Maximal Segmental Score (eMSS), converts p-values into continuous scores based on the Maximal Segmental Score (MSS) (Lin et al., 2014) to detect disease-associated segments. eMSS considers the whole-genome sequence data, not only regions of homozygosity in candidate genes. Unlike sliding-window methods with a fixed window size, eMSS does not need predetermined parameters such as the window size or the minimum or maximum number of SNPs in a segment. The performance of eMSS was evaluated by simulations and by real data analysis of two autosomal recessive diseases, Multiple Intestinal Atresia (MIA) and Osteogenesis Imperfecta (OI), for which the number of cases is extremely small. For the real data, the eMSS results were compared with those of a state-of-the-art method, HDR-del (Imai et al., 2016).
Results: Our simulation results show that eMSS had higher power as the number of non-causal haplotype blocks decreased. The type I error of eMSS was well controlled under different scenarios (p < 0.05). For our observed data, the bone morphogenetic protein 1 (BMP1) gene on chromosome 8 and the vitamin D receptor (VDR) gene on chromosome 12, both associated with OI, and the tetratricopeptide repeat domain 7A (TTC7A) gene on chromosome 2, associated with MIA, have previously been identified as harboring the relevant pathogenic mutations.
Conclusions: Compared with HDR-del, eMSS is powerful in analyzing even small numbers of recessive cases, and the results show that the method can further reduce the number of candidate variants to a very small set of susceptibility pathogenic variants underlying OI and MIA. In whole-genome sequence analysis, eMSS took 3/5 of the computation time of HDR-del. Because no additional parameters need to be set for segment detection, the computational burden of eMSS is lower than that of other region-specific approaches.
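To illustrate the general idea of turning per-variant p-values into scores and then locating a contiguous high-scoring segment without a fixed window size, the following is a minimal sketch in Python. It is not the authors' eMSS implementation: the score transform (`-log10(p)` minus a hypothetical `baseline`) and the function names are assumptions made only for illustration, and the segment search is a standard linear-time maximum-sum scan.

```python
import math


def p_values_to_scores(p_values, baseline=1.0):
    """Hypothetical transform (an assumption, not the eMSS definition):
    -log10(p) shifted by a baseline so weak signals score negative."""
    return [-math.log10(p) - baseline for p in p_values]


def maximal_scoring_segment(scores):
    """Return (best_score, start, end) for the contiguous run of scores
    with the largest sum; start and end are inclusive 0-based indices."""
    if not scores:
        raise ValueError("scores must not be empty")
    best_score, best_start, best_end = scores[0], 0, 0
    run_score, run_start = 0.0, 0
    for i, s in enumerate(scores):
        if run_score <= 0:
            # A non-positive running sum cannot help; restart the run here.
            run_score, run_start = s, i
        else:
            run_score += s
        if run_score > best_score:
            best_score, best_start, best_end = run_score, run_start, i
    return best_score, best_start, best_end


if __name__ == "__main__":
    # Made-up p-values ordered by genomic position, for demonstration only.
    p_values = [0.5, 0.8, 0.001, 0.01, 0.0001, 0.6, 0.9]
    scores = p_values_to_scores(p_values)
    print(maximal_scoring_segment(scores))
```

In this sketch the segment boundaries emerge from the scores themselves, so no window size or minimum or maximum number of SNPs has to be chosen in advance; only the illustrative `baseline` shift is set by hand.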