Fuzzy heaping mechanism for heaped count data with imprecision

2018 
In genetic association studies, the traits of interest are sometimes collected as self-reported data. Because subjects may give either exact or rounded responses, the histogram of the data frequently exhibits spikes at particular values. This phenomenon, known as heaping, can cause difficulties when performing association tests with standard modeling approaches. Several models have recently been proposed to recover the true, unobservable underlying distribution from heaped data. However, all of these methods depend on probabilistic assumptions about the heaping mechanism. Probabilistic models cannot represent heaped data effectively, because heaping can be caused by imprecisely reported values, and this type of imprecision differs from the probabilistic uncertainty that a probabilistic model describes well. In this paper, we propose a fuzzy heaping model to identify genetic variants associated with heaped count data. Our fuzzy model uses a mixture of likelihood functions for precisely and imprecisely reported data, treating heaped data as imprecise data represented by fuzzy sets. Moreover, since reported count data may include excess zeros as well as heaped values, we extend our fuzzy heaping model to handle excess zeros. Through simulation studies, we show that the proposed fuzzy heaping model controls type I error effectively and has high power to identify causal variants. We illustrate the proposed model with a study identifying genetic variants associated with the number of cigarettes smoked per day.
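The idea of mixing likelihood contributions for precisely and imprecisely reported counts can be illustrated with a minimal sketch. This is not the paper's actual model: the Poisson baseline, the triangular membership function, the heaping points at multiples of 10, the rule that non-heap values are always treated as precise, and all names (`poisson_pmf`, `triangular_membership`, `fuzzy_event_prob`, `log_likelihood`, `pi_heap`, `heap_step`, `spread`) are assumptions chosen for illustration. The probability of a fuzzy event is taken as the membership-weighted sum of the pmf, and the excess-zero extension is omitted.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass at count k (assumed baseline distribution)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def triangular_membership(y, center, spread):
    """Degree (0..1) to which a true count y belongs to the fuzzy set 'about center'."""
    return max(0.0, 1.0 - abs(y - center) / spread)


def fuzzy_event_prob(center, lam, spread):
    """Probability of the fuzzy event 'about center': membership-weighted sum of the pmf."""
    lo = max(0, center - spread)
    hi = center + spread
    return sum(
        triangular_membership(y, center, spread) * poisson_pmf(y, lam)
        for y in range(lo, hi + 1)
    )


def log_likelihood(counts, lam, pi_heap, heap_step=10, spread=5):
    """Illustrative log-likelihood mixing precise and imprecise (heaped) reports.

    Assumption: only positive multiples of heap_step can be imprecise reports;
    every other value is treated as precisely reported.
    """
    total = 0.0
    for y in counts:
        exact = poisson_pmf(y, lam)
        if y > 0 and y % heap_step == 0:
            # Heap point: mixture of a precise report and a fuzzy "about y" report.
            fuzzy = fuzzy_event_prob(y, lam, spread)
            total += math.log((1.0 - pi_heap) * exact + pi_heap * fuzzy)
        else:
            # Non-heap value: treated as a precise report.
            total += math.log(exact)
    return total


if __name__ == "__main__":
    # Hypothetical cigarettes-per-day reports with spikes at 10 and 20.
    reports = [3, 10, 10, 12, 20, 20, 20, 7, 15, 10]
    print(log_likelihood(reports, lam=12.0, pi_heap=0.4))
```

In a genetic association setting, `lam` would typically depend on genotype through a regression model, and the resulting likelihood would be maximized over the regression coefficients and `pi_heap`; that step is left out here.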