Analysis of epidemiologic study data when there is geolocation uncertainty

2020 
Abstract Geolocation uncertainty is common in epidemiological studies that depend on addresses to determine exposure. We developed a spatial construct and statistical framework with which to characterize geolocation uncertainty, develop analysis methods, and compare those methods. Exposure is represented by a three-dimensional step function over the partitioned spatial surface, and a person’s geolocation boundary is defined as the union of partition elements that cover the possible index location. Disease rates are defined by exposures from the partitioned surface. Standard process theory was used for analytic results, and an empirical computer simulation was used to compare methods. A case-control study of pesticide exposure and childhood cancer was used to illustrate the problem. Pesticide exposure was derived from geolocations determined from birth residences, where about half the reference addresses resolved to ZIP Codes while the rest resolved to smaller areas. We found that the centroid method has much worse power (0.35) to detect pesticide-disease effects than either the whole-area proportion exposed (0.52) or an induced intensity approach (0.79). The latter approach properly accounted for the geolocation uncertainty even when only ZIP Code address information was used. Analyses based on ZIP Code address data had twice the variance of those using the actual geolocation boundaries, which in turn had twice the variance of analyses with no geolocation uncertainty. Our area-based analytic approach confirms that geolocation ambiguity should be considered in exposure-disease investigations that rely on address data for determining exposure.
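
To make the contrast between the centroid method and the whole-area proportion exposed concrete, the following is a minimal sketch, not the authors' implementation. It assumes a binary (exposed or unexposed) step function, and the `Cell` type, the function names, and the toy boundary are all hypothetical. The centroid method is simplified here to picking the partition element whose center lies nearest the area-weighted centroid of the geolocation boundary. The induced intensity approach is not sketched, since the abstract does not describe it in enough detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One partition element (hypothetical representation)."""
    x: float        # x coordinate of the cell center
    y: float        # y coordinate of the cell center
    area: float     # cell area
    exposed: bool   # value of the exposure step function on this cell


def proportion_exposed(boundary):
    """Area-weighted share of the geolocation boundary that is exposed."""
    total = sum(c.area for c in boundary)
    return sum(c.area for c in boundary if c.exposed) / total


def centroid_exposure(boundary):
    """Exposure of the cell nearest the boundary's area-weighted centroid.

    Assumption: nearest cell center stands in for "cell containing the
    centroid", which is adequate for this regular toy layout only.
    """
    total = sum(c.area for c in boundary)
    cx = sum(c.x * c.area for c in boundary) / total
    cy = sum(c.y * c.area for c in boundary) / total
    nearest = min(boundary, key=lambda c: (c.x - cx) ** 2 + (c.y - cy) ** 2)
    return 1.0 if nearest.exposed else 0.0


# Toy geolocation boundary: three unit cells in a row, only the last exposed.
boundary = [
    Cell(0.5, 0.5, 1.0, False),
    Cell(1.5, 0.5, 1.0, False),
    Cell(2.5, 0.5, 1.0, True),
]

print(centroid_exposure(boundary))   # all-or-nothing assignment
print(proportion_exposed(boundary))  # fractional assignment
```

In this toy case the centroid falls in an unexposed cell, so the centroid method assigns no exposure, while the proportion-exposed method records that one third of the possible locations are exposed. That loss of information is one plausible reading of why the centroid method shows lower power in the abstract.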