Ranking Association Rules from Data Mining for Health Outcomes: A Case Study of Effect of Industrial Airborne Pollutant Mixtures on Birth Outcomes

2021 
Association rule mining can be a powerful computational tool for exploring complex interactions between high-dimensional exposures and health outcomes. Because the data are high-dimensional, the mining step may return a large number of complex association rules. To narrow these down to the rules most useful for hypothesis generation and future investigation in health research, we need an objective approach to reducing the ruleset. Rules are often ranked by lift, a widely used measure of association strength in data mining. In this paper, we show why lift-based ranking is undesirable from a population health perspective, and we propose a new approach to selecting rules obtained from association rule mining. The new approach considers both the association strength, measured by relative risk, and the excess health burden in the target population. In a case study of rules mined from industrial airborne pollutant mixtures and birth outcomes, we compare the rules selected by our proposed approach with those selected by lift.
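To make the contrast between lift and a burden-aware ranking concrete, the following is a minimal sketch, not the paper's implementation. It computes lift and relative risk for a rule of the form "exposure pattern -> adverse outcome" from a 2x2 table. The `RuleCounts` layout, the function names, and the counts are hypothetical and chosen only for illustration. The `excess_cases` formula (exposed count times the risk difference) is one common way to quantify excess burden and is an assumption here, since the abstract does not state the exact definition used.

```python
from dataclasses import dataclass


@dataclass
class RuleCounts:
    """2x2 counts for one rule (hypothetical layout, not from the paper)."""
    exposed_cases: int
    exposed_noncases: int
    unexposed_cases: int
    unexposed_noncases: int

    @property
    def exposed(self) -> int:
        return self.exposed_cases + self.exposed_noncases

    @property
    def unexposed(self) -> int:
        return self.unexposed_cases + self.unexposed_noncases


def lift(c: RuleCounts) -> float:
    """Rule confidence divided by the overall outcome prevalence."""
    total = c.exposed + c.unexposed
    confidence = c.exposed_cases / c.exposed
    prevalence = (c.exposed_cases + c.unexposed_cases) / total
    return confidence / prevalence


def relative_risk(c: RuleCounts) -> float:
    """Risk among the exposed divided by risk among the unexposed."""
    return (c.exposed_cases / c.exposed) / (c.unexposed_cases / c.unexposed)


def excess_cases(c: RuleCounts) -> float:
    """Assumed burden measure: exposed count times the risk difference."""
    risk_diff = c.exposed_cases / c.exposed - c.unexposed_cases / c.unexposed
    return c.exposed * risk_diff


# Made-up counts: rule A is rare but strong, rule B is common but weaker.
rule_a = RuleCounts(20, 80, 5000, 95000)
rule_b = RuleCounts(3000, 27000, 3500, 66500)

for name, rule in (("A", rule_a), ("B", rule_b)):
    print(name, round(lift(rule), 2), round(relative_risk(rule), 2),
          round(excess_cases(rule)))
```

With these illustrative counts, lift places rule A first (about 3.99 against 1.54), yet rule B accounts for roughly 1,500 excess cases against about 15 for rule A. A ranking that also weighs excess burden would therefore not simply follow the lift ordering.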