\( LogRank^+ \): A Novel Approach to Support Business Process Event Log Sampling

2020 
Modern information systems collect and store massive amounts of business process event logs. Over the past decades, numerous process discovery approaches have been proposed to extract descriptive process models from such event logs. To improve process discovery efficiency, event log sampling techniques have been proposed. A sample log is a carefully selected subset of the original log that requires less computational cost to process. However, existing sampling techniques struggle with large-scale event logs, for example because of low efficiency. To tackle this challenge, we propose a novel ranking-based event log sampling approach, denoted as \( LogRank^+ \), to support efficient sampling. In addition, we introduce a framework to evaluate the effectiveness of different sampling techniques by quantifying the sampling efficiency and the quality of sample logs. The proposed sampling approach has been implemented in the open-source process mining toolkit ProM. Experimental evaluation with both synthetic and real-life event logs demonstrates that the proposed approach improves event log sampling efficiency while ensuring high quality of the obtained sample logs from a process discovery perspective.