Process mining: on the balance between underfitting and overfitting

2008 
Process mining techniques attempt to extract non-trivial and useful information from event logs. One aspect of process mining is control-flow discovery, i.e., automatically constructing a process model (e.g., a Petri net) describing the causal dependencies between activities. One of the essential problems in process mining is that one cannot assume to have seen all possible behavior. At best, one has seen a representative subset. Therefore, classical synthesis techniques are not suitable as they aim at finding a model that is able to exactly reproduce the log. Existing process mining techniques try to avoid such "overfitting" by generalizing the model to allow for more behavior. This generalization is often driven by the representation language and very crude assumptions about completeness. As a result, parts of the model are "overfitting" (allow only what has actually been observed) while other parts may be "underfitting" (allow for much more behavior without strong support for it). This talk will present the main challenges posed by real-life applications of process mining and show that it is possible to balance between overfitting and underfitting in a controlled manner.