Online Techniques For Dealing With Concept Drift in Process Mining
1 Introduction
Process Mining is a relatively novel discipline which has received a lot of atten-
tion in the last decade [16]. Although it shares many features with Data Mining,
it has originated from different concerns and communities, has a set of distinctive
techniques, and produces slightly different outcomes. Historically, process min-
ing arises from the observation that many organizations record their activities
into logs which describe, among others, the real ordering of activities of a given
process, in a particular implementation. Software engineering techniques have
mainly focused on the specification part of the processes within an information
system. In reality, this may cause a big gap between a system's specification and
its final implementation, hampering the use of the models that specify the main
processes of an information system. As another example, designers of hardware
or embedded concurrent systems need to compare behavior and specifications;
typically they can passively or actively generate large amounts of logs from their
target system and/or their prototypes, so a logical approach is to use these logs
for the verification task.
By using the logs as a source of information, process mining techniques are
meant to discover, analyze, and enhance formal process models of an infor-
mation system [17]. Process discovery is probably the main and most chal-
lenging discipline within process mining: to discover a formal process model
(a Petri net [15], an automaton, etc.) that adequately represents the traces
in the log. Several process discovery algorithms exist in the literature (the
reader can find a good summary in [17]). In this paper, we concentrate on
the control-flow part, i.e., the causal relations between the events of a process.
Let us use an example to illustrate control-flow discovery. The example shown in
Figure 1 is taken from [18] and considers the process of handling customer orders.

 1  r,s,sb,p,ac,ap,c          8  r,em,s,sb,p,ac,ap,c
 2  r,sb,em,p,ac,ap,c         9  r,sb,em,s,p,ac,ap,c
 3  r,sb,p,em,ac,rj,rs,c     10  r,sb,em,s,p,ac,rj,rs,c
 4  r,em,sb,p,ac,ap,c        11  r,em,sb,p,s,ac,ap,c
 5  r,sb,s,p,ac,rj,rs,c      12  r,em,sb,s,p,ac,rj,rs,c
 6  r,sb,p,s,ac,ap,c         13  r,em,sb,p,s,ac,ap,c
 7  r,sb,p,em,ac,ap,c        14  r,sb,p,em,s,ac,ap,c
(a)
[Petri net diagram over the transitions r, sb, p, em, s, ac, ap, rj, rs, c]
(b)
Fig. 1. Control-flow process discovery: (a) log containing a drift from trace 8 on, (b)
Petri net discovered from part of the log (traces 1 to 7).

In the example, the log contains the following activities: r=register, s=ship,
sb=send bill, p=payment, ac=accounting, ap=approved, c=close, em=express mail,
rj=rejected, and rs=resolve. The goal of process discovery is to obtain a formal
model such as the Petri net shown in Figure 1(b).
The problem of concept drift is well known and well studied in the data
mining and machine learning communities, but hardly addressed so far in process
mining, where it has important particularities. Three main problems regarding
concept drift can be identified in the context of process mining [5]:
1. Change Detection: detect when a process change happens. This is the most
fundamental problem to solve.
2. Change Localization and Characterization: characterize the nature of one
particular change, and identify the region(s) of change in a process.
3. Unraveling Process Evolution: discover the evolution of process change over-
all, and how change affects the model over time.
2 Background
Given a set of activities T, an event log over T is a multiset L : T* → ℕ. A
sequence σ ∈ T* is called a trace. A trace σ is contained in a log if L(σ) ≥ 1.
Given a trace σ = t1, t2, ..., tn, and a natural number 1 ≤ k ≤ n, the sequence
t1, t2, ..., tk is called the prefix of length k in σ. Given a log L, we denote by
Pref(L) the set of all prefixes of traces in L. Finally, #(σ, e) is the number of
times that activity e occurs in sequence σ.
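The definitions above can be made concrete with a small sketch. It assumes a log is stored as a Python Counter from traces (tuples of activity names) to multiplicities; the helper names make_log, prefixes and count are hypothetical and only mirror L, Pref(L) and #(σ, e).

```python
from collections import Counter


def make_log(traces):
    """Build an event log as a multiset: trace (tuple of activities) -> multiplicity."""
    return Counter(tuple(t) for t in traces)


def prefixes(log):
    """Pref(L): every prefix of length 1..n of every trace contained in the log."""
    return {trace[:k] for trace in log for k in range(1, len(trace) + 1)}


def count(sigma, e):
    """#(sigma, e): number of times activity e occurs in sequence sigma."""
    return sum(1 for a in sigma if a == e)


# Three traces over the activities of the running example; the first one occurs twice.
log = make_log([
    ["r", "s", "sb", "p", "ac", "ap", "c"],
    ["r", "sb", "em", "p", "ac", "ap", "c"],
    ["r", "s", "sb", "p", "ac", "ap", "c"],
])
```

Storing traces as tuples keeps them hashable, so multiplicities and prefix sets come directly from the standard library.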
This section describes the technique for concept drift detection in process mining.
It first learns a process model using abstract interpretation (Section 3.1), which
will be the main actor for the concept drift detection method in Section 3.2.
We now introduce the element to link traces from a log and abstract interpreta-
tion, which was initially presented in [7]:
This way, W will estimate the mass of the current polyhedron, denoted µ̂. The
figure next illustrates the approach: a polyhe-
[Figure: a stream of points σ̂_1 ... σ̂_m before the drift, followed by σ̂_{m+1} ... after the drift.]
Using the rigorous guarantees of ADWIN given in [1], one can give a rigorous
statement on the ability of the method above to detect drift in its input. The
proof can be found in the extended version.
Theorem 1. Suppose that for a sufficiently long time T0 ... T1 the input
distribution has remained stable, and that the learn phase has built a polyhedron
P̂, with mass µ1. Suppose that by time T2 (> T1) the input distribution has
changed so that the mass of P̂ is µ2 from then on. Then by time at most
T2 + O(ln(T2 − T0)/(µ2 − µ1)^2) the method will have detected change and restarted.
Note that the case of abrupt change is when T2 = T1 + 1, and that larger changes
in the mass of the current polyhedron imply shorter reaction times.
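To make the role of the adaptive window concrete, the following is a minimal sketch of a naive adaptive window fed with 0/1 bits that indicate whether each incoming point falls inside the learned polyhedron. It is not the optimized ADWIN of [1]: it stores the whole window and rescans every split on each insertion. The class and function names, and the value of delta, are assumptions made for illustration. The cut threshold follows the form used for ADWIN, with m the harmonic mean of the two sub-window lengths and δ' = δ/n.

```python
import math
from collections import deque


class NaiveAdaptiveWindow:
    """Naive adaptive window over values in [0, 1] (illustrative, unoptimized)."""

    def __init__(self, delta=0.01):
        self.delta = delta      # confidence parameter (assumed value)
        self.window = deque()   # oldest item on the left

    def estimate(self):
        """Current estimate of the mean (here: the mass of the polyhedron)."""
        return sum(self.window) / len(self.window) if self.window else 0.0

    def _has_significant_split(self):
        """True if some split old|recent has means differing by at least eps_cut."""
        items = list(self.window)
        n = len(items)
        if n < 2:
            return False
        total = sum(items)
        log_term = math.log(4.0 * n / self.delta)  # ln(4 / delta'), delta' = delta / n
        head = 0.0
        for n0 in range(1, n):
            head += items[n0 - 1]
            n1 = n - n0
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            eps_cut = math.sqrt(log_term / (2.0 * m))
            if abs(head / n0 - (total - head) / n1) >= eps_cut:
                return True
        return False

    def add(self, x):
        """Insert x, then drop oldest items while some split looks significant."""
        self.window.append(x)
        changed = False
        while self._has_significant_split():
            self.window.popleft()
            changed = True
        return changed


def monitor(bits, delta=0.01):
    """Feed membership bits; return (index of first alarm or None, final estimate)."""
    w = NaiveAdaptiveWindow(delta)
    first_alarm = None
    for t, b in enumerate(bits):
        if w.add(b) and first_alarm is None:
            first_alarm = t
    return first_alarm, w.estimate()


if __name__ == "__main__":
    # Mass 1.0 for 500 points, then mass 0.5 (alternating outside/inside).
    stream = [1.0] * 500 + [float(j % 2) for j in range(500)]
    print(monitor(stream))
```

In line with the theorem, a larger gap between µ1 and µ2 makes the threshold easier to exceed, so the alarm is raised after fewer post-drift points.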
To test the detection technique of Section 3, a set of models (in our case, Petri
nets) has been used. For each model M, a log has been created by simulating
M. The first six models in Table 1 are taken from [8], and represent typical
behaviors in a concurrent system. Model Cycles(X,Y) represents a workflow
of overlaid cyclic processes. Finally, the models a12f0n00 ... t32f0n00
originate from well-known benchmarks in the area of process mining.
To derive logs that contain a drift, each model has been slightly modified in
four different dimensions:
– Flip: the ordering of two events of the model has been reversed.
– Rem: one event of the model has been removed.
– Conc: two sequential events have been put in parallel.
– Conf: two sequential or concurrent events have been put in conflict.
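The four modifications can be illustrated at the level of traces. The sketch below assumes a purely sequential model, represented as a list of activities, and returns the set of traces that the modified model would allow; the experiments modify Petri nets, so this is only a simplified stand-in, and all function names are hypothetical.

```python
def flip(seq, i):
    """Flip: reverse the order of the events at positions i and i + 1."""
    s = list(seq)
    s[i], s[i + 1] = s[i + 1], s[i]
    return {tuple(s)}


def rem(seq, i):
    """Rem: remove the event at position i."""
    s = list(seq)
    return {tuple(s[:i] + s[i + 1:])}


def conc(seq, i):
    """Conc: events i and i + 1 become parallel, so both orderings are allowed."""
    return {tuple(seq)} | flip(seq, i)


def conf(seq, i):
    """Conf: events i and i + 1 are in conflict, so exactly one of them occurs."""
    s = list(seq)
    return {tuple(s[:i + 1] + s[i + 2:]), tuple(s[:i] + s[i + 1:])}


def drifting_log(seq, variant_traces, n_before, n_after):
    """Concatenate n_before traces of the original model with n_after drifted traces."""
    variants = sorted(variant_traces)
    after = [variants[k % len(variants)] for k in range(n_after)]
    return [tuple(seq)] * n_before + after
```

For example, drifting_log(base, conf(base, 1), 3, 4) yields three traces of the original sequence followed by four traces in which only one of the two conflicting events occurs.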
So far, we have focused on explaining how to apply the theory of abstract
interpretation together with estimation and change detection in order to detect
concept drifts in the area of process mining. In this section we will briefly address
the two other problems highlighted in [5]: change localization and characterization,
and unraveling process evolution; this is ongoing work.
Change location and characterization. A polyhedron P is the solution to
the system of m inequalities P = {X | AX ≤ B}, where A ∈ ℚ^(m×n) and B ∈ ℚ^m
(see Section 2.1). A subset C of these inequalities called causal constraints can
be used to derive the corresponding process model (see [7] for details). A causal
constraint satisfies particular conditions that make it possible to convert it
into a process model element, e.g., a place and its corresponding arcs in a Petri
net. Since the adaptive windowing technique described in Section 2.2 requires
few resources, one can use one adaptive window for monitoring each causal
constraint in C. Thus, after the learning stage of Algorithm 1, |C| instances
of ADWIN are used to estimate the average satisfaction for each constraint.
Finally, in stage three these ADWIN instances can detect drift at each of the constraints.
When global or partial drift occurs, its location is exactly characterized by
the causal constraints that have experienced drift, which can be mapped to
the corresponding places in the process model. Remarkably, this provides a
fine-grain concept drift detection version of the technique presented in Section 3.
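The per-constraint monitoring idea can be sketched as follows. Each constraint is a row of AX ≤ B, and a 0/1 satisfaction bit is computed per point and per constraint. As a simplification, the sketch compares satisfaction rates over two fixed windows with a hypothetical threshold instead of running one ADWIN instance per constraint; the function names and the threshold value are assumptions.

```python
def satisfies(point, constraint):
    """True if the point satisfies one inequality a . x <= b."""
    coeffs, bound = constraint
    return sum(a * x for a, x in zip(coeffs, point)) <= bound


def satisfaction_rate(points, constraint):
    """Fraction of the points that satisfy the constraint."""
    return sum(satisfies(p, constraint) for p in points) / len(points)


def drifted_constraints(before, after, constraints, threshold=0.2):
    """Indices of constraints whose satisfaction rate changed by more than threshold."""
    return [
        i
        for i, c in enumerate(constraints)
        if abs(satisfaction_rate(before, c) - satisfaction_rate(after, c)) > threshold
    ]


# Constraints x <= 1, y <= 1 and y - x <= 0, each given as (coefficients, bound).
constraints = [((1, 0), 1), ((0, 1), 1), ((-1, 1), 0)]
before = [(0, 0), (1, 0), (1, 1)]
after = [(0, 1), (1, 1), (0, 0)]
located = drifted_constraints(before, after, constraints)
```

Here only the third constraint is violated by a post-drift point, so it is the only one reported; in the Petri net it would correspond to the place derived from that causal constraint.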
Unraveling process evolution. After change has been localized and
characterized as above, the new process model can then be produced. This is crucial
to unravel the process evolution [5]. Two sets of causal constraints will be used
to derive the new process model: i) the causal constraints that are still valid
after the drift, and ii) the new set of causal constraints that may appear in the
new polyhedron learned by revisiting stage one of Algorithm 1. For the former
set, both drifting and non-drifting causal constraints detected in the previous
iteration of Algorithm 1 will be considered. For each drifting causal constraint, a
threshold value may be defined to determine when drift is strong enough to
invalidate it. As for the complexity of the model revision, for example the method
in [7] for deriving Petri nets from polyhedra is well-behaved in the sense that
a change in some of the inequalities can be translated to a local change in the
Petri net, with proportional computational cost.
The same idea can be used to alleviate a problem with the change detection
strategy described in Section 3. Recall that there we were only detecting changes
where new points appeared in regions outside the learned polyhedron. Drift may
also mean that points previously in the log, or in its convex hull, no longer
appear (e.g., if some behaviors disappear and the polyhedron becomes larger than
necessary). We can detect many such changes by monitoring many aspects of
the stream of points, instead of just the mass of the learned polyhedron. For
example, we could use an array of ADWIN instances to monitor the average
distance among points, distance to their centroid or set of designated points,
distance to each constraint, projection to a set of random hyperplanes, etc.
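As one illustration of such an additional signal, the sketch below turns a stream of points into a stream of distances to the centroid of the points seen during learning. Each resulting number could then be fed to its own change detector. The function names are hypothetical, and the choice of Euclidean distance is an assumption.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))


def distance_stream(points, reference):
    """Euclidean distance of every incoming point to a fixed reference point."""
    return [math.dist(p, reference) for p in points]


learned = [(0, 0), (2, 0), (2, 2), (0, 2)]
center = centroid(learned)
signal = distance_stream([(1, 1), (4, 5)], center)
```

A shrinking average distance would hint that behavior near the boundary of the polyhedron has disappeared, which the mass alone cannot reveal.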
Concept drift is an important concern for any data analysis scenario involving
temporally ordered data. Surprisingly, there is very little work (in fact, possibly
only [5]) in dealing with concept drift within process mining techniques.
In this paper we have presented the first online mechanism for detecting
and managing concept drift, combined with the process mining approach in [7]
based on abstract interpretation and Petri net models. Our experiments on process
mining benchmark data modified to incorporate drift show that our method
detects abrupt changes quickly and accurately. We have also described how to
apply the mechanism for a richer set of tasks: characterizing and locating change,
unraveling the process change, and revising the mined models.
Future work includes experimenting with different forms of change, partic-
ularly, gradual, long term changes and those discussed at the end of Section 5;
implementing the fine-grain detection mechanisms for change location and un-
raveling; and using our approach in a real scenario with a high volume of data,
so sampling becomes essential, and with strong requirements on time and mem-
ory. Also, investigating tailored techniques that can deal with logs that contain
noise is an interesting future research direction. One possibility is to adapt
Algorithm 1 to use some of the few process discovery techniques that can handle
noise in the log [19, 12, 20].
References
1. A. Bifet and R. Gavaldà. Learning from time-changing data with adaptive win-
dowing. In SDM. SIAM, 2007.
2. A. Bifet and R. Gavaldà. Adaptive learning from evolving data streams. In Niall M.
Adams, Céline Robardet, Arno Siebes, and Jean-François Boulicaut, editors, IDA,
volume 5772 of Lecture Notes in Computer Science, pages 249–260. Springer, 2009.
3. A. Bifet and R. Gavaldà. Mining frequent closed trees in evolving data streams.
Intell. Data Anal., 15(1):29–48, 2011.
4. A. Bifet, G. Holmes, B. Pfahringer, and R. Gavaldà. Mining frequent closed graphs
on evolving data streams. In Chid Apté, Joydeep Ghosh, and Padhraic Smyth,
editors, KDD, pages 591–599. ACM, 2011.
5. R. P. Jagadeesh Chandra Bose, W. M. P. van der Aalst, I. Zliobaite, and M. Pech-
enizkiy. Handling concept drift in process mining. In Haralambos Mouratidis
and Colette Rolland, editors, CAiSE, volume 6741 of Lecture Notes in Computer
Science, pages 391–405. Springer, 2011.
6. R.P. Jagadeesh Chandra Bose. Process Mining in the Large: Preprocessing, Dis-
covery, and Diagnostics. PhD thesis, Eindhoven University of Technology, 2012.
7. J. Carmona and J. Cortadella. Process mining meets abstract interpretation. In
European Conference on Machine Learning and Knowledge Discovery in Databases
ECML PKDD, Lecture Notes in Computer Science, pages 184–199. Springer, 2010.
8. J. Carmona, J. Cortadella, and M. Kishinevsky. New region-based algorithms for
deriving bounded Petri nets. IEEE Trans. on Computers, 59(3):371–384, 2009.
9. P. Cousot and R. Cousot. Static determination of dynamic properties of programs.
In 2nd Int. Symposium on Programming, pages 106–130. Paris, France, 1976.
10. P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for
static analysis of programs by construction or approximation of fixpoints. In Proc.
ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages, pages
238–252. ACM Press, 1977.
11. P. Cousot and N. Halbwachs. Automatic discovery of linear restraints among
variables of a program. In Proc. ACM SIGPLAN-SIGACT Symp. on Principles of
Programming Languages, pages 84–97. ACM Press, New York, 1978.
12. Christian W. Günther and Wil M. P. van der Aalst. Fuzzy mining - adaptive
process simplification based on multi-perspective metrics. In Gustavo Alonso, Peter
Dadam, and Michael Rosemann, editors, BPM, volume 4714 of Lecture Notes in
Computer Science, pages 328–343. Springer, 2007.
13. B. Jeannet and A. Miné. Apron: A library of numerical abstract domains for
static analysis. In Ahmed Bouajjani and Oded Maler, editors, CAV, volume 5643
of Lecture Notes in Computer Science, pages 661–667. Springer, 2009.
14. A. Miné. The octagon abstract domain. In Analysis, Slicing and Transformation,
IEEE, pages 310–319. IEEE CS Press, October 2001.
15. T. Murata. Petri nets: Properties, analysis and applications. Proc. of the IEEE,
77(4), 1989.
16. IEEE Task Force on Process Mining. Process mining manifesto. In Florian Daniel,
Kamel Barkaoui, and Schahram Dustdar, editors, Business Process Management
Workshops (1), volume 99 of Lecture Notes in Business Information Processing,
pages 169–194. Springer, 2011.
17. W. M. P. van der Aalst. Process Mining - Discovery, Conformance and Enhance-
ment of Business Processes. Springer, 2011.
18. W. M. P. van der Aalst and C. W. Günther. Finding structure in unstructured
processes: The case for process mining. In Twan Basten, Gabriel Juhás, and
Sandeep K. Shukla, editors, ACSD, pages 3–12. IEEE Computer Society, 2007.
19. Wil M. P. van der Aalst, Ana Karla A. de Medeiros, and A. J. M. M. Weijters.
Genetic process mining. In Applications and Theory of Petri Nets 2005, 26th Inter-
national Conference, ICATPN 2005, Miami, USA, June 20-25, 2005, Proceedings,
volume 3536 of Lecture Notes in Computer Science, pages 48–69. Springer, 2005.
20. A. J. M. M. Weijters and J. T. S. Ribeiro. Flexible heuristics miner (FHM). In
CIDM, pages 310–317, 2011.