Handling Duplicated Tasks in Process Discovery by Refining...
Transcript of Handling Duplicated Tasks in Process Discovery by Refining...
![Page 1: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/1.jpg)
Handling Duplicated Tasks in Process Discovery by Refining Event Labels
Xixi Lu ([email protected])1
Dirk Fahland1
Frank van den Biggelaar2
Wil van der Aalst1
1Eindhoven University of Technology2Maastricht University Medical Center 1
![Page 2: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/2.jpg)
Overview
• (5 min) Research Problem• (10 min) Approach• (5 min) Evaluation and Conclusion
2
![Page 3: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/3.jpg)
VA CB A
Duplicated Tasks –Events Having Imprecise Labels
3
CA
BS+
X X
A B ACV S
C B SAC A B
VV S
Discover VABC
S
Record
+
![Page 4: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/4.jpg)
4
A B ACV S
C B SAC A B
VV S
Input:Imprecise log
Research Problem
![Page 5: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/5.jpg)
Research Problem
5
So that1. all traces & events preserved (no filtering)2. model is more precise (than without refining)3. we can explore different refinement of labels interactively
(because we don’t know correct label refinement from given log)
Discover(IM, ILP, …)
Model
Refine labels of events
Input:Imprecise log
Output:Refined log
![Page 6: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/6.jpg)
Refining Event Labels
6
imprecise Precise
DiscoverDiscover Discover
C B AC A B
A B AC C2 B2 A2
C2 A2 B2
A1 B1 A3C1
C2 B2 A2
C3 A4 B3
A1 B1 A3C1
C2 B2A2
A1 B1 A3C1
C2 B2 A2
C3 A4 B3
A1 B1 A3C1
C BA
A B AC
Discover
C2 B2 A2
C3 A4 B3
A1 B1 A3C1Refining Label
ABC
![Page 7: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/7.jpg)
Refining Event Labels
7
Precise
Discover Discover
C B AC A B
A B AC C2 B2 A2
C2 A2 B2
A1 B1 A3C1
C2 B2 A2
C3 A4 B3
A1 B1 A3C1
C2 B2A2
A1 B1 A3C1
C2 B2 A2
C3 A4 B3
A1 B1 A3C1
C BA
A B ACSystem unknown!
We don’t know what is optimal!
Discover
C2 B2 A2
C3 A4 B3
A1 B1 A3C1Refining Label
ABC
imprecise
Discover
![Page 8: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/8.jpg)
Related Work
8
Discover(IM, ILP, …)
Process Model
Refine Labelsof Events
Imprecise log Refined log
(c) Cluster traces
(b) New Discovery algorithm
(a) Relabelmodel elements
(d) Filter
Filtered log
Clusters …
![Page 9: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/9.jpg)
1. Detectimprecise
labels3. Refine
horizontally4. Refinevertically
Imprecise label candidates
Horizontal clusters of events
Vertical clusters of events
Approach
9
Refined log
2. Mapsimilar events
{x, …}
Mappings
Imprecise log
a) Manual by user inputb) Automated detection
![Page 10: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/10.jpg)
1. Detectimprecise
labels3. Refine
horizontally4. Refinevertically
Imprecise label candidates
Horizontal clusters of events
Vertical clusters of events
Approach
10
Refined log
2. Mapsimilar events
{x, …}
Mappings
Imprecise log
![Page 11: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/11.jpg)
Mapping Between Events
11
Trace 1
Trace 2
Mapping
AV SA CB
SV C AB
![Page 12: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/12.jpg)
Quantify Similarity of Mapped Events
12
… based on “structural context of events”
AV SA CB
SV C AB
![Page 13: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/13.jpg)
Quantify Similarity of Mapped Events
13
… based on “structural context of events”(1) Differences in neighbors
Different Same
Cost = 2
AV SA CB
SV C AB
![Page 14: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/14.jpg)
Quantify Similarity of Mapped Events
14
… based on “structural context of events”(1) Differences in neighbors(2) Differences in structure
Distance = 4
Distance = 3
Cost = 2+1
AV SA CB
SV C AB
![Page 15: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/15.jpg)
Quantify Similarity of Mapped Events
15
… based on “structural context of events”(1) Differences in neighbors(2) Differences in structure
Distance = 2
Distance = 1
Cost = 2+1+1+1+0
AV SA CB
SV C AB
![Page 16: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/16.jpg)
Quantify Similarity of Mapped Events
16
… based on “structural context of events”(1) Differences in neighbors(2) Differences in structure
Cost = 2+3Cost = 4+6
Cost = 4+4 Cost = 0+3
Cost = 2+4 Cost so far = 6+10+8+5+3
AV SA CB
SV C AB
![Page 17: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/17.jpg)
Quantify Similarity of Mapped Events
17
… based on “structural context of events”(1) Differences in neighbors(2) Differences in structure(3) #Dissimilar events
Xixi Lu, Dirk Fahland, Frank J.H.M. van den Biggelaar, and Wil M.P. van der Aalst. Detecting Deviating Behavior without Models. In BPM Workshops 2015, volume 256, pp.126-139, Springer International Publishing, 2015.
There is an algorithm to compute an optimal mapping :
AS SA CB
SS C AB
Total Cost = 6+10+8+5+3 +1
![Page 18: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/18.jpg)
1. Detectimprecise
labels3. Refine
horizontally4. Refinevertically
Imprecise label candidates
Horizontal clusters of events
Vertical clusters of events
Approach
18
Refined log
2. Mapsimilar events
{x, …}
Mappings
Imprecise log
![Page 19: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/19.jpg)
3. Refine horizontally using cost
19
AV SA CB
SV C AB
![Page 20: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/20.jpg)
3. Refine horizontally using cost
20
AV SA CB
SV C AB
Trace 1
Trace 2
SV C BATrace 3
20
V SA CBTrace 0
10 10 10
Imprecise label candidates
20 20
15 15 15
a) Normalize costs w.r.t. maximal cost seen in the log
![Page 21: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/21.jpg)
3. Refine horizontally using cost
21
1
0.5 0.5 0.5
Imprecise label candidates
1 1
0.75 0.75 0.75
a) Normalize costs w.r.t. maximal cost seen in the log
b) Set variant threshold,say, 0.8
c) Remove edges if cost > variant threshold
AV SA CB
SV C AB
SV C BA
V SA CB
Trace 1
Trace 2
Trace 3
Trace 0
![Page 22: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/22.jpg)
3. Refine horizontally using cost
22
0.5 0.5 0.5
0.75 0.75 0.75
a) Normalize costs w.r.t. maximal cost seen in the log
b) Set variant thresholdsay, 0.8
c) Remove edges if cost > variant threshold
A1V SA1 C1B1
SV C2 A2B2
SV C2 B2A2
V SA1 C1B1
Trace 1
Trace 2
Trace 3
Trace 0
![Page 23: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/23.jpg)
4. Refine vertically using frequency
23
I) Set unfolding threshold,say, 60%
II) Refine labels if freq > unfolding threshold
Is this A1 a 2nd iteration of a loop?
A1V SA1 C1B1
V SA1 C1B1
Trace 1
Trace 0
2 1
![Page 24: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/24.jpg)
4. Refine vertically using frequency
24
I) Set unfolding threshold,say, 40%
II) Refine labels if freq > unfolding threshold
1
Or is this a different task?
A3V SA1 C1B1
V SA1 C1B1
Trace 1
Trace 0
2
![Page 25: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/25.jpg)
1. Detectimprecise
labels3. Refine
horizontally4. Refinevertically
Imprecise label candidates
Approach
25
Refined log
2. Mapsimilar events
{x, …}
Mappings
Imprecise log
Variant threshold
Unfolding threshold
Iterative
![Page 26: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/26.jpg)
Implemented in ProM
• There will be a demo in the demo track.
26
Detect imprecise labels
Refine horizontally
Refine vertically
![Page 27: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/27.jpg)
• 3 experiments, each 600 models, k transitions w/ same labelE1) Default parameters, k = 4E2) Adaptive parameters, k = 4E3) 1 in loop, k = 2
Evaluation
27
Refining Label
Discover
Discover
Generate
Improved?Imprecise log Refined log
Compute qualities
![Page 28: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/28.jpg)
Experiment Result - Example
28
System precision improved by 0.68System recall is 1
Precision w.r.t. log improved by 0.55
B
B B B B
𝑡𝑡1𝑡𝑡2
𝑡𝑡3 𝑡𝑡4B
BB B
![Page 29: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/29.jpg)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
E2) Adaptive
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
E1) Default
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
E3) 1 in loop
Experiment Result -F1-score of and w.r.t.
29
Improved 35% ~ 46% Models Improved 89% Models Improved 60% Models
![Page 30: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/30.jpg)
Experiment Results and Limitations
30
![Page 31: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/31.jpg)
Real-life Example : Hospital Data
31
1st Iteration : refined “surgery” events
2nd & 3rd Iterations: refine “consultation” events
![Page 32: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/32.jpg)
Conclusion and Future Work
• Improved logs (and process discovery) by refining labels for up to 87%• Interactive and explorative
• Future work• Different ways to compute similar events using context of events• Log preprocessing framework
• Integrated in “Log to Model Explorer”• Supporting clustering, filtering and label refining
• Come to the Demo!
32
![Page 33: Handling Duplicated Tasks in Process Discovery by Refining ...processmining.org/old-version/files/2016bpm_lu.pdf · Is this A1 a 2. nd. iteration of a loop? V. A1. B1. C1. A1. S.](https://reader036.fdocuments.in/reader036/viewer/2022071602/613d691ae1ef621e9f2db4c4/html5/thumbnails/33.jpg)
Questions?
33