"Decomposing Alignment-based Conformance Checking of Data-aware Process Models"
Massimiliano de Leoni, Jorge Muñoz-Gama, Josep Carmona, Wil van der Aalst
This presentation includes adaptations of slides prepared by Jorge Muñoz-Gama
Process Mining
process mining
data-oriented analysis (data mining, machine learning, business intelligence)
process model analysis (simulation, verification, etc.)
performance-oriented questions, problems and solutions
compliance-oriented questions, problems and solutions
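Process Mining
[Figure: process mining overview. The “world” (people, machines, organizations, components, business processes) is supported/controlled by a software system, which records events (e.g., messages, transactions) in event logs. A (process) model models and analyzes the world, and specifies, configures, implements and analyzes the software system. Discovery, conformance and enhancement connect event logs and models.]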
Example: A Credit Institute
(a; {A = 3000; R = Michael; E = Pete}); (b; {V = OK; E = Sue}); (c; {I = 530; D = OK; E = Sue}); (f; {E = Pete});
For such a credit amount, the interest should be < 450
«Sue» is not authorized to perform b: she is not an Assistant
Activity h has not been executed: D cannot be OK
(a; {A = 3000; R = Michael; E = Pete}); (b; {V = OK; E = Pete}); (c; {I = 530; D = OK; E = Sue}); (d; {I = 599; D = NOK; E = Sue}); (f; {E = Pete});
(a; {A = 5001; R = Michael; E = Pete}); (b; {V = OK; E = Pete}); (c; {I = 530; D = NOK; E = Sue}); (f; {E = Pete});
Activity d should have occurred, since amount < 5000
Conformance: General Idea
Process Traces
Log trace
Nonconformity
Alignments
Moves in both with incorrect write operations
Move in log
Move in process
Cost of alignments
• Each move is associated with a cost
• Cost of an alignment is the sum of the costs of its moves
[Table: example cost function. It assigns a cost to each “move on log”, each “move on model”, each read/write of a wrong value, and each variable that is not written or read; the cell values did not survive extraction.]
Cost of alignments: some examples
[Figure: example alignments; the costs shown are 8 and 10.]
An optimal alignment: an alignment with the lowest cost
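The cost definition above can be sketched in a few lines. This is only an illustration: the `Move` class, the unit costs and the two candidate alignments are hypothetical and are not taken from the slides, whose cost table did not survive extraction.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Move:
    """One alignment move (hypothetical representation)."""
    log_activity: Optional[str]     # None for a move on model
    model_activity: Optional[str]   # None for a move on log
    wrong_values: int = 0           # variables read/written with a wrong value
    missing_values: int = 0         # variables not written or read


# Assumed unit costs; a real cost function may differ per activity/variable.
COST_MOVE_ON_LOG = 1
COST_MOVE_ON_MODEL = 1
COST_WRONG_VALUE = 1
COST_MISSING_VALUE = 1


def move_cost(move: Move) -> int:
    if move.model_activity is None:          # move on log
        return COST_MOVE_ON_LOG
    if move.log_activity is None:            # move on model
        return COST_MOVE_ON_MODEL + COST_MISSING_VALUE * move.missing_values
    # move in both: only incorrect data operations are penalised
    return COST_WRONG_VALUE * move.wrong_values


def alignment_cost(alignment: List[Move]) -> int:
    """Cost of an alignment = sum of the costs of its moves."""
    return sum(move_cost(m) for m in alignment)


def optimal(alignments: List[List[Move]]) -> List[Move]:
    """Among the given candidates, return one with the lowest cost."""
    return min(alignments, key=alignment_cost)


candidate_1 = [Move("a", "a"), Move("b", "b", wrong_values=1), Move("c", None)]
candidate_2 = [Move("a", "a"), Move("b", None), Move(None, "b"), Move("c", None)]
best = optimal([candidate_1, candidate_2])
```

Note that `optimal` only compares candidates that are handed to it; searching the space of all alignments is the hard part discussed on the next slide.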
Finding an optimal alignment
• Undecidable in the general case.
• If guards are restricted to linear (in)equations, finding an optimal alignment is
  • Decidable, BUT
  • Double exponential in the size of the model, i.e. the number of activities and data variables.
M. de Leoni, W. M. P. van der Aalst. “Aligning Event Logs and Process Models for Multi-Perspective Conformance Checking: An Approach Based on Integer Linear Programming”. BPM 2013.
F. Mannhardt, M. de Leoni, H. Reijers, W. M. P. van der Aalst. “Balanced Multi-Perspective Checking of Process Conformance”. Software Computing, Springer (Under Review). Also available as BPM Center Report BPM-14-07, BPMcenter.org, 2014.
Finding an optimal alignment: a divide-et-impera approach
• Beneficial since the problem is exponential!
[Figure: a model with transitions t1, t2, t3, t4, t5, t6 split into fragments; the fragments shown contain t1 t2 t3, t3 t4, and t5 t6.]
• Decomposed Perfectly Fitting Checking: A model/log is perfectly fitting if and only if all the components are perfectly fitting
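A minimal sketch of the decomposed check, under strong simplifying assumptions: each fragment is modelled here as a small deterministic automaton over activity labels (a hypothetical stand-in for a Petri-net fragment), data is ignored, and the fragments `f1` and `f2` are invented for illustration. A trace is projected onto each fragment's activities and is reported as fitting only if every fragment accepts its projection.

```python
from typing import Dict, List, Set, Tuple


class Fragment:
    """Hypothetical fragment: a deterministic automaton over activity labels."""

    def __init__(self, start: int, accepting: Set[int],
                 transitions: Dict[Tuple[int, str], int]) -> None:
        self.start = start
        self.accepting = accepting
        self.transitions = transitions
        # Activities that have a reflection in this fragment.
        self.alphabet = {activity for (_, activity) in transitions}

    def project(self, trace: List[str]) -> List[str]:
        """Keep only the events whose activity belongs to this fragment."""
        return [a for a in trace if a in self.alphabet]

    def fits(self, trace: List[str]) -> bool:
        state = self.start
        for activity in self.project(trace):
            key = (state, activity)
            if key not in self.transitions:
                return False
            state = self.transitions[key]
        return state in self.accepting


def fits_all_fragments(trace: List[str], fragments: List[Fragment]) -> bool:
    """Decomposed check: fitting iff every fragment accepts the projection."""
    return all(f.fits(trace) for f in fragments)


# Two fragments sharing only the transition t3 (assumed example).
f1 = Fragment(0, {3}, {(0, "t1"): 1, (1, "t2"): 2, (2, "t3"): 3})
f2 = Fragment(0, {2}, {(0, "t3"): 1, (1, "t4"): 2})

ok = fits_all_fragments(["t1", "t2", "t3", "t4"], [f1, f2])
swapped = fits_all_fragments(["t1", "t3", "t2", "t4"], [f1, f2])
```

Whether the "if and only if" actually holds depends on how the model is cut into fragments, which is the subject of the later slides on valid decompositions.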
Petri Net with Data : Variables and Read/Write Operations
[Figure: Petri net with data for the credit process. Places: n1, n2, n3, n4, n5, n6. Transitions: Credit Request (a), Verify (b), Assessment (c), Register Negative Verification (d), Inform Customers (e), Renegotiate (f), Register Loan Rejection (g), Open Credit Loan (h). Variables: Interests, Amount, Verification, Decision. The legend distinguishes variables, write operations and read operations.]
Each transition is associated with all valid bindings
Transition                        Guard
Credit Request                    --
Verify                            0.1 * r(A) < w(I) < 0.2 * r(A)
Assessment                        r(V) = true
Register Negative Verification    r(V) = false AND w(D) = false
Inform Requester                  --
Register Loan Rejection           r(D) = false
Open Credit                       r(D) = true
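The guard table can be read as one predicate per transition over the values read, r(.), and written, w(.). The sketch below encodes the table exactly as printed; the dictionary layout, the `guard_holds` helper and the use of Python booleans for true/false are assumptions made for illustration.

```python
from typing import Callable, Dict

Values = Dict[str, object]
# A guard takes the values read and the values written by one firing.
Guard = Callable[[Values, Values], bool]

# Variable names follow the slide: A = Amount, I = Interests,
# V = Verification, D = Decision. "--" in the table means "no guard".
GUARDS: Dict[str, Guard] = {
    "Credit Request": lambda r, w: True,
    "Verify": lambda r, w: 0.1 * r["A"] < w["I"] < 0.2 * r["A"],
    "Assessment": lambda r, w: r["V"] is True,
    "Register Negative Verification":
        lambda r, w: r["V"] is False and w["D"] is False,
    "Inform Requester": lambda r, w: True,
    "Register Loan Rejection": lambda r, w: r["D"] is False,
    "Open Credit": lambda r, w: r["D"] is True,
}


def guard_holds(transition: str, read: Values, written: Values) -> bool:
    """Evaluate the guard of a transition for one candidate binding."""
    return GUARDS[transition](read, written)


within_range = guard_holds("Verify", {"A": 3000}, {"I": 530})
too_high = guard_holds("Verify", {"A": 3000}, {"I": 700})
```

A binding of a transition is valid when its guard evaluates to true, so a conformance checker would reject, or charge a cost for, a firing such as the second call above.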
Structural Decomposition
[Figure: the credit-process Petri net with data.]
SESE (Single-Entry-Single-Exit) Decomposition
SESE: a set of edges whose induced graph has a single entry node and a single exit node
Refined Process Structure Tree (RPST) containing non-overlapping SESEs
• Unique
• Modular
• Computable in linear time
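To make the SESE definition concrete, here is a naive check for a given set of edges. It is a sketch under assumptions: it uses the entry/exit conditions commonly given in the RPST literature (they are not spelled out on the slide), the example graph is hypothetical, and it tests one fragment at a time rather than building the RPST in linear time.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]


def is_sese(graph: Set[Edge], fragment: Set[Edge]) -> bool:
    """Naive test: does `fragment` have exactly one entry and one exit node?"""
    if not fragment or not fragment <= graph:
        return False
    nodes = {n for edge in fragment for n in edge}
    boundary, entries, exits = [], set(), set()
    for v in nodes:
        incoming = {e for e in graph if e[1] == v}
        outgoing = {e for e in graph if e[0] == v}
        # Boundary node: graph source/sink, or touching edges outside the fragment.
        outside = (incoming | outgoing) - fragment
        if incoming and outgoing and not outside:
            continue  # interior node
        boundary.append(v)
        # Assumed entry condition: no incoming edge inside, or all outgoing inside.
        if not (incoming & fragment) or outgoing <= fragment:
            entries.add(v)
        # Assumed exit condition: no outgoing edge inside, or all incoming inside.
        if not (outgoing & fragment) or incoming <= fragment:
            exits.add(v)
    if len(boundary) != 2:
        return False
    u, v = boundary
    return (u in entries and v in exits) or (v in entries and u in exits)


# Hypothetical workflow graph: a -> b, then a choice between c and d, then e -> f.
graph = {("a", "b"), ("b", "c"), ("b", "d"), ("c", "e"), ("d", "e"), ("e", "f")}
diamond = {("b", "c"), ("b", "d"), ("c", "e"), ("d", "e")}
scattered = {("b", "c"), ("d", "e")}
```

Here `is_sese(graph, diamond)` holds because only b (entry) and e (exit) touch the rest of the graph, whereas `scattered` has four boundary nodes.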
SESE does not guarantee Decomposed Perfectly Fitting Checking / 1
• Decomposed Perfectly Fitting Checking: A model/log is perfectly fitting if and only if all the components are perfectly fitting
• The problem is in the boundary places or variables
  • They have no reflection on the log
• Required: a partition in which only transitions are shared among components (neither places, nor variables, nor arcs)
  • Transitions have a reflection on the log
SESE does not guarantee Decomposed Perfectly Fitting Checking / 2
• Create a ‘bridge’ for each shared place
Implementation
Available in the package DataConformanceChecker
Experiments
• Generated different event logs of 5000 traces each, with different average trace lengths
  • This is ensured by enforcing a larger number of credit renegotiations
• 20% of the transition firings are generated so that they do not satisfy the guards
[Figure: the credit-process Petri net with data.]
Example of the SESE-based Algorithm
[Figure: the fragments obtained from the credit-process net. Some fragments are built around the places n1 to n6 and others around the variables Interests, Amount, Verification and Decision; transitions such as Credit Request, Verify, Assessment, Renegotiate, Register Negative Verification, Open Credit Loan and Register Loan Rejection appear in more than one fragment.]
Results: an exponential reduction of the computation time
[Plot: computation time (in seconds, log scale, ticks from 10000 to 10000000) against the average number of events per event-log trace (5 to 30), for two series: No Decomposition and SESE-based decomposition.]
Projection on the model
• For each transition t:
  • n = number of fragments in which t occurs
  • DPN(t)_i is the i-th fragment in which t occurs.

\[ \mathit{fitTransition}(t) = 1 - \frac{\sum_{i=1}^{n} \dfrac{\#\mathit{correct}(t,\, DPN(t)_i)}{\#\mathit{total}(t,\, DPN(t)_i)}}{n} \]

#correct(t,DPN) = number of moves in both without incorrect write operations for t in the alignments between each log trace and DPN
#total(t,DPN) = number of moves for t in the alignments between each log trace and DPN
Experiments with a real-life process
• Process enacted by a Dutch governmental agency to deal with unemployment benefits
• Checked the conformance against an event log consisting of 111 traces
• Process model validated with process analysts of the agency
Without Decomposition: 52891 seconds
With Decomposition: 52.94 seconds (-99.9%)
Conclusion
• Finding an alignment is exponential in the model size
• To speed up the computation:
1. Decompose the model into submodels
2. Align each trace with each submodel
• The decomposition needs to be valid:
Any trace fits the entire model if and only if it fits all smaller fragments.