"Decomposing Alignment- based Conformance Checking of Data-aware Process Models" Massimiliano de...

Post on 19-Jan-2016

"Decomposing Alignment-based Conformance Checking of Data-aware Process Models"

Massimiliano de Leoni, Jorge Muñoz-Gama, Josep Carmona, Wil van der Aalst

This presentation includes adaptations of slides prepared by Jorge Muñoz-Gama

Process Mining

process mining

data-oriented analysis (data mining, machine learning, business intelligence)

process model analysis (simulation, verification, etc.)

performance-oriented questions, problems and solutions

compliance-oriented questions, problems and solutions

Process Mining

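[Figure: the process mining framework]

• The “world” (people, machines, organizations, components, business processes) is supported/controlled by a software system
• The software system records events, e.g., messages, transactions, etc., in event logs
• A (process) model models and analyzes the “world”, and specifies, configures, implements and analyzes the software system
• Event logs and models are related through discovery, conformance and enhancement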

Example: A Credit Institute

(a; {A = 3000; R = Michael; E = Pete}); (b; {V = OK; E = Sue}); (c; {I = 530; D = OK; E = Sue}); (f; {E = Pete})

• For such a credit amount, the interest should be < 450
• «Sue» is not authorized to perform b: she is not an Assistant
• Activity h hasn’t been executed: D cannot be OK

(a; {A = 3000; R = Michael; E = Pete}); (b; {V = OK; E = Pete}); (c; {I = 530; D = OK; E = Sue}); (d; {I = 599; D = NOK; E = Sue}); (f; {E = Pete})

(a; {A = 5001; R = Michael; E = Pete}); (b; {V = OK; E = Pete}); (c; {I = 530; D = NOK; E = Sue}); (f; {E = Pete})

• Activity d should have occurred, since amount < 5000

Conformance General Idea

Process Traces

Log trace

Nonconformity

Alignments

Move in both with incorrect write operations

Move in log

Move in process

Cost of alignments

• Each move is associated with a cost
• The cost of an alignment is the sum of the costs of its moves

[Figure: cost tables for the example, covering four kinds of cost: the cost of a “move on log”, the cost of a “move on model”, the cost of reading/writing a wrong value, and the cost of not writing or reading a variable]

Cost of alignments: some examples

[Figure: two example alignments, with costs 8 and 10]

An optimal alignment: an alignment with the lowest cost
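The cost definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the move representation, the `SKIP` marker, and the unit cost values are hypothetical and are not the cost tables used on the slides.

```python
# Minimal sketch: an alignment is a list of (log_step, model_step) pairs.
# A step is (activity, {variable: value}); SKIP marks the "no move" side.
# All cost values below are hypothetical defaults.
SKIP = None


def move_cost(log_step, model_step, log_cost=1, model_cost=1, wrong_value_cost=1):
    """Cost of a single move."""
    if model_step is SKIP:
        return log_cost            # move on log
    if log_step is SKIP:
        return model_cost          # move on model
    # Move in both: pay for every variable whose logged value differs
    # from the value the model expects.
    _, log_values = log_step
    _, model_values = model_step
    mismatches = sum(
        1 for var, val in model_values.items() if log_values.get(var) != val
    )
    return wrong_value_cost * mismatches


def alignment_cost(alignment):
    """The cost of an alignment is the sum of the costs of its moves."""
    return sum(move_cost(log_step, model_step) for log_step, model_step in alignment)


def optimal(alignments):
    """An optimal alignment is one with the lowest cost."""
    return min(alignments, key=alignment_cost)


gamma1 = [
    (("a", {"A": 3000}), ("a", {"A": 3000})),   # correct move in both
    (("b", {"V": True}), SKIP),                 # move on log
]
gamma2 = [
    (("a", {"A": 3000}), ("a", {"A": 5000})),   # wrong value for A
    (("b", {"V": True}), ("b", {"V": False})),  # wrong value for V
]
```

Here `gamma1` costs 1 and `gamma2` costs 2, so `optimal([gamma1, gamma2])` returns `gamma1`. Enumerating candidate alignments, which is the expensive part, is deliberately left out.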

Finding an optimal alignment

• Undecidable in the general case.
• If guards are restricted to linear (in)equations, finding an optimal alignment is
  • Decidable, BUT
  • Doubly exponential in the size of the model, i.e. the number of activities and data variables.

M. de Leoni, W. M. P. van der Aalst. “Aligning Event Logs and Process Models for Multi-Perspective Conformance Checking: An Approach Based on Integer Linear Programming”. BPM 2013.

F. Mannhardt, M. de Leoni, H. Reijers, W. M. P. van der Aalst. “Balanced Multi-Perspective Checking of Process Conformance”. Software Computing, Springer (Under Review). Also available as BPM Center Report BPM-14-07, BPMcenter.org, 2014.

Finding an optimal alignment: a divide-et-impera approach

• Beneficial since the problem is exponential!

[Figure: a model over transitions t1 to t6 and its fragments: (t1, t2, t3), (t3, t4) and (t5, t6)]

• Decomposed Perfectly Fitting Checking: A model/log is perfectly fitting if and only if all the components are perfectly fitting
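A minimal sketch of the decomposed check, assuming a valid decomposition and looking at control flow only (data is ignored). Small deterministic automata stand in for the Petri-net fragments; the component encoding and all names are hypothetical.

```python
# Each component is a tiny automaton over its own alphabet of activities.
# A trace is projected onto that alphabet and replayed; the whole trace
# is reported as fitting only if every projection replays successfully.

def project(trace, alphabet):
    """Keep only the activities that the component knows about."""
    return [activity for activity in trace if activity in alphabet]


def replays(component, trace):
    """True if the projected trace is accepted by the component."""
    state = component["initial"]
    for activity in trace:
        state = component["delta"].get((state, activity))
        if state is None:          # no transition enabled: not fitting
            return False
    return state in component["final"]


def fits_decomposed(trace, components):
    return all(replays(c, project(trace, c["alphabet"])) for c in components)


# Two hypothetical components that share only the activity "c".
c1 = {"alphabet": {"a", "b", "c"}, "initial": 0, "final": {3},
      "delta": {(0, "a"): 1, (1, "b"): 2, (2, "c"): 3}}
c2 = {"alphabet": {"c", "d"}, "initial": 0, "final": {2},
      "delta": {(0, "c"): 1, (1, "d"): 2}}
components = [c1, c2]
```

With these components, `fits_decomposed(["a", "b", "c", "d"], components)` is true, while the trace `["a", "c", "d"]` is rejected because its projection on the first component skips `b`.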

Petri Net with Data : Variables and Read/Write Operations

[Figure: the credit-institute process as a Petri net with data]

• Places: n1, n2, n3, n4, n5, n6
• Transitions: Credit Request (a), Verify (b), Assessment (c), Register Negative Verification (d), Inform Customers (e), Renegotiate (f), Register Loan Rejection (g), Open Credit Loan (h)
• Variables: Interests, Amount, Verification, Decision
• Arcs between transitions and variables denote write operations and read operations

Each transition is associated with all valid bindings

Transition | Guard
Credit Request | --
Verify | 0.1 * r(A) < w(I) < 0.2 * r(A)
Assessment | r(V) = true
Register Negative Verification | r(V) = false AND w(D) = false
Inform Requester | --
Register Loan Rejection | r(D) = false
Open Credit | r(D) = true
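The guard table can be read as a set of predicates over the values a transition reads, r(X), and the values it writes, w(X). The sketch below is only an illustration: the dictionary encoding, the use of Python booleans for true/false, and the helper name `is_valid_binding` are assumptions.

```python
# Guards as predicates over read values (r) and written values (w).
# "--" in the table is taken to mean "no guard", i.e. always satisfied.
GUARDS = {
    "Credit Request": lambda r, w: True,
    "Verify": lambda r, w: 0.1 * r["A"] < w["I"] < 0.2 * r["A"],
    "Assessment": lambda r, w: r["V"] is True,
    "Register Negative Verification":
        lambda r, w: r["V"] is False and w["D"] is False,
    "Inform Requester": lambda r, w: True,
    "Register Loan Rejection": lambda r, w: r["D"] is False,
    "Open Credit": lambda r, w: r["D"] is True,
}


def is_valid_binding(transition, read, written):
    """True if the read/written values satisfy the transition's guard."""
    return GUARDS[transition](read, written)


# With A = 3000 the guard of Verify requires 300 < I < 600.
ok = is_valid_binding("Verify", {"A": 3000}, {"I": 530})
too_high = is_valid_binding("Verify", {"A": 3000}, {"I": 650})
```

Under this reading `ok` is true and `too_high` is false. A valid binding in the sense of the slide is any assignment of read and written values for which the predicate holds.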

Structural Decomposition

SESE (Single-Entry-Single-Exit) Decomposition

SESE: a set of edges whose induced graph has a single entry node and a single exit node

Refined Process Structure Tree (RPST) containing non-overlapping SESEs

• Unique
• Modular
• Linear time
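The SESE condition can be tested directly on a candidate edge set. The sketch below assumes the usual definition from the RPST literature: a boundary node is an entry if none of its incoming edges is in the set or all of its outgoing edges are, and an exit if all of its incoming edges are in the set or none of its outgoing edges is. It is a brute-force check with hypothetical names, not the linear-time RPST computation, and it does not treat the source and sink of the whole graph specially.

```python
def is_sese(edges, fragment):
    """Check whether `fragment` (a subset of `edges`) is single-entry single-exit.

    Edges are (source, target) pairs of a directed graph.
    """
    edges, fragment = set(edges), set(fragment)
    nodes = {n for edge in fragment for n in edge}
    outside = edges - fragment
    # Boundary nodes touch at least one edge that is not in the fragment.
    boundary = {n for n in nodes if any(n in edge for edge in outside)}
    if len(boundary) != 2:
        return False

    def incoming(n):
        return {e for e in edges if e[1] == n}

    def outgoing(n):
        return {e for e in edges if e[0] == n}

    def is_entry(n):
        return not (incoming(n) & fragment) or outgoing(n) <= fragment

    def is_exit(n):
        return incoming(n) <= fragment or not (outgoing(n) & fragment)

    u, v = boundary
    return (is_entry(u) and is_exit(v)) or (is_entry(v) and is_exit(u))


# Hypothetical graph: s -> a, then two parallel branches a -> b -> d and
# a -> c -> d, then d -> t.
graph = [("s", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "t")]
diamond = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
scattered = [("a", "b"), ("c", "d")]
```

Here `is_sese(graph, diamond)` holds, with entry `a` and exit `d`, while `scattered` has four boundary nodes and is rejected.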

SESE does not guarantee Decomposed Perfectly Fitting Checking / 1

• Decomposed Perfectly Fitting Checking: A model/log is perfectly fitting if and only if all the components are perfectly fitting

• The problem is in the boundary places or variables
  • No reflection on the log

• A partition with only transitions shared among components (neither places, nor variables, nor arcs)
  • Transitions have reflection on the log

SESE does not guarantee Decomposed Perfectly Fitting Checking / 2

• Create a ‘bridge’ for each shared place

Implementation

Available in the package DataConformanceChecker

Experiments

• Generated different event logs, each with 5000 traces and a different average trace length
  • This was ensured by enforcing a larger number of credit renegotiations

• 20% of the transition firings were generated so as not to satisfy the guards

Example of the SESE-based Algorithm

[Figure: the fragments obtained for the credit-institute model, each built around a place or a group of variables]

• n1: Credit Request
• n2: Credit Request, Verify, Renegotiate
• n3: Verify, Register Negative Verification, Assessment
• n4: Register Negative Verification, Inform Customers, Assessment
• n5: Renegotiate, Open Credit Loan, Register Loan Rejection
• n6: Open Credit Loan, Register Loan Rejection
• Verification, Decision: Verify, Register Negative Verification, Assessment, Open Credit Loan, Register Loan Rejection
• Interests, Amount: Credit Request, Verify, Renegotiate

Results: an exponential reduction of the computation time

[Figure: computation time (in seconds, log scale from 10000 to 10000000) against the average number of events per event-log trace (5 to 30), for two series: No Decomposition and SESE-based decomposition]

Projection on the model

• For each transition t:
  • n = number of fragments in which t occurs
  • DPN(t)_i is the i-th fragment in which t occurs

\[
\mathit{fitTransition}(t) = 1 - \frac{\displaystyle\sum_{i=1}^{n} \frac{\#\mathit{correct}(t, DPN(t)_i)}{\#\mathit{total}(t, DPN(t)_i)}}{n}
\]

#correct(t, DPN) = number of moves in both without incorrect write operations for t in the alignments between each log trace and DPN

#total(t, DPN) = number of moves for t in the alignments of each log trace and DPN
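A minimal sketch of this projection, following the formula exactly as it appears above. The input format, a list of (#correct, #total) pairs with one pair per fragment containing t, and the function name are assumptions; every #total is assumed to be greater than zero.

```python
def fit_transition(counts):
    """Evaluate the slide's formula for one transition t.

    `counts` holds one (correct, total) pair per fragment in which t occurs:
    correct = moves in both without incorrect write operations for t,
    total   = all moves for t in the alignments with that fragment.
    """
    n = len(counts)
    ratio_sum = sum(correct / total for correct, total in counts)
    return 1 - ratio_sum / n


# Hypothetical counts for a transition that occurs in two fragments.
value = fit_transition([(8, 10), (5, 10)])
```

For these counts the value is 1 - (0.8 + 0.5) / 2 = 0.35. As written, the formula gives 0 for a transition whose moves are all correct in every fragment.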

Experiments with a real-life process

• Process enacted by a Dutch governmental agency to deal with unemployment benefits

• Checked the conformance against an event log consisting of 111 traces

• Process model validated with process analysts of the agency

Without Decomposition: 52891 seconds

With Decomposition: 52.94 seconds (-99.9%)
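As a quick check of the reported reduction, using only the two timings above:

```latex
\[
1 - \frac{52.94}{52891} \approx 0.999
\qquad\text{and}\qquad
\frac{52891}{52.94} \approx 999
\]
```

That is, the decomposed run takes about 0.1% of the original time, a speed-up of roughly three orders of magnitude.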

Conclusion

• Finding an alignment is exponential in the model size
• To speed up the computation:
  1. Decompose the model into submodels
  2. Align each trace with each submodel
• The decomposition needs to be valid: any trace fits the entire model if and only if it fits all smaller fragments.