Open Lecture Wil van der Aalst

94
Process Mining On the Interplay Between Data Science and Behavioral Science Open Lecture High Tech Campus Eindhoven, 11-12-2014 Wil van der Aalst Scientific director of the DSC/e

Transcript of Open Lecture Wil van der Aalst

Page 1: Open Lecture Wil van der Aalst

Process Mining On the Interplay Between Data

Science and Behavioral Science

Open Lecture High Tech Campus Eindhoven, 11-12-2014

Wil van der Aalst Scientific director of the DSC/e

Page 2: Open Lecture Wil van der Aalst

https://www.coursera.org/course/procmin

A new profession

is emerging, just

like computer

science in the

early 1980-ties!

Page 3: Open Lecture Wil van der Aalst

https://www.coursera.org/course/procmin

Requires a combination

of competences!

Page 4: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

data mining

processmining visualization

data science

behavioral/social

sciences

domainknowledge

machine learning

large scale distributedcomputing

statistics industrialengineering

databasesstochastics

privacy

algorithms

visual analytics

DSC/e Data Science Competences

Page 5: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

data mining

processmining visualization

behavioral/social

sciences

domainknowledge

machine learning

large scale distributedcomputing

industrialengineering

databases

privacy

algorithms

visual analytics

data science

statistics

stochastics

Statistics and Stochastics

Page 6: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

data mining

processmining visualization

data science

behavioral/social

sciences

domainknowledge

machine learning

large scale distributedcomputing

statistics industrialengineering

databasesstochastics

privacy

algorithms

visual analytics

Data Mining and Machine Learning

Page 7: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

data mining

processmining visualization

data science

behavioral/social

sciences

domainknowledge

machine learning

large scale distributedcomputing

statistics industrialengineering

databasesstochastics

privacy

algorithms

visual analytics

Process Mining

Page 8: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

data mining

processmining visualization

data science

behavioral/social

sciences

domainknowledge

machine learning

large scale distributedcomputing

statistics industrialengineering

databasesstochastics

privacy

algorithms

visual analytics

Large Scale Distributed Computing

Page 9: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

data mining

processmining visualization

data science

behavioral/social

sciences

domainknowledge

machine learning

large scale distributedcomputing

statistics industrialengineering

databasesstochastics

privacy

algorithms

visual analytics

Visualization and Visual Analytics

Page 10: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

data mining

processmining visualization

data science

behavioral/social

sciences

domainknowledge

machine learning

large scale distributedcomputing

statistics industrialengineering

databasesstochastics

privacy

algorithms

visual analytics

Domain Knowledge

Page 11: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

data mining

processmining visualization

data science domain

knowledge

machine learning

large scale distributedcomputing

statistics industrialengineering

behavioral/social

sciences

databasesstochastics

privacy

algorithms

visual analytics

Behavioral/Social Sciences & Privacy

Page 12: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

data mining

processmining visualization

data science

behavioral/social

sciences

domainknowledge

machine learning

large scale distributedcomputing

statistics industrialengineering

databasesstochastics

privacy

algorithms

visual analytics

Industrial Engineering

Page 13: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

data mining

processmining visualization

data science

behavioral/social

sciences

domainknowledge

machine learning

large scale distributedcomputing

statistics industrialengineering

databasesstochastics

privacy

algorithms

visual analytics

Databases & Algorithms

Page 14: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Internet of Events

Page 15: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Internet of Events: 4 sources of event data

Internet of Events

Page 16: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Internet of Events: 4 sources of event data

Internet of Content

Internet of Events

“Big Data”

Page 17: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Internet of Events: 4 sources of event data

Internet of Content

Internet of People

“social”

Internet of Events

“Big Data”

Page 18: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Internet of Events: 4 sources of event data

Internet of Content

Internet of People

“social”

Internet of Things

“cloud”

Internet of Events

“Big Data”

Page 19: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Internet of Events: 4 sources of event data

Internet of Content

Internet of People

“social”

Internet of Things

Internet of Places

“cloud” “mobility”

Internet of Events

“Big Data”

Page 20: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

"Der Datenflüsterer"

Building a relationship with data

Page 21: Open Lecture Wil van der Aalst
Page 22: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

DSC/e: Competences and Research Programs 28 groups involved

Context: Why are we using data science, does it have the intended effect, and will

people accept it?

Analysis: How to turn data into real value (models, answers/decisions, and

visualizations/insights)?

Enabling technologies: How to get the data and deal with computational/

infrastructural challenges (big data and hard questions)?

Probability and Statistics

Stochastic Networks

Data Mining

Process Mining

Visualization

Large-Scale Distributed Systems

Data-Intensive Algorithms

Data-Driven Operations Management

Data-Driven Innovation and Business

Human and Social Analytics

Privacy, Security, Ethics, and Governance

Internet of Things

syst

ems

infrastructurescit

ies

organ

ization

s

people

[RP1] Process Analytics: Improving Service While Cutting Costs

[RP2] Customer Journey: Correlating Events to Learn and Influence Customer Behavior

[RP3] Smart Maintenance & Diagnostics: Safeguarding Availability

[RP4] Quantified Self: Improving Performance and Well-Being

[RP5] Data Value and Privacy: Economic and Legal Aspects of Data Science

[RP6] Smart Cities: Ensuring Safety and Convenience for Citizens

[RP7] Smart Grids: Data Intensive Infrastructures

[RP1] Process Analytics: Improving Service While Cutting Costs

[RP2] Customer Journey: Correlating Events to Learn and Influence Customer Behavior

[RP3] Smart Maintenance & Diagnostics: Safeguarding Availability

[RP4] Quantified Self: Improving Performance and Well-Being

[RP5] Data Value and Privacy: Economic and Legal Aspects of Data Science

[RP6] Smart Cities: Ensuring Safety and Convenience for Citizens

[RP7] Smart Grids: Data Intensive Infrastructures

Page 23: Open Lecture Wil van der Aalst

Data Science Flagship (Philips & DSC/e)

• 4 Strategic topics

• 4 TU/e departments

• 16 PhD students

• 30 Data science specialists

1. Data Driven Value Propositions

2. Healthcare Smart Maintenance

3. Optimizing Healthcare Workflows

4. Continuous Personal Health

Page 24: Open Lecture Wil van der Aalst

Process Mining Let's Play

Page 25: Open Lecture Wil van der Aalst

data mining

processmining

visualization

data science

behavioral/social

sciences

domainknowledge

machine learning

large scale distributedcomputing

statistics industrialengineering

databasesstochastics

privacy

algorithms

visual analytics

Page 26: Open Lecture Wil van der Aalst

data mining

processmining

visualization

data science

behavioral/social

sciences

domainknowledge

machine learning

large scale distributedcomputing

statistics industrialengineering

databasesstochastics

privacy

algorithms

visual analytics

formal methods

business process

management

concurrency

business process re-engineering

process science

modelchecking Petri nets

BPMN

Page 27: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

Case Activity Timestamp Resource

432 register travel request (a) 18-3-2014:9.15 John

432 get support from local manager (b) 18-3-2014:9.25 Mary

432 check budget by finance (d) 19-3-2014:8.55 John

432 decide (e) 19-3-2014:9.36 Sue

432 accept request (g) 19-3-2014:9.48 Mary

Play-In

Play-Out

Replay

Let's play

Page 28: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Play-Out

register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

Case Activity Timestamp Resource

432 register travel request (a) 18-3-2014:9.15 John

432 get support from local manager (b) 18-3-2014:9.25 Mary

432 check budget by finance (d) 19-3-2014:8.55 John

432 decide (e) 19-3-2014:9.36 Sue

432 accept request (g) 19-3-2014:9.48 Mary

Page 29: Open Lecture Wil van der Aalst

Play Out: A possible scenario

register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

AND-

split

XOR-

split

XOR-

join

XOR-

join

AND-

join

XOR-

split

XOR-

join

a b d e g

Case Activity Timestamp Resource

432 register travel request (a) 18-3-2014:9.15 John

432 get support from local manager (b) 18-3-2014:9.25 Mary

432 check budget by finance (d) 19-3-2014:8.55 John

432 decide (e) 19-3-2014:9.36 Sue

432 accept request (g) 19-3-2014:9.48 Mary

Page 30: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Play Out: Another scenario

register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

a c d e h f b d e

Page 31: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Play Out: Process model allows for many

more scenarios

register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

abdeg abcefbdeh

adceg adbeh

acbefbdeh acdefcdefbdeh

adcefcdefbdefbdeg abdeg adceh adbeh

acbefbdeg acdefcdefbdeh adcefcdefbdefbdeg

abdeg

adceh

adbeh

acbefbdeg acdefcdefbdeh adcefcdefbdefbdeg

abdeg

Page 32: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Play-In

register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

Case Activity Timestamp Resource

432 register travel request (a) 18-3-2014:9.15 John

432 get support from local manager (b) 18-3-2014:9.25 Mary

432 check budget by finance (d) 19-3-2014:8.55 John

432 decide (e) 19-3-2014:9.36 Sue

432 accept request (g) 19-3-2014:9.48 Mary

Page 33: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Loesje van

der Aalst

desire line

Page 34: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Play In: Simple process allowing for 4 traces

abdeg adbeg

abdeh

adbeh

abdeg adbeg

abdeh adbeh abdeh abdeh

abdeh

adbeh

adbeh

register travel

request (a)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

start end

Page 35: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Play In: Process allowing for more traces

abdeg abcefbdeh

adceg adbeh

acbefbdeh acdefcdefbdeh adcefcdefbdefbdeg abdeg adceh adbeh

acbefbdeg acdefcdefbdeh adcefcdefbdefbdeg

abdeg adceh

adbeh acbefbdeg

acdefcdefbdeh

adcefcdefbdefbdeg abdeg

register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

Page 36: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

No modeling needed!

Page 37: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Example Process Discovery (Dutch housing agency, 208 cases, 5987 events)

Page 38: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Replay register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

Case Activity Timestamp Resource

432 register travel request (a) 18-3-2014:9.15 John

432 get support from local manager (b) 18-3-2014:9.25 Mary

432 check budget by finance (d) 19-3-2014:8.55 John

432 decide (e) 19-3-2014:9.36 Sue

432 accept request (g) 19-3-2014:9.48 Mary

Page 39: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

event data process

model

Page 40: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

desire line

very safe

system

Page 41: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Replay

register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

a c d e g

Page 42: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Replay

register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

a c check budget (d)

is missing!

e g ?

Page 43: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Replay

register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

a c d e g h reject request (h) is

impossible ?

Page 44: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Conformance Checking (WOZ objections Dutch municipality, 745 objections, 9583 event, f= 0.988)

Page 45: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Replay with timestamps

register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

a9.15 c9.20 d9.35 e10.15 g11.30

9.15

9.20

9.35

10.15

11.30

5 55

20 40

75

Page 46: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Replay with timestamps

register travel

request (a)

get detailed motivation letter (c)

get support from local

manager (b)

check budget by finance (d)

decide (e)

accept request (g)

reject request (h)

reinitiate request (f)

start end

5 55

20 40

75

15

20

60

45

65

20

25

45

55

50

5 55

20 40

75

15

20

60

45

65

20

25

45

55

50

5

55 20

40

75

15

20

60

45

65

20

25

45

55

50

Page 47: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Performance Analysis Using Replay (WOZ objections Dutch municipality, 745 objections, 9583 event, f= 0.988)

Page 48: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

software

system

(process)

model

event

logs

models

analyzes

discovery

records

events, e.g.,

messages,

transactions,

etc.

specifies

configures

implements

analyzes

supports/

controls

enhancement

conformance

“world”

people machines

organizations

components

business

processes

Overview

Play-In

Play-Out

Replay

Page 49: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Process Mining Software

Page 50: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

600+ plug-ins available covering the

whole process mining spectrum

Page 51: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements) ©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Page 52: Open Lecture Wil van der Aalst

Process Discovery

Page 53: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Starting point for process mining:

Event data

student name course name exam date mark

Peter Jones Business Information systems 16-1-2014 8

Sandy Scott Business Information systems 16-1-2014 5

Bridget White Business Information systems 16-1-2014 9

John Anderson Business Information systems 16-1-2014 8

Sandy Scott BPM Systems 17-1-2014 7

Bridget White BPM Systems 17-1-2014 8

Sandy Scott Process Mining 20-1-2014 5

Bridget White Process Mining 20-1-2014 9

John Anderson Process Mining 20-1-2014 8

… … … …

case id activity name timestamp other data

every row is an event

(here: an exam attempt)

Page 54: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Another event log: patient treatment

patient activity timestamp doctor age cost

5781 make X-ray [email protected] Dr. Jones 45 70.00

5541 blood test [email protected] Dr. Scott 61 40.00

5833 blood test [email protected] Dr. Scott 24 40.00

5781 blood test [email protected] Dr. Scott 45 40.00

5781 CT scan [email protected] Dr. Fox 45 1200.00

5833 surgery [email protected] Dr. Scott 24 2300.00

5781 handle payment [email protected] Carol Hope 45 0.00

5541 radiation therapy [email protected] Dr. Jones 61 140.00

5541 radiation therapy [email protected] Dr. Jones 61 140.00

… … … … … …

case id activity name timestamp other data resource

Page 55: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Another event log: order handling

order

number

activity timestamp user product quantity

9901 register order [email protected] Sara Jones iPhone5S 1

9902 register order [email protected] Sara Jones iPhone5S 2

9903 register order [email protected] Sara Jones iPhone4S 1

9901 check stock [email protected] Pete Scott iPhone5S 1

9901 ship order [email protected] Sue Fox iPhone5S 1

9903 check stock [email protected] Pete Scott iPhone4S 1

9901 handle payment [email protected] Carol Hope iPhone5S 1

9902 check stock [email protected] Pete Scott iPhone5S 2

9902 cancel order [email protected] Carol Hope iPhone5S 2

… … … … … …

case id activity name timestamp other data resource

Page 56: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Simplifying event logs when focusing on

control-flow

order

number

activity timestamp user product quantity

9901 register order [email protected] Sara Jones iPhone5S 1

9902 register order [email protected] Sara Jones iPhone5S 2

9903 register order [email protected] Sara Jones iPhone4S 1

9901 check stock [email protected] Pete Scott iPhone5S 1

9901 ship order [email protected] Sue Fox iPhone5S 1

9903 check stock [email protected] Pete Scott iPhone4S 1

9901 handle payment [email protected] Carol Hope iPhone5S 1

9902 check stock [email protected] Pete Scott iPhone5S 2

9902 cancel order [email protected] Carol Hope iPhone5S 2

… … … … … … [ register_order, check_stock,ship_order,handle_payment,

register_order, check_stock,cancel_order,

register_order, check_stock , …]

Page 57: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Simple event log

• An event log is a multiset of traces (same trace

may appear multiple times).

• A trace is a sequence of activity names (we

abstract from all other attributes, but events are

ordered).

Page 58: Open Lecture Wil van der Aalst

alpha

mining

Page 59: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Goal of Alpha algorithm

a

b

c

de

p2

end

p4

p3p1

start

Event log contains all possible

traces of model and vice versa.

Page 60: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Another example

a

b

c

df

p2

end

p4

p3p1

start

e

p5

Generalization: event

log contains only

subset of all possible

traces of model.

Page 61: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Notation is less relevant (e.g. BPMN)

a

b

c

de

p2

end

p4

p3p1

start

a

start end

b

c

e

d

Page 62: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Another BPMN example

a

b

c

df

p2

end

p4

p3p1

start

e

p5

a

start end

b

c

e

d

f

Page 63: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

>,,||,# relations

• Direct succession: x>y iff

for some case x is directly

followed by y.

• Causality: xy iff x>y and

not y>x.

• Parallel: x||y iff x>y and

y>x

• Choice: x#y iff not x>y and

not y>x.

a>b

a>c

a>e

b>c

b>d

c>b

c>d

e>d

ab

ac

ae

bd

cd

ed b||c

c||b

abcd

acbd

aed

b#e

e#b

c#e

a#d

Page 64: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Basic Idea Used by Alpha Algorithm (1)

a b

(a) sequence pattern: a→b

Page 65: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Basic Idea Used by Alpha Algorithm (2)

a

b

c

(b) XOR-split pattern:

a→b, a→c, and b#c

a

b

c

(b) XOR-split pattern:

a→b, a→c, and b#c

b

c

d

(c) XOR-join pattern:

b→d, c→d, and b#c

Page 66: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Basic Idea Used by Alpha Algorithm (3)

a

b

c

(d) AND-split pattern:

a→b, a→c, and b||c

b

c

d

(e) AND-join pattern:

b→d, c→d, and b||c

a

b

c

(d) AND-split pattern:

a→b, a→c, and b||c

Page 67: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Example Revisited

Result produced by

the Alpha algorithm

a>b

a>c

a>e

b>c

b>d

c>b

c>d

e>d

ab

ac

ae

bd

cd

ed

b||c

c||b

b#e

e#b

c#e

a#d

a

b

c

de

p2

end

p4

p3p1

start

Page 68: Open Lecture Wil van der Aalst

inductive

mining

Page 69: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Split event logs based on activity labels

abdef

acdef

adbef

adcef

abdeg

acdeg

adbeg

adceg

Page 70: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Split {a,b,c,d,e,f,g,h} into {a,b,c,d} and {e,f,g}

using sequence decomposition

abdef

acdef

adbef

adcef

abdeg

acdeg

adbeg

adceg

Page 71: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Result

abd

acd

adb

adc

abd

acd

adb

adc

ef

ef

ef

ef

eg

eg

eg

eg

seq

abdef

acdef

adbef

adcef

abdeg

acdeg

adbeg

adceg

Page 72: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Split {a,b,c,d} into {a} and {b,c,d} using

sequence decomposition

abd

acd

adb

adc

abd

acd

adb

adc

ef

ef

ef

ef

eg

eg

eg

eg

seq

Page 73: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Result

bd

cd

db

dc

bd

cd

db

dc

ef

ef

ef

ef

eg

eg

eg

eg

seq

a

a

a

a

a

a

a

a

seq

Page 74: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Split {e,f,g} into {e} and {f,g} using sequence

decomposition

bd

cd

db

dc

bd

cd

db

dc

ef

ef

ef

ef

eg

eg

eg

eg

seq

a

a

a

a

a

a

a

a

seq

Page 75: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Result

bd

cd

db

dc

bd

cd

db

dc

e

e

e

e

e

e

e

e

seq

a

a

a

a

a

a

a

a

seq seq

f

f

f

f

g

g

g

g

Page 76: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Split {f,g} into {f} and {g} using XOR

decomposition

bd

cd

db

dc

bd

cd

db

dc

e

e

e

e

e

e

e

e

seq

a

a

a

a

a

a

a

a

seq seq

f

f

f

f

g

g

g

g

Page 77: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Result

bd

cd

db

dc

bd

cd

db

dc

e

e

e

e

e

e

e

e

seq

a

a

a

a

a

a

a

a

seq seq

f

f

f

f

g

g

g

g

xor

Page 78: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Split {b,c,d} into {b,c} and {d} using AND

decomposition

bd

cd

db

dc

bd

cd

db

dc

e

e

e

e

e

e

e

e

seq

a

a

a

a

a

a

a

a

seq seq

f

f

f

f

g

g

g

g

xor

Page 79: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Result

d

d

d

d

d

d

d

d

e

e

e

e

e

e

e

e

seq

a

a

a

a

a

a

a

a

seq seq

f

f

f

f

g

g

g

g

xor

b

c

b

c

b

c

b

c

par

Page 80: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Split {b,c} into {b} and {c} using XOR

decomposition

d

d

d

d

d

d

d

d

e

e

e

e

e

e

e

e

seq

a

a

a

a

a

a

a

a

seq seq

f

f

f

f

g

g

g

g

xor

b

c

b

c

b

c

b

c

par

Page 81: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Result

d

d

d

d

d

d

d

d

e

e

e

e

e

e

e

e

seq

a

a

a

a

a

a

a

a

seq seq par

b

b

b

b

c

c

c

c

xor …

no further decomposition is possible

Page 82: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Process tree

abdefacdefadbefadcefabdegacdegadbegadceg

seq

abdacdadbadc

efeg

a

bdcddbdc

seq

seq

e

fg

xor

f

g

par

d

bc

xor

b

c

astart

b

c

d

e

f

gend

ac

b

de

f

gstart end

Page 83: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Page 84: Open Lecture Wil van der Aalst

Conclusion

Page 85: Open Lecture Wil van der Aalst

data science process science

Page 86: Open Lecture Wil van der Aalst

process mining

data-oriented analysis

process model analysis (simulation, verification,

optimization, gaming, etc.)

(data mining, machine learning,

business intelligence)

Page 87: Open Lecture Wil van der Aalst

process mining

data-oriented analysis

process model analysis

performance-oriented

questions, problems and

solutions

compliance-oriented

questions, problems and

solutions

(simulation, verification,

optimization, gaming, etc.)

(data mining, machine learning,

business intelligence)

Page 88: Open Lecture Wil van der Aalst

Process Mining

Data Science in Action

https://www.coursera.org/course/procmin

38.000+ people joined!

Page 89: Open Lecture Wil van der Aalst

ProM

Page 90: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Page 91: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements) ©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Page 92: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements) ©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Page 93: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements) ©Wil van der Aalst & TU/e (use only with permission & acknowledgements)

Page 94: Open Lecture Wil van der Aalst

©Wil van der Aalst & TU/e (use only with permission & acknowledgements) ©Wil van der Aalst & TU/e (use only with permission & acknowledgements)