Process Mining: Control-Flow Mining Algorithms technologie management 2 Process Mining • Short...
Transcript of Process Mining: Control-Flow Mining Algorithms technologie management 2 Process Mining • Short...
/faculteit technologie management 1
Process Mining: ControlProcess Mining: Control--Flow Flow Mining AlgorithmsMining Algorithms
Ana Karla Ana Karla AlvesAlves de Medeirosde Medeiros
Eindhoven University of TechnologyDepartment of Information Systems
/faculteit technologie management 2
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 3
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 4
Process Mining
1. Register at hospital2. Talk to doctor3. Take blood test4. Take X-ray5. Get results6. Talk to doctor7. Go home
1. Register at hospital2. Talk to doctor3. Undergo surgery4. Take medicines5. Take blood sample6. Talk to doctor7. Go home
1. Register at hospital2. Talk to doctor3. Undergo surgery4. Take medicines5. Take blood sample6. Talk to doctor7. Go home
EventEventLogsLogs
1. Register at hospital2. Talk to doctor3. Undergo surgery4. Take medicines5. Take blood sample6. Talk to doctor7. Go home
1. Register at hospital
2. Talk to doctor3. Undergo surgery4. Take medicines5. Take blood
sample6. Talk to doctor7. Go home
1. Register at hospital
2. Talk to doctor3. Undergo surgery4. Take medicines5. Take blood
sample6. Talk to doctor7. Go home
/faculteit technologie management 5
Process Mining Start
Register order
Prepareshipment
Ship goods
(Re)send bill
Receive paymentContactcustomer
Archive order
End
Performance Performance AnalysisAnalysis
ProcessProcess ModelModel OrganizationalOrganizational ModelModel
SocialSocial NetworkNetwork
EventEventLogLog
MiningMiningTechniquesTechniques
Auditing/SecurityAuditing/Security
MinedMinedModelsModels
/faculteit technologie management 6
Event Logs are Everywhere!
aaaaaa aaaaaa aaaaaa aaaaaa
aaaaaa aaaaaa aaaaaa aaaaaa
Start
Get Ready
Travel by CarTravel by Train
BETA PhD Day Starts
Visit Brewery
Have Dinner
Go Home
Travel by Train Pay for Parking
Travel by Car
End
Give a Talk
Machines, Machines, MunicipalitiesMunicipalities, , AirportsAirports,,Internet, Internet, HospitalsHospitals, etc., etc.
/faculteit technologie management 7
Tools
• www.processmining.org
• ProM 4.2
• ProMimport
• Free tools!
/faculteit technologie management 8
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 9
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 11
Start
Register order
Prepareshipment
Ship goods
(Re)send bill
Receive paymentContactcustomer
Archive order
End
ProcessProcess ModelModel
OrganizationalOrganizational ModelModel
SocialSocial NetworkNetwork
Types of Algorithms
/faculteit technologie management 12
Auditing/SecurityAuditing/Security
Start
Register order
Prepareshipment
Ship goods
(Re)send bill
Receive paymentContactcustomer
Archive order
End
ComplianceComplianceProcessProcess ModelModel
Types of Algorithms
/faculteit technologie management 13
Start
Register order
Prepareshipment
Ship goods
(Re)send bill
Receive paymentContactcustomer
Archive order
End
Bottlenecks/Bottlenecks/Business Business RulesRulesProcessProcess ModelModel
Performance Performance AnalysisAnalysis
Types of Algorithms
/faculteit technologie management 14
Process Mining
EventEventLogLog
DiscoveryDiscovery TechniquesTechniques: : ControlControl--FlowFlow MiningMining
MinedMinedModelModel
1. Start2. Get Ready3. Travel by Train4. Beta Event Starts5. Visit Brewery6. Have Dinner7. Go Home8. Travel by Train
1. Start2. Get Ready3. Travel by Train4. Beta Event Starts5. Give a Talk6. Visit Brewery7. Have Dinner8. Go Home9. Travel by Train
1. Start2. Get Ready3. Travel by Car4. Beta Event Starts5. Give a Talk6. Visit Brewery7. Have Dinner8. Go Home9. Pay Parking10. Travel by Car
1. Start2. Get Ready3. Travel by Car4. Conference Starts5. Join Reception6. Have Dinner7. Go Home8. Pay Parking9. Travel by Car10. End
Start
Get Ready
Travel by CarTravel by Train
BETA PhD Day Starts
Visit Brewery
Have Dinner
Go Home
Travel by Train Pay for Parking
Travel by Car
End
Give a Talk
StartStart
GetGet ReadyReady
Travel Travel bybyTrainTrain
Travel Travel bybyCarCar
Conference StartsConference Starts
GiveGive a Talka Talk
Join ReceptionJoin Reception
Have Have DinnerDinner
Go HomeGo Home
Travel Travel bybyTrainTrain
Travel Travel bybyCarCar
PayPayParkingParking
EndEnd
/faculteit technologie management 15
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 16
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 17
Common Constructs
• Sequence
• Splits
• Joins
• Loops
• Non-Free
Choice
• Invisible Tasks
• Duplicate
Tasks
PayPayParkingParking
Get Ready
Travel by CarTravel by Train
Defence Starts
Ask Question
Defence Ends
Go Home
Travel by Train Pay for Parking
Travel by Car
Give a Talk
Have Drinks
GetGetReadyReady
Travel Travel bybyTrainTrain
Travel Travel bybyCarCar
DefenseDefense StartsStarts
GiveGive a Talka Talk
AskAsk QuestionQuestion
DefenseDefense EndsEnds
Go HomeGo Home
Travel Travel bybyTrainTrain
Travel Travel bybyCarCar
Have Have DrinksDrinks
/faculteit technologie management 18
Common Constructs
• Sequence
• Splits
• Joins
• Loops
• Non-Free
Choice
• Invisible Tasks
• Duplicate
Tasks
PayPayParkingParking
Get Ready
Travel by CarTravel by Train
Defence Starts
Ask Question
Defence Ends
Go Home
Travel by Train Pay for Parking
Travel by Car
Give a Talk
Have Drinks
GetGetReadyReady
Travel Travel bybyTrainTrain
Travel Travel bybyCarCar
DefenseDefense StartsStarts
GiveGive a Talka Talk
AskAsk QuestionQuestion
DefenseDefense EndsEnds
Go HomeGo Home
Travel Travel bybyTrainTrain
Travel Travel bybyCarCar
Have Have DrinksDrinks
+ noise in logs!
/faculteit technologie management 19
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 20
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 21
Event Log: Mining XML (MXML)
The notion The notion of which of which tasks tasks belong to a belong to a same same instance is instance is crucial for crucial for applying applying process process mining mining techniques!techniques!
/faculteit technologie management 23
Event Log: Mining XML (MXML)
Compulsory Compulsory fields!fields!
/faculteit technologie management 24
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 25
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 26
α-algorithm
1.
Read a log2.
Get the set of tasks
3.
Infer the ordering relations4.
Build the net based on inferred relations
5.
Output the net
Core Step!Core Step!
/faculteit technologie management 27
• Direct succession: x>y iff
for some
case x is directly followed by y.
• Causality: x→y iff
x>y and not y>x.• Parallel: x||y iff
x>y
and y>x• Unrelated: x#y iff
not x>y and not y>x.
case 1 : task A case 1 : task A case 2 : task A case 2 : task A case 3 : task A case 3 : task A case 3 : task B case 3 : task B case 1 : task B case 1 : task B case 1 : task C case 1 : task C case 2 : task C case 2 : task C case 4 : task A case 4 : task A case 2 : task B case 2 : task B ......
A>BA>BA>CA>CB>CB>CB>DB>DC>BC>BC>DC>DE>FE>F
AA→→BBAA→→CCBB→→DDCC→→DDEE→→FF
B||CB||CC||BC||B
ABCDABCDACBDACBDEFEF
α-algorithm - Ordering Relations
>,→,||,#
/faculteit technologie management 28
Let W be a workflow log over T. α(W) is defined as follows. 1.
TW
= { t ∈ T | ∃σ ∈ W
t ∈ σ}, 2.
TI
= { t ∈ T | ∃σ ∈ W
t = first(σ) }, 3.
TO
= { t ∈ T | ∃σ ∈ W
t = last(σ) }, 4.
XW
= { (A,B) |
A ⊆ TW
∧ B ⊆
TW
∧ ∀a ∈
A
∀b ∈
B
a →W
b ∧ ∀a1,a2 ∈
A
a1
#W
a2
∧ ∀b1,b2 ∈
B
b1
#W
b2
}, 5.
YW
= { (A,B) ∈ X | ∀(A′,B′) ∈
X
A ⊆ A′ ∧B ⊆
B′⇒
(A,B) = (A′,B′) },
6. PW
= { p(A,B)
|
(A,B) ∈ YW
} ∪{iW
,oW
}, 7.
FW
= { (a,p(A,B)
) |
(A,B) ∈ YW
∧ a ∈
A } ∪
{ (p(A,B)
,b) |
(A,B) ∈ YW
∧ b
∈ B } ∪{ (iW
,t) |
t ∈ TI
} ∪{ (t,oW
) | t ∈
TO
}, and 8.
α(W) = (PW
,TW
,FW
).
α-algorithm - Formalization
/faculteit technologie management 29
A
B
C
D
E F
ABCDABCDACBDACBDEFEF
AA→→BBAA→→CCBB→→DDCC→→DDEE→→FF
B||CB||CC||BC||B
α-algorithm - Insight
/faculteit technologie management 30
• If log is complete with respect to relation >, it can be used to mine SWF-net
without short loops
• Structured Workflow Nets (SWF-nets) have no implicit places and the following two constructs cannot be used:
α-algorithm – Log properties + target nets
/faculteit technologie management 31
One-length
Two-length
B>B and not B>B implies B→B (impossible!)
A>B and B>A implies A||B and B||A instead of A→B and B→A
α-algorithm – No short loops Why no
short loops?
/faculteit technologie management 32
•
No invisible tasks, non-free-choice or duplicate tasks•
No noisy logs
α-algorithm – Common Constructs
Why no duplicate
tasks?
Why not invible tasks?
Why noise-free
logs?
/faculteit technologie management 33
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Algorithm
• Fuzzy Miner
/faculteit technologie management 34
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Algorithm
• Fuzzy Miner
/faculteit technologie management 35
Heuristics Miner
1.
Read a log2.
Get the set of tasks
3.
Infer the ordering relations based on their frequencies
4.
Build the net based on inferred relations5.
Output the net
/faculteit technologie management 36
Heuristics Miner
Insight:The more frequently a task A directly follows another task B, and the less frequently the opposite occurs, the higher the probability that A causally follows B!
/faculteit technologie management 37
•
No non-free-choice or duplicate tasks•
Robust to invisible tasks and noisy logs
α-algorithm – Common Constructs
Why no non-free- choice?
Why no guarantee whole net will be correct?
/faculteit technologie management 38
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 39
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 40
Genetic Process
Mining
(GPM)
• Genetic
Algorithms
+ Process
Mining
• Genetic
Algorithms
– Search technique
that
mimics
the process
of evolution
in biological systems
• Advantages–
Tackle all common
structural
constructs
– Robust
to noise
• Disadvantages–
Computational
Time
/faculteit technologie management 41
Genetic Process
Mining
(GPM)
Algorithm:
Internal Representation
Fitness MeasureGenetic
Operators
/faculteit technologie management 42
GPM – Fitness Measure
Start
Get Ready
Travel by CarTravel by Train
BETA PhD Day Starts
Visit Brewery
Have Dinner
Go Home
Travel by Train Pay for Parking
Travel by Car
End
Give a Talk
StartStart
GetGet ReadyReady
Travel Travel bybyTrainTrain
Travel Travel bybyCarCar
Conference StartsConference Starts
GiveGive a Talka Talk
VisitVisit BreweryBrewery
Have Have DinnerDinner
Go HomeGo Home
Travel Travel bybyTrainTrain
Travel Travel bybyCarCar
PayPayParkingParking
EndEnd
• Guidesthe search!
/faculteit technologie management 43
GPM – Fitness Measure
Start
Get Ready
Travel by CarTravel by Train
BETA PhD Day Starts
Visit Brewery
Have Dinner
Go Home
Travel by Train Pay for Parking
Travel by Car
End
Give a Talk
/faculteit technologie management 44
GPM – Fitness Measure
Start
Get Ready
Travel by CarTravel by Train
BETA PhD Day Starts
Visit Brewery
Have Dinner
Go Home
Travel by Train Pay for Parking
Travel by Car
End
Give a Talk
Start
Get Ready
Travel by TrainBETA PhD Day Starts
Visit Brewery
Have Dinner
Go Home
Pay for Parking
Travel by Car
End
Give a Talk
StartStart
GetGet ReadyReady
Travel Travel bybyCarCar
Conference StartsConference Starts
GiveGive a Talka Talk
VisitVisit BreweryBrewery
Have Have DinnerDinner
Go HomeGo Home
PayPayParkingParking
EndEnd
Travel Travel bybyTrainTrain
Punish for the amount of enabled tasks during the parsing!
Overgeneral solution
/faculteit technologie management 45
GPM – Fitness Measure
Start
Get Ready
Travel by CarTravel by Train
BETA PhD Day Starts
Visit Brewery
Have Dinner
Go Home
Travel by Train Pay for Parking
Travel by Car
End
Give a Talk
Start
Get Ready
Travel by TrainBETA PhD Day Starts
Visit Brewery
Have Dinner
Go Home
Pay for Parking
Travel by Car
End
Give a Talk
Start
Get Ready
Travel by CarTravel by Train
BETA PhD Day Starts
Visit Brewery
Have Dinner
Go Home
Travel by Train Pay for Parking
Travel by Car
End
Give a Talk
StartStart Start
Get ReadyGet Ready Get Ready
Travel by Train Travel by Car
BETA PhD Day StartsBETA PhD Day Starts BETA PhD Day Starts
Give a Talk
Visit BreweryVisit Brewery Visit Brewery
Have DinnerHave Dinner Have Dinner
Go HomeGo HomeGo Home
Travel by Train Pay for Parking
Travel by Car
EndEnd End
StartStart
GetGet ReadyReady
Travel Travel byby CarCar
Conference StartsConference Starts
GiveGive a Talka Talk
VisitVisit BreweryBrewery
Have Have DinnerDinner
Go HomeGo Home
Travel Travel byby TrainTrain
Travel Travel byby CarCar
PayPay ParkingParking
EndEnd
Travel Travel byby TrainTrain
Punish for the amount of duplicate tasks with common input/output tasks!
Overspecific solution
/faculteit technologie management 46
GPM – DGA ProM
Plug-in
Why does the GA Miner takes so
much time?
How could we improve its running time without
changing the code?
/faculteit technologie management 47
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 48
Process Mining
• Short Recap
• Types of Process Mining Algorithms
• Common Constructs
• Input Format
• α-algorithm
• Heuristics Miner
• Genetic Miner
• Fuzzy Miner
/faculteit technologie management 52
Fuzzy Miner
No “Ask Question” or “Give Talk”!
All details!
Abstracting even more from details!