Post on 15-Aug-2020
Data Scienceand Process Modelling
Frank Takes
LIACS, Leiden University
https://liacs.leidenuniv.nl/~takesfw/DSPM
Lecture 1 — Introduction
Frank Takes — DSPM — Lecture 1 — Introduction 1 / 51
DSPM (in short)
Data Science DS
Converting data into knowledge for a particular domain (e.g.,business)
Process Modelling PMMethods and techniques to systematically map and analyze businessprocesses
Frank Takes — DSPM — Lecture 1 — Introduction 2 / 51
DSPM (in short)
Data Science DSConverting data into knowledge for a particular domain (e.g.,business)
Process Modelling PMMethods and techniques to systematically map and analyze businessprocesses
Frank Takes — DSPM — Lecture 1 — Introduction 2 / 51
DSPM (in short)
Data Science DSConverting data into knowledge for a particular domain (e.g.,business)
Process Modelling PM
Methods and techniques to systematically map and analyze businessprocesses
Frank Takes — DSPM — Lecture 1 — Introduction 2 / 51
DSPM (in short)
Data Science DSConverting data into knowledge for a particular domain (e.g.,business)
Process Modelling PMMethods and techniques to systematically map and analyze businessprocesses
Frank Takes — DSPM — Lecture 1 — Introduction 2 / 51
BIPM (2018 and earlier)
Business Intelligence BIConverting raw company data into information that contributes to(knowledge on) business development
Process Modelling PMMethods and techniques to systematically map and analyze businessprocesses
Frank Takes — DSPM — Lecture 1 — Introduction 3 / 51
Course Introduction (Dutch)
Frank Takes — DSPM — Lecture 1 — Introduction 4 / 51
Vakinformatie
Voertaal: Nederlands (materiaal: grotendeels Engels)
3 feb 2019 t/m 13 mei 2019 (niet op 18 en 25 mrt en 22 apr)
Hoorcollege: maandag en vanaf maart woensdag van 11:15 tot 13:00
Werkcollege: maandag van 9.15 tot 11.00 en vanaf maart woensdagvan 13:30 tot 15:15 (!)
Vakinformatie: http://liacs.leidenuniv.nl/~takesfw/DSPM
Docent: dr. Frank Takes
f.w.takes@liacs.leidenuniv.nl
Snellius kamer 157b
Assistenten:
Gerrit-Jan de Bruin MSc, kamer 126b,g.j.de.bruin@liacs.leidenuniv.nl
Martijn Vlak BSc, m.vlak@umail.leidenuniv.nl
Frank Takes — DSPM — Lecture 1 — Introduction 5 / 51
Format
Materiaal: boek, slides en enkele wetenschappelijke artikelen
Schriftelijk tentamen: 9 juni 2020, 14.00–17.00 60%
Practicum met deliverables: 2 mrt, 13 apr, 18 mei 3× 13, �3%
Elk onderdeel moet voldoende (> 5.5) zijn.
Eindcijfer afgerond op basis van participatie naar dichtsbijzijndeelement in {1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10}Hertentamen: 9 juli 2020, 14.00–17.00
Practicum herkansingsdeadline: 17 juni 2020 (minus 2 punten)
7 ECTS
Frank Takes — DSPM — Lecture 1 — Introduction 6 / 51
Book
W. van der Aalst, Process Mining: Discovery, Conformance andEnhancement of Business Processes, 2nd edition, Springer, 2016.
These slides are partially based on the slides of the (previous editionof the) book
Frank Takes — DSPM — Lecture 1 — Introduction 7 / 51
Disclaimer
New course format since 2015, larger adjustments in 2019
Definitions and abbreviations . . .
Pictures and schemas . . .
Participation
Feedback always welcome!
Frank Takes — DSPM — Lecture 1 — Introduction 8 / 51
Topics
February/March: Data Science
April/May: Process Modelling
Frank Takes — DSPM — Lecture 1 — Introduction 9 / 51
Assignments
1 Data Science: gaming industry data visualization on the web
2 Data Science: predictive analytics in Python
3 Process Modelling: BPMN and Petri net tools
Frank Takes — DSPM — Lecture 1 — Introduction 10 / 51
Introduction to Data Scienceand Business Intelligence
Frank Takes — DSPM — Lecture 1 — Introduction 11 / 51
Business Intelligence
Business Intelligence: techniques and tools for the transformationof raw data into meaningful and useful information for businessanalysis purposes (Wikipedia.org)
Business Intelligence: a data analysis process aimed at boostingbusiness performance by helping corporate executives and other endusers make more informed decisions (TechTarget.com)
Business Intelligence: anything that aims at providing actionableinformation that can be used to support decision making (v.d. Aalst)
Frank Takes — DSPM — Lecture 1 — Introduction 12 / 51
Business Analysis
Business analysis: identifying business needs and determiningsolutions to business problems
Business architecture analysisData warehouse analysisBusiness process analysis
Role of IT: Extraction, Transformation and Loading (ETL)
Identifying changes that are required to achieve strategic goals
Business Analyst
Frank Takes — DSPM — Lecture 1 — Introduction 13 / 51
Business Analytics
Business Analytics: skills, technologies and practices for continuousiterative exploration and investigation of past business performance togain insight and drive business planning
Descriptive analytics: gain insight from historical dataPredictive analytics: predictive modeling using statistical andmachine learning techniquesPrescriptive analytics: recommend decisions using optimization,simulation, etc.
Domains: marketing, pricing, sales, accounting, supply chains,communication, . . .
Frank Takes — DSPM — Lecture 1 — Introduction 14 / 51
Data
Data: facts, measurements or text collected for reference or analysis(Oxford dictionary)
Unstructured data: data that does not fit a certain data structure(text, a list of numeric measurements)Structured data: data that fits a certain data structure(table, tree, graph/network, etc.)
Frank Takes — DSPM — Lecture 1 — Introduction 15 / 51
Moore’s Law & Transistors
http://visual.ly
Frank Takes — DSPM — Lecture 1 — Introduction 16 / 51
Moore’s Law & CPU’s
Frank Takes — DSPM — Lecture 1 — Introduction 17 / 51
Moore’s Law & Intelligence
http://frugicom.nl
Frank Takes — DSPM — Lecture 1 — Introduction 18 / 51
Mega Giga Tera Peta Exa Zetta Yotta
Frank Takes — DSPM — Lecture 1 — Introduction 19 / 51
Storage: Digital vs. Analog
Frank Takes — DSPM — Lecture 1 — Introduction 20 / 51
Moore’s Law & Data
Figure: Zettabytes produced per year
Frank Takes — DSPM — Lecture 1 — Introduction 21 / 51
Big Data
Frank Takes — DSPM — Lecture 1 — Introduction 22 / 51
Big Data
Frank Takes — DSPM — Lecture 1 — Introduction 23 / 51
Data Mining
Data explosion
“We are drowning in data, but starving for knowledge!”
Interpretation
Understanding
Learning
Acting
Frank Takes — DSPM — Lecture 1 — Introduction 24 / 51
Data Mining
Data → Information → Knowledge
Knowledge Discovery
Data Mining
Machine Learning
Unsupervised: clustering, pattern mining, etc.Supervised: classification, prediction, etc.
Big Data
Data Science
Frank Takes — DSPM — Lecture 1 — Introduction 25 / 51
DIKW Pyramid
Frank Takes — DSPM — Lecture 1 — Introduction 26 / 51
DIKWD Pyramid (2)
information4dummies.blogspot.comFrank Takes — DSPM — Lecture 1 — Introduction 27 / 51
Data Science
Frank Takes — DSPM — Lecture 1 — Introduction 28 / 51
Data Science
Frank Takes — DSPM — Lecture 1 — Introduction 29 / 51
Break?
Frank Takes — DSPM — Lecture 1 — Introduction 30 / 51
Introduction to Process Modelling
Frank Takes — DSPM — Lecture 1 — Introduction 31 / 51
Processes
Process: a set of related actions and transactions to achieve acertain objective
Service process: providing a value transferProduction process: providing a value creation
Various domains: organizations, environments, biological systems,economies, . . .
Frank Takes — DSPM — Lecture 1 — Introduction 32 / 51
Business Processes
Business process: a sequence of activities aimed at producingsomething of value for the business (Morgan)
Business process: collection of related, structured activities or tasksthat produce a specific service or product (serve a particular goal) fora particular customer or customers (Wikipedia)
Business Process Management: the discipline that combinesknowledge from information technology and knowledge frommanagement sciences and applies this to operational businessprocesses (v.d. Aalst)
Frank Takes — DSPM — Lecture 1 — Introduction 33 / 51
Process Modelling
Frank Takes — DSPM — Lecture 1 — Introduction 34 / 51
Types of Business Processes
Management processes: the processes that govern the operation ofa system, e.g., corporate governance and strategic management
Operational processes: processes that constitute core business andcreate the primary value stream, e.g., purchasing, manufacturing,advertising, marketing and sales
Supporting processes: processes that support the core business,e.g., accounting, recruitment, HR, customer support and ICT support
Frank Takes — DSPM — Lecture 1 — Introduction 35 / 51
Business Process Model
Model: abstract or schematic representation of reality
Modelling (task): creating a model, for example based on realobservations
Business Process Model: abstract representation of businessprocesses
Process model functions:
Descriptive: what is actually happening?Prescriptive: what should be happening?Explanatory: why is the process designed this way?
Frank Takes — DSPM — Lecture 1 — Introduction 36 / 51
Classical BPM Lifecycle
Frank Takes — DSPM — Lecture 1 — Introduction 37 / 51
Process Model Goals
Insight: while making a model, the modeler is triggered to view theprocess from various angles
Discussion: the stakeholders use models to structure discussions
Documentation: processes are documented for instructing people orcertification purposes
Verification: process models are analyzed to find errors in systems orprocedures (e.g., potential deadlocks)
Frank Takes — DSPM — Lecture 1 — Introduction 38 / 51
Process Model Goals (2)
Performance analysis: techniques like simulation can be used tounderstand the factors influencing response times, service levels, etc.
Animation: models enable end users to play out different scenariosand provide feedback to the designer
Specification: models can be used to describe a Process-AwareInformation System (PAIS) before it is implemented and can henceserve as a “contract” between the developer and the enduser/management
Configuration: models can be used to configure a system
Frank Takes — DSPM — Lecture 1 — Introduction 39 / 51
Process Modelling
In practice: formalize and visualize business processes
Business Process Languages (BPLs):
Petri NetsBusiness Process Model Notation (BPMN)
Process discovery: derive the process from a description of activities
ManualAutomated: Process Mining
Frank Takes — DSPM — Lecture 1 — Introduction 40 / 51
Process Science
Frank Takes — DSPM — Lecture 1 — Introduction 41 / 51
Process Mining
Executable models may be used to force people to work in aparticular manner
Models are possibly not well-aligned with reality
Hand-made models may be disconnected from reality and provide anidealized view on the processes at hand: “paper tigers”
An abundance of event logs (data) is usually available
Process Mining is the task of converting event data into processmodels
Frank Takes — DSPM — Lecture 1 — Introduction 42 / 51
Process Discovery Example (Dutch)
Postorderbedrijf acties:
A = het plaatsen van een bestellingB = de betaling van deze bestellingC = de levering van de bestellingD = het versturen van een dankbriefE = klant op de hoogte brengen bij niet-leverbare order
Verzamelde procespaden:
Pad ABCD is 543 keer gevolgdPad ACBD is 378 keer gevolgdPad AED is 45 keer gevolgd
Frank Takes — DSPM — Lecture 1 — Introduction 43 / 51
Petri Net for Example (Dutch)
Frank Takes — DSPM — Lecture 1 — Introduction 44 / 51
Process Mining
Data-oriented business information analysis
Model-based process analysis
Event logs can be replayed on the model
Compliance, auditing, internal review, etc.
Frank Takes — DSPM — Lecture 1 — Introduction 45 / 51
Process Mining
Frank Takes — DSPM — Lecture 1 — Introduction 46 / 51
Process Mining tasks
Discovery: how can we automatically construct a model based onevent logs?
Conformance: given an event log and a model, what are thedifferences between the model and reality?
Enhancement: how can the model be extended using knowledgeinferred from event logs?
Frank Takes — DSPM — Lecture 1 — Introduction 47 / 51
Process Mining
Frank Takes — DSPM — Lecture 1 — Introduction 48 / 51
Assignment 1
Gaming industry context
Sales log spanning roughly 4 years of sales
Apply and compare BI techniques
Inspect, visualize, aggregate, . . .
Deliverables:
1 Web-based dashboard, including rationale for design choices2 Answers to various business-related questions using the dashboard
Format: short assignment report in LATEX
Frank Takes — DSPM — Lecture 1 — Introduction 49 / 51
Mini-practicum zometeen
ULCN-account bruikbaar onder Linux in Snellius zalen?
SSH-toegang tot webserver liacs.leidenuniv.nl?
MySQL-wachtwoord bekend?http://liacs.leidenuniv.nl/phpmyadmin
Verfris uw HTML/CSS/Javascript/jQuery/PHP/MySQL-kennis
Speel een (Facebook)spel (bijv. Jelly Splash)
Onderzoek het verdienmodel van dit spel; hoe kan men zoal betalen?Hoe werken micropayments? Wat zijn payment service providers?Hoe kun je online betalen in Frankrijk? etc.
Frank Takes — DSPM — Lecture 1 — Introduction 50 / 51
Credits
Lecture based on (slides of the (previous edition of the)) course book:W. van der Aalst, Process Mining: Data Science in Action, 2nd edition, Springer,2016.
Frank Takes — DSPM — Lecture 1 — Introduction 51 / 51