Data Science and Process Modellingliacs.leidenuniv.nl/~takesfw/DSPM/dspm2020-lecture1.pdfW. van der...

Post on 15-Aug-2020

2 views 0 download

Transcript of Data Science and Process Modellingliacs.leidenuniv.nl/~takesfw/DSPM/dspm2020-lecture1.pdfW. van der...

Data Scienceand Process Modelling

Frank Takes

LIACS, Leiden University

https://liacs.leidenuniv.nl/~takesfw/DSPM

Lecture 1 — Introduction

Frank Takes — DSPM — Lecture 1 — Introduction 1 / 51

DSPM (in short)

Data Science DS

Converting data into knowledge for a particular domain (e.g.,business)

Process Modelling PMMethods and techniques to systematically map and analyze businessprocesses

Frank Takes — DSPM — Lecture 1 — Introduction 2 / 51

DSPM (in short)

Data Science DSConverting data into knowledge for a particular domain (e.g.,business)

Process Modelling PMMethods and techniques to systematically map and analyze businessprocesses

Frank Takes — DSPM — Lecture 1 — Introduction 2 / 51

DSPM (in short)

Data Science DSConverting data into knowledge for a particular domain (e.g.,business)

Process Modelling PM

Methods and techniques to systematically map and analyze businessprocesses

Frank Takes — DSPM — Lecture 1 — Introduction 2 / 51

DSPM (in short)

Data Science DSConverting data into knowledge for a particular domain (e.g.,business)

Process Modelling PMMethods and techniques to systematically map and analyze businessprocesses

Frank Takes — DSPM — Lecture 1 — Introduction 2 / 51

BIPM (2018 and earlier)

Business Intelligence BIConverting raw company data into information that contributes to(knowledge on) business development

Process Modelling PMMethods and techniques to systematically map and analyze businessprocesses

Frank Takes — DSPM — Lecture 1 — Introduction 3 / 51

Course Introduction (Dutch)

Frank Takes — DSPM — Lecture 1 — Introduction 4 / 51

Vakinformatie

Voertaal: Nederlands (materiaal: grotendeels Engels)

3 feb 2019 t/m 13 mei 2019 (niet op 18 en 25 mrt en 22 apr)

Hoorcollege: maandag en vanaf maart woensdag van 11:15 tot 13:00

Werkcollege: maandag van 9.15 tot 11.00 en vanaf maart woensdagvan 13:30 tot 15:15 (!)

Vakinformatie: http://liacs.leidenuniv.nl/~takesfw/DSPM

Docent: dr. Frank Takes

f.w.takes@liacs.leidenuniv.nl

Snellius kamer 157b

Assistenten:

Gerrit-Jan de Bruin MSc, kamer 126b,g.j.de.bruin@liacs.leidenuniv.nl

Martijn Vlak BSc, m.vlak@umail.leidenuniv.nl

Frank Takes — DSPM — Lecture 1 — Introduction 5 / 51

Format

Materiaal: boek, slides en enkele wetenschappelijke artikelen

Schriftelijk tentamen: 9 juni 2020, 14.00–17.00 60%

Practicum met deliverables: 2 mrt, 13 apr, 18 mei 3× 13, �3%

Elk onderdeel moet voldoende (> 5.5) zijn.

Eindcijfer afgerond op basis van participatie naar dichtsbijzijndeelement in {1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10}Hertentamen: 9 juli 2020, 14.00–17.00

Practicum herkansingsdeadline: 17 juni 2020 (minus 2 punten)

7 ECTS

Frank Takes — DSPM — Lecture 1 — Introduction 6 / 51

Book

W. van der Aalst, Process Mining: Discovery, Conformance andEnhancement of Business Processes, 2nd edition, Springer, 2016.

These slides are partially based on the slides of the (previous editionof the) book

Frank Takes — DSPM — Lecture 1 — Introduction 7 / 51

Disclaimer

New course format since 2015, larger adjustments in 2019

Definitions and abbreviations . . .

Pictures and schemas . . .

Participation

Feedback always welcome!

Frank Takes — DSPM — Lecture 1 — Introduction 8 / 51

Topics

February/March: Data Science

April/May: Process Modelling

Frank Takes — DSPM — Lecture 1 — Introduction 9 / 51

Assignments

1 Data Science: gaming industry data visualization on the web

2 Data Science: predictive analytics in Python

3 Process Modelling: BPMN and Petri net tools

Frank Takes — DSPM — Lecture 1 — Introduction 10 / 51

Introduction to Data Scienceand Business Intelligence

Frank Takes — DSPM — Lecture 1 — Introduction 11 / 51

Business Intelligence

Business Intelligence: techniques and tools for the transformationof raw data into meaningful and useful information for businessanalysis purposes (Wikipedia.org)

Business Intelligence: a data analysis process aimed at boostingbusiness performance by helping corporate executives and other endusers make more informed decisions (TechTarget.com)

Business Intelligence: anything that aims at providing actionableinformation that can be used to support decision making (v.d. Aalst)

Frank Takes — DSPM — Lecture 1 — Introduction 12 / 51

Business Analysis

Business analysis: identifying business needs and determiningsolutions to business problems

Business architecture analysisData warehouse analysisBusiness process analysis

Role of IT: Extraction, Transformation and Loading (ETL)

Identifying changes that are required to achieve strategic goals

Business Analyst

Frank Takes — DSPM — Lecture 1 — Introduction 13 / 51

Business Analytics

Business Analytics: skills, technologies and practices for continuousiterative exploration and investigation of past business performance togain insight and drive business planning

Descriptive analytics: gain insight from historical dataPredictive analytics: predictive modeling using statistical andmachine learning techniquesPrescriptive analytics: recommend decisions using optimization,simulation, etc.

Domains: marketing, pricing, sales, accounting, supply chains,communication, . . .

Frank Takes — DSPM — Lecture 1 — Introduction 14 / 51

Data

Data: facts, measurements or text collected for reference or analysis(Oxford dictionary)

Unstructured data: data that does not fit a certain data structure(text, a list of numeric measurements)Structured data: data that fits a certain data structure(table, tree, graph/network, etc.)

Frank Takes — DSPM — Lecture 1 — Introduction 15 / 51

Moore’s Law & Transistors

http://visual.ly

Frank Takes — DSPM — Lecture 1 — Introduction 16 / 51

Moore’s Law & CPU’s

Frank Takes — DSPM — Lecture 1 — Introduction 17 / 51

Moore’s Law & Intelligence

http://frugicom.nl

Frank Takes — DSPM — Lecture 1 — Introduction 18 / 51

Mega Giga Tera Peta Exa Zetta Yotta

Frank Takes — DSPM — Lecture 1 — Introduction 19 / 51

Storage: Digital vs. Analog

Frank Takes — DSPM — Lecture 1 — Introduction 20 / 51

Moore’s Law & Data

Figure: Zettabytes produced per year

Frank Takes — DSPM — Lecture 1 — Introduction 21 / 51

Big Data

Frank Takes — DSPM — Lecture 1 — Introduction 22 / 51

Big Data

Frank Takes — DSPM — Lecture 1 — Introduction 23 / 51

Data Mining

Data explosion

“We are drowning in data, but starving for knowledge!”

Interpretation

Understanding

Learning

Acting

Frank Takes — DSPM — Lecture 1 — Introduction 24 / 51

Data Mining

Data → Information → Knowledge

Knowledge Discovery

Data Mining

Machine Learning

Unsupervised: clustering, pattern mining, etc.Supervised: classification, prediction, etc.

Big Data

Data Science

Frank Takes — DSPM — Lecture 1 — Introduction 25 / 51

DIKW Pyramid

Frank Takes — DSPM — Lecture 1 — Introduction 26 / 51

DIKWD Pyramid (2)

information4dummies.blogspot.comFrank Takes — DSPM — Lecture 1 — Introduction 27 / 51

Data Science

Frank Takes — DSPM — Lecture 1 — Introduction 28 / 51

Data Science

Frank Takes — DSPM — Lecture 1 — Introduction 29 / 51

Break?

Frank Takes — DSPM — Lecture 1 — Introduction 30 / 51

Introduction to Process Modelling

Frank Takes — DSPM — Lecture 1 — Introduction 31 / 51

Processes

Process: a set of related actions and transactions to achieve acertain objective

Service process: providing a value transferProduction process: providing a value creation

Various domains: organizations, environments, biological systems,economies, . . .

Frank Takes — DSPM — Lecture 1 — Introduction 32 / 51

Business Processes

Business process: a sequence of activities aimed at producingsomething of value for the business (Morgan)

Business process: collection of related, structured activities or tasksthat produce a specific service or product (serve a particular goal) fora particular customer or customers (Wikipedia)

Business Process Management: the discipline that combinesknowledge from information technology and knowledge frommanagement sciences and applies this to operational businessprocesses (v.d. Aalst)

Frank Takes — DSPM — Lecture 1 — Introduction 33 / 51

Process Modelling

Frank Takes — DSPM — Lecture 1 — Introduction 34 / 51

Types of Business Processes

Management processes: the processes that govern the operation ofa system, e.g., corporate governance and strategic management

Operational processes: processes that constitute core business andcreate the primary value stream, e.g., purchasing, manufacturing,advertising, marketing and sales

Supporting processes: processes that support the core business,e.g., accounting, recruitment, HR, customer support and ICT support

Frank Takes — DSPM — Lecture 1 — Introduction 35 / 51

Business Process Model

Model: abstract or schematic representation of reality

Modelling (task): creating a model, for example based on realobservations

Business Process Model: abstract representation of businessprocesses

Process model functions:

Descriptive: what is actually happening?Prescriptive: what should be happening?Explanatory: why is the process designed this way?

Frank Takes — DSPM — Lecture 1 — Introduction 36 / 51

Classical BPM Lifecycle

Frank Takes — DSPM — Lecture 1 — Introduction 37 / 51

Process Model Goals

Insight: while making a model, the modeler is triggered to view theprocess from various angles

Discussion: the stakeholders use models to structure discussions

Documentation: processes are documented for instructing people orcertification purposes

Verification: process models are analyzed to find errors in systems orprocedures (e.g., potential deadlocks)

Frank Takes — DSPM — Lecture 1 — Introduction 38 / 51

Process Model Goals (2)

Performance analysis: techniques like simulation can be used tounderstand the factors influencing response times, service levels, etc.

Animation: models enable end users to play out different scenariosand provide feedback to the designer

Specification: models can be used to describe a Process-AwareInformation System (PAIS) before it is implemented and can henceserve as a “contract” between the developer and the enduser/management

Configuration: models can be used to configure a system

Frank Takes — DSPM — Lecture 1 — Introduction 39 / 51

Process Modelling

In practice: formalize and visualize business processes

Business Process Languages (BPLs):

Petri NetsBusiness Process Model Notation (BPMN)

Process discovery: derive the process from a description of activities

ManualAutomated: Process Mining

Frank Takes — DSPM — Lecture 1 — Introduction 40 / 51

Process Science

Frank Takes — DSPM — Lecture 1 — Introduction 41 / 51

Process Mining

Executable models may be used to force people to work in aparticular manner

Models are possibly not well-aligned with reality

Hand-made models may be disconnected from reality and provide anidealized view on the processes at hand: “paper tigers”

An abundance of event logs (data) is usually available

Process Mining is the task of converting event data into processmodels

Frank Takes — DSPM — Lecture 1 — Introduction 42 / 51

Process Discovery Example (Dutch)

Postorderbedrijf acties:

A = het plaatsen van een bestellingB = de betaling van deze bestellingC = de levering van de bestellingD = het versturen van een dankbriefE = klant op de hoogte brengen bij niet-leverbare order

Verzamelde procespaden:

Pad ABCD is 543 keer gevolgdPad ACBD is 378 keer gevolgdPad AED is 45 keer gevolgd

Frank Takes — DSPM — Lecture 1 — Introduction 43 / 51

Petri Net for Example (Dutch)

Frank Takes — DSPM — Lecture 1 — Introduction 44 / 51

Process Mining

Data-oriented business information analysis

Model-based process analysis

Event logs can be replayed on the model

Compliance, auditing, internal review, etc.

Frank Takes — DSPM — Lecture 1 — Introduction 45 / 51

Process Mining

Frank Takes — DSPM — Lecture 1 — Introduction 46 / 51

Process Mining tasks

Discovery: how can we automatically construct a model based onevent logs?

Conformance: given an event log and a model, what are thedifferences between the model and reality?

Enhancement: how can the model be extended using knowledgeinferred from event logs?

Frank Takes — DSPM — Lecture 1 — Introduction 47 / 51

Process Mining

Frank Takes — DSPM — Lecture 1 — Introduction 48 / 51

Assignment 1

Gaming industry context

Sales log spanning roughly 4 years of sales

Apply and compare BI techniques

Inspect, visualize, aggregate, . . .

Deliverables:

1 Web-based dashboard, including rationale for design choices2 Answers to various business-related questions using the dashboard

Format: short assignment report in LATEX

Frank Takes — DSPM — Lecture 1 — Introduction 49 / 51

Mini-practicum zometeen

ULCN-account bruikbaar onder Linux in Snellius zalen?

SSH-toegang tot webserver liacs.leidenuniv.nl?

MySQL-wachtwoord bekend?http://liacs.leidenuniv.nl/phpmyadmin

Verfris uw HTML/CSS/Javascript/jQuery/PHP/MySQL-kennis

Speel een (Facebook)spel (bijv. Jelly Splash)

Onderzoek het verdienmodel van dit spel; hoe kan men zoal betalen?Hoe werken micropayments? Wat zijn payment service providers?Hoe kun je online betalen in Frankrijk? etc.

Frank Takes — DSPM — Lecture 1 — Introduction 50 / 51

Credits

Lecture based on (slides of the (previous edition of the)) course book:W. van der Aalst, Process Mining: Data Science in Action, 2nd edition, Springer,2016.

Frank Takes — DSPM — Lecture 1 — Introduction 51 / 51