Data Science and Predictive Analytics Academic-Industry ... · Predictive Analytics...


Transcript of Data Science and Predictive Analytics Academic-Industry ... · Predictive Analytics...

Page 1: Data Science and Predictive Analytics Academic-Industry ... · Predictive Analytics Academic-Industry Partnering Forum ... • Building accident risk models to identify risky behaviour

Data Science and Predictive Analytics Academic-Industry Partnering Forum

Stefan Steiner
Department Chair, Statistics and Actuarial Science
Friday, April 27, 2018


Partnering Forum Goals

Stimulate contact and interaction between companies and academic researchers

What does longer term success look like?

Establish industry research collaborations for faculty and graduate students

Provide funding for graduate students

Address problems facing industry

Develop a talent pipeline for companies


Department of Statistics and Actuarial Science

50 research professors and 10 lecturers

900+ undergrad and close to 200 grad students

Research Institute/Groups

Waterloo Research Institute in Insurance, Securities and Quantitative Finance (WatRISQ)

Business and Industrial Statistics Research Group (BISRG)

Propel Centre for Population Health Impact (PROPEL)

Centre for Computational Mathematics in Industry and Commerce (CCMIC)

Survey Research Centre (SRC)

Statistical Consulting and Collaborative Research Unit


Department Research Areas

Actuarial risk management

Applied probability

Biostatistics

Business and industrial statistics

Computational statistics

Data science

Econometrics and quantitative finance

Risk theory

Statistical modeling and inference

Survey methods


Forum Agenda and Logistics

Morning (9-12:30) – DC 1302 (break at 10:45 in fishbowl)

Introduction and successful collaboration showcase

Company and faculty member profiles (5 minutes each)

Lunch (12:30-1:30) – M3 Atrium

Presentations by OCE, NSERC and Mitacs regarding funding opportunities

Afternoon (1:30-3:30) – M3 Atrium

Open networking (first 30 minutes)

Speed networking (starting at 2pm, scheduled 5 x 15 minute time slots)

Closing remarks


Future Engagement/Funding Opportunities

Collaborative Research

OCE, NSERC and Mitacs programs (more on this over lunch)

Engage with the Statistical Consulting and Collaborative Research Unit

Hire co-op/internship students

Undergraduate students in a large number of programs, e.g. Actuarial Science, Biostatistics, Computer Science, Data Science, Statistics

Graduate students and Post-Doctoral Fellows

Waterloo-ASA DataFest, May 4-6, 2018 (annual event)

Actuarial science case competitions

Partnering with UW
The Story

Ella Hilal, PhD.


About the Speaker


Director of Data Science and Engineering

Photo by Sarah Pflug from Burst


Director of Innovation and Data Intelligence


The Story

Photo by: Matthew Henry on Burst


The Connected Car


Diverse Data Sources


Scale of Data

Speed of Data Arrival

Different Data Forms

Different Data Accuracy


Making Sense of 10 Trillion Data Points


Consumer Preferences

Driver Behavior Habits

Crafting the Lifestyle Narrative

Risk Analysis


Photo by JESHOOTS.COM on Unsplash


Ask the Experts


Great Partners

Photo by rawpixel.com on Unsplash


Funding


Photo by Sarah Pflug from Burst

Working Closely


Major Strides in a Challenging Problem Space

1. Problem Statement
2. Data Assets
3. Solution with Real-world Constraints
4. Knowledge Transfer

Photo by rawpixel.com on Unsplash


Success is Sweet

Photo by Matheus Ferrero on Unsplash


The Story Continues

Photo by: Matthew Henry on Burst


Shopify is the leading cloud-based, multichannel commerce platform.

Merchants can use the software to design, set up and manage their stores.

The Shopify platform was engineered for reliability and scale.

Shopify currently powers over 500,000 businesses in ~150 countries.

Red Bull, LA Lakers, the New York Stock Exchange, GoldieBlox, and many more.


Detection of Check-out Bots

Photo by Matthew Henry from Burst


Detection of Flash Sales

Photo by Nicole De Khors from Burst


Marketing Campaigns

- Potential of Engagement

- Risk of Un-subscription

Photo by: Nicole De Khors


https://ca.linkedin.com/in/allaahilal @a_hilal

https://uwaterloo.ca/scholar/ahilal/

Collaboration is Essential for Advancement & Innovation

[email protected]


Wayne Oldford

Statistical reasoning

exploratory data analysis

data visualization

development of interactive computational environments that support these activities


Survey Methodology

There is designed data collection and organic data collection (Groves, 2011).

Survey methodology research is about:
• the principles of designed data collection
• the combination of designed data and organic data
• in analysis, accounting for the complexity of the design (e.g. targeted sampling; network sampling; longitudinality)


Expertise at UW

• Involvement with large longitudinal surveys:
  – International Tobacco Control Project
  – Canadian Longitudinal Study on Aging

• Analysis of survey data with large numbers of variables; predictive model selection (Wu, Boudreau)

• Machine learning from text data (Schonlau)

• Network sampling (Thompson)


Survey Research Centre


• A full-service survey research organization providing survey design, data collection and top-line analysis since 1999

• Emphasis on collecting high-quality data for scientific and decision-maker use

• Web, telephone, mail and mixed-mode surveys

• Data held on secure servers at the University of Waterloo


STEFAN STEINER
RESEARCH PROFILE


ASSESSMENT OF STREAMING DATA

• Decision support with process monitoring and comparison

• Monitoring manufacturing processes for upsets

• Monitoring customer satisfaction measures

• Comparing medical labs, hospitals, or individual surgeons over time with risk adjustment

• Example application: analysis of automotive telematics data

• Building accident risk models to identify risky behaviour profiles

• Developing driver behaviour profiles

• Providing real-time feedback on driving behaviour
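The slide lists monitoring a process for upsets without naming a method. One common option, shown here only as an illustration and not as the presenter's approach, is a two-sided tabular CUSUM chart, which accumulates small deviations from a target until they cross a decision threshold. The function name `cusum_alarms`, the reference value `k`, the threshold `h`, and the readings are all hypothetical.

```python
def cusum_alarms(xs, target, k, h):
    """Return the indices at which a two-sided tabular CUSUM signals.

    xs     : sequence of measurements in time order
    target : in-control process mean
    k      : reference value (slack); deviations smaller than k are ignored
    h      : decision threshold; a signal is raised when a sum exceeds h
    """
    upper = 0.0  # accumulates evidence of an upward shift
    lower = 0.0  # accumulates evidence of a downward shift
    alarms = []
    for i, x in enumerate(xs):
        upper = max(0.0, upper + (x - target - k))
        lower = max(0.0, lower + (target - x - k))
        if upper > h or lower > h:
            alarms.append(i)
            # Restart both sums after a signal so later shifts are detected afresh.
            upper = 0.0
            lower = 0.0
    return alarms


# Made-up readings: in control around 10, then shifted upward from index 5.
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 11.0, 11.2, 10.9, 11.1, 11.3]
alarms = cusum_alarms(readings, target=10.0, k=0.25, h=2.0)
print("Signal at indices:", alarms)
```

In practice `k` and `h` would be chosen from the size of shift worth detecting and the acceptable false-alarm rate; the values above are placeholders.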


PROCESS IMPROVEMENT AND VARIATION REDUCTION

• Measurement system assessment

• Develop improved plans incorporating baseline, gold standard assessment, partial verification, specially selected parts, etc.

• Variety of characteristic types: continuous, binary, diagnostic tests, count, functional, etc.

• Comparison and calibration of measurement systems – probability of agreement

• Quality/process improvement systems

• Experimental design


o Broad research interests

o Enthusiasm in industrial partnership

o Diversity of Education Experiences


Chengguo.Weng.com


Optimal Decision with Uncertainty

Predictive Analytics


• Optimal reinsurance
• Vast portfolio selection
• Risk prioritization
• Pricing and hedging of insurance/finance products

• Monte Carlo simulation
• Data-driven
• Partial information
• Statistical learning

• Insurance premium rating
• Customer behavior characterization
• Prediction of economic factors

• Personalized prediction algorithms
• Enable price discrimination
• Enable incorporation of large information


Describe to me your Situations

Bring me your Questions

Show me your Data


Chengguo Weng
University of Waterloo


Shoja [email protected]

Department of Statistics and Actuarial Science

Acknowledgment: My research has been funded by


Neural spiking:
• Action potentials (spikes) are nerve impulses
• Spike trains
• Temporal point process. Inference on firing rates.
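As a rough illustration of inference on firing rates, the sketch below treats a spike train as a homogeneous Poisson process, the simplest temporal point process, and estimates a constant rate with an approximate 95% Wald interval. The constant-rate assumption, the function name `firing_rate_mle`, and the spike times are hypothetical and are not taken from the slides; real spike trains usually need time-varying or history-dependent models.

```python
import math


def firing_rate_mle(spike_times, duration):
    """Estimate a constant firing rate (spikes per second).

    Under a homogeneous Poisson model observed for `duration` seconds,
    the maximum likelihood estimate of the rate is (number of spikes) / duration,
    with approximate standard error sqrt(number of spikes) / duration.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    n = len(spike_times)
    rate = n / duration
    se = math.sqrt(n) / duration
    # Approximate 95% Wald interval, truncated at zero.
    interval = (max(0.0, rate - 1.96 * se), rate + 1.96 * se)
    return rate, interval


# Made-up spike times (seconds) from a 4-second recording.
spikes = [0.12, 0.48, 0.95, 1.30, 1.71, 2.05, 2.60, 3.10]
rate, (low, high) = firing_rate_mle(spikes, duration=4.0)
print(f"rate = {rate:.2f} spikes/s, approx. 95% CI = ({low:.2f}, {high:.2f})")
```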


Network data:
• Epidemic networks (directed graphs)
• Predicting links based on covariates on the nodes
• Temporal dynamics of networks.

AIS Data:
• Modelling trajectories of vessels
• Identifying dark targets (anomalous vessels)
• Doppelgangers
• Spatiotemporal processes, functional data

Additive manufacturing or 3D printing:
• Design of experiments
• Response surface methodology & optimization
• Deformation and compensation
• Process monitoring.


Accidents of freight trains carrying HazMat:
• Modelling probability of a car initiating derailment
• Number of cars derailed
• Data-driven marshalling yard
• Markov chains, GLM and classification algorithms.

[Figure: Foreground objects. Panels: (a) Original images, (b) PCA, (c) PCP, (d) ROBPCA, (e) PWBPCA]

Environmental contaminants:
• How well did the policies implemented by Canada and the US for acid rain work? Multivariate change point detection.
• Below-detection data for site characterization and remediation.
• Analysis of left-censored data and regression.

Dimensionality reduction:
• Represent the high-dimensional data in a low-dimensional form without losing “important information”
• Widely applied to many types of data such as images, videos, texts.
• Many datasets are high-dimensional representations of data from low-dimensional curved manifolds.
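To make the idea of a low-dimensional representation concrete, here is a minimal sketch of plain PCA, one of the methods named in the figure panels, computed from the singular value decomposition of the centred data with NumPy. The function name `pca_project` and the synthetic data are hypothetical. PCA is linear, so it would not recover the curved manifolds mentioned in the last bullet, and it is not the robust variants (PCP, ROBPCA, PWBPCA) shown in the figure.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto the top principal components.

    Returns (scores, components, explained_ratio):
      scores          : low-dimensional coordinates, shape (n_samples, n_components)
      components      : principal directions as rows, shape (n_components, n_features)
      explained_ratio : fraction of total variance captured by each kept component
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # centre each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]               # directions of largest variance
    scores = Xc @ components.T                   # coordinates in the reduced space
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    return scores, components, explained_ratio[:n_components]


# Synthetic example: 200 points in 3-D lying close to a single line.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.outer(t, [1.0, 2.0, -1.0]) + 0.01 * rng.normal(size=(200, 3))

scores, components, explained = pca_project(X, n_components=1)
print("reduced shape:", scores.shape)
print("variance explained by first component:", explained[0])
```

Because the synthetic points sit almost exactly on a line, a single component retains nearly all of the variance; real image or text data would typically need more components.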