Data Exploration at Ebiquity - The UK's Premier R User Group · Data exploration at Ebiquity (and...

Data Exploration at Ebiquity Wojtek Kostelecki

Page 1: Data Exploration at Ebiquity - The UK's Premier R User Group · Data exploration at Ebiquity (and in a hackathon) Our hotel in Porto. I-COM GLOBAL SUMMIT (MARKETING DATA AND MEASUREMENT)

Data Exploration at Ebiquity

Wojtek Kostelecki

Page 2:

Kaggle Dojo, R Dojo, Python Dojo

Kaggle Hackathon – Saturday 27 May

meetup.com/London-Kaggle-Meetup

Page 3:

Data exploration at Ebiquity (and in a hackathon)

Our hotel in Porto

Page 4:

I-COM GLOBAL SUMMIT (MARKETING DATA AND MEASUREMENT)

Intel Challenge Overview

Business Challenge:

What is the impact of discussions in social media and brand health indicators on advertising effectiveness for high-consideration purchases such as consumer PC sales in the US?

Prediction Challenge:

Predict the sales revenue by CPU brand/device brand combination by month for Jan and Feb 2017

Time Challenge:

24 Hours

Page 5:

EBVERSE

Hadleyverse/Tidyverse

Page 6:

RAW DATA

~300 MB of data over ~400 files

directory of text files

ebdb::dir_stack(path, func)

multi-sheet excel file

ebdb::read_yougov(x)
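
The ebdb package is in-house and its code is not shown on the slides. As a rough sketch of the idea behind ebdb::dir_stack(path, func), assuming it applies a reader function to every file in a directory and stacks the results row-wise, something like the following would do. The name dir_stack_sketch, the pattern argument and the source_file column are illustrative assumptions, not part of ebdb.

```r
# Hypothetical stand-in for ebdb::dir_stack(); the real implementation is not public.
dir_stack_sketch <- function(path, func, pattern = NULL) {
  files <- list.files(path, pattern = pattern, full.names = TRUE)
  pieces <- lapply(files, function(f) {
    df <- func(f)                   # read one file with the supplied reader
    df$source_file <- basename(f)   # remember which file each row came from
    df
  })
  do.call(rbind, pieces)            # stack all pieces into one data frame
}

# Toy usage: two small CSV files in a fresh temporary directory
tmp <- tempfile("dir_stack_demo")
dir.create(tmp)
write.csv(data.frame(Week = 1:2, Spend = c(10, 20)),
          file.path(tmp, "a.csv"), row.names = FALSE)
write.csv(data.frame(Week = 3:4, Spend = c(30, 40)),
          file.path(tmp, "b.csv"), row.names = FALSE)

stacked <- dir_stack_sketch(tmp, read.csv, pattern = "\\.csv$")
print(stacked)
```

This sketch assumes every file yields the same columns; files with differing layouts would need a more forgiving row-bind.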

Page 7:

GITLAB

Page 8:

data → script → sourced data

Page 9:

EBPLOT

Page 10:

EBPLOT – PC/PROCESSOR OFFLINE ADVERTISING IN UK (AS MONITORED BY EBIQUITY)

area_plot(df, "Week", "Spend")

Page 11:

EBPLOT – PC/PROCESSOR OFFLINE ADVERTISING IN UK (AS MONITORED BY EBIQUITY)

area_plot(df, "Week", "Spend", "Company")
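
ebplot is an in-house package, so the following is only a guess at what an area_plot()-style helper does: sum a value column per x value (and optionally per group) and draw the result as stacked areas. It uses base graphics, assumes the x column is numeric, and the function name area_plot_sketch and the toy companies "A" and "B" are made up for illustration.

```r
# Hypothetical sketch of an area_plot()-style helper; not the ebplot implementation.
area_plot_sketch <- function(df, x, y, group = NULL) {
  if (is.null(group)) {
    totals <- tapply(df[[y]], df[[x]], sum)            # one total per x value
    mat <- matrix(totals, ncol = 1, dimnames = list(names(totals), y))
  } else {
    mat <- tapply(df[[y]], list(df[[x]], df[[group]]), sum)
    mat[is.na(mat)] <- 0                               # no spend where a cell is empty
  }
  xs <- as.numeric(rownames(mat))                      # assumes a numeric x column
  plot(range(xs), c(0, max(rowSums(mat))), type = "n", xlab = x, ylab = y)
  base <- rep(0, nrow(mat))
  for (j in seq_len(ncol(mat))) {                      # stack one band per group
    top <- base + mat[, j]
    polygon(c(xs, rev(xs)), c(top, rev(base)), col = j + 1, border = NA)
    base <- top
  }
  if (!is.null(group)) {
    legend("topleft", legend = colnames(mat), fill = seq_len(ncol(mat)) + 1)
  }
  invisible(mat)                                       # return the aggregated table
}

# Toy data (invented for the example)
df <- data.frame(
  Week    = rep(1:4, times = 2),
  Company = rep(c("A", "B"), each = 4),
  Spend   = c(10, 20, 15, 5, 0, 10, 10, 20)
)

pdf(NULL)  # null device, so the example also runs non-interactively
weekly <- area_plot_sketch(df, "Week", "Spend", "Company")
dev.off()
print(weekly)
```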

Page 12:

EBPLOT – PC/PROCESSOR OFFLINE ADVERTISING IN UK (AS MONITORED BY EBIQUITY)

area_plot(df, "Week", "Spend", "Year", "Company", labels = TRUE)

Page 13:

EBPLOT – PC/PROCESSOR OFFLINE ADVERTISING IN UK (AS MONITORED BY EBIQUITY)

share_plot(df, "Year", "Spend", "Company")
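
A share plot of this kind presumably rests on a simple calculation: total the value per period and group, then divide each row by its period total. A minimal base-R sketch of that step follows; share_table is a hypothetical name and the figures are invented, since the ebplot code itself is not shown.

```r
# Hypothetical helper: share of `y` held by each `group` within each `x`.
share_table <- function(df, x, y, group) {
  totals <- tapply(df[[y]], list(df[[x]], df[[group]]), sum)
  totals[is.na(totals)] <- 0
  prop.table(totals, margin = 1)   # divide each row by its row total
}

# Toy data (invented for the example)
df <- data.frame(
  Year    = c(2015, 2015, 2016, 2016),
  Company = c("A", "B", "A", "B"),
  Spend   = c(30, 70, 50, 50)
)

shares <- share_table(df, "Year", "Spend", "Company")
print(shares)

pdf(NULL)  # null device, so the example also runs non-interactively
barplot(t(shares), legend.text = TRUE, xlab = "Year", ylab = "Share of Spend")
dev.off()
```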

Page 14:

EBPLOT – PC/PROCESSOR OFFLINE ADVERTISING IN UK (AS MONITORED BY EBIQUITY)

share_plot(df, "Year", "Spend", "Company", "Device")

Page 15:

EBPLOT – PC/PROCESSOR OFFLINE ADVERTISING IN UK (AS MONITORED BY EBIQUITY)

share_plot(df, "Week", "Spend", "Company")

Page 16:

EBPLOT – PC/PROCESSOR OFFLINE ADVERTISING IN UK (AS MONITORED BY EBIQUITY)

waterfall(df, "Device", "Spend", "Year")
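
The slides do not say how waterfall() interprets its arguments. One plausible reading, assumed here, is that it decomposes the change in total spend between two periods into per-category steps. The sketch below computes only the bar positions for such a chart; waterfall_steps and the toy numbers are illustrative, not taken from ebplot.

```r
# Hypothetical sketch: per-category contributions to the change between two periods.
waterfall_steps <- function(df, category, value, period) {
  totals <- tapply(df[[value]], list(df[[category]], df[[period]]), sum)
  totals[is.na(totals)] <- 0
  stopifnot(ncol(totals) == 2)               # this sketch handles exactly two periods
  change <- totals[, 2] - totals[, 1]        # movement per category
  start_total <- sum(totals[, 1])            # total in the first period
  end <- start_total + cumsum(change)        # running total after each step
  data.frame(
    category = rownames(totals),
    change   = unname(change),
    start    = unname(c(start_total, head(end, -1))),
    end      = unname(end)
  )
}

# Toy data (invented for the example)
df <- data.frame(
  Device = c("Desktop", "Laptop", "Desktop", "Laptop"),
  Year   = c(2015, 2015, 2016, 2016),
  Spend  = c(100, 50, 80, 90)
)

steps <- waterfall_steps(df, "Device", "Spend", "Year")
print(steps)
```

Each row gives the floating bar for one category; the last end value equals the second-period total.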

Page 17:

MODELLING

Page 18:

PREDICTION CHALLENGE – FORECAST TWO MONTHS OF SALES

Time  Proc.  Manuf.     Dev.     Rev. (target)  …  Microsoft Ad  Intel Ad  AMD Ad  Dell Ad
1     Intel  Microsoft  Laptop   ##             …  £0            £10k      £0      £0
2     Intel  Microsoft  Laptop   ##             …  £100k         0k        £0      £0
1     Intel  Asus       Laptop   ##             …  £0            £10k      £0      £0
2     Intel  Asus       Laptop   ##             …  £0            0k        £0      £0
1     Intel  Dell       Desktop  ##             …  £0            £10k      £0      £50k
2     Intel  Dell       Desktop  ##             …  £0            0k        £0      £50k
1     AMD    Dell       Laptop   ##             …  £0            £0        £10k    £50k
2     AMD    Dell       Laptop   ##             …  £0            £0        £10k    £50k
1     AMD    Lenovo     Laptop   ##             …  £0            £0        £10k    £0
2     AMD    Lenovo     Laptop   ##             …  £0            £0        £10k    £0
…     …      …          …        …              …  …             …         …       …

Page 19:

In-house weighted OLS with box-constrained optimization

Model Specification

Variable w/ transformation      Mod. Link  Var. Link  Coef. Min  Coef. Max
INTERCEPT                       Model
…                               …          …          …          …
amd_tv * (proc == "AMD")        Device     proc_tv    0
intel_tv * (proc == "Intel")    Device     proc_tv    0
…                               …          …          …          …

Model Estimation

• Estimate OLS
• Enforce coefficient boundary constraints
• Re-estimate OLS with boundary-touching coefficients fixed
• Release fixed coefficients if necessary
• Repeat
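
The estimation loop on this slide can be sketched in base R as follows. This is not Ebiquity's in-house code: bounded_wls is a hypothetical name, the release test (the sign of the gradient at a bound) is one common choice that the slides do not confirm, and the data are simulated. The two advertising columns mirror the amd_tv * (proc == "AMD") and intel_tv * (proc == "Intel") terms in the specification, each with a minimum coefficient of 0. The sketch assumes a full-rank model matrix.

```r
# Hypothetical sketch of weighted OLS with box constraints on the coefficients.
bounded_wls <- function(X, y, w, lower, upper, max_iter = 50, tol = 1e-8) {
  p <- ncol(X)
  beta <- numeric(p)
  fixed <- rep(FALSE, p)
  for (iter in seq_len(max_iter)) {
    free <- !fixed
    # (Re-)estimate the free coefficients, holding fixed ones at their bounds
    held <- drop(X %*% ifelse(fixed, beta, 0))
    if (any(free)) {
      beta[free] <- lm.wfit(X[, free, drop = FALSE], y - held, w)$coefficients
    }
    # Enforce the box: clip violating coefficients and fix them at the bound
    violates <- free & (beta < lower | beta > upper)
    if (any(violates)) {
      beta[violates] <- pmin(pmax(beta[violates], lower[violates]), upper[violates])
      fixed[violates] <- TRUE
      next
    }
    # Release a fixed coefficient if moving it into the box would reduce the loss
    grad <- -drop(crossprod(X, w * (y - drop(X %*% beta))))
    release <- fixed & ((beta <= lower & grad < -tol) | (beta >= upper & grad > tol))
    if (!any(release)) break   # nothing to clip, nothing to release: done
    fixed[release] <- FALSE
  }
  setNames(beta, colnames(X))
}

# Simulated data: the true AMD effect is negative, so its lower bound of 0 binds
set.seed(1)
n <- 200
proc <- sample(c("Intel", "AMD"), n, replace = TRUE)
intel_tv <- runif(n)
amd_tv <- runif(n)
X <- cbind(
  INTERCEPT = 1,
  intel = intel_tv * (proc == "Intel"),
  amd   = amd_tv * (proc == "AMD")
)
y <- drop(X %*% c(5, 2, -2)) + rnorm(n)
w <- rep(1, n)

fit <- bounded_wls(X, y, w, lower = c(-Inf, 0, 0), upper = c(Inf, Inf, Inf))
print(fit)
```

In the simulated example the unconstrained AMD estimate is negative, so the loop clips it to 0 and re-estimates the remaining coefficients with that value held fixed.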

Page 20:

BENCHMARKING

Using a 6000 x 1230 model matrix

Implementation with sparse matrices and repeated calculations moved up one level
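
The slide gives no implementation detail, so the following is only one reading of "sparse matrices and repeated calculations moved up one level": store the mostly-zero model matrix with the Matrix package (a recommended package shipped with standard R installations), compute the weighted cross-products once outside the iteration, and solve each re-estimate from sub-blocks of them. The dimensions are scaled down from 6000 x 1230 and all data are simulated.

```r
library(Matrix)

set.seed(1)
n <- 600
k <- 120                                   # scaled-down stand-in for 6000 x 1230
group <- rep(seq_len(k), length.out = n)   # every dummy column gets 5 rows
X_dense <- matrix(0, n, k)
X_dense[cbind(seq_len(n), group)] <- 1     # one-hot model matrix, mostly zeros
X_sparse <- Matrix(X_dense, sparse = TRUE)
w <- runif(n, 0.5, 1.5)
y <- rnorm(n)

# Computed once, outside any loop over coefficient subsets
XtWX <- crossprod(X_sparse, Diagonal(x = w) %*% X_sparse)   # X' W X
XtWy <- as.vector(crossprod(X_sparse, w * y))               # X' W y

# One re-estimate: pretend the first 10 coefficients are fixed at zero
free <- seq_len(k) > 10
beta_sparse <- as.vector(solve(XtWX[free, free], XtWy[free]))

# Reference: the same weighted fit on the dense matrix
beta_dense <- unname(lm.wfit(X_dense[, free], y, w)$coefficients)
print(all.equal(beta_sparse, beta_dense))
```

Because the coefficients held out here are fixed at zero, no offset term is needed; non-zero fixed values would require subtracting their contribution from XtWy first.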
