Capabilities Apollo and SQL Server Data Mining Presented by Jeff Kaplan, Principal Client Services Paul Bradley, Ph.D., Principal Data Mining Technology 312.787.7376

Capabilities

Apollo and SQL Server Data Mining

Presented by

Jeff Kaplan, Principal Client Services

Paul Bradley, Ph.D., Principal Data Mining Technology

312.787.7376

2

Agenda

Apollo Overview

Data Mining 101

Project REAL Case Study

SQL Server 2005 Data Mining Demo

Real-life Examples

3

Apollo Overview

PART ONE

4

Company Background

First company delivering true predictive analytic solutions

10 plus years in data mining and data warehousing

Premier Partner for SQL Server 2005 Data Mining

Caters to a wide range of businesses, including Microsoft, Sprint, Wal-Mart, Barnes & Noble, Seattle Times, Knight Ridder

Variety of Industries
• Retail and Consumer Goods
• Media
• Financial Services
• Manufacturing
• Public Services

overview

5

Industry Recognition

6

Testimonials

7

Testimonials

8

Testimonials

9

Analytic Landscape

10

Capabilities

• Customer Acquisition

• Campaign Targeting

• Cross-sell/Up-sell

• Customer Segmentation

• Retention Modeling

• Behavioral Targeting

• Personalization

• Claim Analysis

• Call Center Analytics

• Data Warehousing

• Dashboard Reporting

Marketing Sales & Distribution

• Correlation Analysis

• Key Driver Analysis

• Verbatim Summarization

Market Research Operations

• Inventory Forecasting

• Sales Forecasting

• Pricing Optimization

• Next Best Offer

• Market Basket Analysis

• Recency & Frequency Modeling

11

[Diagram labels. Data sources: RedCard, Booking, CallCenter, Stores. Platform: SQL-Server 2005. Outputs: Predictive Models, Customer Clustering Models, Customer Targeting Models, Dashboard & Ad-hoc Reporting, Measure Promotion Success. Channels: Web, Direct Mail, Email, Phone.]

Automate Predictions for Targeting, Forecasting, Detection, etc.
• Join Customer Data Sources
• Run Predictive Algorithms
• Score Model Results
• Deliver Targeted Predictions

12

MS Data Mining

PART TWO

13

Background

Fastest Growing BI Segment (IDC)
• Data Mining Tools: $1.85B in 2006
• Predictive Analytic Projects Yield a High Median ROI of 145%

Uses
• Marketing: Customer Acquisition and Targeting, Cross-Sell/Up-Sell
• Retail: Inventory Forecasting, Price Optimization
• Market Research: Driver Analysis, Verbatim Summarization
• Operations: Call Center Analytics
• Finance: Fraud Detection, Risk Models

Mainstream Emergence
• E-commerce (e.g. Amazon.com)
• Search (e.g. Vivisimo.com)
• Behavioral Advertising

SQL-Server is in a Unique Position to Service Market Needs

ms data mining

14

Evolution of SQL Server Data Mining

SQL 2000
• Enter the Game
• Create industry standard
• Target developer audience
• V1.0 product with 2 algorithms

SQL 2005
• Win Leadership
• Continue standards and developer effort
• Comprehensive feature set
• Penetrate the Enterprise
• Thought leadership

15

Value of Data Mining

[Chart: Reports (Static), Reports (Adhoc), OLAP, and Data Mining in SQL-Server 2005. Axis labels: Relative Business Value; Easy to Difficult; Business Knowledge.]

16

SQL-Server 2005 BI Platform

[Diagram: components of the BI platform]
• Analysis Services: OLAP & Data Mining
• Integration Services: ETL
• SQL Server: Relational Engine
• Reporting Services
• Management Tools
• Development Tools

17

SQL Server 2005 BI Platform

Embed Data Mining: Development Tool Integration
• Make Decisions Without Coding
• Customized Logic Based on Client Data
• Logic Updated by Model Reprocessing – Applications Do Not Need to be Re-Written, Re-Compiled, and Re-Deployed

Data Mining Key Points
• Price Point to Achieve Market Penetration
• Database Metaphors for Building, Managing, Utilizing Extracted Patterns and Trends
• APIs for Embedding Data Mining Functionality into Applications

18

SQL-Server 2005 Algorithms

• Decision Trees
• Time Series
• Neural Net
• Clustering
• Sequence Clustering
• Association
• Naïve Bayes
• Linear and Logistic Regression

19

Project REAL

PART THREE

20

Client Profile – Inventory Forecasting

• Create a Reference Implementation of a BI System Using Real Retail Data
• Partners: Barnes & Noble, Microsoft, Scalability Experts, EMC, Unisys, Panorama, Apollo

• Forecast Out-of-Stock for 5 Book Titles Across Entire Chain (800 Stores)
• Predictive Models to Flag Items That Are Going to be Out-of-Stock
• Model on 48 Weeks of Data, Predictions for Month of December

• Models Predicted Out-of-Stock Occurrences with > 90% Accuracy
• Conservative Sales Opportunity for just 5 Titles: $6,800 per year
• Extrapolate Across Millions of Titles: Million Dollar Sales Opportunity

project real

21

Predictive Modeling Process

STEP 1
Each item belongs to a category. For the category, create a set of store clusters predictive of sales in the category.

STEP 2
Identify the cluster to which the store belongs, for the category of that item.

STEP 3
Use sales data to predict item sales 2 weeks out.

[Diagram labels: ITEM + STORE, CATEGORY]
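The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not give the clustering details, so the rank-and-cut grouping below is a simple stand-in for the store clustering model, and the names `cluster_stores`, `model_input`, and the sample store data are hypothetical.

```python
from statistics import mean

def cluster_stores(category_sales, n_clusters=3):
    """Step 1 stand-in: rank stores by mean weekly category sales, then cut
    the ranking into groups of (nearly) equal size."""
    ranked = sorted(category_sales, key=lambda store: mean(category_sales[store]))
    size = -(-len(ranked) // n_clusters)  # ceiling division
    return {store: position // size for position, store in enumerate(ranked)}

def model_input(store, item_sales, clusters, history=3):
    """Steps 2 and 3: look up the store's cluster for the item's category and
    pair it with the item's most recent weekly sales."""
    return {
        "store": store,
        "cluster": clusters[store],
        "recent_sales": item_sales[-history:],
    }

# Hypothetical weekly unit sales per store for one category.
category_sales = {
    "S1": [40, 42, 39], "S2": [5, 7, 6], "S3": [20, 18, 22],
    "S4": [41, 45, 43], "S5": [4, 6, 5], "S6": [19, 21, 23],
}
clusters = cluster_stores(category_sales)
row = model_input("S3", [2, 0, 1, 3, 2], clusters)
```

The resulting row is the kind of record a predictive model would consume to forecast the item's sales two weeks out; the forecasting model itself is not shown here.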

22

Store Clustering Demo

24

Out-of-Stock Data Preparation Summary

Apollo Explored 3 Data Preparation Strategies

1. Use Sales, On-Hand, On-Order History Data for All Stores in the Same Cluster
• Build One Mining Structure per Cluster, for All Stores in that Cluster, for Each Title
• Build One Mining Model per Store, per Cluster, for Each Title
• Negative: Few OOS Examples per Store; Computation to Deploy One Mining Model per Store/Title Combination

2. Use Sales, On-Hand, On-Order History for All Stores, Across All Clusters
• Build One Mining Structure per Book, Use Cluster Membership of Store as Input Attribute
• Positive: Optimizes OOS Examples per Title by Considering All Stores
• Negative: Does Not Capture Derivative Sales Information

3. Removed Negative of Strategy 2
• Included Historical Week-on-Week Sales Derivative Information for Each Title
• Increase the Information Content of the Source Data for Modeling

25

Creating Variables for Success

Using:
• Sales and Inventory History from January 2004 to End of November 2004
• Recommend Two (2) Years of Historical Data to Increase Accuracy for Training Model

Key:
• Store + Fiscal Year + WeekID

Predicted Variables
• 1 Week Ahead OOS Boolean
• 1 Week Ahead Sales Bin (None, 1 to 2, 3 to 4, 4+)
• 2 Week Ahead OOS Boolean
• 2 Week Ahead Sales Bin (None, 1 to 2, 3 to 4, 4+)

Input Attributes
• Store Cluster Membership (Derived from Store Cluster Model)
• Current Week Sales, On-Hand, On-Order
• Preceding 1-5 Week Sales, On-Hand, On-Order
• Sales Derivative Attributes
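A rough Python sketch of how one training row could be assembled from these variables follows. Several points are assumptions rather than facts from the slides: out-of-stock is taken to mean zero units on hand, the sales derivative is taken to be the week-on-week difference in units sold, the "4+" bin is read as more than 4 units (the slide's bin labels overlap at 4), and the names `sales_bin` and `build_row` are hypothetical. The store cluster attribute is omitted for brevity.

```python
def sales_bin(units):
    """Map weekly unit sales to the slide's bins; '4+' is read as more than 4."""
    if units == 0:
        return "None"
    if units <= 2:
        return "1 to 2"
    if units <= 4:
        return "3 to 4"
    return "4+"

def build_row(weeks, t, n_lags=5):
    """Build inputs and targets for week index t of one store and title.
    `weeks` is a list of dicts with keys sales, on_hand, on_order, in week order."""
    if t < n_lags or t + 2 >= len(weeks):
        raise IndexError("need n_lags prior weeks and 2 future weeks around t")
    row = {}
    # Current week (lag 0) and the preceding 1-5 weeks.
    for lag in range(n_lags + 1):
        for field in ("sales", "on_hand", "on_order"):
            row[f"{field}_lag{lag}"] = weeks[t - lag][field]
    # Week-on-week sales differences (assumed form of the derivative attributes).
    for lag in range(n_lags):
        row[f"sales_delta{lag}"] = weeks[t - lag]["sales"] - weeks[t - lag - 1]["sales"]
    # Predicted variables, 1 and 2 weeks ahead.
    for ahead in (1, 2):
        future = weeks[t + ahead]
        row[f"oos_{ahead}wk"] = future["on_hand"] == 0  # assumed OOS definition
        row[f"sales_bin_{ahead}wk"] = sales_bin(future["sales"])
    return row

# Hypothetical 8-week history for one store and title.
sales = [3, 2, 4, 1, 0, 5, 2, 6]
on_hand = [10, 8, 4, 9, 9, 4, 2, 0]
on_order = [0, 0, 6, 0, 0, 0, 0, 12]
weeks = [{"sales": s, "on_hand": h, "on_order": o}
         for s, h, o in zip(sales, on_hand, on_order)]
row = build_row(weeks, t=5)
```

In a full dataset each row would also carry the Store + Fiscal Year + WeekID key so that predictions can be joined back to the source tables.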

26

Model Training and Testing Scenarios

Purpose: Intelligence on Model Training Frequency

• Scenario 1: Train Models Every 2 Weeks
Training Dataset: All Data Prior to Last 2 Fiscal Weeks in December 2004
Test Dataset: Last 2 Fiscal Weeks in December 2004

• Scenario 2: Train Models Monthly
Training Dataset: All Data Prior to End of Fiscal November 2004
Test Dataset: Fiscal Month of December 2004
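Both scenarios amount to a time-based split at a cutoff week. The sketch below assumes a 52-week fiscal year in which fiscal December covers weeks 49 to 52, which is consistent with the 48 weeks of modeling data mentioned earlier but is not stated explicitly. The function name `split_by_week` and the sample rows are hypothetical.

```python
def split_by_week(rows, first_test_week):
    """Train on every week before the cutoff; test on the cutoff week onward."""
    train = [r for r in rows if r["week"] < first_test_week]
    test = [r for r in rows if r["week"] >= first_test_week]
    return train, test

# Hypothetical store/week rows for two stores over a 52-week fiscal year.
rows = [{"week": week, "store": store}
        for week in range(1, 53) for store in ("S1", "S2")]

# Scenario 1: hold out the last 2 fiscal weeks of December.
biweekly_train, biweekly_test = split_by_week(rows, first_test_week=51)

# Scenario 2: hold out the whole fiscal month of December (assumed weeks 49-52).
monthly_train, monthly_test = split_by_week(rows, first_test_week=49)
```

Splitting by time rather than at random keeps future weeks out of the training data, which is what makes the test accuracy a fair estimate of forecasting performance.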

27

Balancing Training Data

When Considering All Stores, Still Have Un-Balanced Datasets
• [# Store/Week Combinations Where OOS is False] >> [# Store/Week Combinations Where OOS is True]
• Common in Many Data Mining Applications

Training Datasets were Balanced
• Sample Store/Week Combinations Where OOS is False to Obtain Equal Proportion of True/False Values

“Costs” of Predictive Errors are Equal
• Requested by Client
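The balancing step can be illustrated as random undersampling of the majority class. This is a minimal sketch, assuming simple random sampling without replacement; the slides do not say which sampling scheme was used, and the function name `balance` and the sample rows are hypothetical.

```python
import random

def balance(rows, label="oos", seed=0):
    """Keep every row where the label is True and sample an equal number of
    rows where it is False."""
    positives = [r for r in rows if r[label]]
    negatives = [r for r in rows if not r[label]]
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    kept = rng.sample(negatives, min(len(positives), len(negatives)))
    balanced = positives + kept
    rng.shuffle(balanced)
    return balanced

# Hypothetical store/week rows in which 10% are out-of-stock.
rows = [{"store_week": i, "oos": i % 10 == 0} for i in range(200)]
balanced = balance(rows)
```

Only the training data would be balanced this way; the test data keeps its natural proportions so that reported accuracy reflects real conditions.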

28

Prediction Methods

Algorithm Selection
• Microsoft Decision Trees for Predicting OOS Boolean Flags
• Consistently High Overall Accuracy
• Straightforward Interpretation

Data Preparation
• Scenario 2
• Rebuild Models Monthly

Predictive Models are Contextual and Optimized for Behavior in the Coming Month

29

Prediction Methods

Modeling Methodology Benefits• Scalability (Titles and Stores)

• Saves 4x to 5x on Computational Cost when Rebuilding Models (versus Neural Networks)

5 Minutes for All 5 Titles => 1 Minute per Title for All Stores

30

Out-of-Stock Prediction Demo

32

Inventory Prediction Results

1 Week and 2 Week Prediction Accuracies

TITLE | OOS Week 1 | OOS Week 2 | Sales Bins Week 1 | Sales Bins Week 2
JUNIE B JONES IS A GRADUATION | 97.53% | 92.87% | 98.46% | 99.98%
CAPTAIN UNDERPANTS & THE INVA | 99.06% | 87.67% | 99.06% | 99.96%
MTH RESEARCH GDE #01 DINO | 100.00% | 83.82% | 100.00% | 100.00%
MTH RESEARCH GDE #08 TWISTERS | 98.29% | 83.60% | 99.48% | 100.00%
SECRETS OF DROON #04 CITY IN | 97.71% | 84.31% | 99.13% | 100.00%
AVERAGE ACCURACY | 98.52% | 86.45% | 99.23% | 99.99%

33

Sales Opportunity

Data Mining Created a Revenue-Generating Opportunity

Based on 5 titles for Jan 2004 - Dec 2004

• (# of weeks OOS across all stores) x (Apollo Boolean prediction accuracy)
• x (actual % of sales across all stores) x (retail price)
• = Yearly Increase in Sales Opportunity using Apollo OOS Predictions

TITLE | # of OOS | 1-2 Sales | Price | 2 Wk Pred | 1 Copy Sales | 2 Copy Sales
JUNIE B JONES IS A GRADUATION | 1,165 | 1.16% | $14.95 | 92.87% | $187.44 | $374.89
CAPTAIN UNDERPANTS & THE INVA | 10,040 | 1.01% | $17.95 | 87.67% | $1,590.96 | $3,181.93
MTH RESEARCH GDE #01 DINO | 15,227 | 0.16% | $14.95 | 83.82% | $305.13 | $610.26
MTH RESEARCH GDE #08 TWISTERS | 4,444 | 0.44% | $27.95 | 83.60% | $460.57 | $921.14
SECRETS OF DROON #04 CITY IN | 7,115 | 0.65% | $21.95 | 84.31% | $861.37 | $1,722.74
TOTAL | | | | | $3,405.48 | $6,810.95

Sales bins produced $3.4K, $6.8K potential lift in sales
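The opportunity formula can be checked against the table with a short Python calculation. The inputs below are the rounded figures shown on the slide, so the results land within a few dollars of the published totals rather than matching them to the cent; the slide presumably used unrounded percentages. The function name `opportunity` is hypothetical.

```python
# (title, weeks OOS across all stores, share of sales, retail price, 2-week accuracy)
titles = [
    ("JUNIE B JONES IS A GRADUATION", 1165, 0.0116, 14.95, 0.9287),
    ("CAPTAIN UNDERPANTS & THE INVA", 10040, 0.0101, 17.95, 0.8767),
    ("MTH RESEARCH GDE #01 DINO", 15227, 0.0016, 14.95, 0.8382),
    ("MTH RESEARCH GDE #08 TWISTERS", 4444, 0.0044, 27.95, 0.8360),
    ("SECRETS OF DROON #04 CITY IN", 7115, 0.0065, 21.95, 0.8431),
]

def opportunity(oos_weeks, sales_rate, price, accuracy, copies=1):
    """Yearly sales opportunity: OOS weeks x accuracy x sales share x price,
    scaled by the number of copies assumed sold per recovered sale."""
    return oos_weeks * accuracy * sales_rate * price * copies

one_copy_total = sum(opportunity(*t[1:]) for t in titles)
two_copy_total = sum(opportunity(*t[1:], copies=2) for t in titles)

print(f"1 copy: ${one_copy_total:,.2f}   2 copies: ${two_copy_total:,.2f}")
```

The two-copy column is simply double the one-copy column, which is why the slide quotes the opportunity as a range of roughly $3.4K to $6.8K.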

34

Client Profiles

PART FOUR

35

Client Profile – Customer Acquisition

• Decrease Subscriber Churn

• Increase New Subscriptions

• Segment Geo-Demographic and Attitudinal Behaviors for Subscribers and Non-Subscribers

• Build Predictive Models to Identify Likely New Subscribers

• Using Analysis to Deliver Targeted Marketing Campaigns for Acquisition

• Increased Stop Saves by 2%

client profiles

36

Client Profile – Cross sell / Up sell (Global Catalog Retailer)

• Increase Average Purchase Size
• Deploy Product Recommendations on their Website

• Modeling Historical Sales to Determine Product Affinities
• Incorporate Business Logic into Modeling Process (e.g. Same Category Recommendation)

• Increase Average Shopping Cart Size
• Increase Sales Lift
• Data Mining Driven Product Recommendations Performed Better than Manual Recommendations
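One simple way to picture product affinities with a business rule layered on top is a co-purchase count filtered by category. This is a sketch under assumptions, not the method used for the client: affinity is approximated by how often two products appear in the same order, and the same-category rule is applied as a filter. The function names and the sample orders are hypothetical.

```python
from collections import Counter
from itertools import combinations

def affinities(orders):
    """Count how often each pair of products appears in the same order."""
    pair_counts = Counter()
    for order in orders:
        for a, b in combinations(sorted(set(order)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

def recommend(product, orders, category, top_n=2):
    """Rank co-purchased products, keeping only those in the same category
    as the product being viewed (the business rule)."""
    scores = Counter()
    for (a, b), count in affinities(orders).items():
        if product in (a, b):
            other = b if a == product else a
            if category[other] == category[product]:
                scores[other] += count
    return [name for name, _ in scores.most_common(top_n)]

# Hypothetical historical orders and product categories.
orders = [
    ["tent", "stove", "lantern"],
    ["tent", "stove"],
    ["tent", "boots"],
    ["stove", "lantern"],
    ["tent", "lantern", "boots"],
    ["tent", "stove"],
]
category = {"tent": "camping", "stove": "camping",
            "lantern": "camping", "boots": "footwear"}
picks = recommend("tent", orders, category)
```

Here boots are bought with tents as often as lanterns are, but the same-category rule removes them from the recommendations.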

37

Client Profile – Customer Support Automation

• Increase Visibility into Customer Service Center

• Increase Speed of Customer Support

• Utilizing Text Mining Engines to Automate Processing of Customer Support (Email, Web Inquiries, etc.)

• Automating the Process of Rolling up Keywords into Concepts

• Customer Support Center has the Ability to View Trends in Minutes versus Weeks

• Improved Accuracy - Text Mining Engines Removed the Bias and Inaccuracies Often Occurring in Call Center Representative Notes and Tagging.
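Rolling keywords up into concepts can be pictured as a lookup from terms to concept labels, followed by a count per concept. The sketch below is a deliberately simple stand-in for a text mining engine: the keyword table, the concept names, and the function `concept_counts` are all hypothetical, and matching is on exact lowercase words only.

```python
import re
from collections import Counter

# Hypothetical keyword-to-concept table.
CONCEPTS = {
    "refund": "Billing", "invoice": "Billing", "charge": "Billing",
    "login": "Account Access", "password": "Account Access",
    "late": "Delivery", "shipping": "Delivery",
}

def concept_counts(messages, concepts=CONCEPTS):
    """Count how many messages mention each concept at least once."""
    counts = Counter()
    for text in messages:
        words = set(re.findall(r"[a-z]+", text.lower()))
        counts.update({concepts[w] for w in words if w in concepts})
    return counts

messages = [
    "I was charged twice, please refund me",
    "Cannot login after password reset",
    "Shipping is late again",
    "Where is my invoice?",
]
trend = concept_counts(messages)
```

Counting each concept once per message keeps a single long complaint from dominating the trend view.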

38

Client Profile – Key Driver Analysis

• Evaluate Customer Satisfaction Metrics
• Increase Customer Satisfaction

• Partnered with Apollo to Develop Market Research Database and Reporting

• Developed Models to Identify “Key” Satisfaction Drivers

• Successfully Identified Drivers to Increase Customer Satisfaction
• Delivered Driver Recommendations to Field Operations - Insight into Action
• Company Wide (sales, marketing, executive level) Visibility into Customer Satisfaction Metrics

Presented by

Jeff Kaplan

Principal Client Services

[email protected]

312.787.7376