Mailing Campaign Model. Nan Yang, University of Central Florida, 04/11/2008.

Page 1: Title

Mailing Campaign Model
Nan Yang
University of Central Florida
04/11/2008

Page 2: Overview

- Data Visualization
- Data Preparation
- Model Building
  - Variable Selection
  - Interaction
- Model Assessment
  - ROC

Page 3: Data Visualization

- 63 variables
- Target is binary, with 1 indicating people who responded to the mailing campaign
- Target is very unbalanced
  - Target rate is 1.13% for the training set
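
A quick arithmetic check of what a 1.13% target rate implies. This is an illustration added to the transcript, not part of the slides, and the function names are hypothetical. A rule that always predicts "no response" is already right about 98.87% of the time, which is one reason a ranking measure such as ROC is more informative here than raw accuracy.

```python
def always_negative_accuracy(target_rate):
    """Accuracy of a trivial rule that predicts 0 (no response) for everyone."""
    return 1.0 - target_rate


def expected_responders(target_rate, n_mailed):
    """Expected number of responders among n_mailed people at the given rate."""
    return target_rate * n_mailed


# Target rate reported for the training set on this page.
TRAINING_TARGET_RATE = 0.0113
```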

Page 4: Data Visualization

- Categorical variables
  - Variables with many levels
    - x2 ~ 57 levels
    - DATE variables (x10 & x11) ~ over 100 levels
  - Missing values
    - DATE variables ~ 30%-70% missing
    - Some variables have missing values coded as "Unknown" or "Uncoded", e.g. x20
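
The checks on this page can be sketched as below, assuming each variable is held as a plain Python list. The set of missing codes and the function names are assumptions for illustration; the slides only name "Unknown" and "Uncoded".

```python
# Strings treated as missing (assumption: compared case-insensitively).
MISSING_CODES = {"unknown", "uncoded", ""}


def normalize_missing(values):
    """Replace coded missing values such as 'Unknown' with None."""
    return [
        None if v is None or str(v).strip().lower() in MISSING_CODES else v
        for v in values
    ]


def profile(values):
    """Return the number of distinct levels and the share of missing values."""
    cleaned = normalize_missing(values)
    present = [v for v in cleaned if v is not None]
    n = len(cleaned)
    return {
        "levels": len(set(present)),
        "missing_rate": (n - len(present)) / n if n else 0.0,
    }
```

Running `profile` on every categorical column would flag both the many-level variables and the heavily missing ones in a single pass.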

Page 5: Data Visualization

- Interval variables
  - Skewness

Page 6: Data Preparation

- Missing Value Indicator (MVI)
  - Variables with > 5% missing
  - Binary
  - Capture the missing-value information
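
A minimal sketch of the MVI step, assuming the data is a dict mapping column names to lists with None marking a missing value. The `_mvi` suffix and the function names are hypothetical; the 5% threshold is the one stated on the slide.

```python
def missing_rate(column):
    """Share of entries in the column that are None."""
    return sum(v is None for v in column) / len(column) if column else 0.0


def add_missing_indicators(table, threshold=0.05):
    """Add a binary <name>_mvi column for each column with > threshold missing."""
    out = dict(table)
    for name, column in table.items():
        if missing_rate(column) > threshold:
            # 1 where the original value is missing, 0 otherwise.
            out[f"{name}_mvi"] = [1 if v is None else 0 for v in column]
    return out
```

The indicators are created before imputation, so the fact that a value was missing stays available to the model even after the gap is filled.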

Page 7: Data Preparation

- Imputation
  - Unconditional imputation
  - Categorical variables
    - Tree / tree surrogate
  - Interval variables
    - Cluster
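
Only the simplest option on this page, unconditional imputation, is sketched here. The slides do not say which fill value was used, so the median for interval variables and the most frequent level for categorical variables are assumptions. The tree, tree-surrogate, and cluster methods are not shown.

```python
from statistics import median, mode


def impute_unconditional(column, kind):
    """Fill None entries with one value computed from the observed entries.

    kind: "interval" uses the median, anything else uses the most frequent level.
    """
    observed = [v for v in column if v is not None]
    if not observed:
        return list(column)  # nothing to learn a fill value from
    fill = median(observed) if kind == "interval" else mode(observed)
    return [fill if v is None else v for v in column]
```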

Page 8: Data Preparation

- Transformation
  - Right skewed
    - Log or square root transformation
  - Left skewed
    - Square transformation
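
The transformations above can be sketched as follows. The skewness formula is the standard moment-based one, and the use of log(1 + x) so that zeros are allowed is an assumption; the slides only say "log".

```python
import math


def skewness(xs):
    """Moment-based sample skewness: m3 / m2**1.5 (0.0 for constant data)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def transform(xs, how):
    """Apply one of the transformations named on the slide."""
    if how == "log":
        return [math.log1p(x) for x in xs]  # log(1 + x); assumes x >= 0
    if how == "sqrt":
        return [math.sqrt(x) for x in xs]  # assumes x >= 0
    if how == "square":
        return [x * x for x in xs]
    raise ValueError(f"unknown transformation: {how}")
```

Positive skewness suggests trying "log" or "sqrt", negative skewness suggests "square"; recomputing `skewness` afterwards shows whether the transformation helped.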

Page 9: Model Building

- Variable selection
  - Individual predictive power
  - Logistic backward elimination
    - Keep the potential interaction terms
  - Logistic stepwise selection
  - Tree
    - Different criteria
- 21 variables selected

Page 10: Model Building

- Interactions
  - SAS EMiner Regression node
  - 11 interaction terms selected
- Model
  - Ensemble of different logistic models
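
One common way to ensemble logistic models is to average their predicted probabilities. The slides do not say how the ensemble was formed, so the averaging below is an assumption, as are the model layout and the "a*b" naming for an interaction term (the product of two variables).

```python
import math


def feature_value(row, name):
    """Value of a feature; a name like 'x1*x2' is the product interaction."""
    value = 1.0
    for part in name.split("*"):
        value *= row[part]
    return value


def predict_proba(model, row):
    """Logistic model: sigmoid of intercept plus weighted feature values."""
    z = model["intercept"] + sum(
        weight * feature_value(row, name) for name, weight in model["coef"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def ensemble_proba(models, row):
    """Average the predicted response probability across the models."""
    return sum(predict_proba(m, row) for m in models) / len(models)
```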

Page 11: Model Assessment

- AUC = 0.66

[Figure: ROC curve. Vertical axis: Sensitivity, 0.0 to 1.0. Horizontal axis: 1 - Specificity, 0.0 to 1.0.]
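
For reference, a ROC curve and its AUC can be computed from labels and model scores as sketched below. This is a generic illustration with hypothetical function names, not the SAS output behind the figure. It assumes labels are 0/1 and that both classes are present.

```python
def roc_points(labels, scores):
    """Return (1 - specificity, sensitivity) points, threshold swept high to low.

    Observations with tied scores are added together, giving one point per
    distinct score value.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            fp += 1 - pairs[j][1]
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.66 sits between the two.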

Page 12: Acknowledgement

- UCF Statistics Dept
- BlueCross BlueShield of FL