Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.
![Page 1: Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.](https://reader036.fdocuments.in/reader036/viewer/2022080905/56649e7a5503460f94b7a09b/html5/thumbnails/1.jpg)
Mailing Campaign Model
Nan Yang
University of Central Florida
04/11/2008
![Page 2: Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.](https://reader036.fdocuments.in/reader036/viewer/2022080905/56649e7a5503460f94b7a09b/html5/thumbnails/2.jpg)
Overview
- Data Visualization
- Data Preparation
- Model Building
  - Variable Selection
  - Interaction
- Model Assessment
  - ROC
![Page 3: Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.](https://reader036.fdocuments.in/reader036/viewer/2022080905/56649e7a5503460f94b7a09b/html5/thumbnails/3.jpg)
Data Visualization
- 63 variables
- Target is binary, with 1 indicating that the person responded to the mailing campaign
- Target is very unbalanced
  - Target rate is 1.13% for the training set
![Page 4: Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.](https://reader036.fdocuments.in/reader036/viewer/2022080905/56649e7a5503460f94b7a09b/html5/thumbnails/4.jpg)
Data Visualization
- Categorical variables
  - High-level variables
    - x2 ~ 57 levels
    - DATE variables (x10 & x11) ~ over 100 levels
  - Missing values
    - DATE variables ~ 30%-70% missing
    - Some variables have missing values coded as "Unknown" or "Uncoded", e.g. x20
![Page 5: Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.](https://reader036.fdocuments.in/reader036/viewer/2022080905/56649e7a5503460f94b7a09b/html5/thumbnails/5.jpg)
Data Visualization
- Interval variables
  - Skewness
![Page 6: Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.](https://reader036.fdocuments.in/reader036/viewer/2022080905/56649e7a5503460f94b7a09b/html5/thumbnails/6.jpg)
Data Preparation
- Missing Value Indicator (MVI)
  - Created for variables with > 5% missing
  - Binary
  - Captures the missing-value information
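The MVI step above can be sketched in a few lines of Python. This is only an illustration, not the code used for the presentation: the function name `add_missing_indicators`, the `_mvi` suffix, the use of `None` for a missing value, and the toy columns are all assumptions. Only the 5% threshold and the binary coding come from the slide.

```python
from typing import Dict, List, Optional


def add_missing_indicators(
    columns: Dict[str, List[Optional[float]]],
    threshold: float = 0.05,
) -> Dict[str, List[int]]:
    """Build a binary indicator for every variable whose share of
    missing values (None) is strictly greater than `threshold`."""
    indicators: Dict[str, List[int]] = {}
    for name, values in columns.items():
        if not values:
            continue  # nothing to measure for an empty column
        missing_share = sum(v is None for v in values) / len(values)
        if missing_share > threshold:
            # 1 marks a missing value, 0 marks an observed value
            indicators[f"{name}_mvi"] = [1 if v is None else 0 for v in values]
    return indicators


# Toy data (hypothetical): x10 is half missing, x5 is complete.
toy = {
    "x10": [None, 3.0, None, 7.0],
    "x5": [1.0, 2.0, 3.0, 4.0],
}
mvi = add_missing_indicators(toy)
```

Here only `x10` receives an indicator column, because `x5` has no missing values.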
![Page 7: Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.](https://reader036.fdocuments.in/reader036/viewer/2022080905/56649e7a5503460f94b7a09b/html5/thumbnails/7.jpg)
Data Preparation
- Imputation
  - Unconditional imputation
  - Categorical variable
    - Tree/Tree Surrogate
  - Interval variable
    - Cluster
![Page 8: Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.](https://reader036.fdocuments.in/reader036/viewer/2022080905/56649e7a5503460f94b7a09b/html5/thumbnails/8.jpg)
Data Preparation
- Transformation
  - Right skewed
    - Log or square root transformation
  - Left skewed
    - Square transformation
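A minimal sketch of the transformation rule above, under several assumptions that are not in the slides: the helper names `skewness` and `transform_for_skew`, the skewness cutoff of 1.0, the use of `log1p` (log of 1 + x) in place of a plain log so that zeros are handled, and non-negative inputs.

```python
import math
from typing import List


def skewness(values: List[float]) -> float:
    """Population (moment-based) skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def transform_for_skew(values: List[float], cutoff: float = 1.0) -> List[float]:
    """Apply log1p to right-skewed data and squaring to left-skewed data.

    Assumes non-negative values; `cutoff` is an illustrative choice.
    """
    s = skewness(values)
    if s > cutoff:
        return [math.log1p(v) for v in values]  # pulls in a long right tail
    if s < -cutoff:
        return [v * v for v in values]  # stretches the upper end
    return list(values)  # roughly symmetric: leave unchanged


right_skewed = [1.0, 2.0, 3.0, 4.0, 50.0]
left_skewed = [1.0, 47.0, 48.0, 49.0, 50.0]
logged = transform_for_skew(right_skewed)
squared = transform_for_skew(left_skewed)
```

A square root could be swapped in for `log1p` when the right skew is milder.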
![Page 9: Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.](https://reader036.fdocuments.in/reader036/viewer/2022080905/56649e7a5503460f94b7a09b/html5/thumbnails/9.jpg)
Model Building
- Variable selection
  - Individual predictive power
  - Logistic backward elimination
    - Keep the potential interaction terms
  - Logistic stepwise selection
  - Tree
    - Different criteria
- 21 variables selected
![Page 10: Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.](https://reader036.fdocuments.in/reader036/viewer/2022080905/56649e7a5503460f94b7a09b/html5/thumbnails/10.jpg)
Model Building
- Interactions
  - SAS EMiner Regression node
  - 11 interaction terms selected
- Model
  - Ensemble of different logistic models
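The slides do not say how the logistic models were combined. One common choice, assumed here purely for illustration, is to average the predicted response probabilities. The function names, coefficients, and input row below are hypothetical and do not come from the fitted models.

```python
import math
from typing import Dict, List, Tuple

# A model is (intercept, {variable name: coefficient}); hypothetical layout.
LogisticModel = Tuple[float, Dict[str, float]]


def logistic_score(model: LogisticModel, row: Dict[str, float]) -> float:
    """Predicted probability from one logistic model."""
    intercept, coefs = model
    z = intercept + sum(coef * row[name] for name, coef in coefs.items())
    return 1.0 / (1.0 + math.exp(-z))


def ensemble_score(models: List[LogisticModel], row: Dict[str, float]) -> float:
    """Average the predicted probabilities of all member models."""
    scores = [logistic_score(m, row) for m in models]
    return sum(scores) / len(scores)


# Two made-up models; the second includes an interaction term x1*x2,
# which is supplied as a precomputed column in the row.
models = [
    (-4.0, {"x1": 0.5}),
    (-4.5, {"x1": 0.3, "x1*x2": 0.2}),
]
row = {"x1": 2.0, "x2": 1.5, "x1*x2": 3.0}
p = ensemble_score(models, row)
```

Averaging keeps the output on the probability scale, so the ensemble can be ranked and assessed in the same way as a single model.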
![Page 11: Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.](https://reader036.fdocuments.in/reader036/viewer/2022080905/56649e7a5503460f94b7a09b/html5/thumbnails/11.jpg)
Model Assessment
- AUC = 0.66
[Figure: ROC curve. Vertical axis: Sensitivity (0.0 to 1.0); horizontal axis: 1 - Specificity (0.0 to 1.0).]
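For reference, the area under the ROC curve equals the probability that a randomly chosen responder is scored higher than a randomly chosen non-responder, with ties counted as one half. The sketch below computes it directly from that definition. The function name `roc_auc` and the toy labels and scores are assumptions and are unrelated to the 0.66 reported above.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the share of (responder, non-responder) pairs in which the
    responder has the higher score; ties count as half a pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one responder and one non-responder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores): 3 of the 4 pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the sample size, so a rank-based formula is preferable on a full training set. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.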
![Page 12: Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008.](https://reader036.fdocuments.in/reader036/viewer/2022080905/56649e7a5503460f94b7a09b/html5/thumbnails/12.jpg)
Acknowledgement
- UCF Statistics Dept
- BlueCross BlueShield of FL