
1

Chapter 4: Introduction to Predictive Modeling: Regressions

4.1 Introduction

4.2 Selecting Regression Inputs

4.3 Optimizing Regression Complexity

4.4 Interpreting Regression Models

4.5 Transforming Inputs

4.6 Categorical Inputs

4.7 Polynomial Regressions (Self-Study)

2

Chapter 4: Introduction to Predictive Modeling: Regressions

4.1 Introduction

4.2 Selecting Regression Inputs

4.3 Optimizing Regression Complexity

4.4 Interpreting Regression Models

4.5 Transforming Inputs

4.6 Categorical Inputs

4.7 Polynomial Regressions (Self-Study)

3

Model Essentials – Regressions

Predict new cases.

Select useful inputs.

Optimize complexity.

...

4

Model Essentials – Regressions

Predict new cases: prediction formula.

Select useful inputs: sequential selection.

Optimize complexity: best model from sequence.

6

Linear Regression Prediction Formula

$$\hat{y} = \hat{w}_0 + \hat{w}_1 \cdot x_1 + \hat{w}_2 \cdot x_2$$

where $\hat{y}$ is the prediction estimate, $\hat{w}_0$ is the intercept estimate, $\hat{w}_1$ and $\hat{w}_2$ are parameter estimates, and $x_1$, $x_2$ are input measurements.

Choose the intercept and parameter estimates to minimize the squared error function over the training data:

$$\sum_{\text{training data}} ( y_i - \hat{y}_i )^2$$
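To see this criterion in action, here is a minimal sketch, assuming nothing beyond numpy and some hypothetical synthetic data: it fits the intercept and parameter estimates by ordinary least squares.

```python
# A minimal sketch: fit intercept and parameter estimates by
# minimizing the squared error function with numpy.
import numpy as np

rng = np.random.default_rng(0)
x1, x2 = rng.random(100), rng.random(100)
y = 0.5 + 1.2 * x1 - 0.7 * x2 + rng.normal(0, 0.1, 100)  # hypothetical target

X = np.column_stack([np.ones_like(x1), x1, x2])  # intercept column, then inputs
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)    # minimizes sum of (y_i - yhat_i)^2
y_hat = X @ w_hat                                # prediction estimates
```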


8

Logistic Regression Prediction Formula

$$\log\!\left( \frac{\hat{p}}{1 - \hat{p}} \right) = \hat{w}_0 + \hat{w}_1 \cdot x_1 + \hat{w}_2 \cdot x_2$$

The linear combination on the right produces logit scores.

9

Logit Link Function

$$\log\!\left( \frac{\hat{p}}{1 - \hat{p}} \right) = \hat{w}_0 + \hat{w}_1 \cdot x_1 + \hat{w}_2 \cdot x_2$$

[Plot: the logit link function, mapping probabilities on (0, 1) to logit scores on roughly −5 to 5]

The logit link function transforms probabilities (between 0 and 1) to logit scores (between −∞ and +∞).


11

Logit Link Function

$$\mathrm{logit}(\hat{p}) = \log\!\left( \frac{\hat{p}}{1 - \hat{p}} \right) = \hat{w}_0 + \hat{w}_1 \cdot x_1 + \hat{w}_2 \cdot x_2$$

To obtain prediction estimates, the logit equation is solved for $\hat{p}$:

$$\hat{p} = \frac{1}{1 + e^{-\mathrm{logit}(\hat{p})}}$$
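As a concrete illustration, the same conversion in Python; the parameter values are hypothetical placeholders, not estimates from the course data.

```python
# A minimal sketch: convert a logit score to a prediction estimate.
# The parameter estimates w0, w1, w2 are hypothetical placeholders.
import math

def predict_probability(x1, x2, w0=-0.81, w1=0.92, w2=1.11):
    logit = w0 + w1 * x1 + w2 * x2          # logit score
    return 1.0 / (1.0 + math.exp(-logit))   # logit equation solved for p

print(predict_probability(0.3, 0.7))
```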


14

Simple Prediction Illustration – Regressions

Predict dot color for each x1 and x2.

You need intercept and parameter estimates.

[Scatter plot: training cases on x1 and x2, each axis from 0.0 to 1.0, with contour labels 0.40, 0.50, 0.60, 0.70]


16

Simple Prediction Illustration – Regressions

Find parameter estimates by maximizing the log-likelihood function.

[Scatter plot: training cases on x1 and x2, each axis from 0.0 to 1.0, with contour labels 0.40, 0.50, 0.60, 0.70]
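The criterion being maximized is the Bernoulli log-likelihood; a standard form, stated here for reference rather than reproduced from the slide:

$$\log L = \sum_{\text{training data}} \Big[\, y_i \log \hat{p}_i + (1 - y_i) \log\big(1 - \hat{p}_i\big) \,\Big]$$

where $\hat{p}_i$ is the predicted probability for case $i$.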


18

Simple Prediction Illustration – Regressions

Using the maximum likelihood estimates, the prediction formula assigns a logit score to each x1 and x2.

[Scatter plot: fitted logit scores over the x1, x2 grid, with probability contours 0.40, 0.50, 0.60, 0.70]


20

4.01 Multiple Choice Poll

What is the logistic regression prediction for the indicated point?

a. 0.243

b. 0.56

c. yellow

d. It depends.

[Scatter plot: x1 and x2 axes from 0.0 to 1.0, probability contours 0.40, 0.50, 0.60, 0.70, with one point indicated]

21

4.01 Multiple Choice Poll – Correct Answer

What is the logistic regression prediction for the indicated point?

a. 0.243

b. 0.56

c. yellow

d. It depends.


22

Regressions: Beyond the Prediction Formula

Manage missing values.

Interpret the model.

Account for nonlinearities.

Handle extreme or unusual values.

Use nonnumeric inputs.

...


24

Missing Values and Regression Modeling

[Training data: inputs and target]

Problem 1: Training data cases with missing values on inputs used by a regression model are ignored.

Consequence: Missing values can significantly reduce your amount of training data for regression modeling!


27

Missing Values and the Prediction Formula

Predict: (x1, x2) = (0.3, ? )

Problem 2: Prediction formulas cannot score cases with missing values.

...


31

Missing Value Issues

Manage missing values.

Problem 1: Training data cases with missing values on inputs used by a regression model are ignored.

...

Problem 2: Prediction formulas cannot score cases with missing values.


33

Missing Value Causes

Manage missing values.

Non-applicable measurement

No match on merge

Non-disclosed measurement

...

34

Missing Value Remedies

Manage missing values.

Synthetic value imputation: $x_i = f(x_1, \ldots, x_p)$

Non-applicable measurement

No match on merge

Non-disclosed measurement

...

35

Managing Missing Values

This demonstration illustrates how to impute synthetic data values and create missing value indicators.
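As a rough equivalent outside the tool, here is a minimal pandas sketch, assuming a DataFrame `df`; the helper name and indicator prefix are illustrative choices, not part of the course software.

```python
# A minimal sketch: impute interval inputs with the mean and categorical
# inputs with the mode, adding a unique replacement indicator for every
# input that had missing values.
import pandas as pd

def impute_with_indicators(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    for col in df.columns:
        if not df[col].isna().any():
            continue
        out["M_" + col] = df[col].isna().astype(int)      # missing indicator
        if pd.api.types.is_numeric_dtype(df[col]):
            out[col] = df[col].fillna(df[col].mean())     # interval: mean
        else:
            out[col] = df[col].fillna(df[col].mode()[0])  # categorical: mode
    return out
```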

36

Running the Regression Node

This demonstration illustrates using the Regression tool.

37

Chapter 4: Introduction to Predictive Modeling: Regressions

4.1 Introduction

4.2 Selecting Regression Inputs

4.3 Optimizing Regression Complexity

4.4 Interpreting Regression Models

4.5 Transforming Inputs

4.6 Categorical Inputs

4.7 Polynomial Regressions (Self-Study)

38

Model Essentials – Regressions

Predict new cases: prediction formula.

Select useful inputs: sequential selection.

Optimize complexity: best model from sequence.

39

Sequential Selection – Forward

Input p-values are compared to the entry cutoff.

Forward selection starts with no inputs. At each step, the candidate input with the smallest p-value is added, until no remaining input has a p-value below the entry cutoff.

44

Sequential Selection – Backward

Input p-values are compared to the stay cutoff.

Backward selection starts with all inputs in the model. At each step, the input with the largest p-value is removed, until every remaining input has a p-value below the stay cutoff.

52

Sequential Selection – Stepwise

Input p-values are compared to the entry and stay cutoffs.

Stepwise selection combines the forward and backward approaches: inputs are added as in forward selection, but after each addition, any input in the model whose p-value rises above the stay cutoff is removed. A sketch of the selection loop follows below.
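To make the selection loop concrete, here is a hedged sketch of the forward step with statsmodels, assuming a pandas DataFrame `df` with a binary target and numeric candidate inputs; the function name and cutoff default are assumptions, and the Regression node uses SAS's own implementation rather than this code.

```python
# A minimal sketch of forward selection for a logistic regression.
import statsmodels.api as sm

def forward_select(df, target, candidates, entry_cutoff=0.05):
    selected, remaining = [], list(candidates)
    y = df[target]
    while remaining:
        # Wald p-value of each candidate when added to the current model
        pvals = {}
        for c in remaining:
            X = sm.add_constant(df[selected + [c]])
            pvals[c] = sm.Logit(y, X).fit(disp=0).pvalues[c]
        best = min(pvals, key=pvals.get)
        if pvals[best] > entry_cutoff:
            break          # nothing left clears the entry cutoff
        selected.append(best)
        remaining.remove(best)
    return selected
```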


60

4.02 Poll

The three sequential selection methods for building regression models can never lead to the same model for the same set of data.

True

False

61

4.02 Poll – Correct Answer

The three sequential selection methods for building regression models can never lead to the same model for the same set of data.

True

False (correct: the methods can select the same model, for example when the same subset of inputs clears both cutoffs at every step)

62

Selecting Inputs

This demonstration illustrates using stepwise selection to choose inputs for the model.

63

Chapter 4: Introduction to Predictive Modeling: Regressions

4.1 Introduction

4.2 Selecting Regression Inputs

4.3 Optimizing Regression Complexity

4.4 Interpreting Regression Models

4.5 Transforming Inputs

4.6 Categorical Inputs

4.7 Polynomial Regressions (Self-Study)

64

Model Essentials – Regressions

Predict new cases: prediction formula.

Select useful inputs: sequential selection.

Optimize complexity: best model from sequence.

65

Model Fit versus Complexity

[Plot: a model fit statistic versus model sequence steps 1-6, shown separately for training and validation data]

66

Select Model with Optimal Validation Fit

Evaluate each sequence step.

[Plot: the fit statistic by sequence step 1-6; the step with the best validation fit is selected]

67

Optimizing Complexity

This demonstration illustrates tuning a regression model to give optimal performance on the validation data.
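Outside the tool, the same idea is just an argmax over validation fit; a minimal sketch, under the assumption that `fits` maps each sequence step to a fitted model exposing a scikit-learn-style `score` method (all names here are hypothetical).

```python
# A minimal sketch: pick the sequence step whose model fits the
# validation data best. `fits`, X_valid, and y_valid are assumed inputs.
def best_step(fits, X_valid, y_valid):
    scores = {step: model.score(X_valid, y_valid) for step, model in fits.items()}
    return max(scores, key=scores.get)  # step with optimal validation fit
```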

68

Chapter 4: Introduction to Predictive Modeling: Regressions

4.1 Introduction

4.2 Selecting Regression Inputs

4.3 Optimizing Regression Complexity

4.4 Interpreting Regression Models

4.5 Transforming Inputs

4.6 Categorical Inputs

4.7 Polynomial Regressions (Self-Study)

69

Beyond the Prediction Formula

Manage missing values.

Interpret the model.

Account for nonlinearities.

Handle extreme or unusual values.

Use nonnumeric inputs.

...


71

Logistic Regression Prediction Formula

$$\log\!\left( \frac{\hat{p}}{1 - \hat{p}} \right) = \hat{w}_0 + \hat{w}_1 \cdot x_1 + \hat{w}_2 \cdot x_2$$

The linear combination on the right produces logit scores.

72

Odds Ratios and Doubling Amounts

Odds ratio: the amount the odds change with a unit change in an input.

Doubling amount: how much an input must change to double the odds.

$\Delta x_i = 1$: odds change by a factor of $\exp(\hat{w}_i)$.

$\Delta x_i = 0.69 / \hat{w}_i$: odds change by a factor of 2.

$$\log\!\left( \frac{\hat{p}}{1 - \hat{p}} \right) = \hat{w}_0 + \hat{w}_1 \cdot x_1 + \hat{w}_2 \cdot x_2 \quad \text{(logit scores)}$$
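Both rules follow from the logit equation above; a short derivation, standard algebra stated for reference:

$$\frac{\mathrm{odds}(x_i + \Delta x_i)}{\mathrm{odds}(x_i)} = e^{\hat{w}_i \Delta x_i}, \qquad e^{\hat{w}_i \Delta x_i} = 2 \;\iff\; \Delta x_i = \frac{\ln 2}{\hat{w}_i} \approx \frac{0.69}{\hat{w}_i}$$

Setting $\Delta x_i = 1$ gives the odds ratio $e^{\hat{w}_i}$.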

73

Interpreting a Regression Model

This demonstration illustrates interpreting a regression model using odds ratios.

74

Chapter 4: Introduction to Predictive Modeling: Regressions

4.1 Introduction

4.2 Selecting Regression Inputs

4.3 Optimizing Regression Complexity

4.4 Interpreting Regression Models

4.5 Transforming Inputs

4.6 Categorical Inputs

4.7 Polynomial Regressions (Self-Study)

75

Beyond the Prediction Formula

Manage missing values.

Interpret the model.

Account for nonlinearities.

Handle extreme or unusual values.

Use nonnumeric inputs.

...

76

Extreme Distributions and Regressions

[Plot, original input scale: a skewed input distribution with high leverage points; the standard regression fit deviates from the true association]

[Plot, regularized scale: a more symmetric distribution]

78

Regularizing Input Transformations

[Plots: on the original input scale, the skewed distribution and high leverage points pull the standard regression away from the true association; on the regularized scale, the regularized estimate tracks the true association closely]


82

4.03 Multiple Choice Poll

Which statement below is true about transformations of input variables in a regression analysis?

a. They are never a good idea.

b. They help model assumptions match the assumptions of maximum likelihood estimation.

c. They are performed to reduce the bias in model predictions.

d. They typically are done on nominal (categorical) inputs.

83

4.03 Multiple Choice Poll – Correct Answer

Which statement below is true about transformations of input variables in a regression analysis?

a. They are never a good idea.

b. They help model assumptions match the assumptions of maximum likelihood estimation.

c. They are performed to reduce the bias in model predictions.

d. They typically are done on nominal (categorical) inputs.

84

Transforming Inputs

This demonstration illustrates using the Transform Variables tool to apply standard transformations to a set of inputs.
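The Transform Variables tool applies these in the GUI; as a rough pandas equivalent for the log transformation mentioned in the chapter review, a minimal sketch in which `df` and the list `skewed` of right-skewed interval inputs are assumed inputs:

```python
# A minimal sketch: regularize right-skewed interval inputs with a
# log transformation.
import numpy as np
import pandas as pd

def log_transform(df: pd.DataFrame, skewed: list) -> pd.DataFrame:
    out = df.copy()
    for col in skewed:
        out["LOG_" + col] = np.log1p(out[col])  # log(1 + x) tolerates zeros
    return out
```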

85

Chapter 4: Introduction to Predictive Modeling: Regressions

4.1 Introduction

4.2 Selecting Regression Inputs

4.3 Optimizing Regression Complexity

4.4 Interpreting Regression Models

4.5 Transforming Inputs

4.6 Categorical Inputs

4.7 Polynomial Regressions (Self-Study)

86

Beyond the Prediction Formula

Manage missing values.

Interpret the model.

Account for nonlinearities.

Handle extreme or unusual values.

Use nonnumeric inputs.

...


88

Nonnumeric Input Coding

Level   DA  DB  DC  DD  DE  DF  DG  DH  DI
A        1   0   0   0   0   0   0   0   0
B        0   1   0   0   0   0   0   0   0
C        0   0   1   0   0   0   0   0   0
D        0   0   0   1   0   0   0   0   0
E        0   0   0   0   1   0   0   0   0
F        0   0   0   0   0   1   0   0   0
G        0   0   0   0   0   0   1   0   0
H        0   0   0   0   0   0   0   1   0
I        0   0   0   0   0   0   0   0   1

Each level of the input gets its own 0/1 dummy variable.

89

Coding Redundancy

In the table above, DI is redundant: it equals 1 exactly when DA through DH are all 0, so it is completely determined by the other dummies and can be dropped. Level I then serves as the reference level.


91

Coding Consolidation

Levels that behave similarly can share a dummy variable. Consolidating A-D, E-F, and G-H:

Level   DABCD  DEF  DGH
A-D         1    0    0
E-F         0    1    0
G-H         0    0    1
I           0    0    0
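A rough equivalent in pandas; the level grouping mirrors the slide, while the variable names are illustrative assumptions.

```python
# A minimal sketch: dummy-code a nonnumeric input after consolidating
# levels A-D, E-F, and G-H, with level I as the reference level.
import pandas as pd

level = pd.Series(list("ABCDEFGHI"), name="level")
groups = {**dict.fromkeys("ABCD", "ABCD"),
          **dict.fromkeys("EF", "EF"),
          **dict.fromkeys("GH", "GH"),
          "I": "I"}
dummies = pd.get_dummies(level.map(groups), prefix="D").drop(columns="D_I")
print(dummies)
```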

92

Recoding Categorical Inputs

This demonstration illustrates using the Replacement tool to facilitate the process of combining input levels.

93

Chapter 4: Introduction to Predictive Modeling: Regressions

4.1 Introduction

4.2 Selecting Regression Inputs

4.3 Optimizing Regression Complexity

4.4 Interpreting Regression Models

4.5 Transforming Inputs

4.6 Categorical Inputs

4.7 Polynomial Regressions (Self-Study)

94

Beyond the Prediction Formula

Manage missing values.

Interpret the model.

Account for nonlinearities.

Handle extreme or unusual values.

Use nonnumeric inputs.

...


96

Standard Logistic Regression

$$\log\!\left( \frac{\hat{p}}{1 - \hat{p}} \right) = \hat{w}_0 + \hat{w}_1 \cdot x_1 + \hat{w}_2 \cdot x_2$$

[Scatter plot: x1 and x2 axes from 0.0 to 1.0, with straight, parallel probability contours 0.40, 0.50, 0.60, 0.70]

97

Polynomial Logistic Regression

$$\log\!\left( \frac{\hat{p}}{1 - \hat{p}} \right) = \hat{w}_0 + \hat{w}_1 \cdot x_1 + \hat{w}_2 \cdot x_2 + \underbrace{\hat{w}_3 \cdot x_1^2 + \hat{w}_4 \cdot x_2^2 + \hat{w}_5 \cdot x_1 x_2}_{\text{quadratic terms}}$$

[Scatter plot: x1 and x2 axes from 0.0 to 1.0, with curved probability contours from 0.30 to 0.80]
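The quadratic expansion is easy to build by hand or with scikit-learn; a minimal sketch assuming hypothetical numpy arrays x1 and x2:

```python
# A minimal sketch: add quadratic terms (squares and the interaction)
# to the design matrix for two inputs.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x1, x2 = rng.random(100), rng.random(100)

X = np.column_stack([x1, x2])
X_quad = np.column_stack([x1, x2, x1**2, x2**2, x1 * x2])  # by hand

# the same expansion (column order differs), generated automatically
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
```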

98

Adding Polynomial Regression Terms Selectively

This demonstration illustrates how to add polynomial regression terms selectively.

99

Adding Polynomial Regression Terms Autonomously (Self-Study)

This demonstration illustrates how to add polynomial regression terms autonomously.

100

Exercises

This exercise reinforces the concepts discussed previously.

101

Regression Tools Review

Replace missing values for interval (means) and categorical data (mode). Create a unique replacement indicator.

Create linear and logistic regression models. Select inputs with a sequential selection method and appropriate fit statistic. Interpret models with odds ratios.

Regularize distributions of inputs. Typical transformations control for input skewness via a log transformation.

continued...

102

Regression Tools Review

Consolidate levels of a nonnumeric input using the Replacement Editor window.

Add polynomial terms to a regression either by hand or by an autonomous exhaustive search.