Lecture 10.1.Key

7/23/2019 Lecture 10.1.Key

Transforms Revisited

Transforms are used to

Change the mean function so that it is linear.

Adjust for a non-constant variance problem.

Fix non-Normal residuals.

Although you won't always solve all three problems (or any problem, for that matter).

You've already studied

log transforms and square-root transforms

Now

we're going to consider a more general class of transforms and discuss strategies for finding the best transform.

Strategy

First, transform Y.

If that doesn't work, transform the predictors, but not Y.

If that improves things but not perfectly, see if you can now transform Y.

There are also approaches that consider transforming ALL variables simultaneously.

Keep in mind

Don't remove outliers, influential points, etc. until the transforming is done. These points might not really be so outlying once the transform is done.

Keep in Mind

Simple is better than complicated.

If you are expected to interpret the parameters, then transformations might make this impossible.

Transform Y

Basic idea: What if

$$E(Y \mid X) \neq \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$$

but instead:

$$E(Y \mid X) = g(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)$$

so we need to discover g().

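$$E(Y \mid X) = g(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)$$

If we knew g(), we could invert it:

$$g^{-1}(E(Y \mid X)) = g^{-1}(g(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p))$$

$$Y_{new} = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$$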
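
The inversion idea can be checked on simulated data. What follows is a minimal sketch in base R, assuming purely for illustration that g is the cube function, so that g inverse is the cube root. The data, coefficients, and noise level are made up and are not from the lecture.

```r
# Simulated illustration: Y is g(linear predictor + noise) with g(u) = u^3.
# The choice of g, the coefficients, and the noise level are all assumptions.
set.seed(10)
n   <- 500
x1  <- rnorm(n)
x2  <- rnorm(n)
lin <- 8 + 1.0 * x1 + 0.5 * x2          # beta0 + beta1*x1 + beta2*x2
y   <- (lin + rnorm(n, sd = 0.2))^3     # response on the original scale

# Applying g^{-1} (the cube root) gives a response that is linear in x1 and x2.
y_new    <- y^(1/3)
fit_new  <- lm(y_new ~ x1 + x2)
coef_new <- coef(fit_new)               # should be close to (8, 1.0, 0.5)

# For comparison, the untransformed fit explains less of the variation.
fit_raw <- lm(y ~ x1 + x2)
r2 <- c(raw         = summary(fit_raw)$r.squared,
        transformed = summary(fit_new)$r.squared)
```

In practice g is unknown, which is why the two approaches that follow try to estimate it from the data.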

Transform Y: 2 approaches

Inverse Response Plots

Box-Cox Method

Inverse Response Plots

A technique for guessing g().

If the predictors have an elliptically symmetric distribution (joint Normal is one example of this), then plot y-hat against y.

The shape of the resulting curve gives you an idea as to the shape of g inverse.
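
A by-hand version of this idea, as a hedged sketch on simulated data: fit the model on the original scale, then look for the power of y that best straightens the plot of y-hat against y. The data, the true g (a cube), and the helper name `scaled_power()` are assumptions for illustration. A packaged routine may differ in its details.

```r
# Simulated data; the true g(u) = u^3 is an assumption made for this sketch.
set.seed(1)
n  <- 300
x1 <- rnorm(n)                       # jointly Normal predictors, so elliptically symmetric
x2 <- rnorm(n)
y  <- (8 + x1 + 0.5 * x2 + rnorm(n, sd = 0.3))^3

m    <- lm(y ~ x1 + x2)              # fit on the ORIGINAL scale
yhat <- fitted(m)
# plot(y, yhat)                      # the inverse response plot: y-hat against y

# Scaled power family; lambda = 0 is the log transform.
scaled_power <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

# RSS from regressing y-hat on the transformed y, as a function of lambda.
rss <- function(lambda) sum(resid(lm(yhat ~ scaled_power(y, lambda)))^2)

# Pick the lambda with the smallest RSS and compare it with -1, 0, and 1.
lambda_hat <- optimize(rss, interval = c(-2, 2))$minimum
rss_table  <- data.frame(lambda = c(lambda_hat, -1, 0, 1),
                         RSS    = sapply(c(lambda_hat, -1, 0, 1), rss))
```

Because the simulated g is a cube, `lambda_hat` should land near 1/3. `rss_table` has the same layout as a lambda/RSS table: the estimated power first, then the reference powers -1, 0, and 1.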

> m1=lm(ozone~temperature+pressure,data=ozonetext)
> plot(m1)

A plot of the predictors shows that their joint distribution is roughly elliptical.

> library(alr3)
> invResPlot(m1)

      lambda      RSS
1  0.3658881 1989.771
2 -1.0000000 3412.912
3  0.0000000 2082.377
4  1.0000000 2196.992

This suggests that the best transform is $Y_{new} = Y^{0.3658881}$

(lambda=0 refers to the log transform)

Note that the log transform isn't too different from the optimal one.

> ozone.t1=transform(ozonetext,ozone.t = ozone^(.37))
> m2=lm(ozone.t~temperature+pressure,data=ozone.t1)
> plot(m2)

[Figure: diagnostic plots, transformed vs. original]

[Figure: further diagnostic plots, transformed vs. original]

On the whole, the transformation improved the validity of the model.

But interpretation may now be quite difficult.

Still, improved validity means we can better trust p-values, confidence intervals, and prediction intervals.

> summary(m2)

Call:
lm(formula = ozone.t ~ temperature + pressure, data = ozone.t1)

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.4004629  0.1774149  -2.257   0.0256 *
temperature  0.0423812  0.0027663  15.321

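Another approach:

(Useful when the distribution of the variable to be transformed is not Normal.)

Box-Cox

Choose a transform of Y, $\psi_\lambda(Y)$, such that the distribution of the transformed Y is closer to Normal, where

$$\psi_\lambda(Y) = gm(Y)^{1-\lambda} \, (Y^\lambda - 1)/\lambda \quad \text{for } \lambda \neq 0$$

$$\psi_\lambda(Y) = gm(Y) \log(Y) \quad \text{for } \lambda = 0$$

(gm is the geometric mean)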

$$\psi_\lambda(Y) = gm(Y)^{1-\lambda} \, (Y^\lambda - 1)/\lambda$$

gm(Y) is the geometric mean of y:

$$gm(Y) = \prod_{i=1}^{n} Y_i^{1/n}$$
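
The transform and the geometric mean translate directly into a few lines of R. This is a minimal sketch: the function names `gm()` and `box_cox_gm()` are made up for illustration and do not come from any package.

```r
# Geometric mean, computed on the log scale for numerical stability.
gm <- function(y) exp(mean(log(y)))

# Box-Cox transform scaled by the geometric mean; requires y > 0.
# The names gm() and box_cox_gm() are illustrative, not from a package.
box_cox_gm <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (abs(lambda) < 1e-8) {
    gm(y) * log(y)                                 # lambda = 0 case
  } else {
    gm(y)^(1 - lambda) * (y^lambda - 1) / lambda   # lambda != 0 case
  }
}

y       <- c(2, 8)                  # gm(y) = sqrt(2 * 8) = 4
t1      <- box_cox_gm(y, 1)         # lambda = 1: y - 1, a shift and nothing more
t0      <- box_cox_gm(y, 0)         # lambda = 0: 4 * log(y)
t_small <- box_cox_gm(y, 1e-6)      # nearly identical to the lambda = 0 case
```

The last two lines illustrate why the log transform is treated as the lambda = 0 member of the family: as lambda shrinks toward 0, the power transform approaches it.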

To find lambda:

maximum likelihood estimation of lambda.

> library(MASS)

> boxcox(m1)

or

> library(alr3)

> summary(powerTransform(y~x1+x2,data=))
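
A self-contained sketch of the `boxcox()` route, using simulated data in place of the lecture's data set. The data frame `sim` and its coefficients are assumptions. The response is generated so that the log scale is the correct one, so the estimated lambda should come out near 0.

```r
library(MASS)   # boxcox() lives in MASS, which ships with R

# Simulated data where the log scale is the right one (made up for illustration).
set.seed(2)
n   <- 300
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- exp(1 + 0.5 * sim$x1 - 0.3 * sim$x2 + rnorm(n, sd = 0.3))

fit <- lm(y ~ x1 + x2, data = sim, y = TRUE, qr = TRUE)

# Profile log-likelihood over a grid of lambda values; plotit = FALSE returns
# the grid (x) and the log-likelihood (y) instead of drawing the plot.
bc <- boxcox(fit, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas whose log-likelihood is within
# qchisq(0.95, 1) / 2 of the maximum.
ci <- range(bc$x[bc$y > max(bc$y) - qchisq(0.95, 1) / 2])
```

With `plotit = TRUE` (the default) the same call draws the log-likelihood curve with the interval marked.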

> boxcox(m1)

[Figure: Box-Cox log-likelihood plot for lambda]

The plot points to lambda near 1/3, which confirms our previous transformation using lambda = .37

> summary(powerTransform(m1))

bcPower Transformation to Normality

   Est.Power Std.Err. Wald Lower Bound Wald Upper Bound
Y1    0.2343   0.0866           0.0646           0.4041

Likelihood ratio tests about transformation parameters

                            LRT df         pval
LR test, lambda = (0)  7.568201  1 5.940706e-03
LR test, lambda = (1) 66.558671  1 3.330669e-16

In fact, the optimal transform is .23, which is smaller than the previous .37. However, .37 is within the confidence interval of 0.0646 to 0.4041.
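
The reported bounds can be reproduced from the estimate and its standard error, assuming they form a standard 95% Wald interval (the printed numbers are consistent with that). A short sketch, with the candidate powers chosen here for illustration:

```r
# Wald interval from the printed estimate and standard error.
est <- 0.2343
se  <- 0.0866
z   <- qnorm(0.975)                    # about 1.96 for a 95% interval
wald_ci <- c(lower = est - z * se, upper = est + z * se)

# Which candidate powers fall inside the interval?
candidates <- c(log = 0, cube_root = 1/3, earlier_choice = 0.37, none = 1)
inside <- candidates >= wald_ci["lower"] & candidates <= wald_ci["upper"]
```

The cube root and the earlier .37 both fall inside the interval, while the log transform (0) and no transform (1) fall outside.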

Likelihood ratio tests about transformation parameters

                            LRT df         pval
LR test, lambda = (0)  7.568201  1 5.940706e-03
LR test, lambda = (1) 66.558671  1 3.330669e-16

Null: lambda = 0
Alt: lambda ≠ 0

Small p-value, so we reject. Thus, it is best not to do a log transform.

Null: no transform (lambda = 1)
Alt: do a transform

Reject. We need a transform.
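
The p-values follow from the LRT statistics, assuming the usual chi-squared reference distribution with 1 degree of freedom. A quick check in base R:

```r
# p-values for the likelihood ratio statistics printed above, assuming a
# chi-squared reference distribution with 1 degree of freedom.
lrt  <- c(lambda0 = 7.568201, lambda1 = 66.558671)
pval <- pchisq(lrt, df = 1, lower.tail = FALSE)

reject_at_5pct <- pval < 0.05   # both TRUE: reject lambda = 0 and lambda = 1
```

Using `lower.tail = FALSE` avoids the loss of precision that `1 - pchisq(...)` suffers for very small p-values, so the second value may differ slightly in its last digits from the printed 3.33e-16.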

Transform Predictors

You can use Box-Cox to transform predictors when Y is NOT transformed.

Then, if necessary, use an inverse response plot to transform Y.

In this approach, we find a transformation that makes the joint distribution of all the predictors multivariate Normal

(or as close to it as we can get).

Once that's done, we try to find a transform for Y.

Then we see if it helps.

Do these predictors look like they come from a Normal distribution?

(probably not)

> library(alr3)
> summary(powerTransform(ozone~temperature+height,data=o2.mini))

box.cox Transformations to Multinormality

            Est.Power Std.Err. Wald(Power=0) Wald(Power=1)
temperature    1.1383   0.3246        3.5070         0.426
height        18.9126   4.5176        4.1864         3.965

                                 LRT df      p.value
LR test, all lambda equal 0 25.50600  2 2.893633e-06
LR test, all lambda equal 1 17.30179  2 1.749703e-04

The best lambda could be anywhere within two standard errors of the estimate.

For temp, use a lambda between 0.5 and 1.7, rounding generously.
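
The two-standard-error rule is simple arithmetic on the printed table. A small sketch, using only the estimates and standard errors shown above:

```r
# Rough plausible ranges for each power: estimate plus or minus two standard
# errors, using the values printed in the summary.
est <- c(temperature = 1.1383, height = 18.9126)
se  <- c(temperature = 0.3246, height = 4.5176)
ranges <- rbind(lower = est - 2 * se, upper = est + 2 * se)

# Does "no transform" (power 1) sit inside each range?
contains_one <- ranges["lower", ] <= 1 & ranges["upper", ] >= 1
```

For temperature the range includes 1, so leaving it untransformed is defensible. For height the whole range sits far above 1.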

> summary(powerTransform(cbind(temperature, height) ~ 1, data = o2.mini))

box.cox Transformations to Multinormality

            Est.Power Std.Err. Wald(Power=0) Wald(Power=1)
temperature    1.1383   0.3246        3.5070         0.426
height        18.9126   4.5176        4.1864         3.965

LR test, all lambda equal 1 17.30179  2 1.749703e-04

Temp: try a square-root transform or no transform.

Height: transform to a high power, which is very unusual and probably not helpful. But let's try the 20th power anyway.

> o2.minit=transform(o2.mini,temp.t = sqrt(temperature),height.t = height^20)
> plot(o2.minit)

[Figure: residual plots, no transform]

Not much better, so look at transforming Y.

[Figure: residual plots, transformed predictors]

> m.t1 = lm(ozone~temp.t+height.t,data=o2.minit)
> plot(m.t1)
> invResPlot(m.t1)

Once again, $Y \to Y^{1/3}$ looks best.

> o2.minit2 = transform(o2.minit,ozone.t = ozone^(1/3))
> m.t2 = lm(ozone.t~temp.t+height.t,data=o2.minit2)
> plot(m.t2)

A third approach is to use Box-Cox to transform the predictors and the response simultaneously.

Use Box-Cox to transform ALL at once.

> summary(powerTransform(with(o2.mini,cbind(ozone,height,temperature))))

box.cox Transformations to Multinormality

            Est.Power Std.Err. Wald(Power=0) Wald(Power=1)
ozone          0.2503   0.0888        2.8178       -8.4416
height        18.8959   4.4542        4.2422        4.0177
temperature    1.1590   0.2661        4.3550        0.5976

LR test, all lambda equal 1 83.53574  3 0.000000e+00

This is consistent with the 1/3 power of ozone, a 20th power for height, and no change (raise to the 1 power) for temp.

"Big" leverage: $2(p+1)/n = 2 \times 3/141 \approx 0.04$
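
The leverage cutoff can be applied with `hatvalues()`. A hedged sketch on simulated data: only n = 141 and p = 2 match the numbers quoted above, and everything else is made up.

```r
# Leverage rule of thumb on simulated data with n = 141 and p = 2 predictors.
# The data are made up; only n and p match the numbers quoted above.
set.seed(3)
n <- 141
p <- 2
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- 1 + sim$x1 + sim$x2 + rnorm(n)

fit <- lm(y ~ x1 + x2, data = sim)
h   <- hatvalues(fit)              # leverages; they always sum to p + 1

cutoff <- 2 * (p + 1) / n          # 2 * 3 / 141, about 0.04
high_leverage <- which(h > cutoff) # indices of points with "big" leverage
```

Since the leverages average (p + 1)/n, the cutoff flags points with more than twice the average leverage.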
