Page 1:

Linear Regression Analysis

Additional Reference:

• Applied Linear Regression Models – Neter, Kutner, Nachtsheim, Wasserman

The lecture notes of Dr. Thomas Wherly and Dr. Fred Dahm aided in the preparation of chapter 12 material

Page 2:

Linear Regression Analysis (ch. 12)

Observe a response Y and one or more predictors x. Formulate a model that relates the mean response E(Y) to x.

Y – Dependent Variable

x – Independent Variable

Page 3:

Deterministic Model

• Y = f(x) ; Once we know the value of x, the value of Y is completely determined

• Simplest (Straight Line) Model: Y = β0 + β1x

• β1 = Slope of the Line

• β0 = Y-intercept of the Line

Page 4:

Probabilistic Model

• Y = f(x) + ε ; The value of Y is a R.V.

• Model for Simple Linear Regression: Yi = β0 + β1xi + εi , i = 1,…,n

• Y1,…,Yn – Observed Values of the Response

• x1,…,xn – Observed Values of the Predictor

• β0, β1 – Unknown Parameters to be Estimated from the Data

• ε1,…,εn – Unknown Random Error Terms – Usually iid N(0, σ²) Random Variables

Page 5:

Interpretation of Model

For each value of x, the observed Y will fall above or below the line Y = β0 + β1x according to the error term ε. For each fixed x,

Y ~ N(β0 + β1x, σ²)
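This interpretation can be checked by simulation: hold x fixed, draw many responses, and confirm the sample mean and standard deviation match β0 + β1x and σ. A minimal sketch with made-up parameter values (β0 = 2, β1 = 0.5, σ = 1 are assumptions for illustration, not from the notes):

```python
import random
import statistics

# Hypothetical parameter values (assumptions, not from the example)
beta0, beta1, sigma = 2.0, 0.5, 1.0
x = 10.0  # a fixed value of the predictor

random.seed(1)
# Draw many responses at this fixed x: Y = beta0 + beta1*x + eps, eps ~ N(0, sigma^2)
ys = [beta0 + beta1 * x + random.gauss(0.0, sigma) for _ in range(100_000)]

print(statistics.mean(ys))   # should be close to beta0 + beta1*x = 7
print(statistics.stdev(ys))  # should be close to sigma = 1
```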

Page 6:

Questions

1. How do we estimate β0, β1, and σ²?

2. Does the proposed model fit the data well?

3. Are the assumptions satisfied?

Page 7:

Plotting the Data

A scatter plot of the data is a useful first step for checking whether a linear relationship is plausible.

Page 8:

Example (12.4)

A study to assess the capability of subsurface flow wetland systems to remove biochemical oxygen demand (BOD) and various other chemical constituents resulted in the following data, where x = BOD mass loading and y = BOD mass removal. Does a scatter plot of the data suggest a linear relationship?

x:   3   8  10  11  13  16  27  30  35  37  38  44 103 142
y:   4   7   8   8  10  11  16  26  21   9  31  30  75  90
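A numeric complement to the scatter plot is the sample correlation coefficient, which is near ±1 when the points cluster tightly around a line. A sketch of that check for this data set:

```python
import math

# Data from Example 12.4: x = BOD mass loading, y = BOD mass removal
x = [3, 8, 10, 11, 13, 16, 27, 30, 35, 37, 38, 44, 103, 142]
y = [4, 7, 8, 8, 10, 11, 16, 26, 21, 9, 31, 30, 75, 90]
n = len(x)

# Corrected sums of squares and cross-products
sxx = sum(v * v for v in x) - sum(x) ** 2 / n
syy = sum(v * v for v in y) - sum(y) ** 2 / n
sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n

r = sxy / math.sqrt(sxx * syy)  # sample correlation coefficient
print(round(r, 3))
```

The value is close to 1, which agrees with the visual impression that a straight-line model is plausible here.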

Page 9:

Example (12.5)

An experiment conducted to investigate the stretchability of mozzarella cheese with temperature resulted in the following scatter plot where x = temperature and y = % elongation at failure. Does the scatter plot suggest a linear relationship?

Page 10:

Estimating β0 and β1

Consider an arbitrary line y = b0 + b1x drawn through a scatter plot. We want the line to be as close to the points in the scatter plot as possible. The vertical distance from (x,y) to the corresponding point on the line (x,b0 + b1x) is y-(b0 + b1x).

Page 11:

Possible Estimation Criteria

• Eyeball Method

• L1 Estimation – Choose b0, b1 to minimize Σ|yi − b0 − b1xi|

• Least Squares Estimation – Choose b0, b1 to minimize Σ(yi − b0 − b1xi)²

* We use Least Squares Estimation in practice since the other criteria are difficult to manipulate mathematically *

Page 12:

Least Squares Estimation

Take derivatives of the sum of squared deviations D = Σ(yi − b0 − b1xi)² with respect to b0 and b1, and set them equal to zero. This results in the “normal equations” (named for right angles – not the Normal distribution).
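Written out, the two partial derivatives and the normal equations they produce are (a standard derivation, sketched here since the slide's display did not survive extraction):

```latex
\frac{\partial D}{\partial b_0} = -2\sum_{i=1}^{n}\bigl(y_i - b_0 - b_1 x_i\bigr) = 0
\quad\Longrightarrow\quad
n\,b_0 + b_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i

\frac{\partial D}{\partial b_1} = -2\sum_{i=1}^{n} x_i\bigl(y_i - b_0 - b_1 x_i\bigr) = 0
\quad\Longrightarrow\quad
b_0 \sum_{i=1}^{n} x_i + b_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i
```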

Page 13:

Formulas for Least Squares Estimates

Solving the normal equations for b0 and b1 results in the L.S. estimates β̂1 and β̂0:

β̂1 = Sxy / Sxx = [Σxiyi − (Σxi)(Σyi)/n] / [Σxi² − (Σxi)²/n]

β̂0 = ȳ − β̂1x̄

Page 14:

Example (12.12)

Refer to the previous example (12.4). Obtain the expression for the Least Squares line.

n = 14,  Σxi = 517,  Σyi = 346,  Σxi² = 39,095,  Σxiyi = 25,825,  Σyi² = 17,454
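Plugging these sums into the formulas from the previous slide, a short sketch of the computation:

```python
# Summary statistics from Example 12.12 (the BOD data of Example 12.4)
n = 14
sum_x, sum_y = 517, 346
sum_x2, sum_xy = 39095, 25825

sxx = sum_x2 - sum_x ** 2 / n          # Sxx = Σxi² − (Σxi)²/n
sxy = sum_xy - sum_x * sum_y / n       # Sxy = Σxiyi − (Σxi)(Σyi)/n

b1 = sxy / sxx                         # slope estimate
b0 = sum_y / n - b1 * sum_x / n        # intercept estimate: ȳ − β̂1·x̄
print(f"y-hat = {b0:.3f} + {b1:.3f} x")
```

The fitted line comes out to roughly ŷ = 0.626 + 0.652x.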

Page 15:

Estimating σ²

Residual = Observed – Predicted

ei = yi − ŷi

Recall the definition of sample variance:

s² = (1/(n−1)) Σ(xi − x̄)²

Page 16:

Estimating σ² Cont’d

• The minimum value of the squared deviation is

D = Σ(yi − β̂0 − β̂1xi)² = Σ(yi − ŷi)² = SSE

• Divide the SSE by its degrees of freedom (n−2) to estimate σ²:

s² = σ̂² = SSE / (n − 2)

Page 17:

Example (12.12) Cont’d

Predict the value of BOD mass removal when BOD mass loading is 35. Calculate the residual. Calculate the SSE and a point estimate of σ².
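Putting the pieces together, a sketch of the requested calculations. The computing shortcut SSE = Syy − β̂1·Sxy avoids summing each squared residual individually:

```python
# Data from Example 12.4; fit from Example 12.12
x = [3, 8, 10, 11, 13, 16, 27, 30, 35, 37, 38, 44, 103, 142]
y = [4, 7, 8, 8, 10, 11, 16, 26, 21, 9, 31, 30, 75, 90]
n = len(x)

sxx = sum(v * v for v in x) - sum(x) ** 2 / n
syy = sum(v * v for v in y) - sum(y) ** 2 / n
sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
b1 = sxy / sxx                 # slope estimate
b0 = sum(y) / n - b1 * sum(x) / n  # intercept estimate

y_hat_35 = b0 + b1 * 35        # predicted BOD mass removal at loading 35
residual = 21 - y_hat_35       # observed y at x = 35 is 21
sse = syy - b1 * sxy           # shortcut: SSE = Syy - b1*Sxy
s2 = sse / (n - 2)             # point estimate of sigma^2
print(y_hat_35, residual, sse, s2)
```

The prediction is about 23.46, so the residual is about −2.46; SSE is about 392, giving σ̂² ≈ 32.7.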