Quantitative approaches Contents Lesson 10: Bivariate ...


Quantitative approaches

Lesson 10: Bivariate regression


Contents

1. What is (bivariate) linear regression?

2. Example : Size of dwarfs and the influence of food

3. How to do it in SPSS

1. What is (bivariate) linear regression?

Bivariate linear regression = a statistical method that relates an independent variable to a dependent (or response) variable by modeling the relationship as a straight line.

Regression analysis is used when both variables are continuous variables (measured on an interval or metric scale).

The basic model

The basic model we fit in a bivariate linear regression is a straight line with

y = a + bx

y = response variable (= dependent)

x = explanatory variable (= independent)

a = intercept

b = slope

What is (bivariate) linear regression?

[Figure: a straight line with slope b = delta y / delta x and intercept a.]

2. Example: Size of dwarfs and influence of food

Size of dwarfs and influence of food: data

Food (X)   Size (Y)
8          12
7          10
6          8
5          11
4          6
3          7
2          2
1          3
0          3

Food and size of dwarfs: Scatterplot

Scatterplot and regression line

The regression line is our «model» for the data. For every value of «food», the model predicts the value of «size» on the regression line.
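As a rough illustration of what "the model predicts" means, the sketch below evaluates the line for every food value in the example. The coefficients are the ones reported in the coefficient table of this lesson; the function name `predict` is a hypothetical helper, not part of the course material.

```python
# Data from the example table (food = X, size = Y).
food = [8, 7, 6, 5, 4, 3, 2, 1, 0]
size = [12, 10, 8, 11, 6, 7, 2, 3, 3]

# Coefficients as reported in the lesson's coefficient table.
a = 2.0222  # intercept
b = 1.2167  # slope


def predict(x, intercept=a, slope=b):
    """Return the predicted size for a given amount of food (hypothetical helper)."""
    return intercept + slope * x


# One prediction per observed food value; each lies on the regression line.
predicted = [predict(x) for x in food]
for x, y, y_hat in zip(food, size, predicted):
    print(f"food={x}  observed size={y:>2}  predicted size={y_hat:.2f}")
```

The observed sizes scatter around the predicted ones; the predictions themselves all lie on the line.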

The meaning of the slope b

Slope b = change in Y that accompanies a unit change in X:

\[ b = \frac{\Delta Y}{\Delta X} = \frac{\Delta \text{Size}}{\Delta \text{Food}} \]

In our example: Adding one unit of food causes a dwarf to grow 1.22 cm on average.

Errors (or: residuals)

Since the prediction is rarely completely accurate, we get for every value of «food» an «error» e, that is, the distance between the actual value of «size» (Y) and the predicted value of «size» (Y-hat):

\[ e = Y - \hat{Y} \]

We also get an «explained part of the variance», R-square.
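The residuals for the dwarf example can be computed directly. This is a minimal sketch that fits the line with the sums-of-squares formulas used in this lesson and then takes observed minus predicted values; all variable names are illustrative.

```python
# Data from the example table (food = X, size = Y).
food = [8, 7, 6, 5, 4, 3, 2, 1, 0]
size = [12, 10, 8, 11, 6, 7, 2, 3, 3]

n = len(food)
mean_x = sum(food) / n
mean_y = sum(size) / n

# Slope and intercept from the sums of squares and products.
ssx = sum((x - mean_x) ** 2 for x in food)
ssxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(food, size))
b = ssxy / ssx
a = mean_y - b * mean_x

# Residual e = observed size minus predicted size.
predicted = [a + b * x for x in food]
residuals = [y - y_hat for y, y_hat in zip(size, predicted)]

for x, e in zip(food, residuals):
    print(f"food={x}  residual={e:+.3f}")
print(f"sum of residuals = {sum(residuals):.6f}")
```

For a least-squares line with an intercept, the residuals sum to zero up to floating-point rounding, which is a quick sanity check on the fit.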

The Least Squares Criterion

We look for the line that minimizes the sum of the squared residuals e (SSE). This is called the «least squares criterion».

\[ \text{minimize } SSE = \sum e^2 = \sum (Y - \hat{Y})^2, \qquad e = Y - \hat{Y} \]
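To make the criterion concrete, the sketch below compares the SSE of the fitted line (coefficients taken from the lesson's coefficient table) with the SSE of a few other lines. The alternative lines are hypothetical and were picked only for comparison.

```python
# Data from the example table (food = X, size = Y).
food = [8, 7, 6, 5, 4, 3, 2, 1, 0]
size = [12, 10, 8, 11, 6, 7, 2, 3, 3]


def sse(intercept, slope):
    """Sum of squared residuals for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(food, size))


# Coefficients reported in the lesson's coefficient table.
best = sse(2.0222, 1.2167)

# Hypothetical alternative lines, chosen only for comparison.
alternatives = {
    "steeper slope": sse(2.0222, 1.40),
    "flatter slope": sse(2.0222, 1.00),
    "higher intercept": sse(3.00, 1.2167),
    "lower intercept": sse(1.00, 1.2167),
}

print(f"least-squares line: SSE = {best:.4f}")
for label, value in alternatives.items():
    print(f"{label}: SSE = {value:.4f}")
```

Every alternative line gives a larger SSE than the fitted one, which is what the least squares criterion requires.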

Degree of fit: R-square

It is not enough to know the value of slope b. Very different relationships between X and Y may have the same slope b.

We therefore calculate R-square (= explained variance/total variance) in order to measure the «fit» of the model. R-square ranges from 0 to 1.

[Figure: three panels. b = 1.163, R-squared = 0.979; b = 1.483, R-squared = 0.877; b = 1.521, R-squared = 0.589.]

Explained Variance

[Figure: two panels. «All the variance is explained through the model» and «No variance is explained through the model».]

Degree of fit: R-square

By introducing the regression line, we divide the total variation of «size» into a regression variation SSR (explained) and an error variation SSE (unexplained).

Explained variance = R-square = explained variation/total variation

\[ R^2 = \frac{SSR}{SSY} \]

Formula (1)

Sum of squares in Y:

\[ SSY = \sum (y - \bar{y})^2 \]

Sum of squares in X:

\[ SSX = \sum (x - \bar{x})^2 \]

Sum of products X, Y:

\[ SSXY = \sum (x - \bar{x})(y - \bar{y}) \]

Slope of regression line:

\[ b = \frac{SSXY}{SSX} \]

Intercept of regression line:

\[ a = \frac{\sum y}{n} - b \cdot \frac{\sum x}{n} \]
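The formulas on this slide translate almost line by line into code. The following is a minimal sketch applied to the food and size data from the example; the variable names are illustrative.

```python
# Data from the example table (food = X, size = Y).
food = [8, 7, 6, 5, 4, 3, 2, 1, 0]
size = [12, 10, 8, 11, 6, 7, 2, 3, 3]

n = len(food)
mean_x = sum(food) / n
mean_y = sum(size) / n

# Sums of squares and sum of products.
ssy = sum((y - mean_y) ** 2 for y in size)
ssx = sum((x - mean_x) ** 2 for x in food)
ssxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(food, size))

# Slope and intercept of the regression line.
b = ssxy / ssx
a = sum(size) / n - b * sum(food) / n

print(f"SSY = {ssy:.4f}, SSX = {ssx:.0f}, SSXY = {ssxy:.0f}")
print(f"b = {b:.4f}, a = {a:.4f}")
```

The results can be compared with the values worked out by hand in the lesson (SSY = 108.8889, SSX = 60, SSXY = 73).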

Formula (2)

Total variation (sum of squares):

\[ SSY = \sum (y - \bar{y})^2 \]

Regression variation (explained):

\[ SSR = \frac{SSXY^2}{SSX} \]

Error variation (unexplained):

\[ SSE = SSY - SSR \]

Explained variance:

\[ R^2 = \frac{SSR}{SSY} \]

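Calculating intercept a and slope b

\[ SSY = \sum (y - \bar{y})^2 = 108.8889 \]

\[ SSX = \sum (x - \bar{x})^2 = 60 \]

\[ SSXY = \sum (x - \bar{x})(y - \bar{y}) = 73 \]

\[ b = \frac{SSXY}{SSX} = \frac{73}{60} = 1.22 \]

\[ a = \frac{\sum y}{n} - b \cdot \frac{\sum x}{n} = \frac{62}{9} - 1.2167 \cdot \frac{36}{9} = 2.02 \]

\[ y = a + b \cdot x = 2.02 + 1.22 \cdot x \]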

Calculating explained variation, residual variation and explained variance

\[ SSY = \sum (y - \bar{y})^2 = 108.8889 \]

\[ SSX = \sum (x - \bar{x})^2 = 60 \]

\[ SSXY = \sum (x - \bar{x})(y - \bar{y}) = 73 \]

Regression variation:

\[ SSR = \frac{SSXY^2}{SSX} = \frac{73^2}{60} = 88.8167 \]

Error variation:

\[ SSE = SSY - SSR = 108.8889 - 88.8167 = 20.0722 \]

Explained variance:

\[ R^2 = \frac{SSR}{SSY} = \frac{88.8167}{108.8889} = 0.8157 = 81.6\% \]
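The split of the total variation into an explained and an unexplained part can be checked numerically. This is a minimal sketch on the example data, using the formulas of this lesson; the variable names are illustrative.

```python
# Data from the example table (food = X, size = Y).
food = [8, 7, 6, 5, 4, 3, 2, 1, 0]
size = [12, 10, 8, 11, 6, 7, 2, 3, 3]

n = len(food)
mean_x = sum(food) / n
mean_y = sum(size) / n

ssy = sum((y - mean_y) ** 2 for y in size)  # total variation
ssx = sum((x - mean_x) ** 2 for x in food)
ssxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(food, size))

ssr = ssxy ** 2 / ssx   # regression (explained) variation
sse = ssy - ssr         # error (unexplained) variation
r_squared = ssr / ssy   # explained variance

print(f"SSR = {ssr:.4f}, SSE = {sse:.4f}, SSY = {ssy:.4f}")
print(f"R-square = {r_squared:.4f}")
```

By construction SSR and SSE add up to SSY, so R-square is the share of the total variation that the line accounts for.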

Calculating error variance (ANOVA-table)

             Sum of squares   df   Mean squares                F ratio
Regression   88.817 (SSR)     1    88.817/1 = 88.817           88.817/2.86746 = 30.9741
Error        20.072 (SSE)     7    20.072/7 = s² = 2.86746
Total        108.889 (SSY)    8

critical F-value = 5.591

Since the F-ratio is greater than the critical F-value for df = 1/7, we reject the null hypothesis that the real b in the population could be equal to 0.

The ANOVA-table of the regression tells us whether all the explanatory variables together have a significant effect on the variance of Y.

The error variance s² will be used to calculate the standard errors for b and a.
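The entries of the ANOVA table can be reproduced from the data in a few lines. The sketch below is one possible way to do it; the critical F-value is simply copied from the lesson rather than computed, and the variable names are illustrative.

```python
# Data from the example table (food = X, size = Y).
food = [8, 7, 6, 5, 4, 3, 2, 1, 0]
size = [12, 10, 8, 11, 6, 7, 2, 3, 3]

n = len(food)
mean_x = sum(food) / n
mean_y = sum(size) / n

ssy = sum((y - mean_y) ** 2 for y in size)
ssx = sum((x - mean_x) ** 2 for x in food)
ssxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(food, size))
ssr = ssxy ** 2 / ssx
sse = ssy - ssr

# Degrees of freedom: 1 for the single predictor, n - 2 for the error.
df_regression = 1
df_error = n - 2

ms_regression = ssr / df_regression
ms_error = sse / df_error  # error variance s^2
f_ratio = ms_regression / ms_error

critical_f = 5.591  # value quoted in the lesson for df = 1/7

print(f"s^2 = {ms_error:.5f}, F = {f_ratio:.4f}")
print("reject H0: b = 0" if f_ratio > critical_f else "do not reject H0: b = 0")
```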

Calculating the standard errors of the intercept a and the slope b

We can now use the error variance s² from the ANOVA-table in order to calculate the standard errors of the intercept a and the slope b.

\[ \text{standard error of } b = \sqrt{\frac{s^2}{SSX}} = \sqrt{\frac{2.867}{60}} = 0.2186 \]

\[ \text{standard error of } a = \sqrt{\frac{s^2 \sum x^2}{n \cdot SSX}} = \sqrt{\frac{2.867 \cdot 204}{9 \cdot 60}} = 1.0408 \]
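The two standard errors can be verified with a short calculation. This is a minimal sketch that takes the square root of the variance expressions, which is what reproduces the values 0.2186 and 1.0408 given in the lesson; the variable names are illustrative.

```python
import math

# Data from the example table (food = X, size = Y).
food = [8, 7, 6, 5, 4, 3, 2, 1, 0]
size = [12, 10, 8, 11, 6, 7, 2, 3, 3]

n = len(food)
mean_x = sum(food) / n
mean_y = sum(size) / n

ssy = sum((y - mean_y) ** 2 for y in size)
ssx = sum((x - mean_x) ** 2 for x in food)
ssxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(food, size))

# Error variance s^2 = SSE / (n - 2), as in the ANOVA table.
s2 = (ssy - ssxy ** 2 / ssx) / (n - 2)

sum_x2 = sum(x ** 2 for x in food)  # sum of squared food values (204)

se_b = math.sqrt(s2 / ssx)
se_a = math.sqrt(s2 * sum_x2 / (n * ssx))

print(f"standard error of b = {se_b:.4f}")
print(f"standard error of a = {se_a:.4f}")
```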

Calculating the p-value of intercept a

Coefficients:

              Estimate   Std. Error   t value   p value
(Intercept)   2.0222     1.0408       1.943     0.093129
food          1.2167     0.2186       5.565     0.000846 ***

\[ t = \frac{\text{Estimate}}{\text{Std. Error}} = \frac{2.0222}{1.0408} = 1.943 \]

The t-value +/- 1.943 cuts off two areas of the t-distribution with df = 7 (the error degrees of freedom, n - 2) on the left and the right hand side. The total of these two areas is the p-value 0.0931.

-> In 9.3% of the cases an intercept of this size or bigger might have come up, even if the real intercept was 0.

-> The intercept is not significantly different from 0.

Calculating the p-value of slope b

Coefficients:

              Estimate   Std. Error   t value   p value
(Intercept)   2.0222     1.0408       1.943     0.093129
food          1.2167     0.2186       5.565     0.000846 ***

\[ t = \frac{\text{Estimate}}{\text{Std. Error}} = \frac{1.2167}{0.2186} = 5.565 \]

The t-value +/- 5.565 cuts off two areas of the t-distribution with df = 7 (the error degrees of freedom, n - 2) on the left and the right hand side. The total of these two areas is the p-value 0.000846.

-> In 0.08% of the cases a slope of this size or bigger might have come up, even if the real slope was 0.

-> The slope is significantly different from 0.

Calculating the p-value of slope b

[Figure: t-distribution with the cut-off points t = -5.565 and t = 5.565 marked.]
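The two tail areas can be approximated without a statistics package. The sketch below integrates the density of the t-distribution numerically with Simpson's rule; this integration approach is an assumption of the example, not the method used in the lesson, and the helper names `t_pdf` and `two_sided_p` are hypothetical. It uses df = 7, the error degrees of freedom from the ANOVA table.

```python
import math


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(t_value, df, steps=20000):
    """Total area in both tails beyond +/- |t_value| (Simpson's rule, even steps)."""
    upper = abs(t_value)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric, so the central area is twice the area from 0 to |t|.
    return 1 - 2 * area_zero_to_t


df_error = 7  # n - 2 for the nine observations in the example

# t-values = estimate / standard error, from the lesson's coefficient table.
t_intercept = 2.0222 / 1.0408
t_slope = 1.2167 / 0.2186

p_intercept = two_sided_p(t_intercept, df_error)
p_slope = two_sided_p(t_slope, df_error)

print(f"intercept: t = {t_intercept:.3f}, p = {p_intercept:.4f}")
print(f"slope:     t = {t_slope:.3f}, p = {p_slope:.6f}")
```

In a bivariate regression the squared t-value of the slope equals the F-ratio of the ANOVA table (5.565 squared is about 30.97), so both tests lead to the same decision about b.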

3. How to do it in SPSS

Regression (1): get data

File -> Open -> Data

Click on FoodSize.sav

Open

Regression (2)

Analyze -> Regression -> Linear

Put «size» into «Dependent»

Put «food» into «Independent(s)»

Statistics:

Regression Coefficients:

- Estimates

- Confidence intervals

Continue

OK


Regression (3) : Results