bs2506 tutorial3 corrected - WordPress.com · 1. House price (Again) Predictor (Variable)...

Tutorial 3 Inferential Statistics, Statistical Modelling & Survey Methods (BS2506) Pairach Piboonrungroj (Champ) [email protected]


Tutorial 3

Inferential Statistics, Statistical Modelling & Survey Methods

(BS2506)

Pairach Piboonrungroj (Champ)

[email protected]

1. House price (Again)

Predictor (Variable)   Coefficient (B)   SE (B)
Constant               -2.5              41.4
X1                     1.62              0.21
X2                     0.257             1.88
X4                     -0.027            0.008

Analysis of Variance (ANOVA)

Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares
Regression            277,895
Residual              34,727

1 (a) (i) Write out the estimated regression equation

$$\hat{Y} = -2.5 + 1.62X_1 + 0.257X_2 - 0.027X_4$$

1 (a) (ii) Test for the significance of the regression equation

Each coefficient is tested separately: H0: $\beta_i = 0$ against H1: $\beta_i \neq 0$, at $\alpha = 0.01$ (1%).

Step 1: Critical value

$$t_{\alpha/2,\,df} = t_{0.01/2,\,15-4} = t_{0.005,\,11} = 3.1058$$

Step 2: t-statistic

$$t_{\beta_i} = \frac{\beta_i}{SE_{\beta_i}}$$

$$t_1 = \frac{1.62}{0.21} = 7.71 > 3.1058 \quad\Rightarrow\quad \text{Reject } H_0$$

$$t_2 = \frac{0.257}{1.88} = 0.137 < 3.1058 \quad\Rightarrow\quad \text{Do NOT reject } H_0$$

$$t_4 = \frac{-0.027}{0.008} = -3.375 < -3.1058 \quad\Rightarrow\quad \text{Reject } H_0$$
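The three t-tests can be reproduced with a few lines of Python. This is only a sketch: the critical value 3.1058 is copied from the slide rather than computed, and the names `coefficients`, `T_CRIT`, `t_stats` and `decisions` are illustrative choices, not part of the tutorial.

```python
# Coefficient (B) and standard error SE(B) from the Question 1 regression output.
coefficients = {
    "X1": (1.62, 0.21),
    "X2": (0.257, 1.88),
    "X4": (-0.027, 0.008),
}

# Two-tailed critical value t(0.005, 11), taken from the slide (not computed here).
T_CRIT = 3.1058

# t-statistic for each coefficient: B divided by its standard error.
t_stats = {name: b / se for name, (b, se) in coefficients.items()}

# Reject H0 (coefficient equals 0) when |t| exceeds the critical value.
decisions = {name: abs(t) > T_CRIT for name, t in t_stats.items()}

for name, t in t_stats.items():
    verdict = "reject H0" if decisions[name] else "do not reject H0"
    print(f"{name}: t = {t:.3f} -> {verdict}")
```

Comparing `abs(t)` with the critical value handles the negative statistic for X4 in the same way as the positive ones.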

1 (a) (iii) What are the degrees of freedom (DF) for SSR and SSE?

Analysis of Variance (ANOVA)

Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares
Regression            277,895          3  (p)
Residual              34,727           11 (n - p - 1)

1 (a) (iv) Test for a significant relationship between the Xs and Y

H0: $\beta_1 = \beta_2 = \beta_4 = 0$

H1: At least one of the coefficients does not equal 0

Analysis of Variance (ANOVA)

Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares   F Statistic
Regression            277,895          3                    92,631         29.341
Residual              34,727           11                   3,157

Critical value at $\alpha = 0.01$: $F_{0.01}(3, 11) = 6.217$

Since 29.341 > 6.217, we can reject the null hypothesis: there is a relationship between the Xs and Y.
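The mean squares and F statistic in the ANOVA table follow directly from the sums of squares and degrees of freedom. A minimal sketch, assuming the critical value 6.217 is read from an F table as on the slide; the function name `anova_f` is hypothetical.

```python
def anova_f(ssr, sse, df_reg, df_res):
    """Return (MSR, MSE, F) for a regression ANOVA table."""
    msr = ssr / df_reg   # mean square for regression
    mse = sse / df_res   # mean square for residual (error)
    return msr, mse, msr / mse

# Sums of squares and degrees of freedom from the Question 1 ANOVA table.
msr, mse, f_stat = anova_f(ssr=277_895, sse=34_727, df_reg=3, df_res=11)

# Critical value F(0.01; 3, 11), copied from the slide (not computed here).
F_CRIT = 6.217
reject_h0 = f_stat > F_CRIT

print(f"MSR = {msr:.0f}, MSE = {mse:.0f}, F = {f_stat:.3f}, reject H0: {reject_h0}")
```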

1 (a) (v) Compute the coefficient of determination and explain its meaning

Analysis of Variance (ANOVA)

Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares   F Statistic
Regression            277,895          3                    92,631         29.341
Residual              34,727           11                   3,157
TOTAL                 312,622

$$R^2 = 1 - \frac{\text{Error Sum of Squares}}{\text{Total Sum of Squares}} = 1 - \frac{34{,}727}{312{,}622} = 1 - 0.111 = 0.889 = 88.9\%$$

Meaning: about 88.9% of the variation in Y is explained by the regression on X1, X2 and X4.

1 (b)

Model 1

$$\hat{y} = 1.8 + 1.601x_1 - 0.026x_4, \qquad R^2 = 0.880$$

Model 2

$$\hat{y} = 64.05 + 1.23x_1 - 0.026x_4 + 63.794x_5 - 65.371x_6, \qquad R^2 = 0.935$$

Model 3

$$\hat{y} = 65.2 + 1.22x_1 - 0.067x_2 - 0.026x_4 + 63.447x_5 - 65.447x_6, \qquad R^2 = 0.936$$

1 (b) (i) Compute the adjusted coefficient of determination for the three models

$$R^2_{adj} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}$$

$$R^2_{adj,1} = 1 - (1 - 0.880)\,\frac{15 - 1}{15 - 2 - 1} = 0.86$$

$$R^2_{adj,2} = 1 - (1 - 0.935)\,\frac{15 - 1}{15 - 4 - 1} = 0.909$$

$$R^2_{adj,3} = 1 - (1 - 0.936)\,\frac{15 - 1}{15 - 5 - 1} = 0.900$$
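The adjusted R-squared calculation for the three models can be checked with a short script. This is a sketch only; the function name `adjusted_r2` and the dictionary layout are illustrative assumptions, while the R-squared values, n = 15 and the predictor counts come from the slides.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

N = 15  # number of observations used in the tutorial

# (R^2, number of predictors p) for each model, as given on the slides.
models = {
    "Model 1": (0.880, 2),
    "Model 2": (0.935, 4),
    "Model 3": (0.936, 5),
}

adj = {name: adjusted_r2(r2, N, p) for name, (r2, p) in models.items()}

for name, value in adj.items():
    print(f"{name}: adjusted R^2 = {value:.3f}")
```

Model 3 has the highest R-squared but a lower adjusted R-squared than Model 2, because the extra predictor costs a degree of freedom without explaining much more variation.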

1 (b) (ii) Interpret the coefficients on the house type, Beta5 and Beta6

Model 2:

$$\hat{y} = 64.05 + 1.23x_1 - 0.026x_4 + 63.794x_5 - 65.371x_6$$

Beta5 = 63.794: the price of a detached house is £63,794 higher than that of a semi-detached house.

Beta6 = -65.371: the price of a terraced house is £65,371 lower than that of a semi-detached house.

Both effects are measured relative to semi-detached houses, with the other predictors held constant.

1 (b) (iii) At the 0.05 level of significance, determine whether Model 2 is superior to Model 1

Model 1 (restricted, q = 2 predictors):

$$\hat{y} = 1.8 + 1.601x_1 - 0.026x_4$$

Model 2 (complete, p = 4 predictors):

$$\hat{y} = 64.05 + 1.23x_1 - 0.026x_4 + 63.794x_5 - 65.371x_6$$

$$F = \frac{R^2_{Complete} - R^2_{Restricted}}{1 - R^2_{Complete}} \times \frac{n - p - 1}{p - q}$$

$$F = \frac{0.935 - 0.880}{1 - 0.935} \times \frac{15 - 4 - 1}{4 - 2} = 4.231$$

$$F_{\alpha,\,(p-q),\,(n-p-1)} = F_{0.05,\,(4-2),\,(15-4-1)} = F_{0.05,\,2,\,10} = 4.103 < 4.231$$

Significant, i.e., Model 2 is better than Model 1.

1 (b) (iv) At the 0.05 level of significance, determine whether Model 3 is superior to Model 2

Model 2 (restricted, q = 4 predictors):

$$\hat{y} = 64.05 + 1.23x_1 - 0.026x_4 + 63.794x_5 - 65.371x_6$$

Model 3 (complete, p = 5 predictors):

$$\hat{y} = 65.2 + 1.22x_1 - 0.067x_2 - 0.026x_4 + 63.447x_5 - 65.447x_6$$

$$F = \frac{R^2_{Complete} - R^2_{Restricted}}{1 - R^2_{Complete}} \times \frac{n - p - 1}{p - q}$$

$$F = \frac{0.936 - 0.935}{1 - 0.936} \times \frac{15 - 5 - 1}{5 - 4} = 0.141$$

$$F_{\alpha,\,(p-q),\,(n-p-1)} = F_{0.05,\,(5-4),\,(15-5-1)} = F_{0.05,\,1,\,9} = 5.117 > 0.141$$

NOT significant, i.e., Model 3 is NOT better than Model 2.
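Both nested-model comparisons in 1 (b) (iii) and 1 (b) (iv) use the same partial F formula, so they can be sketched with one helper. The function name `partial_f_from_r2` is hypothetical, and the critical values are copied from the slides rather than computed.

```python
def partial_f_from_r2(r2_complete, r2_restricted, n, p, q):
    """Partial F statistic comparing a complete model (p predictors)
    with a restricted model (q predictors) nested inside it."""
    return (r2_complete - r2_restricted) / (1 - r2_complete) * (n - p - 1) / (p - q)

# Model 2 (complete, p = 4) against Model 1 (restricted, q = 2).
f_2_vs_1 = partial_f_from_r2(0.935, 0.880, n=15, p=4, q=2)
# Model 3 (complete, p = 5) against Model 2 (restricted, q = 4).
f_3_vs_2 = partial_f_from_r2(0.936, 0.935, n=15, p=5, q=4)

# Critical values at alpha = 0.05, taken from the slides (not computed here).
F_CRIT_2_10 = 4.103
F_CRIT_1_9 = 5.117

model2_better = f_2_vs_1 > F_CRIT_2_10
model3_better = f_3_vs_2 > F_CRIT_1_9

print(f"Model 2 vs Model 1: F = {f_2_vs_1:.3f}, significant: {model2_better}")
print(f"Model 3 vs Model 2: F = {f_3_vs_2:.3f}, significant: {model3_better}")
```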

1 (b) (v) From Model 2, estimate the price of a 5-year-old detached house of 250 square metres

$$\hat{y} = 64.05 + 1.23x_1 - 0.026x_4 + 63.794x_5 - 65.371x_6$$

$$\hat{y} = 64.05 + 1.23(250) - 0.026(250 \times 5) + 63.794(1) - 65.371(0) = 402.844$$

Estimated price = £402,844
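The Model 2 prediction can be written as a small function. This is a sketch under assumptions read off the slide's substitution: x1 is taken to be floor area in square metres, x4 the product of area and age (the slide plugs in 250*5), and x5 and x6 dummy variables for detached and terraced houses. The function and parameter names are hypothetical.

```python
def predict_price_model2(area, age, detached, terrace):
    """Predicted price in thousands of pounds from Model 2.

    Assumed variable meanings (inferred from the slide's worked answer):
    x1 = area, x4 = area * age, x5 = detached dummy, x6 = terrace dummy.
    """
    x4 = area * age
    return 64.05 + 1.23 * area - 0.026 * x4 + 63.794 * detached - 65.371 * terrace

# 5-year-old detached house of 250 square metres.
price_thousands = predict_price_model2(area=250, age=5, detached=1, terrace=0)
price_pounds = price_thousands * 1000

print(f"Estimated price: £{price_pounds:,.0f}")
```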

2. Advertising expenditure

X, Advertising (£000)   Y, Sales (£000)
5.5                     90
2.0                     40
3.2                     55
6.0                     95
3.8                     70
4.4                     80
6.0                     88
5.0                     85
6.5                     92
7.0                     91

R square                       0.97
Adjusted R square              0.96
Standard error of regression   3.37

Analysis of variance

             DF   Sum of Squares   Mean Square
Regression        2,904
Residual          80.0

Variables in the Equation

Variable        B        SE B
Advert          31.79    4.48
Advert-square   -2.30    0.485
(constant)      -17.22   9.65

2 (a) State the regression equation for the curvilinear model

General form:

$$\hat{Y}_t = \beta_0 + \beta_1 X + \beta_2 X^2$$

Substituting the estimates from "Variables in the Equation" (constant -17.22, Advert 31.79, Advert-square -2.30):

$$\hat{Y}_t = -17.22 + 31.79X - 2.30X^2$$

2 (b) Predict the monthly sales (in pounds) for a month with total advertising expenditure of £6,000

$$\hat{Y}_t = -17.22 + 31.79X - 2.30X^2$$

With X = 6 (advertising is measured in £000):

$$\hat{Y}_t = -17.22 + 31.79(6) - 2.30(6)^2 = 90.720$$

Sales = 90.720 × £1,000 = £90,720
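The same substitution can be sketched in code, which makes the unit conversion explicit. The function name `predicted_sales` is an illustrative assumption; the coefficients are those reported in the regression output.

```python
def predicted_sales(advertising_thousands):
    """Predicted sales (in thousands of pounds) from the curvilinear model."""
    x = advertising_thousands
    return -17.22 + 31.79 * x - 2.30 * x ** 2

# Advertising of £6,000 is X = 6 because X is measured in thousands of pounds.
sales_thousands = predicted_sales(6.0)
sales_pounds = sales_thousands * 1000

print(f"Predicted sales: £{sales_pounds:,.0f}")
```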

2 (c) Determine whether there is a significant relationship between sales and advertising expenditure at the 0.01 level of significance

$$\hat{Y}_t = \beta_0 + \beta_1 X + \beta_2 X^2$$

H0: $\beta_1 = \beta_2 = 0$

H1: At least one of the coefficients does not equal 0

Analysis of variance

             DF   Sum of Squares   Mean Square   F
Regression   2    2,904            1,452         127.05
Residual     7    80.0             11.428

Critical value at $\alpha = 0.01$: $F_{0.01}(2, 7) = 9.547$

Since 127.05 > 9.547, we can reject the null hypothesis: there is a curvilinear relationship between sales and advertising expenditure.

2 (d) Fit a linear model to the data and calculate SSE for this model

$$\hat{\beta}_1 = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2}$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}$$


2 (d) Fit a linear model to the data and calculate SSE for this model

ID        X (Advertising)   Y (Sales)   xy      x^2      y^2
1         5.5               90          495     30.25    8100
2         2                 40          80      4        1600
3         3.2               55          176     10.24    3025
4         6                 95          570     36       9025
5         3.8               70          266     14.44    4900
6         4.4               80          352     19.36    6400
7         6                 88          528     36       7744
8         5                 85          425     25       7225
9         6.5               92          598     42.25    8464
10        7                 91          637     49       8281
Sum       49.4              786         4127    266.54   64764
Average   4.94              78.6        412.7   26.654   6476.4

2 (d) Fit a linear model to the data and calculate SSE for this model

$$\hat{\beta}_1 = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{4127 - 10(4.94)(78.6)}{266.54 - 10(4.94)^2} = 10.85$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x} = 78.6 - 10.85(4.94) = 25.0$$

$$\hat{y} = 25.0 + 10.85x$$
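The slope and intercept can be recomputed from the raw data with the same sums-based formulas. A minimal sketch in plain Python; the function name `fit_line` is hypothetical, and the data are the ten observations from the advertising table.

```python
# Advertising (X, in £000) and sales (Y, in £000) from the tutorial data.
x = [5.5, 2.0, 3.2, 6.0, 3.8, 4.4, 6.0, 5.0, 6.5, 7.0]
y = [90, 40, 55, 95, 70, 80, 88, 85, 92, 91]

def fit_line(x, y):
    """Least-squares intercept and slope via the sums-of-products formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
    sxx = sum(xi * xi for xi in x) - n * x_bar ** 2
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

b0, b1 = fit_line(x, y)
print(f"y-hat = {b0:.2f} + {b1:.2f} x")
```

The unrounded slope is slightly below 10.85, which is why squared errors computed from full-precision coefficients can differ in the second decimal from those computed with the rounded equation.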


2 (d) Fit a linear model to the data and calculate SSE for this model

Predicted Y is computed from $\hat{Y}_t = 25 + 10.85X$.

ID        X (Advertising)   Y (Sales)   xy      x^2      y^2      Predicted Y   Squared Error
1         5.5               90          495     30.25    8100     84.68         28.35
2         2                 40          80      4        1600     46.70         44.92
3         3.2               55          176     10.24    3025     59.72         22.29
4         6                 95          570     36       9025     90.10         24.00
5         3.8               70          266     14.44    4900     66.23         14.20
6         4.4               80          352     19.36    6400     72.74         52.69
7         6                 88          528     36       7744     90.10         4.41
8         5                 85          425     25       7225     79.25         33.05
9         6.5               92          598     42.25    8464     95.53         12.43
10        7                 91          637     49       8281     100.95        99.01
Sum       49.4              786         4127    266.54   64764                  335.36
Average   4.94              78.6        412.7   26.654   6476.4

SSE for the linear model = 335.36
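The SSE column can be reproduced by summing squared residuals. This sketch uses the rounded coefficients 25.0 and 10.85 from the fitted equation, which is an assumption: individual squared errors may differ from the table in the second decimal place, but the total agrees to within rounding. The names `B0`, `B1`, `predicted` and `sse` are illustrative.

```python
# Advertising (X, in £000) and sales (Y, in £000) from the tutorial data.
x = [5.5, 2.0, 3.2, 6.0, 3.8, 4.4, 6.0, 5.0, 6.5, 7.0]
y = [90, 40, 55, 95, 70, 80, 88, 85, 92, 91]

# Rounded intercept and slope of the fitted linear model (from the slides).
B0, B1 = 25.0, 10.85

# Predicted sales and squared error for each observation.
predicted = [B0 + B1 * xi for xi in x]
squared_errors = [(yi - pi) ** 2 for yi, pi in zip(y, predicted)]

# SSE is the sum of the squared errors.
sse = sum(squared_errors)
print(f"SSE = {sse:.2f}")
```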

2 (e) At the 0.01 level of significance, determine whether the curvilinear model is superior to the linear regression model

Linear regression model (restricted, q = 1 predictor):

$$\hat{Y}_t = 25 + 10.85X$$

Curvilinear model (complete, p = 2 predictors):

$$\hat{Y}_t = -17.22 + 31.79X - 2.30X^2$$

$$F = \frac{SSE_{Linear} - SSE_{Curvilinear}}{SSE_{Curvilinear}} \times \frac{n - p - 1}{p - q}$$

$$F = \frac{335 - 80}{80} \times \frac{10 - 2 - 1}{2 - 1} = 22.3125$$

$$F_{\alpha,\,(p-q),\,(n-p-1)} = F_{0.01,\,(2-1),\,(10-2-1)} = F_{0.01,\,1,\,7} = 12.25 < 22.3$$

Significant, i.e., the curvilinear effect makes a significant contribution and should be included in the model.
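The SSE-based version of the partial F test can be sketched as follows. The function name `partial_f_from_sse` is hypothetical; the SSE values (335 for the linear model, rounded as on the slide, and 80 for the curvilinear model) and the critical value 12.25 are taken from the tutorial rather than computed.

```python
def partial_f_from_sse(sse_restricted, sse_complete, n, p, q):
    """Partial F statistic from the error sums of squares of a restricted
    model (q predictors) and a complete model (p predictors)."""
    return (sse_restricted - sse_complete) / sse_complete * (n - p - 1) / (p - q)

# Linear model (q = 1) against curvilinear model (p = 2), n = 10 observations.
f_stat = partial_f_from_sse(sse_restricted=335, sse_complete=80, n=10, p=2, q=1)

# Critical value F(0.01; 1, 7), copied from the slide (not computed here).
F_CRIT = 12.25
quadratic_term_significant = f_stat > F_CRIT

print(f"F = {f_stat:.4f}, significant: {quadratic_term_significant}")
```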

2 (f) Draw a scatter diagram of sales against advertising expenditure

[Figure: scatter diagram of Sales (vertical axis, 0 to 100) against Advertising expenditure (horizontal axis, 0 to 8). Series: Observed.]

2 (f) Sketch the linear regression

[Figure: Sales (vertical axis, 0 to 100) against Advertising expenditure (horizontal axis, 0 to 8). Series: Observed; Linear Regression, $\hat{Y}_t = 25 + 10.85X$.]

2 (f) Sketch the quadratic regression

[Figure: Sales (vertical axis, 0 to 100) against Advertising expenditure (horizontal axis, 0 to 8). Series: Observed; Linear Regression; Quadratic Regression, $\hat{Y}_t = -17.22 + 31.79X - 2.30X^2$.]

Thank you

Download these slides at www.pairach.com/teaching

Q & A