Business Quantitative Lecture 3

QUANTITATIVE ANALYSIS FOR BUSINESS Lecture 3 July 12th, 2010 Saksarun (Jay) Mativachranon


 

Transcript of Business Quantitative Lecture 3

Page 1: Business Quantitative Lecture 3

QUANTITATIVE ANALYSIS FOR BUSINESS

Lecture 3

July 12th, 2010

Saksarun (Jay) Mativachranon

Page 2: Business Quantitative Lecture 3

ERROR IN REGRESSION MODEL

Page 3: Business Quantitative Lecture 3

ASSUMPTIONS OF THE REGRESSION MODEL

If we make certain assumptions about the errors in a regression model, we can perform statistical tests to determine if the model is useful

1. Errors are independent
2. Errors are normally distributed
3. Errors have a mean of zero
4. Errors have a constant variance

A plot of the residuals (errors) will often highlight any glaring violations of these assumptions
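As a minimal sketch of how the residuals are obtained before they are plotted, the Python below fits a least-squares line and computes the errors. It uses the Company A data from the worked example later in this lecture. The helper name `fit_line` is hypothetical and is not part of the lecture.

```python
# Company A data from the worked example later in this lecture
hours = [3, 4, 6, 4, 2, 5]        # X: man hours
sales = [6, 8, 9, 5, 4.5, 9.5]    # Y: sales ($)


def fit_line(x, y):
    """Return (b0, b1) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(hours, sales)

# Residual (error) = observed Y minus predicted Y
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(hours, sales)]
mean_residual = sum(residuals) / len(residuals)

for xi, e in zip(hours, residuals):
    print(f"X = {xi}: residual = {e:+.2f}")
print(f"mean residual = {mean_residual:.4f}")
```

Plotting `residuals` against `hours` gives the kind of residual plot the following slides describe. With only six points, such a plot can reveal only glaring violations.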

Page 4: Business Quantitative Lecture 3

RESIDUAL PLOTS

A random plot of residuals

[Figure 4.4A: Error plotted against X]

Page 5: Business Quantitative Lecture 3

RESIDUAL PLOTS

Nonconstant error variance

[Figure 4.4B: Error plotted against X]

Page 6: Business Quantitative Lecture 3

RESIDUAL PLOTS

Nonlinear relationship

[Figure 4.4C: Error plotted against X]

Page 7: Business Quantitative Lecture 3

ANALYSIS OF VARIANCE

Page 8: Business Quantitative Lecture 3

ANALYSIS OF VARIANCE (ANOVA)

Analysis of Variance (ANOVA): a statistical procedure for analyzing the total variability of a data set

Page 9: Business Quantitative Lecture 3

ANALYSIS OF VARIANCE (ANOVA)

Sum of squares total (SST): measures the total variation in the dependent variable

$SST = \sum (Y - \bar{Y})^2$

Sum of squares of regression (SSR): measures the variation in the dependent variable explained by the independent variable

$SSR = \sum (\hat{Y} - \bar{Y})^2$

Sum of squares of errors (SSE): measures the unexplained variation

$SSE = \sum e^2 = \sum (Y - \hat{Y})^2$
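The slide does not state it, but for a least-squares line fitted with an intercept the three sums of squares are linked by a standard identity, which also gives the r² value used later in the lecture. As a short worked statement:

```latex
SST = SSR + SSE
\qquad\Longrightarrow\qquad
r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}
```

For the Company A example later in the lecture, this reads 22.5 = 15.625 + 6.875 and r² = 15.625 / 22.5 ≈ 0.69.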

Page 10: Business Quantitative Lecture 3

ESTIMATING THE VARIANCE

Errors are assumed to have a constant variance ($\sigma^2$), but we usually don’t know this

It can be estimated using the mean squared error (MSE), $s^2$

$s^2 = MSE = \frac{SSE}{n - k - 1}$

where
n = number of observations in the sample
k = number of independent variables

Page 11: Business Quantitative Lecture 3

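THE STANDARD ERROR OF ESTIMATE

The standard error of estimate (SEE), or the standard error of the regression, measures the uncertainty in the relationship between the independent and dependent variables: the typical distance of the observed Y values from the fitted line

For regression with one independent variable (k = 1):

$SEE = \sqrt{MSE} = \sqrt{\frac{SSE}{n - 2}}$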

Page 12: Business Quantitative Lecture 3

THE F STATISTIC

The F-test assesses how well a set of independent variables, as a group, explains the variation in the dependent variable

$F = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)}$

where
MSR = mean regression sum of squares
MSE = mean squared error
k = the number of slope parameters (k = 1 for simple linear regression)
n = number of observations
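The formula above can be sketched as a small Python function. The function name `f_statistic` and its argument names are illustrative assumptions. The sample call uses the Company A sums of squares from later in the lecture.

```python
def f_statistic(ssr, sse, n, k=1):
    """Return (MSR, MSE, F) for a regression with k slope parameters.

    ssr: sum of squares of regression
    sse: sum of squares of errors
    n:   number of observations
    k:   number of slope parameters (1 for simple linear regression)
    """
    df_denominator = n - k - 1
    if k < 1 or df_denominator < 1:
        raise ValueError("need k >= 1 and n > k + 1")
    msr = ssr / k
    mse = sse / df_denominator
    return msr, mse, msr / mse


# Company A: SSR = 15.625, SSE = 6.875, n = 6 observations, k = 1
msr, mse, f_value = f_statistic(ssr=15.625, sse=6.875, n=6, k=1)
print(f"MSR = {msr:.4f}, MSE = {mse:.4f}, F = {f_value:.2f}")
```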

Page 13: Business Quantitative Lecture 3

F-STATISTIC LINEAR REGRESSION

For linear regression, the hypotheses for the validity of the model are:
H0: b1 = 0
Ha: b1 ≠ 0

To determine if b1 is statistically significant, the calculated F-statistic is compared with the critical F-value, Fc, at the appropriate level of significance.

Page 14: Business Quantitative Lecture 3

F-STATISTIC LINEAR REGRESSION

The degrees of freedom (df) for the numerator and denominator with one independent variable are:
df numerator = k = 1
df denominator = n – k – 1 = n – 2

Decision for F-test: reject H0 if F > Fc

Page 15: Business Quantitative Lecture 3

COMPANY A DATA

Sales of Company A ($)    Man Hour (Hour)
6                         3
8                         4
9                         6
5                         4
4.5                       2
9.5                       5

Page 16: Business Quantitative Lecture 3

COMPANY A EXAMPLE

Y      X    $(Y - \bar{Y})^2$        $\hat{Y}$               $(Y - \hat{Y})^2$    $(\hat{Y} - \bar{Y})^2$
6      3    (6 – 7)^2 = 1            2 + 1.25(3) = 5.75      0.0625               1.563
8      4    (8 – 7)^2 = 1            2 + 1.25(4) = 7.00      1                    0
9      6    (9 – 7)^2 = 4            2 + 1.25(6) = 9.50      0.25                 6.25
5      4    (5 – 7)^2 = 4            2 + 1.25(4) = 7.00      4                    0
4.5    2    (4.5 – 7)^2 = 6.25       2 + 1.25(2) = 4.50      0                    6.25
9.5    5    (9.5 – 7)^2 = 6.25       2 + 1.25(5) = 8.25      1.5625               1.563

$\sum (Y - \bar{Y})^2$ = 22.5    $\sum (Y - \hat{Y})^2$ = 6.875    $\sum (\hat{Y} - \bar{Y})^2$ = 15.625

$\bar{Y}$ = 7    SST = 22.5    SSE = 6.875    SSR = 15.625
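The table's arithmetic can be checked with a short Python sketch. It takes the fitted line Ŷ = 2 + 1.25X from the slide as given rather than re-estimating it. The variable names are illustrative.

```python
hours = [3, 4, 6, 4, 2, 5]        # X: man hours
sales = [6, 8, 9, 5, 4.5, 9.5]    # Y: sales ($)
b0, b1 = 2.0, 1.25                # fitted line from the slide: Y-hat = 2 + 1.25 X

y_bar = sum(sales) / len(sales)
y_hat = [b0 + b1 * x for x in hours]

# Column sums of the table
sst = sum((y - y_bar) ** 2 for y in sales)                # total variation
sse = sum((y - yh) ** 2 for y, yh in zip(sales, y_hat))   # unexplained variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)              # explained variation

print(f"Y-bar = {y_bar}, SST = {sst}, SSE = {sse}, SSR = {ssr}")
```

The three totals agree with the slide, and SSR + SSE adds up to SST.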

Page 17: Business Quantitative Lecture 3

ESTIMATING THE VARIANCE

For Company A

$s^2 = MSE = \frac{SSE}{n - k - 1} = \frac{6.8750}{6 - 1 - 1} = \frac{6.8750}{4} = 1.7188$

We can estimate the standard deviation, s

This is also called the standard error of the estimate or the standard deviation of the regression

$s = \sqrt{MSE} = \sqrt{1.7188} = 1.31$
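As a quick check of these two numbers, a minimal Python sketch follows. The function name `standard_error_of_estimate` is a hypothetical helper and is not from the lecture.

```python
import math


def standard_error_of_estimate(sse, n, k=1):
    """Return (MSE, s), where s = sqrt(MSE) and MSE = SSE / (n - k - 1)."""
    mse = sse / (n - k - 1)
    return mse, math.sqrt(mse)


# Company A: SSE = 6.875 from n = 6 observations and k = 1 independent variable
mse, s = standard_error_of_estimate(sse=6.875, n=6, k=1)
print(f"MSE = {mse:.4f}, s = {s:.2f}")
```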

Page 18: Business Quantitative Lecture 3

TESTING THE MODEL FOR SIGNIFICANCE

When the sample size is too small, you can get good values for MSE and $r^2$ even if there is no relationship between the variables

Testing the model for significance helps determine if the values are meaningful

We do this by performing a statistical hypothesis test

Page 19: Business Quantitative Lecture 3

TESTING THE MODEL FOR SIGNIFICANCE

We start with the general linear model

$Y = \beta_0 + \beta_1 X + \epsilon$

The null hypothesis is that there is no linear relationship between X and Y ($\beta_1 = 0$)

The alternate hypothesis is that there is a linear relationship ($\beta_1 \neq 0$)

If the null hypothesis can be rejected, we have statistical evidence that there is a relationship

We use the F statistic for this test

Page 20: Business Quantitative Lecture 3

TESTING THE MODEL FOR SIGNIFICANCE

The F statistic is based on the MSE and MSR

$MSR = \frac{SSR}{k}$

where k = number of independent variables in the model

The F statistic is

$F = \frac{MSR}{MSE}$

This describes an F distribution with
degrees of freedom for the numerator = df1 = k
degrees of freedom for the denominator = df2 = n – k – 1

Page 21: Business Quantitative Lecture 3

TESTING THE MODEL FOR SIGNIFICANCE

If there is very little error, the MSE would be small and the F-statistic would be large indicating the model is useful

If the F-statistic is large, the significance level (p-value) will be low, indicating it is unlikely this would have occurred by chance

So when the F-value is large, we can reject the null hypothesis and accept that there is a linear relationship between X and Y and the values of the MSE and $r^2$ are meaningful

Page 22: Business Quantitative Lecture 3

STEPS IN A HYPOTHESIS TEST

1. Specify null and alternative hypotheses

$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$

2. Select the level of significance ($\alpha$). Common values are 0.01 and 0.05

3. Calculate the value of the test statistic using the formula

$F = \frac{MSR}{MSE}$

Page 23: Business Quantitative Lecture 3

STEPS IN A HYPOTHESIS TEST

4. Make a decision using one of the following methods

a) Reject the null hypothesis if the test statistic is greater than the F-value from the table. Otherwise, do not reject the null hypothesis:

Reject if $F_{calculated} > F_{\alpha, df_1, df_2}$
$df_1 = k$
$df_2 = n - k - 1$

b) Reject the null hypothesis if the observed significance level, or p-value, is less than the level of significance ($\alpha$). Otherwise, do not reject the null hypothesis:

p-value = $P(F > \text{calculated test statistic})$
Reject if p-value $< \alpha$
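The two decision methods in step 4 can be sketched as two small Python functions. The function names are illustrative assumptions. The sample values are those reported for Company A elsewhere in this lecture (calculated F = 9.09, table F = 7.71, p-value = 0.0394).

```python
def reject_by_critical_value(f_calculated, f_critical):
    """Method (a): reject H0 when the calculated F exceeds the table value."""
    return f_calculated > f_critical


def reject_by_p_value(p_value, alpha=0.05):
    """Method (b): reject H0 when the p-value is below the significance level."""
    return p_value < alpha


# Values reported for the Company A example in this lecture
decision_a = reject_by_critical_value(f_calculated=9.09, f_critical=7.71)
decision_b = reject_by_p_value(p_value=0.0394, alpha=0.05)
print(decision_a, decision_b)
```

Both methods lead to the same decision when they use the same significance level. At a stricter level such as 0.01, the same p-value of 0.0394 would not lead to rejection.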

Page 24: Business Quantitative Lecture 3

COMPANY A

Step 1.
$H_0: \beta_1 = 0$ (no linear relationship between X and Y)
$H_1: \beta_1 \neq 0$ (linear relationship exists between X and Y)

Step 2.
Select $\alpha$ = 0.05

Step 3.
Calculate the value of the test statistic

$MSR = \frac{SSR}{k} = \frac{15.6250}{1} = 15.6250$

$F = \frac{MSR}{MSE} = \frac{15.6250}{1.7188} = 9.09$

Page 25: Business Quantitative Lecture 3

COMPANY A

Step 4.
Reject the null hypothesis if the test statistic is greater than the F-value

df1 = k = 1
df2 = n – k – 1 = 6 – 1 – 1 = 4

The value of F associated with a 5% level of significance and with degrees of freedom 1 and 4 is

$F_{0.05,1,4} = 7.71$

$F_{calculated} = 9.09$

Reject H0 because 9.09 > 7.71

Page 26: Business Quantitative Lecture 3

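COMPANY A

[Figure: F distribution marking the critical value F = 7.71, the 0.05 significance level, and the calculated value 9.09]

We can conclude there is a statistically significant relationship between X and Y

The $r^2$ value of 0.69 means about 69% of the variability in sales (Y) is explained by Man Hour (X)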

Page 27: Business Quantitative Lecture 3

LIMITATION OF REGRESSION ANALYSIS

Linear relationships can change over time. This is referred to as parameter instability

Even if the model is accurate, its usefulness will be limited if other market participants are also aware of and act on this model

If the assumptions do not hold, the interpretation and tests of hypotheses may not be valid

Page 28: Business Quantitative Lecture 3

USING SOFTWARE FOR REGRESSION

Page 29: Business Quantitative Lecture 3

USING SOFTWARE FOR REGRESSION

Page 30: Business Quantitative Lecture 3

USING SOFTWARE FOR REGRESSION

Correlation coefficient is called Multiple R in Excel
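To connect this output with the earlier hand calculation, the sketch below computes the correlation coefficient for the Company A data directly. It assumes a single independent variable, in which case the squared correlation equals r² and Multiple R is its non-negative square root. The helper name `pearson_r` is illustrative.

```python
import math

hours = [3, 4, 6, 4, 2, 5]        # X: man hours
sales = [6, 8, 9, 5, 4.5, 9.5]    # Y: sales ($)


def pearson_r(x, y):
    """Sample correlation coefficient between x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(hours, sales)
r_squared = r ** 2
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

The squared value is about 0.69, matching the r² quoted for Company A.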

Page 31: Business Quantitative Lecture 3

ANALYSIS OF VARIANCE (ANOVA) TABLE

When software is used to develop a regression model, an ANOVA table is typically created that shows the observed significance level (p-value) for the calculated F value

This can be compared to the level of significance ($\alpha$) to make a decision

              DF           SS     MS                        F          SIGNIFICANCE
Regression    k            SSR    MSR = SSR/k               MSR/MSE    P(F > MSR/MSE)
Residual      n - k - 1    SSE    MSE = SSE/(n - k - 1)
Total         n - 1        SST

Table 4.4

Page 32: Business Quantitative Lecture 3

ANOVA FOR COMPANY A

P(F > 9.0909) = 0.0394

Because this probability is less than 0.05, we reject the null hypothesis of no linear relationship and conclude there is a linear relationship between X and Y
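Software reports this probability directly. As a rough illustration of where it comes from, the sketch below approximates the upper tail of the F distribution using only the Python standard library. It is not how Excel computes the value. It relies on the standard identity P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1·f), where I_x is the regularized incomplete beta function, and evaluates the integral with Simpson's rule. The function name `f_upper_tail` is hypothetical, and the sketch assumes df2 ≥ 2 and f > 0.

```python
import math


def f_upper_tail(f_value, df1, df2, steps=2000):
    """Approximate P(F > f_value) for an F(df1, df2) distribution.

    Assumes df2 >= 2 and f_value > 0 so the integrand is finite on [0, x].
    steps must be even (Simpson's rule).
    """
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_value)
    beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return (h / 3) * total / beta


# Company A: F = 9.0909 with df1 = 1 and df2 = 4
p_value = f_upper_tail(9.0909, df1=1, df2=4)
print(f"p-value = {p_value:.4f}")

# The table value 7.71 should leave roughly 5% in the upper tail
tail_at_critical = f_upper_tail(7.71, df1=1, df2=4)
print(f"P(F > 7.71) = {tail_at_critical:.4f}")
```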