SIMPLE LINEAR REGRESSION

Topics: Simple Regression, Linear Regression

Simple Regression

Definition: A regression model is a mathematical equation that describes the relationship between two or more variables. A simple regression model includes only two variables: one independent and one dependent. The dependent variable is the one being explained, and the independent variable is the one used to explain the variation in the dependent variable.

Linear Regression

Definition: A (simple) regression model that gives a straight-line relationship between two variables is called a linear regression model.

Figure 1 Relationship between food expenditure and income. (a) Linear relationship. (b) Nonlinear relationship. Both panels plot income on the horizontal axis and food expenditure on the vertical axis.

Figure 2 Plotting a linear equation, y = 50 + 5x. The line passes through the points (x = 0, y = 50) and (x = 10, y = 100). The x-axis is marked at 5, 10, 15 and the y-axis at 50, 100, 150.

SIMPLE LINEAR REGRESSION ANALYSIS

Scatter Diagram
Least Squares Line
Interpretation of a and b
Assumptions of the Regression Model

y = A + Bx

where y is the dependent variable, A is the constant term or y-intercept, B is the slope, and x is the independent variable.

Definition: In the regression model y = A + Bx + ε, A is called the y-intercept or constant term, B is the slope, and ε is the random error term. The dependent and independent variables are y and x, respectively.

Definition: In the model ŷ = a + bx, a and b, which are calculated using sample data, are called the estimates of A and B.

Table 1 Incomes and Food Expenditures of Seven Households (both in hundreds of dollars)

Income    Food Expenditure
35        9
49        15
21        7
39        11
15        5
28        8
25        9

Scatter Diagram

Definition: A plot of paired observations is called a scatter diagram.

Figure 4 Scatter diagram of the Table 1 data, with income on the horizontal axis and food expenditure on the vertical axis. The points for the first household and the seventh household are labeled.

Figure 5 Scatter diagram and straight lines (income on the horizontal axis, food expenditure on the vertical axis).

Least Squares Line

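Figure 6 Regression line and random errors (income on the horizontal axis, food expenditure on the vertical axis). Each error e is the vertical distance between an observed point and the regression line.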

SPSS output

Descriptive Statistics

        Mean     Std. Deviation    N
Y       9.14     3.18              7
X       30.29    11.56             7

Model Summary

Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .959(a)    .919        .903                 .99

a. Predictors: (Constant), X

Coefficients(a)

                 Unstandardized Coefficients    Standardized Coefficients
Model 1          B        Std. Error            Beta                         t        Sig.
(Constant)       1.142    1.126                                              1.014    .357
X                .264     .035                  .959                         7.533    .001

a. Dependent Variable: Y

The Least Squares Line

a = 1.142, b = 0.264

Thus, ŷ = 1.1414 + 0.2642x

$$b = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2}$$

$$a = \bar{Y} - b\bar{X}$$
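The least squares formulas for a and b can be checked against the Table 1 data with a short script. This is a minimal sketch in plain Python; the function name `least_squares` and the variable names are illustrative and do not come from the slides.

```python
# Table 1 data: income and food expenditure, both in hundreds of dollars.
income = [35, 49, 21, 39, 15, 28, 25]
food = [9, 15, 7, 11, 5, 8, 9]


def least_squares(x, y):
    """Return (a, b) for the line y-hat = a + b*x using the sums formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n  # a = Y-bar - b * X-bar
    return a, b


a, b = least_squares(income, food)
print(f"a = {a:.4f}, b = {b:.4f}")  # a = 1.1422, b = 0.2642
```

At full precision the intercept is about 1.1422, which agrees with the SPSS value. The 1.1414 used in the slides appears to come from rounding b to 0.2642 before computing a.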

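Figure 7 Error of prediction (income on the horizontal axis, food expenditure on the vertical axis). For the household with income x = 35, the regression line ŷ = 1.1414 + .2642x gives Predicted = $1038.84, while Actual = $900, so Error = e = -$138.84.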

Figure. Errors of prediction when the regression model ŷ = 1.1414 + .2642x is used (income on the horizontal axis, food expenditure on the vertical axis).

Interpretation of a and b

Interpretation of a: Consider a household with zero income. Then ŷ = 1.1414 + .2642(0) = 1.1414 hundred dollars. Thus, we can state that a household with no income is expected to spend $114.14 per month on food.

Interpretation of b: The value of b in the regression model gives the change in y due to a change of one unit in x. We can state that, on average, a $1 increase in the income of a household will increase its food expenditure by $0.2642.
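The two interpretations can be reproduced numerically. The sketch below assumes the rounded estimates quoted in the slides (a = 1.1414, b = 0.2642); the names `A_HAT`, `B_HAT`, and `predict` are illustrative only.

```python
# Estimates quoted in the slides; x and y-hat are in hundreds of dollars.
A_HAT, B_HAT = 1.1414, 0.2642


def predict(x):
    """Predicted food expenditure (hundreds of dollars) for income x."""
    return A_HAT + B_HAT * x


# Interpretation of a: predicted expenditure at zero income, in dollars.
at_zero_dollars = predict(0) * 100
print(f"{at_zero_dollars:.2f}")  # 114.14

# Interpretation of b: change in y-hat when x rises by one unit.
slope_effect = predict(36) - predict(35)
print(round(slope_effect, 4))  # 0.2642
```

Because both variables are measured in hundreds of dollars, a one-unit change in x is a $100 change in income, which corresponds to a $26.42 change in predicted food expenditure. This is the same rate as $0.2642 per $1.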

Figure 8 Positive and negative linear relationships between x and y. (a) Positive linear relationship (b > 0). (b) Negative linear relationship (b < 0).

Table 4

x     y     ŷ = 1.1414 + .2642x    e = y – ŷ    e² = (y – ŷ)²
35    9     10.3884                -1.3884      1.9277
49    15    14.0872                .9128        .8332
21    7     6.6896                 .3104        .0963
39    11    11.4452                -.4452       .1982
15    5     5.1044                 -.1044       .0109
28    8     8.5390                 -.5390       .2905
25    9     7.7464                 1.2536       1.5715

Σe² = Σ(y – ŷ)² = 4.9283
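The columns of Table 4 can be regenerated from the data and the fitted line. This is a small sketch that assumes the rounded coefficients from the slides; the variable names are illustrative.

```python
income = [35, 49, 21, 39, 15, 28, 25]
food = [9, 15, 7, 11, 5, 8, 9]
A_HAT, B_HAT = 1.1414, 0.2642  # rounded estimates used in the slides

fitted = [A_HAT + B_HAT * x for x in income]
errors = [y - y_hat for y, y_hat in zip(food, fitted)]
sse = sum(e * e for e in errors)  # sum of squared errors

for x, y, y_hat, e in zip(income, food, fitted, errors):
    print(f"{x:3d} {y:3d} {y_hat:8.4f} {e:8.4f} {e * e:7.4f}")
print(f"sum of squared errors = {sse:.4f}")  # 4.9283
```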

Linearity Test (Model Validity Test)

Table. Validity for Simple Regression Model

Model         Sum of Squares    Degrees of Freedom (db)    Mean Square    Value of the Test Statistic (F Value)
Regression    SSreg             1                          MSreg          F = MSreg / MSres
Residual      SSres             n - 2                      MSres
Total         SSt               n - 1

SPSS output

ANOVA(b)

Model 1       Sum of Squares    df    Mean Square    F         Sig.
Regression    55.929            1     55.929         56.742    .001(a)
Residual      4.928             5     .986
Total         60.857            6

a. Predictors: (Constant), X
b. Dependent Variable: Y

Figure. Nonlinear relations between x and y, panels (a) and (b).

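1. $$SS_t = \sum Y_i^2 - \frac{\left(\sum Y_i\right)^2}{n}$$

2. $$SS_{reg} = b\left(\sum X_i Y_i - \frac{\sum X_i \sum Y_i}{n}\right)$$

3. $$SS_{res} = SS_t - SS_{reg}$$

4. $$MS_{reg} = SS_{reg}$$ (because the regression has 1 degree of freedom)

5. $$MS_{res} = \frac{SS_{res}}{n-2}$$

$$F = \frac{MS_{reg}}{MS_{res}}$$

Compare F with the F table value for dbreg = 1 and dbres = n - 2.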
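Applying steps 1 to 5 to the seven-household data reproduces the SPSS ANOVA figures. The following is a minimal sketch in plain Python; the variable names are illustrative, and b is computed at full precision rather than taken from the rounded slide value.

```python
income = [35, 49, 21, 39, 15, 28, 25]
food = [9, 15, 7, 11, 5, 8, 9]

n = len(income)
sum_x, sum_y = sum(income), sum(food)
sum_xy = sum(x * y for x, y in zip(income, food))
sum_x2 = sum(x * x for x in income)
sum_y2 = sum(y * y for y in food)

# Slope from the least squares formula.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

ss_t = sum_y2 - sum_y ** 2 / n               # step 1: total sum of squares
ss_reg = b * (sum_xy - sum_x * sum_y / n)    # step 2: regression sum of squares
ss_res = ss_t - ss_reg                       # step 3: residual sum of squares
ms_reg = ss_reg / 1                          # step 4: 1 degree of freedom
ms_res = ss_res / (n - 2)                    # step 5: n - 2 degrees of freedom
f_value = ms_reg / ms_res

print(f"SSt = {ss_t:.3f}, SSreg = {ss_reg:.3f}, SSres = {ss_res:.3f}")
print(f"F = {f_value:.3f}")
```

The resulting F value is then compared with the F table value for 1 and n - 2 = 5 degrees of freedom.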

SIGNIFICANCE OF THE REGRESSION COEFFICIENTS

Coefficients(a)

Model 1          B        Std. Error    Standardized Coefficients (Beta)    t        Sig.
(Constant)       1.142    1.126                                             1.014    .357
X                .264     .035          .959                                7.533    .001

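$$t_a = \frac{a}{s_a}, \qquad t_b = \frac{b}{s_b}$$

$$SS_{xx} = \sum X_i^2 - \frac{\left(\sum X_i\right)^2}{n}$$

$$s_a = s_e\sqrt{\frac{\sum X_i^2}{n\,SS_{xx}}}$$

$$s_e = \sqrt{\frac{SS_{res}}{n-2}}$$

$$s_b = \frac{s_e}{\sqrt{SS_{xx}}}$$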

$$t_a = \frac{a}{s_a} = \frac{1.1422}{1.1264} = 1.0140$$

$$t_b = \frac{b}{s_b} = \frac{0.26417}{0.035068} = 7.533$$

SPSS output

Coefficients(a)

Model 1          B        Std. Error    Standardized Coefficients (Beta)    t        Sig.
(Constant)       1.142    1.126                                             1.014    .357
X                .264     .035          .959                                7.533    .001

Figure. Two-tailed rejection regions at significance level α = 0.05, with ttable = 2.571. Reject H0 if t < -2.571 or t > 2.571; do not reject H0 if t lies between -2.571 and 2.571.
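The standard errors, the t statistics, and the decision against the tabulated critical value can be checked together. This sketch uses only the standard library; the critical value 2.571 is taken from the slides rather than computed, and the variable names are illustrative.

```python
import math

income = [35, 49, 21, 39, 15, 28, 25]
food = [9, 15, 7, 11, 5, 8, 9]
T_TABLE = 2.571  # critical value quoted in the slides (alpha = 0.05, two-tailed)

n = len(income)
sum_x, sum_y = sum(income), sum(food)
sum_x2 = sum(x * x for x in income)
sum_y2 = sum(y * y for y in food)
sum_xy = sum(x * y for x, y in zip(income, food))

ss_xx = sum_x2 - sum_x ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n
ss_t = sum_y2 - sum_y ** 2 / n

b = ss_xy / ss_xx
a = sum_y / n - b * sum_x / n
ss_res = ss_t - b * ss_xy

s_e = math.sqrt(ss_res / (n - 2))            # standard deviation of errors
s_b = s_e / math.sqrt(ss_xx)                 # standard error of b
s_a = s_e * math.sqrt(sum_x2 / (n * ss_xx))  # standard error of a
t_a, t_b = a / s_a, b / s_b

for name, t in (("a", t_a), ("b", t_b)):
    decision = "reject H0" if abs(t) > T_TABLE else "do not reject H0"
    print(f"t_{name} = {t:.3f}: {decision}")
```

With these data the intercept is not significantly different from zero, while the slope is, which matches the Sig. column of the SPSS table (.357 and .001).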

REGRESSION ANALYSIS: COMPLETE EXERCISES

Exercise 1: The following data give the experience (in years) and monthly salary (in hundreds of dollars) of nine randomly selected secretaries.

Exercise 1

Experience (years)    Monthly salary (hundreds of dollars)
14                    42
3                     24
5                     33
6                     31
4                     29
9                     39
18                    47
5                     30
16                    43

a. Construct a scatter diagram for these data.

b. Find the regression line with experience as an independent variable and monthly salary as a dependent variable.

c. Give a brief interpretation of the values of a and b calculated in part b.

d. Plot the regression line on the scatter diagram of part a and show the errors by drawing vertical lines between the scatter points and the regression line.

e. Does the regression model show a linear relationship between experience and monthly salary? Use a 5% significance level.

f. Construct a 95% confidence interval for B.

Exercise 2: A random sample of eight drivers insured with a company and having similar auto insurance policies was selected. The following table lists their driving experience (in years) and monthly auto insurance premiums.

Exercise 2

Driving Experience (years)    Monthly Auto Insurance Premium
5                             $64
2                             $87
12                            $50
9                             $71
15                            $44
6                             $56
25                            $42
16                            $60

e) Scatter diagram and the regression line (experience on the horizontal axis, insurance premium on the vertical axis). The regression line is

ŷ = 76.6605 – 1.5476x

Solution

g) The predicted value of y for x = 10 is

ŷ = 76.6605 – 1.5476(10) = $61.18
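The regression line and the prediction for x = 10 can be reproduced from the Exercise 2 data. The sketch below assumes the eight (experience, premium) pairs as read from the flattened table, namely 5, 2, 12, 9, 15, 6, 25, 16 years and $64, 87, 50, 71, 44, 56, 42, 60. The variable names are illustrative.

```python
# Exercise 2 data as read from the table (driving experience in years,
# monthly premium in dollars).
experience = [5, 2, 12, 9, 15, 6, 25, 16]
premium = [64, 87, 50, 71, 44, 56, 42, 60]

n = len(experience)
sum_x, sum_y = sum(experience), sum(premium)
ss_xx = sum(x * x for x in experience) - sum_x ** 2 / n
ss_xy = sum(x * y for x, y in zip(experience, premium)) - sum_x * sum_y / n

b = ss_xy / ss_xx
a = sum_y / n - b * sum_x / n
y_hat_10 = a + b * 10  # predicted premium for 10 years of experience

print(f"y-hat = {a:.4f} + ({b:.4f})x")
print(f"prediction at x = 10: ${y_hat_10:.2f}")  # $61.18
```

The intercept computed at full precision differs from the slide value 76.6605 in the fourth decimal place, which is consistent with the slides rounding b before computing a.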

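i)

$$s_b = \frac{s_e}{\sqrt{SS_{xx}}} = \frac{10.3199}{\sqrt{383.5000}} = .5270$$

$$\alpha/2 = .5 - (.90/2) = .05$$

$$df = n - 2 = 8 - 2 = 6$$

$$t = 1.943$$

$$b \pm t\,s_b = -1.5476 \pm 1.943(.5270) = -1.5476 \pm 1.0240 = -2.57 \text{ to } -.52$$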
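The 90% confidence interval for B can be verified in a few lines. This sketch assumes the Exercise 2 data as read from the table and takes the critical value t = 1.943 (df = 6) from the slides instead of computing it. The variable names are illustrative.

```python
import math

experience = [5, 2, 12, 9, 15, 6, 25, 16]
premium = [64, 87, 50, 71, 44, 56, 42, 60]
T_CRIT = 1.943  # t value quoted in the slides for df = 6, 90% confidence

n = len(experience)
sum_x, sum_y = sum(experience), sum(premium)
ss_xx = sum(x * x for x in experience) - sum_x ** 2 / n
ss_xy = sum(x * y for x, y in zip(experience, premium)) - sum_x * sum_y / n
ss_yy = sum(y * y for y in premium) - sum_y ** 2 / n

b = ss_xy / ss_xx
s_e = math.sqrt((ss_yy - b * ss_xy) / (n - 2))  # standard deviation of errors
s_b = s_e / math.sqrt(ss_xx)                    # standard error of b

lower, upper = b - T_CRIT * s_b, b + T_CRIT * s_b
print(f"s_e = {s_e:.4f}, s_b = {s_b:.4f}")
print(f"90% confidence interval for B: {lower:.2f} to {upper:.2f}")
```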

j) H0: B = 0 (B is not negative)

H1: B < 0 (B is negative)

Area in the left tail = α = .05

df = n – 2 = 8 – 2 = 6

The critical value of t is -1.943

Figure. Rejection region for the left-tailed test with α = .05. The critical value of t is -1.943: reject H0 if t < -1.943; do not reject H0 otherwise.

$$t = \frac{b - B}{s_b} = \frac{-1.5476 - 0}{.5270} = -2.937$$

where B = 0 comes from H0.

The value of the test statistic is t = -2.937. It falls in the rejection region. Hence, we reject the null hypothesis and conclude that B is negative.
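The left-tailed test of H0: B = 0 against H1: B < 0 can be sketched as follows. The data are the Exercise 2 values as read from the table, and the critical value -1.943 is the one quoted in the slides for α = .05 and df = 6. The variable names are illustrative.

```python
import math

experience = [5, 2, 12, 9, 15, 6, 25, 16]
premium = [64, 87, 50, 71, 44, 56, 42, 60]
T_CRIT = -1.943  # left-tail critical value from the slides (alpha = .05, df = 6)
B_NULL = 0       # value of B under H0

n = len(experience)
sum_x, sum_y = sum(experience), sum(premium)
ss_xx = sum(x * x for x in experience) - sum_x ** 2 / n
ss_xy = sum(x * y for x, y in zip(experience, premium)) - sum_x * sum_y / n
ss_yy = sum(y * y for y in premium) - sum_y ** 2 / n

b = ss_xy / ss_xx
s_b = math.sqrt((ss_yy - b * ss_xy) / (n - 2)) / math.sqrt(ss_xx)

t = (b - B_NULL) / s_b
reject = t < T_CRIT  # left-tailed test: reject when t is below the critical value
print(f"t = {t:.3f}, reject H0: {reject}")  # t = -2.937, reject H0: True
```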

Figure. Two critical values of t for a two-tailed test with α/2 = .025 in each tail: reject H0 if t < -2.447 or t > 2.447; do not reject H0 if t lies between -2.447 and 2.447.
