SIMPLE LINEAR REGRESSION
Outline: Simple Regression; Linear Regression
Simple Regression
Definition: A regression model is a mathematical equation that describes the relationship between two or more variables. A simple regression model includes only two variables: one independent and one dependent. The dependent variable is the one being explained, and the independent variable is the one used to explain the variation in the dependent variable.
Linear Regression
Definition: A (simple) regression model that gives a straight-line relationship between two variables is called a linear regression model.
Figure 1 Relationship between food expenditure and income. (a) Linear relationship. (b) Nonlinear relationship. [Both panels plot food expenditure against income.]
Figure 2 Plotting a linear equation. [Plot of the line y = 50 + 5x, marking the points x = 0, y = 50 and x = 10, y = 100.]
SIMPLE LINEAR REGRESSION ANALYSIS
Outline: Scatter Diagram; Least Squares Line; Interpretation of a and b; Assumptions of the Regression Model
y = A + Bx
where y is the dependent variable, A is the constant term or y-intercept, B is the slope, and x is the independent variable.
Definition: In the regression model y = A + Bx + ε, A is called the y-intercept or constant term, B is the slope, and ε is the random error term. The dependent and independent variables are y and x, respectively.
Definition: In the model ŷ = a + bx, a and b, which are calculated using sample data, are called the estimates of A and B.
Table 1 Incomes (in hundreds of dollars) and Food Expenditures of Seven Households
Income             35   49   21   39   15   28   25
Food Expenditure    9   15    7   11    5    8    9
Scatter Diagram
Definition: A plot of paired observations is called a scatter diagram.
Figure 4 Scatter diagram. [Plot of food expenditure against income for the seven households in Table 1, with the points for the first and seventh households labeled.]
Figure 5 Scatter diagram and straight lines. [Plot of food expenditure against income.]
Least Squares Line
Figure 6 Regression line and random errors. [Plot of food expenditure against income showing the regression line and a random error e.]
SPSS output

Descriptive Statistics
       Mean     Std. Deviation   N
Y      9.14     3.18             7
X      30.29    11.56            7

Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .959(a)   .919       .903                .99
a. Predictors: (Constant), X
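As a rough check on the Model Summary figures, the sketch below recomputes R, R Square, and Adjusted R Square from the Table 1 data. It assumes that X is income and Y is food expenditure (both in hundreds of dollars). The helper name `corrected_sum` is illustrative, and the adjusted R Square formula is the usual one for a single predictor rather than something stated on the slides.

```python
import math

# Table 1 data (hundreds of dollars); assumed mapping: X = income, Y = food expenditure
income = [35, 49, 21, 39, 15, 28, 25]
food = [9, 15, 7, 11, 5, 8, 9]
n = len(income)

def corrected_sum(u, v):
    """Sum of products of u and v, corrected for their means."""
    return sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v) / len(u)

ss_xx = corrected_sum(income, income)
ss_yy = corrected_sum(food, food)
ss_xy = corrected_sum(income, food)

# Correlation coefficient and coefficient of determination
r = ss_xy / math.sqrt(ss_xx * ss_yy)
r_squared = r ** 2

# Adjusted R Square for one predictor (assumed standard formula)
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)

print(f"R = {r:.3f}, R Square = {r_squared:.3f}, "
      f"Adjusted R Square = {adjusted_r_squared:.3f}")
```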
Coefficients(a)
Model 1       B (Unstandardized)   Std. Error   Beta (Standardized)   t       Sig.
(Constant)    1.142                1.126                              1.014   .357
X             .264                 .035         .959                  7.533   .001
a. Dependent Variable: Y

The Least Squares Line
a = 1.142, b = 0.264
Thus, ŷ = 1.1414 + 0.2642x (the intercept 1.1414 results from using the slope rounded to 0.2642; with the unrounded slope it is 1.142, as in the SPSS output).
$$b = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2}$$

$$a = \bar{Y} - b\bar{X}$$
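The two formulas above can be applied directly to the Table 1 data. The following is a minimal sketch, assuming X is income and Y is food expenditure (both in hundreds of dollars); the variable names are illustrative.

```python
# Table 1 data (hundreds of dollars); assumed mapping: X = income, Y = food expenditure
x = [35, 49, 21, 39, 15, 28, 25]
y = [9, 15, 7, 11, 5, 8, 9]
n = len(x)

sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# Slope and intercept from the least squares formulas
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Intercept obtained when the slope is first rounded to four decimals
a_from_rounded_b = sum_y / n - round(b, 4) * sum_x / n

print(f"b = {b:.4f}, a = {a:.4f}, a using rounded b = {a_from_rounded_b:.4f}")
```

With full precision the intercept is about 1.1422, which agrees with the SPSS value of 1.142; rounding b to 0.2642 before computing a gives the 1.1414 that appears in ŷ = 1.1414 + 0.2642x.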
Figure 7 Error of prediction. [Plot of food expenditure against income with the regression line ŷ = 1.1414 + .2642x. For the marked household: Actual = $900, Predicted = $1038.84, Error e = -$138.84.]
Figure. Errors of prediction when the regression model is used. [Plot of food expenditure against income with the regression line ŷ = 1.1414 + .2642x.]
Interpretation of a and b
Interpretation of a: Consider a household with zero income: ŷ = 1.1414 + .2642(0) = 1.1414 hundred dollars. Thus, we can state that a household with no income is expected to spend $114.14 per month on food.
Interpretation of b: The value of b in the regression model gives the change in y due to a change of one unit in x. Because both variables are measured in hundreds of dollars, a $100 increase in income raises expected food expenditure by $26.42. Equivalently, we can state that, on average, a $1 increase in the income of a household will increase its food expenditure by $0.2642.
Figure 8 Positive and negative linear relationships between x and y. (a) Positive linear relationship (b > 0). (b) Negative linear relationship (b < 0).
Table 4
  x     y    ŷ = 1.1414 + .2642x    e = y – ŷ     e²
 35     9    10.3884                -1.3884       1.9277
 49    15    14.0872                  .9128        .8332
 21     7     6.6896                  .3104        .0963
 39    11    11.4452                 -.4452        .1982
 15     5     5.1044                 -.1044        .0109
 28     8     8.5390                 -.5390        .2905
 25     9     7.7464                 1.2536       1.5715

$$\sum e^2 = \sum (y - \hat{y})^2 = 4.9283$$
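The columns of Table 4 can be regenerated with a few lines of code. The sketch below uses the rounded estimates a = 1.1414 and b = 0.2642 from the slides; the variable names are illustrative.

```python
# Table 1 data (hundreds of dollars): x = income, y = food expenditure
x = [35, 49, 21, 39, 15, 28, 25]
y = [9, 15, 7, 11, 5, 8, 9]

# Rounded estimates used in Table 4
a, b = 1.1414, 0.2642

predicted = [a + b * xi for xi in x]                     # y-hat for each household
errors = [yi - yhat for yi, yhat in zip(y, predicted)]   # e = y - y-hat
sse = sum(e ** 2 for e in errors)                        # sum of squared errors

for xi, yi, yhat, e in zip(x, y, predicted, errors):
    print(f"{xi:>3} {yi:>3} {yhat:8.4f} {e:8.4f} {e ** 2:7.4f}")
print(f"Sum of squared errors = {sse:.4f}")
```

The first row reproduces the household in Figure 7: a predicted value of 10.3884 hundred dollars ($1038.84) against an actual $900, for an error of -$138.84.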
Linearity Test (Model Validity Test)

Table. Validity test for the simple regression model
Source        Sum of Squares   Degrees of Freedom (df)   Mean Square   F Value (test statistic)
Regression    SSreg            1                         MSreg         F = MSreg / MSres
Residual      SSres            n - 2                     MSres
Total         SSt              n - 1
SPSS output

ANOVA(b)
Model 1       Sum of Squares   df   Mean Square   F        Sig.
Regression    55.929           1    55.929        56.742   .001(a)
Residual      4.928            5    .986
Total         60.857           6
a. Predictors: (Constant), X
b. Dependent Variable: Y
Figure. Nonlinear relations between x and y. [Panels (a) and (b), each plotting y against x.]
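$$\text{1.}\quad SS_t = \sum Y_i^2 - \frac{\left(\sum Y_i\right)^2}{n}$$

$$\text{2.}\quad SS_{reg} = b\left(\sum X_i Y_i - \frac{\sum X_i \sum Y_i}{n}\right)$$

$$\text{3.}\quad SS_{res} = SS_t - SS_{reg}$$

$$\text{4.}\quad MS_{reg} = \frac{SS_{reg}}{1}$$

$$\text{5.}\quad MS_{res} = \frac{SS_{res}}{n-2}$$

$$F = \frac{MS_{reg}}{MS_{res}}$$

Compare F with the F table value for df_reg = 1 and df_res = n - 2.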
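Steps 1 to 5 can be checked against the SPSS ANOVA output with the Table 1 data. This is a minimal sketch with illustrative variable names; it stops at the F value and leaves the F table lookup to the reader.

```python
# Table 1 data (hundreds of dollars): x = income, y = food expenditure
x = [35, 49, 21, 39, 15, 28, 25]
y = [9, 15, 7, 11, 5, 8, 9]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

# Least squares slope
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

ss_t = sum_y2 - sum_y ** 2 / n              # step 1: total sum of squares
ss_reg = b * (sum_xy - sum_x * sum_y / n)   # step 2: regression sum of squares
ss_res = ss_t - ss_reg                      # step 3: residual sum of squares
ms_reg = ss_reg / 1                         # step 4: df_reg = 1
ms_res = ss_res / (n - 2)                   # step 5: df_res = n - 2
f_value = ms_reg / ms_res

print(f"SSt = {ss_t:.3f}, SSreg = {ss_reg:.3f}, SSres = {ss_res:.3f}")
print(f"MSres = {ms_res:.3f}, F = {f_value:.3f}")
```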
SIGNIFICANCE OF THE REGRESSION COEFFICIENTS

Coefficients(a)
Model 1       B (Unstandardized)   Std. Error   Beta (Standardized)   t       Sig.
(Constant)    1.142                1.126                              1.014   .357
X             .264                 .035         .959                  7.533   .001
a. Dependent Variable: Y
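$$t_a = \frac{a}{s_a}, \qquad t_b = \frac{b}{s_b}$$

$$SS_{xx} = \sum X_i^2 - \frac{\left(\sum X_i\right)^2}{n}$$

$$s_a = s_e\sqrt{\frac{\sum X_i^2}{n\,SS_{xx}}}$$

$$s_e = \sqrt{\frac{SS_{res}}{n-2}}$$

$$s_b = \frac{s_e}{\sqrt{SS_{xx}}}$$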
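To see how these standard-error formulas produce the t values in the Coefficients table, the sketch below evaluates them on the Table 1 data. Variable names are illustrative, and small differences in the last decimal place relative to hand calculations are to be expected because no intermediate rounding is done here.

```python
import math

# Table 1 data (hundreds of dollars): x = income, y = food expenditure
x = [35, 49, 21, 39, 15, 28, 25]
y = [9, 15, 7, 11, 5, 8, 9]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(xi ** 2 for xi in x)
ss_xx = sum_x2 - sum_x ** 2 / n
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
ss_yy = sum(yi ** 2 for yi in y) - sum_y ** 2 / n

# Least squares estimates
b = ss_xy / ss_xx
a = sum_y / n - b * sum_x / n

# Standard deviation of errors and standard errors of a and b
ss_res = ss_yy - b * ss_xy
s_e = math.sqrt(ss_res / (n - 2))
s_a = s_e * math.sqrt(sum_x2 / (n * ss_xx))
s_b = s_e / math.sqrt(ss_xx)

t_a = a / s_a
t_b = b / s_b
print(f"s_e = {s_e:.4f}, s_a = {s_a:.4f}, s_b = {s_b:.4f}")
print(f"t_a = {t_a:.3f}, t_b = {t_b:.3f}")
```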
$$t_a = \frac{a}{s_a} = \frac{1.1422}{1.1264} = 1.0140$$

$$t_b = \frac{b}{s_b} = \frac{0.26417}{0.035068} = 7.533$$

These values agree with the t column of the SPSS Coefficients output: 1.014 for (Constant) and 7.533 for X.
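Figure. Rejection and nonrejection regions for the two-tailed t test at significance level α = 0.05, with t table = 2.571 and -t table = -2.571. Reject H0 if t < -2.571 or t > 2.571; otherwise do not reject H0.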
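The decision rule in this figure can be written as a small helper. The sketch below is only illustrative: the function name `two_tailed_decision` is hypothetical, and it assumes the null hypotheses being tested are A = 0 and B = 0, which the slides do not state explicitly. The critical value 2.571 and the t values are taken from the slides.

```python
def two_tailed_decision(t_value, t_critical):
    """Return the decision about H0 for a two-tailed t test."""
    if abs(t_value) > t_critical:
        return "reject H0"
    return "do not reject H0"

t_critical = 2.571        # t table value from the slide (alpha = 0.05)
t_a, t_b = 1.014, 7.533   # t values from the SPSS Coefficients output

decision_a = two_tailed_decision(t_a, t_critical)
decision_b = two_tailed_decision(t_b, t_critical)

print(f"Constant: t = {t_a}, {decision_a}")
print(f"Slope:    t = {t_b}, {decision_b}")
```

Under that assumption the slope is significant at the 5% level while the constant is not, which matches the Sig. values of .001 and .357 in the SPSS output.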
REGRESSION ANALYSIS: COMPLETE EXERCISES
Exercise 1: The following data give the experience (in years) and monthly salary (in hundreds of dollars) of nine randomly selected secretaries.
Experience (years)                       14    3    5    6    4    9   18    5   16
Monthly salary (hundreds of dollars)     42   24   33   31   29   39   47   30   43
a. Construct a scatter diagram for these data.
b. Find the regression line with experience as the independent variable and monthly salary as the dependent variable.
c. Give a brief interpretation of the values of a and b calculated in part b.
d. Plot the regression line on the scatter diagram of part a and show the errors by drawing vertical lines between the scatter points and the regression line.
e. Does the regression model show a linear relationship between experience and monthly salary? Use a 5% significance level.
f. Construct a confidence interval for B at the 5% significance level (that is, a 95% confidence interval).
Exercise 2: A random sample of eight drivers insured with a company and having similar auto insurance policies was selected. The following table lists their driving experience (in years) and monthly auto insurance premiums.
Driving Experience (years)          5    2   12    9   15    6   25   16
Monthly Auto Insurance Premium    $64   87   50   71   44   56   42   60
e) Scatter diagram and the regression line. [Plot of insurance premium against experience with the regression line ŷ = 76.6605 – 1.5476x.]
Solution
g) The predicted value of y for x = 10 is
ŷ = 76.6605 – 1.5476(10) = $61.18
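The regression line and the prediction in part g can be reproduced from the Exercise 2 data. The following is a minimal sketch with illustrative variable names.

```python
# Exercise 2 data: driving experience (years) and monthly premium (dollars)
experience = [5, 2, 12, 9, 15, 6, 25, 16]
premium = [64, 87, 50, 71, 44, 56, 42, 60]
n = len(experience)

sum_x, sum_y = sum(experience), sum(premium)
ss_xx = sum(xi ** 2 for xi in experience) - sum_x ** 2 / n
ss_xy = sum(xi * yi for xi, yi in zip(experience, premium)) - sum_x * sum_y / n

# Least squares estimates
b = ss_xy / ss_xx
a = sum_y / n - b * sum_x / n

# Predicted monthly premium for a driver with 10 years of experience
predicted_premium = a + b * 10

print(f"y-hat = {a:.4f} + ({b:.4f})x")
print(f"Predicted premium for x = 10: ${predicted_premium:.2f}")
```

The intercept computed without intermediate rounding differs from 76.6605 in the fourth decimal place, because the slide value is obtained with b already rounded to -1.5476; the predicted premium is the same to the cent.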
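i) 90% confidence interval for B

$$s_b = \frac{s_e}{\sqrt{SS_{xx}}} = \frac{10.3199}{\sqrt{383.5000}} = .5270$$

$$\alpha/2 = .5 - (.90/2) = .05$$

$$df = n - 2 = 8 - 2 = 6$$

$$t = 1.943$$

$$b \pm t\,s_b = -1.5476 \pm 1.943(.5270) = -1.5476 \pm 1.0240 = -2.57 \text{ to } -.52$$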
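The interval in part i can be recomputed from the raw data as a check. In the sketch below the t value 1.943 is taken from the slide rather than looked up in code, and the helper name `corrected_sum` is illustrative.

```python
import math

# Exercise 2 data: driving experience (years) and monthly premium (dollars)
experience = [5, 2, 12, 9, 15, 6, 25, 16]
premium = [64, 87, 50, 71, 44, 56, 42, 60]
n = len(experience)

def corrected_sum(u, v):
    """Sum of products of u and v, corrected for their means."""
    return sum(p * q for p, q in zip(u, v)) - sum(u) * sum(v) / len(u)

ss_xx = corrected_sum(experience, experience)
ss_xy = corrected_sum(experience, premium)
ss_yy = corrected_sum(premium, premium)

b = ss_xy / ss_xx
s_e = math.sqrt((ss_yy - b * ss_xy) / (n - 2))  # standard deviation of errors
s_b = s_e / math.sqrt(ss_xx)                    # standard error of b

t_table = 1.943  # from the slide: df = 6, alpha/2 = .05
margin = t_table * s_b
lower, upper = b - margin, b + margin

print(f"s_e = {s_e:.4f}, s_b = {s_b:.4f}")
print(f"90% confidence interval for B: {lower:.2f} to {upper:.2f}")
```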
j) H0: B = 0 (B is not negative)
H1: B < 0 (B is negative)
Area in the left tail = α = .05
df = n – 2 = 8 – 2 = 6
The critical value of t is -1.943.
Figure. Rejection region for the left-tailed test: α = .05 in the left tail, with the critical value of t at -1.943. Reject H0 if t < -1.943; otherwise do not reject H0.
$$t = \frac{b - B}{s_b} = \frac{-1.5476 - 0}{.5270} = -2.937$$

where B = 0 comes from H0.
The value of the test statistic, t = -2.937, falls in the rejection region. Hence, we reject the null hypothesis and conclude that B is negative.
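The left-tailed test in part j can be traced end to end in code. This sketch recomputes b and s_b from the Exercise 2 data and compares the test statistic with the critical value -1.943 given on the slide; variable names are illustrative.

```python
import math

# Exercise 2 data: driving experience (years) and monthly premium (dollars)
experience = [5, 2, 12, 9, 15, 6, 25, 16]
premium = [64, 87, 50, 71, 44, 56, 42, 60]
n = len(experience)

sum_x, sum_y = sum(experience), sum(premium)
ss_xx = sum(xi ** 2 for xi in experience) - sum_x ** 2 / n
ss_xy = sum(xi * yi for xi, yi in zip(experience, premium)) - sum_x * sum_y / n
ss_yy = sum(yi ** 2 for yi in premium) - sum_y ** 2 / n

b = ss_xy / ss_xx
s_b = math.sqrt((ss_yy - b * ss_xy) / (n - 2)) / math.sqrt(ss_xx)

B_null = 0            # value of B under H0
t_critical = -1.943   # from the slide: left tail, alpha = .05, df = 6

t_value = (b - B_null) / s_b
reject_h0 = t_value < t_critical  # left-tailed test: reject for small t

print(f"t = {t_value:.3f}, critical value = {t_critical}")
print("Reject H0: B is negative" if reject_h0 else "Do not reject H0")
```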
Figure. Two critical values of t for a two-tailed test: α/2 = .025 in each tail, with critical values -2.447 and 2.447. Reject H0 if t < -2.447 or t > 2.447; otherwise do not reject H0.