Simple Regression Excellent 1883
-
Upload
kshitij-bhardwaj -
Category
Documents
-
view
221 -
download
0
Transcript of Simple Regression Excellent 1883
-
8/7/2019 Simple Regression Excellent 1883
1/68
Introduction to Linear Regression
and Correlation Analysis
-
8/7/2019 Simple Regression Excellent 1883
2/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-2
Scatter Plots and Correlation
A scatter plot (or scatter diagram) is used to show therelationship between two variables
Correlation analysis is used to measure strength of the association (linear relationship) between twovariables
Only concerned with strength of the relationship
No causal effect is implied
-
8/7/2019 Simple Regression Excellent 1883
3/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-3
Scatter Plot Examples
y
x
y
x
y
y
x
x
Linear relationships Curvilinear relationships
-
8/7/2019 Simple Regression Excellent 1883
4/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-4
Scatter Plot Examples
y
x
y
x
y
y
x
x
Strong relationships Weak relationships
(continued)
-
8/7/2019 Simple Regression Excellent 1883
5/68
-
8/7/2019 Simple Regression Excellent 1883
6/68
-
8/7/2019 Simple Regression Excellent 1883
7/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-7
Features of r
Unit freeRange between -1 and 1The closer to -1, the stronger the negativelinear relationshipThe closer to 1, the stronger the positivelinear relationship
The closer to 0, the weaker the linear relationship
-
8/7/2019 Simple Regression Excellent 1883
8/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-8r = +.3 r = +1
Examples of Approximater Values
y
x
y
x
y
x
y
x
y
x
r = -1 r = -.6 r = 0
-
8/7/2019 Simple Regression Excellent 1883
9/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-9
Calculating theCorrelation Coefficient
=
])yy(][)xx([
)yy)(xx(r
22
where:r = Sample correlation coefficientn = Sample sizex = Value of the independent variable
y = Value of the dependent variable
=
])y()y(n][)x()x(n[
yxxynr
2222
Sample correlation coefficient:
or the algebraic equivalent:
-
8/7/2019 Simple Regression Excellent 1883
10/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-10
Calculation ExampleTree
HeightTrunk
Diameter
y x xy y 2 x2
35 8 280 1225 64
49 9 441 2401 8127 7 189 729 4933 6 198 1089 3660 13 780 3600 169
21 7 147 441 4945 11 495 2025 12151 12 612 2601 144
=321 =73 =3142 =14111 =713
-
8/7/2019 Simple Regression Excellent 1883
11/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-11
0
10
20
30
40
50
60
70
0 2 4 6 8 10 12 14
0.886
](321)][8(14111)(73)[8(713)(73)(321)8(3142)
]y)()y][n(x)()x[n(
yxxynr
22
2222
=
=
=
Trunk Diameter, x
TreeHeight,y
Calculation Example(continued)
r = 0.886 relatively strong positivelinear association between x and y
-
8/7/2019 Simple Regression Excellent 1883
12/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-12
Excel Output
Tree Height Trunk Diameter Tree Height 1Trunk Diameter 0.886231 1
Excel Correlation Output
Tools / data analysis / correlation
Correlation betweenTree Height and Trunk Diameter
-
8/7/2019 Simple Regression Excellent 1883
13/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-13
Significance Test for Correlation
HypothesesH0: = 0 (no correlation)
HA: 0 (correlation exists)
Test statistic
(with n 2 degrees of freedom)
2nr 1
r t 2
=
The Greek letter (rho) representsthe population correlation coefficient
-
8/7/2019 Simple Regression Excellent 1883
14/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-14
Example: Produce Stores
Is there evidence of a linear relationshipbetween tree height and trunk diameter atthe .05 level of significance?
H 0: = 0 (No correlation)H 1: 0 (correlation exists)
=.05 , df = 8 - 2 = 6
4.68
28.8861
.886
2nr 1
r t22
=
=
=
-
8/7/2019 Simple Regression Excellent 1883
15/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-15
4.68
28
.8861
.886
2n
r 1
r t
22=
=
=
Example: Test Solution
Conclusion:There is sufficientevidence of alinear relationship
at the 5% level of significance
Decision:Reject H 0
Reject H 0Reject H 0
/2=.025
-t/2Do not reject H 0
0 t/2
/2=.025
-2.4469 2.4469 4.68
d.f. = 8-2 = 6
-
8/7/2019 Simple Regression Excellent 1883
16/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-16
Introduction toRegression Analysis
Regression analysis is used to:Predict the value of a dependent variable based onthe value of at least one independent variable
Explain the impact of changes in an independentvariable on the dependent variable
Dependent variable: the variable we wish toexplain
Independent variable: the variable used toexplain the dependent variable
-
8/7/2019 Simple Regression Excellent 1883
17/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-17
Simple Linear Regression Model
Only one independent variable , x
Relationship between x and y isdescribed by a linear function
Changes in y are assumed to becaused by changes in x
-
8/7/2019 Simple Regression Excellent 1883
18/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-18
Types of Regression Models
Positive Linear Relationship
Negative Linear Relationship
Relationship NOT Linear
No Relationship
-
8/7/2019 Simple Regression Excellent 1883
19/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-19
xy 10 ++=Linear component
Population Linear Regression
The population regression model:
Populationy intercept
PopulationSlope
Coefficient
RandomError
term, or residualDependentVariable
IndependentVariable
Random Error component
-
8/7/2019 Simple Regression Excellent 1883
20/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-20
Linear Regression Assumptions
Error values () are statistically independentError values are normally distributed for anygiven value of xThe probability distribution of the errors isnormalThe distributions of possible values have
equal variances for all values of xThe underlying relationship between the xvariable and the y variable is linear
-
8/7/2019 Simple Regression Excellent 1883
21/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-21
Population Linear Regression(continued)
Random Error for this x value
y
x
Observed Valueof y for x i
Predicted Valueof y for x i
xy 10 ++=
xi
Slope = 1
Intercept = 0
i
-
8/7/2019 Simple Regression Excellent 1883
22/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-22
xbby 10i +=
The sample regression line provides an estimate of the population regression line
Estimated Regression Model
Estimate of the regression
intercept
Estimate of theregression slope
Estimated(or predicted)y value
Independentvariable
The individual random error terms e i have a mean of zero
-
8/7/2019 Simple Regression Excellent 1883
23/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-23
Least Squares Criterion
b0 and b 1 are obtained by finding the valuesof b 0 and b 1 that minimize the sum of thesquared residuals
210
22
x))b(b(y
)y(ye
+=
=
-
8/7/2019 Simple Regression Excellent 1883
24/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-24
The Least Squares Equation
The formulas for b 1 and b 0 are:
algebraic equivalent for b 1:
=
n
)x(x
nyxxy
b 22
1
=
21 )x(x
)y)(yx(x
b
xbyb 10 =
and
-
8/7/2019 Simple Regression Excellent 1883
25/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-25
b0 is the estimated average value of ywhen the value of x is zero
b1 is the estimated change in theaverage value of y as a result of aone-unit change in x
Interpretation of theSlope and the Intercept
-
8/7/2019 Simple Regression Excellent 1883
26/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-26
Simple Linear RegressionExample
A real estate agent wishes to examine therelationship between the selling price of a homeand its size (measured in square feet)
A random sample of 10 houses is selectedDependent variable (y) = house price in $1000sIndependent variable (x) = square feet
-
8/7/2019 Simple Regression Excellent 1883
27/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-27
Sample Data for House Price Model
House Price in $1000s(y)
Square Feet(x)
245 1400312 1600
279 1700308 1875
199 1100219 1550405 2350324 2450
319 1425255 1700
-
8/7/2019 Simple Regression Excellent 1883
28/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-28
Regression Using Excel
Data / Data Analysis / Regression
-
8/7/2019 Simple Regression Excellent 1883
29/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-29
Excel OutputRegression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
The regression equation is:
feet)(square0.1097798.24833pricehouse +=
-
8/7/2019 Simple Regression Excellent 1883
30/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-30
050
100
150200250300350400450
0 500 1000 1500 2000 2500 3000
Square Fee
H o u s e
P r i c e
( $ 1 0 0 0 s )
Graphical Presentation
House price model: scatter plot andregression line
feet)(square0.1097798.24833pricehouse +=
Slope= 0.10977
Intercept= 98.248
I i f h
-
8/7/2019 Simple Regression Excellent 1883
31/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-31
Interpretation of theIntercept, b 0
b0 is the estimated average value of Y when thevalue of X is zero (if x = 0 is in the range of observed x values)
Here, no houses had 0 square feet, so b 0 = 98.24833
just indicates that, for houses within the range of sizesobserved, $98,248.33 is the portion of the house pricenot explained by square feet
feet)(square0.1097798.24833pricehouse +=
I i f h
-
8/7/2019 Simple Regression Excellent 1883
32/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-32
Interpretation of theSlope Coefficient, b 1
b1 measures the estimated change in theaverage value of Y as a result of a one-unitchange in X
Here, b 1 = .10977 tells us that the average value of ahouse increases by .10977($1000) = $109.77, onaverage, for each additional one square foot of size
feet)(square0.1097798.24833pricehouse +=
-
8/7/2019 Simple Regression Excellent 1883
33/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-33
Least Squares RegressionProperties
The sum of the residuals from the least squaresregression line is 0 ( )
The sum of the squared residuals is a minimum
(minimized )The simple regression line always passes through themean of the y variable and the mean of the x variable
The least squares coefficients are unbiased estimatesof 0 and 1
0)y(y =
2)y(y
-
8/7/2019 Simple Regression Excellent 1883
34/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-34
Explained and UnexplainedVariation
Total variation is made up of two parts:
SSR SSE SST +=Total sum of
SquaresSum of Squares
RegressionSum of Squares
Error
= 2)yy(SST
= 2)yy(SSE
= 2)yy(SSR
where: = Average value of the dependent variabley = Observed values of the dependent variable
= Estimated value of y for the given x value y
y
-
8/7/2019 Simple Regression Excellent 1883
35/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-35
SST = total sum of squares
Measures the variation of the y i values around their mean y
SSE = error sum of squares
Variation attributable to factors other than therelationship between x and y
SSR = regression sum of squaresExplained variation attributable to the relationshipbetween x and y
(continued)
Explained and UnexplainedVariation
l d d l d
-
8/7/2019 Simple Regression Excellent 1883
36/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-36
(continued)
Xi
y
x
y i
SST = (y i - y)2
SSE = (y i - y i )2
SSR = (y i - y)2
_ _
_ y
y
y
_ y
Explained and UnexplainedVariation
-
8/7/2019 Simple Regression Excellent 1883
37/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-37
The coefficient of determination is the portionof the total variation in the dependent variablethat is explained by variation in the
independent variableThe coefficient of determination is also calledR-squared and is denoted as R2
Coefficient of Determination, R 2
SSTSSR
R =2 1R0 2 where
-
8/7/2019 Simple Regression Excellent 1883
38/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-38
Coefficient of determination
Coefficient of Determination, R 2
squaresof sumtotalregressionbyexplainedsquaresof sum
SSTSSR
R ==2
(continued)
Note: In the single independent variable case, the coefficientof determination is
where:R2 = Coefficient of determinationr = Simple correlation coefficient
22 r R =
E l f A i
-
8/7/2019 Simple Regression Excellent 1883
39/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-39
R2 = +1
Examples of ApproximateR2 Values
y
x
y
x
R2 = 1
R2 = 1
Perfect linear relationshipbetween x and y:
100% of the variation in y is
explained by variation in x
E l f A i
-
8/7/2019 Simple Regression Excellent 1883
40/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-40
Examples of ApproximateR2 Values
y
x
y
x
0 < R 2 < 1
Weaker linear relationshipbetween x and y:
Some but not all of the
variation in y is explainedby variation in x
(continued)
E l f A i
-
8/7/2019 Simple Regression Excellent 1883
41/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-41
Examples of ApproximateR2 Values
R2 = 0
No linear relationshipbetween x and y:
The value of Y does not
depend on x. (None of thevariation in y is explainedby variation in x)
y
xR2 = 0
(continued)
-
8/7/2019 Simple Regression Excellent 1883
42/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-42
Excel OutputRegression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
58.08%58.08% of the variation inhouse prices is explained by
variation in square feet
0.5808232600.500018934.9348
SSTSSRR 2 ===
T f Si ifi f
-
8/7/2019 Simple Regression Excellent 1883
43/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-43
Test for Significance of Coefficient of Determination
Hypotheses
H0: 2 = 0
HA: 2 0
Test statistic
(with D1 = 1 and D 2 = n - 2degrees of freedom)
2)SSE/(nSSR/1F
=
H0: The independent variable does not explain a significantportion of the variation in the dependent variable
HA: The independent variable does explain a significantportion of the variation in the dependent variable
-
8/7/2019 Simple Regression Excellent 1883
44/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-44
Excel OutputRegression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
The critical F value from Appendix H for The critical F value from Appendix H for = .05 and D= .05 and D 11 = 1 and D= 1 and D 22 = 8 d.f. is 5.318.= 8 d.f. is 5.318.
Since 11.085 > 5.318 we reject HSince 11.085 > 5.318 we reject H 00:: 2
= 0
11.0852)-1013665.57/(
18934.93/12)-SSE/(n
SSR/1F ===
-
8/7/2019 Simple Regression Excellent 1883
45/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-45
Standard Error of Estimate
The standard deviation of the variation of observations around the simple regression lineis estimated by
2nSSE
s =
WhereSSE = Sum of squares error
n = Sample size
Th St d d D i ti f th
-
8/7/2019 Simple Regression Excellent 1883
46/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-46
The Standard Deviation of theRegression Slope
The standard error of the regression slopecoefficient (b 1) is estimated by
==
nx)(
x
s
)x(x
ss 2
2
2
b1
where:= Estimate of the standard error of the least squares slope
= Sample standard error of the estimate
1bs
2nSSE
s =
-
8/7/2019 Simple Regression Excellent 1883
47/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-47
Excel OutputRegression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
41.33032s =
0.03297s1b
=
-
8/7/2019 Simple Regression Excellent 1883
48/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-48
Comparing Standard Errors
y
y y
x
x
x
y
x
1bssmall
1bslarge
ssmall
slarge
Variation of observed y valuesfrom the regression line Variation in the slope of regressionlines from different possible samples
-
8/7/2019 Simple Regression Excellent 1883
49/68
I f b t th Sl
-
8/7/2019 Simple Regression Excellent 1883
50/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-50
House Price in$1000s
(y)
Square Feet(x)
245 1400
312 1600
279 1700
308 1875
199 1100
219 1550
405 2350
324 2450
319 1425
255 1700
(sq.ft.)0.109898.25pricehouse +=
Estimated Regression Equation:
The slope of this model is 0.1098
Does square footage of the houseaffect its sales price?
Inference about the Slope:t Test
(continued)
-
8/7/2019 Simple Regression Excellent 1883
51/68
R g i A l i f
-
8/7/2019 Simple Regression Excellent 1883
52/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-52
Regression Analysis for Description
Confidence Interval Estimate of the Slope:
Excel Printout for House Prices:
At 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858)
1b/21stb
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
d.f. = n - 2
Regression Anal sis for
-
8/7/2019 Simple Regression Excellent 1883
53/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-53
Regression Analysis for Description
Since the units of the house price variable is$1000s, we are 95% confident that the averageimpact on sales price is between $33.70 and$185.80 per square foot of house size
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580
This 95% confidence interval does not include 0 .
Conclusion: There is a significant relationship betweenhouse price and square feet at the .05 level of significance
Confidence Interval for
-
8/7/2019 Simple Regression Excellent 1883
54/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-54
Confidence Interval for the Average y, Given x
Confidence interval estimate for themean of y given a particular x p
Size of interval varies accordingto distance away from mean, x
+ 2
2p
/2
)x(x
)x(x
n
1sty
Confidence Interval for
-
8/7/2019 Simple Regression Excellent 1883
55/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-55
Confidence Interval for an Individual y, Given x
Confidence interval estimate for anIndividual value of y given a particular x p
++ 2
2p
/2 )x(x
)x(x
n1
1sty
This extra term adds to the interval width to reflectthe added uncertainty for an individual case
Interval Estimates
-
8/7/2019 Simple Regression Excellent 1883
56/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-56
Interval Estimatesfor Different Values of x
y
x
Prediction Intervalfor an individual y,given x p
x p
y = b 0 + b 1 x
x
ConfidenceInterval for the mean of y, given x p
-
8/7/2019 Simple Regression Excellent 1883
57/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-57
House Price in$1000s
(y)
Square Feet(x)
245 1400
312 1600
279 1700
308 1875
199 1100
219 1550
405 2350
324 2450
319 1425
255 1700
(sq.ft.)0.109898.25pricehouse +=
Estimated Regression Equation:
Example: House Prices
Predict the price for a housewith 2000 square feet
-
8/7/2019 Simple Regression Excellent 1883
58/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-58
317.85
0)0.1098(20098.25(sq.ft.)0.109898.25pricehouse
=
+=+=
Example: House Prices
Predict the price for a housewith 2000 square feet:
The predicted price for a house with 2000square feet is 317.85($1,000s) = $317,850
(continued)
Estimation of Mean Values:
-
8/7/2019 Simple Regression Excellent 1883
59/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-59
Estimation of Mean Values:Example
Find the 95% confidence interval for the averageprice of 2,000 square-foot houses
Predicted Price Y i = 317.85 ($1,000s)
Confidence Interval Estimate for E(y)|x p
37.12317.85)x(x
)x(x
n
1sty
2
2p
/2 =
+
The confidence interval endpoints are 280.66 -- 354.90,or from $280,660 -- $354,900
Estimation of Individual Values:
-
8/7/2019 Simple Regression Excellent 1883
60/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-60
Estimation of Individual Values:Example
Find the 95% confidence interval for an individualhouse with 2,000 square feet
Predicted Price Y i = 317.85 ($1,000s)
Prediction Interval Estimate for y|x p
102.28317.85)x(x
)x(x
n
11sty
2
2p
/2 =
++
The prediction interval endpoints are 215.50 -- 420.07,or from $215,500 -- $420,070
Finding Confidence and Prediction
-
8/7/2019 Simple Regression Excellent 1883
61/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-61
Finding Confidence and PredictionIntervals PHStat
In Excel, use
PHStat | regression | simple linear regression
Check theconfidence and prediction interval for X= box and enter the x-value and confidence level desired
Finding Confidence and Prediction
-
8/7/2019 Simple Regression Excellent 1883
62/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-62
Input values
Finding Confidence and PredictionIntervals PHStat
(continued)
Confidence Interval Estimate for E(y)|x p
Prediction Interval Estimate for y|x p
-
8/7/2019 Simple Regression Excellent 1883
63/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-63
Residual Analysis
PurposesExamine for linearity assumptionExamine for constant variance for all levels
of xEvaluate normal distribution assumption
Graphical Analysis of Residuals
Can plot residuals vs. xCan create histogram of residuals to checkfor normality
-
8/7/2019 Simple Regression Excellent 1883
64/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-64
Residual Analysis for Linearity
Not Linear Linear
x r e s
i d u a
l s
x
y
x
y
x
r e s
i d u a
l s
Residual Analysis for
-
8/7/2019 Simple Regression Excellent 1883
65/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-65
Residual Analysis for Constant Variance
Non-constant variance Constant variance
x x
y
x x
y
r e s
i d u a
l s
r e s
i d u a
l s
-
8/7/2019 Simple Regression Excellent 1883
66/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-66
House Price Model Residual Pl
-60
-40
-20
0
20
40
60
80
0 1000 2000 300
Square Fee
R e s
i d u a
l s
Excel Output
RESIDUAL OUTPUT
Predicted House Price
Residuals
1 251.92316 -6.923162
2 273.87671 38.123293 284.85348 -5.853484
4 304.06284 3.937162
5 218.99284 -19.99284
6 268.38832 -49.38832
7 356.20251 48.797498 367.17929 -43.17929
9 254.6674 64.33264
10 284.85348 -29.85348
-
8/7/2019 Simple Regression Excellent 1883
67/68
Business Statistics: A Decision-Making Approach, 7e 2008 Prentice-Hall, Inc. Chap 14-67
Chapter Summary
Introduced correlation analysisDiscussed correlation to measure the strengthof a linear association
Introduced simple linear regression analysisCalculated the coefficients for the simple linear regression equationDescribed measures of variation (R 2 and s )Addressed assumptions of regression andcorrelation
-
8/7/2019 Simple Regression Excellent 1883
68/68
Chapter Summary
Described inference about the slopeAddressed estimation of mean values andprediction of individual valuesDiscussed residual analysis
(continued)