Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6...
Transcript of Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6...
![Page 1: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/1.jpg)
1Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02 & Week 04
Marcel DettlingInstitute for Data Analysis and Process Design
Zurich University of Applied Sciences
http://stat.ethz.ch/~dettling
ETH Zürich, September 23, 2013
![Page 2: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/2.jpg)
2Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Your LecturerName: Marcel Dettling
Age: 39 years
Civil Status: Married, 2 children
Education: Dr. Math. ETH
Position: Lecturer at ETH Zürich and ZHAWSenior Researcher at IDP, a ZHAW institute
Hobbies: Rock climbing, Skitouring, Paragliding, …
![Page 3: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/3.jpg)
3Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Course Organization
![Page 4: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/4.jpg)
4Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
What is Regression?The answer to an everyday question: How does a target variable of special interest depend onseveral other (explanatory) factors or causes.
Examples:• growth of plants, depends on fertilizer, soil quality, …• apartment rents, depends on size, location, furnishment, … • car insurance premium, depends on age, sex, nationality, …
Regression:• quantitatively describes relation between predictors and target• high importance, most widely used statistical methodology
![Page 5: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/5.jpg)
5Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Regression Mathematics See blackboard...
![Page 6: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/6.jpg)
6Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
What is Regression?Example: Fresh Water Tank on Planes
• Earlier: it was impossible to predict the amount of fresh water needed, the tank was always filled to 100% at Zurich airport.
• Goal: Minimizing the amount of fresh water that is carried. This lowers the weight, and thus fuel consumption and cost.
• Task: Modelling the relation between fresh water consumption and # of passengers, flight duration, daytime, destination, …Furthermore, quantifying what is needed as a reserve.
• Method: Multiple linear regression model
![Page 7: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/7.jpg)
7Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Goals with Linear ModelingTo understand the relation between
• Does the fertilizer positively affect plant growth?• Regression is a tool to give an answer on this• However, showing causality is a different matter
Target value prediction for new configurations
• What are the expected claims for auto insurance?• Regression analysis formalizes “prior experience”• It also provides an idea on the uncertainty of the prediction
![Page 8: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/8.jpg)
8Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Regression: Goals1) Understanding the relation between and
The aim is to pin down which of the predictors have influence on the response variable, as well as to quantify the strength of thisrelation. There is a battery of statistics and tests that addressthese questions.
2) Prediction
The regression equation can be used for predicting the expectedresponse value for an arbitrary predictor configuration . We will not only generate point predictions, but can also attributea prediction interval that quantifies the involved uncertainty.
y 1,..., px x
1,..., px xy
![Page 9: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/9.jpg)
9Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Simple RegressionIn this course, we first discuss simple regression, where there is only one single predictor variable. Later, we will extend this to multiple regression, where many predictors can be present.
Advantages of discussing simple regression:
• Visualization of data and fit is possible• Corresponds to estimating a straight line or curve• Is also mathematically simpler and more intuitive
We start out with smoothing, i.e. fitting non-parametric curves. Then, we will proceed with discussing linear models, i.e. the classical parametric regression approach.
![Page 10: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/10.jpg)
10
Applied Statistical RegressionAS 2013 – Week 02
Example: Airline PassengersEach month, Zurich Airport publishes the number of air trafficmovements and airline passengers. We study their relation.
![Page 11: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/11.jpg)
11Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Example: Airline Passengers
Month Pax ATM
2010-12 1‘730‘629 22‘666
2010-11 1‘772‘821 22‘579
2010-10 2‘238‘314 24‘234
2010-09 2‘139‘404 24‘172
2010-08 2‘230‘150 24‘377
... ... ...19000 20000 21000 22000 23000 24000 25000
1400
000
1600
000
1800
000
2000
000
2200
000
Flugbewegungen
Pax
Flughafen Zürich: Pax vs. ATM
![Page 12: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/12.jpg)
12Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Smoothing
19000 20000 21000 22000 23000 24000 25000
1400
000
1600
000
1800
000
2000
000
2200
000
Flugbewegungen
Pax
Flughafen Zürich: Pax vs. ATMWe may use an arbitrarysmooth function forcapturing the relation bet-ween Pax and ATM.
• It should fit well, but not follow the data tooclosely.
• The question is howthe line/function areobtained.
( )f
![Page 13: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/13.jpg)
13Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Linear Modeling
19000 20000 21000 22000 23000 24000 25000
1400
000
1600
000
1800
000
2000
000
2200
000
Flugbewegungen
Pax
Flughafen Zürich: Pax vs. ATMA straight line representsthe systematic relationbetween Pax and ATM.
• Only appropriate if thetrue relation is indeeda straight line
• The question is howthe line/function areobtained.
![Page 14: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/14.jpg)
14Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Smoothing vs. Linear ModelingAdvantages and disadvantages of smoothing:+ Flexibility+ No assumptions are made- Functional form remains unknown- Danger of overfitting
Advantages and disadvantages of linear modelling:+ Formal inference on the relation is possible+ Better efficiency, i.e. less data required- Only reasonable if the relation is linear- Might falsely imply causality
![Page 15: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/15.jpg)
15Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
SmoothingOur goal is visualizing the relation between the / responsevariable Pax and the / predictor variable ATM.
we are not after a functional description of
Since there is no parametric function that describes the responsevs. predictor relation, smoothing is also termed non-parametricregression analysis.
Method/Idea: "Running Mean"- take a window of x-values- compute the mean of the y-values within the window- this is an estimate for the function value at the window center
( )f
Yx
![Page 16: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/16.jpg)
16Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Running Mean: Example
2 4 6 8 10
510
1520
25
5.556.31
7.68
11.07
13.62
15.59
17.96 18.18
21.44 21.83
Running Mean: Beispiel
![Page 17: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/17.jpg)
17Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Running Mean: MathematicsRunningMean(x) = Mean of y-values over a window with
width around .
The estimate for , denoted as , is defined as follows:
,
The weights are defined as , and is the window width.
( )f ˆ ( )f
1 | | / 20
ji
if x xw
else
/ 2
1
1
ˆ ( )
n
i ii
n
ij
w yf x
w
x
![Page 18: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/18.jpg)
18Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Running Mean: R-Implementation• As an introductory exercise, it is instructive to code a function
that computes and visualizes the running mean.Arguments: xx= x values
yy= y valueswidth= window widthsteps= # of points computed
• Alternatively, one can simply use function ksmooth(). The window size can be adjusted by argument bandwidth=. Some other settings can be made, especially with respectto evaluation.
We will now study the running mean fit...
![Page 19: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/19.jpg)
19Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Running Mean: R-Implementation
![Page 20: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/20.jpg)
20Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Running Mean: Unique Data> plot(Pax ~ ATM, data=unique2010, main="...") > lines(fit, col="red", lwd=2)
19000 20000 21000 22000 23000 24000 250001400
000
1800
000
2200
000
ATM
Pax
Zurich Airport Data: Pax vs. ATM / Bandwidth=1000, x.points=ATM
![Page 21: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/21.jpg)
21Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Running Mean: Unique Data> plot(Pax ~ ATM, data=unique2010, main="...") > lines(fit, col="red", lwd=2)
19000 20000 21000 22000 23000 24000 250001400
000
1800
000
2200
000
ATM
Pax
Zurich Airport Data: Pax vs. ATM / Bandwidth=1000, n.points=1000
![Page 22: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/22.jpg)
22Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Running Mean: Drawbacks• The finer grained the evaluation points are, the less smooth
the fitted function turns out to be. This is unwanted. Reason: datapoints are "lost" abruptly.
• For large window width, we loose a lot of information on theboundaries. For small windows however, we may have toofew points withing the window, and thus instability.
There are much better smoothing algorithms!
We will introduce:a) a Gaussian Kernel Smoother, andb) the robust LOESS-Smoother
![Page 23: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/23.jpg)
23Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Gaussian Kernel SmootherKernelSmoother(x) = Gaussian bell curve weighted average
of y-values around x.
The estimate for , denoted as , is defined as:
,
The weights are defined as: , i.e.the window is infinitely wide,but distant observation obtain little weight.
( )f ˆ ( )f
2( )exp j
j
x xw
1
1
ˆ ( )
n
j jj
n
jj
w yf x
w
![Page 24: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/24.jpg)
24Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Gaussian Kernel Smoother: Idea
2 4 6 8 10
510
1520
25
5.556.31
7.68
11.07
13.62
15.59
17.96 18.18
21.44 21.83
![Page 25: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/25.jpg)
25Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
Gaussian Kernel Smoother: Unique Data> ks.gauss <- ksmooth(ATM, Pax, kernel="normal", band=1000)> plot(ATM, Pax, xlab="ATM", ylab="Pax", pch=20)> lines(ks.gauss, col="darkgreen", lwd=1.5)
19000 20000 21000 22000 23000 24000 250001400
000
1800
000
2200
000
ATM
Pax
Zurich Airport Data: Pax vs. ATM / Bandwidth=1000, n.points=1000
![Page 26: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/26.jpg)
26Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
LOESS-SmootherThe LOESS-Smoother is better, more flexible and more robust than the Gaussian Kernel Smoother. It should be prefered!
It works as follows:
1) Choose a window of fixed width
2) For this window, a straight line (or a parabola) is fitted to the datapoints within, using a robust fitting method.
3) Predicted value at window center := fitted value
4) Slide the window over the entire x-range
![Page 27: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/27.jpg)
27Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
LOESS-Smoother: Idea
2 4 6 8 10
510
1520
25
5.556.31
7.68
11.07
13.62
15.59
17.96 18.18
21.44 21.83
![Page 28: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/28.jpg)
28Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
LOESS-Smoother: R-Implementation
![Page 29: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/29.jpg)
29Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 02
LOESS-Smoother: Unique Data> smoo <- loess.smooth(unique2010$ATM, unique2010$Pax)> plot(Pax ~ ATM, data=unique2010, main=...)> lines(smoo, col="blue")
19000 20000 21000 22000 23000 24000 250001400
000
1800
000
2200
000
ATM
Pax
Loess-Glätter: Default-Einstellung
![Page 30: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/30.jpg)
30Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Linear Modeling
19000 20000 21000 22000 23000 24000 25000
1400
000
1600
000
1800
000
2000
000
2200
000
Flugbewegungen
Pax
Flughafen Zürich: Pax vs. ATMA straight line representsthe systematic relationbetween Pax and ATM.
• Only appropriate if thetrue relation is indeeda straight line
• The question is howthe line/function areobtained.
![Page 31: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/31.jpg)
31Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Simple Linear RegressionThe more air traffic movements, the more passengers there are. The relation seems to be linear, which is of course also themathematically most simple way of describing the relation.
, resp.
Name/meaning of the two "Intercept"parameters in the equation: "Slope"
Fitting a straight line into a 2-dimensional scatter plot is knownas simple linear regression. This is because: • there is just one single predictor variable ("simple").• the relation is linear in the parameters ("linear").
1( ) of x x
0 1
0 1Pax ATM
![Page 32: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/32.jpg)
32Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Model, Data & Random ErrorsNo we are bringing the data into play. The regression line will not run through all the data points. Thus, there are random errors:
, for all
Meaning of variables/parameters:is the response variable (Pax) of observation .is the predictor variable (ATM) of observation .are the regression coefficients. They are unknownpreviously, and need to be estimated from the data.is the residual or error, i.e. the random difference bet-ween observation and regression line.
0 1i i iy x E
ixiy
iE
0 1,
1,...,i n
ii
![Page 33: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/33.jpg)
33Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Least Squares Fitting http://sambaker.com/courses/J716/demos/LeastSquares/LeastSquaresDemo.html
We need to fit a straightline that fits the data well.
Many possible solutionsexist, some are good, some are worse.
Our paradigm is to fit theline such that the squarederrors are minimal.
![Page 34: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/34.jpg)
34Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Least Squares: MathematicsThe paradigm in verbatim...
Given a set of data points , the goal is to fit theregression line such that the sum of squared differencesbetween observed value and regression line is minimal. The function
measures, how well the regression line, defined by , fitsthe data. The goal is to minimize this "quality function".
Solution: see next slide...
2 2 20 1 0
1 1 1
ˆ( , ) ( ) ( ( )) min!n n n
i i i i ii i i
Q r y y y x
1,...,( , )i i i nx y
0 1,
iy
![Page 35: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/35.jpg)
35Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Solution Idea: Partial Derivatives• We are taking partial derivatives on the function with
respect to both arguments and . As we are after theminimum of the function, we set them to zero:
and
• This results in a linear equation system, which (here) has twounknowns , but also two equations. These are also known under the name normal equations.
• The solution for can be written explicitly as a function ofthe data pairs , see next slide...
00 1( , )Q
1
0
0Q
1
0Q
0 1, 1,...,( , )i i i nx y
0 1,
![Page 36: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/36.jpg)
36Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Least Squares: SolutionAccording to the least squares paradigm, the best fittingregression line is, i.e. the optimal coefficients are:
and
• For a given set of data points , we can determinethe solution with a pocket calculator (...or better, with R).
• The solution for our example Pax vs. ATM:obtained from
> lm(Pax ~ ATM, data=uni...)
11
2
1
( )( )ˆ
( )
n
i ii
n
ii
x x y y
x x
1,...,( , )i i i nx y
0 1ˆ ˆy x
1 0ˆ ˆ138.8, 1'197'682
![Page 37: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/37.jpg)
37Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Fitted ValuesThe estimated parameters can be used for determiningthe fitted values . Please note that mathematically, this is a conditional expected value::
In R, the fitted values are obtained as follows:
> fit <- lm(Pax ~ ATM, data=unique2010)> fitted(fit)
1 2 3 4 5 1654841 1808312 2165068 2156465 2184911
6 7 8 9 ... 2250545 2108731 2062107 1493184 ...
0 1ˆ ˆ,
y
0 1ˆ ˆˆ [ | ]y E y x x
![Page 38: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/38.jpg)
19000 20000 21000 22000 23000 24000 250001400
000
1800
000
2200
000
ATM
Pax
Zurich Airport Data: Pax vs. ATM
38
Applied Statistical RegressionAS 2013 – Week 04
Drawing the Regression Line> plot(Pax ~ ATM, data=unique2010, pch=20)> title("Zurich Airport Data: Pax vs. ATM")> abline(fit, col="red", lwd=2)
0 1ˆ ˆy x
![Page 39: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/39.jpg)
39
Applied Statistical RegressionAS 2013 – Week 04
Is This a Good Model for Predicting thePax Number from the ATM?a) Beyond the range of observed dataUnknown, but most likely not...
b) Within the range of observed dataYes, under the following conditions:- the relation is in truth a straight line, i.e. - the scatter of the errors is constant, i.e. - the data are uncorrelated (from a representative sample)- the errors are approximately normally distributed
Fodder for thougt: 9/11, SARS, Eyjafjallajökull...?Marcel Dettling, Zurich University of Applied Sciences
[ ] 0iE E 2( )iVar E
![Page 40: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/40.jpg)
40
Applied Statistical RegressionAS 2013 – Week 04
Model DiagnosticsFor assessing the quality of the regression line, we need to(at least roughly) check whether the assumptions are met:
and can be reviewed by:[ ] 0iE E 2( )iVar E
19000 21000 23000 25000
-1e+
05-5
e+04
0e+0
05e
+04
ATM
Res
idua
ls
Residuals vs. Predictor ATM
1400000 1800000 2200000
-1e+
05-5
e+04
0e+0
05e
+04
Fitted
Res
idua
ls
Residuals vs. Fitted Values
![Page 41: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/41.jpg)
41
Applied Statistical RegressionAS 2013 – Week 04
Model DiagnosticsFor assessing the quality of the regression line, we need to(at least roughly) check whether the assumptions are met:Gaussian distribution can be reviewed by:
We will revisit model diagnosticsagain later in this course, whereit will be discussed more deeply.
"Residuals vs. Fitted" and the"Normal Plot" will always stay atthe heart of model diagnostics.
-2 -1 0 1 2
-1e+
050e
+00
5e+0
4
Normal Q-Q Plot
Theoretical Quantiles
Sam
ple
Qua
ntile
s
![Page 42: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/42.jpg)
42
Applied Statistical RegressionAS 2013 – Week 04
Why Least Squares?History...
Within a few years (1801, 1805), the method was developedindependently by Gauss and Legendre. Both were after solvingapplied problems in astronomy...Source: http://de.wikipedia.org/wiki/Methode_der_kleinsten_Quadrate
Carl Friedrich Gauss
Marcel Dettling, Zurich University of Applied Sciences
Adrien-Marie Legendre
![Page 43: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/43.jpg)
43
Applied Statistical RegressionAS 2013 – Week 04
Why Least Squares?Mathematics...
• Least Squares is simple in the sense that the solution isknown in closed form as a function of .
• The line runs through the center of gravity
• The sum of residuals adds up to zero:
• Some deeper mathematical optimality can be shown whenanalyzing the large sample properties of the estimatesThis is especially true under the assumption of normallydistributed errors .
Marcel Dettling, Zurich University of Applied Sciences
1,...,( , )i i i nx y
( , )x y
1
0n
ii
r
0 1ˆ ˆ,
iE
![Page 44: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/44.jpg)
44Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Gauss-Markov-TheoremA mathematical optimality result for the Least Squares line
It only holds if the following conditions are met:
- the relation is in truth a straight line, i.e. - the scatter of the errors is constant, i.e. - the errors are uncorrelated, i.e.
Not yet required:- the errors are normally distributed:
Gauss-Markov-Theorem:- Least Squares yields the best linear unbiased estimates
[ ] 0iE E 2( )iVar E
( , ) 0,i jCov E E if i j
2~ (0, )i EE N
![Page 45: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/45.jpg)
45Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Properties of the Least Square EstimatesUnder the conditions above, the estimates are unbiased:
and
The variances of the estimates are as follows:
and
Precise estimates are obtained with:- a large number of observations- a good scatter in the predictor- an informative/useful predictor, making small- (an error distribution which is approximately Gaussian)
0 0ˆ[ ]E 1 1
ˆ[ ]E
20 2
1
1ˆ( )( )
E nii
xVarn x x
2
1 21
ˆ( )( )
En
ii
Varx x
nix
2E
![Page 46: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/46.jpg)
46Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Estimating the Error VarianceBesides the regression coefficients, we also need to estimate the error variance. It is a necessary ingredient for all tests and confidence intervals that will be discussed shortly.
The estimate is based on the residual sum of squares (RSS):
In R, the regression summary provides the estimate for theerror’s standard deviation as Residual standard error:
> summary(fit)...Residual standard error: 59700 on 22 degrees of freedom
2 2
1
1ˆ ˆ( )2
n
E i ii
y yn
![Page 47: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/47.jpg)
47Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Benefits of Linear Regression• Inference on the relation between and
The goal is to understand if and how strongly the responsevariable depends on the predictor. There are performanceindicators as well as statistical tests adressing the issue.
• Prediction of (future) observations
The regression line/equation can be employed to predictthe PAX number for any given ATM value.
However, this mostly will not work well for extrapolation!
0 1ˆ ˆy x
y x
![Page 48: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/48.jpg)
19000 20000 21000 22000 23000 24000 250001400
000
1800
000
2200
000
Flugbewegungen
Pax
Flughafen Zürich: Pax vs. ATM
48Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
: The Coefficient of Determination The coefficient of determination is also known as multipleR-squared. It tells which portion of the total variation isaccounted for by the regression line.
2R2R
![Page 49: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/49.jpg)
49Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Computation ofis the portion of the total variation that is explained through
regression. It is determined as one minus the quotient of theyellow arrow divided by the blue arrow.
The closer to 1 the value is, the tighter the datapoints arepacked around the regression line. However, there are noformal criteria which value needs to be met such thatthe regression can be said to be useful/valid.
2R
2
2 1
2
1
ˆ( )1 [0,1]
( )
n
i ii
n
ii
y yR
y y
2R
2R
![Page 50: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/50.jpg)
50Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Confidence Interval for the SlopeA 95%-CI for the slope tells which values (besidesthe point estimate ) are plausible, too. The uncertaintyis due to estimation/sampling effects.
95%-CI for : , resp.
In R: > fit <- lm(Pax ~ ATM, data=unique2010)> confint(fit, "ATM")
2.5 % 97.5 %ATM 124.4983 153.025
1
11 0.975; 2ˆ ˆnqt
1
2 21 0.975; 2 1ˆ ˆ ( )n
n E iiqt x x
1
1
![Page 51: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/51.jpg)
51Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Testing the SlopeThere is a statistical hypothesis test which can be used to check whether the slope is significantly different from zero, or any otherarbitrary value . The null hypothesis is:
, resp.
One usually tests two-sided on the 95%-level. The alternative is:
, resp.
As a test statistic, we use:
, resp. , , both have a distribution.
1
0 1: 0H 0 1:H b
b
1: 0AH 1:AH b
2nt 0 1
1
1: 0
ˆ
ˆ
ˆHT
0 1
1
1:
ˆ
ˆ
ˆH bbT
![Page 52: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/52.jpg)
52Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Reading R-Output> summary(lm(Pax ~ ATM, data=dat))
Call: lm(formula = Pax ~ ATM, data = dat)
Residuals: Min 1Q Median 3Q Max -104188 -40885 2099 48588 89154
Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -1.198e+06 1.524e+05 -7.858 7.94e-08 ***ATM 1.388e+02 6.878e+00 20.176 1.11e-15 ***
---
Residual standard error: 59700 on 22 degrees of freedomMultiple R-squared: 0.9487, Adjusted R-squared: 0.9464 F-statistic: 407.1 on 1 and 22 DF, p-value: 1.110e-15
Will be explained in detail on the blackboard!
![Page 53: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/53.jpg)
53Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Testing the SlopePractical Example:
Use the Pax vs. ATM data and perform a statistical test for thenull hypothesis . The information from the summaryon slide 25 can be used as a basis. Then, also answer:
a) Explain in colloquial language what was just tested. What isthe benefit of this test? What claims could motivate the test?
b) How does the testing result relate with the 95%-CI that wecomputed on slide 23? Would we be able to tell the testresults from the CI alone?
See blackboard for the answers
1
0 1: 150H
![Page 54: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/54.jpg)
54Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Testing the InterceptAn analogous test can be done for the intercept.
• No matter what the test result will be, the intercept shouldgenerally not be omitted from the regression model.
• The presence of the intercept protects against possiblenon-linearities and calibration errors of measurementdevices. If it is kicked out of the model, the results aregenerally worse.
• If theory dictates that there should not be an intercept butit is still significant, take this as evidence that the linear relation does not hold when extrapolating to .
0
0x
![Page 55: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/55.jpg)
55Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
PredictionUsing the regression line, we can predict the -value for any desired -value. The result is the expectation for given .
a.k.a. "fitted value"
Example: With 24’000 air traffic movements, we expect
Passengers
Be careful:
At best, interpolation within the range of observed -values is trustworthy. Extrapolation with ATM values such as 50’000, 5'000 or even 0 usually produces completely useless results.
0 1ˆ ˆˆ[ | ]E y x y x
1'197'682 24 '000 138.8 2'133'518
yx
x
y x
![Page 56: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/56.jpg)
56Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Prediction with RWe can use the regression fit object for prediction. The syntax for obtaining the fitted value(s) is as follows:
> fit <- lm(Pax ~ ATM, data=unique2010)> dat <- data.frame(ATM=c(24000)) > predict(fit, newdata=dat)1 2132598
The -values need to be provided in a data frame, where thevariable/column name is identical to the predictor name.
Then, the predict() procedure is invoked with the regressionfit and the new -values as arguments.
x
x
![Page 57: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/57.jpg)
57Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Confidence Interval forWe just computed the fitted value , i.e. the expectednumber of passengers for 24'000 ATMs. This is not a deter-ministic value, but an estimate that is subject to variability.
A 95%-CI for the fitted value at position is given by:
In R: > predict(fit, newdata=dat, interval="confidence")fit lwr upr
1 2132598 2095450 2169746
2
0 1 0.975; 2 21
1 ( )ˆ ˆ ˆ( )
n E nii
x xx qtn x x
[ | ]E y x0 1
ˆ ˆ x
x
![Page 58: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/58.jpg)
58Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Prediction Interval forThe confidence interval for tells about the variabilty ofthe fitted value. It does not account for the scatter of the datapoints around the regression line and thus does not define a region where we have to expect the observed value. A 95% prediction interval at position is given by:
In R: > predict(fit,newdata=dat,interval="prediction")fit lwr upr
1 2132598 2003343 2261853
y
x2
0 1 0.975; 2 21
1 ( )ˆ ˆ ˆ 1( )
n E nii
x xx qtn x x
[ | ]E y x
![Page 59: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/59.jpg)
59Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Confidence and Prediction Interval
19000 20000 21000 22000 23000 24000 250001400
000
1800
000
2200
000
ATM
Pax
Pax vs. ATM with Confidence and Prediction Interval
ConfidencePrediction
![Page 60: Marcel Dettling · 2013. 9. 24. · Marcel Dettling, Zurich University of Applied Sciences 6 Applied Statistical Regression AS 2013 – Week 02 What is Regression? Example: Fresh](https://reader033.fdocuments.in/reader033/viewer/2022051905/5ff7c3cb7b6ff4644d1ab71b/html5/thumbnails/60.jpg)
60Marcel Dettling, Zurich University of Applied Sciences
Applied Statistical RegressionAS 2013 – Week 04
Confidence and Prediction IntervalNote:
Visualizing the confidence and prediction intervals in R isnot straightforward, but requires some tedious handwork.
R-Hints:
dat <- data.frame(ATM=seq(...,..., length=200))pred <- predict(fit, newdata=dat, interval=...)plot(..., ..., main="...")lines(dat$ATM, pred[,2], col=...)lines(dat$ATM, pred[,3], col=...)