2DS00 Statistics 1 for Chemical Engineering Lecture 3.
Date posted: 19-Dec-2015
Week schedule
Week 1: Measurement and statistics
Week 2: Error propagation
Week 3: Simple linear regression analysis
Week 4: Multiple linear regression analysis
Week 5: Nonlinear regression analysis
Detailed contents of week 3
• Least Squares Method
• simple linear regression
– parameter estimates
– residuals
– confidence intervals
– significance test
– influential points
– lack-of-fit
Time (s)   Measured distance   Computed distance   Measured - computed   Square
 1          36.754              21                 15.754                 248.19
 2          71.845              32                 39.845                1587.62
 3          60.479              43                 17.479                 305.52
 4         101.149              54                 47.149                2223.03
 5         103.150              65                 38.150                1455.42
 6         111.148              76                 35.148                1235.38
 7         142.170              87                 55.170                3043.73
 8         157.334              98                 59.334                3520.52
 9         161.843             109                 52.843                2792.38
10         206.030             120                 86.030                7401.16
Sum of squares                                                          23812.96
Table of measurements and squares
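The bottom row of the table can be reproduced with a few lines of arithmetic. The sketch below assumes the computed distances follow 10 + 11 t; this is inferred from the computed-distance column (21, 32, ..., 120) and is not stated on the slide. The function name `sum_of_squares` is illustrative.

```python
# Measured distances from the table, at times t = 1..10 s.
times = list(range(1, 11))
measured = [36.754, 71.845, 60.479, 101.149, 103.150,
            111.148, 142.170, 157.334, 161.843, 206.030]


def sum_of_squares(s0, v, t_values, y_values):
    """Sum of squared differences between measured and computed distance.

    The computed distance is the straight line s0 + v * t.
    """
    return sum((y - (s0 + v * t)) ** 2 for t, y in zip(t_values, y_values))


# Assumption: s0 = 10 and v = 11 reproduce the computed-distance column.
total = sum_of_squares(10.0, 11.0, times, measured)
print(round(total, 2))
```

The result agrees with the tabulated sum of squares up to rounding of the individual squares.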
Visualisation of sums of squares
[Figure: sum of squares plotted as a function of parameter v (axis from 8.56 to 23.56) and parameter S0 (axis from -10 to 50); sum-of-squares axis from 1117 to 14965.]
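The least squares method picks the parameter pair at the bottom of this sum-of-squares surface. As a minimal sketch, assuming the straight-line model distance = S0 + v t for the measurements in the table, the minimum can be found with the standard closed-form formulas instead of a search. The function names are illustrative.

```python
times = list(range(1, 11))
measured = [36.754, 71.845, 60.479, 101.149, 103.150,
            111.148, 142.170, 157.334, 161.843, 206.030]


def fit_line(x_values, y_values):
    """Closed-form least squares estimates for y = intercept + slope * x."""
    n = len(x_values)
    x_mean = sum(x_values) / n
    y_mean = sum(y_values) / n
    s_xx = sum((x - x_mean) ** 2 for x in x_values)
    s_xy = sum((x - x_mean) * (y - y_mean)
               for x, y in zip(x_values, y_values))
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def sum_of_squares(s0, v, t_values, y_values):
    """Sum of squared differences between measurements and s0 + v * t."""
    return sum((y - (s0 + v * t)) ** 2 for t, y in zip(t_values, y_values))


s0_hat, v_hat = fit_line(times, measured)
sse_min = sum_of_squares(s0_hat, v_hat, times, measured)

# Any other parameter pair gives a larger sum of squares.
sse_nearby = sum_of_squares(s0_hat + 1.0, v_hat - 0.5, times, measured)
print(s0_hat, v_hat, sse_min, sse_nearby)
```

Both estimates fall inside the parameter ranges shown on the axes of the figure.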
Types of regression analysis
Linear means linear in the coefficients, not that the fitted function is a straight line!
•Simple linear regression
•Multiple linear regression
• Non-linear regression
$Y = \beta_0 + \beta_1 x + \varepsilon$
$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \varepsilon$
$Y = \beta_1 C^{\beta_2} + \varepsilon$
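To make "linear in the coefficients" concrete: a polynomial in x is still a linear regression model, whereas a model with a coefficient inside an exponent is not. The two models below are illustrative examples and are not taken from the slides.

```latex
% Linear in the coefficients: still linear regression, although curved in x
Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon

% Not linear in the coefficients: nonlinear regression
Y = \beta_0 \, e^{\beta_1 x} + \varepsilon
```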
Surface tension nitrobenzene
• measurements of temperature and surface tension
• temperature ranges from 40 to 200 °C
• scatter plot indicates linear relation
Confidence intervals
• parameter estimates: estimate ± $t_{14-2;\,0.025}$ × standard error
• predicted values (extrapolation is dangerous, most accurate
predictions at mean of independent variable)
-----------------------------------------------------------------------------
                             Standard       T
Parameter     Estimate       Error          Statistic     P-Value
-----------------------------------------------------------------------------
Intercept     45.9629        0.164407       279.568       0.0000
Slope         -0.113016      0.0014057      -80.3984      0.0000
-----------------------------------------------------------------------------
Significance testing
Analysis of Variance
-----------------------------------------------------------------------------
Source           Sum of Squares    Df    Mean Square    F-Ratio    P-Value
-----------------------------------------------------------------------------
Model            374.924            1    374.924        6463.91    0.0000
Residual         0.696032          12    0.0580027
-----------------------------------------------------------------------------
Total (Corr.)    375.62            13
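The confidence intervals and test statistics for the nitrobenzene fit can be re-derived from the reported estimates, standard errors and sums of squares. The sketch below is plain arithmetic on those numbers. The critical value 2.179 is an assumption taken from a standard t table (two-sided 95%, 12 degrees of freedom), not from the slides, and the helper name `confidence_interval` is illustrative.

```python
# Values copied from the regression output (n = 14 observations).
n = 14
intercept, se_intercept = 45.9629, 0.164407
slope, se_slope = -0.113016, 0.0014057

# Assumption: tabulated two-sided 95% critical value of Student's t
# with n - 2 = 12 degrees of freedom.
T_CRIT_12 = 2.179


def confidence_interval(estimate, std_error, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


ci_intercept = confidence_interval(intercept, se_intercept, T_CRIT_12)
ci_slope = confidence_interval(slope, se_slope, T_CRIT_12)

# t statistic for the slope: estimate divided by its standard error.
t_slope = slope / se_slope

# F ratio from the ANOVA table: model mean square / residual mean square.
ms_model = 374.924 / 1
ms_residual = 0.696032 / (n - 2)
f_ratio = ms_model / ms_residual

print(ci_intercept, ci_slope)
print(t_slope, f_ratio, t_slope ** 2)
```

In simple linear regression the F ratio equals the square of the slope's t statistic, so the two tests of significance give the same conclusion. The slope interval does not contain zero, which matches the very small P-value.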
Simple linear regression: model assumptions
Model: $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
Assumptions:
• the model is linear (and contains enough terms)
• the $\varepsilon_i$'s are normally distributed with $\mu = 0$ and constant variance $\sigma^2$
• the $\varepsilon_i$'s are independent.
Normality checking + independence
• check normality by considering residuals
• apply both graphical checks and the Shapiro-Wilk test
• check independence by using the Durbin-Watson test
• also check residuals by plotting them against time
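The Durbin-Watson statistic is simple enough to compute by hand: the sum of squared differences between successive residuals, divided by the sum of squared residuals. A minimal sketch follows, with residuals assumed to be in time order. The two residual series are hypothetical and serve only to show the extremes.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that flip sign at every step.
alternating = [1.0, -1.0, 1.0, -1.0]
# Hypothetical residuals that drift steadily upwards over time.
drifting = [-2.0, -1.0, 0.0, 1.0, 2.0]

dw_alternating = durbin_watson(alternating)
dw_drifting = durbin_watson(drifting)
print(dw_alternating, dw_drifting)
```

The statistic lies between 0 and 4. Values near 2 are consistent with independent errors, values towards 0 point to positive serial correlation, and values towards 4 to negative serial correlation.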
Residuals
• use studentized residuals to obtain a universal scale
• e versus $\hat{Y}$: homogeneity of variance
• e versus $\hat{Y}$: linearity
• e versus time: independence of errors
• e versus $x_i$: homogeneity of variance
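As a minimal sketch of what "studentized" means here, the code below computes internally studentized residuals for a straight-line fit: each residual is divided by its own estimated standard deviation, s·sqrt(1 - h_i), where h_i is the leverage of point i. Using the internal variant is an assumption; statistical packages may report the externally studentized (deleted) variant instead. The data are the time and distance measurements from the earlier table, and the function name is illustrative.

```python
import math

times = list(range(1, 11))
measured = [36.754, 71.845, 60.479, 101.149, 103.150,
            111.148, 142.170, 157.334, 161.843, 206.030]


def studentized_residuals(x_values, y_values):
    """Internally studentized residuals and leverages for y = b0 + b1 * x."""
    n = len(x_values)
    x_mean = sum(x_values) / n
    y_mean = sum(y_values) / n
    s_xx = sum((x - x_mean) ** 2 for x in x_values)
    s_xy = sum((x - x_mean) * (y - y_mean)
               for x, y in zip(x_values, y_values))
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean

    residuals = [y - (intercept + slope * x)
                 for x, y in zip(x_values, y_values)]
    # Residual standard deviation with n - 2 degrees of freedom.
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    # Leverage of each point in simple linear regression.
    leverages = [1.0 / n + (x - x_mean) ** 2 / s_xx for x in x_values]
    student = [e / (s * math.sqrt(1.0 - h))
               for e, h in zip(residuals, leverages)]
    return student, leverages


student, leverages = studentized_residuals(times, measured)
for t, r, h in zip(times, student, leverages):
    print(t, round(r, 3), round(h, 3))
```

Because the scaled residuals are unitless, the same rough thresholds (for example, absolute values well above 2) can be used to flag candidate outliers regardless of the measurement units.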
Lack-of-fit test
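• if replicate measurements (several observations at the same x) are available, we may test whether
the model can be improved significantly
• the test is based on two different ways of computing the standard deviation: from the replicates alone (pure error) and from the residuals of the fitted model
• note the difference from the test of whether the model is significant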
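The lack-of-fit test can be sketched as follows: the residual sum of squares is split into pure error (scatter of replicates around their own group mean) and lack of fit (the remainder), and the two mean squares are compared in an F ratio. The data below are hypothetical, chosen to follow a curve so that a straight line fits poorly. The function name and return values are illustrative.

```python
def lack_of_fit(x_values, y_values):
    """F ratio for lack of fit of a straight line, given replicated x values.

    Returns (f_ratio, df_lack_of_fit, df_pure_error).
    """
    n = len(x_values)
    x_mean = sum(x_values) / n
    y_mean = sum(y_values) / n
    s_xx = sum((x - x_mean) ** 2 for x in x_values)
    s_xy = sum((x - x_mean) * (y - y_mean)
               for x, y in zip(x_values, y_values))
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    ss_residual = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(x_values, y_values))

    # Group the replicates by their x value.
    groups = {}
    for x, y in zip(x_values, y_values):
        groups.setdefault(x, []).append(y)

    # Pure error: scatter of replicates around their own group mean.
    ss_pure_error = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        ss_pure_error += sum((y - group_mean) ** 2 for y in ys)

    m = len(groups)                      # number of distinct x values
    ss_lack_of_fit = ss_residual - ss_pure_error
    df_lack_of_fit = m - 2               # two parameters in the line
    df_pure_error = n - m
    f_ratio = ((ss_lack_of_fit / df_lack_of_fit)
               / (ss_pure_error / df_pure_error))
    return f_ratio, df_lack_of_fit, df_pure_error


# Hypothetical duplicate measurements following a curve (roughly x squared).
x_data = [1, 1, 2, 2, 3, 3, 4, 4]
y_data = [1.1, 0.9, 4.2, 3.8, 9.1, 8.9, 16.2, 15.8]

f_ratio, df_lof, df_pe = lack_of_fit(x_data, y_data)
print(f_ratio, df_lof, df_pe)
```

A large F ratio, compared with the F distribution on the two returned degrees of freedom, suggests that a model with more terms would fit significantly better, even when the straight-line regression itself is highly significant.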
Influential points
regression lines tend to be pulled towards remote points: see
http://www.stat.sc.edu/~west/javahtml/Regression.html
[Figure: scatter plot of Y against X with one remote point labelled "Influential point".]
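The pull of a remote point is easy to see by fitting the line twice, with and without that point. The sketch below uses hypothetical data: five points lying close to a line with slope near 1, plus one remote point at x = 20. The function name is illustrative.

```python
def fit_line(x_values, y_values):
    """Closed-form least squares estimates for y = intercept + slope * x."""
    n = len(x_values)
    x_mean = sum(x_values) / n
    y_mean = sum(y_values) / n
    s_xx = sum((x - x_mean) ** 2 for x in x_values)
    s_xy = sum((x - x_mean) * (y - y_mean)
               for x, y in zip(x_values, y_values))
    slope = s_xy / s_xx
    return y_mean - slope * x_mean, slope


# Hypothetical cluster of points close to the line y = x.
x_cluster = [1.0, 2.0, 3.0, 4.0, 5.0]
y_cluster = [1.1, 1.9, 3.2, 3.9, 5.0]

# Hypothetical remote point, far from the cluster in the x direction.
x_remote, y_remote = 20.0, 5.0

_, slope_without = fit_line(x_cluster, y_cluster)
_, slope_with = fit_line(x_cluster + [x_remote], y_cluster + [y_remote])
print(slope_without, slope_with)
```

A single remote observation changes the slope drastically here, which is why the fit should be checked with and without any suspected influential point.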
Check-list
1. apply regression analysis
2. check whether the regression is significant. If applicable, apply the lack-of-fit test
3. study residual plots for constant variance
4. check for outliers
5. check normality of residuals (graphical checks, Shapiro-Wilk test)
6. check independence of residuals (residual plots, Durbin-Watson test)
7. check for influential points