Post on 06-Dec-2020
7/17/2017 http://numericalmethods.eng.usf.edu 1
Linear Regression

Major: All Engineering Majors
Authors: Autar Kaw, Luke Snyder

http://numericalmethods.eng.usf.edu
Transforming Numerical Methods Education for STEM Undergraduates
What is Regression?

Given $n$ data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, best fit $y = f(x)$ to the data.

The residual at each point is

$$E_i = y_i - f(x_i)$$

Figure. Basic model for regression.
Linear Regression - Criterion #1

Given $n$ data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, best fit $y = a_0 + a_1 x$ to the data.

Does minimizing $\sum_{i=1}^{n} E_i$ work as a criterion, where $E_i = y_i - a_0 - a_1 x_i$?

Figure. Linear regression of y vs. x data showing residuals at a typical point, $x_i$.
Example for Criterion #1

Example: Given the data points (2,4), (3,6), (2,6), and (3,8), best fit the data to a straight line using Criterion #1, that is, minimize $\sum_{i=1}^{n} E_i$.

Table. Data points.

  x     y
  2.0   4.0
  3.0   6.0
  2.0   6.0
  3.0   8.0

Figure. Data points for y vs. x data.
Linear Regression - Criterion #1

Using $y = 4x - 4$ as the regression curve gives

$$\sum_{i=1}^{4} E_i = 0$$

Table. Residuals at each point for regression model y = 4x − 4.

  x     y     y_predicted   E = y − y_predicted
  2.0   4.0   4.0            0.0
  3.0   6.0   8.0           −2.0
  2.0   6.0   4.0            2.0
  3.0   8.0   8.0            0.0

Figure. Regression curve y = 4x − 4 and y vs. x data.
Linear Regression - Criterion #1

Using $y = 6$ as the regression curve gives

$$\sum_{i=1}^{4} E_i = 0$$

Table. Residuals at each point for regression model y = 6.

  x     y     y_predicted   E = y − y_predicted
  2.0   4.0   6.0           −2.0
  3.0   6.0   6.0            0.0
  2.0   6.0   6.0            0.0
  3.0   8.0   6.0            2.0

Figure. Regression curve y = 6 and y vs. x data.
Linear Regression - Criterion #1

$$\sum_{i=1}^{4} E_i = 0$$

for both regression models y = 4x − 4 and y = 6.

The sum of the residuals is minimized (in this case it is zero), but the regression model is not unique. Hence, the criterion of minimizing the sum of the residuals is a bad criterion.
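The non-uniqueness can be checked numerically. A minimal sketch in plain Python (the function name is illustrative, not from the slides):

```python
# Check that two different lines both drive the sum of residuals to zero
# for the example data points (2,4), (3,6), (2,6), (3,8).
points = [(2.0, 4.0), (3.0, 6.0), (2.0, 6.0), (3.0, 8.0)]

def sum_residuals(a0, a1):
    """Sum of residuals E_i = y_i - (a0 + a1*x_i) over all data points."""
    return sum(y - (a0 + a1 * x) for x, y in points)

print(sum_residuals(-4.0, 4.0))  # y = 4x - 4  -> 0.0
print(sum_residuals(6.0, 0.0))   # y = 6       -> 0.0
```

Positive and negative residuals cancel, which is exactly why this criterion fails.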
Linear Regression - Criterion #2

Will minimizing $\sum_{i=1}^{n} |E_i|$ work any better, where $E_i = y_i - a_0 - a_1 x_i$ for the model $y = a_0 + a_1 x$?

Figure. Linear regression of y vs. x data showing residuals at a typical point, $x_i$.
Example for Criterion #2

Example: Given the data points (2,4), (3,6), (2,6), and (3,8), best fit the data to a straight line using Criterion #2, that is, minimize $\sum_{i=1}^{n} |E_i|$.

Table. Data points.

  x     y
  2.0   4.0
  3.0   6.0
  2.0   6.0
  3.0   8.0

Figure. Data points for y vs. x data.
Linear Regression - Criterion #2

Using $y = 4x - 4$ as the regression curve gives

$$\sum_{i=1}^{4} |E_i| = 4$$

Table. Residuals at each point for regression model y = 4x − 4.

  x     y     y_predicted   E = y − y_predicted
  2.0   4.0   4.0            0.0
  3.0   6.0   8.0           −2.0
  2.0   6.0   4.0            2.0
  3.0   8.0   8.0            0.0

Figure. Regression curve y = 4x − 4 and y vs. x data.
Linear Regression - Criterion #2

Using $y = 6$ as the regression curve gives

$$\sum_{i=1}^{4} |E_i| = 4$$

Table. Residuals at each point for regression model y = 6.

  x     y     y_predicted   E = y − y_predicted
  2.0   4.0   6.0           −2.0
  3.0   6.0   6.0            0.0
  2.0   6.0   6.0            0.0
  3.0   8.0   6.0            2.0

Figure. Regression curve y = 6 and y vs. x data.
Linear Regression - Criterion #2

$$\sum_{i=1}^{4} |E_i| = 4$$

for both regression models y = 4x − 4 and y = 6.

The sum of the absolute residuals has been made as small as possible (4), but the regression model is not unique. Hence, the criterion of minimizing the sum of the absolute value of the residuals is also a bad criterion.
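This tie can also be checked numerically. A minimal sketch in plain Python (the function name is illustrative):

```python
# Check that both candidate lines give the same sum of absolute residuals
# for the example data points (2,4), (3,6), (2,6), (3,8).
points = [(2.0, 4.0), (3.0, 6.0), (2.0, 6.0), (3.0, 8.0)]

def sum_abs_residuals(a0, a1):
    """Sum of |E_i| = |y_i - (a0 + a1*x_i)| over all data points."""
    return sum(abs(y - (a0 + a1 * x)) for x, y in points)

print(sum_abs_residuals(-4.0, 4.0))  # y = 4x - 4  -> 4.0
print(sum_abs_residuals(6.0, 0.0))   # y = 6       -> 4.0
```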
Least Squares Criterion

The least squares criterion minimizes the sum of the squares of the residuals in the model, and also produces a unique line:

$$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$

Figure. Linear regression of y vs. x data showing residuals at a typical point, $x_i$.
Finding Constants of Linear Model

Minimize the sum of the squares of the residuals:

$$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$

To find $a_0$ and $a_1$, we minimize $S_r$ with respect to $a_0$ and $a_1$:

$$\frac{\partial S_r}{\partial a_0} = \sum_{i=1}^{n} 2(y_i - a_0 - a_1 x_i)(-1) = 0$$

$$\frac{\partial S_r}{\partial a_1} = \sum_{i=1}^{n} 2(y_i - a_0 - a_1 x_i)(-x_i) = 0$$

giving the normal equations

$$n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$$

$$a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$$
Finding Constants of Linear Model

Solving for $a_0$ and $a_1$ directly yields

$$a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$

and

$$a_0 = \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} = \bar{y} - a_1 \bar{x}$$
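These closed-form formulas translate directly into code. A minimal sketch in plain Python (the function name `linear_fit` is illustrative, not from the slides):

```python
def linear_fit(x, y):
    """Least-squares constants (a0, a1) for the model y = a0 + a1*x."""
    n = len(x)
    sx = sum(x)
    sy = sum(y)
    sxx = sum(xi * xi for xi in x)                 # sum of x_i^2
    sxy = sum(xi * yi for xi, yi in zip(x, y))     # sum of x_i*y_i
    a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a0 = sy / n - a1 * (sx / n)                    # a0 = ybar - a1*xbar
    return a0, a1

# For the four example points used earlier, least squares gives a
# unique line, y = 1 + 2x:
print(linear_fit([2.0, 3.0, 2.0, 3.0], [4.0, 6.0, 6.0, 8.0]))  # (1.0, 2.0)
```

Unlike Criteria #1 and #2, the least squares criterion picks out a single line for this data.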
Example 1

The torque, T, needed to turn the torsion spring of a mousetrap through an angle, θ, is given below. Find the constants of the model

$$T = k_1 + k_2 \theta$$

Table. Torque vs. angle for a torsional spring.

  Angle, θ (radians)   Torque, T (N-m)
  0.698132             0.188224
  0.959931             0.209138
  1.134464             0.230052
  1.570796             0.250965
  1.919862             0.313707

Figure. Data points for torque vs. angle data.
Example 1 cont.

The following table shows the summations needed for the calculation of the constants of the regression model, with n = 5.

Table. Tabulation of data for calculation of needed summations.

  θ (radians)   T (N-m)    θ² (radians²)   θT (N-m-radians)
  0.698132      0.188224   0.487388        0.131405
  0.959931      0.209138   0.921468        0.200758
  1.134464      0.230052   1.2870          0.260986
  1.570796      0.250965   2.4674          0.394215
  1.919862      0.313707   3.6859          0.602274
  Σ = 6.2831    1.1921     8.8491          1.5896

Using the slope formula with T and θ,

$$k_2 = \frac{n \sum_{i=1}^{5} \theta_i T_i - \sum_{i=1}^{5} \theta_i \sum_{i=1}^{5} T_i}{n \sum_{i=1}^{5} \theta_i^2 - \left(\sum_{i=1}^{5} \theta_i\right)^2} = \frac{5(1.5896) - (6.2831)(1.1921)}{5(8.8491) - (6.2831)^2} = 9.6091 \times 10^{-2} \text{ N-m/rad}$$
Example 1 cont.

Use the average torque and average angle to calculate $k_1$:

$$\bar{T} = \frac{\sum_{i=1}^{5} T_i}{n} = \frac{1.1921}{5} = 2.3842 \times 10^{-1} \text{ N-m}$$

$$\bar{\theta} = \frac{\sum_{i=1}^{5} \theta_i}{n} = \frac{6.2831}{5} = 1.2566 \text{ radians}$$

Using

$$k_1 = \bar{T} - k_2 \bar{\theta} = 2.3842 \times 10^{-1} - (9.6091 \times 10^{-2})(1.2566) = 1.1767 \times 10^{-1} \text{ N-m}$$
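The hand calculation above can be cross-checked in code. A minimal sketch in plain Python using the torsion-spring data from the table (variable names are illustrative):

```python
theta = [0.698132, 0.959931, 1.134464, 1.570796, 1.919862]   # radians
torque = [0.188224, 0.209138, 0.230052, 0.250965, 0.313707]  # N-m

n = len(theta)
s_th = sum(theta)
s_T = sum(torque)
s_th2 = sum(t * t for t in theta)
s_thT = sum(t * q for t, q in zip(theta, torque))

# Same closed-form formulas as the slides, applied to T = k1 + k2*theta
k2 = (n * s_thT - s_th * s_T) / (n * s_th2 - s_th ** 2)  # slope, N-m/rad
k1 = s_T / n - k2 * (s_th / n)                           # intercept, N-m

print(k2)  # about 9.609e-2
print(k1)  # about 1.177e-1
```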
Example 1 Results

Using linear regression, a trend line is found from the data.

Figure. Linear regression of torque versus angle data.

Can you find the energy in the spring if it is twisted from 0 to 180 degrees?
Linear Regression (special case)

Given $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, best fit $y = a_1 x$ to the data.
Linear Regression (special case cont.)

Is it correct to simply use the general two-constant formula

$$a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$

for the model $y = a_1 x$?
Linear Regression (special case cont.)

Figure. Linear regression of y vs. x data for the model $y = a_1 x$, showing the residual $\varepsilon_i = y_i - a_1 x_i$ at a typical point, $(x_i, y_i)$.
Linear Regression (special case cont.)

Residual at each data point:

$$\varepsilon_i = y_i - a_1 x_i$$

Sum of squares of residuals:

$$S_r = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - a_1 x_i)^2$$
Linear Regression (special case cont.)

Differentiating with respect to $a_1$ gives

$$\frac{dS_r}{da_1} = \sum_{i=1}^{n} 2(y_i - a_1 x_i)(-x_i) = -2\sum_{i=1}^{n} (x_i y_i - a_1 x_i^2)$$

Setting $\dfrac{dS_r}{da_1} = 0$ gives

$$a_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$$
Linear Regression (special case cont.)

Does this value of $a_1$ correspond to a local minimum or a local maximum? From

$$\frac{dS_r}{da_1} = \sum_{i=1}^{n} 2(y_i - a_1 x_i)(-x_i)$$

the second derivative is

$$\frac{d^2 S_r}{da_1^2} = 2\sum_{i=1}^{n} x_i^2 > 0$$

Since the second derivative is positive, $a_1 = \dfrac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$ corresponds to a local minimum.
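The special-case formula is a one-liner in code. A minimal sketch in plain Python (the function name is illustrative):

```python
def fit_through_origin(x, y):
    """Least-squares slope a1 for the model y = a1*x (line through the origin)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# For data that lies exactly on y = 2x, the formula recovers the slope:
print(fit_through_origin([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # 2.0
```

Note this is not the same as setting $a_0 = 0$ in the two-constant result; the model with the intercept removed has its own minimizer.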
Linear Regression (special case cont.)

Is this local minimum of $S_r$ an absolute minimum of $S_r$? Since $S_r$ is a quadratic in $a_1$ with a positive leading coefficient, the local minimum is the absolute minimum.
Example 2

To find the longitudinal modulus of a composite, the following data is collected. Find the longitudinal modulus, E, using the regression model

$$\sigma = E\varepsilon$$

and the sum of the squares of the residuals.

Table. Stress vs. strain data.

  Strain (%)   Stress (MPa)
  0            0
  0.183        306
  0.36         612
  0.5324       917
  0.702        1223
  0.867        1529
  1.0244       1835
  1.1774       2140
  1.329        2446
  1.479        2752
  1.5          2767
  1.56         2896

Figure. Data points for stress vs. strain data.
Example 2 cont.

Table. Summation data for regression model.

  i    ε             σ (Pa)        ε²            εσ
  1    0.0000        0.0000        0.0000        0.0000
  2    1.8300×10⁻³   3.0600×10⁸    3.3489×10⁻⁶   5.5998×10⁵
  3    3.6000×10⁻³   6.1200×10⁸    1.2960×10⁻⁵   2.2032×10⁶
  4    5.3240×10⁻³   9.1700×10⁸    2.8345×10⁻⁵   4.8821×10⁶
  5    7.0200×10⁻³   1.2230×10⁹    4.9280×10⁻⁵   8.5855×10⁶
  6    8.6700×10⁻³   1.5290×10⁹    7.5169×10⁻⁵   1.3256×10⁷
  7    1.0244×10⁻²   1.8350×10⁹    1.0494×10⁻⁴   1.8798×10⁷
  8    1.1774×10⁻²   2.1400×10⁹    1.3863×10⁻⁴   2.5196×10⁷
  9    1.3290×10⁻²   2.4460×10⁹    1.7662×10⁻⁴   3.2507×10⁷
  10   1.4790×10⁻²   2.7520×10⁹    2.1874×10⁻⁴   4.0702×10⁷
  11   1.5000×10⁻²   2.7670×10⁹    2.2500×10⁻⁴   4.1505×10⁷
  12   1.5600×10⁻²   2.8960×10⁹    2.4336×10⁻⁴   4.5178×10⁷
  Σ                                1.2764×10⁻³   2.3337×10⁸

With $n = 12$,

$$\sum_{i=1}^{12} \varepsilon_i^2 = 1.2764 \times 10^{-3} \qquad \sum_{i=1}^{12} \varepsilon_i \sigma_i = 2.3337 \times 10^{8}$$

$$E = \frac{\sum_{i=1}^{12} \varepsilon_i \sigma_i}{\sum_{i=1}^{12} \varepsilon_i^2} = \frac{2.3337 \times 10^{8}}{1.2764 \times 10^{-3}} = 182.84 \text{ GPa}$$
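The tabulated computation above can be reproduced in code. A minimal sketch in plain Python with the stress-strain data (strain converted from % to m/m, stress from MPa to Pa; variable names are illustrative):

```python
strain = [0.0, 0.00183, 0.0036, 0.005324, 0.00702, 0.00867,
          0.010244, 0.011774, 0.01329, 0.01479, 0.015, 0.0156]  # m/m
stress = [0.0, 306e6, 612e6, 917e6, 1223e6, 1529e6,
          1835e6, 2140e6, 2446e6, 2752e6, 2767e6, 2896e6]       # Pa

# Special-case least squares for sigma = E*eps (line through the origin):
# E = sum(eps*sig) / sum(eps^2)
E = sum(e * s for e, s in zip(strain, stress)) / sum(e * e for e in strain)
print(E / 1e9)  # about 182.84 (GPa)
```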
Example 2 Results

The equation

$$\sigma = 182.84 \times 10^{9} \varepsilon$$

describes the data.

Figure. Linear regression for stress vs. strain data.
Additional Resources

For all resources on this topic, such as digital audiovisual lectures, primers, textbook chapters, multiple-choice tests, worksheets in MATLAB, MATHEMATICA, MathCad and MAPLE, blogs, and related physical problems, please visit

http://numericalmethods.eng.usf.edu/topics/linear_regression.html

THE END