Post on 15-Sep-2020
Aravali College of Engineering and Management,
Faridabad
Department of Computer Science & Engineering
(July – Dec 2020)
05/22/2023 1
Introduction to Regression Analysis
Slide-8
Regression analysis is used to:
Predict the value of a dependent variable based on the value of at least one independent variable
Explain the impact of changes in an independent variable on the dependent variable
Dependent variable: the variable we wish to predict or explain
Independent variable: the variable used to explain the dependent variable
Simple Linear Regression Model
Slide-9
Only one independent variable, X
Relationship between X and Y is described by a linear function
Changes in Y are assumed to be caused by changes in X
Types of Relationships
Slide-10
[Figure: scatter plots of Y against X in two panels, "Linear relationships" and "Curvilinear relationships"]
Types of Relationships
(continued)
Slide-11
[Figure: scatter plots of Y against X in two panels, "Strong relationships" and "Weak relationships"]
Types of Relationships
(continued)
Slide-12
[Figure: scatter plots of Y against X, "No relationship"]
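Simple Linear Regression Model
Slide-13
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
where
$Y_i$: dependent variable
$X_i$: independent variable
$\beta_0$: population Y intercept
$\beta_1$: population slope coefficient
$\varepsilon_i$: random error term
Linear component: $\beta_0 + \beta_1 X_i$
Random error component: $\varepsilon_i$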
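To make the two components of the population model concrete, the following minimal sketch generates Y values as a linear component plus a random error. The parameter values (`beta0`, `beta1`, `sigma`), the X values, and the choice of normally distributed errors are illustrative assumptions, not figures from the slides.

```python
import random


def simulate(xs, beta0, beta1, sigma, seed=0):
    """Return (x, linear component, random error, y) for each x.

    Assumption: errors are drawn from a normal distribution with
    mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rows = []
    for x in xs:
        linear = beta0 + beta1 * x      # linear component
        eps = rng.gauss(0.0, sigma)     # random error component
        rows.append((x, linear, eps, linear + eps))
    return rows


# Hypothetical parameter values, chosen only for illustration.
rows = simulate([1000, 1500, 2000, 2500], beta0=100.0, beta1=0.1, sigma=40.0)
for x, linear, eps, y in rows:
    print(f"X={x}  linear={linear:.1f}  error={eps:+.1f}  Y={y:.1f}")
```

Each observed Y differs from the value on the line only by its error term, which is the decomposition the model expresses.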
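Simple Linear Regression Model
(continued)
Slide-14
[Figure: the population regression line plotted on X and Y axes, labeled $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, with Intercept = $\beta_0$ and Slope = $\beta_1$. At $X_i$, the observed value of Y and the predicted value of Y are marked; the vertical gap between them is the random error $\varepsilon_i$ for this X value.]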
Simple Linear Regression Equation (Prediction Line)
Slide-15
The simple linear regression equation provides an estimate of the population regression line:
$$\hat{Y}_i = b_0 + b_1 X_i$$
where
$\hat{Y}_i$: estimated (or predicted) Y value for observation i
$X_i$: value of X for observation i
$b_0$: estimate of the regression intercept
$b_1$: estimate of the regression slope
The individual random error terms ei have a mean of zero
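The slides do not show how b0 and b1 are obtained. Assuming the usual least-squares criterion (which the later slide on pitfalls names), the estimates and the residuals could be written as follows. The symbols $\bar{X}$, $\bar{Y}$, $n$ and the residual definition $e_i = Y_i - \hat{Y}_i$ are notation introduced here for illustration.

```latex
\begin{align*}
b_1 &= \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, &
b_0 &= \bar{Y} - b_1 \bar{X}, \\
e_i &= Y_i - \hat{Y}_i = Y_i - (b_0 + b_1 X_i), &
\sum_{i=1}^{n} e_i &= 0 .
\end{align*}
```

Under this criterion, when the fitted line includes an intercept, the residuals sum to zero, so their sample mean is zero.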
Sample Data for House Price Model
Slide-16
House Price in $1000s (Y)    Square Feet (X)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700
Regression Using Excel
Slide-17
Tools / Data Analysis / Regression
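For readers without Excel, a comparable fit can be sketched in plain Python. This is a minimal sketch that assumes the standard least-squares formulas for the slope and intercept; the function name `fit_line` is hypothetical, and the data are the ten house price observations from the sample data slide.

```python
# Sample data from the house price slide.
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]  # in $1000s


def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(square_feet, price)
print(f"b0 = {b0:.4f}, b1 = {b1:.5f}")  # b0 = 98.2483, b1 = 0.10977
```

Read against the units of the data, the slope suggests that each additional square foot is associated with an increase of roughly 0.11 thousand dollars in estimated price.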
Assumptions of Regression
Department of Statistics, ITS Surabaya Slide-18
Use the acronym LINE:
Linearity: The underlying relationship between X and Y is linear
Independence of Errors: Error values are statistically independent
Normality of Error: Error values (ε) are normally distributed for any given value of X
Equal Variance (Homoscedasticity): The probability distribution of the errors has constant variance
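One informal way to start evaluating these assumptions is to look at the residuals of a fitted line. The sketch below assumes a least-squares fit to the house price sample data and simply lists the residuals in order of X; the helper name `fit_line` is hypothetical, and inspecting a listing like this is only a rough substitute for a residual plot or a formal test.

```python
from statistics import mean

# Sample data from the house price slide.
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]  # in $1000s


def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


b0, b1 = fit_line(square_feet, price)

# Residual = observed Y minus predicted Y for the same X.
residuals = [y - (b0 + b1 * x) for x, y in zip(square_feet, price)]

# Listing residuals in order of X: a curved pattern would hint at
# non-linearity, and a widening or narrowing spread at unequal variance.
for x, e in sorted(zip(square_feet, residuals)):
    print(f"X={x:>5}  residual={e:>8.2f}")

print(f"mean residual = {mean(residuals):.6f}")
```

With only ten observations, any pattern seen here is suggestive at best.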
Pitfalls of Regression Analysis
Slide-19
Lacking an awareness of the assumptions underlying least-squares regression
Not knowing how to evaluate the assumptions
Not knowing the alternatives to least-squares regression if a particular assumption is violated
Using a regression model without knowledge of the subject matter
Extrapolating outside the relevant range
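The last pitfall can be guarded against in code by refusing to predict outside the range of X seen in the sample. In this minimal sketch the coefficients are rounded least-squares estimates for the house price sample data (computed separately, so treat them as an assumption), the range limits are the smallest and largest square footage in that sample, and the function name `predict_price` is hypothetical.

```python
# Rounded least-squares estimates for the house price sample data
# (assumption: computed separately from the ten observations).
B0 = 98.248   # intercept, in $1000s
B1 = 0.10977  # slope, in $1000s per square foot

# Relevant range: smallest and largest square footage in the sample.
X_MIN, X_MAX = 1100, 2450


def predict_price(square_feet):
    """Predict price in $1000s, rejecting X values outside the sample range."""
    if not X_MIN <= square_feet <= X_MAX:
        raise ValueError(
            f"{square_feet} sq ft is outside the relevant range "
            f"[{X_MIN}, {X_MAX}]; the fitted line may not apply there"
        )
    return B0 + B1 * square_feet


print(f"{predict_price(2000):.2f}")  # inside the range, so a value is returned

try:
    predict_price(5000)  # far outside the sampled range
except ValueError as err:
    print(f"refused: {err}")
```

Raising an error is one design choice; returning the prediction with a warning would be a softer alternative.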
Aravali College of Engineering And Management, Jasana, Tigoan Road, Neharpar, Faridabad, Delhi NCR
Toll Free Number: 91-8527538785
Website: www.acem.edu.in