LINEAR REGRESSION: Evaluating Regression Models
Overview
• Assumptions for Linear Regression
• Evaluating a Regression Model
Assumptions for Bivariate Linear Regression
• Quantitative data (or dichotomous)
• Independent observations
• Predict for same population that was sampled
Assumptions for Bivariate Linear Regression
• Linear relationship
  – Examine scatterplot
• Homoscedasticity – equal spread of residuals at different values of predictor
  – Examine ZRESID vs ZPRED plot
Checking for Homoscedasticity
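The values for a ZRESID vs ZPRED plot can be sketched by hand. This is a minimal illustration on made-up data, assuming ZPRED is the z-score of the predicted values and ZRESID is each residual divided by the standard error of the estimate; the variable names are hypothetical.

```python
import math

# Hypothetical data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * xi for xi in x]
residuals = [yi - p for yi, p in zip(y, predicted)]

# Standardized predicted values (assumed definition: z-scores of predictions)
mean_pred = sum(predicted) / n
sd_pred = math.sqrt(sum((p - mean_pred) ** 2 for p in predicted) / (n - 1))
zpred = [(p - mean_pred) / sd_pred for p in predicted]

# Standardized residuals (assumed definition: residual / standard error of estimate)
se_est = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
zresid = [e / se_est for e in residuals]

# Each (zpred, zresid) pair is one point on the plot
points = list(zip(zpred, zresid))
```

If the vertical spread of the points is roughly the same across the range of zpred, the homoscedasticity assumption looks reasonable; a funnel shape suggests it is violated.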
Assumptions for Bivariate Linear Regression
• Independent errors
  – Durbin-Watson statistic should be close to 2
• Normality of errors
  – Examine frequency distribution of residuals
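As a rough sketch of where the "close to 2" rule comes from, the Durbin-Watson statistic can be computed from the residuals in their original order. The residual lists below are made up for illustration, and the function name is hypothetical.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals, listed in the order the cases were observed
drifting = [1.0, 1.0, -1.0, -1.0]      # neighbors tend to be alike
unrelated = [1.0, -1.0, -1.0, 1.0]     # no consistent pattern
alternating = [1.0, -1.0, 1.0, -1.0]   # neighbors tend to be opposite

dw_drifting = durbin_watson(drifting)        # 1.0
dw_unrelated = durbin_watson(unrelated)      # 2.0
dw_alternating = durbin_watson(alternating)  # 3.0
```

Values well below 2 point toward positively correlated errors, and values well above 2 point toward negatively correlated errors.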
Checking for Normality of Errors
Influential Cases
• Influential cases have greater impact on the slope and y-intercept
• Select casewise diagnostics and look for cases with large residuals
Standard Error of the Estimate
• Index of how far off predictions are expected to be
• Larger r means smaller standard error
• Standard deviation of y scores around predicted y scores
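A minimal sketch of the calculation, on made-up data and assuming the usual bivariate formula (residual sum of squares divided by n - 2, then square-rooted). The second calculation shows why a larger r gives a smaller standard error.

```python
import math

# Hypothetical data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]

ss_residual = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
ss_total = sum((yi - mean_y) ** 2 for yi in y)

# Standard error of the estimate: spread of y scores around predicted y scores
se_estimate = math.sqrt(ss_residual / (n - 2))

# Same quantity written in terms of r squared: as r squared grows, it shrinks
r_squared = 1 - ss_residual / ss_total
se_from_r = math.sqrt(ss_total * (1 - r_squared) / (n - 2))
```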
Sums of Squares
• Total SS – total squared differences of Y scores from the mean of Y
• Model SS – total squared differences of predicted Y scores from the mean of Y
• Residual SS – total squared differences of Y scores from predicted Y scores
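The three sums of squares can be sketched directly from their definitions. The data are made up for illustration; for a least-squares line, the Model SS and Residual SS add up to the Total SS.

```python
# Hypothetical data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]

# Total SS: Y scores vs. the mean of Y
ss_total = sum((yi - mean_y) ** 2 for yi in y)
# Model SS: predicted Y scores vs. the mean of Y
ss_model = sum((p - mean_y) ** 2 for p in predicted)
# Residual SS: Y scores vs. predicted Y scores
ss_residual = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
```

With these numbers the Total SS is 6, the Model SS is 3.6, and the Residual SS is 2.4 (up to floating-point rounding).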
Coefficient of Determination
• r² is the proportion of variance in Y explained by X
• Adjusted r² corrects for the fact that the sample r² often overestimates the true relationship in the population. The downward adjustment is larger when there are fewer subjects.
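A short sketch of the adjustment, assuming the commonly used formula 1 - (1 - r²)(n - 1)/(n - k - 1), where k is the number of predictors. The function name and the input values are illustrative only.

```python
def adjusted_r_squared(r_squared, n, k=1):
    """Adjusted r squared for n subjects and k predictors (assumed formula)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Same r squared, different sample sizes (illustrative values)
adj_26_subjects = adjusted_r_squared(0.07, n=26)  # about 0.031
adj_11_subjects = adjusted_r_squared(0.07, n=11)  # about -0.033
```

The same r² shrinks more with 11 subjects than with 26, and the adjusted value can even fall below zero in small samples.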
Goodness of Fit
• Dividing the Model SS by the Total SS produces r²
• The ANOVA F-test determines whether the regression equation accounted for a significant proportion of variance in Y
• F is the Model Mean Square divided by the Residual Mean Square
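The arithmetic behind these bullets can be sketched as follows. The sums of squares and sample size are made-up values, and the function name is hypothetical; each mean square is a sum of squares divided by its degrees of freedom.

```python
def regression_f(ss_model, ss_residual, n, k=1):
    """Return (r_squared, F) for n subjects and k predictors."""
    ss_total = ss_model + ss_residual
    r_squared = ss_model / ss_total

    df_model = k
    df_residual = n - k - 1
    ms_model = ss_model / df_model
    ms_residual = ss_residual / df_residual

    return r_squared, ms_model / ms_residual

# Hypothetical values: Model SS = 3.6, Residual SS = 2.4, 5 subjects
r_squared, f_value = regression_f(3.6, 2.4, n=5)
```

Here r² comes out at about 0.6 and F at about 4.5 on 1 and 3 degrees of freedom; the p value would then be looked up from the F distribution.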
Coefficients
• The Constant B under “unstandardized” is the y-intercept b0
• The B listed for the X variable is the slope b1
• The t statistic is the coefficient divided by its standard error
• In bivariate regression, the standardized slope (beta) is the same as the correlation r
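The entries in a coefficients table can be sketched by hand for the one-predictor case. The data are made up, the variable names are hypothetical, and the standard error formula shown is the usual one for a bivariate slope.

```python
import math

# Hypothetical data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b1 = sxy / sxx               # slope (B for the X variable)
b0 = mean_y - b1 * mean_x    # y-intercept (the Constant)

# Standard error of the slope, then t = coefficient / standard error
ss_residual = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
se_estimate = math.sqrt(ss_residual / (n - 2))
se_b1 = se_estimate / math.sqrt(sxx)
t_b1 = b1 / se_b1

# Standardized slope vs. the correlation: equal with a single predictor
beta = b1 * math.sqrt(sxx / syy)
r = sxy / math.sqrt(sxx * syy)
```

With one predictor, the squared t for the slope also equals the ANOVA F for the model.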
Example of Reporting a Regression Analysis
The linear regression for predicting quiz enjoyment from level of statistics anxiety did not account for a significant proportion of variance, F(1, 24) = 1.75, p = .20, r² = .07.
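As a quick consistency check on a report like this one, r² can be recovered from F and its degrees of freedom. This sketch relies on the identity r² = (F × df model) / (F × df model + df residual), which follows from the definitions of F and r²; the variable names are hypothetical.

```python
# Values taken from the reported result F(1, 24) = 1.75
f_value = 1.75
df_model = 1
df_residual = 24

# r squared implied by F and its degrees of freedom
r_squared = (f_value * df_model) / (f_value * df_model + df_residual)
r_squared_rounded = round(r_squared, 2)
```

The implied value rounds to .07, which agrees with the r² in the report.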
Take-Home Points
• The validity of a regression procedure depends on multiple assumptions.
• A regression model can be evaluated based on whether and how well it predicts an outcome variable.