Regr Hand
Transcript of Regr Hand
7/29/2019
SPSS Regression Output
Regression
The box below is the first thing you'll see in the standard SPSS regression output. Standard output
means that I did not click ANY boxes or options to get this printout. For bivariate regression this
first box is not of much interest, because it simply lists the single variable we are using as a
predictor. Later, for multiple regression, this box will show all the variables in our models, and it
can also show the sets of variables for several models at once.
Variables Entered/Removed(b)

Model 1
  Variables Entered:  SOCIO-ECONOMIC STATUS COMPOSITE(a)
  Variables Removed:  (none)
  Method:             Enter

a. All requested variables entered.
b. Dependent Variable: MATH STANDARDIZED SCORE
The model summary box comes next. In it you will find R, R2, and the standard error of estimate
(Sy.x), which is the square root of the mean squared error, or MSE. Here we interpret R as the
correlation of the Y scores with the predicted values Y-hat.

The adjusted R2 is used in multiple regression. It is adjusted to account for the use of more
predictors: simply adding more Xs can raise your R2, so this value is adjusted downwards a little to
penalize ourselves for just hunting around for significant predictors.
Model Summary

Model 1:  R = .531(a)   R Square = .282   Adjusted R Square = .279   Std. Error of the Estimate = 8.2148

a. Predictors: (Constant), SOCIO-ECONOMIC STATUS COMPOSITE
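The numbers in the model summary can be reproduced from the sums of squares in the ANOVA table further below. Here is a minimal Python sketch of that arithmetic, using the values SPSS printed (n = 250 cases and k = 1 predictor for our data):

```python
import math

# Quantities taken from the SPSS output (substitute your own values as needed)
ss_regression = 6574.387
ss_total = 23310.215
mse = 67.483            # mean square error from the ANOVA table
n, k = 250, 1           # sample size and number of predictors

r_squared = ss_regression / ss_total                          # R Square
r = math.sqrt(r_squared)                                      # R (positive root, since the slope is positive)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)   # Adjusted R Square
se_estimate = math.sqrt(mse)                                  # Std. Error of the Estimate

print(round(r, 3), round(r_squared, 3), round(adj_r_squared, 3), round(se_estimate, 4))
# → 0.531 0.282 0.279 8.2148
```

Note how the adjusted R2 (.279) comes out a little below R2 (.282): that is the penalty for the predictor we used.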
Note that in the footnote to the model summary SPSS tells us what predictors are relevant for the R
and R2, even though in this case we have only one predictor. If we were to run several bivariate
regressions or several multiple regressions, we would get a list of several models.

The word constant in parentheses refers to the intercept. This is printed because it is possible to
force SPSS not to estimate an intercept. That is done only in unusual situations; for most
regressions, and for all of the regressions we will run, we will allow SPSS to estimate the intercept
term.
The box below shows the ANOVA table for the regression. ANOVA stands for Analysis Of
Variance, specifically the analysis of variation in the Y scores. Here we see the two sums of
squares introduced in class: the regression and residual (or error) sums of squares. The variance of
the residuals (or errors) is the value of the mean square error, or MSE; here it is 67.483.

Recall that we compare the value of the MSE to the value of the variance of Y. The standard
output does not give us the variance of Y; you need to click the Statistics button (in the regression
menu) to get it, OR run descriptive statistics on Y.
Also in this table we find the F test. This tests the hypothesis that the predictor (here our only
predictor) shows no relationship to Y. We can write this hypothesis in several ways, as mentioned
in class.
The F test has two numbers for its degrees of freedom (recall that our t test has one df). These are
called the numerator and denominator degrees of freedom, or df1 and df2. Here the numerator df
(df1) tells us how many predictors we have (this time it is 1), and the denominator degrees of
freedom are n - 1 - df1, or n - 2 for bivariate regression.
The value of the test for our data is F(1, 248) = 97.42. The table shows us this is significant
(p < .001). As the F is large, we determine that our predictor of math outcome (here, SES) is related
to math score in our population.
ANOVA(b)

Model 1        Sum of Squares    df    Mean Square    F        Sig.
  Regression       6574.387        1      6574.387    97.423   .000(a)
  Residual        16735.828      248        67.483
  Total           23310.215      249

a. Predictors: (Constant), SOCIO-ECONOMIC STATUS COMPOSITE
b. Dependent Variable: MATH STANDARDIZED SCORE
Again the footnote tells us what predictor is being used and what outcome is being predicted.
Last, the table provides us with the data we need to compute R2. If we compute SS-regression
divided by SS-Total, we should get R2.

SS-regression / SS-Total = 6574.39 / 23310.21 = .282
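The same arithmetic, along with the degrees of freedom and the F statistic, can be checked with a few lines of Python (the sums of squares are the ones from the ANOVA table; n = 250):

```python
# Sums of squares and counts from the ANOVA table above
ss_regression, ss_residual = 6574.387, 16735.828
n, k = 250, 1

df1 = k                      # numerator df = number of predictors
df2 = n - k - 1              # denominator df = n - 2 for bivariate regression
ms_regression = ss_regression / df1
mse = ss_residual / df2      # mean square error
f_stat = ms_regression / mse

print(df1, df2, round(mse, 3), round(f_stat, 2))
# → 1 248 67.483 97.42
```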
The last table is full of information about the model. In it we find the slope (or slopes, in multiple
regression). Our values of b0 and b1 are listed as unstandardized values, and their standard errors
SE(b0) and SE(b1) are in the second column. The standardized coefficient for the predictor in a
bivariate regression is simply the correlation. Check back to the value of R in the first table, and see
that it is the same as Beta here: .531. In our notation from class this is b*1.
We can write the sample regression model from these slopes and also the sample standardized
regression model, if we like. Those models are
Sample regression model:                 Y_i = b0 + b1(ses_i) + e_i
                                         Y_i = 51.28 + 6.54(ses_i) + e_i

Sample standardized regression model:    Z(y_i) = b*1 Z(ses_i) + e_i*
                                         Z(y_i) = .531 Z(ses_i) + e_i*
Note, you could also write these models using Y-hat and omitting the error terms.
Recall again the interpretation of the slopes. The unstandardized slope of 6.54 tells us that a
student's math score increases by about 6.5 points for every additional point on the SES scale.
Higher SES scores are associated with higher math scores.

The standardized slope tells us that for each standard-deviation unit of increase in SES, we predict
slightly more than half of a standard deviation of increase in math score.
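As a small sketch of how the unstandardized model is used for prediction (b0 and b1 are the rounded values from the sample model above; predict_math is just an illustrative name):

```python
b0, b1 = 51.28, 6.54    # rounded intercept and slope from the coefficients table

def predict_math(ses):
    """Predicted math score for a given SES composite score (unstandardized model)."""
    return b0 + b1 * ses

# A one-point increase in SES predicts about 6.5 more math points:
print(round(predict_math(1.0) - predict_math(0.0), 2))   # → 6.54
# A student at SES = 0 is predicted to score the intercept value:
print(round(predict_math(0.0), 2))                       # → 51.28
```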
Coefficients(a)

Model 1                               Unstandardized B    Std. Error    Standardized Beta    t         Sig.
  (Constant)                               51.277             .520                           98.552    .000
  SOCIO-ECONOMIC STATUS COMPOSITE           6.537             .662            .531            9.870    .000

a. Dependent Variable: MATH STANDARDIZED SCORE
Last, the table gives us the t tests for the slope and intercept. In multiple regression we will get
individual tests for each predictor. The table does not tell us the df for the t. We need to know that
the df for each t is the same as the df for residuals in the F table above. Here the df is n - 2.

Each t test examines the hypothesis H0: beta = 0 for the predictor used.

As we learned in class, the F is the square of the t test when we have only one predictor. Here that
is 9.87 * 9.87 = 97.42. Also, our results must agree (if only one X is used). Here again we reject the
null model, and decide that SES is a good predictor of math score, and has a slope that is not zero
in the population.
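We can verify that relationship between t and F directly from the printed statistics; any tiny discrepancy is just rounding in the SPSS output:

```python
t = 9.870     # t for the SES slope, from the coefficients table
f = 97.423    # F from the ANOVA table

print(round(t ** 2, 2))          # → 97.42
assert abs(t ** 2 - f) < 0.01    # t squared equals F, up to rounding of the printed t
```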
Last we check our assumptions. In regression we make assumptions about the structural part of the
model, that is, about the predictor(s). We assume that all of the important predictors are in our
model, and no unimportant ones are included. This is usually NOT a good assumption for a
bivariate regression!!
We also make assumptions about our errors. Specifically, we assume that the residuals are
independent and normally distributed, and that they have equal variances for any X value.
Therefore we make a normal plot (to get this we need to click on the [Plots] button in SPSS).
These residuals don't look very normal; it is likely that other predictors could explain more
variation in the data.
[Histogram of the Regression Standardized Residuals. Dependent Variable: MATH STANDARDIZED SCORE. Std. Dev = 1.00, Mean = 0.00, N = 250.00]
To check the assumption about variances we make a scatterplot. We plot the residuals on the Y axis
and the predictor variable (or, equivalently, the predicted values) on the X axis. Using the [Plots]
button, we select zpred and put it on the X axis and use zresid on the Y axis. We hope to find equal
scatter in the points all along the horizontal axis. This plot looks pretty good!!
[Scatterplot of the Regression Standardized Residuals (Y axis) against the Regression Standardized Predicted Values (X axis). Dependent Variable: MATH STANDARDIZED SCORE]