
7/29/2019 Regr Hand

SPSS Regression Output

Regression

The box below is the first thing you'll see in the standard SPSS regression output. "Standard output" means that I did not click ANY boxes or options to get this printout. For bivariate regression this first box is not of much interest, because it simply lists the single variable we are using as a predictor. Later, for multiple regression, this box will show all the variables in our models, and it can also show the sets of variables for several models at once.

Variables Entered/Removed(b)

Model | Variables Entered                  | Variables Removed | Method
1     | SOCIO-ECONOMIC STATUS COMPOSITE(a) | .                 | Enter

a. All requested variables entered.
b. Dependent Variable: MATH STANDARDIZED SCORE

The model summary box comes next. In it you will find R, R², and the standard error of estimate (Sy.x), which is the square root of the mean squared error, or MSE. Here we interpret R as the correlation of the Y scores with the predicted values Ŷ.

The adjusted R² is used in multiple regression. It is adjusted to account for the use of more predictors: simply adding more Xs can raise your R², so this value is adjusted downwards a little to penalize ourselves for just hunting around for significant predictors.
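As a quick sketch (not part of the SPSS printout), the adjustment can be reproduced with the usual formula, adj R² = 1 - (1 - R²)(n - 1)/(n - k - 1), using k predictors and n cases:

```python
# Verify the adjusted R-squared reported in the Model Summary box.
# adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)

def adjusted_r_squared(r_squared, n, k):
    """Penalize R^2 for the number of predictors (k) used with n cases."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

adj = adjusted_r_squared(0.282, n=250, k=1)  # values from this output
print(round(adj, 3))  # .279, matching the Model Summary
```

With one predictor and 250 cases the penalty is small; with many predictors and few cases it grows quickly.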

Model Summary

Model | R      | R Square | Adjusted R Square | Std. Error of the Estimate
1     | .531(a) | .282    | .279              | 8.2148

a. Predictors: (Constant), SOCIO-ECONOMIC STATUS COMPOSITE

Note that in the footnote to the model summary SPSS tells us what predictors are relevant for the R and R², even though in this case we have only one predictor. If we were to run several bivariate regressions or several multiple regressions, we would get a list of several models.

The word "constant" in parentheses refers to the intercept. This is printed because it is possible to force SPSS not to estimate an intercept. That is done only in unusual situations; for most regressions, and all of the regressions we will run, we will allow SPSS to estimate the intercept term.


The box below shows the ANOVA table for the regression. ANOVA stands for Analysis Of Variance, specifically the analysis of variation in the Y scores. Here we see the two sums of squares introduced in class: the regression and residual (or error) sums of squares. The variance of the residuals (or errors) is the value of the mean square error, or MSE; here it is 67.483.

Recall that we compare the value of the MSE to the value of the variance of Y. The standard output does not give us the variance of Y; you need to click the [Statistics] button (in the regression menu) to get it OR run descriptive statistics on Y.

Also in this table we find the F test. This tests the hypothesis that the predictor (here our only predictor) shows no relationship to Y. We can write this hypothesis in several ways, as mentioned in class.

The F test has two numbers for its degrees of freedom (recall that our t test has one df). These are called the numerator and denominator degrees of freedom, or df1 and df2. Here the numerator df (df1) tells us how many predictors we have (this time it is 1), and the denominator degrees of freedom are n - 1 - df1, or n - 2 for bivariate regression.

The value of the test for our data is F(1, 248) = 97.42. The table shows us this is significant (p < .001). As the F is large, we determine that our predictor of math outcome (here, SES) is related to math score in our population.

ANOVA(b)

Model 1    | Sum of Squares | df  | Mean Square | F      | Sig.
Regression | 6574.387       | 1   | 6574.387    | 97.423 | .000(a)
Residual   | 16735.828      | 248 | 67.483      |        |
Total      | 23310.215      | 249 |             |        |

a. Predictors: (Constant), SOCIO-ECONOMIC STATUS COMPOSITE
b. Dependent Variable: MATH STANDARDIZED SCORE

    Again the footnote tells us what predictor is being used and what outcome is being predicted.

Last, the table provides us with the data we need to compute R². If we compute SS-regression divided by SS-Total, we should get R².

SS-regression / SS-Total = 6574.39 / 23310.21 = .282
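A small sketch, using only the numbers printed in the ANOVA table, ties these quantities together:

```python
# Recover R^2, the MSE, and the standard error of estimate (Sy.x)
# from the sums of squares in the ANOVA table.
ss_regression = 6574.387
ss_residual = 16735.828
ss_total = 23310.215   # = ss_regression + ss_residual
df_residual = 248      # n - 1 - df1, i.e. n - 2 here (n = 250)

r_squared = ss_regression / ss_total   # R Square in the Model Summary
mse = ss_residual / df_residual        # Mean Square for Residual
sy_x = mse ** 0.5                      # Std. Error of the Estimate

print(round(r_squared, 3))  # .282
print(round(mse, 3))        # 67.483
print(round(sy_x, 4))       # 8.2148
```

This makes explicit that the Model Summary and ANOVA boxes are two views of the same decomposition of variation in Y.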

The last table is full of information about the model. In it we find the slope (or slopes, in multiple regression). Our values of b0 and b1 are listed as unstandardized values, and their standard errors SE(b0) and SE(b1) are in the second column. The standardized coefficient for the predictor in a bivariate regression is simply the correlation. Check back to the value of R in the first table, and see that it is the same as Beta here: .531. In our notation from class this is b*1.


We can write the sample regression model from these slopes, and also the sample standardized regression model, if we like. Those models are

Sample regression model:
Yi = b0 + b1(sesi) + ei
Yi = 51.28 + 6.54(sesi) + ei

Sample standardized regression model:
Z(yi) = b*1 Z(sesi) + ei*
Z(yi) = .531 Z(sesi) + ei*

Note, you could also write these models using Ŷ and omitting the error terms.

Recall again the interpretation of the slopes. The unstandardized slope of 6.54 tells us that a student's math score increases by about 6.5 points for every additional point on the SES scale. Higher SES scores are associated with higher math scores.

The standardized slope tells us that for each standard-deviation increase in SES, we predict slightly more than half of a standard deviation of increase in math score.
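As a minimal sketch, the fitted sample regression model can be used to generate predicted scores (the function name here is made up for illustration):

```python
# Predicted math score from the fitted sample regression model:
# Y-hat = b0 + b1 * ses, with b0 = 51.28 and b1 = 6.54.
def predict_math(ses):
    """Predicted math score for a given SES composite value."""
    return 51.28 + 6.54 * ses

# Each additional point on the SES scale adds 6.54 predicted points.
print(predict_math(0.0))  # 51.28, the intercept
print(predict_math(1.0))  # 57.82, one SES point higher
```

Note that the intercept is the predicted math score when SES equals zero, which is only a meaningful quantity if zero is a plausible SES value.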

Coefficients(a)

Model 1                         | B      | Std. Error | Beta | t      | Sig.
(Constant)                      | 51.277 | .520       |      | 98.552 | .000
SOCIO-ECONOMIC STATUS COMPOSITE | 6.537  | .662       | .531 | 9.870  | .000

(B and Std. Error are the Unstandardized Coefficients; Beta is the Standardized Coefficient.)

a. Dependent Variable: MATH STANDARDIZED SCORE

Last, the table gives us the t tests for the slope and intercept. In multiple regression we will get individual tests for each predictor. The table does not tell us the df for the t. We need to know that the df for each t is the same as the df for residuals in the F table above. Here the df is n - 2.

Each t test examines the hypothesis H0: β = 0 for the coefficient being tested.

As we learned in class, the F is the square of the t test when we have only one predictor. Here that is 9.87 × 9.87 = 97.42, so the two results must agree (when only one X is used). Here again we reject the null model, and decide that SES is a good predictor of math score, with a slope that is not zero in the population.
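The F = t² relationship is easy to check directly from the printed values:

```python
# With a single predictor, the overall F test and the slope's t test
# are the same test: F = t^2.
t_slope = 9.870          # t for the SES slope in the Coefficients table
f_model = t_slope ** 2   # should reproduce F in the ANOVA table

print(round(f_model, 1))  # 97.4, matching F(1, 248) = 97.42 up to rounding
```

The tiny discrepancy from 97.423 is just rounding: SPSS squares the unrounded t, not the 9.870 shown in the table.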

Last we check our assumptions. In regression we make assumptions about the structural part of the model, that is, about the predictor(s). We assume that all of the important predictors are in our model, and no unimportant ones are included. This is usually NOT a good assumption for a bivariate regression!!


We also make assumptions about our errors. Specifically, we assume that the residuals are independent and normally distributed, and that they have equal variances for any X value. Therefore we make a normal plot (to get this we need to click on the [Plots] button in SPSS). These residuals don't look very normal; it is likely that other predictors could explain more variation in the data.

[Histogram of the Regression Standardized Residuals, ranging from about -2.75 to 2.25. Dependent Variable: MATH STANDARDIZED SCORE. Std. Dev = 1.00, Mean = 0.00, N = 250.00]

To check the assumption about equal variances we make a scatterplot. We plot the residuals on the Y axis and the predictor variable (or, equivalently, the predicted values) on the X axis. Using the [Plots] button, we select zpred and put it on the X axis and use zresid on the Y axis. We hope to find equal scatter in the points all along the horizontal axis. This plot looks pretty good!!

[Scatterplot of the Regression Standardized Residual (Y axis) against the Regression Standardized Predicted Value (X axis), both ranging from about -3 to 3. Dependent Variable: MATH STANDARDIZED SCORE]
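The same residual checks can be sketched numerically. The data below are simulated for illustration (we do not have the real SES/math file here), and comparing the residual spread below versus above the mean of X is only a crude stand-in for eyeballing the scatterplot:

```python
import random
import statistics

# Simulate SES and math scores roughly matching the fitted model,
# then fit the bivariate least-squares line by hand.
random.seed(1)
n = 250
ses = [random.gauss(0, 1) for _ in range(n)]
math_score = [51.28 + 6.54 * x + random.gauss(0, 8.2) for x in ses]

mean_x = statistics.mean(ses)
mean_y = statistics.mean(math_score)
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(ses, math_score)) \
     / sum((x - mean_x) ** 2 for x in ses)
b0 = mean_y - b1 * mean_x

# Standardize the residuals, as in the SPSS plots.
residuals = [y - (b0 + b1 * x) for x, y in zip(ses, math_score)]
s = statistics.stdev(residuals)
z_resid = [e / s for e in residuals]

# Checks analogous to the plots: mean near 0, and similar spread
# for low vs. high X values (the equal-variance assumption).
low = [z for x, z in zip(ses, z_resid) if x < mean_x]
high = [z for x, z in zip(ses, z_resid) if x >= mean_x]
print(statistics.mean(z_resid))                    # essentially 0
print(statistics.stdev(low), statistics.stdev(high))
```

With real data you would replace the simulated lists with the observed SES and math scores; the least-squares residuals always average to zero when an intercept is fit, so the interesting check is the spread.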