Testing statistical significance of differences between coefficients

Jane E. Miller, PhD

The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.

Page 2: Testing statistical significance of differences between coefficients Jane E. Miller, PhD The Chicago Guide to Writing about Multivariate Analysis, 2nd.

Overview

• Review: Inferential statistical tests for coefficients
• Testing statistical significance of differences
  – Between coefficients in the same model
  – Between coefficients in independent models
• Standard error of the difference
• Presenting results of tests of differences between coefficients

Review: Statistical significance of βs

• In the standard output from a regression model, inferential statistics provide the information to test whether the coefficient on an independent variable is statistically significantly different from zero
• For continuous independent variables
  – Whether the marginal effect of a one-unit increase in that IV is different from zero
• For categorical independent variables
  – Whether difference between the mean of the DV for the specified group and the reference category is different from zero

Estimated coefficients from an OLS model of birth weight in grams

Variable                                  Coefficient    Standard error
Intercept                                 3,317.8**      25.1
Mother’s age at child’s birth (years)     10.7**         1.2
Mother’s education
  < High school (<HS)                     –55.5**        19.3
  = High school (=HS)                     –53.9**        14.8
  (> High school; >HS)

** denotes p < 0.01
Reference category in parentheses

Example: β on a continuous IV

• OLS model of birth weight in grams includes mother’s age in years as an independent variable

• βmother’s age = 10.7 with a standard error (s.e.) of 1.2, p < 0.001
  – Thus we reject the null hypothesis H0: βmother’s age = 0

• We conclude that the slope of the association between mother’s age and birth weight is statistically significantly different from zero at p < 0.001

Example: β on a categorical IV

• The birth weight model includes an ordinal measure of mother’s educational attainment
  • > HS is the reference category
• β<HS = –55.5 with a standard error (s.e.) of 19.3, p < 0.01
  – Thus we reject the null hypothesis H0: β<HS = 0
    H0: mean birth weight for < HS = mean birth weight for > HS
• Mean birth weight for infants born to mothers with < high school education is statistically significantly different from that of infants born to mothers with > HS
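The two tests against zero reviewed above can be sketched in a few lines of Python. This is only an illustration: the function name `z_test` is hypothetical, and the p-value uses the standard normal distribution as a large-sample stand-in for the t distribution, which is an assumption of the sketch rather than something stated on the slides.

```python
import math


def z_test(coef, se):
    """Return the test statistic and two-sided p-value for H0: beta = 0.

    Assumption: the standard normal is used as a large-sample
    approximation to the t distribution.
    """
    stat = coef / se
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value


# Coefficients and standard errors from the birth weight table
age_stat, age_p = z_test(10.7, 1.2)        # mother's age (continuous IV)
lt_hs_stat, lt_hs_p = z_test(-55.5, 19.3)  # < HS vs. > HS (categorical IV)

print(f"mother's age: statistic = {age_stat:.2f}, p < 0.001: {age_p < 0.001}")
print(f"< HS:         statistic = {lt_hs_stat:.2f}, p < 0.01:  {lt_hs_p < 0.01}")
```

Under this approximation both coefficients differ significantly from zero, the one on mother's age at p < 0.001 and the one on < HS at p < 0.01, in line with the ** flags in the table.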

Testing other hypotheses

• For some research questions, you might need to test a hypothesis other than βi = 0. E.g., whether
  – Two βs in a given model are statistically significantly different from one another
    • E.g., H0: β<HS = β=HS
  – The size and statistical significance of a β change across models when additional covariates such as confounders or mediators are included in the model
    • E.g., H0: βnon-Hispanic black (I) = βnon-Hispanic black (II)
  – The effect of a covariate differs across models estimated for independent subgroups (stratified models)
    • E.g., H0: β<HS is the same for males as for females

Testing statistical significance of differences between coefficients

• To formally test statistical significance of differences between coefficients, e.g., H0: βj = βk

– Divide the difference between the estimated coefficients (βj − βk) by the standard error of the difference to obtain the test statistic

– Compare the calculated test statistic against the pertinent critical value with one degree of freedom

Standard error of the difference

• The standard error of the difference is calculated:
  √[var(βj) – (2 × cov(βj, βk)) + var(βk)]
  – var(βj) and var(βk) are the variances of βj and βk
  – cov(βj, βk) is the covariance between βj and βk
• When βj and βk are from different models
  – Considered statistically independent of one another
    • cov(βj, βk) = 0
• When βj and βk are from within one regression model
  – Not independent of one another
    • cov(βj, βk) ≠ 0
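As a minimal sketch, the standard error of a difference βj – βk can be wrapped in one small function, using var(βj – βk) = var(βj) + var(βk) – 2 × cov(βj, βk). The function name `se_of_difference` and the input values below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math


def se_of_difference(var_j, var_k, cov_jk=0.0):
    """Standard error of (beta_j - beta_k).

    cov_jk defaults to 0, the case of coefficients treated as
    statistically independent (e.g., taken from different models).
    """
    return math.sqrt(var_j + var_k - 2.0 * cov_jk)


# Hypothetical values, for illustration only
same_model = se_of_difference(4.0, 9.0, cov_jk=2.0)  # covariance included
different_models = se_of_difference(4.0, 9.0)        # covariance set to zero

print(f"same model:       {same_model:.3f}")
print(f"different models: {different_models:.3f}")
```

With a positive covariance, the standard error of the difference is smaller than it would be if the two coefficients were treated as independent.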

Testing differences of βs from one model

• When βj and βk are from the same model, must include the covariance in the calculation of the standard error of the difference
  √[var(βj) – (2 × cov(βj, βk)) + var(βk)]
• The complete variance-covariance matrix for a regression can be requested as part of the output
• The variance of each coefficient can be calculated from its standard error (s.e.)
  var(βj) = [s.e.(βj)]²

Example: Testing whether β<HS = β=HS

• From the table, β<HS = –55.5 and β=HS = –53.9
• The difference between β<HS and β=HS is calculated
  β<HS – β=HS = –55.5 – (–53.9) = –1.6
• For that model,
  • var(β<HS) = 370.9
  • var(β=HS) = 218.8
  • cov(β<HS, β=HS) = 137.8
• Plugging those values into the formula for the standard error of the difference yields
  √[370.9 – (2 × 137.8) + 218.8] = 17.72
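The same standard error can be read off the variance-covariance matrix directly, as the quadratic form c′Vc with contrast vector c = (1, –1). The sketch below uses the variances and covariance reported on this slide; the helper `quadratic_form` is a hypothetical name, and the final comparison with the squared standard errors is added here only as a plausibility check.

```python
import math

# Variance-covariance entries for (beta_<HS, beta_=HS), as reported on the slide
vcov = [[370.9, 137.8],
        [137.8, 218.8]]
contrast = [1.0, -1.0]  # picks out beta_<HS - beta_=HS


def quadratic_form(c, v):
    """Compute c' V c for a contrast vector c and covariance matrix V."""
    n = len(c)
    return sum(c[i] * v[i][j] * c[j] for i in range(n) for j in range(n))


var_diff = quadratic_form(contrast, vcov)
se_diff = math.sqrt(var_diff)
print(f"var(difference) = {var_diff:.1f}, s.e.(difference) = {se_diff:.2f}")

# Plausibility check: squared standard errors from the table
print(f"19.3^2 = {19.3 ** 2:.1f}, 14.8^2 = {14.8 ** 2:.1f}")
```

The squared standard errors (about 372.5 and 219.0) are close to, but not identical with, the reported variances of 370.9 and 218.8, which is consistent with the tabled standard errors having been rounded to one decimal place.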

Example, cont.: Testing β<HS = β=HS

• To calculate the test statistic, divide the difference between β<HS and β=HS by the standard error of the difference:
  (β<HS – β=HS)/s.e.(β<HS – β=HS) = –1.6/17.7 = –0.09
• In absolute value, 0.09 < 1.96 (the critical value for a two-tailed t-test with ∞ degrees of freedom at p < 0.05)

• Cannot reject the null hypothesis that β<HS = β=HS
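The decision rule on this slide amounts to one division and one comparison. A short sketch, taking the standard error of the difference (17.72) as given from the previous slide; the variable names are illustrative:

```python
# Coefficients from the birth weight table
beta_lt_hs = -55.5
beta_hs = -53.9
se_diff = 17.72          # standard error of the difference, from the previous slide
CRITICAL_05 = 1.96       # two-tailed critical value at p < 0.05, infinite df

diff = beta_lt_hs - beta_hs
t_stat = diff / se_diff
reject = abs(t_stat) > CRITICAL_05

print(f"difference = {diff:.1f}, t = {t_stat:.2f}, reject H0: {reject}")
```

Because the rule compares the absolute value of the statistic with the critical value, the sign of the difference does not affect the conclusion.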

TEST statement

• Many software packages can do these calculations for you
• To test other contrasts among categories, request the test statistic for equality of coefficients for pairs of coefficients: H0: βj = βk
  – E.g., to test whether predicted birth weight is statistically significantly different for infants born to mothers with < HS than for those with = HS (neither is the reference category)
    • Specify “TEST ‘<HS’ = ‘=HS’” in your SAS syntax
    • Output for H0: β<HS = β=HS reports an F-statistic of 0.01 with a p-value of 0.93
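For a single contrast, the F-statistic is the square of the t-statistic computed by hand, so the software output can be checked against the manual calculation. The sketch below is not SAS output; it recomputes the statistic from the slide values and, as an assumption, takes the p-value from the standard normal distribution (a large-sample approximation).

```python
import math

diff = -55.5 - (-53.9)                  # beta_<HS - beta_=HS
var_diff = 370.9 + 218.8 - 2 * 137.8    # variance of the difference
f_stat = diff ** 2 / var_diff           # F with 1 numerator df equals t squared

# Two-sided p-value from the standard normal (large-sample approximation)
p_value = math.erfc(math.sqrt(f_stat) / math.sqrt(2))

print(f"F = {f_stat:.2f}, p = {p_value:.2f}")
```

To two decimal places this agrees with the F-statistic of 0.01 and p-value of 0.93 reported for the TEST statement.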

Testing differences of βs from independent models

• When βj and βk are from different models they can be assumed to be independent of one another
  cov(βj, βk) = 0
• Thus the formula for the standard error of the difference
  √[var(βj) – (2 × cov(βj, βk)) + var(βk)]
  simplifies to √[var(βj) + var(βk)]
• Reminder: var(βj) and var(βk) can be calculated from the standard errors reported in the regression output
  var(βj) = [s.e.(βj)]²

Example: Change in βs across nested models

• In nested models I and II, the βs on non-Hispanic black (NHB) are
  βNHB(I) = –244.5, s.e. = 16.7
  βNHB(II) = –147.2, s.e. = 17.6
• The change in β between models I and II: –244.5 – (–147.2) = –97.3
• Plugging the standard errors for βNHB(I) and βNHB(II) into the formula for the standard error of the difference yields
  s.e.(difference) = √[(16.7)² + (17.6)²] = 24.3

Change in βs across nested models, cont.

• The t statistic for the difference in β is calculated: (difference in β)/s.e.(difference in β)
• Plugging in the values from the previous slide (with the unrounded s.e. of 24.26): –97.3 ÷ 24.26 = –4.01
• In absolute value, 4.01 exceeds the two-tailed critical value of 2.58 for p < 0.01, so we conclude that the change in βNHB between models I and II is statistically significant at p < 0.01
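The nested-models calculation can be reproduced with the following sketch. It follows the slides in setting the covariance between the two estimates to zero; the variable names and the constant 2.576 (the two-tailed normal critical value for p < 0.01) are supplied here for illustration.

```python
import math

b1, se1 = -244.5, 16.7   # NHB coefficient and s.e., model I
b2, se2 = -147.2, 17.6   # NHB coefficient and s.e., model II
CRITICAL_01 = 2.576      # two-tailed critical value at p < 0.01, infinite df

diff = b1 - b2
# Covariance term omitted: the two estimates are treated as independent
se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
t_stat = diff / se_diff
significant = abs(t_stat) > CRITICAL_01

print(f"difference = {diff:.1f}, s.e. = {se_diff:.2f}, t = {t_stat:.2f}")
print(f"significant at p < 0.01: {significant}")
```

Carrying the unrounded standard error (about 24.26) through the division is what yields |t| = 4.01; dividing by the rounded 24.3 gives 4.00.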

Tables to present multivariate results

• In the table of multivariate statistics, for each independent variable in the model, present
  – The estimated coefficient (β)
  – The standard error

• See chapters 5 and 11 of Writing about Multivariate Analysis, 2nd Edition for guidelines and examples of multivariate tables

Prose to present results of differences between coefficients

• Introduce the substantive reason behind the test for difference between βs, given your
  – Research question
  – Variables (categories, units)
• Report and interpret the results of the formal statistical test of difference between coefficients
  – Test statistic
  – Accompanying degrees of freedom

• Explain the conclusions you draw from that test about specification of your model

Poor presentation: Results of test of differences between βs

• “From table 15.3, Model III we have β<HS = –55.5 and β=HS = –53.9, so the difference between β<HS and β=HS is β<HS – β=HS = –55.5 – (–53.9) = –1.6. For that model, var(β<HS) = 370.9, var(β=HS) = 218.8, and cov(β<HS, β=HS) = 137.8. Plugging those values into the formula for the standard error of the difference yields √[370.9 – (2 × 137.8) + 218.8] = 17.7. To calculate the test statistic, divide the difference between β<HS and β=HS by the standard error of the difference: (β<HS – β=HS)/s.e.(β<HS – β=HS) = –1.6/17.7 = –0.09, which in absolute value is less than the critical value of 1.96 for a t-test with ∞ degrees of freedom at p < 0.05. Thus we cannot reject the null hypothesis that β<HS = β=HS.”
  – Except for an assignment in a course where you must demonstrate that you know this logic, spare your readers the statistics lesson!

Better presentation: Results of test of differences between βs

• “The 1.6 unit (gram) birth weight difference between the estimated coefficients for ‘less than high school’ and ‘high school graduate’ in Model III is not statistically significant (F-statistic for the test of difference = 0.01; p = 0.93).”
  – Mentions the
    • Dependent variable
    • Independent variable (educational attainment)
    • Units or categories
    • Purpose of the test for a difference between the education coefficients
    • Magnitude
    • Statistical significance
    • Direction (not mentioned because trivially small)

Example presentation: Change in β across nested models

• “As shown in table 15.3, the coefficient on non-Hispanic black decreases in absolute value by 97 grams, from –244.5 in model I to –147.2 in model II (t = 4.01; p < 0.01). Thus, the addition of controls for socioeconomic characteristics is associated with a large, statistically significant decrease in the birth weight deficit for non-Hispanic black compared to non-Hispanic white infants.”
  – Mentions the
    • Dependent variable
    • Independent variables and their units or categories
    • Purpose of the test for a change in βNHB across nested models
    • Direction
    • Magnitude
    • Statistical significance

Summary

• To test hypotheses other than H0: βi = 0, calculate a test statistic from the difference in coefficients and the standard error of the difference
  • Compare that test statistic against the critical value
• βs from different models are considered statistically independent of one another, so the covariance is not needed to compute standard error of the difference
  • E.g., nested models, stratified models
• βs from the same model are not statistically independent of one another, so the covariance is needed to compute standard error of the difference

Summary, cont.

• If coefficients are not statistically significantly different from one another, the model specification often can be simplified by combining terms

• Then test effect of simplified specification on overall fit using model GOF statistics

• Present results of difference between coefficients
  – Use a combination of tables and prose
  – Describe conclusions, not process
  – Relate to topic at hand

Suggested resources

• Miller, J. E. 2013. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. University of Chicago Press. Chapters 11 and 15.

• Freedman, David, Robert Pisani, and Roger Purves. 2007. Statistics, 4th Edition. New York: W. W. Norton.

• Gujarati, Damodar N. 2002. Basic Econometrics, 4th Edition. New York: McGraw-Hill/Irwin.

Suggested online resources

• Podcasts on
  – Interpreting coefficients from OLS and logit models
  – Comparing overall goodness of fit across models
  – Testing whether a multivariate specification can be simplified

Suggested practice exercises

• Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
  – Questions #2, 3, and 5 in the problem set for chapter 11
  – Suggested course extensions for chapter 11
    • “Reviewing” exercise #2
    • “Applying statistics and writing” exercise #3

Contact information

Jane E. Miller, [email protected]

Online materials available at http://press.uchicago.edu/books/miller/multivariate/index.html
