Introduction to Statistics: Political Science (Class 3) Calculating R-Squared, Dichotomous and Nominal Variables, F-tests

Page 1:

Introduction to Statistics: Political Science (Class 3)

Calculating R-Squared, Dichotomous and Nominal Variables, F-tests

Page 2:

R-Squared

Page 3:

R-Squared Example

• Measure of proportion of variance in Y explained by the IVs

Full Sample

                       Coef.    St.Err   T       P
Bush FT                -.165    .019     -8.72   0.000
Party Identification   7.354    .278     26.44   0.000
Constant               65.28    .962     67.89   0.000

10 Random Cases

                       Coef.    St.Err   T       P
Bush FT                -.090    .489     -0.18   0.860
Party Identification   12.31    7.47     1.65    0.143
Constant               50.16    18.23    2.75    0.028

R2 = .5336

Page 4:

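Obama FT = 50.16 + (-.090)(Bush FT) + 12.31(Party Identification)

• First, we need the total variation in Y: the sum of squared deviations from the mean (the numerator of the variance)

• Mean = 66, so:

Observed   (Observed - Mean)   (Observed - Mean)^2
50         -16                 256
30         -36                 1296
100        34                  1156
50         -16                 256
70         4                   16
30         -36                 1296
85         19                  361
100        34                  1156
85         19                  361
60         -6                  36

Sum of squared deviations = 6190 (dividing by n - 1 = 9 would give the sample variance, 687.8)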

Page 5:

Bush FT   PID     Predicted   Observed   (Observed - Predicted)   (Observed - Predicted)^2
25.00     0.00    47.92       50.00      2.08                     4.32
0.00      2.00    74.78       30.00      -44.78                   2005.64
0.00      3.00    87.10       100.00     12.90                    166.53
40.00     0.00    46.58       50.00      3.42                     11.71
30.00     2.00    72.10       70.00      -2.10                    4.39
60.00     -1.00   32.48       30.00      -2.48                    6.13
0.00      3.00    87.10       85.00      -2.10                    4.39
0.00      3.00    87.10       100.00     12.90                    166.53
1.00      1.00    62.38       85.00      22.62                    511.49
0.00      1.00    62.47       60.00      -2.47                    6.12

SSR (Sum of Squared Residuals) = 2887.25

Total variation in Y (sum of squared deviations from the mean) = 6190

R2 = (6190 - 2887.25) / 6190 = .5336
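The hand calculation above can be checked with a short Python sketch. It uses the observed and predicted values from the 10-case table. The function name `r_squared` is only illustrative, and because the predicted values are rounded to two decimals, the SSR may differ slightly from 2887.25.

```python
# Observed Obama FT scores and the model's predicted values (10 random cases).
observed = [50, 30, 100, 50, 70, 30, 85, 100, 85, 60]
predicted = [47.92, 74.78, 87.10, 46.58, 72.10, 32.48, 87.10, 87.10, 62.38, 62.47]


def r_squared(observed, predicted):
    """Return (total sum of squares, SSR, R-squared) for the given values."""
    mean_y = sum(observed) / len(observed)
    # Total variation in Y: squared deviations from the mean.
    sst = sum((y - mean_y) ** 2 for y in observed)
    # Unexplained variation: squared residuals (observed - predicted).
    ssr = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    return sst, ssr, (sst - ssr) / sst


sst, ssr, r2 = r_squared(observed, predicted)
print(f"SST = {sst:.0f}, SSR = {ssr:.2f}, R2 = {r2:.4f}")
```

R2 is the share of the total variation in Y that the residuals do not account for, so it can also be written as 1 - SSR/SST.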

Page 6:

What is a “good” R2?

• Predict feelings about Obama with:
  – Party ID and feelings about Bush
  – Education
  – Zodiac sign

Page 7:

Non-continuous IVs

Dealing with Dichotomous and Nominal Variables

Page 8:

Democratic Peace

• Is sum of democracy scores the right measure?

• Alternative: Are both countries in the pair democracies?

• Indicator/dummy/dichotomous variable:
  – 1 if both countries have democracy scores >5
  – 0 otherwise
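As a minimal sketch of this coding rule, the Python below builds the indicator for a few made-up country pairs. The function name `democratic_pair` and the example scores are hypothetical; only the ">5 for both countries" rule comes from the slide.

```python
def democratic_pair(score_a, score_b, threshold=5):
    """Return 1 if both countries' democracy scores exceed the threshold, else 0."""
    return int(score_a > threshold and score_b > threshold)


# Hypothetical pairs: (democracy score of country A, democracy score of country B).
pairs = [(9, 8), (10, 2), (5, 7), (6, 6)]

sums = [a + b for a, b in pairs]
indicators = [democratic_pair(a, b) for a, b in pairs]

print(sums)        # [17, 12, 12, 12]
print(indicators)  # [1, 0, 0, 1]
```

The last three pairs have the same sum but are not all coded the same, which is the point of asking whether the sum of democracy scores is the right measure.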

Page 9:

Dichotomous IV

                           Coef    SE Coef   T        P
Democratic Pair (1=yes)    5.18    0.362     14.31    0.000
Constant                   24.35   0.171     142.45   0.000

R-squared = 0.0057

                           Coef    SE Coef   T        P
Democratic Pair (1=yes)    4.74    0.369     12.84    0.000
Military Spending ($mil)   0.053   0.002     25.59    0.000
Constant                   22.21   0.204     108.98   0.000

R-squared = 0.0242

DV: Years at peace
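One way to read a dichotomous IV is to plug 0 and 1 into the fitted equations. The sketch below does that with the coefficients from the two tables. The function names and the spending value of 100 are illustrative assumptions, not part of the original analysis.

```python
def predict_simple(democratic_pair):
    """Predicted years at peace from the model with the indicator only."""
    return 24.35 + 5.18 * democratic_pair


def predict_with_spending(democratic_pair, spending_mil):
    """Predicted years at peace, also controlling for military spending ($mil)."""
    return 22.21 + 4.74 * democratic_pair + 0.053 * spending_mil


# With only the indicator, the constant is the prediction for non-democratic
# pairs, and the coefficient is the gap between the two groups.
print(round(predict_simple(0), 2))  # 24.35
print(round(predict_simple(1), 2))  # 29.53

# Holding spending fixed at a hypothetical 100, the gap is the indicator's coefficient.
gap = predict_with_spending(1, 100.0) - predict_with_spending(0, 100.0)
print(round(gap, 2))  # 4.74
```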

Page 10:

Nominal variables

• Speed dating survey: You have 100 points to distribute among the following attributes -- give more points to those attributes that are more important in a potential date, and fewer points to those attributes that are less important in a potential date.

• Attractive
• Fun
• Intelligent
• Sincere
• Ambitious
• Shared Interests

Page 11:

How do people’s perspective/goals affect what’s important to them?

• What is your primary goal in participating in this event?
  – Seemed like a fun night out=1
  – To meet new people=2
  – To get a date=3
  – Looking for a serious relationship=4
  – To say I did it=5

• Does this make sense as a linear scale?

Page 12:

Who is likely to say each of the following is important?

• Attractiveness? Fun?
  – Seemed like a fun night out=1
  – To meet new people=2
  – To get a date=3
  – Looking for a serious relationship=4
  – To say I did it=5

• Does this make sense as a linear scale?

Page 13:

Effects of Nominal Variable

One Variable:
  Seemed like a fun night out=1
  To meet new people=2
  To get a date=3
  Looking for a serious relationship=4
  To say I did it=5

Five Variables:
  Seemed like a fun night out (1=yes)
  To meet new people (1=yes)
  To get a date (1=yes)
  Looking for a serious relationship (1=yes)
  To say I did it (1=yes)
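The move from one coded variable to five indicators can be sketched in a few lines of Python. The dictionary `GOALS` copies the survey codes above, and `to_indicators` is a hypothetical helper name used only for this illustration.

```python
# Survey codes for "primary goal in participating", as listed on the slide.
GOALS = {
    1: "Seemed like a fun night out",
    2: "To meet new people",
    3: "To get a date",
    4: "Looking for a serious relationship",
    5: "To say I did it",
}


def to_indicators(goal_code):
    """Turn one nominal code (1-5) into five 0/1 indicator variables."""
    if goal_code not in GOALS:
        raise ValueError(f"unknown goal code: {goal_code}")
    return {label: int(code == goal_code) for code, label in GOALS.items()}


row = to_indicators(3)
for label, value in row.items():
    print(f"{label} (1=yes): {value}")
```

Each respondent gets a 1 on exactly one indicator, so the numeric codes 1 through 5 no longer imply any ordering or spacing between the goals.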

Page 14:

Importance of Attribute = β0 + β1(Seemed Fun) + β2(Meet People) + β3(Date) + β4(Serious Relationship) + β5(Say Did) + u

What would β0 correspond to in this model?

Page 15:

“Reference Group”

• Leave one indicator out

Importance of Attribute = β0 + β1(Seemed Fun) + β2(Meet People) + β3(Date) + β4(Serious Relationship) + u

(β5(Say Did) is left out)
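The reason for leaving one indicator out can be illustrated with a small sketch. It assumes "Say Did" is the omitted category, and the names `GOAL_LABELS` and `encode` are hypothetical. With all five indicators, every row sums to 1, which duplicates the constant term. After one is dropped, the omitted group is the row of all zeros, so β0 is its predicted value.

```python
GOAL_LABELS = ["Seemed Fun", "Meet People", "Date", "Serious Relationship", "Say Did"]


def encode(goal, reference="Say Did"):
    """Return 0/1 indicators for every goal except the reference category."""
    if goal not in GOAL_LABELS:
        raise ValueError(f"unknown goal: {goal}")
    return {label: int(label == goal) for label in GOAL_LABELS if label != reference}


# With all five indicators, each respondent's indicators sum to 1,
# which is exactly the value of the constant term for every case.
all_five = [{label: int(label == goal) for label in GOAL_LABELS} for goal in GOAL_LABELS]
print([sum(row.values()) for row in all_five])  # [1, 1, 1, 1, 1]

# With the reference group left out, its members score 0 on every indicator.
print(encode("Say Did"))
print(encode("Date"))
```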

Page 16:

(Remember: reference group is “to say I did it”)

Attractiveness         Coef.    SE Coef.   T       p
Seemed Fun             -4.011   0.883      -4.54   0.000
Meet People            -3.843   0.891      -4.31   0.000
Date                   -3.186   1.033      -3.09   0.002
Serious Relationship   -6.320   1.084      -5.83   0.000
Constant               22.566   0.846      26.68   0.000

What if we want to know whether people who want a date and those who want a serious relationship differ in how important they think attractiveness is?

Page 17:

Easiest way: change reference category

Importance of Attribute = β0 + β1(Seemed Fun) + β2(Meet People) + β3(Date) + β5(Say Did) + u

(β4(Serious Relationship) is now left out as the reference category)

Attractiveness   Coef.    SE Coef.   T       p
Seemed Fun       2.309    0.723      3.19    0.001
Meet People      2.477    0.733      3.38    0.001
Date             3.134    0.900      3.48    0.001
Say I Did        6.320    1.084      5.83    0.000
Constant         16.246   0.678      23.95   0.000

Do people who want a date and those who want a serious relationship differ in how important they think attractiveness is?
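The two attractiveness tables describe the same fitted model, which a short sketch can confirm. It starts from the coefficients estimated with "To say I did it" as the reference group and shifts them so that "Serious Relationship" becomes the reference. The helper name `rebase` is hypothetical.

```python
# Coefficients with "Say I Did" as the reference group (its own coefficient is 0).
coefs_vs_say_did = {
    "Seemed Fun": -4.011,
    "Meet People": -3.843,
    "Date": -3.186,
    "Serious Relationship": -6.320,
    "Say I Did": 0.0,
}
constant_vs_say_did = 22.566


def rebase(coefs, constant, new_reference):
    """Re-express group coefficients relative to a different reference group."""
    shift = coefs[new_reference]
    new_coefs = {g: round(b - shift, 3) for g, b in coefs.items() if g != new_reference}
    return new_coefs, round(constant + shift, 3)


new_coefs, new_constant = rebase(coefs_vs_say_did, constant_vs_say_did, "Serious Relationship")
print(new_coefs)     # Date becomes 3.134, Say I Did becomes 6.320
print(new_constant)  # 16.246
```

Only the coefficients can be recovered this way. The standard error for the Date versus Serious Relationship difference (0.900) depends on the covariance between the estimates, which is why the model is re-estimated with the new reference category.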

Page 18:

Nominal and Dichotomous IVs

Attractiveness    Coef.    SE Coef.   T       p
Seemed Fun        1.852    0.696      2.66    0.008
Meet People       2.516    0.705      3.57    0.000
Date              2.998    0.865      3.46    0.001
Say I Did         6.303    1.042      6.05    0.000
Gender (1=male)   4.689    0.326      14.38   0.000
Constant          14.084   0.669      21.06   0.000

Estimated points allocated to attractiveness for men who attended because it seemed fun?
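A sketch of how this question can be answered from the table: add the constant, the coefficient for the respondent's goal, and the gender coefficient if the respondent is male. The function name `predicted_points` is illustrative. The reference group here is women looking for a serious relationship.

```python
# Coefficients from the table above (reference goal: Serious Relationship).
GOAL_COEFS = {
    "Seemed Fun": 1.852,
    "Meet People": 2.516,
    "Date": 2.998,
    "Say I Did": 6.303,
    "Serious Relationship": 0.0,  # reference category
}
MALE_COEF = 4.689
CONSTANT = 14.084


def predicted_points(goal, male):
    """Predicted points allocated to attractiveness for a given goal and gender."""
    return CONSTANT + GOAL_COEFS[goal] + MALE_COEF * int(male)


men_fun = predicted_points("Seemed Fun", male=True)
print(round(men_fun, 3))  # 20.625
```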

Page 19:

F-Tests

Testing the joint significance of variables

Page 20:

F-test

• Way of testing the joint significance of variables, i.e., whether a set of variables significantly improves explanatory power

• When to use:
  – Nominal variables
  – Variables likely to be highly correlated, but important predictors

Page 21:

Terminology

• Unrestricted model – includes the IVs whose joint significance you want to test

• Restricted model – same model, excluding IVs to be tested

• SSR – Sum of Squared Residuals

Page 22:

Formula

• q = # of variables being tested

• n = number of cases

• k = number of IVs in unrestricted model

F = [(SSRr - SSRur)/q] / [SSRur/(n-(k+1))]

Page 23:

Who values fun people?

Fun               Coef.    SE Coef.   T       p
Seemed Fun        0.537    0.349      1.54    0.124
Meet People       -0.058   0.354      -0.17   0.869
Date              -1.235   0.434      -2.84   0.004
Say I Did         -0.271   0.523      -0.52   0.605
Gender (1=male)   0.254    0.164      1.55    0.121
Constant          17.139   0.336      51.06   0.000

What if we want to know whether the reason-for-attending variables, as a group, improve the explanatory power of the model?

Page 24:

q = # of variables being tested

n = number of cases

k = number of IVs in unrestricted model

F = [(SSRr - SSRur)/q] / [SSRur/(n-(k+1))]

UNRESTRICTED

           Sum of Squares   df     MS
Model      672.078          5      134.4156
Residual   40819.896        2478   16.47292
Total      41491.974        2483   16.71042

RESTRICTED

           Sum of Squares   df     MS
Model      62.841           1      62.84063
Residual   41429.133        2482   16.69183
Total      41491.974        2483   16.71042

F = [(41429.133 - 40819.896)/4] / [40819.896/(2484-(5+1))] = 9.25
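The same arithmetic as a small Python sketch, using the residual sums of squares from the two ANOVA tables. The function name `f_statistic` and its argument names are illustrative.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, q, n, k):
    """F statistic for the joint significance of q variables.

    q: number of variables being tested
    n: number of cases
    k: number of IVs in the unrestricted model
    """
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / (n - (k + 1))
    return numerator / denominator


# Residual sums of squares from the restricted and unrestricted tables.
f_value = f_statistic(41429.133, 40819.896, q=4, n=2484, k=5)
print(round(f_value, 2))  # 9.25
```

The denominator is the unrestricted model's residual mean square (16.47292 in the table), so the statistic compares the average improvement per tested variable with the unexplained variation per remaining degree of freedom.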

Page 25:

Statistical significance of F-test

• What does an F value of 9.25 mean?

• Similar idea to a t-test, but shape of F-distribution depends (heavily) on degrees of freedom
  – Numerator = number of IVs being tested
  – Denominator = N-(number of IVs)-1
  – Here: 4 and 2478 (2484-5-1)

Page 27:

Look up critical value in a table or use Minitab

• Calc > Probability Distributions > F

Note: this gives the area under the curve up to your F statistic, P( X <= x ), so the p-value is 1 minus that probability

Cumulative Distribution Function

F distribution with 4 DF in numerator and 2478 DF in denominator

x      P( X <= x )
9.25   1.00000
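For readers without Minitab, the upper-tail area can be sketched in plain Python. A statistics package would normally handle this. The version below relies on a closed form for the F distribution's upper tail that applies only when the numerator degrees of freedom are even (here, 4). The function name `f_upper_tail` is hypothetical.

```python
def f_upper_tail(f_value, df_num, df_den):
    """P(F > f_value) for an F distribution with an even numerator df.

    Uses the finite-sum form of the regularized incomplete beta function,
    which is available when df_num / 2 is a whole number.
    """
    if df_num % 2 != 0:
        raise ValueError("this closed form requires an even numerator df")
    a = df_den / 2
    b = df_num // 2
    x = df_den / (df_den + df_num * f_value)
    term, total = 1.0, 1.0
    for j in range(1, b):
        term *= (a + j - 1) / j * (1 - x)
        total += term
    return x ** a * total


p_value = f_upper_tail(9.25, 4, 2478)
print(f"P(X <= 9.25) = {1 - p_value:.5f}")  # 1.00000, matching the Minitab output
print(p_value < 0.001)                      # True
```

Because the cumulative probability rounds to 1.00000, the p-value (1 minus that probability) is far below conventional significance thresholds.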

Page 28:

Notes and Next Time

• Graded homework will be handed back next time and model answers will be posted online early next week

• New homework will be handed out next time (and due next Thursday)

• Next time:
  – Functional form in multivariate regression