WF ED 540, CLASS MEETING 6, Contingency Tables, 2016

Post on 09-Feb-2017

Contingency Tables
DATA ANALYSIS
27 SEPTEMBER 2016

A refresher: Hypothesis testing
THEORY, PROPOSITIONS, LOGIC

Language of hypothesis testing…

Hypotheses are “tested”

Hypotheses are never “proved”

Hypotheses are only “rejected”

Theories are built and verified by testing hypotheses

Decision-by-truth table

                         Truth
Decision                 Ho true             Ho false
Fail to reject Ho        Correct decision    Type 2 error
Reject Ho                Type 1 error        Correct decision

TRADITIONALLY, probability of Type 1 error set at .05

Minimize Type 2 error by increasing sample size

Contingency tables
Also known as CROSSTABULATIONS

What is a contingency table?

A contingency table is a table of counts.

A two-dimensional contingency table is formed by classifying subjects by two variables.

One variable identifies the row categories; the other variable defines the column categories.

The combinations of row and column categories are called cells.
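As a rough illustration of "a table of counts", the sketch below classifies a handful of subjects by two variables and counts how many fall in each cell. It is written in Python, not the R used in class, and the six records are made up for the example (they are not NLSY79 data).

```python
from collections import Counter

# Hypothetical records of (gender, poverty status); NOT the NLSY79 data.
records = [
    ("Male", "Not in poverty"),
    ("Female", "Not in poverty"),
    ("Female", "In poverty"),
    ("Male", "Not in poverty"),
    ("Male", "In poverty"),
    ("Female", "Not in poverty"),
]

row_categories = ["Not in poverty", "In poverty"]  # row variable
column_categories = ["Male", "Female"]             # column variable

# Count the subjects in each (row, column) combination, i.e. each cell.
cell_counts = Counter((poverty, gender) for gender, poverty in records)

# Lay the counts out as a list of rows; a Counter returns 0 for empty cells.
table = [[cell_counts[(r, c)] for c in column_categories] for r in row_categories]

for label, row in zip(row_categories, table):
    print(label, row)
```

Each subject lands in exactly one cell, so the cell counts always add up to the number of subjects.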

Structure of rows-by-column contingency table…

             Column 1    Column 2    Total
Row 1        R1; C1      R1; C2      R1 tot
Row 2        R2; C1      R2; C2      R2 tot
Total        C1 tot      C2 tot      Total

Data from NLSY79

Example of contingency table…

                                 Male                  Female                    Total
2005 Household Not in Poverty    Males not in poverty  Females not in poverty    No pov total
2005 Household In Poverty        Males in poverty      Females in poverty        Pov total
Total                            Male total            Female total              Total

Marginals: the row totals (right-hand column) and the column totals (bottom row)

R analysis for cell counts

R script:

Console output:

Example of contingency table…

                                 Male          Female          Total
2005 Household Not in Poverty    3,086         3,039           No pov total
2005 Household In Poverty        443           623             Pov total
Total                            Male total    Female total    Total

R analysis for marginal counts

R script:

Console output:

Example of contingency table…

                                 Male     Female    Total
2005 Household Not in Poverty    3,086    3,039     6,125
2005 Household In Poverty        443      623       1,066
Total                            3,529    3,662     7,191
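The marginal counts can be checked by summing the cell counts across rows and down columns. This is a small Python sketch, not the R script from the slide (which is not reproduced in this transcript); the variable names are chosen for illustration.

```python
# Observed cell counts from the example table.
observed = [
    [3086, 3039],  # 2005 household not in poverty: male, female
    [443, 623],    # 2005 household in poverty: male, female
]

# Row marginals: sum across each row.
row_totals = [sum(row) for row in observed]

# Column marginals: sum down each column (zip(*...) transposes the table).
column_totals = [sum(column) for column in zip(*observed)]

# Grand total: every subject counted once.
grand_total = sum(row_totals)

print(row_totals)     # [6125, 1066]
print(column_totals)  # [3529, 3662]
print(grand_total)    # 7191
```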

Research question

Is gender independent of household poverty status?

If you know a person’s gender, can you predict poverty status?

If you know a person’s poverty status, can you predict gender?
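One informal way to look at these questions, before any formal test, is to compare the share in poverty within each gender. The Python sketch below does that with the counts from the example table; it is an illustrative aside, not a step shown on the slides, and the variable names are assumptions.

```python
# Counts from the example table.
males_in_poverty, male_total = 443, 3529
females_in_poverty, female_total = 623, 3662

# Share of each gender living in a household in poverty in 2005.
poverty_rate_male = males_in_poverty / male_total
poverty_rate_female = females_in_poverty / female_total

print(f"Male:   {poverty_rate_male:.1%}")
print(f"Female: {poverty_rate_female:.1%}")
```

If gender told us nothing about poverty status, the two shares would be about the same apart from sampling variation.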

Under the null hypothesis…

A cell value should be equal to (row total x column total) ÷ total

E.g., for males not in poverty, the expected value is (6125 x 3529) ÷ 7191 ≈ 3006, but the observed count is 3,086

An expected cell count is a hypothetical count that would occur if there is no relationship between the two variables
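The expected counts for all four cells follow from the same rule, (row total x column total) ÷ total. The Python sketch below applies it to the example table; it is an illustration under that rule, not the R code used in class.

```python
# Observed cell counts from the example table.
observed = [
    [3086, 3039],  # not in poverty: male, female
    [443, 623],    # in poverty: male, female
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell if rows and columns were unrelated.
expected = [
    [row_total * column_total / grand_total for column_total in column_totals]
    for row_total in row_totals
]

print(round(expected[0][0]))  # 3006, the value shown for males not in poverty
for row in expected:
    print([round(value, 1) for value in row])
```

The expected counts keep the same marginals as the observed table, so they also add up to 7,191.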

χ² test of independence

A χ² value is the sum, over all cells, of the squared deviation of observed minus expected, divided by the expected value
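The statistic described above (Pearson's chi-square, χ² = Σ (O − E)² ÷ E) can be computed by hand for the example table. The Python sketch below does so without any continuity correction. R's chisq.test() applies Yates' continuity correction to 2 x 2 tables by default, so the figure R reports may be slightly smaller than this one.

```python
# Observed cell counts from the example table.
observed = [
    [3086, 3039],  # not in poverty: male, female
    [443, 623],    # in poverty: male, female
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Sum (observed - expected)^2 / expected over every cell.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, observed_count in enumerate(row):
        expected_count = row_totals[i] * column_totals[j] / grand_total
        chi_square += (observed_count - expected_count) ** 2 / expected_count

print(round(chi_square, 1))  # about 28.3 (uncorrected)
```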

Hypothesis tested about χ²…

Null hypothesis is H0: R x C = 0

Alternate hypothesis is H1: R x C ≠ 0

α = .05

Described as a test of independence

Calculating in R… it’s simple

R script:

Console output:

Degrees of freedom (df) = (# rows – 1)(# columns – 1)

p-value < .05, so reject null
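The degrees of freedom and the decision rule can be traced by hand. The Python sketch below is an illustration, not the class R script: it computes df from the table shape and, because df = 1 here, gets the p-value from the identity P(χ² > x) = erfc(√(x ÷ 2)), which holds only for one degree of freedom. The statistic is computed without a continuity correction.

```python
import math

# Observed cell counts from the example table.
observed = [
    [3086, 3039],  # not in poverty: male, female
    [443, 623],    # in poverty: male, female
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Degrees of freedom = (# rows - 1)(# columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Uncorrected Pearson chi-square statistic.
chi_square = sum(
    (observed[i][j] - row_totals[i] * column_totals[j] / grand_total) ** 2
    / (row_totals[i] * column_totals[j] / grand_total)
    for i in range(len(observed))
    for j in range(len(observed[0]))
)

# Upper-tail probability; this shortcut is valid ONLY when df == 1.
assert df == 1
p_value = math.erfc(math.sqrt(chi_square / 2))

alpha = 0.05
print(df, p_value)
print("reject null" if p_value < alpha else "fail to reject null")
```

For tables larger than 2 x 2 the erfc shortcut no longer applies, and a general chi-square distribution function would be needed.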

χ² test of independence

A test of the hypothesis that rows and columns in a table are independent

In our case, a test of the independence of gender and poverty status reveals
• Household poverty status and gender are not independent
• Knowing household poverty status helps predict gender

But how much?
