Stat 301 – Day 21 Large sample methods. Announcements HW 4 Updated solutions Especially...

Stat 301 – Day 21

Large sample methods

Announcements

HW 4

Updated solutions, especially Simpson’s Paradox

Should always show your work and explain your reasoning

Significance is an adjective of the statistic, not the parameter!

Project proposals

Informed consent, “anchoring”

Last Time – Hypergeometric Distribution for Fisher’s Exact Test

X counts the number of successes in a sample of n objects selected from a population of N objects consisting of M successes

E(X) = n(M/N)

Two-way table applet, R

Fisher’s Exact Test

P(X = x) = C(M, x) C(N − M, n − x) / C(N, n)
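As a rough check on the hypergeometric probability and on E(X) = n(M/N), here is a minimal Python sketch. The function name `hypergeom_pmf` is made up for illustration (it is not from the applet or from R), and the inputs are the yawning-study values N = 50, M = 14, n = 34 used in these notes.

```python
from math import comb

def hypergeom_pmf(x, N, M, n):
    """P(X = x): x successes in a sample of n objects drawn without
    replacement from N objects, M of which are successes.
    (Illustrative helper; the name is an assumption.)"""
    # Values of x outside the possible range have probability zero.
    if x < max(0, n - (N - M)) or x > min(n, M):
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Yawning-study inputs from the notes
N, M, n = 50, 14, 34

probs = [hypergeom_pmf(x, N, M, n) for x in range(n + 1)]
total = sum(probs)                                  # should be 1
mean = sum(x * p for x, p in enumerate(probs))      # should equal n * M / N

print("sum of probabilities:", round(total, 6))
print("mean from pmf:", round(mean, 4), " n(M/N):", round(n * M / N, 4))
```

Summing x · P(X = x) over all possible x should reproduce n(M/N), which is one way to confirm that M, n, and N have been assigned correctly.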

Let X represent the number of yawners in the yawn seed group

p-value = P(X ≥ 10) with M = 14, n = 34, N = 50

(o) Success = “not yawning”: let X represent the number of non-yawners in the yawn seed group

p-value = P(X ≤ 24) with M = 36, n = 34, N = 50

Yawning study

[Two-way table diagram with margins labeled M, N, and n]

(p) Let X represent the number of non-yawners in the no-yawn-seed group

p-value = P(X ≥ 12) with M = 36, n = 16, N = 50
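The three set-ups for the yawning study should give the same p-value. A minimal Python sketch of that check follows. It assumes the tails include the observed count (X ≥ 10, X ≤ 24, X ≥ 12), in line with "at least as extreme", and the helper names are illustrative only.

```python
from math import comb

def pmf(x, N, M, n):
    # Hypergeometric P(X = x); zero outside the possible range of x.
    if x < max(0, n - (N - M)) or x > min(n, M):
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def upper_tail(k, N, M, n):
    """P(X >= k)."""
    return sum(pmf(x, N, M, n) for x in range(k, min(n, M) + 1))

def lower_tail(k, N, M, n):
    """P(X <= k)."""
    return sum(pmf(x, N, M, n) for x in range(max(0, n - (N - M)), k + 1))

# Yawners in the yawn seed group
p_yawners_seed = upper_tail(10, N=50, M=14, n=34)
# (o) Non-yawners in the yawn seed group
p_nonyawners_seed = lower_tail(24, N=50, M=36, n=34)
# (p) Non-yawners in the no-yawn-seed group
p_nonyawners_noseed = upper_tail(12, N=50, M=36, n=16)

print(p_yawners_seed, p_nonyawners_seed, p_nonyawners_noseed)
```

The three results agree because fixing one cell of the two-way table fixes the other three once the margins are known.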


Last Time – Fisher’s Exact Test

Once you have the two-way table, there are multiple equivalent ways to calculate the p-value

Identifying values of M, n, and N and writing out P(X ? ?)…

Include detail on how you carried out the calculation (which inputs, which technology)

Interpretation of p-value: X% of “random shuffles” would have a difference in proportions at least as extreme as XX, assuming no treatment effect

PP 2.7A

(a) This test procedure, which assumes the null hypothesis is true, can never be used as evidence for the null

Absence of evidence is not evidence of absence

(b) This sample size is not all that large, so it will be difficult to have a high probability of detecting a small increase in probability

(c) Not independent random samples; instead, the number of successes in group A determines the number of successes in group B

PP 2.7B

(a) N = 30, M = 13, n = 15

(b) P(X > 10)

Investigation 2.8 (p. 158)

(a) Obs unit = 518 cases

EV = which set of instructions (CPR vs. CC)

RV = whether or not survived to discharge

(b)

(c) Diff in sample proportions: 29/278-35/240 = -.0415

(d) P(X ≤ 29) where

X is hypergeometric (N = 518, M = 64, n = 278)

Expected # of successes in CPR: (64/518)*(278) ≈ 34.35
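The expected count and the exact tail probability in (d) can be sketched in Python as below. The sketch takes n = 278 (the CPR group size used in the proportion 29/278) and an inclusive lower tail, X ≤ 29; both are assumptions about how the calculation was set up, and the variable names are illustrative.

```python
from math import comb

# CPR vs. CC study: 518 cases, 64 survivors, 278 cases in the CPR group
N, M, n = 518, 64, 278
observed = 29   # survivors observed in the CPR group

def pmf(x):
    # Hypergeometric P(X = x) for the fixed N, M, n above.
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

expected = n * M / N                                   # n(M/N)
p_value = sum(pmf(x) for x in range(0, observed + 1))  # P(X <= 29)

print("expected successes in CPR:", round(expected, 2))
print("exact lower-tail probability:", round(p_value, 4))
```

Python integers have unlimited precision, so the large binomial coefficients are computed exactly before the final division.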

Normal approximation to hypergeometric

R

Improve the approximation?

.0973 vs. .0764
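One common way to improve a normal approximation to a discrete distribution is a continuity correction. The Python sketch below compares the approximation with and without it. The notes do not say which method produced which of the two figures above, so treat the mapping as an assumption. The sketch also assumes n = 278 and the tail X ≤ 29.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

N, M, n = 518, 64, 278   # cases, survivors, CPR group size
observed = 29            # survivors in the CPR group

# Mean and standard deviation of the hypergeometric distribution
mean = n * M / N
sd = sqrt(n * (M / N) * (1 - M / N) * (N - n) / (N - 1))

# Normal approximation to P(X <= 29), without and with continuity correction
p_plain = normal_cdf((observed - mean) / sd)
p_corrected = normal_cdf((observed + 0.5 - mean) / sd)

print("mean:", round(mean, 2), " sd:", round(sd, 3))
print("no correction:  ", round(p_plain, 4))
print("with correction:", round(p_corrected, 4))
```

The correction evaluates the normal curve at 29.5 rather than 29, so that the whole probability bar for X = 29 is included in the tail.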

(h) p-value = .0973 < .10, so we reject the null hypothesis at the 10% level and conclude that the survival rate is higher with CC alone (underlying treatment probability is larger)

But not at 5% level of significance

(i) Use technology

I’m 90% confident that the survival rate with CC alone is between 1 percentage point lower and 8.5 percentage points higher than the survival rate with CPR

Investigation 2.9 preview

Is the observed difference of -.0415 noteworthy?

Definition: Relative risk is the ratio of the conditional proportions

Often set up to be larger than one

RR = (CC survival)/(CPR survival) = .1458/.1043 ≈ 1.398

Interpretation: Survival rate is about 1.4 times as high with CC as with CPR

Or the survival rate is about 40% higher with CC than with CPR
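The arithmetic behind the difference in proportions and the relative risk can be sketched in Python from the counts given earlier in the notes (29 of 278 survived with CPR, 35 of 240 with CC). Variable names are illustrative.

```python
# Counts from Investigation 2.8
cpr_survived, cpr_total = 29, 278
cc_survived, cc_total = 35, 240

p_cpr = cpr_survived / cpr_total    # conditional proportion, CPR
p_cc = cc_survived / cc_total       # conditional proportion, CC

diff = p_cpr - p_cc                 # difference in sample proportions
rr = p_cc / p_cpr                   # relative risk, set up to exceed one
percent_higher = (rr - 1) * 100     # percent increase in survival rate

print("CPR:", round(p_cpr, 4), " CC:", round(p_cc, 4))
print("difference:", round(diff, 4))
print("relative risk:", round(rr, 3))
print("percent higher:", round(percent_higher, 1))
```

The relative risk compares the two rates as a ratio, so it reads as a percent change in the rate rather than a count of additional patients.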

Our usual question

How do I decide whether my observed relative risk of 1.398 is surprising under the null hypothesis of no difference between the two treatments?

1. Simulate the shuffling of 64 survivors and 454 non-survivors to the CPR/CC groups

2. Calculate the relative risk for each shuffle

3. Where does 1.398 fall in this distribution?
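The three steps above can be sketched as a small Python simulation. This is only one possible implementation: the function names, the number of shuffles, and the seed are all assumptions, and the class applet or R may organize the work differently.

```python
import random

def relative_risk(cc_surv, cc_n, cpr_surv, cpr_n):
    """(CC survival rate) / (CPR survival rate)."""
    if cpr_surv == 0:
        return float("inf")   # guard against division by zero
    return (cc_surv / cc_n) / (cpr_surv / cpr_n)

def simulate(num_shuffles=2000, seed=1):
    rng = random.Random(seed)
    # Step 1: 64 survivors (1) and 454 non-survivors (0) to be shuffled
    outcomes = [1] * 64 + [0] * 454
    observed = relative_risk(35, 240, 29, 278)
    rrs = []
    for _ in range(num_shuffles):
        rng.shuffle(outcomes)
        cc_surv = sum(outcomes[:240])   # first 240 cases go to the CC group
        cpr_surv = 64 - cc_surv         # remaining 278 cases go to CPR
        # Step 2: relative risk for this shuffle
        rrs.append(relative_risk(cc_surv, 240, cpr_surv, 278))
    # Step 3: proportion of shuffles at least as large as the observed value
    p_sim = sum(rr >= observed for rr in rrs) / num_shuffles
    return observed, rrs, p_sim

observed, rrs, p_sim = simulate()
print("observed relative risk:", round(observed, 3))
print("proportion of shuffles with RR >= observed:", p_sim)
```

Because the group sizes are fixed, a shuffle has a relative risk at least as large as the observed one exactly when it puts 35 or more survivors in the CC group. The simulated proportion therefore estimates the same tail probability as Fisher’s Exact Test.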

To Do for Thursday

PP 2.8 (p. 156)
