Making Change Happen

Making Change Happen Kari Lock Morgan STAT 790: Teaching Statistics 10/24/12


Page 1: Making Change Happen

Making Change Happen

Kari Lock Morgan

STAT 790: Teaching Statistics

10/24/12

Page 2: Making Change Happen

Introducing Inference with Simulation Methods

Page 3: Making Change Happen

Sleep versus Caffeine

Mednick, Cai, Kanady, and Drummond (2008). “Comparing the benefits of caffeine, naps and placebo on verbal, motor and perceptual memory,” Behavioural Brain Research, 193, 79-86.

• Students were given words to memorize, then randomly assigned to take either a 90 min nap, or a caffeine pill. 2 ½ hours later, they were tested on their recall ability.

• $\bar{x}_{sleep} - \bar{x}_{caffeine} = 3$ words

• Is sleep better than caffeine for memory?

Page 4: Making Change Happen

Traditional Inference

1. Conditions Met? $n_1 = n_2 = 12$

2. Which formula?

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

3. Calculate numbers and plug into formula

$$t = \frac{15.25 - 12.25}{\sqrt{\frac{3.31^2}{12} + \frac{3.55^2}{12}}}$$

4. Plug into calculator

$$t = 2.14$$

5. Which theoretical distribution?

6. df?

7. find p-value

0.025 < p-value < 0.05

> pt(2.14, 11, lower.tail=FALSE)
[1] 0.02780265
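The arithmetic on this slide can be checked with a short script. The sketch below is in Python using only the standard library (the slide's own check is the R call `pt`); the function names `two_sample_t` and `t_upper_tail` are illustrative, and the tail area is obtained by numerically integrating the t density rather than by any particular library routine.

```python
import math

def two_sample_t(mean1, mean2, s1, s2, n1, n2):
    """t = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2), as on the slide."""
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus the integral of the t density over [0, t].

    The integral uses Simpson's rule (steps must be even).
    """
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    total = density(0.0) + density(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return 0.5 - total * h / 3

# Summary statistics from the slide: means 15.25 and 12.25,
# standard deviations 3.31 and 3.55, 12 students per group.
t_stat = two_sample_t(15.25, 12.25, 3.31, 3.55, 12, 12)
p_value = t_upper_tail(2.14, 11)  # same inputs as the slide's pt() call
print(round(t_stat, 2), round(p_value, 4))
```

Even with the computation automated, a student still has to choose the formula, the distribution, and the degrees of freedom, which is the point the slide is making.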

Page 5: Making Change Happen

Traditional Inference

• Confidence intervals and hypothesis tests using the normal and t-distributions

• With a different formula for each situation, students often get mired in the details and fail to see the big picture

• Plugging numbers into formulas does little to help reinforce conceptual understanding

Page 6: Making Change Happen

Simulation Methods

• Simulation methods (bootstrapping and randomization) are a computationally intensive alternative to the traditional approach

• Rather than relying on theoretical distributions for specific test statistics, we can directly simulate the distribution of any statistic

• Great for conceptual understanding!

Page 7: Making Change Happen

Hypothesis Testing

To generate a distribution assuming H0 is true:

• Traditional Approach: Calculate a test statistic which should follow a known distribution if the null hypothesis is true (under some conditions)

• Randomization Approach: Decide on a statistic of interest. Simulate many randomizations assuming the null hypothesis is true, and calculate this statistic for each randomization

Page 8: Making Change Happen

Paul the Octopus

http://www.youtube.com/watch?v=3ESGpRUMj9E

Page 9: Making Change Happen

Paul the Octopus

• Paul the Octopus predicted 8 World Cup games, and predicted them all correctly

• Is this evidence that Paul actually has psychic powers?

• How unusual would this be if he were just randomly guessing (with a 50% chance of guessing correctly)?

• How could we figure this out?

Page 10: Making Change Happen

Simulate with Students

• Students each flip a coin 8 times, and count the number of heads

• Count the number of students with all 8 heads by a show of hands (will probably be 0)

• If Paul was just guessing, it would be very unlikely for him to get all 8 correct!

• How unlikely? Simulate many times!!!
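The classroom coin-flipping activity can also be run on a computer. The following is a minimal Python sketch, not the StatKey implementation; the function name `simulate_all_correct`, the number of simulations, and the seed are arbitrary choices made for illustration.

```python
import random

def simulate_all_correct(n_games=8, n_sims=100_000, seed=1):
    """Estimate the chance of guessing all n_games correctly with fair coin flips."""
    rng = random.Random(seed)
    all_correct = 0
    for _ in range(n_sims):
        # One simulated "student": flip a fair coin once per game.
        correct = sum(rng.random() < 0.5 for _ in range(n_games))
        if correct == n_games:
            all_correct += 1
    return all_correct / n_sims

exact = 0.5 ** 8                    # probability of 8 heads in 8 fair flips
simulated = simulate_all_correct()  # proportion of simulations with all 8 correct
print(exact, simulated)
```

The simulated proportion plays the role of the show of hands, just with many more "students".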

Page 11: Making Change Happen

Simulate with StatKey

www.lock5stat.com/statkey

$$\left(\frac{1}{2}\right)^8 = 0.0039$$

Page 12: Making Change Happen

Cocaine Addiction

• In a randomized experiment on treating cocaine addiction, 48 people were randomly assigned to take either Desipramine (a new drug), or Lithium (an existing drug)

• The outcome variable is whether or not a patient relapsed

• Is Desipramine significantly better than Lithium at treating cocaine addiction?

Page 13: Making Change Happen

1. Randomly assign units to treatment groups

[Figure: the patients split into two groups, New Drug and Old Drug]

Page 14: Making Change Happen

1. Randomly assign units to treatment groups

2. Conduct experiment

3. Observe relapse counts in each group

[Figure: outcomes in each group; R = Relapse, N = No Relapse]

New Drug: 10 relapse, 14 no relapse

Old Drug: 18 relapse, 6 no relapse

$$\hat{p}_{new} - \hat{p}_{old} = \frac{10}{24} - \frac{18}{24} = -0.333$$

Page 15: Making Change Happen

Randomization Test

• Assume the null hypothesis is true

• Simulate new randomizations

• For each, calculate the statistic of interest

• Find the proportion of these simulated statistics that are as extreme as your observed statistic
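These steps can be sketched for the cocaine example in a few lines of Python. This is an illustrative sketch, not the StatKey implementation: the function name `randomization_p_value`, the number of simulations, and the seed are assumptions. It uses the counts from the experiment (28 relapses and 20 non-relapses in total, 24 patients per group) and a one-sided comparison, since the question is whether the new drug is better.

```python
import random

def randomization_p_value(n_sims=10_000, seed=1):
    """One-sided randomization test for the difference in relapse proportions."""
    rng = random.Random(seed)
    # Outcomes are fixed under the null hypothesis: 28 relapse (R), 20 no relapse (N).
    outcomes = ["R"] * 28 + ["N"] * 20
    observed = 10 / 24 - 18 / 24  # p-hat(new) - p-hat(old)
    as_extreme = 0
    for _ in range(n_sims):
        # Re-deal the 48 outcomes into two groups of 24.
        rng.shuffle(outcomes)
        new_drug, old_drug = outcomes[:24], outcomes[24:]
        diff = new_drug.count("R") / 24 - old_drug.count("R") / 24
        # "As extreme" here means a difference at least as negative as observed.
        if diff <= observed + 1e-12:
            as_extreme += 1
    return as_extreme / n_sims

p_value = randomization_p_value()
print(p_value)
```

Shuffling the outcome labels while keeping group sizes fixed mirrors re-dealing index cards: the outcomes stay the same and only the group assignment changes.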

Page 16: Making Change Happen

[Figure: observed outcomes in each group; R = Relapse, N = No Relapse]

New Drug: 10 relapse, 14 no relapse

Old Drug: 18 relapse, 6 no relapse

Page 17: Making Change Happen

Simulate another randomization

[Figure: outcomes re-dealt into two groups; R = Relapse, N = No Relapse]

New Drug: 16 relapse, 8 no relapse

Old Drug: 12 relapse, 12 no relapse

$$\hat{p}_N - \hat{p}_O = \frac{16}{24} - \frac{12}{24} = 0.167$$

Page 18: Making Change Happen

Simulate another randomization

[Figure: outcomes re-dealt into two groups; R = Relapse, N = No Relapse]

New Drug: 17 relapse, 7 no relapse

Old Drug: 11 relapse, 13 no relapse

$$\hat{p}_N - \hat{p}_O = \frac{17}{24} - \frac{11}{24} = 0.250$$

Page 19: Making Change Happen

Simulate with Students

• Give students index cards labeled R (28 cards) and N (20 cards)

• Have them deal the cards into 2 groups

• Compute the difference in proportions

• Contribute to a class dotplot for the randomization distribution

Page 20: Making Change Happen

Cocaine Addiction

• Why did you re-deal your cards?

• Why did you leave the outcomes (relapse or no relapse) unchanged on each card?

You want to know what would happen

• by random chance (the random allocation to treatment groups)

• if the null hypothesis is true (there is no difference between the drugs)

Page 21: Making Change Happen

Simulate with StatKey

www.lock5stat.com/statkey

[Figure: StatKey randomization dotplot, labeled “Distribution of Statistic Assuming Null is True”, “observed statistic”, and “Proportion as extreme as observed statistic”]

p-value: The probability of getting results as extreme as, or more extreme than, those observed, if the null hypothesis is true, is about 0.02.

Page 22: Making Change Happen

Sleep versus Caffeine

Mednick, Cai, Kanady, and Drummond (2008). “Comparing the benefits of caffeine, naps and placebo on verbal, motor and perceptual memory,” Behavioural Brain Research, 193, 79-86.

• Students were given words to memorize, then randomly assigned to take either a 90 min nap, or a caffeine pill. 2 ½ hours later, they were tested on their recall ability.

• $\bar{x}_{sleep} - \bar{x}_{caffeine} = 3$ words

• Is sleep better than caffeine for memory?

Page 23: Making Change Happen

Simulation Inference

• How extreme would a sample difference of 3 be, if there were no difference between sleep and caffeine for word recall?

• What kind of differences would we observe, just by random chance?

• Simulate many randomizations, assuming no difference

• Calculate the p-value as the proportion of simulated randomizations that yield differences as extreme as the observed 3
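The same recipe works for a difference in means. The slides do not include the raw recall scores, so the sketch below runs on made-up toy values that are NOT the study's data; the function name `randomization_test_means` and the toy lists are purely illustrative.

```python
import random
from statistics import mean

def randomization_test_means(group_a, group_b, n_sims=10_000, seed=1):
    """One-sided randomization test for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    as_extreme = 0
    for _ in range(n_sims):
        # Under the null hypothesis the group labels are arbitrary, so reshuffle them.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed - 1e-12:
            as_extreme += 1
    return observed, as_extreme / n_sims

# Made-up toy values for illustration only (not the sleep/caffeine data).
toy_a = [5, 7, 8, 9]
toy_b = [1, 2, 3, 6]
observed, p_value = randomization_test_means(toy_a, toy_b)
print(observed, p_value)
```

With the real data, `group_a` and `group_b` would hold the words recalled in the sleep and caffeine groups, and the observed difference would be the 3 words mentioned above.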

Page 24: Making Change Happen

Randomization Test

[Figure: StatKey randomization distribution, labeled “Distribution of Statistic Assuming Null is True”, “observed statistic”, and “p-value: Proportion as extreme as observed statistic”]

StatKey at www.lock5stat.com

Page 25: Making Change Happen

Traditional Inference

1. Conditions Met? $n_1 = n_2 = 12$

2. Which formula?

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

3. Calculate numbers and plug into formula

$$t = \frac{15.25 - 12.25}{\sqrt{\frac{3.31^2}{12} + \frac{3.55^2}{12}}}$$

4. Plug into calculator

$$t = 2.14$$

5. Which theoretical distribution?

6. df?

7. find p-value

0.025 < p-value < 0.05

> pt(2.14, 11, lower.tail=FALSE)
[1] 0.02780265

Page 26: Making Change Happen

Simulation vs Traditional

• Simulation methods
  • intrinsically connected to concepts
  • same procedure applies to all statistics
  • no conditions to check
  • minimal background knowledge needed

• Traditional methods (normal and t based)
  • familiarity expected after intro stats
  • needed for future statistics classes
  • only summary statistics are needed
  • insight from standard error

Page 27: Making Change Happen

Simulation AND Traditional?

• We introduce inference with simulation methods, then cover the traditional methods

• Students have seen the normal distribution appear repeatedly via simulation; use this common shape to motivate traditional inference

• “Shortcut” formulas give the standard error, avoiding the need for thousands of simulations

• Students already know the concepts, so can go relatively fast through the mechanics

Page 28: Making Change Happen

Topics

Ch 1: Collecting Data
Ch 2: Describing Data
Ch 3: Confidence Intervals (Bootstrap)
Ch 4: Hypothesis Tests (Randomization)
Ch 5: Normal Distribution
Ch 6: Inference for Means and Proportions (formulas and theory)
Ch 7: Chi-Square Tests
Ch 8: ANOVA
Ch 9: Regression
Ch 10: Multiple Regression
(Optional): Probability

Page 29: Making Change Happen

Theoretical Approach

• Normal and t-based inference after bootstrapping and randomization:

• Students have seen the normal distribution repeatedly – CLT easy!

• Same idea, just using formula for SE and comparing to theoretical distribution

• Can go very quickly through this!

Page 30: Making Change Happen

Theoretical Approach

[Figure: StatKey theoretical distribution, labeled with the t-statistic and the p-value]

www.lock5stat.com/statkey

Page 31: Making Change Happen

Chi-Square and ANOVA

• Introduce new statistic: χ² or F

• Students know that these can be compared to either a randomization distribution or a theoretical distribution

• Students are comfortable using either method, and see the connection!

• If conditions are met, the randomization and theoretical distributions are the same!

Page 32: Making Change Happen

Chi-Square Statistic

Randomization Distribution: χ² statistic = 3.242, p-value = 0.357

Chi-Square Distribution (3 df): χ² statistic = 3.242, p-value = 0.356
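The theoretical p-value on this slide can be reproduced without statistical tables. For 3 degrees of freedom the chi-square upper tail has a closed form, which the Python sketch below evaluates; the function name `chi2_upper_tail_3df` is illustrative, and the formula applies only to 3 df. The randomization p-value cannot be recomputed here because the slides do not include the underlying table of counts.

```python
import math

def chi2_upper_tail_3df(x):
    """P(chi-square with 3 df > x), using the closed form valid for 3 df only."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_upper_tail_3df(3.242)  # statistic from the slide
print(round(p_value, 3))
```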

Page 33: Making Change Happen

Student Preferences

Which way of doing inference gave you a better conceptual understanding of confidence intervals and hypothesis tests?

Bootstrapping and Randomization: 113 (69%)

Formulas and Theoretical Distributions: 51 (31%)

Page 34: Making Change Happen

Student Preferences

Which way did you prefer to learn inference (confidence intervals and hypothesis tests)?

Bootstrapping and Randomization: 105 (64%)

Formulas and Theoretical Distributions: 60 (36%)

             Simulation   Traditional
AP Stat          31           36
No AP Stat       74           24

Page 35: Making Change Happen

Student Behavior

• Students were given data on the second midterm and asked to compute a confidence interval for the mean

• How they created the interval:

Bootstrapping: 94 (84%)
t.test in R: 9 (8%)
Formula: 9 (8%)

Page 36: Making Change Happen

A Student Comment

“I took AP Stat in high school and I got a 5. It was mainly all equations, and I had no idea of the theory behind any of what I was doing.

Statkey and bootstrapping really made me understand the concepts I was learning, as opposed to just being able to just spit them out on an exam.”

- one of my students

Page 37: Making Change Happen

It is the way of the past…

"Actually, the statistician does not carry out this very simple and very tedious process [the randomization test], but his conclusions have no justification beyond the fact that they agree with those which could have been arrived at by this elementary method."

-- Sir R. A. Fisher, 1936

Page 38: Making Change Happen

… and the way of the future

“... the consensus curriculum is still an unwitting prisoner of history. What we teach is largely the technical machinery of numerical approximations based on the normal distribution and its many subsidiary cogs. This machinery was once necessary, because the conceptually simpler alternative based on permutations was computationally beyond our reach. Before computers statisticians had no choice. These days we have no excuse. Randomization-based inference makes a direct connection between data production and the logic of inference that deserves to be at the core of every introductory course.”

-- Professor George Cobb, 2007

Page 39: Making Change Happen

Confidence Intervals

• How to assess the variability of sample statistics?

• Sample repeatedly from the population to create a sampling distribution to compute the standard error?

• Reality: We only have one sample!

• How do we determine how much statistics vary from sample to sample, if we only have one sample???

Page 40: Making Change Happen

Sampling Distribution

[Figure: population, labeled µ]

BUT, in practice we don’t see the “tree” or all of the “seeds” – we only have ONE seed

Page 41: Making Change Happen

Bootstrap Distribution

Bootstrap “Population”

What can we do with just one seed?

Grow a NEW tree!

Estimate the distribution and variability (SE) of $\bar{x}$’s from the bootstrap statistics

Page 42: Making Change Happen

Bootstrap Distribution

Bootstrap statistics are to the original sample statistic as the original sample statistic is to the population parameter

Page 43: Making Change Happen

Standard Error

• The variability of the bootstrap statistics is similar to the variability of the sample statistics

• The standard error of a statistic can be estimated using the standard deviation of the bootstrap distribution!

Page 44: Making Change Happen

Bootstrap Distribution

• Imagine the population is many, many copies of the original sample (what do you have to assume?)

• Sample repeatedly from this mock population

• This is done by sampling with replacement from the original sample

Page 45: Making Change Happen

Atlanta Commutes

Data: The American Housing Survey (AHS) collected data from Atlanta in 2004.

What’s the mean commute time for workers in metropolitan Atlanta?

Page 46: Making Change Happen

Atlanta Commutes - Sample

Where might the “true” μ be?

[Figure: CommuteAtlanta dot plot; horizontal axis Time, ticks from 20 to 180]

Page 47: Making Change Happen

Original Sample

Page 48: Making Change Happen

“Population” many copies of sample

Sample from this “population”

Page 49: Making Change Happen

Bootstrapping

• A bootstrap sample is a random sample taken with replacement from the original sample, of the same size as the original sample

• A bootstrap statistic is the statistic computed on the bootstrap sample

• A bootstrap distribution is the distribution of many bootstrap statistics
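The three definitions translate almost line for line into code. The Python sketch below is illustrative only: the function names `bootstrap_means` and `percentile_interval`, the seed, and the number of bootstrap samples are assumptions, and the sample is a set of made-up numbers, NOT the CommuteAtlanta data. It estimates the standard error as the standard deviation of the bootstrap statistics and reports the middle 95% of them as an interval.

```python
import random
from statistics import mean, stdev

def bootstrap_means(sample, n_boot=5000, seed=1):
    """Bootstrap distribution of the mean: one statistic per bootstrap sample."""
    rng = random.Random(seed)
    n = len(sample)
    # Each bootstrap sample is drawn with replacement and has the original size n.
    return [mean(rng.choices(sample, k=n)) for _ in range(n_boot)]

def percentile_interval(stats, level=0.95):
    """Middle `level` fraction of the bootstrap statistics."""
    ordered = sorted(stats)
    lower = int((1 - level) / 2 * len(ordered))
    upper = int((1 + level) / 2 * len(ordered)) - 1
    return ordered[lower], ordered[upper]

# Made-up values for illustration only (not the Atlanta commute sample).
toy_sample = [12, 15, 20, 22, 25, 30, 31, 45, 60, 90]
boot = bootstrap_means(toy_sample)
se = stdev(boot)                # bootstrap estimate of the standard error
ci = percentile_interval(boot)  # middle 95% of the bootstrap statistics
print(round(se, 2), ci)
```

Swapping `mean` for another statistic (a median, a correlation) is the only change needed, which is the "same procedure applies to all statistics" advantage claimed for simulation methods.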

Page 50: Making Change Happen

[Diagram: Original Sample → many Bootstrap Samples → one Bootstrap Statistic per Bootstrap Sample → Bootstrap Distribution; the Original Sample itself gives the Sample Statistic]

Page 51: Making Change Happen

Simulate with StatKey

www.lock5stat.com/statkey

Middle 95% of bootstrap statistics: (27.4, 30.9)

Page 52: Making Change Happen

Other Changes

• More Bayesian inference in undergraduate courses?

• More emphasis on analysis of big datasets?

• ?

Page 53: Making Change Happen

Making Change Happen

• You are the next generation of statistics professors… you are best suited to implement change!

• Think hard about the existing curriculum and pedagogy… is it best for the students?

• If not, how could it be made better?

• Talk to colleagues, attend conferences, …

Page 54: Making Change Happen

Making Change Happen

• The 2013 United States Conference on Teaching Statistics (USCOTS) will be in the Research Triangle Park!

• May 16-18, 2013

• Conference theme: “Making Change Happen”

• Plenary sessions, breakout sessions, posters

• You are all encouraged to attend!

Page 55: Making Change Happen

Practice Teaching

• 12 min, any statistics topic

• “It’s not what you teach, it’s what they learn”

• Motivate the topic

• Use examples, and real data if possible

• Stress conceptual understanding

• Try visuals to clarify concepts

• What are the key points?

• Computer or board?

• Engage the students!