Sampling and Sample Size Part 2
Cally Ardington

Lecture Outline

• Standard deviation and standard error
• Detecting impact
  • Background
  • Hypothesis testing
  • Power
  • The ingredients of power

Case 2: Remedial Education in India
Evaluating the Balsakhi Program

We implement the Balsakhi Program

Incorporating random assignment into the program

Post-test: control & treatment

Figure: distributions of test scores for the control and treatment groups, with the control mean (μ) and treatment mean (μ) marked. x-axis: test scores.

Is this impact statistically significant?

A. Yes
B. No
C. Don’t know

Hypothesis Testing

• The Law of Large Numbers and the Central Limit Theorem allow us to do hypothesis testing to determine whether our findings are statistically significant

Hypothesis Testing

• In criminal law, most institutions follow the rule: “innocent until proven guilty”
• The presumption is that the accused is innocent and the burden is on the prosecutor to show guilt
• The jury or judge starts with the “null hypothesis” that the accused person is innocent
• The prosecutor has a hypothesis that the accused person is guilty

Hypothesis Testing

• In program evaluation, instead of “presumption of innocence,” the rule is: “presumption of insignificance”
• The “Null hypothesis” (H0) is that there was no (zero) impact of the program
• The burden of proof is on the evaluator to show a significant difference

Hypothesis Testing: Conclusions

• If it is very unlikely (less than a 5% probability) that the difference is solely due to chance:
  • We “reject our null hypothesis”
• We may now say:
  • “our program has a statistically significant impact”
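The decision rule on these slides can be sketched in a few lines. This is a minimal illustration, not the lecture's own code: the function name `two_sample_z_test`, the simulated scores, and the use of a two-sided normal (z) test are all assumptions made for the example.

```python
import random
from statistics import NormalDist, mean, variance

def two_sample_z_test(control, treatment, alpha=0.05):
    """Two-sided test of H0: no impact. Returns (difference, z, p_value, reject_null)."""
    diff = mean(treatment) - mean(control)
    # Standard error of a difference in means for two independent samples.
    se = (variance(control) / len(control) + variance(treatment) / len(treatment)) ** 0.5
    z = diff / se
    # Probability of a difference at least this large if H0 were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p_value, p_value < alpha

# Hypothetical test scores, simulated only for illustration.
rng = random.Random(0)
control = [rng.gauss(50, 15) for _ in range(500)]
treatment = [rng.gauss(55, 15) for _ in range(500)]
diff, z, p, reject = two_sample_z_test(control, treatment)
print(f"difference={diff:.2f}, z={z:.2f}, p={p:.4f}, reject H0: {reject}")
```

If the p-value is below 5%, we reject the null hypothesis and call the impact statistically significant.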

Type I and II errors

                          YOU CONCLUDE
                          Effective        No Effect
THE TRUTH   Effective                      Type II Error
            No Effect     Type I Error

What is the significance level?

• Type I error: rejecting the null hypothesis even though it is true (false positive)

• Significance level: The probability that we will reject the null hypothesis even though it is true
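One way to see what the significance level means is to simulate many experiments in which the null hypothesis is true and count how often it is rejected anyway. The sketch below is illustrative only: the function name, sample sizes, seed, and the assumption of a known standard deviation are not from the lecture.

```python
import random
from statistics import NormalDist, mean

def type_one_error_rate(n_sims=2000, n=50, sd=1.0, alpha=0.05, seed=1):
    """Share of simulated experiments that reject H0 when the true impact is zero."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    se = sd * (2 / n) ** 0.5                       # SE of a difference in means, known sd
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0, sd) for _ in range(n)]
        treatment = [rng.gauss(0, sd) for _ in range(n)]  # same distribution: H0 is true
        z = (mean(treatment) - mean(control)) / se
        rejections += abs(z) > z_crit
    return rejections / n_sims

print(type_one_error_rate())  # should land close to 0.05
```

The rejection rate settles near the chosen significance level, which is exactly the Type I error probability.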

Theoretical Sampling Distribution

Figure: sampling distribution curve for the control group, labeled H0.

95% Confidence Interval

Figure: the H0 (control) sampling distribution with the 95% confidence interval marked, 1.96 SD on either side.

Impose Significance Level of 5%

Figure: the H0 (control) sampling distribution with the 5% significance cutoff marked at 1.96 SD.
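The 1.96 on these slides is the cutoff of a standard normal distribution that leaves 2.5% in each tail (measured in standard deviations of the sampling distribution, i.e. standard errors). As a quick check, using Python's standard library and variable names chosen here for illustration:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1
two_sided_cutoff = std_normal.inv_cdf(0.975)  # leaves 2.5% in each tail
one_sided_cutoff = std_normal.inv_cdf(0.95)   # leaves 5% in the upper tail only
print(round(two_sided_cutoff, 2), round(one_sided_cutoff, 3))  # 1.96 1.645
```

The one-sided cutoff is included because a 5% test that only looks for a positive impact uses the smaller threshold.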

What is Power?

• Type II Error: Failing to reject the null hypothesis (concluding there is no difference) when the null hypothesis is in fact false.
• Power: If there is a measurable effect of our intervention (the null hypothesis is false), the probability that we will detect an effect (reject the null hypothesis)
• Power = 1 - Probability of Type II Error

Figure: sampling distributions for the control group (H0) and the treatment group.

Impose significance level of 5%

Anything between the lines cannot be distinguished from 0

Figure: control (H0) and treatment sampling distributions with the significance cutoffs marked.

Can we distinguish Hβ from H0?

Shaded area shows % of time we would find Hβ true if it was

Figure: control (H0) and treatment sampling distributions with the power region shaded.

Type I and II errors

                          YOU CONCLUDE
                          Effective        No Effect
THE TRUTH   Effective                      Type II Error
            No Effect     Type I Error
                          (probability =
                          significance level)

Type I and II errors

                          YOU CONCLUDE
                          Effective        No Effect
THE TRUTH   Effective     (probability =   Type II Error
                          power)
            No Effect     Type I Error

Before the experiment

• Assume two effects: no effect and treatment effect β

Figure: sampling distributions under H0 (control) and Hβ (treatment).

What influences power?

• Which factors change the proportion of the research hypothesis curve that is shaded, i.e. the proportion that falls to the right (or left) of the significance cutoff on the null hypothesis curve?

• Understanding this helps us design more powerful experiments

Power: Main Ingredients

• Effect Size
• Sample Size
• Variance
• Proportion of sample in Treatment vs. Control
• Clustering

Effect Size: 1*SD

• Hypothesized effect size determines the distance between the means

Figure: control (H0) and treatment (Hβ) sampling distributions, 1 standard deviation apart.

Effect Size = 1*SD

Figure: control (H0) and treatment (Hβ) sampling distributions with the significance cutoff marked.

Power: 26%

If the true impact was 1*SD, the Null Hypothesis would be rejected only 26% of the time

Figure: control (H0) and treatment (Hβ) sampling distributions with the power region shaded.

Effect Size: 3*SD

Bigger hypothesized effect size: distributions farther apart

Figure: control and treatment sampling distributions, 3*SD apart.

Effect size 3*SD: Power = 91%

Bigger effect size means more power

Figure: control (H0) and treatment sampling distributions with the power region shaded.
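The 26% and 91% figures can be reproduced with a short calculation. The sketch below rests on assumptions that the slides do not state explicitly: a one-sided test at the 5% level, a normal sampling distribution, and an effect measured in units of the estimate's standard error. The function name `power_one_sided` is chosen here for illustration.

```python
from statistics import NormalDist

def power_one_sided(effect, se=1.0, alpha=0.05):
    """Power of a one-sided z-test when the true effect is `effect`
    and the estimate has standard error `se`."""
    z_crit = NormalDist().inv_cdf(1 - alpha)       # about 1.645 for alpha = 0.05
    # Probability that the estimate lands beyond the cutoff when Hβ is true.
    return NormalDist().cdf(effect / se - z_crit)

for effect in (1, 3):
    print(f"effect = {effect} -> power = {power_one_sided(effect):.0%}")
```

Under these assumptions the loop prints 26% for an effect of 1 and 91% for an effect of 3, matching the slides.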

What effect size should you use when designing your experiment?

A. Smallest effect size that is still cost effective
B. Largest effect size you expect your program to produce
C. Both
D. Neither

Picking an effect size

• What is the smallest effect that would justify adopting the program?
  • Cost of this program vs. the benefits it brings
  • Cost of this program vs. the alternative use of the money
• If the effect is smaller than that, it might as well be zero: we are not interested in proving that a very small effect is different from zero
• In contrast, any effect larger than that effect would justify adopting this program: we want to be able to distinguish it from zero

Effect size and take-up

• Let’s say we believe the impact on our participants is “3”
• What happens if take up is 1/3?
• Let’s show this graphically

Effect Size: 3*SD

Let’s say we believe the impact on our participants is “3”

Figure: control and treatment sampling distributions, 3*SD apart.

Take up is 33%. Effect size is 1/3rd

• Hypothesized effect size determines the distance between the means

Figure: control (H0) and treatment (Hβ) sampling distributions, 1 standard deviation apart.

Back to: Power = 26%

Take-up is reflected in the effect size

Figure: control (H0) and treatment (Hβ) sampling distributions with the power region shaded.
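The take-up arithmetic can be sketched as follows. The function names are hypothetical, and the calculation assumes that people who do not take up the program are unaffected, that the test is one-sided at 5%, and that the effect is measured in units of the estimate's standard error.

```python
from statistics import NormalDist

def diluted_effect(effect_on_participants, take_up):
    """Average effect across everyone assigned to treatment,
    assuming those who do not take up the program are unaffected."""
    return effect_on_participants * take_up

def power_one_sided(effect, se=1.0, alpha=0.05):
    """Power of a one-sided z-test for a true effect `effect` with standard error `se`."""
    return NormalDist().cdf(effect / se - NormalDist().inv_cdf(1 - alpha))

full = power_one_sided(3)                             # everyone takes up the program
partial = power_one_sided(diluted_effect(3, 1 / 3))   # only a third take it up
print(f"full take-up: {full:.0%}, one-third take-up: {partial:.0%}")
```

With take-up of one third, an impact of 3 on participants shrinks to an average effect of 1 in the treatment group, and power falls from 91% to 26%.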

Power: Main Ingredients

• Effect Size
• Sample Size
• Variance
• Proportion of sample in Treatment vs. Control
• Imperfect compliance
• Clustering

By increasing sample size you increase…

A. Accuracy
B. Precision
C. Both
D. Neither
E. Don’t know

Figure: control and treatment sampling distributions with the power region shaded. Power: 91%

Power: Effect size = 1 SD, Sample size = 4

Figure: control and treatment sampling distributions with the significance cutoff marked.

Power: 64%

Figure: control and treatment sampling distributions with the power region shaded.

Power: Effect size = 1 SD, Sample size = 9

Figure: control and treatment sampling distributions with the significance cutoff marked.

Power: 91%

Figure: control and treatment sampling distributions with the power region shaded.

Sample Size
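The power figures for sample sizes 4 and 9 follow from the standard error shrinking with the square root of the sample size. This is a sketch under assumptions the slides leave implicit: the standard error is SD/√n, the test is one-sided at 5%, and the effect is expressed in SD units. The function name is chosen here for illustration.

```python
from statistics import NormalDist

def power_for_sample_size(effect_in_sd, n, alpha=0.05):
    """Power of a one-sided z-test when the standard error is SD / sqrt(n)."""
    se = 1 / n ** 0.5                              # standard error in SD units
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(effect_in_sd / se - z_crit)

for n in (1, 4, 9):
    print(f"effect = 1 SD, n = {n}: power = {power_for_sample_size(1, n):.0%}")
```

Under these assumptions the loop prints 26%, 64% and 91%: quadrupling the sample halves the standard error, which pulls the two distributions apart in standard-error terms.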

Variance

• How large an effect you can detect with a given sample depends on how variable the outcome is.
• Example: If all children have very similar learning levels without a program, a very small impact will be easy to detect
• We can try to “absorb” variance:
  • Using a baseline
  • Controlling for other variables
• In practice, controlling for other variables (besides the baseline outcome) buys you very little
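A rough sketch of how a baseline “absorbs” variance: if the baseline measure has correlation r with the outcome, a linear adjustment leaves a residual standard deviation of SD·√(1 − r²). The numbers below (effect of 3 points, SD of 15, 100 children per arm, r = 0.6) are hypothetical, as are the function names and the one-sided 5% test.

```python
from statistics import NormalDist

def residual_sd(sd, baseline_corr):
    """SD of the outcome left unexplained after a linear adjustment for a
    baseline measure whose correlation with the outcome is `baseline_corr`."""
    return sd * (1 - baseline_corr ** 2) ** 0.5

def power_one_sided(effect, sd, n_per_arm, alpha=0.05):
    """Power of a one-sided z-test comparing two equal-sized arms."""
    se = sd * (2 / n_per_arm) ** 0.5               # SE of the difference in means
    return NormalDist().cdf(effect / se - NormalDist().inv_cdf(1 - alpha))

# Hypothetical design values, for illustration only.
effect, sd, n_per_arm, r = 3, 15, 100, 0.6
print(f"without baseline: {power_one_sided(effect, sd, n_per_arm):.0%}")
print(f"with baseline:    {power_one_sided(effect, residual_sd(sd, r), n_per_arm):.0%}")
```

The less unexplained variance remains, the smaller the standard error and the higher the power for the same sample.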

Variance

Low Standard Deviation

Figure: frequency histograms of two groups with means 50 and 60 (x-axis: Number, values 33 to 89; y-axis: Frequency).

Less Precision

Medium Standard Deviation

Figure: frequency histograms of two groups with means 50 and 60 (x-axis: Number, values 33 to 89; y-axis: Frequency).

Even less precise

High Standard Deviation

Figure: frequency histograms of two groups with means 50 and 60 (x-axis: Number, values 33 to 89; y-axis: Frequency).

Sample split: 50% C, 50% T

Equal split gives distributions that are the same “fatness”

Figure: control (H0) and treatment (Hβ) sampling distributions with the significance cutoff marked.

Power: 91%

Figure: control and treatment sampling distributions with the power region shaded.

If it’s not 50-50 split?

• What happens to the relative fatness if the split is not 50-50?
• Say 25-75?

Sample split: 25% C, 75% T

Uneven distributions, not efficient, i.e. less power

Figure: control (H0) and treatment (Hβ) sampling distributions with the significance cutoff marked.

Power: 83%

Figure: control and treatment sampling distributions with the power region shaded.
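The drop from 91% to 83% can be reproduced from the variance of a difference in means, which for a fixed total sample is proportional to 1/p + 1/(1 − p), where p is the share assigned to treatment. The sketch assumes a one-sided 5% test and an effect equal to 3 standard errors under an equal split (the 91% case); the function name is hypothetical.

```python
from statistics import NormalDist

def power_for_split(share_treated, z_equal_split=3.0, alpha=0.05):
    """Power for a fixed total sample split unevenly between arms.
    `z_equal_split` is effect / SE under a 50-50 split."""
    # SE relative to the 50-50 case: sqrt of [1/(p(1-p))] / [1/(0.5 * 0.5)].
    relative_se = (0.25 / (share_treated * (1 - share_treated))) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(z_equal_split / relative_se - z_crit)

for share in (0.50, 0.75):
    print(f"{share:.0%} in treatment: power = {power_for_split(share):.0%}")
```

Under these assumptions the equal split gives 91% and the 25-75 split gives 83%, because the smaller arm is estimated less precisely.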

Clustered design: definition

• In sampling:
  • When clusters of individuals (e.g. schools, communities, etc.) are randomly selected from the population, before selecting individuals for observation
• In randomized evaluation:
  • When clusters of individuals are randomly assigned to different treatment groups

Reason for adopting cluster randomization

• Need to minimize or remove contamination
  • Example: In the deworming program, the school was chosen as the unit because worms are contagious
• Basic feasibility considerations
  • Example: The PROGRESA program would not have been politically feasible if it had been introduced for some families and not others
• Only natural choice
  • Example: Any education intervention that affects an entire classroom (e.g. flipcharts, teacher training)

Clustered design: intuition

• You want to know how close the upcoming national elections will be

• Method 1: Randomly select 50 people from entire Indian population

• Method 2: Randomly select 5 families, and ask ten members of each family their opinion

Low intra-cluster correlation (ICC), aka ρ (rho)

High intra-cluster correlation (ρ)

All uneducated people live in one village. People with only primary education live in another. College grads live in a third, etc. ICC (ρ) on education will be…

A. High
B. Low
C. No effect on rho
D. Don’t know

Clustered Design: Intuition

• The outcomes within a family are likely correlated. Similarly with children within a school, families within a village etc.

• Each additional individual does not bring entirely new information

• At the limit, imagine all outcomes within a cluster are exactly the same: effective sample size is number of clusters, not number of individuals

• Precision will depend on the number of clusters, sample size within clusters and the within cluster correlation
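The intuition above is often summarized with the design effect, 1 + (m − 1)ρ, for clusters of equal size m. That formula is not given on the slides; it is the standard approximation, used here as an assumption, and the function names are chosen for illustration. The sketch compares the two polling methods from the earlier slide.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from sampling clusters of equal size (assumed formula)."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)

for icc in (0.0, 0.5, 1.0):
    method_1 = effective_sample_size(50, 1, icc)   # 50 people drawn individually
    method_2 = effective_sample_size(5, 10, icc)   # 5 families, 10 members each
    print(f"rho = {icc}: method 1 = {method_1:.1f}, method 2 = {method_2:.1f}")
```

When ρ = 1 the family sample is worth only 5 observations, the number of clusters; when ρ = 0 both methods are worth 50.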

If ICC (ρ) is high, what is a more efficient way of increasing power?

A. Include more clusters in the sample
B. Include more people in clusters
C. Both
D. Don’t know

Standardized Effect Sizes

• The standardized effect size is the effect size divided by the standard deviation of the outcome
• δ = effect size / standard deviation

Standardized Effect Sizes

An effect size of… | Is considered… | …and it means that… | Required N under 50% treatment
0.2 | Modest | The average member of the treatment group had a better outcome than the 58th percentile of the control group | 786
0.5 | Large | The average member of the treatment group had a better outcome than the 69th percentile of the control group | 126
0.8 | VERY Large | The average member of the treatment group had a better outcome than the 79th percentile of the control group | 50
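The percentiles and required sample sizes in this table can be reproduced with a normal approximation. The slides do not state the power or test used; the values are consistent with 80% power and a two-sided test at the 5% level, which is what this sketch assumes. Function names are chosen for illustration.

```python
from math import ceil
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def control_percentile(delta):
    """Share of the control group below the treatment-group mean,
    assuming normally distributed outcomes with equal spread."""
    return Z.cdf(delta)

def required_total_n(delta, alpha=0.05, power=0.80):
    """Total sample for a two-sided test with a 50-50 split (normal approximation)."""
    z_sum = Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(power)
    n_per_arm = ceil(2 * z_sum ** 2 / delta ** 2)
    return 2 * n_per_arm

for delta in (0.2, 0.5, 0.8):
    print(delta, f"{control_percentile(delta):.0%}", required_total_n(delta))
```

Under these assumptions the loop returns 58%, 69% and 79% for the percentiles and 786, 126 and 50 for the required N, in line with the table.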

Conclusion

• Even with a perfectly valid experiment, the ability to make inferences depends on the SIZE OF THE SAMPLE.
• In designing an evaluation, you need to balance tradeoffs to ensure that your sample is large enough, given:
  • Desired power and significance levels
  • Anticipated effect size
  • The amount of “noise” (underlying variance in the outcome variable)
  • Treatment-Control size ratio (feasibility and cost)
  • Take up of treatment
  • Clustering

The Important Stuff

How confident are we of our results?

• We have a sample, not the population.
• The Central Limit Theorem and the Law of Large Numbers tell us important things about the sampling distribution that allow for HYPOTHESIS TESTING.
• Hypothesis testing enables us to establish whether our results are statistically significant.
• There are two kinds of errors we can make in hypothesis testing:
  > Type 1: The intervention is not effective and we find it to be effective. We FIX this at 5%.
  > Type 2: The intervention is effective and we find no impact. The smaller the probability of this occurring, the higher our power.
• Power depends on five things:
  > Sample size
  > The size of the effect
  > The proportion of your sample in the control group and the proportion in the treatment group
  > The variance
  > Clustering