Impact Evaluation Session VII Sampling and Power Jishnu Das November 2006.


2WBISARHDN

Sample Selection in Evaluation

Population-based representative surveys:
  Sample is representative of the whole population
  Good for learning about the population
  Not always the most efficient for impact evaluation

Sampling for impact evaluation:
  Balance between treatment and control groups
  Power: statistical inference for groups of interest
  Concentrate the sample strategically

Survey budget as a major consideration:
  In practice, sample size is often set by the budget
  Concentrate the sample on key populations to increase power


Purposive Sampling:

Risk: We will systematically bias our sample, so results don’t generalize to the rest of the population or other sub-groups

Trade-off between power within the population of interest and representativeness of the wider population

Results are internally valid, but not generalizable.


Survey - Sampling

Population: all cases of interest
Sampling frame: list of all potential cases
Sample: cases selected for analysis
Sampling method: technique for selecting cases from the sampling frame
Sampling fraction: proportion of cases from the population selected for the sample (n/N)


Sampling Frame

Simple Sampling
Stratified Sampling
Cluster Sampling


Sampling Methods

Random Sampling
Systematic Sampling
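As a rough illustration of the two methods named on this slide, the sketch below draws a simple random sample and a systematic sample from a hypothetical sampling frame. The frame of 100 village IDs, the function names, and the sample size of 10 are assumptions made for the example, not part of the original slides.

```python
import random

def simple_random_sample(frame, n, seed=0):
    """Draw n distinct cases; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def systematic_sample(frame, n, seed=0):
    """Pick a random start, then take every k-th case (k = N // n)."""
    rng = random.Random(seed)
    k = len(frame) // n        # sampling interval
    start = rng.randrange(k)   # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

# Hypothetical sampling frame: 100 village IDs.
frame = list(range(1, 101))
srs = simple_random_sample(frame, 10)
sys_sample = systematic_sample(frame, 10)
sampling_fraction = len(srs) / len(frame)   # n / N
print(srs, sys_sample, sampling_fraction)
```

Systematic sampling needs only one random draw, but it can behave badly if the frame is ordered in a way that lines up with the sampling interval.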


The Design Effect in Clustering

Necessary to take into account when samples are clustered

\[ \mathit{deff} = \frac{\mathrm{Var}_{cluster}}{\mathrm{Var}_{srs}} \]


Intra-cluster correlation (ρ)

The design effect (deff) depends on the size of the cluster and the intra-cluster correlation
ρ is the degree of homogeneity within the cluster, and is called the "intra-cluster" correlation

\[ \mathit{deff} = 1 + \rho\,(k - 1) \]
where ρ = intra-cluster correlation, k = cluster size

\[ \rho = \frac{\mathit{deff} - 1}{k - 1} \]
where deff = design effect, k = cluster size
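A minimal sketch of the standard relationship between the design effect, the cluster size k, and the intra-cluster correlation (written rho here), deff = 1 + rho (k - 1). The function names and the example values are hypothetical, chosen only to show the two limiting cases.

```python
def design_effect(k, rho):
    """deff = 1 + rho * (k - 1) for clusters of size k."""
    return 1 + rho * (k - 1)

def intracluster_correlation(deff, k):
    """Invert the formula: rho = (deff - 1) / (k - 1)."""
    return (deff - 1) / (k - 1)

# Illustrative values only (not from the slides).
print(design_effect(k=20, rho=0.0))    # no homogeneity: clustering costs nothing
print(design_effect(k=20, rho=1.0))    # identical units: each cluster counts once
print(design_effect(k=20, rho=0.05))   # a modest correlation already matters
```

When rho is 0 the design effect is 1, and when rho is 1 it equals the cluster size k, so adding more units to a cluster adds no information.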


Sample size

The necessary sample size increases in clustered samples
But you need some idea of the intra-cluster correlation coefficient to get at this number!

\[ n = \mathit{deff} \times n_{srs} \]
where n_srs = sample size required with simple random sampling
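For a sense of scale, here is a worked example with made-up numbers. The cluster size k = 20, the intra-cluster correlation of 0.05, and n_srs = 400 are illustrative assumptions, not values from the slides.

```latex
\begin{align*}
\mathit{deff} &= 1 + \rho\,(k - 1) = 1 + 0.05 \times 19 = 1.95 \\
n &= \mathit{deff} \times n_{srs} = 1.95 \times 400 = 780
\end{align*}
```

Under these assumptions, even a small intra-cluster correlation nearly doubles the required sample.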


Power Calculations

Test the significance of a null hypothesis, for example whether two means are different.


Type I and Type II errors

[Figure: plot over x (axis from -4 to 6), with regions labeled Type I error = α (the significance level), Type II error = β, and Power = 1 − β]


Type I and Type II errors

Type I error: reject the null hypothesis when it is true
  Significance level: probability of rejecting the null when it is true (Type I error)
Type II error: accept (fail to reject) the null hypothesis when it is false
  Power: probability of rejecting the null when an alternative hypothesis is true (1 − probability of a Type II error)
We want to minimize both types of errors
  Increase sample size


Type I and Type II errors

Type I error = α
  Probability that you conclude the intervention had an effect when it actually did not
Type II error = β
  Probability that you conclude the intervention had no effect when it actually did
Power = 1 − β
  Probability of correctly concluding that the intervention had an effect
Fix the Type I error and use sample size to increase the power


Power Calculations for sample size

Fix the confidence level and, as you increase the size of the sample:
  Rejection region gets larger
  The power increases

[Figure: two plots over x (axes from -4 to 6), showing the change as n increases (n↑)]
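The claim that power rises with sample size at a fixed significance level can be checked numerically. The sketch below uses the usual normal approximation for a two-sided test of a difference in two means. The effect size of 0.2 standard deviations and the list of sample sizes are illustrative assumptions, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_two_means(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two group means."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)       # fixed by the significance level
    se = sigma * sqrt(2 / n_per_group)      # standard error of the difference
    shift = abs(delta) / se                 # true effect in standard-error units
    # Probability of landing in either rejection region under the alternative.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

# Illustrative: true effect of 0.2 standard deviations.
for n in (25, 50, 100, 200, 400):
    print(n, round(power_two_means(delta=0.2, sigma=1.0, n_per_group=n), 3))
```

The critical value stays fixed while the standard error shrinks with n, so the same true effect sits further into the rejection region.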


What we have so far

Clustering increases the required sample size

As does the need for statistical testing

If we know:
  The expected size of the treatment effect
  The variance of the distribution
we can start making power calculations for evaluations
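Given an expected effect size and an outcome variance, a first-pass sample size follows from the textbook normal-approximation formula for comparing two means, n per group = 2 (z_{1-α/2} + z_{1-β})² σ² / δ², inflated by the design effect when the sample is clustered. This is a sketch under those assumptions. The function name, the default 5% significance level and 80% power, and the example inputs are all hypothetical choices.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80, deff=1.0):
    """Sample size per arm for a two-sided test of a difference in means.

    delta: expected treatment effect; sigma: outcome standard deviation;
    deff: design effect (1.0 means simple random sampling).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    n_srs = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return ceil(deff * n_srs)

# Illustrative inputs: effect of 0.2 SD, with and without clustering.
print(n_per_group(delta=0.2, sigma=1.0))
print(n_per_group(delta=0.2, sigma=1.0, deff=1.95))
```

Halving the expected effect quadruples the required sample, which is why the assumed effect size drives the result so strongly.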


In Practice

Many, many analytical statistical results
May be simpler to use simulations in Stata or a similar package
  Easily accounts for complicated designs
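The slides suggest Stata. The sketch below shows the same idea in Python, using only the standard library. It simulates a village-randomized experiment many times and reports the share of runs in which the effect is significant. Everything specific here is an assumption for illustration: the data-generating process (a village-level shock plus child-level noise), the variance values, the analysis on village means, and the 1.96 normal critical value, which is only approximate when the number of villages is small.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_power(n_villages, children_per_village, effect,
                   sd_village=0.5, sd_child=1.0, n_sims=500, seed=1):
    """Share of simulated experiments in which the effect is significant."""
    rng = random.Random(seed)
    half = n_villages // 2               # villages per arm
    rejections = 0
    for _ in range(n_sims):
        village_means = []
        for v in range(2 * half):
            treated = v < half           # first half of villages is treated
            shock = rng.gauss(0, sd_village)   # shared by all children in a village
            scores = [effect * treated + shock + rng.gauss(0, sd_child)
                      for _ in range(children_per_village)]
            village_means.append(mean(scores))
        t_means, c_means = village_means[:half], village_means[half:]
        # Compare arms using village means, the unit of randomization.
        se = sqrt(stdev(t_means) ** 2 / half + stdev(c_means) ** 2 / half)
        z = (mean(t_means) - mean(c_means)) / se
        rejections += abs(z) > 1.96
    return rejections / n_sims

# Illustrative grid: villages by children per village, for one effect size.
for villages in (20, 40, 60):
    row = [simulate_power(villages, kids, effect=0.5, n_sims=200)
           for kids in (5, 20)]
    print(villages, row)
```

Changing the design means editing the data-generating lines rather than deriving a new formula.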


In Practice: An Example

Does information improve child performance in schools? (Pakistan)

Randomized design
Interested in villages where there are private schooling options

Which villages should we work in?
  Stratification: North, Central, South
  Random sample: villages chosen randomly from the list of all villages with a private school


In Practice: An Example

How many villages should we choose? Depends on:
  How many children there are in every village
  How big we think the treatment effect will be
  What the overall variability in the outcome variable will be


In Practice: An Example
Simulation Tables

Table 1 assumes very high variability in test-scores.

X,Y: X is for the intervention with small effect size; Y is for the larger effect size
N: Significant < 1% of simulations
S: Significant < 10% of simulations
A: Significant > 99% of simulations

Rows: number of children in every school. Columns: number of villages (assuming 3 schools per village).

Children    21     27     36     42     60
5           N,n    N,n    N,s    N,s    N,a
10          N,s    N,m    N,a    N,a    S,a
15          N,s    N,m    N,a    N,a    M,a
20          N,m    S,m    M,a    M,a    M,a


In Practice: An Example
Simulation Tables

Table 2 assumes lower variability in test-scores.

X,Y: X is for the intervention with small effect size; Y is for the larger effect size
N: Significant < 1% of simulations
S: Significant < 10% of simulations
A: Significant > 99% of simulations

Rows: number of children in every school. Columns: number of villages (assuming 3 schools per village).

Children    21     27     36     42     60
5           N,s    N,s    s,m    s,m    S,a
10          N,m    S,m    m,a    M,a    M,a
15          N,m    s,m    M,a    M,a    M,a
20          S,a    S,a    M,a    M,a    M,a


A smorgasbord of topics

Probability-proportional-to-size sampling to pick clusters
Using weights
  Estimating means vs. estimating regressions
Increasing efficiency using matched randomizations
Using evaluations to say something about baseline populations
  Age-targeted programs


When do we really worry about this?

IF very small samples at the unit of treatment!
  Suppose treatment in 20 schools and control in 20 schools
  But there are 400 children in every school
  This is still a small sample
IF interested in sub-groups (blocks)
  Sample size requirements increase exponentially
IF using Regression Discontinuity Designs
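To see why 20 schools per arm with 400 children each is still a small sample, one can divide the number of children by the design effect, deff = 1 + rho (k - 1), to get an approximate effective sample size. The intra-cluster correlation values below are illustrative assumptions, and the function name is hypothetical.

```python
def effective_sample_size(n_clusters, k, rho):
    """Approximate number of independent observations in a clustered sample."""
    deff = 1 + rho * (k - 1)
    return n_clusters * k / deff

# One arm of the example: 20 schools with 400 children each (8,000 children).
for rho in (0.0, 0.05, 0.2):
    print(rho, round(effective_sample_size(20, 400, rho)))
```

With an intra-cluster correlation of 0.2, the 8,000 children in one arm carry roughly as much information as 99 independent observations. However many children are added per school, the effective sample size cannot exceed the number of schools divided by rho.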