
Wim Van den Noortgate, Katholieke Universiteit Leuven, Belgium

Belgian Campbell Group, Wim.VandenNoortgate@kuleuven-kortrijk.be

Workshop Systematic Reviews, Leuven, June 4-6, 2012

Introduction to meta-analysis

Content

1. Introducing meta-analysis for group designed studies
2. Effect sizes
3. Meta-analysis of studies with other designs

1. Introducing meta-analysis

The role of chance

Example: association between gender and math

μM = 8; μF = 8.5; σM = σF = 1.5

δ = (μF - μM) / σ = 0.33

Standardized mean difference (Cohen, 1969):

δ = (μA - μB) / σ

Estimated by its sample counterpart:

g = (x̄A - x̄B) / s_p

δ: the ‘true’ effect size
g: the ‘observed’ effect size

0.20 = small effect

0.50 = moderate effect

0.80 = large effect

Example: μM = 8; μF = 8.5; σM = σF = 1.5 => δ = 0.33; nM = nF = 20

One simulated sample:

x̄M      x̄F      sM      sF      p (2-sided)   g
8.10     9.34     1.55    1.55    0.015 (*)     0.80

Example: μM = 8; μF = 8.5; σM = σF = 1.5 => δ = 0.33; nM = nF = 20

Two simulated samples:

x̄M      x̄F      sM      sF      p (2-sided)   g
8.10     9.34     1.55    1.55    0.015 (*)     0.80
7.60     7.59     1.23    1.47    0.98          -0.0069

Example: μM = 8; μF = 8.5; σM = σF = 1.5 => δ = 0.33; nM = nF = 20

Ten simulated samples:

x̄M      x̄F      sM      sF      p (2-sided)   g
8.10     9.34     1.55    1.55    0.015 (*)     0.80
7.60     7.59     1.23    1.47    0.98          -0.0069
7.96     8.81     1.38    1.59    0.078         0.57
7.70     8.25     1.49    1.65    0.28          0.35
8.17     8.25     1.76    1.33    0.87          0.053
7.86     8.81     1.24    1.58    0.040 (*)     0.67
8.19     7.93     1.79    1.78    0.65          -0.14
8.11     8.15     1.76    1.97    0.95          0.020
7.86     7.94     1.89    1.64    0.89          0.042
8.34     8.53     1.39    1.79    0.71          0.12

g ~ N(δ, σ²(g))   with   σ²(g) ≈ 4/N

For nM = nF = 20 (N = 40): g ~ N(δ, 0.1)

95% confidence interval: [g - 1.96·√0.1 ; g + 1.96·√0.1]
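As a small check (not part of the original slides), a short Python sketch computes g and its approximate 95% confidence interval for the first simulated sample, using the σ²(g) ≈ 4/N approximation above:

```python
import math

# Values from the first simulated sample in the table above
x_m, x_f, s_pooled = 8.10, 9.34, 1.55
n_m = n_f = 20

g = (x_f - x_m) / s_pooled                 # 0.80
se_g = math.sqrt(4 / (n_m + n_f))          # sqrt(4/40) = sqrt(0.1), about 0.32

ci = (g - 1.96 * se_g, g + 1.96 * se_g)    # roughly (0.18, 1.42)
print(g, ci)
```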

[Figure: observed effect sizes g with 95% confidence intervals for the 10 simulated data sets (x-axis: data set 1-10; y-axis: g, from -1 to 1.5)]

Suppose the 10 simulated data sets are the data of 10 studies that are replications of each other:

Vote-counting procedure?

Combining study results in a meta-analysis

ḡ = (1/k) Σj gj,   with   ḡ ~ N(δ, σ²(g)/k)

For the 10 simulated data sets (replications of each other): ḡ = 0.25 and σ²(ḡ) = 0.1/10 = 0.01
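A minimal sketch (again not from the slides) of this combination step, averaging the ten observed effect sizes from the simulation; simple averaging is appropriate here because all studies have the same sample size:

```python
import math

# Observed effect sizes g of the 10 simulated data sets (from the table above)
g = [0.80, -0.0069, 0.57, 0.35, 0.053, 0.67, -0.14, 0.020, 0.042, 0.12]
k = len(g)

g_bar = sum(g) / k                          # about 0.25
se_bar = math.sqrt(0.1 / k)                 # sigma^2(g)/k = 0.1/10 = 0.01, so SE = 0.1

ci = (g_bar - 1.96 * se_bar, g_bar + 1.96 * se_bar)   # roughly (0.05, 0.44): 0 is excluded
print(g_bar, ci)
```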

[Figure: effect size estimates and 95% confidence intervals for the combined analysis of the 10 simulated data sets (x-axis: data set; y-axis: g)]

Comparing individual study results and combined study results

Individual studies:
1. Observed effect sizes may be positive, negative, small, moderate, or large.
2. Confidence intervals are relatively large.
3. 0 is often included in the confidence intervals.

Combined result:
4. The combined effect size is close to the population effect size (averaging out the noise).
5. The confidence interval is relatively small (higher accuracy).
6. 0 is not included in the confidence interval (higher power).

Meta-analysis: Gene Glass (Educational Researcher, 1976, p. 3):

“Meta-analysis refers to the analysis of analyses”

A meta-analysis with dissimilar study sample sizes

g ~ N(δ, σ²(g))   with   σ²(g) ≈ 4/N

[Figure: sampling distribution of g around δ for nM = nF = 100 (narrow) and for nM = nF = 20 (wide)]

The combined effect size estimate is a weighted average of the k observed effect sizes:

δ̂ = Σj wj gj / Σj wj,   with   wj = Nj   or   wj = 1 / σ²(gj)

An example in education
(Raudenbush, S. W. (1984). Magnitude of teacher expectancy effects on pupil IQ as a function of the credibility of expectancy induction: A synthesis of findings from 18 experiments. Journal of Educational Psychology, 76, 85-97.)

Study                              Weeks prior contact   gj       σ(gj)
 1. Rosenthal et al. (1974)        2                      0.03    0.13
 2. Conn et al. (1968)             3                      0.12    0.15
 3. Jose & Cody (1971)             3                     -0.14    0.17
 4. Pellegrini & Hicks (1972)      0                      1.18    0.37
 5. Pellegrini & Hicks (1972)      0                      0.26    0.37
 6. Evans & Rosenthal (1969)       3                     -0.06    0.10
 7. Fielder et al. (1971)          3                     -0.02    0.10
 8. Claiborn (1969)                3                     -0.32    0.22
 9. Kester & Letchworth (1972)     0                      0.27    0.16
10. Maxwell (1970)                 1                      0.80    0.25
11. Carter (1970)                  0                      0.54    0.30
12. Flowers (1966)                 0                      0.18    0.22
13. Keshock (1970)                 1                     -0.02    0.29
14. Henrickson (1970)              2                      0.23    0.29
15. Fine (1972)                    3                     -0.18    0.16
16. Greiger (1970)                 3                     -0.06    0.17
17. Rosenthal & Jacobson (1968)    1                      0.30    0.14
18. Fleming & Anttonen (1971)      2                      0.07    0.09
19. Ginsburg (1970)                3                     -0.07    0.17

Mean effect: 0.060, p = .10
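The weighted-average formula from the previous slide can be applied directly to this table. The sketch below (not the original analysis script) uses inverse-variance weights and reproduces the reported mean effect of about 0.060 with p ≈ .10:

```python
import math

# Effect sizes g_j and standard errors sigma(g_j) from the Raudenbush (1984) table
g  = [0.03, 0.12, -0.14, 1.18, 0.26, -0.06, -0.02, -0.32, 0.27, 0.80,
      0.54, 0.18, -0.02, 0.23, -0.18, -0.06, 0.30, 0.07, -0.07]
se = [0.13, 0.15, 0.17, 0.37, 0.37, 0.10, 0.10, 0.22, 0.16, 0.25,
      0.30, 0.22, 0.29, 0.29, 0.16, 0.17, 0.14, 0.09, 0.17]

# Inverse-variance weights w_j = 1 / sigma^2(g_j)
w = [1 / s**2 for s in se]

mean_g = sum(wj * gj for wj, gj in zip(w, g)) / sum(w)   # weighted mean effect
se_mean = math.sqrt(1 / sum(w))                          # its standard error

# Two-sided p-value for H0: delta = 0 (normal approximation)
z = mean_g / se_mean
p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(round(mean_g, 3), round(se_mean, 3), round(p, 2))  # about 0.060, 0.036, 0.10
```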

An example in medical research
(Keren, R., & Chan, E. (2002). A meta-analysis of randomized, controlled trials comparing short- and long-course antibiotic therapy for urinary tract infections in children. Pediatrics, 109, e70.)

Study                  Year   Sample size   RR (95% CI)

Bailey and Abbott 1978 10 1.33 (0.17–10.25)

Khan et al 1981 16 0.20 (0.01–3.61)

Stahl et al 1984 26 1.20 (0.34–4.28)

Fine and Jacobson 1985 31 2.34 (0.53–10.30)

Gaudreault et al 1992 40 1.00 (0.02–48.09)

Pitt et al 1982 42 2.50 (0.11–58.06)

Helin 1984 43 2.53 (0.25–25.81)

Grimwood et al 1988 45 2.80 (0.65–12.02)

Avner et al 1983 49 4.69 (1.13–19.51)

Lohr et al 1981 50 1.28 (0.23–7.00)

Nolan et al 1989 90 10.45 (1.40–78.31)

Copenhagen 1991 264 1.50 (0.68–3.32)

Note: RR = relative risk = risk ratio
= (risk of treatment failure with short-course antibiotic treatment) / (risk of treatment failure with long-course antibiotic treatment)

Combined RR = 1.94 (95% CI: 1.19–3.15)
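As a rough illustration (not necessarily the pooling method used in the paper, which had access to the raw counts), the reported risk ratios can be pooled on the log scale with inverse-variance weights, backing the standard errors out of the reported 95% CIs; this lands close to the published combined RR of 1.94:

```python
import math

# (RR, lower 95% limit, upper 95% limit) as reported in the table above
studies = [
    (1.33, 0.17, 10.25), (0.20, 0.01, 3.61), (1.20, 0.34, 4.28),
    (2.34, 0.53, 10.30), (1.00, 0.02, 48.09), (2.50, 0.11, 58.06),
    (2.53, 0.25, 25.81), (2.80, 0.65, 12.02), (4.69, 1.13, 19.51),
    (1.28, 0.23, 7.00), (10.45, 1.40, 78.31), (1.50, 0.68, 3.32),
]

log_rr = [math.log(rr) for rr, lo, hi in studies]
# SE of log RR recovered from the CI width: (log(hi) - log(lo)) / (2 * 1.96)
se = [(math.log(hi) - math.log(lo)) / (2 * 1.96) for rr, lo, hi in studies]
w = [1 / s**2 for s in se]

pooled = sum(wi * y for wi, y in zip(w, log_rr)) / sum(w)
pooled_se = math.sqrt(1 / sum(w))

print(math.exp(pooled),                          # close to the published 1.94
      math.exp(pooled - 1.96 * pooled_se),
      math.exp(pooled + 1.96 * pooled_se))
```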

2. Effect sizes

Example: testing the difference in the size of tumors in an experimental and a control group

What would you conclude if p = .11? p < .0001?

p-values or effect sizes?

The interpretation of p-values

Misconceptions:
◦ failure to reject the null hypothesis implies no effect
◦ a statistically significant p-value implies a large effect

Test of Significance = Size of Effect × Size of Study

Rosenthal, R. (1991). Meta-analytic procedures for social research. Newbury Park, CA: Sage

Before being combined in a meta-analysis, findings from primary studies are summarized to a measure of effect size

There are several possible effect size indices, e.g.:
◦ two continuous variables: the correlation coefficient
◦ one continuous, one dichotomous: the standardized mean difference
◦ two dichotomous: the odds ratio, relative risk, …

To allow comparison over studies, a common measure is used, often a standardized one

In a meta-analysis, effect size measures are compared and combined

The use of effect sizes in meta-analysis

Example: two dichotomous variables

                      Final exam
Predictive test       1             0            Total
1                     130 (87%)     20 (13%)     150 (100%)
0                     30 (60%)      20 (40%)     50 (100%)
Total                 160           40           200

1. Risk difference: .87 - .60 = .27
2. Relative risk: .87 / .60 = 1.45
3. Phi: (130 × 20 - 20 × 30) / √(150 × 50 × 160 × 40) = 0.29
4. Odds ratio: (130 × 20) / (20 × 30) = 4.33
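The same four indices in a few lines of Python (a small sketch based on the counts in the table above):

```python
import math

# Cell counts from the 2x2 table: rows = predictive test (1/0), columns = final exam (1/0)
a, b = 130, 20   # predictive test = 1
c, d = 30, 20    # predictive test = 0

risk1 = a / (a + b)   # 0.87
risk0 = c / (c + d)   # 0.60

risk_difference = risk1 - risk0        # 0.27
relative_risk = risk1 / risk0          # 1.45
phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))   # 0.29
odds_ratio = (a * d) / (b * c)         # 4.33

print(risk_difference, relative_risk, phi, odds_ratio)
```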

Calculating & converting effect sizes
Example: the standardized mean difference (g)
(based on slides retrieved on March 6, 2008, from http://mason.gmu.edu/~dwilsonb/ma.html)

◦ direct calculation based on means and standard deviations
◦ algebraically equivalent formulas (t-test)
◦ exact probability value for a t-test
◦ approximations based on continuous data (correlation coefficient)
◦ results of a one-way ANOVA with 3 or more groups
◦ results of ANCOVA
◦ results of multiple regression analysis
◦ approximations based on dichotomous data

[Slide graphic: rating of the calculation methods from ‘Great’ to ‘Good’ to ‘Poor’]

Methods of Calculating the Standardized Mean Difference

Direct calculation method:

g = (x̄1 - x̄2) / s_p,   with   s_p = √[ ((n1 - 1)s1² + (n2 - 1)s2²) / (n1 + n2 - 2) ]

Methods of Calculating the Standardized Mean Difference

Calculation based on test statistics:

t = (x̄1 - x̄2) / (s_p √(1/n1 + 1/n2))   =>   g = t √((n1 + n2) / (n1 n2))

Exact p-values from a t-test or F-ratio can be converted into a t-value, and the above formula applied.

Methods of Calculating the Standardized Mean Difference

Calculation based on other effect size measures, e.g., from a correlation coefficient r:

g = 2r / √(1 - r²)
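A small Python sketch (not from the original slides) of the three conversion routes above; the input values are illustrative, chosen so that all three give g ≈ 0.80:

```python
import math

def g_direct(x1, x2, s1, s2, n1, n2):
    """g from group means and standard deviations, using the pooled SD."""
    s_p = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (x1 - x2) / s_p

def g_from_t(t, n1, n2):
    """g from an independent-samples t statistic."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))

def g_from_r(r):
    """g approximated from a correlation coefficient r."""
    return 2 * r / math.sqrt(1 - r**2)

print(g_direct(9.34, 8.10, 1.55, 1.55, 20, 20))  # 0.80 (first simulated sample)
print(g_from_t(2.53, 20, 20))                    # about 0.80 (illustrative t value)
print(g_from_r(0.37))                            # about 0.80 (illustrative r value)
```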

Other conversion formulae:

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.

Rosenthal, R. (1994). Parametric measures of effect size. In H. Cooper, & L. V. Hedges (Eds.), The handbook of research synthesis (pp. 231-244). New York: Russell Sage Foundation.

3. Combining effect sizes of other designs


Example: Stimulating response behavior in the classroom: a single-case study

(Narayan, J. S., Heward, W. L., Gardner, R., Courson, F. H., & Omness, C. K. (1990). JABA, 23, 483-490.)


Yi = β0 + β1 (Treatment)i + ei,   with   ei ~ N(0, σ²e)

Measuring and testing the effect

[Figure: illustration of the model, with β0 the expected baseline (phase A) level and β1 the treatment effect]

Effect size:

dSC = β̂1 / s_e = (x̄B - x̄A) / s_e

Cf. the effect size for a group-comparison design:

dGC = (x̄exp - x̄contr) / s_p
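A minimal sketch of this computation on made-up single-case data (the scores below are hypothetical, not the Narayan et al. data): the outcome is regressed on a 0/1 treatment dummy and the coefficient is divided by the residual standard deviation:

```python
import numpy as np

# Hypothetical phase A (baseline) and phase B (treatment) scores
y_A = np.array([3, 4, 2, 5, 3, 4])
y_B = np.array([7, 8, 6, 9, 8, 7, 8])

y = np.concatenate([y_A, y_B])
treatment = np.concatenate([np.zeros(len(y_A)), np.ones(len(y_B))])

# OLS fit of Y_i = b0 + b1 * Treatment_i + e_i
X = np.column_stack([np.ones(len(y)), treatment])
coef, sse, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1 = coef                              # b1 equals mean(y_B) - mean(y_A)
s_e = np.sqrt(sse[0] / (len(y) - 2))       # residual standard deviation

d_SC = b1 / s_e                            # standardized treatment effect
print(b1, s_e, d_SC)
```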


Results of the meta-analysis: mean effect of treatment: 14.33; standard error of the mean effect: 0.74

‘Effect’ sizes?

o Expressing effects in (quasi-)experimental studies
   ◦ comparing experimental & control groups
   ◦ comparing one group under several conditions
o Expressing association in non-experimental studies
   ◦ comparing existing groups (e.g., male vs. female)
   ◦ expressing association between continuous variables (e.g., the relation between class size and performance)
o Describing one single variable (e.g., prevalence rates, means, …)

Check the internal validity of the design! (Are there confounding variables?)

Be careful when interpreting your results!

Association ≠ causation!