Post on 24-Jun-2020

False dichotomies and health intervention research designs: Randomized trials are not always the answer

Centre for Big Data Research in Health Seminar Series

University of New South Wales

Friday 3rd November 2017

Professor Stephen Soumerai

Harvard Medical School and

Harvard Pilgrim Health Care Institute

Department of Population Medicine

Source: Soumerai SB et al. J Gen Intern Med. 2017 Feb;32(2):204-209.

“Information in administrative data

sets is spurious by default.”

John Ioannidis

Source: Ioannidis, JP. JAMA. 2013;309(13):1410-1.

Background

This statement prolongs the polarizing “all or nothing” debate on “available data”

Administrative data are not always spurious

RCTs: the “gold standard”

Usually infeasible in “natural experiments”

Study endpoints can be manipulated

Patients are often not representative, so results may not generalize

• Patients may not be blind to treatment

RCTs are not useful for most policy

interventions

Most national health policies, e.g., high deductibles, copays, pay for performance, ACOs

Seat belt laws, Speed Limits

Banning medical technologies (e.g. drugs)

Opioid and antibiotic controls

Smoking regulations, etc.

What do we mean by “False Dichotomies”?

RCTs vs “everything else”

Ignores Quasi-Experiments

Campbell & Stanley (1963)

• Revised in 1979 and 2002

• Three main categories

–RCTs

–Strong quasi-experiments

–Weak “pre-experiments”

Hierarchy of Strong and Weak Designs:

Capacity to Control for Biases

Strong Design: Often Trustworthy Effects

Intermediate Design: Sometimes

Trustworthy Effects

Weak Designs: Rarely Trustworthy Effects

(No Controls for Common Biases.)

Hierarchy of Strong and Weak Designs: Capacity to Control for Biases

Strong Design: Often Trustworthy Effects

Multiple RCTs: The “gold standard” of evidence, incorporating systematic review of all studies.

Single RCT: A single, strong randomized experiment, but sometimes not generalizable.

Interrupted time series with control series (CITS): Baseline trends often allow visible effects and control for biases. Two controls.

Hierarchy of Strong and Weak Designs: Capacity to Control for Biases

Intermediate Design: Sometimes Trustworthy Effects

Single ITS: Controls for trends, but no comparison.

Before and after with comparison group: Pre-post change using two single observations. Comparability of baseline unclear.

Weak Designs: Rarely Trustworthy Effects (No Controls)

Uncontrolled pre-post: Single observations before and after intervention, no baseline or control group.

Cross-sectional designs: Simple correlation, no baseline, no measure of change.

Different Effects That Can Be Observed in Time Series

[Figure: four panels, each showing a time series before and after the intervention.]
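One common way to quantify effects like these is segmented regression, which estimates a change in level and a change in slope at the intervention point. The following is a minimal sketch on made-up, noise-free data, not the analysis used in the studies cited in this talk. The names `fit_line` and `segmented_fit` and both example series are hypothetical, and the sketch ignores autocorrelation and seasonality, which a real ITS analysis would need to address.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def segmented_fit(series, t0):
    """Fit separate lines before and after time index t0.

    Returns the baseline slope, the jump between the two lines at t0
    (level change), and the difference in slopes (slope change).
    """
    ts = list(range(len(series)))
    pre_slope, pre_int = fit_line(ts[:t0], series[:t0])
    post_slope, post_int = fit_line(ts[t0:], series[t0:])
    level_change = (post_int + post_slope * t0) - (pre_int + pre_slope * t0)
    return {
        "baseline_slope": pre_slope,
        "level_change": level_change,
        "slope_change": post_slope - pre_slope,
    }


T0 = 12  # hypothetical intervention month

# Hypothetical series 1: abrupt drop of 4 units, trend unchanged.
abrupt = [10 + 0.5 * t - (4.0 if t >= T0 else 0.0) for t in range(24)]

# Hypothetical series 2: no jump, but the trend bends downward by 0.8 per month.
gradual = [10 + 0.5 * t - (0.8 * (t - T0) if t >= T0 else 0.0) for t in range(24)]

fit_abrupt = segmented_fit(abrupt, T0)
fit_gradual = segmented_fit(gradual, T0)
print(fit_abrupt)
print(fit_gradual)
```

Fitting the two segments separately is equivalent to a single regression with level-change and slope-change terms. It is used here only to keep the sketch free of external libraries.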

Time series effects of drug benefit limits and cost sharing on the average number of prescriptions per patient per month among noninstitutionalized, chronically ill New Hampshire patients (n=860) and other patients (n=8002). Source: Soumerai S et al. N Engl J Med. 1987;317(15).

Single ITS: Sometimes Trustworthy

Another ITS: Sometimes Trustworthy

Rates of antidepressant use and psychotropic drug poisoning per quarter before and after the warnings among young adults (18-29) enrolled in 11 health plans in the nationwide Mental Health Research Network. Source: Lu CY et al. BMJ. 2014 Jun 18;348:g3596.

ITS ~200 years ago: Monthly puerperal fever mortality rates at the Vienna Maternity Institution, 1841-1849. Rates dropped after handwashing was introduced.

Source: Semmelweis I (1861). Die Aetiologie, der Begriff und die Prophylaxis des Kindbettfiebers [The etiology, concept, and prophylaxis of childbed fever]. Budapest and Vienna: Hartleben.

A strong interrupted time-series design debunked IHI’s claim of lives saved

Source: AHRQ. Statistics on hospital stays. Accessed May 26, 2015.

Without baseline data the press

hyped the findings

AP headline 2008: “Campaign against

hospital mistakes says 122,000 lives

saved”

“A campaign to reduce lethal errors and

unnecessary deaths… has saved an estimated

122,300 lives in the last 18 months….”

“We in health care have never seen or experienced anything like this,” said Dennis O’Leary, president of JCAHO.

Upper graph shows fatal and injurious crashes on Arizona interstate highways with the

increase to 65 MPH maximum speed limit. The lower graph indicates fatal and injurious

crashes on Arizona interstate highways with no change in the 55 MPH maximum speed

limit.

Source: Epperlein T. Arizona: Arizona Statistical Analysis Center; 1989.

ITS with Control Series
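The value of a control series can be illustrated with a small sketch. It assumes a hypothetical co-occurring event that shifts both the intervention series and the control series at the same time. A single ITS attributes the whole shift to the intervention, while analyzing the difference between the two series removes the shared shift. All data, names (`fit_line`, `level_change`), and effect sizes below are invented for illustration and do not come from the Arizona study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def level_change(series, t0):
    """Jump at t0 between lines fitted separately before and after t0."""
    ts = list(range(len(series)))
    pre_slope, pre_int = fit_line(ts[:t0], series[:t0])
    post_slope, post_int = fit_line(ts[t0:], series[t0:])
    return (post_int + post_slope * t0) - (pre_int + pre_slope * t0)


T0 = 10  # hypothetical intervention period

# Hypothetical data: an unrelated event lowers BOTH series by 2 at T0;
# the intervention lowers only the treated series by a further 3.
control = [20 + 0.2 * t - (2 if t >= T0 else 0) for t in range(20)]
treated = [25 + 0.2 * t - (5 if t >= T0 else 0) for t in range(20)]
difference = [a - b for a, b in zip(treated, control)]

single_its = level_change(treated, T0)         # mixes intervention and shared event
control_shift = level_change(control, T0)      # shared event only
controlled_its = level_change(difference, T0)  # shared event removed

print(round(single_its, 3), round(control_shift, 3), round(controlled_its, 3))
```

In this made-up example the single ITS overstates the effect because it cannot separate the intervention from the shared event. The controlled estimate is only as good as the assumption that the control series experiences the same outside influences.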

Objectives

Impacts of health system interventions

uncertain

Aim: Do the results of ITS differ from cluster

RCTs?

Results

ITS and RCTs were similar

• ITS with concurrent controls important

• Need to analyze baseline/follow-up trends

in cluster RCTs

Why RCTs should use controlled ITS

Ex: Dedicated chest pain unit, UK

Studied whether a chest pain unit (CPU) would reduce hospital admissions

14 hospitals randomized to establish a chest pain unit, or not

90,000 visits with chest pain over 2 years

Source: Goodacre S et al. BMJ. 2007;335:659

Conventional Cluster (Diff-in-Diff) RCT-perspective

[Figure: proportion admitted (0 to 80%), before and after, for control and CPU hospitals.]

Reanalysis of data from Goodacre S et al. BMJ. 2007;335:659
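The difference-in-differences logic behind this before/after view can be written out as simple arithmetic. The sketch below uses invented admission proportions, not the Goodacre et al. results, and the function name `diff_in_diff` is hypothetical. It also shows why a follow-up-only comparison can mislead when the groups differ at baseline.

```python
def diff_in_diff(pre_treated, post_treated, pre_control, post_control):
    """Change in the treated group minus change in the control group."""
    return (post_treated - pre_treated) - (post_control - pre_control)


# Hypothetical admission proportions (NOT the trial's actual figures).
pre_cpu, post_cpu = 0.60, 0.55          # hospitals that opened a chest pain unit
pre_control, post_control = 0.50, 0.48  # control hospitals

did = diff_in_diff(pre_cpu, post_cpu, pre_control, post_control)
post_only = post_cpu - post_control     # ignores the baseline imbalance

print(f"difference-in-differences: {did:+.2f}")
print(f"follow-up-only difference: {post_only:+.2f}")
```

With these made-up numbers the follow-up-only comparison suggests more admissions in the intervention group, while the difference-in-differences estimate points the other way. Both rest on only two observations per group, so neither shows whether the groups were already trending apart.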

Difference in admission rate between intervention and control group

[Figure: difference in admission rate (intervention minus control), 0 to 25%, by month (0 to 25).]

Reanalysis of data from Goodacre S et al. BMJ. 2007;335:659

[Figure: number of clinical problems added to medical record (0 to 60) by week (5 to 50), intervention group vs. control group. Wright et al.]

The average effect size only tells us half the story.

ITS of RCT to Increase Reporting of

Clinical Decision Problems in EHR

Reanalysis of data from Wright A et al. J Am Med Inform Assoc. 2012;19(4):555-561

Without ITS, the decay to

no effect is not observable.
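A small sketch can make the point about decay concrete. It compares a single average effect over the whole follow-up with the effect computed in consecutive time windows. The weekly counts and the name `windowed_effect` are hypothetical and are not taken from Wright et al.

```python
def windowed_effect(intervention, control, window):
    """Mean (intervention - control) difference in consecutive windows."""
    diffs = [i - c for i, c in zip(intervention, control)]
    return [
        sum(diffs[start:start + window]) / len(diffs[start:start + window])
        for start in range(0, len(diffs), window)
    ]


# Hypothetical weekly counts: a strong early effect that fades to nothing.
intervention = [40, 30, 22, 16, 12, 10, 10, 10, 10, 10, 10, 10]
control = [10] * 12

overall = windowed_effect(intervention, control, window=12)[0]
by_block = windowed_effect(intervention, control, window=4)

print(f"average effect over all weeks: {overall:.2f}")
print(f"effect by 4-week block: {by_block}")
```

In this invented example the overall average is clearly positive even though the effect in the final block is zero, which is the pattern a single pooled estimate cannot reveal.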

[Figure: falls per resident-year (1 to 6) by month (1 to 17), intervention group vs. control group. Kerse 2004.]

ITS of RCT Data: Fall Prevention in LTC

Large Baseline and Follow-up Difference Hidden in Conventional RCT. Not interpretable.

Reanalysis of data from Kerse, N et al. J Am Geriatr Soc 2004;52(4)524-31

Summary of ITS of RCT Studies

Interrupted time series analysis is valuable in

evaluation of health systems and policy

interventions

• When RCTs are not feasible

• In the analysis of data from cluster RCTs

Important information may be lost if cluster

RCTs do not consider changes over time

Conclusions

Research design is the first consideration in

addressing trustworthiness of research.

Medical and graduate schools should

emphasize weaknesses of uncontrolled or

cross-sectional designs and include stronger

research designs.

Well-controlled studies can save lives, while weak designs promote wasteful programs and jeopardize public health.