Synthetic Controls: Methods and Practice
Alberto Abadie, MIT
SI 2021 Methods Lecture - Causal Inference Using Synthetic Controls and the Regression Discontinuity Design
July 30, 2021
Introduction
I Synthetic control methods were originally proposed in Abadie and Gardeazabal (2003) and Abadie et al. (2010) with the aim of estimating the effects of aggregate interventions.
I Many events or interventions of interest naturally happen at an aggregate level, affecting a small number of large units (such as cities, regions, or countries).
I Even in experimental settings, micro-interventions may not be feasible (e.g., because of fairness concerns) or effective (e.g., because of interference).
In this talk, I will use the terms “event”, “intervention”, and “treatment” interchangeably.
Applications
I Synthetic controls have been applied to study the effects of right-to-carry laws (Donohue et al., 2017), legalized prostitution (Cunningham and Shah, 2018), immigration policy (Bohn et al., 2014), corporate political connections (Acemoglu et al., 2016), and many other policy issues.
I They have also been adopted as the main tool for data analysis across different sides of the issues in recent prominent debates on the effects of immigration (Borjas, 2017; Peri and Yasenov, 2017) and minimum wages (Allegretto et al., 2017; Jardim et al., 2017; Neumark and Wascher, 2017; Reich et al., 2017).
I Synthetic controls are also applied outside economics in the social sciences, biomedical disciplines, engineering, etc. (see, e.g., Heersink et al., 2017; Pieters et al., 2017).
Applications
I Outside academia, synthetic controls have found considerable coverage in the popular press (see, e.g., Guo, 2015; Douglas, 2018) and have been widely adopted by multilateral organizations, think tanks, business analytics units, governmental agencies, and consulting firms.
I For example, the synthetic control method plays a prominent role in the official evaluation of the effects of the massive Bill & Melinda Gates Foundation’s Intensive Partnerships for Effective Teaching program (Gutierrez et al., 2016).
How an Analysis of Basque Terrorism Helps
Economists Understand Brexit
A method pioneered by an MIT professor has also been used to estimate the economic effect of a
tobacco ban, German reunification, legalization of prostitution and gun rights
Professor Alberto Abadie created an economic model in 2003 of terrorism-free Basque
Country. Similar methods now inform analysis of the impact of the U.K.'s Brexit vote.
By Jason Douglas
Nov 7, 2018 5:37 am ET
A method developed more than a decade ago to assess the cost of
political violence in the Basque country has become a key tool for
economists trying to figure out the cost of Brexit.
The U.K. voted to leave the European Union in 2016 and is due to
formally withdraw from the bloc in March next year. Attempting to
estimate the effect of the referendum result runs into a problem
familiar to anyone curious about the economic consequences of a
particular event: No one can be sure how the economy would have
performed had the vote gone the other way.
One approach might be to extrapolate the pre-referendum trend to
paint a picture of how the economy might have behaved had voters
chosen to keep the U.K. in the EU. Another might be to take the
average growth rate of the world's other advanced economies and
compare it with the U.K.'s. Both approaches have drawbacks. The first
misses out more recent developments in the global economy that
might be important, while the second isn't a precise match for the U.K.
economy and its idiosyncrasies.
In the early 2000s, similar concerns led Alberto Abadie, a professor at
the Massachusetts Institute of Technology, to develop a technique he
dubbed the synthetic-control method. He wanted to calculate the
terrible price of decades of terrorism on his home region of Spain but
couldn't figure out how to do so while controlling for other economic
factors, such as the recession Spain suffered during the late 1970s and
early 1980s.
"We had to come up with a different way to approach the problem,"
Prof. Abadie said in a recent interview. So they hit upon a novel idea:
Why not create an economic model of a virtual Basque country that
had known nothing but peace?
In a 2003 paper, Prof. Abadie, then at Harvard, and Javier Gardeazabal
of the University of the Basque Country in Bilbao, Spain, built this
alternate world by melding together bits and pieces of the economies
of other Spanish regions. This "counterfactual" Basque country
displayed similar characteristics to the real thing, such as population
density and weight of industrial production in output. But it hadn't
been plagued by the violence that resulted in 800 deaths before
separatists unilaterally called a truce in 1998.
This experiment and related analysis led them to conclude that
separatist violence held back growth per person by as much as 10%
between the start of the conflict and a 1990 ceasefire.
The method developed by Prof. Abadie and colleagues has since been
used to estimate the economic effect of California's pioneering
tobacco ban, German reunification, the legalization of prostitution
and the right to bear arms. "We realized this could be applied to many
other situations," he said.
Now researchers in Europe are using it to construct a virtual Britain
where 2016's Brexit vote didn't happen to determine whether that
decision has already had a cost. Those who have used it include
economists led by Benjamin Born of the Frankfurt School of Finance
and Management, investment bank UBS AG, the University of Sussex's
Trade Policy Observatory and the Centre for European Reform, a
London-based think tank devoted to European policy.
[Figure: "Not-So-Model Growth." The U.K. economy has grown more slowly than some researchers estimate it would have if people had voted against breaking from the European Union in a June 2016 referendum. Cumulative change in GDP since Q1 2016 (percent), 2016 to 2018, for the U.K. and for "U.K. minus Brexit vote". Source: Centre for European Reform.]
There are small differences in the various studies, but they all use
Prof. Abadie's method as the basis for constructing a "doppelganger"
U.K. from other similar advanced economies, such as the U.S., Canada,
France and the Netherlands. They reach similar conclusions,
suggesting the British economy at the start of 2018 was around 2%
smaller than it would have been had the 2016 referendum gone the
other way, a reflection of factors such as weaker investment and
consumer spending.
Prof. Abadie said he's aware his technique has been applied to Brexit
but hasn't followed the studies in question closely. He said
researchers need to take care in selecting the countries used to
construct the doppelganger to ensure an appropriate match. And he
added the bigger question remains how Brexit will affect the U.K. over
the long term. Britain won't leave the EU until March 2019 and the
shape of its future ties to the bloc are still under negotiation.
"We are only in the middle of this process," he said.
Plan for the talk
1. A primer on synthetic control estimation
2. Why use synthetic controls?
3. A penalized synthetic control estimator
4. Synthetic controls for experimental design
5. Closing remarks
Literature is large, and there is much I will not cover ...
I Matrix/tensor completion: Amjad, Shah, and Shen (2018), Agarwal, Shah, and Shen (2020), Athey, Bayati, Doudchenko, Imbens, and Khosravi (2018), Bai and Ng (2020)
I Bias correction: Abadie and L’Hour (2020), Arkhangelsky, Athey, Hirshberg, Imbens, and Wager (2019), Ben-Michael, Feller, and Rothstein (2020)
I Inference: Cattaneo, Feng, and Titiunik (2020), Chernozhukov, Wuthrich, and Zhu (2019a, 2019b), Firpo and Possebom (2018)
I Functional and distributional outcomes: Chernozhukov, Wuthrich, and Zhu (2019c), Gunsilius (2020)
I Large-T: Botosaru and Ferman (2019), Ferman (2019), Li (2020)
I Other related methods: Brodersen, Gallusser, Koehler, Remy, and Scott (2015)
... and many more (and many, many empirical applications).
A primer on synthetic control estimation
I When the units of analysis are a few aggregate entities, a combination of comparison units (a “synthetic control”) often does a better job of reproducing the characteristics of a treated unit than any single comparison unit alone.
I The comparison unit in the synthetic control method is selected as the weighted average of all potential comparison units that best resembles the characteristics of the treated unit(s).
A primer on synthetic control estimation
I Suppose that we observe $J + 1$ units in periods $1, 2, \ldots, T$.
I Unit “one” is exposed to the intervention of interest (that is, “treated”) during periods $T_0 + 1, \ldots, T$.
I The remaining $J$ units are an untreated reservoir of potential controls (a “donor pool”).
I Let $Y^I_{it}$ be the outcome that would be observed for unit $i$ at time $t$ if unit $i$ is exposed to the intervention in periods $T_0 + 1$ to $T$.
I Let $Y^N_{it}$ be the outcome that would be observed for unit $i$ at time $t$ in the absence of the intervention.
I We aim to estimate the effect of the intervention on the treated unit,
$$\tau_{1t} = Y^I_{1t} - Y^N_{1t} = Y_{1t} - Y^N_{1t}$$
for $t > T_0$, where $Y_{1t}$ is the outcome for unit one at time $t$.
A primer on synthetic control estimation
I Let $W = (w_2, \ldots, w_{J+1})'$ with $w_j \ge 0$ for $j = 2, \ldots, J + 1$ and $w_2 + \cdots + w_{J+1} = 1$. Each value of $W$ represents a potential synthetic control.
I Let $X_1$ be a $(k \times 1)$ vector of pre-intervention characteristics for the treated unit. Similarly, let $X_0$ be a $(k \times J)$ matrix which contains the same variables for the unaffected units.
I The vector $W^* = (w_2^*, \ldots, w_{J+1}^*)'$ is chosen to minimize $\|X_1 - X_0 W\|$, subject to our weight constraints.
I Let $Y_{jt}$ be the value of the outcome for unit $j$ at time $t$. For a post-intervention period $t$ (with $t > T_0$), the synthetic control estimator is:
$$\hat{\tau}_{1t} = Y_{1t} - \sum_{j=2}^{J+1} w_j^* Y_{jt}.$$
A primer on synthetic control estimation
I Typically,
$$\|X_1 - X_0 W\| = \left( \sum_{h=1}^{k} v_h \left( X_{h1} - w_2 X_{h2} - \cdots - w_{J+1} X_{hJ+1} \right)^2 \right)^{1/2}.$$
I The positive constants $v_1, \ldots, v_k$ reflect the predictive power of each of the $k$ predictors on $Y^N_{1t}$.
I $v_1, \ldots, v_k$ can be chosen by the analyst or by data-driven methods.
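A minimal sketch of the weight calculation described above, assuming NumPy and SciPy are available. The function name `synth_weights`, the toy matrices, and the choice of SciPy's SLSQP solver are illustrative assumptions, not part of the slides or of the `synth` package; the predictor weights v default to one.

```python
import numpy as np
from scipy.optimize import minimize


def synth_weights(X1, X0, v=None):
    """Nonnegative weights summing to one that minimize the v-weighted
    distance between X1 (length k) and X0 @ W (X0 is k x J)."""
    X1 = np.asarray(X1, dtype=float)
    X0 = np.asarray(X0, dtype=float)
    k, J = X0.shape
    v = np.ones(k) if v is None else np.asarray(v, dtype=float)

    def objective(w):
        gap = X1 - X0 @ w
        return float(np.sum(v * gap**2))  # squared norm; same minimizer as the norm

    def gradient(w):
        gap = X1 - X0 @ w
        return -2.0 * X0.T @ (v * gap)

    result = minimize(
        objective,
        x0=np.full(J, 1.0 / J),            # start from equal weights
        jac=gradient,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * J,           # w_j >= 0
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
        options={"ftol": 1e-12, "maxiter": 1000},
    )
    return result.x


# Hypothetical toy data: k = 2 predictors, J = 3 donors (columns of X0).
X0 = np.array([[1.0, 3.0, 5.0],
               [2.0, 1.0, 4.0]])
X1 = np.array([2.0, 1.5])   # midpoint of donors 1 and 2, so W* is about (0.5, 0.5, 0)
W_star = synth_weights(X1, X0)

# Made-up outcomes for one post-intervention period t.
Y1_t = 10.0
Y0_t = np.array([12.0, 14.0, 20.0])
tau_hat = Y1_t - Y0_t @ W_star   # estimated effect: treated outcome minus synthetic control
print(W_star, tau_hat)
```

In this toy case the treated unit sits exactly between the first two donors, so the fitted weights are sparse and the estimated effect is close to 10 - (6 + 7) = -3. Real applications would also need a rule for choosing v, which this sketch leaves to the analyst.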
A primer on synthetic control estimation
Application: German reunification
[Figure: Per capita GDP (PPP, 2002 USD) by year, 1960–2000, in two panels, (a) and (b). Series shown: West Germany, OECD, and synthetic West Germany.]
A primer on synthetic control estimation
Application: German reunification
                   West Germany   Synthetic West Germany   OECD Sample
                   (1)            (2)                      (3)
GDP per capita     15808.9        15802.24                 13669.4
Trade openness     56.8           56.9                     59.8
Inflation rate     2.6            3.5                      7.6
Industry share     34.5           34.5                     34.0
Schooling          55.5           55.2                     38.7
Investment rate    27.0           27.0                     25.9

Note: First column reports $X_1$, second column reports $X_0 W^*$, and last column reports a simple average for the 16 OECD countries in the donor pool. GDP per capita, inflation rate, and trade openness are averages for 1981–1990. Industry share (of value added) is the average for 1981–1989. Schooling is the average for 1980 and 1985. Investment rate is averaged over 1980–1984.
A primer on synthetic control estimation
Application: German reunification
Country j     W*_j     Country j         W*_j
Australia     0        Netherlands       0.10
Austria       0.42     New Zealand       0
Belgium       0        Norway            0
Denmark       0        Portugal          0
France        0        Spain             0
Greece        0        Switzerland       0.11
Italy         0        United Kingdom    0
Japan         0.16     United States     0.22
A primer on synthetic control estimation
I Abadie et al. (2010) establish a bias bound under the factor model
$$Y^N_{it} = \theta_t Z_i + \lambda_t \mu_i + \varepsilon_{it},$$
where $Z_i$ are observed features, $\mu_i$ are unobserved features, and $\varepsilon_{it}$ is a unit-level transitory shock, modeled as random noise.
I Suppose that we can choose $W^*$ such that:
$$\sum_{j=2}^{J+1} w_j^* Z_j = Z_1, \quad \sum_{j=2}^{J+1} w_j^* Y_{j1} = Y_{11}, \quad \ldots, \quad \sum_{j=2}^{J+1} w_j^* Y_{jT_0} = Y_{1T_0}.$$
In practice, these may hold only approximately.
A primer on synthetic control estimation
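Suppose that $E|\varepsilon_{jt}|^p < \infty$ for some $p > 2$. Then,
$$\left| E[\hat{\tau}_{1t} - \tau_{1t}] \right| < C(p)^{1/p} \left( \frac{\bar{\lambda}^2 F}{\underline{\xi}} \right) J^{1/p} \max\left\{ \frac{\bar{m}_p^{1/p}}{T_0^{1 - 1/p}}, \frac{\bar{\sigma}}{T_0^{1/2}} \right\},$$
where $F$ is the number of unobserved factors,
$$\sigma^2_{jt} = E|\varepsilon_{jt}|^2, \quad \bar{\sigma}^2_j = \frac{1}{T_0} \sum_{t=1}^{T_0} \sigma^2_{jt}, \quad \bar{\sigma}^2 = \max_{j=2,\ldots,J+1} \bar{\sigma}^2_j,$$
$$m_{p,jt} = E|\varepsilon_{jt}|^p, \quad m_{p,j} = \frac{1}{T_0} \sum_{t=1}^{T_0} m_{p,jt}, \quad \bar{m}_p = \max_{j=2,\ldots,J+1} m_{p,j},$$
for $p$ even, $|\lambda_{tf}| \le \bar{\lambda}$ for all $t = 1, \ldots, T$ and $f = 1, \ldots, F$, and
$$\underline{\xi} \le \xi(M) = \text{smallest eigenvalue of } \frac{1}{M} \sum_{t=T_0-M+1}^{T_0} \lambda_t' \lambda_t.$$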
A primer on synthetic control estimation
I The bias bound is predicated on close fit, and controlled by the ratio between the scale of $\varepsilon_{it}$ and $T_0$.
I In particular, the credibility of a synthetic control depends on the extent to which it is able to fit the trajectory of $Y_{1t}$ for an extended pre-intervention period.
A primer on synthetic control estimation
I There are no ex-ante guarantees on the fit. If the fit is poor, Abadie et al. (2010) recommend against the use of synthetic controls.
I Settings with small $T_0$, large $J$, and large noise create substantial risk of overfitting.
I To reduce interpolation biases and risk of overfitting, restrict the donor pool to units that are similar to the treated unit.
A primer on synthetic control estimation
I Abadie et al. (2010) propose a mode of inference for the synthetic control framework that is based on permutation methods.
I A permutation distribution can be obtained by iteratively reassigning the treatment to the units in the donor pool and estimating “placebo effects” in each iteration.
I The effect of the treatment on the unit affected by the intervention is deemed to be significant when its magnitude is extreme relative to the permutation distribution.
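The placebo procedure above can be sketched as follows, assuming NumPy and SciPy. Everything here is illustrative: the helper names, the simulated panel, and the choice to fit weights on pre-period outcomes only are assumptions of this sketch, not the authors' implementation. The test statistic is the ratio of post-period to pre-period RMSE, and each placebo run uses all other units (including the actually treated one) as donors.

```python
import numpy as np
from scipy.optimize import minimize


def fit_weights(y1_pre, Y0_pre):
    """Simplex weights that best reproduce the pre-period path y1_pre."""
    J = Y0_pre.shape[1]
    res = minimize(
        lambda w: float(np.sum((y1_pre - Y0_pre @ w) ** 2)),
        x0=np.full(J, 1.0 / J),
        jac=lambda w: -2.0 * Y0_pre.T @ (y1_pre - Y0_pre @ w),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * J,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return res.x


def rmse_ratio(Y, unit, T0):
    """Post-period RMSE / pre-period RMSE when `unit` is treated as if exposed."""
    donors = [j for j in range(Y.shape[1]) if j != unit]
    w = fit_weights(Y[:T0, unit], Y[:T0][:, donors])
    gap = Y[:, unit] - Y[:, donors] @ w
    pre = np.sqrt(np.mean(gap[:T0] ** 2))
    post = np.sqrt(np.mean(gap[T0:] ** 2))
    return post / pre


def placebo_p_value(Y, treated, T0):
    """Reassign treatment to every unit in turn and rank the treated unit's ratio."""
    ratios = np.array([rmse_ratio(Y, j, T0) for j in range(Y.shape[1])])
    p_value = float(np.mean(ratios >= ratios[treated]))
    return ratios, p_value


# Simulated panel (rows = periods, columns = units); unit 0 is treated after T0.
rng = np.random.default_rng(0)
T, T0, n_units = 30, 20, 10
factors = rng.normal(size=(T, 2))
loadings = rng.uniform(0.0, 1.0, size=(2, n_units))
Y = 10.0 + factors @ loadings + 0.3 * rng.normal(size=(T, n_units))
Y[T0:, 0] -= 5.0   # made-up treatment effect

ratios, p_value = placebo_p_value(Y, treated=0, T0=T0)
print(np.round(ratios, 2), p_value)
```

The p-value here is the share of units whose ratio is at least as large as the treated unit's, so it can never fall below one over the number of units.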
A primer on synthetic control estimation
Application: German reunification
[Figure: Post-period RMSE / pre-period RMSE for West Germany and each of the 16 donor-pool countries (Portugal, Austria, Denmark, France, Japan, Switzerland, UK, Belgium, Spain, New Zealand, USA, Netherlands, Greece, Norway, Australia, Italy); horizontal axis ticks at 5, 10, 15.]
A primer on synthetic control estimation
I The permutation distribution is more informative than mechanically looking at p-values alone.
I Depending on the number of units in the donor pool, conventional significance levels may be unrealistic or impossible.
I Often, one-sided inference is most relevant.
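A short worked calculation may help show why conventional significance levels can be out of reach. It assumes the uniform permutation benchmark and writes $r_j$ for the test statistic of unit $j$ (for example, the post/pre RMSE ratio); the symbol $r_j$ is notation introduced here, not in the slides. The permutation p-value of the treated unit (unit one) is then bounded below by $1/(J+1)$:

$$p = \frac{1}{J+1} \sum_{j=1}^{J+1} \mathbf{1}\{ r_j \ge r_1 \} \ \ge\ \frac{1}{J+1}, \qquad J = 16:\ \frac{1}{17} \approx 0.059, \qquad J = 38:\ \frac{1}{39} \approx 0.026.$$

With the 16 OECD donors of the German reunification application, a 5% level cannot be attained even when the treated unit has the most extreme statistic, whereas with 38 donor states it can.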
A primer on synthetic control estimation
Application: California tobacco control program
[Figure: Per-capita cigarette sales (in packs) by year, 1970–2000, for California and the rest of the U.S.; the passage of Proposition 99 is marked.]
A primer on synthetic control estimation
Application: California tobacco control program
[Figure: Per-capita cigarette sales (in packs) by year, 1970–2000, for California and synthetic California; the passage of Proposition 99 is marked.]
A primer on synthetic control estimation
Application: California tobacco control program
[Figure: Gap in per-capita cigarette sales (in packs) by year, 1970–2000; the passage of Proposition 99 is marked.]
A primer on synthetic control estimation
Application: California tobacco control program (All States in Donor Pool)
[Figure: Gap in per-capita cigarette sales (in packs) by year, 1970–2000, for California and the control states; the passage of Proposition 99 is marked.]
A primer on synthetic control estimation
Application: California tobacco control program (Pre-Prop. 99 MSPE ≤ 20 Times Pre-Prop. 99 MSPE for CA)
[Figure: Gap in per-capita cigarette sales (in packs) by year, 1970–2000, for California and the control states; the passage of Proposition 99 is marked.]
A primer on synthetic control estimation
Application: California tobacco control program (Pre-Prop. 99 MSPE ≤ 5 Times Pre-Prop. 99 MSPE for CA)
[Figure: Gap in per-capita cigarette sales (in packs) by year, 1970–2000, for California and the control states; the passage of Proposition 99 is marked.]
A primer on synthetic control estimation
Application: California tobacco control program (Pre-Prop. 99 MSPE ≤ 2 Times Pre-Prop. 99 MSPE for CA)
[Figure: Gap in per-capita cigarette sales (in packs) by year, 1970–2000, for California and the control states; the passage of Proposition 99 is marked.]
A primer on synthetic control estimation
Application: California tobacco control program (All 38 States in Donor Pool)
[Figure: Histogram (frequency) of the post/pre-Proposition 99 mean squared prediction error ratio, with California marked.]
A primer on synthetic control estimation
I The availability of a well-defined procedure to select the comparison unit makes the estimation of the effects of placebo interventions feasible.
I The permutation method we just described does not attempt to approximate the sampling distributions of test statistics.
I Sampling-based inference is often complicated in a synthetic control setting, sometimes because of the absence of a well-defined sampling mechanism and sometimes because the sample is the same as the population.
A primer on synthetic control estimation
I This mode of inference reduces to classical randomization inference (Fisher, 1935) when the intervention is randomly assigned, a rather improbable setting.
I More generally, this mode of inference evaluates significance relative to a benchmark distribution for the assignment process, one that is implemented directly in the data.
The uniform benchmark is often employed in practice, but departures from uniformity are possible (see Firpo and Possebom, 2018).
Why use synthetic controls?
I Compare to linear regression. Let:
  I $Y_0$ be the $(T - T_0) \times J$ matrix of post-intervention outcomes for the units in the donor pool.
  I $\bar{X}_1$ and $\bar{X}_0$ be the result of augmenting $X_1$ and $X_0$ with a row of ones.
  I $B = (\bar{X}_0 \bar{X}_0')^{-1} \bar{X}_0 Y_0'$ collects the coefficients of the regression of $Y_0$ on $\bar{X}_0$.
I $B'\bar{X}_1$ is a regression-based estimator of the counterfactual outcome for the treated unit without the treatment.
I Notice that $B'\bar{X}_1 = Y_0 W^{reg}$, with
$$W^{reg} = \bar{X}_0' (\bar{X}_0 \bar{X}_0')^{-1} \bar{X}_1.$$
I The components of $W^{reg}$ sum to one, but may be outside $[0, 1]$, allowing extrapolation, and will not be sparse.
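To make the regression-weight algebra concrete, here is a small numerical sketch, assuming NumPy. The data are made up for illustration (one predictor, three donors, a treated unit outside the donor range), and variable names such as `X0_bar` are this sketch's own.

```python
import numpy as np

# Hypothetical toy data: one predictor (k = 1) and three donors (J = 3).
X0 = np.array([[1.0, 2.0, 3.0]])        # k x J donor predictors
X1 = np.array([5.0])                    # treated unit lies outside the donor range [1, 3]
Y0 = np.array([[10.0, 12.0, 14.0],      # (T - T0) x J post-intervention donor outcomes
               [11.0, 13.0, 15.0]])

J = X0.shape[1]
X0_bar = np.vstack([np.ones((1, J)), X0])   # augment with a row of ones
X1_bar = np.concatenate([[1.0], X1])

gram_inv = np.linalg.inv(X0_bar @ X0_bar.T)
B = gram_inv @ X0_bar @ Y0.T                # regression coefficients
W_reg = X0_bar.T @ gram_inv @ X1_bar        # implied regression weights

print(W_reg)            # approximately [-1.1667, 0.3333, 1.8333]
print(W_reg.sum())      # approximately 1.0
print(B.T @ X1_bar)     # approximately [18., 19.], the same as Y0 @ W_reg
```

The weights sum to one, but one is negative and one exceeds one: the regression reproduces the treated unit's predictor value of 5 only by extrapolating beyond the donors, which a synthetic control with weights in [0, 1] cannot do.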
Why use synthetic controls?
Application: German reunification
Country j     W^reg_j     Country j         W^reg_j
Australia     0.12        Netherlands       0.14
Austria       0.26        New Zealand       0.12
Belgium       0.00        Norway            0.04
Denmark       0.08        Portugal          -0.08
France        0.04        Spain             -0.01
Greece        -0.09       Switzerland       0.05
Italy         -0.05       United Kingdom    0.06
Japan         0.19        United States     0.13
Why use synthetic controls?
I No extrapolation. Synthetic control estimators preclude extrapolation outside the support of the data.
I Transparency of the fit. Linear regression uses extrapolation to obtain $X_0 W^{reg} = X_1$, even when the untreated units are completely dissimilar in their characteristics to the treated unit. In contrast, synthetic controls make transparent the actual discrepancy between the treated unit and the convex hull of the units in the donor pool, $X_1 - X_0 W^*$.
I Safeguard against specification searches. Synthetic controls do not require access to post-treatment outcomes in the design phase of the study, when synthetic control weights are calculated. Therefore, all design decisions can be made without knowing how they affect the conclusions of the study.
Why use synthetic controls?
I Safeguard against specification searches (cont.). Synthetic control weights can be calculated and pre-registered before the post-treatment outcomes are realized, or before the actual intervention takes place, providing a safeguard against specification searches and p-hacking.
I Transparency of the counterfactual. Synthetic controls make explicit the contribution of each comparison unit to the counterfactual of interest.
I Sparsity. Because the synthetic control coefficients are proper weights and are sparse, they allow a precise interpretation of the nature of the estimate of the counterfactual of interest (and of potential biases).
Why use synthetic controls?
Sparsity: Geometric interpretation
[Figure: geometric illustration showing $X_1$ and the synthetic control $X_0 W^*$.]
I If $X_1$ does not belong to the convex hull of the columns of $X_0$, the synthetic control $X_0 W^*$ is unique and sparse.
I If $X_1$ belongs to the convex hull of the columns of $X_0$, the synthetic control $X_0 W^*$ may not be unique and candidate $W^*$’s may not be sparse, although sparse solutions always exist (by Carathéodory’s theorem).
A penalized synthetic control estimator
Penalized synthetic control (Abadie and L’Hour, 2020): $W^*(\lambda)$ solves
$$\min_{W} \left\| X_1 - \sum_{j=2}^{J+1} W_j X_j \right\|^2 + \lambda \sum_{j=2}^{J+1} W_j \left\| X_1 - X_j \right\|^2 \quad \text{s.t.} \quad W_j \ge 0, \ \sum_{j=2}^{J+1} W_j = 1.$$
I $\lambda > 0$ controls the trade-off between fitting the treated unit well and minimizing the sum of pairwise distances to the selected control units.
I $\lambda \to 0$: pure synthetic control, i.e., the synthetic control that minimizes the pairwise matching discrepancies among all solutions for the unpenalized estimator.
I $\lambda \to \infty$: nearest neighbor matching.
A penalized synthetic control estimator
Advantages of the penalized estimator:
1. For any $\lambda > 0$, the solution is unique and sparse provided that untreated observations are in general position.
2. The presence of the penalization term reduces the interpolation bias that occurs when averaging units that are far away from each other.
3. Same computational complexity as the unpenalized estimator.
A penalized synthetic control estimator
[Figure: number line from 1 to 5 showing the control units $X_2 = 1$, $X_3 = 4$, $X_4 = 5$ and the treated unit $X_1 = 2$.]
I $X_1 = 2$ and $X_0 = [1 \;\; 4 \;\; 5]$.
I The (unpenalized) synthetic control has two sparse solutions: $W_1^* = (2/3, 1/3, 0)$ and $W_1^{**} = (3/4, 0, 1/4)$.
I $W_1^*$ dominates $W_1^{**}$ in terms of matching discrepancy. There is an infinite number of non-sparse solutions, given by convex combinations of these two.
I However, when $\lambda > 0$, the penalized synthetic control has a unique solution:
$$W_1^*(\lambda) = \begin{cases} (2 + \lambda/2, \; 1 - \lambda/2, \; 0)/3 & \text{if } 0 < \lambda \le 2, \\ (1, 0, 0) & \text{if } \lambda > 2. \end{cases}$$
I As $\lambda \to 0$, $W_1^*(\lambda) \to W_1^*$, the pure synthetic control. The penalized synthetic control never uses the “bad” match $X_4$.
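The closed-form solution in this one-dimensional example can be checked numerically. The sketch below assumes NumPy and SciPy; the function names and the use of the SLSQP solver are illustrative choices, not the `pensynth` implementation.

```python
import numpy as np
from scipy.optimize import minimize


def penalized_weights(x1, donors, lam):
    """Numerically solve the penalized problem for a scalar predictor."""
    donors = np.asarray(donors, dtype=float)
    J = donors.size
    dist2 = (x1 - donors) ** 2            # pairwise discrepancies |X1 - Xj|^2

    def objective(w):
        return float((x1 - donors @ w) ** 2 + lam * (dist2 @ w))

    def gradient(w):
        return -2.0 * (x1 - donors @ w) * donors + lam * dist2

    res = minimize(
        objective,
        x0=np.full(J, 1.0 / J),
        jac=gradient,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * J,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
        options={"ftol": 1e-14, "maxiter": 1000},
    )
    return res.x


def closed_form(lam):
    """Solution stated on the slide for X1 = 2 and X0 = [1, 4, 5]."""
    if lam <= 2:
        return np.array([2 + lam / 2, 1 - lam / 2, 0.0]) / 3
    return np.array([1.0, 0.0, 0.0])


x1, donors = 2.0, [1.0, 4.0, 5.0]

# Both sparse unpenalized solutions reproduce X1 = 2 exactly.
print(np.dot([2 / 3, 1 / 3, 0], donors), np.dot([3 / 4, 0, 1 / 4], donors))

for lam in (0.5, 1.0, 2.0, 3.0):
    print(lam, np.round(penalized_weights(x1, donors, lam), 4), np.round(closed_form(lam), 4))
```

For each value of the penalty, the numerical weights should agree with the closed form up to solver tolerance, and the weight on the donor at 5 stays at zero.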
Synthetic controls for experimental design
I Suppose a ridesharing company wants to assess the impact of a new incentive pay program for drivers.
I To do so, the new incentive pay treatment will be applied as a pilot program in one market/city or in a few markets/cities.
I Which market or markets should they treat?
I Which market or markets should they use as a comparison/control?
I This is a setting where randomization of treatment may create defective designs where:
  I The treated market/markets are non-representative of the entire set of markets of interest.
  I Treated and control markets are very different in their characteristics.
Synthetic controls for experimental design
I In these contexts, Abadie and Zhao (2021) propose:
  I Find a first set of weights that make a synthetic treated unit reproduce the features of the population of units of interest.
  I Create a synthetic control for the synthetic treated unit.
I In contrast to the observational case, in experimental settings we have two synthetic control units: one treated and one untreated.
I Related ongoing work by Doudchenko and co-authors. Synthetic controls are widely used as experimental designs by business analytics units (e.g., Uber).
Closing remarks
I Synthetic controls provide many practical advantages for the estimation of the effects of policy interventions and other events of interest.
I As with any other statistical procedure (and especially those aiming to estimate causal effects), the credibility of the results depends crucially on the level of diligence exerted in the application of the method and on whether contextual and data requirements are met in the empirical application at hand (see Abadie, JEL 2021).
Closing remarks
I Some open areas of research: sampling-based inference, external validity, sensitivity to model restrictions, estimation with multiple interventions, data-driven selectors of $v_h$, mediation analysis ...
I An area of recent heightened interest regarding the use of synthetic controls is the design of experimental interventions.
I Results on robust and efficient computation of synthetic controls are scarce, and more research is needed on the computational aspects of this methodology.
I On the empirical side, many of the events and the policy interventions economists care about take place at an aggregate level, affecting entire aggregate units.
Resources
The material in this presentation comes from:
I Abadie, A. and J. Gardeazabal. 2003. “The Economic Costs of Conflict: A Case Study of the Basque Country.” American Economic Review, 93(1): 112-132.
I Abadie, A., A. Diamond, and J. Hainmueller. 2010. “Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program.” Journal of the American Statistical Association, 105(490): 493-505.
I Abadie, A., A. Diamond, and J. Hainmueller. 2015. “Comparative Politics and the Synthetic Control Method.” American Journal of Political Science, 59(2): 495-510.
I Abadie, A. 2021. “Using Synthetic Controls: Feasibility, Data Requirements, and Methodological Aspects.” Journal of Economic Literature, 59(2): 391-425.
I Abadie, A. and J. L’Hour. 2020. “A Penalized Synthetic Control Estimator for Disaggregated Data.” Journal of the American Statistical Association (forthcoming).
I Abadie, A. and J. Zhao. 2021. “Synthetic Controls for Experimental Design.”
Code:
synth (Matlab, Stata and R): http://web.stanford.edu/~jhain/synthpage.html
pensynth (R): https://github.com/jeremylhour/pensynth
Resources
While this talk has mostly focused on my work, many have contributed to the literature on synthetic control estimators and related methods. Some references:
I Acemoglu, D., S. Johnson, A. Kermani, J. Kwak, and T. Mitton. 2016. “The Value of Connections in Turbulent Times: Evidence from the United States.” Journal of Financial Economics, 121, 368–391.
I Amjad, M., D. Shah, and D. Shen. 2018. “Robust Synthetic Control.” Journal of Machine Learning Research, 19(22), 1–51.
I Amjad, M. J., V. Misra, D. Shah, and D. Shen. 2019. “mRSC: Multidimensional Robust Synthetic Control.” In Proc. ACM Meas. Anal. Comput. Syst., Volume 3.
I Agarwal, A., D. Shah, and D. Shen. 2020. “Synthetic Interventions.” arXiv:2006.07691.
I Arkhangelsky, D., S. Athey, D. A. Hirshberg, G. W. Imbens, and S. Wager. 2019. “Synthetic Difference in Differences.” arXiv:1812.09970.
I Athey, S., M. Bayati, N. Doudchenko, G. Imbens, and K. Khosravi. 2018. “Matrix Completion Methods for Causal Panel Data Models.” arXiv:1710.10251.
Resources
I Botosaru, I. and B. Ferman. 2019. “On the Role of Covariates in the Synthetic Control Method.” The Econometrics Journal, forthcoming.
I Brodersen, K. H., F. Gallusser, J. Koehler, N. Remy, S. L. Scott, et al. 2015. “Inferring Causal Impact Using Bayesian Structural Time-Series Models.” The Annals of Applied Statistics, 9(1), 247–274.
I Cattaneo, M. D., Y. Feng, and R. Titiunik. 2019. “Prediction Intervals for Synthetic Control Methods.” arXiv:1912.07120.
I Chernozhukov, V., K. Wuthrich, and Y. Zhu. 2019a. “An Exact and Robust Conformal Inference Method for Counterfactual and Synthetic Controls.” arXiv:1712.09089v6.
I Chernozhukov, V., K. Wuthrich, and Y. Zhu. 2019b. “Practical and Robust t-test Based Inference for Synthetic Control and Related Methods.” arXiv:1812.10820v2.
I Chernozhukov, V., K. Wuthrich, and Y. Zhu. 2019c. “Practical and Robust t-test Based Inference for Synthetic Control and Related Methods.” arXiv:1909.07889.
I Doudchenko, N., D. Gilinson, S. Taylor, and N. Wernerfelt. 2020. “Designing Experiments with Synthetic Controls.”
Resources
I Doudchenko, N. and G. W. Imbens. 2016. “Balancing, Regression, Difference-in-Differences and Synthetic Control Methods: A Synthesis.” arXiv:1610.07748.
I Ferman, B. 2019. “On the Properties of the Synthetic Control Estimator with Many Periods and Many Controls.” arXiv:1906.06665.
I Firpo, S. and V. Possebom. 2018. “Synthetic Control Method: Inference, Sensitivity Analysis and Confidence Sets.” Journal of Causal Inference, 6(2).
I Gunsilius, F. 2020. “Distributional Synthetic Controls.”
I Li, K. T. 2020. “Statistical Inference for Average Treatment Effects Estimated by Synthetic Control Methods.” Journal of the American Statistical Association, 115, 2068-2083.
I Robbins, M. W., J. Saunders, and B. Kilmer. 2017. “A Framework for Synthetic Control Methods with High-Dimensional, Micro-Level Data: Evaluating a Neighborhood-Specific Crime Intervention.” Journal of the American Statistical Association, 112, 109–126.
I Samartsidis, P., S. R. Seaman, A. M. Presanis, M. Hickman, and D. De Angelis. 2019. “Assessing the Causal Effect of Binary Interventions from Observational Panel Data with Few Treated Units.” Statistical Science, forthcoming.
Thank you!