Institutional Corruption and Election Fraud: Evidence from a Field Experiment in Afghanistan∗
Michael Callen† and James D. Long‡
June 14, 2014
Abstract
We investigate the relationship between political networks, weak institutions, and election fraud during the 2010 parliamentary election in Afghanistan combining: (i) data on political connections between candidates and election officials; (ii) a nationwide controlled evaluation of a novel monitoring technology; and (iii) direct measurements of aggregation fraud. We find considerable evidence of aggregation fraud in favor of connected candidates and that the announcement of a new monitoring technology reduced theft of election materials by about 60 percent and vote counts for connected candidates by about 25 percent. The results have implications for electoral competition and are potentially actionable for policymakers.
JEL codes: P16, D72, D73
Keywords: Election Fraud, Corruption
∗Authors’ Note: We are grateful to Glenn Cowan, Jed Ober, Eric Bjornlund, Evan Smith, and Jon Gatto at Democracy International (DI) and Nader Nadery, Jandad Spinghar, and Una Moore at the Free and Fair Elections Foundation of Afghanistan (FEFA). We acknowledge the support of USAID Development Innovation Ventures (DIV), DI, and grant #FA9550-09-1-0314 from the Air Force Office of Scientific Research (Callen). We are indebted to James Andreoni, Luke Condra, Gordon Dahl, Daniel Egel, Col. Joseph H. Felter, Ray Fisman, Tarek Ghani, Susan Hyde, Radha Iyengar, Danielle Jung, Asim Ijaz Khwaja, David Laitin, David Lake, Edward Miguel, Karthik Muralidharan, Paul Niehaus, Gerard Padro-i-Miquel, Rohini Pande, Ngoc Anh Tran, Maura O’Neill, Shanker Satyanath, Jacob Shapiro, Romain Wacziarg, Scott Worden, Christopher Woodruff and seminar participants at the NBER Political Economy Meeting, the 2011 NEUDC conference, George Mason University, the Development and Conflict Research (DACOR) conference, the 2011 Institute on Global Conflict and Cooperation (IGCC) conference, UC San Diego, Yale, the Center for Global Development (CGD), and USAID for extremely helpful comments. We are particularly indebted to Eli Berman, Clark Gibson, and Craig McIntosh for advice and support at all stages of the project. This project would not have been possible without the dedicated research assistance of Randy Edwards, Mohammad Isaqzadeh, Shahim Kabuli, and Arman Rezaee. Mistakes remain with the authors.
†University of California Los Angeles. Department of Political Science. 4289 Bunche Hall, Los Angeles, CA 90095. email: [email protected]
‡Harvard Academy for International and Area Studies and University of Washington. Department of Political Science.
1 Introduction
Many governments are not responsive to their citizens. Fair elections provide an important means
of improving responsiveness by making elected officials accountable to voters.1 However, election
fraud undermines this critical function in many young democracies, often at the hands of tightly
networked groups of political elites. This paper examines whether candidates exploit connections
to elections officials to add fraudulent votes during the aggregation process. We study this problem
in Afghanistan, a country where democratic institutions are struggling to develop after the last
three decades of continuous conflict.2
There are many ways to manipulate elections, including voter intimidation, ballot box stuffing,
and changing vote totals after ballots are cast. We study the manipulation of vote totals during
the aggregation process, which we henceforth call “aggregation fraud”. We are interested in this
particular type of fraud because it is likely to involve collusion between candidates and election
officials. We collect novel data on aggregation fraud by photographing provisional vote tally sheets
at individual polling centers before the aggregation process and comparing these counts to the
corresponding numbers after aggregation is completed.3 This technique, which we call “photo
quick count”, records the same vote totals both before and after aggregation. In a clean election,
these numbers should be identical. We find differences at 78 percent of the polling locations in our
observed sample. Additionally, candidates connected to officials in charge of aggregation receive
an average of 3.5 fraudulent votes per polling station (about 13.7 percent of their polling station
average).
Given that fraud affects many elections in developing countries, there is reason to study the
effects of election monitoring both for the design of policy and for understanding the mechanics of
1There is substantial empirical documentation of the benefits of improving political accountability (Besley and Burgess 2002; Besley, Pande and Rao 2005; Chattopadhyay and Duflo 2004; Ferraz and Finan 2008; 2011; Fujiwara 2013; Pande 2003; Martinez-Bravo et al. 2013). Additionally, in countries experiencing violent contests for state control, such as Afghanistan, fair elections may also undermine popular support for insurgents by promoting an accountable and legitimate government and by providing a forum for reconciliation (Berman, Shapiro and Felter 2011; Besley and Persson 2011; McChrystal 2009; United States Army 2006; World Bank 2011).
2Rashid (2009) provides an authoritative account of how the patronage networks of Afghan warlords have undermined political development in Afghanistan.
3In many countries, it is standard for election officials to record vote totals at a particular polling center on an election returns form and then post the form on the outside of the polling center, indicating vote totals at the polling centers to local residents.
election fraud. We therefore implement an experiment both to estimate the causal effect of photo
quick count on aggregation fraud and to provide more general evidence on the response of fraud to
the credible threat of discovery through monitoring. Specifically, we test whether announcing photo
quick count measurements to election officials reduces fraud. We deliver a letter to a randomly
selected set of polling center managers in 238 polling centers from an experimental sample of
471 polling centers.4 The letter explains how photo quick count works and announces that the
measurement will be taken. Our nationwide experimental sample comprises 7.8 percent of polling
centers operating on election day and spans 19 of Afghanistan’s 34 provinces.
This experiment produces four main results. First, the photo quick count announcement re-
duced damaging of election materials by candidate representatives from 18.9 to 8.1 percent (a
reduction of about 60 percent). Second, it reduced votes for politically powerful candidates at
a given polling location from about 21 to about 15 (a reduction of about 25 percent).5 Treat-
ment effects are also larger for candidates connected to officials in charge of aggregation in their
constituency.6 These candidates lost about 6.5 votes as a result of the intervention.7 Third, the in-
tervention impacts fraud measured directly using photo quick count. Point estimates for this effect
range between 9.37 and 17.17 fewer votes changed during aggregation for candidates connected to
the provincial aggregator.8 Last, we find that having a neighbor treated within one kilometer is
associated with an additional reduction of about seven votes, suggesting our estimates may slightly
understate the true effect.
Our results relate to empirical findings in four strands of literature on the economics of cor-
ruption. First, as in the important examples provided by Bertrand et al. (2007) and Olken and
Barron (2009), we find that corruption limits the ability of governments to correct externalities.
4Polling center managers are election officials tasked with managing the voting process on election day and overseeing the counting of votes at the end of the day in their assigned polling center.
5The mean number of votes cast for a candidate at a given polling substation in our control sample is 1.410, reflecting the fact that many candidates receive no votes in many locations. The mean number of votes cast for all candidates at a given polling location is 269.
6As we describe in Section 3, we observe these connections only for 57 candidates, a small subset of candidates running in the election. In Table A2, however, we find that treatment effects are localized to connected candidates, even controlling for whether they were investigated.
7Correspondingly, these candidates receive enough votes to rank among winning candidates at 49.7 percent of the polling substations in our control sample. Our treatment reduces this share to 38.2 percent of polling substations.
8These large bounds on the estimated treatment effect are due to substantial treatment-related attrition in this measure. We calculate these bounds using the trimming method of Lee (2009).
The purpose of electoral law is to ensure that election outcomes reflect the will of the electorate.
We show that this function is undermined by a faulty aggregation process. Second, the effects of
announcing photo quick count depend on pre-existing connections between candidates and election
officials. Fragile democracies provide many examples of elected officials sharing rents with their
networks (Fisman 2001; Khwaja and Mian 2005). Free and fair elections may place limits on this
(see, for example, Ferraz and Finan 2008; 2011), and so patronage networks might have incen-
tives to coordinate when capturing elections. This suggests the possibility of multiple equilibria in
corruption as discussed in Olken and Pande (2011); the same intervention can have very different
effects depending on pre-existing political relationships. Third, we examine the determinants of
equilibrium patterns of corruption (Shleifer and Vishny 1993; Cadot 1987; Rose-Ackerman 1975;
Svensson 2003), focusing on the role of candidate connections. Last, our experiment relates to the
growing body of experimental and quasi-experimental assessments of initiatives to improve elec-
tions (Aker, Collier and Vicente 2013; Fujiwara 2013; Hyde 2007; Ichino and Schundeln 2012). Our
project also draws direct inspiration from work in development economics on efforts to improve
transparency and accountability (Atanassova, Bertrand and Niehaus 2013; Duflo, Hanna and Ryan
2012; Di Tella and Schargrodsky 2003; Ferraz and Finan 2008; Olken 2007; Yang 2008).9
We point to three implications for policies aimed at strengthening democratic institutions.
First, aggregation fraud was a serious problem in this election. When electoral institutions are
weak, candidates may be able to leverage their ties to officials to distort electoral outcomes in their
favor. Second, we find a substantial response of fraud to monitoring, suggesting that monitoring
can increase fairness in elections. Third and relatedly, our results provide promise for photo quick
count as a means of both precisely measuring and of reducing aggregation fraud. This approach can
increase precision beyond existing forensic tests which compare realized vote distributions with the-
oretical distributions that should occur in a fair election.10 Moreover, such checks can be subverted
by competent manipulators who ensure that rigged values follow the test distributions. The tech-
nology is also highly compatible with implementation via Information Communications Technology.
9Research on the role of monitoring and anti-corruption efforts in development is advancing rapidly; we direct readers to Olken and Pande (2011) and McGee and Gaventa (2011) for excellent reviews of research in this field.
10Myagkov, Ordeshook and Shakin (2009), Mebane (2008), and Beber and Scacco (2012) describe forensic methods of detecting election fraud.
The cost of recording and centralizing information on diffuse illegal behavior is now nominal. The
rapid increase in cellular connectivity and in smartphone usage in developing countries allows the
possibility that this technology might also be adapted to citizen-based implementation.
We structure the rest of the paper as follows. Section 2 describes our experimental setting
and relevant features of electoral institutions in Afghanistan. Section 3 presents results using data
on directly measured aggregation fraud. Section 4 describes our experimental evaluation of photo
quick count. Section 5 provides results from the experiment and Section 6 concludes.
2 Institutional Background
2.1 Post-Invasion Democracy in Afghanistan
After the invasion by the United States and fall of the Taliban in 2001, a coalition of international
armed forces helped to empanel a Constitutional Loya Jirga to establish democratic institutions
in Afghanistan after decades of internecine conflict, civil war, and Taliban rule. In 2004, popular
elections validated the Loya Jirga’s choice of Hamid Karzai as president, and in 2005, Afghans
voted in the first elections for the lower house of parliament (Wolesi Jirga). In 2009, Karzai was re-
elected amid claims of rampant election fraud, which largely discredited the government.11 Afghans
returned to the polls in September 2010 to elect members of parliament amid a growing insurgency
and a US commitment to begin withdrawing troops in July 2011. The international community
viewed these elections as a critical benchmark in the consolidation of democratic institutions given
doubts about the Karzai government’s ability to exercise control in much of the country. Despite
lingering memories of violence and widespread fraud from the 2009 election, roughly 5 million voters
cast ballots in the 2010 Wolesi Jirga elections.12
11Karzai initially won 53 percent of the vote, above the 50 percent threshold necessary to avoid a run-off. After an independent investigation based on physical inspections of a random subsample of ballots, Karzai’s share was reduced to 47 percent. Karzai finally won re-election when his main competitor, Abdullah Abdullah, refused to participate in the run-off.
12The Independent Electoral Commission projected this number out of what it believes are 11 million legitimate registered voters. Afghanistan has never had a complete census, so population estimates vary widely. The total population is estimated at roughly 30 million and the voting age population is roughly 16 million.
2.2 Electoral Institutions
Three features of the parliamentary election system in Afghanistan make it particularly vulnerable
to fraud. First, many seats are available within a single constituency, creating thin victory margins
for a large number of positions. This both makes fraudulent votes potentially highly valuable and
also increases the number of potential manipulators. Second, electoral institutions in Afghanistan
are just beginning to develop and remain weak. Finally, the state exercises incomplete territorial
control, leaving polling centers in contested regions vulnerable to closure or capture. We discuss
each of these three features in this section.
Afghanistan’s 34 provinces serve as multi-member districts that elect members to the Wolesi
Jirga. Each province serves as a single electoral district and the number of seats it holds in
parliament is proportional to its estimated population. Candidates run “at large” within a province
without respect to any smaller constituency boundaries. Voters cast a Single Non-Transferable Vote
(SNTV) for individual candidates, nearly all of whom run as independents.13 The rules award
seats to the candidates who receive the most votes, up to each province’s seat allocation. For
example, Kabul province elects the most members to Parliament (33), Ghor province elects the
median number of candidates in our sample (6), and Panjsher the fewest (2). The candidates who
rank 1 to 33 in Kabul, 1 to 6 in Ghor, and 1 to 2 in Panjsher win seats.14
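The seat-allocation rule described above is mechanically simple: rank the candidates in a province by vote total and award seats to the top k, where k is the province's seat count. A minimal sketch, with entirely hypothetical candidate names and vote totals:

```python
# SNTV seat allocation as described in the text: in each province, the
# candidates with the k highest vote totals win, where k is the number of
# seats the province holds. Candidate names and counts are hypothetical.

def allocate_seats(votes: dict[str, int], seats: int) -> list[str]:
    """Return the winners: the `seats` candidates with the most votes."""
    ranked = sorted(votes.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:seats]]

# A hypothetical two-seat province (Panjsher, for example, elects 2 members):
province_votes = {"Candidate A": 4210, "Candidate B": 3980,
                  "Candidate C": 3975, "Candidate D": 120}
print(allocate_seats(province_votes, seats=2))  # ['Candidate A', 'Candidate B']
```

Note that in this toy example the last winner beats the first loser by only 5 votes, illustrating the thin margins that make even small manipulations valuable under SNTV.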
This method for allocating seats in parliament for winning candidates provides incentives for
fraud. Because many seats are available within a single province, a large number of candidates
gain office with thin victory margins. SNTV with large district magnitudes and a lack of political
parties also disperses votes across many candidates. The vote margin separating the lowest winning
candidate from the highest losing candidate is often small. This creates a high expected return
for even small manipulation for many candidates. In contrast, electoral systems with established
parties and with small district magnitude tend to produce larger victory margins, so non-viable
13SNTV systems provide voters with one ballot that they cast for one candidate when multiple candidates run for multiple seats. If a voter’s ballot goes towards a losing candidate, the vote is not re-apportioned. Although this electoral system is rare, former U.S. Ambassador to Afghanistan Zalmay Khalilzad and President Hamid Karzai promoted SNTV during the first parliamentary elections in 2005 to marginalize warlords and reduce the likelihood they obtained parliamentary seats. As a corollary, Karzai also decreed that political parties should not be allowed to form.
14There are 249 seats in the Wolesi Jirga. Ten seats are reserved for the nomadic Kuchi population and the remaining seats are allocated in proportion to the estimated population of the province.
candidates may be less likely to rig. Because each province contains multiple seats, it remains
possible for election officials involved in vote aggregation to rig votes on behalf of multiple candidates
simultaneously.
Electoral malfeasance in Afghanistan may also be partly due to the weak institutions tasked
with managing elections. The Independent Election Commission serves as the main electoral body
responsible for polling, counting votes, aggregation, and certifying winning candidates. Histori-
cally, the Independent Election Commission has proven susceptible to influence by corrupt agents.
We review specific features of the aggregation process that are conducive to fraud, along with
considerable photographic evidence of fraud, in Section 3 below.15
Informal networks also play an important role in determining political outcomes in Afghanistan.
Despite attempts to promote democratic institutions, pre-existing power structures remain highly
relevant.16 For example, several of the main candidates in the 2010 election and a number of
current government officials were warlords prior to the U.S.-led invasion. Former warlords also
had considerable influence during the drafting of the Bonn Agreement, which provides the basis
for Afghanistan’s current political institutions. Along these lines, Karzai enjoys strong links with
government officials in Southern Afghanistan given his family roots in that part of the country.
Former warlords fighting in the Northern Alliance against the Taliban exert strong control in
Northern Afghanistan and have played a key role in the US-backed government.
Despite weak electoral institutions, candidates and officials face some possibility of punishment
for rigging. The Electoral Complaints Commission, which is backed by the United Nations, exists
as a separate and independent body from the Independent Elections Commission. The Electoral
Complaints Commission investigates complaints against polling officials, candidates, or citizens.
Any Afghan can lodge such a complaint. Based on the seriousness of a complaint and its likelihood
of affecting the election’s outcome, the Electoral Complaints Commission may decide to cancel all
of the votes at a given polling location, all of the votes for a particular candidate at a polling loca-
tion, or the total votes for a candidate across their entire constituency. The Electoral Complaints
15Similarly, Callen and Weidmann (2013) find that fraud was sufficiently severe to be detected by relatively insensitive forensic techniques in about 20 percent of the 398 districts in Afghanistan.
16For example, the Afghan Attorney General, Ishaq Malako, alleged in an address to parliament, broadcast on Afghan national television, that election outcomes were heavily influenced by well-connected Afghans through secret dealings in Dubai, and claimed to have evidence to that effect.
Commission overturned some 25 percent of the ballots in this process in the 2010 election. This
resulted in 30 seats changing hands (National Democratic Institute 2010). Additionally, under its
purview of fighting corruption, the Attorney General’s office may prosecute specific individuals,
including election officials and candidates, it believes to have participated in election fraud and
levy fines or prison sentences against them if found guilty.17
3 Aggregation Fraud
There are many ways to rig an election such as buying votes, intimidating voters, and stuffing
ballot boxes. We focus on fraud which happens during the aggregation process. This type of fraud
typically involves adjusting votes in favor of a particular candidate and usually happens at a central
aggregation center.
We capture vote totals both immediately before and directly after the aggregation process,
which took roughly one month. These data directly measure votes added or subtracted during
aggregation for each candidate at each polling location in our observed sample. To our knowledge,
these are the first direct measurements of aggregation fraud. We begin by examining which types
of candidates benefit from aggregation fraud and summarize a few basic patterns.
3.1 Vote Aggregation
Aggregation takes place in three stages. First, after voting concludes, election staff count votes at
individual polling centers. Polling centers contain several polling substations. There are typically
four substations in a polling center. For example, a polling center might be a school, and polling
substations are classrooms set up as polling locations inside of the school.18 Counting is overseen
17To our knowledge, no candidates received criminal penalties for committing election fraud. However, during the month-long aggregation process, the Attorney General attempted to suspend the heads of both the Independent Election Commission and the Electoral Complaints Commission.
18Analysis in this paper is done at the level of the polling substation. In particular, we use votes for a given candidate at a single polling substation as a main outcome. We had several options in terms of the level of aggregation to select as the unit of analysis. The candidate-polling substation is the most basic constitutive unit. We use this as the unit of analysis because polling substation outcomes range much less widely than polling center outcomes. This allows us to estimate treatment effects across fairly similar units. In our data, there are between 1 and 11 polling substations in a given polling center. The number of votes cast at a polling substation varies between 0 and 599, reflecting a law that when 600 votes are cast, a new
Figure 1: The Aggregation Process. [Flowchart: Polling Center (presiding official: Polling Center Manager; total number: 6,038; period of operation: 7AM–4PM on election day) → Provincial Aggregation Center (presiding official: Provincial Elections Officer; total number: 36; period of operation: ~1 month after election) → National Aggregation Center (presiding official: Chairman; total number: 1; period of operation: ~3 months after election).]
at each polling center by a Polling Center Manager. The candidate totals for each substation are
recorded on a single Declaration of Results form. Second, copies of the results forms are then
sealed in a tamper-evident bag and sent to a Provincial Aggregation Center. After the count, the
ballots are returned to the boxes, and the boxes are stored locally and are not transmitted to the
Provincial Aggregation Center.19 Changing the Declaration of Results forms is therefore all that
is necessary to manipulate the aggregation process. Separate copies of each Declaration of Results
form are also posted on the outside of the polling center for public viewing. One Declaration of
Results form is posted for each polling substation. In the final stage, results forms are collected
at the National Aggregation Center in Kabul and combined to produce a national total. Figure 1
summarizes the aggregation process.
3.2 Measuring Fraud
We measure fraud by taking photographs of the Declaration of Results form posted for public
viewing prior to aggregation and comparing this record to the corresponding vote total after aggre-
gation. We compare at the level of individual candidates at specific polling substations. Figure 2
illustrates the method. Our research team took the picture on the left at a polling center in the
field the morning after election day. The scan on the right was taken at the national aggregation
substation must be opened. By contrast, the number of votes cast at a polling center in our data ranges from 0 to 5526. Appendix Table A8 reports results aggregating to the candidate - polling center level.
19In most cases, boxes are stored at one pre-designated site in each of the 398 districts across Afghanistan.
Figure 2: Election Returns Form for the Same Polling Center Before and After Aggregation. Panel A: Election Returns Form at the Polling Center. Panel B: Election Returns Form at the National Aggregation Center.
center in Kabul about one month later. The picture on the right should be of a carbon paper copy
of the form on the left and so they should be identical. Nonetheless, we find three major differences.
First, someone has converted the Dari script for the polling center and polling substation
numbers into Arabic numerals. Second, the name of the polling center manager has changed. Last,
the sheet on the left records votes for several candidates and appears to result from normal polling
while the figure on the right records no votes whatsoever.
To record these differences we require both a picture and post-election data on votes disaggre-
gated by candidate and polling substation. We obtained post-election data by scraping the election
commission website on October 24, 2010.20 Pre-aggregation vote totals are based on photographs.
We hand-coded 48,871 vote entries for 1,784 different candidates at 347 polling substations directly
from our photographs.21 Of these observations, we have corresponding data from our web scrape
20Web scraping involves collecting, downloading, and structuring .html data which is available on the internet but not in a form ready for analysis. We obtained the earliest possible returns in order to isolate the effects of our treatment from the many readjustments that were made during a year-long arbitration process.
21We had data for 20 percent of our polling stations independently double-entered to try to minimize error associated with varying picture quality. During double entry we found mistakes for only five candidate -
for 48,018 entries at 346 polling substations. These 346 polling substations were contained in 149
distinct polling centers in our experimental sample.22
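The photo quick count comparison itself reduces to matching two data sources keyed by candidate and polling substation: hand-coded pre-aggregation counts and scraped post-aggregation counts. A minimal sketch of the computation of per-candidate discrepancies, with entirely hypothetical identifiers and counts:

```python
# Photo quick count comparison, sketched: pre-aggregation counts (hand-coded
# from photographs of posted Declaration of Results forms) are matched to
# post-aggregation counts (scraped from the election commission website) at
# the candidate-by-polling-substation level. The difference is the direct
# measure of aggregation fraud. All records below are hypothetical.

pre_counts = {("cand_01", "ps_1001"): 34, ("cand_02", "ps_1001"): 12}
post_counts = {("cand_01", "ps_1001"): 49, ("cand_02", "ps_1001"): 12}

discrepancies = {
    key: post_counts[key] - pre
    for key, pre in pre_counts.items()
    if key in post_counts  # keep only records observed in both sources
}
print(discrepancies)  # {('cand_01', 'ps_1001'): 15, ('cand_02', 'ps_1001'): 0}
```

In a clean election every discrepancy would be zero; any nonzero entry records votes added or subtracted during aggregation.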
3.3 Patterns of Fraud
There are several ways a candidate might adjust votes during the aggregation process. They might
simply add votes to their total without adjusting votes for other candidates. Alternatively, they
might cancel votes for competitors or move votes from their competitors’ total to their own total.
To gain a sense of how this is done, we examine whether votes are only added, subtracted, or some
combination across all candidates on a given Declaration of Results form. Table 1 describes the
frequency of five different patterns of fraud.
In the most common pattern (36.7 percent of cases), candidates add votes while simultaneously subtracting
from competitors. Candidates also commonly add votes while leaving their competitors’ totals
unaltered (20.2 percent of cases). Adding and subtracting in equal measure is less common (4.3
percent of cases) as is simply subtracting votes (4.3 percent of cases).23 Only 21.4 percent of the
polling substations in our sample record no change whatsoever, indicating the extent of aggregation
fraud. Along these lines, of the 48,018 candidate - polling substation observations, 5,308 (11.05
percent) record some difference during aggregation. We note that we observe only a highly selected
sample.24
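The patterns just described can be recovered mechanically from the per-candidate discrepancies at each polling substation. A sketch of the classification, using the category labels from Table 1 (the input vectors are hypothetical):

```python
# Classification behind Table 1: given per-candidate vote differences
# (post-aggregation minus pre-aggregation) across all candidates on one
# polling substation's ballot, label the pattern of discrepancies.

def classify(diffs: list[int]) -> str:
    added = sum(d for d in diffs if d > 0)
    subtracted = -sum(d for d in diffs if d < 0)
    if added == 0 and subtracted == 0:
        return "No Fraud"
    if subtracted == 0:
        return "Adding votes only"
    if added == 0:
        return "Subtracting votes only"
    if added == subtracted:
        return "Adding and subtracting equally"
    return ("Adding more than subtracting" if added > subtracted
            else "Subtracting more than adding")

print(classify([0, 0, 0]))    # No Fraud
print(classify([15, 0, -3]))  # Adding more than subtracting
print(classify([5, -5]))      # Adding and subtracting equally
```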
3.4 Political Connections and Aggregation Fraud
Political connections between candidates and officials in charge of aggregation may facilitate fraud.
To alter Declaration of Results forms, candidates need some means of accessing them. We exam-
ine whether votes for politically connected candidates are more likely to be altered using data on
polling substation observations.
22In three of the polling centers (five polling substations) in our data, we record no votes after aggregation. This might signal that the Independent Elections Commission officially cancelled votes from these polling centers, or that a candidate successfully managed to eliminate all votes from these polling centers during aggregation. Our results are robust to dropping these three polling centers.
23The five polling substations that record no positive votes for any candidate after aggregation fall into the “subtracting votes only” category.
24As we discuss in Section 5 below, at many polling substations these records were either not posted, torn down, or removed. If sheets are removed from polling centers where candidates anticipate adding votes, our data will report less fraud than actually occurred during the election.
Table 1: Aggregation Discrepancy Patterns

Pattern                           Number of             Share of    Mean Vote
                                  Polling Substations   Sample      Difference
No Fraud                          74                    21.4%       0.00
Adding votes only                 70                    20.2%       47.34
Subtracting votes only            15                    4.3%        -245.07
Adding and subtracting equally    15                    4.3%        0.00
Adding more than subtracting      127                   36.7%       83.45
Subtracting more than adding      45                    13.01%      -54.13

Notes: This table reports aggregation discrepancy patterns. Each row corresponds to whether discrepancies during aggregation, across all candidates on the ballot at a given polling substation, reflect: no change (1); only additions (2); only subtractions (3); additions and subtractions in equal measure (4); more added votes than subtracted votes (5); or more subtracted votes than added votes (6). The sample comprises 341 polling substations for which both pre-aggregation and post-aggregation data are available. Column (1) reports the number of polling substations falling into each category. Column (2) provides the corresponding share of the total sample. Column (3) reports the average of vote differences across polling substations in the indicated category, where differences are calculated for each polling substation as the sum of differences between post-aggregation votes and pre-aggregation votes across all candidates on the ballot at the substation.
political connections from systematic investigations of candidates’ political histories. These inves-
tigations involved extensive expert interviews in the relevant constituencies and cross-validation
against a set of additional resources. Appendix A.1 describes the investigation process in detail.
To examine the role of connections, we run regressions of the form:
Y_ijs = β0 + β1 Investigated_i + β2 Connection_i + γ_j + ε_ijs.    (1)
Y_ijs is the difference in the number of votes before and after aggregation for candidate i at polling
substation s in constituency j. Investigated_i is a dummy variable equal to one if the candidate is
among the 57 candidates for whom we have political history data, Connection_i is a dummy variable
equal to one for a political connection for candidate i, and γ_j are constituency (province) fixed
effects.25 We consider connections to President Karzai and to district and provincial aggregators.
We also check whether votes for incumbents and for candidates with some government service are
more likely to be altered. Table 2 provides summary statistics for the data used in these regressions.
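Regression (1) can be estimated by absorbing the province fixed effects through within-province demeaning and then solving the normal equations for the two remaining coefficients. The sketch below illustrates this on a hypothetical toy dataset; it is not the authors' estimation code, and the data are invented for illustration only.

```python
# Pure-Python sketch of equation (1): province fixed effects are absorbed by
# demeaning every variable within its province; the coefficients on
# Investigated and Connection are then found by solving the 2x2 normal
# equations via Cramer's rule. All rows below are hypothetical.

from collections import defaultdict

# (province, investigated, connected, vote_difference) — invented toy data
rows = [
    ("Kabul", 1, 1, 6.0), ("Kabul", 1, 0, 2.0), ("Kabul", 0, 0, 0.0),
    ("Ghor", 1, 1, 5.0), ("Ghor", 1, 0, 1.0), ("Ghor", 0, 0, -1.0),
]

def demean_within(values, groups):
    """Subtract each group's mean from its members (absorbs group fixed effects)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for g, v in zip(groups, values):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for g, v in zip(groups, values)]

provinces = [r[0] for r in rows]
x1 = demean_within([r[1] for r in rows], provinces)  # Investigated
x2 = demean_within([r[2] for r in rows], provinces)  # Connection
y = demean_within([r[3] for r in rows], provinces)   # vote difference

# Solve the 2x2 normal equations (X'X) b = X'y by Cramer's rule.
a11 = sum(v * v for v in x1)
a12 = sum(u * v for u, v in zip(x1, x2))
a22 = sum(v * v for v in x2)
b1 = sum(u * v for u, v in zip(x1, y))
b2 = sum(u * v for u, v in zip(x2, y))
det = a11 * a22 - a12 * a12
beta1 = (b1 * a22 - b2 * a12) / det
beta2 = (a11 * b2 - a12 * b1) / det
print(round(beta1, 3), round(beta2, 3))
```

In this constructed example, being investigated adds 2 votes and a connection adds a further 4 within each province, so the recovered coefficients are β1 = 2 and β2 = 4; in practice one would of course use a regression package that also supplies standard errors.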
25The political connection variables are based only on the investigations data; therefore β2 is interpreted as the additional difference associated with Connection_i, conditional on being investigated.
Table 2: Summary Statistics on Political Connections and Aggregation Fraud
Variable Mean Std. Dev. # Obs.
Political Connections
Connected to Provincial Aggregator (=1)              0.005   0.073    48008
Connected to Provincial + District Aggregator (=1)   0.004   0.065    48008
Karzai Connection (=1)                               0.011   0.106    48008
Government Service (=1)                              0.015   0.120    48008
Incumbent (=1)                                       0.067   0.250    48008
Aggregation Fraud
Post-aggregation Votes - Pre-aggregation Votes             0.166   6.409    48008
(Post-aggregation Share - Pre-aggregation Share) x 100     0.570   14.287   48008
Notes: This table reports summary statistics for political connections and for aggregation fraud. The unit of observation is the candidate - polling substation. The first five rows are dummy variables that equal 1 for each observation where a candidate has a connection of the indicated type. The penultimate row is the difference between post-aggregation and pre-aggregation votes. The last row is the difference between the post-aggregation and the pre-aggregation share multiplied by 100. Both the pre-aggregation and the post-aggregation vote shares are calculated using pre-aggregation polling station totals as the denominator.
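Concretely, the two aggregation-fraud measures in the last two rows of Table 2 can be computed per candidate as follows; the counts are made up for a single hypothetical polling substation, not data from the study.

```python
# Hypothetical pre- and post-aggregation counts for one polling substation
# (illustrative numbers only).
pre = {"cand_A": 50, "cand_B": 30, "cand_C": 20}
post = {"cand_A": 62, "cand_B": 30, "cand_C": 15}

# Both measures use the pre-aggregation substation total as the denominator,
# as described in the notes to Table 2.
pre_total = sum(pre.values())

# Measure 1: post-aggregation votes minus pre-aggregation votes.
vote_diff = {c: post[c] - pre[c] for c in pre}

# Measure 2: (post-aggregation share - pre-aggregation share) x 100.
share_diff_x100 = {c: (post[c] - pre[c]) / pre_total * 100 for c in pre}
```

Because the same pre-aggregation total serves as the denominator for both shares, the second measure is simply the first rescaled by 100 divided by the substation's pre-aggregation total.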
Table 3 reports results on the relationship between political connections and the degree of
aggregation fraud. Vote totals for elite candidates are more likely to change during aggregation.
In Column (1) of panel A we see that votes for the 57 investigated candidates increase by an
average of 1.870 votes during aggregation. In Column (2), we see that the votes for investigated
candidates also connected to the provincial aggregator increase by a total of 2.552 votes. We reject
that the sum of these coefficients equals zero with 95 percent confidence. Column (3) shows that
candidates with a connection to the provincial aggregator and district aggregator add about 3.587
votes at each polling substation. None of our 57 investigated candidates are connected to the district
aggregator but not the provincial aggregator, so we cannot estimate separate coefficients. Columns
(4) and (5) show that connections to President Karzai and having a history of Government Service
negatively predict vote addition, conditional on being in the investigated set, though these estimates
are quite imprecise. Column (6) shows that incumbency, after accounting for investigation, has
only a small additional effect. In Column (7) we see that having a connection to the aggregators
remains important, with the coefficient close to that reported in Column (3), even after controlling
for incumbency. We calculate standard errors for these estimates using the method of Cameron,
Table 3: Vote Changes During Aggregation by Candidate Type
Dependent Variable: Post-aggregation Votes - Pre-aggregation Votes
Panel A - Votes                                  (1)       (2)       (3)       (4)       (5)       (6)       (7)

Investigated (=1)                              1.870*    1.610     1.380     2.328     3.149**   1.764*    3.385**
                                              (0.980)   (1.208)   (1.176)   (2.169)   (1.415)   (1.013)   (1.713)
Provincial Aggregator Connection (=1)                    0.942
                                                        (1.651)
Prov. + District Aggregator Connection (=1)                        2.207                                   2.553
                                                                  (1.848)                                 (1.807)
Karzai Connect. (=1)                                                        -0.745                        -1.005
                                                                           (2.216)                        (2.153)
Government Service (=1)                                                              -1.634               -2.038
                                                                                    (1.918)               (1.996)
Incumbent (=1)                                                                                  0.245     0.302
                                                                                               (0.238)   (0.226)
Constant                                       0.104***  0.105***  0.106***  0.105***  0.105***  0.094**   0.096***
                                              (0.037)   (0.037)   (0.037)   (0.036)   (0.037)   (0.037)   (0.037)

R-Squared                                      0.013     0.013     0.014     0.013     0.013     0.013     0.015
# Candidates                                   1783      1783      1783      1783      1783      1783      1783
# Polling Stations                             149       149       149       149       149       149       149
# Candidate - Polling Station Observations     48008     48008     48008     48008     48008     48008     48008
Connection(s) + Investigated = 0 (p-value)     0.056     0.038     0.009     0.029     0.212     0.035     0.032
Mean for Candidates Not Investigated           0.131     0.131     0.131     0.131     0.131     0.131     0.131

Dependent Variable: (Post-aggregation Share - Pre-aggregation Share) x 100
Panel B - Vote Shares                            (1)       (2)       (3)       (4)       (5)       (6)       (7)

Investigated (=1)                              2.978**   3.069*    2.905*    2.054*    0.756     2.928**   0.051
                                              (1.316)   (1.708)   (1.588)   (1.208)   (0.864)   (1.330)   (1.271)
Provincial Aggregator Connection (=1)                   -0.325
                                                        (1.810)
Prov. + District Aggregator Connection (=1)                        0.328                                  -0.215
                                                                  (1.828)                                 (2.005)
Karzai Connect. (=1)                                                         1.500                         1.343
                                                                            (1.227)                       (1.263)
Government Service (=1)                                                                2.838               2.744
                                                                                      (2.263)             (2.486)
Incumbent (=1)                                                                                   0.115    -0.000
                                                                                                (0.198)   (0.207)
Constant                                       0.247*    0.247*    0.248*    0.246*    0.246*    0.243*    0.244*
                                              (0.132)   (0.132)   (0.132)   (0.132)   (0.132)   (0.131)   (0.131)

R-Squared                                      0.008     0.009     0.009     0.009     0.009     0.009     0.009
# Candidates                                   1784      1784      1784      1784      1784      1784      1784
# Polling Stations                             149       149       149       149       149       149       149
# Candidate - Polling Station Observations     48008     48008     48008     48008     48008     48008     48008
Connection(s) + Investigated = 0 (p-value)     0.024     0.008     0.009     0.024     0.042     0.020     0.003
Mean for Candidates Not Investigated           0.503     0.503     0.503     0.503     0.503     0.503     0.503

Notes: This table reports on political connections and the degree of aggregation fraud. The unit of observation is the candidate - polling substation. Each column in Panel A reports results from an OLS regression using the difference between post-aggregation and pre-aggregation votes as the dependent variable, with a dummy variable that equals one if a candidate - polling substation observation records the indicated connection(s) and an additional dummy that equals one if that observation has any background investigation data as independent variables. Note that we observe connections only for investigated candidates. Panel B repeats these specifications using the difference between post-aggregation and pre-aggregation vote shares (multiplied by 100) as the dependent variable. Both pre-aggregation and post-aggregation vote shares are calculated using pre-aggregation polling station vote totals as the denominator. All specifications in both panels include province fixed effects and drop the five largest and five smallest observations of the dependent variable. No candidates record a connection to the district aggregator and not to the provincial aggregator, so these coefficients cannot be estimated separately. Sample: treatment and control polling centers for which complete vote difference data are available. Table A7 reports corresponding results for the control sample only. Levels of significance: *p < 0.1, **p < 0.05, ***p < 0.01. Standard errors clustered by candidate and by polling center using the method of Cameron, Gelbach and Miller (2011) are reported in parentheses.
Gelbach and Miller (2011), which allows for arbitrary correlation for a given candidate across polling
centers and across candidates within a given polling center.26
Panel B repeats this analysis replacing the dependent variable with the change in vote shares
during aggregation.27 Being in the set of elite candidates with a political history investigation
predicts roughly a 3 percentage point increase in vote share across specifications. To provide a sense of
the magnitude, if a candidate moved from a 0.5 percent vote share (the mean among un-investigated
candidates) to a 3.5 percent vote share, they would move from the seventh percentile to the 79th
percentile of candidates receiving positive votes.28
While there is evidence that political strength predicts adding votes during aggregation, we
treat this exercise as speculative for three reasons. First, we have data on connections only for
the most powerful candidates. Political power is likely to be highly asymmetrically distributed
across candidates and is concentrated in the subsample of 57 that we observe, making it difficult to
determine precisely which type of connection is most important for rigging an election.29 Second,
connections are likely correlated with other candidate attributes that facilitate fraud, and we observe
only a very limited number of candidate characteristics, making it difficult to rule out omitted
variables. Last, we omit the five largest and five smallest observations of the left-hand-side variable.
These results are robust to dropping all but the three most negative observations, which are all
at least 28.5 standard deviations away from the mean, but break down if these three outliers are
included.
Our results, to this point, are based on a comparison of independent photographic records of
election returns forms prior to aggregation and ex post results.30 This approach uncovers a type of
26In this case, these standard errors are more conservative than those clustered only at the candidate or only at the polling substation level.
27For both the pre-aggregation and post-aggregation vote share, we normalize votes by the total number of votes cast at the substation prior to aggregation. We do this to isolate vote changes that are sure to benefit the candidate in question.
28On average, there are 483 candidates listed at each of the polling substations in our sample and 93 candidates listed if we exclude Kabul from the sample. At a given polling substation, on average 267.701 votes are cast (standard deviation = 175.410). Excluding Kabul from the sample, on average 293.683 votes are cast (standard deviation = 178.285).
29Table A6 reports regressions of vote changes during aggregation on specific connections only within the sample of investigated candidates.
30This design builds on Parallel Vote Tabulations (PVTs), which have been in use since the 1980s. Through representative sampling and recording of ballots by field staff, PVTs predict national totals within a small margin of error (Cowan, Estok and Nevitte 2002), but do not make polling-center-specific comparisons. Two
fraud that is usually largely hidden, suggesting that the approach might also be used to deter manipulation.
We now turn from a descriptive analysis of the dynamics of aggregation fraud to an examination
of whether awareness of photo quick count can be used to reduce fraud.
4 Experiment
This section reports results from an experiment that randomly informed election officials about
photo quick count measurements in order to assess potential impacts on fraud. We describe the
experimental setting, the information intervention used to manipulate the awareness of polling
officials, and our treatment assignment protocol. Section 5 describes our measures of election
fraud and presents results.
4.1 Experimental Setting
During the period of our study, Afghanistan was an active warzone. To keep our field staff safe, we
selected polling centers that met three safety criteria: (i) achieving the highest security rating given
by the International Security Assistance Force (ISAF) and the Afghan National Police (ANP); (ii)
being located in a provincial center, which are much safer than rural areas; and (iii) being scheduled
to operate on election day by the Independent Elections Commission. Figure 3 maps the polling
centers in our national sample in Panel A and in Kabul only in Panel B. Our experimental sample
comprises 471 polling centers (7.8 percent of polling centers operating on election day) in 19 of the
34 provincial capitals in Afghanistan.
4.2 Experimental Intervention
On election day (September 18, 2010), we randomly announced the use of photo quick count
by delivering letters to 238 of the 471 polling centers in our experimental sample. We call this
treatment our “Letter Treatment.” Appendix Figure A4 provides an English translation of the
important technological developments allow us to build on the PVT concept. First, it is now common for election commissions to post disaggregated results on the internet. Second, the cheap availability of digital photography allows rapid and accurate recording of returns forms.
Panel A: National Sample Panel B: Kabul Subsample
Figure 3: Experimental Sample of Polling Centers
letter and Appendix Figure A5 provides the Dari translation. We instructed our Afghan field staff
to deliver letters to Polling Center Managers after 10AM and before polling concluded at 4PM.
Staff visited all 471 polling centers the following day in order to take a picture of the election
returns form.31
The letter announced to Polling Center Managers that researchers would photograph declara-
tion of results forms the following day (September 19) in order to document discrepancies arising
during aggregation. Two points about the experimental protocol bear emphasis. First, if we had not
notified managers of monitoring on election day, they would have been unaware of our treatment as
no election staff should be present at the polling center on the day after the election. Correspond-
ingly, our staff report encountering election officials while they were taking photographs on the day
following the election at only 2 of our 471 polling centers. Second, the experimental sample was
known only to the research team. Polling Center Managers in the treatment group were informed
of their status, but no election officials had a means of determining which sites we had selected as
controls.
We asked Polling Center Managers to acknowledge receipt by signing the letter. Managers at
seventeen polling centers (seven percent of centers receiving letters) refused to sign. We designate
31Declaration of Returns forms were posted at the conclusion of the count around 9PM on election day. We determined that keeping our field staff out this late was unsafe and so we instructed them to return the following morning.
a polling center as treated if the manager received a letter (Letter Treatment = 1). Our results
remain robust to redefining treatment as both receiving and signing a letter.
4.3 Assigning Treatment
To inform our treatment assignment, we fielded a baseline survey of households living in the imme-
diate vicinity of 450 of the polling centers in our experimental sample a month before the election
(August 2010). On election day, we added 21 polling centers in Kabul after obtaining additional
funding.32 We do not have baseline survey data for these 21 polling centers and so place them all
in a single stratum when assigning treatment.33
To ensure balance, we stratify treatment on province. In the 450 polling centers for which we
have baseline data, we also stratify on the share of respondents from the baseline survey reporting
at least occasional access to electricity and on respondents reporting that the district governor
carries the most responsibility for keeping elections fair. We estimate all core specifications both
with and without stratum fixed effects.34 Table 4 reports summary statistics and verifies balance
and Table A5 provides summary statistics for the remaining variables used in the analysis.35
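A stratified assignment of this kind can be sketched as follows. The strata fields and the even split within each stratum are illustrative assumptions for exposition, not the authors' exact protocol.

```python
import random

def assign_treatment(polling_centers, strata_keys, seed=2010):
    """Randomly assign half of each stratum to the letter treatment.

    polling_centers: list of dicts holding the stratifying covariates.
    strata_keys: names of the covariates defining a stratum (e.g. province
    plus binned baseline measures). All field names here are hypothetical.
    """
    rng = random.Random(seed)
    strata = {}
    for pc in polling_centers:
        strata.setdefault(tuple(pc[k] for k in strata_keys), []).append(pc)
    for members in strata.values():
        rng.shuffle(members)
        # Odd-sized strata place their extra unit in treatment or control
        # with equal probability.
        n_treat = len(members) // 2 + (rng.random() < 0.5 if len(members) % 2 else 0)
        for i, pc in enumerate(members):
            pc["letter_treatment"] = int(i < n_treat)
    return polling_centers

# Illustrative use: 4 strata of 10 centers each -> 5 treated per stratum.
centers = [{"province": p, "electricity_bin": b}
           for p in ("Kabul", "Herat") for b in (0, 1) for _ in range(10)]
assign_treatment(centers, ("province", "electricity_bin"))
```

Randomizing within strata guarantees that treatment and control are balanced on the stratifying covariates by construction.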
5 Data and Results
We examine the impact of announcing photo quick count on four outcomes. The first is the direct
aggregation fraud measure discussed in Section 3, which we construct as the absolute value of
the difference between votes before and after the aggregation process. This measure is subject to
substantial non-random attrition, potentially biasing our estimated treatment effect. We therefore
32The survey contained 2,904 respondents. To attempt to obtain a representative sample of respondents living near polling centers, enumerators employed a random walk pattern starting at the polling center, with random selection of every fourth house or structure. Respondents within households were randomly selected using a Kish grid. The sample was evenly split between male and female respondents, and enumerators conducted interviews in either Dari or Pashto.
33An alternative is to drop these 21 polling centers from specifications with district fixed effects, given the lack of baseline survey data on which to stratify them. Results are robust to this change.
34Bruhn and McKenzie (2009) suggest stratifying treatment assignment on baseline measurements of key outcomes to increase power. Because measures of fraud are unavailable prior to the election, we select our stratifying variables by identifying measures most highly correlated with fraud during the 2009 presidential election in a national sample.
35Estimates in column (1) of Table 7 additionally indicate that treatment did not affect turnout.
Table 4: Randomization Verification
Control   Treat   T-C   p-value   # Obs. (Control)   # Obs. (Treat)
(1) (2) (3) (4) (5) (6)
Baseline survey data
Plans to turnout during election (=1)                 0.790    0.797    0.008    0.734   223   227
                                                      [0.236]  [0.232]  (0.022)
Believes vote is secret (=1)                          0.668    0.650    -0.018   0.453   223   227
                                                      [0.265]  [0.255]  (0.024)
Candidate will know how I voted (=1)                  0.086    0.090    0.004    0.767   223   227
                                                      [0.146]  [0.153]  (0.014)
Can identify sitting MP (=1)                          0.371    0.386    0.015    0.627   223   227
                                                      [0.325]  [0.318]  (0.030)
People in precinct will vote for same cand. (=1)      0.236    0.249    0.012    0.607   221   224
                                                      [0.253]  [0.258]  (0.024)
Problems with ballot transport are likely (=1)        0.538    0.534    -0.004   0.880   219   225
                                                      [0.305]  [0.302]  (0.029)
Police in PC help security (=1)                       0.728    0.737    0.009    0.701   223   227
                                                      [0.246]  [0.241]  (0.023)
People like you are threatened to vote one way (=1)   0.213    0.202    -0.011   0.606   223   226
                                                      [0.231]  [0.223]  (0.021)
Local violence likely on elect. day (=1)              0.500    0.483    -0.017   0.592   219   226
                                                      [0.316]  [0.347]  (0.031)
MP Candidate from same Qawm (=1)                      0.235    0.232    -0.003   0.878   223   227
                                                      [0.219]  [0.227]  (0.021)
Trad. auth. helps settle disputes (=1)                0.286    0.293    0.007    0.785   223   227
                                                      [0.267]  [0.240]  (0.024)
Pashtun (=1)                                          0.323    0.318    -0.005   0.903   223   227
                                                      [0.388]  [0.407]  (0.037)
Tajik (=1)                                            0.436    0.433    -0.004   0.921   223   227
                                                      [0.387]  [0.390]  (0.037)
Income generating activity (=1)                       0.602    0.607    0.004    0.817   223   227
                                                      [0.197]  [0.192]  (0.018)
Electrified (=1)                                      0.722    0.706    -0.016   0.595   223   227
                                                      [0.300]  [0.323]  (0.029)
District Governor keeps elect. fair (=1)              0.112    0.114    0.003    0.864   223   225
                                                      [0.170]  [0.169]  (0.016)

Administrative data
Visited by international election monitors (=1)       0.146    0.182    0.036    0.308   223   227
                                                      [0.353]  [0.385]  (0.035)
Visited by domestic election monitors (=1)            0.892    0.863    -0.029   0.350   223   227
                                                      [0.311]  [0.344]  (0.031)
Indelible ink washes or not available (=1)            0.798    0.762    -0.036   0.357   223   227
(pre-treatment)                                       [0.402]  [0.427]  (0.039)
Notes: This table checks balance between treatment and control polling centers. The unit of observation is the polling center. The
first sixteen rows report data from the baseline survey. In these sixteen rows, each polling center observation reflects an average of
responses from either six or eight respondents sampled in the immediate vicinity of the polling center. An observation is missing if all
of the respondents in the polling center catchment refused to respond. The last three rows report administrative data. “Visited by
international election monitors” is a dummy that equals 1 if a foreign observer from Democracy International visited the polling center.
“Visited by domestic election monitors” is a dummy that equals 1 if an Afghan observer from Free and Fair Elections Afghanistan
(FEFA) visited the polling center. “Indelible ink washes or not available” is a dummy that equals 1 if the indelible ink used to prevent
individuals from voting multiple times exhibited either problem, as recorded by FEFA. Column (1) reports averages for the control
sample, column (2) provides averages for the treatment sample, Column (3) reports the difference of column (2) and column (1) and
column (4) reports the p-value from a t-test of equality in means between control and treatment. Column (5) reports the number of
observations in the control sample and column (6) reports the number in the treatment sample. Standard deviations are reported in
brackets and standard errors are reported in parentheses.
rely on three additional proxy measures less subject to this concern: (i) the number of votes cast
for elite candidates; (ii) a dummy variable that equals 1 in every case where a candidate records
enough votes to rank among the winning candidates at the polling substation, which we report in
Appendix A.2; and (iii) primary reports that materials were stolen or damaged by local candidate
representatives.36
5.1 Aggregation Fraud
The purpose of our treatment was to provide incentives to election officials to ensure that no
discrepancies arose during aggregation. Given this objective and our finding in Section 3 that
votes are both commonly added and subtracted, we examine impacts on the absolute value of vote
differences between pre-aggregation and post-aggregation vote totals.37 This measure exhibited a
large degree of treatment-related attrition. We now discuss how we address this issue in our analysis.
Addressing Attrition
Treatment significantly increased the availability of photographic records (p = 0.064), potentially
biasing mean difference estimators of the treatment effect.38 In general, with substantial treatment-
related attrition, the effect of treatment on fraud will be confounded with its effect on the availability
of photographic records. The mean difference may only reflect treatment revealing an additional
part of the fraud distribution, while there is no actual effect on the outcome of interest. To address
this, we use the method provided in Lee (2009), which estimates bounds on the effect of treatment
in the presence of non-random attrition. The purpose of this method is to trim observations that
36The number of votes cast for elite candidates is available for 461 of our 471 polling centers. Baseline covariates are not available for 21 of the 461 polling centers with vote outcome data. Specifications are therefore estimated on samples corresponding to 461 polling centers when control variables are not included and 440 when control variables are included. Specifications in Table 9 are estimated on one less polling center, which we drop when trimming the top percentile of results. Only one polling center is dropped (rather than four) because there can be several powerful candidates in a given polling center.
37In Appendix A.4 we provide evidence that both vote additions and subtractions are especially large for powerful candidates, motivating our use of the absolute value measure for these subsamples.
38We collected pictures of 204 (20 percent) of the 1,020 Declarations of Results forms that should have been posted at treatment polling centers and 143 (14.77 percent) of the 968 forms that should have been posted at control polling centers. The treatment effect on the probability that a form is missing is therefore -0.052 (standard error = 0.028).
report outcomes only under treatment from the estimation sample, allowing impacts to be estimated
using only units where outcomes would be observed irrespective of treatment assignment.39
This method relies fundamentally on the monotonicity assumption.40 In this setting, there
are two ways that the attrition process could be monotonic. Both regard polling officials who
would post Declarations of Results forms under treatment, but not otherwise. In the first instance,
these marginal officials might respond to the treatment notification letter by posting results only in
cases when doing so does not reveal fraud during the aggregation process. This would imply that
trimming the corresponding marginal polling substations from the bottom of the treatment sample
would provide the correct estimate. In the second, these officials might respond to treatment by
posting results only in locations where there is substantial fraud.41 In this case, trimming the
corresponding portion at the top of the treatment distribution would provide the correct estimate.
An intermediate scenario would arise if low-level polling center managers are unaware of aggregation
fraud that takes place later in the aggregation process. In this case, the true effect would lie between
the two bounds.
To illustrate this method, Table 5 applies it to the absolute vote difference for candidates
connected to the provincial aggregator. Of the 892 polling substations where we could
potentially observe this measure in our control sample, we successfully measure 100 (11.2 percent).
In this sample, the average absolute vote difference is 17.170. Turning to our treatment sample,
there are 969 potential observations, of which we successfully measure 155 (16 percent). In this
sample, the average absolute vote difference is 5.484.
To bound the treatment effect, we remove the part of the treatment group induced to reveal
outcomes by treatment. We calculate the share to remove, the “trimming ratio”, as (16% − 11.2%) / 16% =
29.9%. Under the monotonicity assumption, we can estimate the upper bound of the treatment
effect (i.e. the smallest fraud reducing effect) by removing the bottom 29.9 percentiles in terms of
39These estimands are therefore different from the average treatment effect across the entire experimental sample. We point interested readers to the Gerber and Green (2012) text, which provides a clear description of the Lee (2009) method.
40Let r_i(T) denote whether an outcome is reported as a function of treatment status (T ∈ {0, 1}). When r_i(T) = 1 the outcome is observed and when r_i(T) = 0 it is not. The monotonicity assumption states r_i(1) ≥ r_i(0) for all i, or r_i(1) ≤ r_i(0) for all i.
41This could happen if polling center managers in the most fraud-affected polling centers are not afraid of revealing fraud.
absolute vote differences from the treatment sample and taking the difference between the trimmed
treatment mean and the control mean. This yields an estimate of 7.798 - 17.170 = -9.372 votes. To
obtain the lower bound of the treatment effect (i.e. the largest fraud reducing effect), we remove
the top 29.9 percentiles in terms of absolute vote differences. This yields an estimate of 0 - 17.170
= -17.170 votes.42
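The trimming-and-differencing arithmetic above can be wrapped in a short function. The following is a simplified, stdlib-only sketch of the Lee (2009) bounds (not the authors' code), demonstrated on tiny hypothetical numbers rather than the actual 892- and 969-substation samples.

```python
from statistics import mean

def lee_bounds(control_obs, treat_obs, n_control, n_treat):
    """Trimming bounds on a treatment effect under non-random attrition,
    in the spirit of Lee (2009). A simplified sketch.

    control_obs / treat_obs: outcomes actually observed in each arm.
    n_control / n_treat: units that could potentially have been observed.
    Assumes treatment weakly increases reporting, as in the text.
    """
    share_c = len(control_obs) / n_control
    share_t = len(treat_obs) / n_treat
    p = (share_t - share_c) / share_t          # trimming ratio
    k = int(round(p * len(treat_obs)))         # observations to trim
    srt = sorted(treat_obs)
    # Trim the bottom p percentiles -> smallest fraud-reducing effect.
    upper = mean(srt[k:]) - mean(control_obs)
    # Trim the top p percentiles -> largest fraud-reducing effect.
    lower = mean(srt[:len(srt) - k]) - mean(control_obs)
    return lower, upper

# Tiny illustrative example: 2 of 4 control and 3 of 4 treatment units
# report, so p = (0.75 - 0.50) / 0.75 = 1/3 and one observation is trimmed.
lo, hi = lee_bounds([4, 6], [1, 5, 9], n_control=4, n_treat=4)
```

With these made-up inputs the bounds bracket the effect symmetrically; with the Table 5 inputs the same steps reproduce the -17.170 and -9.372 estimates.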
Table 6 summarizes estimates from the same exercise performed on the full sample, and for
each of the remaining five candidate type subsamples.43 To provide a basis of comparison for the
estimated effect sizes, column (1) reports the mean post-aggregation vote total across candidate - polling substations for the control polling substations within the candidate subsample. Columns
(2) and (3) report the number of observations that would be observed in the control sample with
complete data and the non-missing proportion, respectively. Column (4) reports the mean vote
difference in the observed control sample. These estimates are consistent with our finding in
Table 3 above that observations corresponding to connected candidates are more likely to exhibit
discrepancies. Columns (5) through (7) provide the analogues to columns (2) through (4) for
the treatment sample. The next column provides the trimming ratio p. Column (8) provides the
trimmed mean in the treatment sample after trimming above the (1−p)th percentile and column (9) reports
the trimmed mean eliminating the bottom p percentiles. The final two columns report the lower
bound and the upper bound, respectively. In eleven of the twelve cases, the estimated bound
is below zero. In no case, however, can we statistically reject that the upper bound is different
from zero.
Using the post-aggregation total in column (1) as a point of reference, the upper and lower
42To construct confidence intervals, we estimate standard errors for both upper and lower bounds. Lee (2009) shows that the asymptotic variance of the trimming estimator depends on: (i) the variance of the trimmed outcome variable; (ii) the trimming threshold value, which is a quantile that must be estimated; and (iii) the corresponding trimming ratio, which is also estimated. Lee (2009) provides expressions for the asymptotic standard errors only for the i.i.d. case. Our treatment is administered at the cluster (polling center) level. Therefore, as in Vogl (forthcoming), we block-bootstrap the bounds estimator at the polling center level.
43In Table A4, we perform the same exercise using the simple vote difference as an outcome. We find no evidence of a mean effect on this outcome. In Appendix A.4, we argue that this is because photo quick count appears to reduce rigging both for and against powerful candidates in roughly equal measure. We also provide evidence that the reason we find a post-aggregation reduction in Table 7 while finding no effect on vote differences may be because treatment affected the number of votes posted immediately after treatment. This is necessarily speculative because of the limited sample with vote difference data. Tables A10 through A15 replicate Table 5 for each of the five remaining subsamples.
bound estimates indicate that treatment reduces the vote discrepancy substantially. Looking at
the upper bound (the smallest effect on fraud) for the most connected candidate suggests a vote
discrepancy reduction of 20.1 percent. The upper bound for candidates connected to both the
district and provincial aggregators suggests a reduction of nearly 30 percent.
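The block bootstrap behind the standard errors in Tables 5 and 6, in which polling centers rather than individual observations are resampled, can be sketched as follows; the cluster data and statistic here are illustrative assumptions, not the study's data.

```python
import random
from statistics import mean, pstdev

def block_bootstrap_se(clusters, statistic, n_reps=250, seed=0):
    """Bootstrap standard error computed by resampling whole clusters.

    clusters: dict mapping polling-center id -> list of observations.
    statistic: function of a flat list of observations.
    Entire polling centers are drawn with replacement, which preserves
    any within-center correlation in the resampled data.
    """
    rng = random.Random(seed)
    ids = list(clusters)
    reps = []
    for _ in range(n_reps):
        draw = [x for cid in rng.choices(ids, k=len(ids))
                for x in clusters[cid]]
        reps.append(statistic(draw))
    return pstdev(reps)

# Illustrative use with three hypothetical polling centers.
se = block_bootstrap_se({"pc1": [1, 2], "pc2": [5, 6], "pc3": [9, 10]}, mean)
```

Resampling at the cluster level matches the level at which treatment was administered, which is why the paper bootstraps polling centers rather than candidate-substation observations.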
Table 5: Trimmed Estimates of Effect on Aggregation Fraud
(Sample: Provincial Aggregator Connection (=1))

Control
  (i)    Number of Observations                    892
  (ii)   Proportion Non-missing                    0.112
  (iii)  Mean abs(Vote Difference) for observed    17.170    Control Standard Error        7.582

Treatment
  (iv)   Number of Observations                    969
  (v)    Proportion Non-missing                    0.160
  (vi)   Mean abs(Vote Difference) for observed    5.484

  p = [(v) - (ii)] / (v)                           0.299

  (vii)  pth quantile                              0.000
  (viii) Trimmed Mean: E[Y | Y > y_p]              7.798     Treatment UB Std. Err.        3.493
  (ix)   (1-p)th quantile                          0.000
  (x)    Trimmed Mean: E[Y | Y < y_(1-p)]          0.000     Treatment LB Std. Err.        1.877

Effect
  (xi)   Upper Bound Estimate = (viii) - (iii)     -9.372    (xiii) UB Effect Std. Err.    8.760
  (xii)  Lower Bound Estimate = (x) - (iii)        -17.170   (xiv) LB Effect Std. Err.     7.739

  95 Percent Confidence Interval = [(xii) - 1.96*(xiv), (xi) + 1.96*(xiii)]   = [-32.338, 7.798]
  90 Percent Confidence Interval = [(xii) - 1.645*(xiv), (xi) + 1.645*(xiii)] = [-29.901, 5.038]
Notes: This table reports details of the Lee (2009) procedure applied to the data on aggregation fraud. The sample is all candidates
with a recorded connection to the provincial aggregator. Rows (i) - (iii) provide values for the control group; rows (iv) - (vi) provide
values for the untrimmed treatment group; rows (vii) and (viii) correspond to the treatment group after trimming the bottom p
percentiles; rows (ix) and (x) correspond to the treatment group after trimming the top p percentiles. Rows (xi) and (xii) report
the upper and lower bound respectively. The corresponding 95 percent and 90 percent confidence intervals are reported in the final
two rows. There are 100 non-missing observations in the control group. Before trimming, there are 155 non-missing observations in
the treatment group. There are 109 (108) observations remaining in the treatment group after trimming the lower
p percentiles (the upper p percentiles) of the distribution. Standard errors reported in this table are obtained by bootstrapping all
polling centers in our experimental sample with replacement 250 times.
Table 6: Trimmed Estimates of Effect on Aggregation Fraud

Columns: (1) mean post-aggregation votes, control; (2) potential observations, control; (3) proportion non-missing, control; (4) mean absolute vote difference, control; (5) potential observations, treatment; (6) proportion non-missing, treatment; (7) mean absolute vote difference, treatment; trimming ratio [(6)-(3)]/(6); (8) trimmed mean E[Y|Y < y_(1-p)]; (9) trimmed mean E[Y|Y > y_p]; lower bound (8)-(4); upper bound (9)-(4). Standard errors in parentheses.

Full Sample: 2.318 | 183660 | 0.097 | 0.817*** (0.210) | 203921 | 0.149 | 0.582*** (0.140) | 0.349 | 0.000 (0.245) | 0.893*** (0.246) | -0.817** (0.342) | 0.076 (0.249)
Incumbent: 7.501 | 10492 | 0.118 | 2.840*** (0.801) | 11485 | 0.171 | 1.597*** (0.317) | 0.308 | 0.000 (0.498) | 2.308*** (0.572) | -2.840*** (0.981) | -0.531 (0.907)
Most Connected Candidate: 24.739 | 962 | 0.148 | 10.310* (5.476) | 1016 | 0.201 | 3.799*** (1.417) | 0.265 | 0.000 (0.802) | 5.167** (2.053) | -10.310* (5.566) | -5.143 (5.938)
Prov. + District Aggregator Connection: 36.161 | 441 | 0.184 | 18.790** (9.009) | 480 | 0.260 | 5.760** (2.510) | 0.295 | 0.000 (2.030) | 8.090** (4.089) | -18.790** (9.090) | -10.700 (10.340)
Karzai Connection: 21.484 | 1550 | 0.141 | 8.831** (3.701) | 1600 | 0.204 | 3.667*** (1.049) | 0.309 | 0.000 (0.675) | 5.282*** (1.782) | -8.831** (3.772) | -3.549 (4.176)
Government Service: 18.788 | 1945 | 0.143 | 9.665*** (3.139) | 2089 | 0.202 | 5.126*** (1.098) | 0.291 | 0.013 (0.878) | 7.217*** (1.899) | -9.652*** (3.277) | -2.448 (3.666)

Notes: This table reports the details of calculating bounds for the treatment effect using the method of Lee (2009) for each candidate subsample. Note we only observe connections for candidates with background investigations data. The outcome variable is the absolute difference between pre-aggregation votes and post-aggregation votes. The unit of observation is the candidate - polling station. Each row provides results for the indicated candidate subsample. Column (1) reports the mean post-aggregation vote total in the controls. Column (2) (Column (5)) reports the total number of potential observations in the control (treatment) sample. Column (3) (Column (6)) reports the proportion non-missing in the control (treatment) sample. Columns (4), (7), (8), and (9) provide the means in the control, untrimmed treatment, treatment trimmed above the (1-p)th percentile, and treatment trimmed below the pth percentile, respectively. The last two columns report the lower and upper bounds on the treatment effect. Full details of these calculations are reported in Appendix Tables A10 through A15. Level of significance: *p < 0.1, **p < 0.05, ***p < 0.01. Standard errors obtained by bootstrapping polling centers 250 times with replacement are reported in parentheses.
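As a reading aid, the Lee (2009) trimming behind this table can be sketched as follows. This is an illustrative reimplementation based on the table notes, with toy numbers, not the authors' code:

```python
def lee_bounds(y_control, y_treat, p_control, p_treat):
    """Bound a treatment effect when the outcome is observed at different
    rates in the two arms: trim the treatment group by the excess response
    share p = (p_treat - p_control) / p_treat, from the top for the lower
    bound and from the bottom for the upper bound (Lee 2009)."""
    p = (p_treat - p_control) / p_treat      # trimming ratio, column [(6)-(3)]/(6)
    y = sorted(y_treat)
    k = int(round(p * len(y)))               # number of observations to trim
    mean = lambda v: sum(v) / len(v)
    lower = mean(y[:len(y) - k]) - mean(y_control)  # keep lowest (1-p) share
    upper = mean(y[k:]) - mean(y_control)           # keep highest (1-p) share
    return lower, upper

# toy example: outcomes observed at rate 0.15 in treatment vs 0.10 in control
lo, hi = lee_bounds([1.0, 0.0, 2.0, 1.0],
                    [0.0, 0.0, 1.0, 2.0, 3.0], p_control=0.10, p_treat=0.15)
```

The interval [lower, upper] brackets the effect a researcher would estimate if the observations that are differentially non-missing in the treatment group could be observed.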
5.2 Votes for Elite Candidates:
We next estimate the impact of photo quick count on votes for elite candidates. This measure has
the benefit of much less treatment-related attrition, but provides a less precise reflection of fraud
than our picture-based measure.44 We obtained data on votes disaggregated by candidate and
polling substation from a scrape of the election commission website on October 24, 2010. During
the scrape, 98 of the 1,977 polling substations in our experimental sample (4.96 percent) had not
yet posted returns. These 98 polling substations were contained within ten polling centers. Missing
data are not predicted by treatment status (p = 0.439).45
We run regressions of the form:
Y_is = β_1 + β_2 Treat_s + β_3 Treat_s × Connection_i + β_4 Treat_s × Investigated_i + β′_7 X_is + ε_is,   (2)
where Y_is is votes cast for candidate i in polling substation s, Connection_i is a dummy variable equal
to one for candidates with a specific type of political connection, Investigated_i is a dummy variable
equal to one for candidates with investigations data, and X_is is a matrix containing covariates from
our baseline survey, the Investigated_i and Connection_i dummies, and a set of stratum fixed effects.
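A minimal sketch of estimating a specification of this form on simulated candidate - polling substation data follows; stratum fixed effects and survey covariates are omitted for brevity, and all variable names and magnitudes are illustrative assumptions rather than the paper's data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000                                       # candidate - polling substation rows
treat = rng.integers(0, 2, n)                  # polling center treatment status
connected = rng.integers(0, 2, n)              # candidate has the political connection
investigated = np.maximum(connected, rng.integers(0, 2, n))  # connections seen only if investigated
# simulated outcome: connected candidates get more votes; treatment removes part of that
votes = 2 + 20 * connected - 5 * treat * connected + rng.normal(0, 1, n)

# design: constant, Treat, Treat x Connection, Treat x Investigated, Connection, Investigated
X = np.column_stack([np.ones(n), treat, treat * connected,
                     treat * investigated, connected, investigated])
beta = np.linalg.lstsq(X, votes, rcond=None)[0]
# beta[2] estimates the treatment effect specific to connected candidates
```

In practice the paper clusters standard errors at the polling center level; the point here is only the layout of the interaction terms in the design matrix.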
To succinctly summarize impact, we also identify highly influential candidates using an index
based on our political connections data. We identify one elite candidate in each of the 19 political
constituencies by constructing the following index from the data on political connections described
in Section 3:
Index_i = Karzai_i + Government_i + DEO_i + PEO_i.
Karzai_i equals 1 for a candidate i with an indirect connection to Karzai (e.g., through a relative)
and 2 for a direct connection (e.g., serving directly with the president), Government_i equals 1 for
holding a minor government post since 2001 (e.g., teacher) and 2 for holding a major government
post (e.g., parliamentarian), DEO_i equals 1 when connected to the District Elections Officer, and
44This approach follows the existing literature on the impact of election monitoring (Hyde 2007).
45Regressions reported in Tables 7 and 8 are estimated on a sample of 461 polling centers when control variables are not included and 440 polling centers when control variables are included. The reason for this is that we could not obtain voting outcomes for 10 of our 471 polling centers during the web scrape and we lack control variables for an additional 21 polling centers.
PEO_i equals 1 when connected to the Provincial Elections Officer. We term the candidate with the
highest index score who is also among the top 10 vote recipients in control polling centers the
"most connected candidate." Figure A3 depicts average votes in treatment and control polling
centers for each type of candidate connection.
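The index and the selection of the most connected candidate can be sketched as follows; the candidate records and vote totals below are hypothetical:

```python
def connection_index(c):
    """Index_i = Karzai_i + Government_i + DEO_i + PEO_i, as defined in the text."""
    return c["karzai"] + c["government"] + c["deo"] + c["peo"]

def most_connected(candidates, control_votes, top_n=10):
    """Highest-index candidate among the top `top_n` vote recipients
    in control polling centers for the constituency."""
    ranked = sorted(candidates, key=lambda c: control_votes[c["name"]], reverse=True)
    return max(ranked[:top_n], key=connection_index)

# hypothetical candidate records (karzai/government coded 0/1/2, deo/peo coded 0/1)
candidates = [
    {"name": "A", "karzai": 2, "government": 2, "deo": 1, "peo": 1},  # direct ties
    {"name": "B", "karzai": 1, "government": 0, "deo": 0, "peo": 0},
    {"name": "C", "karzai": 0, "government": 1, "deo": 0, "peo": 1},
]
control_votes = {"A": 320, "B": 55, "C": 140}
elite = most_connected(candidates, control_votes, top_n=2)  # candidate "A"
```

The top-10 vote screen keeps the definition from labeling a well-connected but electorally marginal candidate as the constituency's elite candidate.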
Column (1) of Panel A of Table 7 reports the impact of treatment for all candidates, finding no
evidence of an effect. Columns (2) - (5) provide estimated effects from regressions
interacting treatment with a dummy variable equal to 1 for the most connected candidate using
several different specifications. For all of these specifications, we find that the most connected
candidate obtains about 17 to 20 votes at a given polling substation. In columns (2) - (4) we see
that treatment reduced votes for most connected candidates from an average of about 21 votes to
about 15 votes (a reduction of about 25 percent). Column (5) reports estimates trimming the top
percentile of votes for the interacted candidate type. The most connected candidates are typically
among the very highest vote recipients. We trim to ensure that our effects are not driven by a few
extreme outliers. Removing these observations both lowers the number of votes powerful candidates
received in controls and reduces the estimated treatment effect by about one vote, though effects
remain significant. As in column (1), there is no evidence of effect on the remaining candidates
without recorded connections. This provides some evidence that the effect is localized to elite
candidates, consistent with political connections playing a role in facilitating access to aggregation
fraud.
Panel B of Table 7 reports results from regressions interacting treatment with specific candidate
categories and a dummy for whether they are among the set of 57 candidates with background
investigations data. The point estimates suggest a negative effect for all investigated candidates.
The estimated joint effect of being investigated and having a connection is statistically different
from zero at the 95 percent level for candidates with a provincial aggregator connection in column
(1) and for candidates with a history of government service in column (3). As in Panel A, there
appears to be no effect on candidates without recorded connections and connected candidates obtain
higher vote totals.
To benchmark the size of the impacts estimated in Table 7, Section A.2 examines the impact
of treatment on whether candidates score enough votes at a given polling substation to be ranked
above the threshold required to obtain office in their constituency. Specifically, we estimate the
effect of treatment on a dummy variable that equals 1 for every candidate - polling substation
observation recording enough votes to rank among the winning candidates at the polling station.
The largest effects are for candidates connected to the provincial aggregator (column (2) of Table
A2). These candidates have a 57 percent chance of ranking above the victory threshold in control
polling stations. Point estimates indicate that treatment reduced the probability of winning by 11.5
percentage points. The degree of the reduction suggests that, for powerful candidates, aggregation
fraud is severe enough to play a meaningful role in determining final election outcomes.
5.3 Theft and Damaging of Forms:
On the day after the election, our field staff visited all 471 operating polling centers in the ex-
perimental sample. During the visit, they attempted to photograph returns forms. If the forms
were missing, they investigated whether any of the materials had been stolen or damaged during
the night of the election.46 Investigations involved interviewing residents living in the immediate
vicinity of polling centers.47 In this section, we estimate impacts on forms reported as stolen or
damaged by candidate agents, who are candidate proxies legally permitted to observe polling and
are typically present at most polling centers in their candidate’s constituency. Candidate agents
are supporters of the candidate who play a crucial role assisting candidates at all stages of the
campaign. Candidate agents stole or damaged materials at 62 (13.16 percent) of the 471 operating
polling centers.
There are several reasons that stealing or damaging forms may reflect an intention to change vote
totals during aggregation. Records of citizen and candidate complaints provided by the Electoral
Complaints Commission indicate that the purpose of stealing forms was to take them to a separate
46These investigations produce 44 reports of candidate agents stealing the returns form along with the ballot boxes and other election materials, 18 reports of candidate agents merely tearing down the returns form, 15 reports of citizens stealing returns forms, 17 reports of citizens tearing down returns forms, and 28 reports of security officials stealing materials or denying our interviewers access to photograph them.
47While election commission staff should have vacated the polling center at the end of election day, we trained our staff to investigate by only interviewing local community members and not to engage commission staff so as to not create a measurement effect. While this would not affect the internal validity of our estimates of program effect, our aim was to minimize the additional monitoring effect for the entire sample. Results obtained with these data merit some caution as our field staff could only rely on reports of nearby citizens to determine the reason that Declaration of Results forms are missing.
Table 7: Estimates of Impact on Total Votes by Candidate Connection
Dependent Variable: Votes
Panel A - Impacts on Most Connected Candidates (1) (2) (3) (4) (5)
Letter Treatment (=1) -0.066 -0.034 0.030 0.019 -0.033
(0.212) (0.196) (0.045) (0.056) (0.054)
Treat x Most Connected -5.430* -5.392* -5.780* -4.634**
(3.111) (3.080) (3.245) (2.347)
Most Connected (=1) 20.871*** 18.604*** 19.746*** 16.794***
(2.391) (2.364) (2.489) (1.787)
Constant 1.410*** 1.301*** 1.279*** 1.952*** 1.808***
(0.156) (0.144) (0.032) (0.108) (0.106)
Stratum FEs No No Yes Yes Yes
Full Covariates No No No Yes Yes
Trimming top 1% of votes for interacted candidate type No No No No Yes
R-Squared 0.000 0.024 0.079 0.077 0.094
# Polling Centers 461 461 461 440 440
# Candidate - Polling Station Observations 386855 386855 386855 309947 309822
Mean of Dep. Var. in Controls 1.410 1.410 1.410 1.724 1.629
Mean of Dep. Var. in Controls for Interacted Connection - 22.172 22.172 23.516 20.170
Dependent Variable: Votes
Panel B - Impacts by Connection Type (1) (2) (3) (4) (5)
Letter Treatment (=1) -0.010 0.005 0.010 -0.011 -0.029
(0.052) (0.053) (0.053) (0.052) (0.049)
Treat x Investigated -2.864* -1.583 -2.831* -1.065 -2.472
(1.473) (1.641) (1.680) (2.711) (1.851)
Treat x Provincial Aggregator Connection -4.896
(3.312)
Treat x Prov. and District Aggregator Connection -3.293
(4.368)
Treat x Government Service -2.399
(3.142)
Treat x Karzai Connection 0.567
(2.302)
Provincial Aggregator Connection (=1) 5.930**
(2.635)
Prov. and District Aggregator Connection (=1) 5.168
(3.539)
Government Service (=1) -0.541
(2.276)
Karzai Connect. (=1) -3.323*
(1.786)
Investigated (=1) 15.788*** 14.524*** 15.993*** 16.429*** 15.958***
(1.122) (1.186) (1.300) (1.951) (1.436)
Constant 1.667*** 1.732*** 1.726*** 1.669*** 1.536***
(0.105) (0.107) (0.106) (0.105) (0.097)
Stratum FEs Yes Yes Yes Yes Yes
Full Covariates Yes Yes Yes Yes Yes
Trimming top 1% of votes for interacted candidate type Yes Yes Yes Yes No
R-Squared 0.123 0.118 0.117 0.123 0.131
# Polling Centers 440 440 440 440 440
# Candidate - Polling Station Observations 309843 309893 309900 309852 309699
Mean of Dep. Var. in Controls 1.639 1.673 1.677 1.646 1.562
Mean of Dep. Var. in Controls for Interacted Connection 18.713 24.147 24.834 18.723 15.312
P-Value - H0 : Treat x Investigated + Treat x Connection = 0 0.052 0.031 0.141 0.047 0.201
Notes: This table reports effects on post-aggregation vote totals by candidate connection. The unit of observation is the candidate - polling
substation. Each column reports the results of an OLS regression where total post-aggregation votes is the dependent variable and dummy variables
that equal 1 for treatment, when the candidate - polling substation observation has the indicated connection, when the candidate is investigated, and
their interactions are the independent variables. Note that we only observe connections for investigated candidates. Panel A reports treatment effect
for the most connected candidate. Please see the text for details on how the most connected candidate variable is computed. Column (1) reports
a bivariate treatment regression with no interaction terms; Column (2) reports the results when treatment is interacted with the most connected
candidate; Column (3) reports a similar regression with stratum fixed effects included; Column (4) adds controls for the share of respondents
reporting that they are Pashtun, Tajik, and whether the polling center was visited by international election monitors; Column (5) trims the top
percentile of the dependent variable. Panel B reports results interacting the treatment dummy with each of the candidate connection dummies.
All regressions reported in Panel B include stratum fixed effects, controls for the share of respondents reporting that they are Pashtun, Tajik, and
whether the polling center was visited by international election monitors; and have the top percentile of the dependent variable trimmed. The
number of candidate - polling substations observations change slightly because of duplicates values at the top percentile. The final two rows in Panel
A and in Panel B report the mean of the dependent variable in the controls and in the controls among observations for the interacted candidate type.
Level of significance: *p < 0.1, **p < 0.05, ***p < 0.01. Robust standard errors clustered at the polling center level are reported in parentheses.
location, alter them, and then reinsert them into the aggregation process. For example, at the
Sayedullah Khan Bazaar High School in Terin Kot in Urozgan province, a candidate reports: “382
votes were cast, but then the voting papers were inexplicably lost. Later that evening, I observed the
brother of a leading candidate replacing the vote papers into the boxes.” Alternatively, candidates
might seek to destroy all evidence of the polling center count, and then manufacture entirely new
results forms. Correspondingly, an empowered Polling Center Manager, who knows that the absence
of forms will be photographically documented, might stand up to agents seeking to remove election
materials. While we lack data to know specifically how this happens, a reduction in this measure
may correspond to a reduction in fraud. Table 8 provides several specifications estimating effects
on this measure.48 Estimates in column (1) indicate that treatment reduced this specific type of
theft from 18.9 percent in control polling centers to about 8 percent in treatment polling centers (a
reduction of about 60 percent). These estimates are robust to adjusting specifications to include stratum fixed effects
(column 2), stratum fixed effects and covariates (column 3), estimation using logit (column 4) and
conditional fixed effects logit (column 5).
Table 8: Impacts on Form Theft
Dependent Variable: Election Returns Form Damaged (=1)
(1) (2) (3) (4) (4a) (5) (5a)
OLS OLS OLS Logit ∂y/∂x Cond. Log. ∂y/∂x
Letter Treatment (=1) -0.108*** -0.111*** -0.110*** -0.975*** -0.106*** -1.072*** -0.220***
(0.032) (0.031) (0.032) (0.302) (0.032) (0.371) (0.058)
Constant 0.189*** 0.191*** 0.211*** -1.013***
(0.026) (0.025) (0.078) (0.317)
Stratum FEs No Yes Yes No Yes
Full Covariates No No Yes Yes Yes
R-Squared [Log-Likelihood] 0.025 0.218 0.228 [-165.651] [-88.087]
# Observations 461 461 440 440 440 249
Notes: This table reports on the damaging of election returns form by treatment group. The unit of observation is the polling center.
Columns (1) report results from an OLS regression where a dummy variable that is equal to 1 if any forms were damaged at the polling
center is the dependent variable and a dummy variable equal to 1 for treatment polling centers is the independent variable. Column (2)
adds stratum fixed effects. Column (3) adds controls for the share of respondents reporting that they are Pashtun, Tajik, and whether
the polling center was visited by international election monitors. Column (4) reports a logit specification using the same dependent
and independent variables as Column (3). Column (4a) reports the marginal effect corresponding to Column (4). Column (5) reports
a conditional logit fixed effects regression. 191 polling centers are dropped from this specification as within 32 of the strata there is
no variation in the dependent variable. Column (5a) reports the marginal effect corresponding to Column (5). Level of significance:
*p < 0.1, **p < 0.05, ***p < 0.01. Robust standard errors are reported in parentheses.
48We constrain the sample to the 461 polling centers for which we have post-aggregation vote data used in Table 7. Impacts are virtually identical relaxing this constraint.
5.4 Tests for spatial externalities
We next test for spatial externalities. We do this for four reasons. First, treatment externalities
may confound our estimates of program effect, leading us to overstate impact if fraud is merely
displaced and to understate impact if fraud is reduced in neighboring polling centers because of a
“chilling effect” (Miguel and Kremer 2004).49 Second, displacement effects and “chilling effects”
should be factored in when considering the efficacy of photo quick count.50 Third, tests for spatial
externalities can provide some information about how aggregation fraud is perpetrated. Fourth,
the estimated externalities, which indicate that monitoring announcements have negative effects on
fraud in neighboring polling centers, provide additional evidence of a high elasticity of fraud with
respect to the threat of monitoring.
The tests in this section are necessarily speculative. An ideal design allows an explicit com-
parison between various treatment densities against a set of pure controls. We could not achieve
this for several reasons. For example, we could not treat rural polling centers, which comprise the
majority of polling centers, due to safety concerns.51
A simple, but imperfect, test for externalities is to compare votes for the most connected
candidate in control polling substations which have no treated neighbors within a two kilometer
halo (N = 196) and in the remaining set of controls which have some treated neighbors within two
kilometers (N = 704). We find that the average number of votes for the most connected candidates
in the first group (42.939) is much larger than in the remaining controls (17.784) and that this
difference is statistically significant (p = 0.004, clustering standard errors at the polling center
level). This suggests a negative treatment externality.
49Because candidates run "at large" within a province, they may attempt substitution of legitimate and fake ballots anywhere within a province. Announcing monitoring in any polling center may therefore create treatment externalities for other polling centers in a province. In practice, however, candidates may face limits to displacing votes across polling substations within a province. Candidates mostly garner their votes in their home districts or towns where they remain popular, or from pockets of spatially clustered family members. For example, given the extreme ethnic segregation of neighborhoods in Afghanistan, a Tajik candidate receiving a considerable number of votes in a solidly Pashtun part of a province could raise immediate suspicions of fraud.
50Ichino and Schundeln (2012) find evidence that domestic observers monitoring the voter registration process during the 2008 election in Ghana displaced some irregularities to unmonitored electoral areas.
51Additionally, with only 34 electoral constituencies, the Afghan electoral system affords limited power for tests at the constituency level.
To examine this further, we estimate treatment externalities with the specification:
Y_ics = ϕ_1 + ϕ_2 Letter Treatment_c + ϕ_3 Treated^{1km}_c + ϕ_4 Total^{1km}_c + ϕ_5 Treated^{1-2km}_c + ϕ_6 Total^{1-2km}_c + ϕ′_7 X_c + η_ics.   (4)
Y_ics is the number of votes for the most connected candidate i in polling substation s in polling
center c, Treated^{1km}_c is a dummy equal to 1 if any neighbors of polling center c within 1 km are
treated, Total^{1km}_c is the total number of polling centers within a one kilometer halo, and variables
are similarly defined for polling centers between one and two kilometers away. As in Miguel and
Kremer (2004), our identifying assumption is that, conditional on the number of polling centers
within a fixed halo, the number treated is random. Table A9 reports tests verifying this identifying
assumption.52 We additionally estimate several variants of Specification 4, providing the full set
of interactions with Letter Treatment and separating Treated^{1km}_c into a set of 5 dummy variables
to test for nonlinear effects in local treatment saturations.53 Table A5 provides summary statistics
for the data used in this analysis.
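Constructing the halo variables from polling-center coordinates might look like the following sketch; the coordinates are hypothetical and expressed in planar kilometers, whereas real data would use a geodesic distance:

```python
import math

def halo_counts(centers, km_inner=1.0, km_outer=2.0):
    """For each polling center, count total and treated neighbors within
    the inner halo and within the inner-to-outer ring."""
    out = {}
    for c in centers:
        t1 = n1 = t2 = n2 = 0
        for other in centers:
            if other is c:
                continue
            d = math.dist((c["x"], c["y"]), (other["x"], other["y"]))
            if d <= km_inner:            # neighbor inside the 1 km halo
                n1 += 1
                t1 += other["treat"]
            elif d <= km_outer:          # neighbor in the 1-2 km ring
                n2 += 1
                t2 += other["treat"]
        out[c["name"]] = {"total_1km": n1, "treated_1km_any": int(t1 > 0),
                          "total_1_2km": n2, "treated_1_2km_any": int(t2 > 0)}
    return out

# toy layout (hypothetical): three centers, two treated
centers = [{"name": "p1", "x": 0.0, "y": 0.0, "treat": 0},
           {"name": "p2", "x": 0.5, "y": 0.0, "treat": 1},
           {"name": "p3", "x": 0.0, "y": 1.5, "treat": 1}]
halos = halo_counts(centers)
```

Conditioning on `total_1km` and `total_1_2km` while using the treated counts mirrors the identifying assumption above: given the number of neighbors in each halo, which of them were treated is random.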
To assess the extent to which externalities confound our estimates of program effect, the first
column of Table 9 estimates impact only for the most connected candidate for comparison with
columns (2) - (4), which control for spatial externalities. The estimate increases slightly when
controlling for spatial externalities, but remains around four fewer votes for the most connected
candidate.54 Column (2) provides evidence of a negative externality, consistent with a “chilling
effect.” The estimates in column (3) suggest that having a neighbor treated within one kilometer
has a stronger negative externality (-6.74 votes) than having a neighbor treated between one and
two kilometers away (-4.74 votes). Column (4) separates out the effects by the number of neigh-
52We check this assumption by estimating Specification 4, replacing the dependent variable with the same baseline variables that we use to test for treatment assignment balance in Table 4.
53In our data, the maximum number of polling centers within a one kilometer radius is five, so we create categorical dummies for one treatment polling center within one kilometer, two treatment polling centers within one kilometer, and so on.
54The small change in the estimates of core impact despite large negative externalities in neighboring polling centers may be because own-treatment status is weakly correlated with the treatment status of nearby polling centers. To see this, imagine we estimate only y_c = β T^{own}_c + η_c when y_c = β T^{own}_c + γ T^{neighbor}_c + ε_c is the true model and cov(T^{own}_c, T^{neighbor}_c) ≈ 0. Then, since by the standard omitted-variable result E[β̂] = β + γ cov(T^{own}_c, T^{neighbor}_c)/var(T^{own}_c), we have E[β̂] ≈ β even when γ is large. Intuitively, this is equivalent to saying that there are a large number of valid controls mixed in with the contaminated controls.
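The argument in footnote 54 can be checked numerically. This simulation is illustrative (not from the paper's data): when own and neighbor treatment are independent, the short-regression slope stays near β even with a large externality γ:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
t_own = rng.integers(0, 2, n).astype(float)
t_neighbor = rng.integers(0, 2, n).astype(float)   # independent draws, so cov(t_own, t_neighbor) ~ 0
beta, gamma = -4.0, -6.0                           # large externality gamma
y = beta * t_own + gamma * t_neighbor + rng.normal(0.0, 1.0, n)

# short regression omitting the neighbor term: slope = cov(y, t_own) / var(t_own)
b_short = np.cov(y, t_own)[0, 1] / np.var(t_own, ddof=1)
# omitted-variable formula: E[b_short] = beta + gamma * cov(t_own, t_neighbor) / var(t_own)
bias = gamma * np.cov(t_own, t_neighbor)[0, 1] / np.var(t_own, ddof=1)
```

With the two treatment indicators nearly uncorrelated, the bias term is close to zero and `b_short` recovers β to within sampling noise.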
Table 9: Spatial Treatment Externalities
Dependent Variable: Votes for theMost Connected Candidate
(1) (2) (3) (4)
Letter Treatment (=1) -4.080** -4.183** -4.290** -4.159**
(2.009) (1.982) (1.956) (1.980)
Any PCs treated within 1km (=1) -6.877* -6.742*
(3.512) (3.486)
Total PCs within 1km -0.597 -0.499 -1.256
(0.566) (0.564) (0.806)
Any PCs treated within 1-2km (=1) -4.738 -4.681
(4.244) (4.240)
Total PCs within 1-2km 0.103 0.223
(0.378) (0.392)
1 treated PC within 1km (=1) -6.457*
(3.613)
2 treated PCs within 1km (=1) -5.831
(3.882)
3 treated PCs within 1km (=1) -3.007
(4.858)
4 treated PCs within 1km (=1) 1.459
(5.620)
5 treated PCs within 1km (=1) -1.334
(6.929)
Constant 28.064*** 30.543*** 32.378*** 32.697***
(6.017) (6.043) (7.004) (6.987)
R-Squared 0.276 0.290 0.292 0.294
Trimming top 1% of votes for interacted candidate type Yes Yes Yes Yes
# Polling Centers 439 439 439 439
# Candidate - Polling Substation Observations 1841 1841 1841 1841
Mean Dep. Var. in controls with no treated PCs 0-2km 42.939 42.939 42.939 42.939
Notes: Each column provides results from an OLS regression of the post-aggregation vote total for the most connected candidate
on a treatment dummy and variables to account for the number of neighboring polling centers that received treatment. These
are: Any PCs treated within 1km (1-2km) is a dummy variable that equals 1 if any polling center within a 1 kilometer (1 to 2 kilometer)
radius received a treatment notification letter; Total PCs within 1km (1-2km) is a variable equal to the total number of polling centers
within a 1 kilometer radius (1 to 2 kilometer radius); 1 treated PC within 1km is a dummy
that equals 1 if 1 polling center within a 1 km radius was treated and so on for 2 to 5 polling centers. The maximum number
of polling centers treated within a kilometer in our sample is 5. The unit of observation is the candidate - polling substation.
The sample is restricted to the most connected candidate only. The most connected candidate is identified using the procedure
described in Section 5. The top percentile is trimmed and stratum fixed effects and controls for the share of respondents reporting
that they are Pashtun, Tajik, and whether the polling center was visited by international election monitors are included in all
specifications. Level of significance: *p < 0.1, **p < 0.05, ***p < 0.01. Robust standard errors clustered at the polling center
level reported in parentheses.
bors treated. Estimates here are imprecise, but they are consistent with treatment externalities
decreasing in the number of neighbors treated.55
55This pattern suggests that polling center managers both actively communicate with one another aboutexternal monitoring and that they may be sensitive to changes in the degree of monitoring. These commu-nications may play some role in shaping managers’ beliefs about the probability that their results will beaudited. The diminishing pattern is consistent with updating being greatest in response to the arrival of thefirst letter in a neighborhood.
6 Conclusion
Our results demonstrate that aggregation fraud represents a threat to electoral fairness and that
photo quick count provides a means of addressing this problem. Our data provide, to our knowledge,
the first direct measurement of aggregation fraud and also the first systematic evidence of candidates
using their political networks to infiltrate electoral institutions. We discuss three direct policy
implications and three broader implications for electoral competition. We conclude with directions
for future research.
Our data yield three actionable policy results. First, photo quick count is effective, scalable,
well-suited to citizen-based implementation and “viral” adoption, and is cost-effective relative to
traditional international election monitoring.56 We successfully visited 471 polling centers, with a
total budget of just over 100,000 USD.57 By contrast, the largest foreign mission during this election
reached about 85 polling centers, spanning a smaller physical area than in our sample, with a budget
of approximately 10 million USD.58 Moreover, as cellular networks densify and bandwidth increases,
the potential for photo quick count to scale should expand and costs should decrease. Importantly,
this approach may also be sustainable with limited international support. The need for similar
interventions is likely to increase as developed countries transition to a “light footprint” approach
to international engagement. Second, no matter how it is addressed, aggregation fraud appears
to be a serious problem. Estimating the impact of aggregation fraud relative to other types of
fraud lies beyond our data. Nonetheless, the sheer magnitude of the problem and the centrality
of vote aggregation to the electoral process suggest that future efforts to improve elections should
target this problem. Last, improving the independence of electoral institutions and ensuring that
candidates cannot infiltrate them is necessary to achieve fair elections.
56Viral adoption refers to the adoption of new technology by independent actors based on widely available information. Since completing our study, civil society groups in several countries have begun to implement photo quick count. For example, during the 2014 Afghan Presidential Election, the method was used to measure aggregation fraud at 80 percent of active polling centers.
57Costs were contained primarily because we used local staff. Photo quick count can be implemented as effectively by local staff without any special training by international election monitoring experts.
58Along these lines, the European Union spends approximately 4 million USD on an average election observation mission (European Union 2006). During the 2003 Cambodian National Assembly election, the European Union spent approximately 1.5 million USD to deploy 120 observers (Lopez-Pintor and Fischer 2006). To our knowledge, the European Union and the Asia Foundation are the only major international election monitoring organizations that publish data on the cost of their missions.
Aggregation fraud, in our data, is both prevalent and disproportionately beneficial to connected
candidates. This has at least three negative implications for electoral competition. First, this
creates a substantial barrier to entry for unconnected candidates. We find evidence that powerful
candidates can both add votes for themselves and subtract votes for contenders. In this situation,
competitors without connections may be at a serious disadvantage. In extreme cases, they may not
be able to win at all without these connections, regardless of their competence. Second, candidates
have incentives to cultivate connections in order to win elections. This may create reasons for
candidates to favor clientelistic strategies over programmatic ones. Last, the prevalence of
aggregation fraud fundamentally limits the ability of voters to express their preferences through
the ballot. The candidate who wins may be the one with the strongest connections, not the one
preferred by voters. This would limit the disciplining role of elections.
Our findings produce a set of questions for future research. First, do informal connections
between state actors explain corruption and accountability failures in other settings, such as state
service provision? Second, a natural extension of this research involves investigating the longer
term effects of fraud reduction on the effectiveness of government in improving social welfare and
on citizens’ attitudes toward government—an important consideration in states where government
legitimacy is contested. Third, our paper suggests further work on institutional reforms that
might improve the effectiveness and sustainability of government monitoring efforts. Finally, and
perhaps more practically, our results suggest that identifying and operationalizing innovative uses
of technology to quickly gather information on corruption, waste, and abuse is a promising direction
for research and for policy.
References
Aker, Jenny, Paul Collier, and Pedro C. Vicente. 2013. “Is Information Power: Using
Cellphones During an Election in Mozambique.” Oxford University Working Paper.
Atanassova, Antonia, Marianne Bertrand, and Paul Niehaus. 2013. “Targeting with
Agents.” American Economic Journal: Economic Policy, 5(1): 206 – 238.
Beber, Bernd, and Alexandra Scacco. 2012. “What the Numbers Say: A Digit-Based Test for
Election Fraud Using New Data from Nigeria.” Political Analysis, 20(2): 211–234.
Berman, Eli, Jacob N. Shapiro, and Joseph H. Felter. 2011. “Can Hearts and Minds
Be Bought? The Economics of Counterinsurgency in Iraq.” Journal of Political Economy,
119(4): 766 – 819.
Bertrand, Marianne, Simeon Djankov, Rema Hanna, and Sendhil Mullainathan. 2007.
“Obtaining a Driver’s License in India: An Experimental Approach to Studying Corruption.”
The Quarterly Journal of Economics, 122(4): 1639–1676.
Besley, Timothy, and Robin Burgess. 2002. “The Political Economy of Government Respon-
siveness: Theory and Evidence from India.” Quarterly Journal of Economics, 117(4): 1415–1452.
Besley, Timothy, and Torsten Persson. 2011. “The Logic of Political Violence.” Quarterly
Journal of Economics, 126(3): 1411–1445.
Besley, Timothy, Rohini Pande, and Vijayendra Rao. 2005. “Participatory Democracy in
Action: Survey Evidence from South India.” Journal of the European Economic Association,
3(2-3): 648–657.
Bruhn, Miriam, and David McKenzie. 2009. “In Pursuit of Balance: Randomization in
Practice in Development Field Experiments.” American Economic Journal: Applied Economics,
1(4): 200–232.
Cadot, Olivier. 1987. “Corruption as a Gamble.” Journal of Public Economics, 33(2): 223–244.
Callen, Michael, and Nils Weidmann. 2013. “Violence and Election Fraud: Evidence from
Afghanistan.” British Journal of Political Science, 43(1): 53–75.
Cameron, A. Colin, Jonah B. Gelbach, and Douglas L. Miller. 2011. “Robust Inference
With Multiway Clustering.” Journal of Business & Economic Statistics, 29(2): 238–249.
Chattopadhyay, Raghabendra, and Esther Duflo. 2004. “Women as Policy Makers: Evidence
from a Nationwide Randomized Experiment in India.” Econometrica, 72(5): 1409–1443.
Cowan, Glenn, Melissa Estok, and Neil Nevitte. 2002. The Quick Count and Election Ob-
servation. National Democratic Institute for International Affairs.
Di Tella, Rafael, and Ernesto Schargrodsky. 2003. “The Role of Wages and Auditing during
a Crackdown on Corruption in the City of Buenos Aires.” Journal of Law and Economics,
46(1): 269–292.
Duflo, Esther, Rema Hanna, and Stephen Ryan. 2012. “Incentives Work: Getting Teachers
to Come to School.” American Economic Review, 102(4): 1241–1278.
European Union. 2006. “European Union: External Action FAQ.”
Ferraz, Claudio, and Frederico Finan. 2008. “Exposing Corrupt Politicians: The Effects of
Brazil’s Publicly Released Audits on Electoral Outcomes.” The Quarterly Journal of Economics,
123(2): 703–745.
Ferraz, Claudio, and Frederico Finan. 2011. “Electoral Accountability and Corruption: Evi-
dence from the Audits of Local Governments.” American Economic Review, 101: 1274–1311.
Fisman, Raymond. 2001. “Estimating the Value of Political Connections.” American Economic
Review, 91(4): 1095–1102.
Fujiwara, Thomas. 2013. “Voting Technology, Political Responsiveness, and Infant Health: Evi-
dence from Brazil.” Mimeo, Princeton University.
Gerber, Alan S., and Donald P. Green. 2012. Field Experiments: Design, Analysis, and
Interpretation. W.W. Norton.
Hyde, Susan D. 2007. “The Observer Effect in International Politics: Evidence from a Natural
Experiment.” World Politics, 60(1): 37–63.
Ichino, Nahomi, and Matthias Schundeln. 2012. “Deterring or Displacing Electoral Irregular-
ities? Spillover Effects of Observers in a Randomized Field Experiment in Ghana.” The Journal
of Politics, 74: 292–307.
Khwaja, Asim Ijaz, and Atif Mian. 2005. “Do Lenders Favor Politically Connected
Firms? Rent Provision in an Emerging Financial Market.” Quarterly Journal of Economics,
120(4): 1371–1411.
Lee, David S. 2009. “Training, Wages, and Sample Selection: Estimating Sharp Bounds on
Treatment Effects.” Review of Economic Studies, 76(3): 1071–1102.
Lopez-Pintor, Rafael, and Jeff Fischer. 2006. “Cost of Registration and Elections (CORE)
Project.” Center for Transitional and Post-Conflict Governance.
Martinez-Bravo, Monica, Gerard Padro i Miquel, Nancy Qian, and Yang Yao. 2013.
“Elections and Economic Policy: Evidence from Rural China.”
McChrystal, Stanley. 2009. “COMISAF’s Initial Assessment to the Secretary of Defense.”
McGee, Rosie, and John Gaventa. 2011. “Shifting Power? Assessing the Impact of Trans-
parency and Accountability Initiatives.” Institute of Development Studies Working Paper.
Mebane, Walter R. Jr. 2008. “Election Forensics: The Second-Digit Benford’s Law Test and
Recent American Presidential Elections.” In Election Fraud: Detecting and Deterring Electoral
Manipulation. , ed. R. Michael Alvarez, Thad E. Hall and Susan D. Hyde, 162–181. Washington,
DC: Brookings Institution Press.
Miguel, Edward, and Michael Kremer. 2004. “Worms: Identifying Impacts on Education and
Health in the Presence of Treatment Externalities.” Econometrica, 72(1): 159–217.
Myagkov, Mikhail, Peter C. Ordeshook, and Peter Shakin. 2009. The Forensics of Election
Fraud: Russia and Ukraine. New York: Cambridge University Press.
National Democratic Institute. 2010. “Afghanistan Election Data.” http://2010.afghanistanelectiondata.org/.
Olken, Benjamin A. 2007. “Monitoring Corruption: Evidence from a Field Experiment in In-
donesia.” Journal of Political Economy, 115(2): 200–249.
Olken, Benjamin A., and Patrick Barron. 2009. “The Simple Economics of Extortion: Evi-
dence from Trucking in Aceh.” Journal of Political Economy, 117(3): 417–452.
Olken, Benjamin A., and Rohini Pande. 2011. “Corruption in Developing Countries.” National
Bureau of Economic Research (NBER) Working Paper No. W17398.
Pande, Rohini. 2003. “Can Mandated Political Representation Increase Policy Influence for Disad-
vantaged Minorities? Theory and Evidence from India.” American Economic Review, 93(4): 1132–1151.
Rashid, Ahmed. 2009. Descent into Chaos: The U.S. and the Disaster in Pakistan, Afghanistan,
and Central Asia. London: Penguin.
Rose-Ackerman, Susan. 1975. “The Economics of Corruption.” Journal of Public Economics,
4(2): 187–203.
Shleifer, Andrei, and Robert W. Vishny. 1993. “Corruption.” Quarterly Journal of Eco-
nomics, 108(3): 599–617.
Svensson, Jakob. 2003. “Who Must Pay Bribes and How Much? Evidence from A Cross Section
of Firms.” Quarterly Journal of Economics, 118(1): 207–230.
United States Army. 2006. “Field Manual 3-24: Counterinsurgency.”
Vogl, Tom S. forthcoming. “Marriage Institutions and Sibling Competition: Evidence from South
Asia.” Quarterly Journal of Economics.
World Bank. 2011. World Development Report 2011: Conflict, Security, and Development. WDR.
Yang, Dean. 2008. “Can Enforcement Backfire? Crime Displacement in the Context of Customs
Reform in the Philippines.” The Review of Economics and Statistics, 90(1): 1–14.
APPENDIX: NOT FOR PUBLICATION
A Appendix
A.1 Measuring Political Connections
Data on the historical political connections of candidates are based on a set of investigations
performed by Indicium consulting, an Afghan consulting firm, at the request of Democracy Inter-
national. These data were collected in the months leading up to the election and were completed
prior to the election. The Indicium research team for this project comprised an international
research lead, two supporting researchers based in Kabul, and one Afghan researcher based in ev-
ery constituency (province). The lead research coordinator had, through years of experience in
Afghanistan, gained considerable knowledge of relevant local dynamics and power structures.
Research into candidate histories proceeded in four stages. First, Democracy International
requested data on a specified list of candidates. In compiling this initial list, Democracy Interna-
tional referenced the 2005 parliamentary outcomes and consulted experts within the organization
and a broader set of Afghanistan experts. Second, Indicium expanded the list based on recom-
mendations from the research team it had convened for this project. Indicium adapted the list
through the course of the campaign, expanding the list when new powerful candidates emerged,
and dropping some candidates as they withdrew. Third, after compiling the list, the Kabul team
and the local staff based in the provinces began a series of expert interviews. Experts interviewed
for this purpose included local journalists, foreign embassy staff, parliamentary candidates (both
the contenders themselves and their rivals), and tribal elders. These interviews were cross-checked
against newspaper reports and against information obtained by Indicium from the United Nations,
the International Security Assistance Force, and the broader intelligence community. Last, after
these steps were completed, Indicium performed a second round of interviews and of cross-checking
before providing the data to Democracy International. The investigations provide data on the political histories of 57 leading candidates, covering the period since the Soviet invasion of Afghanistan in 1979. Table A1 provides summary
statistics for these data.
Table A1: Summary Statistics for Political Connections Data
Variable Mean Std. Dev.
Connected to Provincial Aggregator (=1) 0.491 0.504
Connected to Provincial + District Aggregator (=1) 0.228 0.423
Served in Senior Post Since 2001 (=1) 0.614 0.491
Served in Junior Post Since 2001 (=1) 0.158 0.368
Government Service (=1) 0.772 0.423
Connected Directly to Karzai (=1) 0.298 0.462
Indirectly Connected to Karzai (=1) 0.281 0.453
Karzai Connection (=1) 0.579 0.498
Notes: Each row is a separate dummy variable that equals one if the candidate has
a connection of the indicated type according to the data. Candidates are the unit of
observation. There are 57 candidates in the data. Column (1) reports the average across
these candidates and Column (2) reports the standard deviation.
A.2 Impacts on Electoral Outcomes
To benchmark the size of the impacts estimated in Table 7, we examine the impact of treatment
on whether candidates score enough votes at a given polling substation to be ranked above the
threshold required to obtain office in their constituency. Specifically, we define a dummy variable
that equals 1 for every candidate - polling substation observation recording enough votes to rank
among the winning candidates at the polling station. For example, in Kabul, this dummy variable
is set equal to 1 for the top 33 candidates at each polling station.59 We then estimate Specification
(2) with this dummy as the dependent variable. Table A2 displays the results of this exercise.
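The construction of this winning-rank dummy can be sketched as follows; the data frame, column names, and two-seat constituency are hypothetical, and the tie rule mirrors footnote 59.

```python
import pandas as pd

# Hypothetical candidate - polling substation vote counts (names assumed).
votes = pd.DataFrame({
    "substation": ["s1", "s1", "s1", "s1", "s2", "s2", "s2", "s2"],
    "candidate":  ["a", "b", "c", "d", "a", "b", "c", "d"],
    "votes":      [120, 80, 80, 0, 200, 50, 10, 5],
})

N_SEATS = 2  # winning positions in this hypothetical constituency (33 in Kabul)

# Threshold = the N_SEATS-th highest vote count at each substation; all
# candidates at or above it are coded 1 (ties included), but only if they
# received at least one vote, as in footnote 59.
threshold = votes.groupby("substation")["votes"].transform(
    lambda v: v.nlargest(N_SEATS).min()
)
votes["winner"] = ((votes["votes"] >= threshold) & (votes["votes"] > 0)).astype(int)
```

With these made-up counts, candidates a, b, and c are all coded 1 at substation s1 because b and c tie at the threshold of 80 votes.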
The largest effects are for candidates connected to the provincial aggregator (column 3). These
candidates have a 49.7 percent chance of ranking above the victory threshold in control polling sta-
tions. Point estimates indicate that treatment reduced the probability of winning by 11.5 percentage
points. Correspondingly, estimates suggest an 8.9 percentage point reduction from a 45.3 percent
chance of being above the victory threshold for candidates linked to both aggregators (column 4); a
5.1 percentage point reduction from 57 percent for candidates with a history of government service;
and a 3 percentage point reduction from 54 percent for candidates with a connection to Karzai.
This exercise is necessarily speculative. We are limited in our ability to extrapolate from these
59 In cases where several candidates are tied at the position of the lowest winning rank (e.g. the 33rd position in Kabul), we code all tied candidates as 1. We also require that those tied for the bottom position receive at least one vote.
results because our sample is not representative. If we had a representative sample, then our
estimates should be consistent for the partial equilibrium effect of photo quick count at scale. The
partial equilibrium nature of the exercise implies additional limits; we do not know the degree
of substitutability between aggregation fraud and other types of fraud that we do not observe.
Last, some spatial externalities are present, which we examine below. Nonetheless, the size of the
estimates in this section suggests that the treatment effects we find could be large enough to impact
final election outcomes.
A.3 Impact on Problems During the Vote Count
After votes are cast on election day, the Polling Center Manager at each polling center is responsible
for counting up votes and for completing a Declaration of Results form for each polling substation.
We obtained data on problems during this process from the Electoral Complaints Commission.
These include complaints about the electoral process made by candidates, observers, and candidate
agents. Complaints indicating problems during the count occur widely in our sample. For example,
a complaint made by a candidate about the Charahi Taymani neighborhood in Kabul reports “in
Ismailya Polling substation, ten of my family members voted for me, but the Declaration of Results
Form displayed only seven.” The Electoral Complaints Commission received 5,869 total complaints
regarding the September 2010 parliamentary elections, of which 4,138 were made by candidates and
944 were made about Independent Elections Commission polling officials violating protocols. 650
of the 944 complaints about polling staff were made by candidates. In our sample of 1,977 polling
substations in the 465 operating polling centers, 1,847 complaints were filed with the Electoral
Complaints Commission. 1,217 of these complaints were filed by candidates and 900 were filed
regarding polling center staff. We assess impacts on two measures: (i) the number of complaints
filed by candidates about a given polling substation; and (ii) the number of complaints filed against
election commission staff about a given polling substation.
As the example given above suggests, increased complaints might reflect efforts by the Polling Center Manager
to increase fraud during the count because photo quick count prevents tampering later in the
aggregation process. However, these measures are problematic for at least three reasons. First, we
cannot verify these complaints. Second, while many of the translated complaints report candidates
Table A2: Estimates of Impact on Whether a Candidate Receives Enough Votes to Win
Dependent Variable: Received Enough Votes to be in a Winning Rank Position (=1)
Impacts by Connection Type (1) (2) (3) (4) (5) (6)
Letter Treatment (=1) 0.001 0.001 0.001 0.001 0.001 0.001
(0.002) (0.002) (0.002) (0.002) (0.002) (0.002)
Treat x Most Connected -0.051
(0.039)
Treat x Investigated -0.040 0.004 -0.029 0.001 -0.020
(0.026) (0.030) (0.027) (0.053) (0.041)
Treat x Provincial Aggregator Connection -0.115***
(0.044)
Treat x Prov. and District Aggregator Connection -0.060
(0.051)
Treat x Government Service -0.051
(0.059)
Treat x Karzai Connection -0.030
(0.049)
Most Connected (=1) 0.473***
(0.028)
Provincial Aggregator Connection (=1) -0.106***
(0.032)
Prov. and District Aggregator Connection (=1) -0.140***
(0.039)
Government Service (=1) -0.019
(0.043)
Karzai Connect. (=1) -0.053
(0.036)
Investigated (=1) 0.481*** 0.521*** 0.508*** 0.496*** 0.508***
(0.018) (0.021) (0.019) (0.038) (0.030)
Constant 0.088*** 0.083*** 0.083*** 0.083*** 0.083*** 0.082***
(0.004) (0.004) (0.004) (0.004) (0.004) (0.004)
Stratum FEs Yes Yes Yes Yes Yes Yes
Full Covariates Yes Yes Yes Yes Yes Yes
Trimming top 1% of votes for interacted candidate type Yes Yes Yes Yes Yes No
R-Squared 0.026 0.050 0.051 0.051 0.050 0.049
# Polling Centers 440 440 440 440 440 440
# Candidate - Polling Station Observations 309822 309843 309893 309900 309852 309699
Mean of Dep. Var. in Controls 0.091 0.091 0.091 0.091 0.091 0.090
Mean of Dep. Var. in Controls for Interacted Connection 0.582 0.571 0.497 0.453 0.565 0.540
P-Value: Treat x Investigated + Treat x Connection = 0 0.190 0.128 0.002 0.051 0.078 0.112
Notes: This table reports effects on whether candidates receive enough votes to be in the set of winning candidates at a given polling substation. The
unit of observation is the candidate - polling substation. Each column reports the results of an OLS regression where a dummy variable that equals
1 for every candidate - polling substation observation recording enough votes to rank among the winning candidates at the polling substation is the
dependent variable. Columns (2) through (6) report results from regressions interacting the treatment dummy with each candidate connection type
dummy and a dummy for being investigated. Note we observe connections data only for investigated candidates. All regressions include stratum fixed
effects and controls for the share of respondents reporting that they are Pashtun, Tajik, and whether the polling center was visited by international
election monitors. The final two rows report the mean of the dependent variable in the controls and in the controls among observations for the interacted
candidate type. Level of significance: *p < 0.1, **p < 0.05, ***p < 0.01. Robust standard errors clustered at the polling center level are reported in
parentheses.
manipulating the counting process, complaints in these categories may also be filed for other reasons.
Last, it may be that candidates who are filing complaints do so strategically to try to cancel out their
competitors’ votes. Moreover, the Letter Treatment might merely empower potential complainants.
They might, for example, believe that the photograph of the Declaration of Results form will provide
them a means to substantiate their claim.
Table A3: Impact on Complaints
Dependent Variable: Complaints by Candidates (columns 1-3); Complaints Against Polling Official (columns 4-6)
(1) (2) (3) (4) (5) (6)
Letter Treatment (=1) 1.477** 1.477** 1.302* 0.939 0.988 0.983*
(0.726) (0.749) (0.723) (0.621) (0.622) (0.566)
Constant 1.899*** 1.899*** 2.926 1.476*** 1.451*** 2.357
(0.297) (0.294) (3.742) (0.320) (0.318) (2.749)
Stratum FEs No Yes Yes No Yes Yes
Full Covariates No No Yes No No Yes
R-squared 0.009 0.154 0.177 0.005 0.201 0.238
# Observations 461 461 440 461 461 440
Notes: This table reports the relationship between treatment and complaints registered with the Electoral Complaints Commission. All
columns report OLS specifications. Columns (1) - (3) use the number of complaints filed by candidates as a dependent
variable. Columns (4) - (6) use the number of complaints against a polling official as a dependent variable. Columns
(1) and (4) report from bivariate regressions. Columns (2) and (5) add stratum fixed effects to the respective
preceding specification. Columns (3) and (6) control for stratum fixed effects, the share of respondents who are
Pashtun, Tajik and whether the polling center was visited by international election monitors. Level of significance:
*p < 0.1, **p < 0.05, ***p < 0.01. Robust standard errors are reported in parentheses.
A.4 Aggregation Fraud and Vote Reductions
This section examines the effect of treatment on votes recorded on the declaration of returns form
immediately after the treatment letter was delivered. The objective is to increase our understanding
of how the limited impact of treatment on simple vote differences can be consistent with the
estimates indicating an end-of-the-day reduction of votes in Table 7.
Table A4 reports estimates of the difference between pre-aggregation and post-aggregation
votes using the method of Lee (2009). This table follows Table 6 in the manuscript, with the only
difference being that we have changed the outcome, replacing the absolute vote difference with the
simple vote difference. The final two columns report the lower and upper bound respectively. The
estimates suggest no clear mean effect on vote differences.
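The Lee (2009) trimming procedure behind these bounds can be sketched as follows; the sample sizes, response rates, and simulated outcomes are invented for illustration. The treatment arm, which reports more often, is trimmed by the excess response share from each tail, matching the trimming ratio and bound formulas in Table A4.

```python
import numpy as np

def lee_bounds(y_c, y_t, n_c, n_t):
    """Lee (2009) bounds when the treatment group attrits less.
    p is the excess share of non-missing treatment observations (the
    'trimming ratio'); the treated outcome distribution is trimmed by
    p from one tail for each bound, then the control mean is subtracted."""
    share_c = len(y_c) / n_c              # proportion non-missing, control
    share_t = len(y_t) / n_t              # proportion non-missing, treatment
    p = (share_t - share_c) / share_t     # trimming ratio
    k = int(round(p * len(y_t)))          # treated observations to trim
    y_sorted = np.sort(y_t)
    lower = y_sorted[: len(y_t) - k].mean() - y_c.mean()  # trim top tail
    upper = y_sorted[k:].mean() - y_c.mean()              # trim bottom tail
    return lower, upper

# Hypothetical example: outcomes observed only for a subset of units.
rng = np.random.default_rng(0)
y_control = rng.normal(0.08, 1.0, size=500)   # observed control vote diffs
y_treat = rng.normal(0.24, 1.0, size=800)     # observed treatment vote diffs
lo, hi = lee_bounds(y_control, y_treat, n_c=5000, n_t=5200)
```

The untrimmed treatment-control difference always lies between the two bounds; the paper obtains standard errors for each bound by bootstrapping polling centers.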
This contrasts with Table 6, which reports negative effects for the absolute vote difference. To
understand why we find reductions for the absolute vote difference but not for the simple vote
difference, we plot the densities of the vote addition measures by treatment status for each of the
candidate subsamples in Figure A1. The figure suggests that one of the principal consequences of
treatment for powerful candidates is to increase the frequency with which the pre-aggregation total
perfectly matches the post-aggregation total (i.e., the vote difference equals zero). The densities
depicted in Figure A1 suggest that we observe no effect on the vote difference measure because
treatment reduces rigging both for and against powerful candidates in a way that is roughly sym-
metric.
We next consider how treatment reduced votes for powerful candidates, while having mainly
offsetting effects in terms of rigging both for and against the candidate. Figure A2 depicts Lee
(2009) bounds for pre-aggregation votes, post-aggregation votes, and their difference. Recall that
these are all post-treatment measures. Based on this figure, it appears that treatment operated by
first creating a reduction in votes for powerful candidates in the pre-aggregation number, and then
by creating greater fidelity between the pre-aggregation and post-aggregation totals.
The treatment letter indicated that we would be taking pictures of the results and then posting
them on a website accessible to “the people of Afghanistan, the international community, and local
and international media.” This makes clear that the technique would create broader awareness of
the vote numbers on the declaration of results form. We speculate that the reduction at this stage
may be because treatment reinforced the deterrent effect of posting the form at the polling center
in the first place.
To summarize our findings in this section, we find that treatment both reduced votes recorded
on the declaration of results form at the polling center and reduced rigging both for and against
powerful candidates in roughly equal measure. As a result, treatment diminished the absolute
difference between pre-aggregation and post-aggregation votes and decreased votes for powerful
candidates at the end of the election.
We treat these results as speculative mainly due to data limitations. The estimand in this
case is the treatment effect on polling centers that would report results irrespective of treatment.
This is about 9.7 percent of the total number of candidate polling substation observations in our
[Figure A1: six kernel density panels plotting the vote difference during aggregation (x-axis, roughly -200 to 200 votes) against density (y-axis), separately for treatment and control, for the subsamples Provincial Aggregator Connection (=1), Prov. + Dist. Aggregator Connection (=1), Karzai Connection (=1), Government Connection (=1), Incumbent (=1), and Investigated (=1).]
Figure A1: Vote Difference Densities by Treatment Group
Notes: This figure depicts kernel density estimates of the distribution of vote differences during aggregation. The vote difference during aggregation is created by subtracting the number of votes recorded on pictures of the declaration of results forms from the number of votes for the same candidate - polling station observation reported on the website of the Independent Elections Commission (IEC). The maximum number of votes that can be cast at a polling station is 600. We therefore drop one observation recording -800 votes. Section A.1 describes the candidate connections data used to define candidate subsamples.
experimental sample, and the behavior of the polling center managers who are always willing to
report may differ. Specifically, polling center managers who are willing to post declaration
of results forms irrespective of treatment may use the letter as an excuse not to record fake votes
for candidates on the declaration of returns form. The behavior could be different in our general
sample.
[Figure A2: estimated treatment effects in votes (x-axis, roughly -30 to 10) with bounds for three measures (votes before, votes after, and votes diff), shown for each sample: Full Sample, Prov. Aggr. Connection, Dist. Aggr. Connection, Karzai Connection, Government Service, Incumbent, and Investigated.]
Figure A2: Trimmed Estimates of Effect on Three Measures
Notes: The error bars depict the upper and lower Lee (2009) bound estimates for votes recorded on the declaration of results form at the polling center prior to aggregation (votes before), votes scraped from the website of the Independent Elections Commission (IEC) (votes after), and the difference between votes after and votes before (votes diff). The dots correspond to the point estimates, ignoring attrition. The maximum number of votes that can be cast at a polling station is 600. We therefore drop one observation recording -800 votes. Section A.1 describes the candidate connections data used to define candidate subsamples.
Table A4: Trimmed Estimates of Effect on Vote Differences

Column definitions: (1) mean post-aggregation votes, control; (2) potential observations, control; (3) proportion non-missing, control; (4) mean vote difference, control; (5) potential observations, treatment; (6) proportion non-missing, treatment; (7) mean vote difference, untrimmed treatment; Trimming Ratio = [(6) - (3)]/(6); (8) treatment mean trimmed above the (1-p)th percentile, E[Y|Y < y_(1-p)]; (9) treatment mean trimmed below the pth percentile, E[Y|Y > y_p]; Lower Bound = (8) - (4); Upper Bound = (9) - (4).

Full Sample: (1) 2.318; (2) 183659; (3) 0.097; (4) 0.079 (0.103); (5) 203291; (6) 0.149; (7) 0.238 (0.073); Ratio 0.349; (8) -0.264 (0.200); (9) 0.629 (0.200); Lower -0.343 (0.220); Upper 0.550 (0.219)

Incumbent: (1) 7.501; (2) 10491; (3) 0.118; (4) 0.337 (0.465); (5) 11485; (6) 0.171; (7) 0.898 (0.275); Ratio 0.309; (8) -0.506 (0.388); (9) 1.804 (0.556); Lower -0.843 (0.602); Upper 1.467 (0.744)

Most Connected Candidate: (1) 24.739; (2) 891; (3) 0.111; (4) 1.101 (4.326); (5) 969; (6) 0.160; (7) 4.194 (1.841); Ratio 0.305; (8) -0.935 (1.594); (9) 6.944 (3.720); Lower -2.036 (4.611); Upper 5.843 (5.925)

Prov. + District Aggregator Connection: (1) 36.161; (2) 440; (3) 0.182; (4) 3.750 (4.810); (5) 480; (6) 0.260; (7) 4.320 (2.301); Ratio 0.302; (8) -1.034 (2.123); (9) 7.159 (4.325); Lower -4.784 (5.012); Upper 3.409 (7.030)

Karzai Connection: (1) 21.484; (2) 1549; (3) 0.141; (4) 1.119 (1.877); (5) 1600; (6) 0.204; (7) 2.884 (0.967); Ratio 0.311; (8) -0.569 (0.709); (9) 4.739 (1.556); Lower -1.688 (1.989); Upper 3.620 (2.472)

Government Service: (1) 18.788; (2) 1944; (3) 0.142; (4) 0.422 (1.905); (5) 2089; (6) 0.202; (7) 3.116 (1.100); Ratio 0.293; (8) -1.424 (1.079); (9) 5.822 (1.882); Lower -1.847 (2.214); Upper 5.400 (2.658)

Notes: This table reports the details of calculating bounds for the treatment effect using the method of Lee (2009) for each candidate subsample. The outcome variable is the difference between pre-aggregation votes and post-aggregation votes. The unit of observation is the candidate - polling station. Each row provides results for the indicated candidate subsample. Column (1) reports the mean post-aggregation vote total in the controls. Column (2) (Column (5)) reports the total number of potential observations in the control (treatment) sample. Column (3) (Column (6)) reports the proportion non-missing in the control (treatment) sample. Columns (4), (7), (8), and (9) provide the means in the control, untrimmed treatment, treatment trimmed above the (1-p)th percentile, and treatment trimmed below the pth percentile, respectively. The last two columns report the lower and upper bounds on the treatment effect. Level of significance: *p < 0.1, **p < 0.05, ***p < 0.01. Standard errors obtained by bootstrapping polling centers 250 times with replacement are reported in parentheses.
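The polling-center bootstrap used for the standard errors in Table A4 can be sketched as follows; the vote-difference data, cluster count, and statistic here are invented stand-ins.

```python
import numpy as np

def cluster_bootstrap_se(stat_fn, data_by_center, n_reps=250, seed=1):
    """Standard error of stat_fn obtained by resampling whole polling
    centers with replacement, 250 times as in the Table A4 notes."""
    rng = np.random.default_rng(seed)
    centers = list(data_by_center)
    stats = []
    for _ in range(n_reps):
        # Draw as many centers as exist, with replacement, keeping each
        # center's observations together (clustered resampling).
        draw = rng.choice(len(centers), size=len(centers), replace=True)
        resample = np.concatenate([data_by_center[centers[i]] for i in draw])
        stats.append(stat_fn(resample))
    return float(np.std(stats, ddof=1))

# Hypothetical data: vote differences for 50 polling centers, 12 obs each.
rng = np.random.default_rng(0)
data = {c: rng.normal(0.3, 1.0, size=12) for c in range(50)}
se = cluster_bootstrap_se(np.mean, data)
```

Resampling whole centers rather than individual observations preserves any within-center correlation in the resampled statistic.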
A.5 Data Appendix
We use the following administrative data: (i) systematic political background investigations of
the main candidates; (ii) geographic coordinates and security assessments of polling substations
provided by ISAF; (iii) complaints about illegal election activities filed at the Electoral Complaints
Commission; (iv) disaggregated vote counts from the Independent Elections Commission; and (v)
data on adherence to electoral protocols from the Free and Fair Elections Foundation of Afghanistan
(FEFA). Additionally, we fielded a baseline survey of households living in the immediate vicinity
of 450 of the 471 polling centers in our experimental sample a month before the election (August
2010). The 21 polling centers in the experimental sample not surveyed at baseline are in Kabul.
We subsequently added these using additional funding made available after the baseline. The
survey contained 2,904 respondents. To attempt to obtain a representative sample of respondents
living near polling centers, enumerators employed a random walk pattern starting at the polling
center, with random selection of every fourth house or structure. Respondents within households
were randomly selected using a Kish grid. The sample was split evenly between male and female
respondents, and enumerators conducted the survey in either Dari or Pashto. Last, we obtain a primary measure
of returns form manipulation by sending field staff to investigate whether election materials were
stolen or damaged the day following the election (September 19), which we describe in section 3.
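The two-stage respondent selection described above can be sketched as follows; the street layout, household roster, and the simplified stand-in for the Kish grid are all hypothetical.

```python
import random

random.seed(7)

# Stage 1: random walk from the polling center; enumerators select every
# fourth structure encountered (structures listed here in walk order).
structures = list(range(1, 201))     # 200 hypothetical structures
start = random.randrange(4)          # random starting offset
selected = structures[start::4]      # every fourth structure

# Stage 2: within each selected household, one respondent is chosen at
# random from the roster of eligible adults. A real Kish grid uses a
# preprinted table keyed to household size and questionnaire number; a
# uniform draw is a simplified stand-in with the same selection property.
def select_respondent(roster):
    return random.choice(roster)

household = ["head", "spouse", "eldest son", "daughter-in-law"]
respondent = select_respondent(household)
```

The fixed skip interval keeps the walk's coverage even around the polling center, while the within-household draw avoids always interviewing whoever answers the door.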
A.6 Additional Figures and Tables
Table A5: Additional Summary Statistics
Variable Mean Std. Dev. # Observations
Additional Fraud Measures
Election returns form manipulation (=1) 0.134 0.342 461
Number of Electoral Complaints Commission complaints by candidate 2.649 7.9 461
Number of Electoral Complaints Commission complaints against polling official 1.952 6.724 461
Geospatial Measures
Polling Centers treated within 1km (=1) 0.505 0.501 461
Total Polling Centers treated within 1km 1.089 1.354 461
Total Polling Centers within 1km 2.219 2.508 461
Polling Centers treated within 1-2km (=1) 0.636 0.482 461
Total Polling Centers treated within 1-2km 2.362 2.473 461
Total Polling Centers within 1-2km 4.731 4.677 461
Notes: The first three rows are variables based on complaints data from the Electoral Complaints Commission. All three variables are the
total number of complaints pertaining to the station. The last six rows are computed using GIS based on geographic coordinates for the
polling substations in our sample provided by the International Security Assistance Force (ISAF). The first column reports the mean, the
second column reports the standard deviation, and the third reports the number of observations.
Table A6: Vote Changes During Aggregation by Candidate Type
(Sample: Only Candidates with Investigations Data)
Dependent Variable: Post-aggregation Votes - Pre-aggregation Votes
Panel A - Votes (1) (2) (3) (4) (5) (6) (7)
Provincial Aggregator Connection (=1) 1.160
(1.746)
Prov. + District Aggregator Connection (=1) 2.510 2.727 2.900
(1.910) (1.736) (1.772)
Karzai Connect. (=1) -0.758 -0.970
(2.317) (1.801)
Government Service (=1) -1.577 -1.879
(2.038) (2.834)
Incumbent (=1) -0.570 -0.991 0.005
(1.850) (1.646) (1.981)
Constant 1.768 1.525 2.559 3.330** 2.382 1.971 3.496**
(1.269) (1.241) (2.264) (1.476) (1.746) (1.870) (1.678)
R-Squared 0.001 0.004 0.000 0.001 0.000 0.004 0.006
# Candidates 56 56 56 56 56 56 56
# Polling Stations 149 149 149 149 149 149 149
# Candidate - Polling Station Observations 891 891 891 891 891 891 891

Dependent Variable: (Post-aggregation Share - Pre-aggregation Share) x 100
Panel B - Vote Shares (1) (2) (3) (4) (5) (6) (7)
Provincial Aggregator Connection (=1) 0.384
(2.393)
Prov. + District Aggregator Connection (=1) 1.171 1.499 1.149
(2.385) (2.201) (2.239)
Karzai Connect. (=1) 0.513 1.537
(2.097) (2.736)
Government Service (=1) 3.294 4.936
(2.485) (4.344)
Incumbent (=1) -1.240 -1.476 -3.735
(2.457) (2.317) (4.066)
Constant 4.004 3.844* 3.799 1.535** 4.736 4.510 0.921
(2.450) (2.280) (2.420) (0.653) (2.884) (3.017) (1.668)
R-Squared 0.000 0.000 0.000 0.001 0.000 0.001 0.003
# Candidates 57 57 57 57 57 57 57
# Polling Stations 149 149 149 149 149 149 149
# Candidate - Polling Station Observations 892 892 892 892 892 892 892

Notes: This table reports on political connections and the degree of aggregation fraud restricting the sample only to the 57 candidates with connections data. The unit of observation is the candidate - polling substation. Each column in Panel A reports results from an OLS regression using the difference between post-aggregation and pre-aggregation votes as the dependent variable and a dummy variable that equals one if a candidate - polling substation observation records the indicated connection(s) as an independent variable. Panel B repeats these specifications using the difference between pre-aggregation and post-aggregation vote shares (multiplied by 100) as a dependent variable. Both pre-aggregation and post-aggregation vote shares are calculated using pre-election polling station vote totals as the denominator. All specifications in both panels include province fixed effects and drop the five largest and five smallest observations of the dependent variable. No candidates record a connection to the district aggregator and not to the provincial aggregator, so these coefficients cannot be estimated separately. Sample: treatment and control polling centers for which complete vote difference data are available. Levels of significance: *p < 0.1, **p < 0.05, ***p < 0.01. Standard errors clustered by candidate and by polling center using the method of Cameron, Gelbach and Miller (2011) are reported in parentheses.
Table A7: Vote Changes During Aggregation by Candidate Type
(Sample: Control Polling Stations)
Dependent Variable: Post-aggregation Votes - Pre-aggregation Votes
Panel A - Votes                              (1)      (2)      (3)      (4)      (5)      (6)      (7)
Investigated (=1)                            0.506    0.370   -0.224    1.089    2.985    0.479    3.267
                                            (1.319)  (1.354)  (1.549)  (2.088)  (2.393)  (1.316)  (2.262)
Provincial Aggregator Connection (=1)        0.522
                                            (2.209)
Prov. + District Aggregator Connection (=1)  3.499    4.320*
                                            (2.444)  (2.250)
Karzai Connect. (=1)                        -0.967   -1.518
                                            (1.973)  (1.949)
Government Service (=1)                     -3.260   -3.707
                                            (2.817)  (2.778)
Incumbent (=1)                               0.072    0.187
                                            (0.251)  (0.233)
Constant                                     0.085*** 0.085*** 0.087*** 0.086*** 0.087*** 0.082*** 0.084**
                                            (0.030)  (0.030)  (0.031)  (0.032)  (0.031)  (0.031)  (0.033)
R-Squared                                    0.016    0.017    0.018    0.017    0.018    0.016    0.021
# Candidates                                 1537     1537     1537     1537     1537     1537     1537
# Polling Stations                           67       67       67       67       67       67       67
# Candidate - Polling Station Observations   17779    17779    17779    17779    17779    17779    17779
Connection(s) + Investigated = 0 (p-value)   0.701    0.691    0.070    0.923    0.860    0.681    0.076
Mean for Candidates Not Investigated         0.036    0.036    0.036    0.036    0.036    0.036    0.036
Dependent Variable: (Post-aggregation Share - Pre-aggregation Share) x 100
Panel B - Vote Shares                        (1)      (2)      (3)      (4)      (5)      (6)      (7)
Investigated (=1)                            3.703    4.397    3.968    1.198   -0.471    3.748   -2.416
                                            (2.740)  (3.646)  (3.412)  (1.794)  (1.646)  (2.781)  (1.942)
Provincial Aggregator Connection (=1)       -2.605
                                            (2.965)
Prov. + District Aggregator Connection (=1) -1.234   -2.879
                                            (2.669)  (3.737)
Karzai Connect. (=1)                         4.147**  4.388**
                                            (1.672)  (1.814)
Government Service (=1)                      5.482    5.568
                                            (4.788)  (5.422)
Incumbent (=1)                              -0.118   -0.393
                                            (0.261)  (0.469)
Constant                                     0.210**  0.208**  0.209**  0.205**  0.206**  0.215**  0.215***
                                            (0.085)  (0.085)  (0.085)  (0.085)  (0.085)  (0.086)  (0.083)
R-Squared                                    0.014    0.014    0.014    0.014    0.014    0.014    0.014
# Candidates                                 1538     1538     1538     1538     1538     1538     1538
# Polling Stations                           67       67       67       67       67       67       67
# Candidate - Polling Station Observations   17778    17778    17778    17778    17778    17778    17778
Connection(s) + Investigated = 0 (p-value)   0.177    0.226    0.119    0.120    0.189    0.176    0.087
Mean for Candidates Not Investigated         0.751    0.751    0.751    0.751    0.751    0.751    0.751
Notes: This table reports on political connections and the degree of aggregation fraud in the control sample. The unit of observation is the candidate - polling substation. Each column in Panel A reports results from an OLS regression using the difference between post-aggregation and pre-aggregation votes as the dependent variable, with a dummy variable that equals one if a candidate - polling substation observation records the indicated connection(s) and an additional dummy that equals one if that observation has any background investigation data as independent variables. Panel B repeats these specifications using the difference between post-aggregation and pre-aggregation vote shares (multiplied by 100) as a dependent variable. Both pre-aggregation and post-aggregation vote shares are calculated using pre-election polling station vote totals as the denominator. All specifications in both panels include province fixed effects and drop the five largest and five smallest observations of the dependent variable. No candidates record a connection to the district aggregator and not to the provincial aggregator, so these coefficients cannot be estimated separately. Sample: control polling centers for which complete vote difference data are available. Levels of significance: *p < 0.1, **p < 0.05, ***p < 0.01. Standard errors clustered by candidate and by polling center using the method of Cameron, Gelbach and Miller (2011) are reported in parentheses.
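The two-way clustering used for the standard errors above can be illustrated with a short sketch. This is not the paper's replication code: the function name `ols_twoway_cluster_se` and the simulated data are assumptions for illustration only. It implements the Cameron, Gelbach and Miller (2011) inclusion-exclusion formula, V = V(candidate) + V(center) - V(candidate x center), for OLS.

```python
import numpy as np

def ols_twoway_cluster_se(X, y, g1, g2):
    """OLS point estimates with two-way cluster-robust standard errors,
    following Cameron, Gelbach and Miller (2011):
    V = V(g1) + V(g2) - V(g1 intersect g2)."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta  # OLS residuals

    def sandwich(groups):
        # "Meat" of the sandwich: sum over clusters of (X_g' u_g)(X_g' u_g)'
        S = np.zeros((X.shape[1], X.shape[1]))
        for g in np.unique(groups):
            m = groups == g
            s = X[m].T @ u[m]
            S += np.outer(s, s)
        return XtX_inv @ S @ XtX_inv

    g1, g2 = np.asarray(g1), np.asarray(g2)
    inter = np.array([f"{a}|{b}" for a, b in zip(g1, g2)])  # intersection clusters
    V = sandwich(g1) + sandwich(g2) - sandwich(inter)
    return beta, np.sqrt(np.diag(V))
```

A quick sanity check on the formula: when the two cluster dimensions coincide, the intersection term cancels one of the two one-way terms and the result collapses to ordinary one-way clustering.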
Table A8: Estimates of Impact on Total Votes by Candidate Connection
(Estimates Aggregating to the Candidate - Polling Center)
Dependent Variable: Votes
Panel A - Impacts on Most Connected Candidates           (1)       (2)       (3)       (4)       (5)
Letter Treatment (=1)                                    0.214     0.335     0.282     0.224    -0.024
                                                        (0.942)   (0.875)   (0.290)   (0.349)   (0.331)
Treat x Most Connected                                           -21.656   -21.935*  -23.053*  -25.751***
                                                                 (13.285)  (13.233)  (13.793)   (9.863)
Most Connected (=1)                                               87.828*** 79.144*** 82.750*** 74.585***
                                                                 (10.026)   (9.960)  (10.361)   (8.023)
Constant                                                 6.539***  6.035***  6.113***  8.760***  7.521***
                                                        (0.672)   (0.619)   (0.192)   (0.882)   (0.657)
R-Squared                                                0.000     0.031     0.093     0.089     0.101
Stratum FEs                                              No        No        Yes       Yes       Yes
Full Covariates                                          No        No        No        Yes       Yes
Trimming top 1% of votes for interacted candidate type   No        No        No        No        Yes
R-Squared                                                0.000     0.024     0.079     0.077     0.094
# Polling Centers                                        461       461       461       440       440
# Candidate - Polling Center Observations                80060     80060     80060     66137     66108
Mean of Dep. Var. in Controls                            6.539     6.539     6.539     7.660     7.401
Mean of Dep. Var. in Controls for Interacted Connection  -         93.863    93.863    98.073    88.991
Dependent Variable: Votes
Panel B - Impacts by Connection Type                     (1)        (2)        (3)        (4)        (5)
Letter Treatment (=1)                                   -0.027      0.008      0.071     -0.009      0.259
                                                        (0.336)    (0.349)    (0.325)    (0.321)    (0.345)
Treat x Provincial Aggregator Connection               -21.474**
                                                       (10.018)
Treat x Prov. and District Aggregator Connection       -26.688*   -13.561
                                                       (15.791)   (23.674)
Treat x Government Service                             -15.389**  -12.725
                                                        (6.550)   (18.707)
Treat x Karzai Connection                               -9.913      3.959
                                                        (6.381)   (18.967)
Provincial Aggregator Connection (=1)                   71.140***
                                                        (7.939)
Prov. and District Aggregator Connection (=1)           82.906***  43.531**
                                                       (12.852)   (18.450)
Government Service (=1)                                 64.459***  77.828***
                                                        (5.079)   (12.258)
Karzai Connect. (=1)                                    50.969*** -19.332
                                                        (4.792)   (12.321)
Constant                                                 7.595***   8.145***   7.166***   7.084***   8.229***
                                                        (0.661)    (0.677)    (0.657)    (0.624)    (0.881)
R-Squared                                                0.099      0.084      0.117      0.109      0.114
Stratum FEs                                              Yes        Yes        Yes        Yes        Yes
Full Covariates                                          Yes        Yes        Yes        Yes        Yes
Trimming top 1% of votes for interacted candidate type   Yes        Yes        Yes        Yes        No
R-Squared                                                0.091      0.078      0.108      0.103      0.098
# Polling Centers                                        440        440        440        440        440
# Candidate - Polling Center Observations                66118      66127      66116      66077      66137
Mean of Dep. Var. in Controls                            7.508      7.556      7.463      7.079      7.660
Mean of Dep. Var. in Controls for Interacted Connection  84.798     98.928     76.144     62.673     -
Notes: This table reports effects on post-aggregation vote totals by candidate connection. The unit of observation is the candidate - polling substation. Each column reports the results of an OLS regression where total post-aggregation votes is the dependent variable and a dummy variable that equals 1 when the candidate - polling substation observation has the indicated connection is the independent variable. Panel A reports the treatment effect for the most connected candidate. Please see the text for details on how the most connected candidate variable is computed. Column (1) reports a bivariate treatment regression with no interaction terms; Column (2) reports the results when treatment is interacted with the most connected candidate; Column (3) reports a similar regression with stratum fixed effects included; Column (4) adds controls for the share of respondents reporting that they are Pashtun, Tajik, and whether the polling center was visited by international election monitors; Column (5) trims the top percentile of the dependent variable. Panel B reports results interacting the treatment dummy with each of the candidate connection dummies. All regressions reported in Panel B include stratum fixed effects; controls for the share of respondents reporting that they are Pashtun, Tajik, and whether the polling center was visited by international election monitors; and have the top percentile of the dependent variable trimmed. The number of candidate - polling substation observations changes slightly across columns because of duplicate values at the top percentile. The final two rows in Panel A and in Panel B report the mean of the dependent variable in the controls and in the controls among observations for the interacted candidate type. Levels of significance: *p < 0.1, **p < 0.05, ***p < 0.01. Robust standard errors clustered at the polling center level are reported in parentheses.
Table A9: Validation Tests of the Randomization for Spatial Externalities
                                   Transport   Expects    Vote       Can          Traditional
                                   Problems    Violence   is Secret  Identify MP  Authority
Panel A                            (1)         (2)        (3)        (4)          (5)
Received Letter (=1)               0.009       0.001     -0.018      0.014        0.017
                                  (0.026)     (0.026)    (0.022)    (0.020)      (0.021)
Any PCs treated within 1km (=1)    0.015       0.096**   -0.040      0.019        0.023
                                  (0.050)     (0.042)    (0.035)    (0.035)      (0.036)
Total PCs within 1km (0-5)        -0.010      -0.005      0.014*     0.000        0.009
                                  (0.010)     (0.009)    (0.009)    (0.007)      (0.008)
Any PCs treated within 1-2km (=1)  0.054      -0.002      0.030      0.021        0.070*
                                  (0.047)     (0.049)    (0.038)    (0.042)      (0.041)
Total PCs within 1-2km (0-24)      0.003      -0.003      0.001     -0.008**     -0.011***
                                  (0.006)     (0.006)    (0.005)    (0.004)      (0.004)
Constant                           0.499***    0.471***   0.634***   0.390***     0.256***
                                  (0.033)     (0.034)    (0.030)    (0.036)      (0.030)
# Observations                     436         436        440        440          440
R-Squared                          0.332       0.434      0.340      0.641        0.370
                                   Pashtun     Income        Electrified  District Gov.  Visited by
                                                                          Keeps Fair     Int'l Monitors
Panel B                            (1)         (2)           (3)          (4)            (5)
Received Letter (=1)              -0.010      -66.909       -0.010        0.005          0.035
                                  (0.018)    (480.913)      (0.016)      (0.012)        (0.035)
Any PCs treated within 1km (=1)   -0.007      487.848        0.030        0.019          0.068
                                  (0.028)    (721.844)      (0.026)      (0.022)        (0.053)
Total PCs within 1km (0-5)         0.005     -145.416        0.002       -0.006          0.021*
                                  (0.007)    (218.328)      (0.005)      (0.005)        (0.012)
Any PCs treated within 1-2km (=1)  0.014     -245.542        0.030       -0.024          0.004
                                  (0.028)    (724.078)      (0.031)      (0.023)        (0.058)
Total PCs within 1-2km (0-24)     -0.006      144.201        0.002       -0.002          0.015*
                                  (0.004)    (131.907)      (0.003)      (0.003)        (0.008)
Constant                           0.339***  10167.997***    0.674***     0.139***      -0.006
                                  (0.022)    (569.063)      (0.023)      (0.018)        (0.049)
# Observations                     440         435           440          438            440
R-Squared                          0.817       0.301         0.778        0.502          0.226
Notes: Each column provides results from an OLS regression of the indicated dependent variable on a treatment dummy and variables that account for the number of neighboring polling centers that received treatment. Dependent variables are computed as averages of answers from respondents near the polling center in the baseline survey. The spatial variables are: Any PCs treated within 1km (1-2km) is a dummy variable that equals 1 if a polling center within a 1 kilometer (1 to 2 kilometer) radius received a treatment notification letter; Total PCs within 1km (1-2km) is a variable equal to the number of polling centers within a 1 kilometer radius (1 to 2 kilometer radius) that received a treatment notification letter; 1 treated PC within 1km is a dummy that equals 1 if 1 polling center within a 1 km radius was treated, and so on for 2 to 5 polling centers. The maximum number of polling centers treated within a kilometer in our sample is 5. The unit of observation is the polling center. The sample is restricted to the most connected candidate only. The most connected candidate is identified using the procedure described in Section 5. The top percentile is trimmed, and stratum fixed effects and controls for the share of respondents reporting that they are Pashtun, Tajik, and whether the polling center was visited by international election monitors are included in all specifications. Levels of significance: *p < 0.1, **p < 0.05, ***p < 0.01. Robust standard errors clustered at the polling center level are reported in parentheses.
Table A10: Trimmed Estimates of Effect on Aggregation Fraud
(Sample: Full Sample)
Control
  (i)    Number of Observations                   183660
  (ii)   Proportion Non-missing                   0.097
  (iii)  Mean abs(Vote Difference) for observed   0.817     Control Standard Error       0.210
Treatment
  (iv)   Number of Observations                   203291
  (v)    Proportion Non-missing                   0.149
  (vi)   Mean abs(Vote Difference) for observed   0.582
p = [(v) - (ii)]/(v)                              0.349
  (vii)  pth quantile                             0.000
  (viii) Trimmed Mean: E[Y | Y > y_p]             0.893     Treatment UB Std. Err.       0.246
  (ix)   (1-p)th quantile                         0.000
  (x)    Trimmed Mean: E[Y | Y < y_(1-p)]         0.000     Treatment LB Std. Err.       0.245
Effect
  (xi)   Upper Bound Estimate = (viii) - (iii)    0.076     (xiii) UB Effect Std. Err.   0.249
  (xii)  Lower Bound Estimate = (x) - (iii)      -0.817     (xiv) LB Effect Std. Err.    0.342
95 Percent Confidence Interval = [(xii) - 1.96*(xiv), (xi) + 1.96*(xiii)]     [-1.487, 0.564]
90 Percent Confidence Interval = [(xii) - 1.645*(xiv), (xi) + 1.645*(xiii)]   [-1.380, 0.486]
Notes: This table reports details of the Lee (2009) procedure applied to the data on aggregation fraud. The sample is the full sample
for which pre-aggregation and post-aggregation data are available (see the text for details). Rows (i) - (iii) provide values for the control
group; rows (iv) - (vi) provide values for the untrimmed treatment group; rows (vii) and (viii) correspond to the treatment group after
trimming the bottom p percentiles; rows (ix) and (x) correspond to the treatment group after trimming the top p percentiles. Rows (xi)
and (xii) report the upper and lower bound respectively. The corresponding 95 percent and 90 percent confidence intervals are reported
in the final two rows. There are 17,786 non-missing observations in the control group. Before trimming, there are 30,232 non-missing
observations in the treatment group. After trimming, there are 19,688 (19,687) observations remaining in the treatment group after
trimming the lower p (upper 1-p) of the distribution. Standard errors reported in this table are obtained by bootstrapping all polling
centers in our experimental sample with replacement 250 times.
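Rows (i)-(xii) describe a mechanical procedure that can be sketched in a few lines. The helper below is a hypothetical illustration of Lee (2009) trimming bounds, not the authors' implementation; the function name `lee_bounds` and its toy inputs are assumptions for illustration.

```python
import numpy as np

def lee_bounds(y_treat, n_treat, y_control, n_control):
    """Lee (2009) bounds on a treatment effect under differential
    non-missingness. y_treat / y_control: observed (non-missing)
    outcomes only; n_treat / n_control: total observations including
    missing ones."""
    share_t = len(y_treat) / n_treat      # row (v): proportion non-missing, treatment
    share_c = len(y_control) / n_control  # row (ii): proportion non-missing, control
    p = (share_t - share_c) / share_t     # excess share of observed treatment outcomes
    y_sorted = np.sort(np.asarray(y_treat, float))
    k = int(round(p * len(y_sorted)))     # number of treatment observations to trim
    control_mean = float(np.mean(y_control))            # row (iii)
    upper = y_sorted[k:].mean() - control_mean          # trim bottom p: row (xi)
    lower = y_sorted[:len(y_sorted) - k].mean() - control_mean  # trim top p: row (xii)
    return lower, upper
```

The upper bound discards the smallest p share of observed treatment outcomes (attributing the excess response to low values), and the lower bound discards the largest p share.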
Table A11: Trimmed Estimates of Effect on Aggregation Fraud
(Sample: Incumbents)
Control
  (i)    Number of Observations                   10492
  (ii)   Proportion Non-missing                   0.118
  (iii)  Mean abs(Vote Difference) for observed   2.840     Control Standard Error       0.801
Treatment
  (iv)   Number of Observations                   11485
  (v)    Proportion Non-missing                   0.171
  (vi)   Mean abs(Vote Difference) for observed   1.597
p = [(v) - (ii)]/(v)                              0.308
  (vii)  pth quantile                             0.000
  (viii) Trimmed Mean: E[Y | Y > y_p]             2.308     Treatment UB Std. Err.       0.572
  (ix)   (1-p)th quantile                         0.000
  (x)    Trimmed Mean: E[Y | Y < y_(1-p)]         0.000     Treatment LB Std. Err.       0.498
Effect
  (xi)   Upper Bound Estimate = (viii) - (iii)   -0.531     (xiii) UB Effect Std. Err.   0.907
  (xii)  Lower Bound Estimate = (x) - (iii)      -2.840     (xiv) LB Effect Std. Err.    0.981
95 Percent Confidence Interval = [(xii) - 1.96*(xiv), (xi) + 1.96*(xiii)]     [-4.763, 1.247]
90 Percent Confidence Interval = [(xii) - 1.645*(xiv), (xi) + 1.645*(xiii)]   [-4.454, 0.961]
Notes: This table reports details of the Lee (2009) procedure applied to the data on aggregation fraud. The sample is all incumbent candidates. Rows (i) - (iii) provide values for the control group; rows (iv) - (vi) provide values for
the untrimmed treatment group; rows (vii) and (viii) correspond to the treatment group after trimming the bottom p percentiles;
rows (ix) and (x) correspond to the treatment group after trimming the top p percentiles. Rows (xi) and (xii) report the upper
and lower bound respectively. The corresponding 95 percent and 90 percent confidence intervals are reported in the final two rows.
There are 1,241 non-missing observations in the control group. Before trimming, there are 1,964 non-missing observations in the
treatment group. After trimming, there are 1,359 (1,358) observations remaining in the treatment group after trimming the lower
p (upper 1-p) of the distribution. Standard errors reported in this table are obtained by bootstrapping all polling centers in our
experimental sample with replacement 250 times.
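The block bootstrap described in the final sentence (resampling whole polling centers with replacement and recomputing the statistic) can be sketched as follows; `cluster_bootstrap_se` is a hypothetical helper for illustration, not the authors' code.

```python
import numpy as np

def cluster_bootstrap_se(stat_fn, cluster_data, reps=250, seed=1):
    """Block-bootstrap standard error: resample whole clusters (here,
    polling centers) with replacement, recompute the statistic each
    replication, and report the standard deviation across replications.
    cluster_data: dict mapping cluster id -> array of outcomes."""
    rng = np.random.default_rng(seed)
    ids = list(cluster_data)
    draws = []
    for _ in range(reps):
        resampled = rng.choice(ids, size=len(ids), replace=True)
        pooled = np.concatenate([cluster_data[i] for i in resampled])
        draws.append(stat_fn(pooled))
    return float(np.std(draws, ddof=1))
```

The paper uses 250 replications; the default here mirrors that, but any moderate number illustrates the idea. Resampling clusters rather than individual observations preserves within-center dependence.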
Table A12: Trimmed Estimates of Effect on Aggregation Fraud
(Sample: Most Connected Candidate)
Control
  (i)    Number of Observations                   962
  (ii)   Proportion Non-missing                   0.148
  (iii)  Mean abs(Vote Difference) for observed   10.310    Control Standard Error       5.476
Treatment
  (iv)   Number of Observations                   1016
  (v)    Proportion Non-missing                   0.201
  (vi)   Mean abs(Vote Difference) for observed   3.799
p = [(v) - (ii)]/(v)                              0.265
  (vii)  pth quantile                             0.000
  (viii) Trimmed Mean: E[Y | Y > y_p]             5.167     Treatment UB Std. Err.       2.053
  (ix)   (1-p)th quantile                         0.000
  (x)    Trimmed Mean: E[Y | Y < y_(1-p)]         0.000     Treatment LB Std. Err.       0.802
Effect
  (xi)   Upper Bound Estimate = (viii) - (iii)   -5.143     (xiii) UB Effect Std. Err.   5.938
  (xii)  Lower Bound Estimate = (x) - (iii)     -10.310     (xiv) LB Effect Std. Err.    5.566
95 Percent Confidence Interval = [(xii) - 1.96*(xiv), (xi) + 1.96*(xiii)]     [-21.219, 6.500]
90 Percent Confidence Interval = [(xii) - 1.645*(xiv), (xi) + 1.645*(xiii)]   [-19.466, 4.625]
Notes: This table reports details of the Lee (2009) procedure applied to the data on aggregation fraud. The sample is restricted to the most connected candidate. Rows (i) - (iii) provide values for the control group; rows (iv) - (vi)
provide values for the untrimmed treatment group; rows (vii) and (viii) correspond to the treatment group after trimming the bottom
p percentiles; rows (ix) and (x) correspond to the treatment group after trimming the top p percentiles. Rows (xi) and (xii) report the
upper and lower bound respectively. The corresponding 95 percent and 90 percent confidence intervals are reported in the final two
rows. There are 142 non-missing observations in the control group. Before trimming, there are 204 non-missing observations in the
treatment group. After trimming, there are 150 (149) observations remaining in the treatment group after trimming the lower p (upper
1-p) of the distribution. Standard errors reported in this table are obtained by bootstrapping all polling centers in our experimental
sample with replacement 250 times.
Table A13: Trimmed Estimates of Effect on Aggregation Fraud
(Sample: District + Provincial Aggregator Connection)
Control
  (i)    Number of Observations                   441
  (ii)   Proportion Non-missing                   0.184
  (iii)  Mean abs(Vote Difference) for observed   18.790    Control Standard Error       9.009
Treatment
  (iv)   Number of Observations                   480
  (v)    Proportion Non-missing                   0.260
  (vi)   Mean abs(Vote Difference) for observed   5.760
p = [(v) - (ii)]/(v)                              0.295
  (vii)  pth quantile                             0.000
  (viii) Trimmed Mean: E[Y | Y > y_p]             8.090     Treatment UB Std. Err.       4.089
  (ix)   (1-p)th quantile                         0.000
  (x)    Trimmed Mean: E[Y | Y < y_(1-p)]         0.000     Treatment LB Std. Err.       2.030
Effect
  (xi)   Upper Bound Estimate = (viii) - (iii)  -10.700     (xiii) UB Effect Std. Err.  10.340
  (xii)  Lower Bound Estimate = (x) - (iii)     -18.790     (xiv) LB Effect Std. Err.    9.090
95 Percent Confidence Interval = [(xii) - 1.96*(xiv), (xi) + 1.96*(xiii)]     [-36.606, 9.566]
90 Percent Confidence Interval = [(xii) - 1.645*(xiv), (xi) + 1.645*(xiii)]   [-33.743, 6.309]
Notes: This table reports details of the Lee (2009) procedure applied to the data on aggregation fraud. The sample is all candidates
with a recorded connection to both the district and the provincial aggregator. Rows (i) - (iii) provide values for the control group; rows
(iv) - (vi) provide values for the untrimmed treatment group; rows (vii) and (viii) correspond to the treatment group after trimming
the bottom p percentiles; rows (ix) and (x) correspond to the treatment group after trimming the top p percentiles. Rows (xi) and
(xii) report the upper and lower bound respectively. The corresponding 95 percent and 90 percent confidence intervals are reported
in the final two rows. There are 81 non-missing observations in the control group. Before trimming, there are 125 non-missing observations in the treatment group. After trimming, there are 89 (88) observations remaining in the treatment group after trimming
the lower p (upper 1-p) of the distribution. Standard errors reported in this table are obtained by bootstrapping all polling centers in
our experimental sample with replacement 250 times.
Table A14: Trimmed Estimates of Effect on Aggregation Fraud
(Sample: Karzai Connection)
Control
  (i)    Number of Observations                   1550
  (ii)   Proportion Non-missing                   0.141
  (iii)  Mean abs(Vote Difference) for observed   8.831     Control Standard Error       3.701
Treatment
  (iv)   Number of Observations                   1600
  (v)    Proportion Non-missing                   0.204
  (vi)   Mean abs(Vote Difference) for observed   3.667
p = [(v) - (ii)]/(v)                              0.309
  (vii)  pth quantile                             0.000
  (viii) Trimmed Mean: E[Y | Y > y_p]             5.282     Treatment UB Std. Err.       1.782
  (ix)   (1-p)th quantile                         0.000
  (x)    Trimmed Mean: E[Y | Y < y_(1-p)]         0.000     Treatment LB Std. Err.       0.675
Effect
  (xi)   Upper Bound Estimate = (viii) - (iii)   -3.549     (xiii) UB Effect Std. Err.   4.176
  (xii)  Lower Bound Estimate = (x) - (iii)      -8.831     (xiv) LB Effect Std. Err.    3.772
95 Percent Confidence Interval = [(xii) - 1.96*(xiv), (xi) + 1.96*(xiii)]     [-16.224, 4.636]
90 Percent Confidence Interval = [(xii) - 1.645*(xiv), (xi) + 1.645*(xiii)]   [-15.036, 3.321]
Notes: This table reports details of the Lee (2009) procedure applied to the data on aggregation fraud. The sample is all candidates
with a recorded connection to Hamid Karzai. Rows (i) - (iii) provide values for the control group; rows (iv) - (vi) provide values for
the untrimmed treatment group; rows (vii) and (viii) correspond to the treatment group after trimming the bottom p percentiles;
rows (ix) and (x) correspond to the treatment group after trimming the top p percentiles. Rows (xi) and (xii) report the upper and
lower bound respectively. The corresponding 95 percent and 90 percent confidence intervals are reported in the final two rows. There
are 219 non-missing observations in the control group. Before trimming, there are 327 non-missing observations in the treatment
group. After trimming, there are 227 (226) observations remaining in the treatment group after trimming the lower p (upper 1-p) of
the distribution. Standard errors reported in this table are obtained by bootstrapping all polling centers in our experimental sample
with replacement 250 times.
Table A15: Trimmed Estimates of Effect on Aggregation Fraud
(Sample: Government Service)
Control
  (i)    Number of Observations                   1945
  (ii)   Proportion Non-missing                   0.143
  (iii)  Mean abs(Vote Difference) for observed   9.665     Control Standard Error       3.139
Treatment
  (iv)   Number of Observations                   2089
  (v)    Proportion Non-missing                   0.202
  (vi)   Mean abs(Vote Difference) for observed   5.126
p = [(v) - (ii)]/(v)                              0.291
  (vii)  pth quantile                             0.000
  (viii) Trimmed Mean: E[Y | Y > y_p]             7.217     Treatment UB Std. Err.       1.899
  (ix)   (1-p)th quantile                         1.000
  (x)    Trimmed Mean: E[Y | Y < y_(1-p)]         0.013     Treatment LB Std. Err.       0.878
Effect
  (xi)   Upper Bound Estimate = (viii) - (iii)   -2.448     (xiii) UB Effect Std. Err.   3.666
  (xii)  Lower Bound Estimate = (x) - (iii)      -9.652     (xiv) LB Effect Std. Err.    3.277
95 Percent Confidence Interval = [(xii) - 1.96*(xiv), (xi) + 1.96*(xiii)]     [-16.075, 4.737]
90 Percent Confidence Interval = [(xii) - 1.645*(xiv), (xi) + 1.645*(xiii)]   [-15.043, 3.583]
Notes: This table reports details of the Lee (2009) procedure applied to the data on aggregation fraud. The sample is all candidates
with a recorded history of government service. Rows (i) - (iii) provide values for the control group; rows (iv) - (vi) provide values for
the untrimmed treatment group; rows (vii) and (viii) correspond to the treatment group after trimming the bottom p percentiles;
rows (ix) and (x) correspond to the treatment group after trimming the top p percentiles. Rows (xi) and (xii) report the upper and
lower bound respectively. The corresponding 95 percent and 90 percent confidence intervals are reported in the final two rows. There
are 278 non-missing observations in the control group. Before trimming, there are 421 non-missing observations in the treatment
group. After trimming, there are 299 (298) observations remaining in the treatment group after trimming the lower p (upper 1-p) of
the distribution. Standard errors reported in this table are obtained by bootstrapping all polling centers in our experimental sample
with replacement 250 times.
[Bar chart: average votes (y-axis, 0 to 30) in Control versus Treatment polling centers for each candidate type: Most Connected; Provincial Aggregator Connection; Provincial and District Aggregator Connection; Government Service; Karzai Connection.]
Figure A3: Treatment Effects by Candidate Connection
Notes: The depicted bands reflect 95 percent confidence intervals based on clustered standard errors. Vote totals above the 99th percentile are dropped when calculating the depicted means.
Polling Center Name: …………………..
Polling Center Code:…………………….
Date: ………………………………………
Dear Sir or Madam-
Greetings! I am an official election observer with the Opinion Research Center of Afghanistan (ORCA). My organization is providing this letter to collect some important information about your polling center and share it with our main office. Your polling center has been randomly selected from among polling centers in this province.
In our attempts to help Afghanistan have free and fair elections, I will return to this polling center tomorrow morning in order to take pictures of the results for every candidate in every station on the tally sheets after they have been posted.
The information will be posted on a website that belongs to local and international election observers so that it will be used by the people of Afghanistan, the international community, and local and international media. We will also compare the photos taken with the tally certified by the IEC in Kabul.
As recognition that you have read and understood this letter, please sign here: ___________
Thank you kindly for your help and cooperation.
Sincerely,
Haj Abdul Nabi Barakzai
Deputy Head of ORCA
Name and Signature of manager of polling station:………………………………………
Figure A4: Letter Delivered to Polling Center Managers
[Translated from Dari.]

Polling Center Name: _________  Date: _________  Polling Center Code: _________

To the respected Sir/Madam,

Under an agreement with the Independent Election Commission, the ORCA office is responsible for observing 472 polling centers. The bearer of this letter is one of the official observers of ORCA and has been tasked with delivering this letter to this polling center, collecting accurate and reliable information about it, and sharing that information with the central office. This polling center, along with several other centers, has been randomly selected from among all polling centers in this province.

We will help strengthen free and legitimate elections in Afghanistan. Our observer will come tomorrow morning to take pictures of the candidate results posted at this polling center.

Pictures of these results will be posted on a website belonging to domestic and international election observers so that all the people of Afghanistan, international organizations, and domestic and international media can use them. We will also compare the photographed results with the results published by the Independent Election Commission in Kabul.

To confirm that this letter has been provided to you and that you have read it, please sign below.

Thank you in advance for your cooperation.

Respectfully,

Haji Abdul Nabi Barakzai
Deputy of the ORCA office

Name and signature of the manager of the polling center: _________
Figure A5: Dari Translation of Letter Delivered to Polling Center Managers