Transcript of arXiv:1908.02793v4 [physics.soc-ph] 9 Jan 2020

Noncooperative dynamics in election interference

David Rushing Dewhurst,1 Christopher M. Danforth,1 and Peter Sheridan Dodds1

1MassMutual Center of Excellence in Complex Systems and Data Science, Computational Story Lab, Vermont Complex Systems Center,

and Department of Mathematics and Statistics, University of Vermont, Burlington, VT 05405

Foreign power interference in domestic elections is an existential threat to societies. Manifested through myriad methods from war to words, such interference is a timely example of strategic interaction between economic and political agents. We model this interaction between rational game players as a continuous-time differential game, constructing an analytical model of this competition with a variety of payoff structures. All-or-nothing attitudes by only one player regarding the outcome of the game lead to an arms race in which both countries spend increasing amounts on interference and counter-interference operations. We then confront our model with data pertaining to the Russian interference in the 2016 United States presidential election contest. We introduce and estimate a Bayesian structural time series model of election polls and social media posts by Russian Twitter troll accounts. Our analytical model, while purposefully abstract and simple, adequately captures many temporal characteristics of the election and social media activity. We close with a discussion of our model's shortcomings and suggestions for future research.

I. INTRODUCTION

In democratic and nominally-democratic countries, elections are societally and politically crucial events in which power is allocated [1]. In fully-democratic countries elections are the method of legitimate governmental change [2]. One country, labeled "Red", wishes to influence the outcome of an election in another country, labeled "Blue", because of the impact that elections in Blue have on Red's national interest. Such attacks on democracies are not new. It is estimated that the United States (U.S.) and Russia (and its predecessor, the Soviet Union) often interfere in the elections of other nations and have consistently done this since 1946 [3]. Though academic study of this area has increased [4], we are unaware of any formal modeling of noncooperative dynamics in an election interference game. Recent approaches to the study of this phenomenon have focused mainly on the compilation of coarse-grained (e.g., yearly frequency) panels of election interference events and qualitative analysis of these data [5, 6], and on data-driven studies of the aftereffects and second-order effects of interference operations [7, 8]. Attempts to create theoretical models of interference operations are less common. These attempts include qualitative causal models of cyberoperation influence on voter preferences [9] and models of the underlying reasons that a state may wish to interfere in the elections of another [10].

We consider a two-player game in which one country wants to influence a two-candidate, zero-sum election taking place in another country. We think of Red as the foreign intelligence service of the influencing country and Blue as the domestic intelligence service of the country in which the election is held. Red wants a particular candidate, which we set to be candidate A without loss of generality, to win the election, while Blue wants the effect of Red's interference to be minimized. We derive a noncooperative, non-zero-sum differential game to describe this problem, and then explore the game numerically. We find that all-or-nothing attitudes by either Red or Blue can lead to arms-race conditions in interference operations. In the event that one party credibly commits to playing a fixed, deterministic strategy, we derive further analytical results.

We then confront our model with data pertaining to the 2016 U.S. presidential election contest, in which Russia interfered [11]. We fit a Bayesian structural time series model to election polls and social media posts authored by Russian military intelligence-associated troll accounts. We demonstrate that our model, though simple, captures many of the observed and inferred parameters' dynamics. We close by proposing some theoretical and empirical extensions to our work.

II. THEORY

A. Election interference model

We consider an election between two candidates with no electoral institutions (such as an Electoral College). We assume that the election process at any time $t \in [0, T]$ is represented by a public poll $Z_t \in [0, 1]$. The model is set in continuous time, though when we estimate parameters statistically in Sec. III we move to a discrete-time analogue. We hypothesize that the election dynamics take place in a latent space where dynamics are represented by $X_t \in \mathbb{R}$. We will set $X_t < 0$ to be values of the latent poll that favor candidate A and $X_t > 0$ those that favor candidate B. The latent and observable spaces are related by $Z_t = \phi(X_t)$, where $\phi$ is a sigmoidal function which we choose to be $\phi(x) = \frac{1}{1 + e^{-x}}$. (Any sigmoidal function that is bounded between zero and one will suffice and will lead only to different parameter estimates in the context of statistical estimation.) The actual result of the election—the number of votes that are earned by candidate B—is given by $\phi(X_T)$.


The election takes place in a population of $N$ voting agents. Each voting agent updates their preferences over the candidates at each time step $t_n$ by a random variable $\xi_{n,t_n}$. These random variables satisfy $\mathbb{E}_n[\xi_{n,t_k}] = 0$ and $\mathbb{E}_n[\xi_{n,t_k}^2] < \infty$ for all $t$. The increments of the election process are the sample means of the voting agents' preferences at time $t$. In the absence of interference, the stochastic election model is an unbiased random walk:

$$X_{t_{k+1}} = X_{t_k} + \frac{1}{N}\sum_{1 \le n \le N} \xi_{n,t_k}\,\Delta t_k, \qquad (1)$$

where we have put $\Delta t_k = t_{k+1} - t_k$. We display sample realizations of this process for different distributions of $\xi_{n,t_k}$ in Fig. 1. Though one distribution of preference changes has a larger variance than the other, the sample paths of $X_{t_k}$ are statistically similar for each since $\frac{1}{N}\sum_n \xi_{n,t_k}$ does not vary much between the distributions. When $N$ is large we can reasonably approximate this discrete agent electoral process by a Wiener process, $dX_t = \sigma\, dW_t$, where $\sigma^2 \approx \mathrm{Var}\left(\frac{1}{N}\sum_{1 \le n \le N} \xi_{n,t}\right)$. This approximation is valid in the limit of large $N$.
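As an illustration of this continuum limit, the following sketch (a minimal NumPy example; the recentered Beta-distributed preference shifts are our stand-in for the distributions shown in Fig. 1) simulates the agent-level process of Eq. 1 and its Wiener approximation side by side:

```python
import numpy as np

rng = np.random.default_rng(0)

N, T = 1000, 365          # number of voting agents, number of days
dt = 1.0                  # one time step per day

# Preference shifts: Beta(a, a) recentered to mean zero (an assumption made
# for illustration; any zero-mean, finite-variance distribution works).
a = 1.5
xi = rng.beta(a, a, size=(T, N)) - 0.5

# Eq. 1: the latent electoral process is driven by the sample mean
# of the agents' preference shifts at each time step.
X = np.concatenate([[0.0], np.cumsum(xi.mean(axis=1) * dt)])

# Wiener approximation dX_t = sigma dW_t, with
# sigma^2 ~ Var((1/N) sum_n xi_{n,t}) = Var(xi) / N.
sigma = np.sqrt(xi.var() / N)
W = np.concatenate([[0.0], np.cumsum(sigma * np.sqrt(dt) * rng.standard_normal(T))])

print(f"agent-based sd of increments: {np.diff(X).std():.5f}")
print(f"Wiener      sd of increments: {np.diff(W).std():.5f}")
```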

If the preference change random variables $\xi_{n,k}$ did not satisfy $\mathbb{E}_n[\xi_{n,k}] = 0$, then the random walk approximation to Eq. 1 would not necessarily be valid. For example, if $\{\xi_{n,k}\}_{k \ge 0}$ were a random walk or were trend-stationary for each $n$, then $\{\mathbb{E}_n[\xi_{n,k}]\}_{k \ge 0}$ would also respectively be a random walk or trend-stationary. A trend-stationary univariate time series is a stochastic process $x_t = f(t) + \varepsilon_t$, where $\varepsilon_t$ is a stationary process and $f(\cdot)$ is a deterministic function of time [12, 13]. A univariate time series with a unit root is a time series that can be written as an autoregressive process of order $p$ (AR($p$)), $x_t - \sum_{p'=1}^{p} \beta_{p'} x_{t-p'} = \varepsilon_t$, such that the characteristic polynomial $z^p - \sum_{p'=1}^{p} \beta_{p'} z^{p-p'} = 0$ has a root on the unit circle when solved over the complex numbers. A random walk is a special case of the AR($p$) process with characteristic polynomial given by $z - 1 = 0$, which has the unit root $z = 1$. Trend-stationary and unit-root time series differ fundamentally in that a trend-stationary process subjected to an exogenous shock will eventually revert to its mean function $f(t)$; this is not the case for a stochastic process with a unit root. A unit-root or trend-stationary $\xi_{n,k}$ would model a population in which political preferences were undergoing a shift in population mean rather than just in individual preferences.
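To make the distinction concrete, this sketch (illustrative only; the linear trend $f(t) = 0.01t$, the shock size, and the iid noise are arbitrary choices) applies the same exogenous shock to a trend-stationary series and to a random walk; only the former reverts to its mean function:

```python
import numpy as np

rng = np.random.default_rng(1)
T, shock_time, shock = 1000, 500, 10.0
eps = rng.standard_normal(T)

# Trend-stationary: x_t = f(t) + eps_t with f(t) = 0.01 t (arbitrary choice).
trend = 0.01 * np.arange(T)
x_ts = trend + eps
x_ts[shock_time] += shock      # the shock perturbs only that observation

# Unit root: random walk x_t = x_{t-1} + eps_t; the shock persists forever.
x_rw = np.cumsum(eps)
x_rw[shock_time:] += shock

# Long after the shock, the trend-stationary series is back near f(t),
# while the random walk remains displaced by the full shock size.
print(f"trend-stationary deviation from f(t) at end: {x_ts[-1] - trend[-1]:+.2f}")
print(f"random walk displacement due to shock:       {shock:+.2f}")
```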

However, there do exist cases where the random walk approximation is valid even when $\mathbb{E}_n[\xi_{n,k}] \neq 0$. If the stochastic evolution equation for $\mathbb{E}_n[\xi_{n,k}]$ has a stationary colored noise with exponentially-decaying covariance function as its solution, then the integral of this noise satisfies a Stratonovich-type equation [14–16]. This equation would be a generalized version of the basic random walk model considered here, but we will not consider this scenario in the remainder of this work.

We denote the control policies of Red and Blue (the functions by which Red and Blue attempt to influence, or prevent influence on, the election) by $u_R(t)$ and $u_B(t)$. These functions are one-dimensional continuous-time stochastic processes (time series). The term "policy" originates from the fields of economics and reinforcement learning [17–19]. These control policies are abstract variables in the context of our model, but we interpret them as expenditures on interference operations. We assume that Red and Blue can affect the mean trajectory of the election but not its volatility (the standard deviation of its increments). We make this assumption because $X_t$ is an approximation to the process described by Eq. 1. As we show in Fig. 1, the variance of the electoral process does not change much even when the voting population's underlying preference change distributions have differing variance and kurtosis. Under the influence of Red's and Blue's control policies, the election dynamics become

$$dX_t = F(u_R(t), u_B(t))\, dt + \sigma\, dW_t, \quad X_0 = y. \qquad (2)$$

The function $F$ captures the mechanism by which Red and Blue affect the mean dynamics of the latent electoral process. We assume that $F$ is at least twice continuously-differentiable for convenience. To first order we have $F(u_R(t), u_B(t)) = a_0 + a_R u_R(t) + a_B u_B(t) + O(u^2)$, which is most accurate near $u = 0$. We approximate the state equation by

$$dX_t = [u_R(t) + u_B(t)]\, dt + \sigma\, dW_t, \quad X_0 = y, \qquad (3)$$

since we have assumed zero endogenous drift and can absorb constants into the definition of the control policies. We will use Eq. 3 as the state equation for the remainder of the paper.
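A minimal Euler-Maruyama discretization of Eq. 3 looks as follows (a sketch; the constant policies are placeholders standing in for the Nash equilibrium policies derived below):

```python
import numpy as np

rng = np.random.default_rng(2)

T, n_steps, sigma, y = 1.0, 365, 0.6, 0.0
dt = T / n_steps

def u_R(t, x): return -0.5   # placeholder: Red pushes toward candidate A (x < 0)
def u_B(t, x): return 0.2    # placeholder: Blue pushes back

x = np.empty(n_steps + 1)
x[0] = y
for n in range(n_steps):
    t = n * dt
    drift = u_R(t, x[n]) + u_B(t, x[n])   # drift term of Eq. 3
    x[n + 1] = x[n] + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()

Z = 1.0 / (1.0 + np.exp(-x))              # observable poll Z_t = phi(X_t)
print(f"final poll for candidate B: {100 * Z[-1]:.1f}%")
```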

B. Subgame-perfect Nash equilibria

Red and Blue each seek to minimize separate scalar cost functionals of their own control policy and the other agent's control policy. We will assume that the agents do not incur a running cost from the value of the state variable, although we will revisit this assumption in Sec. IV. The cost functionals are therefore

$$\mathbb{E}_{u_R, u_B, X}\left\{\Phi_R(X_T) + \int_0^T C_R(u_R(t), u_B(t))\, dt\right\}, \qquad (4)$$

and

$$\mathbb{E}_{u_R, u_B, X}\left\{\Phi_B(X_T) + \int_0^T C_B(u_R(t), u_B(t))\, dt\right\}. \qquad (5)$$

The functions $C_R$ and $C_B$ represent the running cost or benefit of conducting election interference operations. We assume the cost functions have the form

$$C_i(u_R, u_B) = u_i^2 - \lambda_i u_{\neg i}^2 \qquad (6)$$

for $i \in \{R, B\}$.

FIG. 1. The random walk latent space election model is an accurate approximation to multiple different population candidate preference updates. The latent election process evolves according to $X_{k+1} = X_k + \frac{1}{N}\sum_{1 \le n \le N} \xi_{n,k}$. The random variable $\xi_{n,k}$ is voting agent $n$'s shift toward the left or right of the political spectrum at time $k$. In the center panel (Panel C), the solid curve is a draw from the latent election process resulting from the preference updates $\xi_{n,t} \sim B\left(0.1\frac{T-t}{T} + 1.5\frac{t}{T},\; 0.1\frac{T-t}{T} + 1.5\frac{t}{T}\right)$, where $B(\alpha, \beta)$ is the Beta distribution and we have set $T = 365$. As $t \to T$, this electorate exhibits increasing resistance to change in their political viewpoints. We display the preference shift distributions at $t = 0$ in Panel A and at $t = T$ in Panel B. For contrast, the dashed curve is a draw from the latent election process resulting from $\xi_{n,t} \sim B\left(1.5\frac{T-k}{T} + 0.1\frac{k}{T},\; 1.5\frac{T-k}{T} + 0.1\frac{k}{T}\right)$, which describes an electorate in which agents more often change political preferences as $t \to T$. We show the corresponding preference shift distributions at $t = 0$ in Panel D and at $t = T$ in Panel E. The estimated volatilities in Panel C are $\sigma = 0.009$ for the electorate whose preferences change less as $t \to T$ and $\sigma = 0.01$ for the electorate whose preferences change more. Despite these preference updates being, in some sense, opposites of each other, the latent processes $X_t$ are statistically very similar and both are well-modeled by the continuum approximation $dX_t = \sigma\, dW_t$.

The notation $\neg i$ indicates the set of all other players; for example, if $i = R$, then $\neg i = B$. This notation originates in the study of noncooperative economic games. The non-negative scalar $\lambda_i$ parameterizes the utility gained by player $i$ from observing player $\neg i$'s effort. If $\lambda_i > 0$, player $i$ gains utility from player $\neg i$'s expending resources, while if $\lambda_i = 0$, player $i$ has no regard for $\neg i$'s level of effort but only for their own running cost and the final cost. Our assumption that cost accumulates quadratically with the magnitude of the control policy is common in optimal control theory [20–22]. We can justify the functional form of Eq. 6 as follows. Suppose that an arbitrary analytic cost function for player $i$ is written as $C_i(u_R, u_B) = C^{(i)}(u_i) - \lambda_i C^{(\neg i)}(u_{\neg i})$. We make the following assumptions:

• It is equally costly for player $i$ to conduct operations that favor candidate A or candidate B. This imposes the constraint that $C^{(i)}$ and $C^{(\neg i)}$ are even functions.

• Player $i$'s conducting no interference operations results in player $i$'s incurring no direct cost from this choice. In other words, if $u_i(t) = 0$ at some $t$, player $i$ does not incur any cost at that $t$.

With these assumptions, the first non-zero term in the Taylor expansion of $C_i$ is given by Eq. 6.
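This claim is easy to check symbolically. In the sketch below (using SymPy; $\cosh(u) - 1$ is an arbitrary even analytic cost with $C(0) = 0$, our choice for illustration), the expansion's leading term is quadratic:

```python
import sympy as sp

u = sp.symbols("u")

# An arbitrary even analytic function with C(0) = 0, per the two assumptions.
C = sp.cosh(u) - 1

# Taylor expansion about u = 0: odd terms vanish by evenness and the constant
# term vanishes by the no-cost-at-zero assumption, leaving a quadratic leader.
print(sp.series(C, u, 0, 5))   # -> u**2/2 + u**4/24 + O(u**5)
```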

1. Choice of final conditions

Finding optimal play in noncooperative games often requires solving the game backward through time [23–26]. Therefore, we must define final conditions that specify the cost that Red and Blue incur from the actual election result $\phi(X_T)$. Red and Blue might have different final conditions because of their qualitatively distinct objectives. Since Red wants to influence the outcome of the election in Blue's country in favor of candidate A, their final cost function $\Phi_R$ must satisfy $\Phi_R(x) < \Phi_R(y)$ for all $x < 0$ and $y > 0$. In the final conditions that we are about to present, we also assume that $\Phi_R$ is monotonically non-decreasing everywhere. We relax these assumptions in Sec. III when we confront this model with election interference-related data. To the extent that this model describes reality, it is probably not true that these restrictive assumptions on the final condition are always satisfied. However, one simple final condition that satisfies these requirements is $\Phi_R(x) = c_0 + c_1 x$, but this allows the unrealistic limiting condition of infinite benefit if candidate A gets 100% of the vote in the election and infinite cost if candidate A gets 0% of the vote. We will also consider two Red final conditions with cost that remains bounded as $x \to \pm\infty$: one smooth, $\Phi_R(x) = \tanh(x)$; and one discontinuous, $\Phi_R(x) = \Theta(x) - \Theta(-x)$. By $\Theta(\cdot)$ we mean the Heaviside step function.

Blue wants to reduce the overall impact of Red's interference operations on the electoral process. Since Blue is a priori indifferent between the outcomes of the election, it initially seems that $\Phi_B(x) = 0$. However, if $\lambda_B = 0$, this results in Blue taking no action due to the functional form of Eq. 6. In other words, if Blue does not gain utility from Red expending resources, then Blue will not try to stop Red from interfering in an election in Blue's country. Hence we believe that Blue cannot be indifferent about the election outcome.

We present three possible final conditions representing Blue's preferences over the election result. Blue might believe that a result was due to Red's interference if $X_T$ is too far from $\mathbb{E}_0[X_T] = 0$. An example of a smooth function that represents this belief is $\Phi_B(x) = \frac{1}{2}x^2$. However, this neglects the reality that Red's objective is not to have either candidate A or candidate B win by a large margin, but rather to have candidate A win, i.e., to have $X_T < 0$. Thus Blue might be unconcerned about larger positive values of the state variable and, modifying the previous function suitably, have $\Phi_B(x) = \frac{1}{2}x^2\Theta(-x)$. Alternatively, Blue may accept the result of the election if it does not deviate "too far" from the initial expected value. An example of a discontinuous final condition that represents these preferences is $\Phi_B(x) = \Theta(|x| - \Delta) - \Theta(\Delta - |x|)$, where $\Delta > 0$ is Blue's accepted margin of error.

Though we could propose many other possible final conditions, these example functions demonstrate some possible payoff structures. We include:

• "First-order" functions that could result from the Taylor expansion about zero of an arbitrary analytic final condition. These functions are linear, in the case of Red's antisymmetric final condition, and quadratic in the case of Blue's smooth symmetric final condition (which is the first non-constant term in the Taylor expansion of an even analytic function);

• Smooth functions that represent bounded preferences over the result of the electoral process and the recognition that Red favors one candidate in particular; and

• Discontinuous final conditions that model "all-or-nothing" preferences over the outcome (either candidate A wins or they do not; either Red interferes less than a certain amount or they interfere more).

These functions do not capture some behavior that might exist in real election interference operations. For example, Red's preferences could be as follows: "we would prefer that candidate A wins the election, but if they cannot, then we would like candidate B to win by a landslide so that we can claim the electoral system in Blue's country was rigged against candidate A". These preferences correspond to a final condition with a global minimum at some $x < 0$ but a secondary local minimum at some $x \gg 0$. This situation is not modeled by any of the final conditions that we have stated. In Sec. III we relax the assumption that the final conditions are parameterized according to any of the functional forms considered in this section and instead infer them from observed election and election interference proxy data using the method described in Sec. II B 3.

2. Value functions

Applying the dynamic programming principle [17, 18] to Eqs. 3, 4, and 5 leads to a system of coupled Hamilton-Jacobi-Bellman (HJB) equations for the Red and Blue value functions,

$$-\frac{\partial V_R}{\partial t} = \min_{u_R}\left\{\frac{\partial V_R}{\partial x}[u_R + u_B] + u_R^2 - \lambda_R u_B^2 + \frac{\sigma^2}{2}\frac{\partial^2 V_R}{\partial x^2}\right\} \qquad (7)$$

and

$$-\frac{\partial V_B}{\partial t} = \min_{u_B}\left\{\frac{\partial V_B}{\partial x}[u_R + u_B] + u_B^2 - \lambda_B u_R^2 + \frac{\sigma^2}{2}\frac{\partial^2 V_B}{\partial x^2}\right\}. \qquad (8)$$

The dynamic programming principle does not result in an Isaacs equation because the game is not zero-sum and the cost functionals for Red and Blue can have different functional forms. (The Isaacs equation is a nonlinear elliptic or parabolic equation that arises in the study of two-player, zero-sum games in which one player attempts to maximize a functional and the other player attempts to minimize it [27, 28].) Performing the minimization with respect to the control variables gives the Nash equilibrium control policies,

$$u_R(t) = -\frac{1}{2}\left.\frac{\partial V_R}{\partial x}\right|_{(t, X_t)} \qquad (9)$$

$$u_B(t) = -\frac{1}{2}\left.\frac{\partial V_B}{\partial x}\right|_{(t, X_t)}, \qquad (10)$$

and the exact functional forms of Eqs. 7 and 8,

$$-\frac{\partial V_R}{\partial t} = -\frac{1}{4}\left(\frac{\partial V_R}{\partial x}\right)^2 - \frac{1}{2}\frac{\partial V_R}{\partial x}\frac{\partial V_B}{\partial x} - \frac{\lambda_R}{4}\left(\frac{\partial V_B}{\partial x}\right)^2 + \frac{\sigma^2}{2}\frac{\partial^2 V_R}{\partial x^2}, \quad V_R(x, T) = \Phi_R(x); \qquad (11)$$

$$-\frac{\partial V_B}{\partial t} = -\frac{1}{4}\left(\frac{\partial V_B}{\partial x}\right)^2 - \frac{1}{2}\frac{\partial V_B}{\partial x}\frac{\partial V_R}{\partial x} - \frac{\lambda_B}{4}\left(\frac{\partial V_R}{\partial x}\right)^2 + \frac{\sigma^2}{2}\frac{\partial^2 V_B}{\partial x^2}, \quad V_B(x, T) = \Phi_B(x). \qquad (12)$$

When solved over the entirety of state space, solutions to Eqs. 11 and 12 constitute the strategies of a subgame-perfect Nash equilibrium. No matter the action taken by player $\neg i$ at time $t$, player $i$ is able to respond with the optimal action at time $t + dt$. This is the (admittedly informal) definition of a subgame-perfect Nash equilibrium in a continuous-time differential game [26]. Given the solution pair $V_R(x, t)$ and $V_B(x, t)$, we can write the distribution of $x$, $u_R$, and $u_B$ analytically. Substitution of Eqs. 9 and 10 into Eq. 3 gives $dx = -\frac{1}{2}\left\{\left.\frac{\partial V_R}{\partial x}\right|_{(t,x)} + \left.\frac{\partial V_B}{\partial x}\right|_{(t,x)}\right\} dt + \sigma\, dW$. We discretize this equation over $N$ timepoints to obtain

$$x_{n+1} - x_n + \frac{\Delta t}{2}[V'_{Rn} + V'_{Bn}] - (\Delta t)^{1/2}\sigma w_n - y\delta_{n,0} = 0, \qquad (13)$$

with $w_n \sim \mathcal{N}(0, 1)$, $\Delta t = t_{n+1} - t_n$, and $n = 0, \ldots, N - 1$, and where we have put $V'_{in} \equiv V'_i(x_n, t_n)$. Thus the distribution of an increment of the latent electoral process is

$$p(x_{n+1}|x_n) = \frac{1}{\sqrt{2\pi\sigma^2\Delta t}}\exp\left\{-\frac{\Delta t}{2\sigma^2}\left(\frac{x_{n+1} - x_n}{\Delta t} + \frac{1}{2}[V'_{Rn} + V'_{Bn}] - y\frac{\delta_{n,0}}{\Delta t}\right)^2\right\}. \qquad (14)$$

Now, using the Markov property of $X_t$, we have

$$p(x_1, \ldots, x_N|x_0) = \prod_{n=0}^{N-1} p(x_{n+1}|x_n) \qquad (15)$$

$$= \frac{1}{(2\pi\sigma^2\Delta t)^{N/2}}\exp\left\{-\frac{1}{2\sigma^2}S(x_1, \ldots, x_N)\right\}, \qquad (16)$$

where

$$S(x_1, \ldots, x_N) = \sum_{n=0}^{N-1} \Delta t\left[\frac{x_{n+1} - x_n}{\Delta t} + \frac{1}{2}[V'_{Rn} + V'_{Bn}] - y\frac{\delta_{n,0}}{\Delta t}\right]^2. \qquad (17)$$

Taking $N \to \infty$ as $N\Delta t = T$ remains constant gives a functional Gaussian distribution,

$$p(x(0 \to T)|x_0) = \frac{1}{Z}\exp\left\{-\frac{1}{2\sigma^2}S[x(0 \to T)]\right\}, \qquad (18)$$

with action

$$S[x(0 \to T)] = \int_0^T \left[\frac{dx}{dt} + \frac{1}{2}\left(\left.\frac{\partial V_R}{\partial x}\right|_{x=x(t)} + \left.\frac{\partial V_B}{\partial x}\right|_{x=x(t)}\right) - y\delta(t - t_0)\right]^2 dt \qquad (19)$$

and partition function

$$Z = \int_{x(0)}^{x(T)} \mathcal{D}x(0 \to T)\, \exp\left\{-\frac{1}{2\sigma^2}S[x(0 \to T)]\right\}. \qquad (20)$$

We have denoted by $x(s \to t)$ the actual path followed by the latent state from time $s$ to time $t$. The measure $\mathcal{D}x(0 \to T)$ is classical Wiener measure. Since $u_R(0 \to T)$ and $u_B(0 \to T)$ are deterministic time-dependent functions of $x(0 \to T)$, we can find their distributions explicitly using the probability distribution Eq. 16 and the appropriate time-dependent Jacobian transformation. These analytical results are of limited utility because we are unaware of analytical solutions to the system given in Eqs. 11 and 12, and hence $V'_R(x, t)$ and $V'_B(x, t)$ must be approximated. In Sec. II C we will derive analytical results that are valid when player $i$ announces a credible commitment to a particular control path.

We find the value functions $V_R(x, t)$ and $V_B(x, t)$ numerically through backward iteration, enforcing a Neumann boundary condition at $x = \pm 3$, which corresponds to bounding the polling popularity of candidate B from below by $100 \times \phi(-3) = 4.7\%$ and from above by $100 \times \phi(3) = 95.3\%$ [29]. We display example realizations of the value functions for different $\lambda_i$ and final conditions in Fig. 2. The value functions display diffusive dynamics because the state equation is driven by Gaussian white noise. The value functions also depend crucially on the final condition. When the final conditions are discontinuous (as in the top panels of Fig. 2) the derivatives of the value functions reach larger magnitudes and vary more rapidly than when the final conditions are continuous. This has consequences for the game-theoretic interpretation of these results, as we discuss in Sec. II B 4. Fig. 2 also demonstrates that the extrema of the value functions are not as large in magnitude when $\lambda_R = \lambda_B = 0$ as when $\lambda_R = \lambda_B = 2$; this is because higher values of $\lambda_i$ mean that player $i$ derives utility not only from the final outcome of the game but also from causing player $\neg i$ to expend resources in the game.
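A sketch of this backward iteration follows (our illustrative implementation, assuming an explicit finite-difference scheme with central differences and no-flux boundaries; the grid and step counts follow the Fig. 2 caption, but the specific discretization is not taken from the paper):

```python
import numpy as np

# Grid following Fig. 2: x in [-3, 3], dx = 0.025, Nt = 8000 backward steps.
L, dx, Nt, T = 3.0, 0.025, 8000, 1.0
x = np.arange(-L, L + dx, dx)
dt = T / Nt
sigma, lam_R, lam_B = 0.6, 2.0, 2.0

Phi_R = 2.0 * np.tanh(x)              # Red final condition (Fig. 2, panel C)
Phi_B = 0.5 * x**2 * (x < 0)          # Blue final condition

def ddx(V):
    """First derivative; zeroed at the boundaries for no-flux (Neumann)."""
    d = np.gradient(V, dx)
    d[0] = d[-1] = 0.0
    return d

def d2dx2(V):
    d2 = np.zeros_like(V)
    d2[1:-1] = (V[2:] - 2 * V[1:-1] + V[:-2]) / dx**2
    d2[0], d2[-1] = d2[1], d2[-2]     # mirror neighbors at the boundaries
    return d2

VR, VB = Phi_R.copy(), Phi_B.copy()
for _ in range(Nt):                   # march backward from t = T to t = 0
    VRx, VBx = ddx(VR), ddx(VB)
    # Right-hand sides of Eqs. 11 and 12.
    rhs_R = (-0.25 * VRx**2 - 0.5 * VRx * VBx - 0.25 * lam_R * VBx**2
             + 0.5 * sigma**2 * d2dx2(VR))
    rhs_B = (-0.25 * VBx**2 - 0.5 * VBx * VRx - 0.25 * lam_B * VRx**2
             + 0.5 * sigma**2 * d2dx2(VB))
    # -dV/dt = rhs, so stepping from t to t - dt adds rhs * dt.
    VR, VB = VR + dt * rhs_R, VB + dt * rhs_B

u_R, u_B = -0.5 * ddx(VR), -0.5 * ddx(VB)   # Eqs. 9 and 10 at t = 0
print(f"u_R(0, x=0) = {u_R[len(x) // 2]:+.3f}, u_B(0, x=0) = {u_B[len(x) // 2]:+.3f}")
```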

FIG. 2. Example value functions corresponding to the system of Eqs. 11 and 12. Panels A and B display $V_R(x, t)$ and $V_B(x, t)$ respectively for $\lambda_R = \lambda_B = 0$, $\Phi_R(x) = 2[\Theta(x) - \Theta(-x)]$, and $\Phi_B(x) = 2[\Theta(|x| - 0.1) - \Theta(0.1 - |x|)]$ with $\Delta = 0.1$, while panels C and D display $V_R(x, t)$ and $V_B(x, t)$ respectively for $\lambda_R = \lambda_B = 2$, $\Phi_R(x) = 2\tanh(x)$, and $\Phi_B(x) = \frac{1}{2}x^2\Theta(-x)$. For each solution we enforce Neumann no-flux boundary conditions and set $\sigma = 0.6$. The solution is computed on a grid with $x \in [-3, 3]$, setting $dx = 0.025$, and integrating for $N_t = 8{,}000$ timesteps.

Eqs. 7 and 8 give the closed-loop control policies $u_R$ and $u_B$ given the current state $X_t$ and time $t$. We display examples of $u_R$, $u_B$, and the electoral process $Z_t$ in Fig. 3. For this example, we simulate the game with parameters $\lambda_R = \lambda_B = 2$, $\Phi_R(x) = x$, and $\Phi_B(x) = \frac{1}{2}x^2\Theta(-x)$. We plot the control policies in the top panel, with the mean control policies $\mathbb{E}[u_R]$ and $\mathbb{E}[u_B]$ displayed as thicker curves. For this parameter set, it is optimal for Red to begin play with a larger amount of interference than Blue does and, on average, to decrease the level of interference over time. Throughout the game Blue increases their resistance to Red's interference. Even though Blue resists Red's interference, Red is able to accomplish their objective of causing candidate A to win.

3. Inference and prediction

The solutions to Eqs. 11 and 12 are functions of the final conditions $V_R(x, T) = \Phi_R(x)$ and $V_B(x, T) = \Phi_B(x)$. It is possible to perform both inference and prediction at times $t < T$ even when $\Phi_R(x)$ and $\Phi_B(x)$ are not known. To do this, we assume that the system given by Eqs. 11 and 12 has a unique solution given particular final conditions $\Phi_R$ and $\Phi_B$. Though we have numerical evidence to suggest that such solutions do exist and are unique, we have not proved that this is the case. In inference, we want to find the distributions of values of some unobserved parameters of the system. We will suppose that we want to infer $\Phi_R$ and $\Phi_B$ given the observed paths $x(0 \to t)$, $u_R(0 \to t)$, and $u_B(0 \to t)$ with $t < T$. For simplicity we assume that we know all other parameters of Eqs. 11 and 12 with certainty. Then the posterior distribution of $\Phi_R$ and $\Phi_B$ reads

$$p(\Phi_R, \Phi_B|X_s) \propto p(x(0 \to t)|\Phi_R, \Phi_B)\, p(\Phi_R, \Phi_B). \qquad (21)$$

The likelihood $p(x(0 \to t)|\Phi_R, \Phi_B)$ is Gaussian, as shown in Eq. 18, and depends on the time-dependent Jacobian transformations defined implicitly by the solutions of Eqs. 11 and 12. The prior over final conditions $p(\Phi_R, \Phi_B)$ can be set proportional to unity if we want to use a maximum-likelihood approach and not account for our prior beliefs about the form of $\Phi_R$ and $\Phi_B$. We can approximate $\Phi_R$ and $\Phi_B$ with functions parameterized by a finite set of parameters $a_{i,k}$, where $i \in \{R, B\}$ and $k = 0, \ldots, K$. The functional prior $p(\Phi_R, \Phi_B)$ is then approximated by the multivariate distribution $p(a_{R,0}, \ldots, a_{R,K}, a_{B,0}, \ldots, a_{B,K})$. We will take this approach when performing inference in Sec. III.

FIG. 3. We display realizations of $u_R$ and $u_B$ in the top panel and paths of the electoral process in the bottom panel. We draw these realizations from the game simulated with parameters $\lambda_R = \lambda_B = 2$, $\Phi_R(x) = x$, and $\Phi_B(x) = \frac{1}{2}x^2\Theta(-x)$. For this parameter set, Blue is fighting a losing battle since optimal play by both players results in lower $\mathbb{E}[Z_t]$ than for the electoral process without any interference.

We can predict future values of $x(t)$, and hence of $u_R(t)$ and $u_B(t)$, similarly. Now we want to find the probability of observing $x(t \to T)$ given observed $x(0 \to t)$. To do this, we integrate out all possible choices of $\Phi_R$ and $\Phi_B$ weighted by their posterior likelihood given the observed path $x(0 \to t)$. The integration is taken with respect to a functional measure, $\mathcal{D}(\Phi_R(x), \Phi_B(x))$; this means that the integration is taken over all possible choices of $\Phi_R$ and $\Phi_B$ that lie in some particular class of functions [30]. As in the case of inference, we can approximate $\Phi_R$ and $\Phi_B$ by functions parameterized by a finite set of parameters $a_{i,k}$ and integrate over the $2K$-dimensional domain of these parameters. In the present work we do not predict any future values of the latent electoral process or control policies.

4. Dependence of value functions on parameters

We conducted a coarse parameter sweep over $\lambda_R$, $\lambda_B$, $\Phi_R$, and $\Phi_B$ to explore the qualitative behavior of this game. We display the results of this parameter sweep for two combinations of final conditions in Figs. 4 and 5. The upper right-hand corner of each panel of the figures displays the final condition of each player. Holding Blue's final condition of $\Phi_B(x) = \frac{1}{2}x^2\Theta(-x)$ constant, we compare the means and standard deviations of the Nash equilibrium strategies $u_R(t)$ and $u_B(t)$ across values of the coupling parameters $\lambda_R, \lambda_B \in [0, 3]$ as Red's final condition changes from $\Phi_R(x) = \tanh(x)$ to $\Phi_R(x) = \Theta(x) - \Theta(-x)$.

For these combinations of final conditions, higher values of the coupling parameters $\lambda_i$ cause the control policies to have higher variance. This increase in variance is more pronounced when Red's final condition is discontinuous, which is sensible since in this case $\lim_{t \to T^-} u_R(x, t) = -\frac{1}{2}\delta(x)$. Appendix A contains similar figures for each of the $3^2 = 9$ combinations of Red example final conditions, $\Phi_R(x) \in \{\tanh(x),\, \Theta(x) - \Theta(-x),\, x\}$, and Blue example final conditions, $\Phi_B(x) \in \left\{\frac{1}{2}x^2,\, \frac{1}{2}x^2\Theta(-x),\, \Theta(|x| - \Delta) - \Theta(\Delta - |x|)\right\}$. We also find that certain combinations of parameters lead to an "arms-race" effect in both players' control policies. For these parameter combinations, Nash equilibrium strategies entail superexponential growth in the magnitude of each player's control policy near the end of the game. Figure 6 displays $\mathbb{E}[u_R]$ and $\mathbb{E}[u_B]$ for some of these parameter combinations, along with the middle 80% credible interval (10th to 90th percentile) of $u_R(t)$ and $u_B(t)$ for each $t$. A credible interval for the random variable $Y \sim p(y)$ is an interval into which $Y$ falls with a particular probability [31]. For example, the middle 80% credible interval for $Y \sim p(y)$ is the interval $(a, b)$ for which $\int_a^b p(y)\, dy = 0.8$ and $\int_{-\infty}^a p(y)\, dy = \int_b^\infty p(y)\, dy = 0.1$.

This growth in the magnitude of each player's control policy occurs when either player has a discontinuous final condition. Although a discontinuous final condition by player $i$ leads to a greater increase in the mean magnitude of player $i$'s control policy than in player $\neg i$'s, the standard deviation of each player's policy exhibits similar superexponential growth. To the extent that this model reflects reality, this points to a general statement about election interference operations: an all-or-nothing mindset by either Red or Blue about the final outcome of the election leads to an arms race that negatively affects both players. This is a general feature of any strategic interaction to which the model described by Eqs. 3-5 applies.

FIG. 4. Example sweeps over the coupling parameters $\lambda_R$ and $\lambda_B$ when Blue's final condition is set to $\Phi_B(x) = \frac{1}{2}x^2\Theta(-x)$. We vary the coupling parameters over $[0, 3]$ and display the resulting standard deviation of the control policies $u_R(x)$ and $u_B(x)$. Panels A and B represent one coupled system of equations, while panels C and D represent a coupled system of equations with a different set of final conditions. In panel A, Red's value function is set to $\Phi_R(x) = \tanh(x)$, while in panel B it is given by $\Phi_R(x) = \Theta(x) - \Theta(-x)$, where $\Theta(\cdot)$ is the Heaviside function. We display a glyph of the corresponding final condition in the upper right corner of each panel. Changing Red's continuous final condition $\tanh(x)$ to the discontinuous $\Theta(x) - \Theta(-x)$ results in substantially increased variation in the control policies of both players.

C. Credible commitment

If player $\neg i$ credibly commits to playing a particular strategy $v(t)$ on all of $[0, T]$, then the problem of player $i$'s finding a subgame-perfect Nash equilibrium strategy profile becomes an easier problem of optimal control. A credible commitment by player $\neg i$ to the strategy $v(t)$ means that

• player $\neg i$ tells player $i$, either directly or indirectly, that player $\neg i$ will follow $v(t)$; and

• player $i$ should rationally believe that player $\neg i$ will actually follow $v(t)$.

An example of a mechanism that makes commitment to a strategy credible is the Soviet Union's "Dead Hand" automated second-strike nuclear response system. This mechanism would launch a full-scale nuclear attack on the United States if it detected a nuclear strike on the Soviet Union [32, 33]. The existence of this mechanism made the commitment to the strategy "launch a full-scale nuclear attack on the United States given that any nuclear attack on my country has occurred" credible even though the potential cost to both parties of executing the strategy was high.

FIG. 5. Example sweeps over the coupling parameters $\lambda_R$ and $\lambda_B$ when Blue's final condition is set to $\Phi_B(x) = \frac{1}{2}x^2\Theta(-x)$. We vary the coupling parameters over $[0, 3]$ and display the resulting means of the control policies $u_R(x)$ and $u_B(x)$. Panels A and B represent one coupled system of equations, while panels C and D represent a coupled system of equations with a different set of final conditions. In panel A, Red's value function is set to $\Phi_R(x) = \tanh(x)$, while in panel B it is given by $\Phi_R(x) = \Theta(x) - \Theta(-x)$. Altering Red's final condition from continuous to discontinuous causes a greater than 100% increase in the maximum value of the mean of Blue's control policy.

When player $\neg i$ credibly commits to playing $v(t)$, player $i$'s problem reduces to finding the policy $u(t)$ that minimizes the functional

$$\mathbb{E}_{u, X}\left\{\Phi(X_T) + \int_0^T \left(u(t)^2 + \lambda v(t)^2\right) dt\right\}, \qquad (22)$$

subject to the modified state equation

$$dx = [u(t) + v(t)]\, dt + \sigma\, dW. \qquad (23)$$

Player $i$'s value function is now given by the solution to the HJB equation

$$-\frac{\partial V}{\partial t} = \min_u\left\{\frac{\partial V}{\partial x}[u + v] + u^2 + \lambda v^2 + \frac{\sigma^2}{2}\frac{\partial^2 V}{\partial x^2}\right\}. \qquad (24)$$

Performing the minimization gives the control policy $u(t) = -\frac{1}{2}\left.\frac{\partial V}{\partial x}\right|_{x=x(t)}$ and the explicit functional form of the HJB equation,

$$-\frac{\partial V}{\partial t} = -\frac{1}{4}\left(\frac{\partial V}{\partial x}\right)^2 + v(t)\frac{\partial V}{\partial x} + \lambda v(t)^2 + \frac{\sigma^2}{2}\frac{\partial^2 V}{\partial x^2}, \quad V(x, T) = \Phi(x). \qquad (25)$$

1. Path integral control

FIG. 6. In the case of strong coupling ($\lambda_R$ and $\lambda_B \gg 0$), discontinuous final conditions by either player cause superexponential growth in the magnitude of each player's control policy. Here we set $\lambda_R = \lambda_B = 3$ and integrate three systems, varying only one final condition in each. Panel A displays a system with two continuous final conditions: $\Phi_R(x) = \tanh(x)$ and $\Phi_B(x) = \frac{1}{2}x^2\Theta(-x)$. Panel B displays the mean Red and Blue control policies when the Red final condition is changed to $\Phi_R(x) = \Theta(x) - \Theta(-x)$ while the Blue final condition remains equal to $\frac{1}{2}x^2\Theta(-x)$, and panel C shows the control policies when $\Phi_B(x) = \Theta(|x| > 1) - \Theta(|x| < 1)$ and $\Phi_R(x) = \tanh(x)$. The shaded regions correspond to the middle 80 percentiles (10th to 90th percentiles) of $u_R(t)$ and $u_B(t)$ for each $t$. When either player has a discontinuous final condition, the inter-percentile range is substantially wider for both players than when both players have continuous final conditions.

Though nonlinear, this HJB equation can be transformed into a backward Kolmogorov equation (BKE) through a change of variables. The BKE can be solved using path integral methods [34]. Setting $V(x, t) = -\eta \log \varphi(x, t)$, substituting into Eq. 25, and performing the differentiation, we are able to remove the nonlinearity if and only if $\frac{\eta^2}{4}\frac{1}{\varphi^2}\left(\frac{\partial \varphi}{\partial x}\right)^2 = \frac{\sigma^2 \eta}{2}\frac{1}{\varphi^2}\left(\frac{\partial \varphi}{\partial x}\right)^2$. Setting $\eta = 2\sigma^2$ satisfies this condition. Performing the change of variables, Eq. 25 is now linear and has a time-dependent drift and sink term,

$$\frac{\partial \varphi}{\partial t} = \frac{\lambda}{2\sigma^2}v(t)^2\varphi(x, t) - v(t)\frac{\partial \varphi}{\partial x} - \frac{\sigma^2}{2}\frac{\partial^2 \varphi}{\partial x^2}, \quad \varphi(x, T) = \exp\left\{-\frac{1}{2\sigma^2}\Phi(x)\right\}. \qquad (26)$$

Application of the Feynman-Kac formula gives the solution to Eq. 26 as [35]

$$\varphi(x, t) = \exp\left\{-\frac{\lambda}{2\sigma^2}\int_t^T v(t')^2\, dt'\right\}\mathbb{E}_{Y_t}\left\{\exp\left[-\frac{1}{2\sigma^2}\Phi(Y_T)\right] \,\middle|\, Y_t = x\right\}, \qquad (27)$$

where $Y_t$ is defined by

$$dY_t = v(t)\, dt + \sigma\, dW_t, \quad Y_0 = x. \qquad (28)$$
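The following Monte Carlo sketch estimates Eq. 27 (assuming the commitment $v(t) = t^2$ and the step final condition used in Fig. 7; the trajectory count follows the Fig. 7 caption, while the step count is our choice):

```python
import numpy as np

rng = np.random.default_rng(3)

sigma, lam, T, n_steps, n_traj = 0.6, 2.0, 1.0, 200, 10_000

def v(t):
    return t**2                                  # credibly-committed policy

def Phi(x):
    return np.where(np.abs(x) > 1.0, 1.0, -1.0)  # step final condition (Fig. 7)

def phi_hat(x, t):
    """Monte Carlo estimate of Eq. 27: simulate Eq. 28 from (x, t) to T."""
    dt = (T - t) / n_steps
    ts = t + dt * np.arange(n_steps)
    Y = np.full(n_traj, x, dtype=float)
    for s in ts:                                 # Euler-Maruyama for Eq. 28
        Y += v(s) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_traj)
    prefactor = np.exp(-lam / (2 * sigma**2) * np.sum(v(ts) ** 2) * dt)
    return prefactor * np.mean(np.exp(-Phi(Y) / (2 * sigma**2)))

# Value function V(x, t) = -2 sigma^2 log(phi(x, t)) at a few grid points.
for x0 in (-1.5, 0.0, 1.5):
    V = -2 * sigma**2 * np.log(phi_hat(x0, 0.75))
    print(f"V({x0:+.1f}, 0.75) ~ {V:+.3f}")
```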

Using this formalism, we apply path integral control to estimate the value function for arbitrary $v(t)$. Fig. 7 displays example path integral solutions to Eq. 25 when player $\neg i$ credibly commits to playing $v(t) = t^2$ for the duration of the game and player $i$'s final cost function takes the form $\Phi(x) = \Theta(|x| - 1) - \Theta(1 - |x|)$. In this figure we display the approximate value functions $V(x, t)$ along with their corresponding approximate control policies $u(x, t) = -\frac{1}{2}\frac{\partial V(x, t)}{\partial x}$. We calculated these approximations on a grid of $N_x = 500$ linearly spaced $x_n \in [-2, 2]$. Since the approximated $u(x_n, t)$ are noisy stochastic functions, we also plot smoothed versions of them in Fig. 7. These smoothed versions are defined by

$$u_s(x_n, t) = \frac{1}{2k + 1}\sum_{n' = -k}^{k} u(x_{n + n'}, t). \qquad (29)$$

We set $k = 7$, and hence the $u_s(x_n, t)$ are $2k + 1 = 15$-step moving averages of the noisier $u(x_n, t)$. As $t \to T$, $u_s(x_n, t)$ approaches the analytical solution of the control problem at $t = T$, which is given by $u(x, T) = -\frac{1}{2}[\delta(x - 1) - \delta(x + 1)]$.

FIG. 7. Result of the path integral Monte Carlo solution method applied to Eq. 25 with the final condition $\Phi(x) = \Theta(|x| > 1) - \Theta(|x| \le 1)$ and $v(t) = t^2$. Approximate value functions are computed using $N = 10{,}000$ trajectories sampled from Eq. 28 for each point $(x, t)$. We display approximate value functions in Panel A for $t \in \{0, 0.75, 1 - dt\}$ and the corresponding approximate control policies $u(x, t)$ in Panel B, along with smoothed versions of $u(x, t)$, which we denote by $u_s(x, t)$, plotted as dashed curves. Panel C displays realizations of $Y_t$, the process generating the measure under which the solution is calculated. The analytical control policy at $t = T$ is given by $u(t) = -\frac{1}{2}[\delta(x - 1) - \delta(x + 1)]$.

In the further restricted case where one party credibly commits to playing a constant control policy $v(t) = v$, we can derive further analytical results. Under this assumption, the probability law corresponding to Eq. 28 is given by

$$u(y, t) = \frac{1}{\sqrt{2\pi\sigma^2 t}}\exp\left\{-\frac{1}{2\sigma^2 t}[(y - x) - vt]^2\right\}, \qquad (30)$$

so that the (exponentially-transformed) value function reads

$$\varphi(x, t) = \mathbb{E}_{u(y, T-t)}\left[\exp\left\{-\frac{\lambda v^2}{2\sigma^2}(T - t) - \frac{1}{2\sigma^2}\Phi(Y_T)\right\}\right] \qquad (31)$$

$$= \frac{\exp\left\{-\frac{\lambda v^2}{2\sigma^2}(T - t)\right\}}{\sqrt{2\pi\sigma^2(T - t)}}\int_{-\infty}^{\infty}\exp\left\{-\frac{1}{2\sigma^2}\left[\Phi(y) + \frac{((y - x) - v(T - t))^2}{T - t}\right]\right\} dy. \qquad (32)$$

This integral can be evaluated exactly for many $\Phi(y)$ and, for many other final conditions, it can be approximated using the method of Laplace. When $t \to T$, so that the denominator in the argument of the exponential in the integrand of Eq. 32 approaches zero, Laplace's approximation to the integral reads

$$\int_{-\infty}^{\infty}\exp\left\{-\frac{1}{2\sigma^2}\left[\Phi(y) + \frac{((y - x) - v(T - t))^2}{T - t}\right]\right\} dy \approx \sqrt{2\pi\sigma^2(T - t)}\,\exp\left\{-\frac{1}{2\sigma^2}\Phi(x + (T - t)v)\right\}. \qquad (33)$$

Inverting the transformation $\varphi$, the value function is approximated by

$$V(x, t) = \lambda v^2(T - t) + \Phi(x + (T - t)v), \qquad (34)$$

and the control policy by

$$u(x, t) = -\frac{1}{2}\Phi'(x + (T - t)v). \qquad (35)$$
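The quality of this approximation can be checked numerically; the sketch below (our illustration, with $\Phi(x) = \tanh(x)$ as in the bottom panel of Fig. 8 and an arbitrary small commitment $v$) compares Eq. 34 against direct quadrature of Eq. 32:

```python
import numpy as np
from scipy.integrate import quad

sigma, lam, v, T = 0.6, 2.0, 0.01, 1.0
Phi = np.tanh   # final condition from the bottom panel of Fig. 8

def V_exact(x, t):
    """Direct quadrature of Eq. 32, then V = -2 sigma^2 log(phi)."""
    tau = T - t
    integrand = lambda y: np.exp(-(Phi(y) + ((y - x) - v * tau) ** 2 / tau)
                                 / (2 * sigma**2))
    # The integrand peaks sharply near y = x + v*tau as t -> T, so we pass
    # that location to quad as a known break point.
    I, _ = quad(integrand, x - 10, x + 10, points=[x + v * tau])
    phi = (np.exp(-lam * v**2 * tau / (2 * sigma**2)) * I
           / np.sqrt(2 * np.pi * sigma**2 * tau))
    return -2 * sigma**2 * np.log(phi)

def V_laplace(x, t):
    """Eq. 34: V(x, t) ~ lam v^2 (T - t) + Phi(x + (T - t) v)."""
    return lam * v**2 * (T - t) + Phi(x + (T - t) * v)

for t in (0.5, 0.9, 0.99):
    errs = [abs(V_exact(x, t) - V_laplace(x, t)) for x in np.linspace(-2, 2, 9)]
    print(f"t = {t:.2f}: max |V_exact - V_laplace| = {max(errs):.4f}")
```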

FIG. 8. When player $\neg i$ commits to playing a constant strategy profile $v(t) = v$ for a fixed interval of time, an analytic approximate form for player $i$'s value function $V(x, t)$ is given by $V(x, t) \approx \lambda v^2(T - t) + \Phi(x + (T - t)v)$. We show the numerically-determined value functions at time $t = 0$ in solid black curves, and we display the Laplace approximations at $t = 0$ in dashed black curves. The lighter-hue curves are the value functions at the final time $T$, i.e., the respective final conditions. The top panel demonstrates results for the final condition $\Phi(x) = \frac{1}{2}x^2$, while the bottom panel has $\Phi(x) = \tanh(x)$.

We display the results of approximating the value function with Eq. 34 at $t = 0$ in Fig. 8, along with the actual numerically-determined value function at both $t = 0$ and, for reference, $t = T$.

2. Dependence on a free parameter

The Laplace-approximated value function may depend on a free parameter $a$ that can be used as a "control knob" to adjust the approximation. For example, player $i$ might use $a$ to tune the sensitivity of the approximation to the electoral process's distance from a dead heat. Ideally, the approximated control policy should have similar scaling and asymptotic properties as the true control policy. Solving an optimization problem for optimal values of $a$ is one approach to satisfying this desideratum.

As a case study, we consider the behavior of the Laplace-approximated value function $V^{(a)}(x, t)$ and its corresponding control policy $u^{(a)}(x, t)$ when we set $\Phi^{(a)}(x) = \tanh(ax)$. We consider this specific example because $\Phi^{(a)}(x) \to \Theta(x) - \Theta(-x)$ as $a \to +\infty$. This limit can be the source of complicated behavior in a variety of fields, such as piecewise-smooth dynamical systems (both deterministic and stochastic) [36, 37], Coulombic friction [38], and evolutionary biology [39].

Fig. 9 displays player $i$'s exponentially-transformed value function Eq. 31 with final condition $\Phi_i(x) = \tanh(ax)$. Here player $\neg i$ credibly commits to playing a constant strategy of $v = 0.01$. As $t \to T$, larger values of $a$ lead to a sharp boundary between regions of the state space that are costly for player $i$ (positive values of $x$) and those that are less costly (negative values of $x$). This behavior is qualitatively similar to behavior arising from the final condition $\Phi_i(x) = \Theta(x) - \Theta(-x)$. However, we will show that there are significant scaling differences in the control policies resulting from using $\Phi(x)$ versus $\Phi^{(a)}(x)$. From Eqs. 33 and 34, we approximate the value function

by

$$V^{(a)}(x, t) \approx \lambda v^2(T - t) + \tanh\{a[x + v(T - t)]\}, \qquad (36)$$

and hence the control policy is approximately

$$u^{(a)}(x, t) \approx -\frac{a}{2}\,\mathrm{sech}^2\{a[x + v(T - t)]\}, \qquad (37)$$

with both expansions increasingly accurate as $t \to T$. When $\Phi(x) = \Theta(x) - \Theta(-x)$, we can compute the value function analytically:

$$\frac{1}{\sqrt{2\pi\sigma^2(T - t)}}\int_{-\infty}^{\infty}\exp\left\{-\frac{1}{2\sigma^2}\left[\Theta(y) - \Theta(-y) + \frac{((y - x) - v(T - t))^2}{T - t}\right]\right\} dy = \cosh\left(\frac{1}{2\sigma^2}\right) + \sinh\left(\frac{1}{2\sigma^2}\right)\mathrm{erf}\left(-\frac{x + v(T - t)}{\sqrt{2\sigma^2(T - t)}}\right), \qquad (38)$$

whereupon we find that

$$V(x, t) = \lambda v^2(T - t) - 2\sigma^2\log\left[\cosh\left(\frac{1}{2\sigma^2}\right) + \sinh\left(\frac{1}{2\sigma^2}\right)\mathrm{erf}\left(-\frac{x + v(T - t)}{\sqrt{2\sigma^2(T - t)}}\right)\right] \qquad (39)$$

and

$$u(x, t) = -\sqrt{\frac{2\sigma^2}{\pi(T - t)}}\;\frac{\exp\left(-\frac{(x + v(T - t))^2}{2\sigma^2(T - t)}\right)}{\coth\left(\frac{1}{2\sigma^2}\right) + \mathrm{erf}\left(-\frac{x + v(T - t)}{\sqrt{2\sigma^2(T - t)}}\right)}. \qquad (40)$$

The approximate control policy $u^{(a)}(x, t)$ and the limiting control policy have similar negative "bell-like" shapes, but they also differ in important ways. The true control policy decays as a Gaussian modulated by the asymmetric function $\mathrm{erf}(\cdot)$. The approximate policy decays logistically and hence more slowly than the true control policy. While the approximate policy is symmetric, the true policy is asymmetric due to the error function term. Using the Laplace approximation results in more control being applied to the electoral process than is optimal; this is because the tails of $u^{(a)}(x, t)$ are heavier than those of $u(x, t)$.

We can maximize the similarity between $u^{(a)}(x, t)$ and $u(x, t)$ by letting the free parameter $a$ be a function of $t$ and solving the functional minimization problem

$$\min_{a(t)} \int_t^T \int_{-\infty}^{\infty} \left[u^{(a(t))}(x, t) - u(x, t)\right]^2 dx\, dt. \qquad (41)$$

A stationary point of this problem is given by an $a(t)$ that solves

$$\int_{-\infty}^{\infty} \left[u^{(a(t))}(x, t) - u(x, t)\right]\frac{\partial u^{(a(t))}(x, t)}{\partial a(t)}\, dx = 0. \qquad (42)$$

We are unable to compute this integral analytically upon substituting Eqs. 37 and 40. We find the solution to this problem by numerically solving Eq. 42 using the secant method for each of 100 linearly-spaced $t \in [0.5, 0.9975]$. We display the optimal $a(t)$, along with the corresponding $u^{(a(t))}(x, t)$ and the true $u(x, t)$, in Fig. 10. We find that the optimal $a(t)$ grows superexponentially as $t \to T$ and that the accuracy of the approximation increases in this limit. This is expected given that $u^{(a)}(x, t)$ is derived using the Laplace approximation, and it is in this limit that the Laplace approximation is valid.
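A sketch of this procedure follows (our illustration: SciPy's root finder in secant mode, trapezoidal quadrature for the inner integral, and a finite-difference stand-in for $\partial u^{(a)}/\partial a$ rather than the analytical derivative):

```python
import numpy as np
from scipy.optimize import newton
from scipy.special import erf

sigma, v, T = 0.6, 0.01, 1.0
xs = np.linspace(-2.0, 2.0, 2001)

def u_true(x, t):
    """Eq. 40: the exact control policy for the step final condition."""
    tau = T - t
    z = (x + v * tau) / np.sqrt(2 * sigma**2 * tau)
    num = np.sqrt(2 * sigma**2 / (np.pi * tau)) * np.exp(-z**2)
    den = 1.0 / np.tanh(1.0 / (2 * sigma**2)) + erf(-z)   # coth + erf
    return -num / den

def u_approx(x, t, a):
    """Eq. 37: the Laplace-approximated control policy, -(a/2) sech^2."""
    tau = T - t
    return -0.5 * a / np.cosh(a * (x + v * tau)) ** 2

def stationarity(a, t, da=1e-4):
    """Left-hand side of Eq. 42; du/da via central finite differences."""
    resid = u_approx(xs, t, a) - u_true(xs, t)
    du_da = (u_approx(xs, t, a + da) - u_approx(xs, t, a - da)) / (2 * da)
    return np.trapz(resid * du_da, xs)

a_opt = 1.0                          # warm start, updated as t marches forward
for t in np.linspace(0.5, 0.9975, 5):
    a_opt = newton(stationarity, a_opt, args=(t,))        # secant method
    print(f"t = {t:.4f}: optimal a = {a_opt:.3f}")
```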

Even with the assumption of credible commitment to a constant control policy $v$, we can use this theory to approximate the value function in a noncooperative scenario. For arbitrary $v(t)$, expansion about $t + \Delta t$ gives $v(t + \Delta t) = v(t) + v'(t)\Delta t$, leading to an approximate value function iteration over a small time increment $\Delta t$,

$$V(x, t + \Delta t) \approx \lambda v(t)^2(T - t) + \Phi(x + (T - t)[v(t) + v'(t)\Delta t]). \qquad (43)$$

In application, both $v(t)$ and $v'(t)$ can be estimated from possibly-noisy data on $t' \in [0, t]$.

FIG. 9. If player $\neg i$ credibly commits to playing a constant strategy with value equal to $v$ for the entire duration of the game, player $i$'s (exponentially-transformed) value function $\varphi(x, t)$ has an integral representation given by Eq. 31. We display dynamics of $\varphi(x, t)$ with the final condition set to $\Phi(x) = \tanh(ax)$, at times $t \in \{0.5, 0.75, 0.875, 0.9375, 0.9688, 0.9844, 0.9922, 0.9961\}$. We calculated this value function for $x \in \left(-\frac{3}{2}, \frac{3}{2}\right)$ and logarithmically equally-spaced values of $a \in [10^{-3}, 10^5]$. For $a < 10^{-1}$, the value function is nearly constant. When $a > 10^1$, $\frac{\partial}{\partial x}\varphi(x, t)$ increases in magnitude near $x = 0$ as $t \to T$.

FIG. 10. The solution to Eq. 41 is a superexponentially-increasing $a(t)$ parameter in the Laplace method-derived value function $V^{(a)}(x, t) = \tanh(ax)$. We use this value function as an approximation to the exact value function given in Eq. 39. Dashed curves indicate $u(x, t)$, while solid curves indicate $u^{(a)}(x, t)$. The lower-right inset axis displays the same data as the main axis and also includes $u(x, t)$ and $u^{(a)}(x, t)$ at the last simulation timestep, $t = 0.9975$, to demonstrate the increasing accuracy of the approximation as $t \to T$. The lower-left inset displays the optimal $a(t)$.

III. APPLICATION

An example of election interference operations is the Russian military foreign intelligence service (Red team) activity in the 2016 U.S. presidential election contest. Red team attempted to harm one candidate's (Hillary Clinton's) chances of winning and to aid another candidate (Donald Trump) [11]. Though Russian foreign intelligence had conducted election interference operations at least once before, in the Ukrainian elections of 2014 [40], the 2015 and 2016 operations were notable in that Red team operatives used the microblogging website Twitter in an attempt to influence the election outcome. When this attack vector was discovered, Twitter shut down accounts associated with Red team activity, and all data associated with those accounts was collected and analyzed [41–43]. There has been analysis of the qualitative and statistical effects of these and other election attack vectors (e.g., Facebook advertisement purchases) on election polling and the outcome of the election [44], and on the detection of election influence campaigns more generally [45, 46]. However, to the best of our knowledge, there exists no publicly-available effort to reverse-engineer the quantitative nature of the control policies used by Russian military intelligence and by U.S. domestic and foreign intelligence agencies.

We first fit a discrete-time formulation of the model described in Sec. II A. We then compare it to theoretical predictions by finding values of free parameters in the theoretical model that best describe the observed data and inferred latent controls. We face two distinct sources of uncertainty in this procedure. First, we cannot observe either Red's or Blue's control policy directly, because foreign and domestic intelligence agencies shroud their activities in secrecy. Second, each player's final time payoff structure is also secret and unknown to us. To partially circumvent these issues, we construct a two-stage model. The first stage is a Bayesian structural time series model, depicted graphically in Fig. 11, through which we are able to infer distributions of discretized analogues of $u_R(t)$, $u_B(t)$, and $x(t)$. Once we have inferred these distributions, we minimize a loss function that compares the means of these distributions to the means of distributions produced by the model described in Sec. II A.

We make the simplifying assumptions about the format of the election that we stated in Sec. I when constructing the discrete-time election model. Namely, we assume that only two candidates contest the election and that the election process is modeled by a simple "candidate A versus candidate B" poll. Though there are methods for forecasting elections that make fewer and less restrictive assumptions than these, such as compartmental infection models [47], prediction markets [48], and more sophisticated Bayesian models [49, 50], we construct our statistical model to mimic the underlying election model of Sec. II A. We do this to test the ability of this underlying theoretical model to reproduce inferred control and observed election dynamics.

We can observe neither the Red $u_R(t)$ nor the Blue $u_B(t)$ control policies. However, we are able to observe a proxy for $u_R$: the number of tweets sent by Russian military intelligence-associated accounts in the year leading up to the 2016 election [51]. This dataset contains a total of 2,973,371 tweets from 2,848 unique Twitter handles. Of these tweets, a total of 1,107,361 occurred in the year immediately preceding the election (08/11/2015 - 08/11/2016). We grouped these tweets by day and used the time series of the total number of tweets on each day as an observable from which we could infer $u_R$. We restricted the time range of the model to begin at the later of the end dates of the Republican National Convention (21 July 2016) and the Democratic National Convention (28 July 2016). We did this because the later of these dates, 28 July 2016, is the day on which the race was officially between two major party candidates. Of all Russian military intelligence-associated tweets in 2016, 363,131 occurred during the 102 days beginning on 28 July 2016 and ending the day before Election Day. Though the presence of minor party candidates probably played a role in the result of the election, even the most prominent minor parties (Libertarian and Green) received only single-digit support [52, 53]. We do not model these minor parties and instead consider only the electoral contest between the two major party candidates. We used the RealClearPolitics poll aggregation as a proxy for the electoral process itself [54], averaging polls that were recorded on the same date and, for a poll conducted over multiple days, using the earliest date in the poll's date range as the timestamp of that observation. We weighted all polls equally when averaging.

Using these two observed random variables, we fit a Bayesian structural time series model [55] of the form presented in Fig. 11. We now describe the structure of the model and explain our choices of priors and likelihood functions.

FIG. 11. We approximate the time series components of the analytical model defined in Sec. II A by a Bayesian structural time series (BSTS) model. We subsequently confront the BSTS model with 2016 U.S. presidential election data. Observed random variables are denoted by gray-shaded nodes, while latent random variables are represented by unshaded nodes or red ($u_{R,t}$) and blue ($u_{B,t}$) nodes. We observe a noisy election poll, denoted by $Z_t$, and a time series of tweets associated with Russian military intelligence, denoted by $\text{Tweets}_t$. Our objective in this modeling stage is to infer the latent electoral process, denoted by $X_t$, and the latent control policies.

In the analytical model, we model the latent control policies $u_R(t)$ and $u_B(t)$ by time- and state-dependent Wiener processes. To see this, recall that the state equation evolves according to a Wiener process and apply Ito's lemma to the deterministic functions of a random variable $-\frac{1}{2}\left.\frac{\partial V_R}{\partial x'}\right|_{x'=x_t}$ and $-\frac{1}{2}\left.\frac{\partial V_B}{\partial x'}\right|_{x'=x_t}$, which define the control policies. A discretized version of the Wiener process is a simple Gaussian random walk. We thus model the latent Red and Blue control policies by Gaussian random walks:

$$p(u_{R,t}|u_{R,t-1}, \mu_R, \sigma) = \mathcal{N}(u_{R,t-1} + \mu_R, \sigma^2) \qquad (44)$$

$$p(u_{B,t}|u_{B,t-1}, \mu_B, \sigma) = \mathcal{N}(u_{B,t-1} + \mu_B, \sigma^2) \qquad (45)$$

Similarly, we model the latent election process by a discretized version of the state evolution equation Eq. 3:

$$p(X_t|X_{t-1}, u_R, u_B) = \mathcal{N}(X_{t-1} + u_{B,t-1} - u_{R,t-1}, 1) \qquad (46)$$

We assume that the latent election model is subject tonormal observation error in latent space. Since we chosea logistic function as the link between the latent and real(on (0, 1)) election spaces, the likelihood for the observedelection process is thus given by a Logit-Normal distri-bution. The pdf of this distribution is

p(Z_t \mid X_t, \sigma_Z) = \frac{1}{\sqrt{2\pi\sigma_Z^2}\, Z_t(1 - Z_t)} \exp\left\{ -\frac{\left(\mathrm{logit}(Z_t) - X_t\right)^2}{2\sigma_Z^2} \right\}. \quad (47)
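For concreteness, the density of Eq. 47 transcribes directly into a few lines of Python; the function and variable names below are ours, not from the paper's codebase.

    # A direct transcription of the Logit-Normal density of Eq. 47.
    import numpy as np

    def logit(z):
        return np.log(z / (1.0 - z))

    def logit_normal_pdf(z, x, sigma_z):
        """Density of the observed poll Z_t given latent X_t and scale sigma_Z."""
        norm = 1.0 / (np.sqrt(2.0 * np.pi) * sigma_z * z * (1.0 - z))
        return norm * np.exp(-((logit(z) - x) ** 2) / (2.0 * sigma_z ** 2))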

Though the number of Russian military intelligence tweets that occur on any given day is obviously a nonnegative integer, we chose not to model it this way. A common and simple model for a "count" random variable, such as the tweet time series, is a Poisson distribution with possibly-time-dependent rate parameter [56–59]. This model imposes a strong assumption on the variance of the count distribution (namely, that the variance and mean are equal) which does not seem realistic in the context of the tweet data. Rather than search for a discrete count distribution that meets some optimality criterion, we normalized the tweet time series to have zero mean and unit variance, making it a continuous random variable rather than a discrete one. We then shifted the time series so that the new time series was equal to zero on the day during our study with the fewest tweets. We then modeled this time series Tweetst by a normal observation likelihood,

p(\mathrm{Tweets}_t \mid u_{R,t}, \sigma_{\mathrm{Tweets}}) = \mathcal{N}(u_{R,t},\ \sigma_{\mathrm{Tweets}}^2). \quad (48)

We placed a weakly-informative prior, a Log-Normal distribution, on each standard deviation random variable (σ, σZ, σTweets), and zero-centered Normal priors on each mean random variable (µR, µB). This model is high-dimensional, since the latent time series X, uR, and uB are inferred as T-dimensional vectors. In total this model has 3T + 5 = 311 degrees of freedom.
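A minimal sketch of this structure follows, assuming PyMC3; the paper specifies the distributions of Eqs. 44-48 and the No-U-Turn Sampler but not a library, so the variable names and the use of GaussianRandomWalk are our reading of the model, not the authors' code.

    # A minimal PyMC3-style sketch of the BSTS model of Eqs. 44-48.
    import pymc3 as pm
    import theano.tensor as tt

    def build_bsts(logit_polls, tweets_norm):
        T = len(logit_polls)
        with pm.Model() as model:
            # Weakly-informative Log-Normal priors on scales,
            # zero-centered Normal priors on drifts
            sigma = pm.Lognormal("sigma", mu=0.0, sigma=1.0)
            sigma_Z = pm.Lognormal("sigma_Z", mu=0.0, sigma=1.0)
            sigma_T = pm.Lognormal("sigma_Tweets", mu=0.0, sigma=1.0)
            mu_R = pm.Normal("mu_R", mu=0.0, sigma=1.0)
            mu_B = pm.Normal("mu_B", mu=0.0, sigma=1.0)

            # Latent control policies: Gaussian random walks (Eqs. 44-45)
            u_R = pm.GaussianRandomWalk("u_R", mu=mu_R, sigma=sigma, shape=T)
            u_B = pm.GaussianRandomWalk("u_B", mu=mu_B, sigma=sigma, shape=T)

            # Latent electoral process (Eq. 46, up to an index convention):
            # unit-variance innovations accumulated over time
            eps = pm.Normal("eps", mu=0.0, sigma=1.0, shape=T)
            X = pm.Deterministic("X", tt.cumsum(u_B - u_R + eps))

            # Observation likelihoods: a Normal model for logit(Z_t) is the
            # Logit-Normal likelihood of Eq. 47 up to a Jacobian that is
            # constant in the parameters; Eq. 48 for the tweet series
            pm.Normal("obs_Z", mu=X, sigma=sigma_Z, observed=logit_polls)
            pm.Normal("obs_T", mu=u_R, sigma=sigma_T, observed=tweets_norm)
        return model

    # Matching the sampling settings described below:
    # model = build_bsts(logit_polls, tweets_norm)
    # with model:
    #     trace = pm.sample(draws=2000, tune=1000, chains=2)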

We display a graphical representation of this model in Fig. 11. We fit this model using the No-U-Turn Sampler algorithm [60], sampling 2000 draws from the model's posterior distribution from each of two independent Markov chains; we do not include the 1000 burn-in draws per chain in the samples from the posterior.

The sampler appeared to converge well based on graphical consideration (i.e., the "eye test") of draws from the posterior predictive distributions of Zt and Tweetst, and, more importantly, because maximum values of Gelman-Rubin statistics [61] for all variables satisfied Rmax < 1.01, except for that of σZ, which had Rmax = 1.07646. Each of these values is below the level R = 1.1 advocated by Brooks and Gelman [62]. Fig. 12 displays draws from the posterior and posterior predictive distributions of this model. Panel A displays draws of Xt from the posterior distribution, along with E[Xt] and logit(Zt), while in panel B we show posterior draws of uR and uB, along with E[uR] and E[uB] in thick red and blue curves respectively. In panel C, we display Tweetst and draws from its posterior predictive distribution. On 10/06/2016, Tweetst exhibited a large spike that is very unlikely under the posterior predictive distribution. This spike likely corresponds with a statement made by the U.S. federal government on this date that officially recognized the Russian government as culpable for hacking the Democratic National Committee computers.
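A Gelman-Rubin check of this kind can be reproduced with, for example, the ArviZ library, assuming "trace" is the object returned by pm.sample in the earlier sketch:

    # Hypothetical convergence check with ArviZ; az.rhat returns one
    # Gelman-Rubin statistic per model variable.
    import arviz as az

    rhat = az.rhat(trace)
    print(rhat.max())   # compare against the R < 1.1 threshold of [62]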

After inferring the latent control policies and electoral process, we searched for the parameter values θ = (λR, λB, σ, ΦR, ΦB) of the theoretical model that best explain the observed data and inferred latent variables. For clarity in reference, we will refer to the theoretical model as Q and the Bayesian structural time series model as M. We use Legendre polynomials to approximate the final conditions ΦR and ΦB, as discussed in Sec. II B 3.

Setting $\Phi_i(x) \approx \sum_{k=0}^{K} a_{ik} P_k(x)$, Q's parameter vector is θ = (λR, λB, σ, a0,r, ..., aK,r, a0,b, ..., aK,b). In contrast with M, Q has relatively few degrees of freedom, since the assumption of state and policy co-evolution via solution of coupled partial differential equations substantially restricts the system's dynamics. In total, Q has 2K + 3 free parameters. The smaller the value of K, the less accurate the approximation to the true final conditions will be. Conversely, large K could lead to overparameterization of the model and increase the size of the search space. We thus chose K = 10 as a compromise between these two extremes. With K = 10, the model has 2K + 3 = 23 degrees of freedom.
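As an illustration of this finite Legendre expansion, a sketch using numpy's Legendre utilities follows; the coefficient layout and values are placeholders of our own.

    # A sketch of the K-term Legendre approximation to a final condition
    # Phi_i(x); the coefficient values here are hypothetical.
    import numpy as np
    from numpy.polynomial import legendre

    K = 10
    a_i = np.zeros(K + 1)       # coefficients (a_{i0}, ..., a_{iK})
    a_i[1] = 1.0                # e.g., Phi_i(x) ~ P_1(x) = x

    def phi_approx(x, coeffs):
        """Evaluate sum_k a_{ik} P_k(x) as a Legendre series."""
        return legendre.legval(x, coeffs)

    x_grid = np.linspace(-3.0, 3.0, 241)    # spatial grid as in Appendix A
    phi_vals = phi_approx(x_grid, a_i)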

The theoretical model Q can be viewed as a generative probabilistic function. To find optimal parameter values, we generate (uR, uB, X) from Q and minimize a loss function of these generated values and the values inferred by M. We defined this loss function as

\mathcal{L}(\theta \mid Q) = \sum_{(y, \hat{y})} \left[ \lVert \mu_y - \mu_{\hat{y}} \rVert_2^2 + \eta\, \sigma_{\hat{y}} \right], \quad (49)

where y ∈ {uR, uB, X} ranges over the latent variables inferred by M and ŷ over the corresponding variables generated by Q. We have denoted the mean and standard deviation under the corresponding distribution by µ and σ respectively. The ℓ2 terms in Eq. 49 penalize deviation by Q from the mean of M's inferred posterior distribution. The standard deviation term in Eq. 49 imposes a penalty on dispersion.
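A sketch of Eq. 49 under our own array conventions (each variable summarized by its mean and standard deviation over sampled paths):

    # A sketch of the loss of Eq. 49. "inferred" holds posterior draws from M
    # and "generated" holds paths simulated from Q; shapes (n_draws, T).
    import numpy as np

    def loss(inferred, generated, eta=0.002):
        total = 0.0
        for key in ("u_R", "u_B", "X"):
            mu_m = inferred[key].mean(axis=0)       # posterior mean path under M
            mu_q = generated[key].mean(axis=0)      # mean path under Q
            sigma_q = generated[key].std(axis=0)    # dispersion penalty term
            total += np.sum((mu_m - mu_q) ** 2) + eta * np.sum(sigma_q)
        return total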


FIG. 12. Panel A displays the logit of the observed election time series (black curve) logit(Zt), along with the posterior distribution of the latent electoral process Xt. Panel B displays the mean latent control policies in thick red and blue curves, along with their posterior distributions. Panel C shows the true tweet time series (subject to the normalization described in the main body) along with draws from its posterior predictive distribution. The large spike in the tweet time series that is very unlikely under the posterior predictive distribution corresponds to the day (10/06/2016) on which the U.S. federal government officially accused Russia of hacking the Democratic National Committee computers.

We minimized L(θ|Q) using a Gaussian-process Bayesian optimization algorithm. The details of this algorithm are beyond the scope of this work but are readily found in any review paper on the subject [63–65]. Fig. 13 displays the result of this optimization procedure for K = 10 and η = 0.002. For this set of hyperparameters, we found coupling parameter values of λR = 0.1432 and λB = 1.7847 and a latent space volatility of σ = 0.7510. In Fig. 13, we use credible intervals to denote ranges into which our estimates of model parameters fall.
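One widely-used implementation of Gaussian-process Bayesian optimization is scikit-optimize's gp_minimize; the sketch below shows the shape of such a search with hypothetical bounds and a stand-in objective, not the authors' exact setup.

    # A sketch of GP-based Bayesian optimization over theta with
    # scikit-optimize; bounds and the stand-in objective are hypothetical.
    from skopt import gp_minimize

    K = 10
    # theta = (lambda_R, lambda_B, sigma, a_{0,r}, ..., a_{K,r}, a_{0,b}, ..., a_{K,b})
    bounds = [(0.0, 3.0)] * 3 + [(-2.0, 2.0)] * (2 * (K + 1))

    def objective(theta):
        # In the real pipeline: integrate the coupled HJB system at these
        # parameter values, simulate (u_R, u_B, X) from Q, and return the
        # loss of Eq. 49 against M's posterior. Here, a stand-in quadratic.
        return float(sum(t ** 2 for t in theta))

    result = gp_minimize(objective, bounds, n_calls=50, random_state=0)
    print(result.x, result.fun)   # best theta found and its loss value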

Panel A of Fig. 13 displays logit(Zt) in a thick black curve and a middle 80% (10% to 90%) credible interval of X from Q in grey shading. The observed logit(Zt) is centered in the credible interval of X and hence has a high probability under Q. In panel B, we show E[uR] and E[uB] in thick red and blue curves respectively, along with middle 80% credible intervals of uR and uB. The mean paths of the latent red and blue control policies do not lie in the middle 80% credible intervals for approximately the first two weeks after the end of the Democratic National Convention, but do lie in these credible intervals for the remainder of the time until the election. Our model is able to capture the election interference dynamics in the middle range of this timespan but is not able to capture the dynamics immediately after the race becomes a two-candidate election. Though the election does officially become a two-candidate contest at that time (notwithstanding our previous comments about third-party candidates), the effects of the Republican and Democratic primaries may take time to dissipate. Our model does not capture the dynamics of noncooperative games in the presence of many candidates. We comment further on this finding in Sec. IV. Finally, we display the inferred final conditions ΦR(x) and ΦB(x) in panel C of Fig. 13.

IV. DISCUSSION AND CONCLUSION

We introduce, analyze, and numerically (analytically in simplified cases) solve a simple model of noncooperative strategic election interference. This interference is undertaken by a foreign intelligence service from one country (Red) in an election occurring in another country (Blue). Blue's domestic intelligence service attempts to counter this interference. Though simple, our model is able to provide qualitative insight into the dynamics of such strategic interactions and performs well when fitted to polling and social media data surrounding the 2016 U.S. presidential election contest. We find that all-or-nothing attitudes regarding the outcome of the election interference, even if these attitudes are held by only one player, result in an arms race of spending on interference and counter-interference operations by both players. We then find analytical solutions to player i's optimal control problem when player ¬i credibly commits to a strategy v(t). We detail an analytical value function


FIG. 13. We display credible intervals of the latent election process X and the Red and Blue control policies, uR and uB, generated using optimal θ = (λR, λB, σ, a0,r, ..., aK,r, a0,b, ..., aK,b) values. We ran the optimization algorithm with the number of terms of the Legendre expansion of ΦR and ΦB set to K = 10 and set the variance regularization in the algorithm to η = 0.002. This resulted in fit parameters of λR = 0.849, λB = 0.727, and σ = 1.509. Panel A displays draws from the latent electoral process under Q, along with logit(Zt), the logit-transformed real polling popularity process. Panel B displays draws from the distributions of uR and uB under Q, while panel C displays the inferred final conditions ΦR(x) and ΦB(x).

approximation that can be used by player i even when player ¬i does not commit to a particular strategy, as long as player ¬i's current strategy and its time derivative can be estimated. We demonstrate the applicability of our model to real election interference scenarios by analyzing the Russian effort to interfere in the 2016 U.S. presidential election through observation of Russian troll account posts on the website Twitter. Using this data, along with aggregate presidential election polling data, we infer the time series of Russian and U.S. control policies and find parameters of our model that best explain these inferred control policies. We show that, for most of the time under consideration (after the Democratic National Convention and before Election Day), our model provides a good explanation for the inferred variables. However, our model does not accurately or precisely capture the interference dynamics immediately after the contest becomes a two-candidate race.

There are several areas in which our work could be improved. While our model is justifiable on the grounds of parsimony and acceptable empirical performance on at least one election contest, the kind of assumptions that we make in constructing our modeling framework are unrealistic. Though a pure random walk model for an election is not without serious precedent [66], an extension of this work could incorporate non-interference-related state dynamics as a generalization of Eq. 3. For example, the state equation could read

dx = \left[\mu_0 + \mu_1 x + u_R(t) + u_B(t)\right] dt + \sigma\, dW. \quad (50)

This state equation accounts for simple drift in the election results as a candidate endogenously becomes more or less popular. It can also account for possible mean-reverting behavior in a hotly-contested race.
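A minimal Euler-Maruyama sketch of the generalized state equation Eq. 50 follows; the parameter values and the constant control paths are illustrative only (a negative µ1 produces the mean-reverting behavior mentioned above).

    # A minimal Euler-Maruyama simulation of Eq. 50; parameter values and
    # the constant control paths are illustrative only.
    import numpy as np

    def simulate_state(mu0=0.0, mu1=-0.5, sigma=0.75, T=1.0, n_steps=1000,
                       u_R=lambda t: -0.1, u_B=lambda t: 0.1, x0=0.0, seed=0):
        rng = np.random.default_rng(seed)
        dt = T / n_steps
        x = np.empty(n_steps + 1)
        x[0] = x0
        for i in range(n_steps):
            t = i * dt
            drift = mu0 + mu1 * x[i] + u_R(t) + u_B(t)   # drift of Eq. 50
            x[i + 1] = x[i] + drift * dt + sigma * np.sqrt(dt) * rng.normal()
        return x

    latent_path = simulate_state()   # latent X_t; the observed poll is phi(X_t)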

Another extension could introduce state-dependent running costs, particularly in the case of the Red player. Though the action of election interference is nominally intended to cause a particular candidate to win or lose, Red could have other objectives as well, such as undermining the Blue citizens' trust in their electoral process. Red might gain utility from having a candidate lead in polls multiple times when that candidate would not otherwise have done so, even if the candidate does not actually win the election. In the context of our model, this is represented by setting Red's cost functional to be

\mathbb{E}_{u_R, u_B, X}\left\{ \Phi_R(X_T) + \int_0^T \left[ -\Theta(-X_t) + u_R^2(t) - \lambda_R u_B^2(t) \right] dt \right\}. \quad (51)

Both of these modifications are easy to incorporate into the model and do not change the qualitative nature of Red and Blue's HJB equations, since their effects are simply to introduce an additional drift term (Eq. 50) or a discontinuous source term (Eq. 51) into the HJB equations (Eqs. 11 and 12). That is, the fundamental nature of these equations as nonlinear parabolic equations coupled through quadratic terms of self- and other-player first spatial derivatives remains unchanged, as these modifications to the theory do not introduce any new coupling terms. The solutions to these equations do not demonstrate shock or travelling wave behavior with the addition of the drift or source terms, as we show in Figs. 14 and 15.


With the modification of Eq. 50, the HJB equations become

-\frac{\partial V_R}{\partial t} = (\mu_0 + \mu_1 x)\frac{\partial V_R}{\partial x} - \frac{1}{4}\left(\frac{\partial V_R}{\partial x}\right)^2 - \frac{1}{2}\frac{\partial V_R}{\partial x}\frac{\partial V_B}{\partial x} - \frac{\lambda_R}{4}\left(\frac{\partial V_B}{\partial x}\right)^2 + \frac{\sigma^2}{2}\frac{\partial^2 V_R}{\partial x^2}, \qquad V_R(x, T) = \Phi_R(x), \quad (52)

and

-\frac{\partial V_B}{\partial t} = (\mu_0 + \mu_1 x)\frac{\partial V_B}{\partial x} - \frac{1}{4}\left(\frac{\partial V_B}{\partial x}\right)^2 - \frac{1}{2}\frac{\partial V_B}{\partial x}\frac{\partial V_R}{\partial x} - \frac{\lambda_B}{4}\left(\frac{\partial V_R}{\partial x}\right)^2 + \frac{\sigma^2}{2}\frac{\partial^2 V_B}{\partial x^2}, \qquad V_B(x, T) = \Phi_B(x). \quad (53)

In Fig. 14 we give examples of solutions of Eqs. 52 and 53 at t = 0 and t = T. These solutions do not display qualitative changes, such as the formation of shocks or travelling waves, with the inclusion of nonzero drift terms of the form $(\mu_0 + \mu_1 x)\frac{\partial V_i}{\partial x}$, $i \in \{R, B\}$. With the modification of Eq. 51, Red's HJB equation reads

-\frac{\partial V_R}{\partial t} = -\frac{1}{4}\left(\frac{\partial V_R}{\partial x}\right)^2 - \frac{1}{2}\frac{\partial V_R}{\partial x}\frac{\partial V_B}{\partial x} - \frac{\lambda_R}{4}\left(\frac{\partial V_B}{\partial x}\right)^2 - \Theta(-x) + \frac{\sigma^2}{2}\frac{\partial^2 V_R}{\partial x^2}, \qquad V_R(x, T) = \Phi_R(x). \quad (54)

We plot solutions of Eq. 54 in Fig. 15. These solutions also do not change qualitatively from the solutions to Eqs. 11 and 12 in that there is no shock or travelling wave formation [67]. A more fundamental qualitative change would be to expand the scope of Red's interference to alter the latent volatility of the election process. Red's additional objective might be to increase the uncertainty in polling results.
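To make the numerical treatment behind Figs. 14 and 15 concrete, here is a minimal explicit finite-difference sketch of one backward-in-time step of Eq. 54. The grid handling and the simple explicit scheme are our assumptions; the authors' actual integrator is red_blue_pdes.py in their repository [29].

    # A sketch of one explicit backward-in-time step of Eq. 54; the scheme
    # and boundary handling are illustrative, not the authors' integrator.
    import numpy as np

    def step_back_V_R(V_R, V_B, x, dt, lam_R=1.0, sigma=1.0):
        dx = x[1] - x[0]
        dVR = np.gradient(V_R, dx)              # dV_R/dx
        dVB = np.gradient(V_B, dx)              # dV_B/dx
        d2VR = np.gradient(dVR, dx)             # d^2 V_R / dx^2
        rhs = (-0.25 * dVR ** 2
               - 0.5 * dVR * dVB
               - 0.25 * lam_R * dVB ** 2
               - np.heaviside(-x, 0.5)          # discontinuous running cost
               + 0.5 * sigma ** 2 * d2VR)
        # Since -dV/dt = rhs, marching from t to t - dt adds dt * rhs;
        # explicit stepping needs dt small relative to dx**2 / sigma**2.
        V_new = V_R + dt * rhs
        V_new[0], V_new[-1] = V_new[1], V_new[-2]   # crude Neumann boundaries
        return V_new

Starting from V_R(x, T) = Φ_R(x) and iterating backward to t = 0, alternated with the analogous update for V_B, recovers value-function curves of the kind shown in Fig. 15.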

In addition to theoretical modifications, other work could extend these results to other elections using similarly fine-grained or more granular data. This approach is difficult because there is very little granular public data on election interference [6]. We are able to confront our model with data only because the Russian interference in the 2016 U.S. presidential election was well-publicized and because the interference took place at least partially through the mechanism of Twitter, which is a public data source. We were unable to find any other publicly-available data at daily (or finer) temporal resolution for any other publicly-acknowledged election interference episode.

In the case study of the 2016 U.S. presidential election, our theoretical model accurately captured election interference dynamics from approximately two weeks after the Democratic National Convention until Election Day. However, it did not capture the dynamics of election interference accurately or precisely during the first fortnight of the time under study. We believe this is because, even though the election was then a two-candidate contest, there were additional election state and interference dynamics that we did not model. We believe that these dynamics arise because the transition from interfering in many candidates' primary campaigns to interfering in only one electoral contest is not immediate. It likely takes time for the foreign intelligence agency to recalibrate its interference strategy. In addition, the foreign intelligence agency may still expend resources on influencing other candidates' supporters, even though those other candidates had been unsuccessful in their quest for inclusion in the general election. Suppose that there are initially NR "red candidates" (candidates that the foreign intelligence service would like to win the election) and NB "blue candidates" (candidates that the foreign intelligence service would not like to win the election). Then modeling the transition between the conventions and the general election requires collapsing the state equation from an (NR + NB − 2)-dimensional stochastic differential equation (SDE) to a one-dimensional SDE. The cost functions of Red and Blue would probably also change during this transition, but we are unsure of how to model this change. Because of the dimensionality reduction in the state equation, the coupled HJB equations would change from being PDEs solved in NR + NB − 2 spatial dimensions to ones solved in one spatial dimension, as now. We did not attempt to model these dynamics, but this could be a useful expansion of our model.

We used two models in our analysis of interference in the 2016 U.S. presidential election. We used a Bayesian structural time series model to infer the latent random variables uR, uB, and X, and then used these inferred values to fit parameters of the theoretical model described in Sec. II B. While the theoretical model does not have many free parameters (2K + 3 = 23 degrees of freedom), the structural time series model does have many free parameters (3T + 5 = 311 degrees of freedom). The large number of free parameters of the structural time series model does not mean that the model is overparameterized. At each t, we observe a number of tweets Tweetst and a popularity rating for the candidates Zt. From these we want to infer the distributions of the random variables uR,t, uB,t, and Xt. Since we want to infer the distribution of each of these three random variables at each of the T timesteps, this large number of parameters is expressly necessary. If we observe M identical trials of an election process over T timesteps, the ratio of structural time series model parameters to observed datapoints is given by

R(T, M) = \frac{3T + 5}{2MT} = \frac{3}{2M}\left(1 + \frac{5}{3T}\right). \quad (55)

With T held constant, R(T, M) → 0 as M grows, while with M held constant, R(T, M) → 3/(2M) as T grows large. Since we observe only one draw from the election interference model, M = 1 in our case. However, this approach of inferring each random variable's distribution is not tractable when T becomes large, since the number of parameters to fit still grows linearly with T.


FIG. 14. We demonstrate the lack of qualitative changes in the value functions given by solutions to Eqs. 52 and 53 when compared to the solutions of Eqs. 11 and 12. We plot Vi(x, 0) in solid curves and Vi(x, T) in dashed curves, i ∈ {R, B}. We set final conditions here to ΦR(x) = tanh x and ΦB(x) = (1/2)x²Θ(−x).


One way of partially circumventing this problem is to use a variational inference approach combined with amortization of the random variables. Denote the vector of all observed random variables at time t by yt and the vector of all latent random variables at time t by wt. Variational inference replaces the actual posterior distribution with an approximate posterior distribution that has a known normalization constant (thereby eliminating the need for MCMC routines to compute this constant) [68–70]. The parameters of the approximate posterior are found through optimization, which is generally much faster than Monte Carlo sampling. If the joint distribution of y = (y1, ..., yT) and w = (w1, ..., wT) is given by $p(y, w) = \prod_{t=1}^{T} p(y_t | w_t) p(w_t)$, then the true posterior is given by p(w|y) = p(y, w)/p(y). The approximate (variational) posterior is given by $q_\theta(w) = \prod_{t=1}^{T} q_{\theta_t}(w_t)$, where θ = (θ1, ..., θT) is the vector of parameters found through optimization and $q_{\theta_t}(w_t)$ are the probability distributions for each timestep. The normalizing constants are known for each $q_{\theta_t}(w_t)$.

Amortization of the random variables means that, instead of finding the optimal value of the entire length-T vector θ, we model the approximate posterior as $q_\psi(w) = \prod_{t=1}^{T} q(w_t | f_\psi(y_t))$ [69, 71]. The vector ψ is the vector of parameters of the (probably nonlinear) function $f_\psi(\cdot)$ and does not scale with T. The function $f_\psi(\cdot)$ models the effect of the time-dependent parameters θt of the variational posterior. This amortized variational posterior is also fit using an optimization routine. Since the number of parameters of this model does not scale with time, Eq. 55 for this model becomes

R_{\mathrm{amort}}(T, M) = \frac{P}{2MT}, \quad (56)

where P is the constant number of parameters (the dimension of ψ) in the amortized model. For fixed M, Ramort(T, M) → 0 as T becomes large. Another useful extension of our present work would be to reimplement our Bayesian structural time series model using amortized variational inference. This would also eliminate the problem of choosing the "correct" number of parameters in the Legendre polynomial approximation that we described in Sec. III.
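A minimal sketch of such an amortized Gaussian variational posterior follows, written with PyTorch under our own architectural assumptions (a small feed-forward f_ψ mapping each observation to the mean and log-scale of q(w_t | f_ψ(y_t)); the paper proposes the approach without specifying f_ψ).

    # A minimal PyTorch sketch of an amortized Gaussian variational posterior
    # q(w_t | f_psi(y_t)); the two-layer network and the Gaussian family are
    # our assumptions.
    import torch
    import torch.nn as nn

    class AmortizedPosterior(nn.Module):
        def __init__(self, obs_dim=2, latent_dim=3, hidden=32):
            super().__init__()
            # f_psi: maps y_t -> (mean, log-std) of q(w_t | f_psi(y_t))
            self.f_psi = nn.Sequential(
                nn.Linear(obs_dim, hidden), nn.Tanh(),
                nn.Linear(hidden, 2 * latent_dim),
            )

        def forward(self, y):
            # y: (T, obs_dim). Parameters for all T timesteps come from one
            # pass, so the parameter count (the dimension of psi) is fixed in T.
            out = self.f_psi(y)
            mean, log_std = out.chunk(2, dim=-1)
            return torch.distributions.Normal(mean, log_std.exp())

    # Usage: q = AmortizedPosterior()(y_obs); w_samples = q.rsample()
    # psi is then fit by maximizing the ELBO with an optimizer such as Adam.

Here obs_dim = 2 stands for the per-timestep observations (Tweetst, logit(Zt)) and latent_dim = 3 for (uR,t, uB,t, Xt).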

ACKNOWLEDGEMENTS

The authors are grateful for financial support from the Massachusetts Mutual Life Insurance Company and are thankful for the truly helpful comments from an anonymous reviewer.


FIG. 15. We demonstrate the lack of qualitative changes in the value functions given by the solution to Eq. 54 when compared to the solutions of Eqs. 11 and 12. We plot Vi(x, 0) in solid curves and Vi(x, T) in dashed curves, i ∈ {R, B}. We set final conditions here to ΦR(x) = tanh x and ΦB(x) = (1/2)x²Θ(−x). We also plot solutions for an augmented HJB equation for Blue in panels C and D. This HJB equation is identical to Eq. 54 except with R ↦ B and the sign of the argument of the discontinuous running cost function reversed.

[1] Paul SA Renaud and Frans AAM van Winden. On the importance of elections and ideology for government policy in a multi-party system. In The Logic of Multiparty Systems, pages 191–207. Springer, 1987.
[2] Jorgen Elklit and Palle Svensson. What makes elections free and fair? Journal of Democracy, 8(3):32–46, 1997.
[3] Dov H Levin. When the great power gets a vote: The effects of great power electoral interventions on election results. International Studies Quarterly, 60(2):189–202, 2016.
[4] Stephen Shulman and Stephen Bloom. The legitimacy of foreign intervention in elections: The Ukrainian response. Review of International Studies, 38(2):445–471, 2012.
[5] Daniel Corstange and Nikolay Marinov. Taking sides in other people's elections: The polarizing effect of foreign intervention. American Journal of Political Science, 56(3):655–670, 2012.
[6] Dov H Levin. Partisan electoral interventions by the great powers: Introducing the PEIG Dataset. Conflict Management and Peace Science, 36(1):88–106, 2019.
[7] Erica D Borghard and Shawn W Lonergan. Confidence Building Measures for the Cyber Domain. Strategic Studies Quarterly, 12(3):10–49, 2018.
[8] Dov H Levin. Voting for Trouble? Partisan Electoral Interventions and Domestic Terrorism. Terrorism and Political Violence, pages 1–17, 2018.
[9] Isabella Hansen and Darren J Lim. Doxing democracy: Influencing elections via cyber voter interference. Contemporary Politics, 25(2):150–171, 2019.
[10] Johannes Bubeck, Kai Jager, Nikolay Marinov, and Federico Nanni. Why Do States Intervene in the Elections of Others? Available at SSRN 3435138, 2019.
[11] Read the Mueller Report: Searchable Document and Index: https://www.nytimes.com/interactive/2019/04/18/us/politics/mueller-report-document.html. New York Times, Apr 2019.
[12] Charles R Nelson and Charles R Plosser. Trends and random walks in macroeconomic time series: Some evidence and implications. Journal of Monetary Economics, 10(2):139–162, 1982.
[13] David N DeJong, John C Nankervis, N Eugene Savin, and Charles H Whiteman. Integration versus trend stationarity in time series. Econometrica: Journal of the Econometric Society, pages 423–433, 1992.
[14] Crispin W Gardiner. Handbook of Stochastic Methods, volume 3. Springer Berlin, 1985.
[15] Peter Jung and Peter Hanggi. Dynamical systems: A unified colored-noise approximation. Physical Review A, 35(10):4464, 1987.
[16] Peter Hanggi and Peter Jung. Colored noise in dynamical systems. Advances in Chemical Physics, 89:239–326, 1995.
[17] Richard Bellman. Dynamic programming and a new formalism in the calculus of variations. Proceedings of the National Academy of Sciences of the United States of America, 40(4):231, 1954.
[18] Richard Bellman. Dynamic programming. Science, 153(3731):34–37, 1966.
[19] Richard S Sutton and Andrew G Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.
[20] Hilbert J Kappen. An introduction to stochastic control theory, path integrals and reinforcement learning. In AIP Conference Proceedings, volume 887, pages 149–181. AIP, 2007.
[21] Karl J Astrom. Introduction to Stochastic Control Theory. Courier Corporation, 2012.
[22] Tryphon T Georgiou and Anders Lindquist. The separation principle in stochastic control, redux. IEEE Transactions on Automatic Control, 58(10):2481–2494, 2013.
[23] Ernst Zermelo. Uber eine Anwendung der Mengenlehre auf die Theorie des Schachspiels. In Proceedings of the Fifth International Congress of Mathematicians, volume 2, pages 501–504. Cambridge University Press, Cambridge, UK, 1913.
[24] John Nash. Non-cooperative games. Annals of Mathematics, pages 286–295, 1951.
[25] Drew Fudenberg and Jean Tirole. Noncooperative game theory for industrial organization: An introduction and overview. Handbook of Industrial Organization, 1:259–327, 1989.
[26] Andreu Mas-Colell, Michael Dennis Whinston, Jerry R Green, et al. Microeconomic Theory, volume 1. Oxford University Press, New York, 1995.
[27] Rainer Buckdahn and Juan Li. Stochastic differential games and viscosity solutions of Hamilton–Jacobi–Bellman–Isaacs equations. SIAM Journal on Control and Optimization, 47(1):444–475, 2008.
[28] Triet Pham and Jianfeng Zhang. Two person zero-sum game in weak formulation and path dependent Bellman–Isaacs equation. SIAM Journal on Control and Optimization, 52(4):2090–2121, 2014.
[29] Code to recreate simulations and plots in this paper, or to create new simulations and "what-if" scenarios, is located at https://gitlab.com/daviddewhurst/red-blue-game.
[30] There is a large literature on nonparametric functional approximation that we cannot review here. Here is an example of this type of approximation. If {x_k}_{1≤k≤K} is any partition of [a, b) and the vector (ΦR(x_k), ΦB(x_k)) were jointly distributed Gaussian on R^{2K}, then we say that ΦR and ΦB are distributed according to a Gaussian process [? ]. Though this infinite-dimensional formalism can be useful when deriving theoretical results, in practice we would approximate this process by a finite-dimensional multivariate Gaussian. In Sec. III we actually approximate ΦR and ΦB by finite sums of Legendre polynomials and infer the coefficients of these polynomials as our multivariate approximation.
[31] Ming-Hui Chen and Qi-Man Shao. Monte Carlo estimation of Bayesian credible and HPD intervals. Journal of Computational and Graphical Statistics, 8(1):69–92, 1999.
[32] Valery E Yarynich. C3: Nuclear Command, Control, Cooperation. Center for Defense Information, 2003.
[33] Anthony M Barrett. False alarms, true dangers? RAND Corporation document PE-191-TSF, 2016.
[34] Hilbert J Kappen. Path integrals and symmetry breaking for optimal control theory. Journal of Statistical Mechanics: Theory and Experiment, 2005(11):P11011, 2005.
[35] Mark Kac. On distributions of certain Wiener functionals. Transactions of the American Mathematical Society, 65(1):1–13, 1949.
[36] Yaming Chen, Adrian Baule, Hugo Touchette, and Wolfram Just. Weak-noise limit of a piecewise-smooth stochastic differential equation. Physical Review E, 88(5):052103, 2013.
[37] Julie Leifeld, Kaitlin Hill, and Andrew Roberts. Persistence of saddle behavior in the nonsmooth limit of smooth dynamical systems. arXiv preprint arXiv:1504.04671, 2015.
[38] Kiyoshi Kanazawa, Tomohiko G Sano, Takahiro Sagawa, and Hisao Hayakawa. Asymptotic derivation of Langevin-like equation with non-Gaussian noise and its analytical solution. Journal of Statistical Physics, 160(5):1294–1335, 2015.
[39] Sofia H Piltz, Lauri Harhanen, Mason A Porter, and Philip K Maini. Inferring parameters of prey switching in a 1 predator–2 prey plankton system with a linear preference tradeoff. Journal of Theoretical Biology, 456:108–122, 2018.
[40] Peter Tanchak. The Invisible Front: Russia, Trolls, and the Information War against Ukraine. Revolution and War in Contemporary Ukraine: The Challenge of Change, 161:253, 2017.
[41] Brandon C Boatwright, Darren L Linvill, and Patrick L Warren. Troll factories: The Internet Research Agency and state-sponsored agenda building. Resource Centre on Media Freedom in Europe, 2018.
[42] Adam Badawy, Emilio Ferrara, and Kristina Lerman. Analyzing the digital traces of political manipulation: The 2016 Russian interference Twitter campaign. In 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pages 258–265. IEEE, 2018.
[43] Ryan L Boyd, Alexander Spangher, Adam Fourney, Besmira Nushi, Gireeja Ranade, James Pennebaker, and Eric Horvitz. Characterizing the Internet Research Agency's Social Media Operations During the 2016 US Presidential Election using Linguistic Analyses. 2018.
[44] Damian J Ruck, Natalie M Rice, Joshua Borycz, and R Alexander Bentley. Internet Research Agency Twitter activity predicted 2016 US election polls. First Monday, 24(7), 2019.
[45] Nicolas Guenon des Mesnards and Tauhid Zaman. Detecting influence campaigns in social networks using the Ising model. arXiv preprint arXiv:1805.10244, 2018.
[46] Jane Im, Eshwar Chandrasekharan, Jackson Sargent, Paige Lighthammer, Taylor Denby, Ankit Bhargava, Libby Hemphill, David Jurgens, and Eric Gilbert. Still out there: Modeling and Identifying Russian Troll Accounts on Twitter. arXiv preprint arXiv:1901.11162, 2019.
[47] Alexandria Volkening, Daniel F Linder, Mason A Porter, and Grzegorz A Rempala. Forecasting elections using compartmental models of infections. arXiv preprint arXiv:1811.01831, 2018.
[48] David Rothschild. Forecasting elections: Comparing prediction markets, polls, and their biases. Public Opinion Quarterly, 73(5):895–916, 2009.
[49] Drew A Linzer. Dynamic Bayesian forecasting of presidential elections in the states. Journal of the American Statistical Association, 108(501):124–134, 2013.
[50] Wei Wang, David Rothschild, Sharad Goel, and Andrew Gelman. Forecasting elections with non-representative polls. International Journal of Forecasting, 31(3):980–991, 2015.
[51] Data can be downloaded at https://github.com/fivethirtyeight/russian-troll-tweets/.
[52] Ramin Skibba. Pollsters struggle to explain failures of US presidential forecasts. Nature News, 539(7629):339, 2016.
[53] Ryan Neville-Shepard. Constrained by duality: Third-party master narratives in the 2016 presidential election. American Behavioral Scientist, 61(4):414–427, 2017.
[54] Data can be downloaded at https://www.realclearpolitics.com/epolls/2016/president/us/general_election_trump_vs_clinton_vs_johnson_vs_stein-5952.html.
[55] David Barber, A Taylan Cemgil, and Silvia Chiappa. Bayesian Time Series Models. Cambridge University Press, 2011.
[56] Agner Krarup Erlang. Sandsynlighedsregning og telefonsamtaler. Nyt Tidsskrift for Matematik, 20:33–39, 1909.
[57] Georg Rasch. The Poisson process as a model for a diversity of behavioral phenomena. In International Congress of Psychology, volume 2, page 2, 1963.
[58] David R Cox. Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological), 34(2):187–202, 1972.
[59] David Roxbee Cox and Valerie Isham. Point Processes, volume 12. CRC Press, 1980.
[60] Matthew D Hoffman and Andrew Gelman. The No-U-Turn Sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1):1593–1623, 2014.
[61] Andrew Gelman, Donald B Rubin, et al. Inference from iterative simulation using multiple sequences. Statistical Science, 7(4):457–472, 1992.
[62] Stephen P Brooks and Andrew Gelman. General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7(4):434–455, 1998.
[63] Eric Brochu, Vlad M Cora, and Nando De Freitas. A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv preprint arXiv:1012.2599, 2010.
[64] Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P Adams, and Nando De Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2015.
[65] Peter I Frazier. A tutorial on Bayesian optimization. arXiv preprint arXiv:1807.02811, 2018.
[66] Nassim Nicholas Taleb. Election predictions as martingales: An arbitrage approach. Quantitative Finance, 18(1):1–5, 2018.
[67] Using the HJB integrator red_blue_pdes.py included in the code at https://gitlab.com/daviddewhurst/red-blue-game, the reader may further investigate the effects of changes in the drift and running cost terms on the qualitative nature of solutions to the coupled HJB system.
[68] Matthew D Hoffman, David M Blei, Chong Wang, and John Paisley. Stochastic variational inference. The Journal of Machine Learning Research, 14(1):1303–1347, 2013.
[69] Danilo Jimenez Rezende and Shakir Mohamed. Variational inference with normalizing flows. arXiv preprint arXiv:1505.05770, 2015.
[70] David M Blei, Alp Kucukelbir, and Jon D McAuliffe. Variational inference: A review for statisticians. Journal of the American Statistical Association, 112(518):859–877, 2017.
[71] Cheng Zhang, Judith Butepage, Hedvig Kjellstrom, and Stephan Mandt. Advances in variational inference. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.


Appendix A: Coupling parameter sweeps

We conducted parameter sweeps over different values of the coupling parameters λR and λB for multiple combinations of Red and Blue final conditions. To do this, we integrated Eqs. 11 and 12 for λR, λB ∈ [0, 3] for each of the 3² = 9 combinations of the final conditions

\Phi_R(x) \in \{x,\ 2\tanh x,\ \Theta(x) - \Theta(-x)\} \quad (A1)

and

\Phi_B(x) \in \left\{ \tfrac{1}{2}x^2,\ \tfrac{1}{2}x^2\Theta(-x),\ 2\left[\Theta(|x| - \Delta) - \Theta(\Delta - |x|)\right] \right\}. \quad (A2)

We integrated Eqs. 11 and 12 for Nt = 8000 timesteps, setting t0 = 0 and T = 1 year. We enforced Neumann boundary conditions on the interval x ∈ [−3, 3] and set the spatial step size to dx = 0.025.

After integration, we then drew 1000 paths $u_R^{(n)}$ and $u_B^{(n)}$ from the resulting probability distribution over control policies. (This probability distribution is a time-dependent multivariate Gaussian, since the control policies are deterministic functions of the random variable defined by the Ito stochastic differential equation Eq. 3.) We then calculated the intertemporal means and standard deviations of these control policies, which we denote by

\mathrm{mean}(u_i) = \frac{1}{TN} \sum_{n=1}^{N} \int_0^T u_i^{(n)}(t)\, dt \quad (A3)

and

\mathrm{std}(u_i) = \sqrt{ \frac{1}{TN} \sum_{n=1}^{N} \int_0^T \left[ u_i^{(n)}(t) - \mathrm{mean}(u_i) \right]^2 dt }, \quad (A4)

for i ∈ {R, B}. We display the mean control policies in Sec. A 1 and the standard deviations of the control policies in Sec. A 2.
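In discretized form, Eqs. A3 and A4 amount to the following short sketch; the array shapes are our assumption.

    # A sketch of discretized Eqs. A3-A4; "paths" holds N sampled control
    # policies on a uniform time grid, with shape (N, n_t).
    import numpy as np

    def intertemporal_stats(paths, T=1.0):
        N, n_t = paths.shape
        dt = T / (n_t - 1)
        # Eq. A3: average over both time (trapezoidal rule) and sample paths
        mean_u = np.trapz(paths, dx=dt, axis=1).sum() / (T * N)
        # Eq. A4: root of the time-and-sample-averaged squared deviation
        std_u = np.sqrt(np.trapz((paths - mean_u) ** 2, dx=dt, axis=1).sum() / (T * N))
        return mean_u, std_u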

1. Expected value of ui(t)

We display the mean paths of draws from the distributions of control policies generated by solutions to Eqs. 11 and 12.

FIG. 16. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = x and Blue final condition ΦB(x) = (1/2)x²Θ(−x). Intensity of color corresponds to mean of control policy.


FIG. 17. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = x and Blue final condition ΦB(x) = 2[Θ(|x| − 0.1) − Θ(0.1 − |x|)]. Intensity of color corresponds to mean of control policy.


FIG. 18. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = x and Blue final condition ΦB(x) = (1/2)x². Intensity of color corresponds to mean of control policy.

FIG. 19. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = 2[Θ(x) − Θ(−x)] and Blue final condition ΦB(x) = (1/2)x²Θ(−x). Intensity of color corresponds to mean of control policy.


FIG. 20. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = 2[Θ(x) − Θ(−x)] and Blue final condition ΦB(x) = 2[Θ(|x| − 0.1) − Θ(0.1 − |x|)]. Intensity of color corresponds to mean of control policy.

FIG. 21. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = 2[Θ(x) − Θ(−x)] and Blue final condition ΦB(x) = (1/2)x². Intensity of color corresponds to mean of control policy.


FIG. 22. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = tanh(x) and Blue final condition ΦB(x) = (1/2)x²Θ(−x). Intensity of color corresponds to mean of control policy.

FIG. 23. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = tanh(x) and Blue final condition ΦB(x) = 2[Θ(|x| − 0.1) − Θ(0.1 − |x|)]. Intensity of color corresponds to mean of control policy.


FIG. 24. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = tanh(x) and Blue final condition ΦB(x) = (1/2)x². Intensity of color corresponds to mean of control policy.


2. Standard deviation of ui(t)

We display the standard deviation of draws from the distributions of control policies generated by solutions to Eqs. 11 and 12.

FIG. 25. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = x and Blue final condition ΦB(x) = (1/2)x²Θ(−x). Intensity of color corresponds to std of control policy.

FIG. 26. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = x and Blue final condition ΦB(x) = 2[Θ(|x| − 0.1) − Θ(0.1 − |x|)]. Intensity of color corresponds to std of control policy.


FIG. 27. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = x and Blue final condition ΦB(x) = (1/2)x². Intensity of color corresponds to std of control policy.

FIG. 28. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = 2[Θ(x) − Θ(−x)] and Blue final condition ΦB(x) = (1/2)x²Θ(−x). Intensity of color corresponds to std of control policy.


FIG. 29. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = 2[Θ(x) − Θ(−x)] and Blue final condition ΦB(x) = 2[Θ(|x| − 0.1) − Θ(0.1 − |x|)]. Intensity of color corresponds to std of control policy.

FIG. 30. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = 2[Θ(x) − Θ(−x)] and Blue final condition ΦB(x) = (1/2)x². Intensity of color corresponds to std of control policy.


FIG. 31. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = tanh(x) and Blue final condition ΦB(x) = (1/2)x²Θ(−x). Intensity of color corresponds to std of control policy.

FIG. 32. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = tanh(x) and Blue final condition ΦB(x) = 2[Θ(|x| − 0.1) − Θ(0.1 − |x|)]. Intensity of color corresponds to std of control policy.


FIG. 33. Parameter sweep over coupling parameters λR, λB with Red final condition ΦR(x) = tanh(x) and Blue final condition ΦB(x) = (1/2)x². Intensity of color corresponds to std of control policy.