When the winner_comes_third

When the Winner Comes Third: Simulating Candidates’ Winnability With Inaccurate Polls. Daniel Marcelino and Alejandro Tapias. August 5, 2014.

description

Presentation I gave at ABPC annual meeting.

Transcript of When the winner_comes_third

Page 1: When the winner_comes_third

When the Winner Comes Third: Simulating Candidates’ Winnability With Inaccurate Polls

Daniel Marcelino and Alejandro Tapias

August 5, 2014


Page 2: When the winner_comes_third

Where did the pollsters go wrong?

Average estimates

Candidates    Actual   3 weeks (n=10)
Russomanno    18.84    29.08
Serra         26.83    20.84
Haddad        25.28    18.25
Others        23.63    24.53
Undecideds    –        7.30


Page 3: When the winner_comes_third

Where did the pollsters go wrong?

Average estimates

Candidates    Actual   3 weeks (n=10)   1 week (n=5)
Russomanno    18.84    29.08            24.56
Serra         26.83    20.84            22.48
Haddad        25.28    18.25            19.90
Others        23.63    24.53            26.06
Undecideds    –        7.30             7.00


Page 4: When the winner_comes_third

Where did the pollsters go wrong?

Pollster

Candidates    Datafolha (n=2)   Ibope (n=2)   Veritá (n=1)   VoxPopuli (n=1)
Russomanno    +                 +             +              +
Serra         -                 -             *              -
Haddad        -                 -             -              -
Others        +                 +             +              *


Page 5: When the winner_comes_third

How do political scientists predict elections?

Four traditions are common in the literature:
(1) Economic vote models
(2) Electoral cycles models
(3) Models using prediction markets
(4) Models that use polling data as the primary predictors


Page 9: When the winner_comes_third

Political polls: Size matters

[Figure: margin of error (y-axis, 0.010 to 0.045) against sample size (x-axis, 500 to 14,500), with one curve each for α = 0.01, α = 0.05, and α = 0.1]
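The relationship in the figure can be sketched numerically. The snippet below is a minimal illustration, not the authors' code: it assumes simple random sampling, the normal approximation, and the worst-case share p = 0.5. The function name `margin_of_error` is hypothetical.

```python
from statistics import NormalDist


def margin_of_error(n, alpha=0.05, p=0.5):
    """Half-width of a normal-approximation interval for a proportion.

    Assumes simple random sampling; p = 0.5 gives the widest interval.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * (p * (1 - p) / n) ** 0.5


# Print the margin of error for a few sample sizes and significance levels.
for n in (500, 1500, 4500, 14500):
    row = ", ".join(
        f"alpha={a}: {margin_of_error(n, a):.4f}" for a in (0.01, 0.05, 0.1)
    )
    print(f"n={n:>5}  {row}")
```

The margin shrinks with the square root of the sample size, so quadrupling the number of interviews only halves the error.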


Page 10: When the winner_comes_third

A poll is likely to be wrong, yet...

House effects
Rounding
Non-response bias
Wording and ordering
Mode bias


Page 11: When the winner_comes_third

A poll is likely to be wrong, yet...

Context: local elections in Brazil
(-) High number of candidates (12)
(-) Local elections = polls shortage (28)
(-) Few pollsters (4)
(-) Poor sampling designs
(+) Face-to-face surveys
(.) Political system features may cause high volatility

How can we cope with irregular and inaccurate polls to fit regular political support?


Page 12: When the winner_comes_third

The Method

Bayesian inference
The parameters to be estimated are treated as random variables, but they are still related to one another.

Example
If candidate X had 28% of the popular vote at t1, it is very likely that he will also be close to 28% at t2, since t1 and t2 are close to one another in time. Therefore, if one knows θ for X at t1, this information changes one's beliefs about the likely values for X at t2. Moreover, given this information, we would like to know the probability of X winning the election.


Page 14: When the winner_comes_third

The Method

Bayesian inference
The parameters to be estimated are treated as random variables, but they are still related to one another.
Incorporate data from various sources as well as the uncertainties associated with the data.
Prior distribution → Posterior distribution.


Page 16: When the winner_comes_third

Priors: Advertising slots

[Figure: predicted vote share (y-axis, 0 to 30) against advertising time in seconds (x-axis, 200 to 600)]


Page 17: When the winner_comes_third

Prior Distribution

Candidates    αi      Prior mean   Prior var   Prior sd
Serra         280     0.28         0.101       0.317
Haddad        270     0.27         0.099       0.314
Russomanno    170     0.17         0.071       0.266
Others        280     0.28         0.101       0.317
Total         1,000   1.00

$y_{t_{1:k}} \sim \mathrm{Multinomial}(n, \alpha_{t_{1:k}})$


Page 18: When the winner_comes_third

Evidence: Polling data

A poll by Vox Populi with 1,000 voters, conducted roughly 15 months ahead of the election (July 13, 2011), gave this:

Serra: 26%
Russomanno: 14%
Haddad: 2%
Others: 58%


Page 19: When the winner_comes_third

Posterior Distribution

Candidates    αi + yi   Posterior mean   Posterior var   Posterior sd
Serra         540       0.27             0.066           0.256
Haddad        290       0.15             0.041           0.203
Russomanno    310       0.15             0.044           0.209
Others        860       0.43             0.082           0.286
Total         2,000     1.00

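$p(\alpha_{t_{1:k}} \mid y_{t_{1:k}}) \sim \mathrm{Dirichlet}(b_{t_{1:k}} + y_{t_{1:k}})$

$$p(\alpha_{t_{1:k}}) = \frac{\Gamma\left(\sum_{i=1}^{k} b_{t_i}\right)}{\prod_{i=1}^{k} \Gamma(b_{t_i})}\, \alpha_{t_1}^{b_{t_1}-1} \cdots \alpha_{t_k}^{b_{t_k}-1}$$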
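The conjugate update behind this slide can be sketched in a few lines: with a Dirichlet prior and multinomial poll counts, the posterior weights are the prior weights plus the observed counts. The prior weights and poll shares below come from the slides; the function name `dirichlet_update` is hypothetical, and the sketch reports posterior means only, not the variance columns of the tables.

```python
# Prior weights (from the prior table) and the Vox Populi poll shares (n = 1,000).
prior = {"Serra": 280, "Haddad": 270, "Russomanno": 170, "Others": 280}
poll_share = {"Serra": 0.26, "Haddad": 0.02, "Russomanno": 0.14, "Others": 0.58}
n_respondents = 1000


def dirichlet_update(prior_weights, shares, n):
    """Add the poll counts (share * n) to the prior Dirichlet weights."""
    return {c: prior_weights[c] + round(shares[c] * n) for c in prior_weights}


posterior = dirichlet_update(prior, poll_share, n_respondents)
total = sum(posterior.values())
# The posterior mean of each share is its weight divided by the total weight.
posterior_mean = {c: w / total for c, w in posterior.items()}

for c in posterior:
    print(f"{c:<11} weight={posterior[c]:>4}  mean={posterior_mean[c]:.3f}")
```

Because the prior carries a total weight of 1,000, it counts as much as one poll of 1,000 respondents; a smaller total would let the polling data dominate sooner.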


Page 20: When the winner_comes_third

The Model

Weighted average
Each poll has its own precision: $p = 1/\sigma^2$.
Datafolha, n = 3,959
Ibope, n = 1,204

$$y^{*}_{di} = \frac{p_D\, y_D + p_I\, y_I}{p_D + p_I} \qquad (1)$$
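Equation (1) can be illustrated with a short sketch. The sample sizes are the ones on the slide; the two vote shares are made-up illustration values, and the sampling variance y(1 − y)/n assumes simple random sampling, which the slides do not state. The function names are hypothetical.

```python
def poll_precision(share, n):
    """Precision (1 / variance) of a sample proportion under simple random sampling."""
    return n / (share * (1 - share))


def pooled_estimate(polls):
    """Precision-weighted average of (share, n) pairs, following equation (1)."""
    weights = [poll_precision(share, n) for share, n in polls]
    weighted_sum = sum(w * share for w, (share, _) in zip(weights, polls))
    return weighted_sum / sum(weights)


# Sample sizes from the slide; the shares 0.24 and 0.20 are hypothetical.
datafolha = (0.24, 3959)
ibope = (0.20, 1204)
estimate = pooled_estimate([datafolha, ibope])
print(round(estimate, 4))
```

The pooled value sits closer to the Datafolha figure because the larger sample carries the higher precision.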


Page 21: When the winner_comes_third

The Model

Predicting vote intentions
Ignorance about θ can be expressed by making the prior precision small, that is, by making the prior variance $\sigma_0^2$ large.

$$y_i \sim N(\mu_i, \sigma_i^2) \qquad (2)$$


Page 22: When the winner_comes_third

The Model

Predicting vote intentions
Given that polls lack precision:

$$\mu_i = \alpha_{t_i} + \delta_{j_i} + \Delta \qquad (3)$$

where $\delta_j$ is the bias of polling firm j, an unknown parameter to be estimated. $\Delta$ is an unknown parameter, also to be estimated, that captures the change caused by an event.


Page 23: When the winner_comes_third

The Model

Predicting vote intentions
To model change in vote intentions, we use a random-walk model:

$$\alpha_t \sim N(\alpha_{t-1}, w^2), \quad t = 1, \dots, T \qquad (4)$$

where $w^2$ is a linear interpolation component that detects event discontinuity (before vs. after campaign advertising on TV).


Page 24: When the winner_comes_third

The Model

Predicting vote intentions
With a uniform distribution of prior beliefs, that is, before we see any polling data:

$$\alpha_{t_i} \sim \mathrm{Uniform}(l, u) \qquad (5)$$

where l and u denote the lower and upper limits of the range of plausible electoral outcomes for a candidate.
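To make equations (2) to (5) concrete, the sketch below simulates data from the model for a single candidate. It is an illustration under stated assumptions, not the authors' WinBUGS code: the event term Δ is left out, the house names and bias values are invented, and the poll variance is taken as μ(1 − μ)/n. The function name `simulate_polls` and all default values are hypothetical.

```python
import random


def simulate_polls(T=60, n_polls=20, w=0.01, lower=0.15, upper=0.35,
                   house_bias=None, seed=1):
    """Simulate latent support and noisy polls for one candidate."""
    rng = random.Random(seed)
    # Hypothetical house effects (delta_j), in proportion points.
    house_bias = house_bias or {"A": 0.03, "B": -0.02}
    houses = sorted(house_bias)

    # Equation (5): initial support drawn from a uniform prior on (l, u).
    alpha = [rng.uniform(lower, upper)]
    # Equation (4): support follows a random walk with innovation sd w.
    for _ in range(1, T):
        alpha.append(rng.gauss(alpha[-1], w))

    polls = []
    for _ in range(n_polls):
        day = rng.randrange(T)
        house = rng.choice(houses)
        n = rng.choice([800, 1200, 2000])
        # Equation (3) without the event term Delta.
        mu = alpha[day] + house_bias[house]
        mu_clipped = min(max(mu, 0.01), 0.99)
        sigma = (mu_clipped * (1 - mu_clipped) / n) ** 0.5
        # Equation (2): the published poll is a noisy reading of mu.
        polls.append({"day": day, "house": house, "n": n,
                      "y": rng.gauss(mu, sigma)})
    return alpha, polls


alpha, polls = simulate_polls()
print(len(alpha), len(polls), round(alpha[-1], 3))
```

Simulated data of this kind are useful for checking that an estimation routine recovers known house effects before applying it to real polls.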


Page 25: When the winner_comes_third

Random-walk model (drunkard’s walk)

Candidates are the drunkards.
Stagger left (right) = gain (lose) support.
Noisy signals = opinion polls.
Kalman filtering: learn about the likely path given polling data.
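The filtering idea can be sketched with a scalar Kalman filter for a random walk observed with noise. This is a minimal illustration and not the estimation used in the talk, which relied on MCMC; it ignores house effects, and the function name `kalman_filter` and all numeric inputs are hypothetical.

```python
def kalman_filter(observations, obs_vars, w2, m0, c0):
    """Filter a random walk observed with noise.

    observations: one poll reading per day, or None when no poll was fielded.
    obs_vars: sampling variance of each day's poll (ignored when None).
    w2: variance of the daily random-walk step.
    m0, c0: prior mean and variance of support before any poll.
    """
    means, variances = [], []
    m, c = m0, c0
    for y, r in zip(observations, obs_vars):
        # Predict: the drunkard may have staggered, so uncertainty grows.
        c = c + w2
        if y is not None:
            # Update: move toward the poll in proportion to its reliability.
            gain = c / (c + r)
            m = m + gain * (y - m)
            c = (1 - gain) * c
        means.append(m)
        variances.append(c)
    return means, variances


readings = [0.30, None, 0.30, 0.30]
means, variances = kalman_filter(readings, [0.0004] * 4, w2=0.0001,
                                 m0=0.20, c0=0.01)
print([round(m, 4) for m in means])
```

On days without a poll the mean stays put while the variance widens, which is the "we don't know whether you staggered left or right" part of the analogy.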


Page 26: When the winner_comes_third

We know which bar you left

Figure: We have a belief about how a candidate will fare in the election before it takes place


Page 27: When the winner_comes_third

We know the direction of travel

Figure: We have a belief about how a candidate will fare in the election before it takes place


Page 28: When the winner_comes_third

We don’t know whether you staggered left or right

Figure: We have a belief about how a candidate will fare in the election before it takes place


Page 32: When the winner_comes_third

We have a belief about how a candidate will fare in the election before it takes place


Page 33: When the winner_comes_third

Polls are noisy signals, but we can “learn” about the most likely deviations given these signals


Page 34: When the winner_comes_third

Computation

Computation details
Software: WinBUGS (OpenBUGS).
The MCMC sampler was run on a single chain with an adaptation period (burn-in) of 100,000 iterations, followed by 500,000 iterations in which every 500th draw was kept for the analysis.
The resulting data set is a pooled sample of 1,000 valid cases (elections).


Page 35: When the winner_comes_third

Average estimates for the house effects parameters

             Russomanno (PRB)           Serra (PSDB)               Haddad (PT)
Pollsters    Estimate   2.5%    97%     Estimate   2.5%    97%     Estimate   2.5%    97%
Datafolha    3.98       2.00    5.89    -2.14      -3.78   -0.39   -5.40      -7.09   -3.77
Ibope        3.51       1.50    5.69    -4.52      -6.34   -2.53   -4.88      -6.65   -3.00
Veritá       3.00       -0.09   6.17    -1.03      -3.77   1.97    -3.60      -6.16   -1.21
VoxPopuli    3.37       0.84    5.52    -3.58      -5.56   -1.38   -4.75      -6.97   -2.90


Page 36: When the winner_comes_third

Simulation Results

Average estimates

Candidates    Actual   Last day (n=1,000)
Russomanno    18.84    20.20
Serra         26.83    26.18
Haddad        25.28    24.32
Others        23.63    24.10
Undecideds    –        5.02


Page 37: When the winner_comes_third

Simulation Results: Share and pointwise for Russomanno (PRB)


Page 38: When the winner_comes_third

Simulation Outcomes: Share and pointwise for Serra (PSDB)


Page 39: When the winner_comes_third

Simulation Results: Share and pointwise for Haddad (PT)


Page 40: When the winner_comes_third

Simulation Results: Probability that Haddad (PT) beats Russomanno (PRB) and advances to the runoff
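A head-to-head probability of this kind can be approximated by counting how often one candidate leads the other across simulated elections. The sketch below is illustrative only: instead of the full model, it draws vote shares from the early Dirichlet posterior obtained by adding the July 2011 Vox Populi counts to the prior weights, so the number it prints says nothing about the final result. The function names are hypothetical.

```python
import random


def sample_dirichlet(weights, rng):
    """Draw one vector of shares from a Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(w, 1.0) for w in weights]
    total = sum(draws)
    return [d / total for d in draws]


def prob_beats(weights, i, j, n_draws=5000, seed=42):
    """Share of simulated elections in which candidate i outpolls candidate j."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_draws):
        shares = sample_dirichlet(weights, rng)
        wins += shares[i] > shares[j]
    return wins / n_draws


# Prior weights plus poll counts: Serra, Haddad, Russomanno, Others.
names = ["Serra", "Haddad", "Russomanno", "Others"]
weights = [280 + 260, 270 + 20, 170 + 140, 280 + 580]
p_haddad = prob_beats(weights, names.index("Haddad"), names.index("Russomanno"))
print(f"P(Haddad > Russomanno) is roughly {p_haddad:.2f} under these early weights")
```

With 1,000 kept MCMC draws, as in the talk, the same counting step would be applied to the model's simulated last-day shares rather than to Dirichlet draws.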


Page 41: When the winner_comes_third

Conclusions

In Brazil, as everywhere, polls lack precision. Precision is mainly affected by two sources: sample size and house effects. After accounting for them, we could improve the predictions and, consequently, the information about the election.
In Brazil, the institution of campaign advertising on TV and radio may cause significant breaks in vote intention, which need to be modeled accordingly; otherwise, a violation of the linearity assumption may occur.


Page 42: When the winner_comes_third


Page 43: When the winner_comes_third

Where did the pollsters go wrong?


Page 44: When the winner_comes_third

Polls fielded over the last 3 weeks before the election

                                     Average error
Candidates    Actual   Mean (n=10)   Datafolha (n=4)   Ibope (n=4)   Veritá (n=1)   VoxPopuli (n=1)
Russomanno    18.84    29.08         9.41              10.66         6.96           15.16
Serra         26.83    20.84         -4.33             -7.58         -2.43          -9.83
Haddad        25.28    18.25         -7.28             -6.03         -4.78          -8.28
Others        23.63    24.53         6.09              8.49          7.19           2.23
Undecideds             7.30          5.75              8.00          5.00           13.00

Note: the actual vote was set in bold face on the original slide.


Page 45: When the winner_comes_third

Polls fielded over the week before the election

                                    Average error
Candidates    Actual   Mean (n=5)   Datafolha (n=2)   Ibope (n=2)   Veritá (n=1)
Russomanno    18.84    24.56        5.15              5.66          6.96
Serra         26.83    22.48        -3.33             -6.33         -2.43
Haddad        25.28    19.90        -5.78             -5.28         -4.78
Others        23.63    26.06        -2.05             -3.05         -4.75
Undecideds             7.00         6.00              9.00          5.00

Note: the actual vote was set in bold face on the original slide.
