Author: Xavier Fernández-i-Marín
October 13, 2018 - 9 minutes
Electoral forecast for Bavarian elections 2018
Tags: Electoral forecast, Bayesian, Data visualization, ggmcmc
(Update: check also the evaluation of the forecasts.)
The forecast of an election can be understood as the outcome of a measurement problem. We need to measure the day-to-day latent party support and make a prediction for a specific day (election day).
How is party support known? We can approximate the support that a party is going to receive by looking at polls. Polls contain fuzzy information about the support that a party receives, measured for a very specific time period and with a lot of noise. In fact, polls can be biased (I am referring to the fact that, for technical reasons, some houses are better able to reach certain parts of the electorate, depending on whether they use telephone-based, Internet-based or face-to-face interviews), and therefore we need a way to disentangle the bias (house effect) from the real signal that the poll provides.
But even with the poll bias taken out of the equation, there are no polls every day, and therefore the periods with no information about the latent party support cannot be filled in automatically. This is where an assumption about the daily volatility of support is needed. We need to either estimate or assume how much each party's support can vary from day to day. In the model presented here, I have used a simple approach and assumed that the daily change is at most 1 percent of the latent support.
The results presented here are based on a simple application of a Kalman filter. The idea is that the support for a given party on day 2 is equal to its support on the previous day (day 1) plus some random error (remember: an error of at most 1 percent of the support). This apparently easy way to model the evolution over time avoids having to rely on complicated time-series assumptions, time-trend polynomials or smoothing functions. Today's support is equal to yesterday's plus some noise.
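The "today equals yesterday plus noise" idea can be sketched as a one-party local-level Kalman filter. This is only an illustration and not the model used for the forecast, which is estimated in a Bayesian way for all parties jointly. The function name `filter_latent_support`, the Gaussian noise with standard deviation `process_sd` (instead of the 1 percent cap described above), and all the numbers are assumptions made for the example.

```python
import math

def filter_latent_support(polls, n_days, process_sd, start_mean, start_sd):
    """Local-level Kalman filter for the latent support of one party.

    polls: dict mapping a day index to a list of (share, poll_sd) tuples,
           assumed to be already corrected for house effects.
    Returns one (mean, sd) tuple per day.
    """
    mean, var = start_mean, start_sd ** 2
    path = []
    for day in range(n_days):
        if day > 0:
            # Prediction: today = yesterday + noise, so the variance grows.
            var += process_sd ** 2
        for share, poll_sd in polls.get(day, []):
            # Update: move towards the poll in proportion to its precision.
            gain = var / (var + poll_sd ** 2)
            mean = mean + gain * (share - mean)
            var = (1.0 - gain) * var
        path.append((mean, math.sqrt(var)))
    return path

# Hypothetical input: two polls, ten days apart (illustrative numbers only).
polls = {0: [(36.0, 1.5)], 10: [(34.0, 1.5)]}
path = filter_latent_support(polls, n_days=15, process_sd=0.3,
                             start_mean=37.0, start_sd=5.0)
for day in (1, 9, 10):
    print(day, round(path[day][0], 2), round(path[day][1], 2))
```

In this sketch the standard deviation widens on every day without a poll and shrinks again when a new poll arrives.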
To sum up, in the following forecast only two elements are taken into consideration: house effects and daily volatility. No time trend is employed.
This approach is an adaptation to a multiparty setup of Simon Jackman's original model for “pooling the polls”, which he calls a Bayesian state-space method.
I first present the raw polls for the Bavarian elections since the last Landtag elections on September 22nd, 2013. Then I show the estimated house effects (the biases that every firm involved in the polls produces). Then I show the estimated day-to-day latent support for every party. With this, I produce a (risky) seat allocation, and finally I present some probabilities of interest.
Since September 22nd, 2013, there have been several polls that have attempted to measure the support for each political party in Bavaria. Note that the lack of polls in 2017 is due to the fact that all polling efforts were diverted to the German federal elections.
Although the press reports polls as concrete numbers of support for every party (the dots in the plots), this is in fact a simplification. From a scientific point of view, we must always show the results of a poll with both the estimated support (the dot) and the uncertainty that any sampling procedure involves (the vertical line).
As of today, if you had to place a bet on a single number that represents the support for each party, it would be this one:
And this is the single number (dot) with the uncertainty bands. The intervals are Bayesian credible intervals, not classical frequentist confidence intervals, and can therefore be interpreted in the plain way one would expect (there is a 95 percent probability that the support for a specific party as of today lies between the two ends of the horizontal line), without the frequentist reliance on sampling and the null hypothesis.
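As a minimal sketch of how such an interval can be read off posterior draws, the snippet below computes an equal-tailed 95 percent interval. The draws are simulated stand-ins and not output of the model. The helper `credible_interval` is a hypothetical name.

```python
import math
import random

def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval from a list of posterior draws."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    # Round the cut points outwards so at least `level` of the draws fall inside.
    lower = ordered[math.floor(tail * (n - 1))]
    upper = ordered[math.ceil((1.0 - tail) * (n - 1))]
    return lower, upper

# Stand-in for posterior draws of one party's support (simulated, illustrative).
rng = random.Random(2018)
draws = [rng.gauss(35.0, 1.5) for _ in range(4000)]
low, high = credible_interval(draws)
inside = sum(low <= d <= high for d in draws) / len(draws)
print(round(low, 1), round(high, 1), inside)
```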
When projecting the electoral support into seats, the specificities of the electoral system must be taken into account. This is where a small error in the forecast of electoral support can easily be magnified into a wrong distribution of seats. The reason is that several simplifications must be made. The most important one in the specific Bavarian case is that the German double-vote system is tricky to incorporate in this setup. Only the distribution of the second (list) votes, which reflect party support, can effectively be predicted. In this case, I am limiting the seat allocation to the 180 seats that are assigned by the second vote.
The electoral rules state that seats must be distributed using the Hare-Niemeyer rule, with an entry threshold of 5 percent of the votes.
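The Hare-Niemeyer rule is a largest-remainder method: each party that passes the threshold first receives the integer part of its proportional quota, and the remaining seats go to the parties with the largest fractional remainders. A minimal sketch follows, with hypothetical party labels and vote shares, and with ties between equal remainders ignored. The function name `hare_niemeyer` is an assumption of this example.

```python
def hare_niemeyer(votes, seats, threshold=0.05):
    """Largest-remainder (Hare-Niemeyer) allocation with an entry threshold.

    votes: dict mapping party -> votes (or vote share).
    Parties below `threshold` of the total are excluded before allocating.
    """
    total = sum(votes.values())
    eligible = {p: v for p, v in votes.items() if v >= threshold * total}
    eligible_total = sum(eligible.values())
    # Proportional quota of each eligible party.
    quotas = {p: v * seats / eligible_total for p, v in eligible.items()}
    # Every party first gets the integer part of its quota.
    allocation = {p: int(q) for p, q in quotas.items()}
    remaining = seats - sum(allocation.values())
    # Leftover seats go to the largest fractional remainders.
    by_remainder = sorted(eligible, key=lambda p: quotas[p] - allocation[p],
                          reverse=True)
    for party in by_remainder[:remaining]:
        allocation[party] += 1
    return allocation

# Hypothetical shares: party E is below the 5 percent threshold.
shares = {"A": 40, "B": 30, "C": 20, "D": 6, "E": 4}
print(hare_niemeyer(shares, seats=10))  # {'A': 4, 'B': 3, 'C': 2, 'D': 1}
```

Because excluded parties drop out of the denominator, a party hovering around 5 percent changes the quotas of all the others.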
The figure of the forecasted distribution of seats therefore takes into account the fact that in some scenarios more or fewer parties may be competing for the seats (as some of them may not reach the 5 percent required to enter the allocation). This is why the bars are not distributed in a symmetric or “normal-distribution-like” way.
The house effects, as stated before, are the estimated biases of every polling house, whose numbers are systematically higher or lower for specific parties. They may be due to technical aspects of the sampling method, or to house-specific correction formulas that adjust the raw survey responses (sometimes missing) to the party support and the most likely vote of every individual.
The temporal evolution of the latent support for every party is the smoothed day-to-day support that discounts the house effects and produces a daily value, including for the periods in which no survey was in the field. For periods without surveys, therefore, the uncertainty about a party's support increases.
I show the whole time period since the last elections in Bavaria and also the six months prior to the election.
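To make the notion concrete, here is a deliberately naive two-step sketch: take a latent support series as given, average each house's deviation from it, and centre the result so that the effects sum to zero across houses. The model behind the forecast estimates house effects jointly with the latent support, so this is not its implementation. The sum-to-zero centring, the function name `naive_house_effects` and the numbers are assumptions of this example.

```python
from collections import defaultdict

def naive_house_effects(polls, latent):
    """Average deviation of each house from the latent support, centred at zero.

    polls: list of (house, day, share) tuples for one party.
    latent: dict mapping day -> latent support for that party.
    """
    residuals = defaultdict(list)
    for house, day, share in polls:
        residuals[house].append(share - latent[day])
    raw = {house: sum(r) / len(r) for house, r in residuals.items()}
    # Identification assumption: house effects average out to zero.
    centre = sum(raw.values()) / len(raw)
    return {house: effect - centre for house, effect in raw.items()}

# Hypothetical polls from two houses (illustrative numbers only).
latent = {0: 35.0, 1: 35.0}
polls = [("House X", 0, 36.0), ("House X", 1, 37.0),
         ("House Y", 0, 34.0), ("House Y", 1, 34.0)]
effects = naive_house_effects(polls, latent)
print(effects)  # {'House X': 1.25, 'House Y': -1.25}
```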
Quantities of interest
With the day-to-day support for every party and the projection of seats, it is then relatively easy to ask simple probabilistic questions about several quantities of interest.
In all cases, I present the probability as of today, as well as the temporal evolution.
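With simulated draws in hand, each probability is simply the share of draws in which the event occurs. The sketch below uses four made-up seat draws for three hypothetical parties in a 180-seat chamber. The helper `probability` and all the numbers are assumptions for illustration, not forecast output.

```python
def probability(draws, event):
    """Share of posterior draws in which `event` is true."""
    return sum(1 for draw in draws if event(draw)) / len(draws)

TOTAL_SEATS = 180
MAJORITY = TOTAL_SEATS // 2 + 1  # 91 seats

# Made-up seat draws (each sums to 180); real use would have thousands.
seat_draws = [
    {"A": 95, "B": 40, "C": 45},
    {"A": 85, "B": 50, "C": 45},
    {"A": 80, "B": 10, "C": 90},
    {"A": 70, "B": 21, "C": 89},
]

p_alone = probability(seat_draws, lambda d: d["A"] >= MAJORITY)
p_coalition = probability(seat_draws, lambda d: d["A"] + d["B"] >= MAJORITY)
print(p_alone, p_coalition)  # 0.25 0.75
```

The same helper works for vote-share questions, for example with an event such as `lambda d: d["A"] < 35` applied to draws of vote shares.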
CSU < 35
What is the probability that the CSU receives less than 35% of the votes as of today?
|Date||Prob (CSU < 35%)|
Grüne > 20
What is the probability that Grüne receives more than 20% of the votes as of today?
|Date||Prob (Grüne > 20%)|
AfD > 15
What is the probability that AfD receives more than 15% of the votes as of today?
|Date||Prob (AfD > 15%)|
FW > 5
What is the probability that FW gets more than 5% of the votes as of today?
|Date||Prob (FW > 5%)|
CSU + FW = majority
What is the probability that the CSU and FW get a majority of seats?
|Date||Prob (CSU + FW = majority)|
CSU + FW + FDP = majority
What is the probability that the CSU, FW and FDP get a majority of seats?
|Date||Prob (CSU + FW + FDP = majority)|
CSU + AfD = majority
What is the probability that the CSU and AfD get a majority of seats?
|Date||Prob (CSU + AfD = majority)|
CSU + Grüne = majority
What is the probability that the CSU and the Grüne get a majority of seats?
|Date||Prob (CSU + Grüne = majority)|
CSU + FDP = majority
What is the probability that the CSU and the FDP get a majority of seats?
|Date||Prob (CSU + FDP = majority)|
What is the probability of a 5-party parliament?
What is the probability of a 6-party parliament?
What is the probability of a 7-party parliament?
Keep in mind that this is a purely speculative exercise with a strong limitation: it only considers the part of the election where voters express their party preferences. Therefore, the seats that may be assigned through the personal vote are not considered, and for the calculation of probabilities this can lead to very different scenarios. For instance, the CSU can be expected to gain more seats than the other parties in this process, changing the majorities that are forecasted in this document.