And there’s more!
As an addendum to my predictions for tomorrow’s final, I thought it would be interesting to look at the distribution of finishing places for each country. One of the main motivations for this is a common misinterpretation of the list of winning probabilities I give each year.
Last year, I listed the UK as the 19th most likely country to win the final. As it turned out, the UK came in 19th place, and a number of people congratulated me on the accuracy of the prediction. Now, as lovely as it is to be congratulated, I never predicted that the UK would come in 19th. There’s a world of difference between “19th most likely to win” and “will most likely come 19th”. I demonstrated this two years ago in my wrap-up post for the 2012 contest, with the particular example of Malta that year.
I think the easiest way to avoid this confusion again is probably to make some actual predictions for the finishing place of each country. That way, hopefully nobody will misinterpret winning probabilities in this way.
What I’ve plotted here, as the green bars, is the interquartile range of finishing position for each country. So, for example, Austria’s bar goes from 7 to 16. This means that in 50% of simulations, Austria finished between these two positions (inclusive). In 25% of simulations, Austria finished in 7th or better, and in 25% of simulations 16th or worse. The black bar shows the average placing, which is not necessarily a whole number. In this case, on average Austria finished 11.87th.
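As a rough sketch of how those bars are computed, here is what the quartile and mean calculation looks like for one country, using a made-up array standing in for the model's simulated finishing positions (the real simulation output isn't shown here):

```python
import numpy as np

# Stand-in for the model's output: one simulated finishing
# position (1-26) per simulation run for a single country.
rng = np.random.default_rng(0)
positions = rng.integers(1, 27, size=10_000)

# The green bar spans the 25th to 75th percentile of positions;
# the black bar marks the mean, which need not be a whole number.
q25, q75 = np.percentile(positions, [25, 75])
mean_place = positions.mean()
print(q25, q75, round(mean_place, 2))
```

By construction, half of the simulated placings fall between `q25` and `q75` (inclusive), which is exactly what the green bars show.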
Because we’ve got better knowledge of song quality this year, there’s definitely a strong correspondence between the ordering by winning probability (as seen last time) and the ordering here. There are a few discrepancies though. For example, Belarus are one of the only countries which didn’t win a single simulation. However, they’re still forecast to come in ahead of France and Germany, because they can rely on votes from the other ex-Soviet bloc countries.
I should mention here that the model is generally pretty bad at scoring at the bottom end of the table. Although the dreaded “nul points” hasn’t happened since 2003, the model still predicts it in over half of simulations. This may mean that the placings near the bottom are a bit off as well.
I should also mention that it’s extremely unlikely that this exact ordering comes up. With twenty-six finalists, there are approximately four hundred trillion trillion^{1} possible orderings. Picking the right one by chance alone is roughly as likely as three separate entrants being independently struck by lightning during the interval act.
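That number is just the factorial of the number of finalists, which is easy to check:

```python
import math

# 26 finalists can finish in 26! distinct orders.
orderings = math.factorial(26)
print(orderings)  # 403291461126605635584000000, roughly 4.03e26
```

So "four hundred trillion trillion" (4 x 10^26) is the right order of magnitude.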
Because these are probabilistic forecasts, it’s likely that some of them will be wrong. In fact, it would be a problem if they weren’t^{2}. If a model claims something will happen 50% of the time, then that thing should fail to happen 50% of the time as well. In a well-calibrated model, a 48% prediction (such as our prediction that Sweden will win) should be wrong a little over half the time.
If we take the interquartile ranges as predictions, then we should expect the true position to lie inside them 50% of the time^{3}. Or, looking at it another way, we should expect half of our predictions to be right, and half to be wrong. Much more or less than that and there’s a problem with the model.
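After the contest, this coverage check could be scripted in a few lines. The arrays below are invented placeholders, not the model's actual predictions; the idea is simply to count how often the true placing lands inside the predicted interquartile range:

```python
import numpy as np

# Hypothetical per-country predictions (lower and upper quartiles)
# and actual results -- all values here are made up for illustration.
lo = np.array([7, 1, 12, 3, 18])       # lower quartile of predicted place
hi = np.array([16, 5, 22, 9, 26])      # upper quartile of predicted place
actual = np.array([11, 8, 20, 4, 17])  # actual finishing places

inside = (actual >= lo) & (actual <= hi)
coverage = inside.mean()  # should sit near 0.5 for a well-calibrated model
print(coverage)  # 0.6 for this toy data
```

Coverage well above 0.5 would mean the intervals are too wide (the model is under-confident); well below, too narrow (over-confident).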
But who’s got the best song?
Another way of looking at this data is to compare how well a song is likely to finish with how “good” the model thinks it is. Of course, by this stage the model “quality score” incorporates information about the running order and probably other information, but it’s still interesting to get an idea of how the mechanics of the contest favour some countries over others.
There is definitely a strong relationship between song quality and finishing position, as one would hope, but the countries which deviate from this relationship are probably the most interesting. Netherlands and Austria, according to the model, have the second and fourth best songs, and yet they finish in fifth and eleventh place respectively. This reflects their lack of established voting patterns.
On the flip side, Ukraine managed to squeeze out a third place finish from the seventh best song. Russia, even more egregiously, have a song which ranks third from the bottom in quality, but they still manage to crack the top ten in terms of mean finishing position.
Before I get deluged with complaints from Russians, I should say that the model quality score isn’t really intended to match up with objective musical quality. Instead it models the opinion of a completely average Eurovision voter^{4}. It’s entirely possible that they just don’t like twins.

Four hundred and three septillion, two hundred and ninety-one sextillion, four hundred and sixty-one quintillion, one hundred and twenty-six quadrillion, six hundred and five trillion, six hundred and thirty-five billion, five hundred and eighty-four million. You go and count, I’ll wait here. ↩

Nate Silver wrote a nice piece on checking this calibration for his March Madness basketball model. ↩

In this case it will be slightly more than 50%, because the position variable only takes on discrete values. The quantiles don’t actually divide up the set of outcomes very nicely. ↩

The best available approximation to this is a Hungarian. ↩