The ball is already rolling in Russia's stadiums, and the surprises of the opening days threaten to upend the predictions of coaches, journalists and fans. Neither Germany, Brazil, Argentina nor Spain won their opening match. Of the favorites, only France has started on the right foot.
Can science help us predict how the tournament will end? Andreas Groll, a statistician at the Technical University of Dortmund (Germany), and his collaborators think so. They have created a method that combines machine learning (the branch of artificial intelligence that allows machines to learn) with statistics to produce a forecast.
Their mathematical tool makes Spain the most likely winner, with a 17.8% chance, but this forecast reflects the analysis made before the knockout rounds. In the method created by Groll and his team, the structure of the competition is a key factor. As they note, if Spain and Germany both reach the quarterfinals, their chances of winning will be almost the same, with the Germans slightly ahead.
How do they make their forecast?
The researchers' blend of machine learning and statistics is called random forests. The technique is built from many decision trees. A decision tree is a predictive model that works by calculating all the possible outcomes of the events in each branch of the tree (in this case, the results of all the matches), which yields the most probable outcome of the tournament.
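To make the idea of combining many trees concrete, here is a minimal sketch in plain Python. It is not the researchers' model: the data are invented, each "tree" is a one-split stump, and there is only one input variable, so this is strictly bagging (averaging trees trained on resampled data), a simplified stand-in for a full random forest. The names `fit_stump`, `fit_forest` and `predict_forest` are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimising squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue  # a split must leave data on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # No valid split (all x equal): always predict the overall mean.
        mean = sum(ys) / len(ys)
        return (xs[0], mean, mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_forest(xs, ys, n_trees=200, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """The ensemble prediction is the average over all trees."""
    return sum(predict_stump(tree, x) for tree in forest) / len(forest)

# Invented toy data: rating difference between two teams -> goal difference.
rating_diff = [-300, -200, -100, -50, 0, 50, 100, 200, 300]
goal_diff = [-2, -1, -1, 0, 0, 0, 1, 1, 2]

forest = fit_forest(rating_diff, goal_diff)
print(predict_forest(forest, 250))   # expected goal difference for a strong favorite
print(predict_forest(forest, -250))  # and for a clear underdog
```

A single stump can only output two values, but the average over 200 of them, each trained on slightly different data, changes much more smoothly with the input. That averaging is what the ensemble adds.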
The advantage of random forests is that they calculate so many probabilities that an average can be extracted, one that indicates with great precision what is likely to happen. In this work, the statisticians at the Technical University of Dortmund ran 100,000 simulations of the tournament.
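The simulation step can be sketched as follows. This is an illustration only, under loud assumptions: the four team strengths are made up, the win probability uses a simple ratio rule (Bradley-Terry style) instead of the researchers' model, and the bracket is a bare two-round knockout with no group stage and no draws. The names `STRENGTH`, `play`, `simulate_knockout` and `title_odds` are hypothetical.

```python
import random
from collections import Counter

# Invented strengths, NOT taken from the study.
STRENGTH = {"Spain": 1.9, "Germany": 1.9, "Brazil": 1.8, "France": 1.7}

def win_probability(a, b):
    """Chance that team a beats team b under a simple ratio rule."""
    return STRENGTH[a] / (STRENGTH[a] + STRENGTH[b])

def play(a, b, rng):
    """Simulate one knockout match and return the winner."""
    return a if rng.random() < win_probability(a, b) else b

def simulate_knockout(teams, rng):
    """Play successive rounds, pairing neighbours, until one team remains."""
    while len(teams) > 1:
        teams = [play(teams[i], teams[i + 1], rng)
                 for i in range(0, len(teams), 2)]
    return teams[0]

def title_odds(teams, n_sims=100_000, seed=42):
    """Estimate each team's title chance as its share of simulated wins."""
    rng = random.Random(seed)
    wins = Counter(simulate_knockout(teams, rng) for _ in range(n_sims))
    return {team: wins[team] / n_sims for team in teams}

bracket = ["Spain", "France", "Germany", "Brazil"]  # hypothetical semifinals
for team, odds in title_odds(bracket).items():
    print(f"{team}: {odds:.1%}")
```

Because the bracket decides who meets whom, reordering `bracket` changes the odds even when the strengths stay the same, which echoes the article's point that the structure of the competition matters.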
Among the variables included in their model are each team's position in the FIFA ranking, its results at previous World Cups, the characteristics of the squad (for example, the average age of its players, whether they have experience in international tournaments such as the Champions ...) and even the gross domestic product of the participating countries.
It would be nice for this article to have some references at the end so people could keep reading if the subject interests them (as it does me).
Also, I would like to know what predictions they made.
I remember that at the last World Cup an octopus called Paul predicted a lot of results.