Poisson and Gamma Models for Predicting Bundesliga Results
Abstract
Soccer is unpredictable. Many variables affect the final standings of a soccer league, from the performance and health of individual players, to the environment and timing of games, to the evolution of each team’s tactics. It is impossible to predict how every such variable will play out over a season, or what effect each will have on the final standings. This paper tests how well single measures of teams’ past games can predict the final standings of a league. More specifically, the goals scored and expected goals from games in the first half of the season are used to predict the final league standings of the top German professional soccer league, the Bundesliga. The goals scored and conceded by each team in the first half of the season were used to build a model based on the Poisson distribution that simulated the remaining games to find the probability of each team’s final league table position. The expected goals for and against each team in the first half of the season were used to build a model based on the gamma distribution that likewise simulated the remaining games to find the probability of each team’s final league table position. The predicted probabilities for each model were then tested for accuracy individually, and both models were shown to have strong predictive power. The two models’ results were then compared, and the model based on actual goals scored was found to have slightly stronger predictive power than the model based on expected goals. Expected goals is a relatively new statistic that appears to have promising predictive power, but the predictive model used in this project was not able to outperform the more commonly used Poisson model based on actual goals scored. More research is likely needed to determine the true predictive power of expected goals relative to actual goals scored.
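To make the simulation idea concrete, the following is a minimal Python sketch of a Poisson-based season simulation of the kind described above. It is an illustration under stated assumptions, not the paper's implementation. The abstract does not say how scoring rates are parameterized, so the sketch assumes a common formulation in which a team's expected goals in a fixture equal its average goals scored times the opponent's average goals conceded, divided by the league average. The function names, the four-team league, and all scores are hypothetical, and ties on points are broken at random rather than by goal difference.

```python
import math
import random
from collections import defaultdict


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def team_rates(results):
    """Average goals scored and conceded per game for each team, plus the
    league-wide average goals per team per game."""
    scored, conceded, played = defaultdict(int), defaultdict(int), defaultdict(int)
    for home, away, hg, ag in results:
        scored[home] += hg
        conceded[home] += ag
        scored[away] += ag
        conceded[away] += hg
        played[home] += 1
        played[away] += 1
    attack = {t: scored[t] / played[t] for t in played}
    defence = {t: conceded[t] / played[t] for t in played}
    league_avg = sum(scored.values()) / sum(played.values())
    return attack, defence, league_avg


def award_points(pts, home, away, hg, ag):
    """Standard scoring: 3 points for a win, 1 each for a draw."""
    if hg > ag:
        pts[home] += 3
    elif hg < ag:
        pts[away] += 3
    else:
        pts[home] += 1
        pts[away] += 1


def simulate_positions(results, fixtures, n_sims=2000, seed=0):
    """Estimate P(team finishes in each table position) by simulating the
    remaining fixtures n_sims times."""
    rng = random.Random(seed)
    attack, defence, league_avg = team_rates(results)
    base = defaultdict(int)
    for home, away, hg, ag in results:
        award_points(base, home, away, hg, ag)
    teams = sorted(attack)
    counts = {t: [0] * len(teams) for t in teams}
    for _ in range(n_sims):
        pts = defaultdict(int, base)
        for home, away in fixtures:
            # Assumed rate: attack strength x opponent's defensive weakness,
            # normalised by the league average.
            hg = poisson_sample(rng, attack[home] * defence[away] / league_avg)
            ag = poisson_sample(rng, attack[away] * defence[home] / league_avg)
            award_points(pts, home, away, hg, ag)
        # Simplification: rank on points only, ties broken at random.
        ranking = sorted(teams, key=lambda t: (pts[t], rng.random()), reverse=True)
        for pos, t in enumerate(ranking):
            counts[t][pos] += 1
    return {t: [c / n_sims for c in counts[t]] for t in teams}


# Hypothetical first-half results: (home, away, home goals, away goals).
first_half = [("A", "B", 2, 1), ("C", "D", 0, 0), ("A", "C", 3, 0),
              ("B", "D", 1, 1), ("A", "D", 2, 0), ("B", "C", 2, 2)]
# Hypothetical remaining fixtures: the reverse of each first-half game.
remaining = [(away, home) for home, away, _, _ in first_half]

position_probs = simulate_positions(first_half, remaining, n_sims=500, seed=1)
```

Here position_probs[team][0] is the estimated probability that the team finishes first. A gamma-based variant built on expected goals would replace the rate inputs and the sampler, but the abstract does not specify how that model converts gamma draws into match outcomes, so it is not sketched here.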