How good is Driblab’s Expected Goals (xG) model?

Category: Team Analysis

We analysed more than 44,000 shots to test whether Driblab's Expected Goals (xG) model values shots accurately. The results confirm that our model is well balanced.

Published: 03/08/2021

Short answer: Very good.

Long answer: 

It is not unusual to see discrepancies between the xG values of different providers. Expected goals has a clear definition in theory, but it is difficult to put into practice. A new season is about to start, and has already started in many leagues, so we want to test our xG model. The main tool of football analytics needs to be accurate in order to assess teams and players properly.

But how can an xG model be tested? A shot with 0.22 xG means that, if an average player took the same shot in the same circumstances a hundred times, he would be expected to score about 22 goals. But this shot happens only once. Does our model give it too much value? Or do we underestimate the chance? The best way to assess the model is to look not at one shot, but at 44,406 shots. This is the number of shots (excluding penalty kicks and own goals) that were taken in Europe’s top 5 leagues, the Copa America and the Euro.
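To make the "hundred attempts" reading concrete, here is a minimal simulation sketch. It assumes every attempt is an independent trial with the same scoring probability; the function name `simulate_conversion` and the fixed seed are our own illustrative choices, not part of Driblab's model.

```python
import random

def simulate_conversion(xg, attempts, seed=0):
    """Share of simulated attempts scored when each one has probability `xg`.

    Assumption: attempts are independent and identical, which real shots
    never are. This only illustrates what the number means.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    goals = sum(rng.random() < xg for _ in range(attempts))
    return goals / attempts

# With 100 attempts the share can drift noticeably away from 0.22;
# with many more attempts it settles close to the xG value.
few = simulate_conversion(0.22, 100)
many = simulate_conversion(0.22, 100_000)
print(f"100 attempts: {few:.3f}, 100,000 attempts: {many:.3f}")
```

This is why a single shot tells us nothing about the model, while tens of thousands of shots do.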

One may think that it is impossible to miss a shot like this one. Our model gives it a value of 0.85 xG. When our team’s striker misses a 0.85 xG chance, we think that even we could have scored it. But the missing 0.15 xG tells us that roughly one out of every seven of these shots is missed, and this is that one. Across the seasons analysed, we recorded six other shots with that much xG, and all of them were scored. This is, obviously, a coincidence, but it still serves to illustrate the meaning of Expected Goals.
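The arithmetic behind "one out of every seven" can be sketched with a simple binomial calculation. It assumes the seven shots are independent and each is worth exactly 0.85 xG, which is a simplification; the function names are hypothetical.

```python
import math

def expected_misses(xg, shots):
    """Average number of misses among `shots` attempts of value `xg`."""
    return shots * (1.0 - xg)

def prob_exact_misses(xg, shots, misses):
    """Binomial probability of exactly `misses` misses in `shots` attempts."""
    return math.comb(shots, misses) * (1.0 - xg) ** misses * xg ** (shots - misses)

# Seven shots at 0.85 xG: how many misses should we expect on average,
# and how likely is the "six scored, one missed" pattern?
print(expected_misses(0.85, 7))
print(prob_exact_misses(0.85, 7, 1))
```

Under these assumptions, seven such shots produce about one miss on average, so six goals and one miss is an unremarkable outcome.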

Throughout the Euro, our model expected a total of 124.75 non-penalty goals, and 122 non-penalty goals were actually scored. However, only a few national teams took part and relatively few matches were played. Taking the seven competitions we mentioned, we have 1,905 matches in which a total of 4,485.8 non-penalty goals were expected and 4,581 were scored. An underestimation of 2.07%, which is not statistically significant, shows us that our model is well balanced. For comparison, in the previous season of these leagues we had an overestimation of 0.16%.
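A check of this kind can be sketched as follows. The function `aggregate_calibration` is hypothetical and treats shots as independent Bernoulli trials, which is our assumption and not necessarily the test used internally. Per-shot xG values are not published here, so the z-score is only shown on toy values, while the relative deviation uses the totals quoted above.

```python
import math

def aggregate_calibration(xg_values, goals_scored):
    """Compare total expected goals with the goals actually scored.

    Returns (expected, relative_deviation, z_score). A positive deviation
    means the model underestimated. Assumes independent shots.
    """
    expected = sum(xg_values)
    # Variance of a sum of independent Bernoulli trials
    variance = sum(p * (1.0 - p) for p in xg_values)
    z_score = (goals_scored - expected) / math.sqrt(variance)
    relative_deviation = (goals_scored - expected) / goals_scored
    return expected, relative_deviation, z_score

# Relative deviation from the totals quoted in the text
relative = (4581 - 4485.8) / 4581
print(f"underestimation: {relative:.2%}")

# Toy example with made-up per-shot values, only to show the call
print(aggregate_calibration([0.5, 0.5, 0.5, 0.5], 3))
```

Measured against goals scored, the quoted totals give a deviation of roughly 2.1%, in line with the figure above.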

But this does not tell the whole story. We might be overestimating low-value shots and underestimating high-value ones, or not accounting properly for all shot zones or shot types. As an example, we’ll look at shots of different values. The graphic shows our predictions against reality. We put the shots in bins of 0.01 xG; that is, our predicted probability of 0.04 xG covers shots between 0.035 and 0.045 xG. The size of each circle represents the number of shots (more shots mean less deviation), and we show what share of these shots were actually converted. An R-squared of 0.968 simply confirms our intuition mathematically: Driblab’s Expected Goals model is well weighted.
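The binning described above can be sketched like this. The input format (a list of `(xg, scored)` pairs) and both function names are assumptions of ours. The R-squared here is computed against the line "observed rate = predicted value", unweighted, which is one common choice; the exact definition behind the 0.968 figure is not stated in the text.

```python
import math
from collections import defaultdict

def calibration_bins(shots, width=0.01):
    """Group (xg, scored) pairs into bins centred on multiples of `width`.

    Returns {bin_centre: (number_of_shots, conversion_rate)}, so the 0.04
    bin holds shots between 0.035 and 0.045 xG.
    """
    totals = defaultdict(lambda: [0, 0])  # bin index -> [shots, goals]
    for xg, scored in shots:
        index = math.floor(xg / width + 0.5)
        totals[index][0] += 1
        totals[index][1] += int(scored)
    return {
        round(index * width, 4): (n, goals / n)
        for index, (n, goals) in sorted(totals.items())
    }

def r_squared(bins):
    """Unweighted R-squared of observed rates against the predicted values."""
    predicted = list(bins)
    observed = [rate for _, rate in bins.values()]
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Made-up shots, only to show the shape of the output
toy = [(0.036, False), (0.044, True), (0.25, True), (0.25, False)]
print(calibration_bins(toy))
```

The shot count in each bin plays the role of the circle size in the graphic: bins with few shots can sit far from the diagonal without saying much about the model.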

For big chances (xG over 0.33), only a small number of shots are recorded, so the observed conversion rate may not converge to the predicted value. The histogram shows the number of shots taken and goals scored. We can see that most shots are worth less than 0.05 xG, and that the number of big chances decreases drastically.
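Why small bins may not converge can be illustrated with the standard error of a conversion rate. This is a textbook binomial formula, used here under the assumption that shots in a bin are independent and share the same xG; the function name is our own.

```python
import math

def conversion_standard_error(xg, shots):
    """Standard error of the observed conversion rate for `shots` attempts."""
    return math.sqrt(xg * (1.0 - xg) / shots)

# The same xG value observed over a handful of shots versus many shots
print(conversion_standard_error(0.5, 4))
print(conversion_standard_error(0.5, 100))
```

With only a few shots in a bin, the observed rate can be tens of percentage points away from the prediction by chance alone, which is what happens at the big-chance end of the histogram.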

Expected goals also allows us to evaluate team performance. In this case, even though there is an obvious correlation between expected and actual goals, some teams have scored more than expected, and others, such as Brighton, have clearly underperformed. Being efficient in front of goal might hand you the league title (see Lille). And even though underperforming in the domestic league took them only to fourth place, Chelsea showed their true potential in the Champions League.
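A simple way to express this comparison is goals scored minus expected goals per team. The sketch below uses placeholder teams and made-up totals, not real league data, and the function name is hypothetical.

```python
def goals_minus_xg(team_totals):
    """Rank teams by goals scored minus expected goals, best finishers first.

    `team_totals` maps a team name to a (goals, xg) pair.
    """
    differences = {team: goals - xg for team, (goals, xg) in team_totals.items()}
    return sorted(differences.items(), key=lambda item: item[1], reverse=True)

# Placeholder values for illustration only
example = {"Team A": (60, 52.5), "Team B": (40, 48.0)}
for team, difference in goals_minus_xg(example):
    print(f"{team}: {difference:+.1f}")
```

A positive value means a team scored more than its chances were worth; a negative value means it underperformed them.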

Driblab’s Expected Goals model is the basis for many other metrics, so we have studied it closely to make sure it is accurate. Here we have shown some of the internal analysis the model undergoes in order to reduce any possible bias. A metric that allows us to assess players, teams and even leagues is of the utmost importance to us. And the next time your team’s striker misses a golden chance, remember that even 0.97 xG shots are missed every now and then.

We are Driblab, a consultancy specialized in football analytics and big data; our work is focused on advising and minimizing risk in professional football decision-making in areas related to talent detection and footballer evaluations. Our database has more than 180,000 players from more than 180 competitions, covering information from all over the world. Here you can learn more about how we work and what we offer.

Author: Joan Hernanz