How good is Driblab’s Expected Goals (xG) model?

Category: Team Analysis

We analysed more than 44,000 shots to check whether Driblab's Expected Goals (xG) model values shots accurately. The result confirms that our model is well balanced.

Published: 03/08/2021

Short answer: Very good.

Long answer: 

It is not unusual to see discrepancies between the xG values of different providers. Expected goals have a clear definition in theory, but it is difficult to put into practice. A new season is about to start, and has already started in many leagues, so we want to test our xG model. The main tool of football analytics needs to be accurate in order to assess teams and players properly.

But how can an xG model be tested? A shot with 0.22 xG means that, if an average player took the same shot in the same circumstances a hundred times, he would be expected to score about 22 goals. But this shot happens only once. Does our model give it too much value? Or do we underestimate the chance? The best way to assess the model is by looking not at one shot, but at 44,406 shots. This is the number of shots (excluding penalty kicks and own goals) that were taken in Europe’s top 5 leagues, the Copa America and the Euro.
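
To make the "hundred times" reading concrete, here is a minimal Python sketch. It treats every attempt as an independent trial with the same scoring probability (a binomial model), which is a simplifying assumption of ours, and the function name is purely illustrative.

```python
import math

def repeated_shot_summary(xg, attempts):
    """Expected goals and spread if the same shot were taken `attempts` times.

    Assumes independent attempts with identical scoring probability `xg`
    (a binomial model). This is an illustration, not Driblab's model.
    """
    expected = attempts * xg
    std_dev = math.sqrt(attempts * xg * (1 - xg))
    return expected, std_dev

expected, std_dev = repeated_shot_summary(0.22, 100)
print(f"expected goals: {expected:.1f}, standard deviation: {std_dev:.2f}")
```

Under these assumptions the average is 22 goals, with a spread of roughly four goals either way, so even a hundred repetitions would rarely land on exactly 22.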

One may think that it is impossible to miss that shot. Our model gives it a value of 0.85 xG. When our team’s striker misses a 0.85 xG chance, we think that even we could have scored it. But the missing 0.15 xG tells us that roughly one out of every seven of these shots is missed, and this is the one. Throughout the seasons we recorded six other shots with that much xG, and all of them were scored. This is, obviously, a coincidence, but it still serves to illustrate the meaning of Expected Goals.
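
The arithmetic behind this example can be sketched in a few lines of Python. The helper names are ours, and the sketch assumes the shots are independent and each worth exactly 0.85 xG, which real shots will only approximate.

```python
def misses_one_in(xg):
    """How many shots of this value it takes, on average, to see one miss."""
    return 1 / (1 - xg)

def prob_all_scored(xg, shots):
    """Chance that `shots` independent shots of value `xg` are all scored."""
    return xg ** shots

print(f"one miss every {misses_one_in(0.85):.1f} shots")
print(f"chance six such shots all go in: {prob_all_scored(0.85, 6):.0%}")
```

With these assumptions, six goals from six comparable chances happens a bit more than a third of the time, so the run of converted shots is unremarkable rather than a sign that the value is too low.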

During the Euro, our model expected a total of 124.75 non-penalty goals, and the total number of non-penalty goals scored was 122. In the end, though, only a few national teams took part and not many matches were played. Taking the seven competitions we mentioned, we have 1,905 matches in which a total of 4,485.8 non-penalty goals were expected and 4,581 were scored. An underestimation of 2.07%, which is not statistically significant, shows us that our model is well balanced. For comparison, in the previous season, in these leagues, we had an overestimation of 0.16%.
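
As a rough sketch of this kind of balance check, the Python below compares the totals quoted in this article. The relative gap is taken as a share of goals scored, and the z-score treats the goal total as Poisson with mean equal to the xG total; both choices are our assumptions for illustration, not necessarily the test used internally.

```python
import math

def balance_check(expected_goals, actual_goals):
    """Compare total xG with total goals.

    Returns the relative gap (as a share of goals scored) and a rough
    z-score. The z-score treats the goal total as Poisson with mean
    `expected_goals`; that is a simplifying assumption, not a test
    described in the article.
    """
    gap = actual_goals - expected_goals
    relative_gap = gap / actual_goals
    z_score = gap / math.sqrt(expected_goals)
    return relative_gap, z_score

totals = [("Euro", 124.75, 122), ("Seven competitions", 4485.8, 4581)]
for label, xg_total, goals in totals:
    rel, z = balance_check(xg_total, goals)
    print(f"{label}: gap {rel:+.2%}, z = {z:+.2f}")
```

A positive gap means more goals were scored than expected. For the seven competitions the gap is roughly 2% and the rough z-score stays below the usual 1.96 threshold. The real variance of a sum of shot outcomes is somewhat smaller than the Poisson value, so this z-score slightly understates the deviation.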

But this does not tell the whole story. We might be overestimating low-value shots and underestimating high-value ones, or not accounting for all shot zones or shot types. As an example, we’ll take a look at shots of different values. The graphic above shows our prediction against reality. We put the shots in bins of 0.01 xG. That is, a predicted probability of 0.04 xG covers shots between 0.035 and 0.045 xG. The size of each circle represents the number of shots (more shots mean less deviation), and we show how many of these shots were actually converted. An R-squared of 0.968 confirms our intuition mathematically: Driblab’s Expected Goals model is well weighted.
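
The binning described here can be sketched in Python as follows. The function names and the tiny shot list are hypothetical, and the R-squared is computed unweighted against the line "conversion rate = predicted xG"; the article does not say whether its figure is weighted by shot count, so treat that as an assumption.

```python
import math
from collections import defaultdict

def calibration_bins(shots):
    """Group (xg, scored) pairs into 0.01-wide bins centred on 0.01, 0.02, ...

    Returns {bin_centre: (number_of_shots, conversion_rate)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for xg, scored in shots:
        idx = int(math.floor(xg * 100 + 0.5))  # nearest hundredth
        totals[idx][0] += 1
        totals[idx][1] += int(scored)
    return {idx / 100: (n, goals / n) for idx, (n, goals) in sorted(totals.items())}

def r_squared_identity(bins):
    """Unweighted R-squared of observed conversion rate against predicted xG."""
    predicted = list(bins)
    observed = [rate for _, rate in bins.values()]
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((y - x) ** 2 for x, y in zip(predicted, observed))
    ss_tot = sum((y - mean_observed) ** 2 for y in observed)
    return 1 - ss_res / ss_tot

# Hypothetical shots: (predicted xG, 1 if scored else 0).
toy_shots = [(0.036, 0), (0.044, 0), (0.5, 1), (0.5, 0)]
bins = calibration_bins(toy_shots)
print(bins)
print(round(r_squared_identity(bins), 4))
```

With real data, each bin would hold many shots, and the bin sizes could be used as weights so that sparsely populated bins count for less.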

For big chances (xG over 0.33), only a small number of shots are recorded, so convergence may not happen. Here is a histogram of the number of shots taken and goals scored. We can see that most shots are worth less than 0.05 xG, and that the number of big chances decreases drastically.
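
Why sparse bins converge poorly can be illustrated with the standard error of a conversion rate. This is a minimal sketch that assumes independent shots sharing one scoring probability; the shot counts are hypothetical and chosen only to contrast a crowded bin with a sparse one.

```python
import math

def conversion_standard_error(xg, shots):
    """Standard error of the observed conversion rate in a bin of `shots`
    independent shots that each have scoring probability `xg`."""
    return math.sqrt(xg * (1 - xg) / shots)

# Shot counts below are hypothetical, chosen only to contrast bin sizes.
for xg, shots in [(0.04, 5000), (0.40, 50)]:
    se = conversion_standard_error(xg, shots)
    print(f"xG {xg:.2f} with {shots} shots: rate known to about +/-{se:.3f}")
```

The sparse high-value bin has an uncertainty many times larger than the crowded low-value bin, so its observed conversion rate can sit well away from the prediction even when the model is sound.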

Expected goals also allow us to evaluate team performance. In this case, even though there is an obvious correlation between expected and actual goals, some teams have scored more than expected, and others, such as Brighton, have clearly underperformed. Being efficient might hand you the league title (see Lille). And even though underperforming in the domestic league might only take you to fourth place, Chelsea showed its true potential in the Champions League.
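
A simple way to express over- and underperformance is goals scored minus expected goals. The Python sketch below uses placeholder teams and invented totals purely for illustration; it does not reproduce any club's real figures.

```python
def goals_minus_xg(team_totals):
    """Rank teams by goals scored minus expected goals (highest first).

    `team_totals` maps a team name to a (goals, xg) pair.
    """
    diffs = {team: goals - xg for team, (goals, xg) in team_totals.items()}
    return sorted(diffs.items(), key=lambda item: item[1], reverse=True)

# Placeholder teams and totals, invented purely for illustration.
example = {"Team A": (70, 61.5), "Team B": (48, 55.0), "Team C": (52, 52.4)}
for team, diff in goals_minus_xg(example):
    print(f"{team}: {diff:+.1f}")
```

Teams at the top of the ranking converted more than their chances suggested, and teams at the bottom converted fewer.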

Driblab’s Expected Goals model is the basis for many other metrics, so we have studied it closely to make sure it is very accurate. Here we have shown some of the internal analysis the model undergoes in order to reduce any possible bias. A metric that allows us to assess players, teams and even leagues is of the utmost importance to us. And the next time your team’s striker misses a golden chance, remember that even 0.97 xG shots are missed every now and then.

We are Driblab, a consultancy specialized in football analytics and big data; our work is focused on advising and minimizing risk in professional football decision-making in areas related to talent detection and footballer evaluations. Our database has more than 180,000 players from more than 180 competitions, covering information from all over the world. Here you can learn more about how we work and what we offer.

Author: Joan Hernanz