How good is Driblab’s Expected Goals (xG) model?

Category: Team Analysis

We analysed more than 44,000 shots to check whether Driblab's Expected Goals (xG) model values shots accurately. The results confirm that our model is well balanced.

Published: 03/08/2021

Short answer: Very good.

Long answer: 

It is not unusual to see discrepancies between the xG values of different providers. Expected goals have a clear definition in theory, but it is difficult to put into practice. A new season is about to start, and has already started in many leagues, so we want to test our xG model. The main tool of football analytics needs to be accurate in order to assess teams and players properly.

But how can an xG model be tested? A shot with 0.22 xG means that, if an average player took the same shot in the same circumstances a hundred times, he would score about 22 goals. But this shot happens only once. Does our model give it too much value? Or do we underestimate the chance? The best way to assess the model is by looking not at one shot, but at 44,406 shots. This is the number of shots (excluding penalty kicks and own goals) that were taken in Europe’s top five leagues, the Copa America and the Euro.

One might think that it is impossible to miss that shot. Our model gives it a value of 0.85 xG. When our team’s striker misses a 0.85 xG chance, we think that even we could have scored it. But the missing 0.15 xG tells us that roughly one out of every seven of these shots is missed, and this is the one. Over the seasons, we have recorded six other shots with that much xG, and all were scored. This is, obviously, a coincidence, but it still serves to illustrate the meaning of Expected Goals.
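The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: it assumes that shots with the same xG are independent and all have exactly the same scoring probability, which is a simplification. The function name `prob_goals` is ours and is not part of Driblab's model.

```python
from math import comb


def prob_goals(n_shots: int, n_goals: int, xg: float) -> float:
    """Binomial probability of exactly n_goals from n_shots of equal xG."""
    return comb(n_shots, n_goals) * xg**n_goals * (1 - xg) ** (n_shots - n_goals)


XG = 0.85  # value quoted in the article

# Long-run miss rate is 0.15, so one miss is expected every 1 / 0.15 shots.
shots_per_miss = 1 / (1 - XG)

# Chance that six independent 0.85 xG shots are all scored.
all_six_scored = prob_goals(6, 6, XG)

# Chance that seven such shots produce exactly one miss.
one_miss_in_seven = prob_goals(7, 6, XG)

print(f"{shots_per_miss:.1f} {all_six_scored:.3f} {one_miss_in_seven:.3f}")
```

Under these assumptions, one miss is expected roughly every 6.7 shots, six such shots are all scored a little under 40% of the time, and seven shots with exactly one miss occur about 40% of the time. A small sample like this can therefore easily look better or worse than the xG value suggests.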

During the Euro, our model expected a total of 124.75 non-penalty goals, and 122 non-penalty goals were actually scored. However, only a few national teams took part and not many matches were played. Taking the seven competitions we mentioned, we have 1,905 matches in which a total of 4,485.8 non-penalty goals were expected and 4,581 were scored. This underestimation of 2.07%, which is not statistically significant, shows that our model is well balanced. For comparison, in the previous season in these leagues we had an overestimation of 0.16%.
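As a rough check on these totals, the sketch below recomputes the deviation from the figures quoted above. Two things here are our assumptions, not Driblab's stated method: the relative deviation is taken as a share of goals actually scored (which gives about 2.08%, close to the quoted figure, whereas dividing by expected goals gives about 2.12%), and the z-score uses a Poisson approximation in which the variance of the goal total equals the expected total. Because the real variance of a sum of independent shots is smaller than that, this z-score understates the true one, so it is only a first-pass indication. An exact test would need the xG value of every shot.

```python
from math import sqrt


def calibration_summary(expected_goals: float, actual_goals: float):
    """Return (difference, relative deviation, approximate z-score)."""
    diff = actual_goals - expected_goals
    # Assumption: deviation expressed as a share of goals actually scored.
    relative = diff / actual_goals
    # Assumption: Poisson approximation, variance equal to expected goals.
    z_score = diff / sqrt(expected_goals)
    return diff, relative, z_score


# Totals quoted in the article.
all_competitions = calibration_summary(4485.8, 4581)
euro_only = calibration_summary(124.75, 122)

for label, (diff, relative, z_score) in (
    ("Seven competitions", all_competitions),
    ("Euro", euro_only),
):
    print(f"{label}: diff={diff:+.2f}, relative={relative:+.2%}, z={z_score:+.2f}")
```

With the article's totals, the approximate z-score for the seven competitions stays below the usual 1.96 threshold, which is consistent with the statement that the deviation is not statistically significant at the 5% level under this approximation.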

But this does not tell the whole story. We might be overestimating low-value shots and underestimating high-value ones, or not accounting for all shot zones or shot types. As an example, we’ll take a look at shots of different values. The graphic above shows our prediction against reality. We put the shots in bins of 0.01 xG; that is, a predicted probability of 0.04 xG covers shots between 0.035 and 0.045 xG. The size of each circle represents the number of shots (more shots mean less deviation), and we show how many of these shots were actually converted. An R-squared of 0.968 confirms our intuition mathematically: Driblab’s Expected Goals model is well calibrated.
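The binning described above can be sketched as follows. This is a minimal illustration on made-up shots, not Driblab's data or code. The function names `calibration_bins` and `r_squared` are hypothetical, and since the article does not say how its R-squared was computed, the version here is an assumption: an unweighted coefficient of determination of the observed conversion rate against the line where observed equals predicted.

```python
from collections import defaultdict
from math import floor


def calibration_bins(shots, width=0.01):
    """Group (xg, scored) pairs into bins centred on multiples of `width`.

    Returns {bin_centre: (n_shots, n_goals, conversion_rate)}.
    """
    counts = defaultdict(lambda: [0, 0])
    for xg, scored in shots:
        index = floor(xg / width + 0.5)  # e.g. 0.035 <= xg < 0.045 -> bin 4
        counts[index][0] += 1
        counts[index][1] += int(scored)
    return {
        round(index * width, 10): (n, goals, goals / n)
        for index, (n, goals) in sorted(counts.items())
    }


def r_squared(bins):
    """Unweighted R^2 of observed conversion rate against the line y = x."""
    predicted = list(bins)
    observed = [rate for _, _, rate in bins.values()]
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for p, o in zip(predicted, observed))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Made-up shots for illustration only: (xG, scored?)
toy_shots = (
    [(0.04, True)] + [(0.04, False)] * 24      # 25 shots, 1 goal
    + [(0.10, True)] * 2 + [(0.10, False)] * 8  # 10 shots, 2 goals
    + [(0.50, True), (0.50, False)]             # 2 shots, 1 goal
)

toy_bins = calibration_bins(toy_shots)
toy_r2 = r_squared(toy_bins)
```

A weighted version, where each bin counts in proportion to its number of shots, would be a natural alternative, since sparsely populated bins are noisier.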

For big chances (xG over 0.33), only a small number of shots are recorded, so the observed conversion rate may not converge to the predicted one. Here is a histogram of the number of shots taken and goals scored. We can see that most shots are worth less than 0.05 xG, and that the number of big chances drops off drastically.

Expected goals also allow us to evaluate team performance. In this case, even though there is an obvious correlation between expected and actual goals, some teams have scored more than expected, and others, such as Brighton, have clearly underperformed. Efficiency in front of goal might hand you the league title (see Lille). And even though underperforming in the domestic league might only take you to fourth place, Chelsea showed its true potential in the Champions League.

Driblab’s Expected Goals model is the basis for many other metrics, so we have studied it closely to make sure it is very accurate. Here we have shown some of the internal analysis the model undergoes in order to reduce any possible bias. A metric that allows us to assess players, teams and even leagues is of the utmost importance to us. And the next time your team’s striker misses a golden chance, remember that even 0.97 xG shots are missed every now and then.

We are Driblab, a consultancy specialized in football analytics and big data; our work is focused on advising and minimizing risk in professional football decision-making in areas related to talent detection and footballer evaluations. Our database has more than 180,000 players from more than 180 competitions, covering information from all over the world. Here you can learn more about how we work and what we offer.

Author: Joan Hernanz
