And the winner is?

Mike Langen
6 min read · Jul 14, 2021

Club and league performance in the UEFA EURO 2020

UEFA Euro 2020 has just ended, with Italy beating England in the final. Beyond the national competition, major football tournaments are also an opportunity for players to present themselves to the world. At the same time, UEFA Euro 2020 offers an opportunity to analyse the squads of the best clubs in Europe. In this little data science exercise, I therefore want to see how the clubs and leagues performed.

To start, I accessed the official UEFA Euro 2020 website, which provides a lot of statistics on the tournament and the individual teams. The official squad of every team is listed, and every player has an individual page with statistics such as age, club and number of goals. The first step was to collect the information on all individual players in one dataset. For this I wrote a small web scraper in Python, using Selenium. The final dataset contains information on 621 players from 24 teams.

Since I am also interested in statistics at league level, I needed to link each club to its league. I therefore wrote a similar Python script that downloads all clubs in the major European leagues for the 20–21 season (since the UEFA club information is also for the 20–21 season). Now we can start to look at the outputs :)
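The linking step can be pictured as a simple lookup from club name to league. The snippet below is a minimal sketch, not the script used for this project: the rows, club names, league names and the `add_league` helper are all made up for illustration, and clubs outside the downloaded leagues are assumed to fall into an "Other" bucket.

```python
# Toy rows standing in for the scraped player data (all values are made up).
players = [
    {"player": "Player A", "club": "Club X"},
    {"player": "Player B", "club": "Club Y"},
    {"player": "Player C", "club": "Club Q"},  # club not in any downloaded league
]

# Hypothetical club-to-league lookup, as it could be built from the league club lists.
club_to_league = {"Club X": "League 1", "Club Y": "League 2"}


def add_league(rows, lookup, default="Other"):
    """Return copies of the rows with a 'league' field taken from the lookup."""
    return [{**row, "league": lookup.get(row["club"], default)} for row in rows]


linked = add_league(players, club_to_league)
for row in linked:
    print(row["player"], row["club"], row["league"])
```

In practice, club names from two different sources rarely match character for character, so some name cleaning would likely be needed before a lookup like this works reliably.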

Number of players

Figures 1 and 2 below show the top clubs and leagues by total number of players in the tournament. The number of players is a reasonable indicator of club or league quality, as it shows how many players of a club are considered among the best in their country. Chelsea had the most players in the tournament, followed by Manchester City and Bayern Munich. Considering that Chelsea's full squad for the 20–21 season had 49 players eligible for European national teams, this means 32% of them are considered among the best in their home country (they play for their national team) and their national team also qualified for the tournament!

Figure 2 looks at this from a league perspective. Surprise, surprise: most players in the tournament call the Premier League their home (you may have guessed it from Figure 1). Second and third are the Bundesliga (Germany) and Serie A (Italy). This means fans of these leagues saw a lot of familiar faces on TV, and it suggests that these leagues are among the best in Europe (if not the world).
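Counting players per club or league is a plain frequency count. Here is a minimal sketch using only the standard library; the rows and the `count_players` helper are illustrative assumptions, not the project's actual data or code.

```python
from collections import Counter

# Toy rows (made-up values): one row per player, already linked to a league.
rows = [
    {"player": "Player A", "club": "Club X", "league": "League 1"},
    {"player": "Player B", "club": "Club X", "league": "League 1"},
    {"player": "Player C", "club": "Club Y", "league": "League 1"},
    {"player": "Player D", "club": "Club Z", "league": "League 2"},
]


def count_players(rows, key):
    """Count how many players share each value of `key` (e.g. 'club' or 'league')."""
    return Counter(row[key] for row in rows)


by_club = count_players(rows, "club")
by_league = count_players(rows, "league")

# most_common(n) returns the n largest groups, which is what a "top clubs" chart needs.
print(by_club.most_common(1))
print(by_league.most_common(1))
```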

Figure 1
Figure 2

Number of goals

The number of players is a reasonable measure, but in the end we want to see goals! Therefore, I did the same exercise for goals: how many goals did the players of individual clubs and leagues score? It is important to mention that the dataset only contains goals scored per player (so no own goals) and also includes penalty goals (such as those from the Italy vs. England final).

Figures 3 and 4 show the total number of goals per league and club. Out of 132 goals (excluding 11 own goals), 89 (67%) were scored by players in the top 3 leagues. Around 26% of all goals were scored by players currently playing in Serie A. Looking at the club level in Figure 4, we can see that 9 goals were scored by Juventus Turin players, followed by 8 goals from Man City and Inter Milan players.
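The share of goals coming from the top leagues is a sum over groups followed by a ratio. The sketch below shows one way to compute it; the toy rows and the helper names `goals_by` and `top_n_share` are assumptions for illustration. The last line only re-checks the 89 out of 132 figure quoted above.

```python
from collections import defaultdict

# Toy rows (made-up values): goals scored per player, own goals not included.
rows = [
    {"player": "Player A", "league": "League 1", "goals": 3},
    {"player": "Player B", "league": "League 1", "goals": 1},
    {"player": "Player C", "league": "League 2", "goals": 2},
    {"player": "Player D", "league": "League 3", "goals": 1},
    {"player": "Player E", "league": "League 4", "goals": 1},
]


def goals_by(rows, key):
    """Sum the goals of all players sharing the same value of `key`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["goals"]
    return dict(totals)


def top_n_share(totals, n=3):
    """Share of all goals scored by the n highest-scoring groups."""
    ranked = sorted(totals.values(), reverse=True)
    total = sum(ranked)
    return sum(ranked[:n]) / total if total else 0.0


league_goals = goals_by(rows, "league")
print(top_n_share(league_goals, n=2))

# Re-checking the figure from the text: 89 of 132 goals.
print(f"{89 / 132:.0%}")
```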

However, there are some issues with this measure. First, forwards have a higher chance of scoring, so it could just be that these leagues and clubs have the best forwards. Second, individual players might distort the club statistic. As an example, Cristiano Ronaldo is responsible for 5 of the 9 Juventus goals!

Figure 3
Figure 4

Minutes played

Since goals and number of players are somewhat biased measures, let’s turn to a third measure of performance: minutes played. For every player, UEFA reports the actual minutes played in the tournament. This measure doesn’t discriminate between players by position, unlike goals, where forwards have a higher chance of scoring than defenders. In contrast to purely counting the number of players, it also accounts for a player’s role in the national team (is he just in the squad or actually in the starting line-up?). The dataset contains the minutes played in the tournament for every player (summed across all games). Obviously, total minutes are higher for teams that progressed further in the tournament.

Figures 5 and 6 show the total minutes played by all players of a league or club. Most of the time, we saw Premier League, Serie A and Bundesliga players on the pitch. We can compare this to the number of players in the tournament (the two are obviously highly correlated). Even though there were more Bundesliga players in the tournament, they played slightly fewer minutes than Serie A players (of whom there were fewer). On average, every Premier League player played 226 minutes in the tournament, every Bundesliga player 191 minutes and every Serie A player 229 minutes. Comparing quality with quantity, this means Serie A and Premier League players played the most on average.
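The per-league averages are total minutes divided by the number of players. A minimal sketch follows, assuming that every squad player counts towards the average, including those who never left the bench; the rows and the `average_minutes` helper are made up for illustration.

```python
from collections import defaultdict

# Toy rows (made-up values): total tournament minutes per player.
rows = [
    {"player": "Player A", "league": "League 1", "minutes": 360},
    {"player": "Player B", "league": "League 1", "minutes": 90},
    {"player": "Player C", "league": "League 1", "minutes": 0},  # unused squad player
    {"player": "Player D", "league": "League 2", "minutes": 270},
    {"player": "Player E", "league": "League 2", "minutes": 180},
]


def average_minutes(rows, key):
    """Average minutes per player for each value of `key`.

    Assumption: players with zero minutes still count as players.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["minutes"]
        counts[row[key]] += 1
    return {group: totals[group] / counts[group] for group in totals}


print(average_minutes(rows, "league"))
```

In this toy data, League 1 has more players but a lower average, which is the same pattern as the Bundesliga versus Serie A comparison above.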

At club level, the ranking by minutes played closely mirrors the ranking in the UEFA Champions League, indicating that players from the top clubs in Europe played the most!

Figure 5
Figure 6

Minutes per Goal

As a last measure for this exercise, we look at the average minutes per goal. Minutes per goal is a common measure for individual forwards, indicating their efficiency (i.e. how long they need to score a goal). You can find the top values in Europe for the 19–20 season here. Since this ratio can only be calculated if there is at least one goal, it only considers players who scored. Furthermore, I exclude leagues and clubs with only one goal, as these can be lucky shots.
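The two filters described above (scorers only, and at least two goals per league or club) can be sketched as follows. How the average is formed is an assumption here: the sketch pools the minutes of all scoring players in a group and divides by their total goals, which is only one possible reading. The rows and the `minutes_per_goal` helper are made up for illustration.

```python
from collections import defaultdict

# Toy rows (made-up values).
rows = [
    {"player": "Player A", "league": "League 1", "minutes": 360, "goals": 3},
    {"player": "Player B", "league": "League 1", "minutes": 180, "goals": 1},
    {"player": "Player C", "league": "League 1", "minutes": 270, "goals": 0},
    {"player": "Player D", "league": "League 2", "minutes": 90, "goals": 1},
]


def minutes_per_goal(rows, key, min_goals=2):
    """Pooled minutes per goal for each value of `key`.

    Only players with at least one goal are considered, and groups with
    fewer than `min_goals` goals in total are dropped (assumed filter).
    """
    minutes = defaultdict(int)
    goals = defaultdict(int)
    for row in rows:
        if row["goals"] > 0:  # the ratio is undefined for players without a goal
            minutes[row[key]] += row["minutes"]
            goals[row[key]] += row["goals"]
    return {
        group: minutes[group] / goals[group]
        for group in goals
        if goals[group] >= min_goals
    }


print(minutes_per_goal(rows, "league"))
```

Averaging each scorer's individual ratio instead of pooling would give different numbers, so the choice matters when comparing leagues.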

Based on Figures 7 and 8, we can see that the picture for leagues looks a little different now. The most efficient forwards actually play in France (Ligue 1), needing 107 minutes per goal on average! They are followed by those in the Portuguese Primeira Liga (127 minutes) and the Dutch Eredivisie (158 minutes). The bigger top leagues, with the most players in the tournament and the most minutes played, turn out to be far less efficient! At club level, however, we again find a lot of European top clubs, painting a similar picture as before.

Figure 7
Figure 8

Conclusion & Disclaimer:

I hope you like this little project. If you have an idea for other measures, please comment below or send me a message. I normally share my project code on GitHub. However, web scraping is still a legal grey area: it is OK to use the information but not to redistribute it. I therefore decided not to share any data or code for this project.

Mike Langen

I am an Assistant Professor in real estate finance, exploring the opportunities of big data. I am interested in all kinds of data science exercises.