Soccer analytics: treat it like a ‘stop-and-go’ sport!

Posted on 5 March 2012


Last Sunday, a top Eredivisie manager gave a pre-match interview ahead of a clash between title contenders. The quote above the video: “Statistics mean nothing to me”.

Just two days earlier, the biggest sports analytics conference in the world had its 2012 edition, the MIT Sloan Sports Analytics Conference, or #SSAC as it went around on Twitter…

The contrast could hardly have been bigger.


Analyzing soccer

The SSAC brought professional and amateur analysts together for two days of exchanging ideas, presenting research and, most of all, networking. In these two packed days, a mere hour was reserved for soccer. The rest of the two days? Baseball, basketball, NFL, ice hockey… Soccer may be the biggest sport in Europe, but in the USA it still isn’t, and sports analysis is a very American affair.

One of the main objections most people hold against analyzing soccer is its presumed continuous state of flow, as opposed to the very stop-and-go nature of other sports, most notably the godfather sport of analytics, baseball. But is this a fair point? I did not attend the SSAC, but rather spent part of my recent holiday chewing on this very question and came up with some thoughts that I will put forward here, defending the view that soccer is just another stop-and-go sport, albeit with less identifiable and more heterogeneous ‘stops’ in it. Let me explain.


Possession analysis

Soccer, at its base, is a rather simple two-team affair. At any moment, one of the two teams is in possession of the ball and tries to create the best possible goal scoring attempt, while the other team tries to prevent them from doing so, trying to regain possession in the process. This goes on for ninety minutes and whoever finds the back of the net most often walks off as the winner.

Nothing new so far. But try to think of these possessions as separate plays, and, for convenience, try to separate offensive and defensive plays of your team. This is a rather counter-intuitive thing to do for soccer fans, given the continuous nature of the game at hand. All of your team’s possession plays start at a certain point (open play turnovers, goal kicks, corners, free-kicks, throw-ins) and come to an end eventually. Most of them will end with a disappointing turnover of possession in open play. However, the more successful ones will lead to a new possession play, like a corner, a free-kick, or a penalty, and a minority of the possession plays will lead to open play goal scoring attempts.
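
To make the idea of separate plays concrete, here is a minimal sketch of how a chronological event feed might be cut into possession spells. The event format (team, event type), the event names, and the inclusion of kick-offs and penalties as restarts are assumptions made for illustration; real match data will look different.

```python
from dataclasses import dataclass, field

# Assumed restart types: each of these starts a new possession play.
RESTARTS = {"kick_off", "goal_kick", "corner", "free_kick", "throw_in", "penalty"}


@dataclass
class Spell:
    team: str
    start: str  # how the spell began
    events: list = field(default_factory=list)


def split_into_spells(events):
    """Split chronological (team, event_type) tuples into possession spells."""
    spells = []
    current = None
    for team, kind in events:
        starts_new_spell = (
            current is None
            or team != current.team  # possession changed hands
            or kind in RESTARTS      # a restart begins a new play
        )
        if starts_new_spell:
            start = kind if kind in RESTARTS else "open_play_turnover"
            current = Spell(team=team, start=start)
            spells.append(current)
        current.events.append(kind)
    return spells


# Hypothetical event feed, invented for this example.
events = [
    ("A", "kick_off"), ("A", "pass"),
    ("B", "tackle"), ("B", "pass"), ("B", "shot"),
    ("A", "goal_kick"), ("A", "pass"),
    ("A", "corner"), ("A", "shot"),
]
spells = split_into_spells(events)
for spell in spells:
    print(spell.team, spell.start, spell.events)
```

Note that the corner counts as a new spell for the same team, which matches the idea that a successful possession play can lead to a new one.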

These goal scoring attempts can be assigned a certain value, based on the share of shots from that same position in that same match situation that resulted in a goal in historical data. This reasoning has been explained in the ‘A chance is a chance is a chance’ post of last summer.
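
As a rough sketch of that valuation, the snippet below computes a historical conversion rate per combination of shot position and match situation, and then uses it to value new shots. The zone names, situation labels and the tiny shot history are invented for illustration, and this simple lookup is not necessarily the model used in the earlier post.

```python
from collections import defaultdict


def build_xg_table(historical_shots):
    """Map (zone, situation) to the share of historical shots that were scored."""
    shots = defaultdict(int)
    goals = defaultdict(int)
    for zone, situation, scored in historical_shots:
        key = (zone, situation)
        shots[key] += 1
        goals[key] += int(scored)
    return {key: goals[key] / shots[key] for key in shots}


# Invented history: (zone, situation, scored?)
history = [
    ("six_yard_box", "open_play", True),
    ("six_yard_box", "open_play", False),
    ("outside_box", "open_play", True),
    ("outside_box", "open_play", False),
    ("outside_box", "open_play", False),
    ("outside_box", "open_play", False),
]
xg_table = build_xg_table(history)

# Value the shots of a hypothetical match by looking each one up.
match_shots = [
    ("six_yard_box", "open_play"),
    ("outside_box", "open_play"),
    ("outside_box", "open_play"),
]
match_xg = sum(xg_table[shot] for shot in match_shots)
print(xg_table, match_xg)
```

With real data the table would need far more shots per cell to be trustworthy; the point here is only the mechanics of turning history into a value per chance.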

Imagine yourself reading a match report that would say something like this. Team A had 100 possession spells, of which 62 started with turnovers in open play, 10 started with goal kicks, 20 were free-kicks and 8 were corners. The 62 possession spells that started in open play created 8 goal scoring chances for an expected number of goals of 1.92. The spells starting with goal kicks failed to produce any shots, while the 20 free-kick possessions produced 3 shots with 0.8 expected goals, and the corners failed to produce any shots either. I guess you get the point by now.
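
The numbers in that imaginary report can be turned into comparable rates with a few lines of code. The sketch below uses only the figures quoted above; the dictionary layout and the names of the derived metrics are my own choices for illustration.

```python
# Figures taken from the imaginary match report for Team A.
report = {
    "open_play_turnover": {"spells": 62, "shots": 8, "xg": 1.92},
    "goal_kick": {"spells": 10, "shots": 0, "xg": 0.0},
    "free_kick": {"spells": 20, "shots": 3, "xg": 0.8},
    "corner": {"spells": 8, "shots": 0, "xg": 0.0},
}


def summarise(report):
    """Derive per-start-type rates from spell, shot and expected-goal counts."""
    total_spells = sum(row["spells"] for row in report.values())
    summary = {}
    for start, row in report.items():
        summary[start] = {
            "share_of_spells": row["spells"] / total_spells,
            "shots_per_spell": row["shots"] / row["spells"],
            "xg_per_spell": row["xg"] / row["spells"],
            # Undefined when a start type produced no shots at all.
            "xg_per_shot": row["xg"] / row["shots"] if row["shots"] else None,
        }
    return summary


summary = summarise(report)
total_xg = sum(row["xg"] for row in report.values())

for start, rates in summary.items():
    print(start, rates)
print("total expected goals:", round(total_xg, 2))
```

Rates like expected goals per spell are what make two teams directly comparable, regardless of how long each of them holds the ball.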



The insight these numbers would give is how your team fares from winning possession to finding the back of the net. What share of possessions is won by turnovers in open play? How many chances are created from possessions starting with indirect free-kicks, and how many goals may be expected from those shots? It will be much easier to compare teams directly using the ‘possession spell’ approach and counting the outcome of these spells than it would be to compare teams on the basis of possession percentages, as is the current trend. After all, who cares whether an attack takes three seconds or three minutes to come to fruition? The end result of the possession spell is what counts.

The reverse analysis can immediately be applied to the defensive end of the game. How did the team deal with losing possession in open play? How often did it happen, and how many expected goals were conceded from it? Of course, turnovers in open play form a heterogeneous group of starting points of possession plays, but their spread can be informative too. Where on the pitch does your team suffer open play turnovers? Which turnovers are most harmful in terms of expected goals conceded from the following spell of opponent possession?
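
A sketch of that defensive breakdown could look as follows: each open play turnover is recorded with the part of the pitch where the ball was lost and the expected goals the opponent created in the spell that followed. The three-zone split and all values are invented for illustration.

```python
from collections import defaultdict


def turnover_cost_by_zone(turnovers):
    """Aggregate (zone, xg_conceded) records into per-zone totals and averages."""
    totals = defaultdict(lambda: {"turnovers": 0, "xg_conceded": 0.0})
    for zone, xg_conceded in turnovers:
        totals[zone]["turnovers"] += 1
        totals[zone]["xg_conceded"] += xg_conceded
    for row in totals.values():
        row["xg_per_turnover"] = row["xg_conceded"] / row["turnovers"]
    return dict(totals)


# Invented records: (zone where possession was lost,
#                    expected goals conceded in the opponent's next spell)
turnovers = [
    ("own_third", 0.25),
    ("own_third", 0.0),
    ("middle_third", 0.0),
    ("middle_third", 0.125),
    ("final_third", 0.0),
    ("own_third", 0.5),
]
costs = turnover_cost_by_zone(turnovers)

# Rank zones from most to least harmful per turnover.
ranking = sorted(costs, key=lambda zone: costs[zone]["xg_per_turnover"], reverse=True)
print(ranking)
print(costs)
```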

Note that the term ‘expected goals’ features a lot in this analysis, rather than the actual number of goals scored or conceded. This makes it possible to deal with the low scoring nature of soccer, by looking at the quality of the chances created rather than the goals scored. After all, you may get away with conceding a chance if your opponent blows his header, but in the long run you’ll pay the expected price.
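
The ‘long run’ argument can be illustrated with a small simulation. The sketch below assumes that every shot is an independent coin flip that goes in with a probability equal to its expected goals value, which is a simplification, and the shot values are invented. In a single match the goal count is a whole number that can land far from the expectation; averaged over many matches it settles close to the sum of the chance values.

```python
import random


def simulate_average_goals(shot_xg, n_matches, seed=0):
    """Average goals per match when each shot scores with probability equal to its xG."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_goals = 0
    for _ in range(n_matches):
        total_goals += sum(1 for p in shot_xg if rng.random() < p)
    return total_goals / n_matches


# Invented chance values for the shots conceded in one match.
shot_xg = [0.05, 0.10, 0.30, 0.08, 0.45, 0.12]
expected = sum(shot_xg)

one_match = simulate_average_goals(shot_xg, n_matches=1)
long_run = simulate_average_goals(shot_xg, n_matches=20000)

print("expected goals:", round(expected, 2))
print("goals in a single simulated match:", one_match)
print("average over 20000 simulated matches:", round(long_run, 2))
```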


The future

Analysis like this may not be far away. The Eredivisie recently saw the introduction of the second screen application ‘Sidekick’, which aims to serve fans’ (presumed) demand for insight through numbers. A breakdown into ‘possession plays’ and their outcomes may increase insight during live games, as it’s easy to digest at a glance, while it also offers enough depth for analysis afterwards.
