Analysis In Action: Part One – Self Collected Data

If I were to write an article explaining various metrics, answering “what is performance analysis?”, or covering other topics about what advanced analytics/analysis is, it wouldn’t get a lot of traction. People are obviously still interested in these topics: analytics is not 100% part-and-parcel of the game yet, and there is still a lot of negative reaction when advanced statistics, specialization in coaching, and increased levels of objectivity are sought out. However, throughout my time working in football, and since it became my full-time job 4 years ago, I have always tried to frame my work in a way that can be summarized in one word: actionable.

Ultimately, posting visualizations, videos, threads on tactical tweaks, etc. can get you picked out from the crowd and show people you have the technical skills to get a job. However, the day-to-day of actually working in the game and making that work usable is VERY different: deadlines are more meaningful, ad hoc requests come out of the blue, and you need to work within the framework (ideally) of the coaching staff’s game model. The goal of this mini-series is to help people understand some basic tools I use (as do those at a level above me!) to translate technical skills into working in the game.

To do this, words are somewhat unhelpful: *Describing* how to work is just an extension of the articles I’m trying to get away from. Alongside this mini-series, I will be posting example data dashboards, presentations, and more! Hopefully it’ll give you a flavour of what I do. I cannot share all my trade secrets, of course, but it should serve as an inspiration and understanding for what analysis is like in the real world. Part one will look at “self collected” metrics.

What Actually Matters: Self Collection Of Data

When watching a match, it’s very easy to become over-invested in clipping or coding EVERYTHING: essentially doing event data work to the level that suppliers and companies do. Don’t do this. Please. Anyway, if you’re reading this, you probably don’t have data access to this degree regardless! What you can do is find a workaround: find examples and collate them to home-brew your own data. I won’t pretend there are no disadvantages: much of what you consider to qualify (i.e. the definition that collection follows) may be hit or miss. For example, if you’re looking for moments where your player counterpressed following a misplaced pass without pressure, one occasion might fall in the gray area between yes and no. Do this a lot, and the data is skewed ever so slightly. However, this small subjectivity, in an environment where these minor errors cannot be avoided, is not the end of the world: “lesser” quality data that fits within your game model and your principles of play is not the same as “bad” quality event data from a supplier. Ultimately, your work is meant to fit the long-term framework that the coaching staff put in place. If it helps you achieve these aims, it’s worth it.

A small part of a code window within Sportscode: used to reinforce game model principles.

These self-collected, team- and individual-specific metrics can be as broad or as narrowly focused as you want them to be, and they are all based around your game model and how you want to play. For further context: a game model is essentially a blueprint of how a team wants to play. It covers what you do in possession (how do we want to build? do we want to be vertical?), what you do out of possession (high press? man-marking?), the positional roles within that, and the various sub-principles that make these things happen. If you want to read more on this subject, which is far too complex to get into here, read Rene Maric’s article here. (He’s much more qualified than me anyway.) Regardless of your model, the metrics you collect and disseminate need to support the goals of the staff, which the game model is an extension of. Some specific examples of this come from my time working in the college game in the USA. At UVA, out of possession, we defended in an aggressive 4-3-3: counterpress when we lost the ball, and compress the lines to deny the opposition space to play. Essentially, a defensive mentality akin to that of Liverpool. While we did have a data provider at that time, I live-coded all our games through Sportscode and supported the provider’s data with my own clips incorporating phases of play: traditional video analysis. While these clips were mostly used for post-match and pre-match analysis (as was training footage), I also tagged events after the fact, which were exported to CSV format and used as data. Providers were able to give us event-level data that suited the needs of all their clients, but self-collecting the things specific to us was important.

One example of such a metric I created was “counterpresses following an unforced error.” Obviously in the final third, risky passes are gonna happen: scoring goals is hard. However, we did want to limit the number of counterpresses and high intensity sprints we made which were unnecessary. To do this, I tagged every event where we gave the ball away cheaply outside the final third/around the box and then started a counterpress. I also cross-referenced this with our Catapult (GPS device) data to see who was ranking high in the high intensity sprints category. By doing so, I was able to pinpoint players who were counterpressing to make up for errors, alongside those who were doing less of this. This wasn’t used as a means to shame players for giving the ball away. It was a reference point (used in conjunction with the video) to try and be more efficient in our high intensity running/pressing, and to see how we could limit it to areas where it was more beneficial. This is just one example of a “homebrewed” metric that we used, and the possibilities are endless.
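To give a rough idea of how a metric like this could be put together from exported files, here is a minimal Python sketch. It is not the actual workflow described above: the column names (`player`, `trigger`, `zone`, `high_intensity_sprints`), the sample rows, and the function names are all made up for illustration, and real Sportscode or Catapult exports will look different. It simply counts, per player, the tagged counterpresses that followed an unforced error outside the final third, then lines those counts up against a sprint total.

```python
import csv
import io
from collections import Counter

# Hypothetical exports. Column names and values are illustrative assumptions,
# not the real Sportscode or Catapult export formats.
TAGS_CSV = """player,trigger,zone
Player A,unforced_error,middle_third
Player A,unforced_error,final_third
Player B,unforced_error,defensive_third
Player B,tackled,middle_third
Player A,unforced_error,defensive_third
Player C,unforced_error,middle_third
"""

GPS_CSV = """player,high_intensity_sprints
Player A,42
Player B,31
Player C,38
"""


def count_avoidable_counterpresses(tags_text):
    """Count, per player, counterpresses that followed an unforced
    error outside the final third (each row is one tagged counterpress)."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(tags_text)):
        if row["trigger"] == "unforced_error" and row["zone"] != "final_third":
            counts[row["player"]] += 1
    return counts


def merge_with_gps(counts, gps_text):
    """Put each player's counterpress count next to their sprint total."""
    rows = []
    for row in csv.DictReader(io.StringIO(gps_text)):
        rows.append({
            "player": row["player"],
            "avoidable_counterpresses": counts.get(row["player"], 0),
            "high_intensity_sprints": int(row["high_intensity_sprints"]),
        })
    # Players with the most avoidable counterpresses come first.
    return sorted(rows, key=lambda r: r["avoidable_counterpresses"], reverse=True)


counts = count_avoidable_counterpresses(TAGS_CSV)
summary = merge_with_gps(counts, GPS_CSV)
for line in summary:
    print(line)
```

The output is only a starting point for a conversation: each count still needs to be checked against the associated video before drawing any conclusions about a player.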

Self collected/team specific metrics presented simply and clearly.

As I mentioned previously, there is a bit of subjectivity to this: risk taking in possession obviously occurs in the first two thirds of the pitch too. Was it truly an “unforced error”? Was the intended target at fault? Were there other mitigating factors which might have caused the pass to happen? As well, not all counterpresses are created equal, and the length or intensity of the sequence is never the same. I could go on. However, as long as you understand the subjectivity and margin of error in this type of collection (and don’t take it as gospel), having a level of objectivity in your analysis gets you closer to discovering strengths and flaws and, in turn, the things which need to be tweaked over the course of the season, be it in regards to team success or individual player development. If you have a data provider, using your own metrics in conjunction with the provider’s stricter event definitions and simple accumulation statistics gives you a supreme level of detail in your work. If you don’t, you still have something to work from beyond the basic eye test of film.

There are various other ways, beyond basic data visualizations, to utilize your metrics and promote your game model. The main one is of course (and if you know me, this is not a surprise) video!

Sportscode workflow: Allowing for both video analysis and exportation of data outputs.

I’ve had the advantage throughout my time working in football of having access to high-end video analysis software. As a result, most of my data collection is done on this platform. All my tagged events come with an associated clip and timestamp, which makes the contextualization process simple and straightforward. When necessary, I had a database (more on databasing later in the series) of examples to pull and use for presentation purposes. Video is the best way to make things actionable, which is why I’ve stressed its importance at every occasion. Raw figures let you identify who is performing an action and how many times, but video (as most people don’t have access to high level freeze frame data, just a plug) is the best way to show how/why/where these things are happening. Increased contextualization allows you to see the whole picture, plan ways to coach it, and put in the work on the pitch. Essentially: game model planning!
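As a loose illustration of the “tag plus clip plus timestamp” idea, here is a small Python sketch using an in-memory SQLite table. Everything in it is an assumption made for the example: the table layout, the column names (`fixture`, `tag`, `player`, `start_s`, `end_s`), the sample rows, and the helper names `clips_for` and `as_timestamp`. A real setup would be built around whatever your video software exports.

```python
import sqlite3

# Hypothetical clip index: one row per tagged event, with the clip's
# start and end expressed in seconds from the start of the recording.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clips ("
    "fixture TEXT, tag TEXT, player TEXT, start_s INTEGER, end_s INTEGER)"
)
conn.executemany(
    "INSERT INTO clips VALUES (?, ?, ?, ?, ?)",
    [
        ("Match 1", "counterpress", "Player A", 754, 766),
        ("Match 1", "build_up", "Player B", 1310, 1332),
        ("Match 2", "counterpress", "Player A", 95, 104),
        ("Match 2", "counterpress", "Player C", 2890, 2901),
    ],
)


def as_timestamp(seconds):
    """Format a number of seconds as mm:ss for a clip list."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


def clips_for(tag, player=None):
    """Return (fixture, player, start_s, end_s) rows for a tag,
    optionally narrowed to a single player."""
    query = "SELECT fixture, player, start_s, end_s FROM clips WHERE tag = ?"
    params = [tag]
    if player is not None:
        query += " AND player = ?"
        params.append(player)
    query += " ORDER BY fixture, start_s"
    return conn.execute(query, params).fetchall()


# Build a simple clip list for a presentation on one player's counterpressing.
for fixture, player, start_s, end_s in clips_for("counterpress", "Player A"):
    print(f"{fixture}: {player} {as_timestamp(start_s)}-{as_timestamp(end_s)}")
```

Keeping the timestamp alongside every tag is the design choice that matters here: any number on a dashboard can then be traced back to the footage behind it.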

Coaches are more and more inclined towards data in this day and age, especially those closer to the beginning of their careers. There’s no denying that presenting them with data that is as game-realistic as possible is the holy grail of analysis. The same is true for players, who are obviously an important part of the analysis process. When planning video analysis sessions prior to matches or training over the course of the season, having positive and negative examples of your game model and how you want to play is a great way to get “buy in” and to encourage (or discourage) certain types of actions.

At younger and grassroots levels, another useful task is to compare and contrast your team with teams you are stylistically similar to. You might not have the luxury of being able to get top quality footage of every match or training session, but if you can do this using games which are on TV, it’s a valuable workaround. You can hand collect these metrics again (be it on software or with a simple timestamp using pen and paper), and show them to players and staff. The simple fact I’m trying to show here is that while access to top quality software might not always be a possibility, with a bit of creativity and some effort you can make objectivity in analysis possible and link it to subjective/alternative means of displaying it.

The Final Part – Sample Data Dashboards

I hope you enjoyed part one of this multi-part series about putting performance analysis/analytics into practice. I’m not sure exactly how many parts this will have, but hopefully you stay tuned for the rest of them! Some areas will be a bit more dry (such as database management), but they are no less important and will certainly help you apply these sorts of techniques yourself.

As I mentioned at the beginning, alongside the articles, I will be sharing some basic data dashboards which incorporate some basic metrics and interactive visualizations. They are not meant to be as artistic as the ones you find from the data viz pros; they are simply a means to display key metrics (both event-level and self-collected/team style ones) that I would make available to the players and staff I work with. All data has been randomized and players’ names have been changed for obvious reasons, but these are all things I’ve utilized in the past. As ever, this data is used in conjunction with the video analysis techniques and training/coaching plans throughout the week: the actionable side of the job. As the series goes on and I touch on more examples pulled from my experiences, the dashboards will be added to and more of the curtain will be pulled back!

