Festive readings
Blogs blogs blogs - on analytics, data science, and genAI

Howdy. Because blogs are back, I’ve been reading some blogs.
As a quick festive gift, here are some nice links:
Football:
‘Uncovering runs: Jack Grealish in 2023 Champions League final’ - Henrik Schjøth
One of the great types of blog: pretty short, packed with ideas. One of the main ideas is looking at the variety in direction of players’ runs (there’s a neat comparison between Erling Haaland and Romelu Lukaku, for example). There are also a couple of neat visualisations showing the attacking runs during a spell of possession, highlighting two City players making runs into the box - I can imagine this being a springboard for an analysis on ‘box runs during crossing situations’.
‘Football data: finally under scrutiny?’ - Arnaud Santin
I’m a sucker for this kind of blog, a commentary on different types of data and their history. More importantly, I’m not sure how easy it is for users of football data to appreciate the differences between providers (and between the provision that different competitions get, from the same provider).
I’m not a “there is a single ‘truth’” kind of guy (as I’ve written about before: “If categories can't be 'right', then definitions can't either, only sensible.”). But, obviously, some interpretations are more reasonable than others. Santin, who is co-founder of a company called SportsDynamics, writes of an example:
One of Europe’s major broadcasters highlighted Arsenal’s 24 breaking line passes after 30 minutes of their game against Monaco, while my feeling watching live was that Monaco was actually defending well (they ended up making two huge individual mistakes). The next morning, I still couldn’t understand the discrepancy between the data provided live and my own vision of this game and decided to check our own data for the first 30 minutes, based the official and accurate optical tracking data.
We had 7 breaking line passes only for Arsenal.
OK, not a ‘blog’, but if this headline doesn’t grab you, I promise you it’s (at least) five times more interesting than you think. Although this piece isn’t directly linked to ‘football analytics’, the ‘what assumptions are made about fans’ questions it deals with have clear parallels in the types of things analytics nerds think about. The piece is also great for giving a subject that can get flattened into primary-colour analysis the consideration from multiple angles that it merits.
Data nerds may also enjoy a related conversation in this episode of the Expected Goals podcast (about halfway through) about the data around whether someone has bought a “women’s team” shirt or a “men’s team” shirt.
Not football:
A set of thoughts that resonate with me, both from when I was first learning to code and when I’m learning something technical that is new. There’s probably a lot of ‘what is teaching/what is learning’ commentary that you could do relating to this as well.
‘Write code with your Alphabet Radio on’ - Vicki Boykis
A tl;dr - experienced programmers can have an advantage over LLM-code tools because we’ve worked with good programmers. Though, as Boykis references, data scientists and programmers don’t always have teams around them (which particularly applies to sports).
‘LLMs, reliability & the scientific process’ - Martina Pugliese
Interesting, readable blog on a subject I think a lot of people will be interested in. For instance, I like this nugget: “You usually test their [LLMs] behaviour on a bunch of data. […] You make this dataset into an eval set which you run periodically to check for robustness. But the fact that results may look good doesn’t tell you how the LLM will behave at scale.” The last couple of years have seen best practices start to emerge for LLM-based tools (like code editors); will data science testing best practices follow?
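For anyone who hasn't met the idea of an 'eval set' before, here's a rough sketch of what one can look like in Python. Everything in it is an assumption for illustration, not something from Pugliese's post: the three example cases, the `fake_llm` stand-in (a real setup would call an actual model), and the crude 'does the output contain the expected text' check.

```python
from typing import Callable

# Hypothetical eval set: (prompt, text we expect to see in the answer).
EVAL_SET = [
    ("What is 2 + 2?", "4"),
    ("Capital of France?", "Paris"),
    ("Opposite of hot?", "cold"),
]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption): returns canned answers."""
    canned = {
        "What is 2 + 2?": "The answer is 4.",
        "Capital of France?": "Paris",
    }
    return canned.get(prompt, "I'm not sure.")


def run_eval(model: Callable[[str], str], eval_set: list[tuple[str, str]]) -> float:
    """Return the fraction of cases whose output contains the expected text."""
    passed = sum(
        expected.lower() in model(prompt).lower()
        for prompt, expected in eval_set
    )
    return passed / len(eval_set)


# Run this periodically (e.g. after a prompt or model change) and track the score.
pass_rate = run_eval(fake_llm, EVAL_SET)
print(f"pass rate: {pass_rate:.2f}")
```

The quote's caveat still applies: a good score on a small set like this says little about behaviour at scale.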
If this is a festive time for you, have a happy festive period; if not, have a good time anyway.