I love stats that tell a story, but I hate it when people claim a real story is wrong because the stats say it should be. I saw a tweet from a Spurs fan, who I won’t name, that said his team have been unlucky this season. His reasoning was this: “Spurs are top for many metrics used to measure best league team. They should be top, but aren’t.” He went on to say that Leicester had been lucky, and should in fact be somewhere between fourth and sixth in the table, and the reason they weren’t is that “luck plays a part.”
Well I’ve got news for him: Leicester finished top because they’ve won more games than anyone else, Spurs included, and over a 38-game season (36 at time of the tweet, but the League was won by then) that’s not luck, it’s playing better than everyone else. Let me rephrase that: it’s playing more effectively than everyone else, as ‘better’ is sometimes used to denote style.
Mr Spurs-Fan didn’t elaborate on the stats that showed his club should be top, but a popular one these days is ‘expected goals’, or xG. Basically xG is supposed to measure how likely it is that a particular goal attempt will result in a goal, based on a number of criteria. At its simplest the only criterion is where the shot was taken from, but some models include how the ball was delivered to the person shooting, whether the goal attempt was by foot or head and which part of the goal was aimed for. Some models only include shots on target while others include all shots. Apparently none of the models in the public domain includes the position of defenders and keeper, but there may be analytics nerds working on this. Arsène Wenger is apparently keen on xG, and was quoted (by Rory Smith in The Times) as using it to justify playing Aaron Ramsey in central midfield rather than wider: “If you look at his Expected Goals when he is in a central position, it is amongst the best in the Premier League.” Okay, Arsène. If you say so. But I hope you are basing team selection on slightly more than that.
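For anyone who wants to see the simplest kind of model written down, here is a minimal sketch in Python. It assumes the chance of scoring depends only on distance from goal, through a logistic curve. The function names and both coefficients are invented for illustration; they are not fitted to any data and are not taken from any published xG model.

```python
import math

# Hypothetical coefficients, chosen only for illustration (not fitted to data).
INTERCEPT = 0.5
DISTANCE_SLOPE = -0.15  # change in log-odds per yard from goal


def shot_xg(distance_yards: float) -> float:
    """Toy xG: estimated probability that a shot from this distance is scored."""
    z = INTERCEPT + DISTANCE_SLOPE * distance_yards
    return 1.0 / (1.0 + math.exp(-z))


def match_xg(shot_distances) -> float:
    """A team's xG for a match is the sum of its individual shot probabilities."""
    return sum(shot_xg(d) for d in shot_distances)


if __name__ == "__main__":
    for yards in (6, 12, 25):
        print(f"shot from {yards} yards: xG = {shot_xg(yards):.2f}")
    print(f"match xG for those three shots: {match_xg([6, 12, 25]):.2f}")
```

Note what the sketch cannot see: two shots from the same spot get the same number whether the shooter is unmarked or has three defenders in front of him.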
Richard Whittall recently blogged to explain xG, using Arsenal’s 1-1 draw with Crystal Palace as an example. It’s a good read. According to most models, said Richard, from the shots both teams took Arsenal ‘should have’ won this match 2-1; that’s what you’d expect to see on average from the particular shots taken. But Arsenal didn’t win, so why not? Obviously there are a lot of variables – everything from how good the players are to whether the sun was blinding the keeper at a particular moment, or whether the wind got up in the second half, to the whims of the ref and linesmen, to whether Olivier Giroud stubbed his toe in the dressing room just before he put his boots on and didn’t have time to go to the toilet before he came out. The number of potential variables is massive, and with an expected difference of just one goal any variable could affect the match outcome.
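To show how much room an ‘expected’ 2-1 leaves for other results, here is a rough sketch that turns per-shot probabilities into the chance of a win, draw or defeat. The two shot lists are hypothetical, chosen only so that they add up to 2.0 and 1.0; they are not the real shots from the Arsenal v Palace match. The sketch also assumes every shot is independent of every other, which real football does not guarantee.

```python
def goal_distribution(shot_probs):
    """Exact distribution of goals scored, treating each shot as independent."""
    dist = [1.0]  # dist[k] = probability of exactly k goals so far
    for p in shot_probs:
        updated = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            updated[goals] += prob * (1.0 - p)  # this shot is not scored
            updated[goals + 1] += prob * p      # this shot is scored
        dist = updated
    return dist


def outcome_probs(home_shots, away_shots):
    """Return (home win, draw, away win) probabilities."""
    home = goal_distribution(home_shots)
    away = goal_distribution(away_shots)
    win = draw = loss = 0.0
    for h, ph in enumerate(home):
        for a, pa in enumerate(away):
            if h > a:
                win += ph * pa
            elif h == a:
                draw += ph * pa
            else:
                loss += ph * pa
    return win, draw, loss


# Hypothetical per-shot xG values: home sums to 2.0, away sums to 1.0.
HOME_SHOTS = [0.35, 0.25, 0.2, 0.2, 0.15, 0.15,
              0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.05, 0.05]
AWAY_SHOTS = [0.3, 0.2, 0.15, 0.1, 0.1, 0.05, 0.05, 0.05]

if __name__ == "__main__":
    win, draw, loss = outcome_probs(HOME_SHOTS, AWAY_SHOTS)
    print(f"home win {win:.2f}, draw {draw:.2f}, away win {loss:.2f}")
```

Even on the model’s own terms, the draw and the away win keep a real share of the probability, so a 1-1 needs no special explanation.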
But in general terms, and using this game as an example, I’d say that a major part of the reason Arsenal didn’t score as many as ‘expected’ is down to their tactics. Arsenal mostly like to play the ball around slowly, keeping possession until they create an opening that in their collective opinion is worth having a shot from. This can take a while, which gives the defending team plenty of time to get back into position. Arsenal fans will be very familiar with this. The outcome – not every time, but enough to skew the averages – is that when Arsenal do shoot it’s often from relatively close range but under pressure from defenders and with the keeper well-positioned to make a save if needed. If you looked at the simplest xG model – where a shot is from – you could conclude Arsenal don’t score as often as they should from those shots. You might think Arsenal were therefore ‘unlucky’, or you might think the strikers were rubbish, and neither of these might be true.
I imagine xG is one of the stats that the Spurs fan I mentioned is using to justify his belief that Spurs are unlucky compared to Leicester. His twitter profile makes it clear he’s a gambler, so I hope he lost money by continuing to back Spurs for the title based on a belief that Leicester’s luck would run out.
As it happened someone then tweeted a graphic of the goals Leicester have conceded this season (up to 3 May) onto my timeline. This originated from Ted Knutson, aka @mixedknuts. I don’t know Mr Knutson, but I imagine from his twitter handle he has some sense of humour. Unfortunately this didn’t extend to indulging me by explaining why he believes Leicester are lucky based on this graphic. I waited politely, but in vain, for any further explanation, so I’ll have to try and work out his thoughts for myself. From the graphic below we can see that Ted’s model ‘expects’ Leicester to concede an average of 46.93 goals from the 489 goal attempts made against them, but in fact they conceded only 31. This is broken down further into shots from set-pieces, open play and crosses, throughballs, dribbles and ‘Other’.
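The headline numbers can be put side by side with a few lines of Python. This is a rough sketch using only the three figures quoted from the graphic. It assumes, purely for illustration, that goals conceded follow a Poisson distribution with the model’s expected value as its mean; that is my simplification, not something the graphic or Ted’s model states.

```python
import math

SHOTS_FACED = 489        # goal attempts against Leicester, from the graphic
EXPECTED_GOALS = 46.93   # goals the model expected them to concede
ACTUAL_GOALS = 31        # goals they actually conceded


def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k) for a Poisson variable, summed term by term."""
    term = math.exp(-mean)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mean / i    # P(X = i) from P(X = i - 1)
        total += term
    return total


if __name__ == "__main__":
    print(f"model's average chance per shot: {EXPECTED_GOALS / SHOTS_FACED:.3f}")
    print(f"actual goals per shot:           {ACTUAL_GOALS / SHOTS_FACED:.3f}")
    tail = poisson_cdf(ACTUAL_GOALS, EXPECTED_GOALS)
    print(f"P(conceding {ACTUAL_GOALS} or fewer if the model were right): {tail:.4f}")
```

A small tail probability can be read two ways: as a lot of luck, or as a sign that the model is leaving out something the team does deliberately.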
The biggest variance from the figure Ted’s model expects is ‘Other’, but I don’t have a breakdown of what Other might refer to, so it’s difficult to say too much more there. Presumably it includes own goals and penalties, unless penalties are part of set-pieces. Leicester actually had no own goals for or against in the Premier League this season, so ‘Other’ remains a mystery. But I’ll delve into the remaining figures provided.
Set pieces – expected to concede 14.84; actually conceded 9. Why might this be? Perhaps the Leicester defenders are very good at marking. Perhaps Kasper Schmeichel is well above average at positioning himself, especially when given time. Maybe Robert Huth kidney punches or pulls the hair of each opposing team’s most dangerous player at every set piece. There could be a dozen other reasons. Are any of these down to luck? No. Cheating in some cases, but not luck.
From open play/crosses and throughballs the variance is very slight, and the numbers are so small that trying to draw a conclusion is pointless. You certainly would not be able to state that Leicester were lucky based on these.
Similarly from dribbles, they conceded just once when they were expected by the model to concede 2.24 times. Maybe you could conclude their defenders are good at tackling dribblers, but the numbers are again too small to be meaningful – a single different outcome to one dribble in the whole season would have changed the ratio completely. You could also conclude in each of the above cases that perhaps Kasper Schmeichel has just had a brilliant season, which again is hardly lucky.
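To put a rough number on ‘too small to be meaningful’, here is a quick sketch. It assumes goals from dribbles follow a Poisson distribution with mean 2.24, which is my own simplification and not part of the graphic.

```python
import math

EXPECTED_FROM_DRIBBLES = 2.24  # from the graphic
ACTUAL_FROM_DRIBBLES = 1       # from the graphic


def prob_at_most_one(mean: float) -> float:
    """P(X <= 1) for a Poisson variable: P(0) + P(1) = e^-mean * (1 + mean)."""
    return math.exp(-mean) * (1.0 + mean)


if __name__ == "__main__":
    p = prob_at_most_one(EXPECTED_FROM_DRIBBLES)
    print(f"chance of conceding 0 or 1 even if the model is spot on: {p:.2f}")
    # One extra goal moves the actual-to-expected ratio a long way.
    for goals in (ACTUAL_FROM_DRIBBLES, ACTUAL_FROM_DRIBBLES + 1):
        ratio = goals / EXPECTED_FROM_DRIBBLES
        print(f"{goals} conceded -> {ratio:.2f} of expected")
```

On that assumption, conceding one or none happens in roughly a third of seasons, so this line of the graphic tells you almost nothing either way.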
Another telling stat on the graphic is blocked shots. If I’m reading it correctly Leicester blocked 33.1 per cent of shots attempted against them, compared to a league average of 28 per cent. Is this luck? If you never watch Leicester play, you might conclude that, but if you have watched them you’ll know that they make a lot of effort to block shots. Some teams don’t. Some defenders turn their backs on shots, but not Leicester; they jump in and, as the old saying goes, put their bodies on the line. Is that luck? No, it’s determination, will to win, being prepared to make a sacrifice for your team. If they keep playing like that they’ll keep blocking more shots than average. No luck involved whatsoever.
So I see no evidence of luck in Leicester’s defending at all. As regards their attacking, if xG predicts they’ll score fewer goals than they have, is that luck or is it down to breaking quickly and pulling defenders out of position so their own attackers are under less pressure when shooting? Is it luck, or is it playing to their strengths and getting Vardy in behind defences with accurate long balls? Is it luck or is it just being well-drilled at attacking set-pieces? If someone wants to show me where Leicester have definitely been lucky I might be convinced, but not so far.
What we need to bear in mind is that another factor preventing football stats models from being anywhere near perfect is that the number of goals in football is very small compared to goals or points in almost every other game. A single random act can win a football match. That’s possible in rugby, though very, very rare. Theoretically perhaps also in hockey. It’s not possible at all in cricket, tennis, American Football, squash, baseball, basketball, badminton – almost every other team or racket sport I can think of, not to mention the likes of golf, snooker, pool and darts.
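Here is a rough sketch of why low scoring matters. It compares two made-up contests in which the favourite is expected to outscore the underdog by the same three-to-two ratio: one with football-sized totals, one with totals fifty times larger. Scores are assumed to follow independent Poisson distributions, and all the means are hypothetical; no real sport or team is being modelled.

```python
import math


def poisson_pmf_list(mean: float, max_score: int):
    """Poisson probabilities P(X = 0), ..., P(X = max_score)."""
    term = math.exp(-mean)
    probs = [term]
    for k in range(1, max_score + 1):
        term *= mean / k
        probs.append(term)
    return probs


def underdog_avoids_defeat(favourite_mean: float, underdog_mean: float,
                           max_score: int) -> float:
    """P(underdog's score >= favourite's score), scores independent Poisson."""
    fav = poisson_pmf_list(favourite_mean, max_score)
    dog = poisson_pmf_list(underdog_mean, max_score)
    total = 0.0
    fav_at_most = 0.0  # running P(favourite scores <= current score)
    for score in range(max_score + 1):
        fav_at_most += fav[score]
        total += dog[score] * fav_at_most
    return total


if __name__ == "__main__":
    low = underdog_avoids_defeat(1.5, 1.0, 20)       # football-sized scores
    high = underdog_avoids_defeat(75.0, 50.0, 300)   # same ratio, many more scores
    print(f"low-scoring game:  underdog wins or draws with probability {low:.2f}")
    print(f"high-scoring game: underdog wins or draws with probability {high:.3f}")
```

With so few goals to go round, the weaker side gets a result far more often than it would if the same gap in quality were spread over dozens of scores.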
It’s also hard, when you’re counting football stats, to draw a clear dividing line between signal and noise. So much happens accidentally: the ball deflects and ricochets in unexpected directions and with unexpected consequences. Corners go straight into the goal, poor shots are deflected in by defenders when they would have been saved, the ref gets in the way, bad shots become good crosses, assists and even goals can often come from mistakes… sometimes you get a defender under little pressure doing something horrendous, like Lee Dixon lobbing his own keeper from 35 yards. Merely counting all goals as equal and all assists as equal irons out differences, removes context and hopes the averages will take care of everything. If you want to believe that always works, keep putting your money on ‘unlucky’ Spurs for the title.
So what use are stats? There’s a place for them in assessing players and games, but coaches who rely on them without taking other factors into account are going to get caught out (Arsène with Ramsey’s expected goals, perhaps?). A more realistic use so far is in building betting models. In some cases you do get enough information to see patterns that mean you can work the odds slightly, or have a better idea when it’s worth putting your money on. But you can be sure the bookies are equally keen to stay ahead of the game, or at least ahead of the 99.9 per cent of mug punters like the Spurs fan who thinks Leicester should be between fourth and sixth. Be lucky, mate.