In this post I'm going to take that a little further and look at some of the players who performed to model and those who performed above or below their straight-up statistical performances, and try to fathom the difference. In a later post I will also follow up on an excellent suggestion by Lucien Gan* and see which periods of the season yield the best results for the model.
* - Lucien's blog "Transfer History Analysis" is a great read for FPL managers
Overall Correlation with FSCORE and Points
The graph below shows the XY plot for all the Premier League players this season. The correlation value (R) is 0.952 and R-squared is 0.907.
If defenders are removed from the sample this relationship improves to R = 0.96 and R-squared = 0.92, which is as I expected. Modelling the probability of a team/player securing a clean sheet needs a few more years of head-scratching that I've not yet undertaken :)
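For anyone who wants to run this sort of check on their own numbers, here is a minimal sketch of how R and R-squared can be computed, with and without defenders. The rows in `players` are made-up placeholder values, not real player data, and the function names are my own for illustration; it is not the exact calculation behind the figures above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def r_and_r_squared(rows):
    """Return (R, R-squared) for rows of (position, fscore, fpl_points)."""
    fscores = [fscore for _, fscore, _ in rows]
    points = [pts for _, _, pts in rows]
    r = pearson_r(fscores, points)
    return r, r * r


# Placeholder (position, FSCORE, FPL points) rows. Illustrative only.
players = [
    ("FWD", 160, 170),
    ("MID", 120, 115),
    ("DEF", 90, 60),
    ("MID", 60, 70),
    ("DEF", 40, 75),
    ("FWD", 30, 25),
]

r_all, r2_all = r_and_r_squared(players)

# Same calculation with defenders filtered out of the sample.
no_defenders = [row for row in players if row[0] != "DEF"]
r_nodef, r2_nodef = r_and_r_squared(no_defenders)

print(f"All players:       R={r_all:.3f}  R-squared={r2_all:.3f}")
print(f"Without defenders: R={r_nodef:.3f}  R-squared={r2_nodef:.3f}")
```

R-squared here is simply R multiplied by itself, which is why the two quoted figures always move together.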
All in all, this demonstrates that there is an extremely strong positive relationship between a player's underlying statistical performance (shots, SoT, etc.) and the number of points he'll score in FPL. Remember that FSCORE does not take actual goals, assists or points scored into account. FPL points and FSCORE are distinct.
We pretty much knew this already but I am impressed with the strength of it. Early last season I looked for and found strong correlations between FPL points and individual attacking stats, particularly shots on target, which of course is what led me to aggregate the individual metrics into FSCORE (and give this blog its name!)
I Walk The Line
That said, and as can be seen in the graph above, there are certainly players who comfortably outperform the model (above the line) and those who under-perform (below the line).
Below I've highlighted a group of players who interest me the most. They are all starting to register decent points, so will be of relevance to FPL managers, but most have a pretty big and unfavourable spread against FSCORE.
It's the players who are furthest from the line that we're interested in and these are:
Player | Pos | Team | FSCORE | FPL PTS | Model Variance |
Mata | MID | CHE | 160 | 212 | +25% |
Hazard (Eden) | MID | CHE | 154 | 190 | +19% |
Tevez | FWD | MCI | 155 | 169 | +8% |
Walters | MID | STO | 159 | 141 | -12% |
Koné | FWD | WIG | 159 | 140 | -14% |
Vertonghen | DEF | TOT | 160 | 136 | -18% |
Agger | DEF | LIV | 162 | 134 | -21% |
Defoe | FWD | TOT | 156 | 124 | -26% |
Osman | MID | EVE | 154 | 122 | -26% |
Huth | DEF | STO | 157 | 112 | -40% |
Cissé | FWD | NEW | 159 | 113 | -40% |
Jelavic | FWD | EVE | 158 | 99 | -59% |
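As a rough check on how the Model Variance column can be read, the figures in the table are consistent with taking the gap between FPL points and FSCORE as a share of FPL points. That formula is inferred from the table rather than stated anywhere, so treat the sketch below as an assumption; `model_variance` is a name made up for illustration. A few rows (Walters, Cissé, Jelavic) come out a point away from the table, presumably because the FSCORE values shown are rounded.

```python
def model_variance(fscore, fpl_points):
    """Signed gap between FPL points and FSCORE as a fraction of FPL points.

    Assumed formula (inferred from the table): (points - FSCORE) / points.
    Positive means the player scored more than the model suggests.
    """
    if fpl_points == 0:
        raise ValueError("FPL points must be non-zero")
    return (fpl_points - fscore) / fpl_points


# (player, FSCORE, FPL points) taken from a few rows of the table above.
rows = [
    ("Mata", 160, 212),
    ("Tevez", 155, 169),
    ("Agger", 162, 134),
    ("Huth", 157, 112),
]

for name, fscore, pts in rows:
    print(f"{name}: {model_variance(fscore, pts):+.0%}")
```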
The first thing to strike me is that the two players standing head and shoulders above the others are both from Chelsea, a top team, and the next best is Tevez, who's also from a top team.
I could get down and dirty in the numbers but will instead make the small leap-of-faith assumption that we all make already: players in better teams (note: not better players) get higher-quality chances and therefore score more goals, assists and FPL points.
Shot Zones and Goal States
Why do players in better teams score more, then? I have been following recent posts by 11tegen11 about Shot Zones and Goal States and I think these are the data-driven answer to the variance in the model.
Shot Zone is where on the pitch shots are taken from. This is actually built into FSCORE already, which distinguishes between shots taken inside and outside the penalty box, but more can be done. Goal State is the current scoreline in a game, whether it's an edgy 0-0 or a team rampaging at 5-0, and a team's likelihood of converting chances changes significantly with the state of the game.
Over the summer I'll be developing FSCORE and the Point Projections to incorporate both these factors. I think they will improve the reliability and accuracy considerably and am dead excited about it :)
11tegen11 - Analysis of Dutch Football