Friday, 9 November 2012

Gameweek 11 Predicted Scores [by SuperGrover]

Intro by @shots_on_target:
Since starting up this site a couple of months ago, I have engaged with quite a few people with an interest in the growing field of stats and football. I'd like to introduce the first in a series of weekly articles by a very welcome guest writer on this site, SuperGrover. SuperGrover has a couple of decades of involvement with the sabermetrics movement in baseball and the NFL, so he knows his stuff.

On our fledgling forum 'FPL Analytical' we have been predicting the scores ahead of each gameweek for the last few weeks. As he will explain below, SuperGrover has built a model to do this. Please note that the data and articles normally on this site are derived from my own model, but I am working very closely with SuperGrover and others to incorporate the best elements from each other's work to give you the best fantasy football forecasts.

Please add your views or any constructive criticism in the comments section, or check out the forum.

Gameweek 11 Predicted Scores [by SuperGrover]

Football statistical modelling is in its infancy.  While no doubt major clubs have squads of statisticians with proprietary algorithms defining player and team value, the general population has been left to look at goal records and the league table to determine the quality of player and team alike.  That all changed with OptaStats.  Now, everyone can see the story behind the game.  We can look at the activities throughout the pitch and begin to ascertain which of these lead to goals.  Further, we can begin to see which activities indicate innate ability and which are simply the luck of the draw.  By combining these we can come up with forecasting models for both team and player, a Holy Grail for fantasy football managers.

While @shots_on_target has primarily focused on player evaluation, I have spent numerous hours over the past months hypothesizing on team value. I have worked to understand the underlying activities that drive goals scored and allowed, which has let me construct team value models that I believe are far superior to the league tables. While still a work in progress, I am confident enough in these models to share them with you.

It has been known for some time that shots, and more specifically shots on target, are great predictors of goals. On average, teams score from about a third of their shots on target. More importantly, teams that exceed or fall short of that rate one year tend to regress towards the mean the following year (see James Grayson’s excellent blog for more discussion). As a result, shots on target are a better indicator of team performance than goals scored, mainly because of sample size: there are about three shots on target for every goal scored, so the larger count is less noisy. This helps us identify teams that may be underrated or overrated based upon goals alone, especially early in the season when sample sizes are low.
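To make the idea concrete, here is a minimal Python sketch of how the one-third rule of thumb could be used to flag teams whose goal tally is running ahead of or behind their shots on target. The function names and the example team are made up for illustration; only the roughly one-in-three conversion rate comes from the discussion above, and this is not the model used for the predictions in this article.

```python
# Rule of thumb from the text: roughly one in three shots on target is scored.
LEAGUE_CONVERSION = 1 / 3


def expected_goals(shots_on_target, conversion=LEAGUE_CONVERSION):
    """Goals a team would have at the league-average conversion rate."""
    return shots_on_target * conversion


def finishing_gap(goals, shots_on_target, conversion=LEAGUE_CONVERSION):
    """Actual goals minus expected goals.

    A positive gap means the team has converted above the average rate so
    far, a negative gap means below. If conversion regresses towards the
    mean, large gaps in either direction should tend to shrink.
    """
    return goals - expected_goals(shots_on_target, conversion)


# Hypothetical team (not real data): 12 goals from 60 shots on target.
example_expected = expected_goals(60)
example_gap = finishing_gap(12, 60)
```

A team like the hypothetical one above, with a clearly negative gap, would be a candidate for "underrated on goals alone", assuming its shot volume holds up.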

But are shots on target enough? They are certainly a start, and a much better forecaster than plain old shots or goals scored. Yet, to me, they seemed… wanting. (ed: I agree, more is needed! SoT) I looked for some other factor that might do a better job of explaining things.

What I have found is that shots on target in combination with “Big Chances” (BC) correlate more strongly with goals than shots on target alone. BCs are defined by OptaStats as follows:

Big Chance: A situation where a player should reasonably be expected to score, usually in a one-on-one scenario or from very close range.

In my mind, BCs represent shots on target on steroids. Adding BCs as an additional factor results in the following improvements in goals scored projections over the past three seasons:

2010 – 1.75%
2011 – 2.93%
2012 – 1.02%

While the improvement is not substantial, it is consistent enough for me to have some confidence in the model.  For goals allowed, the improvement is even starker, although the data set available currently only goes back a single season:

2011 – 7.52%
2012 – 25.37%

I have theories as to why the improvement is much greater for goals allowed, but I will save that for another time.
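For readers who want to try a comparison like this themselves, here is a rough Python sketch. It is not the model behind the figures above: the article does not spell out how the improvement percentages were calculated, so this sketch assumes a simple Pearson correlation with goals, a combined indicator of shots on target plus weighted Big Chances, and a hypothetical `bc_weight` parameter. The team totals are invented purely to show the mechanics.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def combined_indicator(shots_on_target, big_chances, bc_weight=1.0):
    """Shots on target plus weighted Big Chances, one value per team.

    bc_weight is a hypothetical tuning parameter, not a value from the article.
    """
    return [s + bc_weight * b for s, b in zip(shots_on_target, big_chances)]


# Invented season totals for six teams (NOT real Opta data).
sot = [210, 185, 160, 150, 140, 120]
bc = [70, 45, 50, 30, 35, 20]
goals = [75, 58, 55, 44, 45, 34]

r_sot = pearson(sot, goals)
r_combined = pearson(combined_indicator(sot, bc), goals)

# Relative change in correlation from adding Big Chances (assumed metric).
relative_change = (r_combined - r_sot) / r_sot
```

With real data, the same comparison could be run season by season, for goals scored and for goals allowed, to see whether the combined indicator holds up consistently.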

Beginning next week, I will post my team ratings and goal forecasts using this model with commentary on weekly performances.  For now, I will post the projected scoreboard for this week’s games.  I encourage you to check out our gameweek predictions forum for additional projections from other members.

So there you have it, the nuts and bolts of my current model.  I am still tweaking things and looking for further improvements, but I do feel it is a good start.  Comments and criticisms are very welcome.
