6. Gameweek 1 Preview: Final Machine Learning Model and Picking a Team to Start the Season

So here we are, at the start of a new season. After some experimentation and feature selection, I've landed on the model that is being used to pick the GW 1 team. The image below shows, by position, the number of returns each player with a GW 1 fixture is expected to get in the first 4 gameweeks.


Since Part 5, there have been a few changes to the features of the model, and these changes are reflected in the team that has been selected.

  • Clean sheets are now predicted by a model whose data points are teams instead of players. This has made the model more accurate, reducing both the training and the test error.
  • The features used for predicting attacking returns are short term & long term goals and assists, short term & long term xGI, and short term opponent xGC. Long term xGC and the number of goals conceded by opponents did not correlate well with attacking returns.
  • The features used for predicting defensive returns are short term and long term xGC, long term clean sheets, and short & long term goals scored by the opposition. xG of the opposition and goals conceded by the team did not correlate well with defensive returns.
  • We continue to look back at the last 4 gameweeks as short term form for both attacking and defensive returns. The long term window that precedes those 4 gameweeks covers 16 gameweeks for attackers and 14 gameweeks for defenders.
  • Promoted teams now have historic data assigned to them. Currently, this is a mix of their performance in the Championship and the average performance of the teams that were relegated from the Premier League last season. For this purpose, I have matched Leeds, West Brom and Fulham with Bournemouth, Watford and Norwich respectively.
  • Similarly, new players' statistics are a mix of their statistics from their previous team (where stats are available) and players in a similar position who played for the same team last year. For example, Timo Werner's stats are a mix of his Bundesliga performance and the underlying statistics of Abraham and Giroud.
  • A captaincy model has also been run with a prediction window of 1 gameweek instead of 4, which had Michail Antonio narrowly edging out Mo Salah as the predicted captain.
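
To make the short term and long term windows above concrete, here is a minimal Python sketch. It assumes a player's history is a plain list of per-gameweek values (oldest first) and that each window is summarised by its mean; the function name `form_features`, the averaging choice and the example numbers are illustrative assumptions, not the exact code behind the model.

```python
def _mean(values):
    """Average of a list, or 0.0 when there is no history to average."""
    return sum(values) / len(values) if values else 0.0


def form_features(per_gw, short=4, long=16):
    """Split a per-gameweek history (oldest first) into two windows.

    short_term: mean over the most recent `short` gameweeks.
    long_term:  mean over the `long` gameweeks immediately before those.
    Assumes short > 0. Use long=16 for attackers and long=14 for defenders.
    """
    recent = per_gw[-short:]
    earlier = per_gw[:-short][-long:]
    return {"short_term": _mean(recent), "long_term": _mean(earlier)}


# Hypothetical xGI history: 16 quiet gameweeks followed by 4 strong ones.
xgi_history = [0.2] * 16 + [0.9, 1.1, 0.7, 1.3]
attacker_features = form_features(xgi_history, short=4, long=16)
```

The same helper would be applied once per statistic (goals and assists, xGI, xGC and so on) to build the feature row for a player or team.
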
Some comments on the model and its results as a human observer:
  • Since premium defenders are likely to get clean sheets as well as attacking returns, the model seems to favour them over mid-priced midfielders. In reality, however, FPL managers tend to pick a healthy mix of the two categories in order to have a higher points ceiling.
  • The model loves Michail Antonio, which concerns me. Perhaps it is his extraordinary underlying stats in the post-lockdown period, but hopefully that is more of a strength than a weakness.
  • Willian is so highly predicted because he was on set pieces and penalties at Chelsea, which may not be the case at Arsenal.
  • I did not pick any Man United or Man City players, even though some featured in the predicted returns for the first 4 gameweeks, as they do not have a fixture in Gameweek 1.
  • The model predicts the number of returns and not points, so it isn't a completely accurate prediction of how many points the team will get. I chose returns as the label in order to compare across positions in a reasonable manner and to give the model a good chance of getting it right.
  • It's still quite difficult to maximise the number of returns in the team for Gameweek 1 (a kind of knapsack problem). Hopefully it will get easier while making transfers because there will only be 1 or 2 moves to make. 
  • Finally, I picked the team from the above spreadsheet by trying to maximise the number of returns in the team. I did this manually and strictly followed the model. The rule of thumb I'm following is to pick the team the model suggests while removing players who are not expected to play. When a player is even slightly likely to play, I will pick the player that the model has chosen.
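
The manual selection described in the last two points could in principle be automated. Below is a minimal Python sketch of the knapsack-style search: it exhaustively tries every combination that fills the position quotas and keeps the one with the most predicted returns within budget and a per-club limit. This is not how the team above was picked (that was done by hand). The function `pick_team`, the placeholder players and all numbers are made up for illustration, and prices are expressed in tenths of a million to avoid floating point issues.

```python
from collections import Counter
from itertools import combinations, product


def pick_team(players, quotas, budget, max_per_club=3):
    """Brute-force the squad with the highest total predicted returns.

    players: list of dicts with keys name, pos, club, price, returns.
    quotas:  how many players to pick per position, e.g. {"DEF": 2}.
    budget:  maximum total price, in the same units as `price`.
    """
    by_pos = {pos: [p for p in players if p["pos"] == pos] for pos in quotas}
    options = [list(combinations(by_pos[pos], n)) for pos, n in quotas.items()]
    best, best_score = None, float("-inf")
    for choice in product(*options):
        squad = [p for group in choice for p in group]
        if sum(p["price"] for p in squad) > budget:
            continue  # over budget
        if max(Counter(p["club"] for p in squad).values()) > max_per_club:
            continue  # too many players from one club
        score = sum(p["returns"] for p in squad)
        if score > best_score:
            best, best_score = squad, score
    return best, best_score


# Placeholder pool: names, clubs, prices and predicted returns are invented.
pool = [
    {"name": "D1", "pos": "DEF", "club": "A", "price": 75, "returns": 3.0},
    {"name": "D2", "pos": "DEF", "club": "B", "price": 55, "returns": 2.0},
    {"name": "D3", "pos": "DEF", "club": "C", "price": 45, "returns": 1.5},
    {"name": "M1", "pos": "MID", "club": "A", "price": 120, "returns": 4.0},
    {"name": "M2", "pos": "MID", "club": "B", "price": 80, "returns": 3.0},
    {"name": "M3", "pos": "MID", "club": "D", "price": 60, "returns": 2.0},
    {"name": "F1", "pos": "FWD", "club": "A", "price": 110, "returns": 3.5},
    {"name": "F2", "pos": "FWD", "club": "C", "price": 65, "returns": 2.5},
]
squad, total_returns = pick_team(
    pool, quotas={"DEF": 2, "MID": 2, "FWD": 1}, budget=350, max_per_club=2
)
```

Exhaustive search only works for a small shortlist like this one. For a full 15-player squad the number of combinations explodes, so an integer programming solver would be the more practical choice. For a single transfer, the same function can be run with all but one squad slot fixed.
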
So here are the troops for GW 1, barring any last-minute team news that indicates one of these players won't be starting:

