Like many others, I have the MCFC data set. I used it to toy with the data and created two new statistics. In fact three, as I'll explain later.
The main thing I tried to do was assess a player's impact on the season in terms of scoring opportunities and wins. So first of all, I did some computation and selected the data I needed:
- Appearances
- Assists
- Big Chances
- Goals
- Key Passes
- Time Played
- Shots
I have to say that I'm not too happy with the data set. Many events are not recorded with a player name, so part of the data cannot be used when you look at player production. But I managed to do what I wanted with it. All data are from the MCFC data set, and I rely on it for accuracy.
First thing I looked at: top scorers.
| Rank | Player full name | Goals |
|---|---|---|
| 1 | Robin van Persie | 30 |
| 2 | Wayne Rooney | 27 |
| 3 | Sergio Agüero | 23 |
| 4 | Clint Dempsey | 17 |
| 5 | Emmanuel Adebayor | 17 |
| 6 | Demba Ba | 16 |
| 7 | Grant Holt | 15 |
| 8 | Edin Dzeko | 14 |
| 9 | Papiss Demba Cissé | 13 |
| 10 | Mario Balotelli | 13 |
| 11 | Danny Graham | 12 |
| 12 | Steven Fletcher | 12 |
| 13 | Luis Suárez | 11 |
| 14 | Rafael van der Vaart | 11 |
| 15 | Frank Lampard | 11 |
| 16 | Daniel Sturridge | 11 |
| 17 | Jermain Defoe | 11 |
| 18 | Peter Crouch | 10 |
| 19 | Javier Hernández | 10 |
| 20 | Peter Odemwingie | 10 |
| 21 | Gareth Bale | 9 |
| 22 | Danny Welbeck | 9 |
| 23 | Steve Morison | 9 |
| 24 | Nikica Jelavic | 9 |
| 25 | Darren Bent | 9 |
But this table doesn't really help. Van Persie scored 30 goals, but what if all 30 were scored at home when Arsenal already led 3-0? Remember the data are for the 2011/2012 season, so he's still at Arsenal. The table gives us no idea of how much those goals contributed to Arsenal's success. Peter Crouch scored 10 goals, but maybe all 10 were winners. So my idea was to calculate how many goals each player created and how many wins were derived from these goals.
I created a new statistic called Goals Created. The formula is simple yet comprehensive:
Goals Created = (Key Passes + Assists) * Chance Conversion Rate
In fact it is just Chances Created * Chance Conversion Rate. The conversion rate is calculated by dividing each team's goals scored by its total chances, which makes it a kind of efficiency rating. So now the top 25 goal creators:
| Rank | Player full name | Goals created |
|---|---|---|
| 1 | David Silva | 17 |
| 2 | Robin van Persie | 14 |
| 3 | Juan Mata | 13 |
| 4 | Samir Nasri | 13 |
| 5 | Luka Modric | 12 |
| 6 | Gareth Bale | 10 |
| 7 | Morten Gamst Pedersen | 10 |
| 8 | Stéphane Sessegnon | 10 |
| 9 | Mikel Arteta | 9 |
| 10 | Rafael van der Vaart | 9 |
| 11 | Yohan Cabaye | 9 |
| 12 | Sergio Agüero | 9 |
| 13 | Wayne Rooney | 9 |
| 14 | Leighton Baines | 9 |
| 15 | Danny Murphy | 9 |
| 16 | Aaron Ramsey | 8 |
| 17 | Martin Petrov | 8 |
| 18 | Ryan Giggs | 8 |
| 19 | Matthew Jarvis | 8 |
| 20 | Emmanuel Adebayor | 8 |
| 21 | Ashley Young | 7 |
| 22 | Frank Lampard | 7 |
| 23 | Patrice Evra | 7 |
| 24 | Joey Barton | 7 |
| 25 | Bobby Zamora | 7 |
Well, the picture is changing. Robin van Persie stays high, but Peter Crouch disappears. David Silva is the leader with 17 goals created. Stéphane Sessegnon and Morten Gamst Pedersen, both with 10 goals created, are not from top teams but rank above Wayne Rooney and Frank Lampard. I think this is quite interesting, but we can go further.
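For readers who want to try the Goals Created calculation themselves, here is a minimal sketch. It takes the conversion rate as team goals per team chance and assumes that a team's "total chances" means its key passes plus assists, which is my reading and may not match the exact columns used. The names `PlayerSeason`, `chance_conversion_rate` and `goals_created` are made up for this example, and so are the numbers.

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    """One player's season totals (hypothetical structure for this sketch)."""
    name: str
    team: str
    key_passes: int
    assists: int


def chance_conversion_rate(team_goals: int, team_chances: int) -> float:
    """Share of a team's chances that ended in a goal (0.0 if no chances)."""
    if team_chances == 0:
        return 0.0
    return team_goals / team_chances


def goals_created(player: PlayerSeason, conversion_rate: float) -> float:
    """Goals Created = (Key Passes + Assists) * Chance Conversion Rate."""
    chances_created = player.key_passes + player.assists
    return chances_created * conversion_rate


# Made-up numbers for illustration only, not taken from the MCFC data set.
rate = chance_conversion_rate(team_goals=60, team_chances=480)
playmaker = PlayerSeason("Example Playmaker", "Example FC", key_passes=90, assists=10)
print(goals_created(playmaker, rate))  # 100 chances * 0.125 = 12.5
```

Because the rate is computed per team, two players with the same number of chances created can end up with different Goals Created if their teammates finish at different rates.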
With another simple formula I created, you can convert a player's contribution into wins.
Wins Created = (Goals + Goals Created) * Goal in Win Rate
The Goal in Win Rate is simply the weight of a goal in terms of wins. It corrects for high-scoring teams by reducing the weight of each of their goals: for a team that needs few goals per win, each goal is worth more than for a team that scores heavily in the games it wins.
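The Wins Created formula can be sketched in a few lines. The post does not give an exact definition of the Goal in Win Rate, so the version below assumes it is the team's wins divided by the team's goals scored; treat that as one plausible reading rather than the definition used for the tables. Function names and numbers are hypothetical.

```python
def goal_in_win_rate(team_wins: int, team_goals: int) -> float:
    """ASSUMED definition: wins earned per goal scored by the team."""
    if team_goals == 0:
        return 0.0
    return team_wins / team_goals


def wins_created(goals: int, goals_created: float, rate: float) -> float:
    """Wins Created = (Goals + Goals Created) * Goal in Win Rate."""
    return (goals + goals_created) * rate


# Made-up numbers for illustration only.
example_rate = goal_in_win_rate(team_wins=20, team_goals=80)  # 0.25 wins per goal
print(wins_created(goals=12, goals_created=8.0, rate=example_rate))  # 20 * 0.25 = 5.0
```

Under this assumption, the more goals a team scores per win, the smaller the rate, so each individual goal carries less weight.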
I then corrected Wins Created to each team's real number of wins. For example, the players of a team might add up to 50 Wins Created when the team only won 20 games. So I normalized the values so that they keep a real meaning. I called the result Corrected Wins Created.
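The correction step can be sketched as a simple proportional rescaling: every player's raw Wins Created is multiplied by the team's real wins divided by the team's raw total. Proportional scaling is an assumption on my part about how to do the normalization, and the function name, data layout and numbers below are hypothetical.

```python
from collections import defaultdict


def corrected_wins_created(raw, team_wins):
    """Rescale raw Wins Created so each team's players sum to its real wins.

    raw: list of (player, team, raw_wins_created) tuples
    team_wins: dict mapping team name -> actual number of wins
    """
    # First pass: total raw Wins Created per team.
    team_totals = defaultdict(float)
    for _, team, value in raw:
        team_totals[team] += value

    # Second pass: scale each player by real wins / raw team total.
    corrected = {}
    for player, team, value in raw:
        total = team_totals[team]
        scale = team_wins[team] / total if total else 0.0
        corrected[player] = value * scale
    return corrected


# Made-up numbers: raw values sum to 50, but the team only won 20 games.
raw = [
    ("Player A", "Example FC", 25.0),
    ("Player B", "Example FC", 15.0),
    ("Player C", "Example FC", 10.0),
]
result = corrected_wins_created(raw, {"Example FC": 20})
for player, value in result.items():
    print(player, round(value, 1))
```

After rescaling, the three example players add up to the team's 20 wins while keeping their relative order.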
I think one can keep Wins Created, because it favors attackers and midfielders. I'm working on the same kind of stats for defenders and keepers, and their results are likely to be negative. So when you add up each player's Wins Created, you should find the team's real number of wins. All right, enough talking, now the top 25 win creators:
| Rank | Player full name | Corrected Wins Created |
|---|---|---|
| 1 | Wayne Rooney | 8.4 |
| 2 | Robin van Persie | 7.9 |
| 3 | Sergio Agüero | 6.2 |
| 4 | Clint Dempsey | 5.5 |
| 5 | Demba Ba | 4.9 |
| 6 | Emmanuel Adebayor | 4.6 |
| 7 | David Silva | 4.4 |
| 8 | Luis Suárez | 4.1 |
| 9 | Stéphane Sessegnon | 4.0 |
| 10 | Juan Mata | 3.8 |
| 11 | Rafael van der Vaart | 3.8 |
| 12 | Edin Dzeko | 3.7 |
| 13 | Papiss Demba Cissé | 3.7 |
| 14 | Frank Lampard | 3.6 |
| 15 | Gareth Bale | 3.6 |
| 16 | Grant Holt | 3.5 |
| 17 | Danny Welbeck | 3.4 |
| 18 | Samir Nasri | 3.4 |
| 19 | Yohan Cabaye | 3.4 |
| 20 | Peter Crouch | 3.2 |
| 21 | Javier Hernández | 3.2 |
| 22 | Mario Balotelli | 3.2 |
| 23 | Ashley Young | 3.1 |
| 24 | Nicklas Bendtner | 3.0 |
| 25 | Mikel Arteta | 3.0 |
Not surprisingly, Robin van Persie is still near the top with 7.9 Wins Created. But the #1 is Wayne Rooney with 8.4 Wins Created. Stéphane Sessegnon keeps a nice 9th place with 4.0 Wins Created. Demba Ba, the former Newcastle striker, has 4.9 Wins Created. David Silva, our top creator, has "only" 4.4 Wins Created. Clint Dempsey's transfer will certainly make Fulham weaker this year, as he had 5.5 Wins Created. Mario Balotelli, despite being... Mario Balotelli, has 3.2 Wins Created. Not bad when you look at his behaviour.
With all the data I collected, I created a third variable, Goal Situations Created. As I'm still working on improvements to the model, I will not give details. I based it on Bill James' Runs Created formula. The results for now:
| Rank | Player full name | Goal situations created |
|---|---|---|
| 1 | Robin van Persie | 46 |
| 2 | Wayne Rooney | 37 |
| 3 | Sergio Agüero | 35 |
| 4 | Emmanuel Adebayor | 33 |
| 5 | Clint Dempsey | 26 |
| 6 | Demba Ba | 21 |
| 7 | Edin Dzeko | 21 |
| 8 | Grant Holt | 20 |
| 9 | Luis Suárez | 20 |
| 10 | Danny Graham | 19 |
| 11 | Mario Balotelli | 19 |
| 12 | Rafael van der Vaart | 18 |
| 13 | Papiss Demba Cissé | 18 |
| 14 | Gareth Bale | 17 |
| 15 | Danny Welbeck | 17 |
| 16 | Daniel Sturridge | 17 |
| 17 | Frank Lampard | 16 |
| 18 | David Silva | 16 |
| 19 | Theo Walcott | 16 |
| 20 | Steven Fletcher | 16 |
| 21 | Javier Hernández | 16 |
| 22 | Peter Crouch | 15 |
| 23 | Peter Odemwingie | 15 |
| 24 | Jermain Defoe | 15 |
| 25 | Darren Bent | 15 |
To conclude, I want to say that I enjoyed toying with the data and found some interesting stuff. I created two new statistics, Goals Created and Wins Created, using relatively easily accessible data. Unlike American sports, football (or soccer for my American readers) has only a few publicly available statistics.