
Win Probability Over Time (Final)

After crunching the numbers and the data, I came up with nice equations for each score. This allows me to create a table showing the win probability at every minute for a score differential of -3 to 3. But the two extreme cases are not as accurate as the others. One can check the R² later in the post. So here is the final graph.

R² of each model:
Trailing by 1: R² = 0.7074
Trailing by 2: R² = 0.71
Trailing by 3: R² = 0.1234
Tie: R² = 0.2581
Leading by 1: R² = 0.9007
Leading by 2: R² = 0.8253
Leading by 3: R² = 0.2019
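For readers who want to check an R² themselves, here is a minimal sketch of how it can be computed for a fitted curve. The post does not give the actual equations, so the straight-line fit and the sample points below are assumptions made up for illustration; they are not the data behind the values above.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = a + b * x (assumed model form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(ys, predicted):
    """Share of the variance in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical observations: minute of the match and observed win share.
minutes = [10, 30, 50, 70, 90]
win_share = [0.50, 0.56, 0.63, 0.74, 0.88]

a, b = linear_fit(minutes, win_share)
predicted = [a + b * m for m in minutes]
r2 = r_squared(win_share, predicted)
print(f"R² = {r2:.4f}")
```

An R² close to 1 means the curve follows the observed win shares closely, which is why the models for the extreme differentials, with their low R², are treated with caution.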

So from these I derived the marginal winning chance provided by a goal. I only used the goal differentials -2 to 2 because I want to keep the most accurate models. On the graph, T means trailing and L means leading. So for example the graph T2-->T1 means the winning probability added by going from a 2-goal deficit to a 1-goal deficit. If my explanations are not clear, the graph will be useful.
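The marginal value of a goal is simply the difference between two win-probability curves at the same minute. Here is a rough sketch of that subtraction. The curves in `WIN_PROB` are hypothetical straight lines with invented endpoints, standing in for the fitted equations, which are not given in this post.

```python
def make_line(start, end):
    """Hypothetical curve: win probability moving linearly from minute 0 to 90."""
    def curve(minute):
        return start + (end - start) * minute / 90.0
    return curve


# Invented placeholder curves, keyed by goal differential (-2 to 2).
WIN_PROB = {
    -2: make_line(0.10, 0.00),
    -1: make_line(0.20, 0.00),
    0: make_line(0.35, 0.00),
    1: make_line(0.60, 1.00),
    2: make_line(0.85, 1.00),
}


def marginal_win_prob(diff_before, minute):
    """Win probability added by scoring when the differential is diff_before."""
    after = WIN_PROB[diff_before + 1](minute)
    before = WIN_PROB[diff_before](minute)
    return after - before


def mean_marginal(diff_before, first_minute=1, last_minute=90):
    """Average added win probability over the whole match."""
    minutes = range(first_minute, last_minute + 1)
    return sum(marginal_win_prob(diff_before, m) for m in minutes) / len(minutes)


# T2-->T1 at minute 30, and the match-long mean for L1-->L2.
print(marginal_win_prob(-2, 30))
print(mean_marginal(1))
```

With the real equations plugged into `WIN_PROB`, `marginal_win_prob(-2, minute)` would trace the T2-->T1 curve of the graph.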



The results are interesting. Tying the game gives the biggest opportunity to take the three points, and its value increases over time. A late goal is always possible. Reducing the gap between two teams, I mean being down 2 and scoring so the team is down only one, first increases the team's Win%. But around half-time the effect of this goal starts to shrink, I guess because time is running out. The inverse phenomenon happens when a team takes the lead. I think that so far the results are quite normal. This is what you would expect.
A last comment is about a team taking a two-goal lead. During the first half, this helps increase a team's chance of taking the three points. But after half-time, the effect slowly shrinks. I guess this is not counter-intuitive, but the effect is not as huge as I thought. Scoring a second goal when in the lead increases the team's Win% by around 11.4%. This is just the mean of the values. I don't know what you think about this, but this looks thin to me. Is the gamble worth the risk? Should a manager go for the break and add a striker, or just defend?


I will continue to analyze the recent football results; the next mini-project is to compare the different leads. These models can be used to assess the value of a goal. Thanks to this, one can calculate the Win% added by each goal and compare players in terms of AddedWin%. That is another mini-project I will undertake.
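As a first idea of what an AddedWin% comparison could look like, here is a small sketch that sums the marginal value of each goal per scorer. Everything in it is hypothetical: the player names, the goal list, and `toy_marginal`, which is a crude stand-in for the real marginal curves.

```python
from collections import defaultdict


def added_win_by_player(goals, marginal):
    """Sum the win probability added by each goal, grouped by scorer.

    goals: iterable of (player, minute, diff_before) tuples.
    marginal: function (diff_before, minute) -> added win probability.
    """
    totals = defaultdict(float)
    for player, minute, diff_before in goals:
        totals[player] += marginal(diff_before, minute)
    return dict(totals)


def toy_marginal(diff_before, minute):
    """Invented placeholder: equalisers and go-ahead goals are worth more."""
    return 0.30 if diff_before in (-1, 0) else 0.10


# Hypothetical goals: (scorer, minute, goal differential before the goal).
goals = [
    ("Player A", 12, 0),
    ("Player A", 80, 1),
    ("Player B", 55, -1),
]

totals = added_win_by_player(goals, toy_marginal)
for player, added in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{player}: {added:.2f}")
```

Passing the marginal function as an argument keeps the bookkeeping separate from the model, so the placeholder can be swapped for the fitted curves later.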
