Probability to Win a Game
In this article I calculate the probability of winning a game. I assume that the outcomes of the rallies are independent and identically distributed, and from this derive a formula for the probability of winning a game as a function of the probability of winning a rally.
Independent and Identically Distributed
As stated above, I assume that the rally outcomes are independent and identically distributed. That means there is a single value \(p\) giving the probability for a player to win a rally. It is independent of the outcomes of previous rallies and of the current state of the match, and it is identical for every rally, i.e. constant. The probability for the other player to win the rally is thus \(1-p\).
This assumption simplifies the mathematical analysis, but it is of course not met in real matches. One important influence is the serving situation, which poses a disadvantage for the server and thus lowers the server's probability of winning the rally. As the server is also the winner of the previous rally, this introduces a dependence between rallies. Other factors could alter the probabilities as well. A player whose stamina is better than the opponent's would have a probability that rises over the course of the match. It is also conceivable that better players perform better at crucial stages of the match.
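To make the assumption concrete, here is a minimal Python sketch that plays a game as a sequence of independent rallies with a fixed win probability \(p\). The function names, the seed, and the number of simulated games are illustrative choices of mine, and the scoring rules (first to 21, two points clear, capped at 30) are the ones implied by the formula in the next section.

```python
import random


def simulate_game(p: float, rng: random.Random) -> bool:
    """Play one game rally by rally; return True if the player wins.

    Every rally is won with the same probability p, regardless of the
    score or of earlier rallies (the i.i.d. assumption).
    """
    own, other = 0, 0
    while True:
        if rng.random() < p:
            own += 1
        else:
            other += 1
        # A game ends at 21 or more with a two-point lead, or at 30.
        if own == 30 or (own >= 21 and own - other >= 2):
            return True
        if other == 30 or (other >= 21 and other - own >= 2):
            return False


def estimate_game_probability(p: float, games: int = 100_000, seed: int = 0) -> float:
    """Estimate the game-winning probability as the share of simulated wins."""
    rng = random.Random(seed)
    wins = sum(simulate_game(p, rng) for _ in range(games))
    return wins / games


if __name__ == "__main__":
    print(estimate_game_probability(0.45))
```

Such an estimate carries sampling noise, so it is only useful as a plausibility check for the exact formula, not as a replacement for it.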
Probability to Win a Game
The probability to win a game can be obtained by summing over the probabilities for all winning outcomes of the game. If the probability for a score of \(n_1 - n_2\) to be reached can be written as \(\mathcal{P}(p, n_1, n_2)\), this can be written as
\[P_{Game}(p) = \sum\limits_{n_2=0}^{19} \mathcal{P}(p, 21, n_2) + \sum\limits_{n_2=20}^{28} \mathcal{P}(p, n_2+2, n_2) + \mathcal{P}(p, 30, 29)\]
These probabilities can be written as
\[\begin{eqnarray} P_{Game} (p) &=& \sum\limits_{n_2 = 0}^{19} \, \dbinom{20+n_2}{20} p^{20} (1-p)^{n_2}\cdot p \\ &+& \sum\limits_{i = 0}^8 \dbinom{20+20}{20} p^{20} (1-p)^{20} \cdot \left[ 2p(1-p) \right]^i\cdot p^2 \\ &+& \dbinom{20+20}{20} p^{20} (1-p)^{20} \cdot \left[ 2p(1-p) \right]^9 \cdot p \end{eqnarray}\]
The three summands correspond to the three possible ways to win the game. The first is a win without extra points, i.e. reaching \(20 - n_2\) and then winning the next rally for a final score of \(21 - n_2\). The second summand corresponds to winning by a two-point margin in extra points. Its probability is the probability of reaching 20-all, then passing through \(i\) further level scores, and then winning two rallies in a row. Each further level score is reached with probability \(2p(1-p)\), because each player wins one of the two rallies, in either order. The last summand is the analogous term for a score of 30-29, where a single rally decides the game after 29-all.
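The formula translates almost term by term into code. The following Python sketch is one way to evaluate it. The function name `p_game` and the names of the intermediate variables are my own, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def p_game(p: float) -> float:
    """Probability of winning a game, given the rally-winning probability p."""
    q = 1.0 - p

    # First summand: reach 20-n2 (n2 = 0..19), then win the next rally.
    without_extra = sum(
        comb(20 + n2, 20) * p**20 * q**n2 * p for n2 in range(20)
    )

    # Probability of reaching 20-all.
    twenty_all = comb(40, 20) * p**20 * q**20
    # Probability that two rallies lead to the next level score.
    next_level = 2 * p * q

    # Second summand: i further level scores (i = 0..8), then two rallies in a row.
    two_clear = sum(twenty_all * next_level**i * p**2 for i in range(9))

    # Third summand: 29-all, then one deciding rally for 30-29.
    thirty_29 = twenty_all * next_level**9 * p

    return without_extra + two_clear + thirty_29


if __name__ == "__main__":
    for p in (0.40, 0.45, 0.50):
        print(f"p = {p:.2f}: P_Game = {p_game(p):.6f}")
```

For a rally probability of 45% this should reproduce the game probability of about 25.4% quoted with the table in this article.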
Plot
The following plot shows the dependence of the probability to win a game on the probability to win a rally.
The curve shows a distinct S-shape. For low rally probabilities, the curve is almost flat, as the game probability remains close to zero. It only starts to rise visibly after 30%. For a rally probability of 40% the game probability is about 9%. The slope increases further when approaching a rally probability of 50%, where it reaches its maximum. Thus any difference in rally probabilities around a value of 50% is greatly magnified when converted to game probabilities. This magnification effect is also known from studies of tennis games [1].
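The curve for probabilities over 50% is the curve below 50% reflected through the point (50%, 50%). This follows from the exchangeability of the players: exchanging them exchanges \(p\) and \(1-p\), and since exactly one player wins the game, \(P_{Game}(1-p) = 1 - P_{Game}(p)\).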
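Both observations, the steep slope around 50% and the symmetry between the two players, can be checked numerically. The sketch below is a rough check under my own choices: the helper names `p_game` and `slope` are hypothetical, and the slope is approximated by a central difference with an arbitrary step size.

```python
from math import comb


def p_game(p: float) -> float:
    """Closed-form game probability from the formula in this article."""
    q = 1.0 - p
    twenty_all = comb(40, 20) * p**20 * q**20
    return (
        sum(comb(20 + n2, 20) * p**21 * q**n2 for n2 in range(20))
        + sum(twenty_all * (2 * p * q) ** i * p**2 for i in range(9))
        + twenty_all * (2 * p * q) ** 9 * p
    )


def slope(p: float, h: float = 1e-4) -> float:
    """Central-difference approximation of dP_Game/dp at p."""
    return (p_game(p + h) - p_game(p - h)) / (2 * h)


if __name__ == "__main__":
    # Symmetry: the two players' game probabilities add up to one.
    for p in (0.30, 0.40, 0.45):
        print(f"P({p:.2f}) + P({1 - p:.2f}) = {p_game(p) + p_game(1 - p):.12f}")

    # Magnification: the slope is largest at a rally probability of 50%.
    for p in (0.30, 0.40, 0.50):
        print(f"slope at {p:.2f}: {slope(p):.3f}")
```

The table values at 49% and 50% already suggest a slope a little above 5 at the midpoint, so one percentage point of rally probability there corresponds to roughly five percentage points of game probability.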
Table
For rally probabilities from 10% to 50%, the probabilities to win a game are given in the following table. The average number of games played per game won, i.e. the reciprocal of the game probability, is also given. For example, for a rally probability of 45%, a player would win 25.4% of games, or one game in every 3.93 games on average.
Probability for Rally | Probability for Game | Games per Won Game |
---|---|---|
10% | 0.0000% | 46209933008.0883 |
11% | 0.0000% | 7562478333.0960 |
12% | 0.0000% | 1476554342.6810 |
13% | 0.0000% | 334489529.8954 |
14% | 0.0000% | 86031336.6792 |
15% | 0.0000% | 24695628.7342 |
16% | 0.0000% | 7803161.1824 |
17% | 0.0000% | 2683512.3246 |
18% | 0.0001% | 995087.0063 |
19% | 0.0003% | 394773.6448 |
20% | 0.0006% | 166455.1286 |
21% | 0.0013% | 74176.2078 |
22% | 0.0029% | 34765.7962 |
23% | 0.0059% | 17066.5333 |
24% | 0.0114% | 8743.0891 |
25% | 0.0215% | 4659.3864 |
26% | 0.0388% | 2575.8494 |
27% | 0.0679% | 1473.5488 |
28% | 0.1149% | 870.3730 |
29% | 0.1888% | 529.7740 |
30% | 0.3015% | 331.7086 |
31% | 0.4688% | 213.3127 |
32% | 0.7108% | 140.6861 |
33% | 1.0522% | 95.0381 |
34% | 1.5225% | 65.6820 |
35% | 2.1556% | 46.3906 |
36% | 2.9894% | 33.4519 |
37% | 4.0642% | 24.6052 |
38% | 5.4215% | 18.4451 |
39% | 7.1015% | 14.0815 |
40% | 9.1409% | 10.9398 |
41% | 11.5698% | 8.6432 |
42% | 14.4095% | 6.9399 |
43% | 17.6693% | 5.6595 |
44% | 21.3450% | 4.6849 |
45% | 25.4171% | 3.9344 |
46% | 29.8507% | 3.3500 |
47% | 34.5953% | 2.8906 |
48% | 39.5867% | 2.5261 |
49% | 44.7494% | 2.2347 |
50% | 50.0000% | 2.0000 |
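The rows of the table can be regenerated from the formula. This is a small Python sketch with a hypothetical helper `table_row`. The number formatting is my guess at how the table was produced, and the games per won game are simply the reciprocal of the game probability.

```python
from math import comb


def p_game(p: float) -> float:
    """Closed-form game probability from the formula in this article."""
    q = 1.0 - p
    twenty_all = comb(40, 20) * p**20 * q**20
    return (
        sum(comb(20 + n2, 20) * p**21 * q**n2 for n2 in range(20))
        + sum(twenty_all * (2 * p * q) ** i * p**2 for i in range(9))
        + twenty_all * (2 * p * q) ** 9 * p
    )


def table_row(percent: int) -> str:
    """Format one table row for a rally probability given in whole percent."""
    game = p_game(percent / 100)
    # Games per won game is the reciprocal of the game probability.
    return f"{percent}% | {game:.4%} | {1 / game:.4f} |"


if __name__ == "__main__":
    for percent in range(10, 51):
        print(table_row(percent))
```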
[1] See for example: Franc Klaassen and Jan R. Magnus, Analyzing Wimbledon: The Power of Statistics, Oxford University Press, 2014, p. 16.