Trading ATP Grand Slams - Data From 2016

12th January, 2017.

ATP Grand Slams are always a tricky format, because the best-of-five-set structure makes the best-of-three data from regular ATP tournaments less useful.  For example, if a player has a good third set record in best of three, how does that translate to best of five?  It's likely that the player will have a stronger record in the latter sets of best of five, but it's far from an exact science.  Therefore, before the Australian Open next week, I wanted to run through some key data from the 2016 Grand Slams in an attempt to identify in-set swing trading trends that we can exploit.

To do this, I built a mini-database of matches from Grand Slams in 2016.  I then filtered out matches where I had no data on one of the players (e.g. a young wild card with a very small sample size) and included information on each player, such as the Pinnacle price when I sent the daily spreadsheets out, projected hold percentages and combined scores (for lead loss/recovery).

These projected hold percentages and combined scores are available for every ATP & WTA match via the daily spreadsheets.

Where a player's lead loss/recovery data (used in the combined scores) was not known, I used a back-tested formula to estimate it - the formula is a little too sensitive to share publicly, but I'd be happy to share it with subscribers who email me at tennistrades@gmail.com.

I then collated the in-play first break lead and deficit recovery for each match, bearing the following guidelines in mind:-

1) Only break leads where the player is leading or level in sets in the match were included (it would be reasonable to assume that the player is trading below SP in these cases).
2) Only the first break lead of each set was included.
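
The two guidelines above can be expressed as a small filter.  This is only a minimal sketch, not the code used to build the database: the `BreakEvent` record, its field names and the sample events are all hypothetical, and it assumes that each match is available as a chronological list of breaks of serve together with the set score at the time.  It also assumes that if the first break of a set fails guideline 1, nothing from that set is counted.

```python
from dataclasses import dataclass


@dataclass
class BreakEvent:
    """One break of serve (hypothetical record layout)."""
    set_no: int    # set in which the break happened (1-5)
    breaker: int   # 1 or 2: the player who broke serve
    sets_p1: int   # sets already won by player 1 when the break happened
    sets_p2: int   # sets already won by player 2 when the break happened


def qualifying_break_leads(events):
    """Return the breaks that count as a first break lead under both guidelines."""
    seen_sets = set()
    leads = []
    for ev in events:  # events are assumed to be in chronological order
        if ev.set_no in seen_sets:
            continue   # guideline 2: only the first break lead of each set
        seen_sets.add(ev.set_no)
        if ev.breaker == 1:
            own, opp = ev.sets_p1, ev.sets_p2
        else:
            own, opp = ev.sets_p2, ev.sets_p1
        if own >= opp:  # guideline 1: the breaker is leading or level in sets
            leads.append(ev)
    return leads


# Made-up match, purely for illustration.
events = [
    BreakEvent(1, 2, 0, 0),  # P2 breaks first at 0-0 in sets: counted
    BreakEvent(1, 1, 0, 0),  # break-back in the same set: ignored
    BreakEvent(2, 1, 1, 0),  # P1 is a set up and breaks first: counted
    BreakEvent(3, 2, 2, 0),  # P2 trails 0-2 in sets: not counted
]
print([(e.set_no, e.breaker) for e in qualifying_break_leads(events)])  # [(1, 2), (2, 1)]
```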

This then generated a sample of data for each match, which looked like this (example match Pablo Andujar vs Pierre-Hugues Herbert in the Australian Open 2016).

Match-level columns:

| Event | Year | Player 1 | Current Pinnacle Price (Player 1) | Player 2 | Current Pinnacle Price (Player 2) | Projected Hold (Player 1) | Projected Hold (Player 2) | Combined Score on Player's Serve (Player 1) | Combined Score on Player's Serve (Player 2) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Australian Open | 2016 | Andujar | 3.01 | Herbert | 1.46 | 73.80 | 82.40 | 47.28 | 48.60 |

Per-set columns (Bk Ld = break lead, Def Rec = deficit recovery):

| Set | P1 Bk Ld | P1 Def Rec | P2 Bk Ld | P2 Def Rec |
| --- | --- | --- | --- | --- |
| Set 1 | | | 1 | 1 |
| Set 2 | 1 | 1 | | |
| Set 3 | | | | |
| Set 4 | | | 1 | 0 |
| Set 5 | | | | |

To summarise this match, the row shows that Herbert took the first break in the first set (P2 Set 1 Bk Ld = 1) but lost the lead (P2 Set 1 Def Rec = 1) to go back on serve.  Andujar then took the set and led by a set and a break in set 2 (P1 Set 2 Bk Ld = 1), but he also lost that lead (P1 Set 2 Def Rec = 1) to go back on serve.

We then have to go all the way to set 4 to find another example (set 3 had no breaks) - at this stage, Herbert is 2-1 up in sets and a break up (P2 Set 4 Bk Ld = 1), so close to victory, and didn't lose the lead (P2 Set 4 Def Rec = 0).
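
As a rough illustration of how a row like this turns into counts, the flags for the example match can be tallied as below.  The dictionary layout and key names are my own shorthand rather than the spreadsheet's column names, and `None` stands for a blank cell.

```python
# Per-set flags for Andujar (P1) vs Herbert (P2), taken from the example row.
# Key names are illustrative; None marks a blank cell.
example_match = {
    1: {"p1_bk_ld": None, "p1_def_rec": None, "p2_bk_ld": 1, "p2_def_rec": 1},
    2: {"p1_bk_ld": 1, "p1_def_rec": 1, "p2_bk_ld": None, "p2_def_rec": None},
    3: {"p1_bk_ld": None, "p1_def_rec": None, "p2_bk_ld": None, "p2_def_rec": None},
    4: {"p1_bk_ld": None, "p1_def_rec": None, "p2_bk_ld": 1, "p2_def_rec": 0},
    5: {"p1_bk_ld": None, "p1_def_rec": None, "p2_bk_ld": None, "p2_def_rec": None},
}


def tally(match):
    """Count qualifying break leads and how many of them were recovered."""
    leads = 0
    recovered = 0
    for flags in match.values():
        for player in ("p1", "p2"):
            if flags[f"{player}_bk_ld"] == 1:
                leads += 1
                recovered += flags[f"{player}_def_rec"] or 0
    return leads, recovered


leads, recovered = tally(example_match)
print(leads, recovered)  # 3 2
```

So this one match contributes three break leads and two deficit recoveries to the sample.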

With that explanation out of the way, I then filtered the data by various metrics, the first being the break-back percentage by set, which can be seen below:-

| Set | Possible Set Scenarios | Break Leads | Deficit Recoveries | Recovery % |
| --- | --- | --- | --- | --- |
| 1 | 0-0 & Break | 426 | 156 | 36.62 |
| 2 | 1-0 & Break | 320 | 97 | 30.31 |
| 3 | 1-1 & Break, 2-0 & Break | 370 | 117 | 31.62 |
| 4 | 2-1 & Break | 151 | 48 | 31.79 |
| 5 | 2-2 & Break | 83 | 30 | 36.14 |



Here we can see that at 0-0 & a break, and 2-2 & a break the recovery percentages were highest, with 1-0 & a break recording the lowest.  

This is pretty logical in both cases.  A break at 0-0 is an early lead, which is more likely to be variance-driven than a lead later in a match, while at 2-2 & a break the losing player must recover the deficit or he is out of the tournament.  It's also understandable that this figure is greater than at 2-1 & a break, as 2-2 & a break is a less dominant scoreline for the leader.

Furthermore, it's logical that 1-0 & a break should have the lowest recovery, if only slightly, given that the losing player is yet to take a set in the match.
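
For anyone wanting to reproduce the recovery percentages, they are simply deficit recoveries divided by break leads.  The short sketch below recomputes them from the counts in the by-set table; the pooled figure across all five sets is my own derivation from those counts rather than a number quoted elsewhere in this article.

```python
# (break leads, deficit recoveries) per set, copied from the by-set table.
by_set = {
    1: (426, 156),
    2: (320, 97),
    3: (370, 117),
    4: (151, 48),
    5: (83, 30),
}


def recovery_pct(leads, recoveries):
    """Percentage of break leads that were lost, to two decimal places."""
    return round(100 * recoveries / leads, 2)


for set_no, (leads, recoveries) in by_set.items():
    print(set_no, recovery_pct(leads, recoveries))

total_leads = sum(leads for leads, _ in by_set.values())
total_recoveries = sum(rec for _, rec in by_set.values())
print(total_leads, total_recoveries, recovery_pct(total_leads, total_recoveries))
# 1350 448 33.19
```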

The next area I looked at was the recovery percentage for the first break lead of any set, per Grand Slam event:-

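| Tournament | Year | Break Leads | Break Deficits Recovered | % Recovered |
| --- | --- | --- | --- | --- |
| Australian Open | 2016 | 318 | 91 | 28.62 |
| French Open | 2016 | 357 | 130 | 36.41 |
| Wimbledon | 2016 | 320 | 93 | 29.06 |
| US Open | 2016 | 355 | 134 | 37.75 |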


Here we can see that in 2016 the Australian Open actually had the lowest deficit recovery of all four Grand Slam events, just below Wimbledon, which most would have predicted to be the lowest given that it is played on relatively quick grass courts, which have the highest service hold percentage across the four Slam events.

However, whilst it is impossible to give a definitive reason why the Australian Open had such a low break-back percentage, it's reasonable to speculate that two factors had a significant input.  Firstly, the conditions are extremely hot.  If a player is a break down in a set but would still be in the match after losing it, it is fair to think that some would 'tank' the set to conserve energy.  Secondly, also from a fitness perspective, top players are likely to have more energy and fewer injuries in January than later in the season, so they are better at protecting leads.

Based on this information, traders should set pretty high tolerance levels at the Australian Open if they are looking to oppose players leading and a break up in matches.  

Speaking of these tolerance levels, what should we set?  

The main drivers for these would be projected hold and combined score percentages, and I evaluated these below.  It's important to bear in mind that the ATP service hold mean across all surfaces in 2016 was 79.0%, and that the current mean first break lead loss percentage from the lead loss/recovery sheets was 32.71%, with a combined score of 67.10 being average.

Grand Slam break deficit recovery percentages based on player projected hold percentages are as follows:-

| Projected Hold % | Break Leads | Deficit Recoveries | Recovery % |
| --- | --- | --- | --- |
| <63.99% | 44 | 30 | 68.18 |
| 64.00-68.99% | 78 | 36 | 46.15 |
| 69.00-73.99% | 190 | 83 | 43.68 |
| 74.00-78.99% | 270 | 98 | 36.30 |
| 79.00-83.99% | 305 | 102 | 33.44 |
| 84.00-88.99% | 260 | 60 | 23.08 |
| >89.00% | 203 | 39 | 19.21 |


Here we can see that the projected hold percentage is a huge driver behind leading players losing those break leads in Grand Slam events, with all four categories below the ATP service hold mean of 79.0% showing an above-average break-back percentage.  The drop-off from the mean really started when the projected hold percentage reached 84.0%.

When a player's projected hold percentage was below 64%, they were incredibly vulnerable as a front-runner, losing a huge 68.18% of those break leads.  
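
To make the brackets usable in practice, a simple lookup can return the 2016 Grand Slam recovery percentage for a leader's projected hold.  This is a hedged sketch: the function and constant names are hypothetical, the bracket edges are my reading of the table (a hold of exactly 89.00% is treated as belonging to the top bracket), and the historical figures are no guarantee of future results.

```python
from bisect import bisect_right

# Lower edges of the brackets from 64.00 upwards; anything below 64.00
# falls into the first bracket. Edge handling is an assumption.
HOLD_LOWER_EDGES = [64.0, 69.0, 74.0, 79.0, 84.0, 89.0]
HOLD_RECOVERY_PCT = [68.18, 46.15, 43.68, 36.30, 33.44, 23.08, 19.21]
MEAN_LEAD_LOSS_PCT = 32.71  # mean first break lead loss quoted earlier


def hold_bracket_recovery(projected_hold):
    """2016 Grand Slam recovery % for the bracket containing projected_hold."""
    return HOLD_RECOVERY_PCT[bisect_right(HOLD_LOWER_EDGES, projected_hold)]


# Projected holds from the example match: Andujar 73.80, Herbert 82.40.
for name, hold in (("Andujar", 73.80), ("Herbert", 82.40)):
    rate = hold_bracket_recovery(hold)
    print(name, rate, "above mean" if rate > MEAN_LEAD_LOSS_PCT else "at or below mean")
```

On these numbers, Andujar's bracket (43.68%) is well above the 32.71% mean, while Herbert's (33.44%) is only marginally above it.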

We established earlier that the mean combined score for ATP players was 67.10, so I then evaluated the likelihood of break-backs per various combined score brackets, detailed below:-

| Combined Score | Break Leads | Deficit Recoveries | Recovery % |
| --- | --- | --- | --- |
| <46 | 129 | 29 | 22.48 |
| 46.00-52.99 | 155 | 35 | 22.58 |
| 53.00-59.99 | 260 | 77 | 29.62 |
| 60.00-66.99 | 274 | 82 | 29.93 |
| 67.00-73.99 | 234 | 82 | 35.04 |
| 74.00-80.99 | 135 | 63 | 46.67 |
| 81.00-87.99 | 77 | 34 | 44.16 |
| 88.00-94.99 | 45 | 21 | 46.67 |
| >95 | 41 | 25 | 60.98 |


As we can see, combined score is also a huge driver for break-backs, with a strong upward trend for recovery % as combined score increased.  When the combined score rose above 95, the break-back recovery percentage was stellar, at just over 60%.

Using all this data, we can look to formulate Grand Slam specific trading scripts.  I'll leave that part to you, but it is obvious that projected hold and combined score - available via the daily spreadsheets - are clear metrics which contribute to high deficit recovery.
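
As a starting point only, a screening rule along these lines could be sketched as follows.  Everything here is an assumption for illustration: the `Leader` record and function name are hypothetical, and the default thresholds are just the two means quoted above (79.0% hold, 67.10 combined score) used as placeholders, not tested entry criteria.

```python
from dataclasses import dataclass

ATP_MEAN_HOLD = 79.0          # 2016 ATP service hold mean, all surfaces
MEAN_COMBINED_SCORE = 67.10   # average combined score


@dataclass
class Leader:
    """Player currently a break up (hypothetical record layout)."""
    name: str
    projected_hold: float
    combined_score: float


def worth_a_closer_look(leader, max_hold=ATP_MEAN_HOLD, min_combined=MEAN_COMBINED_SCORE):
    """Flag leaders whose hold is below and combined score above the chosen thresholds."""
    return leader.projected_hold < max_hold and leader.combined_score > min_combined


candidates = [
    Leader("Andujar", 73.80, 47.28),   # figures from the example match
    Leader("Herbert", 82.40, 48.60),   # figures from the example match
    Leader("Player X", 70.00, 80.00),  # made-up figures, for illustration only
]
for leader in candidates:
    print(leader.name, worth_a_closer_look(leader))
```

Tightening `max_hold` and raising `min_combined` would restrict the rule to the brackets with the highest historical recovery percentages, at the cost of fewer qualifying situations.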







