12th January, 2017.
ATP Grand Slams are always a tricky format, because the best of five set structure makes the best of three data from regular ATP tournaments less useful. For example, if a player has a good third set record in best of three, how does that translate to best of five? It's likely that the player will have a stronger record in the latter sets of best of five, but it's far from an exact science. Therefore, before the Australian Open next week, I wanted to run through some key data from the 2016 Grand Slams in an attempt to identify in-set swing trading trends that we can exploit.
To do this, I built a mini-database of matches from Grand Slams in 2016. I then filtered out matches where I had no data on one of the players (e.g. a young wild card with a very small sample size) and included information on each player, such as the Pinnacle price when I sent the daily spreadsheets out, projected hold percentages and combined scores (for lead loss/recovery).
These projected hold percentages and combined scores are available for every ATP & WTA match via the daily spreadsheets.
Where a player's lead loss/recovery data (to use in the combined scores) was not known, I used a back-tested formula to estimate it - the formula is a little too sensitive to share on the general internet, but I'd be happy to share this with subscribers who email me at firstname.lastname@example.org.
I then collated the in-play first break lead and deficit recovery for each match, bearing the following guidelines in mind:-
1) Only break leads where the player is leading or level in sets in the match were included (it would be reasonable to assume that the player is trading below SP in these cases).
2) Only the first break lead of each set was included.
This then generated a sample of data for each match, which looked like this (example match Pablo Andujar vs Pierre-Hugues Herbert in the Australian Open 2016).
To summarise this match, it would show that Herbert took the first break in the first set (P2 Set 1 Bk Ld = 1), but lost the lead (P2 Set 1 Def Rec = 1) to go back on serve. Andujar then took the set, and led by a set and break in set 2 (P1 Set 2 Bk Ld = 1) and lost the lead (P1 Set 2 Def Rec = 1) to go back on serve.
We then have to go all the way to set 4 to find another example (set 3 had no breaks) - at this stage, Herbert is 2-1 up in sets and a break up (P2 Set 4 Bk Ld = 1), so close to victory, and didn't lose the lead (P2 Set 4 Def Rec = 0).
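The two inclusion guidelines can be sketched in a few lines of Python. This is only an illustration: the row layout and field names are hypothetical, not the format of the original database, and it assumes that the "first break lead of each set" filter is applied before the "leading or level in sets" filter. The three rows are the break leads from the Andujar vs Herbert example.

```python
# Hypothetical row layout: one dict per break lead, in the order they occurred.
# "leader_sets" / "opp_sets" are the sets won by each player when the lead was taken;
# "recovered" mirrors the "Def Rec" flag (1 = the break lead was lost).
ANDUJAR_HERBERT = [
    {"set_no": 1, "leader": "Herbert", "leader_sets": 0, "opp_sets": 0, "recovered": 1},
    {"set_no": 2, "leader": "Andujar", "leader_sets": 1, "opp_sets": 0, "recovered": 1},
    {"set_no": 4, "leader": "Herbert", "leader_sets": 2, "opp_sets": 1, "recovered": 0},
]


def qualifying_break_leads(break_leads):
    """Keep the first break lead of each set, and only if the leader is level or ahead in sets."""
    seen_sets = set()
    kept = []
    for row in break_leads:
        if row["set_no"] in seen_sets:
            continue  # guideline 2: only the first break lead of each set
        seen_sets.add(row["set_no"])
        if row["leader_sets"] >= row["opp_sets"]:  # guideline 1: leading or level in sets
            kept.append(row)
    return kept


sample = qualifying_break_leads(ANDUJAR_HERBERT)
print(len(sample), sum(r["recovered"] for r in sample))  # 3 break leads, 2 of them lost
```

All three break leads in this match qualify, and two of the three were lost.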
With that explanation out of the way, I then filtered the data for various different metrics, with the first being the break-back percentage by set, which can be seen below:-
Here we can see that at 0-0 & a break, and 2-2 & a break the recovery percentages were highest, with 1-0 & a break recording the lowest.
This is pretty logical for both areas. A lead at 0-0 & a break is an early lead, which is more likely to be variance-driven than a later lead in a match, while at 2-2 & a break the player losing is in the deciding set and needs to recover the deficit or he is out of the tournament. It's also understandable that this figure is greater than at 2-1 & a break, as 2-2 & a break is a less dominant scoreline for the leader.
Furthermore, it's logical that 1-0 & a break should have the lowest recovery, if only slightly, given that the losing player is yet to take a set in the match.
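For anyone wanting to reproduce this kind of breakdown on their own data, the grouping itself is simple. The sketch below is a minimal illustration, not the method used for the figures above: the function name and the tuple layout are assumptions, and the only input shown is the three break leads from the Andujar vs Herbert example, so the resulting percentages mean nothing beyond demonstrating the mechanics.

```python
from collections import defaultdict

# (set score from the leader's point of view, recovered flag) for the three
# break leads in the Andujar vs Herbert example; real use needs the full sample.
EXAMPLE_LEADS = [("0-0", 1), ("1-0", 1), ("2-1", 0)]


def recovery_pct_by_score(leads):
    """Return the break-back percentage for each set score."""
    counts = defaultdict(lambda: [0, 0])  # score -> [break leads, leads lost]
    for score, recovered in leads:
        counts[score][0] += 1
        counts[score][1] += recovered
    return {score: 100.0 * lost / total for score, (total, lost) in counts.items()}


print(recovery_pct_by_score(EXAMPLE_LEADS))
# {'0-0': 100.0, '1-0': 100.0, '2-1': 0.0}
```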
The next area I looked at was the recovery percentage in the first break of any set per Grand Slam event:-
Here we can see that in 2016 the Australian Open actually had the lowest deficit recovery of all four Grand Slam events, just below Wimbledon. Most would have predicted Wimbledon to be the lowest, given that it is played on relatively quick grass courts, which have the highest service hold percentage across the four Slam events.
However, whilst it is impossible to get a definitive reason as to why the Australian Open had such a low break-back percentage, it's reasonable to speculate that two factors had a significant input. Firstly, the conditions are extremely hot. If a player is a break down in a set but would still be in the match if they lost that set, it is fair to think that some would 'tank' the set to conserve energy. Secondly, also from a fitness perspective, top players are likely to have more energy and fewer injuries in January than later in the season - therefore they are better at protecting leads.
Based on this information, traders should set pretty high tolerance levels at the Australian Open if they are looking to oppose players leading and a break up in matches.
Speaking of these tolerance levels, what should we set?
The main drivers for these would be projected hold and combined score percentages, and I evaluated these below. It's important to bear in mind that the ATP service hold mean across all surfaces in 2016 was 79.0%, the current mean first break lead loss percentage from the lead loss/recovery sheets was 32.71%, and a combined score of 67.10 was average.
Grand Slam break deficit recovery percentages based on player projected hold percentages are as follows:-
Here we can see that the projected hold percentage is a huge driver behind leading players losing those break leads in Grand Slam events, with the four categories below the ATP service hold mean of 79.0% all having more than the average number of break-backs. The drop-off from the mean really started when the projected hold percentage rose over 84.0%.
When a player's projected hold percentage was below 64%, they were incredibly vulnerable as a front-runner, losing a huge 68.18% of those break leads.
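To put that figure in context against the 32.71% average, and to show how a projected hold percentage might be assigned to a bracket, here is a rough Python sketch. The bracket edges are an assumption: they use only the three thresholds named in the text (64%, 79.0% and 84.0%), whereas the original breakdown used finer categories, and the function names are purely illustrative.

```python
from bisect import bisect_right

ATP_MEAN_LEAD_LOSS = 32.71  # mean first break lead loss %, as quoted above

# Assumption: coarse brackets built only from the thresholds named in the text.
HOLD_EDGES = [64.0, 79.0, 84.0]
HOLD_LABELS = ["<64", "64-79", "79-84", ">=84"]


def hold_bracket(projected_hold_pct):
    """Return the label of the bracket that a projected hold percentage falls into."""
    return HOLD_LABELS[bisect_right(HOLD_EDGES, projected_hold_pct)]


def excess_over_mean(break_back_pct):
    """Percentage points above (or below) the mean first break lead loss."""
    return round(break_back_pct - ATP_MEAN_LEAD_LOSS, 2)


print(hold_bracket(62.5))       # <64
print(excess_over_mean(68.18))  # 35.47
```

In other words, leaders with a projected hold below 64% lost their break lead a little over twice as often as the average.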
We established earlier that the mean combined score for ATP players was 67.10, so I then evaluated the likelihood of break-backs per various combined score brackets, detailed below:-
As we can see, combined score is also a huge driver for break-backs, with a strong upward trend for recovery % as combined score increased. When the combined score rose above 95, the break-back recovery percentage was stellar, at just over 60%.
Using all this data, we can look to formulate Grand Slam specific trading scripts. I'll leave that part to you, but it is obvious that projected hold and combined score - available via the daily spreadsheets - are clear metrics which contribute to high deficit recovery.
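As a starting point for such a script, the two metrics could be checked against the ATP means quoted earlier. The sketch below is a hypothetical screening rule rather than a tested strategy: the function names are made up, and using the plain means (79.0% projected hold, 67.10 combined score) as cut-offs is an assumption. Given the Australian Open data above, stricter thresholds may well be more appropriate there.

```python
ATP_MEAN_HOLD = 79.0       # ATP service hold mean across all surfaces in 2016
ATP_MEAN_COMBINED = 67.10  # average combined score


def break_back_flags(leader_projected_hold, combined_score):
    """Flag which of the two drivers point towards an above-average break-back chance."""
    return {
        "hold_below_mean": leader_projected_hold < ATP_MEAN_HOLD,
        "combined_above_mean": combined_score > ATP_MEAN_COMBINED,
    }


def is_break_back_candidate(leader_projected_hold, combined_score):
    """Hypothetical rule: only consider opposing the leader when both drivers agree."""
    return all(break_back_flags(leader_projected_hold, combined_score).values())


print(is_break_back_candidate(72.0, 80.0))  # True
print(is_break_back_candidate(86.0, 80.0))  # False
```

The inputs in the usage lines are arbitrary numbers chosen to show both outcomes, not figures from any match.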