Using Historical Scenarios in Trading

Skype: @TennisRatings

Undertaking scenario research as part of my own statistical database was at the top of my priority list during the 2014-2015 off-season.  A previous, smaller sample database had indicated that it is very possible to group matches into brackets and calculate the likelihood of swings from given situations.

I alluded to this previously in the article Laying The Underdog in Set Two, which showed that filtering a database for results would have allowed traders to have a high level of confidence in an ‘averaging down’ strategy for the two matches in Wuhan involving Timea Bacsinszky, who took the first set as a heavy pre-match underdog against both Maria Sharapova and Caroline Wozniacki.  It was shown that there was an extremely high likelihood that Bacsinszky would not ‘train’, either from the start of the second set or even if she got a set and break lead.

Timea Bacsinszky had a low likelihood of 'training' in both matches against more illustrious players...

I’ve now performed this analysis for the whole of the WTA for second sets and deciding sets involving all price ranges (for favourites and underdogs)  in the 2014 season and the value of the data is truly incredible.  It allows me to make specific scenario judgements on current players as well as – much more importantly - for future matches.

The benefit of this is that I can feed match-relevant statistics into my database and it will tell me the historical likelihood of future events happening in the match, such as the player who lost the first set breaking first, or their ability to recover from a set and break down, or from a set and double break down.

Firstly, using this data it is possible to dispel the myth that Serena Williams and Maria Sharapova are ‘comeback beasts’ in the WTA.  Whilst these two players obviously have highly impressive statistics for recovering set deficits, it’s much more useful to look at why this has happened…

Serena Williams has strong comeback stats, but they are merely a natural consequence...

There is a clear relationship, from this data, between the match projected hold percentages in my daily trading spreadsheets and the likelihood of a player coming back from a losing position.  Broadly speaking, the lower a player’s projected hold is, the less likely they are to hold on to a lead.  However, it’s important to make the point that some scenarios affect this much more than others. 

This is logical, but it does illustrate why Williams and Sharapova are so good at coming back from set deficits: they break their opponents a lot historically, which in turn gives their opponents a low projected hold.

Furthermore, there is another clear relationship.  This is between the starting price of the player who takes the first set, and their propensity to lose the next set.  Again, broadly speaking, a short-priced player who takes the first set will be more likely to ‘train’ than a heavy pre-match underdog who wins the first set.

So what else do Williams and Sharapova have in common?  They are arguably two of the best players on Tour, if not the two best.  The consequence of this is that their starting prices are extremely short, and therefore their opponents are heavy pre-match underdogs.  It’s only natural that they’d recover these set deficits much more than average.

There are also several other strong first-set metrics which indicate that Williams and Sharapova should come back more than average, but at risk of not knowing who might be reading this, I’ll keep those for a rainy day.

An example of a ‘comeback beast’ would be Kirsten Flipkens in 2013, when she recovered from a set deficit a ridiculous number of times, despite rarely being priced prohibitively short pre-match.  I would make a strong case for the 2013 version of Flipkens being hugely better in terms of beating comeback expectation than either Williams or Sharapova generally.

Kirsten Flipkens - a true 'comeback beast'?

Moving on to a different area entirely, the database allows me to assess results by segment of the season, and quite unsurprisingly it was January-March where there were more second set fightbacks in matches: the time of the year when players are fresh and a high level of motivation is almost always guaranteed.

To clarify matters, I defined a ‘second set fightback’ by three metrics:-

Either the player who lost the first set broke first in the second set, or they recovered a set and break deficit to get the second set back on serve, or they recovered at least one break when they were a set and double break down.  The first two categories in particular show a high degree of fight and mental strength from the losing player.
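As a rough illustration of the three definitions above, here is a minimal Python sketch.  The function name, the labels and the input format (an ordered list of breaks of serve in set two, "L" for the first set loser and "W" for the first set winner) are assumptions made purely for illustration and are not taken from the database itself.  In this reading, a break by the first set loser while a single break behind counts as getting the set back on serve, even if they had earlier been a double break down.

```python
def classify_fightback(breaks):
    """Label the second set fightbacks shown by the first set loser.

    breaks: ordered list of "W" / "L", one entry per break of serve in
    set two ("W" = first set winner broke, "L" = first set loser broke).
    This input format is a hypothetical simplification.
    """
    labels = set()

    # Metric 1: the first set loser breaks first in set two.
    if breaks and breaks[0] == "L":
        labels.add("broke_first")

    lead = 0  # break lead of the first set winner in set two
    for who in breaks:
        if who == "W":
            lead += 1
        else:
            if lead == 1:
                # Metric 2: from a set and break down to back on serve.
                labels.add("back_on_serve")
            elif lead >= 2:
                # Metric 3: at least one break recovered from a set
                # and double break down.
                labels.add("break_back_from_double")
            lead -= 1
    return labels


if __name__ == "__main__":
    print(classify_fightback(["L"]))                 # broke first
    print(classify_fightback(["W", "L"]))            # back on serve
    print(classify_fightback(["W", "W", "L", "W"]))  # one break back from a double break
    print(classify_fightback(["W", "W"]))            # no fightback
```

A set can carry more than one label, and a label only records that the swing happened, not that the player went on to win the set.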

My reasoning behind these three areas was pretty simple – all three of these metrics generate a swing in price that we as traders can take advantage of, if we anticipate it.  This is a lot more useful than a series of return games where a player might get to deuce in all of them but fail to break, which arguably also shows fight but is of much less interest to a trader.

This allowed me to back up entirely my point that some WTA favourites failed to give their best efforts or weren’t mentally able to recover deficits (some would say tanked) in the last week of the main season, in Luxembourg and Moscow.

When I made this point at the time on Twitter it felt like the whole Twitter world argued against me.  It’s sometimes difficult to articulate your thoughts into 140 characters on Twitter so it’s not that easy to prove a point in a discussion, so writing an article is a good solution.  Players would never do this, the Twitterati cried.  Players aren’t robots, it’s a tough season, and so on…

Well actually they basically are robots.  They all fit a genre to some extent, or their matches follow very predictable patterns in certain scenarios, and it’s very possible to map future rest-of-match expectation based on statistics we know pre-match and what happens in the match prior to the point we need to make a decision on what will happen.  And on the ‘tough season’ point, it’s a tough, long season for ALL players, not just the ‘better’ ones. 

I will say now that if you know a player’s relevant hold/break statistics before the match, and how often they lose break leads/recover deficits, and can assess the game state well, you don’t even need to watch a match.  You could just scoreboard trade without any live pictures.  The players could even wear masks so you didn’t know who was who!  As long as you have a detailed point by point scoreboard feed that’s all that matters.  The only real benefit to watching a match is so you can see any possible injury issues.  Even a detailed scoreboard that you can get online illustrates whether a player’s level has dropped.

Look at Heather Watson.  How often does she win a match easily?  I looked back to see how often she trained from starting price in WTA main draw matches and she did so just once in 2014, against Ajla Tomljanovic at Wimbledon.  Even in a rare straight sets win over Barbora Zahlavova-Strycova at the French Open she conceded a break lead in both sets.   I’d be surprised if Watson trained from starting price – or even from a set up - more than a few percent of the time in her WTA career.

Watson’s failure to win matches easily is because she has a generally poor service hold percentage and loses a break lead too frequently, as well as rarely starting as a heavy pre-match favourite.  All rolled into one, this shows that she will struggle to win matches easily, by convincing margins, and anyone with this information would have made an absolute fortune trading her matches this year, particularly her match against Dominika Cibulkova in the Rogers Cup in Montreal, where she led by a set and double break at 4-0 before being taken to three sets.  In that match she traded below 1.05 twice and below 1.20 numerous times, but subsequently started the final set as an underdog.

Calling Heather Watson's match 'swingy' against Dominika Cibulkova would be an understatement...
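To put the prices quoted for the Watson match into perspective, a decimal price can be converted into the probability it implies as 1 divided by the price.  The short sketch below does this for the quoted levels; it ignores commission and any bookmaker margin, and the function name is just an illustrative choice.

```python
def implied_probability(decimal_price):
    """Probability implied by a decimal price, ignoring commission and margin."""
    if decimal_price <= 1.0:
        raise ValueError("a decimal price must be greater than 1.0")
    return 1.0 / decimal_price


if __name__ == "__main__":
    # 1.05 and 1.20 are the in-play levels quoted above; 2.00 is the
    # even-money line, above which a player is the underdog.
    for price in (1.05, 1.20, 2.00):
        print(f"{price:.2f} -> {implied_probability(price):.1%}")
    # 1.05 -> 95.2%
    # 1.20 -> 83.3%
    # 2.00 -> 50.0%
```

So a player trading below 1.05 was being given more than a 95% chance of winning the match, and starting the final set as an underdog means that figure had dropped below 50%.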

We’ve also already seen how we can put Williams and Sharapova into the comeback box purely on the basis of statistics, as opposed to any super-human ability, and they are effectively the ‘reverse Watson’.  If Watson were to take the first set against either of these players in 2015, or even lead by a set and break, this would be one of the highest expected value scenarios possible in Tennis trading.  A ‘mortgage’ job, so to speak, if ever there was one.

But to conclude the article, I want to go back to the final week of the season, and assess the scenarios in Luxembourg and Moscow.

When I filtered for first set winners priced over 2.00 whose projected hold was no better than 5% above the WTA surface mean, there were 17 matches that fitted this profile.  This included seven first set winners who had a starting price of over 3.00 and a projected hold more than 5% below the WTA mean (so an extremely low projected hold).  This in itself is unlikely and is a very high proportion of underdogs winning the first set, but that isn’t the ultimate point I want to make, although it does illustrate how poorly pre-match favourites started matches this week.
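A filter of this kind could be sketched along the following lines.  Everything here is a hypothetical stand-in: the record fields, the placeholder rows, the surface mean of 63.0 and the reading of "5%" as five percentage points of hold are all assumptions for illustration, not values or structures from the actual database.

```python
from dataclasses import dataclass


@dataclass
class FirstSetWinner:
    name: str
    starting_price: float  # decimal pre-match price
    projected_hold: float  # projected hold percentage, e.g. 55.2


# Placeholder value only; the real WTA surface mean is not given here.
SURFACE_MEAN_HOLD = 63.0

# Made-up rows purely to exercise the filters.
rows = [
    FirstSetWinner("Player A", 6.00, 55.2),
    FirstSetWinner("Player B", 2.40, 66.0),
    FirstSetWinner("Player C", 1.50, 70.0),
    FirstSetWinner("Player D", 2.10, 69.5),
]

# Broad profile: underdog first set winner (price over 2.00) whose
# projected hold is no better than five points above the surface mean.
broad = [
    r for r in rows
    if r.starting_price > 2.00
    and r.projected_hold <= SURFACE_MEAN_HOLD + 5.0
]

# Stricter subset: price over 3.00 and a projected hold more than
# five points below the surface mean.
strict = [
    r for r in broad
    if r.starting_price > 3.00
    and r.projected_hold < SURFACE_MEAN_HOLD - 5.0
]

print([r.name for r in broad])   # ['Player A', 'Player B']
print([r.name for r in strict])  # ['Player A']
```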

Out of those 17 matches just 5 (FIVE) pre-match favourites broke first in set 2.  Due to the sensitivity of the data, I don’t want to talk numbers in detail but let me say that this is an absurd amount below expectation.  Just three of the 12 players that went a set and break down got the set back on serve, which is a woeful figure even further below expectation. 
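The raw rates behind those counts are easy to work out, even without the expectation figures they are being compared against.  The sketch below uses only the counts quoted above; the helper name is an illustrative choice.

```python
def rate(successes, trials):
    """Simple proportion of successes among trials."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return successes / trials


matches = 17
favourite_broke_first = 5
# The remaining favourites went a set and break down, which gives the
# 12 players referred to in the text.
set_and_break_down = matches - favourite_broke_first
got_back_on_serve = 3

print(f"Favourite broke first in set 2: {rate(favourite_broke_first, matches):.1%}")
print(f"Back on serve from a set and break down: {rate(got_back_on_serve, set_and_break_down):.1%}")
# Favourite broke first in set 2: 29.4%
# Back on serve from a set and break down: 25.0%
```

With samples of 17 and 12 matches these percentages are naturally noisy, which is one reason checking previous seasons is worthwhile.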

The matches where a comeback was the most likely were the following:-

Andrea Petkovic (lost to Pauline Parmentier 6-4 6-2) – Parmentier trained in set two, despite having a projected hold of 55.2% and starting at around 6.00 in price.  Not only this, Petkovic won just SIX return points in four games in set two.  This is a very, very rare scenario indeed.  Despite this, Petkovic managed to miraculously rediscover her game to take the Tournament of Champions event several weeks later in Sofia – a much more prestigious event…

Timea Bacsinszky (lost to Annika Beck 7-6 6-3) – Bacsinszky recovered to some extent, going a double break 3-0 down in set 2 before getting it back to 5-3, and then being broken.  With Beck going on to win the tournament, this indicates that perhaps Bacsinszky was just beaten by the better player on the day, although it’s important to remember that pretty much any WTA player breaking Beck, with her horrific serve, at some point is almost a given.

Roberta Vinci (lost to Annika Beck 7-5 6-0) – a pre-match favourite getting bagelled by the pre-match underdog in the second set, despite having lost a tight first set and the pre-match underdog having a very low projected hold (48.6%), is an incredibly rare scenario.  Out of the 172 completed second sets in the WTA season from the start of the calendar until the US Open where the first set score was 7-5, just five went 6-0 to the first set winner (2.9%).  It is highly likely that far fewer than five of those first sets were won by a pre-match underdog with a projected hold of under 50%.  If the bookmakers’ line was 0.5, I’d take unders...

Alize Cornet (lost to Kiki Bertens 6-2 6-3) – as with Bacsinszky she got a break back from a double break (4-1) down but lost the set 6-3.  Slightly tough to analyse this because Cornet had played an absolute ton of matches in Asia prior to this event but it’s important to remember that Bertens is woeful away from clay (at the time her 12 month hold/break percentage on hard/indoor prior to this match was 56.9% holds and 33.8% breaks – not even top 100 level).

Sabine Lisicki (lost to Denisa Allertova 7-5 6-2) – an atrocious defeat for Lisicki who was priced 1.22 prior to the match (my model said 1.18) although she did get set two back on serve from a set and break down, so somehow, despite falling apart from this point onwards, against a player with barely any main draw WTA experience, she actually qualified as a success for recovering the set and break down scenario. 

Svetlana Kuznetsova (beat Kateryna Kozlova 3-6 6-3 6-3) – Kuznetsova took the early break in set two and generally controlled matters from that point onwards.

Camila Giorgi (lost to Katerina Siniakova 7-6 4-6 7-5) – one of the highest likelihoods for a set two comeback that you’re likely to see.  Giorgi took the set despite going a set and break down at *2-1 in set two.

I plan on doing further research to see if this is a common tendency in the last main week of the WTA season.  Whilst some of these extremely high comeback likelihood players in the given scenario managed to fight back as expected (particularly Kuznetsova and Giorgi), some clearly waved the white flag and had very little interest or motivation to turn around a losing position.  Should research of previous years also prove to have similar results, backing heavy pre-match underdogs to win 2-0 or ‘-1.5 sets’ should have huge positive expectation.

If you are interested in the data mentioned in this article and would like to discuss this further, please feel free to email me.  Due to the sensitivity of the data and the obvious use it has for my own trading, I am in no rush to share it with anyone, although if a syndicate or high-stakes trader was interested, I would be happy to listen to proposals.
