The recent FIFA World Cup gave us more statistics than ever before. We knew instantly how many shots on goal England had in the semi-final against Croatia (one), which Premier League team had the most players represented in the last four (Spurs) and how many waistcoats Gareth Southgate has in his wardrobe at any given time (ok, maybe not).
Whilst cricket has long been the sport of choice for UK mathematics aficionados, much as baseball has been in the US, Moneyball has famously come to soccer (sorry, football) with Optasports and others. For some it has shaped recruitment strategies and tactics (I understand it’s how Jordan Henderson continues to be selected…), for others it has given the armchair fan additional insight. These statistics are the result of a combination of human input (counting shots on target, or how many times a player touches the ball) and electronic measurement (timing the ball in play or the distance run by each player).
Another form of statistics has long been covered by a marketing buzzword – Big Data – where the power of the web is harnessed to bring us information of hitherto unfathomable scale from myriad sources. My favourite of the last few weeks is a fantastic report by the New York Times which looks at performance in marathons and half marathons, broken down by the shoe style worn. I’ve referenced Nike’s Breaking2 project in this column previously, but this article focusses on the commercial output of that venture – the Vaporfly 4%.
Elites or “Normos”?
The original Nike-funded research into the Vaporfly 4% showed that the 18 (count them) elite male athletes were around 4% more energy efficient when tested over various distances on a treadmill whilst wearing the shoe – which, it was claimed, could lead to a time gain of around 3% over the marathon distance.
Now that the shoe has been generally available to us “sub-elites” (or normos, if you will) for around a year, more extensive, real-world testing can be done. Step forward Strava. Our favourite run / cycle / swim tracking app has the option for every run to be tagged with the pair of shoes worn. Apparently around a third of us do this in races (the report doesn’t state whether that is borne out for training runs as it’s focussed on performance), so the NYT has been able to pull together that data and the published results to examine if and how race-day shoe selection affects performance.
The results are clear cut in relation to the star of the show, seemingly affirming Nike’s own startling assertions, but also give a great tail of data for the top fifty running shoes. The authors concede that there are some questions left unanswered, but the article is deep and covers a lot of interesting factors, from statistical modelling to weather, training mileage and the propensity of ambitious runners to make certain shoe selections.
Proper BIG Data
The sample size for some of this latest analysis runs into the tens of thousands, so from that perspective it’s a very robust report. We’ve all had reason to question opinion poll data over the last couple of years. Famously, predictions of the elections on both sides of the Atlantic, and of course of the UK’s EU referendum, have been inaccurate, in some cases spectacularly so. In terms of scale, typical sample sizes for UK election polls are 5,000 which, whilst substantial, is maybe not enough on its own to predict those results reliably. Of course, it should be noted that there are other factors at play there too. Messrs Trump and Putin can continue to debate this, I’m not going to get involved.
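For the curious, here is a rough sketch of what sample size alone buys you. It uses the textbook margin-of-error formula for a proportion at 95% confidence, and it assumes a perfectly random sample, which real opinion polls and self-tagged Strava runs are not. The function name and the sample sizes in the loop are my own illustrative choices, not figures from the NYT report.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z_score=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes simple random sampling (an idealisation). A proportion of 0.5
    is the worst case, and a z-score of 1.96 corresponds to 95% confidence.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z_score * math.sqrt(proportion * (1 - proportion) / sample_size)


# Illustrative sample sizes only: a small panel, a poll, a large data set.
for n in (100, 5_000, 50_000):
    print(f"n = {n:>6}: about +/- {margin_of_error(n) * 100:.1f} percentage points")
```

On these assumptions a sample of 5,000 gives a margin of roughly 1.4 percentage points either way, so when polls miss by more than that, the make-up of the sample is likely to matter at least as much as its size.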
This leads us to questions closer to home. What are our sample sizes when we make decisions on product selection, on colour choice and on our plans for next season? Are we managed by gut reaction or the experience of a small number of experts? Are we careful to observe current market conditions and demands, or better still, able to include some prediction or modelling of what might happen in the future?
Looking forward to success
Wherever our businesses are on this spectrum, it is often relatively easy to improve the quality of the data that we are using, especially in terms of consumer input. This could range from increasing the sample size for purchase decisions from one lone buyer to the views and experience of several staff, through adding consumer focus groups of a room-full of people, to utilising an online platform to canvass ideas, feedback and contributions from an unlimited number of consumers and potential consumers. Each of these is preferable to the one before it, and can confirm an initial hypothesis, completely change tactical direction, or anything in between. If we are open to these possibilities whilst still trusting our own judgement, we have already begun our own journey to waistcoat nirvana.