Listen Up, Backtesting is NOT Real-Time

06Aug10

I’ll take a month of real-time trading over a 10-year backtest any day of the week.

That may be a surprise coming from someone whose blog almost wholly consists of backtests, but at the end of the day I know (and so should you) that NONE of this really matters. The only thing that matters is what you do in real-time, audited.

That’s why I don’t release backtests for our own proprietary strategies and I raise a wary eyebrow anytime I see a sexy backtest bandied about.

It’s not that I don’t trust the backtester. There’s a whole universe of good folks whose workmanship I respect very much, but at the end of the day ‘the best laid schemes of mice and men often go awry’.

The table above lists the evil little demons that lead our backtests astray. None of these are new…just a reminder of what we already know (but sometimes forget).

There are things we should be able to control…the hard-and-fast, black-and-white “math” of backtesting.

Have we accurately modeled the trading environment including transaction costs, slippage, realistic quotes, and survivorship bias? Small mistakes here compounded = hugely inaccurate results.
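To put a rough number on how a small modeling mistake compounds, here is a minimal sketch. The per-trade gain, the cost, and the trade count are all hypothetical figures picked for illustration, not numbers from any strategy discussed on this blog.

```python
def compound(gross_per_trade: float, cost_per_trade: float, n_trades: int) -> float:
    """Growth of 1.0 of capital after n_trades, each earning gross minus cost."""
    net = gross_per_trade - cost_per_trade
    return (1.0 + net) ** n_trades

# Hypothetical inputs, for illustration only
gross = 0.002   # 0.20% average gross gain per trade
cost = 0.001    # 0.10% round-trip commissions plus slippage
trades = 1000   # roughly 100 trades a year for 10 years

no_cost = compound(gross, 0.0, trades)
with_cost = compound(gross, cost, trades)

print(f"backtest ignoring costs: {no_cost:.2f}x starting capital")
print(f"backtest including costs: {with_cost:.2f}x starting capital")
```

Under these assumptions, a tenth of a percent per trade is the difference between a backtest that multiplies capital several times over and one that does far less, which is why the "math" side has to be right before anything else matters.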

Have we built a mathematically-sound model? You would be shocked to know how often I test published strategies only to find that the results are rubbish (reader beware).

And there are things we can try to control but never totally will…the far more fuzzy “art” of backtesting.

We have to cope with curve-fitting and other biases (read more from CXO), markets that are constantly evolving at the most fundamental level (an example) and markets that inevitably go through abrupt “abnormal” periods where nothing works the way it’s supposed to (another example).
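One way to see the curve-fitting problem is to test a large batch of strategies that have no edge at all and then keep only the best backtest. The sketch below does that with simulated zero-mean daily returns. The strategy count, the volatility, and the seed are arbitrary assumptions, and the "strategies" are pure noise rather than anything tradable.

```python
import random

def simulate_no_edge(rng: random.Random, n_days: int) -> list:
    """Daily returns of a strategy with zero expected return (pure noise)."""
    return [rng.gauss(0.0, 0.01) for _ in range(n_days)]

def total_return(daily_returns: list) -> float:
    """Compound a list of daily returns into one total return."""
    value = 1.0
    for r in daily_returns:
        value *= 1.0 + r
    return value - 1.0

rng = random.Random(42)            # fixed seed so the run is repeatable
n_strategies, in_days, out_days = 200, 250, 250

results = []
for _ in range(n_strategies):
    daily = simulate_no_edge(rng, in_days + out_days)
    in_sample = total_return(daily[:in_days])     # the "backtest" year
    out_sample = total_return(daily[in_days:])    # the year that follows
    results.append((in_sample, out_sample))

# Pick the winner by its backtest alone, as a data miner would
best_in, best_out = max(results, key=lambda pair: pair[0])
avg_in = sum(pair[0] for pair in results) / n_strategies

print(f"average backtest return:  {avg_in:+.1%}")
print(f"best backtest return:     {best_in:+.1%}")
print(f"same strategy, next year: {best_out:+.1%}")
```

The best backtest always looks better than the average one, yet by construction none of these strategies has any edge, so the winner's following year is just another draw of noise.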

. . . . .

The only way – the ONLY WAY – to judge how well a trader has responded to this myriad of challenges is real-time audited results.

It’s easy to churn out fancy charts showing what has worked in the past. It’s a very different thing to put yourself on the line each and every day.

Done right (independently-audited without cherry-picking) there are no mulligans. Your moments of glory and defeat, of brilliance and stupidity, are laid bare.

We the investing community get way too excited about sexy backtests and make way too half-hearted a demand for the real-time.

In my mind, in your mind, in all of our minds, one month of real-time audited trading should mean more than 10 years of backtesting any day of the week.

Happy Trading,
ms




18 Responses to “Listen Up, Backtesting is NOT Real-Time”

  1. 1 Lon

    I love to do research and have been working on timing systems since the 80s. One day I realized all testing is back-testing. Whether it is a day, a week or ten years. It is the only kind of testing we have. Even though we may have done it in real time it actually means no more than a back-test if the back-test is a reflection of what you would have really done. If you adjust to get the most gain then it is useless but if you do something like the last 3 and first 2 of the month it doesn’t matter how far back you go you would have gotten those actual results.
    What will fail is like when I used Monocle and adjusted and adjusted back-tests to peak them out. Once you have the system there was no way to know if it would ever work again. Moving average systems will work again and again but that doesn’t mean you won’t have a bad loss if you don’t have stops in place. Timing is always a challenge but always better to me than buy and hold. The draw-downs in timing systems are why we all give up. Get the drawdown down and you can do it forever.

  2. 2 CarlosR

    Michael,

    Truer words were never spoken. I can’t tell you how many systems and strategies I’ve backtested and obtained great results, only to have them fall apart in the real world.

    The ironic thing (and I think you’ll agree) is that I find backtesting to be a pretty much essential step in strategy development. If a new strategy doesn’t backtest well, then I’m certainly not going to start trading it. On the other hand, if it does backtest well, that’s when you need to start *really* being cautious.

    Of all the items you listed as possible causes of backtesting problems, the one that gives me the most trouble is the changing nature of the markets. While that’s a problem for any strategy developer, it is particularly serious if you focus on high-reward short-term strategies, which I do. They are what I call “disposable” strategies, and believe me, I have disposed of many, mostly because the market changed on me. But, at the end of the day, you pays your money and you picks your poison.

    Thanks again for the blog.

    =Carlos=

  3. 3 Juan Carlos Christensen

    Some of your points don’t really apply to the “backtesting real-time” issue.
    One thing is the backtesting methodology, and another (related but different) thing is the model itself.
    The changing behavior of the markets is not an argument against backtesting, it’s an argument against non-adaptive models. Also, with that same argument, real-time and backtesting suffer the same problem under evolving markets, if used with a non-adaptive model.
    Backtesting is just an analysis procedure that helps us get a greater understanding of market dynamics. It’s an approximation of real-time, and thus not expected to give exact comparative results.
    Other than that, in my view good model-making needs both things, in-sample and out-sample backtesting, and forward testing on real-time.

  4. Thank you. I couldn’t agree more b/c it is SOOOO TRUE!!! Back-test all you want and in the end when you ACTUALLY start to TAKE TRADES then and ONLY THEN will you find out just exactly what it takes TO MAKE MONEY!!! And even trading the same market, day-in and day-out, you WILL FIND “abnormalities” and unexpected issues arising. Systems and analysis ALWAYS need to be massaged to fit the market environment one finds themselves in THAT DAY and/or MOMENT trading.

    Thanks again for your great post here AND I must admit I also love your tests and great back-testings that you do….it’s all good info to have and to “store away” in the brain b/c it does add even more depth to our trading and/or investing…but it surely is NOT the holy grail…the only holy grail in this industry is RESULTS and that equates solely and purely to one thing…MONEY!!!

    Thanks again and keep up the great work!!

    Trade/Invest PROFITABLY everyone!!

    LL

    MT :)

  5. 5 Blue

    Great post. Lon makes a good point too. Even 1-month of real-time audited trading constitutes a back test after the month is over. Everyone has heard — many times — of people who made lots of money for a while and then blew up.

    Markets adapt to everything, and black swans come out of nowhere. There’s no way to change that.

  6. 6 Pete

    I think Michael’s talking up his own book as a professional strategy developer in this post. And that’s cool, of course. It’s great that Michael can highlight his real-time audited results as a means of attracting funds. And in this context I think he’s right – better to place your hard-earned cash with a manager who has a multi-year audited real-time performance rather than one who just has a multi-year back-test.

    However, in a broader context I’m not sure I agree that a month of real-time trading is better than a 10-year backtest. For me, neither of these is good enough.

    I agree that poor testing is fool’s gold. I agree that testing can never fully replicate real-time trading. But I disagree that one month of real-time results tells you anything meaningful (not for a swing trading strategy anyway).

    Testing is an extremely important aspect of the trading strategy developing process and Michael no doubt agrees with me given the content of his blog over the last 2 years. For me, methodology + real-time results is the ideal combination. Michael clearly has both.

    • 7 MarketSci

      RE to Pete: thanks for the kind words sir.

      Two points of clarification…

      I knew someone would mention “talking the book” but I wholeheartedly disagree. I don’t believe what I believe because I sell based on audited real-time track records. I sell based on audited real-time track records because I believe what I believe. I could very easily wow readers with backtests for our programs but (as anyone who has ever emailed me and asked for a backtest knows) I refuse to play that game. It hurts our revenue no doubt, but my money is where my mouth is.

      The 1 month vs 10 years comparison was a little bit hyperbolic on my part. Yes, 1 month of real-time trading is pretty useless. The bigger point was that a 10-year backtest is not any less useless.

      michael

  7. mmmh – I agree with some points in your post but I still think that a long evaluation period (even more than 10 years) provides some good info with regards to how the strategy deals with “things we will never fully control” such as market changes, abnormal markets, etc.

    From a personal point of view on MY back-tests, I would get more reassurance from a trading system backtest over several decades than a live result over even a few years where markets might have just been especially good for your strategy at the time – with the system possibly breaking down further down the line…
    Look at that famous equity curve for instance:
    http://www.automated-trading-system.com/variance-risk/
    Very profitable as real-time reported results for over 3 years…

    From a system promotion point of view, I agree it is all too easy to just apply a good dose of data mining to your back-test process to generate fabulous “past-only” equity curves…
    Maybe a solution to this would be to require system promoters to publish robustness measures of their back-tests.

    Ideally you’d want 50 years of past live trading results, independently audited, etc… But this would limit the field of possible investments…

  8. 9 Jeff

    “Accurately modelling” is a contradiction to “curve fitting” (in your pros and cons of backtesting). BTW- White et al. have done a good job at addressing curve fitting.

    • 10 MarketSci

      RE to Jeff: (a) this is not a pros and cons of backtesting or anything remotely resembling, (b) “accurately modeling” is totally unrelated to “curve fitting” (and if you read what I meant by “accurately modeling” that would be abundantly clear), and (c) many, many careers have been made addressing curve fitting and yet, despite our best efforts, it still exists as something we will never totally rid ourselves of.

      Apologies if that all seemed snarky, but it irks me when folks leave comments that make clear the fact that they didn’t actually read the post. michael

  9. 11 Perry Kaufman

    I’ve been an outsider looking in for some time and I appreciate the common sense that is seen throughout this blog, so don’t take my comments as critical.

    Certainly, you are right that actual performance is more important than backtesting; however, we have no other option when developing a new strategy than looking at how it performed in the past. I can’t distinguish your “accuracy testing” from any other backtesting — it must all be done properly.

    When I first started testing new ideas, in about 1970, the results would have been overfit but markets were much more trending, so any choice of a trend period would have made money. Of course, I didn’t realize that until much later. When the market trends, all trending methods work. The real challenge is to narrow the gap between expectations and reality. I think that can be done although “unusual” markets will always appear to disrupt those expectations.

    I am most interested in all of your thoughts on how to make backtesting a valuable process, rather than focus on the negatives. It needs to be a valuable process to make our trading profitable. I will be happy to share some of my experiences as well.

    For example, let me suggest that choosing a single parameter set from the results of backtesting is not the right decision. That implies that it will be the “right” choice for the future and is most likely to produce high variability in future performance. Perhaps a better solution is using multiple parameters, or even all (reasonable) parameters. If you use all parameters and set expectations as the average of all the results, would that be overfitting?

  10. 12 John French

    Michael, please delete the first post it went off prematurely! The full one should read:

    A topic close to my heart. The Monkey on our backs etc. Backtests that compound until one owns the known universe and so on.

    I have been churning out trading models in Excel for many years now and have learnt the hard way about curve-fitting, data-snooping etc. etc. Gradually I have uncovered some concepts that do (a) work and (b) appear to have a solid basis for doing so. Nothing spectacular and they need to be mixed in with a good dose of patience and discipline.

    It’s all about real time for me too but I think for a lot of people designing trading systems is really a sort of hobby i.e. they can then pontificate about “their” models online when realistically they have no real intention or interest of trading them and hence when one blows up they will simply churn out something else having suffered no pain en route. For me it is something I like to do but ultimately it is about my brokerage account pure and simple.

    My models are almost exclusively NDO only which I think is most realistic in many ways and I do factor in margin costs and so on. My aim is to produce tradeable models i.e. ones with low DDs which hopefully do not have long flat periods whilst the market(s) is/are off to the races (hence the patience and discipline!). My latest effort is alive and well after six months and performing pretty well as expected which I am very happy about.

    I am grateful to you, David V and CXO for fast-forwarding much of my progress.

    John

  11. Great post as always. With all the backtesting I’ve done and tweaking systems to give one more percentage point in annual yield, I’ve started to conclude that backtesting can at best provide the developer a “warm fuzzy feeling” that the system has a good chance of success in the future. Like you say, all that counts is the audited real-time results.

  12. 14 sl

    Hi Michael
    Your post is so true – but it leads to the next question:
    How does one compare real trade data between two programs if one has a long life (greater than 5 years) and the other a short life (less than two years)? It becomes difficult if the “new program” has a better short-term performance than the “old program” but the CAGR of the “old program” is better. I guess the problem becomes:
    1) how do you give extra credit to the long running programs with good performance
    2) how do you give extra credit to the recent performance over older performance

  13. 15 CarlosR

    One follow-up thought re backtesting: while I agree 100% with you about the relative importance of audited real-time results vs. backtests, I think it’s worth differentiating among backtests.

    Someone who has developed a strategy through backtesting, but then left himself a decent out of sample period over which to validate the strategy is way, way ahead of someone with no OOS testing, in my opinion. Say you backtested for 10 years, but didn’t include the most recent year (you developed the strategy using the 10-years of data), and then you run it over the most recent year and it works great — that gives you substantial confidence. (and, of course, if it doesn’t work great, you are NOT allowed to use the most recent year to tweak it!!)

    So I guess that maybe we can say that one-month of audited results is only worth 5 years of backtesting with a good OOS period! :)
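
The out-of-sample discipline CarlosR describes in the comments can be sketched roughly as follows: tune on the first 10 years, then run the chosen setting exactly once on a held-back year. Everything here is an assumption made for illustration: the prices are simulated, the moving-average rule is a stand-in for whatever strategy is being developed, and the candidate windows are arbitrary.

```python
import random

def ma_strategy_return(prices: list, window: int) -> float:
    """Total return from holding the asset for a day only when the
    latest close is above the average of the prior `window` closes."""
    value = 1.0
    for t in range(window, len(prices) - 1):
        avg = sum(prices[t - window:t]) / window
        if prices[t] > avg:                  # signal uses data up to day t only
            value *= prices[t + 1] / prices[t]
    return value - 1.0

# Simulated prices: 11 "years" of 250 trading days (hypothetical data)
rng = random.Random(7)
prices = [100.0]
for _ in range(11 * 250):
    prices.append(prices[-1] * (1.0 + rng.gauss(0.0003, 0.01)))

# Hold back the most recent year before doing any tuning
split = 10 * 250
in_sample, out_sample = prices[:split], prices[split:]

# Tune on the in-sample data only
candidates = [10, 20, 50, 100, 200]
best_window = max(candidates, key=lambda w: ma_strategy_return(in_sample, w))

# Evaluate once on the held-back year; no re-tuning allowed afterwards
oos_return = ma_strategy_return(out_sample, best_window)

print(f"window chosen in-sample: {best_window}")
print(f"in-sample return:        {ma_strategy_return(in_sample, best_window):+.1%}")
print(f"out-of-sample return:    {oos_return:+.1%}")
```

The point of the structure is that `out_sample` is never touched by the `max(...)` selection step. If the out-of-sample number disappoints and the window is then changed, the held-back year has become in-sample data and no longer validates anything.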


