A few weeks back I wrote a post including the source code for a Monte Carlo simulation function in R. The idea was to randomly sample the daily returns produced by a backtest and build confidence bands covering the middle 50% and 90% of the sampled returns. Since then Brian Peterson got in touch asking if I would work with him on getting some form of Monte Carlo simulation functionality standardized within the R blotter package. The blotter package forms part of the TradeAnalytics project on R-Forge (which also includes “quantstrat”), where the project description reads:
“Transaction-oriented infrastructure for defining instruments, transactions, portfolios and accounts for trading systems and strategy simulation. Provides portfolio support for multi-asset class and multi-currency portfolios. Still in heavy development.”
Whilst we were working together, Brian suggested a few enhancements to the function in the original post, one of which was to include the option to perform a block bootstrap of the original backtest in order to capture any autocorrelation effects. This leaves two options as far as existing R packages go, namely meboot and boot.
Now meboot is a really neat package which I found out about thanks to this post by James Picerno over at The Capital Spectator. Its maximum entropy bootstrap does not require the time series to be stationary. Below is an example of what a meboot plot looks like using the same simSample file I used in the previous post with 1,000 samples (or replicates, as the meboot package refers to them).
Looking at the above chart I am not sure it is the most appropriate tool for the job, since the sampling is done within a very narrow range, giving you large swings which tend to revert back to the original backtest. To make sure, though, I also ran the meboot() function over the backtest results generated by the “longtrend” demo strategy script in the blotter package. Again, the samples track the original backtest quite closely.
Whether or not meboot is the best tool for the job will ultimately come down to what the sample statistics say and which metrics you use to determine whether or not to proceed with a strategy into production. Either way it may be a useful option to include in any standardized Monte Carlo simulation functionality within blotter.
The boot package includes functions and datasets for bootstrapping from “Bootstrap Methods and Their Application” by A. C. Davison and D. V. Hinkley (1997, CUP). Of particular interest for time series bootstrapping is the tsboot() function, whose purpose is described in the documentation on CRAN as follows: “Generate R bootstrap replicates of a statistic applied to a time series. The replicate time series can be generated using fixed or random block lengths or can be model based replicates.”
Now the tsboot object returned by the function does not include the samples themselves; instead you have to call the boot.array() function to get the sample indexing. Off the bat, though, I wasn’t sure the tsboot() function was appropriate for simply block sampling a time series, even though it achieves just that very elegantly and efficiently, on top of calculating a statistic for each sampled series. So I spent some time debugging the function to learn how it performs the block bootstrapping, and managed to come up with my own function, which produces the below graph after sampling simSample 1,000 times (R = 1000) using a block length of 5 (l = 5) [full source code below the graph].
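To make the tsboot() and boot.array() relationship concrete, here is a minimal sketch. It is not the function referred to above. It assumes a vector of placeholder daily returns (generated with rnorm() rather than the simSample data) and uses total return as an arbitrary example statistic.

```r
library(boot)  # recommended package, shipped with most R installations

set.seed(42)
# Placeholder daily returns standing in for a backtest (assumption, not simSample)
returns <- rnorm(250, mean = 0.0005, sd = 0.01)

# Fixed-length block bootstrap: R = 1000 replicates, block length l = 5.
# The statistic (total return) is only an example choice.
boot_out <- tsboot(returns, statistic = function(x) sum(x),
                   R = 1000, l = 5, sim = "fixed")

# boot_out$t holds one statistic per replicate, not the resampled series.
# boot.array() rebuilds the index matrix: one row per replicate.
idx <- boot.array(boot_out, indices = TRUE)

# Recover the first resampled return series and its cumulative path
first_sample <- returns[idx[1, ]]
first_path <- cumsum(first_sample)
```

Each row of idx can be used in the same way to rebuild a full resampled series, which is what you need if you want to plot the sampled equity curves rather than just summarise a statistic.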
Your choice of block length (l) can be anything (as long as it’s less than the number of periods in your backtest), and how long you make your blocks could be based on a number of factors, including the average length of time that your strategy is in the market or the average period over which your strategy experiences autocorrelation. Remember, the whole purpose of adding block bootstrap functionality was to account for autocorrelation characteristics in your backtest.
So, whilst the above mcblocksim() function is likely an improvement on the mcsimr() function I wrote about in my previous post, there is still quite a lot of work to do to get it ready for testing and committing to the blotter package. Particular to-do items include:
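For readers who want to experiment with block lengths, below is a rough base R sketch of a moving block bootstrap. The function name block_resample_sketch() and the placeholder returns are hypothetical and purely for illustration. It draws random block start points, stitches the blocks together and trims the result to the original length.

```r
# Hypothetical helper: resample a return series in blocks of length l.
# Returns a matrix with one column per replicate.
block_resample_sketch <- function(returns, R = 1000, l = 5) {
  n <- length(returns)
  stopifnot(l >= 1, l < n)      # block must be shorter than the backtest
  n_blocks <- ceiling(n / l)    # enough blocks to cover n periods
  replicate(R, {
    # Random start of each block, chosen so a block never runs off the end
    starts <- sample.int(n - l + 1, n_blocks, replace = TRUE)
    # Expand each start into l consecutive indices, then flatten
    idx <- as.vector(vapply(starts, function(s) s + 0:(l - 1), numeric(l)))
    returns[idx[seq_len(n)]]    # trim to the original length
  })
}

set.seed(123)
# Placeholder daily returns (assumption, not real backtest output)
returns <- rnorm(250, mean = 0.0005, sd = 0.01)

sims <- block_resample_sketch(returns, R = 1000, l = 5)

# Cumulative (compounded) path for each replicate
paths <- apply(sims, 2, function(r) cumprod(1 + r))
```

Because each block is a contiguous run of the original series, any short-range autocorrelation within l periods is carried into the samples. Dependence across block boundaries is not preserved.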
- Revisiting the tsboot() function and determining whether or not some form of statistic can help users identify the optimal number of samples required
- Potentially using the drawdowns() function in the PerformanceAnalytics package as a statistic in the tsboot() function
- Applying confidence intervals to the samples, thereby eliminating outlying returns (as in mcsimr())
- Adding any other strategy-specific noise to the samples
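As a rough idea of how the confidence interval item in the list above might look, here is a base R sketch that computes the middle 50% and 90% bands across simulated paths at each period. The returns are placeholders, and plain resampling with replacement is used only to keep the example short. The band names are my own labels.

```r
set.seed(1)
n <- 250   # periods in the (placeholder) backtest
R <- 1000  # number of simulated paths

# Placeholder daily returns (assumption, not real backtest output)
returns <- rnorm(n, mean = 0.0005, sd = 0.01)

# Simulated cumulative return paths: one column per replicate
sims <- replicate(R, cumsum(sample(returns, n, replace = TRUE)))

# Quantiles per period: 5%/95% bound the middle 90%, 25%/75% the middle 50%
probs <- c(0.05, 0.25, 0.75, 0.95)
bands <- t(apply(sims, 1, quantile, probs = probs))
colnames(bands) <- c("lower90", "lower50", "upper50", "upper90")

head(bands)
```

Plotting the four columns of bands against time gives an envelope around the simulated paths without drawing every individual sample.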
So lots to do, but as they say, Rome wasn’t built in a day. Hopefully you find some use for the above source code in the meantime.
Thanks to Brian Peterson for his valuable input thus far.
Thanks for reading.