Block Bootstrapped Monte Carlo – in R

roulette-wheel-gambling-5-wallpaper-1.jpg

A few weeks back i wrote a post including the source code for a Monte Carlo simulation function in R. The idea was to randomly sample daily returns produced by a backtest and build a confidence interval distribution of the middle 50% and 90% of returns. Since then Brian Peterson got in touch with me asking if i would work with him in getting some form of Monte Carlo simulation functionality standardized within the R blotter package. The blotter package forms part of the TradeAnalytics project on R-Forge (which includes “quantstrat”) where you will read the project description as…

“Transaction-oriented infrastructure for defining instruments, transactions, portfolios and accounts for trading systems and strategy simulation. Provides portfolio support for multi-asset class and multi-currency portfolios. Still in heavy development.”

Whilst working with Brian he suggested a few enhancements for the function in the original post, one of which was to include the option to perform a block bootstrap of the original backtest in order to capture any autocorrelation effects. This leaves 2 options as far as existing R packages go, namely; meboot() and boot().

 

MEBOOT

Now meboot() is a really neat package which i found out about thanks to this post by James Picerno over at The Capital Spectator. It eliminates the requirement for stationarity in a time series. Below is an example of what a meboot plot looks like using the same simSample file i used in the previous post with 1,000 samples (or replicates as the meboot package refers to it).

meboot_simSample

Looking at the above chart I am not sure it is the most appropriate tool for the job, since the sampling is done within a very narrow range giving you large swings which tend to revert back to the original backtest. To make sure though i also ran the meboot() function over the backtest results generated by the “longtrend” demo strategy script in the blotter package. Again, samples track the original backtest quite closely.

meboot_longtrend

Whether or not meboot is the best tool for the job will ultimately come down to what the sample statistics say and which metrics you use to determine whether or not to proceed with a strategy into production. Either way it may be a useful option to include in any standardized Monte Carlo simulation functionality within blotter.

 

BOOT

The boot package includes functions and datasets for bootstrapping from “Bootstrap Methods and Their Application” by A. C. Davison and D. V. Hinkley (1997, CUP). Of particular interest for time series bootstrapping is the tsboot() function, whose purpose is described in the vignette on CRAN as to “Generate R bootstrap replicates of a statistic applied to a time series. The replicate time series can be generated using fixed or random block lengths or can be model based replicates.”

Now the tsboot object returned by the function does not include the samples themselves, but instead you will have to call the boot.array() function to get the sample indexing. Off the bat though i wasn’t sure the tsboot function was appropriate for simply block sampling a time series, albeit very eloquently and efficiently achieving just that on top of calculating a statistic of each sampled series. So i spent some time debugging the function trying to learn how it goes about performing the block bootstrapping and managed to come up with my own function which produces the below graph after sampling simSample 1,000 times (R = 1000) using a block length of 5 (l = 5) [full source code below the graph].

mcblocksim_simSample

Your choice of block length (l) can be anything (as long as its less than the number of periods in your backtest) and how long you make your blocks could be based on a number of factors including the average length that your strategy is in the market or the average period in which your strategy experiences autocorrelation. Remember, the whole purpose of adding block bootstrap functionality was to account for autocorrelation characteristics in your backtest.

So, whilst the above mcblocksim() function is likely an improvement on the mcsimr() function i wrote about in my previous post, there is still quite a lot of work to do to get it ready for testing and committing to the blotter package. Particular to-do items include:

  • Revisiting the tsboot() function and determining whether or not some form of statistic can help users identify the optimal number of samples required
  • Potentially using the drawdowns() function in the Performance Analytics package as a statistic in the tsboot() function
  • Applying confidence intervals to the samples, thereby eliminating outlying returns (as in mcsimr())
  • Adding any other strategy-specific noise to the samples

So lots to do but as they say, Rome wasn’t built in a day. Hopefully you find some use from the above source code in the mean time.

Thanks to Brian Peterson for his valuable input thus far.

 

Thanks for reading.

 

Advertisements

How i used 18 lines of R code to crack the Triangle Challenge

I recently came across a question that required logic and coding skills to solve. It went a little something like this:

Let there be a stick of length 1. Pick 2 points uniformly at random along the stick, and break the stick at those points. What is the probability of the three resulting pieces being able to form a triangle?

Interesting i thought. I do enjoy these types of challenges, and had fun trying to solve this problem. It took me a while to understand that the longest side (or one of the longest sides in the case of an isosceles triangle)  had to be smaller than the sum of the 2 smaller sides. In pseudo code a triangle would be possible IF:

Length of longest side < sum of other 2 sides, which is true IF

Length of longest side < 0.5 (half the length of the stick)

When i saw (in my mind’s eye) me breaking a stick in half and folding it so that both pieces overlap, it became obvious that i needed one of the pieces to be longer and then break that piece into 2 pieces of any length to form a would-be triangle. As a side note, i have been hoping to create my first gif for a while and this might be a useful one to animate a stick being broken into half, folded over and then one half growing in length and then breaking into 2 and forming a triangle.

I turned to the statistical computing language R to solve my problem, knowing i could randomly draw values from a uniform distribution with the runif() function.

Continue reading