I wanted to show the process by animating at least one of the charts in this post, but after hours of reading about the R "animation" package and trying to figure out how to use SWFTools, I gave up! If any kind soul wants to help me figure out how to animate a series of charts produced in R, don't hold back! I will put up the R code I used to generate the charts in this post - feel free to download and experiment with it.
In my previous posts, we have seen that
- The absolute error of an integration depends upon the number of trials
- The absolute error depends upon the degree to which the integrand fills the space bounding the simulation
- We have the flexibility to change the size and shape of the space bounding the simulation if it suits us
My objective is to complete an integration to within a desired accuracy. If I knew the answer ahead of time, I would know exactly how many trials to run. One strategy would be to run successive rounds of simulation, gradually increasing the number of trials until I hit my target. Better yet, I could adopt a strategy that retains the value of each trial and focuses additional trials only where needed.
I want to calculate the value of pi by integrating the area under a quarter circle contained inside a square. I want the answer to be sufficiently accurate that two times the standard error in the estimate of the area is less than 0.1% of the area estimated.
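As a baseline, here is a minimal sketch of the plain, unstratified version of that calculation (in Python rather than R, with function names of my own choosing): throw points at the unit square, count the fraction falling under the quarter circle, and check two standard errors against the 0.1% criterion.

```python
import math
import random

def estimate_quarter_circle(trials, seed=42):
    """Unstratified Monte Carlo over the unit square: count points
    falling under the quarter circle x^2 + y^2 <= 1."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    p = hits / trials                      # area estimate (the square has area 1)
    se = math.sqrt(p * (1 - p) / trials)   # standard error of that estimate
    return p, se

area, se = estimate_quarter_circle(1_000_000)
print("pi ~", 4 * area)
print("2 * SE as % of area:", 100 * 2 * se / area)  # target: below 0.1%
```

Run this way, it takes on the order of a million trials just to approach the 0.1% criterion, which is what motivates spending the trials more selectively.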
I start by running 500 trials over the whole bounding square to get a first estimate of the area and its absolute error. Then I divide the space in two, once vertically and once horizontally, and repeat the process on each half. Even though, due to the symmetry of this case, it makes no difference which way I divide the area, the stochastic nature of the method means one of the two answers will have a lower absolute error than the other. I select that answer and discard the other; I now have results for the total area divided into two strata. More generally, the choice will depend on the shape of the curve: it is driven by the degree to which the integrand "fills" each stratum in the pair, and a stratum that is 50% full has the highest absolute error.
If the total absolute error is sufficiently small that my accuracy criterion is met, I stop right there.
If not, I select the stratum with the worst absolute error. I divide it in two, first horizontally and then vertically, and run 500 trials on each stratum of the two resulting pairs. I estimate the absolute errors for each pair and combine them (root sum of squares). I discard the pair of strata with the worse combined error and retain the other pair. By loading 500 trials into a smaller area (half as big as before) I have doubled the density of my trials; I am getting a level of accuracy as if I had started out using 1000 trials. Adding the areas and errors from these two strata and the undivided strata from before, I now have an updated estimate of the total area and total absolute error.
If the total absolute error is sufficiently small that my accuracy criterion is met, I stop right there. Otherwise I repeat the process again and again until I have achieved my target error!
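The whole loop above can be sketched as follows. This is my own minimal Python reconstruction, not the post's R code: the 500-trial and 0.1% settings follow the description, the plain binomial formula stands in for the error estimate (the 5%/95% wrinkle mentioned at the end of the post is left out), and the function names are mine.

```python
import math
import random

TRIALS = 500     # trials per stratum, as in the post
TARGET = 0.001   # stop when 2 * (total absolute error) < 0.1% of the area

def sample_stratum(x0, y0, w, h, rng):
    """Throw TRIALS points into one rectangle; return (area, abs error)."""
    hits = sum(1 for _ in range(TRIALS)
               if (x0 + w * rng.random()) ** 2
                + (y0 + h * rng.random()) ** 2 <= 1.0)
    p = hits / TRIALS
    cell = w * h
    return cell * p, cell * math.sqrt(p * (1 - p) / TRIALS)

def integrate(rng):
    # A stratum is (x0, y0, w, h, area, err); start with the whole square.
    strata = [(0.0, 0.0, 1.0, 1.0) + sample_stratum(0.0, 0.0, 1.0, 1.0, rng)]
    while True:
        area = sum(s[4] for s in strata)
        total_err = math.sqrt(sum(s[5] ** 2 for s in strata))  # root sum of squares
        if 2 * total_err < TARGET * area:
            return area, len(strata)
        # Split the worst stratum both ways, 500 trials per half,
        # and keep whichever pair of halves has the lower combined error.
        strata.sort(key=lambda s: s[5])
        x0, y0, w, h, _, _ = strata.pop()          # worst absolute error
        cuts = ([(x0, y0, w / 2, h), (x0 + w / 2, y0, w / 2, h)],   # vertical cut
                [(x0, y0, w, h / 2), (x0, y0 + h / 2, w, h / 2)])   # horizontal cut
        pairs = [[box + sample_stratum(*box, rng) for box in cut] for cut in cuts]
        strata.extend(min(pairs, key=lambda pr: pr[0][5] ** 2 + pr[1][5] ** 2))

area, n_strata = integrate(random.Random(3))
print("pi ~", 4 * area, "using", n_strata, "strata")
```

The splitting concentrates strata along the quarter circle, since strata that are all-inside or all-outside report near-zero error and are left alone.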
The three charts at the right show the process run three different ways for comparison:
- 500 trials, 0.1% accuracy
- 1000 trials, 0.1% accuracy
- 500 trials, 0.01% accuracy
Note what happens when 1000 trials are used instead of 500: there is less stratification, and the strata are generally bigger. The procedure achieves the required level of accuracy without having to sub-divide (stratify) the space as finely. The trade-off is that many more trials are "wasted" on areas away from the vicinity of the curve (where the information is).
Finally, note what happens when I demand greater accuracy from my 500-trial simulations: the area of interest is stratified to a much greater degree. I never thought to measure this effect, but I would guess that the stratification goes four times deeper to get half the error.
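That guess is consistent with the 1/sqrt(trials) behaviour of the binomial error formula; a quick check (my own illustration, using the quarter-circle fill fraction):

```python
import math

p = math.pi / 4                           # fraction of the unit square under the curve
se = lambda n: math.sqrt(p * (1 - p) / n)  # standard error with n trials
print(se(500) / se(2000))                  # quadrupling the trials halves the error
```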
I wish I could figure out the animation because it is very cool to watch the algorithm work!
I explored the trade-offs among trials, accuracy and time by measuring the average time to complete an integration using 100 to 1000 trials and 0.1% to 0.01% accuracy. I did not find the results particularly informative; I suspect the curve you are trying to integrate has a big effect. I expected there to be an optimum trial size, resulting from excessive stratification when "trials" is too small versus wasted effort when "trials" is too large. I am not sure I really saw that, though about 500 trials seemed to go much quicker than 100 and a little quicker than 1000. More work needed!
Following is the R code. The one little wrinkle I introduced was to set the error to 100% whenever fewer than 5% or more than 95% of the points in a simulation fall under the curve. With so few points sitting on one side of the curve, it is not really appropriate to use sqrt(p(1-p)/trials) to estimate the error in p. Setting the error to one (100%) in this case simply forces that stratum to be split further, as it will automatically be the worst existing stratum.
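In Python, that wrinkle would look something like this (my own sketch of the rule as described, not the post's R code):

```python
import math

def stratum_error(hits, trials):
    """Error of the fraction-under-curve estimate for one stratum,
    with the post's wrinkle: if fewer than 5% or more than 95% of
    points landed under the curve, report 100% error so the stratum
    is always the next one to be split."""
    p = hits / trials
    if p < 0.05 or p > 0.95:
        return 1.0                          # 100%: force a further split
    return math.sqrt(p * (1 - p) / trials)  # usual binomial standard error

print(stratum_error(250, 500))  # balanced stratum: ~0.0224
print(stratum_error(499, 500))  # nearly one-sided: clamped to 1.0
```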
Have fun with MC. That's all I have planned to do on it; it's time for some different topics!