## Triangulating Pi in Monte Carlo

I have made several posts on the basics of Monte Carlo integration techniques. I introduced the topic by using MC integration to estimate the value of pi. I then looked at how the convergence of the estimate depends on the number of trials, and after that at how the relative size of the area being estimated affects convergence. Now I want to put some of these pieces together and introduce a final basic concept: the choice of sampling distribution.

For this post I will return to the example of estimating pi from a quarter-circle inside a square, only this time I am going to change the shape of the search space from a square to a triangle. I want to do this because, as you can see from the chart at the right, all the "information" about pi is contained in the top right half of the square containing the quarter circle. A randomly selected point that falls in the bottom-left half tells us nothing about the shape of the curve - so why put any points there at all?

When a point is selected at random to be dropped on my search space, I am sampling from a probability distribution. Without explicitly stating it, my prior examples have all used the same one: my x values are drawn from a uniform distribution between 0 and 1, and my y values from an identical but independent distribution. My points, (x, y), are thus distributed across the unit square with equal likelihood of landing at any given spot.
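As a concrete reference point, the square-based scheme can be sketched as follows (this is my own minimal Python sketch; the function name and seed are mine, not from the earlier posts):

```python
import random

def estimate_pi_square(trials, seed=0):
    """Estimate pi by sampling (x, y) uniformly over the unit square.

    The fraction of points falling inside the quarter circle
    x^2 + y^2 <= 1 estimates pi/4.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.random()  # x ~ U(0, 1)
        y = rng.random()  # y ~ U(0, 1), independent of x
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / trials

print(estimate_pi_square(10_000))  # roughly 3.14, varying with the seed
```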

To take advantage of my observation about where the information is located in the x-y plane, I need to be a bit cleverer about my distribution: I want the probability of randomly selecting a point in the bottom left corner to be zero, and the probability of selecting any point in the top right to be constant.

I am doing this with one of two objectives in mind:
1. The area of the upper-right triangle is half that of the square, so for any given number of trials the density of points doubles if I use the triangle rather than the square. As my second post on MC demonstrated, this buys me a reduction in the error of my estimate by a factor of 1/sqrt(2). Alternatively, since the error scales as 1/sqrt(trials), I need only run half as many trials to get the same accuracy as when I use the whole square.

2. If the area of interest occupies a greater proportion of the sample space then, as discussed in my third MC post, my accuracy will increase further still (or I can use fewer trials). Here this effect actually works against me: the area under the curve occupies 57% of the triangle ((pi/4 - 1/2) / (1/2) = 57%) vs 79% of the square ((pi/4) / 1 = 79%), so I give back a little of the gain from point 1 above.

I start by considering the probability distribution I want to sample from to get my x values: I want every point in the triangle to be equally likely. Let A(x) be the area of the triangle to the left of a vertical line at x. Since the vertical slice of the triangle at position x has height x, A(x) = x^2 / 2 (area of a triangle = 1/2 x base x perpendicular height). Sampling uniformly over the triangle means the accumulated area should be uniform between 0 and the total area A, i.e. A(x) ~ U(0, A), and therefore:

\frac{x^{2}}{2}\sim U\left(0, \tfrac{1}{2}\right) \Rightarrow x^{2}\sim U(0, 1) \Rightarrow x\sim\sqrt{U(0, 1)}

Given x, the triangle runs from the anti-diagonal up to the top of the square, so y is then uniformly sampled between 1 - x and 1: y ~ U(1 - x, 1)
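Putting the two sampling steps together, a minimal Python sketch of the triangle scheme might look like this (function name and seed are mine; note that for the triangle above the anti-diagonal x + y >= 1, the conditional range of y is (1 - x, 1)):

```python
import math
import random

def estimate_pi_triangle(trials, seed=0):
    """Estimate pi by sampling uniformly over the triangle x + y >= 1.

    x is drawn as sqrt(U(0, 1)) so that the accumulated area x^2 / 2
    is uniform; y is then uniform on (1 - x, 1).  The fraction of
    points under the quarter circle estimates
    (pi/4 - 1/2) / (1/2) = pi/2 - 1.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = math.sqrt(rng.random())    # x ~ sqrt(U(0, 1))
        y = rng.uniform(1.0 - x, 1.0)  # y | x ~ U(1 - x, 1)
        if x * x + y * y <= 1.0:
            hits += 1
    p = hits / trials                  # estimates pi/2 - 1
    return 2.0 * (p + 1.0)

print(estimate_pi_triangle(3_634))  # roughly 3.14, varying with the seed
```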

Was it worth it? Using the square, the fraction of hits estimates pi/4. Using the triangle, it estimates pi/2 - 1 (adding 1 back recovers pi/2). Suppose I want the same % error in pi from both approaches, and I want to calculate the number of trials necessary using the triangle method to give me the same accuracy as 10,000 trials using the square. The relative error in pi takes the form sqrt(p(1 - p) / trials) divided by pi/4 for the square and by pi/2 for the triangle, so I solve:

\frac{\sqrt{\frac{\left(\frac{\pi}{2}-1\right)\left(2-\frac{\pi}{2}\right)}{trials_{triangle}}}}{\frac{\pi}{2}}=\frac{\sqrt{\frac{\frac{\pi}{4}\left(1-\frac{\pi}{4}\right)}{10{,}000}}}{\frac{\pi}{4}} \Rightarrow trials_{triangle}\approx 3{,}634
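The algebra above can be checked numerically in a few lines (a quick sketch of my own, not part of the original derivation):

```python
import math

p_sq = math.pi / 4      # proportion under the curve, square scheme
p_tr = math.pi / 2 - 1  # proportion under the curve, triangle scheme

# Relative error in pi: sqrt(p(1-p)/n), divided by pi/4 for the
# square and by pi/2 for the triangle.  Equate the two and solve
# for the triangle's trial count, given 10,000 square trials.
err_sq_factor = p_sq * (1 - p_sq) / (math.pi / 4) ** 2
err_tr_factor = p_tr * (1 - p_tr) / (math.pi / 2) ** 2
n_tr = err_tr_factor / (err_sq_factor / 10_000)

print(round(n_tr))  # -> 3634
```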

The charts below show 1,000 simulations of 10,000 trials using the square distribution on the left and 3,634 trials using the triangle distribution on the right. The blue points model a normal distribution. As you can see, their respective % errors are in the same ballpark!
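That comparison can be reproduced in miniature with the sketch below (my own code; I use 100 repetitions of each scheme rather than the 1,000 in the charts, purely to keep the runtime short):

```python
import math
import random

rng = random.Random(42)

def pi_square(trials):
    # (x, y) uniform on the unit square; hit fraction estimates pi/4
    hits = sum(1 for _ in range(trials)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * hits / trials

def pi_triangle(trials):
    # x ~ sqrt(U(0,1)), y | x ~ U(1 - x, 1); hit fraction estimates pi/2 - 1
    hits = 0
    for _ in range(trials):
        x = math.sqrt(rng.random())
        y = rng.uniform(1.0 - x, 1.0)
        if x * x + y * y <= 1.0:
            hits += 1
    return 2.0 * (hits / trials + 1.0)

def pct_error(estimates):
    # sample standard deviation of the pi estimates, relative to pi
    mean = sum(estimates) / len(estimates)
    var = sum((e - mean) ** 2 for e in estimates) / (len(estimates) - 1)
    return math.sqrt(var) / math.pi

sq = [pi_square(10_000) for _ in range(100)]
tr = [pi_triangle(3_634) for _ in range(100)]
print(f"square:   {pct_error(sq):.3%}")
print(f"triangle: {pct_error(tr):.3%}")
```

Both % errors should come out around 0.5%, confirming that 3,634 triangle trials buy roughly the accuracy of 10,000 square trials.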

If I haven't bored you to death with this yet, I have one more post I want to do on MC, which I hope will show a really interesting progression of these ideas: stratification.