13. Mai 2011
Sampling from a Poisson distribution - a benchmark
Having a poisson-distributed random variable X, how can we efficiently generate samples (or say realizations) of that random variable? This article shows how Knuth’s technique can be derived and it also indicates that the Ratio-of-Uniforms method is quite efficient.
X is the number of arrivals in a poisson process within a fixed period of time. The inter-arrival times follow an exponential distribution. The following small benchmark makes an attempt to compare the time necessary to compute samples using three different algorithms:
- Algorithm exp, do it yourself
- Algorithm uni by D. Knuth
- Algorithm rou, from Numerical Recipes
(Let’s peek at the results up-front: rou seems to win this challenge!)
exp – counting inter-arrival times
We want a random sample of the number of arrivals within a fixed period of time. We know the distribution of the inter-arrival times. By summing up inter-arrival times until the sum exceeds the fixed period of time (use the unit interval for convenience) we generate a sample number of arrivals. Formalizing this idea, we have a sequence of exponentially distributed random values:
The i-th interval represents the time until the i-th arrival. We then find a number k such that
This number k is the sought-after random sample.
double exponential(Ran *ran, double lambda) { double p; do { p = ran->doub(); } while (p == 0.); return -log(p) / lambda; } int poisson_exp(Ran *ran, double lambda) { double t = 0; int k = 0; while (1) { t += exponential(ran, lambda); if (t > 1) { return k; } else { k++; } } }
uni – counting inter-arrival times, improved
Usually, the inversion method is used for sampling from an exponential distribution.
Note the logarithm. It is the computationally most expensive operation when generating a sample inter-arrival time. Knuth’s variant gets rid of this nuisance. Here’s how it works
Finding a k that satisfies the last inequation only requires λ+1 uniform samples on average, reflecting in shortened computation times.
int poisson_uni(Ran *ran, double lambda) { int k = -1; double p = 1.; double lambda_exp = exp(-lambda); do { k++; p *= ran->doub(); } while (p > lambda_exp); return k; }
rou – using Ratio-of-Uniforms method
No explanation here (unfortunately…). Just take the algorithm from Numerical Recipes and see what it’s worth. Actually, I modified it to use Ratio-of-Uniforms for all values of λ for the sake of comparison. Sources are licensed and therefore cannot be listed here.
Benchmark
Taking different values for λ (1, 2, 4, 16, 32, 64) and generating a million random samples using each of the three algorithms on a notebook from 2008, here’s the results (interpolated).
g++ -O2 benchmark.cpp -I . -o benchmark ./benchmark
Numbers generated per algorithm: 1000000 Expected value: 1 method mean time [s] uni 1.000 0.130 exp 0.998 0.300 rou 1.036 0.310
Numbers generated per algorithm: 1000000 Expected value: 2 method mean time [s] uni 2.001 0.140 exp 2.000 0.460 rou 2.008 0.310
Numbers generated per algorithm: 1000000 Expected value: 4 method mean time [s] uni 3.999 0.190 exp 3.999 0.750 rou 4.000 0.300
Numbers generated per algorithm: 1000000 Expected value: 8 method mean time [s] uni 8.000 0.290 exp 8.000 1.340 rou 8.001 0.300
Numbers generated per algorithm: 1000000 Expected value: 16 method mean time [s] uni 16.002 0.470 exp 15.996 2.440 rou 16.000 0.220
Numbers generated per algorithm: 1000000 Expected value: 32 method mean time [s] uni 32.007 0.820 exp 32.001 4.710 rou 31.996 0.220
Numbers generated per algorithm: 1000000 Expected value: 64 method mean time [s] uni 63.982 1.530 exp 64.009 9.270 rou 63.995 0.210
Sources
benchmark.zip : In order to compile, you need to obtain files nr3.h, ran.h, deviates.h, gamma.h from Numerical Recipes and put them into the folder with the other source files.