Let’s take a look at the gamma distribution:
$$f(w) = \frac{1}{\Gamma(\alpha)\theta^{\alpha}} w^{\alpha-1} e^{-w/\theta}, \quad w > 0$$
I don’t know about you, but I think it looks pretty horrible. Typing it up in LaTeX is no fun and gave me second thoughts about writing this post, but I’ll plow ahead.
First, what does it mean? One interpretation of the gamma distribution is that it’s the theoretical distribution of waiting times until the $\alpha$-th change for a Poisson process. In another post I derived the exponential distribution, which is the distribution of times until the first change in a Poisson process. The gamma distribution models the waiting time until the 2nd, 3rd, 4th, 38th, etc., change in a Poisson process.
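To make that interpretation concrete, here’s a minimal simulation sketch. It leans on a standard fact about Poisson processes that this post doesn’t derive: the gaps between successive changes are independent exponential waits, so the wait until the $\alpha$-th change is a sum of $\alpha$ of them. The names `alpha` and `lam` (the rate of changes per unit time) and their values are my own choices for illustration.

```python
import random

def simulate_wait(alpha, lam, rng):
    """Waiting time until the alpha-th change: the sum of alpha exponential gaps."""
    return sum(rng.expovariate(lam) for _ in range(alpha))

rng = random.Random(42)   # fixed seed so the run is repeatable
alpha, lam = 4, 1.0       # illustrative values: 4th change, 1 change per unit time
waits = [simulate_wait(alpha, lam, rng) for _ in range(100_000)]
mean_wait = sum(waits) / len(waits)
print(round(mean_wait, 2))  # should land close to alpha / lam
```

The simulated mean should sit near `alpha / lam`, which is what you’d expect if each of the four gaps averages one time unit.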
As we did with the exponential distribution, we derive it from the Poisson distribution. Let $W$ be the random variable that represents waiting time. Its cumulative distribution function then would be
$$F(w) = P(W \le w) = 1 - P(W > w)$$
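But notice that $P(W > w)$ is the probability of fewer than $\alpha$ changes in the interval $[0, w]$. The probability of that in a Poisson process with mean $\lambda w$ is
$$F(w) = 1 - \sum_{k=0}^{\alpha-1} \frac{(\lambda w)^k e^{-\lambda w}}{k!}$$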
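Here’s a small sketch of that cumulative distribution function written as code, assuming a whole-number count of changes. The function name and arguments (`alpha` for the number of changes, `lam` for the Poisson rate) are my own labels, not anything from a library.

```python
import math

def gamma_cdf_poisson(w, alpha, lam):
    """F(w) = 1 - P(fewer than alpha changes in [0, w]), for integer alpha."""
    mu = lam * w  # Poisson mean over the interval [0, w]
    fewer = sum(mu**k * math.exp(-mu) / math.factorial(k) for k in range(alpha))
    return 1.0 - fewer

# With alpha = 1 this is the exponential CDF, 1 - exp(-lam * w).
print(gamma_cdf_poisson(2.0, 1, 1.0), 1 - math.exp(-2.0))
```

Calling it with `alpha = 1` gives back the exponential case, which is a quick sanity check on the sum.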
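To find the probability density function we take the derivative of $F(w)$. But before we do that we can simplify matters a little by pulling the first ($k = 0$) term out of the summation:
$$F(w) = 1 - e^{-\lambda w} - \sum_{k=1}^{\alpha-1} \frac{(\lambda w)^k e^{-\lambda w}}{k!}$$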
Why did I know to do that? Because my old statistics book did it that way. Moving on…
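After lots of simplifying…
$$F'(w) = \lambda e^{-\lambda w} - e^{-\lambda w}\sum_{k=1}^{\alpha-1}\left[\frac{k(\lambda w)^{k-1}\lambda}{k!} - \frac{(\lambda w)^k \lambda}{k!}\right] = \lambda e^{-\lambda w} + \lambda e^{-\lambda w}\sum_{k=1}^{\alpha-1}\left[\frac{(\lambda w)^k}{k!} - \frac{(\lambda w)^{k-1}}{(k-1)!}\right]$$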
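And we’re done! Technically we have the gamma probability distribution. But it’s a little too bloated for mathematical taste. And of course it doesn’t match the form of the gamma distribution I presented in the beginning, so we have some more simplifying to do. Let’s carry out the summation for a few terms and see what happens:
$$\sum_{k=1}^{\alpha-1}\left[\frac{(\lambda w)^k}{k!} - \frac{(\lambda w)^{k-1}}{(k-1)!}\right] = \left[\lambda w - 1\right] + \left[\frac{(\lambda w)^2}{2!} - \lambda w\right] + \left[\frac{(\lambda w)^3}{3!} - \frac{(\lambda w)^2}{2!}\right] + \cdots + \left[\frac{(\lambda w)^{\alpha-1}}{(\alpha-1)!} - \frac{(\lambda w)^{\alpha-2}}{(\alpha-2)!}\right]$$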
Notice that besides the $-1$ and the second-to-last term, everything cancels, so we’re left with
$$-1 + \frac{(\lambda w)^{\alpha-1}}{(\alpha-1)!}$$
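If you don’t trust the cancellation, it’s easy to check numerically. This sketch adds up the bracketed differences $\frac{x^k}{k!} - \frac{x^{k-1}}{(k-1)!}$ one by one and compares the total with the collapsed expression. Here `x` stands in for $\lambda w$, and both function names are made up for this example.

```python
import math

def bracket_sum(x, alpha):
    """Sum of [x^k/k! - x^(k-1)/(k-1)!] for k = 1 .. alpha-1, added term by term."""
    return sum(
        x**k / math.factorial(k) - x**(k - 1) / math.factorial(k - 1)
        for k in range(1, alpha)
    )

def collapsed(x, alpha):
    """What survives the cancellation: -1 plus the x^(alpha-1)/(alpha-1)! term."""
    return -1.0 + x**(alpha - 1) / math.factorial(alpha - 1)

x = 1.7  # stands in for lam * w; an arbitrary illustrative value
for alpha in range(2, 9):
    print(alpha, bracket_sum(x, alpha), collapsed(x, alpha))
```

The two columns should agree up to floating-point rounding for every `alpha` tried.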
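Plugging that back into the gamma pdf gives us
$$F'(w) = \lambda e^{-\lambda w} + \lambda e^{-\lambda w}\left[-1 + \frac{(\lambda w)^{\alpha-1}}{(\alpha-1)!}\right]$$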
This simplifies to
$$f(w) = F'(w) = \frac{\lambda(\lambda w)^{\alpha-1}}{(\alpha-1)!} e^{-\lambda w}$$
Now that’s a lean formula, but still not like the one I showed at the beginning. To get the “classic” formula we do two things:
- Let $\lambda = 1/\theta$, just as we did with the exponential
- Use the fact that $\Gamma(\alpha) = (\alpha-1)!$ when $\alpha$ is a positive integer
Doing that takes us to the end:
$$f(w) = \frac{1}{\Gamma(\alpha)\theta^{\alpha}} w^{\alpha-1} e^{-w/\theta}, \quad w > 0$$
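As a sanity check on the whole derivation, here’s a rough numerical sketch: it differentiates the Poisson-sum form of the cumulative distribution function with a central difference and compares the slope with the classic shape/scale formula. The function names, the step size `h`, and the values `alpha = 4` and `theta = 2.0` are all my own choices for illustration.

```python
import math

def cdf_poisson_sum(w, alpha, lam):
    """F(w) = 1 - sum over k < alpha of Poisson(k; lam * w), for integer alpha."""
    mu = lam * w
    return 1.0 - sum(mu**k * math.exp(-mu) / math.factorial(k) for k in range(alpha))

def pdf_classic(w, alpha, theta):
    """Gamma density in shape (alpha) / scale (theta) form."""
    return w**(alpha - 1) * math.exp(-w / theta) / (math.gamma(alpha) * theta**alpha)

def central_diff(f, w, h=1e-5):
    """Numerical derivative of f at w."""
    return (f(w + h) - f(w - h)) / (2 * h)

alpha, theta = 4, 2.0   # illustrative values only
lam = 1 / theta         # the substitution lambda = 1 / theta
for w in (0.5, 3.0, 10.0):
    slope = central_diff(lambda t: cdf_poisson_sum(t, alpha, lam), w)
    print(w, slope, pdf_classic(w, alpha, theta))
```

The last two columns should match to several decimal places, which is about as much as a finite-difference check can promise.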
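We call $\alpha$ the shape parameter and $\theta$ the scale parameter because of their effect on the shape and scale of the distribution. Holding $\theta$ (scale) at a set value and trying different values of $\alpha$ (shape) changes the shape of the distribution (at least when you go from $\alpha = 1$ to $\alpha = 2$):
[Figure: gamma densities with $\theta$ held at 1 and several values of $\alpha$]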
Holding $\alpha$ (shape) at a set value and trying different values of $\theta$ (scale) changes the scale of the distribution:
[Figure: gamma densities with $\alpha$ held at 4 and several values of $\theta$]
In the applied setting $\theta$ (scale) is the mean wait time between events and $\alpha$ is the number of events. If we look at the first figure above, we’re holding the wait time at 1 and changing the number of events. We see that the probability of waiting 5 minutes or longer increases as the number of events increases. This is intuitive: you are more likely to wait 5 minutes or longer to observe 4 events than to observe 1 event, assuming a mean wait time of one minute between events. The second figure holds the number of events at 4 and changes the wait time between events. We see that the probability of waiting 10 minutes or longer increases as the time between events increases. Again this is pretty intuitive, as you would expect a higher probability of waiting more than 10 minutes to observe 4 events when there is a mean wait time of 4 minutes between events than when the mean wait time is 1 minute.
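Those two claims can be put into numbers with a short sketch. It uses the Poisson-sum form of $P(W > w)$ from the derivation, with the rate set to one over the mean wait. The end points (1 versus 4 events, 1 versus 4 minutes) come from the paragraph above; the values in between, and the function name, are my own assumptions.

```python
import math

def prob_wait_longer(w, alpha, theta):
    """P(W > w): fewer than alpha events in [0, w]; integer alpha, mean gap theta."""
    mu = w / theta  # Poisson mean over [0, w] when the rate is 1 / theta
    return sum(mu**k * math.exp(-mu) / math.factorial(k) for k in range(alpha))

# Mean gap fixed at 1 minute, number of events varies: P(wait > 5 minutes).
by_events = [prob_wait_longer(5, a, 1.0) for a in (1, 2, 3, 4)]

# Number of events fixed at 4, mean gap varies: P(wait > 10 minutes).
by_gap = [prob_wait_longer(10, 4, t) for t in (1.0, 2.0, 3.0, 4.0)]

print([round(p, 3) for p in by_events])
print([round(p, 3) for p in by_gap])
```

Both lists should climb from left to right, matching the intuition that more events, or longer gaps between them, make a long wait more likely.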
Finally notice that if you set $\alpha = 1$, the gamma distribution simplifies to the exponential distribution.
Update 5 Oct 2013: I want to point out that $\alpha$ and $\theta$ can take continuous values like 2.3, not just integers. So what I’ve really derived in this post is the relationship between the gamma and Poisson distributions.
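To see that a non-integer shape still gives a proper density, here’s a rough sketch that integrates the shape/scale form numerically using Python’s `math.gamma` in place of the factorial. The midpoint-rule helper, the cutoff at 100, and the values `alpha = 2.3` and `theta = 1.5` are my own choices, and the mean of $\alpha\theta$ it is compared against is a standard property of the gamma distribution rather than something shown in this post.

```python
import math

def gamma_pdf(w, alpha, theta):
    """Gamma density in shape/scale form; alpha need not be an integer."""
    return w**(alpha - 1) * math.exp(-w / theta) / (math.gamma(alpha) * theta**alpha)

def integrate(f, a, b, n=200_000):
    """Midpoint rule on [a, b] with n equal slices."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

alpha, theta = 2.3, 1.5  # illustrative non-integer shape
total = integrate(lambda w: gamma_pdf(w, alpha, theta), 0.0, 100.0)
mean = integrate(lambda w: w * gamma_pdf(w, alpha, theta), 0.0, 100.0)
print(round(total, 4), round(mean, 4), alpha * theta)
```

The total area should come out very close to 1 and the mean close to `alpha * theta`, up to the error of the numerical integration.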