The forgetful exponential distribution

The exponential distribution has the quirky property of having no memory. Before we wade into the math and see why, let’s consider a situation where there is memory: drawing cards. Let’s say you have a well-shuffled deck of 52 cards and you draw a single card. What’s the probability of drawing an ace? Since there are 4 aces in a deck of 52 cards, the probability is $\frac{4}{52}$. We draw our card and it’s not an ace. We set the card aside, away from the deck, and draw again. Now our probability of drawing an ace is $\frac{4}{51}$. We have a slightly better chance on the 2nd draw. The condition that we have already selected a card that wasn’t an ace changes the probability we draw an ace. This doesn’t happen with the exponential distribution.
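To make the card arithmetic concrete, here is a tiny Python sketch using exact fractions. The variable names are just illustrative.

```python
from fractions import Fraction

# Probability of an ace on the first draw from a full 52-card deck.
p_first = Fraction(4, 52)

# Probability of an ace on the second draw, given the first card
# was not an ace and was set aside (51 cards left, still 4 aces).
p_second_given_miss = Fraction(4, 51)

print(p_first, p_second_given_miss)  # 1/13 4/51
print(p_second_given_miss > p_first)  # True
```

The deck "remembers" the first draw: the conditional probability is different from the original one.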

Let’s say we have a state-of-the-art widget (version 2.0) whose lifespan can be described with an exponential distribution. Further, let’s say the mean lifespan is 60 months, or 5 years. Thanks to the “no memory” property, the probability of the widget lasting another 7 years is the same whether the widget is new or 5 years old. In math words:

$P(X > 7 + 5 | X>5) = P(X>7)$

That means if I bought a widget that was 5 years old, it has the same probability of lasting another 7 years as a brand new widget has of lasting 7 years. Not realistic but certainly interesting. Showing why this is the case is actually pretty straightforward.
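If you’d like to see this before proving it, here is a rough simulation sketch in Python. It assumes the widget lifespans really are exponential with a mean of 5 years. The sample size and seed are arbitrary choices of mine.

```python
import random

random.seed(0)          # arbitrary seed, only for repeatability
MEAN_YEARS = 5.0        # mean lifespan from the example (60 months)
N = 200_000             # arbitrary number of simulated widgets

# random.expovariate takes the rate, which is 1 / mean.
lifespans = [random.expovariate(1.0 / MEAN_YEARS) for _ in range(N)]

# Brand new widgets: fraction that last more than 7 years.
p_new = sum(t > 7 for t in lifespans) / N

# Widgets that are already 5 years old: among those that survived
# past year 5, the fraction that go on to last past year 12.
survivors = [t for t in lifespans if t > 5]
p_used = sum(t > 12 for t in survivors) / len(survivors)

print(f"new: {p_new:.3f}  used: {p_used:.3f}")
```

The two fractions should agree up to simulation noise, both landing near 0.25.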

We want to show that for the exponential distribution, $P(X > x + y | X > x) = P(X > y)$.

Recall the cumulative distribution function of an exponential distribution with mean $\theta$ is $P(X \le x)=F(x) = 1 -e^{-x/\theta}$ for $x \ge 0$. That’s the probability of an event occurring at or before a certain time x. The complement of the cumulative distribution function is the probability of an event occurring after a certain time:

$P(X > x) = 1 - P(X \le x) = 1 - (1 - e^{-x/ \theta} ) = e^{-x/ \theta}$

Also recall the definition of conditional probability: $P(A |B) = \frac{P(A \cap B)}{P(B)}$

Let’s plug into the equality we want to prove and see what happens:

$P(X > x + y | X > x) = \frac{P(X>x+y \cap X>x)}{P(X > x)} = \frac{P(X>x+y)}{P(X > x)}$ $=\frac{e^{-(x+y)/\theta}}{e^{-x/\theta}} = \frac{e^{-x/\theta}e^{-y/\theta}}{e^{-x/\theta}} = e^{-y/\theta} = P(X>y)$

(The intersection drops out in the second step because if X > x + y, then X > x automatically.)
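Here is a quick numerical check of that identity in Python, using the survival function $e^{-x/\theta}$ from above. The function names and the particular values of x, y and $\theta$ are my own picks for illustration.

```python
import math

def survival(t, theta):
    """P(X > t) for an exponential distribution with mean theta."""
    return math.exp(-t / theta)

def conditional_survival(x, y, theta):
    """P(X > x + y | X > x), computed as P(X > x + y) / P(X > x)."""
    return survival(x + y, theta) / survival(x, theta)

theta = 5.0  # mean of 5 years, as in the widget example
pairs = [(5, 7), (0.5, 2), (10, 1)]  # arbitrary (x, y) values

for x, y in pairs:
    lhs = conditional_survival(x, y, theta)
    rhs = survival(y, theta)
    print(x, y, lhs, rhs, math.isclose(lhs, rhs))
```

Each row should show the conditional probability matching $P(X > y)$ up to floating-point rounding, no matter how large x is.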

There you go. Not too bad.

We can actually go the other direction as well. That is, we can show that if $P(X > x + y | X > x) = P(X > y)$ is true for a continuous random variable X, then X has an exponential distribution. Here’s how:

$P(X > x + y | X > x) = P(X > y)$ (given)

$\frac{P(X > x + y \cap X > x)}{P(X > x)} = P(X > y)$ (using the definition of conditional probability)

$\frac{P(X > x + y)}{P(X > x)} = P(X > y)$ (if X > x + y, then X > x)

$\frac{1-F(x+y)}{1-F(x)}=1-F(y)$ (substituting the cdf, since $P(X > x) = 1 - F(x)$)

Now substitute in generic function terminology, say $h(x) = 1 - F(x)$:

$\frac{h(x+y)}{h(x)}=h(y)$

Rearranging terms gives us $h(x+y)=h(y)h(x)$

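Now for that equality to hold, the function h(x) has to have an exponential form, where the variable is in the exponent, like this: $a^{x}$. Recall that $a^{x}a^{y}=a^{x+y}$. If $h(x) = a^{x}$, then our equality above works. (Strictly speaking there are other, badly behaved solutions, but $h(x) = 1 - F(x)$ is nonincreasing and stays between 0 and 1, which rules them out.) So we let $h(x)=a^{x}$. That allows us to make the following conclusion: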

$1-F(x) = h(x) = a^{x} = e^{\ln a^{x}} = e^{x \ln a}$

Now let $b = \ln a$. We get $1-F(x) = e^{bx}$. Solving for F(x) we get $F(x) = 1 - e^{bx}$. Since F(x) has to approach 1 as x goes to infinity, b must be negative. So we can write $b = -\frac{1}{\theta}$ for some $\theta > 0$, and we have the cumulative distribution function for an exponential distribution: $F(x) = 1 - e^{-x/\theta}$.
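As a sanity check on this direction, here is a small Python sketch. It picks a hypothetical base a between 0 and 1 (so that $\ln a$ is negative), sets $\theta = -1/\ln a$, and confirms numerically that $h(x) = a^{x}$ satisfies $h(x+y) = h(x)h(y)$ and that $1 - h(x)$ matches the exponential cdf. The value 0.8 and the test points are arbitrary.

```python
import math

a = 0.8                # hypothetical base; any 0 < a < 1 works
b = math.log(a)        # b = ln a, which is negative here
theta = -1.0 / b       # positive mean of the matching exponential

def h(x):
    """Candidate survival function h(x) = a**x."""
    return a ** x

def F(x):
    """Exponential cdf with mean theta."""
    return 1 - math.exp(-x / theta)

points = [(1, 2), (0.3, 4.5), (6, 6)]  # arbitrary (x, y) values

for x, y in points:
    multiplicative = math.isclose(h(x + y), h(x) * h(y))
    matches_cdf = math.isclose(1 - h(x), F(x))
    print(x, y, multiplicative, matches_cdf)
```

Both checks should come back True at every point, which is the algebra above played out with actual numbers.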

That’s the memoryless property for you. Or maybe it’s called the forgetfulness property. I can’t remember.