The forgetful exponential distribution

The exponential distribution has the quirky property of having no memory. Before we wade into the math and see why, let’s consider a situation where there is memory: drawing cards. Let’s say you have a well-shuffled deck of 52 cards and you draw a single card. What’s the probability of drawing an ace? Since there are 4 aces in a deck of 52 cards, the probability is \frac{4}{52}. We draw our card and it’s not an ace. We set the card aside, away from the deck, and draw again. Now our probability of drawing an ace is \frac{4}{51}. We have a slightly better chance on the 2nd draw. The condition that we have already selected a card that wasn’t an ace changes the probability we draw an ace. This doesn’t happen with the exponential distribution.
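If you'd like to see the card arithmetic spelled out, here is a minimal sketch using exact fractions. The helper name `ace_probability` is made up for this illustration.

```python
from fractions import Fraction


def ace_probability(aces_left: int, cards_left: int) -> Fraction:
    """Probability that the next card drawn is an ace."""
    return Fraction(aces_left, cards_left)


# First draw: 4 aces among 52 cards.
first_draw = ace_probability(4, 52)

# Second draw, after a non-ace has been set aside: 4 aces among 51 cards.
second_draw = ace_probability(4, 51)

print(first_draw, float(first_draw))
print(second_draw, float(second_draw))
print("second draw is better:", second_draw > first_draw)
```

The deck "remembers" the first draw: the second probability depends on what was already removed.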

Let’s say we have a state-of-the-art widget (version 2.0) that has a lifespan that can be described with an exponential distribution. Further, let’s say the mean lifespan is 60 months, or 5 years. Thanks to the “no memory” property, the probability of the widget lasting another 7 years is the same whether the widget is new or 5 years old. In math words:

P(X > 7 + 5 | X>5) = P(X>7)

That means if I bought a widget that was 5 years old, it has the same probability of lasting another 7 years as a brand new widget has for lasting 7 years. Not realistic but certainly interesting. Showing why this is the case is actually pretty straight-ahead.

We want to show that for the exponential distribution, P(X > x + y | X > x) = P(X > y).

Recall the cumulative distribution of an exponential distribution is P(X \le x)=F(x) = 1 -e^{-x/\theta}. That’s the probability of an event occurring before a certain time x. The complement of the cumulative distribution is the probability of an event occurring after a certain time:

P(X > x) = 1 - P(X \le x) = 1 - (1 - e^{-x/ \theta} ) = e^{-x/ \theta}
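With that survival probability in hand, we can check the widget claim from earlier numerically. This is just a sketch: it takes \theta = 5 years from the widget example, and the function name `survival` is my own.

```python
import math

THETA = 5.0  # mean lifespan in years (60 months), from the widget example


def survival(x: float, theta: float = THETA) -> float:
    """P(X > x) for an exponential distribution with mean theta."""
    return math.exp(-x / theta)


# Brand new widget lasting 7 years: P(X > 7)
p_new = survival(7)

# 5-year-old widget lasting another 7 years:
# P(X > 7 + 5 | X > 5) = P(X > 12) / P(X > 5)
p_used = survival(7 + 5) / survival(5)

print(p_new, p_used)
```

Both numbers come out to about 0.247, which is e^{-7/5}.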

Also recall the definition of conditional probability: P(A |B) = \frac{P(A \cap B)}{P(B)}

Let’s plug into the equality we want to prove and see what happens:

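P(X > x + y | X > x) = \frac{P(X>x+y \cap X>x)}{P(X > x)} = \frac{P(X>x+y)}{P(X > x)} =\frac{e^{-(x+y)/\theta}}{e^{-x/\theta}} = \frac{e^{-x/\theta}e^{-y/\theta}}{e^{-x/\theta}} = e^{-y/\theta} = P(X>y)

The second equality holds because if X > x + y, then X > x, so the intersection of the two events is just the event X > x + y.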

There you go. Not too bad.
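If you'd rather see it than prove it, a quick simulation gives the same answer. This is a rough sketch, not a rigorous test: the seed, the sample size, and \theta = 5 are arbitrary choices for illustration. Note that `random.expovariate` takes the rate 1/\theta, not the mean.

```python
import random

random.seed(0)   # fixed seed so the run is repeatable
THETA = 5.0      # assumed mean lifespan in years
N = 200_000      # arbitrary sample size

# Simulated widget lifespans.
lifespans = [random.expovariate(1 / THETA) for _ in range(N)]


def frac_exceeding(values, t):
    """Fraction of values strictly greater than t."""
    return sum(v > t for v in values) / len(values)


# New widgets: estimate P(X > 7).
p_new = frac_exceeding(lifespans, 7)

# Widgets that already survived 5 years: estimate P(X > 7 + 5 | X > 5)
# by looking at how much life they have left.
remaining = [v - 5 for v in lifespans if v > 5]
p_used = frac_exceeding(remaining, 7)

print(p_new, p_used)
```

The two estimates should agree to within simulation noise, and both should land near e^{-7/5}.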

We can actually go the other direction as well. That is, we can show that if P(X > x + y | X > x) = P(X > y) is true for a continuous random variable X, then X has an exponential distribution. Here’s how:

P(X > x + y | X > x) = P(X > y) (given)

\frac{P(X > x+y \cap X > x)}{P(X > x)}=P(X > y) (using the definition of conditional probability)

\frac{P(X > x+y)}{P(X > x)}=P(X > y) (If X > x + y, then X > x)

\frac{1-F(x+y)}{1-F(x)}=1-F(y) (substitute the cdf expressions, since P(X > x) = 1 - F(x))

Now substitute in generic function terminology, say h(x) = 1 - F(x):

\frac{h(x+y)}{h(x)}=h(y)

Rearranging terms gives us h(x+y)=h(y)h(x)

Now for that equality to hold, the function h(x) has to have an exponential form, where the variable is in the exponent, like this: a^{x}. Recall that a^{x}a^{y}=a^{x+y}. If h(x) = a^{x}, then our equality above works. (Since h(x) = 1 - F(x) never increases, this is in fact the only form that works.) So we let h(x)=a^{x}. That allows us to make the following conclusion:

1-F(x) = h(x) = a^{x} = e^{\ln a^{x}} = e^{x \ln a}

Now let b = \ln a. We get 1-F(x) = e^{bx}. Solving for F(x) we get F(x) = 1 - e^{bx}. Since F(x) has to approach 1 as x goes to \infty, b must be negative. So we write b = -\frac{1}{\theta} for some \theta > 0, and we have the cumulative distribution function for an exponential distribution: F(x) = 1 - e^{-x/\theta}.
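To get a feel for why h(x+y)=h(y)h(x) is so restrictive, here is a small sketch comparing the exponential survival function with one that does not have the exponential form. The Weibull survival function with shape 2 is my own choice of counterexample (it is not part of the argument above), and the parameter values are arbitrary.

```python
import math


def h_exponential(x: float, theta: float = 5.0) -> float:
    """Survival function 1 - F(x) of an exponential distribution."""
    return math.exp(-x / theta)


def h_weibull(x: float, scale: float = 5.0, shape: float = 2.0) -> float:
    """Survival function of a Weibull distribution (assumed counterexample)."""
    return math.exp(-((x / scale) ** shape))


def gap(h, x: float, y: float) -> float:
    """h(x + y) - h(x) * h(y); zero when the memoryless equality holds."""
    return h(x + y) - h(x) * h(y)


print(gap(h_exponential, 5, 7))  # essentially zero, up to floating-point error
print(gap(h_weibull, 5, 7))      # clearly nonzero
```

With the Weibull choice the gap is negative: a widget that has already survived 5 years is less likely to last another 7 than a new one, so that distribution does have memory.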

That’s the memoryless property for you. Or maybe it’s called the forgetfulness property. I can’t remember.
