# The forgetful exponential distribution

The exponential distribution has the quirky property of having no memory. Before we wade into the math and see why, let’s consider a situation where there is memory: drawing cards. Let’s say you have a well-shuffled deck of 52 cards and you draw a single card. What’s the probability of drawing an ace? Since there are 4 aces in a deck of 52 cards, the probability is $$\frac{4}{52}$$. We draw our card and it’s not an ace. We set the card aside, away from the deck, and draw again. Now our probability of drawing an ace is $$\frac{4}{51}$$. We have a slightly better chance on the 2nd draw. The condition that we have already selected a card that wasn’t an ace changes the probability we draw an ace. This doesn’t happen with the exponential distribution.
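To make the card example concrete, here is a minimal sketch that computes the two probabilities exactly. The helper name `ace_probability` is made up for illustration; it simply divides the aces still in the deck by the cards still in the deck.

```python
from fractions import Fraction

def ace_probability(aces_left: int, cards_left: int) -> Fraction:
    """Chance that the next card is an ace, given what remains in the deck."""
    return Fraction(aces_left, cards_left)

# First draw from a full deck: 4 aces among 52 cards.
first_draw = ace_probability(4, 52)

# Second draw after setting aside a non-ace: still 4 aces, but only 51 cards.
second_draw_after_miss = ace_probability(4, 51)

print(first_draw, second_draw_after_miss)  # 1/13 4/51
print(second_draw_after_miss > first_draw)  # True
```

The deck "remembers" the first draw: the second probability depends on what has already happened.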

Let’s say we have a state-of-the-art widget (version 2.0) that has a lifespan that can be described with an exponential distribution. Further, let’s say the mean lifespan is 60 months, or 5 years. Thanks to the “no memory” property, the probability of the widget lasting another 7 years is the same whether the widget is new or 5 years old. In math words:

$$P(X > 7 + 5 | X>5) = P(X>7)$$

That means if I bought a widget that was 5 years old, it has the same probability of lasting another 7 years as a brand new widget has for lasting 7 years. Not realistic but certainly interesting. Showing why this is the case is actually pretty straightforward.
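Before the algebra, a quick numeric check of the widget claim may help. This sketch assumes the survival formula $$P(X > x) = e^{-x/\theta}$$ that the post derives next, with $$\theta = 5$$ years. The function name `survival` is just an illustrative choice.

```python
import math

THETA = 5.0  # mean lifespan in years (60 months), taken from the widget example

def survival(x: float, theta: float = THETA) -> float:
    """P(X > x) for an exponential distribution with mean theta."""
    return math.exp(-x / theta)

# Brand new widget lasting 7 years.
p_new = survival(7)

# 5-year-old widget lasting 7 more years: P(X > 12) / P(X > 5).
p_used = survival(7 + 5) / survival(5)

print(round(p_new, 4), round(p_used, 4))  # 0.2466 0.2466
```

Both come out to about 0.25, so the used widget is no worse a bet than the new one.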

We want to show that for the exponential distribution, $$P(X > y + x | X > x) = P(X > y)$$.

Recall the cumulative distribution function of an exponential distribution is $$P(X \le x)=F(x) = 1 -e^{-x/\theta}$$ for $$x \ge 0$$, where $$\theta$$ is the mean. That’s the probability of an event occurring by a certain time x. The complement of the cumulative distribution function is the probability of an event occurring after a certain time:

$$P(X > x) = 1 - P(X \le x) = 1 - (1 - e^{-x/ \theta} ) = e^{-x/ \theta}$$

Also recall the definition of conditional probability: $$P(A |B) = \frac{P(A \cap B)}{P(B)}$$

Let’s plug into the equality we want to prove and see what happens:

$$P(X > y + x | X > x) = \frac{P(X>y + x \cap X>x)}{P(X > x)} = \frac{P(X>y + x)}{P(X > x)}$$

$$=\frac{e^{-(x+y)/\theta}}{e^{-x/\theta}} = \frac{e^{-x/\theta}e^{-y/\theta}}{e^{-x/\theta}} = e^{-y/\theta} = P(X>y)$$
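The identity can also be checked by simulation. The sketch below is only an illustration, not part of the proof: it draws exponential lifetimes with Python’s `random.expovariate` (which takes the rate $$1/\theta$$, not the mean), keeps the ones that survive past x, and counts how many of those also survive past x + y. The function names, sample size, and seeds are arbitrary choices.

```python
import math
import random

def conditional_survival(theta, x, y, n=200_000, seed=0):
    """Estimate P(X > x + y | X > x) from n simulated exponential lifetimes."""
    rng = random.Random(seed)
    lifetimes = [rng.expovariate(1.0 / theta) for _ in range(n)]
    survivors = [t for t in lifetimes if t > x]  # condition on X > x
    return sum(t > x + y for t in survivors) / len(survivors)

def unconditional_survival(theta, y, n=200_000, seed=1):
    """Estimate P(X > y) from n fresh simulated exponential lifetimes."""
    rng = random.Random(seed)
    return sum(rng.expovariate(1.0 / theta) > y for _ in range(n)) / n

# Widget example: mean 5 years, already 5 years old, 7 more years.
cond = conditional_survival(theta=5.0, x=5.0, y=7.0)
uncond = unconditional_survival(theta=5.0, y=7.0)
exact = math.exp(-7.0 / 5.0)

print(cond, uncond, exact)
```

The two estimates should land close to each other and to the exact value $$e^{-7/5}$$, up to simulation noise.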

There you go. Not too bad.

We can actually go the other direction as well. That is, we can show that if $$P(X > y + x | X > x) = P(X > y)$$ is true for a continuous random variable X, then X has an exponential distribution. Here’s how:

$$P(X > y + x | X > x) = P(X > y)$$ (given)

$$\frac{P(X > y + x \cap X > x)}{P(X > x)} = P(X > y)$$ (using the definition of conditional probability)

$$\frac{P(X > y + x)}{P(X > x)} = P(X > y)$$ (If X > y + x, then X > x)

$$\frac{1-F(y + x)}{1-F(x)}=1-F(y)$$ (substitute the cdf expressions, since $$P(X > x) = 1 - F(x)$$)

Now substitute in generic function terminology, say $$h(x) = 1 - F(x)$$:

$$\frac{h(y + x)}{h(x)}=h(y)$$

Rearranging terms gives us $$h(y + x)=h(y)h(x)$$

Now for that equality to hold, the function h(x) has to have an exponential form, where the variable is in the exponent, like this: $$a^{x}$$. Recall that $$a^{x}a^{y}=a^{x+y}$$. If $$h(x) = a^{x}$$, then our equality above works. In fact, since $$h(x) = 1 - F(x)$$ is continuous and not identically zero, functions of this form are the only ones that work. So we let $$h(x)=a^{x}$$. That allows us to make the following conclusion:
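A small numeric check can make the functional equation $$h(y + x)=h(y)h(x)$$ feel less abstract. The sketch below tests it on a grid of points for $$h(x) = a^{x}$$ with an arbitrary $$a = 0.8$$, and for a non-exponential survival function, $$e^{-x^{2}}$$ (a Weibull-type curve chosen here purely for contrast; it does not appear in the argument above). The helper `is_multiplicative` is a hypothetical name.

```python
import math

def is_multiplicative(h, points):
    """Return True if h(x + y) == h(x) * h(y) (up to rounding) on all pairs."""
    return all(
        math.isclose(h(x + y), h(x) * h(y), rel_tol=1e-9, abs_tol=1e-12)
        for x in points
        for y in points
    )

def exp_survival(x):
    """Exponential form a**x with a = 0.8 (an arbitrary value in (0, 1))."""
    return 0.8 ** x

def weibull_type_survival(x):
    """A survival function that is not of the form a**x."""
    return math.exp(-x ** 2)

points = [0.5, 1.0, 2.0, 3.5]

print(is_multiplicative(exp_survival, points))           # True
print(is_multiplicative(weibull_type_survival, points))  # False
```

Passing on a handful of points proves nothing by itself, of course; it only illustrates which shapes satisfy the equation and which do not.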

$$1-F(x) = h(x) = a^{x} = e^{\ln a^{x}} = e^{x \ln a}$$

Now let $$b = \ln a$$. We get $$1-F(x) = e^{bx}$$. Solving for F(x) we get $$F(x) = 1 - e^{bx}$$. Since $$F(x) \to 1$$ as $$x \to \infty$$, b must be negative. So we can write $$b = -\frac{1}{\theta}$$ for some $$\theta > 0$$, and we have the cumulative distribution function for an exponential distribution: $$F(x) = 1 - e^{-x/\theta}$$.

That’s the memoryless property for you. Or maybe it’s called the forgetfulness property. I can’t remember.
