The moment-generating function (or mgf) is probably better understood if you simply focus on its name rather than its formula. It’s a function that generates moments. Of course it helps to know what “moments” are. A moment is a term from mechanics for the product of a distance and its weight. If you have several distances and associated weights, you can calculate all the products and sum them and get what you call the “first moment about the origin.” If that’s a little too abstract, just know that the calculation of the “first moment about the origin” is the same calculation one uses to find the mean of a random variable. Thus the mean of a random variable is the same as the first moment about the origin. Therefore we can use the mgf to generate the first moment about the origin, and thus find the mean of a random variable. But the moment-generating function doesn’t stop there. It also generates second moments, third moments, etc., provided the mgf exists (a detail I’m not going to address in this post). Since the variance of a random variable can be calculated using the first and second moments about the origin, we can use the mgf to find variance as well as the mean.

So that’s the big idea: **the moment-generating function provides another (sometimes easier) way to find the mean and variance of a random variable**. That’s why it’s taught in upper-level statistics (even though many instructors dive into this topic without providing any motivation for it). Recall the usual formulas for finding the mean and variance: \( \mu = \sum_{x\in S}xf(x) \) and \( \sigma^{2} = \sum_{x\in S}(x-\mu)^{2}f(x) \). These can lead to some awfully long and hard math, especially for finding the variance. Even the variance “shortcut” \( \sigma^{2} =E[X^{2}]-\mu^{2}\) can get bogged down in finding \( E[X^{2}] = \sum_{x\in S}x^{2}f(x)\). This is where the moment-generating function can be of service. Let’s introduce it (for discrete random variables):

**mgf**: \( M(t) = E(e^{tX}) = \sum_{x\in S}e^{tx}f(x)\)
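If the formula looks opaque, it may help to see it as a plain sum. Here is a minimal sketch in Python; the function name `mgf` and the little coin-flip pmf are my own illustrative choices, not anything standard. It just adds up \( e^{tx}f(x)\) over the support.

```python
import math

def mgf(pmf, t):
    """Evaluate M(t) = sum of e^(t*x) * f(x) over the support of a discrete pmf.

    `pmf` is a dict mapping each value x to its probability f(x).
    """
    return sum(math.exp(t * x) * p for x, p in pmf.items())

# Hypothetical pmf for illustration: X = 1 with probability 0.3, otherwise X = 0.
coin = {0: 0.7, 1: 0.3}

m_at_0 = mgf(coin, 0.0)  # e^0 = 1, so this is just the probabilities added up
m_at_1 = mgf(coin, 1.0)  # 0.7 + 0.3 * e
print(m_at_0, m_at_1)
```

Notice that \( M(0)\) is always 1 for a valid pmf, since every \( e^{0}\) is 1 and the probabilities sum to 1.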

I’ll admit it doesn’t look very friendly. And it’s certainly not obvious that it would help you find the mean and variance of a distribution. Why it works is a topic for another post, ~~a post I’ll almost certainly never write~~. But seriously, why the mgf does what it does is explained rather well in one of my favorite Dover books, *Principles of Statistics* by M.G. Bulmer. A highly recommended book at only $15. But say you believe me and take it on faith that the mgf can help us find the mean and variance. How does it work? Well, here are the steps you usually follow:

- find the mgf
- find the first derivative of the mgf
- find the second derivative of the mgf
- evaluate the first derivative of the mgf at t=0. That gives you the mean.
- evaluate the second derivative of the mgf at t=0. That gives you \( E[X^{2}]\)
- subtract the mean squared from \( E[X^{2}]\) (i.e., use the formula \( \sigma^{2} =E[X^{2}]-\mu^{2}\))
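For a discrete random variable with finitely many values, these steps can be sketched in a few lines of Python. This is only a sketch under my own assumptions: the mgf is stored as a dict mapping each exponent \( x\) to its coefficient in \( \sum c_{x}e^{xt}\), and the helper names (`differentiate`, `at_zero`, `mean_and_variance`) and the fair-die pmf are hypothetical, chosen for illustration.

```python
from fractions import Fraction

def mgf_terms(pmf):
    """Represent M(t) = sum of f(x) * e^(x*t) as a dict {x: coefficient}."""
    return dict(pmf)

def differentiate(terms):
    """d/dt of c * e^(x*t) is (x*c) * e^(x*t), so scale each coefficient by x."""
    return {x: x * c for x, c in terms.items()}

def at_zero(terms):
    """At t = 0 every e^(x*t) equals 1, so the value is the sum of coefficients."""
    return sum(terms.values())

def mean_and_variance(pmf):
    m = mgf_terms(pmf)          # find the mgf
    m1 = differentiate(m)       # first derivative
    m2 = differentiate(m1)      # second derivative
    mean = at_zero(m1)          # M'(0) is the mean
    second_moment = at_zero(m2) # M''(0) is E[X^2]
    return mean, second_moment - mean ** 2

# Hypothetical example: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}
die_mean, die_var = mean_and_variance(die)
print(die_mean, die_var)  # 7/2 35/12
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared directly with hand calculations.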

Let’s do a simple example. Say we have a probability mass function (pmf) \( f(x) = (4 - x) / 6\) with \( x = 1, 2, 3\). (You can verify it’s a pmf by calculating \( f(1), f(2)\) and \( f(3)\), and adding the results. They sum to 1.) Use the moment-generating function to find the mean and variance.

Following the steps above we first find the mgf:

\( M(t) = E(e^{tX}) =e^{1t}((4-1)/6)+e^{2t}((4-2)/6)+e^{3t}((4-3)/6)\)

\( M(t) = \frac{3}{6}e^{t}+\frac{2}{6}e^{2t}+\frac{1}{6}e^{3t}\)

Next find the first and second derivatives:

\( M^{\prime}(t) = \frac{3}{6}e^{t}+\frac{4}{6}e^{2t}+\frac{3}{6}e^{3t}\)

\( M^{\prime\prime}(t) = \frac{3}{6}e^{t}+\frac{8}{6}e^{2t}+\frac{9}{6}e^{3t}\)

Now evaluate \( M^{\prime}(0)\) to find the mean:

\( \mu = M^{\prime}(0) = \frac{3}{6}e^{0}+\frac{4}{6}e^{0}+\frac{3}{6}e^{0} = \frac{3}{6}+\frac{4}{6}+\frac{3}{6} = \frac{10}{6} \approx 1.667\)

Finally evaluate \( M^{\prime\prime}(0) = E[X^{2}]\) and use it to find the variance:

\( M^{\prime\prime}(0) = \frac{3}{6}e^{0}+\frac{8}{6}e^{0}+\frac{9}{6}e^{0}=\frac{3}{6}+\frac{8}{6}+\frac{9}{6} = \frac{20}{6}\)

\( \sigma^{2} = M^{\prime\prime}(0) - \mu^{2} = \frac{20}{6} - \left(\frac{10}{6}\right)^{2} = \frac{5}{9} \approx 0.556\)
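As a sanity check, the same mean and variance should come out of the usual summation formulas applied to this pmf. Here is a small Python sketch that does that with exact fractions; the variable names are my own.

```python
from fractions import Fraction

support = (1, 2, 3)

def f(x):
    """The pmf from the example: f(x) = (4 - x) / 6."""
    return Fraction(4 - x, 6)

total = sum(f(x) for x in support)                # should be 1 for a valid pmf
mu = sum(x * f(x) for x in support)               # mean, straight from the definition
second = sum(x ** 2 * f(x) for x in support)      # E[X^2]
var_shortcut = second - mu ** 2                   # E[X^2] - mu^2
var_definition = sum((x - mu) ** 2 * f(x) for x in support)

print(total, mu, second, var_shortcut, var_definition)  # 1 5/3 10/3 5/9 5/9
```

Both variance formulas give \( \frac{5}{9}\), and the mean \( \frac{5}{3}\) is the reduced form of \( \frac{10}{6}\), matching the mgf results.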

Of course you wouldn’t normally employ the mgf for a simple distribution like this, but I thought it was a nice way to illustrate the concepts of the moment-generating function. By the way, I took this example from the textbook *Probability and Statistical Inference* (7th ed) by Hogg and Tanis (p. 101). Another highly recommended book.
