Stirling’s Formula

Stirling’s formula says

\displaystyle{ n! \sim \sqrt{2 \pi n}\, \left(\frac{n}{e}\right)^n }

where \sim means that the ratio of the two quantities goes to 1 as n \to \infty.

Where does this formula come from? In particular, how does the number 2\pi get involved? Where is the circle here?

To understand these things, I think a nonrigorous argument that can be made rigorous is more useful than a rigorous proof with all the ε’s dotted and the δ’s crossed. It’s important, I think, to keep the argument short. So let me do that.

The punchline will be that the 2\pi comes from this formula:

\displaystyle{  \int_{-\infty}^\infty e^{-x^2/2} \, dx = \sqrt{2 \pi} }

And this, I hope you know, comes from squaring both sides and converting the left side into a double integral that you can do in polar coordinates, pulling out a factor of 2 \pi because the thing you’re integrating only depends on r, not \theta.

Okay, here goes. We start with

\displaystyle{ \int_0^\infty x^n e^{-x} \, dx = n! }

This is easy to show using repeated integration by parts.

Next, we do this:

\begin{array}{ccl}  n! &=& \displaystyle{ \int_0^\infty x^n e^{-x} \, dx } \\ \\  &=& \displaystyle{ \int_0^\infty e^{n \ln x  -x} \, dx } \\ \\  &=& \displaystyle{  n\int_0^\infty e^{n \ln (ny) -n y} \, dy } \\ \\  \end{array}

In first step we’re writing x^n as e^{n \ln x}. In the second we’re changing variables: x = n y.

Next we use \ln(ny) = \ln n + \ln y to bust things up:

\displaystyle{ n! =  n e^{n \ln n} \int_0^\infty e^{n \ln y -n y} \, dy }

All the hard work will come in showing this:

\displaystyle{ \int_0^\infty e^{n \ln y -n y} \, dy \sim \sqrt{\frac{2 \pi}{n}} \; e^{-n} }

Given this, we get

\displaystyle{ n! \sim  n e^{n \ln n} \sqrt{\frac{2 \pi}{n}} \; e^{-n} }

and simplifying we get Stirling’s formulas:

\displaystyle{ n! \sim \sqrt{2 \pi n} \, \left(\frac{n}{e}\right)^n}

Laplace’s method

So to prove Stirling’s formula, the big question is: how do we get

\displaystyle{ \int_0^\infty e^{n \ln y -n y} \, dy \sim \sqrt{\frac{2 \pi}{n}} \; e^{-n} } \; ?

Let’s write it like this:

\displaystyle{ \int_0^\infty e^{-n (y - \ln y)} \, dy \sim \sqrt{\frac{2 \pi}{n}} \; e^{-n} }

The trick is to note that as n gets big, the integral will become dominated by the point where y - \ln y is as small as possible. We can then approximate the integral by a Gaussian peaked at that point!

Notice that

\displaystyle{ \frac{d}{dy} (y - \ln y) = 1 - y^{-1} }

\displaystyle{ \frac{d^2}{dy^2} (y - \ln y) = y^{-2} }

so the function y - \ln y has a critical point at y = 1 and its second derivative is 1 there, so it’s a local minimum. Indeed this point is the unique minimum of our function on the whole interval (0,\infty).

Then we use this:

Laplace’s Method. Suppose f \colon [a,b] \to \mathbb{R} has a unique minimum at some point x_0 \in (a,b) and f''(x_0) > 0. Then

\displaystyle{ \int_a^b e^{-n f(x)} dx \sim \sqrt{\frac{2 \pi}{n f''(x_0)}} \; e^{-n f(x_0)}    }

as n \to \infty.

This says that asymptotically, the integral equals what we’d get if we replaced f by the quadratic function whose value, first derivative and second derivative all match that of f at the point x_0. With this quadratic replacing f, you can do the integral by hand—it’s the integral of a Gaussian—and you get the right hand side.

Applying this formula to the problem at hand we get

\displaystyle{ \int_a^b e^{-n (y - \ln y)} dy \sim \sqrt{\frac{2 \pi}{n f''(y_0)}} \; e^{-n f(y_0)}    }

where f(y) = y - \ln y, y_0 = 1, f(y_0) = 1 and f''(y_0) = 1. So we get

\displaystyle{ \int_a^b e^{-n (y - \ln y)} dy \sim \sqrt{\frac{2 \pi}{n}} \; e^{-n}    }

and then letting a = 0, b \to +\infty we get what we want.

So, from this viewpoint—and there are others—the key to Stirling’s formula is Laplace’s method of approximating an integral like

\displaystyle{ \int_a^b e^{-n f(x)} dx }

with a Gaussian integral. And in the end, the crucial calculation is where we do that Gaussian integral, using

\displaystyle{  \int_{-\infty}^\infty e^{-x^2/2} \, dx = \sqrt{2 \pi} }

You can see the whole proof of Laplace’s method here:

• Wikipedia, Laplace’s method.

Physicists who have done quantum field theory will know that when push comes to shove it’s largely about Gaussian integrals. The limit n \to \infty we’re seeing here is like a ‘classical limit’ where \hbar \to 0. So they will be familiar with this idea.

There should be some deeper moral here, about how n! is related to a Gaussian process of some sort, but I don’t know it—even though I know how binomial coefficients approximate a Gaussian distribution. Do you know some deeper explanation, maybe in terms of probability theory and combinatorics, of why n! winds up being asymptotically described by an integral of a Gaussian?

For a very nice account of some cruder versions of Stirling’s formula, try this blog article:

• Michael Weiss, Stirling’s formula: Ahlfors’ derivation, Diagonal Argument, 17 July 2019.

His ‘note’, which you can find there, will give you more intuition for why something like Stirling’s formula should be true. But I think the above argument explains the \sqrt{2 \pi} better than Ahlfors’ argument.

But in math, there are always mysteries within mysteries. Gaussians show up in probability theory when we add up lots of independent and identically distributed random variables. Could that be going on here somehow?

Yes! See this:

• Aditya Ghosh, A probabilistic proof of Stirling’s formula, Blog on Mathematics and Statistics, September 7, 2020.

Folks at the n-Category Café noticed more mysteries. n!/n^n is the probability that a randomly chosen function from an n-element set to itself is a permutation. Stirling’s formula is a cool estimate of this probability! Can we use this to prove Stirling’s formula? I don’t know!

 


 
So I don’t think we’ve gotten to the bottom of Stirling’s formula! Comments at the n-Category Café contain other guesses about what it might ‘really mean’. You can read them here. Also check out Section 3 here, which discusses many different articles on Stirling’s formula in the American Mathematical Monthly:

• Jonathan M. Borwein and Robert M. Corless, Gamma and factorial in the Monthly.

Judging by the number of articles in the Monthly on the subject, Stirling’s formula approximating n! for large n is by far the most popular aspect of the \Gamma function. There are “some remarks”, “notes”, more “remarks”; there are “simple proofs”, “direct proofs”, “new proofs”, “corrections”, “short proofs”, “very short proofs”, “elementary” proofs, “probabilistic” proofs, “new derivations”, and (our favourite title) “The (n+1)th proof”.

That should be “(n+1)st”.

13 Responses to Stirling’s Formula

  1. John Imperio says:

    I was reading a biography of grace hopper and when she was a math professor at vassar she had her students write an essay about stirling’s formula. Her reasoning was because they is no use of knowing math if you can’t teach it to others.

  2. Wyrd Smythe says:

    \displaystyle{n}!=\int_{0}^{\infty}{x}^{n}{e}^{-x}\;{dx}

    Whoa!! Mind definitely blown! That’s awesome; great post!

  3. […] Baez has a post outlining another derivation of the full Stirling formula, using Laplace’s method. It looks a lot easier than Ahlfors’ […]

  4. allenknutson says:

    You’re not going to compare Laplace’s method to stationary phase approximation to semiclassical limits of Feynman integrals? What self-control!

    • John Baez says:

      I thought I had! “Physicists who have done quantum field theory will know that when push comes to shove it’s largely about Gaussian integrals.” (Quantum field theorists typically Wick rotate to turn exp(ix2) into exp(-x2), but of course the idea is the same.)

  5. Perhaps a more transparent way to see how the Gaussian comes from is to parametrize x=n+\xi with \xi\ll n. Then

    \ln(x^n e^{-x})=n\ln x -x=n\left[\ln(n+\xi)-(n+\xi)\right]=n\ln\left[n\left(1+\frac{\xi}{n}\right)\right]-\left(n+\xi\right)

    =n\ln n+n\ln\left(1+\frac{\xi}{n}\right)-(n+\xi)\simeq n\ln n+n\left(\frac{\xi}{n}-\frac{\xi^2}{2n^2}+...\right)-(n+\xi)

    =n\ln n-n-\frac{\xi^2}{2n}.

    Therefore, we have
    n!=\int_0^\infty x^n e^{-x}dx=e^{n\ln n}e^{-n}\int_{-n}^\infty e^{\frac{-\xi^2}{2n}}d\xi

    \simeq e^{n\ln n}e^{-n}\sqrt{2\pi n} as n\rightarrow\infty.

    However, I don’t see how n! is related to Gaussian process or binomial coefficients. Perhaps the binomial coefficients might be due to the very definition of e^x\sim \lim_{m\rightarrow\infty}\left(1+\frac{x}{m}\right)^m and its binomial expansion?

    • John Baez says:

      That’s nice.

      I’m also discussing this on my other blog, the n-Category Café, and we’ve seen a nice probabilistic proof of Stirling’s formula using the Central Limit Theorem. It turns out that this statement, when translated into equations, is equivalent to Stirling’s formula:

      Suppose raindrops are randomly landing on your patio at a constant average rate that you know. You start your stopwatch, wait until the expected number of drops that have landed is n, and record how many have actually landed. Do this over and over and get a probability distribution of answers. In the limit as n \to \infty, this probability distribution is a Gaussian with mean n and standard deviation \sqrt{n}.

      This is stated a bit informally, but it’s about Poisson distributions.

  6. Graham Jones says:

    You can derive Stirling’s formula from the central limit theorem (CLT). Adding independent random variables means convolving their densities, which means multiplying their Fourier transforms. That is one approach to proving the CLT. If you have n independent random variables with an exponential density (with mean 1) you can do the convolutions directly, and get a gamma density, which has mean and variance equal to n. (You effectively did this calculation as part of your proof.) Then apply the CLT.

    I’m not a physicist, but I think the CLT is analogous to the classical limit in quantum theory. The exponential density is a special case where exact calculations are possible. I can’t think of any other densities which convolve nicely, except the Cauchy, which is not covered by the CLT. It has an infrared divergence, perhaps.

    • Graham Jones says:

      Hah, I wrote the above in a text editor and pasted it in, before noticing John’s comment post about the CLT and the Poisson.

    • John Baez says:

      No problem. I actually don’t see how the CLT is analogous to the classical limit in quantum theory. That is, I can sense the analogy in a vague way, but I don’t see how the math would work. That sounds like something I should think about, since I’m busy exploring the connections between statistical mechanics and quantum mechanics!

You can use Markdown or HTML in your comments. You can also use LaTeX, like this: $latex E = m c^2 $. The word 'latex' comes right after the first dollar sign, with a space after it.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.