Triangular Numbers

This post is just for fun. I’ll start with Pascal’s triangle and show you the number e hiding inside it. Using this, we’ll see how the product of all numbers in the nth row of Pascal’s triangle is related to the nth triangular number.

That’s cute, because Pascal’s triangle looks so… triangular!

But then, with a massive amount of help from Greg Egan, we’ll dig a lot deeper, and meet strange things like superfactorials, the magic properties of the number 1/12, and the Glaisher–Kinkelin constant.

First let’s get warmed up.


Pascal’s triangle is a famous and much-studied thing. It was discovered long before Pascal. It looks like this:

We write 1’s down the edges. We get every other number in the triangle by adding the two numbers above it to its left and right. People call these numbers binomial coefficients, since you can also get them like this:

(x + y)^5 = x^5 + 5 x^4 y + 10 x^3 y^2 + 10 x^2 y^3 + 5 x y^4 + y^5

We get the 10 in 10 x^3 y^2 here because there are 10 ways to take 5 things and choose 3 to call x’s and 2 to call y’s. In general we have

\displaystyle{ (x + y)^n = \sum_{k=0}^n \binom{n}{k} x^k y^{n-k} }

where the binomial coefficient \binom{n}{k} is the kth number on the nth row of Pascal’s triangle. Here count both rows and the numbers in a row starting from zero.

Since we can permute n things in

n! = 1 \cdot 2 \cdot 3 \cdot \cdots \cdot n

ways, there are

\displaystyle{ \frac{n!}{k! (n-k)!} }

ways to take n things and choose k of them to be x’s and (n-k) of them to be y’s. You see, permuting the x’s or the y’s doesn’t change our choice, so we have to divide by k! and (n-k)!.

So, the kth number in the nth row of Pascal’s triangle is:

\displaystyle{\binom{n}{k} = \frac{n!}{k! (n-k)!} }

All this will be painfully familiar to many of you. But I want everyone to have an fair chance at understanding the next section, where we’ll see something nice about Pascal’s triangle and the number

e \approx 2.718281828459045...

The hidden e in Pascal’s triangle

In 2012, a guy named Harlan Brothers found the number e hiding in Pascal’s triangle… in a very simple way!

If we add up all the numbers in the nth row of Pascal’s triangle we get 2^n. But what if we take the product of all these numbers? Let’s call it t_n. Then here’s what Brothers discovered:

\displaystyle{ \lim_{n \to \infty}  \frac{t_n t_{n+2}}{t^2_{n+1}} = e }

This may seem mysterious, but in fact it’s rather simple once you see the trick. I’ll use a nice argument given by Greg Egan.

We’ve said t_n is the product of all numbers in the nth row of Pascal’s triangle. Just for fun, let’s divide this by the product of all numbers in the next row:

\displaystyle{ u_n = \frac{t_n}{t_{n+1}} }

And since this is so much fun, let’s do it again! Divide this quantity by its next value:

\displaystyle{ v_n = \frac{u_n}{u_{n+1}}   }

Now look:

\displaystyle{v_n = \frac{t_n/t_{n+1}}{t_{n+1}/t_{n+2}} = \frac{t_n t_{n+2}}{t_{n+1}^2}  }

So, this is the thing that should approach e.

But why does it approach e? To see this, first take Pascal’s triangle and divide each number by the number to its lower left. We get a second triangle of numbers. Then take each number in this second triangle and divide it by the number to its lower right! We get a third triangle, like this:

If you think a bit, you’ll see:

• In the first triangle, the product of all numbers in the nth row is t_n.

• In the second, the product of all numbers in the nth row is u_n.

• In the third, the product of all numbers in the nth row is v_n.

And look—there’s a cool pattern! In the third triangle, all the numbers in any given row are equal. In the row with n numbers, all those numbers equal

\displaystyle{ (n+1)/n = 1 + \frac{1}{n} }

So, the product of all these numbers is

\displaystyle{ \left(1 + \frac{1}{n}\right)^n }

But it’s a famous fact that

\displaystyle{ \lim_{n \to \infty}  \left(1 + \frac{1}{n}\right)^n = e}

Some people even use this as a definition of e. So,

\displaystyle{ \lim_{n \to \infty} v_n = e}

which is just what we wanted!

Puzzle 1. Use the formula I mentioned:

\displaystyle{\binom{n}{k} = \frac{n!}{k! (n-k)!} }

to show all the numbers in the same row of the third triangle are equal.

Triangular numbers

The number of balls in a triangular stack with n balls at the bottom is called the nth triangular number,

T_n = 1 + 2 + \cdots + n

If you chop a square of balls in half along a diagonal you get a triangle, so T_n is approximately half the nth square number:

\displaystyle{ T_n \approx \frac{n^2}{2} }

But if you chop the square in half this way, the balls on the diagonal get cut in half, so to get the nth triangle number we need to include their other halves:

T_n = \displaystyle{ \frac{n^2}{2} + \frac{n}{2} = \frac{n(n+1)}{2} }

These numbers go like

1, 3, 6, 10, \dots

and they are one of the simple pleasures of life, like sunshine and good bread. Spotting them in a knight-move zigzag pattern in the multiplication table was one of my first truly mathematical experiences. You can also see them in Pascal’s triangle:


T_n = \displaystyle{ \binom{n+1}{2} }

But today it’s time to see how T_n is related to t_n, the product of all the numbers in the nth row of Pascal’s triangle!

If we subtract each triangular number from the the one before it we get the numbers -n, and if we subtract each of these numbers from the one before it we get the number 1. This should remind you of something we’ve seen:

\displaystyle{ u_n = \frac{t_n}{t_{n+1}} }

\displaystyle{ v_n = \frac{u_n}{u_{n+1}}   }

\displaystyle{ \lim_{n \to \infty} v_n = e }

Here we’re dividing instead of subtracting. But if take logarithms, we’ll be subtracting, and we get

\ln(u_n) = \ln(t_n) - \ln(t_{n+1})

\ln(v_n) = \ln(u_n) - \ln(u_{n+1})

\displaystyle{ \lim_{n \to \infty} \ln(v_n) = 1 }

What can we do with this? Well, suppose \ln(v_n) were equal to 1, instead of approaching it. Then \ln(t_n) could be the nth triangular number… and we’d get a cool formula for the product of all numbers in the nth row of Pascal’s triangle!

But since \ln(v_n) is just approximately equal to 1, we should only expect an approximate formula for the product of all numbers in the nth row of Pascal’s triangle:

\ln(t_n) \approx T_n


\displaystyle{ t_n \approx e^{T_n} = e^{n(n+1)/2}  }

This was my hope. But how good are these approximations? I left this as a puzzle on Google+, and then Greg Egan stepped in and solved it.

For starters, he graphed the ratio \ln(t_n)/T_n, and got this:

That looks pretty good: it looks like it’s approaching 1. But he also graphed the ratio t_n/e^{T_n}, and got this:

Not so good. Taking exponentials amplifies the errors. If we want a good asymptotic formula t_n, we have to work harder. And this is what Egan did.

Digging deeper

So far we’ve talked a lot about t_n, the product of all numbers in the nth row in Pascal’s triangle… but we haven’t actually computed it. Let’s try:

\begin{array}{ccl}  t_n &=& \displaystyle{ \binom{n}{0} \cdot \binom{n}{1} \cdot \cdots \cdot \binom{n}{n} }   \\  \\  &=& \displaystyle{ \frac{n!}{0! \cdot n!} \cdot \frac{n!}{1! \cdot (n-1)!} \cdot \cdots \cdot \frac{n!}{n! \cdot 0!} }   \\  \\  &=& \displaystyle{ \frac{(n!)^{n+1}}{\left(0! \cdot 1! \cdot \cdots \cdot n!\right)^2} }  \end{array}

So, we’re seeing the superfactorial

\displaystyle{ 0! \cdot 1! \cdot \cdots \cdot n! }

raise its pretty head. This is also called the Barnes G-function, presumably by people who want to make it sound more boring.

Actually that’s not fair: the Barnes G-function generalizes superfactorials to complex numbers, just as Euler’s gamma function generalizes factorials. Unfortunately, Euler made the mistake of defining his gamma function so that

\Gamma(n) = (n-1)!

when n = 1, 2, 3, \cdots. Everyone I trust assures me this was a bad idea, not some deep insight… but Barnes doubled down on Euler’s mistake and defined his G-function so that

G(n) = 0! \cdot 1! \cdot \cdots \cdot (n-2)!

So, we’ll have to be careful when looking up formulas on Wikipedia: the superfactorial of n is G(n+2). Thus, we have

\displaystyle{ t_n = \frac{(n!)^{n+1}}{G(n+2)^2} }


\displaystyle{ \ln(t_n) = (n+1) \ln (n!) - 2 \ln G(n+2) }

Now, there’s a great approximate formula for the logarithm of a factorial. It’s worth its weight in gold… or at least silver, so it’s called Stirling’s formula:

\displaystyle{ \ln(n!) = n \ln (n)  \; - \; n \; + \;\tfrac{1}{2} \ln(2 \pi n) \; +  \; \frac{1}{12n} \cdots }

where the dots mean stuff that goes to zero when divided by the last term, in the limit n \to \infty.

There’s also an approximate formula for the logarithm of the superfactorial! It’s a bit scary:

\begin{array}{ccl} \ln G(n+2) &=& \displaystyle{ \left(\tfrac{1}{2} \ln(n+1) - \tfrac{3}{4}\right) (n+1)^2 \; + }  \\  \\  &&  \tfrac{\ln(2\pi)}{2} \, (n+1) \; - \\ \\ && \tfrac{1}{12} \ln(n+1) \; + \\ \\  && \tfrac{1}{12} - \ln A \; + \cdots   \end{array}

where the dots mean stuff that actually goes to zero as n \to \infty.

What’s A? It’s the Glaisher–Kinkelin constant! If you’re tired of memorizing digits of pi and need a change of pace, you can look up the first 20,000 digits of the Glaisher–Kinkelin constant here, but roughly we have

A \approx 1.282427129100622636875342...

Of course most mathematicians don’t care much about the digits; what we really want to know is what this constant means!

Euler, who did some wacky nonrigorous calculations, once argued that

\displaystyle{ 1 + 2 + 3 + \cdots = -\tfrac{1}{12} }

Riemann made this rigorous by defining

\displaystyle{ \zeta(s) = 1^{-s} + 2^{-s} + 3^{-s} + \cdots }

which converges when \mathrm{Re}(s) > 1, and then analytically continuing this to other values of s. He found that indeed

\displaystyle{ \zeta(-1) = -\tfrac{1}{12} }

This fact has many marvelous ramifications: for example, it’s why bosonic string theory works best in 24 dimensions! It’s also connected to the \tfrac{1}{12n} term in Stirling’s formula.

But anyway, we might wonder what happens if we compute \zeta(s) for s near -1. This is where the Glaisher–Kinkelin constant shows up, because if we try a Taylor series we get

\displaystyle{ \zeta'(-1) = \tfrac{1}{12} - \ln A }

To me, this means that \tfrac{1}{12} - \ln A is more fundamental than A itself, and indeed you’ll see it’s this combination that shows up in the approximate formula for superfactorials. So, we can simplify that formula a bit:

\begin{array}{ccl} \ln G(n+2) &=& \displaystyle{ \left(\tfrac{1}{2} \ln(n+1) - \tfrac{3}{4}\right) (n+1)^2 \; + }  \\  \\  &&  \tfrac{\ln(2\pi)}{2} \, (n+1) \; -  \\ \\ &&  \tfrac{1}{12} \ln(n+1) \; + \\ \\  &&   \zeta'(-1) \; + \cdots   \end{array}

Now, let’s use this together with Stirling’s formula to estimate the logarithm of the product of all numbers in the nth row of Pascal’s triangle. Remember, that’s

\displaystyle{ \ln(t_n) = (n+1) \ln (n!) - 2 \ln G(n+2) }

So, we get

\begin{array}{ccl} \ln(t_n) &\approx& \displaystyle{(n+1)\Big[ n \ln(n) - n + \tfrac{1}{2} \ln(2 \pi n) + \frac{1}{12n} \Big] } \\  \\  && - \displaystyle{ \Big[  \left( \ln(n+1) - \tfrac{3}{2}\right) (n+1)^2 + \ln(2\pi) \cdot (n+1) -} \\ \\  && \quad \displaystyle{   \tfrac{1}{6} \ln(n+1) + 2\zeta'(-1) \Big]  }  \end{array}

and exponentiating this gives a good approximation to t_n.

Here is a graph of t_n divided by this approximation, created by Egan:

As you can see, the ratio goes to 1 quite nicely!

So, we’ve seem some nice relationships between these things:

1 + 2 + \cdots + n = T_n

1 \cdot 2 \cdot \cdots \cdot n = n!

\binom{n}{1} \cdot \binom{n}{2} \cdot \cdots \cdot \binom{n}{n} = t_n

1! \cdot 2! \cdot \cdots \cdot n! = G(n+2)

\frac{1}{0!} +\frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots = e

and Euler’s wacky formula

\displaystyle{ 1 + 2 + 3 + \cdots = -\tfrac{1}{12} }

Puzzle 2. Can you take the formula

\begin{array}{ccl} \ln(t_n) &\approx& \displaystyle{(n+1)\Big[ n \ln(n) - n + \tfrac{1}{2} \ln(2 \pi n) + \frac{1}{12n} \Big] } \\  \\  && - \displaystyle{ \Big[  \left( \ln(n+1) - \tfrac{3}{2}\right) (n+1)^2 + \ln(2\pi) \cdot (n+1) -} \\ \\  && \quad \displaystyle{   \tfrac{1}{6} \ln(n+1) + 2\zeta'(-1) \Big]  }  \end{array}

and massage it until it looks like n(n+1)/2 plus ‘correction terms’? How big are these correction terms?


Any errors in the formulas above are my fault. Here are the papers that first squeezed e out of Pascal’s triangle:

• Harlan J. Brothers, Pascal’s triangle: The hidden stor-e, The Mathematical Gazette, March 2012, 145.

• Harlan J. Brothers, Finding e in Pascal’s triangle, Mathematics Magazine 85 (2012), 51.

I learned the idea from here, thanks to Richard Elwes:

• Alexander Bogomolny, e in the Pascal Triangle, Interactive Mathematics Miscellany and Puzzles.

For how Euler derived his crazy identity

\displaystyle{ 1 + 2 + 3 + \cdots = -\tfrac{1}{12} }

and how it’s relevant to string theory, try:

• John Baez, My favorite numbers: 24.

43 Responses to Triangular Numbers

  1. arch1 says:

    “You see, permuting the x’s or the y’s doesn’t change our choice, so we have divide[d] by and k! and (n-k)[!]” (coupla typos)

    This symmetrical description of the 2-class case made me realize that I have always thought of it asymmetrically (as n!/(n-k)! ways of choosing k x’s, then dividing by k! if order irrelevant), even though I thought of the N-class case (for N>2) symmetrically. Maybe I subconsciously had trouble seeing a monomial and a binomial as peers. Duh (and thanks)!

  2. Jon Awbrey says:

    That Euler really had summability❢

  3. arch1 says:

    In the equation following “Now look:” the denominator in the RHS should be squared. It would be nice to label the lowest value (1) in the vertical axis of the last graph.

    • John Baez says:

      arch1 wrote:

      In the equation following “Now look:” the denominator in the RHS should be squared.


      It would be nice to label the lowest value (1) in the vertical axis of the last graph.

      True, but I’m not the one drawing the pictures here.

  4. domenico says:

    It is all very, very, interesting.
    I am thinking that the Pascal’s triangle could be a solution of a differential equation, with one values along two direction, with a continuous solution on all the open triangle, and with integer value in the intersection grid that cover the Pascal’s triangle.
    I am thinking that the binomial coefficients can be obtained from the relaxation equation:
    that can be write in a differential form:
    u=\epsilon u_y+O(\epsilon^2)
    the complete differential equation have integer solution (like each differential equation with a border with an integer value) in a open triangle, along some axis with y=n \epsilon constant values.
    It is possible to balance the equation on a vertex (with two Pascal rules):
    u(x,y) = u(x+\epsilon,y+\epsilon) -u(x+2\epsilon,y)
    u(x,y) = u(x-\epsilon,y+\epsilon) -u(x-2\epsilon,y)
    so that:
    0= u(x-\epsilon,y+\epsilon) -u(x-2\epsilon,y) -u(x+\epsilon,y+\epsilon) +u(x+2\epsilon,y)
    0 = u_{y}-\epsilon u_{xy}- \epsilon^2 u_{xyy}+\frac{7 \epsilon^2}{3} u_{xxx}+O(\epsilon^4)
    I am thinking that can be possible to write a differential equation with infinite terms (a Polynomial differential equation), and I think that the not integer Binomial coefficient can be a solution in each real point of the triangle, so that
    could be a solution (or it can be possible to evaluate a differential equation with this solution).
    I am thinking that there is a class of differential equation (that is an approximation of a relaxation method) that solve a problem with a integer frontier, and that have integer solution on a grid (the solution of the differential Pascal equation is a real equation, with integer solution on a grid).

  5. John Baez says:

    After originally writing this post I realized that because it’s not as complete as I’d like, I wanted to add this puzzle:

    Puzzle: Can you take the formula

    \begin{array}{ccl} \ln(t_n) &\approx& \displaystyle{(n+1)\Big[ n \ln(n) - n + \tfrac{1}{2} \ln(2 \pi n) + \frac{1}{12n} \Big] } \\  \\ && - \displaystyle{ \Big[  \left( \ln(n+1) - \tfrac{3}{2}\right) (n+1)^2 + \ln(2\pi) \cdot (n+1) -} \\ \\ && \quad \displaystyle{   \tfrac{1}{6} \ln(n+1) + 2\zeta'(-1) \Big]  } \end{array}

    and massage it until it looks like n(n+1)/2 plus ‘correction terms’? How big are these correction terms?

  6. David Lyon says:

    Isn’t the nth triangular number T_n = (\frac{n+1}{2}) ? I don’t think one can choose 2 things if there is only one thing to choose. Is there a way to see what numbers are hiding outside of the 1s in Pascal’s triangle?

    Aha, hybrid sets suggest that there are zeros outside of Pascal’s triangle. Above the triangle, however, is another triangle with a border of 1s. But this triangle has its zeros inside and its filling on the outside!

    • John Baez says:

      David wrote:

      Isn’t the nth triangular number T_n = \frac{n+1}{2} ?

      A typo there, maybe? It’s often defined as

      T_n = \frac{n(n+1)}{2}

      and that’s the definition I’m using. This is the ‘big’ triangular where you chop a square of dots in half and keep the dots on the diagonal. There’s also the ‘small’ triangular number where you leave out those dots; that’s

      \displaystyle{ \frac{n(n-1)}{2} = \binom{n}{2} }

      The big triangular number is the number of states for a pair of identical bosons each of which can be in n states. The The big triangular number is the number of states for a pair of identical fermions each of which can be in n states.

      • Arjun Jain says:

        Very exciting. In the section named Triangular numbers, below the image of the Pascal Triangle, T_n should be \displaystyle{ \binom{n+1}{2} }, since you’re using the ‘big’ triangular numbers.

        Also, there’s a nice way to see this:
        For n= 3, If you construct a 4X4 matrix with the elements being combinations of A,B,C,D, then the diagonal elements are not allowed and the table has to be symmetric. The lower or upper triangular matrix, excluding the diagonal, has all the possible combinations, i.e. \displaystyle{ \binom{n+1}{2} }, and matches up to the diagrams showing the triangular numbers.

        \begin{bmatrix} AA & AB & AC & AD \\ BA & BB & BC & BD \\ CA & CB & CC &  CD \\ DA & DB & DC & DD \end{bmatrix}

      • John Baez says:

        Arjun wrote:

        In the section named Triangular numbers, below the image of the Pascal Triangle, T_n should be \displaystyle{ \binom{n+1}{2} }, since you’re using the ‘big’ triangular numbers.

        Hi! Thanks for catching that mistake! Fixed.

        Also, there’s a nice way to see this:

        For n= 3, If you construct a 4×4 matrix with the elements being combinations of A,B,C,D, then the diagonal elements are not allowed and the table has to be symmetric. The lower or upper triangular matrix, excluding the diagonal, has all the possible combinations, i.e.

        \displaystyle{ \binom{n+1}{2} }

        Yes. This argument also generalizes to higher dimensions. For example, you can see that

        \displaystyle{ \binom{n+2}{3} }

        is the nth tetrahedral number by looking at an n × n × n cube of numbers x_{i j k} and counting the number of entries with i \le j \le k.

        It’s perhaps easier to see that the ‘small’ tetrahedral number

        \displaystyle{ \binom{n}{3} }

        counts the entries with i < j < k, since these corresponding to ways of picking a 3-element subset out of an n-element set.

        The fun really starts when you use these ideas to think about identities like

        \displaystyle{ \binom{5}{2} = \binom{5}{3} }

        since the left-hand side is a triangular number while the right-hand side is a tetrahedral number:

        1 + 2 + 3 + 4 = 1 + 3 + 6

        On the one hand, it’s obvious that

        \displaystyle{ \binom{5}{2} = \frac{5!}{2! \cdot 3!} =  \binom{5}{3} }

        but on the other, it’s not obvious that there’s a systematic way to prove identities of this sort by rearranging piles of balls—for example, rearranging a triangle of 10 balls into a tetrahedron of 10 balls.

      • Greg Egan says:

        I think there’s a systematic (albeit recursive!) way of comparing arrangements of objects that represent equal generalised triangular numbers of different dimensions.

        Define a generalised triangular number T_{d,n} of dimension d and size n as the sum of the triangular numbers of up to the same size with dimension one lower:

        \displaystyle{T_{d,n} = \sum_{k=1}^n T_{d-1,k}}

        At dimension 1, define the number itself to be equal to its size:

        \displaystyle{T_{1,n} = n}

        Now, consider T_{d,2}, which can be visualised as a d-dimensional pyramid of edge length 2, with one object at the origin and one object a unit distance away along each coordinate axis, for a total of d+1 objects.

        We can identify any set of this form with the one-dimensional set of d+1 objects arranged at unit intervals along the coordinate axis, which corresponds to the triangular number T_{1,d+1}.

        The relationship between triangular numbers of different dimensions that can be found via the symmetry of the binomial coefficient is:


        and the identification we’ve just made:

        \displaystyle{T_{d,2} = T_{1,d+1}}

        is an example of that.

        To extend the identification recursively, we need to look at two ways a triangular number can grow: when you increase its size, and when you increase its dimension.

        When you increase the size of a triangular number, leaving the dimension the same, you’re just adding a triangular number of one lower dimension and one greater size to the original number:

        \displaystyle{T_{d,n+1} = T_{d,n} + T_{d-1,n+1}}

        This can be visualised as, say, adding a linear row of objects to a triangle to increase its size, or adding an extra triangle of objects to the base of a three-dimensional triangular pyramid, and so on.

        When you increase the dimension of a triangular number, leaving its size the same, you’re doing essentially the same kind of gluing operation, but the roles of the original piece and the added piece are swapped: you add a piece of the new, higher dimension and a size one smaller than the original size.

        \displaystyle{T_{d+1,n} = T_{d,n} + T_{d+1,n-1}}

        For example, to increase the dimension of a 2-dimensional triangle of objects, you just treat that triangle as the base of a pyramid and stick on a pyramid one size smaller.

        If we take our original identification between a triangular number of any dimension d and size 2 and a 1-dimensional triangular number of size d+1 as a kind of primitive map, I believe we can recursively tear down any case of:


        into multiple cases of that primitive map.

        For example, we have:


        We visualise T_{2,4} as a triangle of size 4, and T_{3,3} as a pyramid of size 3.

        We break the triangle of size 4 into a triangle of size 3 and a linear row of size 4. We break the pyramid of size 3 into a triangle of size 3 and a pyramid of size 2.

        Identifying the linear row of size 4 and the pyramid of size 2 is our primitive map.

        The remaining pieces in this case are both triangles of size 3, but to be systematic we should continue breaking them down, one by size and one by dimension. So we break the first one into a triangle of size 2 and a linear row of size 3, and the second one into a linear row of size 3 and a triangle of size 2. Then we can identify each triangle of size 2 with a linear row of size 3 from the other pile.

      • Greg Egan says:

        I probably need to add a clearer description of the recursive decomposition of triangular numbers, because the formulas I gave previously were ways of building them up, rather than tearing them down.

        To reiterate, the basic identity we get from the binomial coefficient is:

        \displaystyle{T_{d,n} = T_{n-1,d+1}}

        We can decompose the left-hand-side as:

        \displaystyle{T_{d,n} = T_{d,n-1} + T_{d-1,n}}

        and the right-hand side as:

        \displaystyle{T_{n-1,d+1} = T_{n-2,d+1} + T_{n-1,d}}

        In both cases, we’re just splitting off a lower-dimensional triangular number of the same size as the original, and leaving behind a piece of the original dimension that’s one size smaller. But we write the two equations above with the order of the pieces swapped, because the first items on the right of each equation are equal to each other, as are the second items. So we end up with two new comparisons:

        \displaystyle{T_{d,n-1} = T_{n-2,d+1}}


        \displaystyle{T_{d-1,n} = T_{n-1,d}}

        These are both just cases of the basic identity with different values for the dimension and size. And we just keep splitting things this way, with one of the dimensions growing smaller each time, until we end up with cases of:


        That is, a linear row of size d+1 is equivalent to a pyramid of dimension d and size 2.

      • John Baez says:

        Greg wrote:

        I think there’s a systematic (albeit recursive!) way of comparing arrangements of objects that represent equal generalised triangular numbers of different dimensions.

        Great! Your description is a bit complicated, but I think this is inevitable. We’re trying to get a ‘bijective proof’ of the identity

        T_{d,k} = T_{k-1,d+1}

        where T_{d,k} is the number of balls in a d-dimensional simplex-shaped pyramid with k balls along each edge. And in this case the bijection is surprisingly subtle, even though the identity is elementary. I first heard the puzzle of describing this bijection from James Dolan.

        Here’s a very abstract way of describing it, which may unfold into your description if we try to make it explicit.

        I’m going to use a standard trick that category theorists use for bijective proofs, and reuse standard notations for various numbers as notations for sets having these numbers as their cardinalities.

        So, I’ll let

        \displaystyle{\binom{n}{d} }

        be the set of d-element subsets of the set \{1, \dots,  n\}.

        There’s a bijection

        \alpha_{n,d}: \displaystyle{\binom{n}{d} \stackrel{\sim}{\longrightarrow} \binom{n}{n-d} }

        sending each subset to its complement.

        Let T_{d,k} be the set of balls in a d-dimensional pyramid with k balls along each edge. This is a bit vague since it doesn’t give us a specific name for each ball—unlike my description of \binom{n}{d}, where we have a precise way to specify any subset, e.g.

        \{3,4,7\} \subseteq \{1, \dots, 7\}

        It can be hard to describe a bijection between sets if we don’t have names for the elements! However, we get a naming scheme for the balls in this pyramid if we think of this pyramid as the set of points

        T_{d,k} = \{(x_1, \dots, x_d) \in \mathbb{Z}^d : 1 \le x_1 \le \cdots \le x_d \le k \}

        So that’s what I’ll do. This is neatly parallel to the fact that

        \displaystyle{\binom{n}{d} = \{(x_1, \dots, x_d) \in \mathbb{Z}^d : 1 \le x_1 < \cdots < x_d \le n \} }

        Now, if we can describe a specific bijection

        \displaystyle{ \beta_{n,k} : T_{d,k} \stackrel{\sim}{\longrightarrow} \binom{n}{d} }

        where n = d+k-1, we’ll be done… since then we can compose bijections of the sort of we’ve found and get the bijection we’re looking for:

        \displaystyle{ T_{d,k} \stackrel{\beta}{\longrightarrow} \binom{n}{d} \stackrel{\alpha}{\longrightarrow} \binom{n}{n-d} \stackrel{\beta^{-1}}{\longrightarrow} T_{k-1,d+1} }

        (Here I’m leaving out all the fiddly little numbers on the \alphas and \betas, since they can be reconstructed and they don’t matter much for getting the idea across.)

        Note that

        \displaystyle{\binom{n}{d} }

        is the set of d-element subsets of \{1, \dots, n \}, while


        is the set of d-element multisubsets of \{1, \dots, k \} (meaning that elements can appear more than once). So, we just need an explicit bijection between these. And I think it’s not hard to find, but I’m getting sick of writing this!

        Anyway, this abstract approach is not really a substitute for a concrete description, just a strategy for finding such a concrete description.

      • John Baez says:

        Let’s see if the bijection between, say, 2-element multisubsets of {1,2,3} and 2-element subsets of {1,2,3,4} is evident.

        Here are the 2-element multisubsets of {1,2,3}, where the elements included (perhaps more than once) are in bold:

        1, 1, 2, 3
        1, 2, 3
        1, 2, 3
        1, 2, 2, 3
        1, 2, 3
        1, 2, 3, 3

        Here are the 2-element subsets of {1,2,3,4}, where the elements included are in bold:

        1, 2, 3, 4
        1, 2, 3, 4
        1, 2, 3, 4
        1, 2, 3, 4
        1, 2, 3, 4
        1, 2, 3, 4

        We certainly get a bijection just by listing them both ‘lexicographically’ as I’ve done here, and then matching the first multisubset with the first subset, the second with the second etc. This is perfectly systematic and works in general to give a specific bijection

        \displaystyle{ \beta : T_{d,k} \stackrel{\sim}{\longrightarrow} \binom{d+k-1}{d} }

        (using my previous notation). So, in a very abstract way, this completes my construction of a specific bijection

        \displaystyle{  T_{d,k} \stackrel{\sim}{\longrightarrow} T_{k-1,d+1} }

        But it would be nice to describe this bijection \beta a bit more directly, without listing everything… or some other bijection that does the job.

        • Graham Jones says:

          I am not following this at all closely, so perhaps I’m missing something, such as the point of the exercise. The following seems like an explicit bijection.

          Given an r-element subset of \{1,2,3\dots,n\}, write the elements out in increasing order:

          i_1 < i_2 < \dots < i_r

          then subtract j-1 from the j'th number to get the multisets:

          i_1-1+1 \leq i_2-2+1  \leq \dots \leq i_r-r+1

        • Greg Egan says:

          Graham, that certainly works! And it’s exactly equivalent to the lexicographic pairing. So it’s probably the most elegant way to convert between subsets and multisubsets.

          The point of the exercise for me is to try to see why certain triangular numbers are equal. The identity is easy to prove, but it’s also nice to understand what’s happening geometrically and combinatorially.

        • John Baez says:

          Graham wrote:

          The following seems like an explicit bijection.

          Great—you just solved this problem:

          But it would be nice to describe this bijection \beta a bit more directly, without listing everything…

          As for the point of the exercise… the point of a getting a bijective proof of a combinatorial identity is that it brings us a step towards “categorifying” all of mathematics. Categorification doesn’t just mean “using categories”, it means something quite specific: replacing equations by isomorphisms. A bijective proof takes what had been a mere equation between numbers and boosts it up to an isomorphism between sets: that is, a 1-1 correspondence, or bijection.

          Since numbers were invented to count sets, this is a way of bringing mathematics back to its roots. And in the hands of experts, isomorphisms are a lot more powerful than mere equations. An equation tells you two things are the same. An isomorphism says how.

          But here we’re mainly just having fun, taking an identity that seems obvious:

          \displaystyle{ \binom{n}{k} = \binom{n}{n-k} }

          reinterpreting it as a geometrical fact:

          There are as many balls in a d-dimensional pyramid of balls with k balls along each edge as there are balls in a (k-1)-dimensional pyramid of balls with d+1 balls along each edge.

          (where n = d+k-1), and then trying to get a direct geometrical way of seeing this latter fact… which amounts to a bijective proof.

          If we couldn’t figure out how to do this, there’d be a hole in our understanding of binomial coefficients!

        • Graham Jones says:

          I thought of the map from multisubsets to subsets geometrically. A k-element multisubset X of \{1, \dot, n\} can be seen as a path on a grid (square lattice) from (1,0) to (n,k). The path goes up by the number of 1s in X, then along one unit to the right, then up by the number of 2s in X, and so on. The corresponding subset is the same but skewed, with the horizontals replaced by diagonals. I made a picture for John’s example at the bottom of

          So each (hyper)cube in the pyramids can be labelled with paths.

        • John Baez says:

          Nice! That clarifies things a lot.

        • Greg Egan says:

          Ah, and if you flip Graham’s paths diagonally (that is, reverse the coordinates of each endpoint of a line segment in his two-dimensional diagram) you get a bijection that goes all the way through, from T_{d,k} \to T_{k-1,d+1}!

          Actually, you can’t quite just flip the coordinates, if you want to start both paths at (1,0). But that’s just matter of tweaking the choice of coordinates; geometrically, the paths are flipped.

          I don’t have time to draw a diagram of this right now, but it’s very easy to draw these paths by hand, and doing that for the first dimension-changing example:

          \displaystyle{\left(\begin{array}{ccc}  (1,1) & \to &  (3,3,3) \\  (1,2) & \to & (2,3,3) \\  (1,3) & \to & (2,2,3) \\  (1,4) & \to & (2,2,2) \\  (2,2) & \to & (1,3,3) \\  (2,3) & \to & (1,2,3) \\  (2,4) & \to & (1,2,2) \\  (3,3) & \to & (1,1,3) \\  (3,4) & \to & (1,1,2) \\  (4,4) & \to & (1,1,1) \\ \end{array} \right)}

          … makes it easy to see what’s going on.

        • Greg Egan says:

          Using Graham’s path method, I figured out an explicit formula for the bijection:

          \gamma: T_{d,k} \stackrel{\sim}{\longrightarrow} T_{k-1,d+1}

          If x \in T_{d,k} and y=\gamma(x), the coordinates of y are given by:

          \displaystyle{y_i = 1 + \sum_{j=1}^i c_j(x)}

          where c_j(x) is the count of the number of coordinates of x that are equal to j.

          Or to put this another way, y_i is one more than the number of coordinates of x that are less than or equal to i.

        • Greg Egan says:

          Another way to describe \gamma: T_{d,k} \stackrel{\sim}{\longrightarrow} T_{k-1,d+1} is by specifying the number of coordinates that take a given value.

          If y=\gamma(x), then the number of coordinates of y that are equal to a certain value, j, is:

          c_j(y) = x_j - x_{j-1}

          where we define x_0 \equiv 1 and x_{d+1} \equiv k in addition to the actual coordinates x_i, 1\le i \le d.

        • Greg Egan says:

          There are two representations of a multisubset of \{1,...,k\} with length d. One lists all the elements in ascending order, giving a d-tuple:

          (x_1,...,x_d), 1 \le x_1 \le x_2 \le ... x_d \le k

          The other specifies the number of elements of each possible value, giving a k-tuple:

          (c_1,..., c_k), c_j = |\{i : x_i=j\}|

          The bijection between triangular numbers:

          \gamma: T_{d,k} \stackrel{\sim}{\longrightarrow} T_{k-1,d+1}

          gives the “count representation” of one triangular number from a change of basis and origin of the “coordinate representation” of the other triangular number:

          c_j(\gamma(x)) = x_j - x_{j-1}, 1 \le j \le d+1

          where x_0 \equiv 1 and x_{d+1} \equiv k.

        • Greg Egan says:

          One more remark I should have added to my last comment: while the restriction on the “coordinate representation” of a multisubset is that the coordinates x_j are positive integers in ascending order, the restriction on the “count representation” is that the counts c_j are non-negative integers that sum to the dimension of the coordinate representation. In other words, the c_j must lie in an affine hyperplane with the appropriate fixed value for their dot product with (1,1,...1). And it’s easy to see that the formula:

          c_j(\gamma(x)) = x_j - x_{j-1}


          x_0 = 1

          x_{d+1} = k

          x_{j-1} \le x_j

          will yield non-negative values for c_j, and a dot product with (1,1,...1) of k-1.

          So what the bijection is doing is taking the d-dimensional pyramid, changing the coordinates to embed it in the appropriate affine hyperplane, and then treating the new coordinates of each object in the pyramid as counts for the various values in the coordinates of each object of the (k-1)-dimensional pyramid.

      • Greg Egan says:

        Cool! And it turns out that if I make the right choices, our definitions agree completely!

        Using your lexicographical bijection, the composition of the three maps ends up taking a lexicographical ordering of T_{d,k} into a reversed lexicographical ordering of T_{k-1,d+1}, because taking the complement of the subsets in the categorified binomial coefficients reverses their lexicographical order.

        In my recursive approach, if I define the splitting of a triangular number by taking the lower-dimensional part to consist of those d-tuples whose last coordinate (in the current dimension) is exactly equal to the size, n, the effect turns out to be the same as a kind of recursive lexicographical sort, and because I need to swap which part is mapped to which when I make the split, there’s a reversal of the ordering built into the process.

        To make the agreement complete, I just need to choose my “primitive” bijection between T_{1,d+1} and T_{d,2} to be a reverse lexicographical map.

      • Greg Egan says:

        Oh, and I can drop my “primitive bijection” imposed by fiat and get exactly the same thing by continuing the recursion to a deeper level, if we define zero-dimensional triangular numbers as sets of one element:

        \displaystyle{T_{0,k} = \{k\} \subset \{1,...,k\} = T_{1,k}}

      • Greg Egan says:

        I guess the most succinct way to describe the recursive approach is as follows. Define a triangular number of a given dimension as a disjoint union of triangular numbers of lower dimension:

        \displaystyle{T_{d,k} = \cup_{i=1}^k T_{d-1,i}}

        with the recursive definition terminating at:

        \displaystyle{T_{0,k} = \{k\}}

        For any triangular number of dimension greater than zero, there is a natural way to cleave it into two disjoint parts. The first part has the same dimension and a smaller size:

        \displaystyle{P_1:T_{d,k} \to T_{d,k-1} =  \cup_{i=1}^{k-1} T_{d-1,i}}

        The second part has the same size and a smaller dimension:

        \displaystyle{P_2:T_{d,k} \to T_{d-1,k}}

        To establish a bijection between T_{d,k} and T_{k-1,d+1}, we establish bijections between P_1(T_{d,k}) and P_2(T_{k-1,d+1}) and between P_2(T_{d,k}) and P_1(T_{k-1,d+1}), with the process terminating when the pair of sets we want bijections between have cardinality 1.

        And if the triangular numbers are defined as multisubsets, this is the same as matching up two lexicographically sorted lists in reverse order.

      • Greg Egan says:

        There’s a way to use recursion to define the bijection:

        \displaystyle{ \beta : T_{d,k} \stackrel{\sim}{\longrightarrow} \binom{d+k-1}{d} }

        where the binomial coefficient \binom{n}{d} is defined as the set of subsets of \{1,...,n\} of size d.

        Just as we can cleave the triangular numbers into two parts, we can do the same with the binomial coefficients. We define:

        \displaystyle{ Q_1:\binom{n}{d} \to \binom{n-1}{d}}


        \displaystyle{ Q_2:\binom{n}{d} \to \binom{n-1}{d-1}}

        In the first case, \binom{n-1}{d} is simply a subset of \binom{n}{d}. In the second case, we can embed \binom{n-1}{d-1} in \binom{n}{d} by adding the element n to every set in \binom{n-1}{d-1}. So these two maps produce disjoint subsets that comprise the whole of \binom{n}{d}.

        We then establish a bijection between T_{d,k} and \binom{d+k-1}{d} by establishing bijections between:

        \displaystyle{P_1(T_{d,k}) = T_{d,k-1}}



        and between:




        This recursive definition terminates when it calls for a bijection between two sets of cardinality 1.

      • John Baez says:

        Let me mention something one might try to do with the bijective proofs given above. Often bijective proofs can be copied in different contexts to gain new information.

        I’ve been using

        \displaystyle{\binom{n}{d}  }

        to mean, not the binomial coefficient, but a certain set whose size is that binomial coefficient: the set of d-element subsets of \{1,\dots ,n\}. But this set corresponds to a basis of the dth exterior power of \mathbb{C}^n, usually called \Lambda^d \mathbb{C}^n. In quantum physics, this is the Hilbert space for d identical fermions each having n degrees of freedom.

        I’ve been using

        \displaystyle{ T_{d,k} }

        to mean, not the number of balls in a d-dimensional pyramid with edges of length k, but a certain set whose size is that number: the set of d-element multisubsets of \{1,\dots, k\}. But this set corresponds to a basis of the dth symmetric power of \mathbb{C}^k, usually called S^d \mathbb{C}^k. This is the Hilbert space for d bosons each having k degrees of freedom.

        So, when we get our hands on a specific bijection

        \displaystyle{ \beta : T_{d,k} \stackrel{\sim}{\longrightarrow} \binom{n}{d} }

        when n = d+k-1, we also get an isomorphism of Hilbert spaces

        \displaystyle{  S^d \mathbb{C}^k \stackrel{\sim}{\longrightarrow}   \Lambda^d \mathbb{C}^n  }

        relating bosons and fermions. Is this good for something? I don’t know. But this is the kind of thing one can do.

  7. Greg Egan says:

    If we subtract T_n from the approximate formula for \ln(t_n), we get correction terms:

    \displaystyle{n^2 (\ln (n)-\ln (n+1))+n    \left(\ln (n)-2 \ln (n+1)+\frac{1}{2} \ln (2 \pi     n)+\frac{3}{2}-\ln (2 \pi )\right)}
    \displaystyle{-\frac{5}{6} \ln    (n+1)+\frac{1}{2} \ln (2 \pi  n)+ \frac{19}{12}-\ln (2    \pi ) -2 \zeta '(-1)+\frac{1}{12 n}}

    The “quadratic” term in n (that is, the terms where an explicit n^2 is visible) is essentially linear:

    \displaystyle{\lim_{n \to \infty} n^2 (\ln (n)-\ln (n+1))+n = \frac{1}{2}}

    However, the “linear” term in n is dominated by a log times linear term:

    \displaystyle{\lim_{n \to \infty} n    \left(\ln (n)-2 \ln (n+1)+\frac{1}{2} \ln (2 \pi     n)+\frac{3}{2}-\ln (2 \pi )\right) + \frac{1}{2} n \ln (n)+\frac{1}{2} n (\ln (2 \pi )-3) = -2}

    So the dominant correction term is:

    \displaystyle{-\frac{1}{2} n \ln (n)}

    followed by a linear correction term (including a contribution from the “quadratic” term):

    \displaystyle{\frac{1}{2} (1 - \ln(2 \pi)) n}

    This is followed by a log term:

    \displaystyle{-\frac{1}{3} \ln(n)}

    and then a constant:

    \displaystyle{\frac{1}{12} \left(1-24 \zeta '(-1)-6 \ln (2 \pi)\right)}

    The rest of the correction goes to zero as n \to \infty

  8. Greg Egan says:

    Just to clarify something, the plot for the ratio between t_n and the exponential of the approximation to \ln(t_n) is not correct. I’ve emailed John the correct plot, but it’s the middle of the night in his current time zone.

    What I graphed in that plot was based on an approximation to (n+1) \ln(n!) that differed from the one used in this blog post by a term \frac{1}{12n}. Obviously the difference goes to zero, and both approximations are asymptotic formulas for the same thing, but in fact the version derived in the blog post yields a ratio that converges to 1 even faster than the version currently shown in the plot.

    • John Baez says:

      The old plot looked like this:

      where t(n) is what I’m now calling t_n, the product of all numbers in the nth row of Pascal’s triangle.

      The new plot Greg sent me, showing t_n divided by the better approximation I actually give in my post, looks like this:

      The converge is indeed much faster!

  9. Spencer Shepard says:

    Very interesting! I had heard something about 1 + 2 + 3 + \cdots = -\frac{1}{12} before, but I didn’t realize that it came straight from the Riemann zeta function (since I haven’t had a lot of experience with \zeta(s), unfortunately). Now I’ll have to investigate and see if I can understand why \zeta(-1) = -\frac{1}{12}. :-)

    The formula given for e just before Puzzle 2 seems to equal e - 1. It should probably be:

    \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots = e

    This kind of messes up its parallelism with the other formulas above it, but most of the other formulas could prepend an i = 0 term/factor as well (aside from the definition of factorial, which obviously can’t start with a factor of zero).

    In particular, the formula for t_n could prepend such a factor (thought it wouldn’t change the value), since n choose 0 is actually one of the values in the given row.

  10. Greg Egan says:

    Instead of answering Puzzle 1 using the formula for the binomial coefficients, here’s a simple scenario that makes the result easy to see.

    Suppose Alice is dealt k cards from a deck of n solely black cards, and k+1 cards from a deck of n solely red ones.

    Bob gets exactly the same number of cards of each colour, but his cards are dealt from fresh decks of n-1 black cards and n+1 red ones.

    What is the ratio between the number of possible hands for Bob and the number of possible hands for Alice?

    Because the numbers of cards are the same, in the ratio there’s no need to distinguish between combinations and permutations. So the difference comes down solely to the number of possibilities available as each card is dealt.

    For the black cards, the number of possibilities for Bob’s first through to second-last cards agree with Alice’s second through to last, but he loses a factor of n (Alice’s first card) while gaining a factor of n-k (his last card).

    For the red cards, the numbers for Bob’s second through to last cards agree with Alice’s first through to second-last, but he gains a factor of n+1 (his first card) while losing a factor of n-k (Alice’s last card).

    So overall, he gains a factor of \frac{n+1}{n} = 1 + \frac{1}{n}, regardless of the value of k.

    These numbers can be seen in Pascal’s triangle by looking at the four corners of any “diamond” shape that surrounds a gap. For example, the diamond centred on the second gap of the fifth row has 3 at the top, 10 at the bottom, 4 on the left and 6 on the right. The product of the top and bottom corners divided by the product of the left and right corners is \frac{30}{24}=\frac{5}{4}, and all diamonds in the same row have the same value.

  11. joelhmayer says:

    My contribution, a continuously variable aproximator to the binomial distribution offering full dynamic range for the variables n (number of permutables in the binomial set) and p (skew or bias of the distribution). I think this expression should be attributed to Abraham de Moivre.
    DeMoivre1733 = (2.0/Sqrt[
    2.0*\[Pi]*n])*(E^-(((Subscript[X, peak] – X)^2)/(n/2.0)))
    (Obviously, I’m not knowing the best way to copy and paste an equation out of a Mathematica Notebook onto a Word Press blog article)

    • Greg Egan says:

      Joel, approximating the binomial coefficients with a Gaussian sounds like a good idea! I tried it out myself initially. But if you work through the calculations, the product of the Gaussian approximations for each row of Pascal’s triangle turns out not to be asymptotic to the product of the actual binomial coefficients.

      I believe this is because the approximation works very well around the mean, but not so well at the tails, and while that doesn’t matter much if you’re summing the terms (which are small at the tails), when you’re taking their product it leads to large discrepancies.

      • joelhmayer says:

        Greg- I used the Table function in Mathematica to test De Moivre’s Approximator against Jacob Bernoulli’s Exact Function. Here are my results for n ranging from 1 through 10.
        Bernoulli 0.5, De Moivre 0.798 | Bernoulli 0.5, De Moivre 0.564 | Bernoulli 0.375, De Moivre 0.461 | Bernoulli 0.375, De Moivre 0.399 | Bern 0.312, De M 0.357 | Bern 0.312, De M 0.326, | Bern 0.273, De M 0.302 | Bern 0.273, De M 0.282 | Bern 0.246, De M 0.266 | Bern 0.246, De M 0.252 |

        The way I do it the approximation improves with increasing n. My goal is to set a lower bound on the number of genes having an influence on any given biological manifestation. So, for example, if the bell curve for adult height solves to a 12 element binomial (13 columns) I think that implies a minimum of 12 genes with an influence on cell production rate.

      • Greg Egan says:

        Joel, you’ve given a list of evaluations of the binomial probability distribution at the mean (or the integers either side of the mean), and the Gaussian approximation to that distribution at the mean. And sure, these values at the mean get closer together as n increases.

        Unfortunately, though, that doesn’t help us obtain a good approximation for the product of all the numbers in each row of Pascal’s triangle, which is the aim of the calculations in this blog post. For n=1,..,10, the ratio between the actual product and the product of the numbers you get from the Gaussian approximation takes the values: 1.06747, 1.28577, 1.36777, 1.3446, 1.237, 1.07058, 0.87397, 0.674019, 0.49155, 0.339207. The limit of this ratio for large n turns out to be zero.

        So, alas, this is one problem where the approximation you’ve described is not applicable.

You can use HTML in your comments. You can also use LaTeX, like this: $latex E = m c^2 $. The word 'latex' comes right after the first dollar sign, with a space after it.

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Get every new post delivered to your Inbox.

Join 2,804 other followers