There are theorems whose shortest proof is insanely long. In 1936 Kurt Gödel published an abstract called “On the length of proofs”, which makes essentially this claim.

But what does ‘insanely long’ mean?

To get warmed up, let’s talk about some long proofs.

### Long proofs

You’ve surely heard of the quadratic formula, which lets you solve

with the help of a square root:

There’s a similar but more complicated ‘cubic formula’ that lets you solve cubic equations, like this:

The cubic formula involves both square roots and cube roots. There’s also a ‘quartic formula’ for equations of degree 4, like this:

The quartic formula is so long that I can only show you an absurdly shrunken version:

(Click repeatedly to enlarge.) But again, it only involves additions, subtraction, multiplication, division and taking roots.

In 1799, Paolo Ruffini proved that there was no general solution using radicals for polynomial equations of degree 5 or more. But his proof was 500 pages long! As a result, it was “mostly ignored”. I’m not sure what that means, exactly. Did most people ignore it completely? Or did everyone ignore most of it? Anyway, his argument seems to have a gap… and later Niels Abel gave a proof that was just 6 pages long, so most people give the lion’s share of credit to Abel.

Jumping ahead quite a lot, the longest proof written up in journals is the classification of finite simple groups. This was done by lots of people in lots of papers, and the total length is somewhere between 10,000 and 20,000 pages… nobody is exactly sure! People are trying to simplify it and rewrite it. The new proof will be a mere 5,000 pages long. So far six books have been written as part of this project.

Even when it’s all in one book, how can we be sure such a long proof is right? Some people want to use computers to make the argument completely rigorous, filling in all the details with the help of a program called a ‘proof assistant’.

The French are so sexy that even their proof assistant sounds dirty: it’s called Coq. Recently mathematicians led by George Gonthier have used Coq to give a completely detailed proof of a small piece of the classification of finite simple groups: the Feit–Thompson Theorem. Feit and Thompson’s original proof of this result, skipping lots of steps that are obvious to experts, took 255 pages!

What does the Feit–Thompson theorem say? Every finite group with an odd number of elements is solvable! Explaining that statement might take quite a while, depending on what you know about math. So let me just say this:

Galois invented group theory and used it to go further than Abel and Ruffini had. He showed a bunch of *specific* polynomial equations couldn’t be solved just using addition, subtraction, multiplication, division and taking roots. For example, those operations aren’t powerful enough to solve this equation:

while they can solve this one:

Galois showed that every polynomial equation has a group of symmetries, and you can solve the equation using addition, subtraction, multiplication, division and taking roots if its group has a certain special property. So, this property of a group got the name ‘solvability’.

*Every finite group with an odd number of elements is solvable*. It’s quick to say, but hard to show—so hard that making the proof fully rigorous on a computer seemed out of reach at first:

When Gonthier first suggested a formal Feit-Thompson Theorem proof, his fellow members of the Mathematical Components team at Inria could hardly believe their ears.

“The reaction of the team the first time we had a meeting and I exposed a grand plan,” he recalls ruefully, “was that I had delusions of grandeur. But the real reason of having this project was to understand how to build all these theories, how to make them fit together, and to validate all of this by carrying out a proof that was clearly deemed to be out of reach at the time we started the project.”

It took them 6 years! The completed computer proof has 170,000 lines. It involves 15,000 definitions and 4,300 lemmas. Maybe now Gonthier is dreaming of computerizing the whole classification of finite simple groups.

But there are even longer proofs of important results in math. These longer proofs all involve computer calculations. For example, Thomas Hales seems to have proved that the densest packing of spheres in 3d space is the obvious one, like this:

(There are infinitely many other *equally* dense packings, as I explained earlier, but none denser.)

Hales’ proof is a hundred pages of writing *together with about 3 gigabytes of computer calculations*. If we wrote out those calculations in a text file, they’d fill about 2 million pages!

The method is called ‘proof by exhaustion’, because it involves reducing the problem to 10,000 special cases and then settling each one with detailed calculations… thus exhausting anybody who tries to check the proof by hand. Now Hales is trying to create a fully formal proof that can be verified by automated proof checking software such as HOL. He expects that doing this will take 20 years.

These proofs are long, but they’re not what I’d call *insanely* long.

### Insanely long proofs

What do I mean by ‘insanely’ long? Well, for example, I know a theorem that you can prove using the usual axioms of arithmetic—Peano arithmetic—but whose shortest proof using those axioms contains

symbols. This is so many symbols that you couldn’t write them all down if you wrote one symbol on each proton, neutron and electron in the observable Universe!

I also know a theorem whose shortest proof in Peano arithmetic contains

symbols. This is so many that if you tried to write down the *number* of symbols—not the symbols themselves—in ordinary decimal notation, you couldn’t do it if you wrote one digit on each proton, neutron and electron in the observable Universe!

I also know a theorem whose shortest proof in Peano arithmetic contains…

… well, you get the idea.

By the way, if you don’t know what Peano arithmetic is, don’t worry. It’s just a list of obvious axioms about arithmetic, together with some obvious rules for reasoning, which turn out to be good enough to prove most everyday theorems about arithmetic. The main reason I mention it is that we need to pick *some* particular setup before we can talk about the ‘shortest proof’ of a theorem and have it be well-defined.

Also by the way, I’m assuming Peano arithmetic is consistent. If it were inconsistent, you could prove 0=1, and then everything would follow quite quickly from that. But most people feel sure it’s consistent.

If it is, then the shortest proof using Peano arithmetic of the following theorem contains at least symbols:

This statement has no proof in Peano arithmetic that contains fewer than symbols.

Huh? This doesn’t look a statement about arithmetic! But Gödel showed how you could encode the concept of ‘statement’ and ‘proof’ into arithmetic, and use this to create statements that refer to themselves. He’s most famous for this one:

This statement has no proof in Peano arithmetic.

It turns out that this statement is true… so it’s an example of a true statement about arithmetic that can’t be proved using Peano arithmetic! This is Gödel’s **first incompleteness theorem**.

This variation is similar:

This statement has no proof in Peano arithmetic that contains fewer than symbols.

This statement is also true. It’s provable in Peano arithmetic, but its proof contains at least symbols.

### Even longer proofs

But this is just the beginning. If Peano arithmetic is consistent, there are infinitely many theorems whose shortest proof is longer than 10 to the 10 to the 10 to the 10 to the… where the stack of 10’s is as long as the statement of the theorem.

Indeed, take *any computable function*, say F—growing as fast as you like. Then if Peano arithmetic is consistent, there are infinitely many theorems whose shortest proof is longer than this function of the length of the theorem!

But how do we *know* this? Here’s how. Using Gödel’s clever ideas, we can create a statement in arithmetic that says:

This statement has no proof in Peano arithmetic that is shorter than F of the length of this statement.

Let’s call this statement P.

We’ll show that if Peano arithmetic is consistent, then P is provable in Peano arithmetic. *So, according to what P itself says, P is an example of a statement whose shortest proof is insanely long!*

Now, let me sketch why if Peano arithmetic is consistent then P is provable in this setup. To save time, I’ll use N to stand for ‘F of the length of P’. So, P says:

P has no proof in Peano arithmetic whose length is less than N.

Assume there *were* a proof of P in Peano arithmetic whose length is less than N. Then, thanks to what P actually says, P would be false.

Moreover, we could carry out the argument I just gave within Peano arithmetic. So, within Peano arithmetic, we could disprove P.

But wait a minute—this would mean that within Peano arithmetic we could both prove and disprove P! We’re assuming Peano arithmetic is consistent, so this is impossible.

So our assumption must have been wrong. P must have no proof in Peano arithmetic whose length is less than N.

But this is what P says! So P is true!

Even better, P is provable in Peano arithmetic. Why? Because we can just go through all proofs shorter than N, and check that they’re not proofs of P… we know they won’t be… and this will constitute a proof that:

P has no proof in Peano arithmetic whose length is less than N.

But this is just what P says! So this is a way to prove P. Moreover we can carry out this long-winded check within Peano arithmetic, and get a proof of P *in Peano arithmetic!*

Of course, this proof has length more than N.

By the way, here I’m using the fact that there are only finitely many proofs with length less than N, so we can go through them all. We have to define ‘length’ in a way that makes this true, for my argument to work. For example, we can take the length to be the number of symbols.

Also, it’s important that our function F be computable. We need to compute N to know how many proofs we need to go through.

The upshot: if Peano arithmetic is consistent, there’s a provable statement whose shortest proof in Peano arithmetic is insanely long, by any computable standard of what counts as ‘insanely long’.

Now, so far we’ve just gotten *one* theorem with an insanely long proof. But we can get infinitely many, one for each natural number n, as follows. Let P(n) be a statement in arithmetic that says:

This statement has no proof in Peano arithmetic whose length is less than n plus F of the length of this statement.

The same argument I just sketched shows that if Peano arithmetic is consistent then P(n) is provable, but has no proof shorter than n plus F of the length of P(n). The statements P(n) are different for different values of n. So, we get infinitely many different statements with insanely long proofs… at least if Peano arithmetic is consistent, which most people believe.

### Proof speedup

But wait a minute! If we’ve proved all these statements P(n) have proofs, then we’ve basically proved them—no? And my argument, though sketchy, was quite short. So, even a completely detailed version of my argument should not be ‘insanely long’. Doesn’t that contradict my claim that the shortest proofs of these statements are insanely long?

Well, no. I only showed that the shortest proof of P(n) *using Peano arithmetic* is insanely long. I did provide a short proof of P(n). But I did this *assuming Peano arithmetic is consistent!*

So I didn’t give a short proof of P(n) using Peano arithmetic. I gave a short proof using Peano arithmetic *plus the assumption that Peano arithmetic is consistent!*

So, if we add to Peano arithmetic an extra axiom saying ‘Peano arithmetic is consistent’, infinitely many theorems get vastly shorter proofs!

This is often called **Gödel’s speedup theorem**. For more, see:

• Gödel’s speedup theorem, Wikipedia.

### Connections

It’s pretty cool how just *knowing Peano arithmetic was consistent* would let us vastly shorten infinitely many proofs.

As an instant consequence, we get Gödel’s **second incompleteness theorem**: it’s impossible to use Peano arithmetic to prove its own consistency. For if we could, adding that consistency as an extra axiom couldn’t shorten proofs by an arbitrarily large amount. It could only shorten proofs by a fixed finite amount: roughly, the number of symbols in the proof that Peano arithmetic is consistent.

And while we’re at it, let’s note how the existence of insanely long proofs is related to another famous result: the **Church–Turing theorem**. This says it’s impossible to write a computer program that can decide in a finite time, yes or no, whether any given statement is provable in Peano arithmetic.

Suppose that in Peano arithmetic there were a computable upper bound on the length of a proof of any statement, as a function of the length of that statement. Then we could write a program that would go through all potential proofs of any statement and tell us, in a finite amount of time, whether it had a proof. So, Peano arithmetic would be decidable!

But since it’s not decidable, no such computable upper bound can exist. This is another way of seeing that there must be theorems with insanely long proofs.

So, the existence of insanely long proofs is tightly connected to the inability of Peano arithmetic to prove itself consistent, and also the undecidability of Peano arithmetic. And these are features not just of Peano arithmetic, but of *any* system of arithmetic that’s sufficiently powerful, yet simple enough that we can write a program to check to see if a purported proof really is a proof.

### Buss’ speedup theorem

In fact Gödel stated his result in a more sophisticated way than I did. He never published a proof… but *not* because the proof is insanely long, probably just because he was a busy man with many ideas. Samuel Buss gave a proof in 1994:

• On Gödel’s theorems on lengths of proofs I: Number of lines and speedups for arithmetic, *Journal of Symbolic Logic* **39** (1994), 737–756.

In **first-order arithmetic** you can use variables like x,y,z to stand for natural numbers like 0,1,2,3,… This is the kind of arithmetic I’ve been talking about so far: Peano arithmetic is an example. In **second-order arithmetic** you can also use variables of a different kind to stand for *sets* of natural numbers. **Third-order arithmetic** goes further and lets you use variables for *sets of sets* of natural numbers, and so on.

Gödel claimed, and Buss showed, that each time you go up an order, some statements that were already provable get insanely shorter proofs. So, turning this fact around, there are theorems that have insanely long proofs in first-order arithmetic!

(And similarly for second-order arithmetic, and so on…)

For more details, explained in a fairly friendly way, try:

• Speed-up theorems, *Stanford Encyclopedia of Philosophy*.

By the way, this post is a kind of followup to my post on enormous integers. Insanely long proofs are *related* to enormous integers: if you know a quick way to describe an enormous integer, you can use the trick I described to cook up a theorem with an enormous proof.

Last time we saw the logician Harvey Friedman was a master of enormous integers. So it’s not surprising to me that Wikipedia says:

Harvey Friedman found some explicit natural examples of this phenomenon, giving some explicit statements in Peano arithmetic and other formal systems whose shortest proofs are ridiculously long.

However, I don’t know these statements. I assume they’re more natural than the weird self-referential examples I’ve described. What are they?

Ok, but what happens if we add this consistency assumption to the statement? What about the statement P2 = “This statement has no proof in Peano arithmetic, assuming it is consistent, that is shorter than N.”

Going in the similar vein if we assume there were a proof of P2 in Peano arithmetic whose length is less than N we get a contradiction.

So there cannot be a proof of P2 shorter then N. So statement P2 is true. But this amounts to the proof and it’s shorter then N if we pick large enough N so again we get a contradiction.

So your original statement P that was true when not containing the assumption in the statement, turns into a contradiction if we add an assumption that Peano arithmetic is consistent. Doesn’t that mean that Peano arithmetic is not consistent? Or at the very least that the original statement relied on (potential?) inconsistency of Peano arithmetic to hold?

Arrow wrote:

Very good point! But in fact the whole story here

completely mimicsthe story I told. So, we only get the contradiction you mention if make an extra assumption, namely:And this assumption is, perhaps surprisingly, stronger than the assumption that Peano arithmetic is consistent. Indeed it must be, because Gödel’s second incompleteness theorem applies not just to Peano arithmetic but a large class of systems, including “Peano arithmetic, plus the assumption that Peano arithmetic”. None of these systems can prove its own consistency. So, adding an axiom saying that the system is consistent always gives a truly new system which can prove some new things the old one couldn’t, and can prove some other things much faster than the old one could.

So, if “Peano arithmetic, plus the assumption that Peano is consistent” is consistent:

1) your statement P2 will take more than N steps to prove in that system

but

2) P2 will have a much shorter proof in the system “Peano arithmetic, plus the assumption that Peano is consistent, plus the assumption that “Peano arithmetic, plus the assumption that Peano is consistent” is consistent”.

And it keeps on going like this. It gets very tiring to write out, so mathematicians use abbreviations:

PA

PA + Con(PA)

PA + Con(PA) + Con(PA + Con(PA))

PA + Con(PA) + Con(PA + Con(PA)) + Con(PA + Con(PA) + Con(PA + Con(PA)))

etcetera. Each time you go up one notch, you get a new axiom system that makes certain proofs much shorter!

Since logicians like thinking about the infinite, this is just the beginning of the story…

Nice post, thanks. I like to view this as a consequence of a Turing-undecidability result about PA statements :

1. All the “work” is done in proving first that it is algorithmically undecidable if a statement from PA is true. This can be done by direct reduction from the Turing machine halting problem, or by reduction from the Post correspondence problem.

2. Now, if every true statement of PA had a proof, one could enumerate all the possible proofs in increasing length, and there would eventually always be a proof of either PA or not(PA). This would allow to decide the truthness of statements of PA, contradicting (1).

3. “Speed-up” : if the length of the proof of a statement was bounded by a computable function f(n), one could compute f(n) and try all the proofs with length less than f(n) in order to decide if a statement of PA is true. So there is no such computable f(n). This implies that there are true and provable statements whose shortest proof length cannot be bounded by *any* computable function, including for example high order towers of exponentials.

Yes, this is a nice and somewhat more systematic approach to the whole subject. The fun thing about Gödel sentences and their variants is that they let us give explicit

examplesof true statements that either have no proofs or insanely long proofs in PA (assuming PA is consistent). Of course these examples are strangely self-referential; eventually one wants examples that seem mathematically ‘natural’.There was a MathOverflow question about this, and Richard Borcherds gave the example of Friedman’s “finite miniaturization” of Kruskal’s theorem.

Wow, thanks! That might be the basis of a nice blog post someday.

I assume the statements you are looking for are in http://www.math.osu.edu/~friedman.8/pdf/EnormousInt112201.pdf http://www.math.osu.edu/~friedman.8/pdf/EnormousInt.12pt.6_1_00.pdf .

Amazing. This was very interesting. Which books would you suggest for this type of stuff?

If you want a ‘fun’ book on this sort of stuff, I recommend Douglas Hofstadter’s

Gödel, Escher, Bach: an Eternal Golden Braid. Every mathematician should read this, preferably when they’re young—it changed my life when I read it in college, and that happened to lots of people. For older people, who are more knowledgeable and jaded, it seems to make less of an impression.If you want a textbook, I recommend

Computability and Logicby Boolos, Burgess and Jeffreys. It covers many of the basic mind-blowing theorems of early 20th-century logic. I took a course based on this book when I was a sophomore in college, because I wanted to work on mathematical logic.I also recommend this page of mine if you haven’t read it yet:

• Surprises in logic.

“Now Hales is trying to create a fully formal proof that can be verified by automated proof checking software such as HOL. He expects that doing this will take 20 years.”

Seems to be done now: https://code.google.com/p/flyspeck/wiki/AnnouncingCompletion

Great!

• Jacob Aron, Proof confirmed of 400-year-old fruit-stacking problem,

New Scientist, 12 August 2014.“An enormous burden has been lifted from my shoulders,” said Thomas Hales. “I suddenly feel ten years younger!”

This is one of the world’s biggest proofs: it’s 200 terabytes long! That’s about equal to all the digitized text held by the US Library of Congress. There’s also a 68-gigabyte digital signature—sort of a proof that a proof exists—if you want to skim it.

It’s interesting that these 200 terabytes were used to solve a yes-or-no question, whose answer takes a single bit to state:

no.Just because the planetmath picture has died… maxima (formerly macsyma) knows the general solution of a quartic, and offers tex output. Available for your amusement here: http://jessecmckeown.tumblr.com/quartic

Thanks a million! I’ll try to include a version of this, or PlanetMath’s new version, in my blog article. It’s so long that I can’t do a screenshot!

Darn, I can’t figure out any way to create a jpg of either your image or the still more impressively huge PlanetMath version, which includes all four roots. They’re too

wide!Can someone do it?

Okay, both Jesse and Greg Egan kindly provided pictures of the quartic’s roots, adapted from PlanetMath. Here’s one version:

And here’s another:

Thank god nobody could solve the quintic!

Slight error in your cubic; as you later note, those are the roots not of , but of . Mathematica does them fine too, by the way!