Shelves and the Infinite

Infinity is a very strange concept. Like alien spores floating down from the sky, large infinities can come down and contaminate the study of questions about ordinary finite numbers! Here’s an example.

A shelf is a set with a binary operation \rhd that distributes over itself:

a \rhd (b \rhd c) = (a \rhd b) \rhd (a \rhd c)

There are lots of examples, the simplest being any group, where we define

g \rhd h = g h g^{-1}
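
If you want to check the self-distributive law concretely, here's a minimal sketch in Python (my own illustration; the helper names are made up) verifying it for conjugation in the symmetric group S_3:

```python
# Check that g ⊳ h = g h g^{-1} is self-distributive in S_3.
# Permutations are stored as tuples p with p[i] = image of i.
from itertools import permutations, product

def compose(p, q):
    # (p ∘ q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def shelf(g, h):
    # g ⊳ h = g h g^{-1}
    return compose(g, compose(h, inverse(g)))

S3 = list(permutations(range(3)))
assert all(shelf(a, shelf(b, c)) == shelf(shelf(a, b), shelf(a, c))
           for a, b, c in product(S3, repeat=3))
print("conjugation in S_3 is self-distributive")
```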

They have a nice connection to knot theory, which you can see if you think hard.

My former student Alissa Crans, who invented the term ‘shelf’, has written a lot about them, starting here:

• Alissa Crans, Lie 2-Algebras, Chapter 3.1: Shelves, Racks, Spindles and Quandles, Ph.D. thesis, U.C. Riverside, 2004.

I could tell you a long and entertaining story about this, including the tale of how shelves got their name. But instead I want to talk about something far more peculiar, which I understand much less well. There’s a strange connection between shelves, extremely large infinities, and extremely large finite numbers! It was first noticed by a logician named Richard Laver in the late 1980s, and it’s been developed further by Randall Dougherty.

It goes like this. For each n, there’s a unique shelf structure on the numbers \{1,2, \dots ,2^n\} such that

a \rhd 1 = a + 1 \bmod 2^n

So, the elements of our shelf are

1

1 \rhd 1 = 2

2 \rhd 1 = 3

and so on, until we get to

2^n \rhd 1 = 1

However, we can now calculate

1 \rhd 1

1 \rhd 2

1 \rhd 3

and so on. You should try it yourself for a simple example! You’ll need to use the self-distributive law. It’s quite an experience.

You’ll get a list of 2^n numbers, but this list will not contain all the numbers \{1, 2, \dots, 2^n\}. Instead, it will repeat with some period P(n).
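
In fact the whole table is easy to compute mechanically. Setting c = 1 in the self-distributive law gives a \rhd (b \rhd 1) = (a \rhd b) \rhd (a \rhd 1), that is, a \rhd (b+1) = (a \rhd b) \rhd (a+1), with 2^n + 1 meaning 1. Here's a minimal Python sketch of this (my own, with made-up function names; it stores the full table, so it's only practical for small n), filling in the rows starting from a = 2^n, so that everything needed on the right-hand side is already available:

```python
# Build the shelf on {1, ..., 2^n} from
#   a ⊳ 1 = a + 1  (mod 2^n, so that 2^n ⊳ 1 = 1)
#   a ⊳ (b + 1) = (a ⊳ b) ⊳ (a + 1)
# Rows are filled from a = 2^n down to a = 1, using the fact (visible in the
# tables) that a ⊳ b > a whenever a < 2^n.

def laver_table(n):
    N = 2 ** n
    op = [[0] * (N + 1) for _ in range(N + 1)]   # op[a][b] will hold a ⊳ b
    for a in range(1, N + 1):
        op[a][1] = a % N + 1                     # the base case a ⊳ 1
    for a in range(N, 0, -1):                    # last row first
        for b in range(1, N):
            op[a][b + 1] = op[op[a][b]][a % N + 1]
    return op

def P(n):
    """Period of the list 1 ⊳ 1, 1 ⊳ 2, ..., 1 ⊳ 2^n."""
    N = 2 ** n
    row = laver_table(n)[1][1:]
    for p in range(1, N + 1):
        if all(row[b] == row[b + p] for b in range(N - p)):
            return p

print([P(n) for n in range(6)])   # should print [1, 1, 2, 4, 4, 8]
```

The printed values are the first few terms of the sequence below.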

Here is where things get weird. The numbers P(n) form this sequence:

1, 1, 2, 4, 4, 8, 8, 8, 8, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 
16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, …

It may not look like it, but the numbers in this sequence approach infinity!

…if we assume an extra axiom, which goes beyond the usual axioms of set theory, but so far seems consistent!

This axiom asserts the existence of an absurdly large cardinal, called an I3 rank-into-rank cardinal.

I’ll say more about this kind of cardinal later. But, this is not the only case where a ‘large cardinal axiom’ has consequences for down-to-earth math, like the behavior of some sequence that you can define using simple rules.

On the other hand, Randall Dougherty has proved a lower bound on how far you have to go out in this sequence to reach the number 32.

And, it’s an incomprehensibly large number!

The third Ackermann function A_3(n) is roughly 2 to the nth power. The fourth Ackermann function A_4(n) is roughly 2 raised to itself n times:

2^{2^{2^{2^{\cdot^{\cdot^\cdot}}}}}

And so on: each Ackermann function is defined by iterating the previous one.

Dougherty showed that for the sequence P(n) to reach 32, you have to go at least

n = A(9,A(8,A(8,255)))

This is an insanely large number!

I should emphasize that if we use just the ordinary axioms of set theory, the ZFC axioms, nobody has proved that the sequence P(n) ever reaches 32. Neither is it known that this is unprovable if we only use ZFC.

So, what we’ve got here is a very slowly growing sequence… which is easy to define but grows so slowly that (so far) mathematicians need new axioms of set theory to prove it goes to infinity, or even reaches 32.

I should admit that my definition of the Ackermann function is rough. In reality it’s defined like this:

A(m, n) = \begin{cases} n+1 & \mbox{if } m = 0 \\ A(m-1, 1) & \mbox{if } m > 0 \mbox{ and } n = 0 \\ A(m-1, A(m, n-1)) & \mbox{if } m > 0 \mbox{ and } n > 0. \end{cases}

And if you work this out, you’ll find it’s a bit annoying. Somehow the number 3 sneaks in:

A(1,n) = 2 + (n+3) - 3

A(2,n) = 2 \cdot (n+3) - 3

A(3,n) = 2^{n+3} - 3

A(4,n) = 2\uparrow\uparrow(n+3) - 3

where a \uparrow\uparrow b means a raised to itself b times,

A(5,n) = 2 \uparrow\uparrow\uparrow(n+3) - 3

where a \uparrow\uparrow\uparrow b means a \uparrow\uparrow (a \uparrow\uparrow (a \uparrow\uparrow \cdots )) with the number a repeated b times, and so on.
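
Here's a tiny Python check of these closed forms against the recursive definition (a sketch of my own; naive recursion blows up fast, so only small inputs are practical):

```python
# The Ackermann function, straight from the recursive definition above.
def A(m, n):
    if m == 0:
        return n + 1
    if n == 0:
        return A(m - 1, 1)
    return A(m - 1, A(m, n - 1))

# Compare with the closed forms for small n.
for n in range(5):
    assert A(1, n) == 2 + (n + 3) - 3        # = n + 2
    assert A(2, n) == 2 * (n + 3) - 3        # = 2n + 3
    assert A(3, n) == 2 ** (n + 3) - 3
print("closed forms check out for small inputs")
```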

However, these irritating 3’s scarcely matter, since Dougherty’s number is so large… and I believe he could have gotten an even larger lower bound if he wanted.

Perhaps I’ll wrap up by saying very roughly what an I3 rank-into-rank cardinal is.

In set theory the universe of all sets is built up in stages. These stages are called the von Neumann hierarchy. The lowest stage has nothing in it:

V_0 = \emptyset

Each successive stage is defined like this:

V_{\lambda + 1} = P(V_\lambda)

where P(S) is the power set of S, that is, the set of all subsets of S. For ‘limit ordinals’, that is, ordinals that aren’t of the form \lambda + 1, we define

\displaystyle{ V_\lambda = \bigcup_{\alpha < \lambda} V_\alpha }
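
For example, the first few stages are finite:

V_1 = \{\emptyset\}, \qquad V_2 = \{\emptyset, \{\emptyset\}\}, \qquad V_3 = \{\emptyset, \{\emptyset\}, \{\{\emptyset\}\}, \{\emptyset, \{\emptyset\}\}\}

so |V_{m+1}| = 2^{|V_m|} for finite m, and the first infinite stage, V_\omega, consists of all hereditarily finite sets.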

An I3 rank-into-rank cardinal is an ordinal \lambda such that V_\lambda admits a nontrivial elementary embedding into itself.

Very roughly, this means the infinity \lambda is so huge that the collection of sets that can be built by this stage can be mapped, in a one-to-one but not onto way, into a smaller collection that’s indistinguishable from the original one when it comes to the validity of anything you can say about sets!

More precisely, a nontrivial elementary embedding of V_\lambda into itself is a one-to-one but not onto function

f: V_\lambda \to V_\lambda

that preserves and reflects the validity of all statements in the language of set theory. That is: for any formula \phi(x_1, \dots, x_n) in the language of set theory and any sets a_1, \dots, a_n \in V_\lambda, the statement \phi(a_1, \dots, a_n) holds if and only if \phi(f(a_1), \dots, f(a_n)) does.

I don’t know why, but an I3 rank-into-rank cardinal, if it’s even consistent to assume one exists, is known to be extraordinarily big. What I mean by this is that it automatically has a lot of other properties known to characterize large cardinals. It’s inaccessible (which is big) and ineffable (which is bigger), and measurable (which is bigger), and huge (which is even bigger), and so on.

How in the world is this related to shelves?

The point is that if

f, g : V_\lambda \to V_\lambda

are elementary embeddings, we can apply f to any set in V_\lambda. But in set theory, functions are sets too: sets of ordered pairs. So, g is a set. It’s not an element of V_\lambda, but its initial segments g \cap V_\alpha are, for each \alpha < \lambda. So, we can define

f \rhd g = \bigcup_{\alpha < \lambda} f (g \cap V_\alpha)

Laver showed that this operation distributes over itself:

f \rhd (g \rhd h) = (f \rhd g) \rhd (f \rhd h)

He also showed that if we take an elementary embedding and let it generate a shelf by this operation, the shelf we get is isomorphic to the free shelf on one generator!

The shelf I started out describing, the numbers \{1, \dots, 2^n \} with

a \rhd 1 = a + 1 \bmod 2^n

also has one generator, namely the number 1. So, it’s a quotient of the free shelf on one generator by one relation, namely the above equation.

That’s about all I understand. I don’t understand how the existence of a nontrivial elementary embedding of V_\lambda into itself implies that the function P(n) goes to infinity, and I don’t understand Randall Dougherty’s lower bound on how far you need to go to reach P(n) = 32. For more, read these:

• Richard Laver, The left distributive law and the freeness of an algebra of elementary embeddings, Adv. Math. 91 (1992), 209–231.

• Richard Laver, On the algebra of elementary embeddings of a rank into itself, Adv. Math. 110 (1995), 334–346.

• Randall Dougherty and Thomas Jech, Finite left distributive algebras and embedding algebras, Adv. Math. 130 (1997), 201–241.

• Randall Dougherty, Critical points in an algebra of elementary embeddings, Ann. Pure Appl. Logic 65 (1993), 211–241.

• Randall Dougherty, Critical points in an algebra of elementary embeddings, II.

46 Responses to Shelves and the Infinite

  1. jessemckeown says:

    I’m pretty sure “one-to-one” follows from “elementary embedding”; if we have x ∈ y \ z , we must have f(x) ∈ f(y) \ f(z)… but that’s just for the parsimoniously-inclined; some things it really is better to mention up-front.

    And now, because echoes are fun to note, … you’ve made some remarks about what does and does not matter to the Ackermann function, and … they sound an AWFUL LOT like what does and does not matter to an I3 rank-into-rank cardinal: an elementary embedding misses some things, as does jumping to 2^(n+3) rather than 2^(n+3)-3; (or to 2^(n+3)-3 from 2^n … ), but the rank-into-rank cardinal at the top of it doesn’t care — but even more emphatically, even more precisely.

    … Is it a good thing that the lower bound was A(n,A(m,A(m,k))) rather than A(A(A(m,k),m),n)?

    • John Baez says:

      Jesse wrote:

      … Is it a good thing that the lower bound was A(n,A(m,A(m,k))) rather than A(A(A(m,k),m),n)?

      “Good thing”? I guess it depends on whether you like big numbers or small ones. Since I like big ones, I was sort of disappointed that the bound was of the form A(n,A(m,A(m,k))); this makes it much smaller.

      People who like big numbers may want to compare this post to an earlier one:

      Enormous integers.

      That post mentions another very large number arising from a simple math problem; Harvey Friedman has proved an upper bound on that given by

      A_{A_5(5)}(A_5(5))

      This uses his improved version of the Ackermann function, which obeys

      A_1(n) = 2 n

      A_2(n) = 2^n

      and

      A_{k+1}(n) = A_k (A_k (A_k \cdots (n)))

      where we use n A_ks.

      So, Harvey Friedman’s bound is really big, since we’re iterating a process of making a function much bigger A_5(5) times. Unfortunately it’s an upper bound, just like Graham’s number, so it might become much smaller when we learn more, just as with the upper bound that led to Graham’s number.

      • Royce Peng says:

        Heh, now I’ve started reading all your articles. :)

        nitpick: that should be

        A_{k+1}(n) = A_k(A_k(A_k \cdots (1)...))

        Defining

        A_{k+1}(n) = A_k(A_k(A_k \cdots (n)...))

        gets you the fast-growing hierarchy, which doesn’t resolve into Knuth arrows quite as nicely.

        Concerning Friedman’s n(k) function, I read over Friedman’s paper not too long ago, and discovered that the same argument that he used to prove n(3) > A_{7198}(153,000) could be used to give a lower bound for n(k) of roughly F_{\omega^{k-2}}(7198). So for example n(4) > F_{\omega^2}(7198) is already much more than the given lower bound of about A^{A(187,196)}(1).

  2. davetweed says:

    The I3 rank-into-rank cardinal is huge, but is it yuge? (I half feel like someone should find some new classification relating to size and call it that.)

    • John Baez says:

      It’s yuge.

      As time goes on, what were once considered really enormous cardinals were later trumped by even bigger ones. This is the reason for the title of Kanamori’s book The Higher Infinite, which at first sounds pompous. (It’s a wonderful book to carry around if you want to impress other mathematicians: none of those puny lower infinities for me, please!)

      If you go to this website:

      Cantor’s Attic.

      you’ll see infinities grouped into kinds. The I3 rank-into-rank cardinals are in the upper attic:

      Welcome to the upper attic, the transfinite realm of large cardinals, the higher infinite, carrying us upward from the merely inaccessible and indescribable to the subtle and endlessly extendible concepts beyond, towards the calamity of inconsistency.

      If you go up there you’ll see ‘remarkable’ cardinals, ‘ethereal’ cardinals, ‘indescribable’ cardinals and more.

      The only cardinals I know bigger than the I3 rank-into-rank cardinals are the I2, I1 and I0 rank-into-rank cardinals. These are all watered-down versions of the Reinhardt cardinals, which were proved to be inconsistent with the axiom of choice thanks to Kunen’s inconsistency theorem.

      I know you won’t follow all those links, since our lives are not infinite… so I’ll just say here: it’s inconsistent with the axiom of choice to assume there’s an elementary embedding of the whole universe into itself! So, these watered-down cardinals assume there’s an elementary embedding of some ‘stage’ of building the universe into itself.

      In short: if the axiom of choice holds, the universe of all sets is too big to fit inside a copy of itself… or, if you prefer, it’s too small to contain a copy of itself!

    • It’s a mild weakening of the inconsistent Reinhardt cardinal, so it’s about as big as things can be.

  3. Tim Campion says:

    There’s a subtle point at the end. In general, one large cardinal axiom A can be higher in consistency strength than another B without A actually implying B, and I think that’s the case here. Even though the existence of an inaccessible cardinal is much lower in consistency strength than the existence of an I3 cardinal, I believe that if there is an elementary embedding j: V_\lambda \to V_\lambda, then \lambda is not inaccessible. After all, if \lambda were inaccessible, then V_\lambda would model ZFC, and j would be a non-identity elementary embedding from the universe to itself in this model of ZFC. But Kunen showed this is inconsistent. The I3 axiom is supposed to be a weakening of Reinhardt’s axiom saying that there’s a nontrivial elementary embedding from the universe to itself.

    • John Baez says:

      Oh, wow! Thanks—as you can see, I’m just learning this stuff. So an I3 rank-into-rank cardinal can’t be measurable either, since measurable cardinals are inaccessible. In what sense is an I3 cardinal ‘large’ then, apart from consistency strength? If we could prove it’s larger than some measurable cardinal, that would be fine.

      • Tim Campion says:

        I’m purely going by what I’m reading on the internet at this point, but when talking about rank-into-rank cardinals there are really two cardinals that might be considered to be “the large cardinal”. If there is a non-identity elementary embedding j: V_\lambda \to V_\lambda, then \lambda might be considered a large cardinal even though it is not inaccessible. Or the critical point of j may be considered a large cardinal. This is the smallest cardinal \kappa such that j(\kappa) \neq \kappa. It turns out that this critical point of j is inaccessible, measurable, huge, etc.

      • The rank-into-rank cardinals and their relation with finite self-distributive algebras are truly majestic. While exciting results about the classical Laver tables were proven in the 1990s, unfortunately from the late 1990s until the 2010s no one proved any further interesting results about them, but in the last couple of years a couple of papers about the classical Laver tables have been published. These papers include Victoria Lebed and Patrick Dehornoy’s paper, which calculates the cocycle groups of the classical Laver tables and gives some invariants of positive braids. Lebed and Dehornoy’s work is a good step towards possibly applying Laver tables and hence large cardinals to knot theory and braid theory. I am looking forward to when set theory has more applications in these subject areas and other subject areas. Furthermore, I am looking forward to more results about finite structures or results in knot theory or other areas whose only known proof relies on very large cardinal axioms as hypotheses.

      • John Baez. Yes. The first rank-into-rank cardinal is much larger than the first measurable cardinal. However, when people talk about rank-into-rank embeddings j:V_{\lambda}\rightarrow V_{\lambda} there are two cardinals in question, namely \lambda and the critical point \kappa=\text{crit}(j) which is the first ordinal with j(\kappa)>\kappa. The cardinal \lambda is always a singular strong limit cardinal of countable cofinality, so \lambda is not even inaccessible. On the other hand, the cardinal \kappa is measurable (and huge and much more). The term “rank-into-rank cardinal” could also refer to the critical point \kappa as opposed to \lambda. Most large cardinal axioms (like measurable, supercompact, and so forth) refer to the critical points of certain elementary embeddings rather than other cardinals involved, so when people hear the term rank-into-rank cardinal they may think about the critical point \kappa rather than \lambda (unless defined otherwise of course).

        On another note, the best way to understand the seeming discrepancy between the consistency strength and size of various large cardinals is that cardinals higher in the large cardinal hierarchy imply the existence of very nice models which satisfy large cardinal axioms lower in the large cardinal hierarchy and these very nice models are in many cases initial segments V_{\gamma} of the universe where \gamma is a large cardinal. For example, the existence of a rank-into-rank embedding does not imply the existence of a strongly compact cardinal since the first strongly compact cardinal could be larger (in size) than the first rank-into-rank cardinal. However, if j:V_{\lambda}\rightarrow V_{\lambda} is a rank-into-rank embedding with critical point \kappa, then V_{\kappa} models “there are many strongly compact cardinals”.

    • Tim Campion. A cardinal \lambda with V_{\lambda}\models ZFC does not preclude the existence of a rank-into-rank embedding from V_{\lambda} to V_{\lambda}. In fact, there are elementary embeddings from models of ZFC into themselves at much lower levels of the large cardinal hierarchy. For example, the axiom 0^{\sharp} (a large cardinal axiom between indescribable and measurable) is equivalent to saying there is an elementary embedding j:L\rightarrow L (recall that L is the constructible universe which is the smallest inner model of ZFC). The reason why we can obtain such an elementary embedding j from L to L is that 0^{\sharp} implies that V\neq L and also that L is much smaller than V. Therefore, even though we know that there is no elementary embedding from V to V, there could still be an elementary embedding from L to L since initial segments of the elementary embedding j:L\rightarrow L live outside of L.

      Furthermore, if j:V_{\lambda}\rightarrow V_{\lambda} with crit(j)=\kappa, then V_{\kappa} is actually an elementary substructure of V_{\lambda}. Since \kappa is inaccessible V_{\kappa} models ZFC so V_{\lambda}\models ZFC as well.

  4. Bruce Smith says:

    I don’t feel like trying to generate one of those P(n)’s, so I can’t be sure it can always be done in an algorithmic way… but if you promise me it can be, doesn’t that make it inevitable that for any n, there is a proof in PA that P(n) has a particular value? (Which consists of evaluating all those 2^n expressions and pointing out their period.)

    (OTOH if you say it can’t be done by an algorithm (which is provably correct, by noncontroversial axioms), then I’m unlikely to consider P(n) well-defined.)

    • Layra Idarani says:

      Generating the entire Laver table itself for 2^n isn’t too bad, actually. The trick [SPOILER] is to compute a \rhd b by starting with the cases of c \rhd d for c > a, and then for c = a, d < b. Partially-reverse dictionary order.[/SPOILER]

      I don’t actually know that this always works, but for the cases I’ve computed, n = 2, 3, 4, we always have that a \rhd b > a except when a = 2^n, so if that always holds then the algorithm works.

      So there’s a nice algorithm for generating the table, and then you can just read off P(n). I haven’t tried n = 5 yet, though.

      • Layra Idarani says:

        Here’s an explicit algorithm for computing the tables, which also fixes the gap in my last post.

        I use the variant that Dehornoy gave for rewriting A_n into an isomorphic B_n that nests more easily. The map from A_n to B_n is x \mapsto 2^n - x. Explicitly, on the set \{0,...,2^n-1\}, we define

        x \rhd (2^n-1) = x - 1 \;\mathrm{mod}\; 2^n

        0 \rhd y = y

        x \rhd (y - 1) = (x \rhd y) \rhd (x - 1)

        In this case, P(n) is the period of (2^n - 1) \rhd y. Notably, now B_n is an actual subalgebra of B_{n+1}, rather than simply isomorphic to one.

        Lemma: For all x > 0, x \rhd y < x for all y.

        Proof: Induction:

        Base cases:
        For x = 1, we get that 1 \rhd y = 0.
        For x > 1, x \rhd (2^n - 1) = x - 1 < x.
        We’re left with the cases of x > 1 and y < 2^n - 1.

        Inductive step: Now suppose that for all 0 < s < x and all t, s \rhd t < x and for all t \geq y, x \rhd t < x. Then

        x \rhd (y - 1) =  (x \rhd y) \rhd (x - 1).

        (x \rhd y) < x, so we either have 0 \rhd (x - 1) = x - 1 < x, or we have s \rhd (x - 1) for some 0 < s < x, which by the induction hypotheses is also less than x.

        Now we can compute iteratively. We start with the entries explicitly computed, and then look at

        2 \rhd (2^n-2), 2\rhd (2^n - 3), \ldots 2 \rhd 0, 3 \rhd (2^n - 1), \ldots

        At each step, when computing x \rhd (y - 1), we’ve already computed x \rhd y < x and s \rhd (x - 1) for all s < x, so x \rhd (y - 1) is uniquely determined.

        Thus the table is filled in algorithmically.
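
        For concreteness, here's a direct Python transcription of this procedure (a sketch, with invented names): fill row x = 0 first, then rows x = 1, 2, …, working within each row from y = 2^n - 1 downward, and read off P(n) as the period of the last row.

```python
# Dehornoy-style variant on {0, ..., 2^n - 1}:
#   x ⊳ (2^n - 1) = x - 1 (mod 2^n),   0 ⊳ y = y,
#   x ⊳ (y - 1) = (x ⊳ y) ⊳ (x - 1).
# The lemma x ⊳ y < x guarantees every entry needed below is already filled in.

def B_table(n):
    N = 2 ** n
    op = [[0] * N for _ in range(N)]      # op[x][y] will hold x ⊳ y
    op[0] = list(range(N))                # 0 ⊳ y = y
    for x in range(1, N):
        op[x][N - 1] = x - 1              # x ⊳ (2^n - 1) = x - 1 for x ≥ 1
        for y in range(N - 1, 0, -1):
            op[x][y - 1] = op[op[x][y]][x - 1]
    return op

def P(n):
    """Period of the row (2^n - 1) ⊳ y, which is P(n)."""
    N = 2 ** n
    row = B_table(n)[N - 1]
    for p in range(1, N + 1):
        if all(row[y] == row[y + p] for y in range(N - p)):
            return p

print([P(n) for n in range(6)])           # should print [1, 1, 2, 4, 4, 8]
```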

      • John Baez says:

        Thanks for the algorithm, Layra! I’m too sleepy to think about it (it’s 4:17 am here and I just woke up from a nightmare in which my dead father turned out to be living at the bottom of a concrete pit), so I just fixed your LaTeX. When you post comments here using LaTeX, take a look at the instructions that appear above the box you type into. They say:

        You can use Markdown or HTML in your comments. You can also use LaTeX, like this: $latex E = m c^2 $. The word ‘latex’ comes right after the first dollar sign, with a space after it.

    • John Baez says:

      I don’t feel like trying to generate one of those P(n)’s, so I can’t be sure it can always be done in an algorithmic way…

      I tried computing one of the first ones, in my head, and it was quite far-out. I recommend giving it a try. However, I have the feeling that it can be done algorithmically. There’s a MathOverflow question about this:

      • Michael Albanese, What is the largest Laver table which has been computed?, 7 August 2013.

      A Laver table is the multiplication table for i \rhd j in one of the shelves \{1, \dots, 2^n\} that I was talking about.

      Answers to this question indicate that currently the largest Laver table that’s been computed is for n = 28. If this is described as a 2^{28} \times 2^{28} matrix of numbers between 1 and 2^{28} \approx 2.7 \times 10^8, then it would take quite a lot of memory to store all these numbers even if the computation itself is easy. I don’t know if anybody is able to compute P(n), namely the period of the first row of the table, more efficiently than working out the whole table! But I get the impression that the tables are perfectly computable, just tiring to compute.

      but if you promise me it can be, doesn’t that make it inevitable that for any n, there is a proof in PA that P(n) has a particular value?

      Yes, that would be inevitable.

      However, it would not be inevitable that PA can prove “P(n) goes to infinity as n → ∞”, which is one of the facts that’s been proved using ZFC and a large cardinal axiom.

      If ZFC plus this large cardinal axiom is consistent, then we know there must exist N such that P(N) = 32, and thus there must be a proof in PA that P(N) = 32 for this N.

      However, it will then also be true that N ≥ A(9,A(8,A(8,255))). So, it’s conceivable that the shortest proof in PA that P(N) = 32 requires calculating a Laver table that is of size A(9,A(8,A(8,255))) × A(9,A(8,A(8,255))). In this case, we’ll never actually find that proof!

      Of course, we should hope that there’s a shorter proof, and try to find it. Indeed we should try to find a proof in PA that P(n) → ∞ as n → ∞.

      If the large cardinal axiom is not consistent with ZFC, it could turn out that P(29) = 32. That would be amusing. (The long list of 16’s in my post was based on assuming the large cardinal axiom is consistent.)

      Randall Dougherty showed that the smallest n with P(n) ≥ k grows very fast: faster than any primitive recursive function. (The Ackermann function is not primitive recursive.) There is a somewhat popular weakened form of Peano arithmetic called primitive recursive arithmetic, and Dougherty and Jech have shown that some basic properties of Laver tables can’t be proved in this system.

      This might be a step toward proving that some properties of Laver tables can’t be proved in PA… but there are no results of that sort, yet. On the contrary, Dougherty has been trying to prove that P(n) → ∞ as n → ∞ in a system of arithmetic weaker than PA.

      • martinmackerel says:

        The sequence https://oeis.org/A098820 goes up to P(56) = 16. Do you have a link to Dougherty’s proof? (None of the three linked papers appeared to contain it.) Unless it depends on the large cardinal axiom, P(n) = 16 for all n for which we could possibly ever compute the full table, regardless of whether the large cardinal axiom is consistent with ZFC.

    • John Baez says:

      Bruce wrote:

      OTOH if you say it can’t be done by an algorithm (which is provably correct, by noncontroversial axioms), then I’m unlikely to consider P(n) well-defined.)

      By the way, as you probably know, the Busy Beaver function is uncomputable, yet specific values of this function have been computed by utterly noncontroversial methods. So, we should be a bit careful. We can make up functions F(n) that are uncomputable, yet we can compute their value for all n < A(A(A(10,10),10),10).

      But this is just a nitpick. I’m now convinced, by what Joseph van Name wrote, that P(n) is computable in the usual sense: there’s a program that computes its value for any n.

      • Mike Oliver says:

        It’s actually very simple to see that the entire table is computable, granted the result that there is a unique table for any power-of-2 size. You just start with a table with empty values. You fill in the known values (defined by the n-triangle-1 = (n+1) mod 2^n relation).

        Then you start making passes over all triples (a,b,c), inferring values of the triangle relation from the shelf axiom whenever one side of the equation has known values and the other side is unknown.

        Make as many passes as you need, until you stop inferring new values. At that point the entire table must be filled in (otherwise there would not exist a unique shelf satisfying the constraints).

        It’s not efficient, of course, but it is polynomial in 2^n (somewhere between (2^n)^3, which is the time each pass takes, and (2^n)^5, because you’ll never need more than (2^n)^2 passes, which is the number of inferences that need to be made).

        • John Baez says:

          Okay, good. I figured it was computable but this is a proof, assuming the uniqueness, which I’m happy to believe.

        • martinmackerel says:

          If you just start working it out for arbitrary N = 2^n, you can see that it’s quite straightforwardly computable. Compute N \rhd i for i from 1 to N. Then compute (N - 1) \rhd i for all i, then (N - 2) \rhd i for all i, etc.

          N \rhd i = i \forall i
          (N - 1) \rhd i = N \forall i
          (N - 2) \rhd i = N - 1 for odd i, N for even i
          and so on.
          One nicety: i \rhd N = N \forall i.

          Proof that N must be divisible by 2. For N >= 3:

          (N-1) = (N-2) \rhd 1 = (N-2) \rhd (N \rhd 1) = ((N-2) \rhd N) \rhd ((N-2) \rhd 1) = x \rhd (N-1), where x depends on the parity of N. If N is odd, then x = N-1 and x \rhd (N-1) = (N-1) \rhd (N-1) = N != N-1, so we have a contradiction. Therefore N must be even.

          Similar arguments show that N must be divisible by 4 if N >= 5, etc., although I don’t have a general proof that N must be an exact power of two.

          There is at most one table for any size N.

      • Mike Oliver says:

        Hmm, there may be a gap in the “otherwise there would not exist a unique shelf” step.

        • John Baez says:

          People have proved that’s true, so I’m willing to accept it in this proof that the shelf operation is computable.

          One thing I didn’t mention is that this business only works for numbers of the form 2^n. I.e., there’s no shelf structure on \{1, \dots, k\} obeying

          a \rhd 1 = a + 1 \; \mod k

          unless k = 2^n. I don’t know why that’s true, but the tiny bit of calculation I did made it clear that numbers of the form 2^n are going to behave differently from others.

        • Mike Oliver says:

          The gap I’m worried about is not the uniqueness. It’s whether the uniqueness implies that you can always make an inference directly from a single instance of the distributive law. That is, can you get a Sudoku-style situation where there remain unknown values, and you can’t infer any new ones directly from any single instance of the distributive law based on known values, but nevertheless only one way of filling in the values makes the whole table satisfy the law?

          I have a vague memory that that may not be possible in equational logic, but I can’t quite remember why. Maybe you can use completeness of first-order logic plus elimination of quantifiers, or maybe you can get it by reasoning about the free object satisfying the equational relations, but I just can’t immediately fill in the details.

          But anyway, it’s not important except to see whether my polynomial-time algo works. If all you want is computability, you can easily do that (in super-exponential time) by exhaustively enumerating all possible tables, and seeing which one satisfies the constraints.

      • Roger Joseph Witte says:

        Talking of busy beaver, have you seen

        • Scott Aaronson, The 8000th Busy Beaver number eludes ZF set theory: new paper by Adam Yedidia and me, Shtetl-Optimized, 3 May 2016.

        • John Baez says:

          Yes, it’s cool! Joseph van Name’s comment on Scott’s post led me to write this post here. In his comment, he writes a short program using the Laver table idea that takes an insanely long time to halt.

          I’d heard about Laver tables a few times, but this induced me to finally break down and try to learn the basic idea, which for me often means writing a blog article.

          Unfortunately my blog article does not mention the phrase ‘Laver table’: that’s the 2^n \times 2^n multiplication table for the shelf I described.

  5. The largest classical Laver table computed is actually A_{48}. The 48th table was computed by Dougherty, and the algorithm was originally described in Dougherty’s paper here. With today’s technology I could imagine that one could compute A_{96} if one has access to a sufficiently powerful computer. One can compute the classical Laver tables up to the 48th table on your own computer, here at my website.

    Most of the notions concerning Laver tables can be studied in a purely algebraic manner, including seemingly set-theoretic notions such as the notion of a critical point. In particular, the fact that “if P(n)=32 then n must be extremely large” is a fact that can be proven in a purely algebraic manner without any reference to large cardinals. Therefore, even if large cardinals are inconsistent, an instance n where P(n)=32 must still be at least Ack(9,Ack(8,Ack(8,255))). Thus, the long list of 16’s is a true statement even in Peano Arithmetic.

    • John Baez says:

      Joseph van Name wrote:

      The largest classical Laver table computed is actually A_{48}.

      Thanks! I’ll mention that on MathOverflow: the information there is out-dated.

      In particular, the fact that “if P(n)=32 then n must be extremely large” is a fact that can be proven in a purely algebraic manner without any reference to large cardinals.

      Great! Who has proven it in this way? Is Dougherty’s proof of that nature? I had trouble telling, since I didn’t spend a lot of time on his paper, and the arguments seem tricky.

      • Yes. Dougherty’s proof, mentioned in this paper, that if P(n)=32 then n is big does not necessarily need to use any set theory. Dougherty’s proof extensively uses properties of critical points of elementary embeddings, but these properties of critical points can be proven algebraically without any reference to set theory. If x\in A_{n}, then the critical point \text{crit}(x) can be defined algebraically as \mathrm{log_{2}}(\text{Gcd}(x,2^{n})). The fact that the critical points of algebras of elementary embeddings can be studied algebraically was originally known to Laver in the early 1990s. The translation between set theoretic notions and algebraic notions has been discussed in Dehornoy’s book Braids and Self-Distributivity in Chapter 13.

        • Royce Peng says:

          Hmm, the papers you link to do not seem to say anything about when the period of a Laver table reaches 32. What facts about F(n) (the number of critical points of members of P_j lying below \kappa_n) give us lower bounds for when the periods of Laver tables reach 2^n?

  6. SK says:

    Can a shelf S be embedded as the sub-shelf of a group shelf, say by taking the group with generators S and relations (s \rhd t) * s * t^{-1} * s^{-1}=1? Can the shelves \{1,\ldots, 2^{n} \} be embedded simultaneously?

    I wonder if the “Cayley graphs” of the shelves \{1,\ldots, 2^{n} \} are significant. The ones for \{1, 2, 3, 4\} were surprisingly uniformly random.

    • John Baez says:

      Your procedure does not give an embedding: it gives just a map. In general, most shelves do not embed in groups.

      By general abstract nonsense (so-called ‘universal algebra’, or Lawvere’s work on ‘algebraic theories’) there are adjoint functors

      U : \mathrm{Group} \to \mathrm{Shelf}

      F :  \mathrm{Shelf} \to \mathrm{Group}

      between the category of groups and group homomorphisms and the category of shelves and shelf homomorphisms. U sends any group to its underlying shelf, which has the same set of elements and

      a \rhd b = a b a^{-1}

      F sends any shelf to the ‘free group on that shelf’. Here we take our shelf S, form the free group on its set of elements, and then impose the relations you listed, which force the equations

      a b a^{-1} = a \rhd b

      and all equations derivable from these by the group axioms.

      By general abstract nonsense, for any shelf S there’s a shelf homomorphism

      i : S \to U(F(S))

      I believe you’re asking if this is an embedding. It’s usually not. For example, a group obeys this law:

      a a a^{-1} = a

      The analogous law for shelves:

      a \rhd a = a

      doesn’t hold in all shelves. Only a shelf that’s a quandle has a chance to be embedded in a group.

      In particular, if S is the free shelf on one generator, whose quotients give Laver tables, F(S) is the free group on one generator, which is \mathbb{Z}, and the homomorphism i : S \to U(F(S)) is not one-to-one.

  7. I was thinking about why large cardinal axioms have effects in number theory and also why complex analysis is so powerful a number theoretic tool. I note that lots of free things and initial things are countable because of the countability of the languages we use to describe them. When we do the sort of linguistic gymnastics required to utter a large cardinal axiom we fold up the language in increasingly complicated ways. This gives rise to ordinals in proof theory. So perhaps it is the very large countable ordinals that we should be discussing. https://en.wikipedia.org/wiki/Large_countable_ordinal

    • John Baez says:

      There are fascinating relations between large cardinals, large countable ordinals and large finite numbers. This is one of the things I’d like to understand better. Since everything we actually do involves writing finite symbol strings, it’s perhaps not so surprising, but still, getting a really clear grip on it seems hard.

  8. Steven R Wenner says:

    Noticed a typo: I think you meant V(alpha), not V(lambda), for the union in the definition of the Von Neumann hierarchy for limit ordinals.

  9. MABfan11 says:

    How does the lower bound of n = A(9,A(8,A(8,255))) for reaching 32 compare to the numbers in Googology? because it doesn’t seem that insanely large to me, especially when compared to monsters like Loader’s Number, Rayo’s Number and Large Number Garden Number

    • John Baez says:

      I don’t know how that lower bound compares to these other numbers. The thing about numbers is that there are always much bigger ones. I also don’t know how good this lower bound is. I wouldn’t be surprised if it could be vastly improved. Only a few people seem willing and able to study this topic.
