Topological Crystals (Part 3)

6 August, 2016


Last time I explained how to create the ‘maximal abelian cover’ of a connected graph. Now I’ll say more about a systematic procedure for embedding this into a vector space. That will give us a topological crystal, like the one above.

Some remarkably symmetrical patterns arise this way! For example, starting from this graph:

we get this:

Nature uses this pattern for crystals of graphene.

Starting from this graph:

we get this:

Nature uses this for crystals of diamond! Since the construction depends only on the topology of the graph we start with, we call this embedded copy of its maximal abelian cover a topological crystal.

Today I’ll remind you how this construction works. I’ll also outline a proof that it gives an embedding of the maximal abelian cover if and only if the graph has no bridges: that is, edges that disconnect the graph when removed. I’ll skip all the hard steps of the proof, but they can be found here:

• John Baez, Topological crystals.

The homology of graphs

I’ll start with some standard stuff that’s good to know. Let X be a graph. Remember from last time that we’re working in a setup where every edge e goes from a vertex called its source s(e) to a vertex called its target t(e). We write e: x \to y to indicate that e is going from x to y. You can think of the edge as having an arrow on it, and if you turn the arrow around you get the inverse edge, e^{-1}: y \to x. Also, e^{-1} \ne e.

The group of integral 0-chains on X, C_0(X,\mathbb{Z}), is the free abelian group on the set of vertices of X. The group of integral 1-chains on X, C_1(X,\mathbb{Z}), is the quotient of the free abelian group on the set of edges of X by relations e^{-1} = -e for every edge e. The boundary map is the homomorphism

\partial : C_1(X,\mathbb{Z}) \to C_0(X,\mathbb{Z})

such that

\partial e = t(e) - s(e)

for each edge e, and

Z_1(X,\mathbb{Z}) =  \ker \partial

is the group of integral 1-cycles on X.

Remember, a path in a graph is a sequence of edges, the target of each one being the source of the next. Any path \gamma = e_1 \cdots e_n in X determines an integral 1-chain:

c_\gamma = e_1 + \cdots + e_n

For any path \gamma we have

c_{\gamma^{-1}} = -c_{\gamma},

and if \gamma and \delta are composable then

c_{\gamma \delta} = c_\gamma + c_\delta

Last time I explained what it means for two paths to be ‘homologous’. Here’s the quick way to say it. There’s groupoid called the fundamental groupoid of X, where the objects are the vertices of X and the morphisms are freely generated by the edges except for relations saying that the inverse of e: x \to y really is e^{-1}: y \to x. We can abelianize the fundamental groupoid by imposing relations saying that \gamma \delta = \delta \gamma whenever this equation makes sense. Each path \gamma : x \to y gives a morphism which I’ll call [[\gamma]] : x \to y in the abelianized fundamental groupoid. We say two paths \gamma, \gamma' : x \to y are homologous if [[\gamma]] = [[\gamma']].

Here’s a nice thing:

Lemma A. Let X be a graph. Two paths \gamma, \delta : x \to y in X are homologous if and only if they give the same 1-chain: c_\gamma = c_\delta.

Proof. See the paper. You could say they give ‘homologous’ 1-chains, too, but for graphs that’s the same as being equal.   █

We define vector spaces of 0-chains and 1-chains by

C_0(X,\mathbb{R}) = C_0(X,\mathbb{Z}) \otimes \mathbb{R}, \qquad C_1(X,\mathbb{R}) = C_1(X,\mathbb{Z}) \otimes \mathbb{R},

respectively. We extend the boundary map to a linear map

\partial :  C_1(X,\mathbb{R}) \to C_0(X,\mathbb{R})

We let Z_1(X,\mathbb{R}) be the kernel of this linear map, or equivalently,

Z_1(X,\mathbb{R}) = Z_0(X,\mathbb{Z}) \otimes \mathbb{R}  ,

and we call elements of this vector space 1-cycles. Since Z_1(X,\mathbb{Z}) is a free abelian group, it forms a lattice in the space of 1-cycles. Any edge of X can be seen as a 1-chain, and there is a unique inner product on C_1(X,\mathbb{R}) such that edges form an orthonormal basis (with each edge e^{-1} counting as the negative of e.) There is thus an orthogonal projection

\pi : C_1(X,\mathbb{R}) \to Z_1(X,\mathbb{R}) .

This is the key to building topological crystals!

The embedding of atoms

We now come to the main construction, first introduced by Kotani and Sunada. To build a topological crystal, we start with a connected graph X with a chosen basepoint x_0. We define an atom to be a homology class of paths starting at the basepoint, like

[[\alpha]] : x_0 \to x

Last time I showed that these atoms are the vertices of the maximal abelian cover of X. Now let’s embed these atoms in a vector space!

Definition. Let X be a connected graph with a chosen basepoint. Let A be its set of atoms. Define the map

i : A \to Z_1(X,\mathbb{R})


i([[ \alpha ]]) = \pi(c_\alpha) .

That i is well-defined follows from Lemma A. The interesting part is this:

Theorem A. The following are equivalent:

(1) The graph X has no bridges.

(2) The map i : A \to Z_1(X,\mathbb{R}) is one-to-one.

Proof. The map i is one-to-one if and only if for any atoms [[ \alpha ]] and [[ \beta ]], i([[ \alpha ]])  = i([[ \beta ]]) implies [[ \alpha ]]= [[ \beta ]]. Note that \gamma = \beta^{-1} \alpha is a path in X with c_\gamma = c_{\alpha} - c_\beta, so

\pi(c_\gamma) = \pi(c_{\alpha} - c_\beta) =   i([[ \alpha ]]) - i([[ \beta ]])

Since \pi(c_\gamma) vanishes if and only if c_\gamma is orthogonal to every 1-cycle, we have

c_{\gamma} \textrm{ is orthogonal to every 1-cycle}   \; \iff \;   i([[ \alpha ]])  = i([[ \beta ]])

On the other hand, Lemma A says

c_\gamma = 0 \; \iff \; [[ \alpha ]]= [[ \beta ]].

Thus, to prove (1)\iff(2), it suffices to that show that X has no bridges if and only if every 1-chain c_\gamma orthogonal to every 1-cycle has c_\gamma =0. This is Lemma D below.   █

The following lemmas are the key to the theorem above — and also a deeper one saying that if X has no bridges, we can extend i : A \to Z_1(X,\mathbb{R}) to an embedding of the whole maximal abelian cover of X.

For now, we just need to show that any nonzero 1-chain coming from a path in a bridgeless graph has nonzero inner product with some 1-cycle. The following lemmas, inspired by an idea of Ilya Bogdanov, yield an algorithm for actually constructing such a 1-cycle. This 1-cycle also has other desirable properties, which will come in handy later.

To state these, let a simple path be one in which each vertex appears at most once. Let a simple loop be a loop \gamma : x \to x in which each vertex except x appears at most once, while x appears exactly twice, as the starting point and ending point. Let the support of a 1-chain c, denoted \mathrm{supp}(c), be the set of edges e such that \langle c, e\rangle> 0. This excludes edges with \langle c, e \rangle= 0 , but also those with \langle c , e \rangle < 0, which are inverses of edges in the support. Note that

c = \sum_{e \in \mathrm{supp}(c)} \langle c, e \rangle  .

Thus, \mathrm{supp}(c) is the smallest set of edges such that c can be written as a positive linear combination of edges in this set.

Okay, here are the lemmas!

Lemma B. Let X be any graph and let c be an integral 1-cycle on X. Then for some n we can write

c = c_{\sigma_1} + \cdots +  c_{\sigma_n}

where \sigma_i are simple loops with \mathrm{supp}(c_{\sigma_i}) \subseteq \mathrm{supp}(c).

Proof. See the paper. The proof is an algorithm that builds a simple loop \sigma_1 with\mathrm{supp}(c_{\sigma_1}) \subseteq \mathrm{supp}(c). We subtract this from c, and if the result isn’t zero we repeat the algorithm, continuing to subtract off 1-cycles c_{\sigma_i} until there’s nothing left.   █

Lemma C. Let \gamma: x \to y be a path in a graph X. Then for some n \ge 0 we can write

c_\gamma = c_\delta + c_{\sigma_1} + \cdots +  c_{\sigma_n}

where \delta: x \to y is a simple path and \sigma_i are simple loops with \mathrm{supp}(c_\delta), \mathrm{supp}(c_{\sigma_i}) \subseteq \mathrm{supp}(c_\gamma).

Proof. This relies on the previous lemma, and the proof is similar — but when we can’t subtract off any more c_{\sigma_i}’s we show what’s left is c_\delta for a simple path \delta: x \to y.   █

Lemma D. Let X be a graph. Then the following are equivalent:

(1) X has no bridges.

(2) For any path \gamma in X, if c_\gamma is orthogonal to every 1-cycle then c_\gamma = 0.

Proof. It’s easy to show a bridge e gives a nonzero 1-chain c_e that’s orthogonal to all 1-cycles, so the hard part is showing that for a bridgeless graph, if c_\gamma is orthogonal to every 1-cycle then c_\gamma = 0. The idea is to start with a path for which c_\gamma \ne 0. We hit this path with Lemma C, which lets us replace \gamma by a simple path \delta. The point is that a simple path is a lot easier to deal with than a general path: a general path could wind around crazily, passing over every edge of our graph multiple times.

Then, assuming X has no bridges, we use Ilya Bogdanov’s idea to build a 1-cycle that’s not orthogonal to c_\delta. The basic idea is to take the path \delta : x \to y and write it out as \delta = e_1 \cdots e_n. Since the last edge e_n is not a bridge, there must be a path from y back to x that does not use the edge e_n or its inverse. Combining this path with \delta we can construct a loop, which gives a cycle having nonzero inner product with c_\delta and thus with c_\gamma.

I’m deliberately glossing over some difficulties that can arise, so see the paper for details!   █

Embedding the whole crystal

Okay: so far, we’ve taken a connected bridgeless graph X and embedded its atoms into the space of 1-cycles via a map

i : A \to Z_1(X,\mathbb{R})  .

These atoms are the vertices of the maximal abelian cover \overline{X}. Now we’ll extend i to an embedding of the whole graph \overline{X} — or to be precise, its geometric realization |\overline{X}|. Remember, for us a graph is an abstract combinatorial gadget; its geometric realization is a topological space where the edges become closed intervals.

The idea is that just as i maps each atom to a point in the vector space Z_1(X,\mathbb{R}), j maps each edge of |\overline{X}| to a straight line segment between such points. These line segments serve as the ‘bonds’ of a topological crystal. The only challenge is to show that these bonds do not cross each other.

Theorem B. If X is a connected graph with basepoint, the map i : A \to Z_1(X,\mathbb{R}) extends to a continuous map

j : |\overline{X}| \to Z_1(X,\mathbb{R})

sending each edge of |\overline{X}| to a straight line segment in Z_1(X,\mathbb{R}). If X has no bridges, then j is one-to-one.

Proof. The first part is easy; the second part takes real work! The problem is to show the edges don’t cross. Greg Egan and I couldn’t do it using just Lemma D above. However, there’s a nice argument that goes back and uses Lemma C — read the paper for details.

As usual, history is different than what you read in math papers: David Speyer gave us a nice proof of Lemma D, and that was good enough to prove that atoms are mapped into the space of 1-cycles in a one-to-one way, but we only came up with Lemma C after weeks of struggling to prove the edges don’t cross.   █

Connections to tropical geometry

Tropical geometry sets up a nice analogy between Riemann surfaces and graphs. The Abel–Jacobi map embeds any Riemann surface \Sigma in its Jacobian, which is the torus H_1(\Sigma,\mathbb{R})/H_1(\Sigma,\mathbb{Z}). We can similarly define the Jacobian of a graph X to be H_1(X,\mathbb{R})/H_1(X,\mathbb{Z}). Theorem B yields a way to embed a graph, or more precisely its geometric realization |X|, into its Jacobian. This is the analogue, for graphs, of the Abel–Jacobi map.

After I put this paper on the arXiv, I got an email from Matt Baker saying that he had already proved Theorem A — or to be precise, something that’s clearly equivalent. It’s Theorem 1.8 here:

• Matthew Baker and Serguei Norine, Riemann–Roch and Abel–Jacobi theory on a finite graph.

This says that the vertices of a bridgeless graph X are embedded in its Jacobian by means of the graph-theoretic analogue of the Abel–Jacobi map.

What I really want to know is whether someone’s written up a proof that this map embeds the whole graph, not just its vertices, into its Jacobian in a one-to-one way. That would imply Theorem B. For more on this, try my conversation with David Speyer.

Anyway, there’s a nice connection between topological crystallography and tropical geometry, and not enough communication between the two communities. Once I figure out what the tropical folks have proved, I will revise my paper to take that into account.

Next time I’ll talk about more examples of topological crystals!

Topological Crystals (Part 2)

27 July, 2016


We’re building crystals, like diamonds, purely from topology. Last time I said how: you take a graph X and embed its maximal abelian cover into the vector space H_1(X,\mathbb{R}). Now let me say a bit more about the maximal abelian cover. It’s not nearly as famous as the universal cover, but it’s very nice.

First I’ll whiz though the basic idea, and then I’ll give the details.

The basic idea

By ‘space’ let me mean a connected topological space that’s locally nice. The basic idea is that if X is some space, its universal cover \widetilde{X} is a covering space of X that covers all other covering spaces of X. The maximal abelian cover \overline{X} has a similar universal property—but it’s abelian, and it covers all abelian connected covers. A cover is abelian if its group of deck transformations is abelian.

The cool part is that universal covers are to homotopy theory as maximal abelian covers are to homology theory.

What do I mean by that? For starters, points in \widetilde{X} are just homotopy classes of paths in X starting at some chosen basepoint. And the points in \overline{X} are just ‘homology classes’ of paths starting at the basepoint.

But people don’t talk so much about ‘homology classes’ of paths. So what do I mean by that? Here a bit of category theory comes in handy. Homotopy classes of paths in X are morphisms in the fundamental groupoid of X. Homology classes of paths are morphisms in the abelianized version of the fundamental groupoid!

But wait a minute — what does that mean? Well, we can abelianize any groupoid by imposing the relations

f g = g f

whenever it makes sense to do so. It makes sense to do so when you can compose the morphisms f : x \to y and g : x' \to y' in either order, and the resulting morphisms f g and g f have the same source and the same target. And if you work out what that means, you’ll see it means

x = y = x' = y'

But now let me say it all much more slowly, for people who want a more relaxed treatment.

The details

There are lots of slightly different things called ‘graphs’ in mathematics, but in topological crystallography it’s convenient to work with one that you’ve probably never seen before. This kind of graph has two copies of each edge, one pointing in each direction.

So, we’ll say a graph X = (E,V,s,t,i) has a set V of vertices, a set E of edges, maps s,t : E \to V assigning to each edge its source and target, and a map i : E \to E sending each edge to its inverse, obeying

s(i(e)) = t(e), \quad t(i(e)) = s(e) , \qquad i(i(e)) = e


i(e) \ne e

for all e \in E.

That inequality at the end will make category theorists gag: definitions should say what’s true, not what’s not true. But category theorists should be able to see what’s really going on here, so I leave that as a puzzle.

For ordinary folks, let me repeat the definition using more words. If s(e) = v and t(e) = w we write e : v \to w, and draw e as an interval with an arrow on it pointing from v to w. We write i(e) as e^{-1}, and draw e^{-1} as the same interval as e, but with its arrow reversed. The equations obeyed by i say that taking the inverse of e : v \to w gives an edge e^{-1} : w \to v and that (e^{-1})^{-1} = e. No edge can be its own inverse.

A map of graphs, say f : X \to X', is a pair of functions, one sending vertices to vertices and one sending edges to edges, that preserve the source, target and inverse maps. By abuse of notation we call both of these functions f.

I started out talking about topology; now I’m treating graphs very combinatorially, but we can bring the topology back in. From a graph X we can build a topological space |X| called its geometric realization. We do this by taking one point for each vertex and gluing on one copy of [0,1] for each edge e : v \to w, gluing the point 0 to v and the point 1 to w, and then identifying the interval for each edge e with the interval for its inverse by means of the map t \mapsto 1 - t.

Any map of graphs gives rise to a continuous map between their geometric realizations, and we say a map of graphs is a cover if this continuous map is a covering map. For simplicity we denote the fundamental group of |X| by \pi_1(X), and similarly for other topological invariants of |X|. However, sometimes I’ll need to distinguish between a graph X and its geometric realization |X|.

Any connected graph X has a universal cover, meaning a connected cover

p : \widetilde{X} \to X

that covers every other connected cover. The geometric realization of \widetilde{X} is connected and simply connected. The fundamental group \pi_1(X) acts as deck transformations of \widetilde{X}, meaning invertible maps g : \widetilde{X} \to \widetilde{X} such that p \circ g = p. We can take the quotient of \widetilde{X} by the action of any subgroup G \subseteq \pi_1(X) and get a cover q : \widetilde{X}/G \to X.

In particular, if we take G to be the commutator subgroup of \pi_1(X), we call the graph \widetilde{X}/G the maximal abelian cover of the graph X, and denote it by \overline{X}. We obtain a cover

q : \overline{X} \to X

whose group of deck transformations is the abelianization of \pi_1(X). This is just the first homology group H_1(X,\mathbb{Z}). In particular, if the space corresponding to X has n holes, this is the free abelian group on
n generators.

I want a concrete description of the maximal abelian cover! I’ll build it starting with the universal cover, but first we need some preliminaries on paths in graphs.

Given vertices x,y in X, define a path from x to y to be a word of edges \gamma = e_1 \cdots e_\ell with e_i : v_{i-1} \to v_i for some vertices v_0, \dots, v_\ell with v_0 = x and v_\ell = y. We allow the word to be empty if and only if x = y; this gives the trivial path from x to itself.

Given a path \gamma from x to y we write \gamma : x \to y, and we write the trivial path from x to itself as 1_x : x \to x. We define the composite of paths \gamma : x \to y and \delta : y \to z via concatenation of words, obtaining a path we call \gamma \delta : x \to z. We call a path from a vertex x to itself a loop based at x.

We say two paths from x to y are homotopic if one can be obtained from the other by repeatedly introducing or deleting subwords of the form e_i e_{i+1} where e_{i+1} = e_i^{-1}. If [\gamma] is a homotopy class of paths from x to y, we write [\gamma] : x \to y. We can compose homotopy classes [\gamma] : x \to y and [\delta] : y \to z by setting [\gamma] [\delta] = [\gamma \delta].

If X is a connected graph, we can describe the universal cover \widetilde{X} as follows. Fix a vertex x_0 of X, which we call the basepoint. The vertices of \widetilde{X} are defined to be the homotopy classes of paths [\gamma] : x_0 \to x where x is arbitrary. The edges in \widetilde{X} from the vertex [\gamma] : x_0 \to x to the vertex [\delta] : x_0 \to y are defined to be the edges e \in E with [\gamma e] = [\delta]. In fact, there is always at most one such edge. There is an obvious map of graphs

p : \widetilde{X} \to X

sending each vertex [\gamma] : x_0 \to x of \widetilde{X} to the vertex
x of X. This map is a cover.

Now we are ready to construct the maximal abelian cover \overline{X}. For this, we impose a further equivalence relation on paths, which is designed to make composition commutative whenever possible. However, we need to be careful. If \gamma : x \to y and \delta : x' \to y' , the composites \gamma \delta and \delta \gamma are both well-defined if and only if x' = y and y' = x. In this case, \gamma \delta and \delta \gamma share the same starting point and share the same ending point if and only if x = x' and y = y'. If all four of these equations hold, both \gamma and \delta are loops based at x. So, we shall impose the relation \gamma \delta = \delta \gamma only in this case.

We say two paths are homologous if one can be obtained from another by:

• repeatedly introducing or deleting subwords e_i e_{i+1} where
e_{i+1} = e_i^{-1}, and/or

• repeatedly replacing subwords of the form

e_i \cdots e_j e_{j+1} \cdots e_k

by those of the form

e_{j+1} \cdots e_k e_i \cdots e_j

where e_i \cdots e_j and e_{j+1} \cdots e_k are loops based at the same vertex.

My use of the term ‘homologous’ is a bit nonstandard here!

We denote the homology class of a path \gamma by [[ \gamma ]]. Note that if two paths \gamma : x \to y, \delta : x' \to y' are homologous then x = x' and y = y'. Thus, the starting and ending points of a homology class of paths are well-defined, and given any path \gamma : x \to y we write [[ \gamma ]] : x \to y . The composite of homology classes is also well-defined if we set

[[ \gamma ]] [[ \delta ]] = [[ \gamma \delta ]]

We construct the maximal abelian cover of a connected graph X just as we constructed its universal cover, but using homology classes rather than homotopy classes of paths. And now I’ll introduce some jargon that should make you start thinking about crystals!

Fix a basepoint x_0 for X. The vertices of \overline{X}, or atoms, are defined to be the homology classes of paths [[\gamma]] : x_0 \to x where x is arbitrary. Any edge of \overline{X}, or bond, goes from some atom [[ \gamma]] : x_0 \to x to the some atom [[ \delta ]] : x_0 \to y. The bonds from [[ \gamma]] to [[ \delta ]] are defined to be the edges e \in E with [[ \gamma e ]] = [[ \delta ]]. There is at most one bond between any two atoms. Again we have a covering map

q : \overline{X} \to X .

The homotopy classes of loops based at x_0 form a group, with composition as the group operation. This is the fundamental group \pi_1(X) of the graph X. This is isomorphic as the fundamental group of the space associated to X. By our construction of the universal cover, \pi_1(X) is also the set of vertices of \widetilde{X} that are mapped to x_0 by p. Furthermore, any element [\gamma] \in \pi_1(X) defines a deck transformation of \widetilde{X} that sends each vertex [\delta] : x_0 \to x to the vertex [\gamma] [\delta] : x_0 \to x.

Similarly, the homology classes of loops based at x_0 form a group with composition as the group operation. Since the additional relation used to define homology classes is precisely that needed to make composition of homology classes of loops commutative, this group is the abelianization of \pi_1(X). It is therefore isomorphic to the first homology group H_1(X,\mathbb{Z}) of the geometric realization of X.

By our construction of the maximal abelian cover, H_1(X,\mathbb{Z}) is also the set of vertices of \overline{X} that are mapped to x_0 by q. Furthermore, any element [[\gamma]] \in H_1(X,\mathbb{Z}) defines a deck transformation of \overline{X} that sends each vertex [[\delta]] : x_0 \to x to the vertex [[\gamma]] [[\delta]] : x_0 \to x.

So, it all works out! The fundamental group \pi_1(X) acts as deck transformations of the universal cover, while the first homology group H_1(X,\mathbb{Z}) acts as deck transformations of the maximal abelian cover.

Puzzle for experts: what does this remind you of in Galois theory?

We’ll get back to crystals next time.

Topological Crystals (Part 1)

22 July, 2016


A while back, we started talking about crystals:

• John Baez, Diamonds and triamonds, Azimuth, 11 April 2016.

In the comments on that post, a bunch of us worked on some puzzles connected to ‘topological crystallography’—a subject that blends graph theory, topology and mathematical crystallography. You can learn more about that subject here:

• Tosio Sunada, Crystals that nature might miss creating, Notices of the AMS 55 (2008), 208–215.

I got so interested that I wrote this paper about it, with massive help from Greg Egan:

• John Baez, Topological crystals.

I’ll explain the basic ideas in a series of posts here.

First, a few personal words.

I feel a bit guilty putting so much work into this paper when I should be developing network theory to the point where it does our planet some good. I seem to need a certain amount of beautiful pure math to stay sane. But this project did at least teach me a lot about the topology of graphs.

For those not in the know, applying homology theory to graphs might sound fancy and interesting. For people who have studied a reasonable amount of topology, it probably sounds easy and boring. The first homology of a graph of genus g is a free abelian group on g generators: it’s a complete invariant of connected graphs up to homotopy equivalence. Case closed!

But there’s actually more to it, because studying graphs up to homotopy equivalence kills most of the fun. When we’re studying networks in real life we need a more refined outlook on graphs. So some aspects of this project might pay off, someday, in ways that have nothing to do with crystallography. But right now I’ll just talk about it as a fun self-contained set of puzzles.

I’ll start by quickly sketching how to construct topological crystals, and illustrate it with the example of graphene, a 2-dimensional form of carbon:

I’ll precisely state our biggest result, which says when this construction gives a crystal where the atoms don’t bump into each other and the bonds between atoms don’t cross each other. Later I may come back and add detail, but for now you can find details in our paper.

Constructing topological crystals

The ‘maximal abelian cover’ of a graph plays a key role in Sunada’s work on topological crystallography. Just as the universal cover of a connected graph X has the fundamental group \pi_1(X) as its group of deck transformations, the maximal abelian cover, denoted \overline{X}, has the abelianization of \pi_1(X) as its group of deck transformations. It thus covers every other connected cover of X whose group of deck transformations is abelian. Since the abelianization of \pi_1(X) is the first homology group H_1(X,\mathbb{Z}), there is a close connection between the maximal abelian cover and homology theory.

In our paper, Greg and I prove that for a large class of graphs, the maximal abelian cover can naturally be embedded in the vector space H_1(X,\mathbb{R}). We call this embedded copy of \overline{X} a ‘topological crystal’. The symmetries of the original graph can be lifted to symmetries of its topological crystal, but the topological crystal also has an n-dimensional lattice of translational symmetries. In 2- and 3-dimensional examples, the topological crystal can serve as the blueprint for an actual crystal, with atoms at the vertices and bonds along the edges.

The general construction of topological crystals was developed by Kotani and Sunada, and later by Eon. Sunada uses ‘topological crystal’ for an even more general concept, but we only need a special case.

Here’s how it works. We start with a graph X. This has a space C_0(X,\mathbb{R}) of 0-chains, which are formal linear combinations of vertices, and a space C_1(X,\mathbb{R}) of 1-chains, which are formal linear combinations of edges. There is a boundary operator

\partial \colon C_1(X,\mathbb{R}) \to C_0(X,\mathbb{R})

This is the linear operator sending any edge to the difference of its two endpoints. The kernel of this operator is called the space of 1-cycles, Z_1(X,\mathbb{R}). There is an inner product on the space of 1-chains such that edges form an orthonormal basis. This determines an orthogonal projection

\pi \colon C_1(X,\mathbb{R}) \to Z_1(X,\mathbb{R})

For a graph, Z_1(X,\mathbb{R}) is isomorphic to the first homology group H_1(X,\mathbb{R}). So, to obtain the topological crystal of X, we need only embed its maximal abelian cover \overline{X} in Z_1(X,\mathbb{R}). We do this by embedding \overline{X} in C_1(X,\mathbb{R}) and then projecting it down via \pi.

To accomplish this, we need to fix a basepoint for X. Each path \gamma in X starting at this basepoint determines a 1-chain c_\gamma. These 1-chains correspond to the vertices of \overline{X}. The graph \overline{X} has an edge from c_\gamma to c_{\gamma'} whenever the path \gamma' is obtained by adding an extra edge to \gamma. This edge is a straight line segment from the point c_\gamma to the point c_{\gamma'}.

The hard part is checking that the projection \pi maps this copy of \overline{X} into Z_1(X,\mathbb{R}) in a one-to-one manner. In Theorems 6 and 7 of our paper we prove that this happens precisely when the graph X has no ‘bridges’: that is, edges whose removal would disconnect X.

Kotani and Sunada noted that this condition is necessary. That’s actually pretty easy to see. The challenge was to show that it’s sufficient! For this, our main technical tool is Lemma 5, which for any path \gamma decomposes the 1-chain c_\gamma into manageable pieces.

We call the resulting copy of \overline{X} embedded in Z_1(X,\mathbb{R}) a topological crystal.

Let’s see how it works in an example!

Take X to be this graph:

Since X has 3 edges, the space of 1-chains is 3-dimensional. Since X has 2 holes, the space of 1-cycles is a 2-dimensional plane in this 3-dimensional space. If we consider paths \gamma in X starting at the red vertex, form the 1-chains c_\gamma, and project them down to this plane, we obtain the following picture:

Here the 1-chains c_\gamma are the white and red dots. These are the vertices of \overline{X}, while the line segments between them are the edges of \overline{X}. Projecting these vertices and edges onto the plane of 1-cycles, we obtain the topological crystal for X. The blue dots come from projecting the white dots onto the plane of 1-cycles, while the red dots already lie on this plane. The resulting topological crystal provides the pattern for graphene:

That’s all there is to the basic idea! But there’s a lot more to say about it, and a lot of fun examples to look at: diamonds, triamonds, hyperquartz and more.

Operads for “Systems of Systems”

13 July, 2016

“Systems of systems” is a fashionable buzzword for complicated systems that are themselves made of complicated systems, often of disparate sorts. They’re important in modern engineering, and it takes some thought to keep them from being unmanageable. Biology and ecology are full of systems of systems.

David Spivak has been working a lot on operads as a tool for describing systems of systems. Here’s a nice programmatic talk advocating this approach:

• David Spivak, Operads as a potential foundation for
systems of systems

This was a talk he gave at the Generalized Network Structures and Dynamics Workshop at the Mathematical Biosciences Institute at Ohio State University this spring.

You won’t learn what operads are from this talk—for that, try this:

• Wikipedia, Operad.

But if you know a bit about operads, it may help give you an idea of their flexibility as a formalism for describing ways of sticking together components to form bigger systems!

I’ll probably talk about this kind of thing more pretty soon. So far I’ve been using category theory to study networked systems like electrical circuits, Markov processes and chemical reaction networks. The same ideas handle all these different kind of systems in a unified way. But I want to push toward biology. Here we need more sophisticated ideas. My philosophy is that while biology seems “messy” to physicists, living systems actually operate at higher levels of abstraction, which call for new mathematics.

Large Countable Ordinals (Part 3)

7 July, 2016

Last time we saw why it’s devilishly hard to give names to large countable ordinals.

An obvious strategy is to make up a function f from ordinals to ordinals that grows really fast, so that f(x) is a lot bigger than the ordinal x indexing it. This is indeed a good idea. But something funny tends to happen! Eventually x catches up with f(x). In other words, you eventually hit a solution of

x = f(x)

This is called a fixed point of f. At this point, there’s no way to use f(x) as a name for x unless you already have a name for x. So, your scheme fizzles out!

For example, we started by looking at powers of \omega, the smallest infinite ordinal. But eventually we ran into ordinals x that obey

x = \omega^x

There’s an obvious work-around: we make up a new name for ordinals x that obey

x = \omega^x

We call them epsilon numbers. In our usual nerdy way we start counting at zero, so we call the smallest solution of this equation \epsilon_0, and the next one \epsilon_1, and so on.

But eventually we run into ordinals x that are fixed points of the function \epsilon_x, meaning that

x = \epsilon_x

There’s an obvious work-around: we make up a new name for ordinals x that obey

x = \epsilon_x

But by now you can guess that this problem will keep happening, so we’d better get systematic about making up new names! We should let

\phi_0(\alpha) = \omega^\alpha

and let \phi_{n+1}(\alpha) be the \alphath fixed point of \phi_n.

Oswald Veblen, a mathematician at Princeton, came up with this idea around 1908, based on some thoughts of G. H. Hardy:

• Oswald Veblen, Continuous increasing functions of finite and transfinite ordinals, Trans. Amer. Math. Soc. 9 (1908), 280–292.

He figured out how to define \phi_\gamma(\alpha) even when the index \gamma is infinite.

Last time we saw how to name a lot of countable ordinals using this idea: in fact, all ordinals less than the ‘Feferman–Schütte ordinal’. This time I want go further, still using Veblen’s work.

First, however, I feel an urge to explain things a bit more precisely.

Veblen’s fixed point theorem

There are three kinds of ordinals. The first is a successor ordinal, which is one more than some other ordinal. So, we say \alpha is a successor ordinal if

\alpha = \beta + 1

for some \beta. The second is 0, which is not a successor ordinal. And the third is a limit ordinal, which is neither 0 nor a successor ordinal. The smallest example is

\omega = \{1, 2, 3, \dots \}

Every limit ordinal is the ‘limit’ of ordinals less than it. What does that mean, exactly? Remember, each ordinal is a set: the set of all smaller ordinals. We can define the limit of a set of ordinals to be the union of that set. Alternatively, it’s the smallest ordinal that’s greater than or equal to every ordinal in that set.

Now for Veblen’s key idea:

Veblen’s Fixed Point Theorem. Suppose a function f from ordinals to ordinals is:

strictly increasing: if x < y then f(x) < f(y)


continuous: if x is a limit ordinal, f(x) is the limit of the ordinals f(\alpha) where \alpha < x.

Then f must have a fixed point.

Why? For starters, we always have this fact:

x \le f(x)

After all, if this weren’t true, there’d be a smallest x with the property that f(x) < x, since every nonempty set of ordinals has a smallest element. But since f is strictly increasing,

f(f(x)) < f(x)

so f(x) would be an even smaller ordinal with this property. Contradiction!

Using this fact repeatedly, we get

0 \le f(0) \le f(f(0)) \le \cdots

Let \alpha be the limit of the ordinals

0, f(0), f(f(0)), \dots

Then by continuity, f(\alpha) is the limit of the sequence

f(0), f(f(0)), f(f(f(0))),\dots

So f(\alpha) equals \alpha. Voilà! A fixed point!

This construction gives the smallest fixed point of f. There are infinitely many more, since we can start not with 0 but with \alpha+1 and repeat the same argument, etc. Indeed if we try to list these fixed points, we find there is one for each ordinal.

So, we can make up a new function that lists these fixed points. Just to be cute, people call this the derivative of f, so that f'(\alpha) is the \alphath fixed point of f. Beware: while the derivative of a polynomial grows more slowly than the original polynomial, the derivative of a continuous increasing function f from ordinals to ordinals generally grows more quickly than f. It doesn’t really act like a derivative; people just call it that.

Veblen proved another nice theorem:

Theorem. If f is a continuous and strictly increasing function from ordinals to ordinals, so is f'.

So, we can take the derivative repeatedly! This is the key to the Veblen hierarchy.

If you want to read more about this, it helps to know that a function from ordinals to ordinals that’s continuous and strictly increasing is called normal. ‘Normal’ is an adjective that mathematicians use when they haven’t had enough coffee in the morning and aren’t feeling creative—it means a thousand different things. In this case, a better term would be ‘differentiable’.

Armed with that buzzword, you can try this:

• Wikipedia, Fixed-point lemma for normal functions.

Okay, enough theory. On to larger ordinals!

The Feferman–Schütte barrier

First let’s summarize how far we got last time, and why we got stuck. We inductively defined the \alphath ordinal of the \gammath kind by:

\phi_0(\alpha) = \omega^\alpha


\phi_{\gamma+1}(\alpha) = \phi'_\gamma(\alpha)

meaning that \phi_{\gamma+1}(\alpha) is the \alphath fixed point of \phi_\gamma.

This handles the cases where \gamma is zero or a successor ordinal. When \gamma is a limit ordinal we let \phi_{\gamma}(\alpha) be the \alphath ordinal that’s a fixed point of all the functions \phi_\beta for \beta < \gamma.

Last time I explained how these functions \phi_\gamma give a nice notation for ordinals less than the Feferman–Schütte ordinal, which is also called \Gamma_0. This ordinal is the smallest solution of

x = \phi_x(0)

So it’s a fixed point, but of a new kind, because now the x appears as a subscript of the \phi function.

We can get our hands on the Feferman–Schütte ordinal by taking the limit of the ordinals

\phi_0(0), \; \phi_{\phi_0(0)}(0) , \; \phi_{\phi_{\phi_0(0)}(0)}(0), \dots

(If you’re wondering why we use the number 0 here, instead of some other ordinal, I believe the answer is: it doesn’t really matter, we would get the same result if we used any ordinal less than the Feferman–Schütte ordinal.)

The ‘Feferman–Schütte barrier’ is the combination of these two facts:

• On the one hand, every ordinal \beta less than \Gamma_0 can be written as a finite sum of guys \phi_\gamma(\alpha) where \alpha and \gamma are even smaller than \beta. Using this fact repeatedly, we can get a finite expression for any ordinal less than the Feferman–Schütte ordinal in terms of the \phi function, addition, and the ordinal 0.

• On the other hand, if \alpha and \gamma are less than \Gamma_0 then \phi_\gamma(\alpha) is less than \Gamma_0. So we can’t use the \phi function to name the Feferman–Schütte ordinal in terms of smaller ordinals.

But now let’s break the Feferman–Schütte barrier and reach some bigger countable ordinas!

The Γ function

The function \phi_x(0) is strictly increasing and continuous as a function of x. So, using Veblen’s theorems, we can define \Gamma_\alpha to be the \alphath solution of

x = \phi_x(0)

We can then define a bunch of enormous countable ordinals:

\Gamma_0, \Gamma_1, \Gamma_2, \dots

and still bigger ones:

\Gamma_\omega, \; \Gamma_{\omega^2}, \; \Gamma_{\omega^3} , \dots

and even bigger ones:

\Gamma_{\omega^\omega}, \; \Gamma_{\omega^{\omega^\omega}}, \; \Gamma_{\omega^{\omega^{\omega^\omega}}}, \dots

and even bigger ones:

\Gamma_{\epsilon_0}, \Gamma_{\epsilon_1}, \Gamma_{\epsilon_2}, \dots

But since \epsilon_\alpha is just \phi_1(\alpha), we can reach much bigger countable ordinals with the help of the \phi function:

\Gamma_{\phi_2(0)}, \; \Gamma_{\phi_3(0)}, \; \Gamma_{\phi_4(0)}, \dots

and we can do vastly better using the \Gamma function itself:

\Gamma_{\Gamma_0}, \Gamma_{\Gamma_{\Gamma_0}}, \Gamma_{\Gamma_{\Gamma_{\Gamma_0}}} , \dots

The limit of all these is the smallest solution of

x = \Gamma_x

As usual, this ordinal is still countable, but there’s no way to express it in terms of the \Gamma function and smaller ordinals. So we are stuck again.

In short: we got past the Feferman–Schütte barrier by introducing a name for the \alphath solution of x = \phi_x(0). We called it \Gamma_\alpha. This made us happy for about two minutes…

…. but then we ran into another barrier of the same kind.

So what we really need is a more general notation: one that gets us over not just this particular bump in the road, but all bumps of this kind! We don’t want to keep randomly choosing goofy new letters like \Gamma. We need something systematic.

The multi-variable Veblen hierarchy

We were actually doing pretty well with the \phi function. It was nice and systematic. It just wasn’t powerful enough. But if you’re trying to keep track of how far you’re driving on a really long trip, you want an odometer with more digits. So, let’s try that.

In other words, let’s generalize the \phi function to allow more subscripts. Let’s rename \Gamma_\alpha and call it \phi_{1,0}(\alpha). The fact that we’re using two subscripts says that we’re going beyond the old \phi functions with just one subscript. The subscripts 1 and 0 should remind you of what happens when you drive more than 9 miles: if your odometer has two digits, it’ll say you’re on mile 10.

Now we proceed as before: we make up new functions, each of which enumerates the fixed points of the previous one:

\phi_{1,1} = \phi'_{1,0}
\phi_{1,2} = \phi'_{1,1}
\phi_{1,3} = \phi'_{1,2}

and so on. In general, we let

\phi_{1,\gamma+1} = \phi'_{1,\gamma}

and when \gamma is a limit ordinal, we let

\displaystyle{ \phi_{1,\gamma}(\alpha) = \lim_{\beta \to \gamma} \phi_{1,\beta}(\alpha) }

Are you confused?

How could you possibly be confused???

Okay, maybe an example will help. In the last section, our notation fizzled out when we took the limit of these ordinals:

\Gamma_{\Gamma_0}, \Gamma_{\Gamma_{\Gamma_0}}, \Gamma_{\Gamma_{\Gamma_{\Gamma_0}}} , \dots

The limit of these is the smallest solution of x = \Gamma_x. But now we’re writing \Gamma_x = \phi_{1,0}(x), so this limit is the smallest fixed point of \phi_{1,0}. So, it’s \phi_{1,1}(0).

We can now ride happily into the sunset, defining \phi_{1,\gamma}(\alpha) for all ordinals \alpha, \gamma. Of course, this will never give us a notation for ordinals with

x = \phi_{1,x}(0)

But we don’t let that stop us! This is where the new extra subscript really comes in handy. We now define \phi_{2,0}(\alpha) to be the \alphath solution of

x = \phi_{1,x}(0)

Then we drive on as before. We let

\phi_{2,\gamma+1} = \phi'_{2,\gamma}

and when \gamma is a limit ordinal, we say

\displaystyle{ \phi_{2,\gamma}(\alpha) = \lim_{\beta \to \gamma} \phi_{2,\beta}(\alpha) }

I hope you get the idea. Keep doing this!

We can inductively define \phi_{\beta,\gamma}(\alpha) for all \alpha, \beta and \gamma. Of course, these functions will never give a notation for solutions of

x = \phi_{x,0}(0)

To describe these, we need a function with one more subscript! So let \phi_{1,0,0}(\alpha) be the \alphath solution of

x = \phi_{x,0}(0)

We can then proceed on and on and on, adding extra subscripts as needed.

This is called the multi-variable Veblen hierarchy.


To help you understand the multi-variable Veblen hierarchy, I’ll use it to describe lots of ordinals. Some are old friends. Starting with finite ones, we have:

\phi_0(0) = 1

\phi_0(0) + \phi_0(0) = 2

and so on, so we don’t need separate names for natural numbers… but I’ll use them just to save space.

\phi_0(1) = \omega

\phi_0(2) = \omega^2

and so on, so we don’t need separate names for \omega and its powers, but I’ll use them just to save space.

\phi_0(\omega) = \omega^\omega

\phi_0(\omega^\omega) = \omega^{\omega^\omega}

\phi_1(0) = \epsilon_0

\phi_1(1) = \epsilon_1

\displaystyle{ \phi_1(\phi_1(0)) = \epsilon_{\epsilon_0} }

\phi_2(0) = \zeta_0

\phi_2(1) = \zeta_1

where I should remind you that \zeta_\alpha is a name for the \alphath solution of x = \epsilon_x.

\phi_{1,0}(0) = \Gamma_0

\phi_{1,0}(1) = \Gamma_1

\displaystyle{ \phi_{1,0}(\phi_{1,0}(0)) = \Gamma_{\Gamma_0} }

\phi_{1,1}(0) is the limit of \Gamma_{\Gamma_0}, \Gamma_{\Gamma_{\Gamma_0}}, \Gamma_{\Gamma_{\Gamma_{\Gamma_0}}} , \dots

\phi_{1,0,0}(0) is called the Ackermann ordinal.

Apparently Wilhelm Ackermann, the logician who invented a very fast-growing function called Ackermann’s function, had a system for naming ordinals that fizzled out at this ordinal.

The small Veblen ordinal

There are obviously lots more ordinals that can be described using the multi-variable Veblen hierarchy, but I don’t have anything interesting to say about them. And you’re probably more interested in this question: what’s next?

The limit of these ordinals

\phi_1(0), \; \phi_{1,0}(0), \; \phi_{1,0,0}(0), \dots

is called the small Veblen ordinal. Yet again, it’s a countable ordinal. It’s the smallest ordinal that cannot be named in terms of smaller ordinals using the multi-variable Veblen hierarchy…. at least, not the version I described. And here’s a nice fact:

Theorem. Every ordinal \beta less than the small Veblen ordinal can be written as a finite expression in terms of the multi-variable \phi function, addition, and 0.

For example,

\Gamma_0 + \epsilon_{\epsilon_0} + \omega^\omega + 2

is equal to

\displaystyle{  \phi_{\phi_0(0),0}(0) + \phi_{\phi_0(0)}(\phi_{\phi_0(0)}(0)) +  \phi_0(\phi_0(\phi_0(0))) + \phi_0(0) + \phi_0(0)  }

On the one hand, this notation is quite tiresome to read. On the other hand, it’s amazing that it gets us so far!

Furthermore, if you stare at expressions like the above one for a while, and think about them abstractly, they should start looking like trees. So you should find it easy to believe that ordinals less than the small Veblen ordinal correspond to trees, perhaps labelled in some way.

Indeed, this paper describes a correspondence of this sort:

• Herman Ruge Jervell, Finite trees as ordinals, in New Computational Paradigms, Lecture Notes in Computer Science 3526, Springer, Berlin, 2005, pp. 211–220.

However, I don’t think his idea is quite same as what you’d come up with by staring at expressions like

\displaystyle{  \phi_{\phi_0(0),0}(0) + \phi_{\phi_0(0)}(\phi_{\phi_0(0)}(0)) +  \phi_0(\phi_0(\phi_0(0))) + \phi_0(0) + \phi_0(0)  }

Beyond the small Veblen ordinal

We’re not quite done yet. The modifier ‘small’ in the term ‘small Veblen ordinal’ should make you suspect that there’s more in Veblen’s paper. And indeed there is!

Veblen actually extended his multi-variable function \phi_{\gamma_1, \dots, \gamma_n}(\alpha) to the case where there are infinitely many variables. He requires that all but finitely many of these variables equal zero, to keep things under control. Using this, one can set up a notation for even bigger countable ordinals! This notation works for all ordinals less than the large Veblen ordinal.

We don’t need to stop here. The large Veblen ordinal is just the first of a new series of even larger countable ordinals!

These can again be defined as fixed points. Yes: it’s déjà vu all over again. But around here, people usually switch to a new method for naming these fixed points, called ‘ordinal collapsing functions’. One interesting thing about this notation is that it makes use of uncountable ordinal. The first uncountable ordinal is called \Omega, and it dwarfs all those we’ve seen here.

We can use the ordinal collapsing function \psi to name many of our favorite countable ordinals, and more:

\psi(\Omega) is \zeta_0, the smallest solution of x = \epsilon_x.

\psi(\Omega^\Omega) is \Gamma_0, the Feferman–Schütte ordinal.

\psi(\Omega^{\Omega^2}) is the Ackermann ordinal.

\psi(\Omega^{\Omega^\omega}) is the small Veblen ordinal.

\psi(\Omega^{\Omega^\Omega}) is the large Veblen ordinal.

\psi(\epsilon_{\Omega+1}) is called the Bachmann–Howard ordinal. This is the limit of the ordinals

\psi(\Omega), \psi(\Omega^\Omega), \psi(\Omega^{\Omega^\Omega}), \dots

I won’t explain this now. Maybe later! But not tonight. As Bilbo Baggins said:

The Road goes ever on and on
Out from the door where it began.
Now far ahead the Road has gone,
Let others follow it who can!
Let them a journey new begin,
But I at last with weary feet
Will turn towards the lighted inn,
My evening-rest and sleep to meet.

For more

But perhaps you’re impatient and want to begin a new journey now!

The people who study notations for very large countable ordinals tend to work on proof theory, because these ordinals have nice applications to that branch of logic. For example, Peano arithmetic is powerful enough to work with ordinals up to but not including \epsilon_0, so we call \epsilon_0 the proof-theoretic ordinal of Peano arithmetic. Stronger axiom systems have bigger proof-theoretic ordinals.

Unfortunately this makes it a bit hard to learn about large countable ordinals without learning, or at least bumping into, a lot of proof theory. And this subject, while interesting in principle, is quite tough. So it’s hard to find a readable introduction to large countable ordinals.

The bibliography of the Wikipedia article on large countable ordinals gives this half-hearted recommendation:

Wolfram Pohlers, Proof theory, Springer 1989 ISBN 0-387-51842-8 (for Veblen hierarchy and some impredicative ordinals). This is probably the most readable book on large countable ordinals (which is not saying much).

Unfortunately, Pohlers does not seem to give a detailed account of ordinal collapsing functions. If you want to read something fun that goes further than my posts so far, try this:

• Hilbert Levitz, Transfinite ordinals and their notations: for the uninitiated.

(Anyone whose first name is Hilbert must be born to do logic!)

This is both systematic and clear:

• Wikipedia, Ordinal collapsing functions.

And if you want to explore countable ordinals using a computer program, try this:

• Paul Budnik, Ordinal calculator and research tool.

Among other things, this calculator can add, multiply and exponentiate ordinals described using the multi-variable Veblen hierarchy—even the version with infinitely many variables!

Large Countable Ordinals (Part 2)

4 July, 2016

Last time I took you on a road trip to infinity. We zipped past a bunch of countable ordinals

\omega , \; \omega^\omega,\; \omega^{\omega^\omega}, \;\omega^{\omega^{\omega^\omega}}, \dots

and stopped for gas at the first one after all these. It’s called \epsilon_0. Heuristically, you can imagine it like this:

\epsilon_0 = \omega^{\omega^{\omega^{\omega^{\cdot^{\cdot^{\cdot}}}}}}

More rigorously, it’s the smallest ordinal x obeying the equation

x = \omega^x

Beyond εo

But I’m sure you have a question. What comes after \epsilon_0?

Well, duh! It’s

\epsilon_0 + 1

Then comes

\epsilon_0 + 2

and then eventually we get to

\epsilon_0 + \omega

and then

\epsilon_0 + \omega^2 ,\dots, \epsilon_0 + \omega^3,\dots \epsilon_0 + \omega^4,\dots

and after a long time

\epsilon_0 + \epsilon_0 = \epsilon_0 2

and then eventually


and then eventually….

Oh, I see! You wanted to know the first really interesting ordinal after \epsilon_0.

Well, this is a matter of taste, but you might be interested in \epsilon_1. This is the first ordinal after \epsilon_0 that satisfies this equation:

x = \omega^x

How do we actually reach this ordinal? Well, just as \epsilon_0 was the limit of this sequence:

\omega , \; \omega^\omega,\; \omega^{\omega^\omega}, \;\omega^{\omega^{\omega^\omega}}, \dots

\epsilon_1 is the limit of this:

\epsilon_0 + 1, \; \omega^{\epsilon_0 + 1}, \;  \omega^{\omega^{\epsilon_0 + 1}}, \; \omega^{\omega^{\omega^{\epsilon_0 + 1}}},\dots

You may wonder what I mean by the ‘limit’ of an increasing sequence of ordinals. I just mean the smallest ordinal greater than or equal to every ordinal in that sequence. Such a thing is guaranteed to exist, since if we treat ordinals as well-ordered sets, we can just take the union of all the sets in that sequence.

Here’s a picture of \epsilon_1, taken from David Madore’s interactive webpage:

In what sense is \epsilon_1 the first "really interesting" ordinal after \epsilon_0?

For one thing, it’s first that can’t be built out of 1, \omega and \epsilon_0 using finitely many additions, multiplications and exponentiations. In other words, if we use Cantor normal form to describe ordinals (as explained last time), and allow expressions involving \epsilon_0 as well as 1 and \omega, we get a notation for all ordinals up to \epsilon_1.

What’s the next really interesting ordinal after \epsilon_1? As you might expect, it’s called \epsilon_2. This is the next solution of

x = \omega^{x}

and it’s defined to be the limit of this sequence:

\epsilon_1 + 1, \; \omega^{\epsilon_1 + 1}, \;\omega^{\omega^{\epsilon_1 + 1}}, \; \omega^{\omega^{\omega^{\epsilon_1 + 1}}},\dots

Maybe now you get the pattern. In general, \epsilon_{\alpha} is the
\alphath solution of x = \omega^{x}. We can define this, if we’re smart, for any ordinal \alpha.

So, we can keep driving on through fields of ever larger ordinals:

\epsilon_2,\dots, \epsilon_{3},\dots, \epsilon_{4}, \dots

and eventually


which is the first ordinal bigger than \epsilon_0, \epsilon_1, \epsilon_2, \dots

Let’s stop and take a look!

Nice! Okay, back in the car…

\epsilon_{\omega+1},\dots, \epsilon_{\omega+2},\dots, \epsilon_{\omega+1},\dots

and then

\epsilon_{\omega^2},\dots , \epsilon_{\omega^3},\dots, \epsilon_{\omega^4},\dots

and then

\epsilon_{\omega^{\omega}},\dots, \epsilon_{\omega^{\omega^{\omega}}},\dots

As you can see, this gets boring after a while: it’s suspiciously similar to the beginning of our trip through the ordinals. The same ordinals are now showing up as subscripts in this epsilon notation. But we’re moving much faster now, since I’m skipping over much bigger gaps, not bothering to mention all sorts of ordinals like

\epsilon_{\omega^{\omega}} + \epsilon_{\omega 248} + \omega^{\omega^{\omega + 17}} + 1

Anyway… while we’re zipping along, I might as well finish telling you the story I started last time. My friend David Sternlieb and I were driving across South Dakota on Route 80. We kept seeing signs for the South Dakota Tractor Museum. When we finally got there, we were driving pretty darn fast, out of boredom—about 85 miles an hour. And guess what happened then!

Oh — wait a minute—this one is sort of interesting:

\displaystyle{ \epsilon_{\epsilon_0} }

Then come some more like that:

\epsilon_{\epsilon_1},\dots, \epsilon_{\epsilon_2},\dots \epsilon_{\epsilon_3},\dots

until we reach this:


and then

\epsilon_{\epsilon_{\omega^{\omega}}},\dots, \epsilon_{\epsilon_{\omega^{\omega^{\omega}}}},\dots

As we keep speeding up, we see:

\epsilon_{\epsilon_{\epsilon_0}},\dots \epsilon_{\epsilon_{\epsilon_{\epsilon_0}}},\dots \epsilon_{\epsilon_{\epsilon_{\epsilon_{\epsilon_0}}}},\dots

So, anyway: by the time we got that tractor museum, we were driving really fast. And, all we saw as we whizzed by was a bunch of rusty tractors out in a field! It was over in a split second! It was a real anticlimax — just like this anecdote, in fact.

But that’s just the way it is when you’re driving through these ordinals! Every ordinal, no matter how large, looks pretty pathetic and small compared to the ones ahead — so you keep speeding up, looking for something ‘really new and different’. But when you find one, it turns out to be part of a larger pattern, and soon that gets boring too.

For example, when we reach the limit of this sequence:

\epsilon_0, \epsilon_{\epsilon_0}, \epsilon_{\epsilon_{\epsilon_0}}, \epsilon_{\epsilon_{\epsilon_{\epsilon_0}}}, \epsilon_{\epsilon_{\epsilon_{\epsilon_{\epsilon_0}}}},\dots

our notation fizzles out again, since this is the first solution of

x = \epsilon_{x}

We could make up a new name for this ordinal, like \zeta_0. I don’t think this name is very common, though I’ve seen it. We could call it the Tractor Museum of Countable Ordinals.

Now we can play the whole game again, defining the zeta number \zeta_{\alpha} to be the \alphath solution of

x = \epsilon_x

sort of like how we defined the epsilons. This kind of equation, where something equals some function of itself, is called a fixed point equation.

But since we’ll have to play this game infinitely often, we might as well be more systematic about it!

The Veblen hierarchy

As you can see, we keep running into new, qualitatively different types of ordinals. First we ran into the powers of omega. Then we ran into the epsilons, and then the zetas. It’s gonna keep happening! For each type of ordinal, our notation fizzles out when we reach the first ‘fixed point’— when the xth ordinal of this type is actually equal to x.

So, instead of making up infinitely many Greek letters for different types of ordinals let’s index them… by ordinals! For each ordinal \gamma we’ll have a type of ordinal. We’ll let \phi_\gamma(\alpha) be the \alphath ordinal of type \gamma.

We can use the fixed point equation to define \phi_{\gamma+1} in terms of \phi_{\gamma}. In other words, we start off by defining

\phi_0(\alpha) = \omega^{\alpha}

and then define


to be the \alphath solution of

x = \phi_{\gamma}(x)

where we start counting at \alpha = 0, so the first solution is called the ‘zeroth’.

We can even make sense of \phi_\gamma(\alpha) when \gamma itself is infinite! Suppose \gamma is a limit of smaller ordinals. Then we define \phi_\gamma(x) to be the limit of \phi_\beta(x) as \beta approaches \gamma. I’ll make this more precise next time.

We get infinitely many different types of ordinals, called the Veblen hierarchy. So, concretely, the Veblen hierarchy starts with the powers of \omega:

\phi_0(\alpha) = \omega^\alpha

and then it goes on to the ‘epsilons’:

\phi_1(\alpha) = \epsilon_\alpha

and then it goes on to what I called the ‘zetas’:

\phi_2(\alpha) = \zeta_\alpha

But that’s just the start!

The Feferman–Schütte ordinal

Boosting the subscript \gamma in \phi_\gamma(\alpha) increases the result much more than boosting \alpha, so let’s focus on that and just let \alpha = 0. The Veblen hierarchy contains ordinals like this:

\phi_{\omega}(0), \; \phi_{\omega+1}(0), \; \phi_{\omega+2}(0), \dots

and then ordinals like this:

\phi_{\omega^2}(0), \; \phi_{\omega^3}(0), \; \phi_{\omega^4}(0), \dots

and then ordinals like this:

\phi_{\omega^\omega}(0), \; \phi_{\omega^{\omega^\omega}}(0), \; \phi_{\omega^{\omega^{\omega^{\omega}}}}(0), \dots

and then this:

\phi_{\epsilon_0}(0), \phi_{\epsilon_{\epsilon_0}}(0), \phi_{\epsilon_{\epsilon_{\epsilon_0}}}(0),  \dots

where of course I’m skipping huge infinite stretches of ‘boring’ ones. But note that

\phi_{\omega}(0) = \phi_{\phi_0(0)}(0)


\phi_{\epsilon_0}(0) = \phi_{\phi_1(0)}(0)


\phi_{\zeta_0}(0) = \phi_{\phi_2(0)}(0)

In short, we can plug the phi function into itself—and we get the biggest effect if we plug it into the subscript!

So, if we’re in a rush to reach some really big countable ordinals, we can try these:

\phi_0(0), \; \phi_{\phi_0(0)}(0) , \; \phi_{\phi_{\phi_0(0)}(0)}(0), \dots

But the limit of these is an ordinal x that has

x = \phi_x(0)

This is called the Feferman–Schütte ordinal and denoted \Gamma_0.

In fact, the Feferman–Schütte ordinal is the smallest solution of

x = \phi_x(0)

Since this equation is self-referential, we can’t describe Feferman–Schütte ordinal using the Veblen hierarchy—at least, not without using the Feferman–Schütte ordinal!

Indeed, some mathematicians have made a big deal about this ordinal, claiming it’s

the smallest ordinal that cannot be described without self-reference.

This takes some explaining, and it’s somewhat controversial. After all, there’s a sense in which every fixed point equation is self-referential. But there’s a certain precise sense in which the Feferman–Schütte ordinal is different from previous ones.

Anyway, you have admit that this is a very cute description of the Fefferman–Schuette ordinal: “the smallest ordinal that cannot be described without self-reference.” Does it use self-reference? It had better—otherwise we have a contradiction!

It’s a little scary, like this picture:

More importantly for us, the Veblen hierarchy fizzles out when we hit the Feferman–Schuette ordinal. Let me say what I mean by that.

Veblen normal form

The Veblen hierarchy gives a notation for ordinals called the Veblen normal form. You can think of this as a high-powered version of Cantor normal form, which we discussed last time.

Veblen normal form relies on this result:

Theorem. Any ordinal \beta can be written uniquely as

\beta = \phi_{\gamma_1}(\alpha_1) + \dots + \phi_{\gamma_{k}}(\alpha_k)

where k is a natural number, each term is less than or equal to the previous one, and \alpha_i < \phi_{\gamma_i}(\alpha_i) for all i.

Note that we can also use this theorem to write out the ordinals \beta_i and \gamma_i, and so on, recursively. So, it gives us a notation for ordinals.

However, this notation is only useful when all the ordinals \alpha_i, \gamma_i are less than the ordinal \beta that we’re trying to describe. Otherwise we need to already have a notation for \beta to express \alpha in Veblen normal form!

So, the power of this notation eventually fizzles out. And the place where it does is Feferman–Schütte ordinal. Every ordinal less than this can be expressed in terms of 0, addition, and the function \phi, using just finitely many symbols!

The moral

As I hope you see, the power of the human mind to see a pattern and formalize it gives the quest for large countable ordinals a strange quality. As soon as we see a systematic way to generate a sequence of larger and larger ordinals, we know this sequence has a limit that’s larger then all of those! And this opens the door to even larger ones….

So, this whole journey feels a bit like trying to outrace our car’s own shadow as we drive away from the sunset: the faster we drive, the faster it shoots ahead of us. We’ll never win.

On the other hand, we’ll only lose if we get tired.

So it’s interesting to hear what happens next. We don’t have to give up. The usual symbol for the Feferman–Schütte ordinal should be a clue. It’s called \Gamma_0. And that’s because it’s just the start of a new series of even bigger countable ordinals!

I’m dying to tell you about those. But this is enough for today.

Large Countable Ordinals (Part 1)

29 June, 2016

I love the infinite.

It may not exist in the physical world, but we can set up rules to think about it in consistent ways, and then it’s a helpful concept. The reason is that infinity is often easier to think about than very large finite numbers.

Finding rules to work with the infinite is one of the great triumphs of mathematics. Cantor’s realization that there are different sizes of infinity is truly wondrous—and by now, it’s part of the everyday bread and butter of mathematics.

Trying to create a notation for these different infinities is very challenging. It’s not a fair challenge, because there are more infinities than expressions we can write down in any given alphabet! But if we seek a notation for countable ordinals, the challenge becomes more fair.

It’s still incredibly frustrating. No matter what notation we use it fizzles out too soon… making us wish we’d invented a more general notation. But this process of ‘fizzling out’ is fascinating to me. There’s something profound about it. So, I would like to tell you about this.

Today I’ll start with a warmup. Cantor invented a notation for ordinals that works great for ordinals less than a certain ordinal called ε0. Next time I’ll go further, and bring in the ‘single-variable Veblen hierarchy’! This lets us describe all ordinals below a big guy called the ‘Feferman–Schütte ordinal’.

In the post after that I’ll bring in the ‘multi-variable Veblen hierarchy’, which gets us all the ordinals below the ‘small Veblen ordinal’. We’ll even touch on the ‘large Veblen ordinal’, which requires a version of the Veblen hierarchy with infinitely many variables. But all this is really just the beginning of a longer story. That’s how infinity works: the story never ends!

To describe countable ordinals beyond the large Veblen ordinal, most people switch to an entirely different set of ideas, called ‘ordinal collapsing functions’. I may tell you about those someday. Not soon, but someday. My interest in the infinite doesn’t seem to be waning. It’s a decadent hobby, but hey: some middle-aged men buy fancy red sports cars and drive them really fast. Studying notions of infinity is cooler, and it’s environmentally friendly.

I can even imagine writing a book about the infinite. Maybe these posts will become part of that book. But one step at a time…

Cardinals versus ordinals

Cantor invented two different kinds of infinities: cardinals and ordinals. Cardinals say how big sets are. Two sets can be put into 1-1 correspondence iff they have the same number of elements—where this kind of ‘number’ is a cardinal. You may have heard about cardinals like aleph-nought (the number of integers), 2 to power aleph-nought (the number of real numbers), and so on. You may have even heard rumors of much bigger cardinals, like ‘inaccessible cardinals’ or ‘super-huge cardinals’. All this is tremendously fun, and I recommend starting here:

• Frank R. Drake, Set Theory, an Introduction to Large Cardinals, North-Holland, 1974.

There are other books that go much further, but as a beginner, I found this to be the most fun.

But I don’t want to talk about cardinals! I want to talk about ordinals.

Ordinals say how big ‘well-ordered’ sets are. A set is well-ordered if it comes with a relation ≤ obeying the usual rules:

Transitivity: if x ≤ y and y ≤ z then x ≤ z

Reflexivity: x ≤ x

Antisymmetry: if x ≤ y and y ≤ x then x = y

and one more rule: every nonempty subset has a smallest element!

For example, the empty set


is well-ordered in a trivial sort of way, and the corresponding ordinal is called


Similarly, any set with just one element, like this:


is well-ordered in a trivial sort of way, and the corresponding ordinal is called


Similarly, any set with two elements, like this:


becomes well-ordered as soon as we decree which element is bigger; the obvious choice is to say 0 < 1. The corresponding ordinal is called


Similarly, any set with three elements, like this:


becomes well-ordered as soon as we linearly order it; the obvious choice here is to say 0 < 1 < 2. The corresponding ordinal is called


Perhaps you’re getting the pattern — you’ve probably seen these particular ordinals before, maybe sometime in grade school. They’re called finite ordinals, or "natural numbers".

But there’s a cute trick they probably didn’t teach you then: we can define each ordinal to be the set of all ordinals less than it:

0 = \{\} (since no ordinal is less than 0)
1 = \{0\} (since only 0 is less than 1)
2 = \{0,1\} (since 0 and 1 are less than 2)
3 = \{0,1,2\} (since 0, 1 and 2 are less than 3)

and so on. It’s nice because now each ordinal is a well-ordered set of the size that ordinal stands for. And, we can define one ordinal to be "less than or equal" to another precisely when its a subset of the other.

Infinite ordinals

What comes after all the finite ordinals? Well, the set of all finite ordinals is itself well-ordered:

\{0,1,2,3,\dots \}

So, there’s an ordinal corresponding to this — and it’s the first infinite ordinal. It’s usually called \omega, pronounced ‘omega’. Using the cute trick I mentioned, we can actually define

\omega = \{0,1,2,3,\dots\}

What comes after this? Well, it turns out there’s a well-ordered set


containing the finite ordinals together with \omega, with the obvious notion of "less than": \omega is bigger than the rest. Corresponding to this set there’s an ordinal called


As usual, we can simply define

\omega+1 = \{0,1,2,3,\dots,\omega\}

At this point you could be confused if you know about cardinals, so let me throw in a word of reassurance. The sets \omega and \omega+1 have the same cardinality: they are both countable. In other words, you can find a 1-1 and onto function between these sets. But \omega and \omega+1 are different as ordinals, since you can’t find a 1-1 and onto function between them that preserves the ordering. This is easy to see, since \omega+1 has a biggest element while \omega does not.

Indeed, all the ordinals in this series of posts will be countable! So for the infinite ones, you can imagine that all I’m doing is taking your favorite countable set and well-ordering it in ever more sneaky ways.

Okay, so we got to \omega + 1. What comes next? Well, not surprisingly, it’s

\omega+2 = \{0,1,2,3,\dots,\omega,\omega+1\}

Then comes

\omega+3, \omega+4, \omega+5,\dots

and so on. You get the idea.

I haven’t really defined ordinal addition in general. I’m trying to keep things fun, not like a textbook. But you can read about it here:

• Wikipedia, Ordinal arithmetic: addition.

The main surprise is that ordinal addition is not commutative. We’ve seen that \omega + 1 \ne \omega, since

\omega + 1 = \{1, 2, 3, \dots, \omega \}

is an infinite list of things… and then one more thing that comes after all those!. But 1 + \omega = \omega, because one thing followed by a list of infinitely many more is just a list of infinitely many things.

With ordinals, it’s not just about quantity: the order matters!

ω+ω and beyond

Okay, so we’ve seen these ordinals:

1, 2, 3, \dots, \omega, \omega + 1, \omega + 2, \omega+3, \dots

What next?

Well, the ordinal after all these is called \omega+\omega. People often call it "omega times 2" or \omega 2 for short. So,

\omega 2 = \{0,1,2,3,\dots,\omega,\omega+1,\omega+2,\omega+3,\dots.\}

It would be fun to have a book with \omega pages, each page half as thick as the previous page. You can tell a nice long story with an \omega-sized book. I think you can imagine this. And if you put one such book next to another, that’s a nice picture of \omega 2.

It’s worth noting that \omega 2 is not the same as 2 \omega. We have

\omega 2 = \omega + \omega


2 \omega = 2 + 2 + 2 + \cdots

where we add \omega of these terms. But

2 + 2 + 2 + \cdots = (1 + 1) + (1 + 1) + (1 + 1) \dots = \omega


2 \omega = \omega

This is not a proof, because I haven’t given you the official definition of how to multiply ordinals. You can find it here:

• Wikipedia, Ordinal arithmetic: multiplication.

Using this you can prove that what I’m saying is true. Nonetheless, I hope you see why what I’m saying might make sense. Like ordinal addition, ordinal multiplication is not commutative! If you don’t like this, you should study cardinals instead.

What next? Well, then comes

\omega 2 + 1, \omega 2 + 2,\dots

and so on. But you probably have the hang of this already, so we can skip right ahead to \omega 3.

In fact, you’re probably ready to skip right ahead to \omega 4, and \omega 5, and so on.

In fact, I bet now you’re ready to skip all the way to "omega times omega", or \omega^2 for short:

\omega^2 = \{0,1,2\dots\omega,\omega+1,\omega+2,\dots ,\omega 2,\omega 2+1,\omega 2+2,\dots\}

Suppose you had an encyclopedia with \omega volumes, each one being a book with \omega pages. If each book is twice as thin as one before, you’ll have \omega^2 pages — and it can still fit in one bookshelf! Here’s the idea:

What comes next? Well, we have

\omega^2+1, \omega^2+2, \dots

and so on, and after all these come

\omega^2+\omega, \omega^2+\omega+1, \omega^2+\omega+2, \dots

and so on — and eventually

\omega^2 + \omega^2 = \omega^2 2

and then a bunch more, and then

\omega^2 3

and then a bunch more, and then

\omega^2 4

and then a bunch more, and more, and eventually

\omega^2 \omega = \omega^3

You can probably imagine a bookcase containing \omega encyclopedias, each with \omega volumes, each with \omega pages, for a total of \omega^3 pages. That’s \omega^3.


I’ve been skipping more and more steps to keep you from getting bored. I know you have plenty to do and can’t spend an infinite amount of time reading this, even if the subject is infinity.

So if you don’t mind me just mentioning some of the high points, there are guys like \omega^4 and \omega^5 and so on, and after all these comes


Let’s try to we imagine this! First, imagine a book with \omega pages. Then imagine an encyclopedia of books like this, with \omega volumes. Then imagine a bookcase containing \omega encyclopedias like this. Then imagine a room containing \omega bookcases like this. Then imagine a floor with library with \omega rooms like this. Then imagine a library with \omega floors like this. Then imagine a city with \omega libraries like this. And so on, ad infinitum.

You have to be a bit careful here, or you’ll be imagining an uncountable number of pages. To name a particular page in this universe, you have to say something like this:

the 23rd page of the 107th book of the 20th encyclopedia in the 7th bookcase in 0th room on the 1000th floor of the 973rd library in the 6th city on the 0th continent on the 0th planet in the 0th solar system in the…

But it’s crucial that after some finite point you keep saying “the 0th”. Without that restriction, there would be uncountably many pages! This is just one of the rules for how ordinal exponentiation works. For the details, read:

• Wikipedia, Ordinal arithmetic: exponentiation.

As they say,

But for infinite exponents, the definition may not be obvious.

Here’s a picture of \omega^\omega, taken from David Madore’s wonderful interactive webpage:

On his page, if you click on any of the labels for an initial portion of an ordinal, like \omega, \omega^2, \omega^3 or \omega^4 here, the picture will expand to show that portion!

And here’s another picture, where each turn of the clock’s hand takes you to a higher power of \omega:

Ordinals up to ε0

Okay, so we’ve reached \omega^\omega. Now what?

Well, then comes \omega^\omega + 1, and so on, but I’m sure that’s boring by now. And then come ordinals like

\omega^\omega 2,\dots, \omega^\omega 3, \dots, \omega^\omega 4, \dots

leading up to

\omega^\omega \omega = \omega^{\omega + 1}

Then eventually come ordinals like

\omega^\omega \omega^2 , \dots, \omega^\omega \omega^3, \dots, \omega^\omega \omega^4, \dots

and so on, leading up to

\omega^\omega \omega^\omega = \omega^{\omega + \omega} = \omega^{\omega 2}

This actually reminds me of something that happened driving across South Dakota one summer with a friend of mine. We were in college, so we had the summer off, so we drive across the country. We drove across South Dakota all the way from the eastern border to the west on Interstate 90.

This state is huge — about 600 kilometers across, and most of it is really flat, so the drive was really boring. We kept seeing signs for a bunch of tourist attractions on the western edge of the state, like the Badlands and Mt. Rushmore — a mountain that they carved to look like faces of presidents, just to give people some reason to keep driving.

Anyway, I’ll tell you the rest of the story later — I see some more ordinals coming up:

\omega^{\omega 3},\dots \omega^{\omega 4},\dots \omega^{\omega 5},\dots

We’re really whizzing along now just to keep from getting bored — just like my friend and I did in South Dakota. You might fondly imagine that we had fun trading stories and jokes, like they do in road movies. But we were driving all the way from Princeton to my friend Chip’s cabin in California. By the time we got to South Dakota, we were all out of stories and jokes.

Hey, look! It’s

\omega^{\omega \omega}= \omega^{\omega^2}

That was cool. Then comes

\omega^{\omega^3}, \dots \omega^{\omega^4}, \dots \omega^{\omega^5}, \dots

and so on.

Anyway, back to my story. For the first half of our half of our trip across the state, we kept seeing signs for something called the South Dakota Tractor Museum.

Oh, wait, here’s an interesting ordinal:


Let’s stop and take look:

That was cool. Okay, let’s keep driving. Here comes

\omega^{\omega^\omega} + 1, \omega^{\omega^\omega} + 2, \dots

and then

\omega^{\omega^\omega} + \omega, \dots, \omega^{\omega^\omega} + \omega 2, \dots, \omega^{\omega^\omega} + \omega 3, \dots

and then

\omega^{\omega^\omega} + \omega^2, \dots, \omega^{\omega^\omega} + \omega^3, \dots

and eventually

\omega^{\omega^\omega} + \omega^\omega

and eventually

\omega^{\omega^\omega} + \omega^{\omega^\omega} = \omega^{\omega^\omega} 2

and then

\omega^{\omega^\omega} 3, \dots, \omega^{\omega^\omega} 4, \dots, \omega^{\omega^\omega} 5, \dots

and eventually

\omega^{\omega^\omega} \omega = \omega^{\omega^\omega + 1}

After a while we reach

\omega^{\omega^\omega + 2}, \dots, {\omega^\omega + 3}, \dots {\omega^\omega + 4}, \dots

and then

\omega^{\omega^\omega + \omega}, \dots, \omega^{\omega^\omega + \omega 2},  \dots, \omega^{\omega^\omega + \omega 3}, \dots

and then

\omega^{\omega^\omega + \omega^2}, \dots,  \omega^{\omega^\omega + \omega^3}, \dots,  \omega^{\omega^\omega + \omega^4}, \dots

and then

\omega^{\omega^\omega + \omega^\omega} = \omega^{\omega^\omega 2}

and then

\omega^{\omega^\omega 3}, \dots,  \omega^{\omega^\omega 4} , \dots

and then

\omega^{\omega^\omega \omega} = \omega^{\omega^{\omega + 1}}

and eventually

\omega^{\omega^{\omega + 2}}, \dots, \omega^{\omega^{\omega + 3}}, \dots, \omega^{\omega^{\omega + 4}}, \dots

This is pretty boring; we’re already going infinitely fast, but we’re still just picking up speed, and it’ll take a while before we reach something interesting.

Anyway, we started getting really curious about this South Dakota Tractor Museum — it sounded sort of funny. It took 250 kilometers of driving before we passed it. We wouldn’t normally care about a tractor museum, but there was really nothing else to think about while we were driving. The only thing to see were fields of grain, and these signs, which kept building up the suspense, saying things like


We’re zipping along really fast now:

\omega^{\omega^{\omega^\omega}}, \dots, \omega^{\omega^{\omega^{\omega^\omega}}},\dots , \omega^{\omega^{\omega^{\omega^{\omega^{\omega}}}}},\dots

What comes after all these?

At this point we need to stop for gas. Our notation for ordinals just ran out!

The ordinals don’t stop; it’s just our notation that fizzled out. The set of all ordinals listed up to now — including all the ones we zipped past — is a well-ordered set called


or "epsilon-nought". This has the amazing property that

\epsilon_0 = \omega^{\epsilon_0}

And it’s the smallest ordinal with this property! It looks like this:

It’s an amazing fact that every countable ordinal is isomorphic, as an well-ordered set, to some subset of the real line. David Madore took advantage of this to make his pictures.

Cantor normal form

I’ll tell you the rest of my road story later. For now let me conclude with a bit of math.

There’s a nice notation for all ordinals less than \epsilon_0, called ‘Cantor normal form’. We’ve been seeing lots of examples. Here is a typical ordinal in Cantor normal form:

\omega^{\omega^{\omega^{\omega+\omega+1}}} \; + \; \omega^{\omega^\omega+\omega^\omega} \; + \; \omega^\omega \;+\; \omega + \omega + 1 + 1 + 1

The idea is that you write it out using just + and exponentials and 1 and \omega.

Here is the theorem that justifies Cantor normal form:

Theorem. Every ordinal \alpha can be uniquely written as

\alpha = \omega^{\beta_1} c_1 + \omega^{\beta_2}c_2 + \cdots + \omega^{\beta_k}c_k

where k is a natural number, c_1, c_2, \ldots, c_k are positive integers, and \beta_1 > \beta_2 > \cdots > \beta_k \geq 0 are ordinals.

It’s like writing ordinals in base \omega.

Note that every ordinal can be written this way! So why did I say that Cantor normal form is nice notation for ordinals less than \epsilon_0? Here’s the problem: the Cantor normal form of \epsilon_0 is

\epsilon_0 = \omega^{\epsilon_0}

So, when we hit \epsilon_0, the exponents \beta_1 ,\beta_2, \dots, \beta_k can be as big as the ordinal \alpha we’re trying to describe! So, while the Cantor normal form still exists for ordinals \geq \epsilon_0, it doesn’t give a good notation for them unless we already have some notation for ordinals this big!

This is what I mean by a notation ‘fizzling out’. We’ll keep seeing this problem in the posts to come.

But for an ordinal \alpha less than \epsilon_0, something nice happens. In this case, when we write

\alpha = \omega^{\beta_1} c_1 + \omega^{\beta_2}c_2 + \cdots + \omega^{\beta_k}c_k

all the exponents \beta_1, \beta_2, \dots, \beta_k are less than \alpha. So we can go ahead and write them in Cantor normal form, and so on… and because ordinals are well-ordered, this process ends after finitely many steps.

So, Cantor normal form gives a nice way to write any ordinal less than \epsilon_0 using finitely many symbols! If we abbreviate \omega^0 as 1, and write multiplication by positive integers in terms of addition, we get expressions like this:

\omega^{\omega^{\omega^{\omega^{1 + 1} +\omega+1}}} \; + \; \omega^{\omega^\omega+\omega^\omega} \; + \; \omega^{\omega+1+1} \;+\; \omega + 1

They look like trees. Even better, you can write a computer program that does ordinal arithmetic for ordinals of this form: you can add, multiply, and exponentiate them, and tell when one is less than another.

So, there’s really no reason to be scared of \epsilon_0. Remember, each ordinal is just the set of all smaller ordinals. So you can think of \epsilon_0 as the set of tree-shaped expressions like the one above, with a particular rule for saying when one is less than another. It’s a perfectly reasonable entity. For some real excitement, we’ll need to move on to larger ordinals. We’ll do that next time.

For more, see:

• Wikipedia, Cantor normal form.


Get every new post delivered to your Inbox.

Join 3,331 other followers