Topological Crystals (Part 1)

22 July, 2016

k4_crystal

A while back, we started talking about crystals:

• John Baez, Diamonds and triamonds, Azimuth, 11 April 2016.

In the comments on that post, a bunch of us worked on some puzzles connected to ‘topological crystallography’—a subject that blends graph theory, topology and mathematical crystallography. You can learn more about that subject here:

• Tosio Sunada, Crystals that nature might miss creating, Notices of the AMS 55 (2008), 208–215.

Greg Egan and I got so interested that we wrote a paper about it!

• John Baez and Greg Egan, Topological crystals.

I’ll explain the basic ideas in a series of posts here.

First, a few personal words.

I feel a bit guilty putting so much work into this paper when I should be developing network theory to the point where it does our planet some good. I seem to need a certain amount of beautiful pure math to stay sane. But this project did at least teach me a lot about the topology of graphs.

For those not in the know, applying homology theory to graphs might sound fancy and interesting. For people who have studied a reasonable amount of topology, it probably sounds easy and boring. The first homology of a graph of genus g is a free abelian group on g generators: it’s a complete invariant of connected graphs up to homotopy equivalence. Case closed!

But there’s actually more to it, because studying graphs up to homotopy equivalence kills most of the fun. When we’re studying networks in real life we need a more refined outlook on graphs. So some aspects of this project might pay off, someday, in ways that have nothing to do with crystallography. But right now I’ll just talk about it as a fun self-contained set of puzzles.

I’ll start by quickly sketching how to construct topological crystals, and illustrate it with the example of graphene, a 2-dimensional form of carbon:


I’ll precisely state our biggest result, which says when this construction gives a crystal where the atoms don’t bump into each other and the bonds between atoms don’t cross each other. Later I may come back and add detail, but for now you can find details in our paper.

Constructing topological crystals

The ‘maximal abelian cover’ of a graph plays a key role in Sunada’s work on topological crystallography. Just as the universal cover of a connected graph X has the fundamental group \pi_1(X) as its group of deck transformations, the maximal abelian cover, denoted \overline{X}, has the abelianization of \pi_1(X) as its group of deck transformations. It thus covers every other connected cover of X whose group of deck transformations is abelian. Since the abelianization of \pi_1(X) is the first homology group H_1(X,\mathbb{Z}), there is a close connection between the maximal abelian cover and homology theory.

In our paper, Greg and I prove that for a large class of graphs, the maximal abelian cover can naturally be embedded in the vector space H_1(X,\mathbb{R}). We call this embedded copy of \overline{X} a ‘topological crystal’. The symmetries of the original graph can be lifted to symmetries of its topological crystal, but the topological crystal also has an n-dimensional lattice of translational symmetries. In 2- and 3-dimensional examples, the topological crystal can serve as the blueprint for an actual crystal, with atoms at the vertices and bonds along the edges.

The general construction of topological crystals was developed by Kotani and Sunada, and later by Eon. Sunada uses ‘topological crystal’ for an even more general concept, but we only need a special case.

Here’s how it works. We start with a graph X. This has a space C_0(X,\mathbb{R}) of 0-chains, which are formal linear combinations of vertices, and a space C_1(X,\mathbb{R}) of 1-chains, which are formal linear combinations of edges. There is a boundary operator

\partial \colon C_1(X,\mathbb{R}) \to C_0(X,\mathbb{R})

This is the linear operator sending any edge to the difference of its two endpoints. The kernel of this operator is called the space of 1-cycles, Z_1(X,\mathbb{R}). There is an inner product on the space of 1-chains such that edges form an orthonormal basis. This determines an orthogonal projection

\pi \colon C_1(X,\mathbb{R}) \to Z_1(X,\mathbb{R})

For a graph, Z_1(X,\mathbb{R}) is isomorphic to the first homology group H_1(X,\mathbb{R}). So, to obtain the topological crystal of X, we need only embed its maximal abelian cover \overline{X} in Z_1(X,\mathbb{R}). We do this by embedding \overline{X} in C_1(X,\mathbb{R}) and then projecting it down via \pi.

To accomplish this, we need to fix a basepoint for X. Each path \gamma in X starting at this basepoint determines a 1-chain c_\gamma. These 1-chains correspond to the vertices of \overline{X}. The graph \overline{X} has an edge from c_\gamma to c_{\gamma'} whenever the path \gamma' is obtained by adding an extra edge to \gamma. This edge is a straight line segment from the point c_\gamma to the point c_{\gamma'}.

The hard part is checking that the projection \pi maps this copy of \overline{X} into Z_1(X,\mathbb{R}) in a one-to-one manner. In Theorems 6 and 7 of our paper we prove that this happens precisely when the graph X has no ‘bridges’: that is, edges whose removal would disconnect X.

Kotani and Sunada noted that this condition is necessary. That’s actually pretty easy to see. The challenge was to show that it’s sufficient! For this, our main technical tool is Lemma 5, which for any path \gamma decomposes the 1-chain c_\gamma into manageable pieces.

We call the resulting copy of \overline{X} embedded in Z_1(X,\mathbb{R}) a topological crystal.

Let’s see how it works in an example!

Take X to be this graph:

Since X has 3 edges, the space of 1-chains is 3-dimensional. Since X has genus two, the space of 1-cycles forms a plane in this 3-dimensional space. If we consider paths \gamma in X starting at the red vertex, form the 1-chains c_\gamma, and project them down to this plane, we obtain the following picture:

Here the 1-chains c_\gamma are the white and red dots. These are the vertices of \overline{X}, while the line segments between them are the edges of \overline{X}. Projecting these vertices and edges onto the plane of 1-cycles, we obtain the topological crystal for X. The blue dots come from projecting the white dots onto the plane of 1-cycles, while the red dots already lie on this plane. The resulting topological crystal provides the pattern for graphene:

That’s all there is to the basic idea! But there’s a lot more to say about it, and a lot of fun examples to look at: diamonds, triamonds, hyperquartz and more.


Frigatebirds

18 July, 2016

 

Frigatebirds are amazing!

They have the largest ratio of wing area to body weight of any bird. This lets them fly very long distances while only rarely flapping their wings. They often stay in the air for weeks at time. And one being tracked by satellite in the Indian Ocean stayed aloft for two months.

Surprisingly for sea birds, they don’t go into the water. Their feathers aren’t waterproof. They are true creatures of the air. They snatch fish from the ocean surface using their long, hooked bills—and they often eat flying fish! They clean themselves in flight by flying low and wetting themselves at the water’s surface before preening themselves.

They live a long time: often over 35 years.

But here’s the cool new discovery:

Since the frigatebird spends most of its life at sea, its habits outside of when it breeds on land aren’t well-known—until researchers started tracking them around the Indian Ocean. What the researchers discovered is that the birds’ flying ability almost defies belief.

Ornithologist Henri Weimerskirch put satellite tags on a couple of dozen frigatebirds, as well as instruments that measured body functions such as heart rate. When the data started to come in, he could hardly believe how high the birds flew.

“First, we found, ‘Whoa, 1,500 meters. Wow. Excellent, fantastique,’ ” says Weimerskirch, who is with the National Center for Scientific Research in Paris. “And after 2,000, after 3,000, after 4,000 meters — OK, at this altitude they are in freezing conditions, especially surprising for a tropical bird.”

Four thousand meters is more than 12,000 feet, or as high as parts of the Rocky Mountains. “There is no other bird flying so high relative to the sea surface,” he says.

Weimerskirch says that kind of flying should take a huge amount of energy. But the instruments monitoring the birds’ heartbeats showed that the birds weren’t even working up a sweat. (They wouldn’t, actually, since birds don’t sweat, but their heart rate wasn’t going up.)

How did they do it? By flying into a cloud.

“It’s the only bird that is known to intentionally enter into a cloud,” Weimerskirch says. And not just any cloud—a fluffy, white cumulus cloud. Over the ocean, these clouds tend to form in places where warm air rises from the sea surface. The birds hitch a ride on the updraft, all the way up to the top of the cloud.

[…]

“Absolutely incredible,” says Curtis Deutsch, an oceanographer at the University of Washington. “They’re doing it right through these cumulus clouds. You know, if you’ve ever been on an airplane, flying through turbulence, you know it can be a little bit nerve-wracking.”

One of the tagged birds soared 40 miles without a wing-flap. Several covered more than 300 miles a day on average, and flew continuously for weeks.

• Christopher Joyce, Nonstop flight: how the frigatebird can soar for weeks without stopping, All Things Considered, National Public Radio, 30 June 2016.

Frigatebirds aren’t admirable in every way. They’re kleptoparasites—now there’s a word you don’t hear every day! That’s a name for animals that steal food:

Frigatebirds will rob other seabirds such as boobies, particularly the red-footed booby, tropicbirds, shearwaters, petrels, terns, gulls and even ospreys of their catch, using their speed and maneuverability to outrun and harass their victims until they regurgitate their stomach contents. They may either assail their targets after they have caught their food or circle high over seabird colonies waiting for parent birds to return laden with food.

Frigatebird, Wikipedia.


Operads for “Systems of Systems”

13 July, 2016

“Systems of systems” is a fashionable buzzword for complicated systems that are themselves made of complicated systems, often of disparate sorts. They’re important in modern engineering, and it takes some thought to keep them from being unmanageable. Biology and ecology are full of systems of systems.

David Spivak has been working a lot on operads as a tool for describing systems of systems. Here’s a nice programmatic talk advocating this approach:

• David Spivak, Operads as a potential foundation for
systems of systems
.

This was a talk he gave at the Generalized Network Structures and Dynamics Workshop at the Mathematical Biosciences Institute at Ohio State University this spring.

You won’t learn what operads are from this talk—for that, try this:

• Wikipedia, Operad.

But if you know a bit about operads, it may help give you an idea of their flexibility as a formalism for describing ways of sticking together components to form bigger systems!

I’ll probably talk about this kind of thing more pretty soon. So far I’ve been using category theory to study networked systems like electrical circuits, Markov processes and chemical reaction networks. The same ideas handle all these different kind of systems in a unified way. But I want to push toward biology. Here we need more sophisticated ideas. My philosophy is that while biology seems “messy” to physicists, living systems actually operate at higher levels of abstraction, which call for new mathematics.


Large Countable Ordinals (Part 3)

7 July, 2016

Last time we saw why it’s devilishly hard to give names to large countable ordinals.

An obvious strategy is to make up a function f from ordinals to ordinals that grows really fast, so that f(x) is a lot bigger than the ordinal x indexing it. This is indeed a good idea. But something funny tends to happen! Eventually x catches up with f(x). In other words, you eventually hit a solution of

x = f(x)

This is called a fixed point of f. At this point, there’s no way to use f(x) as a name for x unless you already have a name for x. So, your scheme fizzles out!

For example, we started by looking at powers of \omega, the smallest infinite ordinal. But eventually we ran into ordinals x that obey

x = \omega^x

There’s an obvious work-around: we make up a new name for ordinals x that obey

x = \omega^x

We call them epsilon numbers. In our usual nerdy way we start counting at zero, so we call the smallest solution of this equation \epsilon_0, and the next one \epsilon_1, and so on.

But eventually we run into ordinals x that are fixed points of the function \epsilon_x, meaning that

x = \epsilon_x

There’s an obvious work-around: we make up a new name for ordinals x that obey

x = \epsilon_x

But by now you can guess that this problem will keep happening, so we’d better get systematic about making up new names! We should let

\phi_0(\alpha) = \omega^\alpha

and let \phi_{n+1}(\alpha) be the \alphath fixed point of \phi_n.

Oswald Veblen, a mathematician at Princeton, came up with this idea around 1908, based on some thoughts of G. H. Hardy:

• Oswald Veblen, Continuous increasing functions of finite and transfinite ordinals, Trans. Amer. Math. Soc. 9 (1908), 280–292.

He figured out how to define \phi_\gamma(\alpha) even when the index \gamma is infinite.

Last time we saw how to name a lot of countable ordinals using this idea: in fact, all ordinals less than the ‘Feferman–Schütte ordinal’. This time I want go further, still using Veblen’s work.

First, however, I feel an urge to explain things a bit more precisely.

Veblen’s fixed point theorem

There are three kinds of ordinals. The first is a successor ordinal, which is one more than some other ordinal. So, we say \alpha is a successor ordinal if

\alpha = \beta + 1

for some \beta. The second is 0, which is not a successor ordinal. And the third is a limit ordinal, which is neither 0 nor a successor ordinal. The smallest example is

\omega = \{1, 2, 3, \dots \}

Every limit ordinal is the ‘limit’ of ordinals less than it. What does that mean, exactly? Remember, each ordinal is a set: the set of all smaller ordinals. We can define the limit of a set of ordinals to be the union of that set. Alternatively, it’s the smallest ordinal that’s greater than or equal to every ordinal in that set.

Now for Veblen’s key idea:

Veblen’s Fixed Point Theorem. Suppose a function f from ordinals to ordinals is:

strictly increasing: if x < y then f(x) < f(y)

and

continuous: if x is a limit ordinal, f(x) is the limit of the ordinals f(\alpha) where \alpha < x.

Then f must have a fixed point.

Why? For starters, we always have this fact:

x \le f(x)

After all, if this weren’t true, there’d be a smallest x with the property that f(x) < x, since every nonempty set of ordinals has a smallest element. But since f is strictly increasing,

f(f(x)) < f(x)

so f(x) would be an even smaller ordinal with this property. Contradiction!

Using this fact repeatedly, we get

0 \le f(0) \le f(f(0)) \le \cdots

Let \alpha be the limit of the ordinals

0, f(0), f(f(0)), \dots

Then by continuity, f(\alpha) is the limit of the sequence

f(0), f(f(0)), f(f(f(0))),\dots

So f(\alpha) equals \alpha. Voilà! A fixed point!

This construction gives the smallest fixed point of f. There are infinitely many more, since we can start not with 0 but with \alpha+1 and repeat the same argument, etc. Indeed if we try to list these fixed points, we find there is one for each ordinal.

So, we can make up a new function that lists these fixed points. Just to be cute, people call this the derivative of f, so that f'(\alpha) is the \alphath fixed point of f. Beware: while the derivative of a polynomial grows more slowly than the original polynomial, the derivative of a continuous increasing function f from ordinals to ordinals generally grows more quickly than f. It doesn’t really act like a derivative; people just call it that.

Veblen proved another nice theorem:

Theorem. If f is a continuous and strictly increasing function from ordinals to ordinals, so is f'.

So, we can take the derivative repeatedly! This is the key to the Veblen hierarchy.

If you want to read more about this, it helps to know that a function from ordinals to ordinals that’s continuous and strictly increasing is called normal. ‘Normal’ is an adjective that mathematicians use when they haven’t had enough coffee in the morning and aren’t feeling creative—it means a thousand different things. In this case, a better term would be ‘differentiable’.

Armed with that buzzword, you can try this:

• Wikipedia, Fixed-point lemma for normal functions.

Okay, enough theory. On to larger ordinals!

The Feferman–Schütte barrier

First let’s summarize how far we got last time, and why we got stuck. We inductively defined the \alphath ordinal of the \gammath kind by:

\phi_0(\alpha) = \omega^\alpha

and

\phi_{\gamma+1}(\alpha) = \phi'_\gamma(\alpha)

meaning that \phi_{\gamma+1}(\alpha) is the \alphath fixed point of \phi_\gamma.

This handles the cases where \gamma is zero or a successor ordinal. When \gamma is a limit ordinal we let \phi_{\gamma}(\alpha) be the \alphath ordinal that’s a fixed point of all the functions \phi_\beta for \beta < \gamma.

Last time I explained how these functions \phi_\gamma give a nice notation for ordinals less than the Feferman–Schütte ordinal, which is also called \Gamma_0. This ordinal is the smallest solution of

x = \phi_x(0)

So it’s a fixed point, but of a new kind, because now the x appears as a subscript of the \phi function.

We can get our hands on the Feferman–Schütte ordinal by taking the limit of the ordinals

\phi_0(0), \; \phi_{\phi_0(0)}(0) , \; \phi_{\phi_{\phi_0(0)}(0)}(0), \dots

(If you’re wondering why we use the number 0 here, instead of some other ordinal, I believe the answer is: it doesn’t really matter, we would get the same result if we used any ordinal less than the Feferman–Schütte ordinal.)

The ‘Feferman–Schütte barrier’ is the combination of these two facts:

• On the one hand, every ordinal \beta less than \Gamma_0 can be written as a finite sum of guys \phi_\gamma(\alpha) where \alpha and \gamma are even smaller than \beta. Using this fact repeatedly, we can get a finite expression for any ordinal less than the Feferman–Schütte ordinal in terms of the \phi function, addition, and the ordinal 0.

• On the other hand, if \alpha and \gamma are less than \Gamma_0 then \phi_\gamma(\alpha) is less than \Gamma_0. So we can’t use the \phi function to name the Feferman–Schütte ordinal in terms of smaller ordinals.

But now let’s break the Feferman–Schütte barrier and reach some bigger countable ordinas!

The Γ function

The function \phi_x(0) is strictly increasing and continuous as a function of x. So, using Veblen’s theorems, we can define \Gamma_\alpha to be the \alphath solution of

x = \phi_x(0)

We can then define a bunch of enormous countable ordinals:

\Gamma_0, \Gamma_1, \Gamma_2, \dots

and still bigger ones:

\Gamma_\omega, \; \Gamma_{\omega^2}, \; \Gamma_{\omega^3} , \dots

and even bigger ones:

\Gamma_{\omega^\omega}, \; \Gamma_{\omega^{\omega^\omega}}, \; \Gamma_{\omega^{\omega^{\omega^\omega}}}, \dots

and even bigger ones:

\Gamma_{\epsilon_0}, \Gamma_{\epsilon_1}, \Gamma_{\epsilon_2}, \dots

But since \epsilon_\alpha is just \phi_1(\alpha), we can reach much bigger countable ordinals with the help of the \phi function:

\Gamma_{\phi_2(0)}, \; \Gamma_{\phi_3(0)}, \; \Gamma_{\phi_4(0)}, \dots

and we can do vastly better using the \Gamma function itself:

\Gamma_{\Gamma_0}, \Gamma_{\Gamma_{\Gamma_0}}, \Gamma_{\Gamma_{\Gamma_{\Gamma_0}}} , \dots

The limit of all these is the smallest solution of

x = \Gamma_x

As usual, this ordinal is still countable, but there’s no way to express it in terms of the \Gamma function and smaller ordinals. So we are stuck again.

In short: we got past the Feferman–Schütte barrier by introducing a name for the \alphath solution of x = \phi_x(0). We called it \Gamma_\alpha. This made us happy for about two minutes…

…. but then we ran into another barrier of the same kind.

So what we really need is a more general notation: one that gets us over not just this particular bump in the road, but all bumps of this kind! We don’t want to keep randomly choosing goofy new letters like \Gamma. We need something systematic.

The multi-variable Veblen hierarchy

We were actually doing pretty well with the \phi function. It was nice and systematic. It just wasn’t powerful enough. But if you’re trying to keep track of how far you’re driving on a really long trip, you want an odometer with more digits. So, let’s try that.

In other words, let’s generalize the \phi function to allow more subscripts. Let’s rename \Gamma_\alpha and call it \phi_{1,0}(\alpha). The fact that we’re using two subscripts says that we’re going beyond the old \phi functions with just one subscript. The subscripts 1 and 0 should remind you of what happens when you drive more than 9 miles: if your odometer has two digits, it’ll say you’re on mile 10.

Now we proceed as before: we make up new functions, each of which enumerates the fixed points of the previous one:

\phi_{1,1} = \phi'_{1,0}
\phi_{1,2} = \phi'_{1,1}
\phi_{1,3} = \phi'_{1,2}

and so on. In general, we let

\phi_{1,\gamma+1} = \phi'_{1,\gamma}

and when \gamma is a limit ordinal, we let

\displaystyle{ \phi_{1,\gamma}(\alpha) = \lim_{\beta \to \gamma} \phi_{1,\beta}(\alpha) }

Are you confused?

How could you possibly be confused???

Okay, maybe an example will help. In the last section, our notation fizzled out when we took the limit of these ordinals:

\Gamma_{\Gamma_0}, \Gamma_{\Gamma_{\Gamma_0}}, \Gamma_{\Gamma_{\Gamma_{\Gamma_0}}} , \dots

The limit of these is the smallest solution of x = \Gamma_x. But now we’re writing \Gamma_x = \phi_{1,0}(x), so this limit is the smallest fixed point of \phi_{1,0}. So, it’s \phi_{1,1}(0).

We can now ride happily into the sunset, defining \phi_{1,\gamma}(\alpha) for all ordinals \alpha, \gamma. Of course, this will never give us a notation for ordinals with

x = \phi_{1,x}(0)

But we don’t let that stop us! This is where the new extra subscript really comes in handy. We now define \phi_{2,0}(\alpha) to be the \alphath solution of

x = \phi_{1,x}(0)

Then we drive on as before. We let

\phi_{2,\gamma+1} = \phi'_{2,\gamma}

and when \gamma is a limit ordinal, we say

\displaystyle{ \phi_{2,\gamma}(\alpha) = \lim_{\beta \to \gamma} \phi_{2,\beta}(\alpha) }

I hope you get the idea. Keep doing this!

We can inductively define \phi_{\beta,\gamma}(\alpha) for all \alpha, \beta and \gamma. Of course, these functions will never give a notation for solutions of

x = \phi_{x,0}(0)

To describe these, we need a function with one more subscript! So let \phi_{1,0,0}(\alpha) be the \alphath solution of

x = \phi_{x,0}(0)

We can then proceed on and on and on, adding extra subscripts as needed.

This is called the multi-variable Veblen hierarchy.

Examples

To help you understand the multi-variable Veblen hierarchy, I’ll use it to describe lots of ordinals. Some are old friends. Starting with finite ones, we have:

\phi_0(0) = 1

\phi_0(0) + \phi_0(0) = 2

and so on, so we don’t need separate names for natural numbers… but I’ll use them just to save space.

\phi_0(1) = \omega

\phi_0(2) = \omega^2

and so on, so we don’t need separate names for \omega and its powers, but I’ll use them just to save space.

\phi_0(\omega) = \omega^\omega

\phi_0(\omega^\omega) = \omega^{\omega^\omega}

\phi_1(0) = \epsilon_0

\phi_1(1) = \epsilon_1

\displaystyle{ \phi_1(\phi_1(0)) = \epsilon_{\epsilon_0} }

\phi_2(0) = \zeta_0

\phi_2(1) = \zeta_1

where I should remind you that \zeta_\alpha is a name for the \alphath solution of x = \epsilon_x.

\phi_{1,0}(0) = \Gamma_0

\phi_{1,0}(1) = \Gamma_1

\displaystyle{ \phi_{1,0}(\phi_{1,0}(0)) = \Gamma_{\Gamma_0} }

\phi_{1,1}(0) is the limit of \Gamma_{\Gamma_0}, \Gamma_{\Gamma_{\Gamma_0}}, \Gamma_{\Gamma_{\Gamma_{\Gamma_0}}} , \dots

\phi_{1,0,0}(0) is called the Ackermann ordinal.

Apparently Wilhelm Ackermann, the logician who invented a very fast-growing function called Ackermann’s function, had a system for naming ordinals that fizzled out at this ordinal.

The small Veblen ordinal

There are obviously lots more ordinals that can be described using the multi-variable Veblen hierarchy, but I don’t have anything interesting to say about them. And you’re probably more interested in this question: what’s next?

The limit of these ordinals

\phi_1(0), \; \phi_{1,0}(0), \; \phi_{1,0,0}(0), \dots

is called the small Veblen ordinal. Yet again, it’s a countable ordinal. It’s the smallest ordinal that cannot be named in terms of smaller ordinals using the multi-variable Veblen hierarchy…. at least, not the version I described. And here’s a nice fact:

Theorem. Every ordinal \beta less than the small Veblen ordinal can be written as a finite expression in terms of the multi-variable \phi function, addition, and 0.

For example,

\Gamma_0 + \epsilon_{\epsilon_0} + \omega^\omega + 2

is equal to

\displaystyle{  \phi_{\phi_0(0),0}(0) + \phi_{\phi_0(0)}(\phi_{\phi_0(0)}(0)) +  \phi_0(\phi_0(\phi_0(0))) + \phi_0(0) + \phi_0(0)  }

On the one hand, this notation is quite tiresome to read. On the other hand, it’s amazing that it gets us so far!

Furthermore, if you stare at expressions like the above one for a while, and think about them abstractly, they should start looking like trees. So you should find it easy to believe that ordinals less than the small Veblen ordinal correspond to trees, perhaps labelled in some way.

Indeed, this paper describes a correspondence of this sort:

• Herman Ruge Jervell, Finite trees as ordinals, in New Computational Paradigms, Lecture Notes in Computer Science 3526, Springer, Berlin, 2005, pp. 211–220.

However, I don’t think his idea is quite same as what you’d come up with by staring at expressions like

\displaystyle{  \phi_{\phi_0(0),0}(0) + \phi_{\phi_0(0)}(\phi_{\phi_0(0)}(0)) +  \phi_0(\phi_0(\phi_0(0))) + \phi_0(0) + \phi_0(0)  }

Beyond the small Veblen ordinal

We’re not quite done yet. The modifier ‘small’ in the term ‘small Veblen ordinal’ should make you suspect that there’s more in Veblen’s paper. And indeed there is!

Veblen actually extended his multi-variable function \phi_{\gamma_1, \dots, \gamma_n}(\alpha) to the case where there are infinitely many variables. He requires that all but finitely many of these variables equal zero, to keep things under control. Using this, one can set up a notation for even bigger countable ordinals! This notation works for all ordinals less than the large Veblen ordinal.

We don’t need to stop here. The large Veblen ordinal is just the first of a new series of even larger countable ordinals!

These can again be defined as fixed points. Yes: it’s déjà vu all over again. But around here, people usually switch to a new method for naming these fixed points, called ‘ordinal collapsing functions’. One interesting thing about this notation is that it makes use of uncountable ordinal. The first uncountable ordinal is called \Omega, and it dwarfs all those we’ve seen here.

We can use the ordinal collapsing function \psi to name many of our favorite countable ordinals, and more:

\psi(\Omega) is \zeta_0, the smallest solution of x = \epsilon_x.

\psi(\Omega^\Omega) is \Gamma_0, the Feferman–Schütte ordinal.

\psi(\Omega^{\Omega^2}) is the Ackermann ordinal.

\psi(\Omega^{\Omega^\omega}) is the small Veblen ordinal.

\psi(\Omega^{\Omega^\Omega}) is the large Veblen ordinal.

\psi(\epsilon_{\Omega+1}) is called the Bachmann–Howard ordinal. This is the limit of the ordinals

\psi(\Omega), \psi(\Omega^\Omega), \psi(\Omega^{\Omega^\Omega}), \dots

I won’t explain this now. Maybe later! But not tonight. As Bilbo Baggins said:

The Road goes ever on and on
Out from the door where it began.
Now far ahead the Road has gone,
Let others follow it who can!
Let them a journey new begin,
But I at last with weary feet
Will turn towards the lighted inn,
My evening-rest and sleep to meet.

For more

But perhaps you’re impatient and want to begin a new journey now!

The people who study notations for very large countable ordinals tend to work on proof theory, because these ordinals have nice applications to that branch of logic. For example, Peano arithmetic is powerful enough to work with ordinals up to but not including \epsilon_0, so we call \epsilon_0 the proof-theoretic ordinal of Peano arithmetic. Stronger axiom systems have bigger proof-theoretic ordinals.

Unfortunately this makes it a bit hard to learn about large countable ordinals without learning, or at least bumping into, a lot of proof theory. And this subject, while interesting in principle, is quite tough. So it’s hard to find a readable introduction to large countable ordinals.

The bibliography of the Wikipedia article on large countable ordinals gives this half-hearted recommendation:

Wolfram Pohlers, Proof theory, Springer 1989 ISBN 0-387-51842-8 (for Veblen hierarchy and some impredicative ordinals). This is probably the most readable book on large countable ordinals (which is not saying much).

Unfortunately, Pohlers does not seem to give a detailed account of ordinal collapsing functions. If you want to read something fun that goes further than my posts so far, try this:

• Hilbert Levitz, Transfinite ordinals and their notations: for the uninitiated.

(Anyone whose first name is Hilbert must be born to do logic!)

This is both systematic and clear:

• Wikipedia, Ordinal collapsing functions.

And if you want to explore countable ordinals using a computer program, try this:

• Paul Budnik, Ordinal calculator and research tool.

Among other things, this calculator can add, multiply and exponentiate ordinals described using the multi-variable Veblen hierarchy—even the version with infinitely many variables!


Large Countable Ordinals (Part 2)

4 July, 2016

Last time I took you on a road trip to infinity. We zipped past a bunch of countable ordinals

\omega , \; \omega^\omega,\; \omega^{\omega^\omega}, \;\omega^{\omega^{\omega^\omega}}, \dots

and stopped for gas at the first one after all these. It’s called \epsilon_0. Heuristically, you can imagine it like this:

\epsilon_0 = \omega^{\omega^{\omega^{\omega^{\cdot^{\cdot^{\cdot}}}}}}

More rigorously, it’s the smallest ordinal x obeying the equation

x = \omega^x

Beyond εo

But I’m sure you have a question. What comes after \epsilon_0?

Well, duh! It’s

\epsilon_0 + 1

Then comes

\epsilon_0 + 2

and then eventually we get to

\epsilon_0 + \omega

and then

\epsilon_0 + \omega^2 ,\dots, \epsilon_0 + \omega^3,\dots \epsilon_0 + \omega^4,\dots

and after a long time

\epsilon_0 + \epsilon_0 = \epsilon_0 2

and then eventually

\epsilon_0^2

and then eventually….

Oh, I see! You wanted to know the first really interesting ordinal after \epsilon_0.

Well, this is a matter of taste, but you might be interested in \epsilon_1. This is the first ordinal after \epsilon_0 that satisfies this equation:

x = \omega^x

How do we actually reach this ordinal? Well, just as \epsilon_0 was the limit of this sequence:

\omega , \; \omega^\omega,\; \omega^{\omega^\omega}, \;\omega^{\omega^{\omega^\omega}}, \dots

\epsilon_1 is the limit of this:

\epsilon_0 + 1, \; \omega^{\epsilon_0 + 1}, \;  \omega^{\omega^{\epsilon_0 + 1}}, \; \omega^{\omega^{\omega^{\epsilon_0 + 1}}},\dots

You may wonder what I mean by the ‘limit’ of an increasing sequence of ordinals. I just mean the smallest ordinal greater than or equal to every ordinal in that sequence. Such a thing is guaranteed to exist, since if we treat ordinals as well-ordered sets, we can just take the union of all the sets in that sequence.

Here’s a picture of \epsilon_1, taken from David Madore’s interactive webpage:

In what sense is \epsilon_1 the first "really interesting" ordinal after \epsilon_0?

For one thing, it’s first that can’t be built out of 1, \omega and \epsilon_0 using finitely many additions, multiplications and exponentiations. In other words, if we use Cantor normal form to describe ordinals (as explained last time), and allow expressions involving \epsilon_0 as well as 1 and \omega, we get a notation for all ordinals up to \epsilon_1.

What’s the next really interesting ordinal after \epsilon_1? As you might expect, it’s called \epsilon_2. This is the next solution of

x = \omega^{x}

and it’s defined to be the limit of this sequence:

\epsilon_1 + 1, \; \omega^{\epsilon_1 + 1}, \;\omega^{\omega^{\epsilon_1 + 1}}, \; \omega^{\omega^{\omega^{\epsilon_1 + 1}}},\dots

Maybe now you get the pattern. In general, \epsilon_{\alpha} is the
\alphath solution of x = \omega^{x}. We can define this, if we’re smart, for any ordinal \alpha.

So, we can keep driving on through fields of ever larger ordinals:

\epsilon_2,\dots, \epsilon_{3},\dots, \epsilon_{4}, \dots

and eventually

\epsilon_{\omega}

which is the first ordinal bigger than \epsilon_0, \epsilon_1, \epsilon_2, \dots

Let’s stop and take a look!

Nice! Okay, back in the car…

\epsilon_{\omega+1},\dots, \epsilon_{\omega+2},\dots, \epsilon_{\omega+1},\dots

and then

\epsilon_{\omega^2},\dots , \epsilon_{\omega^3},\dots, \epsilon_{\omega^4},\dots

and then

\epsilon_{\omega^{\omega}},\dots, \epsilon_{\omega^{\omega^{\omega}}},\dots

As you can see, this gets boring after a while: it’s suspiciously similar to the beginning of our trip through the ordinals. The same ordinals are now showing up as subscripts in this epsilon notation. But we’re moving much faster now, since I’m skipping over much bigger gaps, not bothering to mention all sorts of ordinals like

\epsilon_{\omega^{\omega}} + \epsilon_{\omega 248} + \omega^{\omega^{\omega + 17}} + 1

Anyway… while we’re zipping along, I might as well finish telling you the story I started last time. My friend David Sternlieb and I were driving across South Dakota on Route 80. We kept seeing signs for the South Dakota Tractor Museum. When we finally got there, we were driving pretty darn fast, out of boredom—about 85 miles an hour. And guess what happened then!

Oh — wait a minute—this one is sort of interesting:

\displaystyle{ \epsilon_{\epsilon_0} }

Then come some more like that:

\epsilon_{\epsilon_1},\dots, \epsilon_{\epsilon_2},\dots \epsilon_{\epsilon_3},\dots

until we reach this:

\epsilon_{\epsilon_{\omega}}

and then

\epsilon_{\epsilon_{\omega^{\omega}}},\dots, \epsilon_{\epsilon_{\omega^{\omega^{\omega}}}},\dots

As we keep speeding up, we see:

\epsilon_{\epsilon_{\epsilon_0}},\dots \epsilon_{\epsilon_{\epsilon_{\epsilon_0}}},\dots \epsilon_{\epsilon_{\epsilon_{\epsilon_{\epsilon_0}}}},\dots

So, anyway: by the time we got that tractor museum, we were driving really fast. And, all we saw as we whizzed by was a bunch of rusty tractors out in a field! It was over in a split second! It was a real anticlimax — just like this anecdote, in fact.

But that’s just the way it is when you’re driving through these ordinals! Every ordinal, no matter how large, looks pretty pathetic and small compared to the ones ahead — so you keep speeding up, looking for something ‘really new and different’. But when you find one, it turns out to be part of a larger pattern, and soon that gets boring too.

For example, when we reach the limit of this sequence:

\epsilon_0, \epsilon_{\epsilon_0}, \epsilon_{\epsilon_{\epsilon_0}}, \epsilon_{\epsilon_{\epsilon_{\epsilon_0}}}, \epsilon_{\epsilon_{\epsilon_{\epsilon_{\epsilon_0}}}},\dots

our notation fizzles out again, since this is the first solution of

x = \epsilon_{x}

We could make up a new name for this ordinal, like \zeta_0. I don’t think this name is very common, though I’ve seen it. We could call it the Tractor Museum of Countable Ordinals.

Now we can play the whole game again, defining the zeta number \zeta_{\alpha} to be the \alphath solution of

x = \epsilon_x

sort of like how we defined the epsilons. This kind of equation, where something equals some function of itself, is called a fixed point equation.

But since we’ll have to play this game infinitely often, we might as well be more systematic about it!

The Veblen hierarchy

As you can see, we keep running into new, qualitatively different types of ordinals. First we ran into the powers of omega. Then we ran into the epsilons, and then the zetas. It’s gonna keep happening! For each type of ordinal, our notation fizzles out when we reach the first ‘fixed point’— when the xth ordinal of this type is actually equal to x.

So, instead of making up infinitely many Greek letters for different types of ordinals let’s index them… by ordinals! For each ordinal \gamma we’ll have a type of ordinal. We’ll let \phi_\gamma(\alpha) be the \alphath ordinal of type \gamma.

We can use the fixed point equation to define \phi_{\gamma+1} in terms of \phi_{\gamma}. In other words, we start off by defining

\phi_0(\alpha) = \omega^{\alpha}

and then define

\phi_{\gamma+1}(\alpha)

to be the \alphath solution of

x = \phi_{\gamma}(x)

where we start counting at \alpha = 0, so the first solution is called the ‘zeroth’.

We can even make sense of \phi_\gamma(\alpha) when \gamma itself is infinite! Suppose \gamma is a limit of smaller ordinals. Then we define \phi_\gamma(x) to be the limit of \phi_\beta(x) as \beta approaches \gamma. I’ll make this more precise next time.

We get infinitely many different types of ordinals, called the Veblen hierarchy. So, concretely, the Veblen hierarchy starts with the powers of \omega:

\phi_0(\alpha) = \omega^\alpha

and then it goes on to the ‘epsilons’:

\phi_1(\alpha) = \epsilon_\alpha

and then it goes on to what I called the ‘zetas’:

\phi_2(\alpha) = \zeta_\alpha

But that’s just the start!

The Feferman–Schütte ordinal

Boosting the subscript \gamma in \phi_\gamma(\alpha) increases the result much more than boosting \alpha, so let’s focus on that and just let \alpha = 0. The Veblen hierarchy contains ordinals like this:

\phi_{\omega}(0), \; \phi_{\omega+1}(0), \; \phi_{\omega+2}(0), \dots

and then ordinals like this:

\phi_{\omega^2}(0), \; \phi_{\omega^3}(0), \; \phi_{\omega^4}(0), \dots

and then ordinals like this:

\phi_{\omega^\omega}(0), \; \phi_{\omega^{\omega^\omega}}(0), \; \phi_{\omega^{\omega^{\omega^{\omega}}}}(0), \dots

and then this:

\phi_{\epsilon_0}(0), \phi_{\epsilon_{\epsilon_0}}(0), \phi_{\epsilon_{\epsilon_{\epsilon_0}}}(0),  \dots

where of course I’m skipping huge infinite stretches of ‘boring’ ones. But note that

\phi_{\omega}(0) = \phi_{\phi_0(0)}(0)

and

\phi_{\epsilon_0}(0) = \phi_{\phi_1(0)}(0)

and

\phi_{\zeta_0}(0) = \phi_{\phi_2(0)}(0)

In short, we can plug the phi function into itself—and we get the biggest effect if we plug it into the subscript!

So, if we’re in a rush to reach some really big countable ordinals, we can try these:

\phi_0(0), \; \phi_{\phi_0(0)}(0) , \; \phi_{\phi_{\phi_0(0)}(0)}(0), \dots

But the limit of these is an ordinal x that has

x = \phi_x(0)

This is called the Feferman–Schütte ordinal and denoted \Gamma_0.

In fact, the Feferman–Schütte ordinal is the smallest solution of

x = \phi_x(0)

Since this equation is self-referential, we can’t describe Feferman–Schütte ordinal using the Veblen hierarchy—at least, not without using the Feferman–Schütte ordinal!

Indeed, some mathematicians have made a big deal about this ordinal, claiming it’s

the smallest ordinal that cannot be described without self-reference.

This takes some explaining, and it’s somewhat controversial. After all, there’s a sense in which every fixed point equation is self-referential. But there’s a certain precise sense in which the Feferman–Schütte ordinal is different from previous ones.

Anyway, you have admit that this is a very cute description of the Fefferman–Schuette ordinal: “the smallest ordinal that cannot be described without self-reference.” Does it use self-reference? It had better—otherwise we have a contradiction!

It’s a little scary, like this picture:

More importantly for us, the Veblen hierarchy fizzles out when we hit the Feferman–Schuette ordinal. Let me say what I mean by that.

Veblen normal form

The Veblen hierarchy gives a notation for ordinals called the Veblen normal form. You can think of this as a high-powered version of Cantor normal form, which we discussed last time.

Veblen normal form relies on this result:

Theorem. Any ordinal \beta can be written uniquely as

\beta = \phi_{\gamma_1}(\alpha_1) + \dots + \phi_{\gamma_{k}}(\alpha_k)

where k is a natural number, each term is less than or equal to the previous one, and \alpha_i < \phi_{\gamma_i}(\alpha_i) for all i.

Note that we can also use this theorem to write out the ordinals \beta_i and \gamma_i, and so on, recursively. So, it gives us a notation for ordinals.

However, this notation is only useful when all the ordinals \alpha_i, \gamma_i are less than the ordinal \beta that we’re trying to describe. Otherwise we need to already have a notation for \beta to express \alpha in Veblen normal form!

So, the power of this notation eventually fizzles out. And the place where it does is Feferman–Schütte ordinal. Every ordinal less than this can be expressed in terms of 0, addition, and the function \phi, using just finitely many symbols!

The moral

As I hope you see, the power of the human mind to see a pattern and formalize it gives the quest for large countable ordinals a strange quality. As soon as we see a systematic way to generate a sequence of larger and larger ordinals, we know this sequence has a limit that’s larger then all of those! And this opens the door to even larger ones….

So, this whole journey feels a bit like trying to outrace our car’s own shadow as we drive away from the sunset: the faster we drive, the faster it shoots ahead of us. We’ll never win.

On the other hand, we’ll only lose if we get tired.

So it’s interesting to hear what happens next. We don’t have to give up. The usual symbol for the Feferman–Schütte ordinal should be a clue. It’s called \Gamma_0. And that’s because it’s just the start of a new series of even bigger countable ordinals!

I’m dying to tell you about those. But this is enough for today.


Large Countable Ordinals (Part 1)

29 June, 2016

I love the infinite.

It may not exist in the physical world, but we can set up rules to think about it in consistent ways, and then it’s a helpful concept. The reason is that infinity is often easier to think about than very large finite numbers.

Finding rules to work with the infinite is one of the great triumphs of mathematics. Cantor’s realization that there are different sizes of infinity is truly wondrous—and by now, it’s part of the everyday bread and butter of mathematics.

Trying to create a notation for these different infinities is very challenging. It’s not a fair challenge, because there are more infinities than expressions we can write down in any given alphabet! But if we seek a notation for countable ordinals, the challenge becomes more fair.

It’s still incredibly frustrating. No matter what notation we use it fizzles out too soon… making us wish we’d invented a more general notation. But this process of ‘fizzling out’ is fascinating to me. There’s something profound about it. So, I would like to tell you about this.

Today I’ll start with a warmup. Cantor invented a notation for ordinals that works great for ordinals less than a certain ordinal called ε0. Next time I’ll go further, and bring in the ‘single-variable Veblen hierarchy’! This lets us describe all ordinals below a big guy called the ‘Feferman–Schütte ordinal’.

In the post after that I’ll bring in the ‘multi-variable Veblen hierarchy’, which gets us all the ordinals below the ‘small Veblen ordinal’. We’ll even touch on the ‘large Veblen ordinal’, which requires a version of the Veblen hierarchy with infinitely many variables. But all this is really just the beginning of a longer story. That’s how infinity works: the story never ends!

To describe countable ordinals beyond the large Veblen ordinal, most people switch to an entirely different set of ideas, called ‘ordinal collapsing functions’. I may tell you about those someday. Not soon, but someday. My interest in the infinite doesn’t seem to be waning. It’s a decadent hobby, but hey: some middle-aged men buy fancy red sports cars and drive them really fast. Studying notions of infinity is cooler, and it’s environmentally friendly.

I can even imagine writing a book about the infinite. Maybe these posts will become part of that book. But one step at a time…

Cardinals versus ordinals

Cantor invented two different kinds of infinities: cardinals and ordinals. Cardinals say how big sets are. Two sets can be put into 1-1 correspondence iff they have the same number of elements—where this kind of ‘number’ is a cardinal. You may have heard about cardinals like aleph-nought (the number of integers), 2 to power aleph-nought (the number of real numbers), and so on. You may have even heard rumors of much bigger cardinals, like ‘inaccessible cardinals’ or ‘super-huge cardinals’. All this is tremendously fun, and I recommend starting here:

• Frank R. Drake, Set Theory, an Introduction to Large Cardinals, North-Holland, 1974.

There are other books that go much further, but as a beginner, I found this to be the most fun.

But I don’t want to talk about cardinals! I want to talk about ordinals.

Ordinals say how big ‘well-ordered’ sets are. A set is well-ordered if it comes with a relation ≤ obeying the usual rules:

Transitivity: if x ≤ y and y ≤ z then x ≤ z

Reflexivity: x ≤ x

Antisymmetry: if x ≤ y and y ≤ x then x = y

and one more rule: every nonempty subset has a smallest element!

For example, the empty set

\{\}

is well-ordered in a trivial sort of way, and the corresponding ordinal is called

0

Similarly, any set with just one element, like this:

\{0\}

is well-ordered in a trivial sort of way, and the corresponding ordinal is called

1

Similarly, any set with two elements, like this:

\{0,1\}

becomes well-ordered as soon as we decree which element is bigger; the obvious choice is to say 0 < 1. The corresponding ordinal is called

2

Similarly, any set with three elements, like this:

\{0,1,2\}

becomes well-ordered as soon as we linearly order it; the obvious choice here is to say 0 < 1 < 2. The corresponding ordinal is called

3

Perhaps you’re getting the pattern — you’ve probably seen these particular ordinals before, maybe sometime in grade school. They’re called finite ordinals, or "natural numbers".

But there’s a cute trick they probably didn’t teach you then: we can define each ordinal to be the set of all ordinals less than it:

0 = \{\} (since no ordinal is less than 0)
1 = \{0\} (since only 0 is less than 1)
2 = \{0,1\} (since 0 and 1 are less than 2)
3 = \{0,1,2\} (since 0, 1 and 2 are less than 3)

and so on. It’s nice because now each ordinal is a well-ordered set of the size that ordinal stands for. And, we can define one ordinal to be "less than or equal" to another precisely when its a subset of the other.

Infinite ordinals

What comes after all the finite ordinals? Well, the set of all finite ordinals is itself well-ordered:

\{0,1,2,3,\dots \}

So, there’s an ordinal corresponding to this — and it’s the first infinite ordinal. It’s usually called \omega, pronounced ‘omega’. Using the cute trick I mentioned, we can actually define

\omega = \{0,1,2,3,\dots\}

What comes after this? Well, it turns out there’s a well-ordered set

\{0,1,2,3,\dots,\omega\}

containing the finite ordinals together with \omega, with the obvious notion of "less than": \omega is bigger than the rest. Corresponding to this set there’s an ordinal called

\omega+1

As usual, we can simply define

\omega+1 = \{0,1,2,3,\dots,\omega\}

At this point you could be confused if you know about cardinals, so let me throw in a word of reassurance. The sets \omega and \omega+1 have the same cardinality: they are both countable. In other words, you can find a 1-1 and onto function between these sets. But \omega and \omega+1 are different as ordinals, since you can’t find a 1-1 and onto function between them that preserves the ordering. This is easy to see, since \omega+1 has a biggest element while \omega does not.

Indeed, all the ordinals in this series of posts will be countable! So for the infinite ones, you can imagine that all I’m doing is taking your favorite countable set and well-ordering it in ever more sneaky ways.

Okay, so we got to \omega + 1. What comes next? Well, not surprisingly, it’s

\omega+2 = \{0,1,2,3,\dots,\omega,\omega+1\}

Then comes

\omega+3, \omega+4, \omega+5,\dots

and so on. You get the idea.

I haven’t really defined ordinal addition in general. I’m trying to keep things fun, not like a textbook. But you can read about it here:

• Wikipedia, Ordinal arithmetic: addition.

The main surprise is that ordinal addition is not commutative. We’ve seen that \omega + 1 \ne \omega, since

\omega + 1 = \{1, 2, 3, \dots, \omega \}

is an infinite list of things… and then one more thing that comes after all those!. But 1 + \omega = \omega, because one thing followed by a list of infinitely many more is just a list of infinitely many things.

With ordinals, it’s not just about quantity: the order matters!

ω+ω and beyond

Okay, so we’ve seen these ordinals:

1, 2, 3, \dots, \omega, \omega + 1, \omega + 2, \omega+3, \dots

What next?

Well, the ordinal after all these is called \omega+\omega. People often call it "omega times 2" or \omega 2 for short. So,

\omega 2 = \{0,1,2,3,\dots,\omega,\omega+1,\omega+2,\omega+3,\dots.\}

It would be fun to have a book with \omega pages, each page half as thick as the previous page. You can tell a nice long story with an \omega-sized book. I think you can imagine this. And if you put one such book next to another, that’s a nice picture of \omega 2.

It’s worth noting that \omega 2 is not the same as 2 \omega. We have

\omega 2 = \omega + \omega

while

2 \omega = 2 + 2 + 2 + \cdots

where we add \omega of these terms. But

2 + 2 + 2 + \cdots = (1 + 1) + (1 + 1) + (1 + 1) \dots = \omega

so

2 \omega = \omega

This is not a proof, because I haven’t given you the official definition of how to multiply ordinals. You can find it here:

• Wikipedia, Ordinal arithmetic: multiplication.

Using this you can prove that what I’m saying is true. Nonetheless, I hope you see why what I’m saying might make sense. Like ordinal addition, ordinal multiplication is not commutative! If you don’t like this, you should study cardinals instead.

What next? Well, then comes

\omega 2 + 1, \omega 2 + 2,\dots

and so on. But you probably have the hang of this already, so we can skip right ahead to \omega 3.

In fact, you’re probably ready to skip right ahead to \omega 4, and \omega 5, and so on.

In fact, I bet now you’re ready to skip all the way to "omega times omega", or \omega^2 for short:

\omega^2 = \{0,1,2\dots\omega,\omega+1,\omega+2,\dots ,\omega 2,\omega 2+1,\omega 2+2,\dots\}

Suppose you had an encyclopedia with \omega volumes, each one being a book with \omega pages. If each book is twice as thin as one before, you’ll have \omega^2 pages — and it can still fit in one bookshelf! Here’s the idea:

What comes next? Well, we have

\omega^2+1, \omega^2+2, \dots

and so on, and after all these come

\omega^2+\omega, \omega^2+\omega+1, \omega^2+\omega+2, \dots

and so on — and eventually

\omega^2 + \omega^2 = \omega^2 2

and then a bunch more, and then

\omega^2 3

and then a bunch more, and then

\omega^2 4

and then a bunch more, and more, and eventually

\omega^2 \omega = \omega^3

You can probably imagine a bookcase containing \omega encyclopedias, each with \omega volumes, each with \omega pages, for a total of \omega^3 pages. That’s \omega^3.

ωω

I’ve been skipping more and more steps to keep you from getting bored. I know you have plenty to do and can’t spend an infinite amount of time reading this, even if the subject is infinity.

So if you don’t mind me just mentioning some of the high points, there are guys like \omega^4 and \omega^5 and so on, and after all these comes

\omega^\omega

Let’s try to we imagine this! First, imagine a book with \omega pages. Then imagine an encyclopedia of books like this, with \omega volumes. Then imagine a bookcase containing \omega encyclopedias like this. Then imagine a room containing \omega bookcases like this. Then imagine a floor with library with \omega rooms like this. Then imagine a library with \omega floors like this. Then imagine a city with \omega libraries like this. And so on, ad infinitum.

You have to be a bit careful here, or you’ll be imagining an uncountable number of pages. To name a particular page in this universe, you have to say something like this:

the 23rd page of the 107th book of the 20th encyclopedia in the 7th bookcase in 0th room on the 1000th floor of the 973rd library in the 6th city on the 0th continent on the 0th planet in the 0th solar system in the…

But it’s crucial that after some finite point you keep saying “the 0th”. Without that restriction, there would be uncountably many pages! This is just one of the rules for how ordinal exponentiation works. For the details, read:

• Wikipedia, Ordinal arithmetic: exponentiation.

As they say,

But for infinite exponents, the definition may not be obvious.

Here’s a picture of \omega^\omega, taken from David Madore’s wonderful interactive webpage:

On his page, if you click on any of the labels for an initial portion of an ordinal, like \omega, \omega^2, \omega^3 or \omega^4 here, the picture will expand to show that portion!

And here’s another picture, where each turn of the clock’s hand takes you to a higher power of \omega:

Ordinals up to ε0

Okay, so we’ve reached \omega^\omega. Now what?

Well, then comes \omega^\omega + 1, and so on, but I’m sure that’s boring by now. And then come ordinals like

\omega^\omega 2,\dots, \omega^\omega 3, \dots, \omega^\omega 4, \dots

leading up to

\omega^\omega \omega = \omega^{\omega + 1}

Then eventually come ordinals like

\omega^\omega \omega^2 , \dots, \omega^\omega \omega^3, \dots, \omega^\omega \omega^4, \dots

and so on, leading up to

\omega^\omega \omega^\omega = \omega^{\omega + \omega} = \omega^{\omega 2}

This actually reminds me of something that happened driving across South Dakota one summer with a friend of mine. We were in college, so we had the summer off, so we drive across the country. We drove across South Dakota all the way from the eastern border to the west on Interstate 90.

This state is huge — about 600 kilometers across, and most of it is really flat, so the drive was really boring. We kept seeing signs for a bunch of tourist attractions on the western edge of the state, like the Badlands and Mt. Rushmore — a mountain that they carved to look like faces of presidents, just to give people some reason to keep driving.

Anyway, I’ll tell you the rest of the story later — I see some more ordinals coming up:

\omega^{\omega 3},\dots \omega^{\omega 4},\dots \omega^{\omega 5},\dots

We’re really whizzing along now just to keep from getting bored — just like my friend and I did in South Dakota. You might fondly imagine that we had fun trading stories and jokes, like they do in road movies. But we were driving all the way from Princeton to my friend Chip’s cabin in California. By the time we got to South Dakota, we were all out of stories and jokes.

Hey, look! It’s

\omega^{\omega \omega}= \omega^{\omega^2}

That was cool. Then comes

\omega^{\omega^3}, \dots \omega^{\omega^4}, \dots \omega^{\omega^5}, \dots

and so on.

Anyway, back to my story. For the first half of our half of our trip across the state, we kept seeing signs for something called the South Dakota Tractor Museum.

Oh, wait, here’s an interesting ordinal:

\omega^{\omega^\omega}

Let’s stop and take look:

That was cool. Okay, let’s keep driving. Here comes

\omega^{\omega^\omega} + 1, \omega^{\omega^\omega} + 2, \dots

and then

\omega^{\omega^\omega} + \omega, \dots, \omega^{\omega^\omega} + \omega 2, \dots, \omega^{\omega^\omega} + \omega 3, \dots

and then

\omega^{\omega^\omega} + \omega^2, \dots, \omega^{\omega^\omega} + \omega^3, \dots

and eventually

\omega^{\omega^\omega} + \omega^\omega

and eventually

\omega^{\omega^\omega} + \omega^{\omega^\omega} = \omega^{\omega^\omega} 2

and then

\omega^{\omega^\omega} 3, \dots, \omega^{\omega^\omega} 4, \dots, \omega^{\omega^\omega} 5, \dots

and eventually

\omega^{\omega^\omega} \omega = \omega^{\omega^\omega + 1}

After a while we reach

\omega^{\omega^\omega + 2}, \dots, {\omega^\omega + 3}, \dots {\omega^\omega + 4}, \dots

and then

\omega^{\omega^\omega + \omega}, \dots, \omega^{\omega^\omega + \omega 2},  \dots, \omega^{\omega^\omega + \omega 3}, \dots

and then

\omega^{\omega^\omega + \omega^2}, \dots,  \omega^{\omega^\omega + \omega^3}, \dots,  \omega^{\omega^\omega + \omega^4}, \dots

and then

\omega^{\omega^\omega + \omega^\omega} = \omega^{\omega^\omega 2}

and then

\omega^{\omega^\omega 3}, \dots,  \omega^{\omega^\omega 4} , \dots

and then

\omega^{\omega^\omega \omega} = \omega^{\omega^{\omega + 1}}

and eventually

\omega^{\omega^{\omega + 2}}, \dots, \omega^{\omega^{\omega + 3}}, \dots, \omega^{\omega^{\omega + 4}}, \dots

This is pretty boring; we’re already going infinitely fast, but we’re still just picking up speed, and it’ll take a while before we reach something interesting.

Anyway, we started getting really curious about this South Dakota Tractor Museum — it sounded sort of funny. It took 250 kilometers of driving before we passed it. We wouldn’t normally care about a tractor museum, but there was really nothing else to think about while we were driving. The only thing to see were fields of grain, and these signs, which kept building up the suspense, saying things like

ONLY 100 MILES TO THE SOUTH DAKOTA TRACTOR MUSEUM!

We’re zipping along really fast now:

\omega^{\omega^{\omega^\omega}}, \dots, \omega^{\omega^{\omega^{\omega^\omega}}},\dots , \omega^{\omega^{\omega^{\omega^{\omega^{\omega}}}}},\dots

What comes after all these?

At this point we need to stop for gas. Our notation for ordinals just ran out!

The ordinals don’t stop; it’s just our notation that fizzled out. The set of all ordinals listed up to now — including all the ones we zipped past — is a well-ordered set called

\epsilon_0

or "epsilon-nought". This has the amazing property that

\epsilon_0 = \omega^{\epsilon_0}

And it’s the smallest ordinal with this property! It looks like this:

It’s an amazing fact that every countable ordinal is isomorphic, as an well-ordered set, to some subset of the real line. David Madore took advantage of this to make his pictures.

Cantor normal form

I’ll tell you the rest of my road story later. For now let me conclude with a bit of math.

There’s a nice notation for all ordinals less than \epsilon_0, called ‘Cantor normal form’. We’ve been seeing lots of examples. Here is a typical ordinal in Cantor normal form:

\omega^{\omega^{\omega^{\omega+\omega+1}}} \; + \; \omega^{\omega^\omega+\omega^\omega} \; + \; \omega^\omega \;+\; \omega + \omega + 1 + 1 + 1

The idea is that you write it out using just + and exponentials and 1 and \omega.

Here is the theorem that justifies Cantor normal form:

Theorem. Every ordinal \alpha can be uniquely written as

\alpha = \omega^{\beta_1} c_1 + \omega^{\beta_2}c_2 + \cdots + \omega^{\beta_k}c_k

where k is a natural number, c_1, c_2, \ldots, c_k are positive integers, and \beta_1 > \beta_2 > \cdots > \beta_k \geq 0 are ordinals.

It’s like writing ordinals in base \omega.

Note that every ordinal can be written this way! So why did I say that Cantor normal form is nice notation for ordinals less than \epsilon_0? Here’s the problem: the Cantor normal form of \epsilon_0 is

\epsilon_0 = \omega^{\epsilon_0}

So, when we hit \epsilon_0, the exponents \beta_1 ,\beta_2, \dots, \beta_k can be as big as the ordinal \alpha we’re trying to describe! So, while the Cantor normal form still exists for ordinals \geq \epsilon_0, it doesn’t give a good notation for them unless we already have some notation for ordinals this big!

This is what I mean by a notation ‘fizzling out’. We’ll keep seeing this problem in the posts to come.

But for an ordinal \alpha less than \epsilon_0, something nice happens. In this case, when we write

\alpha = \omega^{\beta_1} c_1 + \omega^{\beta_2}c_2 + \cdots + \omega^{\beta_k}c_k

all the exponents \beta_1, \beta_2, \dots, \beta_k are less than \alpha. So we can go ahead and write them in Cantor normal form, and so on… and because ordinals are well-ordered, this process ends after finitely many steps.

So, Cantor normal form gives a nice way to write any ordinal less than \epsilon_0 using finitely many symbols! If we abbreviate \omega^0 as 1, and write multiplication by positive integers in terms of addition, we get expressions like this:

\omega^{\omega^{\omega^{\omega^{1 + 1} +\omega+1}}} \; + \; \omega^{\omega^\omega+\omega^\omega} \; + \; \omega^{\omega+1+1} \;+\; \omega + 1

They look like trees. Even better, you can write a computer program that does ordinal arithmetic for ordinals of this form: you can add, multiply, and exponentiate them, and tell when one is less than another.

So, there’s really no reason to be scared of \epsilon_0. Remember, each ordinal is just the set of all smaller ordinals. So you can think of \epsilon_0 as the set of tree-shaped expressions like the one above, with a particular rule for saying when one is less than another. It’s a perfectly reasonable entity. For some real excitement, we’ll need to move on to larger ordinals. We’ll do that next time.

For more, see:

• Wikipedia, Cantor normal form.


Exponential Zero

25 June, 2016

guest post by David A. Tanzer

Here is a mathematical riddle.  Consider the function below, which is undefined for negative values, sends zero to one, and sends positive values to zero.   Can you come up with a nice compact formula for this function, which uses only the basic arithmetic operations, such as addition, division and powers?  You can’t use any special functions, including things like sign and step functions, which are by definition discontinuous.

In college, I ran around showing people the graph, asking them to guess the formula.  I even tried it out on some professors there, U. Penn.  My algebra prof, who was kind of intimidating, looked at it, got puzzled, and then got irritated. When I showed him the answer, he barked out: Is this exam over??!  Then I tried it out during office hours on E. Calabi, who was teaching undergraduate differential geometry.  With a twinkle in his eye, he said, why that’s zero to the x!

The graph of 0x is not without controversy.   It is reasonable that for positive x, we have that 0x is zero.  Then 0-x = 1/0x = 1/0, so the function is undefined for negative values.  But what about 00?  This question is bound to come up in the course of one’s general mathematical education, and has been the source of long, ruminative arguments.

There are three contenders for 00:  undefined, 0, and 1.  Let’s try to define it in a way that is most consistent with the general laws of exponents — in particular, that for all a, x and y, ax+y = ax ay, and a-x = 1/ax. Let’s stick to these rules, even when a, x and y are all zero.

Then 00 equals its own square, because 00 = 00 + 0 = 00 00. And it equals its reciprocal, because 00 = 0-0 = 1/00. By these criteria, 00 equals 1.

That is the justification for the above graph — and for the striking discontinuity that it contains.

Here is an intuition for the discontinuity. Consider the family of exponential curves bx, with b as the parameter.  When b = 1, you get the constant function 1.  When b is more than 1, you get an increasing exponential, and when it is between 0 and 1, you get a decreasing exponential.  The intersection of all of these graphs is the “pivot” point x = 0, y = 1.  That is the “dot” of discontinuity.

What happens to bx, as b decreases to zero?  To the right of the origin, the curve progressively flattens down to zero.  To the left it rises up towards infinity more and more steeply.  But it always crosses through the point x = 0, y = 1, which remains in the limiting curve.  In heuristic terms, the value y = 1 is the discontinuous transit from infinitesimal values to infinite values.

There are reasons, however, why 00 could be treated as indeterminate, and left undefined.  These were indicated by the good professor.

Dr. Calabi had a truly inspiring teaching style, back in the day. He spoke of Italian paintings, and showed a kind of geometric laser vision.  In the classroom, he showed us the idea of torsion using his arms to fly around the room like an airplane.  There’s even a manifold named after him, the Calabi-Yau manifold.

He went on to talk about the underpinnings of this quirky function.  First he drew attention to the function f(x,y) = xy, over the complex domain, and attempted to sketch its level sets.  He focused on the behavior of the function when x and y are close to zero.   Then he stated that every one of the level sets L(z) = \{(x,y)|x^y = z\} comes arbitrarily close to (0,0).

This means that xy has a wild singularity at the origin: every complex number z is the limit of xy along some path to zero.  Indeed, to reach z, just take a path in L(z) that approaches (0,0).

To see why the level sets all approach the origin, take logs, to get ln(xy) = y ln(x) = ln(z).  That gives y = ln(z) / ln(x), which is a parametric formula for L(z).  As x goes to zero, ln(x) goes to negative infinity, so y goes to zero.  These are paths (x, ln(z)/ln(x)), completely within L(z), which approach the origin.

In making these statements, we need to keep in mind that xy is multi-valued.  That’s because xy = e y ln(x), and ln(x) is multi-valued. That is because ln(x) is the inverse of the complex exponential, which is many-to-one: adding any integer multiple of 2 \pi i to z leaves ez unchanged.  And that follows from the definition of the exponential, which sends a + bi to the complex number with magnitude a and phase b.

Footnote:  to visualize these operations, represent the complex numbers by the real plane.  Addition is given by vector addition.  Multiplication gives the vector with magnitude equal to the product of the magnitudes, and phase equal to the sum of the phases.   The positive real numbers have phase zero, and the positive imaginary numbers are at 90 degrees vertical, with phase \pi / 2.

For a specific (x,y), how many values does xy have?  Well, ln(x) has a countable number of values, all differing by integer multiples of 2 \pi i.  This generally induces a countable number of values for xy.  But if y is rational, they  collapse down to a finite set.  When y = 1/n, for example, the values of y ln(x) are spaced apart by 2 \pi i / n, and when these get pumped back through the exponential function, we find only n distinct values for x 1/n — they are the nth roots of x.

So, to speak of the limit of xy along a path, and of the partition of \mathbb{C}^2 into level sets, we need to work within a branch of xy.   Each branch induces a different partition of \mathbb{C}^2.  But for every one of these partitions, it holds true that all of the level sets approach the origin.  That follows from the formula for the level set L(z), which is y = ln(z) / ln(x).  As x goes to zero, every branch of ln(x) goes to negative infinity.  (Exercise:  why?)  So y also goes to zero.  The branch affects the shape of the paths to the origin, but not their existence.

Here is a qualitative description of how the level sets fit together:  they are like spokes around the origin, where each spoke is a curve in one complex dimension.  These curves are 1-D complex manifolds, which are equivalent to  two-dimensional surfaces in \mathbb{R}^4.  The partition comprises a two-parameter family of these surfaces, indexed by the complex value of xy.

What can be said about the geometry and topology of this “wheel of manifolds”?  We know they don’t intersect.  But are they “nicely” layered, or twisted and entangled?  As we zoom in on the origin, does the picture look smooth, or does it have a chaotic appearance, with infinite fine detail?  Suggestive of chaos is the fact that the gradient

\nabla x^y = (y x^{y-1}, \ln(y) x^y) = (y/x, \ln(y)) x^y

is also “wildly singular” at the origin.

These questions can be explored with plotting software.  Here, the artist would have the challenge of having only two dimensions to work with, when the “wheel” is really a structure in four-dimensional space.  So some interesting cross-sections would have to be chosen.

Exercises:

• Speak about the function bx, where b is negative, and x is real.
• What is 0^\pi, and why?
• What is 0^i?

Moral: something that seems odd, or like a joke that might annoy your algebra prof, could be more significant than you think.  So tell these riddles to your professors, while they are still around.


Follow

Get every new post delivered to your Inbox.

Join 3,302 other followers