Last time I explained how to create the ‘maximal abelian cover’ of a connected graph. Now I’ll say more about a systematic procedure for embedding this into a vector space. That will give us a topological crystal, like the one above.

Some remarkably symmetrical patterns arise this way! For example, starting from this graph:

we get this:

Nature uses this pattern for crystals of graphene.

Starting from this graph:

we get this:

Nature uses this for crystals of diamond! Since the construction depends only on the topology of the graph we start with, we call this embedded copy of its maximal abelian cover a **topological crystal**.

Today I’ll remind you how this construction works. I’ll also outline a proof that it gives an *embedding* of the maximal abelian cover if and only if the graph has no **bridges**: that is, edges that disconnect the graph when removed. I’ll skip all the hard steps of the proof, but they can be found here:

• John Baez, Topological crystals.

I’ll start with some standard stuff that’s good to know. Let $X$ be a graph. Remember from last time that we’re working in a setup where every edge $e$ goes from a vertex called its **source** $s(e)$ to a vertex called its **target** $t(e)$. We write $e: x \to y$ to indicate that $e$ is going from $x$ to $y$. You can think of the edge as having an arrow on it, and if you turn the arrow around you get the **inverse** edge, $e^{-1}: y \to x$. Also, $e^{-1} \ne e$.

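The group of **integral 0-chains** on $X$, $C_0(X,\mathbb{Z})$, is the free abelian group on the set of vertices of $X$. The group of **integral 1-chains** on $X$, $C_1(X,\mathbb{Z})$, is the quotient of the free abelian group on the set of edges of $X$ by relations $e^{-1} = -e$ for every edge $e$. The **boundary map** is the homomorphism

$$\partial : C_1(X,\mathbb{Z}) \to C_0(X,\mathbb{Z})$$

such that

$$\partial e = t(e) - s(e)$$

for each edge $e$, and

$$Z_1(X,\mathbb{Z}) = \ker \partial$$

is the group of **integral 1-cycles** on $X$.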

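Remember, a **path** in a graph is a sequence of edges, the target of each one being the source of the next. Any path $\gamma = e_1 \cdots e_n$ in $X$ determines an integral 1-chain:

$$C_\gamma = e_1 + \cdots + e_n$$

For any path $\gamma$ we have

$$C_{\gamma^{-1}} = -C_\gamma$$

and if $\gamma$ and $\delta$ are composable then

$$C_{\gamma\delta} = C_\gamma + C_\delta$$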

Last time I explained what it means for two paths to be ‘homologous’. Here’s the quick way to say it. There’s a groupoid called the **fundamental groupoid** of $X$, where the objects are the vertices of $X$ and the morphisms are freely generated by the edges except for relations saying that the inverse of $e: x \to y$ really is $e^{-1}: y \to x$. We can **abelianize** the fundamental groupoid by imposing relations saying that $\gamma\delta = \delta\gamma$ whenever this equation makes sense. Each path $\gamma: x \to y$ gives a morphism which I’ll call $[[\gamma]]: x \to y$ in the abelianized fundamental groupoid. We say two paths $\gamma, \gamma': x \to y$ are **homologous** if $[[\gamma]] = [[\gamma']]$.

Here’s a nice thing:

**Lemma A.** Let $X$ be a graph. Two paths $\gamma, \delta: x \to y$ in $X$ are homologous if and only if they give the same 1-chain: $C_\gamma = C_\delta$.

**Proof.** See the paper. You could say they give ‘homologous’ 1-chains, too, but for graphs that’s the same as being equal. █
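Lemma A is easy to play with on a computer. Here is a minimal sketch, assuming a made-up naming scheme where the inverse of an edge called `e1` is called `e1^-1`; it computes the 1-chain of a path and checks that composing two loops in either order gives the same 1-chain. The graph here (two vertices, three parallel edges) and all the names are just for illustration.

```python
from collections import Counter

def inverse(edge):
    """Inverse of an edge, under the hypothetical naming scheme e <-> e^-1."""
    return edge[:-3] if edge.endswith("^-1") else edge + "^-1"

def chain(path):
    """1-chain of a path, as a dict {edge: nonzero integer coefficient}.

    An inverse edge e^-1 counts as -e, so only 'forward' names appear as keys.
    """
    c = Counter()
    for edge in path:
        if edge.endswith("^-1"):
            c[inverse(edge)] -= 1
        else:
            c[edge] += 1
    return {e: k for e, k in c.items() if k != 0}

# A graph with vertices a, b and three edges e1, e2, e3 : a -> b.
loop1 = ["e1", "e2^-1"]   # a -> b -> a
loop2 = ["e3", "e1^-1"]   # a -> b -> a
gamma = loop1 + loop2
delta = loop2 + loop1     # the same two loops composed in the other order

same_chain = chain(gamma) == chain(delta)
```

The paths `gamma` and `delta` are homologous by definition, since they differ by swapping two loops based at the same vertex, and indeed `same_chain` comes out true.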

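We define vector spaces of **0-chains** and **1-chains** by

$$C_0(X,\mathbb{R}) = C_0(X,\mathbb{Z}) \otimes \mathbb{R}, \qquad C_1(X,\mathbb{R}) = C_1(X,\mathbb{Z}) \otimes \mathbb{R}$$

respectively. We extend the boundary map to a linear map

$$\partial : C_1(X,\mathbb{R}) \to C_0(X,\mathbb{R})$$

We let $Z_1(X,\mathbb{R})$ be the kernel of this linear map, or equivalently,

$$Z_1(X,\mathbb{R}) = Z_1(X,\mathbb{Z}) \otimes \mathbb{R}$$

and we call elements of this vector space **1-cycles**. Since $Z_1(X,\mathbb{Z})$ is a free abelian group, it forms a lattice in the space of 1-cycles. Any edge of $X$ can be seen as a 1-chain, and there is a unique inner product on $C_1(X,\mathbb{R})$ such that edges form an orthonormal basis (with each edge $e^{-1}$ counting as the negative of $e$). There is thus an orthogonal projection

$$\pi : C_1(X,\mathbb{R}) \to Z_1(X,\mathbb{R})$$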

This is the key to building topological crystals!
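If you want to compute this projection for a specific graph, here is one way it might be done. This is only a sketch under my own assumptions: the graph is connected, each edge is listed once with a chosen orientation, and the function name `project_to_cycles` is made up. It uses the standard fact that the orthogonal complement of the 1-cycles consists of 'gradients' of functions on vertices, so we solve a graph Laplacian equation and subtract off the gradient part.

```python
from fractions import Fraction

def project_to_cycles(vertices, edges, c):
    """Orthogonally project a 1-chain c onto the space of 1-cycles.

    vertices: list of vertex names (the graph is assumed connected)
    edges:    dict {edge name: (source, target)}, one orientation per edge
    c:        dict {edge name: coefficient}
    Computes c - d^T phi, where d is the boundary map and (d d^T) phi = d c.
    """
    idx = {v: k for k, v in enumerate(vertices)}
    n = len(vertices)
    b = [Fraction(0)] * n                       # the boundary of c
    L = [[Fraction(0)] * n for _ in range(n)]   # graph Laplacian d d^T
    for e, (src, tgt) in edges.items():
        s, t = idx[src], idx[tgt]
        coeff = Fraction(c.get(e, 0))
        b[t] += coeff
        b[s] -= coeff
        L[s][s] += 1
        L[t][t] += 1
        L[s][t] -= 1
        L[t][s] -= 1
    # Set phi = 0 at the last vertex and solve the reduced system exactly
    # by Gauss-Jordan elimination.
    m = n - 1
    A = [L[r][:m] + [b[r]] for r in range(m)]
    for col in range(m):
        piv = next(r for r in range(col, m) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(m):
            factor = A[r][col]
            if r != col and factor != 0:
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    phi = [A[r][m] for r in range(m)] + [Fraction(0)]
    # Subtract the gradient part; what remains is a 1-cycle.
    return {e: Fraction(c.get(e, 0)) - (phi[idx[tgt]] - phi[idx[src]])
            for e, (src, tgt) in edges.items()}

# Two vertices joined by three parallel edges, all oriented a -> b.
theta = {"e1": ("a", "b"), "e2": ("a", "b"), "e3": ("a", "b")}
p = project_to_cycles(["a", "b"], theta, {"e1": 1})
```

For this graph the 1-cycles are the chains whose three coefficients sum to zero, so projecting the single edge `e1` just subtracts the mean 1/3 from each coefficient, giving 2/3, -1/3, -1/3.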

We now come to the main construction, first introduced by Kotani and Sunada. To build a topological crystal, we start with a connected graph $X$ with a chosen basepoint $x_0$. We define an **atom** to be a homology class of paths starting at the basepoint, like

$$[[\alpha]] : x_0 \to x$$

Last time I showed that these atoms are the vertices of the maximal abelian cover of $X$. Now let’s embed these atoms in a vector space!

**Definition.** Let $X$ be a connected graph with a chosen basepoint. Let $A$ be its set of atoms. Define the map

$$i : A \to Z_1(X,\mathbb{R})$$

by

$$i([[\alpha]]) = \pi(C_\alpha)$$

That $i$ is well-defined follows from Lemma A. The interesting part is this:

**Theorem A.** The following are equivalent:

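(1) The graph $X$ has no bridges.

(2) The map $i$ is one-to-one.

**Proof.** The map $i$ is one-to-one if and only if for any atoms $[[\alpha]]$ and $[[\beta]]$, $i([[\alpha]]) = i([[\beta]])$ implies $[[\alpha]] = [[\beta]]$. Note that $\gamma = \beta^{-1}\alpha$ is a path in $X$ with $C_\gamma = C_\alpha - C_\beta$, so

$$i([[\alpha]]) = i([[\beta]]) \iff \pi(C_\gamma) = 0$$

Since $\pi(C_\gamma)$ vanishes if and only if $C_\gamma$ is orthogonal to every 1-cycle, we have

$$C_\gamma \textrm{ is orthogonal to every 1-cycle} \iff i([[\alpha]]) = i([[\beta]])$$

On the other hand, Lemma A says

$$C_\gamma = 0 \iff [[\alpha]] = [[\beta]]$$

Thus, to prove (1)$\Leftrightarrow$(2), it suffices to show that $X$ has no bridges if and only if every 1-chain $C_\gamma$ orthogonal to every 1-cycle has $C_\gamma = 0$. This is Lemma D below. █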
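Condition (1) of Theorem A is easy to test by machine. Here is a brute-force sketch, not anything from the paper: it removes each edge in turn and checks whether the graph stays connected. The representation of a graph as a dict of edge names to (source, target) pairs is my own assumption, as are the two example graphs.

```python
def is_connected(vertices, edges, skip=None):
    """Depth-first search connectivity test, ignoring the edge named `skip`."""
    adj = {v: [] for v in vertices}
    for name, (s, t) in edges.items():
        if name != skip:
            adj[s].append(t)
            adj[t].append(s)
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)

def bridges(vertices, edges):
    """Names of the edges whose removal disconnects a connected graph."""
    return [e for e in edges if not is_connected(vertices, edges, skip=e)]

# Three parallel edges between two vertices: no bridges.
theta = {"e1": ("a", "b"), "e2": ("a", "b"), "e3": ("a", "b")}
# Two loops joined by a single edge m: that edge is a bridge.
barbell = {"l": ("a", "a"), "m": ("a", "b"), "r": ("b", "b")}

theta_bridges = bridges(["a", "b"], theta)
barbell_bridges = bridges(["a", "b"], barbell)
```

This takes one graph search per edge, which is fine for the small graphs that show up in crystallography; faster bridge-finding algorithms exist if you ever need them.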

The following lemmas are the key to the theorem above, and also to a deeper one saying that if $X$ has no bridges, we can extend $i$ to an embedding of the whole maximal abelian cover of $X$.

For now, we just need to show that any nonzero 1-chain coming from a path in a bridgeless graph has nonzero inner product with some 1-cycle. The following lemmas, inspired by an idea of Ilya Bogdanov, yield an algorithm for actually constructing such a 1-cycle. This 1-cycle also has other desirable properties, which will come in handy later.

To state these, let a **simple path** be one in which each vertex appears at most once. Let a **simple loop** be a loop $\gamma: x \to x$ in which each vertex except $x$ appears at most once, while $x$ appears exactly twice, as the starting point and ending point. Let the **support** of a 1-chain $c$, denoted $\mathrm{supp}(c)$, be the set of edges $e$ such that $\langle c, e \rangle > 0$. This excludes edges with $\langle c, e \rangle = 0$, but also those with $\langle c, e \rangle < 0$, which are inverses of edges in the support. Note that

$$c = \sum_{e \in \mathrm{supp}(c)} \langle c, e \rangle \, e$$

Thus, $\mathrm{supp}(c)$ is the smallest set of edges such that $c$ can be written as a positive linear combination of edges in this set.

Okay, here are the lemmas!

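**Lemma B.** Let $X$ be any graph and let $c$ be an integral 1-cycle on $X$. Then for some $n$ we can write

$$c = C_{\sigma_1} + \cdots + C_{\sigma_n}$$

where $\sigma_i$ are simple loops with $\mathrm{supp}(C_{\sigma_i}) \subseteq \mathrm{supp}(c)$.

**Proof.** See the paper. The proof is an algorithm that builds a simple loop $\sigma_1$ with $\mathrm{supp}(C_{\sigma_1}) \subseteq \mathrm{supp}(c)$. We subtract $C_{\sigma_1}$ from $c$, and if the result isn’t zero we repeat the algorithm, continuing to subtract off 1-cycles until there’s nothing left. █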
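To get a feel for how an algorithm like this can go, here is a rough sketch. I’m not claiming it matches the paper’s proof step for step: it’s just one natural way to peel simple loops off an integral 1-cycle. The data format is an assumption: each edge is listed once with a chosen orientation, and a negative coefficient means the inverse edge is in the support.

```python
def simple_loops(edges, c):
    """Decompose an integral 1-cycle c into simple loops.

    edges: dict {edge name: (source, target)}, one orientation per edge
    c:     dict {edge name: integer coefficient}, assumed to be a 1-cycle
    Returns a list of loops; each loop is a list of (edge name, sign) steps,
    where sign -1 means the edge is traversed backwards.
    """
    c = {e: k for e, k in c.items() if k != 0}
    loops = []
    while c:
        # Start at the tail of any edge in the support.
        e0 = next(iter(c))
        v = edges[e0][0] if c[e0] > 0 else edges[e0][1]
        walk, visited = [], {v: 0}
        while True:
            # Since c is a cycle, some edge in the support always leaves v.
            e, sign = next(
                (e, 1 if k > 0 else -1) for e, k in c.items()
                if (edges[e][0] if k > 0 else edges[e][1]) == v
            )
            walk.append((e, sign))
            v = edges[e][1] if sign > 0 else edges[e][0]
            if v in visited:
                # The first repeated vertex closes up a simple loop.
                loop = walk[visited[v]:]
                break
            visited[v] = len(walk)
        loops.append(loop)
        for e, sign in loop:      # subtract the loop's 1-chain from c
            c[e] -= sign
            if c[e] == 0:
                del c[e]
    return loops

theta = {"e1": ("a", "b"), "e2": ("a", "b"), "e3": ("a", "b")}
loops = simple_loops(theta, {"e1": 2, "e2": -1, "e3": -1})
```

Each pass only ever walks along edges in the support of what’s left, and subtracting a loop moves every coefficient it touches one step closer to zero, so the process stops and every loop found has support inside the support of the original cycle.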

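**Lemma C.** Let $\gamma: x \to y$ be a path in a graph $X$. Then for some $n \ge 0$ we can write

$$C_\gamma = C_\delta + C_{\sigma_1} + \cdots + C_{\sigma_n}$$

where $\delta: x \to y$ is a simple path and $\sigma_i$ are simple loops with $\mathrm{supp}(C_\delta), \mathrm{supp}(C_{\sigma_i}) \subseteq \mathrm{supp}(C_\gamma)$.

**Proof.** This relies on the previous lemma, and the proof is similar, but when we can’t subtract off any more $C_{\sigma_i}$’s we show what’s left is $C_\delta$ for a simple path $\delta: x \to y$. █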

**Lemma D.** Let $X$ be a graph. Then the following are equivalent:

(1) $X$ has no bridges.

(2) For any path $\gamma$ in $X$, if $C_\gamma$ is orthogonal to every 1-cycle then $C_\gamma = 0$.

**Proof.** It’s easy to show a bridge gives a nonzero 1-chain $C_\gamma$ that’s orthogonal to all 1-cycles, so the hard part is showing that for a bridgeless graph, if $C_\gamma$ is orthogonal to every 1-cycle then $C_\gamma = 0$. The idea is to start with a path $\gamma$ for which $C_\gamma \ne 0$. We hit this path with Lemma C, which lets us replace $\gamma$ by a simple path $\delta$. The point is that a simple path is a lot easier to deal with than a general path: a general path could wind around crazily, passing over every edge of our graph multiple times.

Then, assuming $X$ has no bridges, we use Ilya Bogdanov’s idea to build a 1-cycle that’s not orthogonal to $C_\delta$. The basic idea is to take the path $\delta: x \to y$ and write it out as $\delta = e_1 \cdots e_n$. Since the last edge $e_n$ is not a bridge, there must be a path from $y$ back to $x$ that does not use the edge $e_n$ or its inverse. Combining this path with $\delta$ we can construct a loop, which gives a cycle having nonzero inner product with $C_\delta$ and thus with $C_\gamma$.

I’m deliberately glossing over some difficulties that can arise, so see the paper for details! █

Okay: so far, we’ve taken a connected bridgeless graph $X$ and embedded its atoms into the space of 1-cycles via a map

$$i : A \to Z_1(X,\mathbb{R})$$

These atoms are the vertices of the maximal abelian cover $\overline{X}$. Now we’ll extend $i$ to an embedding of the whole graph $\overline{X}$, or to be precise, its geometric realization $|\overline{X}|$. Remember, for us a graph is an abstract combinatorial gadget; its geometric realization is a topological space where the edges become closed intervals.

The idea is that just as $i$ maps each atom to a point in the vector space $Z_1(X,\mathbb{R})$, $j$ maps each edge of $|\overline{X}|$ to a straight line segment between such points. These line segments serve as the ‘bonds’ of a topological crystal. The only challenge is to show that these bonds do not cross each other.

**Theorem B.** If $X$ is a connected graph with basepoint, the map $i : A \to Z_1(X,\mathbb{R})$ extends to a continuous map

$$j : |\overline{X}| \to Z_1(X,\mathbb{R})$$

sending each edge of $|\overline{X}|$ to a straight line segment in $Z_1(X,\mathbb{R})$. If $X$ has no bridges, then $j$ is one-to-one.

**Proof.** The first part is easy; the second part takes real work! The problem is to show the edges don’t cross. Greg Egan and I couldn’t do it using just Lemma D above. However, there’s a nice argument that goes back and uses Lemma C — read the paper for details.

As usual, history is different than what you read in math papers: David Speyer gave us a nice proof of Lemma D, and that was good enough to prove that atoms are mapped into the space of 1-cycles in a one-to-one way, but we only came up with Lemma C after weeks of struggling to prove the edges don’t cross. █

Tropical geometry sets up a nice analogy between Riemann surfaces and graphs. The Abel–Jacobi map embeds any Riemann surface $\Sigma$ in its Jacobian, which is the torus $H_1(\Sigma,\mathbb{R})/H_1(\Sigma,\mathbb{Z})$. We can similarly define the Jacobian of a graph $X$ to be $H_1(X,\mathbb{R})/H_1(X,\mathbb{Z})$. Theorem B yields a way to embed a graph, or more precisely its geometric realization $|X|$, into its Jacobian. This is the analogue, for graphs, of the Abel–Jacobi map.

After I put this paper on the arXiv, I got an email from Matt Baker saying that he had already proved Theorem A — or to be precise, something that’s clearly equivalent. It’s Theorem 1.8 here:

• Matthew Baker and Serguei Norine, Riemann–Roch and Abel–Jacobi theory on a finite graph.

This says that the *vertices* of a bridgeless graph are embedded in its Jacobian by means of the graph-theoretic analogue of the Abel–Jacobi map.

What I really want to know is whether someone’s written up a proof that this map embeds the whole graph, not just its vertices, into its Jacobian in a one-to-one way. That would imply Theorem B. For more on this, try my conversation with David Speyer.

Anyway, there’s a nice connection between topological crystallography and tropical geometry, and not enough communication between the two communities. Once I figure out what the tropical folks have proved, I will revise my paper to take that into account.

Next time I’ll talk about more *examples* of topological crystals!


• Ed Crooks, Balance of power tilts from fossil fuels to renewable energy, *Financial Times*, 26 July 2016.

These are strange days in the energy business. Startling headlines are emerging from the sector that would have seemed impossible just a few years ago.

The Dubai Electricity and Water Authority said in May it had received bids to develop solar power projects that would deliver electricity costing less than three cents per kilowatt hour. This established a new worldwide low for the contracted cost of delivering solar power to the grid—and is priced well below the benchmark of what the emirate and other countries typically pay for electricity from coal-fired stations.

In the UK, renowned for its miserable overcast weather, solar panels contributed more power to the grid than coal plants for the month of May.

In energy-hungry Los Angeles, the electricity company AES is installing the world’s largest battery, with capacity to power hundreds of thousands of homes at times of high demand, replacing gas-fired plants which are often used at short notice to increase supply to the grid.

Trina Solar, the Chinese company that is the world’s largest solar panel manufacturer, said it had started selling in 20 new markets last year, from Poland to Mauritius and Nepal to Uruguay.

[…]

Some new energy technologies, meanwhile, are not making much progress, such as the development of power plants that capture and store the carbon dioxide they produce. It is commonly assumed among policymakers that carbon capture has become essential if humankind is to enjoy the benefits of fossil fuels while avoiding their polluting effects.

It is clear, too, that the growth of renewables and other low-carbon energy sources will not follow a straight line. Investment in “clean” energy has been faltering this year after hitting a record in 2015, according to Bloomberg New Energy Finance. For the first half of 2016, it is down 23 per cent from the equivalent period last year.

Even so, the elements are being put in place for what could be a quite sudden and far-reaching energy transition, which could be triggered by an unexpected and sustained surge in oil prices. If China or India were to make large-scale policy commitments to electric vehicles, they would have a dramatic impact on the outlook for oil demand.

I’m also interested in Elon Musk’s Gigafactory: a lithium-ion battery factory in Nevada with a projected capacity of 50 gigawatt-hours/year of battery packs in 2018, ultimately ramping up to 150 GWh/yr. These battery packs are mainly designed for Musk’s electric car company, Tesla.

So far, Tesla is having trouble making lots of cars: its Fremont, California plant theoretically has the capacity to make 500,000 cars per year, but last year it only built 50,000. For some of the reasons, see this:

• Matthew Debord, Tesla has to overcome a major problem for its massive new Gigafactory to succeed, *Singapore Business Insider*, 1 August 2016.

Basically, it’s hard to make cars as efficiently as traditional auto companies have learned to do, and as long as people don’t buy many electric cars, it’s hard to get better quickly.

Still, Musk has big dreams for his Gigafactory, which I can only applaud. Here’s what it should look like when it’s done:


We’re building crystals, like diamonds, purely from topology. Last time I said how: you take a graph $X$ and embed its maximal abelian cover into the vector space $H_1(X,\mathbb{R})$. Now let me say a bit more about the maximal abelian cover. It’s not nearly as famous as the universal cover, but it’s very nice.

First I’ll whiz through the basic idea, and then I’ll give the details.

By ‘space’ let me mean a connected topological space that’s locally nice. The basic idea is that if $X$ is some space, its **universal cover** $\widetilde{X}$ is a covering space of $X$ that covers all other covering spaces of $X$. The maximal abelian cover $\overline{X}$ has a similar universal property, but it’s *abelian*, and it covers all *abelian* connected covers. A cover is **abelian** if its group of deck transformations is abelian.

The cool part is that universal covers are to homotopy theory as maximal abelian covers are to homology theory.

What do I mean by that? For starters, points in $\widetilde{X}$ are just homotopy classes of paths in $X$ starting at some chosen basepoint. And the points in $\overline{X}$ are just ‘homology classes’ of paths starting at the basepoint.

But people don’t talk so much about ‘homology classes’ of paths. So what do I mean by *that?* Here a bit of category theory comes in handy. Homotopy classes of paths in $X$ are morphisms in the fundamental groupoid of $X$. Homology classes of paths are morphisms in the *abelianized* version of the fundamental groupoid!

But wait a minute: what does *that* mean? Well, we can **abelianize** any groupoid by imposing the relations

$$fg = gf$$

whenever it makes sense to do so. It makes sense to do so when you can compose the morphisms $f: x \to y$ and $g: x' \to y'$ in either order, and the resulting morphisms $fg$ and $gf$ have the same source and the same target. And if you work out what that means, you’ll see it means

$$x = y = x' = y'$$

But now let me say it all much more slowly, for people who want a more relaxed treatment.

There are lots of slightly different things called ‘graphs’ in mathematics, but in topological crystallography it’s convenient to work with one that you’ve probably never seen before. This kind of graph has two copies of each edge, one pointing in each direction.

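So, we’ll say a **graph** $X = (E, V, s, t, i)$ has a set $V$ of **vertices**, a set $E$ of **edges**, maps $s, t: E \to V$ assigning to each edge its **source** and **target**, and a map $i: E \to E$ sending each edge to its **inverse**, obeying

$$s(i(e)) = t(e), \qquad t(i(e)) = s(e), \qquad i(i(e)) = e$$

and

$$i(e) \ne e$$

for all $e \in E$.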

That inequality at the end will make category theorists gag: definitions should say what’s true, not what’s *not* true. But category theorists should be able to see what’s really going on here, so I leave that as a puzzle.

For ordinary folks, let me repeat the definition using more words. If $s(e) = v$ and $t(e) = w$ we write $e: v \to w$ and draw $e$ as an interval with an arrow on it pointing from $v$ to $w$. We write $i(e)$ as $e^{-1}$ and draw $e^{-1}$ as the same interval as $e$ but with its arrow reversed. The equations obeyed by $i$ say that taking the inverse of $e: v \to w$ gives an edge $e^{-1}: w \to v$ and that $(e^{-1})^{-1} = e$. No edge can be its own inverse.

A **map of graphs**, say $f: X \to X'$, is a pair of functions, one sending vertices to vertices and one sending edges to edges, that preserve the source, target and inverse maps. By abuse of notation we call both of these functions $f$.

I started out talking about topology; now I’m treating graphs very combinatorially, but we can bring the topology back in. From a graph $X$ we can build a topological space $|X|$ called its **geometric realization**. We do this by taking one point for each vertex and gluing on one copy of $[0,1]$ for each edge $e: v \to w$, gluing the point $0$ to $v$ and the point $1$ to $w$, and then identifying the interval for each edge $e$ with the interval for its inverse by means of the map $t \mapsto 1 - t$.

Any map of graphs gives rise to a continuous map between their geometric realizations, and we say a map of graphs is a **cover** if this continuous map is a covering map. For simplicity we denote the fundamental group of $|X|$ by $\pi_1(X)$, and similarly for other topological invariants of $|X|$. However, sometimes I’ll need to distinguish between a graph $X$ and its geometric realization $|X|$.

Any connected graph $X$ has a **universal cover**, meaning a connected cover

$$p : \widetilde{X} \to X$$

that covers every other connected cover. The geometric realization of $\widetilde{X}$ is connected and simply connected. The fundamental group $\pi_1(X)$ acts as **deck transformations** of $\widetilde{X}$, meaning invertible maps $g: \widetilde{X} \to \widetilde{X}$ such that $p \circ g = p$. We can take the quotient of $\widetilde{X}$ by the action of any subgroup $G \subseteq \pi_1(X)$ and get a cover $q: \widetilde{X}/G \to X$.

In particular, if we take $G$ to be the commutator subgroup of $\pi_1(X)$, we call the graph $\widetilde{X}/G$ the **maximal abelian cover** of the graph $X$, and denote it by $\overline{X}$. We obtain a cover

$$q : \overline{X} \to X$$

whose group of deck transformations is the abelianization of $\pi_1(X)$. This is just the first homology group $H_1(X,\mathbb{Z})$. In particular, if the space corresponding to $X$ has $n$ holes, this is the free abelian group on $n$ generators.

I want a concrete description of the maximal abelian cover! I’ll build it starting with the universal cover, but first we need some preliminaries on paths in graphs.

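Given vertices $x, y$ in $X$, define a **path** from $x$ to $y$ to be a word of edges $\gamma = e_1 \cdots e_\ell$ with $e_i: v_{i-1} \to v_i$ for some vertices $v_0, \dots, v_\ell$ with $v_0 = x$ and $v_\ell = y$. We allow the word to be empty if and only if $x = y$; this gives the **trivial path** from $x$ to itself.

Given a path $\gamma$ from $x$ to $y$ we write $\gamma: x \to y$, and we write the trivial path from $x$ to itself as $1_x: x \to x$. We define the composite of paths $\gamma: x \to y$ and $\delta: y \to z$ via concatenation of words, obtaining a path we call $\gamma\delta: x \to z$. We call a path from a vertex $x$ to itself a **loop** based at $x$.

We say two paths from $x$ to $y$ are **homotopic** if one can be obtained from the other by repeatedly introducing or deleting subwords of the form $e_i e_{i+1}$ where $e_{i+1} = e_i^{-1}$. If $[\gamma]$ is a homotopy class of paths from $x$ to $y$, we write $[\gamma]: x \to y$. We can compose homotopy classes $[\gamma]: x \to y$ and $[\delta]: y \to z$ by setting $[\gamma][\delta] = [\gamma\delta]$.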
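Homotopy of paths is easy to decide by computer. Here is a small sketch, again assuming my made-up convention that the inverse of an edge named `e1` is named `e1^-1`. It cancels adjacent inverse pairs with a stack; since this kind of cancellation always leads to the same fully reduced word no matter what order you do it in, two paths with the same endpoints are homotopic exactly when their reduced words agree.

```python
def inverse(edge):
    """Inverse of an edge, under the hypothetical naming scheme e <-> e^-1."""
    return edge[:-3] if edge.endswith("^-1") else edge + "^-1"

def reduce_path(path):
    """Delete subwords of the form e e^-1 until none remain."""
    out = []
    for edge in path:
        if out and out[-1] == inverse(edge):
            out.pop()          # the new edge cancels the previous one
        else:
            out.append(edge)
    return out

def homotopic(gamma, delta):
    """Compare two paths (assumed to share endpoints) up to homotopy."""
    return reduce_path(gamma) == reduce_path(delta)

# Loops at vertex a in a graph with three edges e1, e2, e3 : a -> b.
reduced = reduce_path(["e1", "e2^-1", "e2", "e1^-1", "e3"])
swapped = homotopic(["e1", "e2^-1", "e3", "e1^-1"],
                    ["e3", "e1^-1", "e1", "e2^-1"])
```

Here `reduced` is just `["e3"]`, while `swapped` is false: the two paths in that last comparison differ by swapping two loops, which is not a homotopy.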

If $X$ is a connected graph, we can describe the universal cover $\widetilde{X}$ as follows. Fix a vertex $x_0$ of $X$, which we call the **basepoint**. The vertices of $\widetilde{X}$ are defined to be the homotopy classes of paths $[\gamma]: x_0 \to x$ where $x$ is arbitrary. The edges in $\widetilde{X}$ from the vertex $[\gamma]: x_0 \to x$ to the vertex $[\delta]: x_0 \to y$ are defined to be the edges $e \in E$ with $[\gamma e] = [\delta]$. In fact, there is always at most one such edge. There is an obvious map of graphs

$$p : \widetilde{X} \to X$$

sending each vertex $[\gamma]: x_0 \to x$ of $\widetilde{X}$ to the vertex $x$ of $X$. This map is a cover.

Now we are ready to construct the maximal abelian cover $\overline{X}$. For this, we impose a further equivalence relation on paths, which is designed to make composition commutative whenever possible. However, we need to be careful. If $\gamma: x \to y$ and $\delta: x' \to y'$, the composites $\gamma\delta$ and $\delta\gamma$ are both well-defined if and only if $x' = y$ and $y' = x$. In this case, $\gamma\delta$ and $\delta\gamma$ share the same starting point and share the same ending point if and only if $x = x'$ and $y = y'$. If all four of these equations hold, both $\gamma$ and $\delta$ are loops based at $x$. So, we shall impose the relation $\gamma\delta = \delta\gamma$ only in this case.

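We say two paths are **homologous** if one can be obtained from the other by:

• repeatedly introducing or deleting subwords $e_i e_{i+1}$ where $e_{i+1} = e_i^{-1}$,

and/or

• repeatedly replacing subwords of the form

$$e_i \cdots e_j e_{j+1} \cdots e_k$$

by those of the form

$$e_{j+1} \cdots e_k e_i \cdots e_j$$

where $e_i \cdots e_j$ and $e_{j+1} \cdots e_k$ are loops based at the same vertex.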

My use of the term ‘homologous’ is a bit nonstandard here!

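We denote the homology class of a path $\gamma$ by $[[\gamma]]$. Note that if two paths $\gamma: x \to y$ and $\delta: x' \to y'$ are homologous then $x = x'$ and $y = y'$. Thus, the starting and ending points of a homology class of paths are well-defined, and given any path $\gamma: x \to y$ we write $[[\gamma]]: x \to y$. The composite of homology classes is also well-defined if we set $[[\gamma]][[\delta]] = [[\gamma\delta]]$.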

We construct the maximal abelian cover of a connected graph just as we constructed its universal cover, but using homology classes rather than homotopy classes of paths. And now I’ll introduce some jargon that should make you start thinking about crystals!

Fix a basepoint $x_0$ for $X$. The vertices of $\overline{X}$, or **atoms**, are defined to be the homology classes of paths $[[\gamma]]: x_0 \to x$ where $x$ is arbitrary. Any edge of $\overline{X}$, or **bond**, goes from some atom $[[\gamma]]: x_0 \to x$ to some atom $[[\delta]]: x_0 \to y$. The bonds from $[[\gamma]]$ to $[[\delta]]$ are defined to be the edges $e \in E$ with $[[\gamma e]] = [[\delta]]$. There is at most one bond between any two atoms. Again we have a covering map

$$q : \overline{X} \to X$$

The homotopy classes of loops based at $x_0$ form a group, with composition as the group operation. This is the **fundamental group** $\pi_1(X)$ of the graph $X$. This is isomorphic to the fundamental group of the space $|X|$ associated to $X$. By our construction of the universal cover, $\pi_1(X)$ is also the set of vertices of $\widetilde{X}$ that are mapped to $x_0$ by $p$. Furthermore, any element $[\gamma] \in \pi_1(X)$ defines a deck transformation of $\widetilde{X}$ that sends each vertex $[\delta]: x_0 \to x$ to the vertex $[\gamma\delta]: x_0 \to x$.

Similarly, the homology classes of loops based at $x_0$ form a group with composition as the group operation. Since the additional relation used to define homology classes is precisely that needed to make composition of homology classes of loops commutative, this group is the abelianization of $\pi_1(X)$. It is therefore isomorphic to the first homology group $H_1(X,\mathbb{Z})$ of the geometric realization of $X$.

By our construction of the maximal abelian cover, $H_1(X,\mathbb{Z})$ is also the set of vertices of $\overline{X}$ that are mapped to $x_0$ by $q$. Furthermore, any element $[[\gamma]] \in H_1(X,\mathbb{Z})$ defines a deck transformation of $\overline{X}$ that sends each vertex $[[\delta]]: x_0 \to x$ to the vertex $[[\gamma\delta]]: x_0 \to x$.

So, it all works out! The fundamental group acts as deck transformations of the universal cover, while the first homology group acts as deck transformations of the maximal abelian cover.

Puzzle for experts: what does this remind you of in Galois theory?

We’ll get back to crystals next time.


A while back, we started talking about crystals:

• John Baez, Diamonds and triamonds, *Azimuth*, 11 April 2016.

In the comments on that post, a bunch of us worked on some puzzles connected to ‘topological crystallography’—a subject that blends graph theory, topology and mathematical crystallography. You can learn more about that subject here:

• Tosio Sunada, Crystals that nature might miss creating, *Notices of the AMS* **55** (2008), 208–215.

I got so interested that I wrote this paper about it, with massive help from Greg Egan:

• John Baez, Topological crystals.

I’ll explain the basic ideas in a series of posts here.

First, a few personal words.

I feel a bit guilty putting so much work into this paper when I should be developing network theory to the point where it does our planet some good. I seem to need a certain amount of beautiful pure math to stay sane. But this project did at least teach me a lot about the topology of graphs.

For those not in the know, applying homology theory to graphs might sound fancy and interesting. For people who have studied a reasonable amount of topology, it probably sounds easy and boring. The first homology of a graph of genus $g$ is a free abelian group on $g$ generators: it’s a complete invariant of connected graphs up to homotopy equivalence. Case closed!

But there’s actually more to it, because studying graphs *up to homotopy equivalence* kills most of the fun. When we’re studying networks in real life we need a more refined outlook on graphs. So some aspects of this project might pay off, someday, in ways that have nothing to do with crystallography. But right now I’ll just talk about it as a fun self-contained set of puzzles.

I’ll start by quickly sketching how to construct topological crystals, and illustrate it with the example of graphene, a 2-dimensional form of carbon:

I’ll precisely state our biggest result, which says when this construction gives a crystal where the atoms don’t bump into each other and the bonds between atoms don’t cross each other. Later I may come back and add detail, but for now you can find details in our paper.

The ‘maximal abelian cover’ of a graph plays a key role in Sunada’s work on topological crystallography. Just as the universal cover of a connected graph $X$ has the fundamental group $\pi_1(X)$ as its group of deck transformations, the maximal abelian cover, denoted $\overline{X}$, has the abelianization of $\pi_1(X)$ as its group of deck transformations. It thus covers every other connected cover of $X$ whose group of deck transformations is abelian. Since the abelianization of $\pi_1(X)$ is the first homology group $H_1(X,\mathbb{Z})$, there is a close connection between the maximal abelian cover and homology theory.

In our paper, Greg and I prove that for a large class of graphs, the maximal abelian cover can naturally be embedded in the vector space $H_1(X,\mathbb{R})$. We call this embedded copy of $\overline{X}$ a ‘topological crystal’. The symmetries of the original graph can be lifted to symmetries of its topological crystal, but the topological crystal also has an $n$-dimensional lattice of translational symmetries, where $n$ is the dimension of $H_1(X,\mathbb{R})$. In 2- and 3-dimensional examples, the topological crystal can serve as the blueprint for an actual crystal, with atoms at the vertices and bonds along the edges.

The general construction of topological crystals was developed by Kotani and Sunada, and later by Eon. Sunada uses ‘topological crystal’ for an even more general concept, but we only need a special case.

Here’s how it works. We start with a graph $X$. This has a space $C_0(X,\mathbb{R})$ of 0-chains, which are formal linear combinations of vertices, and a space $C_1(X,\mathbb{R})$ of 1-chains, which are formal linear combinations of edges. There is a boundary operator

$$\partial : C_1(X,\mathbb{R}) \to C_0(X,\mathbb{R})$$

This is the linear operator sending any edge to the difference of its two endpoints. The kernel of this operator is called the space of 1-cycles, $Z_1(X,\mathbb{R})$. There is an inner product on the space of 1-chains such that edges form an orthonormal basis. This determines an orthogonal projection

$$\pi : C_1(X,\mathbb{R}) \to Z_1(X,\mathbb{R})$$

For a graph, $Z_1(X,\mathbb{R})$ is isomorphic to the first homology group $H_1(X,\mathbb{R})$. So, to obtain the topological crystal of $X$, we need only embed its maximal abelian cover $\overline{X}$ in $Z_1(X,\mathbb{R})$. We do this by embedding $\overline{X}$ in $C_1(X,\mathbb{R})$ and then projecting it down via $\pi$.

To accomplish this, we need to fix a basepoint for $X$. Each path $\gamma$ in $X$ starting at this basepoint determines a 1-chain $C_\gamma$. These 1-chains correspond to the vertices of $\overline{X}$. The graph $\overline{X}$ has an edge from $C_\gamma$ to $C_{\gamma'}$ whenever the path $\gamma'$ is obtained by adding an extra edge to $\gamma$. This edge is a straight line segment from the point $C_\gamma$ to the point $C_{\gamma'}$.

The hard part is checking that the projection $\pi$ maps this copy of $\overline{X}$ into $Z_1(X,\mathbb{R})$ in a one-to-one manner. In Theorems 6 and 7 of our paper we prove that this happens precisely when the graph $X$ has no ‘bridges’: that is, edges whose removal would disconnect $X$.

Kotani and Sunada noted that this condition is necessary. That’s actually pretty easy to see. The challenge was to show that it’s sufficient! For this, our main technical tool is Lemma 5, which for any path $\gamma$ decomposes the 1-chain $C_\gamma$ into manageable pieces.

We call the resulting copy of $\overline{X}$ embedded in $Z_1(X,\mathbb{R})$ a **topological crystal**.

Let’s see how it works in an example!

Take $X$ to be this graph:

Since $X$ has 3 edges, the space of 1-chains is 3-dimensional. Since $X$ has 2 holes, the space of 1-cycles is a 2-dimensional plane in this 3-dimensional space. If we consider paths $\gamma$ in $X$ starting at the red vertex, form the 1-chains $C_\gamma$, and project them down to this plane, we obtain the following picture:

Here the 1-chains $C_\gamma$ are the white and red dots. These are the vertices of $\overline{X}$, while the line segments between them are the edges of $\overline{X}$. Projecting these vertices and edges onto the plane of 1-cycles, we obtain the topological crystal for $X$. The blue dots come from projecting the white dots onto the plane of 1-cycles, while the red dots already lie on this plane. The resulting topological crystal provides the pattern for graphene:

That’s all there is to the basic idea! But there’s a lot more to say about it, and a lot of fun examples to look at: diamonds, triamonds, hyperquartz and more.
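The graphene example is small enough to check by brute force. Here is a sketch under some assumptions of my own: the graph has two vertices that I’ll call `a` and `b`, with three edges all oriented from `a` to `b`, so a 1-chain is a triple of numbers and the 1-cycles are the triples that sum to zero. Projecting onto that plane just subtracts the mean.

```python
from fractions import Fraction

def project(c):
    """Project (c0, c1, c2) onto the plane c0 + c1 + c2 = 0 of 1-cycles."""
    mean = Fraction(sum(c), 3)
    return tuple(x - mean for x in c)

def atoms(max_len):
    """Projected 1-chains of all paths of length <= max_len starting at a."""
    found = set()
    frontier = {((0, 0, 0), "a")}          # (1-chain, current vertex)
    for _ in range(max_len + 1):
        found |= {project(c) for c, _ in frontier}
        nxt = set()
        for c, v in frontier:
            step = 1 if v == "a" else -1   # forward from a, backward from b
            for k in range(3):
                d = tuple(x + step * (i == k) for i, x in enumerate(c))
                nxt.add((d, "b" if v == "a" else "a"))
        frontier = nxt
    return found

def dist2(p, q):
    """Squared distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

pts = atoms(4)
origin = (Fraction(0),) * 3
nearest = sorted(p for p in pts if dist2(p, origin) == Fraction(2, 3))
```

The atom at the origin has exactly three nearest neighbors, each at squared distance 2/3, and the inner product of any two of the vectors pointing to them is -1/3. So the angle between bonds has cosine -1/2: that’s 120 degrees, just as in the hexagonal pattern of graphene.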


Frigatebirds are amazing!

They have the largest ratio of wing area to body weight of any bird. This lets them fly very long distances while only rarely flapping their wings. They often stay in the air for weeks at a time. And one being tracked by satellite in the Indian Ocean stayed aloft for *two months*.

Surprisingly for sea birds, they don’t go into the water. Their feathers aren’t waterproof. They are true creatures of the air. They snatch fish from the ocean surface using their long, hooked bills—and they often eat flying fish! They clean themselves in flight by flying low and wetting themselves at the water’s surface before preening themselves.

They live a long time: often over 35 years.

But here’s the cool new discovery:

Since the frigatebird spends most of its life at sea, its habits outside of when it breeds on land aren’t well-known—until researchers started tracking them around the Indian Ocean. What the researchers discovered is that the birds’ flying ability almost defies belief.

Ornithologist Henri Weimerskirch put satellite tags on a couple of dozen frigatebirds, as well as instruments that measured body functions such as heart rate. When the data started to come in, he could hardly believe how high the birds flew.

“First, we found, ‘Whoa, 1,500 meters. Wow. Excellent, fantastique,’ ” says Weimerskirch, who is with the National Center for Scientific Research in Paris. “And after 2,000, after 3,000, after 4,000 meters — OK, at this altitude they are in freezing conditions, especially surprising for a tropical bird.”

Four thousand meters is more than 12,000 feet, or as high as parts of the Rocky Mountains. “There is no other bird flying so high relative to the sea surface,” he says.

Weimerskirch says that kind of flying should take a huge amount of energy. But the instruments monitoring the birds’ heartbeats showed that the birds weren’t even working up a sweat. (They wouldn’t, actually, since birds don’t sweat, but their heart rate wasn’t going up.)

How did they do it? By flying into a cloud.

“It’s the only bird that is known to intentionally enter into a cloud,” Weimerskirch says. And not just any cloud—a fluffy, white cumulus cloud. Over the ocean, these clouds tend to form in places where warm air rises from the sea surface. The birds hitch a ride on the updraft, all the way up to the top of the cloud.

[…]

“Absolutely incredible,” says Curtis Deutsch, an oceanographer at the University of Washington. “They’re doing it right through these cumulus clouds. You know, if you’ve ever been on an airplane, flying through turbulence, you know it can be a little bit nerve-wracking.”

One of the tagged birds soared 40 miles without a wing-flap. Several covered more than 300 miles a day on average, and flew continuously for weeks.

• Christopher Joyce, Nonstop flight: how the frigatebird can soar for weeks without stopping, *All Things Considered*, National Public Radio, 30 June 2016.

Frigatebirds aren’t admirable in every way. They’re kleptoparasites—now there’s a word you don’t hear every day! That’s a name for animals that steal food:

Frigatebirds will rob other seabirds such as boobies, particularly the red-footed booby, tropicbirds, shearwaters, petrels, terns, gulls and even ospreys of their catch, using their speed and maneuverability to outrun and harass their victims until they regurgitate their stomach contents. They may either assail their targets after they have caught their food or circle high over seabird colonies waiting for parent birds to return laden with food.

• Frigatebird, Wikipedia.


David Spivak has been working a lot on operads as a tool for describing systems of systems. Here’s a nice programmatic talk advocating this approach:

• David Spivak, Operads as a potential foundation for systems of systems.

This was a talk he gave at the Generalized Network Structures and Dynamics Workshop at the Mathematical Biosciences Institute at Ohio State University this spring.

You won’t learn what operads are from this talk—for that, try this:

• Wikipedia, Operad.

But if you know a bit about operads, it may help give you an idea of their flexibility as a formalism for describing ways of sticking together components to form bigger systems!

I’ll probably talk about this kind of thing more pretty soon. So far I’ve been using category theory to study networked systems like electrical circuits, Markov processes and chemical reaction networks. The same ideas handle all these different kinds of systems in a unified way. But I want to push toward biology. Here we need more sophisticated ideas. My philosophy is that while biology seems “messy” to physicists, living systems actually operate at higher levels of abstraction, which call for new mathematics.

]]>

An obvious strategy is to make up a function $f$ from ordinals to ordinals that grows really fast, so that $f(x)$ is a lot bigger than the ordinal $x$ indexing it. This is indeed a good idea. But something funny tends to happen! Eventually $x$ catches up with $f(x).$ In other words, you eventually hit a solution of

$x = f(x)$

This is called a **fixed point** of $f.$ At this point, there’s no way to use $f(x)$ as a name for $x$ unless you *already* have a name for $x.$ So, your scheme fizzles out!

For example, we started by looking at powers of $\omega,$ the smallest infinite ordinal. But eventually we ran into ordinals $x$ that obey

$x = \omega^x$

There’s an obvious work-around: we make up a new name for ordinals $x$ that obey

$x = \omega^x$

We call them **epsilon numbers**. In our usual nerdy way we start counting at zero, so we call the smallest solution of this equation $\epsilon_0,$ the next one $\epsilon_1,$ and so on.

But eventually we run into ordinals $x$ that are fixed points of the function $\epsilon_x,$ meaning that

$x = \epsilon_x$

There’s an obvious work-around: we make up a new name for ordinals $x$ that obey

$x = \epsilon_x$

But by now you can guess that this problem will keep happening, so we’d better get systematic about making up new names! We should let

$\phi_0(\alpha) = \omega^\alpha$

and let $\phi_{n+1}(\alpha)$ be the $\alpha$th fixed point of $\phi_n.$

Oswald Veblen, a mathematician at Princeton, came up with this idea around 1908, based on some thoughts of G. H. Hardy:

• Oswald Veblen, Continuous increasing functions of finite and transfinite ordinals, *Trans. Amer. Math. Soc.* **9** (1908), 280–292.

He figured out how to define $\phi_\gamma(\alpha)$ even when the index $\gamma$ is infinite.

Last time we saw how to name a lot of countable ordinals using this idea: in fact, all ordinals less than the ‘Feferman–Schütte ordinal’. This time I want to go further, still using Veblen’s work.

First, however, I feel an urge to explain things a bit more precisely.

There are three kinds of ordinals. The first is a **successor ordinal**, which is one more than some other ordinal. So, we say $\alpha$ is a successor ordinal if

$\alpha = \beta + 1$

for some $\beta.$ The second is 0, which is not a successor ordinal. And the third is a **limit ordinal**, which is neither 0 nor a successor ordinal. The smallest example is $\omega.$

Every limit ordinal is the ‘limit’ of ordinals less than it. What does that mean, exactly? Remember, each ordinal is a set: the set of all smaller ordinals. We can define the **limit** of a set of ordinals to be the union of that set. Alternatively, it’s the smallest ordinal that’s greater than or equal to every ordinal in that set.

Now for Veblen’s key idea:

**Veblen’s Fixed Point Theorem.** Suppose a function $f$ from ordinals to ordinals is:

• **strictly increasing**: if $x < y$ then $f(x) < f(y)$

and

• **continuous**: if $x$ is a limit ordinal, $f(x)$ is the limit of the ordinals $f(\alpha)$ where $\alpha < x.$

Then $f$ must have a fixed point.

Why? For starters, we always have this fact:

$x \le f(x)$

After all, if this weren’t true, there’d be a smallest $x$ with the property that $f(x) < x,$ since every nonempty set of ordinals has a smallest element. But since $f$ is strictly increasing,

$f(f(x)) < f(x)$

so $f(x)$ would be an even smaller ordinal with this property. Contradiction!

Using this fact repeatedly, we get

$0 \le f(0) \le f(f(0)) \le \cdots$

Let $\alpha$ be the limit of the ordinals

$0, \; f(0), \; f(f(0)), \; \dots$

Then by continuity, $f(\alpha)$ is the limit of the sequence

$f(0), \; f(f(0)), \; f(f(f(0))), \; \dots$

So $f(\alpha)$ equals $\alpha.$ *Voilà!* A fixed point!

This construction gives the smallest fixed point of $f$. There are infinitely many more, since we can start not with $0$ but with $\alpha + 1$ and repeat the same argument, etc. Indeed if we try to list these fixed points, we find there is one for each ordinal.
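To see the construction in the proof at work, here is a minimal Python sketch that iterates the function sending $x$ to $\omega^x$, starting from 0. The encoding is an assumption made purely for illustration and is not taken from Veblen’s paper: an ordinal below $\epsilon_0$ is stored in Cantor normal form as a tuple of (exponent, coefficient) pairs with strictly decreasing exponents, where each exponent is again such a tuple. With that encoding, Python’s built-in tuple comparison happens to agree with the ordering of ordinals. The names `ZERO`, `omega_power` and `iterate` are hypothetical.

```python
# Assumed encoding (illustration only): an ordinal below epsilon_0 is a tuple
# of (exponent, coefficient) pairs in Cantor normal form, exponents decreasing.
# The empty tuple is 0, and built-in tuple comparison matches ordinal order.
ZERO = ()

def omega_power(x):
    """Return omega^x, which has the single term omega^x * 1."""
    return ((x, 1),)

def iterate(f, start, steps):
    """Return [start, f(start), f(f(start)), ...] with `steps` applications."""
    sequence = [start]
    for _ in range(steps):
        sequence.append(f(sequence[-1]))
    return sequence

# 0, 1, omega, omega^omega, omega^(omega^omega)
tower = iterate(omega_power, ZERO, 4)

# The fact used in the proof, x <= f(x), holds at every step we computed,
# and here the inequality is even strict.
strictly_increasing = all(a < b for a, b in zip(tower, tower[1:]))
print(strictly_increasing)
```

Each entry is strictly bigger than the one before, but the limit of the whole tower has no encoding of this kind: writing it down would need an exponent that contains itself. That is the fixed point showing up as a limitation of the notation.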

So, we can make up a new function that lists these fixed points. Just to be cute, people call this the **derivative** of $f,$ so that $f'(\alpha)$ is the $\alpha$th fixed point of $f.$ Beware: while the derivative of a polynomial grows more slowly than the original polynomial, the derivative of a continuous increasing function $f$ from ordinals to ordinals generally grows more *quickly* than $f.$ It doesn’t really act like a derivative; people just call it that.

Veblen proved another nice theorem:

**Theorem.** If $f$ is a continuous and strictly increasing function from ordinals to ordinals, so is $f'.$

So, we can take the derivative repeatedly! This is the key to the Veblen hierarchy.

If you want to read more about this, it helps to know that a function from ordinals to ordinals that’s continuous and strictly increasing is called **normal**. ‘Normal’ is an adjective that mathematicians use when they haven’t had enough coffee in the morning and aren’t feeling creative—it means a thousand different things. In this case, a better term would be ‘differentiable’.

Armed with that buzzword, you can try this:

• Wikipedia, Fixed-point lemma for normal functions.

Okay, enough theory. On to larger ordinals!

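First let’s summarize how far we got last time, and why we got stuck. We inductively defined the $\alpha$th ordinal of the $\gamma$th kind by:

$\phi_0(\alpha) = \omega^\alpha$

and

$\phi_{\gamma+1}(\alpha) = \phi'_\gamma(\alpha)$

meaning that $\phi_{\gamma+1}(\alpha)$ is the $\alpha$th fixed point of $\phi_\gamma.$

This handles the cases where $\gamma$ is zero or a successor ordinal. When $\gamma$ is a limit ordinal we let $\phi_\gamma(\alpha)$ be the $\alpha$th ordinal that’s a fixed point of *all* the functions $\phi_\beta$ for $\beta < \gamma.$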

Last time I explained how these functions $\phi_\gamma$ give a nice notation for ordinals less than the Feferman–Schütte ordinal, which is also called $\Gamma_0.$ This ordinal is the smallest solution of

$x = \phi_x(0)$

So it’s a fixed point, but of a new kind, because now the $x$ appears as a *subscript* of the $\phi$ function.

We can get our hands on the Feferman–Schütte ordinal by taking the limit of the ordinals

$\phi_0(0), \; \phi_{\phi_0(0)}(0), \; \phi_{\phi_{\phi_0(0)}(0)}(0), \; \dots$

(If you’re wondering why we use the number 0 here, instead of some other ordinal, I believe the answer is: it doesn’t really matter, we would get the same result if we used any ordinal less than the Feferman–Schütte ordinal.)

The ‘Feferman–Schütte barrier’ is the combination of these two facts:

• On the one hand, every ordinal $\beta$ less than $\Gamma_0$ can be written as a finite sum of guys $\phi_\gamma(\alpha)$ where $\alpha$ and $\gamma$ are even smaller than $\beta.$ Using this fact repeatedly, we can get a *finite* expression for any ordinal less than the Feferman–Schütte ordinal in terms of the $\phi$ function, addition, and the ordinal 0.

• On the other hand, if $\alpha$ and $\gamma$ are less than $\Gamma_0$ then $\phi_\gamma(\alpha)$ is less than $\Gamma_0.$ So we can’t use the $\phi$ function to name the Feferman–Schütte ordinal in terms of smaller ordinals.

But now let’s break the Feferman–Schütte barrier and reach some bigger countable ordinals!

The function $\phi_x(0)$ is strictly increasing and continuous as a function of $x.$ So, using Veblen’s theorems, we can define $\Gamma_\alpha$ to be the $\alpha$th solution of

$x = \phi_x(0)$

We can then define a bunch of enormous countable ordinals:

and still bigger ones:

and even bigger ones:

and even bigger ones:

But since $\epsilon_\alpha$ is just $\phi_1(\alpha),$ we can reach much bigger countable ordinals with the help of the $\phi$ function:

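and we can do vastly better using the $\Gamma$ function itself:

$\Gamma_{\Gamma_0}, \; \Gamma_{\Gamma_{\Gamma_0}}, \; \Gamma_{\Gamma_{\Gamma_{\Gamma_0}}}, \; \dots$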

The limit of all *these* is the smallest solution of

$x = \Gamma_x$

As usual, this ordinal is still countable, but there’s no way to express it in terms of the $\Gamma$ function and smaller ordinals. So we are stuck again.

In short: we got past the Feferman–Schütte barrier by introducing a name for the $\alpha$th solution of $x = \phi_x(0).$ We called it $\Gamma_\alpha.$ This made us happy for about two minutes…

… but then we ran into another barrier of the same kind.

So what we really need is a more general notation: one that gets us over not just this particular bump in the road, but *all bumps of this kind!* We don’t want to keep randomly choosing goofy new letters like $\Gamma.$ We need something systematic.

We were actually doing pretty well with the $\phi$ function. It was nice and systematic. It just wasn’t powerful enough. But if you’re trying to keep track of how far you’re driving on a really long trip, you want an odometer with more digits. So, let’s try that.

In other words, let’s generalize the $\phi$ function to allow more subscripts. Let’s rename $\Gamma_\alpha$ and call it $\phi_{1,0}(\alpha)$. The fact that we’re using two subscripts says that we’re going beyond the old $\phi$ functions with just one subscript. The subscripts 1 and 0 should remind you of what happens when you drive more than 9 miles: if your odometer has two digits, it’ll say you’re on mile 10.

Now we proceed as before: we make up new functions, each of which enumerates the fixed points of the previous one:

and so on. In general, we let

and when is a limit ordinal, we let

Are you confused?

*How could you possibly be confused???*

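Okay, maybe an example will help. In the last section, our notation fizzled out when we took the limit of these ordinals:

$\Gamma_{\Gamma_0}, \; \Gamma_{\Gamma_{\Gamma_0}}, \; \Gamma_{\Gamma_{\Gamma_{\Gamma_0}}}, \; \dots$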

The limit of these is the smallest solution of $x = \Gamma_x.$ But now we’re writing $\Gamma_x = \phi_{1,0}(x),$ so this limit is the smallest fixed point of $\phi_{1,0}.$ So, it’s $\phi_{1,1}(0).$

We can now ride happily into the sunset, defining $\phi_{1,\gamma}(\alpha)$ for all ordinals $\alpha, \gamma.$ Of course, this will never give us a notation for ordinals with

$x = \phi_{1,x}(0)$

But we don’t let that stop us! This is where the new extra subscript really comes in handy. We now define $\phi_{2,0}(\alpha)$ to be the $\alpha$th solution of

$x = \phi_{1,x}(0)$

Then we drive on as before. We let

and when is a limit ordinal, we say

I hope you get the idea. *Keep doing this!*

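We can inductively define $\phi_{\beta,\gamma}(\alpha)$ for all $\alpha, \beta$ and $\gamma.$ Of course, these functions will never give a notation for solutions of

$x = \phi_{x,0}(0)$

To describe these, we need a function with one more subscript! So let $\phi_{1,0,0}(\alpha)$ be the $\alpha$th solution of

$x = \phi_{x,0}(0)$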

We can then proceed on and on and on, adding extra subscripts as needed.

This is called the **multi-variable Veblen hierarchy**.

To help you understand the multi-variable Veblen hierarchy, I’ll use it to describe lots of ordinals. Some are old friends. Starting with finite ones, we have:

•

•

and so on, so we don’t need separate names for natural numbers… but I’ll use them just to save space.

•

•

and so on, so we don’t need separate names for $\omega$ and its powers, but I’ll use them just to save space.

•

•

•

•

•

•

•

where I should remind you that $\Gamma_\alpha$ is a name for the $\alpha$th solution of $x = \phi_x(0).$

•

•

•

• is the limit of

• is called the **Ackermann ordinal**.

Apparently Wilhelm Ackermann, the logician who invented a very fast-growing function called Ackermann’s function, had a system for naming ordinals that fizzled out at this ordinal.

There are obviously lots more ordinals that can be described using the multi-variable Veblen hierarchy, but I don’t have anything interesting to say about them. And you’re probably more interested in this question: *what’s next?*

The limit of these ordinals

$\phi_1(0), \; \phi_{1,0}(0), \; \phi_{1,0,0}(0), \; \dots$

is called the **small Veblen ordinal**. Yet again, it’s a countable ordinal. It’s the smallest ordinal that cannot be named in terms of smaller ordinals using the multi-variable Veblen hierarchy… at least, not the version I described. And here’s a nice fact:

**Theorem.** Every ordinal less than the small Veblen ordinal can be written as a finite expression in terms of the multi-variable $\phi$ function, addition, and 0.

For example,

is equal to

On the one hand, this notation is quite tiresome to read. On the other hand, it’s amazing that it gets us so far!

Furthermore, if you stare at expressions like the above one for a while, and think about them abstractly, they should start looking like *trees*. So you should find it easy to believe that ordinals less than the small Veblen ordinal correspond to *trees*, perhaps labelled in some way.
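To make the tree picture concrete, here is a small Python sketch that stores such an expression as a nested tuple. This is only one hypothetical encoding, chosen for illustration: the tags `"0"`, `"+"` and `"phi"`, the helper names, and the convention that the last child of a `phi` node is the argument while the earlier children are the subscripts are all assumptions, and the sample expression is made up.

```python
# Hypothetical tree encoding (illustration only): every node is a tuple whose
# first entry is a tag and whose remaining entries are the child expressions.
ZERO = ("0",)

def phi(*children):
    """A phi node: subscripts first, then the argument as the last child."""
    return ("phi",) + children

def plus(*terms):
    """A finite sum of expressions."""
    return ("+",) + terms

def depth(expr):
    """Height of the tree, counting a bare 0 as height 1."""
    return 1 + max((depth(child) for child in expr[1:]), default=0)

def count_nodes(expr):
    """Total number of nodes in the tree."""
    return 1 + sum(count_nodes(child) for child in expr[1:])

def render(expr):
    """Write the tree back out as a LaTeX-like string."""
    tag, children = expr[0], expr[1:]
    if tag == "0":
        return "0"
    if tag == "+":
        return " + ".join(render(child) for child in children)
    *subscripts, argument = children
    inner = ",".join(render(s) for s in subscripts)
    return "phi_{" + inner + "}(" + render(argument) + ")"

one = phi(ZERO, ZERO)  # phi_0(0), which is omega^0 = 1
example = plus(phi(one, ZERO, ZERO), phi(ZERO, one))
print(render(example))
print(depth(example), count_nodes(example))
```

Nothing here knows how to compare or simplify ordinals; the point is only that a finite expression built from the $\phi$ function, addition and 0 is literally a finite tree.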

Indeed, this paper describes a correspondence of this sort:

• Herman Ruge Jervell, Finite trees as ordinals, in *New Computational Paradigms*, Lecture Notes in Computer Science **3526**, Springer, Berlin, 2005, pp. 211–220.

However, I don’t think his idea is quite the same as what you’d come up with by staring at expressions like the one above.

We’re not quite done yet. The modifier ‘small’ in the term ‘small Veblen ordinal’ should make you suspect that there’s more in Veblen’s paper. And indeed there is!

Veblen actually extended his multi-variable function to the case where there are *infinitely many variables*. He requires that all but finitely many of these variables equal zero, to keep things under control. Using this, one can set up a notation for even bigger countable ordinals! This notation works for all ordinals less than the large Veblen ordinal.

We don’t need to stop here. The large Veblen ordinal is just the first of a new series of even larger countable ordinals!

These can again be defined as fixed points. Yes: it’s déjà vu all over again. But around here, people usually switch to a new method for naming these fixed points, called ‘ordinal collapsing functions’. One interesting thing about this notation is that it makes use of *uncountable* ordinals. The first uncountable ordinal is called $\Omega,$ and it dwarfs all those we’ve seen here.

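We can use the ordinal collapsing function $\psi$ to name many of our favorite countable ordinals, and more:

• $\psi(\Omega)$ is the smallest solution of $x = \epsilon_x.$

• $\psi(\Omega^\Omega)$ is the Feferman–Schütte ordinal.

• $\psi(\Omega^{\Omega^2})$ is the Ackermann ordinal.

• $\psi(\Omega^{\Omega^\omega})$ is the small Veblen ordinal.

• $\psi(\Omega^{\Omega^\Omega})$ is the large Veblen ordinal.

• $\psi(\epsilon_{\Omega+1})$ is called the Bachmann–Howard ordinal. This is the limit of the ordinals $\psi(\Omega), \; \psi(\Omega^\Omega), \; \psi(\Omega^{\Omega^\Omega}), \; \dots$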

I won’t explain this now. Maybe later! But not tonight. As Bilbo Baggins said:

The Road goes ever on and on

Out from the door where it began.

Now far ahead the Road has gone,

Let others follow it who can!

Let them a journey new begin,

But I at last with weary feet

Will turn towards the lighted inn,

My evening-rest and sleep to meet.

But perhaps you’re impatient and want to begin a new journey now!

The people who study notations for very large countable ordinals tend to work on proof theory, because these ordinals have nice applications to that branch of logic. For example, Peano arithmetic is powerful enough to work with ordinals up to but not including $\epsilon_0$, so we call $\epsilon_0$ the proof-theoretic ordinal of Peano arithmetic. Stronger axiom systems have bigger proof-theoretic ordinals.

Unfortunately this makes it a bit hard to learn about large countable ordinals without learning, or at least bumping into, a lot of proof theory. And this subject, while interesting in principle, is quite tough. So it’s hard to find a readable introduction to large countable ordinals.

The bibliography of the Wikipedia article on large countable ordinals gives this half-hearted recommendation:

Wolfram Pohlers, Proof theory, Springer 1989 ISBN 0-387-51842-8 (for Veblen hierarchy and some impredicative ordinals). This is probably the most readable book on large countable ordinals (which is not saying much).

Unfortunately, Pohlers does not seem to give a detailed account of ordinal collapsing functions. If you want to read something *fun* that goes further than my posts so far, try this:

• Hilbert Levitz, Transfinite ordinals and their notations: for the uninitiated.

(Anyone whose *first* name is Hilbert must be *born* to do logic!)

This is both systematic and clear:

• Wikipedia, Ordinal collapsing functions.

And if you want to explore countable ordinals using a computer program, try this:

• Paul Budnik, Ordinal calculator and research tool.

Among other things, this calculator can add, multiply and exponentiate ordinals described using the multi-variable Veblen hierarchy—even the version with infinitely many variables!

]]>

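and stopped for gas at the first one after all these. It’s called $\epsilon_0.$ Heuristically, you can imagine it like this:

$\epsilon_0 = \omega^{\omega^{\omega^{\cdot^{\cdot^{\cdot}}}}}$

More rigorously, it’s the smallest ordinal $x$ obeying the equation

$x = \omega^x$

But I’m sure you have a question. *What comes after $\epsilon_0$?*

Well, duh! It’s

$\epsilon_0 + 1$

Then comes

$\epsilon_0 + 2$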

and then eventually we get to

and then

and after a long time

and then eventually

and then eventually….

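Oh, I see! You wanted to know the first *really interesting* ordinal after $\epsilon_0.$

Well, this is a matter of taste, but you might be interested in $\epsilon_1.$ This is the first ordinal after $\epsilon_0$ that satisfies this equation:

$x = \omega^x$

How do we actually reach this ordinal? Well, just as $\epsilon_0$ was the limit of this sequence:

$\omega, \; \omega^\omega, \; \omega^{\omega^\omega}, \; \dots$

$\epsilon_1$ is the limit of this:

$\epsilon_0 + 1, \; \omega^{\epsilon_0 + 1}, \; \omega^{\omega^{\epsilon_0 + 1}}, \; \dots$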

You may wonder what I mean by the ‘limit’ of an increasing sequence of ordinals. I just mean the smallest ordinal greater than or equal to every ordinal in that sequence. Such a thing is guaranteed to exist, since if we treat ordinals as well-ordered sets, we can just take the *union* of all the sets in that sequence.

Here’s a picture of taken from David Madore’s interactive webpage:

In what sense is $\epsilon_1$ the first "really interesting" ordinal after $\epsilon_0$?

For one thing, it’s first that can’t be built out of and using finitely many additions, multiplications and exponentiations. In other words, if we use Cantor normal form to describe ordinals (as explained last time), and allow expressions involving as well as and we get a notation for all ordinals up to

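What’s the next really interesting ordinal after $\epsilon_1$? As you might expect, it’s called $\epsilon_2.$ This is the next solution of

$x = \omega^x$

and it’s defined to be the limit of this sequence:

$\epsilon_1 + 1, \; \omega^{\epsilon_1 + 1}, \; \omega^{\omega^{\epsilon_1 + 1}}, \; \dots$

Maybe now you get the pattern. In general, $\epsilon_\alpha$ is the $\alpha$th solution of $x = \omega^x.$ We can define this, if we’re smart, for any ordinal $\alpha.$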

So, we can keep driving on through fields of ever larger ordinals:

and eventually

which is the first ordinal bigger than

Let’s stop and take a look!

Nice! Okay, back in the car…

and then

and then

As you can see, this gets boring after a while: it’s suspiciously similar to the beginning of our trip through the ordinals. The same ordinals are now showing up as subscripts in this epsilon notation. But we’re moving much faster now, since I’m skipping over much bigger gaps, not bothering to mention all sorts of ordinals like

Anyway… while we’re zipping along, I might as well finish telling you the story I started last time. My friend David Sternlieb and I were driving across South Dakota on Route 80. We kept seeing signs for the South Dakota Tractor Museum. When we finally got there, we were driving pretty darn fast, out of boredom—about 85 miles an hour. And guess what happened then!

Oh — wait a minute—this one is sort of interesting:

Then come some more like that:

until we reach this:

and then

As we keep speeding up, we see:

So, anyway: by the time we got to that tractor museum, we were driving really fast. And all we saw as we whizzed by was a bunch of rusty tractors out in a field! It was over in a split second! It was a real anticlimax, just like this anecdote, in fact.

But that’s just the way it is when you’re driving through these ordinals! Every ordinal, no matter how large, looks pretty pathetic and small compared to the ones ahead — so you keep speeding up, looking for something ‘really new and different’. But when you find one, it turns out to be part of a larger pattern, and soon *that* gets boring too.

For example, when we reach the limit of this sequence:

$\epsilon_0, \; \epsilon_{\epsilon_0}, \; \epsilon_{\epsilon_{\epsilon_0}}, \; \dots$

our notation fizzles out again, since this is the first solution of

$x = \epsilon_x$

We could make up a new name for this ordinal, like $\zeta_0.$ I don’t think this name is very common, though I’ve seen it. We could call it the Tractor Museum of Countable Ordinals.

Now we can play the whole game again, defining the **zeta number** $\zeta_\alpha$ to be the $\alpha$th solution of

$x = \epsilon_x$

sort of like how we defined the epsilons. This kind of equation, where something equals some function of itself, is called a fixed point equation.

But since we’ll have to play this game infinitely often, we might as well be more systematic about it!

As you can see, we keep running into new, qualitatively different types of ordinals. First we ran into the powers of omega. Then we ran into the epsilons, and then the zetas. It’s gonna keep happening! For each type of ordinal, our notation fizzles out when we reach the first ‘fixed point’— when the xth ordinal of this type is actually equal to x.

So, instead of making up infinitely many Greek letters for different types of ordinals let’s index them… by ordinals! For each ordinal $\gamma$ we’ll have a type of ordinal. We’ll let $\phi_\gamma(\alpha)$ be the $\alpha$th ordinal of type $\gamma.$

We can use the fixed point equation to define $\phi_{\gamma+1}$ in terms of $\phi_\gamma.$ In other words, we start off by defining

$\phi_0(\alpha) = \omega^\alpha$

and then define

$\phi_{\gamma+1}(\alpha)$

to be the $\alpha$th solution of

$x = \phi_\gamma(x)$

where we start counting at $\alpha = 0,$ so the first solution is called the ‘zeroth’.

We can even make sense of $\phi_\gamma(\alpha)$ when $\gamma$ itself is infinite! Suppose $\gamma$ is a limit of smaller ordinals. Then we define $\phi_\gamma(\alpha)$ to be the limit of $\phi_\beta(\alpha)$ as $\beta$ approaches $\gamma.$ I’ll make this more precise next time.

We get infinitely many different types of ordinals, called the Veblen hierarchy. So, concretely, the Veblen hierarchy starts with the powers of $\omega$:

$\phi_0(\alpha) = \omega^\alpha$

and then it goes on to the ‘epsilons’:

$\phi_1(\alpha) = \epsilon_\alpha$

and then it goes on to what I called the ‘zetas’:

$\phi_2(\alpha) = \zeta_\alpha$

But that’s just the start!

Boosting the subscript $\gamma$ in $\phi_\gamma(\alpha)$ increases the result much more than boosting $\alpha,$ so let’s focus on that and just let $\alpha = 0.$ The Veblen hierarchy contains ordinals like this:

and then ordinals like this:

and then ordinals like this:

and then this:

where of course I’m skipping huge infinite stretches of ‘boring’ ones. But note that

and

and

In short, we can plug the phi function into itself—and we get the biggest effect if we plug it into the subscript!

So, if we’re in a rush to reach some *really* big countable ordinals, we can try these:

$\phi_0(0), \; \phi_{\phi_0(0)}(0), \; \phi_{\phi_{\phi_0(0)}(0)}(0), \; \dots$

But the limit of these is an ordinal $\alpha$ that has

$\phi_\alpha(0) = \alpha$

This is called the Feferman–Schütte ordinal and denoted $\Gamma_0.$

In fact, the Feferman–Schütte ordinal is the *smallest* solution of

$x = \phi_x(0)$

Since this equation is self-referential, we can’t describe the Feferman–Schütte ordinal using the Veblen hierarchy, at least not without using the Feferman–Schütte ordinal!

Indeed, some mathematicians have made a big deal about this ordinal, claiming it’s

the smallest ordinal that cannot be described without self-reference.

This takes some explaining, and it’s somewhat controversial. After all, there’s a sense in which *every* fixed point equation is self-referential. But there’s a certain precise sense in which the Feferman–Schütte ordinal is different from previous ones.

Anyway, you have to admit that this is a very cute description of the Feferman–Schütte ordinal: “the smallest ordinal that cannot be described without self-reference.” Does it use self-reference? It had better, since otherwise we have a contradiction!

It’s a little scary, like this picture:

More importantly for us, the Veblen hierarchy fizzles out when we hit the Feferman–Schütte ordinal. Let me say what I mean by that.

The Veblen hierarchy gives a notation for ordinals called the Veblen normal form. You can think of this as a high-powered version of Cantor normal form, which we discussed last time.

Veblen normal form relies on this result:

**Theorem.** Any ordinal $\alpha$ can be written uniquely as

$\alpha = \phi_{\gamma_1}(\beta_1) + \cdots + \phi_{\gamma_k}(\beta_k)$

where $k$ is a natural number, each term is less than or equal to the previous one, and $\beta_i < \phi_{\gamma_i}(\beta_i)$ for all $i.$

Note that we can also use this theorem to write out the ordinals $\beta_i$ and $\gamma_i$, and so on, recursively. So, it gives us a notation for ordinals.

However, this notation is only *useful* when all the ordinals $\beta_i, \gamma_i$ are less than the ordinal $\alpha$ that we’re trying to describe. Otherwise we need to *already have* a notation for $\alpha$ to express $\alpha$ in Veblen normal form!

So, the power of this notation eventually fizzles out. And the place where it does is the Feferman–Schütte ordinal. Every ordinal less than this can be expressed in terms of $0$, addition, and the function $\phi$ using just finitely many symbols!

As I hope you see, the power of the human mind to see a pattern and formalize it gives the quest for large countable ordinals a strange quality. As soon as we see a systematic way to generate a sequence of larger and larger ordinals, we know this sequence has a limit that’s larger than all of those! And this opens the door to even larger ones….

So, this whole journey feels a bit like trying to outrace our car’s own shadow as we drive away from the sunset: the faster we drive, the faster it shoots ahead of us. We’ll never win.

On the other hand, we’ll only lose if we get tired.

So it’s interesting to hear what happens next. We don’t have to give up. The usual symbol for the Feferman–Schütte ordinal should be a clue. It’s called $\Gamma_0.$ And that’s because it’s *just the start of a new series of even bigger countable ordinals!*

I’m dying to tell you about those. But this is enough for today.

]]>

It may not exist in the physical world, but we can set up rules to think about it in consistent ways, and then it’s a helpful concept. The reason is that infinity is often easier to think about than very large finite numbers.

Finding rules to work with the infinite is one of the great triumphs of mathematics. Cantor’s realization that there are *different sizes of infinity* is truly wondrous—and by now, it’s part of the everyday bread and butter of mathematics.

Trying to create a *notation* for these different infinities is very challenging. It’s not a fair challenge, because there are more infinities than expressions we can write down in any given alphabet! But if we seek a notation for *countable ordinals*, the challenge becomes more fair.

It’s still incredibly frustrating. No matter what notation we use it fizzles out too soon… making us wish we’d invented a more general notation. But this process of ‘fizzling out’ is fascinating to me. There’s something profound about it. So, I would like to tell you about this.

Today I’ll start with a warmup. Cantor invented a notation for ordinals that works great for ordinals less than a certain ordinal called $\epsilon_0$. Next time I’ll go further, and bring in the ‘single-variable Veblen hierarchy’! This lets us describe all ordinals below a big guy called the ‘Feferman–Schütte ordinal’.

In the post after that I’ll bring in the ‘multi-variable Veblen hierarchy’, which gets us all the ordinals below the ‘small Veblen ordinal’. We’ll even touch on the ‘large Veblen ordinal’, which requires a version of the Veblen hierarchy with *infinitely* many variables. But all this is really just the beginning of a longer story. That’s how infinity works: the story never ends!

To describe countable ordinals beyond the large Veblen ordinal, most people switch to an entirely different set of ideas, called ‘ordinal collapsing functions’. I may tell you about those someday. Not soon, but someday. My interest in the infinite doesn’t seem to be waning. It’s a decadent hobby, but hey: some middle-aged men buy fancy red sports cars and drive them really fast. Studying notions of infinity is cooler, and it’s environmentally friendly.

I can even imagine writing a *book* about the infinite. Maybe these posts will become part of that book. But one step at a time…

Cantor invented two different kinds of infinities: cardinals and ordinals. Cardinals say how big sets are. Two sets can be put into 1-1 correspondence iff they have the same number of elements, where this kind of ‘number’ is a cardinal. You may have heard about cardinals like aleph-nought (the number of integers), 2 to the power aleph-nought (the number of real numbers), and so on. You may have even heard rumors of much bigger cardinals, like ‘inaccessible cardinals’ or ‘super-huge cardinals’. All this is tremendously fun, and I recommend starting here:

• Frank R. Drake, *Set Theory, an Introduction to Large Cardinals*, North-Holland, 1974.

There are other books that go much further, but as a beginner, I found this to be the most fun.

But I don’t want to talk about cardinals! I want to talk about *ordinals*.

Ordinals say how big ‘well-ordered’ sets are. A set is well-ordered if it comes with a relation ≤ obeying the usual rules:

• **Transitivity**: if x ≤ y and y ≤ z then x ≤ z

• **Reflexivity**: x ≤ x

• **Antisymmetry**: if x ≤ y and y ≤ x then x = y

and one more rule: *every nonempty subset has a smallest element!*
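For a finite set, all four rules can be checked by brute force. The Python sketch below is just an illustration under that finiteness assumption, and the function name `is_well_order` is hypothetical. It also shows why the last rule matters: divisibility on the set {1, 2, 3} obeys the first three rules, but the subset {2, 3} has no smallest element, since neither number divides the other.

```python
from itertools import combinations

def is_well_order(elements, leq):
    """Brute-force check of the four rules on a finite set (illustration only)."""
    xs = list(elements)
    reflexive = all(leq(x, x) for x in xs)
    antisymmetric = all(
        x == y or not (leq(x, y) and leq(y, x)) for x in xs for y in xs
    )
    transitive = all(
        leq(x, z) or not (leq(x, y) and leq(y, z))
        for x in xs for y in xs for z in xs
    )

    def has_smallest(subset):
        # Some member of the subset is <= every member of the subset.
        return any(all(leq(m, s) for s in subset) for m in subset)

    every_subset_has_smallest = all(
        has_smallest(subset)
        for size in range(1, len(xs) + 1)
        for subset in combinations(xs, size)
    )
    return reflexive and antisymmetric and transitive and every_subset_has_smallest

usual_order = is_well_order({0, 1, 2}, lambda x, y: x <= y)
divisibility = is_well_order({1, 2, 3}, lambda x, y: y % x == 0)
print(usual_order, divisibility)
```

The check visits every nonempty subset, so it is only sensible for tiny sets. For infinite sets like the ones coming up, the fourth rule has to be proved rather than tested.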

For example, the empty set

$\{\}$

is well-ordered in a trivial sort of way, and the corresponding ordinal is called

$0$

Similarly, any set with just one element, like this:

$\{0\}$

is well-ordered in a trivial sort of way, and the corresponding ordinal is called

$1$

Similarly, any set with two elements, like this:

$\{0, 1\}$

becomes well-ordered as soon as we decree which element is bigger; the obvious choice is to say 0 < 1. The corresponding ordinal is called

$2$

Similarly, any set with three elements, like this:

$\{0, 1, 2\}$

becomes well-ordered as soon as we linearly order it; the obvious choice here is to say 0 < 1 < 2. The corresponding ordinal is called

$3$

Perhaps you’re getting the pattern — you’ve probably seen these particular ordinals before, maybe sometime in grade school. They’re called finite ordinals, or "natural numbers".

But there’s a cute trick they probably didn’t teach you then: we can *define* each ordinal to *be* the set of all ordinals less than it:

$0 = \{\}$ (since no ordinal is less than 0)

$1 = \{0\}$ (since only 0 is less than 1)

$2 = \{0, 1\}$ (since 0 and 1 are less than 2)

$3 = \{0, 1, 2\}$ (since 0, 1 and 2 are less than 3)

and so on. It’s nice because now each ordinal *is* a well-ordered set of the size that ordinal stands for. And, we can define one ordinal to be "less than or equal" to another precisely when it’s a subset of the other.
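This trick is easy to try out for finite ordinals. Here is a minimal Python sketch using `frozenset`, since a set that is to be an element of another set has to be hashable. The function name `von_neumann` is hypothetical, and the loop relies on the standard observation that the successor of an ordinal is that ordinal together with itself as one extra element.

```python
def von_neumann(n):
    """Return the finite ordinal n as the frozenset of all smaller ordinals."""
    ordinal = frozenset()                # 0 is the empty set
    for _ in range(n):
        ordinal = ordinal | {ordinal}    # successor: add the ordinal itself
    return ordinal

zero, one, two, three = (von_neumann(k) for k in range(4))

print(three == frozenset({zero, one, two}))  # 3 is the set {0, 1, 2}
print(len(three))                            # with three elements
print(two <= three)                          # "less than or equal" is "subset"
print(two in three)                          # and "less than" is "element of"
```

Python’s `<=` on frozensets is the subset test, so the last two lines check the ordering exactly as described above.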

What comes after all the finite ordinals? Well, the set of all finite ordinals is itself well-ordered:

$\{0, 1, 2, 3, \dots\}$

So, there’s an ordinal corresponding to this, and it’s the first *infinite* ordinal. It’s usually called $\omega,$ pronounced ‘omega’. Using the cute trick I mentioned, we can actually define

$\omega = \{0, 1, 2, 3, \dots\}$

What comes after this? Well, it turns out there’s a well-ordered set

$\{0, 1, 2, 3, \dots, \omega\}$

containing the finite ordinals together with $\omega,$ with the obvious notion of "less than": $\omega$ is bigger than the rest. Corresponding to this set there’s an ordinal called

$\omega + 1$

As usual, we can simply define

$\omega + 1 = \{0, 1, 2, 3, \dots, \omega\}$

At this point you could be confused if you know about cardinals, so let me throw in a word of reassurance. The sets $\omega$ and $\omega + 1$ have the same cardinality: they are both countable. In other words, you can find a 1-1 and onto function between these sets. But $\omega$ and $\omega + 1$ are different as ordinals, since you can’t find a 1-1 and onto function between them that *preserves the ordering*. This is easy to see, since $\omega + 1$ has a biggest element while $\omega$ does not.

Indeed, all the ordinals in this series of posts will be countable! So for the infinite ones, you can imagine that all I’m doing is taking your favorite countable set and well-ordering it in ever more sneaky ways.

Okay, so we got to $\omega + 1.$ What comes next? Well, not surprisingly, it’s

$\omega + 2$

Then comes

$\omega + 3, \; \omega + 4, \; \omega + 5, \; \dots$

and so on. You get the idea.

I haven’t really defined ordinal addition in general. I’m trying to keep things fun, not like a textbook. But you can read about it here:

• Wikipedia, Ordinal arithmetic: addition.

The main surprise is that ordinal addition is not commutative. We’ve seen that $\omega + 1 \ne \omega,$ since

$\omega + 1 = \{0, 1, 2, 3, \dots, \omega\}$

is an infinite list of things… *and then one more thing that comes after all those!* But $1 + \omega = \omega,$ because one thing followed by a list of infinitely many more is just a list of infinitely many things.

With ordinals, it’s not just about quantity: the order matters!
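Here is a hedged Python sketch of how this asymmetry looks in a computation. It is not the official definition, just the standard rule for adding ordinals written in Cantor normal form: when you add, every term of the left-hand ordinal that is smaller than the leading term of the right-hand one gets absorbed. The encoding (tuples of (exponent, coefficient) pairs with decreasing exponents, where exponents are again such tuples) and the names `nat`, `OMEGA` and `add` are assumptions made for illustration.

```python
# Assumed encoding (illustration only): an ordinal below epsilon_0 is a tuple
# of (exponent, coefficient) pairs in Cantor normal form, exponents decreasing.
# Built-in tuple comparison then agrees with the ordering of ordinals.
ZERO = ()

def nat(n):
    """The finite ordinal n, written as omega^0 * n."""
    return ((ZERO, n),) if n > 0 else ZERO

ONE = nat(1)
OMEGA = ((ONE, 1),)  # omega^1 * 1

def add(a, b):
    """Ordinal sum a + b for ordinals in the encoding above."""
    if b == ZERO:
        return a
    lead_exp, lead_coeff = b[0]
    # Terms of a with a smaller exponent than b's leading term are absorbed.
    kept = tuple(term for term in a if term[0] > lead_exp)
    # A term of a with the same exponent merges into b's leading coefficient.
    merged = lead_coeff + sum(c for e, c in a if e == lead_exp)
    return kept + ((lead_exp, merged),) + b[1:]

one_plus_omega = add(ONE, OMEGA)
omega_plus_one = add(OMEGA, ONE)
print(one_plus_omega == OMEGA)   # True: the 1 in front is swallowed
print(omega_plus_one == OMEGA)   # False: the 1 at the end survives
print(omega_plus_one > OMEGA)    # True: omega + 1 really is bigger
```

On finite ordinals the same function is ordinary addition, so `add(nat(2), nat(3))` gives `nat(5)`.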

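Okay, so we’ve seen these ordinals:

$0, 1, 2, 3, \dots, \omega, \omega + 1, \omega + 2, \omega + 3, \dots$

What next?

Well, the ordinal after all these is called $\omega + \omega.$ People often call it "omega times 2" or $\omega 2$ for short. So,

$\omega 2 = \{0, 1, 2, \dots, \omega, \omega + 1, \omega + 2, \dots\}$

It would be fun to have a book with $\omega$ pages, each page half as thick as the previous page. You can tell a nice long story with an $\omega$-sized book. I think you can imagine this. And if you put one such book next to another, that’s a nice picture of $\omega 2.$

It’s worth noting that $\omega 2$ is not the same as $2 \omega.$ We have

$\omega 2 = \omega + \omega$

while

$2 \omega = 2 + 2 + 2 + \cdots$

where we add $\omega$ of these terms. But

$2 + 2 + 2 + \cdots = (1 + 1) + (1 + 1) + (1 + 1) + \cdots = \omega$

so

$2 \omega = \omega$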

This is not a proof, because I haven’t given you the official definition of how to multiply ordinals. You can find it here:

• Wikipedia, Ordinal arithmetic: multiplication.

Using this you can prove that what I’m saying is true. Nonetheless, I hope you see why what I’m saying might make sense. Like ordinal addition, ordinal multiplication is not commutative! If you don’t like this, you should study cardinals instead.
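If you would rather see the asymmetry computed than proved, here is a Python sketch under the same kind of assumptions as before: ordinals below $\epsilon_0$ are encoded in Cantor normal form as tuples of (exponent, coefficient) pairs, and the names `nat`, `add` and `mul` are hypothetical. The multiplication rule used is the standard one for Cantor normal forms: multiplying on the right by a finite number scales the leading coefficient, while multiplying on the right by a power of omega keeps only the leading exponent and adds to it.

```python
# Assumed encoding (illustration only): an ordinal below epsilon_0 is a tuple
# of (exponent, coefficient) pairs in Cantor normal form, exponents decreasing.
ZERO = ()

def nat(n):
    """The finite ordinal n, written as omega^0 * n."""
    return ((ZERO, n),) if n > 0 else ZERO

ONE = nat(1)
OMEGA = ((ONE, 1),)

def add(a, b):
    """Ordinal sum: smaller terms of a are absorbed by b's leading term."""
    if b == ZERO:
        return a
    lead_exp, lead_coeff = b[0]
    kept = tuple(term for term in a if term[0] > lead_exp)
    merged = lead_coeff + sum(c for e, c in a if e == lead_exp)
    return kept + ((lead_exp, merged),) + b[1:]

def mul(a, b):
    """Ordinal product a * b, distributing over the terms of b from the left."""
    if a == ZERO or b == ZERO:
        return ZERO
    a_exp, a_coeff = a[0]
    result = ZERO
    for exp, coeff in b:
        if exp == ZERO:
            # Finite factor: scale the leading coefficient, keep the tail of a.
            term = ((a_exp, a_coeff * coeff),) + a[1:]
        else:
            # Infinite factor: only the leading exponent of a matters.
            term = ((add(a_exp, exp), coeff),)
        result = add(result, term)
    return result

two_omega = mul(nat(2), OMEGA)   # 2 + 2 + 2 + ...
omega_two = mul(OMEGA, nat(2))   # omega + omega
print(two_omega == OMEGA)
print(omega_two == add(OMEGA, OMEGA))
print(omega_two > two_omega)
```

All three lines print `True`: multiplying by 2 on the left does nothing to omega, while multiplying by 2 on the right gives two copies of omega laid end to end.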

What next? Well, then comes

$\omega 2 + 1, \; \omega 2 + 2, \; \dots$

and so on. But you probably have the hang of this already, so we can skip right ahead to

$\omega 3$

In fact, you’re probably ready to skip right ahead to $\omega 4$ and $\omega 5$ and so on.

In fact, I bet now you’re ready to skip all the way to "omega times omega", or $\omega^2$ for short:

Suppose you had an encyclopedia with $\omega$ volumes, each one being a book with $\omega$ pages. If each book is twice as thin as the one before, you’ll have $\omega^2$ pages, and it can still fit in one bookshelf! Here’s the idea:

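What comes next? Well, we have

$\omega^2 + 1, \omega^2 + 2, \dots$

and so on, and after all these come

$\omega^2 + \omega, \omega^2 + \omega + 1, \omega^2 + \omega + 2, \dots$

and so on, and eventually

$\omega^2 + \omega^2 = \omega^2 2$

and then a bunch more, and then

$\omega^2 3$

and then a bunch more, and then

$\omega^2 4$

and then a bunch more, and more, and eventually

$\omega^2 \omega = \omega^3$

You can probably imagine a bookcase containing $\omega$ encyclopedias, each with $\omega$ volumes, each with $\omega$ pages, for a total of $\omega^3$ pages.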

I’ve been skipping more and more steps to keep you from getting bored. I know you have plenty to do and can’t spend an *infinite* amount of time reading this, even if the subject is infinity.

So if you don’t mind me just mentioning some of the high points, there are guys like $\omega^4$ and $\omega^5$ and so on, and after all these comes

$\omega^\omega$

Let’s try to imagine this! First, imagine a book with $\omega$ pages. Then imagine an encyclopedia of books like this, with $\omega$ volumes. Then imagine a bookcase containing $\omega$ encyclopedias like this. Then imagine a room containing $\omega$ bookcases like this. Then imagine a floor with $\omega$ rooms like this. Then imagine a library with $\omega$ floors like this. Then imagine a city with $\omega$ libraries like this. And so on, *ad infinitum*.

You have to be a bit careful here, or you’ll be imagining an *uncountable* number of pages. To name a particular page in this universe, you have to say something like this:

the 23rd page of the 107th book of the 20th encyclopedia in the 7th bookcase in the 0th room on the 1000th floor of the 973rd library in the 6th city on the 0th continent on the 0th planet in the 0th solar system in the…

But it’s crucial that after some finite point you keep saying “the 0th”. Without that restriction, there would be uncountably many pages! This is just one of the rules for how ordinal exponentiation works. For the details, read:

• Wikipedia, Ordinal arithmetic: exponentiation.

As they say,

But for infinite exponents, the definition may not be obvious.
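The rule about eventually saying “the 0th” can be made concrete. Here is a minimal sketch, assuming we record a page’s address as a tuple of natural numbers listed from the page number outward (page, book, encyclopedia, …), with the infinitely many trailing zeros left implicit. The names `normalize` and `comes_before` are made up for this illustration.

```python
def normalize(address):
    """Drop trailing zeros: they stand for 'the 0th ... of the 0th ...' forever."""
    digits = list(address)
    while digits and digits[-1] == 0:
        digits.pop()
    return tuple(digits)


def comes_before(a, b):
    """True if page a comes strictly before page b in the omega^omega ordering."""
    a, b = normalize(a), normalize(b)
    if len(a) != len(b):
        # The address that reaches a higher-level nonzero container comes later.
        return len(a) < len(b)
    # Same depth: compare the outermost container first, the page number last.
    return a[::-1] < b[::-1]


# Page 1,000,000 of book 0 still comes before page 0 of book 1.
print(comes_before((1_000_000,), (0, 1)))   # True
# Page 3 of book 2 comes after page 7 of book 1.
print(comes_before((3, 2), (7, 1)))         # False
```

Since every address is a finite tuple of natural numbers, there are only countably many of them, which is the point of the restriction.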

Here’s a picture of $\omega^\omega$ taken from David Madore’s wonderful interactive webpage:

On his page, if you click on any of the labels for an initial portion of an ordinal, the picture will expand to show that portion!

And here’s another picture, where each turn of the clock’s hand takes you to a higher power of $\omega$:

Okay, so we’ve reached $\omega^\omega$. Now what?

Well, then comes $\omega^\omega + 1$ and so on, but I’m sure that’s boring by now. And then come ordinals like

leading up to

Then eventually come ordinals like

and so on, leading up to

This actually reminds me of something that happened driving across South Dakota one summer with a friend of mine. We were in college, so we had the summer off, so we drove across the country. We drove across South Dakota all the way from the eastern border to the west on Interstate 90.

This state is huge — about 600 kilometers across, and most of it is really flat, so the drive was really boring. We kept seeing signs for a bunch of tourist attractions on the western edge of the state, like the Badlands and Mt. Rushmore — a mountain that they carved to look like faces of presidents, just to give people some reason to keep driving.

Anyway, I’ll tell you the rest of the story later — I see some more ordinals coming up:

We’re really whizzing along now just to keep from getting bored — just like my friend and I did in South Dakota. You might fondly imagine that we had fun trading stories and jokes, like they do in road movies. But we were driving all the way from Princeton to my friend Chip’s cabin in California. By the time we got to South Dakota, we were all out of stories and jokes.

Hey, look! It’s

That was cool. Then comes

and so on.

Anyway, back to my story. For the first half of our trip across the state, we kept seeing signs for something called the South Dakota Tractor Museum.

Oh, wait, here’s an interesting ordinal:

Let’s stop and take a look:

That was cool. Okay, let’s keep driving. Here comes

and then

and then

and eventually

and eventually

and then

and eventually

After a while we reach

and then

and then

and then

and then

and then

and eventually

This is pretty boring; we’re already going infinitely fast, but we’re still just picking up speed, and it’ll take a while before we reach something interesting.

Anyway, we started getting really curious about this South Dakota Tractor Museum, which sounded sort of funny. It took 250 kilometers of driving before we passed it. We wouldn’t normally care about a tractor museum, but there was really nothing else to think about while we were driving. The only things to see were fields of grain, and these signs, which kept building up the suspense, saying things like

ONLY 100 MILES TO THE SOUTH DAKOTA TRACTOR MUSEUM!

We’re zipping along really fast now:

What comes after all these?

At this point we need to stop for gas. Our notation for ordinals just ran out!

The ordinals don’t stop; it’s just our notation that fizzled out. The set of all ordinals listed up to now, including all the ones we zipped past, is a well-ordered set called

$\epsilon_0$

or "epsilon-nought". This has the amazing property that

$\epsilon_0 = \omega^{\epsilon_0}$

And it’s the smallest ordinal with this property! It looks like this:

It’s an amazing fact that every countable ordinal is isomorphic, as a well-ordered set, to some subset of the real line. David Madore took advantage of this to make his pictures.

I’ll tell you the rest of my road story later. For now let me conclude with a bit of math.

There’s a nice notation for all ordinals less than $\epsilon_0$ called ‘Cantor normal form’. We’ve been seeing lots of examples. Here is a typical ordinal in Cantor normal form:

The idea is that you write it out using just + and exponentials and 1 and $\omega$.

Here is the theorem that justifies Cantor normal form:

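**Theorem.** Every ordinal $\alpha$ can be uniquely written as

$\alpha = \omega^{\gamma_1} c_1 + \omega^{\gamma_2} c_2 + \cdots + \omega^{\gamma_n} c_n$

where $n$ is a natural number, $c_1, \dots, c_n$ are positive integers, and $\gamma_1 > \gamma_2 > \cdots > \gamma_n \ge 0$ are ordinals.

It’s like writing ordinals in base $\omega$.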

Note that *every* ordinal can be written this way! So why did I say that Cantor normal form is nice notation for ordinals less than $\epsilon_0$? Here’s the problem: the Cantor normal form of $\epsilon_0$ is

$\epsilon_0 = \omega^{\epsilon_0}$

So, when we hit $\epsilon_0$, the exponents can be as big as the ordinal we’re trying to describe! So, while the Cantor normal form still *exists* for ordinals $\ge \epsilon_0$, it doesn’t give a good notation for them unless we *already* have some notation for ordinals this big!

This is what I mean by a notation ‘fizzling out’. We’ll keep seeing this problem in the posts to come.

But for an ordinal $\alpha$ less than $\epsilon_0$, something nice happens. In this case, when we write

$\alpha = \omega^{\gamma_1} c_1 + \omega^{\gamma_2} c_2 + \cdots + \omega^{\gamma_n} c_n$

all the exponents $\gamma_1, \dots, \gamma_n$ are less than $\alpha$. So we can go ahead and write *them* in Cantor normal form, and so on… and because ordinals are well-ordered, this process ends *after finitely many steps*.

So, Cantor normal form gives a nice way to write any ordinal less than $\epsilon_0$ using finitely many symbols! If we abbreviate $\omega^0$ as $1$ and write multiplication by positive integers in terms of addition, we get expressions like this:

They look like trees. Even better, you can write a computer program that does ordinal arithmetic for ordinals of this form: you can add, multiply, and exponentiate them, and tell when one is less than another.
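To make that less abstract, here is a minimal sketch of such a program in Python. The representation is my own choice for illustration, not a standard library: an ordinal below $\epsilon_0$ is a tuple of (exponent, coefficient) pairs with strictly decreasing exponents, where each exponent is itself an ordinal of the same kind. The names `nat`, `compare`, `add`, `mul` and `show` are all hypothetical, and exponentiation is left out to keep things short.

```python
ZERO = ()                       # the empty sum


def nat(n):
    """The finite ordinal n, stored as omega^0 * n."""
    return ((ZERO, n),) if n > 0 else ZERO


ONE = nat(1)
OMEGA = ((ONE, 1),)             # omega^1 * 1


def compare(a, b):
    """Return -1, 0 or 1 according to whether a < b, a == b or a > b."""
    for (e1, c1), (e2, c2) in zip(a, b):
        c = compare(e1, e2)     # recursion ends: exponents are smaller trees
        if c != 0:
            return c
        if c1 != c2:
            return -1 if c1 < c2 else 1
    # All shared terms agree, so the longer expansion is the larger ordinal.
    return (len(a) > len(b)) - (len(a) < len(b))


def add(a, b):
    """Ordinal sum a + b."""
    if not b:
        return a
    lead_exp, lead_coeff = b[0]
    # Terms of a below the leading term of b get absorbed.
    kept = [(e, c) for (e, c) in a if compare(e, lead_exp) >= 0]
    if kept and compare(kept[-1][0], lead_exp) == 0:
        e, c = kept.pop()
        return tuple(kept) + ((e, c + lead_coeff),) + b[1:]
    return tuple(kept) + b


def mul(a, b):
    """Ordinal product a * b, distributing over the terms of b."""
    if not a or not b:
        return ZERO
    e1, c1 = a[0]
    result = ZERO
    for f, d in b:
        if f == ZERO:           # multiplying by a positive integer d
            term = ((e1, c1 * d),) + a[1:]
        else:                   # multiplying by omega^f * d with f > 0
            term = ((add(e1, f), d),)
        result = add(result, term)
    return result


def show(a):
    """Readable form, writing w for omega."""
    if not a:
        return "0"
    parts = []
    for e, c in a:
        if e == ZERO:
            parts.append(str(c))
        else:
            base = "w" if e == ONE else f"w^({show(e)})"
            parts.append(base if c == 1 else f"{base}*{c}")
    return " + ".join(parts)


print(show(add(ONE, OMEGA)))    # w
print(show(add(OMEGA, ONE)))    # w + 1
print(show(mul(nat(2), OMEGA))) # w
print(show(mul(OMEGA, nat(2)))) # w*2
```

The four printed lines are the non-commutativity examples from earlier: $1 + \omega = \omega$ but $\omega + 1 \ne \omega$, and $2 \omega = \omega$ but $\omega 2 = \omega + \omega$.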

So, there’s really no reason to be scared of $\epsilon_0$. Remember, each ordinal is just the set of all smaller ordinals. So you can think of $\epsilon_0$ as the set of tree-shaped expressions like the one above, with a particular rule for saying when one is less than another. It’s a perfectly reasonable entity. For some real excitement, we’ll need to move on to larger ordinals. We’ll do that next time.

For more, see:

• Wikipedia, Cantor normal form.

]]>

Here is a mathematical riddle. Consider the function below, which is undefined for negative values, sends zero to one, and sends positive values to zero. Can you come up with a nice compact formula for this function, which uses only the basic arithmetic operations, such as addition, division and powers? You can’t use any special functions, including things like sign and step functions, which are by definition discontinuous.

In college, I ran around showing people the graph, asking them to guess the formula. I even tried it out on some professors there, at U. Penn. My algebra prof, who was kind of intimidating, looked at it, got puzzled, and then got irritated. When I showed him the answer, he barked out: “Is this exam over??!” Then I tried it out during office hours on E. Calabi, who was teaching undergraduate differential geometry. With a twinkle in his eye, he said, “Why, that’s zero to the x!”

The graph of 0^{x} is not without controversy. It is reasonable that for positive x, we have that 0^{x} is zero. Then 0^{-x} = 1/0^{x} = 1/0, so the function is undefined for negative values. But what about 0^{0}? This question is bound to come up in the course of one’s general mathematical education, and has been the source of long, ruminative arguments.

There are three contenders for 0^{0}: undefined, 0, and 1. Let’s try to define it in a way that is most consistent with the general laws of exponents — in particular, that for all a, x and y, a^{x+y} = a^{x} a^{y}, and a^{-x} = 1/a^{x}. Let’s stick to these rules, even when a, x and y are all zero.

Then 0^{0} equals its own square, because 0^{0} = 0^{0 + 0} = 0^{0} 0^{0}, so it can only be 0 or 1. And it equals its reciprocal, because 0^{0} = 0^{-0} = 1/0^{0}, which rules out 0. By these criteria, 0^{0} equals 1.

That is the justification for the above graph — and for the striking discontinuity that it contains.
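As it happens, Python’s built-in power operator follows the same convention, so the riddle function can be tabulated directly. This is just a small sketch; the helper name `zero_to_the` is made up, and it returns `None` where Python refuses to compute the power.

```python
def zero_to_the(x):
    """Return 0**x, or None where the power is undefined (negative x)."""
    try:
        return 0.0 ** x
    except ZeroDivisionError:
        # Python raises this for zero raised to a negative power.
        return None


samples = [-2, -0.5, 0, 0.5, 2]
values = [zero_to_the(x) for x in samples]
print(values)   # [None, None, 1.0, 0.0, 0.0]

# The two criteria from the text: the value at 0 is its own square
# and its own reciprocal.
v = zero_to_the(0)
print(v == v * v, v == 1 / v)   # True True
```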

Here is an intuition for the discontinuity. Consider the family of exponential curves b^{x}, with b as the parameter. When b = 1, you get the constant function 1. When b is more than 1, you get an increasing exponential, and when it is between 0 and 1, you get a decreasing exponential. The intersection of all of these graphs is the “pivot” point x = 0, y = 1. That is the “dot” of discontinuity.

What happens to b^{x}, as b decreases to zero? To the right of the origin, the curve progressively flattens down to zero. To the left it rises up towards infinity more and more steeply. But it always crosses through the point x = 0, y = 1, which remains in the limiting curve. In heuristic terms, the value y = 1 is the discontinuous transit from infinitesimal values to infinite values.

There are reasons, however, why 0^{0} could be treated as indeterminate, and left undefined. These were indicated by the good professor.

Dr. Calabi had a truly inspiring teaching style, back in the day. He spoke of Italian paintings, and showed a kind of geometric laser vision. In the classroom, he showed us the idea of torsion using his arms to fly around the room like an airplane. There’s even a manifold named after him, the Calabi-Yau manifold.

He went on to talk about the underpinnings of this quirky function. First he drew attention to the function f(x,y) = x^{y}, over the *complex* domain, and attempted to sketch its level sets. He focused on the behavior of the function when x and y are close to zero. Then he stated that every one of the level sets comes arbitrarily close to (0,0).

This means that x^{y} has a *wild singularity* at the origin: *every* complex number z is the limit of x^{y} along some path to the origin. Indeed, write L(z) for the level set of all points (x,y) where x^{y} = z. To reach z, just take a path in L(z) that approaches (0,0).

To see why the level sets all approach the origin, take logs, to get ln(x^{y}) = y ln(x) = ln(z). That gives y = ln(z) / ln(x), which is a parametric formula for L(z). As x goes to zero, ln(x) goes to negative infinity, so y goes to zero. These are paths (x, ln(z)/ln(x)), completely within L(z), which approach the origin.
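Here is a quick numerical check of this, as a sketch only. It assumes a nonzero target z (chosen arbitrarily), lets x run down the positive real axis, and uses the principal branch of the logarithm from Python’s `cmath` module. The function name `y_on_level_set` is made up.

```python
import cmath


def y_on_level_set(z, x):
    """The y with x**y == z, using the principal branch of the logarithm."""
    return cmath.log(z) / cmath.log(x)


z = 2 + 3j                      # arbitrary nonzero target value
sizes, errors = [], []
for x in [1e-2, 1e-8, 1e-32, 1e-128]:
    y = y_on_level_set(z, x)
    value = cmath.exp(y * cmath.log(x))     # x**y on the principal branch
    sizes.append(abs(y))
    errors.append(abs(value - z))
    print(f"x = {x:g}   |y| = {abs(y):.4f}   |x^y - z| = {abs(value - z):.1e}")
```

The printed |y| shrinks toward zero while x^{y} stays at z, though only logarithmically fast: x has to get extremely small before y is small.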

In making these statements, we need to keep in mind that x^{y} is *multi-valued*. That’s because x^{y} = e^{y ln(x)}, and ln(x) is multi-valued. That is because ln(x) is the inverse of the complex exponential, which is many-to-one: adding any integer multiple of 2πi to z leaves e^{z} unchanged. And that follows from the definition of the exponential, which sends a + bi to the complex number with magnitude e^{a} and phase b.

Footnote: to visualize these operations, represent the complex numbers by the real plane. Addition is given by vector addition. Multiplication gives the vector with magnitude equal to the product of the magnitudes, and phase equal to the sum of the phases. The positive real numbers have phase zero, and the positive imaginary numbers are at 90 degrees vertical, with phase π/2.

For a specific (x,y), how many values does x^{y} have? Well, ln(x) has a countable number of values, all differing by integer multiples of 2πi. This generally induces a countable number of values for x^{y}. But if y is rational, they collapse down to a finite set. When y = 1/n, for example, the values of y ln(x) are spaced apart by 2πi/n, and when these get pumped back through the exponential function, we find only n distinct values for x^{1/n}: they are the nth roots of x.
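The collapse for rational y is easy to see numerically. The sketch below samples a finite window of branches of the logarithm and counts the distinct values of x^{y} that come out, rounding to 9 decimals so that floating-point noise does not create spurious duplicates. The function name `power_values` and the window of 41 branches are arbitrary choices for illustration.

```python
import cmath


def power_values(x, y, branches=range(-20, 21)):
    """Distinct values of exp(y * (Log x + 2*pi*i*k)) over the sampled k."""
    principal = cmath.log(x)
    seen = set()
    for k in branches:
        w = cmath.exp(y * (principal + 2j * cmath.pi * k))
        seen.add((round(w.real, 9), round(w.imag, 9)))
    return seen


# y = 1/4: the 41 sampled branches collapse to the four 4th roots of 16.
print(len(power_values(16, 0.25)))      # 4
# y = sqrt(2): every sampled branch gives a different value.
print(len(power_values(16, 2 ** 0.5)))  # 41
```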

So, to speak of the limit of x^{y} along a path, and of the partition of ℂ^{2} into level sets, we need to work within a branch of x^{y}. Each branch induces a different partition of ℂ^{2}. But for every one of these partitions, it holds true that all of the level sets approach the origin. That follows from the formula for the level set L(z), which is y = ln(z) / ln(x). As x goes to zero, every branch of ln(x) goes to negative infinity. (Exercise: why?) So y also goes to zero. The branch affects the shape of the paths to the origin, but not their existence.

Here is a qualitative description of how the level sets fit together: they are like spokes around the origin, where each spoke is a curve in one complex dimension. These curves are 1-D complex manifolds, which are equivalent to two-dimensional surfaces in ℝ^{4}. The partition comprises a two-parameter family of these surfaces, indexed by the complex value of x^{y}.

What can be said about the geometry and topology of this “wheel of manifolds”? We know they don’t intersect. But are they “nicely” layered, or twisted and entangled? As we zoom in on the origin, does the picture look smooth, or does it have a chaotic appearance, with infinite fine detail? Suggestive of chaos is the fact that the gradient

∇(x^{y}) = (y x^{y-1}, x^{y} ln(x))

is also “wildly singular” at the origin.

These questions can be explored with plotting software. Here, the artist would have the challenge of having only two dimensions to work with, when the “wheel” is really a structure in four-dimensional space. So some interesting cross-sections would have to be chosen.

Exercises:

• Speak about the function b^{x}, where b is negative, and x is real.

• What is , and why?

• What is ?

Moral: something that seems odd, or like a joke that might annoy your algebra prof, could be more significant than you think. So tell these riddles to your professors, while they are still around.

]]>