Information Geometry (Part 12)

24 June, 2012

Last time we saw that if a population evolves toward an ‘evolutionarily stable state’, then the amount of information our population has ‘left to learn’ can never increase! It must always decrease or stay the same.

This result sounds wonderful: it’s a lot like the second law of thermodynamics, which says entropy must always increase. Of course there are some conditions for this wonderful result to hold. The main condition is that the population evolves according to the replicator equation. But the other is the existence of an evolutionarily stable state. Last time I wrote down the rather odd-looking definition of ‘evolutionarily stable state’ without justifying it. I need to do that soon. But if you’ve never thought about evolutionary game theory, I think giving you a little background will help. So today let me try that.

Evolutionary game theory

We’ve been thinking of evolution as similar to inference or learning. In this analogy, organisms are like ‘hypotheses’, and the population ‘does experiments’ to see if these hypotheses make ‘correct predictions’ (i.e., can reproduce) or not. The successful ones are reinforced while the unsuccessful ones are weeded out. As a result, the population ‘learns’. And under the conditions of the theorem we discussed last time, the relative information—the amount ‘left to learn’—goes down!
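To make this concrete, here’s a minimal numerical sketch (with a made-up 2×2 payoff matrix whose evolutionarily stable state is the mixed population q = (1/2, 1/2)): it integrates the replicator equation by Euler steps and prints the relative information D(q‖x), which should decrease toward zero as the population ‘learns’.

```python
import numpy as np

# Hawk-Dove-style payoff matrix (made-up numbers) whose unique
# evolutionarily stable state is the interior point q = (1/2, 1/2).
A = np.array([[0.0, 2.0],
              [2.0, 0.0]])
q = np.array([0.5, 0.5])

def relative_information(q, x):
    """D(q||x): the amount the population has 'left to learn'."""
    return float(np.sum(q * np.log(q / x)))

x = np.array([0.9, 0.1])    # initial population mix
dt = 0.01
for step in range(2001):
    if step % 500 == 0:
        print(f"t={step*dt:5.2f}  x={x}  D(q||x)={relative_information(q, x):.6f}")
    fitness = A @ x                            # fitness of each type
    x = x + dt * x * (fitness - x @ fitness)   # replicator equation
    x = x / x.sum()                            # guard against numerical drift
```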

While you might object to various points of this analogy, it’s useful—and that’s really all you can ask of an analogy. It’s useful because it lets us steal chunks of math from the subjects of Bayesian inference and machine learning and apply them to the study of biodiversity and evolution! This is what Marc Harper has been doing:

• Marc Harper, Information geometry and evolutionary game theory.

• Marc Harper, The replicator equation as an inference dynamic.

But now let’s bring in another analogy, also contained in Harper’s work. We can also think of evolution as similar to a game. In this analogy, organisms are like ‘strategies’—or if you prefer, they have strategies. The winners get to reproduce, while the losers don’t. John Maynard Smith started developing this analogy in 1973, and eventually wrote a whole book on it:

• John Maynard Smith, Evolution and the Theory of Games, Cambridge University Press, 1982.

As far as I can tell, evolutionary game theory has brought almost as many chunks of math to game theory as it has taken from it. Maybe it’s just my ignorance showing, but it seems that game theory becomes considerably deeper when we think about games that many players play again and again, with the winners getting to reproduce, while the losers are eliminated.

According to William Sandholm:

The birth of evolutionary game theory is marked by the publication of a series of papers by mathematical biologist John Maynard Smith. Maynard Smith adapted the methods of traditional game theory, which were created to model the behavior of rational economic agents, to the context of biological natural selection. He proposed his notion of an evolutionarily stable strategy (ESS) as a way of explaining the existence of ritualized animal conflict.

Maynard Smith’s equilibrium concept was provided with an explicit dynamic foundation through a differential equation model introduced by Taylor and Jonker. Schuster and Sigmund, following Dawkins, dubbed this model the replicator dynamic, and recognized the close links between this game-theoretic dynamic and dynamics studied much earlier in population ecology and population genetics. By the 1980s, evolutionary game theory was a well-developed and firmly established modeling framework in biology.

Towards the end of this period, economists realized the value of the evolutionary approach to game theory in social science contexts, both as a method of providing foundations for the equilibrium concepts of traditional game theory, and as a tool for selecting among equilibria in games that admit more than one. Especially in its early stages, work by economists in evolutionary game theory hewed closely to the interpretation set out by biologists, with the notion of ESS and the replicator dynamic understood as modeling natural selection in populations of agents genetically programmed to behave in specific ways. But it soon became clear that models of essentially the same form could be used to study the behavior of populations of active decision makers. Indeed, the two approaches sometimes lead to identical models: the replicator dynamic itself can be understood not only as a model of natural selection, but also as one of imitation of successful opponents.

While the majority of work in evolutionary game theory has been undertaken by biologists and economists, closely related models have been applied to questions in a variety of fields, including transportation science, computer science, and sociology. Some paradigms from evolutionary game theory are close relatives of certain models from physics, and so have attracted the attention of workers in this field. All told, evolutionary game theory provides a common ground for workers from a wide range of disciplines.

The Prisoner’s Dilemma

In game theory, the most famous example is the Prisoner’s Dilemma. In its original form, this ‘game’ is played just once:

Two men are arrested, but the police don’t have enough information to convict them. So they separate the two men, and offer both the same deal: if one testifies against his partner (or defects), and the other remains silent (and thus cooperates with his partner), the defector goes free and the cooperator goes to jail for 12 months. If both remain silent, both are sentenced to only 1 month in jail for a minor charge. If they both defect, they both receive a 3-month sentence. Each prisoner must choose either to defect or cooperate with his partner in crime; neither gets to hear what the other decides. What will they do?

Traditional game theory emphasizes the so-called ‘Nash equilibrium’ for this game, in which both prisoners defect. Why don’t they both cooperate? They’d both be better off if they both cooperated. However, for them to both cooperate is ‘unstable’: either one could shorten their sentence by defecting! By definition, a Nash equilibrium has the property that neither player can improve his situation by unilaterally changing his strategy.
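If you’d like to see that checked mechanically, here’s a tiny sketch that enumerates all four strategy pairs, scoring each player by the jail sentences above (fewer months is better), and confirms that mutual defection is the only Nash equilibrium:

```python
# Payoffs as months in jail (lower is better), indexed by
# (my move, partner's move); 'C' = cooperate (stay silent), 'D' = defect.
months = {
    ('C', 'C'): 1,  ('C', 'D'): 12,
    ('D', 'C'): 0,  ('D', 'D'): 3,
}

moves = ['C', 'D']
for a in moves:
    for b in moves:
        # (a, b) is a Nash equilibrium if neither player can cut
        # their own sentence by unilaterally switching moves.
        a_ok = all(months[(a, b)] <= months[(x, b)] for x in moves)
        b_ok = all(months[(b, a)] <= months[(x, a)] for x in moves)
        if a_ok and b_ok:
            print(f"Nash equilibrium: {(a, b)}")   # prints only ('D', 'D')
```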

In the Prisoner’s Dilemma, the Nash equilibrium is not very nice: both parties would be happier if they’d only cooperate. That’s why it’s called a ‘dilemma’. Perhaps the most tragic example today is global warming. Even though all players would be better off if everyone cooperated to reduce carbon emissions, each individual player is better off still if everyone else cooperates while they alone keep emitting carbon.

For this and many other reasons, people have been interested in ‘solving’ the Prisoner’s Dilemma: that is, finding reasons why cooperation might be favored over defection.

This book got people really excited about what evolutionary game theory has to say about the Prisoner’s Dilemma:

• Robert Axelrod, The Evolution of Cooperation, Basic Books, New York, 1984. (A related article with the same title is available online.)

The idea is that under certain circumstances, strategies that are ‘nicer’ than defection will gradually take over. The most famous of these strategies is ‘tit for tat’, meaning that you cooperate the first time and after that do whatever your opponent just did. I won’t go into this further, because it’s a big digression and I’m already digressing too far. I’ll just mention that from the outlook of evolutionary game theory, the Prisoner’s Dilemma is still full of surprises. Just this week, some fascinating new work has been causing a stir:

• William Press and Freeman Dyson, Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent, Edge, 18 June 2012.

I hope I’ve succeeded in giving you a vague superficial sense of the history of evolutionary game theory and why it’s interesting. Next time I’ll get serious about the task at hand, which is to understand ‘evolutionarily stable strategies’. If you want to peek ahead, try this nice paper:

• William H. Sandholm, Evolutionary game theory, 12 November 2007.

This is where I got the long quote by Sandholm on the history of evolutionary game theory. The original quote contained lots of references; if you’re interested in those, go to page 3 of this paper.


The Mathematics of Biodiversity (Part 2)

24 June, 2012

How likely is it that the next thing we see is one of a brand new kind? That sounds like a hard question. Last time I told you about the Good–Turing rule for answering this question.

The discussion that blog entry triggered has been very helpful! Among other things, it got Lou Jost more interested in this subject. Two days ago, he showed me the following simple argument for the Good–Turing estimate.

Suppose there are finitely many species of orchid. Suppose the fraction of orchids belonging to the ith species is p_i.

Suppose we start collecting orchids. Suppose each time we find one, the chance that it’s an orchid of the ith species is p_i. Of course this is not true in reality! For example, it’s harder to find a tiny orchid, like this:

than a big one. But never mind.

Say we collect a total of N orchids. What is the probability that we find no orchids of the ith species? It is

(1 - p_i)^N

Similarly, the probability that we find exactly one orchid of the ith species is

N p_i (1 - p_i)^{N-1}

And so on: these are the first two terms in a binomial series.

Let n_1 be the expected number of singletons: species for which we find exactly one orchid of that species. Then

\displaystyle{ n_1 = \sum_i N p_i (1 - p_i)^{N-1} }

Let D be the coverage deficit: the expected fraction of the total population consisting of species that remain undiscovered. Given our assumptions, this is the same as the chance that the next orchid we find will be of a brand new species.

Then

\displaystyle{ D = \sum_i p_i (1-p_i)^N }

since p_i is the fraction of orchids belonging to the ith species and (1-p_i)^N is the chance that this species remains undiscovered.

Lou Jost pointed out that the formulas for n_1 and D are very similar! In particular,

\displaystyle{ \frac{n_1}{N} = \sum_i p_i (1 - p_i)^{N-1} }

should be very close to

\displaystyle{ D = \sum_i p_i (1 - p_i)^N }

when N is large. So, we should have

\displaystyle{ D \approx \frac{n_1}{N} }

In other words: the chance that the next orchid we find is of a brand new species should be close to the fraction of orchids that are singletons now.
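Here’s a quick Monte Carlo check of this argument (the species abundances p_i are made up): sample N orchids many times, and compare the average coverage deficit D with the average fraction of singletons n_1/N.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical community: 1000 orchid species with a skewed
# abundance distribution p_i.
S = 1000
p = rng.dirichlet(0.3 * np.ones(S))

N = 500          # orchids collected per trial
trials = 2000
d_true, n1_over_N = [], []
for _ in range(trials):
    counts = rng.multinomial(N, p)
    d_true.append(p[counts == 0].sum())    # fraction in undiscovered species
    n1_over_N.append((counts == 1).sum() / N)

print("mean coverage deficit D:", np.mean(d_true))
print("mean n_1 / N           :", np.mean(n1_over_N))
# The two means should agree closely, as the argument above predicts.
```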

Of course it would be nice to turn these ‘shoulds’ into precise theorems! Theorem 1 in this paper does that:

• David McAllester and Robert E. Schapire, On the convergence rate of Good–Turing estimators, February 17, 2000.

By the way: the only difference between the formulas for n_1/N and D is that the first contains the exponent N-1, while the second contains the exponent N. So, Lou Jost’s argument is a version of Boris Borcic’s ‘time-reversal’ idea:

Good’s estimate is what you immediately obtain if you time-reverse your sampling procedure, e.g., if you ask for the probability that there is a change in the number of species in your sample when you randomly remove a specimen from it.


Fluid Flows and Infinite Dimensional Manifolds (Part 3)

30 May, 2012

Or: Twisting on the Borderline

guest post by Tim van Beek

flow around a symmetric airfoil, laminar to turbulent

In Part 2 of this series, I told you what ideal incompressible fluids are. Then I explained how the equation of motion for such fluids, Euler’s equation, can be seen as the equation for geodesics in \mathrm{SDiff}(M), the infinite dimensional manifold of volume preserving diffeomorphisms. Last time I promised to talk about the role of pressure in Euler’s equation. I also mentioned that Arnold used this geometric setup to put a bound on how good weather forecasts can be. I will try to fulfill both promises in the next blog post!

But last time I also mentioned that the ideal fluid has serious drawbacks as a model. This is an important topic, too, so I would like to explain this a little bit further first, in this blog post.

So, this time we will talk a little bit about how we can get viscosity, and therefore turbulence, back into the picture. This will lead us to the Navier–Stokes equation. Can ideal fluids, which solve Euler’s equation, model fluids with a very small viscosity? This depends on what happens to solutions when one lets the viscosity go to zero in the Navier–Stokes equation, so I will show you a result that answers this question in a specific context.

I’ll also throw in a few graphics that illustrate the transition from laminar flow to turbulent flow at boundaries, starting with the one above. These are all from:

• Milton Van Dyke, An Album of Fluid Motion, Parabolic Press, 12th edition, 1982.

flow around a circular cylinder, laminar to turbulent

Re-introducing viscosity: The Navier–Stokes equation

The motion of an incompressible, homogeneous, ideal fluid is described by Euler’s equation:

\displaystyle{ \partial_t u + u \cdot \nabla u = - \nabla p }

Ideal fluids are very nice mathematically. Nicest of all are potential flows, where the velocity vector field is the gradient of a potential. In two dimensions these can be studied using complex analysis! One could say that a whole ‘industry’ evolved around the treatment of these kinds of fluid flows. It was even taught to some extent to engineers, before computers took over. Here’s a very nice, somewhat nostalgic book to read about that:

• L. M. Milne-Thomson, Theoretical Aerodynamics, 4th edition, Dover Publications, New York, 2011. (Reprint of the 1958 edition.)
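As a small taste of that ‘industry’, here’s a sketch of the most classic example: ideal flow past a circular cylinder, with complex potential F(z) = U(z + R²/z). Differentiating F gives the velocity field via u − iv = F′(z), and on the cylinder’s surface the radial velocity vanishes, so the flow slides along the boundary:

```python
import numpy as np

# Flow past a cylinder of radius R in a uniform stream of speed U
# (arbitrary values; this is the standard textbook potential flow).
U, R = 1.0, 1.0

def velocity(z):
    """Velocity (u, v) at the complex point z = x + iy, |z| >= R."""
    dF = U * (1.0 - R**2 / z**2)   # F(z) = U (z + R^2/z), so dF = F'(z)
    return dF.real, -dF.imag       # u - i v = F'(z)

# On the cylinder the flow must be tangent: zero radial component.
for theta in np.linspace(0.1, np.pi, 4):
    z = R * np.exp(1j * theta)
    u, v = velocity(z)
    radial = u * np.cos(theta) + v * np.sin(theta)
    print(f"theta={theta:4.2f}  radial velocity = {radial:+.2e}")
```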

The assumption of ‘incompressibility’ is not restrictive for most applications involving fluid flows of water and air, for example. Maybe you are a little bit surprised that I mention air, because the compressibility of air is a part of everyday life, for example when you pump up a bicycle tire. It is, however, not necessary to include this property when you model air flowing at velocities that are significantly lower than the speed of sound in air. The rule of thumb for engineers seems to be that one needs to include compressibility for speeds around Mach 0.3 or more:

Compressible aerodynamics, Wikipedia.

However, the concept of an ‘ideal’ fluid takes viscosity out of the picture—and therefore also turbulence, and the drag that a body immersed in fluid feels. As I mentioned last time, this is called d’Alembert’s paradox.

The simplest way to introduce viscosity is by considering a Newtonian fluid. This is a fluid where the viscosity is a constant, and the relation of velocity differences and resulting shear forces is strictly linear. This leads to the Navier–Stokes equation for incompressible fluids:

\displaystyle{ \partial_t u + u \cdot \nabla u - \nu \Delta u = - \nabla p }

If you think about molten plastic, or honey, you will notice that the viscosity actually depends on the temperature, and maybe also on the pressure and other parameters, of the fluid. The science that is concerned with the exploration of these effects is called rheology. This is an important research topic and the reason why producers of, say, plastic sheets, sometimes keep physicists around. But let’s stick to Newtonian fluids for now.

curved wall, laminar to turbulent

Sending Viscosity to Zero: Boundary Layers

Since we get Euler’s equation if we set \nu = 0 in the above equation, the question is whether, in some sense or another, solutions of the Navier–Stokes equation converge to a solution of Euler’s equation in the limit of vanishing viscosity. If you had asked me, I would have guessed: no. The mathematical reason is that we have a transition from a second order partial differential equation to a first order one. This is usually called a singular perturbation. The physical reason is that a nonzero viscosity will give rise to phenomena like turbulence and energy dissipation that cannot occur in an ideal fluid. Well, the last argument shows that we cannot expect convergence for long times if eddies are present, so there certainly is a loophole.

The precise formulation of the last statement depends on the boundary conditions one chooses. One way is this: if u is a smooth solution of Euler’s equation in \mathbb{R}^3 with sufficiently fast decay at infinity (this is our boundary condition), then the kinetic energy E

\displaystyle{ E =  \frac{1}{2} \int \| u \|^2 \; \mathrm{d x} }

is conserved for all time. This is not the only conserved quantity for Euler’s equation, but it’s a very important one.

But now, suppose u is a smooth solution of the Navier–Stokes equation in \mathbb{R}^3 with sufficiently fast decay at infinity. In this case we have

\displaystyle{ \frac{d E}{d t} =  - \nu \int \| \nabla \times u \|^2 \mathrm{d x} }

So, the presence of viscosity turns a conserved quantity into a decaying quantity! Since the early 20th century, engineers have taken these effects into account following the idea of ‘boundary layers’ introduced by Ludwig Prandtl, as I already mentioned last time. Actually the whole technique of singular perturbation theory has been developed following this ansatz. This has become a mathematical technique to get asymptotic expansions of solutions of complicated nonlinear partial differential equations.

The idea is that the concept of ‘ideal’ fluid is good except at boundaries, where effects due to viscosity cannot be ignored. This is true for a lot of fluids like air and water, which have a very low viscosity. Therefore one tries to match a solution describing an ideal fluid flow far away from the boundaries with a specific solution for a viscous fluid with prescribed boundary conditions valid in a thin layer on the boundaries. This works quite well in applications. One of the major textbooks about this topic has been around for over 60 years now and has reached its 10th edition in German. It is:

• H. Schlichting and K. Gersten: Boundary-Layer Theory, 8th edition, Springer, Berlin, 2000.

Since I am also interested in numerical models and applications in engineering, I should probably read it. (I don’t know when the 10th edition will be published in English.)

flow on a sphere, laminar to turbulent

Sending Viscosity to Zero: Convergence Results

However, this approach does not tell us under what circumstances we can expect solutions u_{\nu} of the Navier–Stokes equation with viscosity \nu > 0 to converge to a solution u_{0} of Euler’s equation with zero viscosity. That is: are there solutions, boundary and initial conditions, and a topology on an appropriate topological vector space, such that

\lim_{\nu \to 0} u_{\nu} = u_{0} \; \text{?}

I asked this question over on Mathoverflow and got some interesting answers. Obviously, a lot of brain power has gone into this question, and there are both interesting positive and negative results. As an example, let me describe a very interesting positive result. I learned about it from this book:

• Andrew J. Majda and Andrea L. Bertozzi, Vorticity and Incompressible Flow, Cambridge University Press, Cambridge, 2001.

It’s Proposition 3.2 in this book. There are three assumptions that we need to make in order for things to work out:

• First, we need to fix an interval [0, T] for the time. As mentioned above, we should not expect that we can get convergence for an unbounded time interval like [0, \infty).

• Secondly, we need to assume that solutions u_{\nu} of the Navier–Stokes equation and a solution u_0 of Euler’s equation exist and are smooth.

• Thirdly we will dodge the issue of boundary layers by assuming that the solutions exist on the whole of \mathbb{R}^3 with sufficiently fast decay. As I already mentioned above, a viscous fluid will of course show very different behavior at a boundary than an ideal (that is, nonviscous) one. Our third assumption means that there is no such boundary layer present.

We will denote the L^{\infty} norm on vector fields by \| \cdot \| and use the big O notation.

Given our three assumptions, Proposition 3.2 says that:

\displaystyle{ \mathrm{sup}_{0 \le t \le T} \; \| u_{\nu} - u_0 \| = O(\nu) }

It also gives the convergence of the derivatives:

\displaystyle{ \int_0^T \| \nabla (u_{\nu} - u_0) \| \; d t =  O(\nu^{1/2}) }

This result illustrates that the boundary layer ansatz may work, because the ideal fluid approximation could be a good one away from boundaries and for fluids with low viscosity like water and air.
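To see the O(\nu) rate in a case where everything is explicit, consider the two-dimensional Taylor–Green vortex: u_{\nu} = (\cos x \sin y, -\sin x \cos y) e^{-2\nu t} solves the incompressible Navier–Stokes equation, and setting \nu = 0 gives a steady solution of Euler’s equation. (This example is periodic rather than decaying on \mathbb{R}^3, so it only illustrates the rate, not Proposition 3.2 itself.) Here the sup norm difference can be written down by hand:

```python
import numpy as np

# Taylor-Green vortex: u_nu = w(x, y) * exp(-2 nu t) with
# w = (cos x sin y, -sin x cos y), and sup |w| = 1. The Euler
# solution is the steady field u_0 = w, so
#   sup_{0 <= t <= T} |u_nu - u_0| = 1 - exp(-2 nu T).
T = 1.0
for nu in [0.1, 0.01, 0.001]:
    sup_diff = 1.0 - np.exp(-2.0 * nu * T)
    print(f"nu={nu:6.3f}   sup|u_nu - u_0| = {sup_diff:.5f}   ratio to nu = {sup_diff/nu:.3f}")
# The ratio tends to 2T as nu -> 0: the difference is O(nu), as promised.
```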

flow around a symmetric airfoil, laminar to turbulent

So, when John von Neumann joked that “ideal water” is like “dry water”, I would have said: “Well, that is half right”.


Symmetry and the Fourth Dimension (Part 2)

27 May, 2012

The Coxeter group of the cube

Coxeter groups are a huge amount of fun. Normally their delights are reserved for people who have already studied group theory. But I don’t think that’s fair. You don’t need to know music theory to enjoy a piece by Bach. And you don’t need to know group theory to enjoy Coxeter groups.

In fact, it’s probably better to learn theories after falling in love with some examples. Imagine a world where you had to learn music theory before listening to music. A world where everyone studied music theory in elementary school, high school and college, but only people who majored in music were allowed to listen to the stuff. In that world, people might say they hate music… just as in this world they say they hate math.

So, here goes:

Last time I showed you that any Platonic solid has a bunch of symmetries where we reflect it across planes. These planes, called mirrors, all intersect at the center of the solid. If we take a sphere and slice it with these mirrors, it gets chopped up into triangles, and we get a pattern called a Coxeter complex.

If we start with the cube, here’s the Coxeter complex we get:

For artistic reasons, half the triangles are colored blue and half are colored black. But it’s not just pretty: there’s also math here. If we take any black triangle and reflect it across any mirror, we get a blue triangle… and vice versa.

Instead of taking a sphere and slicing it with mirrors, we can start with the cube itself. Here’s what we get:

It’s not quite as pretty (especially because I drew it), but it makes certain games easier to play. These games involve picking one triangle and calling it our favorite. It doesn’t matter which. But we have to pick one… so how about this:

Each different symmetry of the cube sends this triangle to a different triangle. This instantly lets us count the symmetries: there are 48, since each of the cube’s 6 faces has been chopped into 8 triangles.

But even better, we get a vivid picture of the symmetries of the cube! Let’s see how this works.

Any triangle in the Coxeter complex has three corners:

• one corner is a vertex of the cube,

• one corner is the center of an edge of the cube,

• one corner is the center of a face of the cube.

Here’s how it works for our favorite triangle:

Now the real fun starts. We can move from any triangle to a neighboring one in three ways:

1) We can change which vertex of the cube our triangle contains. Starting from our favorite triangle, we get this blue triangle:

Note: the blue triangle touches the same edge of the cube as the black one. It also lies on the same face. Only the vertex has changed!

What have we actually done here? We’ve reflected our triangle across a mirror in a way that changes which vertex of the cube it contains. Let’s call this way of flipping a triangle V.

2) We can change which edge of the cube our triangle touches. Starting from our favorite triangle, we get this yellow triangle:

Note: the yellow triangle contains the same vertex of the cube. It also lies on the same face. Only the edge has changed!

What have we actually done here? We’ve reflected our triangle across a mirror in a way that changes which edge of the cube it touches. Let’s call this way of flipping a triangle E.

3) We can change which face of the cube our triangle lies on. Starting from our favorite triangle, we get this green triangle:

Note: the green triangle contains the same vertex of the cube. It also touches the same edge. Only the face has changed.

What have we actually done here? If you can’t guess, you must be asleep: we’ve reflected our triangle across a mirror in a way that changes which face of the cube it lies on! Let’s call this way of flipping a triangle F.

By repeating these three operations—changing the vertex, edge or face—we can get to any triangle starting from our favorite one. We can even use this trick to label all the triangles. Let’s call our favorite triangle 1:

Let’s call its neighbors F, E and V, since we use those three reflections to get to these new triangles:

Starting with these, we can get more triangles by changing the vertex, edge or face:

See what I’m doing? We get the triangle VE by starting with the triangle V and then changing which edge of the cube it touches. We get EF by starting with E and then changing which face it lies on. And so on.

However, there’s a ‘problem’. See the triangle VF? We got there from the triangle V by changing which face it lies on. But we could also get there another way! We could start at F and then change which vertex this triangle contains. So we could equally well call this triangle FV.

Luckily, in math nothing is really a problem once you understand it. This is why math is more fun than real life: merely understanding a problem makes it go away. We’ll just say that

VF = FV

So, we can use either label for this triangle: it doesn’t matter.

More deeply, if you start with any triangle, change the vertex it contains and then change the face it lies on, you get the same triangle as if you first change the face and then the vertex. That’s what the equation VF = FV really means. It’s a fact of geometry: a general fact about Platonic solids.

Let’s go a bit further:

I’m using the same rules; check to make sure I did everything right! There’s another little ‘problem’, though: see the triangle labelled FEF? We got there from FE by changing which face of the cube our triangle lies on. But we could also get there starting from EF by changing the edge. So really we have

FEF = EFE

But this is not a general fact about Platonic solids: it shows up because in the cube, three faces and three edges meet at each vertex. That’s why the equation has three F’s and three E’s.

We can go on even further, but you can already see where the next problem will show up. See that unlabelled triangle in the front face of the cube? At the next stage we’ll want to label it VEVE, but we’ll also want to label it EVEV. So:

VEVE = EVEV

Again, this is not a general fact about Platonic solids! It shows up because the cube has square faces, so four vertices and four edges touch each face. That’s why the equation has four V’s and four E’s.

We almost have enough equations to avoid all future problems. But there are a few more that are so obvious you may have overlooked them. Suppose we change which vertex of the cube our triangle contains, and then do this again. We get back where we started! For example, first we go from the black one to the blue one:

and then we go back to black.

So, we have

VV = 1

This says switching vertices twice gets us back where we started. Similarly, we have

EE = 1

and

FF = 1

And now, although I haven’t proved it to you, we have a complete set of equations to give each triangle an unambiguous name… or more precisely, an unambiguous element of the ‘Coxeter group’ of the cube. Two different expressions, like EFE and FEF, give the same element of the Coxeter group if we can get from one to the other using our equations. For example, in the Coxeter group we have

FEFVVEFE = FEFEFE = FEFFEF = FEEF = FF = 1
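If you’d rather not trust my algebra, here’s a small sketch that realizes V, E and F as concrete 3×3 reflection matrices (one convenient choice of mirrors among many, for a cube centered at the origin) and verifies all the relations above, including this last reduction:

```python
import numpy as np

V = np.diag([1, 1, -1])                            # reflect z -> -z
E = np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]])    # swap y and z
F = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])    # swap x and y

I = np.eye(3)
mpow = np.linalg.matrix_power

def word(*letters):
    """Multiply out a word like word(F, E, F, V, V, E, F, E)."""
    out = I
    for m in letters:
        out = out @ m
    return out

assert np.array_equal(V @ V, I) and np.array_equal(E @ E, I) and np.array_equal(F @ F, I)
assert np.array_equal(mpow(V @ F, 2), I)    # VF = FV
assert np.array_equal(mpow(E @ F, 3), I)    # FEF = EFE
assert np.array_equal(mpow(V @ E, 4), I)    # VEVE = EVEV
assert np.array_equal(word(F, E, F, V, V, E, F, E), I)   # the reduction above
print("all relations check out")
```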

Coxeter groups of Platonic solids

We can do this stuff for other Platonic solids, too. The Coxeter group of the octahedron is secretly the same as that of the cube, since they’re dual. The only difference is that the names F and V get switched, because faces of the cube correspond to vertices of the octahedron, and vice versa! Similarly for the icosahedron and dodecahedron. So, I mainly have two puzzles for you:

Puzzle 1: Find equations defining the Coxeter group of the tetrahedron.

Puzzle 2: Find equations defining the Coxeter group of the dodecahedron.

If these seem hard, let’s reflect a bit more on what we did for the cube. For the cube we have

VF = FV

because of this picture:

Similarly, we have

EFE = FEF

because of this picture:

And finally, we have

VEVE = EVEV

because of this picture:

This should make it easy to solve the puzzles. We can also phrase the solutions in a different way:

Puzzle 3: Show that the Coxeter groups of the tetrahedron, cube and dodecahedron can be completely described by the equations

V^2 = E^2 = F^2 = 1

and

(VE)^a = (VF)^b = (EF)^c = 1

for some integers a, b, and c.

The story doesn’t stop here—far from it! Later we’ll meet Coxeter groups for the higher-dimensional analogues of Platonic solids, which are called regular polytopes. And we’ll use them to classify so-called uniform polytopes obtained by chopping vertices, edges, faces and so on off the regular ones. For example, the cuboctahedron:

can be gotten either by chopping the corners off a cube, or chopping the corners off an octahedron! We can classify such shapes using Coxeter diagrams, which are based on Coxeter groups. So, there’s no shortage of fun stuff to do… in fact, there’s way too much!

Actually, that’s the main problem with mathematics, once you start actually doing it. There’s just too much fun stuff.


Symmetry and the Fourth Dimension (Part 1)

21 May, 2012

Coxeter complexes

Though I’m shifting toward applied math, I find myself unable to quit explaining pure math to people—stuff that’s fun purely for its own sake. So, I’ve been posting about symmetry and the fourth dimension over on Google+. Now I want to take those posts, polish them up a bit, and combine them into blog articles.

The idea is to start with something very familiar and then take it a little further than most people have seen… without getting so technical that only people with PhDs understand what’s going on. I’m more interested in communicating with ordinary folks than in wowing the experts.

So, I’ll assume you know and love the five Platonic solids:

The tetrahedron, with 4 triangular faces, 6 edges and 4 vertices:

The cube, with 6 square faces, 12 edges and 8 vertices:


The octahedron, with 8 triangular faces, 12 edges and 6 vertices:

The dodecahedron, with 12 pentagonal faces, 30 edges and 20 vertices:

The icosahedron, with 20 triangular faces, 30 edges and 12 vertices:

Starting from these, we’ll build the six Platonic solids that exist in 4 dimensions, and the various fancier shapes we can get from these by cutting off corners, edges and so on.

Luckily, a lot of heroic mathematicians and programmers have made pictures of these shapes freely available online. For example, the rotating Platonic solids above were made by Tom Ruen, who put them on Wiki Commons. It wouldn’t be so bad if all I did is show you lots of these pictures and explain them. But there are some underlying themes that make the story deeper, so I thought I’d reveal those now. As the series marches on, I’ll try to make it easy to ignore these themes or pay attention to them, depending on what you want.

One theme is the quaternions. This is a number system introduced by the famous mathematician William Rowan Hamilton back in 1843. A typical quaternion looks like this:

a + b i + c j + d k

where a,b,c,d are ordinary real numbers and i, j, k are square roots of -1 that ‘anticommute’:

i^2 = j^2 = k^2 = - 1

ij = -ji = k

jk = -kj = i

ki = -ik = j
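Here’s a minimal sketch of quaternion arithmetic: just these multiplication rules, extended by linearity. It’s enough to check, for example, that ij = k while ji = -k.

```python
from dataclasses import dataclass

@dataclass
class Quaternion:
    a: float   # the quaternion a + b i + c j + d k
    b: float
    c: float
    d: float

    def __mul__(self, o):
        # Hamilton product, using i^2 = j^2 = k^2 = -1, ij = k, and so on.
        return Quaternion(
            self.a*o.a - self.b*o.b - self.c*o.c - self.d*o.d,
            self.a*o.b + self.b*o.a + self.c*o.d - self.d*o.c,
            self.a*o.c - self.b*o.d + self.c*o.a + self.d*o.b,
            self.a*o.d + self.b*o.c - self.c*o.b + self.d*o.a,
        )

i = Quaternion(0, 1, 0, 0)
j = Quaternion(0, 0, 1, 0)
k = Quaternion(0, 0, 0, 1)

print(i * i)   # Quaternion(a=-1, b=0, c=0, d=0), i.e. -1
print(i * j)   # equals k
print(j * i)   # equals -k: multiplication anticommutes
```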

As their name indicates, the quaternions are a 4-dimensional number system. We can use them to relate rotations in 3 dimensions to rotations in 4 dimensions… and this establishes links between 3d Platonic solids and 4d Platonic solids: special links that just don’t exist in higher dimensions.

For example, the dodecahedron has 60 rotational symmetries, and this fact gives a 4d Platonic solid—or as mathematicians say, a 4d regular polytope—with 120 dodecahedral faces. Getting to understand this in detail will be one of the high points of this series: it’s a really wonderful story!

Another theme is 5-fold symmetry. In 2 dimensions there’s an obvious polygon with 5-fold symmetry: the regular pentagon. In 3 dimensions we have a Platonic solid with pentagonal faces: the regular dodecahedron. In 4 dimensions, as I just mentioned, there’s a regular polytope with regular dodecahedra as faces. But then this pattern ends. There are no higher-dimensional regular polytopes with pentagons in them! Only squares and triangles show up.

But the biggest unifying theme is ‘finite reflection groups’. These show up as symmetry groups of Platonic solids and 4d regular polytopes. Technically, a finite reflection group is a finite group of transformations of n-dimensional Euclidean space that’s generated by reflections. Some examples in 3 dimensions will illustrate the idea: it’s not as scary as it might sound.

Take a regular dodecahedron, for example:


This has lots of rotations and reflections as symmetries—but a finite number of them. Each reflection corresponds to a mirror: a plane through the center of the dodecahedron. The reflection switches points on opposite sides of this mirror. We can get every symmetry by doing a bunch of these reflections, one after another: that’s what we mean by saying a group of symmetries is ‘generated by reflections’. So, the symmetry group of a dodecahedron is a finite reflection group.

But the fun starts when we take a sphere centered at the center of the dodecahedron, and slice it with all these mirrors. We get a picture like this, called a Coxeter complex:

The great circles in this picture are where the mirrors intersect the sphere.

You’ll notice there are 120 triangles in this picture: each of the 12 pentagonal faces of the dodecahedron has been subdivided into 10 right triangles. You should be able to see that if we pick one of these triangles, there’s a symmetry carrying it to any other. So, the symmetry group of the dodecahedron has 120 elements!

By the way: earlier I mentioned the 60 rotational symmetries of the dodecahedron; now I’m talking about 120 symmetries including rotations and reflections. There’s no contradiction. If we start by picking a black triangle, a rotation will take it to another black triangle, while a reflection will take it to a blue one. There are 60 of each, for a total of 120.

We can play the same game starting with any other Platonic solid. If we start with the icosahedron, nothing really new happens. It has the same symmetry group, so we get the same Coxeter complex. Indeed, if you look carefully here:

you can see a bunch of equilateral triangles, each containing 6 right triangles. There are 20 of these equilateral triangles, and they’re the faces of an icosahedron:


The corners of the icosahedron are located at the centers of the faces of a dodecahedron, and vice versa. So we say these Platonic solids are dual to each other. Dual polyhedra, or more generally dual polytopes, have the same symmetry group and the same Coxeter complex.

But we get something different if we start with the cube:

This gives a Coxeter complex with 48 triangles, formed by subdividing each of the 6 square faces of the cube into 8 right triangles:

There’s a symmetry of the cube carrying any of these right triangles to any other, so its symmetry group has 48 elements.

The octahedron has the same symmetry group as the cube, because they’re duals. But we get something different if we start with the regular tetrahedron:

This gives a Coxeter complex with 24 triangles, formed by subdividing each of the 4 triangular faces of the tetrahedron into 6 right triangles:

There’s a symmetry of the tetrahedron carrying any of these right triangles to any other, so its symmetry group has 24 elements.
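Notice the pattern in these three examples: each face with p sides gets subdivided into 2p right triangles, and there’s exactly one symmetry per triangle. So a one-line computation gives the order of the symmetry group of every Platonic solid:

```python
# Order of the symmetry group = (number of faces) * 2 * (sides per face).
solids = {                      # name: (faces, sides per face)
    'tetrahedron':  (4, 3),
    'cube':         (6, 4),
    'octahedron':   (8, 3),
    'dodecahedron': (12, 5),
    'icosahedron':  (20, 3),
}
for name, (faces, p) in solids.items():
    print(f"{name:13s} {faces:3d} faces x {2*p} triangles = {faces*2*p} symmetries")
# tetrahedron 24; cube and octahedron 48; dodecahedron and icosahedron 120.
```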

The tetrahedron is its own dual. So the Platonic solids only give us three finite reflection groups in 3 dimensions. There are also some infinite sequences of less interesting ones, like the symmetry groups of the hosohedra, which are funny degenerate Platonic solids whose faces have just 2 sides… these faces need to be curved:

But in general, the possibilities are quite restricted. So, finite reflection groups are not only beautiful: they’re a bit rare. This makes them doubly precious. People have written books about them:

• James E. Humphreys, Reflection Groups and Coxeter Groups, Cambridge U. Press, 1992.

And as we’re beginning to see, the Coxeter complex is a vivid picture of the finite reflection group it comes from. We can already see that in 3 dimensions, it has one black triangle for each rotation in the group, and one blue triangle for each orientation-reversing symmetry. But it contains much more information than this, in neatly visible form… and it works well not just in 3 dimensions, but any dimension.

That’s enough for now: I want to keep these blog articles bite-sized, rather than letting them grow jaw-breakingly big. But if you’re hungry for more right now, try this:

• John Baez, Platonic solids in all dimensions.

I’ll also leave you with this:

Puzzle: How many great circles are there in these Coxeter complexes?

• The Coxeter complex of the tetrahedral finite reflection group, also known to mathematicians as A3:

• The Coxeter complex of the octahedral finite reflection group, also known as B3:

• The Coxeter complex of the icosahedral finite reflection group, also known as H3:

The answers will say how many reflections there are in the finite reflection groups we’ve met today.


Fluid Flows and Infinite-Dimensional Manifolds (Part 2)

12 May, 2012

Or: ideal fluids—dry water?

guest post by Tim van Beek

Last time in this series, we set the stage by explaining infinite dimensional manifolds. Then we looked at a simple example: the inviscid Burgers equation. We saw this was the equation for geodesics in the diffeomorphism group of the circle.

Now let’s look at a more interesting example! It will still be a simplified model of fluid flow: it will describe an ideal fluid that is incompressible. I’ll start by explaining these concepts. We will then see how the equation of motion for ideal incompressible fluids can be interpreted as a geodesic equation.

En route I will also repeat some stuff from classical vector analysis, mostly for my own sake. The last time I seriously had to calculate with it was when I attended a class on “classical electrodynamics”, which was almost 15 years ago!

When we delve into differential geometry, it is always a good idea to look both at the “coordinate free” formulation using abstract concepts like differential forms, and also at the “classical vector analysis” part, which is best for calculating stuff once suitable coordinates have been chosen. Our fluid flows will take place in a smooth, orientable, compact, n-dimensional Riemannian manifold M, possibly with a smooth boundary \partial M.

I will frequently think of M as an open set in \mathbb{R}^2 or \mathbb{R}^3, so I will use the globally defined coordinate chart of Euclidean coordinates on \mathbb{R}^n denoted by x, y (and z, if needed) without further warning.

Before we continue: Last time our reader “nick” pointed out a blog post by Terence Tao about the same topic as ours, but—as could be expected—assuming a little bit more of a mathematical background: The Euler-Arnold equation. If you are into math, you might like to take a look at it.

So, let us start with the first important concept: the ‘ideal fluid’.

What is an ideal fluid?

When you are a small parcel in a fluid flow, you will feel two kinds of forces:

• external forces like gravity that are there whether or not your fellow fluid parcels surround you or are absent,

• internal forces that come from your interaction with the other fluid parcels.

If there is friction between you and other fluid parcels, for example, then there will be a force slowing down faster parcels and speeding up slower parcels. This is called viscosity. I already explained it back in the post Eddy Who? High viscosity means that there is a lot of friction: think of honey.

The presence of viscosity leads to shear stress whenever there are differences in the velocities of nearby fluid parcels. These lead to the formation of eddies and therefore to turbulence. This complicates matters considerably! For this reason, sometimes people like to simplify matters and to assume that the fluid flow that they consider has zero viscosity. This leads us to the physics definition of an ideal fluid:

An ideal fluid (as physicists say) is a fluid with zero viscosity.

As you can guess, I have also a mathematical definition in store for you:

An ideal fluid (as mathematicians say) is a fluid with the following property: For any motion of the fluid there is a (real valued) function p(x, t) called the pressure such that if S is a surface in the fluid with a chosen unit normal n, the force of stress exerted across the surface S per unit area at x \in S at time t is p(x,t) n.

This implies that there is no force acting tangentially to the surface S:

pressure in an ideal fluid

This picture is from

• Alexandre Chorin and Jerrold E. Marsden, A Mathematical Introduction to Fluid Mechanics, 3rd edition, Springer, New York 1993.

An ideal fluid cannot form eddies by itself without the help of external forces, nor can eddies vanish once they are present. So this simplification excludes a lot of very interesting phenomena, including everything that is usually associated with the term ‘turbulence’. But it is a necessary simplification for describing fluid flow using geodesic equations, because something moving along a geodesic doesn’t lose energy due to friction! So we will have to stick with it for now.

Historically, ideal fluids were almost exclusively studied during the 19th century, because the mathematics of viscous fluids seemed to be too hard—which it still is, although there has been a lot of progress. This led to a schism of theoretical hydrodynamics and engineering hydrodynamics, because engineers had to handle effects like turbulence that ideal fluids cannot model. A very problematic aspect is that no body with a subsonic velocity feels any drag force in an ideal fluid. This is known as d’Alembert’s paradox. This means that one cannot find out anything about optimal design of ships or aircraft or cars using ideal fluids as a model. This situation was overcome by the invention of ‘boundary layer techniques’ by the physicist Ludwig Prandtl at the beginning of the 20th century.

John von Neumann is cited by Richard Feynman in his physics lectures as having said that ideal fluids are like “dry water”, because they are so unlike real water. This is what the subtitle of this post alludes to. I don’t think this is quite fair to say. Along these lines one could say that quantum mechanics is the theory of stagnant light, because it does not include relativistic effects like quantum field theory does. Of course every mathematical model is always just an approximation to a Gedankenexperiment. And ideal fluids still have their role to play.

Maybe I will tell you more about this in a follow-up post, but before this one gets too long, let us move on to our second topic: incompressible fluids and ‘volume preserving’ diffeomorphisms.

What is an incompressible fluid flow?

If you are a parcel of an incompressible fluid, this means that your volume does not change over time. But your shape may, so if you start out as a sphere, after some time you may end up as an ellipsoid. Let’s make this mathematically precise.

But first note that “incompressible” in the sense above means that the density of a given fluid parcel does not change over time. It does not mean that the density of the whole fluid is everywhere the same. A fluid like that is actually called homogeneous. So we have two different notions:

• incompressible means that the volume of an infinitesimal fluid parcel does not change as it moves along the fluid flow,

• homogeneous means that the density at a given time is everywhere the same, that is: constant in space.

This distinction is important, but for now we will study fluid flows that are both homogeneous and incompressible.

Let us see how we can make the notion of “incompressible” mathematically precise:

Remember from the last post: The flow of each fluid parcel is described by a path on M parametrized by time, so that for every time t \ge t_0 there is a diffeomorphism

g^t : M \to M

defined by the requirement that it maps the initial position x of each fluid parcel to its position g^t(x) at time t:

schematic fluid flow

Now let’s assume our fluid flow is incompressible. What does that mean for the diffeomorphisms that describe the flow? Assuming that we have a volume form \mu on M, these diffeomorphisms must conserve it:

\mathrm{SDiff}(M) := \{ f \in \mathrm{Diff}(M): f^* \mu = \mu \}

For people who need a reminder of the concepts involved (which includes me), here it is:

Remember that M is a smooth orientable Riemannian manifold of dimension n. A volume form \mu is an n-form that vanishes nowhere. In \mathbb{R}^3 with Cartesian coordinates x, y, z the canonical example would be

\mu = d x \wedge  d y \wedge  d z

The dual basis of d x, d y, d z is denoted by \partial_x, \partial_y, \partial_z in our example.

Given two manifolds M, N and a differentiable map f: M \to N, we can pull back a differential form \mu on N to one on M via

f^{*} \mu_p (v_1, ..., v_n) = \mu_{f(p)} (d f(v_1), ..., d f(v_n))

For the übernerds out there: remember that we see the group of diffeomorphisms \mathrm{Diff}(M) as a Fréchet Lie group modelled on the Fréchet space of vector fields on M, \mathrm{Vec}(M). For those who would like to read more about this concept, try this:

• Karl-Hermann Neeb, Monastir Summer School: infinite-dimensional Lie groups.

\mathrm{SDiff}(M) is clearly a subgroup of \mathrm{Diff}(M). It is less obvious, but true, that it is a closed subgroup and therefore itself a Lie group. What about its Lie algebra? For a vector field to give a flow that’s volume preserving, it must have zero divergence. So, the vector fields that form the tangent space T_{\mathrm{id}} \mathrm{SDiff}(M) consist of all smooth vector fields V with zero divergence:

\mathrm{div}(V) = 0

These vector fields form a vector space we denote by \mathrm{SVec}(M). Remember T_{\mathrm{id}} stands for the tangent space at the identity element of the group \mathrm{SDiff}(M), which is the identity diffeomorphism \mathrm{id} of M. The tangent space at the identity of a Lie group is a Lie algebra, so \mathrm{SVec}(M) is a Lie algebra.

I will need a little refresher about the definition of divergence. Then I will point you to a proof of the claim above, namely that zero-divergence vector fields form the Lie algebra of the group of volume preserving diffeomorphisms. This may seem obvious on an intuitive level, if you ever learned that zero-divergence vector fields have ‘no sinks and no sources’, for example in a course on classical electromagnetism.

So, what is the divergence, again? You’ve probably seen it somewhere if you’ve survived reading this so far, but you may not have seen it in full generality.

The divergence of a vector field V with respect to a volume form \mu is the unique scalar function \mathrm{div}(V) such that:

\mathrm{div}(V)\, \mu = d (i_V \mu)

Here, i_V is contraction with V. Contraction means that you feed the vector V into the first slot of the differential form \mu, and therefore reduce the function \mu of n vector fields to one of n-1 vector fields.

When we use our standard example M = \mathbb{R}^3, we of course write a vector field as

V = V_x \partial_x + V_y\partial_y + V_z \partial_z

where V_x, V_y and V_z are smooth real-valued functions. The divergence of V is then

\mathrm{div}(V) = \partial_x  V_x + \partial_y V_y + \partial_z V_z

which we get if we plug in the expression for V into the formula d(i_V \mu).

So, how does one see that ‘zero divergence’ of a vector field is equivalent to ‘volume preserving’ for the flow it generates?

If we write

\phi(t) = (x(t), y(t), z(t))

for the path of a fluid particle and u for its velocity, then of course we have:

\displaystyle{ \frac{d \phi}{d t} = u }

For a scalar function f(t, x(t), y(t), z(t)) we get

\displaystyle{ \frac{d f}{d t} = \frac{\partial f}{\partial t} + u \cdot \mathrm{grad}(f) }

Here \cdot is the inner product. The latter part is often written with the help of the nabla operator \nabla as

u \cdot \mathrm{grad}(f) = u \cdot \nabla \; f

This is really just a handy short notation, there is no mystery behind it: it’s just like how we write the divergence as \mathrm{div}(X) = \nabla \cdot X and the curl as \mathrm{curl}(X) = \nabla \times X.

The operator

D_t = \partial_t + u \cdot \nabla

appears so often that it has its own name: it is called the material derivative.

Why ‘material’? Because if we follow a little bit of material—what we’re calling a parcel of fluid—something about it can change with time for two different reasons. First, this quantity can explicitly depend on time: that’s what the first term, \partial_t, is about. Second, this quantity can depend on where you are, so it changes as the parcel moves: that’s what u \cdot \nabla is about.
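Here’s a small symbolic check (using sympy, with a made-up quantity f and the rigid rotation field u = (-y, x)): differentiating f along a parcel trajectory gives exactly the material derivative.

```python
import sympy as sp

t, x, y = sp.symbols('t x y')

u = (-y, x)          # a divergence-free flow: rigid rotation
f = t * x + y**2     # some quantity depending on time and position

# A parcel trajectory following this flow: d/dt (X, Y) = u(X, Y).
X, Y = sp.cos(t), sp.sin(t)
assert sp.simplify(sp.diff(X, t) - (-Y)) == 0
assert sp.simplify(sp.diff(Y, t) - X) == 0

# Left side: d/dt of f evaluated along the trajectory.
lhs = sp.diff(f.subs({x: X, y: Y}), t)

# Right side: the material derivative (partial_t + u . grad) f,
# then evaluated on the trajectory.
Df = sp.diff(f, t) + u[0] * sp.diff(f, x) + u[1] * sp.diff(f, y)
rhs = Df.subs({x: X, y: Y})

assert sp.simplify(lhs - rhs) == 0
print("df/dt along the flow equals the material derivative")
```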

Now suppose we have a little parcel of fluid. We’ve been talking about it intuitively, but mathematically we can describe it at time zero as an open set W_0 in our manifold. After a time t, it will be mapped by the fluid flow g^t to

W_t :=  g^t (W_0)

This describes how our parcel moves. We define the fluid to be incompressible if the volume of W_t for all choices of W_0 is constant, that is:

\displaystyle{ 0 = \frac{d}{d t} \int_{W_t} d \mu }

If we write J^t for the Jacobian determinant of g^t, then we have

\displaystyle{ 0 = \frac{d}{d t} \int_{W_t} d \mu = \frac{d}{d t} \int_{W_0} J^t d \mu }

So in a first step we get that a fluid flow is incompressible iff the Jacobian determinant J is 1 for all times, which is true iff g^t is volume preserving.

It is not that hard to show by a direct calculation that

\displaystyle{ \left. \partial_t J\right|_{t=0} = \mathrm{div}(u) J }

If you don’t want to do it yourself, you can look it up in a book that I already mentioned:

• Alexandre Chorin and Jerrold E. Marsden, A Mathematical Introduction to Fluid Mechanics, 3rd edition, Springer-Verlag, New York 1993.

This is the connection between ‘volume preserving’ and ‘zero divergence’! Inserting this into our equation of incompressibility, we finally get:

\begin{array}{ccl}   0 &=& \displaystyle{ \frac{d}{d t} \int_{W_t} d \mu } \\ \\  &=& \displaystyle{\frac{d}{d t} \int_{W_0} J^t d \mu } \\ \\  &=& \displaystyle{\int_{W_0} \mathrm{div}(u) J d \mu  }  \end{array}

which is true for all open sets W_0 iff \mathrm{div}(u) = 0. The equation of continuity for a fluid flow is:

\displaystyle{ \frac{\partial \rho}{\partial t} + \mathrm{div}(\rho u) = 0 }

This says that mass is conserved. Written with the material derivative it is:

\displaystyle{ \frac{D \rho}{D t} + \rho \, \mathrm{div}(u) = 0 }

So, since we’re assuming \mathrm{div}(u) = 0, we get

\displaystyle{  \frac{D \rho}{D t} = 0 }

which is what we intuitively expect, namely that the density is constant for a fluid parcel following the fluid flow.
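As a numerical sanity check of the ‘volume preserving’ picture (using the divergence-free rotation field u = (-y, x) again): integrate the flow map g^t and estimate its Jacobian determinant by finite differences. Incompressibility means it should stay equal to 1.

```python
import numpy as np

def u(p):
    x, y = p
    return np.array([-y, x])    # divergence-free: volume should be preserved

def flow(p0, t, steps=2000):
    """Approximate g^t(p0) with the midpoint rule."""
    p = p0.astype(float)
    dt = t / steps
    for _ in range(steps):
        p = p + dt * u(p + 0.5 * dt * u(p))
    return p

t, h = 1.0, 1e-5
p0 = np.array([1.0, 0.5])
basis = (np.array([1.0, 0.0]), np.array([0.0, 1.0]))
# Finite-difference Jacobian of the map p0 -> g^t(p0):
J = np.column_stack([
    (flow(p0 + h * e, t) - flow(p0 - h * e, t)) / (2 * h) for e in basis
])
print("det J^t =", np.linalg.det(J))   # very close to 1
```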

Euler’s equation for the ideal incompressible fluid

The equation of motion for an ideal incompressible fluid is Euler’s equation:

\partial_t u + (u \cdot \nabla) u = - \nabla p

p is the pressure function mentioned in the mathematical definition of an ideal fluid above. As I already mentioned, to be precise I should say that we also assume that the fluid is homogeneous. This means that the density \rho is constant both in space and time and therefore can be cancelled from the equation of motion.

If M has a nonempty (smooth) boundary \partial M, the equation is supplemented by the boundary condition that u is tangential to \partial M.

How can we turn this equation into a geodesic equation on \mathrm{SDiff}(M)? Our strategy will be the same as last time when we handled the diffeomorphism group of the circle. We will define the necessary gadgets of differential geometry on \mathrm{SDiff}(M) using the already existing ones on M. First we define them on T_{\mathrm{id}}\mathrm{SDiff}(M). Then, for any diffeomorphism \phi \in \mathrm{SDiff}(M), we use right translation by \phi to define them on T_{\phi}\mathrm{SDiff}(M). After that, we can use the abstract version of the geodesic equation for right invariant metrics to calculate the explicit differential equation behind it.

Let us start with defining right invariant vector fields on \mathrm{SDiff}(M). A right invariant vector field U is one for which there is a u \in \mathrm{SVec}(M) with U_{\phi} = u \circ \phi. In the following, we restrict ourselves to right invariant vector fields only.

We define the usual L^2 inner product of vector fields u, v on M just as last time:

\displaystyle{ \langle u, v \rangle = \int_M \langle u_x, v_x \rangle \; d \mu (x) }

The inner product used on the right is of course the one on M.

For two right invariant vector fields U, V with U_{\phi} = u \circ \phi and V_{\phi} = v \circ \phi, we define the inner product on T_{\phi}\mathrm{SDiff}(M) by

\langle U, V \rangle_{\phi} = \langle u, v \rangle

This definition induces a right invariant metric on \mathrm{SDiff}(M). Note that it is right invariant because we are only considering volume preserving diffeomorphisms. It is not right invariant on the larger group of all diffeomorphisms \mathrm{Diff}(M)!

For an incompressible ideal fluid without external fields the only kind of energy one has to consider is the kinetic energy. The inner product that we use is actually proportional to the kinetic energy of the whole fluid flow at a fixed time. So geodesics with respect to the induced metric will correspond to Hamilton’s extremal principle. In fact it is possible to formulate all this in the language of Hamiltonian systems, but I will stop here and return to the quest of calculating the geodesic equation.

Last but not least, we define the following right invariant connection:

\nabla_{U_{\phi}} V_{\phi} = (\nabla_{u} v) \circ \phi

Here \nabla on the right is the connection on M—sorry, this is not quite the same as the \nabla we’d been using earlier! But in \mathbb{R}^3 or Euclidean space of any other dimension, \nabla_u v is just another name for (u \cdot \nabla) v, so don’t get scared.

Remember from last time that the geodesic equation says

\nabla_u u = 0

where u is the velocity vector of our geodesic, say

\displaystyle{ u(t) = \frac{d}{d t} \gamma(t) }

where \gamma is the curve describing our geodesic. We saw that for a right-invariant metric on a Lie group, this equation says

\partial_t u = \mathrm{ad}^*_u u

where the coadjoint operator \mathrm{ad}^* is defined by

\langle \mathrm{ad}^*_u v, w \rangle = \langle v, \mathrm{ad}_u w \rangle = \langle v, [u, w] \rangle

For simplicity, let us specialize to \mathbb{R}^3, or an open set in there. What can we say about the right hand side of the above equation in this case? First, we have the vector identity

\nabla \times (u \times w) = - [u, w] + u \; \nabla \cdot w - w \; \nabla \cdot u

Since we are talking about divergence-free vector fields, we actually have

[u, w] = - \nabla \times (u \times w)

Also note that for a scalar function f and the divergence-free vector field u we have

\begin{array}{ccl} \langle u, \nabla f \rangle &=& \int_M \langle u(x), \nabla f(x) \rangle \; d \mu (x) \\ \\ &=& \int_M \nabla \cdot (f(x) u(x)) \; d \mu (x) \\ \\ &=& \int_{\partial M} f(x) \; \langle u, n \rangle \; d S (x) \\ \\ &=& 0 \end{array}

The last term is zero because of our boundary condition that says that the velocity field u is tangent to \partial M.

So, now I am ready to formulate my claim that

\mathrm{ad}^*_u v = - (\nabla \times v) \times u + \nabla f

for some yet undetermined scalar function f. This can be verified by a direct calculation:

\begin{array}{ccl} \langle \mathrm{ad}^*_u v, w \rangle &=& \langle v, \mathrm{ad}_u w \rangle \\ \\ &=& \langle v, [u, w] \rangle \\  \\  &=&  \int_M \langle v_x, [u, w]_x \rangle \;d\mu(x)  \\ \\ &=& - \int_M \langle v_x, (\nabla \times (u \times w))_x \rangle \;d \mu(x)  \end{array}

What next? We can use the following 3 dimensional version of Green’s theorem for the curl operator:

\int_M ( \langle \nabla \times a, b  \rangle - \langle a, \nabla \times b \rangle ) d \mu = \int_{\partial M} \langle a \times b, n \rangle d S

That is, the curl operator is symmetric when acting on vector fields whose cross product has no component normal to \partial M. Note that I deliberately skipped the question of which function spaces our vector fields need to belong to, and the regularity assumptions on the domain M and its boundary, because this is a blog post and not a math lecture. But the operators we use on vector fields certainly depend on such assumptions.

If you are interested in how to extend the symmetric curl operator to a self-adjoint operator, for example, you could look it up here:

• R. Hiptmair, P. R. Kotiuga, S. Tordeux, Self-adjoint curl operators.

Since our vector fields are supposed to be tangent to \partial M, the boundary term in our case is

\int_{\partial M} \langle (u_x \times w_x) \times v_x, n \rangle \; dS = 0

because u_x \times w_x is normal to \partial M, and therefore (u_x \times w_x) \times v_x is tangent to \partial M, so its inner product with the normal vector n is zero.

So we can shift the curl operator from right to left like this:

\begin{array}{ccl} - \int_M \langle v_x, (\nabla \times (u \times w))_x \rangle \;d \mu(x) &=& - \int_M \langle (\nabla \times v)_x, (u \times w)_x \rangle \;d \mu(x) \\ \\ &=& - \int_M \langle (\nabla \times v)_x \times u_x, w_x \rangle \;d \mu(x) \end{array}

In the last step we used the cyclic symmetry of the scalar triple product, which equals the signed volume of the parallelepiped spanned by the three vectors:

\langle a \times b, c \rangle = \mu(a, b, c) = \mu (c, a, b) = \langle c \times a, b \rangle

This verifies the claim, since the part \nabla f does not contribute, as stated above.

And now, yet another vector identity comes to our rescue (with summation over the repeated index k):

(\nabla \times v) \times u = (u \cdot \nabla) v - u_k \nabla v_k
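If you haven't seen this identity, here is a quick check in index notation, using \epsilon_{ijk} \epsilon_{jlm} = \delta_{kl} \delta_{im} - \delta_{km} \delta_{il}:

\begin{array}{ccl} ((\nabla \times v) \times u)_i &=& \epsilon_{ijk} \, \epsilon_{jlm} \, (\partial_l v_m) \, u_k \\ \\ &=& (\delta_{kl} \delta_{im} - \delta_{km} \delta_{il}) \, u_k \, \partial_l v_m \\ \\ &=& u_k \partial_k v_i - u_k \partial_i v_k \end{array}

and the last line is the ith component of (u \cdot \nabla) v - u_k \nabla v_k.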

So, we finally end up with this:

\begin{array}{ccl} \mathrm{ad}^*_u u &=& - (u \cdot \nabla) u + u_k \nabla u_k + \nabla f \\ \\ &=& - (u \cdot \nabla) u + \nabla g \end{array}

for some function g. Why? The middle term u_k \nabla u_k = \frac{1}{2} \nabla |u|^2 is itself a gradient, so we can absorb it together with \nabla f into a single gradient \nabla g.

Thanks to this formula we derived, the abstract and elegant equation for a geodesic on any Lie group

\partial_t u = \mathrm{ad}^*_u u

becomes, in this special case

\partial_t u = - (u \cdot \nabla) u + \nabla g

If we can convince ourselves that -g is the pressure p of our fluid, we get Euler’s equation:

\partial_t u + (u \cdot \nabla) u = - \nabla p
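As a small sanity check of this equation (my own example, not from the derivation above): rigid rotation about the z-axis is a steady solution. With u = (-y, x, 0) and pressure p = (x^2 + y^2)/2 we have \partial_t u = 0 and (u \cdot \nabla) u = -\nabla p. A sympy verification:

import sympy as sp
from sympy.vector import CoordSys3D, gradient

R = CoordSys3D('R')
x, y = R.x, R.y

u = -y*R.i + x*R.j        # rigid rotation about the z-axis
p = (x**2 + y**2) / 2     # candidate pressure

def advect(a, b):
    # (a . grad) b, componentwise
    dd = lambda f: a.dot(R.i)*sp.diff(f, x) + a.dot(R.j)*sp.diff(f, y)
    return dd(b.dot(R.i))*R.i + dd(b.dot(R.j))*R.j

# steady Euler equation: (u . grad) u + grad p = 0
res = advect(u, u) + gradient(p)
print(sp.simplify(res.dot(R.i)), sp.simplify(res.dot(R.j)))   # 0 0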

Wow! Starting with abstract stuff about infinite-dimensional Lie groups, we’ve almost managed to derive Euler’s equation as the geodesic equation on \mathrm{SDiff}(M)! We’re not quite done: we still need to talk about the role of the function g, and why it’s minus the pressure. But that will have to wait for another post.


Enormous Integers

24 April, 2012

Sometimes math problems have unexpected answers. For example, consider the integral of this infinite product:

\displaystyle{ \int_0^\infty \cos(2x) \cos(x) \cos(x/2) \cos(x/3) \cos(x/4) \cdots \, dx }

The answer matches \pi/8 up to the 43rd digit, but it's not actually \pi/8. Weird, huh? It's not a coincidence, though: there's an explanation.

Here’s a puzzle due to the logician Harvey Friedman. It too has an unexpected answer.

Say you have a finite alphabet to write with. How long can a word be if, whenever n < m, the block of letters from the nth to the (2n)th is not allowed to appear as a subsequence of the later block from the mth to the (2m)th?

If you have just one letter, this is the longest it can be:

AAA

If you have two, this is the longest it can be:

ABBBAAAAAAA

Puzzle: How long can the word be if you have three letters in your alphabet?

Friedman showed there’s still a finite upper bound on how long it can be. But, he showed it’s incomprehensibly huge!
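Here's a little brute-force checker for the rule (my own code; I'm reading the condition as: for all i < j with 2j no larger than the length of the word, the block from letter i to letter 2i must not be a subsequence of the block from letter j to letter 2j):

def is_subseq(a, b):
    # test whether a is a (not necessarily contiguous) subsequence of b
    it = iter(b)
    return all(c in it for c in a)

def ok(word):
    # Friedman's condition, with 1-indexed blocks word[i..2i]
    n = len(word)
    return not any(is_subseq(word[i-1:2*i], word[j-1:2*j])
                   for j in range(2, n // 2 + 1) for i in range(1, j))

print(ok("AAA"))             # True:  length 3 is fine with one letter
print(ok("AAAA"))            # False: one letter more already fails
print(ok("ABBBAAAAAAA"))     # True:  length 11 is fine with two letters
print(ok("ABBBAAAAAAAA"))    # False: appending another A breaks the rule

With three letters, searching for the longest such word by brute force is hopeless: the answer is finite, but unimaginably large, as we're about to see.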

Now Friedman is one of the world’s experts on large cardinals—large infinite numbers. So when he says a finite number is incomprehensibly huge, you sit up and listen. It’s like seeing a seasoned tiger hunter running through the jungle with his shotgun, yelling “Help! It’s a giant ant!”

The answer to the puzzle involves the 7th Ackermann function. The first Ackermann function is just doubling:

A_1(n) = 2n

The second Ackermann function of n is 2 to the nth power:

A_2(n) = 2^n

To calculate the third Ackermann function of n, you write down a tower of n 2’s. For example:

A_3(4) = 2^{2^{2^{2}}}  = 65536

And so on: to calculate the (k+1)st Ackermann function applied to n, you apply the kth Ackermann function n times to the number 1:

A_{k+1}(n) = A_k A_k \cdots A_k(1)

where there are n of these A_k's.

For example, with a little work you can show

A_4(4) = 2^{2^{2^{2^{\cdot^{\cdot^\cdot}}}}}

where the number of 2’s in the tower is 65,536.
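Here's a small Python version of this hierarchy (my own sketch, following the conventions above; only the tiny cases are actually computable):

def ackermann(k, n):
    # the k-th Ackermann function as defined in this post:
    # A_1(n) = 2n, and A_{k+1}(n) is A_k applied n times to 1
    if k == 1:
        return 2 * n
    result = 1
    for _ in range(n):
        result = ackermann(k - 1, result)
    return result

print(ackermann(1, 5))   # 10
print(ackermann(2, 5))   # 32, i.e. 2^5
print(ackermann(3, 4))   # 65536, a tower of four 2's
# ackermann(4, 4) is a tower of 65,536 2's: don't even try to compute it!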

But this is still puny, compared to the answer to the puzzle!

In 1998, Friedman showed that the longest word you can write with 3 letters, following the rule I described, is at least A_7(184) letters long.

As he notes, when we reach numbers this large:

… a profound level of incomprehensibility emerges. The definitions are not incomprehensible, but the largeness is incomprehensible. These higher levels of largeness blur, where one is unable to sense one level of largeness from another.

I don’t understand his proof, but you can see it here:

• Harvey M. Friedman, Long finite sequences, 8 October 1998.

When I posted about this on Google+, Andrés Caicedo helped me notice that later in the paper, Friedman goes further. In fact, A_7(184) is a ridiculous underestimate of the true answer! And apparently now there's an upper bound on the answer, too.

When Andrés Caicedo heard me say the word could be at least A_7(184) letters long, he wrote:

Much much longer!

I wrote a bit on this:

A lower bound for n(3), 14 February 2012.

Using some computer assistance from Randall Dougherty, Friedman shows that a lower bound is A_{7198}(158386). I asked Dougherty, and got a copy of his files. I am hoping to find some decent time in the not too distant future to examine them in detail and see how far this can be pushed.

Friedman also found an upper bound, A_n(n), where n = A_5(5). He mentions this in a note from 2001, but the note is a series of announcements with no proofs. Actually, it is exciting stuff, in spite of the lack of proofs. He discusses some other simply defined combinatorial constructions that give numbers much, much bigger than this one:

• Harvey M. Friedman, Lecture notes on enormous integers, 22 November 2011.

So, we have a well-defined enormous integer, and we know—or at least Friedman claims—that it’s somewhere between A_{7198}(158386) and A_{A_5(5)}(A_5(5)). That’s an impressively large spread. I wonder how accurately we will ever know this number.

I should add that besides the sheer shock value of seeing such a huge number show up as the answer to a simply stated puzzle, there's some deep math lurking in these games. It gets rather scary. Near the end of his lecture notes, Friedman mentions a number so big that any proof of its existence, in a certain system of mathematics, must have an incomprehensibly large number of symbols.

But if all this talk of enormous integers isn’t scary enough for you, just wait until you read about dark integers:

• Greg Egan, Dark integers, 2007.

Dark matter, dark energy . . . dark integers. They’re all around us, but we don’t usually see them, because they don’t quite play by the rules.

