From the Icosahedron to E8

10 December, 2017

Here’s a draft of a little thing I’m writing for the Newsletter of the London Mathematical Society. The regular icosahedron is connected to many ‘exceptional objects’ in mathematics, and here I describe two ways of using it to construct \mathrm{E}_8. One uses a subring of the quaternions called the ‘icosians’, while the other uses Patrick du Val’s work on the resolution of Kleinian singularities. I leave it as a challenge to find the connection between these two constructions!

You can see a PDF here:

From the icosahedron to E8.

Here’s the story:

From the Icosahedron to E8

In mathematics, every sufficiently beautiful object is connected to all others. Many exciting adventures, of various levels of difficulty, can be had by following these connections. Take, for example, the icosahedron—that is, the regular icosahedron, one of the five Platonic solids. Starting from this it is just a hop, skip and a jump to the \mathrm{E}_8 lattice, a wonderful pattern of points in 8 dimensions! As we explore this connection we shall see that it also ties together many other remarkable entities: the golden ratio, the quaternions, the quintic equation, a highly symmetrical 4-dimensional shape called the 600-cell, and a manifold called the Poincaré homology 3-sphere.

Indeed, the main problem with these adventures is knowing where to stop! The story we shall tell is just a snippet of a longer one involving the McKay correspondence and quiver representations. It would be easy to bring in the octonions, exceptional Lie groups, and more. But it can be enjoyed without these esoteric digressions, so let us introduce the protagonists without further ado.

The icosahedron has a long history. According to a comment in Euclid’s Elements it was discovered by Plato’s friend Theaetetus, a geometer who lived from roughly 415 to 369 BC. Since Theaetetus is believed to have classified the Platonic solids, he may have found the icosahedron as part of this project. If so, it is one of the earliest mathematical objects discovered as part of a classification theorem. It’s hard to be sure. In any event, it was known to Plato: in his Timaeus, he argued that water comes in atoms of this shape.

The icosahedron has 20 triangular faces, 30 edges, and 12 vertices. We can take the vertices to be the four points

\displaystyle{   (0 , \pm 1 , \pm \Phi)  }

and all those obtained from these by cyclic permutations of the coordinates, where

\displaystyle{   \Phi = \frac{\sqrt{5} + 1}{2} }

is the golden ratio. Thus, we can group the vertices into three orthogonal golden rectangles: rectangles whose proportions are \Phi to 1.

In fact, there are five ways to do this. The rotational symmetries of the icosahedron permute these five ways, and any nontrivial rotation gives a nontrivial permutation. The rotational symmetry group of the icosahedron is thus a subgroup of \mathrm{S}_5. Moreover, this subgroup has 60 elements. After all, any rotation is determined by what it does to a chosen face of the icosahedron: it can map this face to any of the 20 faces, and it can do so in 3 ways. The rotational symmetry group of the icosahedron is therefore a 60-element subgroup of \mathrm{S}_5. Since the only subgroup of index 2 in \mathrm{S}_5 is the alternating group, it must be \mathrm{A}_5.
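The counts used above are easy to check by machine. Here is a minimal sketch in Python that builds the 12 vertices from the coordinates given earlier and recovers the edges and faces from distances alone. The function names are my own, and the only geometric input beyond the coordinates is the observation that with these coordinates neighboring vertices are exactly 2 apart.

```python
from itertools import combinations, product
from math import isclose, sqrt

PHI = (sqrt(5) + 1) / 2  # the golden ratio

def icosahedron_vertices():
    """The points (0, +-1, +-PHI) together with their cyclic permutations."""
    vertices = []
    for s1, s2 in product((1, -1), repeat=2):
        x, y, z = 0.0, s1 * 1.0, s2 * PHI
        vertices += [(x, y, z), (y, z, x), (z, x, y)]
    return vertices

def dist_sq(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

vertices = icosahedron_vertices()

# Two vertices are joined by an edge when their squared distance is 4.
edges = {(i, j) for i, j in combinations(range(len(vertices)), 2)
         if isclose(dist_sq(vertices[i], vertices[j]), 4.0)}

# A face is a triple of mutually adjacent vertices.
faces = [t for t in combinations(range(len(vertices)), 3)
         if all(pair in edges for pair in combinations(t, 2))]

print(len(vertices), len(edges), len(faces))  # vertices, edges, faces
print(3 * len(faces))                         # 20 target faces, 3 ways each
```

The last line is just the counting argument from the text: a rotation is pinned down by where it sends one chosen face and in which of 3 orientations.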

The \mathrm{E}_8 lattice is harder to visualize than the icosahedron, but still easy to characterize. Take a bunch of equal-sized spheres in 8 dimensions. Get as many of these spheres to touch a single sphere as you possibly can. Then, get as many to touch those spheres as you possibly can, and so on. Unlike in 3 dimensions, where there is ‘wiggle room’, you have no choice about how to proceed, except for an overall rotation and translation. The balls will inevitably be centered at points of the \mathrm{E}_8 lattice!

We can also characterize the \mathrm{E}_8 lattice as the one giving the densest packing of spheres among all lattices in 8 dimensions. This packing was long suspected to be optimal even among those that do not arise from lattices—but this fact was proved only in 2016, by the young mathematician Maryna Viazovska [V].

We can also describe the \mathrm{E}_8 lattice more explicitly. In suitable coordinates, it consists of vectors for which:

1) the components are either all integers or all integers plus \textstyle{\frac{1}{2}}, and

2) the components sum to an even number.

This lattice consists of all integral linear combinations of the 8 rows of this matrix:

\left( \begin{array}{rrrrrrrr}  1&-1&0&0&0&0&0&0 \\  0&1&-1&0&0&0&0&0 \\  0&0&1&-1&0&0&0&0 \\  0&0&0&1&-1&0&0&0 \\  0&0&0&0&1&-1&0&0 \\  0&0&0&0&0&1&-1&0 \\  0&0&0&0&0&1&1&0 \\  -\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}   \end{array} \right)

The inner product of any row vector with itself is 2, while the inner product of distinct row vectors is either 0 or -1. Thus, any two of these vectors lie at an angle of either 90° or 120°. If we draw a dot for each vector, and connect two dots by an edge when the angle between their vectors is 120° we get this pattern:

This is called the \mathrm{E}_8 Dynkin diagram. In the first part of our story we shall find the \mathrm{E}_8 lattice hiding in the icosahedron; in the second part, we shall find this diagram. The two parts of this story must be related—but the relation remains mysterious, at least to me.
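The claims about the rows of the matrix can be verified with exact arithmetic. The sketch below, with helper names of my own choosing, computes all the inner products, checks conditions 1) and 2) for each row, and lists which pairs of rows meet at 120°, numbering the rows 1 to 8 from the top.

```python
from fractions import Fraction
from itertools import combinations

H = Fraction(-1, 2)
ROWS = [
    [1, -1, 0, 0, 0, 0, 0, 0],
    [0, 1, -1, 0, 0, 0, 0, 0],
    [0, 0, 1, -1, 0, 0, 0, 0],
    [0, 0, 0, 1, -1, 0, 0, 0],
    [0, 0, 0, 0, 1, -1, 0, 0],
    [0, 0, 0, 0, 0, 1, -1, 0],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [H] * 8,
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def in_e8(v):
    """Conditions 1) and 2): all integers or all integers plus 1/2, with even sum."""
    doubled = [2 * Fraction(x) for x in v]
    if any(d.denominator != 1 for d in doubled):
        return False
    parities = {int(d) % 2 for d in doubled}   # 0 for integers, 1 for half-integers
    return len(parities) == 1 and sum(v) % 2 == 0

self_products = [dot(r, r) for r in ROWS]
cross_products = {dot(ROWS[i], ROWS[j]) for i, j in combinations(range(8), 2)}

# Rows at 120 degrees are those with inner product -1.
dynkin_edges = [(i + 1, j + 1) for i, j in combinations(range(8), 2)
                if dot(ROWS[i], ROWS[j]) == -1]

print(self_products)
print(cross_products)
print(dynkin_edges)
print(all(in_e8(r) for r in ROWS))
```

The list of edges is a chain through rows 1 to 6 with a two-step branch leaving row 5, which is the shape of the diagram described in the text.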

The Icosians

The quickest route from the icosahedron to \mathrm{E}_8 goes through the fourth dimension. The symmetries of the icosahedron can be described using certain quaternions; the integer linear combinations of these form a subring of the quaternions called the ‘icosians’, but the icosians can be reinterpreted as a lattice in 8 dimensions, and this is the \mathrm{E}_8 lattice [CS]. Let us see how this works.

The quaternions, discovered by Hamilton, are a 4-dimensional algebra

\displaystyle{ \mathbb{H} = \{a + bi + cj + dk \colon \; a,b,c,d\in \mathbb{R}\}  }

with multiplication given as follows:

\displaystyle{i^2 = j^2 = k^2 = -1, }
\displaystyle{i j = k = - j i  \textrm{ and cyclic permutations} }

It is a normed division algebra, meaning that the norm

\displaystyle{ |a + bi + cj + dk| = \sqrt{a^2 + b^2 + c^2 + d^2} }

obeys

\displaystyle{ |q q'| = |q| |q'| }

for all q,q' \in \mathbb{H}. The unit sphere in \mathbb{H} is thus a group, often called \mathrm{SU}(2) because its elements can be identified with 2 \times 2 unitary matrices with determinant 1. This group acts as rotations of 3-dimensional Euclidean space, since we can see any point in \mathbb{R}^3 as a purely imaginary quaternion x = bi + cj + dk, and the quaternion qxq^{-1} is then purely imaginary for any q \in \mathrm{SU}(2). Indeed, this action gives a double cover

\displaystyle{   \alpha \colon \mathrm{SU}(2) \to \mathrm{SO}(3) }

where \mathrm{SO}(3) is the group of rotations of \mathbb{R}^3.
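To make the double cover concrete, here is a small numerical sketch. It stores a quaternion a + bi + cj + dk as a 4-tuple of floats, uses the fact that q^{-1} is the conjugate of q when |q| = 1, and checks on one example that q x q^{-1} stays purely imaginary and that q and -q give the same rotation. The function names are hypothetical helpers, not notation from the text.

```python
import math

def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conjugate(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def rotate(q, v):
    """Return q x q^{-1} as a quaternion, where x is the imaginary quaternion for v."""
    x = (0.0, *v)
    return qmul(qmul(q, x), conjugate(q))  # valid because |q| = 1

theta = math.pi / 2
q = (math.cos(theta / 2), 0.0, 0.0, math.sin(theta / 2))  # a turn about the k axis
minus_q = tuple(-c for c in q)

image = rotate(q, (1.0, 0.0, 0.0))
image_from_minus_q = rotate(minus_q, (1.0, 0.0, 0.0))
print(image)               # real part is zero up to rounding
print(image_from_minus_q)  # the same point again
```

In this example the unit quaternion with angle theta/2 turns the i axis into the j axis, a rotation by theta, which is one way to see why two quaternions sit over each rotation.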

We can thus take any Platonic solid, look at its group of rotational symmetries, get a subgroup of \mathrm{SO}(3), and take its double cover in \mathrm{SU}(2). If we do this starting with the icosahedron, we see that the 60-element group \mathrm{A}_5 \subset \mathrm{SO}(3) is covered by a 120-element group \Gamma \subset \mathrm{SU}(2), called the binary icosahedral group.

The elements of \Gamma are quaternions of norm one, and it turns out that they are the vertices of a 4-dimensional regular polytope: a 4-dimensional cousin of the Platonic solids. It deserves to be called the “hypericosahedron”, but it is usually called the 600-cell, since it has 600 tetrahedral faces. Here is the 600-cell projected down to 3 dimensions, drawn using Robert Webb’s Stella software:

Explicitly, if we identify \mathbb{H} with \mathbb{R}^4, the elements of \Gamma are the points

\displaystyle{    (\pm \textstyle{\frac{1}{2}}, \pm \textstyle{\frac{1}{2}},\pm \textstyle{\frac{1}{2}},\pm \textstyle{\frac{1}{2}}) }

\displaystyle{ (\pm 1, 0, 0, 0) }

\displaystyle{  \textstyle{\frac{1}{2}} (\pm \Phi, \pm 1 , \pm 1/\Phi, 0 ),}

and those obtained from these by even permutations of the coordinates. Since these points are closed under multiplication, if we take integral linear combinations of them we get a subring of the quaternions:

\displaystyle{    \mathbb{I} = \{ \sum_{q \in \Gamma} a_q  q  : \; a_q \in \mathbb{Z} \}  \subset \mathbb{H} .}

Conway and Sloane [CS] call this the ring of icosians. The icosians are not a lattice in the quaternions: they are dense. However, any icosian is of the form a + bi + cj + dk where a,b,c, and d live in the golden field

\displaystyle{   \mathbb{Q}(\sqrt{5}) = \{ x + \sqrt{5} y : \; x,y \in \mathbb{Q}\} }

Thus we can think of an icosian as an 8-tuple of rational numbers. Such 8-tuples form a lattice in 8 dimensions.

In fact we can put a norm on the icosians as follows. For q \in \mathbb{I} the usual quaternionic norm has

\displaystyle{  |q|^2 =  x + \sqrt{5} y }

for some rational numbers x and y, but we can define a new norm on \mathbb{I} by setting

\displaystyle{ \|q\|^2 = x + y }

With respect to this new norm, the icosians form a lattice that fits isometrically in 8-dimensional Euclidean space. And this is none other than \mathrm{E}_8!
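Several steps in this section can be checked exactly on a computer. The sketch below is my own, not Conway and Sloane's construction. It stores each element q of \Gamma as 2q, so that every coordinate is of the form m + n\Phi with integers m and n, then checks that the list above has 120 points, that they are closed under quaternion multiplication, and that the new norm of each one is 1. It relies on the observation that if |q|^2 = m + n\Phi then x + y = m + n, since \Phi = \frac{1}{2} + \frac{1}{2}\sqrt{5}. All class and function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations, product
from typing import NamedTuple

class ZPhi(NamedTuple):
    """The number m + n*Phi with integers m and n, using Phi^2 = Phi + 1."""
    m: int
    n: int
    def __add__(self, o): return ZPhi(self.m + o.m, self.n + o.n)
    def __sub__(self, o): return ZPhi(self.m - o.m, self.n - o.n)
    def __neg__(self): return ZPhi(-self.m, -self.n)
    def __mul__(self, o):
        return ZPhi(self.m * o.m + self.n * o.n,
                    self.m * o.n + self.n * o.m + self.n * o.n)

ZERO, ONE, TWO = ZPhi(0, 0), ZPhi(1, 0), ZPhi(2, 0)
PHI, INV_PHI = ZPhi(0, 1), ZPhi(-1, 1)  # 1/Phi = Phi - 1

def qmul(p, q):
    """Product of quaternions a + bi + cj + dk stored as 4-tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def halve(q):
    """Divide each coordinate by 2, insisting that the result stays in Z[Phi]."""
    assert all(x.m % 2 == 0 and x.n % 2 == 0 for x in q)
    return tuple(ZPhi(x.m // 2, x.n // 2) for x in q)

def doubled_group():
    """The points of Gamma listed in the text, each stored as 2q."""
    bases = [(ONE, ONE, ONE, ONE), (TWO, ZERO, ZERO, ZERO), (PHI, ONE, INV_PHI, ZERO)]
    even = [p for p in permutations(range(4))
            if sum(p[i] > p[j] for i in range(4) for j in range(i + 1, 4)) % 2 == 0]
    return {tuple(v[i] for i in p)
            for base in bases
            for v in product(*[(x, -x) for x in base])
            for p in even}

def new_norm_sq(doubled_q):
    """The new norm squared, x + y, computed from the stored value 2q."""
    s = sum((x * x for x in doubled_q), ZERO)  # this equals 4 |q|^2
    return Fraction(s.m + s.n, 4)

G = doubled_group()
# (2p)(2q) = 4pq, so halving once gives the stored form of pq.
closed = all(halve(qmul(p, q)) in G for p in G for q in G)
norms = {new_norm_sq(q) for q in G}
shrunk_norms = {new_norm_sq(tuple(x * INV_PHI for x in q)) for q in G}

print(len(G), closed, norms, shrunk_norms)
```

The last set is there to show how the new norm differs from the usual one: dividing an element of \Gamma by \Phi makes its quaternionic norm smaller, but leaves its new norm unchanged.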

Klein’s Icosahedral Function

Not only is the \mathrm{E}_8 lattice hiding in the icosahedron; so is the \mathrm{E}_8 Dynkin diagram. The space of all regular icosahedra of arbitrary size centered at the origin has a singularity, which corresponds to a degenerate special case: the icosahedron of zero size. If we resolve this singularity in a minimal way we get eight Riemann spheres, intersecting in a pattern described by the \mathrm{E}_8 Dynkin diagram!

This remarkable story starts around 1884 with Felix Klein’s Lectures on the Icosahedron [Kl]. In this work he inscribed an icosahedron in the Riemann sphere, \mathbb{C}\mathrm{P}^1. He thus got the icosahedron’s symmetry group, \mathrm{A}_5, to act as conformal transformations of \mathbb{C}\mathrm{P}^1—indeed, rotations. He then found a rational function of one complex variable that is invariant under all these transformations. This function equals 0 at the centers of the icosahedron’s faces, 1 at the midpoints of its edges, and \infty at its vertices.

Here is Klein’s icosahedral function as drawn by Abdelaziz Nait Merzouk. The color shows its phase, while the contour lines show its magnitude:

We can think of Klein’s icosahedral function as a branched cover of the Riemann sphere by itself with 60 sheets:

\displaystyle{                \mathcal{I} \colon \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1 .}

Indeed, \mathrm{A}_5 acts on \mathbb{C}\mathrm{P}^1, and the quotient space \mathbb{C}\mathrm{P}^1/\mathrm{A}_5 is isomorphic to \mathbb{C}\mathrm{P}^1 again. The function \mathcal{I} gives an explicit formula for the quotient map \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1/\mathrm{A}_5 \cong \mathbb{C}\mathrm{P}^1.

Klein managed to reduce solving the quintic to the problem of solving the equation \mathcal{I}(z) = w for z. A modern exposition of this result is Shurman’s Geometry of the Quintic [Sh]. For a more high-powered approach, see the paper by Nash [N]. Unfortunately, neither of these treatments avoids complicated calculations. But our interest in Klein’s icosahedral function here does not come from its connection to the quintic: instead, we want to see its connection to \mathrm{E}_8.

For this we should actually construct Klein’s icosahedral function. To do this, recall that the Riemann sphere \mathbb{C}\mathrm{P}^1 is the space of 1-dimensional linear subspaces of \mathbb{C}^2. Let us work directly with \mathbb{C}^2. While \mathrm{SO}(3) acts on \mathbb{C}\mathrm{P}^1, this comes from an action of this group’s double cover \mathrm{SU}(2) on \mathbb{C}^2. As we have seen, the rotational symmetry group of the icosahedron, \mathrm{A}_5 \subset \mathrm{SO}(3), is double covered by the binary icosahedral group \Gamma \subset \mathrm{SU}(2). To build an \mathrm{A}_5-invariant rational function on \mathbb{C}\mathrm{P}^1, we should thus look for \Gamma-invariant homogeneous polynomials on \mathbb{C}^2.

It is easy to construct three such polynomials:

V, of degree 12, vanishing on the 1d subspaces corresponding to icosahedron vertices.

E, of degree 30, vanishing on the 1d subspaces corresponding to icosahedron edge midpoints.

F, of degree 20, vanishing on the 1d subspaces corresponding to icosahedron face centers.

Remember, we have embedded the icosahedron in \mathbb{C}\mathrm{P}^1, and each point in \mathbb{C}\mathrm{P}^1 is a 1-dimensional subspace of \mathbb{C}^2, so each icosahedron vertex determines such a subspace, and there is a linear function on \mathbb{C}^2, unique up to a constant factor, that vanishes on this subspace. The icosahedron has 12 vertices, so we get 12 linear functions this way. Multiplying them gives V, a homogeneous polynomial of degree 12 on \mathbb{C}^2 that vanishes on all the subspaces corresponding to icosahedron vertices! The same trick gives E, which has degree 30 because the icosahedron has 30 edges, and F, which has degree 20 because the icosahedron has 20 faces.

A bit of work is required to check that V,E and F are invariant under \Gamma, instead of changing by constant factors under group transformations. Indeed, if we had copied this construction using a tetrahedron or octahedron, this would not be the case. For details, see Shurman’s book [Sh], which is free online, or van Hoboken’s nice thesis [VH].

Since both F^3 and V^5 have degree 60, F^3/V^5 is homogeneous of degree zero, so it defines a rational function \mathcal{I} \colon \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1. This function is invariant under \mathrm{A}_5 because F and V are invariant under \Gamma. Since F vanishes at face centers of the icosahedron while V vanishes at vertices, \mathcal{I} = F^3/V^5 equals 0 at face centers and \infty at vertices. Finally, thanks to its invariance property, \mathcal{I} takes the same value at every edge center, so we can normalize V or F to make this value 1.

Thus, \mathcal{I} has precisely the properties required of Klein’s icosahedral function! And indeed, these properties uniquely characterize that function, so that function is \mathcal{I}.

The Appearance of E8

Now comes the really interesting part. Three polynomials on a 2-dimensional space must obey a relation, and V,E, and F obey a very pretty one, at least after we normalize them correctly:

\displaystyle{      V^5 + E^2 + F^3 = 0. }

We could guess this relation simply by noting that each term must have the same degree. Every \Gamma-invariant polynomial on \mathbb{C}^2 is a polynomial in V, E and F, and indeed

\displaystyle{          \mathbb{C}^2 / \Gamma \cong  \{ (V,E,F) \in \mathbb{C}^3 \colon \; V^5 + E^2 + F^3 = 0 \} . }

This complex surface is smooth except at V = E = F = 0, where it has a singularity. And hiding in this singularity is \mathrm{E}_8!
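The relation between the three invariants can be checked by brute force once we choose explicit formulas. The text does not give any, so as an assumption I use the classical forms from Klein's book, in coordinates x, y on \mathbb{C}^2 adapted to one particular placement of the icosahedron: f = xy(x^{10} + 11x^5y^5 - y^{10}) of degree 12, H = -(x^{20} + y^{20}) + 228(x^{15}y^5 - x^5y^{15}) - 494x^{10}y^{10} of degree 20, and T = (x^{30} + y^{30}) + 522(x^{25}y^5 - x^5y^{25}) - 10005(x^{20}y^{10} + x^{10}y^{20}) of degree 30. With this normalization the relation reads T^2 + H^3 = 1728 f^5. The sketch below sets y = 1, which loses nothing because every term is homogeneous of degree 60, and verifies the identity with integer polynomial arithmetic.

```python
def poly(terms, degree):
    """Coefficient list [c_0, ..., c_degree] from a dict {exponent: coefficient}."""
    coeffs = [0] * (degree + 1)
    for exponent, c in terms.items():
        coeffs[exponent] = c
    return coeffs

def pmul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def ppow(p, k):
    """The k-th power of a polynomial."""
    out = [1]
    for _ in range(k):
        out = pmul(out, p)
    return out

# Klein's forms with y = 1 (an assumed normalization, not taken from the text).
f = poly({1: -1, 6: 11, 11: 1}, 11)                                        # vertices
H = poly({0: -1, 5: -228, 10: -494, 15: 228, 20: -1}, 20)                  # face centers
T = poly({0: 1, 5: -522, 10: -10005, 20: -10005, 25: 522, 30: 1}, 30)      # edge midpoints

T2, H3, f5 = ppow(T, 2), ppow(H, 3), ppow(f, 5)
f5 = f5 + [0] * (len(T2) - len(f5))  # pad to a common length

residual = [t + h - 1728 * v for t, h, v in zip(T2, H3, f5)]
print(all(c == 0 for c in residual))
```

Taking E = T, F = H and V equal to f times a fifth root of -1728 turns this into the relation V^5 + E^2 + F^3 = 0 used in the text.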

To see this, we need to ‘resolve’ the singularity. Roughly, this means that we find a smooth complex surface S and an onto map

\displaystyle{ \pi \colon S \to \mathbb{C}^2/\Gamma }

that is one-to-one away from the singularity. (More precisely, if X is an algebraic variety with singular points X_{\mathrm{sing}} \subset X, \pi \colon S \to X is a resolution of X if S is smooth, \pi is proper, \pi^{-1}(X - X_{\textrm{sing}}) is dense in S, and \pi is an isomorphism between \pi^{-1}(X - X_{\mathrm{sing}}) and X - X_{\mathrm{sing}}. For more details see Lamotke’s book [L].)

There are many such resolutions, but there is one minimal resolution, meaning that all others factor uniquely through this one:

What sits above the singularity in this minimal resolution? Eight copies of the Riemann sphere \mathbb{C}\mathrm{P}^1, one for each dot here:

Two of these \mathbb{C}\mathrm{P}^1s intersect in a point if their dots are connected by an edge: otherwise they are disjoint.

This amazing fact was discovered by Patrick du Val in 1934 [DV]. Why is it true? Alas, there is not enough room in the margin, or even in the entire blog article, to explain this. The books by Kirillov [Ki] and Lamotke [L] fill in the details. But here is a clue. The \mathrm{E}_8 Dynkin diagram has ‘legs’ of lengths 5, 2 and 3:

On the other hand,

\displaystyle{   \mathrm{A}_5 \cong \langle v, e, f | v^5 = e^2 = f^3 = v e f = 1 \rangle }

where in terms of the rotational symmetries of the icosahedron:

v is a 1/5 turn around some vertex of the icosahedron,

e is a 1/2 turn around the center of an edge touching that vertex,

f is a 1/3 turn around the center of a face touching that vertex,

and we must choose the sense of these rotations correctly to obtain vef = 1. To get a presentation of the binary icosahedral group we drop one relation:

\displaystyle{  \Gamma \cong \langle v, e, f | v^5 = e^2 = f^3 = vef \rangle }
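It is easy to find permutations realizing the generators of \mathrm{A}_5 in the first presentation above. The following sketch searches the even permutations of 5 letters for elements v, e, f of orders 5, 2 and 3 with vef = 1, and checks that v and e generate all 60 elements. It only exhibits generators obeying the relations; it does not prove that these relations suffice to present the group. The helper names are my own.

```python
from itertools import permutations

IDENTITY = tuple(range(5))

def compose(p, q):
    """The permutation 'first q, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def order(p):
    k, power = 1, p
    while power != IDENTITY:
        power = compose(p, power)
        k += 1
    return k

def is_even(p):
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return inversions % 2 == 0

A5 = [p for p in permutations(range(5)) if is_even(p)]

def find_triple():
    """Search for v, e, f of orders 5, 2, 3 with v e f = 1."""
    for v in A5:
        if order(v) != 5:
            continue
        for e in A5:
            if order(e) == 2 and order(compose(v, e)) == 3:
                return v, e, inverse(compose(v, e))
    return None

def generated(generators):
    """All products of the given generators, found by repeated multiplication."""
    seen, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

v, e, f = find_triple()
print(v, e, f)
print(len(generated([v, e])))
```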

The dots in the \mathrm{E}_8 Dynkin diagram correspond naturally to conjugacy classes in \Gamma, not counting the conjugacy class of the central element -1 \in \Gamma. Each of these conjugacy classes, in turn, gives a copy of \mathbb{C}\mathrm{P}^1 in the minimal resolution of \mathbb{C}^2/\Gamma.

Not only the \mathrm{E}_8 Dynkin diagram, but also the \mathrm{E}_8 lattice, can be found in the minimal resolution of \mathbb{C}^2/\Gamma. Topologically, this space is a 4-dimensional manifold. Its real second homology group is an 8-dimensional vector space with an inner product given by the intersection pairing. The integral second homology is a lattice in this vector space spanned by the 8 copies of \mathbb{C}\mathrm{P}^1 we have just seen, and it is a copy of the \mathrm{E}_8 lattice [KS].

But let us turn to a more basic question: what is \mathbb{C}^2/\Gamma like as a topological space? To tackle this, first note that we can identify a pair of complex numbers with a single quaternion, and this gives a homeomorphism

\mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma

where we let \Gamma act by right multiplication on \mathbb{H}. So, it suffices to understand \mathbb{H}/\Gamma.

Next, note that sitting inside \mathbb{H}/\Gamma are the points coming from the unit sphere in \mathbb{H}. These points form the 3-dimensional manifold \mathrm{SU}(2)/\Gamma, which is called the Poincaré homology 3-sphere [KS]. This is a wonderful thing in its own right: Poincaré discovered it as a counterexample to his guess that any compact 3-manifold with the same homology as a 3-sphere is actually diffeomorphic to the 3-sphere, and it is deeply connected to \mathrm{E}_8. But for our purposes, what matters is that we can think of this manifold in another way, since we have a diffeomorphism

\mathrm{SU}(2)/\Gamma \cong \mathrm{SO}(3)/\mathrm{A}_5.

The latter is just the space of all icosahedra inscribed in the unit sphere in 3d space, where we count two as the same if they differ by a rotational symmetry.

This is a nice description of the points of \mathbb{H}/\Gamma coming from points in the unit sphere of \mathbb{H}. But every quaternion lies in some sphere centered at the origin of \mathbb{H}, of possibly zero radius. It follows that \mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma is the space of all icosahedra centered at the origin of 3d space—of arbitrary size, including a degenerate icosahedron of zero size. This degenerate icosahedron is the singular point in \mathbb{C}^2/\Gamma. This is where \mathrm{E}_8 is hiding.

Clearly much has been left unexplained in this brief account. Most of the missing details can be found in the references. But it remains unknown—at least to me—how the two constructions of \mathrm{E}_8 from the icosahedron fit together in a unified picture.

Recall what we did. First we took the binary icosahedral group \Gamma \subset \mathbb{H}, took integer linear combinations of its elements, thought of these as forming a lattice in an 8-dimensional rational vector space with a natural norm, and discovered that this lattice is a copy of the \mathrm{E}_8 lattice. Then we took \mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma, took its minimal resolution, and found that the integral 2nd homology of this space, equipped with its natural inner product, is a copy of the \mathrm{E}_8 lattice. From the same ingredients we built the same lattice in two very different ways! How are these constructions connected? This puzzle deserves a nice solution.


I thank Tong Yang for inviting me to speak on this topic at the Annual General Meeting of the Hong Kong Mathematical Society on May 20, 2017, and Guowu Meng for hosting me at the HKUST while I prepared that talk. I also thank the many people, too numerous to accurately list, who have helped me understand these topics over the years.


[CS] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups, Springer, Berlin, 2013.

[DV] P. du Val, On isolated singularities of surfaces which do not affect the conditions of adjunction, I, II and III, Proc. Camb. Phil. Soc. 30 (1934), 453–459, 460–465, 483–491.

[KS] R. Kirby and M. Scharlemann, Eight faces of the Poincaré homology 3-sphere, Usp. Mat. Nauk. 37 (1982), 139–159. Available at

[Ki] A. Kirillov, Quiver Representations and Quiver Varieties, AMS, Providence, Rhode Island, 2016.

[Kl] F. Klein, Lectures on the Ikosahedron and the Solution of Equations of the Fifth Degree, Trübner & Co., London, 1888. Available at

[L] K. Lamotke, Regular Solids and Isolated Singularities, Vieweg & Sohn, Braunschweig, 1986.

[N] O. Nash, On Klein’s icosahedral solution of the quintic. Available at

[Sh] J. Shurman, Geometry of the Quintic, Wiley, New York, 1997. Available at

[Sl] P. Slodowy, Platonic solids, Kleinian singularities, and Lie groups, in Algebraic Geometry, Lecture Notes in Mathematics 1008, Springer, Berlin, 1983, pp. 102–138.

[VH] J. van Hoboken, Platonic Solids, Binary Polyhedral Groups, Kleinian Singularities and Lie Algebras of Type A, D, E, Master’s Thesis, University of Amsterdam, 2002. Available at

[V] M. Viazovska, The sphere packing problem in dimension 8, Ann. Math. 185 (2017), 991–1015. Available at


10 December, 2017

In certain crystals you can knock an electron out of its favorite place and leave a hole: a place with a missing electron. Sometimes these holes can move around like particles. And naturally these holes attract electrons, since they are places an electron would want to be.

Since an electron and a hole attract each other, they can orbit each other. An orbiting electron-hole pair is a bit like a hydrogen atom, where an electron orbits a proton. All of this is quantum-mechanical, of course, so you should be imagining smeared-out wavefunctions, not little dots moving around. But imagine dots if it’s easier.

An orbiting electron-hole pair is called an exciton, because while it acts like a particle in its own right, it’s really just a special kind of ‘excited’ electron—an electron with extra energy, not in its lowest energy state where it wants to be.

An exciton usually doesn’t last long: the orbiting electron and hole spiral towards each other, the electron finds the hole it’s been seeking, and it settles down.

But excitons can last long enough to do interesting things. In 1978 the Russian physicist Abrikosov wrote a short and very creative paper in which he raised the possibility that excitons could form a crystal in their own right! He called this new state of matter excitonium.

In fact his reasoning was very simple.

Just as electrons have a mass, so do holes. That sounds odd, since a hole is just a vacant spot where an electron would like to be. But such a hole can move around. It has more energy when it moves faster, and it takes force to accelerate it—so it acts just like it has a mass! The precise mass of a hole depends on the nature of the substance we’re dealing with.

Now imagine a substance with very heavy holes.

When a hole is much heavier than an electron, it will stand almost still when an electron orbits it. So, they form an exciton that’s very similar to a hydrogen atom, where we have an electron orbiting a much heavier proton.

Hydrogen comes in different forms: gas, liquid, solid… and at extreme pressures, like in the core of Jupiter, hydrogen becomes metallic. So, we should expect that excitons can come in all these different forms too!

We should be able to create an exciton gas… an exciton liquid… an exciton solid… and under the right circumstances, a metallic crystal of excitons. Abrikosov called this metallic excitonium.

People have been trying to create this stuff for a long time. Some claim to have succeeded. But a new paper claims to have found something else: a Bose–Einstein condensate of excitons:

• Anshul Kogar, Melinda S. Rak, Sean Vig, Ali A. Husain, Felix Flicker, Young Il Joe, Luc Venema, Greg J. MacDougall, Tai C. Chiang, Eduardo Fradkin, Jasper van Wezel and Peter Abbamonte, Signatures of exciton condensation in a transition metal dichalcogenide, Science 358 (2017), 1314–1317.

A lone electron acts like a fermion, so I guess a hole does too, and if so that means an exciton acts approximately like a boson. When it’s cold, a gas of bosons will ‘condense’, with a significant fraction of them settling into the lowest energy states available. I guess excitons have been seen to do this!

There’s a fairly good simplified explanation at the University of Illinois website:

• Siv Schwink, Physicists excited by discovery of new form of matter, excitonium, 7 December 2017.

However, the picture on this page, which I used above, shows domain walls moving through crystallized excitonium. I think that’s different from a Bose–Einstein condensate!

I urge you to look at Abrikosov’s paper. It’s short and beautiful:

• Alexei Alexeyevich Abrikosov, A possible mechanism of high temperature superconductivity, Journal of the Less Common Metals 62 (1978), 451–455.

(Cool journal title. Is there a journal of the more common metals?)

In this paper, Abrikosov points out that previous authors had the idea of metallic excitonium. Maybe his new idea was that this might be a superconductor, and that this might explain high-temperature superconductivity. The reason for his guess is that metallic hydrogen, too, is widely suspected to be a superconductor.

Later, Abrikosov won the Nobel prize for some other ideas about superconductors. I think I should read more of his papers. He seems like one of those physicists with great intuitions.

Puzzle 1. If a crystal of excitons conducts electricity, what is actually going on? That is, which electrons are moving around, and how?

This is a fun puzzle because an exciton crystal is a kind of abstract crystal created by the motion of electrons in another, ordinary, crystal. And that leads me to another puzzle, that I don’t know the answer to:

Puzzle 2. Is it possible to create a hole in excitonium? If so, is it possible to create an exciton in excitonium? If so, is it possible to create meta-excitonium: a crystal of excitons in excitonium?

Wigner Crystals

7 December, 2017

I’d like to explain a conjecture about Wigner crystals, which we came up with in a discussion on Google+. It’s a purely mathematical conjecture that’s pretty simple to state, motivated by the picture above. But let me start at the beginning.

Electrons repel each other, so they don’t usually form crystals. But if you trap a bunch of electrons in a small space, and cool them down a lot, they will try to get as far away from each other as possible—and they can do this by forming a crystal!

This is sometimes called an electron crystal. It’s also called a Wigner crystal, because the great physicist Eugene Wigner predicted in 1934 that this would happen.

Only since the late 1980s have we been able to make electron crystals in the lab. Such a crystal can only form if the electron density is low enough. The reason is that even at absolute zero, a gas of electrons has kinetic energy. At absolute zero the gas will minimize its energy. But it can’t do this by having all the electrons in a state with zero momentum, since you can’t put two electrons in the same state, thanks to the Pauli exclusion principle. So, higher momentum states need to be occupied, and this means there’s kinetic energy. And the gas has more kinetic energy if its density is high: if there’s less room in position space, the electrons are forced to occupy more room in momentum space.

When the density is high, this prevents the formation of a crystal: instead, we have lots of electrons whose wavefunctions are ‘sitting almost on top of each other’ in position space, but with different momenta. They’ll have lots of kinetic energy, so minimizing kinetic energy becomes more important than minimizing potential energy.

When the density is low, this effect becomes unimportant, and the electrons mainly try to minimize potential energy. So, they form a crystal with each electron avoiding the rest. It turns out they form a body-centered cubic: a crystal lattice formed of cubes, with an extra electron in the middle of each cube.

To know whether a uniform electron gas at zero temperature forms a crystal or not, you need to work out its so-called Wigner–Seitz radius. This is the average inter-particle spacing measured in units of the Bohr radius. The Bohr radius is the unit of length you can cook up from the electron mass, the electron charge and Planck’s constant:

\displaystyle{ a_0=\frac{\hbar^2}{m_e e^2} }

It’s mainly famous as the most probable distance between the electron and the proton in a hydrogen atom in its lowest energy state.
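To get a feel for the numbers, we can plug values into the formula above. It is written in Gaussian units, so the sketch below uses CGS values for the constants, which I have typed in by hand as assumptions. It also converts a Wigner–Seitz radius into a number density, assuming the common convention that a sphere of radius r_s a_0 contains one electron on average; that convention is not spelled out in the text.

```python
import math

# CGS (Gaussian) values, entered by hand as assumptions.
HBAR = 1.054571817e-27            # reduced Planck constant, erg s
ELECTRON_MASS = 9.1093837015e-28  # g
ELECTRON_CHARGE = 4.80320471e-10  # statcoulomb

def bohr_radius_cm():
    """a_0 = hbar^2 / (m_e e^2) in centimeters."""
    return HBAR ** 2 / (ELECTRON_MASS * ELECTRON_CHARGE ** 2)

def density_per_cm3(r_s):
    """Electron density for which a sphere of radius r_s * a_0 holds one electron."""
    radius = r_s * bohr_radius_cm()
    return 3.0 / (4.0 * math.pi * radius ** 3)

a0 = bohr_radius_cm()
print(a0)                    # about half an angstrom, in cm
print(density_per_cm3(1.0))  # density when the spacing is one Bohr radius
```

Since the density falls off as the cube of r_s, a large Wigner–Seitz radius means a very dilute electron gas compared to the scale set by the Bohr radius.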

Simulations show that a 3-dimensional uniform electron gas crystallizes when the Wigner–Seitz radius is at least 106. The picture, however, shows an electron crystal in 2 dimensions, formed by electrons trapped on a thin film shaped like a disk. In 2 dimensions, Wigner crystals form when the Wigner–Seitz radius is at least 31. In the picture, the density is so low that we can visualize the electrons as points with well-defined positions.

So, the picture simply shows a bunch of points x_i trying to minimize the potential energy, which is proportional to

\displaystyle{ \sum_{i \ne j} \frac{1}{\|x_i - x_j\|} }
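Here is a rough sketch of how one might play with this minimization problem. It computes the sum above for points in the unit disk and lowers it by trying small random moves, keeping a move only when the energy drops. This crude procedure is my own illustration, not the method used to make the picture, and it will only find a local minimum. The function names, step size and number of points are arbitrary choices.

```python
import math
import random

def coulomb_energy(points):
    """Sum of 1/|x_i - x_j| over ordered pairs i != j, so each pair counts twice."""
    return sum(1.0 / math.dist(p, q)
               for i, p in enumerate(points)
               for j, q in enumerate(points) if i != j)

def random_point_in_disk(rng):
    """A uniformly random point of the unit disk, by rejection sampling."""
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            return (x, y)

def relax(points, steps=2000, step=0.05, seed=0):
    """Greedy random search: accept a small move only if it lowers the energy."""
    rng = random.Random(seed)
    pts = list(points)
    energy = coulomb_energy(pts)
    for _ in range(steps):
        i = rng.randrange(len(pts))
        x, y = pts[i]
        candidate = (x + rng.uniform(-step, step), y + rng.uniform(-step, step))
        if math.hypot(*candidate) > 1.0 or candidate in pts:
            continue  # stay inside the disk and never stack two points
        trial = pts[:i] + [candidate] + pts[i + 1:]
        trial_energy = coulomb_energy(trial)
        if trial_energy < energy:
            pts, energy = trial, trial_energy
    return pts, energy

rng = random.Random(1)
start_points = [random_point_in_disk(rng) for _ in range(15)]
start_energy = coulomb_energy(start_points)
final_points, final_energy = relax(start_points)
print(start_energy, final_energy)
```

With only a handful of points, most of them get pushed out to the boundary of the disk, so seeing a triangular lattice with defects takes many more points and a better optimizer.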

The lines between the dots are just to help you see what’s going on. They’re showing the Delaunay triangulation, where we draw a graph that divides the plane into regions closer to one electron than all the rest, and then take the dual of that graph.

Thanks to energy minimization, this triangulation wants to be a lattice of equilateral triangles. But since such a triangular lattice doesn’t fit neatly into a disk, we also see some ‘defects’:

Most electrons have 6 neighbors. But there are also some red defects, which are electrons with 5 neighbors, and blue defects, which are electrons with 7 neighbors.

Note that there are 6 clusters of defects. In each cluster there is one more red defect than blue defect. I think this is not a coincidence.

Conjecture. When we choose a sufficiently large number of points x_i on a disk in such a way that

\displaystyle{ \sum_{i \ne j} \frac{1}{\|x_i - x_j\|} }

is minimized, and draw the Delaunay triangulation, there will be 6 more vertices with 5 neighbors than vertices with 7 neighbors.

Here’s a bit of evidence for this, which is not at all conclusive. Take a sphere and triangulate it in such a way that each vertex has 5, 6 or 7 neighbors. Then here’s a cool fact: there must be 12 more vertices with 5 neighbors than vertices with 7 neighbors.

Puzzle. Prove this fact.

If we think of the picture above as the top half of a triangulated sphere, then each vertex in this triangulated sphere has 5, 6 or 7 neighbors. So, there must be 12 more vertices on the sphere with 5 neighbors than with 7 neighbors. So, it makes some sense that the top half of the sphere will contain 6 more vertices with 5 neighbors than with 7 neighbors. But this is not a proof.

I have a feeling this energy minimization problem has been studied with various numbers of points. So, there should either be a lot of evidence for my conjecture, or some counterexamples that will force me to refine it. The picture shows what happens with 600 points on the disk. Maybe something dramatically different happens with 599! Maybe someone has even proved theorems about this. I just haven’t had time to look for such work.

The picture here was drawn by Arunas.rv and placed on Wikimedia Commons under a Creative Commons Attribution-Share Alike 3.0 Unported license.

A Universal Snake-like Continuum

27 November, 2017

It sounds like jargon from a bad episode of Star Trek. But it’s a real thing. It’s a monstrous object that lives in the plane, but is impossible to draw.

Do you want to see how snake-like it is? Okay, but beware… this video clip is a warning:

This snake-like monster is also called the ‘pseudo-arc’. It’s the limit of a sequence of curves that get more and more wiggly. Here are the 5th and 6th curves in the sequence:

Here are the 8th and 10th:

But what happens if you try to draw the pseudo-arc itself, the limit of all these curves? It turns out to be infinitely wiggly—so wiggly that any picture of it is useless.

In fact Wayne Lewis and Piotr Minc wrote a paper about this, called Drawing the pseudo-arc. That’s where I got these pictures. The paper also shows stage 200, and it’s a big fat ugly black blob!

But the pseudo-arc is beautiful if you see through the pictures to the concepts, because it’s a universal snake-like continuum. Let me explain. This takes some math.

The nicest metric spaces are compact metric spaces, and each of these can be written as the union of connected components… so there’s a long history of interest in compact connected metric spaces. Except for the empty set, which probably doesn’t deserve to be called connected, these spaces are called continua.

Like all point-set topology, the study of continua is considered a bit old-fashioned, because people have been working on it for so long, and it’s hard to get good new results. But on the bright side, what this means is that many great mathematicians have contributed to it, and there are lots of nice theorems. You can learn about it here:

• W. T. Ingram, A brief historical view of continuum theory,
Topology and its Applications 153 (2006), 1530–1539.

• Sam B. Nadler, Jr, Continuum Theory: An Introduction, Marcel Dekker, New York, 1992.

Now, if we’re doing topology, we should really talk not about metric spaces but about metrizable spaces: that is, topological spaces where the topology comes from some metric, which is not necessarily unique. This nuance is a way of clarifying that we don’t really care about the metric, just the topology.

So, we define a continuum to be a nonempty compact connected metrizable space. When I think of this I think of a curve, or a ball, or a sphere. Or maybe something bigger like the Hilbert cube: the countably infinite product of closed intervals. Or maybe something full of holes, like the Sierpinski carpet:

or the Menger sponge:

Or maybe something weird like a solenoid:

Very roughly, a continuum is ‘snake-like’ if it’s long and skinny and doesn’t loop around. But the precise definition is a bit harder:

We say that an open cover 𝒰 of a space X refines an open cover 𝒱 if each element of 𝒰 is contained in an element of 𝒱. We call a continuum X snake-like if each open cover of X can be refined by an open cover U1, …, Un such that for any i, j the intersection of Ui and Uj is nonempty iff i and j are right next to each other.

Such a cover is called a chain, so a snake-like continuum is also called chainable. But ‘snake-like’ is so much cooler: we should take advantage of any opportunity to bring snakes into mathematics!
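To see the chain condition in the simplest setting, here is a hedged Python sketch that tests whether a finite list of open intervals forms a chain. It assumes each set is an open interval (a, b) given as a pair, and it reads ‘right next to each other’ as |i − j| ≤ 1, so each set is also required to meet itself. The name `is_chain` is made up, and the function does not check that the intervals actually cover the space.

```python
def is_chain(intervals):
    """Check the chain condition for open intervals U_1, ..., U_n.

    `intervals` is a list of pairs (a, b) with a < b, listed in order.
    U_i and U_j must intersect exactly when |i - j| <= 1.
    """
    def meets(u, v):
        # Two open intervals intersect iff the larger left endpoint
        # lies strictly below the smaller right endpoint.
        return max(u[0], v[0]) < min(u[1], v[1])

    n = len(intervals)
    return all(
        meets(intervals[i], intervals[j]) == (abs(i - j) <= 1)
        for i in range(n)
        for j in range(n)
    )
```

For example, the intervals (−0.1, 0.4), (0.3, 0.7), (0.6, 1.1) form a chain covering [0,1], while (−0.1, 0.6), (0.3, 0.7), (0.5, 1.1) do not, because the first and third overlap.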

The simplest snake-like continuum is the closed unit interval [0,1]. It’s hard to think of others. But here’s what Mioduszewski proved in 1962: the pseudo-arc is a universal snake-like continuum. That is: it’s a snake-like continuum, and it has a continuous map onto every snake-like continuum!

This is a way of saying that the pseudo-arc is the most complicated snake-like continuum possible. A bit more precisely: it bends back on itself as much as possible while still going somewhere! You can see this from the pictures above, or from the construction on Wikipedia:

• Wikipedia, Pseudo-arc.

I like the idea that there’s a subset of the plane with this simple ‘universal’ property, which however is so complicated that it’s impossible to draw.

Here’s the paper where these pictures came from:

• Wayne Lewis and Piotr Minc, Drawing the pseudo-arc, Houston J. Math. 36 (2010), 905–934.

The pseudo-arc has other amazing properties. For example, it’s ‘indecomposable’. A nonempty connected closed subset of a continuum is a continuum in its own right, called a subcontinuum, and we say a continuum is indecomposable if it is not the union of two proper subcontinua.

It takes a while to get used to this idea, since all the examples of continua that I’ve listed so far are decomposable except for the pseudo-arc and the solenoid!

Of course a single point is an indecomposable continuum, but that example is so boring that people sometimes exclude it. The first interesting example was discovered by Brouwer in 1910. It’s the intersection of an infinite sequence of sets like this:

It’s called the Brouwer–Janiszewski–Knaster continuum or buckethandle. Like the solenoid, it shows up as an attractor in some chaotic dynamical systems.

It’s easy to imagine how if you write the buckethandle as the union of two closed proper subsets, at least one will be disconnected. And note: you don’t even need these subsets to be disjoint! So, it’s an indecomposable continuum.

But once you get used to indecomposable continua, you’re ready for the next level of weirdness. An even more dramatic thing is a hereditarily indecomposable continuum: one for which each subcontinuum is also indecomposable.

Apart from a single point, the pseudo-arc is the unique hereditarily indecomposable snake-like continuum! I believe this was first proved here:

• R. H. Bing, Concerning hereditarily indecomposable continua, Pacific J. Math. 1 (1951), 43–51.

Finally, here’s one more amazing fact about the pseudo-arc. To explain it, I need a bunch more nice math:

Every continuum arises as a closed subset of the Hilbert cube. There’s an obvious way to define the distance between two closed subsets of a compact metric space, called the Hausdorff distance—if you don’t know about this already, it’s fun to reinvent it yourself. The set of all closed subsets of a compact metric space thus forms a metric space in its own right—and by the way, the Blaschke selection theorem says this metric space is again compact!

Anyway, this stuff means that there’s a metric space whose points are all subcontinua of the Hilbert cube, and we don’t miss out on any continua by looking at these. So we can call this the space of all continua.

Now for the amazing fact: pseudo-arcs are dense in the space of all continua!

I don’t know who proved this. It’s mentioned here:

• Trevor L. Irwin and Sławomir Solecki, Projective Fraïssé limits and the pseudo-arc.

but they refer to this paper as a good source for such facts:

• Wayne Lewis, The pseudo-arc, Bol. Soc. Mat. Mexicana (3) 5 (1999), 25–77.

Abstract. The pseudo-arc is the simplest nondegenerate hereditarily indecomposable continuum. It is, however, also the most important, being homogeneous, having several characterizations, and having a variety of useful mapping properties. The pseudo-arc has appeared in many areas of continuum theory, as well as in several topics in geometric topology, and is beginning to make its appearance in dynamical systems. In this monograph, we give a survey of basic results and examples involving the pseudo-arc. A more complete treatment will be given in a book dedicated to this topic, currently under preparation by this author. We omit formal proofs from this presentation, but do try to give indications of some basic arguments and construction techniques. Our presentation covers the following major topics: 1. Construction 2. Homogeneity 3. Characterizations 4. Mapping properties 5. Hyperspaces 6. Homeomorphism groups 7. Continuous decompositions 8. Dynamics.

It may seem surprising that one can write a whole book about the pseudo-arc… but if you like continua, it’s a fundamental structure just like spheres and cubes!

The Golden Ratio and the Entropy of Braids

22 November, 2017

Here’s a cute connection between topological entropy, braids, and the golden ratio. I learned about it in this paper:

• Jean-Luc Thiffeault and Matthew D. Finn, Topology, braids, and mixing in fluids.

Topological entropy

I’ve talked a lot about entropy on this blog, but not much about topological entropy. This is a way to define the entropy of a continuous map f from a compact topological space X to itself. The idea is that a map that mixes things up a lot should have a lot of entropy. In particular, any map defining a ‘chaotic’ dynamical system should have positive entropy, while non-chaotic maps should have zero entropy.

How can we make this precise? First, cover X with finitely many open sets U_1, \dots, U_k. Then take any point in X, apply the map f to it over and over, say n times, and report which open set the point lands in each time. You can record this information in a string of symbols. How much information does this string have? The easiest way to define this is to simply count the total number of strings that can be produced this way by choosing different points initially. Then, take the logarithm of this number.

Of course the answer depends on n, typically growing bigger as n increases. So, divide it by n and try to take the limit as n \to \infty. Or, to be careful, take the lim sup: this could be infinite, but it’s always well-defined. This will tell us how much new information we get, on average, each time we apply the map and report which set our point lands in.

And of course the answer also depends on our choice of open cover U_1, \dots, U_k. So, take the supremum over all finite open covers. This is called the topological entropy of f.

Believe it or not, this is often finite! Even though the log of the number of symbol strings we get will be larger when we use a cover with lots of small sets, when we divide by n and take the limit as n \to \infty this dependence often washes out.
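Here is a rough Python sketch of the string-counting recipe for one standard example, the doubling map x ↦ 2x mod 1. It simplifies the definition in ways that are my assumptions, not part of the text above: it uses the two-piece partition [0, 1/2), [1/2, 1) instead of an open cover, it samples finitely many starting points instead of all of them, and it looks at a single n instead of taking a lim sup. The function names are hypothetical. Points are stored as exact fractions numer/denom to avoid floating-point trouble.

```python
from math import log

def itinerary(numer, denom, n):
    """Record which half of [0, 1) the point numer/denom lies in at each of n steps."""
    symbols = []
    for _ in range(n):
        symbols.append(0 if 2 * numer < denom else 1)
        numer = (2 * numer) % denom  # apply the doubling map exactly
    return tuple(symbols)

def entropy_estimate(n):
    """log(number of distinct length-n strings) / n for the doubling map."""
    denom = 2 ** (n + 1)
    # Odd numerators put one sample point in each dyadic interval of length 2^-n.
    strings = {itinerary(k, denom, n) for k in range(1, denom, 2)}
    return log(len(strings)) / n
```

With these sample points every one of the 2^n possible strings shows up, so `entropy_estimate(n)` equals log 2 up to rounding, which is the known topological entropy of the doubling map.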


Any braid gives a bunch of maps from the disc to itself. So, we define the entropy of a braid to be the minimum—or more precisely, the infimum—of the topological entropies of these maps.

How does a braid give a bunch of maps from the disc to itself? Imagine the disc as made of very flexible rubber. Grab it at some finite set of points and then move these points around in the pattern traced out by the braid. When you’re done you get a map from the disc to itself. The map you get is not unique, since the rubber is wiggly and you could have moved the points around in slightly different ways. So, you get a bunch of maps.

I’m being sort of lazy in giving precise details here, since the idea seems so intuitively obvious. But that could be because I’ve spent a lot of time thinking about braids, the braid group, and their relation to maps from the disc to itself!

This picture by Thiffeault and Finn may help explain the idea:

As we keep moving points around each other, we keep building up more complicated braids with 4 strands, and keep getting more complicated maps from the disc to itself. In fact, these maps are often chaotic! More precisely: they often have positive entropy.

In this other picture the vertical axis represents time, and we more clearly see the braid traced out as our 4 points move around:

Each horizontal slice depicts a map from the disc (or square: this is topology!) to itself, but we only see their effect on a little rectangle drawn in black.

The golden ratio

Okay, now for the punchline!

Puzzle 1. Which braid with 3 strands has the highest entropy per generator? What is its entropy per generator?

I should explain: any braid with 3 strands can be written as a product of generators \sigma_1, \sigma_2, \sigma_1^{-1}, \sigma_2^{-1}. Here \sigma_1 switches strands 1 and 2 moving them counterclockwise around each other, \sigma_2 does the same for strands 2 and 3, and \sigma_1^{-1} and \sigma_2^{-1} do the same but moving the strands clockwise.

For any braid we can write it as a product of n generators with n as small as possible, and then we can evaluate its entropy divided by n. This is the right way to compare the entropy of braids, because if a braid gives a chaotic map we expect powers of that braid to have entropy growing linearly with n.

Now for the answer to the puzzle!

Answer 1. A 3-strand braid maximizing the entropy per generator is \sigma_1 \sigma_2^{-1}. And the entropy of this braid, per generator, is the logarithm of the golden ratio:

\displaystyle{ \log \left( \frac{\sqrt{5} + 1}{2} \right) }

In other words, the entropy of this braid is

\displaystyle{ 2 \log \left( \frac{\sqrt{5} + 1}{2} \right) }

All this works regardless of which base we use for our logarithms. But if we use base e, which seems pretty natural, the maximum possible entropy per generator is

\displaystyle{ \ln \left( \frac{\sqrt{5} + 1}{2} \right) \approx 0.48121182506\dots }

Or if you prefer base 2, then each time you stir around a point in the disc with this braid, you’re creating

\displaystyle{ \log_2 \left( \frac{\sqrt{5} + 1}{2} \right) \approx 0.69424191363\dots }

bits of unknown information.
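These numbers are easy to reproduce. Here is a minimal Python check; the variable names are my own.

```python
from math import log, log2, sqrt

phi = (1 + sqrt(5)) / 2           # the golden ratio

per_generator_nats = log(phi)     # entropy per generator, base e
per_generator_bits = log2(phi)    # entropy per generator, base 2

# The golden braid has two generators, so its total entropy is twice
# the per-generator value, i.e. the log of the golden ratio squared.
braid_entropy_nats = 2 * per_generator_nats
```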

This fact was proved here:

• D. D’Alessandro, M. Dahleh and I. Mezić, Control of mixing in fluid flow: A maximum entropy approach, IEEE Transactions on Automatic Control 44 (1999), 1852–1863.

So, people call this braid \sigma_1 \sigma_2^{-1} the golden braid. But since you can use it to generate entropy forever, perhaps it should be called the eternal golden braid.

What does it all mean? Well, the 3-strand braid group is called \mathrm{B}_3, and I wrote a long story about it:

• John Baez, This Week’s Finds in Mathematical Physics (Week 233).

You’ll see there that \mathrm{B}_3 has a representation as 2 × 2 matrices:

\displaystyle{ \sigma_1 \mapsto \left(\begin{array}{rr} 1 & 1 \\ 0 & 1 \end{array}\right)}

\displaystyle{ \sigma_2 \mapsto \left(\begin{array}{rr} 1 & 0 \\ -1 & 1 \end{array}\right) }

These matrices are shears, which is connected to how the braids \sigma_1 and \sigma_2 give maps from the disc to itself that shear points. If we take the golden braid and turn it into a matrix using this representation, we get a matrix for which the magnitude of its largest eigenvalue is the square of the golden ratio! So, the amount of stretching going on is ‘the golden ratio per generator’.
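Here is a short Python sketch that checks this claim using plain 2 × 2 integer matrices. The helper names `matmul` and `spectral_radius` are made up for illustration, and `spectral_radius` relies on the fact that all the matrices involved have determinant 1, so the eigenvalues are determined by the trace.

```python
from math import sqrt

def matmul(X, Y):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def spectral_radius(M):
    """Largest eigenvalue magnitude of a real 2x2 matrix with determinant 1."""
    t = abs(M[0][0] + M[1][1])
    if t <= 2:
        return 1.0  # eigenvalues lie on the unit circle
    return (t + sqrt(t * t - 4)) / 2

S1 = ((1, 1), (0, 1))        # image of sigma_1
S2 = ((1, 0), (-1, 1))       # image of sigma_2
S2_INV = ((1, 0), (1, 1))    # image of sigma_2^{-1}

PHI = (1 + sqrt(5)) / 2
GOLDEN = matmul(S1, S2_INV)  # image of the golden braid sigma_1 sigma_2^{-1}
```

The product `GOLDEN` works out to the matrix with rows (2, 1) and (1, 1), which has trace 3 and determinant 1, so its largest eigenvalue is (3 + √5)/2, the square of the golden ratio. The same helpers confirm that S1 and S2 satisfy the braid relation σ₁σ₂σ₁ = σ₂σ₁σ₂.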

I guess this must be part of the story too:

Puzzle 2. Is it true that when we multiply n matrices of the form

\displaystyle{ \left(\begin{array}{rr} 1 & 1 \\ 0 & 1 \end{array}\right)  , \quad \left(\begin{array}{rr} 1 & 0 \\ -1 & 1 \end{array}\right) }

or their inverses:

\displaystyle{ \left(\begin{array}{rr} 1 & -1 \\ 0 & 1 \end{array}\right)  , \quad \left(\begin{array}{rr} 1 & 0 \\ 1 & 1 \end{array}\right) }

the magnitude of the largest eigenvalue of the resulting product can never exceed the nth power of the golden ratio?
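One way to gather evidence for Puzzle 2 is a brute-force search over short products. The Python sketch below is only that, a numerical experiment for small n and not a proof; the function names are hypothetical, and the eigenvalue formula again uses the fact that every product has determinant 1.

```python
from itertools import product
from math import sqrt

PHI = (1 + sqrt(5)) / 2

# The two shears from the puzzle and their inverses.
GENERATORS = [
    ((1, 1), (0, 1)),
    ((1, 0), (-1, 1)),
    ((1, -1), (0, 1)),
    ((1, 0), (1, 1)),
]

def matmul(X, Y):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def spectral_radius(M):
    """Largest eigenvalue magnitude of a real 2x2 matrix with determinant 1."""
    t = abs(M[0][0] + M[1][1])
    return 1.0 if t <= 2 else (t + sqrt(t * t - 4)) / 2

def max_spectral_radius(n):
    """Largest spectral radius over all products of n generators."""
    best = 0.0
    for word in product(GENERATORS, repeat=n):
        M = ((1, 0), (0, 1))
        for G in word:
            M = matmul(M, G)
        best = max(best, spectral_radius(M))
    return best
```

Comparing `max_spectral_radius(n)` with `PHI ** n` for the first few n is cheap, since there are only 4^n words to try. For n = 2 the maximum is attained by the golden braid matrix and equals the square of the golden ratio.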

There’s also a strong connection between braid groups, certain quasiparticles in the plane called Fibonacci anyons, and the golden ratio. But I don’t see the relation between these things and topological entropy! So, there is a mystery here—at least for me.

For more, see:

• Matthew D. Finn and Jean-Luc Thiffeault, Topological optimisation of rod-stirring devices, SIAM Review 53 (2011), 723–743.

Abstract. There are many industrial situations where rods are used to stir a fluid, or where rods repeatedly stretch a material such as bread dough or taffy. The goal in these applications is to stretch either material lines (in a fluid) or the material itself (for dough or taffy) as rapidly as possible. The growth rate of material lines is conveniently given by the topological entropy of the rod motion. We discuss the problem of optimising such rod devices from a topological viewpoint. We express rod motions in terms of generators of the braid group, and assign a cost based on the minimum number of generators needed to write the braid. We show that for one cost function—the topological entropy per generator—the optimal growth rate is the logarithm of the golden ratio. For a more realistic cost function, involving the topological entropy per operation where rods are allowed to move together, the optimal growth rate is the logarithm of the silver ratio, 1+ \sqrt{2}. We show how to construct devices that realise this optimal growth, which we call silver mixers.

Here is the silver ratio:

But now for some reason I feel it’s time to stop!

Applied Category Theory at UCR (Part 3)

13 November, 2017

We had a special session on applied category theory here at UCR:

Applied category theory, Fall Western Sectional Meeting of the AMS, 4-5 November 2017, U.C. Riverside.

A bunch of people stayed for a few days afterwards, and we had a lot of great discussions. I wish I could explain everything that happened, but I’m too busy right now. Luckily, even if you couldn’t come here, you can now see slides of almost all the talks… and videos of many!

Click on talk titles to see abstracts. For multi-author talks, the person whose name is in boldface is the one who gave the talk. For videos, go here: I haven’t yet created links to all the videos.

Saturday November 4, 2017

9:00 a.m.
A higher-order temporal logic for dynamical systems (talk slides).
David I. Spivak, MIT.

10:00 a.m.
Algebras of open dynamical systems on the operad of wiring diagrams (talk slides).
Dmitry Vagner, Duke University
David I. Spivak, MIT
Eugene Lerman, University of Illinois at Urbana-Champaign

10:30 a.m.
Abstract dynamical systems (talk slides).
Christina Vasilakopoulou, UCR
David Spivak, MIT
Patrick Schultz, MIT

3:00 p.m.
Decorated cospans (talk slides).
Brendan Fong, MIT

4:00 p.m.
Compositional modelling of open reaction networks (talk slides).
Blake S. Pollard, UCR
John C. Baez, UCR

4:30 p.m.
A bicategory of coarse-grained Markov processes (talk slides).
Kenny Courser, UCR

5:00 p.m.
A bicategorical syntax for pure state qubit quantum mechanics (talk slides).
Daniel M. Cicala, UCR

5:30 p.m.
Open systems in classical mechanics (talk slides).
Adam Yassine, UCR

Sunday November 5, 2017

9:00 a.m.
Controllability and observability: diagrams and duality (talk slides).
Jason Erbele, Victor Valley College

9:30 a.m.
Frobenius monoids, weak bimonoids, and corelations (talk slides).
Brandon Coya, UCR

10:00 a.m.
Compositional design and tasking of networks.
John D. Foley, Metron, Inc.
John C. Baez, UCR
Joseph Moeller, UCR
Blake S. Pollard, UCR

10:30 a.m.
Operads for modeling networks (talk slides).
Joseph Moeller, UCR
John Foley, Metron Inc.
John C. Baez, UCR
Blake S. Pollard, UCR

2:00 p.m.
Reeb graph smoothing via cosheaves (talk slides).
Vin de Silva, Department of Mathematics, Pomona College

3:00 p.m.
Knowledge representation in bicategories of relations (talk slides).
Evan Patterson, Stanford University, Statistics Department

3:30 p.m.
The multiresolution analysis of flow graphs (talk slides).
Steve Huntsman, BAE Systems

4:00 p.m.
Data modeling and integration using the open source tool Algebraic Query Language (AQL) (talk slides).
Peter Y. Gates, Categorical Informatics
Ryan Wisnesky, Categorical Informatics

Biology as Information Dynamics (Part 3)

9 November, 2017

On Monday I’m giving this talk at Caltech:

Biology as information dynamics, November 13, 2017, 4:00–5:00 pm, General Biology Seminar, Kerckhoff 119, Caltech.

If you’re around, please check it out! I’ll be around all day talking to people, including Erik Winfree, my graduate student host Fangzhou Xiao, and other grad students.

If you can’t make it, you can watch this video! It’s a neat subject, and I want to do more on it:

Abstract. If biology is the study of self-replicating entities, and we want to understand the role of information, it makes sense to see how information theory is connected to the ‘replicator equation’ — a simple model of population dynamics for self-replicating entities. The relevant concept of information turns out to be the information of one probability distribution relative to another, also known as the Kullback–Leibler divergence. Using this we can get a new outlook on free energy, see evolution as a learning process, and give a clearer, more general formulation of Fisher’s fundamental theorem of natural selection.