Information Processing in Chemical Networks

4 January, 2017

There’s a workshop this summer:

• Dynamics, Thermodynamics and Information Processing in Chemical Networks, 13-16 June 2017, Complex Systems and Statistical Mechanics Group, University of Luxembourg. Organized by Massimiliano Esposito and Matteo Polettini.

They write, “The idea of the workshop is to bring in contact a small number of high-profile research groups working at the frontier between physics and biochemistry, with particular emphasis on the role of Chemical Networks.”

Invited speakers include Vassily Hatzimanikatis, John Baez, Christoph Flamm, Hong Qian, Joshua D. Rabinowitz, Luca Cardelli, Erik Winfree, David Soloveichik, Stefan Schuster, David Fell and Arren Bar-Even. There will also be a session of shorter seminars by researchers from local institutions such as the Luxembourg Centre for Systems Biomedicine. I believe attendance is by invitation only, so I’ll endeavor to make some of the ideas presented available here at this blog.

Some of the people involved

I’m looking forward to this, in part because there will be a mix of speakers I’ve met, speakers I know but haven’t met, and speakers I don’t know yet. I feel like reminiscing a bit, and I hope you’ll forgive me these reminiscences, since if you try the links you’ll get an introduction to the interface between computation and chemical reaction networks.

In part 25 of the network theory series here, I imagined an arbitrary chemical reaction network and said:

We could try to use these reactions to build a ‘chemical computer’. But how powerful can such a computer be? I don’t know the answer.

Luca Cardelli answered my question in part 26. This was just my first introduction to the wonderful world of chemical computing. Erik Winfree has a DNA and Natural Algorithms Group at Caltech, practically next door to Riverside, and the people there do a lot of great work on this subject. David Soloveichik, now at U. T. Austin, is an alumnus of this group.

In 2014 I met all three of these folks, and many other cool people working on this theme, at a workshop I tried to summarize here:

Programming with chemical reaction networks, Azimuth, 23 March 2014.

The computational power of chemical reaction networks, 10 June 2014.

Chemical reaction network talks, 26 June 2014.

I met Matteo Polettini about a year later, at a really big workshop on chemical reaction networks run by Elisenda Feliu and Carsten Wiuf:

Trends in reaction network theory (part 1), Azimuth, 27 January 2015.

Trends in reaction network theory (part 2), Azimuth, 1 July 2015.

Polettini has his own blog, very much worth visiting. For example, you can see his view of the same workshop here:

• Matteo Polettini, Mathematical trends in reaction network theory: part 1 and part 2, Out of Equilibrium, 1 July 2015.

Finally, I met Massimiliano Esposito and Christoph Flamm recently at the Santa Fe Institute, at a workshop summarized here:

Information processing and biology, Azimuth, 7 November 2016.

So, I’ve gradually become educated in this area, and I hope that by June I’ll be ready to say something interesting about the semantics of chemical reaction networks. Blake Pollard and I are writing a paper about this now.

Semantics for Physicists

7 December, 2016

I once complained that my student Brendan Fong said ‘semantics’ too much. You see, I’m in a math department, but he was actually in the computer science department at Oxford: I was his informal supervisor. Theoretical computer scientists love talking about syntax versus semantics—that is, written expressions versus what those expressions actually mean, or programs versus what those programs actually do. So Brendan was very comfortable with that distinction. But my other grad students, coming from a math department, didn’t understand it… and he was mentioning it in practically every other sentence.

In 1963, in his PhD thesis, Bill Lawvere figured out a way to talk about syntax versus semantics that even mathematicians—well, even category theorists—could understand. It’s called ‘functorial semantics’. The idea is that things you write are morphisms in some category X, while their meanings are morphisms in some other category Y. There’s a functor F \colon X \to Y which sends things you write to their meanings. This functor sends syntax to semantics!
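Here is a toy illustration of the idea (my own, far simpler than anything in Lawvere’s thesis): syntax is a tree of arithmetic expressions, semantics is numbers, and the function `meaning` plays the role of the functor F, sending written expressions to what they denote.

```python
# Toy sketch of syntax vs. semantics. All names here are my own choices.

def meaning(expr):
    """Map a syntactic expression (nested tuples) to its semantic value."""
    if isinstance(expr, (int, float)):   # a numeral denotes itself
        return expr
    op, left, right = expr               # an operation node
    if op == '+':
        return meaning(left) + meaning(right)
    if op == '*':
        return meaning(left) * meaning(right)
    raise ValueError(f"unknown operation: {op}")

# Two syntactically different expressions...
e1 = ('+', ('*', 2, 3), 1)   # written: (2 * 3) + 1
e2 = ('+', 1, ('*', 3, 2))   # written: 1 + (3 * 2)

# ...can have the same meaning: the 'functor' identifies them semantically.
assert meaning(e1) == meaning(e2) == 7
```

The point of functoriality is that building a bigger expression out of smaller ones corresponds to combining their meanings, which is exactly what the recursive calls do.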

But physicists may not enjoy this idea unless they see it at work in physics. In physics, too, the distinction is important! But it takes a while to understand. I hope Prakash Panangaden’s talk at the start of the Simons Institute workshop on compositionality is helpful. Check it out:

Jarzynski on Non-Equilibrium Statistical Mechanics

18 November, 2016


Here at the Santa Fe Institute we’re having a workshop on Statistical Physics, Information Processing and Biology. Unfortunately the talks are not being videotaped, so it’s up to me to spread the news of what’s going on here.

Christopher Jarzynski is famous for discovering the Jarzynski equality. It says

\displaystyle{ e^ { -\Delta F / k T} = \langle e^{ -W/kT } \rangle }

where k is Boltzmann’s constant and T is the temperature of a system that’s in equilibrium before some work is done on it. \Delta F is the change in free energy, W is the amount of work, and the angle brackets represent an average over the possible options for what takes place—this sort of process is typically nondeterministic.

We’ve seen a good quick explanation of this equation here on Azimuth:

• Eric Downes, Crooks’ Fluctuation Theorem, Azimuth, 30 April 2011.

We’ve also gotten a proof, where it was called the ‘integral fluctuation theorem’:

• Matteo Smerlak, The mathematical origin of irreversibility, Azimuth, 8 October 2012.

It’s a fundamental result in nonequilibrium statistical mechanics—a subject where inequalities are so common that this equation is called an ‘equality’.
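The equality is easy to check numerically in a toy model. Here is a minimal Monte Carlo sketch (my own, not from any of the sources above), using the standard fact that for a Gaussian work distribution W ~ N(μ, σ²) the equality gives ΔF = μ − σ²/(2kT) exactly:

```python
import numpy as np

# Toy check of the Jarzynski equality for Gaussian work distributions.
# All parameter values are arbitrary choices for illustration.

rng = np.random.default_rng(0)
kT = 1.0
mu, sigma = 2.0, 1.0                      # mean and spread of the work W

W = rng.normal(mu, sigma, size=1_000_000)
delta_F_estimate = -kT * np.log(np.mean(np.exp(-W / kT)))
delta_F_exact = mu - sigma**2 / (2 * kT)  # = 1.5 with these numbers

assert abs(delta_F_estimate - delta_F_exact) < 0.01
# Note <W> = 2.0 > Delta F = 1.5: on average you do more work than the
# free energy change, consistent with the second law.
```

The last comment is the point of the whole subject: Jensen’s inequality applied to the equality recovers ⟨W⟩ ≥ ΔF, the second law.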

Two days ago, Jarzynski gave an incredibly clear hour-long tutorial on this subject, starting with the basics of thermodynamics and zipping forward to modern work. With his permission, you can see the slides here:

• Christopher Jarzynski, A brief introduction to the delights of non-equilibrium statistical physics.

Also try this review article:

• Christopher Jarzynski, Equalities and inequalities: irreversibility and the Second Law of thermodynamics at the nanoscale, Séminaire Poincaré XV Le Temps (2010), 77–102.

Kosterlitz–Thouless Transition

7 October, 2016

Three days ago, the 2016 Nobel Prize in Physics was awarded to Michael Kosterlitz of Brown University:


David Thouless of the University of Washington:


and Duncan Haldane of Princeton University:


They won it for their “theoretical discovery of topological phase transitions and topological phases of matter”, which was later confirmed by many experiments.

Sadly, the world’s reaction was aptly summarized by Wired magazine’s headline:

Nobel Prize in Physics Goes to Another Weird Thing Nobody Understands

Journalists worldwide struggled to pronounce ‘topology’, and a member of the Nobel prize committee was reduced to waving around a bagel and a danish to explain what the word means:

That’s fine as far as it goes: I’m all for using food items to explain advanced math! However, it doesn’t explain what Kosterlitz, Thouless and Haldane actually did. I think a 3-minute video with the right animations would make the beauty of their work perfectly clear. I can see it in my head. Alas, I don’t have the skill to make those animations—hence this short article.

I’ll just explain the Kosterlitz–Thouless transition, which is an effect that shows up in thin films of magnetic material. Haldane’s work on magnetic wires is related, but it deserves a separate story.

I’m going to keep this very quick! For more details, try this excellent blog article:

• Brian Skinner, Samuel Beckett’s guide to particles and antiparticles, Ribbonfarm, 24 September 2015.

I’m taking all my pictures from there.

The idea

Imagine a thin film of stuff where each atom’s spin likes to point in the same direction as its neighbors. Also suppose that each spin must point in the plane of the material.

Your stuff will be happiest when all its spins are lined up, like this:


What does ‘happy’ mean? Physicists often talk this way. It sounds odd, but it means something precise: it means that the energy is low. When your stuff is very cold, its energy will be as low as possible, so the spins will line up.

When you heat up your thin film, it gets a bit more energy, so the spins can do more interesting things.

Here’s one interesting possibility, called a ‘vortex’:


The spins swirl around like the flow of water in a whirlpool. Each spin is fairly close to being lined up to its neighbors, except near the middle where they’re doing a terrible job.

The total energy of a vortex is enormous. The reason is not the problem at the middle, which certainly contributes some energy. The reason is that ‘fairly’ close is not good enough. The spins fail to perfectly line up with their neighbors even far away from the middle of this picture. This problem is bad enough to make the energy huge. (In fact, the energy would be infinite if our thin film of material went on forever.)

So, even if you heat up your substance, there won’t be enough energy to make many vortices. This made people think vortices were irrelevant.

But there’s another possibility, called an ‘antivortex’:


A single antivortex has a huge energy, just like a vortex. So again, it might seem antivortices are irrelevant if you’re wondering what your stuff will do when it has just a little energy.

But here’s what Kosterlitz and Thouless noticed: the combination of a vortex together with an antivortex has much less energy than either one alone! So, when your thin film of stuff is hot enough, the spins will form ‘vortex-antivortex pairs’.

Brian Skinner has made a beautiful animation showing how this happens. A vortex-antivortex pair can appear out of nothing:


… and then disappear again!

Thanks to this process, at low temperatures our thin film will contain a dilute ‘gas’ of vortex-antivortex pairs. Each vortex will stick to an antivortex, since it takes a lot of energy to separate them. These vortex-antivortex pairs act a bit like particles: they move around, bump into each other, and so on. But unlike most ordinary particles, they can appear out of nothing, or disappear, in the process shown above!

As you heat up the thin film, you get more and more vortex-antivortex pairs, since there’s more energy available to create them. But here’s the really surprising thing. Kosterlitz and Thouless showed that as you turn up the heat, there’s a certain temperature at which the vortex-antivortex pairs suddenly ‘unbind’ and break apart!

Why? Because at this point, the density of vortex-antivortex pairs is so high, and they’re bumping into each other so much, that we can’t tell which vortex is the partner of which antivortex. All we’ve got is a thick soup of vortices and antivortices!

What’s interesting is that this happens suddenly at some particular temperature. It’s a bit like how ice suddenly turns into liquid water when it warms above its melting point. A sudden change in behavior like this is called a phase transition.

So, the Kosterlitz–Thouless transition is the sudden unbinding of the vortex-antivortex pairs as you heat up a thin film of stuff where the spins are confined to a plane and they like to line up.

In fact, the pictures above are relevant to many other situations, like thin films of superconductive materials. So, these too can exhibit a Kosterlitz–Thouless transition. Indeed, the work of Kosterlitz and Thouless was the key that unlocked a treasure room full of strange new states of matter, called ‘topological phases’. But this is another story.


What is the actual definition of a vortex or antivortex? As you march around either one and look at the little arrows, the arrows turn around—one full turn. It’s a vortex if when you walk around it clockwise the little arrows make a full turn clockwise:


It’s an antivortex if when you walk around it clockwise the little arrows make a full turn counterclockwise:


Topologists would say the vortex has ‘winding number’ 1, while the antivortex has winding number -1.

In the physics, the winding number is very important. Any collection of vortex-antivortex pairs has winding number 0, and Kosterlitz and Thouless showed that situations with winding number 0 are the only ones with small enough energy to be important for a large thin film at rather low temperatures.
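The winding-number definition above is concrete enough to compute. Here is a small sketch (my own): walk around a loop, record the spin angle at each step, add up the angle changes wrapped to (−π, π], and divide by 2π.

```python
import numpy as np

# Compute the winding number of a planar spin field around the origin.
# The field is given as a function returning the spin angle at (x, y).

def winding_number(spin_angle, n_steps=100, radius=1.0):
    """Total turning of the spins along a loop, in units of full turns."""
    t = np.linspace(0, 2 * np.pi, n_steps, endpoint=False)
    x, y = radius * np.cos(t), radius * np.sin(t)
    angles = spin_angle(x, y)
    diffs = np.diff(np.concatenate([angles, angles[:1]]))
    diffs = (diffs + np.pi) % (2 * np.pi) - np.pi   # wrap to (-pi, pi]
    return round(diffs.sum() / (2 * np.pi))

vortex     = lambda x, y: np.arctan2(y, x)    # spins swirl one way
antivortex = lambda x, y: -np.arctan2(y, x)   # ...and the other way

assert winding_number(vortex) == 1
assert winding_number(antivortex) == -1
```
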

Now for the puzzles:

Puzzle 1: What’s the mirror image of a vortex? A vortex, or an antivortex?

Puzzle 2: What’s the mirror image of an antivortex?

Here are some clues, drawn by the science fiction writer Greg Egan:

and the mathematician Simon Willerton:

For more

To dig a bit deeper, try this:

• The Nobel Prize in Physics 2016, Topological phase transitions and topological phases of matter.

It’s a very well-written summary of what Kosterlitz, Thouless and Haldane did.

Also, check out Simon Burton’s simulation of the system Kosterlitz and Thouless were studying:

In this simulation the spins start out at random and then evolve towards equilibrium at a temperature far below the Kosterlitz–Thouless transition. When equilibrium is reached, we have a gas of vortex-antivortex pairs. Vortices are labeled in blue while antivortices are green (though this is not totally accurate because the lattice is discrete). Burton says that if we raise the temperature to the Kosterlitz–Thouless transition, the movie becomes ‘a big mess’. That’s just what we’d expect as the vortex-antivortex pairs unbind.
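Burton’s actual code isn’t shown here, but the system he simulated, the 2d XY model, takes only a few lines of Metropolis dynamics to sketch. This is my own minimal version, with arbitrary parameter choices; at a temperature well below the transition the spins should end up mostly aligned with their neighbors.

```python
import numpy as np

# Minimal Metropolis simulation of the 2d XY model:
# spins are angles on a square lattice with nearest-neighbor
# energy E = -sum cos(theta_i - theta_j), periodic boundaries.

rng = np.random.default_rng(1)
L, T, sweeps = 12, 0.5, 150          # lattice size, temperature, sweeps
theta = rng.uniform(0, 2 * np.pi, (L, L))

def local_energy(theta, i, j):
    """Energy of site (i, j) with its four neighbors."""
    s = theta[i, j]
    nbrs = [theta[(i + 1) % L, j], theta[(i - 1) % L, j],
            theta[i, (j + 1) % L], theta[i, (j - 1) % L]]
    return -sum(np.cos(s - n) for n in nbrs)

for _ in range(sweeps):
    for i in range(L):
        for j in range(L):
            old = theta[i, j]
            e_old = local_energy(theta, i, j)
            theta[i, j] = old + rng.normal(0, 0.5)   # propose a small rotation
            e_new = local_energy(theta, i, j)
            if rng.random() >= np.exp(-(e_new - e_old) / T):
                theta[i, j] = old                    # reject the move

# Below the transition, neighboring spins are strongly aligned:
mean_alignment = np.mean(np.cos(theta - np.roll(theta, 1, axis=0)))
assert mean_alignment > 0.5
```

Raising T toward the transition temperature makes `mean_alignment` drop as vortex-antivortex pairs proliferate and unbind, which is the ‘big mess’ Burton describes.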

I thank Greg Egan, Simon Burton, Brian Skinner, Simon Willerton and Haitao Zhang, whose work made this blog article infinitely better than it otherwise would be.

Struggles with the Continuum (Part 8)

25 September, 2016

We’ve been looking at how the continuum nature of spacetime poses problems for our favorite theories of physics—problems with infinities. Last time we saw a great example: general relativity predicts the existence of singularities, like black holes and the Big Bang. I explained exactly what these singularities really are. They’re not points or regions of spacetime! They’re more like ways for a particle to ‘fall off the edge of spacetime’. Technically, they are incomplete timelike or null geodesics.

The next step is to ask whether these singularities rob general relativity of its predictive power. The ‘cosmic censorship hypothesis’, proposed by Penrose in 1969, claims they do not.

In this final post I’ll talk about cosmic censorship, and conclude with some big questions… and a place where you can get all these posts in a single file.

Cosmic censorship

To say what we want to rule out, we must first think about what behaviors we consider acceptable. Consider first a black hole formed by the collapse of a star. According to general relativity, matter can fall into this black hole and ‘hit the singularity’ in a finite amount of proper time, but nothing can come out of the singularity.

The time-reversed version of a black hole, called a ‘white hole’, is often considered more disturbing. White holes have never been seen, but they are mathematically valid solutions of Einstein’s equation. In a white hole, matter can come out of the singularity, but nothing can fall in. Naively, this seems to imply that the future is unpredictable given knowledge of the past. Of course, the same logic applied to black holes would say the past is unpredictable given knowledge of the future.

Big Bang Cosmology


If white holes are disturbing, perhaps the Big Bang should be more so. In the usual solutions of general relativity describing the Big Bang, all matter in the universe comes out of a singularity! More precisely, if one follows any timelike geodesic back into the past, it becomes undefined after a finite amount of proper time. Naively, this may seem a massive violation of predictability: in this scenario, the whole universe ‘sprang out of nothing’ about 14 billion years ago.

However, in all three examples so far—astrophysical black holes, their time-reversed versions and the Big Bang—spacetime is globally hyperbolic. I explained what this means last time. In simple terms, it means we can specify initial data at one moment in time and use the laws of physics to predict the future (and past) throughout all of spacetime. How is this compatible with the naive intuition that a singularity causes a failure of predictability?

For any globally hyperbolic spacetime M, one can find a smoothly varying family of Cauchy surfaces S_t (t \in \mathbb{R}) such that each point of M lies on exactly one of these surfaces. This amounts to a way of chopping spacetime into ‘slices of space’ for various choices of the ‘time’ parameter t. For an astrophysical black hole, the singularity is in the future of all these surfaces. That is, an incomplete timelike or null geodesic must go through all these surfaces S_t before it becomes undefined. Similarly, for a white hole or the Big Bang, the singularity is in the past of all these surfaces. In either case, the singularity cannot interfere with our predictions of what occurs in spacetime.

A more challenging example is posed by the Kerr–Newman solution of Einstein’s equation coupled to the vacuum Maxwell equations. When

e^2 + (J/m)^2 < m^2

this solution describes a rotating charged black hole with mass m, charge e and angular momentum J in units where c = G = 1. However, an electron violates this inequality. In 1968, Brandon Carter pointed out that if the electron were described by the Kerr–Newman solution, it would have a gyromagnetic ratio of g = 2, much closer to the true answer than a classical spinning sphere of charge, which gives g = 1. But since

e^2 + (J/m)^2 > m^2

this solution gives a spacetime that is not globally hyperbolic: it has closed timelike curves! It also contains a ‘naked singularity’. Roughly speaking, this is a singularity that can be seen by arbitrarily faraway observers in a spacetime whose geometry asymptotically approaches that of Minkowski spacetime. The existence of a naked singularity implies a failure of global hyperbolicity.
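The inequality is simple enough to turn into a classifier. Here is a trivial sketch (my own, in the same units c = G = 1 as above); the borderline case of equality is the ‘extremal’ solution.

```python
# Classify a Kerr-Newman solution by the inequality e^2 + (J/m)^2 < m^2,
# in units where c = G = 1. Function name and return strings are my own.

def kerr_newman_class(m, e, J):
    """Black hole if the inequality holds; naked singularity otherwise."""
    if e**2 + (J / m)**2 < m**2:
        return "black hole"
    return "naked singularity (not globally hyperbolic)"

# An uncharged, non-rotating solution is an ordinary black hole:
assert kerr_newman_class(m=1.0, e=0.0, J=0.0) == "black hole"

# Spin it fast enough and the horizon disappears:
assert kerr_newman_class(m=1.0, e=0.0, J=2.0).startswith("naked")
```
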

The cosmic censorship hypothesis comes in a number of forms. The original version due to Penrose is now called ‘weak cosmic censorship’. It asserts that in a spacetime whose geometry asymptotically approaches that of Minkowski spacetime, gravitational collapse cannot produce a naked singularity.

In 1991, Preskill and Thorne made a bet against Hawking in which they claimed that weak cosmic censorship was false. Hawking conceded this bet in 1997 when a counterexample was found. This features finely-tuned infalling matter poised right on the brink of forming a black hole. It almost creates a region from which light cannot escape—but not quite. Instead, it creates a naked singularity!

Given the delicate nature of this construction, Hawking did not give up. Instead he made a second bet, which says that weak cosmic censorship holds ‘generically’ — that is, for an open dense set of initial conditions.

In 1999, Christodoulou proved that for spherically symmetric solutions of Einstein’s equation coupled to a massless scalar field, weak cosmic censorship holds generically. While spherical symmetry is a very restrictive assumption, this result is a good example of how, with plenty of work, we can make progress in rigorously settling the questions raised by general relativity.

Indeed, Christodoulou has been a leader in this area. For example, the vacuum Einstein equations have solutions describing gravitational waves, much as the vacuum Maxwell equations have solutions describing electromagnetic waves. However, gravitational waves can actually form black holes when they collide. This raises the question of the stability of Minkowski spacetime. Must sufficiently small perturbations of the Minkowski metric go away in the form of gravitational radiation, or can tiny wrinkles in the fabric of spacetime somehow amplify themselves and cause trouble—perhaps even a singularity? In 1993, together with Klainerman, Christodoulou proved that Minkowski spacetime is indeed stable. Their proof fills a 514-page book.

In 2008, Christodoulou completed an even longer rigorous study of the formation of black holes. This can be seen as a vastly more detailed look at questions which Penrose’s original singularity theorem addressed in a general, preliminary way. Nonetheless, there is much left to be done to understand the behavior of singularities in general relativity.


In this series of posts, we’ve seen that in every major theory of physics, challenging mathematical questions arise from the assumption that spacetime is a continuum. The continuum threatens us with infinities! Do these infinities threaten our ability to extract predictions from these theories—or even our ability to formulate these theories in a precise way?

We can answer these questions, but only with hard work. Is this a sign that we are somehow on the wrong track? Is the continuum as we understand it only an approximation to some deeper model of spacetime? Only time will tell. Nature is providing us with plenty of clues, but it will take patience to read them correctly.

For more

To delve deeper into singularities and cosmic censorship, try this delightful book, which is free online:

• John Earman, Bangs, Crunches, Whimpers and Shrieks: Singularities and Acausalities in Relativistic Spacetimes, Oxford U. Press, Oxford, 1995.

To read this whole series of posts in one place, with lots more references and links, see:

• John Baez, Struggles with the continuum.

Struggles with the Continuum (Part 7)

23 September, 2016

Combining electromagnetism with relativity and quantum mechanics led to QED. Last time we saw the immense struggles with the continuum this caused. But combining gravity with relativity led Einstein to something equally remarkable: general relativity.

Gravitational lensing by a non-rotating black hole

In general relativity, infinities coming from the continuum nature of spacetime are deeply connected to its most dramatic successful predictions: black holes and the Big Bang. In this theory, the density of the Universe approaches infinity as we go back in time toward the Big Bang, and the density of a star approaches infinity as it collapses to form a black hole. Thus we might say that instead of struggling against infinities, general relativity accepts them and has learned to live with them.

General relativity does not take quantum mechanics into account, so the story is not yet over. Many physicists hope that quantum gravity will eventually save physics from its struggles with the continuum! Since quantum gravity is far from being understood, this remains just a hope. This hope has motivated a profusion of new ideas on spacetime: too many to survey here. Instead, I’ll focus on the humbler issue of how singularities arise in general relativity—and why they might not rob this theory of its predictive power.

General relativity says that spacetime is a 4-dimensional Lorentzian manifold. Thus, it can be covered by patches equipped with coordinates, so that in each patch we can describe points by lists of four numbers. Any curve \gamma(s) going through a point then has a tangent vector v whose components are v^\mu = d \gamma^\mu(s)/ds. Furthermore, given two tangent vectors v,w at the same point we can take their inner product

g(v,w) = g_{\mu \nu} v^\mu w^\nu

where as usual we sum over repeated indices, and g_{\mu \nu} is a 4 \times 4 matrix called the metric, depending smoothly on the point. We require that at any point we can find some coordinate system where this matrix takes the usual Minkowski form:

\displaystyle{  g = \left( \begin{array}{cccc} -1 & 0 &0 & 0 \\ 0 & 1 &0 & 0 \\ 0 & 0 &1 & 0 \\ 0 & 0 &0 & 1 \\ \end{array}\right). }

However, as soon as we move away from our chosen point, the form of the matrix g in these particular coordinates may change.
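At a point where the metric takes the Minkowski form, the inner product g(v,w) = g_{\mu\nu} v^\mu w^\nu is just a matrix sandwich. A quick sketch (my own vectors, chosen for illustration):

```python
import numpy as np

# The inner product g(v, w) = g_{mu nu} v^mu w^nu at a point where
# the metric takes the Minkowski form shown above.

g = np.diag([-1.0, 1.0, 1.0, 1.0])    # Minkowski form of the metric

def inner(v, w):
    return v @ g @ w                   # sum over repeated indices

v = np.array([1.0, 0.0, 0.0, 0.0])    # tangent to a particle at rest
w = np.array([1.0, 0.5, 0.0, 0.0])    # tangent to a slowly moving particle

assert inner(v, v) == -1.0             # timelike: g(v, v) < 0
assert inner(w, w) == -0.75            # also timelike
```
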

General relativity says how the metric is affected by matter. It does this in a single equation, Einstein’s equation, which relates the ‘curvature’ of the metric at any point to the flow of energy-momentum through that point. To define the curvature, we need some differential geometry. Indeed, Einstein had to learn this subject from his mathematician friend Marcel Grossmann in order to write down his equation. Here I will take some shortcuts and try to explain Einstein’s equation with a bare minimum of differential geometry. For how this approach connects to the full story, and a list of resources for further study of general relativity, see:

• John Baez and Emory Bunn, The meaning of Einstein’s equation.

Consider a small round ball of test particles that are initially all at rest relative to each other. This requires a bit of explanation. First, because spacetime is curved, it only looks like Minkowski spacetime—the world of special relativity—in the limit of very small regions. The usual concepts of ’round’ and ‘at rest relative to each other’ only make sense in this limit. Thus, all our forthcoming statements are precise only in this limit, which of course relies on the fact that spacetime is a continuum.

Second, a test particle is a classical point particle with so little mass that while it is affected by gravity, its effects on the geometry of spacetime are negligible. We assume our test particles are affected only by gravity, no other forces. In general relativity this means that they move along timelike geodesics. Roughly speaking, these are paths that go slower than light and bend as little as possible. We can make this precise without much work.

For a path in space to be a geodesic means that if we slightly vary any small portion of it, it can only become longer. However, a path \gamma(s) in spacetime traced out by a particle moving slower than light must be ‘timelike’, meaning that its tangent vector v = \gamma'(s) satisfies g(v,v) < 0. We define the proper time along such a path from s = s_0 to s = s_1 to be

\displaystyle{  \int_{s_0}^{s_1} \sqrt{-g(\gamma'(s),\gamma'(s))} \, ds }

This is the time ticked out by a clock moving along that path. A timelike path is a geodesic if the proper time can only decrease when we slightly vary any small portion of it. Particle physicists prefer the opposite sign convention for the metric, and then we do not need the minus sign under the square root. But the fact remains the same: timelike geodesics locally maximize the proper time.
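The proper-time integral is easy to evaluate numerically. Here is a sketch (my own) for the simplest case, a particle moving at constant speed in Minkowski spacetime, where the integral should reproduce the familiar time-dilation factor:

```python
import numpy as np

# Numerically integrate sqrt(-g(gamma', gamma')) ds along a timelike path
# in Minkowski spacetime, signature (-+++), with c = 1.

g = np.diag([-1.0, 1.0, 1.0, 1.0])

def proper_time(gamma_dot, s0, s1, n=10_000):
    """Proper time along a path with tangent vector gamma_dot(s)."""
    s = np.linspace(s0, s1, n)
    v = np.array([gamma_dot(si) for si in s])            # tangent vectors
    integrand = np.sqrt(-np.einsum('ia,ab,ib->i', v, g, v))
    ds = np.diff(s)
    return float(((integrand[:-1] + integrand[1:]) / 2 * ds).sum())

speed = 0.6                                              # 60% of light speed
path_dot = lambda s: np.array([1.0, speed, 0.0, 0.0])    # constant velocity

tau = proper_time(path_dot, 0.0, 1.0)
assert abs(tau - np.sqrt(1 - speed**2)) < 1e-9           # = 0.8: clocks run slow
```
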

Actual particles are not test particles! First, the concept of test particle does not take quantum theory into account. Second, all known particles are affected by forces other than gravity. Third, any actual particle affects the geometry of the spacetime it inhabits. Test particles are just a mathematical trick for studying the geometry of spacetime. Still, a sufficiently light particle that is affected very little by forces other than gravity can be approximated by a test particle. For example, an artificial satellite moving through the Solar System behaves like a test particle if we ignore the solar wind, the radiation pressure of the Sun, and so on.

If we start with a small round ball consisting of many test particles that are initially all at rest relative to each other, to first order in time it will not change shape or size. However, to second order in time it can expand or shrink, due to the curvature of spacetime. It may also be stretched or squashed, becoming an ellipsoid. This should not be too surprising, because any linear transformation applied to a ball gives an ellipsoid.

Let V(t) be the volume of the ball after a time t has elapsed, where time is measured by a clock attached to the particle at the center of the ball. Then in units where c = 8 \pi G = 1, Einstein’s equation says:

\displaystyle{  \left.{\ddot V\over V} \right|_{t = 0} = -{1\over 2} \left( \begin{array}{l} {\rm flow \; of \;} t{\rm -momentum \; in \; the \;\,} t {\rm \,\; direction \;} + \\ {\rm flow \; of \;} x{\rm -momentum \; in \; the \;\,} x {\rm \; direction \;} + \\ {\rm flow \; of \;} y{\rm -momentum \; in \; the \;\,} y {\rm \; direction \;} + \\ {\rm flow \; of \;} z{\rm -momentum \; in \; the \;\,} z {\rm \; direction} \end{array} \right) }

These flows are measured at the center of the ball at time zero, and the coordinates used here take advantage of the fact that to first order, at any one point, spacetime looks like Minkowski spacetime.

The flows in Einstein’s equation are the diagonal components of a 4 \times 4 matrix T called the ‘stress-energy tensor’. The components T_{\alpha \beta} of this matrix say how much momentum in the \alpha direction is flowing in the \beta direction through a given point of spacetime. Here \alpha and \beta range from 0 to 3, corresponding to the t,x,y and z coordinates.

For example, T_{00} is the flow of t-momentum in the t-direction. This is just the energy density, usually denoted \rho. The flow of x-momentum in the x-direction is the pressure in the x direction, denoted P_x, and similarly for y and z. You may be more familiar with direction-independent pressures, but it is easy to manufacture a situation where the pressure depends on the direction: just squeeze a book between your hands!

Thus, Einstein’s equation says

\displaystyle{ {\ddot V\over V} \Bigr|_{t = 0} = -{1\over 2} (\rho + P_x + P_y + P_z) }

It follows that positive energy density and positive pressure both curve spacetime in a way that makes a freely falling ball of point particles tend to shrink. Since E = mc^2 and we are working in units where c = 1, ordinary mass density counts as a form of energy density. Thus a massive object will make a swarm of freely falling particles at rest around it start to shrink. In short, gravity attracts.
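In code form the equation above is one line (a trivial sketch, my own naming, in the same units c = 8πG = 1), but it makes the sign logic explicit: positive density and positive pressures all push V̈ negative, so the ball shrinks.

```python
# Einstein's equation for a small ball of test particles, c = 8*pi*G = 1:
# V''/V at t = 0 in terms of the energy density and the three pressures.

def ball_acceleration(rho, Px, Py, Pz):
    """Relative second time derivative of the ball's volume at t = 0."""
    return -0.5 * (rho + Px + Py + Pz)

# Ordinary matter: positive density, negligible pressure. The ball shrinks,
# i.e. gravity attracts.
assert ball_acceleration(rho=1.0, Px=0.0, Py=0.0, Pz=0.0) == -0.5
```
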

Already from this, gravity seems dangerously inclined to create singularities. Suppose that instead of test particles we start with a stationary cloud of ‘dust’: a fluid of particles having nonzero energy density but no pressure, moving under the influence of gravity alone. The dust particles will still follow geodesics, but they will affect the geometry of spacetime. Their energy density will make the ball start to shrink. As it does, the energy density \rho will increase, so the ball will tend to shrink ever faster, approaching infinite density in a finite amount of time. This in turn makes the curvature of spacetime become infinite in a finite amount of time. The result is a ‘singularity’.

In reality, matter is affected by forces other than gravity. Repulsive forces may prevent gravitational collapse. However, this repulsion creates pressure, and Einstein’s equation says that pressure also creates gravitational attraction! In some circumstances this can overwhelm whatever repulsive forces are present. Then the matter collapses, leading to a singularity—at least according to general relativity.

When a star more than 8 times the mass of our Sun runs out of fuel, its core suddenly collapses. The surface is thrown off explosively in an event called a supernova. Most of the energy—the equivalent of thousands of Earth masses—is released in a ten-minute burst of neutrinos, formed as a byproduct when protons and electrons combine to form neutrons. If the star’s mass is below 20 times that of our Sun, its core crushes down to a large ball of neutrons with a crust of iron and other elements: a neutron star.

However, this ball is unstable if its mass exceeds the Tolman–Oppenheimer–Volkoff limit, somewhere between 1.5 and 3 times that of our Sun. Above this limit, gravity overwhelms the repulsive forces that hold up the neutron star. And indeed, no neutron stars heavier than 3 solar masses have been observed. Thus, for very heavy stars, the endpoint of collapse is not a neutron star, but something else: a black hole, an object that bends spacetime so much even light cannot escape.

If general relativity is correct, a black hole contains a singularity. Many physicists expect that general relativity breaks down inside a black hole, perhaps because of quantum effects that become important in strong gravitational fields. The singularity is considered a strong hint that this breakdown occurs. If so, the singularity may be a purely theoretical entity, not a real-world phenomenon. Nonetheless, everything we have observed about black holes matches what general relativity predicts. Thus, unlike all the other theories we have discussed, general relativity predicts infinities that are connected to striking phenomena that are actually observed.

The Tolman–Oppenheimer–Volkoff limit is not precisely known, because it depends on properties of nuclear matter that are not well understood. However, there are theorems that say singularities must occur in general relativity under certain conditions.

One of the first was proved by Raychaudhuri and Komar in the mid-1950s. It applies only to ‘dust’, and indeed it is a precise version of our verbal argument above. It introduced Raychaudhuri’s equation, the geometrical way of thinking about how spacetime curvature affects the motion of a small ball of test particles. It shows that under suitable conditions, the energy density must approach infinity in a finite amount of time along the path traced out by a dust particle.

The first required condition is that the flow of dust be initially converging, not expanding. The second condition, not mentioned in our verbal argument, is that the dust be ‘irrotational’, not swirling around. The third condition is that the dust particles be affected only by gravity, so that they move along geodesics. Due to the last two conditions, the Raychaudhuri–Komar theorem does not apply to collapsing stars.

The more modern singularity theorems eliminate these conditions. But they do so at a price: they require a more subtle concept of singularity! There are various possible ways to define this concept. They’re all a bit tricky, because a singularity is not a point or region in spacetime.

For our present purposes, we can define a singularity to be an ‘incomplete timelike or null geodesic’. As already explained, a timelike geodesic is the kind of path traced out by a test particle moving slower than light. Similarly, a null geodesic is the kind of path traced out by a test particle moving at the speed of light. We say a geodesic is ‘incomplete’ if it ceases to be well-defined after a finite amount of time. For example, general relativity says a test particle falling into a black hole follows an incomplete geodesic. In a rough-and-ready way, people say the particle ‘hits the singularity’. But the singularity is not a place in spacetime. What we really mean is that the particle’s path becomes undefined after a finite amount of time.

We need to be a bit careful about what we mean by ‘time’ here. For test particles moving slower than light this is easy, since we can parametrize a timelike geodesic by proper time. However, the tangent vector v = \gamma'(s) of a null geodesic has g(v,v) = 0, so a particle moving along a null geodesic does not experience any passage of proper time. Still, any geodesic, even a null one, has a family of preferred parametrizations. These differ only by changes of variable like this: s \mapsto as + b. By ‘time’ we really mean the variable s in any of these preferred parametrizations. Thus, if our spacetime is some Lorentzian manifold M, we say a geodesic \gamma \colon [s_0, s_1] \to M is incomplete if, parametrized in one of these preferred ways, it cannot be extended to a strictly longer interval.
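
To make ‘ceases to be well-defined after a finite amount of time’ concrete, consider the textbook example of a particle falling radially from rest at radius r_0 in the Schwarzschild geometry. In units where c and the Schwarzschild radius r_s are 1, the radial equation is (dr/d\tau)^2 = 1/r - 1/r_0, and the proper time to reach r = 0 is the finite number \tau = (\pi/2) r_0^{3/2}. A numerical sketch — the formula is the standard Schwarzschild result, not something derived in this post:

```python
import math

def proper_time_to_singularity(r0=1.0, n=200_000):
    """Proper time for radial free fall from rest at r = r0 down to r = 0
    in Schwarzschild spacetime, in units c = r_s = 1:
        tau = integral_0^r0 dr / sqrt(1/r - 1/r0),
    computed by the midpoint rule (both endpoint singularities are integrable)."""
    h = r0 / n
    return sum(h / math.sqrt(1.0 / ((i + 0.5) * h) - 1.0 / r0) for i in range(n))

tau = proper_time_to_singularity()
print(tau)   # finite, close to pi/2: the geodesic ends after finite proper time
```

The geodesic simply cannot be extended past this finite value of \tau — that is what geodesic incompleteness means in practice.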

The first modern singularity theorem was proved by Penrose in 1965. It says that if space is infinite in extent, and light becomes trapped inside some bounded region, and no exotic matter is present to save the day, either a singularity or something even more bizarre must occur. This theorem applies to collapsing stars. When a star of sufficient mass collapses, general relativity says that its gravity becomes so strong that light becomes trapped inside some bounded region. We can then use Penrose’s theorem to analyze the possibilities.

Shortly thereafter Hawking proved a second singularity theorem, which applies to the Big Bang. It says that if space is finite in extent, and no exotic matter is present, generically either a singularity or something even more bizarre must occur. The singularity here could be either a Big Bang in the past, a Big Crunch in the future, both—or possibly something else. Hawking also proved a version of his theorem that applies to certain Lorentzian manifolds where space is infinite in extent, as seems to be the case in our Universe. This version requires extra conditions.

There are some undefined phrases in this summary of the Penrose–Hawking singularity theorems, most notably these:

• ‘exotic matter’

• ‘singularity’

• ‘something even more bizarre’.

So, let me say a bit about each.

These singularity theorems precisely specify what is meant by ‘exotic matter’. This is matter for which

\rho + P_x + P_y + P_z < 0

at some point, in some coordinate system. By Einstein’s equation, this would make a small ball of freely falling test particles tend to expand. In other words, exotic matter would create a repulsive gravitational field. No matter of this sort has ever been found; the matter we know obeys the so-called ‘strong energy condition’

\rho + P_x + P_y + P_z \ge 0
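
For a perfect fluid with isotropic pressure, P_x = P_y = P_z = P, this condition reads \rho + 3P \ge 0. Ordinary dust (P = 0) and radiation (P = \rho/3) both satisfy it. A trivial check — the function name here is mine, not standard:

```python
def satisfies_condition(rho, px, py, pz):
    """Check rho + P_x + P_y + P_z >= 0 for a diagonal stress-energy tensor."""
    return rho + px + py + pz >= 0

rho = 1.0
print(satisfies_condition(rho, 0, 0, 0))               # dust, P = 0: True
print(satisfies_condition(rho, rho/3, rho/3, rho/3))   # radiation, P = rho/3: True
print(satisfies_condition(rho, -rho, -rho, -rho))      # exotic matter: False
```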

The Penrose–Hawking singularity theorems also say what counts as ‘something even more bizarre’. An example would be a closed timelike curve. A particle following such a path would move slower than light yet eventually reach the same point where it started—and not just the same point in space, but the same point in spacetime! If you could do this, perhaps you could wait, see if it would rain tomorrow, and then go back and decide whether to buy an umbrella today. There are certainly solutions of Einstein’s equation with closed timelike curves. The first interesting one was found by Einstein’s friend Gödel in 1949, as part of an attempt to probe the nature of time. However, closed timelike curves are generally considered less plausible than singularities.

In the Penrose–Hawking singularity theorems, ‘something even more bizarre’ means that spacetime is not ‘globally hyperbolic’. To understand this, we need to think about when we can predict the future or past given initial data. When studying field equations like Maxwell’s theory of electromagnetism or Einstein’s theory of gravity, physicists like to specify initial data on space at a given moment of time. However, in general relativity there is considerable freedom in how we choose a slice of spacetime and call it ‘space’. What should we require? For starters, we want a 3-dimensional submanifold S of spacetime that is ‘spacelike’: every vector v tangent to S should have g(v,v) > 0. However, we also want every inextendible timelike or null curve to hit S exactly once. A spacelike surface with this property is called a Cauchy surface, and a Lorentzian manifold containing a Cauchy surface is said to be globally hyperbolic. There are many theorems justifying the importance of this concept. Global hyperbolicity excludes closed timelike curves, but also other bizarre behavior.

By now the original singularity theorems have been greatly generalized and clarified. Hawking and Penrose gave a unified treatment of both theorems in 1970. The 1973 textbook by Hawking and Ellis gives a systematic introduction to this subject. Hawking gave an elegant informal overview of the key ideas in 1994, and a paper by Garfinkle and Senovilla reviews the subject and its history up to 2015.

If we accept that general relativity really predicts the existence of singularities in physically realistic situations, the next step is to ask whether they rob general relativity of its predictive power. I’ll talk about that next time!

Struggles with the Continuum (Part 6)

21 September, 2016

Last time I sketched how physicists use quantum electrodynamics, or ‘QED’, to compute answers to physics problems as power series in the fine structure constant, which is

\displaystyle{ \alpha = \frac{1}{4 \pi \epsilon_0} \frac{e^2}{\hbar c} \approx \frac{1}{137.036} }
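
It is easy to check this number directly; the numerical values below are standard CODATA values in SI units, not taken from this post:

```python
import math

# CODATA values in SI units
e        = 1.602176634e-19     # elementary charge, C (exact)
hbar     = 1.054571817e-34     # reduced Planck constant, J s
c        = 299792458.0         # speed of light, m/s (exact)
epsilon0 = 8.8541878128e-12    # vacuum permittivity, F/m

alpha = e**2 / (4 * math.pi * epsilon0 * hbar * c)
print(1 / alpha)   # ~137.036
```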

I concluded with a famous example: the magnetic moment of the electron. With a truly heroic computation, physicists have used QED to compute this quantity up to order \alpha^5. If we also take other Standard Model effects into account we get agreement to roughly one part in 10^{12}.

However, if we continue adding up terms in this power series, there is no guarantee that the answer converges. Indeed, in 1952 Freeman Dyson gave a heuristic argument that makes physicists expect that the series diverges, along with most other power series in QED!

The argument goes as follows. If these power series converged for small positive \alpha, they would have a nonzero radius of convergence, so they would also converge for small negative \alpha. Thus, QED would make sense for small negative values of \alpha, which correspond to imaginary values of the electron’s charge. If the electron had an imaginary charge, electrons would attract each other electrostatically, since the usual repulsive force between them is proportional to e^2. Thus, if the power series converged, we would have a theory like QED for electrons that attract rather than repel each other.

However, there is a good reason to believe that QED cannot make sense for electrons that attract. The reason is that such a theory would describe a world where the vacuum is unstable. That is, there would be states with arbitrarily large negative energy containing many electrons and positrons. Thus, we expect that the vacuum could spontaneously turn into electrons and positrons together with photons (to conserve energy). Of course, this is not a rigorous proof that the power series in QED diverge: just an argument that it would be strange if they did not.

To see why electrons that attract could have arbitrarily large negative energy, consider a state \psi with a large number N of such electrons inside a ball of radius R. We require that these electrons have small momenta, so that nonrelativistic quantum mechanics gives a good approximation to the situation. Since their momenta are small, the kinetic energy of each electron is a small fraction of its rest energy m_e c^2. If we let \langle \psi, E \psi\rangle be the expected value of the total rest energy and kinetic energy of all the electrons, it follows that \langle \psi, E\psi \rangle is approximately proportional to N.

The Pauli exclusion principle puts a limit on how many electrons with momentum below some bound can fit inside a ball of radius R. This number is asymptotically proportional to the volume of the ball. Thus, we can assume N is approximately proportional to R^3. It follows that \langle \psi, E \psi \rangle is approximately proportional to R^3.

There is also the negative potential energy to consider. Let V be the operator for potential energy. Since we have N electrons attracted by a 1/r potential, and each pair contributes to the potential energy, we see that \langle \psi , V \psi \rangle is approximately proportional to -N^2 R^{-1}, or -R^5. Since R^5 grows faster than R^3, we can make the expected energy \langle \psi, (E + V) \psi \rangle arbitrarily large and negative as N,R \to \infty.
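
The whole argument boils down to a race between two powers of R: the positive part of the energy grows like R^3 while the potential energy drops like -R^5, so the total is unbounded below. A toy illustration — the coefficients are arbitrary, and any positive values give the same conclusion:

```python
# E ~ a R^3 (rest + kinetic energy of N ~ R^3 electrons),
# V ~ -b R^5 (N^2 / R pairwise attraction); a, b > 0 are arbitrary constants.
a, b = 1.0, 0.01

def expected_energy(R):
    return a * R**3 - b * R**5

print([expected_energy(R) for R in (10, 50, 100, 500)])
# the total energy decreases without bound as R grows
```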

Note the interesting contrast between this result and some previous ones we have seen. In Newtonian mechanics, the energy of particles attracting each other with a 1/r potential is unbounded below. In quantum mechanics, thanks to the uncertainty principle, the energy is bounded below for any fixed number of particles. However, quantum field theory allows for the creation of particles, and this changes everything! Dyson’s disaster arises because the vacuum can turn into a state with arbitrarily large numbers of electrons and positrons. This disaster only occurs in an imaginary world where \alpha is negative—but it may be enough to prevent the power series in QED from having a nonzero radius of convergence.

We are left with a puzzle: how can perturbative QED work so well in practice, if the power series in QED diverge?

Much is known about this puzzle. There is an extensive theory of ‘Borel summation’, which allows one to extract well-defined answers from certain divergent power series. For example, consider a particle of mass m on a line in a potential

V(x) = x^2 + \beta x^4

When \beta \ge 0 this potential is bounded below, but when \beta < 0 it is not: classically, it describes a particle that can shoot to infinity in a finite time. Let H = K + V be the quantum Hamiltonian for this particle, where K is the usual operator for the kinetic energy and V is the operator for potential energy. When \beta \ge 0, the Hamiltonian H is essentially self-adjoint on the set of smooth wavefunctions that vanish outside a bounded interval. This means that the theory makes sense. Moreover, in this case H has a ‘ground state’: a state \psi whose expected energy \langle \psi, H \psi \rangle is as low as possible. Call this expected energy E(\beta). One can show that E(\beta) depends smoothly on \beta for \beta \ge 0, and one can write down a Taylor series for E(\beta).

On the other hand, when \beta < 0 the Hamiltonian H is not essentially self-adjoint. This means that the quantum mechanics of a particle in this potential is ill-behaved when \beta < 0. Heuristically speaking, the problem is that such a particle could tunnel through the barrier given by the local maxima of V(x) and shoot off to infinity in a finite time.

This situation is similar to Dyson’s disaster, since we have a theory that is well-behaved for \beta \ge 0 and ill-behaved for \beta < 0. As before, the bad behavior seems to arise from our ability to convert an infinite amount of potential energy into other forms of energy. However, in this simpler situation one can prove that the Taylor series for E(\beta) does not converge. Barry Simon did this around 1969. Moreover, one can prove that Borel summation, applied to this Taylor series, gives the correct value of E(\beta) for \beta \ge 0. The same is known to be true for certain quantum field theories. Analyzing these examples, one can see why summing the first few terms of a power series can give a good approximation to the correct answer even though the series diverges. The terms in the series get smaller and smaller for a while, but eventually they become huge.
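
A classic toy example displays all of this at once. The series \sum_k (-1)^k k! \, x^k diverges for every x \ne 0, yet its Borel sum is the convergent integral \int_0^\infty e^{-t}/(1 + x t) \, dt, and for small x the partial sums first creep toward this value before blowing up. A sketch — this is Euler’s standard example, not one of the QED series:

```python
import math

def partial_sum(x, n):
    """First n terms of the divergent series sum_k (-1)**k * k! * x**k."""
    return sum((-1)**k * math.factorial(k) * x**k for k in range(n))

def borel_sum(x, n=200_000, t_max=40.0):
    """Borel sum of the same series: integral_0^inf exp(-t)/(1 + x*t) dt,
    approximated by the midpoint rule on [0, t_max]."""
    h = t_max / n
    return sum(h * math.exp(-(i + 0.5) * h) / (1.0 + x * (i + 0.5) * h)
               for i in range(n))

x = 0.1
print(borel_sum(x))                                 # ~0.91563
print([round(partial_sum(x, n), 5) for n in (2, 5, 10, 25)])
# the partial sums approach the Borel sum for a while, then diverge wildly
```

Around n = 10 the partial sums agree with the Borel sum to a few parts in 10^4; by n = 25 the factorials have taken over and the sums are useless.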

Unfortunately, nobody has been able to carry out this kind of analysis for quantum electrodynamics. In fact, the current conventional wisdom is that this theory is inconsistent, due to problems at very short distance scales. In our discussion so far, we summed over Feynman diagrams with \le n vertices to get the first n terms of power series for answers to physical questions. However, one can also sum over all diagrams with \le n loops. This more sophisticated approach to renormalization, which sums over infinitely many diagrams, may dig a bit deeper into the problems faced by quantum field theories.

If we use this alternate approach for QED we find something surprising. Recall that in renormalization we impose a momentum cutoff \Lambda, essentially ignoring waves of wavelength less than \hbar/\Lambda, and use this to work out a relation between the electron’s bare charge e_\mathrm{bare}(\Lambda) and its renormalized charge e_\mathrm{ren}. We try to choose e_\mathrm{bare}(\Lambda) so that e_\mathrm{ren} equals the electron’s experimentally observed charge e. If we sum over Feynman diagrams with \le n vertices this is always possible. But if we sum over Feynman diagrams with at most one loop, it ceases to be possible when \Lambda reaches a certain very large value, namely

\displaystyle{  \Lambda \; = \; \exp\left(\frac{3 \pi}{2 \alpha} + \frac{5}{6}\right) m_e c \; \approx \; e^{647} m_e c}
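
This exponent can be recovered from the standard one-loop running of the fine structure constant: d\alpha/d\ln\Lambda = 2\alpha^2/3\pi, so 1/\alpha(\Lambda) decreases linearly in \ln\Lambda and hits zero when \ln(\Lambda/m_e c) = 3\pi/2\alpha \approx 646; the extra 5/6 is a finite constant from the same one-loop calculation. A quick check:

```python
import math

alpha = 1 / 137.036   # fine structure constant

# One-loop running: 1/alpha(Lambda) = 1/alpha - (2/(3*pi)) * ln(Lambda/(m_e c)),
# so the inverse coupling vanishes -- the Landau pole -- at:
log_pole = (3 * math.pi / 2) / alpha
print(log_pole)                # ~645.8

exponent = log_pole + 5 / 6    # the exponent quoted in the text
print(exponent)                # ~646.6, i.e. Lambda ~ e^647 m_e c
```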

According to this one-loop calculation, the electron’s bare charge becomes infinite at this point! This value of \Lambda is known as a ‘Landau pole’, since it was first noticed in about 1954 by Lev Landau and his colleagues.

What is the meaning of the Landau pole? We said that poetically speaking, the bare charge of the electron is the charge we would see if we could strip off the electron’s virtual particle cloud. A somewhat more precise statement is that e_\mathrm{bare}(\Lambda) is the charge we would see if we collided two electrons head-on with a momentum on the order of \Lambda. In this collision, there is a good chance that the electrons would come within a distance of \hbar/\Lambda from each other. The larger \Lambda is, the smaller this distance is, and the more we penetrate past the effects of the virtual particle cloud, whose polarization ‘shields’ the electron’s charge. Thus, the larger \Lambda is, the larger e_\mathrm{bare}(\Lambda) becomes.

So far, all this makes good sense: physicists have done experiments to actually measure this effect. The problem is that according to a one-loop calculation, e_\mathrm{bare}(\Lambda) becomes infinite when \Lambda reaches a certain huge value.

Of course, summing only over diagrams with at most one loop is not definitive. Physicists have repeated the calculation summing over diagrams with \le 2 loops, and again found a Landau pole. But again, this is not definitive. Nobody knows what will happen as we consider diagrams with more and more loops. Moreover, the distance \hbar/\Lambda corresponding to the Landau pole is absurdly small! For the one-loop calculation quoted above, this distance is about

\displaystyle{  e^{-647} \frac{\hbar}{m_e c} \; \approx \; 6 \cdot 10^{-294}\, \mathrm{meters} }
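
Numbers like e^{-647} underflow ordinary floating point, but the distance is easy to verify in log space, using \hbar/m_e c \approx 3.86 \times 10^{-13} meters — the electron’s reduced Compton wavelength, a standard value not quoted in the post:

```python
import math

alpha = 7.2973525693e-3       # fine structure constant
lam_C = 3.8615926796e-13      # hbar / (m_e c), reduced Compton wavelength, meters

# distance = exp(-(3*pi/(2*alpha) + 5/6)) * hbar/(m_e c); work in base-10 logs
exponent = 3 * math.pi / (2 * alpha) + 5 / 6
log10_distance = math.log10(lam_C) - exponent / math.log(10)
print(log10_distance)   # ~ -293.2, i.e. roughly 6e-294 meters
```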

This is hundreds of orders of magnitude smaller than the length scales physicists have explored so far. Currently the Large Hadron Collider can probe energies up to about 10 TeV, and thus distances down to about 2 \cdot 10^{-20} meters, or about 0.00002 times the radius of a proton. Quantum field theory seems to be holding up very well so far, but no reasonable physicist would be willing to extrapolate this success down to 6 \cdot 10^{-294} meters, and few seem upset at problems that manifest themselves only at such a short distance scale.

Indeed, attitudes on renormalization have changed significantly since 1948, when Feynman, Schwinger and Tomonaga developed it for QED. At first it seemed a bit like a trick. Later, as the success of renormalization became ever more thoroughly confirmed, it became accepted. However, some of the most thoughtful physicists remained worried. In 1975, Dirac said:

Most physicists are very satisfied with the situation. They say: ‘Quantum electrodynamics is a good theory and we do not have to worry about it any more.’ I must say that I am very dissatisfied with the situation, because this so-called ‘good theory’ does involve neglecting infinities which appear in its equations, neglecting them in an arbitrary way. This is just not sensible mathematics. Sensible mathematics involves neglecting a quantity when it is small—not neglecting it just because it is infinitely great and you do not want it!

As late as 1985, Feynman wrote:

The shell game that we play [. . .] is technically called ‘renormalization’. But no matter how clever the word, it is still what I would call a dippy process! Having to resort to such hocus-pocus has prevented us from proving that the theory of quantum electrodynamics is mathematically self-consistent. It’s surprising that the theory still hasn’t been proved self-consistent one way or the other by now; I suspect that renormalization is not mathematically legitimate.

By now renormalization is thoroughly accepted among physicists. The key move was a change of attitude emphasized by Kenneth Wilson in the 1970s. Instead of treating quantum field theory as the correct description of physics at arbitrarily large energy-momenta, we can assume it is only an approximation. For renormalizable theories, one can argue that even if quantum field theory is inaccurate at large energy-momenta, the corrections become negligible at smaller, experimentally accessible energy-momenta. If so, instead of seeking to take the \Lambda \to \infty limit, we can use renormalization to relate bare quantities at some large but finite value of \Lambda to experimentally observed quantities.

From this practical-minded viewpoint, the possibility of a Landau pole in QED is less important than the behavior of the Standard Model. Physicists believe that the Standard Model would suffer from a Landau pole at momenta low enough to cause serious problems if the Higgs boson were considerably more massive than it actually is. Thus, they were relieved when the Higgs was discovered at the Large Hadron Collider with a mass of about 125 GeV/c^2. However, the Standard Model may still suffer from a Landau pole at high momenta, as well as an instability of the vacuum.

Regardless of practicalities, for the mathematical physicist the question of whether QED and the Standard Model can be made into well-defined mathematical structures that obey the axioms of quantum field theory remains an open problem of great interest. Most physicists believe that this can be done for pure Yang–Mills theory, but actually proving this is the first step towards winning $1,000,000 from the Clay Mathematics Institute.