Roger Penrose’s Nobel Prize

8 October, 2020

Roger Penrose just won the Nobel Prize in Physics “for the discovery that black hole formation is a robust prediction of the general theory of relativity.” He shared it with Reinhard Genzel and Andrea Ghez, who won it “for the discovery of a supermassive compact object at the centre of our galaxy.”

This is great news! It’s a pity that Stephen Hawking is no longer alive, because if he were he would surely have shared in this prize. Hawking’s most impressive piece of work—his prediction of black hole evaporation—was too far from being experimentally confirmed to win a Nobel prize before his death. It still is today. The Nobel Prize is conservative in this way: it doesn’t go to theoretical developments that haven’t been experimentally confirmed. That makes a lot of sense. But sometimes the committee goes overboard: Einstein never won a Nobel for general relativity or even special relativity. I consider that a scandal!

I’m glad that the Penrose–Hawking singularity theorems are considered Nobel-worthy. Let me just say a little about what Penrose and Hawking proved.

The most dramatic successful predictions of general relativity are black holes and the Big Bang. According to general relativity, as you follow a particle back in time toward the Big Bang or forward in time as it falls into a black hole, spacetime becomes more and more curved… and eventually it stops! This is roughly what we mean by a singularity. Penrose and Hawking made this idea mathematically precise, and proved that under reasonable assumptions singularities are inevitable in general relativity.

General relativity does not take quantum mechanics into account, so while Penrose and Hawking’s results are settled theorems, their applicability to our universe is not a settled fact. Many physicists hope that a theory of quantum gravity will save physics from singularities! Indeed this is one of the reasons physicists are fascinated by quantum gravity. But we know very little for sure about quantum gravity. So, it makes a lot of sense to work with general relativity as a mathematically precise theory and see what it says. That is what Hawking and Penrose did in their singularity theorems.

Let’s start with a quick introduction to general relativity, and then get an idea of why this theory predicts singularities are inevitable in certain situations.

General relativity says that spacetime is a 4-dimensional Lorentzian manifold. Thus, it can be covered by patches equipped with coordinates, so that in each patch we can describe points by lists of four numbers. Any curve \gamma(s) going through a point then has a tangent vector v whose components are v^\mu = d \gamma^\mu(s)/ds. Furthermore, given two tangent vectors v,w at the same point we can take their inner product

g(v,w) = g_{\mu \nu} v^\mu w^\nu

where as usual we sum over repeated indices, and g_{\mu \nu} is a 4 \times 4 matrix called the metric, depending smoothly on the point. We require that at any point we can find some coordinate system where this matrix takes the usual Minkowski form:

\displaystyle{  g = \left( \begin{array}{cccc} -1 & 0 &0 & 0 \\ 0 & 1 &0 & 0 \\ 0 & 0 &1 & 0 \\ 0 & 0 &0 & 1 \\ \end{array}\right) }

However, as soon as we move away from our chosen point, the form of the matrix g in these particular coordinates may change.
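
In coordinates where the metric takes this standard form at our chosen point, computing inner products is straightforward. Here is a minimal numerical sketch (the vectors are made up for illustration):

```python
import numpy as np

# Minkowski metric, in coordinates where g takes its standard form at our point
g = np.diag([-1.0, 1.0, 1.0, 1.0])

def inner(v, w):
    """The inner product g(v,w) = g_{mu nu} v^mu w^nu."""
    return v @ g @ w

v = np.array([1.0, 0.1, 0.0, 0.0])   # a hypothetical tangent vector
print(inner(v, v))                   # -0.99: negative, so v is timelike
```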

General relativity says how the metric is affected by matter. It does this in a single equation, Einstein’s equation, which relates the ‘curvature’ of the metric at any point to the flow of energy-momentum through that point. To define the curvature, we need some differential geometry. Indeed, Einstein had to learn this subject from his mathematician friend Marcel Grossmann in order to write down his equation. Here I will take some shortcuts and try to explain Einstein’s equation with a bare minimum of differential geometry.

Consider a small round ball of test particles that are initially all at rest relative to each other. This requires a bit of explanation. First, because spacetime is curved, it only looks like Minkowski spacetime—the world of special relativity—in the limit of very small regions. The usual concepts of ‘round’ and ‘at rest relative to each other’ only make sense in this limit. Thus, all our forthcoming statements are precise only in this limit, which of course relies on the fact that spacetime is a continuum.

Second, a test particle is a classical point particle with so little mass that while it is affected by gravity, its effects on the geometry of spacetime are negligible. We assume our test particles are affected only by gravity, no other forces. In general relativity this means that they move along timelike geodesics. Roughly speaking, these are paths that go slower than light and bend as little as possible. We can make this precise without much work.

For a path in space to be a geodesic means that if we slightly vary any small portion of it, it can only become longer. However, a path \gamma(s) in spacetime traced out by a particle moving slower than light must be ‘timelike’, meaning that its tangent vector v = \gamma'(s) satisfies g(v,v) < 0. We define the proper time along such a path from s = s_0 to s = s_1 to be

\displaystyle{  \int_{s_0}^{s_1} \sqrt{-g(\gamma'(s),\gamma'(s))} \, ds }

This is the time ticked out by a clock moving along that path. A timelike path is a geodesic if the proper time can only decrease when we slightly vary any small portion of it. Particle physicists prefer the opposite sign convention for the metric, and then we do not need the minus sign under the square root. But the fact remains the same: timelike geodesics locally maximize the proper time.
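
Here is a quick numerical check of this formula (a sketch of my own, in units where c = 1): a hypothetical particle moving in a straight line at 3/5 the speed of light through Minkowski spacetime. The answer illustrates time dilation.

```python
import numpy as np

g = np.diag([-1.0, 1.0, 1.0, 1.0])   # Minkowski metric

def proper_time(gamma, s0, s1, steps=10_000):
    """Numerically integrate sqrt(-g(gamma'(s), gamma'(s))) ds."""
    s, ds = np.linspace(s0, s1, steps, retstep=True)
    total = 0.0
    for si in s[:-1]:
        v = (gamma(si + ds) - gamma(si)) / ds   # finite-difference tangent vector
        total += np.sqrt(-(v @ g @ v)) * ds
    return total

gamma = lambda s: np.array([s, 0.6 * s, 0.0, 0.0])   # constant velocity 0.6
print(proper_time(gamma, 0.0, 1.0))   # ~0.8 = sqrt(1 - 0.6**2)
```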

Actual particles are not test particles! First, the concept of test particle does not take quantum theory into account. Second, all known particles are affected by forces other than gravity. Third, any actual particle affects the geometry of the spacetime it inhabits. Test particles are just a mathematical trick for studying the geometry of spacetime. Still, a sufficiently light particle that is affected very little by forces other than gravity can be approximated by a test particle. For example, an artificial satellite moving through the Solar System behaves like a test particle if we ignore the solar wind, the radiation pressure of the Sun, and so on.

If we start with a small round ball consisting of many test particles that are initially all at rest relative to each other, to first order in time it will not change shape or size. However, to second order in time it can expand or shrink, due to the curvature of spacetime. It may also be stretched or squashed, becoming an ellipsoid. This should not be too surprising, because any linear transformation applied to a ball gives an ellipsoid.

Let V(t) be the volume of the ball after a time t has elapsed, where time is measured by a clock attached to the particle at the center of the ball. Then in units where c = 8 \pi G = 1, Einstein’s equation says:

\displaystyle{  \left.{\ddot V\over V} \right|_{t = 0} = -{1\over 2} \left( \begin{array}{l} {\rm flow \; of \;} t{\rm -momentum \; in \; the \;\,} t {\rm \,\; direction \;} + \\ {\rm flow \; of \;} x{\rm -momentum \; in \; the \;\,} x {\rm \; direction \;} + \\ {\rm flow \; of \;} y{\rm -momentum \; in \; the \;\,} y {\rm \; direction \;} + \\ {\rm flow \; of \;} z{\rm -momentum \; in \; the \;\,} z {\rm \; direction} \end{array} \right) }

These flows are measured at the center of the ball at time zero, and the coordinates used here take advantage of the fact that to first order, at any one point, spacetime looks like Minkowski spacetime.

The flows in Einstein’s equation are the diagonal components of a 4 \times 4 matrix T called the ‘stress-energy tensor’. The components T_{\alpha \beta} of this matrix say how much momentum in the \alpha direction is flowing in the \beta direction through a given point of spacetime. Here \alpha and \beta range from 0 to 3, corresponding to the t,x,y and z coordinates.

For example, T_{00} is the flow of t-momentum in the t-direction. This is just the energy density, usually denoted \rho. The flow of x-momentum in the x-direction is the pressure in the x direction, denoted P_x, and similarly for y and z. You may be more familiar with direction-independent pressures, but it is easy to manufacture a situation where the pressure depends on the direction: just squeeze a book between your hands!

Thus, Einstein’s equation says

\displaystyle{ {\ddot V\over V} \Bigr|_{t = 0} = -{1\over 2} (\rho + P_x + P_y + P_z) }

It follows that positive energy density and positive pressure both curve spacetime in a way that makes a freely falling ball of point particles tend to shrink. Since E = mc^2 and we are working in units where c = 1, ordinary mass density counts as a form of energy density. Thus a massive object will make a swarm of freely falling particles at rest around it start to shrink. In short, gravity attracts.

Already from this, gravity seems dangerously inclined to create singularities. Suppose that instead of test particles we start with a stationary cloud of ‘dust’: a fluid of particles having nonzero energy density but no pressure, moving under the influence of gravity alone. The dust particles will still follow geodesics, but they will affect the geometry of spacetime. Their energy density will make the ball start to shrink. As it does, the energy density \rho will increase, so the ball will tend to shrink ever faster, approaching infinite density in a finite amount of time. This in turn makes the curvature of spacetime become infinite in a finite amount of time. The result is a ‘singularity’.
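
We can make this back-of-the-envelope argument quantitative if we cheat and pretend the equation above holds at all times, not just at the instant when the particles are at rest relative to each other. (The honest version is the Raychaudhuri equation, which has extra terms.) For dust the pressures vanish, and conservation of mass gives \rho(t) V(t) = \rho_0 V_0, so

\displaystyle{ \ddot V = -\frac{1}{2} \rho V = -\frac{1}{2} \rho_0 V_0 }

If the ball starts at rest, so \dot{V}(0) = 0, then

\displaystyle{ V(t) = V_0 \left( 1 - \frac{\rho_0 t^2}{4} \right) }

and the volume reaches zero at the finite time t = 2/\sqrt{\rho_0}.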

In reality, matter is affected by forces other than gravity. Repulsive forces may prevent gravitational collapse. However, this repulsion creates pressure, and Einstein’s equation says that pressure also creates gravitational attraction! In some circumstances this can overwhelm whatever repulsive forces are present. Then the matter collapses, leading to a singularity—at least according to general relativity.

When a star more than 8 times the mass of our Sun runs out of fuel, its core suddenly collapses. The surface is thrown off explosively in an event called a supernova. Most of the energy—the equivalent of thousands of Earth masses—is released in a ten-second burst of neutrinos, formed as a byproduct when protons and electrons combine to form neutrons. If the star’s mass is below 20 times that of our Sun, its core crushes down to a large ball of neutrons with a crust of iron and other elements: a neutron star.

However, this ball is unstable if its mass exceeds the Tolman–Oppenheimer–Volkoff limit, somewhere between 1.5 and 3 times that of our Sun. Above this limit, gravity overwhelms the repulsive forces that hold up the neutron star. And indeed, no neutron stars heavier than 3 solar masses have been observed. Thus, for very heavy stars, the endpoint of collapse is not a neutron star, but something else: a black hole, an object that bends spacetime so much even light cannot escape.

If general relativity is correct, a black hole contains a singularity. Many physicists expect that general relativity breaks down inside a black hole, perhaps because of quantum effects that become important at strong gravitational fields. The singularity is considered a strong hint that this breakdown occurs. If so, the singularity may be a purely theoretical entity, not a real-world phenomenon. Nonetheless, everything we have observed about black holes matches what general relativity predicts.

The Tolman–Oppenheimer–Volkoff limit is not precisely known, because it depends on properties of nuclear matter that are not well understood. However, there are theorems that say singularities must occur in general relativity under certain conditions.

One of the first was proved by Raychaudhuri and Komar in the mid-1950’s. It applies only to ‘dust’, and indeed it is a precise version of our verbal argument above. It introduced Raychaudhuri’s equation, a precise geometrical way of describing how the curvature of spacetime affects the motion of a small ball of test particles. It shows that under suitable conditions, the energy density must approach infinity in a finite amount of time along the path traced out by a dust particle.

The first required condition is that the flow of dust be initially converging, not expanding. The second condition, not mentioned in our verbal argument, is that the dust be ‘irrotational’, not swirling around. The third condition is that the dust particles be affected only by gravity, so that they move along geodesics. Due to the last two conditions, the Raychaudhuri–Komar theorem does not apply to collapsing stars.

The more modern singularity theorems eliminate these conditions. But they do so at a price: they require a more subtle concept of singularity! There are various possible ways to define this concept. They’re all a bit tricky, because a singularity is not a point or region in spacetime.

For our present purposes, we can define a singularity to be an ‘incomplete timelike or null geodesic’. As already explained, a timelike geodesic is the kind of path traced out by a test particle moving slower than light. Similarly, a null geodesic is the kind of path traced out by a test particle moving at the speed of light. We say a geodesic is ‘incomplete’ if it ceases to be well-defined after a finite amount of time. For example, general relativity says a test particle falling into a black hole follows an incomplete geodesic. In a rough-and-ready way, people say the particle ‘hits the singularity’. But the singularity is not a place in spacetime. What we really mean is that the particle’s path becomes undefined after a finite amount of time.

The first modern singularity theorem was proved by Penrose in 1965. It says that if space is infinite in extent, and light becomes trapped inside some bounded region, and no exotic matter is present to save the day, either a singularity or something even more bizarre must occur. This theorem applies to collapsing stars. When a star of sufficient mass collapses, general relativity says that its gravity becomes so strong that light becomes trapped inside some bounded region. We can then use Penrose’s theorem to analyze the possibilities.

Here is Penrose’s story of how he discovered this:

At that time I was at Birkbeck College, and a friend of mine, Ivor Robinson, who’s an Englishman but he was working in Dallas, Texas at the time, and he was talking to me … I forget what it was … he was a very … he had a wonderful way with words and so he was talking to me, and we got to this crossroad and as we crossed the road he stopped talking as we were watching out for traffic. We got to the other side and then he started talking again. And then when he left I had this strange feeling of elation and I couldn’t quite work out why I was feeling like that. So I went through all the things that had happened to me during the day—you know, what I had for breakfast and goodness knows what—and finally it came to this point when I was crossing the street, and I realised that I had a certain idea, and this idea was the crucial characterisation of when a collapse had reached a point of no return, without assuming any symmetry or anything like that. So this is what I called a trapped surface. And this was the key thing, so I went back to my office and I sketched out a proof of the collapse theorem. The paper I wrote was not that long afterwards, which went to Physical Review Letters, and it was published in 1965 I think.

Shortly thereafter Hawking proved a second singularity theorem, which applies to the Big Bang. It says that if space is finite in extent, and no exotic matter is present, generically either a singularity or something even more bizarre must occur. The singularity here could be either a Big Bang in the past, a Big Crunch in the future, both—or possibly something else. Hawking also proved a version of his theorem that applies to certain Lorentzian manifolds where space is infinite in extent, as seems to be the case in our Universe. This version requires extra conditions.

There are some undefined phrases in my summary of the Penrose–Hawking singularity theorems, most notably these:

• ‘exotic matter’

• ‘something even more bizarre’.

In each case I mean something precise.

These singularity theorems precisely specify what is meant by ‘exotic matter’. All known forms of matter obey the ‘dominant energy condition’, which says that

|P_x|, \, |P_y|, \, |P_z| \le \rho

at all points and in all locally Minkowskian coordinates. Exotic matter is anything that violates this condition.
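
The condition is simple enough to check in one line of code. Here is a trivial sketch with made-up numbers:

```python
def dominant_energy_condition(rho, Px, Py, Pz):
    """Check |P_x|, |P_y|, |P_z| <= rho at a point."""
    return all(abs(P) <= rho for P in (Px, Py, Pz))

print(dominant_energy_condition(1.0, 0.3, 0.3, 0.3))    # ordinary matter: True
print(dominant_energy_condition(1.0, -2.0, 0.0, 0.0))   # exotic matter: False
```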

The Penrose–Hawking singularity theorems also say what counts as ‘something even more bizarre’. An example would be a closed timelike curve. A particle following such a path would move slower than light yet eventually reach the same point where it started—and not just the same point in space, but the same point in spacetime! If you could do this, perhaps you could wait, see if it would rain tomorrow, and then go back and decide whether to buy an umbrella today. There are certainly solutions of Einstein’s equation with closed timelike curves. The first interesting one was found by Einstein’s friend Gödel in 1949, as part of an attempt to probe the nature of time. However, closed timelike curves are generally considered less plausible than singularities.

In the Penrose–Hawking singularity theorems, ‘something even more bizarre’ means precisely this: spacetime is not ‘globally hyperbolic’. To understand this, we need to think about when we can predict the future or past given initial data. When studying field equations like Maxwell’s theory of electromagnetism or Einstein’s theory of gravity, physicists like to specify initial data on space at a given moment of time. However, in general relativity there is considerable freedom in how we choose a slice of spacetime and call it ‘space’. What should we require? For starters, we want a 3-dimensional submanifold S of spacetime that is ‘spacelike’: every vector v tangent to S should have g(v,v) > 0. However, we also want any timelike or null curve to hit S exactly once. A spacelike surface with this property is called a Cauchy surface, and a Lorentzian manifold containing a Cauchy surface is said to be globally hyperbolic. There are many theorems justifying the importance of this concept. Global hyperbolicity excludes closed timelike curves, but also other bizarre behavior.

By now the original singularity theorems have been greatly generalized and clarified. Hawking and Penrose gave a unified treatment of both theorems in 1970, which you can read here:

• Stephen William Hawking and Roger Penrose, The singularities of gravitational collapse and cosmology, Proc. Royal Soc. London A 314 (1970), 529–548.

The 1973 textbook by Hawking and Ellis gives a systematic introduction to this subject. A paper by Garfinkle and Senovilla reviews the subject and its history up to 2015. Also try the first two chapters of this wonderful book:

• Stephen Hawking and Roger Penrose, The Nature of Space and Time, Princeton U. Press, 1996.

You can find the first chapter, by Hawking, here: it describes the singularity theorems. The second, by Penrose, discusses the nature of singularities in general relativity.

I’m sure Penrose’s Nobel Lecture will also be worth watching. Three cheers to Roger Penrose!


Fock Space Techniques for Stochastic Physics

2 October, 2020

I’ve been fascinated for a long time by the relation between classical probability theory and quantum mechanics. This story took a strange new turn when people discovered that stochastic Petri nets, good for describing classical probabilistic models of interacting entities, can also be described using ideas from quantum field theory!

I’ll be talking about this at the online category theory seminar at UNAM, the National Autonomous University of Mexico, on Wednesday October 7th at 18:00 UTC (11 am Pacific Time):

Fock space techniques for stochastic physics

Abstract. Some ideas from quantum theory are beginning to percolate back to classical probability theory. For example, the master equation for a chemical reaction network—also known as a stochastic Petri net—describes particle interactions in a stochastic rather than quantum way. If we look at this equation from the perspective of quantum theory, this formalism turns out to involve creation and annihilation operators, coherent states and other well-known ideas—but with a few big differences.
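
To give a flavor of the formalism, here is my sketch of the simplest example, radioactive decay, not anything specific from the talk. If each particle decays at rate r, the master equation is d\psi/dt = r(a - a^\dagger a)\psi, where \psi_n is the probability of having n particles and a, a^\dagger are annihilation and creation operators. Truncating to at most N particles turns everything into matrices:

```python
import numpy as np
from scipy.linalg import expm

N = 20    # truncate the Fock space at N particles (an approximation)
r = 1.0   # a made-up decay rate

# Stochastic (not quantum!) conventions:
# (a psi)_n = (n+1) psi_{n+1},  (a_dag psi)_n = psi_{n-1}
a = np.diag(np.arange(1.0, N + 1), k=1)
a_dag = np.diag(np.ones(N), k=-1)

H = r * (a - a_dag @ a)   # the Hamiltonian of a pure death process

psi = np.zeros(N + 1)
psi[10] = 1.0                   # start with exactly 10 particles
psi = expm(H * 1.0) @ psi       # evolve the master equation to time t = 1

print(psi.sum())                # ~1: total probability is conserved
print(psi @ np.arange(N + 1))   # ~10/e: expected number decays exponentially
```

Note the contrast with quantum mechanics: here \psi is a probability distribution, not an amplitude, and H conserves the sum of the entries of \psi rather than the sum of their squares.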

You can watch the talk here:

You can also see the slides of this talk. Click on any picture in the slides, or any text in blue, and get more information!

My students Joe Moeller and Jade Master will also be giving talks in this seminar—on Petri nets and structured cospans.



The Brownian Map

19 September, 2020


Nina Holden won the 2021 Maryam Mirzakhani New Frontiers Prize for her work on random surfaces and the mathematics of quantum gravity. I’d like to tell you what she did… but I’m so far behind I’ll just explain a bit of the background.

Suppose you randomly choose a triangulation of the sphere with n triangles. This is a purely combinatorial thing, but you can think of it as a metric space if each of the triangles is equilateral with all sides of length 1.

This is a distorted picture of what you might get, drawn by Jérémie Bettinelli:


The triangles are not drawn as equilateral, so we can fit this shape into 3d space. Visit Bettinelli’s page for images that you can rotate:

• Jérémie Bettinelli, Computer simulations of random maps.

I’ve described how to build a random space out of n triangles. In the limit n \to \infty, if you rescale the resulting space by a factor of n^{-1/4} so it doesn’t get bigger and bigger, it converges to a ‘random metric space’ with fascinating properties. It’s called the Brownian map.

This random metric space is on average so wrinkly and crinkly that ‘almost surely’—that is, with probability 1—its Hausdorff dimension is not 2 but 4. And yet it is almost surely homeomorphic to a sphere!

Rigorously proving this is hard: a mix of combinatorics, probability theory and geometry.

Ideas from physics are also important here. There’s a theory called ‘Liouville quantum gravity’ that describes these random 2-dimensional surfaces. So, physicists have ways of—nonrigorously—figuring out answers to some questions faster than the mathematicians!

A key step in understanding the Brownian map was this paper from 2013:

• Jean-François Le Gall, Uniqueness and universality of the Brownian map, Annals of Probability 41 (2013), 2880–2960.



The Brownian map is to surfaces what Brownian motion is to curves. For example, the Hausdorff dimension of Brownian motion is almost surely 2: twice the dimension of a smooth curve. For the Brownian map it’s almost surely 4, twice the dimension of a smooth surface.
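
You can get a feel for the Brownian motion statement numerically. Here is a rough sketch; box-counting dimension is only a crude stand-in for Hausdorff dimension, and the estimate is noisy, but the slope comes out near 2:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
# Planar Brownian motion, approximated by a Gaussian random walk,
# scaled so the whole path has spatial extent of order 1
path = np.cumsum(rng.normal(size=(n, 2)) / np.sqrt(n), axis=0)

def box_count(path, eps):
    """Count the eps-by-eps boxes that the path visits."""
    return len(np.unique(np.floor(path / eps).astype(np.int64), axis=0))

eps = np.array([0.1, 0.05, 0.025, 0.0125])
counts = np.array([box_count(path, e) for e in eps])
slope = np.polyfit(np.log(1 / eps), np.log(counts), 1)[0]
print(slope)   # close to 2, though a smooth curve would give 1
```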

Let me just say one more technical thing. There’s a ‘space of all compact metric spaces’, and the Brownian map is actually a probability measure on this space! It’s called the Gromov–Hausdorff space, and it itself is a metric space… but not compact. (So no, we don’t have a set containing itself as an element.)


There’s a lot more to say about this… but I haven’t gotten very close to understanding Nina Holden’s work yet. She wrote a 7-paper series leading up to this one:

• Nina Holden and Xin Sun, Convergence of uniform triangulations under the Cardy embedding.

They show that random triangulations of a disk converge to a random metric on the disk, which can also be obtained from Liouville quantum gravity.

This is a much easier place to start learning this general subject:

• Ewain Gwynne, Random surfaces and Liouville quantum gravity.

One reason I find all this interesting is that when I worked on ‘spin foam models’ of quantum gravity, we were trying to develop combinatorial theories of spacetime that had nice limits as the number of discrete building blocks approached infinity. We were studying theories much more complicated than random 2-dimensional triangulations, and it quickly became clear to me how much work it would be to carefully analyze these. So it’s interesting to see how mathematicians have entered this subject—starting with a much more tractable class of theories, which are already quite hard.

While the theory I just described gives random metric spaces whose Hausdorff dimension is twice their topological dimension, Liouville quantum gravity actually contains an adjustable parameter that lets you force these metric spaces to become less wrinkled, with lower Hausdorff dimension. Taming the behavior of random triangulations gets harder in higher dimensions. Renate Loll, Jan Ambjørn and others have argued that we need to work with Lorentzian rather than Riemannian geometries to get physically reasonable behavior. This approach to quantum gravity is called causal dynamical triangulations.


Diary, 2003-2020

8 August, 2020

I keep putting off organizing my written material, but with coronavirus I’m feeling more mortal than usual, so I’d like to get this out into the world now:

• John Baez, Diary, 2003–2020.

Go ahead and grab a copy!

It’s got all my best tweets and Google+ posts, mainly explaining math and physics, but also my travel notes and other things… starting in 2003 with my ruminations on economics and ecology. It’s too big to read all at once, but I think you can dip into it more or less anywhere and pull out something fun.

It goes up to July 2020. It’s 2184 pages long.

I fixed a few problems like missing pictures, but there are probably more. If you let me know about them, I’ll fix them (if it’s easy).


Open Systems in Classical Mechanics

5 August, 2020

I think we need a ‘compositional’ approach to classical mechanics. A classical system is typically built from parts, and we describe the whole system by describing its parts and then saying how they are put together. But this aspect of classical mechanics is typically left informal. You learn how it works in a physics class by doing lots of homework problems, but the rules are never completely spelled out, which is one reason physics is hard.

I want an approach that makes the compositionality of classical mechanics formal: a category (or categories) where the morphisms are open classical systems—that is, classical systems with the ability to interact with the outside world—and composing these morphisms describes putting together open systems to form larger open systems.

There are actually two main approaches to classical mechanics: the Lagrangian approach, which describes the state of a system in terms of position and velocity, and the Hamiltonian approach, which describes the state of a system in terms of position and momentum. There’s a way to go from the first approach to the second, called the Legendre transformation. So we should have at least two categories, one for Lagrangian open systems and one for Hamiltonian open systems, and a functor from the first to the second.

That’s what this paper provides:

• John C. Baez, David Weisbart and Adam Yassine, Open systems in classical mechanics.

The basic idea is not new—but there are some twists! I like treating open systems as cospans with extra structure. But in this case it makes more sense to use spans, since the space of states of a classical system maps to the space of states of any subsystem. We’ll compose these spans using pullbacks.

For example, suppose you have a spring with rocks at both ends:

[figure: a spring with rocks at both ends]

If it’s in 1-dimensional space, and we only care about the position and momentum of the two rocks (not vibrations of the spring), we can say the phase space of this system is the cotangent bundle T^\ast \mathbb{R}^2.

But this system has some interesting subsystems: the rocks at the ends! So we get a span. We could draw it like this:

[figure: a span, with the two-rock system on top and the two rocks as its feet]

but what I really mean is that we have a span of phase spaces:

[figure: the corresponding span of phase spaces]

Here the left-hand arrow maps the state of the whole system to the state of the left-hand rock, and the right-hand arrow maps the state of the whole system to the state of the right-hand rock. These maps are smooth maps between manifolds, but they’re better than that! They are Poisson maps between symplectic manifolds: that’s where the physics comes in. They’re also surjective.

Now suppose we have two such open systems. We can compose them, or ‘glue them together’, by identifying the right-hand rock of one with the left-hand rock of the other. We can draw this as follows:

[figure: two spans glued along a common rock]

Now we have a big three-rock system on top, whose states map to states of our original two-rock systems, and then down to states of the individual rocks. This picture really stands for the following commutative diagram:

[figure: the commutative diagram of phase spaces, with the three-rock system obtained as a pullback]

Here the phase space of the big three-rock system on top is obtained as a pullback: that’s how we formalize the process of gluing together two open systems! We can then discard some information and get a span:

[figure: the composite span]

Bravo! We’ve managed to build a more complicated open system by gluing together two simpler ones! Or in mathematical terms: we’ve taken two spans of symplectic manifolds, where the maps involved are surjective Poisson maps, and composed them to get another such span.

Since we can compose them, it shouldn’t be surprising that there’s a category whose morphisms are such spans—or more precisely, isomorphism classes of such spans. But we can go further! We can equip all the symplectic manifolds in this story with Hamiltonians, to describe dynamics. And we get a category whose morphisms are open Hamiltonian systems, which we call \mathsf{HamSy}. This is Theorem 4.2 of our paper.

But be careful: to describe one of these open Hamiltonian systems, we need to choose a Hamiltonian not only on the symplectic manifold at the apex of the span, but also on the two symplectic manifolds at the bottom—its ‘feet’. We need this to be able to compute the new Hamiltonian we get when we compose, or glue together, two open Hamiltonian systems. If we just added Hamiltonians for two subsystems, we’d ‘double-count’ the energy when we glued them together.
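
Here is a toy version of this bookkeeping in Python, with finite sets of states standing in for phase spaces. None of the symplectic geometry survives, just the span composition by pullback and the Hamiltonian arithmetic; the states and energies are made up.

```python
def compose(span1, span2, H_shared):
    """Compose spans X <- A -> Y and Y <- B -> Z by pullback over Y.

    Each span is (apex, left_map, right_map, H), with the maps and the
    Hamiltonian H given as dicts. H_shared is the Hamiltonian on the
    shared foot Y; subtracting it avoids double-counting that energy.
    """
    A, f, g, H1 = span1
    B, h, k, H2 = span2
    P = [(a, b) for a in A for b in B if g[a] == h[b]]   # the pullback
    H = {(a, b): H1[a] + H2[b] - H_shared[g[a]] for (a, b) in P}
    left = {(a, b): f[a] for (a, b) in P}
    right = {(a, b): k[b] for (a, b) in P}
    return (P, left, right, H)

# Two "two-rock" systems sharing their middle rock. Each rock has
# states {0, 1}, with energy equal to its state.
A = [(i, j) for i in (0, 1) for j in (0, 1)]
f = {s: s[0] for s in A}        # restrict to the left rock
g = {s: s[1] for s in A}        # restrict to the right rock
H1 = {s: s[0] + s[1] for s in A}
H_mid = {0: 0, 1: 1}            # energy of the shared rock on its own

P, left, right, H = compose((A, f, g, H1), (A, f, g, H1), H_mid)
print(H[((1, 1), (1, 0))])      # (1+1) + (1+0) - 1 = 2: middle rock counted once
```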

This takes us further from the decorated cospan or structured cospan frameworks I’ve been talking about repeatedly on this blog. Using spans instead of cospans is not a big deal: a span in some category is just a cospan in the opposite category. What’s a bigger deal is that we’re decorating not just the apex of our spans with extra data, but its feet—and when we compose our spans, we need this data on the feet to compute the data for the apex of the new composite span.

Furthermore, doing pullbacks is subtler in categories of manifolds than in the categories I’d been using for decorated or structured cospans. To handle this nicely, my coauthors wrote a whole separate paper!

• David Weisbart and Adam Yassine, Constructing span categories from categories without pullbacks.

Anyway, in our present paper we get not only a category \mathsf{HamSy} of open Hamiltonian systems, but also a category \mathsf{LagSy} of open Lagrangian systems. So we can do both Hamiltonian and Lagrangian mechanics with open systems.

Moreover, they’re compatible! In classical mechanics we use the Legendre transformation to turn Lagrangian systems into their Hamiltonian counterparts. Now this becomes a functor:

\mathcal{L} \colon \mathsf{LagSy} \to \mathsf{HamSy}

That’s Theorem 5.5.

So, classical mechanics is becoming ‘compositional’. We can convert the Lagrangian descriptions of a bunch of little open systems into their Hamiltonian descriptions and then glue the results together, and we get the same answer as if we did that conversion on the whole big system. Thus, we’re starting to formalize the way physicists think about physical systems ‘one piece at a time’.


Getting to the Bottom of Noether’s Theorem

29 June, 2020

Most of us have been staying holed up at home lately. I spent the last month holed up writing a paper that expands on my talk at a conference honoring the centennial of Noether’s 1918 paper on symmetries and conservation laws. This made my confinement a lot more bearable. It was good getting back to this sort of mathematical physics after a long time spent on applied category theory. It turns out I really missed it.

While everyone at the conference kept emphasizing that Noether’s 1918 paper had two big theorems in it, my paper is just about the easy one—the one physicists call Noether’s theorem:

Getting to the bottom of Noether’s theorem.

People often summarize this theorem by saying “symmetries give conservation laws”. And that’s right, but it’s only true under some assumptions: for example, that the equations of motion come from a Lagrangian.

This leads to some interesting questions. For which types of physical theories do symmetries give conservation laws? What are we assuming about the world, if we assume it is described by theories of this type? It’s hard to get to the bottom of these questions, but it’s worth trying.

We can prove versions of Noether’s theorem relating symmetries to conserved quantities in many frameworks. While a differential geometric framework is truer to Noether’s original vision, my paper studies the theorem algebraically, without mentioning Lagrangians.

Now, Atiyah said:

…algebra is to the geometer what you might call the Faustian offer. As you know, Faust in Goethe’s story was offered whatever he wanted (in his case the love of a beautiful woman), by the devil, in return for selling his soul. Algebra is the offer made by the devil to the mathematician. The devil says: I will give you this powerful machine, it will answer any question you like. All you need to do is give me your soul: give up geometry and you will have this marvellous machine.

While this is sometimes true, algebra is more than a computational tool: it allows us to express concepts in a very clear and distilled way. Furthermore, the geometrical framework developed for classical mechanics is not sufficient for quantum mechanics. An algebraic approach emphasizes the similarity between classical and quantum mechanics, clarifying their differences.

In talking about Noether’s theorem I keep using an interlocking trio of important concepts used to describe physical systems: ‘states’, ‘observables’ and ‘generators’. A physical system has a convex set of states, where convex linear combinations let us describe probabilistic mixtures of states. An observable is a real-valued quantity whose value depends—perhaps with some randomness—on the state. More precisely: an observable maps each state to a probability measure on the real line. A generator, on the other hand, is something that gives rise to a one-parameter group of transformations of the set of states—or dually, of the set of observables.

It’s easy to mix up observables and generators, but I want to distinguish them. When we say ‘the energy of the system is 7 joules’, we are treating energy as an observable: something you can measure. When we say ‘the Hamiltonian generates time translations’, we are treating the Hamiltonian as a generator.

In both classical mechanics and ordinary complex quantum mechanics we usually say the Hamiltonian is the energy, because we have a way to identify them. But observables and generators play distinct roles—and in some theories, such as real or quaternionic quantum mechanics, they are truly different. In all the theories I consider in my paper the set of observables is a Jordan algebra, while the set of generators is a Lie algebra. (Don’t worry, I explain what those are.)

When we can identify observables with generators, we can state Noether’s theorem as the following equivalence:


The generator a generates transformations that leave the
observable b fixed.

\Updownarrow

The generator b generates transformations that leave the observable a fixed.

In this beautifully symmetrical statement, we switch from thinking of a as the generator and b as the observable in the first part to thinking of b as the generator and a as the observable in the second part. Of course, this statement is true only under some conditions, and the goal of my paper is to better understand these conditions. But the most fundamental condition, I claim, is the ability to identify observables with generators.
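
In classical mechanics, where observables and generators can be identified, this equivalence is easy to check in examples. Here is a quick sympy sketch of my own, not from the paper: a rotationally symmetric Hamiltonian and the angular momentum of a particle in the plane. Both directions of Noether’s theorem amount to the vanishing of one Poisson bracket.

```python
import sympy as sp

qx, qy, px, py = sp.symbols('q_x q_y p_x p_y')

def poisson(f, g):
    """The Poisson bracket {f, g} for a particle in the plane."""
    return (sp.diff(f, qx) * sp.diff(g, px) - sp.diff(f, px) * sp.diff(g, qx)
          + sp.diff(f, qy) * sp.diff(g, py) - sp.diff(f, py) * sp.diff(g, qy))

H = (px**2 + py**2) / 2 + (qx**2 + qy**2)   # a 2d harmonic oscillator
L = qx * py - qy * px                       # angular momentum

print(sp.simplify(poisson(H, L)))   # 0: time evolution conserves L
print(sp.simplify(poisson(L, H)))   # 0: rotations conserve H
```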

In classical mechanics we treat observables as being the same as generators, by treating them as elements of a Poisson algebra, which is both a Jordan algebra and a Lie algebra. In quantum mechanics observables are not quite the same as generators. They are both elements of something called a ∗-algebra. Observables are self-adjoint, obeying

a^* = a

while generators are skew-adjoint, obeying

a^* = -a

The self-adjoint elements form a Jordan algebra, while the skew-adjoint elements form a Lie algebra.

In ordinary complex quantum mechanics we use a complex ∗-algebra. This lets us turn any self-adjoint element into a skew-adjoint one by multiplying it by \sqrt{-1}. Thus, the complex numbers let us identify observables with generators! In real and quaternionic quantum mechanics this identification is impossible, so the appearance of complex numbers in quantum mechanics is closely connected to Noether’s theorem.

In short, classical mechanics and ordinary complex quantum mechanics fit together in this sort of picture:

[figure: observables and generators in classical and quantum mechanics]

To dig deeper, it’s good to examine generators on their own: that is, Lie algebras. Lie algebras arise very naturally from the concept of ‘symmetry’. Any Lie group gives rise to a Lie algebra, and any element of this Lie algebra then generates a one-parameter family of transformations of that very same Lie algebra. This lets us state a version of Noether’s theorem solely in terms of generators:


The generator a generates transformations that leave the generator b fixed.

\Updownarrow

The generator b generates transformations that leave the generator a fixed.

And when we translate these statements into equations, their equivalence follows directly from this elementary property of the Lie bracket:


[a,b] = 0

\Updownarrow

[b,a] = 0

Thus, Noether’s theorem is almost automatic if we forget about observables and work solely with generators. The only questions left are: why should symmetries be described by Lie groups, and what is the meaning of this property of the Lie bracket?

In my paper I tackle both these questions, and point out that the Lie algebra formulation of Noether’s theorem comes from a more primitive group formulation, which says that whenever you have two group elements g and h,


g commutes with h.

\Updownarrow

h commutes with g.

That is: whenever you’ve got two ways of transforming a physical system, the first transformation is ‘conserved’ by the second if and only if the second is conserved by the first!

However, observables are crucial in physics. Working solely with generators in order to make Noether’s theorem a tautology would be another sort of Faustian bargain. So, to really get to the bottom of Noether’s theorem, we need to understand the map from observables to generators. In ordinary quantum mechanics this comes from multiplication by i. But this just pushes the mystery back a notch: why should we be using the complex numbers in quantum mechanics?

For this it’s good to spend some time examining observables on their own: that is, Jordan algebras. Those of greatest importance in physics are the unital JB-algebras, which are unfortunately named not after me, but after Jordan and Banach. These allow a unified approach to real, complex and quaternionic quantum mechanics, along with some more exotic theories. So, they let us study how the role of complex numbers in quantum mechanics is connected to Noether’s theorem.

Any unital JB-algebra O has a partial ordering: that is, we can talk about one observable being greater than or equal to another. With the help of this we can define states on O, and prove that any observable maps each state to a probability measure on the real line.

More surprisingly, any JB-algebra also gives rise to two Lie algebras. The smaller of these, say L, has elements that generate transformations of O that preserve all the structure of this unital JB-algebra. They also act on the set of states. Thus, elements of L truly deserve to be considered ‘generators’.

In a unital JB-algebra there is not always a way to reinterpret observables as generators. However, Alfsen and Shultz have defined the notion of a ‘dynamical correspondence’ for such an algebra, which is a well-behaved map

\psi \colon O \to L

One of the two conditions they impose on this map implies a version of Noether’s theorem. They prove that any JB-algebra with a dynamical correspondence gives a complex ∗-algebra where the observables are self-adjoint elements, the generators are skew-adjoint, and we can convert observables into generators by multiplying them by i.

This result is important, because the definition of JB-algebra does not involve the complex numbers, nor does the concept of dynamical correspondence. Rather, the role of the complex numbers in quantum mechanics emerges from a map from observables to generators that obeys conditions including Noether’s theorem!

To be a bit more precise, Alfsen and Shultz’s first condition on the map \psi \colon O \to L says that every observable a \in O generates transformations that leave a itself fixed. I call this the self-conservation principle. It implies Noether’s theorem.

However, in their definition of dynamical correspondence, Alfsen and Shultz also impose a second, more mysterious condition on the map \psi. I claim that this condition is best understood in terms of the larger Lie algebra associated to a unital JB-algebra. As a vector space this is the direct sum

A = O \oplus L

but it’s equipped with a Lie bracket such that

[-,-] \colon L \times L \to L    \qquad [-,-] \colon L \times O \to O

[-,-] \colon O \times L \to O    \qquad [-,-] \colon O \times O \to L

As I mentioned, elements of L generate transformations of O that preserve all the structure on this unital JB-algebra. Elements of O also generate transformations of O, but these only preserve its vector space structure and partial ordering.

What’s the meaning of these other transformations? I claim they’re connected to statistical mechanics.

For example, consider ordinary quantum mechanics and let O be the unital JB-algebra of all bounded self-adjoint operators on a complex Hilbert space. Then L is the Lie algebra of all bounded skew-adjoint operators on this Hilbert space. There is a dynamical correspondence sending any observable H \in O to the generator \psi(H) = iH \in L, which then generates a one-parameter group of transformations of O like this:

a \mapsto e^{itH/\hbar} \, a \, e^{-itH/\hbar}  \qquad \forall t \in \mathbb{R}, a \in O

where \hbar is Planck’s constant. If H is the Hamiltonian of some system, this is the usual formula for time evolution of observables in the Heisenberg picture. But H also generates a one-parameter group of transformations of O as follows:

a \mapsto  e^{-\beta H/2} \, a \, e^{-\beta H/2}  \qquad \forall \beta \in \mathbb{R}, a \in O

Writing \beta = 1/kT where T is temperature and k is Boltzmann’s constant, I claim that these are ‘thermal transformations’. Acting on a state in thermal equilibrium at some temperature, these transformations produce states in thermal equilibrium at other temperatures (up to normalization).
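
Here is a small numerical illustration of these two one-parameter groups (my own sketch, with a random matrix standing in for a physical Hamiltonian, and units where \hbar = k = 1). Both transformations send self-adjoint operators to self-adjoint operators, so both send observables to observables:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2            # a random self-adjoint "Hamiltonian"
A = np.diag([1.0, 2.0, 3.0, 4.0])   # a hypothetical observable

t, beta = 0.7, 0.3

# Heisenberg-picture time evolution: conjugation by a unitary
A_t = expm(1j * t * H) @ A @ expm(-1j * t * H)

# "Thermal" transformation: the same formula with it replaced by -beta/2
A_beta = expm(-beta * H / 2) @ A @ expm(-beta * H / 2)

print(np.allclose(A_t, A_t.conj().T))        # True: still self-adjoint
print(np.allclose(A_beta, A_beta.conj().T))  # True: still self-adjoint
```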

The analogy between it/\hbar and 1/kT is often summarized by saying “inverse temperature is imaginary time”. The second condition in Alfsen and Shultz’s definition of dynamical correspondence is a way of capturing this principle in a way that does not explicitly mention the complex numbers. Thus, we may very roughly say their result explains the role of complex numbers in quantum mechanics starting from three assumptions:

• observables form a Jordan algebra of a nice sort (a unital JB-algebra)

• the self-conservation principle (and thus Noether’s theorem)

• the relation between time and inverse temperature.

I still want to understand all of this more deeply, but the way statistical mechanics entered the game was surprising to me, so I feel I made a little progress.

I hope the paper is half as fun to read as it was to write! There’s a lot more in it than described here.


Thermalization

12 June, 2020

I’m wondering if people talk about this. Maybe you know?

Given a self-adjoint operator H that’s bounded below and a density matrix D on some Hilbert space, we can define for any \beta > 0 a new density matrix

\displaystyle{ D_\beta = \frac{e^{-\beta H/2} \, D \, e^{-\beta H/2}}{\mathrm{tr}(e^{-\beta H/2} \, D \, e^{-\beta H/2})} }

I would like to call this the thermalization of D when H is a Hamiltonian and \beta = 1/kT where T is the temperature and k is Boltzmann’s constant.

For example, in the finite-dimensional case we can take D to be the identity matrix, normalized to have trace 1. Then D_\beta is the Gibbs state at temperature T: that is, the state of thermal equilibrium at temperature T.
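
In the finite-dimensional case this is easy to play with. Here is a numpy sketch, with a random matrix standing in for the Hamiltonian, checking that thermalizing the normalized identity matrix gives the Gibbs state:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
M = rng.normal(size=(5, 5))
H = (M + M.T) / 2                 # a random self-adjoint Hamiltonian
D = np.eye(5) / 5                 # the identity matrix, normalized to trace 1

def thermalize(D, H, beta):
    """D_beta = exp(-beta H/2) D exp(-beta H/2), renormalized to trace 1."""
    X = expm(-beta * H / 2) @ D @ expm(-beta * H / 2)
    return X / np.trace(X)

beta = 2.0
gibbs = expm(-beta * H) / np.trace(expm(-beta * H))
print(np.allclose(thermalize(D, H, beta), gibbs))   # True
```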

But I want to know if you’ve seen people do this thermalization trick starting from some other density matrix D.


Formal Concepts vs Eigenvectors of Density Operators

7 May, 2020

In the seventh talk of the ACT@UCR seminar, Tai-Danae Bradley told us about applications of categorical quantum mechanics to formal concept analysis.

She gave her talk on Wednesday May 13th. Afterwards we discussed her talk at the Category Theory Community Server. You can see those discussions here if you become a member:

https://categorytheory.zulipchat.com/#narrow/stream/229966-ACT.40UCR-seminar/topic/May.2013th.3A.20Tai-Danae.20Bradley

You can see her slides here, or download a video here, or watch the video here:

• Tai-Danae Bradley: Formal concepts vs. eigenvectors of density operators.

Abstract. In this talk, I’ll show how any probability distribution on a product of finite sets gives rise to a pair of linear maps called density operators, whose eigenvectors capture “concepts” inherent in the original probability distribution. In some cases, the eigenvectors coincide with a simple construction from lattice theory known as a formal concept. In general, the operators recover marginal probabilities on their diagonals, and the information stored in their eigenvectors is akin to conditional probability. This is useful in an applied setting, where the eigenvectors and eigenvalues can be glued together to reconstruct joint probabilities. This naturally leads to a tensor network model of the original distribution. I’ll explain these ideas from the ground up, starting with an introduction to formal concepts. Time permitting, I’ll also share how the same ideas lead to a simple framework for modeling hierarchy in natural language. As an aside, it’s known that formal concepts arise as an enriched version of a generalization of the Isbell completion of a category. Oftentimes, the construction is motivated by drawing an analogy with elementary linear algebra. I like to think of this talk as an application of the linear algebraic side of that analogy.
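
Here is my rough numerical sketch of the construction described in the abstract, as I understand it from her thesis (the joint distribution is made up): encode the distribution as a matrix of square roots of probabilities; then the two density operators are M M^\top and M^\top M, their diagonals are the marginals, and their eigenvectors are the ‘concepts’:

```python
import numpy as np

# A made-up joint probability distribution on a product of two finite sets
p = np.array([[0.3, 0.1],
              [0.0, 0.6]])

M = np.sqrt(p)      # "amplitudes": square roots of probabilities
rho_X = M @ M.T     # density operator on the first factor
rho_Y = M.T @ M     # density operator on the second factor

print(np.diag(rho_X))   # [0.4, 0.6]: the marginal distribution on X
print(np.diag(rho_Y))   # [0.3, 0.7]: the marginal distribution on Y

eigenvalues, eigenvectors = np.linalg.eigh(rho_X)
print(eigenvectors)     # the "concepts" hidden in the distribution
```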

Her talk is based on her thesis:

• Tai-Danae Bradley, At the Interface of Algebra and Statistics.

[figure: a slide from Tai-Danae Bradley’s talk]


Superfluid Quasicrystals

31 January, 2020

Condensed matter physics is so cool! Bounce 4 laser beams off mirrors to make an interference pattern with 8-fold symmetry. Put a Bose–Einstein condensate of potassium atoms into this “optical lattice” and you get a superfluid quasicrystal!

You see, no periodic pattern in the plane can have 8-fold symmetry, so the interference pattern of the light is ‘quasiperiodic’: it never repeats itself, though it comes arbitrarily close, sort of like this pattern drawn by Greg Egan:

In the Bose–Einstein condensate all the particles have the same wavefunction, and the wavefunction itself, influenced by the light, also becomes quasiperiodic.

But that’s not all! As you increase the intensity of the lasers, the Bose-Einstein condensate suddenly collapses from a quasicrystal to a ‘localized’ state where all the atoms sit in the same place!

Below, the gray curve is the potential V formed by the lasers, while the blue curve is the absolute value squared of the wavefunction of the Bose–Einstein condensate, |ψ0|².

At the top the lasers are off, so V is zero and |ψ0|² is constant. In the middle the lasers are on, but not too bright, so V and |ψ0|² are quasiperiodic. At the bottom the lasers are brighter, so V is quasiperiodic and larger, and |ψ0|² is localized.
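
Here is a sketch of how such a potential arises, with guessed parameters rather than the experimental ones: add four standing waves whose wavevectors are 45° apart, and you get a quasiperiodic pattern with eightfold symmetry.

```python
import numpy as np
import matplotlib.pyplot as plt

# Four standing waves, rotated 45 degrees from each other
angles = np.arange(4) * np.pi / 4
ks = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # unit wavevectors

x = np.linspace(-20, 20, 800)
X, Y = np.meshgrid(x, x)
V = sum(np.cos(k[0] * X + k[1] * Y) ** 2 for k in ks)

plt.imshow(V, extent=[-20, 20, -20, 20])
plt.title('Quasiperiodic optical lattice with eightfold symmetry')
plt.show()
```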

It’s well known that when a crystal is sufficiently disordered, its electrons may localize: instead of having spread-out wavefunctions, they get trapped in specific regions as shown here:

This phenomenon is called ‘Anderson localization’, and it was discovered around 1958.

But when a Bose-Einstein condensate localizes, all the atoms get trapped in the same place—because they’re all in exactly the same state! This phenomenon was discovered experimentally at the University of Cambridge very recently:

• Matteo Sbroscia, Konrad Viebahn, Edward Carter, Jr-Chiun Yu, Alexander Gaunt and Ulrich Schneider, Observing localisation in a 2D quasicrystalline optical lattice.

The evidence for it is somewhat indirect, so I’m sure people will continue to study it. Localization of a Bose–Einstein condensate in a one-dimensional quasiperiodic potential was seen much earlier, in 2008:

• Giacomo Roati, Chiara D’Errico, Leonardo Fallani, Marco Fattori, Chiara Fort, Matteo Zaccanti, Giovanni Modugno, Michele Modugno and Massimo Inguscio, Anderson localization of a non-interacting Bose–Einstein condensate, Nature 453 (2008), 895–898.

The holy grail, a ‘Bose glass’, remains to be seen. It’s a Bose-Einstein condensate that’s also a glass: its wavefunction is disordered rather than periodic or quasiperiodic.

New forms of matter with strange properties—I love ’em!

For more popularizations of these ideas, see:

• Julia C. Keller, Researchers create new form of matter—supersolid is crystalline and superfluid at the same time, Phys.org, 3 March 2018.

• University of Texas at Dallas, Solid research leads physicists to propose new state of matter, Phys.org, 9 April 2018.

The latter says “The term ‘superfluid quasicrystal’ sounds like something a comic-book villain might use to carry out his dastardly plans.”


Schrödinger and Einstein

5 January, 2020


Schrödinger and Einstein helped invent quantum mechanics. But they didn’t really believe in its implications for the structure of reality, so in their later years they couldn’t get themselves to simply use it like most of their colleagues. Thus, they were largely sidelined. While others made rapid progress in atomic, nuclear and particle physics, they spent a lot of energy criticizing and analyzing quantum theory.

They also spent a lot of time on ‘unified field theories’: theories that sought to unify gravity and electromagnetism, without taking quantum mechanics into account.

After he finally found his equations describing gravity in November 1915, Einstein spent years working out their consequences. In 1917 he changed the equations, introducing the ‘cosmological constant’ Λ to keep the universe from expanding. Whoops.

In 1923, Einstein got excited about attempts to unify gravity and electromagnetism. He wrote to Niels Bohr:

I believe I have finally understood the connection between electricity and gravitation. Eddington has come closer to the truth than Weyl.

You see, Hermann Weyl and Arthur Eddington had both tried to develop unified field theories—theories that unified gravity and electromagnetism. Weyl had tried a gauge theory—indeed, he invented the term ‘gauge transformations’ at this time. In 1918 he asked Einstein to communicate a paper on it to the Berlin Academy. Einstein did, but pointed out a crushing physical objection to it in a footnote!

In 1921, Eddington tried a theory where the fundamental field was not the spacetime metric, but a torsion-free connection. He tried to show that both electromagnetism and gravity could be described by such a theory. But he didn’t even get as far as writing down field equations.

Einstein wrote three papers on Eddington’s ideas in 1923. He was so excited that he sent the first to the Berlin Academy from a ship sailing from Japan! He wrote down field equations and sought to connect them to Maxwell’s equations and general relativity. He was very optimistic at this time, concluding that

Eddington’s general idea in context with the Hamiltonian principle leads to a theory almost free of ambiguities; it does justice to our present knowledge about gravitation and electricity and unifies both kinds of fields in a truly accomplished manner.

Later he noticed the flaws in the theory. He had an elaborate approach to getting charged particles from singular solutions of the equation, though he wished they could be described by nonsingular solutions. He was stumped by the fact that the negatively and positively charged particles he knew—the electron and proton—had different masses. The same problem afflicted Dirac later, until the positron was discovered. But there were also problems in getting Maxwell’s equations and general relativity from this framework, even approximately.

By 1925 his enthusiasm had faded. He wrote to his friend Besso:

Regrettably, I had to throw away my work in the spirit of Eddington. Anyway, I now am convinced that, unfortunately, nothing can be made with the complex of ideas by Weyl–Eddington.

So, he started work on another unified field theory. And another.

And another.

Einstein worked obsessively on unified field theories until his death in 1955. He lost touch with his colleagues’ discoveries in particle physics. He had an assistant, Valentine Bargmann, try to teach him quantum field theory—but he lost interest in a month. All he wanted was a geometrical explanation of gravity and electromagnetism. He never succeeded in this quest.

But there’s more to this story!

The other side of the story is Schrödinger. In the 1940s, he too became obsessed with unified field theories. He and Einstein became good friends—but also competitors in their quest to unify the forces of nature.

But let’s back up a bit. In June 1935, after the famous Einstein-Podolsky-Rosen paper arguing that quantum mechanics was incomplete, Schrödinger wrote to Einstein:

I am very happy that in the paper just published in P.R. you have evidently caught dogmatic q.m. by the coat-tails.

Einstein replied:

You are the only person with whom I am actually willing to come to terms.

They bonded over their philosophical opposition to the Bohr–Heisenberg attitude to quantum mechanics. In November 1935, Schrödinger wrote his paper on ‘Schrödinger’s cat’.



Schrödinger fled Austria after the Nazis took over. In 1940 he got a job at the brand-new Dublin Institute for Advanced Studies.

In 1943 he started writing about unified field theories, corresponding with Einstein. He worked on some theories very similar to those of Einstein and Straus, who were trying to unify gravity and electromagnetism in a theory involving a connection with torsion, whose connection coefficients were therefore non-symmetric. He wrote 8 papers on this subject.

Einstein even sent Schrödinger two of his unpublished papers on these ideas!

In late 1946, Schrödinger had a new insight. He was thrilled.

By 1947 Schrödinger thought he’d made a breakthrough. He presented a paper on January 27th at the Dublin Institute for Advanced Studies. He even called a press conference to announce his new theory!

He predicted that a rotating mass would generate a magnetic field.

The story of the great discovery was quickly telegraphed around the world, and the science editor of the New York Times interviewed Einstein to see what he thought.

Einstein was not impressed. In a carefully prepared statement, he shot Schrödinger down.

Einstein was especially annoyed that Schrödinger had called a press conference to announce his new theory before there was any evidence supporting it.

Wise words. I wish people heeded them!

Schrödinger apologized in a letter to Einstein, claiming that he’d done the press conference just to get a pay raise. Einstein responded curtly, saying “your theory does not really differ from mine”.

They stopped writing to each other for 3 years.

I’d like to understand Schrödinger’s theory using the modern tools of differential geometry. I don’t think it’s promising. I just want to know what it actually says, and what it predicts! Go here for details:

• Schrödinger’s unified field theory, The n-Category Café, December 26, 2019.

For more on Schrödinger’s theory, try his book:

• Erwin Schrödinger, Space-Time Structure, Cambridge U. Press, Cambridge, 1950. Chapter XII: Generalizations of Einstein’s theory.

and his first paper on the theory:

• Erwin Schrödinger, The final affine field laws I, Proceedings of the Royal Irish Academy A 51 (1945–1948), 163–171.

For a wonderfully detailed analysis of the history of unified field theories, including the work of Einstein and Schrödinger, read these:

• Hubert F. M. Goenner, On the history of unified field theories, Living Reviews in Relativity 7 (2004), article no. 2.

• Hubert F. M. Goenner, On the history of unified field theories. Part II (ca. 1930–ca. 1965), Living Reviews in Relativity 17 (2014), article no. 5.

especially Section 6 of the second paper. For more on the story of Einstein and Schrödinger, I recommend this wonderful book:

• Walter Moore, Schrödinger: Life and Thought, Cambridge U. Press, Cambridge, 1989.

This is where I got most of my quotes.