Ramanujan’s Last Formula

27 November, 2020

When I gave a talk about Ramanujan’s easiest formula at the Whittier College math club run by my former student Brandon Coya, one of the students there asked a great question: are there any unproved formulas of Ramanujan left?

So I asked around on MathOverflow, and this is the result:

George Andrews and Bruce Berndt have written five books about Ramanujan’s lost notebook, which was actually not a notebook but a pile of notes Andrews found in 1976 in a box at the Wren Library at Trinity College, Cambridge. In 2019 Berndt wrote about the last unproved identity in the lost notebook:

• Bruce C. Berndt, Junxian Li and Alexandru Zaharescu, The final problem: an identity from Ramanujan’s lost notebook, Journal of the London Mathematical Society 100 (2) (2019), 568–591.

Following Timothy Chow’s advice, I consulted Berndt and asked him if there were any remaining formulas of Ramanujan that have neither been proved nor disproved. He said no:

To the best of my knowledge, there are no claims or conjectures remaining. There are some statements to which we have not been able to attach meaning.

I checked to make sure that this applies to all of Ramanujan’s output, not just the lost notebook, and he said yes.

It’s sort of sad. But there’s a big difference between proving an identity and extracting all the wisdom contained in that identity! A lot of Ramanujan’s formulas have combinatorial interpretations, while others are connected to complex analysis—e.g. mock theta functions—so I’m sure there’s a lot of good work left to do, inspired by Ramanujan’s identities. There is also a continuing industry of discovering Ramanujan-like identities.

For more fun reading, try this:

• Robert P. Schneider, Uncovering Ramanujan’s “lost” notebook: an oral history.

Here’s an identity from Ramanujan’s lost notebook:


The Tenfold Way

22 November, 2020

I now have a semiannual column in the Notices of the American Mathematical Society! I’m excited: it gives me a chance to write short explanations of cool math topics and get them read by up to 30,000 mathematicians. It’s sort of like This Week’s Finds on steroids.

Here’s the first one:

The tenfold way, Notices Amer. Math. Soc. 67 (November 2020), 1599–1601.

The tenfold way became important in physics around 2010: it implies that there are ten fundamentally different kinds of matter. But it goes back to 1964, when C. T. C. Wall classified real ‘super division algebras’. He found that besides \mathbb{R}, \mathbb{C} and \mathbb{H}, which give ‘purely even’ super division algebras, there are seven more. He also showed that these ten algebras are all real or complex Clifford algebras. The eight real ones represent all eight Morita equivalence classes of real Clifford algebras, and the two complex ones do the same for the complex Clifford algebras. The tenfold way thus unites real and complex Bott periodicity.

In my article I explain what a ‘super division algebra’ is, give a quick proof that there are ten real super division algebras, and say a bit about how they show up in quantum mechanics and geometry.

For a lot more about the tenfold way, try this:

The tenfold way.


Ramanujan’s Easiest Formula

18 November, 2020

A while ago I decided to figure out how to prove one of Ramanujan’s formulas. I feel this is the sort of thing every mathematician should try at least once.

I picked the easiest one I could find. Hardy called it one of the “least impressive”. Still, it was pretty interesting: it turned out to be a puzzle within a puzzle. It has an easy outer layer which one can solve using standard ideas in calculus, and a tougher inner core which requires more cleverness. This inner core was cracked first by Laplace and then by Jacobi. Not being clever enough to do it myself, I read Jacobi’s two-page paper on this subject to figure out the trick. It was in Latin, and full of mistakes, but still one of the most fun papers I’ve ever read.

On Friday November 20th I’m giving a talk about this at the Whittier College Math Club, which is run by my former student Brandon Coya. Here are my slides:

Ramanujan’s easiest formula.

Here is Ramanujan’s puzzle in The Journal of the Indian Mathematical Society:



Exponential Discounting

25 October, 2020

Most of us seem to agree that the promise of a dollar in the future is worth less to us than a dollar today, even if the promise is certain to be fulfilled. Economists often assume ‘exponential discounting’, which says that a dollar promised at some time s is worth

\exp(-\alpha(s - t))

dollars in hand at time t. The constant \alpha is connected to the ‘interest rate’.

Why are economists so wedded to exponential discounting? The main reason is probably that it’s mathematically simple. But one argument for it goes roughly like this: if your decisions today are to look rational at any future time, you need to use exponential discounting.

In practice, humans, pigeons and rats do not use exponential discounting. So, economists say they are ‘dynamically inconsistent’:

• Wikipedia, Dynamic inconsistency.

In economics, dynamic inconsistency or time inconsistency is a situation in which a decision-maker’s preferences change over time in such a way that a preference can become inconsistent at another point in time. This can be thought of as there being many different “selves” within decision makers, with each “self” representing the decision-maker at a different point in time; the inconsistency occurs when not all preferences are aligned.

I think ‘inconsistent’ could be a misleading term for what’s going on here. It suggests that something bad is happening. That may not be true.

Anyway, some of the early research on this was done by George Ainslie, and here is what he found:

Ainslie’s research showed that a substantial number of subjects reported that they would prefer $50 immediately rather than $100 in six months, but would NOT prefer $50 in 3 months rather than $100 in nine months, even though this was the same choice seen at 3 months’ greater distance. More significantly, those subjects who said they preferred $50 in 3 months to $100 in 9 months said they would NOT prefer $50 in 12 months to $100 in 18 months—again, the same pair of options at a different distance—showing that the preference-reversal effect did not depend on the excitement of getting an immediate reward. Nor does it depend on human culture; the first preference reversal findings were in rats and pigeons.
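To see concretely how exponential discounting forbids this kind of preference reversal while a hyperbolic rule allows it, here is a minimal Python sketch. The discount rate, the hyperbolic constant, and the choice of months as the time unit are made-up illustrative numbers, not values fitted to Ainslie’s data:

```python
import math

def exp_value(amount, delay_months, alpha=0.2):
    # exponential discounting: value = amount * exp(-alpha * delay)
    return amount * math.exp(-alpha * delay_months)

def hyp_value(amount, delay_months, k=0.25):
    # hyperbolic discounting: value = amount / (1 + k * delay)
    return amount / (1 + k * delay_months)

for value, name in [(exp_value, "exponential"), (hyp_value, "hyperbolic")]:
    near = value(50, 0) > value(100, 6)    # $50 now vs $100 in six months
    far  = value(50, 3) > value(100, 9)    # the same choice, pushed three months away
    print(f"{name}: prefers $50 in the near choice: {near}, in the far choice: {far}")
```

With exponential discounting the two comparisons always agree, whatever the rate is, because delaying both options by the same amount multiplies both values by the same factor. The hyperbolic rule has no such invariance, so it reproduces the reversal Ainslie observed.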

Let me give a mathematical argument for exponential discounting. Of course it will rely on some assumptions. I’m not claiming these assumptions are true! Far from it. I’m just claiming that if we don’t use exponential discounting, we are violating one or more of these assumptions… or breaking out of the whole framework of my argument. The widespread prevalence of ‘dynamic inconsistency’ suggests that the argument doesn’t apply to real life.

Here’s the argument:

Suppose the value to us at any time t of a dollar given to us at some other time s is V(t,s).

Let us assume:

1) The ratio

\displaystyle{ \frac{V(t,s_2)}{V(t,s_1)} }

is independent of t. E.g., the ratio of the value of “a dollar on Friday” to that of “a dollar on Thursday” is the same whether you compute it on Monday, Tuesday, or Wednesday.

2) The quantity V(t,s) depends only on the difference s - t.

3) The quantity V(t,s) is a continuous function of s and t.

Then we can show

V(t,s) = k \exp(-\alpha(s-t))

for some constants \alpha and k. Typically we assume k = 1 since the value of a dollar given to us right now is 1. But let’s just see how we get this formula for V(t,s) out of assumptions 1), 2) and 3).

The proof goes like this. By 2) we know

V(t,s) = F(s-t)

for some function F. By 1) it follows that

\displaystyle{ \frac{F(s_2 - t)}{F(s_1 - t)} }

is independent of t, so

\displaystyle{ \frac{F(s_2 - t)}{F(s_1 - t)} =  \frac{F(s_2)}{F(s_1)} }

or in other words

F(s_2 - t) F(s_1) = F(s_2) F(s_1 - t)

Ugh! What next? Well, if we take s_1 = t, we get a simpler equation that’s probably still good enough to get the job done:

F(s_2 - t) F(t) = F(s_2) F(0)

Now let’s make up a variable t' = s_2 - t, so that s_2 = t + t'. Then we can rewrite our equation as

F(t') F(t) = F(t+t') F(0)

or

F(t) F(t') = F(t+t') F(0)

This is beautiful except for the constant F(0). Let’s call that k and factor it out by writing

F(t) = k G(t)

Then we get

G(t) G(t') = G(t+t')

A theorem of Cauchy implies that any continuous solution of this equation is of the form

G(t) = \exp(-\alpha t)

So, we get

F(t) = k \exp(-\alpha t)

or

V(t,s) = k \exp(-\alpha(s-t))

as desired!
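If you want a quick sanity check of the key step, here is a small numerical sketch in Python (the constants are made up): the exponential form passes the functional equation F(s_2 - t) F(s_1) = F(s_2) F(s_1 - t) on random inputs, while a hyperbolic discount function fails it.

```python
import math, random

def check_functional_equation(F, trials=1000, tol=1e-9):
    # test F(s2 - t) * F(s1) == F(s2) * F(s1 - t) on random inputs
    for _ in range(trials):
        t, s1, s2 = (random.uniform(-5, 5) for _ in range(3))
        if abs(F(s2 - t) * F(s1) - F(s2) * F(s1 - t)) > tol:
            return False
    return True

k, alpha = 0.9, 0.3                                   # made-up constants
exponential = lambda u: k * math.exp(-alpha * u)      # F(s - t) = k exp(-alpha (s - t))
hyperbolic  = lambda u: 1 / (1 + 0.25 * abs(u))       # a non-exponential comparison

print(check_functional_equation(exponential))   # True
print(check_functional_equation(hyperbolic))    # False
```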

By the way, we don’t need to assume G is continuous: it’s enough to assume G is measurable. You can get bizarre nonmeasurable solutions of G(t) G(t') = G(t+t') using the axiom of choice, but they are not of practical interest.

So, assumption 3) is not the assumption I’d want to attack in trying to argue against exponential discounting. In fact both assumptions 1) and 2) are open to quite a few objections. Can you name some? Here’s one: in real life the interest rate changes with time. There must be some reason.

By the way, nothing in the argument I gave shows that \alpha \ge 0. So there could be people who obey assumptions 1)–3) yet believe the promise of a dollar in the future is worth more than a dollar in hand today.

Also, nothing in my argument for the form of V(t,s) assumes that s \ge t. That is, my assumptions as stated also concern the value of a dollar that was promised in the past. So, you might have fun seeing what changes, or does not change, if you restrict the assumptions to say they only apply when s \ge t. The arrow of time seems to be built into economics, after all.

Also, you may enjoy finding the place in my derivation where I might have divided by zero, and figuring out what to do about that.

If you don’t like exponential discounting—for example, because people use it to argue against spending money now to fight climate change—you might prefer hyperbolic discounting:

• Wikipedia, Hyperbolic discounting.


Epidemiological Modeling With Structured Cospans

19 October, 2020

This is a wonderful development! Micah Halter and Evan Patterson have taken my work on structured cospans with Kenny Courser and open Petri nets with Jade Master, together with Joachim Kock’s whole-grain Petri nets, and turned them into a practical software tool!

Then they used that to build a tool for ‘compositional’ modeling of the spread of infectious disease. By ‘compositional’, I mean that they make it easy to build more complex models by sticking together smaller, simpler models.

Even better, they’ve illustrated the use of this tool by rebuilding part of the model that the UK has been using to make policy decisions about COVID-19.

All this software was written in the programming language Julia.

I had expected structured cospans to be useful in programming and modeling, but I didn’t expect it to happen so fast!

For details, read this great article:

• Micah Halter and Evan Patterson, Compositional epidemiological modeling using structured cospans, 17 October 2020.

Abstract. The field of applied category theory (ACT) aims to put the compositionality inherent to scientific and engineering processes on a firm mathematical footing. In this post, we show how the mathematics of ACT can be operationalized to build complex epidemiological models in a compositional way. In the first two sections, we review the idea of structured cospans, a formalism for turning closed systems into open ones, and we illustrate its use in Catlab through the simple example of open graphs. Finally, we put this machinery to work in the setting of Petri nets and epidemiological models. We construct a portion of the COEXIST model for the COVID-19 pandemic and we simulate the resulting ODEs.
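Their post shows real Catlab/AlgebraicJulia code. As a language-agnostic sketch of the underlying idea (and emphatically not their API), here is a toy Python version that ‘composes’ two little Petri nets, one for infection and one for recovery, by identifying their shared places, then turns the result into mass-action ODEs. All names and rate constants here are made up:

```python
from scipy.integrate import solve_ivp

# A Petri net as a list of transitions: (rate, inputs, outputs),
# where inputs/outputs are dicts from place names to multiplicities.
infection = [(0.3, {"S": 1, "I": 1}, {"I": 2})]      # S + I -> 2I
recovery  = [(0.1, {"I": 1}, {"R": 1})]              # I -> R

def compose(*nets):
    # 'gluing along shared places' here just means: places with the same
    # name are identified, and the transitions are pooled together
    return [t for net in nets for t in net]

def vector_field(net, places):
    def f(_, x):
        state = dict(zip(places, x))
        dx = {p: 0.0 for p in places}
        for rate, ins, outs in net:
            speed = rate
            for p, m in ins.items():          # mass-action kinetics
                speed *= state[p] ** m
            for p, m in ins.items():
                dx[p] -= m * speed
            for p, m in outs.items():
                dx[p] += m * speed
        return [dx[p] for p in places]
    return f

sir = compose(infection, recovery)
places = ["S", "I", "R"]
sol = solve_ivp(vector_field(sir, places), (0, 60), [0.99, 0.01, 0.0], max_step=0.5)
print({p: round(v, 3) for p, v in zip(places, sol.y[:, -1])})
```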

You can see related articles by James Fairbanks, Owen Lynch and Evan Patterson here:

AlgebraicJulia Blog.

Also try these videos:

• James Fairbanks, AlgebraicJulia: Applied category theory in Julia, 29 July 2020.

• Evan Patterson, Realizing applied category theory in Julia, 16 January 2020.

I’m biased, but I think this is really cool cutting-edge stuff. If you want to do work along these lines let me know here and I’ll get Patterson to take a look.

Here’s part of a network created using their software:


Open Petri Nets and Their Categories of Processes

14 October, 2020

My student Jade Master will be talking about her work on open Petri nets at the online category theory seminar at UNAM on Wednesday October 21st at 18:00 UTC (11 am Pacific Time):

Open Petri Nets and Their Categories of Processes

Abstract. In this talk we will discuss Petri nets from a categorical perspective. A Petri net freely generates a symmetric monoidal category whose morphisms represent its executions. We will discuss how to make Petri nets ‘open’—i.e., equip them with input and output boundaries where resources can flow in and out. Open Petri nets freely generate open symmetric monoidal categories: symmetric monoidal categories which can be glued together along a shared boundary. The mapping from open Petri nets to their open symmetric monoidal categories is functorial and this gives a compositional framework for reasoning about the executions of Petri nets.

You can see the talk live, or later recorded, here:

You can read more about this work here:

• John Baez and Jade Master, Open Petri nets.

• Jade Master, Generalized Petri nets.

You can see Jade’s slides for a related talk here:

Open Petri nets

Abstract. The reachability semantics for Petri nets can be studied using open Petri nets. For us an ‘open’ Petri net is one with certain places designated as inputs and outputs via a cospan of sets. We can compose open Petri nets by gluing the outputs of one to the inputs of another. Open Petri nets can be treated as morphisms of a category \mathsf{Open}(\mathsf{Petri}), which becomes symmetric monoidal under disjoint union. However, since the composite of open Petri nets is defined only up to isomorphism, it is better to treat them as morphisms of a symmetric monoidal double category \mathbb{O}\mathbf{pen}(\mathsf{Petri}). Various choices of semantics for open Petri nets can be described using symmetric monoidal double functors out of \mathbb{O}\mathbf{pen}(\mathsf{Petri}). Here we describe the reachability semantics, which assigns to each open Petri net the relation saying which markings of the outputs can be obtained from a given marking of the inputs via a sequence of transitions. We show this semantics gives a symmetric monoidal lax double functor from \mathbb{O}\mathbf{pen}(\mathsf{Petri}) to the double category of relations. A key step in the proof is to treat Petri nets as presentations of symmetric monoidal categories; for this we use the work of Meseguer, Montanari, Sassone and others.
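As a very down-to-earth illustration of what the reachability semantics computes, here is a hedged Python sketch: a breadth-first search listing the markings reachable from a given initial marking of a toy Petri net. The net and marking are made up, and this ignores the open, compositional structure the talk is actually about:

```python
from collections import deque

# A tiny Petri net: each transition is (inputs, outputs),
# given as dicts from place names to multiplicities.
transitions = [
    ({"A": 2}, {"B": 1}),   # 2A -> B
    ({"B": 1}, {"C": 1}),   # B -> C
]

def canon(marking):
    # canonical hashable form of a marking, dropping places with zero tokens
    return tuple(sorted((p, n) for p, n in marking.items() if n > 0))

def fire(marking, ins, outs):
    # return the marking after firing the transition, or None if it is not enabled
    if any(marking.get(p, 0) < m for p, m in ins.items()):
        return None
    new = dict(marking)
    for p, m in ins.items():
        new[p] = new.get(p, 0) - m
    for p, m in outs.items():
        new[p] = new.get(p, 0) + m
    return new

def reachable(initial):
    # breadth-first search over markings; finite here because firing never adds tokens
    seen = {canon(initial)}
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for ins, outs in transitions:
            nxt = fire(marking, ins, outs)
            if nxt is not None and canon(nxt) not in seen:
                seen.add(canon(nxt))
                queue.append(nxt)
    return seen

for m in sorted(reachable({"A": 4})):
    print(dict(m))
```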


Recursive Saturation

13 October, 2020

Woo-hoo! Michael Weiss is teaching me about ‘recursively saturated’ models of Peano arithmetic:

• John Baez and Michael Weiss, Non-standard models of arithmetic 20, 12 October 2020.

Suppose you have a model of Peano arithmetic. Suppose you have an infinite list of predicates in Peano arithmetic: \phi_1(x), \phi_2(x), \dots. Suppose that for any finite subset of these, your model of PA has an element x making them true. Is there an element x making all these predicates true?

Of course this doesn’t hold in the standard model of PA. Consider this example:

x > 0, x > 1, x > 2, \dots

For any finite collection of these inequalities, we can find a standard natural number x large enough to make them true. But there’s no standard natural number that makes all these inequalities true.

On the other hand, for any nonstandard model of PA we can find an element x that obeys all of these:

x > 0, x > 1, x > 2, \dots

In fact this is the defining feature of a nonstandard model. (To be clear, I mean x > n where n ranges over standard natural numbers. A model of PA is nonstandard iff it contains an element greater than all the standard natural numbers.)

For a more interesting example, consider these predicates:

2|x, 3|x, 5|x, 7|x, 11|x, \dots

For any finite set of primes we can find a natural number divisible by all the primes in this set. We can’t find a standard natural number divisible by every prime, of course. But remarkably, in any nonstandard model of PA there is an element divisible by every prime—or more precisely, every standard prime.

In fact, suppose we have a model of PA, and an infinite list of predicates \phi_i in PA, and for any finite subset of these our model has an element x obeying the predicates in that subset. Then there is an element x obeying all the predicates \phi_i if:

  1. the model is nonstandard,
  2. you can write a computer program that lists all these predicates \phi_1(x), \phi_2(x), \dots,
  3. there’s an upper bound on the number of alternating quantifiers \forall \exists \forall \exists \cdots in the predicates \phi_i.

Intuitively, this result says that nonstandard models are very ‘fat’, containing nonstandard numbers with a wide range of properties. More technically, 2. says the predicates can be ‘recursively enumerated’, and this remarkable result is summarized by saying every nonstandard model of PA is ‘recursively \Sigma_n-saturated’.

In our conversation, Michael led me through the proof of this result. To do this, we used the fact that despite Tarski’s theorem on the undefinability of truth, truth is arithmetically definable for statements with any fixed upper bound on their number of alternating quantifiers! Michael had explained this end run around the undefinability of truth earlier, in Part 15 and Part 16 of our conversation.

Next we’ll show that if our nonstandard model is ‘ZF-standard’—that is, if it’s the \omega in some model of ZF—we can drop condition 3. above. So, in technical jargon, we’ll show that any nonstandard but ZF-standard model is ‘recursively saturated’.

I’m really enjoying these explorations of logic!


Decimal Digits of 1/π²

10 October, 2020

This formula may let you compute a decimal digit of 1/\pi^2 without computing all the previous digits:

\displaystyle{ \frac{1}{\pi^2} = \frac{2^5}{3} \sum_{n=0}^\infty \frac{(6n)!}{n!^6} (532n^2 + 126n + 9) \frac{1}{1000^{2n+1}}  }

It was discovered here:

• Gert Almkvist and Jesús Guillera, Ramanujan-like series for 1/\pi^2 and string theory, Experimental Mathematics, 21 (2012), 223–234.

They give some sort of argument for it, but apparently not a rigorous proof. Experts seem to believe it:

• Tito Piezas III, A compilation of Ramanujan-type formulas for 1/\pi^m.
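Here is a quick numerical check of the formula using Python’s mpmath library (a check, not a proof, of course): ten terms already match 1/\pi^2 to roughly a dozen digits.

```python
from mpmath import mp, mpf, factorial, pi

mp.dps = 60   # work with 60 decimal digits of precision

def almkvist_guillera(terms):
    # partial sum of (2^5/3) * sum_n (6n)!/n!^6 * (532n^2 + 126n + 9) / 1000^(2n+1)
    s = mpf(0)
    for n in range(terms):
        s += (factorial(6 * n) / factorial(n) ** 6) \
             * (532 * n**2 + 126 * n + 9) / mpf(1000) ** (2 * n + 1)
    return mpf(2) ** 5 / 3 * s

approx = almkvist_guillera(10)
print(approx)
print(1 / pi ** 2)
print(mp.nstr(abs(approx - 1 / pi ** 2), 5))   # the difference is already tiny
```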

It’s reminiscent of the famous Bailey–Borwein–Plouffe formula for \pi:

\displaystyle{ \pi = \sum_{n = 0}^\infty  \frac{1}{16^n} \left( \frac{4}{8n + 1} - \frac{2}{8n + 4} - \frac{1}{8n + 5} - \frac{1}{8n + 6} \right)  }

This lets you compute the nth hexadecimal digit of \pi without computing all the previous ones. It takes cleverness to do this, due to all those fractions.
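The trick is that to get the hex digits starting at position n+1 you only need the fractional part of 16^n \pi, and the head of each sum can be computed with fast modular exponentiation. Here is a bare-bones Python sketch of that idea, without the careful error control a serious implementation would need:

```python
def bbp_sum(j, n, tail_terms=25):
    # fractional part of sum_{k >= 0} 16^(n-k) / (8k + j)
    s = 0.0
    for k in range(n + 1):
        s += pow(16, n - k, 8 * k + j) / (8 * k + j)   # head: modular exponentiation
        s -= int(s)                                    # keep only the fractional part
    for k in range(n + 1, n + 1 + tail_terms):
        s += 16.0 ** (n - k) / (8 * k + j)             # tail: terms shrink fast
    return s - int(s)

def pi_hex_digits(n, count=8):
    # hexadecimal digits of pi starting at position n+1 after the point;
    # only the first several digits are trustworthy in double precision
    x = (4 * bbp_sum(1, n) - 2 * bbp_sum(4, n)
         - bbp_sum(5, n) - bbp_sum(6, n)) % 1.0
    digits = ""
    for _ in range(count):
        x *= 16
        d = int(x)
        digits += "%x" % d
        x -= d
    return digits

print(pi_hex_digits(0))   # pi = 3.243f6a8885... in hexadecimal, so this prints 243f6a88
```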

A similar formula was found by Bellard:

\begin{array}{ccl}  \pi &=& \displaystyle{ \frac{1}{2^6} \sum_{n=0}^\infty \frac{(-1)^n}{2^{10n}} \, \left(-\frac{2^5}{4n+1} - \frac{1}{4n+3} +  \right. } \\ \\  & & \displaystyle{ \left. \frac{2^8}{10n+1} - \frac{2^6}{10n+3} - \frac{2^2}{10n+5} - \frac{2^2}{10n+7} + \frac{1}{10n+9} \right) } \end{array}

Between 1998 and 2000, the distributed computing project PiHex used Bellard’s formula to compute the quadrillionth bit of \pi, which turned out to be… [drum roll]…

A lot of work for nothing!

No formula of this sort is known that lets you compute individual decimal digits of \pi, but it’s cool that we can do it for 1/\pi^2, at least if Almkvist and Guillera’s formula is true.

Someday I’d like to understand any one of these Ramanujan-type formulas. The search for lucid conceptual clarity that makes me love category theory runs into a big challenge when it meets the work of Ramanujan! But it’s a worthwhile challenge. I started here with one of Ramanujan’s easiest formulas:

• John Baez, Chasing the Tail of the Gaussian: Part 1 and Part 2, The n-Category Café, 28 August and 3 September 2020.

But the ideas involved in this formula all predate Ramanujan. For more challenging examples one could try this paper:

• Srinivasa Ramanujan, Modular equations and approximations to \pi, Quarterly Journal of Mathematics, XLV (1914), 350–372.

Here Ramanujan gave 17 formulas for pi, without proof. A friendly-looking explanation of one is given here:

• J. M. Borwein, P. B. Borwein and D. H. Bailey, Ramanujan, modular equations, and approximations to pi or How to compute one billion digits of pi, American Mathematical Monthly 96 (1989), 201–221.

So, this is where I’ll start!


Roger Penrose’s Nobel Prize

8 October, 2020

Roger Penrose just won the Nobel Prize in Physics “for the discovery that black hole formation is a robust prediction of the general theory of relativity.” He shared it with Reinhard Genzel and Andrea Ghez, who won it “for the discovery of a supermassive compact object at the centre of our galaxy.”

This is great news! It’s a pity that Stephen Hawking is no longer alive, because if he were he would surely have shared in this prize. Hawking’s most impressive piece of work—his prediction of black hole evaporation—was too far from being experimentally confirmed to win a Nobel prize before his death. It still is today. The Nobel Prize is conservative in this way: it doesn’t go to theoretical developments that haven’t been experimentally confirmed. That makes a lot of sense. But sometimes they go overboard: Einstein never won a Nobel for general relativity or even special relativity. I consider that a scandal!

I’m glad that the Penrose–Hawking singularity theorems are considered Nobel-worthy. Let me just say a little about what Penrose and Hawking proved.

The most dramatic successful predictions of general relativity are black holes and the Big Bang. According to general relativity, as you follow a particle back in time toward the Big Bang or forward in time as it falls into a black hole, spacetime becomes more and more curved… and eventually it stops! This is roughly what we mean by a singularity. Penrose and Hawking made this idea mathematically precise, and proved that under reasonable assumptions singularities are inevitable in general relativity.

General relativity does not take quantum mechanics into account, so while Penrose and Hawking’s results are settled theorems, their applicability to our universe is not a settled fact. Many physicists hope that a theory of quantum gravity will save physics from singularities! Indeed this is one of the reasons physicists are fascinated by quantum gravity. But we know very little for sure about quantum gravity. So, it makes a lot of sense to work with general relativity as a mathematically precise theory and see what it says. That is what Hawking and Penrose did in their singularity theorems.

Let’s start with a quick introduction to general relativity, and then get an idea of why this theory predicts singularities are inevitable in certain situations.

General relativity says that spacetime is a 4-dimensional Lorentzian manifold. Thus, it can be covered by patches equipped with coordinates, so that in each patch we can describe points by lists of four numbers. Any curve \gamma(s) going through a point then has a tangent vector v whose components are v^\mu = d \gamma^\mu(s)/ds. Furthermore, given two tangent vectors v,w at the same point we can take their inner product

g(v,w) = g_{\mu \nu} v^\mu w^\nu

where as usual we sum over repeated indices, and g_{\mu \nu} is a 4 \times 4 matrix called the metric, depending smoothly on the point. We require that at any point we can find some coordinate system where this matrix takes the usual Minkowski form:

\displaystyle{  g = \left( \begin{array}{cccc} -1 & 0 &0 & 0 \\ 0 & 1 &0 & 0 \\ 0 & 0 &1 & 0 \\ 0 & 0 &0 & 1 \\ \end{array}\right) }

However, as soon as we move away from our chosen point, the form of the matrix g in these particular coordinates may change.

General relativity says how the metric is affected by matter. It does this in a single equation, Einstein’s equation, which relates the ‘curvature’ of the metric at any point to the flow of energy-momentum through that point. To define the curvature, we need some differential geometry. Indeed, Einstein had to learn this subject from his mathematician friend Marcel Grossman in order to write down his equation. Here I will take some shortcuts and try to explain Einstein’s equation with a bare minimum of differential geometry.

Consider a small round ball of test particles that are initially all at rest relative to each other. This requires a bit of explanation. First, because spacetime is curved, it only looks like Minkowski spacetime—the world of special relativity—in the limit of very small regions. The usual concepts of ‘round’ and ‘at rest relative to each other’ only make sense in this limit. Thus, all our forthcoming statements are precise only in this limit, which of course relies on the fact that spacetime is a continuum.

Second, a test particle is a classical point particle with so little mass that while it is affected by gravity, its effects on the geometry of spacetime are negligible. We assume our test particles are affected only by gravity, no other forces. In general relativity this means that they move along timelike geodesics. Roughly speaking, these are paths that go slower than light and bend as little as possible. We can make this precise without much work.

For a path in space to be a geodesic means that if we slightly vary any small portion of it, it can only become longer. However, a path \gamma(s) in spacetime traced out by a particle moving slower than light must be ‘timelike’, meaning that its tangent vector v = \gamma'(s) satisfies g(v,v) < 0. We define the proper time along such a path from s = s_0 to s = s_1 to be

\displaystyle{  \int_{s_0}^{s_1} \sqrt{-g(\gamma'(s),\gamma'(s))} \, ds }

This is the time ticked out by a clock moving along that path. A timelike path is a geodesic if the proper time can only decrease when we slightly vary any small portion of it. Particle physicists prefer the opposite sign convention for the metric, and then we do not need the minus sign under the square root. But the fact remains the same: timelike geodesics locally maximize the proper time.
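Here is a small numerical illustration in Python, in flat Minkowski spacetime with c = 1: between the same two events, the straight worldline of a particle at rest (a timelike geodesic) accumulates more proper time than a wiggly worldline. The parametrization and the particular wiggle are made up for the example.

```python
import numpy as np

def proper_time(path, n=20000):
    # path: s in [0, 1] -> (t, x, y, z); approximate the integral of
    # sqrt(-g(gamma', gamma')) ds using the metric of signature (-, +, +, +)
    s = np.linspace(0.0, 1.0, n)
    pts = np.array([path(si) for si in s])
    d = np.diff(pts, axis=0)
    dt2 = d[:, 0] ** 2
    dx2 = (d[:, 1:] ** 2).sum(axis=1)
    return np.sum(np.sqrt(np.maximum(dt2 - dx2, 0.0)))

# Two timelike worldlines between the events (0,0,0,0) and (10,0,0,0):
straight = lambda s: (10 * s, 0.0, 0.0, 0.0)                    # a geodesic: stay at rest
wiggly   = lambda s: (10 * s, np.sin(2 * np.pi * s), 0.0, 0.0)  # oscillate, always slower than light

print(proper_time(straight))   # about 10: the geodesic maximizes proper time
print(proper_time(wiggly))     # strictly less than 10
```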

Actual particles are not test particles! First, the concept of test particle does not take quantum theory into account. Second, all known particles are affected by forces other than gravity. Third, any actual particle affects the geometry of the spacetime it inhabits. Test particles are just a mathematical trick for studying the geometry of spacetime. Still, a sufficiently light particle that is affected very little by forces other than gravity can be approximated by a test particle. For example, an artificial satellite moving through the Solar System behaves like a test particle if we ignore the solar wind, the radiation pressure of the Sun, and so on.

If we start with a small round ball consisting of many test particles that are initially all at rest relative to each other, to first order in time it will not change shape or size. However, to second order in time it can expand or shrink, due to the curvature of spacetime. It may also be stretched or squashed, becoming an ellipsoid. This should not be too surprising, because any linear transformation applied to a ball gives an ellipsoid.

Let V(t) be the volume of the ball after a time t has elapsed, where time is measured by a clock attached to the particle at the center of the ball. Then in units where c = 8 \pi G = 1, Einstein’s equation says:

\displaystyle{  \left.{\ddot V\over V} \right|_{t = 0} = -{1\over 2} \left( \begin{array}{l} {\rm flow \; of \;} t{\rm -momentum \; in \; the \;\,} t {\rm \,\; direction \;} + \\ {\rm flow \; of \;} x{\rm -momentum \; in \; the \;\,} x {\rm \; direction \;} + \\ {\rm flow \; of \;} y{\rm -momentum \; in \; the \;\,} y {\rm \; direction \;} + \\ {\rm flow \; of \;} z{\rm -momentum \; in \; the \;\,} z {\rm \; direction} \end{array} \right) }

These flows here are measured at the center of the ball at time zero, and the coordinates used here take advantage of the fact that to first order, at any one point, spacetime looks like Minkowski spacetime.

The flows in Einstein’s equation are the diagonal components of a 4 \times 4 matrix T called the ‘stress-energy tensor’. The components T_{\alpha \beta} of this matrix say how much momentum in the \alpha direction is flowing in the \beta direction through a given point of spacetime. Here \alpha and \beta range from 0 to 3, corresponding to the t,x,y and z coordinates.

For example, T_{00} is the flow of t-momentum in the t-direction. This is just the energy density, usually denoted \rho. The flow of x-momentum in the x-direction is the pressure in the x direction, denoted P_x, and similarly for y and z. You may be more familiar with direction-independent pressures, but it is easy to manufacture a situation where the pressure depends on the direction: just squeeze a book between your hands!

Thus, Einstein’s equation says

\displaystyle{ {\ddot V\over V} \Bigr|_{t = 0} = -{1\over 2} (\rho + P_x + P_y + P_z) }

It follows that positive energy density and positive pressure both curve spacetime in a way that makes a freely falling ball of point particles tend to shrink. Since E = mc^2 and we are working in units where c = 1, ordinary mass density counts as a form of energy density. Thus a massive object will make a swarm of freely falling particles at rest around it start to shrink. In short, gravity attracts.

Already from this, gravity seems dangerously inclined to create singularities. Suppose that instead of test particles we start with a stationary cloud of ‘dust’: a fluid of particles having nonzero energy density but no pressure, moving under the influence of gravity alone. The dust particles will still follow geodesics, but they will affect the geometry of spacetime. Their energy density will make the ball start to shrink. As it does, the energy density \rho will increase, so the ball will tend to shrink ever faster, approaching infinite density in a finite amount of time. This in turn makes the curvature of spacetime become infinite in a finite amount of time. The result is a ‘singularity’.
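Here is a numerical sketch of this verbal argument in Python, in units where c = 8 \pi G = 1, using the simplest homogeneous, irrotational dust case that Raychaudhuri’s equation (discussed below) makes precise: the expansion rate \theta = \dot{V}/V obeys d\theta/d\tau = -\theta^2/3 - \rho/2, and conservation of dust gives d\rho/d\tau = -\rho\theta. Starting at rest with any positive density (the sample densities are made up), \rho blows up at a finite proper time.

```python
def collapse_time(rho0, dt=1e-4, t_max=100.0):
    # Shear-free, irrotational dust, units c = 8*pi*G = 1:
    #   d(theta)/dtau = -theta^2/3 - rho/2     (Raychaudhuri's equation)
    #   d(rho)/dtau   = -rho * theta           (conservation of dust)
    # theta = (dV/dtau)/V is the fractional expansion rate of the ball.
    theta, rho, t = 0.0, rho0, 0.0         # the ball of dust starts at rest
    while t < t_max:
        dtheta = -theta**2 / 3 - rho / 2
        drho = -rho * theta
        theta += dt * dtheta
        rho += dt * drho
        t += dt
        if rho > 1e12:                     # density has blown up, for all practical purposes
            return t
    return None                            # no collapse within t_max

for rho0 in (0.1, 1.0, 10.0):
    print(rho0, collapse_time(rho0))       # denser dust collapses sooner, but always in finite time
```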

In reality, matter is affected by forces other than gravity. Repulsive forces may prevent gravitational collapse. However, this repulsion creates pressure, and Einstein’s equation says that pressure also creates gravitational attraction! In some circumstances this can overwhelm whatever repulsive forces are present. Then the matter collapses, leading to a singularity—at least according to general relativity.

When a star more than 8 times the mass of our Sun runs out of fuel, its core suddenly collapses. The surface is thrown off explosively in an event called a supernova. Most of the energy—the equivalent of thousands of Earth masses—is released in a ten-second burst of neutrinos, formed as a byproduct when protons and electrons combine to form neutrons. If the star’s mass is below 20 times that of our Sun, its core crushes down to a large ball of neutrons with a crust of iron and other elements: a neutron star.

However, this ball is unstable if its mass exceeds the Tolman–Oppenheimer–Volkoff limit, somewhere between 1.5 and 3 times that of our Sun. Above this limit, gravity overwhelms the repulsive forces that hold up the neutron star. And indeed, no neutron stars heavier than 3 solar masses have been observed. Thus, for very heavy stars, the endpoint of collapse is not a neutron star, but something else: a black hole, an object that bends spacetime so much even light cannot escape.

If general relativity is correct, a black hole contains a singularity. Many physicists expect that general relativity breaks down inside a black hole, perhaps because of quantum effects that become important at strong gravitational fields. The singularity is considered a strong hint that this breakdown occurs. If so, the singularity may be a purely theoretical entity, not a real-world phenomenon. Nonetheless, everything we have observed about black holes matches what general relativity predicts.

The Tolman–Oppenheimer–Volkoff limit is not precisely known, because it depends on properties of nuclear matter that are not well understood. However, there are theorems that say singularities must occur in general relativity under certain conditions.

One of the first was proved by Raychaudhuri and Komar in the mid-1950s. It applies only to ‘dust’, and indeed it is a precise version of our verbal argument above. It introduced Raychaudhuri’s equation, which is the geometrical way of thinking about spacetime curvature as affecting the motion of a small ball of test particles. It shows that under suitable conditions, the energy density must approach infinity in a finite amount of time along the path traced out by a dust particle.

The first required condition is that the flow of dust be initially converging, not expanding. The second condition, not mentioned in our verbal argument, is that the dust be ‘irrotational’, not swirling around. The third condition is that the dust particles be affected only by gravity, so that they move along geodesics. Due to the last two conditions, the Raychaudhuri–Komar theorem does not apply to collapsing stars.

The more modern singularity theorems eliminate these conditions. But they do so at a price: they require a more subtle concept of singularity! There are various possible ways to define this concept. They’re all a bit tricky, because a singularity is not a point or region in spacetime.

For our present purposes, we can define a singularity to be an ‘incomplete timelike or null geodesic’. As already explained, a timelike geodesic is the kind of path traced out by a test particle moving slower than light. Similarly, a null geodesic is the kind of path traced out by a test particle moving at the speed of light. We say a geodesic is ‘incomplete’ if it ceases to be well-defined after a finite amount of time. For example, general relativity says a test particle falling into a black hole follows an incomplete geodesic. In a rough-and-ready way, people say the particle ‘hits the singularity’. But the singularity is not a place in spacetime. What we really mean is that the particle’s path becomes undefined after a finite amount of time.

The first modern singularity theorem was proved by Penrose in 1965. It says that if space is infinite in extent, and light becomes trapped inside some bounded region, and no exotic matter is present to save the day, either a singularity or something even more bizarre must occur. This theorem applies to collapsing stars. When a star of sufficient mass collapses, general relativity says that its gravity becomes so strong that light becomes trapped inside some bounded region. We can then use Penrose’s theorem to analyze the possibilities.

Here is Penrose’s story of how he discovered this:

At that time I was at Birkbeck College, and a friend of mine, Ivor Robinson, who’s an Englishman but he was working in Dallas, Texas at the time, and he was talking to me … I forget what it was … he was a very … he had a wonderful way with words and so he was talking to me, and we got to this crossroad and as we crossed the road he stopped talking as we were watching out for traffic. We got to the other side and then he started talking again. And then when he left I had this strange feeling of elation and I couldn’t quite work out why I was feeling like that. So I went through all the things that had happened to me during the day—you know, what I had for breakfast and goodness knows what—and finally it came to this point when I was crossing the street, and I realised that I had a certain idea, and this idea was the crucial characterisation of when a collapse had reached a point of no return, without assuming any symmetry or anything like that. So this is what I called a trapped surface. And this was the key thing, so I went back to my office and I sketched out a proof of the collapse theorem. The paper I wrote was not that long afterwards, which went to Physical Review Letters, and it was published in 1965 I think.

Shortly thereafter Hawking proved a second singularity theorem, which applies to the Big Bang. It says that if space is finite in extent, and no exotic matter is present, generically either a singularity or something even more bizarre must occur. The singularity here could be either a Big Bang in the past, a Big Crunch in the future, both—or possibly something else. Hawking also proved a version of his theorem that applies to certain Lorentzian manifolds where space is infinite in extent, as seems to be the case in our Universe. This version requires extra conditions.

There are some undefined phrases in my summary of the Penrose–Hawking singularity theorems, most notably these:

• ‘exotic matter’

• ‘something even more bizarre’.

In each case I mean something precise.

These singularity theorems precisely specify what is meant by ‘exotic matter’. All known forms of matter obey the ‘dominant energy condition’, which says that

|P_x|, \, |P_y|, \, |P_z| \le \rho

at all points and in all locally Minkowskian coordinates. Exotic matter is anything that violates this condition.

The Penrose–Hawking singularity theorems also say what counts as ‘something even more bizarre’. An example would be a closed timelike curve. A particle following such a path would move slower than light yet eventually reach the same point where it started—and not just the same point in space, but the same point in spacetime! If you could do this, perhaps you could wait, see if it would rain tomorrow, and then go back and decide whether to buy an umbrella today. There are certainly solutions of Einstein’s equation with closed timelike curves. The first interesting one was found by Einstein’s friend Gödel in 1949, as part of an attempt to probe the nature of time. However, closed timelike curves are generally considered less plausible than singularities.

In the Penrose–Hawking singularity theorems, ‘something even more bizarre’ means precisely this: spacetime is not ‘globally hyperbolic’. To understand this, we need to think about when we can predict the future or past given initial data. When studying field equations like Maxwell’s theory of electromagnetism or Einstein’s theory of gravity, physicists like to specify initial data on space at a given moment of time. However, in general relativity there is considerable freedom in how we choose a slice of spacetime and call it ‘space’. What should we require? For starters, we want a 3-dimensional submanifold S of spacetime that is ‘spacelike’: every nonzero vector v tangent to S should have g(v,v) > 0. However, we also want any timelike or null curve to hit S exactly once. A spacelike surface with this property is called a Cauchy surface, and a Lorentzian manifold containing a Cauchy surface is said to be globally hyperbolic. There are many theorems justifying the importance of this concept. Global hyperbolicity excludes closed timelike curves, but also other bizarre behavior.

By now the original singularity theorems have been greatly generalized and clarified. Hawking and Penrose gave a unified treatment of both theorems in 1970, which you can read here:

• Stephen William Hawking and Roger Penrose, The singularities of gravitational collapse and cosmology, Proc. Royal Soc. London A 314 (1970), 529–548.

The 1973 textbook by Hawking and Ellis gives a systematic introduction to this subject. A paper by Garfinkle and Senovilla reviews the subject and its history up to 2015. Also try the first two chapters of this wonderful book:

• Stephen Hawking and Roger Penrose, The Nature of Space and Time, Princeton U. Press, 1996.

You can find the first chapter, by Hawking, here: it describes the singularity theorems. The second, by Penrose, discusses the nature of singularities in general relativity.

I’m sure Penrose’s Nobel Lecture will also be worth watching. Three cheers to Roger Penrose!


Network Models

7 October, 2020

Good news: my student Joe Moeller will be taking a job at NIST, the National Institute of Standards and Technology! He’ll be working with Spencer Breiner and Eswaran Subrahmanian on categories in engineering and system design.

Joe Moeller will be talking about his work on ‘network models’ at the online category theory seminar at UNAM on Wednesday October 14th at 18:00 UTC (11 am Pacific Time):

Network Models

Abstract. Networks can be combined in various ways, such as overlaying one on top of another or setting two side by side. We introduce ‘network models’ to encode these ways of combining networks. Different network models describe different kinds of networks. We show that each network model gives rise to an operad, whose operations are ways of assembling a network of the given kind from smaller parts. Such operads, and their algebras, can serve as tools for designing networks. Technically, a network model is a lax symmetric monoidal functor from the free symmetric monoidal category on some set to Cat, and the construction of the corresponding operad proceeds via a symmetric monoidal version of the Grothendieck construction.

You can watch the talk here:

You can read more about network models here:

Complex adaptive system design (part 6).

and here’s the original paper:

• John Baez, John Foley, Blake Pollard and Joseph Moeller, Network models, Theory and Applications of Categories 35 (2020), 700–744.