Nonequilibrium Thermodynamics in Biology (Part 2)

16 June, 2021

Larry Li, Bill Cannon and I ran a session on non-equilibrium thermodynamics in biology at SMB2021, the annual meeting of the Society for Mathematical Biology. You can see talk slides here!

Here’s the basic idea:

Since Lotka, physical scientists have argued that living things belong to a class of complex and orderly systems that exist not despite the second law of thermodynamics, but because of it. Life and evolution, through natural selection of dissipative structures, are based on non-equilibrium thermodynamics. The challenge is to develop an understanding of what the respective physical laws can tell us about flows of energy and matter in living systems, and about growth, death and selection. This session addresses current challenges including understanding emergence, regulation and control across scales, and entropy production, from metabolism in microbes to evolving ecosystems.

Click on the links to see slides for most of the talks:

Persistence, permanence, and global stability in reaction network models: some results inspired by thermodynamic principles
Gheorghe Craciun, University of Wisconsin–Madison

The standard mathematical model for the dynamics of concentrations in biochemical networks is called mass-action kinetics. We describe mass-action kinetics and discuss the connection between special classes of mass-action systems (such as detailed balanced and complex balanced systems) and the Boltzmann equation. We also discuss the connection between the ‘global attractor conjecture’ for complex balanced mass-action systems and Boltzmann’s H-theorem. We also describe some implications for biochemical mechanisms that implement noise filtering and cellular homeostasis.

The principle of maximum caliber of nonequilibria
Ken Dill, Stony Brook University

Maximum Caliber is a principle for inferring pathways and rate distributions of kinetic processes. The structure and foundations of MaxCal are much like those of Maximum Entropy for static distributions. We have explored how MaxCal may serve as a general variational principle for nonequilibrium statistical physics: it gives well-known results, such as the Green–Kubo relations, Onsager’s reciprocal relations and Prigogine’s Minimum Entropy Production principle near equilibrium, but is also applicable far from equilibrium. I will also discuss some applications, such as finding reaction coordinates in molecular simulations, non-linear dynamics in gene circuits, power-law-tail distributions in ‘social-physics’ networks, and others.

Nonequilibrium biomolecular information processes
Pierre Gaspard, Université libre de Bruxelles

Nearly 70 years have passed since the discovery of DNA structure and its role in coding genetic information. Yet, the kinetics and thermodynamics of genetic information processing in DNA replication, transcription, and translation remain poorly understood. These template-directed copolymerization processes are running away from equilibrium, being powered by extracellular energy sources. Recent advances show that their kinetic equations can be exactly solved in terms of so-called iterated function systems. Remarkably, iterated function systems can determine the effects of genome sequence on replication errors, up to a million times faster than kinetic Monte Carlo algorithms. With these new methods, fundamental links can be established between molecular information processing and the second law of thermodynamics, shedding new light on genetic drift, mutations, and evolution.

Nonequilibrium dynamics of disturbed ecosystems
John Harte, University of California, Berkeley

The Maximum Entropy Theory of Ecology (METE) predicts the shapes of macroecological metrics in relatively static ecosystems, across spatial scales, taxonomic categories, and habitats, using constraints imposed by static state variables. In disturbed ecosystems, however, with time-varying state variables, its predictions often fail. We extend macroecological theory from static to dynamic by combining the MaxEnt inference procedure with explicit mechanisms governing disturbance. In the static limit, the resulting theory, DynaMETE, reduces to METE but also predicts a new scaling relationship among static state variables. Under disturbances, expressed as shifts in demographic, ontogenic growth, or migration rates, DynaMETE predicts the time trajectories of the state variables as well as the time-varying shapes of macroecological metrics such as the species abundance distribution and the distribution of metabolic rates over individuals. An iterative procedure for solving the dynamic theory is presented. Characteristic signatures of the deviation from static predictions of macroecological patterns are shown to result from different kinds of disturbance. By combining MaxEnt inference with explicit dynamical mechanisms of disturbance, DynaMETE is a candidate theory of macroecology for ecosystems responding to anthropogenic or natural disturbances.

Stochastic chemical reaction networks
Supriya Krishnamurthy, Stockholm University

The study of chemical reaction networks (CRNs) is a very active field. Earlier well-known results (Feinberg, Chem. Eng. Sci. 42, 2229 (1987); Anderson et al., Bull. Math. Biol. 72, 1947 (2010)) identify a topological quantity called deficiency, easy to compute for CRNs of any size, which, when equal to zero, leads to a unique factorized (non-equilibrium) steady state for these networks. However, no general results exist for the steady states of non-zero-deficiency networks. In recent work, we show how to write the full moment hierarchy for any non-zero-deficiency CRN obeying mass-action kinetics, in terms of equations for the factorial moments. Using these, we can recursively predict values for lower moments from higher moments, reversing the procedure usually used to solve moment hierarchies. We show, for non-trivial examples, that in this manner we can predict any moment of interest, for CRNs with non-zero deficiency and non-factorizable steady states. It is however an open question how scalable these techniques are for large networks.

Heat flows adjust local ion concentrations in favor of prebiotic chemistry
Christof Mast, Ludwig-Maximilians-Universität München

Prebiotic reactions often require certain initial concentrations of ions. For example, the activity of RNA enzymes requires a high concentration of divalent magnesium salt, whereas too much monovalent sodium salt leads to a reduction in enzyme function. However, it is known from leaching experiments that prebiotically relevant geomaterials such as basalt release mainly a lot of sodium and only little magnesium. A natural solution to this problem is provided by heat fluxes through thin rock fractures, in which magnesium is actively enriched and sodium is depleted by thermogravitational convection and thermophoresis. This process establishes suitable conditions for ribozyme function from a basaltic leach. It can take place in a spatially distributed system of rock cracks and is therefore particularly stable to natural fluctuations and disturbances.

Deficiency of chemical reaction networks and thermodynamics
Matteo Polettini, University of Luxembourg

Deficiency is a topological property of a Chemical Reaction Network linked to important dynamical features, in particular of deterministic fixed points and of stochastic stationary states. Here we link it to thermodynamics: in particular we discuss the validity of a strong vs. weak zeroth law, the existence of time-reversed mass-action kinetics, and the possibility to formulate marginal fluctuation relations. Finally we illustrate some subtleties of the Python module we created for MCMC stochastic simulation of CRNs, soon to be made public.

Large deviations theory and emergent landscapes in biological dynamics
Hong Qian, University of Washington

The mathematical theory of large deviations provides a nonequilibrium thermodynamic description of complex biological systems that consist of heterogeneous individuals. In terms of the notions of stochastic elementary reactions and pure kinetic species, the continuous-time, integer-valued Markov process dictates a thermodynamic structure that generalizes (i) Gibbs’ microscopic chemical thermodynamics of equilibrium matter to nonequilibrium small systems such as living cells and tissues; and (ii) Gibbs’ potential function to the landscapes for biological dynamics, such as that of C. H. Waddington and S. Wright.

Using the maximum entropy production principle to understand and predict microbial biogeochemistry
Joseph Vallino, Marine Biological Laboratory, Woods Hole

Natural microbial communities contain billions of individuals per liter and can exceed a trillion cells per liter in sediments, as well as harbor thousands of species in the same volume. The high species diversity contributes to extensive metabolic functional capabilities to extract chemical energy from the environment, such as methanogenesis, sulfate reduction, anaerobic photosynthesis, chemoautotrophy, and many others, most of which are only expressed by bacteria and archaea. Reductionist modeling of natural communities is problematic, as we lack knowledge of growth kinetics for most organisms and have even less understanding of the mechanisms governing predation, viral lysis, and predator avoidance in these systems. As a result, existing models that describe microbial communities contain dozens to hundreds of parameters, and state variables are extensively aggregated. Overall, the models are little more than non-linear parameter fitting exercises that have limited, to no, extrapolation potential, as there are few principles governing organization and function of complex self-assembling systems.

Over the last decade, we have been developing a systems approach that models microbial communities as a distributed metabolic network that focuses on metabolic function rather than describing individuals or species. We use an optimization approach to determine which metabolic functions in the network should be upregulated versus those that should be downregulated based on the non-equilibrium thermodynamics principle of maximum entropy production (MEP). Derived from statistical mechanics, MEP proposes that steady-state systems will likely organize to maximize the free energy dissipation rate. We have extended this conjecture to apply to non-steady-state systems and have proposed that living systems maximize entropy production integrated over time and space, while non-living systems maximize instantaneous entropy production.
Our presentation will provide a brief overview of the theory and approach, as well as present several examples of applying MEP to describe the biogeochemistry of microbial systems in laboratory experiments and natural ecosystems.

Reduction and the quasi-steady state approximation
Carsten Wiuf, University of Copenhagen

Chemical reactions often occur at different time-scales. In applications of chemical reaction network theory it is often desirable to reduce a reaction network to a smaller reaction network by elimination of fast species or fast reactions. There exist various techniques for doing so, e.g. the Quasi-Steady-State Approximation or the Rapid Equilibrium Approximation. However, these methods are not always mathematically justifiable. Here, a method is presented by which so-called non-interacting species are eliminated by means of the QSSA. It is argued that this method is mathematically sound. Various examples are given (Michaelis–Menten mechanism, two-substrate mechanism, …) and older related techniques from the 50s and 60s are briefly discussed.

Electrostatics and the Gauss–Lucas Theorem

24 May, 2021

Say you know the roots of a polynomial P and you want to know the roots of its derivative. You can do it using physics! Namely, electrostatics in 2d space, viewed as the complex plane.

To keep things simple, let us assume P does not have repeated roots. Then the procedure works as follows.

Put equal point charges at each root of P, then see where the resulting electric field vanishes. Those are the roots of P’.

I’ll explain why this is true a bit later. But first, we use this trick to see something cool.

There’s no way the electric field can vanish outside the convex hull of your set of point charges. After all, if all the charges are positive, the electric field must point out of that region. So, the roots of P’ must lie in the convex hull of the roots of P!

This cool fact is called the Gauss–Lucas theorem. It always seemed mysterious to me. Now, thanks to this ‘physics proof’, it seems completely obvious!

Of course, it relies on my first claim: that if we put equal point
charges at the roots of P, the electric field they generate will vanish at the roots of P’. Why is this true?

By multiplying by a constant if necessary, we can assume

\displaystyle{   P(z) = \prod_{i = 1}^n  (z - a_i) }


Then

\displaystyle{  \ln |P(z)| = \sum_{i = 1}^n \ln|z - a_i| }

This function is the electric potential created by equal point charges at the points a_i in the complex plane. The corresponding electric field is minus the gradient of the potential, so it vanishes at the critical points of this function. Equivalently, it vanishes at the critical points of the exponential of this function, namely |P|. Apart from one possible exception, these points are the same as the critical points of P, namely the roots of P’. So, we’re almost done!

The exception occurs when P has a critical point where P vanishes. |P| is not smooth where P vanishes, so in this case we cannot say the critical point of P is a critical point of |P|.

However, when P has a critical point where P vanishes, then this point is a repeated root of P, and I already said I’m assuming P has no repeated roots. So, we’re done—given this assumption.
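This argument is easy to check numerically. Here is a small sketch with numpy (the degree, random seed, and tolerance are arbitrary choices of mine): build P from random roots, find the roots of P’, and verify that the “field” Σ 1/(w − a_i), which is proportional to the conjugate of the 2d electric field at w, vanishes at each critical point.

```python
import numpy as np

rng = np.random.default_rng(1)
roots = rng.normal(size=5) + 1j * rng.normal(size=5)  # the roots a_i of P

P = np.poly(roots)               # coefficients of P(z) = prod_i (z - a_i)
crit = np.roots(np.polyder(P))   # roots of P'

# grad ln|z - a| points radially away from a with magnitude 1/|z - a|,
# so the total field at w vanishes exactly when sum_i 1/(w - a_i) = 0,
# which is the logarithmic derivative P'(w)/P(w).
for w in crit:
    field = np.sum(1.0 / (w - roots))
    assert abs(field) < 1e-8
```

This works because the roots here are generically simple, matching the assumption in the proof above.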

Everything gets a bit more complicated when our polynomial has repeated roots. Greg Egan explored this, and also the case where its derivative has repeated roots.

However, the Gauss–Lucas theorem still applies to polynomials with repeated roots, and this proof explains why:

• Wikipedia, Gauss–Lucas theorem.

Alternatively, it should be possible to handle the case of a polynomial with repeated roots by thinking of it as a limit of polynomials without repeated roots.

By the way, in my physics proof of the Gauss–Lucas theorem I said the electric field generated by a bunch of positive point charges cannot vanish outside the convex hull of these point charges because the field ‘points out’ of this region. Let me clarify that.

It’s true even if the positive point charges aren’t all equal; they just need to have the same sign. The rough idea is that each charge creates an electric field that points radially outward, so these electric fields can’t cancel at a point that’s not ‘between’ several charges—in other words, at a point that’s not in the convex hull of the charges.

But let’s turn this idea into a rigorous argument.

Suppose z is some point outside the convex hull of the points a_i. Then, by the hyperplane separation theorem, we can draw a line with z on one side and all the points a_i on the other side. Let v be a vector normal to this line and pointing toward the z side. Then

v \cdot (z - a_i) > 0

for all i. Since the electric field created by the ith point charge is a positive multiple of z - a_i at the point z, the total electric field at z has a positive dot product with v. So, it can’t be zero!
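The Gauss–Lucas conclusion itself can also be checked directly. A sketch using scipy’s ConvexHull (the seed and degree are arbitrary choices): a point lies in the convex hull of a finite set exactly when adding it leaves the hull, and hence the hull’s area, unchanged.

```python
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(0)
roots = rng.normal(size=6) + 1j * rng.normal(size=6)  # roots of P
crit = np.roots(np.polyder(np.poly(roots)))           # roots of P'

pts = np.column_stack([roots.real, roots.imag])
hull_area = ConvexHull(pts).volume  # in 2d, .volume is the area

# If a root of P' were outside the hull, adding it would strictly
# enlarge the hull; the area staying fixed certifies membership.
for w in crit:
    bigger = np.vstack([pts, [w.real, w.imag]])
    assert np.isclose(ConvexHull(bigger).volume, hull_area)
```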


The picture of a convex hull is due to Robert Laurini.

Parallel Line Masses and Marden’s Theorem

22 May, 2021

Here’s an idea I got from Albert Chern on Twitter. He did all the hard work, and I think he also drew the picture I’m going to use. I’ll just express the idea in a different way.

Here’s a strange fact about Newtonian gravity.

Consider three parallel ‘line masses’ that have a constant mass per length—the same constant for each one. Choose a plane orthogonal to these lines. There will typically be two points on this plane, say a and b, where a mass can sit in equilibrium, with the gravitational pull from all three line masses cancelling out. This will be an unstable equilibrium.

Put a mass at point a. Remove the three line masses—but keep in mind the triangle they formed where they pierced your plane!

You can now orbit a test particle in an elliptical orbit around the mass at a in such a way that:

• one focus of this ellipse is a,
• the other focus is b, and
• the ellipse fits inside the triangle, just touching the midpoint of each side of the triangle.

Even better, this ellipse has the largest possible area of any ellipse contained in the triangle!

Here is Chern’s picture:


The triangle’s corners are the three points where the line masses pierce your chosen plane. These line masses create a gravitational potential, and the contour lines are level curves of this potential.

You can see that the points a and b are at saddle points of the potential. Thus, a mass placed at either a or b will be in an unstable equilibrium.

You can see the ellipse with a and b as its foci, snugly fitting into the triangle.

You can sort of see that the ellipse touches the midpoints of the triangle’s edges.

What you can’t see is that this ellipse has the largest possible area for any ellipse fitting into the triangle!

Now let me explain the math. While the gravitational potential of a point mass in 3d space is proportional to 1/r, the gravitational potential of a line mass in 3d space is proportional to \log r, which is also the gravitational potential of a point mass in 2d space.

So, if we have three equal line masses, which are parallel and pierce an orthogonal plane at points p_1, p_2 and p_3, then their gravitational potential, as a function on this plane, will be proportional to

\phi(z) = \log|z - p_1| + \log|z - p_2| + \log|z - p_3|

Here I’m using z as our name for an arbitrary point on this plane, because the next trick is to think of this plane as the complex plane!

Where are the critical points (in fact saddle points) of this potential? They are just points where the gradient of \phi vanishes. To find these points, we can take the exponential of \phi and see where the gradient of that vanishes. This is a nice idea because

e^{\phi(z)} = |(z-p_1)(z-p_2)(z-p_3)|

The gradient of this function will vanish whenever

P'(z) = 0


where

P(z) = (z-p_1)(z-p_2)(z-p_3)

Since P is a cubic polynomial, P' is a quadratic, hence proportional to

(z - a)(z - b)

for some a and b. Now we use

Marden’s theorem. Suppose the zeros p_1, p_2, p_3 of a cubic polynomial P are non-collinear. Then there is a unique ellipse inscribed in the triangle with vertices p_1, p_2, p_3 and tangent to the sides at their midpoints. The foci of this ellipse are the zeros of the derivative of P.
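Marden’s theorem is easy to test numerically. A sketch with numpy (the triangle vertices are arbitrary values I made up): compute the foci as the roots of P’, then check that the three side midpoints all give the same focal-distance sum—that is, they lie on a common ellipse with those foci.

```python
import numpy as np

# Triangle vertices = roots of the cubic, as complex numbers
p = np.array([0.0 + 0.0j, 4.0 + 0.0j, 1.0 + 3.0j])

P = np.poly(p)                   # P(z) = (z - p1)(z - p2)(z - p3)
a, b = np.roots(np.polyder(P))   # the foci, by Marden's theorem

# Midpoints of the three sides
mids = [(p[i] + p[(i + 1) % 3]) / 2 for i in range(3)]

# A point q lies on the ellipse with foci a, b iff |q-a| + |q-b|
# equals the same constant; all three midpoints should agree.
sums = [abs(m - a) + abs(m - b) for m in mids]
assert np.allclose(sums, sums[0])
```

(Checking tangency, rather than just incidence, takes a bit more work, but incidence at all three midpoints already pins down the Steiner inellipse.)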

For a short proof of this theorem go here:

• Carlson’s proof of Marden’s theorem.

This ellipse is called the Steiner inellipse of the triangle:

• Wikipedia, Steiner inellipse.

The proof that it has the largest area of any ellipse inscribed in the triangle goes like this. Using a linear transformation of the plane you can map any triangle to an equilateral triangle. It’s obvious that there’s a circle inscribed in any equilateral triangle, touching each side at its midpoint. It’s at least very plausible that this circle is the ellipse of largest area contained in the triangle. If we can prove this we’re done.

Why? Because linear transformations map circles to ellipses, and map midpoints of line segments to midpoints of line segments, and simply rescale areas by a constant factor. So applying the inverse linear transformation to the circle inscribed in the equilateral triangle, we get an ellipse inscribed in our original triangle, which will touch this triangle’s midpoints, and have the maximum possible area of any ellipse contained in this triangle!

The Koide Formula

4 April, 2021

There are three charged leptons: the electron, the muon and the tau. Let m_e, m_\mu and m_\tau be their masses. Then the Koide formula says

\displaystyle{ \frac{m_e + m_\mu + m_\tau}{\big(\sqrt{m_e} + \sqrt{m_\mu} + \sqrt{m_\tau}\big)^2} = \frac{2}{3} }

There’s no known reason for this formula to be true! But if you plug in the experimentally measured values of the electron, muon and tau masses, it’s accurate within the current experimental error bars:

\displaystyle{ \frac{m_e + m_\mu + m_\tau}{\big(\sqrt{m_e} + \sqrt{m_\mu} + \sqrt{m_\tau}\big)^2} = 0.666661 \pm 0.000007 }

Is this significant or just a coincidence? Will it fall apart when we measure the masses more accurately? Nobody knows.

Here’s something fun, though:

Puzzle. Show that no matter what the electron, muon and tau masses might be—that is, any positive numbers whatsoever—we must have

\displaystyle{ \frac{1}{3} \le \frac{m_e + m_\mu + m_\tau}{\big(\sqrt{m_e} + \sqrt{m_\mu} + \sqrt{m_\tau}\big)^2} \le 1}

For some reason this ratio turns out to be almost exactly halfway between the lower bound and upper bound!
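A quick numerical sketch (the mass values are approximate, commonly quoted figures in MeV/c², not authoritative): the measured ratio lands very close to 2/3, and the puzzle’s bounds hold for randomly chosen positive masses.

```python
import numpy as np

def koide(masses):
    """Koide ratio: sum(m) / (sum(sqrt(m)))^2."""
    m = np.asarray(masses, dtype=float)
    return m.sum() / np.sqrt(m).sum() ** 2

# Approximate charged-lepton masses in MeV/c^2
print(koide([0.51099895, 105.6583755, 1776.86]))  # close to 2/3

# The puzzle's bounds 1/3 <= ratio <= 1 hold for any positive masses
rng = np.random.default_rng(42)
for _ in range(1000):
    r = koide(rng.uniform(0.001, 1000.0, size=3))
    assert 1/3 <= r <= 1
```

The lower bound follows from the Cauchy–Schwarz inequality, (√m_e + √m_μ + √m_τ)² ≤ 3(m_e + m_μ + m_τ); the upper bound holds because squaring the sum of square roots only adds nonnegative cross terms.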

Koide came up with his formula in 1982 before the tau’s mass was measured very accurately.  At the time, using the observed electron and muon masses, his formula predicted the tau’s mass was

m_\tau = 1776.97 MeV/c²

while the observed mass was

m_\tau = 1784.2 ± 3.2 MeV/c²

Not very good.

In 1992 the tau’s mass was measured much more accurately and found to be

m_\tau = 1776.99 ± 0.28 MeV/c²

Much better!

Koide has some more recent thoughts about his formula:

• Yoshio Koide, What physics does the charged lepton mass relation tell us?, 2018.

He points out how difficult it is to explain a formula like this, given how masses depend on an energy scale in quantum field theory.

Vincenzo Galilei

3 April, 2021

I’ve been reading about early music. I ran into Vincenzo Galilei, an Italian lute player, composer, and music theorist who lived during the late Renaissance and helped start the Baroque era. Of course anyone interested in physics will know Galileo Galilei. And it turns out Vincenzo was Galileo’s dad!

The really interesting part is that Vincenzo did a lot of experiments—and he got Galileo interested in the experimental method!

Vincenzo started out as a lutenist, but in 1563 he met Gioseffo Zarlino, the most important music theorist of the sixteenth century, and began studying with him. Vincenzo became interested in tuning and keys, and in 1584 he anticipated Bach’s Well-Tempered Clavier by composing 24 groups of dances, one for each of the 12 major and 12 minor keys.

He also studied acoustics, especially vibrating strings and columns of air. He discovered that while the frequency of sound produced by a vibrating string varies inversely with the length of the string, it’s also proportional to the square root of the tension applied. For example, weights suspended from strings of equal length need to be in a ratio of 9:4 to produce a perfect fifth, which is the frequency ratio 3:2.
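Vincenzo’s relation, f ∝ √T / L, is easy to check in a few lines (the constant k here is a placeholder standing in for the string’s mass density and units, not a measured value):

```python
import math

def frequency(length, tension, k=1.0):
    # Frequency of an ideal string: proportional to sqrt(tension) / length.
    # k bundles the string's mass density and units; its value is arbitrary.
    return k * math.sqrt(tension) / length

f1 = frequency(1.0, 4.0)
f2 = frequency(1.0, 9.0)
assert math.isclose(f2 / f1, 1.5)  # 9:4 tension ratio gives a 3:2 fifth

# Halving nothing but doubling the length halves the frequency (an octave down)
assert math.isclose(frequency(2.0, 4.0), f1 / 2)
```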

Galileo later told a biographer that Vincenzo introduced him to the idea of systematic testing and measurement. The basement of their house was strung with lengths of lute string materials, each of different lengths, with different weights attached. Some say this drew Galileo’s attention away from pure mathematics to physics!

You can see books by Vincenzo Galilei here:

• Internet Archive, Vincenzo Galilei, c. 1520 – 2 July 1591.

Unfortunately for me they’re in Italian, but the title of his Dialogo della Musica Antica et Della Moderna reminds me of his son’s Dialogo sopra i Due Massimi Sistemi del Mondo (Dialog Concerning the Two Chief World Systems).

Speaking of dialogs, here’s a nice lute duet by Vincenzo Galilei, played by Evangelina Mascardi and Frédéric Zigante:

It’s from his book Fronimo Dialogo, an instruction manual for the lute which includes many compositions, including the 24 dances illustrating the 24 keys. “Fronimo” was an imaginary expert in the lute—in ancient Greek, phronimo means sage—and the book apparently consists of dialogs between Fronimo and a student Eumazio (meaning “he who learns well”).

So, I now suspect that Galileo got his fondness for dialogs from his dad, too! Or maybe everyone was writing them back then?

Can We Understand the Standard Model Using Octonions?

31 March, 2021

I gave two talks in Latham Boyle and Kirill Krasnov’s Perimeter Institute workshop Octonions and the Standard Model.

The first talk was on Monday April 5th at noon Eastern Time. The second was exactly one week later, on Monday April 12th at noon Eastern Time.

Here they are:

Can we understand the Standard Model? (video, slides)

Abstract. 40 years of trying to go beyond the Standard Model hasn’t yet led to any clear success. As an alternative, we could try to understand why the Standard Model is the way it is. In this talk we review some lessons from grand unified theories and also from recent work using the octonions. The gauge group of the Standard Model and its representation on one generation of fermions arises naturally from a process that involves splitting 10d Euclidean space into 4+6 dimensions, but also from a process that involves splitting 10d Minkowski spacetime into 4d Minkowski space and 6 spacelike dimensions. We explain both these approaches, and how to reconcile them.

Can we understand the Standard Model using octonions? (video, slides)

Abstract. Dubois-Violette and Todorov have shown that the Standard Model gauge group can be constructed using the exceptional Jordan algebra, consisting of 3×3 self-adjoint matrices of octonions. After an introduction to the physics of Jordan algebras, we ponder the meaning of their construction. For example, it implies that the Standard Model gauge group consists of the symmetries of an octonionic qutrit that restrict to symmetries of an octonionic qubit and preserve all the structure arising from a choice of unit imaginary octonion. It also sheds light on why the Standard Model gauge group acts on 10d Euclidean space, or Minkowski spacetime, while preserving a 4+6 splitting.

You can see all the slides and videos and also some articles with more details here.

The Joy of Condensed Matter

24 March, 2021

I published a slightly different version of this article in Nautilus on February 24, 2021.

Everyone seems to be talking about the problems with physics: Woit’s book Not Even Wrong, Smolin’s The Trouble With Physics and Hossenfelder’s Lost in Math leap to mind, and they have started a wider conversation. But is all of physics really in trouble, or just some of it?

If you actually read these books, you’ll see they’re about so-called “fundamental physics”. Some other parts of physics are doing just fine, and I want to tell you about one. It’s called “condensed matter physics”, and it’s the study of solids and liquids. We are living in the golden age of condensed matter physics.

But first, what is “fundamental” physics? It’s a tricky term. You might think any truly revolutionary development in physics counts as fundamental. But in fact physicists use this term in a more precise, narrowly delimited way. One of the goals of physics is to figure out some laws that, at least in principle, we could use to predict everything that can be predicted about the physical universe. The search for these laws is fundamental physics.

The fine print is crucial. First: “at least in principle”. In principle we can use the fundamental physics we know to calculate the boiling point of water to immense accuracy—but nobody has done it yet, because the calculation is hard. Second: “everything that can be predicted”. As far as we can tell, quantum mechanics says there’s inherent randomness in things, which makes some predictions impossible, not just impractical, to carry out with certainty. And this inherent quantum randomness sometimes gets amplified over time, by a phenomenon called chaos. For this reason, even if we knew everything about the universe now, we couldn’t predict the weather precisely a year from now.

So even if fundamental physics succeeded perfectly, it would be far from giving the answer to all our questions about the physical world. But it’s important nonetheless, because it gives us the basic framework in which we can try to answer these questions.

As of now, research in fundamental physics has given us the Standard Model (which seeks to describe matter and all the forces except gravity) and General Relativity (which describes gravity). These theories are tremendously successful, but we know they are not the last word. Big questions remain unanswered—like the nature of dark matter, or whatever is fooling us into thinking there’s dark matter. Unfortunately, progress on these questions has been very slow since the 1990s.

Luckily fundamental physics is not all of physics, and today it is no longer the most exciting part of physics. There is still plenty of mind-blowing new physics being done. And a lot of it—though by no means all—is condensed matter physics.

Traditionally, the job of condensed matter physics was to predict the properties of solids and liquids found in nature. Sometimes this can be very hard: for example, computing the boiling point of water. But now we know enough fundamental physics to design strange new materials—and then actually make these materials, and probe their properties with experiments, testing our theories of how they should work. Even better, these experiments can often be done on a table top. There’s no need for enormous particle accelerators here.

Let’s look at an example. We’ll start with the humble “hole”. A crystal is a regular array of atoms, each with some electrons orbiting it. When one of these electrons gets knocked off somehow, we get a “hole”: an atom with a missing electron. And this hole can actually move around like a particle! When an electron from some neighboring atom moves to fill the hole, the hole moves to the neighboring atom. Imagine a line of people all wearing hats except for one whose head is bare: if their neighbor lends them their hat, the bare head moves to the neighbor. If this keeps happening, the bare head will move down the line of people. The absence of a thing can act like a thing!

The famous physicist Paul Dirac came up with the idea of holes in 1930. He correctly predicted that since electrons have negative electric charge, holes should have positive charge. Dirac was working on fundamental physics: he hoped the proton could be explained as a hole. That turned out not to be true. Later physicists found another particle that could: the “positron”. It’s just like an electron with the opposite charge. And thus antimatter—particles like ordinary matter particles, with the same mass but with the opposite charge—was born. But that’s another story.

In 1931, Heisenberg applied the idea of holes to condensed matter physics. He realized that just as electrons create an electrical current as they move along, so do holes—but because they’re positively charged, their electrical current goes in the other direction! It became clear that holes carry electrical current in some of the materials called “semiconductors”: for example, silicon with a bit of aluminum added to it. After many further developments, in 1948 the physicist William Shockley patented transistors that use both holes and electrons to form a kind of switch. He later won the Nobel prize for this, and now they’re widely used in computer chips.

Holes in semiconductors are not really particles in the sense of fundamental physics. They are really just a convenient way of thinking about the motion of electrons. But any sufficiently convenient abstraction takes on a life of its own. The equations that describe the behavior of holes are just like the equations that describe the behavior of particles. So, we can treat holes as if they were particles. We’ve already seen that a hole is positively charged. But because it takes energy to get a hole moving, a hole also acts like it has a mass. And so on: the properties we normally attribute to particles also make sense for holes.

Physicists have a name for things that act like particles even though they’re really not: “quasiparticles”. There are many kinds: holes are just one of the simplest. The beauty of quasiparticles is that we can practically make them to order, having a vast variety of properties. As Michael Nielsen put it, we now live in the era of “designer matter”.

For example, consider the “exciton”. Since an electron is negatively charged and a hole is positively charged, they attract each other. And if the hole is much heavier than the electron—remember, a hole has a mass—an electron can orbit a hole much as an electron orbits a proton in a hydrogen atom. Thus, they form a kind of artificial atom called an exciton. It’s a ghostly dance of presence and absence!

This is how an exciton moves through a crystal.

The idea of excitons goes back all the way to 1931. By now we can make excitons in large quantities in certain semiconductors. They don’t last for long: the electron quickly falls back into the hole. It can take between 1 and 10 trillionths of a second for this to happen. But that’s enough time to do some interesting things.

For example: if you can make an artificial atom, can you make an artificial molecule? Sure! Just as two atoms of hydrogen can stick together and form a molecule, two excitons can stick together and form a “biexciton”. An exciton can stick to another hole and form a “trion”. An exciton can even stick to a photon—a particle of light—and form something called a “polariton”. It’s a blend of matter and light!

Can you make a gas of artificial atoms? Yes! At low densities and high temperatures, excitons zip around very much like atoms in a gas. Can you make a liquid? Again, yes: at higher densities, and colder temperatures, excitons bump into each other enough to act like a liquid. At even colder temperatures, excitons can even form a “superfluid”, with almost zero viscosity: if you could somehow get it swirling around, it would go on practically forever.

This is just a small taste of what researchers in condensed matter physics are doing these days. Besides excitons, they are studying a host of other quasiparticles. A “phonon” is a quasiparticle of sound formed from vibrations moving through a crystal. A “magnon” is a quasiparticle of magnetization: a wave of flipped electron spins moving through a crystal. The list goes on, and becomes ever more esoteric.

But there is also much more to the field than quasiparticles. Physicists can now create materials in which the speed of light is much slower than usual, say 40 miles an hour. They can create materials called “hyperbolic metamaterials” in which light moves as if there were two space dimensions and two time dimensions, instead of the usual three dimensions of space and one of time! Normally we think that time can go forward in just one direction, but in these substances light acts as if there’s a whole circle of directions that count as “forward in time”. The possibilities are limited only by our imagination and the fundamental laws of physics.

At this point, usually some skeptic comes along and questions whether these things are useful. Indeed, some of these new materials are likely to be useful. In fact a lot of condensed matter physics, while less glamorous than what I have just described, is carried out precisely to develop new improved computer chips—and also technologies like “photonics,” which uses light instead of electrons. The fruits of photonics are ubiquitous—it permeates modern technology, like flat-screen TVs—but physicists are now aiming for more radical applications, like computers that process information using light.

Then typically some other kind of skeptic comes along and asks if condensed matter physics is “just engineering”. Of course the very premise of this question is insulting: there is nothing wrong with engineering! Trying to build useful things is not only important in itself, it’s a great way to raise deep new questions about physics. For example the whole field of thermodynamics, and the idea of entropy, arose in part from trying to build better steam engines. But condensed matter physics is not just engineering. Large portions of it are blue-sky research into the possibilities of matter, like I’ve been talking about here.

These days, the field of condensed matter physics is just as full of rewarding new insights as the study of elementary particles or black holes. And unlike fundamental physics, progress in condensed matter physics is rapid—in part because experiments are comparatively cheap and easy, and in part because there is more new territory to explore.

So, when you see someone bemoaning the woes of fundamental physics, take them seriously—but don’t let it get you down. Just find a good article on condensed matter physics and read that. You’ll cheer up immediately.

Can We Understand the Standard Model?

16 March, 2021

I’m giving a talk in Latham Boyle and Kirill Krasnov’s Perimeter Institute workshop Octonions and the Standard Model on Monday April 5th at noon Eastern Time.

This talk will be a review of some facts about the Standard Model. Later I’ll give one that says more about the octonions.

Can we understand the Standard Model?

Abstract. 40 years of trying to go beyond the Standard Model hasn’t yet led to any clear success. As an alternative, we could try to understand why the Standard Model is the way it is. In this talk we review some lessons from grand unified theories and also from recent work using the octonions. The gauge group of the Standard Model and its representation on one generation of fermions arises naturally from a process that involves splitting 10d Euclidean space into 4+6 dimensions, but also from a process that involves splitting 10d Minkowski spacetime into 4d Minkowski space and 6 spacelike dimensions. We explain both these approaches, and how to reconcile them.

You can see the slides here, and later a video of my talk will appear. You can register to attend the talk at the workshop’s website.

Here’s a puzzle, just for fun. As I’ll recall in my talk, there’s a normal subgroup of \mathrm{U}(1) \times \mathrm{SU}(2) \times \mathrm{SU}(3) that acts trivially on all known particles, and this fact is very important. The ‘true’ gauge group of the Standard Model is the quotient of \mathrm{U}(1) \times \mathrm{SU}(2) \times \mathrm{SU}(3) by this normal subgroup.

This normal subgroup is isomorphic to \mathbb{Z}_6 and it consists of all the elements

(\zeta^n, (-1)^n, \omega^n )  \in \mathrm{U}(1) \times \mathrm{SU}(2) \times \mathrm{SU}(3)

where

\zeta = e^{2 \pi i / 6}

is my favorite primitive 6th root of unity, -1 is my favorite primitive square root of unity, and

\omega = e^{2 \pi i / 3}

is my favorite primitive cube root of unity. (I’m a primitive kind of guy, in touch with my roots.)

Here I’m turning the numbers (-1)^n into elements of \mathrm{SU}(2) by multiplying them by the 2 \times 2 identity matrix, and turning the numbers \omega^n into elements of \mathrm{SU}(3) by multiplying them by the 3 \times 3 identity matrix.

But in fact there are a bunch of normal subgroups of \mathrm{U}(1) \times \mathrm{SU}(2) \times \mathrm{SU}(3) isomorphic to \mathbb{Z}_6. By my count there are 12 of them! So you have to be careful that you’ve got the right one, when you’re playing with some math and trying to make it match the Standard Model.

Puzzle 1. Are there really exactly 12 normal subgroups of \mathrm{U}(1) \times \mathrm{SU}(2) \times \mathrm{SU}(3) that are isomorphic to \mathbb{Z}_6?

Puzzle 2. Which ones give quotients isomorphic to the true gauge group of the Standard Model, which is \mathrm{U}(1) \times \mathrm{SU}(2) \times \mathrm{SU}(3) modulo the group of elements (\zeta^n, (-1)^n, \omega^n)?

To get started, it helps to know that every normal subgroup of \mathrm{SU}(2) is a subgroup of its center, which consists of the matrices \pm 1. Similarly, every normal subgroup of \mathrm{SU}(3) is a subgroup of its center, which consists of the matrices 1, \omega and \omega^2. So, the center of \mathrm{U}(1) \times \mathrm{SU}(2) \times \mathrm{SU}(3) is \mathrm{U}(1) \times \mathbb{Z}_2 \times \mathbb{Z}_3.

Here, I believe, are the 12 normal subgroups of \mathrm{U}(1) \times \mathrm{SU}(2) \times \mathrm{SU}(3) isomorphic to \mathbb{Z}_6. I could easily have missed some, or gotten something else wrong!

  1. The group consisting of all elements (1, (-1)^n, \omega^n).
  2. The group consisting of all elements ((-1)^n, 1, \omega^n).
  3. The group consisting of all elements ((-1)^n, (-1)^n, \omega^n).
  4. The group consisting of all elements (\omega^n, (-1)^n, 1).
  5. The group consisting of all elements (\omega^n, (-1)^n, \omega^n).
  6. The group consisting of all elements (\omega^n, (-1)^n, \omega^{-n}).
  7. The group consisting of all elements (\zeta^n , 1, 1).
  8. The group consisting of all elements (\zeta^n , (-1)^n, 1).
  9. The group consisting of all elements (\zeta^n , 1, \omega^n).
  10. The group consisting of all elements (\zeta^n , 1, \omega^{-n}).
  11. The group consisting of all elements (\zeta^n , (-1)^n, \omega^n).
  12. The group consisting of all elements (\zeta^n , (-1)^n, \omega^{-n}).
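Puzzle 1 can also be checked by brute force. Any finite normal subgroup of a connected Lie group lies in its center, and every element of a \mathbb{Z}_6 subgroup has order dividing 6, so its \mathrm{U}(1) component is a 6th root of unity \zeta^a. Encoding \zeta^a, (-1)^b, \omega^c by their exponents, the search space shrinks to \mathbb{Z}_6 \times \mathbb{Z}_2 \times \mathbb{Z}_3. Here’s a quick sketch in Python—just a count of the candidate subgroups, not a proof:

```python
# Count the cyclic subgroups of order 6 inside Z6 x Z2 x Z3, where
# Z6 encodes the 6th roots of unity in U(1), Z2 the center of SU(2),
# and Z3 the center of SU(3). Any Z6 normal subgroup of
# U(1) x SU(2) x SU(3) must live inside this group.

from itertools import product

def subgroup(gen):
    """The cyclic subgroup generated by gen = (a, b, c), as a frozenset."""
    a, b, c = gen
    return frozenset(((a*n) % 6, (b*n) % 2, (c*n) % 3) for n in range(6))

subs = {subgroup(g) for g in product(range(6), range(2), range(3))}
z6_subs = [s for s in subs if len(s) == 6]   # the ones isomorphic to Z6
print(len(z6_subs))   # 12
```

So the count of 12 checks out.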

Magic Numbers

9 March, 2021

Working in the Manhattan Project, Maria Goeppert Mayer discovered in 1948 that nuclei with certain numbers of protons and/or neutrons are more stable than others. In 1963 she won the Nobel prize for explaining this discovery with her ‘nuclear shell model’.

Nuclei with 2, 8, 20, 28, 50, or 82 protons are especially stable, and also nuclei with 2, 8, 20, 28, 50, 82 or 126 neutrons. Eugene Wigner called these magic numbers, and it’s a fun challenge to explain them.

For starters one can imagine a bunch of identical fermions in a harmonic oscillator potential. In one-dimensional space we have evenly spaced energy levels, each of which holds one state if we ignore spin. I’ll write this as

1, 1, 1, 1, ….

But if we have spin-1/2 fermions, each of these energy levels can hold two spin states, so the numbers double:

2, 2, 2, 2, ….

In two-dimensional space, ignoring spin, the pattern changes to

1, 1+1, 1+1+1, 1+1+1+1, ….

or in other words

1, 2, 3, 4, ….

That is: there’s one state of the lowest possible energy, 2 states of the next energy, and so on. Including spin the numbers double:

2, 4, 6, 8, ….

In three-dimensional space the pattern changes to this if we ignore spin:

1, 1+2, 1+2+3, 1+2+3+4, ….

or in other words

1, 3, 6, 10, ….

So, we’re getting triangular numbers! Here’s a nice picture of these states, drawn by J. G. Moxness:

Including spin the numbers double:

2, 6, 12, 20, ….

So, there are 2 states of the lowest energy, 2+6 = 8 states of the first two energies, 2+6+12 = 20 states of the first three energies, and so on. We’ve got the first 3 magic numbers right! But then things break down: next we get 2+6+12+20 = 40, while the next magic number is just 28.
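Those shell counts are easy to verify numerically. Ignoring spin, the number of states of energy E for a d-dimensional oscillator is the number of ordered d-tuples of natural numbers summing to E, which is the binomial coefficient \binom{E+d-1}{d-1}. A quick sketch:

```python
# Degeneracy of the E-th energy level of a d-dimensional harmonic
# oscillator, ignoring spin: ordered d-tuples of naturals summing to E,
# counted by the binomial coefficient C(E + d - 1, d - 1).

from math import comb

def degeneracy(E, d):
    return comb(E + d - 1, d - 1)

# 3D: triangular numbers 1, 3, 6, 10, ...
print([degeneracy(E, 3) for E in range(4)])   # [1, 3, 6, 10]

# Double for spin and accumulate: 2, 8, 20 ... then 40, not 28.
totals, running = [], 0
for E in range(4):
    running += 2 * degeneracy(E, 3)
    totals.append(running)
print(totals)   # [2, 8, 20, 40]
```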

Wikipedia has a nice explanation of what goes wrong and how to fix it to get the next few magic numbers right:

Nuclear shell model.

We need to take two more effects into account. First, ‘spin-orbit interactions’ decrease the energy of a state when some spins point in the opposite direction from the orbital angular momentum. Second, the harmonic oscillator potential gets flattened out at large distances, so states of high angular momentum have less energy than you’d expect. I won’t attempt to explain the details, since Wikipedia does a pretty good job and I’m going to want breakfast soon. Here’s a picture that cryptically summarizes the analysis:

The notation is old-fashioned, from spectroscopy—you may know it if you’ve studied atomic physics, or chemistry. If you don’t know it, don’t worry about it! The main point is that the energy levels in the simple story I just told change a bit. They don’t change much until we hit the fourth magic number; then 8 of the next 20 energy levels get lowered so much that this magic number is 2+6+12+8 = 28 instead of 2+6+12+20 = 40. Things go on from there.

But here’s something cute: our simplified calculation of the magic numbers actually matches the count of states in each energy level for a four-dimensional harmonic oscillator! In four dimensions, if we ignore spin, the number of states in each energy level goes like this:

1, 1+3, 1+3+6, 1+3+6+10, …

These are the tetrahedral numbers:

Doubling them to take spin into account, we get the first three magic numbers right! Then, alas, we get 40 instead of 28.

But we can understand some interesting features of the world using just the first three magic numbers: 2, 8, and 20.

For example, helium-4 has 2 protons and 2 neutrons, so it’s ‘doubly magic’ and very stable. It’s the second most common substance in the universe! And in radioactive decays, often a helium nucleus gets shot out. Before people knew what it was, it was called an ‘alpha particle’… and the name stuck.

Oxygen-16, with 8 protons and 8 neutrons, is also doubly magic. So is calcium-40, with 20 protons and 20 neutrons. This is the heaviest stable nuclide with the same number of protons and neutrons! After that, the repulsive electric charge of the protons needs to be counteracted by a greater number of neutrons.

A wilder example is helium-10, with 2 protons and 8 neutrons. It’s doubly magic, but not stable. It just barely clings to existence, helped by all that magic.

Here’s one thing I didn’t explain yet, which is actually pretty easy. Why is it true that—ignoring the spin—the number of states of the harmonic oscillator in the nth energy level follows this pattern in one-dimensional space:

1, 1, 1, 1, ….

and this pattern in two-dimensional space:

1, 1+1 = 2, 1+1+1 = 3, 1+1+1+1 = 4, …

and this pattern in three-dimensional space:

1, 1+2 = 3, 1+2+3 = 6, 1+2+3+4 = 10, ….

and this pattern in four-dimensional space:

1, 1+3 = 4, 1+3+6 = 10, 1+3+6+10 = 20, ….

and so on?

To see this we need to know two things. First, the allowed energies for a harmonic oscillator in one-dimensional space are equally spaced. So, if we say the lowest energy allowed is 0, by convention, and choose units where the next allowed energy is 1, then the allowed energies are the natural numbers:

0, 1, 2, 3, 4, ….

Second, a harmonic oscillator in n-dimensional space is just like n independent harmonic oscillators in one-dimensional space. In particular, its energy is just the sum of their energies.

So, the number of states of energy E for an n-dimensional oscillator is just the number of ways of writing E as a sum of a list of n natural numbers! The order of the list matters here: writing 3 as 1+2 counts as different than writing it as 2+1.

This leads to the patterns we’ve seen. For example, consider a harmonic oscillator in two-dimensional space. It has 1 state of energy 0, namely

0+0

It has 2 states of energy 1, namely

1+0 and 0+1

It has 3 states of energy 2, namely

2+0 and 1+1 and 0+2

and so on.

Next, consider a harmonic oscillator in three-dimensional space. This has 1 state of energy 0, namely

0+0+0

It has 3 states of energy 1, namely

1+0+0 and 0+1+0 and 0+0+1

It has 6 states of energy 2, namely

2+0+0 and 1+1+0 and 1+0+1 and 0+2+0 and 0+1+1 and 0+0+2

and so on. You can check that we’re getting triangular numbers: 1, 3, 6, etc. The easiest way is to note that to get a state of energy E, the first of the three independent oscillators can have any natural number j from 0 to E as its energy, and then there are E – j + 1 ways to choose the energies of the other two oscillators so that they sum to E – j. This gives a total of

(E+1) + E + (E-1) + \cdots + 1

states, and this is a triangular number.

The pattern continues in a recursive way: in four-dimensional space the same sort of argument gives us tetrahedral numbers because these are sums of triangular numbers, and so on. We’re getting the diagonals of Pascal’s triangle, otherwise known as binomial coefficients.

We often think of the binomial coefficient

\displaystyle{\binom{n}{k} }

as the number of ways of choosing a k-element subset of an n-element set. But here we are seeing it’s the number of ways of choosing an ordered (k+1)-tuple of natural numbers that sum to n. You may enjoy finding a quick proof that these two things are equal!


Hypernuclei

6 March, 2021

A baryon is a particle made of 3 quarks. The most familiar are the proton, which consists of two up quarks and a down quark, and the neutron, made of two downs and an up. Baryons containing strange quarks were discovered later, since the strange quark is more massive and soon decays to an up or down quark. A hyperon is a baryon that contains one or more strange quarks, but none of the still more massive quarks.

The first hyperon to be found was the Λ, or lambda baryon. It’s made of an up quark, a down quark and a strange quark. You can think of it as a ‘heavy neutron’ in which one down quark was replaced by a strange quark. The strange quark has the same charge as the down, so like the neutron the Λ is neutral.

The Λ baryon was discovered in October 1950 by V. D. Hopper and S. Biswas of the University of Melbourne: these particles were produced naturally when cosmic rays hit the upper atmosphere, and were detected in photographic emulsions flown in a balloon. Imagine discovering a new elementary particle using a balloon! Those were the good old days.

The Λ has a mean life of just 0.26 nanoseconds, but that’s actually a long time in this business. The strange quark can only decay using the weak force, which, as its name suggests, is weak—so this happens slowly compared to decays involving the electromagnetic or strong forces.

For comparison, the Δ+ baryon is made of two ups and a down, just like a proton, but it has spin 3/2 instead of spin 1/2. So, you can think of it as a ‘fast-spinning proton’. It decays very quickly via the strong force: it has a mean life of just 5.6 × 10^{-23} seconds! When you get used to things like this, a nanosecond seems like an eternity.

The unexpectedly long lifetime of the Λ and some other particles was considered ‘strange’, and this eventually led people to dream up a quantity called ‘strangeness’, which is conserved by the strong and electromagnetic interactions but changed by the weak interaction, so that strange particles decay on time scales of roughly a nanosecond. In 1964 Murray Gell-Mann realized that strangeness is simply the number of strange antiquarks in a particle, minus the number of strange quarks.

So, what’s a ‘hypernucleus’?

A hypernucleus is a nucleus containing one or more hyperons along with the usual protons and neutrons. Since nuclei are held together by the strong force, they do things on time scales of 10^{-23} seconds—so an extra hyperon, which lasts for many billion times longer, can be regarded as a stable particle of a new kind when you’re doing nuclear physics! It lets you build new kinds of nuclei.

One well-studied hypernucleus is the hypertriton. Remember, an ordinary triton consists of a proton and two neutrons: it’s the nucleus of tritium, the radioactive isotope of hydrogen used in hydrogen bombs, also known as hydrogen-3. To get a hypertriton, we replace one of the neutrons with a Λ. So, it consists of a proton, a neutron, and a Λ.

In a hypertriton, the Λ behaves almost like a free particle. So, the lifetime of a hypertriton should be almost the same as that of a Λ by itself. Remember, the lifetime of the Λ is 0.26 nanoseconds. The lifetime of the hypertriton is a bit less: 0.24 nanoseconds. Predicting this lifetime, and even measuring it accurately, has taken a lot of work:

Hypertriton lifetime puzzle nears resolution, CERN Courier, 20 December 2019.

Hypernuclei get more interesting when they have more protons and neutrons. In a nucleus the protons form ‘shells’: due to the Pauli exclusion principle, you can only put one proton in each state. The neutrons form their own shells. So the situation is a bit like chemistry, where the electrons form shells, but now you have two kinds of shells. For example in helium-4 we have two protons, one spin-up and one spin-down, in the lowest energy level, also known as the first shell—and also two neutrons in their lowest energy level.

If you add an extra neutron to your helium-4, to get helium-5, it has to occupy a higher energy level. But if you add a hyperon, since it’s different from both the proton and neutron, it too can occupy the lowest energy level.

Indeed, no matter how big your nucleus is, if you add a hyperon it goes straight to the lowest energy level! You can roughly imagine it as falling straight to the center of the nucleus—though everything is quantum-mechanical, so these mental images have to be taken with a grain of salt.

One reason for studying hypernuclei is that in some neutron stars, the inner core may contain hyperons! The point is that by weaseling around the Pauli exclusion principle, we can get more particles in low-energy states, producing dense forms of nuclear matter that have less energy. But nobody knows if this ‘strange nuclear matter’ is really stable. So this is an active topic of research. Hypernuclei are one of the few ways to learn useful information about this using experiments in the lab.

For a lot more, try this:

• A. Gal, E. V. Hungerford and D. J. Millener, Strangeness in nuclear physics, Reviews of Modern Physics 88 (2016), 035004.

You can see some hyperons in the baryon octet, which consists of spin-1/2 baryons made of up, down and strange quarks:

and the baryon decuplet which consists of spin-3/2 baryons made of up, down and strange quarks:

In these charts I_3 is proportional to the number of up quarks minus the number of down quarks, Q is the electric charge, and S is the strangeness.

Gell-Mann and other physicists realized that mathematically, the baryon octet and the baryon decuplet are both irreducible representations of SU(3). But that’s another tale!