The Kepler Problem (Part 3)

23 January, 2022

The Kepler problem studies a particle moving in an inverse square force, like a planet orbiting the Sun. Last time I talked about an extra conserved quantity associated to this problem, which keeps elliptical orbits from precessing or changing shape. This extra conserved quantity is sometimes called the Laplace–Runge–Lenz vector, but since it was first discovered by none of these people, I prefer to call it the ‘eccentricity vector’.

In 1847, Hamilton noticed a fascinating consequence of this extra conservation law. For a particle moving in an inverse square force, its momentum moves along a circle!

Greg Egan has given a beautiful geometrical argument for this fact:

• Greg Egan, The ellipse and the atom.

I will not try to outdo him; instead I’ll follow a more dry, calculational approach. One reason is that I’m trying to amass a little arsenal of formulas connected to the Kepler problem.

Let’s dive in. Remember from last time: we’re studying a particle whose position \vec q obeys

\ddot{\vec q} = - \frac{\vec q}{q^3}

Its momentum is

\vec p = \dot{\vec q}

Its momentum is not conserved. Its conserved quantities are energy:

H = \frac{1}{2} p^2 - \frac{1}{q}

the angular momentum vector:

\vec L = \vec q \times \vec p

and the eccentricity vector:

\vec e = \vec p \times \vec L - \frac{\vec q}{q}

Now for the cool part: we can show that

\displaystyle{ \left( \vec p - \frac{\vec L \times \vec e}{L^2} \right)^2 = \frac{1}{L^2} }

Thus, the momentum \vec p stays on a circle of radius 1/L centered at the point (\vec L \times \vec e)/L^2. And since \vec L and \vec e are conserved, this circle doesn’t change! Let’s call it Hamilton’s circle.

Now let’s actually do the calculations needed to show that the momentum stays on Hamilton’s circle. Since

\vec e = \vec p \times \vec L - \frac{\vec q}{q}

we have

\frac{\vec q}{q} = \vec p \times \vec L - \vec e

Taking the dot product of this vector with itself, which is 1, we get

\begin{array}{ccl}  1 &=& \frac{\vec q}{q} \cdot \frac{\vec q}{q}  \\ \\  &=& (\vec p \times \vec L - \vec e) \cdot (\vec p \times \vec L - \vec e) \\ \\  &=& (\vec p \times \vec L)^2 - 2 \vec e \cdot (\vec p \times \vec L) + e^2  \end{array}

Now, notice that \vec p and \vec L are orthogonal since \vec L = \vec q \times \vec p. Thus

(\vec p \times \vec L)^2 = p^2 L^2

I actually used this fact and explained it in more detail last time. Substituting this in, we get

1 = p^2 L^2 - 2 \vec e \cdot (\vec p \times \vec L) + e^2

Similarly, \vec e and \vec L are orthogonal! After all,

\vec e = \vec p \times \vec L - \frac{\vec q}{q}

The first term is orthogonal to \vec L since it’s the cross product of \vec L and some other vector. And the second term is orthogonal to \vec L since \vec L is the cross product of \vec q and some other vector! So, we have

(\vec L \times \vec e)^2 = L^2 e^2

and thus

\displaystyle { e^2 = \frac{(\vec L \times \vec e)^2}{L^2} }

Substituting this in, we get

\displaystyle { 1 = p^2 L^2 - 2 \vec e \cdot (\vec p \times \vec L) + \frac{(\vec L \times \vec e)^2}{L^2} }

Using the cyclic property of the scalar triple product, we can rewrite this as

\displaystyle { 1 = p^2 L^2 - 2 \vec p \cdot (\vec L \times \vec e) + \frac{(\vec L \times \vec e)^2}{L^2} }

This is nicer because it involves \vec L \times \vec e in two places. If we divide both sides by L^2 we get

\displaystyle { \frac{1}{L^2} = p^2 - \frac{2}{L^2} \; \vec p \cdot (\vec L \times \vec e) + \frac{(\vec L \times \vec e)^2}{L^4} }

And now for the final flourish! The right-hand side is the dot product of a vector with itself:

\displaystyle { \frac{1}{L^2} = \left(\vec p -  \frac{\vec L \times \vec e}{L^2}\right)^2 }

This is the equation for Hamilton’s circle!
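If you'd like to see this numerically rather than algebraically, here is a minimal Python sketch. It is my own illustration, not part of the derivation: it integrates \ddot{\vec q} = -\vec q/q^3 with a fourth-order Runge–Kutta step (an arbitrary choice of integrator) and checks that \vec p stays at distance 1/L from the point (\vec L \times \vec e)/L^2. The function names, step size and initial conditions are all made up for the example.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(c, a):
    return tuple(c*x for x in a)

def eccentricity_vector(q, p):
    # e = p x L - q/|q|, with unit mass and unit force constant
    L = cross(q, p)
    return sub(cross(p, L), scale(1.0/math.sqrt(dot(q, q)), q))

def hamilton_circle(q, p):
    """Return (center, radius) of Hamilton's circle for the state (q, p)."""
    L = cross(q, p)
    L2 = dot(L, L)
    e = eccentricity_vector(q, p)
    return scale(1.0/L2, cross(L, e)), 1.0/math.sqrt(L2)

def derivative(state):
    # state = (q1, q2, q3, p1, p2, p3); returns (qdot, pdot) as one tuple
    q, p = state[:3], state[3:]
    r3 = dot(q, q)**1.5
    return p + scale(-1.0/r3, q)

def rk4_step(state, dt):
    def shifted(k, c):
        return tuple(s + c*x for s, x in zip(state, k))
    k1 = derivative(state)
    k2 = derivative(shifted(k1, dt/2))
    k3 = derivative(shifted(k2, dt/2))
    k4 = derivative(shifted(k3, dt))
    return tuple(s + dt/6*(a + 2*b + 2*c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def max_circle_deviation(q0, p0, dt=1e-3, steps=20000):
    """Largest | |p - center| - radius | seen along the integrated orbit."""
    center, radius = hamilton_circle(q0, p0)
    state = tuple(q0) + tuple(p0)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        d = sub(state[3:], center)
        worst = max(worst, abs(math.sqrt(dot(d, d)) - radius))
    return worst
```

For a bound orbit such as `max_circle_deviation((1.0, 0.0, 0.0), (0.0, 1.2, 0.3))` the result should be tiny, limited only by the integration error. The center and radius are computed once from the initial state, so this also tests that \vec L and \vec e are conserved.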

Now, beware: the momentum \vec p doesn’t usually move at a constant rate along Hamilton’s circle, since that would force the particle’s orbit to itself be circular.

But on the bright side, the momentum moves along Hamilton’s circle regardless of whether the particle’s orbit is elliptical, parabolic or hyperbolic. And we can easily distinguish the three cases using Hamilton’s circle!

After all, the center of Hamilton’s circle is the point (\vec L \times \vec e)/L^2, and

(\vec L \times \vec e)^2 = L^2 e^2

so the distance of this center from the origin is

\displaystyle{ \sqrt{\frac{(\vec L \times \vec e)^2}{L^4}} = \sqrt{\frac{L^2 e^2}{L^4}} = \frac{e}{L} }

On the other hand, the radius of Hamilton’s circle is 1/L. So his circle encloses the origin, goes through the origin or does not enclose the origin depending on whether e < 1, e = 1 or e > 1. But we saw last time that these three cases correspond to elliptical, parabolic and hyperbolic orbits!


• If e < 1 the particle’s orbit is an ellipse and the origin lies inside Hamilton’s circle. The momentum goes round and round Hamilton’s circle as time passes.

• If e = 1 the particle’s orbit is a parabola and the origin lies exactly on Hamilton’s circle. The particle’s momentum approaches zero as time approaches \pm \infty, so its momentum goes around Hamilton’s circle exactly once as time passes.

• If e > 1 the particle’s orbit is a hyperbola and the origin lies outside Hamilton’s circle. The particle’s momentum approaches distinct nonzero values as time approaches \pm \infty, so its momentum goes around just a portion of Hamilton’s circle.
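The three cases above can be sketched in a few lines of Python. This is just an illustration under my own naming: `classify` and the tolerance `tol` are hypothetical, and the tolerance is needed only because floating-point numbers never hit e = 1 exactly.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def classify(q, p, tol=1e-9):
    """Return (orbit type, position of the origin relative to Hamilton's circle)."""
    L = cross(q, p)
    L2 = dot(L, L)
    r = norm(q)
    e_vec = tuple(a - b/r for a, b in zip(cross(p, L), q))
    e = norm(e_vec)

    # Orbit type from the eccentricity
    if abs(e - 1.0) <= tol:
        orbit = "parabola"
    elif e < 1.0:
        orbit = "ellipse"
    else:
        orbit = "hyperbola"

    # Position of the origin, computed independently from the circle itself
    center = tuple(c/L2 for c in cross(L, e_vec))
    radius = 1.0/math.sqrt(L2)
    gap = norm(center) - radius      # equals (e - 1)/L
    if abs(gap) <= tol:
        origin = "on"
    elif gap < 0:
        origin = "inside"
    else:
        origin = "outside"
    return orbit, origin
```

Starting at \vec q = (1,0,0) with momentum (0,1,0), (0,\sqrt{2},0) or (0,2,0) gives a circular orbit, a parabola and a hyperbola respectively, so those make handy test cases.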

By the way, in general the curve traced out by the momentum vector of a particle is called a hodograph. So you can learn more about Hamilton’s circle with the help of that buzzword.

The Periodic Table

21 January, 2022


I like many kinds of periodic table, but hate this one. See the problem?

Element 57 is drawn right next to element 72, replacing the element that should be there: element 71. So lutetium, element 71, is being denied its rightful place as a transition metal and is classified as a rare earth. Meanwhile lanthanum, element 57, which really is a rare earth, is drawn separately from all the rest! This is especially ironic because those rare earths are called ‘lanthanoids’ or ‘lanthanides’.

Similarly, element 89 is next to element 104, instead of the element that should be there: element 103. So lawrencium, element 103, is also being denied its rightful place as a transition metal. Meanwhile actinium, element 89, is banished from the row of ‘actinoids’, or ‘actinides’ — even though it gave them their name in the first place. How cruel!

Here Wikipedia does it right. Element 71 is a transition metal — not element 57. Similarly element 103 is a transition metal, not element 89.

This stuff is not just an arbitrary convention. Transition metals are chemically different from lanthanides and actinides. You can’t just stick them wherever you want.

In simple terms, as we move across the transition metals, they put 1, 2, 3, … , 10 electrons in their outermost d subshell. Similarly as we move across the lanthanides or actinides, they put 1, 2, 3, … 14 electrons in their outermost f subshell. I wrote about this here a while ago:

• John Baez, The Madelung rules, Azimuth, December 8, 2021.

There are some exceptions to the Madelung rules, but the bad periodic tables are not motivated by those exceptions. The Wikipedia periodic table accurately reflects the chemistry. The Encyclopedia Britannica table completely ruins the story by arbitrarily sticking lanthanum and actinium in amongst the transition metals instead of the elements that should be there: lutetium and lawrencium. I see no good reason for doing this.

Here’s another common kind of periodic table that I hate. It cuts a hole into the bottom two rows of the transition metals, and moves the metals that should be there — elements 71 and 103 — into the rare earths and actinides.

This amounts to claiming that there are 15 rare earths and actinides, and just 9 transition metals in those two rows. That’s crazy: the fact that the d subshell holds 10 electrons and the f subshell holds 14 is dictated by group representation theory. Subshells hold 2, 6, 10, 14 electrons: twice odd numbers.
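Here is the counting behind ‘twice odd numbers’, as a tiny Python sketch. The function name is mine. The formula 2(2ℓ+1) is the standard one: a subshell with orbital angular momentum ℓ has 2ℓ+1 orbitals, and each holds 2 electrons because of spin.

```python
def subshell_capacity(l):
    """Electrons that fit in a subshell with orbital angular momentum l."""
    orbitals = 2*l + 1      # dimension of the spin-l representation of SO(3)
    return 2 * orbitals     # two spin states per orbital

# s, p, d, f correspond to l = 0, 1, 2, 3
capacities = {name: subshell_capacity(l) for l, name in enumerate("spdf")}
```

So a row of the d block must have 10 columns and a row of the f block must have 14, which is exactly what the tables with 15 and 9 get wrong.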

The periodic table is a marvelous thing: it shows how quantum mechanics and math predict patterns in the elements. Have fun making up new designs, but if you’re going to use the old kind, use the good one!

If you don’t believe me, listen to this guy:

But unlike him, I don’t think experiments were necessary to realize that the bad periodic tables were messed up. It’s not as if they were designed based on some alternative theory about which elements are transition metals.

Interestingly, the International Union of Pure and Applied Chemistry were supposed to meet at the end of last year to settle this issue. What did they decide? If you find out, please let me know!

Rapid Variable B Subdwarf Stars

19 January, 2022

A subdwarf B star is a blue-hot star smaller than the Sun. A few of these crazy stars pulse in brightness as fast as every 90 seconds! Waves of ionizing iron pulse through their thin surface atmosphere.

What’s up with these weird stars?

Sometimes a red giant loses most of its outer hydrogen… nobody is sure why… leaving just a thin layer of hydrogen over its helium core. We get a star with at most 1/4 the diameter of the Sun, but really hot.

It’s the blue-hot heart of a red giant, stripped bare.

Iron and other metals in the star’s thin hydrogen atmosphere can lose and regain their outer electrons. When these electrons are gone, the metals are ‘ionized’ and they absorb more light. This pushes them further out. Then they cool, become less ionized, absorb less light, and fall back down. This heats them up, so they become more ionized and the cycle begins again.

This happens in standing waves, which follow spherical harmonic patterns. You may have seen spherical harmonics in chemistry, where they describe electron orbitals. The same math is being applied here to a whole star! Now it’s not the electron’s wavefunction that’s pulsing in a spherical harmonic: it’s metals in the atmosphere of a star.

When the star is rotating, spherical harmonics that would otherwise vibrate at the same frequency do so at different frequencies. So, just by looking at the pulsing of light from a distant subdwarf B star, you can learn how fast it’s rotating!

I got the gif of a pulsing star from here:

White Dwarf Research Corporation.

Pulsating white dwarf stars also oscillate in spherical harmonic patterns, and this website shows how they look.

The figure showing frequency lines is from this cool paper:

• Stephane Charpinet, Noemi Giammichele, Weikai Zong, Valérie Van Grootel, Pierre Brassard and Gilles Fontaine, Rotation in sdB stars as revealed by stellar oscillations, Open Astronomy 27 (2018), 112–119.

This paper says “a κ-mechanism triggered by an accumulation of heavy elements (in particular iron) in the stellar envelope caused by radiative levitation is driving the oscillation.”

So, what’s the κ-mechanism and radiative levitation?

The κ-mechanism causes oscillations when a layer of a star’s atmosphere gets more opaque at higher temperatures. For example, when heavy metals near the surface of the atmosphere get hot they can ionize, and thus absorb more radiation. When the layer of ions falls in it gets hotter, more opaque, blocks more escaping heat, and the star’s pressure goes up… pushing the layer out. But when the layer shoots out it gets cooler, less opaque, blocks less escaping heat, and the pressure drops again. So we can get oscillations!

Radiative levitation can drive heavy metals to the surface of a star. They absorb light, and the light literally pushes them up. This can make these metals thousands of times more common than you’d expect near the surface.

There’s more that can happen with subdwarf B stars, and you can learn about it here:

• Wikipedia, Subdwarf B stars.

For example, they can simultaneously oscillate in two ways, at two separate rates!

The Color of Infinite Temperature

16 January, 2022


This is the color of something infinitely hot. Of course you’d instantly be fried by gamma rays of arbitrarily high frequency, but this would be its spectrum in the visible range.

This is also the color of a typical neutron star. They’re so hot they look the same.

It’s also the color of the early Universe!

This was worked out by David Madore.

As a blackbody gets hotter and hotter, its spectrum approaches the classical Rayleigh–Jeans law. That is, its true spectrum as given by the Planck law approaches the classical prediction over a larger and larger range of frequencies.

So, for an extremely hot blackbody, the spectrum of light we can actually see with our eyes is governed by the Rayleigh–Jeans law. This law says the color doesn’t depend on the temperature: only the brightness does!
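To see this limit in numbers, here is a rough Python sketch. It is mine, not Madore’s calculation: it only compares the brightness at two visible wavelengths, which I picked arbitrarily as 450 nm and 650 nm, instead of doing the full colorimetry. In the Rayleigh–Jeans law the ratio of the two is (650/450)^4 whatever the temperature, and the Planck ratio should approach that value as the temperature grows.

```python
import math

PLANCK_H = 6.62607015e-34    # Planck constant, J s
LIGHT_C = 299792458.0        # speed of light, m/s
BOLTZMANN_K = 1.380649e-23   # Boltzmann constant, J/K

def planck(wavelength, temperature):
    """Planck spectral radiance per unit wavelength."""
    x = PLANCK_H * LIGHT_C / (wavelength * BOLTZMANN_K * temperature)
    # expm1 keeps exp(x) - 1 accurate when x is tiny (very high temperature)
    return 2.0 * PLANCK_H * LIGHT_C**2 / wavelength**5 / math.expm1(x)

def rayleigh_jeans(wavelength, temperature):
    """Classical Rayleigh-Jeans radiance per unit wavelength."""
    return 2.0 * LIGHT_C * BOLTZMANN_K * temperature / wavelength**4

def blue_to_red(spectrum, temperature, blue=450e-9, red=650e-9):
    """Ratio of radiance at a 'blue' and a 'red' wavelength."""
    return spectrum(blue, temperature) / spectrum(red, temperature)

limit = (650.0 / 450.0)**4   # temperature-independent Rayleigh-Jeans ratio
```

Comparing `blue_to_red(planck, T)` with `limit` for larger and larger T shows the convergence. At the temperature of the Sun’s surface the two are still quite different, which is why stars of ordinary temperatures have different colors.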

And this color is shown above.

This involves human perception, not just straight physics. So David Madore needed to work out the response of the human eye to the Rayleigh–Jeans spectrum — “by integrating the spectrum against the CIE XYZ matching functions and using the definition of the sRGB color space.”

The color he got is sRGB(148,177,255). And according to the experts who sip latte all day and make up names for colors, this color is called ‘Perano’.

Here is some background material Madore wrote on colors and visual perception. It doesn’t include the whole calculation that leads to this particular color, so somebody should check it, but it should help you understand how to convert the blackbody spectrum at a particular temperature into an sRGB color:

• David Madore, Colors and colorimetry.

In the comments you can see that Thomas Mansencal redid the calculation and got a slightly different color: sRGB(154,181,255). It looks quite similar to me:

Adjoint School 2022

2 January, 2022

Every year since 2018 we’ve been running a course on applied category theory where you can do research with experts. It’s called the Adjoint School.

You can apply to be a student at the 2022 Adjoint School now, and applications are due January 29th! Go here:

2022 Adjoint School: application.

The school will be run online from February to June, 2022, and then—coronavirus permitting—there will be in-person research at the University of Strathclyde in Glasgow, Scotland the week of July 11 – 15, 2022. This is also the location of the applied category theory conference ACT2022.

The 2022 Adjoint School is organized by Angeline Aguinaldo, Elena Di Lavore, Sophie Libkind, and David Jaz Myers. You can read more about how it works here:

About the Adjoint School.

There are four topics to work on, and you can see descriptions of them below.

Who should apply?

Anyone, from anywhere in the world, who is interested in applying category-theoretic methods to problems outside of pure mathematics. This is emphatically not restricted to math students, but one should be comfortable working with mathematics. Knowledge of basic category-theoretic language—the definition of monoidal category for example—is encouraged.

We will consider advanced undergraduates, PhD students, post-docs, as well as people working outside of academia. Members of groups which are underrepresented in the mathematics and computer science communities are especially encouraged to apply.

Also check out our inclusivity statement.

Topic 1: Compositional Thermodynamics

Mentors: Spencer Breiner and Joe Moeller

TA: Owen Lynch

Description: Thermodynamics is the study of the relationships between heat, energy, work, and matter. In category theory, we model flows in physical systems using string diagrams, allowing us to formalize physical axioms as diagrammatic equations. The goal of this project is to establish such a compositional framework for thermodynamical networks. A first goal will be to formalize the laws of thermodynamics in categorical terms. Depending on the background and interest of the participants, further topics may include the Carnot and Otto engines, more realistic modeling for real-world systems, and software implementation within the AlgebraicJulia library.


• John C. Baez, Owen Lynch, and Joe Moeller, Compositional thermostatics.

• F. William Lawvere, State categories, closed categories and the existence of semi-continuous entropy functions.

Topic 2: Fuzzy Type Theory for Opinion Dynamics

Mentor: Paige North

TA: Hans Reiss

Description: When working in type theory (or most logics), one is interested in proving propositions by constructing witnesses to their incontrovertible truth. In the real world, however, we can often only hope to understand how likely something is to be true, and we look for evidence that something is true. For example, when a doctor is trying to determine if a patient has a certain condition, they might ask certain questions and perform certain tests, each of which constitutes a piece of evidence that the patient does or does not have that condition. This suggests that a fuzzy version of type theory might be appropriate for capturing and analyzing real-world situations. In this project, we will explore the space of fuzzy type theories which can be used to reason about the fuzzy propositions of disease and similar dynamics.


• Daniel R. Grayson, An introduction to univalent foundations for mathematicians.

• Jakob Hansen and Robert Ghrist, Opinion dynamics on discourse sheaves.

Topic 3: A Compositional Theory of Timed and Probabilistic Processes: CospanSpan(Graph)

Mentor: Nicoletta Sabadini

TA: Mario Román

Description: Span(Graph), introduced by Katis, Sabadini and Walters as a categorical algebra for automata with interfaces, provides, in a very intuitive way, a compositional description of hierarchical networks of interacting components with fixed topology. The algebra also provides a calculus of connectors, with an elegant description of signal broadcasting. In particular, the operations of “parallel with communication” (that allows components to evolve simultaneously, like connected gears), and “non-sequential feedback” (not considered in Kleene’s algebra for classical automata) are fundamental in modelling complex distributed systems such as biological systems. Similarly, the dual algebra Cospan(Graph) allows us to compose systems sequentially. Hence, the combined algebra CospanSpan(Graph), which extends Kleene’s algebra for classical automata, is a general algebra for reconfigurable networks of interacting components. Still, some very interesting aspects and possible applications of this model deserve a better understanding:

• How can timed actions and probability be combined in CospanSpan(Graph)?

• If not, can we describe time-varying probability in a compositional setting?

• Which is the possible role of “parallel with communication” in understanding causality?


• L. de Francesco Albasini, N. Sabadini, and R.F.C. Walters, The compositional construction of Markov processes II.

• A. Cherubini, N. Sabadini, and R.F.C. Walters, Timing in the Cospan-Span model.

Topic 4: Algebraic Structures in Logic and Relations

Mentor: Filippo Bonchi

Description coming soon!

The Kepler Problem (Part 2)

31 December, 2021

I’m working on a math project involving the periodic table of elements and the Kepler problem—that is, the problem of a particle moving in an inverse square force law. That’s one reason I’ve been blogging about chemistry lately! I hope to tell you all about this project sometime—but right now I just want to say some very basic stuff about the ‘eccentricity vector’.

This vector is a conserved quantity for the Kepler problem. It was named the ‘Runge–Lenz vector’ after Lenz used it in 1924 to study the hydrogen atom in the framework of the ‘old quantum mechanics’ of Bohr and Sommerfeld: Lenz cited Runge’s popular German textbook on vector analysis from 1919, which explains this vector. But Runge never claimed any originality: he attributed this vector to Gibbs, who wrote about it in his book on vector analysis in 1901!

Nowadays many people call it the ‘Laplace–Runge–Lenz vector’, honoring Laplace’s discussion of it in his famous treatise on celestial mechanics in 1799. But in fact this vector goes back at least to Jakob Hermann, who wrote about it in 1710, triggering further work on this topic by Johann Bernoulli in the same year.

Nobody has seen signs of this vector in work before Hermann. So, we might call it the Hermann–Bernoulli–Laplace–Gibbs–Runge–Lenz vector, or just the Hermann vector. But I prefer to call it the eccentricity vector, because for a particle moving in an inverse square force law its magnitude is the eccentricity of the particle’s orbit!

Let’s suppose we have a particle whose position \vec q \in \mathbb{R}^3 obeys this version of the inverse square force law:

\ddot{\vec q} = - \frac{\vec q}{q^3}

where I remove the arrow from a vector when I want to talk about its magnitude. So, I’m setting the mass of this particle equal to 1, along with the constant saying the strength of the force. That’s because I want to keep the formulas clean! With these conventions, the momentum of the particle is

\vec p = \dot{\vec q}

For this system it’s well-known that the following energy is conserved:

H = \frac{1}{2} p^2 - \frac{1}{q}

as well as the angular momentum vector:

\vec L = \vec q \times \vec p

But the interesting thing for me today is the eccentricity vector:

\vec e = \vec p \times \vec L - \frac{\vec q}{q}

Let’s check that it’s conserved! Taking its time derivative,

\dot{\vec e} = \dot{\vec p} \times \vec L + \vec p \times \dot{\vec L} - \frac{\vec p}{q} + \frac{\dot q}{q^2} \,\vec q

But angular momentum is conserved so the second term vanishes, and

\dot q = \frac{d}{dt} \sqrt{\vec q \cdot \vec q} =  \frac{\vec p \cdot \vec q}{\sqrt{\vec q \cdot \vec q}} = \frac{\vec p \cdot \vec q}{q}

so we get

\dot{\vec e} = \dot{\vec p} \times \vec L - \frac{\vec p}{q} +  \frac{\vec p \cdot \vec q}{q^3}\, \vec q

But the inverse square force law says

\dot{\vec p} = - \frac{\vec q}{q^3}

so

\dot{\vec e} = - \frac{1}{q^3} \, \vec q \times \vec L - \frac{\vec p}{q} +  \frac{\vec p \cdot \vec q}{q^3}\, \vec q

How can we see that this vanishes? Mind you, there are various geometrical ways to think about this, but today I’m in the mood for checking that my skills in vector algebra are sufficient for a brute-force proof—and I want to record this proof so I can see it later!

To get anywhere we need to deal with the cross product in the above formula:

\vec q \times \vec L = \vec q \times (\vec q \times \vec p)

There’s a nice identity for the vector triple product:

\vec a \times (\vec b \times \vec c) = (\vec a \cdot \vec c) \vec b - (\vec a \cdot \vec b) \vec c

I could have fun talking about why this is true, but I won’t now! I’ll just use it:

\vec q \times \vec L = \vec q \times (\vec q \times \vec p) = (\vec q \cdot \vec p) \vec q - q^2 \, \vec p

and plug this into our formula

\dot{\vec e} = - \frac{1}{q^3} \, \vec q \times \vec L - \frac{\vec p}{q} +  \frac{\vec p \cdot \vec q}{q^3}\, \vec q

to get

\dot{\vec e} = -\frac{1}{q^3} \Big((\vec q \cdot \vec p) \vec q - q^2 \vec p \Big) - \frac{\vec p}{q} +  \frac{\vec p \cdot \vec q}{q^3}\, \vec q

But look—everything cancels! So

\dot{\vec e} = 0

and the eccentricity vector is conserved!
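Since brute-force vector algebra is easy to get wrong, a numerical spot check is reassuring. The Python sketch below is my own: it evaluates the time derivative of \vec e term by term, using \dot{\vec p} = -\vec q/q^3 and \dot{\vec L} = 0, at randomly chosen points of phase space. The sampling ranges and function names are arbitrary choices for the example.

```python
import math
import random

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def e_dot(q, p):
    """Time derivative of e = p x L - q/|q| under the inverse square force law."""
    r = math.sqrt(dot(q, q))
    L = cross(q, p)
    p_dot = tuple(-x / r**3 for x in q)     # the force law
    first = cross(p_dot, L)                 # pdot x L  (Ldot = 0 drops out)
    q_dot_scalar = dot(p, q) / r            # d|q|/dt
    return tuple(f - p_i / r + q_dot_scalar / r**2 * q_i
                 for f, p_i, q_i in zip(first, p, q))

def largest_e_dot(samples=1000, seed=0):
    """Largest component of e_dot over random states; should be rounding noise."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        # keep q away from the origin so 1/q^3 stays moderate
        q = tuple(rng.uniform(0.5, 2.0) for _ in range(3))
        p = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
        worst = max(worst, max(abs(c) for c in e_dot(q, p)))
    return worst
```

If any of the three terms had the wrong power of q, `largest_e_dot()` would come out of order 1 instead of at the level of floating-point rounding.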

So, it seems that the inverse square force law has 7 conserved quantities: the energy H, the 3 components of the angular momentum \vec L, and the 3 components of the eccentricity vector \vec e. But they can’t all be independent, since the particle only has 6 degrees of freedom: 3 for position and 3 for momentum. There can be at most 5 independent conserved quantities, since something has to change. So there have to be at least two relations between the conserved quantities we’ve found.

The first of these relations is pretty obvious: \vec e and \vec L are at right angles, so

\vec e \cdot \vec L = 0

But wait, why are they at right angles? Because

\vec e = \vec p \times \vec L - \frac{\vec q}{q}

The first term is orthogonal to \vec L because it’s a cross product of \vec p and \vec L; the second is orthogonal to \vec L because \vec L is a cross product of \vec q and \vec p.

The second relation is a lot less obvious, but also more interesting. Let’s take the dot product of \vec e with itself:

e^2 = \left(\vec p \times \vec L - \frac{\vec q}{q}\right) \cdot \left(\vec p \times \vec L - \frac{\vec q}{q}\right)

or in other words,

e^2 = (\vec p \times \vec L) \cdot (\vec p \times \vec L) - \frac{2}{q} \vec q \cdot (\vec p \times \vec L) + 1

But remember this nice cross product identity:

(\vec a \times \vec b) \cdot (\vec a \times \vec b) + (\vec a \cdot \vec b)^2 = a^2 b^2

Since \vec p and \vec L are at right angles this gives

(\vec p \times \vec L) \cdot (\vec p \times \vec L) = p^2 L^2

so

e^2 = p^2 L^2 - \frac{2}{q} \vec q \cdot (\vec p \times \vec L) + 1

Then we can use the cyclic identity for the scalar triple product:

\vec a \cdot (\vec b \times \vec c) = \vec c \cdot (\vec a \times \vec b)

to rewrite this as

e^2 = p^2 L^2 - \frac{2}{q} \vec L \cdot (\vec q \times \vec p) + 1

or simply

e^2 = p^2 L^2 - \frac{2}{q} L^2 + 1

or even better,

e^2 = 2 \left(\frac{1}{2} p^2 - \frac{1}{q}\right) L^2 + 1

But this means that

e^2 = 2HL^2 + 1

which is our second relation between conserved quantities for the Kepler problem!
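Both relations are easy to test numerically. In this Python sketch, which is only an illustration with names of my own choosing, `relation_gaps` returns \vec e \cdot \vec L and e^2 - (2HL^2 + 1) for any state (\vec q, \vec p). Both should vanish up to rounding error.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def conserved_quantities(q, p):
    """Return (H, L, e) for unit mass and unit force constant."""
    r = math.sqrt(dot(q, q))
    H = 0.5 * dot(p, p) - 1.0 / r
    L = cross(q, p)
    e = tuple(a - b / r for a, b in zip(cross(p, L), q))
    return H, L, e

def relation_gaps(q, p):
    """Return (e . L, e^2 - (2 H L^2 + 1)); both should be zero."""
    H, L, e = conserved_quantities(q, p)
    return dot(e, L), dot(e, e) - (2.0 * H * dot(L, L) + 1.0)
```

Note that neither relation needs the equations of motion: they are identities that hold at every point of phase space, which is why no integration is required here.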

This relation makes a lot of sense if you know that e is the eccentricity of the orbit. Then it implies:

• if H > 0 then e > 1 and the orbit is a hyperbola.

• if H = 0 then e = 1 and the orbit is a parabola.

• if H < 0 then 0 ≤ e < 1 and the orbit is an ellipse (or circle).

But why is e the eccentricity? And why does the particle move in a hyperbola, parabola or ellipse in the first place? We can show both of these things by taking the dot product of \vec q and \vec e:

\begin{array}{ccl}   \vec q \cdot \vec e &=&   \vec q \cdot \left(\vec p \times \vec L - \frac{\vec q}{q} \right)  \\ \\    &=& \vec q \cdot (\vec p \times \vec L) - q   \end{array}

Using the cyclic property of the scalar triple product we can rewrite this as

\begin{array}{ccl}   \vec q \cdot \vec e &=&   \vec L \cdot (\vec q \times \vec p) - q  \\ \\  &=& L^2 - q  \end{array}

Now, we know that \vec q moves in the plane orthogonal to \vec L. In this plane, which contains the vector \vec e, the equation \vec q \cdot \vec e = L^2 - q defines a conic of eccentricity e. I won’t show this from scratch, but it may seem more familiar if we rotate the whole situation so this plane is the xy plane and \vec e points in the x direction. Then in polar coordinates this equation says

er \cos \theta = L^2 - r

or

r = \frac{L^2}{1 + e \cos \theta}

This is well-known, at least among students of physics who have solved the Kepler problem, to be the equation of a conic of eccentricity e.
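For readers who haven’t met this form of the equation, one way to connect it to eccentricity is the focus–directrix description of a conic: the distance to the focus equals e times the distance to a fixed line. That description is a standard fact I’m bringing in, not something shown above. With the focus at the origin, the equation er \cos \theta = L^2 - r says precisely that r = e (L^2/e - x), so the directrix is the line x = L^2/e. Here is a small Python sketch that checks this for points on the curve. The function names are hypothetical and it assumes e > 0.

```python
import math

def conic_point(e, L2, theta):
    """Point (x, y) on r = L^2 / (1 + e cos(theta)); needs 1 + e cos(theta) > 0."""
    r = L2 / (1.0 + e * math.cos(theta))
    return r * math.cos(theta), r * math.sin(theta)

def focus_directrix_gap(e, L2, theta):
    """Distance to the focus minus e times distance to the directrix x = L^2/e."""
    x, y = conic_point(e, L2, theta)
    dist_to_focus = math.hypot(x, y)
    dist_to_directrix = abs(L2 / e - x)
    return dist_to_focus - e * dist_to_directrix
```

The gap should vanish for an ellipse (say e = 0.5), a parabola (e = 1) and a hyperbola (e = 2) alike, as long as the angle is chosen so that r is positive.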

Another thing that’s good to do is define a rescaled eccentricity vector. In the case of elliptical orbits, where H < 0, we define this by

\vec M = \frac{\vec e}{\sqrt{-2H}}

Then we can take our relation

e^2 = 2HL^2 + 1

and rewrite it as

1 = e^2 - 2H L^2

and then divide by -2H getting

- \frac{1}{2H} = \frac{e^2}{-2H} + L^2

or

- \frac{1}{2H} = L^2 + M^2

This suggests an interesting similarity between \vec L and \vec M, which turns out to be very important in a deeper understanding of the Kepler problem. And with more work, you can use this idea to show that -1/4H is the Hamiltonian for a free particle on the 3-sphere. But more about that some other time, I hope!
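The relation between L^2, M^2 and H can be checked the same way as the others. This Python sketch is just an illustration with a made-up function name: it computes L^2 + M^2 + 1/2H for a bound state, which should be zero, and refuses states with H ≥ 0 since \vec M was only defined for elliptical orbits.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def sphere_relation_gap(q, p):
    """Return L^2 + M^2 + 1/(2H), which should vanish when H < 0."""
    r = math.sqrt(dot(q, q))
    H = 0.5 * dot(p, p) - 1.0 / r
    if H >= 0:
        raise ValueError("M is only defined here for bound orbits (H < 0)")
    L = cross(q, p)
    e = tuple(a - b / r for a, b in zip(cross(p, L), q))
    M2 = dot(e, e) / (-2.0 * H)     # |M|^2 with M = e / sqrt(-2H)
    return dot(L, L) + M2 + 1.0 / (2.0 * H)
```

For example, \vec q = (1,0,0) and \vec p = (0,1.2,0.3) has H = -0.235, so it is a legitimate bound state to try.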

For now, you might try this:

• Wikipedia, Laplace–Runge–Lenz vector.

and of course this:

The Kepler problem (part 1).

Clemens non Papa

25 December, 2021

As I’ve explored more music from the Franco-Flemish school, I’ve gotten to like some of the slightly less well-known composers—though usually famous in their day—such as Jacobus Clemens non Papa, who lived in Flanders from roughly 1510 to 1555. I enjoy his clear, well-balanced counterpoint. It’s peppy, well-structured, but unromantic: no grand gestures or strong emotions, just lucid clarity. That’s quite appealing to me these days.

On a website about Flemish music I read that:

The style of his work stayed “northern”, without any Italian influences. As far as is known Clemens never ventured out of the Low Countries to pursue a career at a foreign court or institution, unlike many of his contemporaries. This is reflected in most of his religious pieces, where the style is generally reliant on counterpoint arrangements where every voice is independently formed.

Not much is known of his life. The name ‘Clemens non Papa’ may be a bit of a joke, since his last name was Clemens, but there was also a pope of that name, so it may have meant ‘Clemens — not the Pope’.

That makes it all the more funny that if you look for a picture of Clemens non Papa, you’ll quickly be led to Classical, which has a nice article about him—with this picture:

Yes, this is Pope Clement VII.

Clemens non Papa was one of the best musicians of the fourth generation of the Franco-Flemish school, along with Nicolas Gombert, Thomas Crequillon and my personal favorite, Pierre de Manchicourt. He was extremely prolific! He wrote 233 motets, 15 masses, 15 Magnificats, 159 settings of the Psalms in Dutch, and a bit over 100 secular pieces, including 89 chansons.

But unfortunately, he doesn’t seem to have inspired the tireless devotion among modern choral groups that more famous Franco-Flemish composers have. I’m talking about projects like The Clerks’ complete recordings of the sacred music of Ockeghem in five CDs, The Sixteen’s eight CDs of Palestrina, or the Tallis Scholars’ nine CDs of masses by Josquin. There’s something about early music that incites such massive projects! I think I know what it is: it’s beautiful, and a lot has been lost or forgotten, so when you fall in love with it you start wanting to preserve and share it.

Maybe someday we’ll see complete recordings of the works of Clemens non Papa! But right now all we have are small bits—and let me list some.

A great starting-point is Clemens non Papa: Missa Pastores quidnam vidistis by the Tallis Scholars. This whole album is currently available as a YouTube playlist:

Another important album is Behold How Joyful – Clemens non Papa: Mass and Motets by the Brabant Ensemble. It too is available as a playlist on YouTube:

The Brabant Ensemble have another album of Clemens non Papa’s music, Clemens non Papa: Missa pro defunctis, Penitential Motets. I haven’t heard it.

Next, the Egidius Kwartet has a wonderful set of twelve CDs called De Leidse Koorboeken—yet another of the massive projects I mentioned—in which they sing everything in the Leiden Choirbooks. These were six volumes of polyphonic Renaissance music of the Franco-Flemish school copied for a church in Leiden sometime in the 15th or 16th century, which somehow survived an incident in 1566 when a mob burst into that church and ransacked it.

You can currently listen to the Egidius Kwartet’s performances of the complete Leiden Choirbooks on YouTube playlists:

Volume 2 contains these pieces by Clemens non Papa—click to listen to them:

Heu mihi Domine, a4. Anima mea turbata est, a4.

Maria Magdalena, a5. Cito euntes, a5.

Jherusalem surge, a5. Leva in circuitu, a5.

Magnificat quarti thoni, a4.

Magnificat sexti thoni, a4.

Magnificat octavi toni, a4-5.

Volume 3 contains these:

Cum esset anna, a5.

Domine probasti, a5.

Advenit ignis divinus, a5.

Volume 4 contains these:

Angelus domini ad pastores, a4 – Secunda pars: Parvulus filius, a4.

Pastores loquebantur, a5 – Secunda pars: Et venerunt festinantes, a5.

Congratulamini mihi omnes, a4.

Sancti mei qui in carne – Secunda pars: Venite benedicti patris.

Pater peccavi, a4 – Secunda pars: Quanti mercenarii, a4.

Volume 5 contains this:

Ave Maria.

And finally, the group Henry’s Eight has a nice album Pierre de la Rue: Missa cum iucunditate, currently available as a YouTube playlist, which includes two pieces by Clemens non Papa:

Here are those pieces—click to hear them:

Ego flos campi.

Pater peccavi.

Here also is a live performance of Ego flos campi by the Choir of St James, in Winchester Cathedral:

Happy listening! And if you know a big trove of recordings of music by Clemens non Papa, let me know. I just know what’s on Discogs.

The Binary Octahedral Group (Part 2)

24 December, 2021

Part 1 introduced the ‘binary octahedral group’. This time I just want to show you some more pictures related to this group. I’ll give just enough explanation to hint at what’s going on. For more details, check out this webpage:

• Greg Egan, Symmetries and the 24-cell.

Okay, here goes!

You can inscribe two regular tetrahedra in a cube:

Each tetrahedron has 4! = 24 symmetries permuting its 4 vertices.

The cube thus has 48 symmetries, twice as many. Half map each tetrahedron to itself, and half switch the two tetrahedra.

If we consider only rotational symmetries, not reflections, we have to divide by 2. The tetrahedron has 12 rotational symmetries. The cube has 24.
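These counts are easy to check by brute force. Here is a minimal sketch, assuming we place the cube with vertices at (±1, ±1, ±1), so that its symmetries are the signed permutations of the three coordinate axes, and take one inscribed tetrahedron to be the four vertices whose coordinates multiply to +1. The function names are just made up for this illustration.

```python
from itertools import permutations, product

def perm_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

# One of the two tetrahedra inscribed in the cube with vertices (±1, ±1, ±1):
# the vertices whose coordinates multiply to +1.
TETRAHEDRON = {v for v in product((1, -1), repeat=3) if v[0] * v[1] * v[2] == 1}

def cube_symmetries():
    """Yield (perm, signs, det) for every signed permutation of the axes."""
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            det = perm_sign(perm) * signs[0] * signs[1] * signs[2]
            yield perm, signs, det

def apply(perm, signs, v):
    """Apply a signed permutation to the point v."""
    return tuple(signs[i] * v[perm[i]] for i in range(3))

def preserves_tetrahedron(perm, signs):
    """True if the symmetry maps the chosen tetrahedron to itself."""
    return {apply(perm, signs, v) for v in TETRAHEDRON} == TETRAHEDRON

symmetries = list(cube_symmetries())
cube_all = len(symmetries)
cube_rot = sum(1 for _, _, det in symmetries if det == 1)
tet_all = sum(1 for p, s, _ in symmetries if preserves_tetrahedron(p, s))
tet_rot = sum(1 for p, s, d in symmetries if d == 1 and preserves_tetrahedron(p, s))

print(cube_all, cube_rot, tet_all, tet_rot)  # 48 24 24 12
```

Rotations are the symmetries with determinant +1, which is why dividing by 2 gives the right count.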

But the rotation group SO(3) has a double cover SU(2). So the rotational symmetry groups of tetrahedron and cube have double covers too, with twice as many elements: 24 and 48, respectively.

But these 24-element and 48-element groups are different from the ones mentioned before! They’re called the binary tetrahedral group and binary octahedral group—since we could have used the symmetries of an octahedron instead of a cube.

Now let’s think about these groups using quaternions. We can think of SU(2) as consisting of the ‘unit quaternions’—that is, quaternions of length 1. That will connect what we’re doing to 4-dimensional geometry!

The binary tetrahedral group

Viewed this way, the binary tetrahedral group consists of 24 unit quaternions. 8 of them are very simple:

\pm 1, \; \pm i, \; \pm j, \; \pm k

These form a group called the quaternion group, and they’re the vertices of a shape that’s the 4d analogue of a regular octahedron. It’s called the 4-dimensional cross-polytope and it looks like this:

The remaining 16 elements of the binary tetrahedral group are these:

\displaystyle{ \frac{\pm 1 \pm i \pm j \pm k}{2} }

They form the vertices of a 4-dimensional hypercube:

Putting the vertices of the hypercube and the cross-polytope together, we get all 8 + 16 = 24 elements of the binary tetrahedral group. These are the vertices of a 4-dimensional shape called the 24-cell:

This shape is called the 24-cell not because it has 24 vertices, but because it also has 24 faces, which happen to be regular octahedra. You can see one if you slice the 24-cell like this:

The slices here have real part 1, ½, 0, -½, and -1 respectively. Note that the slices with real part ±½ contain the vertices of a hypercube, while the rest contain the vertices of a cross-polytope.
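If you want to check this by computer, here is a minimal sketch. It assumes we store a quaternion a + bi + cj + dk as a tuple (a, b, c, d) of exact fractions. It builds the 24 elements, checks that they are closed under multiplication, and counts how many lie in each slice of constant real part.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

# The 8 vertices of the cross-polytope: ±1, ±i, ±j, ±k.
cross_polytope = set()
for axis in range(4):
    for sign in (1, -1):
        q = [Fraction(0)] * 4
        q[axis] = Fraction(sign)
        cross_polytope.add(tuple(q))

# The 16 vertices of the hypercube: (±1 ± i ± j ± k)/2.
half = Fraction(1, 2)
hypercube = {tuple(s * half for s in signs) for signs in product((1, -1), repeat=4)}

group = cross_polytope | hypercube

# Closure under multiplication, plus a count of elements by real part.
closed = all(qmul(p, q) in group for p in group for q in group)
slices = Counter(q[0] for q in group)

print(len(group), closed)
for real_part in sorted(slices, reverse=True):
    print(real_part, slices[real_part])
```

The slice counts should come out as 1, 8, 6, 8, 1 for real parts 1, ½, 0, -½, -1.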

And here’s another great way to think about the binary tetrahedral group. We’ve seen that if you take every other vertex of a cube you get the vertices of a regular tetrahedron. Similarly, if you take every other vertex of a 4d hypercube you get a 4d cross-polytope. So, you can take the vertices of a 4d hypercube and partition them into the vertices of two cross-polytopes.

As a result, the 24 elements of the binary tetrahedral group can be partitioned into three cross-polytopes! Greg Egan shows how it looks:

The binary octahedral group

Now that we understand the binary tetrahedral group pretty well, we’re ready for our actual goal: understanding the binary octahedral group! We know this forms a group of 48 unit quaternions, and we know it acts as symmetries of the cube—with elements coming in pairs that act on the cube in the same way, because it’s a double cover of the rotational symmetry group of the cube.

So, we can partition its 48 elements into two kinds: those that preserve each tetrahedron in this picture, and those that switch these two tetrahedra:

The first 24 form a copy of the binary tetrahedral group and thus a 24-cell, as we have discussed. The second form another 24-cell! And these two separate 24-cells are ‘dual’ to each other: the vertices of each one hover above the centers of the other’s faces.

Greg has nicely animated the 48 elements of the binary octahedral group here:

He’s colored them according to the rotations of the cube they represent:

• black: identity
• red: ±120° rotation around a V axis
• yellow: 180° rotation around an F axis
• blue: ±90° rotation around an F axis
• cyan: 180° rotation around an E axis

Here ‘V, F, and E axes’ join opposite vertices, faces, and edges of the cube.

Finally, note that because

• we can partition the 48 vertices of the binary octahedral group into two 24-cells

and

• we can partition the 24 vertices of the 24-cell into three cross-polytopes

it follows that we can partition the 48 vertices of the binary octahedral group into six cross-polytopes.

I don’t know the deep meaning of this fact. I know that the vertices of the 24-cell correspond to the 24 roots of the Lie algebra \mathfrak{so}(8). I know that the famous ‘triality’ symmetry of \mathfrak{so}(8) permutes the three cross-polytopes in the 24-cell, which are in some rather sneaky way related to the three 8-dimensional irreducible representations of \mathfrak{so}(8). I also know that if we take the two 24-cells in the binary octahedral group, and expand one by a factor of \sqrt{2}, so the vertices of the other lie exactly at the centers of its faces, we get the 48 roots of the Lie algebra \mathfrak{f}_4. But I don’t know how to extend this story to get a nice story about the six cross-polytopes in the binary octahedral group.

All I know is that if you pick a quaternion group sitting in the binary octahedral group, it will have 6 cosets, and these will be six cross-polytopes.

Photon-Photon Scattering

13 December, 2021

Light can bounce off light by exchanging virtual charged particles! This gives nonlinear corrections to Maxwell’s equations, even in the vacuum—but they’re only noticeable when the electric field is about 10^{18} volts/meter or more. This is an enormous electric field, able to accelerate a proton from rest to Large Hadron Collider energies in just 5 micrometers!
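As a rough check on that last claim, assuming a uniform field E of about 10^{18} volts/meter and ignoring radiation losses, a proton of charge e crossing a distance d = 5 micrometers gains energy

\Delta W = e E d \approx e \times (10^{18} \, \textrm{volts/meter}) \times (5 \times 10^{-6} \, \textrm{meters}) = 5 \times 10^{12} \, \textrm{eV}

That’s 5 TeV, comparable to the several TeV per proton reached at the LHC.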

In 2017, light-on-light scattering was seen at the LHC when they shot lead ions past each other:

Direct evidence for light-by-light scattering at high energy had proven elusive for decades, until the Large Hadron Collider (LHC) began its second data-taking period (Run 2). Collisions of lead ions in the LHC provide a uniquely clean environment to study light-by-light scattering. Bunches of lead ions that are accelerated to very high energy are surrounded by an enormous flux of photons. Indeed, the coherent action from the large number of 82 protons in a lead atom with all the electrons stripped off (as is the case for the lead ions in the LHC) give rise to an electromagnetic field of up to 10^{25} volts per metre. When two lead ions pass close by each other at the centre of the ATLAS detector, but at a distance greater than twice the lead ion radius, those photons can still interact and scatter off one another without any further interaction between the lead ions, as the reach of the (much stronger) strong force is bound to the radius of a single proton. These interactions are known as ultra-peripheral collisions.

But now people want to see photon-photon scattering by shooting lasers at each other! One place they’ll try this is at the Extreme Light Infrastructure.

In 2019, a laser at the Extreme Light Infrastructure in Romania achieved a power of 10 petawatts for brief pulses — listen to the announcement for what that means!

I think it reached an intensity of 10^{29} watts per square meter, but I’m not sure. If you know the intensity I in watts/square meter of a plane wave of light, you can compute the maximum strength E of its electric field (in volts/meter) by

I = \frac{1}{2} \varepsilon_0 c  E^2

where \varepsilon_0 is the permittivity of the vacuum and c is the speed of light. According to Dominik Wild, I = 10^{29} watts per square meter gives E \approx 10^{16} volts/meter. If so, this is about 1/100 the field strength needed to see strong nonlinear corrections to Maxwell’s equations.
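Here is a minimal sketch of that calculation: it just solves the formula above for E. The constants are standard SI values, and the intensity of 10^{29} watts per square meter is the uncertain figure quoted above, not a measured value I can vouch for.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, farads/meter
C = 299_792_458.0             # speed of light, meters/second
E_SCHWINGER = 1.32e18         # Schwinger limit, volts/meter

def peak_field(intensity):
    """Peak electric field (volts/meter) of a plane wave, from I = (1/2) eps0 c E^2."""
    return math.sqrt(2.0 * intensity / (EPSILON_0 * C))

E = peak_field(1e29)  # intensity in watts/square meter (assumed figure)
ratio = E / E_SCHWINGER

print(f"E = {E:.2e} volts/meter, about {ratio:.4f} of the Schwinger limit")
```

This gives a bit under 10^{16} volts/meter, so ‘about 1/100’ is right as an order of magnitude.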

In China, the Station of Extreme Light plans to build a laser that makes brief pulses of 100 petawatts. That’s 10,000 times the power of all the world’s electrical grids combined—for a very short time! They’re aiming for an intensity of 10^{28} watts/square meter:

• Edwin Cartlidge, Physicists are planning to build lasers so powerful they could rip apart empty space, Science, January 24, 2018.

The modification of Maxwell’s equations due to virtual particles was worked out by Heisenberg and Euler in 1936. (No, not that Euler.) The modified equations are easiest to describe using a Lagrangian, but if we wrote them out we’d get Maxwell’s equations plus extra terms that are cubic in \mathbf{E} and \mathbf{B}.

For more, read these:

• Wikipedia, Schwinger limit.

• Wikipedia, Schwinger effect.

The Schwinger limit is the strength of the electric (or magnetic) field where nonlinearity due to virtual charged particles becomes significant. These are about

\displaystyle{ E_\text{c} = \frac{m_\text{e}^2 c^3}{e \hbar} \approx 1.32 \times 10^{18} \, \textrm{volts/meter} }

\displaystyle{ B_\text{c} = \frac{m_\text{e}^2 c^2}{e \hbar} \approx 4.41 \times 10^{9} \, \textrm{tesla} }

where m_\text{e} is the electron mass, e is the electron charge, c is the speed of light and \hbar is the reduced Planck constant. For more see page 38 here:

• David Delphenich, Nonlinear electrodynamics and QED.
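The two numbers above are easy to reproduce. Here is a minimal sketch that plugs standard SI values of the constants into the formulas for E_\text{c} and B_\text{c}. The variable names are my own.

```python
M_E = 9.1093837015e-31     # electron mass, kilograms
C = 299_792_458.0          # speed of light, meters/second
E_CHARGE = 1.602176634e-19 # elementary charge, coulombs
HBAR = 1.054571817e-34     # reduced Planck constant, joule seconds

# Schwinger limits for the electric and magnetic fields.
E_c = M_E**2 * C**3 / (E_CHARGE * HBAR)  # volts/meter
B_c = M_E**2 * C**2 / (E_CHARGE * HBAR)  # tesla

print(f"E_c = {E_c:.3e} volts/meter")
print(f"B_c = {B_c:.3e} tesla")
```

Note that B_\text{c} = E_\text{c}/c, as you’d expect for the electric and magnetic fields of a light wave.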

The Schwinger effect is when a very large static electric field ‘sparks the vacuum’ and creates real particles. This may put an upper limit on how many protons can be in an atomic nucleus, spelling an end to the periodic table.

Transition Metals

9 December, 2021

The transition metals are more complicated than lighter elements.

Why?

Because they’re the first whose electron wavefunctions are described by quadratic functions of x, y, and z — not just linear or constant. These are called ‘d orbitals’, and they look sort of like this:

More precisely: the wavefunctions of electrons in atoms depend on the distance r from the nucleus and also the angles \theta, \phi. The angular dependence is described by ‘spherical harmonics’, certain functions on the sphere. These are gotten by taking certain polynomials in x,y,z and restricting them to the unit sphere. Chemists have their own jargon for this:

• constant polynomial: s orbital

• linear polynomial: p orbital

• quadratic polynomial: d orbital

• cubic polynomial: f orbital

and so on.

To be even more precise, a spherical harmonic is an eigenfunction of the Laplacian on the sphere. Any such function is the restriction to the sphere of some homogeneous polynomial in x,y,z whose Laplacian in 3d space is zero. This polynomial can be constant, linear, etc.

The dimension of the space of spherical harmonics goes like 1, 3, 5, 7,… as we increase the degree of the polynomial starting from 0:

• constant: 1

• linear: x, y, z

• quadratic: xy, xz, yz, x^2 - y^2, x^2 - z^2

etcetera. So, we get one s orbital, three p orbitals, five d orbitals and so on. Here I’ve arbitrarily chosen a basis of the space of quadratic polynomials with vanishing Laplacian, and I’m not claiming this matches the d orbitals in the pictures!
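Here is a minimal sketch that checks both claims. It stores a polynomial as a dictionary from exponent triples to coefficients, verifies that the five quadratics above have vanishing Laplacian, and counts dimensions. The counting assumes a standard fact not proved here: the Laplacian maps homogeneous polynomials of degree \ell onto those of degree \ell - 2, so the harmonic ones have dimension equal to the difference.

```python
from math import comb

def laplacian(poly):
    """Laplacian of a polynomial stored as {(a, b, c): coeff} for coeff * x^a y^b z^c."""
    result = {}
    for exps, coeff in poly.items():
        for axis in range(3):
            e = exps[axis]
            if e >= 2:
                new = list(exps)
                new[axis] = e - 2  # second derivative lowers this exponent by 2
                k = tuple(new)
                result[k] = result.get(k, 0) + coeff * e * (e - 1)
    return {k: v for k, v in result.items() if v != 0}

# The basis of quadratic polynomials chosen in the text.
quadratics = {
    "xy": {(1, 1, 0): 1},
    "xz": {(1, 0, 1): 1},
    "yz": {(0, 1, 1): 1},
    "x^2 - y^2": {(2, 0, 0): 1, (0, 2, 0): -1},
    "x^2 - z^2": {(2, 0, 0): 1, (0, 0, 2): -1},
}
all_harmonic = all(laplacian(p) == {} for p in quadratics.values())

def harmonic_dimension(degree):
    """Homogeneous polynomials of this degree in 3 variables, minus those of degree - 2."""
    return comb(degree + 2, 2) - comb(degree, 2)

dims = [harmonic_dimension(degree) for degree in range(4)]
print(all_harmonic, dims)  # True [1, 3, 5, 7]
```

By contrast, x^2 alone is not harmonic: its Laplacian is the constant 2.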

The transition metals are the first to use the d orbitals. This is why they’re so different than lighter elements.

Although there are 5 d orbitals, an electron occupying such an orbital can have spin up or down. This is why there are 10 transition metals per row!

This chart doesn’t show the last row of highly radioactive transition metals, just the ones you’re likely to see:

Look: 10 per row, all because there’s a 5d space of quadratic polynomials in x,y,z with vanishing Laplacian. Math becomes matter.

The Madelung rules

Can we understand why the first transition element, scandium, has 21 electrons? Yes, if we’re willing to use the ‘Madelung rules’ explained last time. Let me review them rapidly here.

You’ll notice this chart has axes called n and \ell.

As I just explained, the angular dependence of an orbital is determined by a homogeneous polynomial with vanishing Laplacian. In the above chart, the degree of this polynomial is called \ell. The space of such polynomials has dimension 2\ell + 1.

But an orbital has an additional radial dependence, described using a number called n. The math, which I won’t go into, requires that 0 \le \ell \le n - 1. That gives the above chart its roughly triangular appearance.

The letters s, p, d, f are just chemistry jargon for \ell = 0,1,2,3.

Thanks to spin and the Pauli exclusion principle, we can pack at most 2(2\ell + 1) electrons into the orbitals with a given choice of n and \ell. This bunch of orbitals is called a ‘subshell’.

The Madelung rules say the order in which subshells get filled:

  1. Electrons are assigned to subshells in order of increasing values of n + \ell.
  2. For subshells with the same value of n + \ell, electrons are assigned first to the subshell with lower n.

So let’s see what happens. Only when we hit \ell = 2 will we get transition metals!

\boxed{n + \ell = 1}

n = 1, \ell = 0

This is called the 1s subshell, and we can put 2 electrons in here. First we get hydrogen with 1 electron, then helium with 2. At this point all the n = 1 subshells are full, so the ‘1st shell’ is complete, and helium is called a ‘noble gas’.

\boxed{n + \ell = 2}

n = 2, \ell = 0

This is called the 2s subshell, and we can put 2 more electrons in here. We get lithium with 3 electrons, and then beryllium with 4.

\boxed{n + \ell = 3}

n = 2, \ell = 1

This is called the 2p subshell, and we can put 6 more electrons in here. We get:

◦ boron with 5 electrons,
◦ carbon with 6,
◦ nitrogen with 7,
◦ oxygen with 8,
◦ fluorine with 9,
◦ neon with 10.

At this point all the n = 2 subshells are full, so the 2nd shell is complete and neon is another noble gas.

n = 3, \ell = 0

This is called the 3s subshell, and we can put 2 more electrons in here. We get sodium with 11 electrons, and magnesium with 12.

\boxed{n + \ell = 4}

n = 3, \ell = 1

This is called the 3p subshell, and we can put 6 more electrons in here. We get:

◦ aluminum with 13 electrons,
◦ silicon with 14,
◦ phosphorus with 15,
◦ sulfur with 16,
◦ chlorine with 17,
◦ argon with 18.

At this point the 3s and 3p subshells are full, and argon is another noble gas. But the 3rd shell is not yet complete: the 3d subshell, which also has n = 3, is still empty.

n = 4, \ell = 0

This is called the 4s subshell, and we can put 2 more electrons in here. We get potassium with 19 electrons and calcium with 20.

\boxed{n + \ell = 5}

n = 3, \ell = 2

This is called the 3d subshell, and we can put 10 electrons in here. Since now we’ve finally hit \ell = 2, and thus a d subshell, these are transition metals! We get:

◦ scandium with 21 electrons,
◦ titanium with 22,
◦ vanadium with 23,
◦ chromium with 24,
◦ manganese with 25,
◦ iron with 26,
◦ cobalt with 27,
◦ nickel with 28,
◦ copper with 29,
◦ zinc with 30.

And the story continues—but at least we’ve seen why the first batch of transition elements starts where it does!
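The whole walk-through above can be reproduced mechanically. Here is a minimal sketch, assuming the subshells are the pairs (n, \ell) with 0 \le \ell \le n - 1, each holding 2(2\ell + 1) electrons, and sorting them by n + \ell and then by n as the Madelung rules say. The function names are made up for this illustration.

```python
LETTERS = "spdfghi"  # chemistry jargon for l = 0, 1, 2, 3, ...

def madelung_order(max_sum):
    """Subshells (n, l) with 0 <= l <= n - 1 and n + l <= max_sum, in Madelung order."""
    subshells = [
        (n, l)
        for n in range(1, max_sum + 1)
        for l in range(0, n)
        if n + l <= max_sum
    ]
    # Rule 1: increasing n + l.  Rule 2: for equal n + l, lower n first.
    return sorted(subshells, key=lambda nl: (nl[0] + nl[1], nl[0]))

def filling_table(max_sum):
    """List of (label, first electron, last electron) for each subshell."""
    table = []
    filled = 0
    for n, l in madelung_order(max_sum):
        capacity = 2 * (2 * l + 1)
        table.append((f"{n}{LETTERS[l]}", filled + 1, filled + capacity))
        filled += capacity
    return table

table = filling_table(5)
for label, first, last in table:
    print(f"{label}: electrons {first} to {last}")
```

The 3d line should read ‘electrons 21 to 30’: scandium through zinc.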

The scandal of scandium

For a strong attack on the Madelung rules, see:

• Eric Scerri, The problem with the Aufbau principle for finding electronic configurations, 24 June 2012.

But it’s important to realize that he’s attacking a version of the Madelung rules that is different, and stronger than the version stated above. My version only concerned atoms, not ions. The stronger version claims that you can use the Madelung rules not only to determine the ground state of an atom, but also those of the positive ions obtained by taking that atom and removing some electrons!

This stronger version breaks down if you consider scandium with one electron removed. As we’ve just seen, scandium has the electrons as in argon together with three more: two in the 4s orbital and one in the 3d orbital. This conforms to the Madelung rules.

But when you ionize scandium and remove one electron, it’s not the 3d electron that leaves—it’s one of the 4s electrons! This breaks the stronger version of the Madelung rules.

The weaker version of the Madelung rules also breaks down, but later in the transition metals. The first problem is with chromium, the second is with copper.

By the Madelung rules, chromium should have 2 electrons in the 4s subshell and 4 in the 3d subshell. But in fact it has just 1 in the 4s and 5 in the 3d.

By the Madelung rules, copper should have 2 electrons in the 4s subshell and 9 in the 3d. But in fact it has just 1 in the 4s and 10 in the 3d.
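To see these two failures side by side, here is a minimal sketch that computes the configuration the Madelung rules predict for an atom with Z electrons and compares it with the observed 4s and 3d occupations quoted above. It assumes subshells (n, \ell) with 0 \le \ell \le n - 1, and the observed numbers are simply copied from the text, not computed.

```python
from itertools import count

LETTERS = "spdfghi"  # chemistry jargon for l = 0, 1, 2, 3, ...

def madelung_configuration(z):
    """Configuration predicted by the Madelung rules for an atom with z electrons."""
    config = {}
    remaining = z
    for total in count(1):                            # total = n + l, increasing
        for n in range(total // 2 + 1, total + 1):    # lower n first, with l <= n - 1
            if remaining == 0:
                return config
            l = total - n
            take = min(remaining, 2 * (2 * l + 1))
            config[f"{n}{LETTERS[l]}"] = take
            remaining -= take

# Observed 4s and 3d occupations, as quoted in the text.
OBSERVED = {
    "chromium": (24, {"4s": 1, "3d": 5}),
    "copper": (29, {"4s": 1, "3d": 10}),
}

for name, (z, observed) in OBSERVED.items():
    predicted = madelung_configuration(z)
    outer = {subshell: predicted[subshell] for subshell in observed}
    print(name, "predicted:", outer, "observed:", observed)
```

For scandium, with Z = 21, the same function gives 2 electrons in 4s and 1 in 3d, matching what we saw earlier.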

There are also other breakdowns in heavier transition metals, listed here:

• Wikipedia, Aufbau principle: exceptions in the d block.

These subtleties can only be understood by digging a lot deeper into how the electrons in an atom interact with each other. That’s above my pay grade right now. If you know a good place to learn more about this, let me know! I’m only interested in atoms here, not molecules.

Oxidation states of transition metals

Transition metals get some of their special properties because the electrons in the d subshell are easily removed. For example, this is why the transition metals conduct electricity.

Also, when reacting chemically with other elements, they lose different numbers of electrons. The different possibilities are called ‘oxidation states’.

For example, scandium has all the electrons of argon (Ar) plus two in an s orbital and one in a d orbital. It can easily lose 3 electrons, giving an oxidation state called Sc^{3+}. Titanium has one more electron, so it can lose 4 and form Ti^{4+}. And so on:

This accounts for the most obvious pattern in the chart below: the diagonal lines sloping up.

The red dots are common oxidation states, while the white dots are rarer oxidation states. For example iron (Fe) can lose 2 electrons, 3 electrons, 4 electrons (more rarely), 5 electrons, or 6 electrons (more rarely).

The diagonal lines sloping up come from the simple fact that as we move across a row of transition metals, there are more and more electrons in the d subshell, so more can easily be removed. But everything is complicated by the fact that electrons interact! So the trend doesn’t go on forever: manganese gives up 7 electrons but iron doesn’t easily give up 8, only at most 6. And there’s much more going on, too.

Note also that the two charts above don’t actually agree: the chart in color includes more rare oxidation states.


For a bit more, read:

• Wikipedia, Transition metals.

• Oxidation states of transition metals, Chemistry LibreTexts.

The colored chart of oxidation states in this post is from Wikicommons,
made by Felix Wan, corrected to include the two most common oxidation
states of ruthenium. The black-and-white chart is from the Chemistry