The Kepler Problem (Part 4)

27 January, 2022

The Kepler problem is the study of a particle moving in an attractive inverse square force. In classical mechanics, this problem shows up when you study the motion of a planet around the Sun in the Solar System. In quantum mechanics, it shows up when you study the motion of an electron around a proton in a hydrogen atom.

In Part 2 we saw that the classical Kepler problem has, besides energy and the three components of angular momentum, three more conserved quantities: the components of the eccentricity vector!

This was discovered long ago, in 1710, by the physicist Jakob Hermann. But thanks to Noether, we now know that in classical mechanics, conserved quantities come from symmetries. In the Kepler problem, conservation of energy comes from time translation symmetry, while conservation of the angular momentum comes from rotation symmetry. Which symmetries give conservation of the eccentricity vector?

As we shall see, these symmetries are rotations in 4-dimensional space. These include the obvious rotations in 3-dimensional space which give angular momentum. The other 4-dimensional rotations act in a much less obvious way, and give the eccentricity vector.

In fact, we’ll see that the Kepler problem can be rephrased in terms of a free particle moving around on a sphere in 4-dimensional space. This is a nice explanation of the 4-dimensional rotation symmetry.

After that we’ll see a second way to rephrase the Kepler problem: in terms of a massless, relativistic free particle moving at the speed of light on a sphere in 4-dimensional space. Our first formulation will not involve relativity; the second will.

All this is very nice. You can read some fun explanations of the first formulation here:

• Greg Egan, The ellipse and the atom.

• John Baez, Planets in the fourth dimension.

But how could you guess this 4-dimensional rotation symmetry if you didn’t know about it already? One systematic approach uses Poisson brackets. I won’t explain these, just dive in and use them!

Remember, the particle in the Kepler problem has various observables, which are all ultimately functions of its position and momentum:

• position: \vec q

• momentum: \vec p

• energy: H = \tfrac{1}{2} p^2 - \tfrac{1}{q}

• angular momentum: \vec L = \vec q \times \vec p

• the eccentricity vector: \vec e = \vec p \times \vec L - \tfrac{\vec q}{q}

I’ll use conventions where the Poisson brackets of the components of position q_k and momentum p_\ell are taken to be

\{q_k,p_\ell\} = \delta_{k\ell}

From this, using the rules for Poisson brackets, we can calculate the Poisson brackets of everything else. For starters:

\{H, L_k\} = \{H,e_k\} = 0

These equations are utterly unsurprising, since they are equivalent to saying that angular momentum \vec L and the eccentricity vector \vec e are conserved. More interestingly, we have

\begin{array}{ccl} \{L_j, L_k\} &=& \epsilon_{jk\ell} L_\ell  \\ \{e_j, L_k\} &=& \epsilon_{jk\ell} e_\ell \\ \{e_j, e_k \} &=& -2H \epsilon_{jk\ell} L_\ell \end{array}

where all the indices go from 1 to 3, I’m summing over repeated indices even if they’re both subscripts, and \epsilon_{jk\ell} are the Levi–Civita symbols.
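
If you don’t feel like grinding through this algebra by hand, here is a small sympy script that checks the trickiest of these brackets. This is just a sketch of mine, using nothing but the definitions above; it verifies \{e_j, e_k\} = -2H \epsilon_{jk\ell} L_\ell symbolically, along with \{H, e_j\} = 0:

```python
import sympy as sp

q = sp.symbols('q1:4', real=True)
p = sp.symbols('p1:4', real=True)

def pb(f, g):
    """Poisson bracket {f,g} with the convention {q_k, p_l} = delta_{kl}."""
    return sum(sp.diff(f, q[k])*sp.diff(g, p[k]) - sp.diff(f, p[k])*sp.diff(g, q[k])
               for k in range(3))

r = sp.sqrt(sum(x**2 for x in q))
qv, pv = sp.Matrix(q), sp.Matrix(p)
H = pv.dot(pv)/2 - 1/r                 # energy
Lv = qv.cross(pv)                      # angular momentum
ev = pv.cross(Lv) - qv/r               # eccentricity vector

for j in range(3):
    assert sp.simplify(pb(H, ev[j])) == 0          # e is conserved
    for k in range(3):
        rhs = -2*H*sum(sp.LeviCivita(j+1, k+1, l+1)*Lv[l] for l in range(3))
        assert sp.simplify(pb(ev[j], ev[k]) - rhs) == 0
```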

Now, the factor of -2H above is annoying. But on the region of phase space where H < 0—that is, the space of bound states, where the particle carries out an elliptical orbit—we can define a new vector to deal with this annoyance:

\displaystyle{    \vec M = \frac{\vec e}{\sqrt{-2H}} }

Now we easily get

\begin{array}{ccl} \{L_j, L_k\} &=& \epsilon_{jk\ell} L_\ell  \\ \{L_j, M_k\} &=& \epsilon_{jk\ell} M_\ell \\ \{M_j, M_k \} &=& \epsilon_{jk\ell} L_\ell \end{array}

This is nicer, but we can simplify it even more if we introduce some new vectors that are linear combinations of \vec L and \vec M, namely half their sum and half their difference:

\vec A = \tfrac{1}{2} (\vec L + \vec M), \qquad \vec B = \tfrac{1}{2}(\vec L - \vec M)

Then we get

\begin{array}{ccl} \{ A_j, A_k\} &=&  \epsilon_{jk\ell} A_\ell \\ \{ B_j, B_k\} &=&  \epsilon_{jk\ell} B_\ell  \\ \{ A_j, B_k\} &=& 0 \end{array}

So, the observables A_j and B_k contain the same information as the angular momentum and eccentricity vectors, but now they commute with each other!
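
Continuing the sympy sketch above, we can spot-check numerically that the A’s and B’s really do Poisson-commute. Since \sqrt{-2H} only makes sense where H < 0, I evaluate the brackets at an arbitrarily chosen bound-state point of phase space:

```python
# Continuing the script above: on the H < 0 region, define M, A, B ...
M = ev / sp.sqrt(-2*H)
A = (Lv + M)/2
B = (Lv - M)/2

# ... and check {A_j, B_k} = 0 at a sample phase point (H < 0 there).
pt = dict(zip(q + p, [1, sp.Rational(1, 2), sp.Rational(1, 3),
                      sp.Rational(1, 3), sp.Rational(1, 4), sp.Rational(1, 5)]))
for j in range(3):
    for k in range(3):
        assert abs(sp.N(pb(A[j], B[k]).subs(pt))) < 1e-12
```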

What does this mean?

Well, when you’re first learning math the Levi–Civita symbols \epsilon_{jk\ell} may seem like just a way to summarize the funny rules for cross products in 3-dimensional space. But as you proceed, you ultimately learn that \mathbb{R}^3 with its cross product is the Lie algebra of the Lie group \mathrm{SO}(3) of rotations in 3-dimensional space. From this viewpoint, the Levi–Civita symbols are nothing but the structure constants for the Lie algebra \mathfrak{so}(3): that is, a way of describing the bracket operation in this Lie algebra in terms of basis vectors.

So, what we’ve got here are two commuting copies of \mathfrak{so}(3), one having the A_j as a basis and the other having the B_k as a basis, both with the Poisson bracket as their Lie bracket.

A better way to say the same thing is that we’ve got a single 6-dimensional Lie algebra

\mathfrak{so}(3) \oplus \mathfrak{so}(3)

having both the A_j and B_k as basis. But then comes the miracle:

\mathfrak{so}(3) \oplus \mathfrak{so}(3) \cong \mathfrak{so}(4)

The easiest way to see this is to realize that S^3, the unit sphere in 4 dimensions, is itself a Lie group with Lie algebra isomorphic to \mathfrak{so}(3). Namely, it’s the unit quaternions!—or if you prefer, the Lie group \mathrm{SU}(2). Like any Lie group it acts on itself via left and right translations, which commute. But these are actually ways of rotating S^3. So, you get a map of Lie algebras from \mathfrak{so}(3) \oplus \mathfrak{so}(3) to \mathfrak{so}(4), and you can check that this is an isomorphism.
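
Here’s a quick numerical illustration of that last paragraph (again just a sketch of mine). Writing left and right quaternion multiplication as 4 × 4 real matrices in the basis 1, i, j, k, multiplication by the imaginary units gives antisymmetric matrices, i.e. elements of \mathfrak{so}(4), and the two actions commute:

```python
import numpy as np

def Lmat(a):
    """Left multiplication x -> a*x as a 4x4 real matrix in the basis 1, i, j, k."""
    w, x, y, z = a
    return np.array([[w, -x, -y, -z],
                     [x,  w, -z,  y],
                     [y,  z,  w, -x],
                     [z, -y,  x,  w]])

def Rmat(a):
    """Right multiplication x -> x*a as a 4x4 real matrix in the basis 1, i, j, k."""
    w, x, y, z = a
    return np.array([[w, -x, -y, -z],
                     [x,  w,  z, -y],
                     [y, -z,  w,  x],
                     [z,  y, -x,  w]])

units = [(0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]   # i, j, k
for u in units:
    assert np.allclose(Lmat(u), -Lmat(u).T)   # generators of left rotations lie in so(4)
    assert np.allclose(Rmat(u), -Rmat(u).T)   # so do generators of right rotations
    for v in units:
        # the two so(3) actions commute, giving so(3) + so(3) inside so(4)
        assert np.allclose(Lmat(u) @ Rmat(v), Rmat(v) @ Lmat(u))
```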

So in this approach, the 4th dimension pops out of the fact that the Kepler problem has conserved quantities that give two commuting copies of \mathfrak{so}(3). By Noether’s theorem, it follows that conservation of angular momentum and the eccentricity vector must come from a hidden symmetry: symmetry under some group whose Lie algebra is \mathfrak{so}(4).

And indeed, it turns out that the group \mathrm{SO}(4) acts on the bound states of the Kepler problem in a way that commutes with time evolution!

But how can we understand this fact?

Historically, it seems that the first explanation was found in the quantum-mechanical context. In 1926, even before Schrödinger came up with his famous equation, Pauli used conservation of angular momentum and the eccentricity vector to determine the spectrum of hydrogen. But I believe he was using what we now call Lie algebra methods, not bringing in the group \mathrm{SO}(4).

In 1935, Vladimir Fock, famous for the ‘Fock space’ in quantum field theory, explained this 4-dimensional rotation symmetry by setting up an equivalence between hydrogen atom bound states and functions on the 3-sphere! In the following year, Valentine Bargmann, later famous for being Einstein’s assistant, connected Pauli and Fock’s work using group representation theory.

All this is quantum mechanics. It seems the first global discussion of this symmetry in the classical context was given by Bacry, Ruegg, and Souriau in 1966, leading to important work by Souriau and Moser in the early 1970s. Since then, much more has been done. You can learn about a lot of it from these two books, which are my constant companions these days:

• Victor Guillemin and Shlomo Sternberg, Variations on a Theme by Kepler, Providence, R.I., American Mathematical Society, 1990.

• Bruno Cordani, The Kepler Problem: Group Theoretical Aspects, Regularization and Quantization, with Application to the Study of Perturbations, Birkhäuser, Boston, 2002.

But let me try to summarize a bit of this material.

One way to understand the \mathrm{SO}(4) symmetry for bound states of the Kepler problem starts from the result of Hamilton that I explained last time: for a particle moving around an elliptical orbit in the Kepler problem, its momentum moves round and round in a circle.

I’ll call these circles Hamilton’s circles. Hamilton’s circles are not arbitrary circles in \mathbb{R}^3. Using the inverse of stereographic projection, we can map \mathbb{R}^3 to the unit 3-sphere:

\begin{array}{rccl} f \colon &\mathbb{R}^3 &\to & S^3    \subset \mathbb{R}^4  \\  \\ & \vec p    &\mapsto & \displaystyle{\left(\frac{p^2 - 1}{p^2 +1}, \frac{2 \vec p}{p^2 + 1}\right).} \end{array}

This map sends Hamilton’s circles in \mathbb{R}^3 to great circles in S^3. Furthermore, this construction gives all the great circles in S^3 except those that go through the north and south poles, (\pm 1, 0,0,0). These missing great circles correspond to periodic orbits in the Kepler problem where a particle starts with momentum zero, falls straight to the origin, and bounces back the way it came. If we include these degenerate orbits, every great circle on the unit 3-sphere is the path traced out by the momentum in some solution of the Kepler problem.
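
Here’s a numerical spot-check of this claim (a sketch of mine). I take a bound orbit with H = -1/2, so that e^2 = 1 - L^2 and the unit 3-sphere is the right target (for other energies one first rescales the momenta), trace out Hamilton’s circle using the center and radius found in Part 3, map it by f, and check that the image spans only a 2-dimensional plane through the origin of \mathbb{R}^4, as a great circle must:

```python
import numpy as np

# A bound orbit with H = -1/2: take L = 0.8 and e = 0.6, so e^2 = 1 - L^2.
Lv, ev = np.array([0.0, 0.0, 0.8]), np.array([0.6, 0.0, 0.0])
L2 = Lv @ Lv
center = np.cross(Lv, ev) / L2                    # center of Hamilton's circle
u1 = ev / np.linalg.norm(ev)                      # orthonormal frame for the
u2 = np.cross(Lv, u1) / np.linalg.norm(Lv)        # plane perpendicular to L

theta = np.linspace(0, 2*np.pi, 200)[:, None]
ps = center + (np.cos(theta)*u1 + np.sin(theta)*u2) / np.sqrt(L2)  # radius 1/L

def f(p):
    """Inverse stereographic projection from R^3 to the unit sphere in R^4."""
    p2 = (p*p).sum(axis=-1, keepdims=True)
    return np.hstack([(p2 - 1)/(p2 + 1), 2*p/(p2 + 1)])

# A great circle spans a 2-plane through the origin, so the image has rank 2.
print(np.linalg.matrix_rank(f(ps), tol=1e-9))     # prints 2
```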

Let me reemphasize: in this picture, points of S^3 correspond not to positions but to momenta in the Kepler problem. As time passes, these points move along great circles in S^3... but not at constant speed.

How is their dynamics related to geodesic motion on the 3-sphere?
We can understand this as follows. In Part 2 we saw that

L^2 + M^2 =  - \frac{1}{2H}

and since \vec L \cdot \vec M = 0 we have A^2 = B^2 = \tfrac{1}{4}(L^2 + M^2), so an easy calculation gives

H \; = \; -\frac{1}{8A^2} \; = \; -\frac{1}{8B^2}

In the 3-sphere picture, the observables A_j become functions on the cotangent bundle T^\ast S^3. These functions are just the components of momentum for a particle on S^3, defined using a standard basis of right-invariant vector fields on S^3 \cong \mathrm{SU}(2). Similarly, the observables B_j are the components of momentum using a standard basis of left-invariant vector fields. It follows that

K = 8A^2 = 8B^2

is the Hamiltonian for a nonrelativistic free particle on S^3 with an appropriately chosen mass. Such a particle moves around a great circle on S^3 at constant speed. Since the Kepler Hamiltonian H is a function of K, particles governed by this Hamiltonian move along the same trajectories—but typically not at constant speed!

Both K and the Kepler Hamiltonian H = -1/K are well-defined smooth functions on the symplectic manifold that Souriau dubbed the Kepler manifold:

T^+ S^3 = \{ (x,p) : \; x \in S^3, \, p \in T_x S^3, \, p \ne 0 \}

This is the cotangent bundle of the 3-sphere with the zero cotangent vectors removed, so that H = -1/K is well-defined.

All this is great. But even better, there’s yet another picture of what’s going on, which brings relativity into the game!

We can also think of T^+ S^3 as a space of null geodesics in the Einstein universe: the manifold \mathbb{R} \times S^3 with the Lorentzian metric

dt^2 - ds^2

where dt^2 is the usual Riemannian metric on the real line (‘time’) and ds^2 is the usual metric on the unit sphere (‘space’). In this picture x \in S^3 describes the geodesic’s position at time zero, while the null cotangent vector p + \|p\| dt describes its 4-momentum at time zero. Beware: in this picture, rescaling p by any positive factor other than 1 counts as giving a distinct geodesic. But this is good: physically, it reflects the fact that in relativity, massless particles can have different 4-momentum even if they trace out the same path in spacetime.

In short, the Kepler manifold T^+ S^3 also serves as the classical phase space for a free massless spin-0 particle in the Einstein universe!

And here’s the cool part: the Hamiltonian for such a particle is

\sqrt{K} = \sqrt{-1/H}

So it’s a function of both the Hamiltonians we’ve seen before. Thus, time evolution given by this Hamiltonian carries particles around great circles on the 3-sphere… at constant speed, but at a different speed than the nonrelativistic free particle described by the Hamiltonian K.

In future episodes, I want to quantize this whole story. We’ll get some interesting perspectives on the quantum mechanics of the hydrogen atom.


Learning Computer Science With Categories

26 January, 2022

The first book in Bob Coecke’s series on applied category theory is out, and the pdf is free—legally, even!—until 8 February 2022. Grab a copy now:

• Noson Yanofsky, Theoretical Computer Science for the Working Category Theorist, Cambridge U. Press, 2022.

There are already books on category theory for theoretical computer scientists. Why the reverse? Yanofsky explains:

There’s just one catch: you need to know category theory.

But it’s worth learning category theory, because it’s like a magic key to many subjects. It helps you learn more, faster.


Categories in Chemistry, Computing, and Social Networks

26 January, 2022

This article does two things:

• John Baez, Simon Cho, Daniel Cicala, Nina Otter, and Valeria de Paiva, Applied category theory in chemistry, computing, and social networks, Notices of the American Mathematical Society, February 2022.

It urges you — or your friends, or students — to apply for our free summer school in applied category theory run by the American Mathematical Society. It’s also a quick intro to some key ideas in applied category theory!

Applications are due Tuesday 2022 February 15 at 11:59 Eastern Time — go here for details. If you get in, you’ll get an all-expenses-paid trip to a conference center in upstate New York for a week in the summer. There will be a pool, bocci, lakes with canoes, woods to hike around in, campfires at night… and also whiteboards, meeting rooms, and coffee available 24 hours a day.

You can work with me on categories in chemistry, Nina on categories in the study of social networks, or Valeria on categories applied to concepts from computer science, like lenses.

There are also other programs to choose from; see the AMS Mathematical Research Communities announcement for more details.


The Kepler Problem (Part 3)

23 January, 2022

The Kepler problem studies a particle moving in an inverse square force, like a planet orbiting the Sun. Last time I talked about an extra conserved quantity associated to this problem, which keeps elliptical orbits from precessing or changing shape. This extra conserved quantity is sometimes called the Laplace–Runge–Lenz vector, but since it was first discovered by none of these people, I prefer to call it the ‘eccentricity vector’.

In 1847, Hamilton noticed a fascinating consequence of this extra conservation law. For a particle moving in an inverse square force, its momentum moves along a circle!

Greg Egan has given a beautiful geometrical argument for this fact:

• Greg Egan, The ellipse and the atom.

I will not try to outdo him; instead I’ll follow a more dry, calculational approach. One reason is that I’m trying to amass a little arsenal of formulas connected to the Kepler problem.

Let’s dive in. Remember from last time: we’re studying a particle whose position \vec q obeys

\ddot{\vec q} = - \frac{\vec q}{q^3}

Its momentum is

\vec p = \dot{\vec q}

(I’m using units where the particle’s mass is 1, matching the formula for energy below.)

Its momentum is not conserved. Its conserved quantities are energy:

H = \frac{1}{2} p^2 - \frac{1}{q}

the angular momentum vector:

\vec L = \vec q \times \vec p

and the eccentricity vector:

\vec e = \vec p \times \vec L - \frac{\vec q}{q}

Now for the cool part: we can show that

\displaystyle{ \left( \vec p - \frac{\vec L \times \vec e}{L^2} \right)^2 = \frac{1}{L^2} }

Thus, the momentum \vec p stays on a circle of radius 1/L centered at the point (\vec L \times \vec e)/L^2. And since \vec L and \vec e are conserved, this circle doesn’t change! Let’s call it Hamilton’s circle.

Now let’s actually do the calculations needed to show that the momentum stays on Hamilton’s circle. Since

\vec e = \vec p \times \vec L - \frac{\vec q}{q}

we have

\frac{\vec q}{q} = \vec p \times \vec L - \vec e

Taking the dot product of each side with itself, and noting that \vec q / q is a unit vector so the left side gives 1, we get

\begin{array}{ccl}  1 &=& \frac{\vec q}{q} \cdot \frac{\vec q}{q}  \\ \\  &=& (\vec p \times \vec L - \vec e) \cdot (\vec p \times \vec L - \vec e) \\ \\  &=& (\vec p \times \vec L)^2 - 2 \vec e \cdot (\vec p \times \vec L) + e^2  \end{array}

Now, notice that \vec p and \vec L are orthogonal since \vec L = \vec q \times \vec p. Thus

(\vec p \times \vec L)^2 = p^2 L^2

I actually used this fact and explained it in more detail last time. Substituting this in, we get

1 = p^2 L^2 - 2 \vec e \cdot (\vec p \times \vec L) + e^2

Similarly, \vec e and \vec L are orthogonal! After all,

\vec e = \vec p \times \vec L - \frac{\vec q}{q}

The first term is orthogonal to \vec L since it’s the cross product of \vec L and some other vector. And the second term is orthogonal to \vec L since \vec L is the cross product of \vec q and some other vector! So, we have

(\vec L \times \vec e)^2 = L^2 e^2

and thus

\displaystyle { e^2 = \frac{(\vec L \times \vec e)^2}{L^2} }

Substituting this in, we get

\displaystyle { 1 = p^2 L^2 - 2 \vec e \cdot (\vec p \times \vec L) + \frac{(\vec L \times \vec e)^2}{L^2} }

Using the cyclic property of the scalar triple product, we can rewrite this as

\displaystyle { 1 = p^2 L^2 - 2 \vec p \cdot (\vec L \times \vec e) + \frac{(\vec L \times \vec e)^2}{L^2} }

This is nicer because it involves \vec L \times \vec e in two places. If we divide both sides by L^2 we get

\displaystyle { \frac{1}{L^2} = p^2 - \frac{2}{L^2} \; \vec p \cdot (\vec L \times \vec e) + \frac{(\vec L \times \vec e)^2}{L^4} }

And now for the final flourish! The right hand is the dot product of a vector with itself:

\displaystyle { \frac{1}{L^2} = \left(\vec p -  \frac{\vec L \times \vec e}{L^2}\right)^2 }

This is the equation for Hamilton’s circle!
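
If you’d rather have a computer check this, here’s a short sympy sketch (mine, not part of the original argument) verifying that the equation above holds identically in \vec q and \vec p:

```python
import sympy as sp

q = sp.Matrix(sp.symbols('q1:4', real=True))
p = sp.Matrix(sp.symbols('p1:4', real=True))

r = sp.sqrt(q.dot(q))
L = q.cross(p)                    # angular momentum
e = p.cross(L) - q/r              # eccentricity vector
L2 = L.dot(L)

c = L.cross(e)/L2                 # center of Hamilton's circle
lhs = (p - c).dot(p - c)          # squared distance of the momentum from the center
print(sp.simplify(lhs - 1/L2))    # prints 0: an identity, no equations of motion needed
```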

Now, beware: the momentum \vec p doesn’t usually move at a constant rate along Hamilton’s circle, since that would force the particle’s orbit to itself be circular.

But on the bright side, the momentum moves along Hamilton’s circle regardless of whether the particle’s orbit is elliptical, parabolic or hyperbolic. And we can easily distinguish the three cases using Hamilton’s circle!

After all, the center of Hamilton’s circle is the point (\vec L \times \vec e)/L^2, and

(\vec L \times \vec e)^2 = L^2 e^2

so the distance of this center from the origin is

\displaystyle{ \sqrt{\frac{(\vec L \times \vec e)^2}{L^4}} = \sqrt{\frac{L^2 e^2}{L^4}} = \frac{e}{L} }

On the other hand, the radius of Hamilton’s circle is 1/L. So Hamilton’s circle encloses the origin, goes through the origin, or does not enclose the origin depending on whether e < 1, e = 1 or e > 1. But we saw last time that these three cases correspond to elliptical, parabolic and hyperbolic orbits!

Summarizing:

• If e < 1 the particle’s orbit is an ellipse and the origin lies inside Hamilton’s circle. The momentum goes round and round Hamilton’s circle as time passes.

• If e = 1 the particle’s orbit is a parabola and the origin lies exactly on Hamilton’s circle. The particle’s momentum approaches zero as time approaches \pm \infty, so its momentum goes around Hamilton’s circle exactly once as time passes.

• If e > 1 the particle’s orbit is a hyperbola and the origin lies outside Hamilton’s circle. The particle’s momentum approaches distinct nonzero values as time approaches \pm \infty, so its momentum goes around just a portion of Hamilton’s circle.

By the way, in general the curve traced out by the momentum vector of a particle is called a hodograph. So you can learn more about Hamilton’s circle with the help of that buzzword.


The Binary Octahedral Group (Part 2)

24 December, 2021

Part 1 introduced the ‘binary octahedral group’. This time I just want to show you some more pictures related to this group. I’ll give just enough explanation to hint at what’s going on. For more details, check out this webpage:

• Greg Egan, Symmetries and the 24-cell.

Okay, here goes!

You can inscribe two regular tetrahedra in a cube:



Each tetrahedron has 4! = 24 symmetries permuting its 4 vertices.

The cube thus has 48 symmetries, twice as many. Half map each tetrahedron to itself, and half switch the two tetrahedra.

If we consider only rotational symmetries, not reflections, we have to divide by 2. The tetrahedron has 12 rotational symmetries. The cube has 24.

But the rotation group SO(3) has a double cover SU(2). So the rotational symmetry groups of the tetrahedron and the cube have double covers too, with twice as many elements: 24 and 48, respectively.

But these 24-element and 48-element groups are different from the ones mentioned before! They’re called the binary tetrahedral group and binary octahedral group—since we could have used the symmetries of an octahedron instead of a cube.

Now let’s think about these groups using quaternions. We can think of SU(2) as consisting of the ‘unit quaternions’—that is, quaternions of length 1. That will connect what we’re doing to 4-dimensional geometry!

The binary tetrahedral group

Viewed this way, the binary tetrahedral group consists of 24 unit quaternions. 8 of them are very simple:

\pm 1, \; \pm i, \; \pm j, \; \pm k

These form a group called the quaternion group, and they’re the vertices of a shape that’s the 4d analogue of a regular octahedron. It’s called the 4-dimensional cross-polytope and it looks like this:



The remaining 16 elements of the binary tetrahedral group are these:

\displaystyle{ \frac{\pm 1 \pm i \pm j \pm k}{2} }

They form the vertices of a 4-dimensional hypercube:



Putting the vertices of the hypercube and the cross-polytope together, we get all 8 + 16 = 24 elements of the binary tetrahedral group. These are the vertices of a 4-dimensional shape called the 24-cell:



This shape is called the 24-cell not because it has 24 vertices, but because it also has 24 three-dimensional faces, or ‘cells’, which happen to be regular octahedra. You can see one if you slice the 24-cell like this:



The slices here have real part 1, ½, 0, -½, and -1 respectively. Note that the slices with real part ±½ contain the vertices of the hypercube, while the rest contain the vertices of the cross-polytope.

And here’s another great way to think about the binary tetrahedral group. We’ve seen that if you take every other vertex of a cube you get the vertices of a regular tetrahedron. Similarly, if you take every other vertex of a 4d hypercube you get a 4d cross-polytope. So, you can take the vertices of a 4d hypercube and partition them into the vertices of two cross-polytopes.

As a result, the 24 elements of the binary tetrahedral group can be partitioned into three cross-polytopes! Greg Egan shows how it looks:



The binary octahedral group

Now that we understand the binary tetrahedral group pretty well, we’re ready for our actual goal: understanding the binary octahedral group! We know this forms a group of 48 unit quaternions, and we know it acts as symmetries of the cube—with elements coming in pairs that act on the cube in the same way, because it’s a double cover of the rotational symmetry group of the cube.

So, we can partition its 48 elements into two kinds: those that preserve each tetrahedron in this picture, and those that switch these two tetrahedra:



The first 24 form a copy of the binary tetrahedral group and thus a 24-cell, as we have discussed. The second 24 form another 24-cell! And these two separate 24-cells are ‘dual’ to each other: the vertices of each one hover above the centers of the other’s faces.

Greg has nicely animated the 48 elements of the binary octahedral group here:



He’s colored them according to the rotations of the cube they represent:

• black: identity
• red: ±120° rotation around a V axis
• yellow: 180° rotation around an F axis
• blue: ±90° rotation around an F axis
• cyan: 180° rotation around an E axis

Here ‘V, F, and E axes’ join opposite vertices, faces, and edges of the cube.

Finally, note that because

• we can partition the 48 vertices of the binary octahedral group into two 24-cells

and

• we can partition the 24 vertices of the 24-cell into three cross-polytopes

it follows that we can partition the 48 vertices of the binary octahedral group into six cross-polytopes.

I don’t know the deep meaning of this fact. I know that the vertices of the 24-cell correspond to the 24 roots of the Lie algebra \mathfrak{so}(8). I know that the famous ‘triality’ symmetry of \mathfrak{so}(8) permutes the three cross-polytopes in the 24-cell, which are in some rather sneaky way related to the three 8-dimensional irreducible representations of \mathfrak{so}(8). I also know that if we take the two 24-cells in the binary octahedral group, and expand one by a factor of \sqrt{2}, so that the vertices of the other lie exactly at the centers of its faces, we get the 48 roots of the Lie algebra \mathfrak{f}_4. But I don’t know how to extend this story to get a nice story about the six cross-polytopes in the binary octahedral group.

All I know is that if you pick a quaternion group sitting in the binary octahedral group, it will have 6 cosets, and these will be six cross-polytopes.
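
Here’s a little script that checks this (my own sketch). It builds the 48 unit quaternions of the binary octahedral group, forms the cosets of the quaternion group \{\pm 1, \pm i, \pm j, \pm k\}, and verifies that there are six of them, each a cross-polytope: 8 points whose pairwise distances are all \sqrt{2} or 2:

```python
import itertools
import numpy as np

def qmul(a, b):
    """Multiply quaternions given as (w, x, y, z) tuples."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

# The binary tetrahedral group: +-1, +-i, +-j, +-k and (+-1 +- i +- j +- k)/2.
Q = [tuple(s*(n == k) for n in range(4)) for k in range(4) for s in (1, -1)]
tet = Q + [tuple(s/2 for s in signs) for signs in itertools.product((1, -1), repeat=4)]
# The other 24 elements: (+-u +- v)/sqrt(2) for distinct basis quaternions u, v.
c = 1/np.sqrt(2)
rest = [tuple((s1*c if n == a else s2*c if n == b else 0) for n in range(4))
        for a, b in itertools.combinations(range(4), 2)
        for s1 in (1, -1) for s2 in (1, -1)]
octa = tet + rest                                  # binary octahedral group, 48 elements

key = lambda v: tuple(round(x, 9) for x in v)      # hashable, float-safe coordinates
cosets = {frozenset(key(qmul(g, h)) for h in Q) for g in octa}
print(len(cosets))                                 # prints 6

for coset in cosets:                               # each coset is a cross-polytope
    pts = [np.array(v) for v in coset]
    dists = {round(float(np.linalg.norm(a - b)), 6)
             for a, b in itertools.combinations(pts, 2)}
    assert dists == {round(np.sqrt(2), 6), 2.0}
```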


The Madelung Rules

8 December, 2021

I’ve been thinking about chemistry lately. I’m always amazed by how far we can get in the study of multi-electron atoms using ideas from the hydrogen atom, which has just one electron.

If we ignore the electron’s spin and special relativity, and use just Schrödinger’s equation, the hydrogen atom is exactly solvable—and a key part of any thorough introduction to quantum mechanics. So I’ll zip through that material now, saying just enough to introduce the ‘Madelung rules’, which are two incredibly important rules of thumb for how electrons behave in the ground states of multi-electron atoms.

If we take the Hilbert space \mathbf{H} of bound states of the hydrogen atom, ignoring electron spin, we can decompose it as a direct sum of subspaces:

\displaystyle{ \mathbf{H} = \bigoplus_{n = 1}^\infty \mathbf{H}_n }

where the energy equals -1/n^2 in the subspace \mathbf{H}_n.

Since the hydrogen atom has rotational symmetry, each subspace \mathbf{H}_n further decomposes into irreducible representations of the rotation group \mathrm{SO}(3). Any such representation is classified by a natural number \ell = 0, 1, 2, \dots. Physically this number describes angular momentum; mathematically it describes the dimension of the representation. The angular momentum is \sqrt{\ell(\ell + 1)}, while the dimension is 2\ell + 1. Concretely, we can think of this representation as the space of homogeneous polynomials of degree \ell with vanishing Laplacian. I’ll say a bit more about this next time.

The subspace \mathbf{H}_n decomposes as a direct sum of irreducible representations with 0 \le \ell \le n - 1, like this:

\displaystyle{\mathbf{H}_n = \bigoplus_{\ell = 0}^{n-1} \mathbf{H}_{n,\ell} }

So, in the subspace \mathbf{H}_{n,\ell} the electron has energy -1/n^2 and angular momentum \sqrt{\ell(\ell + 1)}.

I’ve been ignoring the electron’s spin, but we shouldn’t ignore it completely, because it’s very important for understanding the periodic table and chemistry in general. To take it into account, all I will do is tensor \mathbf{H} with \mathbb{C}^2, to account for the electron’s two spin states. So, the true Hilbert space of bound states of a hydrogen atom is \mathbf{H} \otimes \mathbb{C}^2.

By what I’ve said, we can decompose this Hilbert space into subspaces

\mathbf{H}_n \otimes \mathbb{C}^2

called shells, getting this:

\displaystyle{ \mathbf{H} \otimes \mathbb{C}^2 = \bigoplus_{n = 1}^\infty \mathbf{H}_n \otimes \mathbb{C}^2 }

We can then decompose the shells further into subspaces

\mathbf{H}_{n,\ell} \otimes \mathbb{C}^2

called subshells, obtaining our final result:

\displaystyle{ \mathbf{H} \otimes \mathbb{C}^2 = \bigoplus_{n = 1}^\infty \bigoplus_{\ell = 0}^{n-1} \mathbf{H}_{n , \ell} \otimes \mathbb{C}^2 }

Since

\dim(\mathbf{H}_{n,\ell}) = 2\ell + 1

the dimensions of the subshells are

\dim(\mathbf{H}_{n,\ell} \otimes \mathbb{C}^2) = 2(2\ell + 1)

or in other words, twice the odd numbers:

2, 6, 10, 14, \dots

This lets us calculate the dimensions of the shells:

\displaystyle{ \dim(\mathbf{H}_{n} \otimes \mathbb{C}^2) = \sum_{\ell = 0}^{n-1} \dim(\mathbf{H}_{n,\ell} \otimes \mathbb{C}^2)  = 2\sum_{\ell = 0}^{n-1} (2\ell + 1) = 2n^2}

since the sum of the first n odd numbers is a perfect square: n^2. So, the dimensions of the shells are twice the perfect squares:

2, 8, 18, 32, \dots

Okay, so much for hydrogen!

Now, the ‘Aufbau principle’ says that as we keep going through the periodic table, looking at elements with more and more electrons, these electrons will act approximately as if each one occupies a subshell. Thanks to the Pauli exclusion principle, we can’t put more electrons into some subshell than the dimension of that subshell. So the big question is: which subshell does each electron go into?

This is the question that the Madelung rules answer! Here’s what they say:

  1. Electrons are assigned to subshells in order of increasing values of n + \ell.
  2. For subshells with the same value of n + \ell, electrons are assigned first to the subshell with lower n—or equivalently, higher \ell.

They aren’t always right, but they’re damned good, given that in reality we’ve got a bunch of electrons interacting with each other—and not even described by separate wavefunctions, but really one big fat ‘entangled’ wavefunction.

Here’s what the Madelung rules predict:

So, subshells get filled in order of increasing n + \ell, and for any choice of n + \ell we start with the biggest possible \ell and work down.

This chart uses some old-fashioned but still very popular notation for the subshells. We say the number n, but instead of saying the number \ell we use a letter:

\ell = 0: s
\ell = 1: p
\ell = 2: d
\ell = 3: f

The reasons for these letters would make for a long and thrilling story, but not today.
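
Here’s a tiny script (my own sketch) that implements the two rules, sorting subshells lexicographically by (n + \ell, n) and printing the predicted filling order in this notation:

```python
letters = 'spdf'                       # l = 0, 1, 2, 3: enough for the known elements
subshells = [(n, l) for n in range(1, 8) for l in range(min(n, 4))]
# Madelung rules: sort by n + l, breaking ties by lower n (equivalently, higher l).
for n, l in sorted(subshells, key=lambda nl: (nl[0] + nl[1], nl[0])):
    print(f'{n}{letters[l]}', end=' ')
# 1s 2s 2p 3s 3p 4s 3d 4p 5s 4d 5p 6s 4f 5d 6p 7s 5f 6d 7p
```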

The history of the Madelung rules

At this point I should go through the periodic table and show you how well the Madelung rules predict what’s going on. Basically, as we fill a particular subshell we get a bunch of related elements, and then as we go on up to the next subshell we get another bunch—and we can understand a lot about their properties! But there are also plenty of deviations from the Madelung rules, which are also interesting. I’ll talk about all these things next time.

I should also say a bit about why the Madelung rules work as well as they do! For example, what’s the importance of n + \ell?

But that’s not what I have lined up for today. Sorry! Instead, I want to talk about something much less important: the historical origin of the Madelung rules. According to Wikipedia they have many names:

• the Madelung rule (after Erwin Madelung)
• the Janet rule (after Charles Janet)
• the Klechkowsky rule (after Vsevolod Klechkovsky)
• Wiswesser’s rule (after William Wiswesser)
• the Aufbau approximation
• the diagonal rule, and
• the Uncle Wiggly path.

Seriously! Uncle Wiggly!

Understanding the history of these rules is going to be difficult. You can read a lot about their prehistory here:

• Wikipedia, Aufbau principle.

Bohr and Sommerfeld played a big role in setting up the theory of shells and subshells, and perhaps the whole idea of ‘Aufbau’: German for ‘building up’ atoms one electron at a time.

But I was confused about whether Madelung discovered both rules or just one, especially because a lot of people say ‘Madelung rule’ in the singular, when there are really two. So I asked on History of Science and Mathematics Stackexchange:

There are two widely used rules of thumb to determine which subshells are filled in a neutral atom in its ground state:

• Electrons are assigned to subshells in order of increasing value of n + \ell.

• For subshells with the same value of n + \ell, electrons are assigned first to the subshell with lower n.

These rules don’t always hold, but that’s not my concern here. My question is: which of these rules did Erwin Madelung discover? (Both? Just one?)

Wikipedia seems to say Charles Janet discovered the first in 1928:

A periodic table in which each row corresponds to one value of n + \ell (where the values of n and \ell correspond to the principal and azimuthal quantum numbers respectively) was suggested by Charles Janet in 1928, and in 1930 he made explicit the quantum basis of this pattern, based on knowledge of atomic ground states determined by the analysis of atomic spectra. This table came to be referred to as the left-step table. Janet “adjusted” some of the actual n + \ell values of the elements, since they did not accord with his energy ordering rule, and he considered that the discrepancies involved must have arisen from measurement errors. In the event, the actual values were correct and the n + \ell energy ordering rule turned out to be an approximation rather than a perfect fit, although for all elements that are exceptions the regularised configuration is a low-energy excited state, well within reach of chemical bond energies.

It then goes on to say:

In 1936, the German physicist Erwin Madelung proposed this as an empirical rule for the order of filling atomic subshells, and most English-language sources therefore refer to the Madelung rule. Madelung may have been aware of this pattern as early as 1926.

Is “this” still rule 1?

It then goes on to say:

In 1945 William Wiswesser proposed that the subshells are filled in order of increasing values of the function

W(n,\ell) = n + \ell - \frac{\ell}{\ell+1}

This is equivalent to the combination of rules 1 and 2. So, this article seems to suggest that rule 2 was discovered by Wiswesser. But I have my doubts. Goudsmit and Richards write:

Madelung discovered a simple empirical rule for neutral atoms. It consists of two parts.

and then they list both 1 and 2.

I have not yet managed to get ahold of Erwin Madelung’s work, e.g.

• E. Madelung, Die mathematischen Hilfsmittel des Physikers, Springer, Berlin, 1936

I got a helpful reply from M. Farooq:

The only relevant section in E. Madelung’s edited book, Die mathematischen Hilfsmittel des Physikers, Springer: Berlin, 1936 is a small paragraph. It seems Madelung just used this for electron book-keeping and he had no justification for his proposition. He calls it a lexicographic order (lexikographische Ordnung).

I was searching for it some other reasons several months ago: Question in SE Physics. The original and the (machine but edited) translations is shared.

15 Atomic structure (electron catalog) (to p. 301).

The eigenfunction of an atom, consisting of Z electrons and a Z-times positively charged nucleus, can be constructed in the case of removed degeneracy, in first approximation, as a product of Z hydrogen eigenfunctions (cf. p. 356), each of which is specified by four quantum numbers n, \ell, m, s satisfying n > 0, n-1 \geqq \ell \geqq 0, s = \pm \frac{1}{2}, |m| \leqq \ell. According to Pauli’s principle, no two of these functions may coincide in all four quantum numbers. According to the Bohr principle of structure, an atom with Z electrons is formed from an atom with (Z-1) electrons by adding another one (and increasing the nuclear charge by 1) without changing the quantum numbers of the already existing electrons. Therefore a catalog can be set up, from whose first Z positions the atom in its basic state is in each case built up (cf. the table p. 360).

The ordering principle of this catalog is a lexicographic order according to the numbers (n+\ell), n, s, m. A theoretical justification of just this arrangement is not yet available. One reads from it:

1. The periodic system of the elements. Two atoms are homologous if in each case their “last electron” coincides in \ell, m, s.

2. The spectroscopic character of the basic term, entered in column 10. Namely, |\Sigma m| = 0, 1, 2, 3, \ldots gives the character S, P, D, F, G, H, I, \ldots and (2|\Sigma s|+1) gives the multiplicity.

3. The possibilities for excited states (possible terms), where not all Z electrons are in the first Z positions of the catalog.

The catalog is the representational form of an empirical rule. It idealizes experience, because in some cases deviations are observed.

So, it’s clear that Madelung knew and efficiently stated both rules in 1936! By the way, ‘lexicographic order’ is the standard mathematical term for how we’re ordering the pairs (n+\ell, n) in the Madelung rules.

References

I took the handsome chart illustrating the Madelung rules from here:

• Aufbau Principle, Chemistry God, 18 February 2020.

This blog article is definitely worth reading!


Compositional Thermostatics (Part 1)

22 November, 2021

At the Topos Institute this summer, a group of folks started talking about thermodynamics and category theory. It probably started because Spencer Breiner and my former student Joe Moeller, both working at NIST, were talking about thermodynamics with some people there. But I’ve been interested in thermodynamics for quite a while now—and Owen Lynch, a grad student visiting from the University of Utrecht, wanted to do his master’s thesis on the subject. He’s now working with me. Sophie Libkind, David Spivak and David Jaz Myers also joined in: they’re especially interested in open systems and how they interact.

Prompted by these conversations, a subset of us eventually wrote a paper on the foundations of equilibrium thermodynamics:

• John Baez, Owen Lynch and Joe Moeller, Compositional thermostatics.

The idea here is to describe classical thermodynamics, classical statistical mechanics and quantum statistical mechanics in a unified framework based on entropy maximization. This framework can also handle ‘generalized probabilistic theories’ of the sort studied in quantum foundations—that is, theories like quantum mechanics, but more general.

To unify all these theories, we define a ‘thermostatic system’ to be any convex space X of ‘states’ together with a concave function

S \colon X \to [-\infty, \infty]

assigning to each state an ‘entropy’.

Whenever several such systems are combined and allowed to come to equilibrium, the new equilibrium state maximizes the total entropy subject to constraints. We explain how to express this idea using an operad. Intuitively speaking, the operad we construct has as operations all possible ways of combining thermostatic systems. For example, there is an operation that combines two gases in such a way that they can exchange energy and volume, but not particles—and another operation that lets them exchange only particles, and so on.

It is crucial to use a sufficiently general concept of ‘convex space’, which need not be a convex subset of a vector space. Luckily there has been a lot of work on this, so we can just grab a good definition off the shelf:

Definition. A convex space is a set X with an operation c_\lambda \colon X \times X \to X for each \lambda \in [0, 1] such that the following identities hold:

1) c_1(x, y) = x

2) c_\lambda(x, x) = x

3) c_\lambda(x, y) = c_{1-\lambda}(y, x)

4) c_\lambda(c_\mu(x, y) , z) = c_{\lambda'}(x, c_{\mu'}(y, z)) for all 0 \le \lambda, \mu, \lambda', \mu' \le 1 satisfying \lambda\mu = \lambda' and 1-\lambda = (1-\lambda')(1-\mu').

To understand these axioms, especially the last, you need to check that any vector space is a convex space with

c_\lambda(x, y) = \lambda x + (1-\lambda)y

So, these operations c_\lambda describe ‘convex linear combinations’.
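
If you want to see that check done symbolically, here is a tiny sympy sketch of axiom 4 for this choice of c_\lambda (scalars suffice, since everything works componentwise):

```python
import sympy as sp

lam, mu, x, y, z = sp.symbols('lambda mu x y z')

c = lambda t, a, b: t*a + (1 - t)*b       # c_t(a, b) in a vector space

lam2 = lam*mu                             # lambda' = lambda mu
mu2 = 1 - (1 - lam)/(1 - lam2)            # solves 1 - lambda = (1 - lambda')(1 - mu')
expr = c(lam, c(mu, x, y), z) - c(lam2, x, c(mu2, y, z))
print(sp.simplify(expr))                  # prints 0, so axiom 4 holds
```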

Indeed, any subset of a vector space closed under convex linear combinations is a convex space! But there are other examples too.

In 1949, the famous mathematician Marshall Stone invented ‘barycentric algebras’. These are convex spaces satisfying one extra axiom: the cancellation axiom, which says that whenever \lambda \ne 0,

c_\lambda(x,y) = c_\lambda(x',y) \implies x = x'

He proved that any barycentric algebra is isomorphic to a convex subset of a vector space. Later Walter Neumann noted that a convex space, defined as above, is isomorphic to a convex subset of a vector space if and only if the cancellation axiom holds.

Dropping the cancellation axiom has convenient formal consequences, since the resulting more general convex spaces can then be defined as algebras of a finitary commutative monad, giving the category of convex spaces very good properties.

But dropping this axiom is no mere formal nicety. In our definition of ‘thermostatic system’, we need the set of possible values of entropy to be a convex space. One obvious candidate is the set [0,\infty). However, for a well-behaved formalism based on entropy maximization, we want the supremum of any set of entropies to be well-defined. This forces us to consider the larger set [0,\infty], which does not obey the cancellation axiom.

But even that is not good enough! In thermodynamics you often read about the ‘heat bath’, an idealized system that can absorb or emit an arbitrarily large amount of energy while keeping a fixed temperature. We want to treat the ‘heat bath’ as a thermostatic system on an equal footing with any other. To do this, we need to consider negative entropies—not because the heat bath can have negative entropy, but because it acts as an infinite reservoir of entropy, and the change in entropy from its default state can be positive or negative.

This suggests letting entropies take values in the convex space \mathbb{R}. But then the requirement that any set of entropies have a supremum (including empty and unbounded sets) forces us to use the larger convex space [-\infty,\infty].

This does not obey the cancellation axiom, so there is no way to think of it as a convex subset of a vector space. In fact, it’s not even immediately obvious how to make it into a convex space at all! After all, what do you get when you take a nontrivial convex linear combination of \infty and -\infty? You’ll have to read our paper for the answer, and the justification.

We then define a thermostatic system to be a convex space X together with a concave function

S \colon X \to [-\infty, \infty]

where concave means that

S(c_\lambda(x,y)) \ge c_\lambda(S(x), S(y))

We give lots of examples from classical thermodynamics, classical and quantum statistical mechanics, and beyond—including our friend the ‘heat bath’.

For example, suppose X is the set of probability distributions on an n-element set, and suppose S \colon X \to [-\infty, \infty] is the Shannon entropy

\displaystyle{ S(p) = - \sum_{i = 1}^n p_i \log p_i }

Then given two probability distributions p and q, we have

S(\lambda p + (1-\lambda) q) \ge \lambda S(p) + (1-\lambda) S(q)

for all \lambda \in [0,1]. So this entropy function is concave, and S \colon X \to [-\infty, \infty] defines a thermostatic system. But in this example the entropy only takes nonnegative values, and is never infinite, so you need to look at other examples to see why we want to let entropy take values in [-\infty,\infty].
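
Here’s a quick random spot-check of that concavity inequality, just as a sanity test (a sketch, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def S(p):
    """Shannon entropy, with the convention 0 log 0 = 0."""
    p = p[p > 0]
    return -(p*np.log(p)).sum()

for _ in range(1000):
    p, q = rng.dirichlet(np.ones(5)), rng.dirichlet(np.ones(5))
    lam = rng.random()
    # concavity: S(lam p + (1 - lam) q) >= lam S(p) + (1 - lam) S(q)
    assert S(lam*p + (1 - lam)*q) >= lam*S(p) + (1 - lam)*S(q) - 1e-12
```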

After looking at examples of thermostatic systems, we define an operad whose operations are convex-linear relations from a product of convex spaces to a single convex space. And then we prove that thermostatic systems give an algebra for this operad: that is, we can really stick together thermostatic systems in all these ways. The trick is computing the entropy function of the new composed system from the entropy functions of its parts: this is where entropy maximization comes in.

For a nice introduction to these ideas, check out Owen’s blog article:

• Owen Lynch, Compositional thermostatics, Topos Institute Blog, 9 September 2021.

And then comes the really interesting part: checking that this adequately captures many of the examples physicists have thought about!

The picture at the top of this post shows one that we discuss: two cylinders of ideal gas with a movable divider between them that’s permeable to heat. Yes, this is an operation in an operad—and if you tell us the entropy function of each cylinder of gas, our formalism will automatically compute the entropy function of the resulting combination of these two cylinders.

There are many other examples. Did you ever hear of the ‘canonical ensemble’, the ‘microcanonical ensemble’, or the ‘grand canonical ensemble’? Those are famous things in statistical mechanics. We show how our formalism recovers these.

I’m sure there’s much more to be done. But I feel happy to see modern math being put to good use: making the foundations of thermodynamics more precise. Once Vladimir Arnol’d wrote:

Every mathematician knows that it is impossible to understand any elementary course in thermodynamics.

I’m not sure our work will help with that—and indeed, it’s possible that once the mathematicians finally understand thermodynamics, physicists won’t understand what the mathematicians are talking about! But at least we’re clearly seeing some more of the mathematical structures that are hinted at, but not fully spelled out, in such an elementary course.

I expect that our work will interact nicely with Simon Willerton’s work on the Legendre transform. The Legendre transform of a concave (or convex) function is widely used in thermostatics, and Simon describes this for functions valued in [-\infty,\infty] using enriched profunctors:

• Simon Willerton, Enrichment and the Legendre–Fenchel transform I, The n-Category Café, April 16, 2014.

• Simon Willerton, Enrichment and the Legendre–Fenchel transform II, The n-Category Café, May 22, 2014.

He also has a paper on this, and you can see him talk about it on YouTube.


See all four parts of this series:

Part 1: thermostatic systems and convex sets.

Part 2: composing thermostatic systems.

Part 3: operads and their algebras.

Part 4: the operad for composing thermostatic systems.


The Kuramoto–Sivashinsky Equation (Part 7)

3 November, 2021

I have a lot of catching up to do. I want to share a bunch of work by Steve Huntsman. I’ll start with some older material. A bit of this may be ‘outdated’ by his later work, but I figure it’s all worth recording.

One goal here is to define ‘stripes’ for the Kuramoto–Sivashinsky equation in a way that lets us count them, their births, their mergers, and so on. We need a good definition to test the conjectures I made in Part 1.

While I originally formulated my conjectures for the ‘integral form’ of the Kuramoto–Sivashinsky equation:

h_t + h_{xx} + h_{xxxx} + \frac{1}{2} (h_x)^2 = 0

Steve has mostly been working with the derivative form:

u_t + u_{xx} + u_{xxxx} + u u_x = 0

so you can assume that unless I say otherwise. He’s using periodic boundary conditions such that

u(t,x) = u(t,x+L)

for some length L. The length depends on the particular experiment he’s doing.

First, a plot of stripes. It looks like L = 100 here:



Births and deaths are shown as green and red dots, respectively. But to see them, you may need to click on the picture to enlarge it!

According to my conjecture there should be no red dots. The red dots at the top and the bottom of the image don’t count: they mostly arise because this program doesn’t take the periodic boundary conditions into account. There are two other red dots, which are worth thinking about.

Nice! But how are stripes being defined here? He describes how:

The stripe definition is mostly pretty simple and not image processy at all, and the trick to improve it is limited to removing little blobs and is easily explained.

Let u(t,x) be the solution to the KSE. Then let

v(t,x) := u(t,x) - u(t,x+a)

where a is the average integer offset (maybe I’m missing a minus sign a la -a) that maximizes the cross-correlation between u(t,x) and -u(t,x+a). Now anywhere v exceeds its median is part of a stripe.

The image processing trick is that I delete little stripes (and I use what image processors would call 4-connectivity to define simply connected regions—this is the conservative idea that a pixel should have a neighbor to the north, south, east, or west to be connected to that neighbor, instead of the aggressive 8-connectivity that allows NE, NW, SE, SW too) whose area is less than 1000 grid points. So it uses lots of image processing machinery to actually do its job, but the definition is simple and easily explained mathematically.

An obvious fix that removes the two nontrivial deaths in the picture I sent is to require a death to be sufficiently far away from another stripe: here I am guessing that the characteristic radius of a stripe will work just fine.
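
Putting that recipe into code, here’s a minimal sketch of the stripe detector. This is my reconstruction from the description above, not Steve’s actual program; in particular, the way the offset a is chosen (maximizing the time-averaged cross-correlation over integer shifts, with periodic wraparound) is my guess at the details:

```python
import numpy as np
from scipy import ndimage

def stripes(u, min_area=1000):
    """Binary stripe mask for a KSE solution u[t, x] with periodic x."""
    T, X = u.shape
    # Pick the integer offset a maximizing the cross-correlation of
    # u(t, x) with -u(t, x + a), averaged over the whole solution.
    scores = [(u * -np.roll(u, -a, axis=1)).mean() for a in range(X)]
    a = int(np.argmax(scores))
    v = u - np.roll(u, -a, axis=1)        # v(t, x) = u(t, x) - u(t, x + a)
    mask = v > np.median(v)               # anywhere v exceeds its median
    # Delete little blobs: label connected regions with 4-connectivity
    # (ndimage.label's default in 2d) and drop those under min_area pixels.
    labels, n = ndimage.label(mask)
    areas = ndimage.sum(mask, labels, index=range(1, n + 1))
    return np.isin(labels, 1 + np.flatnonzero(areas >= min_area))
```

Births, deaths and mergers could then be read off by tracking the connected components of this mask from one time step to the next.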


Learn Applied Category Theory!

27 October, 2021

Do you like the idea of learning applied category theory by working on a project, as part of a team led by an expert? If you’re an early career researcher you can apply to do that now!

Mathematical Research Community: Applied Category Theory, meeting 2022 May 29–June 4. Details on how to apply: here. Deadline to apply: Tuesday 2022 February 15 at 11:59 Eastern Time.

After working with your team online, you’ll take an all-expenses-paid trip to a conference center in upstate New York for a week in the summer. There will be a pool, bocci, lakes with canoes, woods to hike around in, campfires at night… and also whiteboards, meeting rooms, and coffee available 24 hours a day to power your research!

Later you’ll get invited to the 2023 Joint Mathematics Meetings in Boston.

There will be three projects to choose from:

Valeria de Paiva (Topos Institute) will lead a study in the context of computer science that investigates indexed containers and partial compilers using lenses and Dialectica categories.

Nina Otter (Queen Mary University of London) will lead a study of social networks using simplicial complexes.

John Baez (University of California, Riverside) will lead a study of chemical reaction networks using category theoretic methods such as structured cospans.

The whole thing is being organized by Daniel Cicala of the University of New Haven and Simon Cho of Two Six Technologies.

I should add that this is just one of four ‘Mathematical Research Communities’ run by the American Mathematical Society in 2022, and you may prefer another. The applied category theory session will be held at the same time and place as one on data science! Then there are two more:

• Week 1a: Applied Category Theory

Organizers: John Baez, University of California, Riverside; Simon Cho, Two Six Technologies; Daniel Cicala, University of New Haven; Nina Otter, Queen Mary University of London; Valeria de Paiva, Topos Institute.

• Week 1b: Data Science at the Crossroads of Analysis, Geometry, and Topology

Organizers: Marina Meila, University of Washington; Facundo Mémoli, The Ohio State University; Jose Perea, Northeastern University; Nicolas Garcia Trillos, University of Wisconsin-Madison; Soledad Villar, Johns Hopkins University.

• Week 2a: Models and Methods for Sparse (Hyper)Network Science

Organizers: Sinan G. Aksoy, Pacific Northwest National Laboratory; Aric Hagberg, Los Alamos National Laboratory; Cliff Joslyn, Pacific Northwest National Laboratory; Bill Kay, Oak Ridge National Laboratory; Emilie Purvine, Pacific Northwest National Laboratory; Stephen J. Young, Pacific Northwest National Laboratory; Jennifer Webster, Pacific Northwest National Laboratory.

• Week 2b: Trees in Many Contexts

Organizers: Miklós Bóna, University of Florida; Éva Czabarka, University of South Carolina; Heather Smith Blake, Davidson College; Stephan Wagner, Uppsala University; Hua Wang, Georgia Southern University.

Applicants should be ready to engage in collaborative research and should be “early career”—either expecting to earn a PhD within two years or having completed a PhD within five years of the date of the summer conference. Exceptions to this limit on the career stage of an applicant may be made on a case-by-case basis. The Mathematical Research Community (MRC) program is open to individuals who are US citizens as well as to those who are affiliated with US institutions and companies/organizations. A few international participants may be accepted. Depending on space and other factors, a small number of self-funded participants may be admitted. Individuals who have once previously been an MRC participant will be considered for admission, and their applications must include a rationale for repeating. Please note that individuals cannot participate in the MRC program more than twice: applications from individuals who have twice been MRC participants will not be considered.

We seek individuals who will both contribute to and benefit from the MRC experience, and the goal is to create a collaborative research community that is vibrant, productive, and diverse. We welcome applicants from academic institutions of all types, as well as from private industry and government laboratories and agencies. Women and under-represented minorities are especially encouraged to apply.

All participants are expected to be active in the full array of MRC activities—the summer conference, special sessions at the Joint Mathematics Meetings, and follow-up collaborations.


The Kuramoto–Sivashinsky Equation (Part 6)

25 October, 2021

guest post by Theodore Kolokolnikov

I coded up a simple dynamical system with the following rules (loosely motivated by the theory of motion of spikes in reaction-diffusion systems, see e.g. the appendix of this paper, as well as this paper):

• insert a particle if inter-particle distance is more than some maxdist
• merge any two particles that collide
• otherwise evolve particles according to the ODE

\displaystyle{ x'_k(t) = \sum_{j=1}^N G_x(x_k, x_j) }

Here, G is a Green’s function that satisfies

G_{xx}-\lambda^2 G = -\delta(x,y)

inside the interval [-L,L] with Neumann boundary conditions G_x(\pm L, y)=0. Explicitly,

\displaystyle{ G(x,y) = \frac{\cosh((x+y)\lambda)+\cosh((2L-|x-y|) \lambda )}{2 \lambda \sinh(2L \lambda ) }  }

and

\displaystyle{ G_x(x,y)= \frac{\sinh((x+y) \lambda )+ \sinh((|x-y|-2L) \lambda ) \mbox{sign}(x-y) } {2 \sinh(2L\lambda)} }

where sign(0) is taken to be zero so that

\displaystyle{ G_x(y,y) := \frac{G_x(y^+,y)+ G_x(y^-,y)}{2} }

In particular, for large \lambda, one has

\displaystyle{ G(x,y)\sim\frac{e^{-\lambda | x-y|}}{2\lambda} }

and

\displaystyle{ G_x(x,y)\sim-\frac{e^{-\lambda | x-y|}} {2} \mbox{sign}(x-y), ~~\lambda \gg 1 }

Here are some of the resulting simulations, with different \lambda (including complex \lambda). This is mainly just for fun but there is a wide range of behaviours. In particular, I think the large-\lambda limit should be able to capture analogous dynamics in the Keller–Segel model with logistic growth (Hillen et al.), see e.g. figures in this paper.
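
For concreteness, here is a minimal sketch of such a simulation. This is my reconstruction, not Theodore’s code: the explicit G_x is the one above, but the time step, the forward Euler scheme, and the insertion and merge thresholds (maxdist and mergedist) are all assumed choices:

```python
import numpy as np

L, lam = 5.0, 2.0                 # half-length of [-L, L] and lambda (assumed values)
maxdist, mergedist = 2.0, 0.05    # insertion / merge thresholds (assumed values)

def Gx(x, y):
    """G_x for the Neumann Green's function on [-L, L], with sign(0) = 0."""
    s = np.sign(x - y)
    return (np.sinh((x + y)*lam) + np.sinh((np.abs(x - y) - 2*L)*lam)*s) \
           / (2*np.sinh(2*L*lam))

def step(x, dt=0.01):
    """One forward Euler step of x_k' = sum_j G_x(x_k, x_j), then merge and insert."""
    x = np.sort(x + dt*np.array([Gx(xk, x).sum() for xk in x]))
    # Merge any two particles that collide (come within mergedist).
    merged, k = [], 0
    while k < len(x):
        if k + 1 < len(x) and x[k + 1] - x[k] < mergedist:
            merged.append((x[k] + x[k + 1])/2)
            k += 2
        else:
            merged.append(x[k])
            k += 1
    x = np.array(merged)
    # Insert a particle at the midpoint of any gap wider than maxdist.
    gaps = np.flatnonzero(np.diff(x) > maxdist)
    return np.sort(np.append(x, (x[gaps] + x[gaps + 1])/2)) if gaps.size else x

x = np.sort(np.random.default_rng(1).uniform(-L, L, 10))
for _ in range(1000):
    x = step(x)
```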