Serpentine Barrens

19 February, 2021


This is the Soldiers Delight Natural Environmental Area, a nature reserve in Maryland. The early colonial records of Maryland describe the area as a hunting ground for Native Americans. In 1693, rangers in the King’s service from a nearby garrison patrolled this area and named it Soldiers Delight, for some unknown reason.

It may not look like much, but that’s exactly the point! In this otherwise lush land, why does it look like nothing but grass and a few scattered trees are growing here?

It’s because this area is a serpentine barrens. Serpentine is a kind of rock: actually a class of closely related minerals which get their name from their smooth or scaly green appearance.

Soils formed from serpentine are toxic to many plants because they have lots of nickel, chromium, and cobalt! Plants are also discouraged because these soils have little potassium and phosphorus, not much calcium, and too much magnesium. Serpentine, you see, is made of magnesium, silicon, iron, hydrogen and oxygen.

As a result, the plants that actually do well in serpentine barrens are very specialized: some small beautiful flowers, for example. Indeed, there are nature reserves devoted to protecting these! One of the most dramatic is the Tablelands of Gros Morne National Park in Newfoundland:

Scott Weidensaul writes this about the Tablelands:

These are hardly garden spots, and virtually no animals live here except for birds and the odd caribou passing through. Yet some plants manage to eke out a living. Balsam ragwort, a relative of the cat’s-paw ragwort of the shale barrens, has managed to cope with the toxins and can tolerate up to 12 percent of its dry weight in magnesium—a concentration that would level most flowers. Even the common pitcher-plant, a species normally associated with bogs, has a niche in this near-desert, growing along the edges of spring seeps where subsurface water brings up a little calcium. By supplementing soil nourishment with a diet of insects trapped in its upright tubes, the pitcher-plant is able to augment the Tablelands’ miserly offerings. Several other carnivorous plants, including sundews and butterwort, work the same trick on their environment.

In North America, serpentine barrens can be found in the Appalachian Mountains—Gros Morne is at the northern end of these, and further south are the Soldiers Delight Natural Environmental Area in Maryland, and the State Line Serpentine Barrens on the border of Maryland and Pennsylvania.

There are also serpentine barrens in the coastal ranges of California, Oregon, and Washington. Here are some well-adapted flowers in the Klamath-Siskiyou Mountains on the border of California and Oregon:

I first thought about serpentine when the Azimuth Project was exploring ways of sucking carbon dioxide from the air. If you grind up serpentine and get it wet, it will absorb carbon dioxide! A kilogram of serpentine can dispose of about two-thirds of a kilogram of carbon dioxide. So, people have suggested this as a way to fight global warming.

Unfortunately we’re putting out over 37 gigatonnes of carbon dioxide per year. To absorb all of this we’d need to grind up about 55 gigatonnes of serpentine every year, spread it around, and get it wet. There’s plenty of serpentine available, but this is over ten times the amount of worldwide cement production, so it would take a lot of work. Then there’s the question of where to put all the ground-up rock.
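The arithmetic here is easy to check. A quick sketch using the figures just quoted (the cement figure is my own rough estimate, not from the text above):

```python
# Rough carbon arithmetic for serpentine weathering, using the figures above.
co2_emissions_gt = 37.0            # gigatonnes of CO2 emitted per year
co2_per_kg_rock = 2.0 / 3.0        # kg of CO2 absorbed per kg of serpentine

serpentine_needed_gt = co2_emissions_gt / co2_per_kg_rock
print(serpentine_needed_gt)        # about 55 gigatonnes per year

# For scale: world cement production is roughly 4 Gt/year (my estimate,
# not from the text), so this is over ten times that.
print(serpentine_needed_gt / 4.0)  # over 10
```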

And now I’ve learned that serpentine poses serious challenges to the growth of plant life! It doesn’t much matter, given that nobody seems eager to fight global warming by grinding up huge amounts of this rock. But it’s interesting.


The top picture of the Soldiers Delight Natural Environmental Area was taken by someone named Veggies. The picture of serpentine was apparently taken by Kluka. The Tablelands were photographed by Tango7174. All these are on Wikicommons. The quote comes from this wonderful book:

• Scott Weidensaul, Mountains of the Heart: A Natural History of the Appalachians, Fulcrum Publishing, 2016.

The picture of flowers in the Klamath-Siskiyous was taken by Susan Erwin and appears along with many other interesting things here:

Klamath-Siskiyou serpentines, U. S. Forest Service.

A quote:

It is crystal clear when you have entered the serpentine realm. There is no mistaking it, as the vegetation shift is sharp and dramatic. Full-canopied forests become sparse woodlands or barrens sometimes in a matter of a few feet. Dwarfed trees, low-lying shrubs, grassy patches, and rock characterize the dry, serpentine uplands. Carnivorous wetlands, meadows, and Port-Orford-cedar dominated riparian areas express the water that finds its way to the surface through fractured and faulted bedrock.

For more on serpentine, serpentinization, and serpentine barrens, try this blog article:

Serpentine, Hiker’s Notebook.

It’s enjoyable despite its misuse of the word ‘Weltanschauung’.


2 December, 2020


The Japanese take pride in ‘shinise’: businesses that have lasted for hundreds or even thousands of years. This points out an interesting alternative to the goal of profit maximization: maximizing the time of survival.

• Ben Dooley and Hisako Ueno, This Japanese shop is 1,020 years old. It knows a bit about surviving crises, New York Times, 2 December 2020.

Such enterprises may be less dynamic than those in other countries. But their resilience offers lessons for businesses in places like the United States, where the coronavirus has forced tens of thousands into bankruptcy.

“If you look at the economics textbooks, enterprises are supposed to be maximizing profits, scaling up their size, market share and growth rate. But these companies’ operating principles are completely different,” said Kenji Matsuoka, a professor emeritus of business at Ryukoku University in Kyoto.

“Their No. 1 priority is carrying on,” he added. “Each generation is like a runner in a relay race. What’s important is passing the baton.”

Japan is an old-business superpower. The country is home to more than 33,000 businesses with at least 100 years of history — over 40 percent of the world’s total. Over 3,100 have been running for at least two centuries. Around 140 have existed for more than 500 years. And at least 19 claim to have been continuously operating since the first millennium.

(Some of the oldest companies, including Ichiwa, cannot definitively trace their history back to their founding, but their timelines are accepted by the government, scholars and — in Ichiwa’s case — the competing mochi shop across the street.)

The businesses, known as “shinise,” are a source of both pride and fascination. Regional governments promote their products. Business management books explain the secrets of their success. And entire travel guides are devoted to them.

Of course if some businesses try to maximize time of survival, they may be small compared to businesses that are mainly trying to become “big”—at least if size is not the best road to long-term survival, which apparently it’s not. So we’ll have short-lived dinosaurs tromping around, and, dodging their footsteps, long-lived mice.

The idea of different organisms pursuing different strategies is familiar in ecology, where people talk about r-selected and K-selected organisms. The former “emphasize high growth rates, typically exploit less-crowded ecological niches, and produce many offspring, each of which has a relatively low probability of surviving to adulthood.” The latter “display traits associated with living at densities close to carrying capacity and typically are strong competitors in such crowded niches, that invest more heavily in fewer offspring, each of which has a relatively high probability of surviving to adulthood.”

But the contrast between r-selection and K-selection seems different to me than the contrast between profit maximization and lifespan maximization. As far as I know, no organism except humans deliberately tries to maximize the lifetime of anything.

And amusingly, the theory of r-selection versus K-selection may also be nearing the end of its life:

When Stearns reviewed the status of the theory in 1992, he noted that from 1977 to 1982 there was an average of 42 references to the theory per year in the BIOSIS literature search service, but from 1984 to 1989 the average dropped to 16 per year and continued to decline. He concluded that r/K theory was a once useful heuristic that no longer serves a purpose in life history theory.

For newer thoughts, see:

• D. Reznick, M. J. Bryant and F. Bashey, r- and K-selection revisited: the role of population regulation in life-history evolution, Ecology 83 (2002), 1509–1520.

See also:

• Innan Sasaki, How to build a business that lasts more than 200 years—lessons from Japan’s shinise companies, The Conversation, 6 June 2019.

Among other things, she writes:

We also found there to be a dark side to the success of these age-old shinise firms. At least half of the 17 companies we interviewed spoke of hardships in maintaining their high social status. They experienced peer pressure not to innovate (and solely focus on maintaining tradition) and had to make personal sacrifices to maintain their family and business continuity.

As the vice president of Shioyoshiken, a sweets company established in 1884, told us:

In a shinise, the firm is the same as the family. We need to sacrifice our own will and our own feelings and what we want to do … Inheriting and continuing the household is very important … We do not continue the business because we particularly like that industry. The fact that our family makes sweets is a coincidence. What is important is to continue the household as it is.

Innovations were sometimes discouraged by either the earlier family generation who were keen on maintaining the tradition, or peer shinise firms who cared about maintaining the tradition of the industry as a whole. Ultimately, we found that these businesses achieve such a long life through long-term sacrifice at both the personal and organisational level.

Fisher’s Fundamental Theorem (Part 3)

8 October, 2020

Last time we stated and proved a simple version of Fisher’s fundamental theorem of natural selection, which says that under some conditions, the rate of increase of the mean fitness equals the variance of the fitness. But the conditions we gave were very restrictive: namely, that the fitness of each species of replicator is constant, not depending on how many of these replicators there are, or any other replicators.

To broaden the scope of Fisher’s fundamental theorem we need to do one of two things:

1) change the left side of the equation: talk about some quantity other than the rate of change of mean fitness.

2) change the right side of the equation: talk about some quantity other than the variance in fitness.

Or we could do both! People have spent a lot of time generalizing Fisher’s fundamental theorem. I don’t think there are, or should be, any hard rules on what counts as a generalization.

But today we’ll take alternative 1). We’ll show the square of something called the ‘Fisher speed’ always equals the variance in fitness. One nice thing about this result is that we can drop the restrictive condition I mentioned. Another nice thing is that the Fisher speed is a concept from information theory! It’s defined using the Fisher metric on the space of probability distributions.

And yes—that metric is named after the same guy who proved Fisher’s fundamental theorem! So, arguably, Fisher should have proved this generalization of Fisher’s fundamental theorem. But in fact it seems that I was the first to prove it, around February 1st, 2017. Some similar results were already known, and I will discuss those someday. But they’re a bit different.

A good way to think about the Fisher speed is that it’s ‘the rate at which information is being updated’. A population of replicators of different species gives a probability distribution. Like any probability distribution, this has information in it. As the populations of our replicators change, the Fisher speed tells us the rate at which this information is being updated. So, in simple terms, we’ll show

The square of the rate at which information is updated is equal to the variance in fitness.

This is quite a change from Fisher’s original idea, namely:

The rate of increase of mean fitness is equal to the variance in fitness.

But it has the advantage of always being true… as long as the population dynamics are described by the general framework we introduced last time. So let me remind you of the general setup, and then prove the result!

The setup

We start out with population functions P_i \colon \mathbb{R} \to (0,\infty), one for each species of replicator i = 1,\dots,n, obeying the Lotka–Volterra equation

\displaystyle{ \frac{d P_i}{d t} = f_i(P_1, \dots, P_n) P_i }

for some differentiable functions f_i \colon (0,\infty)^n \to \mathbb{R} called fitness functions. The probability of a replicator being in the ith species is

\displaystyle{  p_i(t) = \frac{P_i(t)}{\sum_j P_j(t)} }

Using the Lotka–Volterra equation we showed last time that these probabilities obey the replicator equation

\displaystyle{ \dot{p}_i = \left( f_i(P) - \overline f(P) \right)  p_i }

Here P is short for the whole list of populations (P_1(t), \dots, P_n(t)), and

\displaystyle{ \overline f(P) = \sum_j f_j(P) p_j  }

is the mean fitness.
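Before moving on, here’s a tiny numerical sketch of this setup (all numbers made up), showing how populations determine probabilities, the mean fitness, and the replicator dynamics:

```python
# Toy example: populations -> probabilities -> mean fitness -> replicator equation.
populations = [100.0, 300.0, 600.0]    # P_1, P_2, P_3 (hypothetical)
fitness     = [0.1, 0.3, 0.2]          # f_i(P) at this instant (hypothetical)

total = sum(populations)
p = [Pi / total for Pi in populations]                 # p_i = P_i / sum_j P_j
mean_fitness = sum(f * q for f, q in zip(fitness, p))  # mean fitness = sum_j f_j p_j

print(p)               # [0.1, 0.3, 0.6]
print(mean_fitness)    # about 0.22

# Replicator equation: dp_i/dt = (f_i - mean_fitness) p_i
p_dot = [(f - mean_fitness) * q for f, q in zip(fitness, p)]
print(sum(p_dot))      # essentially zero: total probability is conserved
```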

The Fisher metric

The space of probability distributions on the set \{1, \dots, n\} is called the (n-1)-simplex

\Delta^{n-1} = \{ (x_1, \dots, x_n) : \; x_i \ge 0, \; \displaystyle{ \sum_{i=1}^n x_i = 1 } \}

It’s called \Delta^{n-1} because it’s (n-1)-dimensional. When n = 3 it looks like the letter \Delta:

The Fisher metric is a Riemannian metric on the interior of the (n-1)-simplex. That is, given a point p in the interior of \Delta^{n-1} and two tangent vectors v,w at this point the Fisher metric gives a number

g(v,w) = \displaystyle{ \sum_{i=1}^n \frac{v_i w_i}{p_i}  }

Here we are describing the tangent vectors v,w as vectors in \mathbb{R}^n with the property that the sum of their components is zero: that’s what makes them tangent to the (n-1)-simplex. And we’re demanding that p be in the interior of the simplex to avoid dividing by zero, since on the boundary of the simplex we have p_i = 0 for at least one choice of i.

If we have a probability distribution p(t) moving around in the interior of the (n-1)-simplex as a function of time, its Fisher speed is

\displaystyle{ \sqrt{g(\dot{p}(t), \dot{p}(t))} = \sqrt{ \sum_{i=1}^n \frac{\dot{p}_i(t)^2}{p_i(t)} } }

if the derivative \dot{p}(t) exists. This is the usual formula for the speed of a curve moving in a Riemannian manifold, specialized to the case at hand.
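In code, the Fisher metric and Fisher speed are one-liners. Here’s a minimal sketch (the point p and the tangent vector are made up):

```python
import math

# Fisher metric at a point p in the interior of the simplex:
#   g(v, w) = sum_i v_i w_i / p_i
def fisher_metric(p, v, w):
    return sum(vi * wi / pi for vi, wi, pi in zip(v, w, p))

# Fisher speed of a curve with position p and velocity p_dot:
def fisher_speed(p, p_dot):
    return math.sqrt(fisher_metric(p, p_dot, p_dot))

p = [0.2, 0.3, 0.5]          # a probability distribution (interior point)
v = [0.01, -0.04, 0.03]      # a tangent vector: components sum to zero

print(fisher_speed(p, v))
```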

Now we’ve got all the formulas we’ll need to prove the result we want. But for those who don’t already know and love it, it’s worthwhile saying a bit more about the Fisher metric.

The factor of 1/p_i in the Fisher metric changes the geometry of the simplex so that it becomes round, like a portion of a sphere:

But the reason the Fisher metric is important, I think, is its connection to relative information. Given two probability distributions p, q \in \Delta^{n-1}, the information of q relative to p is

\displaystyle{ I(q,p) = \sum_{i = 1}^n q_i \ln\left(\frac{q_i}{p_i}\right)   }

You can show this is the expected amount of information gained if p was your prior distribution and you receive information that causes you to update your prior to q. So, sometimes it’s called the information gain. It’s also called relative entropy or—my least favorite, since it sounds so mysterious—the Kullback–Leibler divergence.

Suppose p(t) is a smooth curve in the interior of the (n-1)-simplex. We can ask the rate at which information is gained as time passes. Perhaps surprisingly, a calculation gives

\displaystyle{ \frac{d}{dt} I(p(t), p(t_0))\Big|_{t = t_0} = 0 }

That is, in some sense ‘to first order’ no information is being gained at any moment t_0 \in \mathbb{R}. However, we have

\displaystyle{  \frac{d^2}{dt^2} I(p(t), p(t_0))\Big|_{t = t_0} =  g(\dot{p}(t_0), \dot{p}(t_0))}

So, the square of the Fisher speed has a nice interpretation in terms of relative entropy!

For a derivation of these last two equations, see Part 7 of my posts on information geometry. For more on the meaning of relative entropy, see Part 6.
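These two identities are also easy to check numerically. Here’s a sketch using finite differences along a straight-line curve through a made-up point p_0:

```python
import math

def rel_info(q, p):
    """Information of q relative to p: sum_i q_i ln(q_i / p_i)."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p))

p0 = [0.2, 0.3, 0.5]
v  = [0.01, -0.04, 0.03]          # tangent vector: components sum to zero

def curve(t):                      # straight line through p0 with velocity v
    return [pi + t * vi for pi, vi in zip(p0, v)]

h = 1e-4
# First derivative of I(p(t), p0) at t = 0: should vanish
d1 = (rel_info(curve(h), p0) - rel_info(curve(-h), p0)) / (2 * h)

# Second derivative at t = 0: should equal g(v, v) = sum_i v_i^2 / p0_i
d2 = (rel_info(curve(h), p0) - 2 * rel_info(curve(0), p0)
      + rel_info(curve(-h), p0)) / h**2
g_vv = sum(vi**2 / pi for vi, pi in zip(v, p0))

print(d1)          # essentially zero
print(d2, g_vv)    # approximately equal
```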

The result

It’s now extremely easy to show what we want, but let me state it formally so all the assumptions are crystal clear.

Theorem. Suppose the functions P_i \colon \mathbb{R} \to (0,\infty) obey the Lotka–Volterra equations:

\displaystyle{ \dot P_i = f_i(P) P_i}

for some differentiable functions f_i \colon (0,\infty)^n \to \mathbb{R} called fitness functions. Define probabilities and the mean fitness as above, and define the variance of the fitness by

\displaystyle{ \mathrm{Var}(f(P)) =  \sum_j ( f_j(P) - \overline f(P))^2 \, p_j }

Then if none of the populations P_i are zero, the square of the Fisher speed of the probability distribution p(t) = (p_1(t), \dots , p_n(t)) is the variance of the fitness:

g(\dot{p}, \dot{p})  = \mathrm{Var}(f(P))

Proof. The proof is near-instantaneous. We take the square of the Fisher speed:

\displaystyle{ g(\dot{p}, \dot{p}) = \sum_{i=1}^n \frac{\dot{p}_i(t)^2}{p_i(t)} }

and plug in the replicator equation:

\displaystyle{ \dot{p}_i = (f_i(P) - \overline f(P)) p_i }

We obtain:

\begin{array}{ccl} \displaystyle{ g(\dot{p}, \dot{p})} &=& \displaystyle{ \sum_{i=1}^n \left( f_i(P) - \overline f(P) \right)^2 p_i } \\ \\ &=& \mathrm{Var}(f(P)) \end{array}

as desired.   █

It’s hard to imagine anything simpler than this. We see that given the Lotka–Volterra equation, what causes information to be updated is nothing more and nothing less than variance in fitness! But there are other variants of Fisher’s fundamental theorem worth discussing, so I’ll talk about those in future posts.
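Here’s the theorem checked numerically at a single instant, with made-up probabilities and fitness values, using the replicator equation for \dot{p}:

```python
# Numerical check: square of the Fisher speed = variance of the fitness.
p = [0.2, 0.3, 0.5]       # probabilities p_i (hypothetical)
f = [1.0, 2.0, 4.0]       # fitness values f_i(P) at this instant (hypothetical)

f_bar = sum(fi * pi for fi, pi in zip(f, p))                 # mean fitness
p_dot = [(fi - f_bar) * pi for fi, pi in zip(f, p)]          # replicator equation

speed_squared = sum(di**2 / pi for di, pi in zip(p_dot, p))  # g(p', p')
variance = sum((fi - f_bar)**2 * pi for fi, pi in zip(f, p)) # Var(f(P))

print(speed_squared, variance)    # equal, as the theorem says
```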

Markov Decision Processes

6 October, 2020

The National Institute of Mathematical and Biological Sciences is having an online seminar on ‘adaptive management’. It should be fun for people who want to understand Markov decision processes—like me!

NIMBioS Adaptive Management Webinar Series, October 26–29, 2020 (Monday–Thursday).

Adaptive management seeks to determine sound management strategies in the face of uncertainty concerning the behavior of the system being managed. Specifically, it attempts to find strategies for managing dynamic systems while learning the behavior of the system. These webinars review the key concept of a Markov Decision Process (MDP) and demonstrate how quantitative adaptive management strategies can be developed using MDPs. Additional conceptual, computational and application aspects will be discussed, including dynamic programming and Bayesian formalization of learning.

Here are the topics:

Session 1: Introduction to decision problems
Session 2: Introduction to Markov decision processes (MDPs)
Session 3: Solving Markov decision processes (MDPs)
Session 4: Modeling beliefs
Session 5: Conjugacy and discrete model adaptive management (AM)
Session 6: More on AM problems (Dirichlet/multinomial and Gaussian prior/likelihood)
Session 7: Partially observable Markov decision processes (POMDPs)
Session 8: Frontier topics (projection methods, approximate DP, communicating solutions)
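If you haven’t met MDPs, the core computational idea behind solving them is ‘value iteration’: repeatedly applying the Bellman backup until the value function converges. Here’s a minimal sketch on a made-up two-state, two-action MDP; nothing here is from the webinar itself:

```python
# Value iteration on a toy 2-state, 2-action Markov decision process.
states, actions = [0, 1], [0, 1]
gamma = 0.9                        # discount factor

# P[s][a] = list of (probability, next_state); R[s][a] = immediate reward
P = {0: {0: [(0.8, 0), (0.2, 1)], 1: [(0.5, 0), (0.5, 1)]},
     1: {0: [(0.3, 0), (0.7, 1)], 1: [(0.9, 0), (0.1, 1)]}}
R = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.5, 1: 2.0}}

def backup(V, s, a):
    """Bellman backup: immediate reward plus discounted expected value."""
    return R[s][a] + gamma * sum(pr * V[s2] for pr, s2 in P[s][a])

V = {s: 0.0 for s in states}
for _ in range(1000):              # iterate the backups until convergence
    V = {s: max(backup(V, s, a) for a in actions) for s in states}

# Read off the greedy (optimal) policy from the converged values
policy = {s: max(actions, key=lambda a: backup(V, s, a)) for s in states}
print(V, policy)
```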


Fisher’s Fundamental Theorem (Part 2)

29 September, 2020

Here’s how Fisher stated his fundamental theorem:

The rate of increase of fitness of any species is equal to the genetic variance in fitness.

But clearly this is only going to be true under some conditions!

A lot of early criticism of Fisher’s fundamental theorem centered on the fact that the fitness of a species can vary due to changing external conditions. For example: suppose the Sun goes supernova. The fitness of all organisms on Earth will suddenly drop. So the conclusions of Fisher’s theorem can’t hold under these circumstances.

I find this obvious and thus uninteresting. So, let’s tackle situations where the fitness changes due to changing external conditions later. But first let’s see what happens if the fitness isn’t changing for these external reasons.

What’s ‘fitness’, anyway? To define this we need a mathematical model of how populations change with time. We’ll start with a very simple, very general model. While it’s often used in population biology, it will have very little to do with biology per se. Indeed, the reason I’m digging into Fisher’s fundamental theorem is that it has a mathematical aspect that doesn’t require much knowledge of biology to understand. Applying it to biology introduces lots of complications and caveats, but that won’t be my main focus here. I’m looking for the simple abstract core.

The Lotka–Volterra equation

The Lotka–Volterra equation is a simplified model of how populations change with time. Suppose we have n different types of self-replicating entity. We will call these entities replicators. We will call the types of replicators species, but they do not need to be species in the biological sense!

For example, the replicators could be organisms of one single biological species, and the types could be different genotypes. Or the replicators could be genes, and the types could be alleles. Or the replicators could be restaurants, and the types could be restaurant chains. In what follows these details won’t matter: we’ll just have different ‘species’ of ‘replicators’.

Let P_i(t), or just P_i for short, be the population of the ith species at time t. We will treat this population as a differentiable real-valued function of time, which is a reasonable approximation when the population is fairly large.

Let’s assume the population obeys the Lotka–Volterra equation:

\displaystyle{ \frac{d P_i}{d t} = f_i(P_1, \dots, P_n) \, P_i }

where each function f_i depends in a differentiable way on all the populations. Thus each population P_i changes at a rate proportional to P_i, but the ‘constant of proportionality’ need not be constant: it depends on the populations of all the species.

We call f_i the fitness function of the ith species. Note: we are assuming this function does not depend on time.

To write the Lotka–Volterra equation more concisely, we can create a vector whose components are all the populations:

P = (P_1, \dots , P_n).

Let’s call this the population vector. In terms of the population vector, the Lotka–Volterra equation becomes

\displaystyle{ \dot P_i = f_i(P) P_i}

where the dot stands for a time derivative.
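To see the Lotka–Volterra equation in action, here’s a minimal numerical sketch. The fitness functions are made up (a logistic-style competition with a shared carrying capacity), and a crude Euler step stands in for a real ODE solver:

```python
# Euler integration of a toy two-species Lotka-Volterra system.
def fitness(i, P):
    r = [1.0, 0.8]                 # intrinsic growth rates (hypothetical)
    K = 100.0                      # shared carrying capacity (hypothetical)
    return r[i] * (1.0 - sum(P) / K)

P = [10.0, 5.0]                    # initial populations
dt = 0.01
for _ in range(2000):              # integrate up to t = 20
    P = [Pi + dt * fitness(i, P) * Pi for i, Pi in enumerate(P)]

print(P)           # both species grow until competition stops them
print(sum(P))      # total population close to the carrying capacity, 100
```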

To define concepts like ‘mean fitness’ or ‘variance in fitness’ we need to introduce probability theory, and the replicator equation.

The replicator equation

Starting from the populations P_i, we can work out the probability p_i that a randomly chosen replicator belongs to the ith species. More precisely, this is the fraction of replicators belonging to that species:

\displaystyle{  p_i = \frac{P_i}{\sum_j P_j} }

As a mnemonic, remember that the big Population P_i is being normalized to give a little probability p_i. I once had someone scold me for two minutes during a talk I was giving on this subject, for using lower-case and upper-case P’s to mean different things. But it’s my blog and I’ll do what I want to.

How do these probabilities p_i change with time? We can figure this out using the Lotka–Volterra equation. We pull out the trusty quotient rule and calculate:

\displaystyle{ \dot{p}_i = \frac{\dot{P}_i \left(\sum_j P_j\right) - P_i \left(\sum_j \dot{P}_j \right)}{\left(  \sum_j P_j \right)^2 } }

Then the Lotka–Volterra equation gives

\displaystyle{ \dot{p}_i = \frac{ f_i(P) P_i \; \left(\sum_j P_j\right) - P_i \left(\sum_j f_j(P) P_j \right)} {\left(  \sum_j P_j \right)^2 } }

Using the definition of p_i this simplifies and we get

\displaystyle{ \dot{p}_i =  f_i(P) p_i  - \left( \sum_j f_j(P) p_j \right) p_i }

The expression in parentheses here has a nice meaning: it is the mean fitness. In other words, it is the average, or expected, fitness of a replicator chosen at random from the whole population. Let us write it thus:

\displaystyle{ \overline f(P) = \sum_j f_j(P) p_j  }

This gives the replicator equation in its classic form:

\displaystyle{ \dot{p}_i = \left( f_i(P) - \overline f(P) \right) \, p_i }

where the dot stands for a time derivative. Thus, for the fraction of replicators of the ith species to increase, their fitness must exceed the mean fitness.
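It’s worth checking this derivation numerically. The sketch below takes made-up fitness functions, advances the populations one tiny Euler step of the Lotka–Volterra equation, and compares the resulting change in the probabilities against what the replicator equation predicts:

```python
# Compare a finite-difference derivative of p_i with the replicator equation.
def fitness(i, P):                 # made-up fitness functions
    return [0.5, 1.0, 1.5][i] * (1.0 - sum(P) / 50.0)

def probs(P):
    total = sum(P)
    return [Pi / total for Pi in P]

P = [5.0, 10.0, 15.0]
dt = 1e-6

# One tiny Euler step of the Lotka-Volterra equation
P_next = [Pi + dt * fitness(i, P) * Pi for i, Pi in enumerate(P)]

p, p_next = probs(P), probs(P_next)
f_bar = sum(fitness(j, P) * pj for j, pj in enumerate(p))

finite_diff = [(p_next[i] - p[i]) / dt for i in range(3)]
replicator  = [(fitness(i, P) - f_bar) * p[i] for i in range(3)]

print(finite_diff)
print(replicator)    # approximately equal to the line above
```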

The moral is clear:

To become numerous you have to be fit.
To become predominant you have to be fitter than average.

This picture by David Wakeham illustrates the idea:

The fundamental theorem

What does the fundamental theorem of natural selection say, in this context? It says the rate of increase in mean fitness is equal to the variance of the fitness. As an equation, it says this:

\displaystyle{ \frac{d}{d t} \overline f(P) = \sum_j \Big( f_j(P) - \overline f(P) \Big)^2 \, p_j  }

The left hand side is the rate of increase in mean fitness—or decrease, if it’s negative. The right hand side is the variance of the fitness: the thing whose square root is the standard deviation. This can never be negative!

A little calculation suggests that there’s no way in the world that this equation can be true without extra assumptions!

We can start computing the left hand side:

\begin{array}{ccl} \displaystyle{\frac{d}{d t} \overline f(P)}  &=&  \displaystyle{ \frac{d}{d t} \sum_j f_j(P) p_j } \\  \\  &=& \displaystyle{ \sum_j  \frac{d f_j(P)}{d t} \, p_j \; + \; f_j(P) \, \frac{d p_j}{d t} } \\ \\  &=& \displaystyle{ \sum_j (\nabla f_j(P) \cdot \dot{P}) p_j \; + \; f_j(P) \dot{p}_j }  \end{array}

Before your eyes glaze over, let’s look at the two terms and think about what they mean. The first term says: the mean fitness will change since the fitnesses f_j(P) depend on P, which is changing. The second term says: the mean fitness will change since the fraction p_j of replicators that are in the jth species is changing.

We could continue the computation by using the Lotka–Volterra equation for \dot{P} and the replicator equation for \dot{p}. But it already looks like we’re doomed without invoking an extra assumption. The left hand side of Fisher’s fundamental theorem involves the gradients of the fitness functions, \nabla f_j(P). The right hand side:

\displaystyle{ \sum_j \Big( f_j(P) - \overline f(P) \Big)^2 \, p_j  }

does not!

This suggests an extra assumption we can make. Let’s assume those gradients \nabla f_j vanish!

In other words, let’s assume that the fitness of each replicator is a constant, independent of the populations:

f_j(P_1, \dots, P_n) = f_j

where f_j at right is just a number.

Then we can redo our computation of the rate of change of mean fitness. The gradient term doesn’t appear:

\begin{array}{ccl} \displaystyle{\frac{d}{d t} \overline f(P)}  &=&  \displaystyle{ \frac{d}{d t} \sum_j f_j p_j } \\  \\  &=& \displaystyle{ \sum_j f_j \dot{p}_j }  \end{array}

We can use the replicator equation for \dot{p}_j and get

\begin{array}{ccl} \displaystyle{ \frac{d}{d t} \overline f } &=&  \displaystyle{ \sum_j f_j \Big( f_j - \overline f \Big) p_j } \\ \\  &=& \displaystyle{ \sum_j \Big( f_j^2 p_j - f_j \overline f p_j \Big) } \\ \\  &=& \displaystyle{ \Big(\sum_j f_j^2 p_j\Big) - \overline f^2  }  \end{array}

This is the mean of the squares of the f_j minus the square of their mean. And if you’ve done enough probability theory, you’ll recognize this as the variance! Remember, the variance is

\begin{array}{ccl} \displaystyle{ \sum_j \Big( f_j - \overline f \Big)^2 \, p_j  }  &=&  \displaystyle{ \sum_j \Big( f_j^2 \, p_j - 2 f_j \overline f \, p_j + \overline f^2 p_j \Big) } \\ \\  &=&  \displaystyle{ \Big(\sum_j f_j^2 \, p_j\Big) - 2 \overline f^2 + \overline f^2 } \\ \\  &=&  \displaystyle{ \Big(\sum_j f_j^2 p_j\Big) - \overline f^2  }  \end{array}

Same thing.
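For the skeptical, here’s the same identity checked numerically, with made-up numbers:

```python
# "Mean of squares minus square of mean": the two variance formulas agree.
p = [0.2, 0.3, 0.5]                # probabilities (hypothetical)
f = [1.0, 2.0, 4.0]                # fitness values (hypothetical)

f_bar = sum(fi * pi for fi, pi in zip(f, p))
var_direct  = sum((fi - f_bar)**2 * pi for fi, pi in zip(f, p))
var_shifted = sum(fi**2 * pi for fi, pi in zip(f, p)) - f_bar**2

print(var_direct, var_shifted)     # equal
```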

So, we’ve gotten a simple version of Fisher’s fundamental theorem. Given all the confusion swirling around this subject, let’s summarize it very clearly.

Theorem. Suppose the functions P_i \colon \mathbb{R} \to (0,\infty) obey the equations

\displaystyle{ \dot P_i = f_i P_i}

for some constants f_i. Define probabilities by

\displaystyle{  p_i = \frac{P_i}{\sum_j P_j} }

Define the mean fitness by

\displaystyle{ \overline f = \sum_j f_j p_j  }

and the variance of the fitness by

\displaystyle{ \mathrm{Var}(f) =  \sum_j \Big( f_j - \overline f \Big)^2 \, p_j }

Then the time derivative of the mean fitness is the variance of the fitness:

\displaystyle{  \frac{d}{d t} \overline f = \mathrm{Var}(f) }

This is nice—but as you can see, our extra assumption that the fitness functions are constants has trivialized the problem. The equations

\displaystyle{ \dot P_i = f_i P_i}

are easy to solve: all the populations change exponentially with time. We’re not seeing any of the interesting features of population biology, or even of dynamical systems in general. The theorem is just an observation about a collection of exponential functions growing or shrinking at different rates.

So, we should look for a more interesting theorem in this vicinity! And we will.
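Still, even the trivial case can be checked against the exact exponential solutions. In the sketch below (fitness constants and initial populations made up), the time derivative of the mean fitness, computed by a central difference, matches the variance:

```python
import math

# Constant-fitness case: exact solutions P_i(t) = P_i(0) exp(f_i t).
f  = [0.1, 0.3, 0.2]               # constant fitnesses (hypothetical)
P0 = [100.0, 50.0, 200.0]          # initial populations (hypothetical)

def mean_and_var(t):
    P = [P0i * math.exp(fi * t) for P0i, fi in zip(P0, f)]
    total = sum(P)
    p = [Pi / total for Pi in P]
    f_bar = sum(fi * pi for fi, pi in zip(f, p))
    var = sum((fi - f_bar)**2 * pi for fi, pi in zip(f, p))
    return f_bar, var

t, h = 5.0, 1e-5
d_mean = (mean_and_var(t + h)[0] - mean_and_var(t - h)[0]) / (2 * h)
variance = mean_and_var(t)[1]

print(d_mean, variance)            # approximately equal
```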

Before I bid you adieu, let’s record a result we almost reached, but didn’t yet state. It’s stronger than the one I just stated. In this version we don’t assume the fitness functions are constant, so we keep the term involving their gradient.

Theorem. Suppose the functions P_i \colon \mathbb{R} \to (0,\infty) obey the Lotka–Volterra equations:

\displaystyle{ \dot P_i = f_i(P) P_i}

for some differentiable functions f_i \colon (0,\infty)^n \to \mathbb{R} called fitness functions. Define probabilities by

\displaystyle{  p_i = \frac{P_i}{\sum_j P_j} }

Define the mean fitness by

\displaystyle{ \overline f(P)  = \sum_j f_j(P) p_j  }

and the variance of the fitness by

\displaystyle{ \mathrm{Var}(f(P)) =  \sum_j \Big( f_j(P) - \overline f(P) \Big)^2 \, p_j }

Then the time derivative of the mean fitness is the variance plus an extra term involving the gradients of the fitness functions:

\displaystyle{\frac{d}{d t} \overline f(P)}  =  \displaystyle{ \mathrm{Var}(f(P)) + \sum_j (\nabla f_j(P) \cdot \dot{P}) p_j }

The proof just amounts to cobbling together the calculations we have already done, and not assuming the gradient term vanishes.
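This stronger version can also be checked numerically. The sketch below uses made-up logistic-style fitness functions, computes the gradient term by finite differences, and compares both sides of the equation:

```python
# Check: d/dt (mean fitness) = Var(f(P)) + sum_j (grad f_j(P) . P_dot) p_j
def fitness(i, P):                 # made-up fitness functions
    r = [1.0, 0.7]
    return r[i] * (1.0 - (P[0] + P[1]) / 100.0)

def grad_fitness(i, P, h=1e-6):    # gradient by finite differences
    g = []
    for k in range(len(P)):
        Q = list(P)
        Q[k] += h
        g.append((fitness(i, Q) - fitness(i, P)) / h)
    return g

def stats(P):
    total = sum(P)
    p = [Pi / total for Pi in P]
    f_bar = sum(fitness(j, P) * pj for j, pj in enumerate(p))
    var = sum((fitness(j, P) - f_bar)**2 * pj for j, pj in enumerate(p))
    return p, f_bar, var

P = [30.0, 20.0]
P_dot = [fitness(i, P) * Pi for i, Pi in enumerate(P)]   # Lotka-Volterra

p, f_bar, var = stats(P)
grad_term = sum(sum(gk * dk for gk, dk in zip(grad_fitness(j, P), P_dot)) * pj
                for j, pj in enumerate(p))

dt = 1e-6                          # left-hand side by a finite difference
P_next = [Pi + dt * di for Pi, di in zip(P, P_dot)]
lhs = (stats(P_next)[1] - f_bar) / dt

print(lhs, var + grad_term)        # approximately equal
```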


After writing this blog article I looked for a nice picture to grace it. I found one here:

• David Wakeham, Replicators and Fisher’s fundamental theorem, 30 November 2017.

I was mildly chagrined to discover that he said most of what I just said more simply and cleanly… in part because he went straight to the case where the fitness functions are constants. But my mild chagrin was instantly offset by this remark:

Fisher likened the result to the second law of thermodynamics, but there is an amusing amount of disagreement about what Fisher meant and whether he was correct. Rather than look at Fisher’s tortuous proof (or the only slightly less tortuous results of latter-day interpreters) I’m going to look at a simpler setup due to John Baez, and (unlike Baez) use it to derive the original version of Fisher’s theorem.

So, I’m just catching up with Wakeham, but luckily an earlier blog article of mine helped him avoid “Fisher’s tortuous proof” and the “only slightly less tortuous results of latter-day interpreters”. We are making progress here!

(By the way, a quiz show I listen to recently asked about the difference between “tortuous” and “torturous”. They mean very different things, but in this particular case either word would apply.)

My earlier blog article, in turn, was inspired by this paper:

• Marc Harper, Information geometry and evolutionary game theory.

Fisher’s Fundamental Theorem (Part 1)

29 September, 2020

There are various ‘fundamental theorems’ in mathematics. The fundamental theorem of arithmetic, the fundamental theorem of algebra, and the fundamental theorem of calculus are three of the most famous. These are gems of mathematics.

The statistician, biologist and eugenicist Ronald Fisher had his own fundamental theorem: the ‘fundamental theorem of natural selection’. But this one is different—it’s a mess! The theorem was based on confusing definitions, Fisher’s proofs were packed with typos and downright errors, and people don’t agree on what the theorem says, whether it’s correct, and whether it’s nontrivial. Thus, people keep trying to clarify and strengthen it.

This paper analyzes Fisher’s work:

• George R. Price, Fisher’s ‘fundamental theorem’ made clear, Annals of Human Genetics 32 (1972), 129–140.

Price writes:

It has long been a mystery how Fisher (1930, 1941, 1958) derived his famous ‘fundamental theorem of Natural Selection’ and exactly what he meant by it. He stated the theorem in these words (1930, p. 35; 1958, p. 37): ‘The rate of increase in fitness of any organism at any time is equal to its genetic variance in fitness at that time.’ And also in these words (1930, p. 46; 1958, p. 50): ‘The rate of increase of fitness of any species is equal to the genetic variance in fitness.’ He compared this result to the second law of thermodynamics, and described it as holding ‘the supreme position among the biological sciences’. Also, he spoke of the ‘rigour’ of his derivation of the theorem and of ‘the ease of its interpretation’. But others have variously described his derivation as ‘recondite’ (Crow & Kimura, 1970), ‘very difficult’ (Turner, 1970), or ‘entirely obscure’ (Kempthorne, 1957). And no one has ever found any other way to derive the result that Fisher seems to state. Hence, many authors (not reviewed here) have maintained that the theorem holds only under very special conditions, while only a few (e.g. Edwards, 1967) have thought that Fisher may have been correct—if only we could understand what he meant!

It will be shown here that this latter view is correct. Fisher’s theorem does indeed hold with the generality that he claimed for it. The mystery and the controversy result from incomprehensibility rather than error.

I won’t try to explain the result here—or the various improved versions that people have invented, which may actually be more interesting than the original. I’ll try to do this in a series of posts. Right now I just want to quote Price’s analysis of why Fisher’s result is so difficult to understand! It’s amusing:

In addition to the central confusion resulting from the use of the word fitness in two highly different senses, Fisher’s three publications on his theorem contain an astonishing number of lesser obscurities, infelicities of expression, typographical errors, omissions of crucial explanations, and contradictions between different passages about the same point. It is necessary to clarify some of this confusion before explaining the derivation of the theorem.

He analyzes the problems in detail, calling one passage the “most confusing published scientific writing I know of”.

Part of the problem, though only part, is that Fisher rewrote part of his paper while not remembering to change the rest to make the terminology match. It reminds me a bit of how the typesetter accidentally omitted a line from one of Bohr’s papers on quantum mechanics, creating a sentence that made absolutely no sense—though in Bohr’s case, his writing was so obscure that nobody even noticed until many years later.

Given its legendary obscurity, I will not try to fight my way through Fisher’s original paper. I will start with some later work. Next time!

Diary, 2003-2020

8 August, 2020

I keep putting off organizing my written material, but with coronavirus I’m feeling more mortal than usual, so I’d like to get this out into the world now:

• John Baez, Diary, 2003–2020.

Go ahead and grab a copy!

It’s got all my best tweets and Google+ posts, mainly explaining math and physics, but also my travel notes and other things… starting in 2003 with my ruminations on economics and ecology. It’s too big to read all at once, but I think you can dip into it more or less anywhere and pull out something fun.

It goes up to July 2020. It’s 2184 pages long.

I fixed a few problems like missing pictures, but there are probably more. If you let me know about them, I’ll fix them (if it’s easy).

Ordovician Meteor Event

25 September, 2019

About 1/3 of the meteorites hitting Earth today come from one source: the L chondrite parent body, an asteroid 100–150 kilometers across that was smashed in an impact 468 million years ago. This was the biggest asteroid collision in the last 3 billion years!

Here is an L-chondrite:

A chondrite is a stony, non-metallic meteorite that was formed from small grains of dust present in the early Solar System. They are the most common kind of meteorite—and the three most common kinds, each with its own somewhat different chemical composition, seem to come from different asteroids.

L chondrites are so named because they are low in iron. Compared to other chondrites, many L chondrites have been heavily shocked—evidence that their parent body was catastrophically disrupted by a large impact.

It seems that roughly 500,000 years after this event, lots of meteorites started hitting Earth: this is called the Ordovician meteor event. Big craters from that event still dot the Earth! Here are some in North America:

Number 3 is the Rock Elm Disturbance, created when a rock roughly 170 meters in diameter slammed into what’s now Wisconsin:

It doesn’t look like much now, but imagine what it must have been like! The crater is about 6 kilometers across. It features intensely fractured quartz grains and a faulted rim.

It seems these big L-chondrite meteors hit the Earth roughly in a line:

Of course the continents didn’t look like this when the meteor hit, about 467.5 million years ago.

One big question is: was the Ordovician meteor event somehow connected to the giant increase in biodiversity during the Ordovician? Here’s a graph of biodiversity over time:

The Cambrian explosion gets all the press, but in terms of the sheer number of new families the so-called Ordovician radiation was bigger. Most animal life was undersea at the time. This is when coral reefs and other complex ocean ecosystems came into being!

There are lots of theories that try to explain the Ordovician radiation. For example, the oxygen concentration in the atmosphere and ocean soared right before the start of the Ordovician period. More than one of these theories could be right. But it’s interesting to think about the possible influence of the Ordovician meteor event.

There were a lot of meteor impacts, but the dust may have been more important. Right now, extraterrestrial dust accounts for just 1% of all dust in the Earth’s atmosphere. In the Ordovician, the amount of extraterrestrial dust was 1,000–10,000 times greater, due to the big smash-up in the asteroid belt! This may have caused the global cooling we see in that period. The Ordovician started out hot, but by the end there were glaciers.

How could this increase biodiversity? The “intermediate disturbance hypothesis” says that biodiversity increases under conditions of mild stress. Some argue this explains the Ordovician radiation.

I’d say this is pretty iffy. But it’s sure interesting! Read more here:

• Birger Schmitz et al., An extraterrestrial trigger for the mid-Ordovician ice age: Dust from the breakup of the L-chondrite parent body, Science Advances, 18 September 2019.

Another fun question is: where are the remains of the L chondrite parent body? Could they be the asteroids in the Flora family?

Coupling Through Emergent Conservation Laws (Part 8)

3 July, 2018

joint post with Jonathan Lorand, Blake Pollard, and Maru Sarazola

To wrap up this series, let’s look at an even more elaborate cycle of reactions featuring emergent conservation laws: the citric acid cycle. Here’s a picture of it from Stryer’s textbook Biochemistry:

I’ll warn you right now that we won’t draw any grand conclusions from this example: that’s why we left it out of our paper. Instead we’ll leave you with some questions we don’t know how to answer.

All known aerobic organisms use the citric acid cycle to convert energy derived from food into other useful forms. This cycle couples an exergonic reaction, the conversion of acetyl-CoA to CoA-SH, to endergonic reactions that produce ATP and a chemical called NADH.

The citric acid cycle can be described at various levels of detail, but at one level it consists of ten reactions:

\begin{array}{rcl}   \mathrm{A}_1 + \text{acetyl-CoA} + \mathrm{H}_2\mathrm{O} & \longleftrightarrow &  \mathrm{A}_2 + \text{CoA-SH}  \\  \\   \mathrm{A}_2 & \longleftrightarrow &  \mathrm{A}_3 + \mathrm{H}_2\mathrm{O} \\  \\  \mathrm{A}_3 + \mathrm{H}_2\mathrm{O} & \longleftrightarrow &   \mathrm{A}_4 \\  \\   \mathrm{A}_4 + \mathrm{NAD}^+  & \longleftrightarrow &  \mathrm{A}_5 + \mathrm{NADH} + \mathrm{H}^+  \\  \\   \mathrm{A}_5 + \mathrm{H}^+ & \longleftrightarrow &  \mathrm{A}_6 + \textrm{CO}_2 \\  \\  \mathrm{A}_6 + \mathrm{NAD}^+ + \text{CoA-SH} & \longleftrightarrow &  \mathrm{A}_7 + \mathrm{NADH} + \mathrm{H}^+ + \textrm{CO}_2 \\  \\   \mathrm{A}_7 + \mathrm{ADP} + \mathrm{P}_{\mathrm{i}}   & \longleftrightarrow &  \mathrm{A}_8 + \text{CoA-SH} + \mathrm{ATP} \\  \\   \mathrm{A}_8 + \mathrm{FAD} & \longleftrightarrow &  \mathrm{A}_9 + \mathrm{FADH}_2 \\  \\  \mathrm{A}_9 + \mathrm{H}_2\mathrm{O}  & \longleftrightarrow &  \mathrm{A}_{10} \\  \\  \mathrm{A}_{10} + \mathrm{NAD}^+  & \longleftrightarrow &  \mathrm{A}_1 + \mathrm{NADH} + \mathrm{H}^+  \end{array}

Here \mathrm{A}_1, \dots, \mathrm{A}_{10} are abbreviations for species that cycle around, each being transformed into the next. It doesn’t really matter for what we’ll be doing, but in case you’re curious:

\mathrm{A}_1= oxaloacetate,
\mathrm{A}_2= citrate,
\mathrm{A}_3= cis-aconitate,
\mathrm{A}_4= isocitrate,
\mathrm{A}_5= oxalosuccinate,
\mathrm{A}_6= α-ketoglutarate,
\mathrm{A}_7= succinyl-CoA,
\mathrm{A}_8= succinate,
\mathrm{A}_9= fumarate,
\mathrm{A}_{10}= L-malate.

In reality, the citric acid cycle also involves inflows of reactants such as acetyl-CoA, which is produced by metabolism, as well as outflows of both useful products such as ATP and NADH and waste products such as CO2. Thus, a full analysis requires treating this cycle as an open chemical reaction network, where species flow in and out. However, we can gain some insight just by studying the emergent conservation laws present in this network, ignoring inflows and outflows—so let’s do that!

There are a total of 22 species in the citric acid cycle. There are 10 forward reactions. We can see that their vectors are all linearly independent as follows. Since each reaction turns \mathrm{A}_i into \mathrm{A}_{i+1}, where we count modulo 10, it is easy to see that any nine of the reaction vectors are linearly independent. Whichever one we choose to ‘close the cycle’ could in theory be linearly dependent on the rest. However, it is easy to see that the vector for this reaction

\mathrm{A}_8 + \mathrm{FAD} \longleftrightarrow \mathrm{A}_9 + \mathrm{FADH}_2

is linearly independent from the rest, because only this one involves FAD. So, all 10 reaction vectors are linearly independent, and the stoichiometric subspace has dimension 10.

Since 22 – 10 = 12, there must be 12 linearly independent conserved quantities. Some of these conservation laws are ‘fundamental’, at least by the standards of chemistry. All the species involved are made of 6 different kinds of atoms (carbon, hydrogen, oxygen, nitrogen, phosphorus and sulfur), so conservation of each kind of atom gives 6 conserved quantities, and conservation of charge provides another, for a total of 7.

(In our example from last time we didn’t keep track of conservation of hydrogen and charge, because both \mathrm{H}^+ and e^- ions are freely available in water… but we studied the citric acid cycle when we were younger, more energetic and less wise, so we kept careful track of hydrogen and charge, and made sure that all the reactions conserved these. So, we’ll have 7 fundamental conserved quantities.)
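The dimension count can be verified with a short computation. Here is a sketch using sympy, with species names abbreviated (e.g. `acCoA` for acetyl-CoA, `CoA` for CoA-SH) and reactions entered as net changes, products minus reactants. A conserved quantity is a vector orthogonal to every reaction vector, i.e. an element of the null space of the transposed stoichiometric matrix.

```python
import sympy as sp

species = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "A10",
           "acCoA", "CoA", "H2O", "NAD", "NADH", "H", "CO2",
           "ADP", "Pi", "ATP", "FAD", "FADH2"]
idx = {s: i for i, s in enumerate(species)}

# Net stoichiometric change (products minus reactants) for the ten reactions
reactions = [
    {"A1": -1, "acCoA": -1, "H2O": -1, "A2": 1, "CoA": 1},
    {"A2": -1, "A3": 1, "H2O": 1},
    {"A3": -1, "H2O": -1, "A4": 1},
    {"A4": -1, "NAD": -1, "A5": 1, "NADH": 1, "H": 1},
    {"A5": -1, "H": -1, "A6": 1, "CO2": 1},
    {"A6": -1, "NAD": -1, "CoA": -1, "A7": 1, "NADH": 1, "H": 1, "CO2": 1},
    {"A7": -1, "ADP": -1, "Pi": -1, "A8": 1, "CoA": 1, "ATP": 1},
    {"A8": -1, "FAD": -1, "A9": 1, "FADH2": 1},
    {"A9": -1, "H2O": -1, "A10": 1},
    {"A10": -1, "NAD": -1, "A1": 1, "NADH": 1, "H": 1},
]

# Stoichiometric matrix S: rows are species, columns are reactions
S = sp.zeros(len(species), len(reactions))
for j, r in enumerate(reactions):
    for s, c in r.items():
        S[idx[s], j] = c

# Conserved quantities are vectors v with v . (reaction vector) = 0 for
# every reaction: the null space of S transposed.
conserved = S.T.nullspace()
print(S.rank(), len(conserved))  # 10 independent reactions, 22 - 10 = 12 laws
```

The sulfur law mentioned below, [acetyl-CoA] + [CoA-SH] + [A7], is one of the vectors this null space contains.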

For example, the conserved quantity

[\text{acetyl-CoA}] + [\text{CoA-SH}] + [\mathrm{A}_7]

arises from the fact that \text{acetyl-CoA}, \text{CoA-SH} and \mathrm{A}_7 contain a single sulfur atom, while none of the other species involved contain sulfur.

Similarly, the conserved quantity

3[\mathrm{ATP}] + 2[\mathrm{ADP}] + [\mathrm{P}_{\mathrm{i}}] + 2[\mathrm{FAD}] +2[\mathrm{FADH}_2]

expresses conservation of phosphorus.

Besides the 7 fundamental conserved quantities, there must also be 5 linearly independent emergent conserved quantities: that is, quantities that are not conserved in every possible chemical reaction, but remain constant in every reaction in the citric acid cycle. We can use these 5 quantities:

[\mathrm{ATP}] + [\mathrm{ADP}], due to the conservation of adenosine.

[\mathrm{FAD}] + [\mathrm{FADH}_2], due to conservation of flavin adenine dinucleotide.

[\mathrm{NAD}^+] + [\mathrm{NADH}], due to conservation of nicotinamide adenine dinucleotide.

[\mathrm{A}_1] + \cdots + [\mathrm{A}_{10}]. This expresses the fact that in the citric acid cycle each species \mathrm{A}_i is transformed to the next, modulo 10.

[\text{acetyl-CoA}] + [\mathrm{A}_1] + \cdots + [\mathrm{A}_7] + [\text{CoA-SH}]. It can be checked by hand that each reaction in the citric acid cycle conserves this quantity. This expresses the fact that during the first 7 reactions of the citric acid cycle, one molecule of \text{acetyl-CoA} is destroyed and one molecule of \text{CoA-SH} is formed.

Of course, other conserved quantities can be formed as linear combinations of fundamental and emergent conserved quantities, often in nonobvious ways. An example is

3 [\text{acetyl-CoA}] + 3 [\mathrm{A}_2] + 3[\mathrm{A}_3] + 3[\mathrm{A}_4] + 2[\mathrm{A}_5] +
2[\mathrm{A}_6] + [\mathrm{A}_7] + [\mathrm{A}_8] + [\mathrm{A}_9] + [\mathrm{A}_{10}] + [\mathrm{NADH}]

which expresses the fact that in each turn of the citric acid cycle, one molecule of \text{acetyl-CoA} is destroyed and three of \mathrm{NADH} are formed. It is easier to check by hand that this quantity is conserved than to express it as an explicit linear combination of the 12 conserved quantities we have listed so far.
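The by-hand check can also be sketched in a few lines of plain Python, with species names abbreviated and each reaction written as its net change (products minus reactants): the combination is conserved precisely when its dot product with every reaction vector vanishes.

```python
# The ten citric acid cycle reactions as net changes (products minus reactants)
reactions = [
    {"A1": -1, "acCoA": -1, "H2O": -1, "A2": 1, "CoA": 1},
    {"A2": -1, "A3": 1, "H2O": 1},
    {"A3": -1, "H2O": -1, "A4": 1},
    {"A4": -1, "NAD": -1, "A5": 1, "NADH": 1, "H": 1},
    {"A5": -1, "H": -1, "A6": 1, "CO2": 1},
    {"A6": -1, "NAD": -1, "CoA": -1, "A7": 1, "NADH": 1, "H": 1, "CO2": 1},
    {"A7": -1, "ADP": -1, "Pi": -1, "A8": 1, "CoA": 1, "ATP": 1},
    {"A8": -1, "FAD": -1, "A9": 1, "FADH2": 1},
    {"A9": -1, "H2O": -1, "A10": 1},
    {"A10": -1, "NAD": -1, "A1": 1, "NADH": 1, "H": 1},
]

# Coefficients of the candidate conserved quantity above
coeffs = {"acCoA": 3, "A2": 3, "A3": 3, "A4": 3, "A5": 2,
          "A6": 2, "A7": 1, "A8": 1, "A9": 1, "A10": 1, "NADH": 1}

# Conserved iff the dot product with every reaction vector is zero
ok = all(sum(coeffs.get(s, 0) * c for s, c in r.items()) == 0
         for r in reactions)
print(ok)  # True
```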

Finally, we bid you a fond farewell and leave you with this question: what exactly do the 5 emergent conservation laws do? In our previous two examples (ATP hydrolysis and the urea cycle) there were certain undesired reactions involving just the species we listed which were forbidden by the emergent conservation laws. In this case I don’t see any of those. But there are other important processes, involving additional species, that are forbidden. For example, if you let acetyl-CoA sit in water it will ‘hydrolyze’ as follows:

\text{acetyl-CoA} + \mathrm{H}_2\mathrm{O} \longleftrightarrow \text{CoA-SH} + \text{acetate} + \text{H}^+

So, it’s turning into CoA-SH and some other stuff, somewhat as it does in the citric acid cycle, but in a way that doesn’t do anything ‘useful’: no ATP or NADH is created in this process. This is one of the things the citric acid cycle tries to prevent.

(Remember, a reaction being ‘forbidden by emergent conservation laws’ doesn’t mean it’s absolutely forbidden. It just means that it happens much more slowly than the catalyzed reactions we are listing in our reaction network.)

Unfortunately acetate and \text{H}^+ aren’t on the list of species we’re considering. We could add them. If we added them, and perhaps other species, could we get a setup where every emergent conservation law could be seen as preventing a specific unwanted reaction that’s chemically allowed?

Ideally the dimension of the space of emergent conservation laws would match the dimension of the space spanned by reaction vectors of unwanted reactions, so ‘everything would be accounted for’. But even in the simpler example of the urea cycle, we didn’t achieve this perfect match.


The paper:

• John Baez, Jonathan Lorand, Blake S. Pollard and Maru Sarazola,
Biochemical coupling through emergent conservation laws.

The blog series:

Part 1 – Introduction.

Part 2 – Review of reaction networks and equilibrium thermodynamics.

Part 3 – What is coupling?

Part 4 – Interactions.

Part 5 – Coupling in quasiequilibrium states.

Part 6 – Emergent conservation laws.

Part 7 – The urea cycle.

Part 8 – The citric acid cycle.

Coupling Through Emergent Conservation Laws (Part 7)

2 July, 2018

joint post with Jonathan Lorand, Blake Pollard, and Maru Sarazola

Last time we examined ATP hydrolysis as a simple example of coupling through emergent conservation laws, but the phenomenon is more general. A slightly more complicated example is the urea cycle. The first metabolic cycle to be discovered, it is used by land-dwelling vertebrates to convert ammonia, which is highly toxic, to urea for excretion. Now we’ll find 11 conserved quantities in the urea cycle, including 7 emergent ones.

(Yes, this post is about mathematics of piss!)

The urea cycle looks like this:

We’ll focus on this portion:

\mathrm{NH}_3 + \mathrm{HCO}_3^- + 2 \mathrm{ATP} \leftrightarrow \mathrm{carbamoyl \; phosphate} + 2 \mathrm{ADP} + \mathrm{P}_{\mathrm{i}}

\mathrm{A}_1 + \mathrm{carbamoyl \; phosphate} \leftrightarrow \mathrm{A}_2 + \mathrm{P}_{\mathrm{i}}

\mathrm{A}_2 + \mathrm{aspartate}+ \mathrm{ATP} \leftrightarrow \mathrm{A}_3 + \mathrm{AMP} + \mathrm{PP}_{\mathrm{i}}

\mathrm{A}_3  \leftrightarrow \mathrm{A}_4 + \mathrm{fumarate}

\mathrm{A}_4 + \mathrm{H}_2\mathrm{O} \leftrightarrow \mathrm{A}_1 + \mathrm{urea}

Ammonia (\mathrm{NH}_3) and bicarbonate (\mathrm{HCO}_3^-) enter in the first reaction, along with ATP. The four remaining reactions form a cycle in which four similar species A1, A2, A3, A4 cycle around, each transformed into the next. In case you’re curious, these species are:

• A1 = ornithine:

• A2 = citrulline:

• A3 = argininosuccinate:

• A4 = arginine:

One atom of nitrogen from carbamoyl phosphate and one from aspartate enter this cycle, and they are incorporated in urea, which then leaves the cycle.

As you can see above, argininosuccinate is the largest of the four molecules that cycle around. It’s formed when citrulline combines with aspartate, which looks like this:

Argininosuccinate then breaks down to form arginine and fumarate:

All this is powered by two exergonic reactions: the hydrolysis of ATP to ADP and phosphate (Pi) and the hydrolysis of ATP to adenosine monophosphate (AMP) and a compound with two phosphorus atoms, pyrophosphate (PPi). Thus, we are seeing a more elaborate example of an endergonic process coupled to ATP hydrolysis. The most interesting new feature is the use of a cycle.

Since inflows and outflows are crucial to the purpose of the urea cycle, a full analysis requires treating this cycle as an open chemical reaction network. However, we can gain some insight into coupling just by studying the emergent conservation laws present in this network, ignoring inflows and outflows.

There are a total of 16 species in the urea cycle. There are 5 forward reactions, which are easily seen to have linearly independent reaction vectors. Thus, the stoichiometric subspace has dimension 5. There must therefore be 11 linearly independent conserved quantities.

Some of these conserved quantities can be explained by fundamental laws of chemistry. All the species involved are made of five different atoms: carbon, hydrogen, oxygen, nitrogen and phosphorus. The conserved quantity

3[\mathrm{ATP}] + 2[\mathrm{ADP}] + [\mathrm{AMP}] + 2 [\mathrm{PP}_{\mathrm{i}}] +
[\mathrm{P}_{\mathrm{i}}] + [\mathrm{carbamoyl \; phosphate}]

expresses conservation of phosphorus. The conserved quantity

[\mathrm{NH}_3] + [\mathrm{carbamoyl \; phosphate}] +  [\mathrm{aspartate}] + 2[\mathrm{urea}] +
2[\mathrm{A}_1] + 3[\mathrm{A}_2] + 4[\mathrm{A}_3] + 4[\mathrm{A}_4]

expresses conservation of nitrogen. Conservation of oxygen and carbon give still more complicated conserved quantities. Conservation of hydrogen and conservation of charge are not really valid laws in this context, because all the reactions are occurring in water, where it is easy for protons (H+) and electrons to come and go. So, four linearly independent ‘fundamental’ conserved quantities are relevant to the urea cycle.

There must therefore be seven other linearly independent conserved quantities that are emergent: that is, not conserved in every possible reaction, but conserved by those in the urea cycle. A computer calculation shows that we can use these:

A) [\mathrm{ATP}] + [\mathrm{ADP}] + [\mathrm{AMP}], due to conservation of adenosine by all reactions in the urea cycle.

B) [\mathrm{H}_2\mathrm{O}] + [\mathrm{urea}], since the only reaction in the urea cycle involving either \mathrm{H}_2\mathrm{O} or \mathrm{urea} has \mathrm{H}_2\mathrm{O} as a reactant and \mathrm{urea} as a product.

C) [\mathrm{aspartate}] + [\mathrm{PP}_{\mathrm{i}}], since the only reaction involving either \mathrm{aspartate} or \mathrm{PP}_{\mathrm{i}} has \mathrm{aspartate} as a reactant and \mathrm{PP}_{\mathrm{i}} as a product.

D) 2[\mathrm{NH}_3] + [\mathrm{ADP}], since the only reaction involving either \mathrm{NH}_3 or \mathrm{ADP} has \mathrm{NH}_3 as a reactant and 2\mathrm{ADP} as a product.

E) 2[\mathrm{HCO}_3^-] + [\mathrm{ADP}], since the only reaction involving either \mathrm{HCO}_3^- or \mathrm{ADP} has \mathrm{HCO}_3^- as a reactant and 2\mathrm{ADP} as a product.

F) [\mathrm{A}_3] + [\mathrm{fumarate}] - [\mathrm{PP}_{\mathrm{i}}], since these species are involved only in the third and fourth reactions of the urea cycle, and this quantity is conserved in both those reactions.

G) [\mathrm{A}_1] + [\mathrm{A}_2] + [\mathrm{A}_3] + [\mathrm{A}_4], since these species cycle around the last four reactions, and they are not involved in the first.

These emergent conservation laws prevent either form of ATP hydrolysis from occurring on its own: the reaction

\mathrm{ATP} + \mathrm{H}_2\mathrm{O} \longrightarrow \mathrm{ADP} + \mathrm{P}_{\mathrm{i}}

violates conservation of quantities B), D) and E), while

\mathrm{ATP} +  \mathrm{H}_2\mathrm{O} \longrightarrow\mathrm{AMP} + \mathrm{PP}_{\mathrm{i}}

violates conservation of quantities B), C) and F). (In these reactions we are neglecting \mathrm{H}^+ ions, since as mentioned these are freely available in water.)

Indeed, any linear combination of these two forms of ATP hydrolysis is prohibited. But since this requires only two emergent conservation laws, the presence of seven is a bit of a puzzle. Conserved quantity C) prevents the destruction of aspartate without the production of an equal amount of \mathrm{PP}_{\mathrm{i}}, conserved quantity D) prevents the destruction of \mathrm{NH}_3 without the production of an equal amount of \mathrm{ADP}, and so on. But there seems to be more coupling than is strictly “required”. Of course, many factors besides coupling are involved in an evolutionarily advantageous reaction network.
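The computer calculation mentioned above can be sketched as follows (species names abbreviated; `CP` stands for carbamoyl phosphate, `asp` for aspartate, `fum` for fumarate). The conserved quantities form the null space of the transposed stoichiometric matrix, and a quick dot product confirms that ATP hydrolysis to ADP and \mathrm{P}_{\mathrm{i}} violates the emergent laws as claimed.

```python
import sympy as sp

species = ["NH3", "HCO3", "ATP", "ADP", "AMP", "Pi", "PPi", "CP",
           "asp", "fum", "urea", "H2O", "A1", "A2", "A3", "A4"]
idx = {s: i for i, s in enumerate(species)}

# Net change (products minus reactants) for the five urea cycle reactions
reactions = [
    {"NH3": -1, "HCO3": -1, "ATP": -2, "CP": 1, "ADP": 2, "Pi": 1},
    {"A1": -1, "CP": -1, "A2": 1, "Pi": 1},
    {"A2": -1, "asp": -1, "ATP": -1, "A3": 1, "AMP": 1, "PPi": 1},
    {"A3": -1, "A4": 1, "fum": 1},
    {"A4": -1, "H2O": -1, "A1": 1, "urea": 1},
]

def vec(changes):
    """Column vector in species space from a dict of coefficients."""
    v = sp.zeros(len(species), 1)
    for s, c in changes.items():
        v[idx[s]] = c
    return v

# Stoichiometric matrix: species x reactions
S = sp.Matrix.hstack(*(vec(r) for r in reactions))
print(S.rank(), len(S.T.nullspace()))  # 5 independent reactions, 16 - 5 = 11 laws

# Emergent law B): [H2O] + [urea]. ATP hydrolysis to ADP + Pi changes it:
hydrolysis = vec({"ATP": -1, "H2O": -1, "ADP": 1, "Pi": 1})
B = vec({"H2O": 1, "urea": 1})
print(B.dot(hydrolysis))  # -1, so this reaction breaks the emergent law
```

The same dot-product test applied to quantities D) and E) gives +1 for each, reproducing the violations listed above.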

Further directions

Our paper, similar to these blog articles but with some more equations and fewer pictures, is here:

• John Baez, Jonathan Lorand, Blake S. Pollard and Maru Sarazola, Biochemical coupling through emergent conservation laws.

As a slight hint at further directions to explore, here’s an interesting quote:

“It is generally believed that enzyme-free prebiotic reactions typically go wild and produce many side products,” says Pasquale Stano, an organic chemist at the University of Salento, Italy.

Emergent conservation laws limit the number of side products! For more, see:

• Melissae Fellet, Enzyme-free reaction cycles hint at primitive precursor to metabolism, Chemistry World, 10 January 2018.

This is about an artificially created cycle similar to the citric acid cycle, which air-breathing organisms use to ‘burn’ foods and create ATP.

In our final post, we’ll take a look at the citric acid cycle and its emergent conservation laws. This material is more rough than the rest, and it didn’t find its way into our paper on the arXiv, but we put a fair amount of work into it—so, we’ll blog about it!

