Information Geometry (Part 17)

27 July, 2021

I’m getting back into information geometry, which is the geometry of the space of probability distributions, studied using tools from information theory. I’ve written a bunch about it already, which you can see here:

Information geometry.

Now I’m fascinated by something new: how symplectic geometry and contact geometry show up in information geometry. But before I say anything about this, let me say a bit about how they show up in thermodynamics. This is more widely discussed, and it’s a good starting point.

Symplectic geometry was born as the geometry of phase space in classical mechanics: that is, the space of possible positions and momenta of a classical system. The simplest example of a symplectic manifold is the vector space \mathbb{R}^{2n}, with n position coordinates q_i and n momentum coordinates p_i.

It turns out that symplectic manifolds are always even-dimensional, because we can always cover them with coordinate charts that look like \mathbb{R}^{2n}. When we change coordinates, it turns out that the splitting of coordinates into positions and momenta is somewhat arbitrary. For example, the position of a rock on a spring now may determine its momentum a while later, and vice versa. What’s not arbitrary? It’s the so-called ‘symplectic structure’:

\omega = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n

While far from obvious at first, we know by now that the symplectic structure is exactly what needs to be preserved under valid changes of coordinates in classical mechanics! In fact, we can develop the whole formalism of classical mechanics starting from a manifold with a symplectic structure.

Symplectic geometry also shows up in thermodynamics. In thermodynamics we can start with a system in equilibrium whose state is described by some variables q_1, \dots, q_n. Its entropy will be a function of these variables, say

S = f(q_1, \dots, q_n)

We can then take the partial derivatives of entropy and call them something:

\displaystyle{ p_i = \frac{\partial f}{\partial q_i} }

These new variables p_i are said to be ‘conjugate’ to the q_i, and they turn out to be very interesting. For example, if q_i is energy then p_i is ‘coolness’: the reciprocal of temperature. The coolness of a system is its change in entropy per change in energy.

Often the variables q_i are ‘extensive’: that is, you can measure them only by looking at your whole system and totaling up some quantity. Examples are energy and volume. Then the new variables p_i are ‘intensive’: that is, you can measure them at any one location in your system. Examples are coolness and pressure.
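Here's a minimal numerical sketch of conjugate variables. It assumes an entropy formula that is not given above: for an ideal monatomic gas in n dimensions with a fixed number N of atoms, S = Nk((n/2) ln U + ln V) up to an additive constant. The function names are made up for illustration. Under that assumption, the variable conjugate to energy should come out as the coolness 1/T, and the one conjugate to volume as pressure divided by temperature, where T and P are computed from the standard ideal gas relations U = (n/2)NkT and PV = NkT.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K

def entropy(U, V, N=1e22, n=3):
    """Assumed entropy of an ideal monatomic gas (additive constant dropped)."""
    return N * K_B * ((n / 2) * math.log(U) + math.log(V))

def partial(f, x, h):
    """Central finite-difference derivative of a one-variable function."""
    return (f(x + h) - f(x - h)) / (2 * h)

def conjugates(U, V, N=1e22, n=3):
    """Estimate the conjugate variables dS/dU and dS/dV numerically."""
    p_U = partial(lambda u: entropy(u, V, N, n), U, U * 1e-6)
    p_V = partial(lambda v: entropy(U, v, N, n), V, V * 1e-6)
    return p_U, p_V

U, V, N, n = 50.0, 1e-3, 1e22, 3
p_U, p_V = conjugates(U, V, N, n)
T = 2 * U / (n * N * K_B)  # from U = (n/2) N k T
P = N * K_B * T / V        # from P V = N k T
print(p_U, 1 / T)  # conjugate to energy vs. coolness
print(p_V, P / T)  # conjugate to volume vs. pressure over temperature
```

Each printed pair should agree up to finite-difference error.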

Now for a twist: sometimes we do not know the function f ahead of time. Then we cannot define the p_i as above. We’re forced into a different approach where we treat them as independent quantities, at least until someone tells us what f is.

In this approach, we start with a space \mathbb{R}^{2n} having n coordinates called q_i and n coordinates called p_i. This is a symplectic manifold, with the symplectic structure \omega described earlier!

But what about the entropy? We don’t yet know what it is as a function of the q_i, but we may still want to talk about it. So, we build a space \mathbb{R}^{2n+1} having one extra coordinate S in addition to the q_i and p_i. This new coordinate stands for entropy. And this new space has an important 1-form on it:

\alpha = -dS + p_1 dq_1 + \cdots + p_n dq_n

This is called the ‘contact 1-form’.

This makes \mathbb{R}^{2n+1} into an example of a ‘contact manifold’. Contact geometry is the odd-dimensional partner of symplectic geometry. Just as symplectic manifolds are always even-dimensional, contact manifolds are always odd-dimensional.

What is the point of the contact 1-form? Well, suppose someone tells us the function f relating entropy to the coordinates q_i. Now we know that we want

S = f

and also

\displaystyle{ p_i = \frac{\partial f}{\partial q_i} }

So, we can impose these equations, which pick out a subset of \mathbb{R}^{2n+1}. You can check that this subset, say \Sigma, is an n-dimensional submanifold. But even better, the contact 1-form vanishes when restricted to this submanifold:

\left.\alpha\right|_\Sigma = 0

Let’s see why! Suppose x \in \Sigma and suppose v \in T_x \Sigma is a vector tangent to \Sigma at this point x. It suffices to show

\alpha(v) = 0

Using the definition of \alpha this equation says

\displaystyle{ -dS(v) + \sum_i p_i dq_i(v) = 0 }

But on the surface \Sigma we have

S = f, \qquad  \displaystyle{ p_i = \frac{\partial f}{\partial q_i} }

So, the equation we’re trying to show can be written as

\displaystyle{ -df(v) + \sum_i \frac{\partial f}{\partial q_i} dq_i(v) = 0 }

But this follows from

\displaystyle{ df = \sum_i \frac{\partial f}{\partial q_i} dq_i }

which holds because f is a function only of the coordinates q_i.

So, any formula for entropy S = f(q_1, \dots, q_n) picks out a so-called ‘Legendrian submanifold’ of \mathbb{R}^{2n+1}: that is, an n-dimensional submanifold such that the contact 1-form vanishes when restricted to this submanifold. And the idea is that this submanifold tells you everything you need to know about a thermodynamic system.
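The argument above can also be checked numerically. In this sketch the function f, the curve in the space of q variables and all the names are hypothetical choices for illustration. It lifts the curve to \Sigma by setting S = f and p_i = \partial f/\partial q_i, then evaluates the contact 1-form on the tangent vector of the lifted curve using finite differences.

```python
import math

def f(q):
    """Hypothetical entropy function of two variables, chosen for illustration."""
    q1, q2 = q
    return math.log(q1) + 2.0 * math.log(q2) + 0.1 * q1 * q2

def grad_f(q):
    """Exact partial derivatives of f, i.e. the conjugate variables p_i."""
    q1, q2 = q
    return (1.0 / q1 + 0.1 * q2, 2.0 / q2 + 0.1 * q1)

def path(t):
    """An arbitrary curve q(t) in the space of q variables."""
    return (2.0 + math.cos(t), 3.0 + 0.5 * math.sin(2 * t))

def lift(t):
    """Lift the curve to Sigma: the point (q, p, S) with p = grad f and S = f."""
    q = path(t)
    return q, grad_f(q), f(q)

def alpha_on_tangent(t, h=1e-5):
    """Evaluate alpha = -dS + sum_i p_i dq_i on the lifted curve's tangent vector."""
    q_plus, _, S_plus = lift(t + h)
    q_minus, _, S_minus = lift(t - h)
    _, p, _ = lift(t)
    dS = (S_plus - S_minus) / (2 * h)
    dq = [(a - b) / (2 * h) for a, b in zip(q_plus, q_minus)]
    return -dS + sum(p_i * dq_i for p_i, dq_i in zip(p, dq))

values = [alpha_on_tangent(0.3 * k) for k in range(20)]
print(max(abs(v) for v in values))  # should be zero up to finite-difference error
```

Any tangent vector to \Sigma is the velocity of some such lifted curve, so this is the same check as in the proof, just done with numbers.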

Indeed, V. I. Arnol’d says this was implicitly known to the great founder of statistical mechanics, Josiah Willard Gibbs. Arnol’d calls \mathbb{R}^5 with coordinates energy, entropy, temperature, pressure and volume the ‘Gibbs manifold’, and he proclaims:

Gibbs’ thesis: substances are Legendrian submanifolds of the Gibbs manifold.

This is from here:

• V. I. Arnol’d, Contact geometry: the geometrical method of Gibbs’ thermodynamics, Proceedings of the Gibbs Symposium (New Haven, CT, 1989), AMS, Providence, Rhode Island, 1990.

A bit more detail

Now I want to say everything again, with a bit of extra detail, assuming more familiarity with manifolds. Above I was using \mathbb{R}^n with coordinates q_1, \dots, q_n to describe the ‘extensive’ variables of a thermodynamic system. But let’s be a bit more general and use any smooth n-dimensional manifold Q. Even if Q is a vector space, this viewpoint is nice because it’s manifestly coordinate-independent!

So: starting from Q we build the cotangent bundle T^\ast Q. A point in the cotangent bundle describes both extensive variables, namely q \in Q, and ‘intensive’ variables, namely a cotangent vector p \in T^\ast_q Q.

The manifold T^\ast Q has a 1-form \theta on it called the tautological 1-form. We can describe it as follows. Given a tangent vector v \in T_{(q,p)} T^\ast Q we have to say what \theta(v) is. Using the projection

\pi \colon T^\ast Q \to Q

we can project v down to a tangent vector d\pi(v) at the point q. But the 1-form p eats tangent vectors at q and spits out numbers! So, we set

\theta(v) = p(d\pi(v))

This is sort of mind-boggling at first, but it’s worth pondering until it makes sense. It helps to work out what \theta looks like in local coordinates. Starting with any local coordinates q_i on an open set of Q, we get local coordinates q_i, p_i on the cotangent bundle of this open set in the usual way. On this open set you then get

\theta = p_1 dq_1 + \cdots + p_n dq_n

This is a standard calculation, which is really worth doing!
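Here is a sketch of how it goes, with a_i and b_i being names introduced just for this sketch. In the coordinates q_i, p_i a tangent vector v at the point (q,p) looks like

\displaystyle{ v = \sum_i a_i \frac{\partial}{\partial q_i} + \sum_i b_i \frac{\partial}{\partial p_i} }

The projection \pi forgets the coordinates p_i, so

\displaystyle{ d\pi(v) = \sum_i a_i \frac{\partial}{\partial q_i} }

The coordinates p_i are defined to be the components of the cotangent vector p, meaning p = \sum_i p_i dq_i at the point q. Thus

\displaystyle{ \theta(v) = p(d\pi(v)) = \sum_i p_i a_i = \sum_i p_i dq_i(v) }

where in the last expression dq_i is regarded as a 1-form on the cotangent bundle. Since this holds for every v, it gives the formula for \theta.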

It follows that we can define a symplectic structure \omega by

\omega = d \theta

and get this formula in local coordinates:

\omega = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n

Now, suppose we choose a smooth function

f \colon Q \to \mathbb{R}

which describes the entropy. We get a 1-form df, which we can think of as a map

df \colon Q \to T^\ast Q

assigning to each choice q of extensive variables the pair (q,p) of extensive and intensive variables where

p = df_q

The image of the map df is a ‘Lagrangian submanifold’ of T^\ast Q: that is, an n-dimensional submanifold \Lambda such that

\left.\omega\right|_{\Lambda} = 0

Lagrangian submanifolds are to symplectic geometry as Legendrian submanifolds are to contact geometry! What we’re seeing here is that if Gibbs had preferred symplectic geometry, he could have described substances as Lagrangian submanifolds rather than Legendrian submanifolds. But this approach would only keep track of the derivatives of entropy, df, not the actual value of the entropy function f.
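Here is a sketch of why the image of df is Lagrangian. In local coordinates, points of \Lambda have p_i = \partial f / \partial q_i, so on \Lambda we get

\displaystyle{ dp_i = \sum_j \frac{\partial^2 f}{\partial q_j \partial q_i} dq_j }

and thus

\displaystyle{ \left.\omega\right|_{\Lambda} = \sum_{i,j} \frac{\partial^2 f}{\partial q_j \partial q_i} dq_j \wedge dq_i = 0 }

The sum vanishes because the second partial derivatives are symmetric in i and j while dq_j \wedge dq_i is antisymmetric. A coordinate-free way to say the same thing uses the standard fact that the tautological 1-form pulls back along any 1-form \beta \colon Q \to T^\ast Q to \beta itself. Taking \beta = df gives

(df)^\ast \omega = d((df)^\ast \theta) = d(df) = 0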

If we prefer to keep track of the actual value of f using contact geometry, we can do that. For this we add an extra dimension to T^\ast Q and form the manifold T^\ast Q \times \mathbb{R}. The extra dimension represents entropy, so we’ll use S as our name for the coordinate on \mathbb{R}.

We can make T^\ast Q \times \mathbb{R} into a contact manifold with contact 1-form

\alpha = -d S + \theta

In local coordinates we get

\alpha = -dS + p_1 dq_1 + \cdots + p_n dq_n

just as we had earlier. And just as before, if we choose a smooth function f \colon Q \to \mathbb{R} describing entropy, the subset

\Sigma = \{(q,p,S) \in T^\ast Q \times \mathbb{R} : \; S = f(q), \; p = df_q \}

is a Legendrian submanifold of T^\ast Q \times \mathbb{R}.

Okay, this concludes my lightning review of symplectic and contact geometry in thermodynamics! Next time I’ll talk about something a bit less well understood: how they show up in statistical mechanics.

Thermodynamics and Economic Equilibrium

18 July, 2021

I’m having another round of studying thermodynamics, and I’m running into more interesting leads than I can keep up with. Like this paper:

• Eric Smith and Duncan K. Foley, Classical thermodynamics and economic general equilibrium theory, Journal of Economic Dynamics and Control 32 (2008) 7–65.

I’ve always been curious about the connection between economics and thermodynamics, but I know too little about economics to make this easy to explore. There are people who work on subjects called thermoeconomics and econophysics, but classical economists consider them ‘heterodox’. While I don’t trust classical economists to be right about things, I should probably learn more classical economics before I jump into the fray.

Still, the introduction of this paper is intriguing:

The relation between economic and physical (particularly thermodynamic) concepts of equilibrium has been a topic of recurrent interest throughout the development of neoclassical economic theory. As systems for defining equilibria, proving their existence, and computing their properties, neoclassical economics (Mas-Collel et al., 1995; Varian, 1992) and classical thermodynamics (Fermi, 1956) undeniably have numerous formal and methodological similarities. Both fields seek to describe system phenomena in terms of solutions to constrained optimization problems. Both rely on dual representations of interacting subsystems: the state of each subsystem is represented by pairs of variables, one variable from each pair characterizing the subsystem’s content, and the other characterizing the way it interacts with other subsystems. In physics the content variables are quantities like a subsystem’s total energy or the volume in space it occupies; in economics they are amounts of various commodities held by agents. In physics the interaction variables are quantities like temperature and pressure that can be measured on the system boundaries; in economics they are prices that can be measured by an agent’s willingness to trade one commodity for another.

In thermodynamics these pairs are called conjugate variables. The ‘content variables’ are usually called extensive and the ‘interaction variables’ are usually called intensive. A vector space with conjugate pairs of variables as coordinates is a symplectic vector space, and I’ve written about how these show up in the category-theoretic approach to open systems:

• John Baez, A compositional framework for passive linear networks, Azimuth, 28 April 2015.

Continuing on:

The significance attached to these similarities has changed considerably, however, in the time from the first mathematical formulation of utility (Walras, 1909) to the full axiomatization of general equilibrium theory (Debreu, 1987). Léon Walras appears (Mirowski, 1989) to have conceptualized economic equilibrium as a balance of the gradients of utilities, more for the sake of similarity to the concept of force balance in mechanics, than to account for any observations about the outcomes of trade. Fisher (1892) (a student of J. Willard Gibbs) attempted to update Walrasian metaphors from mechanics to thermodynamics, but retained Walras’s program of seeking an explicit parallelism between physics and economics.

This Fisher is not the geneticist and statistician Ronald Fisher who came up with Fisher’s fundamental theorem. It’s the author of this thesis:

• Irving Fisher, Mathematical Investigations in the Theory of Value and Prices, Ph.D. thesis, Yale University, 1892.

Continuing on with Smith and Foley’s paper:

As mathematical economics has become more sophisticated (Debreu, 1987) the naive parallelism of Walras and Fisher has progressively been abandoned, and with it the sense that it matters whether neoclassical economics resembles any branch of physics. The cardinalization of utility that Walras thought of as a counterpart to energy has been discarded, apparently removing the possibility of comparing utility with any empirically measurable quantity. A long history of logically inconsistent (or simply unproductive) analogy making (see Section 7.2) has further caused the topic of parallels to fall out of favor. Samuelson (1960) summarizes well the current view among many economists, at the end of one of the few methodologically sound analyses of the parallel roles of dual representation in economics and physics:

The formal mathematical analogy between classical thermodynamics and mathematic economic systems has now been explored. This does not warrant the commonly met attempt to find more exact analogies of physical magnitudes—such as entropy or energy—in the economic realm. Why should there be laws like the first or second laws of thermodynamics holding in the economic realm? Why should ‘utility’ be literally identified with entropy, energy, or anything else? Why should a failure to make such a successful identification lead anyone to overlook or deny the mathematical isomorphism that does exist between minimum systems that arise in different disciplines?

The view that neoclassical economics is now mathematically mature, and that it is mere coincidence and no longer relevant whether it overlaps with any body of physical theory, is reflected in the complete omission of the topic of parallels from contemporary graduate texts (Mas-Collel et al., 1995). We argue here that, despite its long history of discussion, there are important insights still to be gleaned from considering the relation of neoclassical economics to classical thermodynamics. The new results concerning this relation we present here have significant implications, both for the interpretation of economic theory and for econometrics. The most important point of this paper (more important than the establishment of formal parallels between thermodynamics and utility economics) is that economics, because it does not recognize an equation of state or define prices intrinsically in terms of equilibrium, lacks the close relation between measurement and theory physical thermodynamics enjoys.

Luckily, the paper seems to be serious about explaining economics to those who know thermodynamics (and maybe vice versa). So, I will now read the rest of the paper—or at least skim it.

One interesting simple point seems to be this: there’s an analogy between entropy maximization and utility maximization, but it’s limited by the following difference.

In classical thermodynamics the total entropy of a closed system made of subsystems is the sum of the entropies of the parts. While the second law forbids the system from moving to a state of lower total entropy, the entropies of some parts can decrease.

By contrast, in classical economics the total utility of a collection of agents is an unimportant quantity: what matters is the utility of each individual agent. The reason is that we assume the agents will voluntarily move from one state to another only if the utility of each agent separately increases. Furthermore, if we believe we can reparametrize the utility of each agent without changing anything, it makes no sense to add utilities.

(On the other hand, some utilitarian ethicists seem to believe it makes sense to add utilities and try to maximize the total. I imagine that libertarians would consider this ‘totalitarian’ approach morally unacceptable. I’m even less eager to enter discussions of the foundations of ethics than of economics, but it’s interesting how the question of whether a quantity can or ‘should’ be totaled up and then maximized plays a role in this debate.)

The Ideal Monatomic Gas

15 July, 2021

Today at the Topos Institute, Sophie Libkind, Owen Lynch and I spent some time talking about thermodynamics, Carnot engines and the like. As a result, I want to work out for myself some basic facts about the ideal gas. This stuff is all well-known, but I’m having trouble finding exactly what I want—and no more, thank you—collected in one place.

Just for background, the Carnot cycle looks roughly like this:

This is actually a very inaccurate picture, but it gets the point across. We have a container of gas, and we make it execute a cyclic motion, so its pressure P and volume V trace out a loop in the plane. As you can see, this loop consists of four curves:

• In the first, from a to b, we put a container of gas in contact with a hot medium. Then we make it undergo isothermal expansion: that is, expansion at a constant temperature.

• In the second, from b to c, we insulate the container and let the gas undergo adiabatic reversible expansion: that is, expansion while no heat enters or leaves. The temperature drops, but merely because the container expands, not because heat leaves. It reaches a lower temperature. Then we remove the insulation.

• In the third, from c to d, we put the container in contact with a cold medium that matches its temperature. Then we make it undergo isothermal contraction: that is, contraction at a constant temperature.

• In the fourth, from d to a, we insulate the container and let the gas undergo adiabatic reversible contraction: that is, contraction while no heat enters or leaves. The temperature increases until it matches that of the hot medium. Then we remove the insulation.

The Carnot cycle is historically important because it’s an example of a heat engine that’s as efficient as possible: it gives you the most work possible for the given amount of heat transferred from the hot medium to the cold medium. But I don’t want to get into that. I just want to figure out formulas for everything that’s going on here, including formulas for the four curves in this picture!

To get specific formulas, I’ll consider an ideal monatomic gas, meaning a gas made of individual atoms, like helium. Some features of an ideal gas, like the formula for energy as a function of temperature, depend on whether it’s monatomic.

As a quirky added bonus, I’d like to highlight how certain properties of the ideal monatomic gas depend on the dimension of space. There’s a certain chunk of the theory that doesn’t depend on the dimension of space, as long as you interpret ‘volume’ to mean the n-dimensional analogue of volume. But the number 3 shows up in the formula for the energy of the ideal monatomic gas. And this is because space is 3-dimensional! So just for fun, I’ll do the whole analysis in n dimensions.

There are four basic formulas we need to know.

First, we have the ideal gas law:

PV = NkT


P is the pressure.
V is the n-dimensional volume.
N is the number of molecules in a container of gas.
k is a constant called Boltzmann’s constant.
T is the temperature.

Second, we have a formula for the energy, or more precisely the internal energy, of a monatomic ideal gas:

U = \frac{n}{2} NkT


U is the internal energy.
n is the dimension of space.

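The factor of n/2 shows up thanks to the equipartition theorem: classically, each quadratic term in the energy of a system at temperature T has expected value kT/2. A harmonic oscillator has both a kinetic and a potential term for each degree of freedom, so its expected energy is kT times its number of degrees of freedom. An atom in an ideal gas has only kinetic energy, so it gets half that: kT/2 for each of the n different directions in which it can move around.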

Third, we have a relation between internal energy, work and heat:

dU = \delta W + \delta Q


dU is the differential of internal energy.
\delta W is the infinitesimal work done to the gas.
\delta Q is the infinitesimal heat transferred to the gas.

The intuition is simple: to increase the energy of some gas you can do work to it or transfer heat to it. But the math may seem a bit murky, so let me explain.

I emphasize ‘to’ because it affects the sign: for example, the work done by the gas is minus the work done to the gas. Work done to the gas increases its internal energy, while work done by it reduces its internal energy. Similarly for heat.

But what is this ‘infinitesimal’ stuff, and these weird \delta symbols?

In a minute I’m going to express everything in terms of P and V. So, T, N and U will be functions on the plane with coordinates P and V. dU will be a 1-form on this plane: it’s the differential of the function U.

But \delta W and \delta Q are not differentials of functions W and Q. There are no functions on the plane called W and Q. You cannot take a box of gas and measure its work, or heat! There are just 1-forms called \delta W and \delta Q describing the change in work or heat. These are not exact 1-forms: that is, they’re not differentials of functions.

Fourth and finally:

\delta W = - P dV

This should be intuitive. The work done by the gas on the outside world by changing its volume a little equals the pressure times the change in volume. So, the work done to the gas is minus the pressure times the change in volume.
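Here is a quick check, using only the formulas so far, that neither \delta W nor \delta Q is the differential of a function. The differential of a function always has vanishing exterior derivative, but

d(\delta W) = d(-P \, dV) = -dP \wedge dV

which is nonzero everywhere on the plane with coordinates P and V. Since d(dU) = 0, taking the exterior derivative of both sides of dU = \delta W + \delta Q gives

d(\delta Q) = -d(\delta W) = dP \wedge dV

which is nonzero too.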

One nice feature of the 1-form \delta W = -P d V is this: as we integrate it around a simple closed curve going counterclockwise, we get the area enclosed by that curve. So, the area of this region:

is the work done by our container of gas during the Carnot cycle. (There are a lot of minus signs to worry about here, but don’t worry, I’ve got them under control. Our curve is going clockwise, so the work done to our container of gas is negative, and it’s minus the area in the region.)
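The sign conventions can be checked numerically. This sketch, with function names made up for illustration, integrates \delta W = -P dV around a circle of radius 1 in the plane with V on the horizontal axis and P on the vertical axis, once counterclockwise and once clockwise. The circle stands in for any simple closed curve; it is not the Carnot cycle itself.

```python
import math

def work_done_to_gas(loop, steps=20000):
    """Integrate delta W = -P dV once around a closed loop t -> (V(t), P(t)), t in [0, 1]."""
    total = 0.0
    for k in range(steps):
        V0, P0 = loop(k / steps)
        V1, P1 = loop((k + 1) / steps)
        total += -0.5 * (P0 + P1) * (V1 - V0)  # trapezoid rule for -P dV
    return total

def circle(sign, V_c=2.0, P_c=3.0, r=1.0):
    """Circle in the (V, P) plane; sign=+1 is counterclockwise, sign=-1 is clockwise."""
    def loop(t):
        angle = sign * 2 * math.pi * t
        return V_c + r * math.cos(angle), P_c + r * math.sin(angle)
    return loop

ccw = work_done_to_gas(circle(+1))
cw = work_done_to_gas(circle(-1))
print(ccw, cw, math.pi)  # area of the unit disk is pi
```

The counterclockwise integral should come out close to the enclosed area, and the clockwise one close to minus that area.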

Okay, now that we have our four basic equations, we can play with them and derive consequences. Let’s suppose the number N of atoms in our container of gas is fixed—a constant. Then we think of everything as a function of two variables: P and V.

First, since PV = NkT we have

\displaystyle{ T = \frac{PV}{Nk} }

So temperature is proportional to pressure times volume.

Second, since PV = NkT and U = \frac{n}{2}NkT we have

U = \frac{n}{2} P V

So, like the temperature, the internal energy of the gas is proportional to pressure times volume—but it depends on the dimension of space!

From this we get

dU = \frac{n}{2} d(PV) = \frac{n}{2}( V dP + P dV)

From this and our formulas dU = \delta W + \delta Q, \delta W = -PdV we get

\begin{array}{ccl}  \delta Q &=& dU - \delta W \\  \\  &=& \frac{n}{2}( V dP + P dV) + P dV \\ \\  &=& \frac{n}{2} V dP + \frac{n+2}{2} P dV   \end{array}

That’s basically it!

But now we know how to figure out everything about the Carnot cycle. I won’t do it all here, but I’ll work out formulas for the curves in this cycle:

The isothermal curves are easy, since we’ve seen temperature is proportional to pressure times volume:

\displaystyle{ T = \frac{PV}{Nk} }

So, an isothermal curve is any curve with

P \propto V^{-1}

The adiabatic reversible curves, or ‘adiabats’ for short, are a lot more interesting. A curve C in the plane with coordinates P and V is an adiabat if, when the container of gas changes pressure and volume while moving along this curve, no heat gets transferred to or from the gas. That is:

\delta Q \Big|_C = 0

where the funny symbol means I’m restricting a 1-form to the curve and getting a 1-form on that curve (which happens to be zero).

Let’s figure out what an adiabat looks like! By our formula for \delta Q we have

(\frac{n}{2} V dP + \frac{n+2}{2} P dV) \Big|_C = 0


\frac{n}{2} V dP \Big|_C = -\frac{n+2}{2} P dV \Big|_C


\frac{dP}{P} \Big|_C = - \frac{n+2}{n} \frac{dV}{V}\Big|_C

Now, we can integrate both sides along a portion of the curve C and get

\ln P = - \frac{n+2}{n} \ln V + \mathrm{constant}


P \propto V^{-(n+2)/n}

So in 3-dimensional space, as you let a gas expand adiabatically—say by putting it in an insulated cylinder so heat can’t get in or out—its pressure drops as its volume increases. But for a monatomic gas it drops in this peculiar specific way: the pressure goes like the volume to the -5/3 power.
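As a sanity check on the integration step, here is a small sketch (names and starting values are arbitrary choices for illustration). It rewrites the condition dP/P = -((n+2)/n) dV/V as the differential equation dP/dV = -((n+2)/n) P/V, solves it numerically with the classical fourth-order Runge–Kutta method, and compares the result with the power law. It also evaluates the isotherm through the same starting point for comparison.

```python
def adiabat_pressure(V, V0, P0, n=3):
    """Closed form P = P0 (V/V0)^(-(n+2)/n) for the adiabat through (V0, P0)."""
    return P0 * (V / V0) ** (-(n + 2) / n)

def isotherm_pressure(V, V0, P0):
    """Closed form P = P0 V0 / V for the isotherm through (V0, P0)."""
    return P0 * V0 / V

def integrate_adiabat(V0, P0, V1, n=3, steps=1000):
    """Solve dP/dV = -((n+2)/n) P/V from V0 to V1 with classical RK4."""
    exponent = (n + 2) / n

    def slope(V, P):
        return -exponent * P / V

    h = (V1 - V0) / steps
    V, P = V0, P0
    for _ in range(steps):
        k1 = slope(V, P)
        k2 = slope(V + h / 2, P + h * k1 / 2)
        k3 = slope(V + h / 2, P + h * k2 / 2)
        k4 = slope(V + h, P + h * k3)
        P += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        V += h
    return P

V0, P0, V1 = 1.0, 1.0, 2.0
numeric = integrate_adiabat(V0, P0, V1)
exact = adiabat_pressure(V1, V0, P0)
print(numeric, exact, isotherm_pressure(V1, V0, P0))
```

The first two numbers should agree closely, and both should be smaller than the third, since doubling the volume lowers the pressure more along the adiabat than along the isotherm.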

In any dimension, the pressure of the monatomic gas drops more steeply when the container expands adiabatically than when it expands at constant temperature. Why? Because V^{-(n+2)/n} drops more rapidly than V^{-1} since

\frac{n+2}{n} > 1

But as n \to \infty,

\frac{n+2}{n} \to 1

so the adiabats become closer and closer to the isothermal curves in high dimensions. This is not important for understanding the conceptually significant features of the Carnot cycle! But it’s curious, and I’d like to improve my understanding by thinking about it until it seems obvious. It doesn’t yet.

Fisher’s Fundamental Theorem (Part 4)

13 July, 2021

I wrote a paper that summarizes my work connecting natural selection to information theory:

• John Baez, The fundamental theorem of natural selection.

Check it out! If you have any questions or see any mistakes, please let me know.

Just for fun, here’s the abstract and introduction.

Abstract. Suppose we have n different types of self-replicating entity, with the population P_i of the ith type changing at a rate equal to P_i times the fitness f_i of that type. Suppose the fitness f_i is any continuous function of all the populations P_1, \dots, P_n. Let p_i be the fraction of replicators that are of the ith type. Then p = (p_1, \dots, p_n) is a time-dependent probability distribution, and we prove that its speed as measured by the Fisher information metric equals the variance in fitness. In rough terms, this says that the speed at which information is updated through natural selection equals the variance in fitness. This result can be seen as a modified version of Fisher’s fundamental theorem of natural selection. We compare it to Fisher’s original result as interpreted by Price, Ewens and Edwards.


In 1930, Fisher stated his “fundamental theorem of natural selection” as follows:

The rate of increase in fitness of any organism at any time is equal to its genetic variance in fitness at that time

Some tried to make this statement precise as follows:

The time derivative of the mean fitness of a population equals the variance of its fitness.

But this is only true under very restrictive conditions, so a controversy was ignited.

An interesting resolution was proposed by Price, and later amplified by Ewens and Edwards. We can formalize their idea as follows. Suppose we have n types of self-replicating entity, and idealize the population of the ith type as a real-valued function P_i(t). Suppose

\displaystyle{ \frac{d}{dt} P_i(t) = f_i(P_1(t), \dots, P_n(t)) \, P_i(t) }

where the fitness f_i is a differentiable function of the populations of every type of replicator. The mean fitness at time t is

\displaystyle{ \overline{f}(t) = \sum_{i=1}^n p_i(t) \, f_i(P_1(t), \dots, P_n(t)) }

where p_i(t) is the fraction of replicators of the ith type:

\displaystyle{ p_i(t) = \frac{P_i(t)}{\phantom{\Big|} \sum_{j = 1}^n P_j(t) } }

By the product rule, the rate of change of the mean fitness is the sum of two terms:

\displaystyle{ \frac{d}{dt} \overline{f}(t) = \sum_{i=1}^n \dot{p}_i(t) \, f_i(P_1(t), \dots, P_n(t)) \; + \; }

\displaystyle{ \sum_{i=1}^n p_i(t) \,\frac{d}{dt} f_i(P_1(t), \dots, P_n(t)) }

The first of these two terms equals the variance of the fitness at time t. We give the easy proof in Theorem 1. Unfortunately, the conceptual significance of this first term is much less clear than that of the total rate of change of mean fitness. Ewens concluded that “the theorem does not provide the substantial biological statement that Fisher claimed”.

But there is another way out, based on an idea Fisher himself introduced in 1922: Fisher information. Fisher information gives rise to a Riemannian metric on the space of probability distributions on a finite set, called the ‘Fisher information metric’—or in the context of evolutionary game theory, the ‘Shahshahani metric’. Using this metric we can define the speed at which a time-dependent probability distribution changes with time. We call this its ‘Fisher speed’. Under just the assumptions already stated, we prove in Theorem 2 that the Fisher speed of the probability distribution

p(t) = (p_1(t), \dots, p_n(t))

is the variance of the fitness at time t.

As explained by Harper, natural selection can be thought of as a learning process, and studied using ideas from information geometry—that is, the geometry of the space of probability distributions. As p(t) changes with time, the rate at which information is updated is closely connected to its Fisher speed. Thus, our revised version of the fundamental theorem of natural selection can be loosely stated as follows:

As a population changes with time, the rate at which information is updated equals the variance of fitness.

The precise statement, with all the hypotheses, is in Theorem 2. But one lesson is this: variance in fitness may not cause ‘progress’ in the sense of increased mean fitness, but it does cause change!
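These claims can be explored numerically. The following sketch is not from the paper: the fitness functions, initial populations and function names are hypothetical choices for illustration. It integrates the equation for the populations P_i, estimates the rates of change of the fractions p_i by finite differences, and compares the variance of fitness with two quantities: the first term \sum_i \dot{p}_i f_i in the rate of change of mean fitness, and \sum_i \dot{p}_i^2 / p_i, which is the squared length of \dot{p} in the Fisher information metric. Both comparisons rest on the identity \dot{p}_i = p_i (f_i - \overline{f}), which follows from differentiating the formula for p_i.

```python
def fitness(P):
    """Hypothetical fitness functions f_i(P_1, P_2, P_3), chosen for illustration."""
    total = sum(P)
    return [1.0 - 0.5 * P[0] / total,
            0.6 + 0.2 * P[2] / total,
            0.3 + 0.4 * P[0] / total]

def fractions(P):
    """Fractions p_i = P_i / sum_j P_j."""
    total = sum(P)
    return [x / total for x in P]

def step(P, dt):
    """One classical RK4 step of dP_i/dt = f_i(P) P_i (dt may be negative)."""
    def rhs(Q):
        return [f_i * q_i for f_i, q_i in zip(fitness(Q), Q)]
    k1 = rhs(P)
    k2 = rhs([p + dt * k / 2 for p, k in zip(P, k1)])
    k3 = rhs([p + dt * k / 2 for p, k in zip(P, k2)])
    k4 = rhs([p + dt * k for p, k in zip(P, k3)])
    return [p + dt * (a + 2 * b + 2 * c + d) / 6
            for p, a, b, c, d in zip(P, k1, k2, k3, k4)]

def variance_of_fitness(P):
    p, f = fractions(P), fitness(P)
    mean = sum(p_i * f_i for p_i, f_i in zip(p, f))
    return sum(p_i * (f_i - mean) ** 2 for p_i, f_i in zip(p, f))

def rates(P, dt=1e-4):
    """Estimate dp_i/dt by a central difference, stepping forward and backward in time."""
    ahead = fractions(step(P, dt))
    behind = fractions(step(P, -dt))
    return [(a - b) / (2 * dt) for a, b in zip(ahead, behind)]

P = [1.0, 2.0, 3.0]
rows = []
for sample in range(5):
    p, f, p_dot = fractions(P), fitness(P), rates(P)
    first_term = sum(r * f_i for r, f_i in zip(p_dot, f))
    fisher_squared = sum(r * r / p_i for r, p_i in zip(p_dot, p))
    rows.append((variance_of_fitness(P), first_term, fisher_squared))
    for _ in range(1000):  # advance one unit of time between samples
        P = step(P, 1e-3)
for row in rows:
    print(row)
```

In each printed row the three numbers should agree up to discretization error, even though the fitnesses here depend on the populations.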

For more details in a user-friendly blog format, read the whole series:

Part 1: the obscurity of Fisher’s original paper.

Part 2: a precise statement of Fisher’s fundamental theorem of natural selection, and conditions under which it holds.

Part 3: a modified version of the fundamental theorem of natural selection, which holds much more generally.

Part 4: my paper on the fundamental theorem of natural selection.

Complex Adaptive System Design (Part 10)

25 June, 2021

guest post by John Foley

Though the Complex Adaptive System Composition and Design Environment (CASCADE) program concluded in Fall 2020, just this week two new articles came out reviewing the work and future research directions:

• John Baez and John Foley, Operads for designing systems of systems, Notices of the American Mathematical Society 68 (2021), 1005–1007.

• John Foley, Spencer Breiner, Eswaran Subrahmanian and John Dusel, Operads for complex system design specification, analysis and synthesis, Proceedings of the Royal Society A 477 (2021), 20210099.

Operads for Designing Systems of Systems

The first is short and sweet (~2 pages!), aimed at a general mathematical audience. It describes the motivation for CASCADE and how basic modeling issues for point-to-point communications led to the development of network operads:

This figure depicts the prototypical example of this style of operad, the ‘simple network operad’, acting on an algebra of graphs whose nodes are endowed with locations and edges can be no longer than a fixed range limit. For more information, check out the article or Part 4 of this series.

For a quick, retrospective overview of CASCADE, this note is hard to beat, so I won’t repeat more here.

Operads for complex system design specification, analysis and synthesis

The second article is a full length review, aimed at a general applied science audience:

We introduce operads for design to a general scientific audience by explaining what the operads do relative to broadly applied techniques and how specific domain problems are modelled. Research directions are presented with an eye towards opening up interdisciplinary partnerships and continuing application-driven investigations to build on recent insights.

The review describes how operads apply to system design problems through three examples, one each for specification, analysis and synthesis, and concludes with a discussion of future research directions. The specification and synthesis examples come from applications of network operads in CASCADE, but the analysis example was contributed by collaborators Spencer Breiner and Eswaran Subrahmanian at the National Institute of Standards and Technology (NIST), who analyzed the Length Scale Interferometer (LSI) at NIST headquarters. Readers interested in a quick introduction to these examples should head directly to Section 3 of the review.

As we describe:

The present article captures an intermediate stage of technical maturity: operad-based design has shown its practicality by lowering barriers of entry for applied practitioners and demonstrating applied examples across many domains. However, it has not realized its full potential as an applied meta-language. Much of this recent progress is not focused solely on the analytic power of operads to separate concerns. Significant progress on explicit specification of domain models and techniques to automatically synthesize designs from basic building blocks has been made.

With this context, CASCADE’s contribution was prototyping general-purpose methods to specify basic building blocks and synthesize composite systems from atoms. By testing these methods against specific domain problems, we learned that domain-specific information should be exploited but systematically fitting together general-purpose and computationally efficient methods is challenging. Moreover, no reconciliation between the analytic point-of-view on operads for system design and the ‘generative’ perspective of network operads, which facilitate specification and synthesis, has been established. The review does not address how these threads might fit together precisely, but perhaps the answer looks something like this:

For more discussion of future research directions, please see Section 7 of the review, especially the open problems listed in 7f.

For readers that make it through the examples in Sections 4, 5 and 6 of the review but still want more, the following references provide additional details:

• John Baez, John Foley, Joe Moeller and Blake Pollard, Network models, Theory and Applications of Categories 35 (2020), 700–744.

• Spencer Breiner, Olivier Marie-Rose, Blake Pollard and Eswaran Subrahmanian, Modeling hierarchical system with operads, Electron. Proc. Theor. Comput. Sci. 323 (2020) 72–83.

• John Baez, John Foley and Joe Moeller, Network models from Petri nets with catalysts, Compositionality 1 (4) (2019).

Here’s the whole series of posts:

Part 1. CASCADE: the Complex Adaptive System Composition and Design Environment.

Part 2. Metron’s software for system design.

Part 3. Operads: the basic idea.

Part 4. Network operads: an easy example.

Part 5. Algebras of network operads: some easy examples.

Part 6. Network models.

Part 7. Step-by-step compositional design and tasking using commitment networks.

Part 8. Compositional tasking using category-valued network models.

Part 9. Network models from Petri nets with catalysts.

Part 10. Two papers reviewing the whole project.

Nonequilibrium Thermodynamics in Biology (Part 2)

16 June, 2021

Larry Li, Bill Cannon and I ran a session on non-equilibrium thermodynamics in biology at SMB2021, the annual meeting of the Society for Mathematical Biology. You can see talk slides here!

Here’s the basic idea:

Since Lotka, physical scientists have argued that living things belong to a class of complex and orderly systems that exist not despite the second law of thermodynamics, but because of it. Life and evolution, through natural selection of dissipative structures, are based on non-equilibrium thermodynamics. The challenge is to develop an understanding of what the respective physical laws can tell us about flows of energy and matter in living systems, and about growth, death and selection. This session addresses current challenges including understanding emergence, regulation and control across scales, and entropy production, from metabolism in microbes to evolving ecosystems.

Click on the links to see slides for most of the talks:

Persistence, permanence, and global stability in reaction network models: some results inspired by thermodynamic principles
Gheorghe Craciun, University of Wisconsin–Madison

The standard mathematical model for the dynamics of concentrations in biochemical networks is called mass-action kinetics. We describe mass-action kinetics and discuss the connection between special classes of mass-action systems (such as detailed balanced and complex balanced systems) and the Boltzmann equation. We also discuss the connection between the ‘global attractor conjecture’ for complex balanced mass-action systems and Boltzmann’s H-theorem. We also describe some implications for biochemical mechanisms that implement noise filtering and cellular homeostasis.

The principle of maximum caliber of nonequilibria
Ken Dill, Stony Brook University

Maximum Caliber is a principle for inferring pathways and rate distributions of kinetic processes. The structure and foundations of MaxCal are much like those of Maximum Entropy for static distributions. We have explored how MaxCal may serve as a general variational principle for nonequilibrium statistical physics: it gives well-known results near equilibrium, such as the Green-Kubo relations, Onsager’s reciprocal relations and Prigogine’s Minimum Entropy Production principle, but is also applicable far from equilibrium. I will also discuss some applications, such as finding reaction coordinates in molecular simulations, non-linear dynamics in gene circuits, power-law-tail distributions in ‘social-physics’ networks, and others.

Nonequilibrium biomolecular information processes
Pierre Gaspard, Université libre de Bruxelles

Nearly 70 years have passed since the discovery of DNA structure and its role in coding genetic information. Yet, the kinetics and thermodynamics of genetic information processing in DNA replication, transcription, and translation remain poorly understood. These template-directed copolymerization processes are running away from equilibrium, being powered by extracellular energy sources. Recent advances show that their kinetic equations can be exactly solved in terms of so-called iterated function systems. Remarkably, iterated function systems can determine the effects of genome sequence on replication errors, up to a million times faster than kinetic Monte Carlo algorithms. With these new methods, fundamental links can be established between molecular information processing and the second law of thermodynamics, shedding a new light on genetic drift, mutations, and evolution.

Nonequilibrium dynamics of disturbed ecosystems
John Harte, University of California, Berkeley

The Maximum Entropy Theory of Ecology (METE) predicts the shapes of macroecological metrics in relatively static ecosystems, across spatial scales, taxonomic categories, and habitats, using constraints imposed by static state variables. In disturbed ecosystems, however, with time-varying state variables, its predictions often fail. We extend macroecological theory from static to dynamic, by combining the MaxEnt inference procedure with explicit mechanisms governing disturbance. In the static limit, the resulting theory, DynaMETE, reduces to METE but also predicts a new scaling relationship among static state variables. Under disturbances, expressed as shifts in demographic, ontogenic growth, or migration rates, DynaMETE predicts the time trajectories of the state variables as well as the time-varying shapes of macroecological metrics such as the species abundance distribution and the distribution of metabolic rates over individuals. An iterative procedure for solving the dynamic theory is presented. Characteristic signatures of the deviation from static predictions of macroecological patterns are shown to result from different kinds of disturbance. By combining MaxEnt inference with explicit dynamical mechanisms of disturbance, DynaMETE is a candidate theory of macroecology for ecosystems responding to anthropogenic or natural disturbances.

Stochastic chemical reaction networks
Supriya Krishnamurthy, Stockholm University

The study of chemical reaction networks (CRNs) is a very active field. Earlier well-known results (Feinberg, Chem. Eng. Sci. 42, 2229 (1987); Anderson et al., Bull. Math. Biol. 72, 1947 (2010)) identify a topological quantity called deficiency, easy to compute for CRNs of any size, which, when exactly equal to zero, leads to a unique factorized (non-equilibrium) steady-state for these networks. No general results exist, however, for the steady states of non-zero-deficiency networks. In recent work, we show how to write the full moment-hierarchy for any non-zero-deficiency CRN obeying mass-action kinetics, in terms of equations for the factorial moments. Using these, we can recursively predict values for lower moments from higher moments, reversing the procedure usually used to solve moment hierarchies. We show, for non-trivial examples, that in this manner we can predict any moment of interest, for CRNs with non-zero deficiency and non-factorizable steady states. It is, however, an open question how scalable these techniques are for large networks.
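Since deficiency is described as easy to compute, here is a minimal Python sketch of the standard formula: deficiency = (number of complexes) − (number of linkage classes) − (rank of the span of the reaction vectors). This is not code from the talk. The function names `rank` and `deficiency` and the input format (each complex as a dictionary from species to counts) are assumptions made for illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def deficiency(species, reactions):
    """reactions: list of (source, target) pairs of dicts species -> count."""
    def key(cx):
        return tuple(cx.get(s, 0) for s in species)

    # Complexes are the distinct sources and targets of reactions.
    complexes = sorted({key(c) for rxn in reactions for c in rxn})

    # Linkage classes: connected components of the reaction graph (union-find).
    parent = {c: c for c in complexes}

    def find(c):
        while parent[c] != c:
            c = parent[c]
        return c

    for src, tgt in reactions:
        parent[find(key(src))] = find(key(tgt))
    linkage_classes = len({find(c) for c in complexes})

    # Reaction vectors: target minus source, in species coordinates.
    vectors = [[t - s for s, t in zip(key(src), key(tgt))]
               for src, tgt in reactions]
    return len(complexes) - linkage_classes - rank(vectors)


# A + B <-> C: 2 complexes, 1 linkage class, rank 1.
reversible = deficiency(
    ["A", "B", "C"],
    [({"A": 1, "B": 1}, {"C": 1}), ({"C": 1}, {"A": 1, "B": 1})],
)

# A -> 2A and A -> 0: 3 complexes, 1 linkage class, rank 1.
birth_death = deficiency(
    ["A"],
    [({"A": 1}, {"A": 2}), ({"A": 1}, {})],
)
```

The first network has deficiency zero and the second has deficiency one, so only the first falls under the zero-deficiency results mentioned above.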

Heat flows adjust local ion concentrations in favor of prebiotic chemistry
Christof Mast, Ludwig-Maximilians-Universität München

Prebiotic reactions often require certain initial concentrations of ions. For example, the activity of RNA enzymes requires a lot of divalent magnesium salt, whereas too much monovalent sodium salt leads to a reduction in enzyme function. However, it is known from leaching experiments that prebiotically relevant geomaterial such as basalt releases mainly a lot of sodium and only little magnesium. A natural solution to this problem is heat fluxes through thin rock fractures, through which magnesium is actively enriched and sodium is depleted by thermogravitational convection and thermophoresis. This process establishes suitable conditions for ribozyme function from a basaltic leach. It can take place in a spatially distributed system of rock cracks and is therefore particularly stable to natural fluctuations and disturbances.

Deficiency of chemical reaction networks and thermodynamics
Matteo Polettini, University of Luxembourg

Deficiency is a topological property of a Chemical Reaction Network linked to important dynamical features, in particular of deterministic fixed points and of stochastic stationary states. Here we link it to thermodynamics: in particular we discuss the validity of a strong vs. weak zeroth law, the existence of time-reversed mass-action kinetics, and the possibility to formulate marginal fluctuation relations. Finally we illustrate some subtleties of the Python module we created for MCMC stochastic simulation of CRNs, soon to be made public.

Large deviations theory and emergent landscapes in biological dynamics
Hong Qian, University of Washington

The mathematical theory of large deviations provides a nonequilibrium thermodynamic description of complex biological systems that consist of heterogeneous individuals. In terms of the notions of stochastic elementary reactions and pure kinetic species, the continuous-time, integer-valued Markov process dictates a thermodynamic structure that generalizes (i) Gibbs’ microscopic chemical thermodynamics of equilibrium matters to nonequilibrium small systems such as living cells and tissues; and (ii) Gibbs’ potential function to the landscapes for biological dynamics, such as that of C. H. Waddington and S. Wright.

Using the maximum entropy production principle to understand and predict microbial biogeochemistry
Joseph Vallino, Marine Biological Laboratory, Woods Hole

Natural microbial communities contain billions of individuals per liter and can exceed a trillion cells per liter in sediments, as well as harbor thousands of species in the same volume. The high species diversity contributes to extensive metabolic functional capabilities to extract chemical energy from the environment, such as methanogenesis, sulfate reduction, anaerobic photosynthesis, chemoautotrophy, and many others, most of which are only expressed by bacteria and archaea. Reductionist modeling of natural communities is problematic, as we lack knowledge on growth kinetics for most organisms and have even less understanding on the mechanisms governing predation, viral lysis, and predator avoidance in these systems. As a result, existing models that describe microbial communities contain dozens to hundreds of parameters, and state variables are extensively aggregated. Overall, the models are little more than non-linear parameter fitting exercises that have limited, to no, extrapolation potential, as there are few principles governing organization and function of complex self-assembling systems. Over the last decade, we have been developing a systems approach that models microbial communities as a distributed metabolic network that focuses on metabolic function rather than describing individuals or species. We use an optimization approach to determine which metabolic functions in the network should be up regulated versus those that should be down regulated based on the non-equilibrium thermodynamics principle of maximum entropy production (MEP). Derived from statistical mechanics, MEP proposes that steady state systems will likely organize to maximize free energy dissipation rate. We have extended this conjecture to apply to non-steady state systems and have proposed that living systems maximize entropy production integrated over time and space, while non-living systems maximize instantaneous entropy production. 
Our presentation will provide a brief overview of the theory and approach, as well as present several examples of applying MEP to describe the biogeochemistry of microbial systems in laboratory experiments and natural ecosystems.

Reduction and the quasi-steady state approximation
Carsten Wiuf, University of Copenhagen

Chemical reactions often occur at different time-scales. In applications of chemical reaction network theory it is often desirable to reduce a reaction network to a smaller reaction network by elimination of fast species or fast reactions. There exist various techniques for doing so, e.g. the Quasi-Steady-State Approximation or the Rapid Equilibrium Approximation. However, these methods are not always mathematically justifiable. Here, a method is presented for which (so-called) non-interacting species are eliminated by means of QSSA. It is argued that this method is mathematically sound. Various examples are given (Michaelis-Menten mechanism, two-substrate mechanism, …) and older related techniques from the 50s and 60s are briefly discussed.
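For readers who have not seen the quasi-steady-state approximation, here is a minimal Python sketch of the textbook version for the Michaelis-Menten mechanism E + S ⇌ ES → E + P. It is not the method presented in the talk. The function names and the rate constants `k1`, `k_minus1`, `k2` with their numerical values are assumptions chosen only for illustration. The idea is to treat the complex ES as a fast species, set its rate of change to zero, and solve for its concentration.

```python
def full_rhs(s, es, e_total, k1, k_minus1, k2):
    """Mass-action rates for E + S <-> ES -> E + P, using E = e_total - ES.

    Returns (dS/dt, dES/dt, dP/dt).
    """
    e = e_total - es
    bind = k1 * e * s
    unbind = k_minus1 * es
    cat = k2 * es
    return (-bind + unbind, bind - unbind - cat, cat)


def qssa_complex(s, e_total, k1, k_minus1, k2):
    """Concentration of ES obtained by setting dES/dt = 0."""
    km = (k_minus1 + k2) / k1          # Michaelis constant
    return e_total * s / (km + s)


def qssa_rate(s, e_total, k1, k_minus1, k2):
    """Reduced rate law dP/dt = k2 * e_total * S / (K_M + S)."""
    return k2 * qssa_complex(s, e_total, k1, k_minus1, k2)


# Illustrative values only (assumptions, not taken from the talk).
s, e_total, k1, k_minus1, k2 = 2.0, 0.1, 1.0, 1.0, 1.0

es = qssa_complex(s, e_total, k1, k_minus1, k2)
ds_dt, des_dt, dp_dt = full_rhs(s, es, e_total, k1, k_minus1, k2)
reduced = qssa_rate(s, e_total, k1, k_minus1, k2)
```

At the quasi-steady-state value of ES, the full model gives dES/dt = 0 and its dP/dt agrees with the reduced rate law. Whether the reduced network is a good approximation over time is exactly the kind of question about mathematical justification raised above.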

Jacob Obrecht

15 June, 2021

This is a striking portrait of the “outsider genius” Jacob Obrecht:

Obrecht, ~1457–1505, was an important composer in the third generation of the Franco-Flemish school. While he was overshadowed by the superstar Josquin, I’m currently finding him more interesting—mainly on the basis of one long piece called Missa Maria zart.

Obrecht was very bold and experimental in his younger years. He would do wild stuff like play themes backwards, or take the notes in a melody, rearrange them in order of how long they were played, and use that as a new melody. Paraphrasing Wikipedia:

Combining modern and archaic elements, Obrecht’s style is multi-dimensional. Perhaps more than those of the mature Josquin, the masses of Obrecht display a profound debt to the music of Johannes Ockeghem in the wide-arching melodies and long musical phrases that typify the latter’s music. Obrecht’s style is an example of the contrapuntal extravagance of the late 15th century. He often used a cantus firmus technique for his masses: sometimes he divided his source material up into short phrases; at other times he used retrograde (backwards) versions of complete melodies or melodic fragments. He once even extracted the component notes and ordered them by note value, long to short, constructing new melodic material from the reordered sequences of notes. Clearly to Obrecht there could not be too much variety, particularly during the musically exploratory period of his early twenties. He began to break free from conformity to formes fixes (standard forms) especially in his chansons (songs). However, he much preferred composing Masses, where he found greater freedom. Furthermore, his motets reveal a wide variety of moods and techniques.

But I haven’t heard any of these far-out pieces yet. Instead, I’ve been wallowing in his masterpiece: Missa Maria zart, an hour-long mass he wrote one year before he died of the bubonic plague. Here is the Tallis Scholars version, with a score:

It’s harmonically sweet: it seems to avoid the pungent leading-tones that Dufay or even Ockeghem lean on. It’s highly non-repetitive: while the same themes get reused in endless variations, there’s little if any exact repetition of anything that came before. And it’s very homogeneous: nothing stands out very dramatically. So it’s a bit like a beautiful large stone with all its rough edges smoothed down by water, that’s hard to get a handle on. And I’m the sort of guy who finds this irresistibly attractive. After about a dozen listens, it reveals itself.

The booklet in the Tallis Scholars version, written by Peter Phillips, explains it better:

To describe Obrecht’s Missa Maria zart (‘Mass for gentle Mary’) as a ‘great work’ is true in two respects. It is a masterpiece of sustained and largely abstract musical thought; and it is possibly the longest polyphonic setting of the Mass Ordinary ever written, over twice the length of the more standard examples by Palestrina and Josquin. How it was possible for Obrecht to conceive something so completely outside the normal experience of his time is one of the most fascinating riddles in Renaissance music.

Jacob Obrecht (1457/8–1505) was born in Ghent and died in Ferrara. If the place of death suggests that he was yet another Franco-Flemish composer who received his training in the Low Countries and made his living in Italy, this is inaccurate. For although Obrecht was probably the most admired living composer alongside Josquin des Prés, he consistently failed to find employment in the Italian Renaissance courts. The reason for this may have been that he could not sing well enough: musicians at that time were primarily required to perform, to which composing took second place. Instead he was engaged by churches in his native land—in Utrecht, Bergen op Zoom, Cambrai, Bruges and Antwerp—before he finally decided in 1504 to take the risk and go to the d’Este court in Ferrara. Within a few months of arriving there he had contracted the plague. He died as the leading representative of Northern polyphonic style, an idiom which his Missa Maria zart explores to the full.

This Mass has inevitably attracted a fair amount of attention. The most recent writer on the subject is Rob Wegman (Born for the Muses: The Life and Masses of Jacob Obrecht, Oxford 1994, pp. 322–330): ‘Maria zart is the sphinx among Obrecht’s Masses. It is vast. Even the sections in reduced scoring … are unusually extended. Two successive duos in the Gloria comprise over 100 bars, two successive trios in the Credo close to 120; the Benedictus alone stretches over more than 100 bars’; ‘Maria zart has to be experienced as the whole, one-hour-long sound event that it is, and it will no doubt evoke different responses in each listener … one might say that the composer retreated into a sound world all his own’; ‘Maria zart is perhaps the only Mass that truly conforms to Besseler’s description of Obrecht as the outsider genius of the Josquin period.’ (Here Wegman, op. cit., p. 284, is referring to H. Besseler’s article ‘Von Dufay bis Josquin, ein Literaturbericht’, Zeitschrift für Musikwissenschaft 11 (1928/9), p. 18.)

The special sound world of Maria zart was not in fact created by anything unusual in its choice of voices. Many four-part Masses of the later fifteenth century were written for a similar grouping: low soprano, as here, or high alto as the top part; two roughly equal tenor lines, one of them normally carrying the chant when it is quoted in long notes; and bass. The unusual element is to a certain extent the range of the voices—they are all required to sing at extremes of their registers and to make very wide leaps—but more importantly the actual detail of the writing: the protracted sequences against the long chant notes, the instrumental-like repetitions and imitations.

It is this detail which explains the sheer length of this Mass. At thirty-two bars the melody of Maria zart is already quite long as a paraphrase model (the Western Wind melody, for example, is twenty-two bars long) and it duly becomes longer when it is stated in very protracted note-lengths. This happens repeatedly in all the movements, the most substantial augmentation being times twelve (for example, ‘Benedicimus te’ and ‘suscipe deprecationem nostram’ in the Gloria; ‘visibilium’ and ‘Et ascendit’ in the Credo). But what ultimately makes the setting so extremely elaborate is Obrecht’s technique of tirelessly playing with the many short phrases of this melody, quoting snippets of it in different voices against each other, constantly varying the extent of the augmentation even within a single statement, taking motifs from it which can then be turned into other melodies and sequences, stating the phrases in antiphony between different voices. By making a kaleidoscope of the melody in these ways he literally saturated all the voice-parts in all the sections with references to it. To identify them all would be a near impossible task. The only time that Maria zart is quoted in full from beginning to end without interruption, fittingly, is at the conclusion of the Mass, in the soprano part of the third Agnus Dei (though even here Obrecht several times introduced unscheduled octave leaps).

At the same time as constantly quoting from the Maria zart melody Obrecht developed some idiosyncratic ways of adorning it. Perhaps the first thing to strike the ear is that the texture of the music is remarkably homogeneous. There are none of the quick bursts of vocal virtuosity one may find in Ockeghem, or the equally quick bursts of triple-time metre in duple beloved of Dufay and others. The calmer, more consistent world of Josquin is suggested (though it is worth remembering that Josquin may well have learnt this technique in the first place from Obrecht). This sound is partly achieved by use of motifs, often derived from the tune, which keep the rhythmic stability of the original but go on to acquire a life of their own. Most famously these motifs become sequences—an Obrecht special—some of them with a dazzling number of repetitions (nine at ‘miserere’ in the middle of Agnus Dei I; six of the much more substantial phrase at ‘qui ex Patre’ in the Credo; nine in the soprano part alone at ‘Benedicimus te’ in the Gloria. This number is greatly increased by imitation in the other non-chant parts). Perhaps this method is at its most beautiful at the beginning of the Sanctus. In addition the motifs are used in imitation between the voices, sometimes so presented that the singers have to describe leaps of anything up to a twelfth to take their place in the scheme (as in the passage beginning ‘Benedicimus te’ in the Gloria mentioned above). It is the impression which Obrecht gives of having had an inexhaustible supply of these motifs and melodic ideas, free or derived, that gives this piece so much of its vitality. The mesmerizing effect of these musical snippets unceasingly passing back and forth around the long notes of the central melody is at the heart of the particular sound world of this great work.

When Obrecht wrote his Missa Maria zart is not certain. Wegman concludes that it is a late work—possibly his last surviving Mass setting—on the suggestion that Obrecht was in Innsbruck, on his way to Italy, at about the time that some other settings of the Maria zart melody are known to have been written. These, by Ludwig Senfl and others, appeared between 1500 and 1504–6; the melody itself, a devotional monophonic song, was probably written in the Tyrol in the late fifteenth century. The idea that this Mass, stylistically at odds with much of Obrecht’s other known late works and anyway set apart from all his other compositions, was something of a swansong is particularly appealing. We shall never know exactly what Obrecht was hoping to prove in it, but by going to the extremes he did he set his contemporaries a challenge in a certain kind of technique which they proved unable or unwilling to rival.

This Gramophone review of the Tallis Scholars performance, by David Fallows, is also helpful:

This is a bizarre and fascinating piece: and the disc is long-awaited, because The Tallis Scholars have been planning it for some years. It may be the greatest challenge they have faced so far. Normally a Renaissance Mass cycle lasts from 20 to 30 minutes; in the present performance, this one lasts 69 minutes. No ‘liturgical reconstruction’ with chants or anything to flesh out the disc: just solid polyphony the whole way. It seems, in fact, to be the longest known Renaissance Mass.

It is a work that has long held the attention of musicologists: Marcus van Crevel’s famous edition was preceded by 160 pages of introduction discussing its design and numerology. And nobody has ever explained why it survives in only a single source—a funny print by a publisher who produced no other known music book. However, most critics agree that this is one of Obrecht’s last and most glorious works, even if it leaves them tongue-tied. Rob C. Wegman’s recent masterly study of Obrecht’s Masses put it in a nutshell: “Forget the imitation, it seems to tell us, be still, and listen”.

There is room for wondering whether all of it needs to be quite so slow: an earlier record, by the Prague Madrigal Singers (Supraphon, 6/72 – nla), got through it in far less time. Moreover, Obrecht is in any case a very strange composer, treating his dissonances far more freely than most of his contemporaries, sometimes running sequential patterns beyond their limit, making extraordinary demands of the singers in terms of range and phrase-length. That is, there may be ways of making the music run a little more fluidly, so that the irrational dissonances do not come across as clearly as they do here. But in most ways it is hard to fault Peter Phillips’s reading of this massive work.

With only eight singers on the four voices, he takes every detail seriously. And they sing with such conviction and skill that there is hardly a moment when the ear is inclined to wander. As we have come to expect, The Tallis Scholars are technically flawless and constantly alive. Briefly, the disc is a triumph. But, more than that, it is a major contribution to the catalogue, unflinchingly presenting both the beauties and the apparent flaws of this extraordinary work. Phew!

My ear must be too jaded by modern music to notice the dissonances.

Data Visualization Course

10 June, 2021

Are you a student interested in data analysis and sustainability? Or maybe you know some students interested in these things?

Then check this out: my former student Nina Otter, who now teaches at UCLA and Leipzig, is offering a short course on how to analyze and present data using modern methods like topological data analysis—with sustainable fishing as an example!

Students who apply before June 15 have a chance to learn a lot of cool stuff and get paid for it!

Call for Applications

We are advertising the following bootcamp, which will take place remotely on 22-25 June 2021.

If you are interested in participating, please apply here:

FishEthoBase data visualisation bootcamp: this is a 4-day bootcamp, organised by the DeMoS Institute, whose aim is to study ways to visualise scores and criteria from a fish ethology database. The database is an initiative led by the non-profits fair-fish international and FishEthoGroup. It is publicly accessible and stores all currently available ethological knowledge on fish, with a specific focus on species farmed in aquaculture, with the goal of improving the welfare of fish.

The bootcamp will take place virtually on 22-25 June 2021, and will involve a maximum of eight students selected through an open call during the first half of June. The students will be guided by researchers in statistics and topological data analysis. During the first day of the bootcamp there will be talks given by researchers from FishEthoBase, as well as from the mentors. The next three days will be devoted to focused work in groups, with each day starting and ending with short presentations given by students about the progress of their work; after the presentations there will also be time for feedback and discussions from FishEthoBase researchers, and the mentors. Towards the end of August there will be a 2-hour follow-up meeting to discuss the implementation of the results from the bootcamp.

Target audience: we encourage applications from advanced undergraduate, master, and PhD students from a variety of backgrounds, including, but not limited to, computer science, mathematics, statistics, data analysis, computational biology, maritime sciences, and zoology.

Inclusivity: we especially encourage students from underrepresented groups to apply to this bootcamp.

Remuneration: The students who will be selected to participate in the bootcamp will be remunerated with a salary of 1400 euros.

When: 22-25 June 2021, approximately 11-18 CET each day

Where: remotely, on Zoom

I think it’s really cool that Nina Otter has started the DeMoS Institute. Here is the basic idea:

The institute carries out research on topics related to anti-democratic tendencies in our society, as well as on meta-scientific questions on how to make the scientific system more democratic. We believe that research must be done in the presence of those who bear its consequences. Therefore, we perform our research while at the same time directly implementing practices that promote inclusivity, interdisciplinarity, and active engagement with society at large.

Symmetric Monoidal Categories: a Rosetta Stone

28 May, 2021

The Topos Institute is in business! I’m really excited about visiting there this summer and working on applied category theory.

They recently had a meeting with some people concerned about AI risks, called Finding the Right Abstractions, organized by Scott Garrabrant, David Spivak, and Andrew Critch. I gave a gentle introduction to the uses of symmetric monoidal categories:

• Symmetric monoidal categories: a Rosetta Stone.

To describe systems composed of interacting parts, scientists and engineers draw diagrams of networks: flow charts, Petri nets, electrical circuit diagrams, signal-flow graphs, chemical reaction networks, Feynman diagrams and the like. All these different diagrams fit into a common framework: the mathematics of symmetric monoidal categories. While originally the morphisms in such categories were mainly used to describe processes, we can also use them to describe open systems.

You can see the slides here, and watch a video here:

For a lot more detail on these ideas, see:

• John Baez and Mike Stay, Physics, topology, logic and computation: a Rosetta Stone, in New Structures for Physics, ed. Bob Coecke, Lecture Notes in Physics vol. 813, Springer, Berlin, 2011, pp. 95–174.

Compositional Robotics (Part 2)

27 May, 2021

Very soon we’re having a workshop on applications of category theory to robotics:

2021 Workshop on Compositional Robotics: Mathematics and Tools, online, Monday 31 May 2021.

You’re invited! As of today it’s not too late to register and watch the talks online, and registration is free. Go here to register:

Here’s the schedule. All times are in UTC, so the show starts at 9:15 am Pacific Time:

Time (UTC) | Speaker | Title
16:15-16:30 | | Intro and plan of the workshop
 | Jonathan Lorand | Category Theory Basics
 | John Baez | Category Theory and Systems
 | | Breakout rooms
 | Andrea Censi & Gioele Zardini | Categories for Co-Design
 | David Spivak | Dynamic Interaction Patterns
 | | Breakout rooms
 | Aaron Ames | A Categorical Perspective on Robotics
21:30-22:15 | Daniel Koditschek | Toward a Grounded Type Theory for Robot Task Composition
22:30-00:30 | Selected speakers | Talks from open submissions

For more information go to the workshop website or my previous blog post on this workshop:

Compositional robotics (part 1).