Coupling Through Emergent Conservation Laws (Part 3)

28 June, 2018

joint post with Jonathan Lorand, Blake Pollard, and Maru Sarazola

Last time we gave a quick intro to the chemistry and thermodynamics we’ll use to understand ‘coupling’. Now let’s really get started!

Suppose that we are in a setting in which some reaction

\mathrm{X} + \mathrm{Y} \mathrel{\substack{\alpha_{\rightarrow} \\\longleftrightarrow\\ \alpha_{\leftarrow}}} \mathrm{XY}

takes place. Let’s also assume we are interested in the production of \mathrm{XY} from \mathrm{X} and \mathrm{Y}, but that in our system, the reverse reaction is favored. This means that the reverse rate constant exceeds the forward one, let’s say by a lot:

\alpha_\leftarrow \gg \alpha_\to

so that in equilibrium, the concentrations of the species will satisfy

\displaystyle{ \frac{[\mathrm{XY}]}{[\mathrm{X}][\mathrm{Y}]}\ll 1 }

which we assume undesirable. How can we influence this ratio to get a more desired outcome?

This is where coupling comes into play. Informally, we think of the coupling of two reactions as a process in which an endergonic reaction—one which does not ‘want’ to happen—is combined with an exergonic reaction—one that does ‘want’ to happen—in a way that improves the products-to-reactants concentrations ratio of the first reaction.

An important example of coupling, and one we will focus on, involves ATP hydrolysis:

\mathrm{ATP} + \mathrm{H}_2\mathrm{O} \mathrel{\substack{\beta_{\rightarrow} \\\longleftrightarrow\\ \beta_{\leftarrow}}} \mathrm{ADP} + \mathrm{P}_{\mathrm{i}} + \mathrm{H}^+

where ATP (adenosine triphosphate) reacts with a water molecule. Typically, this reaction results in ADP (adenosine diphosphate), a phosphate ion \mathrm{P}_{\mathrm{i}} and a hydrogen ion \mathrm{H}^+. To simplify calculations, we will replace the above equation with

\mathrm{ATP}  \mathrel{\substack{\beta_{\rightarrow} \\\longleftrightarrow\\ \beta_{\leftarrow}}} \mathrm{ADP} + \mathrm{P}_{\mathrm{i}}

since suppressing the bookkeeping of hydrogen and oxygen atoms in this manner will not affect our main points.

One reason ATP hydrolysis is good for coupling is that this reaction is strongly exergonic:

\beta_\to \gg \beta_\leftarrow

and in fact so much that

\displaystyle{ \frac{\beta_\to}{\beta_\leftarrow} \gg \frac{\alpha_\leftarrow}{\alpha_\to}  }

Yet this fact alone is insufficient to explain coupling!

To see why, suppose our system consists merely of the two reactions

\begin{array}{ccc}  \mathrm{X} + \mathrm{Y}   & \mathrel{\substack{\alpha_{\rightarrow} \\\longleftrightarrow\\ \alpha_{\leftarrow}}} & \mathrm{XY} \\ \\  \mathrm{ATP} & \mathrel{\substack{\beta_{\rightarrow} \\\longleftrightarrow\\ \beta_{\leftarrow}}} &  \mathrm{ADP} + \mathrm{P}_{\mathrm{i}} \label{beta}  \end{array}

happening in parallel. We can study the concentrations in equilibrium to see that one reaction has no influence on the other. Indeed, the rate equation for this reaction network is

\begin{array}{ccl}  \dot{[\mathrm{X}]} & = & -\alpha_\to [\mathrm{X}][\mathrm{Y}]+\alpha_\leftarrow [\mathrm{XY}]\\ \\  \dot{[\mathrm{Y}]} & = & -\alpha_\to [\mathrm{X}][\mathrm{Y}]+\alpha_\leftarrow [\mathrm{XY}]\\ \\  \dot{[\mathrm{XY}]} & = & \alpha_\to [\mathrm{X}][\mathrm{Y}]-\alpha_\leftarrow [\mathrm{XY}]\\ \\  \dot{[\mathrm{ATP}]} & =& -\beta_\to [\mathrm{ATP}]+\beta_\leftarrow [\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}]\\ \\  \dot{[\mathrm{ADP}]} & = &\beta_\to [\mathrm{ATP}]-\beta_\leftarrow [\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}]\\ \\  \dot{[\mathrm{P}_{\mathrm{i}}]} & = &\beta_\to [\mathrm{ATP}]-\beta_\leftarrow [\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}]  \end{array}

When concentrations are constant, these are equivalent to the relations

\displaystyle{  \frac{[\mathrm{XY}]}{[\mathrm{X}][\mathrm{Y}]} = \frac{\alpha_\to}{\alpha_\leftarrow} \ \ \text{ and } \ \ \frac{[\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}]}{[\mathrm{ATP}]} = \frac{\beta_\to}{\beta_\leftarrow} }
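As a rough numerical illustration of these two relations, here is a minimal sketch that integrates the rate equation above with a crude explicit Euler scheme. The rate constants, initial concentrations, step size and the function names `derivatives` and `equilibrate` are all assumptions made up for this example; the only point is that changing the amount of ATP leaves the equilibrium value of [XY]/([X][Y]) untouched.

```python
def derivatives(state, k):
    """Rate equation for X + Y <-> XY and ATP <-> ADP + Pi running in parallel."""
    x, y, xy, atp, adp, p = state
    a_net = k["a_fwd"] * x * y - k["a_rev"] * xy      # net rate of X + Y -> XY
    b_net = k["b_fwd"] * atp - k["b_rev"] * adp * p   # net rate of ATP -> ADP + Pi
    return (-a_net, -a_net, a_net, -b_net, b_net, b_net)


def equilibrate(state, k, dt=1e-3, steps=20_000):
    """Take explicit Euler steps until the concentrations have settled."""
    for _ in range(steps):
        rates = derivatives(state, k)
        state = tuple(c + dt * r for c, r in zip(state, rates))
    return state


# Illustrative rate constants (assumed): the reverse of the first reaction is
# favored, ATP hydrolysis is strongly favored.
k = {"a_fwd": 1.0, "a_rev": 10.0, "b_fwd": 100.0, "b_rev": 1.0}

ratios = []
for atp0 in (1.0, 5.0):
    x, y, xy, atp, adp, p = equilibrate((1.0, 1.0, 0.0, atp0, 0.0, 0.0), k)
    ratios.append(xy / (x * y))
    print(f"initial ATP = {atp0}: [XY]/([X][Y]) = {ratios[-1]:.6f}, "
          f"[ADP][Pi]/[ATP] = {adp * p / atp:.3f}")
```

Both runs should settle at the same value of [XY]/([X][Y]), namely the assumed ratio of the forward to the reverse rate constant of the first reaction.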

We thus see that ATP hydrolysis is in no way affecting the ratio of [\mathrm{XY}] to [\mathrm{X}][\mathrm{Y}]. Intuitively, there is no coupling because the two reactions proceed independently. This ‘independence’ is clearly visible if we draw the reaction network as a so-called Petri net:

So what really happens when we are in the presence of coupling? Stay tuned for the next episode!

By the way, here’s what ATP hydrolysis looks like in a bit more detail, from a website at Loreto College:


 


 
The paper:

• John Baez, Jonathan Lorand, Blake S. Pollard and Maru Sarazola,
Biochemical coupling through emergent conservation laws.

The blog series:

Part 1 – Introduction.

Part 2 – Review of reaction networks and equilibrium thermodynamics.

Part 3 – What is coupling?

Part 4 – Interactions.

Part 5 – Coupling in quasiequilibrium states.

Part 6 – Emergent conservation laws.

Part 7 – The urea cycle.

Part 8 – The citric acid cycle.


Coupling Through Emergent Conservation Laws (Part 2)

27 June, 2018

joint post with Jonathan Lorand, Blake Pollard, and Maru Sarazola

Here’s a little introduction to the chemistry and thermodynamics prerequisites for our work on ‘coupling’. Luckily, it’s fun stuff that everyone should know: a lot of the world runs on these principles!

We will be working with reaction networks. A reaction network consists of a set of reactions, for example

\mathrm{X}+\mathrm{Y}\longrightarrow \mathrm{XY}

Here X, Y and XY are the species involved, and we interpret this reaction as species X and Y combining to form species XY. We call X and Y the reactants and XY the product. Additive combinations of species, such as X + Y, are called complexes.

The law of mass action states that the rate at which a reaction occurs is proportional to the product of the concentrations of the reactants. The proportionality constant is called the rate constant; it is a positive real number associated to a reaction that depends on chemical properties of the reaction along with the temperature, the pH of the solution, the nature of any catalysts that may be present, and so on. Every reaction has a reverse reaction; that is, if X and Y combine to form XY, then XY can also split into X and Y. The reverse reaction has its own rate constant.

We can summarize this information by writing

\mathrm{X} + \mathrm{Y} \mathrel{\substack{\alpha_{\rightarrow} \\\longleftrightarrow\\ \alpha_{\leftarrow}}}  \mathrm{XY}

where \alpha_{\to} is the rate constant for X and Y to combine and form XY, while \alpha_\leftarrow is the rate constant for the reverse reaction.

As time passes and reactions occur, the concentration of each species will likely change. We can record this information in a collection of functions

[\mathrm{X}] \colon \mathbb{R} \to [0,\infty),

one for each species X, where [\mathrm{X}](t) gives the concentration of the species \mathrm{X} at time t. This naturally leads one to consider the rate equation of a given reaction network, which specifies the time evolution of these concentrations. The rate equation can be read off from the reaction network, and in the above example it is:

\begin{array}{ccc}  \dot{[\mathrm{X}]} & = & -\alpha_\to [\mathrm{X}][\mathrm{Y}]+\alpha_\leftarrow [\mathrm{XY}]\\  \dot{[\mathrm{Y}]} & = & -\alpha_\to [\mathrm{X}][\mathrm{Y}]+\alpha_\leftarrow [\mathrm{XY}]\\  \dot{[\mathrm{XY}]} & = & \alpha_\to [\mathrm{X}][\mathrm{Y}]-\alpha_\leftarrow [\mathrm{XY}]  \end{array}

Here \alpha_\to [\mathrm{X}] [\mathrm{Y}] is the rate at which the forward reaction is occurring; thanks to the law of mass action, this is the rate constant \alpha_\to times the product of the concentrations of X and Y. Similarly, \alpha_\leftarrow [\mathrm{XY}] is the rate at which the reverse reaction is occurring.

We say that a system is in detailed balanced equilibrium, or simply equilibrium, when every reaction occurs at the same rate as its reverse reaction. This implies that the concentration of each species is constant in time. In our example, the condition for equilibrium is

\displaystyle{ \frac{\alpha_\to}{\alpha_\leftarrow}=\frac{[\mathrm{XY}]}{[\mathrm{X}][\mathrm{Y}]} }

and the rate equation then implies that

\dot{[\mathrm{X}]} =  \dot{[\mathrm{Y}]} =\dot{[\mathrm{XY}]} = 0
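To see the rate equation and the equilibrium condition at work, here is a small sketch that integrates the example numerically. The rate constants, the initial concentrations and the simple Euler scheme are assumptions chosen only for illustration; the names `rate_equation` and `integrate` are hypothetical.

```python
def rate_equation(x, y, xy, alpha_fwd, alpha_rev):
    """Right-hand side of the rate equation for X + Y <-> XY (law of mass action)."""
    forward = alpha_fwd * x * y   # rate of X + Y -> XY
    reverse = alpha_rev * xy      # rate of XY -> X + Y
    net = forward - reverse
    return -net, -net, net        # d[X]/dt, d[Y]/dt, d[XY]/dt


def integrate(x, y, xy, alpha_fwd, alpha_rev, dt=1e-3, steps=20_000):
    """Crude explicit Euler integration, good enough for a smooth toy example."""
    for _ in range(steps):
        dx, dy, dxy = rate_equation(x, y, xy, alpha_fwd, alpha_rev)
        x, y, xy = x + dt * dx, y + dt * dy, xy + dt * dxy
    return x, y, xy


# Assumed values: alpha_fwd / alpha_rev = 2.
x, y, xy = integrate(1.0, 2.0, 0.0, alpha_fwd=3.0, alpha_rev=1.5)
print("concentrations:", x, y, xy)
print("[XY]/([X][Y]) =", xy / (x * y))
```

After a long enough time the printed ratio should agree with the assumed value of the forward rate constant divided by the reverse one, and the three derivatives returned by `rate_equation` should be essentially zero.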

The laws of thermodynamics determine the ratio of the forward and reverse rate constants. For any reaction at all, this ratio is

\displaystyle{ \frac{\alpha_\to}{\alpha_\leftarrow} = e^{-\Delta {G^\circ}/RT} }  \qquad \qquad \qquad (1)

where T is the temperature, R is the ideal gas constant, and \Delta {G^\circ} is the free energy change under standard conditions.

Note that if \Delta {G^\circ} < 0, then the rate constant of the forward reaction is larger than the rate constant of the reverse reaction:

\alpha_\to > \alpha_\leftarrow

In this case one may loosely say that the forward reaction ‘wants’ to happen ‘spontaneously’. Such a reaction is called exergonic. If on the other hand \Delta {G^\circ} > 0, then the forward reaction is ‘non-spontaneous’ and it is called endergonic.
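To get a feeling for the sizes involved in equation (1), here is a short sketch that evaluates the ratio of rate constants for a few free energy changes. The temperature and the values of the free energy change are assumptions picked for illustration, and `rate_constant_ratio` is a hypothetical helper name.

```python
import math

R = 8.314462618  # ideal gas constant in J/(mol K)


def rate_constant_ratio(delta_g_standard, temperature):
    """Forward/reverse rate constant ratio from equation (1).

    delta_g_standard: standard free energy change in J/mol
    temperature: absolute temperature in K
    """
    return math.exp(-delta_g_standard / (R * temperature))


T = 298.15  # assumed temperature in K
for delta_g in (-30_000.0, 0.0, 30_000.0):  # assumed values in J/mol
    kind = "exergonic" if delta_g < 0 else "endergonic" if delta_g > 0 else "neutral"
    print(f"Delta G = {delta_g:>9.1f} J/mol ({kind}): "
          f"ratio = {rate_constant_ratio(delta_g, T):.3e}")
```

Because the dependence is exponential, a free energy change of a few tens of kilojoules per mole already shifts the ratio by several orders of magnitude.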

The most important thing for us is that \Delta {G^\circ} takes a very simple form. Each species has a free energy. The free energy of a complex

\mathrm{A}_1 + \cdots + \mathrm{A}_m

is the sum of the free energies of the species \mathrm{A}_i. Given a reaction

\mathrm{A}_1 + \cdots + \mathrm{A}_m \longrightarrow \mathrm{B}_1 + \cdots + \mathrm{B}_n

the free energy change \Delta {G^\circ} for this reaction is the free energy of

\mathrm{B}_1 + \cdots + \mathrm{B}_n

minus the free energy of

\mathrm{A}_1 + \cdots + \mathrm{A}_m.

As a consequence, \Delta{G^\circ} is additive with respect to combining multiple reactions in either series or parallel. In particular, then, the law (1) imposes relations between ratios of rate constants: for example, if we have the following more complicated set of reactions

\mathrm{A} \mathrel{\substack{\alpha_{\rightarrow} \\\longleftrightarrow\\ \alpha_{\leftarrow}}} \mathrm{B}

\mathrm{B} \mathrel{\substack{\beta_{\rightarrow} \\\longleftrightarrow\\ \beta_{\leftarrow}}} \mathrm{C}

\mathrm{A} \mathrel{\substack{\gamma_{\rightarrow} \\\longleftrightarrow\\ \gamma_{\leftarrow}}} \mathrm{C}

then we must have

\displaystyle{    \frac{\gamma_\to}{\gamma_\leftarrow} = \frac{\alpha_\to}{\alpha_\leftarrow} \frac{\beta_\to}{\beta_\leftarrow} .  }

So, not only are the rate constant ratios of reactions determined by differences in free energy, but also nontrivial relations between these ratios can arise, depending on the structure of the system of reactions in question!
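The relation between the three ratios can be checked directly from the additivity of free energies. In this sketch the free energies assigned to A, B and C and the temperature are invented numbers, and `ratio` is a hypothetical helper; any other choice should satisfy the same identity.

```python
import math

R = 8.314462618   # ideal gas constant in J/(mol K)
T = 298.15        # assumed temperature in K

# Hypothetical free energies of the species, in J/mol.
free_energy = {"A": 0.0, "B": -5_000.0, "C": 2_000.0}


def ratio(reactant, product):
    """Forward/reverse rate constant ratio for reactant <-> product, via law (1)."""
    delta_g = free_energy[product] - free_energy[reactant]
    return math.exp(-delta_g / (R * T))


alpha = ratio("A", "B")
beta = ratio("B", "C")
gamma = ratio("A", "C")
print("gamma ratio        :", gamma)
print("alpha * beta ratios:", alpha * beta)
```

The two printed numbers agree up to rounding, because the free energy change from A to C is the sum of the changes from A to B and from B to C.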

Okay—this is all the basic stuff we’ll need to know. Please ask questions! Next time we’ll go ahead and use this stuff to start thinking about how biology manages to make reactions that ‘want’ to happen push forward reactions that are useful but wouldn’t happen spontaneously on their own.

 


 


Coupling Through Emergent Conservation Laws (Part 1)

27 June, 2018

joint post with Jonathan Lorand, Blake Pollard, and Maru Sarazola

In the cell, chemical reactions are often ‘coupled’ so that reactions that release energy drive reactions that are biologically useful but involve an increase in energy. But how, exactly, does coupling work?

Much is known about this question, but the literature is also full of vague explanations and oversimplifications. Coupling cannot occur in equilibrium; it arises in open systems, where the concentrations of certain chemicals are held out of equilibrium due to flows in and out. One might thus suspect that the simplest mathematical treatment of this phenomenon would involve non-equilibrium steady states of open systems. However, Bazhin has shown that some crucial aspects of coupling arise in an even simpler framework:

• Nicolai Bazhin, The essence of ATP coupling, ISRN Biochemistry 2012 (2012), article 827604.

He considers ‘quasi-equilibrium’ states, where fast reactions have come into equilibrium and slow ones are neglected. He shows that coupling occurs already in this simple approximation.

In this series of blog articles we’ll do two things. First, we’ll review Bazhin’s work in a way that readers with no training in biology or chemistry should be able to follow. (But if you get stuck, ask questions!) Second, we’ll explain a fact that seems to have received insufficient attention: in many cases, coupling relies on emergent conservation laws.

Conservation laws are important throughout science. Besides those that are built into the fabric of physics, such as conservation of energy and momentum, there are also many ‘emergent’ conservation laws that hold approximately in certain circumstances. Often these arise when processes that change a given quantity happen very slowly. For example, the most common isotope of uranium decays into lead with a half-life of about 4.5 billion years, but for the purposes of chemical experiments in the laboratory, it is useful to treat the amount of uranium as a conserved quantity.

The emergent conservation laws involved in biochemical coupling are of a different nature. Instead of making the processes that violate these laws happen more slowly, the cell uses enzymes to make other processes happen more quickly. At the time scales relevant to cellular metabolism, the fast processes dominate, while slowly changing quantities are effectively conserved. By a suitable choice of these emergent conserved quantities, the cell ensures that certain reactions that release energy can only occur when other ‘desired’ reactions occur. To be sure, this is only approximately true, on sufficiently short time scales. But this approximation is enlightening!

Following Bazhin, our main example involves ATP hydrolysis. We consider the following schema for a whole family of reactions:

\begin{array}{ccc}  \mathrm{X} + \mathrm{ATP}  & \longleftrightarrow & \mathrm{ADP} + \mathrm{XP}_{\mathrm{i}} \qquad (1) \\  \mathrm{XP}_{\mathrm{i}} + \mathrm{Y}  & \longleftrightarrow &    \mathrm{XY} + \mathrm{P}_{\mathrm{i}} \,\;\;\;\;\qquad (2)  \end{array}

Some concrete examples of this schema include:

• The synthesis of glutamine (XY) from glutamate (X) and ammonium (Y). This is part of the important glutamate-glutamine cycle in the central nervous system.

• The synthesis of sucrose (XY) from glucose (X) and fructose (Y). This is one of many processes whereby plants synthesize more complex sugars and starches from simpler building-blocks.

In these and other examples, the two reactions, taken together, have the effect of synthesizing a larger molecule XY out of two parts X and Y while ATP is broken down to ADP and the phosphate ion \mathrm{P}_{\mathrm{i}}. Thus, they have the same net effect as this other pair of reactions:

\begin{array}{ccc}  \mathrm{X} + \mathrm{Y} &\longleftrightarrow & \mathrm{XY} \;\;\;\quad \quad \qquad  (3) \\   \mathrm{ATP} &\longleftrightarrow & \mathrm{ADP} + \mathrm{P}_{\mathrm{i}} \qquad (4) \end{array}

The first reaction here is just the synthesis of XY from X and Y. The second is a deliberately simplified version of ATP hydrolysis. The first involves an increase of energy, while the second releases energy. But in the schema used in biology, these processes are ‘coupled’ so that ATP can only break down to \mathrm{ADP} + \mathrm{P}_{\mathrm{i}} if X and Y combine to form XY.

As we shall see, this coupling crucially relies on a conserved quantity: the total number of Y molecules plus the total number of \mathrm{P}_{\mathrm{i}} ions is left unchanged by reactions (1) and (2). This fact is not a fundamental law of physics, nor even a general law of chemistry (such as conservation of phosphorus atoms). It is an emergent conservation law that holds approximately in special situations. Its approximate validity relies on the fact that the cell has enzymes that make reactions (1) and (2) occur more rapidly than reactions that violate this law, such as (3) and (4).
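The bookkeeping behind this claim is simple enough to check mechanically. The sketch below writes each of the reactions (1) to (4) as the net change in the number of each species when it runs forward once, and evaluates how the quantity "number of Y plus number of phosphate ions" changes. The dictionary encoding and the names `reactions`, `quantity` and `change` are choices made for this illustration only.

```python
# Net change in each species when a reaction runs forward once.
reactions = {
    "(1) X + ATP -> ADP + XPi": {"X": -1, "ATP": -1, "ADP": +1, "XPi": +1},
    "(2) XPi + Y -> XY + Pi":   {"XPi": -1, "Y": -1, "XY": +1, "Pi": +1},
    "(3) X + Y -> XY":          {"X": -1, "Y": -1, "XY": +1},
    "(4) ATP -> ADP + Pi":      {"ATP": -1, "ADP": +1, "Pi": +1},
}

# The candidate conserved quantity: number of Y plus number of Pi.
quantity = {"Y": 1, "Pi": 1}


def change(quantity, reaction):
    """Change of a linear quantity when the given reaction runs forward once."""
    return sum(coeff * reaction.get(species, 0) for species, coeff in quantity.items())


changes = {name: change(quantity, r) for name, r in reactions.items()}
for name, delta in changes.items():
    print(f"{name}: change in Y + Pi = {delta:+d}")
```

Reactions (1) and (2) leave the quantity unchanged, while (3) lowers it by one and (4) raises it by one, which is why it is only conserved as long as (3) and (4) are slow.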

In the series to come, we’ll start by providing the tiny amount of chemistry and thermodynamics needed to understand what’s going on. Then we’ll raise the question “what is coupling?” Then we’ll study the reactions required for coupling ATP hydrolysis to the synthesis of XY from components X and Y, and explain why these reactions are not yet enough for coupling. Then we’ll show that coupling occurs in a ‘quasiequilibrium’ state where reactions (1) and (2), assumed much faster than the rest, have reached equilibrium, while the rest are neglected. And then we’ll explain the role of emergent conservation laws!

 


 


A Biochemistry Question

26 June, 2018

Does anyone know a real-world example of a cycle like this:


or in other words, this:

\begin{array}{ccc}  \mathrm{A} + \mathrm{C}_1 \longrightarrow \mathrm{C}_2 \\   \mathrm{X} + \mathrm{C}_2 \longrightarrow \mathrm{C}_3  \\    \mathrm{C}_3 \longrightarrow \mathrm{B} + \mathrm{C}_4   \\    \mathrm{C}_4 \longrightarrow \mathrm{Y} + \mathrm{C}_1   \end{array}

where the reaction

\mathrm{A} \to \mathrm{B}

is exergonic (i.e., involves a decrease in free energy) while

\mathrm{X} \to \mathrm{Y}

is endergonic (i.e., involves a free energy increase)?

The idea is that the above cycle, presumably catalyzed so that all the reactions go fairly fast under normal conditions, ‘couples’ the exergonic reaction, which ‘wants to happen’, to the endergonic reaction, which doesn’t… thus driving the endergonic one.
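One quick way to see what the cycle does overall is to add up its four reactions. In the sketch below each reaction is encoded as a pair of dictionaries (reactants, products); this encoding and the names `cycle` and `net_change` are assumptions made for the example.

```python
from collections import Counter

# The four reactions of the cycle as (reactants, products).
cycle = [
    ({"A": 1, "C1": 1}, {"C2": 1}),
    ({"X": 1, "C2": 1}, {"C3": 1}),
    ({"C3": 1}, {"B": 1, "C4": 1}),
    ({"C4": 1}, {"Y": 1, "C1": 1}),
]


def net_change(reactions):
    """Net change in each species when every reaction runs forward once."""
    net = Counter()
    for reactants, products in reactions:
        for species, n in reactants.items():
            net[species] -= n
        for species, n in products.items():
            net[species] += n
    # Drop species whose amount is unchanged, such as the intermediates C1..C4.
    return {species: n for species, n in net.items() if n != 0}


net = net_change(cycle)
print(net)
```

The intermediates C1 to C4 cancel out, so one turn of the cycle has the same net effect as running A → B and X → Y together.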

I would love an example from biochemistry. This is like a baby version of much more elaborate cycles such as the citric acid cycle, shown here:

in a picture from Stryer’s Biochemistry. I’m writing a paper on this stuff with Jonathan Lorand, Blake Pollard and Maru Sarazola, and we have—presumably obvious—reasons to want to discuss a simpler cycle!


Applied Category Theory Course: Resource Theories

12 May, 2018

 

My course on applied category theory is continuing! After a two-week break where the students did exercises, I’m back to lecturing about Fong and Spivak’s book Seven Sketches. Now we’re talking about “resource theories”. Resource theories help us answer questions like this:

  1. Given what I have, is it possible to get what I want?
  2. Given what I have, how much will it cost to get what I want?
  3. Given what I have, how long will it take to get what I want?
  4. Given what I have, what is the set of ways to get what I want?

Resource theories in their modern form were arguably born in these papers:

• Bob Coecke, Tobias Fritz and Robert W. Spekkens, A mathematical theory of resources.

• Tobias Fritz, Resource convertibility and ordered commutative monoids.

We are lucky to have Tobias in our course, helping the discussions along! He’s already posted some articles on resource theory here on this blog:

• Tobias Fritz, Resource convertibility (part 1), Azimuth, 7 April 2015.

• Tobias Fritz, Resource convertibility (part 2), Azimuth, 10 April 2015.

• Tobias Fritz, Resource convertibility (part 3), Azimuth, 13 April 2015.

We’re having fun bouncing between the relatively abstract world of monoidal preorders and their very concrete real-world applications to chemistry, scheduling, manufacturing and other topics. Here are the lectures so far:

Lecture 18 – Chapter 2: Resource Theories
Lecture 19 – Chapter 2: Chemistry and Scheduling
Lecture 20 – Chapter 2: Manufacturing
Lecture 21 – Chapter 2: Monoidal Preorders
Lecture 22 – Chapter 2: Symmetric Monoidal Preorders
Lecture 23 – Chapter 2: Commutative Monoidal Posets
Lecture 24 – Chapter 2: Pricing Resources
Lecture 25 – Chapter 2: Reaction Networks
Lecture 26 – Chapter 2: Monoidal Monotones
Lecture 27 – Chapter 2: Adjoints of Monoidal Monotones
Lecture 28 – Chapter 2: Ignoring Externalities
Lecture 29 – Chapter 2: Enriched Categories
Lecture 30 – Chapter 2: Preorders as Enriched Categories
Lecture 31 – Chapter 2: Lawvere Metric Spaces
Lecture 32 – Chapter 2: Enriched Functors
Lecture 33 – Chapter 2: Tying Up Loose Ends

 


Effective Thermodynamics for a Marginal Observer

8 May, 2018

guest post by Matteo Polettini

Suppose you receive an email from someone who claims “here is the project of a machine that runs forever and ever and produces energy for free!” Obviously he must be a crackpot. But he may be well-intentioned. You opt for not being rude, roll up your sleeves, and put your hands into the dirt, holding the Second Law as your lodestar.

Keep in mind that there are two fundamental sources of error: either he is not considering certain input currents (“hey, what about that tiny hidden cable entering your machine from the electrical power line?!”, “uh, ah, that’s just to power the “ON” LED”, “mmmhh, you sure?”), or else he is not measuring the energy input correctly (“hey, why are you using a Geiger counter to measure input voltages?!”, “well, sir, I ran out of voltmeters…”).

In other words, the observer might only have partial information about the setup, either in quantity or quality. Because he has been marginalized by society (most crackpots believe they are misunderstood geniuses) we will call such an observer “marginal,” which incidentally is also the word that mathematicians use when they focus on the probability of a subset of stochastic variables.

In fact, our modern understanding of thermodynamics as embodied in statistical mechanics and stochastic processes is founded (and funded) on ignorance: we never really have “complete” information. If we actually did, all energy would look alike, it would not come in “more refined” and “less refined” forms, there would be no differentials of order/disorder (using Paul Valéry’s beautiful words), and that would end thermodynamic reasoning, the energy problem, and generous research grants altogether.

Even worse, within this statistical approach we might be missing chunks of information because some parts of the system are invisible to us. But then, what warrants that we are doing things right, and he (our correspondent) is the crackpot? Couldn’t it be the other way around? Here I would like to present some recent ideas I’ve been working on together with some collaborators on how to deal with incomplete information about the sources of dissipation of a thermodynamic system. I will do this in a quite theoretical manner, but somehow I will mimic the guidelines suggested above for debunking crackpots. My three buzzwords will be: marginal, effective, and operational.

“Complete” thermodynamics: an out-of-the-box view

The laws of thermodynamics that I address are:

• The good ol’ Second Law (2nd)

• The Fluctuation-Dissipation Relation (FDR), and the Reciprocal Relation (RR) close to equilibrium.

• The more recent Fluctuation Relation (FR)1 and its corollary the Integral Fluctuation Relation (IFR), which have been discussed on this blog in a remarkable post by Matteo Smerlak.

The list above is all in the “area of the second law”. How about the other laws? Well, thermodynamics has long been a phenomenological science, a patchwork. So-called stochastic thermodynamics is trying to put some order into it by systematically grounding thermodynamic claims in (mostly Markov) stochastic processes. But it’s not an easy task, because the different laws of thermodynamics live on somewhat different conceptual planes. And it’s not even clear if they are theorems, prescriptions, or habits (a bit like in jurisprudence2).

Within stochastic thermodynamics, the Zeroth Law is so easy nobody cares to formulate it (I do, so stay tuned…). The Third Law: no idea, let me know. As regards the First Law (or, better, “laws”, as many as there are conserved quantities across the system/environment interface…), we will assume that all related symmetries have been exploited from the outset to boil down the description to a minimum.


This minimum is as follows. We identify a system that is well separated from its environment. The system evolves in time, while the environment is so large that its state does not evolve within the timescales of the system3. When tracing out the environment from the description, an uncertainty falls upon the system’s evolution. We assume the system’s dynamics to be described by a stochastic Markovian process.

How exactly the system evolves and what is the relationship between system and environment will be described in more detail below. Here let us take an “out of the box” view. We resolve the environment into several reservoirs labeled by index \alpha. Each of these reservoirs is “at equilibrium” on its own (whatever that means4). Now, the idea is that each reservoir tries to impose “its own equilibrium” on the system, and that their competition leads to a flow of currents across the system/environment interface. Each time an amount of the reservoir’s resource crosses the interface, a “thermodynamic cost” has to be paid or gained (be it a chemical potential difference for a molecule to go through a membrane, or a temperature gradient for photons to be emitted/absorbed, etc.).

The fundamental quantities of stochastic thermodynamic modeling thus are:

• On the “-dynamic” side: the time-integrated currents \Phi^t_\alpha, independent among themselves5. Currents are stochastic variables distributed with joint probability density

P(\{\Phi_\alpha\}_\alpha)

• On the “thermo-” side: The so-called thermodynamic forces or “affinities”6 \mathcal{A}_\alpha (collectively denoted \mathcal{A}). These are tunable parameters that characterize reservoir-to-reservoir gradients, and they are not stochastic. For convenience, we conventionally take them all positive.

Dissipation is quantified by the entropy production:

\sum \mathcal{A}_\alpha \Phi^t_\alpha

We are finally in the position to state the main results. Be warned that in the following expressions the exact treatment of time and its scaling would require a lot of specifications, but keep in mind that all these relations hold true in the long-time limit, and that all cumulants scale linearly with time.

FR: The probability of observing positive currents is exponentially favoured with respect to negative currents according to

P(\{\Phi_\alpha\}_\alpha) / P(\{-\Phi_\alpha\}_\alpha) = \exp \sum \mathcal{A}_\alpha \Phi^t_\alpha

Comment: This is not trivial; it follows from the explicit expression of the path integral, see below.

IFR: The average of the exponential of minus the entropy production is unity

\big\langle  \exp - \sum \mathcal{A}_\alpha \Phi^t_\alpha  \big\rangle_{\mathcal{A}} =1

Homework: Derive this relation from the FR in one line.

2nd Law: The average entropy production is not negative

\sum \mathcal{A}_\alpha \left\langle \Phi^t_\alpha \right\rangle_{\mathcal{A}} \geq 0

Homework: Derive this relation using Jensen’s inequality.
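Without giving the homework away, the FR, the IFR and the 2nd Law can at least be checked numerically on a toy model. The sketch below is an assumption of this illustration, not the setting of the text: a single current built from independent steps, each +1 with probability p and -1 with probability 1 - p, with affinity log(p/(1 - p)). For this toy model the three relations hold exactly at any number of steps, so no long-time limit is needed. The name `current_distribution` is hypothetical.

```python
import math


def current_distribution(n_steps, p):
    """Exact distribution of the current after n_steps biased +/-1 steps."""
    q = 1.0 - p
    dist = {}
    for ups in range(n_steps + 1):
        k = 2 * ups - n_steps   # value of the time-integrated current
        dist[k] = math.comb(n_steps, ups) * p**ups * q**(n_steps - ups)
    return dist


p = 0.7                              # assumed bias of a single step
affinity = math.log(p / (1 - p))     # affinity of the toy model
dist = current_distribution(20, p)

# FR: P(k) / P(-k) = exp(affinity * k) for every value k of the current.
fr_ok = all(math.isclose(dist[k] / dist[-k], math.exp(affinity * k)) for k in dist)

# IFR: the average of exp(-entropy production) equals one.
ifr = sum(prob * math.exp(-affinity * k) for k, prob in dist.items())

# 2nd Law: the average entropy production is not negative.
entropy_production = affinity * sum(k * prob for k, prob in dist.items())

print("FR holds:", fr_ok)
print("IFR average:", ifr)
print("average entropy production:", entropy_production)
```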

Equilibrium: Average currents vanish if and only if affinities vanish:

\left\langle \Phi^t_\alpha \right\rangle_{\mathcal{A}} \equiv 0, \forall \alpha \iff  \mathcal{A}_\alpha \equiv 0, \forall \alpha

Homework: Derive this relation taking the first derivative w.r.t. {\mathcal{A}_\alpha} of the IFR. Notice that also the average depends on the affinities.

S-FDR: At equilibrium, it is impossible to tell whether a current is due to a spontaneous fluctuation (quantified by its variance) or to an external perturbation (quantified by the response of its mean). In a symmetrized (S-) version:

\left.  \frac{\partial}{\partial \mathcal{A}_\alpha}\left\langle \Phi^t_{\alpha'} \right\rangle \right|_{0} + \left.  \frac{\partial}{\partial \mathcal{A}_{\alpha'}}\left\langle \Phi^t_{\alpha} \right\rangle \right|_{0} = \left. \left\langle \Phi^t_{\alpha} \Phi^t_{\alpha'} \right\rangle \right|_{0}

Homework: Derive this relation taking the mixed second derivatives w.r.t. {\mathcal{A}_\alpha} of the IFR.

RR: The reciprocal response of two different currents to a perturbation of the reciprocal affinities close to equilibrium is symmetrical:

\left.  \frac{\partial}{\partial \mathcal{A}_\alpha}\left\langle \Phi^t_{\alpha'} \right\rangle \right|_{0} - \left.  \frac{\partial}{\partial \mathcal{A}_{\alpha'}}\left\langle \Phi^t_{\alpha} \right\rangle \right|_{0} = 0

Homework: Derive this relation taking the mixed second derivatives w.r.t. {\mathcal{A}_\alpha} of the FR.

Notice the implication scheme: FR ⇒ IFR ⇒ 2nd, IFR ⇒ S-FDR, FR ⇒ RR.

“Marginal” thermodynamics (still out-of-the-box)

Now we assume that we can only measure a marginal subset of currents \{\Phi_\mu^t\}_\mu \subset \{\Phi_\alpha^t\}_\alpha (index \mu always has a smaller range than \alpha), distributed with joint marginal probability

P(\{\Phi_\mu\}_\mu) = \int \prod_{\alpha \neq \mu} d\Phi_\alpha \, P(\{\Phi_\alpha\}_\alpha)


Notice that a state where these marginal currents vanish might not be an equilibrium, because other currents might still be whirling around. We call this a stalling state.

\mathrm{stalling:} \qquad \langle \Phi_\mu \rangle \equiv 0,  \quad \forall \mu

My central question is: can we associate to these currents some effective affinity \mathcal{Q}_\mu in such a way that at least some of the results above still hold true? And, are all definitions involved just a fancy mathematical construct, or are they operational?

First the bad news: In general the FR is violated for all choices of effective affinities:

P(\{\Phi_\mu\}_\mu) / P(\{-\Phi_\mu\}_\mu) \neq \exp \sum \mathcal{Q}_\mu \Phi^t_\mu

This is not surprising, and nobody would expect otherwise. How about the IFR?

Marginal IFR: There are effective affinities such that

\left\langle \exp - \sum \mathcal{Q}_\mu \Phi^t_\mu \right\rangle_{\mathcal{A}} =1

Mmmhh. Yeah. Take a closer look at this expression: can you see why there actually exist infinitely many choices of “effective affinities” that would make that average cross 1? Which on the other hand is just a number, so who even cares? So this can’t be the point.

The fact is, the IFR per se is hardly of any practical interest, as are all “absolutes” in physics. What matters is “relatives”: in our case, response. But then we need to specify how the effective affinities depend on the “real” affinities. And here a crucial technicality steps in, whose precise argumentation is a pain. Based on reasonable assumptions7, we demonstrate that the IFR holds for the following choice of effective affinities:

\mathcal{Q}_\mu = \mathcal{A}_\mu - \mathcal{A}^{\mathrm{stalling}}_\mu,

where \mathcal{A}^{\mathrm{stalling}} is the set of values of the affinities that make marginal currents stall. Notice that this latter formula gives an operational definition of the effective affinities that could in principle be reproduced in the laboratory (just go out there and tune the tunable until everything stalls, and measure the difference). Obviously:

Stalling: Marginal currents vanish if and only if effective affinities vanish:

\left\langle \Phi^t_\mu \right\rangle_{\mathcal{A}} \equiv 0, \forall \mu \iff \mathcal{Q}_\mu \equiv 0, \forall \mu

Now, according to the inference scheme illustrated above, we can also prove that:

Effective 2nd Law: The average marginal entropy production is not negative

\sum \mathcal{Q}_\mu \left\langle \Phi^t_\mu \right\rangle_{\mathcal{A}} \geq 0

S-FDR at stalling:

\left. \frac{\partial}{\partial \mathcal{A}_\mu}\left\langle \Phi^t_{\mu'} \right\rangle \right|_{\mathcal{A}^{\mathrm{stalling}}} + \left. \frac{\partial}{\partial \mathcal{A}_{\mu'}}\left\langle \Phi^t_{\mu} \right\rangle \right|_{\mathcal{A}^{\mathrm{stalling}}} = \left. \left\langle \Phi^t_{\mu} \Phi^t_{\mu'} \right\rangle \right|_{\mathcal{A}^{\mathrm{stalling}}}

Notice instead that the RR is gone at stalling. This is a clear-cut prediction of the theory that can be tested with basically the same apparatus with which response theory has been experimentally studied so far (not that I actually know what this apparatus is…): at stalling states, unlike at equilibrium states, the S-FDR still holds, but the RR does not.

Into the box

You’ve definitely gotten enough at this point, and you can give up here. Please exit through the gift shop.

If you’re stubborn, let me tell you what’s inside the box. The system’s dynamics is modeled as a continuous-time, discrete configuration-space Markov “jump” process. The state space can be described by a graph G=(I, E) where I is the set of configurations, E is the set of possible transitions or “edges”, and there exists some incidence relation between edges and pairs of configurations. The process is determined by the rates w_{i \gets j} of jumping from one configuration to another.

We choose these processes because they allow some nice network analysis and because the path integral is well defined! A single realization of such a process is a trajectory

\omega^t = (i_0,\tau_0) \to (i_1,\tau_1) \to \ldots \to (i_N,\tau_N)

A “Markovian jumper” waits at some configuration i_n for some time \tau_n with an exponentially decaying probability w_{i_n} \exp - w_{i_n} \tau_n with exit rate w_i = \sum_k w_{k \gets i}, then instantaneously jumps to a new configuration i_{n+1} with transition probability w_{i_{n+1} \gets {i_n}}/w_{i_n}. The overall probability density of a single trajectory is given by

P(\omega^t) = \delta \left(t - \sum_n \tau_n \right) e^{- w_{i_N}\tau_N} \prod_{n=0}^{N-1} w_{i_{n+1} \gets i_n} e^{- w_{i_n} \tau_n}

One can in principle obtain the probability distribution function of any observable defined along the trajectory by taking the marginal of this measure (though in most cases this is technically impossible). Where does this expression come from? For a formal derivation, see the very beautiful review paper by Weber and Frey, but be aware that this is what one would intuitively come up with if one had to simulate with the Gillespie algorithm.
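To make the link with the Gillespie algorithm concrete, here is a minimal sketch in Python. The function names gillespie and log_density, and the layout rates[i][k] standing for w_{k \gets i}, are illustrative assumptions and are not taken from the papers. The sampler draws each waiting time from the exponential density and each jump from the transition probabilities described above; log_density evaluates the logarithm of the trajectory weight, leaving out the delta function.

```python
import math
import random

def gillespie(rates, i0, t_end, rng):
    """Sample one trajectory [(i_0, tau_0), ..., (i_N, tau_N)] up to time t_end.

    rates[i] maps each target configuration k to the rate w_{k <- i}
    (hypothetical data layout chosen for this sketch).
    """
    trajectory = []
    state, clock = i0, 0.0
    while True:
        exit_rate = sum(rates[state].values())      # w_i = sum_k w_{k <- i}
        # Waiting time with density w_i * exp(-w_i * tau).
        tau = rng.expovariate(exit_rate) if exit_rate > 0 else math.inf
        if clock + tau >= t_end:
            # The last waiting time is cut so that the tau_n add up to t_end.
            trajectory.append((state, t_end - clock))
            return trajectory
        trajectory.append((state, tau))
        clock += tau
        # Jump to k with probability w_{k <- i} / w_i.
        targets = list(rates[state])
        weights = [rates[state][k] for k in targets]
        state = rng.choices(targets, weights=weights)[0]

def log_density(rates, trajectory):
    """Log of the trajectory weight, without the delta function."""
    logp = 0.0
    for n, (state, tau) in enumerate(trajectory):
        logp -= sum(rates[state].values()) * tau    # factor exp(-w_{i_n} tau_n)
        if n + 1 < len(trajectory):
            next_state = trajectory[n + 1][0]
            logp += math.log(rates[state][next_state])  # factor w_{i_{n+1} <- i_n}
    return logp
```

A call such as gillespie(rates, 0, 10.0, random.Random(0)) returns one realization; histograms of any observable over many such realizations approximate the marginals mentioned above.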

The dynamics of the Markov process can also be described by the probability of being at some configuration i at time t, which evolves via the master equation

\dot{p}_i(t) = \sum_j \left[ w_{ij} p_j(t) - w_{ji} p_i(t) \right].

We call such probability the system’s state, and we assume that the system relaxes to a uniquely defined steady state p = \mathrm{lim}_{t \to \infty} p(t).

A time-integrated current along a single trajectory is a linear combination of the net number of jumps \#^t between configurations in the network:

\Phi^t_\alpha = \sum_{ij} C^{ij}_\alpha \left[ \#^t(i \gets j) - \#^t(j\gets i) \right]

The idea here is that one or several transitions within the system occur because of the “absorption” or the “emission” of some environmental degrees of freedom, each with different intensity. However, for the moment let us simplify the picture and require that only one transition contributes to a current, that is that there exist i_\alpha,j_\alpha such that

C^{ij}_\alpha = \delta^i_{i_\alpha} \delta^j_{j_\alpha}.

Now, what does it mean for such a set of currents to be “complete”? Here we get inspiration from Kirchhoff’s Current Law in electrical circuits: the continuity of the trajectory at each configuration of the network implies that after a sufficiently long time, cycle or loop or mesh currents completely describe the steady state. There is a standard procedure to identify a set of cycle currents: take a spanning tree T of the network; then the currents flowing along the edges E\setminus T left out from the spanning tree form a complete set.
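The spanning-tree procedure can be sketched in a few lines of Python. This is one possible implementation under stated assumptions: the name spanning_tree_and_chords and the use of a union-find structure are choices made for this illustration, and the graph is assumed to be connected.

```python
def spanning_tree_and_chords(vertices, edges):
    """Split the edges of a connected graph into a spanning tree T and the chords E minus T.

    Edges are scanned in order: an edge joins the tree if it connects two
    components that were still separate, otherwise it closes a cycle and is a chord.
    """
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    tree, chords = [], []
    for (i, j) in edges:
        root_i, root_j = find(i), find(j)
        if root_i == root_j:
            chords.append((i, j))           # closes a cycle
        else:
            parent[root_i] = root_j
            tree.append((i, j))
    return tree, chords
```

For a connected graph the number of chords is |E| - |I| + 1, which is the number of independent cycle currents. Different edge orderings give different spanning trees, hence different but equivalent complete sets of currents.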

The last ingredients you need are the affinities. They can be constructed as follows. Consider the Markov process on the network where the observable edges are removed, G' = (I,T). Calculate the steady state of its associated master equation (p^{\mathrm{eq}}_i)_i, which is necessarily an equilibrium (since there cannot be cycle currents in a tree…). Then the affinities are given by

\mathcal{A}_\alpha = \log \frac{w_{i_\alpha j_\alpha} \, p^{\mathrm{eq}}_{j_\alpha}}{w_{j_\alpha i_\alpha} \, p^{\mathrm{eq}}_{i_\alpha}}.
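As a sanity check on this construction, here is a small Python sketch. The names tree_equilibrium and affinity, and the convention that w[i][j] is the rate of jumping to i from j, are assumptions made for this illustration. On a tree the steady state satisfies detailed balance on every edge, so it can be found by propagating ratios outward from a root.

```python
import math

def tree_equilibrium(w, tree_edges, root):
    """Equilibrium distribution of the process restricted to a spanning tree.

    Detailed balance w[i][j] * p[j] == w[j][i] * p[i] holds on every tree edge,
    so p follows by walking the tree from the root and normalizing at the end.
    """
    neighbours = {}
    for i, j in tree_edges:
        neighbours.setdefault(i, []).append(j)
        neighbours.setdefault(j, []).append(i)
    weight = {root: 1.0}
    stack = [root]
    while stack:
        j = stack.pop()
        for i in neighbours.get(j, []):
            if i not in weight:
                weight[i] = weight[j] * w[i][j] / w[j][i]
                stack.append(i)
    total = sum(weight.values())
    return {i: x / total for i, x in weight.items()}

def affinity(w, p_eq, i, j):
    """Affinity of the removed edge carrying jumps to i from j."""
    return math.log(w[i][j] * p_eq[j] / (w[j][i] * p_eq[i]))
```

For a single cycle 0, 1, 2 with the edge between 0 and 2 removed, the result reduces to the logarithm of the product of the rates around the cycle in one direction divided by the product in the other direction, which is the familiar cycle affinity.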

Now you have all that is needed to formulate the complete theory and prove the FR.

Homework: (Difficult!) With the above definitions, prove the FR.

How about the marginal theory? To define the effective affinities, take the set E_{\mathrm{mar}} = \{i_\mu j_\mu, \forall \mu\} of edges along which observable currents run. Notice that now its complement, the hidden edge set E_{\mathrm{hid}} = E \setminus E_{\mathrm{mar}} obtained by removing the observable edges, is not in general a spanning tree: there might be cycles that are not accounted for by our observations. However, we can still consider the Markov process on the hidden space, and calculate its stalling steady state p^{\mathrm{st}}_i, and ta-taaa: the effective affinities are given by

\mathcal{Q}_\mu = \log \frac{w_{i_\mu j_\mu} \, p^{\mathrm{st}}_{j_\mu}}{w_{j_\mu i_\mu} \, p^{\mathrm{st}}_{i_\mu}}.
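Here is a hedged numerical sketch of this recipe for a single observed edge, in Python. Everything named here is an assumption of the illustration: the four-configuration rate matrix W_EXAMPLE is made up, w[i][j] is taken to be the rate of jumping to i from j, and the steady state is approximated by repeatedly applying a stochastic matrix (uniformization), a simple stand-in for a proper linear solver.

```python
import math

def steady_state(w, iterations=5000):
    """Approximate the steady state of dp_i/dt = sum_j (w[i][j] p_j - w[j][i] p_i).

    Iterates p <- p + (L p) / lam with lam larger than every exit rate.
    Assumes the network is connected, so the steady state is unique.
    """
    n = len(w)
    exit_rates = [sum(w[k][j] for k in range(n) if k != j) for j in range(n)]
    lam = 2.0 * max(exit_rates)
    p = [1.0 / n] * n
    for _ in range(iterations):
        flow = [sum(w[i][j] * p[j] for j in range(n) if j != i) - exit_rates[i] * p[i]
                for i in range(n)]
        p = [p[i] + flow[i] / lam for i in range(n)]
    return p

def effective_affinity(w, i, j):
    """Q for the observed edge i <-> j, from the steady state of the hidden network."""
    hidden = [row[:] for row in w]
    hidden[i][j] = hidden[j][i] = 0.0       # remove the observed edge
    p_st = steady_state(hidden)
    return math.log(w[i][j] * p_st[j] / (w[j][i] * p_st[i]))

def mean_current(w, i, j):
    """Average steady-state rate of net jumps to i from j in the full network."""
    p = steady_state(w)
    return w[i][j] * p[j] - w[j][i] * p[i]

# Hypothetical rates: the hidden edges 0-1, 1-2, 0-2 form a cycle, 2-3 is a
# hidden bridge, and 0-3 is the observed edge.
W_EXAMPLE = [
    [0.0, 1.0, 2.0, 4.0],
    [2.0, 0.0, 1.0, 0.0],
    [1.0, 3.0, 0.0, 2.0],
    [1.0, 0.0, 1.0, 0.0],
]
```

Rescaling the observed rate w[0][3] by exp(-Q) sets the effective affinity to zero, and the observed current then vanishes even though the hidden cycle keeps running. For this one-edge case the product of effective_affinity and mean_current is also non-negative, in line with the effective Second Law.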

Proving the marginal IFR is far more complicated than proving the complete FR. In fact, very often in my field we do not work with the currents’ probability density itself, but prefer to take its bidirectional Laplace transform and work with the currents’ cumulant generating function. There things take on a quite different and more elegant look.

Many other questions and possibilities open up now. The most important one left open is: can we generalize the theory to the (physically relevant) case where the current is supported on several edges? For example, for a current defined like \Phi^t = 5 \Phi^t_{12} + 7 \Phi^t_{34}? Well, it depends: the theory holds provided that the stalling state is not “internally alive”, meaning that if the observable current vanishes on average, then so should \Phi^t_{12} and \Phi^t_{34} separately. This turns out to be a physically meaningful but quite strict condition.

Is all of thermodynamics “effective”?

Let me conclude with some more of those philosophical considerations that sadly I have to leave out of papers…

Stochastic thermodynamics strongly depends on the identification of physical and information-theoretic entropies, something that I did not openly talk about, but that lurks behind the whole construction. Throughout my short experience as a researcher I have been pursuing a program of “relativization” of thermodynamics, by making the role of the observer more and more evident and movable. Inspired by Einstein’s Gedankenexperimenten, I also tried to make the theory operational. This program may raise eyebrows here and there: many thermodynamicians embrace a naive materialistic world-view whereby all that matters are “real” physical quantities like temperature and pressure, and all the rest of the information-theoretic discourse is at best mathematical speculation or a fascinating analogy with no fundamental bearing. According to some, information as a physical concept lingers alarmingly close to certain extreme postmodern claims in the social sciences that “reality” does not exist unless observed, a position deemed dangerous at times when the authoritativeness of science is threatened by all sorts of anti-scientific waves.

I think, on the contrary, that making concepts relative and effective and summoning the observer explicitly is a laic and prudent position that serves as an antidote to radical subjectivity. The other way around (clinging to the objectivity of a preferred observer, which is implied in any materialistic interpretation of thermodynamics, e.g. by assuming that the most fundamental degrees of freedom are the positions and velocities of a gas’s molecules) is the dangerous position, especially when the role of such a preferred observer is passed around from the scientist to the technician and eventually to the technocrat, who would be induced to believe there are simple technological fixes to complex social problems.

How do we reconcile observer-dependency and the laws of physics? The object and the subject? On the one hand, just as the position of an object depends on the reference frame, so entropy and entropy production depend on the observer and the particular apparatus that he controls or experiment he is involved with. On the other hand, just as motion is ultimately independent of position and is agreed upon by all observers that share compatible measurement protocols, so the laws of thermodynamics are independent of that particular observer’s quantification of entropy and entropy production (e.g., the effective Second Law holds independently of how much the marginal observer knows of the system, if he operates according to our phenomenological protocol…). This is the case even in the everyday thermodynamics practiced by energy engineers et al., where there are lots of choices to gauge upon, and there is no other external warrant that the amount of dissipation being quantified is the “true” one (whatever that means…): there can only be trust in one’s own good practices and methodology.

So in this sense, I like to think that all observers are marginal, that this effective theory serves as a dictionary by which different observers practice and communicate thermodynamics, and that we should not revere the laws of thermodynamics as “true” idols, but rather as tools of good scientific practice.

References

• M. Polettini and M. Esposito, Effective fluctuation and response theory, arXiv:1803.03552.

In this work we give the complete theory and numerous references to work of other people that was along the same lines. We employ a “spiral” approach to the presentation of the results, inspired by the pedagogical principle of Albert Baez.

• M. Polettini and M. Esposito, Effective thermodynamics for a marginal observer, Phys. Rev. Lett. 119 (2017), 240601, arXiv:1703.05715.

This is a shorter version of the story.

• B. Altaner, M. Polettini and M. Esposito, Fluctuation-dissipation relations far from equilibrium, Phys. Rev. Lett. 117 (2016), 180601, arXiv:1604.0883.

An early version of the story, containing the FDR results but not the full-fledged FR.

• G. Bisker, M. Polettini, T. R. Gingrich and J. M. Horowitz, Hierarchical bounds on entropy production inferred from partial information, J. Stat. Mech. (2017), 093210, arXiv:1708.06769.

Some extras.

• M. F. Weber and E. Frey, Master equations and the theory of stochastic path integrals, Rep. Progr. Phys. 80 (2017), 046601, arXiv:1609.02849.

Great reference if one wishes to learn about path integrals for master equation systems.

Footnotes

1 There are as many so-called “Fluctuation Theorems” as there are authors working on them, so I decided not to call them by any name. Furthermore, notice I prefer to distinguish between a relation (a formula) and a theorem (a line of reasoning). I lingered more on this here.

2 “Just so you know, nobody knows what energy is.”—Richard Feynman.

I cannot help but mention here the beautiful book by Shapin and Schaffer, Leviathan and the Air-Pump, about the Boyle vs. Hobbes diatribe about what constitutes a “matter of fact,” and Bruno Latour’s interpretation of it in We Have Never Been Modern. Latour argues that “modernity” is a process of separation of the human and natural spheres, and within each of these spheres a process of purification of the unit facts of knowledge and the unit facts of politics, of the object and the subject. At the same time we live in a world where these two spheres are never truly separated, a world of “hybrids” that are at the same time necessary “for all practical purposes” and inconceivable according to the myths that sustain the narration of science, of the State, and even of religion. In fact, despite these myths, we cannot conceive a scientific fact outside of the contextual “network” where this fact is produced and replicated, nor can we conceive society outside of the material needs that shape it: so in this sense “we have never been modern”, we are not quite different from all those societies that we take pleasure in studying with the tools of anthropology. Within the scientific community Latour is widely despised; probably he is also misread. While it is really difficult to see how his analysis applies to, say, high-energy physics, I find that thermodynamics and its ties to the industrial revolution perfectly embody this tension between the natural and the artificial, the matter of fact and the matter of concern. Such great thinkers as Einstein and Ehrenfest thought of the Second Law as the only physical law that would never be replaced, and I believe this is revelatory.
A second thought on the Second Law, a systematic and precise definition of all its terms and circumstances, reveals that the only formulations that make sense are phenomenological statements such as Kelvin-Planck’s or similar, which require a lot of contingent definitions regarding the operation of the engine, while fetishized and universal statements are nonsensical (such as that masterwork of confusion that is “the entropy of the Universe cannot decrease”). In this respect, it is neither a purely natural law, as the moderns argue, nor a purely social construct, as the postmoderns argue. One simply has to renounce operating this separation. While I do not have a definite answer on this problem, I like to think of the Second Law as a practice, a consistency check of the thermodynamic discourse.

3 This assumption really belongs to a time, the XIXth century, when resources were virtually infinite on planet Earth…

4 As we will see shortly, we define equilibrium as that state where there are no currents at the interface between the system and the environment, so what is the environment’s own definition of equilibrium?!

5 This is because we have already exploited the First Law.

6 This nomenclature comes from alchemy, via chemistry (think of Goethe’s The elective affinities…). It propagated in the XXth century via De Donder and Prigogine, and it is still present in our language in Luxembourg because in some way we come from the “late Brussels school”.

7 Basically, we ask that the tunable parameters are environmental properties, such as temperatures, chemical potentials, etc. and not internal properties, such as the energy landscape or the activation barriers between configurations.


A Compositional Framework for Reaction Networks

30 July, 2017

For a long time Blake Pollard and I have been working on ‘open’ chemical reaction networks: that is, networks of chemical reactions where some chemicals can flow in from an outside source, or flow out. The picture to keep in mind is something like this:



where the yellow circles are different kinds of chemicals and the aqua boxes are different reactions. The purple dots in the sets X and Y are ‘inputs’ and ‘outputs’, where certain kinds of chemicals can flow in or out.

Here’s our paper on this stuff:

• John Baez and Blake Pollard, A compositional framework for reaction networks, Reviews in Mathematical Physics 29, 1750028.

Blake and I gave talks about this stuff in Luxembourg this June, at a nice conference called Dynamics, thermodynamics and information processing in chemical networks. So, if you’re the sort who prefers talk slides to big scary papers, you can look at those:

• John Baez, The mathematics of open reaction networks.

• Blake Pollard, Black-boxing open reaction networks.

But I want to say here what we do in our paper, because it’s pretty cool, and it took a few years to figure it out. To get things to work, we needed my student Brendan Fong to invent the right category-theoretic formalism: ‘decorated cospans’. But we also had to figure out the right way to think about open dynamical systems!

In the end, we figured out how to first ‘gray-box’ an open reaction network, converting it into an open dynamical system, and then ‘black-box’ it, obtaining the relation between input and output flows and concentrations that holds in steady state. The first step extracts the dynamical behavior of an open reaction network; the second extracts its static behavior. And both these steps are functors!

Lawvere had the idea that the process of assigning ‘meaning’ to expressions could be seen as a functor. This idea has caught on in theoretical computer science: it’s called ‘functorial semantics’. So, what we’re doing here is applying functorial semantics to chemistry.

Now Blake has passed his thesis defense based on this work, and he just needs to polish up his thesis a little before submitting it. This summer he’s doing an internship at the Princeton branch of the engineering firm Siemens. He’s working with Arquimedes Canedo on ‘knowledge representation’.

But I’m still eager to dig deeper into open reaction networks. They’re a small but nontrivial step toward my dream of a mathematics of living systems. My working hypothesis is that living systems seem ‘messy’ to physicists because they operate at a higher level of abstraction. That’s what I’m trying to explore.

Here’s the idea of our paper.

The idea

Reaction networks are a very general framework for describing processes where entities interact and transform into other entities. While they first showed up in chemistry, and are often called ‘chemical reaction networks’, they have lots of other applications. For example, a basic model of infectious disease, the ‘SIRS model’, is described by this reaction network:

S + I \stackrel{\iota}{\longrightarrow} 2 I  \qquad  I \stackrel{\rho}{\longrightarrow} R \stackrel{\lambda}{\longrightarrow} S

We see here three types of entity, called species:

S: susceptible,
I: infected,
R: resistant.

We also have three ‘reactions’:

\iota : S + I \to 2 I: infection, in which a susceptible individual meets an infected one and becomes infected;
\rho : I \to R: recovery, in which an infected individual gains resistance to the disease;
\lambda : R \to S: loss of resistance, in which a resistant individual becomes susceptible.

In general, a reaction network involves a finite set of species, but reactions go between complexes, which are finite linear combinations of these species with natural number coefficients. The reaction network is a directed graph whose vertices are certain complexes and whose edges are called reactions.

If we attach a positive real number called a rate constant to each reaction, a reaction network determines a system of differential equations saying how the concentrations of the species change over time. This system of equations is usually called the rate equation. In the example I just gave, the rate equation is

\begin{array}{ccl} \displaystyle{\frac{d S}{d t}} &=& r_\lambda R - r_\iota S I \\ \\ \displaystyle{\frac{d I}{d t}} &=&  r_\iota S I - r_\rho I \\  \\ \displaystyle{\frac{d R}{d t}} &=& r_\rho I - r_\lambda R \end{array}

Here r_\iota, r_\rho and r_\lambda are the rate constants for the three reactions, and S, I, R now stand for the concentrations of the three species, which are treated in a continuum approximation as smooth functions of time:

S, I, R: \mathbb{R} \to [0,\infty)

The rate equation can be derived from the law of mass action, which says that any reaction occurs at a rate equal to its rate constant times the product of the concentrations of the species entering it as inputs.
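To see the law of mass action at work, here is a minimal Python sketch that builds the right-hand side of the rate equation from a list of reactions. The function names mass_action_rhs and sirs and the encoding of a reaction as a tuple (rate constant, inputs, outputs) are choices made for this illustration, not notation from the paper.

```python
def mass_action_rhs(reactions, conc):
    """Right-hand side of the rate equation under the law of mass action.

    Each reaction is (rate_constant, inputs, outputs), where inputs and outputs
    map species names to natural-number coefficients.
    """
    ddt = {s: 0.0 for s in conc}
    for rate, inputs, outputs in reactions:
        speed = rate
        for s, n in inputs.items():
            speed *= conc[s] ** n           # product of input concentrations
        for s, n in inputs.items():
            ddt[s] -= n * speed             # inputs are consumed
        for s, n in outputs.items():
            ddt[s] += n * speed             # outputs are produced
    return ddt

def sirs(r_iota, r_rho, r_lambda):
    """The SIRS reaction network with its three rate constants."""
    return [
        (r_iota, {"S": 1, "I": 1}, {"I": 2}),   # infection: S + I -> 2I
        (r_rho, {"I": 1}, {"R": 1}),            # recovery: I -> R
        (r_lambda, {"R": 1}, {"S": 1}),         # loss of resistance: R -> S
    ]
```

Calling mass_action_rhs(sirs(r_iota, r_rho, r_lambda), {"S": S, "I": I, "R": R}) reproduces the three right-hand sides written above. Their sum is zero, reflecting the fact that every reaction in this network preserves the total number of individuals.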

But a reaction network is more than just a stepping-stone to its rate equation! Interesting qualitative properties of the rate equation, like the existence and uniqueness of steady state solutions, can often be determined just by looking at the reaction network, regardless of the rate constants. Results in this direction began with Feinberg and Horn’s work in the 1960’s, leading to the Deficiency Zero and Deficiency One Theorems, and more recently to Craciun’s proof of the Global Attractor Conjecture.

In our paper, Blake and I present a ‘compositional framework’ for reaction networks. In other words, we describe rules for building up reaction networks from smaller pieces, in such a way that the rate equation of the whole can be figured out knowing those of the pieces. But this framework requires that we view reaction networks in a somewhat different way, as ‘Petri nets’.

Petri nets were invented by Carl Petri in 1939, when he was just a teenager, for the purposes of chemistry. Much later, they became popular in theoretical computer science, biology and other fields. A Petri net is a bipartite directed graph: vertices of one kind represent species, vertices of the other kind represent reactions. The edges into a reaction specify which species are inputs to that reaction, while the edges out specify its outputs.

You can easily turn a reaction network into a Petri net and vice versa. For example, the reaction network above translates into this Petri net:



Beware: there are a lot of different names for the same thing, since the terminology comes from several communities. In the Petri net literature, species are called places and reactions are called transitions. In fact, Petri nets are sometimes called ‘place-transition nets’ or ‘P/T nets’. On the other hand, chemists call them ‘species-reaction graphs’ or ‘SR-graphs’. And when each reaction of a Petri net has a rate constant attached to it, it is often called a ‘stochastic Petri net’.

While some qualitative properties of a rate equation can be read off from a reaction network, others are more easily read from the corresponding Petri net. For example, properties of a Petri net can be used to determine whether its rate equation can have multiple steady states.

Petri nets are also better suited to a compositional framework. The key new concept is an ‘open’ Petri net. Here’s an example:



The box at left is a set X of ‘inputs’ (which happens to be empty), while the box at right is a set Y of ‘outputs’. Both inputs and outputs are points at which entities of various species can flow in or out of the Petri net. We say the open Petri net goes from X to Y. In our paper, we show how to treat it as a morphism f : X \to Y in a category we call \textrm{RxNet}.

Given an open Petri net with rate constants assigned to each reaction, our paper explains how to get its ‘open rate equation’. It’s just the usual rate equation with extra terms describing inflows and outflows. The above example has this open rate equation:

\begin{array}{ccr} \displaystyle{\frac{d S}{d t}} &=&  - r_\iota S I - o_1 \\ \\ \displaystyle{\frac{d I}{d t}} &=&  r_\iota S I - o_2  \end{array}

Here o_1, o_2 : \mathbb{R} \to \mathbb{R} are arbitrary smooth functions describing outflows as a function of time.

Given another open Petri net g: Y \to Z, for example this:



it will have its own open rate equation, in this case

\begin{array}{ccc} \displaystyle{\frac{d S}{d t}} &=& r_\lambda R + i_1 \\ \\ \displaystyle{\frac{d I}{d t}} &=& - r_\rho I + i_2 \\  \\ \displaystyle{\frac{d R}{d t}} &=& r_\rho I - r_\lambda R  \end{array}

Here i_1, i_2: \mathbb{R} \to \mathbb{R} are arbitrary smooth functions describing inflows as a function of time. Now for a tiny bit of category theory: we can compose f and g by gluing the outputs of f to the inputs of g. This gives a new open Petri net gf: X \to Z, as follows:



But this open Petri net gf has an empty set of inputs, and an empty set of outputs! So it amounts to an ordinary Petri net, and its open rate equation is a rate equation of the usual kind. Indeed, this is the Petri net we have already seen.

As it turns out, there’s a systematic procedure for combining the open rate equations for two open Petri nets to obtain that of their composite. In the example we’re looking at, we just identify the outflows of f with the inflows of g (setting i_1 = o_1 and i_2 = o_2) and then add the right hand sides of their open rate equations.
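For this particular example, the procedure can be checked with a short Python sketch. The function names and the flow labels are assumptions made for the illustration: o_S and o_I are the outflows of S and I from f, i_S and i_I are the inflows of S and I into g, and gluing is taken to match each outflow with the inflow of the same species.

```python
def open_rhs_f(S, I, o_S, o_I, r_iota):
    """Open rate equation of f: infection, with outflows o_S and o_I."""
    return {"S": -r_iota * S * I - o_S, "I": r_iota * S * I - o_I}

def open_rhs_g(S, I, R, i_S, i_I, r_rho, r_lambda):
    """Open rate equation of g: recovery and loss of resistance, with inflows i_S and i_I."""
    return {
        "S": r_lambda * R + i_S,
        "I": -r_rho * I + i_I,
        "R": r_rho * I - r_lambda * R,
    }

def composite_rhs(S, I, R, flow_S, flow_I, r_iota, r_rho, r_lambda):
    """Identify each outflow of f with the matching inflow of g, then add right-hand sides."""
    f = open_rhs_f(S, I, flow_S, flow_I, r_iota)
    g = open_rhs_g(S, I, R, flow_S, flow_I, r_rho, r_lambda)
    return {s: f.get(s, 0.0) + g.get(s, 0.0) for s in ("S", "I", "R")}
```

Whatever values are chosen for flow_S and flow_I, they cancel in the sum, and composite_rhs returns the right-hand side of the ordinary SIRS rate equation.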

The first goal of our paper is to precisely describe this procedure, and to prove that it defines a functor

\diamond: \textrm{RxNet} \to \textrm{Dynam}

from \textrm{RxNet} to a category \textrm{Dynam} where the morphisms are ‘open dynamical systems’. By a dynamical system, we essentially mean a vector field on \mathbb{R}^n, which can be used to define a system of first-order ordinary differential equations in n variables. An example is the rate equation of a Petri net. An open dynamical system allows for the possibility of extra terms that are arbitrary functions of time, such as the inflows and outflows in an open rate equation.

In fact, we prove that \textrm{RxNet} and \textrm{Dynam} are symmetric monoidal categories and that \diamond is a symmetric monoidal functor. To do this, we use Brendan Fong’s theory of ‘decorated cospans’.

Decorated cospans are a powerful general tool for describing open systems. A cospan in any category is just a diagram like this:



We are mostly interested in cospans in \mathrm{FinSet}, the category of finite sets and functions between these. The set S, the so-called apex of the cospan, is the set of states of an open system. The sets X and Y are the inputs and outputs of this system. The legs of the cospan, meaning the morphisms i: X \to S and o: Y \to S, describe how these inputs and outputs are included in the system. In our application, S is the set of species of a Petri net.

For example, we may take this reaction network:

A+B \stackrel{\alpha}{\longrightarrow} 2C \quad \quad C \stackrel{\beta}{\longrightarrow} D

treat it as a Petri net with S = \{A,B,C,D\}:



and then turn that into an open Petri net by choosing any finite sets X,Y and maps i: X \to S, o: Y \to S, for example like this:



(Notice that the maps including the inputs and outputs into the states of the system need not be one-to-one. This is technically useful, but it introduces some subtleties that I don’t feel like explaining right now.)

An open Petri net can thus be seen as a cospan of finite sets whose apex S is ‘decorated’ with some extra information, namely a Petri net with S as its set of species. Fong’s theory of decorated cospans lets us define a category with open Petri nets as morphisms, with composition given by gluing the outputs of one open Petri net to the inputs of another.
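The gluing step can be sketched for the underlying cospans of finite sets alone, leaving out the Petri net decoration. In this Python illustration a cospan is encoded as a tuple (apex, in_leg, out_leg) with the legs stored as dictionaries. That encoding and the name compose_cospans are assumptions of the sketch, not Fong’s formalism itself.

```python
def compose_cospans(f, g):
    """Compose cospans X -> S <- Y and Y -> S' <- Z of finite sets.

    The new apex is the disjoint union of the two apexes, with the image of
    each y under the output leg of f glued to its image under the input leg of g.
    """
    apex_f, in_f, out_f = f
    apex_g, in_g, out_g = g
    # Tag elements so that equal names in the two apexes stay distinct.
    parent = {("f", s): ("f", s) for s in apex_f}
    parent.update({("g", s): ("g", s) for s in apex_g})

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for y, s in out_f.items():
        a, b = find(("f", s)), find(("g", in_g[y]))
        if a != b:
            parent[b] = a                   # glue the two elements together
    apex = sorted({find(x) for x in parent})
    new_in = {x: find(("f", s)) for x, s in in_f.items()}
    new_out = {z: find(("g", s)) for z, s in out_g.items()}
    return apex, new_in, new_out
```

Because the legs need not be one-to-one, a single gluing can merge several species at once, which is where the subtleties mentioned above come from.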

We call the functor

\diamond: \textrm{RxNet} \to \textrm{Dynam}

gray-boxing because it hides some but not all the internal details of an open Petri net. (In the paper we draw it as a gray box, but that’s too hard here!)

We can go further and black-box an open dynamical system. This amounts to recording only the relation between input and output variables that must hold in steady state. We prove that black-boxing gives a functor

\square: \textrm{Dynam} \to \mathrm{SemiAlgRel}

(yeah, the box here should be black, and in our paper it is). Here \mathrm{SemiAlgRel} is a category where the morphisms are semi-algebraic relations between real vector spaces, meaning relations defined by polynomials and inequalities. This relies on the fact that our dynamical systems involve algebraic vector fields, meaning those whose components are polynomials; more general dynamical systems would give more general relations.
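As a small worked illustration, take the open rate equation of the open Petri net f given earlier and set the time derivatives to zero. This is only a sketch derived from that equation, not the paper’s full definition of black-boxing, which also keeps track of the concentrations at the inputs and outputs.

```latex
0 = - r_\iota S I - o_1, \qquad 0 = r_\iota S I - o_2
\quad \Longrightarrow \quad
o_1 = - r_\iota S I, \qquad o_2 = r_\iota S I, \qquad o_1 + o_2 = 0
```

These are polynomial equations in S, I, o_1 and o_2, and together with the inequalities S \geq 0 and I \geq 0 they describe a semi-algebraic set. In steady state, whatever flows out as infected individuals must flow in as susceptible ones.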

That semi-algebraic relations are closed under composition is a nontrivial fact, a spinoff of the Tarski–Seidenberg theorem. This says that a subset of \mathbb{R}^{n+1} defined by polynomial equations and inequalities can be projected down onto \mathbb{R}^n, and the resulting set is still definable in terms of polynomial identities and inequalities. This wouldn’t be true if we didn’t allow inequalities. It’s neat to see this theorem, important in mathematical logic, showing up in chemistry!

Structure of the paper

Okay, now you’re ready to read our paper! Here’s how it goes:

In Section 2 we review and compare reaction networks and Petri nets. In Section 3 we construct a symmetric monoidal category \textrm{RNet} where an object is a finite set and a morphism is an open reaction network (or more precisely, an isomorphism class of open reaction networks). In Section 4 we enhance this construction to define a symmetric monoidal category \textrm{RxNet} where the transitions of the open reaction networks are equipped with rate constants. In Section 5 we explain the open dynamical system associated to an open reaction network, and in Section 6 we construct a symmetric monoidal category \textrm{Dynam} of open dynamical systems. In Section 7 we construct the gray-boxing functor

\diamond: \textrm{RxNet} \to \textrm{Dynam}

In Section 8 we construct the black-boxing functor

\square: \textrm{Dynam} \to \mathrm{SemiAlgRel}

We show both of these are symmetric monoidal functors.

Finally, in Section 9 we fit our results into a larger ‘network of network theories’. This is where various results in various papers I’ve been writing in the last few years start assembling to form a big picture! But this picture needs to grow….