## Nonequilibrium Thermodynamics in Biology (Part 2)

16 June, 2021

Larry Li, Bill Cannon and I ran a session on non-equilibrium thermodynamics in biology at SMB2021, the annual meeting of the Society for Mathematical Biology. You can see talk slides here!

Here’s the basic idea:

Since Lotka, physical scientists have argued that living things belong to a class of complex and orderly systems that exist not despite the second law of thermodynamics, but because of it. Life and evolution, through natural selection of dissipative structures, are based on non-equilibrium thermodynamics. The challenge is to develop an understanding of what the respective physical laws can tell us about flows of energy and matter in living systems, and about growth, death and selection. This session addresses current challenges including understanding emergence, regulation and control across scales, and entropy production, from metabolism in microbes to evolving ecosystems.

Click on the links to see slides for most of the talks:

Persistence, permanence, and global stability in reaction network models: some results inspired by thermodynamic principles

The standard mathematical model for the dynamics of concentrations in biochemical networks is called mass-action kinetics. We describe mass-action kinetics and discuss the connection between special classes of mass-action systems (such as detailed balanced and complex balanced systems) and the Boltzmann equation. We also discuss the connection between the ‘global attractor conjecture’ for complex balanced mass-action systems and Boltzmann’s H-theorem. We also describe some implications for biochemical mechanisms that implement noise filtering and cellular homeostasis.

The principle of maximum caliber of nonequilibria
Ken Dill, Stony Brook University

Maximum Caliber is a principle for inferring pathways and rate distributions of kinetic processes. The structure and foundations of MaxCal are much like those of Maximum Entropy for static distributions. We have explored how MaxCal may serve as a general variational principle for nonequilibrium statistical physics—giving well-known results, such as the Green-Kubo relations, Onsager’s reciprocal relations and Prigogine’s Minimum Entropy Production principle near equilibrium, but is also applicable far from equilibrium. I will also discuss some applications, such as finding reaction coordinates in molecular simulations non-linear dynamics in gene circuits, power-law-tail distributions in ‘social-physics’ networks, and others.

Nonequilibrium biomolecular information processes
Pierre Gaspard, Université libre de Bruxelles

Nearly 70 years have passed since the discovery of DNA structure and its role in coding genetic information. Yet, the kinetics and thermodynamics of genetic information processing in DNA replication, transcription, and translation remain poorly understood. These template-directed copolymerization processes are running away from equilibrium, being powered by extracellular energy sources. Recent advances show that their kinetic equations can be exactly solved in terms of so-called iterated function systems. Remarkably, iterated function systems can determine the effects of genome sequence on replication errors, up to a million times faster than kinetic Monte Carlo algorithms. With these new methods, fundamental links can be established between molecular information processing and the second law of thermodynamics, shedding a new light on genetic drift, mutations, and evolution.

Nonequilibrium dynamics of disturbed ecosystems
John Harte, University of California, Berkeley

The Maximum Entropy Theory of Ecology (METE) predicts the shapes of macroecological metrics in relatively static ecosystems, across spatial scales, taxonomic categories, and habitats, using constraints imposed by static state variables. In disturbed ecosystems, however, with time-varying state variables, its predictions often fail. We extend macroecological theory from static to dynamic, by combining the MaxEnt inference procedure with explicit mechanisms governing disturbance. In the static limit, the resulting theory, DynaMETE, reduces to METE but also predicts a new scaling relationship among static state variables. Under disturbances, expressed as shifts in demographic, ontogenic growth, or migration rates, DynaMETE predicts the time trajectories of the state variables as well as the time-varying shapes of macroecological metrics such as the species abundance distribution and the distribution of metabolic rates over
individuals. An iterative procedure for solving the dynamic theory is presented. Characteristic signatures of the deviation from static predictions of macroecological patterns are shown to result from different kinds of disturbance. By combining MaxEnt inference with explicit dynamical mechanisms of disturbance, DynaMETE is a candidate theory of macroecology for ecosystems responding to anthropogenic or natural disturbances.

Stochastic chemical reaction networks
Supriya Krishnamurthy, Stockholm University

The study of chemical reaction networks (CRN’s) is a very active field. Earlier well-known results (Feinberg Chem. Enc. Sci. 42 2229 (1987), Anderson et al Bull. Math. Biol. 72 1947 (2010)) identify a topological quantity called deficiency, easy to compute for CRNs of any size, which, when exactly equal to zero, leads to a unique factorized (non-equilibrium) steady-state for these networks. No general results exist however for the steady states of non-zero-deficiency networks. In recent work, we show how to write the full moment-hierarchy for any non-zero-deficiency CRN obeying mass-action kinetics, in terms of equations for the factorial moments. Using these, we can recursively predict values for lower moments from higher moments, reversing the procedure usually used to solve moment hierarchies. We show, for non-trivial examples, that in this manner we can predict any moment of interest, for CRN’s with non-zero deficiency and non-factorizable steady states. It is however an open question how scalable these techniques are for large networks.

Heat flows adjust local ion concentrations in favor of prebiotic chemistry
Christof Mast, Ludwig-Maximilians-Universität München

Prebiotic reactions often require certain initial concentrations of ions. For example, the activity of RNA enzymes requires a lot of divalent magnesium salt, whereas too much monovalent sodium salt leads to a reduction in enzyme function. However, it is known from leaching experiments that prebiotically relevant geomaterial such as basalt releases mainly a lot of sodium and only little magnesium. A natural solution to this problem is heat fluxes through thin rock fractures, through which magnesium is actively enriched and sodium is depleted by thermogravitational convection and thermophoresis. This process establishes suitable conditions for ribozyme function from a basaltic leach. It can take place in a spatially distributed system of rock cracks and is therefore particularly stable to natural fluctuations and disturbances.

Deficiency of chemical reaction networks and thermodynamics
Matteo Polettini, University of Luxembourg

Deficiency is a topological property of a Chemical Reaction Network linked to important dynamical features, in particular of deterministic fixed points and of stochastic stationary states. Here we link it to thermodynamics: in particular we discuss the validity of a strong vs. weak zeroth law, the existence of time-reversed mass-action kinetics, and the possibility to formulate marginal fluctuation relations. Finally we illustrate some subtleties of the Python module we created for MCMC stochastic simulation of CRNs, soon to be made public.

Large deviations theory and emergent landscapes in biological dynamics
Hong Qian, University of Washington

The mathematical theory of large deviations provides a nonequilibrium thermodynamic description of complex biological systems that consist of heterogeneous individuals. In terms of the notions of stochastic elementary reactions and pure kinetic species, the continuous-time, integer-valued Markov process dictates a thermodynamic structure that generalizes (i) Gibbs’ microscopic chemical thermodynamics of equilibrium matters to nonequilibrium small systems such as living cells and tissues; and (ii) Gibbs’ potential function to the landscapes for biological dynamics, such as that of C. H. Waddington and S. Wright.

Using the maximum entropy production principle to understand and predict microbial biogeochemistry
Joseph Vallino, Marine Biological Laboratory, Woods Hole

Natural microbial communities contain billions of individuals per liter and can exceed a trillion cells per liter in sediments, as well as harbor thousands of species in the same volume. The high species diversity contributes to extensive metabolic functional capabilities to extract chemical energy from the environment, such as methanogenesis, sulfate reduction, anaerobic photosynthesis, chemoautotrophy, and many others, most of which are only expressed by bacteria and archaea. Reductionist modeling of natural communities is problematic, as we lack knowledge on growth kinetics for most organisms and have even less understanding on the mechanisms governing predation, viral lysis, and predator avoidance in these systems. As a result, existing models that describe microbial communities contain dozens to hundreds of parameters, and state variables are extensively aggregated. Overall, the models are little more than non-linear parameter fitting exercises that have limited, to no, extrapolation potential, as there are few principles governing organization and function of complex self-assembling systems. Over the last decade, we have been developing a systems approach that models microbial communities as a distributed metabolic network that focuses on metabolic function rather than describing individuals or species. We use an optimization approach to determine which metabolic functions in the network should be up regulated versus those that should be down regulated based on the non-equilibrium thermodynamics principle of maximum entropy production (MEP). Derived from statistical mechanics, MEP proposes that steady state systems will likely organize to maximize free energy dissipation rate. We have extended this conjecture to apply to non-steady state systems and have proposed that living systems maximize entropy production integrated over time and space, while non-living systems maximize instantaneous entropy production. Our presentation will provide a brief overview of the theory and approach, as well as present several examples of applying MEP to describe the biogeochemistry of microbial systems in laboratory experiments and natural ecosystems.

Reduction and the quasi-steady state approximation
Carsten Wiuf, University of Copenhagen

Chemical reactions often occur at different time-scales. In applications of chemical reaction network theory it is often desirable to reduce a reaction network to a smaller reaction network by elimination of fast species or fast reactions. There exist various techniques for doing so, e.g. the Quasi-Steady-State Approximation or the Rapid Equilibrium Approximation. However, these methods are not always mathematically justifiable. Here, a method is presented for which (so-called) non-interacting species are eliminated by means of QSSA. It is argued that this method is mathematically sound. Various examples are given (Michaelis-Menten mechanism, two-substrate mechanism, …) and older related techniques from the 50s and 60s are briefly discussed.

## Fisher’s Fundamental Theorem (Part 3)

8 October, 2020

Last time we stated and proved a simple version of Fisher’s fundamental theorem of natural selection, which says that under some conditions, the rate of increase of the mean fitness equals the variance of the fitness. But the conditions we gave were very restrictive: namely, that the fitness of each species of replicator is constant, not depending on how many of these replicators there are, or any other replicators.

To broaden the scope of Fisher’s fundamental theorem we need to do one of two things:

1) change the left side of the equation: talk about some other quantity other than rate of change of mean fitness.

2) change the right side of the question: talk about some other quantity than the variance in fitness.

Or we could do both! People have spent a lot of time generalizing Fisher’s fundamental theorem. I don’t think there are, or should be, any hard rules on what counts as a generalization.

But today we’ll take alternative 1). We’ll show the square of something called the ‘Fisher speed’ always equals the variance in fitness. One nice thing about this result is that we can drop the restrictive condition I mentioned. Another nice thing is that the Fisher speed is a concept from information theory! It’s defined using the Fisher metric on the space of probability distributions.

And yes—that metric is named after the same guy who proved Fisher’s fundamental theorem! So, arguably, Fisher should have proved this generalization of Fisher’s fundamental theorem. But in fact it seems that I was the first to prove it, around February 1st, 2017. Some similar results were already known, and I will discuss those someday. But they’re a bit different.

A good way to think about the Fisher speed is that it’s ‘the rate at which information is being updated’. A population of replicators of different species gives a probability distribution. Like any probability distribution, this has information in it. As the populations of our replicators change, the Fisher speed says the rate at which this information is being updated. So, in simple terms, we’ll show

The square of the rate at which information is updated is equal to the variance in fitness.

This is quite a change from Fisher’s original idea, namely:

The rate of increase of mean fitness is equal to the variance in fitness.

But it has the advantage of always being true… as long the population dynamics are described by the general framework we introduced last time. So let me remind you of the general setup, and then prove the result!

### The setup

We start out with population functions $P_i \colon \mathbb{R} \to (0,\infty),$ one for each species of replicator $i = 1,\dots,n,$ obeying the Lotka–Volterra equation

$\displaystyle{ \frac{d P_i}{d t} = f_i(P_1, \dots, P_n) P_i }$

for some differentiable functions $f_i \colon (0,\infty) \to \mathbb{R}$ called fitness functions. The probability of a replicator being in the ith species is

$\displaystyle{ p_i(t) = \frac{P_i(t)}{\sum_j P_j(t)} }$

Using the Lotka–Volterra equation we showed last time that these probabilities obey the replicator equation

$\displaystyle{ \dot{p}_i = \left( f_i(P) - \overline f(P) \right) p_i }$

Here $P$ is short for the whole list of populations $(P_1(t), \dots, P_n(t)),$ and

$\displaystyle{ \overline f(P) = \sum_j f_j(P) p_j }$

is the mean fitness.

### The Fisher metric

The space of probability distributions on the set $\{1, \dots, n\}$ is called the (n-1)-simplex

$\Delta^{n-1} = \{ (x_1, \dots, x_n) : \; x_i \ge 0, \; \displaystyle{ \sum_{i=1}^n x_i = 1 } \}$

It’s called $\Delta^{n-1}$ because it’s (n-1)-dimensional. When $n = 3$ it looks like the letter $\Delta:$

The Fisher metric is a Riemannian metric on the interior of the (n-1)-simplex. That is, given a point $p$ in the interior of $\Delta^{n-1}$ and two tangent vectors $v,w$ at this point the Fisher metric gives a number

$g(v,w) = \displaystyle{ \sum_{i=1}^n \frac{v_i w_i}{p_i} }$

Here we are describing the tangent vectors $v,w$ as vectors in $\mathbb{R}^n$ with the property that the sum of their components is zero: that’s what makes them tangent to the (n-1)-simplex. And we’re demanding that $x$ be in the interior of the simplex to avoid dividing by zero, since on the boundary of the simplex we have $p_i = 0$ for at least one choice of $i.$

If we have a probability distribution $p(t)$ moving around in the interior of the (n-1)-simplex as a function of time, its Fisher speed is

$\displaystyle{ \sqrt{g(\dot{p}(t), \dot{p}(t))} = \sum_{i=1}^n \frac{\dot{p}_i(t)^2}{p_i(t)} }$

if the derivative $\dot{p}(t)$ exists. This is the usual formula for the speed of a curve moving in a Riemannian manifold, specialized to the case at hand.

Now we’ve got all the formulas we’ll need to prove the result we want. But for those who don’t already know and love it, it’s worthwhile saying a bit more about the Fisher metric.

The factor of $1/x_i$ in the Fisher metric changes the geometry of the simplex so that it becomes round, like a portion of a sphere:

But the reason the Fisher metric is important, I think, is its connection to relative information. Given two probability distributions $p, q \in \Delta^{n-1},$ the information of $q$ relative to $p$ is

$\displaystyle{ I(q,p) = \sum_{i = 1}^n q_i \ln\left(\frac{q_i}{p_i}\right) }$

You can show this is the expected amount of information gained if $p$ was your prior distribution and you receive information that causes you to update your prior to $q.$ So, sometimes it’s called the information gain. It’s also called relative entropy or—my least favorite, since it sounds so mysterious—the Kullback–Leibler divergence.

Suppose $p(t)$ is a smooth curve in the interior of the (n-1)-simplex. We can ask the rate at which information is gained as time passes. Perhaps surprisingly, a calculation gives

$\displaystyle{ \frac{d}{dt} I(p(t), p(t_0))\Big|_{t = t_0} = 0 }$

That is, in some sense ‘to first order’ no information is being gained at any moment $t_0 \in \mathbb{R}.$ However, we have

$\displaystyle{ \frac{d^2}{dt^2} I(p(t), p(t_0))\Big|_{t = t_0} = g(\dot{p}(t_0), \dot{p}(t_0))}$

So, the square of the Fisher speed has a nice interpretation in terms of relative entropy!

For a derivation of these last two equations, see Part 7 of my posts on information geometry. For more on the meaning of relative entropy, see Part 6.

### The result

It’s now extremely easy to show what we want, but let me state it formally so all the assumptions are crystal clear.

Theorem. Suppose the functions $P_i \colon \mathbb{R} \to (0,\infty)$ obey the Lotka–Volterra equations:

$\displaystyle{ \dot P_i = f_i(P) P_i}$

for some differentiable functions $f_i \colon (0,\infty)^n \to \mathbb{R}$ called fitness functions. Define probabilities and the mean fitness as above, and define the variance of the fitness by

$\displaystyle{ \mathrm{Var}(f(P)) = \sum_j ( f_j(P) - \overline f(P))^2 \, p_j }$

Then if none of the populations $P_i$ are zero, the square of the Fisher speed of the probability distribution $p(t) = (p_1(t), \dots , p_n(t))$ is the variance of the fitness:

$g(\dot{p}, \dot{p}) = \mathrm{Var}(f(P))$

Proof. The proof is near-instantaneous. We take the square of the Fisher speed:

$\displaystyle{ g(\dot{p}, \dot{p}) = \sum_{i=1}^n \frac{\dot{p}_i(t)^2}{p_i(t)} }$

and plug in the replicator equation:

$\displaystyle{ \dot{p}_i = (f_i(P) - \overline f(P)) p_i }$

We obtain:

$\begin{array}{ccl} \displaystyle{ g(\dot{p}, \dot{p})} &=& \displaystyle{ \sum_{i=1}^n \left( f_i(P) - \overline f(P) \right)^2 p_i } \\ \\ &=& \mathrm{Var}(f(P)) \end{array}$

as desired.   █

It’s hard to imagine anything simpler than this. We see that given the Lotka–Volterra equation, what causes information to be updated is nothing more and nothing less than variance in fitness! But there are other variants of Fisher’s fundamental theorem worth discussing, so I’ll talk about those in future posts.

## Ascendancy vs. Reserve

22 September, 2020

Why is biodiversity ‘good’? To what extent is this sort of goodness even relevant to ecosystems—as opposed to us humans? I’d like to study this mathematically.

To do this, we’d need to extract some answerable questions out of the morass of subtlety and complexity. For example: what role does biodiversity play in the ability of ecosystems to be robust under sudden changes of external conditions? This is already plenty hard to study mathematically, since it requires understanding ‘biodiversity’ and ‘robustness’.

Luckily there has already been a lot of work on the mathematics of biodiversity and its connection to entropy. For example:

• Tom Leinster, Measuring biodiversity, Azimuth, 7 November 2011.

But how does biodiversity help robustness?

There’s been a lot of work on this. This paper has some inspiring passages:

• Robert E. Ulanowicz,, Sally J. Goerner, Bernard Lietaer and Rocio Gomez, Quantifying sustainability: Resilience, efficiency and the return of information theory, Ecological Complexity 6 (2009), 27–36.

I’m not sure the math lives up to their claims, but I like these lines:

In other words, (14) says that the capacity for a system to undergo evolutionary change or self-organization consists of two aspects: It must be capable of exercising sufficient directed power (ascendancy) to maintain its integrity over time. Simultaneously, it must possess a reserve of flexible actions that can be used to meet the exigencies of novel disturbances. According to (14) these two aspects are literally complementary.

The two aspects are ‘ascendancy’, which is something like efficiency or being optimized, and ‘reserve capacity’, which is something like random junk that might come in handy if something unexpected comes up.

You know those gadgets you kept in the back of your kitchen drawer and never needed… until you did? If you’re aiming for ‘ascendancy’ you’d clear out those drawers. But if you keep that stuff, you’ve got more ‘reserve capacity’. They both have their good points. Ideally you want to strike a wise balance. You’ve probably sensed this every time you clean out your house: should I keep this thing because I might need it, or should I get rid of it?

I think it would be great to make these concepts precise. The paper at hand attempts this by taking a matrix of nonnegative numbers $T_{i j}$ to describe flows in an ecological network. They define a kind of entropy for this matrix, very similar in look to Shannon entropy. Then they write this as a sum of two parts: a part closely analogous to mutual information, and a part closely analogous to conditional entropy. This decomposition is standard in information theory. This is their equation (14).

The new idea of these authors is that in the context of an ecological network, the mutual information can be understood as ‘ascendancy’, while the conditional entropy can be understood as ‘reserve capacity’.

I don’t know if I believe this! But I like the general idea of a balance between ascendancy and reserve capacity.

They write:

While the dynamics of this dialectic interaction can be quite subtle and highly complex, one thing is boldly clear—systems with either vanishingly small ascendancy or insignificant reserves are destined to perish before long. A system lacking ascendancy has neither the extent of activity nor the internal organization needed to survive. By contrast, systems that are so tightly constrained and honed to a particular environment appear ‘‘brittle’’ in the sense of Holling (1986) or ‘‘senescent’’ in the sense of Salthe (1993) and are prone to collapse in the face of even minor novel disturbances. Systems that endure—that is, are sustainable—lie somewhere between these extremes. But, where?

## Entropy in the Universe

25 January, 2020

If you click on this picture, you’ll see a zoomable image of the Milky Way with 84 million stars:

But stars contribute only a tiny fraction of the total entropy in the observable Universe. If it’s random information you want, look elsewhere!

First: what’s the ‘observable Universe’, exactly?

The further you look out into the Universe, the further you look back in time. You can’t see through the hot gas from 380,000 years after the Big Bang. That ‘wall of fire’ marks the limits of the observable Universe.

But as the Universe expands, the distant ancient stars and gas we see have moved even farther away, so they’re no longer observable. Thus, the so-called ‘observable Universe’ is really the ‘formerly observable Universe’. Its edge is 46.5 billion light years away now!

This is true even though the Universe is only 13.8 billion years old. A standard challenge in understanding general relativity is to figure out how this is possible, given that nothing can move faster than light.

What’s the total number of stars in the observable Universe? Estimates go up as telescopes improve. Right now people think there are between 100 and 400 billion stars in the Milky Way. They think there are between 170 billion and 2 trillion galaxies in the Universe.

In 2009, Chas Egan and Charles Lineweaver estimated the total entropy of all the stars in the observable Universe at 1081 bits. You should think of these as qubits: it’s the amount of information to describe the quantum state of everything in all these stars.

But the entropy of interstellar and intergalactic gas and dust is about ten times more the entropy of stars! It’s about 1082 bits.

The entropy in all the photons in the Universe is even more! The Universe is full of radiation left over from the Big Bang. The photons in the observable Universe left over from the Big Bang have a total entropy of about 1090 bits. It’s called the ‘cosmic microwave background radiation’.

The neutrinos from the Big Bang also carry about 1090 bits—a bit less than the photons. The gravitons carry much less, about 1088 bits. That’s because they decoupled from other matter and radiation very early, and have been cooling ever since. On the other hand, photons in the cosmic microwave background radiation were formed by annihilating
electron-positron pairs until about 10 seconds after the Big Bang. Thus the graviton radiation is expected to be cooler than the microwave background radiation: about 0.6 kelvin as compared to 2.7 kelvin.

Black holes have immensely more entropy than anything listed so far. Egan and Lineweaver estimate the entropy of stellar-mass black holes in the observable Universe at 1098 bits. This is connected to why black holes are so stable: the Second Law says entropy likes to increase.

But the entropy of black holes grows quadratically with mass! So black holes tend to merge and form bigger black holes — ultimately forming the ‘supermassive’ black holes at the centers of most galaxies. These dominate the entropy of the observable Universe: about 10104 bits.

Hawking predicted that black holes slowly radiate away their mass when they’re in a cold enough environment. But the Universe is much too hot for supermassive black holes to be losing mass now. Instead, they very slowly grow by eating the cosmic microwave background, even when they’re not eating stars, gas and dust.

So, only in the far future will the Universe cool down enough for large black holes to start slowly decaying via Hawking radiation. Entropy will continue to increase… going mainly into photons and gravitons! This process will take a very long time. Assuming nothing is falling into it and no unknown effects intervene, a solar-mass black hole takes about 1067 years to evaporate due to Hawking radiation — while a really big one, comparable to the mass of a galaxy, should take about 1099 years.

If our current most popular ideas on dark energy are correct, the Universe will continue to expand exponentially. Thanks to this, there will be a cosmological event horizon surrounding each observer, which will radiate Hawking radiation at a temperature of roughly 10-30 kelvin.

In this scenario the Universe in the very far future will mainly consist of massless particles produced as Hawking radiation at this temperature: photons and gravitons. The entropy within the exponentially expanding ball of space that is today our ‘observable Universe’ will continue to increase exponentially… but more to the point, the entropy density will approach that of a gas of photons and gravitons in thermal equilibrium at 10-30 kelvin.

Of course, it’s quite likely that some new physics will turn up, between now and then, that changes the story! I hope so: this would be a rather dull ending to the Universe.

For more details, go here:

• Chas A. Egan and Charles H. Lineweaver, A larger estimate of the entropy of the universe, The Astrophysical Journal 710 (2010), 1825.

Also read my page on information.

## Coupling Through Emergent Conservation Laws (Part 8)

3 July, 2018

joint post with Jonathan Lorand, Blake Pollard, and Maru Sarazola

To wrap up this series, let’s look at an even more elaborate cycle of reactions featuring emergent conservation laws: the citric acid cycle. Here’s a picture of it from Stryer’s textbook Biochemistry:

I’ll warn you right now that we won’t draw any grand conclusions from this example: that’s why we left it out of our paper. Instead we’ll leave you with some questions we don’t know how to answer.

All known aerobic organisms use the citric cycle to convert energy derived from food into other useful forms. This cycle couples an exergonic reaction, the conversion of acetyl-CoA to CoA-SH, to endergonic reactions that produce ATP and a chemical called NADH.

The citric acid cycle can be described at various levels of detail, but at one level it consists of ten reactions:

$\begin{array}{rcl} \mathrm{A}_1 + \text{acetyl-CoA} + \mathrm{H}_2\mathrm{O} & \longleftrightarrow & \mathrm{A}_2 + \text{CoA-SH} \\ \\ \mathrm{A}_2 & \longleftrightarrow & \mathrm{A}_3 + \mathrm{H}_2\mathrm{O} \\ \\ \mathrm{A}_3 + \mathrm{H}_2\mathrm{O} & \longleftrightarrow & \mathrm{A}_4 \\ \\ \mathrm{A}_4 + \mathrm{NAD}^+ & \longleftrightarrow & \mathrm{A}_5 + \mathrm{NADH} + \mathrm{H}^+ \\ \\ \mathrm{A}_5 + \mathrm{H}^+ & \longleftrightarrow & \mathrm{A}_6 + \textrm{CO}_2 \\ \\ \mathrm{A}_6 + \mathrm{NAD}^+ + \text{CoA-SH} & \longleftrightarrow & \mathrm{A}_7 + \mathrm{NADH} + \mathrm{H}^+ + \textrm{CO}_2 \\ \\ \mathrm{A}_7 + \mathrm{ADP} + \mathrm{P}_{\mathrm{i}} & \longleftrightarrow & \mathrm{A}_8 + \text{CoA-SH} + \mathrm{ATP} \\ \\ \mathrm{A}_8 + \mathrm{FAD} & \longleftrightarrow & \mathrm{A}_9 + \mathrm{FADH}_2 \\ \\ \mathrm{A}_9 + \mathrm{H}_2\mathrm{O} & \longleftrightarrow & \mathrm{A}_{10} \\ \\ \mathrm{A}_{10} + \mathrm{NAD}^+ & \longleftrightarrow & \mathrm{A}_1 + \mathrm{NADH} + \mathrm{H}^+ \end{array}$

Here $\mathrm{A}_1, \dots, \mathrm{A}_{10}$ are abbreviations for species that cycle around, each being transformed into the next. It doesn’t really matter for what we’ll be doing, but in case you’re curious:

$\mathrm{A}_1=$ oxaloacetate,
$\mathrm{A}_2=$ citrate,
$\mathrm{A}_3=$ cis-aconitate,
$\mathrm{A}_4=$ isocitrate,
$\mathrm{A}_5=$ oxalosuccinate,
$\mathrm{A}_6=$ α-ketoglutarate,
$\mathrm{A}_7=$ succinyl-CoA,
$\mathrm{A}_8=$ succinate,
$\mathrm{A}_9=$ fumarate,
$\mathrm{A}_{10}=$ L-malate.

In reality, the citric acid cycle also involves inflows of reactants such as acetyl-CoA, which is produced by metabolism, as well as outflows of both useful products such as ADP and NADH and waste products such as CO2. Thus, a full analysis requires treating this cycle as an open chemical reaction network, where species flow in and out. However, we can gain some insight just by studying the emergent conservation laws present in this network, ignoring inflows and outflows—so let’s do that!

There are a total of 22 species in the citric acid cycle. There are 10 forward reactions. We can see that their vectors are all linearly independent as follows. Since each reaction turns $\mathrm{A}_i$ into $\mathrm{A}_{i+1}$, where we count modulo 10, it is easy to see that any nine of the reaction vectors are linearly independent. Whichever one we choose to ‘close the cycle’ could in theory be linearly dependent on the rest. However, it is easy to see that the vector for this reaction

$\mathrm{A}_8 + \mathrm{FAD} \longleftrightarrow \mathrm{A}_9 + \mathrm{FADH}_2$

is linearly independent from the rest, because only this one involves FAD. So, all 10 reaction vectors are linearly independent, and the stoichiometric subspace has dimension 10.

Since 22 – 10 = 12, there must be 12 linearly independent conserved quantities. Some of these conservation laws are ‘fundamental’, at least by the standards of chemistry. All the species involved are made of 6 different atoms (carbon, hydrogen, oxygen, nitrogen, phosphorus and sulfur), and conservation of charge provides another fundamental conserved quantity, for a total of 7.

(In our example from last time we didn’t keep track of conservation of hydrogen and charge, because both $\mathrm{H}^+$ and $e^-$ ions are freely available in water… but we studied the citric acid cycle when we were younger, more energetic and less wise, so we kept careful track of hydrogen and charge, and made sure that all the reactions conserved these. So, we’ll have 7 fundamental conserved quantities.)

For example, the conserved quantity

$[\text{acetyl-CoA}] + [\text{CoA-SH}] + [\mathrm{A}_7]$

arises from the fact that $\text{acetyl-CoA}$, $\text{CoA-SH}$ and $\mathrm{A}_7$ contain a single sulfur atom, while none of the other species involved contain sulfur.

Similarly, the conserved quantity

$3[\mathrm{ATP}] + 2[\mathrm{ADP}] + [\mathrm{P}_{\mathrm{i}}] + 2[\mathrm{FAD}] +2[\mathrm{FADH}_2]$

expresses conservation of phosphorus.

Besides the 7 fundamental conserved quantities, there must also be 5 linearly independent emergent conserved quantities: that is, quantities that are not conserved in every possible chemical reaction, but remain constant in every reaction in the citric acid cycle. We can use these 5 quantities:

$[\mathrm{ATP}] + [\mathrm{ADP}]$, due to the conservation of adenosine.

$[\mathrm{FAD}] + [\mathrm{FADH}_2]$, due to conservation of flavin adenine dinucleotide.

$[\mathrm{NAD}^+] + [\mathrm{NADH}]$, due to conservation of nicotinamide adenine dinucleotide.

$[\mathrm{A}_1] + \cdots + [\mathrm{A}_{10}]$. This expresses the fact that in the citric acid cycle each species $[\mathrm{A}_i]$ is transformed to the next, modulo 10.

$[\text{acetyl-CoA}] + [\mathrm{A}_1] + \cdots + [\mathrm{A}_7] + [\text{CoA-SH}]$. It can be checked by hand that each reaction in the citric acid cycle conserves this quantity. This expresses the fact that during the first 7 reactions of the citric acid cycle, one molecule of $\text{acetyl-CoA}$ is destroyed and one molecule of $\text{CoA-SH}$ is formed.

Of course, other conserved quantities can be formed as linear combinations of fundamental and emergent conserved quantities, often in nonobvious ways. An example is

$3 [\text{acetyl-CoA}] + 3 [\mathrm{A}_2] + 3[\mathrm{A}_3] + 3[\mathrm{A}_4] + 2[\mathrm{A}_5] +$
$2[\mathrm{A}_6] + [\mathrm{A}_7] + [\mathrm{A}_8] + [\mathrm{A}_9] + [\mathrm{A}_{10}] + [\mathrm{NADH}]$

which expresses the fact that in each turn of the citric acid cycle, one molecule of $\text{acetyl-CoA}$ is destroyed and three of $\mathrm{NADH}$ are formed. It is easier to check by hand that this quantity is conserved than to express it as an explicit linear combination of the 12 conserved quantities we have listed so far.

Finally, we bit you a fond farewell and leave you with this question: what exactly do the 7 emergent conservation laws do? In our previous two examples (ATP hydrolysis and the urea cycle) there were certain undesired reactions involving just the species we listed which were forbidden by the emergent conservation laws. In this case I don’t see any of those. But there are other important processes, involving additional species, that are forbidden. For example, if you let acetyl-CoA sit in water it will ‘hydrolyze’ as follows:

$\text{acetyl-CoA} + \mathrm{H}_2\mathrm{O} \longleftrightarrow \text{CoA-SH} + \text{acetate} + \text{H}^+$

So, it’s turning into CoA-SH and some other stuff, somewhat as does in the citric acid cycle, but in a way that doesn’t do anything ‘useful’: no ATP or NADH is created in this process. This is one of the things the citric acid cycle tries to prevent.

(Remember, a reaction being ‘forbidden by emergent conservation laws’ doesn’t mean it’s absolutely forbidden. It just means that it happens much more slowly than the catalyzed reactions we are listing in our reaction network.)

Unfortunately acetate and $\text{H}^+$ aren’t on the list of species we’re considering. We could add them. If we added them, and perhaps other species, could we get a setup where every emergent conservation law could be seen as preventing a specific unwanted reaction that’s chemically allowed?

Ideally the dimension of the space of emergent conservation laws would match the dimension of the space spanned by reaction vectors of unwanted reactions, so ‘everything would be accounted for’. But even in the simpler example of the urea cycle, we didn’t achieve this perfect match.

The paper:

• John Baez, Jonathan Lorand, Blake S. Pollard and Maru Sarazola,
Biochemical coupling through emergent conservation laws.

The blog series:

Part 1 – Introduction.

Part 2 – Review of reaction networks and equilibrium thermodynamics.

Part 3 – What is coupling?

Part 4 – Interactions.

Part 5 – Coupling in quasiequilibrium states.

Part 6 – Emergent conservation laws.

Part 7 – The urea cycle.

Part 8 – The citric acid cycle.

## Coupling Through Emergent Conservation Laws (Part 6)

1 July, 2018

joint post with Jonathan Lorand, Blake Pollard, and Maru Sarazola

Now let’s think about emergent conservation laws!

When a heavy rock connected to a lighter one by a pulley falls down and pulls up the lighter one, you’re seeing an emergent conservation law:

Here the height of the heavy rock plus the height of light one is a constant. That’s a conservation law! It forces some of the potential energy lost by one rock to be transferred to the other. But it’s not a fundamental conservation law, built into the fabric of physics. It’s an emergent law that holds only thanks to the clever design of the pulley. If the rope broke, this law would be broken too.

It’s not surprising that biology uses similar tricks. But let’s see exactly how it works. First let’s look at all four reactions we’ve been studying:

$\begin{array}{cccc} \mathrm{X} + \mathrm{Y} & \mathrel{\substack{\alpha_{\rightarrow} \\\longleftrightarrow\\ \alpha_{\leftarrow}}} & \mathrm{XY} & \qquad (1) \\ \\ \mathrm{ATP} & \mathrel{\substack{\beta_{\rightarrow} \\\longleftrightarrow\\ \beta_{\leftarrow}}} & \mathrm{ADP} + \mathrm{P}_{\mathrm{i}} & \qquad (2) \\ \\ \mathrm{X} + \mathrm{ATP} & \mathrel{\substack{\gamma_{\rightarrow} \\\longleftrightarrow\\ \gamma_{\leftarrow}}} & \mathrm{ADP} + \mathrm{XP}_{\mathrm{i}} & \qquad (3) \\ \\ \mathrm{XP}_{\mathrm{i}} +\mathrm{Y} & \mathrel{\substack{\delta_{\rightarrow} \\\longleftrightarrow\\ \delta_{\leftarrow}}} & \mathrm{XY} + \mathrm{P}_{\mathrm{i}} & \qquad (4) \end{array}$

It’s easy to check that the rate equations for these reactions have the following conserved quantities, that is, quantities that are constant in time:

A) $[\mathrm{X}] + [\mathrm{XP}_{\mathrm{i}} ] + [\mathrm{XY}],$ due to the conservation of X.

B) $[\mathrm{Y}] + [\mathrm{XY}],$ due to the conservation of Y.

C) $3[\mathrm{ATP}] +[\mathrm{XP}_{\mathrm{i}} ] +[\mathrm{P}_{\mathrm{i}}] +2[\mathrm{ADP}],$ due to the conservation of phosphorus.

D) $[\mathrm{ATP}] + [\mathrm{ADP}],$ due to the conservation of adenosine.

Moreover, these quantities, and their linear combinations, are the only conserved quantities for reactions (1)–(4).

To see this, we use some standard ideas from reaction network theory. Consider the 7-dimensional space with orthonormal basis given by the species in our reaction network:

$\mathrm{ATP}, \mathrm{ADP}, \mathrm{P}_{\mathrm{i}}, \mathrm{XP}_{\mathrm{i}}, \mathrm{X}, \mathrm{Y}, \mathrm{XY}$

We can think of complexes like $\mathrm{ADP} + \mathrm{P}_{\mathrm{i}}$ as vectors in this space. An arbitrary choice of the concentrations of all species also defines a vector in this space. Furthermore, any reaction involving these species defines a vector in this space, namely the sum of the products minus the sum of the reactants. This is called the reaction vector of this reaction. Reactions (1)–(4) give these reaction vectors:

$\begin{array}{ccl} v_\alpha &=& \mathrm{XY} - \mathrm{X} - \mathrm{Y} \\ \\ v_\beta &= & \mathrm{P}_{\mathrm{i}} + \mathrm{ADP} - \mathrm{ATP} \\ \\ v_\gamma &=& \mathrm{XP}_{\mathrm{i}} + \mathrm{ADP} - \mathrm{ATP} - \mathrm{X} \\ \\ v_\delta &= & \mathrm{XY} + \mathrm{P}_{\mathrm{i}} - \mathrm{XP}_{\mathrm{i}} - \mathrm{Y} \end{array}$

Any change in concentrations caused by these reactions must lie in the stoichiometric subspace: that is, the space spanned by the reaction vectors. Since these vectors obey one nontrivial relation:

$v_\alpha + v_\beta = v_\gamma + v_\delta$

the stochiometric subspace is 3-dimensional. Therefore, the space of conserved quantities must be 4-dimensional, since these specify the constraints on allowed changes in concentrations.

Now let’s compare the situation where ‘coupling’ occurs! For this we consider only reactions (3) and (4):

Now the stoichiometric subspace is 2-dimensional, since $v_\gamma$ and $v_\delta$ are linearly independent. Thus, the space of conserved quantities becomes 5-dimensional! Indeed, we can find an additional conserved quantity:

E) $[\mathrm{Y} ] +[\mathrm{P}_{\mathrm{i}}]$

that is linearly independent from the four conserved quantities we had before. It does not derive from the conservation of a particular molecular component. In other words, conservation of this quantity is not a fundamental law of chemistry. Instead, it is an emergent conservation law, which holds thanks to the workings of the cell! It holds in situations where the rate constants of reactions catalyzed by the cell’s enzymes are so much larger than those of other reactions that we can ignore those other reactions.

And remember from last time: these are precisely the situations where we have coupling.

Indeed, the emergent conserved quantity E) precisely captures the phenomenon of coupling! The only way for ATP to form ADP + Pi without changing this quantity is for Y to be consumed in the same amount as Pi is created… thus forming the desired product XY.

Next time we’ll look at a more complicated example from biology: the urea cycle.

The paper:

• John Baez, Jonathan Lorand, Blake S. Pollard and Maru Sarazola,
Biochemical coupling through emergent conservation laws.

The blog series:

Part 1 – Introduction.

Part 2 – Review of reaction networks and equilibrium thermodynamics.

Part 3 – What is coupling?

Part 4 – Interactions.

Part 5 – Coupling in quasiequilibrium states.

Part 6 – Emergent conservation laws.

Part 7 – The urea cycle.

Part 8 – The citric acid cycle.

## Coupling Through Emergent Conservation Laws (Part 5)

30 June, 2018

joint post with Jonathan Lorand, Blake Pollard, and Maru Sarazola

Coupling is the way biology makes reactions that ‘want’ to happen push forward desirable reactions that don’t want to happen. Coupling is achieved through the action of enzymes—but in a subtle way. An enzyme can increase the rate constant of a reaction. However, it cannot change the ratio of forward to reverse rate constants, since that is fixed by the difference of free energies, as we saw in Part 2:

$\displaystyle{ \frac{\alpha_\to}{\alpha_\leftarrow} = e^{-\Delta {G^\circ}/RT} } \qquad$

and the presence of an enzyme does not change this.

Indeed, if an enzyme could change this ratio, there would be no need for coupling! For example, increasing the ratio $\alpha_\rightarrow/\alpha_\leftarrow$ in the reaction

$\mathrm{X} + \mathrm{Y} \mathrel{\substack{\alpha_{\rightarrow} \\\longleftrightarrow\\ \alpha_{\leftarrow}}} \mathrm{XY}$

would favor the formation of XY, as desired. But this option is not available.

Instead, to achieve coupling, the cell uses enyzmes to greatly increase both the forward and reverse rate constants for some reactions while leaving those for others unchanged!

Let’s see how this works. In our example, the cell is trying to couple ATP hydrolysis to the formation of the molecule XY from two smaller parts X and Y. These reactions don’t help do that:

$\begin{array}{cclc} \mathrm{X} + \mathrm{Y} & \mathrel{\substack{\alpha_{\rightarrow} \\\longleftrightarrow\\ \alpha_{\leftarrow}}} & \mathrm{XY} & \qquad (1) \\ \\ \mathrm{ATP} & \mathrel{\substack{\beta_{\rightarrow} \\\longleftrightarrow\\ \beta_{\leftarrow}}} & \mathrm{ADP} + \mathrm{P}_{\mathrm{i}} & \qquad (2) \end{array}$

but these do:

$\begin{array}{cclc} \mathrm{X} + \mathrm{ATP} & \mathrel{\substack{\gamma_{\rightarrow} \\\longleftrightarrow\\ \gamma_{\leftarrow}}} & \mathrm{ADP} + \mathrm{XP}_{\mathrm{i}} & (3) \\ \\ \mathrm{XP}_{\mathrm{i}} +\mathrm{Y} & \mathrel{\substack{\delta_{\rightarrow} \\\longleftrightarrow\\ \delta_{\leftarrow}}} & \mathrm{XY} + \mathrm{P}_{\mathrm{i}} & (4) \end{array}$

So, the cell uses enzymes to make the rate constants for reactions (3) and (4) much bigger than for (1) and (2). In this situation we can ignore reactions (1) and (2) and still have a good approximate description of the dynamics, at least for sufficiently short time scales.

Thus, we shall study quasiequilibria, namely steady states of the rate equation for reactions (3) and (4) but not (1) and (2). In this approximation, the relevant Petri net becomes this:

Now it is impossible for ATP to turn into ADP + Pi without X + Y also turning into XY. As we shall see, this is the key to coupling!

In quasiequilibrium states, we shall find a nontrivial relation between the ratios $[\mathrm{XY}]/[\mathrm{X}][\mathrm{Y}]$ and $[\mathrm{ATP}]/[\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}]$. This lets the cell increase the amount of XY that gets made by increasing the amount of ATP present.

Of course, this is just part of the full story. Over longer time scales, reactions (1) and (2) become important. They would drive the system toward a true equilibrium, and destroy coupling, if there were not an inflow of the reactants ATP, X and Y and an outflow of the products Pi and XY. To take these inflows and outflows into account, we need the theory of ‘open’ reaction networks… which is something I’m very interested in!

But this is beyond our scope here. We only consider reactions (3) and (4), which give the following rate equation:

$\begin{array}{ccl} \dot{[\mathrm{X}]} & = & -\gamma_\to [\mathrm{X}][\mathrm{ATP}] + \gamma_\leftarrow [\mathrm{ADP}][\mathrm{XP}_{\mathrm{i}} ] \\ \\ \dot{[\mathrm{Y}]} & = & -\delta_\to [\mathrm{XP}_{\mathrm{i}} ][\mathrm{Y}] +\delta_\leftarrow [\mathrm{XY}][\mathrm{P}_{\mathrm{i}}] \\ \\ \dot{[\mathrm{XY}]} & = &\delta_\to [\mathrm{XP}_{\mathrm{i}} ][\mathrm{Y}] -\delta_\leftarrow [\mathrm{XY}][\mathrm{P}_{\mathrm{i}}] \\ \\ \dot{[\mathrm{ATP}]} & = & -\gamma_\to [\mathrm{X}][\mathrm{ATP}] + \gamma_\leftarrow [\mathrm{ADP}][\mathrm{XP}_{\mathrm{i}} ] \\ \\ \dot{[\mathrm{ADP}]} & =& \gamma_\to [\mathrm{X}][\mathrm{ATP}] - \gamma_\leftarrow [\mathrm{ADP}][\mathrm{XP}_{\mathrm{i}} ] \\ \\ \dot{[\mathrm{P}_{\mathrm{i}}]} & = & \delta_\to [\mathrm{XP}_{\mathrm{i}} ][\mathrm{Y}] -\delta_\leftarrow [\mathrm{XY}][\mathrm{P}_{\mathrm{i}}] \\ \\ \dot{[\mathrm{XP}_{\mathrm{i}} ]} & = & \gamma_\to [\mathrm{X}][\mathrm{ATP}] - \gamma_\leftarrow [\mathrm{ADP}][\mathrm{XP}_{\mathrm{i}} ] \\ \\ && -\delta_\to [\mathrm{XP}_{\mathrm{i}}][\mathrm{Y}] +\delta_\leftarrow [\mathrm{XY}][\mathrm{P}_{\mathrm{i}}] \end{array}$

Quasiequilibria occur when all these time derivatives vanish. This happens when

$\begin{array}{ccl} \gamma_\to [\mathrm{X}][\mathrm{ATP}] & = & \gamma_\leftarrow [\mathrm{ADP}][\mathrm{XP}_{\mathrm{i}} ]\\ \\ \delta_\to [\mathrm{XP}_{\mathrm{i}} ][\mathrm{Y}] & = & \delta_\leftarrow [\mathrm{XY}][\mathrm{P}_{\mathrm{i}}] \end{array}$

This pair of equations is equivalent to

$\displaystyle{ \frac{\gamma_\to}{\gamma_\leftarrow}\frac{[\mathrm{X}][\mathrm{ATP}]}{[\mathrm{ADP}]}=[\mathrm{XP}_{\mathrm{i}} ] =\frac{\delta_\leftarrow}{\delta_\to}\frac{[\mathrm{XY}][\mathrm{P}_{\mathrm{i}}]}{[\mathrm{Y}]} }$

and it implies

$\displaystyle{ \frac{[\mathrm{XY}]}{[\mathrm{X}][\mathrm{Y}]} = \frac{\gamma_\to}{\gamma_\leftarrow}\frac{\delta_\to}{\delta_\leftarrow} \frac{[\mathrm{ATP}]}{[\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}]} }$

If we forget about the species XPi (whose presence is crucial for the coupling to happen, but whose concentration we do not care about), the quasiequilibrium does not impose any conditions other than the above relation. We conclude that, under these circumstances and assuming we can increase the ratio

$\displaystyle{ \frac{[\mathrm{ATP}]}{[\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}]} }$

it is possible to increase the ratio

$\displaystyle{\frac{[\mathrm{XY}]}{[\mathrm{X}][\mathrm{Y}]} }$

This constitutes ‘coupling’.

We can say a bit more, since we can express the ratios of forward and reverse rate constants in terms of exponentials of free energy differences using the laws of thermodynamics, as explained in Part 2. Reactions (1) and (2), taken together, convert X + Y + ATP to XY + ADP + Pi. So do reactions (3) and (4) taken together. Thus, these two pairs of reactions involve the same total change in free energy, so

$\displaystyle{ \frac{\alpha_\to}{\alpha_\leftarrow}\frac{\beta_\to}{\beta_\leftarrow} = \frac{\gamma_\to}{\gamma_\leftarrow}\frac{\delta_\to}{\delta_\leftarrow} }$

But we’re also assuming ATP hydrolysis is so strongly exergonic that

$\displaystyle{ \frac{\beta_\to}{\beta_\leftarrow} \gg \frac{\alpha_\leftarrow}{\alpha_\to} }$

This implies that

$\displaystyle{ \frac{\gamma_\to}{\gamma_\leftarrow}\frac{\delta_\to}{\delta_\leftarrow} \gg 1 }$

Thus,

$\displaystyle{ \frac{[\mathrm{XY}]}{[\mathrm{X}][\mathrm{Y}]} \gg \frac{[\mathrm{ATP}]}{[\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}]} }$

This is why coupling to ATP hydrolysis is so good at driving the synthesis of XY from X and Y! Ultimately, this inequality arises from the fact that the decrease in free energy for the reaction

$\mathrm{ATP} \to \mathrm{ADP} + \mathrm{P}_{\mathrm{i}}$

greatly exceeds the increase in free energy for the reaction

$\mathrm{X} + \mathrm{Y} \to \mathrm{XY}$

But this fact can only be put to use in the right conditions. We need to be in a ‘quasiequilibrium’ state, where fast reactions have reached a steady state but not slow ones. And we need fast reactions to have this property: they can only turn ATP into ADP + Pi if they also turn X + Y into XY. Under these conditions, we have ‘coupling’.

Next time we’ll see how coupling relies on an ’emergent conservation law’.

The paper:

• John Baez, Jonathan Lorand, Blake S. Pollard and Maru Sarazola,
Biochemical coupling through emergent conservation laws.

The blog series:

Part 1 – Introduction.

Part 2 – Review of reaction networks and equilibrium thermodynamics.

Part 3 – What is coupling?

Part 4 – Interactions.

Part 5 – Coupling in quasiequilibrium states.

Part 6 – Emergent conservation laws.

Part 7 – The urea cycle.

Part 8 – The citric acid cycle.

## Coupling Through Emergent Conservation Laws (Part 4)

29 June, 2018

joint post with Jonathan Lorand, Blake Pollard, and Maru Sarazola

We’ve been trying to understand coupling: how a chemical reaction that ‘wants to happen’ because it decreases the amount of free energy can drive forward a chemical reaction that increases free energy.

For coupling to occur, the reactant species in both reactions must interact in some way. Indeed, in real-world examples where ATP hydrolysis is coupled to the formation of larger molecule $\mathrm{XY}$ from parts $\mathrm{X}$ and $\mathrm{Y},$ it is observed that, aside from the reactions we discussed last time:

$\begin{array}{cclc} \mathrm{X} + \mathrm{Y} & \mathrel{\substack{\alpha_{\rightarrow} \\\longleftrightarrow\\ \alpha_{\leftarrow}}} & \mathrm{XY} & \qquad (1) \\ \\ \mathrm{ATP} & \mathrel{\substack{\beta_{\rightarrow} \\\longleftrightarrow\\ \beta_{\leftarrow}}} & \mathrm{ADP} + \mathrm{P}_{\mathrm{i}} & \qquad (2) \end{array}$

two other reactions (and their reverses) take place:

$\begin{array}{cclc} \mathrm{X} + \mathrm{ATP} & \mathrel{\substack{\gamma_{\rightarrow} \\\longleftrightarrow\\ \gamma_{\leftarrow}}} & \mathrm{ADP} + \mathrm{XP}_{\mathrm{i}} & (3) \\ \\ \mathrm{XP}_{\mathrm{i}} +\mathrm{Y} & \mathrel{\substack{\delta_{\rightarrow} \\\longleftrightarrow\\ \delta_{\leftarrow}}} & \mathrm{XY} + \mathrm{P}_{\mathrm{i}} & (4) \end{array}$

We can picture all four reactions (1-4) in a single Petri net as follows:

Taking into account this more complicated set of reactions, which are interacting with each other, is still not enough to explain the phenomenon of coupling. To see this, let’s consider the rate equation for the system comprised of all four reactions. To write it down neatly, let’s introduce reaction velocities that say the rate at which each forward reaction is taking place, minus the rate of the reverse reaction:

$\begin{array}{ccl} J_\alpha &=& \alpha_\to [\mathrm{X}][\mathrm{Y}] - \alpha_\leftarrow [\mathrm{XY}] \\ \\ J_\beta &=& \beta_\to [\mathrm{ATP}] - \beta_\leftarrow [\mathrm{ADP}] [\mathrm{P}_{\mathrm{i}}] \\ \\ J_\gamma &=& \gamma_\to [\mathrm{ATP}] [\mathrm{X}] - \gamma_\leftarrow [\mathrm{ADP}] [\mathrm{XP}_{\mathrm{i}} ] \\ \\ J_\delta &=& \delta_\to [\mathrm{XP}_{\mathrm{i}} ] [\mathrm{Y}] - \delta_\leftarrow [\mathrm{XY}] [\mathrm{P}_{\mathrm{i}}] \end{array}$

All these follow from the law of mass action, which we explained in Part 2. Remember, this says that any reaction occurs at a rate equal to its rate constant times the product of the concentrations of the species involved. So, for example, this reaction

$\mathrm{XP}_{\mathrm{i}} +\mathrm{Y} \mathrel{\substack{\delta_{\rightarrow} \\\longleftrightarrow\\ \delta_{\leftarrow}}} \mathrm{XY} + \mathrm{P}_{\mathrm{i}}$

goes forward at a rate equal to $\delta_\rightarrow [\mathrm{XP}_{\mathrm{i}}][\mathrm{Y}]$, while the reverse reaction occurs at a rate equal to $\delta_\leftarrow [\mathrm{ADP}] [\mathrm{P}_{\mathrm{i}}].$ So, its reaction velocity is

$J_\delta = \delta_\to [\mathrm{XP}_{\mathrm{i}} ] [\mathrm{Y}] - \delta_\leftarrow [\mathrm{XY}] [\mathrm{P}_{\mathrm{i}}]$

In terms of these reaction velocities, we can write the rate equation as follows:

$\begin{array}{ccl} \dot{[\mathrm{X}]} & = & -J_\alpha - J_\gamma \\ \\ \dot{[\mathrm{Y}]} & = & -J_\alpha - J_\delta \\ \\ \dot{[\mathrm{XY}]} & = & J_\alpha + J_\delta \\ \\ \dot{[\mathrm{ATP}]} & = & -J_\beta - J_\gamma \\ \\ \dot{[\mathrm{ADP}]} & = & J_\beta + J_\gamma \\ \\ \dot{[\mathrm{P}_{\mathrm{i}}]} & = & J_\beta + J_\delta \\ \\ \dot{[\mathrm{XP}_{\mathrm{i}} ]} & = & J_\gamma -J_\delta \end{array}$

This makes sense if you think a bit: it says how each reaction contributes to the formation or destruction of each species.

In a steady state, all these time derivatives are zero, so we must have

$J_\alpha = J_\beta = -J_\gamma = - J_\delta$

Furthermore, in a detailed balanced equilibrium, every reaction occurs at the same rate as its reverse reaction, so all four reaction velocities vanish! In thermodynamics, a system that’s truly in equilibrium obeys this sort of detailed balance condition.

When all the reaction velocities vanish, we have:

$\begin{array}{ccl} \displaystyle{ \frac{[\mathrm{XY}]}{[\mathrm{X}][\mathrm{Y}]} } &=& \displaystyle{ \frac{\alpha_\to}{\alpha_\leftarrow} } \\ \\ \displaystyle{ \frac{[\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}]}{[\mathrm{ATP}]} } &=& \displaystyle{ \frac{\beta_\to}{\beta_\leftarrow} } \\ \\ \displaystyle{ \frac{[\mathrm{ADP}] [\mathrm{XP}_{\mathrm{i}} ]}{[\mathrm{ATP}][\mathrm{X}]} } &=& \displaystyle{ \frac{\gamma_\to}{\gamma_\leftarrow} } \\ \\ \displaystyle{ \frac{[\mathrm{XY}][\mathrm{P}_{\mathrm{i}}]}{[\mathrm{XP}_{\mathrm{i}} ][\mathrm{Y}]} } &=& \displaystyle{ \frac{\delta_\to}{\delta_\leftarrow} } \end{array}$

Thus, even when the reactants interact, there can be no coupling if the whole system is in equilibrium, since then the ratio $[\mathrm{XY}]/[\mathrm{X}][\mathrm{Y}]$ is still forced to be $\alpha_\to/\alpha_\leftarrow$. This is obvious to anyone who truly understands what Boltzmann and Gibbs did. But here we saw it in detail.

The moral is that coupling cannot occur in equilibrium. But how, precisely, does coupling occur? Stay tuned!

The paper:

• John Baez, Jonathan Lorand, Blake S. Pollard and Maru Sarazola,
Biochemical coupling through emergent conservation laws.

The blog series:

Part 1 – Introduction.

Part 2 – Review of reaction networks and equilibrium thermodynamics.

Part 3 – What is coupling?

Part 4 – Interactions.

Part 5 – Coupling in quasiequilibrium states.

Part 6 – Emergent conservation laws.

Part 7 – The urea cycle.

Part 8 – The citric acid cycle.

## Coupling Through Emergent Conservation Laws (Part 3)

28 June, 2018

joint post with Jonathan Lorand, Blake Pollard, and Maru Sarazola

Last time we gave a quick intro to the chemistry and thermodynamics we’ll use to understand ‘coupling’. Now let’s really get started!

Suppose that we are in a setting in which some reaction

$\mathrm{X} + \mathrm{Y} \mathrel{\substack{\alpha_{\rightarrow} \\\longleftrightarrow\\ \alpha_{\leftarrow}}} \mathrm{XY}$

takes place. Let’s also assume we are interested in the production of $\mathrm{XY}$ from $\mathrm{X}$ and $\mathrm{Y},$ but that in our system, the reverse reaction is favored to happen. This means that that reverse rate constant exceeds the forward one, let’s say by a lot:

$\alpha_\leftarrow \gg \alpha_\to$

so that in equilibrium, the concentrations of the species will satisfy

$\displaystyle{ \frac{[\mathrm{XY}]}{[\mathrm{X}][\mathrm{Y}]}\ll 1 }$

which we assume undesirable. How can we influence this ratio to get a more desired outcome?

This is where coupling comes into play. Informally, we think of the coupling of two reactions as a process in which an endergonic reaction—one which does not ‘want’ to happen—is combined with an exergonic reaction—one that does ‘want’ to happen—in a way that improves the products-to-reactants concentrations ratio of the first reaction.

An important example of coupling, and one we will focus on, involves ATP hydrolysis:

$\mathrm{ATP} + \mathrm{H}_2\mathrm{O} \mathrel{\substack{\beta_{\rightarrow} \\\longleftrightarrow\\ \beta_{\leftarrow}}} \mathrm{ADP} + \mathrm{P}_{\mathrm{i}} + \mathrm{H}^+$

where ATP (adenosine triphosphate) reacts with a water molecule. Typically, this reaction results in ADP (adenosine diphosphate), a phosphate ion $\mathrm{P}_{\mathrm{i}}$ and a hydrogen ion $\mathrm{H}^+.$ To simplify calculations, we will replace the above equation with

$\mathrm{ATP} \mathrel{\substack{\beta_{\rightarrow} \\\longleftrightarrow\\ \beta_{\leftarrow}}} \mathrm{ADP} + \mathrm{P}_{\mathrm{i}}$

since suppressing the bookkeeping of hydrogen and oxygen atoms in this manner will not affect our main points.

One reason ATP hydrolysis is good for coupling is that this reaction is strongly exergonic:

$\beta_\to \gg \beta_\leftarrow$

and in fact so much that

$\displaystyle{ \frac{\beta_\to}{\beta_\leftarrow} \gg \frac{\alpha_\leftarrow}{\alpha_\to} }$

Yet this fact alone is insufficient to explain coupling!

To see why, suppose our system consists merely of the two reactions

$\begin{array}{ccc} \mathrm{X} + \mathrm{Y} & \mathrel{\substack{\alpha_{\rightarrow} \\\longleftrightarrow\\ \alpha_{\leftarrow}}} & \mathrm{XY} \\ \\ \mathrm{ATP} & \mathrel{\substack{\beta_{\rightarrow} \\\longleftrightarrow\\ \beta_{\leftarrow}}} & \mathrm{ADP} + \mathrm{P}_{\mathrm{i}} \label{beta} \end{array}$

happening in parallel. We can study the concentrations in equilibrium to see that one reaction has no influence on the other. Indeed, the rate equation for this reaction network is

$\begin{array}{ccl} \dot{[\mathrm{X}]} & = & -\alpha_\to [\mathrm{X}][\mathrm{Y}]+\alpha_\leftarrow [\mathrm{XY}]\\ \\ \dot{[\mathrm{Y}]} & = & -\alpha_\to [\mathrm{X}][\mathrm{Y}]+\alpha_\leftarrow [\mathrm{XY}]\\ \\ \dot{[\mathrm{XY}]} & = & \alpha_\to [\mathrm{X}][\mathrm{Y}]-\alpha_\leftarrow [\mathrm{XY}]\\ \\ \dot{[\mathrm{ATP}]} & =& -\beta_\to [\mathrm{ATP}]+\beta_\leftarrow [\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}]\\ \\ \dot{[\mathrm{ADP}]} & = &\beta_\to [\mathrm{ATP}]-\beta_\leftarrow [\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}]\\ \\ \dot{[\mathrm{P}_{\mathrm{i}}]} & = &\beta_\to [\mathrm{ATP}]-\beta_\leftarrow [\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}] \end{array}$

When concentrations are constant, these are equivalent to the relations

$\displaystyle{ \frac{[\mathrm{XY}]}{[\mathrm{X}][\mathrm{Y}]} = \frac{\alpha_\to}{\alpha_\leftarrow} \ \ \text{ and } \ \ \frac{[\mathrm{ADP}][\mathrm{P}_{\mathrm{i}}]}{[\mathrm{ATP}]} = \frac{\beta_\to}{\beta_\leftarrow} }$

We thus see that ATP hydrolysis is in no way affecting the ratio of $[\mathrm{XY}]$ to $[\mathrm{X}][\mathrm{Y}].$ Intuitively, there is no coupling because the two reactions proceed independently. This ‘independence’ is clearly visible if we draw the reaction network as a so-called Petri net:

So what really happens when we are in the presence of coupling? Stay tuned for the next episode!

By the way, here’s what ATP hydrolysis looks like in a bit more detail, from a website at Loreto College:

The paper:

• John Baez, Jonathan Lorand, Blake S. Pollard and Maru Sarazola,
Biochemical coupling through emergent conservation laws.

The blog series:

Part 1 – Introduction.

Part 2 – Review of reaction networks and equilibrium thermodynamics.

Part 3 – What is coupling?

Part 4 – Interactions.

Part 5 – Coupling in quasiequilibrium states.

Part 6 – Emergent conservation laws.

Part 7 – The urea cycle.

Part 8 – The citric acid cycle.

## Coupling Through Emergent Conservation Laws (Part 2)

27 June, 2018

joint post with Jonathan Lorand, Blake Pollard, and Maru Sarazola

Here’s a little introduction to the chemistry and thermodynamics prerequisites for our work on ‘coupling’. Luckily, it’s fun stuff that everyone should know: a lot of the world runs on these principles!

We will be working with reaction networks. A reaction network consists of a set of reactions, for example

$\mathrm{X}+\mathrm{Y}\longrightarrow \mathrm{XY}$

Here X, Y and XY are the species involved, and we interpret this reaction as species X and Y combining to form species XY. We call X and Y the reactants and XY the product. Additive combinations of species, such as X + Y, are called complexes.

The law of mass action states that the rate at which a reaction occurs is proportional to the product of the concentrations of the reactants. The proportionality constant is called the rate constant; it is a positive real number associated to a reaction that depends on chemical properties of the reaction along with the temperature, the pH of the solution, the nature of any catalysts that may be present, and so on. Every reaction has a reverse reaction; that is, if X and Y combine to form XY, then XY can also split into X and Y. The reverse reaction has its own rate constant.

We can summarize this information by writing

$\mathrm{X} + \mathrm{Y} \mathrel{\substack{\alpha_{\rightarrow} \\\longleftrightarrow\\ \alpha_{\leftarrow}}} \mathrm{XY}$

where $\alpha_{\to}$ is the rate constant for X and Y to combine and form XY, while $\alpha_\leftarrow$ is the rate constant for the reverse reaction.

As time passes and reactions occur, the concentration of each species will likely change. We can record this information in a collection of functions

$[\mathrm{X}] \colon \mathbb{R} \to [0,\infty),$

one for each species $X,$ where $\mathrm{X}(t)$ gives the concentration of the species $\mathrm{X}$ at time $t.$ This naturally leads one to consider the rate equation of a given reaction, which specifies the time evolution of these concentrations. The rate equation can be read off from the reaction network, and in the above example it is:

$\begin{array}{ccc} \dot{[\mathrm{X}]} & = & -\alpha_\to [\mathrm{X}][\mathrm{Y}]+\alpha_\leftarrow [\mathrm{XY}]\\ \dot{[\mathrm{Y}]} & = & -\alpha_\to [\mathrm{X}][\mathrm{Y}]+\alpha_\leftarrow [\mathrm{XY}]\\ \dot{[\mathrm{XY}]} & = & \alpha_\to [\mathrm{X}][\mathrm{Y}]-\alpha_\leftarrow [\mathrm{XY}] \end{array}$

Here $\alpha_\to [\mathrm{X}] [\mathrm{Y}]$ is the rate at which the forward reaction is occurring; thanks to the law of mass action, this is the rate constant $\alpha_\to$ times the product of the concentrations of X and Y. Similarly, $\alpha_\leftarrow [\mathrm{XY}]$ is the rate at which the reverse reaction is occurring.

We say that a system is in detailed balanced equilibrium, or simply equilibrium, when every reaction occurs at the same rate as its reverse reaction. This implies that the concentration of each species is constant in time. In our example, the condition for equilibrium is

$\displaystyle{ \frac{\alpha_\to}{\alpha_\leftarrow}=\frac{[\mathrm{XY}]}{[\mathrm{X}][\mathrm{Y}]} }$

and the rate equation then implies that

$\dot{[\mathrm{X}]} = \dot{[\mathrm{Y}]} =\dot{[\mathrm{XY}]} = 0$

The laws of thermodynamics determine the ratio of the forward and reverse rate constants. For any reaction at all, this ratio is

$\displaystyle{ \frac{\alpha_\to}{\alpha_\leftarrow} = e^{-\Delta {G^\circ}/RT} } \qquad \qquad \qquad (1)$

where $T$ is the temperature, $R$ is the ideal gas constant, and $\Delta {G^\circ}$ is the free energy change under standard conditions.

Note that if $\Delta {G^\circ} < 0$, then the rate constant of the forward reaction is larger than the rate constant of the reverse reaction:

$\alpha_\to > \alpha_\leftarrow$

In this case one may loosely say that the forward reaction ‘wants’ to happen ‘spontaneously’. Such a reaction is called exergonic. If on the other hand $\Delta {G^\circ} > 0$, then the forward reaction is ‘non-spontaneous’ and it is called endergonic.

The most important thing for us is that $\Delta {G^\circ}$ takes a very simple form. Each species has a free energy. The free energy of a complex

$\mathrm{A}_1 + \cdots + \mathrm{A}_m$

is the sum of the free energies of the species $\mathrm{A}_i$. Given a reaction

$\mathrm{A}_1 + \cdots + \mathrm{A}_m \longrightarrow \mathrm{B}_1 + \cdots + \mathrm{B}_n$

the free energy change $\Delta {G^\circ}$ for this reaction is the free energy of

$\mathrm{B}_1 + \cdots + \mathrm{B}_n$

minus the free energy of

$\mathrm{A}_1 + \cdots + \mathrm{A}_m.$

As a consequence, $\Delta{G^\circ}$ is additive with respect to combining multiple reactions in either series or parallel. In particular, then, the law (1) imposes relations between ratios of rate constants: for example, if we have the following more complicated set of reactions

$\mathrm{A} \mathrel{\substack{\alpha_{\rightarrow} \\\longleftrightarrow\\ \alpha_{\leftarrow}}} \mathrm{B}$

$\mathrm{B} \mathrel{\substack{\beta_{\rightarrow} \\\longleftrightarrow\\ \beta_{\leftarrow}}} \mathrm{C}$

$\mathrm{A} \mathrel{\substack{\gamma_{\rightarrow} \\\longleftrightarrow\\ \gamma_{\leftarrow}}} \mathrm{C}$

then we must have

$\displaystyle{ \frac{\gamma_\to}{\gamma_\leftarrow} = \frac{\alpha_\to}{\alpha_\leftarrow} \frac{\beta_\to}{\beta_\leftarrow} . }$

So, not only are the rate constant ratios of reactions determined by differences in free energy, but also nontrivial relations between these ratios can arise, depending on the structure of the system of reactions in question!

Okay—this is all the basic stuff we’ll need to know. Please ask questions! Next time we’ll go ahead and use this stuff to start thinking about how biology manages to make reactions that ‘want’ to happen push forward reactions that are useful but wouldn’t happen spontaneously on their own.

The paper:

• John Baez, Jonathan Lorand, Blake S. Pollard and Maru Sarazola,
Biochemical coupling through emergent conservation laws.

The blog series:

Part 1 – Introduction.

Part 2 – Review of reaction networks and equilibrium thermodynamics.

Part 3 – What is coupling?

Part 4 – Interactions.

Part 5 – Coupling in quasiequilibrium states.

Part 6 – Emergent conservation laws.

Part 7 – The urea cycle.

Part 8 – The citric acid cycle.