Network Theory (Part 22)

16 August, 2012

Okay, now let’s dig deeper into the proof of the deficiency zero theorem. We’re only going to prove a baby version, at first. Later we can enhance it:

Deficiency Zero Theorem (Baby Version). Suppose we have a weakly reversible reaction network with deficiency zero. Then for any choice of rate constants there exists an equilibrium solution of the rate equation where all species are present in nonzero amounts.

The first step is to write down the rate equation in a new, more conceptual way. It’s incredibly cool. You’ve probably seen Schrödinger’s equation, which describes the motion of a quantum particle:

\displaystyle { \frac{d \psi}{d t} = -i H \psi }

If you’ve been following this series, you’ve probably also seen the master equation, which describes the motion of a ‘stochastic’ particle:

\displaystyle { \frac{d \psi}{d t} = H \psi }

A ‘stochastic’ particle is one that’s carrying out a random walk, and now \psi describes its probability to be somewhere, instead of its amplitude.

Today we’ll see that the rate equation for a reaction network looks somewhat similar:

\displaystyle { \frac{d x}{d t} =  Y H x^Y }

where Y is some matrix, and x^Y is defined using a new thing called ‘matrix exponentiation’, which makes the equation nonlinear!

If you’re reading this you probably know how to multiply a vector by a matrix. But if you’re like me, you’ve never seen anyone take a vector and raise it to the power of some matrix! I’ll explain it, don’t worry… right now I’m just trying to get you intrigued. It’s not complicated, but it’s exciting how this unusual operation shows up naturally in chemistry. That’s just what I’m looking for these days: new math ideas that show up in practical subjects like chemistry, and new ways that math can help people in these subjects.

Since we’re looking for an equilibrium solution of the rate equation, we actually want to solve

\displaystyle { \frac{d x}{d t} =  0 }

or in other words

Y H x^Y = 0

In fact we will do better: we will find a solution of

H x^Y = 0

And we’ll do this in two stages:

• First we’ll find all solutions of

H \psi = 0

This equation is linear, so it’s easy to understand.

• Then, among these solutions \psi, we’ll find one that also obeys

\psi = x^Y

This is a nonlinear problem involving matrix exponentiation, but still, we can do it, using a clever trick called ‘logarithms’.

Putting the pieces together, we get our solution of

H x^Y = 0

and thus our equilibrium solution of the rate equation:

\displaystyle { \frac{d x}{d t} = Y H x^Y = 0 }

That’s a rough outline of the plan. But now let’s get started, because the details are actually fascinating. Today I’ll just show you how to rewrite the rate equation in this new way.

The rate equation

Remember how the rate equation goes. We start with a stochastic reaction network, meaning a little diagram like this:

This contains quite a bit of information:

• a finite set T of transitions,

• a finite set K of complexes,

• a finite set S of species,

• a map r: T \to (0,\infty) giving a rate constant for each transition,

• source and target maps s,t : T \to K saying where each transition starts and ends,

• a one-to-one map Y : K \to \mathbb{N}^S saying how each complex is made of species.

Given all this, the rate equation says how the amount of each species changes with time. We describe these amounts with a vector x \in [0,\infty)^S. So, we want a differential equation filling in the question marks here:

\displaystyle{ \frac{d x}{d t} = ??? }

Now last time, we started by thinking of K as a subset of \mathbb{N}^S, and thus of the vector space \mathbb{R}^S. Back then, we wrote the rate equation as follows:

\displaystyle{ \frac{d x}{d t} = \sum_{\tau \in T} r(\tau) \; \left(t(\tau) - s(\tau)\right) \; x^{s(\tau)} }

where vector exponentiation is defined by

x^s = x_1^{s_1} \cdots x_k^{s_k}

when x and s are vectors in \mathbb{R}^S.

However, we’ve now switched to thinking of our set of complexes K as a set in its own right that is mapped into \mathbb{N}^S by Y. This is good for lots of reasons, like defining the concept of ‘deficiency’, which we did last time. But it means the rate equation above doesn’t quite parse anymore! Things like s(\tau) and t(\tau) live in K; we need to explicitly convert them into elements of \mathbb{R}^S using Y for our equation to make sense!

So now we have to write the rate equation like this:

\displaystyle{ \frac{d x}{d t} = Y \sum_{\tau \in T} r(\tau) \;  \left(t(\tau) - s(\tau)\right) \; x^{Y s(\tau)} }

This looks more ugly, but if you’ve got even one mathematical bone in your body, you can already see vague glimmerings of how we’ll rewrite this the way we want:

\displaystyle { \frac{d x}{d t} =  Y H x^Y }

Here’s how.

First, we extend our maps s, t and Y to linear maps between vector spaces:

s, t : \mathbb{R}^T \to \mathbb{R}^K, \qquad Y : \mathbb{R}^K \to \mathbb{R}^S

Then, we put an inner product on the vector spaces \mathbb{R}^T, \mathbb{R}^K and \mathbb{R}^S. For \mathbb{R}^K we do this in the most obvious way, by letting the complexes be an orthonormal basis. So, given two complexes \kappa, \kappa', we define their inner product by

\langle \kappa, \kappa' \rangle = \delta_{\kappa, \kappa'}

We do the same for \mathbb{R}^S. But for \mathbb{R}^T we define the inner product in a more clever way involving the rate constants. If \tau, \tau' \in T are two transitions, we define their inner product by:

\langle \tau, \tau' \rangle = \frac{1}{r(\tau)} \delta_{\tau, \tau'}

This will seem perfectly natural when we continue our study of circuits made of electrical resistors, and if you’re very clever you can already see it lurking in Part 16. But never mind.

Having put inner products on these three vector spaces, we can take the adjoints of the linear maps between them, to get linear maps going back the other way:

s^\dagger, t^\dagger : \mathbb{R}^K \to \mathbb{R}^T, \qquad Y^\dagger : \mathbb{R}^S \to \mathbb{R}^K

These are defined in the usual way—though we’re using daggers here the way physicists do, where mathematicians would prefer to see stars! For example, s^\dagger : \mathbb{R}^K \to \mathbb{R}^T is defined by the relation

\langle s^\dagger \phi, \psi \rangle = \langle \phi, s \psi \rangle

and so on.

Next, we set up a random walk on the set of complexes. Remember, our reaction network is a graph with complexes as vertices and transitions as edges, like this:

Each transition \tau has a number attached to it: the rate constant r(\tau). So, we can randomly hop from complex to complex along these transitions, with probabilities per unit time described by these numbers. The probability of being at some particular complex will then be described by a function

\psi : K \to \mathbb{R}

which also depends on time, and changes according to the equation

\displaystyle { \frac{d \psi}{d t} = H \psi }

for some Hamiltonian

H : \mathbb{R}^K \to \mathbb{R}^K

I defined this Hamiltonian back in Part 15, but now I see a slicker way to write it:

H = (t - s) s^\dagger

I’ll justify this next time. For now, the main point is that with this Hamiltonian, the rate equation is equivalent to this:

\displaystyle{ \frac{d x}{d t} = Y H x^Y }

The only thing I haven’t defined yet is the funny exponential x^Y. That’s what makes the equation nonlinear. We’re taking a vector to the power of a matrix and getting a vector. This sounds weird—but it actually makes sense!

It only makes sense because we have chosen bases for our vector spaces. To understand it, let’s number our species 1, \dots, k as we’ve been doing all along, and number our complexes 1, \dots, \ell. Our linear map Y : \mathbb{R}^K \to \mathbb{R}^S then becomes a k \times \ell matrix of natural numbers. Its entries say how many times each species shows up in each complex:

Y = \left( \begin{array}{cccc}   Y_{11} & Y_{12}  & \cdots & Y_{1 \ell} \\  Y_{21} & Y_{22}  & \cdots & Y_{2 \ell} \\  \vdots  & \vdots   & \ddots & \vdots \\  Y_{k1} & Y_{k2}  & \cdots & Y_{k \ell} \end{array} \right)

The entry Y_{i j} says how many times the ith species shows up in the jth complex.

Now, let’s be a bit daring and think of the vector x \in \mathbb{R}^S as a row vector with k entries:

x = \left(x_1 , x_2 ,  \dots ,  x_k \right)

Then we can multiply x on the right by the matrix Y and get a vector in \mathbb{R}^K:

\begin{array}{ccl} x Y &=& ( x_1 , x_2, \dots, x_k) \;   \left( \begin{array}{cccc}   Y_{11} & Y_{12}  & \cdots & Y_{1 \ell} \\  Y_{21} & Y_{22}  & \cdots & Y_{2 \ell} \\  \vdots  & \vdots   & \ddots & \vdots \\  Y_{k1} & Y_{k2}  & \cdots & Y_{k \ell} \end{array} \right)   \\  \\  &=& \left( x_1 Y_{11} + \cdots + x_k Y_{k1}, \; \dots, \; x_1 Y_{1 \ell} + \cdots + x_k Y_{k \ell} \right)   \end{array}

So far, no big deal. But now you’re ready to see the definition of x^Y, which is very similar:

x^Y = \left( x_1^{Y_{11}} x_2^{Y_{21}} \cdots x_k^{Y_{k1}}, \; \dots, \; x_1^{Y_{1 \ell}} x_2^{Y_{2 \ell}} \cdots x_k^{Y_{k \ell}} \right)

It’s exactly the same, but with multiplication replacing addition, and exponentiation replacing multiplication! Apparently my class on matrices stopped too soon: we learned about matrix multiplication, but matrix exponentiation is also worthwhile.
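If you like to see things in code, here is a minimal sketch of this operation in Python with NumPy; the matrix and vector below are made up purely for illustration:

import numpy as np

def matrix_power(x, Y):
    # The jth entry of x^Y is the product over i of x_i raised to the power Y[i, j].
    x = np.asarray(x, dtype=float)
    Y = np.asarray(Y, dtype=float)
    return np.prod(x[:, None] ** Y, axis=0)

# A toy example with k = 2 species and l = 3 complexes:
Y = np.array([[1, 0, 2],
              [0, 1, 1]])
x = np.array([3.0, 5.0])
print(matrix_power(x, Y))    # [ 3.  5. 45.], that is, (3^1 5^0, 3^0 5^1, 3^2 5^1)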

What’s the point of it? Well, suppose you have a certain number of hydrogen molecules, a certain number of oxygen molecules, a certain number of water molecules, and so on—a certain number of things of each species. You can list these numbers and get a vector x \in \mathbb{R}^S. Then the components of x^Y describe how many ways you can build up each complex from the things you have. For example,

x_1^{Y_{11}} x_2^{Y_{21}} \cdots  x_k^{Y_{k1}}

says roughly how many ways you can build complex 1 by picking Y_{11} things of species 1, Y_{21} things of species 2, and so on.

Why ‘roughly’? Well, we’re pretending we can pick the same thing twice. So if we have 4 water molecules and we need to pick 3, this formula gives 4^3. The right answer is 4 \times 3 \times 2. To get this answer we’d need to use the ‘falling power’ 4^{\underline{3}} = 4 \times 3 \times 2, as explained in Part 4. But the rate equation describes chemistry in the limit where we have lots of things of each species. In this limit, the ordinary power becomes a good approximation.
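If you want to see how quickly the ordinary power approaches the falling power, here is a tiny check in Python; the specific numbers are arbitrary:

from math import prod

def falling_power(x, k):
    # x (x - 1) (x - 2) ... (x - k + 1)
    return prod(x - i for i in range(k))

for x in (4, 100, 10_000):
    print(x, falling_power(x, 3) / x**3)    # 0.375, then 0.9702, then 0.9997...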

Puzzle. In this post we’ve seen a vector raised to a matrix power, which is a vector, and also a vector raised to a vector power, which is a number. How are they related?

There’s more to say about this, which I’d be glad to explain if you’re interested. But let’s get to the punchline:

Theorem. The rate equation:

\displaystyle{ \frac{d x}{d t} = Y \sum_{\tau \in T} r(\tau) \; \left(t(\tau) - s(\tau)\right) \; x^{Y s(\tau)} }

is equivalent to this equation:

\displaystyle{ \frac{d x}{d t} = Y (t - s) s^\dagger x^Y }

or in other words:

\displaystyle{ \frac{d x}{d t} = Y H x^Y }

Proof. It’s enough to show

\displaystyle{ (t - s) s^\dagger x^Y = \sum_{\tau \in T} r(\tau) \; \left(t(\tau) - s(\tau)\right) \; x^{Y s(\tau)} }

So, we’ll compute (t - s) s^\dagger x^Y, and think about the meaning of each quantity we get en route.

We start with x \in \mathbb{R}^S. This is a list of numbers saying how many things of each species we have: our raw ingredients, as it were. Then we compute

x^Y = (x_1^{Y_{11}} \cdots  x_k^{Y_{k1}} ,  \dots,  x_1^{Y_{1 \ell}} \cdots x_k^{Y_{k \ell}} )

This is a vector in \mathbb{R}^K. It’s a list of numbers saying how many ways we can build each complex starting from our raw ingredients.

Alternatively, we can write this vector x^Y as a sum over basis vectors:

x^Y = \sum_{\kappa \in K} x_1^{Y_{1\kappa}} \cdots  x_k^{Y_{k\kappa}} \; \kappa

Next let’s apply s^\dagger to this. We claim that

\displaystyle{ s^\dagger \kappa = \sum_{\tau \; : \; s(\tau) = \kappa} r(\tau) \; \tau }

In other words, we claim s^\dagger \kappa is the sum of all the transitions having \kappa as their source, weighted by their rate constants! To prove this claim, it’s enough to take the inner product of each side with any transition \tau', and check that we get the same answer. For the left side we get

\langle s^\dagger \kappa, \tau' \rangle = \langle \kappa, s(\tau') \rangle = \delta_{\kappa, s (\tau') }

To compute the right side, we need to use the cleverly chosen inner product on \mathbb{R}^T. Here we get

\displaystyle{ \left\langle \sum_{\tau \; : \; s(\tau) = \kappa} r(\tau) \tau, \; \tau' \right\rangle =  \sum_{\tau \; : \; s(\tau) = \kappa} \delta_{\tau, \tau'} = \delta_{\kappa, s(\tau')}  }

In the first step here, the factor of 1 /r(\tau) in the cleverly chosen inner product canceled the visible factor of r(\tau). For the second step, you just need to think for half a minute—or ten, depending on how much coffee you’ve had.

Either way, we conclude that indeed

\displaystyle{ s^\dagger \kappa = \sum_{\tau \; : \; s(\tau) = \kappa} r(\tau) \tau }

Next let’s combine this with our formula for x^Y:

\displaystyle { x^Y = \sum_{\kappa \in K} x_1^{Y_{1\kappa}} \cdots  x_k^{Y_{k\kappa}} \; \kappa }

We get this:

\displaystyle { s^\dagger x^Y = \sum_{\kappa, \tau \; : \; s(\tau) = \kappa} r(\tau) \; x_1^{Y_{1\kappa}} \cdots  x_k^{Y_{k\kappa}} \; \tau }

In other words, s^\dagger x^Y is a linear combination of transitions, where each one is weighted both by the rate it happens and how many ways it can happen starting with our raw ingredients.

Our goal is to compute (t - s)s^\dagger x^Y. We’re almost there. Remember, s says which complex is the input of a given transition, and t says which complex is the output. So, (t - s) s^\dagger x^Y says the total rate at which complexes are created and/or destroyed starting with the species in x as our raw ingredients.

That sounds good. But let’s just pedantically check that everything works. Applying t - s to both sides of our last equation, we get

\displaystyle{ (t - s) s^\dagger x^Y = \sum_{\kappa, \tau \; : \; s(\tau) = \kappa} r(\tau) \; x_1^{Y_{1\kappa}} \cdots  x_k^{Y_{k\kappa}} \; \left( t(\tau) - s(\tau)\right) }

Remember, our goal was to prove that this equals

\displaystyle{ \sum_{\tau \in T} r(\tau) \; \left(t(\tau) - s(\tau)\right) \; x^{Y s(\tau)} }

But if you stare at these a while and think, you’ll see they’re equal.   █
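By the way, if you’d like to check this identity numerically, here is a small Python sketch. It encodes a made-up reaction network of the sort we’ve been discussing (the species, complexes, transitions, rate constants and concentrations below are all my own arbitrary choices), builds s, t, s^\dagger and H as matrices, and confirms that (t - s) s^\dagger x^Y agrees with the sum over transitions:

import numpy as np

# Species A, B, C, D, E (rows of Y) and complexes A, B, D, A+C, B+E (columns of Y):
Y = np.array([
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1],
], dtype=float)

# Transitions as (source complex, target complex), with made-up rate constants:
transitions = [(0, 1), (1, 0), (3, 2), (4, 2), (3, 4)]
r = np.array([1.0, 2.0, 0.5, 0.7, 1.3])

nK, nT = Y.shape[1], len(transitions)
s = np.zeros((nK, nT))
t = np.zeros((nK, nT))
for j, (src, tgt) in enumerate(transitions):
    s[src, j] = 1.0
    t[tgt, j] = 1.0

s_dagger = np.diag(r) @ s.T    # adjoint of s for the inner product with weights 1/r(tau)
H = (t - s) @ s_dagger         # the Hamiltonian, a square matrix acting on R^K

x = np.array([0.3, 1.1, 0.8, 0.5, 2.0])    # made-up concentrations of the species
xY = np.prod(x[:, None] ** Y, axis=0)      # the vector x^Y in R^K

lhs = H @ xY
rhs = sum(r[j] * (t[:, j] - s[:, j]) * xY[transitions[j][0]] for j in range(nT))
print(np.allclose(lhs, rhs))               # True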

It took me a couple of weeks to really understand this, so I’ll be happy if it takes you just a few days. It seems peculiar at first but ultimately it all makes sense. The interesting subtlety is that we use the linear map called ‘multiplying by Y’:

\begin{array}{ccc} \mathbb{R}^K &\to& \mathbb{R}^S \\                               \psi     &\mapsto&   Y \psi \end{array}

to take a bunch of complexes and work out the species they contain, while we use the nonlinear map called ‘raising to the Yth power’:

\begin{array}{ccc} \mathbb{R}^S &\to& \mathbb{R}^K \\                               x     &\mapsto&   x^Y \end{array}

to take a bunch of species and work out how many ways we can build each complex from them. There is much more to say about this: for example, these maps arise from a pair of what category theorists call ‘adjoint functors’. But I’m worn out and you probably are too, if you’re still here at all.

References

I found this thesis to be the most helpful reference when I was trying to understand the proof of the deficiency zero theorem:

• Jonathan M. Guberman, Mass Action Reaction Networks and the Deficiency Zero Theorem, B.A. thesis, Department of Mathematics, Harvard University, 2003.

I urge you to check it out. In particular, Section 3 and Appendix A discuss matrix exponentiation. Has anyone discussed this before?

Here’s another good modern treatment of the deficiency zero theorem:

• Jeremy Gunawardena, Chemical reaction network theory for in silico biologists, 2003.

The theorem was first proved here:

• Martin Feinberg, Chemical reaction network structure and the stability of complex isothermal reactors: I. The deficiency zero and deficiency one theorems, Chemical Engineering Science 42 (1987), 2229-2268.

However, Feinberg’s treatment here relies heavily on this paper:

• F. Horn and R. Jackson, General mass action kinetics, Archive for Rational Mechanics and Analysis 47 (1972), 81-116.

(Does anyone work on ‘irrational mechanics’?) These lectures are also very helpful:

• Martin Feinberg, Lectures on reaction networks, 1979.

If you’ve seen other proofs, let us know.


Network Theory (Part 21)

14 August, 2012

Recently we’ve been talking about ‘reaction networks’, like this:

The letters are names of chemicals, and the arrows are chemical reactions. If we know how fast each reaction goes, we can write down a ‘rate equation’ describing how the amount of each chemical changes with time.

In Part 17 we met the deficiency zero theorem. This is a powerful tool for finding equilibrium solutions of the rate equation: in other words, solutions where the amounts of the various chemicals don’t change at all. To apply this theorem, two conditions must hold. Both are quite easy to check:

• Your reaction network needs to be ‘weakly reversible’: if you have a reaction that takes one bunch of chemicals to another, there’s a series of reactions that takes that other bunch back to the one you started with.

• A number called the ‘deficiency’ that you can compute from your reaction network needs to be zero.

The first condition makes a lot of sense, intuitively: you won’t get an equilibrium with all the different chemicals present if some chemicals can turn into others but not the other way around. But the second condition, and indeed the concept of ‘deficiency’, seems mysterious.

Luckily, when you work through the proof of the deficiency zero theorem, the mystery evaporates. It turns out that there are two equivalent ways to define the deficiency of a reaction network. One makes it easy to compute, and that’s the definition people usually give. But the other explains why it’s important.

In fact the whole proof of the deficiency zero theorem is full of great insights, so I want to show it to you. This will be the most intense part of the network theory series so far, but also the climax of its first phase. After a little wrapping up, we will then turn to another kind of network: electrical circuits!

Today I’ll just unfold the concept of ‘deficiency’ so we see what it means. Next time I’ll show you a slick way to write the rate equation, which is crucial to proving the deficiency zero theorem. Then we’ll start the proof.

Reaction networks

Let’s recall what a reaction network is, and set up our notation. In chemistry we consider a finite set of ‘species’ like C, O_2, H_2O and so on… and then consider reactions like

CH_4 + 2O_2 \longrightarrow CO_2 + 2H_2O

On each side of this reaction we have a finite linear combination of species, with natural numbers as coefficients. Chemists call such a thing a complex.

So, given any finite collection of species, say S, let’s write \mathbb{N}^S to mean the set of finite linear combinations of elements of S, with natural numbers as coefficients. The complexes appearing in our reactions will form a subset of this, say K.

We’ll also consider a finite collection of reactions—or in the language of Petri nets, ‘transitions’. Let’s call this T. Each transition goes from some complex to some other complex: if we want a reaction to be reversible we’ll explicitly include another reaction going the other way. So, given a transition \tau \in T it will always go from some complex called its source s(\tau) to some complex called its target t(\tau).

All this data, put together, is a reaction network:

Definition. A reaction network (S,\; s,t : T \to K) consists of:

• a finite set S of species,

• a finite set T of transitions,

• a finite set K \subset \mathbb{N}^S of complexes,

• source and target maps s,t : T \to K.

We can draw a reaction network as a graph with complexes as vertices and transitions as edges:

The set of species here is S = \{A,B,C,D,E\}, and the set of complexes is K = \{A,B,D,A+C,B+E\}.

But to write down the ‘rate equation’ describing our chemical reactions, we need a bit more information: constants r(\tau) saying the rate at which each transition \tau occurs. So, we define:

Definition. A stochastic reaction network is a reaction network together with a map r: T \to (0,\infty) assigning a rate constant to each transition.

Let me remind you how the rate equation works. At any time we have some amount x_i \in [0,\infty) of each species i. These numbers are the components of a vector x \in \mathbb{R}^S, which of course depends on time. The rate equation says how this vector changes:

\displaystyle{ \frac{d x_i}{d t} = \sum_{\tau \in T} r(\tau) \left(t_i(\tau) - s_i(\tau)\right)  x^{s(\tau)} }

Here I’m writing s_i(\tau) for the ith component of the vector s(\tau), and similarly for t_i(\tau). I should remind you what x^{s(\tau)} means, since here we are raising a vector to a vector power, which is a bit unusual. So, suppose we have any vector

x = (x_1, \dots, x_k)

and we raise it to the power of

s = (s_1, \dots, s_k)

Then by definition we get

\displaystyle{ x^s = x_1^{s_1} \cdots x_k^{s_k} }

Given this, I hope the rate equation makes intuitive sense! There’s one term for each transition \tau. The factor of t_i(\tau) - s_i(\tau) shows up because our transition destroys s_i(\tau) things of the ith species and creates t_i(\tau) of them. The big product

\displaystyle{ x^{s(\tau)} = x_1^{s_1(\tau)} \cdots x_k^{s_k(\tau)} }

shows up because our transition occurs at a rate proportional to the product of the numbers of things it takes as inputs. The constant of proportionality is the reaction rate r(\tau).
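If you like to see formulas as code, here is a tiny Python sketch of the right-hand side of the rate equation; the little example network and rate constants at the end are invented just to show the shape of the calculation:

import numpy as np

def rate_equation_rhs(x, transitions, rates):
    # dx/dt = sum over transitions tau of r(tau) (t(tau) - s(tau)) x^s(tau),
    # where each transition is a pair (source, target) of vectors in N^S.
    x = np.asarray(x, dtype=float)
    dxdt = np.zeros_like(x)
    for (src, tgt), rate in zip(transitions, rates):
        src, tgt = np.asarray(src, float), np.asarray(tgt, float)
        dxdt += rate * (tgt - src) * np.prod(x ** src)
    return dxdt

# Example with species (A, B): the transitions B -> A + A and A + A -> B.
transitions = [((0, 1), (2, 0)),
               ((2, 0), (0, 1))]
rates = [1.0, 2.0]                      # made-up rate constants
print(rate_equation_rhs([1.0, 0.5], transitions, rates))    # [-3.   1.5]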

The deficiency zero theorem says lots of things, but in the next few episodes we’ll prove a weak version, like this:

Deficiency Zero Theorem (Baby Version). Suppose we have a weakly reversible reaction network with deficiency zero. Then for any choice of rate constants there exists an equilibrium solution x \in (0,\infty)^S of the rate equation. In other words:

\displaystyle{ \sum_{\tau \in T} r(\tau) \left(t_i(\tau) - s_i(\tau)\right)  x^{s(\tau)} = 0}

An important feature of this result is that all the components of the vector x are positive. In other words, there’s actually some chemical of each species present!

But what do the hypotheses in this theorem mean?

A reaction network is weakly reversible if for any transition \tau \in T going from a complex \kappa to a complex \kappa', there is a sequence of transitions going from \kappa' back to \kappa. But what about ‘deficiency zero’? As I mentioned, this requires more work to understand.

So, let’s dive in!

Deficiency

In modern math, we like to take all the stuff we’re thinking about and compress it into a diagram with a few objects and maps going between these objects. So, to understand the deficiency zero theorem, I wanted to isolate the crucial maps. For starters, there’s an obvious map

Y : K \to \mathbb{N}^S

sending each complex to the linear combination of species that it is.

Indeed, we can change viewpoints a bit and think of K as an abstract set equipped with this map saying how each complex is made of species. From now on this is what we’ll do. Then all the information in a stochastic reaction network sits in this diagram:

This is fundamental to everything we’ll do for the next few episodes, so take a minute to lock it into your brain.

We’ll do lots of different things with this diagram. For example, we often want to use ideas from linear algebra, and then we want our maps to be linear. For example, Y extends uniquely to a linear map

Y : \mathbb{R}^K \to \mathbb{R}^S

sending real linear combinations of complexes to real linear combinations of species. Reusing the name Y here won’t cause confusion. We can also extend r, s, and t to linear maps in a unique way, getting a little diagram like this:

Linear algebra lets us talk about differences of complexes. Each transition \tau gives a vector

\partial(\tau) = t(\tau) - s(\tau) \in \mathbb{R}^K

saying the change in complexes that it causes. And we can extend \partial uniquely to a linear map

\partial : \mathbb{R}^T \to \mathbb{R}^K

defined on linear combinations of transitions. Mathematicians would call \partial a boundary operator.

So, we have a little sequence of linear maps

\mathbb{R}^T \stackrel{\partial}{\to} \mathbb{R}^K \stackrel{Y}{\to} \mathbb{R}^S

This turns a transition into a change in complexes, and then a change in species.

If you know fancy math you’ll be wondering if this sequence is a ‘chain complex’, which is a fancy way of saying that Y \partial = 0. The answer is no. This equation means that every linear combination of reactions leaves the amount of all species unchanged. Or equivalently: every reaction leaves the amount of all species unchanged. This only happens in very silly examples.

Nonetheless, it’s possible for a linear combination of reactions to leave the amount of all species unchanged.

For example, this will happen if we have a linear combination of reactions that leaves the amount of all complexes unchanged. But this sufficient condition is not necessary. And this leads us to the concept of ‘deficiency zero’:

Definition. A reaction network has deficiency zero if any linear combination of reactions that leaves the amount of every species unchanged also leaves the amount of every complex unchanged.

In short, a reaction network has deficiency zero iff

Y (\partial \rho) = 0 \implies \partial \rho = 0

for every \rho \in \mathbb{R}^T. In other words—using some basic concepts from linear algebra—a reaction network has deficiency zero iff Y is one-to-one when restricted to the image of \partial. Remember, the image of \partial is

\mathrm{im} \partial = \{ \partial \rho \; : \; \rho \in \mathbb{R}^T \}

Roughly speaking, this consists of all changes in complexes that can occur due to reactions.

In still other words, a reaction network has deficiency zero if 0 is the only vector in both the image of \partial and the kernel of Y:

\mathrm{im} \partial \cap \mathrm{ker} Y = \{ 0 \}

Remember, the kernel of Y is

\mathrm{ker} Y = \{ \phi \in \mathbb{R}^K : Y \phi = 0 \}

Roughly speaking, this consists of all changes in complexes that don’t cause changes in species. So, ‘deficiency zero’ roughly says that if a reaction causes a change in complexes, it causes a change in species.

(All this ‘roughly speaking’ stuff is because in reality I should be talking about linear combinations of transitions, complexes and species. But it’s a bit distracting to keep saying that when I’m trying to get across the basic ideas!)

Now we’re ready to understand deficiencies other than zero, at least a little. They’re defined like this:

Definition. The deficiency of a reaction network is the dimension of \mathrm{im} \partial \cap \mathrm{ker} Y.

How to compute the deficiency

You can compute the deficiency of a reaction network just by looking at it. However, it takes a little training. First, remember that a reaction network can be drawn as a graph with complexes as vertices and transitions as edges, like this:

There are three important numbers we can extract from this graph:

• We can count the number of vertices in this graph; let’s call that |K|, since it’s just the number of complexes.

• We can count the number of pieces or ‘components’ of this graph; let’s call that \# \mathrm{components} for now.

• We can also count the dimension of the image of Y \partial. This space, \mathrm{im} Y \partial, is called the stoichiometric subspace: vectors in here are changes in species that can be accomplished by transitions in our reaction network, or linear combinations of transitions.

These three numbers, all rather easy to compute, let us calculate the deficiency:

Theorem. The deficiency of a reaction network equals

|K| - \# \mathrm{components} - \mathrm{dim} \left( \mathrm{im} Y \partial \right)

Proof. By definition we have

\mathrm{deficiency} = \dim \left( \mathrm{im} \partial \cap \mathrm{ker} Y \right)

but another way to say this is

\mathrm{deficiency} = \mathrm{dim} (\mathrm{ker} Y|_{\mathrm{im} \partial})

where we are restricting Y to the subspace \mathrm{im} \partial, and taking the dimension of the kernel of that.

The rank-nullity theorem says that whenever you have a linear map T: V \to W between finite-dimensional vector spaces, then

\mathrm{dim} \left(\mathrm{ker} T \right) \; = \; \mathrm{dim}\left(\mathrm{dom} T\right) \; - \; \mathrm{dim} \left(\mathrm{im} T\right)

where \mathrm{dom} T is the domain of T, namely the vector space V. It follows that

\mathrm{dim}(\mathrm{ker} Y|_{\mathrm{im} \partial}) = \mathrm{dim}(\mathrm{dom}  Y|_{\mathrm{im} \partial}) - \mathrm{dim}(\mathrm{im}  Y|_{\mathrm{im} \partial})

The domain of Y|_{\mathrm{im} \partial} is just \mathrm{im} \partial, while its image equals \mathrm{im} Y \partial, so

\mathrm{deficiency} = \mathrm{dim}(\mathrm{im} \partial) - \mathrm{dim}(\mathrm{im} Y \partial)

The theorem then follows from this:

Lemma. \mathrm{dim} (\mathrm{im} \partial) = |K| - \# \mathrm{components}

Proof. In fact this holds whenever we have a finite set of complexes and a finite set of transitions going between them. We get a diagram

We can extend the source and target functions to linear maps as usual:

and then we can define \partial = t - s. We claim that

\mathrm{dim} (\mathrm{im} \partial) = |K| - \# \mathrm{components}

where \# \mathrm{components} is the number of connected components of the graph with K as vertices and T as edges.

This is easiest to see using an inductive argument where we start by throwing out all the edges of our graph and then put them back in one at a time. When our graph has no edges, \partial = 0 and the number of components is |K|, so our claim holds: both sides of the equation above are zero. Then, each time we put in an edge, there are two choices: either it goes between two different components of the graph we have built so far, or it doesn’t. If it goes between two different components, it increases \mathrm{dim} (\mathrm{im} \partial) by 1 and decreases the number of components by 1, so our equation continues to hold. If it doesn’t, neither of these quantities change, so our equation again continues to hold.   █

Examples

This reaction network is not weakly reversible, since we can get from B + E and A + C to D but not back. It becomes weakly reversible if we throw in another transition:

Taking a reaction network and throwing in the reverse of an existing transition never changes the number of complexes, the number of components, or the dimension of the stoichiometric subspace. So, the deficiency of the reaction network remains unchanged. We computed it back in Part 17. For either reaction network above:

• the number of complexes is 5:

|K| = |\{A,B,D, A+C, B+E\}| = 5

• the number of components is 2:

\# \mathrm{components} = 2

• the dimension of the stoichiometric subspace is 3. For this we go through each transition, see what change in species it causes, and take the vector space spanned by these changes. Then we find a basis for this space and count it:

\mathrm{im} Y \partial =

\langle B - A, A - B, D - A - C, D - (B+E) ,  (B+E)-(A+C) \rangle

= \langle B -A, D - A - C, D - A - E \rangle

so

\mathrm{dim} \left(\mathrm{im} Y \partial \right) = 3

As a result, the deficiency is zero:

\begin{array}{ccl} \mathrm{deficiency}   &=& |K| - \# \mathrm{components} - \mathrm{dim} \left( \mathrm{im} Y \partial \right) \\  &=& 5 - 2 - 3 \\  &=&  0 \end{array}
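If you’d rather let a computer do this bookkeeping, here is a short Python sketch that reproduces the computation above. Since the picture of the network isn’t reproduced here, the transitions below are the ones I read off from the spanning vectors listed above, so treat this encoding as my best guess at the example. It uses the fact from the proof that \mathrm{dim}(\mathrm{im} \partial) = |K| - \# \mathrm{components}, so we never have to count components separately:

import numpy as np

# Species A, B, C, D, E (rows of Y) and complexes A, B, D, A+C, B+E (columns of Y):
Y = np.array([
    [1, 0, 0, 1, 0],   # A
    [0, 1, 0, 0, 1],   # B
    [0, 0, 0, 1, 0],   # C
    [0, 0, 1, 0, 0],   # D
    [0, 0, 0, 0, 1],   # E
], dtype=float)

# Transitions as (source complex, target complex):
# A -> B, B -> A, A+C -> D, B+E -> D, A+C -> B+E
transitions = [(0, 1), (1, 0), (3, 2), (4, 2), (3, 4)]

partial = np.zeros((Y.shape[1], len(transitions)))
for j, (src, tgt) in enumerate(transitions):
    partial[src, j] -= 1.0
    partial[tgt, j] += 1.0

dim_im_partial = np.linalg.matrix_rank(partial)    # = |K| - #components = 5 - 2 = 3
dim_stoich = np.linalg.matrix_rank(Y @ partial)    # dimension of the stoichiometric subspace = 3
print(dim_im_partial - dim_stoich)                 # deficiency = 0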

Now let’s throw in another complex and some more transitions:

Now:

• the number of complexes increases by 1:

|K| = |\{A,B,D,E, A+C, B+E\}| = 6

• the number of components is unchanged:

\# \mathrm{components} = 2

• the dimension of the stoichiometric subspace increases by 1. We never need to include reverses of transitions when computing this:

\mathrm{im} Y \partial =

\langle B-A, E-B,  D - (A+C),
D - (B+E) , (B+E)-(A+C) \rangle

= \langle B -A, E-B, D - A - C, D - B - E \rangle

so

\mathrm{dim} \left(\mathrm{im} Y \partial \right) = 4

As a result, the deficiency is still zero:

\begin{array}{ccl} \mathrm{deficiency}   &=& |K| - \# \mathrm{components} - \mathrm{dim} \left( \mathrm{im} Y \partial \right) \\  &=& 6 - 2 - 4 \\  &=&  0 \end{array}

Do all reaction networks have deficiency zero? That would be nice. Let’s try one more example:

• the number of complexes is the same as in our last example:

|K| = |\{A,B,D,E, A+C, B+E\}| = 6

• the number of components is also the same:

\# \mathrm{components} = 2

• the dimension of the stoichiometric subspace is also the same:

\mathrm{im} Y \partial =

\langle B-A,  D - (A+C), D - (B+E) ,
(B+E)-(A+C), (B+E) - B \rangle

= \langle B -A, D - A - C, D - B , E\rangle

so

\mathrm{dim} \left(\mathrm{im} Y \partial \right) = 4

So the deficiency is still zero:

\begin{array}{ccl} \mathrm{deficiency}   &=& |K| - \# \mathrm{components} - \mathrm{dim} \left( \mathrm{im} Y \partial \right) \\  &=& 6 - 2 - 4 \\  &=&  0 \end{array}

It’s sure easy to find examples with deficiency zero!

Puzzle 1. Can you find an example where the deficiency is not zero?

Puzzle 2. If you can’t find an example, prove the deficiency is always zero. If you can, find 1) the smallest example and 2) the smallest example that actually arises in chemistry.

Note that not all reaction networks can actually arise in chemistry. For example, the transition A \to A + A would violate conservation of mass. Nonetheless a reaction network like this might be useful in a very simple model of amoeba reproduction, one that doesn’t take limited resources into account.

Different kinds of graphs

I’ll conclude with some technical remarks that only a mathematician could love; feel free to skip them if you’re not in the mood. As you’ve already seen, it’s important that a reaction network can be drawn as a graph:

But there are many kinds of graph. What kind of graph is this, exactly?

As I mentioned in Part 17, it’s a directed multigraph, meaning that the edges have arrows on them, more than one edge can go from one vertex to another, and we can have any number of edges going from a vertex to itself. Not all those features are present in this example, but they’re certainly allowed by our definition of reaction network!

After all, we’ve seen that a stochastic reaction network amounts to a little diagram like this:

If we throw out the rate constants, we’re left with a reaction network. So, a reaction network is just a diagram like this:

If we throw out the information about how complexes are made of species, we’re left with an even smaller diagram:

And this precisely matches the slick definition of a directed multigraph: it’s a set E of edges, a set V of vertices, and functions s,t : E \to V saying where each edge starts and where it ends: its source and target.

Since this series is about networks, we should expect to run into many kinds of graphs. While their diversity is a bit annoying at first, we must learn to love it, at least if we’re mathematicians and want everything to be completely precise.

There are at least 2^3 = 8 kinds of graphs, depending on our answers to three questions:

• Do the edges have arrows on them?

• Can more than one edge go between a specific pair of vertices?

and

• Can an edge go from a vertex to itself?

We get directed multigraphs if we answer YES, YES and YES to these questions. Since they allow all three features, directed multigraphs are very important. For example, a category is a directed multigraph equipped with some extra structure. Also, what mathematicians call a quiver is just another name for a directed multigraph.

We’ve met two other kinds of graph so far:

• In Part 15 and Part 16 we described circuits made of resistors—or, in other words, Dirichlet operators—using ‘simple graphs’. We get simple graphs when we answer NO, NO and NO to the three questions above. The slick definition of a simple graph is that it’s a set V of vertices together with a subset E of the collection of 2-element subsets of V.

• In Part 20 we described Markov processes on finite sets—or, in other words, infinitesimal stochastic operators—using ‘directed graphs’. We get directed graphs when we answer YES, NO and YES to the three questions. The slick definition of a directed graph is that it’s a set V of vertices together with a subset E of the ordered pairs of V: E \subseteq V \times V.

There is a lot to say about this business, but for now I’ll just note that you can use directed multigraphs with edges labelled by positive numbers to describe Markov processes, just as we used directed graphs. You don’t get anything more general, though! After all, if we have multiple edges from one vertex to another, we can replace them by a single edge as long as we add up their rate constants. And an edge from a vertex to itself has no effect at all.

In short, both for Markov processes and reaction networks, we can take ‘graph’ to mean either ‘directed graph’ or ‘directed multigraph’, as long as we make some minor adjustments. In what follows, we’ll use directed multigraphs for both Markov processes and reaction networks. It will work a bit better if we ever get around to explaining how category theory fits into this story.


Network Theory (Part 19)

18 July, 2012

joint with Jacob Biamonte

It’s time to resume the network theory series! We’re writing a little book based on some of these posts, so we want to finish up our discussion of stochastic Petri nets and chemical reaction networks. But it’s been a long time, so we’ll assume you forgot everything we’ve said before, and make this post as self-contained as possible.

Last time we started looking at a simple example: a diatomic gas.

A diatomic molecule of this gas can break apart into two atoms:

A_2 \to A + A

and conversely, two atoms can combine to form a diatomic molecule:

A + A \to A_2

We can draw both these reactions using a chemical reaction network:

where we’re writing B instead of A_2 to abstract away some detail that’s just distracting here.

Last time we looked at the rate equation for this chemical reaction network, and found equilibrium solutions of that equation. Now let’s look at the master equation, and find equilibrium solutions of that. This will serve as a review of three big theorems.

The master equation

We’ll start from scratch. The master equation is all about how atoms or molecules or rabbits or wolves or other things interact randomly and turn into other things. So, let’s write \psi_{m,n} for the probability that we have m atoms of A and n molecules of B in our container. These probabilities are functions of time, and the master equation will say how they change.

First we need to pick a rate constant for each reaction. Let’s say the rate constant for this reaction is some number \alpha > 0:

B \to A + A

while the rate constant for the reverse reaction is some number \beta > 0:

A + A \to B

Before we make it pretty using the ideas we’ve been explaining all along, the master equation says

\begin{array}{ccr} \displaystyle{ \frac{d}{d t} \psi_{m,n} (t)} &=&  \alpha (n+1) \, \psi_{m-2,n+1}(t) \\ \\   &&- \; \alpha n \, \psi_{m,n}(t) \\  \\  && + \beta (m+2)(m+1) \, \psi_{m+2,n-1}(t) \\ \\ && - \;\beta m(m-1) \, \psi_{m,n}(t) \\ \\   \end{array}

where we define \psi_{i,j} to be zero if either i or j is negative.

Yuck!

Normally we don’t show you such nasty equations. Indeed the whole point of our work has been to demonstrate that by packaging the equations in a better way, we can understand them using high-level concepts instead of mucking around with millions of scribbled symbols. But we thought we’d show you what’s secretly lying behind our beautiful abstract formalism, just once.

Each term has a meaning. For example, the third one:

\beta (m+2)(m+1)\psi_{m+2,n-1}(t)

means that the reaction A + A \to B will tend to increase the probability of there being m atoms of A and n molecules of B if we start with m+2 atoms of A and n-1 molecules of B. This reaction can happen in (m+2)(m+1) ways. And it happens at a probabilistic rate proportional to the rate constant for this reaction, \beta.

We won’t go through the rest of the terms. It’s a good exercise to do so, but there could easily be a typo in the formula, since it’s so long and messy. So let us know if you find one!
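Here is one machine-checkable sanity test, in case you’re worried about typos: total probability should be conserved, so the coefficients in each column of the corresponding matrix should sum to zero. This little Python sketch (with made-up rate constants) builds that matrix on the finite set of states with m + 2n at most some cutoff, a set neither reaction ever leaves, and checks this:

import numpy as np

alpha, beta = 1.0, 2.0    # made-up rate constants
kmax = 12                 # keep all states (m, n) with m + 2n <= kmax; neither reaction leaves this set

states = [(m, n) for m in range(kmax + 1) for n in range(kmax + 1) if m + 2 * n <= kmax]
index = {state: i for i, state in enumerate(states)}

H = np.zeros((len(states), len(states)))
for (m, n), j in index.items():
    if n >= 1:                                    # B -> A + A happens at rate alpha * n
        H[index[(m + 2, n - 1)], j] += alpha * n
        H[j, j] -= alpha * n
    if m >= 2:                                    # A + A -> B happens at rate beta * m * (m - 1)
        H[index[(m - 2, n + 1)], j] += beta * m * (m - 1)
        H[j, j] -= beta * m * (m - 1)

print(np.allclose(H.sum(axis=0), 0.0))    # True: each column sums to zero, so probability is conserved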

To simplify this mess, the key trick is to introduce a generating function that summarizes all the probabilities in a single power series:

\Psi = \sum_{m,n \ge 0} \psi_{m,n} y^m \, z^n

It’s a power series in two variables, y and z, since we have two chemical species: As and Bs.

Using this trick, the master equation looks like

\displaystyle{ \frac{d}{d t} \Psi(t) = H \Psi(t) }

where the Hamiltonian H is a sum of terms, one for each reaction. This Hamiltonian is built from operators that annihilate and create As and Bs. The annihilation and creation operators for A atoms are:

\displaystyle{ a = \frac{\partial}{\partial y} , \qquad a^\dagger = y }

The annihilation operator differentiates our power series with respect to the variable y. The creation operator multiplies it by that variable. Similarly, the annihilation and creation operators for B molecules are:

\displaystyle{ b = \frac{\partial}{\partial z} , \qquad b^\dagger = z }

In Part 8 we explained a recipe that lets us stare at our chemical reaction network and write down this Hamiltonian:

H = \alpha ({a^\dagger}^2 b - b^\dagger b) + \beta (b^\dagger a^2 - {a^\dagger}^2 a^2)

As promised, there’s one term for each reaction. But each term is itself a sum of two: one that increases the probability that our container of chemicals will be in a new state, and another that decreases the probability that it’s in its original state. We get a total of four terms, which correspond to the four terms in our previous way of writing the master equation.

Puzzle 1. Show that this new way of writing the master equation is equivalent to the previous one.

Equilibrium solutions

Now we will look for all equilibrium solutions of the master equation: in other words, solutions that don’t change with time. So, we’re trying to solve

H \Psi = 0

Given the rather complicated form of the Hamiltonian, this seems tough. The challenge looks more concrete but even more scary if we go back to our original formulation. We’re looking for probabilities \psi_{m,n}, nonnegative numbers that sum to one, such that

\begin{array}{ccr} 0 &=&  \alpha (n+1) \, \psi_{m-2,n+1} \\ \\   &&- \; \alpha n \, \psi_{m,n} \\  \\  && + \beta (m+2)(m+1) \, \psi_{m+2,n-1} \\ \\ && - \;\beta m(m-1) \, \psi_{m,n} \\ \\   \end{array}

for all m and n.

This equation is horrid! But the good news is that it’s linear, so a linear combination of solutions is again a solution. This lets us simplify the problem using a conserved quantity.

Clearly, there’s a quantity that the reactions here don’t change:

What’s that? It’s the number of As plus twice the number of Bs. After all, a B can turn into two As, or vice versa.

(Of course the secret reason is that B is a diatomic molecule made of two As. But you’d be able to follow the logic here even if you didn’t know that, just by looking at the chemical reaction network… and sometimes this more abstract approach is handy! Indeed, the way chemists first discovered that certain molecules are made of certain atoms is by seeing which reactions were possible and which weren’t.)

Suppose we start in a situation where we know for sure that the number of As plus twice the number of Bs equals some number k:

\psi_{m,n} = 0  \;  \textrm{unless} \; m+2n = k

Then we know \Psi is initially of the form

\displaystyle{ \Psi = \sum_{m+2n = k} \psi_{m,n} \, y^m z^n  }

But since the number of As plus twice the number of Bs is conserved, if \Psi obeys the master equation it will continue to be of this form!

Put a fancier way, we know that if a solution of the master equation starts in this subspace:

\displaystyle{ L_k =  \{ \Psi: \; \Psi = \sum_{m+2n = k} \psi_{m,n} y^m z^n \; \textrm{for some} \; \psi_{m,n} \}  }

it will stay in this subspace. So, because the master equation is linear, we can take any solution \Psi and write it as a linear combination of solutions \Psi_k, one in each subspace L_k.

In particular, we can do this for an equilibrium solution \Psi. And then all the solutions \Psi_k are also equilibrium solutions: they’re linearly independent, so if one of them changed with time, \Psi would too.

This means we can just look for equilibrium solutions in the subspaces L_k. If we find these, we can get all equilibrium solutions by taking linear combinations.

Once we’ve noticed that, our horrid equation makes a bit more sense:

\begin{array}{ccr} 0 &=&  \alpha (n+1) \, \psi_{m-2,n+1} \\ \\   &&- \; \alpha n \, \psi_{m,n} \\  \\  && + \beta (m+2)(m+1) \, \psi_{m+2,n-1} \\ \\ && - \;\beta m(m-1) \, \psi_{m,n} \\ \\   \end{array}

Note that if the pair of subscripts m, n obey m + 2n = k, the same is true for the other pairs of subscripts here! So our equation relates the values of \psi_{m,n} for all the points (m,n) with integer coordinates lying on this line segment:

m+2n = k , \qquad m ,n \ge 0

You should be visualizing something like this:

If you think about it a minute, you’ll see that if we know \psi_{m,n} at two points on such a line, we can keep using our equation to recursively work out all the rest. So, there are at most two linearly independent equilibrium solutions of the master equation in each subspace L_k.

Why at most two, rather than exactly two? Well, we have to be a bit careful about what happens at the ends of the line segment: remember that \psi_{m,n} is defined to be zero when m or n becomes negative. If we think very hard about this, we’ll see there’s just one linearly independent equilibrium solution of the master equation in each subspace L_k. But this is the sort of nitty-gritty calculation that’s not fun to watch someone else do, so we won’t bore you with that.

Soon we’ll move on to a more high-level approach to this problem. But first, one remark. Our horrid equation is like a fancy version of the usual discretized form of the equation

\displaystyle {\frac{d^2 \psi}{d x^2} = 0 }

namely:

\psi_{n-1} - 2 \psi_{n} + \psi_{n+1} = 0

And this makes sense, since we get

\displaystyle {\frac{d^2 \psi}{d x^2} = 0 }

by taking the heat equation:

\displaystyle \frac{\partial \psi}{\partial t} = {\frac{\partial^2 \psi}{\partial x^2} }

and assuming \psi doesn’t depend on time. So what we’re doing is a lot like looking for equilibrium solutions of the heat equation.

The heat equation describes how heat smears out as little particles of heat randomly move around. True, there don’t really exist ‘little particles of heat’, but this equation also describes the diffusion of any other kind of particles as they randomly move around undergoing Brownian motion. Similarly, our master equation describes a random walk on this line segment:

m+2n = k , \qquad m , n \ge 0

or more precisely, the points on this segment with integer coordinates. The equilibrium solutions arise when the probabilities \psi_{m,n} have diffused as much as possible.

If you think about it this way, it should be physically obvious that there’s just one linearly independent equilibrium solution of the master equation for each value of k.

There’s a general moral here, too, which we’re seeing in a special case: the master equation for a chemical reaction network really describes a bunch of random walks, one for each allowed value of the conserved quantities that can be built as linear combinations of number operators. In our case we have one such conserved quantity, but in general there may be more (or none).

Furthermore, these ‘random walks’ are what we’ve been calling Markov processes.

Noether’s theorem

We simplified our task of finding equilibrium solutions of the master equation by finding a conserved quantity. The idea of simplifying problems using conserved quantities is fundamental to physics: this is why physicists are so enamored with quantities like energy, momentum, angular momentum and so on.

Nowadays physicists often use ‘Noether’s theorem’ to get conserved quantities from symmetries. There’s a very simple version of Noether’s theorem for quantum mechanics, but in Part 11 we saw a version for stochastic mechanics, and it’s that version that is relevant now. Here’s a paper which explains it in detail:

• John Baez and Brendan Fong, Noether’s theorem for Markov processes.

We don’t really need Noether’s theorem now, since we found the conserved quantity and exploited it without even noticing the symmetry. Nonetheless it’s interesting to see how it relates to what we’re doing.

For the reaction we’re looking at now, the idea is that the subspaces L_k are eigenspaces of an operator that commutes with the Hamiltonian H. It follows from standard math that a solution of the master equation that starts in one of these subspaces, stays in that subspace.

What is this operator? It’s built from ‘number operators’. The number operator for As is

N_A = a^\dagger a

and the number operator for Bs is

N_B = b^\dagger b

A little calculation shows

N_A \,y^m z^n = m \, y^m z^n, \quad \qquad  N_B\, y^m z^n = n \,y^m z^n

so the eigenvalue of N_A is the number of As, while the eigenvalue of N_B is the number of Bs. This is why they’re called number operators.

As a consequence, the eigenvalue of the operator N_A + 2N_B is the number of As plus twice the number of Bs:

(N_A + 2N_B) \, y^m z^n = (m + 2n) \, y^m z^n

Let’s call this operator O, since it’s so important:

O = N_A + 2N_B

If you think about it, the spaces L_k we saw a minute ago are precisely the eigenspaces of this operator:

L_k = \{ \Psi : \; O \Psi = k \Psi \}

As we’ve seen, solutions of the master equation that start in one of these eigenspaces will stay there. This lets us take some techniques that are very familiar in quantum mechanics, and apply them to this stochastic situation.

First of all, time evolution as described by the master equation is given by the operators \exp(t H). In other words,

\displaystyle{ \frac{d}{d t} \Psi(t) } = H \Psi(t) \; \textrm{and} \;  \Psi(0) = \Phi \quad  \Rightarrow \quad \Psi(t) = \exp(t H) \Phi

But if you start in some eigenspace of O, you stay there. Thus if \Phi is an eigenvector of O, so is \exp(t H) \Phi, with the same eigenvalue. In other words,

O \Phi = k \Phi

implies

O \exp(t H) \Phi = k \exp(t H) \Phi = \exp(t H) O \Phi

But since we can choose a basis consisting of eigenvectors of O, we must have

O \exp(t H) = \exp(t H) O

or, throwing caution to the winds and differentiating:

O H = H O

So, as we’d expect from Noether’s theorem, our conserved quantity commutes with the Hamiltonian! This in turn implies that H commutes with any polynomial in O, which in turn suggests that

\exp(s O) H = H \exp(s O)

and also

\exp(s O) \exp(t H) = \exp(t H) \exp(s O)

The last equation says that O generates a 1-parameter family of ‘symmetries’: operators \exp(s O) that commute with time evolution. But what do these symmetries actually do? Since

O y^m z^n = (m + 2n) y^m z^n

we have

\exp(s O) y^m z^n = e^{s(m + 2n)}\, y^m z^n

So, this symmetry takes any probability distribution \psi_{m,n} and multiplies it by e^{s(m + 2n)}.

In other words, our symmetry multiplies the relative probability of finding our container of gas in a given state by a factor of e^s for each A atom, and by a factor of e^{2s} for each B molecule. It might not seem obvious that this operation commutes with time evolution! However, experts on chemical reaction theory are familiar with this fact.

Finally, a couple of technical points. Starting where we said “throwing caution to the winds”, our treatment has not been rigorous, since O and H are unbounded operators, and these must be handled with caution. Nonetheless, all the commutation relations we wrote down are true.

The operators \exp(s O) are unbounded for positive s. They’re bounded for negative s, so they give a one-parameter semigroup of bounded operators. But they’re not stochastic operators: even for s negative, they don’t map probability distributions to probability distributions. However, they do map any nonzero vector \Psi with \psi_{m,n} \ge 0 to a vector \exp(s O) \Psi with the same properties. So, we can just normalize this vector and get a probability distribution. The need for this normalization is why we spoke of relative probabilities.

The Anderson–Craciun–Kurtz theorem

Now we’ll actually find all equilibrium solutions of the master equation in closed form. To understand this final section, you really do need to remember some things we’ve discussed earlier. Last time we considered the same chemical reaction network we’re studying today, but we looked at its rate equation, which looks like this:

\displaystyle{ \frac{d}{d t} x_1 =  2 \alpha x_2 - 2 \beta x_1^2}

\displaystyle{ \frac{d}{d t} x_2 = - \alpha x_2 + \beta x_1^2 }

This describes how the number of As and Bs changes in the limit where there are lots of them and we can treat them as varying continuously, in a deterministic way. The number of As is x_1, and the number of Bs is x_2.

We saw that the quantity

x_1 + 2 x_2

is conserved, just as today we’ve seen that N_A + 2 N_B is conserved. We saw that the rate equation has one equilibrium solution for each choice of x_1 + 2 x_2. And we saw that these equilibrium solutions obey

\displaystyle{ \frac{x_1^2}{x_2} = \frac{\alpha}{\beta} }
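For instance, here is a quick numerical check in Python (rate constants and initial amounts made up) that solutions of the rate equation really do settle down to a state obeying this relation:

alpha, beta = 1.0, 2.0
x1, x2 = 3.0, 0.5              # made-up initial amounts of A and B
dt = 1e-3
for _ in range(200_000):       # crude forward-Euler integration of the rate equation
    dx1 = 2 * alpha * x2 - 2 * beta * x1**2
    dx2 = -alpha * x2 + beta * x1**2
    x1, x2 = x1 + dt * dx1, x2 + dt * dx2

print(x1**2 / x2, alpha / beta)    # the two printed numbers agree closely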

The Anderson–Craciun–Kurtz theorem, introduced in Part 9, is a powerful result that gets equilibrium solutions of the master equation from equilibrium solutions of the rate equation. It only applies to equilibrium solutions that are ‘complex balanced’, but that’s okay:

Puzzle 2. Show that the equilibrium solutions of the rate equation for the chemical reaction network

are complex balanced.

So, given any equilibrium solution (x_1,x_2) of our rate equation, we can hit it with the Anderson–Craciun–Kurtz theorem and get an equilibrium solution of the master equation! And it looks like this:

\displaystyle{  \Psi = e^{-(x_1 + x_2)} \, \sum_{m,n \ge 0} \frac{x_1^m x_2^n} {m! n! } \, y^m z^n }

In this solution, the probability distribution

\displaystyle{ \psi_{m,n} = e^{-(x_1 + x_2)} \,  \frac{x_1^m x_2^n} {m! n! } }

is a product of Poisson distributions. The factor in front is there to make the numbers \psi_{m,n} add up to one. And remember, x_1, x_2 are any nonnegative numbers with

\displaystyle{ \frac{x_1^2}{x_2} = \frac{\alpha}{\beta} }

So from all we’ve said, the above formula is an explicit closed-form solution of the horrid equation

\begin{array}{ccr} 0 &=&  \alpha (n+1) \, \psi_{m-2,n+1} \\ \\   &&- \; \alpha n \, \psi_{m,n} \\  \\  && + \beta (m+2)(m+1) \, \psi_{m+2,n-1} \\ \\ && - \;\beta m(m-1) \, \psi_{m,n} \\ \\   \end{array}

That’s pretty nice. We found some solutions without ever doing any nasty calculations.
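In case you’d like to see this with your own eyes, here is a short Python check (with made-up rate constants) that the product-of-Poissons formula really does satisfy the horrid equation:

from math import exp, factorial

alpha, beta = 1.0, 2.0            # made-up rate constants
x1 = 1.5                          # any positive number will do
x2 = beta * x1**2 / alpha         # chosen so that x1^2 / x2 = alpha / beta

def psi(m, n):
    if m < 0 or n < 0:
        return 0.0
    return exp(-(x1 + x2)) * x1**m * x2**n / (factorial(m) * factorial(n))

def residual(m, n):
    return (alpha * (n + 1) * psi(m - 2, n + 1)
            - alpha * n * psi(m, n)
            + beta * (m + 2) * (m + 1) * psi(m + 2, n - 1)
            - beta * m * (m - 1) * psi(m, n))

print(max(abs(residual(m, n)) for m in range(30) for n in range(30)))   # essentially zero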

But we’ve really done better than getting some equilibrium solutions of the master equation. By restricting attention to n,m with m+2n = k, our formula for \psi_{m,n} gives an equilibrium solution that lives in the eigenspace L_k:

\displaystyle{  \Psi_k = e^{-(x_1 + x_2)} \, \sum_{m+2n =k} \frac{x_1^m x_2^n} {m! n! } \, y^m z^n }

And by what we’ve said, linear combinations of these give all equilibrium solutions of the master equation.

And we got them with very little work! Despite all the fancy talk in today’s post, we essentially just took the equilibrium solutions of the rate equation and plugged them into a straightforward formula to get equilibrium solutions of the master equation. This is why the Anderson–Craciun–Kurtz theorem is so nice. And of course we’re looking at a very simple reaction network: for more complicated ones it becomes even better to use this theorem to avoid painful calculations.

We could go further. For example, we could study nonequilibrium solutions using Feynman diagrams like this:

But instead, we will leave off with two more puzzles. We introduced some symmetries, but we haven’t really explored them yet:

Puzzle 3. What do the symmetries associated to the conserved quantity O do to the equilibrium solutions of the master equation given by

\displaystyle{  \Psi = e^{-(x_1 + x_2)} \, \sum_{m,n \ge 0} \frac{x_1^m x_2^n} {m! n! } \, y^m z^n }

where (x_1,x_2) is an equilibrium solution of the rate equation? In other words, what is the significance of the one-parameter family of solutions

\exp(s O) \Psi ?

Also, we used a conceptual argument to check that H commutes with O, but it’s good to know that we can check this sort of thing directly:

Puzzle 4. Compute the commutator

[H, O] = H O - O H

and show it vanishes.
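This is a nice exercise with pen and paper. If you’d like a quick machine spot-check first, here is a sketch using SymPy that applies H O - O H to a single monomial y^m z^n and confirms the result is zero. (Checking one monomial is of course not a proof, and the monomial chosen below is arbitrary.)

import sympy as sp

y, z, alpha, beta = sp.symbols('y z alpha beta')

a  = lambda f: sp.diff(f, y)    # annihilate an A
ad = lambda f: y * f            # create an A
b  = lambda f: sp.diff(f, z)    # annihilate a B
bd = lambda f: z * f            # create a B

def H(f):
    return (alpha * (ad(ad(b(f))) - bd(b(f)))
            + beta * (bd(a(a(f))) - ad(ad(a(a(f))))))

def O(f):
    return ad(a(f)) + 2 * bd(b(f))    # the number operator N_A + 2 N_B

f = y**4 * z**3                        # one arbitrary monomial
print(sp.simplify(H(O(f)) - O(H(f))))  # 0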


The Mathematics of Biodiversity (Part 3)

27 June, 2012

We tend to think of biodiversity as a good thing, but sometimes it’s deadly. Yesterday Andrei Korobeinikov gave a talk on ‘Viral evolution within a host’, which was mainly about AIDS.

The virus that causes this disease, HIV, can reproduce very fast. In an untreated patient near death there are between 10^{10} and 10^{12} new virions per day! Remember, a virion is an individual virus particle. The virus also has a high mutation rate: about 3 \times 10^{-5} mutations per generation for each base—that is, each molecule of A, T, C, or G in the RNA of the virus. That may not seem like a lot, but if you multiply it by 10^{12} you’ll see that a huge number of new variations of each base arise within the body of a single patient.

So, evolution is at work within you as you die.

And in fact, many scientists believe that the diversity of the virus eventually overwhelms your immune system! Although it’s apparently not quite certain, it seems that while the body generates B cells and T cells to attack different variants of HIV as they arise, they eventually can’t keep up with the sheer number of variants.

Of course, the fact that the HIV virus attacks the immune system makes the disease even worse. Here in blue you see the number of T cells per cubic millimeter of blood, and in red you see the number of virions per cubic centimeter of blood for a typical untreated patient:

Mathematicians and physicists have looked at some very simple models to get a qualitative understanding of these issues. One famous paper that started this off is:

• Lev S. Tsimring, Herbert Levine and David A. Kessler, RNA virus evolution via a fitness-space model, Phys. Rev. Lett. 76 (1996), 4440–4443.

The idea here is to say that at any time t the viruses have a probability density p(r,t) of having fitness r. In fact the different genotypes of the virus form a cloud in a higher-dimensional space, but these authors treat that space as 1-dimensional, with fitness as its one coordinate, just to keep things simple. They then write down an equation for how the population density changes with time:

\displaystyle{\frac{\partial }{\partial t}p(r,t) = (r - \langle r \rangle)\, p(r,t) + D \frac{\partial^2 }{\partial r^2}p(r,t) - \frac{\partial}{\partial r}(v_{\mathrm{drift}}\, p(r,t)) }

This is a replication-mutation-drift equation. If we just had

\displaystyle{\frac{\partial }{\partial t}p(r,t) = (r - \langle r \rangle)\, p(r,t) }

this would be a version of the replicator equation, which I explained recently in Information Geometry (Part 9). Here

\displaystyle{ \langle r \rangle = \int_0^\infty r p(r,t) dr }

is the mean fitness, and the replicator equation says that the fraction of organisms of a given type grows at a rate proportional to how much their fitness exceeds the mean fitness: that’s where the (r - \langle r \rangle) comes from.

If we just had

\displaystyle{\frac{\partial }{\partial t}p(r,t) =  D \frac{\partial^2 }{\partial r^2}p(r,t) }

this would be the heat equation, which describes diffusion occurring at a rate D. This models the mutation of the virus, though not in a very realistic way.

If we just had

\displaystyle{\frac{\partial}{\partial t} p(r,t) =  - \frac{\partial}{\partial r}(v_{\mathrm{drift}} \, p(r,t)) }

the fitness of the virus would increase at a rate equal to the drift velocity v_{\mathrm{drift}}.

If we include both the diffusion and drift terms:

\displaystyle{\frac{\partial }{\partial t} p(r,t) =  D \frac{\partial^2 }{\partial r^2}p(r,t) - \frac{\partial}{\partial r}(v_{\mathrm{drift}} \, p(r,t)) }

we get the Fokker–Planck equation. This is a famous model of something that’s spreading while also drifting along at a constant velocity: for example, a drop of ink in moving water. Its solutions look like this:

Here we start with stuff concentrated at one point, and it spreads out into a Gaussian while drifting along.
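If you want to see this for yourself, here is a quick and dirty finite-difference sketch in Python. It’s my own toy illustration, not code from the paper, and the diffusion constant, drift velocity, grid and initial spike are all arbitrary choices:

import numpy as np

D, v_drift = 0.05, 1.0
r = np.linspace(0.0, 10.0, 500)
dr = r[1] - r[0]
dt = 0.2 * dr**2 / D                  # small enough for the explicit scheme to behave
p = np.zeros_like(r)
p[50] = 1.0 / dr                      # probability concentrated near r = 1

for _ in range(int(3.0 / dt)):        # evolve up to t = 3
    diffusion = D * (np.roll(p, -1) - 2 * p + np.roll(p, 1)) / dr**2
    drift = -v_drift * (np.roll(p, -1) - np.roll(p, 1)) / (2 * dr)
    p = p + dt * (diffusion + drift)

print(p.sum() * dr)                   # total probability stays close to 1
print(r[np.argmax(p)])                # the peak has drifted from about r = 1 to about r = 4

Plot p at a few intermediate times and you get exactly the picture just described: a spike relaxing into a Gaussian that spreads while drifting to the right.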

By the way, watch out: what biologists call ‘genetic drift’ is actually a form of diffusion, not what physicists call ‘drift’.

More recently, people have looked at another very simple model. You can read about it here:

• Martin A. Nowak and R. M. May, Virus Dynamics, Oxford University Press, Oxford, 2000.

In this model the variables are:

• the number of healthy human cells of some type, \mathrm{H}(t)

• the number of infected human cells of that type, \mathrm{I}(t)

• the number of virions, \mathrm{V}(t)

These are my names for variables, not theirs. It’s just a sick joke that these letters spell out ‘HIV’.

Chemists like to describe how molecules react and turn into other molecules using ‘chemical reaction networks’. You’ve seen these if you’ve taken chemistry, but I’ve been explaining more about the math of these starting in Network Theory (Part 17). We can also use them here! Though May and Nowak probably didn’t put it this way, we can consider a chemical reaction network with the following 6 reactions:

• the production of a healthy cell:

\longrightarrow \mathrm{H}

• the infection of a healthy cell by a virion:

\mathrm{H} + \mathrm{V} \longrightarrow \mathrm{I}

• the production of a virion by an infected cell:

\mathrm{I} \longrightarrow \mathrm{I} + \mathrm{V}

• the death of a healthy cell:

\mathrm{H} \longrightarrow

• the death of an infected cell:

\mathrm{I} \longrightarrow

• the death of a virion:

\mathrm{V} \longrightarrow

Using a standard recipe which I explained, we can get from this chemical reaction network to some ‘rate equations’ saying how the number of healthy cells, infected cells and virions changes with time:

\displaystyle{ \frac{d\mathrm{H}}{dt}  =   \alpha - \beta \mathrm{H}\mathrm{V} - \gamma \mathrm{H} }

\displaystyle{ \frac{d\mathrm{I}}{dt}  = \beta \mathrm{H}\mathrm{V} - \delta \mathrm{I} }

\displaystyle{ \frac{d\mathrm{V}}{dt}  = - \beta \mathrm{H}\mathrm{V} + \epsilon \mathrm{I} - \zeta \mathrm{V} }

The Greek letters are constants called ‘rate constants’, and there’s one for each of the 6 reactions. The equations we get this way are exactly those described by Nowak and May!
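Just to see what these equations do, here is a small Python sketch. The rate constants and initial numbers here are made up purely for illustration; realistic values would have to come from the biology:

from scipy.integrate import solve_ivp

alpha, beta, gamma, delta, epsilon, zeta = 10.0, 0.01, 0.1, 0.5, 5.0, 3.0

def rates(t, y):
    H, I, V = y
    dH = alpha - beta * H * V - gamma * H
    dI = beta * H * V - delta * I
    dV = -beta * H * V + epsilon * I - zeta * V
    return [dH, dI, dV]

sol = solve_ivp(rates, (0.0, 200.0), [100.0, 0.0, 1.0])
print(sol.y[:, -1])    # long-run numbers of healthy cells, infected cells and virions

With these particular constants the infection takes hold and the system settles down to an endemic equilibrium; tweak them and the virus can die out instead.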

What Andrei Korobeinikov is doing is unifying the ideas behind the two models I’ve described here. Alas, I don’t have the energy to explain how. Indeed, I don’t even have the energy to explain what the models I’ve described actually predict. Sad, but true.

I don’t see anything online about Korobeinikov’s new work, but you can read some of his earlier work here:

• Andrei Korobeinikov, Global properties of basic virus dynamics models.

• Suzanne M. O’Regan, Thomas C. Kelly, Andrei Korobeinikov, Michael J. A. O’Callaghan and Alexei V. Pokrovskii, Lyapunov functions for SIR and SIRS epidemic models, Appl. Math. Lett. 23 (2010), 446-448.

The SIR and SIRS models are models of disease that also arise from chemical reaction networks. I explained them back in Network Theory (Part 3). That was before I introduced the terminology of chemical reaction networks… back then I was talking about ‘stochastic Petri nets’, which are an entirely equivalent formalism. Here’s the stochastic Petri net for the SIRS model:

Puzzle: Draw the stochastic Petri net for the HIV model discussed above. It should have 3 yellow circles and 6 aqua squares.


Quantizing Electrical Circuits

2 February, 2012

As you may know, there’s a wonderful and famous analogy between classical mechanics and electrical circuit theory. I explained it back in “week288”, so I won’t repeat that story now. If you don’t know what I’m talking about, take a look!

This analogy opens up the possibility of quantizing electrical circuits by straightforwardly copying the way we quantize classical mechanics problems. I’d often wondered if this would be useful.

It is, and people have done it:

• Michel H. Devoret, Quantum fluctuations in electrical circuits.

Michel Devoret, Rob Schoelkopf and others call this idea quantronics: the study of mesoscopic electronic effects in which collective degrees of freedom like currents and voltages behave quantum mechanically.
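The simplest example is an LC circuit, which is mathematically just a harmonic oscillator. Treating the flux \Phi through the inductor as the ‘position’ and the charge Q on the capacitor as the conjugate ‘momentum’, the energy is

\displaystyle{ H = \frac{Q^2}{2C} + \frac{\Phi^2}{2L} }

and quantizing in the usual way, with [\hat{\Phi}, \hat{Q}] = i \hbar, gives evenly spaced energy levels E_n = \hbar \omega (n + \tfrac{1}{2}) with \omega = 1/\sqrt{L C}. (This little computation is mine, not taken from Devoret’s notes, but it’s the standard starting point of the subject.)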

I just learned about this from a talk by Sean Barrett here in Coogee. There are lots of cool applications, but right now I’m mainly interested in how this extends the set of analogies between different physical theories.

One interesting thing is how they quantize circuits with resistors. Over in classical mechanics, this corresponds to systems with friction. These systems, called ‘dissipative’ systems, don’t have a conserved energy. More precisely, energy leaks out of the system under consideration and gets transferred to the environment in the form of heat. It’s hard to quantize systems where energy isn’t conserved, so people in quantronics model resistors as infinite chains of inductors and capacitors: see the ‘LC ladder circuit’ on page 15 of Devoret’s notes. This idea is also the basis of the Caldeira–Leggett model of a particle coupled to a heat bath made of harmonic oscillators: it amounts to including the environment as part of the system being studied.


Network Theory (Part 18)

21 November, 2011

joint with Jacob Biamonte

Last time we explained how ‘reaction networks’, as used in chemistry, are just another way of talking about Petri nets. We stated an amazing result, the deficiency zero theorem, which settles quite a number of questions about chemical reactions. Now let’s illustrate it with an example.

Our example won’t show how powerful this theorem is: it’s too simple. But it’ll help explain the ideas involved.

Diatomic molecules

A diatomic molecule consists of two atoms of the same kind, stuck together:

At room temperature there are 5 elements that are diatomic gases: hydrogen, nitrogen, oxygen, fluorine, chlorine. Bromine is a diatomic liquid, but easily evaporates into a diatomic gas:

Iodine is a crystal at room temperature:

but if you heat it a bit, it becomes a diatomic liquid and then a gas:

so people often list it as a seventh member of the diatomic club.

When you heat any diatomic gas enough, it starts becoming a ‘monatomic’ gas as molecules break down into individual atoms. However, just as a diatomic molecule can break apart into two atoms:

A_2 \to A + A

two atoms can recombine to form a diatomic molecule:

A + A \to A_2

So in equilibrium, the gas will be a mixture of diatomic and monatomic forms. The exact amount of each will depend on the temperature and pressure, since these affect the likelihood that two colliding atoms stick together, or a diatomic molecule splits apart. The detailed nature of our gas also matters, of course.

But we don’t need to get into these details here! Instead, we can just write down the ‘rate equation’ for the reactions we’re talking about. All the details we’re ignoring will be hiding in some constants called ‘rate constants’. We won’t try to compute these; we’ll leave that to our chemist friends.

A reaction network

To write down our rate equation, we start by drawing a ‘reaction network’. For this, we can be a bit abstract and call the diatomic molecule B instead of A_2. Then it looks like this:

We could write down the same information using a Petri net:

But today let’s focus on the reaction network! Staring at this picture, we can read off various things:

Species. The species are the different kinds of atoms, molecules, etc. In our example the set of species is S = \{A, B\}.

Complexes. A complex is a finite sum of species, like A, or A + A, or for a fancier example using more efficient notation, 2 A + 3 B. So, we can think of a complex as a vector v \in \mathbb{R}^S. The complexes that actually show up in our reaction network form a set C \subseteq \mathbb{R}^S. In our example, C = \{A+A, B\}.

Reactions. A reaction is an arrow going from one complex to another. In our example we have two reactions: A + A \to B and B \to A + A.

Chemists define a reaction network to be a triple (S, C, T) where S is a set of species, C is the set of complexes that appear in the reactions, and T is the set of reactions v \to w where v, w \in C. (Stochastic Petri net people call reactions transitions, hence the letter T.)

So, in our example we have:

• Species S = \{A,B\}.

• Complexes: C= \{A+A, B\}.

• Reactions: T =  \{A+A\to B, B\to A+A\}.

To get the rate equation, we also need one more piece of information: a rate constant r(\tau) for each reaction \tau \in T. This is a nonnegative real number that affects how fast the reaction goes. All the details of how our particular diatomic gas behaves at a given temperature and pressure are packed into these constants!

The rate equation

The rate equation says how the expected numbers of the various species—atoms, molecules and the like—change with time. This equation is deterministic. It’s a good approximation when the numbers are large and any fluctuations in these numbers are negligible by comparison.

Here’s the general form of the rate equation:

\displaystyle{ \frac{d}{d t} x_i =  \sum_{\tau\in T} r(\tau) \, (n_i(\tau)-m_i(\tau)) \, x^{m(\tau)}  }

Let’s take a closer look. The quantity x_i is the expected population of the ith species. So, this equation tells us how that changes. But what about the right hand side? As you might expect, it’s a sum over reactions. And:

• The term for the reaction \tau is proportional to the rate constant r(\tau).

• Each reaction \tau goes between two complexes, so we can write it as m(\tau) \to n(\tau). Among chemists the input m(\tau) is called the reactant complex, and the output is called the product complex. The difference n_i(\tau)-m_i(\tau) tells us how many items of species i get created, minus how many get destroyed. So, it’s the net amount of this species that gets produced by the reaction \tau. The term for the reaction \tau is proportional to this, too.

• Finally, the law of mass action says that the rate of a reaction is proportional to the product of the concentrations of the species that enter as inputs. More precisely, if we have a reaction \tau whose input is the complex m(\tau), we define x^{m(\tau)} = x_1^{m_1(\tau)} \cdots x_k^{m_k(\tau)}. The law of mass action says the term for the reaction \tau is proportional to this, too!
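If you like seeing formulas turned into algorithms, here is a minimal Python sketch of the right-hand side of the rate equation. The data format is our own ad hoc choice: each reaction is a triple consisting of its rate constant, its reactant complex m(\tau) and its product complex n(\tau), with complexes written as tuples of natural numbers:

import numpy as np

def rate_equation_rhs(reactions, x):
    # d x / d t for mass-action kinetics
    x = np.asarray(x, dtype=float)
    dxdt = np.zeros_like(x)
    for r, m, n in reactions:
        m, n = np.array(m), np.array(n)
        flux = r * np.prod(x ** m)        # law of mass action: r(tau) x^{m(tau)}
        dxdt += flux * (n - m)            # net production of each species
    return dxdt

# our example: species (A, B), reactions B -> A + A and A + A -> B
k1, k2 = 1.0, 1.0                         # rate constants, both set to 1 for illustration
reactions = [(k1, (0, 1), (2, 0)),        # B -> A + A
             (k2, (2, 0), (0, 1))]        # A + A -> B
print(rate_equation_rhs(reactions, [3.0, 3.0]))    # gives [-12, 6]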

Let’s see what this says for the reaction network we’re studying:

Let’s write x_1(t) for the number of A atoms and x_2(t) for the number of B molecules. Let the rate constant for the reaction B \to A + A be \alpha, and let the rate constant for A + A \to B be \beta. Then the rate equation is this:

\displaystyle{\frac{d}{d t} x_1 =  2 \alpha x_2 - 2 \beta x_1^2 }

\displaystyle{\frac{d}{d t} x_2 = -\alpha x_2 + \beta x_1^2 }

This is a bit intimidating. However, we can solve it in closed form thanks to something very precious: a conserved quantity.

We’ve got two species, A and B. But remember, B is just an abbreviation for a molecule made of two A atoms. So, the total number of A atoms is conserved by the reactions in our network. This is the number of A‘s plus twice the number of B‘s: x_1 + 2x_2. So, this should be a conserved quantity: it should not change with time. Indeed, by adding the first equation above to twice the second, we see:

\displaystyle{\frac{d}{d t} \left( x_1 + 2x_2 \right) = 0 }

As a consequence, any solution will stay on a line

x_1 + 2 x_2 = c

for some constant c. Since x_2 = (c - x_1)/2, we can use this fact to rewrite the rate equation just in terms of x_1:

\displaystyle{ \frac{d}{d t} x_1 = \alpha (c - x_1) - 2 \beta x_1^2 }

This is a separable differential equation, so we can solve it if we can figure out how to do this integral

\displaystyle{ t = \int \frac{d x_1}{\alpha (c - x_1) - 2 \beta x_1^2 }  }

and then solve for x_1.
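(If you’re curious, a computer algebra system will happily grind this out. Here’s a quick sketch using sympy, our own illustration with arbitrary sample values of the constants; the antiderivative comes out as a combination of logarithms in x_1, which you could then invert numerically to get x_1 as a function of t.)

import sympy as sp

x1 = sp.symbols('x1')
alpha, beta, c = 1, 1, 3                 # arbitrary sample values
antiderivative = sp.integrate(1 / (alpha * (c - x1) - 2 * beta * x1**2), x1)
print(sp.simplify(antiderivative))       # a combination of logarithms in x1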

This sort of trick won’t work for more complicated examples. But the idea remains important: the numbers of atoms of various kinds—hydrogen, helium, lithium, and so on—are conserved by chemical reactions, so a solution of the rate equation can’t roam freely in \mathbb{R}^S. It will be trapped on some hypersurface, which is called a ‘stoichiometric compatibility class’. And this is very important.

We don’t feel like doing the integral required to solve our rate equation in closed form, because this idea doesn’t generalize too much. On the other hand, we can always solve the rate equation numerically. So let’s try that!

For example, suppose we set \alpha = \beta = 1. We can plot the solutions for three different choices of initial conditions, say (x_1,x_2) = (0,3), (4,0), and (3,3). We get these graphs:

It looks like the solution always approaches an equilibrium. We seem to be getting different equilibria for different initial conditions, and the pattern is a bit mysterious. However, something nice happens when we plot the ratio x_1^2 / x_2:

Apparently it always converges to 1. Why should that be? It’s not terribly surprising. With both rate constants equal to 1, the reaction A + A \to B proceeds at a rate equal to the square of the number of A‘s, namely x_1^2. The reverse reaction proceeds at a rate equal to the number of B‘s, namely x_2. So in equilibrium, we should have x_1^2 = x_2.
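Here is the little numerical experiment behind those graphs, written as a Python sketch of our own; the step size and time span are arbitrary. It integrates the rate equation with a crude Euler method from the three initial conditions above, and prints the conserved quantity x_1 + 2x_2 together with the final ratio x_1^2/x_2:

alpha = beta = 1.0
dt, steps = 0.001, 20000

for x1, x2 in [(0.0, 3.0), (4.0, 0.0), (3.0, 3.0)]:
    for _ in range(steps):
        dx1 = 2 * alpha * x2 - 2 * beta * x1**2
        dx2 = -alpha * x2 + beta * x1**2
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
    print(x1 + 2 * x2, x1**2 / x2)    # the conserved quantity, and a ratio very close to 1

Each run lands on a different equilibrium, but the ratio x_1^2/x_2 comes out as 1 every time.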

But why is the equilibrium stable? In this example we could see that using the closed-form solution, or maybe just common sense. But it also follows from a powerful theorem that handles a lot of reaction networks.

The deficiency zero theorem

It’s called Feinberg’s deficiency zero theorem, and we saw it last time. Very roughly, it says that if our reaction network is ‘weakly reversible’ and has ‘deficiency zero’, the rate equation will have equilibrium solutions that behave about as nicely as you could want.

Let’s see how this works. We need to remember some jargon:

Weakly reversible. A reaction network is weakly reversible if for every reaction v \to w in the network, there exists a path of reactions in the network starting at w and leading back to v.

Reversible. A reaction network is reversible if for every reaction v \to w in the network, w \to v is also a reaction in the network. Any reversible reaction network is weakly reversible. Our example is reversible, since it consists of reactions A + A \to B, B \to A + A.

But what about ‘deficiency zero’? We defined that concept last time, but let’s review:

Connected component. A reaction network gives a kind of graph with complexes as vertices and reactions as edges. Two complexes lie in the same connected component if we can get from one to the other by a path of reactions, where at each step we’re allowed to go either forward or backward along a reaction. Chemists call a connected component a linkage class. In our example there’s just one:

Stoichiometric subspace. The stoichiometric subspace is the subspace \mathrm{Stoch} \subseteq \mathbb{R}^S spanned by the vectors of the form w - v for all reactions v \to w in our reaction network. This subspace describes the directions in which a solution of the rate equation can move. In our example, it’s spanned by B - 2 A and 2 A - B, or if you prefer, (-2,1) and (2,-1). These vectors are linearly dependent, so the stoichiometric subspace has dimension 1.

Deficiency. The deficiency of a reaction network is the number of complexes, minus the number of connected components, minus the dimension of the stoichiometric subspace. In our example there are 2 complexes, 1 connected component, and the dimension of the stoichiometric subspace is 1. So, our reaction network has deficiency 2 – 1 – 1 = 0.

So, the deficiency zero theorem applies! What does it say? To understand it, we need a bit more jargon. First of all, a vector x \in \mathbb{R}^S tells us how much we’ve got of each species: the amount of species i \in S is the number x_i. And then:

Stoichiometric compatibility class. Given a vector v\in \mathbb{R}^S, its stoichiometric compatibility class is the subset of all vectors that we could reach using the reactions in our reaction network:

\{ v + w \; : \; w \in \mathrm{Stoch} \}

In our example, where the stoichiometric subspace is spanned by (2,-1), the stoichiometric compatibility class of the vector (v_1,v_2) is the line consisting of points

(x_1, x_2) = (v_1,v_2) + s(2,-1)

where the parameter s ranges over all real numbers. Notice that this line can also be written as

x_1 + 2x_2 = c

We’ve already seen that if we start with initial conditions on such a line, the solution will stay on this line. And that’s how it always works: as time passes, any solution of the rate equation stays in the same stoichiometric compatibility class!

In other words: the stoichiometric subspace is defined by a bunch of linear equations, one for each linear conservation law that all the reactions in our network obey.

Here a linear conservation law is a law saying that some linear combination of the numbers of species does not change.

Next:

Positivity. A vector in \mathbb{R}^S is positive if all its components are positive; this describes a container of chemicals where all the species are actually present. The positive stoichiometric compatibility class of x\in \mathbb{R}^S consists of all positive vectors in its stoichiometric compatibility class.

We finally have enough jargon in our arsenal to state the zero deficiency theorem. We’ll only state the part we need today:

Zero Deficiency Theorem (Feinberg). If a reaction network has deficiency zero and is weakly reversible, and the rate constants are positive, the rate equation has exactly one equilibrium solution in each positive stoichiometric compatibility class. Any sufficiently nearby solution that starts in the same class will approach this equilibrium as t \to +\infty.

In our example, this theorem says there’s just one positive equilibrium (x_1,x_2) in each line

x_1 + 2x_2 = c

We can find it by setting the time derivatives to zero:

\displaystyle{ \frac{d}{d t}x_1 =  2 \alpha x_2 - 2 \beta x_1^2 = 0 }

\displaystyle{ \frac{d}{d t} x_2 = -\alpha x_2 + \beta x_1^2 = 0 }

Solving these, we get

\displaystyle{ \frac{x_1^2}{x_2} = \frac{\alpha}{\beta} }

So, these are our equilibrium solutions. It’s easy to verify that indeed, there’s one of these in each stoichiometric compatibility class x_1 + 2x_2 = c. And the zero deficiency theorem also tells us that any sufficiently nearby solution that starts in the same class will approach this equilibrium as t \to \infty.
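In case you want this equilibrium written out explicitly: substituting x_2 = (c - x_1)/2 into \beta x_1^2 = \alpha x_2 gives the quadratic 2 \beta x_1^2 + \alpha x_1 - \alpha c = 0, whose unique positive root is

\displaystyle{ x_1 = \frac{-\alpha + \sqrt{\alpha^2 + 8 \alpha \beta c}}{4 \beta} }

so for each c > 0 there really is exactly one positive equilibrium on the line x_1 + 2x_2 = c.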

This partially explains what we saw before in our graphs. It shows that in the case \alpha = \beta = 1, any solution that starts by nearly having

\displaystyle{\frac{x_1^2}{x_2} = 1}

will actually have

\displaystyle{\lim_{t \to +\infty} \frac{x_1^2}{x_2} = 1 }

But in fact, in this example we don’t even need to start near the equilibrium for our solution to approach the equilibrium! What about in general? We don’t know, but just to get the ball rolling, we’ll risk the following wild guess:

Conjecture. If a reaction network has deficiency zero and is weakly reversible, and the rate constants are positive, the rate equation has exactly one equilibrium solution in each positive stoichiometric compatibility class, and any positive solution that starts in the same class will approach this equilibrium as t \to +\infty.

If anyone knows a proof or counterexample, we’d be interested. If this result were true, it would really clarify the dynamics of reaction networks in the zero deficiency case.

Next time we’ll talk about this same reaction network from a stochastic point of view, where we think of the atoms and molecules as reacting in a probabilistic way. And we’ll see how the conservation laws we’ve been talking about today are related to Noether’s theorem for Markov processes!


Network Theory (Part 17)

12 November, 2011

joint with Jacob Biamonte

We’ve seen how Petri nets can be used to describe chemical reactions. Indeed our very first example came from chemistry:

However, chemists rarely use the formalism of Petri nets. They use a different but entirely equivalent formalism, called ‘reaction networks’. So now we’d like to tell you about those.

You may wonder: why bother with another formalism, if it’s equivalent to the one we’ve already seen? Well, one goal of this network theory program is to get people from different subjects to talk to each other—or at least be able to. This requires setting up some dictionaries to translate between formalisms. Furthermore, lots of deep results on stochastic Petri nets are being proved by chemists—but phrased in terms of reaction networks. So you need to learn this other formalism to read their papers. Finally, this other formalism is actually better in some ways!

Reaction networks

Here’s a reaction network:

This network involves 5 species: that is, different kinds of things. They could be atoms, molecules, ions or whatever: chemists call all of these species, and there’s no need to limit the applications to chemistry: in population biology, they could even be biological species! We’re calling them A, B, C, D, and E, but in applications, we’d call them by specific names like CO2 and HCO3, or ‘rabbit’ and ‘wolf’, or whatever.

This network also involves 5 reactions, which are shown as arrows. Each reaction turns one bunch of species into another. So, written out more longwindedly, we’ve got these reactions:

A \to B

B \to A

A + C \to D

B + E \to A + C

B + E \to D

If you remember how Petri nets work, you can see how to translate any reaction network into a Petri net, or vice versa. For example, the reaction network we’ve just seen gives this Petri net:

Each species corresponds to a state of this Petri net, drawn as a yellow circle. And each reaction corresponds to a transition of this Petri net, drawn as a blue square. The arrows say how many things of each species appear as input or output to each transition. There’s less explicit emphasis on complexes in the Petri net notation, but you can read them off if you want them.

In chemistry, a bunch of species is called a ‘complex’. But what do I mean by ‘bunch’, exactly? Well, I mean that in a given complex, each species can show up 0,1,2,3… or any natural number of times. So, we can formalize things like this:

Definition. Given a set S of species, a complex of those species is a function C: S \to \mathbb{N}.

Roughly speaking, a reaction network is a graph whose vertices are labelled by complexes. Unfortunately, the word ‘graph’ means different things in mathematics—appallingly many things! Everyone agrees that a graph has vertices and edges, but there are lots of choices about the details. Most notably:

• We can either put arrows on the edges, or not.

• We can either allow more than one edge between vertices, or not.

• We can either allow edges from a vertex to itself, or not.

If we say ‘no’ in every case we get the concept of ‘simple graph’, which we discussed last time. At the other extreme, if we say ‘yes’ in every case we get the concept of ‘directed multigraph’, which is what we want now. A bit more formally:

Definition. A directed multigraph consists of a set V of vertices, a set E of edges, and functions s,t: E \to V saying the source and target of each edge.

Given this, we can say:

Definition. A reaction network is a set of species together with a directed multigraph whose vertices are labelled by complexes of those species.

You can now prove that reaction networks are equivalent to Petri nets:

Puzzle 1. Show that any reaction network gives a Petri net, and vice versa.

In a stochastic Petri net each transition is labelled by a rate constant: that is, a number in [0,\infty). This lets us write down some differential equations saying how species turn into each other. So, let’s make this definition (which is not standard, but will clarify things for us):

Definition. A stochastic reaction network is a reaction network where each reaction is labelled by a rate constant.

Now you can do this:

Puzzle 2. Show that any stochastic reaction network gives a stochastic Petri net, and vice versa.

For extra credit, show that in each of these puzzles we actually get an equivalence of categories! For this you need to define morphisms between Petri nets, morphisms between reaction networks, and similarly for stochastic Petri nets and stochastic reaction networks. If you get stuck, ask Eugene Lerman for advice. There are different ways to define morphisms, but he knows a cool one.

We’ve been downplaying category theory so far, but it’s been lurking beneath everything we do, and someday it may rise to the surface.

The deficiency zero theorem

You may have already noticed one advantage of reaction networks over Petri nets: they’re quicker to draw. This is true even for tiny examples. For example, this reaction network:

2 X_1 + X_2 \leftrightarrow 2 X_3

corresponds to this Petri net:

But there’s also a deeper advantage. As we saw in Part 8, any stochastic Petri net gives two equations:

• The master equation, which says how the probability that we have a given number of things of each species changes with time.

• The rate equation, which says how the expected number of things in each state changes with time.

The simplest solutions of these equations are the equilibrium solutions, where nothing depends on time. Back in Part 9, we explained when an equilibrium solution of the rate equation gives an equilibrium solution of the master equation. But when is there an equilibrium solution of the rate equation in the first place?

Feinberg’s ‘deficiency zero theorem’ gives a handy sufficient condition. And this condition is best stated using reaction networks! But to understand it, we need to understand the ‘deficiency’ of a reaction network. So let’s define that, and then say what all the words in the definition mean:

Definition. The deficiency of a reaction network is:

• the number of vertices

minus

• the number of connected components

minus

• the dimension of the stoichiometric subspace.

The first two concepts here are easy. A reaction network is a graph (okay, a directed multigraph). So, it has some number of vertices, and also some number of connected components. Two vertices lie in the same connected component iff you can get from one to the other by a path where you don’t care which way the arrows point. For example, this reaction network:

has 5 vertices and 2 connected components.

So, what’s the ‘stoichiometric subspace’? ‘Stoichiometry’ is a scary-sounding word. According to the Wikipedia article:

‘Stoichiometry’ is derived from the Greek words στοιχεῖον (stoicheion, meaning element) and μέτρον (metron, meaning measure.) In patristic Greek, the word Stoichiometria was used by Nicephorus to refer to the number of line counts of the canonical New Testament and some of the Apocrypha.

But for us, stoichiometry is just the art of counting species. To do this, we can form a vector space \mathbb{R}^S where S is the set of species. A vector in \mathbb{R}^S is a function from species to real numbers, saying how much of each species is present. Any complex gives a vector in \mathbb{R}^S, because it’s actually a function from species to natural numbers.

Definition. The stoichiometric subspace of a reaction network is the subspace \mathrm{Stoch} \subseteq \mathbb{R}^S spanned by vectors of the form x - y where x and y are complexes connected by a reaction.

‘Complexes connected by a reaction’ makes sense because vertices in the reaction network are complexes, and edges are reactions. Let’s see how it works in our example:

Each complex here can be seen as a vector in \mathbb{R}^S, which is a vector space whose basis we can call A, B, C, D, E. Each reaction gives a difference of two vectors coming from complexes:

• The reaction A \to B gives the vector B - A.

• The reaction B \to A gives the vector A - B.

• The reaction A + C \to D gives the vector D - A - C.

• The reaction B + E \to A + C gives the vector A + C - B - E.

• The reaction B + E \to D gives the vector D - B - E.

The pattern is obvious, I hope.

These 5 vectors span the stoichiometric subspace. But this subspace isn’t 5-dimensional, because these vectors are linearly dependent! The first vector is the negative of the second one. The last is the sum of the previous two. And those are all the linear dependencies, so the stoichiometric subspace is 3 dimensional. For example, it’s spanned by these 3 linearly independent vectors: A - B, D - A - C, and D - B - E.

I hope you see the moral of this example: the stoichiometric subspace is the space of ways to move in \mathbb{R}^S that are allowed by the reactions in our reaction network! And this is important because the rate equation describes how the amount of each species changes as time passes. So, it describes a point moving around in \mathbb{R}^S.

Thus, if \mathrm{Stoch} \subseteq \mathbb{R}^S is the stoichiometric subspace, and x(t) \in \mathbb{R}^S is a solution of the rate equation, then x(t) always stays within the set

x(0) + \mathrm{Stoch} = \{ x(0) + y \colon \; y \in \mathrm{Stoch} \}

Mathematicians would call this set the coset of x(0), but chemists call it the stoichiometric compatibility class of x(0).

Anyway: what’s the deficiency of the reaction network in our example? It’s

5 - 2 - 3 = 0

since there are 5 complexes, 2 connected components and the dimension of the stoichiometric subspace is 3.
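By the way, this bookkeeping is easy to automate. Here is a little Python sketch of our own that computes the deficiency of the example network straight from its data: the number of complexes, minus the number of connected components (found with a simple union-find), minus the rank of the matrix of reaction vectors:

import numpy as np

complexes = {                     # each complex as a vector in R^S, with S = {A, B, C, D, E}
    'A':   [1, 0, 0, 0, 0],
    'B':   [0, 1, 0, 0, 0],
    'A+C': [1, 0, 1, 0, 0],
    'D':   [0, 0, 0, 1, 0],
    'B+E': [0, 1, 0, 0, 1],
}
reactions = [('A', 'B'), ('B', 'A'), ('A+C', 'D'), ('B+E', 'A+C'), ('B+E', 'D')]

# connected components of the underlying undirected graph, via union-find
parent = {v: v for v in complexes}
def find(v):
    while parent[v] != v:
        v = parent[v]
    return v
for v, w in reactions:
    parent[find(v)] = find(w)
n_components = len({find(v) for v in complexes})

# dimension of the stoichiometric subspace = rank of the matrix of vectors w - v
vectors = [np.array(complexes[w]) - np.array(complexes[v]) for v, w in reactions]
dim_stoch = np.linalg.matrix_rank(np.array(vectors))

print(len(complexes), n_components, dim_stoch)       # 5 2 3
print(len(complexes) - n_components - dim_stoch)     # deficiency: 0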

But what’s the deficiency zero theorem? You’re almost ready for it. You just need to know one more piece of jargon! A reaction network is weakly reversible if whenever there’s a reaction going from a complex x to a complex y, there’s a path of reactions going back from y to x. Here the paths need to follow the arrows.

So, this reaction network is not weakly reversible:

since we can get from A + C to D but not back from D to A + C, and from B+E to D but not back, and so on. However, the network becomes weakly reversible if we add a reaction going back from D to B + E:

If a reaction network isn’t weakly reversible, one complex can turn into another, but not vice versa. In this situation, what typically happens is that as time goes on we have less and less of one species. We could have an equilibrium where there’s none of this species. But we have little right to expect an equilibrium solution of the rate equation that’s positive, meaning that it sits at a point x \in (0,\infty)^S, where there’s a nonzero amount of every species.

My argument here is not watertight: you’ll note that I fudged the difference between species and complexes. But it can be made so when our reaction network has deficiency zero:

Deficiency Zero Theorem (Feinberg). Suppose we are given a reaction network with a finite set of species S, and suppose its deficiency is zero. Then:

(i) If the network is not weakly reversible and the rate constants are positive, the rate equation does not have a positive equilibrium solution.

(ii) If the network is not weakly reversible and the rate constants are positive, the rate equation does not have a positive periodic solution, that is, a periodic solution lying in (0,\infty)^S.

(iii) If the network is weakly reversible and the rate constants are positive, the rate equation has exactly one equilibrium solution in each positive stoichiometric compatibility class. Any sufficiently nearby solution that starts in the same stoichiometric compatibility class will approach this equilibrium as t \to +\infty. Furthermore, there are no other positive periodic solutions.

This is quite an impressive result. We’ll look at an easy example next time.

References and remarks

The deficiency zero theorem was published here:

• Martin Feinberg, Chemical reaction network structure and the stability of complex isothermal reactors: I. The deficiency zero and deficiency one theorems, Chemical Engineering Science 42 (1987), 2229–2268.

These other explanations are also very helpful:

• Martin Feinberg, Lectures on reaction networks, 1979.

• Jeremy Gunawardena, Chemical reaction network theory for in-silico biologists, 2003.

At first glance the deficiency zero theorem might seem to settle all the basic questions about the dynamics of reaction networks, or stochastic Petri nets… but actually, it just means that deficiency zero reaction networks don’t display very interesting dynamics in the limit as t \to +\infty. So, to get more interesting behavior, we need to look at reaction networks that don’t have deficiency zero.

For example, in biology it’s interesting to think about ‘bistable’ chemical reactions: reactions that have two stable equilibria. An electrical switch of the usual sort is a bistable system: it has stable ‘on’ and ‘off’ positions. A bistable chemical reaction can serve as a kind of biological switch:

• Gheorghe Craciun, Yangzhong Tang and Martin Feinberg, Understanding bistability in complex enzyme-driven reaction networks, PNAS 103 (2006), 8697-8702.

It’s also interesting to think about chemical reactions with stable periodic solutions. Such a reaction can serve as a biological clock:

• Daniel B. Forger, Signal processing in cellular clocks, PNAS 108 (2011), 4281-4285.

