Network Theory (Part 19)

joint with Jacob Biamonte

It’s time to resume the network theory series! We’re writing a little book based on some of these posts, so we want to finish up our discussion of stochastic Petri nets and chemical reaction networks. But it’s been a long time, so we’ll assume you forgot everything we’ve said before, and make this post as self-contained as possible.

Last time we started looking at a simple example: a diatomic gas.

A diatomic molecule of this gas can break apart into two atoms:

A_2 \to A + A

and conversely, two atoms can combine to form a diatomic molecule:

A + A \to A_2

We can draw both these reactions using a chemical reaction network:

where we’re writing B instead of A_2 to abstract away some detail that’s just distracting here.

Last time we looked at the rate equation for this chemical reaction network, and found equilibrium solutions of that equation. Now let’s look at the master equation, and find equilibrium solutions of that. This will serve as a review of three big theorems.

The master equation

We’ll start from scratch. The master equation is all about how atoms or molecules or rabbits or wolves or other things interact randomly and turn into other things. So, let’s write \psi_{m,n} for the probability that we have m atoms of A and n molecules of B in our container. These probabilities are functions of time, and the master equation will say how they change.

First we need to pick a rate constant for each reaction. Let’s say the rate constant for this reaction is some number \alpha > 0:

B \to A + A

while the rate constant for the reverse reaction is some number \beta > 0:

A + A \to B

Before we make it pretty using the ideas we’ve been explaining all along, the master equation says

\begin{array}{ccr} \displaystyle{ \frac{d}{d t} \psi_{m,n} (t)} &=&  \alpha (n+1) \, \psi_{m-2,n+1}(t) \\ \\   &&- \; \alpha n \, \psi_{m,n}(t) \\  \\  && + \beta (m+2)(m+1) \, \psi_{m+2,n-1}(t) \\ \\ && - \;\beta m(m-1) \, \psi_{m,n}(t) \\ \\   \end{array}

where we define \psi_{i,j} to be zero if either i or j is negative.

Yuck!

Normally we don’t show you such nasty equations. Indeed the whole point of our work has been to demonstrate that by packaging the equations in a better way, we can understand them using high-level concepts instead of mucking around with millions of scribbled symbols. But we thought we’d show you what’s secretly lying behind our beautiful abstract formalism, just once.

Each term has a meaning. For example, the third one:

\beta (m+2)(m+1)\psi_{m+2,n-1}(t)

means that the reaction A + A \to B will tend to increase the probability of there being m atoms of A and n molecules of B if we start with m+2 atoms of A and n-1 molecules of B. This reaction can happen in (m+2)(m+1) ways. And it happens at a probabilistic rate proportional to the rate constant for this reaction, \beta.

We won’t go through the rest of the terms. It’s a good exercise to do so, but there could easily be a typo in the formula, since it’s so long and messy. So let us know if you find one!
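In fact, a quick numerical check can catch such typos. Here’s a minimal sketch (ours, not part of the main development; it assumes numpy and scipy, and the rate constants, cutoff and initial state are arbitrary choices) that integrates the master equation on the finitely many states with m + 2n \le K, which loses nothing here because m + 2n never changes:

```python
import numpy as np
from scipy.integrate import solve_ivp

alpha, beta, K = 1.0, 2.0, 10            # arbitrary rate constants and cutoff

states = [(m, n) for m in range(K + 1) for n in range(K + 1) if m + 2 * n <= K]
index = {s: i for i, s in enumerate(states)}

def dpsi_dt(t, psi):
    def p(i, j):                          # psi_{i,j}, zero outside the grid
        return psi[index[(i, j)]] if (i, j) in index else 0.0
    dpsi = np.empty(len(states))
    for (m, n), pos in index.items():
        dpsi[pos] = (alpha * (n + 1) * p(m - 2, n + 1)
                     - alpha * n * p(m, n)
                     + beta * (m + 2) * (m + 1) * p(m + 2, n - 1)
                     - beta * m * (m - 1) * p(m, n))
    return dpsi

psi0 = np.zeros(len(states))
psi0[index[(K, 0)]] = 1.0                 # start with K lone atoms, no molecules
sol = solve_ivp(dpsi_dt, (0.0, 10.0), psi0, rtol=1e-8, atol=1e-10)
print(sol.y[:, -1].min() >= -1e-8)        # probabilities stay nonnegative
print(abs(sol.y[:, -1].sum() - 1) < 1e-6) # and they still sum to one
```

With a sign or coefficient wrong, the probabilities would typically drift negative or fail to sum to one.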

To simplify this mess, the key trick is to introduce a generating function that summarizes all the probabilities in a single power series:

\Psi = \sum_{m,n \ge 0} \psi_{m,n} y^m \, z^n

It’s a power series in two variables, y and z, since we have two chemical species: As and Bs.

Using this trick, the master equation looks like

\displaystyle{ \frac{d}{d t} \Psi(t) = H \Psi(t) }

where the Hamiltonian H is a sum of terms, one for each reaction. This Hamiltonian is built from operators that annihilate and create As and Bs. The annihilation and creation operators for A atoms are:

\displaystyle{ a = \frac{\partial}{\partial y} , \qquad a^\dagger = y }

The annihilation operator differentiates our power series with respect to the variable y. The creation operator multiplies it by that variable. Similarly, the annihilation and creation operators for B molecules are:

\displaystyle{ b = \frac{\partial}{\partial z} , \qquad b^\dagger = z }

In Part 8 we explained a recipe that lets us stare at our chemical reaction network and write down this Hamiltonian:

H = \alpha ({a^\dagger}^2 b - b^\dagger b) + \beta (b^\dagger a^2 - {a^\dagger}^2 a^2)

As promised, there’s one term for each reaction. But each term is itself a sum of two parts: one that increases the probability that our container of chemicals will be in a new state, and another that decreases the probability that it’s in its original state. We get a total of four terms, which correspond to the four terms in our previous way of writing the master equation.
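By the way, you can get a computer algebra system to check this recipe against the component form of the master equation. Here’s a small sympy sketch of ours (a brute-force cousin of Puzzle 1 below), whose only inputs are the definitions of a, a^\dagger, b, b^\dagger just given:

```python
import sympy as sp

y, z = sp.symbols('y z')
m, n = sp.symbols('m n', positive=True, integer=True)
alpha, beta = sp.symbols('alpha beta', positive=True)

a  = lambda f: sp.diff(f, y)      # annihilate an A
ad = lambda f: y * f              # create an A
b  = lambda f: sp.diff(f, z)      # annihilate a B
bd = lambda f: z * f              # create a B

def H(f):
    return (alpha * (ad(ad(b(f))) - bd(b(f)))
            + beta * (bd(a(a(f))) - ad(ad(a(a(f))))))

mono = y**m * z**n
# what the four terms of the component master equation say H should do
# to the monomial y^m z^n:
expected = (alpha * n * (y**(m + 2) * z**(n - 1) - mono)
            + beta * m * (m - 1) * (y**(m - 2) * z**(n + 1) - mono))
print(sp.simplify(H(mono) - expected))    # 0
```

Comparing coefficients of y^m z^n on both sides of \frac{d}{dt}\Psi = H \Psi then gives back the four messy terms we started with.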

Puzzle 1. Show that this new way of writing the master equation is equivalent to the previous one.

Equilibrium solutions

Now we will look for all equilibrium solutions of the master equation: in other words, solutions that don’t change with time. So, we’re trying to solve

H \Psi = 0

Given the rather complicated form of the Hamiltonian, this seems tough. The challenge looks more concrete but even more scary if we go back to our original formulation. We’re looking for probabilities \psi_{m,n}, nonnegative numbers that sum to one, such that

\begin{array}{ccr} 0 &=&  \alpha (n+1) \, \psi_{m-2,n+1} \\ \\   &&- \; \alpha n \, \psi_{m,n} \\  \\  && + \beta (m+2)(m+1) \, \psi_{m+2,n-1} \\ \\ && - \;\beta m(m-1) \, \psi_{m,n} \\ \\   \end{array}

for all m and n.

This equation is horrid! But the good news is that it’s linear, so a linear combination of solutions is again a solution. This lets us simplify the problem using a conserved quantity.

Clearly, there’s a quantity that the reactions here don’t change:

What’s that? It’s the number of As plus twice the number of Bs. After all, a B can turn into two As, or vice versa.

(Of course the secret reason is that B is a diatomic molecule made of two As. But you’d be able to follow the logic here even if you didn’t know that, just by looking at the chemical reaction network… and sometimes this more abstract approach is handy! Indeed, the way chemists first discovered that certain molecules are made of certain atoms is by seeing which reactions were possible and which weren’t.)

Suppose we start in a situation where we know for sure that the number of As plus twice the number of Bs equals some number k:

\psi_{m,n} = 0  \;  \textrm{unless} \; m+2n = k

Then we know \Psi is initially of the form

\displaystyle{ \Psi = \sum_{m+2n = k} \psi_{m,n} \, y^m z^n  }

But since the number of As plus twice the number of Bs is conserved, if \Psi obeys the master equation it will continue to be of this form!

Put a fancier way, we know that if a solution of the master equation starts in this subspace:

\displaystyle{ L_k =  \{ \Psi: \; \Psi = \sum_{m+2n = k} \psi_{m,n} y^m z^n \; \textrm{for some} \; \psi_{m,n} \}  }

it will stay in this subspace. So, because the master equation is linear, we can take any solution \Psi and write it as a linear combination of solutions \Psi_k, one in each subspace L_k.

In particular, we can do this for an equilibrium solution \Psi. And then all the solutions \Psi_k are also equilibrium solutions: they’re linearly independent, so if one of them changed with time, \Psi would too.

This means we can just look for equilibrium solutions in the subspaces L_k. If we find these, we can get all equilibrium solutions by taking linear combinations.

Once we’ve noticed that, our horrid equation makes a bit more sense:

\begin{array}{ccr} 0 &=&  \alpha (n+1) \, \psi_{m-2,n+1} \\ \\   &&- \; \alpha n \, \psi_{m,n} \\  \\  && + \beta (m+2)(m+1) \, \psi_{m+2,n-1} \\ \\ && - \;\beta m(m-1) \, \psi_{m,n} \\ \\   \end{array}

Note that if the pair of subscripts m, n obey m + 2n = k, the same is true for the other pairs of subscripts here! So our equation relates the values of \psi_{m,n} for all the points (m,n) with integer coordinates lying on this line segment:

m+2n = k , \qquad m ,n \ge 0

You should be visualizing something like this:

If you think about it a minute, you’ll see that if we know \psi_{m,n} at two points on such a line, we can keep using our equation to recursively work out all the rest. So, there are at most two linearly independent equilibrium solutions of the master equation in each subspace L_k.

Why at most two? Why not two? Well, we have to be a bit careful about what happens at the ends of the line segment: remember that \psi_{m,n} is defined to be zero when m or n becomes negative. If we think very hard about this, we’ll see there’s just one linearly independent equilibrium solution of the master equation in each subspace L_k. But this is the sort of nitty-gritty calculation that’s not fun to watch someone else do, so we won’t bore you with that.
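(If you’d rather make a computer do that nitty-gritty calculation, here’s a sketch of ours, with numpy and arbitrary choices of \alpha, \beta, k: restrict the master equation to the finitely many states on the segment m + 2n = k and count the linearly independent equilibrium solutions.)

```python
import numpy as np

alpha, beta, k = 1.0, 2.0, 9              # arbitrary choices

states = [(k - 2 * n, n) for n in range(k // 2 + 1)]  # all (m, n) with m + 2n = k
idx = {s: i for i, s in enumerate(states)}

H = np.zeros((len(states), len(states)))
for (m, n), j in idx.items():
    if n >= 1:                             # B -> A + A at rate alpha * n
        H[idx[(m + 2, n - 1)], j] += alpha * n
        H[j, j] -= alpha * n
    if m >= 2:                             # A + A -> B at rate beta * m * (m - 1)
        H[idx[(m - 2, n + 1)], j] += beta * m * (m - 1)
        H[j, j] -= beta * m * (m - 1)

S = np.linalg.svd(H, compute_uv=False)
print((S < 1e-9 * S.max()).sum())          # 1: the kernel is one-dimensional
```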

Soon we’ll move on to a more high-level approach to this problem. But first, one remark. Our horrid equation is like a fancy version of the usual discretized form of the equation

\displaystyle {\frac{d^2 \psi}{d x^2} = 0 }

namely:

\psi_{n-1} - 2 \psi_{n} + \psi_{n+1} = 0

And this makes sense, since we get

\displaystyle {\frac{d^2 \psi}{d x^2} = 0 }

by taking the heat equation:

\displaystyle \frac{\partial \psi}{\partial t} = {\frac{\partial^2 \psi}{\partial x^2} }

and assuming \psi doesn’t depend on time. So what we’re doing is a lot like looking for equilibrium solutions of the heat equation.

The heat equation describes how heat smears out as little particles of heat randomly move around. True, there don’t really exist ‘little particles of heat’, but this equation also describes the diffusion of any other kind of particles as they randomly move around undergoing Brownian motion. Similarly, our master equation describes a random walk on this line segment:

m+2n = k , \qquad m , n \ge 0

or more precisely, the points on this segment with integer coordinates. The equilibrium solutions arise when the probabilities \psi_{m,n} have diffused as much as possible.

If you think about it this way, it should be physically obvious that there’s just one linearly independent equilibrium solution of the master equation for each value of k.

There’s a general moral here, too, which we’re seeing in a special case: the master equation for a chemical reaction network really describes a bunch of random walks, one for each allowed value of the conserved quantities that can be built as linear combinations of number operators. In our case we have one such conserved quantity, but in general there may be more (or none).

Furthermore, these ‘random walks’ are what we’ve been calling Markov processes.

Noether’s theorem

We simplified our task of finding equilibrium solutions of the master equation by finding a conserved quantity. The idea of simplifying problems using conserved quantities is fundamental to physics: this is why physicists are so enamored with quantities like energy, momentum, angular momentum and so on.

Nowadays physicists often use ‘Noether’s theorem’ to get conserved quantities from symmetries. There’s a very simple version of Noether’s theorem for quantum mechanics, but in Part 11 we saw a version for stochastic mechanics, and it’s that version that is relevant now. Here’s a paper which explains it in detail:

• John Baez and Brendan Fong, A Noether theorem for Markov processes.

We don’t really need Noether’s theorem now, since we found the conserved quantity and exploited it without even noticing the symmetry. Nonetheless it’s interesting to see how it relates to what we’re doing.

For the reaction we’re looking at now, the idea is that the subspaces L_k are eigenspaces of an operator that commutes with the Hamiltonian H. It follows from standard math that a solution of the master equation that starts in one of these subspaces, stays in that subspace.

What is this operator? It’s built from ‘number operators’. The number operator for As is

N_A = a^\dagger a

and the number operator for Bs is

N_B = b^\dagger b

A little calculation shows

N_A \,y^m z^n = m \, y^m z^n, \quad \qquad  N_B\, y^m z^n = n \,y^m z^n

so the eigenvalue of N_A is the number of As, while the eigenvalue of N_B is the number of Bs. This is why they’re called number operators.

As a consequence, the eigenvalue of the operator N_A + 2N_B is the number of As plus twice the number of Bs:

(N_A + 2N_B) \, y^m z^n = (m + 2n) \, y^m z^n

Let’s call this operator O, since it’s so important:

O = N_A + 2N_B

If you think about it, the spaces L_k we saw a minute ago are precisely the eigenspaces of this operator:

L_k = \{ \Psi : \; O \Psi = k \Psi \}

As we’ve seen, solutions of the master equation that start in one of these eigenspaces will stay there. This lets us take some techniques that are very familiar in quantum mechanics and apply them to this stochastic situation.

First of all, time evolution as described by the master equation is given by the operators \exp(t H). In other words,

\displaystyle{ \frac{d}{d t} \Psi(t) } = H \Psi(t) \; \textrm{and} \;  \Psi(0) = \Phi \quad  \Rightarrow \quad \Psi(t) = \exp(t H) \Phi
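On each subspace L_k this is an honest matrix exponential, so we can even watch it happen. Continuing the little numpy sketch above (reusing the matrix H we built for the segment m + 2n = k), \exp(t H) should send probability distributions to probability distributions:

```python
import numpy as np
from scipy.linalg import expm

U = expm(1.5 * H)                          # time evolution for t = 1.5, say
print(np.allclose(U.sum(axis=0), 1.0))     # columns sum to one...
print((U >= -1e-12).all())                 # ...and are (numerically) nonnegative
```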

But if you start in some eigenspace of O, you stay there. Thus if \Phi is an eigenvector of O, so is \exp(t H) \Phi, with the same eigenvalue. In other words,

O \Phi = k \Phi

implies

O \exp(t H) \Phi = k \exp(t H) \Phi = \exp(t H) O \Phi

But since we can choose a basis consisting of eigenvectors of O, we must have

O \exp(t H) = \exp(t H) O

or, throwing caution to the winds and differentiating:

O H = H O

So, as we’d expect from Noether’s theorem, our conserved quantity commutes with the Hamiltonian! This in turn implies that H commutes with any polynomial in O, which in turn suggests that

\exp(s O) H = H \exp(s O)

and also

\exp(s O) \exp(t H) = \exp(t H) \exp(s O)

The last equation says that O generates a 1-parameter family of ‘symmetries’: operators \exp(s O) that commute with time evolution. But what do these symmetries actually do? Since

O y^m z^n = (m + 2n) y^m z^n

we have

\exp(s O) y^m z^n = e^{s(m + 2n)}\, y^m z^n

So, this symmetry takes any probability distribution \psi_{m,n} and multiplies it by e^{s(m + 2n)}.

In other words, our symmetry multiplies the relative probability of finding our container of gas in a given state by a factor of e^s for each A atom, and by a factor of e^{2s} for each B molecule. It might not seem obvious that this operation commutes with time evolution! However, experts on chemical reaction theory are familiar with this fact.
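Here is a numerical rendition of these claims, a sketch of ours with arbitrary constants: we rebuild H on the grid of states with m + 2n \le K, which is harmless since m + 2n is conserved, write O as a diagonal matrix, and check both [H, O] = 0 (Puzzle 4 in matrix form) and the commutation of the exponentials:

```python
import numpy as np
from scipy.linalg import expm

alpha, beta, K = 1.0, 2.0, 8
grid = [(m, n) for m in range(K + 1) for n in range(K + 1) if m + 2 * n <= K]
loc = {s: i for i, s in enumerate(grid)}

H = np.zeros((len(grid), len(grid)))
for (m, n), j in loc.items():
    if n >= 1:                             # B -> A + A
        H[loc[(m + 2, n - 1)], j] += alpha * n
        H[j, j] -= alpha * n
    if m >= 2:                             # A + A -> B
        H[loc[(m - 2, n + 1)], j] += beta * m * (m - 1)
        H[j, j] -= beta * m * (m - 1)

O = np.diag([float(m + 2 * n) for (m, n) in grid])
print(np.allclose(H @ O, O @ H))           # [H, O] = 0
s, t = 0.3, 1.5
print(np.allclose(expm(s * O) @ expm(t * H),
                  expm(t * H) @ expm(s * O)))  # exp(sO) commutes with exp(tH)
```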

Finally, a couple of technical points. Starting where we said “throwing caution to the winds”, our treatment has not been rigorous, since O and H are unbounded operators, and these must be handled with caution. Nonetheless, all the commutation relations we wrote down are true.

The operators \exp(s O) are unbounded for positive s. They’re bounded for negative s, so they give a one-parameter semigroup of bounded operators. But they’re not stochastic operators: even for s negative, they don’t map probability distributions to probability distributions. However, they do map any nonzero vector \Psi with \psi_{m,n} \ge 0 to a vector \exp(s O) \Psi with the same properties. So, we can just normalize this vector and get a probability distribution. The need for this normalization is why we spoke of relative probabilities.

The Anderson–Craciun–Kurtz theorem

Now we’ll actually find all equilibrium solutions of the master equation in closed form. To understand this final section, you really do need to remember some things we’ve discussed earlier. Last time we considered the same chemical reaction network we’re studying today, but we looked at its rate equation, which looks like this:

\displaystyle{ \frac{d}{d t} x_1 =  2 \alpha x_2 - 2 \beta x_1^2}

\displaystyle{ \frac{d}{d t} x_2 = - \alpha x_2 + \beta x_1^2 }

This describes how the numbers of As and Bs change in the limit where there are lots of them and we can treat them as varying continuously, in a deterministic way. The number of As is x_1, and the number of Bs is x_2.

We saw that the quantity

x_1 + 2 x_2

is conserved, just as today we’ve seen that N_A + 2 N_B is conserved. We saw that the rate equation has one equilibrium solution for each choice of x_1 + 2 x_2. And we saw that these equilibrium solutions obey

\displaystyle{ \frac{x_1^2}{x_2} = \frac{\alpha}{\beta} }
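This is easy to watch numerically. Here’s a sketch of ours (arbitrary rate constants and initial data, using scipy), where x_1 + 2x_2 stays fixed while x_1^2/x_2 relaxes to \alpha/\beta:

```python
from scipy.integrate import solve_ivp

alpha, beta = 1.0, 2.0                     # arbitrary rate constants

def rate(t, x):
    x1, x2 = x
    return [2 * alpha * x2 - 2 * beta * x1**2,
            -alpha * x2 + beta * x1**2]

sol = solve_ivp(rate, (0.0, 50.0), [5.0, 1.0], rtol=1e-10)
x1, x2 = sol.y[:, -1]
print(x1 + 2 * x2)                         # 7.0: conserved, same as at t = 0
print(x1**2 / x2, alpha / beta)            # these agree at equilibrium
```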

The Anderson–Craciun–Kurtz theorem, introduced in Part 9, is a powerful result that gets equilibrium solutions of the master equation from equilibrium solutions of the rate equation. It only applies to equilibrium solutions that are ‘complex balanced’, but that’s okay:

Puzzle 2. Show that the equilibrium solutions of the rate equation for the chemical reaction network

are complex balanced.

So, given any equilibrium solution (x_1,x_2) of our rate equation, we can hit it with the Anderson–Craciun–Kurtz theorem and get an equilibrium solution of the master equation! And it looks like this:

\displaystyle{  \Psi = e^{-(x_1 + x_2)} \, \sum_{m,n \ge 0} \frac{x_1^m x_2^n} {m! n! } \, y^m z^n }

In this solution, the probability distribution

\displaystyle{ \psi_{m,n} = e^{-(x_1 + x_2)} \,  \frac{x_1^m x_2^n} {m! n! } }

is a product of Poisson distributions. The factor in front is there to make the numbers \psi_{m,n} add up to one. And remember, x_1, x_2 are any nonnegative numbers with

\displaystyle{ \frac{x_1^2}{x_2} = \frac{\alpha}{\beta} }

So from all we’ve said, the above formula is an explicit closed-form solution of the horrid equation

\begin{array}{ccr} 0 &=&  \alpha (n+1) \, \psi_{m-2,n+1} \\ \\   &&- \; \alpha n \, \psi_{m,n} \\  \\  && + \beta (m+2)(m+1) \, \psi_{m+2,n-1} \\ \\ && - \;\beta m(m-1) \, \psi_{m,n} \\ \\   \end{array}

That’s pretty nice. We found some solutions without ever doing any nasty calculations.
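Still, it’s reassuring to check this directly. Here’s a quick numerical sketch of ours: pick x_1 freely, let x_2 be forced by x_1^2/x_2 = \alpha/\beta, and feed the product of Poisson distributions into the horrid equation:

```python
import math

alpha, beta, x1 = 1.0, 2.0, 1.5            # arbitrary; x2 is then forced
x2 = beta * x1**2 / alpha                  # so that x1**2 / x2 == alpha / beta

def psi(m, n):                             # the product of Poisson distributions
    if m < 0 or n < 0:
        return 0.0
    return (math.exp(-(x1 + x2)) * x1**m * x2**n
            / (math.factorial(m) * math.factorial(n)))

worst = max(abs(alpha * (n + 1) * psi(m - 2, n + 1)
                - alpha * n * psi(m, n)
                + beta * (m + 2) * (m + 1) * psi(m + 2, n - 1)
                - beta * m * (m - 1) * psi(m, n))
            for m in range(30) for n in range(30))
print(worst)                               # ~ 1e-16: zero up to rounding
```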

But we’ve really done better than getting some equilibrium solutions of the master equation. By restricting attention to m, n with m+2n = k, our formula for \psi_{m,n} gives an equilibrium solution that lives in the eigenspace L_k:

\displaystyle{  \Psi_k = e^{-(x_1 + x_2)} \, \sum_{m+2n =k} \frac{x_1^m x_2^n} {m! n! } \, y^m z^n }

And by what we’ve said, linear combinations of these give all equilibrium solutions of the master equation.

And we got them with very little work! Despite all the fancy talk in today’s post, we essentially just took the equilibrium solutions of the rate equation and plugged them into a straightforward formula to get equilibrium solutions of the master equation. This is why the Anderson–Craciun–Kurtz theorem is so nice. And of course we’re looking at a very simple reaction network: for more complicated ones it becomes even better to use this theorem to avoid painful calculations.

We could go further. For example, we could study nonequilibrium solutions using Feynman diagrams like this:

But instead, we will leave off with two more puzzles. We introduced some symmetries, but we haven’t really explored them yet:

Puzzle 3. What do the symmetries associated to the conserved quantity O do to the equilibrium solutions of the master equation given by

\displaystyle{  \Psi = e^{-(x_1 + x_2)} \, \sum_{m,n \ge 0} \frac{x_1^m x_2^n} {m! n! } \, y^m z^n }

where (x_1,x_2) is an equilibrium solution of the rate equation? In other words, what is the significance of the one-parameter family of solutions

\exp(s O) \Psi ?

Also, we used a conceptual argument to check that H commutes with O, but it’s good to know that we can check this sort of thing directly:

Puzzle 4. Compute the commutator

[H, O] = H O - O H

and show it vanishes.

41 Responses to Network Theory (Part 19)

  1. This is a nice post and I’m glad Jacob sent the link. I’d just like to point out that the conserved quantity, k, is just the mass. This gives a more physical interpretation to your symmetry.

    • John Baez says:

      Good point—yes, k is proportional to the mass. I usually think of it as the total number of A atoms after we admit that a B molecule is made of two A atoms. My reason is that if we had a chemical reaction network where the chemicals were made of many kinds of elements, the number of atoms of each element would be a conserved quantity (barring radioactive decay or other forms of transmutation). Mass would be a linear combination of all these conserved quantities.

  2. Jamie Vicary says:

    Hi! This is nice to read.

    Something I feel strongly when I read your posts on these things is that I’m missing the ‘big picture’. We start with something primordial and beautiful — reaction networks, of chemicals or animals or probabilities or quantum particles — but then we start doing things that seem a bit too clever for their own good, like defining “complex balance”, and finding ingenious symmetries.

    What should come out of this research is an understanding of why these are the right notions to introduce, for basic reasons. And the reasons should be beautiful and profound, because the basic constituents are.

    • John Baez says:

      I agree that we should figure this out more deeply. The chemists have done their work; now it’s time for the category theorists to do theirs.

      The concept of ‘complex balanced’ seemed very odd when I first met it. The proof of the deficiency zero theorem, which I’ll get to soon, reveals some of its significance. There’s this important interplay between the species (the generating objects in our symmetric monoidal category) and the complexes (essentially all the objects in this category). Equilibrium for the master equation seems like an obvious concept, but it involves just the generating objects. Complex balanced equilibrium involves all the objects. But I still need to understand this better.

      On the other hand, there’s nothing ‘ingenious’ about the symmetry in this post. Noether’s theorem says that conserved quantities give symmetries and vice versa. When you have a diatomic gas splitting into atoms and recombining into molecules, the number of atoms is a perfectly obvious conserved quantity. So blame Noether, if you like, for any ingenuity involved here.

    • Jamie Vicary says:

      That symmetry’s perfectly obvious. But I’m not sure why you say “the” symmetry; there are plenty of other less-obvious symmetries you describe here.

      • John Baez says:

        By “the symmetry” I mean the 1-parameter family of symmetries \exp(s O) associated with the conserved quantity O = N_A + 2 N_B, which is secretly just the total number of A atoms, since a B is made of two A‘s. That’s the only symmetry I remember discussing here.

        By the way I’m pretty sure O, and functions of O, are the only conserved quantities in this problem.

      • Jamie Vicary says:

        You’re right: what I said didn’t make much sense. But you do say “It might not seem obvious that this operation commutes with time evolution!”. This is all I mean.

        I feel this stuff somehow jumps to the fun juicy stuff, but misses out some sort of intermediate layer of understanding that needs to be better understood. That by no means denigrates the juice.

      • John Baez says:

        Yes, that’s an example of an obvious conserved quantity that gives rise to some non-obvious symmetries. Last night I added a puzzle at the end of this post which lures the reader into learning more about the meaning of these symmetries. In the end they’re not so surprising.

        I wish these notes could provide this ‘intermediate layer of understanding’ you want. I imagine there are some things I understand and a lot I don’t. I’m writing about this stuff as I learn it, mainly from papers by chemists, and these guys naturally go straight for the ‘juicy stuff’… unlike mathematicians, who like to ponder fundamental concepts. The explanations here go deeper into the basics than the papers I’m reading, but it’s just the start of a long process of digging…

        Also, these posts about chemical reaction networks are just the start of a long series where I consider other kinds of networks too! Next on the list is electrical circuits, and then Bayesian networks. Only when we consider these all side by side will we really understand what’s going on…

      • John Baez says:

        By the way, Gromov said:

        The common and unfortunate fact of the lack of adequate presentation of basic ideas and motivations of almost any mathematical theory is probably due to the binary nature of mathematical perception.

        Either you have no inkling of the idea, or, once you have understood it, the whole idea seems so embarrassingly obvious that you feel reluctant to say it out loud.

  3. Jamie Vicary says:

    Hey, things keep changing after I reply to them! This isn’t fair…

  4. Greg Egan says:

    A couple of minor typos:

    n-1 molecules of AO” should be “n-1 molecules of B“.

    In “A little calculation shows N_A y^m z_2^n” that should be just N_A y^m z^n.

    To be really pedantic, I think those plural A‘s and B‘s are a case of grocer’s apostrophes.

    • John Baez says:

      Thanks for the corrections! I’d be glad to get rid of those apostrophes because on this blog they come out backwards-looking: the half-intelligent software interprets them as the beginning of a quote ‘like this’. But I thought mathematicians were supposed to write

      “how many x’s are there in this equation?”

      not

      “how many xs are there in this equation?”

      I’ll wait for more feedback from pedantic readers here before killing off those apostrophes. I really try to write right.

    • Jamie Vicary says:

      I’m very pedantic and I agree with Greg.

      The best reason I can come up with to have them in is to reduce the jarring effect of two typefaces, of the mathematical symbol and the “s”, immediately next to one another.

      The best reasons to ditch them are that they go against ordinary rules of grammar, and there are real situations where you might want a mathematical object to actually possess something, so you want the apostrophe to mean something when it is there.

    • Greg Egan says:

      Opinions vary on when it’s correct to form a plural with an apostrophe, so I was being a bit tongue-in-cheek, though when you have capital letters in a different font I don’t think there’s any risk of ambiguity if you drop the apostrophe. There’s a good summary of the range of views for various cases in the Wikipedia article.

    • And another typo (or a new one ;-) : after “A little calculation shows” N_B should have eigenvalue n.

  5. John Baez says:

    A test to see if I can force the apostrophe to point the right way after a math symbol:

    A’s

    Yes! The trick is to cut-and-paste an apostrophe into the desired location, rather than just typing one.

  6. Greg Egan says:

    To respond to Puzzle 3, I think the symmetry \exp(s O) maps the equilibrium solution of the master equation associated with the solution (x_1, x_2) of the rate equation to that associated with (e^s x_1, e^{2s} x_2).

    Clearly the equation \frac{x_1^2}{x_2} = \frac{\alpha}{\beta} is still satisfied by the new concentrations x_1'=e^s x_1 and x_2'=e^{2s} x_2. The symmetry \exp(s O) ought to preserve the spaces L_k corresponding to the conserved quantity k, even though x_1'+2x_2' \neq x_1+2x_2, which is a bit confusing.

    • John Baez says:

      Greg wrote:

      I think the symmetry \exp(s O) maps the equilibrium solution of the master equation associated with the solution (x_1, x_2) of the rate equation to that associated with (e^s x_1, e^{2s} x_2).

      Right!

      Clearly the equation

      \displaystyle{ \frac{x_1^2}{x_2} = \frac{\alpha}{\beta}}

      is still satisfied by the new concentrations x_1'=e^s x_1 and x_2'=e^{2s} x_2.

      Right! I feel like rhapsodizing about this a bit:

      We’re studying a diatomic gas in equilibrium, assuming fixed rates for the reactions

      A + A \to A_2

      and

      A_2 \to A + A

      Of course these rates will depend on various things, but we’re treating them as fixed.

      The rate equation lets us find the equilibrium in the limit of large numbers of atoms. In this limit, we’ve seen the number of diatomic molecules is proportional to the square of the number of lone atoms. So, the set of equilibria forms half a parabola:

      \displaystyle{ \frac{x_1^2}{x_2} = \frac{\alpha}{\beta}, \qquad x_1, x_2 \ge 0 }

      There’s a one-parameter group of symmetries:

      (x_1, x_2) \mapsto (e^s x_1, e^{2s} x_2)

      that maps this parabola to itself. For example, we can multiply the number of lone atoms by 1.5 and multiply the number of molecules by 2.25.

      More surprisingly, this symmetry exists even when we consider small numbers of atoms and molecules, where we treat these numbers as integers instead of real numbers. If we have 3 atoms, we can’t multiply the number of atoms by 1.5. So this is a bit shocking at first!

      The trick is to treat the gas stochastically using the master equation rather than deterministically using the rate equation. What our symmetry does is multiply the relative probability of finding our container of gas in a given state by a factor of e^s for each lone atom, and by a factor of e^{2s} for each molecule.

      This symmetry commutes with time evolution as given by the master equation. And for probability distributions that are products of Poisson distributions, this symmetry has the effect of multiplying the mean number of lone atoms by e^s, and the mean number of molecules by e^{2s}.

      But as you note, this symmetry has the property that if we start in a state with a definite total number of atoms (that is, lone atoms plus twice the number of molecules), it will map us to another state with the same total number of atoms!

      And if we start in a state with a definite number of lone atoms and a definite number of molecules, the symmetry will leave this state completely unchanged!

      These facts sound paradoxical at first, but of course they’re not. They’re just a bit weird.

      In fact they’re a lot like this other weird fact. If we take a quantum system and start it off in an eigenstate of energy, it will never change, except for an unobservable phase. Every state is a superposition of energy eigenstates. So you might think that nothing can ever change in quantum mechanics. But that’s wrong: the phases that are unobservable in a single energy eigenstate become observable relative phases in a superposition.

      Indeed the math is exactly the same, except now we’re multiplying relative probabilities by positive real numbers, instead of multiplying relative amplitudes by complex numbers!

      • nick says:

        More abstractly: according to Lie theory, the action of the symmetry group associated with a conserved quantity of the underlying system maps solutions to solutions.

  7. Greg Egan says:

    I guess what confused me is that for the Anderson–Craciun–Kurtz solution \Psi, if the number of atoms of A is m and the number of molecules of B is n, then (\bar{m}, \bar{n}) = (x_1, x_2). So if k=m+2n, we have \bar{k} = x_1 + 2x_2.

    If we look at the components \Psi_k of \Psi in the eigenspaces L_k with definite values for k, the symmetry \exp(s O) will take each \Psi_k to another vector in L_k. But the relative size of these components won’t be preserved, so I shouldn’t expect \bar{k} to be preserved.

    On the other hand, \exp(s O) applied to an individual \Psi_k will certainly preserve the definite value of k. But in that case, we no longer have \bar{k} = x_1 + 2x_2.

    • John Baez says:

      I did to you the same thing that annoyed Jamie: keep expanding my original reply to your comment. The last two paragraphs in my reply above note that this curious (but not really paradoxical, and ultimately perfectly reasonable) phenomenon is something we’ve seen before in quantum mechanics. We’re now seeing its analogue in stochastic mechanics.

      • John Baez says:

        But if you prefer, you can just think of it as a fact about Poisson distributions: take the Poisson distribution with some mean, multiply it by an exponential, and you get a new Poisson distribution with a new mean.
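        Here’s that fact as a small numerical check (a sketch; scipy assumed, the numbers arbitrary): multiply the Poisson distribution of mean x by e^{s m}, renormalize, and you get the Poisson distribution of mean e^s x.

        ```python
        import numpy as np
        from scipy.stats import poisson

        x, s, m = 1.3, 0.4, np.arange(60)       # arbitrary mean, parameter s, cutoff
        p = poisson.pmf(m, x) * np.exp(s * m)   # multiply by the exponential...
        p /= p.sum()                            # ...and renormalize
        print(np.allclose(p, poisson.pmf(m, np.exp(s) * x)))   # True
        ```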

  8. John Baez says:

    Some of the equations in this post look better in the version on my website. This version also has the complete answer for Puzzle 3 nicely written up.

    You can get the whole series on my website.

  9. Hi, very intriguing post (and beautiful series, this one on network theory). There is however a hidden symmetry that you are apparently ignoring (not a continuous one, though): the one for time-reversal. At least you did not comment on its existence in your particular example.

    Chemical networks often are not symmetric under time-reversal, because they represent phenomena out of thermal equilibrium. This is why I prefer to say, when I set the time-derivative on the l.h.s. of a master equation to zero, that I’m looking for the “stationary solution”, not the “equilibrium” one, reserving the word “equilibrium” for stationary solutions that are invariant under time-reversal (in order to guarantee “consistency” with the idea of thermodynamic equilibrium). This symmetry allows you to map the master equation into a Schroedinger-like equation with a symmetric operator (the hamiltonian). Usually this is not possible and one has to symmetrize the operator (or find left/right bases, etc.). In the master equation the time-reversal symmetry appears as the so-called “detailed balance”, which, it seems to me, is different from the complex balance concept. In your post it was perfectly legit to use the word equilibrium, but it comes across as totally equivalent to

    \displaystyle{ \frac{d}{d t}\psi_{m,n}(t)=0 }

    which it is not.

    Time-reversal is related to the absence of closed loops (indeed in your example the diffusion occurs over a single line). Basically you are at equilibrium if there is only one route connecting a state with another state. If there is more than one route, it is possible to break the time-reversal symmetry by having a stationary solution with a non-zero probability current flowing in your network. There is a very elegant theory of non-equilibrium master equations in terms of closed loops in the chemical network, due to J. Schnakenberg, where the network is decomposed into “fundamental loops” (such that the current on any edge is the sum of the currents in the fundamental loops including that edge). See:

    • J. Schnakenberg, Network theory of microscopic and macroscopic behavior of master equation systems, Rev. Mod. Phys. 48 (1976), 571.

    I have however a problem, because detailed balance means setting to zero each “term” of the sum, if you write the master equation as a sum of terms, each one representing the (bi-directional) connection between two states, e.g.

    \displaystyle{ \frac{d}{dt}\psi_{z}(t) = \sum_{z'} [W(z' \to z)\psi_{z'}(t)-W(z \to z')\psi_{z}(t)] }

    where z is a state (node) of the network, which in your notation is a pair (m,n). In your notation this should correspond to requiring that the following two pairs of terms vanish separately:

    \alpha (m+2)(m+1) \psi_{m+2,n-1}-\beta n \psi_{m,n}

    and

    \beta(n+1)\psi_{m-2,n+1}-\alpha m (m-1) \psi_{mn}

    If I use your solution obtained with the ACK theorem, I find that it satisfies detailed balance but with x_1^2/x_2=\beta/\alpha, which is slightly different from your x_1^2/x_2=\alpha/\beta. Is there a typo or am I missing something?

    • John Baez says:

      Andrea wrote:

      This is why I prefer to say, when I set the time-derivative on the l.h.s. of a master equation to zero, that I’m looking for the “stationary solution”, not the “equilibrium” one, reserving the word “equilibrium” for stationary solutions that are invariant under time-reversal.

      That makes sense. This series has not mentioned ‘detailed balance’, which is indeed different from ‘complex balance’. But Jacob has been thinking about time reversal symmetry lately, so he’ll be interested to read these comments of yours (as was I).

      Thanks for the reference to Schnakenberg!

      Is there a typo or am I missing something?

      Thanks for catching that!

      The problem was that I picked a notation that’s hard to remember: the reaction with rate constant \alpha reduced the number of A atoms. By the time I wrote the last section I’d forgotten this and mentally switched to the more reasonable convention where the bigger \alpha is, the more A atoms there are in equilibrium, while making \beta bigger makes more B molecules.

      In fact I wrote down the rate equation based on that other more reasonable convention, and my whole discussion with Greg Egan was based on that!

      I like that other convention better, and I’m turning all these articles into a book. So, right now I just rewrote this article to use that other convention.

      Of course now there are probably more new typos.

  10. Arjun Jain says:

    1. Why are you calling

    \exp(s O)

    a ‘symmetry’? What is the use of considering this? Also, why are we normalizing the solution later?

    2. In puzzle 4, shouldn’t

    \displaystyle{ \Psi = e^{-(x_1 + x_2)} \, \sum_{m,n \ge 0} \frac{x_1^m x_2^n} {m! n! } \, y^m z^n }

    be

    \displaystyle{ \Psi = e^{-(x_1 + x_2)} \, \sum_{m,n \ge 0, m+2n=k} \frac{x_1^m x_2^n} {m! n! } \, y^m z^n }

    along with a normalization constant?

    3. Have you talked about the Feynman diagram approach in detail, in some later post?

    • John Baez says:

      Regarding question 1:

      In quantum mechanics, a symmetry of a Hamiltonian H is a unitary operator U that commutes with time evolution. Usually we just check that

      HU = UH

      Modulo subtleties of analysis, this implies

      \exp(itH) U = U \exp(it H)

      for all t, and this is the important equation. This says that we can take our physical system in any state \psi, apply the symmetry, then wait for some time t, and get the same result as first waiting and then applying the symmetry:

      \exp(itH) U \Psi = U \exp(it H) \Psi

      That’s exactly what a symmetry should do! For example, there’s a symmetry called ‘translation’: the laws of physics don’t change from place to place, so we can take an atom and move it and it will not notice. So, if you have an atom in empty space, you can move the atom 1 meter north (that’s U) and then wait a second (that’s \exp(it H)), and you get the same result as waiting a second and then moving the atom!

      All this is standard stuff in quantum mechanics. So, we’re just generalizing this bunch of ideas to stochastic mechanics. For a more detailed introduction, read this:

      • John Baez and Brendan Fong, A Noether theorem for Markov processes.

      The point of Noether’s theorem is that it sets up a 1-1 correspondence between symmetries and conserved quantities. For example, translation symmetry gives conservation of momentum. Rotation symmetry gives conservation of angular momentum, and so on.

      What we’re showing here is that similarly, in chemistry, each conservation law (like conservation of the total number of boron atoms) corresponds to a symmetry!

    • John Baez says:

      Arjun asks:

      Also, why are we normalizing the solution later?

      I don’t see what you mean, but it’s only normalized choices of \Psi that describe probability distributions, where \psi_{m,n} is the probability of having m A’s and n B’s. When looking for equilibrium solutions of the master equation we can first find solutions and then normalize them.

      I tend to call a normalized choice of \Psi a stochastic state, in analogy with the usual concept of quantum state. I explained this back in Part 12.

      • Arjun Jain says:

        I meant that isn’t it a bit artificial finding solutions first and then normalizing them by ourselves? \exp(s O) is an observable, so we don’t expect \exp(s O) to be a stochastic state. Does the normalized solution represent anything realizable?

      • John Baez says:

        Arjun Jain wrote:

        I meant that isn’t it a bit artificial finding solutions first and then normalizing them by ourselves?

        No, it’s very common to use constructions that produce non-normalized quantum states or probability distributions, and then normalize them.

        \exp(s O) is an observable, so we don’t expect \exp(s O) to be a stochastic state.

        It’s not \exp(s O) that’s a stochastic state: if \Psi is a stochastic state, \exp(sO) \Psi gives a stochastic state after we normalize it. And if \Psi(t) depends on time, and it’s a solution of the master equation, and O is a symmetry, we see \exp(sO) \Psi(t) is also a solution of the master equation! It’s very valuable to have a way to get new solutions from old ones, even if we need to normalize them.

        (All these ideas are taken directly from quantum mechanics.)

        Does the normalized solution represent anything realizable?

        I think this is answered by Puzzle 3. You can see the answer to that puzzle on my website. And there’s another discussion of this same issue at the end of my second paper with Brendan Fong.

    • John Baez says:

      Arjun wrote:

      2. In puzzle 4, shouldn’t

      \displaystyle{ \Psi = e^{-(x_1 + x_2)} \, \sum_{m,n \ge 0} \frac{x_1^m x_2^n} {m! n! } \, y^m z^n }

      be

      \displaystyle{ \Psi = e^{-(x_1 + x_2)} \, \sum_{m,n \ge 0, m+2n=k} \frac{x_1^m x_2^n} {m! n! } \, y^m z^n }

      along with a normalization constant?

      I think you mean puzzle 3. No, I meant what I said. Either of these formulas gives an equilibrium solution of the master equation. It’s interesting to compute \exp(s O) \Psi in either case! So, there are two different puzzles to do: my puzzle, and the puzzle you just suggested.

      • Arjun Jain says:

        Okay, so for the first case, \exp(s O) \Psi, after normalization, gives the Poisson distribution with mean (e^s x_1, e^{2s} x_2), while for the second case, after normalization, we get \Psi again.

        In the first case, (x_1,x_2) has some value of x_1 + 2 x_2, which is not the same that of (e^s x_1, e^{2s} x_2), because \Psi consisted of states other than those corresponding to x_1 + 2 x_2.

        Can we always assume that we will be able to normalize solutions for any symmetry, like this?

      • John Baez says:

        Arjun wrote:

        Can we always assume that we will be able to normalize solutions for any symmetry, like this?

        That’s a fun puzzle. Either prove this or find a counterexample!

    • John Baez says:

      Arjun wrote:

      3. Have you talked about the Feynman diagram approach in detail, in some later post?

      The most detail was in Part 7. I haven’t gone into more detail because this stuff is easy if you know Feynman diagram theory, and I don’t yet see a really exciting theorem to prove, or calculation to do, that would be worth writing a paper about. There should be something interesting to do, though. Maybe we can figure out something when we meet in Singapore.

  11. Arjun Jain says:

    4. For the general solution, you write that, because the master equation is linear, we can take any solution \Psi and write it as a linear combination of solutions \Psi_k, one in each subspace L_k. In reality, don’t we have only one value of k from the initial conditions? Then what is the need of the linear combinations? Is it that the initial conditions are also in the form of a linear combination of \Psi_ks?

    5. Typo above the Feynman diagram: n+2m = k -> m+2n = k.

    • John Baez says:

      4. For the general solution, you write that, because the master equation is linear, we can take any solution \Psi and write it as a linear combination of solutions \Psi_k, one in each subspace L_k. In reality, don’t we have only one value of k from the initial conditions? Then what is the need of the linear combinations?

      k is the total number of atoms. If you know for sure what the total number of atoms in a box of gas is, you can choose \Psi \in L_k for that particular choice of k, and you don’t need to consider any other choice of k.

      But in reality, chemists rarely count the atoms in a box of gas. So it’s actually often better to work with coherent states, where we just know the expected value of the number of atoms. They are mathematically much simpler to deal with, and when k is large the standard deviation in the number of atoms is much smaller than the mean, so you almost know the precise number of atoms in such a state.

      These coherent states are linear combinations of states living in different subspaces L_k.

  12. Arjun Jain says:

    Puzzle 4:

    Consider:

    ({a^\dagger}^2 b - b^\dagger b)(a^\dagger a + 2 b^\dagger b) - (a^\dagger a + 2 b^\dagger b)({a^\dagger}^2 b - b^\dagger b)

    Since the as and bs act independently (they commute with each other), we can put the as in front of the bs in each term of the expression. Then we get:

    ({a^\dagger}^3 a b - a^\dagger a b^\dagger b + 2 {a^\dagger}^2 b b^\dagger b - 2 b^\dagger b b^\dagger b) - (a^\dagger a \, {a^\dagger}^2 b - a^\dagger a b^\dagger b + 2 {a^\dagger}^2 b^\dagger b^2 - 2 b^\dagger b b^\dagger b)

    The 2nd, 4th, 6th and 8th terms cancel. Then we get:
    a^\dagger ( {a^\dagger}^2 a + 2 a^\dagger b b^\dagger - a {a^\dagger}^2 - 2 a^\dagger b^\dagger b ) b

    The 2nd and 4th terms in the bracket equal 2 a^\dagger, while the 1st and 3rd terms give -2 a^\dagger.

    So we get 0. Similarly for the \beta terms.
