Network Theory (Part 6)

16 April, 2011

Now for the fun part. Let’s see how tricks from quantum theory can be used to describe random processes. I’ll try to make this post self-contained. So, even if you skipped a bunch of the previous ones, this should make sense.

You’ll need to know a bit of math: calculus, a tiny bit of probability theory, and linear operators on vector spaces. You don’t need to know quantum theory, though you’ll have more fun if you do. What we’re doing here is very similar… but also strangely different—for reasons I explained last time.

Rabbits and quantum mechanics

Suppose we have a population of rabbits in a cage and we’d like to describe its growth in a stochastic way, using probability theory. Let \psi_n be the probability of having n rabbits. We can borrow a trick from quantum theory, and summarize all these probabilities in a formal power series like this:

\Psi = \sum_{n = 0}^\infty \psi_n z^n

The variable z doesn’t mean anything in particular, and we don’t care if the power series converges. See, in math ‘formal’ means “it’s only symbols on the page, just follow the rules”. It’s like if someone says a party is ‘formal’, so you need to wear a white tie: you’re not supposed to ask what the tie means.

However, there’s a good reason for this trick. We can define two operators on formal power series, called the annihilation operator:

a \Psi = \frac{d}{d z} \Psi

and the creation operator:

a^\dagger \Psi = z \Psi

They’re just differentiation and multiplication by z, respectively. So, for example, suppose we start out being 100% sure we have n rabbits for some particular number n. Then \psi_n = 1, while all the other probabilities are 0, so:

\Psi = z^n

If we then apply the creation operator, we obtain

a^\dagger \Psi = z^{n+1}

Voilà! One more rabbit!

The annihilation operator is more subtle. If we start out with n rabbits:

\Psi = z^n

and then apply the annihilation operator, we obtain

a \Psi = n z^{n-1}

What does this mean? The z^{n-1} means we have one fewer rabbit than before. But what about the factor of n? It means there were n different ways we could pick a rabbit and make it disappear! This should seem a bit mysterious, for various reasons… but we’ll see how it works soon enough.
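
If you like to compute, here’s a minimal sketch in Python (my own illustration, with names chosen arbitrarily) of these two operators, storing a formal power series as its list of coefficients:

```python
# A formal power series Psi = sum_n psi_n z^n, stored as [psi_0, psi_1, ...].

def annihilate(psi):
    """a Psi = d/dz Psi: the coefficient of z^(n-1) picks up a factor of n."""
    return [n * psi[n] for n in range(1, len(psi))]

def create(psi):
    """a^dagger Psi = z Psi: shift every coefficient up by one power of z."""
    return [0] + list(psi)

def certain(n):
    """Psi = z^n: we're 100% sure we have n rabbits."""
    return [0] * n + [1]

print(create(certain(3)))      # [0, 0, 0, 0, 1], that is, z^4: one more rabbit
print(annihilate(certain(3)))  # [0, 0, 3], that is, 3 z^2: one fewer rabbit, times 3
```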

The creation and annihilation operators don’t commute:

(a a^\dagger - a^\dagger a) \Psi = \frac{d}{d z} (z \Psi) - z \frac{d}{d z} \Psi = \Psi

so for short we say:

a a^\dagger - a^\dagger a = 1

or even shorter:

[a, a^\dagger] = 1

where the commutator of two operators is

[S,T] = S T - T S

The noncommutativity of operators is often claimed to be a special feature of quantum physics, and the creation and annihilation operators are fundamental to understanding the quantum harmonic oscillator. There, instead of rabbits, we’re studying quanta of energy, which are peculiarly abstract entities obeying rather counterintuitive laws. So, it’s cool that the same math applies to purely classical entities, like rabbits!

In particular, the equation [a, a^\dagger] = 1 just says that there’s one more way to put a rabbit in a cage of rabbits, and then take one out, than to take one out and then put one in.
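
Here’s a quick numerical check of this commutation relation: a sketch where I truncate the power series at z^(D-1), so both operators become finite matrices. (The odd entry in the last slot is purely an artifact of that truncation.)

```python
# a and a^dagger as matrices acting on coefficient vectors (psi_0, ..., psi_{D-1}).
import numpy as np

D = 8
a = np.zeros((D, D))
adag = np.zeros((D, D))
for n in range(1, D):
    a[n - 1, n] = n        # a z^n = n z^(n-1)
    adag[n, n - 1] = 1.0   # a^dagger z^(n-1) = z^n

comm = a @ adag - adag @ a
print(np.diag(comm))       # [ 1.  1.  1.  1.  1.  1.  1. -7.]
# The identity, except in the last slot, where the cutoff at z^(D-1) bites.
```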

But how do we actually use this setup? We want to describe how the probabilities \psi_n change with time, so we write

\Psi(t) = \sum_{n = 0}^\infty \psi_n(t) z^n

Then, we write down an equation describing the rate of change of \Psi:

\frac{d}{d t} \Psi(t) = H \Psi(t)

Here H is an operator called the Hamiltonian, and the equation is called the master equation. The details of the Hamiltonian depend on our problem! But we can often write it down using creation and annihilation operators. Let’s do some examples, and then I’ll tell you the general rule.

Catching rabbits

Last time I told you what happens when we stand in a river and catch fish as they randomly swim past. Let me remind you of how that works. But today let’s use rabbits.

So, suppose an inexhaustible supply of rabbits are randomly roaming around a huge field, and each time a rabbit enters a certain area, we catch it and add it to our population of caged rabbits. Suppose that on average we catch one rabbit per unit time. Suppose the chance of catching a rabbit during any interval of time is independent of what happened before. What is the Hamiltonian describing the probability distribution of caged rabbits, as a function of time?

There’s an obvious dumb guess: the creation operator! However, we saw last time that this doesn’t work, and we saw how to fix it. The right answer is

H = a^\dagger - 1

To see why, suppose for example that at some time t we have n rabbits, so:

\Psi(t) = z^n

Then the master equation says that at this moment,

\frac{d}{d t} \Psi(t) = (a^\dagger - 1) \Psi(t) =  z^{n+1} - z^n

Since \Psi = \sum_{n = 0}^\infty \psi_n(t) z^n, this implies that the coefficients of our formal power series are changing like this:

\frac{d}{d t} \psi_{n+1}(t) = 1
\frac{d}{d t} \psi_{n}(t) = -1

while all the rest have zero derivative at this moment. And that’s exactly right! See, \psi_{n+1}(t) is the probability of having one more rabbit, and this is going up at rate 1. Meanwhile, \psi_n(t) is the probability of having n rabbits, and this is going down at the same rate.

Puzzle 1. Show that with this Hamiltonian and any initial conditions, the master equation predicts that the expected number of rabbits grows linearly.
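
This is not a proof, but here’s a numerical sanity check: a sketch that truncates the state space at some cutoff D, chosen large enough that the cutoff doesn’t matter over the times we look at.

```python
import numpy as np
from scipy.linalg import expm

D = 60                               # allow up to D-1 rabbits
adag = np.diag(np.ones(D - 1), -1)   # creation operator as a matrix
H = adag - np.eye(D)                 # H = a^dagger - 1

psi = np.zeros(D)
psi[5] = 1.0                         # start 100% sure we have 5 rabbits
ns = np.arange(D)

for t in [0.0, 1.0, 2.0, 3.0]:
    print(t, ns @ (expm(t * H) @ psi))   # expected number: 5, 6, 7, 8 (up to rounding)
```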

Dying rabbits

Don’t worry: no rabbits are actually injured in the research that Jacob Biamonte is doing here at the Centre for Quantum Technologies. He’s keeping them well cared for in a big room on the 6th floor. This is just a thought experiment.

Suppose a mean nasty guy had a population of rabbits in a cage and didn’t feed them at all. Suppose that each rabbit has a unit probability of dying per unit time. And as always, suppose the probability of this happening in any interval of time is independent of what happens before that time.

What is the Hamiltonian? Again there’s a dumb guess: the annihilation operator! And again this guess is wrong, but it’s not far off. As before, the right answer includes a ‘correction term’:

H = a - N

This time the correction term is famous in its own right. It’s called the number operator:

N = a^\dagger a

The reason is that if we start with n rabbits, and apply this operator, it amounts to multiplication by n:

N z^n = z \frac{d}{d z} z^n = n z^n

Let’s see why this guess is right. Again, suppose that at some particular time t we have n rabbits, so

\Psi(t) = z^n

Then the master equation says that at this time

\frac{d}{d t} \Psi(t) = (a - N) \Psi(t) = n z^{n-1} - n z^n

So, our probabilities are changing like this:

\frac{d}{d t} \psi_{n-1}(t) = n
\frac{d}{d t} \psi_n(t) = -n

while the rest have zero derivative. And this is good! We’re starting with n rabbits, and each has a unit probability per unit time of dying. So, the chance of having one less should be going up at rate n. And the chance of having the same number we started with should be going down at the same rate.

Puzzle 2. Show that with this Hamiltonian and any initial conditions, the master equation predicts that the expected number of rabbits decays exponentially.
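
Again, here is a numerical sanity check rather than a proof, a sketch along the same lines as before. (Here the truncation is harmless, since dying rabbits never push the population above its starting value.)

```python
import numpy as np
from scipy.linalg import expm

D = 21
ns = np.arange(D)
a = np.diag(ns[1:].astype(float), 1)   # annihilation operator: a z^n = n z^(n-1)
N = np.diag(ns.astype(float))          # number operator
H = a - N

psi = np.zeros(D)
psi[20] = 1.0                          # start with 20 rabbits
for t in [0.0, 1.0, 2.0]:
    print(t, ns @ (expm(t * H) @ psi), 20 * np.exp(-t))   # expected number matches 20 exp(-t)
```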

Breeding rabbits

Suppose we have a strange breed of rabbits that reproduce asexually. Suppose that each rabbit has a unit probability per unit time of having a baby rabbit, thus effectively duplicating itself.

As you can see from the cryptic picture above, this ‘duplication’ process takes one rabbit as input and has two rabbits as output. So, if you’ve been paying attention, you should be ready with a dumb guess for the Hamiltonian: a^\dagger a^\dagger a. This operator annihilates one rabbit and then creates two!

But you should also suspect that this dumb guess will need a ‘correction term’. And you’re right! As always, the correction term makes the probability of things staying the same go down at exactly the rate that the probability of things changing goes up.

You should guess the correction term… but I’ll just tell you:

H = a^\dagger a^\dagger a - N

We can check this in the usual way, by seeing what it does when we have n rabbits:

H z^n =  z^2 \frac{d}{d z} z^n - n z^n = n z^{n+1} - n z^n

That’s good: since there are n rabbits, the rate of rabbit duplication is n. This is the rate at which the probability of having one more rabbit goes up… and also the rate at which the probability of having n rabbits goes down.

Puzzle 3. Show that with this Hamiltonian and any initial conditions, the master equation predicts that the expected number of rabbits grows exponentially.

Dueling rabbits

Let’s do some stranger examples, just so you can see the general pattern.

Here each pair of rabbits has a unit probability per unit time of fighting a duel with only one survivor. You might guess the Hamiltonian a^\dagger a a, but in fact:

H = a^\dagger a a - N(N-1)

Let’s see why this is right by checking what it does when we have n rabbits:

H z^n = z \frac{d^2}{d z^2} z^n - n(n-1)z^n = n(n-1) z^{n-1} - n(n-1)z^n

That’s good: since there are n(n-1) ordered pairs of rabbits, the rate at which duels take place is n(n-1). This is the rate at which the probability of having one less rabbit goes up… and also the rate at which the probability of having n rabbits goes down.

(If you prefer unordered pairs of rabbits, just divide the Hamiltonian by 2. We should talk about this more, but not now.)

Brawling rabbits

Now each triple of rabbits has a unit probability per unit time of getting into a fight with only one survivor! I don’t know the technical term for a three-way fight, but perhaps it counts as a small ‘brawl’ or ‘melee’. In fact the Wikipedia article for ‘melee’ shows three rabbits in suits of armor, fighting it out:

Now the Hamiltonian is:

H = a^\dagger a^3 - N(N-1)(N-2)

You can check that:

H z^n = n(n-1)(n-2) z^{n-2} - n(n-1)(n-2) z^n

and this is good, because n(n-1)(n-2) is the number of ordered triples of rabbits. You can see how this number shows up from the math, too:

a^3 z^n = \frac{d^3}{d z^3} z^n = n(n-1)(n-2) z^{n-3}

The general rule

Suppose we have a process taking k rabbits as input and having j rabbits as output:

I hope you can guess the Hamiltonian I’ll use for this:

H = {a^{\dagger}}^j a^k - N(N-1) \cdots (N-k+1)

This works because

a^k z^n = \frac{d^k}{d z^k} z^n = n(n-1) \cdots (n-k+1) z^{n-k}

so that if we apply our Hamiltonian to n rabbits, we get

H z^n =  n(n-1) \cdots (n-k+1) (z^{n+j-k} - z^n)

See? As the probability of having n+j-k rabbits goes up, the probability of having n rabbits goes down, at an equal rate. This sort of balance is necessary for H to be a sensible Hamiltonian in this sort of stochastic theory (an ‘infinitesimal stochastic operator’, to be precise). And the rate is exactly the number of ordered k-tuples taken from a collection of n rabbits. This is called the kth falling power of n, and written as follows:

n^{\underline{k}} = n(n-1) \cdots (n-k+1)

Since we can apply functions to operators as well as numbers, we can write our Hamiltonian as:

H = {a^{\dagger}}^j a^k - N^{\underline{k}}
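
Here’s a sketch of this general rule in code, building H as a matrix on a truncated state space. You can check that each column sums to zero and that the off-diagonal entries are nonnegative, the ‘infinitesimal stochastic’ property mentioned above, except in the columns right at the cutoff, where the truncation interferes:

```python
import numpy as np

def hamiltonian(j, k, D):
    """k rabbits in, j rabbits out, rate constant 1, at most D-1 rabbits."""
    H = np.zeros((D, D))
    for n in range(D):
        ways = 1.0
        for i in range(k):
            ways *= (n - i)            # n(n-1)...(n-k+1): ordered k-tuples of rabbits
        if 0 <= n + j - k < D:
            H[n + j - k, n] += ways    # probability of n+j-k rabbits goes up...
        H[n, n] -= ways                # ...while probability of n rabbits goes down
    return H

H = hamiltonian(j=2, k=1, D=10)        # breeding: one rabbit in, two rabbits out
print(np.allclose(H.sum(axis=0)[:-1], 0.0))   # columns sum to zero, away from the cutoff
print((H - np.diag(np.diag(H)) >= 0).all())   # off-diagonal entries are nonnegative
```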

Kissing rabbits

Let’s do one more example just to test our understanding. This time each pair of rabbits has a unit probability per unit time of bumping into one another, exchanging a friendly kiss and walking off. This shouldn’t affect the rabbit population at all! But let’s follow the rules and see what they say.

According to our rules, the Hamiltonian should be:

H = {a^{\dagger}}^2 a^2 - N(N-1)

However,

{a^{\dagger}}^2 a^2 z^n = z^2 \frac{d^2}{dz^2} z^n = n(n-1) z^n = N(N-1) z^n

and since z^n form a ‘basis’ for the formal power series, we see that:

{a^{\dagger}}^2 a^2 = N(N-1)

so in fact:

H = 0

That’s good: if the Hamiltonian is zero, the master equation will say

\frac{d}{d t} \Psi(t) = 0

so the population, or more precisely the probability of having any given number of rabbits, will be constant.
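
As a quick numerical confirmation, here’s a sketch checking that the two operators really do agree on a truncated space (the truncation happens to be harmless for this particular identity):

```python
import numpy as np

D = 12
ns = np.arange(D)
a = np.diag(ns[1:].astype(float), 1)     # a z^n = n z^(n-1)
adag = np.diag(np.ones(D - 1), -1)       # a^dagger z^n = z^(n+1)

lhs = adag @ adag @ a @ a                # (a^dagger)^2 a^2
rhs = np.diag(ns * (ns - 1.0))           # N(N-1)
print(np.allclose(lhs, rhs))             # True, so the kissing Hamiltonian is zero
```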

There’s another nice little lesson here. Copying the calculation we just did, it’s easy to see that:

{a^{\dagger}}^k a^k = N^{\underline{k}}

This is a cute formula for falling powers of the number operator in terms of annihilation and creation operators. It means that for the general transition we saw before:

we can write the Hamiltonian in two equivalent ways:

H = {a^{\dagger}}^j a^k - N^{\underline{k}} =  {a^{\dagger}}^j a^k - {a^{\dagger}}^k a^k

Okay, that’s it for now! We can, and will, generalize all this stuff to stochastic Petri nets where there are things of many different kinds—not just rabbits. And we’ll see that the master equation we get matches the answer to the puzzle in Part 4. That’s pretty easy. But first, we’ll have a guest post by Jacob Biamonte, who will explain a more realistic example from population biology.


The Genetic Code

14 April, 2011

Certain mathematical physicists can’t help wondering why the genetic code works exactly the way it does. As you probably know, DNA is a helix bridged by pairs of bases, which come in 4 kinds:

adenine (A)
thymine (T)
cytosine (C)
guanine (G)

Because of how they’re shaped, A can only connect to T:

while C can only connect to G:

When DNA is copied to ‘messenger RNA’ as part of the process of making proteins, the T gets copied to uracil, or U. The other three bases stay the same.

A protein is made of lots of amino acids. A sequence of three bases forms a ‘codon’, which codes for a single amino acid. Here’s some messenger RNA with the codons indicated:

But here’s where it gets tricky: while there are 4^3 = 64 codons, they code for only 20 amino acids. Typically more than one codon codes for the same amino acid. There are two exceptions. One is the amino acid tryptophan, which is encoded only by UGG. The other is methionine, which is encoded only by AUG. AUG is also the ‘start codon’, which tells the cell where the code for a protein starts. So, methionine shows up at the start of every protein (or maybe just most?), at least at first. It’s usually removed later in the protein manufacture process.

There are also three ‘stop codons’, which mark the end of a protein. They have cute names:

• UAG (‘amber’)
• UAA (‘ochre’)
• UGA (‘opal’)

But look at the actual pattern of which codons code for which amino acids:

It looks sort of regular… but also sort of irregular! Note how:

• Almost all amino acids either have 4 codons coding for them, or 2.
• If 4 codons code for the same amino acid, it’s because we can change the last base without any effect.
• If 2 codons code for the same amino acid, it’s because we can change the last base from U to C or from A to G without any effect.
• The amino acid tryptophan, with just one codon coding for it, is right next to the 3 stop codons.

And so on…

This is what attracts the mathematical physicists I’m talking about. They’re wondering what the pattern is here! Saying the patterns are coincidental—a “frozen accident of history”—won’t please these people.

Though I certainly don’t vouch for their findings, I sympathize with the impulse to find order amid chaos. Here are some papers I’ve seen:

• José Eduardo M. Hornos, Yvone M. M. Hornos and Michael Forger, Symmetry and symmetry breaking, algebraic approach to the genetic code, International Journal of Modern Physics B, 13 (1999), 2795-2885.

After a very long review of symmetry in physics, starting with the Big Bang and moving up through the theory of Lie algebras and Cartan’s classification of simple Lie algebras, the authors describe their program:

The first step in the search for symmetries in the genetic code consists in selecting a simple Lie algebra and an irreducible representation of this Lie algebra on a vector space of dimension 64: such a representation will in the following be referred to as a codon representation.

There turn out to be 11 choices. Then they look at Lie subalgebras of these Lie algebras that have codon representations, and try to organize the codons for the same amino acid into irreducible representations of these subalgebras. This follows the ‘symmetry breaking’ strategy that particle physicists use to organize particles into families (but with less justification, it seems to me). They show:

There is no symmetry breaking pattern through chains of subalgebras capable of reproducing exactly the degeneracies of the genetic code.

This is not the end of the paper, however!

Here’s another paper, which seems to focus on how the genetic code might be robust against small errors:

• Miguel A. Jimenez-Montano, Carlos R. de la Mora-Basanez, and Thorsten Poeschel, The hypercube structure of the genetic code explains conservative and non-conservative amino acid substitutions in vivo and in vitro.

And here’s another:

• S. Petoukhov, The genetic code, 8-dimensional hypercomplex numbers and dyadic shifts.

But these three papers seem rather ‘Platonic’ in inspiration: they don’t read like biology papers. What papers on the genetic code do biologists like best? I know there’s a lot of research on the origin of this code.

Maybe some of these would be interesting. I haven’t read any of them! But they seem a bit more mainstream than the ones I just listed:

• T. A. Ronneberg, L. F. Landweber, S. J. Freeland, Testing a biosynthetic theory of the genetic code: fact or artifact?, Proc. Natl. Acad. Sci. U.S.A. 97 (2000), 13690–13695.

It has long been conjectured that the canonical genetic code evolved from a simpler primordial form that encoded fewer amino acids (e.g. Crick 1968). The most influential form of this idea, “code coevolution” (Wong 1975) proposes that the genetic code coevolved with the invention of biosynthetic pathways for new amino acids. It further proposes that a comparison of modern codon assignments with the conserved metabolic pathways of amino acid biosynthesis can inform us about this history of code expansion. Here we re-examine the biochemical basis of this theory to test the validity of its statistical support. We show that the theory’s definition of “precursor-product” amino acid pairs is unjustified biochemically because it requires the energetically unfavorable reversal of steps in extant metabolic pathways to achieve desired relationships. In addition, the theory neglects important biochemical constraints when calculating the probability that chance could assign precursor-product amino acids to contiguous codons. A conservative correction for these errors reveals a surprisingly high 23% probability that apparent patterns within the code are caused purely by chance. Finally, even this figure rests on post hoc assumptions about primordial codon assignments, without which the probability rises to 62% that chance alone could explain the precursor-product pairings found within the code. Thus we conclude that coevolution theory cannot adequately explain the structure of the genetic code.

• Pavel V. Baranov, Maxime Venin and Gregory Provan, Codon size reduction as the origin of the triplet genetic code, PLoS ONE 4 (2009), e5708.

The genetic code appears to be optimized in its robustness to missense errors and frameshift errors. In addition, the genetic code is near-optimal in terms of its ability to carry information in addition to the sequences of encoded proteins. As evolution has no foresight, optimality of the modern genetic code suggests that it evolved from less optimal code variants. The length of codons in the genetic code is also optimal, as three is the minimal nucleotide combination that can encode the twenty standard amino acids. The apparent impossibility of transitions between codon sizes in a discontinuous manner during evolution has resulted in an unbending view that the genetic code was always triplet. Yet, recent experimental evidence on quadruplet decoding, as well as the discovery of organisms with ambiguous and dual decoding, suggest that the possibility of the evolution of triplet decoding from living systems with non-triplet decoding merits reconsideration and further exploration. To explore this possibility we designed a mathematical model of the evolution of primitive digital coding systems which can decode nucleotide sequences into protein sequences. These coding systems can evolve their nucleotide sequences via genetic events of Darwinian evolution, such as point-mutations. The replication rates of such coding systems depend on the accuracy of the generated protein sequences. Computer simulations based on our model show that decoding systems with codons of length greater than three spontaneously evolve into predominantly triplet decoding systems. Our findings suggest a plausible scenario for the evolution of the triplet genetic code in a continuous manner. This scenario suggests an explanation of how protein synthesis could be accomplished by means of long RNA-RNA interactions prior to the emergence of the complex decoding machinery, such as the ribosome, that is required for stabilization and discrimination of otherwise weak triplet codon-anticodon interactions.

What’s the “recent experimental evidence on quadruplet decoding”, and what organisms have “ambiguous” or “dual” decoding?

• Tsvi Tlusty, A model for the emergence of the genetic code as a transition in a noisy information channel, J. Theor. Bio. 249 (2007), 331–342.

The genetic code maps the sixty-four nucleotide triplets (codons) to twenty amino-acids. Some argue that the specific form of the code with its twenty amino-acids might be a ‘frozen accident’ because of the overwhelming effects of any further change. Others see it as a consequence of primordial biochemical pathways and their evolution. Here we examine a scenario in which evolution drives the emergence of a genetic code by selecting for an amino-acid map that minimizes the impact of errors. We treat the stochastic mapping of codons to amino-acids as a noisy information channel with a natural fitness measure. Organisms compete by the fitness of their codes and, as a result, a genetic code emerges at a supercritical transition in the noisy channel, when the mapping of codons to amino-acids becomes nonrandom. At the phase transition, a small expansion is valid and the emergent code is governed by smooth modes of the Laplacian of errors. These modes are in turn governed by the topology of the error-graph, in which codons are connected if they are likely to be confused. This topology sets an upper bound – which is related to the classical map-coloring problem – on the number of possible amino-acids. The suggested scenario is generic and may describe a mechanism for the formation of other error-prone biological codes, such as the recognition of DNA sites by proteins in the transcription regulatory network.

• Tsvi Tlusty, A colorful origin for the genetic code: Information theory, statistical mechanics and the emergence of molecular codes, Phys. Life. Rev. 7 (2010), 362–376.

• S. J. Freeland, T. Wu and N. Keulmann, The case for an error minimizing standard genetic code, Orig. Life Evol. Biosph. 33 (2009), 457–477.

• G. Sella and D. Ardell, The coevolution of genes and genetic codes: Crick’s frozen accident revisited, J. Mol. Evol. 63 (2006), 297–313.


The Three-Fold Way

13 April, 2011

I just finished a series of blog posts about doing quantum theory using the real numbers, the complex numbers and the quaternions… and how Nature seems to use all three. Mathematically, they fit together in a structure that Freeman Dyson called The Three-Fold Way.

You can read all those blog posts here:

State-observable duality – Part 1: a review of normed division algebras.

State-observable duality – Part 2: the theorem by Jordan, von Neumann and Wigner classifying ‘finite-dimensional formally real Jordan algebras’.

State-observable duality – Part 3: the Koecher–Vinberg classification of self-dual homogeneous convex cones, and its relation to state-observable duality.

Solèr’s Theorem: Maria Pia Solèr’s amazing theorem from 1995, which characterizes Hilbert spaces over the real numbers, complex numbers and quaternions.

The Three-Fold Way – Part 1: two problems with real and quaternionic quantum mechanics.

The Three-Fold Way – Part 2: why irreducible unitary representations on complex Hilbert spaces come in three flavors: real, complex and quaternionic.

The Three-Fold Way – Part 3: why the “q” in “qubit” stands for “quaternion”.

The Three-Fold Way – Part 4: how turning a particle around 180 degrees is related to making it go backwards in time, and what all this has to do with real numbers and quaternions.

The Three-Fold Way – Part 5: a triangle of functors relating the categories of real, complex and quaternionic Hilbert spaces.

The Three-Fold Way – Part 6: how the three-fold way solves two problems with real and quaternionic quantum mechanics.

All these blog posts are based on the following paper… but they’ve got a lot more jokes, digressions and silly pictures thrown in, so personally I recommend the blog posts:

Quantum theory and division algebras.

And if you’re into normed division algebras and physics, you might like this talk I gave on my work with John Huerta, which also brings the octonions into the game:

Higher gauge theory, division algebras and superstrings.

Finally, around May, John and I will come out with a Scientific American article explaining the same stuff in a less technical way. It’ll be called “The strangest numbers in string theory”.

Whew! I think that’s enough division algebras for now. I’ve long been on a quest to save the quaternions and octonions from obscurity and show the world just how great they are. It’s time to declare victory and quit. There’s a more difficult quest ahead: the search for green mathematics, whatever that might be.


Network Theory (Part 5)

11 April, 2011

Last time we saw clues that stochastic Petri nets are a lot like quantum field theory, but with probabilities replacing amplitudes. There’s a powerful analogy at work here, which can help us a lot. So, this time I want to make that analogy precise.

But first, let me quickly sketch why it could be worthwhile.

A Poisson process

Consider this stochastic Petri net with rate constant r:

It describes an inexhaustible supply of fish swimming down a river, and getting caught when they run into a fisherman’s net. In any short time \Delta t there’s a chance of about r \Delta t of a fish getting caught. There’s also a chance of two or more fish getting caught, but this becomes negligible by comparison as \Delta t \to 0. Moreover, the chance of a fish getting caught during this interval of time is independent of what happens before or afterwards. This sort of process is called a Poisson process.

Problem. Suppose we start out knowing for sure there are no fish in the fisherman’s net. What’s the probability that he has caught n fish at time t?

At any time there will be some probability of having caught n fish; let’s call this probability \psi(n,t), or \psi(n) for short. We can summarize all these probabilities in a single power series, called a generating function:

\Psi(t) = \sum_{n=0}^\infty \psi(n,t) \, z^n

Here z is a formal variable—don’t ask what it means, for now it’s just a trick. In quantum theory we use this trick when talking about collections of photons rather than fish, but then the numbers \psi(n,t) are complex ‘amplitudes’. Now they are real probabilities, but we can still copy what the physicists do, and use this trick to rewrite the master equation as follows:

\frac{d}{d t} \Psi(t) = H \Psi(t)

This describes how the probability of having caught any given number of fish changes with time.

What’s the operator H? Well, in quantum theory we describe the creation of photons using a certain operator on power series called the creation operator:

a^\dagger \Psi = z \Psi

We can try to apply this to our fish. If at some time we’re 100% sure we have n fish, we have

\Psi = z^n

so applying the creation operator gives

a^\dagger \Psi = z^{n+1}

One more fish! That’s good. So, an obvious wild guess is

H = r a^\dagger

where r is the rate at which we’re catching fish. Let’s see how well this guess works.

If you know how to exponentiate operators, you know how to solve this equation:

\frac{d}{d t} \Psi(t) = H \Psi(t)

It’s easy:

\Psi(t) = \mathrm{exp}(t H) \Psi(0)

Since we start out knowing there are no fish in the net, we have

\Psi(0) = 1

so with our guess for H we get

\Psi(t) = \mathrm{exp}(r t a^\dagger) 1

But a^\dagger is the operator of multiplication by z, so \mathrm{exp}(r t a^\dagger) is multiplication by e^{r t z}, so

\Psi(t) = e^{r t z} = \sum_{n = 0}^\infty \frac{(r t)^n}{n!} \, z^n

So, if our guess is right, the probability of having caught n fish at time t is

\frac{(r t)^n}{n!}

Unfortunately, this can’t be right, because these probabilities don’t sum to 1! Instead their sum is

\sum_{n=0}^\infty \frac{(r t)^n}{n!} = e^{r t}

We can try to wriggle out of the mess we’re in by dividing our answer by this fudge factor. It sounds like a desperate measure, but we’ve got to try something!

This amounts to guessing that the probability of having caught n fish by time t is

\frac{(r t)^n}{n!} \, e^{-r t }

And this is right! This is called the Poisson distribution: it’s famous for being precisely the answer to the problem we’re facing.

So on the one hand our wild guess about H was wrong, but on the other hand it was not so far off. We can fix it as follows:

H = r (a^\dagger - 1)

The extra -1 gives us the fudge factor we need.
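
If you want to see this fix in action, here’s a sketch that exponentiates the corrected Hamiltonian on a truncated state space and compares the result with the Poisson distribution (the cutoff only matters once r t gets close to it):

```python
import numpy as np
from scipy.linalg import expm
from scipy.stats import poisson

r, t, D = 1.0, 3.0, 60
adag = np.diag(np.ones(D - 1), -1)       # multiplication by z, as a matrix
H = r * (adag - np.eye(D))

psi = np.zeros(D)
psi[0] = 1.0                             # no fish in the net at t = 0
psi_t = expm(t * H) @ psi

print(np.allclose(psi_t, poisson.pmf(np.arange(D), r * t)))   # True
```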

So, a wild guess corrected by an ad hoc procedure seems to have worked! But what’s really going on?

What’s really going on is that a^\dagger, or any multiple of this, is not a legitimate Hamiltonian for a master equation: if we define a time evolution operator \exp(t H) using a Hamiltonian like this, probabilities won’t sum to 1! But a^\dagger - 1 is okay. So, we need to think about which Hamiltonians are okay.

In quantum theory, self-adjoint Hamiltonians are okay. But in probability theory, we need some other kind of Hamiltonian. Let’s figure it out.

Probability versus quantum theory

Suppose we have a system of any kind: physical, chemical, biological, economic, whatever. The system can be in different states. In the simplest sort of model, we say there’s some set X of states, and say that at any moment in time the system is definitely in one of these states. But I want to compare two other options:

• In a probabilistic model, we may instead say that the system has a probability \psi(x) of being in any state x \in X. These probabilities are nonnegative real numbers with

\sum_{x \in X} \psi(x) = 1

• In a quantum model, we may instead say that the system has an amplitude \psi(x) of being in any state x \in X. These amplitudes are complex numbers with

\sum_{x \in X} | \psi(x) |^2 = 1

Probabilities and amplitudes are similar yet strangely different. Of course given an amplitude we can get a probability by taking its absolute value and squaring it. This is a vital bridge from quantum theory to probability theory. Today, however, I don’t want to focus on the bridges, but rather on the parallels between these theories.

We often want to replace the sums above by integrals. For that we need to replace our set X by a measure space, which is a set equipped with enough structure that you can integrate real or complex functions defined on it. Well, at least you can integrate so-called ‘integrable’ functions—but I’ll neglect all issues of analytical rigor here. Then:

• In a probabilistic model, the system has a probability distribution \psi : X \to \mathbb{R}, which obeys \psi \ge 0 and

\int_X \psi(x) \, d x = 1

• In a quantum model, the system has a wavefunction \psi : X \to \mathbb{C}, which obeys

\int_X | \psi(x) |^2 \, d x= 1

In probability theory, we integrate \psi over a set S \subset X to find out the probability that our system’s state is in this set. In quantum theory we integrate |\psi|^2 over the set to answer the same question.

We don’t need to think about sums over sets and integrals over measure spaces separately: there’s a way to make any set X into a measure space such that by definition,

\int_X \psi(x) \, dx = \sum_{x \in X} \psi(x)

In short, integrals are more general than sums! So, I’ll mainly talk about integrals, until the very end.

In probability theory, we want our probability distributions to be vectors in some vector space. Ditto for wave functions in quantum theory! So, we make up some vector spaces:

• In probability theory, the probability distribution \psi is a vector in the space

L^1(X) = \{ \psi: X \to \mathbb{C} \; : \; \int_X |\psi(x)| \, d x < \infty \}

• In quantum theory, the wavefunction \psi is a vector in the space

L^2(X) = \{ \psi: X \to \mathbb{C} \; : \; \int_X |\psi(x)|^2 \, d x < \infty \}

You may wonder why I defined L^1(X) to consist of complex functions when probability distributions are real. I’m just struggling to make the analogy seem as strong as possible. In fact probability distributions are not just real but nonnegative. We need to say this somewhere… but we can, if we like, start by saying they’re complex-valued functions, but then whisper that they must in fact be nonnegative (and thus real). It’s not the most elegant solution, but that’s what I’ll do for now.

Now:

• The main thing we can do with elements of L^1(X), besides what we can do with vectors in any vector space, is integrate one. This gives a linear map:

\int : L^1(X) \to \mathbb{C}

• The main thing we can do with elements of L^2(X), besides the things we can do with vectors in any vector space, is take the inner product of two:

\langle \psi, \phi \rangle = \int_X \overline{\psi}(x) \phi(x) \, d x

This gives a map that’s linear in one slot and conjugate-linear in the other:

\langle - , - \rangle :  L^2(X) \times L^2(X) \to \mathbb{C}

First came probability theory with L^1(X); then came quantum theory with L^2(X). Naive extrapolation would say it’s about time for someone to invent an even more bizarre theory of reality based on L^3(X). In this, you’d have to integrate the product of three wavefunctions to get a number! The math of L^p spaces is already well-developed, so give it a try if you want. I’ll stick to L^1 and L^2 today.

Stochastic versus unitary operators

Now let’s think about time evolution:

• In probability theory, the passage of time is described by a map sending probability distributions to probability distributions. This is described using a stochastic operator

U : L^1(X) \to L^1(X)

meaning a linear operator such that

\int U \psi = \int \psi

and

\psi \ge 0 \quad \implies \quad U \psi \ge 0

• In quantum theory the passage of time is described by a map sending wavefunctions to wavefunctions. This is described using an isometry

U : L^2(X) \to L^2(X)

meaning a linear operator such that

\langle U \psi , U \phi \rangle = \langle \psi , \phi \rangle

In quantum theory we usually want time evolution to be reversible, so we focus on isometries that have inverses: these are called unitary operators. In probability theory we often consider stochastic operators that are not invertible.

Infinitesimal stochastic versus self-adjoint operators

Sometimes it’s nice to think of time coming in discrete steps. But in theories where we treat time as continuous, to describe time evolution we usually need to solve a differential equation. This is true in both probability theory and quantum theory:

• In probability theory we often describe time evolution using a differential equation called the master equation:

\frac{d}{d t} \psi(t) = H \psi(t)

whose solution is

\psi(t) = \exp(t H)\psi(0)

• In quantum theory we often describe time evolution using a differential equation called Schrödinger’s equation:

i \frac{d}{d t} \psi(t) = H \psi(t)

whose solution is

\psi(t) = \exp(-i t H)\psi(0)

In fact the appearance of i in the quantum case is purely conventional; we could drop it to make the analogy better, but then we’d have to work with ‘skew-adjoint’ operators instead of self-adjoint ones in what follows.

Let’s guess what properties an operator H should have to make \exp(-i t H) unitary for all t. We start by assuming it’s an isometry:

\langle \exp(-i t H) \psi, \exp(-i t H) \phi \rangle = \langle \psi, \phi \rangle

Then we differentiate this with respect to t and set t = 0, getting

\langle -iH \psi, \phi \rangle +  \langle \psi, -iH \phi \rangle = 0

or in other words

\langle H \psi, \phi \rangle = \langle \psi, H \phi \rangle

Physicists call an operator obeying this condition self-adjoint. Mathematicians know there’s more to it, but today is not the day to discuss such subtleties, intriguing though they be. All that matters now is that there is, indeed, a correspondence between self-adjoint operators and well-behaved ‘one-parameter unitary groups’ \exp(-i t H). This is called Stone’s Theorem.

But now let’s copy this argument to guess what properties an operator H must have to make \exp(t H) stochastic. We start by assuming \exp(t H) is stochastic, so

\int \exp(t H) \psi = \int \psi

and

\psi \ge 0 \quad \implies \quad \exp(t H) \psi \ge 0

We can differentiate the first equation with respect to t and set t = 0, getting

\int H \psi = 0

for all \psi.

But what about the second condition,

\psi \ge 0 \quad \implies \quad \exp(t H) \psi \ge 0?

It seems easier to deal with this in the special case when integrals over X reduce to sums. So let’s suppose that happens… and let’s start by seeing what the first condition says in this case.

In this case, L^1(X) has a basis of ‘Kronecker delta functions’: The Kronecker delta function \delta_i vanishes everywhere except at one point i \in X, where it equals 1. Using this basis, we can write any operator on L^1(X) as a matrix.

As a warmup, let’s see what it means for an operator

U: L^1(X) \to L^1(X)

to be stochastic in this case. We’ll take the conditions

\int U \psi = \int \psi

and

\psi \ge 0 \quad \implies \quad U \psi \ge 0

and rewrite them using matrices. For both, it’s enough to consider the case where \psi is a Kronecker delta, say \delta_j.

In these terms, the first condition says

\sum_{i \in X} U_{i j} = 1

for each column j. The second says

U_{i j} \ge 0

for all i, j. So, in this case, a stochastic operator is just a square matrix where each column sums to 1 and all the entries are nonnegative. (Such matrices are often called left stochastic.)

Next, let’s see what we need for an operator H to have the property that \exp(t H) is stochastic for all t \ge 0. It’s enough to assume t is very small, which lets us use the approximation

\exp(t H) = 1 + t H + \cdots

and work to first order in t. Saying that each column of this matrix sums to 1 then amounts to

\sum_{i \in X} \left( \delta_{i j} + t H_{i j} + \cdots \right) = 1

which requires

\sum_{i \in X} H_{i j} = 0

Saying that each entry is nonnegative amounts to

\delta_{i j} + t H_{i j} + \cdots  \ge 0

When i = j this will be automatic when t is small enough, so the meat of this condition is

H_{i j} \ge 0 \; \mathrm{if} \; i \ne j

So, let’s say H is an infinitesimal stochastic matrix if its columns sum to zero and its off-diagonal entries are nonnegative.
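
In code, the definition looks like this; the little 3-state example is made up purely for illustration:

```python
import numpy as np
from scipy.linalg import expm

def is_infinitesimal_stochastic(H, tol=1e-12):
    off_diag = H - np.diag(np.diag(H))
    return np.allclose(H.sum(axis=0), 0.0, atol=tol) and (off_diag >= -tol).all()

def is_stochastic(U, tol=1e-12):
    return np.allclose(U.sum(axis=0), 1.0, atol=tol) and (U >= -tol).all()

# Three states hopping cyclically, 0 -> 1 -> 2 -> 0, each at unit rate.
H = np.array([[-1.0,  0.0,  1.0],
              [ 1.0, -1.0,  0.0],
              [ 0.0,  1.0, -1.0]])

print(is_infinitesimal_stochastic(H))                            # True
print(all(is_stochastic(expm(t * H)) for t in (0.5, 1.0, 5.0)))  # True
```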

I don’t love this terminology: do you know a better one? There should be some standard term. People here say they’ve seen the term ‘stochastic Hamiltonian’. The idea behind my term is that any infinitesimal stochastic operator should be the infinitesimal generator of a stochastic process.

In other words, when we get the details straightened out, any 1-parameter family of stochastic operators

U(t) : L^1(X) \to L^1(X) \qquad   t \ge 0

obeying

U(0) = I

U(t) U(s) = U(t+s)

and continuity:

t_i \to t \quad \implies \quad U(t_i) \psi \to U(t)\psi

should be of the form

U(t) = \exp(t H)

for a unique ‘infinitesimal stochastic operator’ H.

When X is a finite set, this is true—and an infinitesimal stochastic operator is just a square matrix whose columns sum to zero and whose off-diagonal entries are nonnegative. But do you know a theorem characterizing infinitesimal stochastic operators for general measure spaces X? Someone must have worked it out.

Luckily, for our work on stochastic Petri nets, we only need to understand the case where X is a countable set and our integrals are really just sums. This should be almost like the case where X is a finite set—but we’ll need to take care that all our sums converge.

The moral

Now we can see why a Hamiltonian like a^\dagger is no good, while a^\dagger - 1 is good. (I’ll ignore the rate constant r since it’s irrelevant here.) The first one is not infinitesimal stochastic, while the second one is!

In this example, our set of states is the natural numbers:

X = \mathbb{N}

The probability distribution

\psi : \mathbb{N} \to \mathbb{C}

tells us the probability of having caught any specific number of fish.

The creation operator is not infinitesimal stochastic: in fact, it’s stochastic! Why? Well, when we apply the creation operator, what was the probability of having n fish now becomes the probability of having n+1 fish. So, the probabilities remain nonnegative, and their sum over all n is unchanged. Those two conditions are all we need for a stochastic operator.

Using our fancy abstract notation, these conditions say:

\int a^\dagger \psi = \int \psi

and

\psi \ge 0 \; \implies \; a^\dagger \psi \ge 0

So, precisely by virtue of being stochastic, the creation operator fails to be infinitesimal stochastic:

\int a^\dagger \psi \ne 0

Thus it’s a bad Hamiltonian for our stochastic Petri net.

On the other hand, a^\dagger - 1 is infinitesimal stochastic. Its off-diagonal entries are the same as those of a^\dagger, so they’re nonnegative. Moreover:

\int (a^\dagger - 1) \psi = 0

precisely because

\int a^\dagger \psi = \int \psi

You may be thinking: all this fancy math just to understand a single stochastic Petri net, the simplest one of all!

But next time I’ll explain a general recipe which will let you write down the Hamiltonian for any stochastic Petri net. The lessons we’ve learned today will make this much easier. And pondering the analogy between probability theory and quantum theory will also be good for our bigger project of unifying the applications of network diagrams to dozens of different subjects.


Network Theory (Part 4)

6 April, 2011

Last time I explained the rate equation of a stochastic Petri net. But now let’s get serious: let’s see what’s really stochastic— that is, random—about a stochastic Petri net. For this we need to forget the rate equation (temporarily) and learn about the ‘master equation’. This is where ideas from quantum field theory start showing up!

A Petri net has a bunch of states and a bunch of transitions. Here’s an example we’ve already seen, from chemistry:

The states are in yellow, the transitions in blue. A labelling of our Petri net is a way of putting some number of things in each state. We can draw these things as little black dots:

In this example there are only 0 or 1 things in each state: we’ve got one atom of carbon, one molecule of oxygen, one molecule of sodium hydroxide, one molecule of hydrochloric acid, and nothing else. But in general, we can have any natural number of things in each state.

In a stochastic Petri net, the transitions occur randomly as time passes. For example, as time passes we could see a sequence of transitions like this:

Each time a transition occurs, the number of things in each state changes in an obvious way.

The Master Equation

Now, I said the transitions occur ‘randomly’, but that doesn’t mean there’s no rhyme or reason to them! The miracle of probability theory is that it lets us state precise laws about random events. The law governing the random behavior of a stochastic Petri net is called the ‘master equation’.

In a stochastic Petri net, each transition has a rate constant, a nonnegative real number. Roughly speaking, this determines the probability of that transition.

A bit more precisely: suppose we have a Petri net that is labelled in some way at some moment. Then the probability that a given transition occurs in a short time \Delta t is approximately:

• the rate constant for that transition, times

• the time \Delta t, times

• the number of ways the transition can occur.

More precisely still: this formula is correct up to terms of order (\Delta t)^2. So, taking the limit as \Delta t \to 0, we get a differential equation describing precisely how the probability of the Petri net having a given labelling changes with time! And this is the master equation.

Now, you might be impatient to actually see the master equation, but that would be rash. The true master doesn’t need to see the master equation. It sounds like a Zen proverb, but it’s true. The raw beginner in mathematics wants to see the solutions of an equation. The more advanced student is content to prove that the solution exists. But the master is content to prove that the equation exists.

A bit more seriously: what matters is understanding the rules that inevitably lead to some equation: actually writing it down is then straightforward.

And you see, there’s something I haven’t explained yet: “the number of ways the transition can occur”. This involves a bit of counting. Consider, for example, this Petri net:

Suppose there are 10 rabbits and 5 wolves.

• How many ways can the birth transition occur? Since birth takes one rabbit as input, it can occur in 10 ways.

• How many ways can predation occur? Since predation takes one rabbit and one wolf as inputs, it can occur in 10 × 5 = 50 ways.

• How many ways can death occur? Since death takes one wolf as input, it can occur in 5 ways.

Or consider this one:

Suppose there are 10 hydrogen atoms and 5 oxygen atoms. How many ways can they form a water molecule? There are 10 ways to pick the first hydrogen, 9 ways to pick the second hydrogen, and 5 ways to pick the oxygen. So, there are

10 \times 9 \times 5 = 450

ways.

Note that we’re treating the hydrogen atoms as distinguishable, so there are 10 \times 9 ways to pick them, not \frac{10 \times 9}{2} = \binom{10}{2}. In general, the number of ways to choose M distinguishable things from a collection of L is the falling power

L^{\underline{M}} = L \cdot (L - 1) \cdots (L - M + 1)

where there are M factors in the product, but each is 1 less than the preceding one—hence the term ‘falling’.
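
Here’s the counting rule as a tiny function, just to make it concrete (the name is my own):

```python
def falling_power(L, M):
    """Number of ways to pick M distinguishable things out of L: L(L-1)...(L-M+1)."""
    ways = 1
    for i in range(M):
        ways *= (L - i)
    return ways

# The water example: pick 2 of 10 hydrogens (in order) and 1 of 5 oxygens.
print(falling_power(10, 2) * falling_power(5, 1))   # 450
```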

Okay, now I’ve given you all the raw ingredients to work out the master equation for any stochastic Petri net. The previous paragraph was a big fat hint. One more nudge and you’re on your own:

Puzzle. Suppose we have a stochastic Petri net with k states and just one transition, whose rate constant is r. Suppose the ith state appears m_i times as the input of this transition and n_i times as the output. A labelling of this stochastic Petri net is a k-tuple of natural numbers \ell = (\ell_1, \dots, \ell_k) saying how many things are in each state. Let \psi_\ell(t) be the probability that the labelling is \ell at time t. Then the master equation looks like this:

\frac{d}{d t} \psi_{\ell}(t)  = \sum_{\ell'} H_{\ell \ell'} \psi_{\ell'}(t)

for some matrix of real numbers H_{\ell \ell'}. What is this matrix?

You can write down a formula for this matrix using what I’ve told you. And then, if you have a stochastic Petri net with more transitions, you can just compute the matrix for each transition using this formula, and add them all up.

Someday I’ll tell you the answer to this puzzle, but I want to get there by a strange route: I want to guess the master equation using ideas from quantum field theory!

Some clues

Why?

Well, if we think about a stochastic Petri net whose labelling undergoes random transitions as I’ve described, you’ll see that any possible ‘history’ for the labelling can be drawn in a way that looks like a Feynman diagram. In quantum field theory, Feynman diagrams show how things interact and turn into other things. But that’s what stochastic Petri nets do, too!

For example, if our Petri net looks like this:

then a typical history can be drawn like this:

Some rabbits and wolves come in on top. They undergo some transitions as time passes, and go out on the bottom. The vertical coordinate is time, while the horizontal coordinate doesn’t really mean anything: it just makes the diagram easier to draw.

If we ignore all the artistry that makes it cute, this Feynman diagram is just a graph with states as edges and transitions as vertices. Each transition occurs at a specific time.

We can use these Feynman diagrams to compute the probability that if we start it off with some labelling at time t_1, our stochastic Petri net will wind up with some other labelling at time t_2. To do this, we just take a sum over Feynman diagrams that start and end with the given labellings. For each Feynman diagram, we integrate over all possible times at which the transitions occur. And what do we integrate? Just the product of the rate constants for those transitions!

That was a bit of a mouthful, and it doesn’t really matter if you followed it in detail. What matters is that it sounds a lot like stuff you learn when you study quantum field theory!

That’s one clue that something cool is going on here. Another is the master equation itself:

\frac{d}{dt} \psi_{\ell}(t) = \sum_{\ell'} H_{\ell \ell'} \psi_{\ell'}(t)

This looks a lot like Schrödinger’s equation, the basic equation describing how a quantum system changes with the passage of time.

We can make it look even more like Schrödinger’s equation if we create a vector space with the labellings \ell as a basis. The numbers \psi_\ell(t) will be the components of some vector \psi(t) in this vector space. The numbers H_{\ell \ell'} will be the matrix entries of some operator H on that vector space. And the master equation becomes:

\frac{d}{d t} \psi(t) = H \psi(t)

Compare Schrödinger’s equation:

i \frac{d}{d t} \psi(t) = H \psi(t)

The only visible difference is that factor of i!

But of course this is linked to another big difference: in the master equation \psi describes probabilities, so it’s a vector in a real vector space. In quantum theory \psi describes probability amplitudes, so it’s a vector in a complex Hilbert space.

Apart from this huge difference, everything is a lot like quantum field theory. In particular, our vector space is a lot like the Fock space one sees in quantum field theory. Suppose we have a quantum particle that can be in k different states. Then its Fock space is the Hilbert space we use to describe an arbitrary collection of such particles. It has an orthonormal basis denoted

| \ell_1 \cdots \ell_k \rangle

where \ell_1, \dots, \ell_k are natural numbers saying how many particles there are in each state. So, any vector in Fock space looks like this:

\psi = \sum_{\ell_1, \dots, \ell_k} \; \psi_{\ell_1 , \dots, \ell_k} \,  | \ell_1 \cdots \ell_k \rangle

But if we write the whole list \ell_1, \dots, \ell_k simply as \ell, this becomes

\psi = \sum_{\ell} \psi_{\ell}   | \ell \rangle

This is almost like what we’ve been doing with Petri nets!—except I hadn’t gotten around to giving names to the basis vectors.

In quantum field theory class, I learned lots of interesting operators on Fock space: annihilation and creation operators, number operators, and so on. So, when I bumped into this master equation

\frac{d}{d t} \psi(t) = H \psi(t)

it seemed natural to take the operator H and write it in terms of these. There was an obvious first guess, which didn’t quite work… but thinking a bit harder eventually led to the right answer. Later, it turned out people had already thought about similar things. So, I want to explain this.

When I first started working on this stuff, I was focused on the difference between collections of indistinguishable things, like bosons or fermions, and collections of distinguishable things, like rabbits or wolves.

But with the benefit of hindsight, it’s even more important to think about the difference between ordinary quantum theory, which is all about probability amplitudes, and the game we’re playing now, which is all about probabilities. So, next time I’ll explain how we need to modify quantum theory so that it’s about probabilities. This will make it easier to guess a nice formula for H.


Network Theory (Part 2)

31 March, 2011

Today I’d like to start telling you about some research Jacob Biamonte and I are doing on stochastic Petri nets, quantum field theory, and category theory. It’ll take a few blog entries to cover this story, which is part of a larger story about network theory.

Stochastic Petri nets are one of many different diagrammatic languages people have evolved to study complex systems. We’ll see how they’re used in chemistry, molecular biology, population biology and queuing theory, which is roughly the science of waiting in line. If you’re a faithful reader of this blog, you’ve already seen an example of a Petri net taken from chemistry:

It shows some chemicals and some reactions involving these chemicals. To make it into a stochastic Petri net, we’d just label each reaction by a nonnegative real number: the reaction rate constant, or rate constant for short.

In general, a Petri net will have a set of states, which we’ll draw as yellow circles, and a set of transitions, which we’ll draw as blue rectangles. Here’s a Petri net from population biology:

Now, instead of different chemicals, the states are different species. And instead of chemical reactions, the transitions are processes involving our species.

This Petri net has two states: rabbit and wolf. It has three transitions:

• In birth, one rabbit comes in and two go out. This is a caricature of reality: these bunnies reproduce asexually, splitting in two like amoebas.

• In predation, one wolf and one rabbit come in and two wolves go out. This is a caricature of how predators need to eat prey to reproduce.

• In death, one wolf comes in and nothing goes out. Note that we’re pretending rabbits don’t die unless they’re eaten by wolves.

If we labelled each transition with a rate constant, we’d have a stochastic Petri net.

To make this Petri net more realistic, we’d have to make it more complicated. I’m trying to explain general ideas here, not realistic models of specific situations. Nonetheless, this Petri net already leads to an interesting model of population dynamics: a special case of the so-called ‘Lotka-Volterra predator-prey model’. We’ll see the details soon.

More to the point, this Petri net illustrates some possibilities that our previous example neglected. Every transition has some ‘input’ states and some ‘output’ states. But a state can show up more than once as the output (or input) of some transition. And as we see in ‘death’, we can have a transition with no outputs (or inputs) at all.

But let me stop beating around the bush, and give you the formal definitions. They’re simple enough:

Definition. A Petri net consists of a set S of states and a set T of transitions, together with a function

i : S \times T \to \mathbb{N}

saying how many copies of each state show up as input for each transition, and a function

o: S \times T \to \mathbb{N}

saying how many times it shows up as output.

Definition. A stochastic Petri net is a Petri net together with a function

r: T \to [0,\infty)

giving a rate constant for each transition.
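
If it helps, here’s the definition rendered as a little data structure, with the rabbit-and-wolf example filled in. This is just a sketch: the field names and the rate constants are my own arbitrary choices.

```python
from dataclasses import dataclass, field

@dataclass
class StochasticPetriNet:
    states: list                                 # the set S
    transitions: list                            # the set T
    inputs: dict = field(default_factory=dict)   # i : S x T -> N, as {(state, transition): count}
    outputs: dict = field(default_factory=dict)  # o : S x T -> N
    rates: dict = field(default_factory=dict)    # r : T -> [0, infinity)

net = StochasticPetriNet(
    states=['rabbit', 'wolf'],
    transitions=['birth', 'predation', 'death'],
    inputs={('rabbit', 'birth'): 1,
            ('rabbit', 'predation'): 1, ('wolf', 'predation'): 1,
            ('wolf', 'death'): 1},
    outputs={('rabbit', 'birth'): 2, ('wolf', 'predation'): 2},   # pairs not listed are 0
    rates={'birth': 0.3, 'predation': 0.1, 'death': 0.2},         # made-up rate constants
)
```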

Starting from any stochastic Petri net, we can get two things. First:

• The master equation. This says how the probability that we have a given number of things in each state changes with time.

Since stochastic means ‘random’, the master equation is what gives stochastic Petri nets their name. The master equation is the main thing I’ll be talking about in future blog entries. But not right away!

Why not?

In chemistry, we typically have a huge number of things in each state. For example, a gram of water contains about 3 \times 10^{22} water molecules, and a smaller but still enormous number of hydroxide ions (OH^-), hydronium ions (H_3O^+), and other scarier things. These things blunder around randomly, bump into each other, and sometimes react and turn into other things. There’s a stochastic Petri net describing all this, as we’ll eventually see. But in this situation, we don’t usually want to know the probability that there are, say, exactly 3,101,849,578,476,264 hydronium ions. That would be too much information! We’d be quite happy knowing the expected value of the number of hydronium ions, so we’d be delighted to have a differential equation that says how this changes with time.

And luckily, such an equation exists—and it’s much simpler than the master equation. So, today we’ll talk about:

• The rate equation. This says how the expected number of things in each state changes with time.

But first, I hope you get the overall idea. The master equation is stochastic: at each time the number of things in each state is a random variable taking values in \mathbb{N}, the set of natural numbers. The rate equation is deterministic: at each time the expected number of things in each state is a non-random variable taking values in [0,\infty), the set of nonnegative real numbers. If the master equation is the true story, the rate equation is only approximately true—but the approximation becomes good in some limit where the expected value of the number of things in each state is large, and the standard deviation is comparatively small.

If you’ve studied physics, this should remind you of other things. The master equation should remind you of the quantum harmonic oscillator, where energy levels are discrete, and probabilities are involved. The rate equation should remind you of the classical harmonic oscillator, where energy levels are continuous, and everything is deterministic.

When we get to the ‘original research’ part of our story, we’ll see this analogy is fairly precise! We’ll take a bunch of ideas from quantum mechanics and quantum field theory, and tweak them a bit, and show how we can use them to describe the master equation for a stochastic Petri net.

Indeed, the random processes that the master equation describes can be drawn as pictures:

This looks like a Feynman diagram, with animals instead of particles! It’s pretty funny, but the resemblance is no joke: the math will back it up.

I’m dying to explain all the details. But just as classical field theory is easier than quantum field theory, the rate equation is simpler than the master equation. So we should start there.

The rate equation

If you hand me a stochastic Petri net, I can write down its rate equation. Instead of telling you the general rule, which sounds rather complicated at first, let me do an example. Take the Petri net we were just looking at:

We can make it into a stochastic Petri net by choosing a number for each transition:

• the birth rate constant \beta

• the predation rate constant \gamma

• the death rate constant \delta

Let x(t) be the number of rabbits and let y(t) be the number of wolves at time t. Then the rate equation looks like this:

\frac{d x}{d t} = \beta x - \gamma x y

\frac{d y}{d t} = \gamma x y - \delta y

It’s really a system of equations, but I’ll call the whole thing “the rate equation” because later we may get smart and write it as a single equation.

See how it works?

• We get a term \beta x in the equation for rabbits, because rabbits are born at a rate equal to the number of rabbits times the birth rate constant \beta.

• We get a term - \delta y in the equation for wolves, because wolves die at a rate equal to the number of wolves times the death rate constant \delta.

• We get a term - \gamma x y in the equation for rabbits, because rabbits die at a rate equal to the number of rabbits times the number of wolves times the predation rate constant \gamma.

• We also get a term \gamma x y in the equation for wolves, because wolves are born at a rate equal to the number of rabbits times the number of wolves times \gamma.

Of course I’m not claiming that this rate equation makes any sense biologically! For example, think about predation. The \gamma x y terms in the above equation would make sense if rabbits and wolves roamed around randomly, and whenever a wolf and a rabbit came within a certain distance, the wolf had a certain probability of eating the rabbit and giving birth to another wolf. At least it would make sense in the limit of large numbers of rabbits and wolves, where we can treat x and y as varying continuously rather than discretely. That’s a reasonable approximation to make sometimes. Unfortunately, rabbits and wolves don’t roam around randomly, and a wolf doesn’t spit out a new wolf each time it eats a rabbit.

Despite that, the equations

\frac{d x}{d t} = \beta x - \gamma x y

\frac{d y}{d t} = \gamma x y - \delta y

are actually studied in population biology. As I said, they’re a special case of the Lotka-Volterra predator-prey model, which looks like this:

\frac{d x}{d t} = \beta x - \gamma x y

\frac{d y}{d t} = \epsilon x y - \delta y

The point is that while these models are hideously oversimplified and thus quantitatively inaccurate, they exhibit interesting qualitative behavior that’s fairly robust. Depending on the rate constants, these equations can show either a stable equilibrium or stable periodic behavior. And as we go from one regime to the other, we see a kind of catastrophe called a “Hopf bifurcation”. I explained all this in week308 and week309. There I was looking at some other equations, not the Lotka-Volterra equations. But their qualitative behavior is the same!
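
If you want to see that periodic behavior for yourself, the rate equation is easy to integrate numerically. Here’s a small Python sketch using scipy; the particular rate constants and starting populations are made up, chosen only so the oscillations are easy to see.

from scipy.integrate import solve_ivp

beta, gamma, delta = 1.0, 0.02, 0.5     # invented rate constants

def rate_equation(t, state):
    x, y = state                         # rabbits, wolves
    dxdt = beta * x - gamma * x * y      # births minus predation
    dydt = gamma * x * y - delta * y     # predation minus deaths
    return [dxdt, dydt]

sol = solve_ivp(rate_equation, (0.0, 40.0), [40.0, 9.0], max_step=0.01)
# sol.t and sol.y now hold the trajectory; plotting x(t) and y(t) shows
# the familiar predator-prey cycles: rabbit peaks followed by wolf peaks.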

If you want stochastic Petri nets that give quantitatively accurate models, it’s better to retreat to chemistry. Compared to animals, molecules come a lot closer to roaming around randomly and having a chance of reacting when they come within a certain distance. So in chemistry, rate equations can be used to make accurate predictions.

But I’m digressing. I should be explaining the general recipe for getting a rate equation from a stochastic Petri net! You might not be able to guess it from just one example. But I sense that you’re getting tired. So let’s stop now. Next time I’ll do more examples, and maybe even write down a general formula. But if you’re feeling ambitious, you can try this now:

Puzzle. Can you write down a stochastic Petri net whose rate equation is the Lotka-Volterra predator-prey model:

\frac{d x}{d t} = \beta x - \gamma x y

\frac{d y}{d t} = \epsilon x y - \delta y

for arbitrary \beta, \gamma, \delta, \epsilon \ge 0? If not, for which values of these rate constants can you do it?

References

If you want to study a bit on your own, here are some great online references on stochastic Petri nets and their rate equations:

• Peter J. E. Goss and Jean Peccoud, Quantitative modeling of stochastic systems in molecular biology by using stochastic Petri nets, Proc. Natl. Acad. Sci. USA 95 (June 1998), 6750-6755.

• Jeremy Gunawardena, Chemical reaction network theory for in-silico biologists.

• Martin Feinberg, Lectures on reaction networks.

I should admit that the first two talk about ‘chemical reaction networks’ instead of Petri nets. That’s no big deal: any chemical reaction network gives a Petri net in a pretty obvious way. You can probably figure out how; if you get stuck, just ask.

Also, I should admit that Petri net people say place where I’m saying state.

Here are some other references, which aren’t free unless you have an online subscription or access to a library:

• Peter J. Haas, Stochastic Petri Nets: Modelling, Stability, Simulation, Springer, Berlin, 2002.

• F. Horn and R. Jackson, General mass action kinetics, Archive for Rational Mechanics and Analysis 47 (1972), 81–116.

• Ina Koch, Petri nets – a mathematical formalism to analyze chemical reaction networks, Molecular Informatics 29 (2010), 838–843.

• Darren James Wilkinson, Stochastic Modelling for Systems Biology, Taylor & Francis, New York, 2006.


Energy, the Environment, and What Mathematicians Can Do (Part 2)

20 March, 2011

A couple of days ago I begged for help with a math colloquium talk I’m giving this Wednesday at Hong Kong University.

The response was immediate and wonderfully useful. Thanks, everyone! If my actual audience is as knowledgeable and critical as you folks, I’ll be shocked and delighted.

But I only showed you the first part of the talk… because I hadn’t written the second part yet! And the second part is the hard part: it’s about “what mathematicians can do”.

Here’s a version including the second part:

Energy, the Environment, and What Mathematicians Can Do.

I include just one example of what you’re probably dying to see: a mathematician proving theorems that are relevant to environmental and energy problems. And you’ll notice that this guy is not doing work that will directly help solve these problems.

That’s sort of on purpose: I think we mathematicians sit sort of near the edge of the big conversation about these problems. We do important things, now and then, but their importance tends to be indirect. And I think that’s okay.

But it’s also a bit unsatisfying. What’s your most impressive example of a mathematically exciting result that also directly impacts environmental and energy issues?

I have a bunch of my own examples, but I’d like to hear yours. I want to start creating a list.

(By the way: research is just part of the story! One of the easier ways mathematicians can help save the planet is to teach well. And I do discuss that.)

