A Question About Graduate Schools

12 May, 2011

I got an email from a physics major asking for some advice about graduate programs. He said it would be okay if I posted it here. Maybe you can help out!

Prof. Baez,

My name is Blake Pollard, I am an undergraduate physics major at Columbia University graduating next week. I agree very much with the premise of and the need for the Azimuth Project and would like to help out. Though my passion is physics, most of my undergraduate research has been in climate and sustainability. I would really like to find a graduate program enabling me to do both physics and something useful for the environmental movement, hence I haven’t committed to a Ph.D. program in pure physics. I studied category theory a bit here at Columbia from Lauda, and took some representation theory with Khovanov, but I think (at least at this point in time) my calling in physics is geometrical algebras. I was planning on spending a year off reading on my own, trying to do some work, and making the decision between environmental, physics, or mathematics graduate studies. Your blog served me well as a guidepost in my early college years for reading good stuff, and I would appreciate any advice you have on:

1) graduate programs where I could do work on both mathematical physics and the environment

2) good people/places/projects that I could participate in in the coming year.

The Azimuth Project web resources have already been helpful in finding people to reach out to, but I figured you might have something or someone popping out of your head in particular.

I have programming experience in data mining, numerical simulations, remote sensing, and just having fun programming; decent math/physics background; and really just want to find a good place where good people are working hard. Like Göttingen way back in the day. Sorry for the longish email.

Thank you in advance for your time,

Blake S. Pollard
Applied Physics 2011

In a later email he added a bit more detail:

I think most people, though, would associate my goals with doing some physical modeling/analysis of environmental systems/problems, maybe a statistical-physical hybrid. That is probably what I would do in a PhD program on the environmental side of things. But I’m more thinking of having an advisor who does research in mathematical physics, while applying myself on the side to some problem related to the environment, like Google’s 20% projects. Probably it’s a bit too inter-departmental and too flexible for there to be a formal program for this (plus I’d likely be too busy!).

It seems though the answer might be simply doing a PhD in environmental sciences and doing physics in my spare time. Just organizing my own thoughts.

Are there any good grad programs that involve a mix of mathematical or theoretical physics and environmental science? I’ll take any good answers I get and add them to the Azimuth Wiki.


Crooks’ Fluctuation Theorem

30 April, 2011

guest post by Eric Downes

Christopher Jarzynski, Gavin Crooks, and some others have made a big splash by providing general equalities that relate free energy differences to non-equilibrium work values. The best place to start is the first two chapters of Gavin Crooks’ thesis:

• Gavin Crooks, Excursions in Statistical Dynamics, Ph.D. Thesis, Department of Chemistry, U.C. Berkeley, 1999.

Here is the ~1 kiloword summary:

If we consider the work W done on a system, Clausius’ Inequality states that

W \ge \Delta F

where \Delta F is the change in free energy. One must perform more work along a non-equilibrium path because of the second law of thermodynamics. The excess work

W- \Delta F

is dissipated as heat, and is basically the entropy change in the universe, measured in different units. But who knows how large the excess work will be…

One considers a small system for which we imagine there exists a distribution of thermodynamic work values W (more on that below) in moving a system through phase space. We start at a macrostate with free energy F_1, and (staying in touch with a thermal reservoir at inverse temperature \beta) move in finite time to a new non-equilibrium state. When this new state is allowed to equilibrate it will have free energy

F_1 + \Delta F

You can do this by changing the spin-spin coupling, compressing a gas, etc: you’re changing one of the parameters in the system’s Hamiltonian in a completely deterministic way, such that the structure of the Hamiltonian does not change, and the system still has well-defined microstates at all intervening times. Your total accumulated work values will follow

\displaystyle{ \langle \exp(-\beta W) \rangle = \exp(-\beta \Delta F)}

where the expectation value is over a distribution of all possible paths through the classical phase space. This is the Jarzynski Equality.
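To see the Jarzynski equality in action without simulating any dynamics, here is a minimal numerical check (my own sketch, not from Crooks’ thesis): for a Gaussian distribution of work values, the equality holds exactly when the mean work exceeds \Delta F by the dissipation term \beta \sigma^2/2. All parameter values below are made up for illustration.

```python
import math
import random

# Sketch: for a Gaussian work distribution the Jarzynski equality
# <exp(-beta W)> = exp(-beta dF) holds exactly when the mean work
# exceeds dF by the dissipation term beta*sigma^2/2.
# Parameter values are illustrative, not from any physical system.
random.seed(0)

beta = 1.0
dF = 2.0                          # free energy difference
sigma = 0.5                       # spread of work values
mu = dF + beta * sigma**2 / 2     # mean work (second law: mu >= dF)

samples = [random.gauss(mu, sigma) for _ in range(200_000)]
lhs = sum(math.exp(-beta * W) for W in samples) / len(samples)
rhs = math.exp(-beta * dF)

print(lhs, rhs)   # the two should agree to a few decimal places
```

Note that most sampled work values exceed \Delta F, yet the rare trajectories with small W dominate the exponential average; that is why the equality recovers an equilibrium quantity from non-equilibrium work.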

It has an analogue for quantum systems, which appears to be related to supersymmetry, somehow. But the proof for classical systems simply relies on a Markov chain that moves through state space and an appropriate definition for work (see below). I can dig up the reference if anyone wants.

This is actually a specific case of a more fundamental theorem discovered about a decade ago by Gavin Crooks: the Crooks fluctuation theorem:

\displaystyle{ \exp(-\beta(W- \Delta F)) = \frac{P_{\mathrm{fwd}}}{P_{\mathrm{rev}}} }

where P_{\mathrm{fwd}} is the probability of a particular forward path which requires work W, and P_{\mathrm{rev}} is the probability of its time-reversal dual (see Gavin Crooks’ thesis for more precise definitions).
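Again in the Gaussian toy case (an illustrative sketch, not Crooks’ general proof), one can check the fluctuation theorem pointwise: the log of the ratio of forward and reverse work densities comes out to exactly \beta(W - \Delta F).

```python
import math

# Sketch of the Crooks fluctuation theorem in the simplest tractable
# case: Gaussian work distributions. Forward and reverse distributions
# share a width sigma; their means are fixed by the dissipation
# relation mu = +-dF + beta*sigma^2/2. All numbers are illustrative.
beta, dF, sigma = 1.0, 2.0, 0.7

def gauss_pdf(x, mu, s):
    return math.exp(-(x - mu) ** 2 / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

mu_fwd = dF + beta * sigma**2 / 2    # mean work, forward protocol
mu_rev = -dF + beta * sigma**2 / 2   # mean work, reversed protocol

for W in [-1.0, 0.0, 1.5, 3.0]:
    ratio = gauss_pdf(W, mu_fwd, sigma) / gauss_pdf(-W, mu_rev, sigma)
    print(W, math.log(ratio), beta * (W - dF))  # last two columns agree
```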

How do we assign a thermodynamic work value to a path of microstates? At the risk of ruining it for you: It turns out that one can write a first law analog for a subsystem Hamiltonian. We start with:

H_{\mathrm{tot}} = H_{\mathrm{subsys}} + H_{\mathrm{environ}} + H_{\mathrm{interact}}

As with Gibbs’ derivation of the canonical ensemble, we never specify what H_{\mathrm{environ}} and H_{\mathrm{interact}} are, only that the number of degrees of freedom in H_{\mathrm{environ}} is very large, and H_{\mathrm{interact}} is a small coupling. You make the observation that work can be associated with changing the energy-levels of the microstates in H_{\mathrm{subsys}}, while heat is associated with the energy change when the (sub)system jumps from one microstate to another (due to H_{\mathrm{interact}}) with no change in the spectrum of available energies. This implies a rather deep connection between the Hamiltonian and thermodynamic work. The second figure in Gavin’s thesis explained everything for me, after that you can basically derive it yourself.

The only physical applications I am aware of are to Monte Carlo simulations and mesoscopic systems in nano- or molecular biophysics. In that regard John Baez’ recent relation between free energy and Rényi entropy is a nice potential competitor for the efficient calculation of free energy differences (which apparently normally requires multiple simulations at intervening temperatures, calculating the specific heat at each).

But the relation to Markov chains is much more interesting to me, because this is a very general mathematical object which can be related to a much broader class of problems. Heat ends up being associated with fluctuations in the system’s state, and the (phenomenological) energy values are kind of the “relative unlikelihood” of each state. The excess work turns out to be related to the Kullback-Leibler divergence between the forward and reverse path-probabilities.

For visual learners with a background in stat mech, this is all developed in a pedagogical talk I gave in Fall 2010 at U. Wisconsin-Madison’s Condensed Matter Theory Seminar; talk available here. I’m licensing it cc-by-sa-nc through the Creative Commons License. I’ve been sloppy with references, but I emphasize that this is not original work; it is my presentation of Crooks’ and Jarzynski’s. Nonetheless, any errors you find are my own. Hokay, have a nice day!


Network Theory (Part 7)

27 April, 2011

guest post by Jacob Biamonte

This post is part of a series on what John and I like to call Petri net field theory. Stochastic Petri nets can be used to model everything from vending machines to chemical reactions. Chemists have proven some powerful theorems about when these systems have equilibrium states. We’re trying to fold these old ideas into our fancy framework, in hopes that quantum field theory techniques could also be useful in this deep subject. We’ll describe the general theory later; today we’ll do an example from population biology.

Those of you following this series should know that I’m the calculation bunny for this project, with John playing the role of the wolf. If I don’t work quickly, drawing diagrams and trying to keep up with John’s space-bending quasar of information, I’ll be eaten alive! It’s no joke, so please try to respond and pretend to enjoy anything you read here. This will keep me alive for longer. If I did not take notes during our meetings, lots of this stuff would never have made it here, so I hope you enjoy it.

Amoeba reproduction and competition

Here’s a stochastic Petri net:

It shows a world with one state, amoeba, and two transitions:

reproduction, where one amoeba turns into two. Let’s call the rate constant for this transition \alpha.

competition, where two amoebas battle for resources and only one survives. Let’s call the rate constant for this transition \beta.

We are going to analyse this example in several ways. First we’ll study the deterministic dynamics it describes: we’ll look at its rate equation, which turns out to be the logistic equation, familiar in population biology. Then we’ll study the stochastic dynamics, meaning its master equation. That’s where the ideas from quantum field theory come in.

The rate equation

If P(t) is the population of amoebas at time t, we can follow the rules explained in Part 3 and crank out this rate equation:

\displaystyle{ \frac{d P}{d t} = \alpha P - \beta P^2}

We can rewrite this as

\displaystyle{\frac{d P }{d t}= k P(1-\frac{P}{Q}) }

where

\displaystyle{ Q = \frac{\alpha}{\beta} , \qquad k = \alpha}

What’s the meaning of Q and k?

Q is the carrying capacity, that is, the maximum sustainable population the environment can support.

k is the growth rate describing the approximately exponential growth of population when P(t) is small.

It’s a rare treat to find such an important differential equation that can be solved by analytical methods. Let’s enjoy solving it.

We start by separating variables and integrating both sides:

\displaystyle{\int \frac{d P}{P (1-P/Q)} = \int k d t}

We need to use partial fractions on the left side above, resulting in

\displaystyle{\int \frac{d P}{P}} + \displaystyle{\int \frac{d P}{Q-P} } = \displaystyle{\int k d t}

and so we pick up a constant C, let

A= \pm e^{-C}

and rearrange things as

\displaystyle{\frac{Q-P}{P}=A e^{-k t} }

so the population as a function of time becomes

\displaystyle{ P(t) = \frac{Q}{1+A e^{-k t}}}

At t=0 we can determine A uniquely. We write P_0 := P(0) and A becomes

\displaystyle{ A = \frac{Q-P_0}{P_0}}
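As a sanity check on the algebra above, here is a quick numerical sketch (parameter values are my own, purely illustrative): integrate the rate equation with a crude Euler step and compare against the closed-form logistic solution we just derived.

```python
import math

# Illustrative rate constants, not from the post
alpha, beta = 2.0, 0.5
Q, k = alpha / beta, alpha      # carrying capacity and growth rate
P0 = 1.0
A = (Q - P0) / P0

def P_exact(t):
    """Closed-form logistic solution derived above."""
    return Q / (1 + A * math.exp(-k * t))

# Crude Euler integration of dP/dt = alpha P - beta P^2
dt, steps = 1e-4, 30_000        # integrate out to t = 3
P = P0
for _ in range(steps):
    P += dt * (alpha * P - beta * P * P)

print(P, P_exact(steps * dt))   # should agree to about 3 decimal places
```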

The model now becomes very intuitive. Let’s set Q = k=1 and make a plot for various values of A:

We arrive at three distinct cases:

equilibrium (A=0). The horizontal blue line corresponds to the case where the initial population P_0 exactly equals the carrying capacity. In this case the population is constant.

dieoff (A < 0). The three decaying curves above the horizontal blue line correspond to cases where initial population is higher than the carrying capacity. The population dies off over time and approaches the carrying capacity.

growth (A > 0). The four increasing curves below the horizontal blue line represent cases where the initial population is lower than the carrying capacity. Now the population grows over time and approaches the carrying capacity.

The master equation

Next, let us follow the rules explained in Part 6 to write down the master equation for our example. Remember, now we write:

\displaystyle{\Psi(t) = \sum_{n = 0}^\infty \psi_n(t) z^n }

where \psi_n(t) is the probability of having n amoebas at time t, and z is a formal variable. The master equation says

\displaystyle{\frac{d}{d t} \Psi(t) = H \Psi(t)}

where H is an operator on formal power series called the Hamiltonian. To get the Hamiltonian we take each transition in our Petri net and build an operator from creation and annihilation operators, as follows. Reproduction works like this:

while competition works like this:

Here a is the annihilation operator, a^\dagger is the creation operator and N = a^\dagger a is the number operator. Last time John explained precisely how the N‘s arise. So the theory is already in place, and we arrive at this Hamiltonian:

H = \alpha (a^\dagger a^\dagger a - N) \;\; + \;\; \beta(a^\dagger a a - N(N-1))

Remember, \alpha is the rate constant for reproduction, while \beta is the rate constant for competition.

The master equation can be solved: it’s equivalent to

\frac{d}{d t}(e^{-t H}\Psi(t))=0

so that e^{-t H}\Psi(t) is constant, and so

\Psi(t) = e^{t H}\Psi(0)

and that’s it! We can calculate the time evolution starting from any initial probability distribution of populations. Maybe everyone is already used to this, but I find it rather remarkable.
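To make this concrete, here is a numerical sketch (my own illustration, with made-up rate constants): truncate the state space at some maximum population NMAX, write H as a matrix of transition rates, and check two things: every column of H sums to zero, so \exp(t H) preserves total probability, and evolving an initial state indeed keeps the probabilities summing to 1. The truncation is an assumption of the sketch, handled by simply forbidding reproduction out of the top state.

```python
# Illustrative rate constants and truncation level (assumptions of this sketch)
alpha, beta, NMAX = 1.0, 0.3, 40

# H[m][n] is the rate of going from n amoebas to m amoebas;
# the diagonal carries the total outflow, so each column sums to zero.
H = [[0.0] * (NMAX + 1) for _ in range(NMAX + 1)]
for n in range(NMAX + 1):
    if n + 1 <= NMAX:                 # reproduction: n -> n+1 at rate alpha*n
        H[n + 1][n] += alpha * n
        H[n][n] -= alpha * n
    if n >= 1:                        # competition: n -> n-1 at rate beta*n*(n-1)
        H[n - 1][n] += beta * n * (n - 1)
        H[n][n] -= beta * n * (n - 1)

col_sums = [sum(H[m][n] for m in range(NMAX + 1)) for n in range(NMAX + 1)]
print(max(abs(s) for s in col_sums))  # essentially zero

# Evolve psi(0) = "surely 5 amoebas" by Euler steps of d psi/dt = H psi
psi = [0.0] * (NMAX + 1)
psi[5] = 1.0
dt = 1e-3
for _ in range(1000):                 # evolve out to t = 1
    psi = [psi[m] + dt * sum(H[m][n] * psi[n] for n in range(NMAX + 1))
           for m in range(NMAX + 1)]
print(sum(psi))                       # total probability stays 1
```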

Here’s how it works. We pick a population, say n amoebas at t=0. This would mean \Psi(0) = z^n. We then evolve this state using e^{t H}. We expand this operator as

\begin{array}{ccl} e^{t H} &=&\displaystyle{ \sum_{n=0}^\infty \frac{t^n H^n}{n!} }  \\  \\  &=& \displaystyle{ 1 + t H + \frac{1}{2}t^2 H^2 + \cdots }\end{array}

This operator contains the full information for the evolution of the system. It contains the histories of all possible amoeba populations—an amoeba mosaic if you will. From this, we can construct amoeba Feynman diagrams.

To do this, we work out each of the H^n terms in the expansion above. The first-order terms correspond to the Hamiltonian acting once. These are proportional to either \alpha or \beta. The second-order terms correspond to the Hamiltonian acting twice. These are proportional to either \alpha^2, \alpha\beta or \beta^2. And so on.

This is where things start to get interesting! To illustrate how it works, we will consider two possibilities for the second-order terms:

1) We start with a lone amoeba, so \Psi(0) = z. It reproduces and splits into two. In the battle of the century, the resulting amoebas compete and one dies. At the end we have:

\frac{\alpha \beta}{2}  (a^\dagger a a)(a^\dagger a^\dagger a) z

We can draw this as a Feynman diagram:

You might find this tale grim, and you may not like the odds either. It’s true, the odds could be better, but people are worse off than amoebas! The great Japanese swordsman Miyamoto Musashi quoted the survival odds of fair sword duels as 1/3, seeing that 1/3 of the time both participants die. A remedy is to cheat, but these amoebas are competing honestly.

2) We start with two amoebas, so the initial state is \Psi(0) = z^2. One of these amoebas splits into two. One of these then gets into an argument with the original amoeba over the Azimuth blog. The amoeba who solved all John’s puzzles survives. At the end we have

\frac{\alpha \beta}{2} (a^\dagger a a)(a^\dagger a^\dagger a) z^2

with corresponding Feynman diagram:

This should give an idea of how this all works. The exponential of the Hamiltonian gives all possible histories, and each of these can be translated into a Feynman diagram. In a future blog entry, we might explain this theory in detail.

An equilibrium state

We’ve seen the equilibrium solution for the rate equation; now let’s look for equilibrium solutions of the master equation. This paper:

• D. F. Anderson, G. Craciun and T.G. Kurtz, Product-form stationary distributions for deficiency zero chemical reaction networks, arXiv:0803.3042.

proves that for a large class of stochastic Petri nets, there exists an equilibrium solution of the master equation where the number of things in each state is distributed according to a Poisson distribution. Even more remarkably, these probability distributions are independent, so knowing how many things are in one state tells you nothing about how many are in another!

Here’s a nice quote from this paper:

The surprising aspect of the deficiency zero theorem is that the assumptions of the theorem are completely related to the network of the system whereas the conclusions of the theorem are related to the dynamical properties of the system.

The ‘deficiency zero theorem’ is a result of Feinberg, which says that for a large class of stochastic Petri nets, the rate equation has a unique equilibrium solution. Anderson showed how to use this fact to get equilibrium solutions of the master equation!

We will consider this in future posts. For now, we need to talk a bit about ‘coherent states’.

These are all over the place in quantum theory. Legend (or at least Wikipedia) has it that Erwin Schrödinger himself discovered coherent states when he was looking for states of a quantum system that look ‘as classical as possible’. Suppose you have a quantum harmonic oscillator. Then the uncertainty principle says that

\Delta p \Delta q \ge \hbar/2

where \Delta p is the uncertainty in the momentum and \Delta q is the uncertainty in position. Suppose we want to make \Delta p \Delta q as small as possible, and suppose we also want \Delta p = \Delta q. Then we need our particle to be in a ‘coherent state’. That’s the definition. For the quantum harmonic oscillator, there’s a way to write quantum states as formal power series

\displaystyle{ \Psi = \sum_{n = 0}^\infty \psi_n z^n}

where \psi_n is the amplitude for having n quanta of energy. A coherent state then looks like this:

\displaystyle{ \Psi = e^{c z} = \sum_{n = 0}^\infty \frac{c^n}{n!} z^n}

where c can be any complex number. Here we have omitted a constant factor necessary to normalize the state.

We can also use coherent states in classical stochastic systems like collections of amoebas! Now the coefficient of z^n tells us the probability of having n amoebas, so c had better be real. And probabilities should sum to 1, so we really should normalize \Psi as follows:

\displaystyle{  \Psi = \frac{e^{c z}}{e^c} = e^{-c} \sum_{n = 0}^\infty \frac{c^n}{n!} z^n }

Now, the probability distribution

\displaystyle{\psi_n = e^{-c} \; \frac{c^n}{n!}}

is called a Poisson distribution. So, for starters you can think of a ‘coherent state’ as an over-educated way of talking about a Poisson distribution.
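In code, this over-educated Poisson distribution looks as follows (a quick sketch with an arbitrary c, truncated at enough terms that the tail is negligible): the coefficients are normalized and their mean is c, anticipating the calculation below.

```python
import math

# Sketch: a coherent state with parameter c, truncated at NMAX terms,
# is just a Poisson distribution. Check that it is normalized and that
# the expected number of amoebas is c. The value of c is illustrative.
c, NMAX = 3.5, 100
psi = [math.exp(-c) * c**n / math.factorial(n) for n in range(NMAX + 1)]

total = sum(psi)
mean = sum(n * p for n, p in enumerate(psi))
print(total, mean)   # ~1.0 and ~3.5
```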

Let’s work out the expected number of amoebas in this Poisson distribution. In the answers to the puzzles in Part 6, we started using this abbreviation:

\displaystyle{ \sum \Psi = \sum_{n = 0}^\infty \psi_n }

We also saw that the expected number of amoebas in the probability distribution \Psi is

\displaystyle{  \sum N \Psi }

What does this equal? Remember that N = a^\dagger a. The annihilation operator a is just \frac{d}{d z}, so

\displaystyle{ a \Psi = c \Psi}

and we get

\displaystyle{ \sum N \Psi = \sum a^\dagger a \Psi = c \sum a^\dagger \Psi }

But we saw in Part 5 that a^\dagger is stochastic, meaning

\displaystyle{  \sum a^\dagger \Psi = \sum \Psi }

for any \Psi. Furthermore, our \Psi here has

\displaystyle{ \sum \Psi = 1}

since it’s a probability distribution. So:

\displaystyle{  \sum N \Psi = c \sum a^\dagger \Psi = c \sum \Psi = c}

The expected number of amoebas is just c.

Puzzle 1. This calculation must be wrong if c is negative: there can’t be a negative number of amoebas. What goes wrong then?

Puzzle 2. Use the same tricks to calculate the standard deviation of the number of amoebas in the Poisson distribution \Psi.

Now let’s return to our problem and consider the initial amoeba state

\displaystyle{ \Psi = e^{c z}}

Here we aren’t bothering to normalize it, because we’re going to look for equilibrium solutions to the master equation, meaning solutions where \Psi(t) doesn’t change with time. So, we want to solve

\displaystyle{  H \Psi = 0}

Since this equation is linear, the normalization of \Psi doesn’t really matter.

Remember,

\displaystyle{  H\Psi = \alpha (a^\dagger a^\dagger a - N)\Psi + \beta(a^\dagger a a - N(N-1)) \Psi }

Let’s work this out. First consider the two \alpha terms:

\displaystyle{ a^\dagger a^\dagger a \Psi = c z^2 \Psi }

and

\displaystyle{ -N \Psi = -a^\dagger a\Psi = -c z \Psi}

Likewise for the \beta terms we find

\displaystyle{ a^\dagger a a\Psi=c^2 z \Psi}

and

\displaystyle{ -N(N-1)\Psi = -a^\dagger a^\dagger a a \Psi = -c^2 z^2\Psi }

Here I’m using something John showed in Part 6: the product a^\dagger a^\dagger a a equals the ‘falling power’ N(N-1).

The sum of all four terms must vanish. This happens whenever

\displaystyle{ \alpha(c z^2 - c z)+\beta(c^2 z-c^2 z^2) = 0}

which is satisfied for

\displaystyle{ c= \frac{\alpha}{\beta}}

Yippee! We’ve found an equilibrium solution, since we found a value for c that makes H \Psi = 0. Even better, we’ve seen that the expected number of amoebas in this equilibrium state is

\displaystyle{ \frac{\alpha}{\beta}}

This is just the same as the equilibrium population we saw for the rate equation—that is, the carrying capacity for the logistic equation! That’s pretty cool, but it’s no coincidence: in fact, Anderson proved it works like this for lots of stochastic Petri nets.
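Here is a numerical check of this equilibrium (a sketch with made-up rates, truncating the state space as an assumption): apply the Hamiltonian, written as transition rates, to the truncated Poisson distribution with c = \alpha/\beta, and confirm the result vanishes.

```python
import math

# Sketch (illustrative rates, not from the post): apply the amoeba
# Hamiltonian to a truncated Poisson distribution with c = alpha/beta
# and check that the result is numerically zero, i.e. H psi = 0.
alpha, beta, NMAX = 1.0, 0.25, 80
c = alpha / beta   # = 4, the expected equilibrium population

psi = [math.exp(-c) * c**n / math.factorial(n) for n in range(NMAX + 1)]

Hpsi = [0.0] * (NMAX + 1)
for n in range(NMAX + 1):
    if n + 1 <= NMAX:                       # reproduction n -> n+1
        Hpsi[n + 1] += alpha * n * psi[n]
        Hpsi[n] -= alpha * n * psi[n]
    if n >= 1:                              # competition n -> n-1
        Hpsi[n - 1] += beta * n * (n - 1) * psi[n]
        Hpsi[n] -= beta * n * (n - 1) * psi[n]

print(max(abs(x) for x in Hpsi))   # tiny: the Poisson state is an equilibrium
```

Under the hood this works because the Poisson ratio \psi_{n+1}/\psi_n = c/(n+1) exactly balances the upward flow \alpha n \psi_n against the downward flow \beta (n+1) n \psi_{n+1} when c = \alpha/\beta.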

I’m not sure what’s up next or what’s in store, since I’m blogging at gun point from inside a rabbit cage:

Give me all your blog posts, get out of that rabbit cage and reach for the sky!

I’d imagine we’re going to work out the theory behind this example and prove the existence of equilibrium solutions for master equations more generally. One idea John had was to have me start a night shift—that way you’ll get Azimuth posts 24 hours a day.


The Three-Fold Way

13 April, 2011

I just finished a series of blog posts about doing quantum theory using the real numbers, the complex numbers and the quaternions… and how Nature seems to use all three. Mathematically, they fit together in a structure that Freeman Dyson called The Three-Fold Way.

You can read all those blog posts here:

State-observable duality – Part 1: a review of normed division algebras.

State-observable duality – Part 2: the theorem by Jordan, von Neumann and Wigner classifying ‘finite-dimensional formally real Jordan algebras’.

State-observable duality – Part 3: the Koecher–Vinberg classification of self-dual homogeneous convex cones, and its relation to state-observable duality.

Solèr’s Theorem: Maria Pia Solèr’s amazing theorem from 1995, which characterizes Hilbert spaces over the real numbers, complex numbers and quaternions.

The Three-Fold Way – Part 1: two problems with real and quaternionic quantum mechanics.

The Three-Fold Way – Part 2: why irreducible unitary representations on complex Hilbert spaces come in three flavors: real, complex and quaternionic.

The Three-Fold Way – Part 3: why the “q” in “qubit” stands for “quaternion”.

The Three-Fold Way – Part 4: how turning a particle around 180 degrees is related to making it go backwards in time, and what all this has to do with real numbers and quaternions.

The Three-Fold Way – Part 5: a triangle of functors relating the categories of real, complex and quaternionic Hilbert spaces.

The Three-Fold Way – Part 6: how the three-fold way solves two problems with real and quaternionic quantum mechanics.

All these blog posts are based on the following paper… but they’ve got a lot more jokes, digressions and silly pictures thrown in, so personally I recommend the blog posts:

Quantum theory and division algebras.

And if you’re into normed division algebras and physics, you might like this talk I gave on my work with John Huerta, which also brings the octonions into the game:

Higher gauge theory, division algebras and superstrings.

Finally, around May, John and I will come out with a Scientific American article explaining the same stuff in a less technical way. It’ll be called “The strangest numbers in string theory”.

Whew! I think that’s enough division algebras for now. I’ve long been on a quest to save the quaternions and octonions from obscurity and show the world just how great they are. It’s time to declare victory and quit. There’s a more difficult quest ahead: the search for green mathematics, whatever that might be.


Network Theory (Part 5)

11 April, 2011

Last time we saw clues that stochastic Petri nets are a lot like quantum field theory, but with probabilities replacing amplitudes. There’s a powerful analogy at work here, which can help us a lot. So, this time I want to make that analogy precise.

But first, let me quickly sketch why it could be worthwhile.

A Poisson process

Consider this stochastic Petri net with rate constant r:

It describes an inexhaustible supply of fish swimming down a river, and getting caught when they run into a fisherman’s net. In any short time \Delta t there’s a chance of about r \Delta t of a fish getting caught. There’s also a chance of two or more fish getting caught, but this becomes negligible by comparison as \Delta t \to 0. Moreover, the chance of a fish getting caught during this interval of time is independent of what happens before or afterwards. This sort of process is called a Poisson process.

Problem. Suppose we start out knowing for sure there are no fish in the fisherman’s net. What’s the probability that he has caught n fish at time t?

At any time there will be some probability of having caught n fish; let’s call this probability \psi(n,t), or \psi(n) for short. We can summarize all these probabilities in a single power series, called a generating function:

\Psi(t) = \sum_{n=0}^\infty \psi(n,t) \, z^n

Here z is a formal variable—don’t ask what it means, for now it’s just a trick. In quantum theory we use this trick when talking about collections of photons rather than fish, but then the numbers \psi(n,t) are complex ‘amplitudes’. Now they are real probabilities, but we can still copy what the physicists do, and use this trick to rewrite the master equation as follows:

\frac{d}{d t} \Psi(t) = H \Psi(t)

This describes how the probability of having caught any given number of fish changes with time.

What’s the operator H? Well, in quantum theory we describe the creation of photons using a certain operator on power series called the creation operator:

a^\dagger \Psi = z \Psi

We can try to apply this to our fish. If at some time we’re 100% sure we have n fish, we have

\Psi = z^n

so applying the creation operator gives

a^\dagger \Psi = z^{n+1}

One more fish! That’s good. So, an obvious wild guess is

H = r a^\dagger

where r is the rate at which we’re catching fish. Let’s see how well this guess works.

If you know how to exponentiate operators, you know how to solve this equation:

\frac{d}{d t} \Psi(t) = H \Psi(t)

It’s easy:

\Psi(t) = \mathrm{exp}(t H) \Psi(0)

Since we start out knowing there are no fish in the net, we have

\Psi(0) = 1

so with our guess for H we get

\Psi(t) = \mathrm{exp}(r t a^\dagger) 1

But a^\dagger is the operator of multiplication by z, so \mathrm{exp}(r t a^\dagger) is multiplication by e^{r t z}, so

\Psi(t) = e^{r t z} = \sum_{n = 0}^\infty \frac{(r t)^n}{n!} \, z^n

So, if our guess is right, the probability of having caught n fish at time t is

\frac{(r t)^n}{n!}

Unfortunately, this can’t be right, because these probabilities don’t sum to 1! Instead their sum is

\sum_{n=0}^\infty \frac{(r t)^n}{n!} = e^{r t}

We can try to wriggle out of the mess we’re in by dividing our answer by this fudge factor. It sounds like a desperate measure, but we’ve got to try something!

This amounts to guessing that the probability of having caught n fish by time t is

\frac{(r t)^n}{n!} \, e^{-r t }

And this is right! This is called the Poisson distribution: it’s famous for being precisely the answer to the problem we’re facing.
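One can also check this by brute force (a sketch with illustrative numbers of my own): chop time into small steps, flip a coin with success probability r \Delta t in each step, and compare the empirical histogram of catches at time t with the Poisson formula.

```python
import math
import random

# Sketch: simulate the fisherman's Poisson process directly by flipping
# a coin with success probability r*dt in each small time step, then
# compare the empirical distribution of catches at time t against the
# Poisson formula (r t)^n e^{-r t} / n!. Parameters are illustrative.
random.seed(1)
r, t, dt = 2.0, 3.0, 3e-3
steps = int(t / dt)
trials = 5_000

counts = {}
for _ in range(trials):
    n = sum(1 for _ in range(steps) if random.random() < r * dt)
    counts[n] = counts.get(n, 0) + 1

for n in range(10):
    empirical = counts.get(n, 0) / trials
    exact = (r * t) ** n * math.exp(-r * t) / math.factorial(n)
    print(n, round(empirical, 3), round(exact, 3))
```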

So on the one hand our wild guess about H was wrong, but on the other hand it was not so far off. We can fix it as follows:

H = r (a^\dagger - 1)

The extra -1 gives us the fudge factor we need.

So, a wild guess corrected by an ad hoc procedure seems to have worked! But what’s really going on?

What’s really going on is that a^\dagger, or any multiple of this, is not a legitimate Hamiltonian for a master equation: if we define a time evolution operator \exp(t H) using a Hamiltonian like this, probabilities won’t sum to 1! But a^\dagger - 1 is okay. So, we need to think about which Hamiltonians are okay.

In quantum theory, self-adjoint Hamiltonians are okay. But in probability theory, we need some other kind of Hamiltonian. Let’s figure it out.

Probability versus quantum theory

Suppose we have a system of any kind: physical, chemical, biological, economic, whatever. The system can be in different states. In the simplest sort of model, we say there’s some set X of states, and say that at any moment in time the system is definitely in one of these states. But I want to compare two other options:

• In a probabilistic model, we may instead say that the system has a probability \psi(x) of being in any state x \in X. These probabilities are nonnegative real numbers with

\sum_{x \in X} \psi(x) = 1

• In a quantum model, we may instead say that the system has an amplitude \psi(x) of being in any state x \in X. These amplitudes are complex numbers with

\sum_{x \in X} | \psi(x) |^2 = 1

Probabilities and amplitudes are similar yet strangely different. Of course given an amplitude we can get a probability by taking its absolute value and squaring it. This is a vital bridge from quantum theory to probability theory. Today, however, I don’t want to focus on the bridges, but rather the parallels between these theories.

We often want to replace the sums above by integrals. For that we need to replace our set X by a measure space, which is a set equipped with enough structure that you can integrate real or complex functions defined on it. Well, at least you can integrate so-called ‘integrable’ functions—but I’ll neglect all issues of analytical rigor here. Then:

• In a probabilistic model, the system has a probability distribution \psi : X \to \mathbb{R}, which obeys \psi \ge 0 and

\int_X \psi(x) \, d x = 1

• In a quantum model, the system has a wavefunction \psi : X \to \mathbb{C}, which obeys

\int_X | \psi(x) |^2 \, d x= 1

In probability theory, we integrate \psi over a set S \subset X to find out the probability that our system’s state is in this set. In quantum theory we integrate |\psi|^2 over the set to answer the same question.

We don’t need to think about sums over sets and integrals over measure spaces separately: there’s a way to make any set X into a measure space such that by definition,

\int_X \psi(x) \, dx = \sum_{x \in X} \psi(x)

In short, integrals are more general than sums! So, I’ll mainly talk about integrals, until the very end.

In probability theory, we want our probability distributions to be vectors in some vector space. Ditto for wave functions in quantum theory! So, we make up some vector spaces:

• In probability theory, the probability distribution \psi is a vector in the space

L^1(X) = \{ \psi: X \to \mathbb{C} \; : \; \int_X |\psi(x)| \, d x < \infty \}

• In quantum theory, the wavefunction \psi is a vector in the space

L^2(X) = \{ \psi: X \to \mathbb{C} \; : \; \int_X |\psi(x)|^2 \, d x < \infty \}

You may wonder why I defined L^1(X) to consist of complex functions when probability distributions are real. I’m just struggling to make the analogy seem as strong as possible. In fact probability distributions are not just real but nonnegative. We need to say this somewhere… but we can, if we like, start by saying they’re complex-valued functions, but then whisper that they must in fact be nonnegative (and thus real). It’s not the most elegant solution, but that’s what I’ll do for now.

Now:

• The main thing we can do with elements of L^1(X), besides what we can do with vectors in any vector space, is integrate one. This gives a linear map:

\int : L^1(X) \to \mathbb{C}

• The main thing we can do with elements of L^2(X), besides the things we can do with vectors in any vector space, is take the inner product of two:

\langle \psi, \phi \rangle = \int_X \overline{\psi}(x) \phi(x) \, d x

This gives a map that’s linear in one slot and conjugate-linear in the other:

\langle - , - \rangle :  L^2(X) \times L^2(X) \to \mathbb{C}

First came probability theory with L^1(X); then came quantum theory with L^2(X). Naive extrapolation would say it’s about time for someone to invent an even more bizarre theory of reality based on L^3(X). In this, you’d have to integrate the product of three wavefunctions to get a number! The math of L^p spaces is already well-developed, so give it a try if you want. I’ll stick to L^1 and L^2 today.

Stochastic versus unitary operators

Now let’s think about time evolution:

• In probability theory, the passage of time is described by a map sending probability distributions to probability distributions. This is described using a stochastic operator

U : L^1(X) \to L^1(X)

meaning a linear operator such that

\int U \psi = \int \psi

and

\psi \ge 0 \quad \implies \quad U \psi \ge 0

• In quantum theory the passage of time is described by a map sending wavefunctions to wavefunctions. This is described using an isometry

U : L^2(X) \to L^2(X)

meaning a linear operator such that

\langle U \psi , U \phi \rangle = \langle \psi , \phi \rangle

In quantum theory we usually want time evolution to be reversible, so we focus on isometries that have inverses: these are called unitary operators. In probability theory we often consider stochastic operators that are not invertible.

Infinitesimal stochastic versus self-adjoint operators

Sometimes it’s nice to think of time coming in discrete steps. But in theories where we treat time as continuous, to describe time evolution we usually need to solve a differential equation. This is true in both probability theory and quantum theory:

• In probability theory we often describe time evolution using a differential equation called the master equation:

\frac{d}{d t} \psi(t) = H \psi(t)

whose solution is

\psi(t) = \exp(t H)\psi(0)

• In quantum theory we often describe time evolution using a differential equation called Schrödinger’s equation:

i \frac{d}{d t} \psi(t) = H \psi(t)

whose solution is

\psi(t) = \exp(-i t H)\psi(0)

In fact the appearance of i in the quantum case is purely conventional; we could drop it to make the analogy better, but then we’d have to work with ‘skew-adjoint’ operators instead of self-adjoint ones in what follows.

Let’s guess what properties an operator H should have to make \exp(-i t H) unitary for all t. We start by assuming it’s an isometry:

\langle \exp(-i t H) \psi, \exp(-i t H) \phi \rangle = \langle \psi, \phi \rangle

Then we differentiate this with respect to t and set t = 0, getting

\langle -iH \psi, \phi \rangle +  \langle \psi, -iH \phi \rangle = 0

or in other words

\langle H \psi, \phi \rangle = \langle \psi, H \phi \rangle

Physicists call an operator obeying this condition self-adjoint. Mathematicians know there’s more to it, but today is not the day to discuss such subtleties, intriguing though they be. All that matters now is that there is, indeed, a correspondence between self-adjoint operators and well-behaved ‘one-parameter unitary groups’ \exp(-i t H). This is called Stone’s Theorem.
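At the level of finite matrices, we can check this correspondence numerically. Here’s a small sketch (the matrix H is randomly generated purely for illustration; numpy and scipy are assumed):

```python
import numpy as np
from scipy.linalg import expm

# A randomly chosen self-adjoint matrix: H equals its conjugate transpose.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2

# U(t) = exp(-i t H) should then be unitary for every t:
# it preserves inner products, so U^dagger U is the identity.
t = 0.7
U = expm(-1j * t * H)
print(np.allclose(U.conj().T @ U, np.eye(4)))  # True
```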

But now let’s copy this argument to guess what properties an operator H must have to make \exp(t H) stochastic. We start by assuming \exp(t H) is stochastic, so

\int \exp(t H) \psi = \int \psi

and

\psi \ge 0 \quad \implies \quad \exp(t H) \psi \ge 0

We can differentiate the first equation with respect to t and set t = 0, getting

\int H \psi = 0

for all \psi.

But what about the second condition,

\psi \ge 0 \quad \implies \quad \exp(t H) \psi \ge 0?

It seems easier to deal with this in the special case when integrals over X reduce to sums. So let’s suppose that happens… and let’s start by seeing what the first condition says in this case.

In this case, L^1(X) has a basis of ‘Kronecker delta functions’: The Kronecker delta function \delta_i vanishes everywhere except at one point i \in X, where it equals 1. Using this basis, we can write any operator on L^1(X) as a matrix.

As a warmup, let’s see what it means for an operator

U: L^1(X) \to L^1(X)

to be stochastic in this case. We’ll take the conditions

\int U \psi = \int \psi

and

\psi \ge 0 \quad \implies \quad U \psi \ge 0

and rewrite them using matrices. For both, it’s enough to consider the case where \psi is a Kronecker delta, say \delta_j.

In these terms, the first condition says

\sum_{i \in X} U_{i j} = 1

for each column j. The second says

U_{i j} \ge 0

for all i, j. So, in this case, a stochastic operator is just a square matrix where each column sums to 1 and all the entries are nonnegative. (Such matrices are often called left stochastic.)
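Here’s a tiny numerical check of this characterization (the specific matrix is my own illustrative choice):

```python
import numpy as np

# A left stochastic matrix: nonnegative entries, each column summing to 1.
U = np.array([[0.9, 0.2],
              [0.1, 0.8]])

def is_stochastic(U):
    # the two conditions derived above, in matrix form
    return bool(np.allclose(U.sum(axis=0), 1.0) and np.all(U >= 0))

# Applying U to a probability distribution gives a probability distribution.
psi = np.array([0.5, 0.5])
print(is_stochastic(U), np.isclose((U @ psi).sum(), 1.0))  # True True
```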

Next, let’s see what we need for an operator H to have the property that \exp(t H) is stochastic for all t \ge 0. It’s enough to assume t is very small, which lets us use the approximation

\exp(t H) = 1 + t H + \cdots

and work to first order in t. Saying that each column of this matrix sums to 1 then amounts to

\sum_{i \in X} \delta_{i j} + t H_{i j} + \cdots  = 1

which requires

\sum_{i \in X} H_{i j} = 0

Saying that each entry is nonnegative amounts to

\delta_{i j} + t H_{i j} + \cdots  \ge 0

When i = j this will be automatic when t is small enough, so the meat of this condition is

H_{i j} \ge 0 \; \mathrm{if} \; i \ne j

So, let’s say H is an infinitesimal stochastic matrix if its columns sum to zero and its off-diagonal entries are nonnegative.
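Conversely, exponentiating such a matrix should give stochastic operators. A quick numerical sketch (the entries of H are illustrative choices of mine):

```python
import numpy as np
from scipy.linalg import expm

# An infinitesimal stochastic matrix: off-diagonal entries nonnegative,
# every column summing to zero.
H = np.array([[-1.0,  2.0],
              [ 1.0, -2.0]])

# exp(tH) should then be stochastic for every t >= 0:
# columns sum to 1 and entries stay nonnegative.
for t in [0.1, 1.0, 10.0]:
    U = expm(t * H)
    print(np.allclose(U.sum(axis=0), 1.0), np.all(U >= -1e-12))  # True True
```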

I don’t love this terminology: do you know a better one? There should be some standard term. People here say they’ve seen the term ‘stochastic Hamiltonian’. The idea behind my term is that any infinitesimal stochastic operator should be the infinitesimal generator of a stochastic process.

In other words, when we get the details straightened out, any 1-parameter family of stochastic operators

U(t) : L^1(X) \to L^1(X) \qquad   t \ge 0

obeying

U(0) = I

U(t) U(s) = U(t+s)

and continuity:

t_i \to t \quad \implies \quad U(t_i) \psi \to U(t)\psi

should be of the form

U(t) = \exp(t H)

for a unique ‘infinitesimal stochastic operator’ H.

When X is a finite set, this is true—and an infinitesimal stochastic operator is just a square matrix whose columns sum to zero and whose off-diagonal entries are nonnegative. But do you know a theorem characterizing infinitesimal stochastic operators for general measure spaces X? Someone must have worked it out.

Luckily, for our work on stochastic Petri nets, we only need to understand the case where X is a countable set and our integrals are really just sums. This should be almost like the case where X is a finite set—but we’ll need to take care that all our sums converge.

The moral

Now we can see why a Hamiltonian like a^\dagger is no good, while a^\dagger - 1 is good. (I’ll ignore the rate constant r since it’s irrelevant here.) The first one is not infinitesimal stochastic, while the second one is!

In this example, our set of states is the natural numbers:

X = \mathbb{N}

The probability distribution

\psi : \mathbb{N} \to \mathbb{C}

tells us the probability of having caught any specific number of fish.

The creation operator is not infinitesimal stochastic: in fact, it’s stochastic! Why? Well, when we apply the creation operator, what was the probability of having n fish now becomes the probability of having n+1 fish. So, the probabilities remain nonnegative, and their sum over all n is unchanged. Those two conditions are all we need for a stochastic operator.

Using our fancy abstract notation, these conditions say:

\int a^\dagger \psi = \int \psi

and

\psi \ge 0 \; \implies \; a^\dagger \psi \ge 0

So, precisely by virtue of being stochastic, the creation operator fails to be infinitesimal stochastic:

\int a^\dagger \psi \ne 0

Thus it’s a bad Hamiltonian for our stochastic Petri net.

On the other hand, a^\dagger - 1 is infinitesimal stochastic. Its off-diagonal entries are the same as those of a^\dagger, so they’re nonnegative. Moreover:

\int (a^\dagger - 1) \psi = 0

precisely because

\int a^\dagger \psi = \int \psi
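If you like, you can verify both claims with truncated matrices. The truncation to finitely many fish is purely my own device for getting finite matrices; probability leaks out of the last column, so we only check the interior ones:

```python
import numpy as np

N = 6  # allow at most N - 1 fish (an artifact of using finite matrices)

# The creation operator shifts the probability of having n fish to n + 1 fish,
# so its matrix has 1's just below the diagonal.
a_dag = np.diag(np.ones(N - 1), k=-1)

# a^dagger is stochastic: nonnegative entries, interior columns summing to 1.
print(np.all(a_dag >= 0), a_dag.sum(axis=0)[:-1])  # True [1. 1. 1. 1. 1.]

# a^dagger - 1 is infinitesimal stochastic: its off-diagonal entries are
# those of a^dagger (hence nonnegative), and interior columns sum to 0.
H = a_dag - np.eye(N)
print(H.sum(axis=0)[:-1])  # [0. 0. 0. 0. 0.]
```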

You may be thinking: all this fancy math just to understand a single stochastic Petri net, the simplest one of all!

But next time I’ll explain a general recipe which will let you write down the Hamiltonian for any stochastic Petri net. The lessons we’ve learned today will make this much easier. And pondering the analogy between probability theory and quantum theory will also be good for our bigger project of unifying the applications of network diagrams to dozens of different subjects.


Network Theory (Part 4)

6 April, 2011

Last time I explained the rate equation of a stochastic Petri net. But now let’s get serious: let’s see what’s really stochastic—that is, random—about a stochastic Petri net. For this we need to forget the rate equation (temporarily) and learn about the ‘master equation’. This is where ideas from quantum field theory start showing up!

A Petri net has a bunch of states and a bunch of transitions. Here’s an example we’ve already seen, from chemistry:

The states are in yellow, the transitions in blue. A labelling of our Petri net is a way of putting some number of things in each state. We can draw these things as little black dots:

In this example there are only 0 or 1 things in each state: we’ve got one atom of carbon, one molecule of oxygen, one molecule of sodium hydroxide, one molecule of hydrochloric acid, and nothing else. But in general, we can have any natural number of things in each state.

In a stochastic Petri net, the transitions occur randomly as time passes. For example, as time passes we could see a sequence of transitions like this:

Each time a transition occurs, the number of things in each state changes in an obvious way.

The Master Equation

Now, I said the transitions occur ‘randomly’, but that doesn’t mean there’s no rhyme or reason to them! The miracle of probability theory is that it lets us state precise laws about random events. The law governing the random behavior of a stochastic Petri net is called the ‘master equation’.

In a stochastic Petri net, each transition has a rate constant, a nonnegative real number. Roughly speaking, this determines the probability of that transition.

A bit more precisely: suppose we have a Petri net that is labelled in some way at some moment. Then the probability that a given transition occurs in a short time \Delta t is approximately:

• the rate constant for that transition, times

• the time \Delta t, times

• the number of ways the transition can occur.

More precisely still: this formula is correct up to terms of order (\Delta t)^2. So, taking the limit as \Delta t \to 0, we get a differential equation describing precisely how the probability of the Petri net having a given labelling changes with time! And this is the master equation.

Now, you might be impatient to actually see the master equation, but that would be rash. The true master doesn’t need to see the master equation. It sounds like a Zen proverb, but it’s true. The raw beginner in mathematics wants to see the solutions of an equation. The more advanced student is content to prove that the solution exists. But the master is content to prove that the equation exists.

A bit more seriously: what matters is understanding the rules that inevitably lead to some equation: actually writing it down is then straightforward.

And you see, there’s something I haven’t explained yet: “the number of ways the transition can occur”. This involves a bit of counting. Consider, for example, this Petri net:

Suppose there are 10 rabbits and 5 wolves.

• How many ways can the birth transition occur? Since birth takes one rabbit as input, it can occur in 10 ways.

• How many ways can predation occur? Since predation takes one rabbit and one wolf as inputs, it can occur in 10 × 5 = 50 ways.

• How many ways can death occur? Since death takes one wolf as input, it can occur in 5 ways.
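This ‘rate constant times number of ways’ recipe is all a computer needs to simulate the net, one random transition at a time. Here is a minimal sketch in the style of the Gillespie algorithm (the rate constants, the random seed, and all function names are illustrative choices of mine, not from the text):

```python
import random

random.seed(1)
beta, gamma, delta = 0.5, 0.02, 0.3  # illustrative rate constants

def step(rabbits, wolves):
    """Perform one random transition; each fires at a rate equal to
    its rate constant times the number of ways it can occur."""
    rates = {
        "birth":     beta * rabbits,            # one rabbit as input: rabbits ways
        "predation": gamma * rabbits * wolves,  # a rabbit and a wolf: rabbits x wolves ways
        "death":     delta * wolves,            # one wolf as input: wolves ways
    }
    total = sum(rates.values())
    if total == 0:                              # nothing can happen any more
        return rabbits, wolves, None
    dt = random.expovariate(total)              # exponential waiting time
    which = random.choices(list(rates), weights=list(rates.values()))[0]
    if which == "birth":
        rabbits += 1
    elif which == "predation":
        rabbits, wolves = rabbits - 1, wolves + 1
    else:
        wolves -= 1
    return rabbits, wolves, dt

r, w = 10, 5
for _ in range(50):
    r, w, _dt = step(r, w)
print(r, w)  # populations after 50 random transitions
```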

Or consider this one:

Suppose there are 10 hydrogen atoms and 5 oxygen atoms. How many ways can they form a water molecule? There are 10 ways to pick the first hydrogen, 9 ways to pick the second hydrogen, and 5 ways to pick the oxygen. So, there are

10 \times 9 \times 5 = 450

ways.

Note that we’re treating the hydrogen atoms as distinguishable, so there are 10 \times 9 ways to pick them, not \frac{10 \times 9}{2} = \binom{10}{2}. In general, the number of ways to choose M distinguishable things from a collection of L is the falling power

L^{\underline{M}} = L \cdot (L - 1) \cdots (L - M + 1)

where there are M factors in the product, but each is 1 less than the preceding one—hence the term ‘falling’.
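In code the falling power is just a short loop; a quick sketch:

```python
def falling_power(L, M):
    """L * (L - 1) * ... * (L - M + 1), with M factors."""
    result = 1
    for i in range(M):
        result *= L - i
    return result

print(falling_power(10, 2))                        # 90 ordered pairs of hydrogens
print(falling_power(10, 2) * falling_power(5, 1))  # 450, as computed above
```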

Okay, now I’ve given you all the raw ingredients to work out the master equation for any stochastic Petri net. The previous paragraph was a big fat hint. One more nudge and you’re on your own:

Puzzle. Suppose we have a stochastic Petri net with k states and just one transition, whose rate constant is r. Suppose the ith state appears m_i times as the input of this transition and n_i times as the output. A labelling of this stochastic Petri net is a k-tuple of natural numbers \ell = (\ell_1, \dots, \ell_k) saying how many things are in each state. Let \psi_\ell(t) be the probability that the labelling is \ell at time t. Then the master equation looks like this:

\frac{d}{d t} \psi_{\ell}(t)  = \sum_{\ell'} H_{\ell \ell'} \psi_{\ell'}(t)

for some matrix of real numbers H_{\ell \ell'}. What is this matrix?

You can write down a formula for this matrix using what I’ve told you. And then, if you have a stochastic Petri net with more transitions, you can just compute the matrix for each transition using this formula, and add them all up.

Someday I’ll tell you the answer to this puzzle, but I want to get there by a strange route: I want to guess the master equation using ideas from quantum field theory!

Some clues

Why?

Well, if we think about a stochastic Petri net whose labelling undergoes random transitions as I’ve described, you’ll see that any possible ‘history’ for the labelling can be drawn in a way that looks like a Feynman diagram. In quantum field theory, Feynman diagrams show how things interact and turn into other things. But that’s what stochastic Petri nets do, too!

For example, if our Petri net looks like this:

then a typical history can be drawn like this:

Some rabbits and wolves come in on top. They undergo some transitions as time passes, and go out on the bottom. The vertical coordinate is time, while the horizontal coordinate doesn’t really mean anything: it just makes the diagram easier to draw.

If we ignore all the artistry that makes it cute, this Feynman diagram is just a graph with states as edges and transitions as vertices. Each transition occurs at a specific time.

We can use these Feynman diagrams to compute the probability that if we start it off with some labelling at time t_1, our stochastic Petri net will wind up with some other labelling at time t_2. To do this, we just take a sum over Feynman diagrams that start and end with the given labellings. For each Feynman diagram, we integrate over all possible times at which the transitions occur. And what do we integrate? Just the product of the rate constants for those transitions!

That was a bit of a mouthful, and it doesn’t really matter if you followed it in detail. What matters is that it sounds a lot like stuff you learn when you study quantum field theory!

That’s one clue that something cool is going on here. Another is the master equation itself:

\frac{d}{dt} \psi_{\ell}(t) = \sum_{\ell'} H_{\ell \ell'} \psi_{\ell'}(t)

This looks a lot like Schrödinger’s equation, the basic equation describing how a quantum system changes with the passage of time.

We can make it look even more like Schrödinger’s equation if we create a vector space with the labellings \ell as a basis. The numbers \psi_\ell(t) will be the components of some vector \psi(t) in this vector space. The numbers H_{\ell \ell'} will be the matrix entries of some operator H on that vector space. And the master equation becomes:

\frac{d}{d t} \psi(t) = H \psi(t)

Compare Schrödinger’s equation:

i \frac{d}{d t} \psi(t) = H \psi(t)

The only visible difference is that factor of i!

But of course this is linked to another big difference: in the master equation \psi describes probabilities, so it’s a vector in a real vector space. In quantum theory \psi describes probability amplitudes, so it’s a vector in a complex Hilbert space.

Apart from this huge difference, everything is a lot like quantum field theory. In particular, our vector space is a lot like the Fock space one sees in quantum field theory. Suppose we have a quantum particle that can be in k different states. Then its Fock space is the Hilbert space we use to describe an arbitrary collection of such particles. It has an orthonormal basis denoted

| \ell_1 \cdots \ell_k \rangle

where \ell_1, \dots, \ell_k are natural numbers saying how many particles there are in each state. So, any vector in Fock space looks like this:

\psi = \sum_{\ell_1, \dots, \ell_k} \; \psi_{\ell_1 , \dots, \ell_k} \,  | \ell_1 \cdots \ell_k \rangle

But if we write the whole list \ell_1, \dots, \ell_k simply as \ell, this becomes

\psi = \sum_{\ell} \psi_{\ell}   | \ell \rangle

This is almost like what we’ve been doing with Petri nets!—except I hadn’t gotten around to giving names to the basis vectors.

In quantum field theory class, I learned lots of interesting operators on Fock space: annihilation and creation operators, number operators, and so on. So, when I bumped into this master equation

\frac{d}{d t} \psi(t) = H \psi(t)

it seemed natural to take the operator H and write it in terms of these. There was an obvious first guess, which didn’t quite work… but thinking a bit harder eventually led to the right answer. Later, it turned out people had already thought about similar things. So, I want to explain this.

When I first started working on this stuff, I was focused on the difference between collections of indistinguishable things, like bosons or fermions, and collections of distinguishable things, like rabbits or wolves.

But with the benefit of hindsight, it’s even more important to think about the difference between ordinary quantum theory, which is all about probability amplitudes, and the game we’re playing now, which is all about probabilities. So, next time I’ll explain how we need to modify quantum theory so that it’s about probabilities. This will make it easier to guess a nice formula for H.


Network Theory (Part 2)

31 March, 2011

Today I’d like to start telling you about some research Jacob Biamonte and I are doing on stochastic Petri nets, quantum field theory, and category theory. It’ll take a few blog entries to cover this story, which is part of a larger story about network theory.

Stochastic Petri nets are one of many different diagrammatic languages people have evolved to study complex systems. We’ll see how they’re used in chemistry, molecular biology, population biology and queuing theory, which is roughly the science of waiting in line. If you’re a faithful reader of this blog, you’ve already seen an example of a Petri net taken from chemistry:

It shows some chemicals and some reactions involving these chemicals. To make it into a stochastic Petri net, we’d just label each reaction by a nonnegative real number: the reaction rate constant, or rate constant for short.

In general, a Petri net will have a set of states, which we’ll draw as yellow circles, and a set of transitions, which we’ll draw as blue rectangles. Here’s a Petri net from population biology:

Now, instead of different chemicals, the states are different species. And instead of chemical reactions, the transitions are processes involving our species.

This Petri net has two states: rabbit and wolf. It has three transitions:

• In birth, one rabbit comes in and two go out. This is a caricature of reality: these bunnies reproduce asexually, splitting in two like amoebas.

• In predation, one wolf and one rabbit come in and two wolves go out. This is a caricature of how predators need to eat prey to reproduce.

• In death, one wolf comes in and nothing goes out. Note that we’re pretending rabbits don’t die unless they’re eaten by wolves.

If we labelled each transition with a rate constant, we’d have a stochastic Petri net.

To make this Petri net more realistic, we’d have to make it more complicated. I’m trying to explain general ideas here, not realistic models of specific situations. Nonetheless, this Petri net already leads to an interesting model of population dynamics: a special case of the so-called ‘Lotka-Volterra predator-prey model’. We’ll see the details soon.

More to the point, this Petri net illustrates some possibilities that our previous example neglected. Every transition has some ‘input’ states and some ‘output’ states. But a state can show up more than once as the output (or input) of some transition. And as we see in ‘death’, we can have a transition with no outputs (or inputs) at all.

But let me stop beating around the bush, and give you the formal definitions. They’re simple enough:

Definition. A Petri net consists of a set S of states and a set T of transitions, together with a function

i : S \times T \to \mathbb{N}

saying how many copies of each state show up as input for each transition, and a function

o: S \times T \to \mathbb{N}

saying how many times it shows up as output.

Definition. A stochastic Petri net is a Petri net together with a function

r: T \to [0,\infty)

giving a rate constant for each transition.

Starting from any stochastic Petri net, we can get two things. First:

• The master equation. This says how the probability that we have a given number of things in each state changes with time.

Since stochastic means ‘random’, the master equation is what gives stochastic Petri nets their name. The master equation is the main thing I’ll be talking about in future blog entries. But not right away!

Why not?

In chemistry, we typically have a huge number of things in each state. For example, a gram of water contains about 3 \times 10^{22} water molecules, and a smaller but still enormous number of hydroxide ions (OH^-), hydronium ions (H_3O^+), and other scarier things. These things blunder around randomly, bump into each other, and sometimes react and turn into other things. There’s a stochastic Petri net describing all this, as we’ll eventually see. But in this situation, we don’t usually want to know the probability that there are, say, exactly 3,101,849,578,476,264 hydronium ions. That would be too much information! We’d be quite happy knowing the expected value of the number of hydronium ions, so we’d be delighted to have a differential equation that says how this changes with time.

And luckily, such an equation exists—and it’s much simpler than the master equation. So, today we’ll talk about:

• The rate equation. This says how the expected number of things in each state changes with time.

But first, I hope you get the overall idea. The master equation is stochastic: at each time the number of things in each state is a random variable taking values in \mathbb{N}, the set of natural numbers. The rate equation is deterministic: at each time the expected number of things in each state is a non-random variable taking values in [0,\infty), the set of nonnegative real numbers. If the master equation is the true story, the rate equation is only approximately true—but the approximation becomes good in some limit where the expected value of the number of things in each state is large, and the standard deviation is comparatively small.

If you’ve studied physics, this should remind you of other things. The master equation should remind you of the quantum harmonic oscillator, where energy levels are discrete, and probabilities are involved. The rate equation should remind you of the classical harmonic oscillator, where energy levels are continuous, and everything is deterministic.

When we get to the ‘original research’ part of our story, we’ll see this analogy is fairly precise! We’ll take a bunch of ideas from quantum mechanics and quantum field theory, and tweak them a bit, and show how we can use them to describe the master equation for a stochastic Petri net.

Indeed, the random processes that the master equation describes can be drawn as pictures:

This looks like a Feynman diagram, with animals instead of particles! It’s pretty funny, but the resemblance is no joke: the math will back it up.

I’m dying to explain all the details. But just as classical field theory is easier than quantum field theory, the rate equation is simpler than the master equation. So we should start there.

The rate equation

If you hand me a stochastic Petri net, I can write down its rate equation. Instead of telling you the general rule, which sounds rather complicated at first, let me do an example. Take the Petri net we were just looking at:

We can make it into a stochastic Petri net by choosing a number for each transition:

• the birth rate constant \beta

• the predation rate constant \gamma

• the death rate constant \delta

Let x(t) be the number of rabbits and let y(t) be the number of wolves at time t. Then the rate equation looks like this:

\frac{d x}{d t} = \beta x - \gamma x y

\frac{d y}{d t} = \gamma x y - \delta y

It’s really a system of equations, but I’ll call the whole thing “the rate equation” because later we may get smart and write it as a single equation.

See how it works?

• We get a term \beta x in the equation for rabbits, because rabbits are born at a rate equal to the number of rabbits times the birth rate constant \beta.

• We get a term - \delta y in the equation for wolves, because wolves die at a rate equal to the number of wolves times the death rate constant \delta.

• We get a term - \gamma x y in the equation for rabbits, because rabbits die at a rate equal to the number of rabbits times the number of wolves times the predation rate constant \gamma.

• We also get a term \gamma x y in the equation for wolves, because wolves are born at a rate equal to the number of rabbits times the number of wolves times \gamma.

Of course I’m not claiming that this rate equation makes any sense biologically! For example, think about predation. The \gamma x y terms in the above equation would make sense if rabbits and wolves roamed around randomly, and whenever a wolf and a rabbit came within a certain distance, the wolf had a certain probability of eating the rabbit and giving birth to another wolf. At least it would make sense in the limit of large numbers of rabbits and wolves, where we can treat x and y as varying continuously rather than discretely. That’s a reasonable approximation to make sometimes. Unfortunately, rabbits and wolves don’t roam around randomly, and a wolf doesn’t spit out a new wolf each time it eats a rabbit.

Despite that, the equations

\frac{d x}{d t} = \beta x - \gamma x y

\frac{d y}{d t} = \gamma x y - \delta y

are actually studied in population biology. As I said, they’re a special case of the Lotka-Volterra predator-prey model, which looks like this:

\frac{d x}{d t} = \beta x - \gamma x y

\frac{d y}{d t} = \epsilon x y - \delta y

The point is that while these models are hideously oversimplified and thus quantitatively inaccurate, they exhibit interesting qualitative behavior that’s fairly robust. Depending on the rate constants, these equations can show either a stable equilibrium or stable periodic behavior. And as we go from one regime to another, we see a kind of catastrophe called a “Hopf bifurcation”. I explained all this in week308 and week309. There I was looking at some other equations, not the Lotka-Volterra equations. But their qualitative behavior is the same!

If you want stochastic Petri nets that give quantitatively accurate models, it’s better to retreat to chemistry. Compared to animals, molecules come a lot closer to roaming around randomly and having a chance of reacting when they come within a certain distance. So in chemistry, rate equations can be used to make accurate predictions.
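Back in population biology, the rabbit-and-wolf rate equation is still easy to integrate numerically. Here’s a sketch (rate constants and initial populations are illustrative choices of mine):

```python
import numpy as np
from scipy.integrate import solve_ivp

beta, gamma, delta = 1.0, 0.02, 0.5  # illustrative rate constants

def rate_equation(t, state):
    x, y = state  # rabbits, wolves
    return [beta * x - gamma * x * y,
            gamma * x * y - delta * y]

sol = solve_ivp(rate_equation, (0, 40), [40.0, 20.0], max_step=0.1)
x, y = sol.y

# Both populations oscillate but stay positive, cycling around the
# equilibrium at x = delta/gamma = 25, y = beta/gamma = 50.
print(x.min() > 0 and y.min() > 0)  # True
```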

But I’m digressing. I should be explaining the general recipe for getting a rate equation from a stochastic Petri net! You might not be able to guess it from just one example. But I sense that you’re getting tired. So let’s stop now. Next time I’ll do more examples, and maybe even write down a general formula. But if you’re feeling ambitious, you can try this now:

Puzzle. Can you write down a stochastic Petri net whose rate equation is the Lotka-Volterra predator-prey model:

\frac{d x}{d t} = \beta x - \gamma x y

\frac{d y}{d t} = \epsilon x y - \delta y

for arbitrary \beta, \gamma, \delta, \epsilon \ge 0? If not, for which values of these rate constants can you do it?

References

If you want to study a bit on your own, here are some great online references on stochastic Petri nets and their rate equations:

• Peter J. E. Goss and Jean Peccoud, Quantitative modeling of stochastic systems in molecular biology by using stochastic Petri nets, Proc. Natl. Acad. Sci. USA 95 (June 1998), 6750-6755.

• Jeremy Gunawardena, Chemical reaction network theory for in-silico biologists.

• Martin Feinberg, Lectures on reaction networks.

I should admit that the first two talk about ‘chemical reaction networks’ instead of Petri nets. That’s no big deal: any chemical reaction network gives a Petri net in a pretty obvious way. You can probably figure out how; if you get stuck, just ask.

Also, I should admit that Petri net people say ‘place’ where I’m saying ‘state’.

Here are some other references, which aren’t free unless you have an online subscription or access to a library:

• Peter J. Haas, Stochastic Petri Nets: Modelling, Stability, Simulation, Springer, Berlin, 2002.

• F. Horn and R. Jackson, General mass action kinetics, Archive for Rational Mechanics and Analysis 47 (1972), 81–116.

• Ina Koch, Petri nets – a mathematical formalism to analyze chemical reaction networks, Molecular Informatics 29 (2010), 838–843.

• Darren James Wilkinson, Stochastic Modelling for Systems Biology, Taylor & Francis, New York, 2006.

