Trends in Reaction Network Theory

27 January, 2015

For those who have been following the posts on reaction networks, this workshop should be interesting! I hope to see you there.

Workshop on Mathematical Trends in Reaction Network Theory, 1-3 July 2015, Department of Mathematical Sciences, University of Copenhagen. Organized by Elisenda Feliu and Carsten Wiuf.


This workshop focuses on current and new trends in mathematical reaction network theory, which we consider broadly to be the theory describing the behaviour of systems of (bio)chemical reactions. In recent years, new interesting approaches using theory from dynamical systems, stochastics, algebra and beyond, have appeared. We aim to provide a forum for discussion of these new approaches and to bring together researchers from different communities.


The workshop starts in the morning of Wednesday, July 1st, and finishes at lunchtime on Friday, July 3rd. In the morning there will be invited talks, followed by contributed talks in the afternoon. There will be a reception and poster session Wednesday in the afternoon, and a conference dinner Thursday. For those participants staying Friday afternoon, a sightseeing event will be arranged.


The workshop is organized by the research group on Mathematics of Reaction Networks at the Department of Mathematical Sciences, University of Copenhagen. The event is sponsored by the Danish Research Council, the Department of Mathematical Sciences and the Dynamical Systems Interdisciplinary Network, which is part of the UCPH Excellence Programme for Interdisciplinary Research.

Confirmed invited speakers

Nikki Meskhat (North Carolina State University, US)

Alan D. Rendall (Johannes Gutenberg Universität Mainz, Germany)

• János Tóth (Budapest University of Technology and Economics, Hungary)

Sebastian Walcher (RWTH Aachen, Germany)

Gheorghe Craciun (University of Wisconsin, Madison, US)

David Doty (California Institute of Technology, US)


Manoj Gopalkrishnan (Tata Institute of Fundamental Research, India)

Michal Komorowski (Institute of Fundamental Technological Research, Polish Academy of Sciences, Poland)

John Baez (University of California, Riverside, US)

Important dates

Abstract submission for posters and contributed talks: March 15, 2015.

Notification of acceptance: March 26, 2015.

Registration deadline: May 15, 2015.

Conference: July 1-3, 2015.

The organizers

The organizers are Elisenda Feliu and Carsten Wiuf at the Department of Mathematical Sciences of the University of Copenhagen.

They’ve written some interesting papers on reaction networks, including some that discuss chemical reactions with more than one stationary state. This is a highly nonlinear regime that’s very important in biology:

• Elisenda Feliu and Carsten Wiuf, A computational method to preclude multistationarity in networks of interacting species, Bioinformatics 29 (2013), 2327-2334.

Motivation. Modeling and analysis of complex systems are important aspects of understanding systemic behavior. In the lack of detailed knowledge about a system, we often choose modeling equations out of convenience and search the (high-dimensional) parameter space randomly to learn about model properties. Qualitative modeling sidesteps the issue of choosing specific modeling equations and frees the inference from specific properties of the equations. We consider classes of ordinary differential equation (ODE) models arising from interactions of species/entities, such as (bio)chemical reaction networks or ecosystems. A class is defined by imposing mild assumptions on the interaction rates. In this framework, we investigate whether there can be multiple positive steady states in some ODE models in a given class.

Results. We have developed and implemented a method to decide whether any ODE model in a given class cannot have multiple steady states. The method runs efficiently on models of moderate size. We tested the method on a large set of models for gene silencing by sRNA interference and on two publicly available databases of biological models, KEGG and Biomodels. We recommend that this method is used as (i) a pre-screening step for selecting an appropriate model and (ii) for investigating the robustness of non-existence of multiple steady state for a given ODE model with respect to variation in interaction rates.

Availability and Implementation. Scripts and examples in Maple are available in the Supplementary Information.

• Elisenda Feliu, Injectivity, multiple zeros, and multistationarity in reaction networks, Proceedings of the Royal Society A.

Abstract. Polynomial dynamical systems are widely used to model and study real phenomena. In biochemistry, they are the preferred choice for modelling the concentration of chemical species in reaction networks with mass-action kinetics. These systems are typically parameterised by many (unknown) parameters. A goal is to understand how properties of the dynamical systems depend on the parameters. Qualitative properties relating to the behaviour of a dynamical system are locally inferred from the system at steady state. Here we focus on steady states that are the positive solutions to a parameterised system of generalised polynomial equations. In recent years, methods from computational algebra have been developed to understand these solutions, but our knowledge is limited: for example, we cannot efficiently decide how many positive solutions the system has as a function of the parameters. Even deciding whether there is one or more solutions is non-trivial. We present a new method, based on so-called injectivity, to preclude or assert that multiple positive solutions exist. The results apply to generalised polynomials and variables can be restricted to the linear, parameter-independent first integrals of the dynamical system. The method has been tested in a wide range of systems.

You can see more of their papers on their webpages.

A Second Law for Open Markov Processes

15 November, 2014

guest post by Blake Pollard

What comes to mind when you hear the term ‘random process’? Do you think of Brownian motion? Do you think of particles hopping around? Do you think of a drunkard staggering home?

Today I’m going to tell you about a version of the drunkard’s walk with a few modifications. Firstly, we don’t have just one drunkard: we can have any positive real number of drunkards. Secondly, our drunkards have no memory; where they go next doesn’t depend on where they’ve been. Thirdly, there are special places, such as entrances to bars, where drunkards magically appear and disappear.

The second condition says that our drunkards satisfy the Markov property, making their random walk into a Markov process. The third condition is really what I want to tell you about, because it makes our Markov process into a more general ‘open Markov process’.

There are a collection of places the drunkards can be, for example:

V= \{ \text{bar},\text{sidewalk}, \text{street}, \text{taco truck}, \text{home} \}

We call this set V the set of states. There are certain probabilities associated with traveling between these places. We call these transition rates. For example it is more likely for a drunkard to go from the bar to the taco truck than to go from the bar to home so the transition rate between the bar and the taco truck should be greater than the transition rate from the bar to home. Sometimes you can’t get from one place to another without passing through intermediate places. In reality the drunkard can’t go directly from the bar to the taco truck: he or she has to go from the bar to sidewalk to the taco truck.

This information can all be summarized by drawing a directed graph where the positive numbers labelling the edges are the transition rates:

For simplicity we draw only three states: home, bar, taco truck. Drunkards go from home to the bar and back, but they never go straight from home to the taco truck.

We can keep track of where all of our drunkards are using a vector with 3 entries:

\displaystyle{ p(t) = \left( \begin{array}{c} p_h(t) \\ p_b(t) \\ p_{tt}(t) \end{array} \right) \in \mathbb{R}^3 }

We call this our population distribution. The first entry p_h is the number of drunkards that are at home, the second p_b is how many are at the bar, and the third p_{tt} is how many are at the taco truck.

There is a set of coupled, linear, first-order differential equations we can write down using the information in our graph that tells us how the number of drunkards in each place change with time. This is called the master equation:

\displaystyle{ \frac{d p}{d t} = H p }

where H is a 3×3 matrix which we call the Hamiltonian. The off-diagonal entries are nonnegative:

H_{ij} \geq 0, i \neq j

and the columns sum to zero:

\sum_i H_{ij}=0

We call a matrix satisfying these conditions infinitesimal stochastic. Stochastic matrices have columns that sum to one. If we take the exponential of an infinitesimal stochastic matrix we get one whose columns sum to one, hence the label ‘infinitesimal’.

The Hamiltonian for the graph above is

H = \left( \begin{array}{ccc} -2 & 5 & 10 \\ 2 & -12 & 0 \\ 0 & 7 & -10 \end{array} \right)

John has written a lot about Markov processes and infinitesimal stochastic Hamiltonians in previous posts.

Given two vectors p,q \in \mathbb{R}^3 describing the populations of drunkards which obey the same master equation, we can calculate the relative entropy of p relative to q:

\displaystyle{ S(p,q) = \sum_{ i \in V} p_i \ln \left( \frac{p_i}{q_i} \right) }

This is an example of a ‘divergence’. In statistics, a divergence a way of measuring the distance between probability distributions, which may not be symmetrical and may even not obey the triangle inequality.

The relative entropy is important because it decreases monotonically with time, making it a Lyapunov function for Markov processes. Indeed, it is a well known fact that

\displaystyle{ \frac{dS(p(t),q(t) ) } {dt} \leq 0 }

This is true for any two population distributions which evolve according to the same master equation, though you have to allow infinity as a possible value for the relative entropy and negative infinity for its time derivative.

Why is entropy decreasing? Doesn’t the Second Law of Thermodynamics say entropy increases?

Don’t worry: the reason is that I have not put a minus sign in my definition of relative entropy. Put one in if you like, and then it will increase. Sometimes without the minus sign it’s called the Kullback–Leibler divergence. This decreases with the passage of time, saying that any two population distributions p(t) and q(t) get ‘closer together’ as they get randomized with the passage of time.

That itself is a nice result, but I want to tell you what happens when you allow drunkards to appear and disappear at certain states. Drunkards appear at the bar once they’ve had enough to drink and once they are home for long enough they can disappear. The set of places where drunkards can appear or disappear B is called the set of boundary states.  So for the above process

B = \{ \text{home},\text{bar} \}

is the set of boundary states. This changes the way in which the population of drunkards changes with time!

The drunkards at the taco truck obey the master equation. For them,

\displaystyle{ \frac{dp_{tt}}{dt} = 7p_b -10 p_{tt} }

still holds. But because the populations can appear or disappear at the boundary states the master equation no longer holds at those states! Instead it is useful to define the flow of drunkards into the i^{th} state by

\displaystyle{ \frac{Dp_i}{Dt} = \frac{dp_i}{dt}-\sum_j H_{ij} p_j}

This quantity describes by how much the rate of change of the populations at the boundary states differ from that given by the master equation.

The reason why we are interested in open Markov processes is because you can take two open Markov processes and glue them together along some subset of their boundary states to get a new open Markov process! This allows us to build up or break down complicated Markov processes using open Markov processes as the building blocks.

For example we can draw the graph corresponding to the drunkards’ walk again, only now we will distinguish boundary states from internal states by coloring internal states blue and having boundary states be white:

Consider another open Markov process with states

V=\{ \text{home},\text{work},\text{bar} \}


B=\{ \text{home}, \text{bar}\}

are the boundary states, leaving


as an internal state:

Since the boundary states of this process overlap with the boundary states of the first process we can compose the two to form a new Markov process:

Notice the boundary states are now internal states. I hope any Markov process that could approximately model your behavior has more interesting nodes! There is a nice way to figure out the Hamiltonian of the composite from the Hamiltonians of the pieces, but we will leave that for another time.

We can ask ourselves, how does relative entropy change with time in open Markov processes? You can read my paper for the details, but here is the punchline:

\displaystyle{ \frac{dS(p(t),q(t) ) }{dt} \leq \sum_{i \in B} \frac{Dp_i}{Dt}\frac{\partial S}{\partial p_i} + \frac{Dq_i}{Dt}\frac{\partial S}{\partial q_i} }

This is a version of the Second Law of Thermodynamics for open Markov processes.

It is important to notice that the sum is only over the boundary states! This inequality tells us that relative entropy still decreases inside our process, but depending on the flow of populations through the boundary states the relative entropy of the whole process could either increase or decrease! This inequality will be important when we study how the relative entropy changes in different parts of a bigger more complicated process.

That is all for now, but I leave it as an exercise for you to imagine a Markov process that describes your life. How many states does it have? What are the relative transition rates? Are there states you would like to spend more or less time in? Are there states somewhere you would like to visit?

Here is my paper, which proves the above inequality:

• Blake Pollard, A Second Law for open Markov processes.

If you have comments or corrections, let me know!

Network Theory (Part 33)

4 November, 2014

Last time I came close to describing the ‘black box functor’, which takes an electrical circuit made of resistors

and sends it to its behavior as viewed from outside. From outside, all you can see is the relation between currents and potentials at the ‘terminals’—the little bits of wire that poke out of the black box:

I came close to defining the black box functor, but I didn’t quite make it! This time let’s finish the job.

The categories in question

The black box functor

\blacksquare : \mathrm{ResCirc} \to \mathrm{LinRel}

goes from the category \mathrm{ResCirc}, where morphisms are circuits made of resistors, to the category \mathrm{LinRel}, where morphisms are linear relations. Let me remind you how these categories work, and introduce a bit of new notation.

Here is the category \mathrm{ResCirc}:

• an object is a finite set;

• a morphism from X to Y is an isomorphism class of cospans

in the category of graphs with edges labelled by resistances: numbers in (0,\infty). Here we think of the finite sets X and Y as graphs with no edges. We call X the set of inputs and Y the set of outputs.

• we compose morphisms in \mathrm{ResCirc} by composing isomorphism classes of cospans.

And here is the category \mathrm{LinRel}:

• an object is a finite-dimensional real vector space;

• a morphism from U to V is a linear relation R : U \leadsto V, meaning a linear subspace R \subseteq U \times V;

• we compose a linear relation R \subseteq U \times V and a linear relation S \subseteq V \times W in the usual way we compose relations, getting:

SR = \{(u,w) \in U \times W : \; \exists v \in V \; (u,v) \in R \mathrm{\; and \;} (v,w) \in S \}

In case you’re wondering: I’ve just introduced the wiggly arrow notation

R : U \leadsto V

for a linear relation from U to V, because it suggests that a relation is a bit like a function but more general. Indeed, a function is a special case of a relation, and composing functions is a special case of composing relations.

The black box functor

Now, how do we define the black box functor?

Defining it on objects is easy. An object of \mathrm{ResCirc} is a finite set S, and we define

\blacksquare{S} = \mathbb{R}^S \times \mathbb{R}^S

The idea is that S could be a set of inputs or outputs, and then

(\phi, I) \in \mathbb{R}^S \times \mathbb{R}^S

is a list of numbers: the potentials and currents at those inputs or outputs.

So, the interesting part is defining the black box functor on morphisms!

For this we start with a morphism in \mathrm{ResCirc}:

The labelled graph \Gamma consists of:

• a set N of nodes,

• a set E of edges,

• maps s, t : E \to N sending each edge to its source and target,

• a map r : E \to (0,\infty) sending each edge to its resistance.

The cospan gives maps

i: X \to N, \qquad o: Y \to N

These say how the inputs and outputs are interpreted as nodes in the circuit. We’ll call the nodes that come from inputs or outputs ‘terminals’. So, mathematically,

T = \mathrm{im}(i) \cup \mathrm{im}(o) \subseteq N

is the set of terminals: the union of the images of i and o.

In the simplest case, the maps i and o are one-to-one, with disjoint ranges. Then each terminal either comes from a single input, or a single output, but not both! This is a good picture to keep in mind. But some subtleties arise when we leave this simplest case and consider other cases.

Now, the black box functor is supposed to send our circuit to a linear relation. I’ll call the circuit \Gamma for short, though it’s really the whole cospan

So, our black box functor is supposed to send this circuit to a linear relation

\blacksquare(\Gamma) : \mathbb{R}^X \times \mathbb{R}^X \leadsto \mathbb{R}^Y \times \mathbb{R}^Y

This is a relation between the potentials and currents at the input terminals and the potentials and currents at the output terminals! How is it defined?

I’ll start by outlining how this works.

First, our circuit picks out a subspace

dQ \subseteq \mathbb{R}^T \times \mathbb{R}^T

This is the subspace of allowed potentials and currents on the terminals. I’ll explain this and why it’s called dQ a bit later. Briefly, it comes from the principle of minimum power, described last time.

Then, the map

i: X \to T

gives a linear relation

S(i) : \mathbb{R}^X \times \mathbb{R}^X \leadsto \mathbb{R}^T \times \mathbb{R}^T

This says how the potentials and currents at the inputs are related to those at the terminals. Similarly, the map

o: Y \to T

gives a linear relation

S(o) : \mathbb{R}^Y \times \mathbb{R}^Y \leadsto \mathbb{R}^T \times \mathbb{R}^T

This says how the potentials and currents at the outputs are related to those at the terminals.

Next, we can ‘turn around’ any linear relation

R : \mathbb{R}^Y \times \mathbb{R}^Y \leadsto \mathbb{R}^T \times \mathbb{R}^T

to get a relation

R^\dagger : \mathbb{R}^T \times \mathbb{R}^T  \leadsto \mathbb{R}^Y \times \mathbb{R}^Y

defined by

R^\dagger = \{(\phi',-I',\phi,-I) : (\phi, I, \phi', I') \in R \}

Here we are just switching the input and output potentials, but when we switch the currents we also throw in a minus sign. The reason is that we care about the current flowing in to an input, but out of an output.

Finally, one more trick: given a linear subspace

L \subseteq V

of a vector space V we get a linear relation

1|_L : V \leadsto V

called the identity restricted to L, defined like this:

1|_L = \{ (v, v) :\; v \in L \} \subseteq V \times V

If L is all of V this relation is actually the identity function on V. Otherwise it’s a partially defined function that’s defined only on L, and is the identity there. (A partially defined function is an example of a relation.) My notation 1|_L is probably bad, but I don’t know a better one, so bear with me.

Let’s use all these ideas to define

\blacksquare(\Gamma) : \mathbb{R}^X \times \mathbb{R}^X \leadsto \mathbb{R}^Y \times \mathbb{R}^Y

To do this, we compose three linear relations:

1) We start with

S(i) : \mathbb{R}^X \times \mathbb{R}^X \leadsto \mathbb{R}^T \times \mathbb{R}^T

2) We compose this with

1|_{dQ} : \mathbb{R}^T \times \mathbb{R}^T \leadsto \mathbb{R}^T \times \mathbb{R}^T

3) Then we compose this with

S(o)^\dagger : \mathbb{R}^T \times \mathbb{R}^T \leadsto \mathbb{R}^Y \times \mathbb{R}^Y

Note that:

1) says how the potentials and currents at the inputs are related to those at the terminals,

2) picks out which potentials and currents at the terminals are actually allowed, and

3) says how the potentials and currents at the terminals are related to those at the outputs.

So, I hope all makes sense, at least in some rough way. In brief, here’s the formula:

\blacksquare(\Gamma) = S(o)^\dagger \; 1|_{dQ} \; S(i)

Now I just need to fill in some details. First, how do we define S(i) and S(o)? They work exactly the same way, by ‘copying potentials and adding currents’, so I’ll just talk about one. Second, how do we define the subspace dQ? This uses the principle of minimum power.

Duplicating potentials and adding currents

Any function between finite sets

i: X \to T

gives a linear map

i^* : \mathbb{R}^T \to \mathbb{R}^X

Mathematicians call this linear map the pullback along i, and for any \phi \in \mathbb{R}^T it’s defined by

i^*(\phi)(x) = \phi(i(x))

In our application, we think of \phi as a list of potentials at terminals. The function i could map a bunch of inputs to the same terminal, and the above formula says the potential at this terminal gives the potential at all those inputs. So, we are copying potentials.

We also get a linear map going the other way:

i_* : \mathbb{R}^X \to \mathbb{R}^T

Mathematicians call this the pushforward along i, and for any I \in \mathbb{R}^X it’s defined by

\displaystyle{ i_*(I)(t) = \sum_{x \; : \; i(x) = t } I(x) }

In our application, we think of I as a list of currents entering at some inputs. The function i could map a bunch of inputs to the same terminal, and the above formula says the current at this terminal is the sum of the currents at all those inputs. So, we are adding currents.

Putting these together, our map

i : X \to T

gives a linear relation

S(i) : \mathbb{R}^X \times \mathbb{R}^X \leadsto \mathbb{R}^T \times \mathbb{R}^T

where the pair (\phi, I) \in \mathbb{R}^X \times \mathbb{R}^X is related to the pair (\phi', I') \in \mathbb{R}^T \times \mathbb{R}^T iff

\phi = i^*(\phi')


I' = i_*(I)

So, here’s the rule of thumb when attaching the points of X to the input terminals of our circuit: copy potentials, but add up currents. More formally:

\begin{array}{ccl} S(i) &=& \{ (\phi, I, \phi', I') : \; \phi = i^*(\phi') , \; I' = i_*(I) \}  \\ \\  &\subseteq& \mathbb{R}^X \times \mathbb{R}^X \times \mathbb{R}^T \times \mathbb{R}^T \end{array}

The principle of minimum power

Finally, how does our circuit define a subspace

dQ \subseteq \mathbb{R}^T \times \mathbb{R}^T

of allowed potential-current pairs at the terminals? The trick is to use the ideas we discussed last time. If we know the potential at all nodes of our circuit, say \phi \in \mathbb{R}^N, we know the power used by the circuit:

P(\phi) = \displaystyle{ \sum_{e \in E} \frac{1}{r_e} \big(\phi(s(e)) - \phi(t(e))\big)^2 }

We saw last time that if we fix the potentials at the terminals, the circuit will choose potentials at the other nodes to minimize this power. We can describe the potential at the terminals by

\psi \in \mathbb{R}^T

So, the power for a given potential at the terminals is

Q(\psi) = \displaystyle{ \frac{1}{2} \min_{\phi \in \mathbb{R}^N \; : \; \phi|_T = \psi} \sum_{e \in E} \frac{1}{r_e} \big(\phi(s(e)) - \phi(t(e))\big)^2 }

Actually this is half the power: I stuck in a factor of 1/2 for some reason we’ll soon see. This Q is a quadratic function

Q : \mathbb{R}^T \to \mathbb{R}

so its derivative is linear. And, our work last time showed something interesting: to compute the current J_x flowing into a terminal x \in T, we just differentiate Q with respect to the potential at that terminal:

\displaystyle{ J_x = \frac{\partial Q(\psi)}{\partial \psi_x} }

This is the reason for the 1/2: when we take the derivative of Q, we bring down a 2 from differentiating all those squares, and to make that go away we need a 1/2.

The space of allowed potential-current pairs at the terminals is thus the linear subspace

dQ = \{ (\psi, J) : \; \displaystyle{ J_x = \frac{\partial Q(\psi)}{\partial \psi_x} \}  \subseteq \mathbb{R}^T \times \mathbb{R}^T }

And this completes our precise description of the black box functor!

The hard part is this:

Theorem. \blacksquare : \mathrm{ResCirc} \to \mathrm{LinRel} is a functor.

In other words, we have to prove that it preserves composition:

\blacksquare(fg) = \blacksquare(f) \blacksquare(g)

For that, read our paper:

• John Baez and Brendan Fong, A compositional framework for passive linear networks.

Network Theory (Part 32)

20 October, 2014

Okay, today we will look at the ‘black box functor’ for circuits made of resistors. Very roughly, this takes a circuit made of resistors with some inputs and outputs:

and puts a ‘black box’ around it:

forgetting the internal details of the circuit and remembering only how the it behaves as viewed from outside. As viewed from outside, all the circuit does is define a relation between the potentials and currents at the inputs and outputs. We call this relation the circuit’s behavior. Lots of different choices of the resistances R_1, \dots, R_6 would give the same behavior. In fact, we could even replace the whole fancy circuit by a single edge with a single resistor on it, and get a circuit with the same behavior!

The idea is that when we use a circuit to do something, all we care about is its behavior: what it does as viewed from outside, not what it’s made of.

Furthermore, we’d like the behavior of a system made of parts to depend in a simple way on the external behaviors of its parts. We don’t want to have to ‘peek inside’ the parts to figure out what the whole will do! Of course, in some situations we do need to peek inside the parts to see what the whole will do. But in this particular case we don’t—at least in the idealization we are considering. And this fact is described mathematically by saying that black boxing is a functor.

So, how do circuits made of resistors behave? To answer this we first need to remember what they are!


Remember that for us, a circuit made of resistors is a mathematical structure like this:

It’s a cospan where:

\Gamma is a graph labelled by resistances. So, it consists of a finite set N of nodes, a finite set E of edges, two functions

s, t : E \to N

sending each edge to its source and target nodes, and a function

r : E \to (0,\infty)

that labels each edge with its resistance.

i: I \to \Gamma is a map of graphs labelled by resistances, where I has no edges. A labelled graph with no edges has nothing but nodes! So, the map i is just a trick for specifying a finite set of nodes called inputs and mapping them to N. Thus i picks out some nodes of \Gamma and declares them to be inputs. (However, i may not be one-to-one! We’ll take advantage of that subtlety later.)

o: O \to \Gamma is another map of graphs labelled by resistances, where O again has no edges, and we call its nodes outputs.

The principle of minimum power

So what does a circuit made of resistors do? This is described by the principle of minimum power.

Recall from Part 27 that when we put it to work, our circuit has a current I_e flowing along each edge e \in E. This is described by a function

I: E \to \mathbb{R}

It also has a voltage across each edge. The word ‘across’ is standard here, but don’t worry about it too much; what matters is that we have another function

V: E \to \mathbb{R}

describing the voltage V_e across each edge e.

Resistors heat up when current flows through them, so they eat up electrical power and turn this power into heat. How much? The power is given by

\displaystyle{ P = \sum_{e \in E} I_e V_e }

So far, so good. But what does it mean to minimize power?

To understand this, we need to manipulate the formula for power using the laws of electrical circuits described in Part 27. First, Ohm’s law says that for linear resistors, the current is proportional to the voltage. More precisely, for each edge e \in E,

\displaystyle{ I_e = \frac{V_e}{r_e} }

where r_e is the resistance of that edge. So, the bigger the resistance, the less current flows: that makes sense. Using Ohm’s law we get

\displaystyle{ P = \sum_{e \in E} \frac{V_e^2}{r_e} }

Now we see that power is always nonnegative! Now it makes more sense to minimize it. Of course we could minimize it simply by setting all the voltages equal to zero. That would work, but that would be boring: it gives a circuit with no current flowing through it. The fun starts when we minimize power subject to some constraints.

For this we need to remember another law of electrical circuits: a spinoff of Kirchhoff’s voltage law. This says that we can find a function called the potential

\phi: N \to \mathbb{R}

such that

V_e = \phi_{s(e)} - \phi_{t(e)}

for each e \in E. In other words, the voltage across each edge is the difference of potentials at the two ends of this edge.

Using this, we can rewrite the power as

\displaystyle{ P = \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)})^2 }

Now we’re really ready to minimize power! Our circuit made of resistors has certain nodes called terminals:

T \subseteq N

These are the nodes that are either inputs or outputs. More precisely, they’re the nodes in the image of

i: I \to \Gamma


o: O \to \Gamma

The principle of minimum power says that:

If we fix the potential \phi on all terminals, the potential at other nodes will minimize the power

\displaystyle{ P(\phi) = \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)})^2 }

subject to this constraint.

This should remind you of all the other minimum or maximum principles you know, like the principle of least action, or the way a system in thermodynamic equilibrium maximizes its entropy. All these principles—or at least, most of them—are connected. I could talk about this endlessly. But not now!

Now let’s just use the principle of minimum power. Let’s see what it tells us about the behavior of an electrical circuit.

Let’s imagine changing the potential \phi by adding some multiple of a function

\psi: N \to \mathbb{R}

If this other function vanishes at the terminals:

\forall n \in T \; \; \psi(n) = 0

then \phi + x \psi doesn’t change at the terminals as we change the number x.

Now suppose \phi obeys the principle of minimum power. In other words, supposes it minimizes power subject to the constraint of taking the values it does at the terminals. Then we must have

\displaystyle{ \frac{d}{d x} P(\phi + x \psi)\Big|_{x = 0} }


\forall n \in T \; \; \psi(n) = 0

This is just the first derivative test for a minimum. But the converse is true, too! The reason is that our power function is a sum of nonnegative quadratic terms. Its graph will look like a paraboloid. So, the power has no points where its derivative vanishes except minima, even when we constrain \phi by making it lie on a linear subspace.

We can go ahead and start working out the derivative:

\displaystyle{ \frac{d}{d x} P(\phi + x \psi)! = ! \frac{d}{d x} \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)} + x(\psi_{s(e)} -\psi_{t(e)}))^2  }

To work out the derivative of these quadratic terms at x = 0, we only need to keep the part that’s proportional to x. The rest gives zero. So:

\begin{array}{ccl} \displaystyle{ \frac{d}{d t} P(\phi + x \psi)\Big|_{x = 0} } &=& \displaystyle{ \frac{d}{d x} \sum_{e \in E} \frac{x}{r_e} (\phi_{s(e)} - \phi_{t(e)}) (\psi_{s(e)} - \psi_{t(e)}) \Big|_{x = 0} } \\ \\  &=&   \displaystyle{  \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) (\psi_{s(e)} - \psi_{t(e)}) }  \end{array}

The principle of minimum power says this is zero whenever \psi : N \to \mathbb{R} is a function that vanishes at terminals. By linearity, it’s enough to consider functions \psi that are zero at every node except one node n that is not a terminal. By linearity we can also assume \psi(n) = 1.

Given this, the only nonzero terms in the sum

\displaystyle{ \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) (\psi_{s(e)} - \psi_{t(e)}) }

will be those involving edges whose source or target is n. We get

\begin{array}{ccc} \displaystyle{ \frac{d}{d x} P(\phi + x \psi)\Big|_{x = 0} } &=& \displaystyle{ \sum_{e: \; s(e) = n}  \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)})}  \\  \\        && -\displaystyle{ \sum_{e: \; t(e) = n}  \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) }   \end{array}

So, the principle of minimum power says precisely

\displaystyle{ \sum_{e: \; s(e) = n}  \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) = \sum_{e: \; t(e) = n}  \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) }

for all nodes n that aren’t terminals.

What does this mean? You could just say it’s a set of linear equations that must be obeyed by the potential \phi. So, the principle of minimum power says that fixing the potential at terminals, the potential at other nodes must be chosen in a way that obeys a set of linear equations.

But what do these equations mean? They have a nice meaning. Remember, Kirchhoff’s voltage law says

V_e = \phi_{s(e)} - \phi_{t(e)}

and Ohm’s law says

\displaystyle{ I_e = \frac{V_e}{r_e} }

Putting these together,

\displaystyle{ I_e = \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) }

so the principle of minimum power merely says that

\displaystyle{ \sum_{e: \; s(e) = n} I_e = \sum_{e: \; t(e) = n}  I_e }

for any node n that is not a terminal.

This is Kirchhoff’s current law: for any node except a terminal, the total current flowing into that node must equal the total current flowing out! That makes a lot of sense. We allow current to flow in or out of our circuit at terminals, but ‘inside’ the circuit charge is conserved, so if current flows into some other node, an equal amount has to flow out.

In short: the principle of minimum power implies Kirchoff’s current law! Conversely, we can run the whole argument backward and derive the principle of minimum power from Kirchhoff’s current law. (In both the forwards and backwards versions of this argument, we use Kirchhoff’s voltage law and Ohm’s law.)

When the node n is a terminal, the quantity

\displaystyle{  \sum_{e: \; s(e) = n} I_e \; - \; \sum_{e: \; t(e) = n}  I_e }

need not be zero. But it has an important meaning: it’s the amount of current flowing into that terminal!

We’ll call this I_n, the current at the terminal n \in T. This is something we can measure even when our circuit has a black box around it:

So is the potential \phi_n at the terminal n. It’s these currents and potentials at terminals that matter when we try to describe the behavior of a circuit while ignoring its inner workings.

Black boxing

Now let me quickly sketch how black boxing becomes a functor.

A circuit made of resistors gives a linear relation between the potentials and currents at terminals. A relation is something that can hold or fail to hold. A ‘linear’ relation is one defined using linear equations.

A bit more precisely, suppose we choose potentials and currents at the terminals:

\psi : T \to \mathbb{R}

J : T \to \mathbb{R}

Then we seek potentials and currents at all the nodes and edges of our circuit:

\phi: N \to \mathbb{R}

I : E \to \mathbb{R}

that are compatible with our choice of \psi and J. Here compatible means that

\psi_n = \phi_n


J_n = \displaystyle{  \sum_{e: \; s(e) = n} I_e \; - \; \sum_{e: \; t(e) = n}  I_e }

whenever n \in T, but also

\displaystyle{ I_e = \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) }

for every e \in E, and

\displaystyle{  \sum_{e: \; s(e) = n} I_e \; = \; \sum_{e: \; t(e) = n}  I_e }

whenever n \in N - T. (The last two equations combine Kirchoff’s laws and Ohm’s law.)

There either exist I and \phi making all these equations true, in which case we say our potentials and currents at the terminals obey the relation… or they don’t exist, in which case we say the potentials and currents at the terminals don’t obey the relation.

The relation is clearly linear, since it’s defined by a bunch of linear equations. With a little work, we can make it into a linear relation between potentials and currents in

\mathbb{R}^I \oplus \mathbb{R}^I

and potentials and currents in

\mathbb{R}^O \oplus \mathbb{R}^O

Remember, I is our set of inputs and O is our set of outputs.

In fact, this process of getting a linear relation from a circuit made of resistors defines a functor:

\blacksquare : \mathrm{ResCirc} \to \mathrm{LinRel}

Here \mathrm{ResCirc} is the category where morphisms are circuits made of resistors, while \mathrm{LinRel} is the category where morphisms are linear relations.

More precisely, here is the category \mathrm{ResCirc}:

• an object of \mathrm{ResCirc} is a finite set;

• a morphism from I to O is an isomorphism class of circuits made of resistors:

having I as its set of inputs and O as its set of outputs;

• we compose morphisms in \mathrm{ResCirc} by composing isomorphism classes of cospans.

(Remember, circuits made of resistors are cospans. This lets us talk about isomorphisms between them. If you forget the how isomorphism between cospans work, you can review it in Part 31.)

And here is the category \mathrm{LinRel}:

• an object of \mathrm{LinRel} is a finite-dimensional real vector space;

• a morphism from U to V is a linear relation R \subseteq U \times V, meaning a linear subspace of the vector space U \times V;

• we compose a linear relation R \subseteq U \times V and a linear relation S \subseteq V \times W in the usual way we compose relations, getting:

SR = \{(u,w) \in U \times W : \; \exists v \in V \; (u,v) \in R \mathrm{\; and \;} (v,w) \in S \}

Next steps

So far I’ve set up most of the necessary background but not precisely defined the black boxing functor

\blacksquare : \mathrm{ResCirc} \to \mathrm{LinRel}

There are some nuances I’ve glossed over, like the difference between inputs and outputs as elements of I and O and their images in N. If you want to see the precise definition and the proof that it’s a functor, read our paper:

• John Baez and Brendan Fong, A compositional framework for passive linear networks.

The proof is fairly long: there may be a much quicker one, but at least this one has the virtue of introducing a lot of nice ideas that will be useful elsewhere.

Next time I’ll define the black box functor more carefully.

Network Theory Seminar (Part 2)

16 October, 2014


This time I explain more about how ‘cospans’ represent gadgets with two ends, an input end and an output end:

I describe how to glue such gadgets together by composing cospans. We compose cospans using a category-theoretic construction called a ‘pushout’, so I also explain pushouts. At the end, I explain how this gives us a category where the morphisms are electrical circuits made of resistors, and sketch what we’ll do next: study the behavior of these circuits.

These lecture notes provide extra details:

Network theory (part 31).

Network Theory Seminar (Part 1)

11 October, 2014


Check out this video! I start with a quick overview of network theory, and then begin building a category where the morphisms are electrical circuits. These lecture notes provide extra details:

Network theory (part 30).

With luck, this video will be the first of a series. I’m giving a seminar on network theory at U.C. Riverside this fall. I’ll start by sketching the results here:

• John Baez and Brendan Fong, A compositional framework for passive linear networks.

But this is a big paper, and I also want to talk about other papers, so I certainly won’t explain everything in here—just enough to help you get started! If you have questions, don’t be shy about asking them.

I thank Blake Pollard for filming this seminar, and Muhammad “Siddiq” Siddiqui-Ali for providing the videocamera and technical support.

The Large-Number Limit for Reaction Networks (Part 3)

8 October, 2014

joint with Arjun Jain

We used to talk about reaction networks quite a lot here. When Arjun Jain was visiting the CQT, we made a lot of progress understanding how the master equation reduces to the rate equation in the limit where there are very large numbers of things of each kind. But we never told you the end of the story, and by now it’s been such a long time that you may need a reminder of some basic facts!


The rate equation treats the number of things of each kind as continuous—a nonnegative real number—and says how it changes in a deterministic way.

The master equation treats the number of things of each kind as discrete—a nonnegative integer—and says how it changes in a probabilistic way.

You can think of the master equation as the ‘true’ description, and the rate equation as an approximation that’s good in some limit where there are large numbers of molecules — or more precisely, where the probability distribution of having some number of molecules of each kind is sharply peaked near some large value.

You may remember that in the master equation, the state of a chemical system is described by a vector \psi in a kind of ‘Fock space’, while time evolution is described with the help of an operator on this space, called the ‘Hamiltonian’ H:

\displaystyle{ \frac{d}{dt} \psi(t) = H \psi(t) }

The Hamiltonian is built from annihilation and creation operators, so all of this looks very much like quantum field theory. The details are here, and we won’t try to review them all:

• John Baez and Jacob Biamonte, Quantum Techniques for Stochastic Mechanics.

The point is this: the ‘large-number limit’ where the master equation reduces to the rate equation smells a lot like the ‘classical limit’ of quantum field theory, where the description of light in terms of photons reduces to the good old Maxwell equations. So, maybe we can understand the large-number limit by borrowing techniques from quantum field theory!

How do we take the classical limit of quantum electromagnetism and get the classical Maxwell equations? For simplicity let’s ignore charged particles and consider the ‘free electromagnetic field': just photons, described by the quantum version of Maxwell’s equations. When we take the classical limit we let Planck’s constant \hbar go to zero: that much is obvious. However, that’s not all! The energy of each photon is proportional to \hbar, so to take the classical limit and get a solution of the classical Maxwell’s equations with nonzero energy we also need to increase the number of photons. We cleverly do this in such a way that the total energy remains constant as \hbar \to 0.

So, in quantum electromagnetism the classical limit is also a large-number limit!

That’s a good sign. It suggests the same math will also apply to our reaction network problem.

But then we hit an apparent roadblock. What’s the analogue of Planck’s constant in chemical reaction networks? What should go to zero?

We told you the answer to that puzzle a while ago: it’s the reciprocal of Avogadro’s number!

You see, chemists measure numbers of molecules in ‘moles’. There’s a large number of molecules in each mole: Avogadro’s number. If we let the reciprocal of Avogadro’s number go to zero, we are taking a limit where chemistry becomes ‘continuous’ and the discreteness of molecules goes away. Of course this is just a mathematical trick, but it’s a very useful one.

So, we got around that roadblock. And then something nice happened.

When taking the classical limit of quantum electromagnetism, we focus attention on certain quantum states that are the ‘best approximations’ to classical states. These are called ‘coherent states’, and it’s very easy to study how the behave as we simultaneously let \hbar \to 0 and let the expected number of photons go to infinity.

And the nice thing is that these coherent states are also important in chemistry! But because chemistry involves probabilities rather than amplitudes, they have a different name: ‘Poisson distributions’. On this blog, Brendan Fong used them to give a new proof of a great result in mathematical chemistry, the Anderson–Craciun–Kurtz theorem.

So, we have most of the concepts and tools in place, and we can tackle the large-number limit using quantum techniques.

You can review the details here:

The large-number limit for reaction networks (part 1).

The large-number limit for reaction networks (part 2) .

So, after a quick refresher on the notation, we’ll plunge right in.

As you’ll see, we solve the problem except for one important technical detail: passing a derivative through a limit! This means our main result is not a theorem. Rather, it’s an idea for how to prove a theorem. Or if we act like physicists, we can call it a theorem.

Review of notation

The rate equation says

\displaystyle{\frac{d}{dt}{x}(t) = \sum_{\tau \in T} r(\tau) (t(\tau)-s(\tau)) {x}(t)^{s(\tau)} }


x(t) \in \mathbb{R}^k is a vector describing concentrations of k different species at time t. In chemistry these species could be different kinds of molecules.

• Each \tau \in T is a transition, or in chemistry, a reaction.

s(\tau) \in \mathbb{N}^k is a vector of natural numbers saying how many items of each species appear as inputs to the reaction \tau. This is called the source of the reaction.

t(\tau) \in \mathbb{N}^k is a vector of natural numbers saying how many items of each species appear as outputs of the reaction \tau. This is called the target of the reaction. So, t(\tau) - s(\tau) says the net change of the number of items of each species in the reaction \tau.

• The rate at which the reaction \tau occurs is proportional to the rate constant r(\tau) times the number


Here we are raising a vector to a vector power and getting a number, using this rule:

x^r = x_1^{r_1} \cdots x_k^{r_k}

where r is any vector of natural numbers (r_1, \dots, r_k), and x is any vector of nonnegative real numbers. From now on we’ll call a vector of natural numbers a multi-index.

In this paper:

• John Baez, Quantum techniques for reaction networks.

it was shown that the master equation implies

\displaystyle{\frac{d}{dt}\langle N_i \psi(t)\rangle = \sum_{\tau \in T} r(\tau) (t_i(\tau)-s_i(\tau)) \; \langle N^{\, \underline{s(\tau)}} \, \psi(t)\rangle }


\psi(t) is the stochastic state saying the probability of having any particular number of items of each species at each time t. We won’t review the precise details of how this work; for that reread the relevant bit of Part 8.

N_i is the ith number operator, defined using annihilation and creation operators as in quantum mechanics:

N_i = a_i^\dagger a_i

For the annihilation and creation operators, see Part 8.

\langle N_i \psi(t) \rangle is the expected number of items of the ith species at time t.

• Similarly, \langle N^{\underline{s(\tau)}} \psi(t)\rangle is the expected value of a certain product of operators. For any multi-index r we define the falling power

N_i^{\underline{r}_i} = N_i (N_i - 1) (N_i - 2) \cdots (N_i - r_i +1)

and then we define

N^{\underline{r}} = N_1^{\underline{r_1}} \cdots N_k^{\underline{r_k}}

The large-number limit

Okay. Even if you don’t understand any of what we just said, you’ll see the master and rate equation look similar. The master equation implies this:

\displaystyle{\frac{d}{dt}\langle N_i \psi(t)\rangle = \sum_{\tau \in T} r(\tau) (t_i(\tau)-s_i(\tau)) \; \langle N^{\, \underline{s(\tau)}} \, \psi(t)\rangle }

while the rate equation says this:

\displaystyle{\frac{d}{dt}{x}(t) = \sum_{\tau \in T} r(\tau) (t(\tau)-s(\tau)) \; {x}(t)^{s(\tau)} }

So, we will try to get from the first to the second second with the help of a ‘large-number limit’.

We start with a few definitions. We introduce an adjustable dimensionless quantity which we call \hbar. This is just a positive number, which has nothing to do with quantum theory except that we’re using a mathematical analogy to quantum mechanics to motivate everything we’re doing.

Definition. The rescaled number operators are defined as \widetilde{N}_i = \hbar N_i. This can be thought of as a rescaling of the number of objects, so that instead of counting objects individually, we count them in bunches of size 1/\hbar.

Definition. For any multi-index r we define the rescaled falling power of the number operator N_i by:

\widetilde{N}_i^{\underline{r_i}} = \widetilde{N}_i (\widetilde{N}_i - \hbar)(\widetilde{N}_i-2\hbar)\cdots (\widetilde{N}_i-r_i\hbar+\hbar)

and also define

\widetilde{N}^{\underline{r}} = \widetilde{N}_1^{\underline{r_1}} \; \cdots \;\widetilde{N}_k^{\underline{r_k}}

for any multi-index r.

Using these, we get the following equation:

\displaystyle{\frac{1}{\hbar}\frac{d}{dt} \langle \widetilde{N}_i \psi(t)\rangle = \sum_{\tau \in T} r(\tau) (t_i(\tau)-s_i(\tau)) \; \langle \widetilde{N}^{\underline{s(\tau)}} \psi(t)\rangle \; \frac{1}{\hbar^{|s(\tau)|}} }

where for any multi-index r we set

|r| = r_1 + \cdots + r_k

This suggests a way to rescale the rate constants in the master equation:

Definition. The rescaled rate constants are

\displaystyle{\widetilde{r}(\tau) = \frac{r(\tau)}{\hbar^{|s(\tau)|-1}}}

From here onwards, we change our viewpoint. We consider the rescaled rate constants \widetilde{r}(\tau) to be fixed, instead of the original rate constants r(\tau). So, as we decrease \hbar, we are studying situations where the original rate constants change to ensure that the rescaled rate constants stays fixed!

So, we switch to working with a rescaled master equation:

Definition. The rescaled master equation is:

\displaystyle{\frac{d}{dt} \langle \widetilde{N}_i\widetilde{\psi}(t)\rangle = \sum_{\tau \in T} \widetilde{r}(\tau) (t_i(\tau)-s_i(\tau)) \; \langle \widetilde{N}^{\underline{s(\tau)}} \widetilde{\psi}(t)\rangle }

This is really a one-parameter family of equations, depending on \hbar\in (0,\infty). We write a solution of the rescaled master equation as \widetilde{\psi}(t), but it is really one solution for each value of \hbar.

Following the same procedure as above, we can rescale the rate equation, using the same definition of the rescaled rate constants:

Definition. The rescaled number of objects of the i\mathrm{th} species is defined as \widetilde{x_i}=\hbar x_i, where x_i is the original number of objects of the i\mathrm{th} species. Here again, we are counting in bunches of 1/\hbar.

Using this to rescale the rate equation, we get

Definition. The rescaled rate equation is

\displaystyle{\frac{d}{dt}\widetilde{x}(t) = \sum_{\tau \in T} \widetilde{r}(\tau) (t(\tau)-s(\tau))\; \widetilde{x}(t)^{s(\tau)} }


\widetilde{x}(t)=(\widetilde{x_1}(t),\widetilde{x_2}(t),\dots, \widetilde{x_k}(t))

Therefore, to go from the rescaled master equation to the rescaled rate equation, we require that

\langle\widetilde{N}^{\underline{r}}\, \widetilde{\psi}(t)\rangle \to \langle\widetilde{N}\widetilde{\psi}(t)\rangle^r

as \hbar \to 0. If this holds, we can identify \langle\widetilde{N}\widetilde{\psi}(t)\rangle with \widetilde{x}(t) and get the rate equation from the master equation!

To this end, we introduce the following crucial idea:

Definition. A semiclassical family of states, \widetilde{\psi}, is defined as a one-parameter family of states depending on \hbar \in (0,\infty) such that for some \widetilde{c} \in [0,\infty)^k we have

\langle\widetilde{N}^r\widetilde{\psi}\rangle \to \widetilde{c}^{\, r}

for every r\in \mathbb{N}^k as \hbar \to 0.

In particular, this implies

\langle\widetilde{N}_i\widetilde{\psi}\rangle \to \widetilde{c}_i

for every index i.

Intuitively, a semiclassical family is a family of probability distributions that becomes more and more sharply peaked with a larger and larger mean as \hbar decreases. We would like to show that in this limit, the rescaled master equation gives the rescaled rate equation.

We make this precise in the following propositions.

Proposition 1. If \widetilde{\psi} is a semiclassical family as defined above, then in the \hbar \to 0 limit, we have \langle\widetilde{N}^{\underline{r}}\widetilde{\psi}\rangle \to \widetilde{c}^{\; r} as well.

Proof. For each index i,

\displaystyle{ \langle\widetilde{N}_i^{\; \underline{r_i}}\, \widetilde{\psi}\rangle = \displaystyle{ \langle \widetilde{N}_i (\widetilde{N}_i - \hbar)(\widetilde{N}_i-2\hbar)\cdots(\widetilde{N}_i-r_i\hbar+\hbar)\,\widetilde{\psi}\rangle} }

\displaystyle{ = \Big\langle\Big(\widetilde{N}_i^r + \hbar\frac{r_i(r_i-1)}{2}\widetilde{N}_i^{r_i-1}+\cdots + \hbar^{r_i-1}(r_i-1)!\Big)\,\widetilde{\psi}\Big\rangle }

By the definition of a semiclassical family,

\displaystyle{ \lim_{\hbar\to 0} \langle\Big(\widetilde{N}_i^{r_i} + \hbar\frac{r_i(r_i-1)}{2}\widetilde{N}_i^{r_i-1}+ \cdots + \hbar^{r_i-1}(r_i-1)!\Big)\;\widetilde{\psi}\rangle} = \widetilde{c}_i^{\; r_i}

since every term but the first approaches zero. Thus, we have

\displaystyle{ \lim_{\hbar \to 0} \langle\widetilde{N}_i^{\; \underline{r_i}}\, \widetilde{\psi}\rangle =   \widetilde{c}_i^{\; r_i} }

A similar but more elaborate calculation shows that

\displaystyle{ \lim_{\hbar \to 0} \langle\widetilde{N}_1^{\, \underline{r_1}} \cdots\widetilde{N}_k^{\, \underline{r_k}} \, \widetilde{\psi}\rangle = \lim_{\hbar \to 0} \langle\widetilde{N}_1^{\, r_1}\cdots \widetilde{N}_k^{\, r_k} \widetilde{\psi}\rangle= \lim_{\hbar \to 0}\langle\widetilde{N}^{\, r} \, \widetilde{\psi}\rangle }

or in other words

\langle\widetilde{N}^{\, \underline{r}}\,\widetilde{\psi}\rangle \to \widetilde{c}^{\, r}

as \hbar \to 0.   █

Proposition 2. If \widetilde{\psi} is a semiclassical family of states, then

\displaystyle{  \langle (\widetilde{N}-\widetilde{c})^{r}\, \widetilde{\psi}\rangle \to 0 }

for any multi-index r.

Proof. Consider the r_i\mathrm{th} centered moment of the i\mathrm{th} number operator:

\displaystyle{\langle(\widetilde{N}_i-\widetilde{c}_i)^{r_i}\widetilde{\psi}\rangle = \sum_{p =0}^{r_i} {r_i \choose p}\langle\widetilde{N}_i^p\widetilde{\psi}\rangle(-\widetilde{c}_i)^{r_i-p} }

Taking the limit as \hbar goes to zero, this becomes

\begin{array}{ccl} \displaystyle{ \lim_{\hbar \to 0}\sum_{p =0}^{r_i} {r_i \choose p}\langle\widetilde{N}_i^p\widetilde{\psi}\rangle(-\widetilde{c}_i)^{r_i-p} } &=& \displaystyle{ \sum_{p =0}^{r_i} {r_i \choose p}(\widetilde{c}_i)^p(-\widetilde{c}_i)^{r_i-p} } \\ \\  &=& (\widetilde{c}_i-\widetilde{c}_i)^{r_i} \\ \\  &=& 0 \end{array}

For a general multi-index r we can prove the same sort of thing with a more elaborate calculation. First note that

\langle (\widetilde{N}-\widetilde{c})^{r}\widetilde{\psi}\rangle=\langle(\widetilde{N_1}-\widetilde{c_1})^{r_1} \cdots (\widetilde{N_k}-\widetilde{c_k})^{r_k})\widetilde{\psi}\rangle

The right-hand side can be expanded as

\displaystyle{ \langle(\sum_{p_1 =0}^{r_1} {r_1 \choose p_1}\widetilde{N}_1^{p_1}(-\widetilde{c}_1)^{r_1-p_1} ) \cdots (\sum_{p_k =0}^{r_k} {r_k \choose p_k}\widetilde{N}_k^{p_k}(-\widetilde{c}_k)^{r_k-p_k} )\widetilde{\psi} }\rangle

We can write this more tersely as

\displaystyle{ \sum_{p=0}^r} {r\choose p} \langle \widetilde{N}^p\widetilde{\psi}\rangle (-\widetilde{c})^{r-p}

where for any multi-index r we define

\displaystyle{{\sum_{p=0}^{r}}= \sum_{p_1 =0}^{r_1} \cdots  \sum_{p_k =0}^{r_k}  }

and for any multi-indices r, p we define

\displaystyle{ {r \choose p}={r_1 \choose p_1}{r_2 \choose p_2}\cdots {r_k \choose p_k}}

Now using the definition of a semiclassical state, we see

\displaystyle{ \lim_{\hbar \to 0} \sum_{p=0}^r} {r\choose p} \langle \widetilde{N}^p\widetilde{\psi}\rangle (-\widetilde{c})^{r-p}= \displaystyle{ \sum_{p=0}^r} {r\choose p} (\widetilde{c})^{p} (-\widetilde{c})^{r-p}

But this equals zero, as the last expression, expanded, is

\displaystyle{ (\widetilde{c})^r \left( \sum_{p_1=0}^{r_1} {r_1\choose p_1} (-1)^{r_1-p_1}\right) \cdots \left( \sum_{p_k=0}^{r_k} {r_k\choose p_k} (-1)^{r_k-p_k} \right) }

where each individual sum is zero.   █

Here is the theorem that would finish the job if we could give a fully rigorous proof:

“Theorem.” If \widetilde{\psi}(t) is a solution of the rescaled master equation and also a semiclassical family for the time interval [t_0,t_1], then \widetilde{x}(t) = \langle \widetilde{N} \widetilde{\psi}(t) \rangle is a solution of the rescaled rate equation for t \in [t_0,t_1].

Proof sketch. We sketch a proof that relies on the assumption that we can pass the \hbar \to 0 limit through a time derivative. Of course, to make this rigorous, we would need to justify this. Perhaps it is true only in certain cases.

Assuming that we can pass the limit through the derivative:

\displaystyle{\lim_{\hbar \to 0}\frac{d}{dt} \langle \widetilde{N}\widetilde{\psi}(t)\rangle = \lim_{\hbar \to 0} \sum_{\tau \in T} \widetilde{r}(\tau) (t(\tau)-s(\tau))\langle \widetilde{N}^{\, \underline{s(\tau)}} \, \widetilde{\psi}(t)\rangle }

and thus

\displaystyle{\frac{d}{dt}\lim_{\hbar \to 0} \langle \widetilde{N}\widetilde{\psi}(t)\rangle = \sum_{\tau \in T} \widetilde{r}(\tau) (t(\tau)-s(\tau)) \lim_{\hbar \to 0}\langle \widetilde{N}^{\, \underline{s(\tau)}} \, \widetilde{\psi}(t)\rangle }

and thus

\displaystyle{\frac{d}{dt}\widetilde{x}(t) = \sum_{\tau \in T} \widetilde{r}(\tau) (t(\tau)-s(\tau))\widetilde{x}^{\, s(\tau)} }.

As expected, we obtain the rescaled rate equation.   █

Another question is this: if we start with a semiclassical family of states as our initial data, does it remain semiclassical as we evolve it in time? This will probably be true only in certain cases.

An example: rescaled coherent states

The best-behaved semiclassical states are the coherent states.
Consider the family of coherent states

\displaystyle{\widetilde{\psi}_{\widetilde{c}} = \frac{e^{(\widetilde{c}/\hbar) z}}{e^{\widetilde{c}/\hbar}}}

using the notation developed in the earlier mentioned paper. In that paper it was shown that for any multi-index m and any coherent state \Psi we have

\langle N^{\underline{m}}\Psi\rangle = \langle N\Psi \rangle^m

Using this result for \widetilde{\psi}_{\widetilde{c}} we get

\displaystyle{\langle \widetilde{N}^{\underline{m}}\widetilde{\psi}_{\widetilde{c}}\rangle~=~\hbar^{|m|}\langle N^{\underline{m}}\widetilde{\psi}_{\widetilde{c}}\rangle~=~\hbar^{|m|}\langle N\widetilde{\psi}_{\widetilde{c}}\rangle^m~=~\hbar^{|m|}\frac{\widetilde{c}^m}{\hbar^{|m|}}~=~\widetilde{c}^m}

Since \langle\widetilde{N}^{\underline{m}}\widetilde{\psi}_{\widetilde{c}}\rangle equals \langle \widetilde{N}^{m} \widetilde{\psi}_{\widetilde{c}}\rangle plus terms of order \hbar, as \hbar \to 0 we have


showing that our chosen \widetilde{\psi}_{\widetilde{c}} is indeed a semiclassical family.


Get every new post delivered to your Inbox.

Join 2,939 other followers