Noether’s Theorem: Quantum vs Stochastic

3 May, 2014

guest post by Ville Bergholm

In 1915 Emmy Noether discovered an important connection between the symmetries of a system and its conserved quantities. Her result has become a staple of modern physics and is known as Noether’s theorem.

Photo of Emmy Noether

The theorem and its generalizations have found particularly wide use in quantum theory. Those of you following the Network Theory series here on Azimuth might recall Part 11 where John Baez and Brendan Fong proved a version of Noether’s theorem for stochastic systems. Their result is now published here:

• John Baez and Brendan Fong, A Noether theorem for stochastic mechanics, J. Math. Phys. 54:013301 (2013).

One goal of the network theory series here on Azimuth has been to merge ideas appearing in quantum theory with other disciplines. John and Brendan proved their stochastic version of Noether’s theorem by exploiting ‘stochastic mechanics’, which was formulated in the network theory series to mathematically resemble quantum theory. Their result, which we will outline below, differs from what would be expected in quantum theory, so it is interesting to try to figure out why.

Recently Jacob Biamonte, Mauro Faccin and I have been working to get to the bottom of these differences. What we’ve done is prove a version of Noether’s theorem for Dirichlet operators. As you may recall from Parts 16 and 20 of the network theory series, these are the operators that generate both stochastic and quantum processes. In the language of the series, they lie in the intersection of stochastic and quantum mechanics. So, they are a subclass of the infinitesimal stochastic operators considered in John and Brendan’s work.

The extra structure of Dirichlet operators—compared with the wider class of infinitesimal stochastic operators—provided a handle for us to dig a little deeper into understanding the intersection of these two theories. By the end of this article, astute readers will be able to prove that Dirichlet operators generate doubly stochastic processes.

Before we get into the details of our proof, let’s recall first how conservation laws work in quantum mechanics, and then contrast this with what John and Brendan discovered for stochastic systems. (For a more detailed comparison between the stochastic and quantum versions of the theorem, see Part 13 of the network theory series.)

The quantum case

I’ll assume you’re familiar with quantum theory, but let’s start with a few reminders.

In standard quantum theory, when we have a closed system with n states, the unitary time evolution of a state |\psi(t)\rangle is generated by a self-adjoint n \times n matrix H called the Hamiltonian. In other words, |\psi(t)\rangle satisfies Schrödinger’s equation:

i \hbar \displaystyle{\frac{d}{d t}} |\psi(t) \rangle = H |\psi(t) \rangle.

In units where \hbar = 1, the state of a system starting off at time zero in the state |\psi_0 \rangle and evolving for a time t is then given by

|\psi(t) \rangle = e^{-i t H}|\psi_0 \rangle.

The observable properties of a quantum system are associated with self-adjoint operators. In the state |\psi \rangle, the expected value of the observable associated to a self-adjoint operator O is

\langle O \rangle_{\psi} = \langle \psi | O | \psi \rangle

This expected value is constant in time for all states if and only if O commutes with the Hamiltonian H:

[O, H] = 0 \quad \iff \quad \displaystyle{\frac{d}{d t}} \langle O \rangle_{\psi(t)} = 0 \quad \forall \: |\psi_0 \rangle, \forall t.

In this case we say O is a ‘conserved quantity’. The fact that we have two equivalent conditions for this is a quantum version of Noether’s theorem!
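Here is a minimal numerical sketch of this equivalence, assuming NumPy and units where \hbar = 1. The 3-state Hamiltonian, the two observables and the initial state are hypothetical choices made only for illustration; the time evolution e^{-i t H} is computed from the eigendecomposition of H.

```python
import numpy as np

def evolve(H, psi0, t):
    """Return e^{-i t H} psi0 for a self-adjoint H, via its eigendecomposition."""
    energies, vecs = np.linalg.eigh(H)
    return vecs @ (np.exp(-1j * t * energies) * (vecs.conj().T @ psi0))

def expectation(O, psi):
    """Expected value <psi|O|psi>, which is real for self-adjoint O."""
    return float(np.real(np.vdot(psi, O @ psi)))

def commutator_norm(A, B):
    """Frobenius norm of [A, B]."""
    return float(np.linalg.norm(A @ B - B @ A))

# Hypothetical example: H couples states 0 and 1 and leaves state 2 alone.
H = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0]])
O_commuting = np.diag([1.0, 1.0, 5.0])  # same value on the coupled states
O_other = np.diag([1.0, 2.0, 5.0])      # distinguishes states 0 and 1

psi0 = np.array([1.0, 0.0, 0.0], dtype=complex)
times = np.linspace(0.0, 2.0, 5)

values_commuting = [expectation(O_commuting, evolve(H, psi0, t)) for t in times]
values_other = [expectation(O_other, evolve(H, psi0, t)) for t in times]

print("||[O_commuting, H]|| =", commutator_norm(O_commuting, H), values_commuting)
print("||[O_other, H]||     =", commutator_norm(O_other, H), values_other)
```

The observable that commutes with H keeps a constant expected value along the evolution, while the one that does not commute has an expected value that changes with t.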

The stochastic case

In stochastic mechanics, the story changes a bit. Now a state |\psi(t)\rangle is a probability distribution: a vector with n nonnegative components that sum to 1. Schrödinger’s equation gets replaced by the master equation:

\displaystyle{\frac{d}{d t}} |\psi(t) \rangle = H |\psi(t) \rangle

If we start with a probability distribution |\psi_0 \rangle at time zero and evolve it according to this equation, at any later time we have

|\psi(t)\rangle = e^{t H} |\psi_0 \rangle.

We want this to always be a probability distribution. To ensure that this is so, the Hamiltonian H must be infinitesimal stochastic: that is, a real-valued n \times n matrix where the off-diagonal entries are nonnegative and the entries of each column sum to zero. It no longer needs to be self-adjoint!

When H is infinitesimal stochastic, the operators e^{t H} map the set of probability distributions to itself whenever t \ge 0, and we call this family of operators a continuous-time Markov process, or more precisely a Markov semigroup.
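To make this concrete, here is a small sketch assuming NumPy. The rate matrix and initial distribution are hypothetical, and the helper `expm` is a simple scaling-and-squaring Taylor approximation written for this illustration, not a library routine. It checks the two defining conditions on H and then confirms that e^{t H} sends a probability distribution to a probability distribution.

```python
import numpy as np

def expm(A, squarings=12, terms=18):
    """Matrix exponential by scaling and squaring with a truncated Taylor series.
    Adequate for the small, well-scaled matrices used in this sketch."""
    A = np.asarray(A, dtype=float) / (2 ** squarings)
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def is_infinitesimal_stochastic(H, tol=1e-12):
    """Off-diagonal entries nonnegative and every column summing to zero."""
    H = np.asarray(H, dtype=float)
    off_diagonal = H - np.diag(np.diag(H))
    return bool(np.all(off_diagonal >= -tol)
                and np.allclose(H.sum(axis=0), 0.0, atol=tol))

# Hypothetical rates: H[i, j] is the rate of jumping from state j to state i.
# Note that this H is not symmetric.
H = np.array([[-1.0,  2.0,  0.0],
              [ 1.0, -2.0,  3.0],
              [ 0.0,  0.0, -3.0]])

psi0 = np.array([0.2, 0.3, 0.5])
psi_t = expm(1.5 * H) @ psi0

print(is_infinitesimal_stochastic(H), psi_t, psi_t.sum())
```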

In stochastic mechanics, we say an observable O is a real diagonal n \times n matrix, and its expected value is given by

\langle O\rangle_{\psi} = \langle \hat{O} | \psi \rangle

where \hat{O} is the vector built from the diagonal entries of O. More concretely,

\langle O\rangle_{\psi} = \displaystyle{ \sum_i O_{i i} \psi_i }

where \psi_i is the ith component of the vector |\psi\rangle.

Here is a version of Noether’s theorem for stochastic mechanics:

Noether’s Theorem for Markov Processes (Baez–Fong). Suppose H is an infinitesimal stochastic operator and O is an observable. Then

[O,H] =0

if and only if

\displaystyle{\frac{d}{d t}} \langle O \rangle_{\psi(t)} = 0

and

\displaystyle{\frac{d}{d t}}\langle O^2 \rangle_{\psi(t)} = 0

for all t \ge 0 and all \psi(t) obeying the master equation.   █

So, just as in quantum mechanics, whenever [O,H]=0 the expected value of O will be conserved:

\displaystyle{\frac{d}{d t}} \langle O\rangle_{\psi(t)} = 0

for any \psi_0 and all t \ge 0. However, John and Brendan saw that—unlike in quantum mechanics—you need more than just the expectation value of the observable O to be constant to obtain the equation [O,H]=0. You really need both

\displaystyle{\frac{d}{d t}} \langle O\rangle_{\psi(t)} = 0

together with

\displaystyle{\frac{d}{d t}} \langle O^2\rangle_{\psi(t)} = 0

for all initial data \psi_0 to be sure that [O,H]=0.

So it’s a bit subtle, but symmetries and conserved quantities have a rather different relationship than they do in quantum theory.
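A small example shows why the second condition cannot be dropped. The sketch below assumes NumPy, and the 3-state process is a hypothetical choice: from the middle state the system jumps to either neighbour at the same rate, and the two outer states are absorbing. Since the time derivative of the expected value is \langle \hat{O} | H |\psi(t) \rangle, it vanishes for every state exactly when the row vector \hat{O} H is zero, so no matrix exponential is needed.

```python
import numpy as np

# Hypothetical process: H[i, j] is the rate of jumping from state j to state i.
# From state 1 the system jumps to state 0 or state 2, each at rate 1.
H = np.array([[0.0,  1.0, 0.0],
              [0.0, -2.0, 0.0],
              [0.0,  1.0, 0.0]])
O = np.diag([0.0, 1.0, 2.0])
O_hat = np.diag(O)  # vector of diagonal entries of O

commutator = O @ H - H @ O

# d/dt <O> = <O_hat| H |psi(t)> and d/dt <O^2> = <O_hat^2| H |psi(t)>.
rate_of_mean = O_hat @ H
rate_of_second_moment = (O_hat ** 2) @ H

print("O_hat H   =", rate_of_mean)
print("O_hat^2 H =", rate_of_second_moment)
print("[O, H] =\n", commutator)
```

In this example the expected value of O is conserved because the jumps left and right balance, yet the expected value of O^2 grows whenever the middle state is occupied, and O does not commute with H.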

A Noether theorem for Dirichlet operators

But what if the infinitesimal generator of our Markov semigroup is also self-adjoint? In other words, what if H is both an infinitesimal stochastic matrix and its own transpose: H = H^\top? Then it’s called a Dirichlet operator… and we found that in this case, we get a stochastic version of Noether’s theorem that more closely resembles the usual quantum one:

Noether’s Theorem for Dirichlet Operators. If H is a Dirichlet operator and O is an observable, then

[O, H] = 0 \quad \iff \quad \displaystyle{\frac{d}{d t}} \langle O \rangle_{\psi(t)} = 0 \quad \forall \: |\psi_0 \rangle, \forall t \ge 0

Proof. The \Rightarrow direction is easy to show, and it follows from John and Brendan’s theorem. The point is to show the \Leftarrow direction. Since H is self-adjoint, we may use a spectral decomposition:

H = \displaystyle{ \sum_k E_k |\phi_k \rangle \langle \phi_k |}

where \phi_k are an orthonormal basis of eigenvectors, and E_k are the corresponding eigenvalues. We then have:

\displaystyle{\frac{d}{d t}} \langle O \rangle_{\psi(t)} = \langle \hat{O} | H e^{t H} |\psi_0 \rangle = 0 \quad \forall \: |\psi_0 \rangle, \forall t \ge 0

\iff \quad \langle \hat{O}| H e^{t H} = 0 \quad \forall t \ge 0

\iff \quad \sum_k \langle \hat{O} | \phi_k \rangle E_k e^{t E_k} \langle \phi_k| = 0 \quad \forall t \ge 0

\iff \quad \langle \hat{O} | \phi_k \rangle E_k e^{t E_k} = 0 \quad \forall k, \forall t \ge 0

\iff \quad |\hat{O} \rangle \in \mathrm{Span}\{|\phi_k \rangle \, : \; E_k = 0\} = \ker \: H,

where the third equivalence is due to the vectors |\phi_k \rangle being linearly independent. For any infinitesimal stochastic operator H the corresponding transition graph consists of m connected components iff we can reorder (permute) the states of the system such that H becomes block-diagonal with m blocks. Now it is easy to see that the kernel of H is spanned by m eigenvectors, one for each block. Since H is also symmetric, the elements of each such vector can be chosen to be ones within the block and zeros outside it. Consequently

|\hat{O} \rangle \in \ker \: H

implies that we can choose the basis of eigenvectors of O to be the vectors |\phi_k \rangle, which implies

[O, H] = 0.

Alternatively,

|\hat{O} \rangle \in \ker \, H

implies that

|\hat{O^2} \rangle \in \ker \: H \; \iff \; \cdots \; \iff \; \displaystyle{\frac{d}{d t}} \langle O^2 \rangle_{\psi(t)} = 0 \; \forall \: |\psi_0 \rangle, \forall t \ge 0,

where we have used the above sequence of equivalences backwards. Now, using John and Brendan’s original proof, we can obtain [O, H] = 0.   █

In summary, by restricting ourselves to the intersection of quantum and stochastic generators, we have found a version of Noether’s theorem for stochastic mechanics that looks formally just like the quantum version! However, this simplification comes at a cost. We find that the only observables O whose expected value remains constant with time are those of the very restricted type described above, where the observable has the same value in every state in a connected component.
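The following sketch, assuming NumPy, illustrates this on a hypothetical graph with two connected components. The helper `dirichlet_from_edges` is a name introduced here: it builds a symmetric infinitesimal stochastic matrix from undirected edges with unit rates. The check `conserved_in_mean` tests whether \hat{O} lies in the kernel of H, which by the chain of equivalences in the proof is the same as the expected value of O being constant.

```python
import numpy as np

def dirichlet_from_edges(n, edges):
    """Symmetric H with unit off-diagonal rates on the given undirected edges
    and diagonal entries chosen so that every column sums to zero."""
    H = np.zeros((n, n))
    for i, j in edges:
        H[i, j] += 1.0
        H[j, i] += 1.0
    H -= np.diag(H.sum(axis=0))
    return H

def conserved_in_mean(O, H):
    """True when the row vector O_hat H vanishes, i.e. O_hat is in ker H."""
    return bool(np.allclose(np.diag(O) @ H, 0.0))

def commutes(O, H):
    return bool(np.allclose(O @ H, H @ O))

# Hypothetical graph with connected components {0, 1, 2} and {3, 4}.
H = dirichlet_from_edges(5, [(0, 1), (1, 2), (3, 4)])

O_blockwise = np.diag([7.0, 7.0, 7.0, -1.0, -1.0])  # constant on each component
O_generic = np.diag([0.0, 1.0, 2.0, -1.0, -1.0])    # varies inside a component

# The kernel of H has one dimension per connected component.
kernel_dimension = int(np.sum(np.isclose(np.linalg.eigvalsh(H), 0.0)))

print(conserved_in_mean(O_blockwise, H), commutes(O_blockwise, H))
print(conserved_in_mean(O_generic, H), commutes(O_generic, H))
print("dim ker H =", kernel_dimension)
```

For this symmetric H the two tests agree on both observables: the one that is constant on each component passes both, and the one that varies within a component fails both.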


Puzzles

Suppose we have a graph whose graph Laplacian matrix H generates a Markov semigroup as follows:

U(t) = e^{t H}

Puzzle 1. Suppose that also H = H^\top, so that H is a Dirichlet operator and hence i H generates a 1-parameter unitary group. Show that the indegree and outdegree of any node of our graph must be equal. Graphs with this property are called balanced.

Puzzle 2. Suppose that U(t) = e^{t H} is a doubly stochastic Markov semigroup, meaning that for all t \ge 0 each row and each column of U(t) sums to 1:

\displaystyle{ \sum_i U(t)_{i j} = \sum_j U(t)_{i j} = 1 }

and all the matrix entries are nonnegative. Show that the Hamiltonian H obeys

\displaystyle{\sum_i H_{i j} = \sum_j H_{i j} = 0 }

and all the off-diagonal entries of H are nonnegative. Show the converse is also true.

Puzzle 3. Prove that any doubly stochastic Markov semigroup U(t) is of the form e^{t H} where H is the graph Laplacian of a balanced graph.

Puzzle 4. Let O(t) be a possibly time-dependent observable, and write \langle O(t) \rangle_{\psi(t)} for its expected value with respect to some initial state \psi_0 evolving according to the master equation. Show that

\displaystyle{ \frac{d}{d t}\langle O(t)\rangle_{\psi(t)} = \left\langle [O(t), H] \right\rangle_{\psi(t)} + \left\langle \frac{\partial O(t)}{\partial t}\right\rangle_{\psi(t)} }

This is a stochastic version of the Ehrenfest theorem.
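Before attempting the puzzles, it may help to experiment numerically. The sketch below assumes NumPy; both graphs are hypothetical, and `expm` is a simple scaling-and-squaring Taylor helper written for this illustration. It compares a balanced directed graph (a directed 3-cycle, which is not symmetric) with an unbalanced one (a single directed edge) and tests whether e^{t H} is doubly stochastic. This is a sanity check on examples, not a proof.

```python
import numpy as np

def expm(A, squarings=12, terms=18):
    """Matrix exponential by scaling and squaring with a truncated Taylor series.
    Adequate for the small, well-scaled matrices used in this sketch."""
    A = np.asarray(A, dtype=float) / (2 ** squarings)
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def is_doubly_stochastic(U, tol=1e-9):
    """Nonnegative entries, with every row and every column summing to 1."""
    U = np.asarray(U)
    return bool(np.all(U >= -tol)
                and np.allclose(U.sum(axis=0), 1.0, atol=tol)
                and np.allclose(U.sum(axis=1), 1.0, atol=tol))

# Convention assumed here: H[i, j] is the rate of jumping from node j to node i.
# Directed 3-cycle 0 -> 1 -> 2 -> 0: every node has indegree 1 and outdegree 1.
H_balanced = np.array([[-1.0,  0.0,  1.0],
                       [ 1.0, -1.0,  0.0],
                       [ 0.0,  1.0, -1.0]])
# Single directed edge 0 -> 1: node 0 has outdegree 1 but indegree 0.
H_unbalanced = np.array([[-1.0, 0.0, 0.0],
                         [ 1.0, 0.0, 0.0],
                         [ 0.0, 0.0, 0.0]])

U_balanced = expm(0.8 * H_balanced)
U_unbalanced = expm(0.8 * H_unbalanced)

print(is_doubly_stochastic(U_balanced), is_doubly_stochastic(U_unbalanced))
```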

Networks of Dynamical Systems

18 March, 2014

guest post by Eugene Lerman

Hi, I’m Eugene Lerman. I met John back in the mid 1980s when John and I were grad students at MIT. John was doing mathematical physics and I was studying symplectic geometry. We never talked about networks. Now I teach in the math department at the University of Illinois at Urbana, and we occasionally talk about networks on his blog.

A few years ago a friend of mine who studies locomotion in humans and other primates asked me if I knew of any math that could be useful to him.

I remember coming across an expository paper on ‘coupled cell networks’:

• Martin Golubitsky and Ian Stewart, Nonlinear dynamics of networks: the groupoid formalism, Bull. Amer. Math. Soc. 43 (2006), 305–364.

In this paper, Golubitsky and Stewart used the study of animal gaits and models for the hypothetical neural networks called ‘central pattern generators’ that give rise to these gaits to motivate the study of networks of ordinary differential equations with symmetry. In particular they were interested in ‘synchrony’. When a horse trots, or canters, or gallops, its limbs move in an appropriate pattern, with different pairs of legs moving in synchrony:

They explained that synchrony (and the patterns) could arise when the differential equations have finite group symmetries. They also proposed several systems of symmetric ordinary differential equations that could generate the appropriate patterns.

Later on Golubitsky and Stewart noticed that there are systems of ODEs that have no group symmetries but whose solutions nonetheless exhibit certain synchrony. They found an explanation: these ODEs were ‘groupoid invariant’. I thought that it would be fun to understand what ‘groupoid invariant’ meant and why such invariance leads to synchrony.

I talked my colleague Lee DeVille into joining me on this adventure. At the time Lee had just arrived at Urbana after a postdoc at NYU. After a few years of thinking about these networks Lee and I realized that strictly speaking one doesn’t really need groupoids for these synchrony results and it’s better to think of the social life of networks instead. Here is what we figured out—a full and much too precise story is here:

• Eugene Lerman and Lee DeVille, Dynamics on networks of manifolds.

Let’s start with an example of a class of ODEs with a mysterious property:

Example. Consider this ordinary differential equation for a function \vec{x} : \mathbb{R} \to {\mathbb{R}}^3

\begin{array}{rcl}  \dot{x}_1&=& f(x_1,x_2)\\  \dot{x}_2&=& f(x_2,x_1)\\  \dot{x}_3&=& f(x_3, x_2)  \end{array}

for some function f:{\mathbb{R}}^2 \to {\mathbb{R}}. It is easy to see that a function x(t) solving

\displaystyle{  \dot{x} = f(x,x)  }

gives a solution of these equations if we set

\vec{x}(t) = (x(t),x(t),x(t))

You can think of the differential equations in this example as describing the dynamics of a complex system built out of three interacting subsystems. Then any solution of the form

\vec{x}(t) = (x(t),x(t),x(t))

may be thought of as a synchronization: the three subsystems are evolving in lockstep.

One can also view the result geometrically: the diagonal

\displaystyle{  \Delta = \{(x_1,x_2, x_3)\in {\mathbb{R}}^3 \mid x_1 =x_2 = x_3\}  }

is an invariant subsystem of the continuous-time dynamical system defined by the differential equations. Remarkably enough, such a subsystem exists for any choice of a function f.
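This is easy to try out numerically. The sketch below uses only the Python standard library; the coupling function f, the step size and the initial conditions are hypothetical choices, and explicit Euler is just one convenient integration scheme. Starting on the diagonal, the three components receive identical updates at every step, so they stay equal.

```python
import math

def f(x, u):
    """Hypothetical coupling function; any function of two variables would do."""
    return math.sin(u) - 0.5 * x

def vector_field(state):
    """Right-hand side of the three coupled equations of the example."""
    x1, x2, x3 = state
    return (f(x1, x2), f(x2, x1), f(x3, x2))

def euler(state, dt, steps):
    """Explicit Euler integration of the system."""
    for _ in range(steps):
        velocity = vector_field(state)
        state = tuple(s + dt * v for s, v in zip(state, velocity))
    return state

on_diagonal = euler((0.7, 0.7, 0.7), dt=0.01, steps=500)
off_diagonal = euler((0.7, 0.2, -0.4), dt=0.01, steps=500)

print("started on the diagonal: ", on_diagonal)
print("started off the diagonal:", off_diagonal)
```

Swapping in a different f changes the trajectory but not the fact that a solution starting on the diagonal stays there.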

Where does such a synchronization or invariant subsystem come from? There is no apparent symmetry of {\mathbb{R}}^3 that preserves the differential equations and fixes the diagonal \Delta, and thus could account for this invariant subsystem. It turns out that what matters is the structure of the mutual dependencies of the three subsystems making up the big system. That is, the evolution of x_1 depends only on x_1 and x_2, the evolution of x_2 depends only on x_2 and x_1, and the evolution of x_3 depends only on x_3 and x_2.

These dependencies can be conveniently pictured as a directed graph:

The graph G has no symmetries. Nonetheless, the existence of the invariant subsystem living on the diagonal \Delta can be deduced from certain properties of the graph G. The key is the existence of a surjective map of graphs

\varphi :G\to G'

from our graph G to a graph G' with exactly one node, call it a, and one arrow. It is also crucial that \varphi has the following lifting property: there is a unique way to lift the one and only arrow of G' to an arrow of G once we specify the target node of the lift.
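The lifting property can be checked mechanically. The sketch below is plain Python; the encoding of G as three arrows (2 → 1, 1 → 2 and 2 → 3, read off from the dependencies of the differential equations) and the function name `has_unique_lifting` are assumptions made for this illustration.

```python
def has_unique_lifting(g_edges, gp_edges, node_map, edge_map):
    """Check that (node_map, edge_map) is a map of graphs G -> G' such that
    every arrow of G' ending at node_map[a] lifts to exactly one arrow of G
    ending at a. Graphs are dicts: arrow name -> (source, target)."""
    # The map must respect sources and targets.
    for e, (s, t) in g_edges.items():
        sp, tp = gp_edges[edge_map[e]]
        if node_map[s] != sp or node_map[t] != tp:
            return False
    # Unique lifting of arrows once the target node of the lift is fixed.
    for a in node_map:
        for ep, (_, tp) in gp_edges.items():
            if tp != node_map[a]:
                continue
            lifts = [e for e, (_, t) in g_edges.items()
                     if t == a and edge_map[e] == ep]
            if len(lifts) != 1:
                return False
    return True

# G: an arrow b -> a means that the evolution of x_a depends on x_b.
g_edges = {"e21": (2, 1), "e12": (1, 2), "e23": (2, 3)}
# G': one node "a" with one arrow from it to itself.
gp_edges = {"loop": ("a", "a")}
node_map = {1: "a", 2: "a", 3: "a"}
edge_map = {e: "loop" for e in g_edges}

# Dropping the arrow into node 3 breaks the lifting property at that node.
broken_edges = {e: st for e, st in g_edges.items() if e != "e23"}
broken_edge_map = {e: "loop" for e in broken_edges}

print(has_unique_lifting(g_edges, gp_edges, node_map, edge_map))
print(has_unique_lifting(broken_edges, gp_edges, node_map, broken_edge_map))
```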

We now formally define the notion of a ‘network of phase spaces’ and of a continuous-time dynamical system on such a network. Equivalently, we define a ‘network of continuous-time dynamical systems’. We start with a directed graph

G=\{G_1\rightrightarrows G_0\}

Here G_1 is the set of edges, G_0 is the set of nodes, and the two arrows assign to an edge its source and target, respectively. To each node we attach a phase space (or more formally a manifold, perhaps with boundary or corners). Here ‘attach’ means that we choose a function {\mathcal P}:G_0 \to {\mathsf{PhaseSpace}}; it assigns to each node a\in G_0 a phase space {\mathcal P}(a).

In our running example, to each node of the graph G we attach the real line {\mathbb{R}}. (If we think of the set G_0 as a discrete category and {\mathsf{PhaseSpace}} as a category of manifolds and smooth maps, then {\mathcal P} is simply a functor.)

Thus a network of phase spaces is a pair (G,{\mathcal P}), where G is a directed graph and {\mathcal P} is an assignment of phase spaces to the nodes of G.

We think of the collection \{{\mathcal P}(a)\}_{a\in G_0} as the collection of phase spaces of the subsystems constituting the network (G, {\mathcal P}). We refer to {\mathcal P} as a phase space function. Since the state of the network should be determined completely and uniquely by the states of its subsystems, it is reasonable to take the total phase space of the network to be the product

\displaystyle{  {\mathbb{P}}(G, {\mathcal P}):= \bigsqcap_{a\in G_0} {\mathcal P}(a).  }

In the example the total phase space of the network (G,{\mathcal P}) is {\mathbb{R}}^3, while the phase space of the network (G', {\mathcal P}') is the real line {\mathbb{R}}.

Finally we need to interpret the arrows. An arrow b\xrightarrow{\gamma}a in a graph G should encode the fact that the dynamics of the subsystem associated to the node a depends on the states of the subsystem associated to the node b. To make this precise requires the notion of an ‘open system’, or ‘control system’, which we define below. It also requires a way to associate an open system to the set of arrows coming into a node/vertex a.

To a first approximation an open system (or control system, I use the two terms interchangeably) is a system of ODEs depending on parameters. I like to think of a control system geometrically: a control system on a phase space M controlled by the phase space U is a map

F: U\times M \to TM

where TM is the tangent bundle of the space M, so that for all (u,m)\in U\times M, F(u,m) is a vector tangent to M at the point m. Given phase spaces U and M the set of all corresponding control systems forms a vector space which we denote by

\displaystyle{ \mathsf{Ctrl}(U\times M \to M)}

(More generally one can talk about the space of control systems associated with a surjective submersion Q\to M. For us, submersions of the form U\times M \to M are general enough.)

To encode the incoming arrows, we introduce the input tree I(a) (this is a very short tree, a corolla if you like). The input tree of a node a of a graph G is a directed graph whose arrows are precisely the arrows of G coming into the vertex a, but any two parallel arrows of G with target a will have disjoint sources in I(a). In the example the input tree I of the one node a of G' is the tree

There is always a map of graphs \xi:I(a) \to G. For instance for the input tree in the example we just discussed, \xi is the map

Consequently if (G,{\mathcal P}) is a network and I(a) is an input tree of a node of G, then (I(a), {\mathcal P}\circ \xi) is also a network. This allows us to talk about the phase space {\mathbb{P}} I(a) of an input tree. In our running example,

{\mathbb{P}} I(a) = {\mathbb{R}}^2

Given a network (G,{\mathcal P}), there is a vector space \mathsf{Ctrl}({\mathbb{P}} I(a)\to {\mathbb{P}} a) of open systems associated with every node a of G.

In our running example, the vector space associated to the one node a of (G',{\mathcal P}') is

\mathsf{Ctrl}({\mathbb{R}}^2 \to {\mathbb{R}})  \simeq C^\infty({\mathbb{R}}^2, {\mathbb{R}})

In the same example, the network (G,{\mathcal P}) has three nodes and we associate the same vector space C^\infty({\mathbb{R}}^2, {\mathbb{R}}) to each one of them.

We then construct an interconnection map

\displaystyle{  {\mathcal{I}}: \bigsqcap_{a\in G_0} \mathsf{Ctrl}({\mathbb{P}} I(a)\to {\mathbb{P}} a) \to \Gamma (T{\mathbb{P}}(G, {\mathcal P})) }

from the product of spaces of all control systems to the space of vector fields

\Gamma (T{\mathbb{P}} (G, {\mathcal P}))

on the total phase space of the network. (We use the standard notation to denote the tangent bundle of a manifold R by TR and the space of vector fields on R by \Gamma (TR).) In our running example the interconnection map for the network (G',{\mathcal P}') is the map

\displaystyle{  {\mathcal{I}}: C^\infty({\mathbb{R}}^2, {\mathbb{R}}) \to C^\infty({\mathbb{R}}, {\mathbb{R}}), \quad f(x,u) \mapsto f(x,x).  }

The interconnection map for the network (G,{\mathcal P}) is the map

\displaystyle{  {\mathcal{I}}: C^\infty({\mathbb{R}}^2, {\mathbb{R}})^3 \to C^\infty({\mathbb{R}}^3, {\mathbb{R}}^3)}

given by

\displaystyle{  ({\mathcal{I}}(f_1,f_2, f_3))\,(x_1,x_2, x_3) = (f_1(x_1,x_2), f_2(x_2,x_1),  f_3(x_3,x_2)).  }

To summarize: a dynamical system on a network of phase spaces is the data (G, {\mathcal P}, (w_a)_{a\in G_0} ) where G=\{G_1\rightrightarrows G_0\} is a directed graph, {\mathcal P}:G_0\to {\mathsf{PhaseSpace}} is a phase space function and (w_a)_{a\in G_0} is a collection of open systems compatible with the graph and the phase space function. The corresponding vector field on the total space of the network is obtained by interconnecting the open systems.
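In the running example the two interconnection maps are simple enough to write as higher-order functions. The sketch below is plain Python; the function names and the sample open systems are hypothetical. Each open system is a function of two real variables (the state of the node, then the state of the node it listens to), and interconnecting them produces a vector field on the total phase space.

```python
def interconnect_G(f1, f2, f3):
    """Interconnection map for the three-node network G: takes one open
    system per node and returns a vector field on R^3."""
    def vector_field(x1, x2, x3):
        return (f1(x1, x2), f2(x2, x1), f3(x3, x2))
    return vector_field

def interconnect_G_prime(f):
    """Interconnection map for the one-node network G': f(x, u) |-> f(x, x)."""
    def vector_field(x):
        return f(x, x)
    return vector_field

# Hypothetical open systems, chosen only to make the wiring visible.
X = interconnect_G(lambda x, u: x + u,
                   lambda x, u: x * u,
                   lambda x, u: x - u)
Y = interconnect_G_prime(lambda x, u: x * u)

print(X(1, 2, 3))  # (f1(1, 2), f2(2, 1), f3(3, 2))
print(Y(3))        # f(3, 3)
```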

Dynamical systems on networks can be made into the objects of a category. Carrying this out gives us a way to associate maps of dynamical systems to combinatorial data.

The first step is to form the category of networks of phase spaces, which we call {\mathsf{Graph}}/{\mathsf{PhaseSpace}}. In this category, by definition, a morphism from a network (G,{\mathcal P}) to a network (G', {\mathcal P}') is a map of directed graphs \varphi:G\to G' which is compatible with the phase space functions:

\displaystyle{  {\mathcal P}'\circ \varphi = {\mathcal P}.  }

Using the universal properties of products it is easy to show that a map of networks \varphi: (G,{\mathcal P})\to (G',{\mathcal P}') defines a map {\mathbb{P}}\varphi of total phase spaces in the opposite direction:

\displaystyle{  {\mathbb{P}} \varphi: {\mathbb{P}} G' \to {\mathbb{P}} G.  }

In the category theory language the total phase space assignment extends to a contravariant functor

\displaystyle{ {\mathbb{P}}:  {({\mathsf{Graph}}/{\mathsf{PhaseSpace}})}^{\mbox{\sf {\tiny {op}}}} \to  {\mathsf{PhaseSpace}}.  }

We call this functor the total phase space functor. In our running example, the map

{\mathbb{P}}\varphi:{\mathbb{R}} = {\mathbb{P}}(G',{\mathcal P}') \to  {\mathbb{R}}^3 = {\mathbb{P}} (G,{\mathcal P})

is given by

\displaystyle{  {\mathbb{P}} \varphi (x) = (x,x,x).  }

Continuous-time dynamical systems also form a category, which we denote by \mathsf{DS}. The objects of this category are pairs consisting of a phase space and a vector field on this phase space. A morphism in this category is a smooth map of phase spaces that intertwines the two vector fields. That is:

\displaystyle{  \mathrm{Hom}_\mathsf{DS} ((M,X), (N,Y))   = \{f:M\to N \mid Df \circ X = Y\circ f\}  }

for any pair of objects (M,X), (N,Y) in \mathsf{DS}.

In general, morphisms in this category are difficult to determine explicitly. For example if (M, X) = ((a,b), \frac{d}{dt}) then a morphism from (M,X) to some dynamical system (N,Y) is simply a piece of an integral curve of the vector field Y defined on an interval (a,b). And if (M, X) = (S^1, \frac{d}{d\theta}) is the constant vector field on the circle then a morphism from (M,X) to (N,Y) is a periodic orbit of Y. Proving that a given dynamical system has a periodic orbit is usually hard.

Consequently, given a map of networks

\varphi:(G,{\mathcal P})\to (G',{\mathcal P}')

and a collection of open systems

\{w'_{a'}\}_{a'\in G'_0}

on (G',{\mathcal P}') we expect it to be very difficult if not impossible to find a collection of open systems \{w_a\}_{a\in G_0} so that

\displaystyle{  {\mathbb{P}} \varphi: ({\mathbb{P}} G', {\mathcal{I}}' (w'))\to ({\mathbb{P}} G, {\mathcal{I}} (w))  }

is a map of dynamical systems.

It is therefore somewhat surprising that there is a class of maps of graphs for which the above problem has an easy solution! The graph maps of this class are known by several different names. Following Boldi and Vigna we refer to them as graph fibrations. Note that despite what the name suggests, graph fibrations in general are not required to be surjective on nodes or edges. For example, the inclusion

is a graph fibration. We say that a map of networks

\varphi:(G,{\mathcal P})\to (G',{\mathcal P}')

is a fibration of networks if \varphi:G\to G' is a graph fibration. With some work one can show that a fibration of networks induces a pullback map

\displaystyle{  \varphi^*: \bigsqcap_{b\in G_0'} \mathsf{Ctrl}({\mathbb{P}} I(b)\to {\mathbb{P}} b) \to  \bigsqcap_{a\in G_0} \mathsf{Ctrl}({\mathbb{P}} I(a)\to {\mathbb{P}} a)  }

on the sets of tuples of the associated open systems. The main result of Dynamics on networks of manifolds is a proof that for a fibration of networks \varphi:(G,{\mathcal P})\to (G',{\mathcal P}') the maps \varphi^*, {\mathbb{P}} \varphi and the two interconnection maps {\mathcal{I}} and {\mathcal{I}}' are compatible:

Theorem. Let \varphi:(G,{\mathcal P})\to (G',{\mathcal P}') be a fibration of networks of manifolds. Then the pullback map

\displaystyle{ \varphi^*: \mathsf{Ctrl}(G',{\mathcal P}')\to \mathsf{Ctrl}(G,{\mathcal P})  }

is compatible with the interconnection maps

\displaystyle{  {\mathcal{I}}': \mathsf{Ctrl}(G',{\mathcal P}') \to \Gamma (T{\mathbb{P}} G') }

and

\displaystyle{  {\mathcal{I}}:  \mathsf{Ctrl}(G,{\mathcal P}) \to \Gamma (T{\mathbb{P}} G).  }

Namely for any collection w'\in \mathsf{Ctrl}(G',{\mathcal P}') of open systems on the network (G', {\mathcal P}') the diagram

commutes. In other words,

\displaystyle{ {\mathbb{P}} \varphi: ({\mathbb{P}}  (G',{\mathcal P}'), {\mathcal{I}}' (w'))\to ({\mathbb{P}} (G,  {\mathcal P}), {\mathcal{I}} (\varphi^* w')) }

is a map of continuous-time dynamical systems, a morphism in \mathsf{DS}.
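For the running example the conclusion can be spot-checked in a few lines of plain Python. The sketch assumes that the pullback of a single open system f on G' is the triple (f, f, f) on G, and uses a hypothetical f. Being a morphism in \mathsf{DS} means Df \circ X = Y \circ f; here the map is x \mapsto (x,x,x), whose derivative sends a tangent vector v to (v,v,v).

```python
import math

def f(x, u):
    """Hypothetical open system on the single node of G'."""
    return math.cos(x) * u - x ** 3

def total_phase_map(x):
    """The map of total phase spaces R -> R^3 of the running example."""
    return (x, x, x)

def total_phase_map_derivative(v):
    """Its derivative: the map is linear, so tangent vectors go to (v, v, v)."""
    return (v, v, v)

def vector_field_G_prime(x):
    """Interconnection of f on the one-node network G'."""
    return f(x, x)

def vector_field_G(x1, x2, x3):
    """Interconnection of the assumed pullback (f, f, f) on the network G."""
    return (f(x1, x2), f(x2, x1), f(x3, x2))

samples = [-1.5, 0.0, 0.3, 2.0]
lhs = [total_phase_map_derivative(vector_field_G_prime(x)) for x in samples]
rhs = [vector_field_G(*total_phase_map(x)) for x in samples]

print(lhs == rhs)
```

Checking a handful of sample points is of course no substitute for the theorem; it only shows what the intertwining condition says in coordinates.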

At this point the pure mathematician in me is quite happy: I have a theorem! Better yet, the theorem solves the puzzle at the beginning of this post. But if you have read this far, you may well be wondering: “Ok, you told us about your theorem. Very nice. Now what?”

There is plenty to do. On the practical side of things, continuous-time dynamical systems are much too limited for contemporary engineers. Most of the engineers I know care a lot more about hybrid systems. These kinds of systems exhibit both continuous-time behavior and abrupt transitions, hence the name. For example, the anti-lock brakes on a car are just that kind of system: if a sensor detects that the car is skidding and the foot brake is pressed, it starts pulsing the brakes. This is not your father’s continuous-time dynamical system! Hybrid dynamical systems are very hard to understand. They have been all the rage in the engineering literature for the last 10-15 years. Sadly, most pure mathematicians have never heard of them. It would be interesting to extend the theorem I am bragging about to hybrid systems.

Here is another problem that may be worth thinking about: how much of the theorem holds up to numerical simulation? My feeling is that any explicit numerical integration method will behave well. Implicit methods I am not sure about.

And finally a more general issue. John has been talking about networks quite a bit on this blog. But his networks and my networks look like very different mathematical structures. What do they have in common besides nodes and arrows? What mathematical structure are they glimpses of?

Network Theory II

12 March, 2014


Chemists are secretly doing applied category theory! When chemists list a bunch of chemical reactions like

C + O₂ → CO₂

they are secretly describing a ‘category’.

That shouldn’t be surprising. A category is simply a collection of things called objects together with things called morphisms going from one object to another, often written

f: x → y

The rules of a category say:

1) we can compose a morphism f: x → y and another morphism g: y → z to get an arrow gf: x → z,

2) (hg)f = h(gf), so we don’t need to bother with parentheses when composing arrows,

3) every object x has an identity morphism 1ₓ: x → x that obeys 1ₓ f = f for every morphism f ending at x and g 1ₓ = g for every morphism g starting at x.
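These three rules are easy to see in action. The following sketch is plain Python; the class `Morphism`, the object names and the sample functions are hypothetical, and morphisms are modelled as ordinary functions tagged with a source and a target. Composition is only allowed when the target of the first morphism matches the source of the second.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Morphism:
    source: str
    target: str
    apply: Callable

def compose(g, f):
    """Return gf: first do f, then g. Requires the target of f to be the source of g."""
    if f.target != g.source:
        raise ValueError("morphisms are not composable")
    return Morphism(f.source, g.target, lambda v: g.apply(f.apply(v)))

def identity(obj):
    """The identity morphism on an object."""
    return Morphism(obj, obj, lambda v: v)

f = Morphism("x", "y", lambda n: n + 1)
g = Morphism("y", "z", lambda n: 2 * n)
h = Morphism("z", "w", lambda n: n - 3)

left = compose(compose(h, g), f)    # (hg)f
right = compose(h, compose(g, f))   # h(gf)

print([left.apply(n) for n in range(4)])
print([right.apply(n) for n in range(4)])
```

Both bracketings give the same morphism from x to w, and composing with an identity on either side leaves a morphism unchanged.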

Whenever we have a bunch of things (objects) and processes (arrows) that take one thing to another, we’re likely to have a category. In chemistry, the objects are bunches of molecules and the arrows are chemical reactions. But we can ‘add’ bunches of molecules and also add reactions, so we have something more than a mere category: we have something called a symmetric monoidal category.

My talk here, part of a series, is an explanation of this viewpoint and how we can use it to take ideas from elementary particle physics and apply them to chemistry! For more details try this free book:

• John Baez and Jacob Biamonte, A Course on Quantum Techniques for Stochastic Mechanics.

as well as this paper on the Anderson–Craciun–Kurtz theorem (discussed in my talk):

• John Baez and Brendan Fong, Quantum techniques for studying equilibrium in reaction networks.

You can also see the slides of this talk. Click on any picture in the slides, or any text in blue, and get more information!

Network Theory I

2 March, 2014


Here’s a video of a talk I gave last Tuesday—part of a series. You can see the slides here:

Network Theory I: electrical circuits and signal-flow graphs.

Click on items in blue, or pictures, for more information.

One reason I’m glad I gave this talk is because afterwards Jamie Vicary pointed out some very interesting consequences of the relations among signal-flow diagrams listed in my talk. It turns out they imply equations familiar from the theory of complementarity in categorical quantum mechanics!

This is the kind of mathematical surprise that makes life worthwhile for me. It seemed utterly shocking at first, but I think I’ve figured out why it happens. Now is not the time to explain… but I’ll have to do it soon, both here and in the paper that Jason Erbele and I are writing about control theory.

For now, besides the slides, the best place to read more about this program is here:

• Brendan Fong, A compositional approach to control theory.

Categories in Control

6 February, 2014


I’m visiting Erlangen from now until the end of May, since my wife got a grant to do research here. I’m trying to get a lot of papers finished. But today I’m giving a talk in the math department of the university here, which with Germanic brevity is called the Friedrich-Alexander-Universität Erlangen-Nürnberg.

You can see my slides here, or maybe even come to my talk:

Categories in control, Thursday 6 February 2014, 16:15–18:00, Mathematics Department of the FAU, in Übungsraum 1.

The title is a pun. It’s about categories in control theory, the branch of engineering that studies dynamical systems with inputs and outputs, and how to optimize their behavior.

Control theorists often describe these systems using signal-flow graphs. Here is a very rough schematic signal-flow graph, describing the all-important concept of a ‘feedback loop’:

Here is a detailed one, describing a specific device called a servo:

The device is shown on top, and the signal-flow graph describing its behavior is at bottom. For details, click on the picture.

Now, if you have a drop of category-theorist’s blood in your veins, you’ll look at this signal-flow graph and think my god, that’s a string diagram for a morphism in a monoidal category!

And you’d be right. But if you want to learn what that means, and why it matters, read my talk slides!

The slides should make sense if you’re a mathematician, but maybe not otherwise. So, here’s the executive summary. The same sort of super-abstract math that handles things like Feynman diagrams:

also handles signal-flow graphs. The details are different in important and fascinating ways, and this is what I’m mainly concerned with. But we now understand how signal-flow graphs fit into the general theory of networks. This means we can proceed to use modern math to study them—and their relation to other kinds of networks, like electrical circuit diagrams:

More talks

Thanks to the Azimuth Project team, my graduate students and many other folks, the dream of network theory as a step toward ‘green mathematics’ seems to be coming true! There’s a vast amount left to be done, so I’d have trouble convincing a skeptic, but I feel the project has turned a corner. I now feel in my bones that it’s going to work: we’ll eventually develop a language for biology and ecology based in part on category theory.

So, I think it’s a good time to explain all the various aspects of this project that have been cooking away—some quite visibly, but others on secret back burners:

• Jacob Biamonte and I have written a book on Petri nets and chemical reaction networks. You may have seen parts of this on the blog. We started this project at the Centre for Quantum Technologies, but now he’s working at the Institute for Scientific Interchange, in Turin—and collaborating with people there on various aspects of network theory.

• Brendan Fong is working with me on electrical circuits. You may know him for his posts here on chemical reaction networks. He’s now a grad student in computer science at Oxford.

• Jason Erbele, a math grad student at U.C. Riverside, is working with me on control theory. This work is the main topic of my talk—but I also sketch how it ties together with what Brendan is doing. There’s a lot more to say here.

• Tobias Fritz, a postdoc at the Perimeter Institute, is working with me on category-theoretic aspects of information theory. We published a paper on entropy with Tom Leinster, and we’ve got a followup on relative entropy that’s almost done. I should be working on it right this instant! But for now, read the series of posts here on Azimuth: Relative Entropy Part 1, Part 2 and Part 3.

• Brendan Fong has also done some great work on Bayesian networks, using ideas that connect nicely to what Tobias and I are doing.

• Tu Pham and Franciscus Rebro are working on the math that underlies all these projects: bicategories of spans.

The computer science department at Oxford is a great place for category theory and diagrammatic reasoning, thanks to the presence of Samson Abramsky, Bob Coecke and others. I’m going to visit them from February 21 to March 14. It seems like a good time to give a series of talks on this stuff. So, stay tuned! I’ll try to make slides available here.

Wormholes and Entanglement

20 January, 2014


An apparent contradiction in what most physicists believe about black holes—the ‘firewall problem’—is making some very good physicists reach for some very crazy-sounding ideas to find a way out. In particular, Maldacena and Susskind have come up with the idea that any pair of quantum-entangled particles is actually connected by a wormhole.

Entanglement is a spooky way for far-away particles to be correlated, but you can’t use it to communicate faster than light. It’s been seen in the lab, but it’s only possible thanks to quantum mechanics. The first people to make a fuss over entanglement were Einstein, Podolsky and Rosen, back in 1935.

A wormhole is a spooky way for far-away regions of space to be connected by a kind of ‘tunnel’—but you probably can’t use it to communicate faster than light. Nobody has ever seen one, but they’re theoretically possible thanks to general relativity. The first people to make a fuss over wormholes were Einstein and Rosen, back in 1935.

So, superficially, it makes sense that there should be a connection between wormholes and entanglement. But when you learn enough physics, you’ll see that Maldacena and Susskind’s proposal sounds completely hare-brained.

But when you learn more physics—maybe more than enough?—you might decide there’s some merit to this idea after all. At the Centre for Quantum Technologies last summer, Jamie Vicary and I noticed some interesting connections between wormholes and quantum entanglement. We now have a paper out!

In it, we study quantum gravity in a universe where space is just 2-dimensional, not 3-dimensional like ours. It’s not realistic, but it has one huge advantage: there’s a working theory of what quantum gravity could be like when space is 2-dimensional, so you can calculate stuff!

So, we calculate what happens when a wormhole forms, and we show the ends look like a particle and its antiparticle (this was already known), and we note that this particle-antiparticle pair is entangled. In fact it’s completely entangled: any piece of information you might want to know about one can also be found in the other.

However, in a sense that Jamie and I make precise, this entanglement is ‘fake’. The reason is that the two ends of the wormhole are not independent things. They’re just two views of the same thing… and, technically, it doesn’t count as entanglement when something is ‘entangled with itself’. This fact is crucial to how Maldacena and Susskind want to get around the firewall problem.

For more details, try this:

Wormholes and entanglement, The n-Category Café.

This has links to other stuff, including our paper, but also some blog articles explaining the firewall problem, the paper by Maldacena and Susskind, and the original Einstein–Podolsky–Rosen and Einstein–Rosen papers (in English).

Since this quantum gravity stuff is more suited to the n-Category Café than here, I won’t enable comments here. If you want to talk, please go there. Sorry!

Quantropy (Part 4)

11 November, 2013

There’s a new paper on the arXiv:

• John Baez and Blake Pollard, Quantropy.

Blake is a physics grad student at U.C. Riverside who plans to do his thesis with me.

If you have carefully read all my previous posts on quantropy (Part 1, Part 2 and Part 3), there’s only a little new stuff here. But still, it’s better organized, and less chatty.

And in fact, Blake came up with a lot of new stuff for this paper! He studied the quantropy of the harmonic oscillator, and tweaked the analogy between statistical mechanics and quantum mechanics in an interesting way. Unfortunately, we needed to put a version of this paper on the arXiv by a deadline, and our writeup of this new work wasn’t quite ready (my fault). So, we’ll put that other stuff in a new version—or, I’m thinking now, a separate paper.

But here are two new things.

First, putting this paper on the arXiv had the usual good effect of revealing some existing work on the same topic. Joakim Munkhammar emailed me and pointed out this paper, which is free online:

• Joakim Munkhammar, Canonical relational quantum mechanics from information theory, Electronic Journal of Theoretical Physics 8 (2011), 93–108.

You’ll see it cites Garrett Lisi’s paper and pushes forward in various directions. There seems to be a typo where he writes the path integral

Z = \displaystyle{ \int e^{-\alpha S(q) } D q}

and says

In order to fit the purpose Lisi concluded that the Lagrange multiplier value \alpha \equiv 1/i \hbar. In similarity with Lisi’s approach we shall also assume that the arbitrary scaling-part of the constant \alpha is in fact 1/\hbar.

I’m pretty sure he means 1/i\hbar, given what he writes later. However, he speaks of ‘maximizing entropy’, which is not quite right for a complex-valued quantity; Blake and I prefer to give this new quantity a new name, and speak of ‘finding a stationary point of quantropy’.

But in a way these are small issues; being a mathematician, I’m more quick to spot tiny technical defects than to absorb significant new ideas, and it will take a while to really understand Munkhammar’s paper.

Second, while writing our paper, Blake and I noticed another similarity between the partition function of a classical ideal gas and the partition function of a quantum free particle. Both are given by an integral like this:

Z = \displaystyle{\int e^{-\alpha S(q) } D q }

where S is a quadratic function of q \in \mathbb{R}^n. Here n is the number of degrees of freedom for the particles in the ideal gas, or the number of time steps for a free particle on a line (where we are discretizing time). The only big difference is that

\alpha = 1/kT

for the ideal gas, but

\alpha = 1/i \hbar

for the free particle.

In both cases there’s an ambiguity in the answer! The reason is that to do this integral, we need to pick a measure D q. The obvious guess is Lebesgue measure

dq = dq_1 \cdots dq_n

on \mathbb{R}^n. But this can’t be right, on physical grounds!

The reason is that the partition function Z needs to be dimensionless, but d q has units. To correct this, we need to divide dq by some dimensionful quantity to get D q.

In the case of the ideal gas, this dimensionful quantity involves the ‘thermal de Broglie wavelength’ of the particles in the gas. And this brings Planck’s constant into the game, even though we’re not doing quantum mechanics: we’re studying the statistical mechanics of a classical ideal gas!
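To get a feel for the size of this quantity, here is a minimal Python sketch. It assumes the standard textbook form of the thermal de Broglie wavelength, \Lambda = h / \sqrt{2 \pi m k T}, and the textbook single-particle partition function Z_1 = V / \Lambda^3 for a particle in a box of volume V. The function names and the helium-in-a-liter-box inputs are my own illustrative choices, not something taken from the paper.

```python
import math

# CODATA exact SI values
PLANCK_H = 6.62607015e-34      # J s
BOLTZMANN_K = 1.380649e-23     # J / K

def thermal_de_broglie_wavelength(mass_kg, temperature_k):
    """Lambda = h / sqrt(2 pi m k T), in meters."""
    return PLANCK_H / math.sqrt(
        2 * math.pi * mass_kg * BOLTZMANN_K * temperature_k
    )

def single_particle_partition_function(volume_m3, mass_kg, temperature_k):
    """Z_1 = V / Lambda^3: a volume divided by a volume, hence dimensionless."""
    lam = thermal_de_broglie_wavelength(mass_kg, temperature_k)
    return volume_m3 / lam**3

# Illustrative inputs (assumptions, not from the post): one helium-4 atom
# at room temperature in a one-liter box.
HELIUM_MASS = 6.646e-27        # kg, approximate
lam = thermal_de_broglie_wavelength(HELIUM_MASS, 300.0)
z1 = single_particle_partition_function(1e-3, HELIUM_MASS, 300.0)
print(f"Lambda = {lam:.3e} m, Z_1 = {z1:.3e}")
```

The point is only that dividing the volume by \Lambda^3 leaves a pure number, and that the number depends on Planck’s constant through \Lambda.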

That’s weird and interesting. It’s not the only place where we see that classical statistical mechanics is incomplete or inconsistent, and we need to introduce some ideas from quantum physics to get sensible answers. The most famous one is the ultraviolet catastrophe. What are all the rest?

In the case of the free particle, we need to divide by a quantity with dimensions of length^n to make

dq = dq_1 \cdots dq_n

dimensionless, since each dq_i has dimensions of length. The easiest way is to introduce a length scale \Delta x and divide each dq_i by that. This is commonly done when people study the free particle. This length scale drops out of the final answer for the questions people usually care about… but not for the quantropy.

Similarly, Planck’s constant drops out of the final answer for some questions about the classical ideal gas, but not for its entropy!
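Here is a rough Python sketch of how a scale can drop out of one answer but not another. It is a toy model, not the calculation in the paper: I assume S(q) = (q_1^2 + \cdots + q_n^2)/2, take D q = dq / (\Delta x)^n, use the Gaussian formula Z = (2\pi/\alpha)^{n/2} / (\Delta x)^n (extended to imaginary \alpha by analytic continuation, which is an assumption), and take \ln Z + \alpha \langle S \rangle as the entropy-like quantity, by analogy with the usual formula for entropy. All function names are hypothetical.

```python
import cmath

def log_partition_function(alpha, n, delta_x):
    """ln Z for S(q) = (q_1^2 + ... + q_n^2) / 2 with Dq = dq / delta_x**n.

    Uses Z = (2 pi / alpha)^(n/2) / delta_x^n, which holds for Re(alpha) > 0;
    for imaginary alpha the same formula is assumed (principal branch).
    """
    return (n / 2) * cmath.log(2 * cmath.pi / alpha) - n * cmath.log(delta_x)

def expected_action(alpha, n, delta_x, step=1e-6):
    """<S> = -d(ln Z)/d(alpha), estimated by a central finite difference."""
    up = log_partition_function(alpha + step, n, delta_x)
    down = log_partition_function(alpha - step, n, delta_x)
    return -(up - down) / (2 * step)

def entropy_like(alpha, n, delta_x):
    """ln Z + alpha <S>: entropy for real alpha, quantropy-like for imaginary."""
    return (log_partition_function(alpha, n, delta_x)
            + alpha * expected_action(alpha, n, delta_x))

HBAR = 1.0                 # units with hbar = 1 (assumption)
alpha = 1 / (1j * HBAR)    # the free-particle value of alpha
n = 3                      # number of time steps in the toy model

for delta_x in (1.0, 2.0):
    print(delta_x,
          expected_action(alpha, n, delta_x),
          entropy_like(alpha, n, delta_x))
```

In this toy model the expected action is the same for both values of \Delta x, since \Delta x only adds a constant to \ln Z and the derivative kills it. The entropy-like quantity, on the other hand, shifts by -n \ln 2 when \Delta x is doubled.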

So there’s an interesting question here, about what this new length scale \Delta x means, if anything. One might argue that quantropy is a bad idea, and the need for this new length scale to make it unambiguous is just proof of that. However, the mathematical analogy to quantum mechanics is so precise that I think it’s worth going a bit further out on this limb, and thinking a bit more about what’s going on.

Some weird sort of déjà vu phenomenon seems to be going on. Once upon a time, people tried to calculate the partition functions of classical systems. They discovered they were infinite or ambiguous until they introduced Planck’s constant, and eventually quantum mechanics. Then Feynman introduced the path integral approach to quantum mechanics. In this approach one is again computing partition functions, but now with a new meaning, and with complex rather than real exponentials. But these partition functions are again infinite or ambiguous… for very similar mathematical reasons! And at least in some cases, we can remove the ambiguity using the same trick as before: introducing a new constant. But then… what?

Are we stuck in an infinite loop here? What, if anything, is the meaning of this ‘second Planck’s constant’? Does this have anything to do with second quantization? (I don’t see how, but I can’t resist asking.)

