## Quantropy (Part 4)

11 November, 2013

There’s a new paper on the arXiv:

• John Baez and Blake Pollard, Quantropy.

Blake is a physics grad student at U. C. Riverside who plans to do his thesis with me.

If you have carefully read all my previous posts on quantropy (Part 1, Part 2 and Part 3), there’s only a little new stuff here. But still, it’s better organized, and less chatty.

And in fact, Blake came up with a lot of new stuff for this paper! He studied the quantropy of the harmonic oscillator, and tweaked the analogy between statistical mechanics and quantum mechanics in an interesting way. Unfortunately, we needed to put a version of this paper on the arXiv by a deadline, and our writeup of this new work wasn’t quite ready (my fault). So, we’ll put that other stuff in a new version—or, I’m thinking now, a separate paper.

But here are two new things.

First, putting this paper on the arXiv had the usual good effect of revealing some existing work on the same topic. Joakim Munkhammar emailed me and pointed out this paper, which is free online:

• Joakim Munkhammar, Canonical relational quantum mechanics from information theory, Electronic Journal of Theoretical Physics 8 (2011), 93–108.

You’ll see it cites Garrett Lisi’s paper and pushes forward in various directions. There seems to be a typo where he writes the path integral

$Z = \displaystyle{ \int e^{-\alpha S(q) } D q}$

and says

In order to fit the purpose Lisi concluded that the Lagrange multiplier value $\alpha \equiv 1/i \hbar.$ In similarity with Lisi’s approach we shall also assume that the arbitrary scaling-part of the constant $\alpha$ is in fact $1/\hbar.$

I’m pretty sure he means $1/i\hbar,$ given what he writes later. However, he speaks of ‘maximizing entropy’, which is not quite right for a complex-valued quantity; Blake and I prefer to give this new quantity a new name, and speak of ‘finding a stationary point of quantropy’.

But in a way these are small issues; being a mathematician, I’m more quick to spot tiny technical defects than to absorb significant new ideas, and it will take a while to really understand Munkhammar’s paper.

Second, while writing our paper, Blake and I noticed another similarity between the partition function of a classical ideal gas and the partition function of a quantum free particle. Both are given by an integral like this:

$Z = \displaystyle{\int e^{-\alpha S(q) } D q }$

where $S$ is a quadratic function of $q \in \mathbb{R}^n.$ Here $n$ is the number of degrees of freedom for the particles in the ideal gas, or the number of time steps for a free particle on a line (where we are discretizing time). The only big difference is that

$\alpha = 1/kT$

for the ideal gas, but

$\alpha = 1/i \hbar$

for the free particle.

In both cases there’s an ambiguity in the answer! The reason is that to do this integral, we need to pick a measure $D q.$ The obvious guess is Lebesgue measure

$dq = dq_1 \cdots dq_n$

on $\mathbb{R}^n.$ But this can’t be right, on physical grounds!

The reason is that the partition function $Z$ needs to be dimensionless, but $d q$ has units. To correct this, we need to divide $dq$ by some dimensionful quantity to get $D q.$

In the case of the ideal gas, this dimensionful quantity involves the ‘thermal de Broglie wavelength’ of the particles in the gas. And this brings Planck’s constant into the game, even though we’re not doing quantum mechanics: we’re studying the statistical mechanics of a classical ideal gas!

That’s weird and interesting. It’s not the only place where we see that classical statistical mechanics is incomplete or inconsistent, and we need to introduce some ideas from quantum physics to get sensible answers. The most famous one is the ultraviolet catastrophe. What are all rest?

In the case of the free particle, we need to divide by a quantity with dimensions of lengthn to make

$dq = dq_1 \cdots dq_n$

dimensionless, since each $dq_i$ has dimensions of length. The easiest way is to introduce a length scale $\Delta x$ and divide each $dq_i$ by that. This is commonly done when people study the free particle. This length scale drops out of the final answer for the questions people usually care about… but not for the quantropy.

Similarly, Planck’s constant drops out of the final answer for some questions about the classical ideal gas, but not for its entropy!

So there’s an interesting question here, about what this new length scale $\Delta x$ means, if anything. One might argue that quantropy is a bad idea, and the need for this new length scale to make it unambiguous is just proof of that. However, the mathematical analogy to quantum mechanics is so precise that I think it’s worth going a bit further out on this limb, and thinking a bit more about what’s going on.

Some weird sort of déjà vu phenomenon seems to be going on. Once upon a time, people tried to calculate the partition functions of classical systems. They discovered they were infinite or ambiguous until they introduced Planck’s constant, and eventually quantum mechanics. Then Feynman introduced the path integral approach to quantum mechanics. In this approach one is again computing partition functions, but now with a new meaning, and with complex rather than real exponentials. But these partition functions are again infinite or ambiguous… for very similar mathematical reasons! And at least in some cases, we can remove the ambiguity using the same trick as before: introducing a new constant. But then… what?

Are we stuck in an infinite loop here? What, if anything, is the meaning of this ‘second Planck’s constant’? Does this have anything to do with second quantization? (I don’t see how, but I can’t resist asking.)

## The Elitzur–Vaidman Bomb-Testing Method

24 August, 2013

Quantum mechanics forces us to refine our attitude to counterfactual conditionals: questions about what would have happened if we had done something, even though we didn’t.

“What would the position of the particle be if I’d measured that… when actually I measured its momentum?” Here you’ll usually get no definite answer.

But sometimes you can use quantum mechanics to find out what would have happened if you’d done something… when classically it seems impossible!

Suppose you have a bunch of bombs. Some have a sensor that will absorb a photon you shine on it, and make the bomb explode! Others have a broken sensor, that won’t interact with the photon at all.

Can you choose some working bombs? You can tell if a bomb works by shining a photon on it. But if it works, it blows up—and then it doesn’t work anymore!

So, it sounds impossible. But with quantum mechanics you can do it. You can find some bombs that would have exploded if you had shone photons at them!

Here’s how:

Put a light that emits a single photon at A. Have the photon hit the half-silvered mirror at lower left, so it has a 50% chance of going through to the right, and a 50% chance of reflecting and going up. But in quantum mechanics, it sort of does both!

Put a bomb at B. Recombine the photon’s paths using two more mirrors. Have the two paths meet at a second half-silvered mirror at upper right. You can make it so that if the bomb doesn’t work, the photon interferes with itself and definitely goes to C, not D.

But if the bomb works, it absorbs the photon and explodes unless the photon takes the top route… in which case, when it hits the second half-silvered mirror, it has a 50% chance of going to C and a 50% chance of going to D.

So:

• If the bomb doesn’t work, the photon has a 100% chance of going to C.

• If the bomb works, there’s a 50% chance that it absorbs the photon and explodes. There’s also a 50% chance that the bomb does not explode—and then the photon is equally likely to go to either C or D. So, the photon has a 25% chance of reaching C and a 25% chance of reaching D.

So: if you see a photon at D, you know you have a working bomb… but the bomb has not exploded!

For each working bomb there’s:

• a 50% chance that it explodes,
• a 25% chance that it doesn’t explode but you can’t tell if it works,
• a 25% chance that it doesn’t explode but you can tell that it works.

This is the Elitzur–Vaidman bomb-testing method. It was invented by Avshalom Elitzur and Lev Vaidman in 1993. One year later, physicists actually did an experiment to show this idea works… but alas, not using actual bombs!

In 1996, Kwiat showed that using more clever methods, you can reduce the percentage of wasted working bombs as close to zero as you like. And pushing the idea even further, Graeme Mitchison and Richard Jozsa showed in 1999 that you can get a quantum computer to do a calculation for you without even turning it on!

This sounds amazing, but it’s really no more amazing than the bomb-testing method I’ve already described.

### References

• A. Elitzur and L. Vaidman, Quantum mechanical interaction-free measurements, Found. Phys. 23 (1993), 987–997.

• Paul G. Kwiat, H. Weinfurter, T. Herzog, A. Zeilinger, and M. Kasevich, Experimental realization of “interaction-free” measurements.

• Paul G. Kwiat, Interaction-free measurements.

• Graeme Mitchison and Richard Jozsa, Counterfactual computation, Proc. Roy. Soc. Lond. A457 (2001), 1175–1194.

The picture is from the Wikipedia article, which also has other references:

Elitzur–Vaidman bomb tester, Wikipedia.

Bas Spitters pointed out this category-theoretic analysis of the issue:

• Robert Furber and Bart Jacobs, Towards a categorical account of conditional probability.

## Quantum Network Theory (Part 2)

13 August, 2013

guest post by Tomi Johnson

Last time I told you how a random walk called the ‘uniform escape walk’ could be used to analyze a network. In particular, Google uses it to rank nodes. For the case of an undirected network, the steady state of this random walk tells us the degrees of the nodes—that is, how many edges come out of each node.

Now I’m going to prove this to you. I’ll also exploit the connection between this random walk and a quantum walk, also introduced last time. In particular, I’ll connect the properties of this quantum walk to the degrees of a network by exploiting its relationship with the random walk.

This is pretty useful, considering how tricky these quantum walks can be. As the parts of the world that we model using quantum mechanics get bigger and have more complicated structures, like biological network, we need all the help in understanding quantum walks that we can get. So I’d better start!

### Flashback

Starting with any (simple, connected) graph, we can get an old-fashioned ‘stochastic’ random walk on this graph, but also a quantum walk. The first is the uniform escape stochastic walk, where the walker has an equal probability per time of walking along any edge leaving the node they are standing at. The second is the related quantum walk we’re going to study now. These two walks are generated by two matrices, which we called $S$ and $Q.$ The good thing is that these matrices are similar, in the technical sense.

We studied this last time, and everything we learned is summarized here:

where:

$G$ is a simple graph that specifies

$A$ the adjacency matrix (the generator of a quantum walk) with elements $A_{i j}$ equal to unity if nodes $i$ and $j$ are connected, and zero otherwise ($A_{i i} = 0$), which subtracted from

$D$ the diagonal matrix of degrees $D_{i i} = \sum_j A_{i j}$ gives

$L = D - A$ the symmetric Laplacian (generator of stochastic and quantum walks), which when normalized by $D$ returns both

$S = L D^{-1}$ the generator of the uniform escape stochastic walk and

$Q = D^{-1/2} L D^{-1/2}$ the quantum walk generator to which it is similar!

Now I hope you remember where we are. Next I’ll talk you through the mathematics of the uniform escape stochastic walk $S$ and how it connects to the degrees of the nodes in the large-time limit. Then I’ll show you how this helps us solve aspects of the quantum walk generated by $Q.$

### Stochastic walk

The uniform escape stochastic walk generated by $S$ is popular because it has a really useful stationary state.

To recap from Part 20 of the network theory series, a stationary state of a stochastic walk is one that does not change in time. By the master equation

$\displaystyle{ \frac{d}{d t} \psi(t) = -S \psi(t)}$

the stationary state must be an eigenvector of $S$ with eigenvalue $0.$

A fantastic pair of theorems hold:

• There is always a unique (up to multiplication by a positive number) positive eigenvector $\pi$ of $S$ with eigenvalue $0.$ That is, there is a unique stationary state $\pi.$

• Regardless of the initial state $\psi(0),$ any solution of the master equation approaches this stationary state $\pi$ in the large-time limit:

$\displaystyle{ \lim_{t \rightarrow \infty} \psi(t) = \pi }$

To find this unique stationary state, consider the Laplacian $L,$ which is both infinitesimal stochastic and symmetric. Among other things, this means the rows of $L$ sum to zero:

$\displaystyle{ \sum_j L_{i j} = 0 }$

Thus, the ‘all ones’ vector $\mathbf{1}$ is an eigenvector of $L$ with zero eigenvalue:

$L \mathbf{1} = 0$

Inserting the identity $I = D^{-1} D$ into this equation we then find $D \mathbf{1}$ is a zero eigenvector of $S$:

$L \mathbf{1} = ( L D^{-1} ) ( D \mathbf{1} ) = S ( D \mathbf{1} ) = 0$

Therefore we just need to normalize this to get the large-time stationary state of the walk:

$\displaystyle{ \pi = \frac{D \mathbf{1}}{\sum_i D_{i i}} }$

If we write $i$ for the basis vector that is zero except at the ith node of our graph, and 1 at that node, the inner product $\langle i , \pi \rangle$ is large-time probability of finding a walker at that node. The equation above implies this is proportional to the degree $D_{i i}$ of node $i.$

We can check this for the following graph:

We find that $\pi$ is

$\displaystyle{ \left( \begin{matrix} 1/6 \\ 1/6 \\ 1/4 \\ 1/4 \\ 1/6 \end{matrix} \right) }$

which implies large-time probability $1/6$ for nodes $1,$ $2$ and $5,$ and $1/4$ for nodes $3$ and $4.$ Comparing this to the original graph, this exactly reflects the arrangement of degrees, as we knew it must.

Math works!

### The quantum walk

Next up is the quantum walk generated by $Q.$ Not a lot is known about quantum walks on networks of arbitrary geometry, but below we’ll see some analytical results are obtained by exploiting the similarity of $S$ and $Q.$

Where to start? Well, let’s start at the bottom, what quantum physicists call the ground state. In contrast to stochastic walks, for a quantum walk every eigenvector $\phi_k$ of $Q$ is a stationary state of the quantum walk. (In Puzzle 5, at the bottom of this page, I ask you to prove this). The stationary state $\phi_0$ is of particular interest physically and mathematically. Physically, since eigenvectors of the $Q$ correspond to states of well-defined energy equal to the associated eigenvalue, $\phi_0$ is the state of lowest energy, energy zero, hence the name ‘ground state’. (In Puzzle 3, I ask you to prove that all eigenvalues of $Q$ are non-negative, so zero really does correspond to the ground state.)

Mathematically, the relationship between eigenvectors implied by the similarity of $S$ and $Q$ means

$\phi_0 \propto D^{-1/2} \pi \propto D^{1/2} \mathbf{1}$

So in the ground state, the probability of our quantum walker being found at node $i$ is

$| \langle i , \phi_0 \rangle |^2 \propto | \langle i , D^{1/2} \rangle \mathbf{1} |^2 = D_{i i}$

Amazingly, this probability is proportional to the degree and so is exactly the same as $\langle i , \pi \rangle,$ the probability in the stationary state $\pi$ of the stochastic walk!

In short: a zero energy quantum walk $Q$ leads to exactly the same distribution of the walker over the nodes as in the large-time limit of the uniform escape stochastic walk $S.$ The classically important notion of degree distribution also plays a role in quantum walks!

This is already pretty exciting. What else can we say? If you are someone who feels faint at the sight of quantum mechanics, well done for getting this far, but watch out for what’s coming next.

What if the walker starts in some other initial state? Is there some quantum walk analogue of the unique large-time state of a stochastic walk?

In fact, the quantum walk in general does not converge to a stationary state. But there is a probability distribution that can be thought to characterize the quantum walk in the same way as the large-time state characterizes the stochastic walk. It’s the large-time average probability vector $P.$

If you didn’t know the time that had passed since the beginning of a quantum walk, then the best estimate for the probability of your measuring the walker to be at node $i$ would be the large-time average probability

$\displaystyle{ \langle i , P \rangle = \lim_{T \rightarrow \infty} \frac{1}{T} \int_0^T | \psi_i (t) |^2 d t }$

There’s a bit that we can do to simplify this expression. As usual in quantum mechanics, let’s start with the trick of diagonalizing $Q.$ This amounts to writing

$\displaystyle{ Q= \sum_k \epsilon_k \Phi_k }$

where $\Phi_k$ are projectors onto the eigenvectors $\phi_k$ of $Q,$ and $\epsilon_k$ are the corresponding eigenvalues of $Q.$ If we insert this equation into

$\psi(t) = e^{-Q t} \psi(0)$

we get

$\displaystyle{ \psi(t) = \sum_k e^{-\epsilon_k t} \Phi_k \psi(0) }$

and thus

$\displaystyle{ \langle i , P \rangle = \lim_{T \rightarrow \infty} \frac{1}{T} \int_0^T | \sum_k e^{-i \epsilon_k t} \langle i, \Phi_k \psi (0) \rangle |^2 d t }$

Due to the integral over all time, the interference between terms corresponding to different eigenvalues averages to zero, leaving:

$\displaystyle{ \langle i , P \rangle = \sum_k | \langle i, \Phi_k \psi(0) \rangle |^2 }$

The large-time average probability is then the sum of terms contributed by the projections of the initial state onto each eigenspace.

So we have a distribution that characterizes a quantum walk for a general initial state, but it’s a complicated beast. What can we say about it?

Our best hope of understanding the large-time average probability is through the term $| \langle i, \Phi_0 \psi (0) \rangle |^2$ associated with the zero energy eigenspace, since we know everything about this space.

For example, we know the zero energy eigenspace is one-dimensional and spanned by the eigenvector $\phi_0.$ This means that the projector is just the usual outer product

$\Phi_0 = | \phi_0 \rangle \langle \phi_0 | = \phi_0 \phi_0^\dagger$

where we have normalized $\phi_0$ according to the inner product $\langle \phi_0, \phi_0\rangle = 1.$ (If you’re wondering why I’m using all these angled brackets, well, they’re a notation named after Dirac that is adored by quantum physicists.)

The zero eigenspace contribution to the large-time average probability then breaks nicely into two:

$\begin{array}{ccl} | \langle i, \Phi_0 \psi (0) \rangle |^2 &=& | \langle i, \phi_0\rangle \; \langle \phi_0, \psi (0) \rangle |^2 \\ \\ &=& | \langle i, \phi_0\rangle |^2 \; | \langle \phi_0 , \psi (0) \rangle |^2 \\ \\ &=& \langle i , \pi \rangle \; | \langle \phi_0 , \psi (0) \rangle |^2 \end{array}$

This is just the product of two probabilities:

• first, the probability $\langle i , \pi \rangle$ for a quantum state in the zero energy eigenspace to be at node $i,$ as we found above,

and

• second, the probability $| \langle \phi_0, \psi (0)\rangle |^2$ of being in this eigenspace to begin with. (Remember, in quantum mechanics the probability of measuring the system to have an energy is the modulus squared of the projection of the state onto the associated eigenspace, which for the one-dimensional zero energy eigenspace means just the inner product with the ground state.)

This is all we need to say something interesting about the large-time average probability for all states. We’ve basically shown that we can break the large-time probability vector $P$ into a sum of two normalized probability vectors:

$P = (1- \eta) \pi + \eta \Omega$

the first $\pi$ being the stochastic stationary state associated with the zero energy eigenspace, and the second $\Omega$ associated with the higher energy eigenspaces, with

$\displaystyle{ \langle i , \Omega \rangle = \frac{ \sum_{k\neq 0} | \langle i, \Phi_k \psi (0) \rangle |^2 }{ \eta} }$

The weight of each term is governed by the parameter

$\eta = 1 - | \langle \phi_0, \psi (0)\rangle |^2$

which you could think of as the quantumness of the result. This is one minus the probability of the walker being in the zero energy eigenspace, or equivalently the probability of the walker being outside the zero energy eigenspace.

So even if we don’t know $\Omega,$ we know its importance is controlled by a parameter $\eta$ that governs how close the large-time average distribution $P$ of the quantum walk is to the corresponding stochastic stationary distribution $\pi.$

What do we mean by ‘close’? Find out for yourself:

Puzzle 1. Show, using a triangle inequality, that the trace distance between the two characteristic stochastic and quantum distributions $\{ \langle i , P \rangle \}_i$ and $\{ \langle i , \pi \rangle \}_i$ is upper-bounded by $2 \eta.$

Can we say anything physical about when the quantumness $\eta$ is big or small?

Because the eigenvalues of $Q$ have a physical interpretation in terms of energy, the answer is yes. The quantumness $\eta$ is the probability of being outside the zero energy state. Call the next lowest eigenvalue $\Delta = \min_{k \neq 0} \epsilon_k$ the energy gap. If the quantum walk is not in the zero energy eigenspace then it must be in an eigenspace of energy greater or equal to $\Delta.$ Therefore the expected energy $E$ of the quantum walker must bound the quantumness $E \ge \eta \Delta.$

This tells us that a quantum walk with a low energy is similar to a stochastic walk in the large-time limit. We already knew this was exactly true in the zero energy limit, but this result goes further.

So little is known about quantum walks on networks of arbitrary geometry that we were very pleased to find this result. It says there is a special case in which the walk is characterized by the degree distribution of the network, and a clear physical parameter that bounds how far the walk is from this special case.

Also, in finding it we learned that the difficulties of the initial state dependence, enhanced by the lack of convergence to a stationary state, could be overcome for a quantum walk, and that the relationships between quantum and stochastic walks extend beyond those with shared generators.

### What next?

That’s all for the latest bit of idea sharing at the interface between stochastic and quantum systems.

Other questions we have include: What holds analytically about the form of the quantum correction? Numerically it is known that the so-called quantum correction $\Omega$ tends to enhance the probability of being found on nodes of low degree compared to $\pi.$ Can someone explain why? What happens if a small amount of stochastic noise is added to a quantum walk? Or a lot of noise?

It’s difficult to know who is best placed to answer these questions: experts in quantum physics, graph theory, complex networks or stochastic processes? I suspect it’ll take a bit of help from everyone.

A couple of textbooks with comprehensive sections on non-negative matrices and continuous-time stochastic processes are:

• Peter Lancaster and Miron Tismenetsky, The Theory of Matrices: with Applications, 2nd edition, Academic Press, San Diego, 1985.

• James R. Norris, Markov Chains, Cambridge University Press, Cambridge, 1997.

There is, of course, the book that arose from the Azimuth network theory series, which considers several relationships between quantum and stochastic processes on networks:

• John Baez and Jacob Biamonte, A Course on Quantum Techniques for Stochastic Mechanics, 2012.

Another couple of books on complex networks are:

• Mark Newman, Networks: An Introduction, Oxford University Press, Oxford, 2010.

• Ernesto Estrada, The Structure of Complex Networks: Theory and Applications, Oxford University Press, Oxford, 2011. Note that the first chapter is available free online.

There are plenty more useful references in our article on this topic:

• Mauro Faccin, Tomi Johnson, Jacob Biamonte, Sabre Kais and Piotr Migdał, Degree distribution in quantum walks on complex networks.

### Puzzles for the enthusiastic

Sadly I didn’t have space to show proofs of all the theorems I used. So here are a few puzzles that guide you to doing the proofs for yourself:

#### Stochastic walks and stationary states

Puzzle 2. (For the hard core.) Prove there is always a unique positive eigenvector for a stochastic walk generated by $S.$ You’ll need the assumption that the graph $G$ is connected. It’s not simple, and you’ll probably need help from a book, perhaps one of those above by Lancaster and Tismenetsky, and Norris.

Puzzle 3. Show that the eigenvalues of $S$ (and therefore $Q$) are non-negative. A good way to start this proof is to apply the Perron-Frobenius theorem to the non-negative matrix $M = - S + I \max_i S_{i i}.$ This implies that $M$ has a positive eigenvalue $r$ equal to its spectral radius

$r = \max_k | \lambda_k |$

where $\lambda_k$ are the eigenvalues of $M,$ and the associated eigenvector $v$ is positive. Since $S = - M + I \max_i S_{i i},$ it follows that $S$ shares the eigenvectors of $M$ and the associated eigenvalues are related by inverted translation:

$\epsilon_k = - \lambda_k + \max_i S_{i i}$

Puzzle 4. Prove that regardless of the initial state $\psi(0),$ the zero eigenvector $\pi$ is obtained in the large-time limit $\lim_{t \rightarrow \infty} \psi(t) = \pi$ of the walk generated by $S.$ This breaks down into two parts:

(a) Using the approach from Puzzle 5, to show that $S v = \epsilon_v v,$ the positivity of $v$ and the infinitesimal stochastic property $\sum_i S_{i j} = 0$ imply that $\epsilon_v = \epsilon_0 = 0$ and thus $v = \pi$ is actually the unique zero eigenvector and stationary state of $S$ (its uniqueness follows from puzzle 4, you don’t need to re-prove it).

(b) By inserting the decomposition $S = \sum_k \epsilon_k \Pi_k$ into $e^{-S t}$ and using the result of puzzle 5, complete the proof.

(Though I ask you to use the diagonalizability of $S,$ the main results still hold if the generator is irreducible but not diagonalizable.)

#### Quantum walks

Here are a couple of extra puzzles for those of you interested in quantum mechanics:

Puzzle 5. In quantum mechanics, probabilities are given by the moduli squared of amplitudes, so multiplying a state by a number of modulus unity has no physical effect. By inserting

$\displaystyle{ Q= \sum_k \epsilon_k \Phi_k }$

into the quantum time evolution matrix $e^{-Q t},$ show that if

$\psi(0) = \phi_k$

then

$\psi(t) = e^{ - i \epsilon_k t} \psi(0)$

hence $\phi_k$ is a stationary state in the quantum sense, as probabilities don’t change in time.

Puzzle 6. By expanding the initial state $\psi(0)$ in terms of the complete orthogonal basis vectors $\phi_k$ show that for a quantum walk $\psi(t)$ never converges to a stationary state unless it began in one.

## Quantum Network Theory (Part 1)

5 August, 2013

guest post by Tomi Johnson

If you were to randomly click a hyperlink on this web page and keep doing so on each page that followed, where would you end up?

As an esteemed user of Azimuth, I’d like to think you browse more intelligently, but the above is the question Google asks when deciding how to rank the world’s web pages.

Recently, together with the team (Mauro Faccin, Jacob Biamonte and Piotr Migdał) at the ISI Foundation in Turin, we attended a workshop in which several of the attendees were asking a similar question with a twist. “What if you, the web surfer, behaved quantum mechanically?”

Now don’t panic! I have no reason to think you might enter a superposition of locations or tunnel through a wall. This merely forms part of a recent drive towards understanding the role that network science can play in quantum physics.

As we’ll find, playing with quantum networks is fun. It could also become a necessity. The size of natural systems in which quantum effects have been identified has grown steadily over the past few years. For example, attention has recently turned to explaining the remarkable efficiency of light-harvesting complexes, comprising tens of molecules and thousands of atoms, using quantum mechanics. If this expansion continues, perhaps quantum physicists will have to embrace the concepts of complex networks.

To begin studying quantum complex networks, we found a revealing toy model. Let me tell you about it. Like all good stories, it has a beginning, a middle and an end. In this part, I’ll tell you the beginning and the middle. I’ll introduce the stochastic walk describing the randomly clicking web surfer mentioned above and a corresponding quantum walk. In part 2 the story ends with the bounding of the difference between the two walks in terms of the energy of the walker.

But for now I’ll start by introducing you to a graph, this time representing the internet!

If this taster gets you interested, there are more details available here:

• Mauro Faccin, Tomi Johnson, Jacob Biamonte, Sabre Kais and Piotr Migdał, Degree distribution in quantum walks on complex networks, arXiv:1305.6078 (2013).

### What does the internet look like from above?

As we all know, the idea of the internet is to connect computers to each other. What do these connections look like when abstracted as a network, with each computer a node and each connection an edge?

The internet on a local scale, such as in your house or office, might look something like this:

with several devices connected to a central hub. Each hub connects to other hubs, and so the internet on a slightly larger scale might look something like this:

What about the full global, not local, structure of the internet? To answer this question, researchers have developed representations of the whole internet, such as this one:

While such representations might be awe inspiring, how can we make any sense of them? Or are they merely excellent desktop wallpapers and new-age artworks?

In terms of complex network theory, there’s actually a lot that can be said that is not immediately obvious from the above representation.

For example, we find something very interesting if we plot the number of web pages with different incoming links (called degree) on a log-log axis. What is found for the African web is the following:

This shows that very few pages are linked to by a very large number others, while a very large number of pages receive very few links. More precisely, what this shows is a power law distribution, the signature of which is a straight line on a log-log axis.

In fact, power law distributions arise in a diverse number of real world networks, human-built networks such as the internet and naturally occurring networks. It is often discussed alongside the concept of the preferential attachment; highly connected nodes seem to accumulate connections more quickly. We all know of a successful blog whose success had led to an increased presence and more success. That’s an example of preferential attachment.

It’s clear then that degree is an important concept in network theory, and its distribution across the nodes a useful characteristic of a network. Degree gives one indication of how important a node is in a network.

And this is where stochastic walks come in. Google, who are in the business of ranking the importance of nodes (web pages) in a network (the web), use (up to a small modification) the idealized model of a stochastic walker (web surfer) who randomly hops to connected nodes (follows one of the links on a page). This is called the uniform escape model, since the total rate of leaving any node is set to be the same for all nodes. Leaving the walker to wander for a long while, Google then takes the probability of the walker being on a node to rank the importance of that node. In the case that the network is undirected (all links are reciprocated) this long-time probability, and therefore the rank of the node, is proportional to the degree of the node.

So node degrees and the uniform escape model play an important role in the fields of complex networks and stochastic walks. But can they tell us anything about the much more poorly understood topics of quantum networks and quantum walks? In fact, yes, and demonstrating that to you is the purpose of this pair of articles.

Before we move on to the interesting bit, the math, it’s worth just listing a few properties of quantum walks that make them hard to analyze, and explaining why they are poorly understood. These are the difficulties we will show how to overcome below.

No convergence. In a stochastic walk, if you leave the walker to wander for a long time, eventually the probability of finding a walker at a node converges to a constant value. In a quantum walk, this doesn’t happen, so the walk can’t be characterized so easily by its long-time properties.

Dependence on initial states. In some stochastic walks the long-time properties of the walk are independent of the initial state. It is possible to characterize the stochastic walk without referring to the initialization of the walker. Such a characterization is not so easy in quantum walks, since their evolution always depends on the initialization of the walker. Is it even possible then to say something useful that applies to all initializations?

Stochastic and quantum generators differ. Those of you familiar with the network theory series know that some generators produce both stochastic and quantum walks (see part 16 for more details). However, most stochastic walk generators, including that for the uniform escape model, do not generate quantum walks and vice versa. How do we then compare stochastic and quantum walks when their generators differ?

With the task outlined, let’s get started!

### Graphs and walks

In the next couple of sections I’m going to explain the diagram below to you. If you’ve been following the network theory series, in particular part 20, you’ll find parts of it familiar. But as it’s been a while since the last post covering this topic, let’s start with the basics.

A simple graph $G$ can be used to define both stochastic and quantum walks. A simple graph is something like this:

where there is at most one edge between any two nodes, there are no edges from a node to itself and all edges are undirected. To avoid complications, let’s stick to simple graphs with a finite number $n$ of nodes. Let’s also assume you can get from every node to every other node via some combination of edges i.e. the graph is connected.

In the particular example above the graph represents a network of $n = 5$ nodes, where nodes 3 and 4 have degree (number of edges) 3, and nodes 1, 2 and 5 have degree 2.

Every simple graph defines a matrix $A,$ called the adjacency matrix. For a network with $n$ nodes, this matrix is of size $n \times n,$ and each element $A_{i j}$ is unity if there is an edge between nodes $i$ and $j$, and zero otherwise (let’s use this basis for the rest of this post). For the graph drawn above the adjacency matrix is

$\left( \begin{matrix} 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 \end{matrix} \right)$

By construction, every adjacency matrix is symmetric:

$A =A^T$

(the $T$ means the transposition of the elements in the node basis) and further, because each $A$ is real, it is self-adjoint:

$A=A^\dagger$

(the $\dagger$ means conjugate transpose).

This is nice, since (as seen in parts 16 and 20) a self-adjoint matrix generates a continuous-time quantum walk.

To recap from the series, a quantum walk is an evolution arising from a quantum walker moving on a network.

A state of a quantum walk is represented by a size $n$ complex column vector $\psi$. Each element $\langle i , \psi \rangle$ of this vector is the so-called amplitude associated with node $i$ and the probability of the walker being found on that node (if measured) is the modulus of the amplitude squared $|\langle i , \psi \rangle|^2.$ Here $i$ is the standard basis vector with a single non-zero $i$th entry equal to unity, and $\langle u , v \rangle = u^\dagger v$ is the usual inner product.

A quantum walk evolves in time according to the Schrödinger equation

$\displaystyle{ \frac{d}{d t} \psi(t)= - i H \psi(t) }$

where $H$ is called the Hamiltonian. If the initial state is $\psi(0)$ then the solution is written as

$\psi(t) = \exp(- i t H) \psi(0)$

The probabilities $| \langle i , \psi (t) \rangle |^2$ are guaranteed to be correctly normalized when the Hamiltonian $H$ is self-adjoint.

There are other matrices that are defined by the graph. Perhaps the most familiar is the Laplacian, which has recently been a topic on this blog (see parts 15, 16 and 20 of the series, and this recent post).

The Laplacian $L$ is the $n \times n$ matrix

$L = D - A$

where the degree matrix $D$ is an $n \times n$ diagonal matrix with elements given by the degrees

$\displaystyle{ D_{i i}=\sum_{j} A_{i j} }$

For the graph drawn above, the degree matrix and Laplacian are:

$\left( \begin{matrix} 2 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 3 & 0 & 0 \\ 0 & 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0 & 2 \end{matrix} \right) \qquad \mathrm{and} \qquad \left( \begin{matrix} 2 & -1 & 0 & -1 & 0 \\ -1 & 2 & -1 & 0 & 0 \\ 0 & -1 & 3 & -1 & -1 \\ -1 & 0 & -1 & 3 & -1 \\ 0 & 0 & -1 & -1 & 2 \end{matrix} \right)$

The Laplacian is self-adjoint and generates a quantum walk.

The Laplacian has another property; it is infinitesimal stochastic. This means that its off diagonal elements are non-positive and its columns sum to zero. This is interesting because an infinitesimal stochastic matrix generates a continuous-time stochastic walk.

To recap from the series, a stochastic walk is an evolution arising from a stochastic walker moving on a network.

A state of a stochastic walk is represented by a size $n$ non-negative column vector $\psi$. Each element $\langle i , \psi \rangle$ of this vector is the probability of the walker being found on node $i.$

A stochastic walk evolves in time according to the master equation

$\displaystyle{ \frac{d}{d t} \psi(t)= - H \psi(t) }$

where $H$ is called the stochastic Hamiltonian. If the initial state is $\psi(0)$ then the solution is written

$\psi(t) = \exp(- t H) \psi(0)$

The probabilities $\langle i , \psi (t) \rangle$ are guaranteed to be non-negative and correctly normalized when the stochastic Hamiltonian $H$ is infinitesimal stochastic.

So far, I have just presented what has been covered on Azimuth previously. However, to analyze the important uniform escape model we need to go beyond the class of (Dirichlet) generators that produce both quantum and stochastic walks. Further, we have to somehow find a related quantum walk. We’ll see below that both tasks are achieved by considering the normalized Laplacians: one generating the uniform escape stochastic walk and the other a related quantum walk.

### Normalized Laplacians

The two normalized Laplacians are:

• the asymmetric normalized Laplacian $S = L D^{-1}$ (that generates the uniform escape Stochastic walk) and

• the symmetric normalized Laplacian $Q = D^{-1/2} L D^{-1/2}$ (that generates a Quantum walk).

For the graph drawn above the asymmetric normalized Laplacian $S$ is

$\left( \begin{matrix} 1 & -1/2 & 0 & -1/3 & 0 \\ -1/2 & 1 & -1/3 & 0 & 0 \\ 0 & -1/2 & 1 & -1/3 & -1/2 \\ -1/2 & 0 & -1/3 & 1 & -1/2 \\ 0 & 0 & -1/3 & -1/3 & 1 \end{matrix} \right)$

The identical diagonal elements indicates that the total rates of leaving each node are identical, and the equality within each column of the other non-zero elements indicates that the walker is equally likely to hop to any node connected to its current node. This is the uniform escape model!

For the same graph the symmetric normalized Laplacian $Q$ is

$\left( \begin{matrix} 1 & -1/2 & 0 & -1/\sqrt{6} & 0 \\ -1/2 & 1 & -1/\sqrt{6} & 0 & 0 \\ 0 & -1/\sqrt{6} & 1 & -1/3 & -1/\sqrt{6} \\ -1/\sqrt{6} & 0 & -1/3 & 1 & -1/\sqrt{6} \\ 0 & 0 & -1/\sqrt{6} & -1/\sqrt{6} & 1 \end{matrix} \right)$

That the diagonal elements are identical in the quantum case indicates that all nodes are of equal energy, this is type of quantum walk usually considered.

Puzzle 1. Show that in general $S$ is infinitesimal stochastic but not self-adjoint.

Puzzle 2. Show that in general $Q$ is self-adjoint but not infinitesimal stochastic.

So a graph defines two matrices: one $S$ that generates a stochastic walk, and one $Q$ that generates a quantum walk. The natural question to ask is whether these walks are related. The answer is that they are!

Underpinning this relationship is the mathematical property that $S$ and $Q$ are similar. They are related by the following similarity transformation

$S = D^{1/2} Q D^{-1/2}$

which means that any eigenvector $\phi_k$ of $Q$ associated to eigenvalue $\epsilon_k$ gives a vector

$\pi_k \propto D^{1/2} \phi_k$

that is an eigenvector of $S$ with the same eigenvalue! To show this, insert the identity $I = D^{-1/2} D^{1/2}$ into

$Q \phi_k = \epsilon_k \phi_k$

and multiply from the left with $D^{1/2}$ to obtain

\begin{aligned} (D^{1/2} Q D^{-1/2} ) (D^{1/2} \phi_k) &= \epsilon_k ( D^{1/2} \phi_k ) \\ S \pi_k &= \epsilon_k \pi_k \end{aligned}

The same works in the opposite direction. Any eigenvector $\pi_k$ of $S$ gives an eigenvector

$\phi_k \propto D^{-1/2} \pi_k$

of $Q$ with the same eigenvalue $\epsilon_k.$

The mathematics is particularly nice because $Q$ is self-adjoint. A self-adjoint matrix is diagonalizable, and has real eigenvalues and orthogonal eigenvectors.

As a result, the symmetric normalized Laplacian can be decomposed as

$Q = \sum_k \epsilon_k \Phi_k$

where $\epsilon_k$ is real and $\Phi_k$ are orthogonal projectors. Each $\Phi_k$ acts as the identity only on vectors in the space spanned by $\phi_k$ and as zero on all others, such that

$\Phi_k \Phi_\ell = \delta_{k \ell} \Phi_k.$

Multiplying from the left by $D^{1/2}$ and the right by $D^{-1/2}$ results in a similar decomposition for $S$:

$S = \sum_k \epsilon_k \Pi_k$

with orthogonal projectors

$\Pi_k = D^{1/2} \Phi_k D^{-1/2}$

I promised above that I would explain the following diagram:

Let’s summarize what it represents now:

$G$ is a simple graph that specifies

$A$ the adjacency matrix (generator of a quantum walk), which subtracted from

$D$ the diagonal matrix of the degrees gives

$L$ the symmetric Laplacian (generator of stochastic and quantum walks), which when normalized by $D$ returns both

$S$ the generator of the uniform escape stochastic walk and

$Q$ the quantum walk generator to which it is similar!

### What next?

Sadly, this is where we’ll finish for now.

We have all the ingredients necessary to study the walks generated by the normalized Laplacians and exploit the relationship between them.

Next time, in part 2, I’ll talk you through the mathematics of the uniform escape stochastic walk $S$ and how it connects to the degrees of the nodes in the long-time limit. Then I’ll show you how this helps us solve aspects of the quantum walk generated by $Q.$

### In other news

Before I leave you, let me tell you about a workshop the ISI team recently attended (in fact helped organize) at the Institute of Quantum Computing, on the topic of quantum computation and complex networks. Needless to say, there were talks on papers related to quantum mechanics and networks!

Some researchers at the workshop gave exciting talks based on numerical examinations of what happens if a quantum walk is used instead of a stochastic walk to rank the nodes of a network:

• Giuseppe Davide Paparo and Miguel Angel Martín-Delgado, Google in a quantum network, Sci. Rep. 2 (2012), 444.

• Eduardo Sánchez-Burillo, Jordi Duch, Jesús Gómez-Gardenes and David Zueco, Quantum navigation and ranking in complex networks, Sci. Rep. 2 (2012), 605.

Others attending the workshop have numerically examined what happens when using quantum computers to represent the stationary state of a stochastic process:

• Silvano Garnerone, Paolo Zanardi and Daniel A. Lidar, Adiabatic quantum algorithm for search engine ranking, Phys. Rev. Lett. 108 (2012), 230506.

It was a fun workshop and we plan to organize/attend more in the future!

## Coherence for Solutions of the Master Equation

10 July, 2013

guest post by Arjun Jain

I am a master’s student in the physics department of the Indian Institute of Technology Roorkee. I’m originally from Delhi. Since some time now, I’ve been wanting to go into Mathematical Physics. I hope to do a PhD in that. Apart from maths and physics, I am also quite passionate about art and music.

Right now I am visiting John Baez at the Centre for Quantum Technologies, and we’re working on chemical reaction networks. This post can be considered as an annotation to the last paragraph of John’s paper, Quantum Techniques for Reaction Networks, where he raises the question of when a solution to the master equation that starts as a coherent state will remain coherent for all times. Remember, the ‘master equation’ describes the random evolution of collections of classical particles, and a ‘coherent state’ is one where the probability distribution of particles of each type is a Poisson distribution.

If you’ve been following the network theory series on this blog, you’ll know these concepts, and you’ll know the Anderson-Craciun-Kurtz theorem gives many examples of coherent states that remain coherent. However, all these are equilibrium solutions of the master equation: they don’t change with time. Moreover they are complex balanced equilibria: the rate at which any complex is produced equals the rate at which it is consumed.

There are also non-equilibrium examples where coherent states remain coherent. But they seem rather rare, and I would like to explain why. So, I will give a necessary condition for it to happen. I’ll give the proof first, and then discuss some simple examples. We will see that while the condition is necessary, it is not sufficient.

First, recall the setup. If you’ve been following the network theory series, you can skip the next section.

### Reaction networks

Definition. A reaction network consists of:

• a finite set $S$ of species,

• a finite set $K$ of complexes, where a complex is a finite sum of species, or in other words, an element of $\mathbb{N}^S,$

• a graph with $K$ as its set of vertices and some set $T$ of edges.

You should have in mind something like this:

where our set of species is $S = \{A,B,C,D,E\},$ the complexes are things like $A + E,$ and the arrows are the elements of $T,$ called transitions or reactions. So, we have functions

$s , t : T \to K$

saying the source and target of each transition.

Next:

Definition. A stochastic reaction network is a reaction network together with a function $r: T \to (0,\infty)$ assigning a rate constant to each reaction.

From this we can write down the master equation, which describes how a stochastic state evolves in time:

$\displaystyle{ \frac{d}{dt} \Psi(t) = H \Psi(t) }$

Here $\Psi(t)$ is a vector in the stochastic Fock space, which is the space of formal power series in a bunch of variables, one for each species, and $H$ is an operator on this space, called the Hamiltonian.

From now on I’ll number the species with numbers from $1$ to $k,$ so

$S = \{1, \dots, k\}$

Then the stochastic Fock space consists of real formal power series in variables that I’ll call $z_1, \dots, z_k.$ We can write any of these power series as

$\displaystyle{\Psi = \sum_{\ell \in \mathbb{N}^k} \psi_\ell z^\ell }$

where

$z^\ell = z_1^{\ell_1} \cdots z_k^{\ell_k}$

We have annihilation and creation operators on the stochastic Fock space:

$\displaystyle{ a_i \Psi = \frac{\partial}{\partial z_i} \Psi }$

$\displaystyle{ a_i^\dagger \Psi = z_i \Psi }$

and the Hamiltonian is built from these as follows:

$\displaystyle{ H = \sum_{\tau \in T} r(\tau) \, ({a^\dagger}^{t(\tau)} - {a^\dagger}^{s(\tau)}) \, a^{s(\tau)} }$

John explained this here (using slightly different notation), so I won’t go into much detail now, but I’ll say what all the symbols mean. Remember that the source of a transition $\tau$ is a complex, or list of natural numbers:

$s(\tau) = (s_1(\tau), \dots, s_k(\tau))$

So, the power $a^{s(\tau)}$ is really an abbreviation for a big product of annihilation operators, like this:

$\displaystyle{ a^{s(\tau)} = a_1^{s_1(\tau)} \cdots a_k^{s_k(\tau)} }$

This describes the annihilation of all the inputs to the transition $\tau.$ Similarly, we define

$\displaystyle{ {a^\dagger}^{s(\tau)} = {a_1^\dagger}^{s_1(\tau)} \cdots {a_k^\dagger}^{s_k(\tau)} }$

and

$\displaystyle{ {a^\dagger}^{t(\tau)} = {a_1^\dagger}^{t_1(\tau)} \cdots {a_k^\dagger}^{t_k(\tau)} }$

### The result

Here’s the result:

Theorem. If a solution $\Psi(t)$ of the master equation is a coherent state for all times $t \ge 0,$ then $\Psi(0)$ must be complex balanced except for complexes of degree 0 or 1.

This requires some explanation.

First, saying that $\Psi(t)$ is a coherent state means that it is an eigenvector of all the annihilation operators. Concretely this means

$\Psi (t) = \displaystyle{\frac{e^{c(t) \cdot z}}{e^{c_1(t) + \cdots + c_k(t)}}}$

where

$c(t) = (c_1(t), \dots, c_k(t)) \in [0,\infty)^k$

and

$z = (z_1, \dots, z_k)$

It will be helpful to write

$\mathbf{1}= (1,1,1,...)$

so we can write

$\Psi (t) = \displaystyle{ e^{c(t) \cdot (z - \mathbf{1})} }$

Second, we say that a complex has degree $d$ if it is a sum of exactly $d$ species. For example, in this reaction network:

the complexes $A + C$ and $B + E$ have degree 2, while the rest have degree 1. We use the word ‘degree’ because each complex $\ell$ gives a monomial

$z^\ell = z_1^{\ell_1} \cdots z_k^{\ell_k}$

and the degree of the complex is the degree of this monomial, namely

$\ell_1 + \cdots + \ell_k$

Third and finally, we say a solution $\Psi(t)$ of the master equation is complex balanced for a specific complex $\ell$ if the total rate at which that complex is produced equals the total rate at which it’s destroyed.

Now we are ready to prove the theorem:

Proof. Consider the master equation

$\displaystyle { \frac{d \Psi (t)}{d t} = H \psi (t) }$

Assume that $\Psi(t)$ is a coherent state for all $t \ge 0.$ This means

$\Psi (t) = \displaystyle{ e^{c(t) \cdot (z - \mathbf{1})} }$

For convenience, we write $c(t)$ simply as $c,$ and similarly for the components $c_i$. Then we have

$\displaystyle{ \frac{d\Psi(t)}{dt} = (\dot{c} \cdot (z - \mathbf{1})) \, e^{c \cdot (z - \mathbf{1})} }$

On the other hand, the master equation gives

$\begin{array}{ccl} \displaystyle {\frac{d\Psi(t)}{dt}} &=& \displaystyle{ \sum_{\tau \in T} r(\tau) \, ({a^\dagger}^{t(\tau)} - {a^\dagger}^{s(\tau)}) \, a^{s(\tau)} e^{c \cdot (z - \mathbf{1})} } \\ \\ &=& \displaystyle{\sum_{\tau \in T} c^{t(\tau)} r(\tau) \, ({z}^{t(\tau)} - {z}^{s(\tau)}) e^{c \cdot (z - \mathbf{1})} } \end{array}$

So,

$\displaystyle{ (\dot{c} \cdot (z - \mathbf{1})) \, e^{c \cdot (z - \mathbf{1})} =\sum_{\tau \in T} c^{t(\tau)} r(\tau) \, ({z}^{t(\tau)} - {z}^{s(\tau)}) e^{c \cdot (z - \mathbf{1})} }$

As a result, we get

$\displaystyle{ \dot{c}\cdot z -\dot{c}\cdot\mathbf{1} = \sum_{\tau \in T} c^{s(\tau)} r(\tau) \, ({z}^{t(\tau)} - {z}^{s(\tau)}) }.$

Comparing the coefficients of all $z^\ell,$ we obtain the following. For $\ell = 0,$ which is the only complex of degree zero, we get

$\displaystyle { \sum_{\tau: t(\tau)=0} r(\tau) c^{s(\tau)} - \sum_{\tau\;:\; s(\tau)= 0} r(\tau) c^{s(\tau)} = -\dot{c}\cdot\mathbf{1} }$

For the complexes $\ell$ of degree one, we get these equations:

$\displaystyle { \sum_{\tau\;:\; t(\tau)=(1,0,0,\dots)} r(\tau) c^{s(\tau)} - \sum_{\tau \;:\;s(\tau)=(1,0,0,\dots)} r(\tau) c^{s(\tau)}= \dot{c_1} }$

$\displaystyle { \sum_{\tau\; :\; t(\tau)=(0,1,0,\dots)} r(\tau) c^{s(\tau)} - \sum_{\tau\;:\; s(\tau)=(0,1,0,\dots)} r(\tau) c^{s(\tau)} = \dot{c_2} }$

and so on. For all the remaining complexes $\ell$ we have

$\displaystyle { \sum_{\tau\;:\; t(\tau)=\ell} r(\tau) c^{s(\tau)} = \sum_{\tau \;:\; s(\tau)=\ell} r(\tau) c^{s(\tau)} }.$

This says that the total rate at which this complex is produced equals the total rate at which it’s destroyed. So, our solution of the master equation is complex balanced for all complexes $\ell$ of degree greater than one. This is our necessary condition.                                                                                   █

To illustrate the theorem, I’ll consider three simple examples. The third example shows that the condition in the theorem, though necessary, is not sufficient. Note that our proof also gives a necessary and sufficient condition for a coherent state to remain coherent: namely, that all the equations we listed hold, not just initially but for all times. But this condition seems a bit complicated.

### Introducing amoebae into a Petri dish

Suppose that there is an inexhaustible supply of amoebae, randomly floating around in a huge pond. Each time an amoeba comes into our collection area, we catch it and add it to the population of amoebae in the Petri dish. Suppose that the rate constant for this process is 3.

So, the Hamiltonian is $3(a^\dagger -1).$ If we start with a coherent state, say

$\displaystyle { \Psi(0)=\frac{e^{cz}}{e^c} }$

then

$\displaystyle { \Psi(t) = e^{3(a^\dagger -1)t} \; \frac{e^{cz}}{e^c} = \frac{e^{(c+3t)z}}{e^{c+3t}} }$

which is coherent at all times.

We can see that the condition of the theorem is satisfied, as all the complexes in the reaction network have degree 0 or 1.

### Amoebae reproducing and competing

This example shows a Petri dish with one species, amoebae, and two transitions: fission and competition. We suppose that the rate constant for fission is 2, while that for competition is 1. The Hamiltonian is then

$H= 2({a^\dagger}^2-a^\dagger)a + (a^\dagger-{a^\dagger}^2)a^2$

If we start off with the coherent state

$\displaystyle{\Psi(0) = \frac{e^{2z}}{e^2}}$

we find that

$\displaystyle {\Psi(t)=e^{2(z^2-z)2+(z-z^2)4} \; \Psi(0)}=\Psi(0)$

which is coherent. It should be noted that the chosen initial state

$\displaystyle{ \frac{e^{2z}}{e^2}}$

was a complex balanced equilibrium solution. So, the Anderson–Craciun–Kurtz Theorem applies to this case.

### Amoebae reproducing, competing, and being introduced

This is a combination of the previous two examples, where apart from ongoing reproduction and competition, amoebae are being introduced into the dish with a rate constant 3.

As in the above examples, we might think that coherent states could remain coherent forever here too. Let’s check that.

Assuming that this was true, if

$\displaystyle{\Psi(t) = \frac{e^{c(t)z}}{e^{c(t)}} }$

then $c(t)$ would have to satisfy the following:

$\dot{c}(t) = c(t)^2 + 3 -2c(t)$

and

$c(t)^2=2c(t)$

Using the second equation, we get

$\dot{c}(t) = 3 \Rightarrow c = 3t+ c_0$

But this is certainly not a solution of the second equation. So, here we find that initially coherent states do not remain remain coherent for all times.

However, if we choose

$\displaystyle{\Psi(0) = \frac{e^{2z}}{e^2}}$

then this coherent state is complex balanced except for complexes of degree 1, since it was in the previous example, and the only new feature of this example, at time zero, is that single amoebas are being introduced—and these are complexes of degree 1. So, the condition of the theorem does hold.

So, the condition in the theorem is necessary but not sufficient. However, it is easy to check, and we can use it to show that in many cases, coherent states must cease to be coherent.

## The Large-Number Limit for Reaction Networks (Part 2)

6 July, 2013

I’ve been talking a lot about ‘stochastic mechanics’, which is like quantum mechanics but with probabilities replacing amplitudes. In Part 1 of this mini-series I started telling you about the ‘large-number limit’ in stochastic mechanics. It turns out this is mathematically analogous to the ‘classical limit’ of quantum mechanics, where Planck’s constant $\hbar$ goes to zero.

There’s a lot more I need to say about this, and lots more I need to figure out. But here’s one rather easy thing.

In quantum mechanics, ‘coherent states’ are a special class of quantum states that are very easy to calculate with. In a certain precise sense they are the best quantum approximations to classical states. This makes them good tools for studying the classical limit of quantum mechanics. As $\hbar \to 0,$ they reduce to classical states where, for example, a particle has a definite position and momentum.

We can borrow this strategy to study the large-number limit of stochastic mechanics. We’ve run into coherent states before in our discussions here. Now let’s see how they work in the large-number limit!

### Coherent states

For starters, let’s recall what coherent states are. We’ve got $k$ different kinds of particles, and we call each kind a species. We describe the probability that we have some number of particles of each kind using a ‘stochastic state’. For starters, this is a formal power series in variables $z_1, \dots, z_k.$ We write it as

$\displaystyle{\Psi = \sum_{\ell \in \mathbb{N}^k} \psi_\ell z^\ell }$

where $z^\ell$ is an abbreviation for

$z_1^{\ell_1} \cdots z_k^{\ell_k}$

But for $\Psi$ to be a stochastic state the numbers $\psi_\ell$ need to be probabilities, so we require that

$\psi_\ell \ge 0$

and

$\displaystyle{ \sum_{\ell \in \mathbb{N}^k} \psi_\ell = 1}$

Sums of coefficients like this show up so often that it’s good to have an abbreviation for them:

$\displaystyle{ \langle \Psi \rangle = \sum_{\ell \in \mathbb{N}^k} \psi_\ell}$

Now, a coherent state is a stochastic state where the numbers of particles of each species are independent random variables, and the number of the $i$th species is distributed according to a Poisson distribution.

Since we can pick ithe means of these Poisson distributions to be whatever we want, we get a coherent state $\Psi_c$ for each list of numbers $c \in [0,\infty)^k:$

$\displaystyle{ \Psi_c = \frac{e^{c \cdot z}}{e^c} }$

Here I’m using another abbreviation:

$e^{c} = e^{c_1 + \cdots + c_k}$

If you calculate a bit, you’ll see

$\displaystyle{ \Psi_c = e^{-(c_1 + \cdots + c_k)} \, \sum_{n \in \mathbb{N}^k} \frac{c_1^{n_1} \cdots c_k^{n_k}} {n_1! \, \cdots \, n_k! } \, z_1^{n_1} \cdots z_k^{n_k} }$

Thus, the probability of having $n_i$ things of the $i$th species is equal to

$\displaystyle{ e^{-c_i} \, \frac{c_i^{n_i}}{n_i!} }$

This is precisely the definition of a Poisson distribution with mean equal to $c_i.$

What are the main properties of coherent states? For starters, they are indeed states:

$\langle \Psi_c \rangle = 1$

More interestingly, they are eigenvectors of the annihilation operators

$a_i = \displaystyle{ \frac{\partial}{\partial z_i} }$

since when you differentiate an exponential you get back an exponential:

$\begin{array}{ccl} a_i \Psi_c &=& \displaystyle{ \frac{\partial}{\partial z_i} \frac{e^{c \cdot z}}{e^c} } \\ \\ &=& c_i \Psi_c \end{array}$

We can use this fact to check that in this coherent state, the mean number of particles of the $i$th species really is $c_i.$ For this, we introduce the number operator

$N_i = a_i^\dagger a_i$

where $a_i^\dagger$ is the creation operator:

$(a_i^\dagger \Psi)(z) = z_i \Psi(z)$

The number operator has the property that

$\langle N_i \Psi \rangle$

is the mean number of particles of the $i$th species. If we calculate this for our coherent state $\Psi_c,$ we get

$\begin{array}{ccl} \langle a_i^\dagger a_i \Psi_c \rangle &=& c_i \langle a_i^\dagger \Psi_c \rangle \\ \\ &=& c_i \langle \Psi_c \rangle \\ \\ &=& c_i \end{array}$

Here in the second step we used the general rule

$\langle a_i^\dagger \Phi \rangle = \langle \Phi \rangle$

which is easy to check.

### Rescaling

Now let’s see how coherent states work in the large-numbers limit. For this, let’s use the rescaled annihilation, creation and number operators from Part 1. They look like this:

$A_i = \hbar \, a_i$

$C_i = a_i^\dagger$

$\widetilde{N}_i = C_i A_i$

Since

$\widetilde{N}_i = \hbar N_i$

the point is that the rescaled number operator counts particles not one at a time, but in bunches of size $1/\hbar.$ For example, if $\hbar$ is the reciprocal of Avogadro’s number, we are counting particles in ‘moles’. So, $\hbar \to 0$ corresponds to a large-number limit.

To flesh out this idea some more, let’s define rescaled coherent states:

$\widetilde{\Psi}_c = \Psi_{c/\hbar}$

These are eigenvectors of the rescaled annihilation operators:

$\begin{array}{ccl} A_i \widetilde{\Psi}_c &=& \hbar a_i \Psi_{c/\hbar} \\ \\ &=& c_i \Psi_{c/\hbar} \\ \\ &=& c_i \widetilde{\Psi}_c \end{array}$

This in turn means that

$\begin{array}{ccl} \langle \widetilde{N}_i \widetilde{\Psi}_c \rangle &=& \langle C_i A_i \widetilde{\Psi}_c \rangle \\ \\ &=& c_i \langle C_i \widetilde{\Psi}_c \rangle \\ \\ &=& c_i \langle \widetilde{\Psi}_c \rangle \\ \\ &=& c_i \end{array}$

Here we used the general rule

$\langle C_i \Phi \rangle = \langle \Phi \rangle$

which holds because the ‘rescaled’ creation operator $C_i$ is really just the usual creation operator, which obeys this rule.

What’s the point of all this fiddling around? Simply this. The equation

$\langle \widetilde{N}_i \widetilde{\Psi}_c \rangle = c_i$

says the expected number of particles of the $i$th species in the state $\widetilde{\Psi}_c$ is $c_i,$ if we count these particles not one at a time, but in bunches of size $1/\hbar.$

### A simple test

As a simple test of this idea, let’s check that as $\hbar \to 0,$ the standard deviation of the number of particles in the state $\Psi_c$ goes to zero… where we count particle using the rescaled number operator.

The variance of the rescaled number operator is, by definition,

$\langle \widetilde{N}_i^2 \widetilde{\Psi}_c \rangle - \langle \widetilde{N}_i^2 \widetilde{\Psi}_c \rangle^2$

and the standard deviation is the square root of the variance.

We already know the mean of the rescaled number operator:

$\langle \widetilde{N}_i \widetilde{\Psi}_c \rangle = c_i$

So, the main thing we need to calculate is the mean of its square:

$\langle \widetilde{N}_i^2 \widetilde{\Psi}_c \rangle$

For this we will use the commutation relation derived last time:

$[A_i , C_i] = \hbar$

This implies

$\begin{array}{ccl} \widetilde{N}_i^2 &=& C_i A_i C_i A_i \\ \\ &=& C_i (C_i A_i + \hbar) A_i \\ \\ &=& C_i^2 A_i^2 + \hbar C_i A_i \end{array}$

so

$\begin{array}{ccl} \langle \widetilde{N}_i^2\widetilde{\Psi}_c \rangle &=& \langle (C_i^2 A_i^2 + \hbar C_i A_i) \Psi_c \rangle \\ \\ &=& c_i^2 + \hbar c_i \end{array}$

where we used our friends

$A_i \Psi_c = c_i \Psi_c$

and

$\langle C_i \Phi \rangle = \langle \Phi \rangle$

So, the variance of the rescaled number of particles is

$\begin{array}{ccl} \langle \widetilde{N}_i^2 \widetilde{\Psi}_c \rangle - \langle \widetilde{N}_i \widetilde{\Psi}_c \rangle^2 &=& c_i^2 + \hbar c_i - c_i^2 \\ \\ &=& \hbar c_i \end{array}$

and the standard deviation is

$(\hbar c_i)^{1/2}$

Good, it goes to zero as $\hbar \to 0!$ And the square root is just what you’d expect if you’ve thought about stuff like random walks or the central limit theorem.

### A puzzle

I feel sure that in any coherent state, not only the variance but also all the higher moments of the rescaled number operators go to zero as $\hbar \to 0.$ Can you prove this?

Here I mean the moments after the mean has been subtracted. The $p$th moment is then

$\langle (\widetilde{N}_i - c_i)^p \; \widetilde{\Psi}_c \rangle$

I want this to go to zero as $\hbar \to 0.$

Here’s a clue that should help. First, there’s a textbook formula for the higher moments of Poisson distributions without the mean subtracted. If I understand it correctly, it gives this:

$\displaystyle{ \langle N_i^m \; \Psi_c \rangle = \sum_{j = 1}^m {c_i}^j \; \left\{ \begin{array}{c} m \\ j \end{array} \right\} }$

Here

$\displaystyle{ \left\{ \begin{array}{c} m \\ j \end{array} \right\} }$

is the number of ways to partition an $m$-element set into $j$ nonempty subsets. This is called Stirling’s number of the second kind. This suggests that there’s some fascinating combinatorics involving coherent states. That’s exactly the kind of thing I enjoy, so I would like to understand this formula someday… but not today! I just want something to go to zero!

If I rescale the above formula, I seem to get

$\begin{array}{ccl} \langle \widetilde{N}_i^m \; \widetilde{\Psi}_c \rangle &=& \hbar^m \langle N_i^m \Psi_{c/\hbar} \rangle \\ \\ &=& \hbar^m \; \displaystyle{ \sum_{j = 1}^m \left(\frac{c_i}{\hbar}\right)^j \left\{ \begin{array}{c} m \\ j \end{array} \right\} } \end{array}$

We could plug this formula into

$\langle (\widetilde{N}_i - c_i)^p \; \widetilde{\Psi}_c \rangle = \displaystyle{ \sum_{m = 0}^p \, \binom{m}{p} \; \langle \widetilde{N}_i^m \; \widetilde{\Psi}_c \rangle \, (-c_i)^{p - m} }$

and then try to show the result goes to zero as $\hbar \to 0.$ But I don’t have the energy to do that… not right now, anyway!

Maybe you do. Or maybe you can think of a better approach to solving this problem. The answer must be well-known, since the large-number limit of a Poisson distribution is a very important thing.

## The Large-Number Limit for Reaction Networks (Part 1)

1 July, 2013

Waiting for the other shoe to drop.

This is a figure of speech that means ‘waiting for the inevitable consequence of what’s come so far’. Do you know where it comes from? You have to imagine yourself in an apartment on the floor below someone who is taking off their shoes. When you hear one, you know the next is coming.

A guest who checked into an inn one night was warned to be quiet because the guest in the room next to his was a light sleeper. As he undressed for bed, he dropped one shoe, which, sure enough, awakened the other guest. He managed to get the other shoe off in silence, and got into bed. An hour later, he heard a pounding on the wall and a shout: “When are you going to drop the other shoe?”

When we were working on math together, James Dolan liked to say “the other shoe has dropped” whenever an inevitable consequence of some previous realization became clear. There’s also the mostly British phrase the penny has dropped. You say this when someone finally realizes the situation they’re in.

But sometimes one realization comes after another, in a long sequence. Then it feels like it’s raining shoes!

I guess that’s a rather strained metaphor. Perhaps falling like dominoes is better for these long chains of realizations.

This is how I’ve felt in my recent research on the interplay between quantum mechanics, stochastic mechanics, statistical mechanics and extremal principles like the principle of least action. The basics of these subjects should be completely figured out by now, but they aren’t—and a lot of what’s known, nobody bothered to tell most of us.

So, I was surprised to rediscover that the Maxwell relations in thermodynamics are formally identical to Hamilton’s equations in classical mechanics… though in retrospect it’s obvious. Thermodynamics obeys the principle of maximum entropy, while classical mechanics obeys the principle of least action. Wherever there’s an extremal principle, symplectic geometry, and equations like Hamilton’s equations, are sure to follow.

I was surprised to discover (or maybe rediscover, I’m not sure yet) that just as statistical mechanics is governed by the principle of maximum entropy, quantum mechanics is governed by a principle of maximum ‘quantropy’. The analogy between statistical mechanics and quantum mechanics has been known at least since Feynman and Schwinger. But this basic aspect was never explained to me!

I was also surprised to rediscover that simply by replacing amplitudes by probabilities in the formalism of quantum field theory, we get a nice formalism for studying stochastic many-body systems. This formalism happens to perfectly match the ‘stochastic Petri nets’ and ‘reaction networks’ already used in subjects from population biology to epidemiology to chemistry. But now we can systematically borrow tools from quantum field theory! All the tricks that particle physicists like—annihilation and creation operators, coherent states and so on—can be applied to problems like the battle between the AIDS virus and human white blood cells.

And, perhaps because I’m a bit slow on the uptake, I was surprised when yet another shoe came crashing to the floor the other day.

Because quantum field theory has, at least formally, a nice limit where Planck’s constant goes to zero, the same is true for for stochastic Petri nets and reaction networks!

In quantum field theory, we call this the ‘classical limit’. For example, if you have a really huge number of photons all in the same state, quantum effects sometimes become negligible, and we can describe them using the classical equations describing electromagnetism: the classical Maxwell equations. In stochastic situations, it makes more sense to call this limit the ‘large-number limit’: the main point is that there are lots of particles in each state.

In quantum mechanics, different observables don’t commute, so the so-called commutator matters a lot:

$[A,B] = AB - BA$

These commutators tend to be proportional to Planck’s constant. So in the limit where Planck’s constant $\hbar$ goes to zero, observables commute… but commutators continue to have a ghostly existence, in the form of Poisson bracket:

$\displaystyle{ \{A,B\} = \lim_{\hbar \to 0} \; \frac{1}{\hbar} [A,B] }$

Poisson brackets are a key part of symplectic geometry—the geometry of classical mechanics. So, this sort of geometry naturally shows up in the study of stochastic Petri nets!

Let me sketch how it works. I’ll start with a section reviewing stuff you should already know if you’ve been following the network theory series.

### The stochastic Fock space

Suppose we have some finite set $S$. We call its elements species, since we think of them as different kinds of things—e.g., kinds of chemicals, or kinds of organisms.

To describe the probability of having any number of things of each kind, we need the stochastic Fock space. This is the space of real formal power series in a bunch of variables, one for each element of $S.$ It won’t hurt to simply say

$S = \{1, \dots, k \}$

Then the stochastic Fock space is

$\mathbb{R}[[z_1, \dots, z_k ]]$

this being math jargon for the space of formal power series with real coefficients in some variables $z_1, \dots, z_k,$ one for each element of $S.$

We write

$n = (n_1, \dots, n_k) \in \mathbb{N}^S$

and use this abbreviation:

$z^n = z_1^{n_1} \cdots z_k^{n_k}$

We use $z^n$ to describe a state where we have $n_1$ things of the first species, $n_2$ of the second species, and so on.

More generally, a stochastic state is an element $\Psi$ of the stochastic Fock space with

$\displaystyle{ \Psi = \sum_{n \in \mathbb{N}^k} \psi_n \, z^n }$

where

$\psi_n \ge 0$

and

$\displaystyle{ \sum_{n \in \mathbb{N}^k} \psi_n = 1 }$

We use $\Psi$ to describe a state where $\psi_n$ is the probability of having $n_1$ things of the first species, $n_2$ of the second species, and so on.

The stochastic Fock space has some important operators on it: the annihilation operators given by

$\displaystyle{ a_i \Psi = \frac{\partial}{\partial z_i} \Psi }$

and the creation operators given by

$\displaystyle{ a_i^\dagger \Psi = z_i \Psi }$

From these we can define the number operators:

$N_i = a_i^\dagger a_i$

Part of the point is that

$N_i z^n = n_i z^n$

This says the stochastic state $z^n$ is an eigenstate of all the number operators, with eigenvalues saying how many things there are of each species.

The annihilation, creation, and number operators obey some famous commutation relations, which are easy to check for yourself:

$[a_i, a_j] = 0$

$[a_i^\dagger, a_j^\dagger] = 0$

$[a_i, a_j^\dagger] = \delta_{i j}$

$[N_i, N_j ] = 0$

$[N_i , a_j^\dagger] = \delta_{i j} a_j^\dagger$

$[N_i , a_j] = - \delta_{i j} a_j^\dagger$

The last two have easy interpretations. The first of these two implies

$N_i a_i^\dagger \Psi = a_i^\dagger (N_i + 1) \Psi$

This says that if we start in some state $\Psi,$ create a thing of type $i,$ and then count the things of that type, we get one more than if we counted the number of things before creating one. Similarly,

$N_i a_i \Psi = a_i (N_i - 1) \Psi$

says that if we annihilate a thing of type $i$ and then count the things of that type, we get one less than if we counted the number of things before annihilating one.

### Introducing Planck’s constant

Now let’s introduce an extra parameter into this setup. To indicate the connection to quantum physics, I’ll call it $\hbar,$ which is the usual symbol for Planck’s constant. However, I want to emphasize that we’re not doing quantum physics here! We’ll see that the limit where $\hbar \to 0$ is very interesting, but it will correspond to a limit where there are many things of each kind.

We’ll start by defining

$A_i = \hbar \, a_i$

and

$C_i = a_i^\dagger$

Here $A$ stands for ‘annihilate’ and $C$ stands for ‘create’. Think of $A$ as a rescaled annihilation operator. Using this we can define a rescaled number operator:

$\widetilde{N}_i = C_i A_i$

So, we have

$\widetilde{N}_i = \hbar N_i$

and this explains the meaning of the parameter $\hbar.$ The idea is that instead of counting things one at time, we count them in bunches of size $1/\hbar.$

For example, suppose $\hbar = 1/12.$ Then we’re counting things in dozens! If we have a state $\Psi$ with

$N_i \Psi = 36 \Psi$

then there are 36 things of the ith kind. But this implies

$\widetilde{N}_i \Psi = 3 \Psi$

so there are 3 dozen things of the ith kind.

Chemists don’t count in dozens; they count things in big bunches called moles. A mole is approximately the number of carbon atoms in 12 grams: Avogadro’s number, 6.02 × 1023. When you count things by moles, you’re taking $\hbar$ to be 1.66 × 10-24, the reciprocal of Avogadro’s number.

So, while in quantum mechanics Planck’s constant is ‘the quantum of action’, a unit of action, here it’s ‘the quantum of quantity’: the amount that corresponds to one thing.

We can easily work out the commutation relations of our new rescaled operators:

$[A_i, A_j] = 0$

$[C_i, C_j] = 0$

$[A_i, C_j] = \hbar \, \delta_{i j}$

$[\widetilde{N}_i, \widetilde{N}_j ] = 0$

$[\widetilde{N}_i , C_j] = \hbar \, \delta_{i j} C_j$

$[\widetilde{N}_i , A_j] = - \hbar \, \delta_{i j} A_j$

These are just what you see in quantum mechanics! The commutators are all proportional to $\hbar.$

Again, we can understand what these relations mean if we think a bit. For example, the commutation relation for $\widetilde{N}_i$ and $C_i$ says

$N_i C_i \Psi = C_i (N_i + \hbar) \Psi$

This says that if we start in some state $\Psi,$ create a thing of type $i,$ and then count the things of that type, we get $\hbar$ more than if we counted the number of things before creating one. This is because we are counting things not one at a time, but in bunches of size $1/\hbar.$

You may be wondering why I defined the rescaled annihilation operator to be $\hbar$ times the original annihilation operator:

$A_i = \hbar \, a_i$

but left the creation operator unchanged:

$C_i = a_i^\dagger$

I’m wondering that too! I’m not sure I’m doing things the best way yet. I’ve also tried another more symmetrical scheme, taking $A_k = \sqrt{\hbar} \, a_k$ and $C_k = \sqrt{\hbar} a_k^\dagger.$ This gives the same commutation relations, but certain other formulas become more unpleasant. I’ll explain that some other day.

Next, we can take the limit as $\hbar \to 0$ and define Poisson brackets of operators by

$\displaystyle{ \{A,B\} = \lim_{\hbar \to 0} \; \frac{1}{\hbar} [A,B] }$

To make this rigorous it’s best to proceed algebraically. For this we treat $\hbar$ as a formal variable rather than a specific number. So, our number system becomes $\mathbb{R}[\hbar],$ the algebra of polynomials in $\hbar$. We define the Weyl algebra to be the algebra over $\mathbb{R}[\hbar]$ generated by elements $A_i$ and $C_i$ obeying

$[A_i, A_j] = 0$

$[C_i, C_j] = 0$

$[A_i, C_j] = \hbar \, \delta_{i j}$

We can set $\hbar = 0$ in this formalism; then the Weyl algebra reduces to the algebra of polynomials in the variables $A_i$ and $C_i.$ This algebra is commutative! But we can define a Poisson bracket on this algebra by

$\displaystyle{ \{A,B\} = \lim_{\hbar \to 0} \; \frac{1}{\hbar} [A,B] }$

It takes a bit of work to explain to algebraists exactly what’s going on in this formula, because it involves an interplay between the algebra of polynomials in $A_i$ and $C_i,$ which is commutative, and the Weyl algebra, which is not. I’ll be glad to explain the details if you want. But if you’re a physicist, you can just follow your nose and figure out what the formula gives. For example:

$\begin{array}{ccl} \{A_i, C_j\} &=& \displaystyle{ \lim_{\hbar \to 0} \; \frac{1}{\hbar} [A_i, C_j] } \\ \\ &=& \displaystyle{ \lim_{\hbar \to 0} \; \frac{1}{\hbar} \, \hbar \, \delta_{i j} } \\ \\ &=& \delta_{i j} \end{array}$

Similarly, we have:

$\{ A_i, A_j \} = 0$

$\{ C_i, C_j \} = 0$

$\{ A_i, C_j \} = \delta_{i j}$

$\{ \widetilde{N}_i, \widetilde{N}_j \} = 0$

$\{ \widetilde{N}_i , C_j \} = \delta_{i j} C_j$

$\{ \widetilde{N}_i , A_j \} = - \delta_{i j} A_j$

I should probably use different symbols for $A_i, C_i$ and $\widetilde{N}_i$ after we’ve set $\hbar = 0,$ since they’re really different now, but I don’t have the patience to make up more names for things!

Now, we can think of $A_i$ and $C_i$ as coordinate functions on a 2k-dimensional vector space, and all the polynomials in $A_i$ and $C_i$ as functions on this space. This space is what physicists would call a ‘phase space’: they use this kind of space to describe the position and momentum of a particle, though here we are using it in a different way. Mathematicians would call it a ‘symplectic vector space’, because it’s equipped with a special structure, called a symplectic structure, that lets us define Poisson brackets of smooth functions on this space. We won’t need to get into that now, but it’s important—and it makes me happy to see it here.

### More

There’s a lot more to do, but not today. My main goal is to understand, in a really elegant way, how the master equation for a stochastic Petri net reduces to the rate equation in the large-number limit. What we’ve done so far is start thinking of this as a $\hbar \to 0$ limit. This should let us borrow ideas about classical limits in quantum mechanics, and apply them to stochastic mechanics.

Stay tuned!

## Energy and the Environment – What Physicists Can Do

25 April, 2013

The Perimeter Institute is a futuristic-looking place where over 250 physicists are thinking about quantum gravity, quantum information theory, cosmology and the like. Since I work on some of these things, I was recently invited to give the weekly colloquium there. But I took the opportunity to try to rally them into action:

Energy and the Environment: What Physicists Can Do. Watch the video or read the slides.

Abstract. The global warming crisis is part of a bigger transformation in which humanity realizes that the Earth is a finite system and that our population, energy usage, and the like cannot continue to grow exponentially. While politics and economics pose the biggest challenges, physicists are in a good position to help make this transition a bit easier. After a quick review of the problems, we discuss a few ways physicists can help.

On the video you can hear me say a lot of stuff that’s not on the slides: it’s more of a coherent story. The advantage of the slides is that anything in blue, you can click on to get more information. So for example, when I say that solar power capacity has been growing annually by 75% in recent years, you can see where I got that number.

I was pleased by the response to this talk. Naturally, it was not a case of physicists saying “okay, tomorrow I’ll quit working on the foundations of quantum mechanics and start trying to improve quantum dot solar cells.” It’s more about getting them to see that huge problems are looming ahead of us… and to see the huge opportunities for physicists who are willing to face these problems head-on, starting now. Work on energy technologies, the smart grid, and ‘ecotechnology’ is going to keep growing. I think a bunch of the younger folks, at least, could see this.

However, perhaps the best immediate outcome of this talk was that Lee Smolin introduced me to Manjana Milkoreit. She’s at the school of international affairs at Waterloo University, practically next door to the Perimeter Institute. She works on “climate change governance, cognition and belief systems, international security, complex systems approaches, especially threshold behavior, and the science-policy interface.”

So, she knows a lot about the all-important human and political side of climate change. Right now she’s interviewing diplomats involved in climate treaty negotiations, trying to see what they believe about climate change. And it’s very interesting!

In my next post, I’ll talk about something she pointed me to. Namely: what we can do to hold the temperature increase to 2 °C or less, given that the pledges made by various nations aren’t enough.

## Network Theory (Part 29)

23 April, 2013

I’m talking about electrical circuits, but I’m interested in them as models of more general physical systems. Last time we started seeing how this works. We developed an analogy between electrical circuits and physical systems made of masses and springs, with friction:

 Electronics Mechanics charge: $Q$ position: $q$ current: $I = \dot{Q}$ velocity: $v = \dot{q}$ flux linkage: $\lambda$ momentum: $p$ voltage: $V = \dot{\lambda}$ force: $F = \dot{p}$ inductance: $L$ mass: $m$ resistance: $R$ damping coefficient: $r$ inverse capacitance: $1/C$ spring constant: $k$

But this is just the first of a large set of analogies. Let me list some, so you can see how wide-ranging they are!

### More analogies

People in system dynamics often use effort as a term to stand for anything analogous to force or voltage, and flow as a general term to stand for anything analogous to velocity or electric current. They call these variables $e$ and $f.$

To me it’s important that force is the time derivative of momentum, and velocity is the time derivative of position. Following physicists, I write momentum as $p$ and position as $q.$ So, I’ll usually write effort as $\dot{p}$ and flow as $\dot{q}$.

Of course, ‘position’ is a term special to mechanics; it’s nice to have a general term for the thing whose time derivative is flow, that applies to any context. People in systems dynamics seem to use displacement as that general term.

It would also be nice to have a general term for the thing whose time derivative is effort… but I don’t know one. So, I’ll use the word momentum.

Now let’s see the analogies! Let’s see how displacement $q$, flow $\dot{q},$ momentum $p$ and effort $\dot{p}$ show up in several subjects:

 displacement:    $q$ flow:      $\dot q$ momentum:      $p$ effort:           $\dot p$ Mechanics: translation position velocity momentum force Mechanics: rotation angle angular velocity angular momentum torque Electronics charge current flux linkage voltage Hydraulics volume flow pressure momentum pressure Thermal Physics entropy entropy flow temperature momentum temperature Chemistry moles molar flow chemical momentum chemical potential

We’d been considering mechanics of systems that move along a line, via translation, but we can also consider mechanics for systems that turn round and round, via rotation. So, there are two rows for mechanics here.

There’s a row for electronics, and then a row for hydraulics, which is closely analogous. In this analogy, a pipe is like a wire. The flow of water plays the role of current. Water pressure plays the role of electrostatic potential. The difference in water pressure between two ends of a pipe is like the voltage across a wire. When water flows through a pipe, the power equals the flow times this pressure difference—just as in an electrical circuit the power is the current times the voltage across the wire.

A resistor is like a narrowed pipe:

An inductor is like a heavy turbine placed inside a pipe: this makes the water tend to keep flowing at the same rate it’s already flowing! In other words, it provides a kind of ‘inertia’ analogous
to mass.

A capacitor is like a tank with pipes coming in from both ends, and a rubber sheet dividing it in two lengthwise:

When studying electrical circuits as a kid, I was shocked when I first learned that capacitors don’t let the electrons through: it didn’t seem likely you could do anything useful with something like that! But of course you can. Similarly, this gizmo doesn’t let the water through.

A voltage source is like a compressor set up to maintain a specified pressure difference between the input and output:

Similarly, a current source is like a pump set up to maintain a specified flow.

Finally, just as voltage is the time derivative of a fairly obscure quantity called ‘flux linkage’, pressure is the time derivative of an even more obscure quantity which has no standard name. I’m calling it ‘pressure momentum’, thanks to the analogy

momentum: force :: pressure momentum: pressure

Just as pressure has units of force per area, pressure momentum has units of momentum per area!

People invented this analogy back when they were first struggling to understand electricity, before electrons had been observed:

Hydraulic analogy, Wikipedia.

The famous electrical engineer Oliver Heaviside pooh-poohed this analogy, calling it the “drain-pipe theory”. I think he was making fun of William Henry Preece. Preece was another electrical engineer, who liked the hydraulic analogy and disliked Heaviside’s fancy math. In his inaugural speech as president of the Institution of Electrical Engineers in 1893, Preece proclaimed:

True theory does not require the abstruse language of mathematics to make it clear and to render it acceptable. All that is solid and substantial in science and usefully applied in practice, have been made clear by relegating mathematic symbols to their proper store place—the study.

According to the judgement of history, Heaviside made more progress in understanding electromagnetism than Preece. But there’s still a nice analogy between electronics and hydraulics. And I’ll eventually use the abstruse language of mathematics to make it very precise!

But now let’s move on to the row called ‘thermal physics’. We could also call this ‘thermodynamics’. It works like this. Say you have a physical system in thermal equilibrium and all you can do is heat it up or cool it down ‘reversibly’—that is, while keeping it in thermal equilibrium all along. For example, imagine a box of gas that you can heat up or cool down. If you put a tiny amount $dE$ of energy into the system in the form of heat, then its entropy increases by a tiny amount $dS.$ And they’re related by this equation:

$dE = TdS$

where $T$ is the temperature.

Another way to say this is

$\displaystyle{ \frac{dE}{dt} = T \frac{dS}{dt} }$

where $t$ is time. On the left we have the power put into the system in the form of heat. But since power should be ‘effort’ times ‘flow’, on the right we should have ‘effort’ times ‘flow’. It makes some sense to call $dS/dt$ the ‘entropy flow’. So temperature, $T,$ must play the role of ‘effort’.

This is a bit weird. I don’t usually think of temperature as a form of ‘effort’ analogous to force or torque. Stranger still, our analogy says that ‘effort’ should be the time derivative of some kind of ‘momentum’, So, we need to introduce temperature momentum: the integral of temperature over time. I’ve never seen people talk about this concept, so it makes me a bit nervous.

But when we have a more complicated physical system like a piston full of gas in thermal equilibrium, we can see the analogy working. Now we have

$dE = TdS - PdV$

The change in energy $dE$ of our gas now has two parts. There’s the change in heat energy $TdS$, which we saw already. But now there’s also the change in energy due to compressing the piston! When we change the volume of the gas by a tiny amount $dV,$ we put in energy $-PdV.$

Now look back at the first chart I drew! It says that pressure is a form of ‘effort’, while volume is a form of ‘displacement’. If you believe that, the equation above should help convince you that temperature is also a form of effort, while entropy is a form of displacement.

But what about the minus sign? That’s no big deal: it’s the result of some arbitrary conventions. $P$ is defined to be the outward pressure of the gas on our piston. If this is positive, reducing the volume of the gas takes a positive amount of energy, so we need to stick in a minus sign. I could eliminate this minus sign by changing some conventions—but if I did, the chemistry professors at UCR would haul me away and increase my heat energy by burning me at the stake.

Speaking of chemistry: here’s how the chemistry row in the analogy chart works. Suppose we have a piston full of gas made of different kinds of molecules, and there can be chemical reactions that change one kind into another. Now our equation gets fancier:

$\displaystyle{ dE = TdS - PdV + \sum_i \mu_i dN_i }$

Here $N_i$ is the number of molecules of the ith kind, while $\mu_i$ is a quantity called a chemical potential. The chemical potential simply says how much energy it takes to increase the number of molecules of a given kind. So, we see that chemical potential is another form of effort, while number of molecules is another form of displacement.

But chemists are too busy to count molecules one at a time, so they count them in big bunches called ‘moles’. A mole is the number of atoms in 12 grams of carbon-12. That’s roughly

602,214,150,000,000,000,000,000

atoms. This is called Avogadro’s constant. If we used 1 gram of hydrogen, we’d get a very close number called ‘Avogadro’s number’, which leads to lots of jokes:

(He must be desperate because he looks so weird… sort of like a mole!)

So, instead of saying that the displacement in chemistry is called ‘number of molecules’, you’ll sound more like an expert if you say ‘moles’. And the corresponding flow is called molar flow.

The truly obscure quantity in this row of the chart is the one whose time derivative is chemical potential! I’m calling it chemical momentum simply because I don’t know another name.

Why are linear and angular momentum so famous compared to pressure momentum, temperature momentum and chemical momentum?

I suspect it’s because the laws of physics are symmetrical
under translations and rotations. When the assumptions of Noether’s theorem hold, this guarantees that the total momentum and angular momentum of a closed system are conserved. Apparently the laws of physics lack the symmetries that would make the other kinds of momentum be conserved.

This suggests that we should dig deeper and try to understand more deeply how this chart is connected to ideas in classical mechanics, like Noether’s theorem or symplectic geometry. I will try to do that sometime later in this series.

More generally, we should try to understand what gives rise to a row in this analogy chart. Are there are lots of rows I haven’t talked about yet, or just a few? There are probably lots. But are there lots of practically important rows that I haven’t talked about—ones that can serve as the basis for new kinds of engineering? Or does something about the structure of the physical world limit the number of such rows?

### Mildly defective analogies

Engineers care a lot about dimensional analysis. So, they often make a big deal about the fact that while effort and flow have different dimensions in different rows of the analogy chart, the following four things are always true:

$pq$ has dimensions of action (= energy × time)
$\dot{p} q$ has dimensions of energy
$p \dot{q}$ has dimensions of energy
$\dot{p} \dot{q}$ has dimensions of power (= energy / time)

In fact any one of these things implies all the rest.

These facts are important when designing ‘mixed systems’, which combine different rows in the chart. For example, in mechatronics, we combine mechanical and electronic elements in a single circuit! And in a hydroelectric dam, power is converted from hydraulic to mechanical and then electric form:

One goal of network theory should be to develop a unified language for studying mixed systems! Engineers have already done most of the hard work. And they’ve realized that thanks to conservation of energy, working with pairs of flow and effort variables whose product has dimensions of power is very convenient. It makes it easy to track the flow of energy through these systems.

However, people have tried to extend the analogy chart to include ‘mildly defective’ examples where effort times flow doesn’t have dimensions of power. The two most popular are these:

 displacement:    $q$ flow:      $\dot q$ momentum:      $p$ effort:           $\dot p$ Heat flow heat heat flow temperature momentum temperature Economics inventory product flow economic momentum product price

The heat flow analogy comes up because people like to think of heat flow as analogous to electrical current, and temperature as analogous to voltage. Why? Because an insulated wall acts a bit like a resistor! The current flowing through a resistor is a function the voltage across it. Similarly, the heat flowing through an insulated wall is about proportional to the difference in temperature between the inside and the outside.

However, there’s a difference. Current times voltage has dimensions of power. Heat flow times temperature does not have dimensions of power. In fact, heat flow by itself already has dimensions of power! So, engineers feel somewhat guilty about this analogy.

Being a mathematical physicist, a possible way out presents itself to me: use units where temperature is dimensionless! In fact such units are pretty popular in some circles. But I don’t know if this solution is a real one, or whether it causes some sort of trouble.

In the economic example, ‘energy’ has been replaced by ‘money’. So other words, ‘inventory’ times ‘product price’ has units of money. And so does ‘product flow’ times ‘economic momentum’! I’d never heard of economic momentum before I started studying these analogies, but I didn’t make up that term. It’s the thing whose time derivative is ‘product price’. Apparently economists have noticed a tendency for rising prices to keep rising, and falling prices to keep falling… a tendency toward ‘conservation of momentum’ that doesn’t fit into their models of rational behavior.

I’m suspicious of any attempt to make economics seem like physics. Unlike elementary particles or rocks, people don’t seem to be very well modelled by simple differential equations. However, some economists have used the above analogy to model economic systems. And I can’t help but find that interesting—even if intellectually dubious when taken too seriously.

### An auto-analogy

Beside the analogy I’ve already described between electronics and mechanics, there’s another one, called ‘Firestone’s analogy’:

• F.A. Firestone, A new analogy between mechanical and electrical systems, Journal of the Acoustical Society of America 4 (1933), 249–267.

Alain Bossavit pointed this out in the comments to Part 27. The idea is to treat current as analogous to force instead of velocity… and treat voltage as analogous to velocity instead of force!

In other words, switch your $p$’s and $q$’s:

 Electronics Mechanics          (usual analogy) Mechanics      (Firestone’s analogy) charge position: $q$ momentum: $p$ current velocity: $\dot{q}$ force: $\dot{p}$ flux linkage momentum: $p$ position: $q$ voltage force: $\dot{p}$ velocity: $\dot{q}$

This new analogy is not ‘mildly defective’: the product of effort and flow variables still has dimensions of power. But why bother with another analogy?

It may be helpful to recall this circuit from last time:

It’s described by this differential equation:

$L \ddot{Q} + R \dot{Q} + C^{-1} Q = V$

We used the ‘usual analogy’ to translate it into classical mechanics problem, and we got a problem where an object of mass $L$ is hanging from a spring with spring constant $1/C$ and damping coefficient $R,$ and feeling an additional external force $F:$

$m \ddot{q} + r \dot{q} + k q = F$

And that’s fine. But there’s an intuitive sense in which all three forces are acting ‘in parallel’ on the mass, rather than in series. In other words, all side by side, instead of one after the other.

Using Firestone’s analogy, we get a different classical mechanics problem, where the three forces are acting in series. The spring is connected to source of friction, which in turn is connected to an external force.

This may seem a bit mysterious. But instead of trying to explain it, I’ll urge you to read his paper, which is short and clearly written. I instead want to make a somewhat different point, which is that we can take a mechanical system, convert it to an electrical one following the usual analogy, and then convert back to a mechanical one using Firestone’s analogy. This gives us an ‘auto-analogy’ between mechanics and itself, which switches $p$ and $q.$

And although I haven’t been able to figure out why from Firestone’s paper, I have other reasons for feeling sure this auto-analogy should contain a minus sign. For example:

$p \mapsto q, \qquad q \mapsto -p$

In other words, it should correspond to a 90° rotation in the $(p,q)$ plane. There’s nothing sacred about whether we rotate clockwise or counterclockwise; we can equally well do this:

$p \mapsto -q, \qquad q \mapsto p$

But we need the minus sign to get a so-called symplectic transformation of the $(p,q)$ plane. And from my experience with classical mechanics, I’m pretty sure we want that. If I’m wrong, please let me know!

I have a feeling we should revisit this issue when we get more deeply into the symplectic aspects of circuit theory. So, I won’t go on now.

### References

The analogies I’ve been talking about are studied in a branch of engineering called system dynamics. You can read more about it here:

• Dean C. Karnopp, Donald L. Margolis and Ronald C. Rosenberg, System Dynamics: a Unified Approach, Wiley, New York, 1990.

• Forbes T. Brown, Engineering System Dynamics: a Unified Graph-Centered Approach, CRC Press, Boca Raton, 2007.

• Francois E. Cellier, Continuous System Modelling, Springer, Berlin, 1991.

System dynamics already uses lots of diagrams of networks. One of my goals in weeks to come is to explain the category theory lurking behind these diagrams.

## Network Theory (Part 28)

10 April, 2013

Last time I left you with some puzzles. One was to use the laws of electrical circuits to work out what this one does:

If we do this puzzle, and keep our eyes open, we’ll see an analogy between electrical circuits and classical mechanics! And this is the first of a huge set of analogies. The same math shows up in many different subjects, whenever we study complex systems made of interacting parts. So, it should become part of any general theory of networks.

This simple circuit is very famous: it’s called a series RLC circuit, because it has a resistor of resistance $R,$ an inductor of inductance $L,$ and a capacitor of capacitance $C,$ all hooked up ‘in series’, meaning one after another. But understand this circuit, it’s good to start with an even simpler one, where we leave out the voltage source:

This has three edges, so reading from top to bottom there are 3 voltages $V_1, V_2, V_3,$ and 3 currents $I_1, I_2, I_3,$ one for each edge. The white and black dots are called ‘nodes’, and the white ones are called ‘terminals’: current can flow in or out of those.

The voltages and currents obey a bunch of equations:

• Kirchhoff’s current law says the current flowing into each node that’s not a terminal equals the current flowing out:

$I_1 = I_2 = I_3$

• Kirchhoff’s voltage law says there are potentials $\phi_0, \phi_1, \phi_2, \phi_3$, one for each node, such that:

$V_1 = \phi_0 - \phi_1$

$V_2 = \phi_1 - \phi_2$

$V_3 = \phi_2 - \phi_3$

In this particular problem, Kirchhoff’s voltage law doesn’t say much, since we can always find potentials obeying this, given the voltages. But in other problems it can be important. And even here it suggests that the sum $V_1 + V_2 + V_3$ will be important; this is the ‘total voltage across the circuit’.

Next, we get one equation for each circuit element:

• The law for a resistor says:

$V_1 = R I_1$

The law for a inductor says:

$\displaystyle{ V_2 = L \frac{d I_2}{d t} }$

The law for a capacitor says:

$\displaystyle{ I_3 = C \frac{d V_3}{d t} }$

These are all our equations. What should we do with them? Since $I_1 = I_2 = I_3,$ it makes sense to call all these currents simply $I$ and solve for each voltage in terms of this. Here’s what we get:

$V_1 = R I$

$\displaystyle{ V_2 = L \frac{d I}{d t} }$

$\displaystyle {V_3 = C^{-1} \int I \, dt }$

So, if we know the current flowing through the circuit we can work out the voltage across each circuit element!

Well, not quite: in the case of the capacitor we only know it up to a constant, since there’s a constant of integration. This may seem like a minor objection, but it’s worth taking seriously. The point is that the charge on the capacitor’s plate is proportional to the voltage across the capacitor:

$\displaystyle{V_3 = C^{-1} Q }$

When electrons move on or off the plate, this charge changes, and we get a current:

$\displaystyle{I = \frac{d Q}{d t} }$

So, we can work out the time derivative of $V_3$ from the current $I$, but to work out $V_3$ itself we need the charge $Q.$

Treat these as definitions if you like, but they’re physical facts too! And they let us rewrite our trio of equations:

$V_1 = R I$

$\displaystyle{ V_2 = L \frac{d I}{d t} }$

$\displaystyle{V_3 = C^{-1} \int I \, dt }$

in terms of the charge, as follows:

$V_1 = R \dot{Q}$

$V_2 = L \ddot{Q}$

$V_3 = C^{-1} Q$

Then if we add these three equations, we get

$V_1 + V_2 + V_3 = L \ddot Q + R \dot Q + C^{-1} Q$

So, if we define the total voltage by

$V = V_1 + V_2 + V_3 = \phi_0 - \phi_3$

we get

$L \ddot Q + R \dot Q + C^{-1} Q = V$

And this is great!

Why? Because this equation is famous! If you’re a mathematician, you know it as the most general second-order linear ordinary differential equation with constant coefficients. But if you’re a physicist, you know it as the damped driven oscillator.

### The analogy between electronics and mechanics

Here’s an example of a damped driven oscillator:

We’ve got an object hanging from a spring with some friction, and an external force pulling it down. Here the external force is gravity, so it’s constant in time, but we can imagine fancier situations where it’s not. So in a general damped driven oscillator:

• the object has mass $m$ (and the spring is massless),

• the spring constant is $k$ (this says how strong the spring force is),

• the damping coefficient is $r$ (this says how much friction there is),

• the external force is $F$ (in general a function of time).

Then Newton’s law says

$m \ddot{q} + r \dot{q} + k q = F$

And apart from the use of different letters, this is exactly like the equation for our circuit! Remember, that was

$L \ddot Q + R \dot Q + C^{-1} Q = V$

So, we get a wonderful analogy relating electronics and mechanics! It goes like this:

 Electronics Mechanics charge: $Q$ position: $q$ current: $I = \dot{Q}$ velocity: $v = \dot{q}$ voltage: $V$ force: $F$ inductance: $L$ mass: $m$ resistance: $R$ damping coefficient: $r$ inverse capacitance: $1/C$ spring constant: $k$

If you understand mechanics, you can use this to get intuition about electronics… or vice versa. I’m more comfortable with mechanics, so when I see this circuit:

I imagine a current of electrons whizzing along, ‘forced’ by the voltage across the circuit, getting slowed by the ‘friction’ of the resistor, wanting to continue their motion thanks to the inertia or ‘mass’ of the inductor, and getting stuck on the plate of the capacitor, where their mutual repulsion pushes back against the flow of current—just like a spring fights back when you pull on it! This lets me know how the circuit will behave: I can use my mechanical intuition.

The only mildly annoying thing is that the inverse of the capacitance $C$ is like the spring constant $k.$ But this makes perfect sense. A capacitor is like a spring: you ‘pull’ on it with voltage and it ‘stretches’ by building up electric charge on its plate. If its capacitance is high, it’s like a easily stretchable spring. But this means the corresponding spring constant is low.

Besides letting us transfer intuition and techniques, the other great thing about analogies is that they suggest ways of extending themselves. For example, we’ve seen that current is the time derivative of charge. But if we hadn’t, we could still have guessed it, because current is like velocity, which is the time derivative of something important.

Similarly, force is analogous to voltage. But force is the time derivative of momentum! We don’t have momentum on our chart. Our chart is also missing the thing whose time derivative is voltage. This thing is called flux linkage, and sometimes denotes $\lambda.$ So we should add this, and momentum, to our chart:

 Electronics Mechanics charge: $Q$ position: $q$ current: $I = \dot{Q}$ velocity: $v = \dot{q}$ flux linkage: $\lambda$ momentum: $p$ voltage: $V = \dot{\lambda}$ force: $F = \dot{p}$ inductance: $L$ mass: $m$ resistance: $R$ damping coefficient: $r$ inverse capacitance: $1/C$ spring constant: $k$

### Fourier transforms

But before I get carried away talking about analogies, let’s try to solve the equation for our circuit:

$L \ddot Q + R \dot Q + C^{-1} Q = V$

This instantly tells us the voltage $V$ as a function of time if we know the charge $Q$ as a function of time. So, ‘solving’ it means figuring out $Q$ if we know $V.$ You may not care about $Q$—it’s the charge of the electrons stuck on the capacitor—but you should certainly care about the current $I = \dot{Q},$ and figuring out $Q$ will get you that.

Besides, we’ll learn something good from solving this equation.

We could solve it using either the Laplace transform or the Fourier transform. They’re very similar. For some reason electrical engineers prefer the Laplace transform—does anyone know why? But I think the Fourier transform is conceptually preferable, slightly, so I’ll use that.

The idea is to write any function of time as a linear combination of oscillating functions $\exp(i\omega t)$ with different frequencies $\omega.$ More precisely, we write our function $f$ as an integral

$\displaystyle{ f(t) = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^\infty \hat{f}(\omega) e^{i\omega t} \, d\omega }$

Here the function $\hat{f}$ is called the Fourier transform of $f$, and it’s given by

$\displaystyle{ \hat{f}(\omega) = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^\infty f(t) e^{-i\omega t} \, dt }$

There is a lot one could say about this, but all I need right now is that differentiating a function has the effect of multiplying its Fourier transform by $i\omega.$ To see this, we simply take the Fourier transform of $\dot{f}$:

$\begin{array}{ccl} \hat{\dot{f}}(\omega) &=& \displaystyle{ \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^\infty \frac{df(t)}{dt} \, e^{-i\omega t} \, dt } \\ \\ &=& \displaystyle{ -\frac{1}{\sqrt{2 \pi}} \int_{-\infty}^\infty f(t) \frac{d}{dt} e^{-i\omega t} \, dt } \\ \\ &=& \displaystyle{ i\omega \; \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^\infty f(t) e^{-i\omega t} \, dt } \\ \\ &=& i\omega \hat{f}(\omega) \end{array}$

where in the second step we integrate by parts. So,

$\hat{\dot{f}}(\omega) = i\omega \hat{f}(\omega)$

The Fourier transform is linear, too, so we can start with our differential equation:

$L \ddot Q + R \dot Q + C^{-1} Q = V$

and take the Fourier transform of each term, getting

$\displaystyle{ \left((i\omega)^2 L + (i\omega) R + C^{-1}\right) \hat{Q}(\omega) = \hat{V}(\omega) }$

We can now solve for the charge in a completely painless way:

$\displaystyle{ \hat{Q}(\omega) = \frac{1}{((i\omega)^2 L + (i\omega) R + C^{-1})} \, \hat{V}(\omega) }$

Well, we actually solved for $\hat{Q}$ in terms of $\hat{V}.$ But if we’re good at taking Fourier transforms, this is good enough. And it has a deep inner meaning.

To see its inner meaning, note that the Fourier transform of an oscillating function $\exp(i \omega_0 t)$ is a delta function at the frequency $\omega = \omega_0.$ This says that this oscillating function is purely of frequency $\omega_0,$ like a laser beam of one pure color, or a sound of one pure pitch.

Actually there’s a little fudge factor due to how I defined the Fourier transform: if

$f(t) = e^{i\omega_0 t}$

then

$\displaystyle{ \hat{f}(\omega) = \sqrt{2 \pi} \, \delta(\omega - \omega_0) }$

But it’s no big deal. (You can define your Fourier transform so the $2\pi$ doesn’t show up here, but it’s bound to show up somewhere.)

Also, you may wonder how the complex numbers got into the game. What would it mean to say the voltage is $\exp(i \omega t)?$ The answer is: don’t worry, everything in sight is linear, so we can take the real or imaginary part of any equation and get one that makes physical sense.

Anyway, what does our relation

$\displaystyle{ \hat{Q}(\omega) = \frac{1}{((i\omega)^2 L + (i\omega) R + C^{-1})} \hat{V}(\omega) }$

mean? It means that if we put an oscillating voltage of frequency $\omega_0$ across our circuit, like this:

$V(t) = e^{i \omega_0 t}$

then we’ll get an oscillating charge at the same frequency, like this:

$\displaystyle{ Q(t) = \frac{1}{((i\omega_0)^2 L + (i\omega_0) R + C^{-1})} e^{i \omega_0 t} }$

To see this, just use the fact that the Fourier transform of $\exp(i \omega_0 t)$ is essentially a delta function at $\omega_0,$ and juggle the equations appropriately!

But the magnitude and phase of this oscillating charge $Q(t)$ depends on the function

$\displaystyle{ \frac{1}{((i\omega_0)^2 L + (i\omega_0) R + C^{-1})} }$

For example, $Q(t)$ will be big when $\omega_0$ is near a pole of this function! We can use this to study the resonant frequency of our circuit.

The same idea works for many more complicated circuits, and other things too. The function up there is an example of a transfer function: it describes the response of a linear, time-invariant system to an input of a given frequency. Here the ‘input’ is the voltage and the ‘response’ is the charge.

### Impedance

Taking this idea to its logical conclusion, we can see inductors and capacitors as being resistors with a frequency-dependent, complex-valued resistance! This generalized resistance is called ‘impedance. Let’s see how it works.

Suppose we have an electrical circuit. Consider any edge $e$ of this circuit:

• If our edge $e$ is labelled by a resistor of resistance $R$:

then

$V_e = R I_e$

Taking Fourier transforms, we get

$\hat{V}_e = R \hat{I}_e$

so nothing interesting here: our resistor acts like a resistor of resistance $R$ no matter what the frequency of the voltage and current are!

• If our edge $e$ is labelled by an inductor of inductance $L$:

then

$\displaystyle{ V_e = L \frac{d I_e}{d t} }$

Taking Fourier transforms, we get

$\hat{V}_e = (i\omega L) \hat{I}_e$

This is interesting: our inductor acts like a resistor of resistance $i \omega L$ when the frequency of the current and voltage is $\omega.$ So, we say the ‘impedance’ of the inductor is $i \omega L.$

• If our edge $e$ is labelled by a capacitor of capacitance $C$:

we have

$\displaystyle{ I_e = C \frac{d V_e}{d t} }$

Taking Fourier transforms, we get

$\hat{I}_e = (i\omega C) \hat{V}_e$

or

$\displaystyle{ \hat{V}_e = \frac{1}{i \omega C} \hat{I_e} }$

So, our capacitor acts like a resistor of resistance $1/(i \omega C)$ when the frequency of the current and voltage is $\omega.$ We say the ‘impedance’ of the capacitor is $1/(i \omega L).$

It doesn’t make sense to talk about the impedance of a voltage source or current source, since these circuit elements don’t give a linear relation between voltage and current. But whenever an element is linear and its properties don’t change with time, the Fourier transformed voltage will be some function of frequency times the Fourier transformed current. And in this case, we call that function the impedance of the element. The symbol for impedance is $Z,$ so we have

$\hat{V}_e(\omega) = Z(\omega) \hat{I}_e(\omega)$

or

$\hat{V}_e = Z \hat{I}_e$

for short.

### The big picture

In case you’re getting lost in the details, here are the big lessons for today:

• There’s a detailed analogy between electronics and mechanics, which we’ll later extend to many other systems.

• The study of linear time-independent elements can be reduced to the study of resistors if we generalize resistance to impedance by letting it be a complex-valued function instead of a real number.

One thing we’re doing is preparing for a general study of linear time-independent open systems. We’ll use linear algebra, but the field—the number system in our linear algebra—will consist of complex-valued functions, rather than real numbers.

### Puzzle

Let’s not forget our original problem:

This is closely related to the problem we just solved. All the equations we derived still hold! But if you do the math, or use some intuition, you’ll see the voltage source ensures that the voltage we’ve been calling $V$ is a constant. So, the current $I$ flowing around the wire obeys the same equation we got before:

$L \ddot Q + R \dot Q + C^{-1} Q = V$

where $\dot Q = I.$ The only difference is that now $V$ is constant.

Puzzle. Solve this equation for $Q(t).$

There are lots of ways to do this. You could use a Fourier transform, which would give a satisfying sense of completion to this blog article. Or, you could do it some other way.