The Inverse Cube Force Law

30 August, 2015

Here you see three planets. The blue planet is orbiting the Sun in a realistic way: it’s going around an ellipse.

The other two are moving in and out just like the blue planet, so they all stay on the same circle. But they’re moving around this circle at different rates! The green planet is moving faster than the blue one: it completes 3 orbits each time the blue planet goes around once. The red planet isn’t going around at all: it only moves in and out.

What’s going on here?

In 1687, Isaac Newton published his Philosophiæ Naturalis Principia Mathematica. This book is famous, but in Propositions 43–45 of Book I he did something that people didn’t talk about much—until recently. He figured out what extra force, besides gravity, would make a planet move like one of these weird other planets. It turns out an extra force obeying an inverse cube law will do the job!

Let me make this more precise. We’re only interested in ‘central forces’ here. A central force is one that only pushes a particle towards or away from some chosen point, and only depends on the particle’s distance from that point. In Newton’s theory, gravity is a central force obeying an inverse square law:

F(r) = - \displaystyle{ \frac{a}{r^2} }

for some constant a. But he considered adding an extra central force obeying an inverse cube law:

F(r) = - \displaystyle{ \frac{a}{r^2} + \frac{b}{r^3} }

He showed that if you do this, for any motion of a particle in the force of gravity you can find a motion of a particle in gravity plus this extra force, where the distance r(t) is the same, but the angle \theta(t) is not.

In fact Newton did more. He showed that if we start with any central force, adding an inverse cube force has this effect.

There’s a very long page about this on Wikipedia:

Newton’s theorem of revolving orbits, Wikipedia.

I haven’t fully understood all of this, but it instantly makes me think of three other things I know about the inverse cube force law, which are probably related. So maybe you can help me figure out the relationship.

The first, and simplest, is this. Suppose we have a particle in a central force. It will move in a plane, so we can use polar coordinates r, \theta to describe its position. We can describe the force away from the origin as a function F(r). Then the radial part of the particle’s motion obeys this equation:

\displaystyle{ m \ddot r = F(r) + \frac{L^2}{mr^3} }

where L is the magnitude of the particle’s angular momentum.

So, angular momentum acts to provide a ‘fictitious force’ pushing the particle out, which one might call the centrifugal force. And this force obeys an inverse cube force law!

Furthermore, thanks to the formula above, it’s pretty obvious that if you change L but also add a precisely compensating inverse cube force, the value of \ddot r will be unchanged! So, we can set things up so that the particle’s radial motion will be unchanged. But its angular motion will be different, since it has a different angular momentum. This explains Newton’s observation.
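
We can check this numerically. Here’s a minimal sketch in Python (my own toy setup, assuming units with m = 1, an inverse square force F(r) = -1/r^2, and arbitrary initial data): it integrates the radial equation once with angular momentum L_1 and no extra force, and once with angular momentum L_2 = 3L_1 plus the compensating inverse cube force, mimicking the blue and green planets above.

```python
import numpy as np

def orbit(L, b, r0=1.5, v0=0.0, dt=1e-4, n=200_000):
    """Integrate r'' = F(r) + b/r^3 + L^2/r^3 with F(r) = -1/r^2 and m = 1,
    tracking the angle via theta' = L/r^2."""
    r, v, th = r0, v0, 0.0
    rs, ths = np.empty(n), np.empty(n)
    for i in range(n):
        v += (-1.0/r**2 + b/r**3 + L**2/r**3) * dt   # radial acceleration
        r += v * dt
        th += (L/r**2) * dt
        rs[i], ths[i] = r, th
    return rs, ths

L1 = 1.1
r1, th1 = orbit(L1, 0.0)                  # 'blue planet': pure gravity
r2, th2 = orbit(3*L1, L1**2 - (3*L1)**2)  # 'green planet': compensated
print(np.abs(r1 - r2).max())              # essentially zero: same r(t)
print(np.allclose(th2, 3*th1))            # True: theta advances 3x as fast
```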

It’s often handy to write a central force in terms of a potential:

F(r) = -V'(r)

Then we can make up an extra potential responsible for the centrifugal force, and combine it with the actual potential V into a so-called effective potential:

\displaystyle{ U(r) = V(r) + \frac{L^2}{2mr^2} }

The particle’s radial motion then obeys a simple equation:

m\ddot{r} = - U'(r)

For a particle in gravity, where the force obeys an inverse square law and V is proportional to -1/r, the effective potential might look like this:

This is the graph of

\displaystyle{ U(r) = -\frac{4}{r} + \frac{1}{r^2} }

If you’re used to particles rolling around in potentials, you can easily see that a particle with not too much energy will move back and forth, never making it to r = 0 or r = \infty. This corresponds to an elliptical orbit. Give it more energy and the particle can escape to infinity, but it will never hit the origin. The repulsive ‘centrifugal force’ always overwhelms the attraction of gravity near the origin, at least if the angular momentum is nonzero.

On the other hand, suppose we have a particle moving in an attractive inverse cube force! Then the potential is proportional to 1/r^2, so the effective potential is

\displaystyle{ U(r) = \frac{c}{r^2} + \frac{L^2}{2mr^2} }

where c is negative for an attractive force. If this attractive force is big enough, namely

\displaystyle{ c < -\frac{L^2}{2m} }

then this force can exceed the centrifugal force, and the particle can fall in to r = 0.

If we keep track of the angular coordinate \theta, we can see what’s really going on. The particle is spiraling in to its doom, hitting the origin in a finite amount of time!
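
Here’s a similar toy integration (again with m = 1, and the arbitrary choices c = -1, L = 1, so that c < -L^2/2m): starting at rest at r = 1, the particle reaches r = 0 at a finite time.

```python
# effective potential U(r) = k/r^2 with k = c + L^2/2, here k = -0.5 < 0
c, L = -1.0, 1.0
k = c + L**2/2
r, v, t, dt = 1.0, 0.0, 0.0, 1e-5
while r > 0:
    v += (2*k/r**3) * dt    # m r'' = -U'(r) = 2k/r^3, inward since k < 0
    r += v * dt
    t += dt
print(f"reached r = 0 at t ~ {t:.3f}")    # finite: about 1.0 here
```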

This should remind you of a black hole, and indeed something similar happens there, but even more drastic:

Schwarzschild geodesics: effective radial potential energy, Wikipedia.

For a nonrotating uncharged black hole, the effective potential has three terms. Like Newtonian gravity it has an attractive -1/r term and a repulsive 1/r^2 term. But it also has an attractive -1/r^3 term! In other words, it’s as if on top of Newtonian gravity, we had another attractive force obeying an inverse fourth power law! This overwhelms the others at short distances, so if you get too close to a black hole, you spiral in to your doom.

For example, a black hole can have an effective potential like this:

But back to inverse cube force laws! I know two more things about them. A while back I discussed how a particle in an inverse square force can be reinterpreted as a harmonic oscillator:

Planets in the fourth dimension, Azimuth.

There are many ways to think about this, and apparently the idea in some form goes all the way back to Newton! It involves a sneaky way to take a particle in a potential

\displaystyle{ V(r) \propto r^{-1} }

and think of it as moving around in the complex plane. Then if you square its position—thought of as a complex number—and cleverly reparametrize time, you get a particle moving in a potential

\displaystyle{ V(r) \propto r^2 }

This amazing trick can be generalized! A particle in a potential

\displaystyle{ V(r) \propto r^p }

can be transformed to a particle in a potential

\displaystyle{ V(r) \propto r^q }

if the exponents are related by

(p+2)(q+2) = 4

A good description is here:

• Rachel W. Hall and Krešimir Josić, Planetary motion and the duality of force laws, SIAM Review 42 (2000), 115–124.

This trick transforms particles in r^p potentials with p ranging between -2 and +\infty to r^q potentials with q ranging between +\infty and -2. It’s like a see-saw: when p is small, q is big, and vice versa.

But you’ll notice this trick doesn’t actually work at p = -2, the case that corresponds to the inverse cube force law. The problem is that p + 2 = 0 in this case, so we can’t find q with (p+2)(q+2) = 4.
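
Just to make the see-saw concrete, here’s a trivial function (the name is mine) computing the dual exponent:

```python
def dual_exponent(p):
    """Solve (p+2)(q+2) = 4 for q."""
    return 4/(p + 2) - 2

print(dual_exponent(-1))   # 2.0: the Kepler potential <-> harmonic oscillator
print(dual_exponent(2))    # -1.0: and back again
dual_exponent(-2)          # ZeroDivisionError: the inverse cube law has no partner
```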

So, the inverse cube force is special in three ways: it’s the one that you can add on to any force to get solutions with the same radial motion but different angular motion, it’s the one that naturally describes the ‘centrifugal force’, and it’s the one that doesn’t have a partner! We’ve seen how the first two ways are secretly the same. I don’t know about the third, but I’m hopeful.

Quantum aspects

Finally, here’s a fourth way in which the inverse cube law is special. This shows up most visibly in quantum mechanics… and this is what got me interested in this business in the first place.

You see, I’m writing a paper called ‘Struggles with the continuum’, which discusses problems in analysis that arise when you try to make some of our favorite theories of physics make sense. The inverse square force law poses interesting problems of this sort, which I plan to discuss. But I started wanting to compare the inverse cube force law, just so people can see things that go wrong in this case, and not take our successes with the inverse square law for granted.

Unfortunately a huge digression on the inverse cube force law would be out of place in that paper. So, I’m offloading some of that material to here.

In quantum mechanics, a particle moving in an inverse cube force law has a Hamiltonian like this:

H = -\nabla^2 + c r^{-2}

The first term describes the kinetic energy, while the second describes the potential energy. I’m setting \hbar = 1 and 2m = 1 to remove some clutter that doesn’t really affect the key issues.

To see how strange this Hamiltonian is, let me compare it with an easier case. If p < 2, the Hamiltonian

H = -\nabla^2 + c r^{-p}

is essentially self-adjoint on C_0^\infty(\mathbb{R}^3 - \{0\}), which is the space of compactly supported smooth functions on 3d Euclidean space minus the origin. What this means is that first of all, H is defined on this domain: it maps functions in this domain to functions in L^2(\mathbb{R}^3). But more importantly, it means we can uniquely extend H from this domain to a self-adjoint operator on some larger domain. In quantum physics, we want our Hamiltonians to be self-adjoint. So, this fact is good.

Proving this fact is fairly hard! It uses something called the Kato–Lax–Milgram–Nelson theorem together with this beautiful inequality:

\displaystyle{ \int_{\mathbb{R}^3} \frac{1}{4r^2} |\psi(x)|^2 \,d^3 x \le \int_{\mathbb{R}^3} |\nabla \psi(x)|^2 \,d^3 x }

for any \psi\in C_0^\infty(\mathbb{R}^3).

If you think hard, you can see this inequality is actually a fact about the quantum mechanics of the inverse cube law! It says that if c \ge -1/4, the energy of a quantum particle in the potential c r^{-2} is bounded below. And in a sense, this inequality is optimal: if c < -1/4, the energy is not bounded below. This is the quantum version of how a classical particle can spiral in to its doom in an attractive inverse cube law, if it doesn’t have enough angular momentum. But it’s subtly and mysteriously different.
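
If you want to see the inequality in action, here’s a quick numerical check (a sketch assuming a spherically symmetric trial function, for which d^3 x = 4\pi r^2 \, dr and |\nabla \psi|^2 = \psi'(r)^2):

```python
import numpy as np

r = np.linspace(1e-6, 10.0, 200_001)
dr = r[1] - r[0]
psi = np.exp(-r**2)          # trial function
dpsi = -2*r*np.exp(-r**2)    # its radial derivative

lhs = np.sum(psi**2/(4*r**2) * 4*np.pi*r**2) * dr
rhs = np.sum(dpsi**2 * 4*np.pi*r**2) * dr
print(lhs, rhs, lhs <= rhs)  # about 1.97 vs 5.91: the inequality holds
```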

You may wonder how this inequality is used to prove good things about potentials that are ‘less singular’ than the c r^{-2} potential: that is, potentials c r^{-p} with p < 2. For that, you have to use some tricks that I don’t want to explain here. I also don’t want to prove this inequality, or explain why it’s optimal! You can find most of this in some old course notes of mine:

• John Baez, Quantum Theory and Analysis, 1989.

See especially section 15.

But it’s pretty easy to see how this inequality implies things about the expected energy of a quantum particle in the potential c r^{-2}. So let’s do that.

In this potential, the expected energy of a state \psi is:

\displaystyle{  \langle \psi, H \psi \rangle =   \int_{\mathbb{R}^3} \overline\psi(x)\, (-\nabla^2 + c r^{-2})\psi(x) \, d^3 x }

Doing an integration by parts, this gives:

\displaystyle{  \langle \psi, H \psi \rangle = \int_{\mathbb{R}^3} |\nabla \psi(x)|^2 + cr^{-2} |\psi(x)|^2 \,d^3 x }

The inequality I showed you says precisely that when c = -1/4, this is greater than or equal to zero. So, the expected energy is actually nonnegative in this case! And making c greater than -1/4 only makes the expected energy bigger.

Note that in classical mechanics, the energy of a particle in this potential ceases to be bounded below as soon as c < 0. Quantum mechanics is different because of the uncertainty principle! To get a lot of negative potential energy, the particle’s wavefunction must be squished near the origin, but that gives it kinetic energy.

It turns out that the Hamiltonian for a quantum particle in an inverse cube force law has exquisitely subtle and tricky behavior. Many people have written about it, running into ‘paradoxes’ when they weren’t careful enough. Only rather recently have things been straightened out.

For starters, the Hamiltonian for this kind of particle

H = -\nabla^2 + c r^{-2}

has different behaviors depending on c. Obviously the force is attractive when c < 0 and repulsive when c > 0, but that’s not the only thing that matters! Here’s a summary:

c \ge 3/4. In this case H is essentially self-adjoint on C_0^\infty(\mathbb{R}^3 - \{0\}). So, it admits a unique self-adjoint extension and there’s no ambiguity about this case.

c < 3/4. In this case H is not essentially self-adjoint on C_0^\infty(\mathbb{R}^3 - \{0\}). In fact, it admits more than one self-adjoint extension! This means that we need extra input from physics to choose the Hamiltonian in this case. It turns out that we need to say what happens when the particle hits the singularity at r = 0. This is a long and fascinating story that I just learned yesterday.

c \ge -1/4. In this case the expected energy \langle \psi, H \psi \rangle is bounded below for \psi \in C_0^\infty(\mathbb{R}^3 - \{0\}). It turns out that whenever we have a Hamiltonian that is bounded below, even if there is not a unique self-adjoint extension, there exists a canonical ‘best choice’ of self-adjoint extension, called the Friedrichs extension. I explain this in my course notes.

c < -1/4. In this case the expected energy is not bounded below, so we don’t have the Friedrichs extension to help us choose which self-adjoint extension is ‘best’.

To go all the way down this rabbit hole, I recommend these two papers:

• Sarang Gopalakrishnan, Self-Adjointness and the Renormalization of Singular Potentials, B.A. Thesis, Amherst College.

• D. M. Gitman, I. V. Tyutin and B. L. Voronov, Self-adjoint extensions and spectral analysis in the Calogero problem, Jour. Phys. A 43 (2010), 145205.

The first is good for a broad overview of problems associated to singular potentials such as the inverse cube force law; while there is attention to mathematical rigor, the focus is on physical insight. The second is good if you want—as I wanted—to really get to the bottom of the inverse cube force law in quantum mechanics. Both have lots of references.

Also, both point out a crucial fact I haven’t mentioned yet: in quantum mechanics the inverse cube force law is special because, naively at least, it has a kind of symmetry under rescaling! You can see this from the formula

H = -\nabla^2 + cr^{-2}

by noting that both the Laplacian and r^{-2} have units of length^{-2}. So, they both transform in the same way under rescaling: if you take any smooth function \psi, apply H and then expand the result by a factor of k, you get k^2 times what you get if you do those operations in the other order.

In particular, this means that if you have a smooth eigenfunction of H with eigenvalue \lambda, you will also have one with eigenvalue k^2 \lambda for any k > 0. And if your original eigenfunction was normalizable, so will be the new one!

With some calculation you can show that when c < -1/4, the Hamiltonian H has a smooth normalizable eigenfunction with a negative eigenvalue. In fact it’s spherically symmetric, so finding it is not so terribly hard. But this instantly implies that H has smooth normalizable eigenfunctions with any negative eigenvalue.

This implies various things, some terrifying. First of all, it means that H is not bounded below, at least not on the space of smooth normalizable functions. A similar but more delicate scaling argument shows that it’s also not bounded below on C_0^\infty(\mathbb{R}^3 - \{0\}), as I claimed earlier.

This is scary but not terrifying: it simply means that when c < -1/4, the potential is too strongly negative for the Hamiltonian to be bounded below.

The terrifying part is this: we’re getting uncountably many normalizable eigenfunctions, all with different eigenvalues, one for each choice of k. A self-adjoint operator on a countable-dimensional Hilbert space like L^2(\mathbb{R}^3) can’t have uncountably many normalizable eigenvectors with different eigenvalues, since then they’d all be orthogonal to each other, and that’s too many orthogonal vectors to fit in a Hilbert space of countable dimension!

This sounds like a paradox, but it’s not. These functions are not all orthogonal, and they’re not all eigenfunctions of a self-adjoint operator. You see, the operator H is not self-adjoint on the domain we’ve chosen, the space of all smooth functions in L^2(\mathbb{R}^3). We can carefully choose a domain to get a self-adjoint operator… but it turns out there are many ways to do it.

Intriguingly, in most cases this choice breaks the naive dilation symmetry. So, we’re getting what physicists call an ‘anomaly’: a symmetry of a classical system that fails to give a symmetry of the corresponding quantum system.

Of course, if you’ve made it this far, you probably want to understand what the different choices of Hamiltonian for a particle in an inverse cube force law actually mean, physically. The idea seems to be that they say how the particle changes phase when it hits the singularity at r = 0 and bounces back out.

(Why does it bounce back out? Well, if it didn’t, time evolution would not be unitary, so it would not be described by a self-adjoint Hamiltonian! We could try to describe the physics of a quantum particle that does not come back out when it hits the singularity, and I believe people have tried, but this requires a different set of mathematical tools.)

For a detailed analysis of this, it seems one should take Schrödinger’s equation and do a separation of variables into the angular part and the radial part:

\psi(r,\theta,\phi) = \Psi(r) \Phi(\theta,\phi)

For each choice of \ell = 0,1,2,\dots one gets a space of spherical harmonics that one can use for the angular part \Phi. The interesting part is the radial part, \Psi. Here it is helpful to make a change of variables

u(r) = r\Psi(r)

At least naively, Schrödinger’s equation for the particle in the cr^{-2} potential then becomes

\displaystyle{ \frac{d}{dt} u = -iH u }

where

\displaystyle{ H = -\frac{d^2}{dr^2} + \frac{c + \ell(\ell+1)}{r^2} }

Beware: I keep calling all sorts of different but related Hamiltonians H, and this one is for the radial part of the dynamics of a quantum particle in an inverse cube force. As we’ve seen before in the classical case, the centrifugal force and the inverse cube force join forces in an ‘effective potential’

\displaystyle{ U(r) = kr^{-2} }

where

k = c + \ell(\ell+1)

So, we have reduced the problem to that of a particle on the open half-line (0,\infty) moving in the potential kr^{-2}. The Hamiltonian for this problem:

\displaystyle{ H = -\frac{d^2}{dr^2} + \frac{k}{r^2} }

is called the Calogero Hamiltonian. Needless to say, it has fascinating and somewhat scary properties, since to make it into a bona fide self-adjoint operator, we must make some choice about what happens when the particle hits r = 0. The formula above does not really specify the Hamiltonian.
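
To see this dichotomy numerically, here’s a finite-difference sketch (an illustrative discretization of my own on the interval (0,1) with Dirichlet boundary conditions, not anything from the papers below): as the grid is refined, the lowest eigenvalue settles down when k > -1/4 but keeps dropping when k < -1/4.

```python
import numpy as np

def lowest_eigenvalue(k, n):
    """Lowest eigenvalue of -d^2/dr^2 + k/r^2 on (0,1), discretized
    on n interior grid points with Dirichlet boundary conditions."""
    h = 1.0/(n + 1)
    r = h*np.arange(1, n + 1)
    lap = (np.diag(np.full(n, 2.0))
           + np.diag(np.full(n - 1, -1.0), 1)
           + np.diag(np.full(n - 1, -1.0), -1)) / h**2
    return np.linalg.eigvalsh(lap + np.diag(k/r**2))[0]

for n in (100, 400, 1600):
    print(n, lowest_eigenvalue(-0.2, n), lowest_eigenvalue(-1.0, n))
# the k = -0.2 column approaches a finite limit;
# the k = -1.0 column keeps dropping as the grid is refined
```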

This is more or less where Gitman, Tyutin and Voronov begin their analysis, after a long and pleasant review of the problem. They describe all the possible choices of self-adjoint operator that are allowed. The answer depends on the values of k, but very crudely, the choice says something like how the phase of your particle changes when it bounces off the singularity. Most choices break the dilation invariance of the problem. But intriguingly, some choices retain invariance under a discrete subgroup of dilations!

So, the rabbit hole of the inverse cube force law goes quite deep, and I expect I haven’t quite gotten to the bottom yet. The problem may seem pathological, verging on pointless. But the math is fascinating, and it’s a great testing-ground for ideas in quantum mechanics—very manageable compared to deeper subjects like quantum field theory, which are riddled with their own pathologies. Finally, the connection between the inverse cube force law and centrifugal force makes me think it’s not a mere curiosity.

In four dimensions

It’s a bit odd to study the inverse cube force law in 3-dimensional space, since Newtonian gravity and the electrostatic force would actually obey an inverse cube law in 4-dimensional space. For the classical 2-body problem it doesn’t matter much whether you’re in 3d or 4d space, since the motion stays in a plane. But for the quantum 2-body problem it makes more of a difference!

Just for the record, let me say how the quantum 2-body problem works in 4 dimensions. As before, we can work in the center of mass frame and consider this Hamiltonian:

H = -\nabla^2 + c r^{-2}

And as before, the behavior of this Hamiltonian depends on c. Here’s the story this time:

c \ge 0. In this case H is essentially self-adjoint on C_0^\infty(\mathbb{R}^4 - \{0\}). So, it admits a unique self-adjoint extension and there’s no ambiguity about this case.

c < 0. In this case H is not essentially self-adjoint on C_0^\infty(\mathbb{R}^4 - \{0\}).

c \ge -1. In this case the expected energy \langle \psi, H \psi \rangle is bounded below for \psi \in C_0^\infty(\mathbb{R}^4 - \{0\}). So, there exists a canonical ‘best choice’ of self-adjoint extension, called the Friedrichs extension.

c < -1. In this case the expected energy is not bounded below, so we don’t have the Friedrichs extension to help us choose which self-adjoint extension is ‘best’.

I’ve been assured these are correct by Barry Simon, and a lot of this material will appear in Section 7.4 of his book:

• Barry Simon, A Comprehensive Course in Analysis, Part 4: Operator Theory, American Mathematical Society, Providence, RI, 2015.

See also:

• Barry Simon, Essential self-adjointness of Schrödinger operators with singular potentials, Arch. Rational Mech. Analysis 52 (1973), 44–48.


The animation was made by ‘WillowW’ and placed on Wikicommons. It’s one of a number that appear in this Wikipedia article:

Newton’s theorem of revolving orbits, Wikipedia.

I made the graphs using the free online Desmos graphing calculator.

The picture of a spiral was made by ‘Anarkman’ and ‘Pbroks13’ and placed on Wikicommons; it appears in

Hyperbolic spiral, Wikipedia.

The hyperbolic spiral is one of three kinds of orbits that are possible in an inverse cube force law. They are vaguely analogous to ellipses, hyperbolas and parabolas, but there are actually no bound orbits except perfect circles. The three kinds are called Cotes’s spirals. In polar coordinates, they are:

• the epispiral:

\displaystyle{ \frac{1}{r} = A \cos\left( k\theta + \varepsilon \right) }

• the Poinsot spiral:

\displaystyle{ \frac{1}{r} = A \cosh\left( k\theta + \varepsilon \right) }

• the hyperbolic spiral:

\displaystyle{ \frac{1}{r} = A \theta + \varepsilon }

The Physics of Butterfly Wings

11 August, 2015

Some butterflies have shiny, vividly colored wings. From different angles you see different colors. This effect is called iridescence. How does it work?

It turns out these butterfly wings are made of very fancy materials! Light bounces around inside these materials in a tricky way. Sunlight of different colors winds up reflecting off these materials in different directions.

We’re starting to understand the materials and make similar substances in the lab. They’re called photonic crystals. They have amazing properties.

Here at the Centre for Quantum Technologies we have people studying exotic materials of many kinds. Next door, there’s a lab completely devoted to studying graphene: crystal sheets of carbon in which electrons can move as if they were massless particles! Graphene has a lot of potential for building new technologies—that’s why Singapore is pumping money into researching it.

Some physicists at MIT just showed that one of the materials in butterfly wings might act like a 3d form of graphene. In graphene, electrons can only move easily in 2 directions. In this new material, electrons could move in all 3 directions, acting as if they had no mass.

The pictures here show the microscopic structure of two materials found in butterfly wings:

The picture at left actually shows a sculpture made by the mathematical artist Bathsheba Grossman. But it’s a piece of a gyroid: a surface with a very complicated shape, which repeats forever in 3 directions. It’s called a minimal surface because you can’t shrink its area by tweaking it just a little. It divides space into two regions.

The gyroid was discovered in 1970 by a mathematician, Alan Schoen. It’s a triply periodic minimal surface, meaning one that repeats itself in 3 different directions in space, like a crystal.

Schoen was working for NASA, and his idea was to use the gyroid for building ultra-light, super-strong structures. But that didn’t happen. Research doesn’t move in predictable directions.

In 1983, people discovered that in some mixtures of oil and water, the oil naturally forms a gyroid. The sheets of oil try to minimize their area, so it’s not surprising that they form a minimal surface. Something else makes this surface be a gyroid—I’m not sure what.

Butterfly wings are made of a hard material called chitin. Around 2008, people discovered that the chitin in some iridescent butterfly wings is made in a gyroid pattern! The spacing in this pattern is very small, about one wavelength of visible light. This makes light move through this material in a complicated way, which depends on the light’s color and the direction it’s moving.

So: butterflies have naturally evolved a photonic crystal based on a gyroid!

The universe is awesome, but it’s not magic. A mathematical pattern is beautiful if it’s a simple solution to at least one simple problem. This is why beautiful patterns naturally bring themselves into existence: they’re the simplest ways for certain things to happen. Darwinian evolution helps out: it scans through trillions of possibilities and finds solutions to problems. So, we should expect life to be packed with mathematically beautiful patterns… and it is.

The picture at right above shows a ‘double gyroid’. Here it is again:

This is actually two interlocking surfaces, shown in red and blue. You can get them by writing the gyroid as a level surface:

f(x,y,z) = 0

and taking the two nearby surfaces

f(x,y,z) = \pm c

for some small value of c.
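
To play with this yourself: the gyroid is commonly approximated by the level set of f(x,y,z) = \sin x \cos y + \sin y \cos z + \sin z \cos x, and the double gyroid by the pair of level sets f = \pm c. Here’s a minimal Python sketch sampling one unit cell (actually rendering the two sheets would take something like marching cubes):

```python
import numpy as np

n = 64
t = np.linspace(0, 2*np.pi, n)
x, y, z = np.meshgrid(t, t, t, indexing='ij')
f = np.sin(x)*np.cos(y) + np.sin(y)*np.cos(z) + np.sin(z)*np.cos(x)

c = 0.3
red  = np.abs(f - c) < 0.05   # grid points near the f = +c sheet
blue = np.abs(f + c) < 0.05   # grid points near the f = -c sheet
print(red.sum(), blue.sum())  # equal counts: the two sheets are congruent
```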

It turns out that while they’re still growing, some butterflies have a double gyroid pattern in their wings. This turns into a single gyroid when they grow up!

The new research at MIT studied how an electron would move through a double gyroid pattern. They calculated its dispersion relation: how the speed of the electron would depend on its energy and the direction it’s moving.

An ordinary particle moves faster if it has more energy. But a massless particle, like a photon, moves at the same speed no matter what energy it has. The MIT team showed that an electron in a double gyroid pattern moves at a speed that doesn’t depend much on its energy. So, in some ways this electron acts like a massless particle.

But it’s quite different than a photon. It’s actually more like a neutrino! You see, unlike photons, electrons and neutrinos are spin-1/2 particles. Neutrinos are almost massless. A massless spin-1/2 particle can have a built-in handedness, spinning in only one direction around its axis of motion. Such a particle is called a Weyl spinor. The MIT team showed that an electron moving through a double gyroid acts approximately like a Weyl spinor!

How does this work? Well, the key fact is that the double gyroid has a built-in handedness, or chirality. It comes in a left-handed and right-handed form. You can see the handedness quite clearly in Grossman’s sculpture of the ordinary gyroid:

Beware: nobody has actually made electrons act like Weyl spinors in the lab yet. The MIT team just found a way that should work. Someday someone will actually make it happen, probably in less than a decade. And later, someone will do amazing things with this ability. I don’t know what. Maybe the butterflies know!

References and more

For a good introduction to the physics of gyroids, see:

• James A. Dolan, Bodo D. Wilts, Silvia Vignolini, Jeremy J. Baumberg, Ullrich Steiner and Timothy D. Wilkinson, Optical properties of gyroid structured materials: from photonic crystals to metamaterials, Advanced Optical Materials 3 (2015), 12–32.

For some of the history and math of gyroids, see Alan Schoen’s webpage:

• Alan Schoen, Triply-periodic minimal surfaces.

For more on gyroids in butterfly wings, see:

• K. Michielsen and D. G. Stavenga, Gyroid cuticular structures in butterfly wing scales: biological photonic crystals.

• Vinodkumar Saranathan et al, Structure, function, and self-assembly of single network gyroid (I4132) photonic crystals in butterfly wing scales, PNAS 107 (2010), 11676–11681.

The paper by Michielsen and Stavenga is free online! They say the famous ‘blue Morpho’ butterfly shown in the picture at the top of this article does not use a gyroid; it uses a “two-dimensional photonic crystal slab consisting of arrays of rectangles formed by lamellae and microribs.” But they find gyroids in four other species: Callophrys rubi, Cyanophrys remus, Parides sesostris and Teinopalpus imperialis. They compare transmission electron microscope pictures of slices of their iridescent patches with computer-generated slices of gyroids. The comparison looks pretty good to me:

For the evolution of iridescence, see:

• Melissa G. Meadows et al, Iridescence: views from many angles, J. Roy. Soc. Interface 6 (2009).

For the new research at MIT, see:

• Ling Lu, Liang Fu, John D. Joannopoulos and Marin Soljačić, Weyl points and line nodes in gapless gyroid photonic crystals.

• Ling Lu, Zhiyu Wang, Dexin Ye, Lixin Ran, Liang Fu, John D. Joannopoulos and Marin Soljačić, Experimental observation of Weyl points, Science 349 (2015), 622–624.

Again, the first is free online. There’s a lot of great math lurking inside, most of which is too mind-blowing to explain quickly. Let me just paraphrase the start of the paper, so at least experts can get the idea:

Two-dimensional (2d) electrons and photons at the energies and frequencies of Dirac points exhibit extraordinary features. As the best example, almost all the remarkable properties of graphene are tied to the massless Dirac fermions at its Fermi level. Topologically, Dirac cones are not only the critical points for 2d phase transitions but also the unique surface manifestation of a topologically gapped 3d bulk. In a similar way, it is expected that if a material could be found that exhibits a 3d linear dispersion relation, it would also display a wide range of interesting physics phenomena. The associated 3D linear point degeneracies are called “Weyl points”. In the past year, there have been a few studies of Weyl fermions in electronics. The associated Fermi-arc surface states, quantum Hall effect, novel transport properties and a realization of the Adler–Bell–Jackiw anomaly are also expected. However, no observation of Weyl points has been reported. Here, we present a theoretical discovery and detailed numerical investigation of frequency-isolated Weyl points in perturbed double-gyroid photonic crystals along with their complete phase diagrams and their topologically protected surface states.

Also a bit for the mathematicians:

Weyl points are topologically stable objects in the 3d Brillouin zone: they act as monopoles of Berry flux in momentum space, and hence are intimately related to the topological invariant known as the Chern number. The Chern number can be defined for a single bulk band or a set of bands, where the Chern numbers of the individual bands are summed, on any closed 2d surface in the 3d Brillouin zone. The difference of the Chern numbers defined on two surfaces, of all bands below the Weyl point frequencies, equals the sum of the chiralities of the Weyl points enclosed in between the two surfaces.

This is a mix of topology and physics jargon that may be hard for pure mathematicians to understand, but I’ll be glad to translate if there’s interest.

For starters, a ‘monopole of Berry flux in momentum space’ is a poetic way of talking about a twisted complex line bundle over the space of allowed energy-momenta of the electron in the double gyroid. We get a twist at every ‘Weyl point’, meaning a point where the dispersion relations look locally like those of a Weyl spinor when its energy-momentum is near zero. Near such a point, the dispersion relations are a Fourier-transformed version of the Weyl equation.

A Compositional Framework for Passive Linear Networks

28 April, 2015

Here’s a new paper on network theory:

• John Baez and Brendan Fong, A compositional framework for passive linear networks.

While my paper with Jason Erbele studies signal flow diagrams, this one focuses on circuit diagrams. The two are different, but closely related.

I’ll explain their relation at the Turin workshop in May. For now, let me just talk about this paper with Brendan. There’s a lot in here, but let me just try to explain the main result. It’s all about ‘black boxing’: hiding the details of a circuit and only remembering its behavior as seen from outside.

The idea

In the late 1940s, just as Feynman was developing his diagrams for processes in particle physics, Eilenberg and Mac Lane initiated their work on category theory. Over the subsequent decades, and especially in the work of Joyal and Street in the 1980s, it became clear that these developments were profoundly linked: monoidal categories have a precise graphical representation in terms of string diagrams, and conversely monoidal categories provide an algebraic foundation for the intuitions behind Feynman diagrams. The key insight is the use of categories where morphisms describe physical processes, rather than structure-preserving maps between mathematical objects.

In work on fundamental physics, the cutting edge has moved from categories to higher categories. But the same techniques have filtered into more immediate applications, particularly in computation and quantum computation. Our paper is part of a new program of applying string diagrams to engineering, with the aim of giving diverse diagram languages a unified foundation based on category theory.

Indeed, even before physicists began using Feynman diagrams, various branches of engineering were using diagrams that in retrospect are closely related. Foremost among these are the ubiquitous electrical circuit diagrams. Although less well-known, similar diagrams are used to describe networks consisting of mechanical, hydraulic, thermodynamic and chemical systems. Further work, pioneered in particular by Forrester and Odum, applies similar diagrammatic methods to biology, ecology, and economics.

As discussed in detail by Olsen, Paynter and others, there are mathematically precise analogies between these different systems. In each case, the system’s state is described by variables that come in pairs, with one variable in each pair playing the role of ‘displacement’ and the other playing the role of ‘momentum’. In engineering, the time derivatives of these variables are sometimes called ‘flow’ and ‘effort’.

                        displacement: q   flow: \dot q       momentum: p            effort: \dot p
Mechanics: translation  position          velocity           momentum               force
Mechanics: rotation     angle             angular velocity   angular momentum       torque
Electronics             charge            current            flux linkage           voltage
Hydraulics              volume            flow               pressure momentum      pressure
Thermal Physics         entropy           entropy flow       temperature momentum   temperature
Chemistry               moles             molar flow         chemical momentum      chemical potential

In classical mechanics, this pairing of variables is well understood using symplectic geometry. Thus, any mathematical formulation of the diagrams used to describe networks in engineering needs to take symplectic geometry as well as category theory into account.

While diagrams of networks have been independently introduced in many disciplines, we do not expect formalizing these diagrams to immediately help the practitioners of these disciplines. At first the flow of information will mainly go in the other direction: by translating ideas from these disciplines into the language of modern mathematics, we can provide mathematicians with food for thought and interesting new problems to solve. We hope that in the long run mathematicians can return the favor by bringing new insights to the table.

Although we keep the broad applicability of network diagrams in the back of our minds, our paper talks in terms of electrical circuits, for the sake of familiarity. We also consider a somewhat limited class of circuits. We only study circuits built from ‘passive’ components: that is, those that do not produce energy. Thus, we exclude batteries and current sources. We only consider components that respond linearly to an applied voltage. Thus, we exclude components such as nonlinear resistors or diodes. Finally, we only consider components with one input and one output, so that a circuit can be described as a graph with edges labeled by components. Thus, we also exclude transformers. The most familiar components our framework covers are linear resistors, capacitors and inductors.

While we want to expand our scope in future work, the class of circuits made from these components has appealing mathematical properties, and is worthy of deep study. Indeed, these circuits have been studied intensively for many decades by electrical engineers. Even circuits made exclusively of resistors have inspired work by mathematicians of the caliber of Weyl and Smale!

Our work relies on this research. All we are adding is an emphasis on symplectic geometry and an explicitly ‘compositional’ framework, which clarifies the way a larger circuit can be built from smaller pieces. This is where monoidal categories become important: the main operations for building circuits from pieces are composition and tensoring.

Our strategy is most easily illustrated for circuits made of linear resistors. Such a resistor dissipates power, turning useful energy into heat at a rate determined by the voltage across the resistor. However, a remarkable fact is that a circuit made of these resistors always acts to minimize the power dissipated this way. This ‘principle of minimum power’ can be seen as the reason symplectic geometry becomes important in understanding circuits made of resistors, just as the principle of least action leads to the role of symplectic geometry in classical mechanics.

Here is a circuit made of linear resistors:

The wiggly lines are resistors, and their resistances are written beside them: for example, 3\Omega means 3 ohms, an ‘ohm’ being a unit of resistance. To formalize this, define a circuit of linear resistors to consist of:

• a set N of nodes,
• a set E of edges,
• maps s,t : E \to N sending each edge to its source and target node,
• a map r: E \to (0,\infty) specifying the resistance of the resistor
labelling each edge,
• maps i : X \to N, o : Y \to N specifying the inputs and outputs of the circuit.

When we run electric current through such a circuit, each node n \in N gets a potential \phi(n). The voltage across an edge e \in E is defined as the change in potential as we move from the source of e to its target, \phi(t(e)) - \phi(s(e)). The power dissipated by the resistor on this edge is then

\displaystyle{ \frac{1}{r(e)}\big(\phi(t(e))-\phi(s(e))\big)^2 }

The total power dissipated by the circuit is therefore twice

\displaystyle{ P(\phi) = \frac{1}{2}\sum_{e \in E} \frac{1}{r(e)}\big(\phi(t(e))-\phi(s(e))\big)^2 }

The factor of \frac{1}{2} is convenient in some later calculations.

Note that P is a nonnegative quadratic form on the vector space \mathbb{R}^N. However, not every nonnegative definite quadratic form on \mathbb{R}^N arises in this way from some circuit of linear resistors with N as its set of nodes. The quadratic forms that do arise are called Dirichlet forms. They have been extensively investigated, and they play a major role in our work.

We write

\partial N = i(X) \cup o(Y)

for the set of terminals: that is, nodes corresponding to inputs or outputs. The principle of minimum power says that if we fix the potential at the terminals, the circuit will choose the potential at other nodes to minimize the total power dissipated. An element \psi of the vector space \mathbb{R}^{\partial N} assigns a potential to each terminal. Thus, if we fix \psi, the total power dissipated will be twice

Q(\psi) = \min_{\substack{ \phi \in \mathbb{R}^N \\ \phi\vert_{\partial N} = \psi}} \; P(\phi)

The function Q : \mathbb{R}^{\partial N} \to \mathbb{R} is again a Dirichlet form. We call it the power functional of the circuit.

Now, suppose we are unable to see the internal workings of a circuit, and can only observe its ‘external behavior’: that is, the potentials at its terminals and the currents flowing into or out of these terminals. Remarkably, this behavior is completely determined by the power functional Q. The reason is that the current at any terminal can be obtained by differentiating Q with respect to the potential at this terminal, and relations of this form are all the relations that hold between potentials and currents at the terminals.
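
Here is a small numerical sketch of all this (illustrative code of mine, not from the paper). The power P(\phi) = \frac{1}{2}\phi^T G \phi is a quadratic form in the graph Laplacian G weighted by the conductances 1/r(e), and minimizing over the potentials at internal nodes amounts to taking a Schur complement on the terminals:

```python
import numpy as np

def power_functional(n_nodes, edges, terminals):
    """Return the matrix S with Q(psi) = (1/2) psi^T S psi.
    edges: list of (source, target, resistance)."""
    G = np.zeros((n_nodes, n_nodes))
    for s, t, r in edges:                 # weighted graph Laplacian
        g = 1.0 / r                       # conductance
        G[s, s] += g; G[t, t] += g
        G[s, t] -= g; G[t, s] -= g
    B = list(terminals)
    I = [v for v in range(n_nodes) if v not in terminals]
    S = G[np.ix_(B, B)]
    if I:                                 # minimize over internal potentials
        S = S - G[np.ix_(B, I)] @ np.linalg.solve(G[np.ix_(I, I)], G[np.ix_(I, B)])
    return S

# a hypothetical star circuit: three terminals joined to internal node 3
S = power_functional(4, [(0, 3, 1.0), (1, 3, 2.0), (2, 3, 3.0)], [0, 1, 2])
psi = np.array([1.0, 0.0, 0.0])           # potentials at the terminals
print(psi @ S @ psi / 2)                  # Q(psi); total power is twice this
print(S @ psi)                            # dQ: currents at the terminals
```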

The Laplace transform allows us to generalize this immediately to circuits that can also contain linear inductors and capacitors, simply by changing the field we work over, replacing \mathbb{R} by the field \mathbb{R}(s) of rational functions of a single real variable, and talking of impedance where we previously talked of resistance. We obtain a category \mathrm{Circ} where an object is a finite set, a morphism f : X \to Y is a circuit with input set X and output set Y, and composition is given by identifying the outputs of one circuit with the inputs of the next, and taking the resulting union of labelled graphs. Each such circuit gives rise to a Dirichlet form, now defined over \mathbb{R}(s), and this Dirichlet form completely describes the externally observable behavior of the circuit.

We can take equivalence classes of circuits, where two circuits count as the same if they have the same Dirichlet form. We wish for these equivalence classes of circuits to form a category. Although there is a notion of composition for Dirichlet forms, we find that it lacks identity morphisms or, equivalently, it lacks morphisms representing ideal wires of zero impedance. To address this we turn to Lagrangian subspaces of symplectic vector spaces. These generalize quadratic forms via the map

\Big(Q: \mathbb{F}^{\partial N} \to \mathbb{F}\Big) \longmapsto

\mathrm{Graph}(dQ) =    \{(\psi, dQ_\psi) \mid \psi \in \mathbb{F}^{\partial N} \} \; \subseteq \; \mathbb{F}^{\partial N} \oplus (\mathbb{F}^{\partial N})^\ast

taking a quadratic form Q on the vector space \mathbb{F}^{\partial N} over the field \mathbb{F} to the graph of its differential dQ. Here we think of the symplectic vector space \mathbb{F}^{\partial N} \oplus (\mathbb{F}^{\partial N})^\ast as the state space of the circuit, and the subspace \mathrm{Graph}(dQ) as the subspace of attainable states, with \psi \in \mathbb{F}^{\partial N} describing the potentials at the terminals, and dQ_\psi \in (\mathbb{F}^{\partial N})^\ast the currents.

This construction is well-known in classical mechanics, where the principle of least action plays a role analogous to that of the principle of minimum power here. The set of Lagrangian subspaces is actually an algebraic variety, the Lagrangian Grassmannian, which serves as a compactification of the space of quadratic forms. The Lagrangian Grassmannian has already played a role in Sabot’s work on circuits made of resistors. For us, its importance is that we can find identity morphisms for the composition of Dirichlet forms by taking circuits made of parallel resistors and letting their resistances tend to zero: the limit is not a Dirichlet form, but it exists in the Lagrangian Grassmannian.

Indeed, there exists a category \mathrm{LagrRel} with finite dimensional symplectic vector spaces as objects and Lagrangian relations as morphisms: that is, linear relations from V to W that are given by Lagrangian subspaces of \overline{V} \oplus W, where \overline{V} is the symplectic vector space conjugate to V—that is, with the sign of the symplectic structure switched.

To move from the Lagrangian subspace defined by the graph of the differential of the power functional to a morphism in the category \mathrm{LagrRel}—that is, to a Lagrangian relation— we must treat seriously the input and output functions of the circuit. These express the circuit as built upon a cospan:

Applicable far more broadly than this present formalization of circuits, cospans model systems with two ‘ends’, an input and output end, albeit without any connotation of directionality: we might just as well exchange the role of the inputs and outputs by taking the mirror image of the above diagram. The role of the input and output functions, as we have discussed, is to mark the terminals we may glue onto the terminals of another circuit, and the pushout of cospans gives formal precision to this gluing construction.

One upshot of this cospan framework is that we may consider circuits with elements of N that are both inputs and outputs, such as this one:

This corresponds to the identity morphism on the finite set with two elements. Another is that some points may be considered an input or output multiple times, like here:

This lets us connect two distinct outputs to the above double input.

Given a set X of inputs or outputs, we understand the electrical behavior on this set by considering the symplectic vector space \mathbb{F}^X \oplus {(\mathbb{F}^X)}^\ast, the direct sum of the space \mathbb{F}^X of potentials and the space {(\mathbb{F}^X)}^\ast of currents at these points. A Lagrangian relation specifies which states of the output space \mathbb{F}^Y \oplus {(\mathbb{F}^Y)}^\ast are allowed for each state of the input space \mathbb{F}^X \oplus {(\mathbb{F}^X)}^\ast. Turning the Lagrangian subspace \mathrm{Graph}(dQ) of a circuit into this information requires that we understand the ‘symplectification’

Sf: \mathbb{F}^B \oplus {(\mathbb{F}^B)}^\ast \to \mathbb{F}^A \oplus {(\mathbb{F}^A)}^\ast

and ‘twisted symplectification’

S^tf: \mathbb{F}^B \oplus {(\mathbb{F}^B)}^\ast \to \overline{\mathbb{F}^A \oplus {(\mathbb{F}^A)}^\ast}

of a function f: A \to B between finite sets. In particular we need to understand how these apply to the input and output functions with codomain restricted to \partial N; abusing notation, we also write these i: X \to \partial N and o: Y \to \partial N.

The symplectification Sf is a Lagrangian relation, and the catch phrase is that it ‘copies voltages’ and ‘splits currents’. More precisely, for any given potential-current pair (\psi,\iota) in \mathbb{F}^B \oplus {(\mathbb{F}^B)}^\ast, its image under Sf consists of all elements (\psi', \iota') in \mathbb{F}^A \oplus {(\mathbb{F}^A)}^\ast such that the potential at a \in A is equal to the potential at f(a) \in B, and such that, for each fixed b \in B, collectively the currents at the a \in f^{-1}(b) sum to the current at b. We use the symplectification So of the output function to relate the state on \partial N to that on the outputs Y.

As our current framework is set up to report the current out of each node, to describe input currents we define the twisted symplectification:

S^tf: \mathbb{F}^B \oplus {(\mathbb{F}^B)}^\ast \to \overline{\mathbb{F}^A \oplus {(\mathbb{F}^A)}^\ast}

almost identically to the above, except that we flip the sign of the currents \iota' \in (\mathbb{F}^A)^\ast. This again gives a Lagrangian relation. We use the twisted symplectification S^ti of the input function to relate the state on \partial N to that on the inputs.

The Lagrangian relation corresponding to a circuit then comprises exactly a list of the potential-current pairs that are possible electrical states of the inputs and outputs of the circuit. In doing so, it identifies distinct circuits. A simple example of this is the identification of a single 2-ohm resistor:

with two 1-ohm resistors in series:

Our inability to access the internal workings of a circuit in this representation inspires us to call this process black boxing: you should imagine encasing the circuit in an opaque black box, leaving only the terminals accessible. Fortunately, this information is enough to completely characterize the external behavior of a circuit, including how it interacts when connected with other circuits!

Put more precisely, the black boxing process is functorial: we can compute the black-boxed version of a circuit made of parts by computing the black-boxed versions of the parts and then composing them. In fact we shall prove that \mathrm{Circ} and \mathrm{LagrRel} are dagger compact categories, and the black box functor preserves all this extra structure:

Theorem. There exists a symmetric monoidal dagger functor, the black box functor

\blacksquare: \mathrm{Circ} \to \mathrm{LagrRel}

mapping a finite set X to the symplectic vector space \mathbb{F}^X \oplus (\mathbb{F}^X)^\ast it generates, and a circuit \big((N,E,s,t,r),i,o\big) to the Lagrangian relation

\bigcup_{v \in \mathrm{Graph}(dQ)} S^ti(v) \times So(v)      \subseteq \overline{\mathbb{F}^X \oplus (\mathbb{F}^X)^\ast} \oplus \mathbb{F}^Y \oplus (\mathbb{F}^Y)^\ast

where Q is the circuit’s power functional.

The goal of this paper is to prove and explain this result. The proof is more tricky than one might first expect, but our approach involves concepts that should be useful throughout the study of networks, such as ‘decorated cospans’ and ‘corelations’.

Give it a read, and let us know if you have questions or find mistakes!

Categories in Control

23 April, 2015

To understand ecosystems, ultimately will be to understand networks. – B. C. Patten and M. Witkamp

A while back I decided one way to apply my math skills to help save the planet was to start pushing toward green mathematics: a kind of mathematics that can interact with biology and ecology just as fruitfully as traditional mathematics interacts with physics. As usual with math, the payoffs will come slowly, but they may be large. It’s not a substitute for doing other, more urgent things—but if mathematicians don’t do this, who will?

As a first step in this direction, I decided to study networks.

This May, a small group of mathematicians is meeting in Turin for a workshop on the categorical foundations of network theory, organized by Jacob Biamonte. I’m trying to get us mentally prepared for this. We all have different ideas, yet they should fit together somehow.

Tobias Fritz, Eugene Lerman and David Spivak have all written articles here about their work, though I suspect Eugene will have a lot of completely new things to say, too. Now it’s time for me to say what my students and I have been doing.

Despite my ultimate aim of studying biological and ecological networks, I decided to start by clarifying the math of networks that appear in chemistry and engineering, since these are simpler, better understood, useful in their own right, and probably a good warmup for the grander goal. I’ve been working with Brendan Fong on electrical circuits, and with Jason Erbele on control theory. Let me talk about this paper:

• John Baez and Jason Erbele, Categories in control.

Control theory is the branch of engineering that focuses on manipulating open systems—systems with inputs and outputs—to achieve desired goals. In control theory, signal-flow diagrams are used to describe linear ways of manipulating signals, for example smooth real-valued functions of time. Here’s a real-world example; click the picture for more details:

For a category theorist, at least, it is natural to treat signal-flow diagrams as string diagrams in a symmetric monoidal category. This forces some small changes of perspective, which I’ll explain, but more important is the question: which symmetric monoidal category?

We argue that the answer is: the category \mathrm{FinRel}_k of finite-dimensional vector spaces over a certain field k, but with linear relations rather than linear maps as morphisms, and direct sum rather than tensor product providing the symmetric monoidal structure. We use the field k = \mathbb{R}(s) consisting of rational functions in one real variable s. This variable has the meaning of differentiation. A linear relation from k^m to k^n is thus a system of linear constant-coefficient ordinary differential equations relating m ‘input’ signals and n ‘output’ signals.

Our main goal in this paper is to provide a complete ‘generators and relations’ picture of this symmetric monoidal category, with the generators being familiar components of signal-flow diagrams. It turns out that the answer has an intriguing but mysterious connection to ideas that are familiar in the diagrammatic approach to quantum theory! Quantum theory also involves linear algebra, but it uses linear maps between Hilbert spaces as morphisms, and the tensor product of Hilbert spaces provides the symmetric monoidal structure.

We hope that the category-theoretic viewpoint on signal-flow diagrams will shed new light on control theory. However, in this paper we only lay the groundwork.

Signal flow diagrams

There are several basic operations that one wants to perform when manipulating signals. The simplest is multiplying a signal by a scalar. A signal can be amplified by a constant factor:

f \mapsto cf

where c \in \mathbb{R}. We can write this as a string diagram:

Here the labels f and c f on top and bottom are just for explanatory purposes and not really part of the diagram. Control theorists often draw arrows on the wires, but this is unnecessary from the string diagram perspective. Arrows on wires are useful to distinguish objects from their duals, but ultimately we will obtain a compact closed category where each object is its own dual, so the arrows can be dropped. What we really need is for the box denoting scalar multiplication to have a clearly defined input and output. This is why we draw it as a triangle. Control theorists often use a rectangle or circle, using arrows on wires to indicate which carries the input f and which the output c f.

A signal can also be integrated with respect to the time variable:

f \mapsto \int f

Mathematicians typically take differentiation as fundamental, but engineers sometimes prefer integration, because it is more robust against small perturbations. In the end it will not matter much here. We can again draw integration as a string diagram:

Since this looks like the diagram for scalar multiplication, it is natural to extend \mathbb{R} to \mathbb{R}(s), the field of rational functions of a variable s which stands for differentiation. Then differentiation becomes a special case of scalar multiplication, namely multiplication by s, and integration becomes multiplication by 1/s. Engineers accomplish the same effect with Laplace transforms, since differentiating a signal f is equivalent to multiplying its Laplace transform

\displaystyle{  (\mathcal{L}f)(s) = \int_0^\infty f(t) e^{-st} \,dt  }

by the variable s. Another option is to use the Fourier transform: differentiating f is equivalent to multiplying its Fourier transform

\displaystyle{   (\mathcal{F}f)(\omega) = \int_{-\infty}^\infty f(t) e^{-i\omega t}\, dt  }

by -i\omega. Of course, the function f needs to be sufficiently well-behaved to justify calculations involving its Laplace or Fourier transform. At a more basic level, it also requires some work to treat integration as the two-sided inverse of differentiation. Engineers do this by considering signals that vanish for t < 0, and choosing the antiderivative that vanishes under the same condition. Luckily all these issues can be side-stepped in a formal treatment of signal-flow diagrams: we can simply treat signals as living in an unspecified vector space over the field \mathbb{R}(s). The field \mathbb{C}(s) would work just as well, and control theory relies heavily on complex analysis. In our paper we work over an arbitrary field k.

The simplest possible signal processor is a rock, which takes the ‘input’ given by the force F on the rock and produces as ‘output’ the rock’s position q. Thanks to Newton’s second law F=ma, we can describe this using a signal-flow diagram:

Here composition of morphisms is drawn in the usual way, by attaching the output wire of one morphism to the input wire of the next.
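
In other words, in Laplace-transform language the rock is just scalar multiplication by 1/(ms^2): divide by the mass and integrate twice. A tiny sympy sketch (illustrative only):

```python
import sympy

s, m = sympy.symbols('s m', positive=True)
integ = 1/s                   # integration = multiplication by 1/s
scale = 1/m                   # scalar multiplication by 1/m
rock = scale * integ * integ  # composing morphisms = multiplying in R(s)
print(rock)                   # 1/(m*s**2): force in, position out
```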

To build more interesting machines we need more building blocks, such as addition:

+ : (f,g) \mapsto f + g

and duplication:

\Delta :  f \mapsto (f,f)

When these linear maps are written as matrices, their matrices are transposes of each other. This is reflected in the string diagrams for addition and duplication:

The second is essentially an upside-down version of the first. However, we draw addition as a dark triangle and duplication as a light one because we will later want another way to ‘turn addition upside-down’ that does not give duplication. As an added bonus, a light upside-down triangle resembles the Greek letter \Delta, the usual symbol for duplication.

While they are typically not considered worthy of mention in control theory, for completeness we must include two other building blocks. One is the zero map from the zero-dimensional vector space \{0\} to our field k, which we denote as 0 and draw as follows:

The other is the zero map from k to \{0\}, sometimes called ‘deletion’, which we denote as ! and draw thus:

Just as the matrices for addition and duplication are transposes of each other, so are the matrices for zero and deletion, though they are rather degenerate, being 1 \times 0 and 0 \times 1 matrices, respectively. Addition and zero make k into a commutative monoid, meaning that the following relations hold:

The equation at right is the commutative law, and the crossing of strands is the braiding:

B : (f,g) \mapsto (g,f)

by which we switch two signals. In fact this braiding is a symmetry, so it does not matter which strand goes over which:

Dually, duplication and deletion make k into a cocommutative comonoid. This means that if we reflect the equations obeyed by addition and zero across the horizontal axis and turn dark operations into light ones, we obtain another set of valid equations:

There are also relations between the monoid and comonoid operations. For example, adding two signals and then duplicating the result gives the same output as duplicating each signal and then adding the results:

This diagram is familiar in the theory of Hopf algebras, or more generally bialgebras. Here it is an example of the fact that the monoid operations on k are comonoid homomorphisms—or equivalently, the comonoid operations are monoid homomorphisms.

We summarize this situation by saying that k is a bimonoid. These are all the bimonoid laws, drawn as diagrams:

The last equation means we can actually make the diagram at left disappear, since it equals the identity morphism on the 0-dimensional vector space, which is drawn as nothing.

So far all our string diagrams denote linear maps. We can treat these as morphisms in the category \mathrm{FinVect}_k, where objects are finite-dimensional vector spaces over a field k and morphisms are linear maps. This category is equivalent to the category where the only objects are vector spaces k^n for n \ge 0, and then morphisms can be seen as n \times m matrices. The space of signals is a vector space V over k which may not be finite-dimensional, but this does not cause a problem: an n \times m matrix with entries in k still defines a linear map from V^m to V^n in a functorial way.

In applications of string diagrams to quantum theory, we make \mathrm{FinVect}_k into a symmetric monoidal category using the tensor product of vector spaces. In control theory, we instead make \mathrm{FinVect}_k into a symmetric monoidal category using the direct sum of vector spaces. In Lemma 1 of our paper we prove that for any field k, \mathrm{FinVect}_k with direct sum is generated as a symmetric monoidal category by the one object k together with these morphisms:

where c \in k is arbitrary.

However, these generating morphisms obey some unexpected relations! For example, we have:

It is thus important to find a complete set of relations obeyed by these generating morphisms, so as to obtain a presentation of \mathrm{FinVect}_k as a symmetric monoidal category. We do this in Theorem 2. In brief, these relations say:

(1) (k, +, 0, \Delta, !) is a bicommutative bimonoid;

(2) the rig operations of k can be recovered from the generating morphisms;

(3) all the generating morphisms commute with scalar multiplication.

Here item (2) means that +, \cdot, 0 and 1 in the field k can be expressed in terms of signal-flow diagrams as follows:

Multiplicative inverses cannot be so expressed, so our signal-flow diagrams so far do not know that k is a field. Additive inverses also cannot be expressed in this way. So, we expect that a version of Theorem 2 will hold whenever k is a mere rig: that is, a ‘ring without negatives’, like the natural numbers. The one change is that instead of working with vector spaces, we should work with finitely generated free k-modules.

Item (3), the fact that all our generating morphisms commute with scalar multiplication, amounts to these diagrammatic equations:

While Theorem 2 is a step towards understanding the category-theoretic underpinnings of control theory, it does not treat signal-flow diagrams that include ‘feedback’. Feedback is one of the most fundamental concepts in control theory because a control system without feedback may be highly sensitive to disturbances or unmodeled behavior. Feedback allows these uncontrolled behaviors to be mollified. As a string diagram, a basic feedback system might look schematically like this:

The user inputs a ‘reference’ signal, which is fed into a controller, whose output is fed into a system, which control theorists call a ‘plant’, and which in turn produces its own output. But then the system’s output is duplicated, and one copy is fed into a sensor, whose output is added to (or, if we prefer, subtracted from) the reference signal.
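In classical transfer-function language, this whole loop collapses to a single scalar in \mathbb{R}(s). Here is a sketch of the standard closed-loop formula, with arbitrary example choices of plant, controller and sensor, using the subtraction convention (my own illustration, not from the paper):

```python
from sympy import symbols, simplify

s = symbols('s')

P = 1 / s  # plant: an integrator (arbitrary example)
C = 10     # controller: proportional gain (arbitrary example)
S = 1      # sensor: perfect measurement (arbitrary example)

# The loop equation y = P*C*(r - S*y), solved for y, gives the
# closed-loop transfer function from the reference r to the output y:
closed_loop = simplify(P * C / (1 + P * C * S))
print(closed_loop)  # 10/(s + 10)
```

Note how feedback turns the bare integrator 1/s, which drifts without bound, into the stable response 10/(s + 10).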

In string diagrams—unlike in the usual thinking on control theory—it is essential to be able to read any diagram from top to bottom as a composite of tensor products of generating morphisms. Thus, to incorporate the idea of feedback, we need two more generating morphisms. These are the ‘cup’:

and ‘cap’:

These are not maps: they are relations. The cup imposes the relation that its two inputs be equal, while the cap does the same for its two outputs. This is a way of describing how a signal flows around a bend in a wire.

To make this precise, we use a category called \mathrm{FinRel}_k. An object of this category is a finite-dimensional vector space over k, while a morphism from U to V, denoted L : U \rightharpoonup V, is a linear relation, meaning a linear subspace

L \subseteq U \oplus V

In particular, when k = \mathbb{R}(s), a linear relation L : k^m \rightharpoonup k^n is just an arbitrary system of constant-coefficient linear ordinary differential equations relating m input variables and n output variables.

Since the direct sum U \oplus V is also the cartesian product of U and V, a linear relation is indeed a relation in the usual sense, but with the property that if u \in U is related to v \in V and u' \in U is related to v' \in V then cu + c'u' is related to cv + c'v' whenever c,c' \in k.

We compose linear relations L : U \rightharpoonup V and L' : V \rightharpoonup W as follows:

L'L = \{(u,w) \colon \; \exists\; v \in V \;\; (u,v) \in L \textrm{ and } (v,w) \in L'\}

Any linear map f : U \to V gives a linear relation F : U \rightharpoonup V, namely the graph of that map:

F = \{ (u,f(u)) : u \in U \}

Composing linear maps thus becomes a special case of composing linear relations, so \mathrm{FinVect}_k becomes a subcategory of \mathrm{FinRel}_k. Furthermore, we can make \mathrm{FinRel}_k into a monoidal category using direct sums, and it becomes symmetric monoidal using the braiding already present in \mathrm{FinVect}_k.
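Composition of linear relations is easy to implement if we present a relation L \subseteq U \oplus V by a matrix whose columns span it. Here is a sketch using sympy (my own illustration, not code from the paper; the columns of the result span the composite, possibly redundantly):

```python
from sympy import Matrix

def compose(L1, dimU, dimV, L2, dimW):
    """Compose L1 <= U (+) V with L2 <= V (+) W, each spanned by the
    columns of the given matrix; returns a matrix spanning L2 L1."""
    A_U, A_V = L1[:dimU, :], L1[dimU:, :]
    B_V, B_W = L2[:dimV, :], L2[dimV:, :]
    # coefficient pairs (a, b) whose middle parts agree: A_V a = B_V b
    pairs = Matrix.hstack(A_V, -B_V).nullspace()
    cols = [Matrix.vstack(A_U * p[:L1.cols, 0], B_W * p[L1.cols:, 0])
            for p in pairs]
    return Matrix.hstack(*cols) if cols else Matrix.zeros(dimU + dimW, 0)

# Delta : k -|> k^2 is spanned by (1,1,1); + : k^2 -|> k by (1,0,1), (0,1,1).
dup = Matrix([1, 1, 1])
add = Matrix([[1, 0], [0, 1], [1, 1]])
print(compose(dup, 1, 2, add, 1))  # Matrix([[1], [2]]): multiplication by 2
```

Composing duplication with addition indeed gives the graph of multiplication by 2, matching the diagrammatic calculation.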

In these terms, the cup is the linear relation

\cup : k^2 \rightharpoonup \{0\}

given by

\cup \; = \; \{ (x,x,0) : x \in k   \} \; \subseteq \; k^2 \oplus \{0\}

while the cap is the linear relation

\cap : \{0\} \rightharpoonup k^2

given by

\cap \; = \; \{ (0,x,x) : x \in k   \} \; \subseteq \; \{0\} \oplus k^2

These obey the zigzag relations:

Thus, they make \mathrm{FinRel}_k into a compact closed category where k, and thus every object, is its own dual.

Besides feedback, one of the things that make the cap and cup useful is that they allow any morphism L : U \rightharpoonup V to be ‘plugged in backwards’ and thus ‘turned around’. For instance, turning around integration:

we obtain differentiation. In general, using caps and cups we can turn around any linear relation L : U \rightharpoonup V and obtain a linear relation L^\dagger : V \rightharpoonup U, called the adjoint of L, which turns out to be given by

L^\dagger = \{(v,u) : (u,v) \in L \}

For example, if c \in k is nonzero, the adjoint of scalar multiplication by c is multiplication by c^{-1}:

Thus, caps and cups allow us to express multiplicative inverses in terms of signal-flow diagrams! One might think that a problem arises when c = 0, but no: the adjoint of scalar multiplication by 0 is

\{(0,x) : x \in k \} \subseteq k \oplus k
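In the spanning-matrix representation of the earlier sketch, taking the adjoint is simply swapping the two blocks of rows. Continuing that illustration:

```python
from sympy import Matrix

def adjoint(L, dimU):
    """Adjoint of a relation L <= U (+) V: swap the U- and V-blocks of rows."""
    return Matrix.vstack(L[dimU:, :], L[:dimU, :])

mult_by_0 = Matrix([1, 0])    # graph of multiplication by 0: {(x, 0)}
print(adjoint(mult_by_0, 1))  # Matrix([[0], [1]]): the relation {(0, x)}
```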

In Lemma 3 we show that \mathrm{FinRel}_k is generated, as a symmetric monoidal category, by these morphisms:

where c \in k is arbitrary.

In Theorem 4 we find a complete set of relations obeyed by these generating morphisms, thus giving a presentation of \mathrm{FinRel}_k as a symmetric monoidal category. To describe these relations, it is useful to work with adjoints of the generating morphisms. We have already seen that the adjoint of scalar multiplication by c is scalar multiplication by c^{-1}, except when c = 0. Taking adjoints of the other four generating morphisms of \mathrm{FinVect}_k, we obtain four important but perhaps unfamiliar linear relations. We draw these as ‘turned around’ versions of the original generating morphisms:

Coaddition is a linear relation from k to k^2 that holds when the two outputs sum to the input:

+^\dagger : k \rightharpoonup k^2

+^\dagger = \{(x,y,z)  : \; x = y + z \} \subseteq k \oplus k^2

Cozero is a linear relation from k to \{0\} that holds when the input is zero:

0^\dagger : k \rightharpoonup \{0\}

0^\dagger = \{ (0,0)\} \subseteq k \oplus \{0\}

Coduplication is a linear relation from k^2 to k that holds when the two inputs both equal the output:

\Delta^\dagger : k^2 \rightharpoonup k

\Delta^\dagger = \{(x,y,z)  : \; x = y = z \} \subseteq k^2 \oplus k

Codeletion is a linear relation from \{0\} to k that holds always:

!^\dagger : \{0\} \rightharpoonup k

!^\dagger = \{(0,x) : x \in k \} \subseteq \{0\} \oplus k

Since +^\dagger,0^\dagger,\Delta^\dagger and !^\dagger automatically obey turned-around versions of the relations obeyed by +,0,\Delta and !, we see that k acquires a second bicommutative bimonoid structure when considered as an object in \mathrm{FinRel}_k.

Moreover, the four dark operations make k into a Frobenius monoid. This means that (k,+,0) is a monoid, (k,+^\dagger, 0^\dagger) is a comonoid, and the Frobenius relation holds:

All three expressions in this equation are linear relations saying that the sum of the two inputs equals the sum of the two outputs.

The operation sending each linear relation to its adjoint extends to a contravariant functor

\dagger : \mathrm{FinRel}_k \to \mathrm{FinRel}_k

which obeys a list of properties that are summarized by saying that \mathrm{FinRel}_k is a †-compact category. Because two of the operations in the Frobenius monoid (k, +,0,+^\dagger,0^\dagger) are adjoints of the other two, it is a †-Frobenius monoid.

This Frobenius monoid is also special, meaning that comultiplication (in this case +^\dagger) followed by multiplication (in this case +) equals the identity:

This Frobenius monoid is also commutative—and cocommutative, but for Frobenius monoids this follows from commutativity.

Starting around 2008, commutative special †-Frobenius monoids have become important in the categorical foundations of quantum theory, where they can be understood as ‘classical structures’ for quantum systems. The category \mathrm{FinHilb} of finite-dimensional Hilbert spaces and linear maps is a †-compact category, where any linear map f : H \to K has an adjoint f^\dagger : K \to H given by

\langle f^\dagger \phi, \psi \rangle = \langle \phi, f \psi \rangle

for all \psi \in H, \phi \in K . A commutative special †-Frobenius monoid in \mathrm{FinHilb} is then the same as a Hilbert space with a chosen orthonormal basis. The reason is that given an orthonormal basis \psi_i for a finite-dimensional Hilbert space H, we can make H into a commutative special †-Frobenius monoid with multiplication m : H \otimes H \to H given by

m (\psi_i \otimes \psi_j) = \left\{ \begin{array}{cl} \psi_i & i = j \\ 0 & i \ne j \end{array} \right.

and unit i : \mathbb{C} \to H given by

i(1) = \sum_i \psi_i

The comultiplication m^\dagger duplicates basis states:

m^\dagger(\psi_i) = \psi_i \otimes \psi_i

Conversely, any commutative special †-Frobenius monoid in \mathrm{FinHilb} arises this way.

Considerably earlier, around 1995, commutative Frobenius monoids were recognized as important in topological quantum field theory. The reason, ultimately, is that the free symmetric monoidal category on a commutative Frobenius monoid is 2\mathrm{Cob}, the category with 2-dimensional oriented cobordisms as morphisms. But the free symmetric monoidal category on a commutative special Frobenius monoid was worked out even earlier: it is the category with finite sets as objects, where a morphism f : X \to Y is an isomorphism class of cospans

X \longrightarrow S \longleftarrow Y

This category can be made into a †-compact category in an obvious way, and then the 1-element set becomes a commutative special †-Frobenius monoid.

For all these reasons, it is interesting to find a commutative special †-Frobenius monoid lurking at the heart of control theory! However, the Frobenius monoid here has yet another property, which is more unusual. Namely, the unit 0 : \{0\} \rightharpoonup k followed by the counit 0^\dagger : k \rightharpoonup \{0\} is the identity:

We call a special Frobenius monoid that also obeys this extra law extra-special. One can check that the free symmetric monoidal category on a commutative extra-special Frobenius monoid is the category with finite sets as objects, where a morphism f : X \to Y is an equivalence relation on the disjoint union X \sqcup Y, and we compose f : X \to Y and g : Y \to Z by letting f and g generate an equivalence relation on X \sqcup Y \sqcup Z and then restricting this to X \sqcup Z.
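This composition of equivalence relations is exactly what a union–find data structure computes. Here is a sketch (mine, with made-up identifications), where elements are tagged by which set they come from, to keep the disjoint union honest:

```python
class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

def compose(f, g, X, Z):
    """f identifies elements of X | Y, g of Y | Z; generate the joint
    equivalence relation on X | Y | Z and restrict it to X | Z."""
    uf = UnionFind()
    for a, b in f + g:
        uf.union(a, b)
    ends = [('X', x) for x in X] + [('Z', z) for z in Z]
    return {(a, b) for a in ends for b in ends
            if a < b and uf.find(a) == uf.find(b)}

# X = {1}, Y = {1, 2}, Z = {1}: link X's 1 to Y's 1 to Y's 2, then to Z's 1.
f = [(('X', 1), ('Y', 1)), (('Y', 1), ('Y', 2))]
g = [(('Y', 2), ('Z', 1))]
print(compose(f, g, X=[1], Z=[1]))  # {(('X', 1), ('Z', 1))}
```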

As if this were not enough, the light operations share many properties with the dark ones. In particular, these operations make k into a commutative extra-special †-Frobenius monoid in a second way. In summary:

(k, +, 0, \Delta, !) is a bicommutative bimonoid;

(k, \Delta^\dagger, !^\dagger, +^\dagger, 0^\dagger) is a bicommutative bimonoid;

(k, +, 0, +^\dagger, 0^\dagger) is a commutative extra-special †-Frobenius monoid;

(k, \Delta^\dagger, !^\dagger, \Delta, !) is a commutative extra-special †-Frobenius monoid.

It should be no surprise that with all these structures built in, signal-flow diagrams are a powerful method of designing processes.

However, it is surprising that most of these structures are present in a seemingly very different context: the so-called ZX calculus, a diagrammatic formalism for working with complementary observables in quantum theory. This arises naturally when one has an n-dimensional Hilbert space H with two orthonormal bases \psi_i, \phi_i that are mutually unbiased, meaning that

|\langle \psi_i, \phi_j \rangle|^2 = \displaystyle{\frac{1}{n}}

for all 1 \le i, j \le n. Each orthonormal basis makes H into a commutative special †-Frobenius monoid in \mathrm{FinHilb}. Moreover, the multiplication and unit of either one of these Frobenius monoids fits together with the comultiplication and counit of the other to form a bicommutative bimonoid. So, we have all the structure present in the list above—except that these Frobenius monoids are only extra-special if H is 1-dimensional.

The field k is also a 1-dimensional vector space, but this is a red herring: in \mathrm{FinRel}_k every finite-dimensional vector space naturally acquires all four structures listed above, since addition, zero, duplication and deletion are well-defined and obey all the relations we have discussed. Jason and I focus on k in our paper simply because it generates all the objects of \mathrm{FinRel}_k via direct sum.

Finally, in \mathrm{FinRel}_k the cap and cup are related to the light and dark operations as follows:

Note the curious factor of -1 in the second equation, which breaks some of the symmetry we have seen so far. This equation says that two elements x, y \in k sum to zero if and only if -x = y. Using the zigzag relations, the two equations above give

We thus see that in \mathrm{FinRel}_k, both additive and multiplicative inverses can be expressed in terms of the generating morphisms used in signal-flow diagrams.

Theorem 4 of our paper gives a presentation of \mathrm{FinRel}_k based on the ideas just discussed. Briefly, it says that \mathrm{FinRel}_k is equivalent to the symmetric monoidal category generated by an object k and these morphisms:

• addition +: k^2 \rightharpoonup k
• zero 0 : \{0\} \rightharpoonup k
• duplication \Delta: k\rightharpoonup k^2
• deletion ! : k \rightharpoonup \{0\}
• scalar multiplication c: k\rightharpoonup k for any c\in k
• cup \cup : k^2 \rightharpoonup \{0\}
• cap \cap : \{0\} \rightharpoonup k^2

obeying these relations:

(1) (k, +, 0, \Delta, !) is a bicommutative bimonoid;

(2) \cap and \cup obey the zigzag equations;

(3) (k, +, 0, +^\dagger, 0^\dagger) is a commutative extra-special †-Frobenius monoid;

(4) (k, \Delta^\dagger, !^\dagger, \Delta, !) is a commutative extra-special †-Frobenius monoid;

(5) the field operations of k can be recovered from the generating morphisms;

(6) the generating morphisms (1)-(4) commute with scalar multiplication.

Note that item (2) makes \mathrm{FinRel}_k into a †-compact category, allowing us to mention the adjoints of generating morphisms in the subsequent relations. Item (5) means that +, \cdot, 0, 1 and also additive and multiplicative inverses in the field k can be expressed in terms of signal-flow diagrams in the manner we have explained.

So, we have a good categorical understanding of the linear algebra used in signal-flow diagrams!

Now Jason is moving ahead to apply this to some interesting problems… but that’s another story, for later.

Resource Convertibility (Part 3)

13 April, 2015

guest post by Tobias Fritz

In Part 1 and Part 2, we learnt about ordered commutative monoids and how they formalize theories of resource convertibility and combinability. In this post, I would like to say a bit about the applications that have been explored so far. First, the study of resource theories has become a popular subject in quantum information theory, and many of the ideas in my paper actually originate there. I’ll list some references at the end. So I hope that the toolbox of ordered commutative monoids will turn out to be useful for this. But here I would like to talk about an example application that is much easier to understand, but no less difficult to analyze: graph theory and the resource theory of zero-error communication.

A graph consists of a bunch of nodes connected by a bunch of edges, for example like this:

This particular graph is the pentagon graph or 5-cycle. To give it some resource-theoretic interpretation, think of it as the distinguishability graph of a communication channel, where the nodes are the symbols that can be sent across the channel, and two symbols share an edge if and only if they can be unambiguously decoded. For example, the pentagon graph roughly corresponds to the distinguishability graph of my handwriting, when restricted to five letters only:

So my ‘w’ is distinguishable from my ‘u’, but it may be confused for my ‘m’. In order to communicate unambiguously, it looks like I should restrict myself to using only two of those letters in writing, since any third letter may be mistaken for one of the other two. But alternatively, I could use a block code to create context around each letter which allows for perfect disambiguation. This is what happens in practice: I write in natural language, where an entire word is usually not ambiguous.

One can now also consider graph homomorphisms, which are maps like this:

The numbers on the nodes indicate where each node on the left gets mapped to. Formally, a graph homomorphism is a function taking nodes to nodes such that adjacent nodes get mapped to adjacent nodes. If a homomorphism G\to H exists between graphs G and H, then we also write H\geq G; in terms of communication channels, we can interpret this as saying that H simulates G, since the homomorphism provides a map between the symbols which preserves distinguishability. A ‘code’ for a communication channel is then just a homomorphism from the complete graph in which all nodes share an edge to the graph which describes the channel. With this ordering structure, the collection of all finite graphs forms an ordered set. This ordered set has an intricate structure which is intimately related to some big open problems in graph theory.

We can also combine two communication channels to form a compound one. Going back to the handwriting example, we can consider the new channel in which the symbols are pairs of letters. Two such pairs are distinguishable if and only if either the first letters of each pair are distinguishable or the second letters are,

(a,b) \sim (a',b') \:\Leftrightarrow\: a\sim a' \:\lor\: b\sim b'

When generalized to arbitrary graphs, this yields the definition of the disjunctive product of graphs. It is not hard to show that this equips the ordered set of graphs with a binary operation compatible with the ordering, so that we obtain an ordered commutative monoid denoted Grph. It mathematically formalizes the resource theory of zero-error communication.
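For the concretely minded, here is a small sketch of the disjunctive product (my own illustration, not from the paper, with graphs given as vertex lists and sets of 2-element edge sets):

```python
from itertools import combinations

def disjunctive_product(V1, E1, V2, E2):
    """(a, b) ~ (a2, b2) iff a ~ a2 in the first graph or b ~ b2 in the second."""
    V = [(a, b) for a in V1 for b in V2]
    E = {frozenset((p, q)) for p, q in combinations(V, 2)
         if frozenset((p[0], q[0])) in E1 or frozenset((p[1], q[1])) in E2}
    return V, E

# The pentagon C5: vertices 0..4, edges between cyclically adjacent vertices
V5 = list(range(5))
E5 = {frozenset((i, (i + 1) % 5)) for i in V5}

V, E = disjunctive_product(V5, E5, V5, E5)
print(len(V), len(E))  # 25 200
```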

Using the toolbox of ordered commutative monoids combined with some concrete computations on graphs, one can show that Grph is not cancellative: if K_{11} is the complete graph on 11 nodes, then 3C_5\not\geq K_{11}, but there exists a graph G such that

3 C_5 + G \geq K_{11} + G

The graph G turns out to have 136 nodes. This result seems to be new. But if you happen to have seen something like this before, please let me know!

Last time, we also talked about rates of conversion. In Grph, it turns out that some of these correspond to famous graph invariants! For example, the rate of conversion from a graph G to the single-edge graph K_2 is the Shannon capacity \Theta(\overline{G}), where \overline{G} is the complement graph. This is no surprise, since \Theta was originally defined by Shannon with precisely this rate in mind, although he did not use the language of ordered commutative monoids. In any case, the Shannon capacity \Theta(\overline{G}) is a graph invariant notorious for its complexity: it is not known whether there exists an algorithm to compute it! But an application of the Rate Theorem from Part 2 gives us a formula for the Shannon capacity:

\Theta(\overline{G}) = \inf_f f(G)

where f ranges over all graph invariants which are monotone under graph homomorphisms, multiplicative under the disjunctive product, and normalized such that f(K_2) = 2. Unfortunately, this formula still does not produce an algorithm for computing \Theta. But it nonconstructively proves the existence of many new graph invariants f which approximate the Shannon capacity from above.
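As a concrete instance, Shannon’s classical lower bound \Theta(C_5) \geq \sqrt{5} comes from five mutually distinguishable symbol pairs in the disjunctive square of the pentagon (note that C_5 is its own complement). Reusing V, E and combinations from the sketch above, one can verify his code directly:

```python
# Shannon's code in the disjunctive square of C5: five pairwise
# distinguishable pairs, i.e. a homomorphism K_5 -> C5 v C5.
code = [(i, 2 * i % 5) for i in range(5)]
assert all(frozenset((p, q)) in E for p, q in combinations(code, 2))
print(code)  # [(0, 0), (1, 2), (2, 4), (3, 1), (4, 3)]
```

So two uses of the pentagon channel can carry five distinguishable messages, giving a rate of \sqrt{5} per use; Lovász later showed this is optimal.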

Although my story ends here, I also feel that the whole project has barely started. There are lots of directions to explore! For example, it would be great to fit Shannon’s noisy channel coding theorem into this framework, but this has turned out to be technically challenging. If you happen to be familiar with rate-distortion theory and you want to help out, please get in touch!


Here is a haphazard selection of references on resource theories in quantum information theory and related fields:

• Igor Devetak, Aram Harrow and Andreas Winter, A resource framework for quantum Shannon theory.

• Gilad Gour, Markus P. Müller, Varun Narasimhachar, Robert W. Spekkens and Nicole Yunger Halpern, The resource theory of informational nonequilibrium in thermodynamics.

• Fernando G.S.L. Brandão, Michał Horodecki, Nelly Huei Ying Ng, Jonathan Oppenheim and Stephanie Wehner, The second laws of quantum thermodynamics.

• Iman Marvian and Robert W. Spekkens, The theory of manipulations of pure state asymmetry: basic tools and equivalence classes of states under symmetric operations.

• Elliott H. Lieb and Jakob Yngvason, The physics and mathematics of the second law of thermodynamics.

Resource Convertibility (Part 2)

10 April, 2015

guest post by Tobias Fritz

In Part 1, I introduced ordered commutative monoids as a mathematical formalization of resources and their convertibility. Today I’m going to say something about what to do with this formalization. Let’s start with a quick recap!

Definition: An ordered commutative monoid is a set A equipped with a binary relation \geq, a binary operation +, and a distinguished element 0 such that the following hold:

• + and 0 equip A with the structure of a commutative monoid;

• \geq equips A with the structure of a partially ordered set;

• addition is monotone: if x\geq y, then also x + z \geq y + z.

Recall also that we think of the x,y\in A as resource objects such that x+y represents the object consisting of x and y together, and x\geq y means that the resource object x can be converted into y.

When confronted with an abstract definition like this, many people ask: so what is it useful for? The answer to this is twofold: first, it provides a language which we can use to guide our thoughts in any application context. Second, the definition itself is just the very start: we can now also prove theorems about ordered commutative monoids, which can be instantiated in any particular application context. So the theory of ordered commutative monoids will provide a useful toolbox for talking about concrete resource theories and studying them. In the remainder of this post, I’d like to say a bit about what this toolbox contains. For more, you’ll have to read the paper!

To start, let’s consider catalysis as one of the resource-theoretic phenomena neatly captured by ordered commutative monoids. Catalysis is the phenomenon that certain conversions become possible only due to the presence of a catalyst, which is an additional resource object which does not get consumed in the process of the conversion. For example, we have

\text{timber + nails}\not\geq \text{table},

\text{timber + nails + saw + hammer} \geq \text{table + saw + hammer}

because making a table from timber and nails requires a saw and a hammer as tools. So in this example, ‘saw + hammer’ is a catalyst for the conversion of ‘timber + nails’ into ‘table’. In mathematical language, catalysis occurs precisely when the ordered commutative monoid is not cancellative, which means that x + z\geq y + z sometimes holds even though x\geq y does not. So, the notion of catalysis perfectly matches up with a very natural and familiar notion from algebra.
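Here is a toy implementation of this carpentry example (a sketch of mine, not from the paper; the ordering is generated by one rewrite rule together with the monoid structure, and the search is depth-bounded):

```python
from collections import Counter

# Resource objects are multisets of goods; + is multiset union.
RULE = (Counter({'timber': 1, 'nails': 1, 'saw': 1, 'hammer': 1}),
        Counter({'table': 1, 'saw': 1, 'hammer': 1}))

def converts(x, y, rules=(RULE,), steps=3):
    """Semi-decide x >= y in the order generated by the rules."""
    if x == y:
        return True
    if steps == 0:
        return False
    return any(all(x[k] >= lhs[k] for k in lhs)
               and converts(x - lhs + rhs, y, rules, steps - 1)
               for lhs, rhs in rules)

timber_nails = Counter({'timber': 1, 'nails': 1})
tools = Counter({'saw': 1, 'hammer': 1})
table = Counter({'table': 1})

print(converts(timber_nails, table))                  # False: no tools
print(converts(timber_nails + tools, table + tools))  # True: catalysis
```

The two outputs exhibit non-cancellativity exactly as defined above: adding the catalyst z = saw + hammer to both sides turns a non-conversion into a conversion.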

One can continue along these lines and study those ordered commutative monoids which are cancellative. It turns out that every ordered commutative monoid can be made cancellative in a universal way; in the resource-theoretic interpretation, this boils down to replacing the convertibility relation by catalytic convertibility, in which x is declared to be convertible into y as soon as there exists a catalyst which achieves this conversion. Making an ordered commutative monoid cancellative like this is a kind of ‘regularization’: it leads to a mathematically more well-behaved structure. As it turns out, there are several additional steps of regularization that can be performed, and all of these are both mathematically natural and have an appealing resource-theoretic interpretation. These regularizations successively take us from the world of ordered commutative monoids to the realm of linear algebra and functional analysis, where powerful theorems are available. For now, let me not go into the details, but only try to summarize one of the consequences of this development. This requires a bit of preparation.

In many situations, it is not just of interest to convert a single copy of some resource object x into a single copy of some y; instead, one may be interested in converting many copies of x into many copies of y all together, and thereby maximizing (or minimizing) the ratio of the resulting number of y's compared to the number of x's that get consumed. This ratio is measured by the maximal rate:

\displaystyle{ R_{\mathrm{max}}(x\to y) = \sup \left\{ \frac{m}{n} \:|\: nx \geq my \right\} }

Here, m and n are natural numbers, and nx stands for the n-fold sum x+\cdots+x, and similarly for my. So this maximal rate quantifies how many y's we can get out of one copy of x, when working in a ‘mass production’ setting. There is also a notion of regularized rate, which has a slightly more complicated definition that I don’t want to spell out here, but is similar in spirit. The toolbox of ordered commutative monoids now provides the following result:

Rate Theorem: If x\geq 0 and y\geq 0 in an ordered commutative monoid A which satisfies a mild technical assumption, then the maximal regularized rate from x to y can be computed like this:

\displaystyle{ R_{\mathrm{max}}^{\mathrm{reg}}(x\to y) = \inf_f \frac{f(y)}{f(x)} }

where f ranges over all functionals on A with f(y)\neq 0.

Wait a minute, what’s a ‘functional’? It’s defined to be a map f:A\to\mathbb{R} which is monotone,

x\geq y \:\Rightarrow\: f(x)\geq f(y)

and additive,

f(x+y) = f(x) + f(y)

In economic terms, we can think of a functional as a consistent assignment of prices to all resource objects. If x is at least as useful as y, then the price of x should be at least as high as the price of y; and the price of two objects together should be the sum of their individual prices. So the f in the rate formula above ranges over all ‘markets’ on which resource objects can be ‘traded’ at consistent prices. The term ‘functional’ is supposed to hint at a relation to functional analysis. In fact, the proof of the theorem crucially relies on the Hahn–Banach Theorem.

The mild technical assumption mentioned in the Rate Theorem is that the ordered commutative monoid needs to have a generating pair. This turns out to hold in the applications that I have considered so far, and I hope that it will turn out to hold in most others as well. For the full gory details, see the paper.

So this provides some idea of what kinds of gadgets one can find in the toolbox of ordered commutative monoids. Next time, I’ll show some applications to graph theory and zero-error communication and say a bit about where this project might be going next.

Resource convertibility: part 3.

Resource Convertibility (Part 1)

7 April, 2015

guest post by Tobias Fritz

Hi! I am Tobias Fritz, a mathematician at the Perimeter Institute for Theoretical Physics in Waterloo, Canada. I like to work on all sorts of mathematical structures which pop up in probability theory, information theory, and other sorts of applied math. Today I would like to tell you about my latest paper:

The mathematical structure of theories of resource convertibility, I.

It should be of interest to Azimuth readers as it forms part of what John likes to call ‘green mathematics’. So let’s get started!

Resources and their management are an essential part of our everyday life. We deal with the management of time or money pretty much every day. We also consume natural resources in order to afford food and amenities for (some of) the 7 billion people on our planet. Many of the objects that we deal with in science and engineering can be considered as resources. For example, a communication channel is a resource for sending information from one party to another. But for now, let’s stick with a toy example: timber and nails constitute a resource for making a table. In mathematical notation, this looks like so:

\mathrm{timber} + \mathrm{nails} \geq \mathrm{table}

We interpret this inequality as saying that “given timber and nails, we can make a table”. I like to write it as an inequality like this, which I think of as stating that having timber and nails is at least as good as having a table, because the timber and nails can always be turned into a table whenever one needs a table.

To be more precise, we should also take into account that making the table requires some tools. These tools do not get consumed in the process, so we also get them back out:

\text{timber} + \text{nails} + \text{saw} + \text{hammer} \geq \text{table} + \text{hammer} + \text{saw}

Notice that this kind of equation is analogous to a chemical reaction equation like this:

2 \mathrm{H}_2 + \mathrm{O}_2 \geq 2 \mathrm{H}_2\mathrm{O}

So given two hydrogen molecules and an oxygen molecule, we can let them react to form two molecules of water. In chemistry, this kind of equation would usually be written with an arrow ‘\rightarrow’ instead of an ordering symbol ‘\geq’, but here we interpret the equation slightly differently. As with the timber and nails above, the inequality says that if we have two hydrogen molecules and an oxygen molecule, then we can let them react to form two molecules of water, but we don’t have to. In this sense, having two hydrogen molecules and an oxygen molecule is at least as good as having two molecules of water.

So what’s going on here, mathematically? In all of the above equations, we have a bunch of stuff on each side and an inequality ‘\geq’ in between. The stuff on each side consists of a bunch of objects tacked together via ‘+’ . With respect to these two pieces of structure, the collection of all our resource objects forms an ordered commutative monoid:

Definition: An ordered commutative monoid is a set A equipped with a binary relation \geq, a binary operation +, and a distinguished element 0 such that the following hold:

• + and 0 equip A with the structure of a commutative monoid;

• \geq equips A with the structure of a partially ordered set;

• addition is monotone: if x\geq y, then also x + z \geq y + z.

Here, the third axiom is the most important, since it tells us how the additive structure interacts with the ordering structure.

Ordered commutative monoids are the mathematical formalization of resource convertibility and combinability as follows. The elements x,y\in A are the resource objects, corresponding to the ‘collections of stuff’ in our earlier examples, such as x = \text{timber} + \text{nails} or y = 2 \text{H}_2 + \text{O}_2. Then the addition operation simply joins up collections of stuff into bigger collections of stuff. The ordering relation \geq is what formalizes resource convertibility, as in the examples above. The third axiom states that if we can convert x into y, then we can also convert x together with z into y together with z for any z, for example by doing nothing to z.

A mathematically minded reader might object that requiring A to form a partially ordered set under \geq is too strong a requirement, since it requires two resource objects to be equal as soon as they are mutually interconvertible: x \geq y and y \geq x implies x = y. However, I think that this is not an essential restriction, because we can regard this implication as the definition of equality: ‘x = y’ is just a shorthand notation for ‘x\geq y and y\geq x’ which formalizes the perfect interconvertibility of resource objects.

We could now go back to the original examples and try to model carpentry and chemistry in terms of ordered commutative monoids. But as a mathematician, I needed to start out with something mathematically precise and rigorous as a testing ground for the formalism. This helps ensure that the mathematics is sensible and useful before diving into real-world applications. So, the main example in my paper is the ordered commutative monoid of graphs, which has a resource-theoretic interpretation in terms of zero-error information theory. As graph theory is a difficult and traditional subject, this application constitutes the perfect training camp for the mathematics of ordered commutative monoids. I will get to this in Part 3.

In Part 2, I will say something about what one can do with ordered commutative monoids. In the meantime, I’d be curious to know what you think about what I’ve said so far!

Resource convertibility: part 2.

Resource convertibility: part 3.

