The Game of Googol

20 July, 2015

Here’s a puzzle from a recent issue of Quanta, an online science magazine:

Puzzle 1: I write down two different numbers that are completely unknown to you, and hold one in my left hand and one in my right. You have absolutely no idea how I generated these two numbers. Which is larger?

You can point to one of my hands, and I will show you the number in it. Then you can decide to either select the number you have seen or switch to the number you have not seen, held in the other hand. Is there a strategy that will give you a greater than 50% chance of choosing the larger number, no matter which two numbers I write down?

At first it seems the answer is no. Whatever number you see, the other number could be larger or smaller. There’s no way to tell. So obviously you can’t get a better than 50% chance of picking the hand with the largest number—even if you’ve seen one of those numbers!

But “obviously” is not a proof. Sometimes “obvious” things are wrong!

It turns out that, amazingly, the answer to the puzzle is yes! You can find a strategy to do better than 50%. But the strategy uses randomness. So, this puzzle is a great illustration of the power of randomness.

If you want to solve it yourself, stop now or read Quanta magazine for some clues—they offered a small prize for the best answer:

• Pradeep Mutalik, Can information rise from randomness?, Quanta, 7 July 2015.

Greg Egan gave a nice solution in the comments to this magazine article, and I’ll reprint it below along with two followup puzzles. So don’t look down there unless you want a spoiler.

I should add: the most common mistake among educated readers seems to be assuming that the first player, the one who chooses the two numbers, chooses them according to some probability distribution. Don’t assume that. They are simply arbitrary numbers.

The history of this puzzle

I’d seen this puzzle before—do you know who invented it? On G+, Hans Havermann wrote:

I believe the origin of this puzzle goes back to (at least) John Fox and Gerald Marnie’s 1958 betting game ‘Googol’. Martin Gardner mentioned it in his February 1960 column in Scientific American. Wikipedia mentions it under the heading ‘Secretary problem’. Gardner suggested that a variant of the game was proposed by Arthur Cayley in 1875.

Actually the game of Googol is a generalization of the puzzle that we’ve been discussing. Martin Gardner explained it thus:

Ask someone to take as many slips of paper as he pleases, and on each slip write a different positive number. The numbers may range from small fractions of 1 to a number the size of a googol (1 followed by a hundred 0s) or even larger. These slips are turned face down and shuffled over the top of a table. One at a time you turn the slips face up. The aim is to stop turning when you come to the number that you guess to be the largest of the series. You cannot go back and pick a previously turned slip. If you turn over all the slips, then of course you must pick the last one turned.

So, the puzzle I just showed you is the special case when there are just 2 slips of paper. I seem to recall that Gardner incorrectly dismissed this case as trivial!

There’s been a lot of work on Googol. Julien Berestycki writes:

I heard about this puzzle a few years ago from Sasha Gnedin. He has a very nice paper about this

• Alexander V. Gnedin, A solution to the game of Googol, Annals of Probability (1994), 1588–1595.

One of the many beautiful ideas in this paper is that it asks what is the best strategy for the guy who writes the numbers! It also cites a paper by Gnedin and Berezowskyi (of oligarchic fame). 

Egan’s solution

Okay, here is Greg Egan’s solution, paraphrased a bit:

Pick some function f : \mathbb{R} \to \mathbb{R} such that:

\displaystyle{ \lim_{x \to -\infty} f(x) = 0 }

\displaystyle{ \lim_{x \to +\infty} f(x) = 1 }

f is monotonically increasing: if x > y then f(x) > f(y)

There are lots of functions like this, for example

\displaystyle{f(x) = \frac{e^x}{e^x + 1} }

Next, pick one of the first player’s hands at random. If the number you are shown is x, compute f(x). Then generate a uniformly distributed random number z between 0 and 1. If z is less than or equal to f(x) guess that x is the larger number, but if z is greater than f(x) guess that the larger number is in the other hand.

The probability of guessing correctly can be calculated as the probability of seeing the larger number initially and then, correctly, sticking with it, plus the probability of seeing the smaller number initially and then, correctly, choosing the other hand.

This is

\frac{1}{2} f(x) + \frac{1}{2} (1 - f(y)) =  \frac{1}{2} + \frac{1}{2} (f(x) - f(y))

This is strictly greater than \frac{1}{2} since x > y so f(x) - f(y) > 0.

So, you have a more than 50% chance of winning! But as you play the game, there’s no way to tell how much more than 50%. If the numbers on the other players hands are very large, or very small, your chance will be just slightly more than 50%.

Followup puzzles

Here are two more puzzles:

Puzzle 2: Prove that no deterministic strategy can guarantee you have a more than 50% chance of choosing the larger number.

Puzzle 3: There are perfectly specific but ‘algorithmically random’ sequences of bits, which can’t predicted well by any program. If we use these to generate a uniform algorithmically random number between 0 and 1, and use the strategy Egan describes, will our chance of choosing the larger number be more than 50%, or not?

But watch out—here come Egan’s solutions to those!


Egan writes:

Here are my answers to your two puzzles on G+.

Puzzle 2: Prove that no deterministic strategy can guarantee you have a more than 50% chance of choosing the larger number.

Answer: If we adopt a deterministic strategy, that means there is a function S: \mathbb{R} \to \{0,1\} that tells us whether on not we stick with the number x when we see it. If S(x)=1 we stick with it, if S(x)=0 we swap it for the other number.

If the two numbers are x and y, with x > y, then the probability of success will be:

P = 0.5 + 0.5(S(x)-S(y))

This is exactly the same as the formula we obtained when we stuck with x with probability f(x), but we have specialised to functions S valued in \{0,1\}.

We can only guarantee a more than 50% chance of choosing the larger number if S is monotonically increasing everywhere, i.e. S(x) > S(y) whenever x > y. But this is impossible for a function valued in \{0,1\}. To prove this, define x_0 to be any number in [1,2] such that S(x_0)=0; such an x_0 must exist, otherwise S would be constant on [1,2] and hence not monotonically increasing. Similarly define x_1 to be any number in [-2,-1] such that S(x_1) = 1. We then have x_0 > x_1 but S(x_0) < S(x_1).

Puzzle 3: There are perfectly specific but ‘algorithmically random’ sequences of bits, which can’t predicted well by any program. If we use these to generate a uniform algorithmically random number between 0 and 1, and use the strategy Egan describes, will our chance of choosing the larger number be more than 50%, or not?

Answer: As Philip Gibbs noted, a deterministic pseudo-random number generator is still deterministic. Using a specific sequence of algorithmically random bits

(b_1, b_2, \dots )

to construct a number z between 0 and 1 means z takes on the specific value:

z_0 = \sum_i b_i 2^{-i}

So rather than sticking with x with probability f(x) for our monotonically increasing function f, we end up always sticking with x if z_0 \le f(x), and always swapping if z_0 > f(x). This is just using a function S:\mathbb{R} \to \{0,1\} as in Puzzle 2, with:

S(x) = 0 if x < f^{-1}(z_0)

S(x) = 1 if x \ge f^{-1}(z_0)

So all the same consequences as in Puzzle 2 apply, and we cannot guarantee a more than 50% chance of choosing the larger number.

Puzzle 3 emphasizes the huge gulf between ‘true randomness’, where we only have a probability distribution of numbers z, and the situation where we have a specific number z_0, generated by any means whatsoever.

We could generate z_0 using a pseudorandom number generator, radioactive decay of atoms, an oracle whose randomness is certified by all the Greek gods, or whatever. No matter how randomly z_0 is generated, once we have it, we know there exist choices for the first player that will guarantee our defeat!

This may seem weird at first, but if you think about simple games of luck you’ll see it’s completely ordinary. We can have a more than 50% chance of winning such a game even if for any particular play we make the other player has a move that ensures our defeat. That’s just how randomness works.

Trends in Reaction Network Theory (Part 2)

1 July, 2015

Here in Copenhagen we’ll soon be having a bunch of interesting talks on chemical reaction networks:

Workshop on Mathematical Trends in Reaction Network Theory, 1-3 July 2015, Department of Mathematical Sciences, University of Copenhagen. Organized by Elisenda Feliu and Carsten Wiuf.

Looking through the abstracts, here are a couple that strike me.

First of all, Gheorghe Craciun claims to have proved the biggest open conjecture in this field: the Global Attractor Conjecture!

• Gheorge Craciun, Toric differential inclusions and a proof of the global attractor conjecture.

This famous old conjecture says that for a certain class of chemical reactions, the ones coming from ‘complex balanced reaction networks’, the chemicals will approach equilibrium no matter what their initial concentrations are. Here’s what Craciun says:

Abstract. In a groundbreaking 1972 paper Fritz Horn and Roy Jackson showed that a complex balanced mass-action system must have a unique locally stable equilibrium within any compatibility class. In 1974 Horn conjectured that this equilibrium is a global attractor, i.e., all solutions in the same compatibility class must converge to this equilibrium. Later, this claim was called the Global Attractor Conjecture, and it was shown that it has remarkable implications for the dynamics of large classes of polynomial and power-law dynamical systems, even if they are not derived from mass-action kinetics. Several special cases of this conjecture have been proved during the last decade. We describe a proof of the conjecture in full generality. In particular, it will follow that all detailed balanced mass action systems and all deficiency zero mass-action systems have the global attractor property. We will also discuss some implications for biochemical mechanisms that implement noise filtering and cellular homeostasis.

Manoj Gopalkrishnan wrote a great post explaining the concept of complex balanced reaction network here on Azimuth, so if you want to understand the conjecture you could start there.

Even better, Manoj is talking here about a way to do statistical inference with chemistry! His talk is called ‘Statistical inference with a chemical soup':

Abstract. The goal is to design an “intelligent chemical soup” that can do statistical inference. This may have niche technological applications in medicine and biological research, as well as provide fundamental insight into the workings of biochemical reaction pathways. As a first step towards our goal, we describe a scheme that exploits the remarkable mathematical similarity between log-linear models in statistics and chemical reaction networks. We present a simple scheme that encodes the information in a log-linear model as a chemical reaction network. Observed data is encoded as initial concentrations, and the equilibria of the corresponding mass-action system yield the maximum likelihood estimators. The simplicity of our scheme suggests that molecular environments, especially within cells, may be particularly well suited to performing statistical computations.

It’s based on this paper:

• Manoj Gopalkrishnan, A scheme for molecular computation of maximum likelihood estimators for log-linear models.

I’m not sure, but this idea may exploit existing analogies between the approach to equilibrium in chemistry, the approach to equilibrium in evolutionary game theory, and statistical inference. You may have read Marc Harper’s post about that stuff!

David Doty is giving a broader review of ‘Computation by (not about) chemistry':

Abstract. The model of chemical reaction networks (CRNs) is extensively used throughout the natural sciences as a descriptive language for existing chemicals. If we instead think of CRNs as a programming language for describing artificially engineered chemicals, what sorts of computations are possible for these chemicals to achieve? The answer depends crucially on several formal choices:

1) Do we treat matter as infinitely divisible (real-valued concentrations) or atomic (integer-valued counts)?

2) How do we represent the input and output of the computation (e.g., Boolean presence or absence of species, positive numbers directly represented by counts/concentrations, positive and negative numbers represented indirectly by the difference between counts/concentrations of a pair of species)?

3) Do we assume mass-action rate laws (reaction rates proportional to reactant counts/concentrations) or do we insist the system works correctly under a broader class of rate laws?

The talk will survey several recent results and techniques. A primary goal of the talk is to convey the “programming perspective”: rather than asking “What does chemistry do?”, we want to understand “What could chemistry do?” as well as “What can chemistry provably not do?”

I’m really interested in chemical reaction networks that appear in biological systems, and there will be lots of talks about that. For example, Ovidiu Radulescu will talk about ‘Taming the complexity of biochemical networks through model reduction and tropical geometry’. Model reduction is the process of simplifying complicated models while preserving at least some of their good features. Tropical geometry is a cool version of algebraic geometry that uses the real numbers with minimization as addition and addition as multiplication. This number system underlies the principle of least action, or the principle of maximum energy. Here is Radulescu’s abstract:

Abstract. Biochemical networks are used as models of cellular physiology with diverse applications in biology and medicine. In the absence of objective criteria to detect essential features and prune secondary details, networks generated from data are too big and therefore out of the applicability of many mathematical tools for studying their dynamics and behavior under perturbations. However, under circumstances that we can generically denote by multi-scaleness, large biochemical networks can be approximated by smaller and simpler networks. Model reduction is a way to find these simpler models that can be more easily analyzed. We discuss several model reduction methods for biochemical networks with polynomial or rational rate functions and propose as their common denominator the notion of tropical equilibration, meaning finite intersection of tropical varieties in algebraic geometry. Using tropical methods, one can strongly reduce the number of variables and parameters of biochemical network. For multi-scale networks, these reductions are computed symbolically on orders of magnitude of parameters and variables, and are valid in wide domains of parameter and phase spaces.

I’m talking about the analogy between probabilities and quantum amplitudes, and how this makes chemistry analogous to particle physics. You can see two versions of my talk here, but I’ll be giving the ‘more advanced’ version, which is new:

Probabilities versus amplitudes.

Abstract. Some ideas from quantum theory are just beginning to percolate back to classical probability theory. For example, the master equation for a chemical reaction network describes the interactions of molecules in a stochastic rather than quantum way. If we look at it from the perspective of quantum theory, this formalism turns out to involve creation and annihilation operators, coherent states and other well-known ideas—but with a few big differences.

Anyway, there are a lot more talks, but if I don’t have breakfast and walk over to the math department, I’ll miss those talks!

You can learn more about individual talks in the comments here (see below) and also in Matteo Polettini’s blog:

• Matteo Polettini, Mathematical trends in reaction network theory: part 1 and part 2, Out of Equilibrium, 1 July 2015.

PROPs for Linear Systems

18 May, 2015

Eric Drexler likes to say: engineering is dual to science, because science tries to understand what the world does, while engineering is about getting the world to do what you want. I think we need a slightly less ‘coercive’, more ‘cooperative’ approach to the world in order to develop ‘ecotechnology’, but it’s still a useful distinction.

For example, classical mechanics is the study of what things do when they follow Newton’s laws. Control theory is the study of what you can get them to do.

Say you have an upside-down pendulum on a cart. Classical mechanics says what it will do. But control theory says: if you watch the pendulum and use what you see to move the cart back and forth correctly, you can make sure the pendulum doesn’t fall over!

Control theorists do their work with the help of ‘signal-flow diagrams’. For example, here is the signal-flow diagram for an inverted pendulum on a cart:

When I take a look at a diagram like this, I say to myself: that’s a string diagram for a morphism in a monoidal category! And it’s true. Jason Erbele wrote a paper explaining this. Independently, Bonchi, Sobociński and Zanasi did some closely related work:

• John Baez and Jason Erbele, Categories in control.

• Filippo Bonchi, Paweł Sobociński and Fabio Zanasi, Interacting Hopf algebras.

• Filippo Bonchi, Paweł Sobociński and Fabio Zanasi, A categorical semantics of signal flow graphs.

I’ll explain some of the ideas at the Turin meeting on the categorical foundations of network theory. But I also want to talk about this new paper that Simon Wadsley of Cambridge University wrote with my student Nick Woods:

• Simon Wadsley and Nick Woods, PROPs for linear systems.

This makes the picture neater and more general!

You see, Jason and I used signal flow diagrams to give a new description of the category of finite-dimensional vector spaces and linear maps. This category plays a big role in the control theory of linear systems. Bonchi, Sobociński and Zanasi gave a closely related description of an equivalent category, \mathrm{Mat}(k), where:

• objects are natural numbers, and

• a morphism f : m \to n is an n \times m matrix with entries in the field k,

and composition is given by matrix multiplication.

But Wadsley and Woods generalized all this work to cover \mathrm{Mat}(R) whenever R is a commutative rig. A rig is a ‘ring without negatives’—like the natural numbers. We can multiply matrices valued in any rig, and this includes some very useful examples… as I’ll explain later.

Wadsley and Woods proved:

Theorem. Whenever R is a commutative rig, \mathrm{Mat}(R) is the PROP for bicommutative bimonoids over R.

This result is quick to state, but it takes a bit of explaining! So, let me start by bringing in some definitions.

Bicommutative bimonoids

We will work in any symmetric monoidal category, and draw morphisms as string diagrams.

A commutative monoid is an object equipped with a multiplication:

and a unit:

obeying these laws:

For example, suppose \mathrm{FinVect}_k is the symmetric monoidal category of finite-dimensional vector spaces over a field k, with direct sum as its tensor product. Then any object V \in \mathrm{FinVect}_k is a commutative monoid where the multiplication is addition:

(x,y) \mapsto x + y

and the unit is zero: that is, the unique map from the zero-dimensional vector space to V.

Turning all this upside down, cocommutative comonoid has a comultiplication:

and a counit:

obeying these laws:

For example, consider our vector space V \in \mathrm{FinVect}_k again. It’s a commutative comonoid where the comultiplication is duplication:

x \mapsto (x,x)

and the counit is deletion: that is, the unique map from V to the zero-dimensional vector space.

Given an object that’s both a commutative monoid and a cocommutative comonoid, we say it’s a bicommutative bimonoid if these extra axioms hold:

You can check that these are true for our running example of a finite-dimensional vector space V. The most exciting one is the top one, which says that adding two vectors and then duplicating the result is the same as duplicating each one, then adding them appropriately.

Our example has some other properties, too! Each element c \in k defines a morphism from V to itself, namely scalar multiplication by c:

x \mapsto c x

We draw this as follows:

These morphisms are compatible with the ones so far:

Moreover, all the ‘rig operations’ in k—that is, addition, multiplication, 0 and 1, but not subtraction or division—can be recovered from what we have so far:

We summarize this by saying our vector space V is a bicommutative bimonoid ‘over k‘.

More generally, suppose we have a bicommutative bimonoid A in a symmetric monoidal category. Let \mathrm{End}(A) be the set of bicommutative bimonoid homomorphisms from A to itself. This is actually a rig: there’s a way to add these homomorphisms, and also a way to ‘multiply’ them (namely, compose them).

Suppose R is any commutative rig. Then we say A is a bicommutative bimonoid over R if it’s equipped with a rig homomorphism

\Phi : R \to \mathrm{End}(A)

This is a way of summarizing the diagrams I just showed you! You see, each c \in R gives a morphism from A to itself, which we write as

The fact that this is a bicommutative bimonoid endomorphism says precisely this:

And the fact that \Phi is a rig homomorphism says precisely this:

So sometimes the right word is worth a dozen pictures!

What Jason and I showed is that for any field k, the \mathrm{FinVect}_k is the free symmetric monoidal category on a bicommutative bimonoid over k. This means that the above rules, which are rules for manipulating signal flow diagrams, completely characterize the world of linear algebra!

Bonchi, Sobociński and Zanasi used ‘PROPs’ to prove a similar result where the field is replaced by a sufficiently nice commutative ring. And Wadlsey and Woods used PROPS to generalize even further to the case of an arbitrary commutative rig!

But what are PROPs?


A PROP is a particularly tractable sort of symmetric monoidal category: a strict symmetric monoidal category where the objects are natural numbers and the tensor product of objects is given by ordinary addition. The symmetric monoidal category \mathrm{FinVect}_k is equivalent to the PROP \mathrm{Mat}(k), where a morphism f : m \to n is an n \times m matrix with entries in k, composition of morphisms is given by matrix multiplication, and the tensor product of morphisms is the direct sum of matrices.

We can define a similar PROP \mathrm{Mat}(R) whenever R is a commutative rig, and Wadsley and Woods gave an elegant description of the ‘algebras’ of \mathrm{Mat}(R). Suppose C is a PROP and D is a strict symmetric monoidal category. Then the category of algebras of C in D is the category of strict symmetric monoidal functors F : C \to D and natural transformations between these.

If for every choice of D the category of algebras of C in D is equivalent to the category of algebraic structures of some kind in D, we say C is the PROP for structures of that kind. This explains the theorem Wadsley and Woods proved:

Theorem. Whenever R is a commutative rig, \mathrm{Mat}(R) is the PROP for bicommutative bimonoids over R.

The fact that an algebra of \mathrm{Mat}(R) is a bicommutative bimonoid is equivalent to all this stuff:

The fact that \Phi(c) is a bimonoid homomorphism for all c \in R is equivalent to this stuff:

And the fact that \Phi is a rig homomorphism is equivalent to this stuff:

This is a great result because it includes some nice new examples.

First, the commutative rig of natural numbers gives a PROP \mathrm{Mat}. This is equivalent to the symmetric monoidal category \mathrm{FinSpan}, where morphisms are isomorphism classes of spans of finite sets, with disjoint union as the tensor product. Steve Lack had already shown that \mathrm{FinSpan} is the PROP for bicommutative bimonoids. But this also follows from the result of Wadsley and Woods, since every bicommutative bimonoid V is automatically equipped with a unique rig homomorphism

\Phi : \mathbb{N} \to \mathrm{End}(V)

Second, the commutative rig of booleans

\mathbb{B} = \{F,T\}

with ‘or’ as addition and ‘and’ as multiplication gives a PROP \mathrm{Mat}(\mathbb{B}). This is equivalent to the symmetric monoidal category \mathrm{FinRel} where morphisms are relations between finite sets, with disjoint union as the tensor product. Samuel Mimram had already shown that this is the PROP for special bicommutative bimonoids, meaning those where comultiplication followed by multiplication is the identity:

But again, this follows from the general result of Wadsley and Woods!

Finally, taking the commutative ring of integers \mathbb{Z}, Wadsley and Woods showed that \mathrm{Mat}(\mathbb{Z}) is the PROP for bicommutative Hopf monoids. The key here is that scalar multiplication by -1 obeys the axioms for an antipode—the extra morphism that makes a bimonoid into a Hopf monoid. Here are those axioms:

More generally, whenever R is a commutative ring, the presence of -1 \in R guarantees that a bimonoid over R is automatically a Hopf monoid over R. So, when R is a commutative ring, Wadsley and Woods’ result implies that \mathrm{Mat}(R) is the PROP for Hopf monoids over R.

Earlier, in their paper on ‘interacting Hopf algebras’, Bonchi, Sobociński and Zanasi had given an elegant and very different proof that \mathrm{Mat}(R) is the PROP for Hopf monoids over R whenever R is a principal ideal domain. The advantage of their argument is that they build up the PROP for Hopf monoids over R from smaller pieces, using some ideas developed by Steve Lack. But the new argument by Wadsley and Woods has its own charm.

In short, we’re getting the diagrammatics of linear algebra worked out very nicely, providing a solid mathematical foundation for signal flow diagrams in control theory!

Resource Theories

12 May, 2015

by Brendan Fong

Hugo Nava-Kopp and I have a new paper on resource theories:

• Brendan Fong and Hugo Nava-Kopp, Additive monotones for resource theories of parallel-combinable processes with discarding.

A mathematical theory of resources is Tobias Fritz’s current big project. He’s already explained how ordered monoids can be viewed as theories of resource convertibility in a three part series on this blog.

Ordered monoids are great, and quite familiar in network theory: for example, a Petri net can be viewed as a presentation for an ordered commutative monoid. But this work started in symmetric monoidal categories, together with my (Oxford) supervisor Bob Coecke and Rob Spekkens.

The main idea is this: think of the objects of your symmetric monoidal category as resources, and your morphisms as ways to convert one resource into another. The monoidal product or ‘tensor product’ in your category allows you to talk about collections of your resources. So, for example, in the resource theory of chemical reactions, our objects are molecules like oxygen O2, hydrogen H2, and water H2O, and morphisms things like the electrolysis of water:

2H2O → O2 + 2H2

This is a categorification of the ordered commutative monoid of resource convertibility: we now keep track of how we convert resources into one another, instead of just whether we can convert them.

Categorically, I find the other direction easier to state: being a category, the resource theory is enriched over \mathrm{Set}, while a poset is enriched over the poset of truth values or ‘booleans’ \mathbb{B} = \{0,1\}. If we ‘partially decategorify’ by changing the base of enrichment along the functor \mathrm{Set} \to \mathbb{B} that maps the empty set to 0 and any nonempty set to 1, we obtain the ordered monoid corresponding to the resource theory.

But the research programme didn’t start at resource theories either. The starting point was ‘partitioned process theories’.

Here’s an example that guided the definitions. Suppose we have a bunch of labs with interacting quantum systems, separated in space. With enough cooperation and funding, they can do big joint operations on their systems, like create entangled pairs between two locations. For ‘free’, however, they’re limited to classical communication between the locations, although they can do the full range of quantum operations on their local system. So you’ve got a symmetric monoidal category with objects quantum systems and morphisms quantum operations, together with a wide (all-object-including) symmetric monoidal subcategory that contains the morphisms you can do with local quantum operations and classical communication (known as LOCC operations).

This general structure: a symmetric monoidal category (or SMC for short) with a wide symmetric monoidal subcategory, is called a partitioned process theory. We call the morphisms in the SMC processes, and those in the subSMC free processes.

There are a number of methods for building a resource theory (i.e. an SMC) from a partitioned process theory. The unifying idea though, is that your new SMC has the processes f,g as objects, and morphisms f \to g ways of using the free processes to build g from f.

But we don’t have to go to fancy sounding quantum situations to find examples of partitioned process theories. Instead, just look at any SMC in which each object is equipped with an algebraic structure. Then the morphisms defining this structure can be taken as our ‘free’ processes.

For example, in a multigraph category every object has the structure of a ‘special commutative Frobenius algebra’. That’s a bit of a mouthful, but John defined it a while back, and examples include categories where morphisms are electrical circuits, and categories where morphisms are signal flow diagrams.

So these categories give partitioned process theories! This idea of partitioning the morphisms into ‘free’ ones and ‘costly’ ones is reminiscent of what I was saying earlier about the operad of wiring diagrams about it being useful to separate behavioural structure from interconnection structure.

This suggests that we can also view the free processes as generating some sort of operad, that describes the ways we allow ourselves to use free processes to turn processes into other processes. If we really want to roll a big machine out to play with this stuff, framed bicategories may also be interesting; Spivak is already using them to get at questions about operads. But that’s all conjecture, and a bit of a digression.

To get back to the point, this was all just to say that if you find yourself with a bunch of resistors, and you ask ‘what can I build?’, then you’re after the resource theory apparatus.

You can read more about this stuff here:

• Bob Coecke, Tobias Fritz and Rob W. Spekkens, A mathematical theory of resources.

• Tobias Fritz, The mathematical structure of theories of resource convertibility I.

Cospans, Wiring Diagrams, and the Behavioral Approach

5 May, 2015

joint with Brendan Fong

We’re getting ready for the Turin workshop on the
Categorical Foundations of Network Theory. So, we’re trying to get our thoughts in order.

Last time we talked about understanding types of networks as categories of decorated cospans. Earlier, David Spivak told us about understanding networks as algebras of an operad. Both these frameworks work at capturing notions of modularity and interconnection. Are they then related? How?

In this post we want to discuss some similarities between decorated cospan categories and algebras for Spivak’s operad of wiring diagrams. The main idea is that the two approaches are ‘essentially’ equivalent, but that compared to decorated cospans, Spivak’s operad formalism puts greater emphasis on the distinction between the ‘duplication’ and ‘deletion’ morphisms and other morphisms in our category.

The precise details are still to be worked out—jump in and help us!


We begin with a bit about operads in general. Recall that an operad is similar to a category, except that instead of a set \mathrm{hom}(x,y) of morphisms from any object x to any object y, you have a set \mathrm{hom}(x_1,\dots,x_n;y) of operations from any finite list of objects x_1,...,x_n to any object y . If we have an operation f \in \mathrm{hom}(x_1,\dots,x_n;y), we can call x_1, \dots, x_n the inputs of f and call y the output of f .

We can compose operations in an operad. To understand how, it’s easiest to use pictures. We draw an operation in \mathrm{hom}(x_1,\dots,x_n;y) as a little box with n wires coming in and one wire coming out:

The input wires should be labelled with the objects x_1, \dots, x_n and the output wire with the object y, but I haven’t done this.

We are allowed to compose these operations as follows:

as long as the outputs of the operations g_1,\dots,g_n match the inputs of the operation f . The result is a new operation which we call f \circ (g_1,\dots,g_n) .

We demand that there be unary operations 1_x \in \mathrm{hom}(x;x) serving as identities for composition, and we impose an associative law that makes a composite of composites like this well-defined:

So far this is the definition of a operad without permutations. In a full-fledged permutative operad, we can also permute the inputs of an operation f and get a new operation:

which we call f \sigma if \sigma is the the permutation of the inputs. We demand that (f \sigma) \sigma' = f (\sigma \sigma') . And finally, we demand that permutations act in a way that is compatible with composition. For example:

Here we see that (f \sigma) \circ (g_1, \dots, g_n) is equal to some obvious other thing.

Finally, there is a law saying

f \circ (g_1 \sigma_1, \dots, g_n \sigma_n) = (f \circ (g_1 , \dots, g_n)) \sigma

for some choice of \sigma that you can cook up from the permuations \sigma_i in an obvious way. We leave it as an exercise to work out the details. By the way, one well-known book on operads accidentally omits this law, so here’s a rather more lengthy exercise: read this book, see which theorems require this law, and correct their proofs!

Operads are similar to symmetric monoidal categories. The idea is that in a symmetric monoidal category you can just form the tensor product x_1 \otimes \dots \otimes x_n and talk about the set of morphisms x_1 \otimes \cdots \otimes \cdots x_n \to y . Indeed any symmetric monoidal category gives an operad in this way: just define \mathrm{hom}(x_1,...,x_n;y) to be \mathrm{hom}(x_1 \otimes \cdots \otimes x_n, y) . If we do this with Set, which is a symmetric monoidal category using the usual cartesian product of sets, we get an operad called \mathrm{Set}.

An algebra for an operad O is an operad homomorphism O \to \mathrm{Set}. We haven’t said what an operad homomorphism is, but you can probably figure it out yourself. The point is this: an algebra for O turns the abstract operations in O into actual operations on sets!

Finally, we should warn you that operads come in several flavors, and we’ve been talking about ‘typed permutative operads’. ‘Typed’ means that there’s more than one object; ‘permutative’ means that we have the ability to permute the input wires. When people say ‘operad’, they often mean an untyped permutative operad. For that, just specialize down to the case where there’s only one object x.

You can see a fully precise definition of untyped permutative operads here:

Operad theory, Wikipedia.

along with the definition of an untyped operad without permutations.

The operad of wiring diagrams

Spivak’s favorite operad is the operad of wiring diagrams. The operad of wiring diagrams WD is an operad version of \mathrm{Cospan}(\mathrm{FinSet}), constructed in the vein suggested above: the objects are finite sets, and an operation from a list of sets X_1,...,X_n to a set Y is a cospan

X_1+ \cdots +X_n \rightarrow S \leftarrow Y

Spivak draws such a thing as a big circle with n small circles cut out from the interior:

The outside of the big circle has a set Y of terminals marked on it, and each small circle has a set X_i of terminals marked on it. Then in the interior of this shape there are wires connecting these terminals. This what he calls a wiring diagrams.

You compose these wiring diagrams by pasting other wiring diagrams into each of the small circles.

The relationship with our Frobenius monoid diagrams is pretty simple: we draw our ‘wiring diagrams’ X \to Y in a square, with the X terminals on the left and Y terminals on the right. To get a Spivak-approved wiring diagram, glue the top and bottom edges of this square together, then flatten the cylinder you get down into an annulus, with the X-side on the inside and Y-side on the outside. If X = X_1+X_2 you can imagine gluing opposite edges of the inside circle together to divide it into two small circles accordingly, and so on.

Relational algebras of type A

Algebras for wiring diagrams tell you what components you have available to wire together with your diagrams. An algebra for the operad of wiring diagrams is an operad homomorphism

WD \to \mathrm{Set}

What does this look like? Just like a functor for categories, it assigns to each natural number a set, and each wiring diagram a function.

In work related to decorated cospans (such as our paper on circuits or John and Jason’s work on signal flow diagrams), our semantics usually is constructed from a field of values—not a physicist’s ‘field’, bt an algebraist’s sort of ‘field’, where you can add, multiply, subtract and divide. For example, we like being able to assign a real number like a velocity, or potential, or current to a variable. This gives us vector spaces and a bunch of nice linear-algebraic structures.

Spivak works more generally: he’s interested in the structure when you just have a set of values. While this means we can’t do some of the nice things we could do with a field, it also means this framework can do things like talk about logic gates, where the variables are boolean ones, or number theoretic questions, where you’re interested in the natural numbers.

So to discuss semantics we pick a set A of values, such as the real numbers or natural numbers or booleans or colors. We imagine then associating elements of this set to each wire in a wiring diagram. More technically, the algebra

\mathrm{Rel}A: WD \to \mathrm{Set}

then maps each finite set X to the power set \mathcal{P}(A^X) of the set A^X of functions X \to A .

On the morphisms (the wiring diagrams themselves), this functor behaves as follows. Note that a function (X \to A) can be thought of as an ‘X-vector’ (a_1,...,a_x) of ‘A-coordinates’. A wiring diagram X \to Y is just a cospan

X \to N  \leftarrow Y

in \mathrm{FinSet}, so it can be thought of as some compares

X \to N

followed by some copies

N \to Y

Thus, given a wiring diagram X \to Y, we can consider a partial function that maps an X-vector to the Y-vector by doing these compares, and if it passes them does the copies and returns the resulting Y-vector, but if not returns ‘undefined’. We can then define a map \mathcal{P}(A^X) \to \mathcal{P}(A^Y) which takes a set of X-vectors to its image under this partial function.

This semantics is called the relational WD-algebra of type A. We can think of it as being like the ‘light operations’ fragment of the signal flow calculus. By ‘light operations’, we mean the operations of duplication and deletion, which form a cocommutative comonoid:

and their time-reversed versions, ‘coduplication’ and ‘codeletion’, which form a commutatative monoid:

These fit together to form a Frobenius monoid, meaning that these equations hold:

And it’s actually extra-special, meaning that these equations hold:

(If you don’t understand these hieroglyphics, please reread our post about categories in control theory, and ask some questions!)

Note that we can’t do the ‘dark operations’, because we only have a set A of values, not a field, and the dark operations involve addition and zero!

Operads and the behavioral approach

In formulating Frobenius monoids this way, Spivak achieves something that we’ve been working hard to find ways to achieve: a separation of the behavioral structure from the interconnection structure.

What do I mean by this? In his ‘behavioral approach‘, Willems makes the point that for all their elaborate and elegant formulation, in the end physical laws just come down to dividing the set of what might a priori be possible (the ‘universum’) into the set of things that actually are possible (the ‘behavior’), and the set of things that aren’t). Here the universum is the set A^X: a priori, on each of the wires in X, we might associate any value of A . For example, to the two wires at the ends of a resistor, we might a priori associate any pair of currents. But physical law, here Kirchhoff’s current law, says otherwise: the currents must be equal and opposite. So the ‘behavior’ is the subset (i,-i) of the universum \mathbb{R}^2.

So you can say that to each object X in the operad of wiring diagrams the relational algebra of type A associates the set \mathcal{P}(A^X) of possible behaviors—the universum is A^X . (\mathcal{P}(A^X) forms some sort of meta-universum, where you can discuss physical laws about physical laws, commonly called ‘principles’.)

The second key aspect of the behavioral approach is that the behaviors of larger systems can be constructed from the behaviors of its subsystems, if we understand the so-called ‘interconnection structure’ well enough. This is a key principle in engineering: we build big, complicated systems from a much smaller set of components, whether it be electronics from resistors and inductors, or mechanical devices from wheels and rods and ropes, or houses from Lego bricks. The various interconnection structures here are the wiring diagrams, and our relational algebras say they act by what Willems calls ‘variable sharing’.

This division between behavior and interconnection motivates the decorated cospan construction (where the decorations are the ‘components’, the cospans the ‘interconnection’) and also the multigraph categories discussed by Aleks Kissinger (where morphisms are the ‘components’, and the Frobenius monoid operations are the ‘interconnection’):

• Aleks Kissinger, Finite matrices are complete for (dagger-)multigraph categories.

So it’s good to have this additional way of thinking about things in our repertoire: operads describe ‘interconnection’, their algebras ‘behaviors’.

The separation Spivak achieves, however, seems to me to come at the cost of neat ways to talk about individual components, and perhaps this can be seen as the essential difference between the two approaches. By including our components as morphisms, we can talk more carefully about them and additional structure individual components have. On the other hand, by lumping all the components into the objects, Spivak can talk more carefully about how the interconnection structure acts on all behaviors at once.

Other operads of wiring diagrams

One advantage of the operad approach is that you can easily tweak your operad to talk about different sorts of network structure. Sometimes you can make similar adjustments with decorated cospans too, such as working over the category of typed finite sets, rather than just finite sets, to discuss networks in which wires have types, and only wires of the same types can be connected together. A physical example is a model of a hydroelectric power plant, where you don’t want to connect a water pipe with an electrical cable! This is also a common technique in computer science, where you don’t want to try to multiply two strings of text, or try to interpret a telephone number as a truth value.

But some modifications are harder to do with decorated cospans. In some other papers, Spivak employs a more restricted operad of wiring diagrams, in which joining wires and terminating wires is not allowed, among other things. He uses this to formalise graphical languages for certain types of discrete-time processes, open dynamical systems, including mode-dependent ones.

For more detail, read these:

• Brendan Fong, Decorated cospans.

• David Spivak, The operad of wiring diagrams: formalizing a graphical language for databases, recursion, and plug-and-play circuits.

Decorated Cospans

1 May, 2015

Last time I talked about a new paper I wrote with Brendan Fong. It’s about electrical circuits made of ‘passive’ components, like resistors, inductors and capacitors. We showed these circuits are morphisms in a category. Moreover, there’s a functor sending each circuit to its ‘external behavior': what it does, as seen by someone who can only measure voltages and currents at the terminals.

Our paper uses a formalism that Brendan developed here:

• Brendan Fong, Decorated cospans.

The idea here is we may want to take something like a graph with edges labelled by positive numbers:

and say that some of its nodes are ‘inputs’, while others are ‘outputs':

This lets us treat our labelled graph as a ‘morphism’ from the set X to the set Y.

The point is that we can compose such morphisms. For example, suppose we have another one of these things, going from Y to Z:

Since the points of Y are sitting in both things:

we can glue them together and get a thing going from X to Z:

That’s how we compose these morphisms.

Note how we’re specifying some nodes of our original thing as inputs and outputs:

We’re using maps from two sets X and Y to the set N of nodes of our graph. And a bit surprisingly, we’re not demanding that these maps be one-to-one. That turns out to be useful—and in general, when doing math, it’s dumb to make your definitions forbid certain possibilities unless you really need to.

So, our thing is really a cospan of finite sets—that is, a diagram of finite sets and functions like this:

together some extra structure on the set N. This extra structure is what Brendan calls a decoration, and it makes the cospan into a ‘decorated cospan’. In this particular example, a decoration on N is a way of making it into the nodes of a graph with edges labelled by positive numbers. But one could consider many other kinds of decorations: this idea is very general.

To formalize the idea of ‘a kind of decoration’, Brendan uses a functor

F: \mathrm{FinSet} \to \mathrm{Set}

sending each finite set N to a set of F(N). This set F(N) is the set of decorations of the given kind that we can put on N.

So, for any such functor F, a decorated cospan of finite sets is a cospan of finite sets:

together with an element of F(N).

But in fact, Brendan goes further. He’s not content to use a functor

F: \mathrm{FinSet} \to \mathrm{Set}

to decorate his cospans.

First, there’s no need to limit ourselves to cospans of finite sets: we can replace \mathrm{FinSet} with some other category! If C is any category with finite colimits, there’s a category \mathrm{Cospan}(C) with:

• objects of C as its objects,
• isomorphism classes of cospans between these as morphisms.

Second, there’s no need to limit ourselves to decorations that are elements of a set: we can replace \mathrm{Set} with some other category! If D is any symmetric monoidal category, we can define an element of an object d \in D to be a morphism

f: I \to d

where I is the unit for the tensor product in D.

So, Brendan defines decorated cospans at this high level of generality, and shows that under some mild conditions we can compose them, just as in the pictures we saw earlier.

Here’s one of the theorems Brendan proves:

Theorem. Suppose C is a category with finite colimits, and make C into a symmetric monoidal category with its coproduct as the tensor product. Suppose D is a symmetric monoidal category, and suppose F: C \to D is a lax symmetric monoidal functor. Define an F-decorated cospan to be a cospan

in C together with an element of F(N). Then there is a category with

• objects of C as its objects,
• isomorphism classes of F-decorated cospans as its morphisms.

This is called the F-decorated cospan category, FCospan. This category becomes symmetric monoidal in a natural way. It is then a dagger compact category.

(You may not know all this jargon, but ‘lax symmetric monoidal’, for example, talks about how we can take decorations on two things and get a decoration on their disjoint union, or ‘coproduct’. We need to be able to do this—as should be obvious from the pictures I drew. Also, a ‘dagger compact category’ is the kind of category whose morphisms can be drawn as networks.)

Brendan also explains how to get functors between decorated cospan categories. We need this in our paper on electrical circuits, because we consider several categories where morphisms is a circuit, or something that captures some aspect of a circuit. Most of these categories are decorated cospan categories. We want to get functors between them. And often we can just use Brendan’s general results to get the job done! No fuss, no muss: all the hard work has been done ahead of time.

I expect to use this technology a lot in my work on network theory.

A Compositional Framework for Passive Linear Networks

28 April, 2015

Here’s a new paper on network theory:

• John Baez and Brendan Fong, A compositional framework for passive linear networks.

While my paper with Jason Erbele studies signal flow diagrams, this one focuses on circuit diagrams. The two are different, but closely related.

I’ll explain their relation at the Turin workshop in May. For now, let me just talk about this paper with Brendan. There’s a lot in here, but let me just try to explain the main result. It’s all about ‘black boxing': hiding the details of a circuit and only remembering its behavior as seen from outside.

The idea

In late 1940s, just as Feynman was developing his diagrams for processes in particle physics, Eilenberg and Mac Lane initiated their work on category theory. Over the subsequent decades, and especially in the work of Joyal and Street in the 1980s, it became clear that these developments were profoundly linked: monoidal categories have a precise graphical representation in terms of string diagrams, and conversely monoidal categories provide an algebraic foundation for the intuitions behind Feynman diagrams. The key insight is the use of categories where morphisms describe physical processes, rather than structure-preserving maps between mathematical objects.

In work on fundamental physics, the cutting edge has moved from categories to higher categories. But the same techniques have filtered into more immediate applications, particularly in computation and quantum computation. Our paper is part of a new program of applying string diagrams to engineering, with the aim of giving diverse diagram languages a unified foundation based on category theory.

Indeed, even before physicists began using Feynman diagrams, various branches of engineering were using diagrams that in retrospect are closely related. Foremost among these are the ubiquitous electrical circuit diagrams. Although less well-known, similar diagrams are used to describe networks consisting of mechanical, hydraulic, thermodynamic and chemical systems. Further work, pioneered in particular by Forrester and Odum, applies similar diagrammatic methods to biology, ecology, and economics.

As discussed in detail by Olsen, Paynter and others, there are mathematically precise analogies between these different systems. In each case, the system’s state is described by variables that come in pairs, with one variable in each pair playing the role of ‘displacement’ and the other playing the role of ‘momentum’. In engineering, the time derivatives of these variables are sometimes called ‘flow’ and ‘effort’.

displacement:    q flow:      \dot q momentum:      p effort:           \dot p
Mechanics: translation position velocity momentum force
Mechanics: rotation angle angular velocity angular momentum torque
Electronics charge current flux linkage voltage
Hydraulics volume flow pressure momentum pressure
Thermal Physics entropy entropy flow temperature momentum temperature
Chemistry moles molar flow chemical momentum chemical potential

In classical mechanics, this pairing of variables is well understood using symplectic geometry. Thus, any mathematical formulation of the diagrams used to describe networks in engineering needs to take symplectic geometry as well as category theory into account.

While diagrams of networks have been independently introduced in many disciplines, we do not expect formalizing these diagrams to immediately help the practitioners of these disciplines. At first the flow of information will mainly go in the other direction: by translating ideas from these disciplines into the language of modern mathematics, we can provide mathematicians with food for thought and interesting new problems to solve. We hope that in the long run mathematicians can return the favor by bringing new insights to the table.

Although we keep the broad applicability of network diagrams in the back of our minds, our paper talks in terms of electrical circuits, for the sake of familiarity. We also consider a somewhat limited class of circuits. We only study circuits built from ‘passive’ components: that is, those that do not produce energy. Thus, we exclude batteries and current sources. We only consider components that respond linearly to an applied voltage. Thus, we exclude components such as nonlinear resistors or diodes. Finally, we only consider components with one input and one output, so that a circuit can be described as a graph with edges labeled by components. Thus, we also exclude transformers. The most familiar components our framework covers are linear resistors, capacitors and inductors.

While we want to expand our scope in future work, the class of circuits made from these components has appealing mathematical properties, and is worthy of deep study. Indeed, these circuits has been studied intensively for many decades by electrical engineers. Even circuits made exclusively of resistors have inspired work by mathematicians of the caliber of Weyl and Smale!

Our work relies on this research. All we are adding is an emphasis on symplectic geometry and an explicitly ‘compositional’ framework, which clarifies the way a larger circuit can be built from smaller pieces. This is where monoidal categories become important: the main operations for building circuits from pieces are composition and tensoring.

Our strategy is most easily illustrated for circuits made of linear resistors. Such a resistor dissipates power, turning useful energy into heat at a rate determined by the voltage across the resistor. However, a remarkable fact is that a circuit made of these resistors always acts to minimize the power dissipated this way. This ‘principle of minimum power’ can be seen as the reason symplectic geometry becomes important in understanding circuits made of resistors, just as the principle of least action leads to the role of symplectic geometry in classical mechanics.

Here is a circuit made of linear resistors:

The wiggly lines are resistors, and their resistances are written beside them: for example, 3\Omega means 3 ohms, an ‘ohm’ being a unit of resistance. To formalize this, define a circuit of linear resistors to consist of:

• a set N of nodes,
• a set E of edges,
• maps s,t : E \to N sending each edge to its source and target node,
• a map r: E \to (0,\infty) specifying the resistance of the resistor
labelling each edge,
• maps i : X \to N, o : Y \to N specifying the inputs and outputs of the circuit.

When we run electric current through such a circuit, each node n \in N gets a potential \phi(n). The voltage across an edge e \in E is defined as the change in potential as we move from to the source of e to its target, \phi(t(e)) - \phi(s(e)). The power dissipated by the resistor on this edge is then

\displaystyle{ \frac{1}{r(e)}\big(\phi(t(e))-\phi(s(e))\big)^2 }

The total power dissipated by the circuit is therefore twice

\displaystyle{ P(\phi) = \frac{1}{2}\sum_{e \in E} \frac{1}{r(e)}\big(\phi(t(e))-\phi(s(e))\big)^2 }

The factor of \frac{1}{2} is convenient in some later calculations.

Note that P is a nonnegative quadratic form on the vector space \mathbb{R}^N. However, not every nonnegative definite quadratic form on \mathbb{R}^N arises in this way from some circuit of linear resistors with N as its set of nodes. The quadratic forms that do arise are called Dirichlet forms. They have been extensively investigated, and they play a major role in our work.

We write

\partial N = i(X) \cup o(Y)

for the set of terminals: that is, nodes corresponding to inputs or outputs. The principle of minimum power says that if we fix the potential at the terminals, the circuit will choose the potential at other nodes to minimize the total power dissipated. An element \psi of the vector space \mathbb{R}^{\partial N} assigns a potential to each terminal. Thus, if we fix \psi, the total power dissipated will be twice

Q(\psi) = \min_{\substack{ \phi \in \mathbb{R}^N \\ \phi\vert_{\partial N} = \psi}} \; P(\phi)

The function Q : \mathbb{R}^{\partial N} \to \mathbb{R} is again a Dirichlet form. We call it the power functional of the circuit.

Now, suppose we are unable to see the internal workings of a circuit, and can only observe its ‘external behavior': that is, the potentials at its terminals and the currents flowing into or out of these terminals. Remarkably, this behavior is completely determined by the power functional Q. The reason is that the current at any terminal can be obtained by differentiating Q with respect to the potential at this terminal, and relations of this form are all the relations that hold between potentials and currents at the terminals.

The Laplace transform allows us to generalize this immediately to circuits that can also contain linear inductors and capacitors, simply by changing the field we work over, replacing \mathbb{R} by the field \mathbb{R}(s) of rational functions of a single real variable, and talking of impedance where we previously talked of resistance. We obtain a category \mathrm{Circ} where an object is a finite set, a morphism f : X \to Y is a circuit with input set X and output set Y, and composition is given by identifying the outputs of one circuit with the inputs of the next, and taking the resulting union of labelled graphs. Each such circuit gives rise to a Dirichlet form, now defined over \mathbb{R}(s), and this Dirichlet form completely describes the externally observable behavior of the circuit.

We can take equivalence classes of circuits, where two circuits count as the same if they have the same Dirichlet form. We wish for these equivalence classes of circuits to form a category. Although there is a notion of composition for Dirichlet forms, we find that it lacks identity morphisms or, equivalently, it lacks morphisms representing ideal wires of zero impedance. To address this we turn to Lagrangian subspaces of symplectic vector spaces. These generalize quadratic forms via the map

\Big(Q: \mathbb{F}^{\partial N} \to \mathbb{F}\Big) \longmapsto

\mathrm{Graph}(dQ) =    \{(\psi, dQ_\psi) \mid \psi \in \mathbb{F}^{\partial N} \} \; \subseteq \; \mathbb{F}^{\partial N} \oplus (\mathbb{F}^{\partial N})^\ast

taking a quadratic form Q on the vector space \mathbb{F}^{\partial N} over the field \mathbb{F} to the graph of its differential dQ. Here we think of the symplectic vector space \mathbb{F}^{\partial N} \oplus (\mathbb{F}^{\partial N})^\ast as the state space of the circuit, and the subspace \mathrm{Graph}(dQ) as the subspace of attainable states, with \psi \in \mathbb{F}^{\partial N} describing the potentials at the terminals, and dQ_\psi \in (\mathbb{F}^{\partial N})^\ast the currents.

This construction is well-known in classical mechanics, where the principle of least action plays a role analogous to that of the principle of minimum power here. The set of Lagrangian subspaces is actually an algebraic variety, the Lagrangian Grassmannian, which serves as a compactification of the space of quadratic forms. The Lagrangian Grassmannian has already played a role in Sabot’s work on circuits made of resistors. For us, its importance it that we can find identity morphisms for the composition of Dirichlet forms by taking circuits made of parallel resistors and letting their resistances tend to zero: the limit is not a Dirichlet form, but it exists in the Lagrangian Grassmannian.

Indeed, there exists a category \mathrm{LagrRel} with finite dimensional symplectic vector spaces as objects and Lagrangian relations as morphisms: that is, linear relations from V to W that are given by Lagrangian subspaces of \overline{V} \oplus W, where \overline{V} is the symplectic vector space conjugate to V—that is, with the sign of the symplectic structure switched.

To move from the Lagrangian subspace defined by the graph of the differential of the power functional to a morphism in the category \mathrm{LagrRel}—that is, to a Lagrangian relation— we must treat seriously the input and output functions of the circuit. These express the circuit as built upon a cospan:

Applicable far more broadly than this present formalization of circuits, cospans model systems with two ‘ends’, an input and output end, albeit without any connotation of directionality: we might just as well exchange the role of the inputs and outputs by taking the mirror image of the above diagram. The role of the input and output functions, as we have discussed, is to mark the terminals we may glue onto the terminals of another circuit, and the pushout of cospans gives formal precision to this gluing construction.

One upshot of this cospan framework is that we may consider circuits with elements of N that are both inputs and outputs, such as this one:

This corresponds to the identity morphism on the finite set with two elements. Another is that some points may be considered an input or output multiple times, like here:

This lets to connect two distinct outputs to the above double input.

Given a set X of inputs or outputs, we understand the electrical behavior on this set by considering the symplectic vector space \mathbb{F}^X \oplus {(\mathbb{F}^X)}^\ast, the direct sum of the space \mathbb{F}^X of potentials and the space {(\mathbb{F}^X)}^\ast of currents at these points. A Lagrangian relation specifies which states of the output space \mathbb{F}^Y \oplus {(\mathbb{F}^Y)}^\ast are allowed for each state of the input space \mathbb{F}^X \oplus {(\mathbb{F}^X)}^\ast. Turning the Lagrangian subspace \mathrm{Graph}(dQ) of a circuit into this information requires that we understand the ‘symplectification’

Sf: \mathbb{F}^B \oplus {(\mathbb{F}^B)}^\ast \to \mathbb{F}^A \oplus {(\mathbb{F}^A)}^\ast

and ‘twisted symplectification’

S^tf: \mathbb{F}^B \oplus {(\mathbb{F}^B)}^\ast \to \overline{\mathbb{F}^A \oplus {(\mathbb{F}^A)}^\ast}

of a function f: A \to B between finite sets. In particular we need to understand how these apply to the input and output functions with codomain restricted to \partial N; abusing notation, we also write these i: X \to \partial N and o: Y \to \partial N.

The symplectification Sf is a Lagrangian relation, and the catch phrase is that it ‘copies voltages’ and ‘splits currents’. More precisely, for any given potential-current pair (\psi,\iota) in \mathbb{F}^B \oplus {(\mathbb{F}^B)}^\ast, its image under Sf consists of all elements of (\psi', \iota') in \mathbb{F}^A \oplus {(\mathbb{F}^A)}^\ast such that the potential at a \in A is equal to the potential at f(a) \in B, and such that, for each fixed b \in B, collectively the currents at the a \in f^{-1}(b) sum to the current at b. We use the symplectification So of the output function to relate the state on \partial N to that on the outputs Y.

As our current framework is set up to report the current out of each node, to describe input currents we define the twisted symplectification:

S^tf: \mathbb{F}^B \oplus {(\mathbb{F}^B)}^\ast \to \overline{\mathbb{F}^A \oplus {(\mathbb{F}^A)}^\ast}

almost identically to the above, except that we flip the sign of the currents \iota' \in (\mathbb{F}^A)^\ast. This again gives a Lagrangian relation. We use the twisted symplectification S^ti of the input function to relate the state on \partial N to that on the inputs.

The Lagrangian relation corresponding to a circuit then comprises exactly a list of the potential-current pairs that are possible electrical states of the inputs and outputs of the circuit. In doing so, it identifies distinct circuits. A simple example of this is the identification of a single 2-ohm resistor:

with two 1-ohm resistors in series:

Our inability to access the internal workings of a circuit in this representation inspires us to call this process black boxing: you should imagine encasing the circuit in an opaque black box, leaving only the terminals accessible. Fortunately, this information is enough to completely characterize the external behavior of a circuit, including how it interacts when connected with other circuits!

Put more precisely, the black boxing process is functorial: we can compute the black-boxed version of a circuit made of parts by computing the black-boxed versions of the parts and then composing them. In fact we shall prove that \mathrm{Circ} and \mathrm{LagrRel} are dagger compact categories, and the black box functor preserves all this extra structure:

Theorem. There exists a symmetric monoidal dagger functor, the black box functor

\blacksquare: \mathrm{Circ} \to \mathrm{LagrRel}

mapping a finite set X to the symplectic vector space \mathbb{F}^X \oplus (\mathbb{F}^X)^\ast it generates, and a circuit \big((N,E,s,t,r),i,o\big) to the Lagrangian relation

\bigcup_{v \in \mathrm{Graph}(dQ)} S^ti(v) \times So(v)      \subseteq \overline{\mathbb{F}^X \oplus (\mathbb{F}^X)^\ast} \oplus \mathbb{F}^Y \oplus (\mathbb{F}^Y)^\ast

where Q is the circuit’s power functional.

The goal of this paper is to prove and explain this result. The proof is more tricky than one might first expect, but our approach involves concepts that should be useful throughout the study of networks, such as ‘decorated cospans’ and ‘corelations’.

Give it a read, and let us know if you have questions or find mistakes!


Get every new post delivered to your Inbox.

Join 3,042 other followers