Here are the slides of the talk I’m giving on Monday to kick off the Categorical Foundations of Network Theory workshop in Turin:
This is a long talk, starting with the reasons I care about this subject, and working into the details of one particular project: the categorical foundations of networks as applied to electrical engineering and control theory. There are lots of links in blue; click on them for more details!
• John Baez, Information and entropy in biological systems.
Abstract. Information and entropy are being used in biology in many different ways: for example, to study biological communication systems, the ‘action-perception loop’, the thermodynamic foundations of biology, the structure of ecosystems, measures of biodiversity, and evolution. Can we unify these? To do this, we must learn to talk to each other. This will be easier if we share some basic concepts which I’ll sketch here.
The talk is full of links, in blue. If you click on these you can get more details. You can also watch a video of my talk:
So, let’s dive into Chris Lee’s exciting ideas about organisms as ‘information evolving machines’ that may provide ‘disinformation’ to their competitors. Near the end of his talk, he discusses some new results on an ever-popular topic: the Prisoner’s Dilemma. You may know about this classic book:
• Robert Axelrod, The Evolution of Cooperation, Basic Books, New York, 1984. Some passages available free online.
If you don’t, read it now! Axelrod showed that the simple ‘tit for tat’ strategy did very well in some experiments where the game was played repeatedly and strategies that did well got to ‘reproduce’ themselves. This result was very exciting, so a lot of people have done research on it. More recently a paper on this subject by William Press and Freeman Dyson received a lot of hype. I think this is a good place to learn about that:
• Mike Shulman, Zero determinant strategies in the iterated Prisoner’s Dilemma, The n-Category Café, 19 July 2012.
Chris Lee’s new work on the Prisoner’s Dilemma is here, cowritten with two other people who attended the workshop:
• The art of war: beyond memory-one strategies in population games, PLOS One, 24 March 2015.
Abstract. We show that the history of play in a population game contains exploitable information that can be successfully used by sophisticated strategies to defeat memory-one opponents, including zero determinant strategies. The history allows a player to label opponents by their strategies, enabling a player to determine the population distribution and to act differentially based on the opponent’s strategy in each pairwise interaction. For the Prisoner’s Dilemma, these advantages lead to the natural formation of cooperative coalitions among similarly behaving players and eventually to unilateral defection against opposing player types. We show analytically and empirically that optimal play in population games depends strongly on the population distribution. For example, the optimal strategy for a minority player type against a resident tit-for-tat (TFT) population is ‘always cooperate’ (ALLC), while for a majority player type the optimal strategy versus TFT players is ‘always defect’ (ALLD). Such behaviors are not accessible to memory-one strategies. Drawing inspiration from Sun Tzu’s the Art of War, we implemented a non-memory-one strategy for population games based on techniques from machine learning and statistical inference that can exploit the history of play in this manner. Via simulation we find that this strategy is essentially uninvadable and can successfully invade (significantly more likely than a neutral mutant) essentially all known memory-one strategies for the Prisoner’s Dilemma, including ALLC (always cooperate), ALLD (always defect), tit-for-tat (TFT), win-stay-lose-shift (WSLS), and zero determinant (ZD) strategies, including extortionate and generous strategies.
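To fix intuitions, here is a minimal iterated Prisoner’s Dilemma in code (my own sketch, not the paper’s implementation), with the standard payoff matrix and three of the memory-one strategies mentioned in the abstract:

```python
# Not the paper's code: a minimal iterated Prisoner's Dilemma with the
# standard payoffs, enough to compare a few classic strategies.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tft(my_hist, their_hist):           # tit for tat
    return their_hist[-1] if their_hist else 'C'

def allc(my_hist, their_hist):          # always cooperate
    return 'C'

def alld(my_hist, their_hist):          # always defect
    return 'D'

def play(s1, s2, rounds=100):
    h1, h2 = [], []
    score1 = score2 = 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        score1 += p1; score2 += p2
    return score1, score2

# tit for tat loses a little to always-defect head-to-head...
print(play(tft, alld))   # (99, 104)
# ...but two tit-for-tats cooperate throughout:
print(play(tft, tft))    # (300, 300)
```

In a population game one would run a round-robin and let high scorers reproduce; the paper’s point is that strategies remembering more than the last move can exploit that richer setting.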
And now for the talk! Click on the talk title here for Chris Lee’s slides, or go down and watch the video:
Abstract. Information theory is an intuitively attractive way of thinking about biological evolution, because it seems to capture a core aspect of biology—life as a solution to “information problems”—in a fundamental way. However, there are non-trivial questions about how to apply that idea, and whether it has actual predictive value. For example, should we think of biological systems as being actually driven by an information metric? One idea that can draw useful links between information theory, evolution and statistical inference is the definition of an information evolving machine (IEM) as a system whose elements represent distinct predictions, and whose weights represent an information (prediction power) metric, typically as a function of sampling some iterative observation process. I first show how this idea provides useful results for describing a statistical inference process, including its maximum entropy bound for optimal inference, and how its sampling-based metrics (“empirical information”, I_{e}, for prediction power; and “potential information”, I_{p}, for latent prediction power) relate to classical definitions such as mutual information and relative entropy. These results suggest classification of IEMs into several distinct types:
1. I_{e} machine: e.g. a population of competing genotypes evolving under selection and mutation is an IEM that computes an I_{e} equivalent to fitness, and whose gradient (I_{p}) acts strictly locally, on mutations that it actually samples. Its transition rates between steady states will decrease exponentially as a function of evolutionary distance.
2. “I_{p} tunneling” machine: a statistical inference process summing over a population of models to compute both I_{e} and I_{p}. It can directly detect “latent” information in the observations (not captured by its model), which it can follow to “tunnel” rapidly to a new steady state.
3. disinformation machine (multiscale IEM): an ecosystem of species is an IEM whose elements (species) are themselves IEMs that can interact. When an attacker IEM can reduce a target IEM’s prediction power (I_{e}) by sending it a misleading signal, this “disinformation dynamic” can alter the evolutionary landscape in interesting ways, by opening up paths for rapid co-evolution to distant steady-states. This is especially true when the disinformation attack targets a feature of high fitness value, yielding a combination of strong negative selection for retention of the target feature, plus strong positive selection for escaping the disinformation attack. I will illustrate with examples from statistical inference and evolutionary game theory. These concepts, though basic, may provide useful connections between diverse themes in the workshop.
For example, classical mechanics is the study of what things do when they follow Newton’s laws. Control theory is the study of what you can get them to do.
Say you have an upside-down pendulum on a cart. Classical mechanics says what it will do. But control theory says: if you watch the pendulum and use what you see to move the cart back and forth correctly, you can make sure the pendulum doesn’t fall over!
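Here is a toy version of that idea in code (an illustration only: normalized units, direct torque control instead of a cart, and gains chosen arbitrarily). Near the upright position the pendulum linearizes to θ′′ = θ + u, so with no control the angle grows without bound, while a simple feedback law brings it back toward zero:

```python
# Toy illustration only: near the upright position, a pendulum with
# direct torque control linearizes (in normalized units) to
# theta'' = theta + u.  With u = 0 the angle blows up; with the
# feedback law u = -2*theta - 2*omega it settles back toward zero.
# The gains are an arbitrary stabilizing choice, not a real design.

def simulate(feedback, theta0=0.2, dt=0.001, steps=5000):
    theta, omega = theta0, 0.0          # angle and angular velocity
    for _ in range(steps):
        u = feedback(theta, omega)      # control input
        alpha = theta + u               # linearized dynamics
        theta += dt * omega
        omega += dt * alpha
    return theta

no_control = simulate(lambda th, om: 0.0)       # diverges
controlled = simulate(lambda th, om: -2 * th - 2 * om)  # settles
```

Watching the state and feeding it back into the input is exactly what the signal-flow diagrams below are diagrams of.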
Control theorists do their work with the help of ‘signal-flow diagrams’. For example, here is the signal-flow diagram for an inverted pendulum on a cart:
When I take a look at a diagram like this, I say to myself: that’s a string diagram for a morphism in a monoidal category! And it’s true. Jason Erbele wrote a paper explaining this. Independently, Bonchi, Sobociński and Zanasi did some closely related work:
• John Baez and Jason Erbele, Categories in control.
• Filippo Bonchi, Paweł Sobociński and Fabio Zanasi, Interacting Hopf algebras.
• Filippo Bonchi, Paweł Sobociński and Fabio Zanasi, A categorical semantics of signal flow graphs.
I’ll explain some of the ideas at the Turin meeting on the categorical foundations of network theory. But I also want to talk about this new paper that Simon Wadsley of Cambridge University wrote with my student Nick Woods:
• Simon Wadsley and Nick Woods, PROPs for linear systems.
This makes the picture neater and more general!
You see, Jason and I used signal flow diagrams to give a new description of the category of finite-dimensional vector spaces and linear maps. This category plays a big role in the control theory of linear systems. Bonchi, Sobociński and Zanasi gave a closely related description of an equivalent category, Mat(k), where:
• objects are natural numbers, and
• a morphism f: m → n is an n×m matrix with entries in the field k,
and composition is given by matrix multiplication.
But Wadsley and Woods generalized all this work to cover Mat(R) whenever R is a commutative rig. A rig is a ‘ring without negatives’—like the natural numbers. We can multiply matrices valued in any rig, and this includes some very useful examples… as I’ll explain later.
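Since matrix multiplication uses only addition and multiplication, it makes sense over any rig; here is a small sketch (function names mine) multiplying matrices over the natural numbers and over the booleans:

```python
# Sketch: matrix multiplication over any commutative rig, given its
# addition, multiplication and additive unit.  The same routine works
# for the natural numbers and for the booleans alike.

def mat_mul(A, B, add, mul, zero):
    n, k, m = len(A), len(B), len(B[0])
    C = [[zero for _ in range(m)] for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = zero
            for t in range(k):
                acc = add(acc, mul(A[i][t], B[t][j]))
            C[i][j] = acc
    return C

# over the rig of natural numbers:
nat = mat_mul([[1, 2]], [[3], [4]],
              lambda x, y: x + y, lambda x, y: x * y, 0)
# over the rig of booleans ('or' as addition, 'and' as multiplication):
boo = mat_mul([[True, False]], [[False], [True]],
              lambda x, y: x or y, lambda x, y: x and y, False)
```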
Wadsley and Woods proved:
Theorem. Whenever R is a commutative rig, Mat(R) is the PROP for bicommutative bimonoids over R.
This result is quick to state, but it takes a bit of explaining! So, let me start by bringing in some definitions.
We will work in any symmetric monoidal category, and draw morphisms as string diagrams.
A commutative monoid is an object x equipped with a multiplication:
and a unit:
obeying these laws:
For example, suppose FinVect_{k} is the symmetric monoidal category of finite-dimensional vector spaces over a field k, with direct sum as its tensor product. Then any object V is a commutative monoid where the multiplication is addition:
and the unit is zero: that is, the unique map from the zero-dimensional vector space to V.
Turning all this upside down, a cocommutative comonoid is an object equipped with a comultiplication:
and a counit:
obeying these laws:
For example, consider our vector space V again. It’s a cocommutative comonoid where the comultiplication is duplication:
and the counit is deletion: that is, the unique map from V to the zero-dimensional vector space.
Given an object that’s both a commutative monoid and a cocommutative comonoid, we say it’s a bicommutative bimonoid if these extra axioms hold:
You can check that these are true for our running example of a finite-dimensional vector space V. The most exciting one is the top one, which says that adding two vectors and then duplicating the result is the same as duplicating each one, then adding them appropriately.
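That top law can be checked concretely for real vectors; a tiny numerical sketch (mine, not from any of the papers):

```python
# Numerical check of the bimonoid law for vectors over the reals:
# adding two vectors and then duplicating the result equals
# duplicating each vector and then adding the matching copies.

def add(v, w):
    return [x + y for x, y in zip(v, w)]

def duplicate(v):
    return (v, v)

v, w = [1.0, 2.0], [3.0, 4.0]

lhs = duplicate(add(v, w))                       # add, then duplicate
(v1, v2), (w1, w2) = duplicate(v), duplicate(w)  # duplicate...
rhs = (add(v1, w1), add(v2, w2))                 # ...then add copies
```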
Our example has some other properties, too! Each element c of the field k defines a morphism from V to itself, namely scalar multiplication by c.
We draw this as follows:
These morphisms are compatible with the ones so far:
Moreover, all the ‘rig operations’ in k—that is, addition, multiplication, 0 and 1, but not subtraction or division—can be recovered from what we have so far:
We summarize this by saying our vector space V is a bicommutative bimonoid ‘over k’.
More generally, suppose we have a bicommutative bimonoid x in a symmetric monoidal category, and let End(x) be the set of bicommutative bimonoid homomorphisms from x to itself. This is actually a rig: there’s a way to add these homomorphisms, and also a way to ‘multiply’ them (namely, compose them).
Suppose R is any commutative rig. Then we say x is a bicommutative bimonoid over R if it’s equipped with a rig homomorphism from R to this rig of bimonoid endomorphisms.
This is a way of summarizing the diagrams I just showed you! You see, each c in R gives a morphism from our bimonoid to itself, which we write as
The fact that this is a bicommutative bimonoid endomorphism says precisely this:
And the fact that this assignment is a rig homomorphism says precisely this:
So sometimes the right word is worth a dozen pictures!
What Jason and I showed is that for any field k, the category FinVect_{k} is the free symmetric monoidal category on a bicommutative bimonoid over k. This means that the above rules, which are rules for manipulating signal flow diagrams, completely characterize the world of linear algebra!
Bonchi, Sobociński and Zanasi used ‘PROPs’ to prove a similar result where the field k is replaced by a sufficiently nice commutative ring. And Wadsley and Woods used PROPs to generalize even further to the case of an arbitrary commutative rig!
But what are PROPs?
A PROP is a particularly tractable sort of symmetric monoidal category: a strict symmetric monoidal category where the objects are natural numbers and the tensor product of objects is given by ordinary addition. The symmetric monoidal category FinVect_{k} is equivalent to the PROP Mat(k), where a morphism f: m → n is an n×m matrix with entries in k, composition of morphisms is given by matrix multiplication, and the tensor product of morphisms is the direct sum of matrices.
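For concreteness, here is a small sketch (my own) of the tensor product of morphisms in this PROP: the direct sum of matrices as block-diagonal juxtaposition.

```python
# Sketch: in Mat(k), composition is matrix multiplication and the
# tensor product of morphisms is the direct sum of matrices,
# i.e. block-diagonal juxtaposition.

def direct_sum(A, B):
    m1 = len(A[0]) if A else 0          # columns of A
    m2 = len(B[0]) if B else 0          # columns of B
    top = [row + [0] * m2 for row in A]
    bottom = [[0] * m1 + row for row in B]
    return top + bottom

# a 1x2 matrix 'tensored' with a 2x1 matrix is a 3x3 matrix:
AB = direct_sum([[1, 2]], [[3], [4]])
# [[1, 2, 0],
#  [0, 0, 3],
#  [0, 0, 4]]
```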
We can define a similar PROP Mat(R) whenever R is a commutative rig, and Wadsley and Woods gave an elegant description of the ‘algebras’ of Mat(R). Suppose C is a PROP and D is a strict symmetric monoidal category. Then the category of algebras of C in D is the category of strict symmetric monoidal functors F: C → D and natural transformations between these.
If for every choice of D the category of algebras of C in D is equivalent to the category of algebraic structures of some kind in D, we say C is the PROP for structures of that kind. This explains the theorem Wadsley and Woods proved:
Theorem. Whenever R is a commutative rig, Mat(R) is the PROP for bicommutative bimonoids over R.
The fact that an algebra of Mat(R) is a bicommutative bimonoid is equivalent to all this stuff:
The fact that multiplication by each c in R is a bimonoid homomorphism is equivalent to this stuff:
And the fact that the map sending each c in R to this morphism is a rig homomorphism is equivalent to this stuff:
This is a great result because it includes some nice new examples.
First, the commutative rig of natural numbers gives a PROP Mat(N). This is equivalent to the symmetric monoidal category FinSpan, where morphisms are isomorphism classes of spans of finite sets, with disjoint union as the tensor product. Steve Lack had already shown that FinSpan is the PROP for bicommutative bimonoids. But this also follows from the result of Wadsley and Woods, since every bicommutative bimonoid is automatically equipped with a unique rig homomorphism from N to its rig of bimonoid endomorphisms.
Second, the commutative rig of booleans B, with ‘or’ as addition and ‘and’ as multiplication, gives a PROP Mat(B). This is equivalent to the symmetric monoidal category FinRel, where morphisms are relations between finite sets, with disjoint union as the tensor product. Samuel Mimram had already shown that this is the PROP for special bicommutative bimonoids, meaning those where comultiplication followed by multiplication is the identity:
But again, this follows from the general result of Wadsley and Woods!
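The boolean case is concrete enough to check by machine: under the correspondence between boolean matrices and relations, matrix multiplication becomes relation composition. A small sketch (names mine):

```python
# Matrices over the booleans correspond to relations between finite
# sets, and boolean matrix multiplication is relation composition.
# A quick check of that correspondence.

def compose_relations(R, S):
    # R and S are relations given as sets of pairs
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

def to_matrix(R, nx, ny):
    # rows indexed by the source set, columns by the target set
    return [[(x, y) in R for y in range(ny)] for x in range(nx)]

def bool_mat_mul(A, B):
    return [[any(A[i][k] and B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

R = {(0, 1)}             # a relation from {0,1} to {0,1}
S = {(1, 0), (1, 1)}     # another such relation
lhs = to_matrix(compose_relations(R, S), 2, 2)
rhs = bool_mat_mul(to_matrix(R, 2, 2), to_matrix(S, 2, 2))
```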
Finally, taking the commutative ring of integers Z, Wadsley and Woods showed that Mat(Z) is the PROP for bicommutative Hopf monoids. The key here is that scalar multiplication by −1 obeys the axioms for an antipode—the extra morphism that makes a bimonoid into a Hopf monoid. Here are those axioms:
More generally, whenever R is a commutative ring, the presence of −1 in R guarantees that a bimonoid over R is automatically a Hopf monoid over R. So, when R is a commutative ring, Wadsley and Woods’ result implies that Mat(R) is the PROP for Hopf monoids over R.
Earlier, in their paper on ‘interacting Hopf algebras’, Bonchi, Sobociński and Zanasi had given an elegant and very different proof that Mat(R) is the PROP for Hopf monoids over R whenever R is a principal ideal domain. The advantage of their argument is that they build up the PROP for Hopf monoids over R from smaller pieces, using some ideas developed by Steve Lack. But the new argument by Wadsley and Woods has its own charm.
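The antipode axiom is concrete enough to check by hand for vectors over the integers: comultiplication is duplication, the antipode is scalar multiplication by −1, and multiplication is addition, so the composite sends every vector to zero, matching the counit followed by the unit. A tiny numerical sketch (my own):

```python
# Checking the antipode axiom for vectors over the integers:
# duplicate a vector, multiply one copy by the scalar -1, then add
# the two copies.  The composite is the zero map, i.e. the counit
# (deletion) followed by the unit (zero).
v = [5, -3, 2]
result = [(-x) + x for x in v]   # add after negating one copy
zero_map = [0] * len(v)          # what the unit-after-counit gives
```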
In short, we’re getting the diagrammatics of linear algebra worked out very nicely, providing a solid mathematical foundation for signal flow diagrams in control theory!
At least, that’s what preliminary data from the International Energy Agency say. It seems the big difference is China. The Chinese made more electricity from renewable sources, such as hydropower, solar and wind, and burned less coal.
In fact, a report by Greenpeace says that from April 2014 to April 2015, China’s carbon emissions dropped by an amount equal to the entire carbon emissions of the United Kingdom!
I want to check this, because it would be wonderful if true: a 5% drop. They say that if this trend continues, China will close out 2015 with the biggest reduction in CO_{2} emissions ever recorded by a single country.
The International Energy Agency also credits Europe’s improved attempts to cut carbon emissions for the turnaround. In the US, carbon emissions have basically been dropping since 2006—with a big drop in 2009 due to the economic collapse, a partial bounce-back in 2010, but a general downward trend.
In the last 40 years, there have only been three times in which emissions stood still or fell compared to the previous year, all during global economic crises: the early 1980s, 1992, and 2009. In 2014, however, the global economy expanded by 3%.
So, the tide may be turning! But please remember: while carbon emissions may start dropping, they’re still huge. The amount of CO_{2} in the air shot above 400 parts per million in March this year. As Erika Podest of NASA put it:
CO2 concentrations haven’t been this high in millions of years. Even more alarming is the rate of increase in the last five decades and the fact that CO2 stays in the atmosphere for hundreds or thousands of years. This milestone is a wake up call that our actions in response to climate change need to match the persistent rise in CO2. Climate change is a threat to life on Earth and we can no longer afford to be spectators.
Here is the announcement by the International Energy Agency:
• Global energy-related emissions of carbon dioxide stalled in 2014, IEA, 13 March 2015.
Their full report on this subject will come out on 15 June 2015. Here is the report by Greenpeace EnergyDesk:
• China coal use falls: CO_{2} reduction this year could equal UK total emissions over same period, Greenpeace EnergyDesk.
I trust them less than the IEA when it comes to using statistics correctly, but it should be possible for someone to check their claims.
Hugo Nava-Kopp and I have a new paper on resource theories:
• Brendan Fong and Hugo Nava-Kopp, Additive monotones for resource theories of parallel-combinable processes with discarding.
A mathematical theory of resources is Tobias Fritz’s current big project. He’s already explained how ordered monoids can be viewed as theories of resource convertibility in a three-part series on this blog.
Ordered monoids are great, and quite familiar in network theory: for example, a Petri net can be viewed as a presentation for an ordered commutative monoid. But this line of work started with symmetric monoidal categories, in a paper Fritz wrote together with my (Oxford) supervisor Bob Coecke and Rob Spekkens.
The main idea is this: think of the objects of your symmetric monoidal category as resources, and your morphisms as ways to convert one resource into another. The monoidal product or ‘tensor product’ in your category allows you to talk about collections of your resources. So, for example, in the resource theory of chemical reactions, our objects are molecules like oxygen O_{2}, hydrogen H_{2}, and water H_{2}O, and our morphisms are processes like the electrolysis of water:
This is a categorification of the ordered commutative monoid of resource convertibility: we now keep track of how we convert resources into one another, instead of just whether we can convert them.
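In miniature, the ordered commutative monoid of resource convertibility is just multiset rewriting; here is a toy sketch (the molecule names and the single rule are only for illustration):

```python
# Toy sketch of resource convertibility as multiset rewriting:
# resources are multisets of molecule names, and a reaction rule
# converts one multiset into another whenever you hold the inputs.
from collections import Counter

# electrolysis of water: 2 H2O -> 2 H2 + O2
RULES = [(Counter({'H2O': 2}), Counter({'H2': 2, 'O2': 1}))]

def step(resources):
    """Multisets reachable from `resources` by one reaction."""
    out = []
    for inputs, outputs in RULES:
        if all(resources[m] >= n for m, n in inputs.items()):
            out.append(resources - inputs + outputs)
    return out

have = Counter({'H2O': 4})
after_one = step(have)[0]   # two waters split, two remain
```

The categorified version would also record which rule was applied, not just which multisets are reachable.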
Categorically, I find the other direction easier to state: being a category, the resource theory is enriched over Set, while a poset is enriched over the poset of truth values or ‘booleans’ B. If we ‘partially decategorify’ by changing the base of enrichment along the functor from Set to B that maps the empty set to 0 and any nonempty set to 1, we obtain the ordered monoid corresponding to the resource theory.
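That change of base is easy to illustrate: we forget how we can convert resources and remember only whether we can. (The resource names and conversion names below are made up.)

```python
# 'Partial decategorification' in miniature: replace each set of
# conversion processes by 0 or 1 according to whether it is nonempty.
# All names here are hypothetical, purely for illustration.
conversions = {
    ('2 H2O', '2 H2 + O2'): {'electrolysis'},
    ('2 H2 + O2', '2 H2O'): {'combustion', 'fuel cell'},
    ('H2', 'O2'): set(),
}

# change of base along Set -> B: empty set to 0, nonempty set to 1
convertible = {pair: bool(ways) for pair, ways in conversions.items()}
```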
But the research programme didn’t start at resource theories either. The starting point was ‘partitioned process theories’.
Here’s an example that guided the definitions. Suppose we have a bunch of labs with interacting quantum systems, separated in space. With enough cooperation and funding, they can do big joint operations on their systems, like create entangled pairs between two locations. For ‘free’, however, they’re limited to classical communication between the locations, although they can do the full range of quantum operations on their local system. So you’ve got a symmetric monoidal category with objects quantum systems and morphisms quantum operations, together with a wide (all-object-including) symmetric monoidal subcategory that contains the morphisms you can do with local quantum operations and classical communication (known as LOCC operations).
This general structure, a symmetric monoidal category (or SMC for short) together with a wide symmetric monoidal subcategory, is called a partitioned process theory. We call the morphisms in the SMC processes, and those in the subSMC free processes.
There are a number of methods for building a resource theory (i.e. an SMC) from a partitioned process theory. The unifying idea, though, is that your new SMC has the processes as objects, and its morphisms from a process f to a process g are ways of using the free processes to build g from f.
But we don’t have to go to fancy sounding quantum situations to find examples of partitioned process theories. Instead, just look at any SMC in which each object is equipped with an algebraic structure. Then the morphisms defining this structure can be taken as our ‘free’ processes.
For example, in a multigraph category every object has the structure of a ‘special commutative Frobenius algebra’. That’s a bit of a mouthful, but John defined it a while back, and examples include categories where morphisms are electrical circuits, and categories where morphisms are signal flow diagrams.
So these categories give partitioned process theories! This idea of partitioning the morphisms into ‘free’ ones and ‘costly’ ones is reminiscent of what I was saying earlier about the operad of wiring diagrams: it’s useful to separate behavioural structure from interconnection structure.
This suggests that we can also view the free processes as generating some sort of operad, that describes the ways we allow ourselves to use free processes to turn processes into other processes. If we really want to roll a big machine out to play with this stuff, framed bicategories may also be interesting; Spivak is already using them to get at questions about operads. But that’s all conjecture, and a bit of a digression.
To get back to the point, this was all just to say that if you find yourself with a bunch of resistors, and you ask ‘what can I build?’, then you’re after the resource theory apparatus.
You can read more about this stuff here:
• Bob Coecke, Tobias Fritz and Rob W. Spekkens, A mathematical theory of resources.
• Tobias Fritz, The mathematical structure of theories of resource convertibility I.
We’re getting ready for the Turin workshop on the Categorical Foundations of Network Theory. So, we’re trying to get our thoughts in order.
Last time we talked about understanding types of networks as categories of decorated cospans. Earlier, David Spivak told us about understanding networks as algebras of an operad. Both these frameworks aim at capturing notions of modularity and interconnection. Are they then related? How?
In this post we want to discuss some similarities between decorated cospan categories and algebras for Spivak’s operad of wiring diagrams. The main idea is that the two approaches are ‘essentially’ equivalent, but that compared to decorated cospans, Spivak’s operad formalism puts greater emphasis on the distinction between the ‘duplication’ and ‘deletion’ morphisms and other morphisms in our category.
The precise details are still to be worked out—jump in and help us!
We begin with a bit about operads in general. Recall that an operad is similar to a category, except that instead of a set of morphisms from an object x to an object y, you have a set of operations from a finite list of objects x_{1}, …, x_{n} to an object y. If we have such an operation f, we call x_{1}, …, x_{n} the inputs of f and call y the output of f.
We can compose operations in an operad. To understand how, it’s easiest to use pictures. We draw an operation f as a little box with wires coming in and one wire coming out:
The input wires should be labelled with the objects x_{1}, …, x_{n} and the output wire with the object y, but I haven’t done this.
We are allowed to compose these operations as follows:
as long as the outputs of the operations g_{1}, …, g_{n} match the inputs of the operation f. The result is a new operation which we call f(g_{1}, …, g_{n}).
We demand that there be unary operations serving as identities for composition, and we impose an associative law that makes a composite of composites like this well-defined:
So far this is the definition of an operad without permutations. In a full-fledged permutative operad, we can also permute the inputs of an operation and get a new operation:
which we call fσ if σ is the permutation of the inputs. We demand that (fσ)σ′ = f(σσ′). And finally, we demand that permutations act in a way that is compatible with composition. For example:
Here we see that the composite with permuted inputs is equal to some obvious other thing.
Finally, there is a law saying
f(g_{1}σ_{1}, …, g_{n}σ_{n}) = f(g_{1}, …, g_{n})σ
for some choice of σ that you can cook up from the permutations σ_{1}, …, σ_{n} in an obvious way. We leave it as an exercise to work out the details. By the way, one well-known book on operads accidentally omits this law, so here’s a rather more lengthy exercise: read this book, see which theorems require this law, and correct their proofs!
Operads are similar to symmetric monoidal categories. The idea is that in a symmetric monoidal category you can just form the tensor product x_{1} ⊗ ⋯ ⊗ x_{n} and talk about the set of morphisms from x_{1} ⊗ ⋯ ⊗ x_{n} to y. Indeed any symmetric monoidal category C gives an operad Op(C) in this way: just define the set of operations from x_{1}, …, x_{n} to y to be hom(x_{1} ⊗ ⋯ ⊗ x_{n}, y). If we do this with Set, which is a symmetric monoidal category using the usual cartesian product of sets, we get an operad called Op(Set).
An algebra for an operad O is an operad homomorphism from O to Op(Set). We haven’t said what an operad homomorphism is, but you can probably figure it out yourself. The point is this: an algebra for O turns the abstract operations in O into actual operations on sets!
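In Op(Set) all of this is very concrete: an operation with several inputs is just a function of several arguments, and operadic composition plugs functions into a function. A minimal sketch (my own; it reads arities off the functions, so it only handles plain positional arguments):

```python
# Operadic composition in Op(Set): feed the concatenated inputs of
# g_1, ..., g_n to the g_i slice by slice, then apply f to their
# outputs.  Only works for functions with plain positional arguments.

def compose(f, *gs):
    """The operadic composite f(g_1, ..., g_n)."""
    arities = [g.__code__.co_argcount for g in gs]
    def h(*args):
        outs, i = [], 0
        for g, k in zip(gs, arities):
            outs.append(g(*args[i:i + k]))
            i += k
        return f(*outs)
    return h

add = lambda x, y: x + y
mul = lambda x, y: x * y
h = compose(add, mul, mul)   # h(a, b, c, d) = a*b + c*d
```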
Finally, we should warn you that operads come in several flavors, and we’ve been talking about ‘typed permutative operads’. ‘Typed’ means that there’s more than one object; ‘permutative’ means that we have the ability to permute the input wires. When people say ‘operad’, they often mean an untyped permutative operad. For that, just specialize down to the case where there’s only one object
You can see a fully precise definition of untyped permutative operads here:
• Operad theory, Wikipedia.
along with the definition of an untyped operad without permutations.
Spivak’s favorite operad is the operad of wiring diagrams, W. This is an operad version of Cospan(FinSet), constructed in the vein suggested above: the objects are finite sets, and an operation from a list of sets X_{1}, …, X_{n} to a set Y is a cospan X_{1} + ⋯ + X_{n} → S ← Y.
Spivak draws such a thing as a big circle with small circles cut out from the interior:
The outside of the big circle has a set of terminals marked on it, and each small circle has a set of terminals marked on it. Then in the interior of this shape there are wires connecting these terminals. This is what he calls a wiring diagram.
You compose these wiring diagrams by pasting other wiring diagrams into each of the small circles.
The relationship with our Frobenius monoid diagrams is pretty simple: we draw our ‘wiring diagrams’ in a square, with the input terminals on the left and the output terminals on the right. To get a Spivak-approved wiring diagram, glue the top and bottom edges of this square together, then flatten the resulting cylinder down into an annulus, with the input side on the inside and the output side on the outside. Then imagine gluing opposite edges of the inside circle together to divide it into two small circles, and so on.
Algebras for wiring diagrams tell you what components you have available to wire together with your diagrams. An algebra for the operad of wiring diagrams is an operad homomorphism from W to Op(Set).
What does this look like? Just like a functor for categories, it assigns to each finite set a set, and to each wiring diagram a function.
In work related to decorated cospans (such as our paper on circuits or John and Jason’s work on signal flow diagrams), our semantics usually is constructed from a field of values—not a physicist’s ‘field’, but an algebraist’s sort of ‘field’, where you can add, multiply, subtract and divide. For example, we like being able to assign a real number like a velocity, or potential, or current to a variable. This gives us vector spaces and a bunch of nice linear-algebraic structures.
Spivak works more generally: he’s interested in the structure when you just have a set of values. While this means we can’t do some of the nice things we could do with a field, it also means this framework can do things like talk about logic gates, where the variables are boolean ones, or number theoretic questions, where you’re interested in the natural numbers.
So to discuss semantics we pick a set A of values, such as the real numbers or natural numbers or booleans or colors. We imagine then associating elements of this set to each wire in a wiring diagram. More technically, the algebra
then maps each finite set X to the power set of the set of functions from X to A.
On the morphisms (the wiring diagrams themselves), this functor behaves as follows. Note that a function from X to A can be thought of as an ‘X-vector’ of ‘A-coordinates’. A wiring diagram is just a cospan X → S ← Y
in FinSet, so it can be thought of as some compares (along the leg from X into the apex S)
followed by some copies (along the leg from Y into S, read backwards).
Thus, given a wiring diagram we can consider a partial function that maps an X-vector to a Y-vector by doing these compares and, if the X-vector passes them, doing the copies and returning the resulting Y-vector, but otherwise returning ‘undefined’. We can then define a map which takes a set of X-vectors to its image under this partial function.
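Here is a small sketch (my own notation, following the description above) of this action: a cospan is presented by its two legs into the apex, and for simplicity output wires must hit nodes that some input wire touches.

```python
# A wiring diagram is a cospan X -> S <- Y of finite sets; we present
# it by its two legs as lists: i_leg[x] is the apex node wire x maps
# to, and o_leg[y] likewise.  An X-vector passes the 'compares' if it
# is constant on wires identified by i_leg; it is then copied out
# along o_leg.  (Simplification: output wires must hit nodes that
# some input wire touches.)

def act(i_leg, o_leg, x_vectors):
    """Map a set of X-vectors to the set of Y-vectors they induce."""
    y_vectors = set()
    for v in x_vectors:
        node_val, ok = {}, True
        for x, node in enumerate(i_leg):
            if node in node_val and node_val[node] != v[x]:
                ok = False          # failed a compare: 'undefined'
                break
            node_val[node] = v[x]
        if ok and all(node in node_val for node in o_leg):
            y_vectors.add(tuple(node_val[node] for node in o_leg))
    return y_vectors

# compare two input wires, then duplicate onto two output wires:
out = act([0, 0], [0, 0], {(1, 1), (1, 2)})   # {(1, 1)}
```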
This semantics is called the relational W-algebra of type A. We can think of it as being like the ‘light operations’ fragment of the signal flow calculus. By ‘light operations’, we mean the operations of duplication and deletion, which form a cocommutative comonoid:
and their time-reversed versions, ‘coduplication’ and ‘codeletion’, which form a commutative monoid:
These fit together to form a Frobenius monoid, meaning that these equations hold:
And it’s actually extra-special, meaning that these equations hold:
(If you don’t understand these hieroglyphics, please reread our post about categories in control theory, and ask some questions!)
Note that we can’t do the ‘dark operations’, because we only have a set of values, not a field, and the dark operations involve addition and zero!
In formulating Frobenius monoids this way, Spivak achieves something that we’ve been working hard to find ways to achieve: a separation of the behavioral structure from the interconnection structure.
What do I mean by this? In his ‘behavioral approach’, Willems makes the point that for all their elaborate and elegant formulation, in the end physical laws just come down to dividing the set of what might a priori be possible (the ‘universum’) into the set of things that actually are possible (the ‘behavior’) and the set of things that aren’t. Here the universum is the set of all ways of assigning values to the wires: a priori, to each wire in X we might associate any value of A. For example, to the two wires at the ends of a resistor, we might a priori associate any pair of currents. But physical law, here Kirchhoff’s current law, says otherwise: the currents must be equal and opposite. So the ‘behavior’ is the subset of the universum consisting of such pairs.
So you can say that to each object X in the operad of wiring diagrams, the relational algebra of type A associates the set of possible behaviors, each a subset of the universum. (The collection of all these behaviors forms some sort of meta-universum, where you can discuss physical laws about physical laws, commonly called ‘principles’.)
The second key aspect of the behavioral approach is that the behaviors of larger systems can be constructed from the behaviors of its subsystems, if we understand the so-called ‘interconnection structure’ well enough. This is a key principle in engineering: we build big, complicated systems from a much smaller set of components, whether it be electronics from resistors and inductors, or mechanical devices from wheels and rods and ropes, or houses from Lego bricks. The various interconnection structures here are the wiring diagrams, and our relational algebras say they act by what Willems calls ‘variable sharing’.
This division between behavior and interconnection motivates the decorated cospan construction (where the decorations are the ‘components’, the cospans the ‘interconnection’) and also the multigraph categories discussed by Aleks Kissinger (where morphisms are the ‘components’, and the Frobenius monoid operations are the ‘interconnection’):
• Aleks Kissinger, Finite matrices are complete for (dagger-)multigraph categories.
So it’s good to have this additional way of thinking about things in our repertoire: operads describe ‘interconnection’, their algebras ‘behaviors’.
The separation Spivak achieves, however, seems to me to come at the cost of neat ways to talk about individual components, and perhaps this can be seen as the essential difference between the two approaches. By including our components as morphisms, we can talk more carefully about them and additional structure individual components have. On the other hand, by lumping all the components into the objects, Spivak can talk more carefully about how the interconnection structure acts on all behaviors at once.
One advantage of the operad approach is that you can easily tweak your operad to talk about different sorts of network structure. Sometimes you can make similar adjustments with decorated cospans too, such as working over the category of typed finite sets, rather than just finite sets, to discuss networks in which wires have types, and only wires of the same types can be connected together. A physical example is a model of a hydroelectric power plant, where you don’t want to connect a water pipe with an electrical cable! This is also a common technique in computer science, where you don’t want to try to multiply two strings of text, or try to interpret a telephone number as a truth value.
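As a toy illustration of typed wires, here is a small sketch (my own invention, not code from any of the papers; the `connect` function and port names are hypothetical) in which ports carry types and mismatched connections are refused:

```python
# A toy sketch of typed ports: connecting two ports succeeds only when their
# types agree, so we cannot join a water pipe to an electrical cable.

def connect(out_port, in_port):
    """Each port is a (name, type) pair; refuse mismatched types."""
    if out_port[1] != in_port[1]:
        raise TypeError(f"cannot connect {out_port[1]} to {in_port[1]}")
    return (out_port[0], in_port[0])

# A water outlet may feed a water inlet:
print(connect(('turbine_out', 'water'), ('pipe_in', 'water')))

# ...but not an electrical inlet:
try:
    connect(('turbine_out', 'water'), ('cable_in', 'electric'))
except TypeError as e:
    print('rejected:', e)
```

This is exactly the discipline a typed category of cospans enforces at the level of objects.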
But some modifications are harder to do with decorated cospans. In some other papers, Spivak employs a more restricted operad of wiring diagrams, in which joining wires and terminating wires is not allowed, among other things. He uses this to formalize graphical languages for certain types of discrete-time processes and open dynamical systems, including mode-dependent ones.
For more detail, read these:
• Brendan Fong, Decorated cospans.
• David Spivak, The operad of wiring diagrams: formalizing a graphical language for databases, recursion, and plug-and-play circuits.
Our paper uses a formalism that Brendan developed here:
• Brendan Fong, Decorated cospans.
The idea here is we may want to take something like a graph with edges labelled by positive numbers:
and say that some of its nodes are ‘inputs’, while others are ‘outputs’:
This lets us treat our labelled graph as a ‘morphism’ from the set $X$ of inputs to the set $Y$ of outputs.
The point is that we can compose such morphisms. For example, suppose we have another one of these things, going from $Y$ to $Z$:
Since the points of $Y$ are sitting in both things:
we can glue them together and get a thing going from $X$ to $Z$:
That’s how we compose these morphisms.
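The gluing step can be sketched in code. Here is a minimal, self-contained example (my own code, not from the papers) that composes two cospans of finite sets by taking the disjoint union of the middle sets and identifying the shared points, using union–find:

```python
# A minimal sketch of composing cospans of finite sets by gluing: take the
# disjoint union of the two middle sets, then identify o1(y) with i2(y) for
# each y in the shared set Y, using union-find.

def compose_cospans(i1, o1, n1, i2, o2, n2):
    """Cospan 1: X --i1--> n1 <--o1-- Y;  cospan 2: Y --i2--> n2 <--o2-- Z.
    Maps are dicts; n1, n2 are sets of node names. Returns (i, o, nodes)
    for the composite cospan from X to Z."""
    # Tagged copies keep the two node sets disjoint before gluing.
    parent = {('a', n): ('a', n) for n in n1}
    parent.update({('b', n): ('b', n) for n in n2})

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # Glue: identify the image of each shared point y under o1 and i2.
    for y in i2:            # keys of i2 are the shared set Y
        union(('a', o1[y]), ('b', i2[y]))

    nodes = {find(p) for p in parent}
    i = {x: find(('a', i1[x])) for x in i1}
    o = {z: find(('b', o2[z])) for z in o2}
    return i, o, nodes

# Glue a cospan with nodes {p, q} to one with nodes {r, s}, sharing one point:
i1 = {'x': 'p'}; o1 = {'y': 'q'}          # X = {x}, Y = {y}
i2 = {'y': 'r'}; o2 = {'z': 's'}          # Z = {z}
i, o, nodes = compose_cospans(i1, o1, {'p', 'q'}, i2, o2, {'r', 's'})
print(len(nodes))  # q and r are identified, so 3 nodes remain
```

Categorically, this is just the pushout of the two middle sets over the shared set $Y$.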
Note how we’re specifying some nodes of our original thing as inputs and outputs:
We’re using maps from two sets $X$ and $Y$ to the set of nodes of our graph. And a bit surprisingly, we’re not demanding that these maps be one-to-one. That turns out to be useful—and in general, when doing math, it’s dumb to make your definitions forbid certain possibilities unless you really need to.
So, our thing is really a cospan of finite sets—that is, a diagram of finite sets and functions like this:
together with some extra structure on the set $N$. This extra structure is what Brendan calls a decoration, and it makes the cospan into a ‘decorated cospan’. In this particular example, a decoration on $N$ is a way of making it into the set of nodes of a graph with edges labelled by positive numbers. But one could consider many other kinds of decorations: this idea is very general.
To formalize the idea of ‘a kind of decoration’, Brendan uses a functor

$$F \colon \mathrm{FinSet} \to \mathrm{Set}$$

sending each finite set $N$ to a set $F(N)$. This is the set of decorations of the given kind that we can put on $N$.
So, for any such functor $F$, a decorated cospan of finite sets is a cospan of finite sets:

$$X \to N \leftarrow Y$$

together with an element of $F(N)$.
But in fact, Brendan goes further. He’s not content to use a functor

$$F \colon \mathrm{FinSet} \to \mathrm{Set}$$

to decorate his cospans.
First, there’s no need to limit ourselves to cospans of finite sets: we can replace $\mathrm{FinSet}$ with some other category! If $C$ is any category with finite colimits, there’s a category $\mathrm{Cospan}(C)$ with:

• objects of $C$ as its objects,
• isomorphism classes of cospans between these as morphisms.
Second, there’s no need to limit ourselves to decorations that are elements of a set: we can replace $\mathrm{Set}$ with some other category! If $D$ is any symmetric monoidal category, we can define an element of an object $d \in D$ to be a morphism

$$e \colon I \to d$$

where $I$ is the unit for the tensor product in $D$.
So, Brendan defines decorated cospans at this high level of generality, and shows that under some mild conditions we can compose them, just as in the pictures we saw earlier.
Here’s one of the theorems Brendan proves:
Theorem. Suppose $C$ is a category with finite colimits, and make $C$ into a symmetric monoidal category with its coproduct as the tensor product. Suppose $D$ is a symmetric monoidal category, and suppose $F \colon C \to D$ is a lax symmetric monoidal functor. Define an F-decorated cospan to be a cospan

$$X \to N \leftarrow Y$$

in $C$ together with an element of $F(N)$. Then there is a category with
• objects of $C$ as its objects,
• isomorphism classes of F-decorated cospans as its morphisms.
This is called the F-decorated cospan category, $F\mathrm{Cospan}$. This category becomes symmetric monoidal in a natural way. It is then a dagger compact category.
(You may not know all this jargon, but ‘lax symmetric monoidal’, for example, talks about how we can take decorations on two things and get a decoration on their disjoint union, or ‘coproduct’. We need to be able to do this—as should be obvious from the pictures I drew. Also, a ‘dagger compact category’ is the kind of category whose morphisms can be drawn as networks.)
Brendan also explains how to get functors between decorated cospan categories. We need this in our paper on electrical circuits, because we consider several categories where a morphism is a circuit, or something that captures some aspect of a circuit. Most of these categories are decorated cospan categories. We want to get functors between them. And often we can just use Brendan’s general results to get the job done! No fuss, no muss: all the hard work has been done ahead of time.
I expect to use this technology a lot in my work on network theory.
• John Baez and Brendan Fong, A compositional framework for passive linear networks.
While my paper with Jason Erbele studies signal flow diagrams, this one focuses on circuit diagrams. The two are different, but closely related.
I’ll explain their relation at the Turin workshop in May. For now, let me just talk about this paper with Brendan. There’s a lot in here, but let me just try to explain the main result. It’s all about ‘black boxing’: hiding the details of a circuit and only remembering its behavior as seen from outside.
In the late 1940s, just as Feynman was developing his diagrams for processes in particle physics, Eilenberg and Mac Lane initiated their work on category theory. Over the subsequent decades, and especially in the work of Joyal and Street in the 1980s, it became clear that these developments were profoundly linked: monoidal categories have a precise graphical representation in terms of string diagrams, and conversely monoidal categories provide an algebraic foundation for the intuitions behind Feynman diagrams. The key insight is the use of categories where morphisms describe physical processes, rather than structure-preserving maps between mathematical objects.
In work on fundamental physics, the cutting edge has moved from categories to higher categories. But the same techniques have filtered into more immediate applications, particularly in computation and quantum computation. Our paper is part of a new program of applying string diagrams to engineering, with the aim of giving diverse diagram languages a unified foundation based on category theory.
Indeed, even before physicists began using Feynman diagrams, various branches of engineering were using diagrams that in retrospect are closely related. Foremost among these are the ubiquitous electrical circuit diagrams. Although less well-known, similar diagrams are used to describe networks consisting of mechanical, hydraulic, thermodynamic and chemical systems. Further work, pioneered in particular by Forrester and Odum, applies similar diagrammatic methods to biology, ecology, and economics.
As discussed in detail by Olsen, Paynter and others, there are mathematically precise analogies between these different systems. In each case, the system’s state is described by variables that come in pairs, with one variable in each pair playing the role of ‘displacement’ and the other playing the role of ‘momentum’. In engineering, the time derivatives of these variables are sometimes called ‘flow’ and ‘effort’.
| | displacement | flow | momentum | effort |
| --- | --- | --- | --- | --- |
| Mechanics: translation | position | velocity | momentum | force |
| Mechanics: rotation | angle | angular velocity | angular momentum | torque |
| Electronics | charge | current | flux linkage | voltage |
| Hydraulics | volume | flow | pressure momentum | pressure |
| Thermal Physics | entropy | entropy flow | temperature momentum | temperature |
| Chemistry | moles | molar flow | chemical momentum | chemical potential |
In classical mechanics, this pairing of variables is well understood using symplectic geometry. Thus, any mathematical formulation of the diagrams used to describe networks in engineering needs to take symplectic geometry as well as category theory into account.
While diagrams of networks have been independently introduced in many disciplines, we do not expect formalizing these diagrams to immediately help the practitioners of these disciplines. At first the flow of information will mainly go in the other direction: by translating ideas from these disciplines into the language of modern mathematics, we can provide mathematicians with food for thought and interesting new problems to solve. We hope that in the long run mathematicians can return the favor by bringing new insights to the table.
Although we keep the broad applicability of network diagrams in the back of our minds, our paper talks in terms of electrical circuits, for the sake of familiarity. We also consider a somewhat limited class of circuits. We only study circuits built from ‘passive’ components: that is, those that do not produce energy. Thus, we exclude batteries and current sources. We only consider components that respond linearly to an applied voltage. Thus, we exclude components such as nonlinear resistors or diodes. Finally, we only consider components with one input and one output, so that a circuit can be described as a graph with edges labeled by components. Thus, we also exclude transformers. The most familiar components our framework covers are linear resistors, capacitors and inductors.
While we want to expand our scope in future work, the class of circuits made from these components has appealing mathematical properties, and is worthy of deep study. Indeed, these circuits have been studied intensively for many decades by electrical engineers. Even circuits made exclusively of resistors have inspired work by mathematicians of the caliber of Weyl and Smale!
Our work relies on this research. All we are adding is an emphasis on symplectic geometry and an explicitly ‘compositional’ framework, which clarifies the way a larger circuit can be built from smaller pieces. This is where monoidal categories become important: the main operations for building circuits from pieces are composition and tensoring.
Our strategy is most easily illustrated for circuits made of linear resistors. Such a resistor dissipates power, turning useful energy into heat at a rate determined by the voltage across the resistor. However, a remarkable fact is that a circuit made of these resistors always acts to minimize the power dissipated this way. This ‘principle of minimum power’ can be seen as the reason symplectic geometry becomes important in understanding circuits made of resistors, just as the principle of least action leads to the role of symplectic geometry in classical mechanics.
Here is a circuit made of linear resistors:
The wiggly lines are resistors, and their resistances are written beside them: for example, ‘3’ means 3 ohms, an ‘ohm’ being a unit of resistance. To formalize this, define a circuit of linear resistors to consist of:
• a set $N$ of nodes,
• a set $E$ of edges,
• maps $s, t \colon E \to N$ sending each edge to its source and target node,
• a map $r \colon E \to (0, \infty)$ specifying the resistance of the resistor labelling each edge,
• maps $i \colon X \to N$ and $o \colon Y \to N$ specifying the inputs and outputs of the circuit.
When we run electric current through such a circuit, each node $n \in N$ gets a potential $\phi(n)$. The voltage across an edge $e$ is defined as the change in potential as we move from the source of $e$ to its target, $\phi(t(e)) - \phi(s(e))$. The power dissipated by the resistor on this edge is then

$$\frac{1}{r(e)}\big(\phi(t(e)) - \phi(s(e))\big)^2.$$

The total power dissipated by the circuit is therefore twice

$$P(\phi) = \frac{1}{2} \sum_{e \in E} \frac{1}{r(e)}\big(\phi(t(e)) - \phi(s(e))\big)^2.$$

The factor of $\frac{1}{2}$ is convenient in some later calculations.
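This computation is easy to carry out directly. Here is a small numerical check (my own code, with a hypothetical data layout for circuits): the total power is twice the functional above, and for a single resistor it reduces to the familiar $V^2/R$:

```python
# Sketch: total power dissipated by a circuit of linear resistors.
# A circuit is a list of edges (source, target, resistance); phi assigns a
# potential to each node. Total power = 2 * P(phi).

def P(phi, edges):
    """P(phi) = (1/2) * sum over edges of (voltage across edge)^2 / resistance."""
    return 0.5 * sum((phi[t] - phi[s]) ** 2 / r for s, t, r in edges)

# One 3-ohm resistor between nodes a and b, with a 6-volt difference:
phi = {'a': 6.0, 'b': 0.0}
edges = [('a', 'b', 3.0)]
print(2 * P(phi, edges))  # power = V^2/R = 36/3 = 12.0 watts
```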
Note that $P$ is a nonnegative quadratic form on the vector space $\mathbb{R}^N$. However, not every nonnegative quadratic form on $\mathbb{R}^N$ arises in this way from some circuit of linear resistors with $N$ as its set of nodes. The quadratic forms that do arise are called Dirichlet forms. They have been extensively investigated, and they play a major role in our work.
We write

$$T = i(X) \cup o(Y)$$

for the set of terminals: that is, nodes corresponding to inputs or outputs. The principle of minimum power says that if we fix the potential at the terminals, the circuit will choose the potential at the other nodes to minimize the total power dissipated. An element $\psi$ of the vector space $\mathbb{R}^T$ assigns a potential to each terminal. Thus, if we fix $\psi$, the total power dissipated will be twice

$$Q(\psi) = \min_{\substack{\phi \in \mathbb{R}^N \\ \phi|_T = \psi}} P(\phi).$$
The function $Q$ is again a Dirichlet form. We call it the power functional of the circuit.
Now, suppose we are unable to see the internal workings of a circuit, and can only observe its ‘external behavior’: that is, the potentials at its terminals and the currents flowing into or out of these terminals. Remarkably, this behavior is completely determined by the power functional $Q$. The reason is that the current at any terminal can be obtained by differentiating $Q$ with respect to the potential at this terminal, and relations of this form are all the relations that hold between potentials and currents at the terminals.
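In symbols, the statement reads as follows (my paraphrase; the exact sign depends on orientation conventions for the currents). Writing $Q$ for the power functional and $\psi_t$ for the potential at a terminal $t$:

```latex
% The current I_t at a terminal is the partial derivative of the power
% functional with respect to the potential at that terminal (up to a
% convention-dependent sign):
I_t \;=\; \frac{\partial Q}{\partial \psi_t}(\psi), \qquad t \in T .
```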
The Laplace transform allows us to generalize this immediately to circuits that can also contain linear inductors and capacitors, simply by changing the field we work over, replacing $\mathbb{R}$ by the field $\mathbb{R}(s)$ of rational functions of a single real variable $s$, and talking of impedance where we previously talked of resistance. We obtain a category $\mathrm{Circ}$ where an object is a finite set, a morphism is a circuit with input set $X$ and output set $Y$, and composition is given by identifying the outputs of one circuit with the inputs of the next, and taking the resulting union of labelled graphs. Each such circuit gives rise to a Dirichlet form, now defined over $\mathbb{R}(s)$, and this Dirichlet form completely describes the externally observable behavior of the circuit.
We can take equivalence classes of circuits, where two circuits count as the same if they have the same Dirichlet form. We wish for these equivalence classes of circuits to form a category. Although there is a notion of composition for Dirichlet forms, we find that it lacks identity morphisms or, equivalently, it lacks morphisms representing ideal wires of zero impedance. To address this we turn to Lagrangian subspaces of symplectic vector spaces. These generalize quadratic forms via the map

$$Q \mapsto \mathrm{Graph}(dQ) = \{(\psi, dQ_\psi) : \psi \in \mathbb{F}^T\}$$

taking a quadratic form $Q$ on the vector space $\mathbb{F}^T$ over the field $\mathbb{F}$ to the graph of its differential $dQ$. Here we think of the symplectic vector space $\mathbb{F}^T \oplus (\mathbb{F}^T)^*$ as the state space of the circuit, and the graph of $dQ$ as the subspace of attainable states, with $\psi \in \mathbb{F}^T$ describing the potentials at the terminals, and $dQ_\psi \in (\mathbb{F}^T)^*$ the currents.
This construction is well-known in classical mechanics, where the principle of least action plays a role analogous to that of the principle of minimum power here. The set of Lagrangian subspaces is actually an algebraic variety, the Lagrangian Grassmannian, which serves as a compactification of the space of quadratic forms. The Lagrangian Grassmannian has already played a role in Sabot’s work on circuits made of resistors. For us, its importance is that we can find identity morphisms for the composition of Dirichlet forms by taking circuits made of parallel resistors and letting their resistances tend to zero: the limit is not a Dirichlet form, but it exists in the Lagrangian Grassmannian.
Indeed, there exists a category $\mathrm{LagrRel}$ with finite-dimensional symplectic vector spaces as objects and Lagrangian relations as morphisms: that is, linear relations from $V$ to $W$ that are given by Lagrangian subspaces of $\overline{V} \oplus W$, where $\overline{V}$ is the symplectic vector space conjugate to $V$—that is, with the sign of the symplectic structure switched.
To move from the Lagrangian subspace defined by the graph of the differential of the power functional to a morphism in the category $\mathrm{LagrRel}$—that is, to a Lagrangian relation—we must treat seriously the input and output functions of the circuit. These express the circuit as built upon a cospan:

$$X \xrightarrow{\;i\;} N \xleftarrow{\;o\;} Y$$
Applicable far more broadly than this present formalization of circuits, cospans model systems with two ‘ends’, an input and output end, albeit without any connotation of directionality: we might just as well exchange the role of the inputs and outputs by taking the mirror image of the above diagram. The role of the input and output functions, as we have discussed, is to mark the terminals we may glue onto the terminals of another circuit, and the pushout of cospans gives formal precision to this gluing construction.
One upshot of this cospan framework is that we may consider circuits with nodes that are both inputs and outputs, such as this one:
This corresponds to the identity morphism on the finite set with two elements. Another is that some points may be considered an input or output multiple times, like here:
This lets us connect two distinct outputs to the above double input.
Given a finite set $A$ of inputs or outputs, we understand the electrical behavior on this set by considering the symplectic vector space $\mathbb{F}^A \oplus (\mathbb{F}^A)^*$, the direct sum of the space of potentials and the space of currents at these points. A Lagrangian relation specifies which states of the output space are allowed for each state of the input space. Turning the Lagrangian subspace of a circuit into this information requires that we understand the ‘symplectification’
and ‘twisted symplectification’
of a function between finite sets. In particular, we need to understand how these apply to the input and output functions with codomain restricted to the set $T$ of terminals; abusing notation, we also write these as $i \colon X \to T$ and $o \colon Y \to T$.
The symplectification is a Lagrangian relation, and the catch phrase is that it ‘copies voltages’ and ‘splits currents’. More precisely, given a potential–current pair on the terminals, its image under the symplectification consists of all potential–current pairs on the outputs such that the potential at each output equals the potential at the terminal it maps to, and such that, for each fixed terminal, the currents at the outputs mapping to it collectively sum to the current at that terminal. We use the symplectification of the output function to relate the state on $T$ to that on the outputs $Y$.
As our current framework is set up to report the current out of each node, to describe input currents we define the twisted symplectification:
almost identically to the above, except that we flip the sign of the currents. This again gives a Lagrangian relation. We use the twisted symplectification of the input function to relate the state on $T$ to that on the inputs $X$.
The Lagrangian relation corresponding to a circuit then comprises exactly a list of the potential-current pairs that are possible electrical states of the inputs and outputs of the circuit. In doing so, it identifies distinct circuits. A simple example of this is the identification of a single 2-ohm resistor:
with two 1-ohm resistors in series:
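This identification can be checked directly from the principle of minimum power. Here is a crude numerical sketch (my own code, using a brute-force scan rather than calculus): fixing the terminal potentials of two 1-ohm resistors in series and minimizing over the internal node gives exactly the power of a single 2-ohm resistor:

```python
# Sketch: two 1-ohm resistors in series behave like one 2-ohm resistor.
# Fix the terminal potentials, scan over the internal potential m, and pick
# the m minimizing the total power (phi_a - m)^2/r1 + (m - phi_b)^2/r2.

def min_power(phi_a, phi_b, r1, r2, steps=2000):
    """Return (best internal potential, minimal total power) by a crude scan."""
    best = None
    lo, hi = min(phi_a, phi_b), max(phi_a, phi_b)
    for k in range(steps + 1):
        m = lo + (hi - lo) * k / steps
        p = (phi_a - m) ** 2 / r1 + (m - phi_b) ** 2 / r2
        if best is None or p < best[1]:
            best = (m, p)
    return best

phi_a, phi_b = 6.0, 0.0
m, p = min_power(phi_a, phi_b, 1.0, 1.0)
print(round(m, 2))   # internal node settles at the midpoint: 3.0
print(round(p, 2))   # total power = V^2/(r1 + r2) = 36/2 = 18.0
```

The minimal power $V^2/(r_1 + r_2)$ is exactly the power of a single resistor of resistance $r_1 + r_2$, so the two circuits have the same power functional, hence the same black-boxed behavior.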
Our inability to access the internal workings of a circuit in this representation inspires us to call this process black boxing: you should imagine encasing the circuit in an opaque black box, leaving only the terminals accessible. Fortunately, this information is enough to completely characterize the external behavior of a circuit, including how it interacts when connected with other circuits!
Put more precisely, the black boxing process is functorial: we can compute the black-boxed version of a circuit made of parts by computing the black-boxed versions of the parts and then composing them. In fact we shall prove that $\mathrm{Circ}$ and $\mathrm{LagrRel}$ are dagger compact categories, and the black box functor preserves all this extra structure:
Theorem. There exists a symmetric monoidal dagger functor, the black box functor

$$\blacksquare \colon \mathrm{Circ} \to \mathrm{LagrRel},$$

mapping a finite set $X$ to the symplectic vector space $\mathbb{F}^X \oplus (\mathbb{F}^X)^*$ it generates, and a circuit to the Lagrangian relation obtained from the graph of $dQ$ using the symplectification of its output function and the twisted symplectification of its input function, where $Q$ is the circuit’s power functional.
The goal of this paper is to prove and explain this result. The proof is trickier than one might first expect, but our approach involves concepts that should be useful throughout the study of networks, such as ‘decorated cospans’ and ‘corelations’.
Give it a read, and let us know if you have questions or find mistakes!
A while back I decided one way to apply my math skills to help save the planet was to start pushing toward green mathematics: a kind of mathematics that can interact with biology and ecology just as fruitfully as traditional mathematics interacts with physics. As usual with math, the payoffs will come slowly, but they may be large. It’s not a substitute for doing other, more urgent things—but if mathematicians don’t do this, who will?
As a first step in this direction, I decided to study networks.
This May, a small group of mathematicians is meeting in Turin for a workshop on the categorical foundations of network theory, organized by Jacob Biamonte. I’m trying to get us mentally prepared for this. We all have different ideas, yet they should fit together somehow.
Tobias Fritz, Eugene Lerman and David Spivak have all written articles here about their work, though I suspect Eugene will have a lot of completely new things to say, too. Now it’s time for me to say what my students and I have been doing.
Despite my ultimate aim of studying biological and ecological networks, I decided to start by clarifying the math of networks that appear in chemistry and engineering, since these are simpler, better understood, useful in their own right, and probably a good warmup for the grander goal. I’ve been working with Brendan Fong on electrical circuits, and with Jason Erbele on control theory. Let me talk about this paper:
• John Baez and Jason Erbele, Categories in control.
Control theory is the branch of engineering that focuses on manipulating open systems—systems with inputs and outputs—to achieve desired goals. In control theory, signal-flow diagrams are used to describe linear ways of manipulating signals, for example smooth real-valued functions of time. Here’s a real-world example; click the picture for more details:
For a category theorist, at least, it is natural to treat signal-flow diagrams as string diagrams in a symmetric monoidal category. This forces some small changes of perspective, which I’ll explain, but more important is the question: which symmetric monoidal category?
We argue that the answer is: the category $\mathrm{FinRel}_k$ of finite-dimensional vector spaces over a certain field $k$, but with linear relations rather than linear maps as morphisms, and direct sum rather than tensor product providing the symmetric monoidal structure. We use the field $k = \mathbb{R}(s)$ consisting of rational functions in one real variable $s$. This variable has the meaning of differentiation. A linear relation from $k^m$ to $k^n$ is thus a system of linear constant-coefficient ordinary differential equations relating $m$ ‘input’ signals and $n$ ‘output’ signals.
Our main goal in this paper is to provide a complete ‘generators and relations’ picture of this symmetric monoidal category, with the generators being familiar components of signal-flow diagrams. It turns out that the answer has an intriguing but mysterious connection to ideas that are familiar in the diagrammatic approach to quantum theory! Quantum theory also involves linear algebra, but it uses linear maps between Hilbert spaces as morphisms, and the tensor product of Hilbert spaces provides the symmetric monoidal structure.
We hope that the category-theoretic viewpoint on signal-flow diagrams will shed new light on control theory. However, in this paper we only lay the groundwork.
There are several basic operations that one wants to perform when manipulating signals. The simplest is multiplying a signal by a scalar. A signal can be amplified by a constant factor:

$$f \mapsto cf$$

where $c \in \mathbb{R}$. We can write this as a string diagram:
Here the labels $f$ and $cf$ on top and bottom are just for explanatory purposes and not really part of the diagram. Control theorists often draw arrows on the wires, but this is unnecessary from the string diagram perspective. Arrows on wires are useful to distinguish objects from their duals, but ultimately we will obtain a compact closed category where each object is its own dual, so the arrows can be dropped. What we really need is for the box denoting scalar multiplication to have a clearly defined input and output. This is why we draw it as a triangle. Control theorists often use a rectangle or circle, using arrows on wires to indicate which carries the input and which the output.
A signal can also be integrated with respect to the time variable:
Mathematicians typically take differentiation as fundamental, but engineers sometimes prefer integration, because it is more robust against small perturbations. In the end it will not matter much here. We can again draw integration as a string diagram:
Since this looks like the diagram for scalar multiplication, it is natural to extend $\mathbb{R}$ to $\mathbb{R}(s)$, the field of rational functions of a variable $s$ which stands for differentiation. Then differentiation becomes a special case of scalar multiplication, namely multiplication by $s$, and integration becomes multiplication by $1/s$. Engineers accomplish the same effect with Laplace transforms, since differentiating a signal $f$ is equivalent to multiplying its Laplace transform

$$F(s) = \int_0^\infty f(t)\, e^{-st}\, dt$$

by the variable $s$. Another option is to use the Fourier transform: differentiating $f$ is equivalent to multiplying its Fourier transform

$$\hat{f}(\omega) = \int_{-\infty}^\infty f(t)\, e^{-i \omega t}\, dt$$

by $i\omega$. Of course, the function $f$ needs to be sufficiently well-behaved to justify calculations involving its Laplace or Fourier transform. At a more basic level, it also requires some work to treat integration as the two-sided inverse of differentiation. Engineers do this by considering signals that vanish for $t < 0$, and choosing the antiderivative that vanishes under the same condition. Luckily all these issues can be side-stepped in a formal treatment of signal-flow diagrams: we can simply treat signals as living in an unspecified vector space over the field $\mathbb{R}(s)$. The field $\mathbb{C}(s)$ would work just as well, and control theory relies heavily on complex analysis. In our paper we work over an arbitrary field $k$.
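The claim that differentiation becomes multiplication by $s$ is just integration by parts (a standard fact, not specific to our paper; here $F$ denotes the Laplace transform of $f$):

```latex
% Integration by parts, for a signal f with f(t) = 0 for t <= 0, so that the
% boundary term f(0) vanishes:
\mathcal{L}\{\dot f\}(s)
  = \int_0^\infty \dot f(t)\, e^{-st}\, dt
  = \big[f(t)\, e^{-st}\big]_0^\infty + s \int_0^\infty f(t)\, e^{-st}\, dt
  = s\, F(s).
```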
The simplest possible signal processor is a rock, which takes the ‘input’ given by the force $F$ on the rock and produces as ‘output’ the rock’s position $q$. Thanks to Newton’s second law $F = ma$, we can describe this using a signal-flow diagram:
Here composition of morphisms is drawn in the usual way, by attaching the output wire of one morphism to the input wire of the next.
To build more interesting machines we need more building blocks, such as addition:
and duplication:
When these linear maps are written as matrices, their matrices are transposes of each other. This is reflected in the string diagrams for addition and duplication:
The second is essentially an upside-down version of the first. However, we draw addition as a dark triangle and duplication as a light one because we will later want another way to ‘turn addition upside-down’ that does not give duplication. As an added bonus, a light upside-down triangle resembles the Greek letter $\Delta$, the usual symbol for duplication.
While they are typically not considered worthy of mention in control theory, for completeness we must include two other building blocks. One is the zero map from the zero-dimensional vector space to our field $k$, which we call ‘zero’ and draw as follows:
The other is the zero map from $k$ to the zero-dimensional vector space, sometimes called ‘deletion’, which we draw thus:
Just as the matrices for addition and duplication are transposes of each other, so are the matrices for zero and deletion, though they are rather degenerate, being $1 \times 0$ and $0 \times 1$ matrices, respectively. Addition and zero make $k$ into a commutative monoid, meaning that the following relations hold:
The equation at right is the commutative law, and the crossing of strands is the braiding:
by which we switch two signals. In fact this braiding is a symmetry, so it does not matter which strand goes over which:
Dually, duplication and deletion make $k$ into a cocommutative comonoid. This means that if we reflect the equations obeyed by addition and zero across the horizontal axis and turn dark operations into light ones, we obtain another set of valid equations:
There are also relations between the monoid and comonoid operations. For example, adding two signals and then duplicating the result gives the same output as duplicating each signal and then adding the results:
This diagram is familiar in the theory of Hopf algebras, or more generally bialgebras. Here it is an example of the fact that the monoid operations on $k$ are comonoid homomorphisms—or equivalently, the comonoid operations are monoid homomorphisms.
We summarize this situation by saying that $k$ is a bimonoid. These are all the bimonoid laws, drawn as diagrams:
The last equation means we can actually make the diagram at left disappear, since it equals the identity morphism on the 0-dimensional vector space, which is drawn as nothing.
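The bialgebra law relating addition and duplication can be checked as a plain matrix identity. Here is a sketch (my own code, stdlib only) where addition is the $1 \times 2$ matrix $[1\ 1]$, duplication its transpose, and the braiding in the middle is a permutation matrix:

```python
# Sketch: verify the bimonoid law "add then duplicate = duplicate both inputs,
# swap the middle two wires, then add componentwise" as a matrix identity.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def blockdiag(A, B):
    """Direct sum of matrices: block-diagonal matrix."""
    ca, cb = len(A[0]), len(B[0])
    return [row + [0] * cb for row in A] + [[0] * ca + row for row in B]

add = [[1, 1]]          # addition:    k + k -> k
dup = [[1], [1]]        # duplication: k -> k + k
swap_mid = [[1, 0, 0, 0],   # identity (+) braiding (+) identity on k^4
            [0, 0, 1, 0],
            [0, 1, 0, 0],
            [0, 0, 0, 1]]

lhs = matmul(dup, add)                                   # add, then duplicate
rhs = matmul(blockdiag(add, add),
             matmul(swap_mid, blockdiag(dup, dup)))      # dup both, swap, add
print(lhs == rhs)  # True: both sides are [[1, 1], [1, 1]]
```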
So far all our string diagrams denote linear maps. We can treat these as morphisms in the category $\mathrm{FinVect}_k$, where objects are finite-dimensional vector spaces over a field $k$ and morphisms are linear maps. This category is equivalent to the category where the only objects are vector spaces $k^n$ for $n \ge 0$, and then morphisms can be seen as matrices. The space of signals is a vector space $V$ over $k$ which may not be finite-dimensional, but this does not cause a problem: an $n \times m$ matrix with entries in $k$ still defines a linear map from $V^m$ to $V^n$ in a functorial way.
In applications of string diagrams to quantum theory, we make $\mathrm{FinVect}_k$ into a symmetric monoidal category using the tensor product of vector spaces. In control theory, we instead make $\mathrm{FinVect}_k$ into a symmetric monoidal category using the direct sum of vector spaces. In Lemma 1 of our paper we prove that for any field $k$, $\mathrm{FinVect}_k$ with direct sum is generated as a symmetric monoidal category by the one object $k$ together with these morphisms:
where $c \in k$ is arbitrary.
However, these generating morphisms obey some unexpected relations! For example, we have:
It is therefore important to find a complete set of relations obeyed by these generating morphisms, thus obtaining a presentation of $\mathrm{FinVect}_k$ as a symmetric monoidal category. We do this in Theorem 2. In brief, these relations say:
(1) $k$ is a bicommutative bimonoid;
(2) the rig operations of $k$ can be recovered from the generating morphisms;
(3) all the generating morphisms commute with scalar multiplication.
Here item (2) means that the addition, multiplication, $0$ and $1$ of the field $k$ can be expressed in terms of signal-flow diagrams as follows:
Multiplicative inverses cannot be so expressed, so our signal-flow diagrams so far do not know that $k$ is a field. Additive inverses also cannot be expressed in this way. So, we expect that a version of Theorem 2 will hold whenever $k$ is a mere rig: that is, a ‘ring without negatives’, like the natural numbers. The one change is that instead of working with vector spaces, we should work with finitely presented free $k$-modules.
Item (3), the fact that all our generating morphisms commute with scalar multiplication, amounts to these diagrammatic equations:
While Theorem 2 is a step towards understanding the category-theoretic underpinnings of control theory, it does not treat signal-flow diagrams that include ‘feedback’. Feedback is one of the most fundamental concepts in control theory because a control system without feedback may be highly sensitive to disturbances or unmodeled behavior. Feedback allows these uncontrolled behaviors to be mollified. As a string diagram, a basic feedback system might look schematically like this:
The user inputs a ‘reference’ signal, which is fed into a controller, whose output is fed into a system that control theorists call a ‘plant’, which in turn produces its own output. But then the system’s output is duplicated, and one copy is fed into a sensor, whose output is added to (or, if we prefer, subtracted from) the reference signal.
In string diagrams—unlike in the usual thinking on control theory—it is essential to be able to read any diagram from top to bottom as a composite of tensor products of generating morphisms. Thus, to incorporate the idea of feedback, we need two more generating morphisms. These are the ‘cup':
and ‘cap':
These are not maps: they are relations. The cup imposes the relation that its two inputs be equal, while the cap does the same for its two outputs. This is a way of describing how a signal flows around a bend in a wire.
To make this precise, we use a category called $\mathrm{FinRel}_k$. An object of this category is a finite-dimensional vector space over $k$, while a morphism from $U$ to $V$, denoted $L \colon U \rightsquigarrow V$, is a linear relation, meaning a linear subspace $L \subseteq U \oplus V$.
In particular, when $k = \mathbb{R}(s)$, the field of rational functions of the variable $s$, a linear relation $L \colon k^m \rightsquigarrow k^n$ is just an arbitrary system of constant-coefficient linear ordinary differential equations relating $m$ input variables and $n$ output variables.
Since the direct sum $U \oplus V$ is also the cartesian product of $U$ and $V$, a linear relation $L \colon U \rightsquigarrow V$ is indeed a relation in the usual sense, but with the property that if $u \in U$ is related to $v \in V$ and $u' \in U$ is related to $v' \in V$, then $cu + c'u'$ is related to $cv + c'v'$ whenever $c, c' \in k$.
We compose linear relations $L \colon U \rightsquigarrow V$ and $L' \colon V \rightsquigarrow W$ as follows: $L'L = \{(u,w) : (u,v) \in L \text{ and } (v,w) \in L' \text{ for some } v \in V\}.$
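Over a finite field a linear relation is a finite set of pairs, so this composition rule can be carried out literally. A minimal sketch, assuming $k = \mathbb{Z}/5$ rather than the field of rational functions used in control theory (my own illustration, not code from the paper):

```python
P = 5  # the finite field k = Z/5; here every linear relation is a finite set

def graph(c):
    """The linear relation k -> k given by the graph of multiplication by c."""
    return {(x, (c * x) % P) for x in range(P)}

def compose(L1, L2, m, p):
    """Relational composite of L1 in k^(m+p) and L2 in k^(p+n): u is related
    to w iff there exists v with (u,v) in L1 and (v,w) in L2."""
    return {u[:m] + w[p:] for u in L1 for w in L2 if u[m:] == w[:p]}

# Composing graphs of linear maps reproduces composition of the maps:
# multiplying by 2 and then by 3 is multiplying by 6 = 1 mod 5.
L = compose(graph(2), graph(3), 1, 1)
assert L == graph(1)
```

Representing relations as sets of tuples makes the existential quantifier in the composition rule an explicit search over the middle variable.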
Any linear map $f \colon U \to V$ gives a linear relation $F \colon U \rightsquigarrow V$, namely the graph of that map: $F = \{(u, f(u)) : u \in U\}.$
Composing linear maps thus becomes a special case of composing linear relations, so $\mathrm{FinVect}_k$ becomes a subcategory of $\mathrm{FinRel}_k$. Furthermore, we can make $\mathrm{FinRel}_k$ into a monoidal category using direct sums, and it becomes symmetric monoidal using the braiding already present in $\mathrm{FinVect}_k$.
In these terms, the cup is the linear relation $\cup \colon k^2 \rightsquigarrow \{0\}$
given by $\cup = \{(x,x,0) : x \in k\} \subseteq k^2 \oplus \{0\},$
while the cap is the linear relation $\cap \colon \{0\} \rightsquigarrow k^2$
given by $\cap = \{(0,x,x) : x \in k\} \subseteq \{0\} \oplus k^2.$
These obey the zigzag relations:
Thus, they make $\mathrm{FinRel}_k$ into a compact closed category where $k$, and thus every object, is its own dual.
Besides feedback, one of the things that make the cap and cup useful is that they allow any morphism to be ‘plugged in backwards’ and thus ‘turned around’. For instance, turning around integration:
we obtain differentiation. In general, using caps and cups we can turn around any linear relation $L \colon k^m \rightsquigarrow k^n$ and obtain a linear relation $L^\dagger \colon k^n \rightsquigarrow k^m$, called the adjoint of $L$, which turns out to be given by $L^\dagger = \{(v,u) : (u,v) \in L\}.$
For example, if $c \in k$ is nonzero, the adjoint of scalar multiplication by $c$ is multiplication by $c^{-1}$:
Thus, caps and cups allow us to express multiplicative inverses in terms of signal-flow diagrams! One might think that a problem arises when $c = 0$, but no: the adjoint of scalar multiplication by $0$ is the linear relation $\{(0,x) : x \in k\} \subseteq k \oplus k$.
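Continuing in the same finite-field spirit (my own toy sketch with $k = \mathbb{Z}/7$, not code from the paper), turning a relation around is literally swapping the two slots; for nonzero $c$ this recovers $c^{-1}$, while $c = 0$ poses no problem:

```python
P = 7  # the finite field k = Z/7, standing in for the field of signals

def graph(c):
    """Graph of scalar multiplication by c, as a linear relation k -> k."""
    return {(x, (c * x) % P) for x in range(P)}

def adjoint(L):
    """Turn a relation around: (v,u) is in the adjoint iff (u,v) is in L."""
    return {(v, u) for (u, v) in L}

# For nonzero c the adjoint of multiplication by c is multiplication by
# its inverse: 3 * 5 = 15 = 1 mod 7.
assert adjoint(graph(3)) == graph(5)

# For c = 0 there is no inverse, but the adjoint is still a perfectly good
# linear relation: it relates the input 0 to every output.
assert adjoint(graph(0)) == {(0, x) for x in range(P)}
```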
In Lemma 3 we show that $\mathrm{FinRel}_k$ is generated, as a symmetric monoidal category, by these morphisms: addition, zero, duplication, deletion and scalar multiplication, together with the cup and cap,
where $c \in k$ is arbitrary.
In Theorem 4 we find a complete set of relations obeyed by these generating morphisms, thus giving a presentation of $\mathrm{FinRel}_k$ as a symmetric monoidal category. To describe these relations, it is useful to work with adjoints of the generating morphisms. We have already seen that the adjoint of scalar multiplication by $c$ is scalar multiplication by $c^{-1}$, except when $c = 0$. Taking adjoints of the other four generating morphisms of $\mathrm{FinVect}_k$, we obtain four important but perhaps unfamiliar linear relations. We draw these as ‘turned around’ versions of the original generating morphisms:
• Coaddition $+^\dagger$ is a linear relation from $k$ to $k^2$ that holds when the two outputs sum to the input:
• Cozero $0^\dagger$ is a linear relation from $k$ to $\{0\}$ that holds when the input is zero:
• Coduplication $\Delta^\dagger$ is a linear relation from $k^2$ to $k$ that holds when the two inputs both equal the output:
• Codeletion $!^\dagger$ is a linear relation from $\{0\}$ to $k$ that holds always:
Since $+^\dagger$, $0^\dagger$, $\Delta^\dagger$ and $!^\dagger$ automatically obey turned-around versions of the relations obeyed by $+$, $0$, $\Delta$ and $!$, we see that $k$ acquires a second bicommutative bimonoid structure when considered as an object in $\mathrm{FinRel}_k$.
Moreover, the four dark operations $+$, $0$, $+^\dagger$ and $0^\dagger$ make $k$ into a Frobenius monoid. This means that $(k, +, 0)$ is a monoid, $(k, +^\dagger, 0^\dagger)$ is a comonoid, and the Frobenius relation holds:
All three expressions in this equation are linear relations saying that the sum of the two inputs equals the sum of the two outputs.
The operation sending each linear relation $L$ to its adjoint $L^\dagger$ extends to a contravariant functor $\dagger \colon \mathrm{FinRel}_k \to \mathrm{FinRel}_k$,
which obeys a list of properties that are summarized by saying that $\mathrm{FinRel}_k$ is a †-compact category. Because two of the operations in the Frobenius monoid are adjoints of the other two, it is a †-Frobenius monoid.
This Frobenius monoid is also special, meaning that
comultiplication (in this case $+^\dagger$) followed by multiplication (in this case $+$) equals the identity:
This Frobenius monoid is also commutative—and cocommutative, but for Frobenius monoids this follows from commutativity.
Starting around 2008, commutative special †-Frobenius monoids have become important in the categorical foundations of quantum theory, where they can be understood as ‘classical structures’ for quantum systems. The category $\mathrm{FinHilb}$ of finite-dimensional Hilbert spaces and linear maps is a †-compact category, where any linear map $f \colon H \to K$ has an adjoint $f^\dagger \colon K \to H$ given by $\langle f^\dagger \phi, \psi \rangle = \langle \phi, f \psi \rangle$
for all $\psi \in H$ and $\phi \in K$. A commutative special †-Frobenius monoid in $\mathrm{FinHilb}$ is then the same as a Hilbert space with a chosen orthonormal basis. The reason is that given an orthonormal basis $e_i$ for a finite-dimensional Hilbert space $H$, we can make $H$ into a commutative special †-Frobenius monoid with multiplication $m \colon H \otimes H \to H$ given by $m(e_i \otimes e_j) = e_i$ if $i = j$, and $0$ otherwise,
and unit given by $1 \mapsto \sum_i e_i.$
The comultiplication $m^\dagger$ duplicates basis states: $m^\dagger(e_i) = e_i \otimes e_i.$
Conversely, any commutative special †-Frobenius monoid in $\mathrm{FinHilb}$ arises this way.
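Here is a quick numerical check of this construction (my own sketch in numpy, using the standard basis of $\mathbb{C}^3$; not code from the paper): the multiplication, unit and comultiplication defined above satisfy the unit law and the specialness condition $m m^\dagger = 1$:

```python
import numpy as np

n = 3  # dimension of the Hilbert space H, with its standard orthonormal basis

# Multiplication m : H (x) H -> H with m(e_i (x) e_j) = e_i if i = j, else 0.
m = np.zeros((n, n * n))
for i in range(n):
    m[i, i * n + i] = 1.0

# Unit : C -> H sending 1 to the sum of the basis vectors.
unit = np.ones(n)

# The comultiplication is the adjoint of m; specialness: m m-dagger = identity.
comult = m.conj().T
assert np.allclose(m @ comult, np.eye(n))

# The unit law: multiplying a basis vector by the unit gives it back.
e0 = np.zeros(n); e0[0] = 1.0
assert np.allclose(m @ np.kron(unit, e0), e0)

# The comultiplication duplicates basis states: it sends e_0 to e_0 (x) e_0.
assert np.allclose(comult @ e0, np.kron(e0, e0))
```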
Considerably earlier, around 1995, commutative Frobenius monoids were recognized as important in topological quantum field theory. The reason, ultimately, is that the free symmetric monoidal category on a commutative Frobenius monoid is the category with 2-dimensional oriented cobordisms as morphisms. But the free symmetric monoidal category on a commutative special Frobenius monoid was worked out even earlier: it is the category with finite sets as objects, where a morphism is an isomorphism class of cospans of finite sets:
This category can be made into a †-compact category in an obvious way, and then the 1-element set becomes a commutative special †-Frobenius monoid.
For all these reasons, it is interesting to find a commutative special †-Frobenius monoid lurking at the heart of control theory! However, the Frobenius monoid here has yet another property, which is more unusual. Namely, the unit followed by the counit is the identity:
We call a special Frobenius monoid that also obeys this extra law extra-special. One can check that the free symmetric monoidal category on a commutative extra-special Frobenius monoid is the category with finite sets as objects, where a morphism from $X$ to $Y$ is an equivalence relation on the disjoint union $X \sqcup Y$, and we compose $\sim$ on $X \sqcup Y$ and $\sim'$ on $Y \sqcup Z$ by letting $\sim$ and $\sim'$ generate an equivalence relation on $X \sqcup Y \sqcup Z$ and then restricting this to $X \sqcup Z$.
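This composition rule is easy to sketch with a union-find structure (my own illustration, not from the paper; the element names are arbitrary):

```python
def compose(rel1, rel2, X, Y, Z):
    """Compose equivalence relations: rel1 partitions X + Y (given as a list
    of blocks), rel2 partitions Y + Z; their blocks jointly generate an
    equivalence relation on X + Y + Z, which we restrict to X + Z."""
    parent = {e: e for e in X | Y | Z}

    def find(e):
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for block in rel1 + rel2:
        first = next(iter(block))
        for e in block:
            parent[find(e)] = find(first)

    # Restrict the generated equivalence relation to X + Z.
    classes = {}
    for e in X | Z:
        classes.setdefault(find(e), set()).add(e)
    return sorted(map(sorted, classes.values()))

X, Y, Z = {'x1', 'x2'}, {'y1'}, {'z1'}
# x1 ~ y1 in the first relation and y1 ~ z1 in the second, so in the
# composite x1 is identified with z1 via the middle element y1.
result = compose([{'x1', 'y1'}, {'x2'}], [{'y1', 'z1'}], X, Y, Z)
print(result)  # -> [['x1', 'z1'], ['x2']]
```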
As if this were not enough, the light operations $\Delta$, $!$, $\Delta^\dagger$ and $!^\dagger$ share many properties with the dark ones. In particular, these operations make $k$ into a commutative extra-special †-Frobenius monoid in a second way. In summary:
• $(k, +, 0, \Delta, !)$ is a bicommutative bimonoid;
• $(k, \Delta^\dagger, !^\dagger, +^\dagger, 0^\dagger)$ is a bicommutative bimonoid;
• $(k, +, 0, +^\dagger, 0^\dagger)$ is a commutative extra-special †-Frobenius monoid;
• $(k, \Delta^\dagger, !^\dagger, \Delta, !)$ is a commutative extra-special †-Frobenius monoid.
It should be no surprise that with all these structures built in, signal-flow diagrams are a powerful method of designing processes.
However, it is surprising that most of these structures are present in a seemingly very different context: the so-called ZX calculus, a diagrammatic formalism for working with complementary observables in quantum theory. This arises naturally when one has an $n$-dimensional Hilbert space $H$ with two orthonormal bases $\psi_i$ and $\phi_j$ that are mutually unbiased, meaning that $|\langle \psi_i, \phi_j \rangle|^2 = \frac{1}{n}$
for all $1 \le i, j \le n$. Each orthonormal basis makes $H$ into a commutative special †-Frobenius monoid in $\mathrm{FinHilb}$. Moreover, the multiplication and unit of either one of these Frobenius monoids fits together with the comultiplication and counit of the other to form a bicommutative bimonoid. So, we have all the structure present in the list above—except that these Frobenius monoids are only extra-special if $H$ is 1-dimensional.
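A standard example of mutually unbiased bases (not specific to our paper): the computational basis and the discrete Fourier basis of $\mathbb{C}^n$, as a quick numpy check confirms:

```python
import numpy as np

n = 4  # dimension of the Hilbert space

# The discrete Fourier basis of C^n: column k is the unit vector whose
# j-th entry is exp(2 pi i j k / n) / sqrt(n).
fourier = np.array([[np.exp(2j * np.pi * j * k / n) / np.sqrt(n)
                     for k in range(n)] for j in range(n)])

# The Fourier basis really is orthonormal:
assert np.allclose(fourier.conj().T @ fourier, np.eye(n))

# Mutual unbiasedness against the computational basis e_j:
# |<e_j, f_k>|^2 = |fourier[j, k]|^2 = 1/n for all j, k.
overlaps = np.abs(fourier) ** 2
assert np.allclose(overlaps, 1.0 / n)
```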
The field $k$ is also a 1-dimensional vector space, but this is a red herring: in $\mathrm{FinRel}_k$ every finite-dimensional vector space naturally acquires all four structures listed above, since addition, zero, duplication and deletion are well-defined and obey all the relations we have discussed. Jason and I focus on $k$ in our paper simply because it generates all the objects $k^n$ via direct sum.
Finally, in $\mathrm{FinRel}_k$ the cap and cup are related to the light and dark operations as follows:
Note the curious factor of $-1$ in the second equation, which breaks some of the symmetry we have seen so far. This equation says that two elements $x$ and $y$ sum to zero if and only if $y = -x$. Using the zigzag relations, the two equations above give:
We thus see that in $\mathrm{FinRel}_k$ both additive and multiplicative inverses can be expressed in terms of the generating morphisms used in signal-flow diagrams.
Theorem 4 of our paper gives a presentation of $\mathrm{FinRel}_k$ based on the ideas just discussed. Briefly, it says that $\mathrm{FinRel}_k$ is equivalent to the symmetric monoidal category generated by an object $k$ and these morphisms:
• addition $+ \colon k^2 \rightsquigarrow k$
• zero $0 \colon \{0\} \rightsquigarrow k$
• duplication $\Delta \colon k \rightsquigarrow k^2$
• deletion $! \colon k \rightsquigarrow \{0\}$
• scalar multiplication $c \colon k \rightsquigarrow k$ for any $c \in k$
• cup $\cup \colon k^2 \rightsquigarrow \{0\}$
• cap $\cap \colon \{0\} \rightsquigarrow k^2$
obeying these relations:
(1) $(k, +, 0, \Delta, !)$ is a bicommutative bimonoid;
(2) $\cup$ and $\cap$ obey the zigzag equations;
(3) $(k, +, 0, +^\dagger, 0^\dagger)$ is a commutative extra-special †-Frobenius monoid;
(4) $(k, \Delta^\dagger, !^\dagger, \Delta, !)$ is a commutative extra-special †-Frobenius monoid;
(5) the field operations of $k$ can be recovered from the generating morphisms;
(6) the generating morphisms (1)-(4) commute with scalar multiplication.
Note that item (2) makes $\mathrm{FinRel}_k$ into a †-compact category, allowing us to mention the adjoints of generating morphisms in the subsequent relations. Item (5) means that $+$, $\cdot$, $0$, $1$ and also additive and multiplicative inverses in the field $k$ can be expressed in terms of signal-flow diagrams in the manner we have explained.
So, we have a good categorical understanding of the linear algebra used in signal flow diagrams!
Now Jason is moving ahead to apply this to some interesting problems… but that’s another story, for later.