Categories in Control – Erlangen Talk


I’m visiting Erlangen from now until the end of May, since my wife got a grant to do research here. I’m trying to get a lot of papers finished. But today I’m giving a talk in the math department of the university here, which with Germanic brevity is called the Friedrich-Alexander-Universität Erlangen-Nürnberg.

You can see my slides here, or maybe even come to my talk:

Categories in control, Thursday 6 February 2014, 16:15–18:00, Mathematics Department of the FAU, in Übungsraum 1.

The title is a pun. It’s about categories in control theory, the branch of engineering that studies dynamical systems with inputs and outputs, and how to optimize their behavior.

Control theorists often describe these systems using signal-flow graphs. Here is a very rough schematic signal-flow graph, describing the all-important concept of a ‘feedback loop’:

Here is a detailed one, describing a specific device called a servo:

The device is shown on top, and the signal-flow graph describing its behavior is at bottom. For details, click on the picture.

Now, if you have a drop of category-theorist’s blood in your veins, you’ll look at this signal-flow graph and think my god, that’s a string diagram for a morphism in a monoidal category!

And you’d be right. But if you want to learn what that means, and why it matters, read my talk slides!

The slides should make sense if you’re a mathematician, but maybe not otherwise. So, here’s the executive summary. The same sort of super-abstract math that handles things like Feynman diagrams:

also handles signal-flow graphs. The details are different in important and fascinating ways, and this is what I’m mainly concerned with. But we now understand how signal-flow graphs fit into the general theory of networks. This means we can proceed to use modern math to study them—and their relation to other kinds of networks, like electrical circuit diagrams:

More talks

Thanks to the Azimuth Project team, my graduate students and many other folks, the dream of network theory as a step toward ‘green mathematics’ seems to be coming true! There’s a vast amount left to be done, so I’d have trouble convincing a skeptic, but I feel the project has turned a corner. I now feel in my bones that it’s going to work: we’ll eventually develop a language for biology and ecology based in part on category theory.

So, I think it’s a good time to explain all the various aspects of this project that have been cooking away—some quite visibly, but others on secret back burners:

• Jacob Biamonte and I have written a book on Petri nets and chemical reaction networks. You may have seen parts of this on the blog. We started this project at the Centre for Quantum Technologies, but now he’s working at the Institute for Scientific Interchange, in Turin—and collaborating with people there on various aspects of network theory.

• Brendan Fong is working with me on electrical circuits. You may know him for his posts here on chemical reaction networks. He’s now a grad student in computer science at Oxford.

• Jason Erbele, a math grad student at U.C. Riverside, is working with me on control theory. This work is the main topic of my talk—but I also sketch how it ties together with what Brendan is doing. There’s a lot more to say here.

• Tobias Fritz, a postdoc at the Perimeter Institute, is working with me on category-theoretic aspects of information theory. We published a paper on entropy with Tom Leinster, and we’ve got a followup on relative entropy that’s almost done. I should be working on it right this instant! But for now, read the series of posts here on Azimuth: Relative Entropy Part 1, Part 2 and Part 3.

• Brendan Fong has also done some great work on Bayesian networks, using ideas that connect nicely to what Tobias and I are doing.

• Tu Pham and Franciscus Rebro are working on the math that underlies all these projects: bicategories of spans.

The computer science department at Oxford is a great place for category theory and diagrammatic reasoning, thanks to the presence of Samson Abramsky, Bob Coecke and others. I’m going to visit them from February 21 to March 14. It seems like a good time to give a series of talks on this stuff. So, stay tuned! I’ll try to make slides available here.

33 Responses to Categories in Control – Erlangen Talk

  1. John, I live in Erlangen! At which time is your talk, and in which place and department?

  2. Interesting! Is this approach suitable to non linear systems also?

    Best Regards,

    • John Baez says:

      All this stuff can be generalized to nonlinear systems. It will take a significant amount of work to analyze certain classes of nonlinear systems at the level of detail with which Jason has analyzed the linear ones: he’s worked out a complete list of relations that allow us to go between any two linear signal-flow diagrams that do the same job! (That is, they’re specifications of different machines that do the same thing.)

      I really want to work on nonlinear systems. However, for thinking about how category theory is relevant, it seems more efficient to start by focusing on linear ones. It should then be fairly easy to generalize.

      • Do the Lagrangian subspaces generalize to Lagrangian subvarieties? And if so, are they necessarily invariant under scaling V, V^*, or both (each)?

      • John Baez says:

        When we go from linear electrical circuits to nonlinear ones, these should give Lagrangian subvarieties of the symplectic vector space T^* X \oplus T^* Y, where X is the vector space of input voltages, Y is the vector space of output voltages, T^* X is the space of input voltages and currents, and T^* Y is the space of output voltages and currents.

        In short, nonlinear electrical circuits should give Lagrangian correspondences between symplectic vector spaces.

        Alas, I see no reason in general for these subvarieties to be invariant under rescaling of X, Y or anything else. It would take quite a bit of cleverness to design a nonlinear electrical circuit that would work just the same if you doubled all the voltages, or all the currents, or all the currents and voltages. A typical nonlinear element like a diode gives a fairly wacky relation between current and voltage, like this:

        (Here I mean “wacky” from the viewpoint of an algebraic geometer wanting things that transform simply under rescaling, not wacky from the viewpoint of an electrical engineer trying to build a useful device.)

        • Eugene Lerman says:

          I have a vague memory that Lagrangian relations in symplectic manifolds don’t compose too well, and as a result one ends up with a so called “Wehrheim-Woodward 2-category.” Do you see this (2-)category showing up in non-linear electric circuits?

        • John Baez says:

          Hi, Eugene! I’ve repeatedly talked to Alan Weinstein about this. Even for linear Lagrangian relations between symplectic vector spaces, composition is not continuous, though it’s always well-defined. To compose Lagrangian relations between symplectic manifolds a transversality condition must hold, apparently. The “Wehrheim-Woodward 2-category” is a way of dealing with this.

          I haven’t noticed this showing up in nonlinear circuits. I haven’t looked hard. But my hunch is that for some reason a large class of physically realistic nonlinear circuits give Lagrangian relations where the necessary transversality condition does hold. I would like to know if this is true, and if so I’d like to know why.

          Back on 9 August 2012, Alan wrote:

          Right now, I’m at a conference in memory of Paulette Libermann at the IHP, and hearing Sternberg talk about generating functions led me to the following variation on the Wehrheim Woodward 2 category which I wrote about in my recent preprint.

          Given symplectic X and Y, let’s define a 1-morphism X \leftarrow Y to be, not just a lagrangian submanifold of the product X \times \overline{Y}, but a lagrangian submanifold L of a third manifold Q together with a “reduction morphism” to X \times \overline{Y} from Q, i.e. a canonical relation which identifies X \times \overline{Y} with the quotient of a coisotropic submanifold C of Q by its characteristic foliation. The composition of 1-morphisms is given simply by cartesian product, composed with the natural reduction

          X \times \overline{Z} \leftarrow X \times \overline{Y} \times Y \times \overline{Z}

          given by the diagonal of Y.

          I haven’t yet figured out exactly what the 2-morphisms should be (though I have an idea of a “fractions” construction), but if L is transversal to C above, there should be a 2-isomorphism from the morphism above to the one in which Q is just X \times \overline{Y} and the lagrangian submanifold is the projection of L \cap C into X \times \overline{Y}.

          I guess by now he’s worked out the 2-morphisms?

        • Hi John,

          Very interesting remarks of Alan W’s – it reminds me somewhat of Pronk’s construction of the bicategory of fractions (Compositio Math. 1996), in particular of Gpd(S), the 2-category of groupoids in S. One can rigidify this construction to anafunctors (me, in TAC 2012) if some supplementary data (a pretopology on S) satisfies a condition (is subcanonical). However, in general one must play with the extra freedom that something analogous to Weinstein’s Q \to X \times \overline{Y} provides. Still, this is not a tight enough analogy for me to guess what 2-morphisms might be.

        • Eugene Lerman says:

          The 2-arrows in the Wehrheim-Woodward 2-category are a bit complicated and not easy to guess unless you are fluent in symplectic topology. It is mentioned in the original paper of Wehrheim-Woodward. The point is that there are morphisms between pairs of Lagrangian manifolds, an idea that goes back to Donaldson and Fukaya. Wehrheim-Woodward use pseudoholomorphic quilts (basically moduli of pseudo-holomorphic strips, but I am mangling it).

  3. Jenny M. says:

    Perhaps this is the moment when you add a “green mathematics” tag? I’d find that helpful.

    • John Baez says:

      Hmm, I see what you mean, but it’s a bit tricky. The long network theory and information theory series on this blog are all aimed at developing green mathematics. They just don’t say the phrase “green mathematics” very much: they’re too busy laying down necessary infrastructure to advertise what’s being built. This post is no more “green” than all those: it’s mainly about control theory, which is generally considered a fairly “grey” topic, though some biologists know better. It just happens that while preparing the talk I gave yesterday, I felt an enormous burst of optimism that yes, it’s all going to work.

      It’s possible I need a “pure mathematics” tag to indicate the math posts that are not aimed at developing green mathematics—like the “symmetry and the fourth dimension” series. These posts are just for fun, a kind of sideshow.

      • Jenny M. says:

        I understand the difficulties. Just letting you know there are some folks following “green mathematics” from the sidelines with strong interest and minimal comprehension. Glad you think this is going to work!

      • John Baez says:

        I’m glad you’re interested. As things get closer to working, I’ll try to write things that are easier to follow.

        Right now I mainly need to bring the project to the point where more mathematicians get interested in it. This requires a lot of technical work that only mathematicians will enjoy—and I believe at this stage saying


        would be counterproductive, since they’ll react with suspicion. Mathematicians typically want to see theorems and challenging but very precisely posed questions… except for students, who are still open to the appeal of grand but vague visions.

      • lee bloomquist says:

        Professor Baez, there is also a diagramming tool I once used to model manufacturing processes called Petri nets, which are closely related to those you are studying called stochastic Petri nets.

        However, here is a word to the wise– Don’t propose to develop a commercial tool for modeling the capital budgeting process in customer companies by prototyping and iterating in your own company, no matter how much the tool might ground the sale of certain products to customers based upon objective data. Recall what the economist Erik Brynjolfsson of MIT has described in his papers about “HIPPO” management (management by the “HIghest Paid Person’s in the room Opinion”). This is a management practice he contrasts with management decision-making based upon objective data. Threats to management have in many cases proven to be lethal to career(s)!

        Come to think of it, such a tool might be a good way to regulate companies on their decisions related to climate change. Though with the above lesson in mind, it might be better to approach an organization like the NSA about this, especially when it starts becoming obvious to everybody that climate change is a threat to national security.


  4. Thanks John, that’s very interesting!

    I have two minor comments/observations. First, i think that on slide 12 the ambiguity goes away once you take initial conditions into account, regardless if you are working in the time-domain or frequency-domain (Laplace). Since you always have to take into account initial conditions, I honestly wouldn’t even talk about ambiguity because it might confuse people. Anyway that’s a minor point.

    Second, it seems to me that when you “join” these generators together to create a diagram (e.g. slide 21) you are implicitly stating that the signal coming out of one end of a generator is equal to the one coming in at the input of the following (connected) generator (something like f=g). So that being said it looks (to me) like you should be able to create a feedback as in slide 29 just using one addition, 3 multiplications, and one duplication.

    In other words i really don’t understand the need for the cup and cap morphisms. Why aren’t they just like any other connection between generators as it’s done for example in slide 21 ? What am i missing ?

    • Plus in slide 30 you say that “To allow feedback loops we need morphisms more general than linear maps”.

      However, both the cup and cap morphisms seem very linear, and in fact you even say in slide 32 that they are indeed linear relations, seemingly in contradiction to what was stated before.

      I don’t know, perhaps you can shed some light …

      • John Baez says:

        The cup and cap morphisms are not linear maps; they are linear relations.

        A relation R from a set S to a set T is a subset of S \times T; if the pair (s,t) is in this subset we say s is related to t via the relation R.

        A map or function is a relation such that for each s \in S there is exactly one t \in T such that s is related to t via the relation R.

        When the sets S and T are vector spaces a linear relation from S to T is a linear subspace of S \times T. A linear map is a linear relation that’s a map.

      • John Baez says:

        Let me say a bit more.

        From what I said before, a relation can return many, one, or no outputs for each input. A map or function returns exactly one output for each input.

        For example, squaring is a map from \mathbb{R} to \mathbb{R}, while square-rooting is just a relation from \mathbb{R} to \mathbb{R}, since a real number can have two, one, or no real square roots.
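The difference between a map and a relation can be checked mechanically on finite sets. Here is a minimal Python sketch, assuming we replace \mathbb{R} by a small range of integers; the names `is_map`, `squaring` and `square_rooting` are made up for this illustration and are not taken from the slides.

```python
# A relation from S to T is modelled as a set of (s, t) pairs.
# Small integer ranges stand in for the real line (an assumption made
# only so that everything can be enumerated).

def is_map(relation, domain):
    """True if every s in domain is related to exactly one t."""
    return all(
        len({t for (s2, t) in relation if s2 == s}) == 1
        for s in domain
    )

S = range(-3, 4)  # hypothetical finite set of inputs: -3..3

# Squaring relates each s to the single value s*s.
squaring = {(s, s * s) for s in S}

# Square-rooting relates x to every r in S with r*r == x.
sqrt_domain = range(-1, 10)
square_rooting = {(x, r) for x in sqrt_domain for r in S if r * r == x}

print(is_map(squaring, S))                   # exactly one output per input
print(is_map(square_rooting, sqrt_domain))   # two, one or no outputs per input
```

In this toy setting 4 is related to both 2 and -2, 0 only to 0, and -1 to nothing, which is exactly the "two, one, or no" behaviour described above.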

        For any vector space V, there is a cup relation

        \cup : V \times V \leadsto \{0\}

        where the squiggly arrow means we’re talking about a relation rather than a map.

        Applied to the ordered pair (v,w) \in V \times V, this has the element 0 as output if v = w, while it has no output if v \ne w. This is not a linear map, but it is a linear relation.

        So, the cup is a way of forcing the two inputs v, w to be equal: it’s undefined if they’re not. You can’t do this with a map. But you can do it with a relation.
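To see concretely that the cup is a linear relation but not a map, one can enumerate it over a finite field. The sketch below assumes V is the one-dimensional vector space over Z/3, chosen only so that every subspace is a finite set; the helper names are hypothetical.

```python
from itertools import product

P = 3          # work over the field Z/3 (an assumption, for finiteness)
V = range(P)   # the one-dimensional vector space V = Z/3

# cup : V x V ~> {0}, stored as a set of ((v, w), 0) pairs.
cup = {((v, w), 0) for v, w in product(V, V) if v == w}

def outputs(relation, x):
    """All outputs related to the input x."""
    return {y for (x2, y) in relation if x2 == x}

def flatten(pair):
    """View ((v, w), z) as the vector (v, w, z) in V x V x {0}."""
    (v, w), z = pair
    return (v, w, z)

def is_linear_subspace(vectors):
    """Contains zero and is closed under addition and scaling mod P."""
    vectors = set(vectors)
    n = len(next(iter(vectors)))
    if (0,) * n not in vectors:
        return False
    closed_add = all(
        tuple((a + b) % P for a, b in zip(x, y)) in vectors
        for x in vectors for y in vectors
    )
    closed_scale = all(
        tuple((c * a) % P for a in x) in vectors
        for c in range(P) for x in vectors
    )
    return closed_add and closed_scale

print(outputs(cup, (1, 1)))                     # equal inputs: output 0
print(outputs(cup, (1, 2)))                     # unequal inputs: no output
print(is_linear_subspace(map(flatten, cup)))    # the graph is a subspace
```

The pair (1, 2) has no output at all, so the cup cannot be a map, yet its graph is still a linear subspace of V \times V \times \{0\}.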

        • Giampiero Campa says:

          Ok, thanks for clarifying that.

          So it looks like the cap instead is forcing two outputs to be equal. Is that right ?

          Since in the graph in slide 21 (and others following) you always connect an output of a generator to an input of the next one (assuming a directionality from top to bottom), then you don’t need either the cup or cap.

          But my question at this point is, when i look at the feedback loop in page 29, i don’t see any instances in which you need to force inputs or outputs to be the same. All i see is one addition at the beginning, one duplication at the end, and three linear maps in the middle. Every input (except the reference) is connected to an output, and every output (except the system output) is connected to an input (again assuming some directionality in the arrows). So then again, why the cup and cap ??

        • John Baez says:

          Giampiero wrote:

          So it looks like the cap instead is forcing two outputs to be equal. Is that right?

          Right. For any vector space V, the cap

          \cap : \{0\} \leadsto V \times V

          is the relation where 0 is related to all pairs (v,v) in V \times V. So, you feed in the only possible input, 0, and the allowed outputs are all pairs (v,w) where v = w.
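Here is a small Python sketch of the cap, again assuming V = Z/3 so that relations are finite sets, with the zero-dimensional space \{0\} represented by the empty tuple. It also checks the standard "zigzag" fact that a cap followed by a cup on neighbouring wires straightens out to a plain wire; the names `cap_then_wire` and `wire_then_cup` are made up for this illustration.

```python
from itertools import product

P = 3
V = range(P)   # V = Z/3, an assumption made for finiteness

def compose(r, s):
    """Relational composition: first r, then s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}

# cap : {0} ~> V x V relates the only input, (), to every pair (v, v).
cap = {((), (v, v)) for v in V}

# A cap drawn next to an identity wire: V ~> V x V x V.
cap_then_wire = {((v,), (a, a, v)) for a, v in product(V, V)}

# An identity wire drawn next to a cup: V x V x V ~> V,
# defined only when the last two entries agree.
wire_then_cup = {((a, b, c), (a,)) for a, b, c in product(V, V, V) if b == c}

identity = {((v,), (v,)) for v in V}
zigzag = compose(cap_then_wire, wire_then_cup)

print({y for (_, y) in cap})   # allowed outputs of the cap: all pairs (v, v)
print(zigzag == identity)      # the bent wire equals a straight wire
```

So bending a wire back with a cap and then forward again with a cup changes nothing, which is what lets wires in these diagrams curve freely.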

          But my question at this point is, when i look at the feedback loop in page 29, i don’t see any instances in which you need to force inputs or outputs to be the same.

          The problem is that this loop is drawn in a way so you can’t see the cap and cup unless you know where they must be! First, it’s drawn so you read it from left to right instead of from top to bottom:

          But second, and more importantly, it’s drawn in a rectilinear style that disguises the cap and cup! I should draw my own picture, to fix that.

          For example, look at how the ‘system output’ gets duplicated at right and one copy bends to the left and heads back to the ‘sensor’.

          Here we should see a wire split into two wires that keep going right, and then see the bottom wire bend back and head left. Duplication is a function

          \Delta : V \to V \times V

          but for the second copy of the system output to ‘loop back’, we need to use the cup!

          Similarly, we need a cap at left.

          When I draw the picture in a style that matches the rest of my slides (or get Jason to draw it) I’ll include it here. And I’ll include it in my next version of this talk. I see now that the different style of drawing used above can be confusing.

        • Giampiero Campa says:

          Interesting that my previous comment is awaiting moderation, I wonder what triggered that … anyway, great stuff, i’d love to understand it more, thanks for answering my questions.

        • but for the second copy of the system output to ‘loop back’, we need to use the cup!

          OK so that cup on the right has two inputs. The first one is connected to the system output. The second input of that cup is connected to … what ?

          It should be connected to the output of some other map, right ?

          The only available output is the one of the sensor, but you can’t force the output of the sensor to be equal to the system output (i wish we could).

        • John Baez says:

          Giampiero wrote:

          OK so that cup on the right has two inputs. The first one is connected to the system output. The second input of that cup is connected to … what ?

          Ugh, this is very simple but tough to explain without a decent picture and/or lots of fancy-sounding math. It’s connected to the input of the sensor, as you can see here:

          We duplicate the system output, use a cup to send one copy of the system output back to the sensor, and it becomes the input of the sensor.

          It should be connected to the output of some other map, right?

          Okay, that’s another thing that requires explanation. Given any relation R, we can ‘reverse’ it to get a new relation R^* whose output is the input of R and vice versa. For example, the reverse of the relation ‘child of’ is the relation ‘parent of’. In the picture above, we’ve reversed the sensor in this way.

          Mathematically, a relation

          R : S \leadsto T

          is a subset of S \times T, and its reverse

          R^* : T \leadsto S

          is a subset of T \times S, namely:

          R^* = \{ (t,s) : \; (s,t) \in R \}

          For example, if (s,t) \in R when s is the child of t, then (t,s) \in R^* when t is the parent of s.

          The reverse of a function may not be a function, but working with relations gets around this problem: the reverse of a relation is always another relation. This lets us take any picture of a gadget, reflect it, and get something that still makes sense. The official ‘output’ of the reversed gadget is what we were calling the ‘input’ of the original gadget, and vice versa.
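The reversal operation is a one-liner on finite relations. In this Python sketch the family names are invented purely for illustration, and `is_map` is a hypothetical helper for testing whether a relation happens to be a function.

```python
def reverse(relation):
    """R* = {(t, s) : (s, t) in R}."""
    return {(t, s) for (s, t) in relation}

def is_map(relation, domain):
    """True if every s in domain is related to exactly one t."""
    return all(
        len({t for (s2, t) in relation if s2 == s}) == 1
        for s in domain
    )

# (s, t) is in child_of when s is the child of t (made-up names).
child_of = {("kid", "mum"), ("kid", "dad")}
parent_of = reverse(child_of)

# Squaring on a few integers is a function; its reverse is only a relation.
inputs = range(-2, 3)
squaring = {(s, s * s) for s in inputs}
unsquaring = reverse(squaring)

print(("mum", "kid") in parent_of)        # mum is a parent of kid
print(reverse(parent_of) == child_of)     # reversing twice gives R back
print(is_map(squaring, inputs))           # a function
print(is_map(unsquaring, {0, 1, 4}))      # 1 and 4 each have two square roots
```

Reversing never leaves the world of relations, even when it leaves the world of functions, which is why any reflected picture still makes sense.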

        • Ok, so the “reverse sensor” map comes out with an _input_ on the left hand side, which explains why you say you use a cap to force the input of the reverse sensor (that is the output of the original sensor) to be equal to one of the inputs of the addition at the beginning. is that right ?

          It still feels very weird that in this formalism you can’t just directly connect the output to the duplicator to the input of the sensor (the way you connect for example the output of the addition to the controller input that follows it).

          Perhaps you do all this because you want to preserve some kind of directionality from left to right for ALL the gadgets ?

          I am more used to think that the only important directionality is the one of the arrow, because it carries the signal in that direction, and that it does not matter how it is oriented on the page, or if it bends or zig zags around, but perhaps it’s not so in this formalism.

        • John Baez says:

          Giampiero wrote:

          Ok, so the “reverse sensor” map comes out with an _input_ on the left hand side, which explains why you say you use a cap to force the input of the reverse sensor (that is the output of the original sensor) to be equal to one of the inputs of the addition at the beginning. is that right?

          Yeah, sorta. But it sounds very complicated and surreal. I’m really just talking about the usual math of relations, and operations on relations. Since the purely mathematical treatment in my slides didn’t seem to be communicating the ideas clearly, I started talking to you using something that vaguely resembles plain English. Now I’m starting to regret that, because the ‘plain English’ is getting very convoluted.

          ‘The input of the reversed sensor is the output of the sensor’ sounds very scary. All it really means is this: if you have a box where you plug in an input wire on the left and the output comes out a wire on the right, and you turn around the box, now you have to plug the input wire on the right and the output comes out the left!

          The turned-around box can be seen as a new device, a ‘reversed sensor’ whose input is the output of the old device, and vice versa. But this makes a simple business seem rather mysterious.

          I am more used to think that the only important directionality is the one of the arrow, because it carries the signal in that direction, and that it does not matter how it is oriented on the page, or if it bends or zig zags around, but perhaps it’s not so in this formalism.

          If you look at my slides you’ll see there are no arrows. So, the picture I borrowed because I was too lazy to draw my own is confusing you in yet another way: it has arrows.

          I think the best thing to do at this point is wait until I write something else about this formalism, and continue the conversation then. It’ll have pictures—and they won’t be confusing pictures drawn by someone else in a different style than the one I’m using—so it will make a lot more sense than the conversation we’re having now. This conversation has been very instructive to me, because it reminds me that there are several different ways to think about these things, which become very confusing when mixed with each other.

        • John Baez says:

          Perhaps this will help. There are various ways one can try to keep track of ‘directionality of signal flow’ for diagrams made of boxes connected by wires, where wires touch the boxes at specific points—let me call those ‘ports’.

          One is to draw arrows on the wires connecting boxes, but not label the ports on boxes. This works if every box allows one input and one output and we impose a rule saying each box must be connected to one wire with an arrow pointing in and one wire with an arrow pointing out, as here:

          However, this approach becomes problematic as soon as we have boxes with more than two ports.

          Another approach is to not draw arrows on wires, but instead have boxes with labelled ports. For example, a box could have two ports labelled ‘input’ and ‘output’. Or, it could have five ports labelled ‘A’, ‘B’, ‘C’, ‘D’ and ‘E’. In this setup, every diagram made of boxes and wires where every port has a wire coming in gives a device that makes sense.

          This latter approach is the one that naturally arises in the mathematics of relations, and it’s the one I want to use.

          However, while having this conversation I noticed that the symbol for scalar multiplication in my talk slides doesn’t obey the rules I just laid down: when you’ve got a circle-shaped box with two wires coming in, you can’t really tell which end of the circle is the input port and which is the output port. I’d been reading them from top to bottom, but we won’t always want this. So, I think we need to fix this.

        • I think that if you work with maps (which might not be invertible) then you have an implicit natural directionality in the blocks, so you have to label inputs and outputs differently and this leads to the fact that arrows have a direction as well.

          This is the most common convention in control theory and practice (and by the way this is an example).

          But you are working with both relations and maps, and I see how cup and caps might be useful in situations where you want to share variables, for example.

          I just don’t see the feedback loop as an example in which you absolutely have to use them, because to me the feedback loop is not qualitatively different from the graph in slide 21 (where cups and caps are not used).

          But alright, ok, i will wait until you write something else and draw your own pictures, i promise :-)

        • John Baez says:

          Thanks for helping me identify a point where my formalization has something new to offer.

          I claim you really need caps and cups to implement general feedback loops. The feedback loop in slide 21 may not look like it has caps, cups and relations, but it does—or more precisely, we need to think of it this way to get a reasonably efficient and elegant mathematical treatment. We also need relations to handle linear electrical circuits in a reasonably nice way.

          Control theorists and electrical engineers may not believe this, but that just means I have something to teach them (which is good). Please wait for further talks and/or papers, where I’ll try to explain this in a convincing way!

        • guest says:

          Why do we need cap and cup? Can’t traced monoidal categories — where “trace” simply maps morphisms of the form (X (x) U) -> (Y (x) U) to morphisms X -> Y (the U output has been fed back into the input) — give us the appropriate notion? This seems to map more comfortably to the way that system theorists tend to view “feedback” as a sort of primitive operation/notion.

        • John Baez says:

          The monoidal category of finite-dimensional vector spaces and linear relations is compact closed when we use direct sum as our tensor product, so it has a cap and cup whether we want them or not. These allow us to describe, not just feedback, but arbitrary physical systems involving linear signal processing and wires that bend up and down. Compact closed monoidal categories are traced, but it would be a shame to focus solely on the lesser structure when you actually have more.
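To illustrate how the trace sits inside the relational picture, here is a hedged Python sketch. It uses the usual formula for the trace of a relation R from X \oplus U to Y \oplus U, namely all pairs (x, y) for which some u satisfies ((x,u),(y,u)) \in R, and it assumes the field Z/5 for finiteness. The toy system `loop_with_gain` is invented for this example and is not taken from the slides.

```python
P = 5
F = range(P)   # the field Z/5, an assumption made so relations are finite

def loop_with_gain(a):
    """Graph of the linear map (x, u) |-> (y, u') with y = u' = x + a*u mod P."""
    return {
        ((x, u), ((x + a * u) % P, (x + a * u) % P))
        for x in F for u in F
    }

def trace(relation):
    """Feed the second output back into the second input."""
    return {(x, y) for ((x, u), (y, u2)) in relation if u == u2}

def is_map(relation, domain):
    """True if every x in domain is related to exactly one y."""
    return all(
        len({y for (x2, y) in relation if x2 == x}) == 1
        for x in domain
    )

closed_gain_2 = trace(loop_with_gain(2))   # u = x + 2u forces u = -x
closed_gain_1 = trace(loop_with_gain(1))   # u = x + u forces x = 0, u free

print(is_map(closed_gain_2, F))   # feedback yields an honest linear map
print(is_map(closed_gain_1, F))   # feedback yields only a linear relation
print(sorted(closed_gain_1))
```

With gain 1 the closed loop only accepts the input 0 and then allows every output, so the trace of a linear map can fail to be a linear map. That is one way to see why feedback pushes us from maps to relations, whether we phrase it with a trace or with cup and cap.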

  5. […] Neat article from John Baez on the use of category theory in control theory. I like how he describes category theory as a way of formally studying anything that can be diagrammatically expressed. […]
