Applied Category Theory

6 April, 2017

The American Mathematical Society is having a meeting here at U. C. Riverside during the weekend of November 4th and 5th, 2017. I’m organizing a session on Applied Category Theory, and I’m looking for people to give talks.

The goal is to start a conversation about applications of category theory, not within pure math or fundamental physics, but to other branches of science and engineering—especially those where the use of category theory is not already well-established! For example, my students and I have been applying category theory to chemistry, electrical engineering, control theory and Markov processes.

Alas, we have no funds for travel and lodging. If you’re interested in giving a talk, please submit an abstract here:

General information about abstracts, American Mathematical Society.

More precisely, please read the information there and then click on the link on that page to submit an abstract. It should then magically fly through cyberspace to me! Abstracts are due September 12th, but the sooner you submit one, the greater the chance that we’ll have space.

For the program of the whole conference, go here:

Fall Western Sectional Meeting, U. C. Riverside, Riverside, California, 4–5 November 2017.

We’ll be having some interesting plenary talks:

• Paul Balmer, UCLA, An invitation to tensor-triangular geometry.

• Pavel Etingof, MIT, Double affine Hecke algebras and their applications.

• Monica Vazirani, U.C. Davis, Combinatorics, categorification, and crystals.

Category Theory Course Notes

11 March, 2016

My grad students and I have been using a lot of category theory in our work on networks in engineering, chemistry and biology. So I decided to teach an introductory course on category theory, and it was surprisingly popular: 25 grad students registered for it! Clearly there’s a lot of interest in the subject, and we don’t regularly teach a course on it at U. C. Riverside.

Here are the notes from my course. In my earlier Fall 2015 seminar, I had tried to explain how category theory unifies mathematics, without getting into lots of technical details. This time I tried to give a systematic introduction to the subject, leading up to a little taste of topos theory. But many proofs were offloaded to another more informal seminar, for which notes are not available.

When I started teaching this course, I imagined that the notes might someday grow into a book I’d always dreamt of: an introduction to category theory that includes lots of examples, talks to the reader in a friendly way, and explains what’s ‘really going on’. However, while teaching the course, I noticed that Emily Riehl has written a book like this, probably better than I ever could. Even better, her book is free online and will be published by Dover:

• Emily Riehl, Category Theory in Context, 2014.

So, I don’t feel much urge to write that book anymore. But there might still be room for a more quirky book on category theory that only I could write. It would probably need to include not only this ‘standard’ material but more on applications.

If you discover any errors in the notes please email me, and I’ll add them to the list of errors.

You can get all 10 weeks of notes in a single file here:

Or, you can look at individual weeks:

Week 1 (Jan. 5 and 7) – The definition of a category. Some familiar categories. Various kinds of categories, including monoids, groupoids, groups, preorders, equivalence relations and posets. The definition of a functor. Doing mathematics inside a category: isomorphisms, monomorphisms and epimorphisms.

Week 2 (Jan. 12 and 14) – Doing mathematics inside a category: an isomorphism is a monomorphism and epimorphism, but not necessarily conversely. Products. Any object isomorphic to a product can also be a product. Products are unique up to isomorphism. Coproducts. What products and coproducts are like in various familiar categories. General limits and colimits. Examples: products and coproducts, equalizers and coequalizers, pullbacks and pushouts, terminal and initial objects.

Week 3 (Jan. 19 and 21) – Equalizers and coequalizers, and what they look like in $\textrm{Set}$ and other familiar categories. Pullbacks and pushouts, and what they look like in $\textrm{Set}.$ Composing pullback squares.

Week 4 (Jan. 26 and 28) – Doing mathematics between categories. Faithful, full, and essentially surjective functors. Forgetful functors: what it means for a functor to forget nothing, forget properties, forget structure or forget stuff. Transformations between functors. Natural transformations. Functor categories. Natural isomorphisms. In a category with binary products, the product becomes a functor, and the commutative and associative laws hold up to natural isomorphism. Cartesian categories. In a cartesian category, the left and right unit laws also hold up to natural isomorphism. A $G$-set is a functor from a group $G$ to $\textrm{Set}.$ What is a natural transformation between such functors?

Week 5 (Feb. 2 and 4) – A $G$-set is a functor from a group $G$ to $\textrm{Set},$ and a natural transformation between such functors is a map of $G$-sets. Equivalences of categories. Adjoint functors: the rough idea. The hom-functor. Adjoint functors: the definition. Examples: the left adjoint of the forgetful functor from $\textrm{Grp}$ to $\textrm{Set}.$ The left adjoint of the forgetful functor from $\textrm{Vect}_k$ to $\textrm{Set}.$ The forgetful functor from $\textrm{Top}$ to $\textrm{Set}$ has both a left and right adjoint. If a category $C$ has binary products, the diagonal functor from $C$ to $C \times C$ has a right adjoint. If it has binary coproducts, the diagonal functor has a left adjoint.

Week 6 (Feb. 9 and 11) – Diagrams in a category as functors. Cones as natural transformations. The process of taking limits as a right adjoint. The process of taking colimits as a left adjoint. Left adjoints preserve colimits; right adjoints preserve limits. Examples: the ‘free group’ functor from sets to groups preserves coproducts, while the forgetful functor from groups to sets preserves products. The composite of left adjoints is a left adjoint; the composite of right adjoints is a right adjoint. The unit and counit of a pair of adjoint functors.

Week 7 (Feb. 16 and 18) – Adjunctions. The naturality of the isomorphism $\textrm{hom}(Fc,d) \cong \textrm{hom}(c,Ud)$ in an adjunction. Given an adjunction, we can recover this isomorphism and its inverse from the unit and counit. Toward topos theory: cartesian closed categories and subobject classifiers. The definition of cartesian closed category, or ‘ccc’. Examples of cartesian closed categories. In a cartesian closed category with coproducts, the product distributes over the coproduct, and exponentiation distributes over the product.

Week 8 (Feb. 23) – Internalization. The concept of a group in a cartesian category. Any pair of objects $X, Y$ in a cartesian closed category has an ‘internal’ hom, the object $Y^X,$ as well as the usual ‘external’ hom, the set $\textrm{hom}(X,Y).$ Evaluation and coevaluation. Internal composition. In a category with a terminal object, we can define the set of elements of any object.

Week 8 (Feb. 25) – Guest lecture by Christina Osborne on symmetric monoidal categories.

Week 9 (Mar. 1 and 3) – For any category $C$ with a terminal object, elements define a functor $\textrm{elt} : C \to \textrm{Set}.$ If $C$ is cartesian, this functor preserves finite products. If $C$ is cartesian closed, $\textrm{elt}(Y^X) \cong \hom(X,Y),$ so it converts the internal hom into the external hom. The ‘name’ of a morphism. Subobjects. The subobject classifier in $\textrm{Set}.$ The general definition of subobject classifier in any category with finite limits. The definition of a topos. Examples of topoi, including the topos of graphs.

Week 10 (Mar. 8 and 10) – The subobject classifier in the topos of graphs. Any topos has finite colimits. Any morphism in a topos has an epi-mono factorization, which is unique up to a unique isomorphism. The image of a morphism in a topos. The poset $\textrm{Sub}(X),$ whose elements are subobjects of an object $X$ in a topos. The correspondence between set theory and logic: given a set $X,$ subsets of $X$ correspond to predicates defined for elements of $X,$ intersection corresponds to ‘and’, union corresponds to ‘or’, the set $X$ itself corresponds to ‘true’, and the empty set corresponds to ‘false’. The intersection of subsets of $X$ is their product in $\textrm{Sub}(X),$ their union is their coproduct in $\textrm{Sub}(X),$ the set $X$ is the terminal object in $\textrm{Sub}(X),$ and the empty set is the initial object. A lattice is a poset with finite limits and finite colimits, and a Heyting algebra is a lattice that is also cartesian closed. For any object $X$ in any topos, $\textrm{Sub}(X)$ is a Heyting algebra. If we think of the elements of $\textrm{Sub}(X)$ as predicates, the exponential is ‘implication’.

Where does topos theory go from here?
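The Week 10 statements about $\textrm{Sub}(X)$ can be checked by brute force when $X$ is a small finite set. The sketch below is only an illustration, not part of the course notes: the helper names `subsets` and `implies` are made up here, and the formula used for the exponential, $(X \setminus A) \cup B,$ is the standard one for subsets of a set rather than something stated in the notes. It tests the defining property of a cartesian closed poset: $C \cap A \subseteq B$ exactly when $C \subseteq (A \Rightarrow B).$

```python
from itertools import combinations

def subsets(X):
    """All subsets of the finite set X, i.e. the elements of Sub(X)."""
    items = sorted(X)
    return [frozenset(c)
            for n in range(len(items) + 1)
            for c in combinations(items, n)]

def implies(A, B, X):
    """The exponential of B by A in Sub(X): where 'A implies B' holds.

    Assumption: for plain sets this is (X minus A) union B.
    """
    return (X - A) | B

X = frozenset({1, 2, 3})
sub_X = subsets(X)

# Product = intersection, and the exponential is right adjoint to it:
# (C and A) <= B  if and only if  C <= (A implies B).
adjunction_holds = all(
    ((C & A) <= B) == (C <= implies(A, B, X))
    for A in sub_X for B in sub_X for C in sub_X
)
print(adjunction_holds)
```

The check runs over all $8^3$ triples of subsets of a three-element set, so it is a sanity test rather than a proof.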

All the notes are handwritten, by Christina Osborne and Samuel Britton. I’d consider paying someone to TeX them up! But as you see, there are a lot of diagrams… so you should only try it if you want to learn category theory while practicing your TikZ. And if you try it, you should let us know – it would be silly to have more than one person doing the same job, while chopping the job into parts might work well.

5 February, 2014

Since one goal of Azimuth is to connect mathematicians to projects that can more immediately help the world, I want to pass this on. It’s a press release put out by Jocelyn Paine, who has blogged about applied category theory on the n-Category Café. I think he’s a serious guy, so I hope we can help him out!

Spreadsheet researcher Jocelyn Ireson-Paine has launched an Indiegogo campaign to fund a project to make spreadsheets safer. It will show how to write spreadsheets that are easier to read and less error-prone than when written in Excel. This is important because spreadsheet errors have cost some companies millions of pounds, even causing resignations and share-price crashes. An error in one spreadsheet, an economic model written in 2010 by Harvard economists Carmen Reinhart and Kenneth Rogoff, has even been blamed for tax rises and public-sector cuts. If he gets funding, Jocelyn will re-engineer this spreadsheet. He hopes that, because of its notoriety, this will catch public attention.

Reinhart and Rogoff’s spreadsheet was part of a paper on the association between debt and economic growth. They concluded that in countries where debt exceeds 90% of gross domestic product, growth is notably lower. But in spring 2013, University of Massachusetts student Thomas Herndon found they had omitted data when calculating an average. Because their paper’s conclusion supported governments’ austerity programmes, much criticism followed. They even received hate email blaming them for tax rises and public-sector cuts.

Jocelyn said, “The error probably didn’t change the results much. But better software would have made the nature of the error clearer, as well as the economics calculations, thus averting ill-informed and hurtful media criticism. Indeed, it might have avoided the error altogether.”

Jocelyn’s project will use two ideas. One is “literate programming”. Normally, a programmer writes a program first, then adds comments explaining how it works. But in literate programming, the programmer becomes an essayist. He or she first writes the explanation, then inserts the calculations as if putting equations into a maths essay. In ordinary spreadsheets, you’re lucky to get any documentation at all; in literate spreadsheets, documentation comes first.

The other idea is “modularity”. This means building spreadsheets from self-contained parts which can be developed, tested, and documented independently of one another. This gives the spreadsheet’s author less to think about, making mistakes less likely. It also makes it easier to replace parts that do have mistakes.

Jocelyn has embodied these ideas in a piece of software named Excelsior. He said, “‘Excelsior’ means ‘higher’ in Latin, and ‘upwards!’ in Longfellow’s poem. I think of it as meaning ‘upwards from Excel’. In fact, though, it’s the name of a wonderful Oxford café where I used to work on my ideas.”

Jocelyn also wants to show how advanced maths benefits computing. Some of his inspiration came from a paper he found on a friend’s desk in the Oxford University Department of Computer Science. Written by professor Joseph Goguen, this used a branch of maths called category theory to elucidate what it means for something to be part of a system, and how the behaviour of a system arises from the behaviours of its parts. Jocelyn said, “The ideas in the paper were extremely general, applying to many different areas. And when you think of modules as parts, they even apply to spreadsheets. This shows the value of abstraction.”

For more

Postal: 23 Stratfield Road, Oxford, OX2 7BG, UK.
Tel: 07768 534 091.

Jocelyn’s bio: http://www.j-paine.org/bio.html

Jocelyn’s personal website, for academic and general stuff: http://www.j-paine.org/

Background information: links to all topics mentioned can be found at the end of Paine’s campaign text at

These include literate programming, modularity, the Reinhart-Rogoff spreadsheet, category theory, and many horror stories about the damage caused by spreadsheet errors.

Category Theory for Scientists

23 May, 2013

At last—a textbook on category theory for scientists! And it’s free!

• David Spivak, Category Theory for Scientists.

It’s based on a course the author taught:

This course is an attempt to extol the virtues of a new branch of mathematics, called category theory, which was invented for powerful communication of ideas between different fields and subfields within mathematics. By powerful communication of ideas I actually mean something precise. Different branches of mathematics can be formalized into categories. These categories can then be connected together by functors. And the sense in which these functors provide powerful communication of ideas is that facts and theorems proven in one category can be transferred through a connecting functor to yield proofs of an analogous theorem in another category. A functor is like a conductor of mathematical truth.

I believe that the language and toolset of category theory can be useful throughout science. We build scientific understanding by developing models, and category theory is the study of basic conceptual building blocks and how they cleanly fit together to make such models. Certain structures and conceptual frameworks show up again and again in our understanding of reality. No one would dispute that vector spaces are ubiquitous. But so are hierarchies, symmetries, actions of agents on objects, data models, global behavior emerging as the aggregate of local behavior, self-similarity, and the effect of methodological context.

Some ideas are so common that our use of them goes virtually undetected, such as set-theoretic intersections. For example, when we speak of a material that is both lightweight and ductile, we are intersecting two sets. But what is the use of even mentioning this set-theoretic fact? The answer is that when we formalize our ideas, our understanding is almost always clarified. Our ability to communicate with others is enhanced, and the possibility for developing new insights expands. And if we are ever to get to the point that we can input our ideas into computers, we will need to be able to formalize these ideas first.

It is my hope that this course will offer scientists a new vocabulary in which to think and communicate, and a new pipeline to the vast array of theorems that exist and are considered immensely powerful within mathematics. These theorems have not made their way out into the world of science, but they are directly applicable there. Hierarchies are partial orders, symmetries are group elements, data models are categories, agent actions are monoid actions, local-to-global principles are sheaves, self-similarity is modeled by operads, context can be modeled by monads.

He asks readers from different subjects for help in finding new ways to apply category theory to those subjects. And that’s the right attitude to take when reading this book. I’ve found categories immensely valuable in my work. But it took effort to learn category theory and see how it can apply to different subjects. People are just starting to figure out these things, so don’t expect instant solutions to the problems in your own favorite field.

But Spivak does the best job I’ve seen so far at explaining category theory as a general-purpose tool for thinking clearly. Since I’m busy using category theory to clarify the relationships between fields like chemistry, population biology, electrical engineering and control theory, this subject is very much on my mind.

Compositionality in Network Theory

29 November, 2016

I gave a talk at the workshop on compositionality at the Simons Institute for the Theory of Computing. I spoke about some new work with Blake Pollard. You can see the slides here:

• John Baez, Compositionality in network theory, 6 December 2016.

and a video here:

Abstract. To describe systems composed of interacting parts, scientists and engineers draw diagrams of networks: flow charts, Petri nets, electrical circuit diagrams, signal-flow graphs, chemical reaction networks, Feynman diagrams and the like. In principle all these different diagrams fit into a common framework: the mathematics of symmetric monoidal categories. This has been known for some time. However, the details are more challenging, and ultimately more rewarding, than this basic insight. Two complementary approaches are presentations of symmetric monoidal categories using generators and relations (which are more algebraic in flavor) and decorated cospan categories (which are more geometrical). In this talk we focus on the latter.

This talk assumes considerable familiarity with category theory. For a much gentler talk on the same theme, see:

Corelations in Network Theory

2 February, 2016

Category theory reduces a large chunk of math to the clever manipulation of arrows. One of the fun things about this is that you can often take a familiar mathematical construction, think of it category-theoretically, and just turn around all the arrows to get something new and interesting!

In math we love functions. If we have a function

$f: X \to Y$

we can formally turn around the arrow to think of $f$ as something going back from $Y$ to $X$. But this something is usually not a function: it’s called a ‘cofunction’. A cofunction from $Y$ to $X$ is simply a function from $X$ to $Y.$

Cofunctions are somewhat interesting, but they’re really just functions viewed through a looking glass, so they don’t give much new—at least, not by themselves.

The game gets more interesting if we think of functions and cofunctions as special sorts of relations. A relation from $X$ to $Y$ is a subset

$R \subseteq X \times Y$

It’s a function when for each $x \in X$ there’s a unique $y \in Y$ with $(x,y) \in R.$ It’s a cofunction when for each $y \in Y$ there’s a unique $x \in X$ with $(x,y) \in R.$

Just as we can compose functions, we can compose relations. Relations have certain advantages over functions: for example, we can ‘turn around’ any relation $R$ from $X$ to $Y$ and get a relation $R^\dagger$ from $Y$ to $X:$

$R^\dagger = \{(y,x) : \; (x,y) \in R \}$

If we turn around a function we get a cofunction, and vice versa. But we can also do other fun things: for example, since both functions and cofunctions are relations, we can compose a function and a cofunction and get a relation.
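For finite sets these definitions are easy to try out. Here is a minimal sketch, assuming a relation is stored as a Python set of pairs; the function names `is_function`, `is_cofunction`, `dagger` and `compose` are invented for this illustration. It turns a function around to get a cofunction, and then composes the function with that cofunction to get a relation that is neither.

```python
def is_function(R, X, Y):
    """True if each x in X is related to exactly one y in Y."""
    return all(sum((x, y) in R for y in Y) == 1 for x in X)

def is_cofunction(R, X, Y):
    """True if each y in Y is related to exactly one x in X."""
    return all(sum((x, y) in R for x in X) == 1 for y in Y)

def dagger(R):
    """Turn a relation around: each pair (x, y) becomes (y, x)."""
    return {(y, x) for (x, y) in R}

def compose(R, S):
    """First R (from X to Y), then S (from Y to Z): a relation from X to Z."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}

X = {1, 2, 3}
Y = {"a", "b"}
f = {(1, "a"), (2, "a"), (3, "b")}    # a function from X to Y

f_dagger = dagger(f)                   # a relation from Y to X
print(is_function(f, X, Y))            # f is a function
print(is_cofunction(f_dagger, Y, X))   # turned around, it is a cofunction
print(is_function(f_dagger, Y, X))     # but not a function: "a" has two partners

# A function followed by a cofunction: relates x to x' when f(x) = f(x').
same_value = compose(f, f_dagger)
print(sorted(same_value))
```

In this example the composite relates two elements of $X$ exactly when $f$ sends them to the same element of $Y,$ so it is an equivalence relation on $X$ rather than a function or a cofunction.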

Of course, relations also have certain disadvantages compared to functions. But it’s utterly clear by now that the category $\mathrm{FinRel},$ where the objects are finite sets and the morphisms are relations, is very important.

So far, so good. But what happens if we take the definition of ‘relation’ and turn all the arrows around?

There are actually several things I could mean by this question, some more interesting than others. But one of them gives a very interesting new concept: the concept of ‘corelation’. And two of my students have just written a very nice paper on corelations:

• Brandon Coya and Brendan Fong, Corelations are the prop for extraspecial commutative Frobenius monoids.

Here’s why this paper is important for network theory: corelations between finite sets are exactly what we need to describe electrical circuits made of ideal conductive wires! A corelation from a finite set $X$ to a finite set $Y$ can be drawn this way:

I have drawn more wires than strictly necessary: I’ve drawn a wire between two points whenever I want current to be able to flow between them. But there’s a reason I did this: a corelation from $X$ to $Y$ simply tells us when current can flow from one point in either of these sets to any other point in these sets.

Of course circuits made solely of conductive wires are not very exciting for electrical engineers. But in an earlier paper, Brendan introduced corelations as an important stepping-stone toward more general circuits:

• John Baez and Brendan Fong, A compositional framework for passive linear circuits. (Blog article here.)

The key point is simply that you use conductive wires to connect resistors, inductors, capacitors, batteries and the like and build interesting circuits—so if you don’t fully understand the math of conductive wires, you’re limited in your ability to understand circuits in general!

In their new paper, Brendan teamed up with Brandon Coya, and they figured out all the rules obeyed by the category $\mathrm{FinCorel},$ where the objects are finite sets and the morphisms are corelations. I’ll explain these rules later.

This sort of analysis had previously been done for $\mathrm{FinRel},$ and it turns out there’s a beautiful analogy between the two cases! Here is a chart displaying the analogy:

Spans | Cospans
extra bicommutative bimonoids | special commutative Frobenius monoids
Relations | Corelations
extraspecial bicommutative bimonoids | extraspecial commutative Frobenius monoids

I’m sure this will be cryptic to the nonmathematicians reading this, and even many mathematicians—but the paper explains what’s going on here.

I’ll actually say what an ‘extraspecial commutative Frobenius monoid’ is later in this post. This is a terse way of listing all the rules obeyed by corelations between finite sets—and thus, all the rules obeyed by conductive wires.

But first, let’s talk about something simpler.

What is a corelation?

Just as we can define functions as relations of a special sort, we can also define relations in terms of functions. A relation from $X$ to $Y$ is a subset

$R \subseteq X \times Y$

but we can think of this as an equivalence class of one-to-one functions

$i: R \to X \times Y$

Why an equivalence class? The image of $i$ is our desired subset of $X \times Y.$ The set $R$ here could be replaced by any isomorphic set; its only role is to provide ‘names’ for the elements of $X \times Y$ that are in the image of $i.$

Now we have a relation described as an arrow, or really an equivalence class of arrows. Next, let’s turn the arrow around!

There are different things I might mean by that, but we want to do it cleverly. When we turn arrows around, the concept of product (for example, cartesian product $X \times Y$ of sets) turns into the concept of sum (for example, disjoint union $X + Y$ of sets). Similarly, the concept of monomorphism (such as a one-to-one function) turns into the concept of epimorphism (such as an onto function). If you don’t believe me, click on the links!

So, we should define a corelation from a set $X$ to a set $Y$ to be an equivalence class of onto functions

$p: X + Y \to C$

Why an equivalence class? The set $C$ here could be replaced by any isomorphic set; its only role is to provide ‘names’ for the sets of elements of $X + Y$ that get mapped to the same thing via $p.$

In simpler terms, a corelation from a set $X$ to a set $Y$ is just a partition of the disjoint union $X + Y.$ So, it looks like this:

If we like, we can then draw a line connecting any two points that lie in the same part of the partition:

These lines determine the corelation, so we can also draw a corelation this way:

This is why corelations describe circuits made solely of wires!
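To make the passage from onto functions to partitions concrete, here is a small sketch for finite sets. The tagging scheme for the disjoint union and the names `disjoint_union` and `partition_of` are assumptions made for this example. It shows that two onto functions $p$ and $q$ whose targets differ only by renaming give the same partition of $X + Y,$ which is why a corelation is an equivalence class.

```python
def disjoint_union(X, Y):
    """Tag elements so that X and Y stay distinct inside X + Y."""
    return [("X", x) for x in X] + [("Y", y) for y in Y]

def partition_of(p, X, Y):
    """The partition of X + Y induced by a function p defined on X + Y.

    Two points lie in the same part exactly when p sends them to the same value.
    """
    parts = {}
    for point in disjoint_union(X, Y):
        parts.setdefault(p[point], set()).add(point)
    return {frozenset(part) for part in parts.values()}

X = [1, 2, 3]
Y = [1, 2]

# An onto function p : X + Y -> C with C = {"c1", "c2", "c3"}.
p = {("X", 1): "c1", ("X", 2): "c1", ("X", 3): "c2",
     ("Y", 1): "c2", ("Y", 2): "c3"}

# The same function with C replaced by an isomorphic set {10, 20, 30}.
rename = {"c1": 10, "c2": 20, "c3": 30}
q = {point: rename[name] for point, name in p.items()}

corelation = partition_of(p, X, Y)
print(corelation == partition_of(q, X, Y))   # the names in C do not matter
for part in sorted(corelation, key=sorted):
    print(sorted(part))
```

Each printed part is a set of points joined by wires: here the first two inputs are wired together, the third input is wired to the first output, and the second output is on its own.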

The rules governing corelations

The main result in Brandon and Brendan’s paper is that $\mathrm{FinCorel}$ is equivalent to the PROP for extraspecial commutative Frobenius monoids. That’s a terse way of stating the laws governing $\mathrm{FinCorel}.$

Let me just show you the most important laws. In each of these laws I’ll draw two circuits made of wires, and write an equals sign asserting that they give the same corelation from a set $X$ to a set $Y.$ The inputs $X$ of each circuit are on top, and the outputs $Y$ are at the bottom. I’ll draw 3-way junctions as little triangles, but don’t worry about that. When we compose two corelations we may get a wire left in mid-air, not connected to the inputs or outputs. We draw the end of the wire as a little circle.

There are some laws called the ‘commutative monoid’ laws:

and an upside-down version called the ‘cocommutative comonoid’ laws:

Then we have ‘Frobenius laws’:

and finally we have the ‘special’ and ‘extra’ laws:

All other laws can be derived from these in some systematic ways.

Commutative Frobenius monoids obey the commutative monoid laws, the cocommutative comonoid laws and the Frobenius laws. They play a fundamental role in 2d topological quantum field theory. Special Frobenius monoids are also well-known. But the ‘extra’ law, which says that a little piece of wire not connected to anything can be thrown away with no effect, is less well studied. Jason Erbele and I gave it this name in our work on control theory:

• John Baez and Jason Erbele, Categories in control. (Blog article here.)
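The ‘extra’ law can be seen concretely by composing corelations stored as partitions. The composition rule is not spelled out above, so the sketch below assumes the usual one: glue the parts of the two partitions wherever they share a point of the middle set, then forget the middle points. The function name `compose_corelations` and the tagged-pair representation are likewise choices made for this example.

```python
def compose_corelations(a, b):
    """Compose a (a partition of X + Y) with b (a partition of Y + Z).

    Points are tagged pairs such as ("X", 1). Parts of a and b are glued
    wherever they share a Y-point; then the Y-points are forgotten.
    """
    merged = []  # a list of disjoint sets, coarsened as we go
    for part in list(a) + list(b):
        part = set(part)
        for m in [m for m in merged if m & part]:
            part |= m
            merged.remove(m)
        merged.append(part)
    result = set()
    for m in merged:
        outer = frozenset(pt for pt in m if pt[0] != "Y")
        if outer:  # a part containing only Y-points is a wire in mid-air
            result.add(outer)
    return result

# One input wired to the first of three middle points; the other two middle
# points are wired to each other but to nothing else.
a = {frozenset({("X", 1), ("Y", 1)}), frozenset({("Y", 2), ("Y", 3)})}
b = {frozenset({("Y", 1), ("Z", 1)}),
     frozenset({("Y", 2)}), frozenset({("Y", 3)})}

# The same circuit with the stray piece of wire removed.
a_plain = {frozenset({("X", 1), ("Y", 1)})}
b_plain = {frozenset({("Y", 1), ("Z", 1)})}

print(compose_corelations(a, b) == compose_corelations(a_plain, b_plain))
```

Both composites are the corelation that simply wires the input to the output: the piece of wire joining the second and third middle points touches neither $X$ nor $Z,$ so it drops out.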

For more

David Ellerman has spent a lot of time studying what would happen to mathematics if we turned around a lot of arrows in a certain systematic way. In particular, just as the concept of relation would be replaced by the concept of corelation, the concept of subset would be replaced by the concept of partition. You can see how it fits together: just as a relation from $X$ to $Y$ is a subset of $X \times Y,$ a corelation from $X$ to $Y$ is a partition of $X + Y.$

There’s a lattice of subsets of a set:

In logic these subsets correspond to propositions, and the lattice operations are the logical operations ‘and’ and ‘or’. But there’s also a lattice of partitions of a set:

In Ellerman’s vision, this lattice of partitions gives a new kind of logic. You can read about it here:

• David Ellerman, Introduction to partition logic, Logic Journal of the Interest Group in Pure and Applied Logic 22 (2014), 94–125.

As mentioned, the main result in Brandon and Brendan’s paper is that $\mathrm{FinCorel}$ is equivalent to the PROP for extraspecial commutative Frobenius monoids. After they proved this, they noticed that the result has also been stated in other language and proved in other ways in two other works:

• Fabio Zanasi, Interacting Hopf Algebras—the Theory of Linear Systems, PhD thesis, École Normale Supérieure de Lyon, 2015.

• K. Došen and Z. Petrić, Syntax for split preorders, Annals of Pure and Applied Logic 164 (2013), 443–481.

Unsurprisingly, I prefer Brendan and Brandon’s approach to deriving the result. But it’s nice to see different perspectives!

Network Theory (Part 33)

4 November, 2014

Last time I came close to describing the ‘black box functor’, which takes an electrical circuit made of resistors

and sends it to its behavior as viewed from outside. From outside, all you can see is the relation between currents and potentials at the ‘terminals’—the little bits of wire that poke out of the black box:

I came close to defining the black box functor, but I didn’t quite make it! This time let’s finish the job.

The categories in question

The black box functor

$\blacksquare : \mathrm{ResCirc} \to \mathrm{LinRel}$

goes from the category $\mathrm{ResCirc},$ where morphisms are circuits made of resistors, to the category $\mathrm{LinRel},$ where morphisms are linear relations. Let me remind you how these categories work, and introduce a bit of new notation.

Here is the category $\mathrm{ResCirc}:$

• an object is a finite set;

• a morphism from $X$ to $Y$ is an isomorphism class of cospans

in the category of graphs with edges labelled by resistances: numbers in $(0,\infty).$ Here we think of the finite sets $X$ and $Y$ as graphs with no edges. We call $X$ the set of inputs and $Y$ the set of outputs.

• we compose morphisms in $\mathrm{ResCirc}$ by composing isomorphism classes of cospans.

And here is the category $\mathrm{LinRel}:$

• an object is a finite-dimensional real vector space;

• a morphism from $U$ to $V$ is a linear relation $R : U \leadsto V,$ meaning a linear subspace $R \subseteq U \times V;$

• we compose a linear relation $R \subseteq U \times V$ and a linear relation $S \subseteq V \times W$ in the usual way we compose relations, getting:

$SR = \{(u,w) \in U \times W : \; \exists v \in V \; (u,v) \in R \mathrm{\; and \;} (v,w) \in S \}$

In case you’re wondering: I’ve just introduced the wiggly arrow notation

$R : U \leadsto V$

for a linear relation from $U$ to $V,$ because it suggests that a relation is a bit like a function but more general. Indeed, a function is a special case of a relation, and composing functions is a special case of composing relations.
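As a small worked instance of both claims, with relations chosen here purely for illustration, take $U = V = W = \mathbb{R}$ and let $R$ be the graph of the function $u \mapsto 2u:$

$R = \{(u, 2u) : \; u \in \mathbb{R}\}$

If $S$ is the graph of $v \mapsto 3v,$ the composite is the graph of the composite function:

$SR = \{(u,w) : \; \exists v \; v = 2u \mathrm{\; and \;} w = 3v \} = \{(u, 6u) : \; u \in \mathbb{R}\}$

If instead we take $S' = \{(0,w) : \; w \in \mathbb{R}\},$ which is a linear subspace of $V \times W$ but not a function, then

$S'R = \{(u,w) : \; 2u = 0 \} = \{(0,w) : \; w \in \mathbb{R}\}$

which is again a linear relation but not a function.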

The black box functor

Now, how do we define the black box functor?

Defining it on objects is easy. An object of $\mathrm{ResCirc}$ is a finite set $S,$ and we define

$\blacksquare{S} = \mathbb{R}^S \times \mathbb{R}^S$

The idea is that $S$ could be a set of inputs or outputs, and then

$(\phi, I) \in \mathbb{R}^S \times \mathbb{R}^S$

is a list of numbers: the potentials and currents at those inputs or outputs.

So, the interesting part is defining the black box functor on morphisms!

For this we start with a morphism in $\mathrm{ResCirc}$:

The labelled graph $\Gamma$ consists of:

• a set $N$ of nodes,

• a set $E$ of edges,

• maps $s, t : E \to N$ sending each edge to its source and target,

• a map $r : E \to (0,\infty)$ sending each edge to its resistance.

The cospan gives maps

$i: X \to N, \qquad o: Y \to N$

These say how the inputs and outputs are interpreted as nodes in the circuit. We’ll call the nodes that come from inputs or outputs ‘terminals’. So, mathematically,

$T = \mathrm{im}(i) \cup \mathrm{im}(o) \subseteq N$

is the set of terminals: the union of the images of $i$ and $o.$
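The data of such a morphism fits in a small record. Here is a minimal sketch, assuming nodes, edges, inputs and outputs are named by strings and each map is stored as a dictionary; the class name `Circuit` and its field names are invented for this illustration. It computes the set of terminals $T$ for a circuit whose output map $o$ is not one-to-one.

```python
from dataclasses import dataclass

@dataclass
class Circuit:
    """A cospan X -> N <- Y over a graph with resistances on its edges."""
    nodes: set         # N
    source: dict       # s : E -> N
    target: dict       # t : E -> N
    resistance: dict   # r : E -> (0, infinity)
    inputs: dict       # i : X -> N
    outputs: dict      # o : Y -> N

    def __post_init__(self):
        edges = set(self.resistance)
        if set(self.source) != edges or set(self.target) != edges:
            raise ValueError("s, t and r must be defined on the same edges")
        endpoints = list(self.source.values()) + list(self.target.values())
        attached = list(self.inputs.values()) + list(self.outputs.values())
        if not all(n in self.nodes for n in endpoints + attached):
            raise ValueError("every map must land in the node set N")
        if not all(r > 0 for r in self.resistance.values()):
            raise ValueError("resistances must be positive")

    def terminals(self):
        """T = im(i) union im(o), a subset of N."""
        return set(self.inputs.values()) | set(self.outputs.values())

# Two resistors in series: a --2--> b --3--> c.
circuit = Circuit(
    nodes={"a", "b", "c"},
    source={"e1": "a", "e2": "b"},
    target={"e1": "b", "e2": "c"},
    resistance={"e1": 2.0, "e2": 3.0},
    inputs={"x1": "a"},
    outputs={"y1": "c", "y2": "c"},   # o is not one-to-one here
)
print(sorted(circuit.terminals()))
```

Here the node `b` is internal, and the single terminal `c` comes from two different outputs, which is one of the cases outside the simplest picture.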

In the simplest case, the maps $i$ and $o$ are one-to-one, with disjoint ranges. Then each terminal either comes from a single input, or a single output, but not both! This is a good picture to keep in mind. But some subtleties arise when we leave this simplest case and consider other cases.

Now, the black box functor is supposed to send our circuit to a linear relation. I’ll call the circuit $\Gamma$ for short, though it’s really the whole cospan

So, our black box functor is supposed to send this circuit to a linear relation

$\blacksquare(\Gamma) : \mathbb{R}^X \times \mathbb{R}^X \leadsto \mathbb{R}^Y \times \mathbb{R}^Y$

This is a relation between the potentials and currents at the input terminals and the potentials and currents at the output terminals! How is it defined?

I’ll start by outlining how this works.

First, our circuit picks out a subspace

$dQ \subseteq \mathbb{R}^T \times \mathbb{R}^T$

This is the subspace of allowed potentials and currents on the terminals. I’ll explain this and why it’s called $dQ$ a bit later. Briefly, it comes from the principle of minimum power, described last time.

Then, the map

$i: X \to T$

gives a linear relation

$S(i) : \mathbb{R}^X \times \mathbb{R}^X \leadsto \mathbb{R}^T \times \mathbb{R}^T$

This says how the potentials and currents at the inputs are related to those at the terminals. Similarly, the map

$o: Y \to T$

gives a linear relation

$S(o) : \mathbb{R}^Y \times \mathbb{R}^Y \leadsto \mathbb{R}^T \times \mathbb{R}^T$

This says how the potentials and currents at the outputs are related to those at the terminals.

Next, we can ‘turn around’ any linear relation

$R : \mathbb{R}^Y \times \mathbb{R}^Y \leadsto \mathbb{R}^T \times \mathbb{R}^T$

to get a relation

$R^\dagger : \mathbb{R}^T \times \mathbb{R}^T \leadsto \mathbb{R}^Y \times \mathbb{R}^Y$

defined by

$R^\dagger = \{(\phi',-I',\phi,-I) : (\phi, I, \phi', I') \in R \}$

Here we are just switching the input and output potentials, but when we switch the currents we also throw in a minus sign. The reason is that we care about the current flowing in to an input, but out of an output.

Finally, one more trick: given a linear subspace

$L \subseteq V$

of a vector space $V$ we get a linear relation

$1|_L : V \leadsto V$

called the identity restricted to $L$, defined like this:

$1|_L = \{ (v, v) :\; v \in L \} \subseteq V \times V$

If $L$ is all of $V$ this relation is actually the identity function on $V.$ Otherwise it’s a partially defined function that’s defined only on $L,$ and is the identity there. (A partially defined function is an example of a relation.) My notation $1|_L$ is probably bad, but I don’t know a better one, so bear with me.

Let’s use all these ideas to define

$\blacksquare(\Gamma) : \mathbb{R}^X \times \mathbb{R}^X \leadsto \mathbb{R}^Y \times \mathbb{R}^Y$

To do this, we compose three linear relations:

1) We start with

$S(i) : \mathbb{R}^X \times \mathbb{R}^X \leadsto \mathbb{R}^T \times \mathbb{R}^T$

2) We compose this with

$1|_{dQ} : \mathbb{R}^T \times \mathbb{R}^T \leadsto \mathbb{R}^T \times \mathbb{R}^T$

3) Then we compose this with

$S(o)^\dagger : \mathbb{R}^T \times \mathbb{R}^T \leadsto \mathbb{R}^Y \times \mathbb{R}^Y$

Note that:

1) says how the potentials and currents at the inputs are related to those at the terminals,

2) picks out which potentials and currents at the terminals are actually allowed, and

3) says how the potentials and currents at the terminals are related to those at the outputs.

So, I hope this all makes sense, at least in some rough way. In brief, here’s the formula:

$\blacksquare(\Gamma) = S(o)^\dagger \; 1|_{dQ} \; S(i)$

Now I just need to fill in some details. First, how do we define $S(i)$ and $S(o)?$ They work exactly the same way, by ‘copying potentials and adding currents’, so I’ll just talk about one. Second, how do we define the subspace $dQ?$ This uses the principle of minimum power.

Any function between finite sets

$i: X \to T$

gives a linear map

$i^* : \mathbb{R}^T \to \mathbb{R}^X$

Mathematicians call this linear map the pullback along $i,$ and for any $\phi \in \mathbb{R}^T$ it’s defined by

$i^*(\phi)(x) = \phi(i(x))$

In our application, we think of $\phi$ as a list of potentials at terminals. The function $i$ could map a bunch of inputs to the same terminal, and the above formula says the potential at this terminal gives the potential at all those inputs. So, we are copying potentials.

We also get a linear map going the other way:

$i_* : \mathbb{R}^X \to \mathbb{R}^T$

Mathematicians call this the pushforward along $i,$ and for any $I \in \mathbb{R}^X$ it’s defined by

$\displaystyle{ i_*(I)(t) = \sum_{x \; : \; i(x) = t } I(x) }$

In our application, we think of $I$ as a list of currents entering at some inputs. The function $i$ could map a bunch of inputs to the same terminal, and the above formula says the current at this terminal is the sum of the currents at all those inputs. So, we are adding currents.

Putting these together, our map

$i : X \to T$

gives a linear relation

$S(i) : \mathbb{R}^X \times \mathbb{R}^X \leadsto \mathbb{R}^T \times \mathbb{R}^T$

where the pair $(\phi, I) \in \mathbb{R}^X \times \mathbb{R}^X$ is related to the pair $(\phi', I') \in \mathbb{R}^T \times \mathbb{R}^T$ iff

$\phi = i^*(\phi')$

and

$I' = i_*(I)$

So, here’s the rule of thumb when attaching the points of $X$ to the input terminals of our circuit: copy potentials, but add up currents. More formally:

$\begin{array}{ccl} S(i) &=& \{ (\phi, I, \phi', I') : \; \phi = i^*(\phi') , \; I' = i_*(I) \} \\ \\ &\subseteq& \mathbb{R}^X \times \mathbb{R}^X \times \mathbb{R}^T \times \mathbb{R}^T \end{array}$
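
To make ‘copy potentials, add currents’ concrete, here is a minimal Python sketch. The representation is an assumption made purely for illustration: the function $i$ is a dictionary from inputs to terminals, and elements of $\mathbb{R}^X$ and $\mathbb{R}^T$ are dictionaries of floats. The names `pullback`, `pushforward` and `related_by_S` are hypothetical, not taken from any library or from our paper.

```python
def pullback(i, phi_T):
    """Copy potentials: (i^* phi)(x) = phi(i(x))."""
    return {x: phi_T[t] for x, t in i.items()}


def pushforward(i, I_X, T):
    """Add currents: (i_* I)(t) = sum of I(x) over all x with i(x) = t."""
    out = {t: 0.0 for t in T}
    for x, t in i.items():
        out[t] += I_X[x]
    return out


def related_by_S(i, T, phi, I, phi_prime, I_prime):
    """True iff (phi, I) on X is related by S(i) to (phi_prime, I_prime) on T."""
    return phi == pullback(i, phi_prime) and I_prime == pushforward(i, I, T)


# Hypothetical example: inputs x1 and x2 are both attached to terminal a.
T = ["a", "b"]
i = {"x1": "a", "x2": "a", "x3": "b"}
phi_T = {"a": 5.0, "b": 2.0}               # potentials at the terminals
I_X = {"x1": 1.0, "x2": 0.5, "x3": -1.5}   # currents at the inputs

print(pullback(i, phi_T))       # x1 and x2 both get the potential at a
print(pushforward(i, I_X, T))   # the currents at x1 and x2 add up at a
```

Note that the two maps go in opposite directions, which is why $S(i)$ is a relation rather than a function: potentials are determined by the terminals, while currents are determined by the inputs.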

The principle of minimum power

Finally, how does our circuit define a subspace

$dQ \subseteq \mathbb{R}^T \times \mathbb{R}^T$

of allowed potential-current pairs at the terminals? The trick is to use the ideas we discussed last time. If we know the potential at all nodes of our circuit, say $\phi \in \mathbb{R}^N$, we know the power used by the circuit:

$P(\phi) = \displaystyle{ \sum_{e \in E} \frac{1}{r_e} \big(\phi(s(e)) - \phi(t(e))\big)^2 }$

We saw last time that if we fix the potentials at the terminals, the circuit will choose potentials at the other nodes to minimize this power. We can describe the potential at the terminals by

$\psi \in \mathbb{R}^T$

So, the power for a given potential at the terminals is

$Q(\psi) = \displaystyle{ \frac{1}{2} \min_{\phi \in \mathbb{R}^N \; : \; \phi|_T = \psi} \sum_{e \in E} \frac{1}{r_e} \big(\phi(s(e)) - \phi(t(e))\big)^2 }$

Actually this is half the power: I stuck in a factor of 1/2 for a reason we’ll soon see. This $Q$ is a quadratic function

$Q : \mathbb{R}^T \to \mathbb{R}$

so its derivative is linear. And, our work last time showed something interesting: to compute the current $J_x$ flowing into a terminal $x \in T,$ we just differentiate $Q$ with respect to the potential at that terminal:

$\displaystyle{ J_x = \frac{\partial Q(\psi)}{\partial \psi_x} }$

This is the reason for the 1/2: when we take the derivative of $Q,$ we bring down a 2 from differentiating all those squares, and to make that go away we need a 1/2.

The space of allowed potential-current pairs at the terminals is thus the linear subspace

$dQ = \{ (\psi, J) : \; \displaystyle{ J_x = \frac{\partial Q(\psi)}{\partial \psi_x} \} \subseteq \mathbb{R}^T \times \mathbb{R}^T }$
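
For example, here is a sketch of the simplest case, assuming a circuit with two nodes, both of them terminals, so $N = T = \{1, 2\}$, and a single edge from node 1 to node 2 with resistance $r.$ There are no other nodes, so there is nothing to minimize over, and

$Q(\psi) = \displaystyle{ \frac{1}{2r} (\psi_1 - \psi_2)^2 }$

Differentiating gives

$\displaystyle{ J_1 = \frac{\psi_1 - \psi_2}{r}, \qquad J_2 = -\frac{\psi_1 - \psi_2}{r} }$

so in this case

$dQ = \{ (\psi_1, \psi_2, J_1, J_2) : \; r J_1 = \psi_1 - \psi_2, \; J_2 = -J_1 \}$

which is a 2-dimensional subspace of the 4-dimensional space $\mathbb{R}^T \times \mathbb{R}^T.$ The first equation is Ohm’s law, and the second says that the current flowing in at one terminal flows out at the other.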

And this completes our precise description of the black box functor!

The hard part is this:

Theorem. $\blacksquare : \mathrm{ResCirc} \to \mathrm{LinRel}$ is a functor.

In other words, we have to prove that it preserves composition:

$\blacksquare(fg) = \blacksquare(f) \blacksquare(g)$

For the proof, read our paper:

• John Baez and Brendan Fong, A compositional framework for passive linear networks.

Network Theory (Part 32)

20 October, 2014

Okay, today we will look at the ‘black box functor’ for circuits made of resistors. Very roughly, this takes a circuit made of resistors with some inputs and outputs:

and puts a ‘black box’ around it:

forgetting the internal details of the circuit and remembering only how it behaves as viewed from outside. As viewed from outside, all the circuit does is define a relation between the potentials and currents at the inputs and outputs. We call this relation the circuit’s behavior. Lots of different choices of the resistances $R_1, \dots, R_6$ would give the same behavior. In fact, we could even replace the whole fancy circuit by a single edge with a single resistor on it, and get a circuit with the same behavior!

The idea is that when we use a circuit to do something, all we care about is its behavior: what it does as viewed from outside, not what it’s made of.

Furthermore, we’d like the behavior of a system made of parts to depend in a simple way on the external behaviors of its parts. We don’t want to have to ‘peek inside’ the parts to figure out what the whole will do! Of course, in some situations we do need to peek inside the parts to see what the whole will do. But in this particular case we don’t—at least in the idealization we are considering. And this fact is described mathematically by saying that black boxing is a functor.

So, how do circuits made of resistors behave? To answer this we first need to remember what they are!

Review

Remember that for us, a circuit made of resistors is a mathematical structure like this:

It’s a cospan where:

$\Gamma$ is a graph labelled by resistances. So, it consists of a finite set $N$ of nodes, a finite set $E$ of edges, two functions

$s, t : E \to N$

sending each edge to its source and target nodes, and a function

$r : E \to (0,\infty)$

that labels each edge with its resistance.

$i: I \to \Gamma$ is a map of graphs labelled by resistances, where $I$ has no edges. A labelled graph with no edges has nothing but nodes! So, the map $i$ is just a trick for specifying a finite set of nodes called inputs and mapping them to $N.$ Thus $i$ picks out some nodes of $\Gamma$ and declares them to be inputs. (However, $i$ may not be one-to-one! We’ll take advantage of that subtlety later.)

$o: O \to \Gamma$ is another map of graphs labelled by resistances, where $O$ again has no edges, and we call its nodes outputs.

The principle of minimum power

So what does a circuit made of resistors do? This is described by the principle of minimum power.

Recall from Part 27 that when we put it to work, our circuit has a current $I_e$ flowing along each edge $e \in E.$ This is described by a function

$I: E \to \mathbb{R}$

It also has a voltage across each edge. The word ‘across’ is standard here, but don’t worry about it too much; what matters is that we have another function

$V: E \to \mathbb{R}$

describing the voltage $V_e$ across each edge $e.$

Resistors heat up when current flows through them, so they eat up electrical power and turn this power into heat. How much? The power is given by

$\displaystyle{ P = \sum_{e \in E} I_e V_e }$

So far, so good. But what does it mean to minimize power?

To understand this, we need to manipulate the formula for power using the laws of electrical circuits described in Part 27. First, Ohm’s law says that for linear resistors, the current is proportional to the voltage. More precisely, for each edge $e \in E,$

$\displaystyle{ I_e = \frac{V_e}{r_e} }$

where $r_e$ is the resistance of that edge. So, the bigger the resistance, the less current flows: that makes sense. Using Ohm’s law we get

$\displaystyle{ P = \sum_{e \in E} \frac{V_e^2}{r_e} }$

Now we see that power is always nonnegative! Now it makes more sense to minimize it. Of course we could minimize it simply by setting all the voltages equal to zero. That would work, but that would be boring: it gives a circuit with no current flowing through it. The fun starts when we minimize power subject to some constraints.

For this we need to remember another law of electrical circuits: a spinoff of Kirchhoff’s voltage law. This says that we can find a function called the potential

$\phi: N \to \mathbb{R}$

such that

$V_e = \phi_{s(e)} - \phi_{t(e)}$

for each $e \in E.$ In other words, the voltage across each edge is the difference of potentials at the two ends of this edge.

Using this, we can rewrite the power as

$\displaystyle{ P = \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)})^2 }$
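
Here is a minimal Python sketch of this formula, just to fix ideas. The data layout is an assumption for illustration: each edge is a triple (source, target, resistance) and the potential is a dictionary from nodes to floats. The function name `power` and the node names are hypothetical.

```python
def power(phi, edges):
    """P = sum over edges e of (phi[s(e)] - phi[t(e)])**2 / r_e."""
    return sum((phi[s] - phi[t]) ** 2 / r for s, t, r in edges)


# Hypothetical circuit: two resistors in series, n1 --2 ohm--> n2 --4 ohm--> n3.
edges = [("n1", "n2", 2.0), ("n2", "n3", 4.0)]
phi = {"n1": 6.0, "n2": 4.0, "n3": 0.0}

print(power(phi, edges))  # (6 - 4)^2 / 2 + (4 - 0)^2 / 4 = 6.0

# Only differences of potentials matter: shifting every potential by the
# same constant leaves the power unchanged.
shifted = {n: v + 10.0 for n, v in phi.items()}
print(power(shifted, edges))  # 6.0 again
```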

Now we’re really ready to minimize power! Our circuit made of resistors has certain nodes called terminals:

$T \subseteq N$

These are the nodes that are either inputs or outputs. More precisely, they’re the nodes in the image of

$i: I \to \Gamma$

or

$o: O \to \Gamma$

The principle of minimum power says that:

If we fix the potential $\phi$ on all terminals, the potential at other nodes will minimize the power

$\displaystyle{ P(\phi) = \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)})^2 }$

subject to this constraint.

This should remind you of all the other minimum or maximum principles you know, like the principle of least action, or the way a system in thermodynamic equilibrium maximizes its entropy. All these principles—or at least, most of them—are connected. I could talk about this endlessly. But not now!

Now let’s just use the principle of minimum power. Let’s see what it tells us about the behavior of an electrical circuit.

Let’s imagine changing the potential $\phi$ by adding some multiple of a function

$\psi: N \to \mathbb{R}$

If this other function vanishes at the terminals:

$\forall n \in T \; \; \psi(n) = 0$

then $\phi + x \psi$ doesn’t change at the terminals as we change the number $x.$

Now suppose $\phi$ obeys the principle of minimum power. In other words, suppose it minimizes power subject to the constraint of taking the values it does at the terminals. Then we must have

$\displaystyle{ \frac{d}{d x} P(\phi + x \psi)\Big|_{x = 0} = 0 }$

whenever

$\forall n \in T \; \; \psi(n) = 0$

This is just the first derivative test for a minimum. But the converse is true, too! The reason is that our power function is a sum of nonnegative quadratic terms. Its graph will look like a paraboloid. So, the power has no points where its derivative vanishes except minima, even when we constrain $\phi$ by making it lie on a linear subspace.

We can go ahead and start working out the derivative:

$\displaystyle{ \frac{d}{d x} P(\phi + x \psi) = \frac{d}{d x} \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)} + x(\psi_{s(e)} -\psi_{t(e)}))^2 }$

To work out the derivative of these quadratic terms at $x = 0,$ we only need to keep the part that’s proportional to $x.$ The rest gives zero. So:

$\begin{array}{ccl} \displaystyle{ \frac{d}{d x} P(\phi + x \psi)\Big|_{x = 0} } &=& \displaystyle{ \frac{d}{d x} \sum_{e \in E} \frac{2x}{r_e} (\phi_{s(e)} - \phi_{t(e)}) (\psi_{s(e)} - \psi_{t(e)}) \Big|_{x = 0} } \\ \\ &=& \displaystyle{ 2 \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) (\psi_{s(e)} - \psi_{t(e)}) } \end{array}$

The principle of minimum power says this is zero whenever $\psi : N \to \mathbb{R}$ is a function that vanishes at terminals. By linearity, it’s enough to consider functions $\psi$ that are zero at every node except one node $n$ that is not a terminal. By linearity we can also assume $\psi(n) = 1.$

Given this, the only nonzero terms in the sum

$\displaystyle{ \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) (\psi_{s(e)} - \psi_{t(e)}) }$

will be those involving edges whose source or target is $n.$ We get

$\begin{array}{ccl} \displaystyle{ \frac{d}{d x} P(\phi + x \psi)\Big|_{x = 0} } &=& \displaystyle{ 2 \sum_{e: \; s(e) = n} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)})} \\ \\ && -\displaystyle{ 2 \sum_{e: \; t(e) = n} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) } \end{array}$

So, the principle of minimum power says precisely

$\displaystyle{ \sum_{e: \; s(e) = n} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) = \sum_{e: \; t(e) = n} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) }$

for all nodes $n$ that aren’t terminals.

What does this mean? You could just say it’s a set of linear equations that must be obeyed by the potential $\phi.$ So, the principle of minimum power says that fixing the potential at terminals, the potential at other nodes must be chosen in a way that obeys a set of linear equations.

But what do these equations mean? They have a nice meaning. Remember, Kirchhoff’s voltage law says

$V_e = \phi_{s(e)} - \phi_{t(e)}$

and Ohm’s law says

$\displaystyle{ I_e = \frac{V_e}{r_e} }$

Putting these together,

$\displaystyle{ I_e = \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) }$

so the principle of minimum power merely says that

$\displaystyle{ \sum_{e: \; s(e) = n} I_e = \sum_{e: \; t(e) = n} I_e }$

for any node $n$ that is not a terminal.

This is Kirchhoff’s current law: for any node except a terminal, the total current flowing into that node must equal the total current flowing out! That makes a lot of sense. We allow current to flow in or out of our circuit at terminals, but ‘inside’ the circuit charge is conserved, so if current flows into some other node, an equal amount has to flow out.
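
We can check this numerically on a small example. The sketch below is an illustration under stated assumptions, not the method of our paper: it minimizes the power by Gauss-Seidel relaxation, a standard iterative technique swapped in here for simplicity, and then tests Kirchhoff’s current law at the nodes that aren’t terminals. Edges are triples (source, target, resistance), and all function and node names are hypothetical.

```python
def power(phi, edges):
    """P = sum over edges of (phi[s] - phi[t])**2 / r."""
    return sum((phi[s] - phi[t]) ** 2 / r for s, t, r in edges)


def minimize_power(nodes, edges, terminal_phi, sweeps=200):
    """Hold terminal potentials fixed and relax the others (Gauss-Seidel).

    Each step sets one non-terminal potential to the conductance-weighted
    average of its neighbours, which minimizes P in that single variable.
    """
    phi = {n: terminal_phi.get(n, 0.0) for n in nodes}
    for _ in range(sweeps):
        for n in nodes:
            if n in terminal_phi:
                continue
            num = den = 0.0
            for s, t, r in edges:
                if s == t:
                    continue  # a self-loop carries no current
                if s == n:
                    other = t
                elif t == n:
                    other = s
                else:
                    continue
                num += phi[other] / r
                den += 1.0 / r
            if den > 0:
                phi[n] = num / den
    return phi


def net_current(n, phi, edges):
    """Sum of I_e over edges with source n, minus the sum over edges with target n."""
    total = 0.0
    for s, t, r in edges:
        current = (phi[s] - phi[t]) / r  # Ohm's law
        if s == n:
            total += current
        if t == n:
            total -= current
    return total


# Hypothetical circuit: three 1-ohm resistors in a chain t1 -> a -> b -> t2.
nodes = ["t1", "a", "b", "t2"]
edges = [("t1", "a", 1.0), ("a", "b", 1.0), ("b", "t2", 1.0)]
phi = minimize_power(nodes, edges, {"t1": 3.0, "t2": 0.0})

for n in nodes:
    print(n, round(phi[n], 6), round(net_current(n, phi, edges), 6))
```

In this example the net current vanishes at the nodes `a` and `b`, but not at the terminals `t1` and `t2`, where current enters and leaves the circuit.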

In short: the principle of minimum power implies Kirchhoff’s current law! Conversely, we can run the whole argument backward and derive the principle of minimum power from Kirchhoff’s current law. (In both the forwards and backwards versions of this argument, we use Kirchhoff’s voltage law and Ohm’s law.)

When the node $n$ is a terminal, the quantity

$\displaystyle{ \sum_{e: \; s(e) = n} I_e \; - \; \sum_{e: \; t(e) = n} I_e }$

need not be zero. But it has an important meaning: it’s the amount of current flowing into that terminal!

We’ll call this $I_n,$ the current at the terminal $n \in T.$ This is something we can measure even when our circuit has a black box around it:

So is the potential $\phi_n$ at the terminal $n.$ It’s these currents and potentials at terminals that matter when we try to describe the behavior of a circuit while ignoring its inner workings.

Black boxing

Now let me quickly sketch how black boxing becomes a functor.

A circuit made of resistors gives a linear relation between the potentials and currents at terminals. A relation is something that can hold or fail to hold. A ‘linear’ relation is one defined using linear equations.

A bit more precisely, suppose we choose potentials and currents at the terminals:

$\psi : T \to \mathbb{R}$

$J : T \to \mathbb{R}$

Then we seek potentials and currents at all the nodes and edges of our circuit:

$\phi: N \to \mathbb{R}$

$I : E \to \mathbb{R}$

that are compatible with our choice of $\psi$ and $J.$ Here compatible means that

$\psi_n = \phi_n$

and

$J_n = \displaystyle{ \sum_{e: \; s(e) = n} I_e \; - \; \sum_{e: \; t(e) = n} I_e }$

whenever $n \in T,$ but also

$\displaystyle{ I_e = \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) }$

for every $e \in E,$ and

$\displaystyle{ \sum_{e: \; s(e) = n} I_e \; = \; \sum_{e: \; t(e) = n} I_e }$

whenever $n \in N - T.$ (The last two equations combine Kirchhoff’s laws and Ohm’s law.)

There either exist $I$ and $\phi$ making all these equations true, in which case we say our potentials and currents at the terminals obey the relation… or they don’t exist, in which case we say the potentials and currents at the terminals don’t obey the relation.
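
As a small worked example, under assumptions chosen just for illustration, take three nodes $1, 2, 3$ with terminals $T = \{1, 3\},$ an edge $a$ from 1 to 2 with resistance $r_a,$ and an edge $b$ from 2 to 3 with resistance $r_b.$ The equations above become

$\displaystyle{ I_a = \frac{\phi_1 - \phi_2}{r_a}, \qquad I_b = \frac{\phi_2 - \phi_3}{r_b}, \qquad I_b = I_a }$

together with

$\psi_1 = \phi_1, \qquad \psi_3 = \phi_3, \qquad J_1 = I_a, \qquad J_3 = -I_b$

Eliminating $\phi_2,$ $I_a$ and $I_b,$ we see that a solution exists if and only if

$\displaystyle{ J_1 = \frac{\psi_1 - \psi_3}{r_a + r_b}, \qquad J_3 = -J_1 }$

So the relation is the same one we would get from a single edge with resistance $r_a + r_b$: two different circuits, one behavior.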

The relation is clearly linear, since it’s defined by a bunch of linear equations. With a little work, we can make it into a linear relation between potentials and currents in

$\mathbb{R}^I \oplus \mathbb{R}^I$

and potentials and currents in

$\mathbb{R}^O \oplus \mathbb{R}^O$

Remember, $I$ is our set of inputs and $O$ is our set of outputs.

In fact, this process of getting a linear relation from a circuit made of resistors defines a functor:

$\blacksquare : \mathrm{ResCirc} \to \mathrm{LinRel}$

Here $\mathrm{ResCirc}$ is the category where morphisms are circuits made of resistors, while $\mathrm{LinRel}$ is the category where morphisms are linear relations.

More precisely, here is the category $\mathrm{ResCirc}:$

• an object of $\mathrm{ResCirc}$ is a finite set;

• a morphism from $I$ to $O$ is an isomorphism class of circuits made of resistors:

having $I$ as its set of inputs and $O$ as its set of outputs;

• we compose morphisms in $\mathrm{ResCirc}$ by composing isomorphism classes of cospans.

(Remember, circuits made of resistors are cospans. This lets us talk about isomorphisms between them. If you forget how isomorphisms between cospans work, you can review it in Part 31.)

And here is the category $\mathrm{LinRel}:$

• an object of $\mathrm{LinRel}$ is a finite-dimensional real vector space;

• a morphism from $U$ to $V$ is a linear relation $R \subseteq U \times V,$ meaning a linear subspace of the vector space $U \times V;$

• we compose a linear relation $R \subseteq U \times V$ and a linear relation $S \subseteq V \times W$ in the usual way we compose relations, getting:

$SR = \{(u,w) \in U \times W : \; \exists v \in V \; (u,v) \in R \mathrm{\; and \;} (v,w) \in S \}$
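
The composition rule is easiest to see on finite relations. The following Python sketch is only an illustration under that simplifying assumption: it composes relations given as finite sets of pairs, not linear subspaces, which would need some linear algebra. The function name `compose` is hypothetical.

```python
def compose(S, R):
    """SR = {(u, w) : there is some v with (u, v) in R and (v, w) in S}."""
    return {(u, w) for (u, v) in R for (v2, w) in S if v == v2}


# Hypothetical finite relations R from U to V and S from V to W.
R = {(1, "a"), (2, "a"), (3, "b")}
S = {("a", "x"), ("c", "y")}

# 1 and 2 reach x through a; 3 is related to b, but b is related to nothing.
print(compose(S, R))

# The identity relation on V composes trivially with R.
identity_V = {("a", "a"), ("b", "b"), ("c", "c")}
print(compose(identity_V, R) == R)
```

As in the formula, we apply $R$ first and then $S,$ so the composite is written $SR.$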

Next steps

So far I’ve set up most of the necessary background but not precisely defined the black boxing functor

$\blacksquare : \mathrm{ResCirc} \to \mathrm{LinRel}$

There are some nuances I’ve glossed over, like the difference between inputs and outputs as elements of $I$ and $O$ and their images in $N.$ If you want to see the precise definition and the proof that it’s a functor, read our paper:

• John Baez and Brendan Fong, A compositional framework for passive linear networks.

The proof is fairly long: there may be a much quicker one, but at least this one has the virtue of introducing a lot of nice ideas that will be useful elsewhere.

Next time I’ll define the black box functor more carefully.

Network Theory Seminar (Part 2)

16 October, 2014

This time I explain more about how ‘cospans’ represent gadgets with two ends, an input end and an output end:

I describe how to glue such gadgets together by composing cospans. We compose cospans using a category-theoretic construction called a ‘pushout’, so I also explain pushouts. At the end, I explain how this gives us a category where the morphisms are electrical circuits made of resistors, and sketch what we’ll do next: study the behavior of these circuits.

These lecture notes provide extra details:

Network Theory (Part 31)

13 October, 2014

Last time we came up with a category of labelled graphs and described circuits as ‘cospans’ in this category.

Cospans may sound scary, but they’re not. A cospan is just a diagram consisting of an object with two morphisms going into it:

We can talk about cospans in any category. A cospan is an abstract way of thinking about a ‘chunk of stuff’ $\Gamma$ with two ‘ends’ $I$ and $O.$ It could be any sort of stuff: a set, a graph, an electrical circuit, a network of any kind, or even a piece of matter (in some mathematical theory of matter).

We call the object $\Gamma$ the apex of the cospan and call the morphisms $i: I \to \Gamma, o : O \to \Gamma$ the legs of the cospan. We sometimes call the objects $I$ and $O$ the feet of the cospan. We call $I$ the input and $O$ the output. We say the cospan goes from $I$ to $O,$ though the direction is just a convention: we can flip a cospan and get a cospan going the other way!

If you’re wondering about the name ‘cospan’, it’s because a span is a diagram like this:

Since a ‘span’ is another name for a bridge, and this looks like a bridge from $I$ to $O,$ category theorists called it a span! And category theorists use the prefix ‘co-’ when they turn all the arrows around. Spans came first historically, and we will use those too at times. But now let’s think about how to compose cospans.

Composing cospans is supposed to be like gluing together chunks of stuff by attaching the output of the first to the input of the second. So, we say two cospans are composable if the output of the first equals the input of the second, like this:

We then compose them by forming a new cospan going all the way from $X$ to $Z$:

The new object $\Gamma +_Y \Gamma'$ and the new morphisms $i'', o''$ are built using a process called a ‘pushout’ which I’ll explain in a minute. The result is a cospan from $X$ to $Z,$ called the composite of the cospans we started with. Here it is:

So how does a pushout work? It’s a general construction that you can define in any category, though it only exists if the category is somewhat nice. (Ours always will be.) You start with a diagram like this:

and you want to get a commuting diamond like this:

which is in some sense ‘the best’ given the diagram we started with. For example, suppose we’re in the category of sets and $Y$ is a set included in both $\Gamma$ and $\Gamma'.$ Then we’d like $A$ to be the union of $\Gamma$ and $\Gamma'.$ There are other choices of $A$ that would give a commuting diamond, but the union is the best. Something similar is happening when we compose circuits, but instead of the category of sets we’re using the category of labelled graphs we discussed last time.

How do we make precise the idea that $A$ is ‘the best’? We consider any other potential solution to this problem, that is, some other commuting diamond:

Then $A$ is ‘the best’ if there exists a unique morphism $q$ from $A$ to the ‘competitor’ $Q$ making the whole combined diagram commute:

This property is called a universal property: instead of saying that $A$ is the ‘best’, grownups say it is universal.

When $A$ has this universal property we call it the pushout of the original diagram, and we may write it as $\Gamma +_Y \Gamma'.$ Actually we should call the whole diagram

the pushout, or a pushout square, because the morphisms $i'', o''$ matter too. The universal property is not really a property just of $A,$ but of the whole pushout square. But often we’ll be sloppy and call just the object $A$ the pushout.

Puzzle 1. Suppose we have a diagram in the category of sets

where $Y = \Gamma \cap \Gamma'$ and the maps $i, o'$ are the inclusions of this intersection in the sets $\Gamma$ and $\Gamma'.$ Prove that $A = \Gamma \cup \Gamma'$ is the pushout, or more precisely the diagram

is a pushout square, where $i'', o''$ are the inclusions of $\Gamma$ and $\Gamma'$ in the union $A = \Gamma \cup \Gamma'.$

More generally, a pushout in the category of sets is a way of gluing together sets $\Gamma$ and $\Gamma'$ with some ‘overlap’ given by the maps

And this works for labelled graphs, too!
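
For finite sets, the gluing described above can be sketched in a few lines of Python. This is an illustrative construction under stated assumptions, not code from our paper: it takes the disjoint union of the two sets and then identifies $i(y)$ with $o'(y)$ for each $y \in Y,$ using a small union-find structure. The function name `pushout` and the way elements are tagged are hypothetical choices.

```python
def pushout(G, G2, Y, i, o2):
    """Pushout of  G <-i- Y -o2-> G2  in the category of finite sets.

    Returns (A, inc, inc2): the set A of glued classes, together with
    the maps inc : G -> A and inc2 : G2 -> A.
    """
    # Disjoint union: tag elements so equal names in G and G2 stay distinct.
    elems = [(0, g) for g in G] + [(1, g) for g in G2]
    parent = {e: e for e in elems}

    def find(e):
        while parent[e] != e:
            parent[e] = parent[parent[e]]  # path halving
            e = parent[e]
        return e

    # Glue: identify i(y) in G with o2(y) in G2 for every y in Y.
    for y in Y:
        a, b = find((0, i[y])), find((1, o2[y]))
        if a != b:
            parent[a] = b

    classes = {}
    for e in elems:
        classes.setdefault(find(e), set()).add(e)
    class_of = {e: frozenset(c) for c in classes.values() for e in c}

    A = set(class_of.values())
    inc = {g: class_of[(0, g)] for g in G}
    inc2 = {g: class_of[(1, g)] for g in G2}
    return A, inc, inc2


# The situation of Puzzle 1: Y is the intersection, with inclusion maps.
G, G2, Y = {1, 2, 3}, {3, 4}, {3}
A, inc, inc2 = pushout(G, G2, Y, {3: 3}, {3: 3})
print(len(A))                # as many elements as the union {1, 2, 3, 4}
print(inc[3] == inc2[3])     # the two copies of 3 have been glued together
```

When the maps are not one-to-one, more gluing happens: if two elements of $Y$ map to the same element of $\Gamma$ but to different elements of $\Gamma',$ those two elements of $\Gamma'$ get identified in the pushout.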

Puzzle 2. Suppose we have two circuits of resistors that are composable, like this:

and this:

These give cospans in the category $L\mathrm{Graph}$ where

$L = (0,\infty)$

(Remember from last time that $L\mathrm{Graph}$ is the category of graphs with edges labelled by elements of some set $L.$) Show that if we compose these cospans we get a cospan corresponding to this circuit:

If you’re a mathematician you might find it easier to solve this kind of problem in general, which requires pondering how pushouts work in $L\mathrm{Graph}.$ Alternatively, you might find it easier to think about this particular example: then you can just check that the answer we want has the desired property of a pushout!

If this stuff seems complicated, well, just know that category theory is a very general, powerful tool and I’m teaching you just the microscopic fragment of it that we need right now. Category theory ultimately seems very simple: I can’t really think of any math that’s simpler! It only seems complicated when it’s unfamiliar and you have a fragmentary view of it.

So where are we? We know that circuits made of resistors are a special case of cospans. We know how to compose cospans. So, we know how to compose circuits… and in the last puzzle, we saw this does just what we want.

The advantage of this rather highbrow approach is that a huge amount is known about composing cospans! In particular, suppose we have any category $C$ where pushouts exist: that is, where we can always complete any diagram like this:

to a pushout square. Then we can form a category $\mathrm{Cospan}(C)$ where:

• an object is an object of $C$

• a morphism from an object $I \in C$ to an object $O \in C$ is an equivalence class of cospans from $I$ to $O:$

• we compose cospans in the manner just described.

Why did I say ‘equivalence class’? It’s because the pushout is not usually unique. It’s unique only up to isomorphism. So, composing cospans would be ill-defined unless we work with some kind of equivalence class of cospans.

To be precise, suppose we have two cospans from $I$ to $O$:

Then a map of cospans from one to the other is a commuting diagram like this:

We say that this is an isomorphism of cospans if $f$ is an isomorphism.

This gives our equivalence relation on cospans! It’s an old famous theorem in category theory—so famous that it’s hard to find a reference for the proof—that whenever $C$ is a category with pushouts, there’s a category $\mathrm{Cospan}(C)$ where:

• an object is an object of $C$

• a morphism from an object $I \in C$ to an object $O \in C$ is an isomorphism class of cospans from $I$ to $O.$

• we compose isomorphism classes of cospans by picking representatives, composing them and then taking the isomorphism class.

This takes some work to prove, but it’s true, so this is how we get our category of circuits!

Next time we’ll do something with this category. Namely, we’ll cook up a category of ‘behaviors’. The behavior of a circuit made of resistors just says which currents and potentials its terminals can have. If we put a circuit in a metaphorical ‘black box’ and refuse to peek inside, all we can see is its behavior.

Then we’ll cook up a functor from the category of circuits to the category of behaviors. We’ll call this the ‘black box functor’. Saying that it’s a functor mainly means that

$\blacksquare(f g) = \blacksquare(f) \blacksquare(g)$

Here $f$ and $g$ are circuits that we can compose, and $f g$ is their composite. The black square is the black box functor, so $\blacksquare(fg)$ is the behavior of the circuit $f g.$ There’s a way to compose behaviors, too, and the equation above says that the behavior of the composite circuit is the composite of their behaviors!

This is very important, because it says we can figure out what a big circuit does if we know what its pieces do. And this is one of the grand themes of network theory: understanding big complicated networks by understanding their pieces. We may not always be able to do this, in practice! But it’s something we’re always concerned with.