Complex Adaptive System Design (Part 3)

17 August, 2017

It’s been a long time since I’ve blogged about the Complex Adaptive System Composition and Design Environment or CASCADE project run by John Paschkewitz. For a reminder, read these:

Complex adaptive system design (part 1), Azimuth, 2 October 2016.

Complex adaptive system design (part 2), Azimuth, 18 October 2016.

A lot has happened since then, and I want to explain it.

I’m working with Metron Scientific Solutions to develop new techniques for designing complex networks.

The particular problem we began cutting our teeth on is a search and rescue mission where a bunch of boats, planes and drones have to locate and save people who fall overboard during a boat race in the Caribbean Sea. Subsequently the Metron team expanded the scope to other search and rescue tasks. But the real goal is to develop very generally applicable new ideas on designing and ‘tasking’ networks of mobile agents—that is, designing these networks and telling the agents what to do.

We’re using the mathematics of ‘operads’, in part because Spivak’s work on operads has drawn a lot of attention and raised a lot of hopes:

• David Spivak, The operad of wiring diagrams: formalizing a graphical language for databases, recursion, and plug-and-play circuits.

An operad is a bunch of operations for sticking together smaller things to create bigger ones—I’ll explain this in detail later, but that’s the core idea. Spivak described some specific operads called ‘operads of wiring diagrams’ and illustrated some of their potential applications. But when we got going on our project, we wound up using a different class of operads, which I’ll call ‘network operads’.

Here’s our dream, which we’re still trying to make into a reality:

Network operads should make it easy to build a big network from smaller ones and have every agent know what to do. You should be able to ‘slap together’ a network, throwing in more agents and more links between them, and automatically have it do something reasonable. This should be more flexible than an approach where you need to know ahead of time exactly how many agents you have, and how they’re connected, before you can tell them what to do.

You don’t want a network to malfunction horribly because you forgot to hook it up correctly. You want to focus your attention on optimizing the network, not getting it to work at all. And you want everything to work so smoothly that it’s easy for the network to adapt to changing conditions.

To achieve this we’re using network operads, which are certain special ‘typed operads’. So before getting into the details of our approach, I should say a bit about typed operads. And I think that will be enough for today’s post: I don’t want to overwhelm you with too much information at once.

In general, a ‘typed operad’ describes ways of sticking together things of various types to get new things of various types. An ‘algebra’ of the operad gives a particular specification of these things and the results of sticking them together. For now I’ll skip the full definition of a typed operad and only highlight the most important features. A typed operad O has:

• a set T of types.

• collections of operations O(t_1,...,t_n ; t) where t_i, t \in T. Here t_1, \dots, t_n are the types of the inputs, while t is the type of the output.

• ways to compose operations. Given an operation
f \in O(t_1,\dots,t_n ;t) and n operations

g_1 \in O(t_{11},\dots,t_{1 k_1}; t_1),\dots, g_n \in O(t_{n1},\dots,t_{n k_n};t_n)

we can compose them to get

f \circ (g_1,\dots,g_n) \in O(t_{11}, \dots, t_{nk_n};t)

These must obey some rules.

But if you haven’t seen operads before, you’re probably reeling in horror—so I need to rush in and save you by showing you the all-important pictures that help explain what’s going on!

First of all, you should visualize an operation f \in O(t_1, \dots, t_n; t) as a little gizmo like this:

It has n inputs at top and one output at bottom. Each input, and the output, has a ‘type’ taken from the set T. So, for example, if you operation takes two real numbers, adds them and spits out the closest integer, both input types would be ‘real’, while the output type would be ‘integer’.

The main thing we do with operations is compose them. Given an an operation f \in O(t_1,\dots,t_n ;t), we can compose it with n operations

g_1 \in O(t_{11},\dots,t_{1 k_1}; t_1), \quad \dots, \quad g_n \in O(t_{n1},\dots,t_{n k_n};t_n)

by feeding their outputs into the inputs of f, like this:

The result is an operation we call

f \circ (g_1, \dots, g_n)

Note that the input types of f have to match the output types of the g_i for this to work! This is the whole point of types: they forbid us from composing operations in ways that don’t make sense.

This avoids certain stupid mistakes. For example, you can take the square root of a positive number, but you may not want to take the square root of a negative number, and you definitely don’t want to take the square root of a hamburger. While you can land a plane on an airstrip, you probably don’t want to land a plane on a person.

The operations in an operad are quite abstract: they aren’t really operating on anything. To render them concrete, we need another idea: operads have ‘algebras’.

An algebra A of the operad O specifies a set of things of each type t \in T such that the operations of O act on these sets. A bit more precisely, an algebra consists of:

• for each type t \in T, a set A(t) of things of type t

• an action of O on A, that is, a collection of maps

\alpha : O(t_1,...,t_n ; t) \times A(t_1) \times \cdots \times A(t_n) \to A(t)

obeying some rules.

In other words, an algebra turns each operation f \in O(t_1,...,t_n ; t) into a function that eats things of types t_1, \dots, t_n and spits out a thing of type t.

When we get to designing systems with operads, the fact that the same operad can have many algebras will be useful. Our operad will have operations describing abstractly how to hook up networks to form larger networks. An algebra will give a specific implementation of these operations. We can use one algebra that’s fairly fine-grained and detailed about what the operations actually do, and another that’s less detailed. There will then be a map between from the first algebra to the second, called an ‘algebra homomorphism’, that forgets some fine-grained details.

There’s a lot more to say—all this is just the mathematical equivalent of clearing my throat before a speech—but I’ll stop here for now.

And as I do—since it also takes me time to stop talking—I should make it clear yet again that I haven’t even given the full definition of typed operads and their algebras! Besides the laws I didn’t write down, there’s other stuff I omitted. Most notably, there’s a way to permute the inputs of an operation in an operad, and operads have identity operations, one for each type.

To see the full definition of an ‘untyped’ operad, which is really an operad with just one type, go here:

• Wikipedia, Operad theory.

They just call it an ‘operad’. Note that they first explain ‘non-symmetric operads’, where you can’t permute the inputs of operations, and then explain operads, where you can.

If you’re mathematically sophisticated, you can easily guess the laws obeyed by a typed operad just by looking at this article and inserting the missing types. You can also see the laws written down in Spivak’s paper, but with some different terminology: he calls types ‘objects’, he calls operations ‘morphisms’, and he calls typed operads ‘symmetric colored operads’—or once he gets going, just ‘operads’.

You can also see the definition of a typed operad in Section 2.1 here:

• Donald Yau, Operads of wiring diagrams.

What I would call a typed operad with S as its set of types, he calls an ‘S-colored operad’.

I guess it’s already evident, but I’ll warn you that the terminology in this subject varies quite a lot from author to author: for example, a certain community calls typed operads ‘symmetric multicategories’. This is annoying at first but once you understand the subject it’s as ignorable as the fact that mathematicians have many different accents. The main thing to remember is that operads come in four main flavors, since they can either be typed or untyped, and they can either let you permute inputs or not. I’ll always be working with typed operads where you can permute inputs.

Finally, I’ll say that while the definition of operad looks lengthy and cumbersome at first, it becomes lean and elegant if you use more category theory.

Next time I’ll give you an example of an operad: the simplest ‘network
operad’.


A Compositional Framework for Reaction Networks

30 July, 2017

For a long time Blake Pollard and I have been working on ‘open’ chemical reaction networks: that is, networks of chemical reactions where some chemicals can flow in from an outside source, or flow out. The picture to keep in mind is something like this:



where the yellow circles are different kinds of chemicals and the aqua boxes are different reactions. The purple dots in the sets X and Y are ‘inputs’ and ‘outputs’, where certain kinds of chemicals can flow in or out.

Here’s our paper on this stuff:

• John Baez and Blake Pollard, A compositional framework for reaction networks, Reviews in Mathematical Physics 29, 1750028.

Blake and I gave talks about this stuff in Luxembourg this June, at a nice conference called Dynamics, thermodynamics and information processing in chemical networks. So, if you’re the sort who prefers talk slides to big scary papers, you can look at those:

• John Baez, The mathematics of open reaction networks.

• Blake Pollard, Black-boxing open reaction networks.

But I want to say here what we do in our paper, because it’s pretty cool, and it took a few years to figure it out. To get things to work, we needed my student Brendan Fong to invent the right category-theoretic formalism: ‘decorated cospans’. But we also had to figure out the right way to think about open dynamical systems!

In the end, we figured out how to first ‘gray-box’ an open reaction network, converting it into an open dynamical system, and then ‘black-box’ it, obtaining the relation between input and output flows and concentrations that holds in steady state. The first step extracts the dynamical behavior of an open reaction network; the second extracts its static behavior. And both these steps are functors!

Lawvere had the idea that the process of assigning ‘meaning’ to expressions could be seen as a functor. This idea has caught on in theoretical computer science: it’s called ‘functorial semantics’. So, what we’re doing here is applying functorial semantics to chemistry.

Now Blake has passed his thesis defense based on this work, and he just needs to polish up his thesis a little before submitting it. This summer he’s doing an internship at the Princeton branch of the engineering firm Siemens. He’s working with Arquimedes Canedo on ‘knowledge representation’.

But I’m still eager to dig deeper into open reaction networks. They’re a small but nontrivial step toward my dream of a mathematics of living systems. My working hypothesis is that living systems seem ‘messy’ to physicists because they operate at a higher level of abstraction. That’s what I’m trying to explore.

Here’s the idea of our paper.

The idea

Reaction networks are a very general framework for describing processes where entities interact and transform int other entities. While they first showed up in chemistry, and are often called ‘chemical reaction networks’, they have lots of other applications. For example, a basic model of infectious disease, the ‘SIRS model’, is described by this reaction network:

S + I \stackrel{\iota}{\longrightarrow} 2 I  \qquad  I \stackrel{\rho}{\longrightarrow} R \stackrel{\lambda}{\longrightarrow} S

We see here three types of entity, called species:

S: susceptible,
I: infected,
R: resistant.

We also have three `reactions’:

\iota : S + I \to 2 I: infection, in which a susceptible individual meets an infected one and becomes infected;
\rho : I \to R: recovery, in which an infected individual gains resistance to the disease;
\lambda : R \to S: loss of resistance, in which a resistant individual becomes susceptible.

In general, a reaction network involves a finite set of species, but reactions go between complexes, which are finite linear combinations of these species with natural number coefficients. The reaction network is a directed graph whose vertices are certain complexes and whose edges are called reactions.

If we attach a positive real number called a rate constant to each reaction, a reaction network determines a system of differential equations saying how the concentrations of the species change over time. This system of equations is usually called the rate equation. In the example I just gave, the rate equation is

\begin{array}{ccl} \displaystyle{\frac{d S}{d t}} &=& r_\lambda R - r_\iota S I \\ \\ \displaystyle{\frac{d I}{d t}} &=&  r_\iota S I - r_\rho I \\  \\ \displaystyle{\frac{d R}{d t}} &=& r_\rho I - r_\lambda R \end{array}

Here r_\iota, r_\rho and r_\lambda are the rate constants for the three reactions, and S, I, R now stand for the concentrations of the three species, which are treated in a continuum approximation as smooth functions of time:

S, I, R: \mathbb{R} \to [0,\infty)

The rate equation can be derived from the law of mass action, which says that any reaction occurs at a rate equal to its rate constant times the product of the concentrations of the species entering it as inputs.

But a reaction network is more than just a stepping-stone to its rate equation! Interesting qualitative properties of the rate equation, like the existence and uniqueness of steady state solutions, can often be determined just by looking at the reaction network, regardless of the rate constants. Results in this direction began with Feinberg and Horn’s work in the 1960’s, leading to the Deficiency Zero and Deficiency One Theorems, and more recently to Craciun’s proof of the Global Attractor Conjecture.

In our paper, Blake and I present a ‘compositional framework’ for reaction networks. In other words, we describe rules for building up reaction networks from smaller pieces, in such a way that its rate equation can be figured out knowing those those of the pieces. But this framework requires that we view reaction networks in a somewhat different way, as ‘Petri nets’.

Petri nets were invented by Carl Petri in 1939, when he was just a teenager, for the purposes of chemistry. Much later, they became popular in theoretical computer science, biology and other fields. A Petri net is a bipartite directed graph: vertices of one kind represent species, vertices of the other kind represent reactions. The edges into a reaction specify which species are inputs to that reaction, while the edges out specify its outputs.

You can easily turn a reaction network into a Petri net and vice versa. For example, the reaction network above translates into this Petri net:



Beware: there are a lot of different names for the same thing, since the terminology comes from several communities. In the Petri net literature, species are called places and reactions are called transitions. In fact, Petri nets are sometimes called ‘place-transition nets’ or ‘P/T nets’. On the other hand, chemists call them ‘species-reaction graphs’ or ‘SR-graphs’. And when each reaction of a Petri net has a rate constant attached to it, it is often called a ‘stochastic Petri net’.

While some qualitative properties of a rate equation can be read off from a reaction network, others are more easily read from the corresponding Petri net. For example, properties of a Petri net can be used to determine whether its rate equation can have multiple steady states.

Petri nets are also better suited to a compositional framework. The key new concept is an ‘open’ Petri net. Here’s an example:



The box at left is a set X of ‘inputs’ (which happens to be empty), while the box at right is a set Y of ‘outputs’. Both inputs and outputs are points at which entities of various species can flow in or out of the Petri net. We say the open Petri net goes from X to Y. In our paper, we show how to treat it as a morphism f : X \to Y in a category we call \textrm{RxNet}.

Given an open Petri net with rate constants assigned to each reaction, our paper explains how to get its ‘open rate equation’. It’s just the usual rate equation with extra terms describing inflows and outflows. The above example has this open rate equation:

\begin{array}{ccr} \displaystyle{\frac{d S}{d t}} &=&  - r_\iota S I - o_1 \\ \\ \displaystyle{\frac{d I}{d t}} &=&  r_\iota S I - o_2  \end{array}

Here o_1, o_2 : \mathbb{R} \to \mathbb{R} are arbitrary smooth functions describing outflows as a function of time.

Given another open Petri net g: Y \to Z, for example this:



it will have its own open rate equation, in this case

\begin{array}{ccc} \displaystyle{\frac{d S}{d t}} &=& r_\lambda R + i_2 \\ \\ \displaystyle{\frac{d I}{d t}} &=& - r_\rho I + i_1 \\  \\ \displaystyle{\frac{d R}{d t}} &=& r_\rho I - r_\lambda R  \end{array}

Here i_1, i_2: \mathbb{R} \to \mathbb{R} are arbitrary smooth functions describing inflows as a function of time. Now for a tiny bit of category theory: we can compose f and g by gluing the outputs of f to the inputs of g. This gives a new open Petri net gf: X \to Z, as follows:



But this open Petri net gf has an empty set of inputs, and an empty set of outputs! So it amounts to an ordinary Petri net, and its open rate equation is a rate equation of the usual kind. Indeed, this is the Petri net we have already seen.

As it turns out, there’s a systematic procedure for combining the open rate equations for two open Petri nets to obtain that of their composite. In the example we’re looking at, we just identify the outflows of f with the inflows of g (setting i_1 = o_1 and i_2 = o_2) and then add the right hand sides of their open rate equations.

The first goal of our paper is to precisely describe this procedure, and to prove that it defines a functor

\diamond: \textrm{RxNet} \to \textrm{Dynam}

from \textrm{RxNet} to a category \textrm{Dynam} where the morphisms are ‘open dynamical systems’. By a dynamical system, we essentially mean a vector field on \mathbb{R}^n, which can be used to define a system of first-order ordinary differential equations in n variables. An example is the rate equation of a Petri net. An open dynamical system allows for the possibility of extra terms that are arbitrary functions of time, such as the inflows and outflows in an open rate equation.

In fact, we prove that \textrm{RxNet} and \textrm{Dynam} are symmetric monoidal categories and that d is a symmetric monoidal functor. To do this, we use Brendan Fong’s theory of ‘decorated cospans’.

Decorated cospans are a powerful general tool for describing open systems. A cospan in any category is just a diagram like this:



We are mostly interested in cospans in \mathrm{FinSet}, the category of finite sets and functions between these. The set S, the so-called apex of the cospan, is the set of states of an open system. The sets X and Y are the inputs and outputs of this system. The legs of the cospan, meaning the morphisms i: X \to S and o: Y \to S, describe how these inputs and outputs are included in the system. In our application, S is the set of species of a Petri net.

For example, we may take this reaction network:

A+B \stackrel{\alpha}{\longrightarrow} 2C \quad \quad C \stackrel{\beta}{\longrightarrow} D

treat it as a Petri net with S = \{A,B,C,D\}:



and then turn that into an open Petri net by choosing any finite sets X,Y and maps i: X \to S, o: Y \to S, for example like this:



(Notice that the maps including the inputs and outputs into the states of the system need not be one-to-one. This is technically useful, but it introduces some subtleties that I don’t feel like explaining right now.)

An open Petri net can thus be seen as a cospan of finite sets whose apex S is ‘decorated’ with some extra information, namely a Petri net with S as its set of species. Fong’s theory of decorated cospans lets us define a category with open Petri nets as morphisms, with composition given by gluing the outputs of one open Petri net to the inputs of another.

We call the functor

\diamond: \textrm{RxNet} \to \textrm{Dynam}

gray-boxing because it hides some but not all the internal details of an open Petri net. (In the paper we draw it as a gray box, but that’s too hard here!)

We can go further and black-box an open dynamical system. This amounts to recording only the relation between input and output variables that must hold in steady state. We prove that black-boxing gives a functor

\square: \textrm{Dynam} \to \mathrm{SemiAlgRel}

(yeah, the box here should be black, and in our paper it is). Here \mathrm{SemiAlgRel} is a category where the morphisms are semi-algebraic relations between real vector spaces, meaning relations defined by polynomials and inequalities. This relies on the fact that our dynamical systems involve algebraic vector fields, meaning those whose components are polynomials; more general dynamical systems would give more general relations.

That semi-algebraic relations are closed under composition is a nontrivial fact, a spinoff of the Tarski–Seidenberg theorem. This says that a subset of \mathbb{R}^{n+1} defined by polynomial equations and inequalities can be projected down onto \mathbb{R}^n, and the resulting set is still definable in terms of polynomial identities and inequalities. This wouldn’t be true if we didn’t allow inequalities. It’s neat to see this theorem, important in mathematical logic, showing up in chemistry!

Structure of the paper

Okay, now you’re ready to read our paper! Here’s how it goes:

In Section 2 we review and compare reaction networks and Petri nets. In Section 3 we construct a symmetric monoidal category \textrm{RNet} where an object is a finite set and a morphism is an open reaction network (or more precisely, an isomorphism class of open reaction networks). In Section 4 we enhance this construction to define a symmetric monoidal category \textrm{RxNet} where the transitions of the open reaction networks are equipped with rate constants. In Section 5 we explain the open dynamical system associated to an open reaction network, and in Section 6 we construct a symmetric monoidal category \textrm{Dynam} of open dynamical systems. In Section 7 we construct the gray-boxing functor

\diamond: \textrm{RxNet} \to \textrm{Dynam}

In Section 8 we construct the black-boxing functor

\square: \textrm{Dynam} \to \mathrm{SemiAlgRel}

We show both of these are symmetric monoidal functors.

Finally, in Section 9 we fit our results into a larger ‘network of network theories’. This is where various results in various papers I’ve been writing in the last few years start assembling to form a big picture! But this picture needs to grow….


A Bicategory of Decorated Cospans

8 July, 2017

My students are trying to piece together general theory of networks, inspired by many examples. A good general theory should clarify and unify these examples. What some people call network theory, I’d just call ‘applied graph invariant theory’: they come up with a way to calculate numbers from graphs, they calculate these numbers for graphs that show up in nature, and then they try to draw conclusions about this. That’s fine as far as it goes, but there’s a lot more to network theory!

There are many kinds of networks. You can usually create big networks of a given kind by sticking together smaller networks of this kind. The networks usually do something, and the behavior of the whole is usually determined by the behavior of the parts and how the parts are stuck together.

So, we should think of networks of a given kind as morphisms in a category, or more generally elements of an algebra of some operad, and define a map sending each such network to its behavior. Then we can study this map mathematically!

All these insights (and many more) are made precise in Fong’s theory of ‘decorated cospans’:

• Brendan Fong, The Algebra of Open and Interconnected Systems, Ph.D. thesis, University of Oxford, 2016. (Blog article here.)

Kenny Courser is starting to look at the next thing: how one network can turn into another. For example, a network might change over time, or we might want to simplify a complicated network somehow. If a network is morphism, a process where one network turns into another could be a ‘2-morphism’: that is, a morphism between morphisms. Just as categories have objects and morphisms, bicategories have objects, morphisms and 2-morphisms.

So, Kenny is looking at bicategories. As a first step, Kenny took Brendan’s setup and souped it up to define ‘decorated cospan bicategories’:

• Kenny Courser, Decorated cospan bicategories, Theory and Applications of Categories 32 (2017), 985–1027.

In this paper, he showed that these bicategories are often ‘symmetric monoidal’. This means that you can not only stick networks together end to end, you can also set them side by side or cross one over the other—and similarly for processes that turn one network into another! A symmetric monoidal bicategory is a somewhat fearsome structure, so Kenny used some clever machinery developed by Mike Shulman to get the job done:

• Mike Shulman, Constructing symmetric monoidal bicategories.

I would love to talk about the details, but they’re a bit technical so I think I’d better talk about something more basic. Namely: what’s a decorated cospan category and what’s a decorated cospan bicategory?

First: what’s a decorated cospan? A cospan in some category C is a diagram like this:

where the objects and morphisms are all in C. For example, if C is the category of sets, we’ve got two sets X and Y mapped to a set \Gamma.

In a ‘decorated’ cospan, the object \Gamma is equipped or, as we like to say, ‘decorated’ with extra structure. For example:

Here the set \Gamma consists of 3 points—but it’s decorated with a graph whose edges are labelled by numbers! You could use this to describe an electrical circuit made of resistors. The set X would then be the set of ‘input terminals’, and Y the set of ‘output terminals’.

In this example, and indeed in many others, there’s no serious difference between inputs and outputs. We could reflect the picture, switching the roles of X and Y, and the inputs would become outputs and vice versa. One reason for distinguishing them is that we can then attach the outputs of one circuit to the inputs of another and build a larger circuit. If we think of our circuit as a morphism from the input set X to the output set Y, this process of attaching circuits to form larger ones can be seen as composing morphisms in a category.

In other words, if we get the math set up right, we can compose a decorated cospan from X to Y and a decorated cospan from Y to Z and get a decorated cospan from X to Z. So with luck, we get a category with objects of C as objects, and decorated cospans between these guys as morphisms!

For example, we can compose this:

and this:

to get this:

What did I mean by saying ‘with luck’? Well, there’s not really any luck involved, but we need some assumptions for all this to work. Before we even get to the decorations, we need to be able to compose cospans. We can do this whenever our cospans live in a category with pushouts. In category theory, a pushout is how we glue two things together.

So, suppose our category C has pushouts. IF we then have two cospans in C, one from X to Y and one from Y to Z:

we can take a pushout:

and get a cospan from X to Z:

All this is fine and dandy, but there’s a slight catch: the pushout is only defined up to isomorphism, so we can’t expect this process of composing cospans to be associative: it will only be associative up to isomorphism.

What does that mean? What’s an isomorphism of cospans?

I’m glad you asked. A map of cospans is a diagram like this:

where the two triangles commmute. You can see two cospans in this picture; the morphism f provides the map from one to the other. If f is an isomorphism, then this is an isomorphism of cospans.

To get around this problem, we can work with a category where the morphisms aren’t cospans, but isomorphism classes of cospans. That’s what Brendan did, and it’s fine for many purposes.

But back around 1972, when Bénabou was first inventing bicategories, he noticed that you could also create a bicategory with

• objects of C as objects,
• spans in C as morphisms, and
• maps of spans in C as 2-morphisms.

Bicategories are perfectly happy for composition of 1-morphisms to be associative only up to isomorphism, so this solves the problem in a somewhat nicer way. (Taking equivalence classes of things when you don’t absolutely need to is regarded with some disdain in category theory, because it often means you’re throwing out information—and when you throw out information, you often regret it later.)

So, if you’re interested in decorated cospan categories, and you’re willing to work with bicategories, you should consider thinking about decorated cospan bicategories. And now, thanks to Kenny Courser’s work, you can!

He showed how the decorations work in the bicategorical approach: for example, he proved that whenever C has finite colimits and

F : (C,+) \to (\mathrm{Set}, \times)

is a lax symmetric monoidal functor, you get a symmetric monoidal bicategory where a morphism is a cospan in C:

with the object \Gamma decorated by an element of F(\Gamma).

Proving this took some virtuosic work in category theory. The key turns out to be this glorious diagram:



For the explanation, check out Proposition 4.1 in his paper.

I’ll talk more about applications of cospan bicategories when I blog about some other papers Kenny Courser and Daniel Cicala are writing.


The Theory of Devices

20 June, 2017

I’m visiting the University of Genoa and talking to two category theorists: Marco Grandis and Giuseppe Rosolini. Grandis works on algebraic topology and higher categories, while Rosolini works on the categorical semantics of programming languages.

Yesterday, Marco Grandis showed me a fascinating paper by his thesis advisor:

• Gabriele Darbo, Aspetti algebrico-categoriali della teoria dei dispotivi, Symposia Mathematica IV (1970), Istituto Nazionale di Alta Matematica, 303–336.

It’s closely connected to Brendan Fong’s thesis, but also different—and, of course, much older. According to Grandis, Darbo was the first person to work on category theory in Italy. He’s better known for other things, like ‘Darbo’s fixed point theorem’, but this piece of work is elegant, and, it seems to me, strangely ahead of its time.

The paper’s title translates as ‘Algebraic-categorical aspects of the theory of devices’, and its main concept is that of a ‘universe of devices’: a collection of devices of some kind that can be hooked up using wires to form more devices of this kind. Nowadays we might study this concept using operads—but operads didn’t exist in 1970, and Darbo did quite fine without them.

The key is the category \mathrm{FinCorel}, which has finite sets as objects and ‘corelations’ as morphisms. I explained corelations here:

Corelations in network theory, 2 February 2016.

Briefly, a corelation from a finite set X to a finite set Y is a partition of the disjoint union of X and Y. We can get such a partition from a bunch of wires connecting points of X and Y. The idea is that two points lie in the same part of the partition iff they’re connected, directly or indirectly, by a path of wires. So, if we have some wires like this:

they determine a corelation like this:

There’s an obvious way to compose corelations, giving a category \mathrm{FinCorel}.

Gabriele Darbo doesn’t call them ‘corelations’: he calls them ‘trasduttori’. A literal translation might be ‘transducers’. But he’s definitely talking about corelations, and like Fong he thinks they are basic for studying ways to connect systems.

Darbo wants a ‘universe of devices’ to assign to each finite set X a set D(X) of devices having X as their set of ‘terminals’. Given a device in D(X) and a corelation f \colon X \to Y, thought of as a bunch of wires, he wants to be able to attach these wires to the terminals in X and get a new device with Y as its set of terminals. Thus he wants a map D(f): D(X) \to D(Y). If you draw some pictures, you’ll see this should give a functor

D : \mathrm{FinCorel} \to \mathrm{Set}

Moreover, if we have device with a set X of terminals and a device with a set Y of terminals, we should be able to set them side by side and get a device whose set of terminals form the set X + Y, meaning the disjoint union of X and Y. So, Darbo wants to have maps

\delta_{X,Y} : D(X) \times D(Y) \to D(X + Y)

If you draw some more pictures you can convince yourself that \delta should be a lax symmetric monoidal functor… if you’re one of the lucky few who knows what that means. If you’re not, you can look it up in many places, such as Section 1.2 here:

• Brendan Fong, The Algebra of Open and Interconnected Systems, Ph.D. thesis, University of Oxford, 2016. (Blog article here.)

Darbo does not mention lax symmetric monoidal functors, perhaps because such concepts were first introduced by Mac Lane only in 1968. But as far as I can tell, Darbo’s definition is almost equivalent to this:

Definition. A universe of devices is a lax symmetric monoidal functor D \colon \mathrm{FinCorel} \to \mathrm{Set}.

One difference is that Darbo wants there to be exactly one device with no terminals. Thus, he assumes D(\emptyset) is a one-element set, say 1, while the definition above would only demand the existence of a map \delta \colon 1 \to D(\emptyset) obeying a couple of axioms. That gives a particular ‘favorite’ device with no terminals. I believe we get Darbo’s definition from the above one if we further assume \delta is the identity map. This makes sense if we take the attitude that ‘a device is determined by its observable behavior’, but not otherwise. This attitude is called ‘black-boxing’.

Darbo does various things in his paper, but the most exciting to me is his example connected to linear electrical circuits. He defines, for any pair of objects V and I in an abelian category C, a particular universe of devices. He calls this the universe of linear devices having V as the object of potentials and I as the object of currents.

If you don’t like abelian categories, think of C as the category of finite-dimensional real vector spaces, and let V = I = \mathbb{R}. Electric potential and electric current are described by real numbers so this makes sense.

The basic idea will be familiar to Fong fans. In an electrical circuit made of purely conductive wires, when two wires merge into one we add the currents to get the current on the wire going out. When one wire splits into two we duplicate the potential to get the potentials on the wires going out. Working this out further, any corelation f : X \to Y between finite set determines two linear relations, one

f_* : I^X \rightharpoonup I^Y

relating the currents on the wires coming in to the currents on the wires going out, and one

f^* : V^Y \rightharpoonup V^X

relating the potentials on the wires going out to the potentials on the wires coming in. Here I^X is the direct sum of X copies of I, and so on; the funky arrow indicates that we have a linear relation rather than a linear map. Note that f_* goes forward while f^* goes backward; this is mainly just conventional, since you can turn linear relations around, but we’ll see it’s sort of nice.

If we let \mathrm{Rel}(A,B) be the set of linear relations between two objects A, B \in C, we can use the above technology to get a universe of devices where

D(X) = \mathrm{Rel}(V^X, I^X)

In other words, a device of this kind is simply a linear relation between the potentials and currents at its terminals!

How does D get to be a functor D : \mathrm{FinCorel} \to \mathrm{FinSet}? That’s pretty easy. We’ve defined it on objects (that is, finite sets) by the above formula. So, suppose we have a morphism (that is, a corelation) f \colon X \to Y. How do we define D(f) : D(X) \to D(Y)?

To answer this question, we need a function

D(f) : \mathrm{Rel}(V^X, I^X) \to \mathrm{Rel}(V^Y, I^Y)

Luckily, we’ve got linear relations

f_* : I^X \rightharpoonup I^Y

and

f^* : V^Y \rightharpoonup V^X

So, given any linear relation R \in \mathrm{Rel}(V^X, I^X), we just define

D(f)(R) = f_* \circ R \circ f^*

Voilà!

People who have read Fong’s thesis, or my paper with Blake Pollard on reaction networks:

• John Baez and Blake Pollard, A compositional framework for reaction networks.

will find many of Darbo’s ideas eerily similar. In particular, the formula

D(f)(R) = f_* \circ R \circ f^*

appears in Lemma 16 of my paper with Blake, where we are defining a category of open dynamical systems. We prove that D is a lax symmetric monoidal functor, which is just what Darbo proved—though in a different context, since our R is not linear like his, and for a different purpose, since he’s trying to show D is a ‘universe of devices’, while we’re trying to construct the category of open dynamical systems as a ‘decorated cospan category’.

In short: if this work of Darbo had become more widely known, the development of network theory could have been sped up by three decades! But there was less interest in a general theory of networks at the time, lax monoidal functors were brand new, operads unknown… and, sadly, few mathematicians read Italian.

Darbo has other papers, and so do his students. We should read them and learn from them! Here are a few open-access ones:

• Franco Parodi, Costruzione di un universo di dispositivi non lineari su una coppia di gruppi abeliani, Rendiconti del Seminario Matematico della Università di Padova 58 (1977), 45–54.

• Franco Parodi, Categoria degli universi di dispositivi e categoria delle T-algebre, Rendiconti del Seminario Matematico della Università di Padova 62 (1980), 1–15.

• Stefano Testa, Su un universo di dispositivi monotoni, Rendiconti del Seminario Matematico della Università di Padova 65 (1981), 53–57.

At some point I will scan in G. Darbo’s paper and make it available here.


Information Processing in Chemical Networks (Part 1)

4 January, 2017

There’s a workshop this summer:

Dynamics, Thermodynamics and Information Processing in Chemical Networks, 13-16 June 2017, Complex Systems and Statistical Mechanics Group, University of Luxembourg. Organized by Massimiliano Esposito and Matteo Polettini.

They write, “The idea of the workshop is to bring in contact a small number of high-profile research groups working at the frontier between physics and biochemistry, with particular emphasis on the role of Chemical Networks.”

The speakers may include John Baez, Sophie de Buyl, Massimiliano Esposito, Arren Bar-Even, Christoff Flamm, Ronan Fleming, Christian Gaspard, Daniel Merkle, Philippe Nge, Thomas Ouldridge, Luca Peliti, Matteo Polettini, Hong Qian, Stefan Schuster, Alexander Skupin, Pieter Rein ten Wolde. I believe attendance is by invitation only, so I’ll endeavor to make some of the ideas presented available here at this blog.

Some of the people involved

I’m looking forward to this, in part because there will be a mix of speakers I’ve met, speakers I know but haven’t met, and speakers I don’t know yet. I feel like reminiscing a bit, and I hope you’ll forgive me these reminiscences, since if you try the links you’ll get an introduction to the interface between computation and chemical reaction networks.

In part 25 of the network theory series here, I imagined an arbitrary chemical reaction network and said:

We could try to use these reactions to build a ‘chemical computer’. But how powerful can such a computer be? I don’t know the answer.

Luca Cardelli answered my question in part 26. This was just my first introduction to the wonderful world of chemical computing. Erik Winfree has a DNA and Natural Algorithms Group at Caltech, practically next door to Riverside, and the people there do a lot of great work on this subject. David Soloveichik, now at U. T. Austin, is an alumnus of this group.

In 2014 I met all three of these folks, and many other cool people working on these theme, at a workshop I tried to summarize here:

Programming with chemical reaction networks, Azimuth, 23 March 2014.

The computational power of chemical reaction networks, 10 June 2014.

Chemical reaction network talks, 26 June 2014.

I met Matteo Polettini about a year later, at a really big workshop on chemical reaction networks run by Elisenda Feliu and Carsten Wiuf:

Trends in reaction network theory (part 1), Azimuth, 27 January 2015.

Trends in reaction network theory (part 2), Azimuth, 1 July 2015.

Polettini has his own blog, very much worth visiting. For example, you can see his view of the same workshop here:

• Matteo Polettini, Mathematical trends in reaction network theory: part 1 and part 2, Out of Equilibrium, 1 July 2015.

Finally, I met Massimiliano Esposito and Christoph Flamm recently at the Santa Fe Institute, at a workshop summarized here:

Information processing and biology, Azimuth, 7 November 2016.

So, I’ve gradually become educated in this area, and I hope that by June I’ll be ready to say something interesting about the semantics of chemical reaction networks. Blake Pollard and I are writing a paper about this now.


Globular

14 December, 2016

One of my goals is to turn category theory, and even higher category theory, into a practical tool for science. For this we need good scientific ideas—but we also need good software.

My friend Jamie Vicary has been developing some of this software, together with Aleks Kissinger and Krzysztof Bar and others. Jamie demonstrated it at the Simons Institute workshop on compositionality. You can watch his demonstration here:

But since Globular runs on a web browser, you can also try it out yourself here:

Globular.

You can see his talk slides:

• Jamie Vicary, Data structures for quasistrict higher categories. (Talk slides here.)

Abstract. Higher category theory is one of the most general approaches to compositionality, with broad and striking applications across computer science, mathematics and physics. We present a new, simple way to define higher categories, in which many important compositional properties emerge as theorems, rather than axioms. Our approach is amenable to computer implementation, and we present a new proof assistant we have developed, with a powerful graphical calculus. In particular, we will outline a substantial new proof we have developed in our setting.

And in December 2015, he wrote an article about this software on the n-Category Café. It’s been improved since then, but it can’t hurt to read what he wrote—so I append it here!

Globular: the basic idea

When you’re trying to prove something in a monoidal category, or a higher category, string diagrams are a really useful technique, especially when you’re trying to get an intuition for what you’re doing. But when it comes to writing up your results, the problems start to mount. For a complex proof, it’s hard to be sure your result is correct—a slip of the pen could lead to a false proof, and an error that’s hard to find. And writing up your results can be a huge time-sink, requiring weeks or months using a graphics package, all just for some nice pictures that tell you little about the correctness of the proof, and become useless if you decide to change your approach. Computers should be able help with all these things, in the way that proof assistants like Coq and Agda are allowing us to work with traditional syntactic proofs in a more sophisticated way.

The purpose of this post is to introduce Globular, a new proof assistant for working with higher-categorical proofs using string diagrams. It’s available at http://globular.science, with documentation on the nab. It’s web-based, so everything happens right in your browser: build formal proofs, visualize and step through them; keep your proofs private, share them with collaborators, or make them publicly available.

Before we get into the technical details, here’s a screenshot of Globular in action:

Screenshot of Globular

The main part of the screen shows a diagram, which in this case is 2-dimensional. It represents a composite 2-cell in a finitely-presented 2-category, with the blue and red regions representing objects, the lines representing 1-cells, and the vertices representing 2-cells. In fact, this 2d diagram is just an intermediate state of a 3d proof, through which we’re navigating with the ‘Slice’ controls in the top-right. The proof itself has been built up by composing the generators listed in the signature, down the left-hand side of the screen. (If you want to take a look at this proof yourself, you can go straight there; in the top-right, set ‘Project’ to 0, then increment the second ‘Slice’ counter to scroll through the proof.)

Globular has been developed so far in the Quantum Group in
the Oxford Computer Science department, by Krzysztof
Bar
, Katherine Casey, Aleks Kissinger, Jamie Vicary and Caspar Wylie. We haven’t quite got around to it yet, but Globular will be open-source, and we’re really keen for people to get involved and help build the software—there’s a huge amount to do! If you want to help out, get in touch.

Mathematical foundations

Globular is based on the theory of finitely-presented semistrict n-categories; at the moment, it works up to the level of 3-categories, with an extension to 4-categories actively in development. (You can build cells of any dimension, but from 4-cells and up, some structures are missing.)

Definitions of n-category vary in how strict they are; a definition is semistrict when it’s as strict as possible, while still having the property that every weak n-category satisfies it, up to equivalence. Definitions of semistrict n-category are not unique: in dimension 3, Gray categories put all the weak structure in the interchangers, while Simpson snucategories put it all in the unitors. Globular implements the axioms of a Gray category, because this is the most appropriate for the graphical calculus: the interchangers can be seen graphically, as changes in height of the components of the diagram. By the theory of k-tuply monoidal n-categories, this also lets you build proofs in a monoidal category, or a braided monoidal category, or a monoidal 2-category.

The only things that Globular understands are k-cells, for some value of k. So if you want to build an n-category where an equation f=g holds between n-cells, you have to do it by adding (n+1)-cells a:f \to g and b:g \to f. If you then build some composite C(f) involving f, you can apply the cell a to obtain C(g), and we interpret this as the equation C(f) = C(g). In a slogan, this is equality via rewriting. This is consistent with the basic premise of homotopy type theory: treat your proofs as first-order structures, which can in turn be reasoned about themselves.

Globular can also handle invertibility in a nice way. For a cell F:A \to B to be invertible, indicated by ticking a box in the signature, means that there also exists an invertible cell F^{-1}: B \to A, and invertible cells \text{id}_A \to F . F^{-1} and \text{id}_B \to F^{-1} . F. This is a coinductive definition (see Mike Shulman’s nice post on this topic), since we’re defining the notion of invertibility in terms of itself in a higher dimension. This sort of a definition is great for proof assistants to work with, as it allows a lot of structure to be generated from a single compact definition.

How it works

For a lot more details, take a look at the nLab page. Everything that happens in Globular involves in interaction between the signature on the left-hand side, and the diagram in the main part of the screen. The signature stores the ‘library’ of cells you have available, and the diagram is a particular composite of cells that you have constructed.

To construct a new diagram, clear whatever is currently displayed by clicking the ‘Clear’ button on the right, or pressing ‘c’. Then start by clicking the icon of a n-cell in your signature, which will make a diagram consisting just of that cell. Clicking on the icons of other k-generators for 0 < k \leq n will display a list of ways the cell can be attached, and when you choose one of these ways, the attachment will be performed, growing your n-diagram. (If you’re starting with a blank workspace you will only have a single 0-cell available, so you won’t be able to do this yet!) Clicking an (n+1)-cell G displays a list of ways that your n-diagram D can be rewritten, by identifying the source of G as a subdiagram of D. Selecting one of these ways will implement the rewrite, by ‘cutting out’ the chosen subdiagram of D, and replacing it with the target of G.

Another way to modify the diagram is to click directly on it. Clicking near the edge of the diagram performs an attachment, while clicking in the interior of the diagram performs a rewrite. If more than one attachment or rewrite is consistent with your click, a little menu will pop up for you to choose what you want to do. When you move your mouse pointer over the diagram, a little label pops up to show you what your cursor is hovering over, which is helpful when choosing where to click.

You can also click-and-drag on the diagram. This will attach or rewrite with an interchanger, or naturality for an interchanger, or invertibility for an interchanger, depending on where you have clicked and the direction of the drag. Clicking and dragging is designed to work as if you were really ‘touching’ the strings. So if you want to braid one strand over another, click the strand to ‘grab’ it, and ‘pull’ it to one side. If you want to pull a vertex through a braiding, click the vertex to ‘grab’ it, and ‘pull’ it up or down through its adjacent braiding. Of course, Globular will only carry out the command if the move you are attempting to make is actually valid in that location.

Example theorems

Here are four worked examples of nontrivial proofs in Globular:

Frobenius implies associative: http://globular.science/1512.004. In a monoidal category, if multiplication and comultiplication morphisms are unital, counital and Frobenius, then they are associative and coassociative.

Strengthening an equivalence: http://globular.science/1512.007. In a 2-category, an equivalence gives rise to an adjoint equivalence, satisfying the snake equations.

Swallowtail comes for free: http://globular.science/1512.006. In a monoidal 2-category, a weakly-dual pair of objects gives rise to a strongly-dual pair, satisfying the swallowtail equations.

Pentagon and triangle implies \lambda_I = \rho_I: http://globular.science/1512.002. In a monoidal 2-category, if a pseudomonoid object satisfies pentagon and triangle equations, then it satisfies \lambda_I = \rho_I.

We’ll focus on the second example project “Strengthening an equivalence” listed above, and see how it was constructed. This project investigates the factthat every equivalence in a 2-category gives rise to an adjoint equivalence. To start, we therefore need the basic data that exhibits an equivalence in a 2-category: two 0-cells A and B, and an invertible 1-cell F:A \to B, by the weak definition of ‘invertible’ discussed above. This gives us the following signature:

The 2-cells that witness invertibility of F look like cups and caps in the graphical calculus, but they won’t satisfy the snake equations that define an adjoint equivalence. The idea of this proof is to define a new cap, built out of the invertible structure of F, which does satisfy the snake equations with the existing cup.

By starting with a diagram consisting of F alone, pressing ‘i’ to take the identity diagram, and clicking-and-dragging, we build the following 2-diagram, out of the invertible structure associated to F:

This is our candidate for our redefined cup. Its source is the identity on A, and its target is F composed with F^{-1}. It looks a bit like the curved end of a hockey stick.

To store it for later use, we now click the ‘Theorem’ button. Writing D for the 2-diagram we have constructed, this does two things. First, it creates a 2-cell generator that we call “New Cup”, whose source is s(D), and whose target is t(D). This is the redefined cup that we can use in future expressions. Second, it creates an invertible 3-cell generator that we call “New Cup Definition”, with source given by “New Cup”, and with target given by our hockey-stick diagram D. This says what “New Cup” means in terms of our original structure. This adds the following cells to our signature:

Because “New Cup Definition” is a 3-cell, by default we see two little icons: one for its source, and one for its target. See how its source is a little picture of “New Cup”, and its target is a little picture of the hockey stick, just as we defined it.

We’re now ready to prove one of the snake equations. We start by building the snake composite, using “New Cup” for the cup, and the invertible structure of F for the cap:

Now have to prove that this equals the identity. Since equality is implemented by rewriting, we must construct a 3-diagram whose source is this snake composite, and whose target is the identity on F. To start, we click the ‘Identity’ button to convert our diagram into an identity 3-diagram. The only apparent effect this has is to add a number scroller to the ‘Slice’ area of the controls in the top-right. At the moment we can set this to ‘0’ and ‘1’, representing the source and target of our identity 3-diagram respectively. We set it to ‘1’, because we want to compose things to the target.

We now build up our proof. First, we click on the pink vertex that represents “New Cup”. This will attach our 3-cell “New Cup Definition”, replacing “New Cup” with our hockey-stick picture. By clicking-and-dragging on the diagram, we obtain the following sequence
of pictures:

           

             
 

Pictures 3 to 10 were created by attaching interchangers, and pictures 11 to 15 were created by attaching higher structure generated by the invertibility of F. In all cases, this structure was attached just by clicking-and-dragging on the appropriate vertices of the diagram. We’ve turned the snake into the identity, so we’ve finished our proof, which required 14 3-cells. By using the ‘Slice’ control in the top-right, we can navigate through the 15 slices that make up our proof, and review what we just did. Even better, turning the ‘Project’ control to the value ‘1’ tells Globular to project out the lowest dimension. This means that our entire 3-diagram proof can be viewed as a single 2-dimensional diagram, as follows:

This is just like the Morse singularity graphics used by topologists to study the structure of higher-dimensional manifolds. In this picture, the vertices are 3-cells, the lines are 2-cells, and the regions are 1-cells (in fact, every region is the 1-cell F.) By moving your mouse pointer over the different parts of the diagram, you can see what the different components are. Interchangers are represented in this projection by braidings.

Now we can do something cool: we can modify our proof, by clicking-and-dragging on the Morse projection. For example, just to the right of centre, there is a crossing, out of which emerge two long vertical lines that travel up a long way before annihilating with one another. Our proof would be much simpler if these two lines just annihilated with each other right after the interchanger. So, we click the vertex at the top of the lines, and drag it down repeatedly, until it gets to where we want it:

We’ve simplified our proof. By clicking-and-dragging some more, you can change the proof in lots of different ways, although you probably won’t get it much simpler than this. Putting the ‘Project’ control back to ‘0’, and navigating through the stages of the proof with the ‘Slice’ control as we were doing before, we can see that our proof has indeed been modified.

This project has been in development for about 18 months, so it feels great to finally launch. We hope the whole community will get clicking-and-dragging, and let us know how easy it is to use, and what other features would be useful. There are certain to still be bugs, so let us know about them too, and we’ll get right on them.


Modelling Interconnected Systems with Decorated Corelations

9 December, 2016

Here at the Simons Institute workshop on compositionality, my talk on network theory explained how to use ‘decorated cospans’ as a general model of open systems. These were invented by Brendan Fong, and are nicely explained in his thesis:

• Brendan Fong, The Algebra of Open and Interconnected Systems. (Blog article here.)

But he went further: to understand the externally observable behavior of an open system we often want to simplify a decorated cospan and get another sort of structure, which he calls a ‘decorated corelation’.

In this talk, Brendan explained decorated corelations and what they’re good for:

• Brendan Fong, Modelling interconnected systems with decorated corelations. (Talk slides here.)

Abstract. Hypergraph categories are monoidal categories in which every object is equipped with a special commutative Frobenius monoid. Morphisms in a hypergraph category can hence be represented by string diagrams in which strings can branch and split: diagrams that are reminiscent of electrical circuit diagrams. As such they provide a framework for formalising the syntax and semantics of circuit-type diagrammatic languages. In this talk I will introduce decorated corelations as a tool for building hypergraph categories and hypergraph functors, drawing examples from linear algebra and dynamical systems.