That’s what our team came up with: John Foley of Metron, my graduate students Blake Pollard and Joseph Moeller, and myself. We’re writing a couple of papers on this, and I’ll let you know when they’re ready. These blog articles are a kind of sneak preview, and a gentle introduction where you can ask questions.

For example: I’m talking a lot about networks. But what is a ‘network’, exactly?

There are many kinds. At the crudest level, we can model a network as a simple graph, which is something like this:

There are some restrictions on what counts as a simple graph. If the vertices are agents of some sort and the edges are communication channels, these restrictions imply:

• We allow at most one channel between any pair of agents, since there’s at most one edge between any two vertices of our graph.

• The channels do not have a favored direction, since there are no arrows on the edges of our graph.

• We don’t allow a channel from an agent to itself, since an edge can’t start and end at the same vertex.

For other purposes we may want to drop some or all of these restrictions. There is an appalling diversity of options! We might want to allow multiple channels between a pair of agents. For this we could use multigraphs. We might want to allow directed channels, where the sender and receiver have different capabilities: for example, signals may only be able to flow in one direction. For this we could use directed graphs. And so on.

We will also want to consider graphs with colored vertices, to specify different types of agents—or colored edges, to specify different types of channels. Even more complicated variants are likely to become important as we proceed.

To avoid sinking into a mire of special cases, we need the full power of modern mathematics. Instead of separately studying all these various kinds of networks, we need a unified notion that subsumes all of them.

To do this, the Metron team came up with something called a ‘network model’. There is a network model for simple graphs, a network model for multigraphs, a network model for directed graphs, a network model for directed graphs with 3 colors of vertex and 15 colors of edge, and more.

You should think of a network model as a *kind of network*. Not a specific network, just a *kind* of network.

Our team proved that for each network model there is an operad whose operations describe how to put together networks of that kind. We call such operads ‘network operads’.

I want to make all this precise, but today let me just show you one example. Let’s take $\mathrm{SG}$ to be the network model for simple graphs, and look at the network operad $\mathrm{O}_{\mathrm{SG}}.$ I won’t tell you what kind of thing $\mathrm{SG}$ is yet! But I’ll tell you about the operad $\mathrm{O}_{\mathrm{SG}}.$

• **Types.** Remember from last time that an operad has a set of ‘types’. For $\mathrm{O}_{\mathrm{SG}}$ this is the set of natural numbers, $\mathbb{N}.$ The reason is that a simple graph can have any number of vertices.

• **Operations.** Remember that an operad has sets of ‘operations’. In our case we have a set of operations $\mathrm{O}_{\mathrm{SG}}(n_1,\dots,n_k;n)$ for each choice of $n_1, \dots, n_k, n \in \mathbb{N}.$

An operation $f \in \mathrm{O}_{\mathrm{SG}}(n_1,\dots,n_k;n)$ is a way of taking a simple graph with $n_1$ vertices, a simple graph with $n_2$ vertices,… and so on, and sticking them together, perhaps adding new edges, to get a simple graph with $n = n_1 + \cdots + n_k$ vertices.

Let me show you an operation

$f \in \mathrm{O}_{\mathrm{SG}}(3,4,2;9)$

This will be a way of taking three simple graphs (one with 3 vertices, one with 4, and one with 2) and sticking them together, perhaps adding edges, to get one with 9 vertices.

Here’s what $f$ looks like:

It’s a simple graph with vertices numbered from 1 to 9, with the vertices in bunches: {1,2,3}, {4,5,6,7} and {8,9}. It could be any such graph. This one happens to have an edge from 3 to 6 and an edge from 1 to 2.

Here’s how we can actually use our operation. Say we have three simple graphs like this:

Then we can use our operation to stick them together and get this:

Notice that we added a new edge from 3 to 6, connecting two of our three simple graphs. We also added an edge from 1 to 2… but this had no effect, since there was already an edge there! The reason is that simple graphs have at most one edge between any two vertices.

But what if we didn’t already have an edge from 1 to 2? What if we applied our operation to the following simple graphs?

Well, now we’d get this:

This time adding the edge from 1 to 2 had an effect, since there wasn’t already an edge there!

In short, we can use this operad to stick together simple graphs, but also to add new edges *within* the simple graphs we’re sticking together!
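Here is a minimal Python sketch of how such an operation can act on simple graphs. The representation is an assumption made for illustration, not notation from the papers: a simple graph on vertices 1, …, n is a set of two-element frozensets, and the hypothetical function `apply_operation` shifts each input graph into its bunch of vertices and then takes the union with the edges of the operation. The three input graphs are made up, since the pictures are not reproduced here.

```python
from itertools import accumulate

def edge(u, v):
    """An undirected edge; a frozenset has no direction and no duplicates."""
    if u == v:
        raise ValueError("simple graphs have no self-loops")
    return frozenset((u, v))

def apply_operation(op_edges, sizes, graphs):
    """Stick together simple graphs and overlay the edges of an operation.

    op_edges: edges of a simple graph on vertices 1..sum(sizes)
    sizes:    number of vertices of each input graph, e.g. (3, 4, 2)
    graphs:   one edge set per input, each on vertices 1..sizes[i]
    """
    if len(graphs) != len(sizes):
        raise ValueError("need one graph per input")
    # Offsets 0, 3, 7 for sizes (3, 4, 2): where each bunch of vertices starts.
    offsets = [0] + list(accumulate(sizes))[:-1]
    result = set(op_edges)
    for offset, size, g in zip(offsets, sizes, graphs):
        for e in g:
            u, v = tuple(e)
            if not (1 <= u <= size and 1 <= v <= size):
                raise ValueError("vertex out of range")
            # Adding an edge that is already present changes nothing.
            result.add(edge(u + offset, v + offset))
    return result

# The operation from the text: bunches {1,2,3}, {4,5,6,7}, {8,9},
# with an edge from 3 to 6 and an edge from 1 to 2.
f = {edge(3, 6), edge(1, 2)}

# Made-up input graphs on 3, 4 and 2 vertices.
g1 = {edge(1, 2), edge(2, 3)}   # already has an edge from 1 to 2
g2 = {edge(1, 2), edge(3, 4)}
g3 = {edge(1, 2)}
glued = apply_operation(f, (3, 4, 2), [g1, g2, g3])
```

Because edges are stored in a set, applying `f` to a first graph that lacks the edge from 1 to 2 gives the same glued graph: the operation supplies that edge itself.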

When I’m telling you how we ‘actually use’ our operad to stick together graphs, I’m secretly describing an *algebra* of our operad. Remember, an operad describes ways of sticking things together, but an ‘algebra’ of the operad gives a particular specification of these things and describes how we stick them together.

Our operad has lots of interesting algebras, but I’ve just shown you the simplest one. More precisely:

• **Things.** Remember from last time that for each type, an algebra specifies a set of **things** of that type. In this example our types are natural numbers, and for each natural number $n$ I’m letting the set of things $A(n)$ consist of all simple graphs with vertices $\{1, \dots, n\}.$

• **Action.** Remember that our operad $\mathrm{O}_{\mathrm{SG}}$ should have an **action** on $A,$ meaning a bunch of maps

$\alpha \colon \mathrm{O}_{\mathrm{SG}}(n_1,\dots,n_k;n) \times A(n_1) \times \cdots \times A(n_k) \to A(n)$

I just described how this works in some examples. Some rules should hold… and they do.

To make sure you understand, try these puzzles:

**Puzzle 1.** In the example I just explained, what is the set if

**Puzzle 2.** In this example, how many elements does have?

**Puzzle 3.** In this example, how many elements does have?

**Puzzle 4.** In this example, how many elements does have?

**Puzzle 5.** In the particular algebra that I explained, how many elements does have?

Next time I’ll describe some more interesting algebras of this operad. These let us describe networks of mobile agents with range-limited communication channels!


• Complex adaptive system design (part 1), *Azimuth*, 2 October 2016.

• Complex adaptive system design (part 2), *Azimuth*, 18 October 2016.

A lot has happened since then, and I want to explain it.

I’m working with Metron Scientific Solutions to develop new techniques for designing complex networks.

The particular problem we began cutting our teeth on is a search and rescue mission where a bunch of boats, planes and drones have to locate and save people who fall overboard during a boat race in the Caribbean Sea. Subsequently the Metron team expanded the scope to other search and rescue tasks. But the real goal is to develop *very generally* applicable new ideas on designing and ‘tasking’ networks of mobile agents—that is, designing these networks and telling the agents what to do.

We’re using the mathematics of ‘operads’, in part because Spivak’s work on operads has drawn a lot of attention and raised a lot of hopes:

• David Spivak, The operad of wiring diagrams: formalizing a graphical language for databases, recursion, and plug-and-play circuits.

An operad is a bunch of operations for sticking together smaller things to create bigger ones—I’ll explain this in detail later, but that’s the core idea. Spivak described some specific operads called ‘operads of wiring diagrams’ and illustrated some of their potential applications. But when we got going on our project, we wound up using a different class of operads, which I’ll call ‘network operads’.

Here’s our dream, which we’re still trying to make into a reality:

Network operads should make it easy to build a big network from smaller ones and have every agent know what to do. You should be able to ‘slap together’ a network, throwing in more agents and more links between them, and automatically have it do something reasonable. This should be more flexible than an approach where you need to know ahead of time exactly how many agents you have, and how they’re connected, before you can tell them what to do.

You don’t want a network to malfunction horribly because you forgot to hook it up correctly. You want to focus your attention on *optimizing* the network, not getting it to work at all. And you want everything to work so smoothly that it’s easy for the network to adapt to changing conditions.

To achieve this we’re using network operads, which are certain special ‘typed operads’. So before getting into the details of our approach, I should say a bit about typed operads. And I think that will be enough for today’s post: I don’t want to overwhelm you with too much information at once.

In general, a ‘typed operad’ describes ways of sticking together things of various types to get new things of various types. An ‘algebra’ of the operad gives a particular specification of these things and the results of sticking them together. For now I’ll skip the full definition of a typed operad and only highlight the most important features. A typed operad has:

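• a set $T$ of **types**.

• collections of **operations** $O(t_1,\dots,t_n;t)$ where $t_i, t \in T.$ Here $t_1,\dots,t_n$ are the types of the **inputs**, while $t$ is the type of the **output**.

• ways to **compose** operations. Given an operation

$f \in O(t_1,\dots,t_n;t)$

and $n$ operations

$g_1 \in O(t_{11},\dots,t_{1k_1};t_1), \; \dots, \; g_n \in O(t_{n1},\dots,t_{nk_n};t_n)$

we can compose them to get

$f \circ (g_1,\dots,g_n) \in O(t_{11},\dots,t_{nk_n};t)$

These must obey some rules.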

But if you haven’t seen operads before, you’re probably reeling in horror—so I need to rush in and save you by showing you the all-important *pictures* that help explain what’s going on!

First of all, you should visualize an operation as a little gizmo like this:

It has $n$ inputs at top and one output at bottom. Each input, and the output, has a ‘type’ taken from the set of types $T.$ So, for example, if your operation takes two real numbers, adds them and spits out the closest integer, both input types would be ‘real’, while the output type would be ‘integer’.

The main thing we do with operations is compose them. Given an operation $f$ with $n$ inputs, we can compose it with $n$ operations

$g_1, g_2, \dots, g_n$

by feeding their outputs into the inputs of $f,$ like this:

The result is an operation we call

$f \circ (g_1, g_2, \dots, g_n)$

Note that the input types of $f$ have to match the output types of the $g_i$ for this to work! This is the whole point of types: *they forbid us from composing operations in ways that don’t make sense*.

This avoids certain stupid mistakes. For example, you can take the square root of a positive number, but you may not want to take the square root of a negative number, and you definitely don’t want to take the square root of a hamburger. While you can land a plane on an airstrip, you probably don’t want to land a plane on a person.
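To see the type check at work, here is a minimal Python sketch. Everything in it is an assumption made for illustration: types are plain strings, and the hypothetical class `Operation` bundles an abstract operation with what it does in one particular algebra (its `run` function), which a full treatment would keep separate. The function `compose` refuses to plug operations together unless output types match input types.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass(frozen=True)
class Operation:
    inputs: Tuple[str, ...]   # input types
    output: str               # output type
    run: Callable             # what the operation does in one chosen algebra

def compose(f, gs):
    """Feed the outputs of the operations gs into the inputs of f."""
    if len(gs) != len(f.inputs):
        raise TypeError("need one operation per input of f")
    for g, t in zip(gs, f.inputs):
        if g.output != t:
            raise TypeError(
                f"output type {g.output!r} does not match input type {t!r}")
    arities = [len(g.inputs) for g in gs]

    def run(*args):
        if len(args) != sum(arities):
            raise TypeError("wrong number of arguments")
        results, i = [], 0
        for g, k in zip(gs, arities):
            # Each g consumes its own slice of the arguments.
            results.append(g.run(*args[i:i + k]))
            i += k
        return f.run(*results)

    all_inputs = tuple(t for g in gs for t in g.inputs)
    return Operation(all_inputs, f.output, run)

# The example from the text: add two reals, return the closest integer.
add_and_round = Operation(("real", "real"), "integer",
                          lambda x, y: round(x + y))
halve = Operation(("integer",), "real", lambda n: n / 2)
identity_real = Operation(("real",), "real", lambda x: x)

h = compose(add_and_round, [halve, identity_real])
```

Here `h` takes an integer and a real and returns an integer. Trying `compose(halve, [halve])` raises a `TypeError`, because `halve` outputs a real while its own input must be an integer.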

The operations in an operad are quite abstract: they aren’t really operating *on* anything. To render them concrete, we need another idea: operads have ‘algebras’.

An algebra $A$ of the operad $O$ specifies a set of things of each type $t \in T$ such that the operations of $O$ act on these sets. A bit more precisely, an algebra consists of:

• for each type $t \in T,$ a set $A(t)$ of **things of type** $t$

• an **action** of $O$ on $A,$ that is, a collection of maps

$\alpha \colon O(t_1,\dots,t_n;t) \times A(t_1) \times \cdots \times A(t_n) \to A(t)$

obeying some rules.

In other words, an algebra turns each operation $f \in O(t_1,\dots,t_n;t)$ into a function that eats things of types $t_1,\dots,t_n$ and spits out a thing of type $t.$

When we get to designing systems with operads, the fact that the same operad can have many algebras will be useful. Our operad will have operations describing *abstractly* how to hook up networks to form larger networks. An algebra will give a specific *implementation* of these operations. We can use one algebra that’s fairly fine-grained and detailed about what the operations actually do, and another that’s less detailed. There will then be a map from the first algebra to the second, called an ‘algebra homomorphism’, that forgets some fine-grained details.

There’s a lot more to say—all this is just the mathematical equivalent of clearing my throat before a speech—but I’ll stop here for now.

And as I do—since it also takes me time to *stop* talking—I should make it clear yet again that I haven’t even given the full definition of typed operads and their algebras! Besides the laws I didn’t write down, there’s other stuff I omitted. Most notably, there’s a way to permute the inputs of an operation in an operad, and operads have identity operations, one for each type.

To see the full definition of an ‘untyped’ operad, which is really an operad with *just one type*, go here:

• Wikipedia, Operad theory.

They just call it an ‘operad’. Note that they first explain ‘non-symmetric operads’, where you can’t permute the inputs of operations, and then explain operads, where you can.

If you’re mathematically sophisticated, you can easily guess the laws obeyed by a typed operad just by looking at this article and inserting the missing types. You can also see the laws written down in Spivak’s paper, but with some different terminology: he calls types ‘objects’, he calls operations ‘morphisms’, and he calls typed operads ‘symmetric colored operads’—or once he gets going, just ‘operads’.

You can also see the definition of a typed operad in Section 2.1 here:

• Donald Yau, Operads of wiring diagrams.

What I would call a typed operad with $S$ as its set of types, he calls an ‘$S$-colored operad’.

I guess it’s already evident, but I’ll warn you that the terminology in this subject varies quite a lot from author to author: for example, a certain community calls typed operads ‘symmetric multicategories’. This is annoying at first but once you understand the subject it’s as ignorable as the fact that mathematicians have many different accents. The main thing to remember is that operads come in four main flavors, since they can either be typed or untyped, and they can either let you permute inputs or not. I’ll always be working with typed operads where you can permute inputs.

Finally, I’ll say that while the definition of operad looks lengthy and cumbersome at first, it becomes lean and elegant if you use more category theory.

Next time I’ll give you an example of an operad: the simplest ‘network operad’.


• Norbert Blum, A solution of the P versus NP problem.

Most papers that claim to solve hard math problems are wrong: that’s why these problems are considered hard. But these papers can still be fun to look at, at least if they’re not *obviously* wrong. It’s fun to hope that maybe today humanity has found another beautiful grain of truth.

I’m not an expert on the P versus NP problem, so I have no opinion on this paper. So don’t get excited: wait calmly by your radio until you hear from someone who actually works on this stuff.

I found the first paragraph interesting, though. Here it is, together with some highly non-expert commentary. Beware: everything I say could be wrong!

Understanding the power of negations is one of the most challenging problems in complexity theory. With respect to monotone Boolean functions, Razborov [12] was the first who could shown that the gain, if using negations, can be super-polynomial in comparision to monotone Boolean networks. Tardos [16] has improved this to exponential.

I guess a ‘Boolean network’ is like a machine where you feed in a string of bits and it computes new bits using the logical operations ‘and’, ‘or’ and ‘not’. If you leave out ‘not’ the Boolean network is **monotone**, since then making more inputs equal to 1, or ‘true’, is bound to make more of the output bits 1 as well. Blum is saying that including ‘not’ makes some computations vastly more efficient… but that this stuff is hard to understand.
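To make ‘monotone’ concrete, here is a small Python sketch that checks the property by brute force for a function with a single output bit. The helper `is_monotone` and the two example functions are illustrative assumptions, not anything from Blum’s paper. Majority can be built from ‘and’ and ‘or’ alone, while exclusive-or needs ‘not’.

```python
from itertools import product

def is_monotone(f, n):
    """True if raising any input of f from 0 to 1 never lowers the output.

    f takes n bits and returns one bit; all 2**n inputs are tried.
    """
    for bits in product((0, 1), repeat=n):
        for i in range(n):
            if bits[i] == 0:
                raised = bits[:i] + (1,) + bits[i + 1:]
                if f(*bits) > f(*raised):
                    return False
    return True

def majority(a, b, c):
    # Uses only 'and' and 'or'.
    return int((a and b) or (b and c) or (a and c))

def parity(a, b):
    # Exclusive or: written here with 'not'.
    return int((a and not b) or (b and not a))
```

With these definitions `is_monotone(majority, 3)` holds, while `is_monotone(parity, 2)` fails: going from inputs (0, 1) to (1, 1) turns the output from 1 to 0.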

For the characteristic function of an NP-complete problem like the clique function, it is widely believed that negations cannot help enough to improve the Boolean complexity from exponential to polynomial.

A bunch of nodes in a graph are a **clique** if each of these nodes is connected by an edge to every other. Determining whether a graph with n vertices has a clique with more than k nodes is a famous problem: the **clique decision problem**.

For example, here’s a brute-force search for a clique with at least 4 nodes:
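In code, a brute-force search could look like the following minimal Python sketch. The function name `has_clique` and the small example graph are illustrative assumptions. The idea is simply to try every set of k vertices and check that all pairs in it are joined by edges.

```python
from itertools import combinations

def has_clique(vertices, edges, k):
    """Return True if some k of the vertices are pairwise joined by edges."""
    edge_set = {frozenset(e) for e in edges}
    for candidate in combinations(vertices, k):
        if all(frozenset(pair) in edge_set
               for pair in combinations(candidate, 2)):
            return True
    return False

# A made-up graph: vertices 1..4 are all joined to each other,
# and vertex 5 hangs off vertex 4.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
```

The number of candidate sets is the binomial coefficient ‘n choose k’, which is why this approach becomes hopeless for large graphs when k grows along with n.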

The clique decision problem is **NP-complete**. This means that if you can solve it with a Boolean network whose complexity grows like some polynomial in n, then P = NP. But if you can’t, then P ≠ NP.

(Don’t ask me what the complexity of a Boolean network is; I can guess but I could get it wrong.)

I guess Blum is hinting that the best *monotone* Boolean network for solving the clique decision problem has a complexity that’s exponential in n. And then he’s saying it’s widely believed that ‘not’ gates can’t reduce the complexity to a polynomial.

Since the computation of an one-tape Turing machine can be simulated by a non-monotone Boolean network of size at most the square of the number of steps [15, Ch. 3.9], a superpolynomial lower bound for the non-monotone network complexity of such a function would imply P ≠ NP.

Now he’s saying what I said earlier: if you show it’s impossible to solve the clique decision problem with any Boolean network whose complexity grows like some polynomial in n, then you’ve shown P ≠ NP. This is how Blum intends to prove P ≠ NP.

For the monotone complexity of such a function, exponential lower bounds are known [11, 3, 1, 10, 6, 8, 4, 2, 7].

Should you trust someone who claims they’ve proved P ≠ NP, but can’t manage to get their references listed in increasing order?

But until now, no one could prove a non-linear lower bound for the nonmonotone complexity of any Boolean function in NP.

That’s a great example of how helpless we are: we’ve got all these problems whose complexity should grow faster than any polynomial, and we can’t even prove their complexity grows faster than *linear*. Sad!

An obvious attempt to get a super-polynomial lower bound for the non-monotone complexity of the clique function could be the extension of the method which has led to the proof of an exponential lower bound of its monotone complexity. This is the so-called “method of approximation” developed by Razborov [11].

I don’t know about this. All I know is that Razborov and Rudich proved a whole bunch of strategies for proving P ≠ NP can’t possibly work. These strategies are called ‘natural proofs’. Here are some friendly blog articles on their result:

• Timothy Gowers, How not to prove that P is not equal to NP, 3 October 2013.

• Timothy Gowers, Razborov and Rudich’s natural proofs argument, 7 October 2013.

From these I get the impression that what Blum calls ‘Boolean networks’ may be what other people call ‘Boolean circuits’. But I could be wrong!

Continuing:

Razborov [13] has shown that his approximation method cannot be used to prove better than quadratic lower bounds for the non-monotone complexity of a Boolean function.

So, this method is unable to prove some NP problem can’t be solved in polynomial time and thus prove P ≠ NP. Bummer!

But Razborov uses a very strong distance measure in his proof for the inability of the approximation method. As elaborated in [5], one can use the approximation method with a weaker distance measure to prove a super-polynomial lower bound for the non-monotone complexity of a Boolean function.

This reference [5] is to another paper by Blum. And in the end, he claims to use similar methods to prove that the complexity of any Boolean network that solves the clique decision problem must grow faster than a polynomial.

So, if you’re trying to check his proof that P ≠ NP, you should probably start by checking that other paper!

The picture below, by Behnam Esfahbod on Wikicommons, shows the two possible scenarios. The one at left is the one Norbert Blum claims to have shown we’re in.


• Applied Algebraic Topology 2017, August 8-12, 2017, Hokkaido University, Sapporo, Japan.

Unfortunately these notes will not give you a good summary of the talks—and almost nothing about applications of algebraic topology. Instead, I seem to be jotting down random cool math facts that I’m learning and don’t want to forget.


People have been using algebraic topology in data analysis these days, so we’re starting to see conferences like this:

• Applied Algebraic Topology 2017, August 8-12, 2017, Hokkaido University, Sapporo, Japan.

I’m giving the first talk at this one. I’ve done a lot of work on applied category theory, but only a bit on applied algebraic topology. It was tempting to smuggle in some categories, operads and props under the guise of algebraic topology. But I decided it would be more useful, as a kind of prelude to the conference, to say a bit about the overall history of algebraic topology, and its inner logic: how it was inevitably driven to categories, and then 2-categories, and then ∞-categories.

This may be the least ‘applied’ of all the talks at this conference, but I’m hoping it will at least trigger some interesting thoughts. We don’t want the ‘applied’ folks to forget the grand view that algebraic topology has to offer!

Here are my talk slides:

• The rise and spread of algebraic topology.

**Abstract.** As algebraic topology becomes more important in applied mathematics it is worth looking back to see how this subject has changed our outlook on mathematics in general. When Noether moved from working with Betti numbers to homology groups, she forced a new outlook on topological invariants: namely, they are often functors, with two invariants counting as ‘the same’ if they are naturally isomorphic. To formalize this it was necessary to invent categories, and to formalize the analogy between natural isomorphisms between functors and homotopies between maps it was necessary to invent 2-categories. These are just the first steps in the ‘homotopification’ of mathematics, a trend in which algebra more and more comes to resemble topology, and ultimately abstract ‘spaces’ (for example, homotopy types) are considered as fundamental as sets. It is natural to wonder whether topological data analysis is a step in the spread of these ideas into applied mathematics, and how the importance of ‘robustness’ in applications will influence algebraic topology.

I thank Mike Shulman for some help on model categories and quasicategories. Any mistakes are, of course, my own fault.


where the yellow circles are different kinds of chemicals and the aqua boxes are different reactions. The purple dots in the sets *X* and *Y* are ‘inputs’ and ‘outputs’, where certain kinds of chemicals can flow in or out.

Our paper on this stuff just got accepted, and it should appear soon:

• John Baez and Blake Pollard, A compositional framework for reaction networks, to appear in *Reviews in Mathematical Physics*.

But thanks to the arXiv, you don’t have to wait: *beat the rush, click and download now!*

Blake and I gave talks about this stuff in Luxembourg this June, at a nice conference called Dynamics, thermodynamics and information processing in chemical networks. So, if you’re the sort who prefers talk slides to big scary papers, you can look at those:

• John Baez, The mathematics of open reaction networks.

• Blake Pollard, Black-boxing open reaction networks.

But I want to say here what we do in our paper, because it’s pretty cool, and it took a few years to figure it out. To get things to work, we needed my student Brendan Fong to invent the right category-theoretic formalism: ‘decorated cospans’. But we also had to figure out the right way to think about open dynamical systems!

In the end, we figured out how to first ‘gray-box’ an open reaction network, converting it into an open dynamical system, and then ‘black-box’ it, obtaining the relation between input and output flows and concentrations that holds in steady state. The first step extracts the dynamical behavior of an open reaction network; the second extracts its static behavior. And both these steps are functors!

Lawvere had the idea that the process of assigning ‘meaning’ to expressions could be seen as a functor. This idea has caught on in theoretical computer science: it’s called ‘functorial semantics’. So, what we’re doing here is applying functorial semantics to chemistry.

Now Blake has passed his thesis defense based on this work, and he just needs to polish up his thesis a little before submitting it. This summer he’s doing an internship at the Princeton branch of the engineering firm Siemens. He’s working with Arquimedes Canedo on ‘knowledge representation’.

But I’m still eager to dig deeper into open reaction networks. They’re a small but nontrivial step toward my dream of a mathematics of living systems. My working hypothesis is that living systems seem ‘messy’ to physicists because they operate at a higher level of abstraction. That’s what I’m trying to explore.

Here’s the idea of our paper.

Reaction networks are a very general framework for describing processes where entities interact and transform into other entities. While they first showed up in chemistry, and are often called ‘chemical reaction networks’, they have lots of other applications. For example, a basic model of infectious disease, the ‘SIRS model’, is described by this reaction network:

We see here three types of entity, called **species**:

• $S$: **susceptible**,

• $I$: **infected**,

• $R$: **resistant**.

We also have three ‘reactions’:

• $S + I \to 2I$: **infection**, in which a susceptible individual meets an infected one and becomes infected;

• $I \to R$: **recovery**, in which an infected individual gains resistance to the disease;

• $R \to S$: **loss of resistance**, in which a resistant individual becomes susceptible.

In general, a reaction network involves a finite set of species, but reactions go between **complexes**, which are finite linear combinations of these species with natural number coefficients. The reaction network is a directed graph whose vertices are certain complexes and whose edges are called **reactions**.

If we attach a positive real number called a **rate constant** to each reaction, a reaction network determines a system of differential equations saying how the concentrations of the species change over time. This system of equations is usually called the **rate equation**. In the example I just gave, the rate equation is

$\frac{dS}{dt} = r_3 R - r_1 S I$

$\frac{dI}{dt} = r_1 S I - r_2 I$

$\frac{dR}{dt} = r_2 I - r_3 R$

Here $r_1, r_2$ and $r_3$ are the rate constants for infection, recovery and loss of resistance, and $S, I, R$ now stand for the concentrations of the three species, which are treated in a continuum approximation as smooth functions of time:

$S, I, R \colon \mathbb{R} \to [0,\infty)$

The rate equation can be derived from the **law of mass action**, which says that any reaction occurs at a rate equal to its rate constant times the product of the concentrations of the species entering it as inputs.
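The law of mass action is easy to turn into code. Here is a minimal Python sketch, under assumptions made for illustration: a reaction is a triple of input complex, output complex and rate constant, complexes are dictionaries from species names to natural-number coefficients, and the function name `rate_equation` and the numerical rate constants are made up. It computes the time derivatives of all concentrations for the SIRS model at one moment.

```python
from math import prod

def rate_equation(reactions, conc):
    """Time derivatives of concentrations under the law of mass action.

    reactions: list of (inputs, outputs, rate_constant), where inputs and
               outputs map species names to natural-number coefficients
    conc:      dict mapping species names to current concentrations
    """
    deriv = {s: 0.0 for s in conc}
    for inputs, outputs, k in reactions:
        # Rate = rate constant times the product of input concentrations,
        # one factor for each time a species enters as an input.
        rate = k * prod(conc[s] ** m for s, m in inputs.items())
        for s, m in inputs.items():
            deriv[s] -= m * rate      # inputs are used up
        for s, m in outputs.items():
            deriv[s] += m * rate      # outputs are produced
    return deriv

# The SIRS model, with made-up rate constants.
sirs = [
    ({"S": 1, "I": 1}, {"I": 2}, 0.5),   # infection
    ({"I": 1}, {"R": 1}, 0.2),           # recovery
    ({"R": 1}, {"S": 1}, 0.1),           # loss of resistance
]
d = rate_equation(sirs, {"S": 0.9, "I": 0.1, "R": 0.0})
```

In this model every reaction turns one individual into one individual, so the three derivatives in `d` add up to zero: the total population is conserved.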

But a reaction network is more than just a stepping-stone to its rate equation! Interesting qualitative properties of the rate equation, like the existence and uniqueness of steady state solutions, can often be determined just by looking at the reaction network, regardless of the rate constants. Results in this direction began with Feinberg and Horn’s work in the 1960’s, leading to the Deficiency Zero and Deficiency One Theorems, and more recently to Craciun’s proof of the Global Attractor Conjecture.

In our paper, Blake and I present a ‘compositional framework’ for reaction networks. In other words, we describe rules for building up reaction networks from smaller pieces, in such a way that the rate equation of the whole can be figured out from those of the pieces. But this framework requires that we view reaction networks in a somewhat different way, as ‘Petri nets’.

Petri nets were invented by Carl Petri in 1939, when he was just a teenager, for the purposes of chemistry. Much later, they became popular in theoretical computer science, biology and other fields. A Petri net is a bipartite directed graph: vertices of one kind represent species, vertices of the other kind represent reactions. The edges into a reaction specify which species are inputs to that reaction, while the edges out specify its outputs.

You can easily turn a reaction network into a Petri net and vice versa. For example, the reaction network above translates into this Petri net:

Beware: there are a lot of different names for the same thing, since the terminology comes from several communities. In the Petri net literature, species are called **places** and reactions are called **transitions**. In fact, Petri nets are sometimes called ‘place-transition nets’ or ‘P/T nets’. On the other hand, chemists call them ‘species-reaction graphs’ or ‘SR-graphs’. And when each reaction of a Petri net has a rate constant attached to it, it is often called a ‘stochastic Petri net’.

While some qualitative properties of a rate equation can be read off from a reaction network, others are more easily read from the corresponding Petri net. For example, properties of a Petri net can be used to determine whether its rate equation can have multiple steady states.

Petri nets are also better suited to a compositional framework. The key new concept is an ‘open’ Petri net. Here’s an example:

The box at left is a set *X* of ‘inputs’ (which happens to be empty), while the box at right is a set *Y* of ‘outputs’. Both inputs and outputs are points at which entities of various species can flow in or out of the Petri net. We say the open Petri net **goes from X to Y**. In our paper, we show how to treat it as a morphism in a category of open Petri nets.

Given an open Petri net with rate constants assigned to each reaction, our paper explains how to get its ‘open rate equation’. It’s just the usual rate equation with extra terms describing inflows and outflows. The above example has this open rate equation:

Here are arbitrary smooth functions describing outflows as a function of time.

Given another open Petri net for example this:

it will have its own open rate equation, in this case

Here are arbitrary smooth functions describing inflows as a function of time. Now for a tiny bit of category theory: we can **compose** and by gluing the outputs of to the inputs of This gives a new open Petri net as follows:

But this open Petri net has an *empty* set of inputs, and an empty set of outputs! So it amounts to an ordinary Petri net, and its open rate equation is a rate equation of the usual kind. Indeed, this is the Petri net we have already seen.

As it turns out, there’s a systematic procedure for combining the open rate equations for two open Petri nets to obtain that of their composite. In the example we’re looking at, we just identify the outflows of with the inflows of (setting and ) and then add the right hand sides of their open rate equations.

The first goal of our paper is to precisely describe this procedure, and to prove that it defines a functor from the category of open Petri nets with rate constants to a category where the morphisms are ‘open dynamical systems’. By a dynamical system, we essentially mean a vector field on $\mathbb{R}^n,$ which can be used to define a system of first-order ordinary differential equations in $n$ variables. An example is the rate equation of a Petri net. An *open* dynamical system allows for the possibility of extra terms that are arbitrary functions of time, such as the inflows and outflows in an open rate equation.

In fact, we prove that both of these categories are symmetric monoidal and that this functor is a symmetric monoidal functor. To do this, we use Brendan Fong’s theory of ‘decorated cospans’.

Decorated cospans are a powerful general tool for describing open systems. A **cospan** in any category is just a diagram like this:

$X \stackrel{i}{\longrightarrow} S \stackrel{o}{\longleftarrow} Y$

We are mostly interested in cospans in the category of finite sets and functions between these. The set $S,$ the so-called **apex** of the cospan, is the set of states of an open system. The sets $X$ and $Y$ are the **inputs** and **outputs** of this system. The **legs** of the cospan, meaning the morphisms $i \colon X \to S$ and $o \colon Y \to S,$ describe how these inputs and outputs are included in the system. In our application, $S$ is the set of species of a Petri net.

For example, we may take this reaction network:

treat it as a Petri net with :

and then turn that into an open Petri net by choosing any finite sets of inputs and outputs and maps from each of them to the set of species, for example like this:

(Notice that the maps including the inputs and outputs into the states of the system need not be one-to-one. This is technically useful, but it introduces some subtleties that I don’t feel like explaining right now.)

An open Petri net can thus be seen as a cospan of finite sets whose apex is ‘decorated’ with some extra information, namely a Petri net with the apex as its set of species. Fong’s theory of decorated cospans lets us define a category with open Petri nets as morphisms, with composition given by gluing the outputs of one open Petri net to the inputs of another.
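The gluing can be sketched at the level of finite sets alone. The following Python is a minimal illustration under stated assumptions, not the construction from the paper: a cospan is a hypothetical triple `(i, S, o)` where `S` is a set and `i`, `o` are dictionaries sending inputs and outputs to elements of `S`. It ignores the decorations, so it shows only how the sets of species are merged, not how the reactions are combined. The composite apex is a pushout: the disjoint union of the two apexes, with each shared point’s two images identified.

```python
def compose_cospans(c1, c2):
    """Compose cospans X -> S <- Y and Y -> S2 <- Z of finite sets."""
    i1, s1, o1 = c1
    i2, s2, o2 = c2
    if set(o1) != set(i2):
        raise ValueError("outputs of the first must be inputs of the second")
    # Tag elements with 1 or 2 so the union of the apexes is disjoint.
    parent = {(1, s): (1, s) for s in s1}
    parent.update({(2, s): (2, s) for s in s2})

    def find(x):
        # Follow parent links to the representative of x's class.
        while parent[x] != x:
            x = parent[x]
        return x

    # Identify o1(y) with i2(y) for every shared point y.
    for y in o1:
        a, b = find((1, o1[y])), find((2, i2[y]))
        if a != b:
            parent[a] = b
    apex = {find(x) for x in parent}
    new_i = {x: find((1, s)) for x, s in i1.items()}
    new_o = {z: find((2, s)) for z, s in o2.items()}
    return new_i, apex, new_o

# Made-up example: no inputs, species A and B, both exposed as outputs,
# glued to a cospan that sends both shared points to the same species C.
first = ({}, {"A", "B"}, {"y1": "A", "y2": "B"})
second = ({"y1": "C", "y2": "C"}, {"C", "D"}, {"z": "D"})
new_i, apex, new_o = compose_cospans(first, second)
```

In this example A, B and C all get identified, so the composite has just two species. This is the kind of thing that can happen when the legs of a cospan are not one-to-one.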

We call the functor

$$\diamond \colon \mathrm{RxNet} \to \mathrm{Dynam}$$

**gray-boxing** because it hides some but not all the internal details of an open Petri net. (In the paper we draw it as a gray box, but that’s too hard here!)

We can go further and **black-box** an open dynamical system. This amounts to recording only the relation between input and output variables that must hold in steady state. We prove that black-boxing gives a functor

$$\square \colon \mathrm{Dynam} \to \mathrm{SemiAlgRel}$$

(yeah, the box here should be black, and in our paper it is). Here $\mathrm{SemiAlgRel}$ is a category where the morphisms are **semi-algebraic** relations between real vector spaces, meaning relations defined by polynomial equations and inequalities. This relies on the fact that our dynamical systems involve **algebraic** vector fields, meaning those whose components are polynomials; more general dynamical systems would give more general relations.

That semi-algebraic relations are closed under composition is a nontrivial fact, a spinoff of the **Tarski–Seidenberg theorem**. This says that a subset of $\mathbb{R}^{n+1}$ defined by polynomial equations and inequalities can be projected down onto $\mathbb{R}^n$, and the resulting set is still definable in terms of polynomial identities and inequalities. This wouldn’t be true if we didn’t allow inequalities. It’s neat to see this theorem, important in mathematical logic, showing up in chemistry!
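Here is a standard illustration of why inequalities are needed, sketched as an aside rather than as an example from the paper: project a parabola in $\mathbb{R}^2$ onto one axis.

```latex
S = \{ (x, y) \in \mathbb{R}^2 : x = y^2 \}
\quad\Longrightarrow\quad
\pi(S) = \{ x \in \mathbb{R} : \exists y,\; x = y^2 \}
       = \{ x \in \mathbb{R} : x \ge 0 \}
```

The set $S$ is cut out by a single polynomial equation, but its projection is a half-line. A subset of $\mathbb{R}$ defined by polynomial equations alone is either finite or all of $\mathbb{R}$, so describing the projection requires the inequality $x \ge 0$.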

Okay, now you’re ready to read our paper! Here’s how it goes:

In Section 2 we review and compare reaction networks and Petri nets. In Section 3 we construct a symmetric monoidal category $\mathrm{RNet}$ where an object is a finite set and a morphism is an open reaction network (or more precisely, an isomorphism class of open reaction networks). In Section 4 we enhance this construction to define a symmetric monoidal category $\mathrm{RxNet}$ where the transitions of the open reaction networks are equipped with rate constants. In Section 5 we explain the open dynamical system associated to an open reaction network, and in Section 6 we construct a symmetric monoidal category $\mathrm{Dynam}$ of open dynamical systems. In Section 7 we construct the gray-boxing functor

$$\diamond \colon \mathrm{RxNet} \to \mathrm{Dynam}$$

In Section 8 we construct the black-boxing functor

$$\square \colon \mathrm{Dynam} \to \mathrm{SemiAlgRel}$$

We show both of these are symmetric monoidal functors.

Finally, in Section 9 we fit our results into a larger ‘network of network theories’. This is where various results in various papers I’ve been writing in the last few years start assembling to form a big picture! But this picture needs to grow….


Who do you trust: the liar or the hypocrite?

Some people like to accuse those who are worried about climate change of being “hypocrites”. Why? Because we still fly around in planes, drive cars and so on.

What’s the argument? Could it be this?

*“If even those folks who claim there’s a problem aren’t willing to do anything about it, it must not really be a problem.”*

That argument is invalid. Say we have a married couple who both smoke. The husband says “we should quit smoking.” But he keeps smoking. Does this mean that it’s okay to smoke?

Or suppose he says “*you* should quit smoking”, but keeps on smoking himself. That would be infuriating. But it doesn’t make the statement less true.

Indeed, our civilization is addicted to burning carbon. It’s a lot like being addicted to nicotine. Addiction leads people to say one thing and do another. You *know* you should change your behavior—but you don’t have the will power. Or you do for a while… but then you lapse.

I see this in myself. I try to stop taking airplane flights, but like most successful scientists I get lots of invitations to conferences, with free flights to fun places. It’s hard to resist. It’s like offering cigarettes to someone who is trying to quit. I can resist nine times and cave in on the tenth! I can “relapse” for months and then come to my senses.

In fact the accusation of hypocrisy is not about the facts of climate change. It’s about choosing a social group:

*“The people who want you to take climate change seriously are hypocrites. Don’t be a sucker. Don’t let them boss you around. Join us instead.”*

This takes advantage of a psychological fact: *most of us prefer liars to hypocrites*. A lie is forgivable. But hypocrisy—someone publicly saying you should do something when they don’t themselves—is not.

There are studies about this:

• Association for Psychological Science, We dislike hypocrites because they deceive us, 30 January 2017.

The title of this article is wrong. Liars also deceive us. We hate hypocrites for other reasons.

We’re averse to hypocrites because their disavowal of bad behavior sends a false signal, misleading us into thinking they’re virtuous when they’re not, according to new findings in *Psychological Science*, a journal of the Association for Psychological Science. The research shows that people dislike hypocrites more than those who openly admit to engaging in a behavior that they disapprove of.

“People dislike hypocrites because they unfairly use condemnation to gain reputational benefits and appear virtuous at the expense of those who they are condemning–when these reputational benefits are in fact undeserved,” explains psychological scientist Jillian Jordan of Yale University, first author on the research.

Intuitively, it seems that we might dislike hypocrites because their word is inconsistent with their behavior, because they lack the self-control to behave according to their own morals, or because they deliberately engage in behaviors that they know to be morally wrong. All of these explanations seem plausible, but the new findings suggest that it’s the misrepresentation of their moral character that really raises our ire.

In an online study with 619 participants, Jordan and Yale colleagues Roseanna Sommers, Paul Bloom, and David G. Rand presented each participant with four scenarios about characters engaging in possible moral transgressions: a member of a track team using performance-enhancing drugs, a student cheating on a take-home chemistry exam, an employee failing to meet a deadline on a team project, and a member of a hiking club who engaged in infidelity.

In each scenario, participants read about a conversation involving moral condemnation of a transgression. The researchers varied whether the condemnation came from a “target character” (who subjects would later evaluate) or somebody else, as well as whether the scenario provided direct information about the target character’s own moral behavior. Participants then evaluated how trustworthy and likeable the target character was, as well as the likelihood that the target character would engage in the transgression.

The results showed that participants viewed the target more positively when he or she condemned the bad behavior in the scenario, but only when they had no information about how the character actually behaved. This suggests that we tend to interpret condemnation as a signal of moral behavior in the absence of direct information.

A second online study showed that condemning bad behavior conveyed a greater reputational boost for the character than directly stating that he or she didn’t engage in the behavior.

“Condemnation can act as a stronger signal of one’s own moral goodness than a direct statement of moral behavior,” the researchers write.

And additional data suggest that people dislike hypocrites even more than they dislike liars. In a third online study, participants had a lower opinion of a character who illegally downloaded music when he or she condemned the behavior than when he or she directly denied engaging in it.

I believe the accusation of hypocrisy is trying to set up a binary choice:

*“Whose side are you on? Those hypocrites who say climate change is a problem and try to get you to make sacrifices, while they don’t? Or us liars, who say there’s no problem and your behavior is fine?”*

Of course, being liars, they leave out the word “liars”.

One way out is to realize it’s not a binary choice.

There’s a third position: the **honest hypocrite**.

Perhaps the most critical piece of evidence for the theory of hypocrisy as false signaling is that people disliked hypocrites more than so-called “honest hypocrites.” In a fourth online study, the researchers tested perceptions of “honest hypocrites,” who—like traditional hypocrites—condemn behaviors that they engage in, but who also admit that they sometimes commit those behaviors.

“The extent to which people forgive honest hypocrites was striking to us,” says Jordan. “These honest hypocrites are seen as no worse than people who commit the same transgressions but keep their mouths shut and refrain from judging others for doing the same — suggesting that the entirety of our dislike for hypocrites can be attributed to the fact that they falsely signal their virtue.”

There’s also a fourth position: the non-liar, non-hypocrite. That’s even better. But sometimes, when we need to take collective action, we should listen to the honest hypocrite, who tells us that we should all take action, but admits he’s not doing it yet.

And now, here’s a great example of someone trying to take advantage of our hatred of hypocrites. Pay careful attention to how she cleverly tries to manipulate you! By the end you’ll feel different than when you started. If she were trying to get you to smoke, by the end you’d light up a cigarette and feel proud of yourself.

The Hypocrisy of Climate Change Advocates

Julie Kelly

So according to all the hysterical people, President-elect Donald Trump has appointed the most climate denier cabinet ever. As cabinet confirmation hearings get underway, expect to hear the charge “climate denier!” a lot.

For those of you who don’t know what a climate denier is, it means you either challenge, question or flat-out reject the idea that the planet is warming due to human activity. In the scientific world and in the world of international liberal groupthink (but I repeat myself), this is blasphemy. Should you remotely doubt the dubious models, unrealized dire predictions, changing goal posts or flawed data related to climate science, you are not just stupid according to these folks, but you are on par with those who deny the Holocaust.

Even people who believe in manmade climate change (or AGW, anthropogenic global warming) have been excommunicated from the climate tribe for raising any concern about climate science. Last month, Roger Pielke, Jr. wrote a revealing op-ed in the Wall Street Journal about how he became a target of the climate junta for saying there was no connection between weather disasters and climate change. Although Pielke believes in AGW and even supports a carbon tax to mitigate its impact, his scrutiny made him a target of powerful folks in Congress, the media and even the White House.

The first time I was called a climate denier was a few years ago, after I started writing about agricultural biotechnology or GMOs. The charge was an attempt to undermine my credibility on supporting genetic engineering: the line of attack was, if you don’t believe the science and consensus about man-made global warming, you are a scientific illiterate who has no business speaking in defense of other scientific issues like biotechnology. This was often dished out by climate change pushers who also oppose GMOs because they are anti-capitalist, anti-corporate ideologues (Bernie Sanders could be the poster child for this).

As I did more research on climate change, I learned one important thing: being a climate change believer means never having to say you’re sorry, or at least never making any major sacrifice to your lifestyle that would mitigate the pending doom you are so preoccupied with (but, sea ice!). You can go along with climate change dogma and do virtually nothing about it except recycle your newspapers while self-righteously calling the other side names. From the Pope to the president to the smug suburban mom, climate adherents live in glass houses that function thanks to evil stuff like oil and gas while throwing rocks at us so-called deniers.

So who are the real deniers: those who are reasonably skeptical about climate change or those who give lots of lip service to it while living a lifestyle totally inimical to every tenet of the climate change creed?

To that end, you might be a climate change denier if:

You are the Holy Father of the largest denomination of the Christian faith who calls climate change “one of the principal challenges facing humanity in our day” and that coal, oil and gas must be replaced “without delay” yet lives a palatial lifestyle powered by fossil fuels.

You are the president of the United States who tried to ban fracking on public land because it emits greenhouse gases but then takes credit for cutting “dependence on foreign oil by more than half” thanks to fracking.

You are a presidential candidate whose primary message is blasting big corporations from Exxon to Monsanto for destroying the planet but then demands a private jet to make meaningless campaign appearances on behalf of the woman who beat you so you can keep getting attention for yourself.

You are a movie star who works in one of the most energy-intensive and frivolous industries but now earns fame by leading protests against fracking and demands the country live on 100 percent renewables by 2050 then jets your family off from Manhattan to Australia on a jumbo jet to take pictures of the Great Barrier Reef.

You are Robert Kennedy, Jr.

You drive a Tesla but don’t know the electricity comes from a grid supported by fossil fuels.

You are a legislator who pushes solar panels and wind turbines without having the slightest clue how much energy and materials — like steel, concrete, diesel fuel, fiberglass and plastic — are needed to manufacture them.

You are Leonardo DiCaprio.

You are a suburban mom who looks down at other moms who don’t care/know/believe in climate change but you spend the day driving your privileged kids around in a pricy SUV and have two air-conditioners in your 6,000 square-foot house.

You oppose nuclear energy and/or genetically engineered crops.

You eat meat because meat production allegedly emits about 14.5 percent of greenhouse gases or some made-up number according to the United Nations.

You eat any sort of food because agriculture uses all kinds of climate polluting energy not to mention the big carbon footprint to process, package, ship and deliver that food to your local Whole Foods.

You are John Kerry.

So if you live off the grid, never fly in an airplane and don’t eat, then you can call me a denier. For the rest of you, please zip it. You deny climate change by your actions because you contribute daily to the very greenhouse gases you contend are destroying the planet. I’d rather be a denier than a hypocrite any day.


• Erica Klarreich, In game theory, no clear path to equilibrium, *Quanta*, 18 July 2017.

Economists like the concept of ‘Nash equilibrium’, but it’s problematic in some ways. This matters for society at large.

In a Nash equilibrium for a multi-player game, no player can improve their payoff by unilaterally changing their strategy. This doesn’t mean everyone is happy: it’s possible to be trapped in a Nash equilibrium where everyone is miserable, because anyone changing their strategy unilaterally would be even *more* miserable. (Think ‘global warming’.)
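Here is a minimal sketch of that definition in Python, restricted to pure strategies (Nash's theorem is about mixed strategies, which this does not search). The function name `pure_nash_equilibria` and the prisoner's dilemma payoffs are my own illustrative choices, not something from Klarreich's article.

```python
from itertools import product


def pure_nash_equilibria(payoffs, num_actions):
    """Return all pure-strategy profiles where no player gains by deviating alone.

    payoffs maps each action profile (a tuple, one action per player)
    to a tuple of payoffs, one per player.
    """
    equilibria = []
    for profile in product(*(range(n) for n in num_actions)):
        stable = True
        for player, n in enumerate(num_actions):
            current = payoffs[profile][player]
            for alt in range(n):
                # Change only this player's action, keeping the others fixed.
                deviated = profile[:player] + (alt,) + profile[player + 1:]
                if payoffs[deviated][player] > current:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria


# Prisoner's dilemma: action 0 = cooperate, 1 = defect (illustrative payoffs).
PRISONERS_DILEMMA = {
    (0, 0): (-1, -1),
    (0, 1): (-3, 0),
    (1, 0): (0, -3),
    (1, 1): (-2, -2),
}

print(pure_nash_equilibria(PRISONERS_DILEMMA, (2, 2)))  # [(1, 1)]
```

The only equilibrium found is mutual defection, where both players get -2, even though mutual cooperation would give each of them -1. That is the "everyone is miserable" trap in miniature.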

The great thing about Nash equilibria is that their meaning is easy to fathom, and they exist. John Nash won a Nobel prize for a paper proving that they exist. His paper was less than one page long. But he proved the existence of Nash equilibria for arbitrary multi-player games using a nonconstructive method: a fixed point theorem that doesn’t actually tell you how to *find* the equilibrium!

Given this, it’s not surprising that Nash equilibria can be hard to find. Last September a paper came out making this precise, in a strong way:

• Yakov Babichenko and Aviad Rubinstein, Communication complexity of approximate Nash equilibria.

The authors show there’s no guaranteed method for players to find even an approximate Nash equilibrium unless they tell each other almost everything about their preferences. This makes the Nash equilibrium prohibitively difficult to find when there are lots of players… *in general*. There are particular games where it’s not difficult, and that makes these games important: for example, if you’re trying to run a government well. (A laughable notion these days, but still one can hope.)

Klarreich’s article in *Quanta* gives a nice readable account of this work and also a more practical alternative to the concept of Nash equilibrium. It’s called a ‘correlated equilibrium’, and it was invented by the mathematician Robert Aumann in 1974. You can see an attempt to define it here:

• Wikipedia, Correlated equilibrium.

The precise mathematical definition near the start of this article is a pretty good example of how you shouldn’t explain something: it contains a big fat equation containing symbols not mentioned previously, and so on. By thinking about it for a while, I was able to fight my way through it. Someday I should improve it—and someday I should explain the idea here! But for now, I’ll just quote this passage, which roughly explains the idea in words:

The idea is that each player chooses their action according to their observation of the value of the same public signal. A strategy assigns an action to every possible observation a player can make. If no player would want to deviate from the recommended strategy (assuming the others don’t deviate), the distribution is called a correlated equilibrium.

According to Erica Klarreich it’s a useful notion. She even makes it sound revolutionary:

This might at first sound like an arcane construct, but in fact we use correlated equilibria all the time—whenever, for example, we let a coin toss decide whether we’ll go out for Chinese or Italian, or allow a traffic light to dictate which of us will go through an intersection first.

In [some] examples, each player knows exactly what advice the “mediator” is giving to the other player, and the mediator’s advice essentially helps the players coordinate which Nash equilibrium they will play. But when the players don’t know exactly what advice the others are getting—only how the different kinds of advice are correlated with each other—Aumann showed that the set of correlated equilibria can contain more than just combinations of Nash equilibria: it can include forms of play that aren’t Nash equilibria at all, but that sometimes result in a more positive societal outcome than any of the Nash equilibria. For example, in some games in which cooperating would yield a higher total payoff for the players than acting selfishly, the mediator can sometimes beguile players into cooperating by withholding just what advice she’s giving the other players. This finding, Myerson said, was “a bolt from the blue.”

(Roger Myerson is an economics professor at the University of Chicago who won a Nobel prize for his work on game theory.)

And even though a mediator can give many different kinds of advice, the set of correlated equilibria of a game, which is represented by a collection of linear equations and inequalities, is more mathematically tractable than the set of Nash equilibria. “This other way of thinking about it, the mathematics is so much more beautiful,” Myerson said.

While Myerson has called Nash’s vision of game theory “one of the outstanding intellectual advances of the 20th century,” he sees correlated equilibrium as perhaps an even more natural concept than Nash equilibrium. He has opined on numerous occasions that “if there is intelligent life on other planets, in a majority of them they would have discovered correlated equilibrium before Nash equilibrium.”

When it comes to repeated rounds of play, many of the most natural ways that players could choose to adapt their strategies converge, in a particular sense, to correlated equilibria. Take, for example, “regret minimization” approaches, in which before each round, players increase the probability of using a given strategy if they regret not having played it more in the past. Regret minimization is a method “which does bear some resemblance to real life — paying attention to what’s worked well in the past, combined with occasionally experimenting a bit,” Roughgarden said.

(Tim Roughgarden is a theoretical computer scientist at Stanford University.)

For many regret-minimizing approaches, researchers have shown that play will rapidly converge to a correlated equilibrium in the following surprising sense: after maybe 100 rounds have been played, the game history will look essentially the same as if a mediator had been advising the players all along. It’s as if “the [correlating] device was somehow implicitly found, through the interaction,” said Constantinos Daskalakis, a theoretical computer scientist at the Massachusetts Institute of Technology.

As play continues, the players won’t necessarily stay at the same correlated equilibrium — after 1,000 rounds, for instance, they may have drifted to a new equilibrium, so that now their 1,000-game history looks as if it had been guided by a different mediator than before. The process is reminiscent of what happens in real life, Roughgarden said, as societal norms about which equilibrium should be played gradually evolve.

In the kinds of complex games for which Nash equilibrium is hard to reach, correlated equilibrium is “the natural leading contender” for a replacement solution concept, Nisan said.
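To get a feel for regret minimization, here is a rough sketch of one simple rule, often called regret matching: each player picks an action with probability proportional to how much they regret not having always played it. This is my own simplified version, and I make no claim that it is the exact procedure studied in the work Klarreich describes. The function name, the seed and the game of Chicken payoffs are all illustrative assumptions.

```python
import random


def regret_matching(payoffs, num_actions, rounds, seed=0):
    """Simulate regret matching and return the empirical distribution of play."""
    rng = random.Random(seed)
    players = range(len(num_actions))
    # regrets[p][a]: total extra payoff player p would have earned so far
    # by playing action a in every round, holding the others' play fixed.
    regrets = [[0.0] * n for n in num_actions]
    counts = {}
    for _ in range(rounds):
        profile = []
        for p in players:
            positive = [max(r, 0.0) for r in regrets[p]]
            if sum(positive) > 0:
                action = rng.choices(range(num_actions[p]), weights=positive)[0]
            else:
                action = rng.randrange(num_actions[p])  # no regrets: pick at random
            profile.append(action)
        profile = tuple(profile)
        counts[profile] = counts.get(profile, 0) + 1
        for p in players:
            got = payoffs[profile][p]
            for alt in range(num_actions[p]):
                deviated = profile[:p] + (alt,) + profile[p + 1:]
                regrets[p][alt] += payoffs[deviated][p] - got
    return {profile: c / rounds for profile, c in counts.items()}


# Game of Chicken: action 0 = dare, 1 = chicken out (illustrative payoffs).
CHICKEN = {
    (0, 0): (0, 0),
    (0, 1): (7, 2),
    (1, 0): (2, 7),
    (1, 1): (6, 6),
}

history = regret_matching(CHICKEN, (2, 2), rounds=1000)
for profile in sorted(history):
    print(profile, history[profile])
```

The printed frequencies are the "game history" in the sense of the quote: a probability distribution on action profiles, which is the same kind of object as a mediator's advice.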

As Klarreich hints, you can find correlated equilibria using a technique called linear programming. That was proved here, I think:

• Christos H. Papadimitriou and Tim Roughgarden, Computing correlated equilibria in multi-player games, *J. ACM* **55** (2008), 14:1-14:29.
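To see what "a collection of linear equations and inequalities" means in practice, here is a small checker, not a solver and not the algorithm from the paper above. Given a probability distribution on action profiles, it tests the incentive constraints: for each player and each recommended action, switching to any other action must not increase the expected payoff. The function name and the game of Chicken payoffs are my own illustrative assumptions.

```python
from fractions import Fraction


def is_correlated_equilibrium(payoffs, dist, num_actions):
    """Check the linear incentive constraints for a correlated equilibrium.

    payoffs[profile][player] is the payoff to `player` at an action profile.
    dist[profile] is the probability that the mediator recommends that profile
    (profiles that are left out have probability zero).
    """
    if sum(dist.values()) != 1 or any(p < 0 for p in dist.values()):
        return False
    for player, n in enumerate(num_actions):
        for recommended in range(n):
            for alt in range(n):
                # Expected gain from playing `alt` whenever told `recommended`.
                gain = 0
                for profile, prob in dist.items():
                    if profile[player] != recommended:
                        continue
                    deviated = profile[:player] + (alt,) + profile[player + 1:]
                    gain += prob * (payoffs[deviated][player] - payoffs[profile][player])
                if gain > 0:
                    return False
    return True


# Game of Chicken: action 0 = dare, 1 = chicken out (illustrative payoffs).
CHICKEN = {
    (0, 0): (0, 0),
    (0, 1): (7, 2),
    (1, 0): (2, 7),
    (1, 1): (6, 6),
}

third = Fraction(1, 3)
mediated = {(0, 1): third, (1, 0): third, (1, 1): third}
always_chicken = {(1, 1): Fraction(1)}

print(is_correlated_equilibrium(CHICKEN, mediated, (2, 2)))        # True
print(is_correlated_equilibrium(CHICKEN, always_chicken, (2, 2)))  # False
```

Every constraint is linear in the probabilities `dist[profile]`, which is why linear programming applies. The `mediated` distribution is not a product of independent choices by the two players, so it is not a Nash equilibrium, yet nobody gains by disobeying the mediator.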

Do you know something about correlated equilibria that I should know? If so, please tell me!


There are many kinds of networks. You can usually create big networks of a given kind by sticking together smaller networks of this kind. The networks usually *do* something, and the behavior of the whole is usually determined by the behavior of the parts and how the parts are stuck together.

So, we should think of networks of a given kind as morphisms in a category, or more generally elements of an algebra of some operad, and define a map sending each such network to its behavior. Then we can study this map mathematically!

All these insights (and many more) are made precise in Fong’s theory of ‘decorated cospans’:

• Brendan Fong, *The Algebra of Open and Interconnected Systems*, Ph.D. thesis, University of Oxford, 2016. (Blog article here.)

Kenny Courser is starting to look at the next thing: how one network can turn into another. For example, a network might change over time, or we might want to simplify a complicated network somehow. If a network is a morphism, a process where one network turns into another could be a ‘2-morphism’: that is, a morphism between morphisms. Just as categories have objects and morphisms, bicategories have objects, morphisms *and 2-morphisms*.

So, Kenny is looking at bicategories. As a first step, Kenny took Brendan’s setup and souped it up to define ‘decorated cospan bicategories’:

• Kenny Courser, Decorated cospan bicategories, *Theory and Applications of Categories* **32** (2017), 985–1027.

In this paper, he showed that these bicategories are often ‘symmetric monoidal’. This means that you can not only stick networks together end to end, you can also set them side by side or cross one over the other—and similarly for processes that turn one network into another! A symmetric monoidal bicategory is a somewhat fearsome structure, so Kenny used some clever machinery developed by Mike Shulman to get the job done:

• Mike Shulman, Constructing symmetric monoidal bicategories.

I would love to talk about the details, but they’re a bit technical so I think I’d better talk about something more basic. Namely: what’s a decorated cospan category and what’s a decorated cospan bicategory?

First: what’s a decorated cospan? A **cospan** in some category $C$ is a diagram like this:

$$X \longrightarrow N \longleftarrow Y$$

where the objects and morphisms are all in $C$. For example, if $C$ is the category of sets, we’ve got two sets $X$ and $Y$ mapped to a set $N$.

In a ‘decorated’ cospan, the object $N$ is equipped or, as we like to say, ‘decorated’ with extra structure. For example:

Here the set $N$ consists of 3 points, but it’s decorated with a graph whose edges are labelled by numbers! You could use this to describe an electrical circuit made of resistors. The set $X$ would then be the set of ‘input terminals’, and $Y$ the set of ‘output terminals’.

In this example, and indeed in many others, there’s no serious difference between inputs and outputs. We could reflect the picture, switching the roles of $X$ and $Y$, and the inputs would become outputs and vice versa. One reason for distinguishing them is that we can then attach the outputs of one circuit to the inputs of another and build a larger circuit. If we think of our circuit as a morphism from the input set $X$ to the output set $Y$, this process of attaching circuits to form larger ones can be seen as *composing morphisms in a category*.

In other words, if we get the math set up right, we can compose a decorated cospan from $X$ to $Y$ and a decorated cospan from $Y$ to $Z$ and get a decorated cospan from $X$ to $Z$. So with luck, we get a category with objects of $C$ as objects, and decorated cospans between these guys as morphisms!

For example, we can compose this:

and this:

to get this:

What did I mean by saying ‘with luck’? Well, there’s not really any *luck* involved, but we need some assumptions for all this to work. Before we even get to the decorations, we need to be able to compose cospans. We can do this whenever our cospans live in a category with pushouts. In category theory, a pushout is how we glue two things together.

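So, suppose our category $C$ has pushouts. If we then have two cospans in $C$, one from $X$ to $Y$ and one from $Y$ to $Z$:

$$X \longrightarrow N \longleftarrow Y \qquad\qquad Y \longrightarrow N' \longleftarrow Z$$

we can take a pushout:

$$N \longrightarrow N +_Y N' \longleftarrow N'$$

and get a cospan from $X$ to $Z$:

$$X \longrightarrow N +_Y N' \longleftarrow Z$$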

All this is fine and dandy, but there’s a slight catch: the pushout is only defined up to isomorphism, so we can’t expect this process of composing cospans to be associative: it will only be associative *up to isomorphism*.
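For cospans of finite sets, the pushout can be computed by hand: take the disjoint union of the two apexes and identify the two images of each element of the shared middle set. Here is a minimal sketch in Python, under my own assumed encoding of a cospan as a triple `(apex, left_leg, right_leg)` with the legs stored as dicts. None of these names come from Brendan's or Kenny's work.

```python
def compose_cospans(first, second):
    """Compose cospans of finite sets by gluing along the shared middle set.

    Each cospan is a triple (apex, left_leg, right_leg): apex is a set, and
    the legs are dicts mapping the input and output sets into the apex.
    The right leg of `first` and the left leg of `second` must have the
    same keys (the shared middle set).
    """
    apex1, in1, out1 = first
    apex2, in2, out2 = second

    # Disjoint union: tag elements so the two apexes cannot collide.
    parent = {(1, s): (1, s) for s in apex1}
    parent.update({(2, s): (2, s) for s in apex2})

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Pushout: identify out1(y) with in2(y) for every y in the middle set.
    for y in out1:
        a, b = find((1, out1[y])), find((2, in2[y]))
        if a != b:
            parent[a] = b

    apex = {find(x) for x in parent}
    left = {x: find((1, s)) for x, s in in1.items()}
    right = {z: find((2, s)) for z, s in out2.items()}
    return apex, left, right


first = ({"a", "b"}, {"x": "a"}, {"y": "b"})
second = ({"c", "d"}, {"y": "c"}, {"z": "d"})
apex, left, right = compose_cospans(first, second)
print(len(apex))  # 3
```

The elements of the composite apex are whatever representatives the union-find happens to pick. A different but equally valid choice gives an isomorphic cospan, not an equal one, which is the catch just described.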

What does that mean? What’s an isomorphism of cospans?

I’m glad you asked. A **map of cospans** is a diagram like this:

where the two triangles commute. You can see two cospans in this picture; the morphism $f$ provides the map from one to the other. If $f$ is an isomorphism, then this is an isomorphism of cospans.
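For finite sets, the commuting triangles can be checked mechanically. The sketch below assumes a cospan is encoded as a triple `(apex, left_leg, right_leg)` with dicts for the legs, and a candidate map between apexes is also a dict. The function names are hypothetical.

```python
def is_map_of_cospans(f, source, target):
    """Check that f: apex1 -> apex2 makes both triangles commute."""
    apex1, in1, out1 = source
    apex2, in2, out2 = target
    # The two cospans must share the same input and output sets.
    if in1.keys() != in2.keys() or out1.keys() != out2.keys():
        return False
    # f must be a function from the first apex into the second.
    if set(f) != set(apex1) or not set(f.values()) <= set(apex2):
        return False
    left_commutes = all(f[in1[x]] == in2[x] for x in in1)
    right_commutes = all(f[out1[y]] == out2[y] for y in out1)
    return left_commutes and right_commutes


def is_iso_of_cospans(f, source, target):
    """A map of cospans is an isomorphism when f is a bijection."""
    apex2 = target[0]
    bijective = len(set(f.values())) == len(f) and set(f.values()) == set(apex2)
    return is_map_of_cospans(f, source, target) and bijective


source = ({1, 2, 3}, {"x": 1}, {"y": 3})
target = ({"a", "b", "c"}, {"x": "a"}, {"y": "c"})

print(is_iso_of_cospans({1: "a", 2: "b", 3: "c"}, source, target))  # True
print(is_map_of_cospans({1: "a", 2: "a", 3: "c"}, source, target))  # True
print(is_iso_of_cospans({1: "a", 2: "a", 3: "c"}, source, target))  # False
```

The second map collapses two points of the apex, so it is a perfectly good map of cospans but not an isomorphism.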

To get around this problem, we can work with a category where the morphisms aren’t cospans, but *isomorphism classes* of cospans. That’s what Brendan did, and it’s fine for many purposes.

But back around 1972, when Bénabou was first inventing bicategories, he noticed that you could also create a bicategory with

• objects of $C$ as objects,

• spans in $C$ as morphisms, and

• maps of spans in $C$ as 2-morphisms.

Bicategories are perfectly happy for composition of 1-morphisms to be associative only up to isomorphism, so this solves the problem in a somewhat nicer way. (Taking equivalence classes of things when you don’t absolutely need to is regarded with some disdain in category theory, because it often means you’re throwing out information—and when you throw out information, you often regret it later.)

So, if you’re interested in decorated cospan categories, and you’re willing to work with bicategories, you should consider thinking about decorated cospan *bicategories*. And now, thanks to Kenny Courser’s work, you can!

He showed how the decorations work in the bicategorical approach: for example, he proved that whenever $C$ has finite colimits and

$$F \colon (C, +) \to (\mathrm{Set}, \times)$$

is a lax symmetric monoidal functor, you get a symmetric monoidal bicategory where a morphism is a cospan in $C$:

$$X \longrightarrow N \longleftarrow Y$$

with the object $N$ decorated by an element of $F(N)$.

Proving this took some virtuosic work in category theory. The key turns out to be this glorious diagram:

For the explanation, check out Proposition 4.1 in his paper.

I’ll talk more about *applications* of cospan bicategories when I blog about some other papers Kenny Courser and Daniel Cicala are writing.


• Entropy 2018 — From Physics to Information Sciences and Geometry, 14–16 May 2018, Auditorium Enric Casassas, Faculty of Chemistry, University of Barcelona, Barcelona, Spain.

They write:

One of the most frequently used scientific words is the word “entropy”. The reason is that it is related to two main scientific domains: physics and information theory. Its origin goes back to the start of physics (thermodynamics), but since Shannon, it has become related to information theory. This conference is an opportunity to bring researchers of these two communities together and create a synergy. The main topics and sessions of the conference cover:

• Physics: classical and quantum thermodynamics

• Statistical physics and Bayesian computation

• Geometrical science of information, topology and metrics

• Maximum entropy principle and inference

• Kullback and Bayes or information theory and Bayesian inference

• Entropy in action (applications)

The inter-disciplinary nature of contributions from both theoretical and applied perspectives are very welcome, including papers addressing conceptual and methodological developments, as well as new applications of entropy and information theory.

All accepted papers will be published in the proceedings of the conference. A selection of invited and contributed talks presented during the conference will be invited to submit an extended version of their paper for a special issue of the open access journal *Entropy*.
