## Applied Category Theory 2018

12 September, 2017

There will be a conference on applied category theory!

Applied Category Theory (ACT 2018). School 23–27 April 2018 and workshop 30 April–4 May 2018 at the Lorentz Center in Leiden, the Netherlands. Organized by Bob Coecke (Oxford), Brendan Fong (MIT), Aleks Kissinger (Nijmegen), Martha Lewis (Amsterdam), and Joshua Tan (Oxford).

The plenary speakers will be:

• Samson Abramsky (Oxford)
• John Baez (UC Riverside)
• Kathryn Hess (EPFL)
• David Spivak (MIT)

There will be a lot more to say as this progresses, but for now let me just quote from the conference website:

Applied Category Theory (ACT 2018) is a five-day workshop on applied category theory running from April 30 to May 4 at the Lorentz Center in Leiden, the Netherlands.

Towards an Integrative Science: in this workshop, we want to instigate a multi-disciplinary research program in which concepts, structures, and methods from one scientific discipline can be reused in another. The aim of the workshop is to (1) explore the use of category theory within and across different disciplines, (2) create a more cohesive and collaborative ACT community, especially among early-stage researchers, and (3) accelerate research by outlining common goals and open problems for the field.

While the workshop will host discussions on a wide range of applications of category theory, there will be four special tracks on exciting new developments in the field:

1. Dynamical systems and networks
2. Systems biology
3. Cognition and AI
4. Causality

Accompanying the workshop will be an Adjoint Research School for early-career researchers. This will comprise a 16 week online seminar, followed by a 4 day research meeting at the Lorentz Center in the week prior to ACT 2018. Applications to the school will open prior to October 1, and are due November 1. Admissions will be notified by November 15.

Sincerely,
The organizers

Bob Coecke (Oxford), Brendan Fong (MIT), Aleks Kissinger (Nijmegen), Martha Lewis (Amsterdam), and Joshua Tan (Oxford)

Category theory is a branch of mathematics originally developed to transport ideas from one branch of mathematics to another, e.g. from topology to algebra. Applied category theory refers to efforts to transport the ideas of category theory from mathematics to other disciplines in science, engineering, and industry.

This site originated from discussions at the Computational Category Theory Workshop at NIST on Sept. 28-29, 2015. It serves to collect and disseminate research, resources, and tools for the development of applied category theory, and hosts a blog for those involved in its study.

### The proposal: Towards an Integrative Science

Category theory was developed in the 1940s to translate ideas from one field of mathematics, e.g. topology, to another field of mathematics, e.g. algebra. More recently, category theory has become an unexpectedly useful and economical tool for modeling a range of different disciplines, including programming language theory [10], quantum mechanics [2], systems biology [12], complex networks [5], database theory [7], and dynamical systems [14].

A category consists of a collection of objects together with a collection of maps between those objects, satisfying certain rules. Topologists and geometers use category theory to describe the passage from one mathematical structure to another, while category theorists are also interested in categories for their own sake. In computer science and physics, many types of categories (e.g. topoi or monoidal categories) are used to give a formal semantics of domain-specific phenomena (e.g. automata [3], or regular languages [11], or quantum protocols [2]). In the applied category theory community, a long-articulated vision understands categories as mathematical workspaces for the experimental sciences, similar to how they are used in topology and geometry [13]. This has proved true in certain fields, including computer science and mathematical physics, and we believe that these results can be extended in an exciting direction: we believe that category theory has the potential to bridge specific different fields, and moreover that developments in such fields (e.g. automata) can be transferred successfully into other fields (e.g. systems biology) through category theory. Already, for example, the categorical modeling of quantum processes has helped solve an important open problem in natural language processing [9].

In this workshop, we want to instigate a multi-disciplinary research program in which concepts, structures, and methods from one discipline can be reused in another. Tangibly and in the short-term, we will bring together people from different disciplines in order to write an expository survey paper that grounds the varied research in applied category theory and lays out the parameters of the research program.

In formulating this research program, we are motivated by recent successes where category theory was used to model a wide range of phenomena across many disciplines, e.g. open dynamical systems (including open Markov processes and open chemical reaction networks), entropy and relative entropy [6], and descriptions of computer hardware [8]. Several talks will address some of these new developments. But we are also motivated by an open problem in applied category theory, one which was observed at the most recent workshop in applied category theory (Dagstuhl, Germany, in 2015): “a weakness of semantics/CT is that the definitions play a key role. Having the right definitions makes the theorems trivial, which is the opposite of hard subjects where they have combinatorial proofs of theorems (and simple definitions). […] In general, the audience agrees that people see category theorists only as reconstructing the things they knew already, and that is a disadvantage, because we do not give them a good reason to care enough” [1, pg. 61].

In this workshop, we wish to articulate a natural response to the above: instead of treating the reconstruction as a weakness, we should treat the use of categorical concepts as a natural part of transferring and integrating knowledge across disciplines. The restructuring employed in applied category theory cuts through jargon, helping to elucidate common themes across disciplines. Indeed, the drive for a common language and comparison of similar structures in algebra and topology is what led to the development category theory in the first place, and recent hints show that this approach is not only useful between mathematical disciplines, but between scientific ones as well. For example, the ‘Rosetta Stone’ of Baez and Stay demonstrates how symmetric monoidal closed categories capture the common structure between logic, computation, and physics [4].

[1] Samson Abramsky, John C. Baez, Fabio Gadducci, and Viktor Winschel. Categorical methods at the crossroads. Report from Dagstuhl Perspectives Workshop 14182, 2014.

[2] Samson Abramsky and Bob Coecke. A categorical semantics of quantum protocols. In Handbook of Quantum Logic and Quantum Structures. Elsevier, Amsterdam, 2009.

[3] Michael A. Arbib and Ernest G. Manes. A categorist’s view of automata and systems. In Ernest G. Manes, editor, Category Theory Applied to Computation and Control. Springer, Berlin, 2005.

[4] John C. Baez and Mike STay. Physics, topology, logic and computation: a Rosetta stone. In Bob Coecke, editor, New Structures for Physics. Springer, Berlin, 2011.

[5] John C. Baez and Brendan Fong. A compositional framework for passive linear networks. arXiv e-prints, 2015.

[6] John C. Baez, Tobias Fritz, and Tom Leinster. A characterization of entropy in terms of information loss. Entropy, 13(11):1945–1957, 2011.

[7] Michael Fleming, Ryan Gunther, and Robert Rosebrugh. A database of categories. Journal of Symbolic Computing, 35(2):127–135, 2003.

[8] Dan R. Ghica and Achim Jung. Categorical semantics of digital circuits. In Ruzica Piskac and Muralidhar Talupur, editors, Proceedings of the 16th Conference on Formal Methods in Computer-Aided Design. Springer, Berlin, 2016.

[9] Dimitri Kartsaklis, Mehrnoosh Sadrzadeh, Stephen Pulman, and Bob Coecke. Reasoning about meaning in natural language with compact closed categories and Frobenius algebras. In Logic and Algebraic Structures in Quantum Computing and Information. Cambridge University Press, Cambridge, 2013.

[10] Eugenio Moggi. Notions of computation and monads. Information and Computation, 93(1):55–92, 1991.

[11] Nicholas Pippenger. Regular languages and Stone duality. Theory of Computing Systems 30(2):121–134, 1997.

[12] Robert Rosen. The representation of biological systems from the standpoint of the theory of categories. Bulletin of Mathematical Biophysics, 20(4):317–341, 1958.

[13] David I. Spivak. Category Theory for Scientists. MIT Press, Cambridge MA, 2014.

[14] David I. Spivak, Christina Vasilakopoulou, and Patrick Schultz. Dynamical systems and sheaves. arXiv e-prints, 2016.

## Postdoc in Applied Category Theory

8 September, 2017

guest post by Spencer Breiner

### One Year Postdoc Position at Carnegie Mellon/NIST

We are seeking an early-career researcher with a background in category theory, functional programming and/or electrical engineering for a one-year post-doctoral position supported by an Early-concept Grant (EAGER) from the NSF’s Systems Science program. The position will be managed through Carnegie Mellon University (PI: Eswaran Subrahmanian), but the position itself will be located at the US National Institute for Standards and Technology (NIST), located in Gaithersburg, Maryland outside of Washington, DC.

The project aims to develop a compositional semantics for electrical networks which is suitable for system prediction, analysis and control. This work will extend existing methods for linear circuits (featured on this blog!) to include (i) probabilistic estimates of future consumption and (ii) top-down incentives for load management. We will model a multi-layered system of such “distributed energy resources” including loads and generators (e.g., solar array vs. power plant), different types of resource aggregation (e.g., apartment to apartment building), and across several time scales. We hope to demonstrate that such a system can balance local load and generation in order to minimize expected instability at higher levels of the electrical grid.

This post is available full-time (40 hours/5 days per week) for 12 months, and can begin as early as October 1st.

## Complex Adaptive Systems (Part 5)

4 September, 2017

When we design a complex system, we often start with a rough outline and fill in details later, one piece at a time. And if the system is supposed to be adaptive, these details may need to changed as the system is actually being used!

The use of operads should make this easier. One reason is that an operad typically has more than one algebra.

Remember from Part 3: an operad has operations, which are abstract ways of sticking things together. An algebra makes these operations concrete: it specifies some sets of actual things, and how the operations in the operad get implemented as actual ways to stick these things together.

So, an operad $O$ can have one algebra in which things are described in a bare-bones, simplified way, and another algebra in which things are described in more detail. Indeed it will typically have many algebras, corresponding to many levels of detail, but let’s just think about two for a minute.

When we have a ‘less detailed’ algebra $A$ and a ‘more detailed’ algebra $A',$ they will typically be related by a map

$f : A' \to A$

which ‘forgets the extra details’. This map should be a ‘homomorphism’ of algebras, but I’ll postpone the definition of that concept.

What we often want to do, when designing a system, is not forget extra detail, but rather add extra detail to some rough specification. There is not always a systematic way to do this. If there is, then we may have a homomorphism

$g : A \to A'$

going back the other way. This is wonderful, because it lets us automate the process of filling in the details. But we can’t always count on being able to do this—especially not if we want an optimal or even acceptable result. So, often we may have to start with an element of $A$ and search for elements of $A'$ that are mapped to it by $f : A' \to A.$

Let me give some examples. I’ll take the operad that I described last time, and describe some of its algebras, and homomorphisms between these.

I’ll start with an algebra that has very little detail: its elements will be simple graphs. As the name suggests, these are among the simplest possible ways of thinking about networks. They just look like this:

Then I’ll give an algebra with more detail, where the vertices of our simple graphs are points in the plane. There’s nothing special about the plane: we could replace the plane by any other set, and get another algebra of our operad. For example, we could use the set of points on the surface of the Caribbean Sea, the blue stuff in the rectangle here:

That’s what we might use in a search and rescue operation. The points could represent boats, and the edges could represent communication channels.

Then I’ll give an algebra with even more detail, where two points connected by an edge can’t be too far apart. This would be good for range-limited communication channels.

Then I’ll give an algebra with still more detail, where the locations of the points are functions of time. Now our boats are moving around!

Okay, here we go.

The operad from last time was called $O_G.$ Here $G$ is the network model of simple graphs. The best way to picture an operation of $O_G$ is as a way of sticking together a list of simple graphs to get a new simple graph.

For example, an operation

$f \in O_G(3,4,2;9)$

is a way of sticking together a simple graph with 3 vertices, one with 4 vertices and one with 2 vertices to get one with 9 vertices. Here’s a picture of such an operation:

Note that this operation is itself a simple graph. An operation in $O_G(3,4,2;9)$ is just a simple graph with 9 vertices, where we have labelled the vertices from 1 to 9.

This operad comes with a very obvious algebra $A$ where the operations do just what I suggested. In this algebra, an element of $A(t)$ is a simple graph with $t$ vertices, listed in order. Here $t$ is any natural number, which I’m calling ‘t’ for ‘type’.

We also need to say how the operations in $O_G$ act on these sets $A(t).$ If we take simple graphs in $A(3), A(4),$ and $A(2)$:

we can use our operation $f$ to stick them together and get this:

But we can also make up a more interesting algebra of $O_G.$ Let’s call this algebra $A'.$ We’ll let an element of $A'(t)$ be a simple graph with $t$ vertices, listed in order, which are points in the plane.

My previous pictures can be reused to show how operations in $O_G$ act on this new algebra $A'.$ The only difference is that now we tread the vertices literally as points in the plane! Before you should have been imagining them as abstract points not living anywhere; now they have locations.

Now let’s make up an even more detailed algebra $A''.$

What if our communication channels are ‘range-limited’? For example, what if two boats can’t communicate if they are more than 100 kilometers apart?

Then we can let an element of $A''(t)$ be a simple graph with $t$ vertices in the plane such that no two vertices connected by an edge have distance > 100.

Now the operations of our operad $O_G$ act in a more interesting way. If we have an operation, and we apply it to elements of our algebra, it ‘tries’ to put in new edges as it did before, but it ‘fails’ for any edge that would have length > 100. In other words, we just leave out any edges that would be too long.

It took me a while to figure this out. At first I thought the result of the operation would need to be undefined whenever we tried to create an edge that violated the length constraint. But in fact it acts in a perfectly well-defined way: we just don’t put in edges that would be too long!

This is good. This means that if you tell two boats to set up a communication channel, and they’re too far apart, you don’t get the ‘blue screen of death’: your setup doesn’t crash and burn. Instead, you just get a polite warning—‘communication channel not established’—and you can proceed.

The nontrivial part is to check that if we do this, we really get an algebra of our operad! There are some laws that must hold in any algebra. But since I haven’t yet described those laws, I won’t check them here. You’ll have to wait for our paper to come out.

Let’s do one more algebra today. For lack of creativity I’ll call it $A'''.$ Now an element of $A'''(t)$ is a time-dependent graph in the plane with $t$ vertices, listed in order. Namely, the positions of the vertices depend on time, and the presence or absence of an edge between two vertices can also depend on time. Furthermore, let’s impose the requirement that any two vertices can only connected by an edge at times when their distance is ≤ 100.

When I say ‘functions of time’ here, what ‘time’? We can model time by some interval $[T_1, T_2].$ But if you don’t like that, you can change it.

This algebra $A'''$ works more or less like $A''.$ The operations of $O_G$ try to create edges, but these edges only ‘take’ at times when the vertices they connect have distance ≤ 100.

There’s something here you might not like. Our operations can only try to create edges ‘for all times’… and succeed at times when the vertices are close enough. We can’t try to set up a communication channel for a limited amount of time.

But fear not: this is just a limitation in our chosen network model, ‘simple graphs’. With a fancier network model, we’d get a fancier operad, with fancier operations. Right now I’m trying to keep the operad simple (pun not intended), and show you a variety of different algebras.

And you might expect, we have algebra homomorphisms going from more detailed algebras to less detailed ones:

$f_T : A''' \to A'', \quad h : A' \to A$

The homomorphism $h$ takes a simple graph in the plane and forgets the location of its vertices. The homomorphism $f_T$ depends on a choice of time $T \in [T_1, T_2].$ For any time $T,$ it takes a time-dependent graph in the plane and evaluates it at that time, getting a graph in the plane (which obeys the distance constraints, since the time-dependent graph obeyed those constraints at any time).

We do not have a homomorphism $g: A'' \to A'$ that takes a simple graph in the plane obeying our distance constraints and forgets about those constraints. There’s a map $g$ sending elements of $A''$ to elements of $A'$ in this way. But it’s not an algebra homomorphism! The problem is that first trying to connect two graphs with an edge and then applying $g$ may give a different result than first applying $g$ and then connecting two graphs with an edge.

In short: a single operad has many algebras, which we can use to describe our desired system at different levels of detail. Algebra homomorphisms relate these different levels of detail.

Next time I’ll look at some more interesting algebras of the same operad. For example, there’s one that describes a system of interacting mobile agents, which move around in some specific way, determined by their location and the locations of the agents they’re communicating with.

Even this is just the tip of the iceberg—that is, still a rather low level of detail. We can also introduce stochasticity (that is, randomness). And to go even further, we could switch to a more sophisticated operad, based on a fancier ‘network model’.

But not today.

## Voyager 1

3 September, 2017

Launched 40 years ago, the Voyagers are our longest-lived and most distant spacecraft. Voyager 2 has reached the edge of the heliosphere, the realm where the solar wind and the Sun’s magnetic field live. Voyager 1 has already left the heliosphere and entered interstellar space! A new movie, The Farthest, celebrates the Voyagers’ journey toward the stars:

What has Voyager 1 been doing lately? I’ll skip its amazing exploration of the Solar System….

### Leaving the realm of planets

On February 14, 1990, Voyager 1 took the first ever ‘family portrait’ of the Solar System as seen from outside. This includes the famous image of planet Earth known as the Pale Blue Dot:

Soon afterwards, its cameras were deactivated to conserve power and computer resources. The camera software has been removed from the spacecraft, so it would now be hard to get it working again. And here on Earth, the software for reading these images is no longer available!

On February 17, 1998, Voyager 1 reached a distance of 69 AU from the Sun — 69 times farther from the Sun than we are. At that moment it overtook Pioneer 10 as the most distant spacecraft from Earth! Traveling at about 17 kilometers per second, it was moving away from the Sun faster than any other spacecraft. It still is.

That’s 520 million kilometers per year — hard to comprehend. I find it easier to think about this way: it’s 3.6 AU per year. That’s really fast… but not if you’re trying to reach other stars. It will take 20,000 years just to go one light-year.

### Termination shock

As Voyager 1 headed for interstellar space, its instruments continued to study the Solar System. Scientists at the Johns Hopkins University said that Voyager 1 entered the termination shock in February 2003. This is a bit like a ‘sonic boom’, but in reverse: it’s the place where the solar wind drops to below the speed of sound. Yes, sound can move through the solar wind, but only sound with extremely long wavelengths — nothing you can hear.

Some other scientists expressed doubt about this, and the issue wasn’t resolved until other data became available, since Voyager 1’s solar-wind detector had stopped working in 1990. This failure meant that termination shock detection had to be inferred from the other instruments on board. We now think that Voyager 1 reached the termination shock on December 15, 2004 — at a distance of 94 AU from the Sun.

### Heliosheath

In May 2005, a NASA press release said that Voyager 1 had reached the
heliosheath
. This is a bubble of stagnant solar wind, moving below the speed of sound. It’s outside the termination shock but inside the heliopause, where the interstelllar wind crashes against the solar wind.

On March 31, 2006, amateur radio operators in Germany tracked and received radio waves from Voyager 1 using a 20-meter dish. They
checked their data against data from the Deep Space Network station in Madrid, Spain and yes — it matched. This was the first amateur tracking of Voyager 1!

On December 13, 2010, the the Low Energy Charged Particle device
aboard Voyager 1 showed that it passed the point where the solar wind flows away from the Sun. At this point the solar wind seems to turn sideways, due to the push of the interstellar wind. On this date, the spacecraft was approximately 17.3 billion kilometers from the Sun, or 116 AU.

In March 2011, Voyager 1 was commanded to change its orientation to measure the sideways motion of the solar wind. How? I don’t know. Its solar wind detector was broken.

But anyway, a test roll done in February had confirmed the spacecraft’s ability to maneuver and reorient itself. So, in March it rotated 70 degrees counterclockwise with respect to Earth to detect the solar wind. This was the first time the spacecraft had done any major maneuvering since the family portrait photograph of the planets was taken in 1990.

After the first roll the spacecraft had no problem in reorienting itself with Alpha Centauri, Voyager 1’s guide star, and it resumed sending transmissions back to Earth.

On December 1, 2011, it was announced that Voyager 1 had detected the first Lyman-alpha radiation originating from the Milky Way galaxy. Lyman-alpha radiation had previously been detected from other galaxies, but because of interference from the Sun, the radiation from the Milky Way was not detectable.

Puzzle: What the heck is Lyman-alpha radiation?

On December 5, 2011, Voyager 1 saw that the Solar System’s magnetic field had doubled in strength, basically because it was getting compressed by the pressure of the interstellar wind. Energetic particles originating in the Solar System declined by nearly half, while the detection of high-energy electrons from outside increased 100-fold.

### Heliopause and beyond

In June 2012, NASA announced that the probe was detecting even more charged particles from interstellar space. This meant that it was getting close to the heliopause: the place where the gas of interstellar space crashes into the solar wind.

Voyager 1 actually crossed the heliopause in August 2012, although it took another year to confirm this. It was 121 AU from the Sun.

What’s next?

In about 300 years Voyager 1 will reach the Oort cloud, the region of frozen comets. It will take 30,000 years to pass through the Oort cloud. Though it is not heading towards any particular star, in about 40,000 years it will pass within 1.6 light-years of the star Gliese 445.

NASA says:

The Voyagers are destined — perhaps eternally —
to wander the Milky Way.

That’s an exaggeration. The Milky Way will not last forever. In just 3.85 billion years, before our Sun becomes a red giant, the Andromeda galaxy will collide with the Milky Way. In just 100 trillion years, all the stars in the Milky Way will burn out. And in just 10 quintillion years, the Milky Way will have disintegrated, with all the dead stars either falling into black holes or being flung off into intergalactic space.

But still: the Voyagers’ journeys are just beginning. Let’s wish them a happy 40th birthday!

My story here is adapted from this Wikipedia article:

• Wikipedia, Voyager 1.

• NASA, NASA and iconic museum honor Voyager spacecraft 40th anniversary, August 30, 2017.

## Complex Adaptive System Design (Part 4)

22 August, 2017

Last time I introduced typed operads. A typed operad has a bunch of operations for putting together things of various types and getting new things of various types. This is a very general idea! But in the CASCADE project we’re interested in something more specific: networks. So we want operads whose operations are ways to put together networks and get new networks.

That’s what our team came up with: John Foley of Metron, my graduate students Blake Pollard and Joseph Moeller, and myself. We’re writing a couple of papers on this, and I’ll let you know when they’re ready. These blog articles are kind of sneak preview—and a gentle introduction, where you can ask questions.

For example: I’m talking a lot about networks. But what is a ‘network’, exactly?

There are many kinds. At the crudest level, we can model a network as a simple graph, which is something like this:

There are some restrictions on what counts as a simple graph. If the vertices are agents of some sort and the edges are communication channels, these restrictions imply:

• We allow at most one channel between any pair of agents, since there’s at most one edge between any two vertices of our graph.

• The channels do not have a favored direction, since there are no arrows on the edges of our graph.

• We don’t allow a channel from an agent to itself, since an edge can’t start and end at the same vertex.

For other purposes we may want to drop some or all of these restrictions. There is an appalling diversity of options! We might want to allow multiple channels between a pair of agents. For this we could use multigraphs. We might want to allow directed channels, where the sender and receiver have different capabilities: for example, signals may only be able to flow in one direction. For this we could use directed graphs. And so on.

We will also want to consider graphs with colored vertices, to specify different types of agents—or colored edges, to specify different types of channels. Even more complicated variants are likely to become important as we proceed.

To avoid sinking into a mire of special cases, we need the full power of modern mathematics. Instead of separately studying all these various kinds of networks, we need a unified notion that subsumes all of them.

To do this, the Metron team came up with something called a ‘network model’. There is a network model for simple graphs, a network model for multigraphs, a network model for directed graphs, a network model for directed graphs with 3 colors of vertex and 15 colors of edge, and more.

You should think of a network model as a kind of network. Not a specific network, just a kind of network.

Our team proved that for each network model $G$ there is an operad $O_G$ whose operations describe how to put together networks of that kind. We call such operads ‘network operads’.

I want to make all this precise, but today let me just show you one example. Let’s take $G$ to be the network model for simple graphs, and look at the network operad $O_G.$ I won’t tell you what kind of thing $G$ is yet! But I’ll tell you about the operad $O_G$.

Types. Remember from last time that an operad has a set of ‘types’. For $O_G$ this is the set of natural numbers, $\mathbb{N}.$ The reason is that a simple graph can have any number of vertices.

Operations. Remember that an operad has sets of ‘operations’. In our case we have a set of operations $O_G(t_1,\dots,t_n ; t)$ for each choice of $t_1,\dots,t_n, t \in \mathbb{N}.$

An operation $f \in O_G(t_1,\dots,t_n; t)$ is a way of taking a simple graph with $t_1$ vertices, a simple graph with $t_2$ vertices,… and so on, and sticking them together, perhaps adding new edges, to get a simple graph with

$t = t_1 + \cdots + t_n$

vertices.

Let me show you an operation

$f \in O_G(3,4,2;9)$

This will be a way of taking three simple graphs—one with 3 vertices, one with 4, and one with 2—and sticking them together, perhaps adding edges, to get one with 9 vertices.

Here’s what $f$ looks like:

It’s a simple graph with vertices numbered from 1 to 9, with the vertices in bunches: {1,2,3}, {4,5,6,7} and {8,9}. It could be any such graph. This one happens to have an edge from 3 to 6 and an edge from 1 to 2.

Here’s how we can actually use our operation. Say we have three simple graphs like this:

Then we can use our operation to stick them together and get this:

Notice that we added a new edge from 3 to 6, connecting two of our three simple graphs. We also added an edge from 1 to 2… but this had no effect, since there was already an edge there! The reason is that simple graphs have at most one edge between vertices.

But what if we didn’t already have an edge from 1 to 2? What if we applied our operation $f$ to the following simple graphs?

Well, now we’d get this:

This time adding the edge from 1 to 2 had an effect, since there wasn’t already an edge there!

In short, we can use this operad to stick together simple graphs, but also to add new edges within the simple graphs we’re sticking together!

When I’m telling you how we ‘actually use’ our operad to stick together graphs, I’m secretly describing an algebra of our operad. Remember, an operad describes ways of sticking together things together, but an ‘algebra’ of the operad gives a particular specification of these things and describes how we stick them together.

Our operad $O_G$ has lots of interesting algebras, but I’ve just shown you the simplest one. More precisely:

Things. Remember from last time that for each type, an algebra specifies a set of things of that type. In this example our types are natural numbers, and for each natural number $t \in \mathbb{N}$ I’m letting the set of things $A(t)$ consist of all simple graphs with vertices $\{1, \dots, t\}.$

Action. Remember that our operad $O_G$ should have an action on $A$, meaning a bunch of maps

$\alpha : O_G(t_1,...,t_n ; t) \times A(t_1) \times \cdots \times A(t_n) \to A(t)$

I just described how this works in some examples. Some rules should hold… and they do.

To make sure you understand, try these puzzles:

Puzzle 1. In the example I just explained, what is the set $O_G(t_1,\dots,t_n ; t)$ if $t \ne t_1 + \cdots + t_n?$

Puzzle 2. In this example, how many elements does $O_G(1,1;2)$ have?

Puzzle 3. In this example, how many elements does $O_G(1,2;3)$ have?

Puzzle 4. In this example, how many elements does $O_G(1,1,1;3)$ have?

Puzzle 5. In the particular algebra $A$ that I explained, how many elements does $A(3)$ have?

Next time I’ll describe some more interesting algebras of this operad $O_G.$ These let us describe networks of mobile agents with range-limited communication channels!

## Complex Adaptive System Design (Part 3)

17 August, 2017

It’s been a long time since I’ve blogged about the Complex Adaptive System Composition and Design Environment or CASCADE project run by John Paschkewitz. For a reminder, read these:

Complex adaptive system design (part 1), Azimuth, 2 October 2016.

Complex adaptive system design (part 2), Azimuth, 18 October 2016.

A lot has happened since then, and I want to explain it.

I’m working with Metron Scientific Solutions to develop new techniques for designing complex networks.

The particular problem we began cutting our teeth on is a search and rescue mission where a bunch of boats, planes and drones have to locate and save people who fall overboard during a boat race in the Caribbean Sea. Subsequently the Metron team expanded the scope to other search and rescue tasks. But the real goal is to develop very generally applicable new ideas on designing and ‘tasking’ networks of mobile agents—that is, designing these networks and telling the agents what to do.

We’re using the mathematics of ‘operads’, in part because Spivak’s work on operads has drawn a lot of attention and raised a lot of hopes:

An operad is a bunch of operations for sticking together smaller things to create bigger ones—I’ll explain this in detail later, but that’s the core idea. Spivak described some specific operads called ‘operads of wiring diagrams’ and illustrated some of their potential applications. But when we got going on our project, we wound up using a different class of operads, which I’ll call ‘network operads’.

Here’s our dream, which we’re still trying to make into a reality:

Network operads should make it easy to build a big network from smaller ones and have every agent know what to do. You should be able to ‘slap together’ a network, throwing in more agents and more links between them, and automatically have it do something reasonable. This should be more flexible than an approach where you need to know ahead of time exactly how many agents you have, and how they’re connected, before you can tell them what to do.

You don’t want a network to malfunction horribly because you forgot to hook it up correctly. You want to focus your attention on optimizing the network, not getting it to work at all. And you want everything to work so smoothly that it’s easy for the network to adapt to changing conditions.

To achieve this we’re using network operads, which are certain special ‘typed operads’. So before getting into the details of our approach, I should say a bit about typed operads. And I think that will be enough for today’s post: I don’t want to overwhelm you with too much information at once.

In general, a ‘typed operad’ describes ways of sticking together things of various types to get new things of various types. An ‘algebra’ of the operad gives a particular specification of these things and the results of sticking them together. For now I’ll skip the full definition of a typed operad and only highlight the most important features. A typed operad $O$ has:

• a set $T$ of types.

• collections of operations $O(t_1,...,t_n ; t)$ where $t_i, t \in T$. Here $t_1, \dots, t_n$ are the types of the inputs, while $t$ is the type of the output.

• ways to compose operations. Given an operation
$f \in O(t_1,\dots,t_n ;t)$ and $n$ operations

$g_1 \in O(t_{11},\dots,t_{1 k_1}; t_1),\dots, g_n \in O(t_{n1},\dots,t_{n k_n};t_n)$

we can compose them to get

$f \circ (g_1,\dots,g_n) \in O(t_{11}, \dots, t_{nk_n};t)$

These must obey some rules.

But if you haven’t seen operads before, you’re probably reeling in horror—so I need to rush in and save you by showing you the all-important pictures that help explain what’s going on!

First of all, you should visualize an operation $f \in O(t_1, \dots, t_n; t)$ as a little gizmo like this:

It has $n$ inputs at top and one output at bottom. Each input, and the output, has a ‘type’ taken from the set $T.$ So, for example, if you operation takes two real numbers, adds them and spits out the closest integer, both input types would be ‘real’, while the output type would be ‘integer’.

The main thing we do with operations is compose them. Given an an operation $f \in O(t_1,\dots,t_n ;t),$ we can compose it with $n$ operations

$g_1 \in O(t_{11},\dots,t_{1 k_1}; t_1), \quad \dots, \quad g_n \in O(t_{n1},\dots,t_{n k_n};t_n)$

by feeding their outputs into the inputs of $f,$ like this:

The result is an operation we call

$f \circ (g_1, \dots, g_n)$

Note that the input types of $f$ have to match the output types of the $g_i$ for this to work! This is the whole point of types: they forbid us from composing operations in ways that don’t make sense.

This avoids certain stupid mistakes. For example, you can take the square root of a positive number, but you may not want to take the square root of a negative number, and you definitely don’t want to take the square root of a hamburger. While you can land a plane on an airstrip, you probably don’t want to land a plane on a person.

The operations in an operad are quite abstract: they aren’t really operating on anything. To render them concrete, we need another idea: operads have ‘algebras’.

An algebra $A$ of the operad $O$ specifies a set of things of each type $t \in T$ such that the operations of $O$ act on these sets. A bit more precisely, an algebra consists of:

• for each type $t \in T,$ a set $A(t)$ of things of type $t$

• an action of $O$ on $A,$ that is, a collection of maps

$\alpha : O(t_1,...,t_n ; t) \times A(t_1) \times \cdots \times A(t_n) \to A(t)$

obeying some rules.

In other words, an algebra turns each operation $f \in O(t_1,...,t_n ; t)$ into a function that eats things of types $t_1, \dots, t_n$ and spits out a thing of type $t.$

When we get to designing systems with operads, the fact that the same operad can have many algebras will be useful. Our operad will have operations describing abstractly how to hook up networks to form larger networks. An algebra will give a specific implementation of these operations. We can use one algebra that’s fairly fine-grained and detailed about what the operations actually do, and another that’s less detailed. There will then be a map between from the first algebra to the second, called an ‘algebra homomorphism’, that forgets some fine-grained details.

There’s a lot more to say—all this is just the mathematical equivalent of clearing my throat before a speech—but I’ll stop here for now.

And as I do—since it also takes me time to stop talking—I should make it clear yet again that I haven’t even given the full definition of typed operads and their algebras! Besides the laws I didn’t write down, there’s other stuff I omitted. Most notably, there’s a way to permute the inputs of an operation in an operad, and operads have identity operations, one for each type.

To see the full definition of an ‘untyped’ operad, which is really an operad with just one type, go here:

They just call it an ‘operad’. Note that they first explain ‘non-symmetric operads’, where you can’t permute the inputs of operations, and then explain operads, where you can.

If you’re mathematically sophisticated, you can easily guess the laws obeyed by a typed operad just by looking at this article and inserting the missing types. You can also see the laws written down in Spivak’s paper, but with some different terminology: he calls types ‘objects’, he calls operations ‘morphisms’, and he calls typed operads ‘symmetric colored operads’—or once he gets going, just ‘operads’.

You can also see the definition of a typed operad in Section 2.1 here:

• Donald Yau, Operads of wiring diagrams.

What I would call a typed operad with $S$ as its set of types, he calls an ‘$S$-colored operad’.

I guess it’s already evident, but I’ll warn you that the terminology in this subject varies quite a lot from author to author: for example, a certain community calls typed operads ‘symmetric multicategories’. This is annoying at first but once you understand the subject it’s as ignorable as the fact that mathematicians have many different accents. The main thing to remember is that operads come in four main flavors, since they can either be typed or untyped, and they can either let you permute inputs or not. I’ll always be working with typed operads where you can permute inputs.

Finally, I’ll say that while the definition of operad looks lengthy and cumbersome at first, it becomes lean and elegant if you use more category theory.

Next time I’ll give you an example of an operad: the simplest ‘network

## Norbert Blum on P versus NP

15 August, 2017

There’s a new paper on the arXiv that claims to solve a hard problem:

• Norbert Blum, A solution of the P versus NP problem.

Most papers that claim to solve hard math problems are wrong: that’s why these problems are considered hard. But these papers can still be fun to look at, at least if they’re not obviously wrong. It’s fun to hope that maybe today humanity has found another beautiful grain of truth.

I’m not an expert on the P versus NP problem, so I have no opinion on this paper. So don’t get excited: wait calmly by your radio until you hear from someone who actually works on this stuff.

I found the first paragraph interesting, though. Here it is, together with some highly non-expert commentary. Beware: everything I say could be wrong!

Understanding the power of negations is one of the most challenging problems in complexity theory. With respect to monotone Boolean functions, Razborov [12] was the first who could shown that the gain, if using negations, can be super-polynomial in comparision to monotone Boolean networks. Tardos [16] has improved this to exponential.

I guess a ‘Boolean network’ is like a machine where you feed in a string of bits and it computes new bits using the logical operations ‘and’, ‘or’ and ‘not’. If you leave out ‘not’ the Boolean network is monotone, since then making more inputs equal to 1, or ‘true’, is bound to make more of the output bits 1 as well. Blum is saying that including ‘not’ makes some computations vastly more efficient… but that this stuff is hard to understand.

For the characteristic function of an NP-complete problem like the clique function, it is widely believed that negations cannot help enough to improve the Boolean complexity from exponential to polynomial.

A bunch of nodes in a graph are a clique if each of these nodes is connected by an edge to every other. Determining whether a graph with $n$ vertices has a clique with more than $k$ nodes is a famous problem: the clique decision problem.

For example, here’s a brute-force search for a clique with at least 4 nodes:

The clique decision problem is NP-complete. This means that if you can solve it with a Boolean network whose complexity grows like some polynomial in n, then P = NP. But if you can’t, then P ≠ NP.

(Don’t ask me what the complexity of a Boolean network is; I can guess but I could get it wrong.)

I guess Blum is hinting that the best monotone Boolean network for solving the clique decision problem has a complexity that’s exponential in $n.$ And then he’s saying it’s widely believed that not gates can’t reduce the complexity to a polynomial.

Since the computation of an one-tape Turing machine can be simulated by a non-monotone Boolean network of size at most the square of the number of steps [15, Ch. 3.9], a superpolynomial lower bound for the non-monotone network complexity of such a function would imply P ≠ NP.

Now he’s saying what I said earlier: if you show it’s impossible to solve the clique decision problem with any Boolean network whose complexity grows like some polynomial in n, then you’ve shown P ≠ NP. This is how Blum intends to prove P ≠ NP.

For the monotone complexity of such a function, exponential lower bounds are known [11, 3, 1, 10, 6, 8, 4, 2, 7].

Should you trust someone who claims they’ve proved P ≠ NP, but can’t manage to get their references listed in increasing order?

But until now, no one could prove a non-linear lower bound for the nonmonotone complexity of any Boolean function in NP.

That’s a great example of how helpless we are: we’ve got all these problems whose complexity should grow faster than any polynomial, and we can’t even prove their complexity grows faster than linear. Sad!

An obvious attempt to get a super-polynomial lower bound for the non-monotone complexity of the clique function could be the extension of the method which has led to the proof of an exponential lower bound of its monotone complexity. This is the so-called “method of approximation” developed by Razborov [11].

I don’t know about this. All I know is that Razborov and Rudich proved a whole bunch of strategies for proving P ≠ NP can’t possibly work. These strategies are called ‘natural proofs’. Here are some friendly blog articles on their result:

• Timothy Gowers, How not to prove that P is not equal to NP, 3 October 2013.

• Timothy Gowers, Razborov and Rudich’s natural proofs argument, 7 October 2013.

From these I get the impression that what Blum calls ‘Boolean networks’ may be what other people call ‘Boolean circuits’. But I could be wrong!

Continuing:

Razborov [13] has shown that his approximation method cannot be used to prove better than quadratic lower bounds for the non-monotone complexity of a Boolean function.

So, this method is unable to prove some NP problem can’t be solved in polynomial time and thus prove P ≠ NP. Bummer!

But Razborov uses a very strong distance measure in his proof for the inability of the approximation method. As elaborated in [5], one can use the approximation method with a weaker distance measure to prove a super-polynomial lower bound for the non-monotone complexity of a Boolean function.

This reference [5] is to another paper by Blum. And in the end, he claims to use similar methods to prove that the complexity of any Boolean network that solves the clique decision problem must grow faster than a polynomial.

So, if you’re trying to check his proof that P ≠ NP, you should probably start by checking that other paper!

The picture below, by Behnam Esfahbod on Wikicommons, shows the two possible scenarios. The one at left is the one Norbert Blum claims to have shown we’re in.