Applied Category Theory Course

26 March, 2018

It just became a lot easier to learn about applied category theory, thanks to this free book:

• Brendan Fong and David Spivak, Seven Sketches in Compositionality: An Invitation to Applied Category Theory.

I’ve started an informal online course based on this book on the Azimuth Forum. I’m getting pretty sick of the superficial quality of my interactions on social media. This could be a way to do something more interesting.

The idea is that you can read chapters of this book, discuss them, try the exercises in the book, ask and answer questions, and maybe team up to create software that implements some of the ideas. I’ll try to keep things moving forward. For example, I’ll explain some stuff and try to help answer questions that people are stuck on. I may also give some talks or run discussions on Google Hangouts or similar software—but only when I have time: I’m more of a text-based guy. I may get really busy sometimes, and leave the rest of you alone for a while. But I like writing about math for at least 15 minutes a day, and more when I have time. Furthermore, I’m obsessed with applied category theory and plan to stay that way for at least a few more years.

If this sounds interesting, let me know here—and please visit the Azimuth Forum and register! Use your full real name as your username, with no spaces; I’ll add the spaces afterwards, and that will become your username. Use a real working email address. If you don’t, the registration process may not work.

Over 70 people have registered so far, so this process will take a while.

The main advantage of the Forum over this blog is that you can initiate new threads and edit your comments. As here, you can write equations in LaTeX. As here, that ability is severely limited: for example, you can’t define macros, and you can’t use TikZ. (Maybe someone could fix that.) But equations are better typeset over there—and more importantly, the ability to edit comments makes it a lot easier to correct errors in your LaTeX.

Please let me know what you think.

What follows is the preface to Fong and Spivak’s book, just so you can get an idea of what it’s like.

Preface

Category theory is becoming a central hub for all of pure mathematics. It is unmatched in its ability to organize and layer abstractions, to find commonalities between structures of all sorts, and to facilitate communication between different mathematical communities. But it has also been branching out into science, informatics, and industry. We believe that it has the potential to be a major cohesive force in the world, building rigorous bridges between disparate worlds, both theoretical and practical. The motto at MIT is mens et manus, Latin for mind and hand. We believe that category theory—and pure math in general—has stayed in the realm of mind for too long; it is ripe to be brought to hand.

Purpose and audience

The purpose of this book is to offer a self-contained tour of applied category theory. It is an invitation to discover advanced topics in category theory through concrete real-world examples. Rather than try to give a comprehensive treatment of these topics—which include adjoint functors, enriched categories, proarrow equipments, toposes, and much more—we merely provide a taste. We want to give readers some insight into how it feels to work with these structures as well as some ideas about how they might show up in practice.

The audience for this book is quite diverse: anyone who finds the above description intriguing. This could include a motivated high school student who hasn’t seen calculus yet but has loved reading a weird book on mathematical logic they found at the library. Or a machine learning researcher who wants to understand what vector spaces, design theory, and dynamical systems could possibly have in common. Or a pure mathematician who wants to imagine what sorts of applications their work might have. Or a recently-retired programmer who’s always had an eerie feeling that category theory is what they’ve been looking for to tie it all together, but who’s found the usual books on the subject impenetrable.

For example, we find it something of a travesty that in 2018 there seems to be no introductory material available on monoidal categories. Even beautiful modern introductions to category theory, e.g. by Riehl or Leinster, do not include anything on this rather central topic. The basic idea is certainly not too abstract; modern human intuition seems to include a pre-theoretical understanding of monoidal categories that is just waiting to be formalized. Is there anyone who wouldn’t correctly understand the basic idea being communicated in the following diagram?

Many applied category theory topics seem to take monoidal categories as their jumping-off point. So one aim of this book is to provide a reference—even if unconventional—for this important topic.

We hope this book inspires both new visions and new questions. We intend it to be self-contained in the sense that it is approachable with minimal prerequisites, but not in the sense that the complete story is told here. On the contrary, we hope that readers use this as an invitation to further reading, to orient themselves in what is becoming a large literature, and to discover new applications for themselves.

This book is, unashamedly, our take on the subject. While the abstract structures we explore are important to any category theorist, the specific topics have simply been chosen to our personal taste. Our examples are ones that we find simple but powerful, concrete but representative, entertaining but in a way that feels important and expansive at the same time. We hope our readers will enjoy themselves and learn a lot in the process.

How to read this book

The basic idea of category theory—which threads through every chapter—is that if one pays careful attention to structures and coherence, the resulting systems will be extremely reliable and interoperable. For example, a category involves several structures: a collection of objects, a collection of morphisms relating objects, and a formula for combining any chain of morphisms into a morphism. But these structures need to cohere or work together in a simple commonsense way: a chain of chains is a chain, so combining a chain of chains should be the same as combining the chain. That’s it!
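(A concrete aside, not from the book: here is a minimal sketch in Python, with functions playing the role of morphisms and composition as the formula for combining chains.)

```python
# A minimal illustration (not from the book): morphisms as Python
# functions, with a formula for combining any chain of them.
from functools import reduce

def combine(*chain):
    """Collapse a chain of morphisms into a single morphism."""
    return reduce(lambda f, g: lambda x: f(g(x)), chain)

f = lambda x: x + 1
g = lambda x: 2 * x
h = lambda x: x - 3

# Coherence: combining a chain of chains agrees with combining the chain.
assert all(combine(combine(f, g), h)(x) == combine(f, g, h)(x)
           for x in range(10))
```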

We will see structures and coherence come up in pretty much every definition we give: “here are some things and here are how they fit together.” We ask the reader to be on the lookout for structures and coherence as they read the book, and to realize that as we layer abstraction on abstraction, it is the coherence that makes everything function like a well-oiled machine.

Each chapter in this book is motivated by a real-world topic, such as electrical circuits, control theory, cascade failures, information integration, and hybrid systems. These motivations lead us into and through various sorts of category-theoretic concepts.

We generally have one motivating idea and one category-theoretic purpose per chapter, and this forms the title of the chapter, e.g. Chapter 4 is “Collaborative design: profunctors, categorification, and monoidal categories.” In many math books, the difficulty is roughly a monotonically-increasing function of the page number. In this book, this occurs in each chapter, but not so much in the book as a whole: each chapter starts out fairly easy and progresses in difficulty.

The upshot is that if you find the end of a chapter very difficult, hope is certainly not lost: you can start on the next one and make good progress. This format lends itself to giving you a first taste now, but also leaving open the opportunity for you to come back at a later date and get more deeply into it. But by all means, if you have the gumption to work through each chapter to its end, we very much encourage that!

We include many exercises throughout the text. Usually these exercises are fairly straightforward; the only thing they demand is that the reader’s mind changes state from passive to active, rereads the previous paragraphs with intent, and puts the pieces together. A reader becomes a student when they work the exercises; until then they are more of a tourist, riding on a bus and listening off and on to the tour guide. Hey, there’s nothing wrong with that, but we do encourage you to get off the bus and make contact with the natives as often as you can.


Applied Category Theory at NIST (Part 1)

17 February, 2018

I think it’s really cool how applied category theory is catching on. My former student Blake Pollard is working at the National Institute of Standards and Technology on applications of category theory to electrical engineering. He’s working with Spencer Breiner… and now Breiner is helping run a workshop on this stuff:

• Applied Category Theory: Bridging Theory & Practice, March 15–16, 2018, NIST, Gaithersburg, Maryland, USA. Organized by Spencer Breiner and Eswaran Subrahmanian.

It’s by invitation only, but I can’t resist mentioning its existence. Here’s the idea:

What: The Information Technology Laboratory at NIST is pleased to announce a workshop on Applied Category Theory to be held at NIST’s Gaithersburg, Maryland campus on March 15 & 16, 2018. The meeting will focus on practical avenues for introducing methods from category theory into real-world applications, with an emphasis on examples rather than theorems.

Who: The workshop aims to bring together two distinct groups. First, category theorists interested in pursuing applications outside of the usual mathematical fields. Second, domain experts and research managers from industry, government, science and engineering who have in mind potential domain applications for categorical methods.

Intended Outcomes:

• A proposed landscape of potential CT applications and the infrastructure needed to realize them, together with a 5–10 year roadmap for developing the field of applied category theory. This should include perspectives from industry, academia and government, as well as major milestones, potential funding sources, avenues for technology transfer, and necessary improvements in tool support and methodology.

• Exploratory collaborations between category theorists and domain experts. We will ask that each group come prepared to meet the other side. Mathematicians should be prepared with concrete examples that demonstrate practical applications of CT in an intuitive way. Domain experts should bring to the table specific problems to which they can devote time and/or funding, as well as some reasons why they think CT might be relevant to this application.

Invited Speakers:
John Baez (University of California at Riverside) and John Foley (Metron Scientific Solutions).
Bob Coecke (University of Oxford).
Dusko Pavlovic (University of Hawaii).

Some other likely participants include Chris Boner (Metron), Arquimedes Canedo (Siemens at Princeton), Stephane Dugowson (Supméca), William Edmonson (North Carolina A&T), Brendan Fong (MIT), Mark Fuge (University of Maryland), Jack Gray (Penumbra), Helle Hansen (Delft), Jelle Herold (Statebox), Marisa Hughes (Johns Hopkins), Steve Huntsman (BAE Systems), Patrick Johnson (Dassault Systèmes), Al Jones (NIST), Cliff Joslyn (Pacific Northwest National Laboratory), Richard Malek (NSF), Tom Mifflin (Metron), Ira Monarch (Carnegie Mellon), John Paschkewitz (DARPA), Evan Patterson (Stanford), Blake Pollard (NIST), Emilie Purvine (Pacific Northwest National Laboratory), Mark Raugas (Pacific Northwest National Laboratory), Bill Regli (University of Maryland), Michael Robinson (American U.), Alberto Speranzon (Honeywell Aerospace), David Spivak (MIT), Eswaran Subrahmanian (Carnegie Mellon), Jamie Vicary (Birmingham and Oxford), and Ryan Wisnesky (Categorical Informatics).

A bunch of us will stay on into the weekend and talk some more. I hope we make a lot of progress—and I plan to let you know how it goes!

I’ll be giving a joint talk with John Foley about our work using operads to design networks. This work is part of the Complex Adaptive System Composition and Design Environment project being done by Metron Scientific Solutions and managed by John Paschkewitz at DARPA.


Pyrofex

4 February, 2018

Mike Stay is applying category theory to computation at a new startup called Pyrofex. And this startup has now entered a deal with RChain.

But let me explain why I’m interested. I’m interested in applied category theory… but this is special.

Mike Stay came to work with me at U.C. Riverside after getting a master’s in computer science at the University of Auckland, where he worked with Cristian Calude on algorithmic randomness. For example:

• Cristian S. Calude and Michael A. Stay, From Heisenberg to Gödel via Chaitin, International Journal of Theoretical Physics 44 (2008), 1053–1065.

• Cristian S. Calude and Michael A. Stay, Most programs stop quickly or never halt, Advances in Applied Mathematics, 40 (2008), 295–308.

It seems like ages ago, but I dimly remember looking at his application, seeing the title of the first of these two papers, and thinking “he’s either a crackpot, or I’m gonna like him”.

You see, the lure of relating Gödel’s theorem to Heisenberg’s uncertainty principle is fatally attractive to many people who don’t really understand either. I looked at the paper, decided he wasn’t a crackpot… and yes, it turned out I liked him.

(A vaguely similar thing happened with my student Chris Rogers, who’d written some papers applying techniques from general relativity to the study of crystals. As soon as I assured myself this stuff was for real, I knew I’d like him!)

Since Mike knew a lot about computer science, his presence at U. C. Riverside emboldened me to give a seminar on classical versus quantum computation. I used this as an excuse to learn the old stuff about the lambda-calculus and cartesian closed categories. When I first started, I thought the basic story would be obvious: people must be making up categories where the morphisms describe processes of computation.

But I soon learned I was wrong: people were making up categories where objects were data types, but the morphisms were equivalence classes of things going between data types—and this equivalence relation completely washed out the difference between, say, a program that actually computes 237 × 419 and a program that just prints out 99303, which happens to be the answer to that problem.

In other words, the actual process of computation was not visible in the category-theoretic framework. I decided that to make it visible, what we really need are 2-categories in which 2-morphisms are ‘processes of computation’. Or in the jargon: objects are types, morphisms are terms, and 2-morphisms are rewrites.
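Here is a toy version of that example, sketched in Python (my illustration, not anything from the papers below):

```python
# Two "terms" that are equal as morphisms in the usual 1-categorical
# semantics, since they have the same input-output behaviour:
def compute():
    return 237 * 419          # actually performs the multiplication

def print_answer():
    return 99303              # just returns the precomputed answer

assert compute() == print_answer()   # equal "extensionally"...
# ...but the processes differ: one multiplies, the other computes nothing.
# That difference is what 2-morphisms (rewrites) are meant to capture.
```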

It turned out people had already thought about this:

• Barnaby P. Hilken, Towards a proof theory of rewriting: the simply-typed 2λ-calculus, Theor. Comp. Sci. 170 (1996), 407–444.

• R. A. G. Seely, Weak adjointness in proof theory in Proc. Durham Conf. on Applications of Sheaves, Springer Lecture Notes in Mathematics 753, Springer, Berlin, 1979, pp. 697–701.

• R. A. G. Seely, Modeling computations: a 2-categorical framework, in Proc. Symposium on Logic in Computer Science 1987, Computer Society of the IEEE, pp. 65–71.

But I felt this viewpoint wasn’t nearly as popular as it should be. It should be very popular, at least among theoretical computer scientists, because it describes what’s actually going on in the lambda-calculus. If you read a big fat book on the lambda-calculus, like Barendregt’s The Lambda Calculus: Its Syntax and Semantics, you’ll see it spends a lot of time on reduction strategies: that is, ways of organizing the process of computation. All this is buried in the usual 1-categorical treatment. It’s living up at the 2-morphism level!
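Here is the classic illustration of why reduction strategy matters, sketched in Python with thunks standing in for unevaluated arguments (the example is standard; the code is mine):

```python
# K = lambda x. 42 discards its argument; omega loops forever.
def omega():
    while True:
        pass

def k_applicative(x):
    return 42

# k_applicative(omega())        # applicative order reduces the argument
                                # first, so this call never returns

def k_normal(thunk):
    return 42                   # the thunk is never forced

print(k_normal(omega))          # normal-order style: prints 42 immediately
```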

Mike basically agreed with me. We started by writing an introduction to the usual 1-categorical stuff, and how it shows up in many different fields:

• John Baez and Michael Stay, Physics, topology, logic and computation: a Rosetta Stone, in New Structures for Physics, ed. Bob Coecke, Lecture Notes in Physics vol. 813, Springer, Berlin, 2011, pp. 95–172.

For financial reasons he had to leave U. C. Riverside and take a job at Google. But he finished his Ph.D. at the University of Auckland, with Cristian Calude and me as co-advisors. And a large chunk of his thesis dealt with cartesian closed 2-categories and their generalizations suitable for quantum computation:

• Michael Stay, Compact closed bicategories, Theory and Applications of Categories 31 (2016), 755–798.

Great stuff! My students these days are building on this math and marching ahead.

I said Mike ‘basically’ agreed with me. He agreed that we need to go beyond the usual 1-categorical treatment to model the process of computation. But when it came to applying this idea to computer science, Mike wasn’t satisfied with thinking about the lambda-calculus. That’s an old model of computation: it’s okay for a single primitive computer, but not good for systems where different parts are sending messages to each other, like the internet, or even multiprocessing in a single computer. In other words, the lambda-calculus doesn’t really handle the pressing issues of concurrency and distributed computation.

So, Mike wanted to think about more modern formalisms for computation, like the pi-calculus, using 2-categories.

He left Google and wrote some papers with Greg Meredith on these ideas, for example:

• Michael Stay and Lucius Gregory Meredith, Higher category models of the pi-calculus.

• Michael Stay and Lucius Gregory Meredith, Representing operational semantics with enriched Lawvere theories.

The second one takes a key step: moving away from 2-categories to graph-enriched categories, which are simpler and perhaps better.

Then, after various twists and turns, he started a company called Pyrofex with Nash Foster. Or perhaps I should say Foster started a company with Mike, since Foster is the real bigshot of the two. Here’s what their webpage says:

Hello—

My name is Nash Foster, and together with my friend and colleague Mike Stay, I recently founded a company called Pyrofex. We founded our company for one simple reason: we love to build large-scale distributed systems that are always reliable and secure and we wanted to help users like yourself do it more easily.

When Mike and I founded the company, we felt strongly that several key advances in programming language theory would ease the development of everyday large-scale systems. However, we were not sure exactly how to expose the APIs to users or how to build the tool-sets. Grand visions compete with practical necessity, so we spent months talking to users just like you to discover what kinds of things you were most interested in. We spent many hours at white-boards, checking our work against the best CS theory we know. And, of course, we have enjoyed many long days of hacking in the hopes that we can produce something new and useful, for you.

I think this is the most exciting time in history to be a programmer (and I wrote my first program in the early 1980’s). The technologies available today are varied and compelling and exciting. I hope that you’ll be as excited as I am about some of the ideas we discovered while building our software.

And on January 8th, 2018, their company entered into an arrangement with Greg Meredith’s company RChain. Below I’ll quote part of an announcement. I don’t know much about this stuff—at least, not yet. But I’m happy to see some good ideas getting applied in the real world, and especially happy to see Mike doing it.

The RChain Cooperative & Pyrofex Corporation announce Strategic Partnership

The RChain Cooperative and Pyrofex Corporation today announced strategically important service contracts and an equity investment intended to deliver several mutually beneficial blockchain solutions. RChain will acquire 1.1 million shares of Pyrofex Common Stock as a strategic investment. The two companies will ink separate service contracts to reinforce their existing relationship and help to align their business interests.

Pyrofex will develop critical tools and platform components necessary for the long-term success of the RChain platform. These tools are designed to leverage RChain’s unique blockchain environment and make blockchain development simpler, faster, and more effective than ever before. Under these agreements, Pyrofex will develop the world’s first decentralized IDE for writing blockchain smart contracts on the RChain blockchain.

Pyrofex also commits to continuing the core development of RChain’s blockchain platform and to organizing RChain’s global developer events and conferences.

Comments on the News

“We’re thrilled to have an opportunity to strengthen our relationship with the RChain Cooperative in 2018. Their commitment to open-source development mirrors our own corporate values. It’s a pleasure to have such a close relationship with a vibrant open-source community. I’ve rarely seen the kind of excitement the Coop’s members share and we look forward to delivering some great new technology this year.” — Nash E. Foster, Cofounder & CEO, Pyrofex Corp.

“Intuitive development tools are important for us and the blockchain ecosystem as a whole; we’re incredibly glad Pyrofex intends to launch their tools on RChain first. But, Ethereum has been a huge supporter of RChain and we’re pleased that Pyrofex intends to support Solidity developers as well. Having tools that will make it possible for developers to migrate smart contracts between blockchains is going to create tremendous possibilities.” — Lucius Greg Meredith, President, RChain Cooperative

Background

Pyrofex is a software development company co-founded by Dr. Michael Stay, PhD and Nash Foster in 2016. Dr. Stay and Greg Meredith are long-time colleagues and collaborators whose mutual research efforts form the mathematical foundations of RChain’s technology. One example of the work that Greg and Mike have collaborated on is the work on the LADL (Logic as Distributed Law) algorithm. You can watch Dr. Stay present the latest research from the RChain Developers retreat.

Pyrofex and its development team should be familiar to those who follow the RChain Cooperative. They currently employ 14 full-time and several part-time developers dedicated to RChain platform development. Pyrofex CEO Nash Foster and Lead Project Manager Medha Parlikar have helped grow RChain’s development team to an impressive 20+ core devs, with plans to double that by mid-2018. The team now includes multiple PhDs, ex-Googlers, and other world-class talent.

Every Wednesday, you will find Medha on our debrief updating the community with the latest developments in RChain. Here she is announcing the recent node.hello release along with a demo from core developer Chris Kirkwood-Watts.

The working relationship between the RChain Cooperative and Pyrofex has gone so well that the Board of Directors and the community at large have supported Pyrofex’s proposal to develop Cryptofex, the much needed developer tool kit for the decentralized world.

The RChain Cooperative is ecstatic to further develop its relationship with the Pyrofex team.

“As we fly towards Mercury and beyond, we all could use better tools.”
— The RChain Co-op Team.

Listen to the announcement from Greg Meredith as well as a short Q&A with Pyrofex CEO Nash Foster, from a recent community debrief.

The following is an excerpt from the soon to be released Cryptofex Whitepaper.

The Problem: Writing Software is Hard, Compiling is Harder

In 1983, Borland Software Corporation acquired a small compiler called Compas Pascal and released it in the United States as Turbo Pascal. It was the first product to integrate a compiler and the editor in which software was written, and for nearly a decade Borland’s products defined the market for integrated development environments (IDEs).

The year after Borland released Turbo Pascal, Ken Thompson observed the distinct and unique dangers associated with compiler technologies. In his famous Turing Award acceptance speech, Thompson described a mechanism by which a virus can be injected into a compiler such that every binary compiled with that compiler will replicate the virus.

“In demonstrating the possibility of this kind of attack, I picked on the C compiler. I could have picked on any program-handling program such as an assembler, a loader, or even hardware microcode. As the level of program gets lower, these bugs will be harder and harder to detect. A well installed microcode bug will be almost impossible to detect.” — Ken Thompson

Unfortunately, many developers today remain stuck in a world constructed in the early 1980’s. IDEs remain essentially the same, able to solve only those problems that neatly fit onto their laptop’s single Intel CPU. But barely a month ago, on 22nd November 2017, the Intel Corporation released a critical firmware update to the Intel Management Engine and in the intervening weeks, the public at large has become aware of the “Meltdown” bug. The IME and other components are exactly the sort of low-level microcode applications that Thompson warned about. Intel has demonstrated perfectly that in the past 33 years, we have learned little and gone nowhere.

Ironically, we have had a partial solution to these problems for nearly a decade. In 2009, David A. Wheeler published his PhD dissertation, in which he proposed a mechanism by which multiple compilers can be used to verify the correctness of a compiler output. Such a mechanism turns out to be tailor-made for the decentralized blockchain environment. Combining Wheeler’s mechanism with a set of economic incentives for compile farms to submit correct outputs gives us a very real shot at correcting a problem that has plagued us for more than 30 years.
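Here is a rough sketch of the idea, assuming a deterministic helper compile_fn(compiler, source); both the helper and the framing are my paraphrase of Wheeler’s scheme, not his code:

```python
def diverse_double_compile(suspect, source, trusted, compile_fn):
    """Check a suspect compiler binary against its own source, using a
    second, independent ("diverse") trusted compiler."""
    stage1 = compile_fn(trusted, source)   # build the compiler via the trusted route
    stage2 = compile_fn(stage1, source)    # let that result rebuild itself
    regen = compile_fn(suspect, source)    # let the suspect rebuild itself
    # If the suspect faithfully implements `source`, both self-builds are
    # the same compiler, so (assuming deterministic compilation) the two
    # binaries must match bit for bit.
    return stage2 == regen
```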

The Solution: A Distributed and Decentralized Toolchain

If we crack open the development environments at companies like Google and Amazon, many of us would be surprised to discover that very few programs are compiled on a single machine. Already, the most sophisticated organizations in the world have moved to a distributed development environment. This allows them to leverage the cloud, bringing high-performance distributed computing to bear on software development itself. At Google, many thousands of machines churn away compiling code, checking it for correctness, and storing objects to be re-used later. Through clever use of caching and “hermetic” builds, Google makes its builds faster and more computationally efficient than could possibly be done on individual developer workstations. Unfortunately, most of us cannot afford to dedicate thousands of machines to compilation.

The open-source community might be able to build large scale shared compilation environments on the Internet, but Ken Thompson explained to us why we could not trust a shared environment for these workloads. However, in the age of blockchain, it’s now possible to build development environments that harness the power of large-scale compute to compile and check programs against programmer intent. Secure, cheap, and fast — we can get all three.

CryptoFex is just such a Decentralized Integrated Development Environment (DIDE) allowing software engineers to author, test, compile, and statically check their code to ensure that it is secure, efficient, and scalable.


Complex Adaptive System Design (Part 6)

31 October, 2017

I’ve been slacking off on writing this series of posts… but for a good reason: I’ve been busy writing a paper on the same topic! In the process I caught a couple of mistakes in what I’ve said so far. But more importantly, there’s a version out now that you can read:

• John Baez, John Foley, Blake Pollard and Joseph Moeller, Network models.

There will be two talks about this at the AMS special session on Applied Category Theory this weekend at U. C. Riverside: one by John Foley of Metron Inc., and one by my grad student Joseph Moeller. I’ll try to get their talk slides someday. But for now, here’s the basic idea.

Our goal is to build operads suited for designing networks. These could be networks where the vertices represent fixed or moving agents and the edges represent communication channels. More generally, they could be networks where the vertices represent entities of various types, and the edges represent relationships between these entities—for example, that one agent is committed to take some action involving the other. This paper arose from an example where the vertices represent planes, boats and drones involved in a search and rescue mission in the Caribbean. However, even for this one example, we wanted a flexible formalism that can handle networks of many kinds, described at a level of detail that the user is free to adjust.

To achieve this flexibility, we introduced a general concept of ‘network model’. Simply put, a network model is a kind of network. Any network model gives an operad whose operations are ways to build larger networks of this kind by gluing smaller ones. This operad has a ‘canonical’ algebra where the operations act to assemble networks of the given kind. But it also has other algebras, where it acts to assemble networks of this kind equipped with extra structure and properties. This flexibility is important in applications.

What exactly is a ‘kind of network’? That’s the question we had to answer. We started with some examples. At the crudest level, we can model networks as simple graphs. If the vertices are agents of some sort and the edges represent communication channels, this means we allow at most one channel between any pair of agents.

However, simple graphs are too restrictive for many applications. If we allow multiple communication channels between a pair of agents, we should replace simple graphs with ‘multigraphs’. Alternatively, we may wish to allow directed channels, where the sender and receiver have different capabilities: for example, signals may only be able to flow in one direction. This requires replacing simple graphs with ‘directed graphs’. To combine these features we could use ‘directed multigraphs’.

But none of these are sufficiently general. It’s also important to consider graphs with colored vertices, to specify different types of agents, and colored edges, to specify different types of channels. This leads us to ‘colored directed multigraphs’.

All these are examples of what we mean by a ‘kind of network’, but none is sufficiently general. More complicated kinds, such as hypergraphs or Petri nets, are likely to become important as we proceed.

Thus, instead of separately studying all these kinds of networks, we introduced a unified notion that subsumes all these variants: a ‘network model’. Namely, given a set C of ‘vertex colors’, a network model is a lax symmetric monoidal functor

F: \mathbf{S}(C) \to \mathbf{Cat}

where \mathbf{S}(C) is the free strict symmetric monoidal category on C and \mathbf{Cat} is the category of small categories.

Unpacking this somewhat terrifying definition takes a little work. It simplifies in the special case where F takes values in \mathbf{Mon}, the category of monoids. It simplifies further when C is a singleton, since then \mathbf{S}(C) is the groupoid \mathbf{S}, where objects are natural numbers and morphisms from m to n are bijections

\sigma: \{1,\dots,m\} \to \{1,\dots,n\}

If we impose both these simplifying assumptions, we have what we call a one-colored network model: a lax symmetric monoidal functor

F : \mathbf{S} \to \mathbf{Mon}

As we shall see, the network model of simple graphs is a one-colored network model, and so are many other motivating examples. If you like André Joyal’s theory of ‘species’, then one-colored network models should be pretty fun, since they’re species with some extra bells and whistles.

But if you don’t, there’s still no reason to panic. In relatively down-to-earth terms, a one-colored network model amounts to roughly this. If we call elements of F(n) ‘networks with n vertices’, then:

• Since F(n) is a monoid, we can overlay two networks with the same number of vertices and get a new one. We call this operation

\cup \colon F(n) \times F(n) \to F(n)

• Since F is a functor, the symmetric group S_n acts on the monoid F(n). Thus, for each \sigma \in S_n, we have a monoid automorphism that we call simply

\sigma \colon F(n) \to F(n)

• Since F is lax monoidal, we also have an operation

\sqcup \colon F(m) \times F(n) \to F(m+n)

We call this operation the disjoint union of networks. In examples like simple graphs, it looks just like what it sounds like.

Unpacking the abstract definition further, we see that these operations obey some equations, which we list in Theorem 11 of our paper. They’re all obvious if you draw pictures of examples… and don’t worry, our paper has a few pictures. (We plan to add more.) For example, the ‘interchange law’

(g \cup g') \sqcup (h \cup h') = (g \sqcup h) \cup (g' \sqcup h')

holds whenever g,g' \in F(m) and h, h' \in F(n). This is a nice relationship between overlaying networks and taking their disjoint union.
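For the concretely minded, here is a minimal sketch in Python of the network model of simple graphs; the encoding of a graph as a set of unordered pairs is my choice, made for illustration:

```python
# Sketch: the one-colored network model of simple graphs. An element of
# F(n) is a simple graph on vertices 1,...,n, encoded as a frozenset of
# unordered pairs.

def edge(i, j):
    return frozenset({i, j})

def overlay(g, h):
    """The monoid operation on F(n): overlay two graphs on n vertices."""
    return g | h

def permute(sigma, g):
    """The S_n action: relabel vertices along a bijection sigma (a dict)."""
    return frozenset(edge(sigma[i], sigma[j]) for i, j in map(tuple, g))

def disjoint_union(g, h, m):
    """The lax structure map F(m) x F(n) -> F(m+n): shift h up by m."""
    return g | frozenset(edge(i + m, j + m) for i, j in map(tuple, h))

# Check the interchange law on a small example:
g, gp = frozenset({edge(1, 2)}), frozenset({edge(2, 3)})   # in F(3)
h, hp = frozenset({edge(1, 4)}), frozenset({edge(3, 4)})   # in F(4)
lhs = disjoint_union(overlay(g, gp), overlay(h, hp), 3)
rhs = overlay(disjoint_union(g, h, 3), disjoint_union(gp, hp, 3))
assert lhs == rhs
```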

In Section 2 of our paper we study one-colored network models, and give lots of examples. In Section 3 we describe a systematic procedure for getting one-colored network models from monoids. In Section 4 we study general network models and give examples of these. In Section 5 we describe a category \mathbf{NetMod} of network models, and show that the procedure for getting network models from monoids is functorial. We also make \mathbf{NetMod} into a symmetric monoidal category, and give examples of how to build new network models by tensoring old ones.

Our main result is that any network model gives a typed operad, also known as a ‘colored operad’. This operad has operations that describe how to stick networks of the given kind together to form larger networks of this kind. This operad has a ‘canonical algebra’, where it acts on networks of the given kind—but the real point is that it has lots of other algebras, where it acts on networks of the given kind equipped with extra structure and properties.

The technical heart of our paper is Section 6, mainly written by Joseph Moeller. This provides the machinery to construct operads from network models in a functorial way. Category theorists should find this section interesting, because it describes enhancements of the well-known ‘Grothendieck construction’ of the category of elements \int F of a functor

F: \mathbf{C} \to \mathbf{Cat}

where \mathbf{C} is any small category. For example, if \mathbf{C} is symmetric monoidal and F : \mathbf{C} \to (\mathbf{Cat}, \times) is lax symmetric monoidal, then we show \int F is symmetric monoidal. Moreover, we show that the construction sending the lax symmetric monoidal functor F to the symmetric monoidal category \int F is functorial.
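In case it’s unfamiliar, here is the standard definition of the category of elements, stated for convenience. Given a functor F: \mathbf{C} \to \mathbf{Cat}, an object of \int F is a pair (c, x) where c is an object of \mathbf{C} and x is an object of the category F(c). A morphism

(f, g) \colon (c, x) \to (c', x')

consists of a morphism f \colon c \to c' in \mathbf{C} together with a morphism g \colon F(f)(x) \to x' in F(c').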

In Section 7 we apply this machinery to build operads from network models. In Section 8 we describe some algebras of these operads, including an algebra whose elements are networks of range-limited communication channels. In future work we plan to give many more detailed examples, and to explain how these algebras, and the homomorphisms between them, can be used to design and optimize networks.

I want to explain all this in more detail—this is a pretty hasty summary, since I’m busy this week. But for now you can read the paper!


Some posts in this series:

Part 1. CASCADE: the Complex Adaptive System Composition and Design Environment.

Part 2. Metron’s software for system design.

Part 3. Operads: the basic idea.

Part 4. Network operads: an easy example.

Part 5. Algebras of network operads: some easy examples.

Part 6. Network models.

Part 7. Step-by-step compositional design and tasking using commitment networks.


Postdoc in Applied Category Theory

8 September, 2017

guest post by Spencer Breiner

One Year Postdoc Position at Carnegie Mellon/NIST

We are seeking an early-career researcher with a background in category theory, functional programming and/or electrical engineering for a one-year post-doctoral position supported by an Early-concept Grant (EAGER) from the NSF’s Systems Science program. The position will be managed through Carnegie Mellon University (PI: Eswaran Subrahmanian), but the position itself will be located at the US National Institute of Standards and Technology (NIST) in Gaithersburg, Maryland, outside of Washington, DC.

The project aims to develop a compositional semantics for electrical networks which is suitable for system prediction, analysis and control. This work will extend existing methods for linear circuits (featured on this blog!) to include (i) probabilistic estimates of future consumption and (ii) top-down incentives for load management. We will model a multi-layered system of such “distributed energy resources” including loads and generators (e.g., solar array vs. power plant), different types of resource aggregation (e.g., apartment to apartment building), and across several time scales. We hope to demonstrate that such a system can balance local load and generation in order to minimize expected instability at higher levels of the electrical grid.

This post is available full-time (40 hours/5 days per week) for 12 months, and can begin as early as October 1st.

For more information on this position, please contact Dr. Eswaran Subrahmanian (sub@cmu.edu) or Dr. Spencer Breiner (spencer.breiner@nist.gov).


Complex Adaptive System Design (Part 5)

4 September, 2017

When we design a complex system, we often start with a rough outline and fill in details later, one piece at a time. And if the system is supposed to be adaptive, these details may need to be changed as the system is actually being used!

The use of operads should make this easier. One reason is that an operad typically has more than one algebra.

Remember from Part 3: an operad has operations, which are abstract ways of sticking things together. An algebra makes these operations concrete: it specifies some sets of actual things, and how the operations in the operad get implemented as actual ways to stick these things together.

So, an operad O can have one algebra in which things are described in a bare-bones, simplified way, and another algebra in which things are described in more detail. Indeed it will typically have many algebras, corresponding to many levels of detail, but let’s just think about two for a minute.

When we have a ‘less detailed’ algebra A and a ‘more detailed’ algebra A', they will typically be related by a map

f : A' \to A

which ‘forgets the extra details’. This map should be a ‘homomorphism’ of algebras, but I’ll postpone the definition of that concept.

What we often want to do, when designing a system, is not forget extra detail, but rather add extra detail to some rough specification. There is not always a systematic way to do this. If there is, then we may have a homomorphism

g : A \to A'

going back the other way. This is wonderful, because it lets us automate the process of filling in the details. But we can’t always count on being able to do this—especially not if we want an optimal or even acceptable result. So, often we may have to start with an element of A and search for elements of A' that are mapped to it by f : A' \to A.

Let me give some examples. I’ll take the operad that I described last time, and describe some of its algebras, and homomorphisms between these.

I’ll start with an algebra that has very little detail: its elements will be simple graphs. As the name suggests, these are among the simplest possible ways of thinking about networks. They just look like this:

Then I’ll give an algebra with more detail, where the vertices of our simple graphs are points in the plane. There’s nothing special about the plane: we could replace the plane by any other set, and get another algebra of our operad. For example, we could use the set of points on the surface of the Caribbean Sea, the blue stuff in the rectangle here:

That’s what we might use in a search and rescue operation. The points could represent boats, and the edges could represent communication channels.

Then I’ll give an algebra with even more detail, where two points connected by an edge can’t be too far apart. This would be good for range-limited communication channels.

Then I’ll give an algebra with still more detail, where the locations of the points are functions of time. Now our boats are moving around!

Okay, here we go.

The operad from last time was called O_G. Here G is the network model of simple graphs. The best way to picture an operation of O_G is as a way of sticking together a list of simple graphs to get a new simple graph.

For example, an operation

f \in O_G(3,4,2;9)

is a way of sticking together a simple graph with 3 vertices, one with 4 vertices and one with 2 vertices to get one with 9 vertices. Here’s a picture of such an operation:

Note that this operation is itself a simple graph. An operation in O_G(3,4,2;9) is just a simple graph with 9 vertices, where we have labelled the vertices from 1 to 9.

This operad comes with a very obvious algebra A where the operations do just what I suggested. In this algebra, an element of A(t) is a simple graph with t vertices, listed in order. Here t is any natural number, which I’m calling ‘t’ for ‘type’.

We also need to say how the operations in O_G act on these sets A(t). If we take simple graphs in A(3), A(4), and A(2):

we can use our operation f to stick them together and get this:
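In code, this canonical action can be sketched as follows (the representation is mine, chosen for illustration):

```python
# Sketch: the canonical algebra of O_G. An operation in O_G(3,4,2;9) is a
# simple graph on the vertices 1,...,9; it acts by shifting its inputs
# onto disjoint blocks of vertices and then overlaying its own edges.

def edge(i, j):
    return frozenset({i, j})

def shift(g, k):
    return frozenset(edge(i + k, j + k) for i, j in map(tuple, g))

def act(f, inputs):
    """f: the operation's edges; inputs: list of (graph, vertex_count)."""
    out, offset = frozenset(), 0
    for g, n in inputs:
        out |= shift(g, offset)
        offset += n
    return out | frozenset(f)

g3 = frozenset({edge(1, 2)})                 # a graph in A(3)
g4 = frozenset({edge(1, 3), edge(2, 3)})     # a graph in A(4)
g2 = frozenset({edge(1, 2)})                 # a graph in A(2)
f = {edge(3, 4), edge(7, 8)}                 # an operation in O_G(3,4,2;9)
print(sorted(map(sorted, act(f, [(g3, 3), (g4, 4), (g2, 2)]))))
```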

But we can also make up a more interesting algebra of O_G. Let’s call this algebra A'. We’ll let an element of A'(t) be a simple graph with t vertices, listed in order, which are points in the plane.

My previous pictures can be reused to show how operations in O_G act on this new algebra A'. The only difference is that now we treat the vertices literally as points in the plane! Before, you should have been imagining them as abstract points not living anywhere; now they have locations.

Now let’s make up an even more detailed algebra A''.

What if our communication channels are ‘range-limited’? For example, what if two boats can’t communicate if they are more than 100 kilometers apart?

Then we can let an element of A''(t) be a simple graph with t vertices in the plane such that no two vertices connected by an edge have distance > 100.

Now the operations of our operad O_G act in a more interesting way. If we have an operation, and we apply it to elements of our algebra, it ‘tries’ to put in new edges as it did before, but it ‘fails’ for any edge that would have length > 100. In other words, we just leave out any edges that would be too long.

It took me a while to figure this out. At first I thought the result of the operation would need to be undefined whenever we tried to create an edge that violated the length constraint. But in fact it acts in a perfectly well-defined way: we just don’t put in edges that would be too long!

This is good. This means that if you tell two boats to set up a communication channel, and they’re too far apart, you don’t get the ‘blue screen of death’: your setup doesn’t crash and burn. Instead, you just get a polite warning—‘communication channel not established’—and you can proceed.
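Here is how that might look in code, with my toy encoding of vertex positions:

```python
# Sketch of the algebra A'': graphs whose vertices are points in the
# plane, where the action simply omits any edge longer than 100.
from math import dist

def clip(edges, pos):
    """Keep only edges whose endpoints are at most 100 apart."""
    return {e for e in edges if dist(pos[min(e)], pos[max(e)]) <= 100}

pos = {1: (0, 0), 2: (40, 30), 3: (250, 0)}
attempted = {frozenset({1, 2}), frozenset({1, 3})}
print(clip(attempted, pos))   # keeps 1-2 (length 50), drops 1-3 (length 250)
```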

The nontrivial part is to check that if we do this, we really get an algebra of our operad! There are some laws that must hold in any algebra. But since I haven’t yet described those laws, I won’t check them here. You’ll have to wait for our paper to come out.

Let’s do one more algebra today. For lack of creativity I’ll call it A'''. Now an element of A'''(t) is a time-dependent graph in the plane with t vertices, listed in order. Namely, the positions of the vertices depend on time, and the presence or absence of an edge between two vertices can also depend on time. Furthermore, let’s impose the requirement that any two vertices can only be connected by an edge at times when their distance is ≤ 100.

When I say ‘functions of time’ here, what is ‘time’? We can model time by some interval [T_1, T_2]. But if you don’t like that, you can change it.

This algebra A''' works more or less like A''. The operations of O_G try to create edges, but these edges only ‘take’ at times when the vertices they connect have distance ≤ 100.

There’s something here you might not like. Our operations can only try to create edges ‘for all times’… and succeed at times when the vertices are close enough. We can’t try to set up a communication channel for a limited amount of time.

But fear not: this is just a limitation in our chosen network model, ‘simple graphs’. With a fancier network model, we’d get a fancier operad, with fancier operations. Right now I’m trying to keep the operad simple (pun not intended), and show you a variety of different algebras.

As you might expect, we have algebra homomorphisms going from more detailed algebras to less detailed ones:

f_T : A''' \to A'', \quad h : A' \to A

The homomorphism h takes a simple graph in the plane and forgets the location of its vertices. The homomorphism f_T depends on a choice of time T \in [T_1, T_2]. For any time T, it takes a time-dependent graph in the plane and evaluates it at that time, getting a graph in the plane (which obeys the distance constraints, since the time-dependent graph obeyed those constraints at any time).

We do not have a homomorphism g: A'' \to A' that takes a simple graph in the plane obeying our distance constraints and forgets about those constraints. There’s a map g sending elements of A'' to elements of A' in this way. But it’s not an algebra homomorphism! The problem is that first trying to connect two graphs with an edge and then applying g may give a different result than first applying g and then connecting two graphs with an edge.
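Here is that failure as a concrete computation, using the same toy encoding as above:

```python
from math import dist

def clip(edges, pos):                   # the A'' action, as in the sketch above
    return {e for e in edges if dist(pos[min(e)], pos[max(e)]) <= 100}

pos = {1: (0, 0), 2: (250, 0)}          # two vertices 250 apart
attempt = {frozenset({1, 2})}           # an operation tries to join them

act_then_g = clip(attempt, pos)         # in A'': edge dropped, then forget -> set()
g_then_act = attempt                    # in A': forget first, then act -> edge present
assert act_then_g != g_then_act         # so g is not a homomorphism
```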

In short: a single operad has many algebras, which we can use to describe our desired system at different levels of detail. Algebra homomorphisms relate these different levels of detail.

Next time I’ll look at some more interesting algebras of the same operad. For example, there’s one that describes a system of interacting mobile agents, which move around in some specific way, determined by their location and the locations of the agents they’re communicating with.

Even this is just the tip of the iceberg—that is, still a rather low level of detail. We can also introduce stochasticity (that is, randomness). And to go even further, we could switch to a more sophisticated operad, based on a fancier ‘network model’.

But not today.


Some posts in this series:

Part 1. CASCADE: the Complex Adaptive System Composition and Design Environment.

Part 2. Metron’s software for system design.

Part 3. Operads: the basic idea.

Part 4. Network operads: an easy example.

Part 5. Algebras of network operads: some easy examples.

Part 6. Network models.

Part 7. Step-by-step compositional design and tasking using commitment networks.


Norbert Blum on P versus NP

15 August, 2017

There’s a new paper on the arXiv that claims to solve a hard problem:

• Norbert Blum, A solution of the P versus NP problem.

Most papers that claim to solve hard math problems are wrong: that’s why these problems are considered hard. But these papers can still be fun to look at, at least if they’re not obviously wrong. It’s fun to hope that maybe today humanity has found another beautiful grain of truth.

I’m not an expert on the P versus NP problem, so I have no opinion on this paper. So don’t get excited: wait calmly by your radio until you hear from someone who actually works on this stuff.

I found the first paragraph interesting, though. Here it is, together with some highly non-expert commentary. Beware: everything I say could be wrong!

Understanding the power of negations is one of the most challenging problems in complexity theory. With respect to monotone Boolean functions, Razborov [12] was the first who could shown that the gain, if using negations, can be super-polynomial in comparision to monotone Boolean networks. Tardos [16] has improved this to exponential.

I guess a ‘Boolean network’ is like a machine where you feed in a string of bits and it computes new bits using the logical operations ‘and’, ‘or’ and ‘not’. If you leave out ‘not’ the Boolean network is monotone, since then making more inputs equal to 1, or ‘true’, is bound to make more of the output bits 1 as well. Blum is saying that including ‘not’ makes some computations vastly more efficient… but that this stuff is hard to understand.
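For instance, here is a toy monotone network in Python (my example, not Blum’s), together with a brute-force check of monotonicity:

```python
# A toy monotone Boolean network: only AND and OR, no NOT.
from itertools import product

def net(x1, x2, x3):
    return (x1 and x2) or x3

# Monotonicity: raising any single input from 0 to 1 never lowers the output.
for bits in product([0, 1], repeat=3):
    for i, b in enumerate(bits):
        if b == 0:
            raised = list(bits); raised[i] = 1
            assert net(*bits) <= net(*raised)
```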

For the characteristic function of an NP-complete problem like the clique function, it is widely believed that negations cannot help enough to improve the Boolean complexity from exponential to polynomial.

A bunch of nodes in a graph are a clique if each of these nodes is connected by an edge to every other. Determining whether a graph with n vertices has a clique with more than k nodes is a famous problem: the clique decision problem.

For example, here’s a brute-force search for a clique with at least 4 nodes:

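In code, such a brute-force search might look like this (a sketch; the graph representation is mine):

```python
# Brute force: try every set of 4 nodes and test whether all pairs are
# joined by an edge.
from itertools import combinations

def has_clique(graph, k):
    """graph: dict mapping each node to its set of neighbours."""
    return any(all(v in graph[u] for u, v in combinations(nodes, 2))
               for nodes in combinations(graph, k))

g = {1: {2, 3, 4, 5}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3}, 5: {1}}
print(has_clique(g, 4))   # True: {1, 2, 3, 4} is a clique
```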
The clique decision problem is NP-complete. This means that if you can solve it with a Boolean network whose complexity grows like some polynomial in n, then P = NP. But if you can’t, then P ≠ NP.

(Don’t ask me what the complexity of a Boolean network is; I can guess but I could get it wrong.)

I guess Blum is hinting that the best monotone Boolean network for solving the clique decision problem has a complexity that’s exponential in n. And then he’s saying it’s widely believed that not gates can’t reduce the complexity to a polynomial.

Since the computation of an one-tape Turing machine can be simulated by a non-monotone Boolean network of size at most the square of the number of steps [15, Ch. 3.9], a superpolynomial lower bound for the non-monotone network complexity of such a function would imply P ≠ NP.

Now he’s saying what I said earlier: if you show it’s impossible to solve the clique decision problem with any Boolean network whose complexity grows like some polynomial in n, then you’ve shown P ≠ NP. This is how Blum intends to prove P ≠ NP.

For the monotone complexity of such a function, exponential lower bounds are known [11, 3, 1, 10, 6, 8, 4, 2, 7].

Should you trust someone who claims they’ve proved P ≠ NP, but can’t manage to get their references listed in increasing order?

But until now, no one could prove a non-linear lower bound for the nonmonotone complexity of any Boolean function in NP.

That’s a great example of how helpless we are: we’ve got all these problems whose complexity should grow faster than any polynomial, and we can’t even prove their complexity grows faster than linear. Sad!

An obvious attempt to get a super-polynomial lower bound for the non-monotone complexity of the clique function could be the extension of the method which has led to the proof of an exponential lower bound of its monotone complexity. This is the so-called “method of approximation” developed by Razborov [11].

I don’t know about this. All I know is that Razborov and Rudich proved a whole bunch of strategies for proving P ≠ NP can’t possibly work. These strategies are called ‘natural proofs’. Here are some friendly blog articles on their result:

• Timothy Gowers, How not to prove that P is not equal to NP, 3 October 2013.

• Timothy Gowers, Razborov and Rudich’s natural proofs argument, 7 October 2013.

From these I get the impression that what Blum calls ‘Boolean networks’ may be what other people call ‘Boolean circuits’. But I could be wrong!

Continuing:

Razborov [13] has shown that his approximation method cannot be used to prove better than quadratic lower bounds for the non-monotone complexity of a Boolean function.

So, this method is unable to prove some NP problem can’t be solved in polynomial time and thus prove P ≠ NP. Bummer!

But Razborov uses a very strong distance measure in his proof for the inability of the approximation method. As elaborated in [5], one can use the approximation method with a weaker distance measure to prove a super-polynomial lower bound for the non-monotone complexity of a Boolean function.

This reference [5] is to another paper by Blum. And in the end, he claims to use similar methods to prove that the complexity of any Boolean network that solves the clique decision problem must grow faster than a polynomial.

So, if you’re trying to check his proof that P ≠ NP, you should probably start by checking that other paper!

The picture below, by Behnam Esfahbod on Wikicommons, shows the two possible scenarios. The one at left is the one Norbert Blum claims to have shown we’re in.