Applied Category Theory Course: Databases

6 June, 2018

 

In my online course we’re now into the third chapter of Fong and Spivak’s book Seven Sketches. Now we’re talking about databases!

To some extent this is just an excuse to (finally) introduce categories, functors, natural transformations, adjoint functors and Kan extensions. Great stuff, and databases are a great source of easy examples.

But it’s also true that Spivak helps run a company called Categorical Informatics that actually helps design databases using category theory! And his partner, Ryan Wisnesky, would be happy to talk to people about it. If you’re interested, click the link: he’s attending my course.

To read and join discussions on Chapter 3 go here:

Chapter 3

You can also do exercises and puzzles, and see other people’s answers to these.

Here are the lectures I’ve given so far:

Lecture 34 – Chapter 3: Categories
Lecture 35 – Chapter 3: Categories versus Preorders
Lecture 36 – Chapter 3: Categories from Graphs
Lecture 37 – Chapter 3: Presentations of Categories
Lecture 38 – Chapter 3: Functors
Lecture 39 – Chapter 3: Databases
Lecture 40 – Chapter 3: Relations
Lecture 41 – Chapter 3: Composing Functors
Lecture 42 – Chapter 3: Transforming Databases
Lecture 43 – Chapter 3: Natural Transformations
Lecture 44 – Chapter 3: Categories, Functors and Natural Transformations
Lecture 45 – Chapter 3: Composing Natural Transformations
Lecture 46 – Chapter 3: Isomorphisms
Lecture 47 – Chapter 3: Adjoint Functors
Lecture 48 – Chapter 3: Adjoint Functors
Lecture 49 – Chapter 3: Kan Extensions
Lecture 50 – Chapter 3: Kan Extensions


Workshop on Compositional Approaches

24 May, 2018

This looks great too:

Workshop on Compositional Approaches in Physics, Natural Language Processing, and Social Sciences, 2 September 2018, Nice, France.

Compositional Approaches for Physics, NLP, and Social Sciences (CAPNS 2018) will be colocated with QI 2018. The workshop is a continuation and extension of the Workshop on Semantic Spaces at the Intersection of NLP, Physics and Cognitive Science held in June 2016.

AIMS AND SCOPE
The ability to compose parts to form a more complex whole, and to analyze a whole as a combination of elements, is desirable across disciplines. In this workshop we bring together researchers applying compositional approaches to NLP, Physics, Cognitive Science, and Game Theory. The interplay between these disciplines will foster theoretically motivated approaches to understanding how meanings of words interact in sentences and discourse, how concepts develop, and how complex games can be analyzed. Commonalities between the compositional mechanisms employed may be extracted, and applications and phenomena traditionally thought of as ‘non-compositional’ will be examined.

Topics of interests include (but are not restricted to):
Applications of quantum logic in natural language processing and cognitive science
Compositionality in vector space models of meaning
Compositionality in conceptual spaces
Compositional approaches to game theory
Reasoning in vector spaces and conceptual spaces
Conceptual spaces in linguistics
Game-theoretic models of language and conceptual change
Diagrammatic reasoning for natural language processing, cognitive science, and game theory
Compositional explanations of so-called ‘non-compositional’ phenomena such as metaphor

IMPORTANT DATES:
June 30th: Paper submission
July 15th: Notification to contributors
September 2nd: Workshop date

CONFIRMED SPEAKERS:
Gerhard Jäger, Professor of General Linguistics, University of Tübingen
Paul Smolensky, Principal Researcher, Microsoft Research, and Krieger-Eisenhower Professor of Cognitive Science, Johns Hopkins University

SUBMISSIONS:
We invite:
Original contributions (up to 12 pages) of previously unpublished work. Submission of substantial, albeit partial results of work in progress is welcomed.

Extended abstracts (3 pages) of previously published work that is recent and relevant to the workshop. These should include a link to a separately published paper or preprint.

Contributions should be submitted at:
https://easychair.org/conferences/?conf=capns2018

PROGRAMME COMMITTEE:
Peter Bruza, Queensland University of Technology
Trevor Cohen, University of Texas
Fredrik Nordvall Forsberg, University of Strathclyde
Liane Gabora, University of British Columbia
Peter Gärdenfors, Lund University
Helle Hvid Hansen, TU Delft
Chris Heunen, University of Edinburgh
Peter Hines, University of York
Alexander Kurz, University of Leicester
Antonio Lieto, University of Turin
Glyn Morrill, Universitat Politècnica de Catalunya
Dusko Pavlovic, University of Hawaii
Taher Pilehvar, University of Cambridge
Emmanuel Pothos, City, University of London
Matthew Purver, Queen Mary University of London
Mehrnoosh Sadrzadeh, Queen Mary University of London
Marta Sznajder, Munich Center for Mathematical Philosophy
Pawel Sobocinski, University of Southampton
Dominic Widdows, Grab Technologies
Geraint Wiggins, Vrije Universiteit Brussel
Victor Winschel, OICOS GmbH
Philipp Zahn, University of St. Gallen
Frank Zenker, University of Konstanz

ORGANIZATION:
Bob Coecke, University of Oxford
Jules Hedges, University of Oxford
Dimitri Kartsaklis, University of Cambridge
Martha Lewis, ILLC, University of Amsterdam
Dan Marsden, University of Oxford


RChain

12 May, 2018

guest post by Christian Williams

Mike Stay has been doing some really cool stuff since earning his doctorate. He’s been collaborating with Greg Meredith, who studied the π-calculus with Abramsky, and then conducted impactful research and design in the software industry before some big ideas led him into the new frontier of decentralization. They and a great team are developing RChain, a distributed computing infrastructure based on the reflective higher-order π-calculus, the ρ-calculus.

They’ve made significant progress in the first year, and on April 17-18 they held the RChain Developer Conference in Boulder, Colorado. Just five months ago, the first conference was a handful of people; now this received well over a hundred. Programmers, venture capitalists, blockchain enthusiasts, experts in software, finance, and mathematics: myriad perspectives from around the globe came to join in the dawn of a new internet. Let’s just say, it’s a lot to take in. This project is the real deal – the idea is revolutionary, the language is powerful, the architecture is elegant; the ambition is immense and skilled developers are actually bringing it to reality. There’s no need for hype: you’re gonna be hearing about RChain.

Documentation , GitHub , Architecture

Here’s something from the documentation:

The open-source RChain project is building a decentralized, economic, censorship-resistant, public compute infrastructure and blockchain. It will host and execute programs popularly referred to as “smart contracts”. It will be trustworthy, scalable, concurrent, with proof-of-stake consensus and content delivery.

The decentralization movement is ambitious and will provide awesome opportunities for new social and economic interactions. Decentralization also provides a counterbalance to abuses and corruption that occasionally occur in large organizations where power is concentrated. Decentralization supports self-determination and the rights of individuals to self-organize. Of course, the realities of a more decentralized world will also have its challenges and issues, such as how the needs of international law, public good, and compassion will be honored.

We admire the awesome innovations of Bitcoin, Ethereum, and other platforms that have dramatically advanced the state of decentralized systems and ushered in this new age of cryptocurrency and smart contracts. However, we also see symptoms that those projects did not use the best engineering and formal models for scaling and correctness in order to support mission-critical solutions. The ongoing debates about scaling and reliability are symptomatic of foundational architectural issues. For example, is it a scalable design to insist on an explicit serialized processing order for all of a blockchain’s transactions conducted on planet earth?

To become a blockchain solution with industrial-scale utility, RChain must provide content delivery at the scale of Facebook and support transactions at the speed of Visa. After due diligence on the current state of many blockchain projects, after deep collaboration with other blockchain developers, and after understanding their respective roadmaps, we concluded that the current and near-term blockchain architectures cannot meet these requirements. In mid-2016, we resolved to build a better blockchain architecture.

Together with the blockchain industry, we are still at the dawn of this decentralized movement. Now is the time to lay down a solid architectural foundation. The journey ahead for those who share this ambitious vision is as challenging as it is worthwhile, and this document summarizes that vision and how we seek to accomplish it.

We began by admitting the following minimal requirements:

  • Dynamic, responsive, and provably correct smart contracts.
  • Concurrent execution of independent smart contracts.
  • Data separation to reduce unnecessary data replication of otherwise independent tokens and smart contracts.
  • Dynamic and responsive node-to-node communication.
  • Computationally non-intensive consensus/validation protocol.

Building quality software is challenging. It is easier to build “clever” software; however, the resulting software is often of poor quality, riddled with bugs, difficult to maintain, and difficult to evolve. Inheriting and working on such software can be hellish for development teams, not to mention their customers. When building an open-source system to support a mission-critical economy, we reject a minimal-success mindset in favor of end-to-end correctness.

To accomplish the requirements above, our design approach is committed to:

  • A computational model that assumes fine-grained concurrency and dynamic network topology.
  • A composable and dynamic resource addressing scheme.
  • The functional programming paradigm, as it more naturally accommodates distributed and parallel processing.
  • Formally verified, correct-by-construction protocols which leverage model checking and theorem proving.
  • The principles of intension and compositionality.

RChain is light years ahead of the industry. Why? It is upholding the principle of correct by construction with the depth and rigor of mathematics. For years, Mike and Greg have been developing original ideas for distributed computation: in particular, logic as a distributive law is an “algorithm for deriving a spatial-behavioral type system from a formal presentation of a computational calculus.” This is a powerful way to integrate operational semantics into a language, and prove soundness with a single natural transformation; it also provides an extremely expressive query language, with which you could search the entire world to find “code that does x”. Mike’s strong background in higher category theory has enabled the formalization of Greg’s ideas, which he has developed over decades of thinking deeply and comprehensively about the world of computing. Of all of these, there is one concept which is the heart and pulse of RChain, which unifies the system as a rational whole: the ρ-calculus.

So what’s the big idea? First, some history: back in the late 80s, Greg developed a classification of computational models called “the 4 C’s”:

completeness,
compositionality,
(a good notion of)
complexity, and
concurrency.

He found that there was none which had all four, and predicted the existence of one. Just a few years later, Milner invented the π-calculus, and since then it has reigned as the natural language of network computing. It presents a totally different way of thinking: instead of representing sequential instructions for a single machine, the π-calculus is fundamentally concurrent—processes or agents communicate over names or channels, and computation occurs through the interaction of processes. The language is simple yet remarkably powerful; it is deeply connected with game semantics and linear logic, and has become an essential tool in systems engineering and biocomputing: see mobile process calculi for programming the blockchain.


Here is the basic syntax. The variables x,y are names, and P,Q are processes:

P,Q := 0 | (νx)P | x?(y).P | x!(y).P | P|Q

(do nothing | create new x; run P | receive on x and bind to y; run P | send value y on x; run P | run P and Q in parallel)

The computational engine, the basic reduction analogous to beta-reduction of lambda calculus, is the communication rule:

COMM : x!(y).P|x?(z).Q → P|Q[y/z]

(given parallel output and input processes along the same channel, the value is transferred from the output to the input, and is substituted for all occurrences of the input variable in the continuation process)

The definition of a process calculus must specify structural congruence: these express the equivalences between processes—for example, ({P},|,0) forms a commutative monoid.


The π-calculus reforms computation, on the most basic level, to be a cooperative activity. Why is this important? To have a permanently free internet, we have to be able to support it without reliance on centralized powers. This is one of the simplest points, but there are many deeper reasons which I am not yet knowledgeable enough to express. It’s all about the philosophy of openness which is characteristic of applied category theory: historically, we have developed theories and practices which are isolated from each other and the world, and had to fabricate their interrelation and cooperation ad hoc; this leaves us needlessly struggling with inadequate systems, and limits our thought and action.

Surely there must be a less primitive way of making big changes in the store than by pushing vast numbers of words back and forth through the von Neumann bottleneck. Not only is this tube a literal bottleneck for the data traffic of a problem, but, more importantly, it is an intellectual bottleneck that has kept us tied to word-at-a-time thinking instead of encouraging us to think in terms of the larger conceptual units of the task at hand. Thus programming is basically planning and detailing the enormous traffic of words through the von Neumann bottleneck, and much of that traffic concerns not significant data itself, but where to find it. — John Backus, 1977 ACM Turing Award

There have been various mitigations to these kind of problems, but the cognitive limitation remains, and a total renewal is necessary; the π-calculus completely reimagines the nature of computation in society, and opens vast possibility. We can begin to conceive of the many interwoven strata of computation as a coherent, fluid whole. While I was out of my depth in many talks during the conference, I began to appreciate that this was a truly comprehensive innovation: RChain reforms almost every aspect of computing, from the highest abstraction all the way down to the metal. Coalescing the architecture, as mentioned earlier, is the formal calculus as the core guiding principle. There was some concern that because the ρ-calculus is so different from traditional languages, there may be resistance to adoption; but our era is a paradigm shift, the call is for a new way of thinking, and we must adapt.

So why are we using the reflective higher-order π-calculus, the ρ-calculus? Because there’s just one problem with the conventional π-calculus: it presupposes a countably infinite collection of atomic names. These are not only problematic to generate and manage, but the absence of structure is a massive waste. In this regard, the π-calculus was incomplete, until Greg realized that you can “close the loop” with reflection, a powerful form of self-reference:

Code ←→ Data

The mantra is that names are quoted processes; this idea pervades and guides the design of RChain. There is no need to import infinitely many opaque, meaningless symbols—the code itself is nothing but clear, meaningful syntax. If there is an intrinsic method of reference and dereference, or “quoting and unquoting”, code can be turned into data, sent as a message, and then turned back into code; known as “code mobility”, one can communicate big programs as easily as emails. This allows for metaprogramming: on an industrial level, not only people write programs—programs write programs. This is essential to creating a robust virtual infrastructure.

So, how can the π-calculus be made reflective? By solving for the least fixed point of a recursive equation, which parametrizes processes by names:

P[x] = 0 | x?(x).P[x] | x!(P[x]) | P[x]|P[x] | @P[x] | *x

RP = P[RP]

This is reminiscent of how the Y combinator enables recursion by giving the fixed point of any function, Yf = f(Yf). The last two terms of the syntax are reference and dereference, which turn code into data and data into code. Notice that we did not include a continuation for output: the ρ-calculus is asynchronous, meaning that the sender does not get confirmation that the message has been received; this is important for efficient parallel computation and corresponds to polarised linear logic. We adopt the convention that names are output and processes are input. The last two modifications to complete the official ρ-calculus syntax are multi-input and pattern-matching:


P,Q := 0                                                null process

| for(p1←x1,…,pn←xn).P                input guarded process

| x!(@Q)                                        output a name

| *x                                          dereference, evaluate code

| P|Q                                        parallel composition

x,p := @P                                            name or quoted process

(each ‘pi’ is a “pattern” or formula to collect terms on channel ‘xi’—this is extremely useful and general, and enables powerful functionality throughout the system)


Simple. Of course, this is not really a programming language yet, though it is more usable than the pure λ-calculus. Rholang, the actual language of RChain, adds some essential features:

ρ-calculus + variables + useful ground terms + new name construction + arithmetic + pattern matching = Rholang

Here’s the specification, syntax and semantics, and a visualization; explore code and examples in GitHub and learn the contract design in the documentation—you can even try coding on rchain.cloud! For those who don’t like clicking all these links, let’s see just one concrete example of a contract, the basic program in Rholang: a process with persistent state, associated code, and associated addresses. This is a Cell, which stores a value until it is accessed or updated:


contract Cell( get, set, state ) = {
 select {
   case rtn <- get; v <- state => {
     rtn!( *v ) | state!( *v ) | Cell( get, set, state ) }
   case newValue <- set; v <- state => {
     state!( *newValue ) | Cell( get, set, state ) }

}}


The parameters are the channels on which the contract communicates. Cell selects from two possibilities: either it is being accessed, i.e. there is data (the return channel) to receive on get, then it outputs on rtn and maintains its state and call; or it is being updated, i.e. there is data (the new value) to receive on set, then it updates state and calls itself again. This shows how the ontology of the language enables natural recursion, and thereby persistent storage: state is Cell’s way of “talking to itself”—since the sequential aspect of Rholang is functional, one “cycles” data to persist. The storage layer uses a similar idea; the semantics may be related to traced monoidal categories.

Curiously, the categorical semantics of the ρ-calculus has proven elusive. There is the general ideology that λ:sequential :: π:concurrent, that the latter is truly fundamental, but the Curry-Howard-Lambek isomorphism has not yet been generalized canonically—though there has been partial success, involving functor-category denotational semantics, linear logic, and session types. Despite its great power and universality, the ρ-calculus remains a bit of a mystery in mathematics: this fact should intrigue anyone who cares about logic, types, and categories as the foundations of abstract thought.

Now, the actual system—the architecture consists of five interwoven layers (all better explained in the documentation):


Storage: based on Special K – “a pattern language for the web.” This layer stores both key-value pairs and continuations through an essential duality of database and query—if you don’t find what you’re looking for, leave what you would have done in its place, and when it arrives, the process will continue. Greg characterizes this idea as the computational equivalent of the law of excluded middle.

[channel, pattern, data, continuation]

RChain has a refined, multidimensional view of resources – compute, memory, storage, and network—and accounts for their production and consumption linearly.

Execution: a Rho Virtual Machine instance is a context for ρ-calculus reduction of storage elements. The entire state of the blockchain is one big Rholang term, which is updated by a transaction: a receive invokes a process which changes key values, and the difference must be verified by consensus. Keys permanently maintain the entire history of state transitions. While currently based on the Java VM, it will be natively hosted.

Namespace: a set of channels, i.e. resources, organized by a logic for access and authority. The primary significance is scalability – a user does not need to deal with the whole chain, only pertinent namespaces. ‘A namespace definition may control the interactions that occur in the space, for example, by specifying: accepted addresses, namespaces, or behavioral types; maximum or minimum data size; or input-output structure.’ These handle nondeterminism of the two basic “race conditions”, contention for resources:

x!(@Q1) | for(ptrn <- x){P} | x!(@Q2)

for(ptrn <- x){P1} | x!(@Q) | for(ptrn <- x){P2}

Contrasted with flat public keys of other blockchains, domains work with DNS and extend them by a compositional tree structure. Each node as a named channel is itself a namespace, and hence definitions can be built up inductively, with precise control.

Consensus: verify partial orders of changes to the one-big-Rholang-term state; the block structure should persist as a directed acyclic graph. The algorithm is Proof of Stake – the capacity to validate in a namespace is tied to the “stake” one holds in it. Greg explains via tangles, and how the complex CASPER protocol works naturally with RChain.

Contracts: ‘An RChain contract is a well-specified, well-behaved, and formally verified program that interacts with other such programs.’ (K Framework) ; (DAO attack) ‘A behavioral type is a property of an object that binds it to a discrete range of action patterns. Behavioral types constrain not only the structure of input and output, but the permitted order of inputs and outputs among communicating and (possibly) concurrent processes under varying conditions… The Rholang behavioral type system will iteratively decorate terms with modal logical operators, which are propositions about the behavior of those terms. Ultimately properties [such as] data information flow, resource access, will be concretized in a type system that can be checked at compile-time. The behavioral type systems Rholang will support make it possible to evaluate collections of contracts against how their code is shaped and how it behaves. As such, Rholang contracts elevate semantics to a type-level vantage point, where we are able to scope how entire protocols can safely interface.’ (LADL)

So what can you build on RChain? Anything.

Decentralized applications: identity, reputation, tokens, timestamping, financial services, content delivery, exchanges, social networks, marketplaces, (decentralized autonomous) organizations, games, oracles, (Ethereum dApps), … new forms of code yet to be imagined. It’s much more than a better internet: RChain is a potential abstract foundation for a rational global society. The system is a minimalist framework of universally principled design; it is a canvas with which we can begin to explore how the world should really work. If we are open and thoughtful, if we care enough, we can learn to do things right.

The project is remarkably unknown for its magnitude, and building widespread adoption may be one of RChain’s greatest challenges. Granted, it is new; but education will not be easy. It’s too big a reformation for a public mindset which thinks of (technological) progress as incrementally better specs or added features; this is conceptual progression, an alien notion to many. That’s why the small but growing community is vital. This article is nothing; I’m clearly unqualified—click links, read papers, watch videos.  The scale of ambition, the depth of insight, the lucidity of design, the unity of theory and practice—it’s something to behold. And it’s real. Mercury will be complete in December. It’s happening, right now, and you can be a part of it. Learn, spread the word. Get involved; join the discussion or even the development—the RChain website has all the resources you need.

40% of the world population lives within 100km of the ocean. Greg pointed out that if we can’t even handle today’s refugee crises, what will possibly happen when the waters rise? At the very least, we desperately need better large-scale coordination systems. Will we make it to the next millennium? We can—just a matter of will.

Thank you for reading. You are great.


Applied Category Theory 2018 – Videos

30 April, 2018

Some of the talks at Applied Category Theory 2018 were videotaped by the Statebox team. You can watch them on YouTube:

• David Spivak, A higher-order temporal logic for dynamical systems. Book available here and slides here.

• Fabio Zanasi and Bart Jacobs, Categories in Bayesian networks. Paper available here. (Some sound missing; when you hit silence skip forwards to about 15:00.)

• Bob Coecke and Aleks Kissinger, Causality. Paper available here.

• Samson Abramsky, Games and constraint satisfaction, Part 1 and Part 2. Paper available here.

• Dan Ghica, Diagrammatic semantics for digital circuits. Paper available here.

• Kathryn Hess, Towards a categorical approach to neuroscience.

• Tom Leinster, Biodiversity and the theory of magnitude. Papers available here and here.

• John Baez, Props in network theory. Slides available here, paper here and blog article here.


Applied Category Theory Course

26 March, 2018

It just became a lot easier to learn about applied category theory, thanks to this free book:

• Brendan Fong and David Spivak, Seven Sketches in Compositionality: An Invitation to Applied Category Theory.

I’ve started an informal online course based on this book on the Azimuth Forum. I’m getting pretty sick of the superficial quality of my interactions on social media. This could be a way to do something more interesting.

The idea is that you can read chapters of this book, discuss them, try the exercises in the book, ask and answer questions, and maybe team up to create software that implements some of the ideas. I’ll try to keep things moving forward. For example, I’ll explain some stuff and try to help answer questions that people are stuck on. I may also give some talks or run discussions on Google Hangouts or similar software—but only when I have time: I’m more of a text-based guy. I may get really busy some times, and leave the rest of you alone for a while. But I like writing about math for at least 15 minutes a day, and more when I have time. Furthermore, I’m obsessed with applied category theory and plan to stay that way for at least a few more years.

If this sounds interesting, let me know here—and please visit the Azimuth Forum and register! Use your full real name as your username, with no spaces. I will add spaces and that will become your username. Use a real working email address. If you don’t, the registration process may not work.

Over 70 people have registered so far, so this process will take a while.

The main advantage of the Forum over this blog is that you can initiate new threads and edit your comments. Like here you can write equations in LaTeX. Like here, that ability is severely limited: for example you can’t define macros, and you can’t use TikZ. (Maybe someone could fix that.) But equations are better typeset over there—and more importantly, the ability to edit comments makes it a lot easier to correct errors in your LaTeX.

Please let me know what you think.

What follows is the preface to Fong and Spivak’s book, just so you can get an idea of what it’s like.

Preface

Category theory is becoming a central hub for all of pure mathematics. It is unmatched in its ability to organize and layer abstractions, to find commonalities between structures of all sorts, and to facilitate communication between different mathematical communities. But it has also been branching out into science, informatics, and industry. We believe that it has the potential to be a major cohesive force in the world, building rigorous bridges between disparate worlds, both theoretical and practical. The motto at MIT is mens et manus, Latin for mind and hand. We believe that category theory—and pure math in general—has stayed in the realm of mind for too long; it is ripe to be brought to hand.

Purpose and audience

The purpose of this book is to offer a self-contained tour of applied category theory. It is an invitation to discover advanced topics in category theory through concrete real-world examples. Rather than try to give a comprehensive treatment of these topics—which include adjoint functors, enriched categories, proarrow equipments, toposes, and much more–we merely provide a taste. We want to give readers some insight into how it feels to work with these structures as well as some ideas about how they might show up in practice.

The audience for this book is quite diverse: anyone who finds the above description intriguing. This could include a motivated high school student who hasn’t seen calculus yet but has loved reading a weird book on mathematical logic they found at the library. Or a machine learning researcher who wants to understand what vector spaces, design theory, and dynamical systems could possibly have in common. Or a pure mathematician who wants to imagine what sorts of applications their work might have. Or a recently-retired programmer who’s always had an eerie feeling that category theory is what they’ve been looking for to tie it all together, but who’s found the usual books on the subject impenetrable.

For example, we find it something of a travesty that in 2018 there seems to be no introductory material available on monoidal categories. Even beautiful modern introductions to category theory, e.g. by Riehl or Leinster, do not include anything on this rather central topic. The basic idea is certainly not too abstract; modern human intuition seems to include a pre-theoretical understanding of monoidal categories that is just waiting to be formalized. Is there anyone who wouldn’t correctly understand the basic idea being communicated in the following diagram?

Many applied category theory topics seem to take monoidal categories as their jumping off point. So one aim of this book is to provide a reference—even if unconventional—for this important topic.

We hope this book inspires both new visions and new questions. We intend it to be self-contained in the sense that it is approachable with minimal prerequisites, but not in the sense that the complete story is told here. On the contrary, we hope that readers use this as an invitation to further reading, to orient themselves in what is becoming a large literature, and to discover new applications for themselves.

This book is, unashamedly, our take on the subject. While the abstract structures we explore are important to any category theorist, the specific topics have simply been chosen to our personal taste. Our examples are ones that we find simple but powerful, concrete but representative, entertaining but in a way that feels important and expansive at the same time. We hope our readers will enjoy themselves and learn a lot in the process.

How to read this book

The basic idea of category theory—which threads through every chapter—is that if one pays careful attention to structures and coherence, the resulting systems will be extremely reliable and interoperable. For example, a category involves several structures: a collection of objects, a collection of morphisms relating objects, and a formula for combining any chain of morphisms into a morphism. But these structures need to cohere or work together in a simple commonsense way: a chain of chains is a chain, so combining a chain of chains should be the same as combining the chain. That’s it!

We will see structures and coherence come up in pretty much every definition we give: “here are some things and here are how they fit together.” We ask the reader to be on the lookout for structures and coherence as they read the book, and to realize that as we layer abstraction on abstraction, it is the coherence that makes everything function like a well-oiled machine.

Each chapter in this book is motivated by a real-world topic, such as electrical circuits, control theory, cascade failures, information integration, and hybrid systems. These motivations lead us into and through various sorts of category-theoretic concepts.

We generally have one motivating idea and one category-theoretic purpose per chapter, and this forms the title of the chapter, e.g. Chapter 4 is “Collaborative design: profunctors, categorification, and monoidal categories.” In many math books, the difficulty is roughly a monotonically-increasing function of the page number. In this book, this occurs in each chapter, but not so much in the book as a whole. The chapters start out fairly easy and progress in difficulty.

The upshot is that if you find the end of a chapter very difficult, hope is certainly not lost: you can start on the next one and make good progress. This format lends itself to giving you a first taste now, but also leaving open the opportunity for you to come back at a later date and get more deeply into it. But by all means, if you have the gumption to work through each chapter to its end, we very much encourage that!

We include many exercises throughout the text. Usually these exercises are fairly straightforward; the only thing they demand is that the reader’s mind changes state from passive to active, rereads the previous paragraphs with intent, and puts the pieces together. A reader becomes a student when they work the exercises; until then they are more of a tourist, riding on a bus and listening off and on to the tour guide. Hey, there’s nothing wrong with that, but we do encourage you to get off the bus and make contact with the natives as often as you can.


Applied Category Theory at NIST (Part 1)

17 February, 2018

I think it’s really cool how applied category theory is catching on. My former student Blake Pollard is working at the National Institute of Standards and Technology on applications of category theory to electrical engineering. He’s working with Spencer Breiner… and now Breiner is helping run a workshop on this stuff:

• Applied Category Theory: Bridging Theory & Practice, March 15–16, 2018, NIST, Gaithersburg, Maryland, USA. Organized by Spencer Breiner and Eswaran Subrahmanian.

It’s by invitation only, but I can’t resist mentioning its existence. Here’s the idea:

What: The Information Technology Laboratory at NIST is pleased to announce a workshop on Applied Category Theory to be held at NIST’s Gaithersburg, Maryland campus on March 15 & 16, 2018. The meeting will focus on practical avenues for introducing methods from category theory into real-world applications, with an emphasis on examples rather than theorems.

Who: The workshop aims to bring together two distinct groups. First, category theorists interested in pursuing applications outside of the usual mathematical fields. Second, domain experts and research managers from industry, government, science and engineering who have in mind potential domain applications for categorical methods.

Intended Outcomes: A proposed landscape of potential CT applications and the infrastructure needed to realize them, together with a 5-10 year roadmap for developing the field of applied category theory. This should include perspectives from industry, academia and government as well as major milestones, potential funding sources, avenues for technology transfer and necessary improvements in tool support and methodology. Exploratory collaborations between category theorists and domain experts. We will ask that each group come prepared to meet the other side. Mathematicians should be prepared with concrete examples that demonstrate practical applications of CT in an intuitive way. Domain experts should bring to the table specific problems to which they can devote time and/or funding as well as some reasons about why they think CT might be relevant to this application.

Invited Speakers:
John Baez (University of California at Riverside) and John Foley (Metron Scientific Solutions).
Bob Coecke (University of Oxford).
Dusko Pavlovic (University of Hawaii).

Some other likely participants include Chris Boner (Metron), Arquimedes Canedo (Siemens at Princeton), Stephane Dugowson (Supméca), William Edmonson (North Carolina A&T), Brendan Fong (MIT), Mark Fuge (University of Maryland), Jack Gray (Penumbra), Helle Hansen (Delft), Jelle Herold (Statebox), Marisa Hughes (Johns Hopkins), Steve Huntsman (BAE Systems), Patrick Johnson (Dassault Systèmes), Al Jones (NIST), Cliff Joslyn (Pacific Northwest National Laboratory), Richard Malek (NSF), Tom Mifflin (Metron), Ira Monarch (Carnegie Mellon), John Paschkewitz (DARPA), Evan Patterson (Stanford), Blake Pollard (NIST), Emilie Purvine (Pacific Northwest National Laboratory), Mark Raugas (Pacific Northwest National Laboratory), Bill Regli (University of Maryland), Michael Robinson (American U.) Alberto Speranzon (Honeywell Aerospace), David Spivak (MIT), Eswaran Subrahmanian (Carnegie Mellon), Jamie Vicary (Birmingham and Oxford), and Ryan Wisnesky (Categorical Informatics).

A bunch of us will stay on into the weekend and talk some more. I hope we make a lot of progress—and I plan to let you know how it goes!

I’ll be giving a joint talk with John Foley about our work using operads to design networks. This work is part of the Complex Adaptive System Composition and Design Environment project being done by Metron Scientific Solutions and managed by John Paschkewitz at DARPA.


Pyrofex

4 February, 2018

Mike Stay is applying category theory to computation at a new startup called Pyrofex. And this startup has now entered a deal with RChain.

But let me explain why I’m interested. I’m interested in applied category theory… but this is special.

Mike Stay came to work with me at U.C. Riverside after getting a master’s in computer science at the University of Auckland, where he worked with Cristian Calude on algorithmic randomness. For example:

• Cristian S. Calude and Michael A. Stay, From Heisenberg to Gödel via Chaitin, International Journal of Theoretical Physics 44 (2008), 1053–1065.

• Cristian S. Calude and Michael A. Stay, Most programs stop quickly or never halt, Advances in Applied Mathematics, 40 (2008), 295–308.

It seems like ages ago, but I dimly remember looking at his application, seeing the title of the first of these two papers, and thinking “he’s either a crackpot, or I’m gonna like him”.

You see, the lure of relating Gödel’s theorem to Heisenberg’s uncertainty principle is fatally attractive to many people who don’t really understand either. I looked at the paper, decided he wasn’t a crackpot… and yes, it turned out I liked him.

(A vaguely similar thing happened with my student Chris Rogers, who’d written some papers applying techniques from general relativity to the study of crystals. As soon as I assured myself this stuff was for real, I knew I’d like him!)

Since Mike knew a lot about computer science, his presence at U. C. Riverside emboldened me to give a seminar on classical versus quantum computation. I used this as an excuse to learn the old stuff about the lambda-calculus and cartesian closed categories. When I first started, I thought the basic story would be obvious: people must be making up categories where the morphisms describe processes of computation.

But I soon learned I was wrong: people were making up categories where objects were data types, but the morphisms were equivalence classes of things going between data types—and this equivalence relation completely washed out the difference, between, say, a program that actually computes 237 × 419 and a program that just prints out 99303, which happens to be the answer to that problem.

In other words, the actual process of computation was not visible in the category-theoretic framework. I decided that to make it visible, what we really need are 2-categories in which 2-morphisms are ‘processes of computation’. Or in the jargon: objects are types, morphisms are terms, and 2-morphisms are rewrites.

It turned out people had already thought about this:

• Barnaby P. Hilken, Towards a proof theory of rewriting: the simply-typed 2λ-calculus, Theor. Comp. Sci. 170 (1996), 407–444.

• R. A. G. Seely, Weak adjointness in proof theory in Proc. Durham Conf. on Applications of Sheaves, Springer Lecture Notes in Mathematics 753, Springer, Berlin, 1979, pp. 697–701.

• R. A. G. Seely, Modeling computations: a 2-categorical framework, in Proc. Symposium on Logic in Computer Science 1987, Computer Society of the IEEE, pp. 65—71.

But I felt this viewpoint wasn’t nearly as popular as it should be. It should be very popular, at least among theoretical computer scientists, because it describes what’s actually going on in the lambda-calculus. If you read a big fat book on the lambda-calculus, like Barendregt’s The Lambda Calculus: Its Syntax and Semantics, you’ll see it spends a lot of time on reduction strategies: that is, ways of organizing the process of computation. All this is buried in the usual 1-categorical treatment. It’s living up at the 2-morphism level!

Mike basically agreed with me. We started by writing introduction to the usual 1-categorical stuff, and how it shows up in many different fields:

• John Baez and Michael Stay, Physics, topology, logic and computation: a Rosetta Stone, in New Structures for Physics, ed. Bob Coecke, Lecture Notes in Physics vol. 813, Springer, Berlin, 2011, pp. 95–172.

For financial reasons he had to leave U. C. Riverside and take a job at Google. But he finished his Ph.D. at the University of Auckland, with Cristian Calude and me as co-advisors. And large chunk of his thesis dealt with cartesian closed 2-categories and their generalizations suitable for quantum computation:

• Michael Stay, Compact closed bicategories, Theory and Applications of Categories 31 (2016), 755–798.

Great stuff! My students these days are building on this math and marching ahead.

I said Mike ‘basically’ agreed with me. He agreed that we need to go beyond the usual 1-categorical treatment to model the process of computation. But when it came to applying this idea to computer science, Mike wasn’t satisfied with thinking about the lambda-calculus. That’s an old model of computation: it’s okay for a single primitive computer, but not good for systems where different parts are sending messages to each other, like the internet, or even multiprocessing in a single computer. In other words, the lambda-calculus doesn’t really handle the pressing issues of concurrency and distributed computation.

So, Mike wanted to think about more modern formalisms for computation, like the pi-calculus, using 2-categories.

He left Google and wrote some papers with Greg Meredith on these ideas, for example:

• Michael Stay and Lucius Gregory Meredith, Higher category models of the pi-calculus.

• Michael Stay and Lucius Gregory Meredith, Representing operational semantics with enriched Lawvere theories.

The second one takes a key step: moving away from 2-categories to graph-enriched categories, which are simpler and perhaps better.

Then, after various twists and turns, he started a company called Pyrofex with Nash Foster. Or perhaps I should say Foster started a company with Mike, since Foster is the real bigshot of the two. Here’s what their webpage says:

Hello—

My name is Nash Foster, and together with my friend and colleague Mike Stay, I recently founded a company called Pyrofex. We founded our company for one simple reason: we love to build large-scale distributed systems that are always reliable and secure and we wanted to help users like yourself do it more easily.

When Mike and I founded the company, we felt strongly that several key advances in programming language theory would ease the development of every day large-scale systems. However, we were not sure exactly how to expose the APIs to users or how to build the tool-sets. Grand visions compete with practical necessity, so we spent months talking to users just like you to discover what kinds of things you were most interested in. We spent many hours at white-boards, checking our work against the best CS theory we know. And, of course, we have enjoyed many long days of hacking in the hopes that we can produce something new and useful, for you.

I think this is the most exciting time in history to be a programmer (and I wrote my first program in the early 1980’s). The technologies available today are varied and compelling and exciting. I hope that you’ll be as excited as I am about some of the ideas we discovered while building our software.

And on January 8th, 2018, their company entered into an arrangement with Greg Meredith’s company Rchain. Below I’ll quote part of an announcement. I don’t know much about this stuff—at least, not yet. But I’m happy to see some good ideas getting applied in the real world, and especially happy to see Mike doing it.

The RChain Cooperative & Pyrofex Corporation announce Strategic Partnership

The RChain Cooperative and Pyrofex Corporation today announced strategically important service contracts and an equity investment intended to deliver several mutually beneficial blockchain solutions. RChain will acquire 1.1 million shares of Pyrofex Common Stock as a strategic investment. The two companies will ink separate service contracts to reinforce their existing relationship and help to align their business interests.

Pyrofex will develop critical tools and platform components necessary for the long-term success of the RChain platform. These tools are designed to leverage RChain’s unique blockchain environment and make blockchain development simpler, faster, and more effective than ever before. Under these agreements, Pyrofex will develop the world’s first decentralized IDE for writing blockchain smart contracts on the RChain blockchain.

Pyrofex also commits to continuing the core development of RChain’s blockchain platform and to organizing RChain’s global developer events and conferences.

Comments on the News

“We’re thrilled to have an opportunity to strengthen our relationship with the RChain Cooperative in 2018. Their commitment to open-source development mirrors our own corporate values. It’s a pleasure to have such a close relationship with a vibrant open-source community. I’ve rarely seen the kind of excitement the Coop’s members share and we look forward to delivering some great new technology this year.” — Nash E. Foster, Cofounder & CEO, Pyrofex Corp.

“Intuitive development tools are important for us and the blockchain ecosystem as a whole; we’re incredibly glad Pyrofex intends to launch their tools on RChain first. But, Ethereum has been a huge supporter of RChain and we’re pleased that Pyrofex intends to support Solidity developers as well. Having tools that will make it possible for developers to migrate smart contracts between blockchains is going to create tremendous possibilities.” — Lucius Greg Meredith, President, RChain Cooperative Background

Pyrofex is a software development company co-founded by Dr. Michael Stay, PhD and Nash Foster in 2016. Dr. Stay and Greg Meredith are long-time colleagues and collaborators whose mutual research efforts form the mathematical foundations of RChain’s technology. One example of the work that Greg and Mike have collaborated on is the work on the LADL (Logic as Distributed Law) algorithm. You can watch Dr. Stay present the latest research from the RChain Developers retreat.

Pyrofex and its development team should be familiar to those who follow the RChain Cooperative. They currently employ 14 full-time and several part-time developers dedicated to RChain platform development. Pyrofex CEO Nash Foster and Lead Project Manager Medha Parlikar have helped grow RChain’s development team to an impressive 20+ Core devs with plans on doubling by mid 2018. The team now includes multiple PhDs, ex-Googlers, and other word class talents.

Every Wednesday, you will find Medha on our debrief updating the community with the latest developments in RChain. Here she is announcing the recent node.hello release along with a demo from core developer Chris Kirkwood-Watts.

The working relationship between the RChain Cooperative and Pyrofex has gone so well that the Board of Directors and the community at large have supported Pyrofex’s proposal to develop Cryptofex, the much needed developer tool kit for the decentralized world.

The RChain Cooperative is ecstatic to further develop its relationship with the Pyrofex team.

“As we fly towards Mercury and beyond, we all could use better tools.”
— The RChain Co-op Team.

Listen to the announcement from Greg Meredith as well as a short Q&A with Pyrofex CEO Nash Foster, from a recent community debrief.

The following is an excerpt from the soon to be released Cryptofex Whitepaper.

The Problem: Writing Software is Hard, Compiling is Harder

In 1983, Bordland Software Corporation acquired a small compiler called Compas Pascal and released it in the United States as Turbo Pascal. It was the first product to integrate a compiler and the editor in which software was written and for nearly a decade Borland’s products defined the market for integrated development environments (IDEs).

The year after Borland released TurboPascal, Ken Thompson observed the distinct and unique dangers associated with compiler technologies. In his famous Turing Award acceptance speech, Thompson described a mechanism by which a virus can be injected into a compiler such that every binary compiled with that compiler will replicate the virus.

“In demonstrating the possibility of this kind of attack, I picked on the C compiler. I could have picked on any program-handling program such as an assembler, a loader, or even hardware microcode. As the level of program gets lower, these bugs will be harder and harder to detect. A well installed microcode bug will be almost impossible to detect.” — Ken Thompson

Unfortunately, many developers today remain stuck in a world constructed in the early 1980’s. IDEs remain essentially the same, able to solve only those problems that neatly fit onto their laptop’s single Intel CPU. But barely a month ago, on 22nd November 2017, the Intel Corporation released a critical firmware update to the Intel Management Engine and in the intervening weeks, the public at large has become aware of the “Meltdown” bug. The IME and other components are exactly the sort of low-level microcode applications that Thompson warned about. Intel has demonstrated perfectly that in the past 33 years, we have learned little and gone nowhere.

Ironically, we have had a partial solution to these problems for nearly a decade. In 2009, David A. Wheeler published his PhD dissertation, in which he proposed a mechanism by which multiple compilers can be used to verify the correctness of a compiler output. Such a mechanism turns out to be tailor-made for the decentralized blockchain environment. Combining Wheeler’s mechanism with a set of economic incentives for compile farms to submit correct outputs gives us a very real shot at correcting a problem that has plagued us for more than 30 years.

The Solution: A Distributed and Decentralized Toolchain

If we crack open the development environments at companies like Google and Amazon, many of us would be surprised to discover that very few programs are compiled on a single machine. Already, the most sophisticated organizations in the world have moved to a distributed development environment. This allows them to leverage the cloud, bringing high-performance distributed computing to bear on software development itself. At Google, many thousands of machines churn away compiling code, checking it for correctness, and storing objects to be re-used later. Through clever use of caching and “hermetic” builds, Google makes its builds faster and more computationally efficient than could possibly be done on individual developer workstations. Unfortunately, most of us cannot afford to dedicate thousands of machines to compilation.

The open-source community might be able to build large scale shared compilation environments on the Internet, but Ken Thompson explained to us why we could not trust a shared environment for these workloads. However, in the age of blockchain, it’s now possible to build development environments that harness the power of large-scale compute to compile and check programs against programmer intent. Secure, cheap, and fast — we can get all three.

CryptoFex is just such a Decentralized Integrated Development Environment (DIDE) allowing software engineers to author, test, compile, and statically check their code to ensure that it is secure, efficient, and scalable.