Complex Adaptive System Design (Part 7)

19 February, 2018

In March, I’ll be talking at Spencer Breiner‘s workshop on Applied Category Theory at the National Institute of Standards and Technology. I’ll be giving a joint talk with John Foley about our work using operads to design networks. This work is part of the Complex Adaptive System Composition and Design Environment project being done by Metron Scientific Solutions and managed by John Paschkewitz at DARPA.

I’ve written about this work before:

• Complex adaptive systems design: Part 1, Part 2, Part 3, Part 4, Part 5, Part 6.

But we’ve done a lot more, and my blog articles are having trouble keeping up! So I’d like to sketch out the big picture as it stands today.

If I had to summarize, I’d say we’ve developed a formalism for step-by-step compositional design and tasking, using commitment networks. But this takes a while to explain.

Here’s a very simple example of a commitment network:

It has four nodes, which represent agents: a port, a helicopter, a UAV (an unmanned aerial vehicle, or drone) and a target. The edges between these notes describe relationships between these agents. Some of these relationships are ‘commitments’. For example, the edges labelled ‘SIR’ say that one agent should ‘search, intervene and rescue’ the other.

Our framework for dealing with commitment networks has some special features. It uses operads, but this isn’t really saying much. An ‘operad’ is a bunch of ways of sticking things together. An ‘algebra’ of the operad gives a particular collection of these things, and says what we get when we stick them together. These concepts are extremely general, so there’s a huge diversity of operads, each with a huge diversity of algebras. To say one is using operads to solve a problem is a bit like saying one is using math. What matters more is the specific kind of operad one is using, and how one is using it.

For our work, we needed to develop a new class of operads called network operads, which are described here:

• John Baez, John Foley, Joseph Moeller and Blake Pollard, Network models.

In this paper we mainly discuss communication networks. Subsequently we’ve been working on a special class of network operads that describe how to build commitment networks.

Here are some of key ideas:

• Using network operads we can build bigger networks from smaller ones by overlaying them. David Spivak’s operad of wiring diagrams only let us ‘wire together’ smaller networks to form bigger ones:

Here networks X1, X2 and X3 are being wired together to form Y.

Network operads also let us wire together networks, but in addition they let us take one network:

and overlay another:

to create a larger network:

This is a new methodology for designing systems. We’re all used to building systems by wiring together subsystems: anyone who has a home stereo system has done this. But overlaying systems lets us do more. For example, we can take two plans of action involving the same collection of agents, and overlay them to get a new plan. We’ve all done this, too: you tell a bunch of people to do things… and then tell the same people, or an overlapping set of people, to do some other things. But lots of problems can arise if you aren’t careful. A mathematically principled approach can avoid some of these problems.

• The nodes of our networks represent agents of various types. The edges represent various relationships between agents. For example, they can represent communication channels. But more interestingly, they can represent commitments. For example, we can have an edge from A to B saying that agent A has to go rescue agent B. We call this kind of network a commitment network.

• By overlaying commitment networks, we can not only build systems out of smaller pieces but also build complicated plans by overlaying smaller pieces of plans. Since ‘tasking’ means telling a system what to do, we call this compositional tasking.

• If one isn’t careful, overlaying commitment networks can produce conflicts. Suppose we have a network with an edge saying that agent A has to rescue agent B. On top of this we overlay a network with an edge saying that A has to rescue agent C. If A can’t do both of these tasks at once, what should A do? There are various choices. We need to build a specific choice into the framework, so we can freely overlay commitment networks and get a well-defined result that doesn’t overburden the agents involved. We call this automatic deconflicting.

• Our approach to automatic deconflicting uses an idea developed by the famous category theorist Bill Lawvere: graphic monoids. I’ll explain these later, along with some of their side-benefits.

• Networks operads should let us do step-by-step compositional tasking. In other words, they should let us partially automate the process of tasking networks of agents, both

1) compositionally: tasking smaller networks and then sticking them together, e.g. by overlaying them, to get larger networks,


2) in a step-by-step way, starting at a low level of detail and then increasing the amount of detail.

To do this we need not just operads but their algebras.

• Remember, a network operad is a bunch of ways to stick together networks of some kind, e.g. by overlaying them. An algebra of this operad specifies a particular collection of networks of this kind, and says what we actually get when we stick them together.

So, a network operad can have one algebra in which things are described in a bare-bones, simplified way, and another algebra in which things are described in more detail. Indeed it will typically have many algebras, corresponding to many levels of detail, but for simplicity let’s just think about two.

When we have a ‘less detailed’ algebra A and a ‘more detailed’ algebra A', they will typically be related by a map

f \colon A' \to A

which ‘forgets the extra details’. This sort of map is called a homomorphism of algebras. We give examples in our paper Network models.

But what we usually want to do, when designing a system, is not forget extra detail, but rather add extra detail to a rough specification. There is not always a systematic way to do this. If there is, then we may have a homomorphism

g \colon A \to A'

going back the other way. This lets us automate the process of filling in the details. But we can’t usually count on being able to do this. So, often we may have to start with an element of A and search for an element of A' that is mapped to it by f : A' \to A. And typically we want this element to be optimal, or at least ‘good enough’, according to some chosen criteria. Expressing this idea formally helps us figure out how to automate the search. John Foley, in particular, has been working on this.

That’s an overview of our ideas.

Next, for the mathematically inclined, I want to give a few more details on one of the new elements not mentioned in our Network models paper: ‘graphic monoids’.

Graphic monoids

In our paper Network models we explain how the ‘overlay’ operation makes the collection of networks involving a given set of agents into a monoid. A monoid is a set M with a product that is associative and has an identity element 1:

(xy)z = x(yz)
1 x = x = x 1

In our application, this product is overlaying two networks.

A graphic monoid is one in which the graphic identity

x y x = x y

holds for all x,y.

To understand this identity, let us think of the elements of the monoid as “commitments”. The product x y means “first committing to do x, then committing to do y”. The graphic identity says that if we first commit to do x, then y, and then x again, it’s the same as first committing to do x and then y. Committing to do x again doesn’t change anything!

In particular, in any graphic monoid we have

xx = x 1 x = x 1 = x

so making the same commitment twice is the same as making it once. Mathematically we say every element x of a graphic monoid is idempotent:

x^2 = x

A commutative monoid obeying this law x^2 = x automatically obeys the graphic identity, since then

x y x = x^2 y = x y

But for a noncommutative monoid, the graphic identity is stronger than x^2 = x. It says that after committing to x, no matter what intervening commitments one might have made, committing to x again has no further effect. In other words: the intervening commitments did not undo the original commitment, so making the original commitment a second time has no effect! This captures the idea of how promises should behave.

As I said, for any network model, the set of all networks involving a fixed set of agents is a monoid. In a commitment network model, this monoid is required to be a graphic monoid. Joseph Moeller is writing a paper that shows how to construct a large class of commitment network models. We will follow this with a paper illustrating how to use these in compositional tasking.

For now, let me just mention a side-benefit. In any graphic monoid we can define a relation x \le y by

x \le y  \; \iff \; x a = y $ for some a

This makes the graphic monoid into a partially ordered set, meaning that these properties hold:

reflexivity: x \le x

transitivity: x \le y , y \le z \; \implies \; x \le z

antisymmetry: x \le y, y \le x \; \implies x = y

In the context of commitment networks, x \le y means that starting from x we can reach y by making some further commitment a: that is, x a = y for some a. So, as we ‘task’ a collection of agents by giving them more and more commitments, we move up in this partial order.


4 February, 2018

Mike Stay is applying category theory to computation at a new startup called Pyrofex. And this startup has now entered a deal with RChain.

But let me explain why I’m interested. I’m interested in applied category theory… but this is special.

Mike Stay came to work with me at U.C. Riverside after getting a master’s in computer science at the University of Auckland, where he worked with Cristian Calude on algorithmic randomness. For example:

• Cristian S. Calude and Michael A. Stay, From Heisenberg to Gödel via Chaitin, International Journal of Theoretical Physics 44 (2008), 1053–1065.

• Cristian S. Calude and Michael A. Stay, Most programs stop quickly or never halt, Advances in Applied Mathematics, 40 (2008), 295–308.

It seems like ages ago, but I dimly remember looking at his application, seeing the title of the first of these two papers, and thinking “he’s either a crackpot, or I’m gonna like him”.

You see, the lure of relating Gödel’s theorem to Heisenberg’s uncertainty principle is fatally attractive to many people who don’t really understand either. I looked at the paper, decided he wasn’t a crackpot… and yes, it turned out I liked him.

(A vaguely similar thing happened with my student Chris Rogers, who’d written some papers applying techniques from general relativity to the study of crystals. As soon as I assured myself this stuff was for real, I knew I’d like him!)

Since Mike knew a lot about computer science, his presence at U. C. Riverside emboldened me to give a seminar on classical versus quantum computation. I used this as an excuse to learn the old stuff about the lambda-calculus and cartesian closed categories. When I first started, I thought the basic story would be obvious: people must be making up categories where the morphisms describe processes of computation.

But I soon learned I was wrong: people were making up categories where objects were data types, but the morphisms were equivalence classes of things going between data types—and this equivalence relation completely washed out the difference, between, say, a program that actually computes 237 × 419 and a program that just prints out 99303, which happens to be the answer to that problem.

In other words, the actual process of computation was not visible in the category-theoretic framework. I decided that to make it visible, what we really need are 2-categories in which 2-morphisms are ‘processes of computation’. Or in the jargon: objects are types, morphisms are terms, and 2-morphisms are rewrites.

It turned out people had already thought about this:

• Barnaby P. Hilken, Towards a proof theory of rewriting: the simply-typed 2λ-calculus, Theor. Comp. Sci. 170 (1996), 407–444.

• R. A. G. Seely, Weak adjointness in proof theory in Proc. Durham Conf. on Applications of Sheaves, Springer Lecture Notes in Mathematics 753, Springer, Berlin, 1979, pp. 697–701.

• R. A. G. Seely, Modeling computations: a 2-categorical framework, in Proc. Symposium on Logic in Computer Science 1987, Computer Society of the IEEE, pp. 65—71.

But I felt this viewpoint wasn’t nearly as popular as it should be. It should be very popular, at least among theoretical computer scientists, because it describes what’s actually going on in the lambda-calculus. If you read a big fat book on the lambda-calculus, like Barendregt’s The Lambda Calculus: Its Syntax and Semantics, you’ll see it spends a lot of time on reduction strategies: that is, ways of organizing the process of computation. All this is buried in the usual 1-categorical treatment. It’s living up at the 2-morphism level!

Mike basically agreed with me. We started by writing introduction to the usual 1-categorical stuff, and how it shows up in many different fields:

• John Baez and Michael Stay, Physics, topology, logic and computation: a Rosetta Stone, in New Structures for Physics, ed. Bob Coecke, Lecture Notes in Physics vol. 813, Springer, Berlin, 2011, pp. 95–172.

For financial reasons he had to leave U. C. Riverside and take a job at Google. But he finished his Ph.D. at the University of Auckland, with Cristian Calude and me as co-advisors. And large chunk of his thesis dealt with cartesian closed 2-categories and their generalizations suitable for quantum computation:

• Michael Stay, Compact closed bicategories, Theory and Applications of Categories 31 (2016), 755–798.

Great stuff! My students these days are building on this math and marching ahead.

I said Mike ‘basically’ agreed with me. He agreed that we need to go beyond the usual 1-categorical treatment to model the process of computation. But when it came to applying this idea to computer science, Mike wasn’t satisfied with thinking about the lambda-calculus. That’s an old model of computation: it’s okay for a single primitive computer, but not good for systems where different parts are sending messages to each other, like the internet, or even multiprocessing in a single computer. In other words, the lambda-calculus doesn’t really handle the pressing issues of concurrency and distributed computation.

So, Mike wanted to think about more modern formalisms for computation, like the pi-calculus, using 2-categories.

He left Google and wrote some papers with Greg Meredith on these ideas, for example:

• Michael Stay and Lucius Gregory Meredith, Higher category models of the pi-calculus.

• Michael Stay and Lucius Gregory Meredith, Representing operational semantics with enriched Lawvere theories.

The second one takes a key step: moving away from 2-categories to graph-enriched categories, which are simpler and perhaps better.

Then, after various twists and turns, he started a company called Pyrofex with Nash Foster. Or perhaps I should say Foster started a company with Mike, since Foster is the real bigshot of the two. Here’s what their webpage says:


My name is Nash Foster, and together with my friend and colleague Mike Stay, I recently founded a company called Pyrofex. We founded our company for one simple reason: we love to build large-scale distributed systems that are always reliable and secure and we wanted to help users like yourself do it more easily.

When Mike and I founded the company, we felt strongly that several key advances in programming language theory would ease the development of every day large-scale systems. However, we were not sure exactly how to expose the APIs to users or how to build the tool-sets. Grand visions compete with practical necessity, so we spent months talking to users just like you to discover what kinds of things you were most interested in. We spent many hours at white-boards, checking our work against the best CS theory we know. And, of course, we have enjoyed many long days of hacking in the hopes that we can produce something new and useful, for you.

I think this is the most exciting time in history to be a programmer (and I wrote my first program in the early 1980’s). The technologies available today are varied and compelling and exciting. I hope that you’ll be as excited as I am about some of the ideas we discovered while building our software.

And on January 8th, 2018, their company entered into an arrangement with Greg Meredith’s company Rchain. Below I’ll quote part of an announcement. I don’t know much about this stuff—at least, not yet. But I’m happy to see some good ideas getting applied in the real world, and especially happy to see Mike doing it.

The RChain Cooperative & Pyrofex Corporation announce Strategic Partnership

The RChain Cooperative and Pyrofex Corporation today announced strategically important service contracts and an equity investment intended to deliver several mutually beneficial blockchain solutions. RChain will acquire 1.1 million shares of Pyrofex Common Stock as a strategic investment. The two companies will ink separate service contracts to reinforce their existing relationship and help to align their business interests.

Pyrofex will develop critical tools and platform components necessary for the long-term success of the RChain platform. These tools are designed to leverage RChain’s unique blockchain environment and make blockchain development simpler, faster, and more effective than ever before. Under these agreements, Pyrofex will develop the world’s first decentralized IDE for writing blockchain smart contracts on the RChain blockchain.

Pyrofex also commits to continuing the core development of RChain’s blockchain platform and to organizing RChain’s global developer events and conferences.

Comments on the News

“We’re thrilled to have an opportunity to strengthen our relationship with the RChain Cooperative in 2018. Their commitment to open-source development mirrors our own corporate values. It’s a pleasure to have such a close relationship with a vibrant open-source community. I’ve rarely seen the kind of excitement the Coop’s members share and we look forward to delivering some great new technology this year.” — Nash E. Foster, Cofounder & CEO, Pyrofex Corp.

“Intuitive development tools are important for us and the blockchain ecosystem as a whole; we’re incredibly glad Pyrofex intends to launch their tools on RChain first. But, Ethereum has been a huge supporter of RChain and we’re pleased that Pyrofex intends to support Solidity developers as well. Having tools that will make it possible for developers to migrate smart contracts between blockchains is going to create tremendous possibilities.” — Lucius Greg Meredith, President, RChain Cooperative Background

Pyrofex is a software development company co-founded by Dr. Michael Stay, PhD and Nash Foster in 2016. Dr. Stay and Greg Meredith are long-time colleagues and collaborators whose mutual research efforts form the mathematical foundations of RChain’s technology. One example of the work that Greg and Mike have collaborated on is the work on the LADL (Logic as Distributed Law) algorithm. You can watch Dr. Stay present the latest research from the RChain Developers retreat.

Pyrofex and its development team should be familiar to those who follow the RChain Cooperative. They currently employ 14 full-time and several part-time developers dedicated to RChain platform development. Pyrofex CEO Nash Foster and Lead Project Manager Medha Parlikar have helped grow RChain’s development team to an impressive 20+ Core devs with plans on doubling by mid 2018. The team now includes multiple PhDs, ex-Googlers, and other word class talents.

Every Wednesday, you will find Medha on our debrief updating the community with the latest developments in RChain. Here she is announcing the recent node.hello release along with a demo from core developer Chris Kirkwood-Watts.

The working relationship between the RChain Cooperative and Pyrofex has gone so well that the Board of Directors and the community at large have supported Pyrofex’s proposal to develop Cryptofex, the much needed developer tool kit for the decentralized world.

The RChain Cooperative is ecstatic to further develop its relationship with the Pyrofex team.

“As we fly towards Mercury and beyond, we all could use better tools.”
— The RChain Co-op Team.

Listen to the announcement from Greg Meredith as well as a short Q&A with Pyrofex CEO Nash Foster, from a recent community debrief.

The following is an excerpt from the soon to be released Cryptofex Whitepaper.

The Problem: Writing Software is Hard, Compiling is Harder

In 1983, Bordland Software Corporation acquired a small compiler called Compas Pascal and released it in the United States as Turbo Pascal. It was the first product to integrate a compiler and the editor in which software was written and for nearly a decade Borland’s products defined the market for integrated development environments (IDEs).

The year after Borland released TurboPascal, Ken Thompson observed the distinct and unique dangers associated with compiler technologies. In his famous Turing Award acceptance speech, Thompson described a mechanism by which a virus can be injected into a compiler such that every binary compiled with that compiler will replicate the virus.

“In demonstrating the possibility of this kind of attack, I picked on the C compiler. I could have picked on any program-handling program such as an assembler, a loader, or even hardware microcode. As the level of program gets lower, these bugs will be harder and harder to detect. A well installed microcode bug will be almost impossible to detect.” — Ken Thompson

Unfortunately, many developers today remain stuck in a world constructed in the early 1980’s. IDEs remain essentially the same, able to solve only those problems that neatly fit onto their laptop’s single Intel CPU. But barely a month ago, on 22nd November 2017, the Intel Corporation released a critical firmware update to the Intel Management Engine and in the intervening weeks, the public at large has become aware of the “Meltdown” bug. The IME and other components are exactly the sort of low-level microcode applications that Thompson warned about. Intel has demonstrated perfectly that in the past 33 years, we have learned little and gone nowhere.

Ironically, we have had a partial solution to these problems for nearly a decade. In 2009, David A. Wheeler published his PhD dissertation, in which he proposed a mechanism by which multiple compilers can be used to verify the correctness of a compiler output. Such a mechanism turns out to be tailor-made for the decentralized blockchain environment. Combining Wheeler’s mechanism with a set of economic incentives for compile farms to submit correct outputs gives us a very real shot at correcting a problem that has plagued us for more than 30 years.

The Solution: A Distributed and Decentralized Toolchain

If we crack open the development environments at companies like Google and Amazon, many of us would be surprised to discover that very few programs are compiled on a single machine. Already, the most sophisticated organizations in the world have moved to a distributed development environment. This allows them to leverage the cloud, bringing high-performance distributed computing to bear on software development itself. At Google, many thousands of machines churn away compiling code, checking it for correctness, and storing objects to be re-used later. Through clever use of caching and “hermetic” builds, Google makes its builds faster and more computationally efficient than could possibly be done on individual developer workstations. Unfortunately, most of us cannot afford to dedicate thousands of machines to compilation.

The open-source community might be able to build large scale shared compilation environments on the Internet, but Ken Thompson explained to us why we could not trust a shared environment for these workloads. However, in the age of blockchain, it’s now possible to build development environments that harness the power of large-scale compute to compile and check programs against programmer intent. Secure, cheap, and fast — we can get all three.

CryptoFex is just such a Decentralized Integrated Development Environment (DIDE) allowing software engineers to author, test, compile, and statically check their code to ensure that it is secure, efficient, and scalable.

Statebox: A Universal Language of Distributed Systems

22 January, 2018

guest post by Christian Williams

A short time ago, on the Croatian island of Zlarin, there gathered a band of bold individuals—rebels of academia and industry, whose everyday thoughts and actions challenge the separations of the modern world. They journeyed from all over to learn of the grand endeavor of another open mind, an expert functional programmer and creative hacktivist with significant mathematical knowledge: Jelle |yell-uh| Herold.

The Dutch computer scientist has devoted his life to helping our species and our planet: from consulting in business process optimization to winning a Greenpeace hackathon, from updating Netherlands telecommunications to creating a website to determine ways for individuals to help heal the earth, Jelle has gained a comprehensive perspective of the interconnected era. Through a diverse and innovative career, he has garnered crucial insights into software design and network computation—most profoundly, he has realized that it is imperative that these immense forces of global change develop thoughtful, comprehensive systematization.

Jelle understood that initiating such a grand ambition requires a massive amount of work, and the cooperation of many individuals, fluent in different fields of mathematics and computer science. Enter the Zlarin meeting: after a decade of consideration, Jelle has now brought together proponents of categories, open games, dependent types, Petri nets, string diagrams, and blockchains toward a singular end: a universal language of distributed systems—Statebox.

Statebox is a programming language formed and guided by fundamental concepts and principles of theoretical mathematics and computer science. The aim is to develop the canonical process language for distributed systems, and thereby elucidate the way these should actually be designed. The idea invokes the deep connections of these subjects in a novel and essential way, to make code simple, transparent, and concrete. Category theory is both the heart and pulse of this endeavor; more than a theory, it is a way of thinking universally. We hope the project helps to demonstrate the importance of this perspective, and encourages others to join.

The language is designed to be self-optimizing, open, adaptive, terminating, error-cognizant, composable, and most distinctively—visual. Petri nets are the natural representation of decentralized computation and concurrency. By utilizing them as program models, the entire language is diagrammatic, and this allows one to inspect the flow of the process executed by the program. While most languages only compile into illegible machine code, Statebox compiles directly into diagrams, so that the user immediately sees and understands the concrete realization of the abstract design. We believe that this immanent connection between the “geometric” and “algebraic” aspects of computation is of great importance.

Compositionality is a rightfully popular contemporary term, indicating the preservation of type under composition of systems or processes. This is essential to the universality of the type, and it is intrinsic to categories, which underpin the Petri net. A pertinent example is that composition allows for a form of abstraction in which programs do not require complete specification. This is parametricity: a program becomes executable when the functions are substituted with valid terms. Every term has a type, and one cannot connect pieces of code that have incompatible inputs and outputs—the compiler would simply produce an error. The intent is to preserve a simple mathematical structure that imposes as little as possible, and still ensure rationality of code. We can then more easily and reliably write tools providing automatic proofs of termination and type-correctness. Many more aspects will be explained as we go along, and in more detail in future posts.

Statebox is more than a specific implementation. It is an evolving aspiration, expressing an ideal, a source of inspiration, signifying a movement. We fully recognize that we are at the dawn of a new era, and do not assume that the current presentation is the best way to fulfill this ideal—but it is vital that this kind of endeavor gains the hearts and minds of these communities. By learning to develop and design by pure theory, we make a crucial step toward universal systems and knowledge. Formalisms are biased, fragile, transient—thought is eternal.

Thank you for reading, and thank you to John Baez—|bi-ez|, some there were not aware—for allowing me to write this post. Azimuth and its readers represent what scientific progress can and should be; it is an honor to speak to you. My name is Christian Williams, and I have just begun my doctoral studies with Dr. Baez. He received the invitation from Jelle and could not attend, and was generous enough to let me substitute. Disclaimer: I am just a young student with big dreams, with insufficient knowledge to do justice to this huge topic. If you can forgive some innocent confidence and enthusiasm, I would like to paint a big picture, to explain why this project is important. I hope to delve deeper into the subject in future posts, and in general to celebrate and encourage the cognitive revolution of Applied Category Theory. (Thank you also to Anton and Fabrizio for providing some of this writing when I was not well; I really appreciate it.)

Statebox Summit, Zlarin 2017, was awesome. Wish you could’ve been there. Just a short swim in the Adriatic from the old city of Šibenik |shib-enic|, there lies the small, green island of Zlarin |zlah-rin|, with just a few hundred kind inhabitants. Jelle’s friend, and part of the Statebox team, Anton Livaja and his family graciously allowed us to stay in their houses. Our headquarters was a hotel, one of the few places open in the fall. We set up in the back dining room for talks and work, and for food and sunlight we came to the patio and were brought platters of wonderful, wholesome Croatian dishes. As we burned the midnight oil, we enjoyed local beer, and already made history—the first Bitcoin transaction of the island, with a progressive bartender, Vinko.

Zlarin is a lovely place, but we haven’t gotten to the best part—the people. All who attended are brilliant, creative, and spirited. Everyone’s eyes had a unique spark to light. I don’t think I’ve ever met such a fascinating group in my life. The crew: Jelle, Anton, Emi Gheorghe, Fabrizio Genovese, Daniel van Dijk, Neil Ghani, Viktor Winschel, Philipp Zahn, Pawel Sobocinski, Jules Hedges, Andrew Polonsky, Robin Piedeleu, Alex Norta, Anthony di Franco, Florian Glatz, Fredrik Nordvall Forsberg. These innovators have provocative and complementary ideas in category theory, computer science, open game theory, functional programming, and the blockchain industry; and they came to share an important goal. These are people who work earnestly to better humanity, motivated by progress, not profit. Talking with them gave me hope, that there are enough intelligent, open-minded, and caring people to fix this mess of modern society. In our short time together, we connected—now, almost all continue to contribute and grow the endeavor.

Why is society a mess? The present human condition is absurd. We are in a cognitive renaissance, yet our world is in peril. We need to realize a deeper harmony of theory and practice—we need ideas that dare to dream big, that draw on the vast wealth of contemporary thought to guide and unite subjects in one mission. The way of the world is only a reflection of how we choose to think, and for more than a century we have delved endlessly into thought itself. If we truly learn from our thought, knowledge and application become imminently interrelated, not increasingly separate. It is imperative that we abandon preconception, pretense and prejudice, and ask with naive sincerity: “How should things be, really, and how can we make it happen?”

This pertains more generally to the irresponsibly ad hoc nature of society—we find ourselves entrenched in inadequate systems. Food, energy, medicine, finance, communications, media, governance, technology—our deepening dependence on centralization is our greatest vulnerability. Programming practice is the perfect example of the gradual failure of systems when their design is left to wander in abstraction. As business requirements evolved, technological solutions were created haphazardly, the priority being immediate return over comprehensive methodology, which resulted in ‘duct-taped’ systems, such as the Windows OS. Our entire world now depends on unsystematic software, giving rise to so much costly disorganization, miscommunication, and worse, bureaucracy. Statebox aims to close the gap between the misguided formalisms which came out of this type of degeneration, and design a language which corresponds naturally to essential mathematical concepts—to create systems which are rational, principled, universal. To explain why Statebox represents to us such an important ideal, we must first consider its closest relative, the elephant in the technological room: blockchain.

Often the best ideas are remarkably simple—in 2008, an unknown person under the alias Satoshi Nakamoto published the whitepaper Bitcoin: A Peer-to-Peer Electronic Cash System. In just a few pages, a protocol was proposed which underpins a new kind of computational network, called a blockchain, in which interactions are immediate, transparent, and permanent. This is a personal interpretation—the paper focuses on the application given in its title. In the original financial context, immediacy is one’s ability to directly transact with anyone, without intermediaries, such as banks; transparency is one’s right to complete knowledge of the economy in which one participates, meaning that each node owns a copy of the full history of the network; permanence is the irrevocability of one’s transactions. These core aspects are made possible by an elegant use of cryptography and game theory, which essentially removes the need for trusted third parties in the authorization, verification, and documentation of transactions. Per word, it’s almost peerless in modern influence; the short and sweet read is recommended.

The point of this simplistic explanation is that blockchain is about more than economics. The transaction could be any cooperation, the value could be any social good—when seen as a source of consensus, the blockchain protocol can be expanded to assimilate any data and code. After several years of competing cryptocurrencies, the importance of this deeper idea was gradually realized. There arose specialized tools to serve essential purposes in some broader system, and only recently have people dared to conceive of what this latter could be. In 2014, a wunderkind named Vitalik Buterin created Ethereum, a fully programmable blockchain. Solidity is a Turing-complete language of smart contracts, autonomous programs which enable interactions and enact rules on the network. With this framework, one can not only transact with others, but implement any kind of process; one can build currencies, websites, or organizations—decentralized applications, constructed with smart contracts, could be just about anything.

There is understandably great confidence and excitement for these ventures, and many are receiving massive public investment. Seriously, the numbers are staggering—but most of it is pure hype. There is talk of the first global computer, the internet of value, a single shared source of truth, and other speculative descriptions. But compared to the ambition, the actual theory is woefully underdeveloped. So far, implementations make almost no use of the powerful ideas of mathematics. There are still basic flaws in blockchain itself, the foundation of almost all decentralized technology. For example, the two viable candidates for transaction verification are called Proof of Work and Proof of Stake: the former requires unsustainable consumption of resources, namely hardware and electricity, and the latter is susceptible to centralization. Scalability is a major problem, thus also cost and speed of transactions. A major Ethereum dApp, Decentralized Autonomous Organization, was hacked.

These statements are absolutely not to disregard all of the great work of this community; it is primarily rhetoric to distinguish the high ideals of Statebox, and I lack the eloquence to make the point diplomatically, nor near the knowledge to give a real account of this huge endeavor. We now return to the rhetoric.

What seems to be lost in the commotion is the simple recognition that we do not yet really know what we should make, nor how to do so. The whole idea is simply too big—the space of possibility is almost completely unknown, because this innovation can open every aspect of society to reform. But as usual, people try to ignore their ignorance, imagining it will disappear, and millions clamor about things we do not yet understand. Most involved are seeing decentralization as an exciting business venture, rather than our best hope to change the way of this broken world; they want to cash in on another technological wave. Of the relatively few idealists, most still retain the assumptions and limitations of the blockchain.

For all this talk, there is little discussion of how to even work toward the ideal abstract design. Most mathematics associated to blockchain is statistical analysis of consensus, while we’re sitting on a mountain of powerful categorical knowledge of systems. At the summit, Prof. Neil Ghani said “it’s like we’re on the Moon, talking about going to Mars, while everyone back on Earth still doesn’t even have fire.” We have more than enough conceptual technology to begin developing an ideal and comprehensive system, if the right minds come together. Theory guides practice, practice motivates theory—the potential is immense.

Fortunately, there are those who have this big picture in mind. Long before the blockchain craze, Jelle saw the fundamental importance of both distributed systems and the need for academic-industrial symbiosis. In the mid-2000’s, he used Petri nets to create process tools for businesses. Employees could design and implement any kind of abstract workflow to more effectively communicate and produce. Jelle would provide consultation to optimize these processes, and integrate them into their existing infrastructure—as it executed, it would generate tasks, emails, forms and send them to designated individuals to be completed for the next iteration. Many institutions would have to shell out millions of dollars to IBM or Fujitsu for this kind of software, and his was more flexible and intuitive. This left a strong impression on Jelle, regarding the power of Petri nets and the impact of deliberate design.

Many experiences like this gradually instilled in Jelle a conviction to expand his knowledge and begin planning bold changes to the world of programming. He attended mathematics conferences, and would discuss with theorists from many relevant subjects. On the island, he told me that it was actually one of Baez’s talks about networks which finally inspired him to go for this huge idea. By sincerely and openly reaching out to the whole community, Jelle made many valuable connections. He invited these thinkers to share his vision—theorists from all over Europe, and some from overseas, gathered in Croatia to learn and begin to develop this project—and it was a great success.

By now you may be thinking, alright kid spill the beans already. Here they are, right into your brain—well, most will be in the next post, but we should at least have a quick overview of some of the main ideas not already discussed.

The notion of open system complements compositionality. The great difference between closure and openness, in society as well as theory, was a central theme in many of our conversations during the summit. Although we try to isolate and suspend life and cognition in abstraction, the real, concrete truth is what flows through these ethereal forms. Every system in Statebox is implicitly open, and this impels design to idealize the inner and outer connections of processes. Open systems are central to the Baez Network Theory research team. There are several ways to categorically formalize open systems; the best are still being developed, but the first main example can be found in The Algebra of Open and Interconnected Systems by Brendan Fong, an early member of the team.

Monoidal categories, as this blog knows well, represent systems with both series and parallel processes. One of the great challenge of this new era of interconnection is distributed computation—getting computers to work together as a supercomputer, and monoidal categories are essential to this. Here, objects are data types, and morphisms are computations, while composition is serial and tensor is parallel. As Dr. Baez has demonstrated with years of great original research, monoidal categories are essential to understanding the complexity of the world. If we can connect our knowledge of natural systems to social systems, we can learn to integrate valuable principles—a key example being complete resource cognizance.

Petri nets are presentations of free strict symmetric monoidal categories, and as such they are ideal models of “normal” computation, i.e. associative, unital, and commutative. Open Petri nets are the workhorses of Statebox. They are the morphisms of a category which is itself monoidal—and via openness it is even richer and more versatile. Most importantly it is compact closed, which introduces a simple but crucial duality into computation—input-output interchange—which is impossible in conventional cartesian closed computation, and actually brings the paradigm closer to quantum computation

Petri nets represent processes in an intuitive, consistent, and decentralized way. These will be multi-layered via the notion of operad and a resourceful use of Petri net tokens, representing the interacting levels of a system. Compositionality makes exploring their state space much easier: the state space of a big process can be constructed from those of smaller ones, a technique that more often than not avoids state space explosion, a long-standing problem in Petri net analysis. The correspondence between open Petri nets and a logical calculus, called place/transition calculus, allows the user to perform queries on the Petri net, and a revolutionary technique called information-gain computing greatly reduces response time.

Dependently typed functional programming is the exoskeleton of this beautiful beast; in particular, the underlying language is Idris. Dependent types arose out of both theoretical mathematics and computer science, and they are beginning to be recognized as very general, powerful, and natural in practice. Functional programming is a similarly pure and elegant paradigm for “open” computation. They are fascinating and inherently categorical, and deserve whole blog posts in the future.

Even economics has opened its mind to categories. Statebox is very fortunate to have several of these pioneers—open game theory is a categorical, compositional version of game theory, which allows the user to dynamically analyze and optimize code. Jules’ choice of the term “teleological category” is prescient; it is about more than just efficiency—it introduces the possibility of building principles into systems, by creating game-theoretical incentives which can guide people to cooperate for the greater good, and gradually lessen the influence of irrational, selfish priorities.

Categories are the language by which Petri nets, functional programming, and open games can communicate—and amazingly, all of these theories are unified in an elegant representation called string diagrams. These allow the user to forget the formalism, and reason purely in graphical terms. All the complex mathematics goes under the hood, and the user only needs to work with nodes and strings, which are guaranteed to be formally correct.

Category theory also models the data structures that are used by Statebox: Typedefs is a very lightweight—but also very expressive—data structure, that is at the very core of Statebox. It is based on initial F-algebras, and can be easily interpreted in a plethora of pre-existing solutions, enabling seamless integration with existing systems. One of the core features of Typedefs is that serialization is categorically internalized in the data structure, meaning that every operation involving types can receive a unique hash and be recorded on the blockchain public ledger. This is one of the many components that make Statebox fail-resistant: every process and event is accounted for on the public ledger, and the whole history of a process can be rolled back and analyzed thanks to the blockchain technology.

The Statebox team is currently working on a monograph that will neatly present how all the pertinent categorical theories work together in Statebox. This is a formidable task that will take months to complete, but will also be the cleanest way to understand how Statebox works, and which mathematical questions have still to be answered to obtain a working product. It will be a thorough document that also considers important aspects such as our guiding ethics.

The team members are devoted to creating something positive and different, explicitly and solely to better the world. The business paradigm is based on the principle that innovation should be open and collaborative, rather than competitive and exclusive. We want to share ideas and work with you. There are many blooming endeavors which share the ideals that have been described in this article, and we want them all to learn from each other and build off one another.

For example, Statebox contributor and visionary economist Viktor Winschel has a fantastic project called Oicos. The great proponent of applied category theory, David Spivak, has an exciting and impressive organization called Categorical Informatics. Mike Stay, a past student of Dr. Baez, has started a company called Pyrofex, which is developing categorical distributed computation. There are also somewhat related languages for blockchain, such as Simplicity, and innovative distributed systems such as Iota and RChain. Even Ethereum is beginning to utilize categories, with Casper. And of course there are research groups, such as Network Theory and Mathematically Structured Programming, as well as so many important papers, such as Algebraic Databases. This is just a slice of everything going on; as far as I know there is not yet a comprehensive account of all the great applied category theory and distributed innovations being developed. Inevitably these endeavors will follow the principle they share, and come together in a big way. Statebox is ready, willing, and able to help make this reality.

If you are interested in Statebox, you are welcomed with open arms. You can contact Jelle at, Fabrizio at, Emi at, Anton at; they can provide more information, connect you to the discussion, or anything else. There will be a second summit in 2018 in about six months, details to be determined. We hope to see you there. Future posts will keep you updated, and explain more of the theory and design of Statebox. Thank you very much for reading.

P.S. Found unexpected support in Šibenik! Great bar—once a reservoir.


14 December, 2016

One of my goals is to turn category theory, and even higher category theory, into a practical tool for science. For this we need good scientific ideas—but we also need good software.

My friend Jamie Vicary has been developing some of this software, together with Aleks Kissinger and Krzysztof Bar and others. Jamie demonstrated it at the Simons Institute workshop on compositionality. You can watch his demonstration here:

But since Globular runs on a web browser, you can also try it out yourself here:


You can see his talk slides:

• Jamie Vicary, Data structures for quasistrict higher categories. (Talk slides here.)

Abstract. Higher category theory is one of the most general approaches to compositionality, with broad and striking applications across computer science, mathematics and physics. We present a new, simple way to define higher categories, in which many important compositional properties emerge as theorems, rather than axioms. Our approach is amenable to computer implementation, and we present a new proof assistant we have developed, with a powerful graphical calculus. In particular, we will outline a substantial new proof we have developed in our setting.

And in December 2015, he wrote an article about this software on the n-Category Café. It’s been improved since then, but it can’t hurt to read what he wrote—so I append it here!

Globular: the basic idea

When you’re trying to prove something in a monoidal category, or a higher category, string diagrams are a really useful technique, especially when you’re trying to get an intuition for what you’re doing. But when it comes to writing up your results, the problems start to mount. For a complex proof, it’s hard to be sure your result is correct—a slip of the pen could lead to a false proof, and an error that’s hard to find. And writing up your results can be a huge time-sink, requiring weeks or months using a graphics package, all just for some nice pictures that tell you little about the correctness of the proof, and become useless if you decide to change your approach. Computers should be able help with all these things, in the way that proof assistants like Coq and Agda are allowing us to work with traditional syntactic proofs in a more sophisticated way.

The purpose of this post is to introduce Globular, a new proof assistant for working with higher-categorical proofs using string diagrams. It’s available at, with documentation on the nab. It’s web-based, so everything happens right in your browser: build formal proofs, visualize and step through them; keep your proofs private, share them with collaborators, or make them publicly available.

Before we get into the technical details, here’s a screenshot of Globular in action:

Screenshot of Globular

The main part of the screen shows a diagram, which in this case is 2-dimensional. It represents a composite 2-cell in a finitely-presented 2-category, with the blue and red regions representing objects, the lines representing 1-cells, and the vertices representing 2-cells. In fact, this 2d diagram is just an intermediate state of a 3d proof, through which we’re navigating with the ‘Slice’ controls in the top-right. The proof itself has been built up by composing the generators listed in the signature, down the left-hand side of the screen. (If you want to take a look at this proof yourself, you can go straight there; in the top-right, set ‘Project’ to 0, then increment the second ‘Slice’ counter to scroll through the proof.)

Globular has been developed so far in the Quantum Group in
the Oxford Computer Science department, by Krzysztof
, Katherine Casey, Aleks Kissinger, Jamie Vicary and Caspar Wylie. We haven’t quite got around to it yet, but Globular will be open-source, and we’re really keen for people to get involved and help build the software—there’s a huge amount to do! If you want to help out, get in touch.

Mathematical foundations

Globular is based on the theory of finitely-presented semistrict n-categories; at the moment, it works up to the level of 3-categories, with an extension to 4-categories actively in development. (You can build cells of any dimension, but from 4-cells and up, some structures are missing.)

Definitions of n-category vary in how strict they are; a definition is semistrict when it’s as strict as possible, while still having the property that every weak n-category satisfies it, up to equivalence. Definitions of semistrict n-category are not unique: in dimension 3, Gray categories put all the weak structure in the interchangers, while Simpson snucategories put it all in the unitors. Globular implements the axioms of a Gray category, because this is the most appropriate for the graphical calculus: the interchangers can be seen graphically, as changes in height of the components of the diagram. By the theory of k-tuply monoidal n-categories, this also lets you build proofs in a monoidal category, or a braided monoidal category, or a monoidal 2-category.

The only things that Globular understands are k-cells, for some value of k. So if you want to build an n-category where an equation f=g holds between n-cells, you have to do it by adding (n+1)-cells a:f \to g and b:g \to f. If you then build some composite C(f) involving f, you can apply the cell a to obtain C(g), and we interpret this as the equation C(f) = C(g). In a slogan, this is equality via rewriting. This is consistent with the basic premise of homotopy type theory: treat your proofs as first-order structures, which can in turn be reasoned about themselves.

Globular can also handle invertibility in a nice way. For a cell F:A \to B to be invertible, indicated by ticking a box in the signature, means that there also exists an invertible cell F^{-1}: B \to A, and invertible cells \text{id}_A \to F . F^{-1} and \text{id}_B \to F^{-1} . F. This is a coinductive definition (see Mike Shulman’s nice post on this topic), since we’re defining the notion of invertibility in terms of itself in a higher dimension. This sort of a definition is great for proof assistants to work with, as it allows a lot of structure to be generated from a single compact definition.

How it works

For a lot more details, take a look at the nLab page. Everything that happens in Globular involves in interaction between the signature on the left-hand side, and the diagram in the main part of the screen. The signature stores the ‘library’ of cells you have available, and the diagram is a particular composite of cells that you have constructed.

To construct a new diagram, clear whatever is currently displayed by clicking the ‘Clear’ button on the right, or pressing ‘c’. Then start by clicking the icon of a n-cell in your signature, which will make a diagram consisting just of that cell. Clicking on the icons of other k-generators for 0 < k \leq n will display a list of ways the cell can be attached, and when you choose one of these ways, the attachment will be performed, growing your n-diagram. (If you’re starting with a blank workspace you will only have a single 0-cell available, so you won’t be able to do this yet!) Clicking an (n+1)-cell G displays a list of ways that your n-diagram D can be rewritten, by identifying the source of G as a subdiagram of D. Selecting one of these ways will implement the rewrite, by ‘cutting out’ the chosen subdiagram of D, and replacing it with the target of G.

Another way to modify the diagram is to click directly on it. Clicking near the edge of the diagram performs an attachment, while clicking in the interior of the diagram performs a rewrite. If more than one attachment or rewrite is consistent with your click, a little menu will pop up for you to choose what you want to do. When you move your mouse pointer over the diagram, a little label pops up to show you what your cursor is hovering over, which is helpful when choosing where to click.

You can also click-and-drag on the diagram. This will attach or rewrite with an interchanger, or naturality for an interchanger, or invertibility for an interchanger, depending on where you have clicked and the direction of the drag. Clicking and dragging is designed to work as if you were really ‘touching’ the strings. So if you want to braid one strand over another, click the strand to ‘grab’ it, and ‘pull’ it to one side. If you want to pull a vertex through a braiding, click the vertex to ‘grab’ it, and ‘pull’ it up or down through its adjacent braiding. Of course, Globular will only carry out the command if the move you are attempting to make is actually valid in that location.

Example theorems

Here are four worked examples of nontrivial proofs in Globular:

Frobenius implies associative: In a monoidal category, if multiplication and comultiplication morphisms are unital, counital and Frobenius, then they are associative and coassociative.

Strengthening an equivalence: In a 2-category, an equivalence gives rise to an adjoint equivalence, satisfying the snake equations.

Swallowtail comes for free: In a monoidal 2-category, a weakly-dual pair of objects gives rise to a strongly-dual pair, satisfying the swallowtail equations.

Pentagon and triangle implies \lambda_I = \rho_I: In a monoidal 2-category, if a pseudomonoid object satisfies pentagon and triangle equations, then it satisfies \lambda_I = \rho_I.

We’ll focus on the second example project “Strengthening an equivalence” listed above, and see how it was constructed. This project investigates the factthat every equivalence in a 2-category gives rise to an adjoint equivalence. To start, we therefore need the basic data that exhibits an equivalence in a 2-category: two 0-cells A and B, and an invertible 1-cell F:A \to B, by the weak definition of ‘invertible’ discussed above. This gives us the following signature:

The 2-cells that witness invertibility of F look like cups and caps in the graphical calculus, but they won’t satisfy the snake equations that define an adjoint equivalence. The idea of this proof is to define a new cap, built out of the invertible structure of F, which does satisfy the snake equations with the existing cup.

By starting with a diagram consisting of F alone, pressing ‘i’ to take the identity diagram, and clicking-and-dragging, we build the following 2-diagram, out of the invertible structure associated to F:

This is our candidate for our redefined cup. Its source is the identity on A, and its target is F composed with F^{-1}. It looks a bit like the curved end of a hockey stick.

To store it for later use, we now click the ‘Theorem’ button. Writing D for the 2-diagram we have constructed, this does two things. First, it creates a 2-cell generator that we call “New Cup”, whose source is s(D), and whose target is t(D). This is the redefined cup that we can use in future expressions. Second, it creates an invertible 3-cell generator that we call “New Cup Definition”, with source given by “New Cup”, and with target given by our hockey-stick diagram D. This says what “New Cup” means in terms of our original structure. This adds the following cells to our signature:

Because “New Cup Definition” is a 3-cell, by default we see two little icons: one for its source, and one for its target. See how its source is a little picture of “New Cup”, and its target is a little picture of the hockey stick, just as we defined it.

We’re now ready to prove one of the snake equations. We start by building the snake composite, using “New Cup” for the cup, and the invertible structure of F for the cap:

Now have to prove that this equals the identity. Since equality is implemented by rewriting, we must construct a 3-diagram whose source is this snake composite, and whose target is the identity on F. To start, we click the ‘Identity’ button to convert our diagram into an identity 3-diagram. The only apparent effect this has is to add a number scroller to the ‘Slice’ area of the controls in the top-right. At the moment we can set this to ‘0’ and ‘1’, representing the source and target of our identity 3-diagram respectively. We set it to ‘1’, because we want to compose things to the target.

We now build up our proof. First, we click on the pink vertex that represents “New Cup”. This will attach our 3-cell “New Cup Definition”, replacing “New Cup” with our hockey-stick picture. By clicking-and-dragging on the diagram, we obtain the following sequence
of pictures:



Pictures 3 to 10 were created by attaching interchangers, and pictures 11 to 15 were created by attaching higher structure generated by the invertibility of F. In all cases, this structure was attached just by clicking-and-dragging on the appropriate vertices of the diagram. We’ve turned the snake into the identity, so we’ve finished our proof, which required 14 3-cells. By using the ‘Slice’ control in the top-right, we can navigate through the 15 slices that make up our proof, and review what we just did. Even better, turning the ‘Project’ control to the value ‘1’ tells Globular to project out the lowest dimension. This means that our entire 3-diagram proof can be viewed as a single 2-dimensional diagram, as follows:

This is just like the Morse singularity graphics used by topologists to study the structure of higher-dimensional manifolds. In this picture, the vertices are 3-cells, the lines are 2-cells, and the regions are 1-cells (in fact, every region is the 1-cell F.) By moving your mouse pointer over the different parts of the diagram, you can see what the different components are. Interchangers are represented in this projection by braidings.

Now we can do something cool: we can modify our proof, by clicking-and-dragging on the Morse projection. For example, just to the right of centre, there is a crossing, out of which emerge two long vertical lines that travel up a long way before annihilating with one another. Our proof would be much simpler if these two lines just annihilated with each other right after the interchanger. So, we click the vertex at the top of the lines, and drag it down repeatedly, until it gets to where we want it:

We’ve simplified our proof. By clicking-and-dragging some more, you can change the proof in lots of different ways, although you probably won’t get it much simpler than this. Putting the ‘Project’ control back to ‘0’, and navigating through the stages of the proof with the ‘Slice’ control as we were doing before, we can see that our proof has indeed been modified.

This project has been in development for about 18 months, so it feels great to finally launch. We hope the whole community will get clicking-and-dragging, and let us know how easy it is to use, and what other features would be useful. There are certain to still be bugs, so let us know about them too, and we’ll get right on them.

Programming with Data Flow Graphs

5 June, 2016

Network theory is catching on—in a very practical way!

Google recently started a new open source library called TensorFlow. It’s for software built using data flow graphs. These are graphs where the edges represent tensors—that is, multidimensional arrays of numbers—and the nodes represent operations on tensors. Thus, they are reminiscent of the spin networks used in quantum gravity and gauge theory, or the tensor networks used in renormalization theory. However, I bet the operations involved are nonlinear! If so, they’re more general.

Here’s what Google says:

About TensorFlow

TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google’s Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.

What is a Data Flow Graph?

Data flow graphs describe mathematical computation with a directed graph of nodes & edges. Nodes typically implement mathematical operations, but can also represent endpoints to feed in data, push out results, or read/write persistent variables. Edges describe the input/output relationships between nodes. These data edges carry dynamically-sized multidimensional data arrays, or tensors. The flow of tensors through the graph is where TensorFlow gets its name. Nodes are assigned to computational devices and execute asynchronously and in parallel once all the tensors on their incoming edges becomes available.

TensorFlow Features

Deep Flexibility. TensorFlow isn’t a rigid neural networks library. If you can express your computation as a data flow graph, you can use TensorFlow. You construct the graph, and you write the inner loop that drives computation. We provide helpful tools to assemble subgraphs common in neural networks, but users can write their own higher-level libraries on top of TensorFlow. Defining handy new compositions of operators is as easy as writing a Python function and costs you nothing in performance. And if you don’t see the low-level data operator you need, write a bit of C++ to add a new one.

True Portability. TensorFlow runs on CPUs or GPUs, and on desktop, server, or mobile computing platforms. Want to play around with a machine learning idea on your laptop without need of any special hardware? TensorFlow has you covered. Ready to scale-up and train that model faster on GPUs with no code changes? TensorFlow has you covered. Want to deploy that trained model on mobile as part of your product? TensorFlow has you covered. Changed your mind and want to run the model as a service in the cloud? Containerize with Docker and TensorFlow just works.

Connect Research and Production. Gone are the days when moving a machine learning idea from research to product require a major rewrite. At Google, research scientists experiment with new algorithms in TensorFlow, and product teams use TensorFlow to train and serve models live to real customers. Using TensorFlow allows industrial researchers to push ideas to products faster, and allows academic researchers to share code more directly and with greater scientific reproducibility.

Auto-Differentiation. Gradient based machine learning algorithms will benefit from TensorFlow’s automatic differentiation capabilities. As a TensorFlow user, you define the computational architecture of your predictive model, combine that with your objective function, and just add data — TensorFlow handles computing the derivatives for you. Computing the derivative of some values w.r.t. other values in the model just extends your graph, so you can always see exactly what’s going on.

Language Options. TensorFlow comes with an easy to use Python interface and a no-nonsense C++ interface to build and execute your computational graphs. Write stand-alone TensorFlow Python or C++ programs, or try things out in an interactive TensorFlow iPython notebook where you can keep notes, code, and visualizations logically grouped. This is just the start though — we’re hoping to entice you to contribute SWIG interfaces to your favorite language — be it Go, Java, Lua, JavaScript, or R.

Maximize Performance. Want to use every ounce of muscle in that workstation with 32 CPU cores and 4 GPU cards? With first-class support for threads, queues, and asynchronous computation, TensorFlow allows you to make the most of your available hardware. Freely assign compute elements of your TensorFlow graph to different devices, and let TensorFlow handle the copies.

Who Can Use TensorFlow?

TensorFlow is for everyone. It’s for students, researchers, hobbyists, hackers, engineers, developers, inventors and innovators and is being open sourced under the Apache 2.0 open source license.

TensorFlow is not complete; it is intended to be built upon and extended. We have made an initial release of the source code, and continue to work actively to make it better. We hope to build an active open source community that drives the future of this library, both by providing feedback and by actively contributing to the source code.

Why Did Google Open Source This?

If TensorFlow is so great, why open source it rather than keep it proprietary? The answer is simpler than you might think: We believe that machine learning is a key ingredient to the innovative products and technologies of the future. Research in this area is global and growing fast, but lacks standard tools. By sharing what we believe to be one of the best machine learning toolboxes in the world, we hope to create an open standard for exchanging research ideas and putting machine learning in products. Google engineers really do use TensorFlow in user-facing products and services, and our research group intends to share TensorFlow implementations along side many of our research publications.

For more details, try this:

TensorFlow tutorials.

Very Long Proofs

28 May, 2016


In the 1980s, the mathematician Ronald Graham asked if it’s possible to color each positive integer either red or blue, so that no triple of integers a,b,c obeying Pythagoras’ famous equation:

a^2 + b^2 = c^2

all have the same color. He offered a prize of $100.

Now it’s been solved! The answer is no. You can do it for numbers up to 7824, and a solution is shown in this picture. But you can’t do it for numbers up to 7825.

To prove this, you could try all the ways of coloring these numbers and show that nothing works. Unfortunately that would require trying

3 628 407 622 680 653 855 043 364 707 128 616 108 257 615 873 380 491 654 672 530 751 098 578 199 115 261 452 571 373 352 277 580 182 512 704 196 704 700 964 418 214 007 274 963 650 268 320 833 348 358 055 727 804 748 748 967 798 143 944 388 089 113 386 055 677 702 185 975 201 206 538 492 976 737 189 116 792 750 750 283 863 541 981 894 609 646 155 018 176 099 812 920 819 928 564 304 241 881 419 294 737 371 051 103 347 331 571 936 595 489 437 811 657 956 513 586 177 418 898 046 973 204 724 260 409 472 142 274 035 658 308 994 441 030 207 341 876 595 402 406 132 471 499 889 421 272 469 466 743 202 089 120 267 254 720 539 682 163 304 267 299 158 378 822 985 523 936 240 090 542 261 895 398 063 218 866 065 556 920 106 107 895 261 677 168 544 299 103 259 221 237 129 781 775 846 127 529 160 382 322 984 799 874 720 389 723 262 131 960 763 480 055 015 082 441 821 085 319 372 482 391 253 730 679 304 024 117 656 777 104 250 811 316 994 036 885 016 048 251 200 639 797 871 184 847 323 365 327 890 924 193 402 500 160 273 667 451 747 479 728 733 677 070 215 164 678 820 411 258 921 014 893 185 210 250 670 250 411 512 184 115 164 962 089 724 089 514 186 480 233 860 912 060 039 568 930 065 326 456 428 286 693 446 250 498 886 166 303 662 106 974 996 363 841 314 102 740 092 468 317 856 149 533 746 611 128 406 657 663 556 901 416 145 644 927 496 655 933 158 468 143 482 484 006 372 447 906 612 292 829 541 260 496 970 290 197 465 492 579 693 769 880 105 128 657 628 937 735 039 288 299 048 235 836 690 797 324 513 502 829 134 531 163 352 342 497 313 541 253 617 660 116 325 236 428 177 219 201 276 485 618 928 152 536 082 354 773 892 775 152 956 930 865 700 141 446 169 861 011 718 781 238 307 958 494 122 828 500 438 409 758 341 331 326 359 243 206 743 136 842 911 727 359 310 997 123 441 791 745 020 539 221 575 643 687 646 417 117 456 946 996 365 628 976 457 655 208 423 130 822 936 961 822 716 117 367 694 165 267 852 307 626 092 080 279 836 122 376 918 659 101 107 919 099 514 855 113 769 846 184 593 342 248 535 927 407 152 514 690 465 246 338 232 121 308 958 440 135 194 441 048 499 639 516 303 692 332 532 864 631 075 547 542 841 539 848 320 583 307 785 982 596 093 517 564 724 398 774 449 380 877 817 714 717 298 596 139 689 573 570 820 356 836 562 548 742 103 826 628 952 649 445 195 215 299 968 571 218 175 989 131 452 226 726 280 771 962 970 811 426 993 797 429 280 745 007 389 078 784 134 703 325 573 686 508 850 839 302 112 856 558 329 106 490 855 990 906 295 808 952 377 118 908 425 653 871 786 066 073 831 252 442 345 238 678 271 662 351 535 236 004 206 289 778 489 301 259 384 752 840 495 042 455 478 916 057 156 112 873 606 371 350 264 102 687 648 074 992 121 706 972 612 854 704 154 657 041 404 145 923 642 777 084 367 960 280 878 796 437 947 008 894 044 010 821 287 362 106 232 574 741 311 032 906 880 293 520 619 953 280 544 651 789 897 413 312 253 724 012 410 831 696 803 510 617 000 147 747 294 278 502 175 823 823 024 255 652 077 422 574 922 776 413 427 073 317 197 412 284 579 070 292 042 084 295 513 948 442 461 828 389 757 279 712 121 164 692 705 105 851 647 684 562 196 098 398 773 163 469 604 125 793 092 370 432

possibilities. But recently, three mathematicians cleverly figured out how to eliminate most of the options. That left fewer than a trillion to check!

So they spent 2 days on a supercomputer, running 800 processors in parallel, and checked all the options. None worked. They verified their solution on another computer.

This is one of the world’s biggest proofs: it’s 200 terabytes long! That’s about equal to all the digitized text held by the US Library of Congress. There’s also a 68-gigabyte digital signature—sort of a proof that a proof exists—if you want to skim it.

It’s interesting that these 200 terabytes were used to solve a yes-or-no question, whose answer takes a single bit to state: no.

I’m not sure breaking the world’s record for the longest proof is something to be proud of. Mathematicians prize short, insightful proofs. I bet a shorter proof of this result will eventually be found.

Still, it’s fun that we can do such things. Here’s a story about the proof:

• Evelyn Lamb, Two-hundred-terabyte maths proof is largest ever, Nature, May 26, 2016.

and here’s the actual paper:

• Marijn J. H. Heule, Oliver Kullmann and Victor W. Marek, Solving and verifying the Boolean Pythagorean triples problem via cube-and-conquer.

The ‘cube-and-conquer’ paradigm is a “hybrid SAT method for hard problems, employing both look-ahead and CDCL solvers”. The actual benefit of this huge proof is developing such methods for solving big problems! In the comments to my G+ post, Roberto Bayardo explained:

CDCL == “conflict-directed clause learning”: when you hit a dead end in backtrack search, this is the process of recording a (hopefully small) clause that prevents you from making the same mistake again…a type of memorization, essentially.

Look-ahead: in backtrack search, you repeat the process of picking an unassigned variable and then picking an assignment for that variable until you reach a “dead end” (upon deriving a contradiction). Look-ahead involves doing some amount of processing on the remaining formula after each assignment in order to simplify it. This includes identifying variables for which one of its assignments can be quickly eliminated. “Unit propagation” is a type of look-ahead, though I suspect in this case they mean something quite a bit more sophisticated.

Arnaud Spiwack has a series of blog posts introducting SAT solvers here:

Part 1: models & conflict clauses.

Part 2: finding good conflict clauses.

Part 3: SAT modulo theory.

Part 4: two-literal watch.

Part 5: Avatar.

A much longer proof

By the way, despite the title of the Nature article, in the comments to my G+ post about this, Michael Nielsen pointed out a much longer proof by Chris Jefferson, who wrote:

Darn, I had no idea one could get into the media with this kind of stuff.

I had a much larger “proof”, where we didn’t bother storing all the details, in which we enumerated 718,981,858,383,872 semigroups, towards counting the semigroups of size 10.

Uncompressed, it would have been about 63,000 terabytes just for the semigroups, and about a thousand times that to store the “proof”, which is just the tree of the search.

Of course, it would have compressed extremely well, but also I’m not sure it would have had any value, you could rebuild the search tree much faster than you could read it from a disc, and if anyone re-verified our calculation I would prefer they did it by a slightly different search, which would give us much better guarantees of correctness.

His team found a total of 12,418,001,077,381,302,684 semigroups of size 10. They only had to find 718,981,858,383,872 by a brute force search, which is 0.006% of the total:

• Andreas Distler, Chris Jefferson, Tom Kelsey, and Lars Kottho, The semigroups of order 10, in Principles and Practice of Constraint Programming, Springer Lecture Notes in Computer Science 7514, Springer, Berlin, pp. 883–899.

Insanely long proofs

All the proofs mentioned so far are downright laconic compared to those discussed here:

• John Baez, Insanely long proofs, 19 October 2012.

For example, if you read this post you’ll learn about a fairly short theorem whose shortest proof using Peano arithmetic contains at least

\displaystyle{ 10^{10^{1000}}  }

symbols. This is so many that if you tried to write down the number of symbols in this proof—not the symbols themselves, but just the number of symbols—in ordinary decimal notation, you couldn’t do it if you wrote one digit on each proton, neutron and electron in the observable Universe!

The Busy Beaver Game

21 May, 2016

This month, a bunch of ‘logic hackers’ have been seeking to determine the precise boundary between the knowable and the unknowable. The challenge has been around for a long time. But only now have people taken it up with the kind of world-wide teamwork that the internet enables.

A Turing machine is a simple model of a computer. Imagine a machine that has some finite number of states, say N states. It’s attached to a tape, an infinitely long tape with lots of squares, with either a 0 or 1 written on each square. At each step the machine reads the number where it is. Then, based on its state and what it reads, it either halts, or it writes a number, changes to a new state, and moves either left or right.

The tape starts out with only 0’s on it. The machine starts in a particular ‘start’ state. It halts if it winds up in a special ‘halt’ state.

The Busy Beaver Game is to find the Turing machine with N states that runs as long as possible and then halts.

The number BB(N) is the number of steps that the winning machine takes before it halts.

In 1961, Tibor Radó introduced the Busy Beaver Game and proved that the sequence BB(N) is uncomputable. It grows faster than any computable function!

A few values of BB(N) can be computed, but there’s no way to figure out all of them.

As we increase N, the number of Turing machines we need to check increases faster than exponentially: it’s


Of course, many could be ruled out as potential winners by simple arguments. But the real problem is this: it becomes ever more complicated to determine which Turing machines with N states never halt, and which merely take a huge time to halt.

Indeed, matter what axiom system you use for math, as long as it has finitely many axioms and is consistent, you can never use it to correctly determine BB(N) for more than some finite number of cases.

So what do people know about BB(N)?

For starters, BB(0) = 0. At this point I should admit that people don’t count the halt state as one of our N states. This is just a convention. So, when we consider BB(0), we’re considering machines that only have a halt state. They instantly halt.

Next, BB(1) = 1.

Next, BB(2) = 6.

Next, BB(3) = 21. This was proved in 1965 by Tibor Radó and Shen Lin.

Next, BB(4) = 107. This was proved in 1983 by Allan Brady.

Next, BB(5). Nobody knows what BB(5) equals!

The current 5-state busy beaver champion was discovered by Heiner Marxen and Jürgen Buntrock in 1989. It takes 47,176,870 steps before it halts. So, we know

BB(5) ≥ 47,176,870.

People have looked at all the other 5-state Turing machines to see if any does better. But there are 43 machines that do very complicated things that nobody understands. It’s believed they never halt, but nobody has been able to prove this yet.

We may have hit the wall of ignorance here… but we don’t know.

That’s the spooky thing: the precise boundary between the knowable and the unknowable is unknown. It may even be unknowable… but I’m not sure we know that.

Next, BB(6). In 1996, Marxen and Buntrock showed it’s at least 8,690,333,381,690,951. In June 2010, Pavel Kropitz proved that

\displaystyle{ \mathrm{BB}(6) \ge 7.412 \cdot 10^{36,534} }

You may wonder how he proved this. Simple! He found a 6-state machine that runs for


steps and then halts!

Of course, I’m just kidding when I say this was simple. The machine is easy enough to describe, but proving it takes exactly this long to run takes real work! You can read about such proofs here:

• Pascal Michel, The Busy Beaver Competition: a historical survey.

I don’t understand them very well. All I can say at this point is that many of the record-holding machines known so far are similar to the famous Collatz conjecture. The idea there is that you can start with any positive integer and keep doing two things:

• if it’s even, divide it by 2;

• if it’s odd, triple it and add 1.

The conjecture is that this process will always eventually reach the number 1. Here’s a graph of how many steps it takes, as a function of the number you start with:

Nice pattern! But this image shows how it works for numbers up to 10 million, and you’ll see it doesn’t usually take very long for them to reach 1. Usually less than 600 steps is enough!

So, to get a Turing machine that takes a long time to halt, you have to take this kind of behavior and make it much more long and drawn-out. Conversely, to analyze one of the potential winners of the Busy Beaver Game, people must take that long and drawn-out behavior and figure out a way to predict much more quickly when it will halt.

Next, BB(7). In 2014, someone who goes by the name Wythagoras showed that

\displaystyle{ \textrm{BB}(7) > 10^{10^{10^{10^{10^7}}}} }

It’s fun to prove lower bounds on BB(N). For example, in 1964 Milton Green constructed a sequence of Turing machines that implies

\textrm{BB}(2N) \ge 3 \uparrow^{N-2} 3

Here I’m using Knuth’s up-arrow notation, which is a recursively defined generalization of exponentiation, so for example

\textrm{BB}(10) \ge 3 \uparrow^{3} 3 = 3 \uparrow^2 3^{3^3} = 3^{3^{3^{3^{\cdot^{\cdot^\cdot}}}}}

where there are 3^{3^3} threes in that tower.

But it’s also fun to seek the smallest N for which we can prove BB(N) is unknowable! And that’s what people are making lots of progress on right now.

Sometime in April 2016, Adam Yedidia and Scott Aaronson showed that BB(7910) cannot be determined using the widely accepted axioms for math called ZFC: that is, Zermelo—Fraenkel set theory together with the axiom of choice. It’s a great story, and you can read it here:

• Scott Aaronson, The 8000th Busy Beaver number eludes ZF set theory: new paper by Adam Yedidia and me, Shtetl-Optimized, 3 May 2016.

• Adam Yedidia and Scott Aaronson, A relatively small Turing machine whose behavior is independent of set theory, 13 May 2016.

Briefly, Yedidia created a new programming language, called Laconic, which lets you write programs that compile down to small Turing machines. They took an arithmetic statement created by Harvey Friedman that’s equivalent to the consistency of the usual axioms of ZFC together with a large cardinal axiom called the ‘stationary Ramsey property’, or SRP. And they created a Turing machine with 7910 states that seeks a proof of this arithmetic statement using the axioms of ZFC.

Since ZFC can’t prove its own consistency, much less its consistency when supplemented with SRP, their machine will only halt if ZFC+SRP is inconsistent.

Since most set theorists believe ZFC+SRP is consistent, this machine probably doesn’t halt. But we can’t prove this using ZFC.

In short: if the usual axioms of set theory are consistent, we can never use them to determine the value of BB(7910).

The basic idea is nothing new: what’s new is the explicit and rather low value of the number 7910. Poetically speaking, we know the unknowable starts here… if not sooner.

However, this discovery set off a wave of improvements! On the Metamath newsgroup, Mario Carneiro and others started ‘logic hacking’, looking for smaller and smaller Turing machines that would only halt if ZF—that is, Zermelo–Fraenkel set theory, without the axiom of choice—is inconsistent.

By just May 15th, Stefan O’Rear seems to have brought the number down to 1919. He found a Turing machine with just 1919 states that searches for an inconsistency in the ZF axioms. Interestingly, this turned out to work better than using Harvey Friedman’s clever trick.

Thus, if O’Rear’s work is correct, we can only determine BB(1919) if we can determine whether ZF set theory is consistent. However, we cannot do this using ZF set theory—unless we find an inconsistency in ZF set theory.

For details, see:

• Stefan O’Rear, A Turing machine Metamath verifier, 15 May 2016.

I haven’t checked his work, but it’s available on GitHub.

What’s the point of all this? At present, it’s mainly just a game. However, it should have some interesting implications. It should, for example, help us better locate the ‘complexity barrier’.

I explained that idea here:

• John Baez, The complexity barrier, Azimuth, 28 October 2011.

Briefly, while there’s no limit on how much information a string of bits—or any finite structure—can have, there’s a limit on how much information we can prove it has!

This amount of information is pretty low, perhaps a few kilobytes. And I believe the new work on logic hacking can be used to estimate it more accurately!