• John Baez and Brendan Fong, A compositional framework for passive linear networks.
While my paper with Jason Erbele, Categories in control, studies signal flow diagrams, this one focuses on circuit diagrams. The two are different, but closely related.
I’ll talk about their relation at the Turin workshop in May. For now, let me just talk about this paper with Brendan. There’s a lot in here, but let me just try to explain the main result. It’s all about ‘black boxing’: hiding the details of a circuit and only remembering its behavior as seen from outside.
In the late 1940s, just as Feynman was developing his diagrams for processes in particle physics, Eilenberg and Mac Lane initiated their work on category theory. Over the subsequent decades, and especially in the work of Joyal and Street in the 1980s, it became clear that these developments were profoundly linked: monoidal categories have a precise graphical representation in terms of string diagrams, and conversely monoidal categories provide an algebraic foundation for the intuitions behind Feynman diagrams. The key insight is the use of categories where morphisms describe physical processes, rather than structure-preserving maps between mathematical objects.
In work on fundamental physics, the cutting edge has moved from categories to higher categories. But the same techniques have filtered into more immediate applications, particularly in computation and quantum computation. Our paper is part of a new program of applying string diagrams to engineering, with the aim of giving diverse diagram languages a unified foundation based on category theory.
Indeed, even before physicists began using Feynman diagrams, various branches of engineering were using diagrams that in retrospect are closely related. Foremost among these are the ubiquitous electrical circuit diagrams. Although less well-known, similar diagrams are used to describe networks consisting of mechanical, hydraulic, thermodynamic and chemical systems. Further work, pioneered in particular by Forrester and Odum, applies similar diagrammatic methods to biology, ecology, and economics.
As discussed in detail by Olsen, Paynter and others, there are mathematically precise analogies between these different systems. In each case, the system’s state is described by variables that come in pairs, with one variable in each pair playing the role of ‘displacement’ and the other playing the role of ‘momentum’. In engineering, the time derivatives of these variables are sometimes called ‘flow’ and ‘effort’.
| | displacement | flow | momentum | effort |
| Mechanics: translation | position | velocity | momentum | force |
| Mechanics: rotation | angle | angular velocity | angular momentum | torque |
| Electronics | charge | current | flux linkage | voltage |
| Hydraulics | volume | flow | pressure momentum | pressure |
| Thermal Physics | entropy | entropy flow | temperature momentum | temperature |
| Chemistry | moles | molar flow | chemical momentum | chemical potential |
In classical mechanics, this pairing of variables is well understood using symplectic geometry. Thus, any mathematical formulation of the diagrams used to describe networks in engineering needs to take symplectic geometry as well as category theory into account.
While diagrams of networks have been independently introduced in many disciplines, we do not expect formalizing these diagrams to immediately help the practitioners of these disciplines. At first the flow of information will mainly go in the other direction: by translating ideas from these disciplines into the language of modern mathematics, we can provide mathematicians with food for thought and interesting new problems to solve. We hope that in the long run mathematicians can return the favor by bringing new insights to the table.
Although we keep the broad applicability of network diagrams in the back of our minds, our paper talks in terms of electrical circuits, for the sake of familiarity. We also consider a somewhat limited class of circuits. We only study circuits built from ‘passive’ components: that is, those that do not produce energy. Thus, we exclude batteries and current sources. We only consider components that respond linearly to an applied voltage. Thus, we exclude components such as nonlinear resistors or diodes. Finally, we only consider components with one input and one output, so that a circuit can be described as a graph with edges labeled by components. Thus, we also exclude transformers. The most familiar components our framework covers are linear resistors, capacitors and inductors.
While we want to expand our scope in future work, the class of circuits made from these components has appealing mathematical properties, and is worthy of deep study. Indeed, these circuits have been studied intensively for many decades by electrical engineers. Even circuits made exclusively of resistors have inspired work by mathematicians of the caliber of Weyl and Smale!
Our work relies on this research. All we are adding is an emphasis on symplectic geometry and an explicitly ‘compositional’ framework, which clarifies the way a larger circuit can be built from smaller pieces. This is where monoidal categories become important: the main operations for building circuits from pieces are composition and tensoring.
Our strategy is most easily illustrated for circuits made of linear resistors. Such a resistor dissipates power, turning useful energy into heat at a rate determined by the voltage across the resistor. However, a remarkable fact is that a circuit made of these resistors always acts to minimize the power dissipated this way. This ‘principle of minimum power’ can be seen as the reason symplectic geometry becomes important in understanding circuits made of resistors, just as the principle of least action leads to the role of symplectic geometry in classical mechanics.
Here is a circuit made of linear resistors:
The wiggly lines are resistors, and their resistances are written beside them: for example, $3\Omega$ means 3 ohms, an ‘ohm’ being a unit of resistance. To formalize this, define a circuit of linear resistors to consist of:
• a set $N$ of nodes,
• a set $E$ of edges,
• maps $s, t : E \to N$ sending each edge to its source and target node,
• a map $r : E \to (0,\infty)$ specifying the resistance of the resistor labelling each edge,
• maps $i : X \to N$ and $o : Y \to N$ specifying the inputs and outputs of the circuit.
When we run electric current through such a circuit, each node $n$ gets a potential $\phi(n)$. The voltage across an edge $e$ is defined as the change in potential as we move from the source $s(e)$ of $e$ to its target $t(e)$, namely $\phi(t(e)) - \phi(s(e))$. The power dissipated by the resistor on this edge, whose resistance we write $r(e)$, is then
$$ \frac{1}{r(e)}\big(\phi(t(e)) - \phi(s(e))\big)^2 $$
The total power dissipated by the circuit is therefore twice
$$ P(\phi) = \frac{1}{2} \sum_{e \in E} \frac{1}{r(e)}\big(\phi(t(e)) - \phi(s(e))\big)^2 $$
The factor of $\frac{1}{2}$ is convenient in some later calculations.
Note that $P$ is a nonnegative quadratic form on the vector space $\mathbb{R}^N$. However, not every nonnegative definite quadratic form on $\mathbb{R}^N$ arises in this way from some circuit of linear resistors with $N$ as its set of nodes. The quadratic forms that do arise are called Dirichlet forms. They have been extensively investigated, and they play a major role in our work.
We write
$$ \partial N = \mathrm{im}(i) \cup \mathrm{im}(o) $$
for the set of terminals: that is, nodes corresponding to inputs or outputs, the images of the input map $i$ and the output map $o$. The principle of minimum power says that if we fix the potential at the terminals, the circuit will choose the potential at other nodes to minimize the total power dissipated. An element $\psi$ of the vector space $\mathbb{R}^{\partial N}$ assigns a potential to each terminal. Thus, if we fix $\psi$, the total power dissipated will be twice
$$ Q(\psi) = \min_{\phi \in \mathbb{R}^N, \; \phi|_{\partial N} = \psi} P(\phi) $$
The function $Q : \mathbb{R}^{\partial N} \to \mathbb{R}$ is again a Dirichlet form. We call it the power functional of the circuit.
Now, suppose we are unable to see the internal workings of a circuit, and can only observe its ‘external behavior’: that is, the potentials at its terminals and the currents flowing into or out of these terminals. Remarkably, this behavior is completely determined by the power functional $Q$. The reason is that the current at any terminal can be obtained by differentiating $Q$ with respect to the potential at this terminal, and relations of this form are all the relations that hold between potentials and currents at the terminals.
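To make the principle of minimum power concrete, here is a minimal sketch in Python. It is not code from the paper. It takes the power functional to be half the sum over edges of the squared voltage divided by the resistance, minimized over the potentials at the non-terminal nodes. The function names, the star-shaped example circuit and the node labels are all assumptions made for illustration, and the solver assumes every interior node is connected, possibly through other interior nodes, to some terminal.

```python
from fractions import Fraction

def power(edges, phi):
    """Half the total power dissipated, given potentials phi on all nodes."""
    return sum(Fraction(1, 2) * Fraction(phi[t] - phi[s]) ** 2 / r
               for s, t, r in edges)

def extend_to_minimizer(edges, nodes, psi):
    """Extend terminal potentials psi to all nodes so that power() is minimal.

    Setting the derivative of the power with respect to each interior
    potential to zero gives a linear system A x = b, solved here exactly
    by Gauss-Jordan elimination over the rationals.
    """
    interior = [n for n in nodes if n not in psi]
    index = {n: i for i, n in enumerate(interior)}
    k = len(interior)
    A = [[Fraction(0)] * k for _ in range(k)]
    b = [Fraction(0)] * k
    for s, t, r in edges:
        g = Fraction(1) / r  # conductance of this edge
        for here, there in ((s, t), (t, s)):
            if here in index:
                i = index[here]
                A[i][i] += g
                if there in index:
                    A[i][index[there]] -= g
                else:
                    b[i] += g * psi[there]
    for c in range(k):
        pivot = next(i for i in range(c, k) if A[i][c] != 0)
        A[c], A[pivot] = A[pivot], A[c]
        b[c], b[pivot] = b[pivot], b[c]
        for i in range(k):
            if i != c and A[i][c] != 0:
                f = A[i][c] / A[c][c]
                A[i] = [x - f * y for x, y in zip(A[i], A[c])]
                b[i] -= f * b[c]
    phi = dict(psi)
    for n, i in index.items():
        phi[n] = b[i] / A[i][i]
    return phi

def power_functional(edges, nodes, psi):
    """Minimum of power() over all extensions of the terminal potentials psi."""
    return power(edges, extend_to_minimizer(edges, nodes, psi))

# Hypothetical example: three 1-ohm resistors meeting at an interior node "m".
nodes = ["a", "b", "c", "m"]
edges = [("a", "m", 1), ("b", "m", 1), ("c", "m", 1)]
psi = {"a": 0, "b": 0, "c": 3}

phi = extend_to_minimizer(edges, nodes, psi)
Q = power_functional(edges, nodes, psi)

# The power functional is quadratic, so a central difference is exact.
dQ_dc = (power_functional(edges, nodes, {**psi, "c": 4})
         - power_functional(edges, nodes, {**psi, "c": 2})) / 2
# Current through the resistor at terminal "c" by Ohm's law.
ohm_current_c = Fraction(phi["c"] - phi["m"]) / 1
```

In this example the interior node settles at the average of its neighbours' potentials, and the derivative of the power functional with respect to the potential at terminal "c" agrees with the current through the resistor attached to that terminal, computed from Ohm's law. Which direction counts as positive current is a convention.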
The Laplace transform allows us to generalize this immediately to circuits that can also contain linear inductors and capacitors, simply by changing the field we work over, replacing $\mathbb{R}$ by the field $\mathbb{R}(s)$ of rational functions of a single real variable, and talking of impedance where we previously talked of resistance. We obtain a category $\mathrm{Circ}$ where an object is a finite set, a morphism $f : X \to Y$ is a circuit with input set $X$ and output set $Y$, and composition is given by identifying the outputs of one circuit with the inputs of the next, and taking the resulting union of labelled graphs. Each such circuit gives rise to a Dirichlet form, now defined over $\mathbb{R}(s)$, and this Dirichlet form completely describes the externally observable behavior of the circuit.
We can take equivalence classes of circuits, where two circuits count as the same if they have the same Dirichlet form. We wish for these equivalence classes of circuits to form a category. Although there is a notion of composition for Dirichlet forms, we find that it lacks identity morphisms or, equivalently, it lacks morphisms representing ideal wires of zero impedance. To address this we turn to Lagrangian subspaces of symplectic vector spaces. These generalize quadratic forms via the map
$$ Q \mapsto \mathrm{Graph}(dQ) = \{ (\psi, dQ_\psi) : \psi \in V \} \subseteq V \oplus V^* $$
taking a quadratic form $Q$ on the vector space $V$ over the field $\mathbb{F}$ to the graph of its differential $dQ$. Here we think of the symplectic vector space $V \oplus V^*$ as the state space of the circuit, and the subspace $\mathrm{Graph}(dQ)$ as the subspace of attainable states, with $\psi \in V$ describing the potentials at the terminals, and $dQ_\psi \in V^*$ the currents.
This construction is well-known in classical mechanics, where the principle of least action plays a role analogous to that of the principle of minimum power here. The set of Lagrangian subspaces is actually an algebraic variety, the Lagrangian Grassmannian, which serves as a compactification of the space of quadratic forms. The Lagrangian Grassmannian has already played a role in Sabot’s work on circuits made of resistors. For us, its importance is that we can find identity morphisms for the composition of Dirichlet forms by taking circuits made of parallel resistors and letting their resistances tend to zero: the limit is not a Dirichlet form, but it exists in the Lagrangian Grassmannian.
Indeed, there exists a category $\mathrm{LagrRel}$ with finite dimensional symplectic vector spaces as objects and Lagrangian relations as morphisms: that is, linear relations from $V$ to $W$ that are given by Lagrangian subspaces of $\overline{V} \oplus W$, where $\overline{V}$ is the symplectic vector space conjugate to $V$, meaning $V$ with the sign of the symplectic structure switched.
To move from the Lagrangian subspace defined by the graph of the differential of the power functional to a morphism in the category $\mathrm{LagrRel}$, that is, to a Lagrangian relation, we must treat seriously the input and output functions of the circuit. These express the circuit as built upon a cospan:
Applicable far more broadly than this present formalization of circuits, cospans model systems with two ‘ends’, an input and output end, albeit without any connotation of directionality: we might just as well exchange the role of the inputs and outputs by taking the mirror image of the above diagram. The role of the input and output functions, as we have discussed, is to mark the terminals we may glue onto the terminals of another circuit, and the pushout of cospans gives formal precision to this gluing construction.
One upshot of this cospan framework is that we may consider circuits with elements of that are both inputs and outputs, such as this one:
This corresponds to the identity morphism on the finite set with two elements. Another is that some points may be considered an input or output multiple times, like here:
This lets us connect two distinct outputs to the above double input.
Given a set of inputs or outputs, we understand the electrical behavior on this set by considering the symplectic vector space the direct sum of the space of potentials and the space of currents at these points. A Lagrangian relation specifies which states of the output space are allowed for each state of the input space Turning the Lagrangian subspace of a circuit into this information requires that we understand the ‘symplectification’
and ‘twisted symplectification’
of a function between finite sets. In particular we need to understand how these apply to the input and output functions with codomain restricted to ; abusing notation, we also write these and
The symplectification is itself a Lagrangian relation, and the catch phrase is that it ‘copies voltages’ and ‘splits currents’. More precisely, for any given potential-current pair in its image under consists of all elements of in such that the potential at is equal to the potential at and such that, for each fixed collectively the currents at the sum to the current at We use the symplectification of the output function to relate the state on to that on the outputs
As our current framework is set up to report the current out of each node, to describe input currents we define the twisted symplectification:
almost identically to the above, except that we flip the sign of the currents We use the twisted symplectification of the input function to relate the state on to that on the inputs. The overline here means we’re reversing the sign of the symplectic structure.
The Lagrangian relation corresponding to a circuit then comprises exactly a list of the potential-current pairs that are possible electrical states of the inputs and outputs of the circuit. In doing so, it identifies distinct circuits. A simple example of this is the identification of a single 2-ohm resistor:
with two 1-ohm resistors in series:
Our inability to access the internal workings of a circuit in this representation inspires us to call this process black boxing: you should imagine encasing the circuit in an opaque black box, leaving only the terminals accessible. Fortunately, this information is enough to completely characterize the external behavior of a circuit, including how it interacts when connected with other circuits!
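The identification of a single 2-ohm resistor with two 1-ohm resistors in series can be checked by hand. This is a sketch under a few assumptions: the power dissipated by a resistor is taken to be the squared voltage across it divided by its resistance, the power functional is half the minimized total, and the symbols $\phi_0, \phi_2$ for the terminal potentials and $\phi_1$ for the middle node are names chosen here for illustration. For the series circuit, half the power dissipated is

$$ P(\phi) = \tfrac{1}{2}\left[ (\phi_1 - \phi_0)^2 + (\phi_2 - \phi_1)^2 \right] $$

Setting $\partial P / \partial \phi_1 = 0$ gives $\phi_1 = \tfrac{1}{2}(\phi_0 + \phi_2)$, so the minimum is

$$ Q(\phi_0, \phi_2) = \tfrac{1}{2}\left[ \tfrac{1}{4}(\phi_2 - \phi_0)^2 + \tfrac{1}{4}(\phi_2 - \phi_0)^2 \right] = \tfrac{1}{2} \cdot \tfrac{1}{2} (\phi_2 - \phi_0)^2 $$

which is exactly the power functional of a single resistor of resistance 2 between the same two terminals.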
Put more precisely, the black boxing process is functorial: we can compute the black-boxed version of a circuit made of parts by computing the black-boxed versions of the parts and then composing them. In fact we shall prove that the category of circuits, $\mathrm{Circ}$, and the category of Lagrangian relations, $\mathrm{LagrRel}$, are dagger compact categories, and the black box functor preserves all this extra structure:
Theorem. There exists a symmetric monoidal dagger functor, the black box functor
mapping a finite set to the symplectic vector space it generates, and a circuit to the Lagrangian relation
where is the circuit’s power functional.
The goal of this paper is to prove and explain this result. The proof is more tricky than one might first expect, but our approach involves concepts that should be useful throughout the study of networks, such as ‘decorated cospans’ and ‘corelations’.
Give it a read, and let us know if you have questions or find mistakes!
A while back I decided one way to apply my math skills to help save the planet was to start pushing toward green mathematics: a kind of mathematics that can interact with biology and ecology just as fruitfully as traditional mathematics interacts with physics. As usual with math, the payoffs will come slowly, but they may be large. It’s not a substitute for doing other, more urgent things—but if mathematicians don’t do this, who will?
As a first step in this direction, I decided to study networks.
This May, a small group of mathematicians is meeting in Turin for a workshop on the categorical foundations of network theory, organized by Jacob Biamonte. I’m trying to get us mentally prepared for this. We all have different ideas, yet they should fit together somehow.
Tobias Fritz, Eugene Lerman and David Spivak have all written articles here about their work, though I suspect Eugene will have a lot of completely new things to say, too. Now it’s time for me to say what my students and I have been doing.
Despite my ultimate aim of studying biological and ecological networks, I decided to start by clarifying the math of networks that appear in chemistry and engineering, since these are simpler, better understood, useful in their own right, and probably a good warmup for the grander goal. I’ve been working with Brendan Fong on electrical circuits, and with Jason Erbele on control theory. Let me talk about this paper:
• John Baez and Jason Erbele, Categories in control.
Control theory is the branch of engineering that focuses on manipulating open systems—systems with inputs and outputs—to achieve desired goals. In control theory, signal-flow diagrams are used to describe linear ways of manipulating signals, for example smooth real-valued functions of time. Here’s a real-world example; click the picture for more details:
For a category theorist, at least, it is natural to treat signal-flow diagrams as string diagrams in a symmetric monoidal category. This forces some small changes of perspective, which I’ll explain, but more important is the question: which symmetric monoidal category?
We argue that the answer is: the category of finite-dimensional vector spaces over a certain field $k$, but with linear relations rather than linear maps as morphisms, and direct sum rather than tensor product providing the symmetric monoidal structure. We use the field $k = \mathbb{R}(s)$ consisting of rational functions in one real variable $s$. This variable has the meaning of differentiation. A linear relation from $k^m$ to $k^n$ is thus a system of linear constant-coefficient ordinary differential equations relating $m$ ‘input’ signals and $n$ ‘output’ signals.
Our main goal in this paper is to provide a complete ‘generators and relations’ picture of this symmetric monoidal category, with the generators being familiar components of signal-flow diagrams. It turns out that the answer has an intriguing but mysterious connection to ideas that are familiar in the diagrammatic approach to quantum theory! Quantum theory also involves linear algebra, but it uses linear maps between Hilbert spaces as morphisms, and the tensor product of Hilbert spaces provides the symmetric monoidal structure.
We hope that the category-theoretic viewpoint on signal-flow diagrams will shed new light on control theory. However, in this paper we only lay the groundwork.
There are several basic operations that one wants to perform when manipulating signals. The simplest is multiplying a signal by a scalar. A signal can be amplified by a constant factor:
$$ f \mapsto cf $$
where $c \in \mathbb{R}$. We can write this as a string diagram:
Here the labels $f$ and $cf$ on top and bottom are just for explanatory purposes and not really part of the diagram. Control theorists often draw arrows on the wires, but this is unnecessary from the string diagram perspective. Arrows on wires are useful to distinguish objects from their duals, but ultimately we will obtain a compact closed category where each object is its own dual, so the arrows can be dropped. What we really need is for the box denoting scalar multiplication to have a clearly defined input and output. This is why we draw it as a triangle. Control theorists often use a rectangle or circle, using arrows on wires to indicate which carries the input and which the output.
A signal can also be integrated with respect to the time variable:
$$ f \mapsto \int f $$
Mathematicians typically take differentiation as fundamental, but engineers sometimes prefer integration, because it is more robust against small perturbations. In the end it will not matter much here. We can again draw integration as a string diagram:
Since this looks like the diagram for scalar multiplication, it is natural to extend $\mathbb{R}$ to $\mathbb{R}(s)$, the field of rational functions of a variable $s$ which stands for differentiation. Then differentiation becomes a special case of scalar multiplication, namely multiplication by $s$, and integration becomes multiplication by $1/s$. Engineers accomplish the same effect with Laplace transforms, since differentiating a signal $f$ is equivalent to multiplying its Laplace transform
$$ (\mathcal{L}f)(s) = \int_0^\infty f(t) e^{-st} \, dt $$
by the variable $s$. Another option is to use the Fourier transform: differentiating $f$ is equivalent to multiplying its Fourier transform
$$ (\mathcal{F}f)(\omega) = \int_{-\infty}^\infty f(t) e^{-i\omega t} \, dt $$
by $i\omega$. Of course, the function $f$ needs to be sufficiently well-behaved to justify calculations involving its Laplace or Fourier transform. At a more basic level, it also requires some work to treat integration as the two-sided inverse of differentiation. Engineers do this by considering signals that vanish for $t < 0$ and choosing the antiderivative that vanishes under the same condition. Luckily all these issues can be side-stepped in a formal treatment of signal-flow diagrams: we can simply treat signals as living in an unspecified vector space over the field $\mathbb{R}(s)$. The field $\mathbb{C}(s)$ would work just as well, and control theory relies heavily on complex analysis. In our paper we work over an arbitrary field $k$.
The simplest possible signal processor is a rock, which takes the 'input' given by the force $F$ on the rock and produces as 'output' the rock's position $q$. Thanks to Newton's second law $F = ma$, we can describe this using a signal-flow diagram:
Here composition of morphisms is drawn in the usual way, by attaching the output wire of one morphism to the input wire of the next.
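Written as an equation rather than a picture, and reading Newton's second law as $F = m\ddot{q}$ with $F$ the force, $q$ the position and $m$ the rock's mass (symbols assumed here for illustration), the composite is three scalar multiplications over the field of rational functions in $s$, where $1/s$ stands for integration:

$$ q = \frac{1}{s} \cdot \frac{1}{s} \cdot \frac{1}{m} \, F $$

That is: scale the force by $1/m$ to get the acceleration, integrate once to get the velocity, and integrate again to get the position.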
To build more interesting machines we need more building blocks, such as addition:
$$ + : (f, g) \mapsto f + g $$
and duplication:
$$ \Delta : f \mapsto (f, f) $$
When these linear maps are written as matrices, their matrices are transposes of each other. This is reflected in the string diagrams for addition and duplication:
The second is essentially an upside-down version of the first. However, we draw addition as a dark triangle and duplication as a light one because we will later want another way to ‘turn addition upside-down’ that does not give duplication. As an added bonus, a light upside-down triangle resembles the Greek letter $\Delta$, the usual symbol for duplication.
While they are typically not considered worthy of mention in control theory, for completeness we must include two other building blocks. One is the zero map from the zero-dimensional vector space $\{0\}$ to our field $k$, which we denote as $0$ and draw as follows:
The other is the zero map from $k$ to $\{0\}$, sometimes called ‘deletion’, which we denote as $!$ and draw thus:
Just as the matrices for addition and duplication are transposes of each other, so are the matrices for zero and deletion, though they are rather degenerate, being $1 \times 0$ and $0 \times 1$ matrices, respectively. Addition and zero make $k$ into a commutative monoid, meaning that the following relations hold:
The equation at right is the commutative law, and the crossing of strands is the braiding:
by which we switch two signals. In fact this braiding is a symmetry, so it does not matter which strand goes over which:
Dually, duplication and deletion make $k$ into a cocommutative comonoid. This means that if we reflect the equations obeyed by addition and zero across the horizontal axis and turn dark operations into light ones, we obtain another set of valid equations:
There are also relations between the monoid and comonoid operations. For example, adding two signals and then duplicating the result gives the same output as duplicating each signal and then adding the results:
This diagram is familiar in the theory of Hopf algebras, or more generally bialgebras. Here it is an example of the fact that the monoid operations on $k$ are comonoid homomorphisms, or equivalently, the comonoid operations are monoid homomorphisms.
We summarize this situation by saying that $k$ is a bimonoid. These are all the bimonoid laws, drawn as diagrams:
The last equation means we can actually make the diagram at left disappear, since it equals the identity morphism on the 0-dimensional vector space, which is drawn as nothing.
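As a small sanity check, the bimonoid laws can be tested on sample values. The sketch below is not from the paper: it uses rational numbers as stand-ins for signals, writes the four operations as ordinary Python functions with hypothetical names, and also checks that the matrices of addition and duplication are transposes of each other.

```python
from fractions import Fraction

def add(x, y):
    """Addition: two inputs, one output."""
    return x + y

def duplicate(x):
    """Duplication: one input, two outputs."""
    return (x, x)

def zero():
    """Zero: no inputs, one output."""
    return Fraction(0)

def delete(x):
    """Deletion: one input, no outputs (the empty tuple)."""
    return ()

ADD_MATRIX = [[1, 1]]    # 1 x 2 matrix of addition
DUP_MATRIX = [[1], [1]]  # 2 x 1 matrix of duplication

def transpose(m):
    return [list(row) for row in zip(*m)]

# Sample values standing in for signals (an illustrative choice).
samples = [Fraction(n, d) for n in range(-2, 3) for d in (1, 3)]

def bimonoid_laws_hold():
    for x in samples:
        for y in samples:
            x1, x2 = duplicate(x)
            y1, y2 = duplicate(y)
            # Adding then duplicating equals duplicating each, then adding.
            if duplicate(add(x, y)) != (add(x1, y1), add(x2, y2)):
                return False
            # Adding then deleting equals deleting both inputs.
            if delete(add(x, y)) != delete(x) + delete(y):
                return False
    # Duplicating zero gives two zeros; deleting zero gives nothing at all.
    return duplicate(zero()) == (zero(), zero()) and delete(zero()) == ()
```

Checking finitely many samples is of course no proof, but for linear maps like these the laws are immediate from the definitions, and the code just spells out what each diagram asserts.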
So far all our string diagrams denote linear maps. We can treat these as morphisms in the category $\mathrm{FinVect}_k$, where objects are finite-dimensional vector spaces over a field $k$ and morphisms are linear maps. This category is equivalent to the category where the only objects are vector spaces $k^n$ for $n \ge 0$, and then morphisms can be seen as matrices. The space of signals is a vector space $V$ over $k$ which may not be finite-dimensional, but this does not cause a problem: an $n \times m$ matrix with entries in $k$ still defines a linear map from $V^m$ to $V^n$ in a functorial way.
In applications of string diagrams to quantum theory, we make $\mathrm{FinVect}_k$ into a symmetric monoidal category using the tensor product of vector spaces. In control theory, we instead make $\mathrm{FinVect}_k$ into a symmetric monoidal category using the direct sum of vector spaces. In Lemma 1 of our paper we prove that for any field $k$, $\mathrm{FinVect}_k$ with direct sum is generated as a symmetric monoidal category by the one object $k$ together with these morphisms:
where $c \in k$ is arbitrary.
However, these generating morphisms obey some unexpected relations! For example, we have:
So it is important to find a complete set of relations obeyed by these generating morphisms, thereby obtaining a presentation of $\mathrm{FinVect}_k$ as a symmetric monoidal category. We do this in Theorem 2. In brief, these relations say:
(1) $(k, +, 0, \Delta, !)$ is a bicommutative bimonoid;
(2) the rig operations of $k$ can be recovered from the generating morphisms;
(3) all the generating morphisms commute with scalar multiplication.
Here item (2) means that $+$, $\cdot$, $0$ and $1$ in the field $k$ can be expressed in terms of signal-flow diagrams as follows:
Multiplicative inverses cannot be so expressed, so our signal-flow diagrams so far do not know that $k$ is a field. Additive inverses also cannot be expressed in this way. So, we expect that a version of Theorem 2 will hold whenever $k$ is a mere rig: that is, a ‘ring without negatives’, like the natural numbers. The one change is that instead of working with vector spaces, we should work with finitely presented free $k$-modules.
Item (3), the fact that all our generating morphisms commute with scalar multiplication, amounts to these diagrammatic equations:
While Theorem 2 is a step towards understanding the category-theoretic underpinnings of control theory, it does not treat signal-flow diagrams that include ‘feedback’. Feedback is one of the most fundamental concepts in control theory because a control system without feedback may be highly sensitive to disturbances or unmodeled behavior. Feedback allows these uncontrolled behaviors to be mollified. As a string diagram, a basic feedback system might look schematically like this:
The user inputs a ‘reference’ signal, which is fed into a controller, whose output is fed into a system, which control theorists call a ‘plant’, which in turn produces its own output. But then the system’s output is duplicated, and one copy is fed into a sensor, whose output is added to (or, if we prefer, subtracted from) the reference signal.
In string diagrams, unlike in the usual thinking on control theory, it is essential to be able to read any diagram from top to bottom as a composite of tensor products of generating morphisms. Thus, to incorporate the idea of feedback, we need two more generating morphisms. These are the ‘cup’:
and ‘cap’:
These are not maps: they are relations. The cup imposes the relation that its two inputs be equal, while the cap does the same for its two outputs. This is a way of describing how a signal flows around a bend in a wire.
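To make this precise, we use a category called $\mathrm{FinRel}_k$. An object of this category is a finite-dimensional vector space over $k$, while a morphism from $U$ to $V$, denoted $L : U \nrightarrow V$, is a linear relation, meaning a linear subspace
$$ L \subseteq U \oplus V $$
In particular, when $k = \mathbb{R}(s)$, a linear relation $L : k^m \nrightarrow k^n$ is just an arbitrary system of constant-coefficient linear ordinary differential equations relating $m$ input variables and $n$ output variables.
Since the direct sum $U \oplus V$ is also the cartesian product of $U$ and $V$, a linear relation is indeed a relation in the usual sense, but with the property that if $u \in U$ is related to $v \in V$ and $u' \in U$ is related to $v' \in V$ then $cu + c'u'$ is related to $cv + c'v'$ whenever $c, c' \in k$.
We compose linear relations $L : U \nrightarrow V$ and $L' : V \nrightarrow W$ as follows:
$$ L'L = \{ (u, w) : \; \exists v \in V \;\; (u, v) \in L \textrm{ and } (v, w) \in L' \} $$
Any linear map $f : U \to V$ gives a linear relation $F : U \nrightarrow V$, namely the graph of that map:
$$ F = \{ (u, f(u)) : u \in U \} $$
Composing linear maps thus becomes a special case of composing linear relations, so $\mathrm{FinVect}_k$ becomes a subcategory of $\mathrm{FinRel}_k$. Furthermore, we can make $\mathrm{FinRel}_k$ into a monoidal category using direct sums, and it becomes symmetric monoidal using the braiding already present in $\mathrm{FinVect}_k$.
In these terms, the cup is the linear relation
$$ \cup : k^2 \nrightarrow \{0\} $$
given by
$$ \cup = \{ (x, x, 0) : x \in k \} \subseteq k^2 \oplus \{0\} $$
while the cap is the linear relation
$$ \cap : \{0\} \nrightarrow k^2 $$
given by
$$ \cap = \{ (0, x, x) : x \in k \} \subseteq \{0\} \oplus k^2 $$
These obey the zigzag relations:
Thus, they make $\mathrm{FinRel}_k$ into a compact closed category where $k$, and thus every object, is its own dual.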
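Since the paper works over an arbitrary field, the zigzag relations can be checked by brute force over a small finite field, where a linear relation is just a finite set of pairs. The following is a minimal sketch and not code from the paper. The choice of the field with 3 elements, the representation of vectors as tuples, and all function names are assumptions made for illustration.

```python
from itertools import product

P = 3  # work over the field of integers mod 3 (illustrative choice)

def vectors(n):
    """All vectors in k^n, as tuples; k^0 has the single vector ()."""
    return list(product(range(P), repeat=n))

def graph(f, m):
    """The linear relation given by the graph of a map f from k^m."""
    return {(u, f(u)) for u in vectors(m)}

def compose(L, M):
    """Relation obtained by doing L first and then M."""
    return {(u, w) for (u, v) in L for (v2, w) in M if v == v2}

def oplus(L, M):
    """Direct sum of relations: put L and M side by side."""
    return {(u1 + u2, v1 + v2) for (u1, v1) in L for (u2, v2) in M}

identity = graph(lambda u: u, 1)

# Cup: two inputs constrained to be equal, no outputs.
cup = {((x, x), ()) for x in range(P)}
# Cap: no inputs, two outputs constrained to be equal.
cap = {((), (x, x)) for x in range(P)}

# The two zigzags: bend a wire back and forth, then straighten it.
zigzag_1 = compose(oplus(identity, cap), oplus(cup, identity))
zigzag_2 = compose(oplus(cap, identity), oplus(identity, cup))

# Composing graphs of maps agrees with composing the maps:
# doubling twice is multiplication by 4, which is 1 mod 3.
double = graph(lambda u: ((2 * u[0]) % P,), 1)
double_twice = compose(double, double)
```

Both zigzags come out equal to the identity relation on $k$, which is what makes $k$ its own dual. The last two lines illustrate that composing graphs of linear maps is a special case of composing linear relations.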
Besides feedback, one of the things that make the cap and cup useful is that they allow any morphism to be ‘plugged in backwards’ and thus ‘turned around’. For instance, turning around integration:
we obtain differentiation. In general, using caps and cups we can turn around any linear relation $L : U \nrightarrow V$ and obtain a linear relation $L^\dagger : V \nrightarrow U$ called the adjoint of $L$, which turns out to be given by
$$ L^\dagger = \{ (v, u) : (u, v) \in L \} $$
For example, if $c \in k$ is nonzero, the adjoint of scalar multiplication by $c$ is multiplication by $c^{-1}$:
Thus, caps and cups allow us to express multiplicative inverses in terms of signal-flow diagrams! One might think that a problem arises when $c = 0$, but no: the adjoint of scalar multiplication by $0$ is the linear relation $\{ (0, x) : x \in k \}$:
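Here is a small check of this behaviour, again by brute force over a finite field. It is a sketch under stated assumptions, not the paper's construction: the field is the integers mod 5, relations from $k$ to $k$ are stored as sets of pairs of 1-tuples, the adjoint is taken to be the relation with each pair reversed, and the function names are hypothetical.

```python
P = 5          # the field of integers mod 5 (illustrative choice)
K = range(P)

def scalar(c):
    """Graph of scalar multiplication by c, as a relation from k to k."""
    return {((x,), ((c * x) % P,)) for x in K}

def adjoint(L):
    """Turn a relation around by reversing every pair."""
    return {(v, u) for (u, v) in L}

def is_map(L):
    """True if every input in k is related to exactly one output."""
    return all(sum(1 for (u, v) in L if u == (x,)) == 1 for x in K)

# For nonzero c the adjoint is multiplication by the inverse of c,
# computed here with Fermat's little theorem: c**(P-2) mod P.
inverses_ok = all(adjoint(scalar(c)) == scalar(pow(c, P - 2, P))
                  for c in range(1, P))

# For c = 0 the adjoint is still a linear relation, but not a map:
# the input is forced to be 0 and the output is unconstrained.
adjoint_of_zero = adjoint(scalar(0))
```

The adjoint of multiplication by 0 relates the input 0 to every output and relates no other input to anything, so it is a genuine relation that is not the graph of any map.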
In Lemma 3 we show that $\mathrm{FinRel}_k$ is generated, as a symmetric monoidal category, by these morphisms:
where $c \in k$ is arbitrary.
In Theorem 4 we find a complete set of relations obeyed by these generating morphisms, thus giving a presentation of $\mathrm{FinRel}_k$ as a symmetric monoidal category. To describe these relations, it is useful to work with adjoints of the generating morphisms. We have already seen that the adjoint of scalar multiplication by $c$ is scalar multiplication by $c^{-1}$, except when $c = 0$. Taking adjoints of the other four generating morphisms of $\mathrm{FinVect}_k$, we obtain four important but perhaps unfamiliar linear relations. We draw these as ‘turned around’ versions of the original generating morphisms:
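• Coaddition is a linear relation from $k$ to $k^2$ that holds when the two outputs sum to the input: $+^\dagger = \{ (x, y, z) : x = y + z \} \subseteq k \oplus k^2$
• Cozero is a linear relation from $k$ to $\{0\}$ that holds when the input is zero: $0^\dagger = \{ (0, 0) \} \subseteq k \oplus \{0\}$
• Coduplication is a linear relation from $k^2$ to $k$ that holds when the two inputs both equal the output: $\Delta^\dagger = \{ (x, y, z) : x = y = z \} \subseteq k^2 \oplus k$
• Codeletion is a linear relation from $\{0\}$ to $k$ that holds always: $!^\dagger = \{ (0, x) : x \in k \} \subseteq \{0\} \oplus k$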
Since $+^\dagger$, $0^\dagger$, $\Delta^\dagger$ and $!^\dagger$ automatically obey turned-around versions of the relations obeyed by $+$, $0$, $\Delta$ and $!$, we see that $k$ acquires a second bicommutative bimonoid structure when considered as an object in $\mathrm{FinRel}_k$.
Moreover, the four dark operations make $k$ into a Frobenius monoid. This means that $(k, +, 0)$ is a monoid, $(k, +^\dagger, 0^\dagger)$ is a comonoid, and the Frobenius relation holds:
All three expressions in this equation are linear relations saying that the sum of the two inputs equals the sum of the two outputs.
The operation sending each linear relation to its adjoint extends to a contravariant functor
$$ \dagger : \mathrm{FinRel}_k \to \mathrm{FinRel}_k $$
which obeys a list of properties that are summarized by saying that $\mathrm{FinRel}_k$ is a †-compact category. Because two of the operations in the Frobenius monoid are adjoints of the other two, it is a †-Frobenius monoid.
This Frobenius monoid is also special, meaning that comultiplication (in this case coaddition) followed by multiplication (in this case addition) equals the identity:
This Frobenius monoid is also commutative—and cocommutative, but for Frobenius monoids this follows from commutativity.
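Starting around 2008, commutative special †-Frobenius monoids have become important in the categorical foundations of quantum theory, where they can be understood as ‘classical structures’ for quantum systems. The category $\mathrm{FinHilb}$ of finite-dimensional Hilbert spaces and linear maps is a †-compact category, where any linear map $f : H \to K$ has an adjoint $f^\dagger : K \to H$ given by
$$ \langle f^\dagger \phi, \psi \rangle = \langle \phi, f \psi \rangle $$
for all $\psi \in H$, $\phi \in K$. A commutative special †-Frobenius monoid in $\mathrm{FinHilb}$ is then the same as a Hilbert space with a chosen orthonormal basis. The reason is that given an orthonormal basis $\psi_i$ for a finite-dimensional Hilbert space $H$, we can make $H$ into a commutative special †-Frobenius monoid with multiplication $m : H \otimes H \to H$ given by
$$ m(\psi_i \otimes \psi_j) = \left\{ \begin{array}{cl} \psi_i & i = j \\ 0 & i \ne j \end{array} \right. $$
and unit $i : \mathbb{C} \to H$ given by
$$ i(1) = \sum_i \psi_i $$
The comultiplication $m^\dagger$ duplicates basis states:
$$ m^\dagger(\psi_i) = \psi_i \otimes \psi_i $$
Conversely, any commutative special †-Frobenius monoid in $\mathrm{FinHilb}$ arises this way.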
Considerably earlier, around 1995, commutative Frobenius monoids were recognized as important in topological quantum field theory. The reason, ultimately, is that the free symmetric monoidal category on a commutative Frobenius monoid is $2\mathrm{Cob}$, the category with 2-dimensional oriented cobordisms as morphisms. But the free symmetric monoidal category on a commutative special Frobenius monoid was worked out even earlier: it is the category with finite sets as objects, where a morphism $f : X \to Y$ is an isomorphism class of cospans
$$ X \longrightarrow S \longleftarrow Y $$
This category can be made into a †-compact category in an obvious way, and then the 1-element set becomes a commutative special †-Frobenius monoid.
For all these reasons, it is interesting to find a commutative special †-Frobenius monoid lurking at the heart of control theory! However, the Frobenius monoid here has yet another property, which is more unusual. Namely, the unit followed by the counit is the identity:
We call a special Frobenius monoid that also obeys this extra law extra-special. One can check that the free symmetric monoidal category on a commutative extra-special Frobenius monoid is the category with finite sets as objects, where a morphism is an equivalence relation on the disjoint union and we compose and by letting and generate an equivalence relation on and then restricting this to
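Here is a rough Python sketch of this composition rule. I am reading the description above as follows, which is an assumption on my part: a morphism from a finite set $X$ to a finite set $Y$ is an equivalence relation on the disjoint union of $X$ and $Y$, and two morphisms are composed by generating an equivalence relation on the disjoint union of all three sets and then restricting to the outer two. Equivalence relations are stored as sets of blocks, elements are tagged tuples so the unions are disjoint, and the function names are hypothetical.

```python
def partition(elements, pairs):
    """Blocks of the equivalence relation on `elements` generated by `pairs`."""
    parent = {e: e for e in elements}

    def find(e):
        # Follow parent pointers to the representative, compressing the path.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for a, b in pairs:
        parent[find(a)] = find(b)

    blocks = {}
    for e in elements:
        blocks.setdefault(find(e), set()).add(e)
    return {frozenset(b) for b in blocks.values()}

def compose(f, g, X, Y, Z):
    """Compose f (blocks on X+Y) with g (blocks on Y+Z) to get blocks on X+Z."""
    pairs = []
    for block in list(f) + list(g):
        members = sorted(block)
        pairs += [(members[0], m) for m in members[1:]]
    whole = partition(X + Y + Z, pairs)   # relation generated on X+Y+Z
    keep = set(X) | set(Z)
    # Restrict to X+Z, dropping blocks that only contained elements of Y.
    return {frozenset(b & keep) for b in whole if b & keep}

X = [("X", 1), ("X", 2)]
Y = [("Y", 1)]
Z = [("Z", 1)]
f = {frozenset({("X", 1), ("Y", 1)}), frozenset({("X", 2)})}
g = {frozenset({("Y", 1), ("Z", 1)})}
print(compose(f, g, X, Y, Z))
```

In the example, the first element of $X$ is glued to the element of $Z$ through the shared element of $Y$, while the second element of $X$ stays on its own.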
As if this were not enough, the light operations share many properties with the dark ones. In particular, these operations make into a commutative extra-special †-Frobenius monoid in a second way. In summary:
• is a bicommutative bimonoid;
• is a bicommutative bimonoid;
• is a commutative extra-special †-Frobenius monoid;
• is a commutative extra-special †-Frobenius monoid.
It should be no surprise that with all these structures built in, signal-flow diagrams are a powerful method of designing processes.
However, it is surprising that most of these structures are present in a seemingly very different context: the so-called ZX calculus, a diagrammatic formalism for working with complementary observables in quantum theory. This arises naturally when one has an $n$-dimensional Hilbert space $H$ with two orthonormal bases that are mutually unbiased, meaning that
for all pairs of basis vectors, one from each basis. Each orthonormal basis makes $H$ into a commutative special †-Frobenius monoid in the category of finite-dimensional Hilbert spaces. Moreover, the multiplication and unit of either one of these Frobenius monoids fits together with the comultiplication and counit of the other to form a bicommutative bimonoid. So, we have all the structure present in the list above, except that these Frobenius monoids are only extra-special if $H$ is 1-dimensional.
The field is also a 1-dimensional vector space, but this is a red herring: in every finite-dimensional vector space naturally acquires all four structures listed above, since addition, zero, duplication and deletion are well-defined and obey all the relations we have discussed. Jason and I focus on in our paper simply because it generates all the objects via direct sum.
Finally, in the cap and cup are related to the light and dark operations as follows:
Note the curious factor of in the second equation, which breaks some of the symmetry we have seen so far. This equation says that two elements sum to zero if and only if Using the zigzag relations, the two equations above give
We thus see that in both additive and multiplicative inverses can be expressed in terms of the generating morphisms used in signal-flow diagrams.
Theorem 4 of our paper gives a presentation of based on the ideas just discussed. Briefly, it says that is equivalent to the symmetric monoidal category generated by an object and these morphisms:
• addition
• zero
• duplication
• deletion
• scalar multiplication for any
• cup
• cap
obeying these relations:
(1) is a bicommutative bimonoid;
(2) and obey the zigzag equations;
(3) is a commutative extra-special †-Frobenius monoid;
(4) is a commutative extra-special †-Frobenius monoid;
(5) the field operations of can be recovered from the generating morphisms;
(6) the generating morphisms (1)-(4) commute with scalar multiplication.
Note that item (2) makes into a †-compact category, allowing us to mention the adjoints of generating morphisms in the subsequent relations. Item (5) means that and also additive and multiplicative inverses in the field can be expressed in terms of signal-flow diagrams in the manner we have explained.
So, we have a good categorical understanding of the linear algebra used in signal flow diagrams!
Now Jason is moving ahead to apply this to some interesting problems… but that’s another story, for later.
• Kinetic networks: from topology to design, Santa Fe Institute, 17–19 September, 2015. Organized by Yoav Kallus, Pablo Damasceno, and Sidney Redner.
Proteins, self-assembled materials, virus capsids, and self-replicating biomolecules go through a variety of states on the way to or in the process of serving their function. The network of possible states and possible transitions between states plays a central role in determining whether they do so reliably. The goal of this workshop is to bring together researchers who study the kinetic networks of a variety of self-assembling, self-replicating, and programmable systems to exchange ideas about, methods for, and insights into the construction of kinetic networks from first principles or simulation data, the analysis of behavior resulting from kinetic network structure, and the algorithmic or heuristic design of kinetic networks with desirable properties.
In Part 1 and Part 2, we learnt about ordered commutative monoids and how they formalize theories of resource convertibility and combinability. In this post, I would like to say a bit about the applications that have been explored so far. First, the study of resource theories has become a popular subject in quantum information theory, and many of the ideas in my paper actually originate there. I’ll list some references at the end. So I hope that the toolbox of ordered commutative monoids will turn out to be useful for this. But here I would like to talk about an example application that is much easier to understand, but no less difficult to analyze: graph theory and the resource theory of zero-error communication.
A graph consists of a bunch of nodes connected by a bunch of edges, for example like this:
This particular graph is the pentagon graph or 5-cycle. To give it some resource-theoretic interpretation, think of it as the distinguishability graph of a communication channel, where the nodes are the symbols that can be sent across the channel, and two symbols share an edge if and only if they can be unambiguously decoded. For example, the pentagon graph roughly corresponds to the distinguishability graph of my handwriting, when restricted to five letters only:
So my ‘w’ is distinguishable from my ‘u’, but it may be confused for my ‘m’. In order to communicate unambiguously, it looks like I should restrict myself to using only two of those letters in writing, since any third of them may be mistaken for one of the other two. But alternatively, I could use a block code to create context around each letter which allows for perfect disambiguation. This is what happens in practice: I write in natural language, where an entire word is usually not ambiguous.
One can now also consider graph homomorphisms, which are maps like this:
The numbers on the nodes indicate where each node on the left gets mapped to. Formally, a graph homomorphism is a function taking nodes to nodes such that adjacent nodes get mapped to adjacent nodes. If a homomorphism exists between graphs and then we also write ; in terms of communication channels, we can interpret this as saying that simulates since the homomorphism provides a map between the symbols which preserves distinguishability. A ‘code’ for a communication channel is then just a homomorphism from the complete graph in which all nodes share an edge to the graph which describes the channel. With this ordering structure, the collection of all finite graphs forms an ordered set. This ordered set has an intricate structure which is intimately related to some big open problems in graph theory.
We can also combine two communication channels to form a compound one. Going back to the handwriting example, we can consider the new channel in which the symbols are pairs of letters. Two such pairs are distinguishable if and only if either the first letters of each pair are distinguishable or the second letters are.
When generalized to arbitrary graphs, this yields the definition of disjunctive product of graphs. It is not hard to show that this equips the ordered set of graphs with a binary operation compatible with the ordering, so that we obtain an ordered commutative monoid denoted Grph. It mathematically formalizes the resource theory of zero-error communication.
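The benefit of a block code can be seen in a few lines of Python. This is only a sketch under my own conventions: the pentagon has nodes 0 to 4 with an edge between $i$ and $i+1$ modulo 5, graphs are stored as a list of nodes plus a set of two-element `frozenset` edges, and all function names are hypothetical. The five codewords $(i, 2i \bmod 5)$ are the classic construction going back to Shannon.

```python
from itertools import combinations, product

def cycle_graph(n):
    """Distinguishability graph with nodes 0..n-1 arranged in a cycle."""
    nodes = list(range(n))
    edges = {frozenset((i, (i + 1) % n)) for i in range(n)}
    return nodes, edges

def disjunctive_product(g, h):
    """Pairs are adjacent if the first OR the second components are adjacent."""
    (g_nodes, g_edges), (h_nodes, h_edges) = g, h
    nodes = list(product(g_nodes, h_nodes))
    edges = set()
    for (a, b), (c, d) in combinations(nodes, 2):
        if frozenset((a, c)) in g_edges or frozenset((b, d)) in h_edges:
            edges.add(frozenset(((a, b), (c, d))))
    return nodes, edges

def is_code(words, graph):
    """A code is a set of symbols that are pairwise distinguishable."""
    _, edges = graph
    return all(frozenset((u, v)) in edges for u, v in combinations(words, 2))

def max_code_size(graph):
    """Brute-force search for the largest code; fine for tiny graphs only."""
    nodes, _ = graph
    best = 0
    for k in range(1, len(nodes) + 1):
        if any(is_code(c, graph) for c in combinations(nodes, k)):
            best = k
        else:
            break  # subsets of codes are codes, so no larger code exists
    return best

pentagon = cycle_graph(5)
pairs_of_letters = disjunctive_product(pentagon, pentagon)
block_code = [(i, (2 * i) % 5) for i in range(5)]

print(max_code_size(pentagon))              # best code using single letters
print(is_code(block_code, pairs_of_letters))  # is the five-word block code valid?
```

With single letters the best code has two symbols, while with pairs of letters the five words of `block_code` are pairwise distinguishable. That is more than the four words one gets by simply using a two-letter code twice.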
Using the toolbox of ordered commutative monoids combined with some concrete computations on graphs, one can show that Grph is not cancellative: if is the complete graph on 11 nodes, then but there exists a graph such that
The graph turns out to have 136 nodes. This result seems to be new. But if you happen to have seen something like this before, please let me know!
Last time, we also talked about rates of conversion. In Grph, it turns out that some of these correspond to famous graph invariants! For example, the rate of conversion from a graph $G$ to the single-edge graph $K_2$ is the Shannon capacity $\Theta(\overline{G})$, where $\overline{G}$ is the complement graph. This is no surprise since $\Theta$ was originally defined by Shannon with precisely this rate in mind, although he did not use the language of ordered commutative monoids. In any case, the Shannon capacity is a graph invariant notorious for its complexity: it is not known whether there exists an algorithm to compute it! But an application of the Rate Theorem from Part 2 gives us a formula for the Shannon capacity:
where $f$ ranges over all graph invariants which are monotone under graph homomorphisms, multiplicative under disjunctive product, and suitably normalized. Unfortunately, this formula still does not produce an algorithm for computing the Shannon capacity. But it nonconstructively proves the existence of many new graph invariants which approximate the Shannon capacity from above.
Although my story ends here, I also feel that the whole project has barely started. There are lots of directions to explore! For example, it would be great to fit Shannon’s noisy channel coding theorem into this framework, but this has turned out to be technically challenging. If you happen to be familiar with rate-distortion theory and you want to help out, please get in touch!
Here is a haphazard selection of references on resource theories in quantum information theory and related fields:
• Igor Devetak, Aram Harrow and Andreas Winter, A resource framework for quantum Shannon theory.
• Gilad Gour, Markus P. Müller, Varun Narasimhachar, Robert W. Spekkens and Nicole Yunger Halpern, The resource theory of informational nonequilibrium in thermodynamics.
• Fernando G.S.L. Brandão, Michał Horodecki, Nelly Huei Ying Ng, Jonathan Oppenheim and Stephanie Wehner, The second laws of quantum thermodynamics.
• Iman Marvian and Robert W. Spekkens, The theory of manipulations of pure state asymmetry: basic tools and equivalence classes of states under symmetric operations.
• Elliott H. Lieb and Jakob Yngvason, The physics and mathematics of the second law of thermodynamics.
In Part 1, I introduced ordered commutative monoids as a mathematical formalization of resources and their convertibility. Today I’m going to say something about what to do with this formalization. Let’s start with a quick recap!
Definition: An ordered commutative monoid is a set $A$ equipped with a binary relation $\geq$, a binary operation $+$, and a distinguished element $0$ such that the following hold:
• $+$ and $0$ equip $A$ with the structure of a commutative monoid;
• $\geq$ equips $A$ with the structure of a partially ordered set;
• addition is monotone: if $x \geq y$, then also $x + z \geq y + z$.
Recall also that we think of the $x \in A$ as resource objects such that $x + y$ represents the object consisting of $x$ and $y$ together, and $x \geq y$ means that the resource object $x$ can be converted into $y$.
When confronted with an abstract definition like this, many people ask: so what is it useful for? The answer to this is twofold: first, it provides a language which we can use to guide our thoughts in any application context. Second, the definition itself is just the very start: we can now also prove theorems about ordered commutative monoids, which can be instantiated in any particular application context. So the theory of ordered commutative monoids will provide a useful toolbox for talking about concrete resource theories and studying them. In the remainder of this post, I’d like to say a bit about what this toolbox contains. For more, you’ll have to read the paper!
To start, let’s consider catalysis as one of the resource-theoretic phenomena neatly captured by ordered commutative monoids. Catalysis is the phenomenon that certain conversions become possible only due to the presence of a catalyst, which is an additional resource object which does not get consumed in the process of the conversion. For example, we have
because making a table from timber and nails requires a saw and a hammer as tools. So in this example, ‘saw + hammer’ is a catalyst for the conversion of ‘timber + nails’ into ‘table’. In mathematical language, catalysis occurs precisely when the ordered commutative monoid is not cancellative, which means that $x + z \geq y + z$ sometimes holds even though $x \geq y$ does not. So, the notion of catalysis perfectly matches up with a very natural and familiar notion from algebra.
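The carpentry example can be played with in a few lines of Python. This is a toy sketch, not anything from the paper: resource objects are bags of items (`collections.Counter`), the single conversion rule in `RULES` is a hypothetical one matching the table example, and the function `convertible` decides convertibility by exhaustively applying rules. The search terminates here because the rule shrinks the bag; rules that can grow a bag without bound would need a cutoff.

```python
from collections import Counter

# One hypothetical conversion rule. The tools appear on both sides,
# so they are needed for the conversion but not consumed by it.
RULES = [
    (Counter(timber=1, nails=1, saw=1, hammer=1),
     Counter(table=1, saw=1, hammer=1)),
]

def freeze(bag):
    """Hashable snapshot of a bag, ignoring items with count zero."""
    return frozenset((item, n) for item, n in bag.items() if n > 0)

def convertible(x, y, rules=RULES):
    """Return True if bag x can be turned into bag y by applying rules."""
    target = freeze(y)
    seen = {freeze(x)}
    frontier = [Counter(x)]
    while frontier:
        state = frontier.pop()
        if freeze(state) == target:
            return True
        for lhs, rhs in rules:
            # A rule applies when the state contains its whole left-hand side.
            if all(state[item] >= n for item, n in lhs.items()):
                nxt = state - lhs + rhs
                key = freeze(nxt)
                if key not in seen:
                    seen.add(key)
                    frontier.append(nxt)
    return False

materials = Counter(timber=1, nails=1)
table = Counter(table=1)
tools = Counter(saw=1, hammer=1)

print(convertible(materials, table))                  # without the tools
print(convertible(materials + tools, table + tools))  # with the tools as catalyst
```

The first query fails and the second succeeds, which is exactly a failure of cancellativity: adding the same bag of tools to both sides turns an impossible conversion into a possible one.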
One can continue along these lines and study those ordered commutative monoids which are cancellative. It turns out that every ordered commutative monoid can be made cancellative in a universal way; in the resource-theoretic interpretation, this boils down to replacing the convertibility relation by catalytic convertibility, in which $x$ is declared to be convertible into $y$ as soon as there exists a catalyst which achieves this conversion. Making an ordered commutative monoid cancellative like this is a kind of ‘regularization’: it leads to a mathematically more well-behaved structure. As it turns out, there are several additional steps of regularization that can be performed, and all of these are both mathematically natural and have an appealing resource-theoretic interpretation. These regularizations successively take us from the world of ordered commutative monoids to the realm of linear algebra and functional analysis, where powerful theorems are available. For now, let me not go into the details, but only try to summarize one of the consequences of this development. This requires a bit of preparation.
In many situations, it is not just of interest to convert a single copy of some resource object $x$ into a single copy of some $y$; instead, one may be interested in converting many copies of $x$ into many copies of $y$ all together, and thereby maximizing (or minimizing) the ratio of the resulting number of $y$’s compared to the number of $x$’s that get consumed. This ratio is measured by the maximal rate:
$R_{\max}(x \to y) = \sup \left\{ \frac{m}{n} \;\middle|\; n x \geq m y \right\}$
Here, $m$ and $n$ are natural numbers, and $n x$ stands for the $n$-fold sum $x + \cdots + x$, and similarly for $m y$. So this maximal rate quantifies how many $y$’s we can get out of one copy of $x$ when working in a ‘mass production’ setting. There is also a notion of regularized rate, which has a slightly more complicated definition that I don’t want to spell out here, but is similar in spirit. The toolbox of ordered commutative monoids now provides the following result:
Rate Theorem: If and in an ordered commutative monoid which satisfies a mild technical assumption, then the maximal regularized rate from to can be computed like this:
where ranges over all functionals on with
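Wait a minute, what’s a ‘functional’? It’s defined to be a map $f$ from the ordered commutative monoid to the real numbers which is monotone,
$x \geq y \;\Longrightarrow\; f(x) \geq f(y)$
and additive,
$f(x + y) = f(x) + f(y)$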
In economic terms, we can think of a functional as a consistent assignment of prices to all resource objects. If $x$ is at least as useful as $y$, then the price of $x$ should be at least as high as the price of $y$; and the price of two objects together should be the sum of their individual prices. So the functional in the rate formula above ranges over all ‘markets’ on which resource objects can be ‘traded’ at consistent prices. The term ‘functional’ is supposed to hint at a relation to functional analysis. In fact, the proof of the theorem crucially relies on the Hahn–Banach Theorem.
The mild technical assumption mentioned in the Rate Theorem is that the ordered commutative monoid needs to have a generating pair. This turns out to hold in the applications that I have considered so far, and I hope that it will turn out to hold in most others as well. For the full gory details, see the paper.
So this provides some idea of what kinds of gadgets one can find in the toolbox of ordered commutative monoids. Next time, I’ll show some applications to graph theory and zero-error communication and say a bit about where this project might be going next.
Hi! I am Tobias Fritz, a mathematician at the Perimeter Institute for Theoretical Physics in Waterloo, Canada. I like to work on all sorts of mathematical structures which pop up in probability theory, information theory, and other sorts of applied math. Today I would like to tell you about my latest paper:
• The mathematical structure of theories of resource convertibility, I.
It should be of interest to Azimuth readers as it forms part of what John likes to call ‘green mathematics’. So let’s get started!
Resources and their management are an essential part of our everyday life. We deal with the management of time or money pretty much every day. We also consume natural resources in order to afford food and amenities for (some of) the 7 billion people on our planet. Many of the objects that we deal with in science and engineering can be considered as resources. For example, a communication channel is a resource for sending information from one party to another. But for now, let’s stick with a toy example: timber and nails constitute a resource for making a table. In mathematical notation, this looks like so:
$\mathrm{timber} + \mathrm{nails} \geq \mathrm{table}$
We interpret this inequality as saying that “given timber and nails, we can make a table”. I like to write it as an inequality like this, which I think of as stating that having timber and nails is at least as good as having a table, because the timber and nails can always be turned into a table whenever one needs a table.
To be more precise, we should also take into account that making the table requires some tools. These tools do not get consumed in the process, so we also get them back out:
$\mathrm{timber} + \mathrm{nails} + \mathrm{saw} + \mathrm{hammer} \geq \mathrm{table} + \mathrm{saw} + \mathrm{hammer}$
Notice that this kind of equation is analogous to a chemical reaction equation like this:
So given two hydrogen atoms and an oxygen atom, we can let them react so as to form a molecule of water. In chemistry, this kind of equation would usually be written with an arrow ‘→’ instead of an ordering symbol ‘≥’, but here we interpret the equation slightly differently. As with the timber and nails above, the inequality says that if we have two hydrogen atoms and an oxygen atom, then we can let them react to form a molecule of water, but we don’t have to. In this sense, having two hydrogen atoms and an oxygen atom is at least as good as having a molecule of water.
So what’s going on here, mathematically? In all of the above equations, we have a bunch of stuff on each side and an inequality ‘≥’ in between. The stuff on each side consists of a bunch of objects tacked together via ‘+’. With respect to these two pieces of structure, the collection of all our resource objects forms an ordered commutative monoid:
Definition: An ordered commutative monoid is a set $A$ equipped with a binary relation $\geq$, a binary operation $+$, and a distinguished element $0$ such that the following hold:
• $+$ and $0$ equip $A$ with the structure of a commutative monoid;
• $\geq$ equips $A$ with the structure of a partially ordered set;
• addition is monotone: if $x \geq y$, then also $x + z \geq y + z$.
Here, the third axiom is the most important, since it tells us how the additive structure interacts with the ordering structure.
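For a finite example, the three axioms can be checked by brute force. The following Python sketch is only an illustration with made-up names: `is_ordered_commutative_monoid` tests the axioms on a finite set, and the toy example consists of stock levels of two goods, each capped at 2, with capped addition and the componentwise ordering.

```python
from itertools import product

def is_ordered_commutative_monoid(elements, add, geq, zero):
    """Brute-force check of the three axioms on a finite set."""
    elems = list(elements)
    pairs = list(product(elems, repeat=2))
    triples = list(product(elems, repeat=3))

    # Axiom 1: addition and zero form a commutative monoid.
    monoid = (
        all(add(x, y) in elems for x, y in pairs)
        and all(add(x, zero) == x for x in elems)
        and all(add(x, y) == add(y, x) for x, y in pairs)
        and all(add(add(x, y), z) == add(x, add(y, z)) for x, y, z in triples)
    )
    # Axiom 2: the relation is reflexive, antisymmetric and transitive.
    poset = (
        all(geq(x, x) for x in elems)
        and all(x == y for x, y in pairs if geq(x, y) and geq(y, x))
        and all(geq(x, z) for x, y, z in triples if geq(x, y) and geq(y, z))
    )
    # Axiom 3: addition is monotone.
    monotone = all(
        geq(add(x, z), add(y, z)) for x, y, z in triples if geq(x, y)
    )
    return monoid and poset and monotone

CAP = 2  # hypothetical storage limit per good

def add(x, y):
    """Combine two stocks, discarding whatever exceeds the cap."""
    return tuple(min(a + b, CAP) for a, b in zip(x, y))

def geq(x, y):
    """x can be converted into y by throwing things away."""
    return all(a >= b for a, b in zip(x, y))

stocks = list(product(range(CAP + 1), repeat=2))
print(is_ordered_commutative_monoid(stocks, add, geq, (0, 0)))
```

If the ordering is replaced by one that only compares the first good, the check fails, because two different stocks would then be mutually convertible.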
Ordered commutative monoids are the mathematical formalization of resource convertibility and combinability as follows. The elements $x, y \in A$ are the resource objects, corresponding to the ‘collections of stuff’ in our earlier examples, such as the timber and nails or the reactants of the chemical reaction. Then the addition operation $+$ simply joins up collections of stuff into bigger collections of stuff. The ordering relation $\geq$ is what formalizes resource convertibility, as in the examples above. The third axiom states that if we can convert $x$ into $y$, then we can also convert $x$ together with $z$ into $y$ together with $z$ for any $z$, for example by doing nothing to $z$.
A mathematically minded reader might object that requiring $A$ to form a partially ordered set under $\geq$ is too strong a requirement, since it requires two resource objects to be equal as soon as they are mutually interconvertible: $x \geq y$ and $y \geq x$ implies $x = y$. However, I think that this is not an essential restriction, because we can regard this implication as the definition of equality: ‘$x = y$’ is just a shorthand notation for ‘$x \geq y$ and $y \geq x$’, which formalizes the perfect interconvertibility of resource objects.
We could now go back to the original examples and try to model carpentry and chemistry in terms of ordered commutative monoids. But as a mathematician, I needed to start out with something mathematically precise and rigorous as a testing ground for the formalism. This helps ensure that the mathematics is sensible and useful before diving into real-world applications. So, the main example in my paper is the ordered commutative monoid of graphs, which has a resource-theoretic interpretation in terms of zero-error information theory. As graph theory is a difficult and traditional subject, this application constitutes the perfect training camp for the mathematics of ordered commutative monoids. I will get to this in Part 3.
In Part 2, I will say something about what one can do with ordered commutative monoids. In the meantime, I’d be curious to know what you think about what I’ve said so far!
• Resource convertibility: part 2.
To watch the workshop live, go here. Go down to where it says
Investigative Workshop: Information and Entropy in Biological Systems
Then click where it says live link. There’s nothing there now, but I’m hoping there will be when the show starts!
Below you can see the schedule of talks and a list of participants. The hours are in Eastern Daylight Time: add 4 hours to get Greenwich Mean Time. The talks start at 10 am EDT, which is 2 pm GMT.
There will be 1½ hours of talks in the morning and 1½ hours in the afternoon for each of the 3 days, Wednesday April 8th to Friday April 10th. The rest of the time will be for discussions on different topics. We’ll break up into groups, based on what people want to discuss.
Each invited speaker will give a 30-minute talk summarizing the key ideas in some area, not their latest research so much as what everyone should know to start interesting conversations. After that, 15 minutes for questions and/or coffee.
Here’s the schedule. You can already see slides or other material for the talks with links!
Wednesday April 8
• 9:45-10:00 — the usual introductory fussing around.
• 10:00-10:30 — John Baez, Information and entropy in biological systems.
• 10:30-11:00 — questions, coffee.
• 11:00-11:30 — Chris Lee, Empirical information, potential information and disinformation.
• 11:30-11:45 — questions.
• 11:45-1:30 — lunch, conversations.
• 1:30-2:00 — John Harte, Maximum entropy as a foundation for theory building in ecology.
• 2:00-2:15 — questions, coffee.
• 2:15-2:45 — Annette Ostling, The neutral theory of biodiversity and other competitors to the principle of maximum entropy.
• 2:45-3:00 — questions, coffee.
• 3:00-5:30 — break up into groups for discussions.
• 5:30 — reception.
Thursday April 9
• 10:00-10:30 — David Wolpert, The Landauer limit and thermodynamics of biological organisms.
• 10:30-11:00 — questions, coffee.
• 11:00-11:30 — Susanne Still, Efficient computation and data modeling.
• 11:30-11:45 — questions.
• 11:45-1:30 — group photo, lunch, conversations.
• 1:30-2:00 — Matina Donaldson-Matasci, The fitness value of information in an uncertain environment.
• 2:00-2:15 — questions, coffee.
• 2:15-2:45 — Roderick Dewar, Maximum entropy and maximum entropy production in biological systems: survival of the likeliest?
• 2:45-3:00 — questions, coffee.
• 3:00-6:00 — break up into groups for discussions.
Friday April 10
• 10:00-10:30 — Marc Harper, Information transport and evolutionary dynamics.
• 10:30-11:00 — questions, coffee.
• 11:00-11:30 — Tobias Fritz, Characterizations of Shannon and Rényi entropy.
• 11:30-11:45 — questions.
• 11:45-1:30 — lunch, conversations.
• 1:30-2:00 — Christina Cobbold, Biodiversity measures and the role of species similarity.
• 2:00-2:15 — questions, coffee.
• 2:15-2:45 — Tom Leinster, Maximizing biological diversity.
• 2:45-3:00 — questions, coffee.
• 3:00-6:00 — break up into groups for discussions.
Here are the confirmed participants. This list may change a little bit:
• John Baez – mathematical physicist.
• Romain Brasselet – postdoc in cognitive neuroscience knowledgeable about information-theoretic methods and methods of estimating entropy from samples of probability distributions.
• Katharina Brinck – grad student at Centre for Complexity Science at Imperial College; did masters at John Harte’s lab, where she extended his Maximum Entropy Theory of Ecology (METE) to trophic food webs, to study how entropy maximization on the macro scale together with MEP on the scale of individuals drive the structural development of model ecosystems.
• Christina Cobbold – mathematical biologist, has studied the role of species similarity in measuring biodiversity.
• Troy Day – mathematical biologist, works with population dynamics, host-parasite dynamics, etc.; influential and could help move population dynamics to a more information-theoretic foundation.
• Roderick Dewar – physicist who studies the principle of maximal entropy production.
• Barrett Deris – MIT postdoc studying the factors that influence evolvability of drug resistance in bacteria.
• Charlotte de Vries – a biology master’s student who studied particle physics to the master’s level at Oxford and the Perimeter Institute. Interested in information theory.
• Matina Donaldson-Matasci – a biologist who studies information, uncertainty and collective behavior.
• Chris Ellison – a postdoc who worked with James Crutchfield on “information-theoretic measures of structure and memory in stationary, stochastic systems – primarily, finite state hidden Markov models”. He coauthored “Intersection Information based on Common Randomness”, http://arxiv.org/abs/1310.1538. The idea: “The introduction of the partial information decomposition generated a flurry of proposals for defining an intersection information that quantifies how much of “the same information” two or more random variables specify about a target random variable. As of yet, none is wholly satisfactory.” Works on mutual information between organisms and environment (along with David Krakauer and Jessica Flack), and also entropy rates.
• Cameron Freer – MIT postdoc in Brain and Cognitive Sciences working on maximum entropy production principles, algorithmic entropy etc.
• Tobias Fritz – a physicist who has worked on characterizations of Shannon and Rényi entropy and on “resource theories”.
• Dashiell Fryer – works with Marc Harper on information geometry and evolutionary game theory.
• Michael Gilchrist – an evolutionary biologist studying how errors and costs of protein translation affect the codon usage observed within a genome. Works at NIMBioS.
• Manoj Gopalkrishnan – an expert on chemical reaction networks who understands entropy-like Lyapunov functions for these systems.
• Marc Harper – works on evolutionary game theory using ideas from information theory, information geometry, etc.
• John Harte – an ecologist who uses the maximum entropy method to predict the structure of ecosystems.
• Ellen Hines – studies habitat modeling and mapping for marine endangered species and ecosystems, sea level change scenarios, documenting of human use and values. Her lab has used MaxEnt methods.
• Elizabeth Hobson – behavior ecology postdoc developing methods to quantify social complexity in animals. Works at NIMBioS.
• John Jungk – works on graph theory and biology.
• Chris Lee – in bioinformatics and genomics; applies information theory to experiment design and evolutionary biology.
• Maria Leites – works on dynamics, bifurcations and applications of coupled systems of non-linear ordinary differential equations with applications to ecology, epidemiology, and transcriptional regulatory networks. Interested in information theory.
• Tom Leinster – a mathematician who applies category theory to study various concepts of ‘magnitude’, including biodiversity and entropy.
• Timothy Lezon – a systems biologist in the Drug Discovery Institute at Pitt, who has used entropy to characterize phenotypic heterogeneity in populations of cultured cells.
• Maria Ortiz Mancera – statistician working at CONABIO, the National Commission for Knowledge and Use of Biodiversity, in Mexico.
• Yajun Mei – statistician who uses Kullback-Leibler divergence and studies how to efficiently compute entropy for two-state hidden Markov models.
• Robert Molzon – mathematical economist who has studied deterministic approximation of stochastic evolutionary dynamics.
• David Murrugarra – works on discrete models in mathematical biology; interested in learning about information theory.
• Annette Ostling – studies community ecology, focusing on the influence of interspecific competition on community structure, and what insights patterns of community structure might provide about the mechanisms by which competing species coexist.
• Connie Phong – grad student at Chicago’s Institute for Genomics and Systems Biology, working on how “certain biochemical network motifs are more attuned than others at maintaining strong input to output relationships under fluctuating conditions.”
• Petr Plechak – works on information-theoretic tools for estimating and minimizing errors in coarse-graining stochastic systems. Wrote “Information-theoretic tools for parametrized coarse-graining of non-equilibrium extended systems”.
• Blake Pollard – physics grad student working with John Baez on various generalizations of Shannon and Rényi entropy, and how these entropies change with time in Markov processes and open Markov processes.
• Timothee Poisot – works on species interaction networks; developed a “new suite of tools for probabilistic interaction networks”.
• Richard Reeve – works on biodiversity studies and the spread of antibiotic resistance. Ran a program on entropy-based biodiversity measures at a mathematics institute in Barcelona.
• Rob Shaw – works on entropy and information in biotic and pre-biotic systems.
• Matteo Smerlak – postdoc working on nonequilibrium thermodynamics and its applications to biology, especially population biology and cell replication.
• Susanne Still – a computer scientist who studies the role of thermodynamics and information theory in prediction.
• Alexander Wissner-Gross – Institute Fellow at the Harvard University Institute for Applied Computational Science and Research Affiliate at the MIT Media Laboratory, interested in lots of things.
• David Wolpert – works at the Santa Fe Institute on i) information theory and game theory, ii) the second law of thermodynamics and dynamics of complexity, iii) multi-information source optimization, iv) the mathematical underpinnings of reality, v) evolution of organizations.
• Matthew Zefferman – works on evolutionary game theory, institutional economics and models of gene-culture co-evolution. No work on information, but a postdoc at NIMBioS.
Jacob Biamonte got a grant from the Foundational Questions Institute to run a small meeting on network theory:
• The categorical foundations of network theory.
It’s being held 25-28 May 2015 in Turin, Italy, at the ISI Foundation. We’ll make slides and/or videos available, but the main goal is to bring a few people together, exchange ideas, and push the subject forward.
Network theory is a diverse subject which developed independently in several disciplines. It uses graphs with additional structure to model everything from complex systems to theories of fundamental physics.
This event aims to further our understanding of the mathematical theory underlying the relations between seemingly different networked systems. It’s part of the Azimuth network theory project.
With the exception of the first day (Monday May 25th) we will kick things off with a morning talk, with plenty of time for questions and interaction. We will then break for lunch at 1:00 p.m. and return for an afternoon work session. People are encouraged to give informal talks and to present their ideas in the afternoon sessions.
• Jacob Biamonte: opening remarks.
For Jacob’s work on quantum networks visit www.thequantumnetwork.org.
• John Baez: network theory.
For my stuff see the Azimuth Project network theory page.
• David Spivak: operadic network design.
Operads are a formalism for sticking small networks together to form bigger ones. David has a 3-part series of articles sketching his ideas on networks.
• Eugene Lerman: continuous time open systems and monoidal double categories.
Eugene is especially interested in classical mechanics and networked dynamical systems, and he wrote an introductory article about them here on the Azimuth blog.
• Tobias Fritz: ordered commutative monoids and theories of resource convertibility.
Tobias has a new paper on this subject, and a 3-part expository series here on the Azimuth blog!
ISI Foundation
Via Alassio 11/c
10126 Torino — Italy
Phone: +39 011 6603090
Email: isi@isi.it
Theory group details: www.TheQuantumNetwork.org
• Part 2: Creating a knowledge network.
Remember where we were. Ologs, linguistically-enhanced sketches, just weren’t doing justice to the idea that each step in a recipe is itself a recipe. But the idea seemed ripe for mathematical formulation.
Thus, I returned to a question I’d wondered about in the very beginning: how is macro-understanding built from micro-understanding? How can multiple individual humans come together, like cells in a multicellular organism, to make a whole that is itself a surviving decision-maker?
There were, and continue to be, a lot of “open-to-Spivak” questions one can ask: How are stories about events built from sub-stories about sub-events? How is macro-economics built from micro-economics? Are large-scale phenomena always based on, and relatable to, interactions between smaller-scale phenomena? For example, I still want to understand, in very basic terms, how classical (large-scale) phenomena are a manifestation of quantum phenomena.
Neuroscience professor Michael Gazzaniga has a similar question: How does cognition arise from the interaction of tiny event-noticers, and how does society emerge and affect individual brains? As he puts it in the last paragraph of his book Who’s In Charge, we are in need of a language by which to understand the interfaces of “our layered hierarchical existence”, because doing so “holds the answer to our quest for understanding mind/brain relationships.” He goes on:
Understanding how to develop a vocabulary for those layered interactions, for me, constitutes the scientific problem of this century.
I tend to be infatuated with this same kind of idea: cognition emerging from interactions between sub-cognitive pieces. This is what got me interested in what I now call “operadic modularity”. Luckily again, my Office of Naval Research hero (now at the Air Force Office of Scientific Research) granted me a chance to study it.
The idea is this: modularity is about arranging many modules into a single whole, which is another module, usable as part of a larger system. Each system of modules is a module of its own, and we see the nesting phenomenon. Operads can model the language of nestable interface arrangements, and their algebras can model how interfaces are filled in with the required sorts of modules.
Here, by operad, I mean symmetric colored operad, or symmetric multicategory. Operads are like categories—they have objects, morphisms, identities, and a unital and associative composition formula—the difference is that the domain of a morphism is a finite set of objects (in an operad) rather than a single object (as in a category). So morphisms in an operad are written like $f\colon x_1, \ldots, x_n \to y$; we call such a morphism $n$-ary.
An early example, formulated operadically by Peter May (the inventor of operads), is called the little 2-cubes operad, denoted $E_2$. It has only one object, say a square ⬜, and an $n$-ary morphism
⬜, …, ⬜ → ⬜
is any arrangement of $n$ non-overlapping squares in a larger square. These arrangements clearly display a nesting property.
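Another source of examples comes from the fact that every monoidal category $(C, \otimes)$ has an underlying operad $\mathcal{O}_C$ with
$$\mathrm{Hom}_{\mathcal{O}_C}(x_1, \ldots, x_n; y) := \mathrm{Hom}_C(x_1 \otimes \cdots \otimes x_n, y).$$
(Either $C$ was symmetric monoidal to begin with or you can add in symmetries, roughly by multiplying each hom-set by the symmetric group $S_n$.) The operad underlying the cartesian monoidal category of sets is an example I’ll use later.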
If you want to think about operads as modeling modularity—building one thing out of many—the first trick is to imagine the codomain object as the exterior and all the domain objects as sitting inside it, on the interior. May’s little 2-cubes operad gives the idea: squares in a square. From now on, if I speak of many little objects arranged inside one big object, I always mean it this way: the interior objects constitute the domain, the exterior object is the codomain, and the arrangement itself is the morphism. These arrangements can be nested inside one another, corresponding to composition in the operad.
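To see the nesting concretely, here is a minimal Python sketch of squares-in-a-square. It is only an illustration under stated assumptions: squares are axis-aligned, coordinates are exact fractions, and the names `Square`, `is_valid`, `place` and `compose` are hypothetical helpers invented for this sketch. An arrangement is a tuple of little squares in the unit square, and composition substitutes one arrangement into the i-th little square of another.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Tuple


@dataclass(frozen=True)
class Square:
    """Axis-aligned square: lower-left corner (x, y) and side length."""
    x: Fraction
    y: Fraction
    side: Fraction


# An n-ary morphism: n little squares sitting inside the unit square.
Arrangement = Tuple[Square, ...]


def overlaps(a: Square, b: Square) -> bool:
    """True if the interiors of a and b intersect (touching edges is fine)."""
    return (a.x < b.x + b.side and b.x < a.x + a.side and
            a.y < b.y + b.side and b.y < a.y + a.side)


def is_valid(arr: Arrangement) -> bool:
    """Check that every square lies in the unit square and no two overlap."""
    inside = all(s.side > 0 and s.x >= 0 and s.y >= 0 and
                 s.x + s.side <= 1 and s.y + s.side <= 1 for s in arr)
    disjoint = all(not overlaps(a, b)
                   for i, a in enumerate(arr) for b in arr[i + 1:])
    return inside and disjoint


def place(outer: Square, inner: Square) -> Square:
    """Rescale `inner` (in unit-square coordinates) so it sits inside `outer`."""
    return Square(outer.x + outer.side * inner.x,
                  outer.y + outer.side * inner.y,
                  outer.side * inner.side)


def compose(outer: Arrangement, i: int, inner: Arrangement) -> Arrangement:
    """Operadic composition: nest `inner` into the i-th square of `outer`."""
    nested = tuple(place(outer[i], s) for s in inner)
    return outer[:i] + nested + outer[i + 1:]


half, zero = Fraction(1, 2), Fraction(0)
# A 2-ary morphism: two squares side by side along the bottom edge.
two_squares = (Square(zero, zero, half), Square(half, zero, half))
# A 4-ary morphism: the unit square cut into four quarters.
quad = (Square(zero, zero, half), Square(half, zero, half),
        Square(zero, half, half), Square(half, half, half))
# Nesting `quad` into the first square of `two_squares` gives a 5-ary morphism.
composite = compose(two_squares, 0, quad)
```

The domain of `composite` has five squares and the codomain is still the one exterior square: a 2-ary arrangement with a 4-ary arrangement nested in its first slot is a 5-ary arrangement.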
What are other types of nested phenomena, which we might be able to think about operadically? How about circles wired together in a larger circle? An object in this operad is a circle with some number of wires sticking out; let’s call it a ported-circle. A morphism from n-many ported-circles to one ported-circle is any connection pattern involving—i.e., wiring together of—the ports. This description can be interpreted in a few different ways; I usually mean the underlying operad of the monoidal category of “sets and cospans under disjoint union”, but the “spaghetti and meatballs operad” of circular planar arc diagrams is another good interpretation.
Once you have an operad $\mathcal{O}$, you have a kind of calculus for nestable arrangements. As I’ve been saying, I often think of the morphisms in an operad in terms of pictures, such as wiring diagrams or squares-in-a-square. If you say you want these pictures to “mean something”, you’re probably looking for an algebra $F\colon \mathcal{O} \to \mathbf{Set}$. This operad functor, which acts like a lax functor between monoidal categories, would tell you the set of fillers or fills that can be placed into each object in the picture.
I often think of the operad $\mathcal{O}$ as a picture language, and the $\mathcal{O}$-algebra $F$ as its intended semantics. Not only does such a set-valued functor on $\mathcal{O}$ give you a set $F(x)$ of fills for each object $x$, it would also give you a formula for producing a large-scale fill (element of $F(y)$) from any arrangement of small-scale fills (element of $F(x_1) \times \cdots \times F(x_n)$).
For example, given a pointed space $X$, you can ask for the set of based 2-spheres
$F(⬜) = \{\text{based 2-spheres in } X\}$
in it. Think of a based sphere in $X$ as a continuous map from the filled-in square to $X$ that sends the boundary of the square to the basepoint of $X$. Given $n$ spheres $s_1, \ldots, s_n$ in $F(⬜)$, an arrangement of $n$ non-overlapping squares in a square prescribes a new based sphere in $F(⬜)$. The idea is that you send all the unused space in the big exterior square to the basepoint of $X$, and follow the instructions of $s_i$ when you get to the $i$th little square inside. Thus any “2-fold loop space” gives an algebra of May’s little 2-cubes operad.
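The recipe for the new sphere can be written as a single formula. This is a sketch in notation introduced only for this purpose: $X$ is the pointed space with basepoint $*$, $s_1, \ldots, s_n\colon ⬜ \to X$ are the given based spheres, and $\phi_i\colon ⬜ \to ⬜$ is the rescale-and-translate map that carries the whole square onto the $i$th little square of the arrangement. The new sphere $s\colon ⬜ \to X$ is then

$$
s(p) =
\begin{cases}
s_i\bigl(\phi_i^{-1}(p)\bigr) & \text{if } p \text{ lies in the } i\text{th little square}, \\
* & \text{otherwise.}
\end{cases}
$$

The cases agree wherever they meet, because each $s_i$ sends the boundary of the square to $*$. So $s$ is continuous, and it sends the boundary of the exterior square to the basepoint, which makes it a based sphere again.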
So recently, I’ve been thinking a lot about operadic modularity, i.e., cases in which a thing can be built out of a bunch of simpler things. Note that not all cases of “nesting” have such a clear picture language. For example, context-free grammars are modular: you build [postal-address] out of [name-part], [street-address] and [zip-part], and you build each of these, e.g., [name-part], in any of several ways (there is an optional suffix part and the option to abbreviate your first name using an initial). The point is, you build things out of smaller parts, which are themselves built out of still smaller parts. Seeing context-free grammars as free operads is one of the things Hermida, Makkai, and Power explained in their paper on higher dimensional multigraphs.
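Here is a small Python sketch of the grammar example read operadically. It is an illustration under assumptions, not the construction from the paper mentioned above: each production rule is treated as a generating n-ary morphism, a derivation tree is a composite, and an unexpanded nonterminal plays the role of an identity. The nonterminals [initial] and [last-name], the rule names, and the helpers `node` and `open_leaves` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Rule:
    """A production rule, read as a generating morphism inputs -> output."""
    name: str
    inputs: Tuple[str, ...]
    output: str


@dataclass(frozen=True)
class Leaf:
    """An unexpanded nonterminal: the identity morphism on that color."""
    color: str


@dataclass(frozen=True)
class Node:
    """A rule applied to sub-derivations: a composite morphism."""
    rule: Rule
    children: Tuple["Tree", ...]


Tree = Union[Leaf, Node]


def output(t: Tree) -> str:
    """The codomain of the morphism represented by the tree."""
    return t.color if isinstance(t, Leaf) else t.rule.output


def node(rule: Rule, *children: Tree) -> Node:
    """Compose, refusing children whose outputs do not match the rule's inputs."""
    if tuple(output(c) for c in children) != rule.inputs:
        raise TypeError(f"children do not match the inputs of {rule.name}")
    return Node(rule, tuple(children))


def open_leaves(t: Tree) -> Tuple[str, ...]:
    """The domain of the morphism: unexpanded nonterminals, left to right."""
    if isinstance(t, Leaf):
        return (t.color,)
    return tuple(c for child in t.children for c in open_leaves(child))


# Hypothetical rules in the spirit of the postal-address example.
address = Rule("address",
               ("[name-part]", "[street-address]", "[zip-part]"),
               "[postal-address]")
initial_name = Rule("initial_name", ("[initial]", "[last-name]"), "[name-part]")

# Expand [name-part] inside the address rule; leave the other two slots open.
tree = node(address,
            node(initial_name, Leaf("[initial]"), Leaf("[last-name]")),
            Leaf("[street-address]"),
            Leaf("[zip-part]"))
```

The tree as a whole is a 4-ary morphism from [initial], [last-name], [street-address], [zip-part] to [postal-address]. Since the only equations imposed are those forced by composition, this is the sense in which the grammar generates a free operad.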
The operadic notion of modularity can also be applied to building hierarchical protein materials. Like context-free grammars, the operad of such materials doesn’t come with a nice picture language. However, it can be formalized as an operad nonetheless. That is, there is a grammar of actions that one can apply to a bunch of polypeptides, actions such as “attach”, “overlay”, “rigidMotion”, “helix”, “makeArray”. From these you can build proteins that are quite complex from a simple vocabulary of 20 amino acids. I’ve joined forces with Tristan Giesa and Ravi Jagadeesan to make such a program. The software package, called Matriarch, for “Materi-als Arch-itecture”, should be available soon as an open source Python library.
There are lots of operads whose morphisms look like string diagrams of various sorts. These operads, which generalize a set-theoretic version of May’s topological little 2-cubes, have clear picture languages. The algebras on such “visualizable” operads can model things like databases and dynamical systems. Over the past year or so, I’ve been writing a series of “worked example” papers, such as those above, in which I explain various picture languages and semantics for them.
I find that operads provide a nice language in which to discuss string diagrams of various sorts. String-diagrammatic languages exist for many different “doctrines”, such as categories, categories without identities, monoidal categories, cartesian monoidal categories, traced monoidal categories, operads, etc. For example, Dylan Rupel and I realized that traced monoidal categories are (well, if you have enough equipment and an expert like Patrick Schultz around) algebras on the operad of oriented 1-cobordisms. It seems to me that the other doctrines above are similarly associated to operads that are “nearby” Cob, e.g., sub-operads of Cob, operads under Cob, etc. Maps between these various operads should induce known adjunctions between the corresponding doctrines.
That brings us to the present day. There will be a workshop in Turin in a couple of months, and I think it’ll be a lot of fun:
• Categorical Foundations of Network Theory, May 25-28, ISI Foundation, Turin, Italy.
I’m looking forward to hearing from John Baez, Jacob Biamonte, Eugene Lerman, Tobias Fritz and others, about what they’ve been thinking about recently. I think we’ll find interesting common ground. If there’s interest, I’d be happy to talk about categorical models for how information is communicated throughout a network, and whether this gives any insight that can lead to better decision-making by the larger whole.
In 2007, I asked myself: as mathematically as possible, what can formally ground meaningful information, including both its successful communication and its role in decision-making? I believed that category theory could be useful in formalizing the type of object that we call information, and the type of relationship that we call communication.
Over the next few years, I worked on this project. I tried to understand what information is, how it is stored, and how it can be transferred between entities that think differently. Since databases store information, I wanted to understand databases category-theoretically. I eventually decided that databases are basically just categories $\mathcal{C}$ corresponding to a collection of meaningful concepts and connections between them, and that these categories are equipped with functors $\mathcal{C} \to \mathbf{Set}$. Such a functor assigns to each meaningful concept a set of examples and connects them as dictated by the morphisms of $\mathcal{C}$. I later found out that this “databases as categories” idea was not original; it is due to Rosebrugh and others. My view on the subject has matured a bit since then, but I still like this basic conception of databases.
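To make the “functor to sets” picture concrete, here is a minimal Python sketch. It assumes the simplest case, where the schema is the free category on a graph with no path equations, so a functor amounts to a set for each object and a total function for each arrow. The tables Employee and Department, the arrows works_in and manager, and the helpers `is_instance` and `follow` are hypothetical examples.

```python
from typing import Dict, Hashable, List, Set, Tuple

# Schema: arrow name -> (source object, target object).
Schema = Dict[str, Tuple[str, str]]

objects = ["Employee", "Department"]
schema: Schema = {
    "works_in": ("Employee", "Department"),
    "manager": ("Employee", "Employee"),
}

# Instance: a set of examples per object, and a function (as a dict) per arrow.
sets: Dict[str, Set[Hashable]] = {
    "Employee": {"alice", "bob"},
    "Department": {"math", "physics"},
}
maps: Dict[str, Dict[Hashable, Hashable]] = {
    "works_in": {"alice": "math", "bob": "math"},
    "manager": {"alice": "bob", "bob": "bob"},
}


def is_instance(objects, schema, sets, maps) -> bool:
    """Check that every arrow is a total function from its source set to its target set."""
    if any(obj not in sets for obj in objects):
        return False
    for name, (src, tgt) in schema.items():
        f = maps.get(name)
        if f is None:
            return False
        if set(f) != sets[src]:              # defined on exactly the source set
            return False
        if not set(f.values()) <= sets[tgt]:  # lands in the target set
            return False
    return True


def follow(path: List[str], maps, x):
    """Apply a path of arrows in order: the functor's value on a composite morphism."""
    for name in path:
        x = maps[name][x]
    return x


# The department of alice's manager: follow "manager", then "works_in".
dept = follow(["manager", "works_in"], maps, "alice")
```

Each arrow behaves like a foreign-key column, and following a path of arrows is how the functor acts on composite morphisms. A schema with path equations would need an extra check that equal paths give equal functions, which this sketch omits.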
If we model a person’s knowledge as a database (interconnected tables of examples of things and relationships), then the network of knowledgeable humans could be conceptualized as a simplicial complex equipped with a sheaf of databases. Here, a vertex represents an individual, with her database of knowledge. An edge represents a pair of individuals and a common ground database relating their individual databases. For example, you and your brother have a database of concepts and examples from your history. The common-ground database is like the intersection of the two databases, but it could be smaller (if the people don’t yet know they agree on something). In a simplicial complex, there are not only vertices and edges, but also triangles (and so on). These would represent databases held in common between three or more people.
I wanted “regular people” to actually make such a knowledge network, i.e., to share their ideas in the form of categories and link them together with functors. Of course, most people don’t know categories and functors, so I thought I’d make things easier for them by equipping categories with linguistic structures: text boxes for objects, labeled arrows for morphisms. For example, “a person has a mother” would be a morphism from the “person” object, to the “mother” object. I called such a linguistic category an olog, playing on the word blog. The idea (originally inspired during a conversation with my friend Ralph Hutchison) was that I wanted people, especially scientists, to blog their ontologies, i.e., to write “onto-logs” like others make web-logs.
Ologs codify knowledge. They are like concept webs, except with more rules that allow them to simultaneously serve as database schemas. By introducing ologs, I hoped I could get real people to upload their ideas into what is now called the cloud, and make the necessary knowledge network. I tried to write my papers to engage an audience of intelligent lay-people rather than for an audience of mathematicians. It was a risk, but to me it was the only honest approach to the larger endeavor.
(For students who might want to try going out on a limb like this, you should know that I was offered zero jobs after my first postdoc at University of Oregon. The risk was indeed risky, and one has to be ok with that. I personally happened to be the beneficiary of good luck and was offered a grant, out of the clear blue sky, by a former PhD in algebraic geometry, who worked at the Office of Naval Research at the time. That, plus the helping hands of Haynes Miller and many other brilliant and wonderful people, can explain how I lived to tell the tale.)
So here’s how the simplicial complex of ologs would ideally help humanity steer. Suppose we say that in order for one person to learn from another, the two need to find a common language and align some ideas. This kind of (usually tacit) agreement on, or alignment of, an initial common-ground vocabulary and concept-set is important to get their communication onto a proper footing.
For two vertices in such a simplicial network, the richer their common-ground olog (i.e., the database corresponding to the edge between them) is, the more quickly and accurately the vertices can share new ideas. As ideas are shared over a simplex, all participating databases can be updated, hence making the communication between them richer. Around 2010, Mathieu Anel and I worked out a formal way this might occur; however, we have not yet written it up. The basic idea can be found here.
In this setup, the simplicial complex of human knowledge should grow organically. Scientists, business people, and other people might find benefit in ologging their ideas and conceptions, and using them to learn from their peers. I imagined a network organizing itself, where simplices of like-minded people could share information with neighboring groups across common faces.
I later wrote a book called Category Theory for the Sciences, available free online, to help scientists learn how category theory could apply to familiar situations like taxonomies, graphs, and symmetries. Category theory, simply explained, becomes a wonderful key to the whole world of pure mathematics. It’s the closest thing we have to a universal language of thought, and therefore an appropriate language for forming connections.
My working hypothesis for the knowledge network was this. The information held by people whose worldview is more true—more accurate—would have better predictive power, i.e., better results. This is by definition: I define the accuracy of a person’s knowledge as the extent to which, when he uses this knowledge to direct his actions, he has good luck handling his worldly affairs. As Louis Pasteur said, “Luck favors the prepared mind.” It follows that if someone has a track record of success, others will value finding broad connections into his olog. However, to link up with someone you must find a part of your olog that aligns with his—a functorial connection—and you can only receive meaningful information from him to the extent that you’ve found such common ground.
Thus, people who like to live in fiction worlds would find it difficult to connect, except to other like-minded “Obama’s a Christian”-type people. To the extent you are embedded in a fictional—less accurate, less predictive—part of the network, you will find it difficult to establish functorial connections to regions of more accurate knowledge, and therefore you can’t benefit from the predictive and conceptual value of this knowledge.
In other words, people would be naturally inclined to try to align their understanding with people that are better informed. I felt hope that this kind of idea could lead to a system in which honesty and accuracy were naturally rewarded. At the very least, those who used it could share information much more effectively than they do now. This was my plan; I just had to make it real.
I had a fun idea for publicizing ologs. The year was 2008, and I remember thinking it would be fantastic if I could olog the political platform and worldview of Barack Obama and of Sarah Palin. I wished I could sit down with them and other politicians and help them write ologs about what they believed and wanted for the country. I imagined that some politicians might have ologs that look like a bunch of disconnected text boxes—like a brain with neurons but no synapses—a collection of talking points but no real substantive ideas.
Anyway, there I was, trying to understand everything this way: all information was categories (or perhaps sketches) and presheaves. I would work with interested people from any academic discipline, such as materials science, to make ologs about whatever information they wanted to record category-theoretically. Ologs weren’t a theory of everything, but instead, as Jack Morava put it, a theory of anything.
One day I was working on a categorical sketch to model processes within processes, but somehow it really wasn’t working properly. The idea was simple: each step in a recipe is a mini-recipe of its own. Like chopping the carrots means getting out a knife and cutting board, putting a carrot on there, and bringing the knife down successively along it. You can keep zooming into any of these and see it as its own process. So there is some kind of nested, fractal-like behavior here. The olog I made could model the idea of steps in a recipe, but I found it difficult to encode the fact that each step was itself a recipe.
This nesting thing seemed like an idea that mathematics should treat beautifully, and ologs weren’t doing it justice. It was then that I finally admitted that there might be other fish in the mathematical sea.
• Part 3: From parts to wholes.