The logarithmic integral li(x) is a good approximation to π(x), the number of primes less than or equal to x. Numerical evidence suggests that li(x) is always greater than π(x). For example,

and

But in 1914, Littlewood heroically showed that in fact, li(x) − π(x) changes sign infinitely many times!

This raised the question: when does π(x) first exceed li(x)? In 1933, Littlewood’s student Skewes showed, assuming the Riemann hypothesis, that it must do so for some x less than or equal to

Later, in 1955, Skewes showed *without* the Riemann hypothesis that π(x) must exceed li(x) for some x smaller than

By now this bound has been improved enormously. We now know the two functions cross somewhere near 1.397 × 10^316, but we don’t know if this is the first crossing!

All this math is quite deep. Here is something less deep, but still fun.

You can show that

and so on.

It’s a nice pattern. But this pattern doesn’t go on forever! It lasts a very, very long time… but not forever.

More precisely, the identity

holds when

but not for all n. At some point it stops working and never works again. In fact, it definitely fails for all

The integrals here are a variant of the Borwein integrals:

where the pattern continues until

but then fails:

I never understood this until I read Greg Egan’s explanation, based on the work of Hanspeter Schmid. It’s all about convolution and Fourier transforms:

Suppose we have a rectangular pulse, centred on the origin, with a height of 1/2 and a half-width of 1.

Now, suppose we keep taking moving averages of this function, again and again, with the average computed in a window of half-width 1/3, then 1/5, then 1/7, 1/9, and so on.

There are a couple of features of the original pulse that will persist completely unchanged for the first few stages of this process, but then they will be abruptly lost at some point.

The first feature is that F(0) = 1/2. In the original pulse, the point (0,1/2) lies on a plateau, a perfectly constant segment with a half-width of 1. The process of repeatedly taking the moving average will nibble away at this plateau, shrinking its half-width by the half-width of the averaging window. So, once the sum of the windows’ half-widths exceeds 1, at 1/3+1/5+1/7+…+1/15, F(0) will suddenly fall below 1/2, but up until that step it will remain untouched.

In the animation below, the plateau where F(x)=1/2 is marked in red.

The second feature is that F(–1)=F(1)=1/4. In the original pulse, we have a step at –1 and 1, but if we define F here as the average of the left-hand and right-hand limits we get 1/4, and once we apply the first moving average we simply have 1/4 as the function’s value.

In this case, F(–1)=F(1)=1/4 will continue to hold so long as the points (–1,1/4) and (1,1/4) are surrounded by regions where the function has a suitable symmetry: it is equal to an odd function, offset and translated from the origin to these centres. So long as that’s true for a region wider than the averaging window being applied, the average at the centre will be unchanged.

The initial half-width of each of these symmetrical slopes is 2 (stretching from the opposite end of the plateau and an equal distance away along the x-axis), and as with the plateau, this is nibbled away each time we take another moving average. And in this case, the feature persists until 1/3+1/5+1/7+…+1/113, which is when the sum first exceeds 2.

In the animation, the yellow arrows mark the extent of the symmetrical slopes.
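Both cutoffs come from the same computation, so they are easy to verify with exact rational arithmetic, avoiding any rounding doubts. A quick sketch (the function name is ours):

```python
from fractions import Fraction

def first_odd_exceeding(threshold):
    """First odd k such that 1/3 + 1/5 + ... + 1/k exceeds the threshold,
    computed exactly with rationals."""
    total, k = Fraction(0), 1
    while total <= threshold:
        k += 2
        total += Fraction(1, k)
    return k

print(first_odd_exceeding(1))   # 15: the plateau of half-width 1 is used up
print(first_odd_exceeding(2))   # 113: the symmetric slopes of half-width 2 are used up
```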

OK, none of this is difficult to understand, but why should we care?

Because this is how Hanspeter Schmid explained the infamous Borwein integrals:

∫sin(t)/t dt = π/2

∫sin(t/3)/(t/3) × sin(t)/t dt = π/2

∫sin(t/5)/(t/5) × sin(t/3)/(t/3) × sin(t)/t dt = π/2…

∫sin(t/13)/(t/13) × … × sin(t/3)/(t/3) × sin(t)/t dt = π/2

But then the pattern is broken:

∫sin(t/15)/(t/15) × … × sin(t/3)/(t/3) × sin(t)/t dt < π/2

Here these integrals are from t=0 to t=∞. And Schmid came up with an even more persistent pattern of his own:

∫2 cos(t) sin(t)/t dt = π/2

∫2 cos(t) sin(t/3)/(t/3) × sin(t)/t dt = π/2

∫2 cos(t) sin(t/5)/(t/5) × sin(t/3)/(t/3) × sin(t)/t dt = π/2

…

∫2 cos(t) sin(t/111)/(t/111) × … × sin(t/3)/(t/3) × sin(t)/t dt = π/2

But:

∫2 cos(t) sin(t/113)/(t/113) × … × sin(t/3)/(t/3) × sin(t)/t dt < π/2

The first set of integrals, due to Borwein, corresponds to taking the Fourier transforms of our sequence of ever-smoother pulses and then evaluating F(0). The Fourier transform of the sinc function:

sinc(w t) = sin(w t)/(w t)

is proportional to a rectangular pulse of half-width w, and the Fourier transform of a product of sinc functions is the convolution of their transforms, which in the case of a rectangular pulse just amounts to taking a moving average.

Schmid’s integrals come from adding a clever twist: the extra factor of 2 cos(t) shifts the integral from the zero-frequency Fourier component to the sum of its components at angular frequencies –1 and 1, and hence the result depends on F(–1)+F(1)=1/2, which as we have seen persists for much longer than F(0)=1/2.

• Hanspeter Schmid, Two curious integrals and a graphic proof, *Elem. Math.* **69** (2014), 11–17.

I asked Greg if we could generalize these results to give even longer sequences of identities that eventually fail, and he showed me how: you can just take the Borwein integrals and replace the numbers 1, 1/3, 1/5, 1/7, … by some sequence of positive numbers

The integral

will then equal π/2 as long as the sum of the numbers after the first is at most 1, but not when it exceeds 1. You can see a full explanation on Wikipedia:

• Wikipedia, Borwein integral: general formula.

As an example, I chose the integral

which equals if and only if

Thus, the identity holds if

but

so the identity holds if

or

or

On the other hand, the identity fails if

so it fails if

but

so the identity fails if

or

or

With a little work one could sharpen these estimates considerably, though it would take more work to find the *exact* value of n at which

first fails.

• Tai-Danae Bradley, *What is Applied Category Theory?*

Abstract. This is a collection of introductory, expository notes on applied category theory, inspired by the 2018 Applied Category Theory Workshop. In these notes we take a leisurely stroll through two themes (functorial semantics and compositionality), two constructions (monoidal categories and decorated cospans) and two examples (chemical reaction networks and natural language processing) within the field.

Check it out!

It’s called the **5/8 theorem**. Randomly choose two elements of a finite group. What’s the probability that they commute? If it exceeds 62.5%, the group must be abelian!

This was probably known for a long time, but the first known proof appears in a paper by Erdős and Turán.

It’s fun to lead up to this proof by looking for groups that are “as commutative as possible without being abelian”. This phrase could mean different things. *One* interpretation is that we’re trying to maximize the probability that two randomly chosen elements commute. But there are two simpler interpretations, which will actually help us prove the 5/8 theorem.

How big can the center of a finite group be, compared to the whole group? If a group G is abelian, its center, say Z(G), is all of G. But let’s assume G is not abelian. How big can Z(G) be?

Since the center is a subgroup of G, we know by Lagrange’s theorem that |G|/|Z(G)| is an integer. To make Z(G) big we need this integer to be small. How small can it be?

It can’t be 1, since then Z(G) would be all of G and G would be abelian. Can it be 2?

No! This would force G to be abelian, leading to a contradiction! The reason is that the center is always a normal subgroup of G, so G/Z(G) is a group of size |G|/|Z(G)|. If this is 2 then G/Z(G) has to be ℤ/2. But this is generated by one element, so G must be generated by its center together with one element. This one element commutes with everything in the center, obviously… but that means G is abelian: a contradiction!

For the same reason, |G|/|Z(G)| can’t be 3. The only group with 3 elements is ℤ/3, which is generated by one element. So the same argument leads to a contradiction: G is generated by its center and one element, which commutes with everything in the center, so G is abelian.

So let’s try 4. There are two groups with 4 elements: ℤ/4 and ℤ/2 × ℤ/2. The second, called the Klein four-group, is not generated by one element. It’s generated by two elements! So it offers some hope.

If you haven’t studied much group theory, you could be pessimistic. After all, ℤ/2 × ℤ/2 is still abelian! So you might think this: “If G/Z(G) is the Klein four-group, the group G is generated by its center and two elements which commute with each other, so it’s abelian.”

But that’s false: even if two elements of G/Z(G) commute with each other, this does not imply that the elements of G mapping to these elements commute.

This is a fun subject to study, but the best way for us to see this right now is to actually find a nonabelian group G with G/Z(G) equal to the Klein four-group. The smallest possible example would have |G| = 8, and indeed this works!

Namely, we’ll take G to be the 8-element quaternion group

Q = {±1, ±i, ±j, ±k}

where

i² = j² = k² = −1,   ij = k,   jk = i,   ki = j

and multiplication by −1 works just as you’d expect, e.g. (−1)i = −i.

You can think of these 8 guys as the unit quaternions lying on the 4 coordinate axes. They’re the vertices of a 4-dimensional analogue of the octahedron. Here’s a picture by David A. Richter, where the 8 vertices are projected down from 4 dimensions to the vertices of a cube:

The center of Q is {±1}, and the quotient Q/{±1} is the Klein four-group, since if we mod out by ±1 we get the group

{1, i, j, k}

with

i² = j² = k² = 1,   ij = ji = k,   and so on.

So, we’ve found a nonabelian finite group with 1/4 of its elements lying in the center, and this is the maximum possible fraction!

Here’s another way to ask how commutative a finite group can be, without being abelian. Any element g of a group G has a centralizer C(g), consisting of all the elements of G that commute with g.

How big can C(g) be? If g is in the center of G, then C(g) is all of G. So let’s assume g is not in the center, and ask how big the fraction |C(g)|/|G| can be.

In other words: how large can the fraction of elements of G that commute with g be, without it being *everything*?

It’s easy to check that the centralizer C(g) is a subgroup of G. So, again using Lagrange’s theorem, we know |G|/|C(g)| is an integer. To make the fraction |C(g)|/|G| big, we want this integer to be small. If it’s 1, *everything* commutes with g. So the first real option is 2.

Can we find an element of a finite group that commutes with exactly 1/2 the elements of that group?

Yes! One example is our friend the quaternion group Q. Each element other than ±1 commutes with exactly half the elements. For example, i commutes only with its own powers: 1, i, −1, −i.

So we’ve found a finite group with a non-central element that commutes with 1/2 the elements in the group, and this is the maximum possible fraction!

Now let’s tackle the original question. Suppose G is a nonabelian finite group. How can we maximize the probability for two randomly chosen elements of G to commute?

Say we randomly pick two elements g and h. Then there are two cases. If g is in the center of G, it commutes with h with probability 1. But if g is not in the center, we’ve just seen it commutes with h with probability at most 1/2.

So, to get an upper bound on the probability that our pair of elements commutes, we should make the center as large as possible. We’ve seen that |Z(G)|/|G| is at most 1/4. So let’s use that.

Then with probability 1/4, g commutes with all the elements of G, while with probability 3/4 it commutes with at most 1/2 the elements of G.

So, the probability that g commutes with h is at most

(1/4 × 1) + (3/4 × 1/2) = 5/8.

Even better, all these bounds are attained by the quaternion group Q: 1/4 of its elements are in the center, while every element not in the center commutes with 1/2 of the elements! So, the probability that two elements in this group commute is exactly 5/8.

So we’ve proved the 5/8 theorem and shown we can’t improve this constant.
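Every number in this argument can be double-checked by brute force. Here is a small sketch that builds Q as the eight unit quaternions ±1, ±i, ±j, ±k and counts commuting pairs directly:

```python
from fractions import Fraction
from itertools import product

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

units = [(1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1)]          # 1, i, j, k
Q8 = [tuple(s*x for x in u) for u in units for s in (1, -1)]  # all 8 elements

center = [g for g in Q8 if all(qmul(g, h) == qmul(h, g) for h in Q8)]
commuting = sum(qmul(g, h) == qmul(h, g) for g, h in product(Q8, repeat=2))
prob = Fraction(commuting, len(Q8)**2)

print(len(center), prob)   # 2 5/8: center {1, -1}, and 40 of the 64 pairs commute
```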

I find it very pleasant that the quaternion group is “as commutative as possible without being abelian” in three different ways. But I shouldn’t overstate its importance!

I don’t know the proof, but the website groupprops says the following are equivalent for a finite group G:

• The probability that two elements commute is 5/8.

• The inner automorphism group of G has 4 elements.

• The inner automorphism group of G is ℤ/2 × ℤ/2.

Examining the argument I gave, it seems the probability 5/8 can only be attained if

• |Z(G)|/|G| = 1/4, and

• |C(g)|/|G| = 1/2 for every g not in the center of G.

So apparently any finite group with inner automorphism group ℤ/2 × ℤ/2 must have these other two properties as well!

There are lots of groups with inner automorphism group ℤ/2 × ℤ/2. Besides the quaternion group, there’s one other 8-element group with this property: the group of rotations and reflections of the square, also known as the dihedral group of order 8. And there are six 16-element groups with this property: they’re called the groups of Hall–Senior class two. And I expect that as we go to higher powers of two, there will be vast numbers of groups with this property.

You see, the number of nonisomorphic groups of order 2^n grows alarmingly fast. There’s 1 group of order 2, 2 of order 4, 5 of order 8, 14 of order 16, 51 of order 32, 267 of order 64… but 49,487,365,422 of order 1024. Indeed, it seems ‘almost all’ finite groups have order a power of two, in a certain asymptotic sense. For example, 99% of the roughly 50 billion groups of order ≤ 2000 have order 1024.

Thus, if people trying to classify groups are like taxonomists, groups of order a power of 2 are like insects.

In 1964, the amusingly named pair of authors Marshall Hall Jr. and James K. Senior classified all groups of order 2^n for n ≤ 6. They developed some powerful general ideas in the process, like isoclinism. I don’t want to explain that here, but it involves the quotient G/Z(G) that I’ve been talking about. So, though I don’t understand much about this, I’m not completely surprised to read that a group of order 2^n has commuting probability 5/8 iff it has ‘Hall–Senior class two’.

There’s much more to say. For example, we can define the probability that two elements commute not just for finite groups but also compact topological groups, since these come with a god-given probability measure, called Haar measure. And here again, if the group is nonabelian, the maximum possible probability for two elements to commute is 5/8!

There are also many other generalizations. For example Guralnick and Wilson proved:

• If the probability that two randomly chosen elements of G generate a solvable group is greater than 11/30, then G itself is solvable.

• If the probability that two randomly chosen elements of G generate a nilpotent group is greater than 1/2, then G is nilpotent.

• If the probability that two randomly chosen elements of G generate a group of odd order is greater than 11/30, then G itself has odd order.

The constants are optimal in each case.

I’ll just finish with two questions I don’t know the answer to:

• For exactly what set of numbers can we find a finite group where the probability that two randomly chosen elements commute is one of those numbers? If we call this set C, we’ve seen that C contains nothing strictly between 5/8 and 1.

But does C contain *every* rational number in the interval (0,5/8], or just some? Just some, in fact—but which ones? It should be possible to make some progress on this by examining my proof of the 5/8 theorem, but I haven’t tried at all. I leave it to you!

• For what properties P of a finite group is there a theorem of this form: “if the probability of two randomly chosen elements generating a subgroup of G with property P exceeds some value, then G must itself have property P”? Is there some logical form a property can have that will guarantee the existence of a result like this?

Here is a nice discussion, where I learned some of the facts I mentioned, including the proof I gave:

• MathOverflow, 5/8 bound in group theory.

Here is an elementary reference, free online if you jump through some hoops, which includes the proof for compact topological groups, and other bits of wisdom:

• W. H. Gustafson, What is the probability that two group elements commute?, *American Mathematical Monthly* **80** (1973), 1031–1034.

For example, if G is finite, simple, and nonabelian, the probability that two elements commute is at most 1/12, a bound attained by the alternating group A₅.
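That 1/12 is quick to confirm by brute force over the 60 even permutations of 5 letters (a small sketch; the helper names are ours):

```python
from fractions import Fraction
from itertools import permutations, product

def is_even(p):
    """A permutation (a tuple of images of 0..n-1) is even iff it has an even
    number of inversions."""
    n = len(p)
    inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
    return inversions % 2 == 0

A5 = [p for p in permutations(range(5)) if is_even(p)]   # 60 elements

def compose(p, q):
    return tuple(p[q[i]] for i in range(5))

commuting = sum(compose(p, q) == compose(q, p) for p, q in product(A5, repeat=2))
print(Fraction(commuting, len(A5)**2))   # 1/12
```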

Here’s another elementary article:

• Desmond MacHale, How commutative can a non-commutative group be?, *The Mathematical Gazette* **58** (1974), 199–202.

If you get completely stuck on Puzzle 1, you can look here for some hints on what values the probability that two elements commute can take… but not a complete solution!

The 5/8 theorem seems to have first appeared here:

• P. Erdős and P. Turán, On some problems of a statistical group-theory, IV, *Acta Math. Acad. Sci. Hung.* **19** (1968), 413–435.

I’ve been spending the last month at the Centre of Quantum Technologies, getting lots of work done. This Friday I’m giving a talk, and you can see the slides now:

• John Baez, Getting to the bottom of Noether’s theorem.

Abstract. In her 1918 paper, Noether formulated her theorem relating symmetries and conserved quantities in terms of Lagrangian mechanics. But if we want to make the essence of this relation seem as self-evident as possible, we can turn to a formulation in terms of Poisson brackets, which generalizes easily to quantum mechanics using commutators. This approach also gives a version of Noether’s theorem for Markov processes. The key question then becomes: when, and why, do observables generate one-parameter groups of transformations? This question sheds light on why complex numbers show up in quantum mechanics.

At 5:30 on Saturday October 6th I’ll talk about this stuff at this workshop in London:

• The Philosophy and Physics of Noether’s Theorems, 5-6 October 2018, Fischer Hall, 1-4 Suffolk Street, London, UK. Organized by Bryan W. Roberts (LSE) and Nicholas Teh (Notre Dame).

This workshop celebrates the 100th anniversary of Noether’s famous paper connecting symmetries to conserved quantities. Her paper actually contains *two* big theorems. My talk is only about the more famous one, Noether’s first theorem, and I’ll change my talk title to make that clear when I go to London, to avoid getting flak from experts. Her second theorem explains why it’s hard to define energy in general relativity! This is one reason Einstein admired Noether so much.

I’ll also give this talk at DAMTP—the Department of Applied Mathematics and Theoretical Physics, in Cambridge—on Thursday October 4th at 1 pm.

The organizers of the London workshop on the philosophy and physics of Noether’s theorems have asked me to write a paper, so my talk can be seen as the first step toward that. My talk doesn’t contain any hard theorems, but the main point—that the complex numbers arise naturally from wanting a correspondence between observables and symmetry generators—can be expressed in some theorems, which I hope to explain in my paper.


It’s an open-access journal for research using compositional ideas, most notably of a category-theoretic origin, in any discipline. Topics may concern foundational structures, an organizing principle, or a powerful tool. Example areas include but are not limited to: computation, logic, physics, chemistry, engineering, linguistics, and cognition.

*Compositionality* is free of cost for both readers and authors.

We invite you to submit a manuscript for publication in the first issue of Compositionality (ISSN: 2631-4444), a new open-access journal for research using compositional ideas, most notably of a category-theoretic origin, in any discipline.

To submit a manuscript, please visit http://www.compositionality-journal.org/for-authors/.

Compositionality refers to complex things that can be built by sticking together simpler parts. We welcome papers using compositional ideas, most notably of a category-theoretic origin, in any discipline. This may concern foundational structures, an organising principle, a powerful tool, or an important application. Example areas include but are not limited to: computation, logic, physics, chemistry, engineering, linguistics, and cognition.

Related conferences and workshops that fall within the scope of Compositionality include the Symposium on Compositional Structures (SYCO), Categories, Logic and Physics (CLP), String Diagrams in Computation, Logic and Physics (STRING), Applied Category Theory (ACT), Algebra and Coalgebra in Computer Science (CALCO), and the Simons Workshop on Compositionality.

Submissions should be original contributions of previously unpublished work, and may be of any length. Work previously published in conferences and workshops must be significantly expanded or contain significant new results to be accepted. There is no deadline for submission. There is no processing charge for accepted publications; Compositionality is free to read and free to publish in. More details can be found in our editorial policies at http://www.compositionality-journal.org/editorial-policies/.

John Baez, University of California, Riverside, USA

Bob Coecke, University of Oxford, UK

Kathryn Hess, EPFL, Switzerland

Steve Lack, Macquarie University, Australia

Valeria de Paiva, Nuance Communications, USA

Corina Cirstea, University of Southampton, UK

Ross Duncan, University of Strathclyde, UK

Andree Ehresmann, University of Picardie Jules Verne, France

Tobias Fritz, Max Planck Institute, Germany

Neil Ghani, University of Strathclyde, UK

Dan Ghica, University of Birmingham, UK

Jeremy Gibbons, University of Oxford, UK

Nick Gurski, Case Western Reserve University, USA

Helle Hvid Hansen, Delft University of Technology, Netherlands

Chris Heunen, University of Edinburgh, UK

Aleks Kissinger, Radboud University, Netherlands

Joachim Kock, Universitat Autonoma de Barcelona, Spain

Martha Lewis, University of Amsterdam, Netherlands

Samuel Mimram, Ecole Polytechnique, France

Simona Paoli, University of Leicester, UK

Dusko Pavlovic, University of Hawaii, USA

Christian Retore, Universite de Montpellier, France

Mehrnoosh Sadrzadeh, Queen Mary University, UK

Peter Selinger, Dalhousie University, Canada

Pawel Sobocinski, University of Southampton, UK

David Spivak, MIT, USA

Jamie Vicary, University of Birmingham and University of Oxford, UK

Simon Willerton, University of Sheffield, UK

Sincerely,

The Editorial Board of Compositionality

‘Compositional tasking’ means assigning tasks to agents in a network in such a way that you can connect or even overlay such tasked networks and get larger ones. This lets you build up complex plans from smaller pieces.

In my last post in this series, I sketched an approach using ‘commitment networks’. A commitment network is a graph where nodes represent agents and edges represent commitments, like “A should move toward B either for 3 hours or until they meet, whichever comes first”. By overlaying such graphs we can build up commitment networks that describe complex plans of action. The rules for overlaying incorporate ‘automatic deconflicting’. In other words: you don’t need to worry about agents being given conflicting duties as you stack up plans, because you’ve decided ahead of time what they should do in these situations.

I still like that approach, but we’ve been asked to develop some ideas more closely connected to traditional methods of tasking, like PERT charts, so now we’ve done that.

‘PERT’ stands for ‘program evaluation and review technique’. PERT charts were developed by the US Navy in 1957, but now they’re used all over industry to help plan and schedule large projects.

Here’s a simple example:

The nodes in this graph are different **states**, like “you have built the car but not yet put on the tires”. The edges are different **tasks**, like “put the tires on the car”. Each state is labelled with an arbitrary name: 10, 20, 30, 40 and 50. The tasks also have names: A, B, C, D, E, and F. More importantly, each task is labelled by the amount of time that task requires!

Your goal is to start at state 10 and move all the way to state 50. Since you’re bossing lots of people around, you can make them do tasks simultaneously. However, you can only reach a state after you have done *all* the tasks leading up to that state. For example, you can’t reach state 50 unless you have already done *all* of tasks C, E, and F. Some typical questions are:

• What’s the minimum amount of time it takes to get from state 10 to state 50?

• Which tasks could take longer, without changing the answer to the previous question? How much longer could each task take, without changing the answer? This amount of time is called the **slack** for that task.

There are known algorithms for solving such problems. These help big organizations plan complex projects. So, connecting compositional tasking to PERT charts seems like a good idea.
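The forward pass of those algorithms is a longest-path computation over the chart, treating each state as reachable only once *all* incoming tasks are done. Here is a minimal sketch; the edges and durations below are hypothetical, chosen only to match the description above (tasks C, E, and F all end at state 50):

```python
# Hypothetical PERT chart: task name -> (start state, end state, duration).
tasks = {
    'A': (10, 20, 3), 'B': (10, 30, 4), 'C': (10, 50, 10),
    'D': (20, 40, 2), 'E': (30, 50, 5), 'F': (40, 50, 4),
}

def earliest_times(tasks):
    """Earliest time each state can be reached.  A state is reached only
    after *all* tasks leading into it are done, hence the max below."""
    indeg, states = {}, set()
    for u, v, d in tasks.values():
        states |= {u, v}
        indeg[v] = indeg.get(v, 0) + 1
    earliest = {s: 0 for s in states}
    ready = [s for s in states if indeg.get(s, 0) == 0]   # start states
    while ready:                                          # Kahn-style topological pass
        u = ready.pop()
        for u2, v, d in tasks.values():
            if u2 == u:
                earliest[v] = max(earliest[v], earliest[u] + d)
                indeg[v] -= 1
                if indeg[v] == 0:
                    ready.append(v)
    return earliest

print(earliest_times(tasks)[50])   # 10: with these numbers, task C is critical
```

With these made-up durations, the route through C alone takes 10 units while the routes A–D–F and B–E take 9, so each of those two routes has one unit of slack.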

At first this seemed confusing because in our previous work the nodes represented *agents*, while in PERT charts the nodes represent *states*. Of course graphs can be used for many things, even in the same setup. But the trick was getting everything to fit together nicely.

Now I think we’re close.

John Foley has been working out some nice example problems where a collection of agents need to move along the edges of a graph from specified start locations to specified end locations, taking routes that minimize their total fuel usage. However, there are some constraints. Some edges can only be traversed by specified *teams* of agents: they can’t go alone. Also, no one agent is allowed to run out of fuel.

This is a nice problem because while it’s pretty simple and specific, it’s representative of a large class of problems where a collection of agents are trying to carry out tasks together. ‘Moving along the edge of a graph’ can stand for a task of any sort. The constraint that some edges can only be traversed by specified teams is then a way of saying that certain tasks can only be accomplished by teams.

Furthermore, there are nice software packages for optimization subject to constraints. For example, John likes one called Choco. So, we plan to use one of these as part of the project.

What makes this all *compositional* is that John has expressed this problem using our ‘network model’ formalism, which I began sketching in Part 6. This allows us to assemble tasks for larger collections of agents from tasks for smaller collections.

Here, however, an idea due to my student Joe Moeller turned out to be crucial.

In our first examples of network models, explained earlier in this series, we allowed a *monoid* of networks for any set of agents of different kinds. A monoid has a binary operation called ‘multiplication’, and the idea here was that this could describe the operation of ‘overlaying’ networks: for example, laying one set of communication channels, or commitments, on top of another.

However, Joe knew full well that a monoid is a category with one object, so he pushed for a generalization that allowed not just a monoid but a *category* of networks for any set of agents of different kinds. I didn’t know what this was good for, but I figured: what the heck, let’s do it. It was a mathematically natural move, and it didn’t make anything harder—in fact it clarified some of our constructions, which is why Joe wanted to do it.

Now that generalization is proving to be crucial! We can take our category of networks to have *states* as objects and *tasks* (ways of moving between states) as morphisms! So, instead of ‘overlaying networks’, the basic operation is now *composing tasks*.

So, we now have a framework where if you specify a collection of agents of different kinds, we can give you the category whose morphisms are tasks those agents can engage in.

An example is John’s setup where the agents are moving around on a graph.

But this framework also handles PERT charts! While the folks who invented PERT charts didn’t think of them this way, one can think of them as describing categories of a certain specific sort, with states as objects and tasks as morphisms.

So, we now have a compositional framework for PERT charts.

I would like to dive deeper into the details, but this is probably enough for one post. I will say, though, that we use some math I’ve just developed with my grad student Jade Master, explained here:

• Open Petri nets (part 3), *Azimuth*, 19 August 2018.

The key is the relation between Petri nets and PERT charts. I’ll have more to say about that soon, I hope!

Some posts in this series:

• Part 1. CASCADE: the Complex Adaptive System Composition and Design Environment.

• Part 2. Metron’s software for system design.

• Part 3. Operads: the basic idea.

• Part 4. Network operads: an easy example.

• Part 5. Algebras of network operads: some easy examples.

• Part 6. Network models.

• Part 7. Step-by-step compositional design and tasking using commitment networks.

• Part 8. Compositional tasking using category-valued network models.

• John Baez and Jade Master, Open Petri nets.

In Part 1 we saw the double category of open Petri nets; in Part 2 we saw the reachability semantics for open Petri nets as a double functor. Now I’d like to wrap up by showing you the engine beneath the hood of our results.

I fell in love with Petri nets when I realized that they were really just presentations of free symmetric monoidal categories. If you like category theory, this turns Petri nets from something mysterious into something attractive.

In any category you can compose morphisms f: x → y and g: y → z and get a morphism gf: x → z. In a monoidal category you can also tensor morphisms f: x → y and f′: x′ → y′ and get a morphism f ⊗ f′: x ⊗ x′ → y ⊗ y′. This of course relies on your ability to tensor objects. In a symmetric monoidal category you also have a symmetry σ_{x,y}: x ⊗ y → y ⊗ x. And of course, there is more to it than this. But this is enough to get started.

A Petri net has ‘places’ and also ‘transitions’ going between multisets of places:

From this data we can try to generate a symmetric monoidal category whose objects are built from the places and whose morphisms are built from the transitions. So, for example, the above Petri net would give a symmetric monoidal category with an object

2 **susceptible** + **infected**

and a morphism from this to the object

**susceptible** + 2 **infected**

(built using the transition **infection**), and a morphism

from this to the object

**susceptible** + **infected** + **resistant**

(built using the transition **recovery**) and so on. Here we are using + to denote the tensor product in our symmetric monoidal category, as usual in chemistry.
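The generated morphisms can be sketched concretely: represent an object as a multiset of places, and build a morphism by firing transitions, each firing consuming its source and producing its target. This is only an illustration of the free-generation idea, not the actual construction from the paper; the place names follow the example above:

```python
from collections import Counter

# Each transition: (source multiset, target multiset) of places.
transitions = {
    'infection': (Counter({'susceptible': 1, 'infected': 1}),
                  Counter({'infected': 2})),
    'recovery':  (Counter({'infected': 1}),
                  Counter({'resistant': 1})),
}

def fire(marking, name):
    """Compose with one generating morphism: remove the source, add the target."""
    src, tgt = transitions[name]
    assert all(marking[p] >= n for p, n in src.items()), "transition not enabled"
    return marking - src + tgt

m = Counter({'susceptible': 2, 'infected': 1})   # the object 2 susceptible + infected
m = fire(m, 'infection')                         # susceptible + 2 infected
m = fire(m, 'recovery')                          # susceptible + infected + resistant
print(sorted(m.items()))
```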

When we do this sort of construction, the resulting symmetric monoidal category is ‘free’. That is, we are not imposing any really interesting equations: the objects are freely generated by the places in our Petri net by tensoring, and the morphisms are freely generated by the transitions by tensoring and composition.

That’s the basic idea. The problem is making this idea precise!

Many people have tried in many different ways. I like this approach the best:

• José Meseguer and Ugo Montanari, Petri nets are monoids, *Information and Computation* **88** (1990), 105–155.

but I think it can be simplified a bit, so let me describe what Jade and I did in our paper.

The problem is that there are different notions of symmetric monoidal category, and also different notions of morphism between Petri nets. We take the maximally strict approach, and work with ‘commutative’ monoidal categories. These are just commutative monoid objects in Cat, so their associator:

α_{x,y,z}: (x ⊗ y) ⊗ z → x ⊗ (y ⊗ z)

their left and right unitors:

λ_x: I ⊗ x → x,   ρ_x: x ⊗ I → x

and even—disturbingly—their braiding:

σ_{x,y}: x ⊗ y → y ⊗ x

are all identity morphisms.

The last would ordinarily be seen as ‘going too far’, since while every symmetric monoidal category is equivalent to one with trivial associator and unitors, this ceases to be true if we also require the braiding to be trivial. However, it seems that Petri nets most naturally serve to present symmetric monoidal categories of this very strict sort. There just isn’t enough information in a Petri net to make it worthwhile giving them a nontrivial braiding.

It took me a while to accept this, but now it seems obvious. If you want a nontrivial braiding, you should be using something a bit fancier than a Petri net.

Thus, we construct adjoint functors between a category of Petri nets, which we call Petri, and a category of ‘commutative monoidal categories’, which we call CMC.

An object of Petri is a **Petri net**: that is, a set S of **places**, a set T of **transitions**, and **source** and **target** functions

s, t: T → ℕ[S]

where ℕ[S] is the underlying set of the free commutative monoid on S.

More concretely, is the set of formal finite linear combinations of elements of with natural number coefficients. The set naturally includes in , and for any function

there is a unique monoid homomorphism

extending
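This universal property is easy to see concretely. Here is a minimal Python sketch in which an element of the free commutative monoid is modelled as a `Counter` (a multiset of generators); `free_ext` is a hypothetical helper name, not notation from the paper:

```python
from collections import Counter

def free_ext(f):
    """Given f sending each generator to an element of a free commutative
    monoid (a Counter), return the unique monoid homomorphism extending f:
    it acts on a formal linear combination term by term."""
    def f_bar(marking):
        out = Counter()
        for place, n in marking.items():
            for q, m in f(place).items():
                out[q] += n * m
        return out
    return f_bar
```

The homomorphism property holds by construction: `f_bar` of a sum of markings is the sum of the results, and `f_bar` of the empty marking is empty.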

A **Petri net morphism** from a Petri net

to a Petri net

is a pair of functions

making the two obvious diagrams commute:

There is a category with Petri nets as objects and Petri net morphisms as morphisms.
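To make the two commuting squares concrete, here is a small Python sketch of Petri nets and the morphism check; the class and function names are made up for illustration, not taken from the paper:

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class PetriNet:
    places: set
    transitions: set
    source: dict  # transition -> Counter over places (an element of the free commutative monoid)
    target: dict  # transition -> Counter over places

def is_petri_morphism(P, Q, f, g):
    """f maps P.places to Q.places, g maps P.transitions to Q.transitions.
    The two squares commute when applying the induced monoid homomorphism
    of f to the source (resp. target) of each transition of P gives the
    source (resp. target) of its image under g."""
    def Nf(marking):  # the monoid homomorphism induced by f
        out = Counter()
        for p, n in marking.items():
            out[f[p]] += n
        return out
    return all(Nf(P.source[t]) == Q.source[g[t]] and
               Nf(P.target[t]) == Q.target[g[t]]
               for t in P.transitions)
```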

On the other hand, a **commutative monoidal category** is a commutative monoid object in Cat. Explicitly, it’s a strict monoidal category such that for all objects and we have

and for all morphisms and
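The two displayed equations were lost in extraction; given the surrounding definition they are presumably the strict commutativity laws:

```latex
a \otimes b = b \otimes a, \qquad f \otimes g = g \otimes f .
```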

Note that a commutative monoidal category is the same as a strict symmetric monoidal category where the symmetry isomorphisms

are all identity morphisms. Every strict monoidal functor between commutative monoidal categories is automatically a strict symmetric monoidal functor. So, we let be the category whose objects are commutative monoidal categories and whose morphisms are strict monoidal functors.

There’s a functor

sending any commutative monoidal category to its underlying Petri net. This Petri net has the set of objects as its set of places and the set of morphisms as its set of transitions, and

as its source and target maps.

**Proposition.** The functor has a left adjoint

This is Proposition 10 in our paper, and we give an explicit construction of this left adjoint.

So that’s our conception of the free commutative monoidal category on a Petri net. It’s pretty simple. How could anyone have done anything else?

Meseguer and Montanari do *almost* the same thing, but our category of Petri nets is a subcategory of theirs: our morphisms of Petri nets send places to places, while they allow more general maps that send a place to a *formal linear combination* of places. On the other hand, they consider a full subcategory of our containing only commutative monoidal categories whose objects form a *free* commutative monoid.

Other papers do a variety of more complicated things. I don’t have the energy to explain them all, but you can see some here:

• Pierpaolo Degano, José Meseguer and Ugo Montanari, Axiomatizing net computations and processes, in *Logic in Computer Science 1989*, IEEE, New Jersey, 1989, pp. 175–185.

• Vladimiro Sassone, Strong concatenable processes: an approach to the category of Petri net computations, *BRICS Report Series*, Dept. of Computer Science, U. Aarhus, 1994.

• Vladimiro Sassone, On the category of Petri net computations, in *Colloquium on Trees in Algebra and Programming*, Springer, Berlin, 1995.

• Vladimiro Sassone, An axiomatization of the algebra of Petri net concatenable processes, in *Theoretical Computer Science* **170** (1996), 277–296.

• Vladimiro Sassone and Pavel Sobociński, A congruence for Petri nets, *Electronic Notes in Theoretical Computer Science* **127** (2005), 107–120.

Getting the free commutative monoidal category on a Petri net right is key to developing the reachability semantics for open Petri nets in a nice way. But to see that, you’ll have to read our paper!

• Part 1: the double category of open Petri nets.

• Part 2: the reachability semantics for open Petri nets.

• Part 3: the free symmetric monoidal category on a Petri net.

• John Baez and Jade Master, Open Petri nets.

Last time I explained, in a sketchy way, the double category of open Petri nets. This time I’d like to describe a ‘semantics’ for open Petri nets.

In his famous thesis *Functorial Semantics of Algebraic Theories*, Lawvere introduced the idea that semantics, as a map from expressions to their meanings, should be a functor between categories. This has been generalized in many directions, and the same idea works for double categories. So, we describe our semantics for open Petri nets as a map

from our double category of open Petri nets to a double category of relations. This map sends any open Petri net to its ‘reachability relation’.

In Petri net theory, a **marking** of a set is a finite multisubset of that set. We can think of this as a way of placing finitely many ‘tokens’—little black dots—on its elements. A Petri net lets us start with some marking of its places and then repeatedly change the marking by moving tokens around, using the transitions. This is how Petri nets describe processes!

For example, here’s a Petri net from chemistry:

Here’s a marking of its places:

But using the transitions, we can repeatedly change the marking. We started with one atom of carbon, one molecule of oxygen, one molecule of sodium hydroxide and one molecule of hydrochloric acid. But they can turn into one molecule of carbon dioxide, one molecule of sodium hydroxide and one molecule of hydrochloric acid:

These can then turn into one molecule of sodium bicarbonate and one molecule of hydrochloric acid:

Then these can turn into one molecule of carbon dioxide, one molecule of water and one molecule of sodium chloride:

People say one marking is **reachable** from another if you can get it using a finite sequence of transitions in this manner. (Our paper explains this well-known notion more formally.) In this example every marking has 0 or 1 tokens in each place. But that’s not typical: in general we could have any natural number of tokens in each place, so long as the total number of tokens is finite.
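The well-known notion of reachability can be sketched in a few lines of Python, with markings as `Counter` multisets; this is an illustrative toy, not code from the paper:

```python
from collections import Counter

def fire(marking, src, tgt):
    """Fire one transition if enabled: remove its source multiset from
    the marking and add its target multiset; else return None."""
    if any(marking[p] < n for p, n in src.items()):
        return None
    out = Counter(marking)
    out.subtract(src)
    out.update(tgt)
    return +out  # drop zero counts

def reachable(start, transitions):
    """All markings reachable from `start` by firing transitions
    (terminates only when the reachable set is finite)."""
    start = +Counter(start)
    seen = {frozenset(start.items())}
    found, frontier = [start], [start]
    while frontier:
        m = frontier.pop()
        for src, tgt in transitions:
            m2 = fire(m, src, tgt)
            if m2 is not None and frozenset(m2.items()) not in seen:
                seen.add(frozenset(m2.items()))
                found.append(m2)
                frontier.append(m2)
    return found
```

For instance, with the two transitions C + O2 → CO2 and CO2 + NaOH → NaHCO3, exactly three markings are reachable from the marking with one C, one O2 and one NaOH, ending with a single NaHCO3.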

Our paper adapts the concept of reachability to *open* Petri nets. Let denote the set of markings of the set. Given an open Petri net, there is a **reachability relation**

This relation holds when we can take a given marking of the inputs, feed those tokens into the Petri net, move them around using its transitions, and have them pop out and give a certain marking of the outputs, leaving no tokens behind.

For example, consider this open Petri net

Here is a marking of

We can feed these tokens into and move them around using transitions in

They can then pop out into leaving none behind:

This gives a marking of that is ‘reachable’ from the original marking of

The main result of our paper is that the map sending an open Petri net to its reachability relation extends to a ‘lax double functor’

where is a double category having open Petri nets as horizontal 1-cells and is a double category having relations as horizontal 1-cells.

I can give you a bit more detail on those double categories, and also give you a clue about what ‘lax’ means, without it becoming too stressful.

Last time I said the double category has:

• sets as objects,

• functions as vertical 1-morphisms,

• open Petri nets as horizontal 1-cells—they look like this:

• morphisms between open Petri nets as 2-morphisms—an example would be the visually obvious map from this open Petri net:

to this one:

What about This double category has

• sets as objects,

• functions as vertical 1-morphisms,

• relations as horizontal 1-cells from to and

• maps between relations as 2-morphisms. Here a **map between relations** is a square

that obeys
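The displayed condition was lost in extraction; for a square with relations $R \subseteq X \times Y$ and $S \subseteq X' \times Y'$ as its horizontal 1-cells and functions $f\colon X \to X'$ and $g\colon Y \to Y'$ as its vertical sides, it is presumably the usual one:

```latex
(x, y) \in R \implies (f(x), g(y)) \in S .
```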

So, the idea of the reachability semantics is that it maps:

• any set to the set consisting of all markings of that set.

• any function to the obvious function

(Yes, this really is a functor.)

• any open Petri net to its reachability relation

• any morphism between Petri nets to the obvious map between their reachability relations.

Especially if you draw some examples, all this seems quite reasonable and nice. But it’s important to note that is a *lax* double functor. This means that it does *not* send a composite open Petri net to the composite of the reachability relations for its two pieces. So, we do *not* have

Instead, we just have
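In symbols (the displayed formula was lost; writing the reachability semantics as $F$, a placeholder name, and the composite open Petri net as $g \circ f$):

```latex
F(g) \circ F(f) \subseteq F(g \circ f) .
```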

It’s easy to see why. Take to be this open Petri net:

and take to be this one:

Then their composite is this:

It’s easy to see that the composite of the two reachability relations is a proper subset of the reachability relation of the composite. In the composite, a token can move all the way from point 1 to point 5. But it does not do so by first moving through the first open Petri net and then moving through the second. It has to take a more complicated zig-zag path: it first leaves the first net and enters the second, then comes back into the first, and then returns to the second.

In our paper, Jade and I conjecture that we get

if we restrict the reachability semantics to a certain specific sub-double category of consisting of ‘one-way’ open Petri nets.

Finally, besides showing that

is a lax double functor, we also show that it’s symmetric monoidal. This means that the reachability semantics works as you’d expect when you run two open Petri nets ‘in parallel’.

In a way, the most important thing about our paper is that it illustrates some methods to study semantics for symmetric monoidal double categories. Kenny Courser and I will describe these methods more generally in our paper “Structured cospans.” They can be applied to timed Petri nets, colored Petri nets, and various other kinds of Petri nets. One can also develop a reachability semantics for open Petri nets that are glued together along transitions as well as places.

I hear that the company Statebox wants these and other generalizations. We aim to please—so we’d like to give it a try.

Next time I’ll wrap up this little series of posts by explaining how Petri nets give symmetric monoidal categories.


• John Baez and Jade Master, Open Petri nets.

**Abstract.** The reachability semantics for Petri nets can be studied using open Petri nets. For us an ‘open’ Petri net is one with certain places designated as inputs and outputs via a cospan of sets. We can compose open Petri nets by gluing the outputs of one to the inputs of another. Open Petri nets can be treated as morphisms of a category, which becomes symmetric monoidal under disjoint union. However, since the composite of open Petri nets is defined only up to isomorphism, it is better to treat them as morphisms of a symmetric monoidal double category. Various choices of semantics for open Petri nets can be described using symmetric monoidal double functors out of this double category. Here we describe the reachability semantics, which assigns to each open Petri net the relation saying which markings of the outputs can be obtained from a given marking of the inputs via a sequence of transitions. We show this semantics gives a symmetric monoidal lax double functor from it to the double category of relations. A key step in the proof is to treat Petri nets as presentations of symmetric monoidal categories; for this we use the work of Meseguer, Montanari, Sassone and others.

I’m excited about this, especially because our friends at Statebox are planning to use open Petri nets in their software. They’ve recently come out with a paper too:

• Fabrizio Romano Genovese and Jelle Herold, Executions in (semi-)integer Petri nets are compact closed categories.

Petri nets are widely used to model open systems in subjects ranging from computer science to chemistry. There are various kinds of Petri net, and various ways to make them ‘open’, and my paper with Jade only handles the simplest. But our techniques are flexible, so they can be generalized.

What’s an open Petri net? For us, it’s a thing like this:

The yellow circles are called ‘places’ (or in chemistry, ‘species’). The aqua rectangles are called ‘transitions’ (or in chemistry, ‘reactions’). There can in general be lots of places and lots of transitions. The bold arrows from places to transitions and from transitions to places complete the structure of a Petri net. There are also arbitrary functions from sets and into the set of places. This makes our Petri net into an ‘open’ Petri net.

We can think of open Petri nets as morphisms between finite sets. There’s a way to compose them! Suppose we have an open Petri net from to where now I’ve given names to the points in these sets:

We write this as for short, where the funky arrow reminds us this isn’t a function between sets. Given another open Petri net for example this:

the first step in composing and is to put the pictures together:

At this point, if we ignore the sets we have a new Petri net whose set of places is the disjoint union of those for and

The second step is to identify a place of with a place of whenever both are images of the same point in . We can then stop drawing everything involving and get an open Petri net which looks like this:

Formalizing this simple construction leads us into a bit of higher category theory. The process of taking the disjoint union of two sets of places and then quotienting by an equivalence relation is a pushout. Pushouts are defined only up to canonical isomorphism: for example, the place labeled in the last diagram above could equally well have been labeled differently. This is why, to get a category with composition strictly associative, we need to use *isomorphism classes* of open Petri nets as morphisms. But there are advantages to avoiding this and working with open Petri nets themselves. Basically, it’s better to work with things than mere isomorphism classes of things! If we do this, we obtain not a category but a bicategory with open Petri nets as morphisms.
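The pushout step can be sketched in Python with a union-find; `glue_places` and its arguments are hypothetical names, and transitions are omitted for brevity:

```python
def glue_places(places1, places2, out1, in2):
    """Pushout of sets via a union-find: form the disjoint union of the
    two place sets, then identify out1[y] (a place of the first net)
    with in2[y] (a place of the second) for each y in the shared set."""
    parent = {}
    def find(x):
        while parent.get(x, x) != x:
            x = parent[x]
        return x
    disjoint = [('L', p) for p in places1] + [('R', p) for p in places2]
    for y in out1:
        a, b = find(('L', out1[y])), find(('R', in2[y]))
        if a != b:
            parent[a] = b
    classes = {}
    for x in disjoint:
        classes.setdefault(find(x), set()).add(x)
    return list(classes.values())  # places of the composite net
```

Because the result is a set of equivalence classes, any renaming of representatives gives an isomorphic answer, which is exactly why composition is only associative up to canonical isomorphism.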

However, this bicategory is equipped with more structure. Besides composing open Petri nets, we can also ‘tensor’ them via disjoint union: this describes Petri nets being run in parallel rather than in series. The result is a symmetric monoidal bicategory. Unfortunately, the axioms for a symmetric monoidal bicategory are cumbersome to check directly. Double categories turn out to be more convenient.

Double categories were introduced in the 1960s by Charles Ehresmann. More recently they have found their way into applied mathematics. They have been used to study various things, including open dynamical systems:

• Eugene Lerman and David Spivak, An algebra of open continuous time dynamical systems and networks.

open electrical circuits and chemical reaction networks:

• Kenny Courser, A bicategory of decorated cospans, *Theory and Applications of Categories* **32** (2017), 995–1027.

open discrete-time Markov chains:

• Florence Clerc, Harrison Humphrey and P. Panangaden, Bicategories of Markov processes, in *Models, Algorithms, Logics and Tools*, Lecture Notes in Computer Science **10460**, Springer, Berlin, 2017, pp. 112–124.

and coarse-graining for open continuous-time Markov chains:

• John Baez and Kenny Courser, Coarse-graining open Markov processes. (Blog article here.)

As noted by Shulman, the easiest way to get a symmetric monoidal bicategory is often to first construct a symmetric monoidal double category:

• Mike Shulman, Constructing symmetric monoidal bicategories.

The theory of ‘structured cospans’ gives a systematic way to build symmetric monoidal double categories—Kenny Courser and I are writing a paper on this—and Jade and I use this to construct the symmetric monoidal double category of open Petri nets.

A 2-morphism in a double category can be drawn as a square like this:

We call and ‘objects’, and ‘vertical 1-morphisms’, and ‘horizontal 1-cells’, and a ‘2-morphism’. We can compose vertical 1-morphisms to get new vertical 1-morphisms and compose horizontal 1-cells to get new horizontal 1-cells. We can compose the 2-morphisms in two ways: horizontally and vertically. (This is just a quick sketch of the ideas, not the full definition.)

In our paper, Jade and I start by constructing a symmetric monoidal double category with:

• sets as objects,

• functions as vertical 1-morphisms,

• open Petri nets as horizontal 1-cells,

• morphisms between open Petri nets as 2-morphisms.

(Since composition of horizontal 1-cells is associative only up to an invertible 2-morphism, this is technically a pseudo double category.)

What are the morphisms between open Petri nets like? A simple example may help give a feel for this. There is a morphism from this open Petri net:

to this one:

mapping both primed and unprimed symbols to unprimed ones. This describes a process of ‘simplifying’ an open Petri net. There are also morphisms that include simple open Petri nets in more complicated ones, etc.

This is just the start. Our real goal is to study the *semantics* of open Petri nets: that is, how they actually describe processes! And for that, we need to think about the free symmetric monoidal category on a Petri net. You can read more about those things in Part 2 and Part 3 of this series.


I’ll be speaking at a conference celebrating the centenary of Emmy Noether’s work connecting symmetries and conservation laws:

• The Philosophy and Physics of Noether’s Theorems, 5-6 October 2018, Fischer Hall, 1-4 Suffolk Street, London, UK. Organized by Bryan W. Roberts (LSE) and Nicholas Teh (Notre Dame).

They write:

2018 brings with it the centenary of a major milestone in mathematical physics: the publication of Amalie (“Emmy”) Noether’s theorems relating symmetry and physical quantities, which continue to be a font of inspiration for “symmetry arguments” in physics, and for the interpretation of symmetry within philosophy.

In order to celebrate Noether’s legacy, the University of Notre Dame and the LSE Centre for Philosophy of Natural and Social Sciences are co-organizing a conference that will bring together leading mathematicians, physicists, and philosophers of physics in order to discuss the enduring impact of Noether’s work.

There’s a registration fee, which you can see on the conference website, along with a map showing the conference location, a schedule of the talks, and other useful stuff.

Here are the speakers:

• John Baez (UC Riverside)

• Jeremy Butterfield (Cambridge)

• Anne-Christine Davis (Cambridge)

• Sebastian De Haro (Amsterdam and Cambridge)

• Ruth Gregory (Durham)

• Yvette Kosmann-Schwarzbach (Paris)

• Peter Olver (UMN)

• Sabrina Pasterski (Harvard)

• Oliver Pooley (Oxford)

• Tudor Ratiu (Shanghai Jiao Tong and Geneva)

• Kasia Rejzner (York)

• Robert Spekkens (Perimeter)

I’m looking forward to analyzing the basic assumptions behind various generalizations of Noether’s first theorem, the one that shows symmetries of a Lagrangian give conserved quantities. Having generalized it to Markov processes, I know there’s a lot more to what’s going on here than just the wonders of Lagrangian mechanics:

• John Baez and Brendan Fong, A Noether theorem for Markov processes, *J. Math. Phys.* **54** (2013), 013301. (Blog article here.)

I’ve been trying to get to the bottom of it ever since.
