Nonstandard Integers as Complex Numbers

3 March, 2018

 

I just read something cool:

• Joel David Hamkins, Nonstandard models of arithmetic arise in the complex numbers, 3 March 2018.

Let me try to explain it in a simplified way. I think all cool math should be known more widely than it is. Getting this to happen requires a lot of explanations at different levels.

Here goes:

The Peano axioms are a nice set of axioms describing the natural numbers. But thanks to Gödel’s incompleteness theorem, these axioms can’t completely nail down the structure of the natural numbers. So, there are lots of different ‘models’ of Peano arithmetic.

These are often called ‘nonstandard’ models. If you take a model of Peano arithmetic—say, your favorite ‘standard’ model—you can get other models by throwing in extra natural numbers, larger than all the standard ones. These nonstandard models can be countable or uncountable. For more, try this:

Nonstandard models of arithmetic, Wikipedia.

Starting with any of these models you can define integers in the usual way (as differences of natural numbers), and then rational numbers (as ratios of integers). So, there are lots of nonstandard versions of the rational numbers. Any one of these will be a field: you can add, subtract, multiply and divide your nonstandard rationals, in ways that obey all the usual basic rules.

Now for the cool part: if your nonstandard model of the natural numbers is small enough, your field of nonstandard rational numbers can be found somewhere in the standard field of complex numbers!

In other words, your nonstandard rationals are a subfield of the usual complex numbers: a subset that’s closed under addition, subtraction, multiplication and division by things that aren’t zero.

This is counterintuitive at first, because we tend to think of nonstandard models of Peano arithmetic as spooky and elusive things, while we tend to think of the complex numbers as well-understood.

However, the field of complex numbers is actually very large, and it has room for many spooky and elusive things inside it. This is well-known to experts, and we’re just seeing more evidence of that.

I said that all this works if your nonstandard model of the natural numbers is small enough. But what is “small enough”? Just the obvious thing: your nonstandard model needs to have a cardinality smaller than that of the complex numbers. So if it’s countable, that’s definitely small enough.

This fact was recently noticed by Alfred Dolich at a pub after a logic seminar at the City University of New York. The proof is very easy if you know this result: any field of characteristic zero whose cardinality is less than or equal to that of the continuum is isomorphic to some subfield of the complex numbers. So, unsurprisingly, it turned out to have been repeatedly discovered before.

And the result I just mentioned follows from this: any two algebraically closed fields of characteristic zero that have the same uncountable cardinality must be isomorphic. So, say someone hands you a field F of characteristic zero whose cardinality is smaller than that of the continuum. You can take its algebraic closure by throwing in roots to all polynomials, and its cardinality won’t get bigger. Then you can throw in even more elements, if necessary, to get a field whose cardinality is that of the continuum. The resulting field must be isomorphic to the complex numbers. So, F is isomorphic to a subfield of the complex numbers.
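Schematically, the argument is just a chain of embeddings:

F \hookrightarrow \overline{F} \hookrightarrow K \cong \mathbb{C}, \qquad |F| \le |\overline{F}| \le |K| = 2^{\aleph_0}

Here \overline{F} is the algebraic closure of F, and K is some algebraically closed field of characteristic zero containing \overline{F} whose cardinality is exactly that of the continuum; the isomorphism K \cong \mathbb{C} is the categoricity fact just mentioned.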

To round this off, I should say a bit about why nonstandard models of Peano arithmetic are considered spooky and elusive. Tennenbaum’s theorem says that for any countable non-standard model of Peano arithmetic there is no way to code the elements of the model as standard natural numbers such that either the addition or multiplication operation of the model is a computable function on the codes.

We can, however, say some things about what these countable nonstandard models are like as ordered sets. They can be linearly ordered in a way compatible with addition and multiplication. And then they consist of one copy of the standard natural numbers, followed by a lot of copies of the standard integers, which are packed together in a dense way: that is, for any two distinct copies, there’s another distinct copy between them. Furthermore, for any of these copies, there’s another copy before it, and another after it.
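In order-theoretic shorthand: if we write \omega for the order type of the standard natural numbers, \omega^* + \omega for the order type of the integers, and \eta for the order type of the rationals, then every countable nonstandard model of Peano arithmetic has order type

\omega + (\omega^* + \omega) \cdot \eta

where multiplying by \eta on the right means taking a densely ordered family of copies of the integers, with no first or last copy, just as described in words above.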

I should also say what’s good about algebraically closed fields of characteristic zero: they are uncountably categorical. In other words, any two algebraically closed fields of characteristic zero with the same uncountable cardinality must be isomorphic. (This is not true for countable models: it’s easy to find non-isomorphic countable algebraically closed fields of characteristic zero. They are not spooky and elusive.)

So, any algebraically closed field of characteristic zero whose cardinality is that of the continuum is isomorphic to the complex numbers!

For more on the logic of complex numbers, written at about the same level as this, try this post of mine:

The logic of real and complex numbers, Azimuth 8 September 2014.


Cartesian Bicategories

1 March, 2018

Two students in the Applied Category Theory 2018 school have blogged about a classic paper in category theory:

• Daniel Cicala and Jules Hedges, Cartesian bicategories, The n-Category Café, 19 February 2018.

Jules Hedges is a postdoc in the computer science department at Oxford who is applying category theory to game theory and economics. Daniel Cicala is a grad student working with me on a compositional approach to graph rewriting, which is about stuff like this:

This picture shows four ‘open graphs’: graphs with inputs and outputs. The vertices are labelled with operations. The top of the picture shows a ‘rewrite rule’ where one open graph is turned into another: the operation of multiplying by 2 is replaced by the operation of adding something to itself. The bottom of the picture shows one way we can ‘apply’ this rule: this takes us from the open graph at bottom left to the open graph at bottom right.

So, we can use graph rewriting to think about ways to transform a computer program into another, perhaps simpler, computer program that does the same thing.
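Here is a toy illustration of that idea in Python, working on expression trees rather than open graphs, and with names I just made up; it only shows the flavor of applying a rewrite rule, not the bicategorical machinery.

# A toy rewriter: replace "multiply by 2" with "add the thing to itself".
# Expressions are nested tuples: ('mul', a, b), ('add', a, b), ('const', 2), ('var', 'x').

def rewrite_double(expr):
    """Apply the rule 2 * e => e + e wherever it matches, bottom-up."""
    if not isinstance(expr, tuple):
        return expr
    head, *args = expr
    expr = (head, *[rewrite_double(a) for a in args])   # rewrite subterms first
    if expr[0] == 'mul':                                # then try the rule at the root
        _, a, b = expr
        if a == ('const', 2):
            return ('add', b, b)
        if b == ('const', 2):
            return ('add', a, a)
    return expr

example = ('mul', ('const', 2), ('add', ('var', 'x'), ('const', 1)))
print(rewrite_double(example))
# ('add', ('add', ('var', 'x'), ('const', 1)), ('add', ('var', 'x'), ('const', 1)))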

How do we formalize this?

A computer program wants to be a morphism, since it’s a process that turns some input into some output. Rewriting wants to be a 2-morphism, since it’s a ‘meta-process’ that turns some program into some other program. So, there should be some bicategory with computer programs (or labelled open graphs!) as morphisms and rewrites as 2-morphisms. In fact there should be a bunch of such bicategories, since there are a lot of details that one can tweak.

Together with my student Kenny Courser, Daniel has been investigating these bicategories:

• Daniel Cicala, Spans of cospans, Theory and Applications of Categories 33 (2018), 131–147.

Abstract. We discuss the notion of a span of cospans and define, for them, horizontal and vertical composition. These compositions satisfy the interchange law if working in a topos C and if the span legs are monic. A bicategory is then constructed from C-objects, C-cospans, and doubly monic spans of C-cospans. The primary motivation for this construction is an application to graph rewriting.

• Daniel Cicala, Spans of cospans in a topos, Theory and Applications of Categories 33 (2018), 1–22.

Abstract. For a topos T, there is a bicategory MonicSp(Csp(T)) whose objects are those of T, morphisms are cospans in T, and 2-morphisms are isomorphism classes of monic spans of cospans in T. Using a result of Shulman, we prove that MonicSp(Csp(T)) is symmetric monoidal, and moreover, that it is compact closed in the sense of Stay. We provide an application which illustrates how to encode double pushout rewrite rules as 2-morphisms inside a compact closed sub-bicategory of MonicSp(Csp(Graph)).

This stuff sounds abstract and esoteric when they talk about it, but it’s really all about things like the picture above—and it’s an important part of network theory!

Recently Daniel Cicala has noticed that some of the bicategories he’s getting are ‘cartesian bicategories’ in the sense of this paper:

• Aurelio Carboni and Robert F. C. Walters, Cartesian bicategories I, Journal of Pure and Applied Algebra 49 (1987), 11–32.

And that’s the paper he’s blogging about now with Jules Hedges!


Insect Population Crash

25 February, 2018

Scary news from Australia:

• Marc Rigby, Insect population decline leaves Australian scientists scratching for solutions, ABC Far North, 23 February 2018.

I’ll quote the start:

A global crash in insect populations has found its way to Australia, with entomologists across the country reporting lower than average numbers of wild insects.

University of Sydney entomologist Dr. Cameron Webb said researchers around the world widely acknowledge that insect populations are in decline, but are at a loss to determine the cause.

“On one hand it might be the widespread use of insecticides, on the other hand it might be urbanisation and the fact that we’re eliminating some of the plants where it’s really critical that these insects complete their development,” Dr Webb said.

“Add in to the mix climate change and sea level rise and it’s incredibly difficult to predict exactly what it is. It’s left me dumbfounded.”

Entomologist and owner of the Australian Insect Farm, near Innisfail in far north Queensland, Jack Hasenpusch is usually able to collect swarms of wild insects at this time of year.

“I’ve been wondering for the last few years why some of the insects have been dropping off and put it down to lack of rainfall,” Mr. Hasenpusch said.

“This year has really taken the cake with the lack of insects, it’s left me dumbfounded, I can’t figure out what’s going on.”

Mr Hasenpusch said entomologists he had spoken to from Sydney, Brisbane, Perth and even as far away as New Caledonia and Italy all had similar stories.

The Australian Butterfly Sanctuary in Kuranda, west of Cairns, has had difficulty breeding the far north’s iconic Ulysses butterfly for more than two years.

“We’ve had [the problem] checked by scientists, the University of Queensland was involved, Biosecurity Queensland was involved but so far we haven’t found anything unusual in the bodies [of caterpillars] that didn’t survive,” said breeding laboratory supervisor Tina Kupke.

“We’ve had some short successes but always failed in the second generation.”

Ms. Kupke said the problem was not confined to far north Queensland, or even Australia. “Some of our pupae go overseas from some of our breeders here and they’ve all had the same problem,” she said. “And the Melbourne Zoo has been trying for quite a while with the same problems.”

Limited lifecycle prefaces population plummet

Dr. Webb, who primarily researches mosquitoes, said numbers were also in decline across New South Wales this year, which was indicative of the situation in other insect populations.

“We’ve had a really strange summer; it’s been very dry, sometimes it’s been brutally hot but sometimes it’s been cooler than average,” he said.

“Mosquito populations, much like a lot of other insects, rely on the combination of water, humidity and temperature to complete their lifecycle. When you mix around any one of those three components you can really change the local population dynamics.”

All this reminds me of a much more detailed study showing a dramatic insect population decline in Germany over a much longer time period:

• Gretchen Vogel, Where have all the insects gone?, Science, 10 May 2017.

I’ll just quote a bit of this article:

Now, a new set of long-term data is coming to light, this time from a dedicated group of mostly amateur entomologists who have tracked insect abundance at more than 100 nature reserves in western Europe since the 1980s.

Over that time the group, the Krefeld Entomological Society, has seen the yearly insect catches fluctuate, as expected. But in 2013 they spotted something alarming. When they returned to one of their earliest trapping sites from 1989, the total mass of their catch had fallen by nearly 80%. Perhaps it was a particularly bad year, they thought, so they set up the traps again in 2014. The numbers were just as low. Through more direct comparisons, the group—which had preserved thousands of samples over 3 decades—found dramatic declines across more than a dozen other sites.

It also mentions a similar phenomenon in Scotland:

Since 1968, scientists at Rothamsted Research, an agricultural research center in Harpenden, U.K., have operated a system of suction traps—12-meter-long suction tubes pointing skyward. Set up in fields to monitor agricultural pests, the traps capture all manner of insects that happen to fly over them; they are “effectively upside-down Hoovers running 24/7, continually sampling the air for migrating insects,” says James Bell, who heads the Rothamsted Insect Survey.

Between 1970 and 2002, the biomass caught in the traps in southern England did not decline significantly. Catches in southern Scotland, however, declined by more than two-thirds during the same period. Bell notes that overall numbers in Scotland were much higher at the start of the study. “It might be that much of the [insect] abundance in southern England had already been lost” by 1970, he says, after the dramatic postwar changes in agriculture and land use.

Here’s the actual research paper:

• Caspar A. Hallmann, Martin Sorg, Eelke Jongejans, Henk Siepel, Nick Hofland, Heinz Schwan, Werner Stenmans, Andreas Müller, Hubert Sumser, Thomas Hörren, Dave Goulson and Hans de Kroon, More than 75 percent decline over 27 years in total flying insect biomass in protected areas, PLOS One, 18 October 2017.

Abstract. Global declines in insects have sparked wide interest among scientists, politicians, and the general public. Loss of insect diversity and abundance is expected to provoke cascading effects on food webs and to jeopardize ecosystem services. Our understanding of the extent and underlying causes of this decline is based on the abundance of single species or taxonomic groups only, rather than changes in insect biomass which is more relevant for ecological functioning. Here, we used a standardized protocol to measure total insect biomass using Malaise traps, deployed over 27 years in 63 nature protection areas in Germany (96 unique location-year combinations) to infer on the status and trend of local entomofauna. Our analysis estimates a seasonal decline of 76%, and mid-summer decline of 82% in flying insect biomass over the 27 years of study. We show that this decline is apparent regardless of habitat type, while changes in weather, land use, and habitat characteristics cannot explain this overall decline. This yet unrecognized loss of insect biomass must be taken into account in evaluating declines in abundance of species depending on insects as a food source, and ecosystem functioning in the European landscape.

It seems we are heading into strange times.


A Double Conference

23 February, 2018

Here’s a cool way to cut carbon emissions: a double conference. The idea is to have a conference in two faraway locations connected by live video stream, to reduce the amount of long-distance travel!

Even better, it’s about a great subject:

• Higher algebra and mathematical physics, August 13–17, 2018, Perimeter Institute, Waterloo, Canada, and Max Planck Institute for Mathematics, Bonn, Germany.

Here’s the idea:

“Higher algebra” has become important throughout mathematics, physics, and mathematical physics, and this conference will bring together leading experts in higher algebra and its mathematical physics applications. In physics, the term “algebra” is used quite broadly: any time you can take two operators or fields, multiply them, and write the answer in some standard form, a physicist will be happy to call this an “algebra”. “Higher algebra” is characterized by the appearance of a hierarchy of multilinear operations (e.g. A-infinity and L-infinity algebras). These structures can be higher categorical in nature (e.g. derived categories, cohomology theories), and can involve mixtures of operations and co-operations (Hopf algebras, Frobenius algebras, etc.). Some of these notions are purely algebraic (e.g. algebra objects in a category), while others are quite geometric (e.g. shifted symplectic structures).

An early manifestation of higher algebra in high-energy physics was supersymmetry. Supersymmetry makes quantum field theory richer and thus more complicated, but at the same time many aspects become more tractable and many problems become exactly solvable. Since then, higher algebra has made numerous appearances in mathematical physics, both high- and low-energy.

Participation is limited. Some financial support is available for early-career mathematicians. For more information and to apply, please visit the conference website of the institute closer to you:

North America: http://www.perimeterinstitute.ca/HAMP
Europe: http://www.mpim-bonn.mpg.de/HAMP

If you have any questions, please write to double.conference.2018@gmail.com.

One of the organizers, Aaron Mazel-Gee, told me:

We are also interested in spreading the idea of double conferences more generally: we’re hoping that our own event’s success inspires other academic communities to organize their own double conferences. We’re hoping to eventually compile a sort of handbook to streamline the process for others, so that they can learn from our own experiences regarding the various unique challenges that organizing such an event poses. Anyways, all of this is just to say that I would be happy for you to publicize this event anywhere that it might reach these broader audiences.

So, if you’re interested in having a double conference, please contact the organizers of this one for tips on how to do it! I’m sure they’ll have better advice after they’ve actually done it. I’ve found that the technical details really matter for these things: it can be very frustrating when they don’t work correctly. Avoiding such problems requires testing everything ahead of time—under conditions that exactly match what you’re planning to do!


Complex Adaptive System Design (Part 7)

19 February, 2018

In March, I’ll be talking at Spencer Breiner‘s workshop on Applied Category Theory at the National Institute of Standards and Technology. I’ll be giving a joint talk with John Foley about our work using operads to design networks. This work is part of the Complex Adaptive System Composition and Design Environment project being done by Metron Scientific Solutions and managed by John Paschkewitz at DARPA.

I’ve written about this work before:

• Complex adaptive systems design: Part 1, Part 2, Part 3, Part 4, Part 5, Part 6.

But we’ve done a lot more, and my blog articles are having trouble keeping up! So I’d like to sketch out the big picture as it stands today.

If I had to summarize, I’d say we’ve developed a formalism for step-by-step compositional design and tasking, using commitment networks. But this takes a while to explain.

Here’s a very simple example of a commitment network:

It has four nodes, which represent agents: a port, a helicopter, a UAV (an unmanned aerial vehicle, or drone) and a target. The edges between these nodes describe relationships between these agents. Some of these relationships are ‘commitments’. For example, the edges labelled ‘SIR’ say that one agent should ‘search, intervene and rescue’ the other.
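To make this concrete, here is one entirely hypothetical way to encode such a commitment network in Python, as typed agents plus labelled directed edges. The picture is not reproduced here, so the particular edges below are just for illustration; only the ‘SIR’ reading comes from the text above.

# A hypothetical encoding of a small commitment network: agents have types,
# and each edge (source, label, target) records a relationship or commitment.

agents = {
    'port':       'Port',
    'helicopter': 'Helicopter',
    'uav':        'UAV',
    'target':     'Target',
}

# 'SIR' = the source agent is committed to search, intervene and rescue the target.
commitments = {
    ('helicopter', 'SIR', 'target'),
    ('uav',        'SIR', 'target'),
}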

Our framework for dealing with commitment networks has some special features. It uses operads, but this isn’t really saying much. An ‘operad’ is a bunch of ways of sticking things together. An ‘algebra’ of the operad gives a particular collection of these things, and says what we get when we stick them together. These concepts are extremely general, so there’s a huge diversity of operads, each with a huge diversity of algebras. To say one is using operads to solve a problem is a bit like saying one is using math. What matters more is the specific kind of operad one is using, and how one is using it.

For our work, we needed to develop a new class of operads called network operads, which are described here:

• John Baez, John Foley, Joseph Moeller and Blake Pollard, Network models.

In this paper we mainly discuss communication networks. Subsequently we’ve been working on a special class of network operads that describe how to build commitment networks.

Here are some of the key ideas:

• Using network operads we can build bigger networks from smaller ones by overlaying them. David Spivak’s operad of wiring diagrams only lets us ‘wire together’ smaller networks to form bigger ones:

Here networks X1, X2 and X3 are being wired together to form Y.

Network operads also let us wire together networks, but in addition they let us take one network:

and overlay another:

to create a larger network:

This is a new methodology for designing systems. We’re all used to building systems by wiring together subsystems: anyone who has a home stereo system has done this. But overlaying systems lets us do more. For example, we can take two plans of action involving the same collection of agents, and overlay them to get a new plan. We’ve all done this, too: you tell a bunch of people to do things… and then tell the same people, or an overlapping set of people, to do some other things. But lots of problems can arise if you aren’t careful. A mathematically principled approach can avoid some of these problems. (There’s a small toy sketch of this overlay operation right after this list of key ideas.)

• The nodes of our networks represent agents of various types. The edges represent various relationships between agents. For example, they can represent communication channels. But more interestingly, they can represent commitments. For example, we can have an edge from A to B saying that agent A has to go rescue agent B. We call this kind of network a commitment network.

• By overlaying commitment networks, we can not only build systems out of smaller pieces but also build complicated plans by overlaying smaller pieces of plans. Since ‘tasking’ means telling a system what to do, we call this compositional tasking.

• If one isn’t careful, overlaying commitment networks can produce conflicts. Suppose we have a network with an edge saying that agent A has to rescue agent B. On top of this we overlay a network with an edge saying that A has to rescue agent C. If A can’t do both of these tasks at once, what should A do? There are various choices. We need to build a specific choice into the framework, so we can freely overlay commitment networks and get a well-defined result that doesn’t overburden the agents involved. We call this automatic deconflicting.

• Our approach to automatic deconflicting uses an idea developed by the famous category theorist Bill Lawvere: graphic monoids. I’ll explain these later, along with some of their side-benefits.

• Network operads should let us do step-by-step compositional tasking. In other words, they should let us partially automate the process of tasking networks of agents, both

1) compositionally: tasking smaller networks and then sticking them together, e.g. by overlaying them, to get larger networks,

and

2) in a step-by-step way, starting at a low level of detail and then increasing the amount of detail.

To do this we need not just operads but their algebras.

• Remember, a network operad is a bunch of ways to stick together networks of some kind, e.g. by overlaying them. An algebra of this operad specifies a particular collection of networks of this kind, and says what we actually get when we stick them together.

So, a network operad can have one algebra in which things are described in a bare-bones, simplified way, and another algebra in which things are described in more detail. Indeed it will typically have many algebras, corresponding to many levels of detail, but for simplicity let’s just think about two.

When we have a ‘less detailed’ algebra A and a ‘more detailed’ algebra A', they will typically be related by a map

f \colon A' \to A

which ‘forgets the extra details’. This sort of map is called a homomorphism of algebras. We give examples in our paper Network models.

But what we usually want to do, when designing a system, is not forget extra detail, but rather add extra detail to a rough specification. There is not always a systematic way to do this. If there is, then we may have a homomorphism

g \colon A \to A'

going back the other way. This lets us automate the process of filling in the details. But we can’t usually count on being able to do this. So, often we may have to start with an element of A and search for an element of A' that is mapped to it by f \colon A' \to A. And typically we want this element to be optimal, or at least ‘good enough’, according to some chosen criteria. Expressing this idea formally helps us figure out how to automate the search. John Foley, in particular, has been working on this.
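Since the two most concrete ingredients above are ‘overlaying networks’ and ‘searching for a more detailed plan lying over a coarse one’, here are two tiny Python sketches of them. Everything in them (edge labels, costs, function names) is made up for illustration; this is not the machinery of our papers.

# Sketch 1: overlaying.  Encode a network over a fixed set of agents as a set of
# labelled edges (source, label, target); in the simplest network models,
# overlaying two networks is just the union of their edge sets.

def overlay(network1, network2):
    """Overlay two networks on the same agents: keep every edge of both."""
    return network1 | network2

plan_a = {('helicopter', 'SIR', 'target')}
plan_b = {('uav', 'SIR', 'target'), ('uav', 'communicate', 'helicopter')}

print(overlay(plan_a, plan_b))               # all three commitments survive
print(overlay(plan_a, plan_a) == plan_a)     # True: repeating a plan adds nothing

The second print statement is the idempotence that will reappear below when we get to graphic monoids. The next sketch is the ‘search the fiber of the forget-the-details map f for a good refinement’ step, done by brute force over a tiny search space.

from itertools import product

# A coarse plan just lists required tasks; a detailed plan also assigns an agent
# to each task.  The map f forgets the assignments.  Costs are invented numbers.

coarse_plan = ('rescue_target', 'patrol_harbor')
available_agents = ('helicopter', 'uav')
cost = {('rescue_target', 'helicopter'): 5, ('rescue_target', 'uav'): 9,
        ('patrol_harbor', 'helicopter'): 7, ('patrol_harbor', 'uav'): 2}

def f(detailed_plan):
    """Forget the extra detail: keep only the task names."""
    return tuple(task for task, _agent in detailed_plan)

def refine(plan):
    """Search f^{-1}(plan) for a detailed plan of least total cost."""
    candidates = [tuple(zip(plan, assignment))
                  for assignment in product(available_agents, repeat=len(plan))]
    candidates = [d for d in candidates if f(d) == plan]
    return min(candidates, key=lambda d: sum(cost[pair] for pair in d))

print(refine(coarse_plan))
# (('rescue_target', 'helicopter'), ('patrol_harbor', 'uav'))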

That’s an overview of our ideas.

Next, for the mathematically inclined, I want to give a few more details on one of the new elements not mentioned in our Network models paper: ‘graphic monoids’.

Graphic monoids

In our paper Network models we explain how the ‘overlay’ operation makes the collection of networks involving a given set of agents into a monoid. A monoid is a set M with a product that is associative and has an identity element 1:

(xy)z = x(yz)
1 x = x = x 1

In our application, this product is overlaying two networks.

A graphic monoid is one in which the graphic identity

x y x = x y

holds for all x,y.

To understand this identity, let us think of the elements of the monoid as “commitments”. The product x y means “first committing to do x, then committing to do y”. The graphic identity says that if we first commit to do x, then y, and then x again, it’s the same as first committing to do x and then y. Committing to do x again doesn’t change anything!

In particular, in any graphic monoid we have

xx = x 1 x = x 1 = x

so making the same commitment twice is the same as making it once. Mathematically we say every element x of a graphic monoid is idempotent:

x^2 = x

A commutative monoid obeying this law x^2 = x automatically obeys the graphic identity, since then

x y x = x^2 y = x y

But for a noncommutative monoid, the graphic identity is stronger than x^2 = x. It says that after committing to x, no matter what intervening commitments one might have made, committing to x again has no further effect. In other words: the intervening commitments did not undo the original commitment, so making the original commitment a second time has no effect! This captures the idea of how promises should behave.
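Here is a toy graphic monoid one can actually compute with, just for illustration (it is not the monoid of commitment networks from our papers): ‘commitment lists’, multiplied by appending only the commitments that have not been made already.

# Elements: tuples of distinct commitments.  The empty tuple is the identity.

def mult(x, y):
    """Append the commitments in y that are not already in x."""
    return x + tuple(c for c in y if c not in x)

x = ('rescue_B',)
y = ('patrol_harbor', 'rescue_B')

print(mult(x, y))                          # ('rescue_B', 'patrol_harbor')
print(mult(mult(x, y), x) == mult(x, y))   # True: the graphic identity x y x = x y
print(mult(x, x) == x)                     # True: idempotence x^2 = x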

As I said, for any network model, the set of all networks involving a fixed set of agents is a monoid. In a commitment network model, this monoid is required to be a graphic monoid. Joseph Moeller is writing a paper that shows how to construct a large class of commitment network models. We will follow this with a paper illustrating how to use these in compositional tasking.

For now, let me just mention a side-benefit. In any graphic monoid we can define a relation x \le y by

x \le y \; \iff \; x a = y for some a

This makes the graphic monoid into a partially ordered set, meaning that these properties hold:

reflexivity: x \le x

transitivity: x \le y , y \le z \; \implies \; x \le z

antisymmetry: x \le y, \; y \le x \; \implies \; x = y

(Antisymmetry is the one place the graphic identity really gets used: if x a = y and y b = x, then xy = x(xa) = xa = y and likewise yx = x, so x = x(yx) = xyx = xy = y.)

In the context of commitment networks, x \le y means that starting from x we can reach y by making some further commitment a: that is, x a = y for some a. So, as we ‘task’ a collection of agents by giving them more and more commitments, we move up in this partial order.
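Continuing the toy commitment-list monoid from above (again purely illustrative), this order is easy to explore by brute force:

from itertools import permutations

# The toy graphic monoid again: repetition-free commitment lists, multiplied by
# appending only the commitments not already made.
def mult(x, y):
    return x + tuple(c for c in y if c not in x)

def leq(x, y, alphabet=('rescue_B', 'patrol_harbor', 'refuel')):
    """x <= y iff x a = y for some further commitment list a (brute-force search
    over repetition-free lists drawn from a small fixed alphabet)."""
    return any(mult(x, a) == y
               for r in range(len(alphabet) + 1)
               for a in permutations(alphabet, r))

print(leq(('rescue_B',), ('rescue_B', 'patrol_harbor')))   # True: we can always task further
print(leq(('rescue_B', 'patrol_harbor'), ('rescue_B',)))   # False: commitments can't be undone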


Applied Category Theory at NIST (Part 1)

17 February, 2018

I think it’s really cool how applied category theory is catching on. My former student Blake Pollard is working at the National Institute of Standards and Technology on applications of category theory to electrical engineering. He’s working with Spencer Breiner… and now Breiner is helping run a workshop on this stuff:

• Applied Category Theory: Bridging Theory & Practice, March 15–16, 2018, NIST, Gaithersburg, Maryland, USA. Organized by Spencer Breiner and Eswaran Subrahmanian.

It’s by invitation only, but I can’t resist mentioning its existence. Here’s the idea:

What: The Information Technology Laboratory at NIST is pleased to announce a workshop on Applied Category Theory to be held at NIST’s Gaithersburg, Maryland campus on March 15 & 16, 2018. The meeting will focus on practical avenues for introducing methods from category theory into real-world applications, with an emphasis on examples rather than theorems.

Who: The workshop aims to bring together two distinct groups. First, category theorists interested in pursuing applications outside of the usual mathematical fields. Second, domain experts and research managers from industry, government, science and engineering who have in mind potential domain applications for categorical methods.

Intended Outcomes: A proposed landscape of potential CT applications and the infrastructure needed to realize them, together with a 5-10 year roadmap for developing the field of applied category theory. This should include perspectives from industry, academia and government as well as major milestones, potential funding sources, avenues for technology transfer and necessary improvements in tool support and methodology. Exploratory collaborations between category theorists and domain experts. We will ask that each group come prepared to meet the other side. Mathematicians should be prepared with concrete examples that demonstrate practical applications of CT in an intuitive way. Domain experts should bring to the table specific problems to which they can devote time and/or funding as well as some reasons why they think CT might be relevant to this application.

Invited Speakers:
John Baez (University of California at Riverside) and John Foley (Metron Scientific Solutions).
Bob Coecke (University of Oxford).
Dusko Pavlovic (University of Hawaii).

Some other likely participants include Chris Boner (Metron), Arquimedes Canedo (Siemens at Princeton), Stephane Dugowson (Supméca), William Edmonson (North Carolina A&T), Brendan Fong (MIT), Mark Fuge (University of Maryland), Jack Gray (Penumbra), Helle Hansen (Delft), Jelle Herold (Statebox), Marisa Hughes (Johns Hopkins), Steve Huntsman (BAE Systems), Patrick Johnson (Dassault Systèmes), Al Jones (NIST), Cliff Joslyn (Pacific Northwest National Laboratory), Richard Malek (NSF), Tom Mifflin (Metron), Ira Monarch (Carnegie Mellon), John Paschkewitz (DARPA), Evan Patterson (Stanford), Blake Pollard (NIST), Emilie Purvine (Pacific Northwest National Laboratory), Mark Raugas (Pacific Northwest National Laboratory), Bill Regli (University of Maryland), Michael Robinson (American U.), Alberto Speranzon (Honeywell Aerospace), David Spivak (MIT), Eswaran Subrahmanian (Carnegie Mellon), Jamie Vicary (Birmingham and Oxford), and Ryan Wisnesky (Categorical Informatics).

A bunch of us will stay on into the weekend and talk some more. I hope we make a lot of progress—and I plan to let you know how it goes!

I’ll be giving a joint talk with John Foley about our work using operads to design networks. This work is part of the Complex Adaptive System Composition and Design Environment project being done by Metron Scientific Solutions and managed by John Paschkewitz at DARPA.


Linguistics Using Category Theory

11 February, 2018

 

Now students in the Applied Category Theory 2018 school are reading about categories applied to linguistics. Read the blog article here for more:

• Jade Master and Cory Griffith, Linguistics using category theory, The n-Category Café, 6 February 2018.

This was written by my grad student Jade Master along with Cory Griffith, an undergrad at Stanford. Jade is currently working with me on the semantics of open Petri nets.

What’s the basic idea of this linguistics and category theory stuff? I don’t know much about this, but I can say a bit.

Since category theory is great for understanding the semantics of programming languages, it makes sense to try it for human languages, even though they’re much harder. The first serious attempt I know was by Jim Lambek, who introduced pregroup grammars in 1958:

• Joachim Lambek, The mathematics of sentence structure, Amer. Math. Monthly 65 (1958), 154–170.

In this article he hid the connection to category theory. But when you start diagramming sentences or phrases using his grammar, as below, you get planar string diagrams as shown above. So it’s not surprising—if you’re in the know—that he’s secretly using monoidal categories where every object has a right dual and, separately, a left dual.
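To see the algebra in action, here is a naive Python sketch of the type-cancellation game in its pregroup form, where a basic type has a ladder of left and right adjoints; the integer in each pair records which adjoint we are looking at, and the greedy cancellation below is only meant to handle toy examples like this one, not to be a serious parser.

# Encode a simple type as (base, z): z = 0 for the type itself, z = -1 for its
# left adjoint, z = +1 for its right adjoint, and so on.
# Adjacent types (a, z), (a, z + 1) cancel, covering both a·a^r -> 1 and a^l·a -> 1.

def reduce_types(types):
    """Greedily cancel adjacent adjoint pairs until nothing more cancels."""
    types = list(types)
    changed = True
    while changed:
        changed = False
        for i in range(len(types) - 1):
            (a, z), (b, w) = types[i], types[i + 1]
            if a == b and w == z + 1:
                del types[i:i + 2]
                changed = True
                break
    return types

n, s = 'n', 's'                      # noun type and sentence type
john  = [(n, 0)]
likes = [(n, 1), (s, 0), (n, -1)]    # n^r · s · n^l: a transitive verb
mary  = [(n, 0)]

print(reduce_types(john + likes + mary))   # [('s', 0)]: the string parses as a sentence

The cancellations n·n^r → 1 and n^l·n → 1 are exactly the ‘cup’ diagrams for right and left duals.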

This fact is just barely mentioned in the Wikipedia article:

Pregroup grammar.

but it’s explained in more detail here:

• A. Preller and J. Lambek, Free compact 2-categories, Mathematical Structures in Computer Science 17 (2005), 309-340.

This stuff is hugely fun, so I’m wondering why I never looked into it before! When I talked to Lambek, who is sadly no longer with us, it was mainly about his theories relating particle physics to quaternions.

Recently Mehrnoosh Sadrzadeh and Bob Coecke have taken up Lambek’s ideas, relating them to the category of finite-dimensional vector spaces. Choosing a monoidal functor from a pregroup grammar to this category allows one to study linguistics using linear algebra! This simplifies things, perhaps a bit too much—but it makes it easy to do massive computations, which is very popular in this age of “big data” and machine learning.
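Here is the flavor of that linear-algebra version, in a deliberately tiny made-up example (the numbers mean nothing): nouns become vectors, a transitive verb becomes an order-3 tensor, and computing the meaning of ‘subject verb object’ is a tensor contraction.

import numpy as np

# A 2-dimensional toy noun space and a 1-dimensional sentence space.
john = np.array([1.0, 0.0])
mary = np.array([0.0, 1.0])

# A transitive verb lives in N (x) S (x) N, so here it is a 2 x 1 x 2 tensor.
likes = np.zeros((2, 1, 2))
likes[0, 0, 1] = 0.9    # invented 'John likes Mary' strength
likes[1, 0, 0] = 0.1    # invented 'Mary likes John' strength

def sentence_meaning(subject, verb, obj):
    """Contract the subject and object noun wires of the verb, leaving a
    vector in the sentence space."""
    return np.einsum('i,isj,j->s', subject, verb, obj)

print(sentence_meaning(john, likes, mary))   # [0.9]
print(sentence_meaning(mary, likes, john))   # [0.1]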

It also sets up a weird analogy between linguistics and quantum mechanics, which I’m a bit suspicious of. While the category of finite-dimensional vector spaces with its usual tensor product is monoidal, and has duals, it’s symmetric, so the difference between writing a word to the left of another and writing it to the right of another gets washed out! I think instead of using vector spaces one should use modules of some noncommutative Hopf algebra, or something like that. Hmm… I should talk to those folks.

To discuss this, please visit The n-Category Café, since there’s a nice conversation going on there and I don’t want to split it. There has also been a conversation on Google+, and I’ll quote some of it here, so you don’t have to run all over the internet.

Noam Zeilberger wrote:

You might have been simplifying things for the post, but a small comment anyways: what Lambek introduced in his original paper are these days usually called “Lambek grammars”, and not exactly the same thing as what Lambek later introduced as “pregroup grammars”. Lambek grammars actually correspond to monoidal biclosed categories in disguise (i.e., based on left/right division rather than left/right duals), and may also be considered without a unit (as in his original paper). (I only have a passing familiarity with this stuff, though, and am not very clear on the difference in linguistic expressivity between grammars based on division vs grammars based on duals.)

Noam Zeilberger wrote:

If you haven’t seen it before, you might also like Lambek’s followup paper “On the calculus of syntactic types”, which generalized his original calculus by dropping associativity (so that sentences are viewed as trees rather than strings). Here are the first few paragraphs from the introduction:

…and here is a bit near the end of the 1961 paper, where he made explicit how derivations in the (original) associative calculus can be interpreted as morphisms of a monoidal biclosed category:

John Baez wrote:

Noam Zeilberger wrote: “what Lambek introduced in his original paper are these days usually called “Lambek grammars”, and not exactly the same thing as what Lambek later introduced as “pregroup grammars”.”

Can you say what the difference is? I wasn’t simplifying things on purpose; I just don’t know this stuff. I think monoidal biclosed categories are great, and if someone wants to demand that the left or right duals be inverses, or that the category be a poset, I can live with that too…. though if I ever learned more linguistics, I might ask why those additional assumptions are reasonable. (Right now I have no idea how reasonable the whole approach is to begin with!)

Thanks for the links! I will read them in my enormous amounts of spare time. :-)

Noam Zeilberger wrote:

As I said it’s not clear to me what the linguistic motivations are, but the way I understand the difference between the original “Lambek” grammars and (later introduced by Lambek) pregroup grammars is that it is precisely analogous to the difference between a monoidal category with left/right residuals and a monoidal category with left/right duals. Lambek’s 1958 paper was building off the idea of “categorial grammar” introduced earlier by Ajdukiewicz and Bar-Hillel, where the basic way of combining types was left division A\B and right division B/A (with no product).

Noam Zeilberger wrote:

At least one seeming advantage of the original approach (without duals) is that it permits interpretations of the “semantics” of sentences/derivations in cartesian closed categories. So it’s in harmony with the approach of “Montague semantics” (mentioned by Richard Williamson over at the n-Cafe) where the meanings of natural language expressions are interpreted using lambda calculus. What I understand is that this is one of the reasons Lambek grammar started to become more popular in the 80s, following a paper by Van Benthem where he observed that such lambda terms denoting the meanings of expressions could be computed via “homomorphism” from syntactic derivations in Lambek grammar.

Jason Nichols wrote:

John Baez, as someone with a minimal understanding of set theory, lambda calculus, and information theory, what would you recommend as background reading to try to understand this stuff?

It’s really interesting, and looks relevant to work I do with NLP and even abstract syntax trees, but reading the papers and wiki pages, I feel like there’s a pretty big gap to cross between where I am, and where I’d need to be to begin to understand this stuff.

John Baez wrote:

Jason Nichols: I suggest trying to read some of Lambek’s early papers, like this one:

• Joachim Lambek, The mathematics of sentence structure, Amer. Math. Monthly 65 (1958), 154–170.

(If you have access to the version at the American Mathematical Monthly, it’s better typeset than this free version.) I don’t think you need to understand category theory to follow them, at least not this first one. At least for starters, knowing category theory mainly makes it clear that the structures he’s trying to use are not arbitrary, but “mathematically natural”. I guess that as the subject develops further, people take more advantage of the category theory and it becomes more important to know it. But anyway, I recommend Lambek’s papers!

Borislav Iordanov wrote:

Lambek was an amazing teacher, I was lucky to have him in my undergrad. There is a small and very approachable book on his pregroups treatment that he wrote shortly before he passed away: “From Word to Sentence: a computational algebraic approach to grammar”. It’s plain algebra and very fun. Sadly looks like out of print on Amazon, but if you can find it, well worth it.

Andreas Geisler wrote:

One immediate concern for me here is that this seems (don’t have the expertise to be sure) to repeat a very old mistake of linguistics, long abandoned:

Words do not have atomic meanings. They are not a part of some 1:1 lookup table.

The most likely scenario right now is that our brains store meaning as a continuously accumulating set of connections that ultimately are impacted by every instance of a form we’ve ever heard/seen.

So, you shall know a word by all the company you’ve ever seen it in.

Andreas Geisler wrote:

John Baez I am a linguist by training, you’re welcome to borrow my brain if you want. You just have to figure out the words to use to get my brain to index what you need, as I don’t know the category theory stuff at all.

It’s a question of interpretation. I am also a translator, so I might be of some small assistance there as well, but it’s not going to be easy either way I am afraid.

John Baez wrote:

Andreas Geisler wrote: “I might be of some small assistance there as well, but it’s not going to be easy either way I am afraid.”

No, it wouldn’t. Alas, I don’t really have time to tackle linguistics myself. Mehrnoosh Sadrzadeh is seriously working on category theory and linguistics. She’s one of the people leading a team of students at this Applied Category Theory 2018 school. She’s the one who assigned this paper by Lambek, which 2 students blogged about. So she would be the one to talk to.

So, you shall know a word by all the company you’ve ever seen it in.

Yes, that quote appears in the blog article by the students, which my post here was merely an advertisement for.