## Compositional Robotics (Part 2)

27 May, 2021

Very soon we’re having a workshop on applications of category theory to robotics:

2021 Workshop on Compositional Robotics: Mathematics and Tools, online, Monday 31 May 2021.

You’re invited! As of today it’s not too late to register and watch the talks online, and registration is free. Go here to register:

Here’s the schedule. All times are in UTC, so the show starts at 9:15 am Pacific Time:

Time (UTC) Speaker

Title

16:15-16:30   Intro and plan of the workshop

16:30-17:10

Jonathan Lorand

Category Theory Basics

17:20-18:00

John Baez Category Theory and Systems

Breakout rooms

18:30-19:10

Andrea Censi
& Gioele Zardini

Categories for Co-Design

19:20-20:00

David Spivak

Dynamic Interaction Patterns

Breakout rooms

20:30-21:15

Aaron Ames

A Categorical Perspective on Robotics

21:30-22:15 Daniel Koditschek Toward a Grounded Type Theory for Robot Task Composition
22:30-00:30 Selected speakers Talks from open submissions

For more information go to the workshop website or my previous blog post on this workshop:

## Category Theory and Systems

27 May, 2021

I’m giving a talk on Monday the 31st of May, 2021 at 17:20 UTC, which happens to be 10:20 am Pacific Time for me. You can see my slides here:

I’ll talk about how to describe open systems as morphisms in symmetric monoidal categories, and how to use ‘functorial semantics’ to describe the behavior of open systems.

It’s part of the 2021 Workshop on Compositional Robotics: Mathematics and Tools, and if you click the link you can see how to attend!  If you stick around for the rest of the workshop you’ll hear more concrete talks from people who really work on robotics.

## Compositional Robotics (Part 1)

20 April, 2021

A bunch of us are organizing a workshop on applications of category theory to robotics, as part of the IEEE International Conference on Robotics and Automation:

2021 Workshop on Compositional Robotics: Mathematics and Tools, online, 31 May 2021. Organized by Andrea Censi, Gioele Zardini, Jonathan Lorand, David Spivak, Brendan Fong, Nina Otter, Paolo Perrone, John Baez, Dylan Shell, Jason Kane, Alexandra Nilles, Andew Spielberg, and Emilio Frazzoli.

Submit your papers here by 21 May 2021!

Here’s the idea of the workshop:

In the last decade research on embodied intelligence has seen important developments. While the complexity of robotic systems has dramatically increased, both for single robots and interacting multi-robot systems (e.g., autonomous vehicles and mobility systems), the design methods have not kept up.

The standard answer to dealing with complexity is exploiting compositionality, but there are no well-established mathematical modeling and design tools that have the reach for compositional analysis and design at the level of a complex robotic system.

The goal of this workshop is to integrate mathematical principles and practical tools for compositional robotics, with a focus on applied category theory as a meta-language to talk about compositionality.

The workshop will happen on May 31st virtually. Details will follow.

#### Session I: Mathematics and Tools for Compositionality

In the morning, some of the world’s leading experts in Applied Category Theory (ACT) will provide tutorials to present an invitation to various aspects of compositionality, both at the theoretical and the practical level. In particular, Dr. Jonathan Lorand will teach Category Theory basics, Dr. David Spivak and Dr. Brendan Fong will introduce the audience to the concept of compositionality, Prof. John Baez will explain how the previously defined concepts can be used when modeling various types of systems, and Dr. Andrea Censi will present the theory of co-design, tailored to robotic applications.

#### Session II: Keynote Talks and Open Contributions

The afternoon session features two keynotes on the application of compositionality in robotics:

• Prof. Aaron Ames, Bren Professor of Mechanical and Civil Engineering and Control and Dynamical Systems, California Institute of Technology.

• Prof. Daniel Koditschek, Alfred Fitler Moore Professor of Electrical & Systems Engineering, School of Engineering & Applied Science, University of Pennsylvania. Prof. Koditschek will be assisted by Dr. Paul Gustafson (Wright State University) and Dr. Matthew Kvalheim (University of Pennsylvania).

Both speakers are leading experts in their fields and have succesfully applied category theory and compositionality to real challenges in robotics. Finally, we plan for eight talk-slots for open submissions. Submissions should focus on mathematical perspectives (not limited to ACT) and applications of compositionality.

## Applied Category Theory 2021 — Call for Papers

16 April, 2021

The deadline for submitting papers is coming up soon: May 12th.

Fourth Annual International Conference on Applied Category Theory (ACT 2021), July 12–16, 2021, online and at the Computer Laboratory of the University of Cambridge.

Plans to run ACT 2021 as one of the first physical conferences post-lockdown are progressing well. Consider going to Cambridge! Financial support is available for students and junior researchers.

Applied category theory is a topic of interest for a growing community of researchers, interested in studying many different kinds of systems using category-theoretic tools. These systems are found across computer science, mathematics, and physics, as well as in social science, linguistics, cognition, and neuroscience. The background and experience of our members is as varied as the systems being studied. The goal of the Applied Category Theory conference series is to bring researchers together, disseminate the latest results, and facilitate further development of the field.

We accept submissions of both original research papers, and work accepted/submitted/ published elsewhere. Accepted original research papers will be invited for publication in a proceedings volume. The keynote addresses will be drawn from the best accepted papers. The conference will include an industry showcase event.

We hope to run the conference as a hybrid event, with physical attendees present in Cambridge, and other participants taking part online. However, due to the state of the pandemic, the possibility of in-person attendance is not yet confirmed. Please do not book your travel or hotel accommodation yet.

### Financial support

We are able to offer financial support to PhD students and junior researchers. Full guidance is on the webpage.

### Important dates (all in 2021)

• Submission Deadline: Wednesday 12 May
• Author Notification: Monday 7 June
• Financial Support Application Deadline: Monday 7 June
• Financial Support Notification: Tuesday 8 June
• Priority Physical Registration Opens: Wednesday 9 June
• Ordinary Physical Registration Opens: Monday 13 June
• Reserved Accommodation Booking Deadline: Monday 13 June
• Adjoint School: Monday 5 to Friday 9 July
• Main Conference: Monday 12 to Friday 16 July

### Submissions

The following two types of submissions are accepted:

Proceedings Track. Original contributions of high-quality work consisting of an extended abstract, up to 12 pages, that provides evidence of results of genuine interest, and with enough detail to allow the program committee to assess the merits of the work. Submission of work-in-progress is encouraged, but it must be more substantial than a research proposal.

Non-Proceedings Track. Descriptions of high-quality work submitted or published elsewhere will also be considered, provided the work is recent and relevant to the conference. The work may be of any length, but the program committee members may only look at the first 3 pages of the submission, so you should ensure that these pages contain sufficient evidence of the quality and rigour of your work.

Papers in the two tracks will be reviewed against the same standards of quality. Since ACT is an interdisciplinary conference, we use two tracks to accommodate the publishing conventions of different disciplines. For example, those from a Computer Science background may prefer the Proceedings Track, while those from a Mathematics, Physics or other background may prefer the Non-Proceedings Track. However, authors from any background are free to choose the track that they prefer, and submissions may be moved from the Proceedings Track to the Non-Proceedings Track at any time at the request of the authors.

Contributions must be submitted in PDF format. Submissions to the Proceedings Track must be prepared with LaTeX, using the EPTCS style files available at http://style.eptcs.org.

The submission link will soon be available on the ACT2021 web page: https://www.cl.cam.ac.uk/events/act2021

### Program Committee

Chair:

• Kohei Kishida, University of Illinois, Urbana-Champaign

Members:

• Richard Blute, University of Ottawa
• Spencer Breiner, NIST
• Daniel Cicala, University of New Haven
• Robin Cockett, University of Calgary
• Bob Coecke, Cambridge Quantum Computing
• Geoffrey Cruttwell, Mount Allison University
• Valeria de Paiva, Samsung Research America and University of Birmingham
• Brendan Fong, Massachusetts Institute of Technology
• Jonas Frey, Carnegie Mellon University
• Tobias Fritz, Perimeter Institute for Theoretical Physics
• Fabrizio Romano Genovese, Statebox
• Helle Hvid Hansen, University of Groningen
• Jules Hedges, University of Strathclyde
• Chris Heunen, University of Edinburgh
• Alex Hoffnung, Bridgewater
• Martti Karvonen, University of Ottawa
• Kohei Kishida, University of Illinois, Urbana -Champaign (chair)
• Martha Lewis, University of Bristol
• Bert Lindenhovius, Johannes Kepler University Linz
• Ben MacAdam, University of Calgary
• Dan Marsden, University of Oxford
• Jade Master, University of California, Riverside
• Joe Moeller, NIST
• Koko Muroya, Kyoto University
• Simona Paoli, University of Leicester
• Daniela Petrisan, Université de Paris, IRIF
• Peter Selinger, Dalhousie University
• Michael Shulman, University of San Diego
• David Spivak, MIT and Topos Institute
• Joshua Tan, University of Oxford
• Dmitry Vagner
• Jamie Vicary, University of Cambridge
• John van de Wetering, Radboud University Nijmegen
• Vladimir Zamdzhiev, Inria, LORIA, Université de Lorraine
• Maaike Zwart

## Language Complexity (Part 7)

23 March, 2021

David A. Tanzer

Higher complexity classes

In Part 6, we saw a program with quadratic complexity. The collection of all languages that can be decided in $O(n^k)$ time for some $k$ is called P, for polynomial time complexity.

Now let’s consider languages that appear to require time that is exponential in the size of the input, and hence lie outside of P.

Here is a decision problem that is believed to be of this sort. Say you are given a description of a boolean circuit, involving AND, OR and NOT gates, which has N inputs and one output. Is there a combination of input values that causes the output to be True?

It appears that any general decision procedure for this problem must resort to some form of searching all the possible input combinations. For N inputs, that’s on the order of $2^N$ combinations to be tried. So the computation takes time that is exponential in the number of inputs.

There is a related decision problem for languages. Consider the language of Boolean formulas like (not(X7) and (X1 or not(X2 and X1)). The query is to find assignments of True/False to each of the variables which satisfy the formula, i.e., which make it evaluate to True.

Note that each Boolean formula is equivalent to a Boolean circuit, and a satisfying assignment to the variables is tantamount to an input combination which causes the circuit to output True.

Let SAT be the language consisting of Boolean formulas which are satisfiable, i.e., for which a satisfying assignment exists. For example, the formula (X1 and X2) belongs to SAT, because it is satisfied by the assignment X1=True, X2=True. On the other hand, (X1 and not(X1)) has no satisfying assignment, and so it does not belong to SAT.

Apparently, a decider for SAT must end up resorting to trying an exponential number of combinations. Now the number of variables in a formula is $O(n)$. A brute force search through input combinations means exponential time complexity.

Could some clever person figure out a better method, which runs in polynomial time? Not if the widely believed conjecture that P != NP holds true.

## Language Complexity (Part 6)

18 March, 2021

David A. Tanzer

In Part 5 we introduced big O notation for describing linear complexity. Now let’s look at a function with greater than linear complexity:

def square_length(text):
# compute the square of the length of text
# FIXME: not the most elegant or efficient approach
n = length(text)
counter = 0
for i = 1 to n:
for j = 1 to n:
counter = counter + 1
return counter

Here, due to the suboptimal implementation, the number of steps is proportional to the square of the size of the input.

Let $f(n) = MaxSteps(\mathit{square\_length}, n)$.

Then $f(n) = O(n^2)$.

This says that f becomes eventually bounded by some quadratic function. On the other hand, it is not the case that $f(n) = O(n)$, as $f(n)$ will eventually exceed any linear function.

Here is the general definition of big O notation:

Definition.   $f(n) = O(g(n))$ means that for some $r > 0$ and $n_1$, we have that $n > n_1 \implies |f(n)| < r g(n)$.

Any function which is eventually bounded by a linear function must also be eventually bounded by a quadratic function, i.e., linear functions are “smaller than” quadratic functions. So $f(n) = O(n)$ makes a stronger statement than $f(n) = O(n^2)$. Generally, we try to make the strongest statement possible about the complexity of a function.

## Language Complexity (Part 5)

10 March, 2021

David A. Tanzer

Big O analysis

Recall our function f(n) from Part 4, which gives the values 2, 13, 14, 25, 26, 37, …

Using ‘big O’ notation, we write f(n) = O(n) to say that f is linearly bounded.

This means that f(n) will eventually become less than some linear function.

As we said, f(n) has a “rough slope” of 6. So f could never be bounded by e.g. the linear function 2n. On the other hand it looks like f should be bounded by any linear function with slope greater than 6, e.g., g(n) = 10n. However, g is not a perfect bound on f, as f(0)=2 > g(0)=0, and f(1)=13 > g(1)=10.

But once n reaches 2 we have that f(n) < g(n). So we say that f is eventually bounded by g.

Now let’s recap.

Definition.   $f(n) = O(n)$ means that for some $r > 0$ and $n_1$, we have that $n > n_1 \implies |f(n)| < r n$.

Now let’s apply big O to the analysis of this function from the previous post:

def approximately_linear(text):
counter = 0
for i = 1 to length(text):
if i is odd:
for j = 1 to 10:
counter = counter + 1
return counter


The function approximately_linear has definite linear time complexity because:

$f(n) \equiv \text{MaxSteps}(\mathit{approximately\_linear}, n) = O(n)$

## Language Complexity (Part 4)

2 March, 2021

David A. Tanzer

### Summarizing computational complexity

In Part 3 we defined, for each program P, a detailed function P'(n) that gives the worst case number of steps that P must perform when given some input of size n. Now we want to summarize P into general classes, such as linear, quadratic, etc.

What’s in a step?

But we should clarify something before proceeding: what is meant by a ‘step’ of a program? Do we count it in units of machine language, or in terms of higher level statements? Before, we said that each iteration of the loop for the ALL_A decision procedure counted as a step. But in a more detailed view, each iteration of the loop includes multiple steps: comparing the input character to ‘A’, incrementing a counter, performing a test.

All these interpretations are plausible. Fortunately, provided that the definition of a program step is ‘reasonable’, all of them will lead to the same general classification of the program’s time complexity. Here, by reasonable I mean that the definition of a step should be such that, on a given computer, there is an absolute bound on the amount of clock time needed for the processor to complete one step.

Approximately linear functions

The classification of programs into complexity classes such as linear, quadratic, etc., is a generalization which doesn’t require that the time complexity be exactly linear, quadratic, etc. For an example, consider the following code:

def approximately_linear(text):
# perform a silly computation
counter = 0
for i = 1 to length(text):
if i is odd:
for j = 1 to 10:
counter = counter + 1
return counter


Here are the number of steps it performs, as a function of the input length:

f(0) = 2
f(1) = 13
f(2) = 14
f(3) = 25
f(4) = 26
f(5) = 37
...


The value increases alternately by 1 and then by 11. As it increases by 12 for every two steps, we could say that is “approximately linear,” with slope “roughly equal” to 6. But in fine detail, the graph looks like a sawtooth.

Soon, we will explain how this function gets definitively classified as having linear complexity.

Appendix: Python machines versus Turing machines

Here we are programming and measuring complexity on a Python-like machine, rather than a pure Turning machine. This is surfaced, for example, in the fact that without further ado we called a function length(text) to count the number of characters, and will regard this as a single step of the computation. On a true Turing machine, however, counting the length of the string takes N steps, as this operation requires that the tape be advanced one character at a time until the end of the string is detected.

This is a point which turns out not to substantially affect the complexity classification of a language. Assuming that steps are counted reasonably, any optimal decision procedure for a language of strings, whether written in Turing machine language, Python, C# or what have you, will end up with the same complexity classification.

The length function in Python really does take a bounded amount of time, so it is fair to count it as a single step. The crux of this matter is that, in a higher level language, a string is more than a sequence of characters, as it is a data structure containing length information as well. So there is order N work that is implied just by the existence of a string. But this can be folded into the up-front cost of merely reading the input, which is a general precondition for a language decider.

But, you may ask, what about languages which can be decided without even reading all of the input? For example, the language of strings that begin with the prefix “abc”. Ok, so you got me.

Still, as a practical matter, anything with linear or sub-linear complexity can be considered excellent and simple. The real challenges have do with complexity which is greater than linear, and which represents a real practical issue: software performance. So, for intents and purposes, we may treat any implied order N costs as being essentially zeros – as long as they can be on a one-time, up-front basis, e.g., the order N work involved in constructing a string object.

## Native Type Theory

24 February, 2021

guest post by Christian Williams

Native Type Theory is a new paper by myself and Mike Stay. We propose a unifying method of reasoning for programming languages: model a language as a theory, form the category of presheaves, and use the internal language of the topos.

$\mathtt{language} \xrightarrow{\;\Lambda\;} \mathtt{category} \xrightarrow{\;\mathscr{P}\;} \mathtt{topos} \xrightarrow{\;\Phi\;} \mathtt{type\; system}$

Though these steps are known, the original aspect is simply the composite and its application to software. If implemented properly, we believe that native types can be very useful to the virtual world. Here, I want to share some of what we’ve learned so far.

## The Free Lunch

The biggest lesson of this paper was to have faith in what John Baez calls the dao of mathematics. For two years, Mike and I and Greg Meredith looked for ways to generate logics for programming languages; we tried many approaches, but ultimately the solution was the simplest.

Two facts of category theory provide an abundance of structure, for free:

1. Every category embeds into its presheaf topos.
2. Every topos has a rich internal language.

The embedding preserves limits and exponents, hence we can apply this to “higher-order theories”.

For now we explore the language of a topos. There are multiple views, and often its full power is not used. Toposes support geometric logic, predicate logic, and moreover dependent type theory. We emphasize the latter: dependent types are expressive and pervasive, yet underutilized in mathematics.

## The Language of a Topos

My thinking has been shaped by the idea that even foundations are categorical. Virtually any language can be modelled as a structured category: the most comprehensive reference I’ve found is Categorical Logic and Type Theory by Bart Jacobs.

Probably the most studied categorical structure of logic is the topos. Toposes of sheaves, functors which coherently assign data to a space, were first applied to algebraic geometry. A continuous map $f\colon X\to Y$ induces an inverse image $f\colon Sh(Y)\to Sh(X)$ which is a left adjoint that preserves finite limits. This gives geometric morphisms of toposes, and geometric logic ($\wedge$ and $\exists$) as the language of classifying toposes.

Though geometric logic is an important level of generality, the language of toposes is more powerful. In La Jolla, California 1965, the budding category theory community recognized Grothendieck’s categories of sheaves as being fundamentally logical structures, which generalize set theory. An elementary topos is a cartesian closed category with finite limits and a subobject classifier, an object which represents the correspondence of predicates and subobjects.

The language of an elementary topos T is encapsulated in a pair of structures.

• The predicate fibration $\Omega\text{T}\to \text{T}$ reasons about predicates, like subsets.

• The codomain fibration $\Delta\text{T}\to \text{T}$ reasons about dependent types, like indexed sets.

### Predicate Logic

Throughout mathematics, we use the internal predicate logic of Set. It is the canonical example of a topos: a predicate such as $\varphi(x)= (x+3\geq 5)$ is a function $\varphi:X\to 2=\{t,f\}$, which corresponds to its comprehension, the subobject of true terms $\{x\in X \;|\; \varphi(x)=t\}\subseteq X$.

Predicates on any set $X$ form a Boolean algebra $P(X)=[X,2]$, ordered by implication. Every function $f:X\to Y$ gives an inverse image $P(f):P(Y)\to P(X)$. This defines a functor $P:\text{Set}^{\text{op}} \to \text{Bool}$ which is a first-order hyperdoctrine: each $P(f)$ has adjoints $\exists_f\dashv P(f)\dashv \forall_f$ representing quantification, which satisfy the Beck–Chevalley condition.

Altogether, this structure formalizes classical higher-order predicate logic. Most formulae, such as

$\forall x,y:\mathbb{Q}.\; \exists z:\mathbb{Q}.\; x< z \wedge z< y$

$\forall f:Y^X.\; \forall y:Y.\; \exists x:X.\; f(x)=y \Rightarrow$
$\exists g:X^Y.\; \forall y:Y.\; f(g(y))=y$

can be modelled in this logical structure of $\text{Set}.$

This idea is fairly well-known; people often talk about the “Mitchell-Benabou language” of a topos. However, this language is predicate logic over a simple type theory, meaning that the only type formation rules are products and functions. Many constructions in mathematics do not fit into this language, because types often depend on terms:

$\text{Nat}(n) := \{m:\mathbb{N} \;|\; m\leq n\}$

$\mathbb{Z}_p := \mathbb{Z}/\langle x\sim y \equiv \exists z:\mathbb{Z}.\; (x-y)=z\cdot p\rangle$

This is provided by extending predicate logic with dependent types, described in the next section.

So, we have briefly discussed how the structure of Set allows for the the explicit construction of predicates used in everyday mathematics. The significance is that these can be constructed in any topos: we thereby generalize the historical conception of logic.

For example, in a presheaf topos $[C^{\text{op}},\text{Set}]$, the law of excluded middle does not hold, and for good reason. Negation of a sieve is necessarily more subtle than negation of subsets, because we must exclude the possibility that a morphism is not in but “precomposes in” to a sieve. This will be explored more in the applications post.

### Dependent Type Theory

Dependency is pervasive throughout mathematics. What is a monoid? It’s a set $M$, and then operations $m,e$ on $M$, and then conditions on $m,e$. Most objects are constructed in layers, each of which depends on the ones before. Type theory is often regarded as “fancy” and only necessary in complex situations, similar to misperceptions of category theory; yet dependent types are everywhere.

The basic idea is to represent dependency using preimage. A type which depends on terms of another type, $x:A\vdash B(x):\text{Type}$, can be understood as an indexed set $\{B(x)\}_{x:A}$, and this in turn is represented as a function $\coprod x:A.\; B(x)\to A$. Hence the “world of types which depend on $A$” is the slice category $\text{Set}/A$.

The poset of “$A$-predicates” sits inside the category of “$A$-types”: a comprehension is an injection $\{x\in A \;|\; \varphi(x)\}\to A$. This is a degenerate kind of dependent type, where preimages are truth values rather than sets. So we are expanding into a larger environment, which shares all of the same structure. The slice category $\text{Set}/A$ is a categorification of $P(A)$: its morphisms are commuting triangles, understood as $A$-indexed functions.

Every function $f\colon A\to B$ gives a functor $f^\ast: \text{Set}/B\to \text{Set}/A$ by pullback; this generalizes preimage, and can be expressed as substitution: given $p:S\to B$, we can form the $A$-type

$x:A\vdash f^\ast(S)(x) = S(f(x)):\text{Type}.$

This functor has adjoints $\Sigma_f\dashv f^\ast\dashv \Pi_f$, called dependent sum and dependent product: these are the powerful tools for constructing dependent types. They generalize not only quantification, but also product and hom: the triple adjunction induces an adjoint co/monad on $\text{Set}/B$

$\Sigma_f\circ f^\ast = -\times f \dashv [f,-] = \Pi_f\circ f^\ast.$

These dependent generalizations of product and function types are extremely useful.

Indexed sum generalizes product type by allowing the type of the second coordinate to depend on the term in the first coordinate. For example: $\Sigma n:\mathbb{N}.\text{List}(n)$ consists of dependent pairs $\langle n, l:\text{List}(n)\rangle$.

Indexed product generalizes function type by allowing the type of the codomain to depend on the term in the domain. For example: $\Pi S:\text{Set}.\text{List}(S)$ consists of dependent functions $\lambda S:\text{Set}.\varphi(S):\text{List}(S)$.

See how natural they are? We use them all the time, often without realizing. Simply by using preimage for indexing, we generalize product and function types to “branching” versions, allowing us to build up complex objects such as

$\text{Monoid}:= \Sigma M:\text{Set}.\Sigma m:M^2\to M.\Sigma e:1\to M...$
$...\Pi (a,b,c):M^3. m(m(a,b),c)=m(a,m(b,c)).$
$\Pi a:M. m(e,a)=a=m(a,e).$

This rich language is available in any topos. I think we’ve hardly begun to see its utility in mathematics, computation, and everyday life.

### All Together

A topos has two systems, predicate logic and dependent type theory. Each is modelled by a fibration, a functor into the topos for which the preimage of $A$ are the $A$-predicates / $A$-types. A morphism in the domain is a judgement of the form “in the context of variables of these (dependent) types, this term is of this (dependent) type”.

$a:A,b:B(a),\dots, z:Z(y)\vdash t:T$

The two fibrations are connected by comprehension which converts a predicate to a type, and image factorization which converts a type to a predicate. These give that the predicate fibration is a reflective subcategory of the type fibration.

Altogether, this forms the full higher-order dependent type theory of a topos (Jacobs Ch.11). As far as I know, this is what deserves to be called “the” language of a topos, in its full glory. This type theory is used in proof assistants such as Coq and Agda; eventually, this expressive logic + dependent types will be utilized in many programming languages.

## Conclusion

Native Type Theory gives provides a shared framework of reasoning for a broad class of languages. We believe that this could have a significant impact on software and system design, if integrated into existing systems.

In the next post, we’ll explore why this language is so useful for the topos of presheaves on theories. Please let me know any thoughts or questions about this post and especially the paper. Thank you.

## Language Complexity (Part 3)

23 February, 2021

David A. Tanzer

### A detailed measure of computational complexity

In Part 2, we identified language complexity by how “difficult” it is for a program to decide if some given text is a sentence of the language — in other words, by the complexity of the decision problem for the language.

Note there may be many decision procedures — which are implementations of functions — that solve the decision problem for a language. For our purposes, we will focus our attention just on the ones which give the best performance. As motivation for this focus, consider that our simple language ALL_A = {“A”, “AA”, “AAA”, …} could be decided by an ill-conceived program which does all sorts of unnecessary work — but that would say nothing about the complexity of ALL_A itself. On the other hand, the procedure which simply scans the characters and verifies that all are ‘A’ is optimal, and simple — which shows that ALL_A is itself a simple language.

Our goal now is to quantify the notion of language complexity. That boils down to the matter of how to quantify the amount of work a program — in this case, the language decider — needs to do.

Computational complexity

Say we have a program P that inputs a string, goes through some steps, and outputs some results. For each input string, we can count how many computation steps it leads to.   Let StepCount(P,x) be the number of steps needed to compute the result from x.

Now let’s consider the effect of the size of the input on the number of steps.

In general, we expect larger inputs to trigger longer sequences of computation steps. In the first place, it takes more steps to scan more text. And moreover, as the analysis gets larger, more steps will be involved.

Checking whether all characters are ‘A’ takes order N work, where N is the size of the input string.

For something that requires more than order N work, consider a program which takes as input the text representation of a number, parses it, and computes the number of factors in the number. The analysis of factors calls for a significant amount of computation beyond the N steps to scan the input. Moreover, this work will become larger and more complex as the input values become larger.

These facts are indicators of what is known as the time complexity of the computation.

Sidenote: the space complexity of a computation pertains to the amount of memory it uses, as a function of input size. In our context, we are considering just the time complexity, i.e., the running time as a function of input size.

Worst-case complexity

It is fair to ask what is meant by the number of computation steps required to compute the output, for input of size N. After all, there are multiple input strings of size N, each of which may trigger a different number of computation steps. For instance, take our loop-based decision procedure for ALL_A. Input “AAAAA” causes 5 computation steps (as the loop runs through to completion). But “AABAA”, also of length 5, causes only 3 computation steps (as the loop terminates once it sees the ‘B’).

What we care about here is just the maximum number of steps that will be needed for a given input size. In other words, we focus our analysis on the worst-case behavior for each input size N. The motivation here is that any bound on running time that we put on the worst-case inputs of size N gives us a bound for all inputs of size N.

Let MaxSteps(P,N) be the maximum number of steps that program P may perform for an input of size N.

Let P'(N) = MaxSteps(P,N).

So P’ is the function that gives us the maximum number of steps that P will perform for any input size N.

P’ is our first step towards quantifying the time complexity of the program P.

But for typical purposes, P’ presents an overly detailed description of complexity.

For instance, take a decision procedure P for ALL_A. We would like to be able to summarize the time complexity of P by saying that it is linear in the size of the input string. That information is contained in the structure of the function P'(n). But P’ offers a much more granular description, with one value for every natural number n.

In the next posts we will look into the classification of P'(n) into general complexity classes such as linear, quadratic and exponential.