Topos Theory (Part 1)

I’m teaching an introduction to topos theory this quarter, loosely based on Mac Lane and Moerdijk’s Sheaves in Geometry and Logic.

I’m teaching one and a half hours each week for 10 weeks, so we probably won’t make it very far through this 629-page book. I may continue for the next quarter, but still, to make good progress I’ll have to do various things.

First, I’ll assume basic knowledge of category theory, a lot of which is explained in the Categorical Preliminaries and Chapter 1 of this book. I’ll start in with Chapter 2. Feel free to ask questions!

Second, I’ll skip a lot of proofs and focus on stating definitions and theorems, and explaining what they mean and why they’re interesting.

These notes to myself will be compressed versions of what I will later write on the whiteboard.

Sheaves

Topos theory emerged from Grothendieck’s work on algebraic geometry; he developed it as part of his plan to prove the Weil Conjectures. It was really just one of many linked innovations in algebraic geometry that emerged from the French school, and it makes the most sense if you examine the whole package. Unfortunately algebraic geometry takes a long time to explain! But later Lawvere and Tierney realized that topos theory could serve as a grand generalization of logic and set theory. This logical approach is more self-contained, and easier to explain, but also a bit more dry—at least to me. I will try to steer a middle course, and the title Sheaves in Geometry and Logic shows that Mac Lane and Moerdijk were trying to do this too.

The basic idea of algebraic geometry is to associate to a space the commutative ring of functions on that space, and study the geometry and topology of this space using that ring. For example, if X is a compact Hausdorff space there’s a ring C(X) consisting of all continuous real-valued functions on X, and you can recover X from this ring. But algebraic geometers often deal with situations where there aren’t enough everywhere-defined functions (of the sort they want to consider) on a space. For example, the only analytic functions on the Riemann sphere are constant functions. That’s not good enough! Most analytic functions on the Riemann sphere have poles, and are only defined away from these poles. (I’m giving an example from complex analysis, in hopes that more people will get what I’m talking about, but there are plenty of purely algebraic examples.)

This forced algebraic geometers to invent ‘sheaves’, around 1945 or so. The idea of a sheaf is that instead of only considering functions defined everywhere, we look at functions defined on open sets.

So, let X be a topological space and let \mathcal{O}(X) be the collection of open subsets of X. This is a poset with inclusion as the partial ordering, and thus it is a category. A presheaf is a functor

F \colon \mathcal{O}(X)^{\mathrm{op}} \to \mathsf{Set}

So, a presheaf assigns to each open set U a set F U. It allows us to restrict an element of F U to any smaller open set U' \subseteq U, and a couple of axioms hold, which are encoded in the word ‘functor’. Note the ‘op’: that’s what lets us restrict elements of F U to smaller open sets.

The example to keep in mind is where F U consists of functions on U (that is, functions of the sort we want to consider, such as continuous or smooth or analytic functions). However, other examples are important too.

In many of these examples something nice happens. First, suppose we have s \in F U and an open cover of U by open sets U_i. Then we can restrict s to U_i getting something we can call s|_{U_i}. We can then further restrict this to U_i \cap U_j. And by the definition of presheaf, we have

(s|_{U_i})|_{U_i \cap U_j} = (s|_{U_j})|_{U_i \cap U_j}

In other words, if we take a guy in F U and restrict it to a bunch of open sets covering U, the resulting guys agree on the overlaps U_i \cap U_j. Check that this follows from the definition of functor and some other facts!
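Here is one way the check might go, written out as a sketch. It only uses that restriction is functorial and that in the poset \mathcal{O}(X) there is exactly one morphism from U_i \cap U_j to U; the notation s|_V is the restriction notation introduced above.

```latex
% Restricting in two steps equals restricting in one step (functoriality),
% and both chains U_i \cap U_j \subseteq U_i \subseteq U and
% U_i \cap U_j \subseteq U_j \subseteq U compose to the same inclusion.
\begin{align*}
(s|_{U_i})|_{U_i \cap U_j}
  &= s|_{U_i \cap U_j} \\
  &= (s|_{U_j})|_{U_i \cap U_j}
\end{align*}
```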

This is true for any presheaf. A presheaf is a sheaf if we can start the other way around, with a bunch of guys s_i \in F U_i that agree on overlaps:

s_i|_{U_i \cap U_j} = s_j|_{U_i \cap U_j}

and get a unique s \in F U that restricts to all these guys:

s|_{U_i} = s_i

Note this definition secretly has two clauses: I’m saying that in this situation s exists and is unique. If we have uniqueness but not necessarily existence, we say our presheaf is a separated presheaf.
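To make the two clauses concrete, here is a minimal Python sketch on a two-point discrete space. It is not from the book: the names all_functions, constant_functions, restrict and gluings are made up for illustration, and the code only tests the sheaf condition for one chosen cover of one open set, not for all of them.

```python
from itertools import product

def all_functions(U, values=(0, 1)):
    """All functions U -> values, each encoded as a frozenset of (point, value) pairs."""
    points = sorted(U)
    return {frozenset(zip(points, vals)) for vals in product(values, repeat=len(points))}

def constant_functions(U, values=(0, 1)):
    """Only the constant functions U -> values (the empty function when U is empty)."""
    return {s for s in all_functions(U, values) if len({v for _, v in s}) <= 1}

def restrict(s, V):
    """Restrict a section to the smaller open set V."""
    return frozenset((x, v) for x, v in s if x in V)

def gluings(F, U, cover):
    """For each family on the cover that agrees on overlaps,
    list the sections on U that restrict to that family."""
    result = []
    for family in product(*(F(Ui) for Ui in cover)):
        compatible = all(
            restrict(si, Ui & Uj) == restrict(sj, Ui & Uj)
            for si, Ui in zip(family, cover)
            for sj, Uj in zip(family, cover)
        )
        if compatible:
            glued = [s for s in F(U)
                     if all(restrict(s, Ui) == si for si, Ui in zip(family, cover))]
            result.append(glued)
    return result

# Assumed example: U = {0, 1} with the discrete topology, covered by its two points.
U = frozenset({0, 1})
cover = [frozenset({0}), frozenset({1})]

# Existence and uniqueness: every compatible family glues in exactly one way.
functions_glue = all(len(g) == 1 for g in gluings(all_functions, U, cover))
# Uniqueness only: at most one gluing, so 'separated' for this cover.
constants_separated = all(len(g) <= 1 for g in gluings(constant_functions, U, cover))
# Existence can fail: 0 on one point and 1 on the other has no constant gluing.
constants_glue = all(len(g) == 1 for g in gluings(constant_functions, U, cover))
```

In this toy case the presheaf of all functions passes both clauses for the chosen cover, while the presheaf of constant functions keeps uniqueness but loses existence, because being constant is not something you can check locally.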

The point of a sheaf is that you can tell if something is in F U by examining it locally. These examples explain what I mean:

Puzzle. Let X = \mathbb{R} and for each open set U \subseteq \mathbb{R} take F U to be the set of continuous real-valued functions on U. Show that with the usual concept of restriction of functions, F is a presheaf and in fact a sheaf.

Puzzle. Let X = \mathbb{R} and for each open set U \subseteq \mathbb{R} take F U to be the set of bounded continuous real-valued functions on U. Show that with the usual concept of restriction of functions, F is a separated presheaf but not a sheaf.

The problem is that a function can be bounded on each open set in an open cover of U yet not bounded on U. You can tell if a function is continuous by examining it locally, but you can’t tell if it’s bounded!
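One possible witness for the second puzzle, sketched under the assumption that we cover U = \mathbb{R} by the intervals U_n = (-n, n) for n = 1, 2, 3, \dots; the cover and the sections s_n are chosen here for illustration, and other choices work just as well.

```latex
% Bounded continuous sections on each set of the cover:
\[
U_n = (-n, n), \qquad s_n \in F U_n, \qquad s_n(x) = x, \qquad |s_n(x)| < n .
\]
% They agree on overlaps:
\[
s_n|_{U_n \cap U_m} = s_m|_{U_n \cap U_m} .
\]
% The only function s on \mathbb{R} with s|_{U_n} = s_n for all n is
\[
s(x) = x ,
\]
% which is continuous but unbounded, so s \notin F\mathbb{R}: existence fails.
% Uniqueness still holds, because a function on U is determined by its
% restrictions to the sets of an open cover. So F is separated but not a sheaf.
```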

So, in a sense that should gradually become clear, sheaves are about ‘local truth’.

The category of sheaves on a space

There’s a category of presheaves on any topological space X. Since a presheaf on X is a functor

F \colon \mathcal{O}(X)^{\mathrm{op}} \to \mathsf{Set},

a morphism between presheaves is a natural transformation between such functors.

Remember, if \mathsf{C} and \mathsf{D} are categories, we use \mathsf{C}^{\mathsf{D}} to stand for the category where the objects are functors from \mathsf{D} to \mathsf{C}, and the morphisms are natural transformations. This is called a functor category.

So, a category of presheaves is just an example of a functor category, and the category of presheaves on X is called

\mathsf{Set}^{\mathcal{O}(X)^{\mathrm{op}}}

But this name is rather ungainly, so we make an abbreviation

\widehat{\mathsf{C}} = \mathsf{Set}^{\mathsf{C}^{\mathrm{op}}}

Then the category of presheaves on X is called

\widehat{\mathcal{O}(X)}

Sheaves are subtler, but we define morphisms of sheaves the exact same way. Every sheaf has an underlying presheaf, so we define a morphism between sheaves to be a morphism between their underlying presheaves. This gives the category of sheaves on X, which we call \mathsf{Sh}(X).

By how we’ve set things up, \mathsf{Sh}(X) is a full subcategory of \widehat{\mathcal{O}(X)}.

Now, what Grothendieck realized is that \mathsf{Sh}(X) acts a whole lot like the category of sets. For example, in the category of sets we can define ‘commutative rings’, but we can copy the definition in \mathsf{Sh}(X) and get ‘sheaves of commutative rings’, and so on. The point is that we’re copying ordinary math, but doing it locally, in a topological space.

Elementary topoi

Lawvere and Tierney clarified what was going on here by inventing the concept of ‘elementary topos’. I’ll throw the definition at you now and explain all the pieces in future classes:

Definition. An elementary topos, or topos for short, is a category with finite limits and colimits, exponentials and a subobject classifier.

I hope you know limits and colimits, since that’s the kind of basic category theory definition I’m assuming. Given two objects x and y in a category, their exponential is an object x^y that acts like the thing of all maps from y to x. I’ll give the actual definition later. A subobject classifier is, roughly, an object \Omega that generalizes the usual set of truth values

2 = \{0,1\}

Namely, subobjects of any object x are in one-to-one correspondence with morphisms from x to \Omega, which serve as ‘characteristic functions’. Again, this is just a sketch: I’ll give the actual definition later, or you can click on the link and read it now.
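In the category of sets this correspondence is just the familiar one between subsets and characteristic functions. Here is a small Python sketch of that special case for a finite set; the function names are invented for this illustration, and it says nothing yet about the general definition.

```python
from itertools import chain, combinations

def subsets(X):
    """All subsets of a finite set X, as frozensets."""
    xs = sorted(X)
    return [frozenset(c)
            for c in chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

def characteristic(S, X):
    """The characteristic function X -> {0, 1} of the subset S, stored as a dict."""
    return {x: int(x in S) for x in X}

def subset_classified_by(chi):
    """Recover the subset from its characteristic function."""
    return frozenset(x for x, value in chi.items() if value == 1)

# Assumed example set; subsets and maps X -> 2 determine each other.
X = {"a", "b", "c"}
round_trips = all(subset_classified_by(characteristic(S, X)) == S for S in subsets(X))
```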

The point is that an elementary topos has enough bells and whistles that we can ‘do mathematics inside it’. It’s like an alternative universe, a variant of our usual category of sets and functions, where mathematicians can live. But beware: in general, the kind of mathematics we do in an elementary topos is finitistic mathematics using intuitionistic logic.

You see, the category of finite sets is an elementary topos, so you can’t expect to have ‘infinite objects’ like the set of natural numbers in an elementary topos—unless you decree that you want them (which people often do).

Also, we will see that while 2 = \{0,1\} is a Boolean algebra, the subobject classifier of an elementary topos need only be a ‘Heyting algebra’: a generalization of a Boolean algebra in which the law of excluded middle need not hold. This is actually not weird: it’s connected to the fact that a category of sheaves lets us reason ‘locally’. For example, we don’t just care if two functions are equal or not, we care if they’re equal or not in each open set. So we need a subtler form of logic than classical Boolean logic.
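A standard example of a Heyting algebra is the poset of open sets of a topological space, where the negation of an open set is the interior of its complement. The sketch below tries this on the two-point Sierpinski space, which is an example chosen here and not one from the lecture; the helper names are hypothetical.

```python
# Assumed example: the Sierpinski space, with points 0 and 1 and opens {}, {1}, {0, 1}.
X = frozenset({0, 1})
opens = [frozenset(), frozenset({1}), X]

def interior(S):
    """Largest open set contained in S: the union of all opens inside S."""
    result = frozenset()
    for U in opens:
        if U <= S:
            result |= U
    return result

def neg(U):
    """Heyting negation of an open set: the interior of its complement."""
    return interior(X - U)

U = frozenset({1})
not_U = neg(U)              # interior of {0}, which is empty
excluded_middle = U | not_U  # 'U or not U' as an open set
double_negation = neg(not_U)
```

Here 'U or not U' comes out as {1}, not the whole space, and the double negation of U is the whole space instead of U, so both classical laws fail for this U.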

There’s a lot more to say, and I’m just sketching out the territory now, but one of the first big theorems we’re aiming for is this:

Theorem. For any topological space X, \mathsf{Sh}(X) is an elementary topos.

The topos of sheaves \mathsf{Sh}(X) remembers a lot about the topological space X that it came from… so a topos can also be seen as a way of talking about a space! This is even true for elementary topoi that aren’t topoi of sheaves on an actual space. So, topos theory is more than a generalization of set theory. It’s also, in a different way, a generalization of topology.

Grothendieck topoi

You’ll notice that sheaves on X were defined starting with the poset \mathcal{O}(X) of open sets of X. In fact, to define them we never used anything about X except this poset! This suggests that we could define sheaves more generally starting from any poset.

And that’s true—but Grothendieck went further: he defined sheaves starting from any category, as long as that category was equipped with some extra structure saying when a bunch of morphisms f_i \colon x_i \to x serve to ‘cover’ the object x. This extra data is called a ‘coverage’ or more often (rather confusingly) a ‘Grothendieck topology’. A category equipped with a Grothendieck topology is called a ‘site’.

So, Grothendieck figured out how to talk about the category of sheaves \mathsf{Sh}(\mathsf{C}) on any site \mathsf{C}. He did this before Lawvere and Tierney came along, and this was his definition of a topos. So, nowadays we say a category of sheaves on a site is a Grothendieck topos. However:

Theorem. Any Grothendieck topos is an elementary topos.

So, Lawvere and Tierney’s approach subsumes Grothendieck’s, in a sense. Not every elementary topos is a Grothendieck topos, though! For example, the category of finite sets is an elementary topos but not a Grothendieck topos. (It’s not big enough: any Grothendieck topos has, not just finite limits and colimits, but all small limits and colimits.) So both concepts of topos are important and still used. But when I say just ‘topos’, I’ll mean ‘elementary topos’.

Why did Grothendieck bother to generalize the concept of sheaves from sheaves on a topological space to sheaves on a site? He wasn’t just doing it for fun: it was a crucial step in his attempt to prove the Weil Conjectures!

Basically, when you’re dealing with spaces that algebraic geometers like—say, algebraic varieties—there aren’t enough open sets to do everything we want, so we need to use covering spaces as a generalization of open covers. So, instead of defining sheaves using the poset of open subsets of our space X, Grothendieck needed to use the category of covering spaces of X.

That’s the rough idea, anyway.

Geometric morphisms

As you probably know if you’re reading this, category theory is all about the morphisms. This is true not just within a category, but between them. The point of topos theory is not just to study one topos, but many. We don’t want merely to do mathematics in alternative universes: we want to be able to translate mathematics from one alternative universe to another!

So, what are the morphisms between topoi?

First, if you have a continuous map f \colon X \to Y between topological spaces, you can take the ‘direct image’ of a presheaf on X to get a presheaf on Y. Here’s how this works.

The inverse image of any open set is open, so we get an inverse image map

f^{-1} \colon \mathcal{O}(Y) \to \mathcal{O}(X)

sending each open set V \subseteq Y to the open set

f^{-1} V = \{x \in X :\; f(x) \in V \} \subseteq X

Given a presheaf F on X, we define its direct image to be the presheaf on Y given by

(f_\ast F)(V) = F(f^{-1} V)

Note the double reversal here: f maps points in X to points in Y, but open sets in Y give open sets in X, and then presheaves on X give presheaves on Y.
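The construction can be tried out on finite sets. The following Python sketch assumes discrete topologies on X and Y, so that every map is continuous and every subset is open, and takes F U to be all functions from U to {0, 1}; the names preimage, direct_image and restrict_direct_image are illustrative, not standard.

```python
from itertools import product

def all_functions(U, values=(0, 1)):
    """The presheaf F: all functions U -> values, encoded as frozensets of pairs."""
    points = sorted(U)
    return {frozenset(zip(points, vals)) for vals in product(values, repeat=len(points))}

def preimage(f, V, X):
    """f^{-1} V = {x in X : f(x) in V}."""
    return frozenset(x for x in X if f[x] in V)

def direct_image(F, f, X):
    """f_* F, the presheaf on Y sending V to F(f^{-1} V)."""
    return lambda V: F(preimage(f, V, X))

def restrict_direct_image(s, f, V_small, X):
    """Restrict s in (f_* F)(V) to V_small inside V, by restricting to f^{-1} V_small."""
    U_small = preimage(f, V_small, X)
    return frozenset((x, v) for x, v in s if x in U_small)

# Assumed example: X = {0, 1, 2}, Y = {"a", "b"}, both discrete.
X = frozenset({0, 1, 2})
f = {0: "a", 1: "a", 2: "b"}
pushed = direct_image(all_functions, f, X)

sections_over_a = pushed({"a"})       # equals F({0, 1})
sections_over_Y = pushed({"a", "b"})  # equals F(X)
```

Restricting a section of f_\ast F from V to a smaller open set is done upstairs in X, between the inverse images, which is where the double reversal shows up in the code.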

Of course we need to check that it works:

Puzzle. Show that f_\ast F is a presheaf. That is, explain how we can restrict an element of (f_\ast F)(V) to any open set contained in V, and check that we get a presheaf this way.

In fact it works very nicely:

Puzzle. Show that taking direct images gives a functor from the category of presheaves on X to the category of presheaves on Y.

Puzzle. Show that if F is a sheaf on X, its direct image f_\ast F is a sheaf on Y.

The upshot of all this is that a continuous map between topological spaces

f \colon X \to Y

gives a functor between sheaf categories

f_\ast \colon \mathsf{Sh}(X) \to \mathsf{Sh}(Y)

And this functor turns out to be very nice! This is another big theorem we aim to prove later:

Theorem. If f \colon X \to Y is a continuous map between topological spaces, the functor

f_\ast \colon \mathsf{Sh}(X) \to \mathsf{Sh}(Y)

has a left adjoint

f^\ast \colon \mathsf{Sh}(Y) \to \mathsf{Sh}(X)

that preserves finite limits.

This left adjoint is called the inverse image map. Note that because f_\ast has a left adjoint, it is a right adjoint, so it preserves limits. Because f^\ast is a left adjoint, it preserves colimits. The fact that f^\ast preserves finite limits is extra gravy on top of an already nice situation!
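Spelled out, saying that f^\ast is left adjoint to f_\ast means there is a bijection of hom-sets like the one below. This is just the general definition of an adjunction written in the notation of the theorem, with F and G used here as names for a sheaf on X and a sheaf on Y.

```latex
% Natural in G \in \mathsf{Sh}(Y) and F \in \mathsf{Sh}(X):
\[
\mathrm{Hom}_{\mathsf{Sh}(X)}(f^\ast G, F) \;\cong\; \mathrm{Hom}_{\mathsf{Sh}(Y)}(G, f_\ast F)
\]
```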

We bundle all this niceness into a definition:

Definition. A functor f_\ast \colon \mathsf{T} \to \mathsf{T'} between topoi is a geometric morphism if it has a left adjoint that preserves finite limits.

And this is the most important kind of morphism between topoi. It’s not a very obvious definition, but it’s extracted straight from what happens in examples.

To wrap up, I should add that people usually call the pair consisting of f_\ast \colon \mathsf{T} \to \mathsf{T'} and its left adjoint f^\ast \colon \mathsf{T'} \to \mathsf{T} a geometric morphism. A functor has at most one left adjoint, up to natural isomorphism, so my definition is at least tolerable. But I’ll probably switch to the standard one when we get serious about geometric morphisms.

And we will eventually see that geometric morphisms let us translate mathematics from one alternative universe to another!

Conclusion

If this seemed like too much too soon, fear not, I’ll go over it again and actually define a lot of the concepts I merely sketched, like ‘exponentials’, ‘subobject classifier’, ‘Heyting algebra’, ‘Grothendieck topology’, and ‘Grothendieck topos’. I just wanted to get a lot of the main concepts on the table quickly. You should do the puzzles to see if you understand what I wanted you to understand. Unless I made a mistake, all of these are straightforward definition-pushing if you’re comfortable with some basic category theory.

For more background on topos theory I highly recommend this:

• Colin McLarty, The uses and abuses of the history of topos theory.

Abstract. The view that toposes originated as generalized set theory is a figment of set theoretically educated common sense. This false history obstructs understanding of category theory and especially of categorical foundations for mathematics. Problems in geometry, topology, and related algebra led to categories and toposes. Elementary toposes arose when Lawvere’s interest in the foundations of physics and Tierney’s in the foundations of topology led both to study Grothendieck’s foundations for algebraic geometry. I end with remarks on a categorical view of the history of set theory, including a false history plausible from that point of view that would make it helpful to introduce toposes as a generalization from set theory.

There’s also a lot of background material in the book for this course:

• Saunders Mac Lane and Ieke Moerdijk, Sheaves in Geometry and Logic.

22 Responses to Topos Theory (Part 1)

  1. Blake Stacey says:

    Is there a way to relax the distributiveness of Heyting algebras to something like modularity or orthomodularity?

    • Todd Trimble says:

      So a Heyting algebra is a lattice that as a poset (hence as a category) is cartesian closed. Cartesian closure is the real hallmark of being a Heyting algebra; if you drop that condition, then generally speaking you are talking about a very different species of lattice. And cartesian closure implies that products (which in a poset are meets) distribute over coproducts (which in a poset are joins).

      If a finitely complete category has a subobject classifier \Omega, then \Omega carries an internal structure of cartesian closed meet semi-lattice. (This is true even if we drop the topos condition that the ambient category is cartesian closed!) Externally, this means that all subobject posets \mathrm{Sub}(X) are cartesian closed meet-semilattices.

      Off-hand I know of no interesting way to relax the cartesian closure condition on a Heyting algebra that would allow modularity but not distributivity. As you know, modular lattices typically crop up as lattices of congruences for specific types of categories of algebraic structures, like groups and rings; more generally, they arise this way if the algebraic theory in question is a so-called Mal’cev theory. But these types of categories are quite a far cry from elementary toposes. The “closest” they can get, as far as I know, is via Freyd’s notion of AT category, which axiomatizes the exactness properties that are common to abelian categories and elementary toposes.

    • John Baez says:

      Distributivity is the hallmark of ‘classical’ logic, where ‘classical’ means non-quantum, as opposed to non-intuitionistic. Topos theory is thoroughly classical in this sense. There have been attempts to develop ‘quantum topos theory’, but the first two I looked at were amateurish and unconvincing. If quantum topos theory deserves to exist at all, it may be connected to noncommutative algebraic geometry – another risky subject.

    • Blake Stacey says:

      That is all very interesting and helpful; thank you!

      (For reasons that are probably personal and idiosyncratic, I’m not all that enthusiastic about most of what goes under the name “quantum logic” as a route to understanding the truly deep fundamentals of the physics; it kind of seems like the subject which really deserves the name has yet to be invented. The “a ‘logic’ is a lattice of closed subspaces of a Hilbert space” way of thinking did suggest one conceptual connection about a straight-up math problem, but I still haven’t been able to carry it very far!)

    • John Baez says:

      Many share your belief that traditional lattice-based quantum logic is not getting to the bottom of things, and there’s a lot of modern work on other approaches. A good introduction is here:

      • Chris Heunen and Jamie Vicary, Categories for Quantum Theory, Oxford U. Press, Oxford, 2020.

      I wrote a much shorter introduction to these ideas for philosophers:

      • John Baez, Quantum quandaries: a category-theoretic approach, in Structural Foundations of Quantum Gravity, eds. Steven French, Dean Rickles and Juha Saatsi, Oxford U. Press, 2006, pp. 240–265.

      The basic idea here is that classical logic is about cartesian closed categories while quantum logic is about compact dagger-categories.

      I should also mention generalized probabilistic theories using convex sets, and effect theories. There’s a lot of work going on in all these approaches. It needs to be synthesized and beautified.

      I wish people would ask questions about topos theory, but maybe everything in this lecture was too easy for anyone in the world to have any questions!

    • Todd Trimble says:

      Regarding categories and quantum mechanics: there’s also something called the Bohr topos (of a C^\ast-algebra), which seems to have some attractive features if you happen to like toposes and internal logic. It’s pretty straightforward to define (it’s a presheaf topos, over the category dual to the poset of commutative \ast-subalgebras, equipped with an internal “tautological” commutative ring). I’m guessing you know something about this, John, and might have something to say about it, but the nLab has a pretty extensive sales pitch going for it.

    • Blake Stacey says:

      The “Bohr topos” material has the appeal that the developers appear to have thought more carefully than is typical about what Bohr himself wrote. He could be quite a foggy writer, a problem compounded by later generations simply not reading the pieces where he was the clearest and by the fact that very often he wasn’t thinking about the same questions that those later generations had in mind.

      I’m not yet seeing what I might crassly call the “cash value” of Bohr toposes — yes, we can define all these things, but what do they make more clear than before? Maybe this series will shake something loose for me. My own work has fallen closer to the “generalized probabilistic theories” side, so I’d like to see the pieces fit together.

      • John Baez says:

        I don’t think there’s any “cash value” for Bohr toposes yet. I wrote about some similar stuff long ago in This Week’s Finds: it’s nice how there’s a poset of commutative subalgebras of a general C*-algebra, which give a presheaf of different “classical views” of a quantum system… but if you’re mainly interested in physics rather than topos theory, it’s safe to wait until a bunch of people shout “Eureka!”

        Topos theory is interesting for other reasons.

    • Todd Trimble says:

      Yeah, it may not be completely clear, even for Urs Schreiber who is the principal author of the relevant nLab articles, what that cash value is. Far be it from me to say much about it, except that there is this potential of bringing a huge wealth of experience with toposes to bear (googling “Bohr topos” will reveal some of this).

      Urs makes a concerted effort to explain the relevant issues where the Bohr topos provides a convenient conceptual framework, here. He emphasizes the close connection between the poset of commutative \ast-subalgebras and the underlying Jordan algebra, and that this together with the actual commutative algebra structures are what is key to the physics. But I’d better quit here.

  2. […] Last time I defined sheaves on a topological space. This time I’ll say how to get sheaves from ‘bundles’. You may or may not have heard of ‘bundles’ of various kinds, like vector bundles or fiber bundles. If you have, be glad: the bundles I’m talking about now include these as special cases. If not, don’t worry: the bundles I’m talking about now are much simpler! […]

  3. Thanks very much for all your great work at both publicizing and discussing key topics in math, in particular, SGL.
    There are some key results for general cat theory in it,
    e.g., how to cut an adjunction down to equivalences.

    Any thoughts on how it compares to Colin McLarty’s text?

    Since you’re currently into SGL, this might be a good time to make such a comparison.

    I agree with your remarks that there are more important things to worry about than diacritical marks in French.
    And thanks for your remarks on terminology.

    BTW, clicking on my name will yield (links to) several web pages showing some aspects of subobject classification, etc. in Set.

    • Todd Trimble says:

      At first I was going, “what is this key topic called SGL that I’ve never heard of?” Until it dawned on me that it’s not a topic but a famous book about a topic, Sheaves in Geometry and Logic, the one John says he’s using for the course. Usually I refer to it as “Mac Lane and Moerdijk”.

      (Good luck with the course, John!)

      I don’t know McLarty’s book intimately, but recently had to take a look at it. It’s nice I think. For example, he outlines the hands-on construction of finite colimits in a topos, whereas Mac Lane and Moerdijk take the more abstract approach given via Paré’s theorem. He also goes into some loving detail about topics not well-covered by Mac Lane and Moerdijk, such as toposes for SDG (Synthetic Differential Geometry) and the effective topos. So in some sense the books are nicely complementary.

      • John Baez says:

        I have McLarty’s book and I’d say it focuses more than Mac Lane and Moerdijk do on applications of elementary topoi to logic, much less on sheaves, Grothendieck topoi and the roots of topos theory in algebraic geometry. It’s also a shorter book.

  4. Blake Stacey says:

    Is a category equipped with a Grothendieck topology known as a site just because things are put on it?

    When an ordinary word becomes a math word, I always end up at least a little confused. Sometimes I wonder if there’s an important insight to how the concept is used in practice that I’m not getting because most books just take the word as given. Sometimes I get the feeling that there was a strange history of translation (like with ray class field and perhaps magma).

    • John Baez says:

      In this lecture I defined sheaves on a topological space. Grothendieck realized that we could define sheaves on a category as long as we say, for every object x, when a bunch of morphisms f_i \colon x_i \to x ‘cover’ x. With this extra information the category becomes a kind of generalization of a topological space, with its objects serving as generalized open sets. So Grothendieck, a master of choosing the most evocative term for any concept he defined, called it a ‘site’. Intuitively, a site is a place on which sheaves can be built, and thus a place in which we can do mathematics. (To really get this, you have to get in the mood of French algebraic geometry, where you need sheaves to do anything interesting.)

      site: an area of ground on which a town, building, or monument is constructed.

      There are even things called ‘skyscraper sheaves’, though only on spaces, as far as I know: a skyscraper sheaf is a sheaf that’s supported on a single point, thus very ‘tall and thin’.

      • Todd Trimble says:

        Of course there’s a general notion of a point of a (Grothendieck) topos E, namely a so-called geometric morphism \text{Set} \to E. In the special case where E is sheaves on a topological space, this is exactly the skyscraper sheaf construction which associates such a geometric morphism to a point of the space (it’s no coincidence that the topos points in this case are in natural bijection with spatial points, at least if the space is a sober space). In other words, topos points are the same as generalized skyscrapers. In the special case of sheaves over a space, the left exact left adjoint E \to \text{Set} of the skyscraper morphism is “take the stalk at that point”.

  5. David Corfield says:

    You’re really happier linking to Wikipedia than to the nLab? https://en.wikipedia.org/wiki/Subobject_classifier is a better page for your students and readers than https://ncatlab.org/nlab/show/subobject+classifier ?

    • John Baez says:

      I always link to whatever I think offers the best explanation for the audience I have in mind. For people who have never heard of a subobject classifier, I feel the hand-holding “introductory explanation” of subobject classifiers in Wikipedia does what I want. Of course everyone serious about category theory should consult the nLab too.

  6. “Topos theory emerged from Grothendieck’s work”

    An interesting guy, who in later life, though he didn’t abandon mathematics, lived as a recluse in a small village in the foothills of the Pyrenees, for a time trying to live exclusively on dandelion soup.

  7. I explained the sheaf condition in Part 1, but here’s a slicker way to say it. Suppose U is an open set covered by a collection of open sets U_i […]

  8. In Part 1, I said how to push sheaves forward along a continuous map. Now let’s see how to pull them back! This will set up a pair of adjoint functors with nice properties, called a ‘geometric morphism’ […]
