Compositional Thermostatics (Part 1)

At the Topos Institute this summer, a group of folks started talking about thermodynamics and category theory. It probably started because Spencer Breiner and my former student Joe Moeller, both working at NIST, were talking about thermodynamics with some people there. But I’ve been interested in thermodynamics for quite a while now, and Owen Lynch, a grad student visiting from the University of Utrecht, wanted to do his master’s thesis on the subject. He’s now working with me. Sophie Libkind, David Spivak and David Jaz Myers also joined in: they’re especially interested in open systems and how they interact.

Prompted by these conversations, a subset of us eventually wrote a paper on the foundations of equilibrium thermodynamics:

• John Baez, Owen Lynch and Joe Moeller, Compositional thermostatics.

The idea here is to describe classical thermodynamics, classical statistical mechanics and quantum statistical mechanics in a unified framework based on entropy maximization. This framework can also handle ‘generalized probabilistic theories’ of the sort studied in quantum foundations—that is, theories like quantum mechanics, but more general.

To unify all these theories, we define a ‘thermostatic system’ to be any convex space X of ‘states’ together with a concave function

S \colon X \to [-\infty, \infty]

assigning to each state an ‘entropy’.

Whenever several such systems are combined and allowed to come to equilibrium, the new equilibrium state maximizes the total entropy subject to constraints. We explain how to express this idea using an operad. Intuitively speaking, the operad we construct has as operations all possible ways of combining thermostatic systems. For example, there is an operation that combines two gases in such a way that they can exchange energy and volume, but not particles—and another operation that lets them exchange only particles, and so on.

It is crucial to use a sufficiently general concept of ‘convex space’, which need not be a convex subset of a vector space. Luckily there has been a lot of work on this, so we can just grab a good definition off the shelf:

Definition. A convex space is a set X with an operation c_\lambda \colon X \times X \to X for each \lambda \in [0, 1] such that the following identities hold:

1) c_1(x, y) = x

2) c_\lambda(x, x) = x

3) c_\lambda(x, y) = c_{1-\lambda}(y, x)

4) c_\lambda(c_\mu(x, y) , z) = c_{\lambda'}(x, c_{\mu'}(y, z)) for all 0 \le \lambda, \mu, \lambda', \mu' \le 1 satisfying \lambda\mu = \lambda' and 1-\lambda = (1-\lambda')(1-\mu').

To understand these axioms, especially the last, you need to check that any vector space is a convex space with

c_\lambda(x, y) = \lambda x + (1-\lambda)y

So, these operations c_\lambda describe ‘convex linear combinations’.
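
To see why the last axiom holds in this model, just expand both sides:

\lambda(\mu x + (1-\mu)y) + (1-\lambda)z = \lambda\mu\, x + \lambda(1-\mu)\, y + (1-\lambda)\, z

\lambda' x + (1-\lambda')(\mu' y + (1-\mu')z) = \lambda' x + (1-\lambda')\mu'\, y + (1-\lambda')(1-\mu')\, z

The conditions \lambda\mu = \lambda' and 1-\lambda = (1-\lambda')(1-\mu') make the coefficients of x and z agree, and then the coefficients of y agree automatically, since in each line the coefficients sum to 1.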

Indeed, any subset of a vector space closed under convex linear combinations is a convex space! But there are other examples too.

In 1949, the famous mathematician Marshall Stone invented ‘barycentric algebras’. These are convex spaces satisfying one extra axiom: the cancellation axiom, which says that whenever \lambda \ne 0,

c_\lambda(x,y) = c_\lambda(x',y) \implies x = x'

He proved that any barycentric algebra is isomorphic to a convex subset of a vector space. Later Walter Neumann noted that a convex space, defined as above, is isomorphic to a convex subset of a vector space if and only if the cancellation axiom holds.

Dropping the cancellation axiom has convenient formal consequences, since the resulting more general convex spaces can then be defined as algebras of a finitary commutative monad, giving the category of convex spaces very good properties.

But dropping this axiom is no mere formal nicety. In our definition of ‘thermostatic system’, we need the set of possible values of entropy to be a convex space. One obvious candidate is the set [0,\infty). However, for a well-behaved formalism based on entropy maximization, we want the supremum of any set of entropies to be well-defined. This forces us to consider the larger set [0,\infty], which does not obey the cancellation axiom.
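
Concretely, cancellation fails in [0,\infty]: using the usual convention that \lambda x + (1-\lambda)\infty = \infty whenever \lambda < 1, we get

c_\lambda(1, \infty) = \infty = c_\lambda(2, \infty)

for any \lambda \in (0,1), even though 1 \ne 2.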

But even that is not good enough! In thermodynamics you often read about the ‘heat bath’, an idealized system that can absorb or emit an arbitrarily large amount of energy while keeping a fixed temperature. We want to treat the ‘heat bath’ as a thermostatic system on an equal footing with any other. To do this, we need to consider negative entropies—not because the heat bath can have negative entropy, but because it acts as an infinite reservoir of entropy, and the change in entropy from its default state can be positive or negative.

This suggests letting entropies take values in the convex space \mathbb{R}. But then the requirement that any set of entropies have a supremum (including empty and unbounded sets) forces us to use the larger convex space [-\infty,\infty].

This does not obey the cancellation axiom, so there is no way to think of it as a convex subset of a vector space. In fact, it’s not even immediately obvious how to make it into a convex space at all! After all, what do you get when you take a nontrivial convex linear combination of \infty and -\infty? You’ll have to read our paper for the answer, and the justification.

We then define a thermostatic system to be a convex space X together with a concave function

S \colon X \to [-\infty, \infty]

where concave means that

S(c_\lambda(x,y)) \ge c_\lambda(S(x), S(y))

for all x, y \in X and \lambda \in [0, 1].

We give lots of examples from classical thermodynamics, classical and quantum statistical mechanics, and beyond—including our friend the ‘heat bath’.

For example, suppose X is the set of probability distributions on an n-element set, and suppose S \colon X \to [-\infty, \infty] is the Shannon entropy

\displaystyle{ S(p) = - \sum_{i = 1}^n p_i \log p_i }

Then given two probability distributions p and q, we have

S(\lambda p + (1-\lambda) q) \ge \lambda S(p) + (1-\lambda) S(q)

for all \lambda \in [0,1]. So this entropy function is concave, and S \colon X \to [-\infty, \infty] defines a thermostatic system. But in this example the entropy only takes nonnegative values, and is never infinite, so you need to look at other examples to see why we want to let entropy take values in [-\infty,\infty].
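
If you like, you can watch this concavity numerically. Here is a minimal Python sketch (my own toy code, not from the paper; the function name shannon_entropy is just an illustrative choice) that checks the inequality for randomly chosen probability distributions:

import numpy as np

def shannon_entropy(p):
    # S(p) = -sum_i p_i log p_i, with the convention 0 log 0 = 0
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]))

rng = np.random.default_rng(0)
for _ in range(1000):
    p = rng.dirichlet(np.ones(5))      # a random probability distribution on 5 outcomes
    q = rng.dirichlet(np.ones(5))
    lam = rng.uniform()
    mix = lam * p + (1 - lam) * q
    # concavity: the entropy of the mixture is at least the mixture of the entropies
    assert shannon_entropy(mix) >= lam * shannon_entropy(p) + (1 - lam) * shannon_entropy(q) - 1e-12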

After looking at examples of thermostatic systems, we define an operad whose operations are convex-linear relations from a product of convex spaces to a single convex space. And then we prove that thermostatic systems give an algebra for this operad: that is, we can really stick together thermostatic systems in all these ways. The trick is computing the entropy function of the new composed system from the entropy functions of its parts: this is where entropy maximization comes in.
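
Roughly speaking (see the paper for the precise operadic statement), if thermostatic systems (X_1, S_1), \dots, (X_n, S_n) are glued together along a convex-linear relation R \subseteq X_1 \times \cdots \times X_n \times Y, the composite system has state space Y and entropy

\displaystyle{ S(y) = \sup \{ S_1(x_1) + \cdots + S_n(x_n) \; : \; (x_1, \dots, x_n, y) \in R \} }

The relation R says which internal states x_i are compatible with the external state y, and we maximize the total entropy over all of them.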

For a nice introduction to these ideas, check out Owen’s blog article:

• Owen Lynch, Compositional thermostatics, Topos Institute Blog, 9 September 2021.

And then comes the really interesting part: checking that this adequately captures many of the examples physicists have thought about!

The picture at the top of this post shows one that we discuss: two cylinders of ideal gas with a movable divider between them that’s permeable to heat. Yes, this is an operation in an operad—and if you tell us the entropy function of each cylinder of gas, our formalism will automatically compute the entropy function of the resulting combination of these two cylinders.
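
Here is a minimal numerical sketch of that example in Python (again my own toy code, not something from the paper), assuming a monatomic ideal gas entropy of the form S(U,V) = N(\log(V/N) + \tfrac{3}{2}\log(U/N)) with all physical constants dropped. It maximizes the total entropy over all ways of splitting the total energy and volume between the two cylinders, which is what the corresponding operation asks for:

import numpy as np

def S_gas(U, V, N):
    # toy monatomic ideal gas entropy, constants dropped
    return N * (np.log(V / N) + 1.5 * np.log(U / N))

def composite_entropy(U, V, N1, N2, steps=2000):
    # entropy of two cylinders joined by a movable, heat-conducting divider:
    # maximize S1 + S2 over all splits of the total energy U and volume V
    fracs = np.linspace(0.001, 0.999, steps)
    best = -np.inf
    for f in fracs:                           # fraction of the energy in cylinder 1
        U1, U2 = f * U, (1 - f) * U
        V1, V2 = fracs * V, (1 - fracs) * V   # all volume splits at once
        best = max(best, np.max(S_gas(U1, V1, N1) + S_gas(U2, V2, N2)))
    return best

print(composite_entropy(U=10.0, V=4.0, N1=1.0, N2=2.0))

At the maximizing split the two gases end up with equal temperature and equal pressure, which is exactly the equilibrium condition you would expect for a movable divider that conducts heat.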

There are many other examples. Did you ever hear of the ‘canonical ensemble’, the ‘microcanonical ensemble’, or the ‘grand canonical ensemble’? Those are famous things in statistical mechanics. We show how our formalism recovers these.

I’m sure there’s much more to be done. But I feel happy to see modern math being put to good use: making the foundations of thermodynamics more precise. Vladimir Arnol’d once wrote:

Every mathematician knows that it is impossible to understand any elementary course in thermodynamics.

I’m not sure our work will help with that—and indeed, it’s possible that once the mathematicians finally understand thermodynamics, physicists won’t understand what the mathematicians are talking about! But at least we’re clearly seeing some more of the mathematical structures that are hinted at, but not fully spelled out, in such an elementary course.

I expect that our work will interact nicely with Simon Willerton’s work on the Legendre transform. The Legendre transform of a concave (or convex) function is widely used in thermostatics, and Simon describes this for functions valued in [-\infty,\infty] using enriched profunctors:

• Simon Willerton, Enrichment and the Legendre–Fenchel transform I, The n-Category Café, April 16, 2014.

• Simon Willerton, Enrichment and the Legendre–Fenchel transform II, The n-Category Café, May 22, 2014.

He also has a paper on this, and you can see him talk about it on YouTube.
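
For reference, the Legendre–Fenchel transform of a function f \colon \mathbb{R}^n \to [-\infty, \infty] is

\displaystyle{ f^*(p) = \sup_{x \in \mathbb{R}^n} \left( \langle p, x \rangle - f(x) \right) }

and suitable versions of this transform relate the entropy to quantities like the free energy and the logarithm of the partition function.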


See all four parts of this series:

Part 1: thermostatic systems and convex sets.

Part 2: composing thermostatic systems.

Part 3: operads and their algebras.

Part 4: the operad for composing thermostatic systems.

12 Responses to Compositional Thermostatics (Part 1)

  1. markovian says:

    Your paper makes a nice read up to section 4. Thereafter, I am lost!

  2. Steve Huntsman says:

    A really slick thing would be to do compositional statistical physics vs thermodynamics per se. Of course the former differs principally in its focus on the partition function, and to a lesser extent in the particulars of a system. Regarding systems, an example par excellence is a collection of Ising-Glauber spins. If y’all can figure out how to make a spin operad (or what IMO is likelier, a sort of asymptotic or approximate operad) from these gizmos that plays nicely with real-space renormalization a la Kadanoff then you will likely have a tool that allows one to talk about “renormalizing” all sorts of things that have heretofore not even been recognized as admitting a renormalization group analysis.

    • John Baez says:

      Our formalism covers statistical physics, both classical and quantum, and we give lots of examples of how. We treat the entropy rather than the partition function as the key ingredient. Spin systems, and gluing together spin systems, should work fine.

      In classical thermodynamics, by taking Legendre transform one can recover the logarithm of the partition function from the entropy. I believe Hong Qian has studied something similar for statistical mechanics. But we don’t get deep into Legendre transforms in this paper. That should come next!
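
      Schematically (with k_B = 1, and modulo sign conventions):

      \displaystyle{ \log Z(\beta) = \sup_U \left( S(U) - \beta U \right) }

      where S(U) is the microcanonical entropy.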

      • Steve Huntsman says:

        Is there a visible route to doing real space/block spin renormalization for a 2D or even 1D Ising model through your setup? That would be wonderful!

  3. Toby Bartels says:

    Recording my guess here before checking (and to receive notifications of additional comments): a nontrivial convex combination of –∞ and +∞ will be –∞. (Reason: suprema are important, whereas you never mentioned infima, so as much as possible you want a convex combination of two suprema to equal the supremum of the convex combinations.)

    • John Baez says:

      You guessed right; now I’ll have to think about whether your explanation of why it’s right is secretly the same as the explanation in our paper (it’s after Definition 11).

  4. Nicholas Sullivan says:

    This makes me wonder if this would be a good use for surreal numbers. If the entropy were allowed to take values in the surreals, then an ideal heat bath could have an energy of E = \mathcal{E}\omega and entropy of S = \beta\mathcal{E}\omega. Then, any finite energy change in the heat bath could be accounted for by adding a real quantity to this, without having to worry about infinities. You would also be able to use the axiom of cancellation again.

  5. Will Perkins says:

    What’s the relationship between these concepts and Jaynes’ principle of maximum entropy (which he used to motivate and re-derive things like the canonical and grand canonical ensemble)? Eg. this classic paper: https://journals.aps.org/pr/abstract/10.1103/PhysRev.106.620

    • John Baez says:

      Jaynes realized, among other things, that entropy maximization in thermostatics is not just a law of physics but a general principle of reasoning in situations of incomplete information. This is widely understood by now (though some still argue about it), so we didn’t even think of mentioning it in our paper. But we probably should, since we’re explaining how entropy maximization works when you combine several systems.

      You might like this post of mine:

      Classical mechanics versus thermodynamics (part 1).

      Here’s the most relevant part:

      The big picture

      Now let’s step back and think about what’s going on.

      Lately I’ve been trying to unify a bunch of ‘extremal principles’, including:

      1) the principle of least action
      2) the principle of least energy
      3) the principle of maximum entropy
      4) the principle of maximum simplicity, or Occam’s razor

      In my post on quantropy I explained how the first three principles fit into a single framework if we treat Planck’s constant as an imaginary temperature. The guiding principle of this framework is

      maximize entropy
      subject to the constraints imposed by what you believe

      And that’s nice, because E. T. Jaynes has made a powerful case for this principle.

      However, when the temperature is imaginary, entropy is so different that it may deserve a new name: say, ‘quantropy’. In particular, it’s complex-valued, so instead of maximizing it we have to look for stationary points: places where its first derivative is zero. But this isn’t so bad. Indeed, a lot of minimum and maximum principles are really ‘stationary principles’ if you examine them carefully.

      What about the fourth principle: Occam’s razor? We can formalize this using algorithmic probability theory. Occam’s razor then becomes yet another special case of

      maximize entropy
      subject to the constraints imposed by what you believe

      once we realize that algorithmic entropy is a special case of ordinary entropy.

      All of this deserves plenty of further thought and discussion—but not today!

  6. Will Perkins says:

    thank you! That’s a very nice post.
