Information Geometry (Part 17)

I’m getting back into information geometry, which is the geometry of the space of probability distributions, studied using tools from information theory. I’ve written a bunch about it already, which you can see here:

Information geometry.

Now I’m fascinated by something new: how symplectic geometry and contact geometry show up in information geometry. But before I say anything about this, let me say a bit about how they show up in thermodynamics. This is more widely discussed, and it’s a good starting point.

Symplectic geometry was born as the geometry of phase space in classical mechanics: that is, the space of possible positions and momenta of a classical system. The simplest example of a symplectic manifold is the vector space \mathbb{R}^{2n}, with n position coordinates q_i and n momentum coordinates p_i.

Symplectic manifolds are always even-dimensional, because we can always cover them with coordinate charts that look like \mathbb{R}^{2n}. When we change coordinates, the splitting of coordinates into positions and momenta turns out to be somewhat arbitrary. For example, the position of a rock on a spring now may determine its momentum a while later, and vice versa. What’s not arbitrary? It’s the so-called ‘symplectic structure’:

\omega = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n

While far from obvious at first, we know by now that the symplectic structure is exactly what needs to be preserved under valid changes of coordinates in classical mechanics! In fact, we can develop the whole formalism of classical mechanics starting from a manifold with a symplectic structure.
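As a small sanity check of that claim, here is a minimal sketch for the rock on a spring with n = 1. The function names spring_flow and omega, and the values chosen for the mass m and stiffness k, are made up for this illustration. The sketch evolves two tangent vectors for a while and checks that \omega = dp \wedge dq gives the same number before and after.

```python
import math

def spring_flow(state, t, m=1.0, k=4.0):
    """Exact time-t evolution of (q, p) for a mass m on a spring of stiffness k."""
    q, p = state
    w = math.sqrt(k / m)  # angular frequency
    c, s = math.cos(w * t), math.sin(w * t)
    return (c * q + s * p / (m * w), -m * w * s * q + c * p)

def omega(u, v):
    """omega = dp ^ dq evaluated on vectors u = (dq, dp) and v = (dq, dp)."""
    return u[1] * v[0] - u[0] * v[1]

# The flow is linear, so it acts on tangent vectors the same way it acts on points.
u, v = (1.0, 0.0), (0.3, 2.0)
t = 0.37
before = omega(u, v)
after = omega(spring_flow(u, t), spring_flow(v, t))
print(before, after)  # equal up to rounding error

# After a quarter period, a pure displacement has turned into pure momentum.
quarter = math.pi / (2 * math.sqrt(4.0 / 1.0))
print(spring_flow((1.0, 0.0), quarter))  # q is nearly 0, p is nearly -m*w = -2
```

The second printout illustrates the remark above about positions and momenta getting mixed: the rock’s position now fixes its momentum a quarter period later.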

Symplectic geometry also shows up in thermodynamics. In thermodynamics we can start with a system in equilibrium whose state is described by some variables q_1, \dots, q_n. Its entropy will be a function of these variables, say

S = f(q_1, \dots, q_n)

We can then take the partial derivatives of entropy and call them something:

\displaystyle{ p_i = \frac{\partial f}{\partial q_i} }

These new variables p_i are said to be ‘conjugate’ to the q_i, and they turn out to be very interesting. For example, if q_i is energy then p_i is ‘coolness’: the reciprocal of temperature. The coolness of a system is its change in entropy per change in energy.
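To make this concrete, here is a worked example under an assumption not made above: a monatomic ideal gas with a fixed number N of particles, where k is Boltzmann’s constant, U is energy and V is volume. Up to an additive constant, its entropy and the two conjugate variables are:

```latex
S = f(U, V) = N k \left( \tfrac{3}{2} \ln U + \ln V \right) + \mathrm{const}

\frac{\partial f}{\partial U} = \frac{3 N k}{2 U} = \frac{1}{T}
\qquad
\frac{\partial f}{\partial V} = \frac{N k}{V} = \frac{P}{T}
```

The last equality in each case uses the ideal gas relations U = \tfrac{3}{2} N k T and P V = N k T. So in this example the variable conjugate to energy is exactly the coolness 1/T, while the variable conjugate to volume is pressure times coolness.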

Often the variables q_i are ‘extensive’: that is, you can measure them only by looking at your whole system and totaling up some quantity. Examples are energy and volume. Then the new variables p_i are ‘intensive’: that is, you can measure them at any one location in your system. Examples are coolness and pressure.

Now for a twist: sometimes we do not know the function f ahead of time. Then we cannot define the p_i as above. We’re forced into a different approach where we treat them as independent quantities, at least until someone tells us what f is.

In this approach, we start with a space \mathbb{R}^{2n} having n coordinates called q_i and n coordinates called p_i. This is a symplectic manifold, with the symplectic structure \omega described earlier!

But what about the entropy? We don’t yet know what it is as a function of the q_i, but we may still want to talk about it. So, we build a space \mathbb{R}^{2n+1} having one extra coordinate S in addition to the q_i and p_i. This new coordinate stands for entropy. And this new space has an important 1-form on it:

\alpha = -dS + p_1 dq_1 + \cdots + p_n dq_n

This is called the ‘contact 1-form’.

This makes \mathbb{R}^{2n+1} into an example of a ‘contact manifold’. Contact geometry is the odd-dimensional partner of symplectic geometry. Just as symplectic manifolds are always even-dimensional, contact manifolds are always odd-dimensional.

What is the point of the contact 1-form? Well, suppose someone tells us the function f relating entropy to the coordinates q_i. Now we know that we want

S = f

and also

\displaystyle{ p_i = \frac{\partial f}{\partial q_i} }

So, we can impose these equations, which pick out a subset of \mathbb{R}^{2n+1}. You can check that this subset, say \Sigma, is an n-dimensional submanifold. But even better, the contact 1-form vanishes when restricted to this submanifold:

\left.\alpha\right|_\Sigma = 0

Let’s see why! Suppose x \in \Sigma and suppose v \in T_x \Sigma is a vector tangent to \Sigma at this point x. It suffices to show

\alpha(v) = 0

Using the definition of \alpha this equation says

\displaystyle{ -dS(v) + \sum_i p_i dq_i(v) = 0 }

But on the surface \Sigma we have

S = f, \qquad  \displaystyle{ p_i = \frac{\partial f}{\partial q_i} }

So, the equation we’re trying to show can be written as

\displaystyle{ -df(v) + \sum_i \frac{\partial f}{\partial q_i} dq_i(v) = 0 }

But this follows from

\displaystyle{ df = \sum_i \frac{\partial f}{\partial q_i} dq_i }

which holds because f is a function only of the coordinates q_i.
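The argument above can also be checked numerically. Here is a minimal sketch, assuming n = 2 and a made-up entropy function; the names f, grad_f, embed, tangent_vector and alpha are hypothetical helpers for this illustration only. It builds tangent vectors to \Sigma by finite differences and evaluates the contact 1-form on them.

```python
import math

def f(q):
    """Hypothetical entropy as a function of two extensive variables (U, V)."""
    U, V = q
    return 1.5 * math.log(U) + math.log(V)

def grad_f(q):
    """Exact partial derivatives of f: the conjugate variables p_i."""
    U, V = q
    return (1.5 / U, 1.0 / V)

def embed(q):
    """The point of Sigma lying over q, as (q_1, q_2, p_1, p_2, S)."""
    return (*q, *grad_f(q), f(q))

def tangent_vector(q, direction, h=1e-6):
    """Central-difference approximation to a tangent vector of Sigma at embed(q)."""
    q_plus = tuple(qi + h * di for qi, di in zip(q, direction))
    q_minus = tuple(qi - h * di for qi, di in zip(q, direction))
    return tuple((a - b) / (2 * h) for a, b in zip(embed(q_plus), embed(q_minus)))

def alpha(point, v):
    """Evaluate alpha = -dS + p_1 dq_1 + p_2 dq_2 on a vector v at the given point."""
    n = 2
    p = point[n:2 * n]   # the p_i coordinates of the base point
    dq = v[:n]           # the dq_i components of the vector
    dS = v[2 * n]        # the dS component of the vector
    return -dS + sum(pi * dqi for pi, dqi in zip(p, dq))

q0 = (2.0, 3.0)
for direction in [(1.0, 0.0), (0.0, 1.0), (0.7, -0.4)]:
    v = tangent_vector(q0, direction)
    print(direction, alpha(embed(q0), v))  # each value is zero up to rounding error

# A vector pointing purely in the S direction is not tangent to Sigma,
# and alpha does not vanish on it.
print(alpha(embed(q0), (0.0, 0.0, 0.0, 0.0, 1.0)))
```

The finite differences are only there to avoid writing out the tangent vectors by hand; with exact tangent vectors the values would be exactly zero, as the proof shows.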

So, any formula for entropy S = f(q_1, \dots, q_n) picks out a so-called ‘Legendrian submanifold’ of \mathbb{R}^{2n+1}: that is, an n-dimensional submanifold such that the contact 1-form vanishes when restricted to this submanifold. And the idea is that this submanifold tells you everything you need to know about a thermodynamic system.

Indeed, V. I. Arnol’d says this was implicitly known to the great founder of statistical mechanics, Josiah Willard Gibbs. Arnol’d calls \mathbb{R}^5 with coordinates energy, entropy, temperature, pressure and volume the ‘Gibbs manifold’, and he proclaims:

Gibbs’ thesis: substances are Legendrian submanifolds of the Gibbs manifold.

This is from here:

• V. I. Arnol’d, Contact geometry: the geometrical method of Gibbs’ thermodynamics, Proceedings of the Gibbs Symposium (New Haven, CT, 1989), AMS, Providence, Rhode Island, 1990.

A bit more detail

Now I want to say everything again, with a bit of extra detail, assuming more familiarity with manifolds. Above I was using \mathbb{R}^n with coordinates q_1, \dots, q_n to describe the ‘extensive’ variables of a thermodynamic system. But let’s be a bit more general and use any smooth n-dimensional manifold Q. Even if Q is a vector space, this viewpoint is nice because it’s manifestly coordinate-independent!

So: starting from Q we build the cotangent bundle T^\ast Q. A point in the cotangent bundle describes both extensive variables, namely q \in Q, and ‘intensive’ variables, namely a cotangent vector p \in T^\ast_q Q.

The manifold T^\ast Q has a 1-form \theta on it called the tautological 1-form. We can describe it as follows. Given a tangent vector v \in T_{(q,p)} T^\ast Q we have to say what \theta(v) is. Using the projection

\pi \colon T^\ast Q \to Q

we can project v down to a tangent vector d\pi(v) at the point q. But the 1-form p eats tangent vectors at q and spits out numbers! So, we set

\theta(v) = p(d\pi(v))

This is sort of mind-boggling at first, but it’s worth pondering until it makes sense. It helps to work out what \theta looks like in local coordinates. Starting with any local coordinates q_i on an open set of Q, we get local coordinates q_i, p_i on the cotangent bundle of this open set in the usual way. On this open set you then get

\theta = p_1 dq_1 + \cdots + p_n dq_n

This is a standard calculation, which is really worth doing!
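In case it helps, here is a sketch of that calculation. The component names a_i and b_i are just notation introduced for this sketch. Write a tangent vector v at (q,p) in the coordinate basis, note that the projection forgets the p directions, and expand the covector p in terms of its components p_i:

```latex
v = \sum_i a_i \frac{\partial}{\partial q_i} + \sum_i b_i \frac{\partial}{\partial p_i}

d\pi(v) = \sum_i a_i \frac{\partial}{\partial q_i}, \qquad p = \sum_i p_i \, dq_i

\theta(v) = p(d\pi(v)) = \sum_i p_i a_i = \sum_i p_i \, dq_i(v)
```

In the last expression dq_i means the 1-form on T^\ast Q, which gives dq_i(v) = a_i. Since this holds for every v, we get the formula for \theta.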

It follows that we can define a symplectic structure \omega by

\omega = d \theta

and get this formula in local coordinates:

\omega = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n
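Here is the one-line check of that formula, using only the product rule for the exterior derivative and the fact that d^2 = 0:

```latex
\omega = d\theta = \sum_i d(p_i \, dq_i) = \sum_i \left( dp_i \wedge dq_i + p_i \, d(dq_i) \right) = \sum_i dp_i \wedge dq_i
```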

Now, suppose we choose a smooth function

f \colon Q \to \mathbb{R}

which describes the entropy. We get a 1-form df, which we can think of as a map

df \colon Q \to T^\ast Q

assigning to each choice q of extensive variables the pair (q,p) of extensive and intensive variables where

p = df_q

The image of the map df is a ‘Lagrangian submanifold’ of T^\ast Q: that is, an n-dimensional submanifold \Lambda such that

\left.\omega\right|_{\Lambda} = 0
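To see why \omega vanishes on \Lambda, here is a sketch in local coordinates, writing f_{ij} for the second partial derivatives of f (notation introduced here). On \Lambda we have p_i = \partial f / \partial q_i, so:

```latex
\left. dp_i \right|_\Lambda = \sum_j f_{ij} \, dq_j, \qquad f_{ij} = \frac{\partial^2 f}{\partial q_i \partial q_j}

\left. \omega \right|_\Lambda = \sum_{i,j} f_{ij} \, dq_j \wedge dq_i = 0
```

The sum vanishes because f_{ij} is symmetric in i and j while dq_j \wedge dq_i is antisymmetric. And \Lambda is n-dimensional because the map df is a section of \pi, so it embeds Q in T^\ast Q.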

Lagrangian submanifolds are to symplectic geometry as Legendrian submanifolds are to contact geometry! What we’re seeing here is that if Gibbs had preferred symplectic geometry, he could have described substances as Lagrangian submanifolds rather than Legendrian submanifolds. But this approach would only keep track of the derivatives of entropy, df, not the actual value of the entropy function f.

If we prefer to keep track of the actual value of f using contact geometry, we can do that. For this we add an extra dimension to T^\ast Q and form the manifold T^\ast Q \times \mathbb{R}. The extra dimension represents entropy, so we’ll use S as our name for the coordinate on \mathbb{R}.

We can make T^\ast Q \times \mathbb{R} into a contact manifold with contact 1-form

\alpha = -d S + \theta

In local coordinates we get

\alpha = -dS + p_1 dq_1 + \cdots + p_n dq_n

just as we had earlier. And just as before, if we choose a smooth function f \colon Q \to \mathbb{R} describing entropy, the subset

\Sigma = \{(q,p,S) \in T^\ast Q \times \mathbb{R} : \; S = f(q), \; p = df_q \}

is a Legendrian submanifold of T^\ast Q \times \mathbb{R}.

Okay, this concludes my lightning review of symplectic and contact geometry in thermodynamics! Next time I’ll talk about something a bit less well understood: how they show up in statistical mechanics.

10 Responses to Information Geometry (Part 17)

  1. Allen Knutson says:

    Vector fields on X have two multiplications: if we think of them as first-order differential operators, those operators multiply noncommutatively to give higher-order differential operators \mathcal D_X. Or we can think of them as fiberwise-linear functions on T^* X, which multiply commutatively to give all (algebraic) functions on T^* X, the second multiplication being an associated-graded degeneration of the first.

    Do you know a corresponding noncommutative picture of the contact manifold T^*Q \times \mathbb R?

    BTW one of your LaTeXs (after “We get a 1-form”) is tardy.

    • Toby Bartels says:

      Tardy … I see what you did there.

      So the first multiplication is one in which (∂/∂𝑥)² means the second partial derivative (so (∂/∂𝑥)²𝑓 = ∂²𝑓/∂𝑥²), while the second is one in which (∂/∂𝑥)² means the square of the first derivative (so (∂/∂𝑥)²𝑓 = (∂𝑓/∂𝑥)², which we can also think of as acting on d𝑓 rather than on 𝑓 itself). And so (𝑥 ∂/∂𝑥)² expands to 𝑥 ∂/∂𝑥 + 𝑥² (∂/∂𝑥)² in the first case but just to 𝑥² (∂/∂𝑥)² in the second case. We get the algebraic structure of the second multiplication by purging all terms with the wrong grade from the structure of the first multiplication. (And we get the Lie bracket, another important way to multiply vector fields, by antisymmetrizing the first multiplication; this is the only operation that keeps the result within vector fields.)

      Anyway, the contact manifold has, in addition to coordinates like 𝑥 and ∂/∂𝑥, the coordinate 𝑆. Although we don't have any particular equation in mind yet, we anticipate expressing 𝑆 as a function of the coordinates like 𝑥. So if we want to multiply ∂/∂𝑥 and 𝑆 in a way analogous to the first kind of multiplication, then I think that the answer should involve a symbol ∂𝑆/∂𝑥, specifically ∂/∂𝑥 𝑆 = ∂𝑆/∂𝑥 + 𝑆 ∂/∂𝑥. More explicitly, if 𝑓 is a function on 𝑄 × ℝ, then 𝑆 𝑓, ∂/∂𝑥 𝑓, and 𝑆 ∂/∂𝑥 𝑓 are also such functions, while ∂/∂𝑥 𝑆 𝑓 = ∂𝑆/∂𝑥 𝑓 + 𝑆 ∂/∂𝑥 𝑓 cannot be interpreted this way. But it lies in wait; as soon as you specify a Legendrian submanifold, then all of these can be interpreted as functions on 𝑄. In this way, we know how the multiplication operation works. And it's still graded, and it's still true that dropping all terms of the wrong grade recovers the algebra of functions on the contact manifold.

    • John Baez says:

      Allen’s question could be related to my question about ‘quantizing’ thermodynamics, in the sense of replacing Poisson brackets on the symplectic manifold of extensive and intensive variables by commutators. I wrote about this here:

      Classical mechanics versus thermodynamics (part 2), 23 January 2012.

      I didn’t make much progress because I couldn’t figure out the physical significance of this sort of ‘quantization’. Now that I’m spending more time on thermodynamics, I’m inclined to plunge ahead and see what this quantization gives without worrying too much about what it means. That might be the way to figure out what it means.

      All of that was phrased in terms of symplectic manifolds, but one should also be able to quantize contact manifolds, and I think the math Allen points out should help. I bet people have already quantized contact manifolds—I should check.

  2. Toby Bartels says:

    Typo: you have a df appearing as $late df$.

  3. Miguel Ángel García Ariza says:

    This approach to equilibrium thermodynamics has been my interest for ten years. I have not found a good use for it, from the physical point of view. Can you think of any actual computational or conceptual advantage of using this elegant approach to equilibrium thermodynamics? Can equilibrium thermodynamics really be further advanced, or has everything been said?

    A second, more technical question: the construction of the “phase space” (the cotangent bundle of the space of equilibrium states) is rather natural. However, the construction of the analogous contact manifold does not seem that natural to me: we have to add “by hand” the extra dimension \mathbb{R}. Could the first jet bundle of the space of equilibrium states be the “natural” 2n+1-dimensional space?

    • John Baez says:

      One reason it’s hard to use contact geometry to do new things in equilibrium thermodynamics is that Gibbs already knew contact geometry, implicitly, when formulating his approach to thermodynamics. This at least is V. I. Arnol’d’s claim.

      However, there’s been a lot of mathematical work on contact geometry since Gibbs, so there could be ways to use theorems about contact geometry to do something new. Unfortunately a lot of the deeper theorems are about contact manifolds not of the form \mathbb{R}^{2n+1}. But maybe even these more complicated manifolds have some relevance to thermodynamics, just as more complicated symplectic manifolds (not even cotangent bundles) turn out to be relevant to classical mechanics.

      I have other ideas for applying contact and symplectic geometry to thermodynamics in new ways, but I’ll talk about those in future blog posts! I’m reviewing this old stuff to set the stage.

      • Miguel Ángel García Ariza says:

        Thank you for your reply! I am looking forward to reading about these new ideas.

    • John Baez says:

      Miguel wrote:

      A second, more technical question: the construction of the “phase space” (the cotangent bundle of the space of equilibrium states) is rather natural. However, the construction of the analogous contact manifold does not seem that natural to me: we have to add “by hand” the extra dimension \mathbb{R}. Could the first jet bundle of the space of equilibrium states be the “natural” 2n+1-dimensional space?

      Thanks! Yes, that’s a somewhat nicer way to think about what’s going on. As you can see from my article, the extra dimension \mathbb{R} keeps track of the value of entropy, while a point in T^\ast Q keeps track of a point q \in Q and the differential of the entropy at that point. Putting these all together we get a 1-jet.

      I wouldn’t say the 1-jet approach is “more natural”, not in a technical sense anyway, because the first jet bundle of a manifold Q is naturally isomorphic to the direct sum of its cotangent bundle and a trivial line bundle. And here by “trivial” I don’t mean just “trivializable”: I mean trivialized, equipped with a natural trivialization.

      This is because the first jet of a function f : Q \to \mathbb{R} at some point p \in Q splits naturally into the differential d f \in T_p^\ast Q and the value f(p) \in \mathbb{R}. So, we get a natural isomorphism between the first jet bundle J^1(Q) and T^\ast Q \times \mathbb{R} as bundles over Q.

      Jet bundles become a lot more exciting with the second jet bundle: this splits as a direct sum of 3 vector bundles, but not naturally.

      But still, the jet bundle viewpoint is nice!

  4. […] Last time I sketched how two related forms of geometry, symplectic and contact geometry, show up in thermodynamics. Today I want to explain how they show up in probability theory. […]
