I showed you last time that in many branches of physics—including classical mechanics and thermodynamics—we can see our task as minimizing or maximizing some function. Today I want to show how we get from that task to symplectic geometry.
So, suppose we have a smooth function

$$f \colon Q \to \mathbb{R}$$

where $Q$ is some manifold. A minimum or maximum of $f$ can only occur at a point where

$$df = 0.$$

Here $df$ is the differential of $f$, which is a 1-form on $Q$. If we pick local coordinates $q^i$ in some open set of $Q$, then we have

$$df = \frac{\partial f}{\partial q^i} \, dq^i$$

(summing over repeated indices), and these derivatives $\partial f/\partial q^i$ are very interesting. Let’s see why:
Example 1. In classical mechanics, consider a particle on a manifold $M$. Suppose the particle starts at some fixed position at some fixed time. Suppose that it ends up at the position $q \in M$ at time $t$. Then the particle will seek to follow a path that minimizes the action given these conditions. Assume this path exists and is unique. The action of this path is then called Hamilton’s principal function, $W(q,t)$.

Let

$$Q = M \times \mathbb{R}$$

and assume Hamilton’s principal function is a smooth function

$$W \colon Q \to \mathbb{R}.$$

We then have

$$dW = p_i \, dq^i - H \, dt$$

where $q^i$ are local coordinates on $M$,

$$p_i = \frac{\partial W}{\partial q^i}$$

is called the momentum in the $i$th direction, and

$$H = -\frac{\partial W}{\partial t}$$

is called the energy. The minus signs here are basically just a mild nuisance. Time is different from space, and in special relativity the difference comes from a minus sign, but I don’t think that’s the explanation here. We could get rid of the minus signs by working with negative energy, but it’s not such a big deal.
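If you like checking such things by computer, here is a minimal SymPy sketch. For a free particle on a line, Hamilton’s principal function has a well-known closed form, which I simply assume here rather than derive: $W = \frac{m(q - q_0)^2}{2(t - t_0)}$. The code confirms that the momentum $p = \partial W/\partial q$ and energy $H = -\partial W/\partial t$ defined above satisfy the familiar free-particle relation $H = p^2/2m$.

```python
import sympy as sp

m, q, q0, t, t0 = sp.symbols('m q q_0 t t_0', positive=True)

# Hamilton's principal function for a free particle on a line
# (a standard closed-form result, assumed here for illustration).
W = m*(q - q0)**2 / (2*(t - t0))

p = sp.diff(W, q)    # momentum:  p =  dW/dq
H = -sp.diff(W, t)   # energy:    H = -dW/dt

print(p)                             # m*(q - q_0)/(t - t_0)
print(sp.simplify(H - p**2/(2*m)))   # 0, so H = p^2/(2m) as expected
```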
Example 2. In thermodynamics, consider a system with internal energy $U$ and volume $V$. Then the system will choose a state that maximizes the entropy given these constraints. Assume this state exists and is unique. Call the entropy of this state $S(U,V)$.

Let

$$Q = \mathbb{R}^2$$

and assume the entropy is a smooth function

$$S \colon Q \to \mathbb{R}.$$

We then have

$$dS = \frac{1}{T} \, dU + \frac{P}{T} \, dV$$

where $T$ is the temperature of the system, and $P$ is the pressure. The slight awkwardness of this formula makes people favor other setups.
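Again we can check this on a concrete system. Take the textbook entropy of a monatomic ideal gas (the Sackur–Tetrode formula, keeping only its $U$- and $V$-dependence); this is an assumption made purely for illustration. Reading $\partial S/\partial U = 1/T$ and $\partial S/\partial V = P/T$ off the formula for $dS$ recovers the familiar facts $U = \frac{3}{2} N k T$ and $PV = NkT$:

```python
import sympy as sp

U, V, N, k = sp.symbols('U V N k', positive=True)

# Entropy of a monatomic ideal gas, up to an additive constant
# (textbook Sackur-Tetrode form, assumed just for this check).
S = N*k*(sp.Rational(3, 2)*sp.log(U) + sp.log(V))

T = 1/sp.diff(S, U)   # since dS/dU = 1/T
P = T*sp.diff(S, V)   # since dS/dV = P/T

print(sp.simplify(U - sp.Rational(3, 2)*N*k*T))   # 0:  U = (3/2) N k T
print(sp.simplify(P*V - N*k*T))                   # 0:  P V = N k T
```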
Example 3. In thermodynamics there are many setups for studying the same system using different minimum or maximum principles. One of the most popular is called the energy scheme. If internal energy increases with increasing entropy, as is usually the case, this scheme is equivalent to the one we just saw.

In the energy scheme we fix the entropy $S$ and volume $V$. Then the system will choose a state that minimizes the internal energy given these constraints. Assume this state exists and is unique. Call the internal energy of this state $U(S,V)$.

Let

$$Q = \mathbb{R}^2$$

and assume the internal energy is a smooth function

$$U \colon Q \to \mathbb{R}.$$

We then have

$$dU = T \, dS - P \, dV$$

where

$$T = \frac{\partial U}{\partial S}$$

is the temperature, and

$$P = -\frac{\partial U}{\partial V}$$

is the pressure. You’ll note the formulas here closely resemble those in Example 1!
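Here is the same toy ideal gas in the energy scheme: solving the entropy above for $U(S,V)$ gives $U \propto V^{-2/3} e^{2S/(3Nk)}$, again just an illustrative assumption. Besides computing $T$ and $P$, the code checks a Maxwell relation: since the mixed partial derivatives of $U$ commute, $\partial T/\partial V = -\partial P/\partial S$.

```python
import sympy as sp

S, V, N, k, c = sp.symbols('S V N k c', positive=True)

# The toy ideal-gas entropy from Example 2, solved for U(S, V)
# (an assumed concrete function, just to have something to differentiate).
U = c * V**sp.Rational(-2, 3) * sp.exp(2*S/(3*N*k))

T = sp.diff(U, S)    # temperature:  T =  dU/dS
P = -sp.diff(U, V)   # pressure:     P = -dU/dV

# A Maxwell relation: mixed partials of U commute, so dT/dV = -dP/dS.
print(sp.simplify(sp.diff(T, V) + sp.diff(P, S)))   # 0
```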
Example 4. Here are the four most popular schemes for thermodynamics:

• If we fix the entropy $S$ and volume $V$, the system will choose a state that minimizes the internal energy $U$.

• If we fix the entropy $S$ and pressure $P$, the system will choose a state that minimizes the enthalpy $H$.

• If we fix the temperature $T$ and volume $V$, the system will choose a state that minimizes the Helmholtz free energy $F$.

• If we fix the temperature $T$ and pressure $P$, the system will choose a state that minimizes the Gibbs free energy $G$.

These quantities are related by a pack of similar-looking formulas, from which we may derive a mind-numbing little labyrinth of Maxwell relations. But for now, all we need to know is that all these approaches to thermodynamics are equivalent given some reasonable assumptions, and all the formulas and relations can be derived using the Legendre transformation trick I explained last time. So, I won’t repeat what we did in Example 3 for all these other cases!
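Still, here is a sketch of one such Legendre transformation, done by computer on the toy $U(S,V)$ from Example 3. Assuming the standard definition $F = U - TS$ for the Helmholtz free energy, we solve $T = \partial U/\partial S$ for $S$, re-express everything in terms of $(T,V)$, and check that $-\partial F/\partial T$ gives back $S$ while $-\partial F/\partial V$ gives back $P$:

```python
import sympy as sp

S, V, T, N, k, c = sp.symbols('S V T N k c', positive=True)

# Same toy internal energy as in Example 3 (an assumed concrete function).
U = c * V**sp.Rational(-2, 3) * sp.exp(2*S/(3*N*k))

# Solve T = dU/dS for S, so we can re-express everything in terms of (T, V).
S_of_TV = sp.solve(sp.Eq(T, sp.diff(U, S)), S)[0]

# Helmholtz free energy F(T, V) = U - T S: the Legendre transformation.
F = (U - T*S).subs(S, S_of_TV)

P = -sp.diff(U, V).subs(S, S_of_TV)            # pressure, from the energy scheme

print(sp.simplify(-sp.diff(F, T) - S_of_TV))   # 0:  -dF/dT recovers S
print(sp.simplify(-sp.diff(F, V) - P))         # 0:  -dF/dV recovers P
```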
Example 5. In classical statics, consider a particle on a manifold $Q$. This particle will seek to minimize its potential energy $V(q)$, which we’ll assume is some smooth function of its position $q \in Q$. We then have

$$dV = \frac{\partial V}{\partial q^i} \, dq^i$$

where $q^i$ are local coordinates on $Q$, and

$$F_i = -\frac{\partial V}{\partial q^i}$$

is called the force in the $i$th direction.
Conjugate variables
So, the partial derivatives of the quantity we’re trying to minimize or maximize are very important! As a result, we often want to give them equal status as independent quantities in their own right. When we do, we call them ‘conjugate variables’.
To make this precise, consider the cotangent bundle $T^* Q$, which has local coordinates $q^i$ (coming from the coordinates on $Q$) and $p_i$ (the corresponding coordinates on each cotangent space). We then call $p_i$ the conjugate variable of the coordinate $q^i$.

Given a smooth function

$$f \colon Q \to \mathbb{R}$$

the 1-form $df$ can be seen as a section of the cotangent bundle. The graph of this section is defined by the equation

$$p_i = \frac{\partial f}{\partial q^i}$$

and this equation ties together two intuitions about ‘conjugate variables’: as coordinates on the cotangent bundle, and as partial derivatives of the quantity we’re trying to minimize or maximize.
The tautological 1-form
There is a lot to say here, especially about Legendre transformations, but I want to hasten on to a bit of symplectic geometry. And for this we need the ‘tautological 1-form’ on $T^* Q$.

We can think of $df$ as a map

$$df \colon Q \to T^* Q$$

sending each point $q \in Q$ to the point $(q,p) \in T^* Q$, where $p$ is defined by the equation we just saw:

$$p_i = \frac{\partial f}{\partial q^i}.$$

Using this map, we can pull back any 1-form on $T^* Q$ to get a 1-form on $Q$. What 1-form on $Q$ might we like to get? Why, $df$, of course!

Amazingly, there’s a 1-form $\alpha$ on $T^* Q$ such that when we pull it back using the map $df$, we get the 1-form $df$—no matter what smooth function $f$ we started with!

Thanks to this wonderfully tautological property, $\alpha$ is called the tautological 1-form on $T^* Q$. You should check that it’s given by the formula

$$\alpha = p_i \, dq^i.$$

If you get stuck, try this.
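In coordinates the check is almost vacuous, which is exactly the point. Here is a SymPy sketch on $Q = \mathbb{R}^2$ with an arbitrarily chosen $f$: pulling $\alpha = p_i \, dq^i$ back along the map $df$ means substituting $p_i = \partial f/\partial q^i$ into its coefficients, and that substitution produces the coefficients of $df$ on the nose.

```python
import sympy as sp

q1, q2 = sp.symbols('q1 q2')

# An arbitrary smooth function f on Q = R^2 (any expression works here).
f = sp.sin(q1)*sp.exp(q2) + q1**2*q2

# The map df: Q -> T*Q sends (q1, q2) to (q1, q2, p1, p2) with p_i = df/dq_i,
# so pulling back alpha = p1 dq1 + p2 dq2 substitutes these p_i:
alpha_pulled_back = [sp.diff(f, q1), sp.diff(f, q2)]

# ...and the (dq1, dq2)-coefficients of df itself are the same derivatives:
df = [sp.diff(f, q1), sp.diff(f, q2)]

print(alpha_pulled_back == df)   # True: the pullback of alpha is df
```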
So, if we want to see how much $f$ changes as we move along a path in $Q$, we can do this in three equivalent ways:

• Evaluate $f$ at the endpoint of the path and subtract off $f$ at the starting-point.

• Integrate the 1-form $df$ along the path.

• Use $df \colon Q \to T^* Q$ to map the path over to $T^* Q$, and then integrate $\alpha$ over this path in $T^* Q$.

The last method is equivalent thanks to the ‘tautological’ property of $\alpha$. It may seem overly convoluted, but it shows that if we work in $T^* Q$, where the conjugate variables are accorded equal status, everything we want to know about the change in $f$ is contained in the 1-form $\alpha$, no matter which function $f$ we decide to use!
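Here is a quick symbolic check of the first two methods on $Q = \mathbb{R}^2$, with a polynomial $f$ and a concrete path chosen purely for illustration. (The third method would produce exactly the same integrand as the second, by the tautological substitution $p_i = \partial f/\partial q^i$.)

```python
import sympy as sp

s = sp.symbols('s')
q1, q2 = sp.symbols('q1 q2')

f = q1**2*q2 + q2**3            # an illustrative smooth function on Q = R^2
path = {q1: s**2, q2: 1 - s}    # a path in Q, parametrized by s in [0, 1]

# Method 1: evaluate f at the endpoints of the path and subtract.
endpoint_difference = f.subs({q1: 1, q2: 0}) - f.subs({q1: 0, q2: 1})

# Method 2: integrate the 1-form df along the path.
integrand = sum(sp.diff(f, q).subs(path) * sp.diff(path[q], s)
                for q in (q1, q2))
line_integral = sp.integrate(integrand, (s, 0, 1))

print(sp.simplify(endpoint_difference - line_integral))   # 0
```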
So, in this sense, $\alpha$ knows everything there is to know about the change in Hamilton’s principal function in classical mechanics, or the change in entropy in thermodynamics… and so on!
But this means it must know about things like Hamilton’s equations, and the Maxwell relations.
The symplectic structure
We saw last time that the fundamental equations of classical mechanics and thermodynamics—Hamilton’s equations and the Maxwell relations—are mathematically just the same. They both say simply that partial derivatives commute:

$$\frac{\partial^2 f}{\partial q^i \, \partial q^j} = \frac{\partial^2 f}{\partial q^j \, \partial q^i}$$

where $f$ is the function we’re trying to minimize or maximize.

I also mentioned that this fact—the commuting of partial derivatives—can be stated in an elegant coordinate-free way:

$$d^2 f = 0.$$

Perhaps I should remind you of the proof:

$$d^2 f = d \left( \frac{\partial f}{\partial q^i} \, dq^i \right) = \frac{\partial^2 f}{\partial q^j \, \partial q^i} \, dq^j \wedge dq^i$$

but

$$dq^j \wedge dq^i$$

changes sign when we switch $i$ and $j$, while

$$\frac{\partial^2 f}{\partial q^j \, \partial q^i}$$

does not, so $d^2 f = 0$. It’s just a wee bit more work to show that conversely, starting from $d^2 f = 0$, it follows that the mixed partials must commute.
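If you want to see the coefficients of $d(df)$ vanish by brute force, here is a short SymPy check on $\mathbb{R}^3$ with an arbitrarily chosen $f$: the antisymmetrized second derivatives, which are exactly the components of $d^2 f$, all come out zero.

```python
import sympy as sp

q = sp.symbols('q1 q2 q3')
f = sp.sin(q[0])*q[1]**2 + sp.exp(q[2]*q[0])   # any smooth f will do

# Components of d(df): the antisymmetrized mixed partials, one per pair i < j.
coeffs = [sp.diff(f, qi, qj) - sp.diff(f, qj, qi)
          for i, qi in enumerate(q) for qj in q[i+1:]]
print(coeffs)   # [0, 0, 0] -- so d^2 f = 0
```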
How can we state this fact using the tautological 1-form $\alpha$? I said that using the map

$$df \colon Q \to T^* Q$$

we can pull back $\alpha$ to $Q$ and get $df$. But pulling back commutes with the $d$ operator! So, if we pull back $d\alpha$, we get $d(df)$. But

$$d(df) = d^2 f = 0.$$

So, $d\alpha$ has the magical property that when we pull it back to $Q$, we always get zero, no matter what $f$ we choose!

This magical property captures Hamilton’s equations, the Maxwell relations and so on—for all choices of $f$ at once. So it shouldn’t be surprising that the 2-form

$$\omega = d\alpha$$

is colossally important: it’s the famous symplectic structure on the so-called phase space $T^* Q$.

Well, actually, most people prefer to work with

$$\omega = -d\alpha.$$

It seems this whole subject is a monument of austere beauty… covered with minus signs, like bird droppings.
Example 6. In classical mechanics, let

$$Q = M \times \mathbb{R}$$

as in Example 1. If $Q$ has local coordinates $q^i$ and $t$, then $T^* Q$ has these along with the conjugate variables as coordinates. As we explained, it causes little trouble to call these conjugate variables by the same names we used for the partial derivatives of $W$: namely, $p_i$ and $-H$. So, we have

$$\alpha = p_i \, dq^i - H \, dt$$

and thus

$$\omega = -d\alpha = dq^i \wedge dp_i + dH \wedge dt.$$
Example 7. In thermodynamics, let

$$Q = \mathbb{R}^2$$

as in Example 3. If $Q$ has coordinates $S$ and $V$, then the conjugate variables deserve to be called $T$ and $-P$. So, we have

$$\alpha = T \, dS - P \, dV$$

and

$$\omega = -d\alpha = dS \wedge dT + dP \wedge dV.$$

You’ll see that in these formulas for $\omega$, variables get paired with their conjugate variables. That’s nice.
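If you don’t trust my wedge-product bookkeeping (given all those minus signs, you shouldn’t), here is a small hand-rolled exterior derivative in coordinates, applied to the thermodynamic 1-form of Example 7, with $(S, V, T, P)$ treated as independent coordinates on $T^* Q$:

```python
import sympy as sp

S, V, T, P = sp.symbols('S V T P')   # independent coordinates on T*Q
coords = [S, V, T, P]

# alpha = T dS - P dV, stored as its coefficient list
# in the coordinate basis (dS, dV, dT, dP):
alpha = [T, -P, 0, 0]

# Exterior derivative of a 1-form: (d alpha)_{ij} = d(alpha_j)/dx_i - d(alpha_i)/dx_j.
d_alpha = {(coords[i], coords[j]):
           sp.diff(alpha[j], coords[i]) - sp.diff(alpha[i], coords[j])
           for i in range(4) for j in range(i + 1, 4)}

print({pair: c for pair, c in d_alpha.items() if c != 0})
# {(S, T): -1, (V, P): 1}, i.e. d(alpha) = dT ^ dS - dP ^ dV,
# so omega = -d(alpha) = dS ^ dT + dP ^ dV, matching the formula above.
```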
But let me expand on what we just saw, since it’s important. And let me talk about $d\alpha$ without tossing in that extra sign.

What we saw is that the 2-form $d\alpha$ is a ‘measure of noncommutativity’. When we pull $d\alpha$ back to $Q$, we get zero. This says that partial derivatives commute—and this gives Hamilton’s equations, the Maxwell relations, and all that. But up in $T^* Q$, $d\alpha$ is not zero. And this suggests that there’s some built-in noncommutativity hiding in phase space!
Indeed, we can make this very precise. Consider a little parallelogram up in $T^* Q$. Suppose we integrate the 1-form $\alpha$ up the left edge and across the top. Do we get the same answer if we integrate it across the bottom edge and then up the right?

No, not necessarily! The difference is the same as the integral of $\alpha$ all the way around the parallelogram. By Stokes’ theorem, this is the same as integrating $d\alpha$ over the parallelogram. And there’s no reason that should give zero.
However, suppose we got our parallelogram in $T^* Q$ by taking a parallelogram in $Q$ and applying the map

$$df \colon Q \to T^* Q.$$

Then the integral of $\alpha$ around our parallelogram would be zero, since it would equal the integral of $df$ around a parallelogram in $Q$… and that’s the change in $f$ as we go around a loop from some point to… itself!

And indeed, the fact that a function $f$ doesn’t change when we go around a parallelogram is precisely what makes

$$d^2 f = 0.$$

So the story all fits together quite nicely.
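Both halves of this story can be checked by direct computation. In the sketch below, an illustration on $T^* \mathbb{R}^2$ with $f$ and the parallelograms chosen arbitrarily, a generic parallelogram up in $T^* Q$ gives a nonzero loop integral of $\alpha = p_i \, dq^i$, while a parallelogram lifted from $Q$ along the map $df$ gives exactly zero.

```python
import sympy as sp

s = sp.symbols('s')
q1, q2 = sp.symbols('q1 q2')
f = q1**2*q2 + q2**3   # the same illustrative f as before

def integrate_alpha(corners):
    """Integrate alpha = p1 dq1 + p2 dq2 around a closed polygon in T*Q,
    whose vertices are (q1, q2, p1, p2) tuples joined by straight edges."""
    total = 0
    for a, b in zip(corners, corners[1:] + corners[:1]):
        pt = [aa + s*(bb - aa) for aa, bb in zip(a, b)]   # edge from a to b
        integrand = pt[2]*(b[0] - a[0]) + pt[3]*(b[1] - a[1])
        total += sp.integrate(integrand, (s, 0, 1))
    return sp.simplify(total)

# A generic parallelogram up in T*Q, lying in the (q1, p1) plane:
# the loop integral equals the integral of d(alpha) over it, and is not zero.
para = [(0, 0, 0, 0), (1, 0, 0, 0), (1, 0, 1, 0), (0, 0, 1, 0)]
print(integrate_alpha(para))   # -1

# Now lift a parallelogram from Q to the graph of df, edge by edge,
# setting p_i = df/dq_i along each edge, and integrate alpha there.
def integrate_alpha_on_graph(cornersQ):
    total = 0
    for a, b in zip(cornersQ, cornersQ[1:] + cornersQ[:1]):
        qs = [aa + s*(bb - aa) for aa, bb in zip(a, b)]
        p = [sp.diff(f, q).subs({q1: qs[0], q2: qs[1]}) for q in (q1, q2)]
        integrand = p[0]*(b[0] - a[0]) + p[1]*(b[1] - a[1])
        total += sp.integrate(integrand, (s, 0, 1))
    return sp.simplify(total)

print(integrate_alpha_on_graph([(0, 0), (1, 0), (1, 1), (0, 1)]))   # 0
```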
The big picture
I’ve tried to show you that the symplectic structure on the phase spaces of classical mechanics, and the lesser-known but utterly analogous one on the phase spaces of thermodynamics, is a natural outgrowth of utterly trivial reflections on the process of minimizing or maximizing a function $f$ on a manifold $Q$.

The first derivative test tells us to look for points with

$$df = 0$$

while the commutativity of partial derivatives says that

$$d^2 f = 0$$

everywhere—and this gives Hamilton’s equations and the Maxwell relations. The 1-form $df$ is the pullback of the tautological 1-form $\alpha$ on $T^* Q$, and similarly $d^2 f$ is the pullback of the symplectic structure $d\alpha$. The fact that

$$d\alpha \ne 0$$

says that $T^* Q$ holds noncommutative delights, almost like a world where partial derivatives no longer commute! But of course we still have

$$d(d\alpha) = 0$$

everywhere, and this becomes part of the official definition of a symplectic structure.
All very simple. I hope, however, the experts note that to see this unified picture, we had to avoid the most common approaches to classical mechanics, which start with either a ‘Hamiltonian’

$$H \colon T^* M \to \mathbb{R}$$

or a ‘Lagrangian’

$$L \colon T M \to \mathbb{R}.$$

Instead, we started with Hamilton’s principal function

$$W \colon Q \to \mathbb{R}$$

where $Q = M \times \mathbb{R}$ is not the usual configuration space describing possible positions for a particle, but the ‘extended’ configuration space, which also includes time. Only this way do Hamilton’s equations, like the Maxwell relations, become a trivial consequence of the fact that partial derivatives commute.
But what about those ‘noncommutative delights’? First, there’s a noncommutative Poisson bracket operation on functions on $T^* Q$. This makes the functions into a so-called Poisson algebra. In classical mechanics of a point particle on the line, for example, it’s well-known that we have

$$\{q, p\} = 1.$$

In thermodynamics, the analogous relations

$$\{S, T\} = 1, \qquad \{V, P\} = -1$$

seem sadly little-known. But you can see them here, for example:

• M. J. Peterson, Analogy between thermodynamics and mechanics, American Journal of Physics 47 (1979), 488–490.

at least up to one of those pesky minus signs! We can use these Poisson brackets to study how one thermodynamic variable changes as we slowly change another, staying close to equilibrium all along.
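Here is a minimal implementation of the canonical Poisson bracket on the thermodynamic phase space, using the conjugate-variable assignments of Example 7 ($T$ conjugate to $S$, and $-P$ conjugate to $V$) and the sign convention $\{q, p\} = 1$; bracket conventions differ between authors, so treat the signs as an assumption.

```python
import sympy as sp

S, V, T, P = sp.symbols('S V T P')

# Conjugate pairs on the thermodynamic phase space of Example 7.
# The momentum conjugate to V is -P, which shows up as the -1 below.
pairs = [(S, T, +1), (V, P, -1)]

def poisson(F, G):
    """Canonical Poisson bracket {F, G} in the coordinates above."""
    return sum(sign*(sp.diff(F, q)*sp.diff(G, p) - sp.diff(F, p)*sp.diff(G, q))
               for q, p, sign in pairs)

print(poisson(S, T))                  # 1
print(poisson(V, P))                  # -1
print(poisson(S, P), poisson(V, T))   # 0 0
```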
Second, we can go further and ‘quantize’ the functions on $T^* Q$. This means coming up with an associative but noncommutative product of these functions that mimics the Poisson bracket to some extent. In the case of a particle on a line, we’d get commutation relations like

$$[q, p] = i \hbar$$

where $\hbar$ is Planck’s constant. Now we can represent these quantities as operators on a Hilbert space, the uncertainty principle kicks in, and life gets really interesting.
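For instance, in the standard Schrödinger representation, a textbook fact used here only as an illustration, $q$ acts by multiplication and $p$ acts as $-i\hbar \, d/dq$, and applying the commutator to an arbitrary test wavefunction confirms the relation above:

```python
import sympy as sp

q, hbar = sp.symbols('q hbar')
psi = sp.Function('psi')(q)   # an arbitrary test wavefunction

# Schrodinger representation: q acts by multiplication, p = -i*hbar*d/dq.
def p_op(w):
    return -sp.I*hbar*sp.diff(w, q)

commutator = q*p_op(psi) - p_op(q*psi)
print(sp.simplify(commutator - sp.I*hbar*psi))   # 0:  [q, p] = i*hbar
```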
In thermodynamics, the analogous relations would be

$$[S, T] = i \hbar, \qquad [V, P] = -i \hbar.$$

The math works just the same, but what does it mean physically? Are we now thinking of temperature, entropy and the like as ‘quantum observables’—for example, operators on a Hilbert space? Are we just quantizing thermodynamics?

That’s one possible interpretation, but I’ve never heard anyone discuss it. Here’s one good reason: as Blake Stacey pointed out below, these equations don’t pass the test of dimensional analysis! The quantities at left have units of energy, while Planck’s constant has units of action. So maybe we need to introduce a quantity with units of time at right, or maybe there’s some other interpretation, where we don’t interpret the parameter $\hbar$ as the good old-fashioned Planck’s constant, but something else instead.
And if you’ve really been paying attention, you may wonder how quantropy fits into this game! I showed that at least in a toy model, the path integral formulation of quantum mechanics arises, not exactly from maximizing or minimizing something, but from finding its critical points: that is, points where its first derivative vanishes. This something is a complex-valued quantity analogous to entropy, which I called ‘quantropy’.
Now, while I keep throwing around words like ‘minimize’ and ‘maximize’, most everything I’m doing works just fine for critical points. So, it seems that the apparatus of symplectic geometry may apply to the path-integral formulation of quantum mechanics.
But that would be weirdly interesting! In particular, what would happen when we go ahead and quantize the path-integral formulation of quantum mechanics?
If you’re a physicist, there’s a guess that will come tripping off your tongue at this point, without you even needing to think. Me too. But I don’t know if that guess is right.
Less mind-blowingly, there is also the question of how symplectic geometry enters into classical statics via the idea of Example 5.
But there’s a lot of fun to be had in this game already with thermodynamics.
Appendix
I should admit, just so you don’t think I failed to notice, that only rather esoteric physicists study the approach to quantum mechanics where time is an operator that doesn’t commute with the Hamiltonian $H$. In this approach, $t$ commutes with the momentum and position operators. I didn’t write down those commutation equations, for fear you’d think I was a crackpot and stop reading! It is however a perfectly respectable approach, which can be reconciled with the usual one. And this issue is not only quantum-mechanical: it’s also important in classical mechanics.
Namely, there’s a way to start with the so-called extended phase space for a point particle on a manifold $M$:

$$T^* (M \times \mathbb{R})$$

with coordinates $q^i, t, p_i$ and $H$, and get back to the usual phase space:

$$T^* M$$

with just $q^i$ and $p_i$ as coordinates. The idea is to impose a constraint of the form

$$H = H(q, p)$$

to knock off one degree of freedom, and use a standard trick called ‘symplectic reduction’ to knock off another.
Similarly, in quantum mechanics (take a particle on the line for concreteness) we can start with a big Hilbert space

$$L^2(\mathbb{R}^2)$$

on which $q, t, p$ and $H$ are all operators, then impose a constraint expressing $H$ in terms of $q$ and $p$, and then use that constraint to pick out states lying in a smaller Hilbert space. This smaller Hilbert space is naturally identified with the usual Hilbert space for a point particle:

$$L^2(\mathbb{R}).$$

Here $\mathbb{R}$ is called the configuration space for our particle; its cotangent bundle is the usual phase space. We call $\mathbb{R}^2$ the extended configuration space for a particle on the line; its cotangent bundle is the extended phase space.
I’m having some trouble remembering where I first learned about these ideas, but here are some good places to start:
• Toby Bartels, Abstract Hamiltonian mechanics.
• Nikola Buric and Slobodan Prvanovic, Space of events and the time observable.
• Piret Kuusk and Madis Koiv, Measurement of time in nonrelativistic quantum and classical mechanics, Proceedings of the Estonian Academy of Sciences, Physics and Mathematics 50 (2001), 195–213.