It came as a bit of a shock last week when I realized that some of the equations I’d learned in thermodynamics were just the same as equations I’d learned in classical mechanics—with only the names of the variables changed, to protect the innocent.
Why didn’t anyone tell me?
For example: everybody loves Hamilton’s equations: there are just two, and they summarize the entire essence of classical mechanics. Most people hate the Maxwell relations in thermodynamics: there are lots, and they’re hard to remember.
But what I’d like to show you now is that Hamilton’s equations are Maxwell relations! They’re a special case, and you can derive them the same way. I hope this will make you like the Maxwell relations more, instead of liking Hamilton’s equations less.
First, let’s see what these equations look like. Then let’s see why Hamilton’s equations are a special case of the Maxwell relations. And then let’s talk about how this might help us unify different aspects of physics.
Hamilton’s equations
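Suppose you have a particle on the line whose position $q$ and momentum $p$ are functions of time, $t$. If the energy $H$ is a function of position and momentum, Hamilton’s equations say:
$$\frac{dp}{dt} = -\frac{\partial H}{\partial q}$$
$$\frac{dq}{dt} = \frac{\partial H}{\partial p}$$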
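For concreteness, here is a minimal numerical sketch of Hamilton’s equations, $dq/dt = \partial H/\partial p$ and $dp/dt = -\partial H/\partial q$. It assumes a mass on a spring with $H = p^2/2m + kq^2/2$. The oscillator, the parameter names `m` and `k`, and the leapfrog stepping scheme are illustrative choices, not something taken from the text.

```python
import math

def hamiltonian(q, p, m=1.0, k=1.0):
    """Energy of a mass on a spring: kinetic plus potential (assumed example)."""
    return p * p / (2.0 * m) + k * q * q / 2.0

def dH_dq(q, p, m=1.0, k=1.0):
    """Partial derivative of H with respect to position."""
    return k * q

def dH_dp(q, p, m=1.0, k=1.0):
    """Partial derivative of H with respect to momentum."""
    return p / m

def evolve(q, p, t_final, steps):
    """Integrate dq/dt = dH/dp, dp/dt = -dH/dq with a leapfrog scheme."""
    dt = t_final / steps
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q, p)   # half step in momentum
        q += dt * dH_dp(q, p)         # full step in position
        p -= 0.5 * dt * dH_dq(q, p)   # half step in momentum
    return q, p

# With m = k = 1 the period is 2*pi, so the particle should come back
# close to where it started, with nearly the same energy.
q0, p0 = 1.0, 0.0
q1, p1 = evolve(q0, p0, t_final=2.0 * math.pi, steps=10_000)
print("final state:", q1, p1)
print("energy drift:", hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

The leapfrog scheme is used here only because it is short and behaves well for Hamiltonians that split into a kinetic and a potential part; any reasonable ODE integrator would illustrate the same two equations.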
The Maxwell relations
There are lots of Maxwell relations, and that’s one reason people hate them. But let’s just talk about two; most of the others work the same way.
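Suppose you have a physical system like a box of gas that has some volume $V$, pressure $P$, temperature $T$ and entropy $S$. Then the first and second Maxwell relations say:
$$\left.\frac{\partial T}{\partial V}\right|_S = -\left.\frac{\partial P}{\partial S}\right|_V$$
$$\left.\frac{\partial S}{\partial V}\right|_T = \left.\frac{\partial P}{\partial T}\right|_V$$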
Comparison
Clearly Hamilton’s equations resemble the Maxwell relations. Please check for yourself that the patterns of variables are exactly the same: only the names have been changed! So, apart from a key subtlety, Hamilton’s equations become the first and second Maxwell relations if we make these replacements:
$$q \to S, \qquad p \to T$$
$$t \to V, \qquad H \to P$$
What’s the key subtlety? One reason people hate the Maxwell relations is that they have lots of little symbols like $|_V$ saying what to hold constant when we take our partial derivatives. Hamilton’s equations don’t have those.
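So, you probably won’t like this, but let’s see what we get if we write Hamilton’s equations so they exactly match the pattern of the Maxwell relations:
$$\left.\frac{\partial p}{\partial t}\right|_q = -\left.\frac{\partial H}{\partial q}\right|_t$$
$$\left.\frac{\partial q}{\partial t}\right|_p = \left.\frac{\partial H}{\partial p}\right|_t$$
This looks a bit weird, and it set me back a day. What does it mean to take the partial derivative of $q$ in the $t$ direction while holding $p$ constant, for example?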
I still think it’s weird. But I think it’s correct. To see this, let’s derive the Maxwell relations, and then derive Hamilton’s equations using the exact same reasoning, with only the names of variables changed.
Deriving the Maxwell relations
The Maxwell relations are extremely general, so let’s derive them in a way that makes that painfully clear. Suppose we have any smooth function $U$ on the plane. Just for laughs, let’s call the coordinates of this plane $S$ and $V$. Then we have
$$dU = T\,dS - P\,dV$$
for some functions $T$ and $P$. This equation is just a concise way of saying that
$$T = \left.\frac{\partial U}{\partial S}\right|_V$$
and
$$P = -\left.\frac{\partial U}{\partial V}\right|_S$$
The minus sign here is unimportant: you can think of it as a whimsical joke. All the math would work just as well if we left it out.
(In reality, physicists call $U$ the internal energy of a system, regarded as a function of its entropy $S$ and volume $V$. They then call $T$ the temperature and $P$ the pressure. It just so happens that for lots of systems, the internal energy goes down as you increase the volume, so $P$ works out to be positive if we stick in this minus sign, and that’s what people did. But you don’t need to know any of this physics to follow the derivation of the Maxwell relations!)
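Now, mixed partial derivatives commute, so we have:
$$\frac{\partial^2 U}{\partial V \partial S} = \frac{\partial^2 U}{\partial S \partial V}$$
Plugging in our definitions of $T$ and $P$, this says
$$\left.\frac{\partial T}{\partial V}\right|_S = -\left.\frac{\partial P}{\partial S}\right|_V$$
And that’s the first Maxwell relation! So, there’s nothing to it: it’s just a sneaky way of saying that the mixed partial derivatives of the function $U$ commute.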
The second Maxwell relation works the same way. But seeing this takes a bit of thought, since we need to cook up a suitable function whose mixed partial derivatives are the two sides of this equation:
$$\left.\frac{\partial S}{\partial V}\right|_T = \left.\frac{\partial P}{\partial T}\right|_V$$
There are different ways to do this, but for now let me use the time-honored method of ‘pulling the rabbit from the hat’.

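Here’s the function we want:
$$A = U - TS$$
(In thermodynamics this function is called the Helmholtz free energy. It’s sometimes denoted $F$, but the International Union of Pure and Applied Chemistry recommends calling it $A$, which stands for the German word ‘Arbeit’, meaning ‘work’.)
Let’s check that this function does the trick:
$$dA = dU - T\,dS - S\,dT = (T\,dS - P\,dV) - T\,dS - S\,dT = -S\,dT - P\,dV$$
If we restrict ourselves to any subset of the plane where $T$ and $V$ serve as coordinates, the above equation is just a concise way of saying
$$S = -\left.\frac{\partial A}{\partial T}\right|_V$$
and
$$P = -\left.\frac{\partial A}{\partial V}\right|_T$$
Then since mixed partial derivatives commute, we get:
$$\frac{\partial^2 A}{\partial V \partial T} = \frac{\partial^2 A}{\partial T \partial V}$$
or in other words:
$$\left.\frac{\partial S}{\partial V}\right|_T = \left.\frac{\partial P}{\partial T}\right|_V$$
which is the second Maxwell relation.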
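If you’d like to see the two Maxwell relations hold in a concrete case, here is a rough numerical check. It assumes a monatomic ideal gas whose internal energy is $U(S,V) = V^{-2/3} e^{2S/3}$, in units where the number of particles times Boltzmann’s constant is 1 and an overall constant has been dropped. That formula, the helper `d` for finite differences, and the sample point are all assumptions made for this sketch.

```python
import math

def U(S, V):
    """Assumed internal energy of a monatomic ideal gas (N k = 1, constant dropped)."""
    return V ** (-2.0 / 3.0) * math.exp(2.0 * S / 3.0)

def d(f, x, h=1e-5):
    """Central finite-difference derivative of a one-variable function."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def T(S, V):
    """Temperature: partial derivative of U in S, holding V constant."""
    return d(lambda s: U(s, V), S)

def P(S, V):
    """Pressure: minus the partial derivative of U in V, holding S constant."""
    return -d(lambda v: U(S, v), V)

def S_of(temp, V):
    """Entropy in (T, V) coordinates, obtained by inverting T = (2/3) U(S, V)."""
    return 1.5 * math.log(1.5 * temp) + math.log(V)

S0, V0 = 0.7, 1.3
T0 = T(S0, V0)

# First relation: dT/dV holding S fixed, versus -dP/dS holding V fixed.
lhs1 = d(lambda v: T(S0, v), V0, h=1e-3)
rhs1 = -d(lambda s: P(s, V0), S0, h=1e-3)

# Second relation: dS/dV holding T fixed, versus dP/dT holding V fixed.
lhs2 = d(lambda v: S_of(T0, v), V0)
rhs2 = d(lambda t: P(S_of(t, V0), V0), T0, h=1e-3)

print("first relation: ", lhs1, rhs1)
print("second relation:", lhs2, rhs2)
```

Note how the second check has to switch to $(T, V)$ coordinates before differentiating: that is exactly what the little ‘hold this constant’ symbols are keeping track of.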
We can keep playing this game using various pairs of the four functions $S, T, V, P$ as coordinates, and get more Maxwell relations: enough to give ourselves a headache! But we have better things to do today.
Hamilton’s equations as Maxwell relations
For example: let’s see how Hamilton’s equations fit into this game. Suppose we have a particle on the line. Consider smooth paths where it starts at some fixed position at some fixed time and ends at the point $q$ at the time $t$. Nature will choose a path with least action, or at least one that’s a stationary point of the action. Let’s assume there’s a unique such path, and that it depends smoothly on $q$ and $t$. For this to be true, we may need to restrict $q$ and $t$ to a subset of the plane, but that’s okay: go ahead and pick such a subset.
Given $q$ and $t$ in this set, nature will pick the path that’s a stationary point of action; the action of this path is called Hamilton’s principal function and denoted $S(q,t)$. (Beware: this $S$ is not the same as entropy!)
Let’s assume $S$ is smooth. Then we can copy our derivation of the Maxwell relations line for line and get Hamilton’s equations! Let’s do it, skipping some steps but writing down the key results.
For starters we have
$$dS = p\,dq - H\,dt$$
for some functions $p$ and $H$ called the momentum and energy, which obey
$$p = \left.\frac{\partial S}{\partial q}\right|_t$$
and
$$H = -\left.\frac{\partial S}{\partial t}\right|_q$$
As far as I can tell it’s just a cute coincidence that we see a minus sign in the same place as before! Anyway, the fact that mixed partials commute gives us
$$\left.\frac{\partial p}{\partial t}\right|_q = -\left.\frac{\partial H}{\partial q}\right|_t$$
which is the first of Hamilton’s equations. And now we see that all the funny $|_q$ and $|_t$ things are actually correct!
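Next, we pull a rabbit out of our hat. We define this function:
$$X = S - pq$$
and check that
$$dX = dS - p\,dq - q\,dp = (p\,dq - H\,dt) - p\,dq - q\,dp = -q\,dp - H\,dt$$
This function probably has a standard name, but I don’t know it. Do you?
Then, considering any subset of the plane where $p$ and $t$ serve as coordinates, we see that because mixed partials commute:
$$\frac{\partial^2 X}{\partial t \partial p} = \frac{\partial^2 X}{\partial p \partial t}$$
we get
$$\left.\frac{\partial q}{\partial t}\right|_p = \left.\frac{\partial H}{\partial p}\right|_t$$
So, we’re done!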
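As a sanity check on both of Hamilton’s equations in this ‘Maxwell’ form, here is a worked example for a free particle. It assumes a particle of mass $m$ that starts at position $q_0$ at time $t_0$ and arrives at $q$ at a later time $t$. The symbols $m$, $q_0$, $t_0$ and the explicit formula for Hamilton’s principal function $S(q,t)$ are not from the text above; they are chosen only for illustration.

```latex
\begin{align*}
S(q,t) &= \frac{m (q - q_0)^2}{2 (t - t_0)}, \\
p &= \left.\frac{\partial S}{\partial q}\right|_t = \frac{m (q - q_0)}{t - t_0}, \qquad
H = -\left.\frac{\partial S}{\partial t}\right|_q = \frac{m (q - q_0)^2}{2 (t - t_0)^2} = \frac{p^2}{2m}, \\
\left.\frac{\partial p}{\partial t}\right|_q &= -\frac{m (q - q_0)}{(t - t_0)^2} = -\left.\frac{\partial H}{\partial q}\right|_t, \\
q &= q_0 + \frac{p\,(t - t_0)}{m}
\quad\Longrightarrow\quad
\left.\frac{\partial q}{\partial t}\right|_p = \frac{p}{m} = \left.\frac{\partial H}{\partial p}\right|_t .
\end{align*}
```

In this example, holding $p$ constant while varying $t$ means sliding the endpoint along the straight-line trajectory with that momentum, which is why the derivative of $q$ in the $t$ direction comes out as the velocity $p/m$.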
But you might be wondering how we pulled this rabbit out of the hat. More precisely, why did we suspect it was there in the first place? There’s a nice answer if you’re comfortable with differential forms. We start with what we know:
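$$dS = p\,dq - H\,dt$$
Next, we use this fundamental equation:
$$d^2 = 0$$
to note that:
$$0 = d^2 S = d(p\,dq - H\,dt) = dp \wedge dq - dH \wedge dt = -dq \wedge dp - dH \wedge dt = d(-q\,dp - H\,dt)$$
See? We’ve managed to switch the roles of $p$ and $q$ at the cost of an extra minus sign!
Then, if we restrict attention to any contractible open subset of the plane, the Poincaré Lemma says
$$d\omega = 0 \implies \omega = d\mu \text{ for some } \mu$$
Since
$$d(-q\,dp - H\,dt) = 0$$
it follows that there’s a function $X$ with
$$dX = -q\,dp - H\,dt$$
This is our rabbit. And if you ponder the difference between $-q\,dp$ and $p\,dq$, you’ll see it’s $-d(pq)$. So, it’s no surprise that
$$X = S - pq$$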
The big picture
Now let’s step back and think about what’s going on.
Lately I’ve been trying to unify a bunch of ‘extremal principles’, including:
1) the principle of least action
2) the principle of least energy
3) the principle of maximum entropy
4) the principle of maximum simplicity, or Occam’s razor
In my post on quantropy I explained how the first three principles fit into a single framework if we treat Planck’s constant as an imaginary temperature. The guiding principle of this framework is
maximize entropy
subject to the constraints imposed by what you believe
And that’s nice, because E. T. Jaynes has made a powerful case for this principle.
However, when the temperature is imaginary, entropy is so different that it may deserve a new name: say, ‘quantropy’. In particular, it’s complex-valued, so instead of maximizing it we have to look for stationary points: places where its first derivative is zero. But this isn’t so bad. Indeed, a lot of minimum and maximum principles are really ‘stationary principles’ if you examine them carefully.
What about the fourth principle: Occam’s razor? We can formalize this using algorithmic probability theory. Occam’s razor then becomes yet another special case of
maximize entropy
subject to the constraints imposed by what you believe
once we realize that algorithmic entropy is a special case of ordinary entropy.
All of this deserves plenty of further thought and discussion—but not today!
Today I just want to point out that once we’ve formally unified classical mechanics and thermal statics (often misleadingly called ‘thermodynamics’), as sketched in the article on quantropy, we should be able to take any idea from one subject and transpose it to the other. And it’s true. I just showed you an example, but there are lots of others!
I guessed this should be possible after pondering three famous facts:
• In classical mechanics, if we fix the initial position of a particle, we can pick any position $q$ and time $t$ at which the particle’s path ends, and nature will seek the path to this endpoint that minimizes the action. This minimal action is Hamilton’s principal function $S(q,t)$, which obeys
$$dS = p\,dq - H\,dt$$
In thermodynamics, if we fix the entropy $S$ and volume $V$ of a box of gas, nature will seek the probability distribution of microstates that minimizes the energy. This minimal energy is the internal energy $U(S,V)$, which obeys
$$dU = T\,dS - P\,dV$$
• In classical mechanics we have canonically conjugate quantities, while in statistical mechanics we have conjugate variables. In classical mechanics the canonical conjugate of the position $q$ is the momentum $p$, while the canonical conjugate of time $t$ is energy $H$. In thermodynamics, the conjugate of entropy $S$ is temperature $T$, while the conjugate of volume $V$ is pressure $P$. All this fits in perfectly with the analogy we’ve been using today:
$$q \to S, \qquad p \to T$$
$$t \to V, \qquad H \to P$$
• Something called the Legendre transformation plays a big role both in classical mechanics and thermodynamics. This transformation takes a function of some variable and turns it into a function of the conjugate variable. In our proof of the Maxwell relations, we secretly used a Legendre transformation to pass from the internal energy $U(S,V)$ to the Helmholtz free energy $A(T,V)$:
$$A = U - TS$$
where we must solve for the entropy $S$ in terms of $T$ and $V$ to think of $A$ as a function of these two variables.
Similarly, in our proof of Hamilton’s equations, we passed from Hamilton’s principal function $S(q,t)$ to the function $X(p,t)$:
$$X = S - pq$$
where we must solve for the position $q$ in terms of $p$ and $t$ to think of $X$ as a function of these two variables.
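The ‘solve for one variable in terms of its conjugate’ step in a Legendre transformation can be sketched numerically. The example below again assumes a monatomic ideal gas with internal energy $U(S,V) = V^{-2/3} e^{2S/3}$, in units where the number of particles times Boltzmann’s constant is 1. The formula, the bisection solver `solve_S` and the sample point are assumptions for illustration only. It builds the Helmholtz free energy from $U$ and checks that differentiating it recovers the entropy and the pressure.

```python
import math

def U(S, V):
    """Assumed internal energy of a monatomic ideal gas (N k = 1, constant dropped)."""
    return V ** (-2.0 / 3.0) * math.exp(2.0 * S / 3.0)

def dU_dS(S, V, h=1e-6):
    """Temperature as the finite-difference derivative of U in S at fixed V."""
    return (U(S + h, V) - U(S - h, V)) / (2.0 * h)

def solve_S(T, V, lo=-50.0, hi=50.0):
    """Solve dU/dS = T for S by bisection; dU/dS increases with S here."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dU_dS(mid, V) < T:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def A(T, V):
    """Legendre transform: A = U - T S, with S expressed through (T, V)."""
    S = solve_S(T, V)
    return U(S, V) - T * S

T0, V0, h = 0.9, 1.3, 1e-4
S0 = solve_S(T0, V0)

# Differentiating A should give back the variables we traded away.
minus_dA_dT = -(A(T0 + h, V0) - A(T0 - h, V0)) / (2.0 * h)   # should match S0
minus_dA_dV = -(A(T0, V0 + h) - A(T0, V0 - h)) / (2.0 * h)   # should match the pressure
pressure = -(U(S0, V0 + h) - U(S0, V0 - h)) / (2.0 * h)      # -dU/dV holding S fixed

print("entropy: ", S0, minus_dA_dT)
print("pressure:", pressure, minus_dA_dV)
```

The same recipe, with $S(q,t)$ in place of $U(S,V)$ and $p$ in place of $T$, would produce the function $X = S - pq$ as a function of $p$ and $t$.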
I hope you see that all this stuff fits together in a nice picture, and I hope to say a bit more about it soon. The most exciting thing for me will be to see how symplectic geometry, so important in classical mechanics, can be carried over to thermodynamics. Why? Because I’ve never seen anyone use symplectic geometry in thermodynamics. But maybe I just haven’t looked hard enough!
Indeed, it’s perfectly possible that some people already know what I’ve been saying today. Have you seen someone point out that Hamilton’s equations are a special case of the Maxwell relations? This would seem to be the first step towards importing all of symplectic geometry to thermodynamics.
Posted by John Baez 






