Classical Mechanics versus Thermodynamics (Part 1)

19 January, 2012

It came as a bit of a shock last week when I realized that some of the equations I’d learned in thermodynamics were just the same as equations I’d learned in classical mechanics—with only the names of the variables changed, to protect the innocent.

Why didn’t anyone tell me?

For example: everybody loves Hamilton’s equations: there are just two, and they summarize the entire essence of classical mechanics. Most people hate the Maxwell relations in thermodynamics: there are lots, and they’re hard to remember.

But what I’d like to show you now is that Hamilton’s equations are Maxwell relations! They’re a special case, and you can derive them the same way. I hope this will make you like the Maxwell relations more, instead of liking Hamilton’s equations less.

First, let’s see what these equations look like. Then let’s see why Hamilton’s equations are a special case of the Maxwell relations. And then let’s talk about how this might help us unify different aspects of physics.

Hamilton’s equations

Suppose you have a particle on the line whose position q and momentum p are functions of time, t. If the energy H is a function of position and momentum, Hamilton’s equations say:

\begin{array}{ccr}  \displaystyle{  \frac{d p}{d t} }  &=&  \displaystyle{- \frac{\partial H}{\partial q} } \\  \\ \displaystyle{  \frac{d q}{d t} } &=&  \displaystyle{ \frac{\partial H}{\partial p} }  \end{array}

The Maxwell relations

There are lots of Maxwell relations, and that’s one reason people hate them. But let’s just talk about two; most of the others work the same way.

Suppose you have a physical system like a box of gas that has some volume V, pressure P, temperature T and entropy S. Then the first and second Maxwell relations say:

\begin{array}{ccr}  \displaystyle{ \left. \frac{\partial T}{\partial V}\right|_S } &=&  \displaystyle{ - \left. \frac{\partial P}{\partial S}\right|_V } \\   \\   \displaystyle{ \left. \frac{\partial S}{\partial  V}\right|_T  }  &=&  \displaystyle{ \left. \frac{\partial P}{\partial T} \right|_V }   \end{array}

Comparison

Clearly Hamilton’s equations resemble the Maxwell relations. Please check for yourself that the patterns of variables are exactly the same: only the names have been changed! So, apart from a key subtlety, Hamilton’s equations become the first and second Maxwell relations if we make these replacements:

\begin{array} {ccccccc}  q &\to& S & &  p &\to & T \\ t & \to & V & & H &\to & P \end{array}

What’s the key subtlety? One reason people hate the Maxwell relations is that they have lots of little symbols like \left. \right|_V saying what to hold constant when we take our partial derivatives. Hamilton’s equations don’t have those.

So, you probably won’t like this, but let’s see what we get if we write Hamilton’s equations so they exactly match the pattern of the Maxwell relations:

\begin{array}{ccr}     \displaystyle{ \left. \frac{\partial p}{\partial t} \right|_q }  &=&  \displaystyle{- \left. \frac{\partial H}{\partial q} \right|_t } \\  \\\displaystyle{  \left.\frac{\partial q}{\partial t} \right|_p } &=&  \displaystyle{ \left. \frac{\partial H}{\partial p} \right|_t }    \end{array}

This looks a bit weird, and it set me back a day. What does it mean to take the partial derivative of q in the t direction while holding p constant, for example?

I still think it’s weird. But I think it’s correct. To see this, let’s derive the Maxwell relations, and then derive Hamilton’s equations using the exact same reasoning, with only the names of variables changed.

Deriving the Maxwell relations

The Maxwell relations are extremely general, so let’s derive them in a way that makes that painfully clear. Suppose we have any smooth function U on the plane. Just for laughs, let’s call the coordinates of this plane S and V. Then we have

d U = T d S - P d V

for some functions T and P. This equation is just a concise way of saying that

\displaystyle{ T = \left.\frac{\partial U}{\partial S}\right|_V }

and

\displaystyle{ P = - \left.\frac{\partial U}{\partial V}\right|_S }

The minus sign here is unimportant: you can think of it as a whimsical joke. All the math would work just as well if we left it out.

(In reality, physicists call U the internal energy of a system, regarded as a function of its entropy S and volume V. They then call T the temperature and P the pressure. It just so happens that for lots of systems the internal energy goes down as you increase the volume, so with this minus sign P works out to be positive, and that’s why people put it there. But you don’t need to know any of this physics to follow the derivation of the Maxwell relations!)

Now, mixed partial derivatives commute, so we have:

\displaystyle{ \frac{\partial^2 U}{\partial V \partial S} =  \frac{\partial^2 U}{\partial S \partial V}}

Plugging in our definitions of T and P, this says

\displaystyle{ \left. \frac{\partial T}{\partial V}\right|_S = - \left. \frac{\partial P}{\partial S}\right|_V }

And that’s the first Maxwell relation! So, there’s nothing to it: it’s just a sneaky way of saying that the mixed partial derivatives of the function U commute.
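If you’d like to see a computer agree, here’s a minimal sympy sketch (my own toy check, not part of the argument): take an arbitrary smooth U(S,V), define T and P as above, and watch the commuting mixed partials produce the first Maxwell relation.

```python
import sympy as sp

S, V = sp.symbols('S V', real=True)
U = sp.Function('U')(S, V)      # any smooth function of S and V

T = sp.diff(U, S)               # T =  dU/dS at constant V
P = -sp.diff(U, V)              # P = -dU/dV at constant S

lhs = sp.diff(T, V)             # dT/dV at constant S
rhs = -sp.diff(P, S)            # -dP/dS at constant V

print(sp.simplify(lhs - rhs))   # prints 0: the first Maxwell relation
```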

The second Maxwell relation works the same way. But seeing this takes a bit of thought, since we need to cook up a suitable function whose mixed partial derivatives are the two sides of this equation:

\displaystyle{ \left. \frac{\partial S}{\partial  V}\right|_T  = \left. \frac{\partial P}{\partial T} \right|_V }

There are different ways to do this, but for now let me use the time-honored method of ‘pulling the rabbit from the hat’.

Here’s the function we want:

A = U - T S

(In thermodynamics this function is called the Helmholtz free energy. It’s sometimes denoted F, but the International Union of Pure and Applied Chemistry recommends calling it A, which stands for the German word ‘Arbeit’, meaning ‘work’.)

Let’s check that this function does the trick:

\begin{array}{ccl} d A &=& d U - d(T S) \\  &=& (T d S - P d V) - (S dT + T d S) \\  &=& -S d T - P dV \end{array}

If we restrict ourselves to any subset of the plane where T and V serve as coordinates, the above equation is just a concise way of saying

\displaystyle{ S = - \left.\frac{\partial A}{\partial T}\right|_V }

and

\displaystyle{ P = - \left.\frac{\partial A}{\partial V}\right|_T }

Then since mixed partial derivatives commute, we get:

\displaystyle{ \frac{\partial^2 A}{\partial V \partial T} =  \frac{\partial^2 A}{\partial T \partial V}}

or in other words:

\displaystyle{ \left. \frac{\partial S}{\partial  V}\right|_T  = \left. \frac{\partial P}{\partial T} \right|_V }

which is the second Maxwell relation.
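Here’s a concrete check of that Legendre-transform argument, using a toy internal energy U(S,V) = S²/V that I made up purely for simplicity (it isn’t any real gas): solve for S in terms of T and V, express P as a function of T and V, and verify the second Maxwell relation with sympy.

```python
import sympy as sp

S, V, T = sp.symbols('S V T', positive=True)

U = S**2 / V                                  # toy internal energy, chosen only for simplicity
T_of_SV = sp.diff(U, S)                       # T =  dU/dS|_V = 2*S/V
P_of_SV = -sp.diff(U, V)                      # P = -dU/dV|_S = S**2/V**2

S_of_TV = sp.solve(sp.Eq(T, T_of_SV), S)[0]   # S = T*V/2
P_of_TV = P_of_SV.subs(S, S_of_TV)            # P = T**2/4 as a function of T and V

lhs = sp.diff(S_of_TV, V)                     # dS/dV at constant T
rhs = sp.diff(P_of_TV, T)                     # dP/dT at constant V
print(sp.simplify(lhs - rhs))                 # prints 0: the second Maxwell relation
```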

We can keep playing this game using various pairs of the four functions S, T, P, V as coordinates, and get more Maxwell relations: enough to give ourselves a headache! But we have better things to do today.

Hamilton’s equations as Maxwell relations

For example: let’s see how Hamilton’s equations fit into this game. Suppose we have a particle on the line. Consider smooth paths where it starts at some fixed position at some fixed time and ends at the point q at the time t. Nature will choose a path with least action—or at least one that’s a stationary point of the action. Let’s assume there’s a unique such path, and that it depends smoothly on q and t. For this to be true, we may need to restrict q and t to a subset of the plane, but that’s okay: go ahead and pick such a subset.

Given q and t in this set, nature will pick the path that’s a stationary point of action; the action of this path is called Hamilton’s principal function and denoted S(q,t). (Beware: this S is not the same as entropy!)

Let’s assume S is smooth. Then we can copy our derivation of the Maxwell relations line for line and get Hamilton’s equations! Let’s do it, skipping some steps but writing down the key results.

For starters we have

d S = p d q - H d t

for some functions p and H called the momentum and energy, which obey

\displaystyle{ p = \left.\frac{\partial S}{\partial q}\right|_t }

and

\displaystyle{ H = - \left.\frac{\partial S}{\partial t}\right|_q }

As far as I can tell it’s just a cute coincidence that we see a minus sign in the same place as before! Anyway, the fact that mixed partials commute gives us

\displaystyle{ \left. \frac{\partial p}{\partial t} \right|_q = - \left. \frac{\partial H}{\partial q} \right|_t }

which is the first of Hamilton’s equations. And now we see that all the funny \left. \right|_q and \left. \right|_t things are actually correct!
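To convince yourself those subscripts really are harmless, here’s a toy check (my own example, not part of the argument): for a free particle starting at q = 0 at time 0, Hamilton’s principal function is S(q,t) = mq²/2t, and the two partial derivatives above really do agree.

```python
import sympy as sp

q, t, m = sp.symbols('q t m', positive=True)

S = m*q**2 / (2*t)                  # Hamilton's principal function for a free particle
p = sp.diff(S, q)                   # p =  dS/dq|_t = m*q/t
H = -sp.diff(S, t)                  # H = -dS/dt|_q = m*q**2/(2*t**2)

lhs = sp.diff(p, t)                 # dp/dt holding q fixed
rhs = -sp.diff(H, q)                # -dH/dq holding t fixed
print(sp.simplify(lhs - rhs))       # prints 0: the first Hamilton equation
print(sp.simplify(H - p**2/(2*m)))  # prints 0: H is the usual kinetic energy
```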

Next, we pull a rabbit out of our hat. We define this function:

X = S - p q

and check that

d X = - q dp - H d t

This function X probably has a standard name, but I don’t know it. Do you?

Then, considering any subset of the plane where p and t serve as coordinates, we see that because mixed partials commute:

\displaystyle{ \frac{\partial^2 X}{\partial t \partial p} =  \frac{\partial^2 X}{\partial p \partial t}}

we get

\displaystyle{ \left. \frac{\partial q}{\partial t} \right|_p = \left. \frac{\partial H}{\partial p} \right|_t }

So, we’re done!

But you might be wondering how we pulled this rabbit out of the hat. More precisely, why did we suspect it was there in the first place? There’s a nice answer if you’re comfortable with differential forms. We start with what we know:

d S = p d q - H d t

Next, we use this fundamental equation:

d^2 = 0

to note that:

\begin{array}{ccl}  0 &=& d^2 S \\ &=& d(p d q- H d t) \\ &=& d p \wedge d q - d H \wedge d t \\ &=& - dq \wedge d p - d H \wedge d t \\ &=& d(-q d p - H d t) \end{array}

See? We’ve managed to switch the roles of p and q, at the cost of an extra minus sign!

Then, if we restrict attention to any contractible open subset of the plane, the Poincaré Lemma says

d \omega = 0 \implies \omega = d \mu \; \textrm{for some} \; \mu

Since

d(- q d p - H d t) = 0

it follows that there’s a function X with

d X = - q d p - H d t

This is our rabbit. And if you ponder the difference between -q d p and p d q, you’ll see it’s -d( p q). So, it’s no surprise that

X = S - p q
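Continuing the free-particle toy example from above (again, just my own check): if we rewrite everything in terms of p and t, the rabbit X = S - pq really does satisfy dX = -q dp - H dt, and the second Hamilton equation drops out.

```python
import sympy as sp

p, t, m = sp.symbols('p t m', positive=True)

q = p*t/m                           # invert p = m*q/t to get q as a function of p and t
S = m*q**2 / (2*t)                  # Hamilton's principal function, now in terms of p and t
H = p**2 / (2*m)                    # energy of the free particle
X = S - p*q                         # the Legendre transform: X = -p**2*t/(2*m)

print(sp.simplify(sp.diff(X, p) + q))              # prints 0:  dX/dp|_t = -q
print(sp.simplify(sp.diff(X, t) + H))              # prints 0:  dX/dt|_p = -H
print(sp.simplify(sp.diff(q, t) - sp.diff(H, p)))  # prints 0: the second Hamilton equation
```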

The big picture

Now let’s step back and think about what’s going on.

Lately I’ve been trying to unify a bunch of ‘extremal principles’, including:

1) the principle of least action
2) the principle of least energy
3) the principle of maximum entropy
4) the principle of maximum simplicity, or Occam’s razor

In my post on quantropy I explained how the first three principles fit into a single framework if we treat Planck’s constant as an imaginary temperature. The guiding principle of this framework is

maximize entropy
subject to the constraints imposed by what you believe

And that’s nice, because E. T. Jaynes has made a powerful case for this principle.

However, when the temperature is imaginary, entropy is so different that it may deserve a new name: say, ‘quantropy’. In particular, it’s complex-valued, so instead of maximizing it we have to look for stationary points: places where its first derivative is zero. But this isn’t so bad. Indeed, a lot of minimum and maximum principles are really ‘stationary principles’ if you examine them carefully.

What about the fourth principle: Occam’s razor? We can formalize this using algorithmic probability theory. Occam’s razor then becomes yet another special case of

maximize entropy
subject to the constraints imposed by what you believe

once we realize that algorithmic entropy is a special case of ordinary entropy.

All of this deserves plenty of further thought and discussion—but not today!

Today I just want to point out that once we’ve formally unified classical mechanics and thermal statics (often misleadingly called ‘thermodynamics’), as sketched in the article on quantropy, we should be able to take any idea from one subject and transpose it to the other. And it’s true. I just showed you an example, but there are lots of others!

I guessed this should be possible after pondering three famous facts:

• In classical mechanics, if we fix the initial position of a particle, we can pick any position q and time t at which the particle’s path ends, and nature will seek the path to this endpoint that minimizes the action. This minimal action is Hamilton’s principal function S(q,t), which obeys

d S = p d q - H d t

In thermodynamics, if we fix the entropy S and volume V of a box of gas, nature will seek the probability distribution of microstates that minimizes the energy. This minimal energy is the internal energy U(S,V), which obeys

d U = T d S - P d V

• In classical mechanics we have canonically conjugate quantities, while in thermodynamics we have conjugate variables. In classical mechanics the canonical conjugate of the position q is the momentum p, while the canonical conjugate of time t is energy H. In thermodynamics, the conjugate of entropy S is temperature T, while the conjugate of volume V is pressure P. All this fits in perfectly with the analogy we’ve been using today:

\begin{array} {ccccccc}  q &\to& S & &  p &\to & T \\ t & \to & V & & H &\to & P \end{array}

• Something called the Legendre transformation plays a big role both in classical mechanics and thermodynamics. This transformation takes a function of some variable and turns it into a function of the conjugate variable. In our proof of the Maxwell relations, we secretly used a Legendre transformation to pass from the internal energy U(S,V) to the Helmholtz free energy A(T,V):

A = U - T S

where we must solve for the entropy S in terms of T and V to think of A as a function of these two variables.

Similarly, in our proof of Hamilton’s equations, we passed from Hamilton’s principal function S(q,t) to the function X(p,t):

X = S - p q

where we must solve for the position q in terms of p and t to think of X as a function of these two variables.

I hope you see that all this stuff fits together in a nice picture, and I hope to say a bit more about it soon. The most exciting thing for me will be to see how symplectic geometry, so important in classical mechanics, can be carried over to thermodynamics. Why? Because I’ve never seen anyone use symplectic geometry in thermodynamics. But maybe I just haven’t looked hard enough!

Indeed, it’s perfectly possible that some people already know what I’ve been saying today. Have you seen someone point out that Hamilton’s equations are a special case of the Maxwell relations? This would seem to be the first step towards importing all of symplectic geometry to thermodynamics.


Going on Strike

17 January, 2012

 

Along with Wikipedia and other sites, this blog will go on strike on the 18th of January, 2012. We will be closed starting 13:00 UTC (also known as 1 pm Greenwich Mean Time – that’s 8 am Eastern Standard Time for you Americans). We should be back 12 hours later.

Congress has decided to shelve the Stop Online Piracy Act (SOPA) until a more compliant president is elected. But we need to let them know now that this bill sucks, along with its evil partner, the Protect IP Act or PIPA. That’s what the internet strike is about.

My homepage will be on strike too—in fact, it started today! Yours can easily do the same: just copy my homepage onto yours and adjust it to taste.

(By the way, the official version of the “strike” webpage is flawed because it uses relative links that don’t work when you copy it to your own site. I fixed those in my version.)


Extremal Principles in Classical, Statistical and Quantum Mechanics

13 January, 2012

guest post by Mike Stay

The table in John’s post on quantropy shows that energy and action are analogous:

Statics                   Dynamics
statistical mechanics     quantum mechanics
probabilities             amplitudes
Boltzmann distribution    Feynman sum over histories
energy                    action
temperature               Planck’s constant times i
entropy                   quantropy
free energy               free action

However, this seems to be part of a bigger picture that includes at least entropy as analogous to both of those, too. I think that just about any quantity defined by an integral over a path would behave similarly.

I see four broad areas to consider, based on a temperature parameter:

  1. T = 0: statics, or “least quantity”
  2. Real T > 0: statistical mechanics
  3. Imaginary T: a thermal ensemble gets replaced by a quantum superposition
  4. Complex T: ensembles of quantum systems, as in nuclear magnetic resonance

I’m not going to get into the last of these in what follows.

1. “Least quantity”

Lagrangian of a classical particle

K is kinetic energy, i.e. the “action density” due to motion.

V is potential energy, i.e. minus the “action density” due to position.

The action is then:

\displaystyle \begin{array}{rcl}   A &=& \int (K-V) \,  d t \\ & = & \int \left[m\left(\frac{d q(t)}{d t}\right)^2 - V(q(t))\right] d t  \end{array}

where m is the particle’s mass. We get the principle of least action by setting \delta A = 0.
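Here’s a minimal numerical sketch of that principle (my own toy example, using the conventional factor of 1/2 in the kinetic and potential terms): discretize the action for a particle in a harmonic potential, fix the endpoints, and minimize over the interior points of the path. The minimizer tracks the classical trajectory.

```python
import numpy as np
from scipy.optimize import minimize

m, k = 1.0, 1.0                      # toy mass and spring constant
t0, t1, N = 0.0, 1.0, 50             # time interval and number of steps
dt = (t1 - t0) / N
q0, q1 = 0.0, 1.0                    # fixed endpoints of the path

def action(q_interior):
    q = np.concatenate([[q0], q_interior, [q1]])
    v = np.diff(q) / dt                           # velocity on each step
    K = 0.5 * m * v**2                            # kinetic energy per step
    V = 0.5 * k * (0.5 * (q[:-1] + q[1:]))**2     # potential at step midpoints
    return np.sum((K - V) * dt)                   # discretized action

guess = np.linspace(q0, q1, N + 1)[1:-1]          # straight-line initial guess
path = minimize(action, guess).x

# exact classical solution q(t) = q1*sin(w*t)/sin(w*t1) with w = sqrt(k/m)
w = np.sqrt(k / m)
t = np.linspace(t0, t1, N + 1)[1:-1]
print(np.max(np.abs(path - q1 * np.sin(w * t) / np.sin(w * t1))))   # small
```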

“Static” systems related by a Wick rotation

  1. Substitute q(s = iz) for q(t) to get a “springy” static system.

    In John’s homework problem A Spring in Imaginary Time, he guided students through a Wick-rotation-like process that transforms the Lagrangian above into the Hamiltonian of a springy system. (I say “springy” because it’s not exactly the Hamiltonian for a hanging spring: here each infinitesimal piece of the spring is at a fixed horizontal position and is free to move only vertically.)

    \kappa is the potential energy density due to stretching.

    \upsilon is the potential energy density due to position.

    We then have

    \displaystyle  \begin{array}{rcl}\int(\kappa-\upsilon) dz & = &  \int\left[k\left(\frac{dq(iz)}{dz}\right)^2 - \upsilon(q(iz))\right]  dz\\ & = & -i\int\left[-k\left(\frac{dq(iz)}{diz}\right)^2 -  \upsilon(q(iz))\right] diz\\ & = & i  \int\left[k\left(\frac{dq(iz)}{diz}\right)^2 + \upsilon(q(iz))\right]  diz \end{array}

    or letting s = iz,

    \displaystyle  \begin{array}{rcl}   & = &  i\int\left[k\left(\frac{dq(s)}{ds}\right)^2 + \upsilon(q(s))\right]  ds\\ & = & iE \end{array}

    where E is the potential energy of the spring. We get the principle of least energy by setting \delta E = 0.

  2. Substitute q(β = iz) for q(t) to get a thermometer system.

    We can repeat the process above, but use inverse temperature, or “coolness”, instead of time. Note that this is still a statics problem at heart! We’ll introduce another temperature below when we allow for multiple possible q‘s.

    K is the potential energy due to the rate of change of q with respect to \beta. (This has to do with the thermal expansion coefficient: if we fix the length of the thermometer and then cool it, we get “stretching” potential energy.)

    V is any extra potential energy due to q.

    \displaystyle \begin{array}{rcl}\int(K-V) dz  & = & \int\left[k\left(\frac{dq(iz)}{dz}\right)^2 -  V(q(iz))\right] dz\\ & = &  -i\int\left[-k\left(\frac{dq(iz)}{diz}\right)^2 - V(q(iz))\right]  diz\\ & = & i \int\left[k\left(\frac{dq(iz)}{diz}\right)^2 +  V(q(iz))\right] diz \end{array}

    or letting \beta = iz,

    \displaystyle \begin{array}{rcl}   & = &  i\int\left[k\left(\frac{dq(\beta)}{d\beta}\right)^2 +  V(q(\beta))\right] d\beta\\ & = & iS_1\end{array}

    where S_1 is the entropy lost as the thermometer is cooled. We get the principle of “least entropy lost” by setting \delta S_1 = 0.

  3. Substitute q(T₁ = iz) for q(t).

    We can repeat the process above, but use temperature instead of time. We get a system whose heat capacity is governed by a function q(T) and its derivative. We’re trying to find the best function q, the most efficient way to raise the temperature of the system.

    C is the heat capacity (= entropy) proportional to (dq/dT_1)^2.

    V is the heat capacity due to q.

    \displaystyle \begin{array}{rcl}\int(C-V) dz  & = & \int\left[k\left(\frac{dq(iz)}{dz}\right)^2 -  V(q(iz))\right] dz\\ & = &  -i\int\left[-k\left(\frac{dq(iz)}{diz}\right)^2 - V(q(iz))\right]  diz\\ & = & i \int\left[k\left(\frac{dq(iz)}{diz}\right)^2 +  V(q(iz))\right] diz  \end{array}

    or letting T_1 = iz,

    \displaystyle \begin{array}{rcl} & = &  i\int\left[k\left(\frac{dq(T_1)}{dT_1}\right)^2 + V(q(T_1))\right]  dT_1\\ & = & iE \end{array}

    where E is the energy required to raise the temperature. We again get the principle of least energy by setting \delta E = 0.

2. Statistical mechanics

Here we allow lots of possible q‘s, then maximize entropy subject to constraints using the Lagrange multiplier trick.

Statistical mechanics of a particle

For the statistical mechanics of a particle, we choose a real measure a_x on the set of paths. For simplicity, we assume the set is finite.

Normalize so \sum a_x = 1.

Define entropy to be S = - \sum a_x \ln a_x.

Our problem is to choose a_x to minimize the “free action” F = A - \lambda S, or, equivalently, to maximize S subject to a constraint on A.

To make units match, λ must have units of action, so it’s some multiple of ℏ. Replace λ by ℏλ so the free action is

F = A - \hbar\lambda\, S.

The distribution that minimizes the free action is the Gibbs distribution a_x = \exp(-A_x/\hbar\lambda) / Z, where Z is the usual partition function.

However, there are other observables of a path, like the position q_{1/2} at the halfway point; given another constraint on the average value of q_{1/2} over all paths, we get a distribution like

\displaystyle a_x = \exp(-\left[A +  pq_{1/2}\right]/\hbar\lambda) / Z.

The conjugate variable to that position is a momentum: in order to get from the starting point to the given point in the allotted time, the particle has to have the corresponding momentum.

dA = \hbar\lambda\, dS - p\, dq.
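As a quick numerical sanity check (toy numbers of my own, not part of the argument): among normalized real measures on a finite set of paths, the measure a_x = exp(-A_x/ℏλ)/Z really does minimize the free action F = ⟨A⟩ - ℏλ S.

```python
import numpy as np

A = np.array([0.3, 1.0, 2.5, 4.0])            # toy actions for four paths
hbar_lam = 1.5                                # toy value of the product hbar*lambda

def free_action(a):
    S = -np.sum(a * np.log(a))                # entropy of the measure
    return np.dot(a, A) - hbar_lam * S        # F = <A> - hbar*lambda*S

gibbs = np.exp(-A / hbar_lam)
gibbs /= gibbs.sum()                          # the Gibbs measure

rng = np.random.default_rng(0)
for _ in range(1000):                         # compare against random normalized measures
    trial = rng.dirichlet(np.ones(len(A)))
    assert free_action(trial) >= free_action(gibbs) - 1e-12

print("The Gibbs measure has the least free action of all measures sampled.")
```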

Other examples from Wick rotation

  1. Introduce a temperature T [Kelvins] that perturbs the spring.

    We minimize the free energy F = E - kT\, S, i.e. maximize the entropy S subject to a constraint on the expected energy

    \langle E\rangle = \sum a_x E_x.

    We get the measure a_x = \exp(-E_x/kT) / Z.

    Other observables about the spring’s path give conjugate variables whose product is energy. Given a constraint on the average position of the spring at the halfway point, we get a conjugate force: pulling the spring out of equilibrium requires a force.

    dE = kT\, dS - F\, dq.

  2. Statistical ensemble of thermometers with ensemble temperature T₂ [unitless].

    We minimize the “free entropy” F = S_1 - T_2S_2, i.e. we maximize the entropy S_2 subject to a constraint on the expected entropy lost

    \langle S_1\rangle = \sum a_x S_{1,x}.

    We get the measure a_x = \exp(-S_{1,x}/T_2) / Z.

    Given a constraint on the average position at the halfway point, we get a conjugate inverse length r that tells how much entropy is lost when the thermometer shrinks by dq.

    dS_1 = T_2\, dS_2 - r\, dq.

  3. Statistical ensemble of functions q with ensemble temperature T₂ [Kelvins].

    We minimize the free energy F = E - kT_2\, S, i.e. we maximize the entropy S subject to a constraint on the expected energy

    \displaystyle \langle E\rangle = \sum a_x E_x.

    We get the measure a_x = \exp(-E_x/kT_2) / Z.

    Again, a constraint on the position would give a conjugate force. It’s a little harder to see how here, but given a non-optimal function q(T), we have an extra energy cost due to inefficiency that’s analogous to the stretching potential energy when pulling a spring out of equilibrium.

3. Thermo to quantum via Wick rotation of Lagrange multiplier

We allow a complex-valued measure a as John did in the article on quantropy. We pick a logarithm for each a_x and assume they don’t go through zero as we vary them. We also choose an imaginary Lagrange multiplier.

Normalize so \sum a_x = 1.

Define quantropy Q = - \sum a_x \ln a_x.

Find a stationary point of the free action F = A - \hbar\lambda\, Q.

We get a_x = \exp(-A_x/\hbar\lambda). If \lambda = -i, we get Feynman’s sum over histories. Surely something like the two-slit experiment considers histories with a constraint on position at a particular time, and we get a conjugate momentum?

A Quantum Version of Entropy

Again allow complex-valued a_x. However, this time normalize these by setting \sum |a_x|^2 = 1.

Define a quantum version of entropy S = - \sum |a_x|^2  \ln |a_x|^2.

  1. Allow quantum superposition of perturbed springs.

    \langle E\rangle = \sum |a_x|^2 E_x. Get a_x =  \exp(-E_x/kT) / Z. If T = -i\hbar/tk, we get the evolution of the quantum state |q\rangle under the given Hamiltonian for a time t.

  2. Allow quantum superpositions of thermometers.

    \langle S_1\rangle = \sum |a_x|^2 S_{1,x}. Get a_x =  \exp(-S_{1,x}/T_2) / Z. If T_2 = -i, we get something like a sum over histories, but with a different normalization condition that converges because our set of paths is finite.

  3. Allow quantum superposition of systems.

    \langle E \rangle = \sum |a_x|^2 E_x. Get a_x =\exp(-E_x/kT_2) / Z. If T_2 = -i\hbar/tk, we get the result of “Measure E, then heat the superposition T₁ degrees in a time much less than t seconds, then wait t seconds.” Different functions q in the superposition change the heat capacity differently and thus the systems end up at different energies.

So to sum up, there’s at least a three-way analogy between action, energy, and entropy depending on what you’re integrating over. You get a kind of “statics” if you extremize the integral by varying the path; by allowing multiple paths and constraints on observables, you get conjugate variables and “free” quantities that you want to minimize; and by taking the temperature to be imaginary, you get quantum systems.


The Beauty of Roots (Part 2)

7 January, 2012

Here’s a bit more on the beauty of roots—some things that may have escaped those of you who weren’t following this blog carefully!

Greg Egan has a great new applet for exploring the roots of Littlewood polynomials of a given degree—meaning polynomials whose coefficients are all ±1:

• Greg Egan, Littlewood applet.

Move the mouse around to create a little rectangle, and the applet will zoom in to show the roots in that region. For example, the above region is close to the number -0.0572 + 0.72229i.

Then, by holding the shift key and clicking the mouse, compare the corresponding ‘dragon’. We get the dragon for some complex number by evaluating all power series whose coefficients are all ±1 at this number.

You’ll see that often the dragon for some number resembles the set of roots of Littlewood polynomials near that number! To get a sense of why, read Greg’s explanation. However, he uses a different, though equivalent, definition of the dragon (which he calls the ‘Julia set’).
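If you want to play with this yourself without the applet, here’s a small sketch (my own, with the degree chosen arbitrarily) that computes the roots of every Littlewood polynomial of degree 12 and scatter-plots them:

```python
import itertools
import numpy as np
import matplotlib.pyplot as plt

d = 12                                           # degree: 2**(d+1) sign patterns
roots = []
for coeffs in itertools.product([1.0, -1.0], repeat=d + 1):
    roots.extend(np.roots(coeffs))               # roots of one Littlewood polynomial
roots = np.array(roots)

plt.figure(figsize=(6, 6))
plt.plot(roots.real, roots.imag, ',', alpha=0.3)
plt.gca().set_aspect('equal')
plt.title('Roots of Littlewood polynomials of degree 12')
plt.show()
```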

He also made a great video showing how the dragons change shape as you move around the complex plane:

The dragon is well-defined for any number inside the unit circle, since all power series with coefficients ±1 converge inside this circle. If you watch the video carefully—it helps to make it big—you’ll see a little white cross moving around inside the unit circle, indicating which dragon you’re seeing.

I’m writing a paper about this stuff with Dan Christensen and Sam Derbyshire… that’s why I’m not giving a very careful explanation now. We invited Greg Egan to join us, but he’s too busy writing the third volume of his trilogy Orthogonal.


The Best Climate Scientists

4 January, 2012

A physicist friend asks if there is someone in climate science who has made progress significant enough to deserve a Nobel Prize. It’s an interesting question. Any such prize would be amazingly controversial, but let’s shelve that and ask: who are the best climate scientists, the ones who have made truly dramatic progress?

Arrhenius is no longer with us, so he’s out.


Azimuth on Google Plus (Part 5)

1 January, 2012

Happy New Year! I’m back from Laos. Here are seven items, mostly from the Azimuth Circle on Google Plus:

1) Phil Libin is the boss of a Silicon Valley startup. When he’s off travelling, he uses a telepresence robot to keep an eye on things. It looks like a stick figure on wheels. Its bulbous head has two eyes, which are actually a camera and a laser. On its forehead is a screen, where you can see Libin’s face. It’s made by a company called Anybots, and it costs just $15,000.


I predict that within my lifetime we’ll be using things like this to radically cut travel costs and carbon emissions for business and for conferences. It seems weird now, but so did telephones. Future models will be better to look at. But let’s try it soon!

• Laura Sydell, No excuses: robots put you in two places at once, Weekend Edition Saturday, 31 December 2011.

Bruce Bartlett and I are already planning for me to use telepresence to give a lecture on mathematics and the environment at Stellenbosch University in South Africa. But we’d been planning to use old-fashioned videoconferencing technology.

Anybots is located in Mountain View, California. That’s near Google’s main campus. Can anyone help me set up a talk on energy and the environment at Google, where I use an Anybot?

(Or, for that matter, anywhere else around there?)

2) A study claims to have found a correlation between weather and the day of the week! The claim is that there are more tornados and hailstorms in the eastern USA during weekdays. One possible mechanism could be that aerosols from car exhaust help seed clouds.


I make no claims that this study is correct. But at the very least, it would be interesting to examine their use of statistics and see if it’s convincing or flawed:

• Thomas Bell and Daniel Rosenfeld, Why do tornados and hailstorms rest on weekends?, Journal of Geophysical Research 116 (2011), D20211.

Abstract. This study shows for the first time statistical evidence that when anthropogenic aerosols over the eastern United States during summertime are at their weekly mid-week peak, tornado and hailstorm activity there is also near its weekly maximum. The weekly cycle in summertime storm activity for 1995–2009 was found to be statistically significant and unlikely to be due to natural variability. It correlates well with previously observed weekly cycles of other measures of storm activity. The pattern of variability supports the hypothesis that air pollution aerosols invigorate deep convective clouds in a moist, unstable atmosphere, to the extent of inducing production of large hailstones and tornados. This is caused by the effect of aerosols on cloud drop nucleation, making cloud drops smaller and hydrometeors larger. According to simulations, the larger ice hydrometeors contribute to more hail. The reduced evaporation from the larger hydrometeors produces weaker cold pools. Simulations have shown that too cold and fast-expanding pools inhibit the formation of tornados. The statistical observations suggest that this might be the mechanism by which the weekly modulation in pollution aerosols is causing the weekly cycle in severe convective storms during summer over the eastern United States. Although we focus here on the role of aerosols, they are not a primary atmospheric driver of tornados and hailstorms but rather modulate them in certain conditions.

Here’s a discussion of it:

• Bob Yirka, New research may explain why serious thunderstorms and tornados are less prevalent on the weekends, PhysOrg, 22 December 2011.

3) And if you like to check how people use statistics, here’s a paper that would be incredibly important if its findings were correct:

• Joseph J. Mangano and Janette D. Sherman, An unexpected mortality increase in the United States follows arrival of the radioactive plume from Fukushima: is there a correlation?, International Journal of Health Services 42 (2012), 47–64.

The title has a question mark in it, but it’s been cited in very dramatic terms in many places, for example this video entitled “Peer reviewed study shows 14,000 U.S. deaths from Fukushima”:

Starting at 1:31 you’ll see an interview with one of the paper’s authors, Janette Sherman.

14,000 deaths in the US due to Fukushima? Wow! How did they get that figure? This quote from the paper explains how:

During weeks 12 to 25 [after the Fukushima disaster began], total deaths in 119 U.S. cities increased from 148,395 (2010) to 155,015 (2011), or 4.46 percent. This was nearly double the 2.34 percent rise in total deaths (142,006 to 145,324) in 104 cities for the prior 14 weeks, significant at p < 0.000001 (Table 2). This difference between actual and expected changes of +2.12 percentage points (+4.46% – 2.34%) translates to 3,286 “excess” deaths (155,015 × 0.0212) nationwide. Assuming a total of 2,450,000 U.S. deaths will occur in 2011 (47,115 per week), then 23.5 percent of deaths are reported (155,015/14 = 11,073, or 23.5% of 47,115). Dividing 3,286 by 23.5 percent yields a projected 13,983 excess U.S. deaths in weeks 12 to 25 of 2011.

Hmm. Can you think of some potential problems with this analysis?

In the interview, Janette Sherman also mentions increased death rates of children in British Columbia. Here’s the evidence the paper presents for that:

Shortly after the report [another paper by the authors] was issued, officials from British Columbia, Canada, proximate to the northwestern United States, announced that 21 residents had died of sudden infant death syndrome (SIDS) in the first half of 2011, compared with 16 SIDS deaths in all of the prior year. Moreover, the number of deaths from SIDS rose from 1 to 10 in the months of March, April, May, and June 2011, after Fukushima fallout arrived, compared with the same period in 2010. While officials could not offer any explanation for the abrupt increase, it coincides with our findings in the Pacific Northwest.

4) For the first time in 87 years, a wild gray wolf was spotted in California:

• Stephen Messenger, First gray wolf in 80 years enters California, Treehugger, 29 December 2011.

Researchers have been tracking this juvenile male using a GPS-enabled collar since it departed northern Oregon. In just a few weeks, it walked some 730 miles to California. It was last seen surfing off Malibu. Here is a photograph:

5) George Musser left the Centre for Quantum Technologies and returned to New Jersey, but not before writing a nice blog article explaining how the GRACE satellite uses the Earth’s gravitational field to measure the melting of glaciers:

• George Musser, Melting glaciers muck up Earth’s gravitational field, Scientific American, 22 December 2011.

6) The American Physical Society has started a new group: a Topical Group on the Physics of Climate! If you’re a member of the APS, and care about climate issues, you should join this.

7) Finally, here’s a cool picture taken in the Gulf of Alaska by Kent Smith:

He believes this was caused by fresher water meeting saltier water, but it doesn’t sound like he’s sure. Can anyone figure out what’s going on? The foam where the waters meet is especially intriguing.


Quantropy (Part 1)

22 December, 2011

I wish you all happy holidays! My wife Lisa and I are going to Bangkok on Christmas Eve, and thence to Luang Prabang, a town in Laos where the Nam Khan river joins the Mekong. We’ll return to Singapore on the 30th. See you then! And in the meantime, here’s a little present—something to mull over.

Statistical mechanics versus quantum mechanics

There’s a famous analogy between statistical mechanics and quantum mechanics. In statistical mechanics, a system can be in any state, but its probability of being in a state with energy E is proportional to

\exp(-E/T)

where T is the temperature in units where Boltzmann’s constant is 1. In quantum mechanics, a system can move along any path, but its amplitude for moving along a path with action S is proportional to

\exp(i S/\hbar)

where \hbar is Planck’s constant. So, we have an analogy where Planck’s constant is like an imaginary temperature:

Statistical Mechanics     Quantum Mechanics
probabilities             amplitudes
energy                    action
temperature               Planck’s constant times i

In other words, making the replacements

E \mapsto S

T \mapsto i \hbar

formally turns the probabilities for states in statistical mechanics into the amplitudes for paths, or ‘histories’, in quantum mechanics.

But the probabilities \exp(-E/T) arise naturally from maximizing entropy subject to a constraint on the expected energy. So what about the amplitudes \exp(i S/\hbar)?

Following the analogy without thinking too hard, we’d guess it arises from minimizing something subject to a constraint on the expected action.

But now we’re dealing with complex numbers, so ‘minimizing’ doesn’t sound right. It’s better to talk about finding a ‘stationary point’: a place where the derivative of something is zero.

More importantly, what is this something? We’ll have to see—indeed, we’ll have to see if this whole idea makes sense! But for now, let’s just call it ‘quantropy’. This is a goofy word whose only virtue is that it quickly gets the idea across: just as the main ideas in statistical mechanics follow from the idea of maximizing entropy, we’d like the main ideas in quantum mechanics to follow from maximizing… err, well, finding a stationary point… of ‘quantropy’.

I don’t know how well this idea works, but there’s no way to know except by trying, so I’ll try it here. I got this idea thanks to a nudge from Uwe Stroinski and WebHubTel, who started talking about the principle of least action and the principle of maximum entropy at a moment when I was thinking hard about probabilities versus amplitudes.

Of course, if this idea makes sense, someone probably had it already. If you know where, please tell me.

Here’s the story…

Statics

Static systems at temperature zero obey the principle of minimum energy. Energy is typically the sum of kinetic and potential energy:

E = K + V

where the potential energy V depends only on the system’s position, while the kinetic energy K also depends on its velocity. The kinetic energy is often (but not always) a quadratic function of velocity with a minimum at velocity zero. In classical physics this lets our system minimize energy in a two-step way. First it will minimize kinetic energy, K, by staying still. Then it will go on to minimize potential energy, V, by choosing the right place to stay still.

This is actually somewhat surprising: usually minimizing the sum of two things involves an interesting tradeoff. But sometimes it doesn’t!

In quantum physics, a tradeoff is required, thanks to the uncertainty principle. We can’t know the position and velocity of a particle simultaneously, so we can’t simultaneously minimize potential and kinetic energy. This makes minimizing their sum much more interesting, as you’ll know if you’ve ever worked out the lowest-energy state of a harmonic oscillator or hydrogen atom.

But in classical physics, minimizing energy often forces us into ‘statics’: the boring part of physics, the part that studies things that don’t move. And people usually say statics at temperature zero is governed by the principle of minimum potential energy.

Next let’s turn up the heat. What about static systems at nonzero temperature? This is what people study in the subject called ‘thermostatics’, or more often, ‘equilibrium thermodynamics’.

In classical or quantum thermostatics at any fixed temperature, a closed system will obey the principle of minimum free energy. Now it will minimize

F = E - T S

where T is the temperature and S is the entropy. Note that this principle reduces to the principle of minimum energy when T = 0. But as T gets bigger, the second term in the above formula becomes more important, so the system gets more interested in having lots of entropy. That’s why water forms orderly ice crystals at low temperatures (more or less minimizing energy despite low entropy) and a wild random gas at high temperatures (more or less maximizing entropy despite high energy).

But where does the principle of minimum free energy come from?

One nice way to understand it uses probability theory. Suppose for simplicity that our system has a finite set of states, say X, and the energy of the state x \in X is E_x. Instead of our system occupying a single definite state, let’s suppose it can be in any state, with a probability p_x of being in the state x. Then its entropy is, by definition:

\displaystyle{ S = - \sum_x p_x \ln(p_x) }

The expected value of the energy is

\displaystyle{ E = \sum_x p_x E_x }

Now suppose our system maximizes entropy subject to a constraint on the expected value of energy. Thanks to the Lagrange multiplier trick, this is the same as maximizing

S - \beta E

where \beta is a Lagrange multiplier. When we go ahead and maximize this, we see the system chooses a Boltzmann distribution:

\displaystyle{ p_x = \frac{\exp(-\beta E_x)}{\sum_x \exp(-\beta E_x)}}

This is just a calculation; you must do it for yourself someday, and I will not rob you of that joy.
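If you’d rather let a computer peek at the answer numerically first (this doesn’t replace the calculation, and the energy levels here are toy values I picked), you can maximize the entropy subject to the two constraints and compare the result with the Boltzmann formula:

```python
import numpy as np
from scipy.optimize import minimize

E = np.array([0.0, 1.0, 2.0, 5.0])          # toy energy levels
target = 1.2                                # fixed expected energy

def neg_entropy(p):
    return np.sum(p * np.log(p))            # minus the entropy

cons = [{'type': 'eq', 'fun': lambda p: np.sum(p) - 1.0},
        {'type': 'eq', 'fun': lambda p: np.dot(p, E) - target}]
res = minimize(neg_entropy, np.full(len(E), 0.25), method='SLSQP',
               bounds=[(1e-9, 1.0)] * len(E), constraints=cons)
p = res.x

beta = -(np.log(p[1]) - np.log(p[0])) / (E[1] - E[0])     # read off the multiplier
boltzmann = np.exp(-beta * E) / np.sum(np.exp(-beta * E))
print(np.max(np.abs(p - boltzmann)))        # small: the maximizer is a Boltzmann distribution
```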

But what does this mean? We could call \beta the coolness, since its inverse is the temperature, T, at least in units where Boltzmann’s constant is set to 1. So, when the temperature is positive, maximizing S - \beta E is the same as minimizing the free energy:

F = E - T S

(For negative temperatures, maximizing S - \beta E would amount to maximizing free energy.)

So, every minimum or maximum principle described so far can be seen as a special case or limiting case of the principle of maximum entropy, as long as we admit that sometimes we need to maximize entropy subject to constraints.

Why ‘limiting case’? Because the principle of least energy only shows up as the low-temperature limit, or \beta \to \infty limit, of the idea of maximizing entropy subject to a constraint on expected energy. But that’s good enough for me.

Dynamics

Now suppose things are changing as time passes, so we’re doing ‘dynamics’ instead of mere ‘statics’. In classical mechanics we can imagine a system tracing out a path \gamma(t) as time passes from one time to another, for example from t = t_0 to t = t_1. The action of this path is typically the integral of the kinetic minus potential energy:

A(\gamma) = \displaystyle{ \int_{t_0}^{t_1}  (K(t) - V(t)) \, dt }

where K(t) and V(t) depend on the path \gamma. Note that now I’m calling action A instead of the more usual S, since we’re already using S for entropy and I don’t want things to get any more confusing than necessary.

The principle of least action says that if we fix the endpoints of this path, that is the points \gamma(t_0) and \gamma(t_1), the system will follow the path that minimizes the action subject to these constraints.

Why is there a minus sign in the definition of action? How did people come up with the principle of least action? How is it related to the principle of least energy in statics? These are all fascinating questions. But I have a half-written book that tackles these questions, so I won’t delve into them here:

• John Baez and Derek Wise, Lectures on Classical Mechanics.

Instead, let’s go straight to dynamics in quantum mechanics. Here Feynman proposed that instead of our system following a single definite path, it can follow any path, with an amplitude a(\gamma) of following the path \gamma. And he proposed this prescription for the amplitude:

\displaystyle{ a(\gamma) = \frac{\exp(i A(\gamma)/\hbar)}{\int  \exp(i A(\gamma)/\hbar) \, d \gamma}}

where \hbar is Planck’s constant. He also gave a heuristic argument showing that as \hbar \to 0, this prescription reduces to the principle of least action!

Unfortunately the integral over all paths—called a ‘path integral’—is hard to make rigorous except in certain special cases. And it’s a bit of a distraction for what I’m talking about now. So let’s talk more abstractly about ‘histories’ instead of paths with fixed endpoints, and consider a system whose possible ‘histories’ form a finite set, say X. Systems of this sort frequently show up as discrete approximations to continuous ones, but they also show up in other contexts, like quantum cellular automata and topological quantum field theories. Don’t worry if you don’t know what those things are. I’d just prefer to write sums instead of integrals now, to make everything easier.

Suppose the action of the history x \in X is A_x. Then Feynman’s sum over histories formulation of quantum mechanics says the amplitude of the history x is:

\displaystyle{ a_x = \frac{\exp(i A_x /\hbar)}{\sum_x  \exp(i A_x /\hbar) }}

This looks very much like the Boltzmann distribution:

\displaystyle{ p_x = \frac{\exp(-E_x/T)}{\sum_x \exp(- E_x/T)}}

Indeed, the only serious difference is that we’re taking the exponential of an imaginary quantity instead of a real one.

So far everything has been a review of very standard stuff. Now comes something weird and new—at least, new to me.

Quantropy

I’ve described statics and dynamics, and a famous analogy between them, but there are some missing items in the analogy, which would be good to fill in:

Statics                   Dynamics
statistical mechanics     quantum mechanics
probabilities             amplitudes
Boltzmann distribution    Feynman sum over histories
energy                    action
temperature               Planck’s constant times i
entropy                   ???
free energy               ???

Since the Boltzmann distribution

\displaystyle{ p_x = \frac{\exp(-E_x/T)}{\sum_x \exp(- E_x/T)}}

comes from the principle of maximum entropy, you might hope Feynman’s sum over histories formulation of quantum mechanics:

\displaystyle{ a_x = \frac{\exp(i A_x /\hbar)}{\sum_x  \exp(i A_x /\hbar) }}

comes from a maximum principle too!

Unfortunately Feynman’s sum over histories involves complex numbers, and it doesn’t make sense to maximize a complex function. However, when we say nature likes to minimize or maximize something, it often behaves like a bad freshman who applies the first derivative test and quits there: it just finds a stationary point, where the first derivative is zero. For example, in statics we have ‘stable’ equilibria, which are local minima of the energy, but also ‘unstable’ equilibria, which are still stationary points of the energy, but not local minima. This is good for us, because stationary points still make sense for complex functions.

So let’s try to derive Feynman’s prescription from some sort of ‘principle of stationary quantropy’.

Suppose we have a finite set of histories, X, and each history x \in X has a complex amplitude a_x  \in \mathbb{C}. We’ll assume these amplitudes are normalized so that

\sum_x a_x = 1

since that’s what Feynman’s normalization actually achieves. We can try to define the quantropy of a by:

\displaystyle{ Q = - \sum_x a_x \ln(a_x) }

You might fear this is ill-defined when a_x = 0, but that’s not the worst problem; in the study of entropy we typically set

0 \ln 0 = 0

and everything works fine. The worst problem is that the logarithm has different branches: we can add any multiple of 2 \pi i to our logarithm and get another equally good logarithm. For now suppose we’ve chosen a specific logarithm for each number a_x, and suppose that when we vary them they don’t go through zero, so we can smoothly change the logarithm as we move them. This should let us march ahead for now, but clearly it’s a disturbing issue which we should revisit someday.

Next, suppose each history x has an action A_x \in \mathbb{R}. Let’s seek amplitudes a_x that give a stationary point of the quantropy Q subject to a constraint on the expected action:

\displaystyle{ A = \sum_x a_x A_x }

The term ‘expected action’ is a bit odd, since the numbers a_x are amplitudes rather than probabilities. While I could try to justify it from how expected values are computed in Feynman’s formalism, I’m mainly using this term because A is analogous to the expected value of the energy, which we saw earlier. We can worry later what all this stuff really means; right now I’m just trying to push forwards with an analogy and do a calculation.

So, let’s look for a stationary point of Q subject to a constraint on A. To do this, I’d be inclined to use Lagrange multipliers and look for a stationary point of

Q - \lambda A

But there’s another constraint, too, namely

\sum_x a_x = 1

So let’s write

B = \sum_x a_x

and look for stationary points of Q subject to the constraints

A = \alpha , \qquad B = 1

To do this, the Lagrange multiplier recipe says we should find stationary points of

Q - \lambda A - \mu B

where \lambda and \mu are Lagrange multipliers. The Lagrange multiplier \lambda is really interesting. It’s analogous to ‘coolness’, \beta = 1/T, so our analogy chart suggests that

\lambda = 1/i\hbar

This says that when \lambda gets big our system becomes close to classical. So, we could call \lambda the classicality of our system. The Lagrange multiplier \mu is less interesting—or at least I haven’t thought about it much.

So, we’ll follow the usual Lagrange multiplier recipe and look for amplitudes for which

0 = \displaystyle{ \frac{\partial}{\partial a_x} \left(Q - \lambda A - \mu B \right) }

holds, along with the constraint equations. We begin by computing the derivatives we need:

\begin{array}{cclcl} \displaystyle{ \frac{\partial}{\partial a_x} Q  }  &=& - \displaystyle{ \frac{\partial}{\partial a_x} \; a_x \ln(a_x)}   &=& - \ln(a_x) - 1 \\    \\    \displaystyle{ \frac{\partial}{\partial a_x}\; A  }  &=& \displaystyle{ \frac{\partial}{\partial a_x} a_x A_x}  &=& A_x \\    \\   \displaystyle{ \frac{\partial}{\partial a_x} B  }  &=& \displaystyle{ \frac{\partial}{\partial a_x}\; a_x }  &=& 1 \end{array}

Thus, we need

0 = \displaystyle{ \frac{\partial}{\partial a_x} \left(Q - \lambda A - \mu B \right) = -\ln(a_x) - 1- \lambda A_x - \mu }

or

\displaystyle{ a_x = \frac{\exp(-\lambda A_x)}{\exp(\mu + 1)} }

The constraint

\sum_x a_x = 1

then forces us to choose:

\displaystyle{ \exp(\mu + 1) = \sum_x \exp(-\lambda A_x) }

so we have

\displaystyle{ a_x = \frac{\exp(-\lambda A_x)}{\sum_x \exp(-\lambda A_x)} }

Hurrah! This is precisely Feynman’s sum over histories formulation of quantum mechanics if

\lambda = 1/i\hbar

We could go further with the calculation, but this is the punchline, so I’ll stop here. I’ll just note that the final answer:

\displaystyle{ a_x = \frac{\exp(iA_x/\hbar)}{\sum_x \exp(iA_x/\hbar)} }

does two equivalent things in one blow:

• It gives a stationary point of quantropy subject to the constraints that the amplitudes sum to 1 and the expected action takes some fixed value.

• It gives a stationary point of the free action:

A - i \hbar Q

subject to the constraint that the amplitudes sum to 1.

In case the second point is puzzling, note that the ‘free action’ is the quantum analogue of ‘free energy’, E - T S. It’s also just Q - \lambda A times -i \hbar, and we already saw that finding stationary points of Q - \lambda A is another way of finding stationary points of quantropy with a constraint on the expected action.

Note also that when \hbar \to 0, free action reduces to action, so we recover the principle of least action—or at least stationary action—in classical mechanics.
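Here’s a small numerical check of that stationarity (toy actions of my own, with ℏ = 1 and numpy’s principal branch of the logarithm, which is fine as long as the actions stay small): perturbing the Feynman amplitudes in any direction that preserves their sum changes the free action only at second order.

```python
import numpy as np

hbar = 1.0
A = np.array([0.1, 0.4, 0.7, 1.0])           # toy actions, kept small so the principal
                                             # branch of the log is the one we want
a = np.exp(1j * A / hbar)
a /= a.sum()                                 # Feynman-normalized amplitudes

def free_action(amp):
    Q = -np.sum(amp * np.log(amp))           # quantropy (principal branch of log)
    return np.dot(amp, A) - 1j * hbar * Q    # free action  A - i*hbar*Q

rng = np.random.default_rng(1)
delta = rng.normal(size=len(A)) + 1j * rng.normal(size=len(A))
delta -= delta.mean()                        # so the perturbation keeps sum(a) = 1

for eps in (1e-3, 1e-4, 1e-5):
    change = free_action(a + eps * delta) - free_action(a)
    print(eps, abs(change))                  # shrinks like eps**2: a stationary point
```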

Summary. We recover Feynman’s sum over histories formulation of quantum mechanics from assuming that all histories have complex amplitudes, that these amplitudes sum to one, and that the amplitudes give a stationary point of quantropy subject to a constraint on the expected action. Alternatively, we can assume the amplitudes sum to one and that they give a stationary point of free action.

That’s sort of nice! So, here’s our analogy chart, all filled in:

Statics                   Dynamics
statistical mechanics     quantum mechanics
probabilities             amplitudes
Boltzmann distribution    Feynman sum over histories
energy                    action
temperature               Planck’s constant times i
entropy                   quantropy
free energy               free action
