Classical Mechanics versus Thermodynamics (Part 1)

19 January, 2012

It came as a bit of a shock last week when I realized that some of the equations I’d learned in thermodynamics were just the same as equations I’d learned in classical mechanics—with only the names of the variables changed, to protect the innocent.

Why didn’t anyone tell me?

For example: everybody loves Hamilton’s equations: there are just two, and they summarize the entire essence of classical mechanics. Most people hate the Maxwell relations in thermodynamics: there are lots, and they’re hard to remember.

But what I’d like to show you now is that Hamilton’s equations are Maxwell relations! They’re a special case, and you can derive them the same way. I hope this will make you like the Maxwell relations more, instead of liking Hamilton’s equations less.

First, let’s see what these equations look like. Then let’s see why Hamilton’s equations are a special case of the Maxwell relations. And then let’s talk about how this might help us unify different aspects of physics.

Hamilton’s equations

Suppose you have a particle on the line whose position q and momentum p are functions of time, t. If the energy H is a function of position and momentum, Hamilton’s equations say:

\begin{array}{ccr}  \displaystyle{  \frac{d p}{d t} }  &=&  \displaystyle{- \frac{\partial H}{\partial q} } \\  \\ \displaystyle{  \frac{d q}{d t} } &=&  \displaystyle{ \frac{\partial H}{\partial p} }  \end{array}
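For example (a standard illustration, not something from the text above): for a harmonic oscillator with

\displaystyle{ H = \frac{p^2}{2m} + \frac{k q^2}{2} }

Hamilton’s equations give

\displaystyle{ \frac{d q}{d t} = \frac{p}{m}, \qquad \frac{d p}{d t} = - k q }

which is just Newton’s law for a spring, together with the usual relation between momentum and velocity.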

The Maxwell relations

There are lots of Maxwell relations, and that’s one reason people hate them. But let’s just talk about two; most of the others work the same way.

Suppose you have a physical system like a box of gas that has some volume V, pressure P, temperature T and entropy S. Then the first and second Maxwell relations say:

\begin{array}{ccr}  \displaystyle{ \left. \frac{\partial T}{\partial V}\right|_S } &=&  \displaystyle{ - \left. \frac{\partial P}{\partial S}\right|_V } \\   \\   \displaystyle{ \left. \frac{\partial S}{\partial  V}\right|_T  }  &=&  \displaystyle{ \left. \frac{\partial P}{\partial T} \right|_V }   \end{array}
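As a quick sanity check (a standard textbook example, not part of the argument here): for an ideal gas with P V = n R T, the entropy at fixed temperature grows like n R \ln V, so

\displaystyle{ \left. \frac{\partial S}{\partial V}\right|_T = \frac{n R}{V} = \left. \frac{\partial P}{\partial T} \right|_V }

which is exactly the second relation.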

Comparison

Clearly Hamilton’s equations resemble the Maxwell relations. Please check for yourself that the patterns of variables are exactly the same: only the names have been changed! So, apart from a key subtlety, Hamilton’s equations become the first and second Maxwell relations if we make these replacements:

\begin{array} {ccccccc}  q &\to& S & &  p &\to & T \\ t & \to & V & & H &\to & P \end{array}

What’s the key subtlety? One reason people hate the Maxwell relations is that they have lots of little symbols like \left. \right|_V saying what to hold constant when we take our partial derivatives. Hamilton’s equations don’t have those.

So, you probably won’t like this, but let’s see what we get if we write Hamilton’s equations so they exactly match the pattern of the Maxwell relations:

\begin{array}{ccr}     \displaystyle{ \left. \frac{\partial p}{\partial t} \right|_q }  &=&  \displaystyle{- \left. \frac{\partial H}{\partial q} \right|_t } \\  \\\displaystyle{  \left.\frac{\partial q}{\partial t} \right|_p } &=&  \displaystyle{ \left. \frac{\partial H}{\partial p} \right|_t }    \end{array}

This looks a bit weird, and it set me back a day. What does it mean to take the partial derivative of q in the t direction while holding p constant, for example?

I still think it’s weird. But I think it’s correct. To see this, let’s derive the Maxwell relations, and then derive Hamilton’s equations using the exact same reasoning, with only the names of variables changed.

Deriving the Maxwell relations

The Maxwell relations are extremely general, so let’s derive them in a way that makes that painfully clear. Suppose we have any smooth function U on the plane. Just for laughs, let’s call the coordinates of this plane S and V. Then we have

d U = T d S - P d V

for some functions T and P. This equation is just a concise way of saying that

\displaystyle{ T = \left.\frac{\partial U}{\partial S}\right|_V }

and

\displaystyle{ P = - \left.\frac{\partial U}{\partial V}\right|_S }

The minus sign here is unimportant: you can think of it as a whimsical joke. All the math would work just as well if we left it out.

(In reality, physicists call U the internal energy of a system, regarded as a function of its entropy S and volume V. They then call T the temperature and P the pressure. It just so happens that for lots of systems the internal energy goes down as you increase the volume, so P works out to be positive if we stick in this minus sign, and that’s what people did. But you don’t need to know any of this physics to follow the derivation of the Maxwell relations!)

Now, mixed partial derivatives commute, so we have:

\displaystyle{ \frac{\partial^2 U}{\partial V \partial S} =  \frac{\partial^2 U}{\partial S \partial V}}

Plugging in our definitions of T and P, this says

\displaystyle{ \left. \frac{\partial T}{\partial V}\right|_S = - \left. \frac{\partial P}{\partial S}\right|_V }

And that’s the first Maxwell relation! So, there’s nothing to it: it’s just a sneaky way of saying that the mixed partial derivatives of the function U commute.
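If you like checking such things by computer, here is a minimal SymPy sketch of exactly this argument (my own illustration; the particular U is an arbitrary made-up choice with no physical meaning):

# A minimal SymPy check of the argument above, for an arbitrary made-up U(S,V).
import sympy as sp

S, V = sp.symbols('S V', positive=True)
U = sp.exp(S) * V + S**3 / V          # any smooth function of S and V will do

T = sp.diff(U, S)                     # T = dU/dS at constant V
P = -sp.diff(U, V)                    # P = -dU/dV at constant S

lhs = sp.diff(T, V)                   # dT/dV at constant S
rhs = -sp.diff(P, S)                  # -dP/dS at constant V
print(sp.simplify(lhs - rhs))         # prints 0: the first Maxwell relation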

The second Maxwell relation works the same way. But seeing this takes a bit of thought, since we need to cook up a suitable function whose mixed partial derivatives are the two sides of this equation:

\displaystyle{ \left. \frac{\partial S}{\partial  V}\right|_T  = \left. \frac{\partial P}{\partial T} \right|_V }

There are different ways to do this, but for now let me use the time-honored method of ‘pulling the rabbit from the hat’.

Here’s the function we want:

A = U - T S

(In thermodynamics this function is called the Helmholtz free energy. It’s sometimes denoted F, but the International Union of Pure and Applied Chemistry recommends calling it A, which stands for the German word ‘Arbeit’, meaning ‘work’.)

Let’s check that this function does the trick:

\begin{array}{ccl} d A &=& d U - d(T S) \\  &=& (T d S - P d V) - (S dT + T d S) \\  &=& -S d T - P dV \end{array}

If we restrict ourselves to any subset of the plane where T and V serve as coordinates, the above equation is just a concise way of saying

\displaystyle{ S = - \left.\frac{\partial A}{\partial T}\right|_V }

and

\displaystyle{ P = - \left.\frac{\partial A}{\partial V}\right|_T }

Then since mixed partial derivatives commute, we get:

\displaystyle{ \frac{\partial^2 A}{\partial V \partial T} =  \frac{\partial^2 A}{\partial T \partial V}}

or in other words:

\displaystyle{ \left. \frac{\partial S}{\partial  V}\right|_T  = \left. \frac{\partial P}{\partial T} \right|_V }

which is the second Maxwell relation.

We can keep playing this game using various pairs of the four functions S, T, P, V as coordinates, and get more Maxwell relations: enough to give ourselves a headache! But we have better things to do today.

Hamilton’s equations as Maxwell relations

For example: let’s see how Hamilton’s equations fit into this game. Suppose we have a particle on the line. Consider smooth paths where it starts at some fixed position at some fixed time and ends at the point q at the time t. Nature will choose a path with least action—or at least one that’s a stationary point of the action. Let’s assume there’s a unique such path, and that it depends smoothly on q and t. For this to be true, we may need to restrict q and t to a subset of the plane, but that’s okay: go ahead and pick such a subset.

Given q and t in this set, nature will pick the path that’s a stationary point of action; the action of this path is called Hamilton’s principal function and denoted S(q,t). (Beware: this S is not the same as entropy!)

Let’s assume S is smooth. Then we can copy our derivation of the Maxwell relations line for line and get Hamilton’s equations! Let’s do it, skipping some steps but writing down the key results.

For starters we have

d S = p d q - H d t

for some functions p and H called the momentum and energy, which obey

\displaystyle{ p = \left.\frac{\partial S}{\partial q}\right|_t }

and

\displaystyle{ H = - \left.\frac{\partial S}{\partial t}\right|_q }

As far as I can tell it’s just a cute coincidence that we see a minus sign in the same place as before! Anyway, the fact that mixed partials commute gives us

\displaystyle{ \left. \frac{\partial p}{\partial t} \right|_q = - \left. \frac{\partial H}{\partial q} \right|_t }

which is the first of Hamilton’s equations. And now we see that all the funny \left. \right|_q and \left. \right|_t things are actually correct!
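Here’s a concrete check, using a standard example that isn’t part of the argument above: for a free particle of mass m that starts at q = 0 at time 0, Hamilton’s principal function for t > 0 is

\displaystyle{ S(q,t) = \frac{m q^2}{2 t} }

so

\displaystyle{ p = \left.\frac{\partial S}{\partial q}\right|_t = \frac{m q}{t}, \qquad H = - \left.\frac{\partial S}{\partial t}\right|_q = \frac{m q^2}{2 t^2} = \frac{p^2}{2m} }

and indeed

\displaystyle{ \left. \frac{\partial p}{\partial t} \right|_q = - \frac{m q}{t^2} = - \left. \frac{\partial H}{\partial q} \right|_t }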

Next, we pull a rabbit out of our hat. We define this function:

X = S - p q

and check that

d X = - q dp - H d t
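Spelling out the check, using d S = p d q - H d t and the product rule:

\begin{array}{ccl} d X &=& d S - d(p q) \\ &=& (p d q - H d t) - (q d p + p d q) \\ &=& - q d p - H d t \end{array}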

This function X probably has a standard name, but I don’t know it. Do you?

Then, considering any subset of the plane where p and t serve as coordinates, we see that because mixed partials commute:

\displaystyle{ \frac{\partial^2 X}{\partial t \partial p} =  \frac{\partial^2 X}{\partial p \partial t}}

we get

\displaystyle{ \left. \frac{\partial q}{\partial t} \right|_p = \left. \frac{\partial H}{\partial p} \right|_t }

So, we’re done!

But you might be wondering how we pulled this rabbit out of the hat. More precisely, why did we suspect it was there in the first place? There’s a nice answer if you’re comfortable with differential forms. We start with what we know:

d S = p d q - H d t

Next, we use this fundamental equation:

d^2 = 0

to note that:

\begin{array}{ccl}  0 &=& d^2 S \\ &=& d(p d q- H d t) \\ &=& d p \wedge d q - d H \wedge d t \\ &=& - dq \wedge d p - d H \wedge d t \\ &=& d(-q d p - H d t) \end{array}

See? We’ve managed to switch the roles of p and q, at the cost of an extra minus sign!

Then, if we restrict attention to any contractible open subset of the plane, the Poincaré Lemma says

d \omega = 0 \implies \omega = d \mu \; \textrm{for some} \; \mu

Since

d(- q d p - H d t) = 0

it follows that there’s a function X with

d X = - q d p - H d t

This is our rabbit. And if you ponder the difference between -q d p and p d q, you’ll see it’s -d( p q). So, it’s no surprise that

X = S - p q

The big picture

Now let’s step back and think about what’s going on.

Lately I’ve been trying to unify a bunch of ‘extremal principles’, including:

1) the principle of least action
2) the principle of least energy
3) the principle of maximum entropy
4) the principle of maximum simplicity, or Occam’s razor

In my post on quantropy I explained how the first three principles fit into a single framework if we treat Planck’s constant as an imaginary temperature. The guiding principle of this framework is

maximize entropy
subject to the constraints imposed by what you believe

And that’s nice, because E. T. Jaynes has made a powerful case for this principle.

However, when the temperature is imaginary, entropy is so different that it may deserve a new name: say, ‘quantropy’. In particular, it’s complex-valued, so instead of maximizing it we have to look for stationary points: places where its first derivative is zero. But this isn’t so bad. Indeed, a lot of minimum and maximum principles are really ‘stationary principles’ if you examine them carefully.

What about the fourth principle: Occam’s razor? We can formalize this using algorithmic probability theory. Occam’s razor then becomes yet another special case of

maximize entropy
subject to the constraints imposed by what you believe

once we realize that algorithmic entropy is a special case of ordinary entropy.

All of this deserves plenty of further thought and discussion—but not today!

Today I just want to point out that once we’ve formally unified classical mechanics and thermal statics (often misleadingly called ‘thermodynamics’), as sketched in the article on quantropy, we should be able to take any idea from one subject and transpose it to the other. And it’s true. I just showed you an example, but there are lots of others!

I guessed this should be possible after pondering three famous facts:

• In classical mechanics, if we fix the initial position of a particle, we can pick any position q and time t at which the particle’s path ends, and nature will seek the path to this endpoint that minimizes the action. This minimal action is Hamilton’s principal function S(q,t), which obeys

d S = p d q - H d t

In thermodynamics, if we fix the entropy S and volume V of a box of gas, nature will seek the probability distribution of microstates that minimizes the energy. This minimal energy is the internal energy U(S,V), which obeys

d U = T d S - P d V

• In classical mechanics we have canonically conjugate quantities, while in statistical mechanics we have conjugate variables. In classical mechanics the canonical conjugate of the position q is the momentum p, while the canonical conjugate of time t is energy H. In thermodynamics, the conjugate of entropy S is temperature T, while the conjugate of volume V is pressure P. All this fits in perfectly with the analogy we’ve been using today:

\begin{array} {ccccccc}  q &\to& S & &  p &\to & T \\ t & \to & V & & H &\to & P \end{array}

• Something called the Legendre transformation plays a big role both in classical mechanics and thermodynamics. This transformation takes a function of some variable and turns it into a function of the conjugate variable. In our proof of the Maxwell relations, we secretly used a Legendre transformation to pass from the internal energy U(S,V) to the Helmholtz free energy A(T,V):

A = U - T S

where we must solve for the entropy S in terms of T and V to think of A as a function of these two variables.

Similarly, in our proof of Hamilton’s equations, we passed from Hamilton’s principal function S(q,t) to the function X(p,t):

X = S - p q

where we must solve for the position q in terms of p and t to think of X as a function of these two variables.
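If you want to see the first of these Legendre transforms done by computer, here is a minimal SymPy sketch; the particular U below is an arbitrary made-up choice, just so there is something concrete to differentiate and invert:

# A minimal SymPy sketch of the Legendre transform U(S,V) -> A(T,V).
import sympy as sp

S, V, T = sp.symbols('S V T', positive=True)
U = sp.exp(S) / V**2                            # made-up internal energy U(S, V)

T_of_SV = sp.diff(U, S)                         # T = dU/dS at constant V
S_of_TV = sp.solve(sp.Eq(T, T_of_SV), S)[0]     # solve for S in terms of T and V

A = (U - T * S).subs(S, S_of_TV)                # Helmholtz free energy A(T, V)
print(sp.simplify(A))                           # T - T*log(T*V**2), up to rearrangement
print(sp.simplify(sp.diff(A, T) + S_of_TV))     # prints 0, confirming S = -dA/dT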

I hope you see that all this stuff fits together in a nice picture, and I hope to say a bit more about it soon. The most exciting thing for me will be to see how symplectic geometry, so important in classical mechanics, can be carried over to thermodynamics. Why? Because I’ve never seen anyone use symplectic geometry in thermodynamics. But maybe I just haven’t looked hard enough!

Indeed, it’s perfectly possible that some people already know what I’ve been saying today. Have you seen someone point out that Hamilton’s equations are a special case of the Maxwell relations? This would seem to be the first step towards importing all of symplectic geometry to thermodynamics.


Extremal Principles in Classical, Statistical and Quantum Mechanics

13 January, 2012

guest post by Mike Stay

The table in John’s post on quantropy shows that energy and action are analogous:

Statics                   Dynamics
statistical mechanics     quantum mechanics
probabilities             amplitudes
Boltzmann distribution    Feynman sum over histories
energy                    action
temperature               Planck’s constant times i
entropy                   quantropy
free energy               free action

However, this seems to be part of a bigger picture that includes at least entropy as analogous to both of those, too. I think that just about any quantity defined by an integral over a path would behave similarly.

I see four broad areas to consider, based on a temperature parameter:

  1. T = 0: statics, or “least quantity”
  2. Real T > 0: statistical mechanics
  3. Imaginary T: a thermal ensemble gets replaced by a quantum superposition
  4. Complex T: ensembles of quantum systems, as in nuclear magnetic resonance

I’m not going to get into the last of these in what follows.

1. “Least quantity”

Lagrangian of a classical particle

K is kinetic energy, i.e. the “action density” due to motion.

V is potential energy, i.e. minus the “action density” due to position.

The action is then:

\displaystyle \begin{array}{rcl}   A &=& \int (K-V) \,  d t \\ & = & \int \left[m\left(\frac{d q(t)}{d t}\right)^2 - V(q(t))\right] d t  \end{array}

where m is the particle’s mass. We get the principle of least action by setting \delta A = 0.

“Static” systems related by a Wick rotation

  1. Substitute q(s = iz) for q(t) to get a “springy” static system.

    In John’s homework problem A Spring in Imaginary Time, he guided students through a Wick-rotation-like process that transforms the Lagrangian above into the Hamiltonian of a springy system. (I say “springy” because it’s not exactly the Hamiltonian for a hanging spring: here each infinitesimal piece of the spring is at a fixed horizontal position and is free to move only vertically.)

    \kappa is the potential energy density due to stretching.

    \upsilon is the potential energy density due to position.

    We then have

    \displaystyle  \begin{array}{rcl}\int(\kappa-\upsilon) dz & = &  \int\left[k\left(\frac{dq(iz)}{dz}\right)^2 - \upsilon(q(iz))\right]  dz\\ & = & -i\int\left[-k\left(\frac{dq(iz)}{diz}\right)^2 -  \upsilon(q(iz))\right] diz\\ & = & i  \int\left[k\left(\frac{dq(iz)}{diz}\right)^2 + \upsilon(q(iz))\right]  diz \end{array}

    or letting s = iz,

    \displaystyle  \begin{array}{rcl}   & = &  i\int\left[k\left(\frac{dq(s)}{ds}\right)^2 + \upsilon(q(s))\right]  ds\\ & = & iE \end{array}

    where E is the potential energy of the spring. We get the principle of least energy by setting \delta E = 0.

  2. Substitute q(β = iz) for q(t) to get a thermometer system.

    We can repeat the process above, but use inverse temperature, or “coolness”, instead of time. Note that this is still a statics problem at heart! We’ll introduce another temperature below when we allow for multiple possible q‘s.

    K is the potential energy due to the rate of change of q with respect to \beta. (This has to do with the thermal expansion coefficient: if we fix the length of the thermometer and then cool it, we get “stretching” potential energy.)

    V is any extra potential energy due to q.

    \displaystyle \begin{array}{rcl}\int(K-V) dz  & = & \int\left[k\left(\frac{dq(iz)}{dz}\right)^2 -  V(q(iz))\right] dz\\ & = &  -i\int\left[-k\left(\frac{dq(iz)}{diz}\right)^2 - V(q(iz))\right]  diz\\ & = & i \int\left[k\left(\frac{dq(iz)}{diz}\right)^2 +  V(q(iz))\right] diz \end{array}

    or letting \beta = iz,

    \displaystyle \begin{array}{rcl}   & = &  i\int\left[k\left(\frac{dq(\beta)}{d\beta}\right)^2 +  V(q(\beta))\right] d\beta\\ & = & iS_1\end{array}

    where S_1 is the entropy lost as the thermometer is cooled. We get the principle of “least entropy lost” by setting \delta S_1 = 0.

  3. Substitute q(T₁ = iz) for q(t).

    We can repeat the process above, but use temperature instead of time. We get a system whose heat capacity is governed by a function q(T) and its derivative. We’re trying to find the best function q, the most efficient way to raise the temperature of the system.

    C is the heat capacity (= entropy) proportional to (dq/dT_1)^2.

    V is the heat capacity due to q.

    \displaystyle \begin{array}{rcl}\int(C-V) dz  & = & \int\left[k\left(\frac{dq(iz)}{dz}\right)^2 -  V(q(iz))\right] dz\\ & = &  -i\int\left[-k\left(\frac{dq(iz)}{diz}\right)^2 - V(q(iz))\right]  diz\\ & = & i \int\left[k\left(\frac{dq(iz)}{diz}\right)^2 +  V(q(iz))\right] diz  \end{array}

    or letting T_1 = iz,

    \displaystyle \begin{array}{rcl} & = &  i\int\left[k\left(\frac{dq(T_1)}{dT_1}\right)^2 + V(q(T_1))\right]  dT_1\\ & = & iE \end{array}

    where E is the energy required to raise the temperature. We again get the principle of least energy by setting \delta E = 0.

2. Statistical mechanics

Here we allow lots of possible q‘s, then maximize entropy subject to constraints using the Lagrange multiplier trick.

Statistical mechanics of a particle

For the statistical mechanics of a particle, we choose a real measure a_x on the set of paths. For simplicity, we assume the set is finite.

Normalize so \sum a_x = 1.

Define entropy to be S = - \sum a_x \ln a_x.

Our problem is to choose a_x to minimize the “free action” F = A - \lambda S, or, what’s equivalent, to maximize S subject to a constraint on A.

To make units match, λ must have units of action, so it’s some multiple of ℏ. Replace λ by ℏλ, so the free action is

F = A - \hbar\lambda\, S.

The distribution that minimizes the free action is the Gibbs distribution a_x = \exp(-A_x/\hbar\lambda) / Z, where Z is the usual partition function.

However, there are other observables of a path, like the position q_{1/2} at the halfway point; given another constraint on the average value of q_{1/2} over all paths, we get a distribution like

\displaystyle a_x = \exp(-\left[A +  pq_{1/2}\right]/\hbar\lambda) / Z.

The conjugate variable to that position is a momentum: in order to get from the starting point to the given point in the allotted time, the particle has to have the corresponding momentum.

dA = \hbar\lambda\, dS - p\, dq.

Other examples from Wick rotation

  1. Introduce a temperature T [Kelvins] that perturbs the spring.

    We minimize the free energy F = E - kT\, S, i.e. maximize the entropy S subject to a constraint on the expected energy

    \langle E\rangle = \sum a_x E_x.

    We get the measure a_x = \exp(-E_x/kT) / Z.

    Other observables about the spring’s path give conjugate variables whose product is energy. Given a constraint on the average position of the spring at the halfway point, we get a conjugate force: pulling the spring out of equilibrium requires a force.

    dE = kT\, dS - F\, dq.

  2. Statistical ensemble of thermometers with ensemble temperature T₂ [unitless].

    We minimize the “free entropy” F = S_1 - T_2S_2, i.e. we maximize the entropy S_2 subject to a constraint on the expected entropy lost

    \langle S_1\rangle = \sum a_x S_{1,x}.

    We get the measure a_x = \exp(-S_{1,x}/T_2) / Z.

    Given a constraint on the average position at the halfway point, we get a conjugate inverse length r that tells how much entropy is lost when the thermometer shrinks by dq.

    dS_1 = T_2\, dS_2 - r\, dq.

  3. Statistical ensemble of functions q with ensemble temperature T₂ [Kelvins].

    We minimize the free energy F = E - kT_2\, S, i.e. we maximize the entropy S subject to a constraint on the expected energy

    \displaystyle \langle E\rangle = \sum a_x E_x.

    We get the measure a_x = \exp(-E_x/kT_2) / Z.

    Again, a constraint on the position would give a conjugate force. It’s a little harder to see how here, but given a non-optimal function q(T), we have an extra energy cost due to inefficiency that’s analogous to the stretching potential energy when pulling a spring out of equilibrium.

3. Thermo to quantum via Wick rotation of Lagrange multiplier

We allow a complex-valued measure a as John did in the article on quantropy. We pick a logarithm for each a_x and assume they don’t go through zero as we vary them. We also choose an imaginary Lagrange multiplier.

Normalize so \sum a_x = 1.

Define quantropy Q = - \sum a_x \ln a_x.

Find a stationary point of the free action F = A - \hbar\lambda\, Q.

We get a_x = \exp(-A_x/\hbar\lambda). If \lambda = -i, we get Feynman’s sum over histories. Surely something like the two-slit experiment considers histories with a constraint on position at a particular time, and we get a conjugate momentum?

A Quantum Version of Entropy

Again allow complex-valued a_x. However, this time normalize these by setting \sum |a_x|^2 = 1.

Define a quantum version of entropy S = - \sum |a_x|^2  \ln |a_x|^2.

  1. Allow quantum superposition of perturbed springs.

    \langle E\rangle = \sum |a_x|^2 E_x. Get a_x =  \exp(-E_x/kT) / Z. If T = -i\hbar/tk, we get the evolution of the quantum state |q\rangle under the given Hamiltonian for a time t.

  2. Allow quantum superpositions of thermometers.

    \langle S_1\rangle = \sum |a_x|^2 S_{1,x}. Get a_x =  \exp(-S_{1,x}/T_2) / Z. If T_2 = -i, we get something like a sum over histories, but with a different normalization condition that converges because our set of paths is finite.

  3. Allow quantum superposition of systems.

    \langle E \rangle = \sum |a_x|^2 E_x. Get a_x =\exp(-E_x/kT_2) / Z. If T_2 = -i\hbar/tk, we get the result of “Measure E, then heat the superposition T₁ degrees in a time much less than t seconds, then wait t seconds.” Different functions q in the superposition change the heat capacity differently and thus the systems end up at different energies.

So to sum up, there’s at least a three-way analogy between action, energy, and entropy depending on what you’re integrating over. You get a kind of “statics” if you extremize the integral by varying the path; by allowing multiple paths and constraints on observables, you get conjugate variables and “free” quantities that you want to minimize; and by taking the temperature to be imaginary, you get quantum systems.


Quantropy (Part 1)

22 December, 2011

I wish you all happy holidays! My wife Lisa and I are going to Bangkok on Christmas Eve, and thence to Luang Prabang, a town in Laos where the Nam Khan river joins the Mekong. We’ll return to Singapore on the 30th. See you then! And in the meantime, here’s a little present—something to mull over.

Statistical mechanics versus quantum mechanics

There’s a famous analogy between statistical mechanics and quantum mechanics. In statistical mechanics, a system can be in any state, but its probability of being in a state with energy E is proportional to

\exp(-E/T)

where T is the temperature in units where Boltzmann’s constant is 1. In quantum mechanics, a system can move along any path, but its amplitude for moving along a path with action S is proportional to

\exp(i S/\hbar)

where \hbar is Planck’s constant. So, we have an analogy where Planck’s constant is like an imaginary temperature:

Statistical Mechanics     Quantum Mechanics
probabilities             amplitudes
energy                    action
temperature               Planck’s constant times i

In other words, making the replacements

E \mapsto S

T \mapsto i \hbar

formally turns the probabilities for states in statistical mechanics into the amplitudes for paths, or ‘histories’, in quantum mechanics.

But the probabilities \exp(-E/T) arise naturally from maximizing entropy subject to a constraint on the expected energy. So what about the amplitudes \exp(i S/\hbar)?

Following the analogy without thinking too hard, we’d guess it arises from minimizing something subject to a constraint on the expected action.

But now we’re dealing with complex numbers, so ‘minimizing’ doesn’t sound right. It’s better to talk about finding a ‘stationary point’: a place where the derivative of something is zero.

More importantly, what is this something? We’ll have to see—indeed, we’ll have to see if this whole idea makes sense! But for now, let’s just call it ‘quantropy’. This is a goofy word whose only virtue is that it quickly gets the idea across: just as the main ideas in statistical mechanics follow from the idea of maximizing entropy, we’d like the main ideas in quantum mechanics to follow from maximizing… err, well, finding a stationary point… of ‘quantropy’.

I don’t know how well this idea works, but there’s no way to know except by trying, so I’ll try it here. I got this idea thanks to a nudge from Uwe Stroinski and WebHubTel, who started talking about the principle of least action and the principle of maximum entropy at a moment when I was thinking hard about probabilities versus amplitudes.

Of course, if this idea makes sense, someone probably had it already. If you know where, please tell me.

Here’s the story…

Statics

Static systems at temperature zero obey the principle of minimum energy. Energy is typically the sum of kinetic and potential energy:

E = K + V

where the potential energy V depends only on the system’s position, while the kinetic energy K also depends on its velocity. The kinetic energy is often (but not always) a quadratic function of velocity with a minimum at velocity zero. In classical physics this lets our system minimize energy in a two-step way. First it will minimize kinetic energy, K, by staying still. Then it will go on to minimize potential energy, V, by choosing the right place to stay still.

This is actually somewhat surprising: usually minimizing the sum of two things involves an interesting tradeoff. But sometimes it doesn’t!

In quantum physics, a tradeoff is required, thanks to the uncertainty principle. We can’t know the position and velocity of a particle simultaneously, so we can’t simultaneously minimize potential and kinetic energy. This makes minimizing their sum much more interesting, as you’ll know if you’ve ever worked out the lowest-energy state of a harmonic oscillator or hydrogen atom.

But in classical physics, minimizing energy often forces us into ‘statics’: the boring part of physics, the part that studies things that don’t move. And people usually say statics at temperature zero is governed by the principle of minimum potential energy.

Next let’s turn up the heat. What about static systems at nonzero temperature? This is what people study in the subject called ‘thermostatics’, or more often, ‘equilibrium thermodynamics’.

In classical or quantum thermostatics at any fixed temperature, a closed system will obey the principle of minimum free energy. Now it will minimize

F = E - T S

where T is the temperature and S is the entropy. Note that this principle reduces to the principle of minimum energy when T = 0. But as T gets bigger, the second term in the above formula becomes more important, so the system gets more interested in having lots of entropy. That’s why water forms orderly ice crystals at low temperatures (more or less minimizing energy despite low entropy) and a wild random gas at high temperatures (more or less maximizing entropy despite high energy).

But where does the principle of minimum free energy come from?

One nice way to understand it uses probability theory. Suppose for simplicity that our system has a finite set of states, say X, and the energy of the state x \in X is E_x. Instead of our system occupying a single definite state, let’s suppose it can be in any state, with a probability p_x of being in the state x. Then its entropy is, by definition:

\displaystyle{ S = - \sum_x p_x \ln(p_x) }

The expected value of the energy is

\displaystyle{ E = \sum_x p_x E_x }

Now suppose our system maximizes entropy subject to a constraint on the expected value of energy. Thanks to the Lagrange multiplier trick, this is the same as maximizing

S - \beta E

where \beta is a Lagrange multiplier. When we go ahead and maximize this, we see the system chooses a Boltzmann distribution:

\displaystyle{ p_x = \frac{\exp(-\beta E_x)}{\sum_x \exp(-\beta E_x)}}

This is just a calculation; you must do it for yourself someday, and I will not rob you of that joy.
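(If you’d like a quick numerical sanity check that doesn’t spoil the analytic calculation, here is a sketch using NumPy and SciPy, with made-up energy levels and a made-up value of \beta; it maximizes S - \beta E over the probability simplex and compares the result with the Boltzmann distribution.)

# Numerically maximize S - beta*E over probability distributions on 3 states
# and compare with the Boltzmann distribution. Energies and beta are made up.
import numpy as np
from scipy.optimize import minimize

E = np.array([0.0, 1.0, 2.5])   # made-up energy levels
beta = 1.3                      # made-up coolness, beta = 1/T

def neg_objective(p):
    p = np.clip(p, 1e-12, 1.0)            # avoid log(0)
    S = -np.sum(p * np.log(p))            # entropy
    return -(S - beta * np.dot(p, E))     # minimize the negative of S - beta*E

constraint = {'type': 'eq', 'fun': lambda p: np.sum(p) - 1.0}
result = minimize(neg_objective, x0=np.ones(3) / 3,
                  bounds=[(0.0, 1.0)] * 3, constraints=[constraint])

boltzmann = np.exp(-beta * E) / np.exp(-beta * E).sum()
print(result.x)      # numerical maximizer of S - beta*E on the simplex
print(boltzmann)     # Boltzmann distribution: the two should agree closely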

But what does this mean? We could call \beta the coolness, since its inverse is the temperature, T, at least in units where Boltzmann’s constant is set to 1. So, when the temperature is positive, maximizing S - \beta E is the same as minimizing the free energy:

F = E - T S

(For negative temperatures, maximizing S - \beta E would amount to maximizing free energy.)

So, every minimum or maximum principle described so far can be seen as a special case or limiting case of the principle of maximum entropy, as long as we admit that sometimes we need to maximize entropy subject to constraints.

Why ‘limiting case’? Because the principle of least energy only shows up as the low-temperature limit, or \beta \to \infty limit, of the idea of maximizing entropy subject to a constraint on expected energy. But that’s good enough for me.

Dynamics

Now suppose things are changing as time passes, so we’re doing ‘dynamics’ instead of mere ‘statics’. In classical mechanics we can imagine a system tracing out a path \gamma(t) as time passes from one time to another, for example from t = t_0 to t = t_1. The action of this path is typically the integral of the kinetic minus potential energy:

A(\gamma) = \displaystyle{ \int_{t_0}^{t_1}  (K(t) - V(t)) \, dt }

where K(t) and V(t) depend on the path \gamma. Note that now I’m calling action A instead of the more usual S, since we’re already using S for entropy and I don’t want things to get any more confusing than necessary.

The principle of least action says that if we fix the endpoints of this path, that is the points \gamma(t_0) and \gamma(t_1), the system will follow the path that minimizes the action subject to these constraints.

Why is there a minus sign in the definition of action? How did people come up with principle of least action? How is it related to the principle of least energy in statics? These are all fascinating questions. But I have a half-written book that tackles these questions, so I won’t delve into them here:

• John Baez and Derek Wise, Lectures on Classical Mechanics.

Instead, let’s go straight to dynamics in quantum mechanics. Here Feynman proposed that instead of our system following a single definite path, it can follow any path, with an amplitude a(\gamma) for following the path \gamma. And he proposed this prescription for the amplitude:

\displaystyle{ a(\gamma) = \frac{\exp(i A(\gamma)/\hbar)}{\int  \exp(i A(\gamma)/\hbar) \, d \gamma}}

where \hbar is Planck’s constant. He also gave a heuristic argument showing that as \hbar \to 0, this prescription reduces to the principle of least action!

Unfortunately the integral over all paths—called a ‘path integral’—is hard to make rigorous except in certain special cases. And it’s a bit of a distraction for what I’m talking about now. So let’s talk more abstractly about ‘histories’ instead of paths with fixed endpoints, and consider a system whose possible ‘histories’ form a finite set, say X. Systems of this sort frequently show up as discrete approximations to continuous ones, but they also show up in other contexts, like quantum cellular automata and topological quantum field theories. Don’t worry if you don’t know what those things are. I’d just prefer to write sums instead of integrals now, to make everything easier.

Suppose the action of the history x \in X is A_x. Then Feynman’s sum over histories formulation of quantum mechanics says the amplitude of the history x is:

\displaystyle{ a_x = \frac{\exp(i A_x /\hbar)}{\sum_x  \exp(i A_x /\hbar) }}

This looks very much like the Boltzmann distribution:

\displaystyle{ p_x = \frac{\exp(-E_x/T)}{\sum_x \exp(- E_x/T)}}

Indeed, the only serious difference is that we’re taking the exponential of an imaginary quantity instead of a real one.

So far everything has been a review of very standard stuff. Now comes something weird and new—at least, new to me.

Quantropy

I’ve described statics and dynamics, and a famous analogy between them, but there are some missing items in the analogy, which would be good to fill in:

Statics                   Dynamics
statistical mechanics     quantum mechanics
probabilities             amplitudes
Boltzmann distribution    Feynman sum over histories
energy                    action
temperature               Planck’s constant times i
entropy                   ???
free energy               ???

Since the Boltzmann distribution

\displaystyle{ p_x = \frac{\exp(-E_x/T)}{\sum_x \exp(- E_x/T)}}

comes from the principle of maximum entropy, you might hope Feynman’s sum over histories formulation of quantum mechanics:

\displaystyle{ a_x = \frac{\exp(i A_x /\hbar)}{\sum_x  \exp(i A_x /\hbar) }}

comes from a maximum principle too!

Unfortunately Feynman’s sum over histories involves complex numbers, and it doesn’t make sense to maximize a complex function. However, when we say nature likes to minimize or maximize something, it often behaves like a bad freshman who applies the first derivative test and quits there: it just finds a stationary point, where the first derivative is zero. For example, in statics we have ‘stable’ equilibria, which are local minima of the energy, but also ‘unstable’ equilibria, which are still stationary points of the energy, but not local minima. This is good for us, because stationary points still make sense for complex functions.

So let’s try to derive Feynman’s prescription from some sort of ‘principle of stationary quantropy’.

Suppose we have a finite set of histories, X, and each history x \in X has a complex amplitude a_x  \in \mathbb{C}. We’ll assume these amplitudes are normalized so that

\sum_x a_x = 1

since that’s what Feynman’s normalization actually achieves. We can try to define the quantropy of a by:

\displaystyle{ Q = - \sum_x a_x \ln(a_x) }

You might fear this is ill-defined when a_x = 0, but that’s not the worst problem; in the study of entropy we typically set

0 \ln 0 = 0

and everything works fine. The worst problem is that the logarithm has different branches: we can add any multiple of 2 \pi i to our logarithm and get another equally good logarithm. For now suppose we’ve chosen a specific logarithm for each number a_x, and suppose that when we vary them they don’t go through zero, so we can smoothly change the logarithm as we move them. This should let us march ahead for now, but clearly it’s a disturbing issue which we should revisit someday.

Next, suppose each history x has an action A_x \in \mathbb{R}. Let’s seek amplitudes a_x that give a stationary point of the quantropy Q subject to a constraint on the expected action:

\displaystyle{ A = \sum_x a_x A_x }

The term ‘expected action’ is a bit odd, since the numbers a_x are amplitudes rather than probabilities. While I could try to justify it from how expected values are computed in Feynman’s formalism, I’m mainly using this term because A is analogous to the expected value of the energy, which we saw earlier. We can worry later what all this stuff really means; right now I’m just trying to push forwards with an analogy and do a calculation.

So, let’s look for a stationary point of Q subject to a constraint on A. To do this, I’d be inclined to use Lagrange multipliers and look for a stationary point of

Q - \lambda A

But there’s another constraint, too, namely

\sum_x a_x = 1

So let’s write

B = \sum_x a_x

and look for stationary points of Q subject to the constraints

A = \alpha , \qquad B = 1

To do this, the Lagrange multiplier recipe says we should find stationary points of

Q - \lambda A - \mu B

where \lambda and \mu are Lagrange multipliers. The Lagrange multiplier \lambda is really interesting. It’s analogous to ‘coolness’, \beta = 1/T, so our analogy chart suggests that

\lambda = 1/i\hbar

This says that when \lambda gets big our system becomes close to classical. So, we could call \lambda the classicality of our system. The Lagrange multiplier \mu is less interesting—or at least I haven’t thought about it much.

So, we’ll follow the usual Lagrange multiplier recipe and look for amplitudes for which

0 = \displaystyle{ \frac{\partial}{\partial a_x} \left(Q - \lambda A - \mu B \right) }

holds, along with the constraint equations. We begin by computing the derivatives we need:

\begin{array}{cclcl} \displaystyle{ \frac{\partial}{\partial a_x} Q  }  &=& - \displaystyle{ \frac{\partial}{\partial a_x} \; a_x \ln(a_x)}   &=& - \ln(a_x) - 1 \\    \\    \displaystyle{ \frac{\partial}{\partial a_x}\; A  }  &=& \displaystyle{ \frac{\partial}{\partial a_x} a_x A_x}  &=& A_x \\    \\   \displaystyle{ \frac{\partial}{\partial a_x} B  }  &=& \displaystyle{ \frac{\partial}{\partial a_x}\; a_x }  &=& 1 \end{array}

Thus, we need

0 = \displaystyle{ \frac{\partial}{\partial a_x} \left(Q - \lambda A - \mu B \right) = -\ln(a_x) - 1- \lambda A_x - \mu }

or

\displaystyle{ a_x = \frac{\exp(-\lambda A_x)}{\exp(\mu + 1)} }
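Here’s a minimal SymPy check of that last step, treating a single amplitude a as a formal variable (as we did above) and A_0, \lambda, \mu as parameters:

# Solve the stationarity condition for one amplitude a with action A_0.
import sympy as sp

a, A0, lam, mu = sp.symbols('a A_0 lambda mu')
expr = -a * sp.log(a) - lam * a * A0 - mu * a    # a-dependent part of Q - lambda*A - mu*B
stationarity = sp.Eq(sp.diff(expr, a), 0)        # -log(a) - 1 - lambda*A_0 - mu = 0
print(sp.solve(stationarity, a)[0])              # exp(-lambda*A_0 - mu - 1), as claimed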

The constraint

\sum_x a_x = 1

then forces us to choose:

\displaystyle{ \exp(\mu + 1) = \sum_x \exp(-\lambda A_x) }

so we have

\displaystyle{ a_x = \frac{\exp(-\lambda A_x)}{\sum_x \exp(-\lambda A_x)} }

Hurrah! This is precisely Feynman’s sum over histories formulation of quantum mechanics if

\lambda = 1/i\hbar

We could go further with the calculation, but this is the punchline, so I’ll stop here. I’ll just note that the final answer:

\displaystyle{ a_x = \frac{\exp(iA_x/\hbar)}{\sum_x \exp(iA_x/\hbar)} }

does two equivalent things in one blow:

• It gives a stationary point of quantropy subject to the constraints that the amplitudes sum to 1 and the expected action takes some fixed value.

• It gives a stationary point of the free action:

A - i \hbar Q

subject to the constraint that the amplitudes sum to 1.

In case the second point is puzzling, note that the ‘free action’ is the quantum analogue of ‘free energy’, E - T S. It’s also just Q - \lambda A times -i \hbar, and we already saw that finding stationary points of Q - \lambda A is another way of finding stationary points of quantropy with a constraint on the expected action.

Note also that when \hbar \to 0, free action reduces to action, so we recover the principle of least action—or at least stationary action—in classical mechanics.

Summary. We recover Feynman’s sum over histories formulation of quantum mechanics from assuming that all histories have complex amplitudes, that these amplitudes sum to one, and that the amplitudes give a stationary point of quantropy subject to a constraint on the expected action. Alternatively, we can assume the amplitudes sum to one and that they give a stationary point of free action.

That’s sort of nice! So, here’s our analogy chart, all filled in:

Statics                   Dynamics
statistical mechanics     quantum mechanics
probabilities             amplitudes
Boltzmann distribution    Feynman sum over histories
energy                    action
temperature               Planck’s constant times i
entropy                   quantropy
free energy               free action
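For concreteness, here is a tiny numerical illustration of the formula we derived, with four made-up actions and \hbar set to 1; it just confirms that the amplitudes sum to 1 and computes the resulting (complex) quantropy, using the principal branch of the logarithm:

# Toy amplitudes a_x = exp(i A_x / hbar) / sum_x exp(i A_x / hbar).
import numpy as np

hbar = 1.0
A = np.array([0.3, 1.7, 2.2, 5.0])        # hypothetical actions of four histories
w = np.exp(1j * A / hbar)
a = w / w.sum()

print(a.sum())                            # 1 (up to rounding error)
print(-(a * np.log(a)).sum())             # the quantropy Q = -sum_x a_x ln(a_x)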

Probabilities Versus Amplitudes

5 December, 2011

Here are the slides of the talk I’m giving at the CQT Annual Symposium on Wednesday afternoon, which is Tuesday morning for a lot of you. If you catch mistakes, I’d love to hear about them before then!

Probabilities versus amplitudes.

Abstract: Some ideas from quantum theory are just beginning to percolate back to classical probability theory. For example, there is a widely used and successful theory of “chemical reaction networks”, which describes the interactions of molecules in a stochastic rather than quantum way. If we look at it from the perspective of quantum theory, this turns out to involve creation and annihilation operators, coherent states and other well-known ideas—but with a few big differences. The stochastic analogue of quantum field theory is also used in population biology, and here the connection is well-known. But what does it mean to treat wolves as fermions or bosons?


A Bet Concerning Neutrinos (Part 4)

4 December, 2011

Time for another bet!

As you may know, things are getting interesting. At first, a team of physicists claimed to see neutrinos travel from Switzerland to Italy 60 nanoseconds faster than light—but the machine made pulses of neutrinos lasting longer than that, so the whole experiment was dangerously tricky.

Now, it’s making pulses as short as 3 nanoseconds, and those physicists still see them arriving 60 nanoseconds early!

• Tommaso Dorigo, OPERA confirms: neutrinos travel faster than light!!, 17 November 2011.

This seems to have emboldened Curtis Faith, who wrote:

Care to take on another bet John?

I’d be willing to bet you:

• Two days of my time doing anything you choose within my areas of expertise.

against:

• Two days of your time doing anything I choose within your areas of expertise.

For simplicity, we could assume the same criteria for determining the winner as your existing bet.

I replied:

Hi, Curtis! In principle I’m willing to take that bet… but I’m just curious, what sort of things might you want me to do? What sort of things might I want you to do?

I don’t want you to say I need to dig a ditch in your backyard. I don’t mind digging ditches too much, but I don’t really want to fly over to your place to do it, especially since if I lose my bet to Frederik I may be allowed just one round-trip flight for a year.

Seriously, I’m actually curious about what I can do, that you would value, that I’m not already doing.

Curtis replied:

John,

It would be really mean of me to take your only flight. Besides, it seems to me that your expertise is probably not ditch digging.

I was thinking that I’d probably come over to Singapore (or Riverside) with my wife and spend the two days learning from you how to take some ideas that I’ve been working on forward. Getting advice for what sort of math I’d need to develop the ideas, where to learn more, prior work that might be relevant that I don’t know about that you might, etc. I’m guessing it would be the sort of discussions you have with graduate students but with someone who doesn’t have the same level of math skills (yet).

Since this is a very mild penalty, I said okay. We just need to write up an official contract here.

Okay: it’s your move, Curtis.


Quantum Theory Talks in Asia and Australia

30 November, 2011

Next Wednesday the Centre for Quantum Technologies is putting on a show:

The Famous, the Bit and the Quantum, CQT Annual Symposium 2011, Centre for Quantum Technologies, Singapore, 7 December 2011.

There will be three talks. Since I’m not famous, I must be either the ‘bit’ or the ‘quantum’. (Seriously, I have no idea what the title of this workshop means.)

• 3 pm. Immanuel Bloch (Max-Planck-Institut für Quantenoptik): Controlling and exploring quantum gases at the single atom level.

Abstract: Over the past years, ultracold quantum gases in optical lattices have offered remarkable opportunities to investigate static and dynamic properties of strongly correlated bosonic or fermionic quantum many-body systems. In this talk I will show how it has recently not only become possible to image such quantum gases with single atom sensitivity and single site resolution, but also how it is now possible to coherently control single atoms on individual lattice sites, how one can measure hidden order parameters and how one can follow the propagation of entangled quasiparticles in a many-body setting. In addition I will present recent results on the generation of strong effective magnetic fields for ultracold atoms in optical lattices, which has opened a new avenue for realizing fractional quantum Hall like states with atomic gases.

• 4.30 pm. Harry Buhrman (Centrum Wiskunde & Informatica & University of Amsterdam): Position-based cryptography.

Abstract: Position-based cryptography uses the geographic position of a party as its sole credential. Normally digital keys or biometric features are used. A central building block in position-based cryptography is that of position-verification. The goal is to prove to a set of verifiers that one is at a certain geographical location. Protocols typically assume that messages cannot travel faster than the speed of light. By responding to a verifier in a timely manner, one can guarantee that one is within a certain distance of that verifier. Quite recently it was shown that position-verification protocols based only on this relativistic principle can be broken by two attackers who simulate being at the claimed position while physically residing elsewhere in space. Because of the no-cloning property of quantum information (qubits), it was believed that with the use of quantum messages one could devise protocols that were resistant to such collaborative attacks. Several schemes were proposed that later turned out to be insecure. Finally it was shown that also in the quantum case no unconditionally secure scheme is possible. We will review the field of position-based quantum cryptography and highlight some of the research currently going on in order to develop, using reasonable assumptions on the capabilities of the attackers, protocols that are secure in practice.

• 6 pm. John Baez (U.C. Riverside & CQT): Probabilities versus amplitudes.

Abstract: Some ideas from quantum theory are just beginning to percolate back to classical probability theory. For example, there is a widely used and successful theory of “chemical reaction networks”, which describes the interactions of molecules in a stochastic rather than quantum way. If we look at it from the perspective of quantum theory, this turns out to involve creation and annihilation operators, coherent states and other well-known ideas—but with a few big differences. The stochastic analogue of quantum field theory is also used in population biology, and here the connection is well-known. But what does it mean to treat wolves as fermions or bosons?

People who have been following my network theory course will know this stuff already. I’ll give a more detailed mini-course on network theory here:

Expository Quantum Lecture Series 5 (EQuaLS5), Institute for Mathematical Research (INSPEM), Universiti Putra Malaysia, Malaysia, 9-13 January 2012.

This looks like fun because a number of people will be giving such mini-courses:

• Do Ngoc Diep (Inst of Math, Hanoi): A procedure for quantization of fields.

• Maurice de Gosson (Univ. of Vienna): The symplectic camel and quantum mechanics.

• Fredrik Stroemberg (Technical Univ. of Darmstadt): Arithmetic quantum chaos.

• S. Twareque Ali (Concordia University, Montreal): Coherent states: theory and applications.

The ‘symplectic camel’, in case you’re wondering, is an allusion to Mikhail Gromov’s result on classical mechanics limiting our ability to squeeze a region of phase space into a long and skinny shape. It’s like trying to squeeze a camel through the eye of a needle!

Later, I’ll give a version of my talk ‘Probabilities versus amplitudes’ at this workshop:

Coogee ’12: Sydney Quantum Information Theory Workshop, Coogee Bay Hotel, Australia, 30 January – 2 February 2012.

The workshop will focus on quantum computation with spin lattices, quantum memory, and quantum error correction. The speakers include:

• Sean Barrett (Imperial College London, UK)
• Hector Bombin (Perimeter Institute, Canada)
• Andrew Doherty (Sydney, Australia)
• Guillaume Duclos-Cianci (Sherbrooke, Canada)
• Steve Flammia (Caltech/Washington, USA)
• Jeongwan Haah (Caltech, USA)
• Robert Koenig (IBM, USA)
• Tobias Osborne (Leibniz Universitat Hannover, Germany)
• David Poulin (Sherbrooke, Canada)
• Jiannis Pachos (Leeds, UK)
• Norbert Schuch (Caltech, USA)
• Frank Verstraete (Vienna, Austria)
• Guifre Vidal (Perimeter Institute, Canada)

I’d better stop blogging and get my talk ready!


Liquid Light

28 November, 2011

Elisabeth Giacobino works at the Ecole Normale Supérieure in Paris. Last week she gave a talk at the Centre for Quantum Technologies. It was about ‘polariton condensates’. You can see a video of her talk here.

What’s a polariton? It’s a strange particle: a blend of matter and light. Polaritons are mostly made of light… with just enough matter mixed in so they can form a liquid! This liquid can form eddies just like water. Giacobino and her team of scientists have actually gotten pictures:

Physicists call this liquid a ‘polariton condensate’, but normal people may better appreciate how wonderful it is if we call it liquid light. That’s not 100% accurate, but it’s close—you’ll see what I mean in a minute.

Here’s a picture of Elisabeth Giacobino (at right) and her coworkers in 2010—not exactly the same team who is working on liquid light, but the best I can find:

How to make liquid light

How do you make liquid light?

First, take a thin film of some semiconductor like gallium arsenide. It’s full of electrons roaming around, so imagine a sea of electrons, like water. If you knock out an electron with enough energy, you’ll get a ‘hole’ which can move around like a particle of its own. Yes, the absence of a thing can act like a thing. Imagine an air bubble in the sea.

All this so far is standard stuff. But now for something more tricky: if you knock an electron just a little, it won’t go far from the hole it left behind. They’ll be attracted to each other, so they’ll orbit each other!

What you’ve got now is like a hydrogen atom—but instead of an electron and a proton, it’s made from an electron and a hole! It’s called an exciton. In Giacobino’s experiments, the excitons are 200 times as big as hydrogen atoms.

Excitons are exciting, but not exciting enough for us. So next, put a mirror on each side of your thin film. Now light can bounce back and forth. The light will interact with the excitons. If you do it right, this lets a particle of light—called a photon—blend with an exciton and form a new particle called a polariton.

How does a photon ‘blend’ with an exciton? Umm, err… this involves quantum mechanics. In quantum mechanics you can take two possible situations and add them and get a new one, a kind of ‘blend’ called a ‘superposition’. ‘Schrödinger’s cat’ is what you get when you blend a live cat and a dead cat. People like to argue about why we don’t see half-live, half-dead cats. But never mind: we can see a blend of a photon and an exciton! Giacobino and her coworkers have done just that.

The polaritons they create are mostly light, with just a teeny bit of exciton blended in. Photons have no mass at all. So, perhaps it’s not surprising that their polaritons have a very small mass: about 10⁻⁵ times as heavy as an electron!

They don’t last very long: just about 4–10 picoseconds. A picosecond is a trillionth of a second, or 10⁻¹² seconds. After that they fall apart. However, this is long enough for polaritons to do lots of interesting things.

For starters, polaritons interact with each other enough to form a liquid. But it’s not just any ordinary liquid: it’s often a superfluid, like very cold liquid helium. This means among other things, that it has almost no viscosity.

So: it’s even better than liquid light: it’s superfluid light!

The flow of liquid light

What can you do with liquid light?

For starters, you can watch it flow around obstacles. Semiconductors have ‘defects’—little flaws in the crystal structure. These act as obstacles to the flow of polaritons. And Giacobino and her team have seen the flow of polaritons around defects in the semiconductor:

The two pictures at left are two views of the polariton condensate flowing smoothly around a defect. In these pictures the condensate is a superfluid.

The two pictures in the middle show a different situation. Here the polariton condensate is viscous enough so that it forms a trail of eddies as it flows past the defect. Yes, eddies of light!

And the two pictures at right show yet another situation. In every fluid, we can have waves of pressure. This is called… ‘sound’. Yes, this is how ordinary sound works in air, or under water. But we can also have sound in a polariton condensate!

That’s pretty cool: sound in liquid light! But wait. We haven’t gotten to the really cool part yet. Whenever you have a fluid moving past an obstacle faster than the speed of sound, you get a ‘shock wave’: the obstacle leaves an expanding trail of sound in its wake, behind it, because the sound can’t catch up. That’s why jets flying faster than sound leave a sonic boom behind them.

And that’s what you’re seeing in the pictures at right. The polariton condensate is flowing past the defect faster than the speed of sound, which happens to be around 850,000 meters per second in this experiment. We’re seeing the shock wave it makes. So, we’re seeing a sonic boom in liquid light!

It’s possible we’ll be able to use polariton condensates for interesting new technologies. Giacobino and her team are also considering using them to study Hawking radiation: the feeble glow that black holes emit according to Hawking’s predictions. There aren’t black holes in polariton condensates, but it may be possible to create a similar kind of radiation. That would be really cool!

But to me, just being able to make a liquid consisting mostly of light, and study its properties, is already a triumph: just for the beauty of it.

Scary technical details

All the pictures of polariton condensates flowing around a defect came from here:

• A. Amo, S. Pigeon, D. Sanvitto, V. G. Sala, R. Hivet, I. Carusotto, F. Pisanello, G. Lemenager, R. Houdre, E. Giacobino, C. Ciuti, and A. Bramati, Hydrodynamic solitons in polariton superfluids.

and this is the paper to read for more details.

I tried to be comprehensible to ordinary folks, but there are a few more things I can’t resist saying.

First, there are actually many different kinds of polaritons. In general, polaritons are quasiparticles formed by the interaction of photons and matter. For example, in some crystals sound acts like it’s made of particles, and these quasiparticles are called ‘phonons’. But sometimes phonons can interact with light to form quasiparticles—and these are called ‘phonon-polaritons’. I’ve only been talking about ‘exciton-polaritons’.

If you know a bit about superfluids, you may be interested to hear that the wavy patterns show the phase of the order parameter ψ in the Landau-Ginzburg theory of superfluids:

If you know about quantum field theory, you may be interested to know that the Hamiltonian describing photon-exciton interactions involves terms roughly like

\alpha a^\dagger a + \beta b^\dagger b + \gamma (a^\dagger b + b^\dagger a)

where a is the annihilation operator for photons, b is the annihilation operator for excitons, the Greek letters are various constants, and the third term describes the interaction of photons and excitons. We can simplify this Hamiltonian by defining new particles that are linear combinations of photons and excitons. It’s just like diagonalizing a matrix; we get something like

\delta c^\dagger c + \epsilon d^\dagger d

where c and d are certain linear combinations of a and b. These act as annihilation operators for our new particles… and one of these new particles is the very light ‘polariton’ I’ve been talking about!
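Here’s a toy numerical sketch of that diagonalization step, with made-up constants in arbitrary units: the eigenvectors of the 2×2 photon–exciton coupling matrix are the polariton modes, and the eigenvalues are their energies.

# Diagonalize a toy photon-exciton coupling matrix to get polariton modes.
import numpy as np

alpha, beta, gamma = 1.50, 1.48, 0.01     # hypothetical photon energy, exciton energy, coupling
M = np.array([[alpha, gamma],
              [gamma, beta]])

energies, modes = np.linalg.eigh(M)       # the deltas/epsilons and the new particles c, d
print(energies)                           # lower and upper polariton energies
print(modes)                              # each column gives the photon/exciton mixture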

