How to Cut Carbon Emissions and Save Money

27 January, 2012

McKinsey & Company is a management consulting firm. In 2010 they released this ‘carbon abatement cost curve’ for the whole world:

So, they’re claiming:

By 2030 we can cut CO2 emissions about 15 gigatonnes per year while saving lots of money.

By 2030 we can cut CO2 emissions by up to 37 gigatonnes per year before the total cost (that is, cost minus savings) becomes positive.

The graph is cute. The vertical axis of the graph says how many euros per tonne it would cost to cut CO2 emissions by 2030 using various measures. The horizontal axis says how many gigatonnes per year we could reduce CO2 emissions using these measures.

So, we get lots of blue rectangles. If a rectangle is below the horizontal axis, its area says how many euros per year we’d save by implementing that measure. If it’s above the axis, its area says how much that measure would cost.

I believe the total blue area below the axis equals the total blue area above the axis. So if we do all these things, the total cost is zero.
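To make the area bookkeeping concrete, here is a minimal sketch of how such a curve adds up. The measure names and numbers below are invented for illustration and are not McKinsey's data; the function names are also just assumptions. A rectangle of width 1 gigatonne per year and height 1 euro per tonne has an area of 1 billion euros per year.

```python
# Each measure: (name, abatement in Gt CO2 per year, cost in euros per tonne).
# Hypothetical numbers, chosen only to illustrate the arithmetic.
measures = [
    ("carbon capture", 2.5, 40.0),
    ("efficient lighting", 2.0, -60.0),
    ("reforestation", 4.0, 10.0),
    ("building insulation", 3.0, -20.0),
]

def rectangle_area(width_gt, cost_per_tonne):
    """Signed area in billions of euros per year (1 Gt = 1e9 tonnes)."""
    return width_gt * cost_per_tonne

def cumulative_curve(measures):
    """Sort measures from cheapest to most expensive and accumulate totals."""
    ordered = sorted(measures, key=lambda m: m[2])
    total_abatement = 0.0
    total_cost = 0.0
    rows = []
    for name, width, cost in ordered:
        total_abatement += width
        total_cost += rectangle_area(width, cost)
        rows.append((name, total_abatement, total_cost))
    return rows

for name, abatement, cost in cumulative_curve(measures):
    print(f"{name:20s} {abatement:5.1f} Gt/yr  net cost {cost:8.1f} bn EUR/yr")
```

The running net cost stays negative as long as the savings from the rectangles below the axis outweigh the costs of those above it. The point where it first reaches zero plays the role of the 37 gigatonne figure in the real curve.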

37 gigatonnes of CO2 is roughly 10 gigatonnes of carbon: remember, there’s a crucial factor of 3\frac{2}{3} here. In 2004, Pacala and Socolow argued that the world needs to find ways to cut carbon emissions by about 7 gigatonnes/year by 2054 to keep emissions flat until this time. By now we’d need 9 gigatonnes/year.

If so, it seems the measures shown here could keep carbon emissions flat worldwide at no net cost!

But as usual, there are at least a few problems.

Problem 1

Is McKinsey’s analysis correct? I don’t know. Here’s their report, along with some others:

• McKinsey & Company, Impact of the financial crisis on carbon economics: Version 2.1 of the global greenhouse gas abatement cost curve, 2010.

For more details it’s good to read version 2.0:

• McKinsey & Company, Pathways to a low carbon economy: Version 2 of the global greenhouse gas abatement cost curve, 2009.

They’re free if you fill out some forms. But it’s not easy to check these things. Does anyone know papers that try to check McKinsey’s work? I find it’s more fun to study a problem like this after you see two sides of the same story.

Problem 2

I said ‘no net cost’. But if you need to spend a lot of money, the fact that I’m saving a lot doesn’t compensate you. So there’s the nontrivial problem of taking money that’s saved on some measures and making sure it gets spent on others. Here’s where ‘big government’ might be required—which makes some people decide global warming is just a political conspiracy, nyeh-heh-heh.

Is there another way to make the money transfer happen, without top-down authority?

We could still get the job about half-done at a huge savings, of course. McKinsey says we could cut CO2 emissions by 15 gigatonnes per year doing things that only save money. That’s about 4 gigatonnes of carbon per year! We could at least do that.
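The conversions between gigatonnes of CO2 and gigatonnes of carbon used in this post can be checked with a few lines. This is only a sketch: it assumes the approximate atomic masses 12 for carbon and 16 for oxygen, which give the factor 44/12 = 3 2/3, and the function name is made up for illustration.

```python
# Approximate atomic masses (assumed): carbon 12, oxygen 16, so CO2 is 44.
CO2_PER_CARBON = 44.0 / 12.0   # the "crucial factor" of 3 2/3

def co2_to_carbon(gt_co2):
    """Convert gigatonnes of CO2 to gigatonnes of carbon."""
    return gt_co2 / CO2_PER_CARBON

for gt_co2 in (37.0, 15.0):
    print(f"{gt_co2:.0f} Gt CO2 is about {co2_to_carbon(gt_co2):.1f} Gt carbon")
```

So 37 gigatonnes of CO2 is roughly 10 gigatonnes of carbon, and 15 gigatonnes of CO2 is roughly 4.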

Problem 3

Keeping carbon emissions flat is not enough. Carbon dioxide, once put in the atmosphere, stays there a long time—though individual molecules come and go. As the saying goes, carbon is forever. (Click that link for more precise information.)

So, even Pacala and Socolow say keeping carbon emissions flat is a mere stopgap before we actually reduce carbon emissions, starting in 2054. But some more recent papers seem to suggest Pacala and Socolow were being overly optimistic.

Of course it depends on how much global warming you’re willing to tolerate! It also depends on lots of other things.

Anyway, this paper claims that if we cut global greenhouse gas emissions in half by 2050 (as compared to what they were in 1990), there’s a 12–45% probability that the world will get at least 2 °C warmer than its temperature before the industrial revolution:

• Malte Meinshausen et al, Greenhouse-gas emission targets for limiting global warming to 2 °C, Nature 458 (2009), 1158–1163.

Abstract: More than 100 countries have adopted a global warming limit of 2 °C or below (relative to pre-industrial levels) as a guiding principle for mitigation efforts to reduce climate change risks, impacts and damages. However, the greenhouse gas (GHG) emissions corresponding to a specified maximum warming are poorly known owing to uncertainties in the carbon cycle and the climate response. Here we provide a comprehensive probabilistic analysis aimed at quantifying GHG emission budgets for the 2000–50 period that would limit warming throughout the twenty-first century to below 2 °C, based on a combination of published distributions of climate system properties and observational constraints. We show that, for the chosen class of emission scenarios, both cumulative emissions up to 2050 and emission levels in 2050 are robust indicators of the probability that twenty-first century warming will not exceed 2 °C relative to pre-industrial temperatures.

Limiting cumulative CO2 emissions over 2000–50 to 1,000 Gt CO2 yields a 25% probability of warming exceeding 2 °C—and a limit of 1,440 Gt CO2 yields a 50% probability—given a representative estimate of the distribution of climate system properties. As known 2000–06 CO2 emissions were 234 Gt CO2, less than half the proven economically recoverable oil, gas and coal reserves can still be emitted up to 2050 to achieve such a goal. Recent G8 Communiques envisage halved global GHG emissions by 2050, for which we estimate a 12–45% probability of exceeding 2 °C—assuming 1990 as emission base year and a range of published climate sensitivity distributions. Emissions levels in 2020 are a less robust indicator, but for the scenarios considered, the probability of exceeding 2 °C rises to 53–87% if global GHG emissions are still more than 25% above 2000 levels in 2020.

This paper says we’re basically doomed to suffer unless we revamp society:

• Ted Trainer, Can renewables etc. solve the greenhouse problem? The negative case, Energy Policy 38 (2010), 4107–4114.

Abstract: Virtually all current discussion of climate change and energy problems proceeds on the assumption that technical solutions are possible within basically affluent-consumer societies. There is however a substantial case that this assumption is mistaken. This case derives from a consideration of the scale of the tasks and of the limits of non-carbon energy sources, focusing especially on the need for redundant capacity in winter. The first line of argument is to do with the extremely high capital cost of the supply system that would be required, and the second is to do with the problems set by the intermittency of renewable sources. It is concluded that the general climate change and energy problem cannot be solved without large scale reductions in rates of economic production and consumption, and therefore without transition to fundamentally different social structures and systems.

It’s worth reading because it uses actual numbers, not just hand-waving. But it seeks much more than keeping carbon emissions flat until 2050; that’s one reason for the dire conclusions.

It’s worth noting this rebuttal, which says that everything about Trainer’s paper is fine except a premature dismissal of nuclear power:

• Barry Brook, Could nuclear fission energy, etc., solve the greenhouse problem? The affirmative case, Energy Policy, available online 16 December 2011.

To get your hands on Brook’s paper you either need a subscription or you need to email him. You can do that starting from his blog article about the paper… which is definitely worth reading:

• Barry Brook, Could nuclear fission energy, etc., solve the greenhouse problem? The affirmative case, BraveNewClimate, 14 January 2012.

According to Brook, we can keep global warming from getting too bad if we get really serious about nuclear power.

Of course, these three papers are just a few of many. I’m still trying to sift through the information and figure out what’s really going on. It’s hard. It may be impossible. But McKinsey’s list of ways to cut carbon emissions and save money points to some things we can start doing right now.


Ban Elsevier

26 January, 2012

Please take the pledge not to do business with Elsevier. 404 scientists have done it so far:

The cost of knowledge.

You can separately say you

1) won’t publish with them,
2) won’t referee for them, and/or
3) won’t do editorial work for them.

At least do number 2): how often can you do something good by doing less work? When a huge corporation relies so heavily on nasty monopolistic practices and unpaid volunteer labor, they leave themselves open to this.

This pledge website is the brainchild of Tim Gowers, a Fields medalist and prominent math blogger:

• Tim Gowers, Elsevier: my part in its downfall and http://thecostofknowledge.com.

In case you’re not familiar with the Elsevier problem, here’s something excerpted from my website. This does not yet mention Elsevier’s recent support of the Research Works Act, which would try to roll back the US government’s requirement that taxpayer-funded medical research be made freely available online. Nor does it mention the fake medical journals created by Elsevier, where what looked like peer-reviewed papers were secretly advertisements paid for by drug companies! Nor does it mention the Chaos, Solitons and Fractals fiasco. Indeed, it’s hard keeping up with Elsevier’s dirty deeds!

The problem and the solutions

The problem of highly priced science journals is well-known. A wave of mergers in the publishing business has created giant firms with the power to extract ever higher journal prices from university libraries. As a result, libraries are continually being forced to cough up more money or cut their journal subscriptions. It’s really become a crisis.

Luckily, there are also two counter-trends at work. In mathematics and physics, more and more papers are available from a free electronic database called the arXiv, and journals are beginning to let papers stay on this database even after they are published. In the life sciences, PubMed Central plays a similar role.

There are also a growing number of free journals. Many of these are peer-reviewed, and most are run by academics instead of large corporations.

The situation is worst in biology and medicine: the extremely profitable spinoffs of research in these subjects have made it easy for journals to charge outrageous prices and limit the free nature of discourse. A non-profit organization called the Public Library of Science was formed to fight this, and circulated an open letter calling on publishers to adopt reasonable policies. 30,000 scientists signed this and pledged to:

publish in, edit or review for, and personally subscribe to only those scholarly and scientific journals that have agreed to grant unrestricted free distribution rights to any and all original research reports that they have published, through PubMed Central and similar online public resources, within 6 months of their initial publication date.

Unsurprisingly, the response from publishers was chilly. As a result, the Public Library of Science started its own free journals in biology and medicine, with the help of a 9 million dollar grant from the Gordon and Betty Moore Foundation.

A number of other organizations are also pushing for free access to scholarly journals, such as Create Change, the Scholarly Publishing and Academic Resources Coalition, and the Budapest Open Access Initiative, funded by George Soros.

Editorial boards are beginning to wise up, too. On August 10, 2006, all the editors of the math journal Topology resigned to protest the outrageous prices of the publisher, Reed Elsevier. In August of this year, the editorial board of the Springer journal K-Theory followed suit. The École Normale Supérieure has also stopped having Elsevier publish the journal Annales Scientifiques de l’École Normale Supérieure.

So, we may just win this war! But only if we all do our part.

What we can do

What can we do to keep academic discourse freely available to all? Here are some things:

1. Don’t publish in overpriced journals.

2. Don’t do free work for overpriced journals (like refereeing and editing).

3. Put your articles on the arXiv or a similar site before publishing them.

4. Only publish in journals that let you keep your articles on the arXiv or a similar site.

5. Support free journals by publishing in them, refereeing for them, editing them… even starting your own!

6. Help make sure free journals and the arXiv stay free.

7. Help start a system of independent ‘referee boards’ for arXiv papers. These can referee papers and help hiring, tenure and promotion committees to assess the worth of papers, eliminating the last remaining reasons for the existence of traditional for-profit journals.

The nice thing is that most of these are easy to do! Only items 5 through 7 require serious work. As for item 4, a lot of math and physics journals not only let you keep your article on the arXiv, but let you submit it by telling them its arXiv number! In math it’s easy to find these journals, because there’s a public list of them.

Of course, you should read the copyright agreement that you’ll be forced to sign before submitting to a journal or publishing a book. Check to see if you can keep your work on the arXiv, on your own website, etcetera. You can pretty much assume that any rights you don’t explicitly keep, your publisher will get. Eric Weisstein didn’t do this, and look what happened to him: he got sued and spent over a year in legal hell!

Luckily it’s not hard to read these copyright agreements: you can get them off the web. An extensive list is available from Sherpa, an organization devoted to free electronic archives.

If you think maybe you want to start your own journal, or move an existing journal to a cheaper publisher, read Joan Birman’s article about this. Go to the Create Change website and learn what other people are doing. Also check out SPARC—the Scholarly Publishing and Academic Resources Coalition. They can help. And try the Budapest Open Access Initiative—they give out grants.

You can also support the Public Library of Science or join the Open Archives Initiative.

Also: if you like mathematics, tell your librarian about Mathematical Sciences Publishers, a nonprofit organization run by mathematicians for the purpose of publishing low-cost, high-quality math journals.

Which journals are overpriced?

In 1997 Robion Kirby urged mathematicians not to submit papers to, nor edit for, nor referee for overpriced journals. I think this suggestion is great, and it applies not just to mathematics but all disciplines. There is really no good reason for us to donate our work to profit-making corporations who sell it back to us at exorbitant prices! Indeed in climate science this has a terrible effect: crackpot bloggers distribute their misinformation free of charge, while lots of important serious climate science papers are hidden, available only to people who work at institutions with expensive subscriptions.

But how can you tell if a journal is overpriced? In mathematics, up-to-date information on the rise of journal prices is available from the American Mathematical Society. They even include an Excel spreadsheet that lets you do your own calculations with this data! Some of this information is nicely summarized on a webpage by Ulf Rehmann. Using these tools you can make up your own mind which journals are too expensive to be worth supporting with your free volunteer labor.

What about other subjects? I don’t know. Maybe you do?

When I first learned how bad the situation was, I started by boycotting all journals published by Reed Elsevier. This juggernaut was formed by merger of Reed Publishing and Elsevier Scientific Press in 1993. In August 2001 it bought Harcourt Press—which in turn owned Academic Press, which ran a journal I helped edit, Advances in Mathematics. I don’t work for that journal anymore! The reason is that Reed Elsevier is a particularly bad culprit when it comes to charging high prices. You can see this from the above lists of journal prices, and you can also see it in the business news. In 2002, Forbes magazine wrote:

If you are not a scientist or a lawyer, you might never guess which company is one of the world’s biggest in online revenue. Ebay will haul in only $1 billion this year. Amazon has $3.5 billion in revenue but is still, famously, losing money. Outperforming them both is Reed Elsevier, the London-based publishing company. Of its $8 billion in likely sales this year, $1.5 billion will come from online delivery of data, and its operating margin on the internet is a fabulous 22%.

Credit this accomplishment to two things. One is that Reed primarily sells not advertising or entertainment but the dry data used by lawyers, doctors, nurses, scientists and teachers. The other is its newfound marketing hustle: Its CEO since 1999 has been Crispin Davis, formerly a soap salesman.

But Davis will have to keep hustling to stay out of trouble. Reed Elsevier has fat margins and high prices in a business based on information—a commodity, and one that is cheaper than ever in the internet era. New technologies and increasingly universal access to free information make it vulnerable to attack from below. Today pirated music downloaded from the web ravages corporate profits in the music industry. Tomorrow could be the publishing industry’s turn.

Some customers accuse Reed Elsevier of price gouging. Daniel DeVito, a patent lawyer with Skadden, Arps, Slate, Meagher & Flom, is a fan of Reed’s legal-search service, but he himself does free science searches on the Google site before paying for something like Reed’s ScienceDirect—and often finds what he’s looking for at no cost. Reed can ill afford to rest.

Why should we slave away unpaid to keep Crispin Davis and his ilk rolling in dough? There’s really no good reason.

Sneaky tricks

To fight against the free journals and the arXiv, publishing companies are playing sneaky tricks like these:

Proprietary Preprint Archives. Examples included ChemWeb and something they called "The Mathematics Preprint Server". The latter was especially devious, because mathematicians used to call the arXiv "the mathematics preprint server".

However, the Mathematics Preprint Server didn’t fool many smart people, so lots of the papers they got were crap, like a supposed proof of Goldbach’s conjecture, and a claim that the rotation of a galactic supercluster is due to a "topological defect" in spacetime. Eventually Elsevier gave up and stopped accepting new papers on their preprint server. Now it’s a laughable shadow of its former self. Similarly, ChemWeb was sold off.

Web Spamming. More recently, publishers have tried a new trick: “web spamming”, also known as “search engine spamming” or “cloaking”. The company gives search engine crawlers access to full-text articles — but when you try to read these articles, you get a "doorway page" demanding a subscription or payment. Sometimes you’ll even be taken to a page that has nothing to do with the paper you thought you were about to see!

Culprits include Springer, Reed Elsevier, and the Institute of Electrical and Electronics Engineers. The last one seems to have quit, but check out their PowerPoint presentation on this subject, courtesy of Carl Willis.

If you see pages like this, report them to Google or your favorite search engine.

Journal Bundling. Worse still is the strategy of "bundling" subscriptions into huge all-or-nothing packages, so libraries can’t save money by ceasing to subscribe to a single journal. It’s a clever trap, especially because these bundled subscriptions look like a good deal at first. The cost becomes apparent only later. Now university libraries are being bankrupted as the prices of these bundles keep soaring. The library of my own university, U.C. Riverside, barely has money for any books anymore!

Luckily, people are catching on. In 2003, Cornell University bravely dropped their subscription to 930 Elsevier journals. Four North Carolina universities have joined the revolt, and the University of California has also been battling Elsevier. For other actions universities have taken, read Peter Suber’s list.

Legal bullying. Large corporations like to scare people by means of threats of legal action backed up by deep pockets. A classic example is the lawsuit launched by Gordon and Breach against the American Physical Society for publishing lists of journal prices. Luckily they lost this suit.

Hiring a Dr. Evil lookalike as their PR consultant.



I, Robot

24 January, 2012

On 13 February 2012, I will give a talk at Google in the form of a robot. I will look like this:


My talk will be about “Energy, the Environment and What We Can Do.” Since I think we should cut unnecessary travel, I decided to stay here in Singapore and use a telepresence robot instead of flying to California.

I thank Mike Stay for arranging this at Google, and I especially thank Trevor Blackwell and everyone else at Anybots for letting me use one of their robots!

I believe Google will film this event and make a video available. But I hope reporters attend, because it should be fun, and I plan to describe some ways we can slash carbon emissions.

More detail: I will give this talk at 4 pm Monday, February 13, 2012 in the Paramaribo Room on the Google campus (Building 42, Floor 2). Visitors and reporters are invited, but they need to check in at the main visitor’s lounge in Building 43, and they’ll need to be escorted to and from the talk, so someone will pick them up 10 or 15 minutes before the talk starts.

Energy, the Environment and What We Can Do

Abstract: Our heavy reliance on fossil fuels is causing two serious problems: global warming, and the decline of cheaply available oil reserves. Unfortunately the second problem will not cancel out the first. Each one individually seems extremely hard to solve, and taken together they demand a major worldwide effort starting now. After an overview of these problems, we turn to the question: what can we do about them?

I also need help from all of you reading this! I want to talk about solutions, not just problems—and given my audience, and the political deadlock in the US, I especially want to talk about innovative solutions that come from individuals and companies, not governments.

Can changing whole systems produce massive cuts in carbon emissions, in a way that spreads virally rather than being imposed through top-down directives? It’s possible. Curtis Faith has some inspiring thoughts on this:

I’ve been looking at various transportation and energy and environment issues for more than 5 years, and almost no one gets the idea that we can radically reduce consumption if we look at the complete systems. In economic terms, we currently have a suboptimal Nash Equilibrium with a diminishing pie when an optimal expanding pie equilibrium is possible. Just tossing around ideas at a very high level with back of the envelope estimates we can get orders of magnitude improvements with systemic changes that will make people’s lives better if we can loosen up the grip of the big corporations and government.

To borrow a physics analogy, the Nash Equilibrium is a bit like a multi-dimensional metastable state where the system is locked into a high energy configuration and any local attempts to make the change revert to the higher energy configuration locally, so it would require sufficient energy or energy in exactly the right form to move all the different metastable states off their equilibrium either simultaneously or in a cascade.

Ideally, we find the right set of systemic economic changes that can have a cascade effect, so that they are locally systemically optimal and can compete more effectively within the larger system where the Nash Equilibrium dominates. I hope I haven’t mixed up too many terms from too many fields and confused things. These terms all have overlapping and sometimes very different meaning in the different contexts as I’m sure is true even within math and science.

One great example is transportation. We assume we need electric cars or biofuel or some such thing. But the very assumption that a car is necessary is flawed. Why do people want cars? Give them a better alternative and they’ll stop wanting cars. Now, what might that be? Public transportation? No. All the money spent building a 2,000 kg vehicle to accelerate and decelerate a few hundred kg and then to replace that vehicle on a regular basis can be saved if we eliminate the need for cars.

The best alternative to cars is walking, or walking on inclined pathways up and down so we get exercise. Why don’t people walk? Not because they don’t want to but because our cities and towns have optimized for cars. Create walkable neighborhoods and give people jobs near their home and you eliminate the need for cars. I live in Savannah, GA in a very tiny place. I never use the car. Perhaps 5 miles a week. And even that wouldn’t be necessary with the right supplemental business structures to provide services more efficiently.

Or electricity for A/C. Everyone lives isolated in structures that are very inefficient to heat. Large community structures could be air conditioned naturally using various techniques and that could cut electricity demand by 50% for neighborhoods. Shade trees are better than insulation.

Or how about moving virtually entire cities to cooler climates during the hot months? That is what people used to do. Take a train North for the summer. If the destinations are low-resource destinations, this can be a huge reduction for the city. Again, getting to this state is hard without changing a lot of parts together.

These problems are not technical, or political, they are economic. We need the economic systems that support these alternatives. People want them. We’ll all be happier and use far less resources (and money). The economic system needs to be changed, and that isn’t going to happen with politics, it will happen with economic innovation. We tend to think of our current models as the way things are, but they aren’t. Most of the status quo is comprised of human inventions, money, fractional reserve banking, corporations, etc. They all brought specific improvements that made them more effective at the time they were introduced because of the conditions during those times. Our times too are different. Some new models will work much better for solving our current problems.

Your idea really starts to address the reason why people fly unnecessarily. This change in perspective is important. What if we went back to sailing ships? And instead of flying we took long leisurely educational seminar cruises on modern versions of sail yachts? What if we improved our trains? But we need to start from scratch and design new systems so they work together effectively. Why are we stuck with models of cities based on the 19th-century norms?

We aren’t, but too many people think we are because the scope of their job or academic career is just the piece of a system, not the system itself.

System level design thinking is the key to making the difference we need. Changes to the complete systems can have order of magnitude improvements. Changes to the parts will have us fighting for tens of percentages.

Do you know good references on ideas like this—preferably with actual numbers? I’ve done some research, but I feel I must be missing a lot of things.

This book, for example, is interesting:

• Michael Peters, Shane Fudge and Tim Jackson, editors, Low Carbon Communities: Imaginative Approaches to Combating Climate Change Locally, Edward Elgar Publishing Group, Cheltenham, UK, 2010.

but I wish it had more numbers on how much carbon emissions were cut by some of the projects they describe: Energy Conscious Households in Action, the HadLOW CARBON Community, the Transition Network, and so on.


Classical Mechanics versus Thermodynamics (Part 2)

23 January, 2012

I showed you last time that in many branches of physics—including classical mechanics and thermodynamics—we can see our task as minimizing or maximizing some function. Today I want to show how we get from that task to symplectic geometry.

So, suppose we have a smooth function

S: Q \to \mathbb{R}

where Q is some manifold. A minimum or maximum of S can only occur at a point where

d S = 0

Here d S is the differential of S, which is a 1-form on Q. If we pick local coordinates q^i in some open set of Q, then we have

\displaystyle {d S = \frac{\partial S}{\partial q^i} dq^i }

and these derivatives \displaystyle{ \frac{\partial S}{\partial q^i} } are very interesting. Let’s see why:

Example 1. In classical mechanics, consider a particle on a manifold X. Suppose the particle starts at some fixed position at some fixed time. Suppose that it ends up at the position x at time t. Then the particle will seek to follow a path that minimizes the action given these conditions. Assume this path exists and is unique. The action of this path is then called Hamilton’s principal function, S(x,t). Let

Q = X \times \mathbb{R}

and assume Hamilton’s principal function is a smooth function

S : Q \to \mathbb{R}

We then have

d S = p_i dq^i - H d t

where q^i are local coordinates on X,

\displaystyle{ p_i = \frac{\partial S}{\partial q^i} }

is called the momentum in the ith direction, and

\displaystyle{ H = - \frac{\partial S}{\partial t} }

is called the energy. The minus signs here are basically just a mild nuisance. Time is different from space, and in special relativity the difference comes from a minus sign, but I don’t think that’s the explanation here. We could get rid of the minus signs by working with negative energy, but it’s not such a big deal.
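As a sanity check on these sign conventions, here is a minimal numerical sketch for one standard case: a free particle of mass m on the real line that starts at position 0 at time 0. For this system Hamilton’s principal function is S(x,t) = m x^2 / (2t). The choice of system, the sample values of m, x and t, and the helper names are all assumptions made for illustration.

```python
def principal_function(x, t, m=2.0):
    """Action of the straight-line path from (0, 0) to (x, t) for a free particle."""
    return m * x**2 / (2.0 * t)

def partial(f, args, index, h=1e-6):
    """Central finite-difference estimate of a partial derivative of f."""
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)

m, x, t = 2.0, 3.0, 1.5
p = partial(principal_function, (x, t), 0)    # momentum: p = dS/dx
H = -partial(principal_function, (x, t), 1)   # energy:   H = -dS/dt

print(p, m * x / t)          # both close to mass times velocity
print(H, p**2 / (2.0 * m))   # both close to the kinetic energy
```

With the minus sign in the definition of H, the energy comes out as the familiar positive kinetic energy p^2/2m.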

Example 2. In thermodynamics, consider a system with the internal energy U and volume V. Then the system will choose a state that maximizes the entropy given these constraints. Assume this state exists and is unique. Call the entropy of this state S(U,V). Let

Q = \mathbb{R}^2

and assume the entropy is a smooth function

S : Q \to \mathbb{R}

We then have

d S = \displaystyle{\frac{1}{T} d U + \frac{P}{T} d V }

where T is the temperature of the system, and P is the pressure. The slight awkwardness of this formula makes people favor other setups.

Example 3. In thermodynamics there are many setups for studying the same system using different minimum or maximum principles. One of the most popular is called the energy scheme. If internal energy increases with increasing entropy, as is usually the case, this scheme is equivalent to the one we just saw.

In the energy scheme we fix the entropy S and volume V. Then the system will choose a state that minimizes the internal energy given these constraints. Assume this state exists and is unique. Call the internal energy of this state U(S,V). Let

Q = \mathbb{R}^2

and assume the internal energy is a smooth function

U : Q \to \mathbb{R}

We then have

d U = T d S - P d V

where

\displaystyle{ T = \frac{\partial U}{\partial S} }

is the temperature, and

\displaystyle{ P = - \frac{\partial U}{\partial V} }

is the pressure. You’ll note the formulas here closely resemble those in Example 1!
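To see how the energy scheme relates to the entropy scheme, we can solve the formula for d U for d S. This is just a rearrangement, assuming T is nonzero:

```latex
d U = T \, d S - P \, d V
\quad \Longrightarrow \quad
d S = \frac{1}{T} \, d U + \frac{P}{T} \, d V
```

So when S is regarded as a function of U and V, its partial derivatives are \partial S / \partial U = 1/T and \partial S / \partial V = P/T.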

Example 4. Here are the four most popular schemes for thermodynamics:

• If we fix the entropy S and volume V, the system will choose a state that minimizes the internal energy U(S,V).

• If we fix the entropy S and pressure P, the system will choose a state that minimizes the enthalpy H(S,P).

• If we fix the temperature T and volume V, the system will choose a state that minimizes the Helmholtz free energy A(T,V).

• If we fix the temperature T and pressure P, the system will choose a state that minimizes the Gibbs free energy G(T,P).

These quantities are related by a pack of similar-looking formulas, from which we may derive a mind-numbing little labyrinth of Maxwell relations. But for now, all we need to know is that all these approaches to thermodynamics are equivalent given some reasonable assumptions, and all the formulas and relations can be derived using the Legendre transformation trick I explained last time. So, I won’t repeat what we did in Example 3 for all these other cases!
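For reference, here is a brief summary of the standard textbook form of those formulas, in the same notation. Note that H here is the enthalpy, not the energy of Example 1:

```latex
\begin{aligned}
H &= U + P V,       & d H &= T \, d S + V \, d P \\
A &= U - T S,       & d A &= -S \, d T - P \, d V \\
G &= U - T S + P V, & d G &= -S \, d T + V \, d P
\end{aligned}
```

In each case the two variables held fixed are the ones whose differentials appear on the right.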

Example 5. In classical statics, consider a particle on a manifold Q. This particle will seek to minimize its potential energy V(q), which we’ll assume is some smooth function of its position q \in Q. We then have

d V = -F_i dq^i

where q^i are local coordinates on Q and

\displaystyle{ F_i = -\frac{\partial V}{\partial q^i} }

is called the force in the ith direction.

Conjugate variables

So, the partial derivatives of the quantity we’re trying to minimize or maximize are very important! As a result, we often want to give them more of an equal status as independent quantities in their own right. Then we call them ‘conjugate variables’.

To make this precise, consider the cotangent bundle T^* Q, which has local coordinates q^i (coming from the coordinates on Q) and p_i (the corresponding coordinates on each cotangent space). We then call p_i the conjugate variable of the coordinate q^i.

Given a smooth function

S : Q \to \mathbb{R}

the 1-form d S can be seen as a section of the cotangent bundle. The graph of this section is defined by the equation

\displaystyle{ p_i = \frac{\partial S}{\partial q^i} }

and this equation ties together two intuitions about ‘conjugate variables’: as coordinates on the cotangent bundle, and as partial derivatives of the quantity we’re trying to minimize or maximize.

The tautological 1-form

There is a lot to say here, especially about Legendre transformations, but I want to hasten on to a bit of symplectic geometry. And for this we need the ‘tautological 1-form’ on T^* Q.

We can think of d S as a map

d S : Q \to T^* Q

sending each point q \in Q to the point (q,p) \in T^* Q where p is defined by the equation we just saw:

\displaystyle{ p_i = \frac{\partial S}{\partial q^i} }

Using this map, we can pull back any 1-form on T^* Q to get a 1-form on Q.

What 1-form on Q might we like to get? Why, d S of course!

Amazingly, there’s a 1-form \alpha on T^* Q such that when we pull it back using the map d S, we get the 1-form d S—no matter what smooth function S we started with!

Thanks to this wonderfully tautological property, \alpha is called the tautological 1-form on T^* Q. You should check that it’s given by the formula

\alpha = p_i dq^i

If you get stuck, try this.

So, if we want to see how much S changes as we move along a path in Q, we can do this in three equivalent ways:

• Evaluate S at the endpoint of the path and subtract off S at the starting-point.

• Integrate the 1-form d S along the path.

• Use d S : Q \to T^* Q to map the path over to T^* Q, and then integrate \alpha over this path in T^* Q.

The last method is equivalent thanks to the ‘tautological’ property of \alpha. It may seem overly convoluted, but it shows that if we work in T^* Q, where the conjugate variables are accorded equal status, everything we want to know about the change in S is contained in the 1-form \alpha, no matter which function S we decide to use!
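As a numerical sanity check of these three methods, here is a minimal Python sketch. Everything specific in it is an assumption made for illustration: Q is taken to be \mathbb{R}^2, and the function S, the path, and the helper names are hypothetical choices, not anything fixed by the discussion above.

```python
import math

# Hypothetical smooth function S on Q = R^2, chosen only for illustration.
def S(q1, q2):
    return math.sin(q1) * q2 + q1 ** 2

# Its partial derivatives: the conjugate variables p_1, p_2 on the graph of dS.
def dS(q1, q2):
    return (math.cos(q1) * q2 + 2 * q1, math.sin(q1))

# Hypothetical path in Q, parametrized by s in [0, 1], and its velocity.
def path(s):
    return (s, s ** 2)

def path_velocity(s):
    return (1.0, 2 * s)

def integrate(f, n=20000):
    """Midpoint rule for the integral of f over [0, 1]."""
    h = 1.0 / n
    return sum(f((k + 0.5) * h) for k in range(n)) * h

# Method 1: S at the endpoint minus S at the starting point.
method1 = S(*path(1.0)) - S(*path(0.0))

# Method 2: integrate the 1-form dS along the path in Q.
def integrand_dS(s):
    grad = dS(*path(s))
    vel = path_velocity(s)
    return grad[0] * vel[0] + grad[1] * vel[1]

method2 = integrate(integrand_dS)

# Method 3: lift the path to T^*Q using dS, giving (q1, q2, p1, p2),
# then integrate alpha = p_i dq^i along the lifted path.
def lifted_path(s):
    q = path(s)
    p = dS(*q)
    return q + p  # the tuple (q1, q2, p1, p2)

def integrand_alpha(s):
    _, _, p1, p2 = lifted_path(s)
    v1, v2 = path_velocity(s)  # dq^i/ds along the lifted path
    return p1 * v1 + p2 * v2

method3 = integrate(integrand_alpha)

print(method1, method2, method3)
```

The three numbers agree up to the error of the midpoint rule. Methods 2 and 3 compute literally the same sum, which is the ‘tautological’ property in action.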

So, in this sense, \alpha knows everything there is to know about the change in Hamilton’s principal function in classical mechanics, or the change in entropy in thermodynamics… and so on!

But this means it must know about things like Hamilton’s equations, and the Maxwell relations.

The symplectic structure

We saw last time that the fundamental equations of classical mechanics and thermodynamics—Hamilton’s equations and the Maxwell relations—are mathematically just the same. They both say simply that partial derivatives commute:

\displaystyle { \frac{\partial^2 S}{\partial q^i \partial q^j} = \frac{\partial^2 S}{\partial q^j \partial q^i} }

where S: Q \to \mathbb{R} is the function we’re trying to minimize or maximize.

I also mentioned that this fact—the commuting of partial derivatives—can be stated in an elegant coordinate-free way:

d^2 S = 0

Perhaps I should remind you of the proof:

d^2 S =   d \left( \displaystyle{ \frac{\partial S}{\partial q^i} dq^i } \right) = \displaystyle{ \frac{\partial^2 S}{\partial q^j \partial q^i} dq^j \wedge dq^i }

but

dq^j \wedge dq^i

changes sign when we switch i and j, while

\displaystyle{ \frac{\partial^2 S}{\partial q^j \partial q^i}}

does not, so d^2 S = 0. It’s just a wee bit more work to show that conversely, starting from d^2 S = 0, it follows that the mixed partials must commute.

How can we state this fact using the tautological 1-form \alpha? I said that using the map

d S : Q \to T^* Q

we can pull back \alpha to Q and get d S. But pulling back commutes with the d operator! So, if we pull back d \alpha, we get d^2 S. But d^2 S = 0. So, d \alpha has the magical property that when we pull it back to Q, we always get zero, no matter what S we choose!

This magical property captures Hamilton’s equations, the Maxwell relations and so on—for all choices of S at once. So it shouldn’t be surprising that the 2-form

\theta = d \alpha

is colossally important: it’s the famous symplectic structure on the so-called phase space T^* Q.

Well, actually, most people prefer to work with

\omega = - d \alpha

It seems this whole subject is a monument of austere beauty… covered with minus signs, like bird droppings.

Example 6. In classical mechanics, let

Q = X \times \mathbb{R}

as in Example 1. If Q has local coordinates q^i, t, then T^* Q has these along with the conjugate variables as coordinates. As we explained, it causes little trouble to call these conjugate variables by the same names we used for the partial derivatives of S: namely, p_i and -H. So, we have

\alpha = p_i dq^i - H d t

and thus

\omega = dq^i \wedge dp_i - dt \wedge dH

Example 7. In thermodynamics, let

Q = \mathbb{R}^2

as in Example 3. If Q has coordinates S, V then the conjugate variables deserve to be called T, -P. So, we have

\alpha = T d S - P d V

and

\omega = d S \wedge d T - d V \wedge d P

You’ll see that in these formulas for \omega, variables get paired with their conjugate variables. That’s nice.

But let me expand on what we just saw, since it’s important. And let me talk about \theta =  d\alpha, without tossing in that extra sign.

What we saw is that the 2-form \theta is a ‘measure of noncommutativity’. When we pull \theta back to Q we get zero. This says that partial derivatives commute—and this gives Hamilton’s equations, the Maxwell relations, and all that. But up in T^* Q, \theta is not zero. And this suggests that there’s some built-in noncommutativity hiding in phase space!

Indeed, we can make this very precise. Consider a little parallelogram up in T^* Q:

Suppose we integrate the 1-form \alpha up the left edge and across the top. Do we get the same answer if we integrate it across the bottom edge and then up the right?

No, not necessarily! The difference is the same as the integral of \alpha all the way around the parallelogram. By Stokes’ theorem, this is the same as integrating \theta over the parallelogram. And there’s no reason that should give zero.

However, suppose we got our parallelogram in T^* Q by taking a parallelogram in Q and applying the map

d S : Q \to T^* Q

Then the integral of \alpha around our parallelogram would be zero, since it would equal the integral of d S around a parallelogram in Q… and that’s the change in S as we go around a loop from some point to… itself!

And indeed, the fact that a function S doesn’t change when we go around a parallelogram is precisely what makes

\displaystyle { \frac{\partial^2 S}{\partial q^i \partial q^j} = \frac{\partial^2 S}{\partial q^j \partial q^i} }

So the story all fits together quite nicely.
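Here is a small Python sketch of this contrast, under assumptions of my own choosing: Q = \mathbb{R}^2, a hypothetical function S, and rectangles rather than general parallelograms. It integrates \alpha = p_i dq^i around a rectangle lying in the (q^1, p_1) plane of T^* Q, and then around the image under d S of a rectangle in Q.

```python
import math

# Hypothetical smooth function on Q = R^2 and its partial derivatives.
def S(q1, q2):
    return math.sin(q1) * q2 + q1 ** 2

def dS(q1, q2):
    return (math.cos(q1) * q2 + 2 * q1, math.sin(q1))

def loop_integral_alpha(loop, n=4000):
    """Integrate alpha = p_1 dq^1 + p_2 dq^2 around a closed loop in T^*Q.

    loop(s) returns (q1, q2, p1, p2) for s in [0, 1], with loop(0) == loop(1).
    Each small step uses the average of p at its ends times the change in q.
    """
    total = 0.0
    prev = loop(0.0)
    for k in range(1, n + 1):
        cur = loop(k / n)
        for i in range(2):
            p_mid = 0.5 * (prev[2 + i] + cur[2 + i])
            total += p_mid * (cur[i] - prev[i])
        prev = cur
    return total

def rectangle(s, a, b):
    """Counterclockwise boundary of [0, a] x [0, b], for s in [0, 1]."""
    u = 4.0 * (s % 1.0)
    if u < 1:
        return (a * u, 0.0)
    if u < 2:
        return (a, b * (u - 1))
    if u < 3:
        return (a * (3 - u), b)
    return (0.0, b * (4 - u))

# A rectangle sitting in the (q1, p1) plane of T^*Q, with q2 = p2 = 0.
def flat_loop(s):
    q1, p1 = rectangle(s, 1.0, 0.5)
    return (q1, 0.0, p1, 0.0)

# A rectangle in Q, carried up to T^*Q by the map dS.
def lifted_loop(s):
    q1, q2 = rectangle(s, 1.0, 0.5)
    p1, p2 = dS(q1, q2)
    return (q1, q2, p1, p2)

around_flat = loop_integral_alpha(flat_loop)      # minus the area: about -0.5
around_lifted = loop_integral_alpha(lifted_loop)  # about 0
print(around_flat, around_lifted)
```

The first integral comes out as minus the area of the rectangle, which is the integral of \theta = d \alpha = dp_1 \wedge dq^1 over it; with \omega = -d \alpha the sign would flip. The second vanishes up to discretization error, since it is just the change in S around a closed loop.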

The big picture

I’ve tried to show you that the symplectic structure on the phase spaces of classical mechanics, and the lesser-known but utterly analogous one on the phase spaces of thermodynamics, is a natural outgrowth of utterly trivial reflections on the process of minimizing or maximizing a function S on a manifold Q.

The first derivative test tells us to look for points with

d S = 0

while the commutativity of partial derivatives says that

d^2 S = 0

everywhere—and this gives Hamilton’s equations and the Maxwell relations. The 1-form d S is the pullback of the tautological 1-form \alpha on T^* Q, and similarly d^2 S is the pullback of the symplectic structure d\alpha. The fact that

d \alpha \ne 0

says that T^* Q holds noncommutative delights, almost like a world where partial derivatives no longer commute! But of course we still have

d^2 \alpha = 0

everywhere, and this becomes part of the official definition of a symplectic structure.

All very simple. I hope, however, the experts note that to see this unified picture, we had to avoid the most common approaches to classical mechanics, which start with either a ‘Hamiltonian’

H : T^* Q \to \mathbb{R}

or a ‘Lagrangian’

L : T Q \to \mathbb{R}

Instead, we started with Hamilton’s principal function

S : Q \to \mathbb{R}

where Q is not the usual configuration space describing possible positions for a particle, but the ‘extended’ configuration space, which also includes time. Only this way do Hamilton’s equations, like the Maxwell relations, become a trivial consequence of the fact that partial derivatives commute.

But what about those ‘noncommutative delights’? First, there’s a noncommutative Poisson bracket operation on functions on T^* Q. This makes the functions into a so-called Poisson algebra. In classical mechanics of a point particle on the line, for example, it’s well-known that we have

\begin{array}{ccr}  \{ p, q \} &=& 1 \\  \{ H, t \} &=& -1 \end{array}

In thermodynamics, the analogous relations

\begin{array}{ccr}  \{ T, S \} &=& 1 \\  \{ P, V \} &=& -1 \end{array}

seem sadly little-known. But you can see them here, for example:

• M. J. Peterson, Analogy between thermodynamics and mechanics, American Journal of Physics 47 (1979), 488–490.

at least up to one of those pesky minus signs! We can use these Poisson brackets to study how one thermodynamic variable changes as we slowly change another, staying close to equilibrium all along.

Second, we can go further and ‘quantize’ the functions on T^* Q. This means coming up with an associative but noncommutative product of these functions that mimics the Poisson bracket to some extent. In the case of a particle on a line, we’d get commutation relations like

\begin{array}{lcr}  p q - q p &=& - i \hbar \\  H t - t H &=& i \hbar \end{array}

where \hbar is Planck’s constant. Now we can represent these quantities as operators on a Hilbert space, the uncertainty principle kicks in, and life gets really interesting.

In thermodynamics, the analogous relations would be

\begin{array}{ccr}  T S - S T &=& - i \hbar \\  P V - V P &=& i \hbar \end{array}

The math works just the same, but what does it mean physically? Are we now thinking of temperature, entropy and the like as ‘quantum observables’—for example, operators on a Hilbert space? Are we just quantizing thermodynamics?

That’s one possible interpretation, but I’ve never heard anyone discuss it. Here’s one good reason: as Blake Stacey pointed out below, these equations don’t pass the test of dimensional analysis! The quantities at left have units of energy, while Planck’s constant has units of action. So maybe we need to introduce a quantity with units of time at right, or maybe there’s some other interpretation, where we don’t interpret the parameter \hbar as the good old-fashioned Planck’s constant, but something else instead.

And if you’ve really been paying attention, you may wonder how quantropy fits into this game! I showed that at least in a toy model, the path integral formulation of quantum mechanics arises, not exactly from maximizing or minimizing something, but from finding its critical points: that is, points where its first derivative vanishes. This something is a complex-valued quantity analogous to entropy, which I called ‘quantropy’.

Now, while I keep throwing around words like ‘minimize’ and ‘maximize’, most everything I’m doing works just fine for critical points. So, it seems that the apparatus of symplectic geometry may apply to the path-integral formulation of quantum mechanics.

But that would be weirdly interesting! In particular, what would happen when we go ahead and quantize the path-integral formulation of quantum mechanics?

If you’re a physicist, there’s a guess that will come tripping off your tongue at this point, without you even needing to think. Me too. But I don’t know if that guess is right.

Less mind-blowingly, there is also the question of how symplectic geometry enters into classical statics via the idea of Example 5.

But there’s a lot of fun to be had in this game already with thermodynamics.

Appendix

I should admit, just so you don’t think I failed to notice, that only rather esoteric physicists study the approach to quantum mechanics where time is an operator that doesn’t commute with the Hamiltonian H. In this approach H commutes with the momentum and position operators. I didn’t write down those commutation equations, for fear you’d think I was a crackpot and stop reading! It is however a perfectly respectable approach, which can be reconciled with the usual one. And this issue is not only quantum-mechanical: it’s also important in classical mechanics.

Namely, there’s a way to start with the so-called extended phase space for a point particle on a manifold X:

T^* (X \times \mathbb{R})

with coordinates q^i, t, p_i and H, and get back to the usual phase space:

T^* X

with just q^i and p_i as coordinates. The idea is to impose a constraint of the form

H = f(q,p)

to knock off one degree of freedom, and use a standard trick called ‘symplectic reduction’ to knock off another.

Similarly, in quantum mechanics we can start with a big Hilbert space

L^2(X \times \mathbb{R})

on which q^i, t, p_i, and H are all operators, then impose a constraint expressing H in terms of p and q, and then use that constraint to pick out states lying in a smaller Hilbert space. This smaller Hilbert space is naturally identified with the usual Hilbert space for a point particle:

L^2(X)

Here X is called the configuration space for our particle; its cotangent bundle is the usual phase space. We call X \times \mathbb{R} the extended configuration space for our particle; its cotangent bundle is the extended phase space.

I’m having some trouble remembering where I first learned about these ideas, but here are some good places to start:

• Toby Bartels, Abstract Hamiltonian mechanics.

• Nikola Buric and Slobodan Prvanovic, Space of events and the time observable.

• Piret Kuusk and Madis Koiv, Measurement of time in nonrelativistic quantum and classical mechanics, Proceedings of the Estonian Academy of Sciences, Physics and Mathematics 50 (2001), 195–213.


Classical Mechanics versus Thermodynamics (Part 1)

19 January, 2012

It came as a bit of a shock last week when I realized that some of the equations I’d learned in thermodynamics were just the same as equations I’d learned in classical mechanics—with only the names of the variables changed, to protect the innocent.

Why didn’t anyone tell me?

For example: everybody loves Hamilton’s equations: there are just two, and they summarize the entire essence of classical mechanics. Most people hate the Maxwell relations in thermodynamics: there are lots, and they’re hard to remember.

But what I’d like to show you now is that Hamilton’s equations are Maxwell relations! They’re a special case, and you can derive them the same way. I hope this will make you like the Maxwell relations more, instead of liking Hamilton’s equations less.

First, let’s see what these equations look like. Then let’s see why Hamilton’s equations are a special case of the Maxwell relations. And then let’s talk about how this might help us unify different aspects of physics.

Hamilton’s equations

Suppose you have a particle on the line whose position q and momentum p are functions of time, t. If the energy H is a function of position and momentum, Hamilton’s equations say:

\begin{array}{ccr}  \displaystyle{  \frac{d p}{d t} }  &=&  \displaystyle{- \frac{\partial H}{\partial q} } \\  \\ \displaystyle{  \frac{d q}{d t} } &=&  \displaystyle{ \frac{\partial H}{\partial p} }  \end{array}

The Maxwell relations

There are lots of Maxwell relations, and that’s one reason people hate them. But let’s just talk about two; most of the others work the same way.

Suppose you have a physical system like a box of gas that has some volume V, pressure P, temperature T and entropy S. Then the first and second Maxwell relations say:

\begin{array}{ccr}  \displaystyle{ \left. \frac{\partial T}{\partial V}\right|_S } &=&  \displaystyle{ - \left. \frac{\partial P}{\partial S}\right|_V } \\   \\   \displaystyle{ \left. \frac{\partial S}{\partial  V}\right|_T  }  &=&  \displaystyle{ \left. \frac{\partial P}{\partial T} \right|_V }   \end{array}

Comparison

Clearly Hamilton’s equations resemble the Maxwell relations. Please check for yourself that the patterns of variables are exactly the same: only the names have been changed! So, apart from a key subtlety, Hamilton’s equations become the first and second Maxwell relations if we make these replacements:

\begin{array} {ccccccc}  q &\to& S & &  p &\to & T \\ t & \to & V & & H &\to & P \end{array}

What’s the key subtlety? One reason people hate the Maxwell relations is they have lots of little symbols like \left. \right|_V saying what to hold constant when we take our partial derivatives. Hamilton’s equations don’t have those.

So, you probably won’t like this, but let’s see what we get if we write Hamilton’s equations so they exactly match the pattern of the Maxwell relations:

\begin{array}{ccr}     \displaystyle{ \left. \frac{\partial p}{\partial t} \right|_q }  &=&  \displaystyle{- \left. \frac{\partial H}{\partial q} \right|_t } \\  \\\displaystyle{  \left.\frac{\partial q}{\partial t} \right|_p } &=&  \displaystyle{ \left. \frac{\partial H}{\partial p} \right|_t }    \end{array}

This looks a bit weird, and it set me back a day. What does it mean to take the partial derivative of q in the t direction while holding p constant, for example?

I still think it’s weird. But I think it’s correct. To see this, let’s derive the Maxwell relations, and then derive Hamilton’s equations using the exact same reasoning, with only the names of variables changed.

Deriving the Maxwell relations

The Maxwell relations are extremely general, so let’s derive them in a way that makes that painfully clear. Suppose we have any smooth function U on the plane. Just for laughs, let’s call the coordinates of this plane S and V. Then we have

d U = T d S - P d V

for some functions T and P. This equation is just a concise way of saying that

\displaystyle{ T = \left.\frac{\partial U}{\partial S}\right|_V }

and

\displaystyle{ P = - \left.\frac{\partial U}{\partial V}\right|_S }

The minus sign here is unimportant: you can think of it as a whimsical joke. All the math would work just as well if we left it out.

(In reality, physicists call U the internal energy of a system, regarded as a function of its entropy S and volume V. They then call T the temperature and P the pressure. It just so happens that for lots of systems, their internal energy goes down as you increase their volume, so P works out to be positive if we stick in this minus sign, so that’s what people did. But you don’t need to know any of this physics to follow the derivation of the Maxwell relations!)

Now, mixed partial derivatives commute, so we have:

\displaystyle{ \frac{\partial^2 U}{\partial V \partial S} =  \frac{\partial^2 U}{\partial S \partial V}}

Plugging in our definitions of T and P, this says

\displaystyle{ \left. \frac{\partial T}{\partial V}\right|_S = - \left. \frac{\partial P}{\partial S}\right|_V }

And that’s the first Maxwell relation! So, there’s nothing to it: it’s just a sneaky way of saying that the mixed partial derivatives of the function U commute.

The second Maxwell relation works the same way. But seeing this takes a bit of thought, since we need to cook up a suitable function whose mixed partial derivatives are the two sides of this equation:

\displaystyle{ \left. \frac{\partial S}{\partial  V}\right|_T  = \left. \frac{\partial P}{\partial T} \right|_V }

There are different ways to do this, but for now let me use the time-honored method of ‘pulling the rabbit from the hat’.

Here’s the function we want:

A = U - T S

(In thermodynamics this function is called the Helmholtz free energy. It’s sometimes denoted F, but the International Union of Pure and Applied Chemistry recommends calling it A, which stands for the German word ‘Arbeit’, meaning ‘work’.)

Let’s check that this function does the trick:

\begin{array}{ccl} d A &=& d U - d(T S) \\  &=& (T d S - P d V) - (S dT + T d S) \\  &=& -S d T - P dV \end{array}

If we restrict ourselves to any subset of the plane where T and V serve as coordinates, the above equation is just a concise way of saying

\displaystyle{ S = - \left.\frac{\partial A}{\partial T}\right|_V }

and

\displaystyle{ P = - \left.\frac{\partial A}{\partial V}\right|_T }

Then since mixed partial derivatives commute, we get:

\displaystyle{ \frac{\partial^2 A}{\partial V \partial T} =  \frac{\partial^2 A}{\partial T \partial V}}

or in other words:

\displaystyle{ \left. \frac{\partial S}{\partial  V}\right|_T  = \left. \frac{\partial P}{\partial T} \right|_V }

which is the second Maxwell relation.
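To see both relations hold in a concrete case, here is a minimal Python sketch using finite differences. The function U(S,V) = e^S / V is a hypothetical choice, picked only because T = e^S / V can then be solved for S in closed form; it is not meant to model any real gas, and the helper names are made up for this example.

```python
import math

# Hypothetical internal energy U(S, V), chosen only so that S can be
# solved for in terms of T and V. Not a model of a real system.
def U(S, V):
    return math.exp(S) / V

def partial(f, args, i, h=1e-4):
    """Central finite-difference partial derivative of f in its i-th argument."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)

# T = dU/dS at fixed V, and P = -dU/dV at fixed S.
def T_of_SV(S, V):
    return partial(U, (S, V), 0)

def P_of_SV(S, V):
    return -partial(U, (S, V), 1)

# First Maxwell relation, using S and V as coordinates.
S0, V0 = 0.3, 2.0
lhs1 = partial(T_of_SV, (S0, V0), 1)    # dT/dV at fixed S
rhs1 = -partial(P_of_SV, (S0, V0), 0)   # -dP/dS at fixed V

# For this U we have T = exp(S)/V, so S = log(T V), and T, V serve as
# coordinates wherever both are positive.
def S_of_TV(T, V):
    return math.log(T * V)

def P_of_TV(T, V):
    return P_of_SV(S_of_TV(T, V), V)

# Second Maxwell relation, using T and V as coordinates.
T0 = math.exp(S0) / V0
lhs2 = partial(S_of_TV, (T0, V0), 1)    # dS/dV at fixed T
rhs2 = partial(P_of_TV, (T0, V0), 0)    # dP/dT at fixed V

print(lhs1, rhs1)
print(lhs2, rhs2)
```

For this choice of U, both sides of the second relation work out to 1/V, which you can also confirm by hand from P = T/V.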

We can keep playing this game using various pairs of the four functions S, T, P, V as coordinates, and get more Maxwell relations: enough to give ourselves a headache! But we have better things to do today.

Hamilton’s equations as Maxwell relations

For example: let’s see how Hamilton’s equations fit into this game. Suppose we have a particle on the line. Consider smooth paths where it starts at some fixed position at some fixed time and ends at the point q at the time t. Nature will choose a path with least action—or at least one that’s a stationary point of the action. Let’s assume there’s a unique such path, and that it depends smoothly on q and t. For this to be true, we may need to restrict q and t to a subset of the plane, but that’s okay: go ahead and pick such a subset.

Given q and t in this set, nature will pick the path that’s a stationary point of action; the action of this path is called Hamilton’s principal function and denoted S(q,t). (Beware: this S is not the same as entropy!)

Let’s assume S is smooth. Then we can copy our derivation of the Maxwell relations line for line and get Hamilton’s equations! Let’s do it, skipping some steps but writing down the key results.

For starters we have

d S = p d q - H d t

for some functions p and H called the momentum and energy, which obey

\displaystyle{ p = \left.\frac{\partial S}{\partial q}\right|_t }

and

\displaystyle{ H = - \left.\frac{\partial S}{\partial t}\right|_q }

As far as I can tell it’s just a cute coincidence that we see a minus sign in the same place as before! Anyway, the fact that mixed partials commute gives us

\displaystyle{ \left. \frac{\partial p}{\partial t} \right|_q = - \left. \frac{\partial H}{\partial q} \right|_t }

which is the first of Hamilton’s equations. And now we see that all the funny \left. \right|_q and \left. \right|_t things are actually correct!
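For a concrete check, take a free particle of mass m that starts at position 0 at time 0. The straight-line path to (q,t) has action m q^2 / (2t), so that is Hamilton’s principal function in this case. The Python sketch below is only an illustration under that assumption: the mass, the sample point, and the helper names are arbitrary choices of mine.

```python
# Free particle of mass m starting at position 0 at time 0 (an assumed
# example; the argument in the text works for any smooth S).
m = 2.0

def S(q, t):
    """Hamilton's principal function: the action of the straight-line path."""
    return m * q ** 2 / (2 * t)

def partial(f, args, i, h=1e-4):
    """Central finite-difference partial derivative of f in its i-th argument."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)

def p(q, t):
    """Momentum: dS/dq at fixed t."""
    return partial(S, (q, t), 0)

def H(q, t):
    """Energy: -dS/dt at fixed q."""
    return -partial(S, (q, t), 1)

q0, t0 = 1.5, 3.0
lhs = partial(p, (q0, t0), 1)    # dp/dt at fixed q
rhs = -partial(H, (q0, t0), 0)   # -dH/dq at fixed t

print(p(q0, t0), H(q0, t0))      # p = m q / t and H = p^2 / (2 m)
print(lhs, rhs)
```

Here p comes out as m q / t and H as p^2 / (2m), the familiar kinetic energy, and the two sides of the first equation agree.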

Next, we pull a rabbit out of our hat. We define this function:

X = S - p q

and check that

d X = - q dp - H d t

This function X probably has a standard name, but I don’t know it. Do you?

Then, considering any subset of the plane where p and t serve as coordinates, we see that because mixed partials commute:

\displaystyle{ \frac{\partial^2 X}{\partial t \partial p} =  \frac{\partial^2 X}{\partial p \partial t}}

we get

\displaystyle{ \left. \frac{\partial q}{\partial t} \right|_p = \left. \frac{\partial H}{\partial p} \right|_t }

So, we’re done!
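Here is the same kind of check for the second equation, again as a hedged sketch rather than anything canonical. I assume a free particle of mass m starting at position 0 at time 0, whose principal function is S(q,t) = m q^2 / (2t), so that p = m q / t can be inverted to give q = p t / m. The helper names are hypothetical.

```python
# Assumed example: free particle of mass m starting at 0 at time 0.
m = 2.0

def S(q, t):
    return m * q ** 2 / (2 * t)

def q_of_pt(p, t):
    """Position as a function of p and t, from inverting p = m q / t."""
    return p * t / m

def X(p, t):
    """The function X = S - p q, written in the coordinates (p, t)."""
    q = q_of_pt(p, t)
    return S(q, t) - p * q

def partial(f, args, i, h=1e-4):
    """Central finite-difference partial derivative of f in its i-th argument."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)

def H_of_pt(p, t):
    """Energy read off from dX = -q dp - H dt: minus dX/dt at fixed p."""
    return -partial(X, (p, t), 1)

p0, t0 = 1.0, 3.0
q_from_X = -partial(X, (p0, t0), 0)    # should reproduce q = p t / m
lhs = partial(q_of_pt, (p0, t0), 1)    # dq/dt at fixed p
rhs = partial(H_of_pt, (p0, t0), 0)    # dH/dp at fixed t

print(q_from_X, q_of_pt(p0, t0))
print(lhs, rhs)
```

In this example X works out to -p^2 t / (2m), and both sides of the second equation equal p/m: the particle’s velocity.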

But you might be wondering how we pulled this rabbit out of the hat. More precisely, why did we suspect it was there in the first place? There’s a nice answer if you’re comfortable with differential forms. We start with what we know:

d S = p d q - H d t

Next, we use this fundamental equation:

d^2 = 0

to note that:

\begin{array}{ccl}  0 &=& d^2 S \\ &=& d(p d q- H d t) \\ &=& d p \wedge d q - d H \wedge d t \\ &=& - dq \wedge d p - d H \wedge d t \\ &=& d(-q d p - H d t) \end{array}

See? We’ve managed to switch the roles of p and q, at the cost of an extra minus sign!

Then, if we restrict attention to any contractible open subset of the plane, the Poincaré Lemma says

d \omega = 0 \implies \omega = d \mu \; \textrm{for some} \; \mu

Since

d(- q d p - H d t) = 0

it follows that there’s a function X with

d X = - q d p - H d t

This is our rabbit. And if you ponder the difference between -q d p and p d q, you’ll see it’s -d( p q). So, it’s no surprise that

X = S - p q

The big picture

Now let’s step back and think about what’s going on.

Lately I’ve been trying to unify a bunch of ‘extremal principles’, including:

1) the principle of least action
2) the principle of least energy
3) the principle of maximum entropy
4) the principle of maximum simplicity, or Occam’s razor

In my post on quantropy I explained how the first three principles fit into a single framework if we treat Planck’s constant as an imaginary temperature. The guiding principle of this framework is

maximize entropy
subject to the constraints imposed by what you believe

And that’s nice, because E. T. Jaynes has made a powerful case for this principle.

However, when the temperature is imaginary, entropy is so different that it may deserve a new name: say, ‘quantropy’. In particular, it’s complex-valued, so instead of maximizing it we have to look for stationary points: places where its first derivative is zero. But this isn’t so bad. Indeed, a lot of minimum and maximum principles are really ‘stationary principles’ if you examine them carefully.

What about the fourth principle: Occam’s razor? We can formalize this using algorithmic probability theory. Occam’s razor then becomes yet another special case of

maximize entropy
subject to the constraints imposed by what you believe

once we realize that algorithmic entropy is a special case of ordinary entropy.

All of this deserves plenty of further thought and discussion—but not today!

Today I just want to point out that once we’ve formally unified classical mechanics and thermal statics (often misleadingly called ‘thermodynamics’), as sketched in the article on quantropy, we should be able to take any idea from one subject and transpose it to the other. And it’s true. I just showed you an example, but there are lots of others!

I guessed this should be possible after pondering three famous facts:

• In classical mechanics, if we fix the initial position of a particle, we can pick any position q and time t at which the particle’s path ends, and nature will seek the path to this endpoint that minimizes the action. This minimal action is Hamilton’s principal function S(q,t), which obeys

d S = p d q - H d t

In thermodynamics, if we fix the entropy S and volume V of a box of gas, nature will seek the probability distribution of microstates that minimizes the energy. This minimal energy is the internal energy U(S,V), which obeys

d U = T d S - P d V

• In classical mechanics we have canonically conjugate quantities, while in thermodynamics we have conjugate variables. In classical mechanics the canonical conjugate of the position q is the momentum p, while the canonical conjugate of time t is energy H. In thermodynamics, the conjugate of entropy S is temperature T, while the conjugate of volume V is pressure P. All this fits in perfectly with the analogy we’ve been using today:

\begin{array} {ccccccc}  q &\to& S & &  p &\to & T \\ t & \to & V & & H &\to & P \end{array}

• Something called the Legendre transformation plays a big role both in classical mechanics and thermodynamics. This transformation takes a function of some variable and turns it into a function of the conjugate variable. In our proof of the Maxwell relations, we secretly used a Legendre transformation to pass from the internal energy U(S,V) to the Helmholtz free energy A(T,V):

A = U - T S

where we must solve for the entropy S in terms of T and V to think of A as a function of these two variables.

Similarly, in our proof of Hamilton’s equations, we passed from Hamilton’s principal function S(q,t) to the function X(p,t):

X = S - p q

where we must solve for the position q in terms of p and t to think of X as a function of these two variables.

I hope you see that all this stuff fits together in a nice picture, and I hope to say a bit more about it soon. The most exciting thing for me will be to see how symplectic geometry, so important in classical mechanics, can be carried over to thermodynamics. Why? Because I’ve never seen anyone use symplectic geometry in thermodynamics. But maybe I just haven’t looked hard enough!

Indeed, it’s perfectly possible that some people already know what I’ve been saying today. Have you seen someone point out that Hamilton’s equations are a special case of the Maxwell relations? This would seem to be the first step towards importing all of symplectic geometry to thermodynamics.


Going on Strike

17 January, 2012

 

Along with Wikipedia and other sites, this blog will go on strike on the 18th of January, 2012. We will be closed starting 13:00 UTC (also known as 1 pm Greenwich Mean Time – that’s 8 am Eastern Standard Time for you Americans). We should be back 12 hours later.

Congress has decided to shelve the Stop Online Piracy Act (SOPA) until a more compliant president is elected. But we need to let them know now that this bill sucks, along with its evil partner, the Protect IP Act or PIPA. That’s what the internet strike is about.

My homepage will be on strike too—in fact, it started today! Yours can easily do the same: just copy my homepage onto yours and adjust it to taste.

(By the way, the official version of the “strike” webpage is flawed because it uses relative links that don’t work when you copy it to your own site. I fixed those in my version.)


Extremal Principles in Classical, Statistical and Quantum Mechanics

13 January, 2012

guest post by Mike Stay

The table in John’s post on quantropy shows that energy and action are analogous:

Statics                  | Dynamics
statistical mechanics    | quantum mechanics
probabilities            | amplitudes
Boltzmann distribution   | Feynman sum over histories
energy                   | action
temperature              | Planck’s constant times i
entropy                  | quantropy
free energy              | free action

However, this seems to be part of a bigger picture that includes at least entropy as analogous to both of those, too. I think that just about any quantity defined by an integral over a path would behave similarly.

I see four broad areas to consider, based on a temperature parameter:

  1. T = 0: statics, or “least quantity”
  2. Real T > 0: statistical mechanics
  3. Imaginary T: a thermal ensemble gets replaced by a quantum superposition
  4. Complex T: ensembles of quantum systems, as in nuclear magnetic resonance

I’m not going to get into the last of these in what follows.

1. “Least quantity”

Lagrangian of a classical particle

K is kinetic energy, i.e. the “action density” due to motion.

V is potential energy, i.e. minus the “action density” due to position.

The action is then:

\displaystyle \begin{array}{rcl}   A &=& \int (K-V) \,  d t \\ & = & \int \left[m\left(\frac{d q(t)}{d t}\right)^2 - V(q(t))\right] d t  \end{array}

where m is the particle’s mass. We get the principle of least action by setting \delta A = 0.
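To make \delta A = 0 concrete, here is a minimal Python sketch that discretizes the action and compares a straight-line path with wiggled versions of it. It assumes a free particle (V = 0), uses the convention above with no factor of 1/2 in the kinetic term, and all names, step counts and perturbations are arbitrary illustrative choices.

```python
import math

def action(path, dt, m=1.0, V=lambda q: 0.0):
    """Discretized A = integral of [m (dq/dt)^2 - V(q)] dt.

    path is a list of positions at equally spaced times, dt apart.
    V is evaluated at the midpoint of each step; it defaults to zero.
    """
    total = 0.0
    for a, b in zip(path, path[1:]):
        velocity = (b - a) / dt
        midpoint = 0.5 * (a + b)
        total += (m * velocity ** 2 - V(midpoint)) * dt
    return total

n = 100
dt = 1.0 / n
straight = [k * dt for k in range(n + 1)]   # q(t) = t, from q = 0 to q = 1

def perturbed(eps):
    """The straight path plus a sine bump that vanishes at both endpoints
    (up to floating-point rounding)."""
    return [q + eps * math.sin(math.pi * k / n) for k, q in enumerate(straight)]

A0 = action(straight, dt)

# First-order variation: a symmetric difference quotient in eps.
eps = 1e-3
first_variation = (action(perturbed(eps), dt) - action(perturbed(-eps), dt)) / (2 * eps)

print(A0, first_variation)
print(action(perturbed(0.1), dt), action(perturbed(-0.1), dt))
```

The first variation vanishes up to rounding, and in this free-particle case both perturbed paths have larger action than the straight one, so the stationary point really is a minimum. With a nonzero V that need not be true for long time intervals, which is one reason to speak of stationary rather than least action.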

“Static” systems related by a Wick rotation

  1. Substitute q(s = iz) for q(t) to get a “springy” static system.

    In John’s homework problem A Spring in Imaginary Time, he guided students through a Wick-rotation-like process that transforms the Lagrangian above into the Hamiltonian of a springy system. (I say “springy” because it’s not exactly the Hamiltonian for a hanging spring: here each infinitesimal piece of the spring is at a fixed horizontal position and is free to move only vertically.)

    \kappa is the potential energy density due to stretching.

    \upsilon is the potential energy density due to position.

    We then have

    \displaystyle  \begin{array}{rcl}\int(\kappa-\upsilon) dz & = &  \int\left[k\left(\frac{dq(iz)}{dz}\right)^2 - \upsilon(q(iz))\right]  dz\\ & = & -i\int\left[-k\left(\frac{dq(iz)}{diz}\right)^2 -  \upsilon(q(iz))\right] diz\\ & = & i  \int\left[k\left(\frac{dq(iz)}{diz}\right)^2 + \upsilon(q(iz))\right]  diz \end{array}

    or letting s = iz,

    \displaystyle  \begin{array}{rcl}   & = &  i\int\left[k\left(\frac{dq(s)}{ds}\right)^2 + \upsilon(q(s))\right]  ds\\ & = & iE \end{array}

    where E is the potential energy of the spring. We get the principle of least energy by setting \delta E = 0.

  2. Substitute q(β = iz) for q(t) to get a thermometer system.

    We can repeat the process above, but use inverse temperature, or “coolness”, instead of time. Note that this is still a statics problem at heart! We’ll introduce another temperature below when we allow for multiple possible q’s.

    K is the potential energy due to the rate of change of q with respect to \beta. (This has to do with the thermal expansion coefficient: if we fix the length of the thermometer and then cool it, we get “stretching” potential energy.)

    V is any extra potential energy due to q.

    \displaystyle \begin{array}{rcl}\int(K-V) dz  & = & \int\left[k\left(\frac{dq(iz)}{dz}\right)^2 -  V(q(iz))\right] dz\\ & = &  -i\int\left[-k\left(\frac{dq(iz)}{diz}\right)^2 - V(q(iz))\right]  diz\\ & = & i \int\left[k\left(\frac{dq(iz)}{diz}\right)^2 +  V(q(iz))\right] diz \end{array}

    or letting \beta = iz,

    \displaystyle \begin{array}{rcl}   & = &  i\int\left[k\left(\frac{dq(\beta)}{d\beta}\right)^2 +  V(q(\beta))\right] d\beta\\ & = & iS_1\end{array}

    where S_1 is the entropy lost as the thermometer is cooled. We get the principle of “least entropy lost” by setting \delta S_1 = 0.

  3. Substitute q(T₁ = iz) for q(t).

    We can repeat the process above, but use temperature instead of time. We get a system whose heat capacity is governed by a function q(T) and its derivative. We’re trying to find the best function q, the most efficient way to raise the temperature of the system.

    C is the heat capacity (= entropy) proportional to (dq/dT_1)^2.

    V is the heat capacity due to q.

    \displaystyle \begin{array}{rcl}\int(C-V) dz  & = & \int\left[k\left(\frac{dq(iz)}{dz}\right)^2 -  V(q(iz))\right] dz\\ & = &  -i\int\left[-k\left(\frac{dq(iz)}{diz}\right)^2 - V(q(iz))\right]  diz\\ & = & i \int\left[k\left(\frac{dq(iz)}{diz}\right)^2 +  V(q(iz))\right] diz  \end{array}

    or letting T_1 = iz,

    \displaystyle \begin{array}{rcl} & = &  i\int\left[k\left(\frac{dq(T_1)}{dT_1}\right)^2 + V(q(T_1))\right]  dT_1\\ & = & iE \end{array}

    where E is the energy required to raise the temperature. We again get the principle of least energy by setting \delta E = 0.
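All three derivations above rest on the same two facts: dz = -i d(iz), and by the chain rule (dq(iz)/dz)^2 = -(dq(iz)/d(iz))^2. The following sketch checks the resulting identity numerically. The polynomial path q, the potential density v, the constant k = 0.7 and the integration range 0 ≤ z ≤ 1 are all made-up choices for illustration; a polynomial is used only because it extends to complex arguments without fuss.

```python
def q(s):
    """Hypothetical path: a polynomial, so it makes sense for complex s."""
    return s * s + 1.0

def dq(s):
    """Derivative of q with respect to its (complex) argument."""
    return 2.0 * s

def v(x):
    """Hypothetical potential density."""
    return x * x

def midpoint(f, n=1000):
    """Midpoint rule for the integral of f(z) over real z in [0, 1]."""
    h = 1.0 / n
    return sum(f((j + 0.5) * h) for j in range(n)) * h

k = 0.7

# Left side: integrate in z, using dq(iz)/dz = i * q'(iz).
lhs = midpoint(lambda z: k * (1j * dq(1j * z)) ** 2 - v(q(1j * z)))

# Right side: i times the integral over s = iz of k (dq/ds)^2 + v(q(s)),
# written as an integral over z with ds = i dz.
rhs = 1j * midpoint(lambda z: (k * dq(1j * z) ** 2 + v(q(1j * z))) * 1j)

print(lhs, rhs)
```

Both sides use the same quadrature points, so they should agree up to rounding error.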

2. Statistical mechanics

Here we allow lots of possible q’s, then maximize entropy subject to constraints using the Lagrange multiplier trick.

Statistical mechanics of a particle

For the statistical mechanics of a particle, we choose a real measure a_x on the set of paths. For simplicity, we assume the set is finite.

Normalize so \sum a_x = 1.

Define entropy to be S = - \sum a_x \ln a_x.

Our problem is to choose a_x to minimize the “free action” F = A - \lambda S, or, what’s equivalent, to maximize S subject to a constraint on A.

To make units match, λ must have units of action, so it’s some multiple of ℏ. Replace λ by ℏλ so the free action is

F = A - \hbar\lambda\, S.

The distribution that minimizes the free action is the Gibbs distribution a_x = \exp(-A_x/\hbar\lambda) / Z, where A_x is the action of path x and Z is the usual partition function.
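As a sanity check, here is a small sketch on a finite set of paths. It reads the A in the free action as the expected action \sum a_x A_x, which is an assumption about the intended meaning, and the five actions and the value of ℏλ are made up. It compares the free action of the Gibbs weights with that of randomly chosen probability distributions, and with the closed form -ℏλ ln Z.

```python
import math
import random

def gibbs(actions, hbar_lambda):
    """Weights a_x = exp(-A_x / (hbar*lambda)) / Z, plus Z itself."""
    boltz = [math.exp(-act / hbar_lambda) for act in actions]
    z = sum(boltz)
    return [b / z for b in boltz], z

def free_action(weights, actions, hbar_lambda):
    """F = <A> - hbar*lambda * S, with S = -sum a_x ln a_x."""
    mean_action = sum(w * act for w, act in zip(weights, actions))
    entropy = -sum(w * math.log(w) for w in weights if w > 0)
    return mean_action - hbar_lambda * entropy

actions = [0.0, 0.4, 1.1, 2.5, 3.0]   # made-up actions A_x
hbar_lambda = 0.8                     # made-up value of hbar*lambda

weights, z = gibbs(actions, hbar_lambda)
f_gibbs = free_action(weights, actions, hbar_lambda)

# Free action of 200 random probability distributions on the same paths.
rng = random.Random(1)
f_others = []
for _ in range(200):
    raw = [rng.random() + 1e-9 for _ in actions]
    total = sum(raw)
    f_others.append(free_action([r / total for r in raw], actions, hbar_lambda))

print(f_gibbs, -hbar_lambda * math.log(z), min(f_others))
```

The gap between any other distribution and the Gibbs one is ℏλ times their relative entropy, which is why no random trial can dip below f_gibbs.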

However, there are other observables of a path, like the position q_{1/2} at the halfway point; given another constraint on the average value of q_{1/2} over all paths, we get a distribution like

\displaystyle a_x = \exp(-\left[A_x +  pq_{1/2}\right]/\hbar\lambda) / Z.

The conjugate variable to that position is a momentum: in order to get from the starting point to the given point in the allotted time, the particle has to have the corresponding momentum.

dA = \hbar\lambda\, dS - p\, dq.

Other examples from Wick rotation

  1. Introduce a temperature T [Kelvins] that perturbs the spring.

    We minimize the free energy F = E - kT\, S, i.e. maximize the entropy S subject to a constraint on the expected energy

    \langle E\rangle = \sum a_x E_x.

    We get the measure a_x = \exp(-E_x/kT) / Z.

    Other observables about the spring’s path give conjugate variables whose product is energy. Given a constraint on the average position of the spring at the halfway point, we get a conjugate force: pulling the spring out of equilibrium requires a force.

    dE = kT\, dS - F\, dq.

  2. Statistical ensemble of thermometers with ensemble temperature T₂ [unitless].

    We minimize the “free entropy” F = S_1 - T_2S_2, i.e. we maximize the entropy S_2 subject to a constraint on the expected entropy lost

    \langle S_1\rangle = \sum a_x S_{1,x}.

    We get the measure a_x = \exp(-S_{1,x}/T_2) / Z.

    Given a constraint on the average position at the halfway point, we get a conjugate inverse length r that tells how much entropy is lost when the thermometer shrinks by dq.

    dS_1 = T_2\, dS_2 - r\, dq.

  3. Statistical ensemble of functions q with ensemble temperature T₂ [Kelvins].

    We minimize the free energy F = E - kT_2\, S, i.e. we maximize the entropy S subject to a constraint on the expected energy

    \displaystyle \langle E\rangle = \sum a_x E_x.

    We get the measure a_x = \exp(-E_x/kT_2) / Z.

    Again, a constraint on the position would give a conjugate force. It’s a little harder to see how here, but given a non-optimal function q(T), we have an extra energy cost due to inefficiency that’s analogous to the stretching potential energy when pulling a spring out of equilibrium.

3. Thermo to quantum via Wick rotation of Lagrange multiplier

We allow a complex-valued measure a as John did in the article on quantropy. We pick a logarithm for each a_x and assume they don’t go through zero as we vary them. We also choose an imaginary Lagrange multiplier.

Normalize so \sum a_x = 1.

Define quantropy Q = - \sum a_x \ln a_x.

Find a stationary point of the free action F = A - \hbar\lambda\, Q.

We get a_x = \exp(-A_x/\hbar\lambda). If \lambda = -i, we get Feynman’s sum over histories. Surely something like the two-slit experiment considers histories with a constraint on position at a particular time, and we get a conjugate momentum?
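A short sketch of the stationarity condition with an imaginary multiplier, under some loudly stated assumptions: units with ℏ = 1, four made-up actions, and weights normalized by dividing by Z = \sum \exp(-A_x/\hbar\lambda) so that they sum to 1. The logarithm of each a_x is chosen by hand, in the spirit of "pick a logarithm for each a_x", rather than taken from the principal branch.

```python
import cmath

hbar = 1.0                      # units chosen so that hbar = 1 (assumption)
lam = -1j                       # the imaginary Lagrange multiplier
actions = [0.0, 0.5, 1.3, 2.0]  # made-up actions A_x

# Unnormalized weights exp(-A_x / (hbar*lambda)): pure phases of modulus 1.
# For lambda = -i this is exp(-i*A_x/hbar); lambda = +i flips the sign of the phase.
phases = [cmath.exp(-act / (hbar * lam)) for act in actions]
z = sum(phases)
weights = [p / z for p in phases]

# Chosen logarithms: ln a_x = -A_x/(hbar*lambda) - ln Z.
log_w = [-act / (hbar * lam) - cmath.log(z) for act in actions]

# Stationarity of F = sum a_x A_x - hbar*lambda*Q under sum a_x = 1 means
# dF/da_x = A_x + hbar*lambda*(ln a_x + 1) takes the same value for every x.
grads = [act + hbar * lam * (lw + 1) for act, lw in zip(actions, log_w)]

print(grads)
```

The common value of the gradient is ℏλ(1 - ln Z), which plays the role of the Lagrange multiplier for the normalization constraint.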

A Quantum Version of Entropy

Again allow complex-valued a_x. However, this time normalize these by setting \sum |a_x|^2 = 1.

Define a quantum version of entropy S = - \sum |a_x|^2  \ln |a_x|^2.

  1. Allow quantum superposition of perturbed springs.

    \langle E\rangle = \sum |a_x|^2 E_x. Get a_x =  \exp(-E_x/kT) / Z. If T = -i\hbar/tk, we get the evolution of the quantum state |q\rangle under the given Hamiltonian for a time t.

  2. Allow quantum superpositions of thermometers.

    \langle S_1\rangle = \sum |a_x|^2 S_{1,x}. Get a_x =  \exp(-S_{1,x}/T_2) / Z. If T_2 = -i, we get something like a sum over histories, but with a different normalization condition that converges because our set of paths is finite.

  3. Allow quantum superposition of systems.

    \langle E \rangle = \sum |a_x|^2 E_x. Get a_x =\exp(-E_x/kT_2) / Z. If T_2 = -i\hbar/tk, we get the result of “Measure E, then heat the superposition T₁ degrees in a time much less than t seconds, then wait t seconds.” Different functions q in the superposition change the heat capacity differently and thus the systems end up at different energies.
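To see why the choice T = -i\hbar/tk in items 1 and 3 gives time evolution, note that kT = -i\hbar/t, so -E_x/kT = -iE_x t/\hbar. The sketch below checks this for the unnormalized factors only. The energies, the time t, and the units ℏ = k = 1 are made-up illustrative values.

```python
import cmath

hbar = 1.0                   # units with hbar = 1 (assumption)
k_B = 1.0                    # Boltzmann's constant; it cancels in k*T below
t = 0.75                     # made-up elapsed time
energies = [0.0, 1.0, 2.5]   # made-up energy levels E_x

T = -1j * hbar / (t * k_B)   # the imaginary temperature from the text

# Boltzmann factors at imaginary temperature...
boltzmann = [cmath.exp(-E / (k_B * T)) for E in energies]
# ...compared with the usual time-evolution phases exp(-i E t / hbar).
evolution = [cmath.exp(-1j * E * t / hbar) for E in energies]

print(boltzmann)
print(evolution)
```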

So to sum up, there’s at least a three-way analogy between action, energy, and entropy depending on what you’re integrating over. You get a kind of “statics” if you extremize the integral by varying the path; by allowing multiple paths and constraints on observables, you get conjugate variables and “free” quantities that you want to minimize; and by taking the temperature to be imaginary, you get quantum systems.


The Beauty of Roots (Part 2)

7 January, 2012

Here’s a bit more on the beauty of roots—some things that may have escaped those of you who weren’t following this blog carefully!

Greg Egan has a great new applet for exploring the roots of Littlewood polynomials of a given degree—meaning polynomials whose coefficients are all ±1:

• Greg Egan, Littlewood applet.

Move the mouse around to create a little rectangle, and the applet will zoom in to show the roots in that region. For example, the above region is close to the number -0.0572 + 0.72229i.

Then, by holding the shift key and clicking the mouse, compare the corresponding ‘dragon’. We get the dragon for some complex number by evaluating all power series whose coefficients are all ±1 at this number.

You’ll see that often the dragon for some number resembles the set of roots of Littlewood polynomials near that number! To get a sense of why, read Greg’s explanation. However, he uses a different, though equivalent, definition of the dragon (which he calls the ‘Julia set’).

He also made a great video showing how the dragons change shape as you move around the complex plane:

The dragon is well-defined for any number inside the unit circle, since all power series with coefficients ±1 converge inside this circle. If you watch the video carefully—it helps to make it big—you’ll see a little white cross moving around inside the unit circle, indicating which dragon you’re seeing.
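Here is a rough sketch of how one might approximate a dragon numerically. It truncates each power series at a finite degree, which is an approximation and not Greg's method: the true dragon uses the full series, and the truncation error at each point is at most |z|^(n+1)/(1 - |z|) for degree n. The function name and the choice of degree 11 are arbitrary.

```python
from itertools import product

def truncated_dragon(z, degree):
    """All sums c_0 + c_1 z + ... + c_degree z^degree with each c_k = +1 or -1."""
    powers = [z ** k for k in range(degree + 1)]
    return [sum(c * p for c, p in zip(coeffs, powers))
            for coeffs in product((1, -1), repeat=degree + 1)]

z = complex(-0.0572, 0.72229)     # the number quoted earlier in this post
points = truncated_dragon(z, 11)  # 2**12 = 4096 points

# Geometric series bound: every point lies within 1/(1 - |z|) of the origin,
# which is why the construction only works inside the unit circle.
radius = 1 / (1 - abs(z))
print(len(points), max(abs(p) for p in points), radius)
```

Plotting the real and imaginary parts of `points` with any plotting tool should give a coarse picture of the dragon for this z.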

I’m writing a paper about this stuff with Dan Christensen and Sam Derbyshire… that’s why I’m not giving a very careful explanation now. We invited Greg Egan to join us, but he’s too busy writing the third volume of his trilogy Orthogonal.


The Best Climate Scientists

4 January, 2012

A physicist friend asks if there is someone in climate science who has made progress significant enough to deserve a Nobel Prize. It’s an interesting question. Any such prize would be amazingly controversial, but let’s shelve that and ask: who are the best climate scientists, the ones who have made truly dramatic progress?

Arrhenius is no longer with us, so he’s out.


Azimuth on Google Plus (Part 5)

1 January, 2012

Happy New Year! I’m back from Laos. Here are seven items, mostly from the Azimuth Circle on Google Plus:

1) Phil Libin is the boss of a Silicon Valley startup. When he’s off travelling, he uses a telepresence robot to keep an eye on things. It looks like a stick figure on wheels. Its bulbous head has two eyes, which are actually a camera and a laser. On its forehead is a screen, where you can see Libin’s face. It’s made by a company called Anybots, and it costs just $15,000.


I predict that within my life we’ll be using things like this to radically cut travel costs and carbon emissions for business and for conferences. It seems weird now, but so did telephones. Future models will be better to look at. But let’s try it soon!

• Laura Sydell, No excuses: robots put you in two places at once, Weekend Edition Saturday, 31 December 2011.

Bruce Bartlett and I are already planning for me to use telepresence to give a lecture on mathematics and the environment at Stellenbosch University in South Africa. But we’d been planning to use old-fashioned videoconferencing technology.

Anybots is located in Mountain View, California. That’s near Google’s main campus. Can anyone help me set up a talk on energy and the environment at Google, where I use an Anybot?

(Or, for that matter, anywhere else around there?)

2) A study claims to have found a correlation between weather and the day of the week! The claim is that there are more tornados and hailstorms in the eastern USA during weekdays. One possible mechanism could be that aerosols from car exhaust help seed clouds.


I make no claims that this study is correct. But at the very least, it would be interesting to examine their use of statistics and see if it’s convincing or flawed:

• Thomas Bell and Daniel Rosenfeld, Why do tornados and hailstorms rest on weekends?, Journal of Geophysical Research 116 (2011), D20211.

Abstract. This study shows for the first time statistical evidence that when anthropogenic aerosols over the eastern United States during summertime are at their weekly mid-week peak, tornado and hailstorm activity there is also near its weekly maximum. The weekly cycle in summertime storm activity for 1995–2009 was found to be statistically significant and unlikely to be due to natural variability. It correlates well with previously observed weekly cycles of other measures of storm activity. The pattern of variability supports the hypothesis that air pollution aerosols invigorate deep convective clouds in a moist, unstable atmosphere, to the extent of inducing production of large hailstones and tornados. This is caused by the effect of aerosols on cloud drop nucleation, making cloud drops smaller and hydrometeors larger. According to simulations, the larger ice hydrometeors contribute to more hail. The reduced evaporation from the larger hydrometeors produces weaker cold pools. Simulations have shown that too cold and fast-expanding pools inhibit the formation of tornados. The statistical observations suggest that this might be the mechanism by which the weekly modulation in pollution aerosols is causing the weekly cycle in severe convective storms during summer over the eastern United States. Although we focus here on the role of aerosols, they are not a primary atmospheric driver of tornados and hailstorms but rather modulate them in certain conditions.

Here’s a discussion of it:

• Bob Yirka, New research may explain why serious thunderstorms and tornados are less prevalent on the weekends, PhysOrg, 22 December 2011.

3) And if you like to check how people use statistics, here’s a paper that would be incredibly important if its findings were correct:

• Joseph J. Mangano and Janette D. Sherman, An unexpected mortality increase in the United States follows arrival of the radioactive plume from Fukushima: is there a correlation?, International Journal of Health Services 42 (2012), 47–64.

The title has a question mark in it, but it’s been cited in very dramatic terms in many places, for example this video entitled “Peer reviewed study shows 14,000 U.S. deaths from Fukushima”:

Starting at 1:31 you’ll see an interview with one of the paper’s authors, Janette Sherman.

14,000 deaths in the US due to Fukushima? Wow! How did they get that figure? This quote from the paper explains how:

During weeks 12 to 25 [after the Fukushima disaster began], total deaths in 119 U.S. cities increased from 148,395 (2010) to 155,015 (2011), or 4.46 percent. This was nearly double the 2.34 percent rise in total deaths (142,006 to 145,324) in 104 cities for the prior 14 weeks, significant at p < 0.000001 (Table 2). This difference between actual and expected changes of +2.12 percentage points (+4.46% – 2.34%) translates to 3,286 “excess” deaths (155,015 × 0.0212) nationwide. Assuming a total of 2,450,000 U.S. deaths will occur in 2011 (47,115 per week), then 23.5 percent of deaths are reported (155,015/14 = 11,073, or 23.5% of 47,115). Dividing 3,286 by 23.5 percent yields a projected 13,983 excess U.S. deaths in weeks 12 to 25 of 2011.

Hmm. Can you think of some potential problems with this analysis?
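To make it easier to poke at, here is the arithmetic from the quote retraced step by step. All the numbers come from the quoted passage; the variable names are mine, and the rounding at each step is an assumption chosen to match the figures the paper reports.

```python
def pct_rise(before, after):
    """Percentage increase, rounded to 2 places as in the quoted passage."""
    return round(100 * (after - before) / before, 2)

rise_after = pct_rise(148_395, 155_015)    # weeks 12 to 25, 119 cities
rise_before = pct_rise(142_006, 145_324)   # prior 14 weeks, 104 cities
gap = round(rise_after - rise_before, 2)   # difference in percentage points

# "Excess" deaths in the sampled cities.
excess_sample = round(155_015 * gap / 100)

# Fraction of all U.S. deaths that the sampled cities report.
weekly_us = 2_450_000 / 52                       # assumed U.S. deaths per week
coverage = round((155_015 / 14) / weekly_us, 3)

# Scale the sample excess up to the whole country.
projected = round(excess_sample / coverage)

print(rise_after, rise_before, gap, excess_sample, coverage, projected)
```

Laid out this way, it is easy to see that the final figure depends entirely on the 2.12 percentage point gap, and that the two rises being compared come from different sets of cities (119 versus 104).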

In the interview, Janette Sherman also mentions increased death rates of children in British Columbia. Here’s the evidence the paper presents for that:

Shortly after the report [another paper by the authors] was issued, officials from British Columbia, Canada, proximate to the northwestern United States, announced that 21 residents had died of sudden infant death syndrome (SIDS) in the first half of 2011, compared with 16 SIDS deaths in all of the prior year. Moreover, the number of deaths from SIDS rose from 1 to 10 in the months of March, April, May, and June 2011, after Fukushima fallout arrived, compared with the same period in 2010. While officials could not offer any explanation for the abrupt increase, it coincides with our findings in the Pacific Northwest.

4) For the first time in 87 years, a wild gray wolf was spotted in California:

• Stephen Messenger, First gray wolf in 80 years enters California, Treehugger, 29 December 2011.

Researchers have been tracking this juvenile male using a GPS-enabled collar since it departed northern Oregon. In just a few weeks, it walked some 730 miles to California. It was last seen surfing off Malibu. Here is a photograph:

5) George Musser left the Centre for Quantum Technologies and returned to New Jersey, but not before writing a nice blog article explaining how the GRACE satellite uses the Earth’s gravitational field to measure the melting of glaciers:

• George Musser, Melting glaciers muck up Earth’s gravitational field, Scientific American, 22 December 2011.

6) The American Physical Society has started a new group: a Topical Group on the Physics of Climate! If you’re a member of the APS, and care about climate issues, you should join this.

7) Finally, here’s a cool picture taken in the Gulf of Alaska by Kent Smith:

He believes this was caused by fresher water meeting saltier water, but it doesn’t sound like he’s sure. Can anyone figure out what’s going on? The foam where the waters meet is especially intriguing.

