Mathematics of the Environment (Part 4)

22 October, 2012

We’ve been looking at some very simple models of the Earth’s climate. Pretty soon I want to show you one that illustrates the ice albedo effect. This effect says that when it’s colder, there’s more ice and snow, so the Earth gets lighter in color, so it reflects more sunlight and tends to get even colder. In other words, it’s a positive feedback mechanism: a reaction that strengthens the process that caused the reaction.

According to the Planck distribution, a warmer Earth radiates more power and therefore cools faster, while a cooler Earth radiates less. So there is always a negative feedback present in the Earth’s climate system. This is dubbed the Planck feedback, and it is what ultimately protects the Earth against getting arbitrarily hot or cold.

However, the ice albedo effect may be important for the ‘ice ages’ or more properly ‘glacial cycles’ that we’ve been having for the last few tens of millions of years… and also for much earlier, much colder Snowball Earth events. Conversely, melting ice now tends to make the Earth darker and thus even warmer. So, this is an interesting topic for many reasons… including the math, which we’ll get to later.

Now, obviously the dinosaurs did not keep records of the temperature, so how we estimate temperatures on the ancient Earth is an important question, which deserves a long discussion—but not today! Today I’ll be fairly sketchy about that. I just want you to get a feel for the overall story, and some open questions.

The Earth’s temperature since the last glacial period

First, here’s a graph of Greenland temperatures over the last 18,000 years:

(As usual, click to enlarge and/or get more information.) This chart is based on ice cores, taken from:

• Richard B. Alley, The Two-Mile Time Machine: Ice Cores, Abrupt Climate Change, and our Future, Princeton U. Press, Princeton, 2002.

This is a good book for learning how people reconstruct the
history of temperatures in Greenland from looking at a two-mile-long ice core drilled out of the glaciers there.

As you can see, first Greenland was very cold, and then it warmed up at the end of the last ‘ice age’, or glacial period. But there are a lot of other things to see in this graph. For example, there was a severe cold spell between 12.9 and 11.5 thousand years ago: the Younger Dryas event.

I love that name! It comes from the tough little Arctic flower
Dryas octopetala, whose plentiful pollen in certain ice samples gave evidence that this time period was chilly. Was there an Older Dryas? Yes: before the Younger Dryas there was a warm spell called the Allerød, and before that a cold period called the Older Dryas.

The Younger Dryas lasted about 1400 years. Temperatures dropped dramatically in Europe: about 7 °C in only 20 years! In Greenland, it was 15 °C colder during the Younger Dryas than today. In England, the average annual temperature was -5 °C, so glaciers started forming. We can see evidence of this event from oxygen isotope records and many other things.

Why the sudden chill? One popular theory is that the melting of the ice sheet on North America lowered the salinity of North Atlantic waters. This in turn blocked a current called the
Atlantic meridional overturning circulation, or AMOC for short, which normally brings warm water up the coast of Europe. Proponents of this theory argue that this current is what makes London much warmer than, say, Winnipeg in Canada or Irkutsk in Russia. Turn it off and—wham!—you’ll get glaciers forming in England.

Anyway, whatever caused it, the Younger Dryas ended as suddenly as it began, with temperatures jumping 7 °C. Since then, the Earth continued warming up until about 6 thousand years ago—the mid-Holocene thermal maximum. The Earth was about 1° or 2° Celsius warmer than today. Since then, it’s basically been cooling off—not counting various smaller variations, like the global warming we’re experiencing in this century.

However, these smaller variations are very interesting! From 6000 to 2500 years ago things cooled down, with the coolest
stretch occurring between 4000 and 2500 years ago: the Iron Age Cold Epoch.

Then things warmed up for a while, and then they cooled down
from 500 to 1000 AD. Yes, the so-called "Dark Ages" were also chilly!

After this came the Medieval Warm Period, a period from about 1000 to 1300 AD:

From 1450 AD to 1890 there was a period of cooling, often called the Little Ice Age. This killed off the Icelandic colonies in Greenland, as described in this gripping book:

• Jane Smiley, The Greenlanders, Ballantine Books, New York, 1996.

However, the term "Little Ice Age" exaggerates the importance of a small blip in the grand scheme of things. It was nowhere near as big as the Younger Dryas: temperatures may have dropped a measly 0.2° Celsius from the Medieval optimum, and it may have happened only in Europe—though this was a subject of debate when I last checked.

Since then, things have been warming up:

The subject has big political implications, and is thus subject to enormous controversy. But, I think it’s quite safe to say that we’ve been seeing a rapid temperature rise since 1900, with the Northern Hemisphere average temperature rising roughly 1 °C since then. Each of the last 11 years, from 2001 to 2011, was one of the 12 warmest years since 1901. (The other one was 1998.)

All these recent variations in the Earth’s climate are very much worth trying to understand. But now let’s back off to longer time periods! We don’t have many Earth-like planets whose climate we can study in detail—at least not yet, since they’re too far away. But we do have one planet, the Earth, that’s gone through many changes. The climate since the end of the last ice age is just a tiny sliver of a long and exciting story!

The Earth’s long-term climate history

Here’s a nice old chart showing estimates of the Earth’s average temperature in the last 150 years, the last 16,000 years, the last 150,000 years and the last million years:


Here “ka” or “kilo-annum” means a thousand years. These temperatures are estimated by various methods; I got this chart from:

• Barry Saltzman, Dynamical Paleoclimatology: Generalized Theory of Global Climate Change, Academic Press, New York, 2002, fig. 3-4.

As we keep zooming in towards the present we keep seeing more detail:

• Over the last million years there have been about ten glacial periods—though trying to count them is a bit like trying to count ‘very deep valleys’ in a hilly landscape!

• From 150 to 120 thousand years ago it warmed up rather rapidly. From 120 thousand years ago to 16 thousand years ago it cooled down—that was the last glacial period. Then it warmed up rather rapidly again.

• Over the last 10 thousand years temperatures have been unusually constant.

• Over the last 150 years it’s been warming up slightly.

If we go back further, say to 5 million years, we see that temperatures have been colder but also more erratic during this period:

This figure is based on this paper:

• L. E. Lisiecki and M. E. Raymo, A Pliocene-Pleistocene stack of 57 globally distributed benthic δ18O records, Paleoceanography 20 (2005), PA1003.

Lisiecki and Raymo combined measurements of oxygen isotopes in the shells of tiny sea creatures called foraminifera from 57 globally distributed deep sea sediment cores. But beware: they constructed this record by first applying a computer-aided process to align the data in each sediment core. Then the resulting stacked record was tuned to make the positions of peaks and valleys match the known Milankovitch cycles in the Earth’s orbit. The temperature scale was chosen to match Vostok ice core data. So, there are a lot of theoretical assumptions built into this graph.

Going back 65 million years, we see how unusual the current glacial cycles are:


Click to make this graph bigger; it’s from:

• Robert Rohde, 65 million years of climate change, at Global Warming Art.

This graph shows the Earth’s temperature since the extinction of the dinosaurs about 65 million years ago—the end of the Mesozoic and beginning of the Cenozoic. At first the Earth warmed up, reaching its warmest 50 million years ago: the "Eocene Optimum". The spike before that labelled "PETM" is a fascinating event called the Paleocene-Eocene Thermal Maximum. At the end of the Eocene the Earth cooled rapidly and the Antarctic acquired year-round ice. After a warming spell near the end of the Oligocene, further cooling and an increasingly jittery climate led ultimately to the current age of rapid glacial cycles.

Why is the Earth’s climate so jittery nowadays? That’s a fascinating puzzle, which I’d like to discuss in the weeks to come.

Why did the Earth suddenly cool at the end of the Eocene 34 million years ago? One theory relies on the fact that this is when Antarctica first became separated from Australia and South America. After the Tasmanian Gateway between Australia and Antarctica opened, the only thing that kept water from swirling endlessly around Antarctica, getting colder and colder, was the connection between this continent and South America. South America seems to have separated from Antarctica around the end of the Eocene.

In the early Eocene, Antarctica was fringed with a warm temperate to sub-tropical rainforest. But as the Eocene progressed it became colder, and by the start of the Oligocene it had deciduous forests and vast stretches of tundra. Eventually it became almost completely covered with ice.

Thanks to the ice albedo effect, an icy Antarctic tends to keep the Earth cooler. But is that the only or even the main explanation of the overall cooling trend over the last 30 million years? Scientists argue about this.

Going back further:

Here "Ma" or "mega-annum" means "million years". This chart was drawn from many sources; I got it from:

• Barry Saltzman, Dynamical Paleoclimatology: Generalized Theory of Global Climate Change, Academic Press, New York, 2002, fig. 1-3.

Among other things on this chart, you can sort of see hints of the Snowball Earth events that may have happened early in the Earth’s history. These are thought to have occurred during the Cryogenian period 850 to 635 million years ago, and also during the Huronian glaciation 2400 to 2100 million years ago. In both these events a large portion of the Earth was frozen—much more, it seems, than in the recent glacial periods! Ice albedo feedback plays a big role in theories of these events… though also, of course, there must be some explanation of why they ended.

As you can see, there are a lot of things a really universal climate model might seek to explain. We don’t necessarily need to understand the whole Earth’s history to model it well now, but thinking about other eras is a good way to check our understanding of the present-day Earth.


Mathematics for Sustainability (Part 1)

21 October, 2012

guest post by John Roe

This year, I want to develop a new math course. Nothing surprising in that—it is what math professors do all the time! But usually, when we dream of new courses, we are thinking of small classes of eager graduate students to whom we can explain the latest research ideas. Here, I’m after something a bit different.

The goal will be, through a General Education Mathematics course, to enable students to develop the quantitative and qualitative skills needed to reason effectively about environmental and economic sustainability. That’s a lot of long words! Let me unpack a bit:

General Education Mathematics At most universities (including Penn State University, where I teach), every student, whatever their major, has to take one or two “quantitative” courses – this is called the “general education” requirement. I want to reach out to students who are not planning to be mathematicians or scientists, students for whom this may be the last math course they ever take.

quantitative and qualitative skills I want students to be able to work with numbers (“quantitative”)—to be able to get a feeling for scale and size, whether we’re talking about gigatonnes of carbon dioxide, kilowatts of domestic power, or picograms of radioisotopes. But I also want them to get an intuition for the behavior of systems (qualitative), so that the ideas of growth, feedback, oscillation, overshoot and so on become part of their conceptual vocabulary.

to reason effectively A transition to a more sustainable society won’t come about without robust public debate—I want to help students engage effectively in this debate. Shamelessly stealing ideas from Andrew Read’s Science in Our World course, I hope to do this by using an online platform for student presentations. Engaging with this process (which includes commenting on other people’s presentations as well as devising your own) will count seriously in the grading scheme.

environmental and economic sustainability I’d like students to get the idea that there are lots of scales on which one can ask the sustainability question – both time scales (how many years is “sustainable”) and spatial scales. We’ll think about global-scale questions (carbon dioxide emissions being an obvious example) but we’ll try to look at as many examples as possible on a local scale (a single building, the Penn State campus, local agriculture) so that we can engage more directly.

I have been thinking about this plan for a year or more but now it’s time to put it into action. I’ve been in touch with my department head and got a green light to offer this for the first time in Spring 2014. In future posts I will share some more about the structure of the course as it develops. Meanwhile, if anyone has some good suggestions, let me know!


Insanely Long Proofs

19 October, 2012

 

There are theorems whose shortest proof is insanely long. In 1936 Kurt Gödel published an abstract called “On the length of proofs”, which makes essentially this claim.

But what does ‘insanely long’ mean?

To get warmed up, let’s talk about some long proofs.

Long proofs

You’ve surely heard of the quadratic formula, which lets you solve

a x^2 + b x + c  = 0

with the help of a square root:

\displaystyle{ x = \frac{-b \pm \sqrt{b^2 - 4 a c}}{2 a} }

There’s a similar but more complicated ‘cubic formula’ that lets you solve cubic equations, like this:

a x^3 + b x^2 + c x + d = 0

The cubic formula involves both square roots and cube roots. There’s also a ‘quartic formula’ for equations of degree 4, like this:

a x^4 + b x^3 + c x^2 + d x + e = 0

The quartic formula is so long that I can only show it to you if I shrink it by an absurd amount:

(Click repeatedly to enlarge.) But again, it only involves addition, subtraction, multiplication, division and taking roots.

In 1799, Paolo Ruffini proved that there was no general solution using radicals for polynomial equations of degree 5 or more. But his proof was 500 pages long! As a result, it was “mostly ignored”. I’m not sure what that means, exactly. Did most people ignore it completely? Or did everyone ignore most of it? Anyway, his argument seems to have a gap… and later Niels Abel gave a proof that was just 6 pages long, so most people give the lion’s share of credit to Abel.

Jumping ahead quite a lot, the longest proof written up in journals is the classification of finite simple groups. This was done by lots of people in lots of papers, and the total length is somewhere between 10,000 and 20,000 pages… nobody is exactly sure! People are trying to simplify it and rewrite it. The new proof will be a mere 5,000 pages long. So far six books have been written as part of this project.

Even when it’s all in one book, how can we be sure such a long proof is right? Some people want to use computers to make the argument completely rigorous, filling in all the details with the help of a program called a ‘proof assistant’.

The French are so sexy that even their proof assistant sounds dirty: it’s called Coq. Recently mathematicians led by Georges Gonthier have used Coq to give a completely detailed proof of a small piece of the classification of finite simple groups: the Feit–Thompson Theorem. Feit and Thompson’s original proof of this result, skipping lots of steps that are obvious to experts, took 255 pages!

What does the Feit–Thompson theorem say? Every finite group with an odd number of elements is solvable! Explaining that statement might take quite a while, depending on what you know about math. So let me just say this:

Galois invented group theory and used it to go further than Abel and Ruffini had. He showed a bunch of specific polynomial equations couldn’t be solved just using addition, subtraction, multiplication, division and taking roots. For example, those operations aren’t powerful enough to solve this equation:

x^5 - x + 1 = 0

while they can solve this one:

x^5 - x = 0

Galois showed that every polynomial equation has a group of symmetries, and you can solve the equation using addition, subtraction, multiplication, division and taking roots if its group has a certain special property. So, this property of a group got the name ‘solvability’.
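If you have a computer algebra system handy, you can see this difference concretely. Here’s a minimal sketch using SymPy (my addition, using the two quintics above): for the solvable equation the roots come out in closed form, while for the unsolvable one SymPy can only return implicit ‘CRootOf’ placeholders rather than expressions in radicals, at least in the versions I’ve tried.

```python
# A minimal sketch (not from the original post) contrasting a quintic that is
# solvable by radicals with one that isn't, using SymPy.
from sympy import symbols, solve

x = symbols('x')

# x^5 - x = 0 factors as x(x - 1)(x + 1)(x^2 + 1), so its roots 0, ±1, ±i
# come out in closed form:
print(solve(x**5 - x, x))

# x^5 - x + 1 = 0 has an unsolvable Galois group, so SymPy returns symbolic
# CRootOf objects (implicitly defined roots) instead of radicals:
print(solve(x**5 - x + 1, x))
```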

Every finite group with an odd number of elements is solvable. It’s quick to say, but hard to show—so hard that making the proof fully rigorous on a computer seemed out of reach at first:

When Gonthier first suggested a formal Feit-Thompson Theorem proof, his fellow members of the Mathematical Components team at Inria could hardly believe their ears.

“The reaction of the team the first time we had a meeting and I exposed a grand plan,” he recalls ruefully, “was that I had delusions of grandeur. But the real reason of having this project was to understand how to build all these theories, how to make them fit together, and to validate all of this by carrying out a proof that was clearly deemed to be out of reach at the time we started the project.”

It took them 6 years! The completed computer proof has 170,000 lines. It involves 15,000 definitions and 4,300 lemmas. Maybe now Gonthier is dreaming of computerizing the whole classification of finite simple groups.

But there are even longer proofs of important results in math. These longer proofs all involve computer calculations. For example, Thomas Hales seems to have proved that the densest packing of spheres in 3d space is the obvious one, like this:

(There are infinitely many other equally dense packings, as I explained earlier, but none denser.)

Hales’ proof is a hundred pages of writing together with about 3 gigabytes of computer calculations. If we wrote out those calculations in a text file, they’d fill about 2 million pages!

The method is called ‘proof by exhaustion’, because it involves reducing the problem to 10,000 special cases and then settling each one with detailed calculations… thus exhausting anybody who tries to check the proof by hand. Now Hales is trying to create a fully formal proof that can be verified by automated proof checking software such as HOL. He expects that doing this will take 20 years.

These proofs are long, but they’re not what I’d call insanely long.

Insanely long proofs

What do I mean by ‘insanely’ long? Well, for example, I know a theorem that you can prove using the usual axioms of arithmetic—Peano arithmetic—but whose shortest proof using those axioms contains

10^{1000}

symbols. This is so many symbols that you couldn’t write them all down if you wrote one symbol on each proton, neutron and electron in the observable Universe!

I also know a theorem whose shortest proof in Peano arithmetic contains

10^{10^{1000}}

symbols. This is so many that if you tried to write down the number of symbols—not the symbols themselves—in ordinary decimal notation, you couldn’t do it if you wrote one digit on each proton, neutron and electron in the observable Universe!

I also know a theorem whose shortest proof in Peano arithmetic contains…

… well, you get the idea.

By the way, if you don’t know what Peano arithmetic is, don’t worry. It’s just a list of obvious axioms about arithmetic, together with some obvious rules for reasoning, which turn out to be good enough to prove most everyday theorems about arithmetic. The main reason I mention it is that we need to pick some particular setup before we can talk about the ‘shortest proof’ of a theorem and have it be well-defined.

Also by the way, I’m assuming Peano arithmetic is consistent. If it were inconsistent, you could prove 0=1, and then everything would follow quite quickly from that. But most people feel sure it’s consistent.

If it is, then the shortest proof using Peano arithmetic of the following theorem contains at least 10^{1000} symbols:

This statement has no proof in Peano arithmetic that contains fewer than 10^{1000} symbols.

Huh? This doesn’t look like a statement about arithmetic! But Gödel showed how you could encode the concept of ‘statement’ and ‘proof’ into arithmetic, and use this to create statements that refer to themselves. He’s most famous for this one:

This statement has no proof in Peano arithmetic.

It turns out that this statement is true… so it’s an example of a true statement about arithmetic that can’t be proved using Peano arithmetic! This is Gödel’s first incompleteness theorem.

This variation is similar:

This statement has no proof in Peano arithmetic that contains fewer than 10^{1000} symbols.

This statement is also true. It’s provable in Peano arithmetic, but its proof contains at least 10^{1000} symbols.

Even longer proofs

But this is just the beginning. If Peano arithmetic is consistent, there are infinitely many theorems whose shortest proof is longer than 10 to the 10 to the 10 to the 10 to the… where the stack of 10’s is as long as the statement of the theorem.

Indeed, take any computable function, say F—growing as fast as you like. Then if Peano arithmetic is consistent, there are infinitely many theorems whose shortest proof is longer than this function of the length of the theorem!

But how do we know this? Here’s how. Using Gödel’s clever ideas, we can create a statement in arithmetic that says:

This statement has no proof in Peano arithmetic that is shorter than F of the length of this statement.

Let’s call this statement P.

We’ll show that if Peano arithmetic is consistent, then P is provable in Peano arithmetic. So, according to what P itself says, P is an example of a statement whose shortest proof is insanely long!

Now, let me sketch why if Peano arithmetic is consistent then P is provable in this setup. To save time, I’ll use N to stand for ‘F of the length of P’. So, P says:

P has no proof in Peano arithmetic whose length is less than N.

Assume there were a proof of P in Peano arithmetic whose length is less than N. Then, thanks to what P actually says, P would be false.

Moreover, we could carry out the argument I just gave within Peano arithmetic. So, within Peano arithmetic, we could disprove P.

But wait a minute—this would mean that within Peano arithmetic we could both prove and disprove P! We’re assuming Peano arithmetic is consistent, so this is impossible.

So our assumption must have been wrong. P must have no proof in Peano arithmetic whose length is less than N.

But this is what P says! So P is true!

Even better, P is provable in Peano arithmetic. Why? Because we can just go through all proofs shorter than N, and check that they’re not proofs of P… we know they won’t be… and this will constitute a proof that:

P has no proof in Peano arithmetic whose length is less than N.

But this is just what P says! So this is a way to prove P. Moreover we can carry out this long-winded check within Peano arithmetic, and get a proof of P in Peano arithmetic!

Of course, this proof has length more than N.

By the way, here I’m using the fact that there are only finitely many proofs with length less than N, so we can go through them all. We have to define ‘length’ in a way that makes this true, for my argument to work. For example, we can take the length to be the number of symbols.

Also, it’s important that our function F be computable. We need to compute N to know how many proofs we need to go through.
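Just to make the brute-force search vivid, here is a toy sketch in Python. The proof checker and the encoding of P are purely hypothetical stand-ins (writing a real one is a serious project), but the point is only that once N is a concrete number, the search is a finite, if absurdly long, computation:

```python
# Toy sketch of the finite search described above. The helper is_proof_of is a
# hypothetical stand-in for a real checker of Peano arithmetic proofs.
from itertools import product

SYMBOLS = ['0', 'S', '+', '*', '=', '(', ')', 'x', '~', '->', 'A']  # some finite alphabet

def is_proof_of(candidate, statement):
    """Hypothetical: True iff `candidate` encodes a valid PA proof of `statement`."""
    raise NotImplementedError  # a real proof checker would go here

def has_no_short_proof(statement, N):
    """Check every candidate proof of length < N; finite since the alphabet is finite."""
    for length in range(N):
        for candidate in product(SYMBOLS, repeat=length):
            if is_proof_of(''.join(candidate), statement):
                return False
    return True

# Carrying out this check for the statement P, *inside* Peano arithmetic,
# is what yields the insanely long proof of P itself.
```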

The upshot: if Peano arithmetic is consistent, there’s a provable statement whose shortest proof in Peano arithmetic is insanely long, by any computable standard of what counts as ‘insanely long’.

Now, so far we’ve just gotten one theorem with an insanely long proof. But we can get infinitely many, one for each natural number n, as follows. Let P(n) be a statement in arithmetic that says:

This statement has no proof in Peano arithmetic whose length is less than n plus F of the length of this statement.

The same argument I just sketched shows that if Peano arithmetic is consistent then P(n) is provable, but has no proof shorter than n plus F of the length of P(n). The statements P(n) are different for different values of n. So, we get infinitely many different statements with insanely long proofs… at least if Peano arithmetic is consistent, which most people believe.

Proof speedup

But wait a minute! If we’ve proved all these statements P(n) have proofs, then we’ve basically proved them—no? And my argument, though sketchy, was quite short. So, even a completely detailed version of my argument should not be ‘insanely long’. Doesn’t that contradict my claim that the shortest proofs of these statements are insanely long?

Well, no. I only showed that the shortest proof of P(n) using Peano arithmetic is insanely long. I did provide a short proof of P(n). But I did this assuming Peano arithmetic is consistent!

So I didn’t give a short proof of P(n) using Peano arithmetic. I gave a short proof using Peano arithmetic plus the assumption that Peano arithmetic is consistent!

So, if we add to Peano arithmetic an extra axiom saying ‘Peano arithmetic is consistent’, infinitely many theorems get vastly shorter proofs!

This is often called Gödel’s speedup theorem. For more, see:

• Gödel’s speedup theorem, Wikipedia.

Connections

It’s pretty cool how just knowing Peano arithmetic was consistent would let us vastly shorten infinitely many proofs.

As an instant consequence, we get Gödel’s second incompleteness theorem: it’s impossible to use Peano arithmetic to prove its own consistency. For if we could, adding that consistency as an extra axiom couldn’t shorten proofs by an arbitrarily large amount. It could only shorten proofs by a fixed finite amount: roughly, the number of symbols in the proof that Peano arithmetic is consistent.

And while we’re at it, let’s note how the existence of insanely long proofs is related to another famous result: the Church–Turing theorem. This says it’s impossible to write a computer program that can decide in a finite time, yes or no, whether any given statement is provable in Peano arithmetic.

Suppose that in Peano arithmetic there were a computable upper bound on the length of the shortest proof of any provable statement, as a function of the length of that statement. Then we could write a program that would go through all potential proofs up to that bound and tell us, in a finite amount of time, whether any given statement had a proof. So, Peano arithmetic would be decidable!

But since it’s not decidable, no such computable upper bound can exist. This is another way of seeing that there must be theorems with insanely long proofs.

So, the existence of insanely long proofs is tightly connected to the inability of Peano arithmetic to prove itself consistent, and also the undecidability of Peano arithmetic. And these are features not just of Peano arithmetic, but of any system of arithmetic that’s sufficiently powerful, yet simple enough that we can write a program to check to see if a purported proof really is a proof.

Buss’ speedup theorem

In fact Gödel stated his result in a more sophisticated way than I did. He never published a proof… but not because the proof is insanely long, probably just because he was a busy man with many ideas. Samuel Buss gave a proof in 1994:

• Samuel R. Buss, On Gödel’s theorems on lengths of proofs I: Number of lines and speedups for arithmetic, Journal of Symbolic Logic 59 (1994), 737–756.

In first-order arithmetic you can use variables like x,y,z to stand for natural numbers like 0,1,2,3,… This is the kind of arithmetic I’ve been talking about so far: Peano arithmetic is an example. In second-order arithmetic you can also use variables of a different kind to stand for sets of natural numbers. Third-order arithmetic goes further and lets you use variables for sets of sets of natural numbers, and so on.

Gödel claimed, and Buss showed, that each time you go up an order, some statements that were already provable get insanely shorter proofs. So, turning this fact around, there are theorems that have insanely long proofs in first-order arithmetic!

(And similarly for second-order arithmetic, and so on…)

For more details, explained in a fairly friendly way, try:

• Speed-up theorems, Stanford Encyclopedia of Philosophy.

By the way, this post is a kind of followup to my post on enormous integers. Insanely long proofs are related to enormous integers: if you know a quick way to describe an enormous integer, you can use the trick I described to cook up a theorem with an enormous proof.

Last time we saw that the logician Harvey Friedman is a master of enormous integers. So it’s not surprising to me that Wikipedia says:

Harvey Friedman found some explicit natural examples of this phenomenon, giving some explicit statements in Peano arithmetic and other formal systems whose shortest proofs are ridiculously long.

However, I don’t know these statements. I assume they’re more natural than the weird self-referential examples I’ve described. What are they?


Mathematics of the Environment (Part 3)

13 October, 2012

This week I’ll release these notes before my seminar, so my students (and all of you, too) can read them ahead of time. The reason is that I’m pushing into topics I don’t understand as well as I’d like. So, my notes wrestle with some ideas in too much detail to cover in class—and I’m hoping some students will look at these notes ahead of time to prepare. Also, I’d appreciate your comments!

This week I’ll borrow material shamelessly from here:

• Seymour L. Hess, Introduction to Theoretical Meteorology, Henry Holt and Company, New York, 1959.

It’s an old book: for example, it talks about working out the area under a curve using a gadget called a planimeter, which is what people did before computers.

It also talks about how people measured the solar constant (roughly, the brightness of the Sun) before we could easily put satellites up above the Earth’s atmosphere! And it doesn’t mention global warming.

But despite or perhaps even because of these quaint features, it’s simple and clear. In case it’s not obvious yet, I’m teaching this quarter’s seminar in order to learn stuff. So, I’ll sometimes talk about old work… but if you catch me saying things that are seriously wrong (as opposed to merely primitive), please let me know.

The plan

Last time we considered a simple model Earth, a blackbody at uniform constant temperature absorbing sunlight and re-emitting the same amount of power in the form of blackbody radiation. We worked out that its temperature would be 6 °C, which is not bad. But then we took into account the fact that the Earth is not black. We got a temperature of -18 °C, which is too cold. The reason is that we haven’t yet equipped our model Earth with an atmosphere! So today let’s try that.
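Before adding an atmosphere, here’s a quick sketch of last week’s computation, just so we have numbers to compare against later. The solar constant of about 1370 W/m² and the albedo of 0.3 are my assumed inputs; the formula is the balance (1 - albedo) S/4 = σT⁴ from last time:

```python
# A minimal sketch of last week's zero-dimensional balance, with assumed inputs:
# solar constant S ≈ 1370 W/m² and albedo ≈ 0.3.
SIGMA = 5.67e-8   # Stefan–Boltzmann constant, W/(m^2 K^4)
S     = 1370.0    # solar constant, W/m^2

def equilibrium_temperature(albedo):
    """Solve (1 - albedo) * S / 4 = SIGMA * T**4 for T."""
    return ((1 - albedo) * S / (4 * SIGMA)) ** 0.25

print(equilibrium_temperature(0.0) - 273.15)   # about  6 °C: perfectly black Earth
print(equilibrium_temperature(0.3) - 273.15)   # about -18 °C: Earth with albedo 0.3
```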

At this point things get a lot more complicated, even if we try a 1-dimensional model where the temperature, pressure and other features of the atmosphere only depend on altitude. So, I’ll only do what I can easily do. I’ll explain some basic laws governing radiation, and then sketch how people applied them to the Earth.

It’ll be good to start with a comment about what we did last time.

Kirchhoff’s law of radiation

When we admitted the Earth wasn’t black, we said that it absorbed only about 70% of the radiation hitting it… but we still modeled it as emitting radiation just like a blackbody! Isn’t there something fishy about this?

Well, no. The Earth is mainly absorbing sunlight at visible frequencies, and at these frequencies it only absorbs about 70% of the radiation that hits it. But it mainly emits infrared light, and at these frequencies it acts like it’s almost black. These frequencies are almost completely different from those where absorption occurs.

But still, this issue is worth thinking about.

After all, emission and absorption are flip sides of the same coin. There’s a deep principle in physics, called reciprocity, which says that how X affects Y is not a separate question from how Y affects X. In fact, if you know the answer to one of these questions, you can figure out the answer to the other!

The first place most people see this principle is Newton’s third law of classical mechanics, saying that if X exerts a force on Y, Y exerts an equal and opposite force on X.

For example: if I punched your nose, your nose punched my fist just as hard, so you have no right to complain.

This law is still often stated in its old-fashioned form:

For every action there is an equal and opposite reaction.

I found this confusing as a student, because ‘force’ was part of the formal terminology of classical mechanics, but not ‘action’—at least not as used in this sentence!—and certainly not ‘reaction’. But as a statement of the basic intuition behind reciprocity, it’s got a certain charm.

In engineering, the principle of reciprocity is sometimes stated like this:

Reciprocity in linear systems is the principle that the response R_{ab} measured at a location a when the system is excited at a location b is exactly equal to R_{ba}, which is the response at location b when that same excitation is applied at a. This applies for all frequencies of the excitation.

Again this is a bit confusing, at least if you’re a mathematician who would like to know exactly how a ‘response’ or an ‘excitation’ is defined. It’s also disappointing to see the principle stated in a way that limits it to linear systems. Nonetheless it’s tremendously inspiring. What’s really going on here?

I don’t claim to have gotten to the bottom of it. My hunch is that to a large extent it will come down to the fact that mixed partial derivatives commute. If we’ve got a smooth function f of a bunch of variables x_1, \dots, x_n, and we set

\displaystyle{ R_{ab} = \frac{\partial^2 f}{\partial x_a \partial x_b} }

then

R_{ab} = R_{ba}

However, I haven’t gotten around to showing that reciprocity boils down to this in all the examples yet. Yet another unification project to add to my list!
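Still, the algebraic core is easy to check. Here’s a trivial SymPy verification of that symmetry for a randomly chosen smooth function (my own example, nothing deep):

```python
# Check (for one arbitrary smooth f) that mixed partial derivatives commute,
# so the 'response matrix' R_ab defined above is symmetric.
from sympy import symbols, sin, exp, diff, simplify

x1, x2, x3 = symbols('x1 x2 x3')
f = sin(x1 * x2) + exp(x2 * x3**2)     # any smooth f will do

R_12 = diff(f, x1, x2)                 # d^2 f / dx1 dx2
R_21 = diff(f, x2, x1)                 # d^2 f / dx2 dx1
print(simplify(R_12 - R_21))           # prints 0
```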

Anyway: reciprocity has lots of interesting applications to electromagnetism. And that’s what we’re really talking about now. After all, light is electromagnetic radiation!

The simplest application is one we learn as children:

If I can see you, then you can see me.

or at least:

If light can go from X to Y in a static environment, it can also go from Y to X.

But we want something that sounds a bit different. Namely:

The tendency of a substance to absorb light at some frequency equals its tendency to emit light at that frequency.

This is too vague. We should make it precise, and in a minute I’ll try, but first let me motivate this idea with a thought experiment. Suppose we have a black rock and a white rock in a sealed mirrored container. Suppose they’re in thermal equilibrium at a very high temperature, so they’re glowing red-hot. So, there’s red light bouncing around the container. The black rock will absorb more of this light. But since they’re in thermal equilibrium, the black rock must also be emitting more of this light, or it would gain energy and get hotter than the white one. That would violate the zeroth law of thermodynamics, which implies that in thermal equilibrium, all the parts of a system must be at the same temperature.

More precisely, we have:

Kirchhoff’s Law of Thermal Radiation. For any body in thermal equilibrium, its emissivity equals its absorptivity.

Let me explain. Suppose we have a surface made of some homogeneous isotropic material in thermal equilibrium at temperature T. If it’s perfectly black, we saw last time that it emits light with a monochromatic energy flux given by the Planck distribution:

\displaystyle{ f_{\lambda}(T) = \frac{2 hc^2}{\lambda^5} \frac{1}{ e^{\frac{hc}{\lambda k T}} - 1 } }

Here \lambda is the wavelength of light and the ‘monochromatic energy flux’ has units of power per area per wavelength.

But if our surface is not perfectly black, we have to multiply this by a fudge factor between 0 and 1 to get the right answer. This factor is called the emissivity of the substance. It can depend on the wavelength of the light quite a lot, and also on the surface’s temperature (since for example ice melts at high temperatures and gets darker). So, let’s call it e_\lambda(T).

We can also talk about the absorptivity of our surface, which is the fraction of light it absorbs. Again this depends on the wavelength of the light and the temperature of our surface. So, let’s call it a_\lambda(T).

Then Kirchhoff’s law of thermal radiation says

e_\lambda(T) = a_\lambda(T)

So, for each frequency the emissivity must equal the absorptivity… but it’s still possible for the Earth to have an average emissivity near 1 at the wavelengths of infrared light and near 0.7 at the wavelengths of visible light. So there’s no paradox.
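To see numerically how little these two bands overlap, here’s a small sketch evaluating the Planck distribution above at a roughly solar temperature and a roughly terrestrial one. The temperatures 5800 K and 288 K are my round numbers:

```python
# Evaluate the Planck distribution f_lambda(T) = (2hc^2/lambda^5) / (exp(hc/(lambda k T)) - 1)
# at roughly solar and terrestrial temperatures, and find where each peaks.
import numpy as np

h, c, k = 6.626e-34, 3.0e8, 1.381e-23          # SI units

def planck(lam, T):
    return (2 * h * c**2 / lam**5) / np.expm1(h * c / (lam * k * T))

lam = np.linspace(0.1e-6, 100e-6, 100000)      # wavelengths from 0.1 to 100 micrometers
for T in (5800.0, 288.0):                      # roughly the Sun and the Earth's surface
    peak = lam[np.argmax(planck(lam, T))]
    print(f"T = {T:6.0f} K: peak emission near {peak * 1e6:5.2f} micrometers")
# about 0.5 micrometers (visible) for the Sun, about 10 micrometers (infrared) for the Earth
```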

Puzzle 1. Is this law named after the same guy who discovered Kirchhoff’s laws governing electrical circuits?

Schwarzschild’s equation

Now let’s talk about light shining through the Earth’s atmosphere. Or more generally, light shining through a medium. What happens? It can get absorbed. It can get scattered, bouncing off in different directions. Light can also get emitted, especially if the medium is hot. The air in our atmosphere isn’t hot enough to emit a lot of visible light, but it definitely emits infrared light and microwaves.

It sounds complicated, and it is, but there are things we can say about it. Let me tell you about Schwarzschild’s equation.

Light comes in different wavelengths. So, we can ask how much power per square meter this light carries per wavelength. We call this the monochromatic energy flux I_{\lambda}, since it depends on the wavelength \lambda. As mentioned last time, this has units of W/m² per μm, where μm stands for micrometers, a unit of wavelength.

However, because light gets absorbed, scattered and emitted the monochromatic energy flux is really a function I_{\lambda}(s), where s is the distance through the medium. Here I’m imagining an essentially one-dimensional situation, like a beam of sunlight coming down through the air when the Sun is directly overhead. We can generalize this later.

Let’s figure out the basic equation describing how I_{\lambda}(s) changes as a function of s. This is called the equation of radiative transfer, or Schwarzschild’s equation. It won’t tell us how different gases absorb different amounts of light of different frequencies—for that we need to do hard calculations, or experiments. But we can feed the results of these calculations into Schwarzschild’s equation.

For starters, let’s assume that light only gets absorbed but not emitted or scattered. Later we’ll include emission, which is very important for what we’re doing: the Earth’s atmosphere is warm enough to emit significant amounts of infrared light (though not hot enough to emit much visible light). Scattering is also very important, but it can be treated as a combination of absorption and emission.

For absorption only, we have the Beer–Lambert law:

\displaystyle{  \frac{d I_\lambda(s)}{d s} = - a_\lambda(s) I_\lambda(s)  }

In other words, the amount of radiation that gets absorbed per distance is proportional to the amount of radiation. However, the constant of proportionality a_\lambda (s) can depend on the frequency and the details of our medium at the position s. I don’t know the standard name for this constant a_\lambda (s), so let’s call it the absorption rate.

Puzzle 2. Assuming the Beer–Lambert law, show that the intensity of light at two positions s_1 and s_2 is related by

\displaystyle{ I_\lambda(s_2) = e^{-\tau} \; I_\lambda(s_1) }

where the optical depth \tau of the intervening medium is defined by

\displaystyle{ \tau = \int_{s_1}^{s_2} a_\lambda(s) \, ds }

So, a layer of stuff has optical depth equal to 1 if light shining through it has its intensity reduced by a factor of 1/e.
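Here’s a quick numerical check of Puzzle 2 (my own sketch, with a made-up absorption rate): integrate the Beer–Lambert law step by step and compare with the exponential formula.

```python
# Numerically integrate dI/ds = -a(s) I(s) for a made-up absorption rate a(s),
# and compare the result with exp(-tau) * I(s1), where tau = integral of a(s) ds.
import numpy as np

def a(s):                              # arbitrary position-dependent absorption rate
    return 0.5 + 0.3 * np.sin(s)

s  = np.linspace(0.0, 5.0, 200001)     # from s1 = 0 to s2 = 5
ds = s[1] - s[0]

I = np.empty_like(s)
I[0] = 1.0                             # intensity at s1
for i in range(len(s) - 1):            # simple Euler step
    I[i + 1] = I[i] - a(s[i]) * I[i] * ds

tau = np.sum(a(s[:-1])) * ds           # optical depth, as a Riemann sum
print(I[-1], np.exp(-tau) * I[0])      # the two numbers agree closely
```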

We can go a bit further if our medium is a rather thin gas like the air in our atmosphere. Then the absorption rate is given by

a_\lambda(s) = k_\lambda(s) \, \rho(s)

where \rho(s) is the density of the air at the position s and k_\lambda(s) is its absorption coefficient.

In other words, air absorbs light at a rate proportional to its density, but also depending on what it’s made of, which may vary with position. For example, both the density and the humidity of the atmosphere can depend on its altitude.

What about emission? Air doesn’t just absorb infrared light, it also emits significant amounts of it! As mentioned, a blackbody at temperature T emits light with a monochromatic energy flux given by the Planck distribution:

\displaystyle{ f_{\lambda}(T) = \frac{2 hc^2}{\lambda^5} \frac{1}{ e^{\frac{hc}{\lambda k T}} - 1 } }

But a gas like air is far from a blackbody, so we have to multiply this by a fudge factor. Luckily, thanks to Kirchhoff’s law of radiation, this factor isn’t so fudgy: it’s just the absorption rate a_\lambda(s).

Here we are generalizing Kirchhoff’s law from a surface to a column of air, but that’s okay because we can treat a column as a stack of surfaces; letting these become very thin, we arrive at a differential formulation of the law that applies to absorption and emission rates instead of absorptivity and emissivity. (If you’re very sharp, you’ll remember that Kirchhoff’s law applies to thermal equilibrium, and wonder about that. Air in the atmosphere isn’t in perfect thermal equilibrium, but it’s close enough for what we’re doing here.)

So, when we take absorption and also emission into account, Beer’s law gets another term:

\displaystyle{  \frac{d I_\lambda(s)}{d s} = - a_\lambda(s) I_\lambda(s) + a_\lambda(s) f_\lambda(T(s)) }

where T is the temperature of our gas at the position s. In other words:

\displaystyle{  \frac{d I_\lambda(s)}{d s} =  a_\lambda(s) ( f_\lambda(T) - I_\lambda(s))}

This is Schwarzschild’s equation.
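To get a feel for what this equation does, here’s a toy numerical sketch. All the numbers (the wavelength, the layer’s temperature, the absorption rate) are my own inventions, not a real atmosphere; the point is just that a beam passing through an absorbing, emitting layer forgets its initial intensity and relaxes toward the local Planck value:

```python
# Toy integration of Schwarzschild's equation dI/ds = a * (f_lambda(T) - I)
# through an isothermal layer, at a single infrared wavelength.
import numpy as np

h, c, k = 6.626e-34, 3.0e8, 1.381e-23

def planck(lam, T):
    return (2 * h * c**2 / lam**5) / np.expm1(h * c / (lam * k * T))

lam   = 10e-6        # 10 micrometers, in the thermal infrared
T_gas = 250.0        # temperature of the layer, K            (made up)
a     = 2e-4         # absorption rate, per metre             (made up)

s  = np.linspace(0.0, 50000.0, 50001)      # a 50 km column, 1 m steps
ds = s[1] - s[0]
I  = np.empty_like(s)
I[0] = 0.5 * planck(lam, 288.0)            # some incoming intensity  (made up)

for i in range(len(s) - 1):                # Euler step of dI/ds = a (f - I)
    I[i + 1] = I[i] + a * (planck(lam, T_gas) - I[i]) * ds

print(I[-1] / planck(lam, T_gas))          # close to 1: the beam has equilibrated
```

The total optical depth of this column is a × 50 km = 10, so the memory of the incoming intensity is suppressed by a factor of about e⁻¹⁰.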

Puzzle 3. Is this equation named after the same guy who discovered the Schwarzschild metric in general relativity, describing a spherically symmetric black hole?

Application to the atmosphere

In principle, we can use Schwarzschild’s equation to help work out how much sunlight of any frequency actually makes it through the atmosphere down to the Earth, and also how much infrared radiation makes it through the atmosphere out to space. But this is not a calculation I can do here today, because it’s very complicated.

If we actually measure what fraction of radiation at different frequencies makes it through the atmosphere, we see why:


Everything here is a function of the wavelength, measured in micrometers. The smooth red curve is the Planck distribution for light coming from the Sun at a temperature of 5325 K. Most of it is visible light, with a wavelength between 0.4 and 0.7 micrometers. The jagged red region shows how much of this gets through—on a clear day, I assume—and you can see that most of it gets through. The smooth bluish curves are the Planck distributions for light coming from the Earth at various temperatures between 210 K and 310 K. Most of it is infrared light, and not much of it gets through.

This, in a nutshell, is what keeps the Earth warmer than the chilly -18 °C we got last time for an Earth with no atmosphere!

This is the greenhouse effect. As you can see, the absorption of infrared light is mainly due to water vapor, and then carbon dioxide, and then other lesser greenhouse gases, mainly methane and nitrous oxide. Oxygen and ozone also play a minor role, but ozone is more important in blocking ultraviolet light. Rayleigh scattering—the scattering of light by small particles, including molecules and atoms—is also important at short wavelengths, because its strength is proportional to 1/\lambda^4. This is why the sky is blue!

Here the wavelengths are measured in nanometers; there are 1000 nanometers in a micrometer. Rayleigh scattering continues to become more important in the ultraviolet.

But right now I want to talk about the infrared. As you can see, the all-important absorption of infrared radiation by water vapor and carbon dioxide is quite complicated. You need quantum mechanics to predict how this works from first principles. Tim van Beek gave a gentle introduction to some of the key ideas here:

• Tim van Beek, A quantum of warmth, Azimuth, 2 July 2011.

Someday it would be fun to get into the details. Not today, though!

You can see what’s going on a bit more clearly here:


The key fact is that infrared is almost completely absorbed for wavelengths between 5.5 and 7 micrometers, or over 14 micrometers. (A ‘micron’ is just an old name for a micrometer.)

The work of Simpson

The first person to give a reasonably successful explanation of how the power of radiation emitted by the Earth balances the power of the sunlight it absorbs was George Simpson. He did it in 1928:

• George C. Simpson, Further studies in terrestrial radiation, Mem. Roy. Meteor. Soc. 3 (1928), 1–26.

One year earlier, he had tried and failed to understand this problem using a ‘gray atmosphere’ model where the fraction of light that gets through was independent of its wavelength. If you’ve been paying attention, I think you can see why that didn’t work.

In 1928, since he didn’t have a computer, he made a simple model that treated emission of infrared radiation as follows.

He treated the atmosphere as made of layers of varying thickness, each layer containing 0.03 grams of water vapor per square centimeter. The Earth’s surface radiates infrared almost as a black body. Part of the power is absorbed by the first layer above the surface, while some makes it through. The first layer then re-radiates at the same wavelengths at a rate determined by its temperature. Half this goes downward, while half goes up. Of the part going upward, some is absorbed by the next layer… and so on, up to the top layer. He took this top layer to end at the stratosphere, since the atmosphere is much drier in the stratosphere.

He did this all in a way that depends on the wavelength, but using a simplified model of how each of these layers absorbs infrared light. He assumed it was:

• completely opaque from 5.5 to 7 micrometers (due to water vapor),

• partly transparent from 7 to 8.5 micrometers (interpolating between opaque and transparent),

• completely transparent from 8.5 to 11 micrometers,

• partly transparent from 11 to 14 micrometers (interpolating between transparent and opaque),

• completely opaque above 14 micrometers (due to carbon dioxide and water vapor).

He got this result, at the latitude 50° on a clear day:


The upper smooth curve is the Planck distribution for a temperature of 280 K, corresponding to the ground. The lower smooth curve is the Planck distribution at 218 K, corresponding to the stratosphere. The shaded region is his calculation of the monochromatic flux emitted into space by the Earth. As you can see, it matches the Planck distribution for the stratosphere where the lower atmosphere is completely opaque in his model—between 5.5 and 7 micrometers, and over 14 micrometers. It matches the Planck distribution for the ground where the lower atmosphere is completely transparent, between 8.5 and 11 micrometers. Elsewhere, it interpolates between the two.

The area of this shaded region—calculated with a planimeter, perhaps?—is the total flux emitted into space.
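Just to make the recipe concrete, here’s a rough numerical imitation of it: my sketch, not Simpson’s actual calculation. The 280 K and 218 K are the temperatures read off his figure. Where the lower atmosphere is opaque, space sees the stratosphere’s Planck curve; where it is transparent, it sees the ground’s; in the two partly transparent bands we interpolate linearly in wavelength:

```python
# Rough imitation of Simpson's band recipe: mix the ground's and stratosphere's
# Planck curves according to a wavelength-dependent transparency, then integrate.
import numpy as np

h, c, k = 6.626e-34, 3.0e8, 1.381e-23

def planck(lam, T):
    return (2 * h * c**2 / lam**5) / np.expm1(h * c / (lam * k * T))

def transparency(lam_um):
    """Simpson's bands: 1 = transparent (ground shows through), 0 = opaque."""
    if lam_um < 5.5:  return 1.0                    # little thermal emission here anyway
    if lam_um < 7.0:  return 0.0                    # opaque (water vapor)
    if lam_um < 8.5:  return (lam_um - 7.0) / 1.5   # partly transparent
    if lam_um < 11.0: return 1.0                    # transparent window
    if lam_um < 14.0: return (14.0 - lam_um) / 3.0  # partly transparent
    return 0.0                                      # opaque (CO2 and water vapor)

T_ground, T_strat = 280.0, 218.0
lam = np.linspace(3e-6, 60e-6, 20000)
t   = np.array([transparency(l * 1e6) for l in lam])
outgoing = t * planck(lam, T_ground) + (1 - t) * planck(lam, T_strat)

flux = np.sum(outgoing) * (lam[1] - lam[0])   # area of the shaded region
print(flux)
```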

This is just part of the story: he also took clouds into account, and he did different calculations at different latitudes. He got a reasonably good balance between the incoming and outgoing power. In short, he showed that an Earth with its observed temperatures is roughly compatible with his model of how the Earth absorbs and emits radiation. Note that this is just another way to tackle the problem of predicting the temperature given a model.

Also note that Simpson didn’t quite use the Schwarzschild equation. But I guess that in some sense he discretized it—right?

And so on

This was just the beginning of a series of more and more sophisticated models. I’m too tired to go on right now.

You’ll note one big thing we’ve omitted: any sort of calculation of how the pressure, temperature and humidity of the air varies with altitude! To the extent we talked about those at all, we treated them as inputs. But for a full-fledged one-dimensional model of the Earth’s atmosphere, we’d want to derive them from some principles. There are, after all, some important puzzles:

Puzzle 4. If hot air rises, why does the atmosphere generally get colder as you go upward, at least until you reach the stratosphere?

Puzzle 5. Why is there a tropopause? In other words, why is there a fairly sudden transition 10 kilometers up between the troposphere, where the air is moist, cooler the higher you go, and turbulent, and the stratosphere, where the air is drier, warmer the higher you go, and not turbulent?

There’s a limit to how much we can understand these puzzles using a 1-dimensional model, but we should at least try to make a model of a thin column of air with pressure, temperature and humidity varying as a function of altitude, with sunlight streaming downward and infrared radiation generally going up. If we can’t do that, we’ll never understand more complicated things, like the actual atmosphere.


Mathematics of the Environment (Part 2)

11 October, 2012

Here are some notes for the second session of my seminar. They are shamelessly borrowed from these sources:

• Tim van Beek, Putting the Earth In a Box, Azimuth, 19 June 2011.

• Climate model, Azimuth Library.

Climate models

Though it’s not my central concern in this class, we should talk a little about climate models.

There are many levels of sophistication when it comes to climate models. It is wise to start with simple, not very realistic models before ascending to complicated, supposedly more realistic ones. This is true in every branch of math or physics: working with simple models gives you insights that are crucial for correctly handling more complicated models. You shouldn’t fly a fighter jet if you haven’t tried something simpler yet, like a bicycle: you’ll probably crash and burn.

As I mentioned last time, models in biology, ecology and climate science pose new challenges compared to models of the simpler systems that physicists like best. As Chris Lee emphasizes, biology inherently deals with ‘high data’ systems where the relevant information can rarely be captured in a few variables, or even a few field equations.

(Field theories involve infinitely many variables, but somehow the ones physicists like best allow us to make a small finite number of measurements and extract a prediction from them! It would be nice to understand this more formally. In quantum field theory, the ‘nice’ field theories are called ‘renormalizable’, but a similar issue shows up classically, as we’ll see in a second.)

The climate system is in part a system that feels like ‘physics’: the flow of air in the atmosphere and water in the ocean. But some of the equations here, for example the Navier–Stokes equations, are already ‘nasty’ by the standards of mathematical physics, since the existence of solutions over long periods of time has not been proved. This is related to ‘turbulence’, a process where information at one length scale can significantly affect information at another dramatically different length scale, making precise predictions difficult.

Climate prediction is, we hope and believe, somewhat insulated from the challenges of weather prediction: we can hope to know the average temperature of the Earth within a degree or two in 5 years even though we don’t know whether it will rain in Manhattan on October 8, 2017. But this hope is something that needs to be studied, not something we can take for granted.

On top of this, the climate is, quite crucially, a biological system. Plant and animal life really affects the climate, as well as being affected by it. So, for example, a really detailed climate model may have a portion specially devoted to the behavior of plankton in the Mediterranean. This means that climate models will never be as ‘neat and clean’ as physicists and mathematicians tend to want—at least, not if these models are trying to be truly realistic. And as I suggested last time, this general type of challenge—the challenge posed by biosystems too complex to precisely model—may ultimately push mathematics in very new directions.

I call this green mathematics, without claiming I know what it will be like. The term is mainly an incitement to think big. I wrote a little about it here.

However, being a bit of an old-fashioned mathematician myself, I’ll start by talking about some very simple climate models, gradually leading up to some interesting puzzles about the ‘ice ages’ or, more properly, ‘glacial cycles’ that have been pestering the Earth for the last 20 million years or so. First, though, let’s take a quick look at the hierarchy of different climate models.

Different kinds of climate models

Zero-dimensional models are like theories of classical mechanics instead of classical field theory. In other words, they only deal with globally averaged quantities, like the average temperature of the Earth, or perhaps regionally averaged quantities, like the average temperature of each ocean and each continent. This sounds silly, but it’s a great place to start. It amounts to dealing with finitely many variables depending on time:

(x_1(t), \dots, x_n(t))

We might assume these obey a differential equation, which we can always make first-order by introducing extra variables:

\displaystyle{ \frac{d x_i}{d t} = f_i(t, x_1(t), \dots, x_n(t))  }

This kind of model is studied quite generally in the subject of dynamical systems theory.

In particular, energy balance models try to predict the average surface temperature of the Earth depending on the energy flow. Energy comes in from the Sun and is radiated to outer space by the Earth. What happens in between is modeled by averaged feedback equations.
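Here’s about the simplest possible example, a sketch I’m adding, with an assumed effective heat capacity, just to make ‘energy balance model’ concrete. One variable, the global mean temperature, is pushed toward the balance between absorbed sunlight and emitted blackbody radiation:

```python
# A toy zero-dimensional energy balance model: C dT/dt = (1 - albedo) S / 4 - sigma T^4.
# The heat capacity per square metre is an assumed, made-up value.
SIGMA  = 5.67e-8      # Stefan–Boltzmann constant, W/(m^2 K^4)
S      = 1370.0       # solar constant, W/m^2
ALBEDO = 0.3
C      = 4.0e8        # effective heat capacity, J/(m^2 K)   (assumed)

def dT_dt(T):
    """Net energy flux per unit area, divided by the heat capacity."""
    return ((1 - ALBEDO) * S / 4 - SIGMA * T**4) / C

T, dt = 350.0, 3600.0            # start far too hot; one-hour time steps
for step in range(200000):       # crude Euler integration over a couple of decades
    T += dT_dt(T) * dt
print(T)                         # settles near 255 K, i.e. about -18 °C
```

The equilibrium is the same -18 °C we get for an Earth with albedo 0.3 and no atmosphere; making the model more realistic means changing the right-hand side, for example by letting the albedo depend on T, which is where the ice albedo feedback will come in.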

The Earth has various approximately conserved quantities like the total amount of carbon, or oxygen, or nitrogen—radioactive decay creates and destroys these elements, but it’s pretty negligible in climate physics. So, these things move around from one form to another. We can imagine a model where some of our variables x_i(t) are the amounts of carbon in the air, or in the soil, or in the ocean—different ‘boxes’, abstractly speaking. It will flow from one box to another in a way that depends on various other variables in our model. This idea gives a class of models called box models.

Here’s one described by Nathan Urban in “week304” of This Week’s Finds:

I’m interested in box models because they’re a simple example of ‘networked systems’: we’ve got boxes hooked up by wires, or pipes, and we can imagine a big complicated model formed by gluing together smaller models, attaching the wires from one to the wires of another. We can use category theory to formalize this. In category theory we’d call these smaller models ‘morphisms’, and the process of gluing them together is called ‘composing’ them. I’ll talk about this a lot more someday.
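To make this concrete, here’s a toy box model: my own illustration, with invented transfer rates, not the model from Nathan Urban’s post. Carbon flows between three boxes at rates proportional to the amount in the source box, so the total is conserved:

```python
# A toy linear box model: dx_i/dt = sum_j (k_ji x_j - k_ij x_i), where k_ij is the
# (made-up) fraction per year flowing from box i to box j. Total carbon is conserved.
import numpy as np

boxes = ['atmosphere', 'ocean', 'soil']

k = np.array([[0.00, 0.09, 0.05],    # from atmosphere to ocean, soil
              [0.07, 0.00, 0.00],    # from ocean to atmosphere
              [0.04, 0.00, 0.00]])   # from soil to atmosphere

x  = np.array([850.0, 1000.0, 1600.0])   # gigatonnes of carbon in each box (made up)
dt = 0.1                                  # years

for step in range(5000):                  # simple Euler integration for 500 years
    inflow  = k.T @ x                     # what each box receives
    outflow = k.sum(axis=1) * x           # what each box loses
    x = x + (inflow - outflow) * dt

print(dict(zip(boxes, np.round(x, 1))), 'total =', round(float(x.sum()), 1))
# the total stays at 3450 Gt while the boxes settle toward a steady state
```

Each box is a node and each nonzero k_{ij} is an edge, which is exactly the kind of structure the network-theoretic viewpoint is meant to capture.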

One-dimensional models treat temperature and perhaps other quantities as a function of one spatial coordinate (in addition to time): for example, the altitude. This lets us include one dimensional processes of heat transport in the model, like radiation and (a very simplified model of) convection.

Two-dimensional models treat temperature and other quantities as a function of two spatial coordinates (and time): for example, altitude and latitude. Alternatively, we could treat the atmosphere as a thin layer and think of temperature at some fixed altitude as a function of latitude and longitude!

Three-dimensional models treat temperature and other quantities as a function of all three spatial coordinates. At this point we can, if we like, use the full-fledged Navier–Stokes equations to describe the motion of air in the atmosphere and water in the ocean. Needless to say, these models can become very complex and computation-intensive, depending on how many effects we want to take into account and at what resolution we wish to model the atmosphere and ocean.

General circulation models or GCMs try to model the circulation of the atmosphere and/or ocean.

• Atmospheric GCMs or AGCMs model the atmosphere and typically contain a land-surface model, while imposing some boundary conditions describing sea surface temperatures. Oceanic GCMs or OGCMs model the ocean (with fluxes from the atmosphere imposed) and may or may not contain a sea ice model. Coupled atmosphere–ocean GCMs or AOGCMs do both atmosphere and ocean. These are the basis for detailed predictions of future climate, such as those discussed by the Intergovernmental Panel on Climate Change, or IPCC.

• Backing down a bit, we can consider Earth models of intermediate complexity or EMICs. These might have a 3-dimensional atmosphere and a 2-dimensional ‘slab ocean’, or a 3d ocean and an energy-moisture balance atmosphere.

• Alternatively, we can consider regional circulation models or RCMs. These are limited-area models that can be run at higher resolution than the GCMs, and are thus able to better represent fine-grained phenomena, including processes resulting from finer-scale topographic and land-surface features. Typically the regional atmospheric model is run while receiving lateral boundary condition inputs from a relatively coarse-resolution atmospheric analysis model or from the output of a GCM. As Michael Knap pointed out in class, there’s again something from network theory going on here: we are ‘gluing’ the RCM into a ‘hole’ cut out of a GCM.

Modern GCMs, as used in the 2007 IPCC report, tended to run at around 100-kilometer resolution. Individual clouds only start to be resolved at about 10 kilometers or below. One way to deal with this is to take the output of higher-resolution regional climate models and use it to adjust parameters, etcetera, in GCMs.

The hierarchy of climate models

The climate scientist Isaac Held has a great article about the hierarchy of climate models:

• Isaac Held, The gap between simulation and understanding in climate modeling, Bulletin of the American Meteorological Society (November 2005), 1609–1614.

In it, he writes:

The importance of such a hierarchy for climate modeling and studies of atmospheric and oceanic dynamics has often been emphasized. See, for example, Schneider and Dickinson (1974), and, especially, Hoskins (1983). But, despite notable exceptions in a few subfields, climate theory has not, in my opinion, been very successful at hierarchy construction. I do not mean to imply that important work has not been performed, of course, but only that the gap between comprehensive climate models and more idealized models has not been successfully closed.

Consider, by analogy, another field that must deal with exceedingly complex systems—molecular biology. How is it that biologists have made such dramatic and steady progress in sorting out the human genome and the interactions of the thousands of proteins of which we are constructed? Without doubt, one key has been that nature has provided us with a hierarchy of biological systems of increasing complexity that are amenable to experimental manipulation, ranging from bacteria to fruit fly to mouse to man. Furthermore, the nature of evolution assures us that much of what we learn from simpler organisms is directly relevant to deciphering the workings of their more complex relatives. What good fortune for biologists to be presented with precisely the kind of hierarchy needed to understand a complex system! Imagine how much progress would have been made if they were limited to studying man alone.

Unfortunately, Nature has not provided us with simpler climate systems that form such a beautiful hierarchy. Planetary atmospheres provide insights into the range of behaviors that are possible, but the known planetary atmospheres are few, and each has its own idiosyncrasies. Their study has connected to terrestrial climate theory on occasion, but the influence has not been systematic. Laboratory simulations of rotating and/or convecting fluids remain valuable and underutilized, but they cannot address our most complex problems. We are left with the necessity of constructing our own hierarchies of climate models.

Because nature has provided the biological hierarchy, it is much easier to focus the attention of biologists on a few representatives of the key evolutionary steps toward greater complexity. And, such a focus is central to success. If every molecular biologist had simply studied his or her own favorite bacterium or insect, rather than focusing so intensively on E. coli or Drosophila melanogaster, it is safe to assume that progress would have been far less rapid.

It is emblematic of our problem that studying the biological hierarchy is experimental science, while constructing and studying climate hierarchies is theoretical science. A biologist need not convince her colleagues that the model organism she is advocating for intensive study is well designed or well posed, but only that it fills an important niche in the hierarchy of complexity and that it is convenient for study. Climate theorists are faced with the difficult task of both constructing a hierarchy of models and somehow focusing the attention of the community on a few of these models so that our efforts accumulate efficiently. Even if one believes that one has defined the E. coli of climate models, it is difficult to energize (and fund) a significant number of researchers to take this model seriously and devote years to its study.

And yet, despite the extra burden of trying to create a consensus as to what the appropriate climate model hierarchies are, the construction of such hierarchies must, I believe, be a central goal of climate theory in the twenty-first century. There are no alternatives if we want to understand the climate system and our
comprehensive climate models. Our understanding will be embedded within these hierarchies.

It is possible that mathematicians, with a lot of training from climate scientists, have the sort of patience and delight in ‘study for study’s sake’ needed to study this hierarchy of models. Here’s one that Held calls ‘the fruit fly of climate models’:

For more, see:

• Isaac Held, The fruit fly of climate models.

The very simplest model

The very simplest model is a zero-dimensional energy balance model. In this model we treat the Earth as having just one degree of freedom—its temperature—and we treat it as a blackbody in equilibrium with the radiation coming from the Sun.

A black body is an object that perfectly absorbs and therefore also perfectly emits all electromagnetic radiation at all frequencies. Real bodies don’t have this property; instead, they absorb radiation at certain frequencies better than others, and some not at all. But there are materials that do come rather close to a black body. Usually one adds another assumption to the characterization of an ideal black body: namely, that the radiation is independent of the direction.

When the black body has a certain temperature T, it will emit electromagnetic radiation: it will send out a certain amount of energy per second for every square meter of surface area. We will call this the energy flux and denote it as f. The SI unit for f is W/m²: that is, watts per square meter. Here the watt is a unit of power: that is, energy per unit time.

Electromagnetic radiation comes in different wavelengths. So, we can ask how much energy flux our black body emits per unit change in wavelength. This depends on the wavelength. We will call this the monochromatic energy flux f_{\lambda}. The SI unit for f_{\lambda} is W/(m²·μm), where μm stands for micrometer: a millionth of a meter, which is a unit of wavelength. We call f_\lambda the ‘monochromatic’ energy flux because it gives a number for each fixed wavelength \lambda. When we integrate the monochromatic energy flux over all wavelengths, we get the energy flux f.

Max Planck was able to calculate f_{\lambda} for a blackbody at temperature T, but only by inventing a bit of quantum mechanics. His result is called the Planck distribution:

\displaystyle{ f_{\lambda}(T) = \frac{2 \pi hc^2}{\lambda^5} \frac{1}{ e^{\frac{hc}{\lambda k T}} - 1 } }

where h is Planck’s constant, c is the speed of light, and k is Boltzmann’s constant. Deriving this would be tons of fun, but also a huge digression from the point of this class.

You can integrate f_\lambda over all wavelengths \lambda to get the total energy flux—that is, the total power per square meter emitted by a blackbody. The answer is surprisingly simple: if the total energy flux is defined by

\displaystyle{f = \int_0^\infty f_{\lambda}(T) \, d \lambda }

then in fact we can do the integral and get

f = \sigma \; T^4

for some constant \sigma. This fact is called the Stefan–Boltzmann law, and \sigma is called the Stefan–Boltzmann constant:

\displaystyle{ \sigma=\frac{2\pi^5 k^4}{15c^2h^3} \approx 5.67 \times 10^{-8}\, \frac{\mathrm{W}}{\mathrm{m}^2 \mathrm{K}^4} }

Using this formula, we can assign to every energy flux f a black body temperature T, which is the temperature that an ideal black body would need to have to emit f.
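If you don’t feel like doing the integral by hand, you can check the Stefan–Boltzmann law numerically. Here’s a rough sketch in Python: it integrates the Planck distribution over wavelength with the trapezoid rule (the cutoffs, step size and the 288 K temperature are my own arbitrary choices) and compares the result to \sigma T^4:

import numpy as np

h = 6.62607015e-34   # Planck's constant, J s
c = 2.99792458e8     # speed of light, m/s
k = 1.380649e-23     # Boltzmann's constant, J/K

def planck_flux(lam, T):
    # monochromatic energy flux f_lambda(T), in W per square meter per meter of wavelength
    return (2 * np.pi * h * c**2 / lam**5) / np.expm1(h * c / (lam * k * T))

T = 288.0                                # a roughly Earth-like temperature, in kelvin
lam = np.linspace(2e-7, 1e-3, 200_000)   # wavelengths from 0.2 micrometers to 1 millimeter
y = planck_flux(lam, T)
f = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(lam))   # trapezoid rule

sigma = 2 * np.pi**5 * k**4 / (15 * c**2 * h**3)
print(f, sigma * T**4)                   # both come out near 390 W/m^2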

Let’s use this to calculate the temperature of the Earth in this simple model! A planet like Earth gets energy from the Sun and loses energy by radiating to space. Since the Earth sits in empty space, these two processes are the only relevant ones that describe the energy flow.

The sunshine near Earth carries an energy flux of about 1370 watts per square meter. If the temperature of the Earth is constant, as much energy is coming in as going out. So, we might try to balance the incoming energy flux with the outgoing flux of a blackbody at temperature T:

\displaystyle{ 1370 \, \textrm{W}/\textrm{m}^2 = \sigma T^4 }

and then solve for T:

\displaystyle{ T = \left(\frac{1370 \textrm{W}/\textrm{m}^2}{\sigma}\right)^{1/4} }

We’re making a big mistake here. Do you see what it is? But let’s go ahead and see what we get. As mentioned, the Stefan–Boltzmann constant has a value of

\displaystyle{ \sigma \approx 5.67 \times 10^{-8} \, \frac{\mathrm{W}}{\mathrm{m}^2 \mathrm{K}^4}  }

so we get

\displaystyle{ T = \left(\frac{1370}{5.67 \times 10^{-8}} \right)^{1/4} \mathrm{K} }  \approx (2.4 \cdot 10^{10})^{1/4} \mathrm{K} \approx 394 \mathrm{K}

This is much too hot! Remember, this temperature is in kelvin, so we need to subtract 273 to get Celsius. Doing so, we get a temperature of 121 °C. This is above the boiling point of water!

Do you see what we did wrong? We neglected a phenomenon known as night. The Earth emits infrared radiation in all directions, but it only absorbs sunlight on the daytime side. Our calculation would be correct if the Earth were a flat disk of perfectly black stuff facing the Sun and perfectly insulated on the back so that it could only emit infrared radiation over the same area that absorbs sunlight! But in fact emission takes place over a larger area than absorption. This makes the Earth cooler.

To get the right answer, we need to take into account the fact that the Earth is round. But just for fun, let’s see how well a flat Earth theory does. A few climate skeptics may even believe this theory. Suppose the Earth were a flat disk of radius r, made of black stuff facing the Sun but not insulated on back. Then it would absorb power equal to

1370 \cdot \pi r^2

since the area of the disk is \pi r^2, but it would emit power equal to

\sigma T^4 \cdot 2 \pi r^2

since it emits from both the front and back. Setting these equal, we now get

\displaystyle{ \frac{1370}{2} \textrm{W}/\textrm{m}^2 = \sigma T^4 }

or

\displaystyle{ T = \left(\frac{1370 \textrm{W}/\textrm{m}^2}{2 \sigma}\right)^{1/4} }

This reduces the temperature by a factor of 2^{-1/4} \approx 0.84 from our previous estimate. So now the temperature works out to be less:

0.84 \cdot 394 \mathrm{K} \approx 331 \mathrm{K}

But this is still too hot! It’s 58 °C, or 136 °F for you Americans out there who don’t have a good intuition for Celsius.

So, a flat black Earth facing the Sun would be a very hot Earth.

But now let’s stop goofing around and do the calculation with a round Earth. Now it absorbs a beam of sunlight with area equal to its cross-section, a circle of area \pi r^2. But it emits infrared over its whole area of 4 \pi r^2: four times as much. So now we get

\displaystyle{ T = \left(\frac{1370 \textrm{W}/\textrm{m}^2}{4 \sigma}\right)^{1/4} }

so the temperature is reduced by a further factor of 2^{-1/4}. We get

0.84 \cdot 331 \mathrm{K} \approx 279 \mathrm{K}

That’s 6 °C. Not bad for a crude approximation! Amusingly, it’s crucial that the area of a sphere is 4 times the area of a circle of the same radius. The question of whether there is some deeper reason for this simple relation was posed as a geometry puzzle here on Azimuth.

I hope my clowning around hasn’t distracted you from the main point. On average our simplified blackbody Earth absorbs 1370/4 = 342.5 watts of solar power per square meter. So, that’s how much infrared radiation it has to emit. If you can imagine how much heat a 60-watt bulb puts out when it’s surrounded by black paper, we’re saying our simplified Earth emits about 6 times that heat per square meter.

The second simplest climate model

The next step is to take into account the ‘albedo’ of the Earth. The albedo is the fraction of radiation that is instantly reflected without being absorbed. The albedo of a surface depends on its material, and in particular on the wavelength of the radiation, of course. But as a first approximation to the average albedo of the Earth we can take:

\mathrm{albedo}_{\mathrm{Earth}} = 0.3

This means that 30% of the radiation is instantly reflected and only 70% contributes to heating the Earth. So, instead of getting heated by an average of 342.5 watts per square meter of sunlight, let’s assume it’s heated by

0.7 \times 342.5 \approx 240

watts per square meter. Now we get a temperature of

\displaystyle{ T = \left(\frac{240}{5.67 \times 10^{-8}} \right)^{1/4} K }  \approx (4.2 \cdot 10^9)^{1/4} K \approx 255 K

This is -18 °C. The average temperature of the Earth is actually estimated to be considerably warmer: about +15 °C. That should come as no surprise: after all, 70% of the planet is covered by liquid water, which is a pretty good indication that the average temperature is not below the freezing point of water.
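Here is the whole chain of toy calculations in a few lines of Python, just to have the numbers in one place (the 1370 W/m² solar constant and the 0.3 albedo are the round figures used above):

S = 1370.0         # solar constant near Earth, W/m^2
sigma = 5.67e-8    # Stefan-Boltzmann constant, W/m^2 K^4

def T_eff(absorbed_flux):
    # temperature of a blackbody that emits `absorbed_flux` watts per square meter
    return (absorbed_flux / sigma) ** 0.25

cases = [
    ("flat black disk, insulated back",      S),
    ("flat black disk, emitting both sides", S / 2),
    ("round blackbody Earth",                S / 4),
    ("round Earth with albedo 0.3",          0.7 * S / 4),
]
for label, flux in cases:
    T = T_eff(flux)
    print(f"{label:38s} {T:6.1f} K  ({T - 273.15:6.1f} °C)")

This reproduces the 394 K, 331 K, 279 K and 255 K figures above, up to rounding.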

So, our new ‘improved’ calculation gives a worse agreement with reality. The actual Earth is roughly 33 kelvin warmer than our model Earth! What’s wrong?

The main explanation for the discrepancy seems to be: our model Earth doesn’t have an atmosphere yet! Thanks in part to greenhouse gases like water vapor and carbon dioxide, sunlight at visible frequencies can get into the atmosphere more easily than infrared radiation can get out. This warms the Earth. This, in a nutshell, is why dumping a lot of extra carbon dioxide into the air can change our climate. But of course we’ll need to turn to more detailed models, or experimental data, to see how strong this effect is.

Besides the greenhouse effect, there are many other things our ultra-simplified model leaves out: everything associated to the atmosphere and oceans, such as weather, clouds, the altitude-dependence of the temperature of the atmosphere… and also the way the albedo of the Earth depends on location and even on temperature and other factors. There is much much more to say about all this… but not today!


The Mathematical Origin of Irreversibility

8 October, 2012

guest post by Matteo Smerlak

Introduction

Thermodynamical dissipation and adaptive evolution are two faces of the same Markovian coin!

Consider this. The Second Law of Thermodynamics states that the entropy of an isolated thermodynamic system can never decrease; Landauer’s principle maintains that the erasure of information inevitably causes dissipation; Fisher’s fundamental theorem of natural selection asserts that any fitness difference within a population leads to adaptation in an evolution process governed by natural selection. Diverse as they are, these statements have two common characteristics:

1. they express the irreversibility of certain natural phenomena, and

2. the dynamical processes underlying these phenomena involve an element of randomness.

Doesn’t this suggest to you the following question: Could it be that thermal phenomena, forgetful information processing and adaptive evolution are governed by the same stochastic mechanism?

The answer is—yes! The key to this rather profound connection resides in a universal property of Markov processes discovered recently in the context of non-equilibrium statistical mechanics, and known as the ‘fluctuation theorem’. Typically stated in terms of ‘dissipated work’ or ‘entropy production’, this result can be seen as an extension of the Second Law of Thermodynamics to small systems, where thermal fluctuations cannot be neglected. But it is actually much more than this: it is the mathematical underpinning of irreversibility itself, be it thermodynamical, evolutionary, or else. To make this point clear, let me start by giving a general formulation of the fluctuation theorem that makes no reference to physics concepts such as ‘heat’ or ‘work’.

The mathematical fact

Consider a system randomly jumping between states a, b,\dots with (possibly time-dependent) transition rates \gamma_{a b}(t), where a is the state prior to the jump, while b is the state after the jump. I’ll assume that this dynamics defines a (continuous-time) Markov process: the rates \gamma_{a b} are non-negative, and together with the diagonal entries \gamma_{a a} := -\sum_{b \neq a}\gamma_{a b} they form the transition rate matrix of the process, so each of its rows sums to zero.

Now, each possible history \omega=(\omega_t)_{0\leq t\leq T} of this process can be characterized by the sequence of occupied states a_{j} and by the times \tau_{j} at which the transitions a_{j-1}\longrightarrow a_{j} occur (1\leq j\leq N):

\omega=(\omega_{0}=a_{0}\overset{\tau_{1}}{\longrightarrow} a_{1} \overset{\tau_{2}}{\longrightarrow}\cdots \overset{\tau_{N}}{\longrightarrow} a_{N}=\omega_{T}).

Define the skewness \sigma_{j}(\tau_{j}) of each of these transitions to be the logarithmic ratio of transition rates:

\displaystyle{\sigma_{j}(\tau_{j}):=\ln\frac{\gamma_{a_{j}a_{j-1}}(\tau_{j})}{\gamma_{a_{j-1}a_{j}}(\tau_{j})}}

Also define the self-information of the system in state a at time t by:

i_a(t):= -\ln\pi_{a}(t)

where \pi_{a}(t) is the probability that the system is in state a at time t, given some prescribed initial distribution \pi_{a}(0). This quantity is also sometimes called the surprisal, as it measures the ‘surprise’ of finding out that the system is in state a at time t.

Then the following identity—the detailed fluctuation theorem—holds:

\mathrm{Prob}[\Delta i-\Sigma=-A] = e^{-A}\;\mathrm{Prob}[\Delta i-\Sigma=A] \;

where

\displaystyle{\Sigma:=\sum_{j}\sigma_{j}(\tau_{j})}

is the cumulative skewness along a trajectory of the system, and

\Delta i= i_{a_N}(T)-i_{a_0}(0)

is the variation of self-information between the end points of this trajectory.

This identity has an immediate consequence: if \langle\,\cdot\,\rangle denotes the average over all realizations of the process, then we have the integral fluctuation theorem:

\langle e^{-\Delta i+\Sigma}\rangle=1,

which, by the convexity of the exponential and Jensen’s inequality, implies:

\langle \Delta i\rangle=\Delta S\geq\langle\Sigma\rangle.

In short: the mean variation of self-information, aka the variation of Shannon entropy

\displaystyle{ S(t):= \sum_{a}\pi_{a}(t)i_a(t) }

is bounded from below by the mean cumulative skewness of the underlying stochastic trajectory.

This is the fundamental mathematical fact underlying irreversibility. To unravel its physical and biological consequences, it suffices to consider the origin and interpretation of the ‘skewness’ term in different contexts. (By the way, people usually call \Sigma the ‘entropy production’ or ‘dissipation function’—but how tautological is that?)
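If you’d rather see this identity in action than prove it, here is a rough Monte Carlo sketch in Python (using NumPy and SciPy). It simulates a small continuous-time Markov process with made-up, time-independent rates, so none of the numbers below come from the discussion above. For each trajectory it computes \Delta i and \Sigma, then checks that \langle e^{-\Delta i+\Sigma}\rangle comes out close to 1 and that \langle\Delta i\rangle exceeds \langle\Sigma\rangle:

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# gamma[a, b] = rate of jumps a -> b; arbitrary made-up rates, no detailed balance
gamma = np.array([[0.0, 1.0, 0.5],
                  [2.0, 0.0, 1.0],
                  [0.3, 0.7, 0.0]])
n = gamma.shape[0]
escape = gamma.sum(axis=1)            # total rate of leaving each state

G = gamma.T - np.diag(escape)         # generator of the master equation: d pi/dt = G pi
T_final = 1.0
pi0 = np.array([0.6, 0.3, 0.1])       # initial distribution
piT = expm(G * T_final) @ pi0         # distribution at time T

def trajectory():
    a0 = rng.choice(n, p=pi0)
    a, t, Sigma = a0, 0.0, 0.0
    while True:
        t += rng.exponential(1.0 / escape[a])        # waiting time in state a
        if t >= T_final:
            break
        b = rng.choice(n, p=gamma[a] / escape[a])    # which state we jump to
        Sigma += np.log(gamma[b, a] / gamma[a, b])   # skewness of the jump a -> b
        a = b
    delta_i = np.log(pi0[a0]) - np.log(piT[a])       # self-information at the end minus at the start
    return delta_i, Sigma

samples = np.array([trajectory() for _ in range(100_000)])
di, Sig = samples[:, 0], samples[:, 1]
print(np.mean(np.exp(-di + Sig)))     # integral fluctuation theorem: close to 1
print(np.mean(di), np.mean(Sig))      # 'Second Law': the first mean is >= the second

Swapping in rates that satisfy detailed balance with respect to some energies E_a turns the same check into a check of Clausius’ inequality, which is where we go next.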

The physical and biological consequences

Consider first the standard stochastic-thermodynamic scenario where a physical system is kept in contact with a thermal reservoir at inverse temperature \beta and undergoes thermally induced transitions between states a, b,\dots. By virtue of the detailed balance condition:

\displaystyle{ e^{-\beta E_{a}(t)}\gamma_{a b}(t)=e^{-\beta E_{b}(t)}\gamma_{b a}(t),}

the skewness \sigma_{j}(\tau_{j}) of each such transition is \beta times the energy difference between the states a_{j} and a_{j-1}, namely the heat received from the reservoir during the transition. Hence, the mean cumulative skewness \langle \Sigma\rangle is nothing but \beta\langle Q\rangle, with Q the total heat received by the system along the process. It follows from the detailed fluctuation theorem that

\langle e^{-\Delta i+\beta Q}\rangle=1

and therefore

\Delta S\geq\beta\langle Q\rangle

which is of course Clausius’ inequality. In a computational context where the control parameter is the entropy variation itself (such as in a bit-erasure protocol, where \Delta S=-\ln 2), this inequality in turn expresses Landauer’s principle: it is impossible to decrease the self-information of the system’s state without dissipating a minimal amount of heat into the environment (in this case -Q \geq k T\ln 2, the ‘Landauer bound’). More general situations (several types of reservoirs, Maxwell-demon-like feedback controls) can be treated along the same lines, and the various forms of the Second Law derived from the detailed fluctuation theorem.
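To put a number on the Landauer bound (the 300 K temperature is just my choice, for concreteness):

import math
k = 1.380649e-23             # Boltzmann's constant, J/K
T = 300.0                    # roughly room temperature, K
print(k * T * math.log(2))   # about 2.9e-21 joules per bit erased

That is a tiny amount of heat, but a strict lower limit no matter how cleverly the bit is erased.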

Now, many would agree that evolutionary dynamics is a wholly different business from thermodynamics; in particular, notions such as ‘heat’ or ‘temperature’ are clearly irrelevant to Darwinian evolution. However, the stochastic framework of Markov processes is relevant to describe the genetic evolution of a population, and this fact alone has important consequences. As a simple example, consider the time evolution of mutant fixations x_{a} in a population, with a ranging over the possible genotypes. In a ‘symmetric mutation scheme’, which I understand is biological parlance for ‘reversible Markov process’, meaning one that obeys detailed balance, the ratio between the a\mapsto b and b\mapsto a transition rates is completely determined by the fitnesses f_{a} and f_b of a and b, according to

\displaystyle{\frac{\gamma_{a b}}{\gamma_{b a}} =\left(\frac{f_{b}}{f_{a}}\right)^{\nu} }

where \nu is a model-dependent function of the effective population size [Sella2005]. Along a given history of mutant fixations, the cumulative skewness \Sigma is therefore given by minus the fitness flux:

\displaystyle{\Phi=\nu\sum_{j}(\ln f_{a_j}-\ln f_{a_{j-1}}).}

The integral fluctuation theorem then becomes the fitness flux theorem:

\displaystyle{ \langle e^{-\Delta i -\Phi}\rangle=1}

discussed recently by Mustonen and Lässig [Mustonen2010] and implying Fisher’s fundamental theorem of natural selection as a special case. (Incidentally, the ‘fitness flux theorem’ derived in this reference is more general than this; for instance, it does not rely on the ‘symmetric mutation scheme’ assumption above.) The ensuing inequality

\langle \Phi\rangle\geq-\Delta S

shows that a positive fitness flux is “an almost universal evolutionary principle of biological systems” [Mustonen2010], with negative contributions limited to time intervals with a systematic loss of adaptation (\Delta S > 0). This statement may well be the closest thing to a version of the Second Law of Thermodynamics applying to evolutionary dynamics.

It is really quite remarkable that thermodynamical dissipation and Darwinian evolution can be reduced to the same stochastic mechanism, and that notions such as ‘fitness flux’ and ‘heat’ can arise as two faces of the same mathematical coin, namely the ‘skewness’ of Markovian transitions. After all, the phenomenon of life is in itself a direct challenge to thermodynamics, isn’t it? While thermal phenomena tend to increase the world’s disorder, life strives to bring about and maintain exquisitely fine spatial and chemical structures—which is why Schrödinger famously proposed to define life as negative entropy. Could there be a more striking confirmation of his intuition—and a reconciliation of evolution and thermodynamics in the same go—than the fundamental inequality of adaptive evolution \langle\Phi\rangle\geq-\Delta S?

Surely the detailed fluctuation theorem for Markov processes has other applications, pertaining neither to thermodynamics nor adaptive evolution. Can you think of any?

Proof of the fluctuation theorem

I am a physicist, but knowing that many readers of John’s blog are mathematicians, I’ll do my best to frame—and prove—the FT as an actual theorem.

Let (\Omega,\mathcal{T},p) be a probability space and (\,\cdot\,)^{\dagger}\colon\Omega\to \Omega a measurable involution of \Omega. Denote by p^{\dagger} the pushforward probability measure of p through this involution, and

\displaystyle{ R=\ln \frac{d p}{d p^\dagger} }

the logarithm of the corresponding Radon-Nikodym derivative (we assume p^\dagger and p are mutually absolutely continuous). Then the following lemmas are true, with (1)\Rightarrow(2)\Rightarrow(3):

Lemma 1. The detailed fluctuation relation:

\forall A\in\mathbb{R} \quad  p\big(R^{-1}(-A) \big)=e^{-A}p \big(R^{-1}(A) \big)

Lemma 2. The integral fluctuation relation:

\displaystyle{\int_{\Omega} d p(\omega)\,e^{-R(\omega)}=1 }

Lemma 3. The positivity of the Kullback-Leibler divergence:

D(p\,\Vert\, p^{\dagger}):=\int_{\Omega} d p(\omega)\,R(\omega)\geq 0.

These are basic facts which anyone can show: (2)\Rightarrow(3) by Jensen’s inequality, (1)\Rightarrow(2) trivially, and (1) follows from R(\omega^{\dagger})=-R(\omega) and the change of variables theorem, as follows,

\begin{array}{ccl} \displaystyle{ \int_{R^{-1}(-A)} d p(\omega)} &=& \displaystyle{ \int_{R^{-1}(A)}d p^{\dagger}(\omega) } \\ \\ &=& \displaystyle{ \int_{R^{-1}(A)} d p(\omega)\, e^{-R(\omega)} } \\ \\ &=& \displaystyle{ e^{-A} \int_{R^{-1}(A)} d p(\omega)} .\end{array}

But here is the beauty: if

• (\Omega,\mathcal{T},p) is actually a Markov process defined over some time interval [0,T] and valued in some (say discrete) state space \mathcal{S}, with the instantaneous probability \pi_{a}(t)=p\big(\{\omega_{t}=a\} \big) of each state a\in\mathcal{S} satisfying the master equation (aka Kolmogorov equation)

\displaystyle{ \frac{d\pi_{a}(t)}{dt}=\sum_{b\neq a}\Big(\gamma_{b a}(t)\pi_{b}(t)-\gamma_{a b}(t)\pi_{a}(t)\Big),}

and

• the dagger involution is time-reversal, that is \omega^{\dagger}_{t}:=\omega_{T-t},

then for a given path

\displaystyle{\omega=(\omega_{0}=a_{0}\overset{\tau_{1}}{\longrightarrow} a_{1} \overset{\tau_{2}}{\longrightarrow}\cdots \overset{\tau_{N}}{\longrightarrow} a_{N}=\omega_{T})\in\Omega}

the logarithmic ratio R(\omega) decomposes into ‘variation of self-information’ and ‘cumulative skewness’ along \omega:

\displaystyle{ R(\omega)=\underbrace{\Big(\ln\pi_{a_0}(0)-\ln\pi_{a_N}(T) \Big)}_{\Delta i(\omega)}-\underbrace{\sum_{j=1}^{N}\ln\frac{\gamma_{a_{j}a_{j-1}}(\tau_{j})}{\gamma_{a_{j-1}a_{j}}(\tau_{j})}}_{\Sigma(\omega)}.}

This is easy to see if one writes the probability of a path explicitly (with the convention \tau_{0}:=0) as

\displaystyle{p(\omega)=\pi_{a_{0}}(0)\left[\prod_{j=1}^{N}\phi_{a_{j-1}}(\tau_{j-1},\tau_{j})\gamma_{a_{j-1}a_{j}}(\tau_{j})\right]\phi_{a_{N}}(\tau_{N},T)}

where

\displaystyle{ \phi_{a}(\tau,\tau')=\phi_{a}(\tau',\tau)=\exp\Big(-\sum_{b\neq a}\int_{\tau}^{\tau'}dt\, \gamma_{a b}(t)\Big)}

is the probability that the process remains in the state a between the times \tau and \tau'. It follows from the above lemma that

Theorem. Let (\Omega,\mathcal{T},p) be a Markov process and let \Delta i,\Sigma:\Omega\rightarrow \mathbb{R} be defined as above. Then we have

1. The detailed fluctuation theorem:

\forall A\in\mathbb{R}, p\big((\Delta i-\Sigma)^{-1}(-A) \big)=e^{-A}p \big((\Delta i-\Sigma)^{-1}(A) \big)

2. The integral fluctuation theorem:

\int_{\Omega} d p(\omega)\,e^{-\Delta i(\omega)+\Sigma(\omega)}=1

3. The ‘Second Law’ inequality:

\displaystyle{ \Delta S:=\int_{\Omega} d p(\omega)\,\Delta i(\omega)\geq \int_{\Omega} d p(\omega)\,\Sigma(\omega)}

The same theorem can be formulated for other kinds of Markov processes as well, including diffusion processes (in which case it follows from the Girsanov theorem).

References

Landauer’s principle was introduced here:

• [Landauer1961] R. Landauer, Irreversibility and heat generation in the computing process, IBM Journal of Research and Development 5 (1961), 183–191.

and is now being verified experimentally by various groups worldwide.

The ‘fundamental theorem of natural selection’ was derived by Fisher in his book:

• [Fisher1930] R. Fisher, The Genetical Theory of Natural Selection, Clarendon Press, Oxford, 1930.

His derivation has long been considered obscure, even perhaps wrong, but apparently the theorem is now well accepted. I believe the first Markovian models of genetic evolution appeared here:

• [Fisher1922] R. A. Fisher, On the dominance ratio, Proc. Roy. Soc. Edinb. 42 (1922), 321–341.

• [Wright1931] S. Wright, Evolution in Mendelian populations, Genetics 16 (1931), 97–159.

Fluctuation theorems are reviewed here:

• [Sevick2008] E. Sevick, R. Prabhakar, S. R. Williams, and D. J. Searles, Fluctuation theorems, Ann. Rev. Phys. Chem. 59 (2008), 603–633.

Two of the key ideas for the ‘detailed fluctuation theorem’ discussed here are due to Crooks:

• [Crooks1999] Gavin Crooks, The entropy production fluctuation theorem and the nonequilibrium work relation for free energy differences, Phys. Rev. E 60 (1999), 2721–2726.

who identified (E_{a_{j}}(\tau_{j})-E_{a_{j-1}}(\tau_{j})) as heat, and Seifert:

• [Seifert2005] Udo Seifert, Entropy production along a stochastic trajectory and an integral fluctuation theorem, Phys. Rev. Lett. 95 (2005), 040602.

who understood the relevance of the self-information in this context.

The connection between statistical physics and evolutionary biology is discussed here:

• [Sella2005] G. Sella and A.E. Hirsh, The application of statistical physics to evolutionary biology, Proc. Nat. Acad. Sci. USA 102 (2005), 9541–9546.

and the ‘fitness flux theorem’ is derived in

• [Mustonen2010] V. Mustonen and M. Lässig, Fitness flux and ubiquity of adaptive evolution, Proc. Nat. Acad. Sci. USA 107 (2010), 4248–4253.

Schrödinger’s famous discussion of the physical nature of life was published here:

• [Schrödinger1944] E. Schrödinger, What is Life?, Cambridge University Press, Cambridge, 1944.


Mathematics of the Environment (Part 1)

4 October, 2012

 

I’m running a graduate math seminar here at U. C. Riverside, and here are the slides for the first class:

Mathematics of the Environment, 2 October 2012.

I said a lot of things that aren’t on the slides, so they might be a tad cryptic. I began by showing some graphs everyone should know by heart:

• human population and the history of civilization,

• the history of carbon emissions,

• atmospheric CO2 concentration for the last century or so,

• global average temperatures for the last century or so,

• the melting of the Arctic ice, and

• the longer historical perspective of CO2 concentrations.

You can click on these graphs for more details—there are lots of links in the slides.

Then I posed the question of what mathematicians can do about this. I suggested looking at the birth of written mathematics during the agricultural revolution as a good comparison, since we’re at the start of an equally big revolution now. Have you thought about how Babylonian mathematics was intertwined with the agricultural revolution?

Then, I raised the idea of ‘ecotechnology’ as a goal to strive for, assuming our current civilization doesn’t collapse to the point where it becomes pointless to even try. As an example, I described the perfect machine for reversing global warming—and showed a nice picture of it.

Finally, I began sketching how ecotechnology is related to the mathematics of networks, though this will be a much longer story for later on.

Part of the idea here is that mathematics takes time to have an effect, so mathematicians might as well look ahead a little bit, while politicians, economists, business people and engineers should be doing things that have a big effect soon.

