## El Niño Project (Part 3)

1 July, 2014

In February, this paper claimed there’s a 75% chance the next El Niño will arrive by the end of 2014:

• Josef Ludescher, Avi Gozolchiani, Mikhail I. Bogachev, Armin Bunde, Shlomo Havlin, and Hans Joachim Schellnhuber, Very early warning of next El Niño, Proceedings of the National Academy of Sciences, February 2014. (Click title for free version, journal name for official version.)

Since it was published in a reputable journal, it created a big stir! Being able to predict an El Niño more than 6 months in advance would be a big deal. El Niños can cause billions of dollars of damage.

But that’s not the only reason we at the Azimuth Project want to analyze, criticize and improve this paper. Another reason is that it uses a climate network—and we like network theory.

Very roughly, the idea is this. Draw a big network of dots representing different places in the Pacific Ocean. For each pair of dots, compute a number saying how strongly correlated the temperatures are at those two places. The paper claims that when an El Niño is getting ready to happen, the average of these numbers is big. In other words, temperatures in the Pacific tend to go up and down in sync!

Whether this idea is right or wrong, it’s interesting—and it’s not very hard for programmers to dive in and study it.

Two Azimuth members have done just that: David Tanzer, a software developer who works for financial firms in New York, and Graham Jones, a self-employed programmer who also works on genomics and Bayesian statistics. These guys have really brought new life to the Azimuth Code Project in the last few weeks, and it’s exciting! It’s even gotten me to do some programming myself.

Soon I’ll start talking about the programs they’ve written, and how you can help. But today I’ll summarize the paper by Ludescher et al. Their methodology is also explained here:

• Josef Ludescher, Avi Gozolchiani, Mikhail I. Bogachev, Armin Bunde, Shlomo Havlin, and Hans Joachim Schellnhuber, Improved El Niño forecasting by cooperativity detection, Proceedings of the National Academy of Sciences, 30 May 2013.

### The basic idea

The basic idea is to use a climate network. There are lots of variants on this idea, but here’s a simple one. Start with a bunch of dots representing different places on the Earth. For any pair of dots $i$ and $j,$ compute the cross-correlation of temperature histories at those two places. Call some function of this the ‘link strength’ for that pair of dots. Compute the average link strength… and get excited when this gets bigger than a certain value.

The papers by Ludescher et al use this strategy to predict El Niños. They build their climate network using correlations between daily temperature data for 14 grid points in the El Niño basin and 193 grid points outside this region, as shown here:

The red dots are the points in the El Niño basin.

Starting from this temperature data, they compute an ‘average link strength’ in a way I’ll describe later. When this number is bigger than a certain fixed value, they claim an El Niño is coming.

How do they decide if they’re right? How do we tell when an El Niño actually arrives? One way is to use the ‘Niño 3.4 index’. This is the area-averaged sea surface temperature anomaly in the yellow region here:

Anomaly means the temperature minus its average over time: how much hotter than usual it is. When the Niño 3.4 index is over 0.5°C for at least 5 months, Ludescher et al say there’s an El Niño. (By the way, this is not the standard definition. But we will discuss that some other day.)
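To make this criterion concrete, here is a minimal Python sketch that scans a list of monthly Niño 3.4 values and reports the runs that stay above 0.5°C for at least 5 months. The function name `el_nino_episodes` and the sample numbers are made up for illustration; this is not code from Ludescher et al.

```python
def el_nino_episodes(nino34, threshold=0.5, min_months=5):
    """Return (start, end) index pairs, end exclusive, for runs where the
    monthly index stays above `threshold` for at least `min_months` months."""
    episodes = []
    run_start = None
    for k, value in enumerate(nino34):
        if value > threshold:
            if run_start is None:
                run_start = k
        else:
            if run_start is not None and k - run_start >= min_months:
                episodes.append((run_start, k))
            run_start = None
    # Close a run that is still open at the end of the record.
    if run_start is not None and len(nino34) - run_start >= min_months:
        episodes.append((run_start, len(nino34)))
    return episodes

# Made-up monthly anomalies in °C, for illustration only.
example = [0.1, 0.6, 0.7, 0.9, 1.1, 0.8, 0.3, 0.6, 0.7, 0.2]
print(el_nino_episodes(example))  # [(1, 6)]
```

The two warm months near the end of the sample do not count, since that run is shorter than 5 months.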

Here is what they get:

The blue peaks are El Niños: episodes where the Niño 3.4 index is over 0.5°C for at least 5 months.

The red line is their ‘average link strength’. Whenever this exceeds a certain threshold $\Theta = 2.82,$ and the Niño 3.4 index is not already over 0.5°C, they predict an El Niño will start in the following calendar year.

The green arrows show their successful predictions. The dashed arrows show their false alarms. A little letter n appears next to each El Niño that they failed to predict.

You’re probably wondering where the number $2.82$ came from. They get it from a learning algorithm that finds this threshold by optimizing the predictive power of their model. Chart A here shows the ‘learning phase’ of their calculation. In this phase, they adjusted the threshold $\Theta$ so their procedure would do a good job. Chart B shows the ‘testing phase’. Here they used the value of $\Theta$ chosen in the learning phase, and checked to see how good a job it did. I’ll let you read their paper for more details on how they chose $\Theta.$

But what about their prediction now? That’s the green arrow at far right here:

On 17 September 2013, the red line went above the threshold! So, their scheme predicts an El Niño sometime in 2014. The chart at right is a zoomed-in version that shows the red line in August, September, October and November of 2013.

### The details

Now I mainly need to explain how they compute their ‘average link strength’.

Let $i$ stand for any point in this 9 × 23 grid:

For each day $t$ between June 1948 and November 2013, let $\tilde{T}_i(t)$ be the average surface air temperature at the point $i$ on day $t.$

Let $T_i(t)$ be $\tilde{T}_i(t)$ minus its climatological average. For example, if $t$ is June 1st 1970, we average the temperature at location $i$ over all June 1sts from 1948 to 2013, and subtract that from $\tilde{T}_i(t)$ to get $T_i(t).$

They call $T_i(t)$ the temperature anomaly.

(A subtlety here: when we are doing prediction we can’t know the future temperatures, so the climatological average is only the average over past days meeting the above criteria.)
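Here is a small Python sketch of the anomaly calculation, including the ‘past only’ variant. It assumes the temperatures are stored as `temps[year][day]` with the same number of days in every year, and it assumes the past-only average includes the current year. Both are my simplifications, not something the paper specifies.

```python
def anomaly(temps, year, day, past_only=False):
    """Temperature at (year, day) minus the mean for that calendar day.

    temps[y][d] is the temperature in year index y on calendar day d.
    With past_only=True, only years up to and including `year` are
    averaged (an assumption about what 'past days' means).
    """
    last = year + 1 if past_only else len(temps)
    same_day = [temps[y][day] for y in range(last)]
    climatology = sum(same_day) / len(same_day)
    return temps[year][day] - climatology

# Three made-up "years" of two days each, in °C.
temps = [[10.0, 20.0], [12.0, 22.0], [14.0, 27.0]]
print(anomaly(temps, 1, 0))                  # 12 - mean(10, 12, 14) = 0.0
print(anomaly(temps, 1, 0, past_only=True))  # 12 - mean(10, 12) = 1.0
```

The same day can have a different anomaly depending on which years go into the average, which is why this choice matters for honest forecasting.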

For any function of time, denote its moving average over the last 365 days by:

$\displaystyle{ \langle f(t) \rangle = \frac{1}{365} \sum_{d = 0}^{364} f(t - d) }$

Let $i$ be a point in the El Niño basin, and $j$ be a point outside it. For any time lag $\tau$ between 0 and 200 days, define the time-delayed cross-covariance by:

$\langle T_i(t) T_j(t - \tau) \rangle - \langle T_i(t) \rangle \langle T_j(t - \tau) \rangle$

Note that this is a way of studying the linear correlation between the temperature anomaly at node $i$ and the temperature anomaly a time $\tau$ earlier at some node $j.$ So, it’s about how temperature anomalies inside the El Niño basin are correlated to temperature anomalies outside this basin at earlier times.

Ludescher et al then normalize this, defining the time-delayed cross-correlation $C_{i,j}^{t}(-\tau)$ to be the time-delayed cross-covariance divided by

$\sqrt{\Big{\langle} (T_i(t) - \langle T_i(t)\rangle)^2 \Big{\rangle}} \; \sqrt{\Big{\langle} (T_j(t-\tau) - \langle T_j(t-\tau)\rangle)^2 \Big{\rangle}}$

This is something like the standard deviation of $T_i(t)$ times the standard deviation of $T_j(t - \tau).$ Dividing by standard deviations is what people usually do to turn covariances into correlations. But there are some potential problems here, which I’ll discuss later.

They define $C_{i,j}^{t}(\tau)$ in a similar way, by taking

$\langle T_i(t - \tau) T_j(t) \rangle - \langle T_i(t - \tau) \rangle \langle T_j(t) \rangle$

and normalizing it. So, this is about how temperature anomalies outside the El Niño basin are correlated to temperature anomalies inside this basin at earlier times.
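Here is a rough Python sketch of the time-delayed cross-correlation for a single pair of nodes. The helper names `window`, `mean` and `lagged_correlation` are mine. I also assume the normalization is the ordinary standard deviation over the same 365 days, which may not be exactly what Ludescher et al did.

```python
import math

def window(T, t, n):
    """The n most recent values of T up to and including day t."""
    if t - (n - 1) < 0:
        raise ValueError("not enough history for this window")
    return [T[t - d] for d in range(n)]

def mean(values):
    return sum(values) / len(values)

def lagged_correlation(Ti, Tj, t, tau, n=365):
    """Correlate Ti on days t-d with Tj on days t-tau-d, for d = 0..n-1.

    For tau >= 0 this plays the role of C^t_{i,j}(-tau). Assumption: the
    normalization is the plain standard deviation over the same n days.
    """
    a = window(Ti, t, n)
    b = window(Tj, t - tau, n)
    covariance = mean([x * y for x, y in zip(a, b)]) - mean(a) * mean(b)
    var_a = mean([x * x for x in a]) - mean(a) ** 2
    var_b = mean([y * y for y in b]) - mean(b) ** 2
    return covariance / math.sqrt(var_a * var_b)

# Made-up anomalies: node i repeats node j's signal three days later.
Tj = [math.sin(0.3 * k) for k in range(60)]
Ti = [0.0, 0.0, 0.0] + Tj[:-3]
print(lagged_correlation(Ti, Tj, t=59, tau=3, n=20))  # close to 1
print(lagged_correlation(Ti, Tj, t=59, tau=0, n=20))  # below 1 in magnitude
```

Swapping the roles of `Ti` and `Tj` gives the correlations for positive lags, where the basin leads and the outside node follows.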

Next, for nodes $i$ and $j,$ and for each time point $t,$ they determine the maximum, the mean and the standard deviation of $|C_{i,j}^t(\tau)|,$ as $\tau$ ranges from -200 to 200 days.

They define the link strength $S_{i j}(t)$ as the difference between the maximum and the mean value, divided by the standard deviation.

Finally, they let $S(t)$ be the average link strength, calculated by averaging $S_{i j}(t)$ over all pairs $(i,j)$ where $i$ is a node in the El Niño basin and $j$ is a node outside.
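In code, the last two steps might look like the Python sketch below. It takes the list of cross-correlations for all lags as given. The function names are hypothetical, and I use the population standard deviation since the text does not say which version is meant.

```python
import statistics

def link_strength(correlations):
    """(max - mean) / std of |C| over all lags tau.

    Assumption: population standard deviation (statistics.pstdev).
    """
    magnitudes = [abs(c) for c in correlations]
    peak = max(magnitudes)
    average = statistics.fmean(magnitudes)
    spread = statistics.pstdev(magnitudes)
    return (peak - average) / spread

def average_link_strength(correlations_by_pair):
    """Average the link strength over all (basin node, outside node) pairs."""
    strengths = [link_strength(c) for c in correlations_by_pair.values()]
    return statistics.fmean(strengths)

# Made-up correlations for four lags; real data would have 401 lags per pair.
toy = [0.1, -0.1, 0.1, -0.9]
print(link_strength(toy))  # about 1.732
print(average_link_strength({("i1", "j1"): toy, ("i2", "j1"): toy}))
```

A link is ‘strong’ in this sense when one lag stands out sharply from the rest, not merely when the correlations are large.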

They compute $S(t)$ for every 10th day between January 1950 and November 2013. When $S(t)$ goes over 2.82, and the Niño 3.4 index is not already over 0.5°C, they predict an El Niño in the next calendar year.
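The alarm rule itself is easy to sketch in Python. Here I assume that ‘goes over’ means an upward crossing between two consecutive samples, and that $S(t)$ and the Niño 3.4 index are given at the same sample times. The function name `el_nino_alarms` and the numbers are invented for illustration.

```python
def el_nino_alarms(S, nino34, theta=2.82, nino_threshold=0.5):
    """Indices where S rises above theta while the Niño 3.4 index
    is not already above nino_threshold.

    Assumption: S and nino34 are sampled at the same times.
    """
    alarms = []
    for k in range(1, len(S)):
        crossed_upward = S[k - 1] <= theta < S[k]
        if crossed_upward and nino34[k] <= nino_threshold:
            alarms.append(k)
    return alarms

# Made-up values, for illustration only.
S = [2.5, 2.9, 2.7, 3.0, 3.1]
nino34 = [0.1, 0.2, 0.3, 0.8, 0.9]
print(el_nino_alarms(S, nino34))  # [1]
```

The second crossing, at index 3, raises no alarm because the Niño 3.4 index is already over 0.5°C there.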

There’s more to say about their methods. We’d like you to help us check their work and improve it. Soon I want to show you Graham Jones’ software for replicating their calculations! But right now I just want to conclude by:

• mentioning a potential problem in the math, and

• telling you where to get the data used by Ludescher et al.

### Mathematical nuances

Ludescher et al normalize the time-delayed cross-covariance in a somewhat odd way. They claim to divide it by

$\sqrt{\Big{\langle} (T_i(t) - \langle T_i(t)\rangle)^2 \Big{\rangle}} \; \sqrt{\Big{\langle} (T_j(t-\tau) - \langle T_j(t-\tau)\rangle)^2 \Big{\rangle}}$

This is a strange thing, since it has nested angle brackets. The angle brackets are defined as a running average over the 365 days, so this quantity involves data going back twice as long: 730 days. Furthermore, the ‘link strength’ involves the above expression where $\tau$ goes up to 200 days.

So, taking their definitions at face value, Ludescher et al could not actually compute their ‘link strength’ until 930 days after the surface temperature data first starts at the beginning of 1948. That would be mid-1950. But their graph of the link strength starts at the beginning of 1950!

Perhaps they actually normalized the time-delayed cross-covariance by dividing it by this:

$\sqrt{\big{\langle} T_i(t)^2 \big{\rangle} - \big{\langle} T_i(t)\big{\rangle}^2} \; \sqrt{\big{\langle} T_j(t-\tau)^2 \big{\rangle} - \big{\langle} T_j(t-\tau)\big{\rangle}^2}$

This simpler expression avoids nested angle brackets, and it makes more sense conceptually. It is the standard deviation of $T_i(t)$ over the last 365 days, times the standard deviation of $T_j(t-\tau)$ over the last 365 days.

As Nadja Kutz noted, the expression written by Ludescher et al does not equal this simpler expression, since:

$\Big{\langle} T_i(t) \; \langle T_i(t) \rangle \Big{\rangle} \neq \big{\langle} T_i(t) \big{\rangle} \; \big{\langle} T_i(t) \big{\rangle}$

The reason is that

$\begin{array}{ccl} \Big{\langle} T_i(t) \; \langle T_i(t) \rangle \Big{\rangle} &=& \displaystyle{ \frac{1}{365} \sum_{d = 0}^{364} T_i(t-d) \langle T_i(t-d) \rangle} \\ \\ &=& \displaystyle{ \frac{1}{365} \sum_{d = 0}^{364} \frac{1}{365} \sum_{D = 0}^{364} T_i(t-d) T_i(t-d-D)} \end{array}$

which is generically different from

$\Big{\langle} \langle T_i(t) \rangle \;\langle T_i(t) \rangle \Big{\rangle} =$

$\displaystyle{ \frac{1}{365} \sum_{D = 0}^{364} (\frac{1}{365} \sum_{d = 0}^{364} T_i(t-d-D))(\frac{1}{365} \sum_{d = 0}^{364} T_i(t-d-D) ) }$

since the terms in the latter expression contain products $T_i(t-364-364)T_i(t-364-364)$ that can’t appear in the former.

Moreover:

$\begin{array}{ccl} \Big{\langle} (T_i(t) - \langle T_i(t) \rangle)^2 \Big{\rangle} &=& \Big{\langle} T_i(t)^2 - 2 T_i(t) \langle T_i(t) \rangle + \langle T_i(t) \rangle^2 \Big{\rangle} \\ \\ &=& \langle T_i(t)^2 \rangle - 2 \big{\langle} T_i(t) \langle T_i(t) \rangle \big{\rangle} + \big{\langle} \langle T_i(t) \rangle^2 \big{\rangle} \end{array}$

But since $\big{\langle} T_i(t) \langle T_i(t) \rangle \big{\rangle} \neq \big{\langle} \langle T_i(t) \rangle \; \langle T_i(t) \rangle \big{\rangle},$ as was just shown, those terms do not cancel out in the above expression. In particular, this means that

$-2 \big{\langle} T_i(t) \langle T_i(t) \rangle \big{\rangle} + \big{\langle} \langle T_i(t) \rangle \langle T_i(t) \rangle \big{\rangle}$

contains terms $T_i(t-364-364)$ which do not appear in $\langle T_i(t)\rangle^2,$ hence

$\Big{\langle} (T_i(t) - \langle T_i(t) \rangle)^2 \Big{\rangle} \neq \langle T_i(t)^2\rangle - \langle T_i(t)\rangle^2$

So at least for the case of the standard deviation it is clear that those two definitions are not the same for a running mean. For the covariances this would still need to be shown.
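A quick numerical check makes the difference visible. This Python sketch uses a window of 2 days instead of 365 and a made-up series, so the arithmetic can be followed by hand. The function names are mine.

```python
def avg(f, t, n):
    """Running mean <f(t)>: average of f(t), f(t-1), ..., f(t-n+1)."""
    return sum(f(t - d) for d in range(n)) / n

def nested_variance(T, t, n):
    """< (T(t) - <T(t)>)^2 >, with the inner mean recomputed for each day."""
    return avg(lambda s: (T[s] - avg(lambda r: T[r], s, n)) ** 2, t, n)

def simple_variance(T, t, n):
    """<T(t)^2> - <T(t)>^2, using only the last n days."""
    return avg(lambda s: T[s] ** 2, t, n) - avg(lambda s: T[s], t, n) ** 2

# Made-up series; a window of n = 2 keeps the sums small.
T = [0.0, 1.0, 4.0, 9.0, 16.0]
print(nested_variance(T, 4, 2))  # 9.25
print(simple_variance(T, 4, 2))  # 12.25
```

The nested version reaches back to `T[2]`, while the simple one only uses `T[3]` and `T[4]`. That is the small-window analogue of needing roughly twice as many days of data.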

### Surface air temperatures

Remember that $\tilde{T}_i(t)$ is the average surface air temperature at the grid point $i$ on day $t.$ You can get these temperatures from here:

• Earth System Research Laboratory, NCEP Reanalysis Daily Averages Surface Level, or ftp site.

These sites will give you worldwide daily average temperatures on a 2.5° latitude × 2.5° longitude grid (144 × 73 grid points), from 1948 to now. The website will help you get data from within a chosen rectangle in a grid, for a chosen time interval. Alternatively, you can use the ftp site to download temperatures worldwide one year at a time. Either way, you’ll get ‘NetCDF files’, a format we will discuss later, when we get into more details about programming!

### Niño 3.4

Niño 3.4 is the area-averaged sea surface temperature anomaly in the region 5°S-5°N and 170°-120°W. You can get Niño 3.4 data here:

• Niño 3.4 data since 1870 calculated from the HadISST1, NOAA. Discussed in N. A. Rayner et al, Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century, J. Geophys. Res. 108 (2003), 4407.

You can also get Niño 3.4 data here:

• Monthly Niño 3.4 index, Climate Prediction Center, National Weather Service.

The actual temperatures in Celsius are close to those at the other website, but the anomalies are rather different, because they’re computed in a way that takes global warming into account. See the website for details.

Niño 3.4 is just one of several official regions in the Pacific:

• Niño 1: 80°W-90°W and 5°S-10°S.

• Niño 2: 80°W-90°W and 0°S-5°S.

• Niño 3: 90°W-150°W and 5°S-5°N.

• Niño 3.4: 120°W-170°W and 5°S-5°N.

• Niño 4: 160°E-150°W and 5°S-5°N.

• Kevin E. Trenberth, The definition of El Niño, Bulletin of the American Meteorological Society 78 (1997), 2771–2777.
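If you want to pick out grid points in these regions, it helps to write the boxes down as data. This Python sketch stores each region from the list above with longitudes converted to degrees east, so the box for Niño 4 does not wrap around the date line. The names `NINO_REGIONS`, `degrees_east` and `regions_containing` are made up, and treating the boundaries as inclusive is my assumption.

```python
# Boxes as (south, north, west, east); longitudes in degrees east (0 to 360),
# so 150°W becomes 210 and boxes crossing the date line stay contiguous.
NINO_REGIONS = {
    "Niño 1": (-10, -5, 270, 280),
    "Niño 2": (-5, 0, 270, 280),
    "Niño 3": (-5, 5, 210, 270),
    "Niño 3.4": (-5, 5, 190, 240),
    "Niño 4": (-5, 5, 160, 210),
}

def degrees_east(lon, hemisphere):
    """Convert e.g. (170, 'W') to 190 degrees east."""
    return lon % 360 if hemisphere == "E" else (360 - lon) % 360

def regions_containing(lat, lon_east):
    """Names of all regions whose box contains the point (inclusive edges)."""
    return [name for name, (s, n, w, e) in NINO_REGIONS.items()
            if s <= lat <= n and w <= lon_east <= e]

# A point on the equator at 160°W.
print(regions_containing(0, degrees_east(160, "W")))  # ['Niño 3.4', 'Niño 4']
```

As the example shows, the regions overlap: Niño 3.4 straddles parts of Niño 3 and Niño 4.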

## Chemical Reaction Network Talks

26 June, 2014

A while ago I blogged about David Soloveichik’s talk at this workshop:

• Programming with Chemical Reaction Networks: Mathematical Foundations, Banff International Research Station, 8-13 June 2014.

Now the slides for his talk are available:

• David Soloveichik, U.C. San Francisco, The computational power of chemical reaction networks.

And now I’d like to tell you about three more talks!

The first two are about ways one chemical reaction network can simulate another. This is important for a couple of reasons. First, in biology, a bunch of different chemical reactions can ‘accomplish the same task’, and we’d like to make this idea precise. That’s what Luca Cardelli spoke about. Second, people trying to do computation with chemistry are starting to simulate quite general reactions using DNA! That’s what Seung Woo Shin spoke about.

### Luca Cardelli

Luca Cardelli was at Oxford when I was visiting this spring, but unfortunately I didn’t meet him! He works for Microsoft on the interface of biology and computation. At Banff, he talked about ways one chemical reaction network can simulate another. His slides are here:

• Luca Cardelli, Morphisms of reaction networks.

He has a paper that gives a more detailed explanation of these ideas:

• Luca Cardelli, Morphisms of reaction networks that couple structure to function.

Here is my own disorganized explanation… with lots of informative but confusing digressions. A population protocol is a chemical reaction network with only 2-in, 2-out reactions. For example, this paper presents a population protocol that does ‘approximate majority detection’:

• Dana Angluin, James Aspnes, and David Eisenstat, A simple population protocol for fast robust approximate majority, Distributed Computing 21 (2008), 87–102.

What’s the idea? We start with two kinds of molecules, say $x$’s and $y$’s, and we want to see which one is in the majority, so we run these chemical reactions:

$x + y \to x + b$

$x + y \to y + b$

$x + b \to 2x$

$y + b \to 2y$

See? All the reactions have 2 molecules going in and 2 going out. The $b$ molecules act as ‘undecided voters’ who become either an $x$ or a $y,$ depending on who they meet first.

If we start with about $n$ molecules, in $O(n \log n)$ time these reactions are very likely to convert all $x$’s and $y$’s to whatever kind of molecule was in the majority initially… at least if the gap in the number of $x$’s and $y$’s is big enough.
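You can watch this happen with a few lines of Python. The sketch below repeatedly picks two random molecules and applies whichever of the four reactions fits, until nothing more can react. It is a discrete-time toy in which every meeting reacts and the two $x + y$ reactions are equally likely. That is my simplification, not the exact stochastic kinetics, and the function names are made up.

```python
import random

def can_react(counts):
    """True while some reaction is still possible."""
    x, y, b = counts["x"], counts["y"], counts["b"]
    return bool((x and y) or (b and (x or y)))

def draw(rng, counts):
    """Pick one molecule type with probability proportional to its count."""
    kinds = list(counts)
    return rng.choices(kinds, weights=[counts[k] for k in kinds])[0]

def approximate_majority(x, y, b=0, seed=0):
    """Pair up random molecules until no reaction is possible.

    Returns the final counts and the number of encounters simulated.
    """
    rng = random.Random(seed)
    counts = {"x": x, "y": y, "b": b}
    steps = 0
    while can_react(counts):
        first = draw(rng, counts)
        counts[first] -= 1            # the second molecule must be a different one
        second = draw(rng, counts)
        counts[first] += 1
        pair = {first, second}
        if pair == {"x", "y"}:        # x + y -> x + b  or  x + y -> y + b
            loser = rng.choice(["x", "y"])
            counts[loser] -= 1
            counts["b"] += 1
        elif pair == {"x", "b"}:      # x + b -> 2x
            counts["b"] -= 1
            counts["x"] += 1
        elif pair == {"y", "b"}:      # y + b -> 2y
            counts["b"] -= 1
            counts["y"] += 1
        steps += 1
    return counts, steps

final, steps = approximate_majority(70, 30, seed=1)
print(final, steps)
```

The run always ends with all molecules of one kind and no undecided voters left. Which kind wins depends on the random seed, though a large initial majority wins most of the time.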

Here’s another population protocol that also does the job:

$x + y \to 2b$

$x + b \to 2x$

$y + b \to 2y$

And here’s a proof that one of these algorithms actually works—most of the time, when the initial difference in populations is big enough:

• Etienne Perron, Dinkar Vasudevan, and Milan Vojnović, Using three states for binary consensus on complete graphs, Technical Report, MSR-TR-2008-114, Microsoft, September 2008.

If we use a discrete-time formalism to describe the dynamics, the proof seems to get harder. See the paper by Angluin, Aspnes, and Eisenstat for the only known proof!

Anyway, Luca Cardelli is interested in chemical reaction networks actually found in biology. This approximate majority algorithm is seen quite clearly in a certain biological system: a certain ‘epigenetic switch’. However, it is usually ‘obfuscated’ or ‘encrypted’: hidden in a bigger, more complicated chemical reaction network. For example, see:

• Luca Cardelli and Attila Csikász-Nagy, The cell cycle switch is approximate majority obfuscated, Scientific Reports 2 (2012).

This got him interested in developing a theory of morphisms between reaction networks, which could answer questions like: When can one CRN emulate another? But these questions turn out to be much easier if we use the rate equation than with the master equation. So, he asks: when can one CRN give the same rate equation as another?

He found a sufficient condition that’s ‘purely syntactic’: you can tell if it holds by looking at the reaction networks, regardless of the rate constants.

Here’s the idea. We say one network emulates another if for any rate constants of the second, we can find rate constants for the first that make its rate equation have solutions exactly mimicking those of the second, but where several species in the first correspond to one in the second.

For this to make sense, we assume there is a map sending:

• species to species
• reactions to reactions

In a chemical reaction network homomorphism, the map on reactions is determined by the map on species in the obvious way. For example, if species $A$ is sent to $f(A)$ and species $B$ is sent to $f(B)$ then the reaction

$2A + B \to 3B$

is sent to the reaction

$2f(A) + f(B) \to 3 f(B)$

In this situation, to make the first network emulate the second, we need to set equal the initial concentrations of all species in the inverse image of a given species.

A reactant homomorphism from one chemical reaction network to another is more general: it sends species to species, and for any reaction in the first chemical reaction network with input

$A + B + C + \cdots$

there’s a reaction in the second with input

$f(A) + f(B) + f(C) + \cdots$

(Reactant is another name for input.)
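Both ideas are easy to play with in Python if we store each side of a reaction as a `Counter` of species. This is only a sketch with invented names, and it ignores rate constants entirely.

```python
from collections import Counter

def apply_to_complex(f, complex_):
    """Image of a complex such as 2A + B under the species map f."""
    image = Counter()
    for species, n in complex_.items():
        image[f[species]] += n
    return image

def apply_to_reaction(f, reaction):
    """Send a reaction (inputs, outputs) to (f(inputs), f(outputs))."""
    inputs, outputs = reaction
    return (apply_to_complex(f, inputs), apply_to_complex(f, outputs))

def is_reactant_homomorphism(f, source, target):
    """Check that the image of every source input is the input of some target reaction."""
    target_inputs = [inputs for inputs, _ in target]
    return all(apply_to_complex(f, inputs) in target_inputs
               for inputs, _ in source)

# The reaction 2A + B -> 3B, as in the example above.
reaction = (Counter(A=2, B=1), Counter(B=3))
f = {"A": "X", "B": "Y"}
print(apply_to_reaction(f, reaction))  # 2X + Y -> 3Y

# A made-up pair of one-reaction networks.
source = [(Counter(A=1, B=1), Counter(B=2))]
target = [(Counter(X=2), Counter(X=1, Y=1))]
print(is_reactant_homomorphism({"A": "X", "B": "X"}, source, target))  # True
print(is_reactant_homomorphism({"A": "X", "B": "Y"}, source, target))  # False
```

Notice that the reactant check only looks at inputs: in the `True` case the outputs of the two reactions do not match at all.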

A stoichiomorphism is a kind of morphism that takes rate constants into account. See Cardelli’s paper for the definition.

The main theorem: if there is a stoichiomorphism from one chemical reaction network to another that’s also a reactant homomorphism, then the first emulates the second.

For a better explanation, read his paper! Here’s a cool picture from his paper showing a bunch of chemical reaction networks including the approximate majority network (labelled AM), many of which show up in biology, and morphisms between them:

Click to enlarge! These chemical reaction networks are drawn in a special style: as influence networks, consisting of ‘gates’ where one process activates or deactivates another. Each gate is a chemical reaction network of a certain form, schematically like this:

$\mathrm{off} \leftrightarrow \mathrm{intermediate} \leftrightarrow \mathrm{on}$

where the forward reactions are catalyzed by one chemical and the reverse reactions are catalyzed by another. A gate is like a switch that can be turned on or off.

While listening to this talk, I thought the way in which one CRN emulates another in Cardelli’s formalism looks awfully similar to the way one dynamical system emulates another in Eugene Lerman’s formalism:

• Eugene Lerman, Networks of dynamical systems, Azimuth, 18 March 2014.

The following picture from Cardelli’s paper shows that one of his morphisms of reaction networks is like a ‘covering map’. This reminds me a lot of what’s happening in Lerman’s work.

Again, click to enlarge!

### Seung Woo Shin

Seung Woo Shin was actually Brendan Fong’s roommate at the National University of Singapore while Brendan was working with me on chemical reaction networks. Apparently they never talked about their work!

Shin spoke about some other concepts of ‘morphism’ between chemical reaction networks. These other concepts do not involve reaction rates, just which chemicals can turn into which. You can see his slides here:

• Seung Woo Shin, Verifying CRN implementations.

and read his thesis for more details:

• Seung Woo Shin, Compiling and verifying DNA-based chemical reaction network implementations, Master’s thesis, Caltech, 2012.

Abstract: One goal of molecular programming and synthetic biology is to build chemical circuits that can control chemical processes at the molecular level. Remarkably, it has been shown that synthesized DNA molecules can be used to construct complex chemical circuits that operate without any enzyme or cellular component. However, designing DNA molecules at the individual nucleotide base level is often difficult and laborious, and thus chemical reaction networks (CRNs) have been proposed as a higher-level programming language. So far, several general-purpose schemes have been described for designing synthetic DNA molecules that simulate the behavior of arbitrary CRNs, and many more are being actively investigated.

Here, we solve two problems related to this topic. First, we present a general-purpose CRN-to-DNA compiler that can apply user-defined compilation schemes for translating formal CRNs to domain-level specifications for DNA molecules. In doing so, we develop a language in which such schemes can be concisely and precisely described. This compiler can greatly reduce the amount of tedious manual labor faced by researchers working in the field. Second, we present a general method for the formal verification of the correctness of such compilation. We first show that this problem reduces to testing a notion of behavioral equivalence between two CRNs, and then we construct a mathematical formalism in which that notion can be precisely defined. Finally, we provide algorithms for testing that notion. This verification process can be thought of as an equivalent of model checking in molecular computation, and we hope that the generality of our verification techniques will eventually allow us to apply them not only to DNA-based CRN implementations but to a wider class of molecular programs.

His thesis built on this earlier paper:

• David Soloveichik, Georg Seelig and Erik Winfree, DNA as a universal substrate for chemical kinetics, Proceedings of the National Academy of Sciences (2010).

I think this work is fascinating and deeply related to category theory, so I talked to Shin and Winfree about it, and this is what we came up with:

• CRN equivalences: progress report.

This is one of several reports on progress people at the workshop made on various open problems.

### David Anderson

Brendan Fong and I wrote about David Anderson’s work in Part 9 of the network theory series. It’s so impressive that I expected him to be older… older than me, I guess. He’s not!

In his tutorial, he gave an overview of chemical reaction networks with an emphasis on the deficiency zero theorem. Since many people were puzzled by the ‘deficiency’ concept, they asked lots of questions. But I’ve already explained that idea in Part 21. So, I’ll just mention a couple of cool theorems he told us about!

Theorem (Horn and Jackson). If a reaction network has a complex balanced equilibrium, then:

1. It has no equilibria that are not complex balanced.

2. The reaction network must be weakly reversible.

3. Every stoichiometric compatibility class contains precisely one complex balanced equilibrium.

I should have known this, since this work is classic. But I don’t think I knew that the existence of one complex balanced equilibrium implied all this stuff!

He also mentioned this paper:

• Guy Shinar and Martin Feinberg, Structural sources of robustness in biochemical reaction networks, Science (2010).

which contains this amazing theorem:

Theorem (Shinar and Feinberg). Suppose there is a chemical reaction network such that:

1. its deficiency equals one;

2. it has a positive steady state;

3. it has two “non-terminal complexes” that differ only in one species $S.$ (“Non-terminal” is a concept that’s easier to explain with a picture of a reaction network).

Then the species $S$ is absolutely robust: with any initial conditions, the rate equation will approach an equilibrium where the concentration of $S$ approaches a specific fixed value $c,$ independent of the initial conditions!

However, things work very differently if we treat the system stochastically, using the master equation:

• David F. Anderson, German A. Enciso and Matthew D. Johnston, Stochastic analysis of biochemical reaction networks with absolute concentration robustness.

### More

A lot more happened at this workshop! There was a huge amount of discussion of the leader election problem, which is about how to cook up chemical reactions that create a ‘leader’: a single molecule of some sort.

• Leader election: the problem, and references.

As I explained before, David Soloveichik talked about various forms of digital computation with chemical reaction networks. David Doty talked about the very important flip side of the coin: analog computation.

• David Doty, Rate-independent computation by real-valued chemistry.

There were also great talks by Lulu Qian and Erik Winfree, which I won’t try to summarize. Qian does a lot of work in the lab making things actually happen, so if you’re a practical sort this is the talk to look at:

All in all, a very stimulating workshop. The diversity of things one can ask about chemical reaction networks is quite exciting!

## El Niño Project (Part 2)

24 June, 2014

Before we dive into the exciting world of El Niño prediction, and ways that you can help, let’s have a very very basic crash course on the physics of El Niño.

El Niños are still rather mysterious. But that doesn’t mean we should ignore what the experts know, or suspect.

### The basics

Winds called trade winds blow west across the tropical Pacific, from the Americas to Asia. During La Niña years, water at the ocean’s surface moves along with the wind, warming up in the sunlight as it travels. So, warm water collects at the ocean’s surface off the coast of Asia. This creates more clouds and rainstorms there.

Meanwhile, since surface water is being dragged west by the wind, cold water from below gets pulled up to take its place in the eastern Pacific, off the coast of South America.

So, the temperature at the ocean’s surface looks like this:

This situation is actually reinforced by a feedback loop. Since the ocean’s surface is warmer near Asia, it heats the air and makes it rise. This helps the trade winds blow toward Asia: they go there to fill the ‘gap’ left by rising air.

Of course, you should be wondering: why do the trade winds blow west in the first place?

Without an answer to this, the story so far would work just as well if we switched the words ‘west’ and ‘east’. That wouldn’t mean the story is wrong. It might just mean that there were two stable states of the Earth’s climate: a La Niña state where the trade winds blow west, and another state—say, the El Niño—where they blow east. One could imagine a world permanently stuck in one of these phases. Or perhaps it could flip between these two phases for some reason.

Something roughly like the last choice is actually true. But it’s not so simple: there’s not a complete symmetry between west and east!

Why not? Mainly because the Earth is turning to the east. Air near the equator warms up and rises, so new air from more northern or southern regions moves in to take its place. But because the Earth is fatter at the equator, the equator is moving faster to the east. So, this new air from other places is moving less quickly by comparison… so as seen by someone standing on the equator, it blows west. This is called the Coriolis effect, and it produces winds like this:

Beware: a wind that blows to the west is called an easterly. So the westward-blowing trade winds I’m talking about are called "northeasterly trades" and "southeasterly trades" on this picture.

It’s also good to remember that the west Pacific touches the part of Asia also called the ‘Far East’, while the east Pacific touches the part of America also called the ‘West Coast’. So, it’s easy to get confused! If you find yourself getting confused, just repeat this sentence:

The easterlies blow west from West Coast to Far East.

Everything will instantly become much clearer.

Terminology aside, the story so far should be clear. The trade winds have a good intrinsic reason to blow west, but in the La Niña phase they’re also part of a feedback loop where they make the western Pacific warmer… which in turn helps the trade winds blow west.

But now comes an El Niño! Now for some reason the westward winds weaken. This lets the built-up warm water in the western Pacific slosh back east. And with weaker westward winds, less cold water is pulled up to the surface in the eastern Pacific. So, the eastern Pacific warms up. This makes for more clouds and rain in the eastern Pacific—that’s when we get floods in Southern California. And with the ocean warmer in the eastern Pacific, hot air rises there, which tends to counteract the westward winds even more.

In other words, all the feedbacks reverse themselves! Here’s how it looked in the big El Niño of 1997:

But note: the trade winds never mainly blow east. Even during an El Niño they still blow west, just a bit less. So, the climate is not flip-flopping between two symmetrical alternatives. It’s flip-flopping between two asymmetrical alternatives.

Here’s how it goes! The vertical height of the ocean is exaggerated here to show how water piles up:

Here we see the change in trade winds and ocean currents:

By the way, you can click on any of the pictures to get more information.

### But why?

One huge remaining question is: why do the trade winds weaken? We could also ask the same question about the start of the La Niña phase: why do the trade winds get stronger then?

The short answer is: nobody knows! At least there’s no one story that everyone agrees on. There are actually several stories… and perhaps more than one of them is true. So, at this point it is worthwhile revisiting some actual data:

The top graph shows variations in the water temperature of the tropical Eastern Pacific ocean. When it’s hot we have El Niños: those are the red hills in the top graph. The blue valleys are La Niñas. Note that it’s possible to have two El Niños in a row without an intervening La Niña, or vice versa!

The bottom graph shows the Southern Oscillation Index or SOI. This is basically the air pressure in Tahiti minus the air pressure in Darwin, Australia, divided by its standard deviation.

So, when the SOI is high, the air pressure is higher in the east Pacific than in the west Pacific. This is what we expect in a La Niña: that’s why the westward trade winds are strong then! Conversely, the SOI is low in the El Niño phase. This variation in the SOI is called the Southern Oscillation.
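Taking the rough description of the index literally, a Python sketch looks like this. The pressures are invented, the function name `soi` is mine, and operational versions of the index also subtract a long-term mean, which I leave out here.

```python
import statistics

def soi(tahiti, darwin):
    """Tahiti minus Darwin pressure, divided by the standard deviation
    of that difference over the whole record (a simplified index)."""
    diff = [p_t - p_d for p_t, p_d in zip(tahiti, darwin)]
    spread = statistics.pstdev(diff)
    return [d / spread for d in diff]

# Made-up monthly sea level pressures in hPa.
tahiti = [1012.0, 1014.0, 1010.0, 1012.0]
darwin = [1010.0, 1010.0, 1010.0, 1010.0]
print(soi(tahiti, darwin))  # roughly [1.41, 2.83, 0.0, 1.41]
```

Dividing by the standard deviation makes the index dimensionless, so a value of 2 means the pressure difference is two standard deviations in size.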

If you look at the graphs above, you’ll see how one looks almost like an upside-down version of the other. So, the El Niño/La Niña cycle is tightly linked to the Southern Oscillation.

Another thing you’ll see from these graphs is that the ENSO is far from perfectly periodic! Here’s a graph of the Southern Oscillation Index going back a lot further:

So, there’s something inherently irregular about this oscillation. It could be chaotic—meaning that tiny changes amplify as time goes by, making long-term prediction impossible. It could be noisy—meaning that the randomness is mainly due to outside influences. It could be somewhere in between! But nobody is sure.

The graph above was made by William Kessler, an expert on El Niño. His FAQs are worth a look:

• William Kessler, El Niño: How it works, how we observe it.

He describes some theories about why an El Niño starts, and why it ends. These theories involve three additional concepts:

• The thermocline is the border between the warmer surface water in the ocean and the cold deep water, 100 to 200 meters below the surface. During the La Niña phase, warm water is blown to the western Pacific, and cold water is pulled up to the surface of the eastern Pacific. So, the thermocline becomes deeper in the west than the east:

When an El Niño occurs, the thermocline flattens out:

• Oceanic Rossby waves are very low-frequency waves in the ocean’s surface and thermocline. At the ocean’s surface they are only 5 centimeters high, but hundreds of kilometers across. The surface waves are mirrored by waves in the thermocline, which are much taller, 10-50 meters in height. When the surface goes up, the thermocline goes down!

• The Madden-Julian oscillation or MJO is the largest form of variability in the tropical atmosphere on time scales of 30-90 days. It’s a pulse that moves east across the Indian Ocean and Pacific Ocean at 4-8 meters/second. It manifests itself as patches of anomalously high rainfall and also anomalously low rainfall. Strong Madden-Julian oscillations are often seen 6-12 months before an El Niño starts!

With this bit of background, I hope you’re ready for what Kessler wrote in his El Niño FAQ:

There are two main theories at present. The first is that the event is initiated by the reflection from the western boundary of the Pacific of an oceanic Rossby wave (type of low-frequency planetary wave that moves only west). The reflected wave is supposed to lower the thermocline in the west-central Pacific and thereby warm the sea surface temperature by reducing the efficiency of upwelling to cool the surface. Then that makes winds blow towards the (slightly) warmer water and really start the event. The nice part about this theory is that the oceanic Rossby waves can be observed for months before the reflection, which implies that El Niño is predictable.

The other idea is that the trigger is essentially random. The tropical convection (organized large-scale thunderstorm activity) in the rising air tends to occur in bursts that last for about a month, and these bursts propagate out of the Indian Ocean (known as the Madden-Julian Oscillation). Since the storms are geostrophic (rotating according to the turning of the earth, which means they rotate clockwise in the southern hemisphere and counter-clockwise in the north), storm winds on the equator always blow towards the east. If the storms are strong enough, or last long enough, then those eastward winds may be enough to start the sloshing. But specific Madden-Julian Oscillation events are not predictable much in advance (just as specific weather events are not predictable in advance), and so to the extent that this is the main element, then El Niño will not be predictable.

In my opinion both these two processes can be important in different El Niños. Some models that did not have the MJO storms were successful in predicting the events of 1986-87 and 1991-92. That suggests that the Rossby wave part was a main influence at that time. But those same models have failed to predict the events since then, and the westerlies have appeared to come from nowhere. It is also quite possible that these two general sets of ideas are incomplete, and that there are other causes entirely. The fact that we have very intermittent skill at predicting the major turns of the ENSO cycle (as opposed to the very good forecasts that can be made once an event has begun) suggests that there remain important elements that are await explanation.

So it’s complicated!

Next time I’ll talk about a new paper that tries to cut through these complications and predict El Niños more than 6 months in advance, using a simple idea. It’s a great opportunity for programmers to dive in and try to do better. But I think we need to keep the subtleties in mind… at least somewhere in the back of our mind.

## El Niño Project (Part 1)

20 June, 2014

A bunch of Azimuth Project members like to program, so they started the Azimuth Code Project… but now it’s getting more lively! We’re trying to understand and predict the climate phenomenon known as El Niño.

Why? Several reasons:

• It’s the biggest source of variability in the Earth’s climate on time scales between a year and a decade. It causes weather disturbances in many regions, especially near the Pacific Ocean. The last really big one happened in 1997-1998, and we’re waiting for the next.

• It’s hard to predict for more than 6 months in advance. It’s not periodic: it’s a quasi-periodic phenomenon that occurs across the tropical Pacific Ocean every 3 to 7 years.

• It matters for global warming. A lot of heat gets stored in the ocean, and a lot comes back into the atmosphere during an El Niño. So, the average surface air temperature of the Earth may reach a new high when the next El Niño comes.

• In February 2014, a paper in Proceedings of the National Academy of Sciences caused a stir by claiming to be able to predict the next El Niño more than 6 months in advance using ideas from network theory. Moreover, it claimed an El Niño would start in late 2014 with a 75% probability.

• The math involved in this paper is interesting, not too complicated, and maybe we can improve on it. At the very least, it raises a lot of questions worth studying. And it’s connected to network theory, one of the Azimuth Project’s specialties!

We are already hard at work on this project. We could use help from computer programmers, mathematicians, and physicists: there is lots to do! But it makes sense to start by explaining the issues and what we’ve done so far. We’ll do that in a series of posts here.

This first post will not get into many details. Instead, I just want to set the stage with some basic information about El Niño.

### El Niño and La Niña

This animation produced by the Australian Bureau of Meteorology shows how the cycle works:

During La Niña years, trade winds blow across the Pacific Ocean from the Americas to Asia in a strong way. So, warm surface water gets pushed toward Asia. Warmer oceans there create more clouds and rain there. The other side of the Pacific gets cooler, so there is less rain in many parts of the Americas.

During El Niño years, trade winds in the tropical Pacific weaken, and blobs of warm surface water move back toward the Americas. So, the eastern part of the Pacific warms up. We generally get more rain in the Americas… but less in Asia.

### ENSO

The cycle of El Niños and La Niñas is often called the El Niño/Southern Oscillation or ENSO. Why? Because this cycle is linked to the Southern Oscillation: an oscillation in the difference in air pressure between the eastern and western Pacific:

The top graph shows variations in the water temperature of the tropical eastern Pacific ocean: when it’s hot we have an El Niño. The bottom graph shows the air pressure in Tahiti minus the air pressure in Darwin, Australia. Up to a normalization constant, this is called the Southern Oscillation Index, or SOI. If you stare at the graphs a while, you’ll see they’re quite strongly correlated, or more precisely anticorrelated, since one tends to go up when the other goes down. So, remember:

A big negative SOI goes along with an El Niño!

There are other ways besides the SOI to tell if an El Niño is happening. We’ll talk later about these quantities, how they’re defined, how you can get the data online, what we’ve done with this data, and what we want to do.

### Is a big El Niño coming?

To conclude, I just want you to watch this short movie. NASA’s Jason-2 satellite has detected blobs of hot water moving east toward America! This has made some scientists—not just those using network theory—suspect a big El Niño is on its way, perhaps a repeat of the one that started in 1997.

On the other hand, on June 17th the National Oceanic and Atmospheric Administration (NOAA) said that trends are now running “counter to typical El Niño development”. So we’ll have to wait and see… and meanwhile, try to predict!

### References

If you can’t wait to dive in, start here:

Experiments in El Niño detection and prediction, Azimuth Forum.

To join this conversation, join the forum by following these instructions:

This is the paper that got us excited:

• Josef Ludescher, Avi Gozolchiani, Mikhail I. Bogachev, Armin Bunde, Shlomo Havlin, and Hans Joachim Schellnhuber, Very early warning of next El Niño, Proceedings of the National Academy of Sciences, February 2014.

A lot of the methodology is explained here:

• Josef Ludescher, Avi Gozolchiani, Mikhail I. Bogachev, Armin Bunde, Shlomo Havlin, and Hans Joachim Schellnhuber, Improved El Niño forecasting by cooperativity detection, Proceedings of the National Academy of Sciences, 30 May 2013. (For more discussion, go to the Azimuth Forum.)

## Wind Power and the Smart Grid

18 June, 2014

Electric power companies complain about wind power because it’s intermittent: if suddenly the wind stops, they have to bring in other sources of power.

This is no big deal if we only use a little wind. Across the US, wind now supplies 4% of electric power; even in Germany it’s just 8%. The problem starts if we use a lot of wind. If we’re not careful, we’ll need big fossil-fuel-powered electric plants when the wind stops. And these need to be turned on, ready to pick up the slack at a moment’s notice!

So, a few years ago Xcel Energy, which supplies much of Colorado’s power, ran ads opposing a proposal that it use renewable sources for 10% of its power.

But now things have changed. Now Xcel gets about 15% of their power from wind, on average. And sometimes this spikes to much more!

Every few seconds, hundreds of turbines measure the wind speed. Every 5 minutes, they send this data to high-performance computers 100 miles away at the National Center for Atmospheric Research in Boulder. NCAR crunches these numbers along with data from weather satellites, weather stations, and other wind farms – and creates highly accurate wind power forecasts.

With better prediction, Xcel can do a better job of shutting down idling backup plants on days when they’re not needed. Last year was a breakthrough year – better forecasts saved Xcel nearly as much money as they had in the three previous years combined.

It’s all part of the emerging smart grid—an intelligent network that someday will include appliances and electric cars. With a good smart grid, we could set our washing machine to run when power is cheap. Maybe electric cars could store solar power in the day, use it to power neighborhoods when electricity demand peaks in the evening – then recharge their batteries using wind power in the early morning hours. And so on.

### References

I would love it if the Network Theory project could ever grow to the point of helping design the smart grid. So far we are doing much more ‘foundational’ work on control theory, along with a more applied project on predicting El Niños. I’ll talk about both of these soon! But I have big hopes and dreams, so I want to keep learning more about power grids and the like.

Here are two nice references:

• Kevin Bullis, Smart wind and solar power, from 10 breakthrough technologies, Technology Review, 23 April 2014.

• Keith Parks, Yih-Huei Wan, Gerry Wiener and Yubao Liu, Wind energy forecasting: a collaboration of the National Center for Atmospheric Research (NCAR) and Xcel Energy.

The first is fun and easy to read. The second has more technical details. It describes the software used (the picture on top of this article shows a bit of this), and also some of the underlying math and physics. Let me quote a bit:

#### High-resolution Mesoscale Ensemble Prediction Model (EPM)

It is known that atmospheric processes are chaotic in nature. This implies that even small errors in the model initial conditions combined with the imperfections inherent in the NWP model formulations, such as truncation errors and approximations in model dynamics and physics, can lead to a wind forecast with large errors for certain weather regimes. Thus, probabilistic wind prediction approaches are necessary for guiding wind power applications. Ensemble prediction is at present a practical approach for producing such probabilistic predictions. An innovative mesoscale Ensemble Real-Time Four Dimensional Data Assimilation (E-RTFDDA) and forecasting system that was developed at NCAR was used as the basis for incorporating this ensemble prediction capability into the Xcel forecasting system.

Ensemble prediction means that instead of a single weather forecast, we generate a probability distribution on the set of weather forecasts. The paper has references explaining this in more detail.
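Here is a toy sketch of the idea in Python. It uses the logistic map as a stand-in for a chaotic forecast model, which is purely an assumption for illustration and has nothing to do with E-RTFDDA or the actual Xcel system. The function names, the perturbation size and the threshold are all hypothetical.

```python
import random

def toy_model(x, steps):
    """Stand-in 'forecast model': iterate the chaotic logistic map x -> 4x(1-x)."""
    for _ in range(steps):
        x = 4.0 * x * (1.0 - x)
    return x

def ensemble_forecast(x0, steps, members=1000, noise=1e-4, seed=0):
    """Run the model from many slightly perturbed initial conditions."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(members):
        # Perturb the initial condition, keeping it inside [0, 1].
        start = min(max(x0 + rng.uniform(-noise, noise), 0.0), 1.0)
        outcomes.append(toy_model(start, steps))
    return outcomes

outcomes = ensemble_forecast(x0=0.3, steps=30)
# Estimated probability that the forecast quantity exceeds a threshold.
p_high = sum(o > 0.5 for o in outcomes) / len(outcomes)
```

Instead of one number, the forecast is the whole list `outcomes`, and statements like "the chance of exceeding 0.5" are read off from it. After 30 steps the tiny initial differences have grown so much that the members are spread over the whole interval, which is exactly why a single run would be misleading.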

We had a nice discussion of wind power and the smart grid over on G+. Among other things, John Despujols mentioned the role of ‘smart inverters’ in enhancing grid stability:

Smart solar inverters smooth out voltage fluctuations for grid stability, DigiKey article library.

A solar inverter converts the variable direct current output of a photovoltaic solar panel into alternating current usable by the electric grid. There’s a lot of math involved here—click the link for a Wikipedia summary. But solar inverters are getting smarter.

#### Wild fluctuations

While the solar inverter has long been the essential link between the photovoltaic panel and the electricity distribution network and converting DC to AC, its role is expanding due to the massive growth in solar energy generation. Utility companies and grid operators have become increasingly concerned about managing what can potentially be wildly fluctuating levels of energy produced by the huge (and still growing) number of grid-connected solar systems, whether they are rooftop systems or utility-scale solar farms. Intermittent production due to cloud cover or temporary faults has the potential to destabilize the grid. In addition, grid operators are struggling to plan ahead due to lack of accurate data on production from these systems as well as on true energy consumption.

In large-scale facilities, virtually all output is fed to the national grid or micro-grid, and is typically well monitored. At the rooftop level, although individually small, collectively the amount of energy produced has a significant potential. California estimated it has more than 150,000 residential rooftop grid-connected solar systems with a potential to generate 2.7 MW.

However, while in some systems all the solar energy generated is fed to the grid and not accessible to the producer, others allow energy generated to be used immediately by the producer, with only the excess fed to the grid. In the latter case, smart meters may only measure the net output for billing purposes. In many cases, information on production and consumption, supplied by smart meters to utility companies, may not be available to the grid operators.

#### Getting smarter

The solution according to industry experts is the smart inverter. Every inverter, whether at panel level or megawatt-scale, has a role to play in grid stability. Traditional inverters have, for safety reasons, become controllable, so that they can be disconnected from the grid at any sign of grid instability. It has been reported that sudden, widespread disconnects can exacerbate grid instability rather than help settle it.

Smart inverters, however, provide a greater degree of control and have been designed to help maintain grid stability. One trend in this area is to use synchrophasor measurements to detect and identify a grid instability event, rather than conventional ‘perturb-and-observe’ methods. The aim is to distinguish between a true island condition and a voltage or frequency disturbance which may benefit from additional power generation by the inverter rather than a disconnect.

Smart inverters can change the power factor. They can input or receive reactive power to manage voltage and power fluctuations, driving voltage up or down depending on immediate requirements. Adaptive volts-amps reactive (VAR) compensation techniques could enable ‘self-healing’ on the grid.

Two-way communications between smart inverter and smart grid not only allow fundamental data on production to be transmitted to the grid operator on a timely basis, but upstream data on voltage and current can help the smart inverter adjust its operation to improve power quality, regulate voltage, and improve grid stability without compromising safety. There are considerable challenges still to overcome in terms of agreeing and evolving national and international technical standards, but this topic is not covered here.

The benefits of the smart inverter over traditional devices have been recognized in Germany, Europe’s largest solar energy producer, where an initiative is underway to convert all solar energy producers’ inverters to smart inverters. Although the cost of smart inverters is slightly higher than traditional systems, the advantages gained in grid balancing and accurate data for planning purposes are considered worthwhile. Key features of smart inverters required by German national standards include power ramping and volt/VAR control, which directly influence improved grid stability.

## The Computational Power of Chemical Reaction Networks

10 June, 2014

I’m at this workshop:

Programming with Chemical Reaction Networks: Mathematical Foundations, Banff International Research Station, 8-13 June 2014.

Luca Cardelli wrote about computation with chemical reactions in Part 26 of the network theory series here on this blog. So, it’s nice to meet him and many other researchers, learn more, and try to solve some problems together!

• David Soloveichik, U.C. San Francisco, The computational power of chemical reaction networks.

David works at the Center for Systems and Synthetic Biology, and their website says:

David did his graduate work with Erik Winfree at Caltech, focusing on algorithmic self-assembly and on synthetic networks of nucleic-acid interactions based on strand displacement cascades. He is interested in “molecular programming”: the systematic design of complex molecular systems based on the principles of computer science and distributed computing. More generally, he is trying to create a theoretical foundation of chemical computation applicable to both synthetic and natural systems.

According to his webpage, Soloveichik’s research interests are:

Wet-lab: the rational design of molecular interactions for synthetic biology, nanotechnology, and bioengineering. The goal is to engineer autonomous molecular systems that can sense, compute, and perform various actions. Using nucleic-acid “strand displacement cascades” as the molecular primitive, we are able to attain freedom of design that hasn’t been previously possible.

Theory: The theoretical foundation of chemical computation. Once we have a way to program molecular interactions, what programming language shall we use? How molecules can process information and carry out computation is still not well-understood; however, a formal connection to models of concurrent computation may allow systematic and scalable design, rigorous analysis and verification. Further, computational principles may elucidate the design of biological regulatory networks.

Here are my notes on his tutorial.

### Motivation

We’ve got people here from different backgrounds:

• computational complexity theory
• wetlab / experimental science
• pure and applied mathematics
• software verification

CRNs (chemical reaction networks) show up in:

• chemistry
• population biology
• sensor networks
• math:
○ Petri nets
○ commutative semigroups
○ bounded context-free languages
○ uniform recurrence equations

Why use them for computation? People want to go beyond the von Neumann architecture for computation. People also want to understand how cells process information. However, with a few exceptions, the computational perspective in this talk has not yet proved relevant in biology. So, there is a lot left to learn.

### The model

The model of computation here will be the master equation for a chemical reaction network… since this has been explained starting in Part 4 of the network theory series, I won’t review it!

Can all chemical reaction networks, even those without any conservation laws, be realized by actual chemical systems?

Though this is a subtle question, one answer is “yes, using strand displacement cascades”. This is a trick for getting DNA to simulate other chemical reactions. It’s been carried out in the lab! See this paper and many subsequent ones:

• Soloveichik, Seelig and Winfree, DNA as a universal substrate for chemical kinetics.

Abstract: Molecular programming aims to systematically engineer molecular and chemical systems of autonomous function and ever-increasing complexity. A key goal is to develop embedded control circuitry within a chemical system to direct molecular events. Here we show that systems of DNA molecules can be constructed that closely approximate the dynamic behavior of arbitrary systems of coupled chemical reactions. By using strand displacement reactions as a primitive, we construct reaction cascades with effectively unimolecular and bimolecular kinetics. Our construction allows individual reactions to be coupled in arbitrary ways such that reactants can participate in multiple reactions simultaneously, reproducing the desired dynamical properties. Thus arbitrary systems of chemical equations can be compiled into real chemical systems. We illustrate our method on the Lotka–Volterra oscillator, a limit-cycle oscillator, a chaotic system, and systems implementing feedback digital logic and algorithmic behavior.

However, even working with the master equation for a CRN, there are various things we might mean by having it compute something:

• uniform vs non-uniform: is a single CRN supposed to handle all inputs, or do we allow adding extra reactions for larger inputs? It’s a bit like Turing machines vs Boolean circuits.

• deterministic vs probabilistic: is the correct output guaranteed or merely likely?

• halting vs stabilizing: does the CRN ‘know’ when it has finished, or not? In the ‘halting’ case the CRN irreversibly produces some molecules that signal that the computation is done. In the ‘stabilizing’ case, it eventually stabilizes to the right answer, but we may not know how long to wait.

These distinctions dramatically affect the computational power. In the case of uniform computation:

• deterministic and halting: this has finite computational power.

• deterministic and stabilizing: this can decide semilinear predicates.

• probabilistic and halting: this is Turing-universal.

• probabilistic and stabilizing: this can decide $\Delta_2^0$ predicates, which are more general than computable ones. (Indeed, if we use Turing machines but don’t require them to signal when they’ve halted, the resulting infinitely long computations can ‘compute’ stuff that’s not computable in the usual sense.)

### Deterministic stabilizing computations

Let’s look at the deterministic stabilizing computations in a bit more detail. We’ll look at decision problems. We have a subset $S \subseteq \mathbb{N}^d,$ and we want to answer this question: is the vector $X \in \mathbb{N}^d$ in the set $S?$

To do this, we represent the vector as a bunch of molecules: $X_1$ of the first kind, $X_2$ of the second kind, and so on. We call this an input. We may also include a fixed collection of additional molecules in our input, to help the reactions run.

Then we choose a chemical reaction network, and we let it run on our input. The answer to our question will be encoded in some molecules called Y and N. If $X$ is in $S,$ we want our chemical reaction to produce Y molecules. If it’s not, we want our reaction to produce N’s.

To make this more precise, we need to define what counts as an output. If we’ve got a bunch of molecules that

• contains Y but not N: then the output is YES.

• contains N but not Y: then the output is NO.

Otherwise the output is undefined.

Output-stable states are states with YES or NO output such that all states reachable from them via our chemical reactions give the same output. We say an output-stable state is correct if this output is the correct answer to the question: is $X$ in $S$?

Our chemical reaction network gives a deterministic stabilizing computation if for any input, and choosing any state reachable from that input, we can do further chemical reactions to reach a correct output-stable state.

In other words: starting from our input, and letting the chemical reactions run any way they want, we will eventually stabilize at an output that gives the right answer to the question “is $X$ in $S$?”

### Examples

This sounds a bit complicated, but it’s really not. Let’s look at some examples!

Example 1. Suppose you want to check two numbers and see if one is greater than or equal to another. Here

$S = \{(X_1,X_2) : X_2 \ge X_1 \}$

How can you decide if a pair of numbers $(X_1,X_2)$ is in this set?

You start with $X_1$ molecules of type $A,$ $X_2$ molecules of type $B,$ and one molecule of type $Y$. Then you use a chemical reaction network with these reactions:

$A + Y \to N$
$B + N \to Y$

If you let these reactions run, the $Y$ switches to a $N$ each time the reactions destroy an $A$. But the $N$ switches back to a $Y$ each time the reactions destroy a $B.$

When no more reactions are possible, we are left with either one $Y$ or one $N$, which is the correct answer to your question!
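Here is a minimal Python sketch of this example. It takes the reactions to be $A + Y \to N$ and $B + N \to Y$, matching the description that the $Y$ switches to an $N$ each time an $A$ is destroyed. The function name `compare_crn` is hypothetical. Since there is always exactly one $Y$ or $N$ molecule around, at most one reaction can fire at any moment, so a simple loop suffices.

```python
def compare_crn(x1, x2):
    """Run A + Y -> N and B + N -> Y from x1 A's, x2 B's and one Y.

    Returns "Y" or "N": the single answer molecule left when nothing can react.
    """
    a, b, answer = x1, x2, "Y"
    while True:
        if answer == "Y" and a > 0:      # A + Y -> N
            a, answer = a - 1, "N"
        elif answer == "N" and b > 0:    # B + N -> Y
            b, answer = b - 1, "Y"
        else:                            # no reaction applies
            return answer
```

The loop stops with a $Y$ as soon as the $A\mathrm{s}$ run out while a $Y$ is present, which happens exactly when $X_2 \ge X_1.$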

Example 2. Suppose you want to check two numbers and see if one is equal to another. Here

$S = \{(X_1,X_2) : X_2 = X_1 \}$

How can you decide if a pair of numbers $(X_1,X_2)$ is in this set?

This is a bit harder! As before, you start with $X_1$ molecules of type $A,$ $X_2$ molecules of type $B,$ and one molecule of type $Y.$ Then you use a chemical reaction network with these reactions:

$A + B \to Y$
$Y + N \to Y$
$A + Y \to A + N$
$B + Y \to B + N$

The first reaction lets an $A$ and a $B$ cancel out, producing a $Y$. If you only run this reaction, you’ll eventually be left with either a bunch of $A\mathrm{s}$ or a bunch of $B\mathrm{s}$ or nothing but $Y\mathrm{s}$.

If you have $Y\mathrm{s},$ your numbers were equal. The other reactions deal with the cases where you have $A\mathrm{s}$ or $B\mathrm{s}$ left over. But the key thing to check is that no matter what order we run the reactions, we’ll eventually get the right answer! In the end, you’ll have either $Y\mathrm{s}$ or $N\mathrm{s},$ not both, and this will provide the yes-or-no answer to the question of whether $X_1 = X_2.$
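To check the claim about the order of reactions experimentally, here is a small Python sketch that fires the four reactions of Example 2 in random order until nothing more can happen. The names `REACTIONS` and `run_equality_crn` are hypothetical, and choosing uniformly among the enabled reactions is just a way of sampling some order; it is not the mass-action rate law.

```python
import random

# Reactions of Example 2 as (reactants, products), with species counts.
REACTIONS = [
    ({"A": 1, "B": 1}, {"Y": 1}),
    ({"Y": 1, "N": 1}, {"Y": 1}),
    ({"A": 1, "Y": 1}, {"A": 1, "N": 1}),
    ({"B": 1, "Y": 1}, {"B": 1, "N": 1}),
]

def run_equality_crn(x1, x2, seed=None):
    """Fire enabled reactions in random order until none is enabled."""
    rng = random.Random(seed)
    state = {"A": x1, "B": x2, "Y": 1, "N": 0}
    while True:
        enabled = [(r, p) for r, p in REACTIONS
                   if all(state[s] >= n for s, n in r.items())]
        if not enabled:
            return state
        reactants, products = rng.choice(enabled)
        for s, n in reactants.items():
            state[s] -= n
        for s, n in products.items():
            state[s] += n
```

Calling `run_equality_crn(3, 3)` with different seeds should always end with some $Y\mathrm{s}$ and no $N\mathrm{s},$ while `run_equality_crn(3, 5)` should always end with leftover $B\mathrm{s},$ some $N\mathrm{s}$ and no $Y\mathrm{s}.$ Of course, trying random orders is only evidence, not a proof.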

### What deterministic stabilizing computations can do

We’ve looked at some examples of deterministic stabilizing computations. The big question is: what kind of questions can they answer?

More precisely, for what subsets $S \subseteq \mathbb{N}^d$ can we build a deterministic stabilizing computation that ends with output YES if the input $X$ lies in $S$ and with output NO otherwise?

The answer is: the ‘semilinear’ subsets!

• Dana Angluin, James Aspnes and David Eisenstat, Stably computable predicates are semilinear.

A set $S \subseteq \mathbb{N}^d$ is linear if it’s of the form

$\{u_0 + n_1 u_1 + \cdots + n_p u_p : n_i \in \mathbb{N} \}$

for some fixed vectors of natural numbers $u_i \in \mathbb{N}^d.$

A set $S \subseteq \mathbb{N}^d$ is semilinear if it’s a finite union of linear sets.
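As a quick check, the sets from Examples 1 and 2 fit this definition. Taking $\mathbb{N}$ to include 0 and $u_0 = (0,0),$ both are linear:

$\{(X_1,X_2) : X_2 \ge X_1 \} = \{ n_1 (1,1) + n_2 (0,1) : n_i \in \mathbb{N} \}$

$\{(X_1,X_2) : X_2 = X_1 \} = \{ n_1 (1,1) : n_1 \in \mathbb{N} \}$

The complement of the second set is semilinear but not linear. It is a union of two linear sets:

$\{(X_1,X_2) : X_2 \ne X_1 \} = \{ (1,0) + n_1 (1,0) + n_2 (1,1) \} \cup \{ (0,1) + n_1 (0,1) + n_2 (1,1) \}$

It can’t be a single linear set, because any linear set contains a point $u_0$ lying below all its other points, and here neither $(1,0)$ nor $(0,1)$ lies below the other.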

How did Angluin, Aspnes and Eisenstat prove their theorem? Apparently the easy part is showing that membership in any semilinear set can be decided by a chemical reaction network. David sketched the proof of the converse. I won’t go into it, but it used a very nice fact:

Dickson’s Lemma. Any subset of $\mathbb{N}^d$ has a finite set of minimal elements, where we define $x \le y$ if $x_i \le y_i$ for all $i$.

For example, the region above and to the right of the hyperbola here has five minimal elements:
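For finite sets of points, minimal elements are easy to find by brute force. Here is a minimal Python sketch; the function name `minimal_elements` is hypothetical, and the region $xy \ge 9$ is an example chosen just for illustration, not necessarily the hyperbola in the picture.

```python
def minimal_elements(points):
    """Return the points that have no other point below them componentwise."""
    pts = set(points)

    def below(p, q):
        return p != q and all(a <= b for a, b in zip(p, q))

    return sorted(q for q in pts if not any(below(p, q) for p in pts))

# Hypothetical region: lattice points with x*y >= 9, truncated to a finite box.
region = [(x, y) for x in range(1, 21) for y in range(1, 21) if x * y >= 9]
mins = minimal_elements(region)
# mins == [(1, 9), (2, 5), (3, 3), (5, 2), (9, 1)]
```

This particular region also happens to have five minimal elements. Every other point of the region lies above and to the right of at least one of them.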

If you know some algebra, Dickson’s lemma should remind you of the Hilbert basis theorem, saying (for example) that every ideal in a ring of multivariable polynomials over a field is finitely generated. And in fact, Paul Gordan used Dickson’s Lemma in 1899 to help give a proof of Hilbert’s basis theorem.

It’s very neat to see how this lemma applies to chemical reaction networks! You can see how it works in Angluin, Aspnes and Eisenstat’s paper. But they call it “Higman’s lemma” for some reason.

### References

Here are some of David Soloveichik’s recent talks:

• An introduction to strand displacement cascades for the Foresight Institute Conference (Palo Alto, Jan 2013): An artificial “biochemistry” with DNA.

• Paper presented at DNA Computing and Molecular Programming 18 (Aarhus, Denmark, Aug 2012): Deterministic function computation with chemical reaction networks.

• Tutorial talk for DNA Computing and Molecular Programming 17 (Pasadena, Aug 2011): The programming language of chemical kinetics, and how to discipline your DNA molecules using strand displacement cascades.

• High-level introduction to algorithmic self-assembly and stochastic chemical reaction networks as computer-theoretic models: Computer-theoretic abstractions for molecular programming.

• On algorithmic behavior in chemical reaction networks and implementing arbitrary chemical reaction networks with DNA: programming well-mixed chemical kinetics.

## Warming Slowdown? (Part 2)

5 June, 2014

guest post by Jan Galkowski

### 5. Trends Are Tricky

Trends as a concept are easy. But trends as objective measures are slippery. Consider the Keeling Curve, the record of atmospheric carbon dioxide concentration begun by Charles Keeling in the 1950s and continued in the face of great obstacles. This curve is reproduced in Figure 8, shown there in its original form and then decomposed into three parts: an annual sinusoidal variation, a linear trend, and a stochastic remainder.

Figure 8. Keeling CO2 concentration curve at Mauna Loa, Hawaii, showing original data and its decomposition into three parts, a sinusoidal annual variation, a linear trend, and a stochastic residual.

The question is, which component represents the true trend, long term or otherwise? Are linear trends superior to all others? The importance of a trend depends on the use to which it will be put. A pair of trends, like the sinusoidal and the random residual of the Keeling Curve, might be more important for predicting its short term movements. On the other hand, explicating the long term behavior of the system being measured might feature the large scale linear trend, with the seasonal trend and random variations being but distractions.

Consider the global surface temperature anomalies of Figure 5 again. What are some ways of determining trends? First, note that by “trends” what’s really meant are slopes. In the case where there are many places to estimate slopes, there are many slopes. When, for example, a slope is estimated by fitting a line to all the points, there’s just a single slope such as in Figure 9. Local linear trends can be estimated from pairs of points in differing sizes of neighborhoods, as depicted in Figures 10 and 11. These can be averaged, if you like, to obtain an overall trend.

Figure 9. Global surface temperature anomalies relative to a 1950-1980 baseline, with long term linear trend atop.

Figure 10. Global surface temperature anomalies relative to a 1950-1980 baseline, with randomly placed trends from local linear having 5 year support atop.

Figure 11. Global surface temperature anomalies relative to a 1950-1980 baseline, with randomly placed trends from local linear having 10 year support atop.

Lest the reader think constructing lots of linear trends on varying neighborhoods is somehow crude, note it has a noble history, being used by Boscovich to estimate Earth’s ellipticity about 1750, as reported by Koenker.
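The randomly placed local trends of Figures 10 and 11 can be sketched roughly as below. This is a guess at the general idea, not the code behind the figures: the anomaly series is synthetic, and `pair_slopes`, the noise level, and the number of pairs are all assumptions.

```python
import random
import statistics

def pair_slopes(years, values, support, n_pairs, rng):
    """Slopes from randomly placed pairs of points `support` years apart."""
    index = {yr: i for i, yr in enumerate(years)}
    starts = [yr for yr in years if yr + support in index]
    slopes = []
    for _ in range(n_pairs):
        y0 = rng.choice(starts)
        i, j = index[y0], index[y0 + support]
        slopes.append((values[j] - values[i]) / support)
    return slopes

# Synthetic annual anomalies: a slow rise plus noise (invented numbers).
rng = random.Random(1)
years = list(range(1880, 2013))
values = [0.007 * (yr - 1880) + rng.gauss(0, 0.1) for yr in years]

s5 = pair_slopes(years, values, 5, 2000, rng)    # 5 year support
s10 = pair_slopes(years, values, 10, 2000, rng)  # 10 year support

mean5, mean10 = statistics.mean(s5), statistics.mean(s10)
spread5, spread10 = statistics.pstdev(s5), statistics.pstdev(s10)
```

On data like these, both averages land near the underlying slope, while the 5 year slopes scatter much more widely than the 10 year ones, which is the behavior the figures are meant to convey.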

There is, in addition, a question of what to do if local intervals for fitting the little lines overlap, since these are then (on the face of it) not independent of one another. There are a number of statistical devices for making them independent. One way is to do clever kinds of random sampling from a population of linear trends. Another way is to shrink the intervals until they are infinitesimally small, and, so, necessarily independent. That definition is just the point slope of a curve going through the data, or its first derivative. Numerical methods for estimating these exist, and to the degree they succeed, they obtain estimates of the derivative, even if in doing so they might use finite intervals.

One good way of estimating derivatives involves using a smoothing spline, as sketched in Figure 6, and estimating the derivative(s) of that. Such an estimate of the derivative is shown in Figure 12 where the instantaneous slope is plotted in orange atop the data of Figure 6. The value of the derivative should be read using the scale to the right of the graph. The scale to the left shows, as before, temperature anomaly in degrees. The cubic spline itself is plotted in green in that figure. Here its smoothing parameter is determined by generalized cross-validation, a principled means of taking the subjectivity out of the choice of smoothing parameter. That is explained a bit more in the caption for Figure 12. (See also Cr1979.)

Figure 12. Global surface temperature anomalies relative to a 1950-1980 baseline, with instantaneous numerical estimates of derivatives in orange atop, with scale for the derivative to the right of the chart. Note how the value of the first derivative never drops below zero although its magnitude decreases as time approaches 2012. Support for the smoothing spline used to calculate the derivatives is obtained using generalized cross validation. Such cross validation is used to help reduce the possibility that a smoothing parameter is chosen to overfit a particular data set, so the analyst could expect that the spline would apply to as yet uncollected data more than otherwise. Generalized cross validation is a particularly clever way of doing that, although it is abstract.

What else might we do?

We could go after a really good approximation to the data of Figure 5. One possibility is to use the Bayesian Rauch-Tung-Striebel (“RTS”) smoother to get a good approximation for the underlying curve and estimate the derivatives of that. This is a modification of the famous Kalman filter, the workhorse of much controls engineering and signals work. What that means and how these work is described in an accompanying inset box.

Using the RTS smoother demands variances of the signal be estimated as priors. The larger the ratio of the estimate of the observations variance to the estimate of the process variance is, the smoother the RTS solution. And, yes, as the reader may have guessed, that makes the result dependent upon initial conditions, although hopefully educated initial conditions.

Figure 13. Global surface temperature anomalies relative to a 1950-1980 baseline, with fits using the Rauch-Tung-Striebel smoother placed atop, in green and dark green. The former uses a prior variance of 3 times that of the Figure 5 data corrected for serial correlation. The latter uses a prior variance of 15 times that of the Figure 5 data corrected for serial correlation. The instantaneous numerical estimates of the first derivative derived from the two solutions are shown in orange and brown, respectively, with their scale of values on the right hand side of the chart. Note the two solutions are essentially identical. If compared to the smoothing spline estimate of Figure 12, the derivative has roughly the same shape, but is shifted lower in overall slope, and the drift up and below a mean value is less.

The RTS smoother result for two process variance values of 0.118 ± 0.002 and 0.59 ± 0.02 is shown in Figure 13. These are 3 and 15 times the decorrelated variance value for the series of 0.039 ± 0.001, estimated using the long term variance for this series and others like it, corrected for serial correlation. One reason for using two estimates of the process variance is to see how much difference that makes. As can be seen from Figure 13, it does not make much.

Combining all six methods of estimating trends results in Figure 14, which shows the overprinted densities of slopes.

Figure 14. In a stochastic signal, slopes are random variables. They may be correlated. Fitting of smooth models can be thought of as a way of sampling these random variables. Here, empirical probability density functions for slopes of temperatures versus years are displayed, using each of the 6 methods of estimating slopes. Empirical probability densities are obtained using kernel density estimation. These are preferred to histograms by statisticians because the latter can distort the density due to bin size and boundary effects. The lines here correspond to: local linear fits with 5 years separation (dark green trace), the local linear fits with 10 years separation (green trace), the smoothing spline (blue trace), the RTS smoother with variance 3 times the corrected estimate for the data as the prior variance (orange trace, mostly hidden by brown trace), and the RTS smoother with 15 times the corrected estimate for the data (brown trace). The blue trace can barely be seen because the RTS smoother with the 3 times variance lies nearly atop it. The slope value for a linear fit to all the points is also shown (the vertical black line).

Note the spread of possibilities given by the 5 year local linear fits. The 10 year local linear fits, the spline, and the RTS smoother fits have their mode in the vicinity of the overall slope. The 10 year local linear fits slope has broader support, meaning it admits more negative slopes in the range of temperature anomalies observed. The RTS smoother results have peaks slightly below those for the spline, the 10 year local linear fits, and the overall slope. The kernel density estimator allows the possibility of probability mass below zero, even though the spline, and two RTS smoother fits never exhibit slopes below zero. This is a Bayesian-like estimator, since the prior is the real line.
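To make the point about probability mass below zero concrete, here is a minimal sketch of a Gaussian kernel density estimate. The slope values and the bandwidth are hypothetical, chosen only so that every sample is positive; nothing here reproduces the densities of Figure 14, and the helper names `gaussian_kde` and `mass_below` are assumptions.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function evaluating a Gaussian kernel density estimate."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return density

def mass_below(samples, bandwidth, threshold):
    """Probability the KDE puts below `threshold` (closed form for Gaussian kernels)."""
    root2 = math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((threshold - s) / (bandwidth * root2)))
               for s in samples) / len(samples)

# Hypothetical slopes (degrees per year), all strictly positive.
slopes = [0.002, 0.004, 0.005, 0.007, 0.009, 0.012, 0.015]
h = 0.004  # hypothetical bandwidth

f = gaussian_kde(slopes, h)
p_negative = mass_below(slopes, h, 0.0)

# Crude check that the density integrates to about one.
step = 1e-4
total = sum(f(-0.05 + i * step) for i in range(1201)) * step
```

Even though no sample is negative, `p_negative` comes out clearly above zero, because each kernel spreads some of its weight across the whole real line.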

Local linear fits to HadCRUT4 time series were used by Fyfe, Gillet, and Zwiers in their 2013 paper and supplement. We do not know the computational details of those trends, since they were not published, possibly due to Nature Climate Change page count restrictions. Those details matter. From these calculations, which, admittedly, are not as comprehensive as those by Fyfe, Gillet, and Zwiers, we see that robust estimators of trends in temperature during the observational record show these are always positive, even if the magnitudes vary. The RTS smoother solutions suggest slopes in recent years are near zero, providing a basis for questioning whether or not there is a warming “hiatus”.

The Rauch-Tung-Striebel smoother is an enhancement of the Kalman filter. Let $y_{\kappa}$ denote a set of univariate observations at equally spaced and successive time steps $\kappa$. Describe these as follows:

$y_{\kappa} = \mathbf{G} \mathbf{x}_{\kappa} + \varepsilon_{\kappa} \qquad (5.1)$

$\mathbf{x}_{\kappa + 1} = \mathbf{H} \mathbf{x}_{\kappa} + \boldsymbol\gimel_{\kappa} \qquad (5.2)$

$\varepsilon_{\kappa} \sim \mathcal{N}(0, \sigma^{2}_{\varepsilon}) \qquad (5.3)$

$\boldsymbol\gimel_{\kappa} \sim \mathcal{N}(0, \boldsymbol\Sigma^{2}_{\eta}) \qquad (5.4)$

The multivariate $\mathbf{x}_{\kappa}$ is called a state vector for index $\kappa$. $\mathbf{G}$ and $\mathbf{H}$ are given, constant matrices. Equations (5.3) and (5.4) say that the noise components of observations and states are distributed as zero mean Gaussian random variables with variance $\sigma^{2}_{\varepsilon}$ and covariance $\boldsymbol\Sigma^{2}_{\eta}$, respectively. This simple formulation in practice has great descriptive power, and is widely used in engineering and data analysis. For instance, it is possible to cast autoregressive moving average models (“ARMA”) in this form. (See Kitagawa, Chapter 10.)

The key idea is that equation (5.1) describes an observation at time $\kappa$ as the result of a linear regression on coefficients $\mathbf{x}_{\kappa}$, where $\mathbf{G}$ is the corresponding design matrix. Then, the coefficients themselves change with time, using a Markov-like development, a linear regression of the upcoming set of coefficients, $\mathbf{x}_{\kappa+1}$, in terms of the current coefficients, $\mathbf{x}_{\kappa}$, where $\mathbf{H}$ is the design matrix. For the purposes here, a simple version of this is used, something called a local level model (Chapter 2) and occasionally a Gaussian random walk with noise model (Section 12.3.1).

In that instance, $\mathbf{G}$ and $\mathbf{H}$ are not only scalars, they are unity, resulting in the simpler

$y_{\kappa} = x_{\kappa} + \varepsilon_{\kappa}$

$x_{\kappa + 1} = x_{\kappa} + \eta_{\kappa}$

$\varepsilon_{\kappa} \sim \mathcal{N}(0, \sigma^{2}_{\varepsilon})$

$\eta_{\kappa} \sim \mathcal{N}(0, \sigma^{2}_{\eta})$

with scalar variances $\sigma^{2}_{\varepsilon}$ and $\sigma^{2}_{\eta}$. In either case, the Kalman filter is a way of calculating $\mathbf{x}_{\kappa}$, given $y_{1}, y_{2}, \dots, y_{n}$, values for $\mathbf{G}$ and $\mathbf{H}$, and estimates for $\sigma^{2}_{\varepsilon}$ and $\sigma^{2}_{\eta}$. Choices for $\mathbf{G}$ and $\mathbf{H}$ are considered a model for the data. Choices for $\sigma^{2}_{\varepsilon}$ and $\sigma^{2}_{\eta}$ are based upon experience with $y_{\kappa}$ and the model. In practice, and within limits, the bigger the ratio $\sigma^{2}_{\varepsilon} / \sigma^{2}_{\eta}$ of observation variance to process variance, the smoother the solution for $\mathbf{x}_{\kappa}$ over successive $\kappa$.

Now, the Rauch-Tung-Striebel extension of the Kalman filter amounts to (a) interpreting it in a Bayesian context, and (b) using that interpretation and Bayes Rule to retrospectively update $\mathbf{x}_{\kappa-1}, \mathbf{x}_{\kappa-2}, \dots, \mathbf{x}_{1}$ with the benefit of information through $y_{\kappa}$ and the current state $\mathbf{x}_{\kappa}$. Details won’t be provided here, but are described in depth in many texts, such as Cowpertwait and Metcalfe, Durbin and Koopman, and Särkkä.

Finally, regarding the subjectivity of the choice of the ratio of variances, mentioned in Section 5: “smoother” here has a specific meaning. If this ratio is smaller, the RTS solution tracks the signal more closely, meaning its short term variability is higher. A small ratio has implications for forecasting, increasing the prediction variance.
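For readers who want to experiment, the scalar local level case can be sketched in a few lines: a forward Kalman filter pass followed by the backward Rauch-Tung-Striebel pass. This is a textbook-style sketch under stated assumptions, not the code used for Figure 13. The function name `rts_local_level`, the nearly diffuse initial variance, the choice of starting the state at the first observation, and the synthetic data are all assumptions.

```python
import random

def rts_local_level(y, obs_var, proc_var, init_var=1e6):
    """Kalman filter plus RTS smoother for y_k = x_k + eps_k, x_{k+1} = x_k + eta_k."""
    n = len(y)
    m_pred, p_pred = [0.0] * n, [0.0] * n
    m_filt, p_filt = [0.0] * n, [0.0] * n
    m, p = y[0], init_var  # assumed, nearly diffuse starting point
    for k in range(n):
        if k > 0:
            p = p + proc_var          # predict: random walk adds process variance
        m_pred[k], p_pred[k] = m, p
        gain = p / (p + obs_var)      # Kalman gain
        m = m + gain * (y[k] - m)     # update with observation y_k
        p = (1.0 - gain) * p
        m_filt[k], p_filt[k] = m, p
    m_smooth, p_smooth = m_filt[:], p_filt[:]
    for k in range(n - 2, -1, -1):    # backward (RTS) pass
        c = p_filt[k] / p_pred[k + 1]
        m_smooth[k] = m_filt[k] + c * (m_smooth[k + 1] - m_pred[k + 1])
        p_smooth[k] = p_filt[k] + c * c * (p_smooth[k + 1] - p_pred[k + 1])
    return m_smooth, p_smooth

def roughness(x):
    """Sum of squared first differences, a crude measure of wiggliness."""
    return sum((b - a) ** 2 for a, b in zip(x, x[1:]))

# Synthetic random walk observed with noise (invented numbers).
rng = random.Random(3)
truth = [0.0]
for _ in range(199):
    truth.append(truth[-1] + rng.gauss(0, 0.05))
y = [x + rng.gauss(0, 0.2) for x in truth]

smooth_lo, var_lo = rts_local_level(y, obs_var=0.04, proc_var=0.04 / 3)   # ratio 3
smooth_hi, var_hi = rts_local_level(y, obs_var=0.04, proc_var=0.04 / 15)  # ratio 15
slopes_hi = [b - a for a, b in zip(smooth_hi, smooth_hi[1:])]  # per-step slope estimate
```

Comparing `roughness(smooth_lo)` with `roughness(smooth_hi)` shows the effect described above: the larger the ratio of observation variance to process variance, the smoother the solution.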

The recent IPCC AR5 WG1 Report sets out the context in its Box TS.3:

Hiatus periods of 10 to 15 years can arise as a manifestation of internal decadal climate variability, which sometimes enhances and sometimes counteracts the long-term externally forced trend. Internal variability thus diminishes the relevance of trends over periods as short as 10 to 15 years for long-term climate change (Box 2.2, Section 2.4.3). Furthermore, the timing of internal decadal climate variability is not expected to be matched by the CMIP5 historical simulations, owing to the predictability horizon of at most 10 to 20 years (Section 11.2.2; CMIP5 historical simulations are typically started around nominally 1850 from a control run). However, climate models exhibit individual decades of GMST trend hiatus even during a prolonged phase of energy uptake of the climate system (e.g., Figure 9.8; Easterling and Wehner, 2009; Knight et al., 2009), in which case the energy budget would be balanced by increasing subsurface-ocean heat uptake (Meehl et al., 2011, 2013a; Guemas et al., 2013).

Owing to sampling limitations, it is uncertain whether an increase in the rate of subsurface-ocean heat uptake occurred during the past 15 years (Section 3.2.4). However, it is very likely that the climate system, including the ocean below 700 m depth, has continued to accumulate energy over the period 1998-2010 (Section 3.2.4, Box 3.1). Consistent with this energy accumulation, global mean sea level has continued to rise during 1998-2012, at a rate only slightly and insignificantly lower than during 1993-2012 (Section 3.7). The consistency between observed heat-content and sea level changes yields high confidence in the assessment of continued ocean energy accumulation, which is in turn consistent with the positive radiative imbalance of the climate system (Section 8.5.1; Section 13.3, Box 13.1). By contrast, there is limited evidence that the hiatus in GMST trend has been accompanied by a slower rate of increase in ocean heat content over the depth range 0 to 700 m, when comparing the period 2003-2010 against 1971-2010. There is low agreement on this slowdown, since three of five analyses show a slowdown in the rate of increase while the other two show the increase continuing unabated (Section 3.2.3, Figure 3.2). [Emphasis added by author.]

During the 15-year period beginning in 1998, the ensemble of HadCRUT4 GMST trends lies below almost all model-simulated trends (Box 9.2 Figure 1a), whereas during the 15-year period ending in 1998, it lies above 93 out of 114 modelled trends (Box 9.2 Figure 1b; HadCRUT4 ensemble-mean trend $0.26\,^{\circ}\mathrm{C}$ per decade, CMIP5 ensemble-mean trend $0.16\,^{\circ}\mathrm{C}$ per decade). Over the 62-year period 1951-2012, observed and CMIP5 ensemble-mean trends agree to within $0.02\,^{\circ}\mathrm{C}$ per decade (Box 9.2 Figure 1c; CMIP5 ensemble-mean trend $0.13\,^{\circ}\mathrm{C}$ per decade). There is hence very high confidence that the CMIP5 models show long-term GMST trends consistent with observations, despite the disagreement over the most recent 15-year period. Due to internal climate variability, in any given 15-year period the observed GMST trend sometimes lies near one end of a model ensemble (Box 9.2, Figure 1a, b; Easterling and Wehner, 2009), an effect that is pronounced in Box 9.2, Figure 1a, because GMST was influenced by a very strong El Niño event in 1998. [Emphasis added by author.]

The contributions of Fyfe, Gillet, and Zwiers (“FGZ”) are to (a) pin down this behavior for a 20 year period using the HadCRUT4 data, and, to my mind, more importantly, (b) to develop techniques for evaluating runs of ensembles of climate models like the CMIP5 suite without commissioning specific runs for the purpose. This, if it were to prove out, would be an important experimental advance, since climate models demand expensive and extensive hardware, and the number of people who know how to program and run them is very limited, possibly a more limiting practical constraint than the hardware.

This is the beginning of a great story, I think, one which both advances an understanding of how our experience of climate is playing out, and how climate science is advancing. FGZ took a perfectly reasonable approach and followed it to its logical conclusion, deriving an inconsistency. There’s insight to be won resolving it.

FGZ try to explicitly model trends due to internal variability. They begin with two equations:

$M_{ij}(t) = u^{m}(t) + \text{Eint}_{ij}(t) + \text{Emod}_{i}(t), \quad i = 1, \dots, N^{m}, \; j = 1, \dots, N_{i} \qquad (6.1)$

$O_{k}(t) = u^{o}(t) + \text{Eint}^{o}(t) + \text{Esamp}_{k}(t), \quad k = 1, \dots, N^{o} \qquad (6.2)$

$i$ is the model membership index. $j$ indexes the ensemble members (runs) of the $i^{\text{th}}$ model. $k$ runs over bootstrap samples taken from HadCRUT4 observations. Here, $M_{ij}(t)$ and $O_{k}(t)$ are trends calculated using models or observations, respectively. $u^{m}(t)$ and $u^{o}(t)$ denote the “true, unknown, deterministic trends due to external forcing” common to models and observations, respectively. $\text{Eint}_{ij}(t)$ and $\text{Eint}^{o}(t)$ are the perturbations to trends due to internal variability of models and observations. $\text{Emod}_{i}(t)$ denotes error in climate model trends for model $i$. $\text{Esamp}_{k}(t)$ denotes the sampling error in the $k^{\text{th}}$ sample. FGZ assume the $\text{Emod}_{i}(t)$ are exchangeable with each other, at least for the same time $t$. (See [Di1977, Di1988, Ro2013c, Co2005] for more on exchangeability.) Note that while the internal variability of climate models $\text{Eint}_{ij}(t)$ varies from model to model, run to run, and time to time, the ‘internal variability of observations’, namely $\text{Eint}^{o}(t)$, is assumed to only vary with time.

The technical innovation FGZ use is to employ bootstrap resampling on the observations ensemble of HadCRUT4 and an ensemble of runs of 38 CMIP5 climate models to perform a two-sample comparison [Ch2008, Da2009, ]. In doing so, they explicitly assume, in the framework above, exchangeability of models. (Later, in the same work, they also make the same calculation assuming exchangeability of models and observations, an innovation too detailed for this present exposition.)

So, what is a bootstrap? In its simplest form, a bootstrap is a nonparametric, often robust, frequentist technique for sampling the distribution of a function of a set of population parameters, generally irrespective of the nature or complexity of that function, or the number of parameters. Since estimates of the variance of that function are themselves functions of population parameters, assuming the variance exists, the bootstrap can also be used to estimate the precision of the first set of samples, where “precision” is the reciprocal of variance. For more about the bootstrap, see the inset below.

In the case in question here, with FGZ, the bootstrap is being used to determine if the distribution of surface temperature trends as calculated from observations and the distribution of surface temperature trends as calculated from climate models for the same period have in fact similar means. This is done by examining differences of paired trends, one coming from an observation sample, one coming from a model sample, and assessing the degree of discrepancy based upon the variances of the observations trends distribution and of the models trends distribution.
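The comparison just described might be sketched as follows. This is not FGZ's actual computation, whose details were not published. The two trend samples are synthetic, and the function name `paired_differences`, the sample sizes, and the use of a simple summed-variance scale are all assumptions made for illustration.

```python
import math
import random
import statistics

def paired_differences(obs_trends, model_trends, n_pairs, rng):
    """Differences of randomly paired trends: one model sample minus one observation sample."""
    return [rng.choice(model_trends) - rng.choice(obs_trends) for _ in range(n_pairs)]

rng = random.Random(7)
# Hypothetical trend samples (degrees per decade); the model sample is more dispersed.
obs_trends = [rng.gauss(0.05, 0.02) for _ in range(100)]
model_trends = [rng.gauss(0.20, 0.08) for _ in range(100)]

diffs = paired_differences(obs_trends, model_trends, 5000, rng)
mean_diff = statistics.mean(diffs)

# Scale the discrepancy by the combined spread of the two trend distributions.
combined_sd = math.sqrt(statistics.variance(obs_trends) + statistics.variance(model_trends))
z = mean_diff / combined_sd

# Fraction of paired differences at or below zero: small values mean the two
# distributions rarely overlap in the direction "models no warmer than observations".
frac_nonpositive = sum(d <= 0 for d in diffs) / len(diffs)
```

Note how much of the scale in `combined_sd` comes from the more dispersed sample; that asymmetry is the feature of the FGZ figure taken up in Section 7.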

The equations (6.1) and (6.2) can be rewritten:

$M_{ij}(t) - \text{Eint}_{ij}(t) = u^{m}(t) + \text{Emod}_{i}(t), \quad i = 1, \dots, N^{m}, \; j = 1, \dots, N_{i} \qquad (6.3)$

$O_{k}(t) - \text{Eint}^{o}(t) = u^{o}(t) + \text{Esamp}_{k}(t), \quad k = 1, \dots, N^{o} \qquad (6.4)$

moving the trends in internal variability to the left, calculated side. Neither $\text{Eint}_{ij}(t)$ nor $\text{Eint}^{o}(t)$ is directly observable. Without some additional assumptions, which are not explicitly given in the FGZ paper, such as

$\text{Eint}_{ij}(t) \sim \mathcal{N}(0, \Sigma_{\text{model int}}) \qquad (6.5)$

$\text{Eint}^{o}(t) \sim \mathcal{N}(0, \Sigma_{\text{obs int}}) \qquad (6.6)$

we can’t really be sure we’re seeing $O_{k}(t)$ or $O_{k}(t) - \text{Eint}^{o}(t)$, or at least $O_{k}(t)$ less the mean of $\text{Eint}^{o}(t)$. The same applies to $M_{ij}(t)$ and $\text{Eint}_{ij}(t)$. Here equations (6.5) and (6.6) describe internal variabilities as being multivariate but zero mean Gaussian random variables. $\Sigma_{\text{model int}}$ and $\Sigma_{\text{obs int}}$ are covariances among models and among observations. FGZ essentially say these are diagonal with their statement “An implicit assumption is that sampling uncertainty in [observation trends] is independent of uncertainty due to internal variability and also independent of uncertainty in [model trends]”. They might not be so, but it is reasonable to suppose their diagonals are strong, and that there is a row-column exchange operator on these covariances which can produce banded matrices.

### 7. On Reconciliation

The centerpiece of the FGZ result is their Figure 1, reproduced here as Figure 15. Their conclusion, that climate models do not properly capture surface temperature observations for the given periods, is based upon the significant separation of the red density from the grey density, even when measuring that separation using pooled variances. But, surely, a remarkable feature of these graphs is not only the separation of the means of the two densities, but the marked difference in size of the variances of the two densities.

Why are climate models so much less precise than HadCRUT4 observations? Moreover, why do climate models disagree with one another so dramatically? We cannot tell without getting into CMIP5 details, but the same result could be obtained if the climate models came in three Gaussian populations, each with a variance 1.5x that of the observations, but mixed together. We could also obtain the same result if, for some reason, the variance of HadCRUT4 was markedly understated.
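To see why mixing populations would inflate the spread, suppose, purely as an illustration, that the models fall into three Gaussian populations $c = 1, 2, 3$ with mixing weights $w_{c}$ and means $\mu_{c}$, each with variance $1.5\,\sigma^{2}_{o}$, where $\sigma^{2}_{o}$ is the variance of the observations. The weights and means are assumptions, not CMIP5 quantities. The law of total variance then gives

$\mathrm{Var}(M) = \sum_{c=1}^{3} w_{c} \left(1.5\,\sigma^{2}_{o}\right) + \sum_{c=1}^{3} w_{c} \left(\mu_{c} - \bar{\mu}\right)^{2} = 1.5\,\sigma^{2}_{o} + \sum_{c=1}^{3} w_{c} \left(\mu_{c} - \bar{\mu}\right)^{2}, \quad \bar{\mu} = \sum_{c=1}^{3} w_{c}\,\mu_{c}$

So the pooled variance exceeds $1.5\,\sigma^{2}_{o}$ by the spread of the population means. If those means are far apart, the mixture can look far more dispersed than any single population, even though each population is only modestly noisier than the observations.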

That brings us back to the comments about HadCRUT4 made at the end of Section 3. HadCRUT4 is noted for “drop outs” in observations, where either the quality of an observation on a patch of Earth was poor or the observation was missing altogether for a certain month in history. (To be fair, both GISS and BEST have months where there is no data available, especially in early years of the record.) It also has incomplete coverage [Co2013]. Whether or not values for patches are imputed in some way, perhaps using spatial kriging, or whether or not supports to calculate trends are adjusted to avoid these omissions are decisions in use of these data which are critical to resolving the question [Co2013, Gl2011].

As seen in Section 5, what trends you get depends a lot on how they are done. FGZ did linear trends. These are nice because means of trends have simple relationships with the trends themselves. On the other hand, confining trend estimation to local linear trends binds these estimates to being only supported by pairs of actual samples, however sparse these may be. This has the unfortunate effect of producing a broadly spaced set of trends which, when averaged, appear to be a single, tight distribution, close to the vertical black line of Figure 14, but erasing all the detail available by estimating the density of trends with a robust function of the first time derivative of the series. FGZ might be improved by using such, repairing this drawback and also making it more robust against HadCRUT4’s inescapable data drops. As mentioned before, however, we really cannot know, because details of their calculations are not available. (Again, this author suspects this fault lies not with FGZ but a matter of page limits.)

In fact, that was indicated by a recent paper from Cowtan and Way, arguing that the limited coverage of HadCRUT4 might explain the discrepancy Fyfe, Gillet, and Zwiers found. In return Fyfe and Gillet argued that even admitting the corrections for polar regions which Cowtan and Way indicate, the CMIP5 models fall short in accounting for global mean surface temperatures. What could be wrong? In the context of ensemble forecasts depicting future states of the atmosphere, Wilks notes (Section 7.7.1):

Accordingly, the dispersion of a forecast ensemble can at best only approximate the [probability density function] of forecast uncertainty … In particular, a forecast ensemble may reflect errors both in statistical location (most or all ensemble members being well away from the actual state of the atmosphere, but relatively nearer to each other) and dispersion (either under- or overrepresenting the forecast uncertainty). Often, operational ensemble forecasts are found to exhibit too little dispersion …, which leads to overconfidence in probability assessment if ensemble relative frequencies are interpreted as estimating probabilities.

In fact, the IPCC reference, Toth, Palmer and others raise the same caution. It could be that the answer to why the variance of the observational data in the Fyfe, Gillet, and Zwiers graph depicted in Figure 15 is so small is that ensemble spread does not properly reflect the true probability density function of the joint distribution of temperatures across Earth. These might be “relatively nearer to each other” than the true dispersion which climate models are accommodating.

If Earth’s climate is thought of as a dynamical system, and taking note of the suggestion of Kharin that “There is basically one observational record in climate research”, we can do the following thought experiment. Suppose the total state of the Earth’s climate system can be captured at one moment in time, no matter how, and the climate can be reinitialized to that state at our whim, again no matter how. What happens if this is done several times, and then the climate is permitted to develop for, say, exactly 100 years on each “run”? What are the resulting states? Also suppose the dynamical “inputs” from the Sun, as a function of time, are held identical during that 100 years, as are dynamical inputs from volcanic forcings, as are human emissions of greenhouse gases. Are the resulting states copies of one another?

No. Stochastic variability in the operation of climate means these end states will each be somewhat different from one another. Then of what use is the “one observation record”? Well, it is arguably better than no observational record. And, in fact, this kind of variability is a major part of the “internal variability” which is often cited in this literature, including by FGZ.

Setting aside the problems of using local linear trends, FGZ’s bootstrap approach to the HadCRUT4 ensemble is an attempt to imitate these various runs of Earth’s climate. The trouble is, the frequentist bootstrap can only replicate values of observations actually seen. (See inset.) In this case, these replications are those of the HadCRUT4 ensembles. It will never produce values in-between and, as the parameters of temperature anomalies are in general continuous measures, allowing for in-between values seems a reasonable thing to do.

No algorithm can account for a dispersion which is not reflected in the variability of the ensemble. If the dispersion of HadCRUT4 is too small, it could be corrected using ensemble MOS methods (Section 7.7.1). In any case, underdispersion could explain the remarkable difference in variances of populations seen in Figure 15. I think there’s yet another way.

Consider equations (6.1) and (6.2) again. Recall, here, $i$ denotes the $i^{th}$ model and $j$ denotes the $j^{th}$ run of model $i$. Instead of $k$, however, a bootstrap resampling of the HadCRUT4 ensembles, let $\omega$ run over all the 100 ensemble members provided, let $\xi$ run over the 2592 patches on Earth’s surface, and let $\kappa$ run over the 1967 monthly time steps. Reformulate equations (6.1) and (6.2), instead, as

$M_{\kappa} = u_{\kappa} + \sum_{i = 1}^{N^{m}} x_{i} \left(\text{Emod}_{i\kappa} + \text{Eint}_{i\kappa}\right) \qquad (7.1)$

$O_{\kappa} = u_{\kappa} + \sum_{\xi = 1}^{2592} \left(x_{0} \text{Eint}^{\zeta}_{\kappa} + x_{\xi} \text{Esamp}_{\xi\kappa}\right) \qquad (7.2)$

Now, $u_{\kappa}$ is a common trend at time tick $\kappa$ and $\text{Emod}_{i\kappa}$ and $\text{Eint}_{i\kappa}$ are deflections from that trend due to modeling error and internal variability in the $i^{\text{th}}$ model, respectively, at time tick $\kappa$. Similarly, $\text{Eint}^{\zeta}_{\kappa}$ denotes deflections from the common trend baseline $u$ due to internal variability as seen by the HadCRUT4 observational data at time tick $\kappa$, and $\text{Esamp}_{\xi\kappa}$ denotes the deflection from the common baseline due to sampling error in the $\xi^{\text{th}}$ patch at time tick $\kappa$. $x_{\iota}$ are indicator variables. This is the setup for an analysis of variance or ANOVA, preferably a Bayesian one (Sections 14.1.6, 18.1). In equation (7.1), successive model runs $j$ for model $i$ are used to estimate $\text{Emod}_{i\kappa}$ and $\text{Eint}_{i\kappa}$ for every $\kappa$. In equation (7.2), different ensemble members $\omega$ are used to estimate $\text{Eint}^{\zeta}_{\kappa}$ and $\text{Esamp}_{\xi\kappa}$ for every $\kappa$. Coupling the two gives a common estimate of $u_{\kappa}$. There’s considerable flexibility in how model runs or ensemble members are used for this purpose, opportunities for additional differentiation and ability to incorporate information about relationships among models or among observations. For instance, models might be described relative to a Bayesian model average [Ra2005]. Observations might be described relative to a common or slowly varying spatial trend, reflecting dependencies among $\xi$ patches. Here, differences between observations and models get explicitly allocated to modeling error and internal variability for models, and sampling error and internal variability for observations.
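As a very rough, classical (not Bayesian) illustration of the model side of this allocation at a single time tick, one can split each run's value into a common level, a per-model deflection, and a per-run deflection. The sketch below is only a method-of-moments stand-in for the ANOVA suggested above. The model names, the numbers, and the function `decompose_runs` are hypothetical.

```python
import statistics

def decompose_runs(runs_by_model):
    """Split run values at one time tick into common level, model effect, run effect."""
    model_means = {m: statistics.mean(r) for m, r in runs_by_model.items()}
    common = statistics.mean(list(model_means.values()))        # stand-in for u
    emod = {m: mu - common for m, mu in model_means.items()}    # stand-in for Emod_i
    eint = {m: [x - model_means[m] for x in r]                  # stand-in for Eint_ij
            for m, r in runs_by_model.items()}
    return common, emod, eint

# Hypothetical trend values from three models with a few runs each.
runs_by_model = {
    "model_a": [0.21, 0.18, 0.24],
    "model_b": [0.12, 0.15],
    "model_c": [0.30, 0.27, 0.33, 0.29],
}
common, emod, eint = decompose_runs(runs_by_model)

# Every run is recovered exactly as common + model effect + run effect.
rebuilt = {m: [common + emod[m] + e for e in eint[m]] for m in runs_by_model}
```

A Bayesian treatment would replace these point estimates with posterior distributions and could couple the model side to the observation side through the shared $u_{\kappa}$, which this sketch does not attempt.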

More work needs to be done to assess the proper virtues of the FGZ technique, even without modification. A device like that Rohde used to compare BEST temperature observations with HadCRUT4 and GISS, one of supplying the FGZ procedure with synthetic data, would be perhaps the most informative regarding its character. Alternatively, if an ensemble MOS method were devised and applied to HadCRUT4, it might better reflect a true spread of possibilities. Because a dataset like HadCRUT4 records just one of many possible observational records the Earth might have exhibited, it would be useful to have a means of elaborating what those other possibilities were, given the single observational trace.

Regarding climate models, while they will inevitably disagree with a properly elaborated set of observations in the particulars of their statistics, in my opinion, the goal should be to match the distributions of solutions of these two instruments of study in their first few moments by improving both. While statistical equivalence is all that’s sought, we’re not there yet. Assessing parametric uncertainty of observations hand-in-hand with the model builders seems to be a sensible route. Indeed, this is important. In review of the Cowtan and Way result, one based upon kriging, Kintisch summarizes the situation as reproduced in Table 1, a reproduction of his table on page 348 of the reference [Co2013, Gl2011, Ki2014]:

TEMPERATURE TRENDS, 1997-2012

| Source | Warming ($^{\circ}\,\mathrm{C}$/decade) |
|---|---|
| Climate models | 0.102-0.412 |
| NASA data set | 0.080 |
| Cowtan/Way | 0.119 |

Table 1. Getting warmer. New method brings measured temperatures closer to projections. Added in quotation: “Climate models” refers to the CMIP5 series. “NASA data set” is GISS. “HadCRUT data set” is HadCRUT4. “Cowtan/Way” is from their paper. Note values are per decade, not per year.

Note that these estimates of trends, once divided by 10 years/decade to convert to a per year change in temperature, all fall well within the slope estimates depicted in the summary Figure 14. Note, too, how low the HadCRUT trend is.

If the FGZ technique, or any other, can contribute to this elucidation, it is most welcome.

As an example, Lee reports how the GLOMAP model of aerosols was systematically improved using such careful statistical consideration. It seems likely to be a more rewarding way than “black box” treatments. Incidentally, Dr Lindsay Lee’s article was runner-up in the Significance/Young Statisticians Section writers’ competition. It’s great to see bright young minds charging in to solve these problems!

The bootstrap is a general name for a resampling technique, most commonly associated with what is more properly called the frequentist bootstrap. Given a sample of observations, $\mathring{Y} = \{y_{1}, y_{2}, \dots, y_{n}\}$, the bootstrap principle says that in a wide class of statistics and for certain minimum sizes of $n$, the sampling density of a statistic $h(Y)$ from a population of all $Y$, where $\mathring{Y}$ is a single observation, can be approximated by the following procedure. Sample $\mathring{Y}$ $M$ times with replacement to obtain $M$ samples each of size $n$ called $\tilde{Y}_{k}$, $k = 1, \dots, M$. For each $\tilde{Y}_{k}$, calculate $h(\tilde{Y}_{k})$ so as to obtain $H = \{h_{1}, h_{2}, \dots, h_{M}\}$. The set $H$ so obtained is an approximation of the sampling density of $h(Y)$ from a population of all $Y$. Note that because $\mathring{Y}$ is sampled, only elements of that original set of observations will ever show up in any $\tilde{Y}_{k}$. This is true even if $Y$ is drawn from an interval of the real numbers. This is where a Bayesian bootstrap might be more suitable. In a Bayesian bootstrap, the set of possibilities to be sampled are specified using a prior distribution on $Y$ [Da2009, Section 10.5]. A specific observation of $Y$, like $\mathring{Y}$, is used to update the probability density on $Y$, and then values from $Y$ are drawn in proportion to this updated probability. Thus, values in $Y$ never in $\mathring{Y}$ might be drawn. Both bootstraps will, under similar conditions, preserve the sampling distribution of $Y$.
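The frequentist procedure in the inset can be sketched directly. The observations are hypothetical, the statistic $h$ is taken to be the mean purely for illustration, and the function name `bootstrap` and the choice of $M = 2000$ are assumptions.

```python
import random
import statistics

def bootstrap(sample, statistic, n_resamples, rng):
    """Frequentist bootstrap: resample with replacement, apply the statistic each time."""
    n = len(sample)
    return [statistic([rng.choice(sample) for _ in range(n)])
            for _ in range(n_resamples)]

rng = random.Random(42)
# Hypothetical observations (for example, trend values).
observed = [0.11, 0.05, 0.08, 0.14, 0.02, 0.09, 0.07, 0.12, 0.06, 0.10]

H = bootstrap(observed, statistics.mean, 2000, rng)  # approximate sampling density of the mean
boot_mean = statistics.mean(H)
boot_se = statistics.stdev(H)                        # bootstrap standard error of the mean

# A single resample only ever contains values that were actually observed.
resample = [rng.choice(observed) for _ in observed]
only_observed_values = set(resample) <= set(observed)
```

The last two lines make the limitation discussed in Section 7 visible: no resample can contain a value lying between two observed ones.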

### 8. Summary

Various geophysical datasets recording global surface temperature anomalies suggest a slowdown in anomalous global warming from historical baselines. Temperatures are still rising, but not as fast, and much of the media attention to this is reacting to the second time derivative of temperature, which is negative, not the first time derivative, its rate of increase. Explanations vary. In one important respect, 20 or 30 years is an insufficiently long time to assess the state of the climate system. In another, while the global surface temperature increase is slowing, oceanic temperatures continue to soar, at many depths. Warming might even decrease. None of these seem to pose a challenge to the geophysics of climate, which has substantial support both from experimental science and ab initio calculations. An interesting discrepancy is noted by Fyfe, Gillett, and Zwiers, although their calculation could be improved both by using a more robust estimator for trends, and by trying to integrate out anomalous temperatures due to internal variability in their models, because much of it is not separately observable. Nevertheless, Fyfe, Gillett, and Zwiers may have done the field a great service by making explicit a discrepancy which enables students of datasets like the important HadCRUT4 to discover an important limitation: the dispersion across ensemble members does not properly reflect the set of Earth futures which one might wish it did. For users who think of the ensemble as representing such futures, it gives a dispersion which is significantly smaller than what we might know.

The Azimuth Project can contribute, and I am planning subprojects to pursue my suggestions in Section 7, those of examining HadCRUT4 improvements using MOS ensembles, a Bayesian bootstrap, or the Bayesian ANOVA described there. Beyond trends in mean surface temperatures, there’s another more challenging statistical problem involving trends in sea levels which awaits investigation [Le2012b, Hu2010].

Working out these kinds of details is the process of science at its best, and many disciplines, not least mathematics, statistics, and signal processing, have much to contribute to the methods and interpretations of these series data. It is possible too much is being asked of a limited data set, and perhaps we have not yet observed enough of climate system response to say anything definitive. But the urgency to act responsibly given scientific predictions remains.

### Bibliography

1. Credentials. I have taken courses in geology from Binghamton University, but the rest of my knowledge of climate science is from reading the technical literature, principally publications from the American Geophysical Union and the American Meteorological Society, and self-teaching, from textbooks like Pierrehumbert. I seek to find ways where my different perspective on things can help advance and explain the climate science enterprise. I also apply my skills to working local environmental problems, ranging from inferring people’s use of energy in local municipalities to studying things like trends in solid waste production at the same scales using Bayesian inversions. I am fortunate that techniques used in my professional work and those in these problems overlap so much. I am a member of the American Statistical Association, the American Geophysical Union, the American Meteorological Society, the International Society for Bayesian Analysis, as well as the IEEE and its signal processing society.
2. [Yo2014] D. S. Young, “Bond. James Bond. A statistical look at cinema’s most famous spy”, CHANCE Magazine, 27(2), 2014, 21-27, http://chance.amstat.org/2014/04/james-bond/.
3. [Ca2014a] S. Carson, Science of Doom, a Web site devoted to atmospheric radiation physics and forcings, last accessed 7 February 2014.
4. [Pi2012] R. T. Pierrehumbert, Principles of Planetary Climate, Cambridge University Press, 2010, reprinted 2012.
5. [Pi2011] R. T. Pierrehumbert, “Infrared radiation and planetary temperature”, Physics Today, January 2011, 33-38.
6. [Pe2006] G. W. Petty, A First Course in Atmospheric Radiation, 2nd edition, Sundog Publishing, 2006.
7. [Le2012a] S. Levitus, J. I. Antonov, T. P. Boyer, O. K. Baranova, H. E. Garcia, R. A. Locarnini, A. V. Mishonov, J. R. Reagan, D. Seidov, E. S. Yarosh, and M. M. Zweng, “World ocean heat content and thermosteric sea level change (0-2000 m), 1955-2010”, Geophysical Research Letters, 39, L10603, 2012, http://dx.doi.org/10.1029/2012GL051106.
8. [Le2012b] S. Levitus, J. I. Antonov, T. P. Boyer, O. K. Baranova, H. E. Garcia, R. A. Locarnini, A. V. Mishonov, J. R. Reagan, D. Seidov, E. S. Yarosh, and M. M. Zweng, “World ocean heat content and thermosteric sea level change (0-2000 m), 1955-2010: supplementary information”, Geophysical Research Letters, 39, L10603, 2012, http://onlinelibrary.wiley.com/doi/10.1029/2012GL051106/suppinfo.
9. [Sm2009] R. L. Smith, C. Tebaldi, D. Nychka, L. O. Mearns, “Bayesian modeling of uncertainty in ensembles of climate models”, Journal of the American Statistical Association, 104(485), March 2009.
10. Nomenclature. The nomenclature can be confusing. With respect to observations, variability arising due to choice of method is sometimes called structural uncertainty [Mo2012, Th2005].
11. [Kr2014] J. P. Krasting, J. P. Dunne, E. Shevliakova, R. J. Stouffer, “Trajectory sensitivity of the transient climate response to cumulative carbon emissions”, Geophysical Research Letters, 41, 2014, http://dx.doi.org/10.1002/2013GL059141.
12. [Sh2014a] D. T. Shindell, “Inhomogeneous forcing and transient climate sensitivity”, Nature Climate Change, 4, 2014, 274-277, http://dx.doi.org/10.1038/nclimate2136.
13. [Sh2014b] D. T. Shindell, “Shindell: On constraining the Transient Climate Response”, RealClimate, http://www.realclimate.org/index.php?p=17134, 8 April 2014.
14. [Sa2011] B. M. Sanderson, B. C. O’Neill, J. T. Kiehl, G. A. Meehl, R. Knutti, W. M. Washington, “The response of the climate system to very high greenhouse gas emission scenarios”, Environmental Research Letters, 6, 2011, 034005, http://dx.doi.org/10.1088/1748-9326/6/3/034005.
15. [Em2011] K. Emanuel, “Global warming effects on U.S. hurricane damage”, Weather, Climate, and Society, 3, 2011, 261-268, http://dx.doi.org/10.1175/WCAS-D-11-00007.1.
16. [Sm2011] L. A. Smith, N. Stern, “Uncertainty in science and its role in climate policy”, Philosophical Transactions of the Royal Society A, 369, 2011, 1-24, http://dx.doi.org/10.1098/rsta.2011.0149.
17. [Le2010] M. C. Lemos, R. B. Rood, “Climate projections and their impact on policy and practice”, WIREs Climate Change, 1, September/October 2010, http://dx.doi.org/10.1002/wcc.71.
18. [Sc2014] G. A. Schmidt, D. T. Shindell, K. Tsigaridis, “Reconciling warming trends”, Nature Geoscience, 7, 2014, 158-160, http://dx.doi.org/10.1038/ngeo2105.
19. [Be2013] “Examining the recent ‘pause’ in global warming”, Berkeley Earth Memo, 2013, http://static.berkeleyearth.org/memos/examining-the-pause.pdf.
20. [Mu2013a] R. A. Muller, J. Curry, D. Groom, R. Jacobsen, S. Perlmutter, R. Rohde, A. Rosenfeld, C. Wickham, J. Wurtele, “Decadal variations in the global atmospheric land temperatures”, Journal of Geophysical Research: Atmospheres, 118 (11), 2013, 5280-5286, http://dx.doi.org/10.1002/jgrd.50458.
21. [Mu2013b] R. Muller, “Has global warming stopped?”, Berkeley Earth Memo, September 2013, http://static.berkeleyearth.org/memos/has-global-warming-stopped.pdf.
22. [Br2006] P. Brohan, J. Kennedy, I. Harris, S. Tett, P. D. Jones, “Uncertainty estimates in regional and global observed temperature changes: A new data set from 1850”, Journal of Geophysical Research—Atmospheres, 111(D12), 27 June 2006, http://dx.doi.org/10.1029/2005JD006548.
23. [Co2013] K. Cowtan, R. G. Way, “Coverage bias in the HadCRUT4 temperature series and its impact on recent temperature trends”, Quarterly Journal of the Royal Meteorological Society, 2013, http://dx.doi.org/10.1002/qj.2297.
24. [Fy2013] J. C. Fyfe, N. P. Gillett, F. W. Zwiers, “Overestimated global warming over the past 20 years”, Nature Climate Change, 3, September 2013, 767-769, and online at http://dx.doi.org/10.1038/nclimate1972.
25. [Ha2013] E. Hawkins, “Comparing global temperature observations and simulations, again”, Climate Lab Book, http://www.climate-lab-book.ac.uk/2013/comparing-observations-and-simulations-again/, 28 May 2013.
26. [Ha2014] A. Hannart, A. Ribes, P. Naveau, “Optimal fingerprinting under multiple sources of uncertainty”, Geophysical Research Letters, 41, 2014, 1261-1268, http://dx.doi.org/10.1002/2013GL058653.
27. [Ka2013a] R. W. Katz, P. F. Craigmile, P. Guttorp, M. Haran, B. Sansó, M. L. Stein, “Uncertainty analysis in climate change assessments”, Nature Climate Change, 3, September 2013, 769-771 (“Commentary”).
28. [Sl2013] J. Slingo, “Statistical models and the global temperature record”, Met Office, May 2013, http://www.metoffice.gov.uk/media/pdf/2/3/Statistical_Models_Climate_Change_May_2013.pdf.
29. [Tr2013] K. Trenberth, J. Fasullo, “An apparent hiatus in global warming?”, Earth’s Future, 2013, http://dx.doi.org/10.1002/2013EF000165.
30. [Mo2012] C. P. Morice, J. J. Kennedy, N. A. Rayner, P. D. Jones, “Quantifying uncertainties in global and regional temperature change using an ensemble of observational estimates: The HadCRUT4 data set”, Journal of Geophysical Research, 117, 2012, http://dx.doi.org/10.1029/2011JD017187. See also http://www.metoffice.gov.uk/hadobs/hadcrut4/data/current/download.html where the 100 ensembles can be found.
31. [Sa2012] B. D. Santer, J. F. Painter, C. A. Mears, C. Doutriaux, P. Caldwell, J. M. Arblaster, P. J. Cameron-Smith, N. P. Gillett, P. J. Gleckler, J. Lanzante, J. Perlwitz, S. Solomon, P. A. Stott, K. E. Taylor, L. Terray, P. W. Thorne, M. F. Wehner, F. J. Wentz, T. M. L. Wigley, L. J. Wilcox, C.-Z. Zou, “Identifying human influences on atmospheric temperature”, Proceedings of the National Academy of Sciences, 29 November 2012, http://dx.doi.org/10.1073/pnas.1210514109.
32. [Ke2011a] J. J. Kennedy, N. A. Rayner, R. O. Smith, D. E. Parker, M. Saunby, “Reassessing biases and other uncertainties in sea-surface temperature observations measured in situ since 1850, part 1: measurement and sampling uncertainties”, Journal of Geophysical Research: Atmospheres (1984-2012), 116(D14), 27 July 2011, http://dx.doi.org/10.1029/2010JD015218.
33. [Kh2008a] S. Kharin, “Statistical concepts in climate research: Some misuses of statistics in climatology”, Banff Summer School, 2008, part 1 of 3. Slide 7, “Climatology is a one-experiment science. There is basically one observational record in climate”, http://www.atmosp.physics.utoronto.ca/C-SPARC/ss08/lectures/Kharin-lecture1.pdf.
34. [Kh2008b] S. Kharin, “Climate Change Detection and Attribution: Bayesian view”, Banff Summer School, 2008, part 3 of 3, http://www.atmosp.physics.utoronto.ca/C-SPARC/ss08/lectures/Kharin-lecture3.pdf.
35. [Le2005] T. C. K. Lee, F. W. Zwiers, G. C. Hegerl, X. Zhang, M. Tsao, “A Bayesian climate change detection and attribution assessment”, Journal of Climate, 18, 2005, 2429-2440.
36. [De1982] M. H. DeGroot, S. Fienberg, “The comparison and evaluation of forecasters”, The Statistician, 32(1-2), 1983, 12-22.
37. [Ro2013a] R. Rohde, R. A. Muller, R. Jacobsen, E. Muller, S. Perlmutter, A. Rosenfeld, J. Wurtele, D. Groom, C. Wickham, “A new estimate of the average Earth surface land temperature spanning 1753 to 2011”, Geoinformatics & Geostatistics: An Overview, 1(1), 2013, http://dx.doi.org/10.4172/2327-4581.1000101.
38. [Ke2011b] J. J. Kennedy, N. A. Rayner, R. O. Smith, D. E. Parker, M. Saunby, “Reassessing biases and other uncertainties in sea-surface temperature observations measured in situ since 1850, part 2: Biases and homogenization”, Journal of Geophysical Research: Atmospheres (1984-2012), 116(D14), 27 July 2011, http://dx.doi.org/10.1029/2010JD015220.
39. [Ro2013b] R. Rohde, “Comparison of Berkeley Earth, NASA GISS, and Hadley CRU averaging techniques on ideal synthetic data”, Berkeley Earth Memo, January 2013, http://static.berkeleyearth.org/memos/robert-rohde-memo.pdf.
40. [En2014] M. H. England, S. McGregor, P. Spence, G. A. Meehl, A. Timmermann, W. Cai, A. S. Gupta, M. J. McPhaden, A. Purich, A. Santoso, “Recent intensification of wind-driven circulation in the Pacific and the ongoing warming hiatus”, Nature Climate Change, 4, 2014, 222-227, http://dx.doi.org/10.1038/nclimate2106. See also http://www.realclimate.org/index.php/archives/2014/02/going-with-the-wind/.
41. [Fy2014] J. C. Fyfe, N. P. Gillett, “Recent observed and simulated warming”, Nature Climate Change, 4, March 2014, 150-151, http://dx.doi.org/10.1038/nclimate2111.
42. [Ta2013] Tamino, “el Niño and the Non-Spherical Cow”, Open Mind blog, http://tamino.wordpress.com/2013/09/02/el-nino-and-the-non-spherical-cow/, 2 September 2013.
43. [Fy2013s] Supplement to J. C. Fyfe, N. P. Gillett, F. W. Zwiers, “Overestimated global warming over the past 20 years”, Nature Climate Change, 3, September 2013, online at http://www.nature.com/nclimate/journal/v3/n9/extref/nclimate1972-s1.pdf.
44. Ionizing. There are tiny amounts of heating due to impinging ionizing radiation from space, and changes in Earth’s magnetic field.
45. [Ki1997] J. T. Kiehl, K. E. Trenberth, “Earth’s annual global mean energy budget”, Bulletin of the American Meteorological Society, 78(2), 1997, http://dx.doi.org/10.1175/1520-0477(1997)078%3C0197:EAGMEB%3E2.0.CO;2.
46. [Tr2009] K. Trenberth, J. Fasullo, J. T. Kiehl, “Earth’s global energy budget”, Bulletin of the American Meteorological Society, 90, 2009, 311–323, http://dx.doi.org/10.1175/2008BAMS2634.1.
47. [IP2013] IPCC, 2013: Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change [Stocker, T.F., D. Qin, G.-K. Plattner, M. Tignor, S.K. Allen, J. Boschung, A. Nauels, Y. Xia, V. Bex and P.M. Midgley (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 1535 pp. Also available online at https://www.ipcc.ch/report/ar5/wg1/.
48. [Ve2012] A. Vehtari, J. Ojanen, “A survey of Bayesian predictive methods for model assessment, selection and comparison”, Statistics Surveys, 6 (2012), 142-228, http://dx.doi.org/10.1214/12-SS102.
49. [Ge1998] J. Geweke, “Simulation Methods for Model Criticism and Robustness Analysis”, in Bayesian Statistics 6, J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith (eds.), Oxford University Press, 1998.
50. [Co2006] P. Congdon, Bayesian Statistical Modelling, 2nd edition, John Wiley & Sons, 2006.
51. [Fe2011b] D. Ferreira, J. Marshall, B. Rose, “Climate determinism revisited: Multiple equilibria in a complex climate model”, Journal of Climate, 24, 2011, 992-1012, http://dx.doi.org/10.1175/2010JCLI3580.1.
52. [Bu2002] K. P. Burnham, D. R. Anderson, Model Selection and Multimodel Inference, 2nd edition, Springer-Verlag, 2002.
53. [Ea2014a] S. Easterbrook, “What Does the New IPCC Report Say About Climate Change? (Part 4): Most of the heat is going into the oceans”, 11 April 2014, at the Azimuth blog, http://johncarlosbaez.wordpress.com/2014/04/11/what-does-the-new-ipcc-report-say-about-climate-change-part-4/.
54. [Ko2014] Y. Kostov, K. C. Armour, and J. Marshall, “Impact of the Atlantic meridional overturning circulation on ocean heat storage and transient climate change”, Geophysical Research Letters, 41, 2014, 2108–2116, http://dx.doi.org/10.1002/2013GL058998.
55. [Me2011] G. A. Meehl, J. M. Arblaster, J. T. Fasullo, A. Hu, K. E. Trenberth, “Model-based evidence of deep-ocean heat uptake during surface-temperature hiatus periods”, Nature Climate Change, 1, 2011, 360–364, http://dx.doi.org/10.1038/nclimate1229.
56. [Me2013] G. A. Meehl, A. Hu, J. M. Arblaster, J. Fasullo, K. E. Trenberth, “Externally forced and internally generated decadal climate variability associated with the Interdecadal Pacific Oscillation”, Journal of Climate, 26, 2013, 7298–7310, http://dx.doi.org/10.1175/JCLI-D-12-00548.1.
57. [Ha2010] J. Hansen, R. Ruedy, M. Sato, and K. Lo, “Global surface temperature change”, Reviews of Geophysics, 48(RG4004), 2010, http://dx.doi.org/10.1029/2010RG000345.
58. [GISS-BEST] 3.667 (GISS) versus 3.670 (BEST).
59. Spar. The smoothing parameter is a constant which weights a penalty term proportional to the second directional derivative of the curve. The effect is that if a candidate spline is chosen which is very bumpy, this candidate is penalized and will only be chosen if the data demands it. There is more said about choice of such parameters in the caption of Figure 12.
60. [Ea2009] D. R. Easterling, M. F. Wehner, “Is the climate warming or cooling?”, Geophysical Research Letters, 36, L08706, 2009, http://dx.doi.org/10.1029/2009GL037810.
61. Hiatus. The term hiatus has a formal meaning in climate science, as described by the IPCC itself (Box TS.3).
62. [Ea2000] D. J. Easterbrook, D. J. Kovanen, “Cyclical oscillation of Mt. Baker glaciers in response to climatic changes and their correlation with periodic oceanographic changes in the northeast Pacific Ocean”, Proceedings of the Geological Society of America, 32, 2000, Abstracts with Program, page 17, http://myweb.wwu.edu/dbunny/pdfs/dje_abstracts.pdf, abstract reviewed 23 April 2014.
63. [Ea2001] D. J. Easterbrook, “The next 25 years: global warming or global cooling? Geologic and oceanographic evidence for cyclical climatic oscillations”, Proceedings of the Geological Society of America, 33, 2001, Abstracts with Program, page 253, http://myweb.wwu.edu/dbunny/pdfs/dje_abstracts.pdf, abstract reviewed 23 April 2014.
64. [Ea2005] D. J. Easterbrook, “Causes and effects of abrupt, global, climate changes and global warming”, Proceedings of the Geological Society of America, 37, 2005, Abstracts with Program, page 41, http://myweb.wwu.edu/dbunny/pdfs/dje_abstracts.pdf, abstract reviewed 23 April 2014.
65. [Ea2006a] D. J. Easterbrook, “The cause of global warming and predictions for the coming century”, Proceedings of the Geological Society of America, 38(7), Abstracts with Programs, page 235, http://myweb.wwu.edu/dbunny/pdfs/dje_abstracts.pdf, abstract reviewed 23 April 2014.
66. [Ea2006b] D. J. Easterbrook, “Causes of abrupt global climate changes and global warming predictions for the coming century”, Proceedings of the Geological Society of America, 38, 2006, Abstracts with Program, page 77, http://myweb.wwu.edu/dbunny/pdfs/dje_abstracts.pdf, abstract reviewed 23 April 2014.
67. [Ea2007] D. J. Easterbrook, “Geologic evidence of recurring climate cycles and their implications for the cause of global warming and climate changes in the coming century”, Proceedings of the Geological Society of America, 39(6), Abstracts with Programs, page 507, http://myweb.wwu.edu/dbunny/pdfs/dje_abstracts.pdf, abstract reviewed 23 April 2014.
68. [Ea2008] D. J. Easterbrook, “Correlation of climatic and solar variations over the past 500 years and predicting global climate changes from recurring climate cycles”, Proceedings of the International Geological Congress, 2008, Oslo, Norway.
69. [Wi2007] J. K. Willis, J. M. Lyman, G. C. Johnson, J. Gilson, “Correction to ‘Recent cooling of the upper ocean’”, Geophysical Research Letters, 34, L16601, 2007, http://dx.doi.org/10.1029/2007GL030323.
70. [Ra2006] N. Rayner, P. Brohan, D. Parker, C. Folland, J. Kennedy, M. Vanicek, T. Ansell, S. Tett, “Improved analyses of changes and uncertainties in sea surface temperature measured in situ since the mid-nineteenth century: the HadSST2 dataset”, Journal of Climate, 19, 1 February 2006, http://dx.doi.org/10.1175/JCLI3637.1.
71. [Pi2006] R. Pielke, Sr, “The Lyman et al paper ‘Recent cooling in the upper ocean’ has been published”, blog entry, September 29, 2006, 8:09 AM, https://pielkeclimatesci.wordpress.com/2006/09/29/the-lyman-et-al-paper-recent-cooling-in-the-upper-ocean-has-been-published/, last accessed 24 April 2014.
72. [Ko2013] Y. Kosaka, S.-P. Xie, “Recent global-warming hiatus tied to equatorial Pacific surface cooling”, Nature, 501, 2013, 403–407, http://dx.doi.org/10.1038/nature12534.
73. [Ke1998] C. D. Keeling, “Rewards and penalties of monitoring the Earth”, Annual Review of Energy and the Environment, 23, 1998, 25–82, http://dx.doi.org/10.1146/annurev.energy.23.1.25.
74. [Wa1990] G. Wahba, Spline Models for Observational Data, Society for Industrial and Applied Mathematics (SIAM), 1990.
75. [Go1979] G. H. Golub, M. Heath, G. Wahba, “Generalized cross-validation as a method for choosing a good ridge parameter”, Technometrics, 21(2), May 1979, 215-223, http://www.stat.wisc.edu/~wahba/ftp1/oldie/golub.heath.wahba.pdf.
76. [Cr1979] P. Craven, G. Wahba, “Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation”, Numerische Mathematik, 31, 1979, 377-403, http://www.stat.wisc.edu/~wahba/ftp1/oldie/craven.wah.pdf.
77. [Sa2013] S. Särkkä, Bayesian Filtering and Smoothing, Cambridge University Press, 2013.
78. [Co2009] P. S. P. Cowpertwait, A. V. Metcalfe, Introductory Time Series With R, Springer, 2009.
79. [Ko2005] R. Koenker, Quantile Regression, Cambridge University Press, 2005.
80. [Du2012] J. Durbin, S. J. Koopman, Time Series Analysis by State Space Methods, Oxford University Press, 2012.
81. Process variance. Here, the process variance was taken to be $\frac{1}{50}$ of the observation variance.
82. Probabilities. “In this Report, the following terms have been used to indicate the assessed likelihood of an outcome or a result: Virtually certain 99-100% probability, Very likely 90-100%, Likely 66-100%, About as likely as not 33-66%, Unlikely 0-33%, Very unlikely 0-10%, Exceptionally unlikely 0-1%. Additional terms (Extremely likely: 95-100%, More likely than not 50-100%, and Extremely unlikely 0-5%) may also be used when appropriate. Assessed likelihood is typeset in italics, e.g., very likely (see Section 1.4 and Box TS.1 for more details).”
83. [Ki2013] E. Kintisch, “Researchers wary as DOE bids to build sixth U.S. climate model”, Science 341 (6151), 13 September 2013, page 1160, http://dx.doi.org/10.1126/science.341.6151.1160.
84. Inez Fung. “It’s great there’s a new initiative,” says modeler Inez Fung of DOE’s Lawrence Berkeley National Laboratory and the University of California, Berkeley. “But all the modeling efforts are very short-handed. More brains working on one set of code would be better than working separately.”
85. Exchangeability. Exchangeability is a weaker assumption than independence. Random variables are exchangeable if their joint distribution only depends upon the set of variables, and not their order [Di1977, Di1988, Ro2013c]. Note the caution in Coolen [Co2005].
86. [Di1977] P. Diaconis, “Finite forms of de Finetti’s theorem on exchangeability”, Synthese, 36, 1977, 271-281.
87. [Di1988] P. Diaconis, “Recent progress on de Finetti’s notions of exchangeability”, Bayesian Statistics, 3, 1988, 111-125.
88. [Ro2013c] J.C. Rougier, M. Goldstein, L. House, “Second-order exchangeability analysis for multi-model ensembles”, Journal of the American Statistical Association, 108, 2013, 852-863, http://dx.doi.org/10.1080/01621459.2013.802963.
89. [Co2005] F. P. A. Coolen, “On nonparametric predictive inference and objective Bayesianism”, Journal of Logic, Language and Information, 15, 2006, 21-47, http://dx.doi.org/10.1007/s10849-005-9005-7. (“Generally, though, both for frequentist and Bayesian approaches, statisticians are often happy to assume exchangeability at the prior stage. Once data are used in combination with model assumptions, exchangeability no longer holds ‘post-data’ due to the influence of modelling assumptions, which effectively are based on mostly subjective input added to the information from the data.”).
90. [Ch2008] M. R. Chernick, Bootstrap Methods: A Guide for Practitioners and Researchers, 2nd edition, 2008, John Wiley & Sons.
91. [Da2009] A. C. Davison, D. V. Hinkley, Bootstrap Methods and their Application, first published 1997, 11th printing, 2009, Cambridge University Press.
92. [Mu2007] M. Mudelsee, M. Alkio, “Quantifying effects in two-sample environmental experiments using bootstrap confidence intervals”, Environmental Modelling and Software, 22, 2007, 84-96, http://dx.doi.org/10.1016/j.envsoft.2005.12.001.
93. [Wi2011] D. S. Wilks, Statistical Methods in the Atmospheric Sciences, 3rd edition, 2011, Academic Press.
94. [Pa2006] T. N. Palmer, R. Buizza, R. Hagedorn, A. Lawrence, M. Leutbecher, L. Smith, “Ensemble prediction: A pedagogical perspective”, ECMWF Newsletter, 106, 2006, 10–17.
95. [To2001] Z. Toth, Y. Zhu, T. Marchok, “The use of ensembles to identify forecasts with small and large uncertainty”, Weather and Forecasting, 16, 2001, 463–477, http://dx.doi.org/10.1175/1520-0434(2001)016%3C0463:TUOETI%3E2.0.CO;2.
96. [Le2013a] L. A. Lee, K. J. Pringle, C. I. Reddington, G. W. Mann, P. Stier, D. V. Spracklen, J. R. Pierce, K. S. Carslaw, “The magnitude and causes of uncertainty in global model simulations of cloud condensation nuclei”, Atmospheric Chemistry and Physics Discussion, 13, 2013, 6295-6378, http://www.atmos-chem-phys.net/13/9375/2013/acp-13-9375-2013.pdf.
97. [Gl2011] D. M. Glover, W. J. Jenkins, S. C. Doney, Modeling Methods for Marine Science, Cambridge University Press, 2011.
98. [Ki2014] E. Kintisch, “Climate outsider finds missing global warming”, Science, 344 (6182), 25 April 2014, page 348, http://dx.doi.org/10.1126/science.344.6182.348.
99. [GL2011] D. M. Glover, W. J. Jenkins, S. C. Doney, Modeling Methods for Marine Science, Cambridge University Press, 2011, Chapter 7.
100. [Le2013b] L. A. Lee, “Uncertainties in climate models: Living with uncertainty in an uncertain world”, Significance, 10(5), October 2013, 34-39, http://dx.doi.org/10.1111/j.1740-9713.2013.00697.x.
101. [Ur2014] N. M. Urban, P. B. Holden, N. R. Edwards, R. L. Sriver, K. Keller, “Historical and future learning about climate sensitivity”, Geophysical Research Letters, 41, 2014, http://dx.doi.org/10.1002/2014GL059484.
102. [Th2005] P. W. Thorne, D. E. Parker, J. R. Christy, C. A. Mears, “Uncertainties in climate trends: Lessons from upper-air temperature records”, Bulletin of the American Meteorological Society, 86, 2005, 1437-1442, http://dx.doi.org/10.1175/BAMS-86-10-1437.
103. [Fr2008] C. Fraley, A. E. Raftery, T. Gneiting, “Calibrating multimodel forecast ensembles with exchangeable and missing members using Bayesian model averaging”, Monthly Weather Review, 138, January 2010, http://dx.doi.org/10.1175/2009MWR3046.1.
104. [Ow2001] A. B. Owen, Empirical Likelihood, Chapman & Hall/CRC, 2001.
105. [Al2012] M. Aldrin, M. Holden, P. Guttorp, R. B. Skeie, G. Myhre, T. K. Berntsen, “Bayesian estimation of climate sensitivity based on a simple climate model fitted to observations of hemispheric temperatures and global ocean heat content”, Environmetrics, 2012, 23, 253-257, http://dx.doi.org/10.1002/env.2140.
106. [AS2007] “ASA Statement on Climate Change”, American Statistical Association, ASA Board of Directors, adopted 30 November 2007, http://www.amstat.org/news/climatechange.cfm, last visited 13 September 2013.
107. [Be2008] L. M. Berliner, Y. Kim, “Bayesian design and analysis for superensemble-based climate forecasting”, Journal of Climate, 21, 1 May 2008, http://dx.doi.org/10.1175/2007JCLI1619.1.
108. [Fe2011a] X. Feng, T. DelSole, P. Houser, “Bootstrap estimated seasonal potential predictability of global temperature and precipitation”, Geophysical Research Letters, 38, L07702, 2011, http://dx.doi.org/10.1029/2010GL046511.
109. [Fr2013] P. Friedlingstein, M. Meinshausen, V. K. Arora, C. D. Jones, A. Anav, S. K. Liddicoat, R. Knutti, “Uncertainties in CMIP5 climate projections due to carbon cycle feedbacks”, Journal of Climate, 2013, http://dx.doi.org/10.1175/JCLI-D-12-00579.1.
110. [Ho2003] T. J. Hoar, R. F. Milliff, D. Nychka, C. K. Wikle, L. M. Berliner, “Winds from a Bayesian hierarchical model: Computations for atmosphere-ocean research”, Journal of Computational and Graphical Statistics, 12(4), 2003, 781-807, http://www.jstor.org/stable/1390978.
111. [Jo2013] V. E. Johnson, “Revised standards for statistical evidence”, Proceedings of the National Academy of Sciences, 11 November 2013, http://dx.doi.org/10.1073/pnas.1313476110, published online before print.
112. [Ka2013b] J. Karlsson, G. Svensson, “Consequences of poor representation of Arctic sea-ice albedo and cloud-radiation interactions in the CMIP5 model ensemble”, Geophysical Research Letters, 40, 2013, 4374-4379, http://dx.doi.org/10.1002/grl.50768.
113. [Kh2002] V. V. Kharin, F. W. Zwiers, “Climate predictions with multimodel ensembles”, Journal of Climate, 15, 1 April 2002, 793-799.
114. [Kr2011] J. K. Kruschke, Doing Bayesian Data Analysis: A Tutorial with R and BUGS, Academic Press, 2011.
115. [Li2008] X. R. Li, X.-B. Li, “Common fallacies in hypothesis testing”, Proceedings of the 11th IEEE International Conference on Information Fusion, 2008, New Orleans, LA.
116. [Li2013] J.-L. F. Li, D. E. Waliser, G. Stephens, S. Lee, T. L’Ecuyer, S. Kato, N. Loeb, H.-Y. Ma, “Characterizing and understanding radiation budget biases in CMIP3/CMIP5 GCMs, contemporary GCM, and reanalysis”, Journal of Geophysical Research: Atmospheres, 118, 2013, 8166-8184, http://dx.doi.org/10.1002/jgrd.50378.
117. [Ma2013b] E. Maloney, S. Camargo, E. Chang, B. Colle, R. Fu, K. Geil, Q. Hu, X. Jiang, N. Johnson, K. Karnauskas, J. Kinter, B. Kirtman, S. Kumar, B. Langenbrunner, K. Lombardo, L. Long, A. Mariotti, J. Meyerson, K. Mo, D. Neelin, Z. Pan, R. Seager, Y. Serra, A. Seth, J. Sheffield, J. Stroeve, J. Thibeault, S. Xie, C. Wang, B. Wyman, and M. Zhao, “North American Climate in CMIP5 Experiments: Part III: Assessment of 21st Century Projections”, Journal of Climate, 2013, in press, http://dx.doi.org/10.1175/JCLI-D-13-00273.1.
118. [Mi2007] S.-K. Min, D. Simonis, A. Hense, “Probabilistic climate change predictions applying Bayesian model averaging”, Philosophical Transactions of the Royal Society, Series A, 365, 15 August 2007, http://dx.doi.org/10.1098/rsta.2007.2070.
119. [Ni2001] N. Nicholls, “The insignificance of significance testing”, Bulletin of the American Meteorological Society, 82, 2001, 971-986.
120. [Pe2008] G. Pennello, L. Thompson, “Experience with reviewing Bayesian medical device trials”, Journal of Biopharmaceutical Statistics, 18(1), 2008, 81-115.
121. [Pl2013] M. Plummer, “Just Another Gibbs Sampler”, JAGS, 2013. Plummer describes this in greater detail at “JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling”, Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003), 20-22 March 2003, Vienna. See also M. J. Denwood, [in review] “runjags: An R package providing interface utilities, parallel computing methods and additional distributions for MCMC models in JAGS”, Journal of Statistical Software, and http://cran.r-project.org/web/packages/runjags/. See also J. Kruschke, “Another reason to use JAGS instead of BUGS”, http://doingbayesiandataanalysis.blogspot.com/2012/12/another-reason-to-use-jags-instead-of.html, 21 December 2012.
122. [Po1994] D. N. Politis, J. P. Romano, “The Stationary Bootstrap”, Journal of the American Statistical Association, 89(428), 1994, 1303-1313, http://dx.doi.org/10.1080/01621459.1994.10476870.
123. [Sa2002] C.-E. Särndal, B. Swensson, J. Wretman, Model Assisted Survey Sampling, Springer, 1992.
124. [Ta2012] K. E. Taylor, R. J. Stouffer, G. A. Meehl, “An overview of CMIP5 and the experiment design”, Bulletin of the American Meteorological Society, 93, 2012, 485-498, http://dx.doi.org/10.1175/BAMS-D-11-00094.1.
125. [To2013] A. Toreti, P. Naveau, M. Zampieri, A. Schindler, E. Scoccimarro, E. Xoplaki, H. A. Dijkstra, S. Gualdi, J. Luterbacher, “Projections of global changes in precipitation extremes from CMIP5 models”, Geophysical Research Letters, 2013, http://dx.doi.org/10.1002/grl.50940.
126. [WC2013] World Climate Research Programme (WCRP), “CMIP5: Coupled Model Intercomparison Project”, http://cmip-pcmdi.llnl.gov/cmip5/, last visited 13 September 2013.
127. [We2011] M. B. Westover, K. D. Westover, M. T. Bianchi, “Significance testing as perverse probabilistic reasoning”, BMC Medicine, 9(20), 2011, http://www.biomedcentral.com/1741-7015/9/20.
128. [Zw2004] F. W. Zwiers, H. Von Storch, “On the role of statistics in climate research”, International Journal of Climatology, 24, 2004, 665-680.
129. [Ra2005] A. E. Raftery, T. Gneiting, F. Balabdaoui, M. Polakowski, “Using Bayesian model averaging to calibrate forecast ensembles”, Monthly Weather Review, 133, 2005, 1155–1174, http://dx.doi.org/10.1175/MWR2906.1.
130. [Ki2010] G. Kitagawa, Introduction to Time Series Modeling, Chapman & Hall/CRC, 2010.
131. [Hu2010] C. W. Hughes, S. D. P. Williams, “The color of sea level: Importance of spatial variations in spectral shape for assessing the significance of trends”, Journal of Geophysical Research, 115, C10048, 2010, http://dx.doi.org/10.1029/2010JC006102.