Nathan Urban has been telling us about a paper where he estimated the probability that global warming will shut down a major current in the Atlantic Ocean:

• Nathan M. Urban and Klaus Keller, Probabilistic hindcasts and projections of the coupled climate, carbon cycle and Atlantic meridional overturning circulation system: a Bayesian fusion of century-scale observations with a simple model, *Tellus A*, July 16, 2010.

We left off last time with a cliff-hanger: I didn’t let him tell us what the probability is! Since you must have been clutching your chair ever since, you’ll be relieved to hear that the answer is coming now, in the final episode of this interview.

But it’s also very interesting how he and Klaus Keller *got* their answer. As you’ll see, there’s some beautiful math involved. So let’s get started…

**JB**: Last time you told us roughly how your climate model works. This time I’d like to ask you about the rest of your paper, leading up to your estimate of the probability that the Atlantic Meridional Overturning Circulation (or "AMOC") will collapse. But before we get into that, I’d like to ask some very general questions.

For starters, why are scientists worried that the AMOC might collapse?

Last time I mentioned the Younger Dryas event, a time when Europe became drastically colder for about 1300 years, starting around 10,800 BC. Lots of scientists think this event was caused by a collapse of the AMOC. And lots of them believe it was caused by huge amounts of fresh water pouring into the north Atlantic from an enormous glacial lake. But nothing quite like that is happening now! So if the AMOC collapses in the next few centuries, the cause would have to be a bit different.

**NU**: In order for the AMOC to collapse, the overturning circulation has to weaken. The overturning is driven by the sinking of cold and salty, and therefore dense, water in the north Atlantic. Anything that affects the density structure of the ocean can alter the overturning.

As you say, during the Younger Dryas, it is thought that a lot of fresh water suddenly poured into the Atlantic from the draining of a glacial lake. This lessened the density of the surface waters and reduced the rate at which they sank, shutting down the overturning.

Since there aren’t any large glacial lakes left that could abruptly drain into the ocean, the AMOC won’t shut down in the same way it previously did. But it’s still possible that climate change could cause it to shut down. The surface waters from the north Atlantic can still freshen (and become less dense), either due to the addition of fresh water from melting polar ice and snow, or due to increased precipitation to the northern latitudes. In addition, they can simply become warmer, which also makes them less dense, reducing their sinking rate and weakening the overturning.

In combination, these three factors (warming, increased precipitation, meltwater) can theoretically shut down the AMOC if they are strong enough. This will probably not be as abrupt or extreme an event as the Younger Dryas, but it can still persistently alter the regional climate.
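To make the density argument concrete, here is a minimal sketch using a linear approximation to the seawater equation of state. The reference state and the coefficients `ALPHA` and `BETA` are rough textbook magnitudes assumed for illustration; they are not taken from the box model discussed in this interview.

```python
# Linearized seawater density: warmer or fresher water is lighter.
# All constants below are illustrative assumptions, not model values.
RHO0 = 1027.0   # reference density, kg/m^3
T0 = 10.0       # reference temperature, deg C
S0 = 35.0       # reference salinity, g/kg
ALPHA = 1.7e-4  # assumed thermal expansion coefficient, 1/deg C
BETA = 7.6e-4   # assumed haline contraction coefficient, 1/(g/kg)


def density(temperature, salinity):
    """Approximate density of seawater near the reference state."""
    return RHO0 * (1.0 - ALPHA * (temperature - T0) + BETA * (salinity - S0))


reference = density(T0, S0)
warmed = density(T0 + 2.0, S0)       # surface warming
freshened = density(T0, S0 - 0.5)    # added meltwater or precipitation

print(f"reference: {reference:.3f} kg/m^3")
print(f"warmed:    {warmed:.3f} kg/m^3")
print(f"freshened: {freshened:.3f} kg/m^3")
```

Both perturbations lower the density, which is the sense in which warming, precipitation and meltwater all act in the same direction on the sinking rate.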

**JB**: I’m trying to keep our readers in suspense for a bit longer, but I don’t think it’s giving away too much to say that when you run your model, sometimes the AMOC shuts down, or at least slows down. Can you say anything about how this tends to happen, when it does? In your model, that is. Can you tell if it’s mainly warming, or increased precipitation, or meltwater?

**NU**: The short answer is "mainly warming, probably". The long answer:

I haven’t done experiments with the box model myself to determine this, but I can quote from the Zickfeld *et al.* paper where this model was published. It says, for their baseline collapse experiment,

> In the box model the initial weakening of the overturning circulation is mainly due to thermal forcing […] This effect is amplified by a negative feedback on salinity, since a weaker circulation implies reduced salt advection towards the northern latitudes.

Even if they turn off all the freshwater input, they find substantial weakening of the AMOC from warming alone.

Freshwater could potentially become the dominant effect on the AMOC if more freshwater is added than in the paper’s baseline experiment. The paper did report computer experiments with different freshwater inputs, but upon skimming it, I can’t immediately tell whether the thermal effect loses its dominance.

These experiments have also been performed using more complex climate models. This paper reports that in all the models they studied, the AMOC weakening is caused more by changes in surface heat flux than by changes in surface water flux:

• J. M. Gregory *et al.*, A model intercomparison of changes in the Atlantic thermohaline circulation in response to increasing atmospheric CO_{2} concentration, *Geophysical Research Letters* **32** (2005), L12703.

However, that paper studied "best-estimate" freshwater fluxes, not the fluxes on the high end of what’s possible, so I don’t know whether thermal effects would still dominate if the freshwater input ends up being large. There are papers that suggest freshwater input from Greenland, at least, won’t be a dominant factor any time soon:

• J. H. Jungclaus *et al.*, Will Greenland melting halt the thermohaline circulation?, *Geophysical Research Letters* **33** (2006), L17708.

• E. Driesschaert *et al.*, Modeling the influence of Greenland ice sheet melting on the Atlantic meridional overturning circulation during the next millennia, *Geophysical Research Letters* **34** (2007), L10707.

I’m not sure what the situation is for precipitation, but I don’t think that would be much larger than the meltwater flux. In summary, it’s probably the thermal effects that dominate, both in complex and simpler models.

Note that in our version of the box model, the precipitation and meltwater fluxes are combined into one number, the "North Atlantic hydrological sensitivity", so we can’t distinguish between those sources of water. This number is treated as uncertain in our analysis, lying within a range of possible values determined from the hydrologic changes predicted by complex models. The Zickfeld *et al.* paper experimented with separating them into the two individual contributions, but my version of the model doesn’t do that.

**JB**: Okay. Now back to what you and Klaus Keller actually did in your paper. You have a climate model with a bunch of adjustable knobs, or parameters. Some of these parameters you take as "known" from previous research. Others are more uncertain, and that’s where the Bayesian reasoning comes in. Very roughly, you use some data to guess the probability that the right settings of these knobs lie within any given range.

How many parameters do you treat as uncertain?

**NU**: 18 parameters in total. 7 model parameters that control dynamics, 4 initial conditions, and 7 parameters describing error statistics.

**JB**: What are a few of these parameters? Maybe you can tell us about some of the most important ones — or ones that are easy to understand.

**NU**: I’ve mentioned these briefly in "week304" in the model description. The AMOC-related parameter is the hydrologic sensitivity I described above, controlling the flux of fresh water into the North Atlantic.

There are three climate related parameters:

• the climate sensitivity (the equilibrium warming expected in response to doubled CO_{2}),

• the ocean heat vertical diffusivity (controlling the rate at which oceans absorb heat from the atmosphere), and

• "aerosol scaling", a factor that multiplies the strength of the aerosol-induced cooling effect, mostly due to uncertainties in aerosol-cloud interactions.

I discussed these in "week302" in the part about total feedback estimates.

There are also three carbon cycle related parameters:

• the heterotrophic respiration sensitivity (describing how quickly dead plants decay when it gets warmer),

• CO_{2} fertilization (how much faster plants grow in CO_{2}-elevated conditions), and

• the ocean carbon vertical diffusivity (the rate at which the oceans absorb CO_{2} from the atmosphere).

The initial conditions describe what the global temperature, CO_{2} level, etc. were at the start of my model simulations, in 1850. The statistical parameters describe the variance and autocorrelation of the residual error between the observations and the model, due to measurement error, natural variability, and model error.

**JB**: Could you say a bit about the data you use to estimate these uncertain parameters? I see you use a number of data sets.

**NU**: We use global mean surface temperature and ocean heat content to constrain the three climate parameters. We use atmospheric CO_{2} concentration and some ocean flux measurements to constrain the carbon parameters. We use measurements of the AMOC strength to constrain the AMOC parameter. These are all time series data, mostly global averages — except the AMOC strength, which is an Atlantic-specific quantity defined at a particular latitude.

The temperature data are taken by surface weather stations and are for the years 1850-2009. The ocean heat data are taken by shipboard sampling, 1953-1996. The atmospheric CO_{2} concentrations are measured from the Mauna Loa volcano in Hawaii, 1959-2009. There are also some ice core measurements of trapped CO_{2} at Law Dome, Antarctica, dated to 1854-1953. The air-sea CO_{2} fluxes, for the 1980s and 1990s, are derived from measurements of dissolved inorganic carbon in the ocean, combined with measurements of manmade chlorofluorocarbon to date the water masses in which the carbon resides. (The dates tell you when the carbon entered the ocean.)

The AMOC strength is reconstructed from station measurements of poleward water circulation over an east-west section of the Atlantic Ocean, near 25 °N latitude. Pairs of stations measure the northward velocity of water, inferred from the ocean bottom pressure differences between northward and southward station pairs. The velocities across the Atlantic are combined with vertical density profiles to determine an overall rate of poleward water mass transport. We use seven AMOC strength estimates measured sparsely between the years 1957 and 2004.

**JB**: So then you start the Bayesian procedure. You take your model, start it off with your 18 parameters chosen somehow or other, run it from 1850 to now, and see how well it matches all this data you just described. Then you tweak the parameters a bit — last time we called that "turning the knobs" — and run the model again. And then you do this again and again, lots of times. The goal is to calculate the probability that the right settings of these knobs lie within any given range.

Is that about right?

**NU**: Yes, that’s right.

**JB**: About how many times did you actually run the model? Is this the sort of thing you can do on your laptop overnight, or is it a mammoth task?

**NU**: I ran the model a million times. This took about two days on a single CPU. Some of my colleagues later ported the model from Matlab to Fortran, and now I can do a million runs in half an hour on my laptop.

**JB**: Cool! So if I understand correctly, you generated a million lists of 18 numbers: those uncertain parameters you just mentioned.

Or in other words: you created a cloud of points: a million points in an 18-dimensional space. Each point is a choice of those 18 parameters. And the density of this cloud near any point should be proportional to the probability that the parameters have those values.

That’s the goal, anyway: getting this cloud to approximate the right probability density on your 18-dimensional space. To get this to happen, you used the Markov chain Monte Carlo procedure we discussed last time.

Could you say in a bit more detail how you did this, exactly?

**NU**: There are two steps. One is to write down a formula for the probability of the parameters (the "Bayesian posterior distribution"). The second is to draw random samples from that probability distribution using Markov chain Monte Carlo (MCMC).

Call the parameter vector θ and the data vector y. The Bayesian posterior distribution p(θ|y) is a function of θ which says how probable θ is, given the data y that you’ve observed. The little bar (|) indicates conditional probability: p(θ|y) is the probability of θ, assuming that you know y happened.

The posterior factorizes into two parts, the likelihood and the prior. The prior, p(θ) says how probable you think a particular 18-dimensional vector of parameters is, before you’ve seen the data you’re using. It encodes your "prior knowledge" about the problem, unconditional on the data you’re using.

The likelihood, p(y|θ), says how likely it is for the observed data to arise from a model run using some particular vector of parameters. It describes your data generating process: assuming you know what the parameters are, how likely are you to see data that looks like what you actually measured? (The posterior is the reverse of this: how probable are the parameters, assuming the data you’ve observed?)

Bayes’s theorem simply says that the posterior is proportional to the product of these two pieces:

p(θ|y) ∝ p(y|θ) p(θ)

If I know the two pieces, I multiply them together and use MCMC to sample from that probability distribution.

Where do the pieces come from? For the prior, we assumed bounded uniform distributions on all but one parameter. Such priors express the belief that each parameter lies within some range we deemed reasonable, but we are agnostic about whether one value within that range is more probable than any other. The exception is the climate sensitivity parameter. We have prior evidence from computer models and paleoclimate data that the climate sensitivity is most likely around 2 or 3 °C, albeit with significant uncertainties. We encoded this belief using a "diffuse" Cauchy distribution peaked in this range, but allowing substantial probability to be outside it, so as to not prematurely exclude too much of the parameter range based on possibly overconfident prior beliefs. We assume the priors on all the parameters are independent of each other, so the prior for all of them is the product of the prior for each of them.
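A prior of this shape can be sketched in a few lines of Python. This is only an illustration: the parameter names, the bounds in `UNIFORM_BOUNDS`, and the Cauchy location and scale are hypothetical placeholders, not the values used in the paper, and only two of the uniform parameters are shown.

```python
import math

# Hypothetical bounds for two of the uniformly distributed parameters.
UNIFORM_BOUNDS = {
    "ocean_heat_diffusivity": (0.1, 4.0),
    "aerosol_scaling": (0.0, 2.0),
}
CS_LOCATION = 3.0  # assumed peak of the climate sensitivity prior, deg C
CS_SCALE = 2.0     # assumed width of the Cauchy prior, deg C


def log_uniform(x, low, high):
    """Log density of a bounded uniform prior; -inf outside the bounds."""
    if low <= x <= high:
        return -math.log(high - low)
    return -math.inf


def log_cauchy(x, location, scale):
    """Log density of a Cauchy distribution."""
    z = (x - location) / scale
    return -math.log(math.pi * scale * (1.0 + z * z))


def log_prior(params):
    """Independent priors multiply, so their logarithms add."""
    total = log_cauchy(params["climate_sensitivity"], CS_LOCATION, CS_SCALE)
    for name, (low, high) in UNIFORM_BOUNDS.items():
        total += log_uniform(params[name], low, high)
    return total


example = {
    "climate_sensitivity": 3.0,
    "ocean_heat_diffusivity": 1.0,
    "aerosol_scaling": 1.0,
}
print(log_prior(example))
```

Working with logarithms avoids numerical underflow when many small densities are multiplied, and a value of `-inf` cleanly marks any parameter vector that falls outside the allowed ranges.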

For the likelihood, we assumed a normal (Gaussian) distribution for the residual error (the scatter of the data about the model prediction). The simplest such distribution is the independent and identically distributed ("iid") normal distribution, which says that all the data points have the same error and the errors at each data point are independent of each other. Neither of these assumptions is true. The errors are not identical, since they get bigger farther in the past, when we measured data with less precision than we do today. And they’re not independent, because if one year is warmer than the model predicts, the next year is also likely to be warmer than the model predicts. There are various possible reasons for this: chaotic variability, time lags in the system due to finite heat capacity, and so on.

In this analysis, we kept the identical-error assumption for simplicity, even though it’s not correct. I think this is justifiable, because the strongest constraints on the parameters come from the most recent data, when the largest climate and carbon cycle changes have occurred. That is, the early data are already relatively uninformative, so if their errors get bigger, it doesn’t affect the answer much.

We rejected the independent-error assumption, since there is very strong autocorrelation (serial dependence) in the data, and ignoring autocorrelation is known to lead to overconfidence. When the errors are correlated, it’s harder to distinguish between a short-term random fluctuation and a true trend, so you should be more uncertain about your conclusions. To deal with this, we assumed that the errors obey a correlated autoregressive "red noise" process instead of an uncorrelated "white noise" process. In the likelihood, we converted the red-noise errors to white noise via a "whitening" process, assuming we know how much correlation is present. (We’re allowed to do that in the likelihood, because it gives the probability of the data assuming we know what all the parameters are, and the autocorrelation is one of the parameters.) The equations are given in the paper.
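The whitening idea can be sketched as follows. This is the standard textbook form of a Gaussian likelihood with first-order autoregressive errors, written under the assumption that the residual series is stationary; it is not copied from the paper, whose exact equations may differ. The function names are hypothetical.

```python
import math


def log_normal(x, sigma):
    """Log density of a zero-mean normal distribution with std dev sigma."""
    return -0.5 * math.log(2.0 * math.pi * sigma * sigma) - 0.5 * (x / sigma) ** 2


def ar1_log_likelihood(observed, modeled, rho, sigma):
    """Log likelihood of the data when residuals follow an AR(1) process.

    rho   -- lag-1 autocorrelation, with -1 < rho < 1
    sigma -- std dev of the uncorrelated ("white") innovations
    """
    residuals = [o - m for o, m in zip(observed, modeled)]
    # The first residual has no predecessor, so use the stationary variance.
    stationary_sigma = sigma / math.sqrt(1.0 - rho * rho)
    total = log_normal(residuals[0], stationary_sigma)
    # Whitening: subtract the part of each residual predicted by the last one.
    for previous, current in zip(residuals, residuals[1:]):
        whitened = current - rho * previous
        total += log_normal(whitened, sigma)
    return total


# Residuals that stay on the same side of the model are better explained
# by correlated errors than by independent ones.
obs = [1.0, 1.0, 1.0]
mod = [0.0, 0.0, 0.0]
print(ar1_log_likelihood(obs, mod, rho=0.0, sigma=1.0))
print(ar1_log_likelihood(obs, mod, rho=0.9, sigma=1.0))
```

Setting `rho=0.0` recovers the ordinary independent-error likelihood, which is the form one would use for the sparse data sets whose errors are treated as uncorrelated.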

Finally, this gives us the formula for our posterior distribution.

**JB**: Great! There’s a lot of technical material here, so I have many questions, but let’s go through the whole story first, and come back to those.

**NU**: Okay. Next comes step two, which is to draw random samples from the posterior probability distribution via MCMC.

To do this, we use the famous Metropolis algorithm, which was invented by a physicist of that name, along with others, to do computations in statistical physics. It’s a very simple algorithm which takes a "random walk" through parameter space.

You start out with some guess for the parameters. You randomly perturb your guess to a nearby point in parameter space, which you are going to propose to move to. If the new point is more probable than the point you were at (according to the Bayesian posterior distribution), then accept it as a new random sample. If the proposed point is less probable than the point you’re at, then you randomly accept the new point with a certain probability. Otherwise you reject the move, staying where you are, treating the old point as a duplicate random sample.

The acceptance probability is equal to the ratio of the posterior distribution at the new point to the posterior distribution at the old point. If the point you’re proposing to move to is, say, 5 times less probable than the point you are at now, then there’s a 20% chance you should move there, and an 80% chance that you should stay where you are.

If you iterate this method of proposing new "jumps" through parameter space, followed by the Metropolis accept/reject procedure, you can prove that you will eventually end up with a long list of (correlated) random samples from the Bayesian posterior distribution.
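The procedure just described can be sketched in Python. This is a minimal illustration, not the code used for the paper: the Gaussian proposal with a fixed `step_size`, the burn-in length, and the toy two-dimensional target `toy_log_posterior` (standing in for the real 18-dimensional posterior) are all assumptions made for the example.

```python
import math
import random


def metropolis(log_posterior, start, step_size, n_samples, seed=0):
    """Random-walk Metropolis sampler returning a list of parameter vectors."""
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_posterior(current)
    samples = []
    for _ in range(n_samples):
        # Propose a nearby point by perturbing every coordinate.
        proposal = [x + rng.gauss(0.0, step_size) for x in current]
        proposal_lp = log_posterior(proposal)
        # Log of the ratio of posterior densities (new over old).
        log_ratio = proposal_lp - current_lp
        # More probable points are always accepted; less probable ones
        # are accepted with probability equal to the ratio.
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            current, current_lp = proposal, proposal_lp
        # On rejection the old point is recorded again as a duplicate.
        samples.append(list(current))
    return samples


def toy_log_posterior(theta):
    """Unnormalized log density of a standard normal in each coordinate."""
    return -0.5 * sum(x * x for x in theta)


samples = metropolis(toy_log_posterior, start=[3.0, -3.0],
                     step_size=1.0, n_samples=20000)
kept = samples[2000:]  # drop the early samples that remember the start
mean0 = sum(s[0] for s in kept) / len(kept)
print(f"estimated mean of first coordinate: {mean0:.3f}")
```

Only ratios of the posterior enter the accept/reject step, which is why the sampler never needs the normalizing constant in Bayes’s theorem.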

**JB**: Okay. Now let me ask a few questions, just to help all our readers get up to speed on some jargon.

Lots of people have heard of a "normal distribution" or "Gaussian", because it’s become sort of the default choice for probability distributions. It looks like a bell curve:

When people don’t know the probability distribution of something — like the tail lengths of newts or the IQ’s of politicians — they often assume it’s a Gaussian.

But I bet fewer of our readers have heard of a "Cauchy distribution". What’s the point of that? Why did you choose that for your prior probability distribution of the climate sensitivity?

**NU**: There is a long-running debate about the "upper tail" of the climate sensitivity distribution. High climate sensitivities correspond to large amounts of warming. As you can imagine, policy decisions depend a lot on how likely we think these extreme outcomes could be, i.e., how quickly the "upper tail" of the probability distribution drops to zero.

A Gaussian distribution has tails that drop off exponentially quickly, so very high sensitivities will never get any significant weight. If we used it for our prior, then we’d almost automatically get a "thin tailed" posterior, no matter what the data say. We didn’t want to put that in by assumption and automatically conclude that high sensitivities should get no weight, regardless of what the data say. So we used a weaker assumption, which is a "heavy tailed" prior distribution. With this prior, the probability of large amounts of warming drops off more slowly, as a power law, instead of exponentially fast. If the data strongly rule out high warming, we can get a thin tailed posterior, but if they don’t, it will be heavy tailed. The Cauchy distribution, a special case of the "Student t" distribution that students of statistics may have heard of (the case with one degree of freedom), is one of the most conservative choices for a heavy-tailed prior. Probability drops off so slowly at its tails that its variance is infinite.
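The difference between the two kinds of tail is easy to see numerically. The sketch below compares the probability of exceeding a threshold under a Gaussian and under a Cauchy distribution with the same center and width. The center of 3 °C and width of 1.5 °C are assumed purely for illustration and are not the prior settings from the paper.

```python
import math


def gaussian_upper_tail(x, mean, sd):
    """P(X > x) for a normal distribution."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))


def cauchy_upper_tail(x, location, scale):
    """P(X > x) for a Cauchy distribution."""
    return 0.5 - math.atan((x - location) / scale) / math.pi


CENTER = 3.0  # assumed, deg C
WIDTH = 1.5   # assumed, deg C

for threshold in (6.0, 10.0, 20.0):
    g = gaussian_upper_tail(threshold, CENTER, WIDTH)
    c = cauchy_upper_tail(threshold, CENTER, WIDTH)
    print(f"P(sensitivity > {threshold:4.1f}): Gaussian {g:.2e}, Cauchy {c:.2e}")
```

Far from the center the Cauchy tail behaves like `scale / (pi * distance)`, a power law, while the Gaussian tail collapses so fast that high values are effectively excluded before any data are considered.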

**JB**: The issue of "fat tails" is also important in the stock market, where big crashes happen more frequently than you might guess with a Gaussian distribution. After the recent economic crisis I saw a lot of financiers walking around with their tails between their legs, wishing their tails had been fatter.

I’d also like to ask about "white noise" versus "red noise". "White noise" is a mathematical description of a situation where some quantity fluctuates randomly with time in a way so that its value at any time is completely uncorrelated with its value at any other time. If you graph an example of white noise, it looks really spiky:

If you play it as a sound, it sounds like hissy static — quite unpleasant. If you could play it in the form of light, it would look white, hence the name.

"Red noise" is less wild. Its value at any time is still random, but it’s correlated to the values at earlier or later times, in a specific way. So it looks less spiky:

and it sounds less high-pitched, more like a steady rainfall. Since it’s stronger at low frequencies, it would look more red if you could play it in the form of light — hence the name "red noise".

If I understand correctly, you’re assuming that some aspects of the climate are noisy, but in a red noise kind of way, when you’re computing p(y|θ): the likelihood that your data takes on the value y, given your climate model with some specific choice of parameters θ.

Is that right? You’re assuming this about all your data: the temperature data from weather stations, the ocean heat data from shipboard samples, the atmospheric CO_{2} concentrations at Mauna Loa volcano in Hawaii, the ice core measurements of trapped CO_{2}, the air-sea CO_{2} fluxes, and also the AMOC strength? Red, red, red: all red noise?

**NU**: I think the red noise you’re talking about refers to a specific type of autocorrelated noise ("Brownian motion"), with a power spectrum that is inversely proportional to the square of frequency. I’m using "red noise" more generically to speak of any autocorrelated process that is stronger at low frequencies. Specifically, the process we use is a first-order autoregressive, or "AR(1)", process. It has a more complicated spectrum than Brownian motion.
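For readers who want to see what an AR(1) process looks like in practice, here is a minimal sketch that simulates one and measures its lag-1 autocorrelation. The autocorrelation value of 0.8 and the series length are arbitrary choices for illustration, not estimates from the paper.

```python
import random


def simulate_ar1(n, rho, sigma, seed=0):
    """Simulate x[t] = rho * x[t-1] + white noise with std dev sigma."""
    rng = random.Random(seed)
    # Draw the first value from the stationary distribution of the process.
    x = rng.gauss(0.0, sigma / (1.0 - rho * rho) ** 0.5)
    series = [x]
    for _ in range(n - 1):
        x = rho * x + rng.gauss(0.0, sigma)
        series.append(x)
    return series


def lag1_autocorrelation(series):
    """Sample correlation between each value and the one before it."""
    mean = sum(series) / len(series)
    numerator = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    denominator = sum((a - mean) ** 2 for a in series)
    return numerator / denominator


white = simulate_ar1(5000, rho=0.0, sigma=1.0)  # rho = 0 is white noise
red = simulate_ar1(5000, rho=0.8, sigma=1.0)    # rho > 0 is "red" noise

print(f"white noise lag-1 autocorrelation: {lag1_autocorrelation(white):.3f}")
print(f"AR(1) lag-1 autocorrelation:       {lag1_autocorrelation(red):.3f}")
```

With `rho` between 0 and 1, each value remembers a fraction of the previous one, so excursions persist for several steps and the low frequencies are strengthened.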

**JB**: Right, I was talking about "red noise" of a specific mathematically nice sort, but that’s probably less convenient for you. AR(1) sounds easier for computers to generate.

**NU**: It’s not only easier for computers, but closer to the spectrum we see in our analysis.

Note that when I talk about error I mean "residual error", which is the difference between the observations and the model prediction. If the residual error is correlated in time, that doesn’t necessarily reflect true red noise in the climate system. It could also represent correlated errors in measurement over time, or systematic errors in the model. I am not attempting to distinguish between all these sources of error. I’m just lumping them all together into one total error process, and assuming it has a simple statistical form.

We assume the residual errors in the annual surface temperature, ocean heat, and instrumental CO_{2} time series are AR(1). The ice core CO_{2}, air-sea CO_{2} flux, and AMOC strength data are sparse, and we can’t really hope to estimate the correlation between them, so we assume their residual errors are uncorrelated.

Speaking of correlation, I’ve been talking about "autocorrelation", which is correlation within one data set between one time and another. It’s also possible for the errors in different data sets to be correlated with each other ("cross correlation"). We assumed there is no cross correlation (and residual analysis suggests only weak correlation between data sets).

**JB**: I have a few more technical questions, but I bet most of our readers are eager to know: *so, what next?*

You use all these nifty mathematical methods to work out p(θ|y), the probability that your 18 parameters have any specific value given your data. And now I guess you want to figure out the probability that the Atlantic Meridional Overturning Circulation, or AMOC, will collapse by some date or other.

How do you do this? I guess most people want to know the answer more than the method, but they’ll just have to wait a few more minutes.

**NU**: That’s easy. After MCMC, we have a million runs of the model, sampled in proportion to how well the model fits historical data. There will be lots of runs that agree well with the data, and a few that agree less well. All we do now is extend each of those runs into the future, using an assumed scenario for what CO_{2} emissions and other radiative forcings will do in the future. To find out the probability that the AMOC will collapse by some date, conditional on the assumptions we’ve made, we just count what fraction of the runs have an AMOC strength of zero in whatever year we care about.
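The counting step can be sketched as follows. The trajectories below are made-up numbers standing in for projected AMOC strengths (in sverdrups) from a handful of posterior samples; the function name and the data layout are hypothetical.

```python
def collapse_probability(runs, years, year, threshold=0.0):
    """Fraction of runs whose AMOC strength is at or below threshold in year."""
    index = years.index(year)
    collapsed = sum(1 for run in runs if run[index] <= threshold)
    return collapsed / len(runs)


years = [2100, 2200, 2300]
# Each row is one model run; each column is the AMOC strength in that year.
# These values are invented for illustration only.
runs = [
    [18.0, 12.0, 6.0],
    [17.0, 9.0, 0.0],
    [19.0, 14.0, 10.0],
    [16.0, 0.0, 0.0],
]

for year in years:
    print(year, collapse_probability(runs, years, year))
```

Because the runs were already sampled in proportion to their posterior probability, a plain unweighted fraction is the right estimate; no extra weighting by goodness of fit is needed.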

**JB**: Okay, that’s simple enough. What scenario, or scenarios, did you consider?

**NU**: We considered a worst-case "business as usual" scenario in which we continue to burn fossil fuels at an accelerating rate until we start to run out of them, and eventually burn the maximum amount of fossil fuels we think there might be remaining (about 5000 gigatons worth of carbon, compared to the roughly 500 gigatons we’ve emitted so far). This assumes we get desperate for cheap energy and extract all the hard-to-get fossil resources in oil shales and tar sands, all the remaining coal, etc. It doesn’t necessarily preclude the use of non-fossil energy; it just assumes that our appetite for energy grows so rapidly that there’s no incentive to slow down fossil fuel extraction. We used a simple economic model to estimate how fast we might do this, if the world economy continues to grow at a similar rate to the last few decades.

**JB**: And now for the big question: *what did you find?* How likely is it that the AMOC will collapse, according to your model? Of course it depends how far into the future you look.

**NU**: We find a negligible probability that the AMOC will collapse this century. The odds start to increase around 2150, rising to about a 10% chance by 2200, and a 35% chance by 2300, the last year considered in our scenario.

**JB**: I guess one can take this as good news or really scary news, depending on how much you care about folks who are alive in 2300. But I have some more questions. First, what’s a "negligible probability"?

**NU**: In this case, it’s less than 1 in 3000. For computational reasons, we only ran 3000 of the million samples forward into the future. There were no samples in this smaller selection that had the AMOC collapsed in 2100. The probability rises to 1 in 3000 in the year 2130 (the first time I see a collapse in this smaller selection), and 1% in 2152. You should take these numbers with a grain of salt. It’s these rare "tail-area events" that are most sensitive to modeling assumptions.
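Separately from the modeling assumptions, the number of runs limits how small a probability can be resolved. As a side calculation that is not part of the paper, the sketch below finds the largest collapse probability that would still produce zero collapses in `n_runs` runs at least 5% of the time. It assumes the runs are independent, which is optimistic for MCMC samples because they are correlated. The function name is hypothetical.

```python
def zero_count_upper_bound(n_runs, confidence=0.95):
    """Upper bound on an event probability after seeing 0 events in n_runs.

    Solves (1 - p) ** n_runs = 1 - confidence for p, assuming the runs
    are independent.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_runs)


bound = zero_count_upper_bound(3000)
print(f"95% upper bound with 0 collapses in 3000 runs: {bound:.6f}")
print(f"that is roughly 1 in {1.0 / bound:.0f}")
```

For large `n_runs` the bound is close to `3 / n_runs`, so seeing no collapses in 3000 runs is consistent with probabilities up to about 1 in 1000 from sampling noise alone.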

**JB**: Okay. And second, don’t the extrapolations become more unreliable as you keep marching further into the future? You need to model not only climate physics but also the world economy. In this calculation, how many gigatons of carbon dioxide per year are you assuming will be emitted in 2300? I’m just curious. In 1998 it was about 27.6 gigatons. By 2008, it was about 30.4.

**NU**: Yes, the uncertainty grows with time (and this is reflected in our projections). And in considering a fixed emissions scenario, we’ve ignored the economic uncertainty, which, so far out into the future, is even larger than the climate uncertainty. Here we’re concentrating on just the climate uncertainty, and are hoping to get an idea of bounds, so we used something close to a worst-case economic scenario. In this scenario carbon emissions peak around 2150 at about 23 gigatons carbon per year (84 gigatons CO_{2}). By 2300 they’ve tapered off to about 4 GtC (15 GtCO_{2}).
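The two sets of units above differ only by the ratio of the molar mass of CO_{2} to that of carbon, about 44/12. A minimal sketch of the conversion, with hypothetical function names:

```python
# Molar masses in g/mol: CO2 is about 44.01, carbon is about 12.011.
CO2_PER_CARBON = 44.01 / 12.011


def gtc_to_gtco2(gigatons_carbon):
    """Convert gigatons of carbon to gigatons of carbon dioxide."""
    return gigatons_carbon * CO2_PER_CARBON


def gtco2_to_gtc(gigatons_co2):
    """Convert gigatons of carbon dioxide to gigatons of carbon."""
    return gigatons_co2 / CO2_PER_CARBON


print(f"peak near 2150: 23 GtC = {gtc_to_gtco2(23.0):.0f} GtCO2")
print(f"by 2300:         4 GtC = {gtc_to_gtco2(4.0):.0f} GtCO2")
print(f"2008 emissions: 30.4 GtCO2 = {gtco2_to_gtc(30.4):.1f} GtC")
```

In the same units, the 30.4 gigatons of CO_{2} quoted for 2008 is a bit over 8 gigatons of carbon, so the scenario’s peak is nearly three times that rate.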

Actual future emissions may be less than this, if we act to reduce them, or there are fewer economically extractable fossil resources than we assume, or the economy takes a prolonged downturn, etc. Actually, it’s not completely an economic worst case; it’s possible that the world economy could grow even faster than we assume. And it’s not the worst case scenario from a climate perspective, either. For example, we don’t model potential carbon emissions from permafrost or methane clathrates. It’s also possible that climate sensitivity could be higher than what we find in our analysis.

**JB**: Why even bother projecting so far out into the future, if it’s so uncertain?

**NU**: The main reason is that it takes a while for the AMOC to weaken, so if we’re interested in what it would take to make it collapse, we have to run the projections out a few centuries. But another motivation for writing this paper is policy related, having to do with the concept of "climate commitment" or "triggering". Even if it takes a few centuries for the AMOC to collapse, it may take less time than that to reach a "point of no return", where a future collapse has already been unavoidably "triggered". Again, to investigate this question, we have to run the projections out far enough to get the AMOC to collapse.

We define "the point of no return" to be the point in time after which the AMOC would still collapse by the year 2300 (an arbitrary date chosen for illustrative purposes), even if CO_{2} emissions were immediately reduced to zero and kept there forever. This is possible because even if we stop emitting new CO_{2}, existing CO_{2} concentrations, and therefore temperatures, will remain high for a long time (see "week303").

In reality, humans wouldn’t be able to reduce emissions instantly to zero, so the actual "point of no return" would likely be earlier than what we find in our study. We couldn’t economically reduce emissions fast enough to avoid triggering an AMOC collapse. (In this study we ignore the possibility of negative carbon emissions, that is, capturing CO_{2} directly from the atmosphere and sequestering it for a long period of time. We’re also ignoring the possibility of climate geoengineering, which is global cooling designed to cancel out greenhouse warming.)

So what do we find? Although we calculate a negligible probability that the AMOC will collapse by the end of this century, the probability that, in this century, we will commit later generations to a collapse (by 2300) is almost 5%. The probabilities of "triggering" rise rapidly, to almost 20% by 2150 and about 33% by 2200, even though the probability of experiencing a collapse by those dates is about 1% and 10%, respectively. You can see it in this figure from our paper:

The take-home message is that while most climate projections are currently run out to 2100, we shouldn’t fixate only on what might happen to people this century. We should consider what climate changes our choices in this century, and beyond, are committing future generations to experiencing.

**JB**: That’s a good point!

I’d like to thank you right now for a wonderful interview, that really taught me — and I hope our readers — a huge amount about climate change and climate modelling. I think we’ve basically reached the end here, but as the lights dim and the audience files out, I’d like to ask just a few more technical questions.

One of them was raised by David Tweed. He pointed out that while you’re "training" your model on climate data from the last 150 years or so, you’re using it to predict the future in a world that will be different in various ways: a lot more CO_{2} in the atmosphere, hotter, and so on. So, you’re extrapolating rather than interpolating, and that’s a lot harder. It seems especially hard if the collapse of the AMOC is a kind of "tipping point" — if it suddenly snaps off at some point, instead of linearly decreasing as some parameter changes.

This raises the question: why should we trust your model, or any model of this sort, to make such extrapolations correctly? In the discussion after that comment, I think you said that ultimately it boils down to

1) whether you think you have the physics right,

and

2) whether you think the parameters change over time.

That makes sense. So my question is: what are some of the best ways people could build on the work you’ve done, and make more reliable predictions about the AMOC? There’s a lot at stake here!

**NU**: Our paper is certainly an early step in making probabilistic AMOC projections, with room for improvement. I view the main points as (1) estimating how large the climate-related uncertainties may be within a given model, and (2) illustrating the difference between experiencing, and committing to, a climate change. It’s certainly not an end-all "prediction" of what will happen 300 years from now, taking into account all possible model limitations, economic uncertainties, etc.

To answer your question, the general ways to improve predictions are to improve the models, and/or improve the data constraints. I’ll discuss both.

Although I’ve argued that our simple box model reasonably reproduces the dynamics of the more complex model it was designed to approximate, that complex model itself isn’t the best model available for the AMOC. The problem with using complex climate models is that it’s computationally impossible to run them millions of times. My solution is to work with "statistical emulators", which are tools for building fast approximations to slow models. The idea is to run the complex model a few times at different points in its parameter space, and then statistically interpolate the resulting outputs to predict what the model would have output at nearby points. This works if the model output is a smooth enough function of the parameters, and there are enough carefully-chosen "training" points.
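To make the emulator idea concrete, here is a minimal Python sketch. It is not the method from the paper: `slow_model` is a made-up stand-in for an expensive simulation, the design points are hypothetical, and plain piecewise-linear interpolation is swapped in for the statistical interpolation that a real emulator would use (typically something like a Gaussian process, which also reports its own uncertainty).

```python
import bisect
import math

def slow_model(theta):
    """Stand-in for an expensive simulation (hypothetical smooth response)."""
    return math.sin(theta) + 0.1 * theta ** 2

def build_emulator(train_x, train_y):
    """Return a fast piecewise-linear approximation through the training runs."""
    pairs = sorted(zip(train_x, train_y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]

    def emulator(theta):
        if not xs[0] <= theta <= xs[-1]:
            raise ValueError("emulator is only valid inside the training range")
        i = bisect.bisect_right(xs, theta)
        if i == len(xs):              # theta equals the last training point
            return ys[-1]
        x0, x1 = xs[i - 1], xs[i]
        y0, y1 = ys[i - 1], ys[i]
        w = (theta - x0) / (x1 - x0)  # fractional position between neighbours
        return (1.0 - w) * y0 + w * y1

    return emulator

# "Run the complex model a few times": 13 design points on [0, 3].
train_x = [i * 0.25 for i in range(13)]
train_y = [slow_model(x) for x in train_x]
emulate = build_emulator(train_x, train_y)

# Check the emulator against the slow model on a fine grid.
grid = [k / 100 for k in range(301)]
max_err = max(abs(emulate(t) - slow_model(t)) for t in grid)
print("largest emulation error on the grid:", max_err)
```

The sketch refuses to extrapolate outside the training range, which mirrors the caveat in the text: the approximation is only trustworthy where the output is smooth and the training points are dense enough.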

From an oceanographic standpoint, even current complex models are probably not wholly adequate (see the discussion at the end of "week304"). There is some debate about whether the AMOC becomes more stable as the resolution of the model increases. On the other hand, people still have trouble getting the AMOC in models, and the related climate changes, to behave as abruptly as they apparently did during the Younger Dryas. I think the range of current models is probably in the right ballpark, but there is plenty of room for improvement. Model developers continue to refine their models, and ultimately, the reliability of any projection is constrained by the quality of models available.

Another way to improve predictions is to improve the data constraints. It’s impossible to go back in time and take better historic data, although with things like ice cores, it is possible to dig up new cores to analyze. It’s also possible to improve some historic "data products". For example, the ocean heat data is subject to a lot of interpolation of sparse measurements in the deep ocean, and one could potentially improve the interpolation procedure without going back in time and taking more data. There are also various corrections being applied for known biases in the data-gathering instruments and procedures, and it’s possible those could be improved too.

Alternatively, we can simply *wait*. Wait for new and more precise data to become available.

But when I say "improve the data constraints", I’m mostly talking about adding more of them, ones that I simply didn’t include in the analysis, or looking at existing data in more detail (like spatial patterns instead of global averages). For example, the ocean heat data mostly serves to constrain the vertical mixing parameter, controlling how quickly heat penetrates into the deep ocean. But we can also look at the penetration of chemicals in the ocean (such as carbon from fossil fuels, or chlorofluorocarbons). This is also informative about how quickly water masses mix down to the ocean depths, and indirectly informative about how fast heat mixes. I can’t do that with my simple model (which doesn’t have the ocean circulation of any of these chemicals in it), but I can with more complex models.

As another example, I could constrain the climate sensitivity parameter better with paleoclimate data, or more resolved spatial data (to try to, e.g., pick up the spatial fingerprint of industrial aerosols in the temperature data), or by looking at data sets informative about particular feedbacks (such as water vapor), or at satellite radiation budget data.

There is a lot of room for reducing uncertainties by looking at more and more data sets. However, this presents its own problems. Not only is this simply harder to do, but it runs more directly into limitations in the models and data. For example, if I look at what ocean temperature data imply about a model’s vertical mixing parameter, and what ocean chemical data imply, I might find that they imply two inconsistent values for the parameter! Or that those data imply a different mixing than is implied by AMOC strength measurements. This can happen if there are flaws in the model (or in the data). We have some evidence from other work that there are circumstances in which this can happen:

• A. Schmittner, N. M. Urban, K. Keller and D. Matthews, Using tracer observations to reduce the uncertainty of ocean diapycnal mixing and climate-carbon cycle projections, *Global Biogeochemical Cycles* **23** (2009), GB4009.

• M. Goes, N. M. Urban, R. Tonkonojenkov, M. Haran, and K. Keller, The skill of different ocean tracers in reducing uncertainties about projections of the Atlantic meridional overturning circulation, *Journal of Geophysical Research — Oceans*, in press (2010).

How to deal with this, if and when it happens, is an open research challenge. To an extent it depends on expert judgment about which model features and data sets are "trustworthy". Some say that expert judgment renders conclusions subjective and unscientific, but as a scientist, I say that such judgments are always applied! You always weigh how much you trust your theories and your data when deciding what to conclude about them.

In my response I’ve so far ignored the part about parameters changing in time. I think the hydrological sensitivity (North Atlantic freshwater input as a function of temperature) can change with time, and this could be improved by using a better climate model that includes ice and precipitation dynamics. Feedbacks can fluctuate in time, but I think it’s okay to treat them as a constant for long term projections. Some of these parameters can also be spatially dependent (e.g., the respiration sensitivity in the carbon cycle). I think treating them all as constant is a decent first approximation for the sorts of generic questions we’re asking in the paper. Also, all the parameter estimation methods I’ve described only work with static parameters. For time varying parameters, you need to get into state estimation methods like Kalman or particle filters.
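For readers who haven’t met state estimation, here is a minimal Python sketch of a scalar Kalman filter tracking a parameter that drifts over time. Everything in it is an illustrative assumption rather than something from the paper: the parameter follows a random walk, the observations are the parameter plus Gaussian noise, and the noise levels (0.05 and 1.0) are made up.

```python
import math
import random

def kalman_track(observations, process_var, obs_var, init_mean=0.0, init_var=100.0):
    """Scalar Kalman filter for a parameter that follows a random walk."""
    mean, var = init_mean, init_var
    estimates = []
    for y in observations:
        var += process_var             # predict: the parameter may have drifted
        gain = var / (var + obs_var)   # how much to trust the new observation
        mean += gain * (y - mean)      # update the estimate
        var *= (1.0 - gain)            # update its uncertainty
        estimates.append(mean)
    return estimates

def rmse(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

rng = random.Random(0)

# Hypothetical "truth": a parameter that wanders slowly away from 1.0.
truth, theta = [], 1.0
for _ in range(500):
    theta += rng.gauss(0.0, 0.05)
    truth.append(theta)

# Noisy observations of the parameter at each time step.
obs = [t + rng.gauss(0.0, 1.0) for t in truth]

est = kalman_track(obs, process_var=0.05 ** 2, obs_var=1.0)
rmse_raw = rmse(obs, truth)
rmse_filtered = rmse(est, truth)
print("raw observation error:", rmse_raw)
print("filtered estimate error:", rmse_filtered)
```

The filter assumes Gaussian errors throughout. A particle filter drops that assumption by carrying a cloud of weighted samples instead of a mean and a variance.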

**JB**: I also have another technical question, which is about the Markov chain Monte Carlo procedure. You generate your cloud of points in 18-dimensional space by a procedure where you keep either jumping randomly to a nearby point, or staying put, according to that decision procedure you described. Eventually this cloud fills out to a good approximation of the probability distribution you want. But, how long is "eventually"? You said you generated a million points. But how do you know that’s enough?

**NU**: This is something of an art. Although there is an asymptotic convergence theorem, there is no general way of knowing whether you’ve reached convergence. First you check to see whether your chains "look right". Are they sweeping across the full range of parameter space where you expect significant probability? Are they able to complete many sweeps (thoroughly exploring parameter space)? Is the Metropolis test accepting a reasonable fraction of proposed moves? Do you have enough effective samples in your Markov chain? (MCMC generates correlated random samples, so there are fewer "effectively independent" samples in the chain than there are total samples.) Then you can do consistency checks: start the chains at several different locations in parameter space, and see if they all converge to similar distributions.
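A few of these checks can be sketched in Python on a toy problem. This is an illustration only, not the analysis from the paper: the "posterior" is a one-dimensional standard normal, the step size 2.4 and the chain lengths are arbitrary choices, and the effective sample size uses a simple rule (sum autocorrelations until the first non-positive one) that is just one of several estimators in use.

```python
import math
import random

def log_target(x):
    """Log density of a standard normal, up to a constant (toy posterior)."""
    return -0.5 * x * x

def metropolis(n_steps, start, step_size, seed):
    """Random-walk Metropolis; returns the chain and the acceptance fraction."""
    rng = random.Random(seed)
    x, chain, accepted = start, [], 0
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        diff = log_target(proposal) - log_target(x)
        # Metropolis test: accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, diff)):
            x = proposal
            accepted += 1
        chain.append(x)
    return chain, accepted / n_steps

def effective_sample_size(chain, max_lag=200):
    """Chain length divided by the integrated autocorrelation time."""
    n = len(chain)
    mean = sum(chain) / n
    centered = [c - mean for c in chain]
    var = sum(c * c for c in centered) / n
    tau = 1.0
    for lag in range(1, max_lag + 1):
        rho = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n * var)
        if rho <= 0.0:   # stop at the first non-positive autocorrelation
            break
        tau += 2.0 * rho
    return n / tau

# Consistency check: start chains far apart and compare what they report.
summaries = []
for seed, start in enumerate([-10.0, 0.0, 10.0]):
    chain, rate = metropolis(20000, start, 2.4, seed)
    kept = chain[1000:]   # discard an initial burn-in period
    summaries.append((rate, sum(kept) / len(kept), effective_sample_size(kept), len(kept)))

for rate, mean, ess, n in summaries:
    print("acceptance", rate, "mean", mean, "effective samples", ess, "of", n)
```

If the chains disagree about the mean, or the effective sample size is a tiny fraction of the chain length, that is a warning sign. Agreement does not prove convergence, as the text says.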

If the posterior distribution shows, or is expected to show, a lot of correlation between parameters, you have to be more careful to ensure convergence. You want to propose moves that carry you along the "principal components" of the distribution, so you don’t waste time trying to jump away from the high probability directions. (Roughly, if your posterior density is concentrated on some low dimensional manifold, you want to construct your way of moving around parameter space to stay near that manifold.) You also have to be careful if you see, or expect, multimodality (multiple peaks in the probability distribution). It can be hard for MCMC to move from one mode to another through a low-probability "wasteland"; it won’t be inclined to jump across it. There are more advanced algorithms you can use in such situations, if you suspect you have multimodality. Otherwise, you might discover later that you only sampled one peak, and never noticed that there were others.
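The point about proposing moves along the principal directions can be illustrated with a toy example in Python. The target here is an assumed two-parameter Gaussian with correlation 0.99 (nothing to do with the actual 18-dimensional posterior), and the two proposal scales are arbitrary illustrative choices.

```python
import math
import random

RHO = 0.99  # assumed correlation between the two toy parameters

def log_target(x, y):
    """Log density (up to a constant) of a unit-variance Gaussian with correlation RHO."""
    return -(x * x - 2.0 * RHO * x * y + y * y) / (2.0 * (1.0 - RHO * RHO))

def acceptance_rate(propose, n_steps=20000, seed=1):
    """Run Metropolis with a symmetric proposal and return the acceptance fraction."""
    rng = random.Random(seed)
    x = y = 0.0
    accepted = 0
    for _ in range(n_steps):
        dx, dy = propose(rng)
        nx, ny = x + dx, y + dy
        diff = log_target(nx, ny) - log_target(x, y)
        if rng.random() < math.exp(min(0.0, diff)):
            x, y = nx, ny
            accepted += 1
    return accepted / n_steps

def isotropic(rng, scale=1.0):
    """Independent jumps of equal size in each coordinate."""
    return rng.gauss(0.0, scale), rng.gauss(0.0, scale)

def aligned(rng, scale=1.7):
    """Jumps sharing the target's correlation, built from its Cholesky factor."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return scale * z1, scale * (RHO * z1 + math.sqrt(1.0 - RHO * RHO) * z2)

rate_isotropic = acceptance_rate(isotropic)
rate_aligned = acceptance_rate(aligned)
print("isotropic proposal acceptance:", rate_isotropic)
print("aligned proposal acceptance:  ", rate_aligned)
```

The isotropic proposal keeps trying to step off the narrow ridge and gets rejected. The aligned proposal takes larger steps along the ridge and is still accepted more often. Acceptance rate alone is not a measure of efficiency, since tiny steps are almost always accepted, so in practice one would also look at how far the chain actually moves.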

**JB**: Did you do some of these things when testing out the model in your paper? Do you have any intuition for the "shape" of the probability distribution in 18-dimensional space that lies at the heart of your model? For example: do you know if it has one peak, or several?

**NU**: I’m pretty confident that the MCMC in our analysis is correctly sampling the shape of the probability distribution. I ran lots and lots of analyses, starting the chain in different ways, tweaking the proposal distribution (jumping rule), looking at different priors, different model structures, different data, and so on.

It’s hard to "see" what an 18-dimensional function looks like, but we have 1-dimensional and 2-dimensional projections of it in our paper:

I don’t believe that it has multiple peaks, and I don’t expect it to. Multiple peaks usually show up when the model behavior is non-monotonic as a function of the parameters. This can happen in really nonlinear systems (and with threshold systems like the AMOC), but during the historic period I’m calibrating the model to, I see no evidence of this in the model.

There are correlations between parameters, so there are certain "directions" in parameter space that the posterior distribution is oriented along. And the distribution is not Gaussian. There is evidence of skew, and nonlinear correlations between parameters. Such correlations appear when the data are insufficient to completely identify the parameters (i.e., different combinations of parameters can produce similar model output). This is discussed in more detail in another of our papers:

• Nathan M. Urban and Klaus Keller, Complementary observational constraints on climate sensitivity, *Geophysical Research Letters* **36** (2009), L04708.

In a Gaussian distribution, the distribution of any pair of parameters will look ellipsoidal, but our distribution has some "banana" or "boomerang" shaped pairwise correlations. This is common, for example, when the model output is a function of the product of two parameters.
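Here is a small Python sketch of how a product of two parameters produces such a shape. It is a toy, not the model in the paper: two hypothetical parameters `a` and `b` have uniform priors on [0.2, 5], and the made-up "data" only say that the product `a * b` is close to 1. Rejection sampling stands in for MCMC because the problem is tiny.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

rng = random.Random(42)
samples = []
for _ in range(100000):
    a = rng.uniform(0.2, 5.0)   # draw from the uniform prior
    b = rng.uniform(0.2, 5.0)
    # Toy likelihood: the data only constrain the product a*b to be near 1.
    likelihood = math.exp(-0.5 * ((a * b - 1.0) / 0.1) ** 2)
    if rng.random() < likelihood:   # rejection sampling from the posterior
        samples.append((a, b))

a_vals = [s[0] for s in samples]
b_vals = [s[1] for s in samples]
corr_linear = pearson(a_vals, b_vals)
corr_log = pearson([math.log(a) for a in a_vals], [math.log(b) for b in b_vals])
print("posterior samples kept:", len(samples))
print("correlation of a and b:          ", corr_linear)
print("correlation of log(a) and log(b):", corr_log)
```

The accepted points hug the hyperbola `b = 1/a`, so a scatter plot looks like a boomerang rather than an ellipse. The correlation is much tighter in log space than in the original variables, which is one way to see that the dependence is nonlinear.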

**JB**: Okay. It’s great that we got a chance to explore some of the probability theory and statistics underlying your work. It’s exciting for me to see these ideas being used to tackle a big real-life problem. Thanks again for a great interview.

*Maturity is the capacity to endure uncertainty.* – John Finley

John wrote:

…which they do for a reason, I might add, because if something is influenced by many different factors, each of them of little influence and independent from the others, then the central limit theorem says that the distribution is approximately Gaussian (with the caveat that the theorem itself does not tell how good this approximation is in real situations where the number of influences is finite).

The IQ itself is *defined* using this assumption: the mean is defined to be 100 and the standard deviation is defined to be 15. “Highly gifted” is defined to be two standard deviations above average, which is above 130 and applies to ca. 2% of the population, by definition. So if someone tells you that scientists have discovered that ca. 2% of the population is highly gifted, you should ask what definition of “highly gifted” they use, because if they use the only one I know about, this assessment is simply true by definition :-)

Nathan wrote:

Usually performance bottlenecks are localized at very few lines of code – do you know what exactly caused the increase of performance in this case? I don’t think that Fortran is just faster (because it is not Fortran or Matlab that is executed by the CPU, but assembler, and I doubt that the available Fortran compilers are that much better than the ones for C or C++).

Nathan said:

Do you know about projects to deploy more measuring probes into the oceans to increase the quality and level of detail of the data?

The speed improvement is indeed simply that Fortran is a faster language than Matlab. Matlab is an interpreted language. They sell a compiler, but I didn’t have access to it.

(Actually, my speed comparison is somewhat off. The “million runs in half an hour” is for just the Fortran climate module. The Fortran port of the full model is slower, but I haven’t benchmarked it yet.)

I don’t know what the current plans are for future ocean observing systems. The current state of the art is the Argo system, and I imagine any plans to deploy more probes would be part of that project.

I think the existing probes will prove highly useful, once they’ve been there long enough to collect a few decades of data, and the bias-correction issues are sorted out.

(I say “existing probes” loosely, because they have to be continually replenished as they beach themselves or are otherwise lost. I mean the existing number of probes.)

I created a stub on the Azimuth project with the link to Argo: Oceanography.

> “They sell a compiler, but I didn’t have access to it.”

Nathan, the name “compiler” is there for historical reasons, but really it is just a way of packaging things up with a runtime version of the interpreter, so you can give executables to people (along with the required dlls). So it does not really improve execution speed that much.

Automatically generating optimized C code that is independent of the interpreter is possible using the Embedded MATLAB Coder (people in the aerospace and automotive industries might be more familiar with it). Using that, I have obtained a 200-fold performance improvement (for an extended Kalman filter) with respect to interpreted MATLAB. So your 100-fold improvement is not really surprising to me.

It is also possible to run MATLAB on the GPU, using the parallel computing toolbox, but I haven’t tried that yet.

Regarding computer languages:

1. It’s not that “FORTRAN compilers are better than C/C++ ones”, it’s that the FORTRAN semantics (default of no aliasing, expression evaluation order unspecified except for brackets, FORALL, etc) make it much easier to write a “numerical array processing” program that the compiler can generate close to optimal code for compared to C/C++. (They also mean that the compiler routinely restructures the code so much that doing low-level debugging is incredibly hard.) Conversely, those same semantics make it an absolute nightmare to write efficient code for, eg, a compiler compared to C/C++. (Or so I understand. The need to also do non-numeric stuff is why I haven’t written Fortran programs.) It’s a bit like the difference between a saw and a chisel: they’re optimised to be good for different tasks.

It’s too bad that I don’t know enough about Fortran to add anything to the discussion.

FWIW, Java is, first and foremost, an interpreted language too, but it is possible to outperform C and C++ for *certain* tasks that most probably *don’t* include heavy number crunching; see e.g. the Java HotSpot Virtual Machine.

Java has just-in-time compilation of bytecode down to native machine language. It’s not purely interpreted in the classic sense of the term.

Right, I should have been more precise: Java code is compiled into byte code that is platform independent and executed by a virtual machine (JVM), which is platform dependent.

As a performance optimization, commonly used JVMs are capable of compiling deployed byte code into machine code. Some JVMs outperform some C++ compilers in this regard, so that it can happen that Java seems to be faster than C++.

Regarding MATLAB, I gather part of the issue is not interpretation per se, but that the “semantics” are that you have to be able to interrupt the computation and look at *any* variable currently in scope. This means that the interpreter can’t do even simple optimisations like turning

T=A+B;

D=T+C; % T MUST be “inspectable in the development environment”

into

D=A+B+C;

let alone more complicated transformations. (I think even when written as A+B+C it creates “hidden” temporary arrays rather than doing all the additions “at once”.) When these are big matrices this causes a big slowdown due to cache miss issues and memory churn.

You mean the support of reflection?

I don’t know about the handling of this specifically; what I do know is that you can deploy Java programs with debug information added, which the JVM HotSpot compiler of the JVMs I know will strip away and optimize at will, if it deems it necessary (and if one did not forbid this via configuration, I guess).

The latest speed trick in phylogenetics (which also uses Bayesian MCMC techniques) is to use GPUs instead of CPUs to calculate the likelihood.

http://tree.bio.ed.ac.uk/publications/390/

Developments in computer hardware are largely driven by 3D games, which need fast GPUs. Killing fictitious aliens is bigger business than saving real humans.

In fact, I’m looking into GPU computation for my Monte Carlo needs as well. One problem is that traditional MCMC is a rather sequential algorithm, but there are more parallel algorithms out there (as well as Monte Carlo algorithms that are not MCMC).

I said people use the Gaussian distribution as a kind of ‘default’ probability distribution, and Tim wrote:

The central limit theorem is a wonderful thing. But in reality probability distributions are rarely *exactly* Gaussian, and often they’re not even close. So in reality, there must usually be some condition of the central limit theorem that doesn’t apply.

You mentioned one way this can happen: if a random variable is the result of combining just *finitely* many independent influences, it doesn’t need to be Gaussian. But there are others. For example, the central limit theorem only applies when the influences *add*. But in real life, few things are linear. And, of course, the influences might not be independent!

So, to a large extent, people use Gaussian distributions for reasons of *convenience* rather than principle. They’re easy to work with, especially because people have already developed a large arsenal of machinery for working with them (the t-test, the chi-square test, etc.), and people learn this stuff in school.

This may or may not be okay, depending on the situation. But we should always be aware of it, because it can get us in trouble.
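The remark that the central limit theorem needs the influences to add can be checked with a quick Python experiment. This is only a toy with made-up numbers: each "influence" is a uniform random factor between 0.5 and 1.5, and we combine 20 of them either by adding or by multiplying, then measure how lopsided the result is using the sample skewness (zero for a symmetric distribution such as a Gaussian).

```python
import math
import random

def skewness(xs):
    """Sample skewness: third central moment over variance to the power 3/2."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

rng = random.Random(7)
n_influences = 20   # number of small independent influences (arbitrary choice)
n_trials = 20000

# Influences that add: close to Gaussian, as the central limit theorem predicts.
sums = [sum(rng.uniform(0.5, 1.5) for _ in range(n_influences))
        for _ in range(n_trials)]

# The same kind of influences, but multiplying: strongly skewed instead.
products = [math.prod(rng.uniform(0.5, 1.5) for _ in range(n_influences))
            for _ in range(n_trials)]

skew_sum = skewness(sums)
skew_product = skewness(products)
print("skewness when influences add:     ", skew_sum)
print("skewness when influences multiply:", skew_product)
```

The logarithm of the product is again a sum, so the product is approximately lognormal rather than Gaussian. Fitting a Gaussian to it would badly underestimate the upper tail.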

Personal anecdote I: One of my high-school teachers told us in the sociology class that everything is always Gaussian :-)

Personal anecdote II: Some engineers did a statistical analysis of signal transfer times in the bus of a car, using a Gaussian distribution, i.e. they used estimators for the mean and the variance. When I looked at the data it became immediately clear that the distribution of the transmission times was sharply peaked around two different values (they were almost always either x ms or y ms) :-)

When I visited Göttingen, I found to my great amusement that Gauss’ tomb was covered with little notes from students asking for help on their math tests. Perhaps instead scientists should be begging him to make all their probability distributions Gaussian.

But why do all these engineers immediately slap down a Gaussian whenever they need a distribution in a model?

Because it’s normal! (Baddam-bam)

(Sorry for that ancient joke.)

Gaussians often do work in many situations, at large scales with quasi-linear dynamics (when you are adding up lots of little errors in a linear way). They’re more useful for climate statistics on global averages, than they are for weather statistics, regional climate extremes, etc.

Usually the central limit theorem doesn’t fail because of finite vs. infinite numbers of errors being added. There are typically lots of errors floating around. And there are versions of the CLT that apply to correlated errors, although I’m not sure how general they are.

I more often see non-Gaussian distributions when errors don’t add, or they’re being propagated through nonlinear dynamics, or maybe when there’s a finite *mixture* of error processes (instead of sum of errors) … e.g. a “normal” error process combined with something else that causes “outliers”.

So I don’t always worry a lot about non-normality in the data-generating process, unless there are serious outliers or I’m working with something local like weather data. (For example, I run into them all the time looking at local CO2 flux measurements from eddy covariance flux towers.) For global-scale data, usually correlation in the data generating process is more important than non-normality (and errors in the model even more important than that).

I do worry about non-normality in the parameter estimates, which is why I avoid certain statistical approximations like Kalman filters that make normality assumptions about parameters.

Yes, but this doesn’t imply that the IQs of politicians (or any specific subset of the population) are normally distributed.

An important aspect of Nathan’s paper is that it estimates the probability that we’ll be *committed* to a collapse of the Atlantic Meridional Overturning Current by 2300, barring extraordinary measures:

I would like to know about other work people have done to estimate the probability of “climate change commitments”. In particular, we are in danger of reaching a number of tipping points, shown below.

Which one are we the closest to tipping? Which one are we the closest to *committing* to tipping?

One source of scepticism about predictions in climate science is the lack (or to be more precise: my lacking knowledge) of successful postdictions (you know, like string theory postdicts gravity :-).

Do you know about the running contests to predict the shrinkage of Arctic sea ice? See here, here, here and here, or on the skeptical side here and here. It seems a bit like a horse race where people don’t agree on where the horses are… but I know that some people are claiming successful *predictions*, not just postdictions.

There are probably more scholarly places to read about the success or failure of climate model predictions, but at least the blogs show that it’s become a popular sport!

Unfortunately, it’s really hard to judge a climate model based on its short term predictive skill, since the climate on those scales is dominated by chaotic variability that is hard to predict. For predictions less than a decade or two ahead, a climate model probably isn’t going to beat a simple statistical regression. See Cox & Stephenson (2007), and Hawkins & Sutton (2009), Fig. 3 (reproduced in Meehl et al. (2009)), for illustrations of “skillful timescales” in climate models.

Clearly there are some things that climate models don’t get right, but I wouldn’t say that climate science lacks successful postdictions (we call them “hindcasts”). You can go to the IPCC report (e.g., chapters 8 and 9 of the WG1 report) and judge for yourself what you think a “successful” postdiction should be.

There isn’t too much probabilistic work on “committing to tipping” of the sort that we did in our paper. There are two separate bodies of research in the literature.

One looks for “threshold temperatures” – if we stabilize global temperatures above this line, the system will tip. There isn’t much uncertainty analysis there. That’s the sort of thing you’ll find in the Lenton et al. paper and its references.

The other body of research on “stabilization” looks at how we can level out below a given temperature, including the probability that we’ll cross some temperature target anyway even if we start trying to reduce emissions. See the “trillionth tonne” work of Meinshausen, Allen, etc.

As for which tipping points we’re closest to, I think we are likely to lose the Arctic summer sea ice this century. We may be already committed to that, though I don’t know any literature on the subject. In general, the points we’re closest to tipping are also the ones we’re closest to committing to tip.

Another possibility is disintegration of the Greenland ice sheet, although different researchers have rather divergent opinions on how close that may be (e.g., Richard Alley has favored a low number around 2 C, and Jonathan Bamber a high number around 6-8 C). Alley thinks we could commit to that within the next decade (barring geoengineering and such), although total disintegration would be much farther into the future.

Thanks to John and Nathan for a great interview! Some questions:

In the graphs, it looks like some of the posterior parameter distributions are being clipped by the bounds on the priors. Also, uniform priors are rarely a good expression of prior knowledge. So I wonder how you came to choose these priors? I assume the Cauchy(3,2) prior for the climate sensitivity is proportional to 1/(1 + ((x-3)/2)^2). What is the thinking behind that?

How sensitive are your results to choice of prior? For example, what if the climate sensitivity had a prior like dlnorm(meanlog=1, sdlog=1)?

In the predictions, is the collapse of the AMOC closely correlated with high global temperatures? Or is it plausible to have a modest temperature rise, but still have the AMOC collapse? Or in other words, is the AMOC an extra worry, is it only going to collapse when the rest of the climate is going crazy anyway?

The priors came mostly from parameter ranges considered by previous modeling studies. The modelers haven’t published more sophisticated expressions of belief, and we didn’t conduct a full blown elicitation study to try to determine them.

The main boundary clipping I worry about is with the ‘h’ and ‘kappa_V’ parameters (hydrological sensitivity and vertical diffusivity). I’m not as worried about other parameters, which have more natural bounds.

The upper bound on ‘h’ was taken from a range of predictions of GCM runs, but probably this should be relaxed somewhat, to allow for the possibility of higher hydrological sensitivities than modeled. However, I don’t think this will impact the results much, because hydrological sensitivity isn’t the major contributor to variance in the AMOC strength projections. The AMOC collapses mostly come from the high temperatures, not the high hydrological sensitivities (answering your last question). Thus, the important parameter is climate sensitivity.

Clipping on vertical diffusivity is more of an issue, because it’s correlated with climate sensitivity, which generates high temperatures. The data do seem to favor diffusivities at the edge of the prior range. The range in our paper is actually narrower than what was considered in the study where the model was developed, because I thought that upper bound was too high. But we might have expressed that belief too strongly in the prior, and cut off too much of the upper tail on climate sensitivity.

It is hard to come up with a prior for kappa_V, because heat diffusivity is not directly measurable in the real ocean; it’s an effective parameter that needs to be tuned to ocean heat data. On the other hand (as I alluded to above), I don’t really believe that 1 and 10 cm^2/s are equally plausible a priori. Right now I’m exploring trapezoidal priors (uniform on some “likely” range but decaying linearly on an “unlikely but plausible” range), as well as various lognormals. I’m also thinking about how I can use information about diffusivity from other studies, where the main obstacle is that different models can have very different numerical values for that parameter.
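A trapezoidal prior of this kind is easy to write down. The Python sketch below is only an illustration: the function name and the breakpoints (0.5, 1, 4 and 10 cm^2/s) are made up for the example and are not the values being explored.

```python
def trapezoid_pdf(x, a, b, c, d):
    """Density that rises linearly on [a, b], is flat on [b, c], and falls linearly on [c, d].

    Requires a < b < c < d.
    """
    height = 2.0 / ((d - a) + (c - b))   # makes the total area equal to 1
    if x < a or x > d:
        return 0.0                       # impossible values
    if x < b:
        return height * (x - a) / (b - a)   # unlikely but plausible, low side
    if x <= c:
        return height                       # the "likely" range
    return height * (d - x) / (d - c)       # unlikely but plausible, high side

# Hypothetical breakpoints for a diffusivity prior, in cm^2/s.
A, B, C, D = 0.5, 1.0, 4.0, 10.0

def prior(kappa):
    return trapezoid_pdf(kappa, A, B, C, D)

# Numerical check that the density integrates to 1 (midpoint rule on [0, 11]).
n_steps = 11000
width = 11.0 / n_steps
total_area = sum(prior((k + 0.5) * width) for k in range(n_steps)) * width
print("area under the prior:", total_area)
print("density at 2 versus 8 cm^2/s:", prior(2.0), prior(8.0))
```

Unlike a uniform prior on the whole range, this says that a value of 8 is possible but less believable than a value of 2, without cutting the tail off abruptly.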

As for climate sensitivity, the Cauchy prior was chosen as a heavy tailed distribution, reflecting knowledge from paleoclimate and modeling studies that climate sensitivity is most likely around 3 C, but wider than what those studies reflect to guard against overconfidence.

I have tried other priors for climate sensitivity (like various lognormals). At the present I’m using a lognormal prior instead of the Cauchy prior, because it has bounded variance and I’m now looking at some low-data situations (mostly as a conceptual exercise).
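To get a feel for how differently these two families treat the upper tail, here is a small Python comparison. The assumptions are loud ones: the Cauchy is taken as untruncated with location 3 and scale 2, and the lognormal uses meanlog=1, sdlog=1, the values suggested in the question above rather than whatever lognormal is actually being used now.

```python
import math
from statistics import NormalDist

def cauchy_tail(x, loc=3.0, scale=2.0):
    """P(S > x) for an untruncated Cauchy(loc, scale) prior."""
    return 0.5 - math.atan((x - loc) / scale) / math.pi

def lognormal_tail(x, meanlog=1.0, sdlog=1.0):
    """P(S > x) when log(S) is Normal(meanlog, sdlog)."""
    return 1.0 - NormalDist(meanlog, sdlog).cdf(math.log(x))

# Prior probability that climate sensitivity exceeds each threshold (in C).
for threshold in (6.0, 10.0, 100.0):
    print(threshold, cauchy_tail(threshold), lognormal_tail(threshold))

# The lognormal has a finite variance; the Cauchy's variance is undefined.
lognormal_variance = (math.exp(1.0) - 1.0) * math.exp(2.0 * 1.0 + 1.0)
print("lognormal variance:", lognormal_variance)
```

With these particular parameters the two priors put similar weight on moderately high values, but the Cauchy keeps far more probability at absurdly high ones. That far tail is what makes its variance undefined.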

I think it’s possible to use a narrower climate sensitivity prior than we did while still guarding against overconfidence; we may have been over-conservative in that respect. But narrowing the climate sensitivity prior doesn’t do much, if it’s the bounds on vertical diffusivity that are controlling the tail areas.

Ultimately, climate predictions are sensitive to prior choices; this has been known for some time with respect to some climate sensitivity priors. That’s one reason why I don’t advertise our paper as giving “the answer” to AMOC collapse. It’s more intended to explore what an answer to that question may look like in a coupled model with different sources of uncertainty, and to look at the experiencing vs. committing question.

Some think that the difficulties of eliciting good priors will be solved by “objective Bayesian” reference priors, but while this could help in some cases, I don’t feel this is the way to go. I don’t think expert elicitation of model parameter ranges is, either, since these are so often “effective” parameters encapsulating a lot of unresolved processes.

Ultimately, I think prior sensitivity will have to be addressed by applying more data constraints to make priors less relevant. This is one reason why we’re moving to more complex Earth system models. Not just because they have more realistic physics, but they have more outputs. I can constrain vertical diffusivity not just with heat data but with other oceanic chemical tracers, as discussed in the interview; then the prior on the vertical mixing becomes less important.

I’d also like to incorporate paleo data more explicitly, but this is somewhat tricky using results from paleo studies that used different models than what I use. There is some model dependence in the results (and there aren’t too many paleo studies that do careful uncertainty bounds, either). It’s hard to avoid having to use intentionally “weakened” priors, rather than the results of any particular study.

This is another reason why I’m interested in learning. How much do we have to learn, and how fast, before we can rule out some of the tail-area risks that may or may not be prior-driven? And it’s also why I’m interested in “robust decision making” – how do you make decisions when you know some of your predictions are sensitive to assumptions?

Thanks for the detailed reply. The paper by Knutti and Hegerl

http://www.nature.com/ngeo/journal/v1/n11/abs/ngeo337.html

that you referred to earlier gave a good picture of the uncertainties (and therefore sensitivity to priors). It seems worth pointing to again. The take home message is that climate models have become more complex and realistic, but not more precise in their predictions: there’s never enough data.

Like you, I’m skeptical of “objective Bayesian” reference priors, or any claim to objectivity. I’m probably more optimistic about expert elicitation of priors. Sometimes reparameterising the model can make it easier to elicit priors. It’s also a good idea, especially with high dimensional priors, to run the model with no data, so that you, or an expert, can see more clearly what is being assumed in the prior. For decision making, only the prior-loss combination matters, and that can often focus attention on which aspects of the prior matter. (I’m sure you know these things but a lot of people don’t.)

On learning, what you say is reminiscent of reinforcement learning

http://en.wikipedia.org/wiki/Reinforcement_learning

though I’m used to thinking of that in terms of controlling robots, not global policy decisions!

The Knutti and Hegerl paper is a good review. It is disheartening how slowly we have learned about some climate processes. I hope this will change in the future, with more and better observation systems, and an increased statistical capacity to compare models to data and interpret the results.

One concern with the sort of probabilistic model calibration I do is the extent to which past data are even informative about the major future uncertainties. It may be that calibrating models on historic data rewards them for getting certain climate patterns right that were historically important (e.g., ocean multidecadal or ENSO variability), but are not related to the factors that will drive future uncertainty a century from now.

I find that reparameterization works better with simple statistical models, where the parameters don’t always have very physical interpretations and it’s mathematically easy to rewrite the model in a reparameterized form. Neither of these tends to be true with complex climate models.

And yes, it’s good to remind people to propagate prior uncertainties to see if it really reflects what we believe about the model’s behavior, and that only the prior-loss combination matters, or really the posterior-loss combination. (I do think about these things.) This has focused attention on the high tails of climate sensitivity, to which the long term decision problem is sensitive. (See Weitzman’s “dismal theorem” paper for an extreme, albeit technically questionable, example.)

I was speaking of learning in just a passive sense, but when coupled to policy it does become related to dynamic programming and reinforcement learning. In fact, I am currently looking into approximate dynamic programming, which has imported methods from reinforcement and machine learning, as a way to tackle future problems in active learning and adaptive policy. See this article (and this review) as well as this book by Warren Powell here at Princeton for details.

Graham wrote:

I’m not familiar with this jargon. Does this mean something like the prior probability that you lose — some money, say — times how much you’ll lose? Or more generally, the expected loss, given your prior?

Is there some reason people use the word ‘loss’ instead of ‘gain’ here? Just general pessimism, because they’re talking about something like risk management instead of how to maximize gains?

I’ll venture my own response, but Graham can fill in.

Yes, this refers to expected loss, which is the probability-weighted average of the loss (the sum of loss × probability over outcomes). If “probability” is sensitive to prior assumptions, this may not matter, as long as the sensitivity doesn’t affect the outcomes where there is a lot of “loss”.

But in climate change, the prior sensitivity is to the high upper tails of the distribution, which is precisely where the loss is important. That’s why priors are a problem.
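A small numerical sketch of this point follows. The two lognormal distributions and the quadratic loss are assumptions chosen for illustration, not values from the paper: both distributions have the same median sensitivity, and they differ only in how heavy the upper tail is.

```python
import numpy as np

def expected_loss(samples, loss):
    """Monte Carlo estimate of expected loss: the average loss over the samples."""
    return float(np.mean(loss(samples)))

def quadratic_loss(sensitivity):
    """Hypothetical convex damage function: loss grows with the square of sensitivity."""
    return sensitivity ** 2

rng = np.random.default_rng(1)
n = 200_000
# Two assumed distributions for climate sensitivity (K per doubling),
# both with median 3 but with different upper tails.
thin_tail = rng.lognormal(mean=np.log(3.0), sigma=0.3, size=n)
fat_tail = rng.lognormal(mean=np.log(3.0), sigma=0.7, size=n)

loss_thin = expected_loss(thin_tail, quadratic_loss)
loss_fat = expected_loss(fat_tail, quadratic_loss)
print(f"medians: {np.median(thin_tail):.2f} vs {np.median(fat_tail):.2f}")
print(f"expected losses: {loss_thin:.1f} vs {loss_fat:.1f}")
```

Because the loss is convex, most of the difference in expected loss comes from the upper tail, even though the two distributions agree on the “typical” value.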

Different fields prefer different jargon: “loss” or “cost” minimizers, or “gain” or “utility” maximizers. I’m not familiar with the historical use of the word.

Thanks. I suspect that anyone trying to sell a financial instrument would speak of gain maximizers instead of loss minimizers… at least until recently.

I should explain “prior-loss combination” better than I did before. But first, just to confirm: the only difference between loss on the one hand and gain or utility on the other is a sign change. In finance, people are mainly interested in how much they stand to gain by a particular action. In science, there is often a theoretical optimum behaviour, and it is natural to measure how far away from that an actual decision-making process is.

To the main point. I’m not even sure “prior-loss combination” is standard jargon; it is something I’ve picked up from Herman Rubin on usenet. I am talking in the context of feeding ‘expert’ opinion into Bayesian statistical decision theory. Assume the model is fixed, and the data as yet unknown. You might have two experts, one of whom says “I think climate sensitivity is high, but I don’t think that much damage will be done by high temperatures”, while another says “I think climate sensitivity is low, but I think that damage will be done even by modest temperature rises”. It could be that (suitably formalised) both these opinions (which are prior-loss combinations) lead to exactly the same decisions. That is, the decisions are data-dependent (of course) but they don’t depend on which expert you choose to believe.
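Here is one way to formalise that claim, as a hedged sketch rather than a standard recipe. If expert B’s prior is expert A’s prior multiplied by some positive function of the parameter, and B’s loss is A’s loss divided by the same function, then the posterior expected losses differ only by a positive constant, so the best action is the same for any data. The parameter grid, actions, priors, loss function, and the `tilt` function below are all hypothetical.

```python
import numpy as np

theta = np.array([1.5, 3.0, 4.5, 6.0])   # candidate climate sensitivities (K per doubling)
actions = np.array([0.0, 0.5, 1.0])      # hypothetical mitigation levels

# Expert A: high sensitivity is likely; loss = residual damage + mitigation cost.
prior_a = np.array([0.1, 0.2, 0.4, 0.3])
cost = 2.5 * actions + 1.0 * actions ** 2
loss_a = (1.0 - actions[:, None]) * theta[None, :] + cost[:, None]

# Expert B: less prior weight on high sensitivity, proportionally larger losses there.
tilt = 1.0 / theta
prior_b = prior_a * tilt / np.sum(prior_a * tilt)
loss_b = loss_a / tilt[None, :]

def best_action(prior, loss, likelihood):
    """Index of the action minimising posterior expected loss."""
    posterior = prior * likelihood
    posterior = posterior / posterior.sum()
    return int(np.argmin(loss @ posterior))

def count_agreements(n_trials=1000, seed=2):
    """Compare the two experts' decisions over many random stand-in likelihoods."""
    rng = np.random.default_rng(seed)
    agree = 0
    for _ in range(n_trials):
        likelihood = rng.random(theta.size)  # stand-in for an arbitrary data set
        if best_action(prior_a, loss_a, likelihood) == best_action(prior_b, loss_b, likelihood):
            agree += 1
    return agree

print("trials where both experts choose the same action:", count_agreements(), "of 1000")
```

The decisions still depend on the data through the likelihood, but not on which of the two experts you believe.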

Ha, I also learned this point from Herman Rubin on Usenet, long before I ever started working with decision theory.

Anyway, there is evidence that the tails can matter to decision making for standard choices of loss function, i.e., there is a range of possible decisions depending on prior assumptions. Maybe a different loss function would produce the same range of decisions, but it wouldn’t change the fact that you have a range of decisions depending on priors, and not just a single decision independent of priors. That is, unless you can kill off enough of the prior dependence right in the probability distribution.

This is all very interesting and all, but if there’s negligible possibility of seeing any kind of effect with regard to the AMOC for the next 140 years, this is all pretty much useless in recommending changes to policy. In other words, you’re asking people and nations to make drastic changes in their use of energy, changes which will trigger immediate and perhaps catastrophic societal effects, based on the probability, predicted by a scientific model for which we do not know the inherent error, of events which begin to have the least amount of probability 140 years from now. Your chances of success are probably directly proportional to the probability of change in the AMOC … Absolutely negligible, for at least the next 100 years or so.

This work also does not take into account “peak oil” and “peak coal” and “peak nuclear energy (uranium, thorium, etc)”, which may not be important now, but will probably be very important by 2050, with oil, at the present rate of extraction, being economically too costly to continue to extract. Most predictions of future oil supply show a sharp decrease in availability by 2050, with the same case holding in 2075 or thereabouts for coal. That would pretty much do it for fossil fuel CO2 generation, so if you look at the “trigger curve” in 2075, it’s at 0.01, so if there’s no more fossil fuel burning, that will change the trigger curve, making it decrease, not increase.

In summary, looking at how these studies do not take fossil fuel resource depletion into account, and seeing how the nearest non-negligible effect is at least 140 years in the future, I think it’s pretty safe to say that arguments based on these results will be very non-persuasive. Climate skeptics will have a field day with this…

My opinions:

First, the proposal that climate mitigation efforts will trigger “catastrophic social effects” is a false narrative, which serves only as an excuse to not bother trying to accomplish anything, ever. No matter what happens to the climate, one can always imagine economic disaster scenarios to further justify inaction. Ditto for claims that “scientific uncertainty” is an argument for inaction (especially concerning claims that we know nothing about the scientific uncertainty).

This goes double for those who claim a high degree of certainty about the enormous socioeconomic disasters awaiting anyone who reduces fossil energy use, particularly considering how much easier it is to “learn by doing” about economic costs than to learn about long term environmental risks.

In short, if climate policy really is as disastrous as economic pessimists claim, we will learn that quickly as we begin to put policies in place (since, as you say, the effects are immediate), before implementing policies stringent enough to invoke actual disasters. By contrast, if climate impacts really are as disastrous as climate pessimists claim, we will probably learn that far too late to avoid much of the damage. This argues for an experimental policy to learn about costs and damages, with iterative improvements to ramp mitigation efforts up or down on the basis of what we learn. It doesn’t argue for a policy of perpetual inaction based on imagined disasters. That’s not the way anybody approaches risk management in any other field that I know of.

Second, these studies do take fossil fuel depletion into account. But they use far different assumptions than you evidently do about how much fossil resources will prove economically extractable under growing global energy demand.

Third, while I personally find it interesting, AMOC collapse is hardly the only or even the most important motivation for large scale climate policy.

Fourth, you may be right that many people simply don’t care about long term environmental hazards, and will never contemplate policy which could forgo damages to future generations. If people don’t care about what happens beyond 2100, then certainly studies of what may happen beyond 2100 aren’t going to convince anybody to do anything. But I don’t view public apathy toward future generations as an actual argument to commit ourselves to potentially long lasting and severe environmental risks.

Fifth, I don’t know why climate skeptics would “have a field day” with our paper. It’s not like it’s a fundamentally surprising or even new result; it just formalizes some of the uncertainty analysis associated with the conclusion that the IPCC already reached in 2007 (less than a 10% chance of AMOC collapse this century).

@Nathan, you state “Second, these studies do take fossil fuel depletion into account. But they use far different assumptions than you evidently do about how much fossil resources will prove economically extractable under growing global energy demand.”

That’s pretty obvious, from looking at the curves. What exactly are these “far different assumptions”? What assumptions do you make about population growth, and population, over this time scale of nearly 300 years?

I hope Nathan answers your questions in more detail: these are some issues I really want to understand! But just so he doesn’t need to repeat himself, this is from the interview:

Here’s the graph from his paper:

Here’s what the paper says:

The reference to Nordhaus is:

• Nordhaus, W. D. 2007. The challenge of global warming: Economic models and environmental policy, Technical report, http://nordhaus.econ.yale.edu/DICE2007.htm, accessed May 2, 2007, model version: DICE-2007.delta.v7.

A writeup describing Nordhaus’ 2007 model is here:

• William D. Nordhaus, A Question of Balance: Weighing the Options on Global Warming Policies, Yale University Press, New Haven, 2008. A version of this book is free online.

Since I’m just looking at this 248-page book for the first time right now, I can’t say much about where the curve above comes from. I will note, though, that on page 127 of his book, Nordhaus estimates that a total of 6±1.2 trillion metric tons of carbon are available to be burnt. Currently we’ve burnt about 0.54 trillion tons, and people are wringing their hands about the trillionth ton, with some estimating it will be burnt by around 2044.

On the same page, Nordhaus estimates that the world population will level off at 8.6±1.9 billion.

Anyway, I hope Nathan weighs in — this is just to get the ball rolling.

I don’t intend to “weigh in” too heavily, since I’m not an energy economist. I can offer a few general points and pointers to references.

Fossil fuel emissions in the “business as usual” (BAU) scenarios that are usually considered are not driven primarily by population growth. They’re mostly driven by an assumption of continued economic growth, particularly that the rest of the developing world will eventually grow to consume energy at intensities similar to European, or U.S., consumption patterns (barring additional economic incentives to strive for low energy intensities).

This is coupled to an assumption that there is a large amount of fossil carbon available in coal, tar sands, and oil shales (thousands of gigatons), and that eventual high energy demand will make it economically worthwhile to extract most or all of that carbon.

This doesn’t preclude growth in alternative energy, simply that energy demand will be high enough that we’ll eventually want to dig up all that fossil carbon anyway, in addition to whatever alternative energy we deploy.

Nordhaus’s DICE model is one way to turn these assumptions into an emissions trajectory. I should also point people toward the “Representative Concentration Pathway” (RCP) scenarios (overview here), which is what the IPCC will be using in its next assessment report.

You can view (preliminary versions of?) these scenarios with this browser. The emissions projections go out to 2100, and they have “extension scenarios” (ECPs) for CO2 concentrations (not emissions) out to 2300.

Their BAU scenario is called RCP8.5, and it is based on (but not identical to?) work in this paper, which outlines growth scenarios. (They don’t explicitly discuss fossil carbon constraints because in this scenario they implicitly assume that there is enough carbon to avoid peaking before 2100, the last date they consider.)

All the other RCP scenarios are “stabilization” or mitigation scenarios where society opts to stabilize at below-BAU CO2 concentrations, or reduce emissions even further.

The RCPs and ECPs look like this:

RCP8.5 appears to have a slightly larger and sooner peak than the DICE BAU scenario, but is fairly comparable. Looking at the ECP concentrations, they seem to be assuming a similar total fossil resource constraint (~5000 GtC). The other ECPs assume stabilization at some CO2 concentration around 2150, or even a decline.

I don’t know that I personally believe that we will go after every last scrap of carbon we think may be in the ground. As I said in the interview, we intentionally considered a “worst case” scenario.

I do think there’s a serious risk that we’ll extract, say, half that amount, and reach quadrupled (from pre-industrial) CO2 levels some time in the century after this one. That would require extracting what are currently low grade and unprofitable reserves, but they will become more profitable as other sources are depleted.

Eventually fossil prices will rise, and alternative energy prices drop, to the point that it’s more profitable to switch completely to non-fossil energy. Absent price controls on carbon, I am not convinced this will happen fast enough to avoid some pretty high CO2 levels.

There are other scenarios that lead to lower fossil fuel consumption, such as a global economic collapse leading to permanently depressed economic growth, or otherwise lower continued growth than we’ve seen historically based on our fossil energy economy. (There are also scenarios of enhanced economic growth…) Even so, a lowered rate of growth doesn’t necessarily imply a lowered final CO2 concentration, just that we’d hit it at a later date.

I can’t venture my own estimate as to what I think will come to pass. For my policy work I choose to use what appears in the mainstream climate economic literature, and if those estimates change, so will my projections.

Nathan wrote:

I just meant that I hoped you’d answer “What assumptions do you make about population growth, and population, over this time scale of nearly 300 years?” And you have.

Justifying these assumptions would be a vastly harder task, even for shorter time scales. I’m happy just to get the references. I expect to spend a lot of time learning and discussing such issues on this blog…

I think that unless we find a source of energy as energy-dense as crude oil, before depletion seriously sets in, we’ll be in for a worldwide economic collapse. US agriculture is heavily dependent on fossil fuels to provide the food surplus which feeds the underdeveloped and developing nations whose population is booming, far past their carrying capacity even now. When oil scarcity hits the US, not only will it drive US food prices higher, it’ll become prohibitively expensive to ship food overseas, thus the developing and underdeveloped countries will be forced to fall back on what they can produce on their own. For underdeveloped countries in Africa, this will result in famine and mass starvation and political instability; for developing countries like China, the consequences will be hard to predict, but they may suffer economic depression and political instability as well. The end result of these events may be a drastic downward shift in population in the underdeveloped world and, in the developing world, a slowing-down of economic development.

It should also be considered that crude oil is also a feedstock for plastics, pharmaceutical drugs, and agricultural chemicals among other items, and is not just a source of energy. There’ll be a considerable amount of competition amongst these uses, unless an alternate, renewable feedstock, such as hemp, is developed.

Here’s a reference with pic of Ford’s hemp car

http://green.autoblog.com/2007/07/20/fords-hemp-car-closer-a-greener-choice-that-produces-less-smo/

I guess you know that currently available varieties of hemp, including common wild weed, produce far more usable fiber per hectare than any other plant, while simultaneously producing more edible protein and oil than any crop it would replace. Even common wild weed outperforms Dyson’s hypothetical magic forests by an order of magnitude more CO2 removal.

Streamfortyseven wrote:

I’m surprised that you say most predictions say this about coal. I’ve seen a lot of different predictions about peak oil: there’s an interesting new article by Kevin Drum in Mother Jones magazine, for example. But I always thought the problem, as far as global warming goes, is that there’s a lot more coal.

The Wikipedia article Peak coal is not very informative. For example: a figure for peak coal production in China, without any source or attribution.

Well, I mis-spoke about peak coal. TheOilDrum.com has, in this 2007 piece, peak coal occurring in a plateau from about 2015 to 2040, then tailing off: http://www.theoildrum.com/node/2396

In this study, published this year, 2010, peak coal, including Chinese production, is forecast to occur between 2015 and 2033:

http://www.theoildrum.com/node/6782#more

And finally, in this paper, published in 2009 in a journal named Fuel, the peak production of coal is forecast to occur somewhere between 2010 and 2048, with their best guess for peak production year being, roughly, 2036:

http://www.theoildrum.com/node/5256

So it looks like I was optimistic by about 40 years or so…

There’s very little out there on total Chinese production, but here’s a chart on exports:

Of course, this could reflect that China has gone from being a net exporter of crude oil to a net importer. The curve ends in 2007, before the economic collapse, so that doesn’t enter in to these figures. It might be simply that the Chinese are diverting their coal to a coal-to-liquids plant, to produce oil, hydrocarbons, and asphalt. I’ll keep looking around…

Here’s one of the articles referenced by the Wiki peak coal page:

EWG_Report_Coal_10-07-2007ms.pdf

Their guess for Chinese peak coal is 2015; the EIA guess is 2030 or beyond. Apparently they’ve got some big coal mine fires which they can’t put out, which account for about 5% of their reserves. There’s a paper referenced on this topic, which appears to be unpublished work by one of the authors.

Chinese reserves are probably a huge state secret. The CIA might have a better guess, but they might not be too forthcoming, either.

Thanks for the help! I’ll email the CIA and see what they say.