Quantum Information Processing 2011 (Day 2)

12 January, 2011

Here are some very fragmentary notes on the second day of QIP 2011. You can see arXiv references, slides, and videos of the talks here. I’ll just give links to the slides, and again I’ll only mention 3 talks.

Stephanie Wehner gave a talk on the relation between the uncertainty principle and nonlocality in quantum theory. There’s a general framework for physical theories, called “generalized probabilistic theories”, which includes classical and quantum mechanics as special cases. In this framework we can see that while quantum theory is “nonlocal” in the sense made famous by John Bell, even more nonlocal theories are logically possible!

For example, while quantum theory violates the Clauser–Horne–Shimony–Holt inequality, which is obeyed by any local hidden variables theory, it doesn’t violate it to the maximum possible extent. There is a logically conceivable gadget, the Popescu–Rohrlich box, which violates the CHSH inequality to the maximum extent allowed by a theory that prohibits faster-than-light signalling. However, such a gadget would give us implausibly godlike computational powers.

In Wehner’s talk, she explained another reason not to hope for more nonlocality than quantum theory provides. Namely, given the “steering” ability we have in quantum theory — that is, our ability to prepare a state at one location by doing a measurement at another — the theory cannot be more nonlocal than it is while still obeying the uncertainty principle.

Jérémie Roland gave a talk on generalizations of Grover’s search algorithm. Grover’s algorithm is one of the implausibly godlike powers that quantum computers might give us, if only we could build the bloody things: it’s a way to search a list of N items for a given item in a time that’s only on the order of N^{1/2}. This algorithm assumes we can freely jump from place to place on the list, so instead of a linearly ordered “list” it’s probably better to visualize this data structure as a complete graph with N vertices. Roland’s talk explained a way to generalize this idea to arbitrary graphs.
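If you want to see the quadratic speedup in action, here is a minimal statevector simulation of Grover’s algorithm in Python using numpy. It’s just a sketch for building intuition, not an implementation of anything from Roland’s talk:

    import numpy as np

    def grover(N, marked, iterations=None):
        """Simulate Grover search for one marked item among N; return success probability."""
        if iterations is None:
            iterations = int(np.floor(np.pi / 4 * np.sqrt(N)))
        amp = np.full(N, 1.0 / np.sqrt(N))      # start in the uniform superposition
        for _ in range(iterations):
            amp[marked] *= -1.0                 # oracle: flip the sign of the marked amplitude
            amp = 2.0 * amp.mean() - amp        # diffusion: inversion about the mean
        return amp[marked] ** 2

    # Roughly (pi/4) * sqrt(N) iterations find the marked item with probability near 1,
    # whereas a classical search needs about N/2 queries on average.
    for N in (16, 256, 4096):
        print(N, grover(N, marked=3))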

He began by considering a Markov chain on the graph — that is, a way to blunder randomly from vertex to vertex along the graph, where you can only go from one vertex to another in one step if there’s an edge connecting them. He assumed that it’s reversible and ergodic. Then, starting from this, he described how to fashion a quantum process that finds the desired vertex (or vertices) in about the square root of the time that the Markov chain would take.
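For concreteness, here is the classical half of that story only: a crude estimate of the hitting time of a simple random walk on a graph, using the networkx library (my choice of tool, not something from the talk). The quantized walk Roland described aims to find the marked vertex in roughly the square root of this time; simulating that part is beyond a quick sketch.

    import random
    import networkx as nx

    def mean_hitting_time(G, target, trials=500, seed=1):
        """Estimate the average number of steps a simple random walk on G needs
        to reach `target`, starting from a uniformly random vertex."""
        rng = random.Random(seed)
        nodes = list(G.nodes())
        total = 0
        for _ in range(trials):
            v = rng.choice(nodes)
            steps = 0
            while v != target:
                v = rng.choice(list(G.neighbors(v)))
                steps += 1
            total += steps
        return total / trials

    # On a cycle with N vertices the classical walk needs on the order of N^2 steps.
    print(mean_hitting_time(nx.cycle_graph(63), target=0))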

Robin Kothari gave a talk on using quantum computation to efficiently detect certain properties of graphs. He considered “minor-closed properties” of graphs. Let me just tell you what those properties are, and tell you about a fascinating older result about them.

The word graph means many slightly different things, but in this blog entry I mean a finite set V whose elements are called vertices, together with a set E of unordered pairs of distinct vertices, which we call edges. So, these are undirected finite graphs without self-loops or multiple edges connecting vertices.

A minor of a graph is another graph that can be obtained from the first one by repeatedly:

1) removing edges,

2) removing vertices that have no edges connected to them, and

3) contracting edges, that is, shrinking them to nothing and then identifying the vertices at both ends.

For example, a triangle is a minor of a square (a 4-cycle): just contract one of the square’s four edges.

A property of graphs is minor-closed if whenever one graph has it, all its minors also have it.

What are some minor-closed properties? An obvious one is lacking cycles, that is, being a forest. You can get rid of cycles by getting rid of edges and vertices, or contracting edges, but you can’t create them!

A more interesting minor-closed property is planarity. If you can draw a graph on the plane, you can clearly also draw the graph you get by removing an edge, or removing a lone vertex, or contracting an edge.

Now, Kuratowski and Wagner showed that planar graphs are precisely those that don’t have the complete graph K_5, or the complete bipartite graph K_{3,3}, as a minor.
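Here’s a quick sanity check of these claims using the networkx library (again my choice of tool, nothing to do with Kothari’s talk):

    import networkx as nx

    # The forbidden minors for planarity: K_5 and K_{3,3} are not planar.
    print(nx.check_planarity(nx.complete_graph(5))[0])                # expect False
    print(nx.check_planarity(nx.complete_bipartite_graph(3, 3))[0])   # expect False

    # Planarity is minor-closed: take a planar graph and form a minor of it.
    G = nx.grid_2d_graph(3, 3)
    print(nx.check_planarity(G)[0])                                   # expect True

    H = G.copy()
    H.remove_edge((0, 0), (0, 1))                                     # 1) remove an edge
    H = nx.contracted_edge(H, ((1, 1), (1, 2)), self_loops=False)     # 3) contract an edge
    print(nx.check_planarity(H)[0])                                   # still expect True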

Similarly, graphs that lack cycles are precisely those that don’t have a triangle as a minor!

So, we could ask if this pattern generalizes. Given a minor-closed property of graphs, is there always some finite set of “forbidden” graphs such that a graph has the property precisely when none of the forbidden graphs shows up as a minor?

This is called Wagner’s conjecture, after Klaus Wagner. He published it in 1970.

Wagner’s conjecture is true! It was proved by Neil Robertson and Paul Seymour in 2004. But the really interesting thing, to me, is that their proof takes about 400 or 500 pages!

I find this quite surprising… but then, I wouldn’t have guessed the four-color theorem was so hard to prove, either.

To make sure you understand Wagner’s conjecture, check that if we dropped the word “finite”, it would have a one-sentence proof.


Quantum Information Processing 2011 (Day 1)

11 January, 2011

This year’s session of the big conference on quantum computation, quantum cryptography, and so on is being held in Singapore:

QIP 2011, the 14th Workshop on Quantum Information Processing, 8-14 January 2011, Singapore.

Because the battery on my laptop is old and not very energetic, and I can’t find any sockets in the huge ballroom where the talks are being delivered, I can’t live-blog the talks. So, my reports here will be quite incomplete.

Here are microscopic summaries of just three talks from Monday’s session. You can see arXiv references, slides, and videos of the talks here. I’ll just give links to the slides.

Christian Kurtsiefer gave a nice talk on how to exploit the physics of photodetectors to attack quantum key distribution systems! By cutting the optical fiber and shining a lot of light down both directions, evil Eve can ‘blind’ Alice and Bob’s photodetectors. Then, by shining a quick even brighter pulse of light, she can fool one of their photodetectors into thinking it’s seen a single photon. She can even fool them into thinking they’ve seen a violation of Bell’s inequality, by purely classical means, thanks to the fact that only a small percentage of photons make it down the cable in the first place. Christian and his collaborators have actually done this trick in an experiment here at the CQT!

Tzu-Chieh Wei and Akimasa Miyake gave a two-part joint talk on how the AKLT ground state is universal for measurement-based quantum computation. The AKLT ground state works like this: you’ve got a hexagonal lattice with three spin-1/2 particles at each vertex. Think of each particle as attached to one of the three edges coming out of that vertex. In the ground state, you start by putting the pair of particles at the ends of each edge in the spin-0 (also known as “singlet”, or antisymmetrized) state, and then you project the three particles at each vertex down to the spin-3/2 (completely symmetrized) state. This is indeed the ground state of a cleverly chosen antiferromagnetic Hamiltonian. But has anyone ever prepared this sort of system in the lab?

David Poulin gave a talk on how to efficiently compute time evolution given a time-dependent quantum Hamiltonian. The trickiness arises from Hamiltonians that change very rapidly with time: in a naive evenly spaced discretization of the time-ordered exponential, such rapid changes force you to use lots of tiny time steps to get a good approximation. But using a clever randomly chosen discretization you can avoid this problem, at least for uniformly bounded Hamiltonians, those obeying:

\| H(t) \| \le K

for all times t. The key is that the high-frequency part of a time-dependent Hamiltonian only couples faraway energy levels, but a uniformly bounded Hamiltonian doesn’t have faraway energy levels.
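Here is a crude numerical toy illustrating the flavour of the idea, though not the actual algorithm from Poulin’s talk: we compare a naive evenly spaced discretization of the time-ordered exponential with one that evaluates the Hamiltonian at a randomly chosen time inside each step and averages over several runs. The 2×2 Hamiltonian and all the parameters are made up, and scipy is assumed to be available:

    import numpy as np
    from scipy.linalg import expm

    # A toy 2x2 time-dependent Hamiltonian with a rapidly oscillating piece,
    # uniformly bounded: ||H(t)|| <= K for all t.
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)

    def H(t, omega=200.0):
        return sx + np.cos(omega * t) * sz

    def product_formula(sample_time, steps, T=1.0):
        """Approximate the time-ordered exponential on [0, T] by a product of
        short-time exponentials, evaluating H once per step at a time chosen
        by sample_time(t_start, t_end)."""
        dt = T / steps
        U = np.eye(2, dtype=complex)
        for k in range(steps):
            t = sample_time(k * dt, (k + 1) * dt)
            U = expm(-1j * H(t) * dt) @ U
        return U

    rng = np.random.default_rng(0)
    reference  = product_formula(lambda a, b: 0.5 * (a + b), steps=20000)  # very fine grid
    uniform    = product_formula(lambda a, b: a, steps=50)                 # coarse, evenly spaced
    randomized = np.mean([product_formula(lambda a, b: rng.uniform(a, b), steps=50)
                          for _ in range(20)], axis=0)                     # coarse, random times

    print("coarse evenly spaced error:", np.linalg.norm(uniform - reference))
    print("averaged random-time error:", np.linalg.norm(randomized - reference))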

A couple more things — really just notes to myself:

Steve Flammia told me about this paper relating the Cramér-Rao bound (which involves Fisher information) to the time-energy uncertainty principle:

• Sergio Boixo, Steven T. Flammia, Carlton M. Caves, and J.M. Geremia, Generalized limits for single-parameter quantum estimation.

Markus Müller told me about a paper mentioning relations between Maxwell’s demon and algorithmic entropy. I need to get some references on this work — it might help me make progress on algorithmic thermodynamics. It’s probably one of these:

• Markus Müller, Quantum Kolmogorov complexity and the quantum Turing machine (PhD thesis).

• Markus Müller, On the quantum Kolmogorov complexity of classical strings, Int. J. Quant. Inf. 7 (2009), 701-711.

Hmm — the first one says:

A concrete proposal for an application of quantum Kolmogorov complexity is to analyze a quantum version of the thought experiment of Maxwell’s demon. In one of the versions of this thought experiment, some microscopic device tries to decrease the entropy of some gas in a box, without the expense of energy, by intelligently opening or closing some little door that separates both halves of the box. It is clear that a device like this cannot work as described, since its existence would violate the second law of thermodynamics. But then, the question is what prevents such a little device (or “demon”) from operating.

Roughly, the answer is that the demon has to make observations to decide whether to close or open the door, and these observations accumulate information. From time to time, the demon must erase this additional information, which is only possible at the expense of energy, due to Landauer’s principle. In Li and Vitanyi’s book An Introduction to Kolmogorov Complexity and Its Applications, this cost of energy is analyzed under very weak assumptions with the help of Kolmogorov complexity. Basically, the energy that the demon can extract from the gas is limited by the difference of the entropy of the gas, plus the difference of the Kolmogorov complexity of the demon’s memory before and after the demon’s actions. The power of this analysis is that it even encloses the case that the demon has a computer to do clever calculations, e.g. to compress the accumulated information before erasing it.

So, I guess I need to reread Li and Vitanyi’s book!


Mathematical Economics

8 January, 2011

My student Miguel Carrión Álvarez got his Ph.D. in 2004, back when I was still working on loop quantum gravity. So, he decided to work on a rigorous loop quantization of the electromagnetic field. I like his thesis a lot:

• Miguel Carrión Álvarez, Loop Quantization versus Fock Quantization of p-Form Electromagnetism on Static Spacetimes.

However, he decided to leave mathematical physics when he got his degree… and he switched to finance.

There’s a lot of math in common between quantum field theory and mathematical finance. When you take quantum fluctuations in quantum fields, and replace time by imaginary time, you get random fluctuations in the stock market!

Or at least in some models of the stock market. One difference between quantum field theory and mathematical finance is that the former is famous for predicting certain quantities with many decimal digits of accuracy, while the latter is famous for predicting certain quantities with no digits of accuracy at all! I’m talking about the recent financial crisis.

Miguel and I share an interest in the failures of neoclassical economics. My interest comes from the hope — quite possibly a futile hope — that correcting some mistakes in economic theory could help show us a way out of some problems civilization now finds itself in. In fact, the Azimuth Project has its origins in my old economics diary.

Right now we’re having a little conversation about mathematical economics. Maybe you’d like to join in!

Miguel wrote:

I’ll be teaching parts of two courses on mathematical finance and financial risk management in a ‘Mathematical Engineering’ MSc programme at the Universidad Complutense here in Madrid.

I replied:

Cool! Has the way people teach these subjects changed any since the economic crisis? I would hope so…

Miguel replied:

I don’t think it has.

First of all, these courses are mostly technical, as part of a Master’s programme intended to teach people what practitioners do in practice. I don’t think criticizing the foundations is part of the programme.

But you may have noticed (for instance, if you follow Krugman in the NYT) that the economic establishment has been very resistant to recognizing that the crisis is an empirical refutation of neoclassical economics. This crisis doesn’t fit within the conceptual framework of NCE, but that’s not a problem, as they can just call it an “external shock” and continue pretending that the economy will trend to equilibrium from its new perturbed position. A related joke of mine is that the recession part of the economic cycle gets treated as an outlier.

And this is not to speak of mathematical finance, where the situation is even worse. Academics still think that the best way to manage a new risk is to quantify it, define an index, and then create derivatives markets to trade it. In other words, turn all risk into market price risk and then push it all to the least solvent participants on the periphery of the financial system.

I think there is good progress being made by non-mainstream economists. Notably Steve Keen – see here:

• ARGeezer, Steve Keen’s dynamic model of the economy, European Tribune, 23 September 2009.

One of the problems with economics that Steve Keen complains about is that economists generally don’t know much about dynamical systems. I doubt they know what the Lotka-Volterra equation is, let alone understand it (if you discretize it, the predator-prey model displays chaos like the logistic equation). I also doubt economists know about chaos in the logistic equation, even if they know about logistic growth models, which may not generally be the case either. Their model of the economy seems to be basically the Clausius-Clapeyron equation.

I believe the proper way to look at macroeconomics is as a flow network and as such ideas from category theory may be useful at least to organize one’s thinking.

The economy is a network whose nodes are “agents” and links are “economic relations”. Economic relations are flows, of goods and services in one direction and of money in the other direction (opposite categories via arrow reversal?).

Each node also has a balance sheet: assets and liabilities, and it’s the liabilities that are the key here, because they are mostly monetary. But they are also intertemporal. When you buy on credit, you get a physical asset today in exchange for the promise of cash at a later date. Presumably, the discounted present value of that future cash covers the value of the asset today, and that’s what you book as an asset or a liability. But the asset/liability accrues interest over time so its value changes on the balance sheet, and every so often you still need to make an actual transfer of cash along an edge of the network. And when IOUs become tradeable (you issue me a bearer-IOU which I can then give to someone else who trusts your credit) they become money, too. And the relative variations in the prices of all these different kinds of money, their liquidity, etc, are key in understanding a recession like the one we’re in, or the growth (ehm, bubble) phase of a business cycle.

I don’t have a theory of this, but I keep thinking in these terms and one thing that seems to come out of it is a sort of “fluctuation-dissipation” relation between market fluctuations in prices and trade volumes, the creation of money-as-debt, and inflation. Because nodes abhor insolvency, fluctuations in cash flows lead to the creation of IOUs, which inflate the money mass. With the analogy inflation ~ dissipation, you get a sense that the more intense the market-induced cash flow fluctuations, the higher the inflation rate of the money mass, through the creation of money-as-tradeable-credit.

But none of this is mainstream economics, though there is an active community around “modern monetary theory”, “chartalism” or “monetary circuit theory” working on similar ideas.

But “Dynamic Stochastic General Equilibrium” is not going to cut it.

Anyway, maybe I could do a guest post on your blog about these things… It’s all very inchoate as you can see.

I also think the parallels between these economic flow networks and the matter/energy flow networks of ecosystems and climates are important. Such dynamical systems are a relatively unexplored area of mathematical physics – it’s just too difficult to say anything general about them!

Best,
Miguel

Miguel allowed me to post this exchange, noting that he could fill in gaps or moderate excessive claims in the comments. It would be nice if together we could all figure out how to take his thoughts and make them a bit more precise.

In week309 I plan to give an explanation of the Lotka-Volterra equation, based on work Graham Jones has done here:

Quantitative ecology, Azimuth Project.
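In the meantime, here is a minimal Python sketch of the kind of chaos Miguel alludes to, using the logistic map at parameter r = 4 instead of a full predator-prey model. Two orbits starting a hair apart quickly become completely different:

    def logistic_orbit(x0, r=4.0, steps=60):
        """Iterate the logistic map x -> r*x*(1-x)."""
        xs = [x0]
        for _ in range(steps):
            xs.append(r * xs[-1] * (1.0 - xs[-1]))
        return xs

    # Sensitive dependence on initial conditions: the two orbits agree at first,
    # then drift apart and eventually bear no resemblance to each other.
    a = logistic_orbit(0.200000)
    b = logistic_orbit(0.200001)
    for n in (0, 10, 20, 30, 40):
        print(n, round(a[n], 4), round(b[n], 4))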

I’m also dying to talk about flow networks in ecology, so it’s nice to hear that Miguel has been thinking about them in economics.

But here’s a basic question: in what sense do economists model the economy using the Clausius-Clapeyron equation? Is the idea that we can take this equation and use it to model economic equilibrium, somehow? How, exactly?


Algorithmic Thermodynamics (Part 2)

6 January, 2011

Here are some notes for a talk tomorrow at the Centre for Quantum Technologies. You can think of this as a kind of followup to my first post on this subject.

Introduction

The idea of entropy arose in thermodynamics, and its connection to probability theory is fundamental to statistical mechanics. Its connection to information was developed later by Shannon. We now think of entropy and information as two sides of the same coin.

But there’s another concept, called algorithmic information, which was developed in work on logic and computer science. This concept is related to Shannon’s earlier notion of information, but it also seems rather different.

For example, Shannon’s ideas let us compute the information of a probability distribution on bit strings. So if we detect a radio signal from an extraterrestrial life form, that sends us a string of a dozen 0’s and 1’s each day, and it seems the string is chosen randomly according to some probability distribution, we can calculate the information per message. But if I just show you a single bit string, like this:

011101100001

it makes no sense to ask what its information is. On the other hand, we can define its algorithmic information.

Nonetheless, my goal here is to show you that algorithmic information can really be seen as a special case of the old probabilistic concept of information. This lets us take all the familiar tools from statistical mechanics and thermodynamics, and apply them to algorithmic information theory. Mike Stay and I started to do this here:

• John Baez and Mike Stay, Algorithmic thermodynamics.

Unfortunately, I’ll only have time to sketch a bit of what we did!

Algorithmic information – first definition

Here’s a definition of algorithmic information. Later we’ll see a better one that’s almost equivalent.

Fix a programming language where a program is a finite bit string and its output, if it halts, is again a finite bit string. Then the algorithmic information of a finite bit string is the length of the shortest program that has that bit string as output.

So, for example, the algorithmic information of a string of a trillion 0’s is low, because you can write a short program that prints this out. On the other hand, the algorithmic information of a long “random” string of bits will be high, because the shortest program to print it out will be a program that says “print out this string: 01011100101100…” — so the program is approximately as long as the string itself: slightly longer, but only by a fixed amount.

Of course you may ask what “random” means here. I haven’t defined it! But the point is, people have used these ideas to give a definition of what it means for a string of bits to be random. There are different ways to do it. For example: a bit string is ε-random if the shortest program that prints it out is at least ε times as long as the string itself.

You may also wonder: “Doesn’t the definition of algorithmic information depend on the details of our programming language?” And the answer is, “Yes, but not very much.” More precisely, for any two universal programming languages, there’s a constant K such that for any finite bit string, the algorithmic information of that string will differ by at most K depending on which language we use.

Ordinary information

Let’s see how algorithmic information compares to the usual concept of information introduced by Shannon. The usual concept works like this:

If an event of probability p occurs, we define the information gained by learning this event has occurred to be - \log p. We take the logarithm because probabilities of independent events multiply, and we want information to add. The minus sign makes the information positive. We can use any base we want for the logarithm: physicists like e, while computer scientists favor 2.

If there’s a set X of possible outcomes, where the outcome x \in X occurs with probability p_x, then the average or expected amount of information we gain by learning the outcome is

S(p) = - \sum_{x \in X} p_x \log p_x

We call this the entropy of the probability distribution p.
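In code, this definition is a one-liner. Here is a small Python version, using base 2 so the answers come out in bits:

    import math

    def entropy(p, base=2):
        """Shannon entropy -sum_x p_x log p_x, skipping outcomes of probability zero."""
        return -sum(px * math.log(px, base) for px in p if px > 0)

    print(entropy([0.5, 0.5]))      # a fair coin: 1 bit
    print(entropy([1.0]))           # a certain outcome: 0 bits
    print(entropy([0.25] * 4))      # four equally likely outcomes: 2 bits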

Now, ordinary information doesn’t look very much like algorithmic information! But there’s a second definition of algorithmic information, almost equivalent to the first one, that makes the relation clearer.

Algorithmic information – second definition

First, a minor shift in viewpoint. Before we defined algorithmic information for a finite bit string. Now let’s define it for a natural number. This change in viewpoint is no big deal, since we can set up a one-to-one correspondence between natural numbers and finite bit strings that’s easy to compute.

We’ll still think of our programs as finite bit strings. Not every such string gives a program that halts. We also may demand that bit strings have special features to count as programs. For example, we may demand that programs end in the string 11111, just like a Fortran program must end with the word END.

Let X be the set of programs that halt. Without loss of generality, we can assume that X is prefix-free. This means that if x \in X, no other bit string starting with the string x is also in X. For example, if the string 11111 marks the end of a program, and is only allowed to show up at the end of the program, X will be prefix-free.

Let V(x) be the length of the program x \in X. It’s easy to see that if X is prefix-free,

\sum_{x \in X} 2^{-V(x)} \le 1

Making this sum finite is the reason we want the prefix-free condition — you’ll see exactly why in a minute.
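Here is a tiny Python check of this inequality for a made-up prefix-free set of bit strings; it’s not a real programming language, just an illustration of the sum:

    def is_prefix_free(strings):
        """Check that no string in the collection is a proper prefix of another."""
        return not any(a != b and b.startswith(a) for a in strings for b in strings)

    X = ["0", "10", "110", "1110"]            # a made-up prefix-free set of "programs"
    print(is_prefix_free(X))                  # expect True
    print(sum(2.0 ** -len(x) for x in X))     # the sum above: 15/16 <= 1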

Let N(x) be the output of the program x \in X. Remember, we now think of the output as a natural number. So, we have functions

V, N : X \to \mathbb{N}

We define the algorithmic information of a number n \in \mathbb{N} to be

S(n) = - \log \sum_{x \in X, N(x) = n} 2^{-V(x)}

So: we do a sum over all programs x having the number n as output. We sum up 2^{-V(x)}, where V(x) is the length of the program x. The sum converges because

\sum_{x \in X} 2^{-V(x)} \le 1

That’s where the prefix-free condition comes into play.
Finally, we take minus the logarithm of this sum.

This seems a bit complicated! But suppose there’s just one program x having the number n as output. Then the sum consists of one term, the logarithm cancels out the exponential, and the minus signs cancel too, so we get

S(n) = V(x)

This matches our first definition: the algorithmic information of a number is the length of the shortest program having that number as output.

Of course in this case the shortest program is the only program having that output. If we have more than one program with n as output, the second definition of algorithmic entropy gives a smaller answer than the first definition:

S_{\mathrm{second}}(n) \le S_{\mathrm{first}}(n)

However, there’s a famous theorem, proved by Leonid Levin in 1974, that says the new definition is not very different from the old one. The difference is bounded by a constant. More precisely: for any universal prefix-free programming language, there’s a constant K such that for every n \in \mathbb{N},

S_{\mathrm{second}}(n) \ge S_{\mathrm{first}}(n) - K

Since the algorithmic information depends on our programming language, it was only well-defined within an additive constant in the first place. So, switching from the first definition to the second one doesn’t significantly change the concept. But, it makes the relation to ordinary information a lot easier to see!

From now on we’ll use the second definition.

Algorithmic information versus ordinary information

To relate algorithmic information to ordinary information, we need to get logarithms and probabilities into the game. We see a logarithm here:

S(n) = - \log \sum_{x \in X, N(x) = n} 2^{-V(x)}

but it’s in a funny place, outside the sum — and where are the probabilities?

To solve both these puzzles, let’s define a probability distribution p on the set of natural numbers by

p_n = \frac{1}{Z} \sum_{x \in X, N(x) = n} 2^{-V(x)}

Here Z is a normalizing constant, to make the probabilities sum to 1. Taking

Z = \sum_{x \in X} 2^{-V(x)}

ensures that

\sum_{n \in \mathbb{N}} p_n = 1

Now, notice that

S(n) = - \log p_n \; - \; \log Z

So, up to an additive constant, the algorithmic entropy of the number n is -\log p_n. But -\log p_n is just the information gained upon learning the number n, if we started out knowing only that n was chosen randomly according to the probability distribution p.

What’s the meaning of the probability distribution p? Simple: p_n is the probability that the number n is the output of a randomly chosen program… where ‘randomly chosen’ means chosen according to a certain specific rule. The rule is that increasing the length of a program by 1 makes it half as probable. In other words, the probability q_x of the program x is

q_x = \frac{1}{Z} 2^{-V(x)}

where Z is the normalization factor chosen to make these probabilities sum to 1.
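To see how the pieces fit together, here is a toy Python computation over a completely made-up finite “programming language”: a handful of hypothetical programs, each with a hypothetical output. It checks numerically that S(n) equals -\log p_n - \log Z for this toy setup:

    import math

    # A made-up prefix-free "language": program (bit string) -> output n.
    # Both the programs and their outputs are hypothetical, purely for illustration.
    programs = {"00": 5, "010": 5, "011": 7, "10": 2, "110": 7, "1110": 2}

    V = {x: len(x) for x in programs}               # V(x): length of program x
    Z = sum(2.0 ** -V[x] for x in programs)         # normalizing constant

    def p(n):
        """Probability that a length-weighted random program outputs n."""
        return sum(2.0 ** -V[x] for x, out in programs.items() if out == n) / Z

    def S(n):
        """Algorithmic information of n (second definition), in bits."""
        return -math.log2(sum(2.0 ** -V[x] for x, out in programs.items() if out == n))

    for n in (2, 5, 7):
        print(n, S(n), -math.log2(p(n)) - math.log2(Z))   # the two columns should agree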

So, the relation between algorithmic information and ordinary information is this:

The algorithmic information of a number is the information gained by learning that number, if beforehand we only knew that it was the output of a randomly chosen program.

Algorithmic thermodynamics

Why should the probability of the program x \in X be defined as

q_x = \frac{1}{Z} 2^{-V(x)} ?

There is no reason we need to use this probability distribution. However, there’s something good about it. It’s an example of a Gibbs state, meaning a probability distribution that maximizes entropy subject to a constraint on the expected value of some observable. Nature likes to maximize entropy, so Gibbs states are fundamental to statistical mechanics. So is the quantity Z: it’s called the partition function.

The idea works like this: suppose we want to maximize the entropy of a probability distribution p on the set of programs, subject to a constraint on the expected value of the length of the program. Then the answer is

p_x = \frac{1}{Z} e^{-\gamma V(x)}

where \gamma is some number, and the partition function Z ensures that the probabilities sum to 1:

Z = \sum_{x \in X} e^{- \gamma V(x)}

So, p equals our previous probability distribution q when

\gamma = \ln 2

However, it is also interesting to consider other values of \gamma, and in fact Tadaki has already done so:

• K. Tadaki, A generalization of Chaitin’s halting probability and halting self-similar sets, Hokkaido Math. J. 31 (2002), 219–253.

The partition function converges for \gamma \ge \ln 2 but diverges otherwise.

More generally, we can look for the probability distribution that maximizes entropy subject to constraints on the expected value of several observables. For example, let:

E(x) = the logarithm of the runtime of the program x

V(x) = the length of the program x

N(x) = the output of the program x

Then the probability distribution that maximizes entropy subject to constraints on the expected values of these three quantities is

p_x = \frac{1}{Z} e^{-\beta E(x) - \gamma V(x) - \delta N(x)}

where now the partition function is

Z = \sum_{x \in X} e^{-\beta E(x) - \gamma V(x) - \delta N(x)}

We’ve chosen our notation to remind experts on statistical mechanics that they’ve seen formulas like this before. The exact same formulas describe a piston full of gas in thermal equilibrium! From a formal, purely mathematical viewpoint:

E is analogous to the internal energy of the gas, and \beta is analogous to 1/T where T is the temperature in units where Boltzmann’s constant is 1.

V is analogous to the volume of the gas, and \gamma is analogous to P/T where P is the pressure.

N is analogous to the number of molecules in the gas, and \delta is analogous to -\mu/T where \mu is the chemical potential.
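To make these formulas concrete, here is a toy Python computation of the three-parameter Gibbs state over a handful of hypothetical programs. The (E, V, N) values and the choices of \beta, \gamma, \delta are entirely made up; a real computation would sum over all halting programs of a universal prefix-free language:

    import math

    # Hypothetical data: (E, V, N) = (log runtime, length, output) for some toy "programs".
    progs = [(0.0, 2, 5), (1.0, 3, 5), (2.0, 3, 7), (0.5, 2, 2), (3.0, 3, 7), (1.5, 4, 2)]

    beta, gamma, delta = 0.3, math.log(2), 0.01       # arbitrary values of the parameters

    weights = [math.exp(-beta * E - gamma * V - delta * N) for E, V, N in progs]
    Z = sum(weights)                                  # partition function
    p = [w / Z for w in weights]                      # the Gibbs state

    expected = [sum(pi * obs[i] for pi, obs in zip(p, progs)) for i in range(3)]
    S = -sum(pi * math.log(pi) for pi in p)           # entropy of the Gibbs state

    print("Z =", Z)
    print("<E>, <V>, <N> =", expected)
    print("S =", S)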

The analogy here is quite arbitrary. I’m not saying that the length of a program is profoundly similar to the volume of a cylinder of gas; we could have chosen the letter E to stand for the length of a program and everything would still work.

But the analogy works. In other words: now we can take a lot of familiar facts about thermodynamics and instantly translate them into analogous facts about algorithmic information theory! For example, define the entropy of the probability distribution p in the usual way:

S(p) = - \sum_{x \in X} p_x \ln p_x

We can think of this as a function of either T, P, and \mu or the expected values of E, V and N. In thermodynamics we learn lots of equations like this:

\left. \frac{\partial S}{\partial E} \right|_{V,N} = \frac{1}{T}

where by standard abuse of notation we’re using E, V, N to stand for their expected values. And, all these equations remain true in algorithmic thermodynamics!

The interesting part is figuring out what all these equations really mean in the context of algorithmic thermodynamics. For a start on that, try our paper.

For example, we call T algorithmic temperature. If you allow programs to run longer, more of them will halt and give an answer. Thanks to the equation

\left. \frac{\partial S}{\partial E} \right|_{V,N} = \frac{1}{T}

the algorithmic temperature is roughly the number of times you have to double the runtime in order to double the number of programs that satisfy the constraints on length and output.

We also consider the algorithmic analogues of thermodynamic cycles, familiar from old work on steam engines. Charles Babbage described a computer powered by a steam engine; we describe a heat engine powered by programs! The significance of this line of thinking remains fairly mysterious. However, I’m hoping that it will help point the way toward a further synthesis of algorithmic information theory and thermodynamics.


Welcome to the Greenhouse

5 January, 2011

Happy New Year!

I just got back from Hanoi. I think I love it. It’s a fascinating city, full of street vendors and craftsmen and twisting back alleys and deep ‘tunnel houses’ and Buddhist temples and more. It’s over a thousand years old, layered with history. The Vietnamese have taken culture from all the more powerful countries that tried to invade them — the Chinese, the French, the Americans — and made it their own. You can see it all here.

At first I hated Hanoi. In the Old Quarter, the roads are packed with motorbikes zipping in all directions — and the sidewalks are all used either as motorbike parking lots, or restaurants with people sitting on little plastic chairs, so you pretty much need to walk in the road a lot of the time. Traffic lights are few and rarely heeded.

It seems like a pedestrian’s worst nightmare. Try walking across this:


But on my first night there, I ran into a philosopher friend from Singapore, an expert on Moore’s paradox: John Williams. He was sitting in a bar, looking out the door, watching the world go by. Strange coincidence!

So we had a beer, and I relaxed a bit, and I noticed that the traffic is actually governed by a set of unwritten rules that you can learn by watching. Little old ladies or gaggles of laughing teenagers can stroll across the street, motorbikes whizzing past them in both directions, and not get hurt! And so can you, once you learn what to do.

And then the fun starts. There’s way, way, way too much to tell you about, so I won’t even try now. Maybe later, probably on my diary.

I think Jeremy Kressman said it right: “You don’t just visit Hanoi. Hanoi visits you.” The picture above is from his blog. I’ll replace it with one of my own as soon as I get around to downloading my photos.

But as I’ve already hinted, this blog entry is not really about Hanoi. It’s about a new collection of SF stories, all tackling the theme of global warming:

• Gordon Van Gelder, editor, Welcome to the Greenhouse, O/R Books, 2011.

It includes stories by Brian W. Aldiss, Jeff Carlson, Judith Moffett, Matthew Hughes, Gregory Benford, Michael Alexander, Bruce Sterling, Joseph Green, Pat MacEwen, Alan Dean Foster, David Prill, George Guthridge, Paul Di Filippo, Chris Lawson, Ray Vukcevich and M. J. Locke.

It’s long past time for SF writers to start envisioning the world we seem to be heading towards. Were some of these stories published a while ago? Have you read any of them? Are they any good?

This item was pointed out to me by David Roberts. And by the way: in case you missed it, check out the global warming game that Giampiero Campa brought to our attention:

Fate of the World, website.

It’s not free, but Giampiero says it’s fun, and I trust him. He wrote:

But don’t waste your time, winning is plain impossible!

but then later:

I take it back, I have finally won today, which basically means that I got to the year 2120 with less than 3 degrees increment without destroying civilization. I mean, Latin America has to be rebuilt and there’s war raging in China, but other than that, it’s an excellent result :-)


This Week’s Finds (Week 308)

24 December, 2010

Last week we met the El Niño-Southern Oscillation, or ENSO. I like to explain things as I learn about them. So, often I look back and find my explanations naive. But this time it took less than a week!

What did it was reading this:

• J. D. Neelin, D. S. Battisti, A. C. Hirst et al., ENSO theory, J. Geophys. Res. 103 (1998), 14261-14290.

I wouldn’t recommend this to the faint of heart. It’s a bit terrifying. It’s well-written, but it tells the long and tangled tale of how theories of the ENSO phenomenon evolved from 1969 to 1998 — a period that saw much progress, but did not end with a neat, clean understanding of this phenomenon. It’s packed with hundreds of references, and sprinkled with somewhat intimidating remarks like:

The Fourier-decomposed longitude and time dependence of these eigensolutions obey dispersion relations familiar to every physical oceanographer…

Nonetheless I found it fascinating — so, I’ll pick off one small idea and explain it now.

As I’m sure you’ve heard, climate science involves some extremely complicated models: some of the most complex known to science. But it also involves models of lesser complexity, like the "box model" explained by Nathan Urban in "week304". And it also involves some extremely simple models that are designed to isolate some interesting phenomena and display them in their Platonic ideal form, stripped of all distractions.

Because of their simplicity, these models are great for mathematicians to think about: we can even prove theorems about them! And simplicity goes along with generality, so the simplest models of all tend to be applicable — in a rough way — not just to the Earth’s climate, but to a vast number of systems. They are, one might say, general possibilities of behavior.

Of course, we can’t expect simple models to describe complicated real-world situations very accurately. That’s not what they’re good for. So, even calling them "models" could be a bit misleading. It might be better to call them "patterns": patterns that can help organize our thinking about complex systems.

There’s a nice mathematical theory of these patterns… indeed, several such theories. But instead of taking a top-down approach, which gets a bit abstract, I’d rather tell you about some examples, which I can illustrate using pictures. But I didn’t make these pictures. They were created by Tim van Beek as part of the Azimuth Code Project. The Azimuth Code Project is a way for programmers to help save the planet. More about that later, at the end of this article.

As we saw last time, the ENSO cycle relies crucially on interactions between the ocean and atmosphere. In some models, we can artificially adjust the strength of these interactions, and we find something interesting. If we set the interaction strength to less than a certain amount, the Pacific Ocean will settle down to a stable equilibrium state. But when we turn it up past that point, we instead see periodic oscillations! Instead of a stable equilibrium state where nothing happens, we have a stable cycle.

This pattern, or at least one pattern of this sort, is called the "Hopf bifurcation". There are various differential equations that exhibit a Hopf bifurcation, but here’s my favorite:

\frac{d x}{d t} =  -y + \beta  x - x (x^2 + y^2)

\frac{d y}{d t} =  \; x + \beta  y - y (x^2 + y^2)

Here x and y are functions of time, t, so these equations describe a point moving around on the plane. It’s easier to see what’s going on in polar coordinates:

\frac{d r}{d t} = \beta r - r^3

\frac{d \theta}{d t} = 1

The angle \theta goes around at a constant rate while the radius r does something more interesting. When \beta \le 0, you can see that any solution spirals in towards the origin! Or, if it starts at the origin, it stays there. So, we call the origin a "stable equilibrium".

Here’s a typical solution for \beta = -1/4, drawn as a curve in the x y plane. As time passes, the solution spirals in towards the origin:

The equations are more interesting for \beta > 0. Then dr/dt = 0 whenever

\beta r - r^3 = 0

This has two solutions, r = 0 and r = \sqrt{\beta}. Since r = 0 is a solution, the origin is still an equilibrium. But now it’s not stable: if r is between 0 and \sqrt{\beta}, we’ll have \beta r - r^3 > 0, so our solution will spiral out, away from the origin and towards the circle r = \sqrt{\beta}. So, we say the origin is an "unstable equilibrium". On the other hand, if r starts out bigger than \sqrt{\beta}, our solution will spiral in towards that circle.
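If you want to check this behaviour numerically, here is a minimal Euler integration of the radial equation dr/dt = \beta r - r^3, in plain Python:

    def final_radius(beta, r0, dt=1e-3, T=40.0):
        """Euler-integrate dr/dt = beta*r - r**3 and return r at time T."""
        r = r0
        for _ in range(int(T / dt)):
            r += dt * (beta * r - r ** 3)
        return r

    # For beta = 1, starting inside or outside the circle r = sqrt(beta) = 1,
    # the solution should end up near r = 1; for beta = -1/4 it should decay towards 0.
    print(final_radius(1.0, 0.1), final_radius(1.0, 2.0))
    print(final_radius(-0.25, 0.5))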

Here’s a picture of two solutions for \beta = 1:

The red solution starts near the origin and spirals out towards the circle r = \sqrt{\beta}. The green solution starts outside this circle and spirals in towards it, soon becoming indistinguishable from the circle itself. So, this equation describes a system where x and y quickly settle down to a periodic oscillating behavior.

Since solutions that start anywhere near the circle r = \sqrt{\beta} will keep going round and round getting closer to this circle, it’s called a "stable limit cycle".

This is what the Hopf bifurcation is all about! We’ve got a dynamical system that depends on a parameter, and as we change this parameter, a stable fixed point become unstable, and a stable limit cycle forms around it.

This isn’t quite a mathematical definition yet, but it’s close enough for now. If you want something a bit more precise, try:

• Yuri A. Kuznetsov, Andronov-Hopf bifurcation, Scholarpedia, 2006.

Now, clearly the Hopf bifurcation idea is too simple for describing real-world weather cycles like the ENSO. In the Hopf bifurcation, our system settles down into an orbit very close to the limit cycle, which is perfectly periodic. The ENSO cycle is only roughly periodic:



The time between El Niños varies between 3 and 7 years, averaging around 4 years. There can also be two El Niños without an intervening La Niña, or vice versa. One can try to explain this in various ways.

One very simple, general idea is to add random noise to whatever differential equation we were using to model the ENSO cycle, obtaining a so-called stochastic differential equation: a differential equation describing a random process. Richard Kleeman discusses this idea in Tim Palmer’s book:

• Richard Kleeman, Stochastic theories for the irregularity of ENSO, in Stochastic Physics and Climate Modelling, eds. Tim Palmer and Paul Williams, Cambridge U. Press, Cambridge, 2010, pp. 248-265.

Kleeman mentions three general theories for the irregularity of the ENSO. They all involve the idea of separating the weather into "modes" — roughly speaking, different ways that things can oscillate. Some modes are slow and some are fast. The ENSO cycle is defined by the behavior of certain slow modes, but of course these interact with the fast modes. So, there are various options:

  1. Perhaps the relevant slow modes interact with each other in a chaotic way.
  2. Perhaps the relevant slow modes interact with each other in a non-chaotic way, but also interact with chaotic fast modes, which inject noise into what would otherwise be simple periodic behavior.
  3. Perhaps the relevant slow modes interact with each other in a chaotic way, and also interact in a significant way with chaotic fast modes.

Kleeman reviews work on the first option but focuses on the second. The third option is the most complicated, so the pessimist in me suspects that’s what’s really going on. Still, it’s good to start by studying simple models!

How can we get a simple model that illustrates the second option? Simple: take the model we just saw, and add some noise! This idea is discussed in detail here:

• H. A. Dijkstra, L. M. Frankcombe and A. S. von der Heydt, The Atlantic Multidecadal Oscillation: a stochastic dynamical systems view, in Stochastic Physics and Climate Modelling, eds. Tim Palmer and Paul Williams, Cambridge U. Press, Cambridge, 2010, pp. 287-306.

This paper is not about the ENSO cycle, but another one, which is often nicknamed the AMO. I would love to talk about it — but not now. Let me just show you the equations for a Hopf bifurcation with noise:

\frac{d x}{d t} =  -y + \beta  x - x (x^2 + y^2) + \lambda \frac{d W_1}{d t}

\frac{d y}{d t} =  \; x + \beta  y - y (x^2 + y^2) + \lambda \frac{d W_2}{d t}

They’re the same as before, but with some new extra terms at the end: that’s the noise.

This could easily get a bit technical, but I don’t want it to. So, I’ll just say some buzzwords and let you click on the links if you want more detail. W_1 and W_2 are two independent Wiener processes, so they describe Brownian motion in the x and y coordinates. When we differentiate a Wiener process we get white noise. So, we’re adding some amount of white noise to the equations we had before, and the number \lambda says precisely how much. That means that x and y are no longer specific functions of time: they’re random functions, also known as stochastic processes.
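If you want to generate pictures like the ones below yourself, here is a minimal Euler-Maruyama sketch in Python with numpy. (As mentioned at the end of this post, the original pictures were made with a simple Java implementation of the Euler scheme and plotted with gnuplot; this is just the same idea in another language.)

    import numpy as np

    def hopf_with_noise(beta, lam, x0=0.1, y0=0.0, dt=1e-3, T=50.0, seed=0):
        """Euler-Maruyama integration of the Hopf system with white noise."""
        rng = np.random.default_rng(seed)
        n = int(T / dt)
        xs, ys = np.empty(n), np.empty(n)
        x, y = x0, y0
        for i in range(n):
            dW1, dW2 = rng.normal(0.0, np.sqrt(dt), size=2)   # Wiener increments
            dx = (-y + beta * x - x * (x**2 + y**2)) * dt + lam * dW1
            dy = ( x + beta * y - y * (x**2 + y**2)) * dt + lam * dW2
            x, y = x + dx, y + dy
            xs[i], ys[i] = x, y
        return xs, ys

    # lam = 0 recovers the deterministic equations; beta = -1/4 with lam = 0.1
    # gives the noise-driven irregular oscillations discussed below.
    xs, ys = hopf_with_noise(beta=-0.25, lam=0.1)
    print(xs[-1], ys[-1])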

If this were a math course, I’d feel obliged to precisely define all the terms I just dropped on you. But it’s not, so I’ll just show you some pictures!

If \beta = 1 and \lambda = 0.1, here are some typical solutions:

They look similar to the solutions we saw before for \beta = 1, but now they have some random wiggles added on.

(You may be wondering what this picture really shows. After all, I said the solutions were random functions of time, not specific functions. But it’s tough to draw a "random function". So, to get one of the curves shown above, what Tim did is randomly choose a function according to some rule for computing probabilities, and draw that.)

If we turn up the noise, our solutions get more wiggly. If \beta = 1 and \lambda = 0.3, they look like this:

In these examples, \beta > 0, so we would have a limit cycle if there weren’t any noise — and you can see that even with noise, the solutions approximately tend towards the limit cycle. So, we can use an equation of this sort to describe systems that oscillate, but in a somewhat random way.

But now comes the really interesting part! Suppose \beta \le 0. Then we’ve seen that without noise, there’s no limit cycle: any solution quickly spirals in towards the origin. But with noise, something a bit different happens. If \beta = -1/4 and \lambda = 0.1 we get a picture like this:

We get irregular oscillations even though there’s no limit cycle! Roughly speaking, the noise keeps knocking the solution away from the stable fixed point at x = y = 0, so it keeps going round and round, but in an irregular way. It may seem to be spiralling in, but if we waited a bit longer it would get kicked out again.

This is a lot easier to see if we plot just x as a function of t. Then we can run our solution for a longer time without the picture becoming a horrible mess:

If you compare this with the ENSO cycle, you’ll see they look roughly similar:



That’s nice. Of course it doesn’t prove that a model based on a Hopf bifurcation plus noise is "right" — indeed, we don’t really have a model until we’ve chosen variables for both x and y. But it suggests that a model of this sort could be worth studying.

If you want to see how the Hopf bifurcation plus noise is applied to climate cycles, I suggest starting with the paper by Dijkstra, Frankcombe and von der Heydt. If you want to see it applied to the El Niño-Southern Oscillation, start with Section 6.3 of the ENSO theory paper, and then dig into the many references. Here it seems a model with \beta > 0 may work best. If so, noise is not required to keep the ENSO cycle going, but it makes the cycle irregular.

To a mathematician like me, what’s really interesting is how the addition of noise "smooths out" the Hopf bifurcation. When there’s no noise, the qualitative behavior of solutions jumps drastically at \beta = 0. For \beta \le 0 we have a stable equilibrium, while for \beta > 0 we have a stable limit cycle. But in the presence of noise, we get irregular cycles not only for \beta > 0 but also \beta \le 0. This is not really surprising, but it suggests a bunch of questions. Such as: what are some quantities we can use to describe the behavior of "irregular cycles", and how do these quantities change as a function of \lambda and \beta?

You’ll see some answers to this question in Dijkstra, Frankcombe and von der Heydt’s paper. However, if you’re a mathematician, you’ll instantly think of dozens more questions — like, how can I prove what these guys are saying?

If you make any progress, let me know. If you don’t know where to start, you might try the Dijkstra et al. paper, and then learn a bit about the Hopf bifurcation, stochastic processes, and stochastic differential equations:

• John Guckenheimer and Philip Holmes, Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields, Springer, Berlin, 1983.

• Zdzisław Brzeźniak and Tomasz Zastawniak, Basic Stochastic Processes: A Course Through Exercises, Springer, Berlin, 1999.

• Bernt Øksendal, Stochastic Differential Equations: An Introduction with Applications, 6th edition, Springer, Berlin, 2003.

Now, about the Azimuth Code Project. Tim van Beek started it just recently, but the Azimuth Project seems to be attracting people who can program, so I have high hopes for it. Tim wrote:

My main objectives in starting the Azimuth Code Project were:

• to have a central repository for the code used for simulations or data analysis on the Azimuth Project,

• to have an online free access repository and make all software open source, to enable anyone to use the software, for example to reproduce the results on the Azimuth Project. Also to show by example that this can and should be done for every scientific publication.

Of less importance is:

• to implement the software with an eye to software engineering principles.

This is less important because the world of numerical high performance computing differs significantly from the rest of the software industry: it has special requirements, and it is not at all clear which of the paradigms useful elsewhere will turn out to be useful here. Nevertheless I’m confident that parts of the scientific community will profit from a closer interaction with software engineering.

So, if you like programming, I hope you’ll chat with us and consider joining in! Our next projects involve limit cycles in predator-prey models, stochastic resonance in some theories of the ice ages, and delay differential equations in ENSO models.

And in case you’re wondering, the code used for the pictures above is a simple implementation in Java of the Euler scheme, using random number generating algorithms from Numerical Recipes. Pictures were generated with gnuplot.


There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies. – C.A.R. Hoare


Adapting to a Hotter Earth

23 December, 2010

Over on the Azimuth Forum, Staffan Liljegren pointed out a good article in The Economist:

Facing the consequences, The Economist, 25 November 2010.

I’m going to quote the beginning, and hope that lures you into reading the rest. This is important stuff!

Adapting to climate change

Facing the consequences

Global action is not going to stop climate change. The world needs to look harder at how to live with it.

ON NOVEMBER 29th representatives of countries from around the world will gather in Cancún, Mexico, for the first high-level climate talks since those in Copenhagen last December. The organisers hope the meeting in Mexico, unlike the one in Denmark, will be unshowy but solid, leading to decisions about finance, forestry and technology transfer that will leave the world better placed to do something about global warming. Incremental progress is possible, but continued deadlock is likelier. What is out of reach, as at Copenhagen, is agreement on a plausible programme for keeping climate change in check.

The world warmed by about 0.7°C in the 20th century. Every year in this century has been warmer than all but one in the last (1998, since you ask). If carbon-dioxide levels were magically to stabilise where they are now (almost 390 parts per million, 40% more than before the industrial revolution) the world would probably warm by a further half a degree or so as the ocean, which is slow to change its temperature, caught up. But CO2 levels continue to rise. Despite 20 years of climate negotiation, the world is still on an emissions trajectory that fits pretty easily into the “business as usual” scenarios drawn up by the Intergovernmental Panel on Climate Change (IPCC).

The Copenhagen accord, a non-binding document which was the best that could be salvaged from the summit, talks of trying to keep the world less than 2°C warmer than in pre-industrial times—a level that is rather arbitrarily seen as the threshold for danger. Many countries have, in signing the accord, promised actions that will or should reduce carbon emissions. In the World Energy Outlook, recently published by the International Energy Agency, an assessment of these promises forms the basis of a “new policies scenario” for the next 25 years (see chart 1). According to the IEA, the scenario puts the world on course to warm by 3.5°C by 2100. For comparison, the difference in global mean temperature between the pre-industrial age and the ice ages was about 6°C.

The IEA also looked at what it might take to hit a two-degree target; the answer, says the agency’s chief economist, Fatih Birol, is “too good to be believed”. Every signatory of the Copenhagen accord would have to hit the top of its range of commitments. That would provide a worldwide rate of decarbonisation (reduction in carbon emitted per unit of GDP) twice as large in the decade to come as in the one just past: 2.8% a year, not 1.4%. Mr Birol notes that the highest annual rate on record is 2.5%, in the wake of the first oil shock.

But for the two-degree scenario 2.8% is just the beginning; from 2020 to 2035 the rate of decarbonisation needs to double again, to 5.5%. Though they are unwilling to say it in public, the sheer improbability of such success has led many climate scientists, campaigners and policymakers to conclude that, in the words of Bob Watson, once the head of the IPCC and now the chief scientist at Britain’s Department for Environment, Food and Rural Affairs, “Two degrees is a wishful dream.”

The fight to limit global warming to easily tolerated levels is thus over. Analysts who have long worked on adaptation to climate change—finding ways to live with scarcer water, higher peak temperatures, higher sea levels and weather patterns at odds with those under which today’s settled patterns of farming developed—are starting to see their day in the uncomfortably hot sun. That such measures cannot protect everyone from all harm that climate change may bring does not mean that they should be ignored. On the contrary, they are sorely needed.

For more see:

World Energy Outlook 2010, Azimuth Project.

