## This Week’s Finds (Week 308)

Last week we met the El Niño-Southern Oscillation, or ENSO. I like to explain things as I learn about them. So, often I look back and find my explanations naive. But this time it took less than a week!

What did it was reading this:

• J. D. Neelin, D. S. Battisti, A. C. Hirst et al., ENSO theory, J. Geophys. Res. 103 (1998), 14261-14290.

I wouldn’t recommend this to the faint of heart. It’s a bit terrifying. It’s well-written, but it tells the long and tangled tale of how theories of the ENSO phenomenon evolved from 1969 to 1998 — a period that saw much progress, but did not end with a neat, clean understanding of this phenomenon. It’s packed with hundreds of references, and sprinkled with somewhat intimidating remarks like:

The Fourier-decomposed longitude and time dependence of these eigensolutions obey dispersion relations familiar to every physical oceanographer…

Nonetheless I found it fascinating — so, I’ll pick off one small idea and explain it now.

As I’m sure you’ve heard, climate science involves some extremely complicated models: some of the most complex known to science. But it also involves models of lesser complexity, like the "box model" explained by Nathan Urban in "week304". And it also involves some extremely simple models that are designed to isolate some interesting phenomena and display them in their Platonic ideal form, stripped of all distractions.

Because of their simplicity, these models are great for mathematicians to think about: we can even prove theorems about them! And simplicity goes along with generality, so the simplest models of all tend to be applicable — in a rough way — not just to the Earth’s climate, but to a vast number of systems. They are, one might say, general possibilities of behavior.

Of course, we can’t expect simple models to describe complicated real-world situations very accurately. That’s not what they’re good for. So, even calling them "models" could be a bit misleading. It might be better to call them "patterns": patterns that can help organize our thinking about complex systems.

There’s a nice mathematical theory of these patterns… indeed, several such theories. But instead of taking a top-down approach, which gets a bit abstract, I’d rather tell you about some examples, which I can illustrate using pictures. But I didn’t make these pictures. They were created by Tim van Beek as part of the Azimuth Code Project. The Azimuth Code Project is a way for programmers to help save the planet. More about that later, at the end of this article.

As we saw last time, the ENSO cycle relies crucially on interactions between the ocean and atmosphere. In some models, we can artificially adjust the strength of these interactions, and we find something interesting. If we set the interaction strength to less than a certain amount, the Pacific Ocean will settle down to a stable equilibrium state. But when we turn it up past that point, we instead see periodic oscillations! Instead of a stable equilibrium state where nothing happens, we have a stable cycle.

This pattern, or at least one pattern of this sort, is called the "Hopf bifurcation". There are various differential equations that exhibit a Hopf bifurcation, but here’s my favorite:

$\frac{d x}{d t} = -y + \beta x - x (x^2 + y^2)$

$\frac{d y}{d t} = \; x + \beta y - y (x^2 + y^2)$

Here $x$ and $y$ are functions of time, $t$, so these equations describe a point moving around on the plane. It’s easier to see what’s going on in polar coordinates:

$\frac{d r}{d t} = \beta r - r^3$

$\frac{d \theta}{d t} = 1$

The angle $\theta$ goes around at a constant rate while the radius $r$ does something more interesting. When $\beta \le 0$, you can see that any solution spirals in towards the origin! Or, if it starts at the origin, it stays there. So, we call the origin a "stable equilibrium".

Here’s a typical solution for $\beta = -1/4$, drawn as a curve in the $x y$ plane. As time passes, the solution spirals in towards the origin:

The equations are more interesting for $\beta > 0$. Then $dr/dt = 0$ whenever

$\beta r - r^3 = 0$

This has two solutions, $r = 0$ and $r = \sqrt{\beta}$. Since $r = 0$ is a solution, the origin is still an equilibrium. But now it’s not stable: if $r$ is between $0$ and $\sqrt{\beta}$, we’ll have $\beta r - r^3 > 0$, so our solution will spiral out, away from the origin and towards the circle $r = \sqrt{\beta}$. So, we say the origin is an "unstable equilibrium". On the other hand, if $r$ starts out bigger than $\sqrt{\beta}$, our solution will spiral in towards that circle.

Here’s a picture of two solutions for $\beta = 1$:

The red solution starts near the origin and spirals out towards the circle $r = \sqrt{\beta}$. The green solution starts outside this circle and spirals in towards it, soon becoming indistinguishable from the circle itself. So, this equation describes a system where $x$ and $y$ quickly settle down to a periodic oscillating behavior.

Since solutions that start anywhere near the circle $r = \sqrt{\beta}$ will keep going round and round getting closer to this circle, it’s called a "stable limit cycle".

This is what the Hopf bifurcation is all about! We’ve got a dynamical system that depends on a parameter, and as we change this parameter, a stable fixed point become unstable, and a stable limit cycle forms around it.

This isn’t quite a mathematical definition yet, but it’s close enough for now. If you want something a bit more precise, try:

• Yuri A. Kuznetsov, Andronov-Hopf bifurcation, Scholarpedia, 2006.

Now, clearly the Hopf bifurcation idea is too simple for describing real-world weather cycles like the ENSO. In the Hopf bifurcation, our system settles down into an orbit very close to the limit cycle, which is perfectly periodic. The ENSO cycle is only roughly periodic:

The time between El Niños varies between 3 and 7 years, averaging around 4 years. There can also be two El Niños without an intervening La Niña, or vice versa. One can try to explain this in various ways.

One very simple, general idea to add random noise to whatever differential equation we were using to model the ENSO cycle, obtaining a so-called stochastic differential equation: a differential equation describing a random process. Richard Kleeman discusses this idea in Tim Palmer’s book:

• Richard Kleeman, Stochastic theories for the irregularity of ENSO, in Stochastic Physics and Climate Modelling, eds. Tim Palmer and Paul Williams, Cambridge U. Press, Cambridge, 2010, pp. 248-265.

Kleeman mentions three general theories for the irregularity of the ENSO. They all involve the idea of separating the weather into "modes" — roughly speaking, different ways that things can oscillate. Some modes are slow and some are fast. The ENSO cycle is defined by the behavior of certain slow modes, but of course these interact with the fast modes. So, there are various options:

1. Perhaps the relevant slow modes interact with each other in a chaotic way.
2. Perhaps the relevant slow modes interact with each other in a non-chaotic way, but also interact with chaotic fast modes, which inject noise into what would otherwise be simple periodic behavior.
3. Perhaps the relevant slow modes interact with each other in a chaotic way, and also interact in a significant way with chaotic fast modes.

Kleeman reviews work on the first option but focuses on the second. The third option is the most complicated, so the pessimist in me suspects that’s what’s really going on. Still, it’s good to start by studying simple models!

How can we get a simple model that illustrates the second option? Simple: take the model we just saw, and add some noise! This idea is discussed in detail here:

• H. A. Dijkstra, L. M. Frankcombe and A.S von der Heydt, The Atlantic Multidecadal Oscillation: a stochastic dynamical systems view, in Stochastic Physics and Climate Modelling, eds. Tim Palmer and Paul Williams, Cambridge U. Press, Cambridge, 2010, pp. 287-306.

This paper is not about the ENSO cycle, but another one, which is often nicknamed the AMO. I would love to talk about it — but not now. Let me just show you the equations for a Hopf bifurcation with noise:

$\frac{d x}{d t} = -y + \beta x - x (x^2 + y^2) + \lambda \frac{d W_1}{d t}$

$\frac{d y}{d t} = \; x + \beta y - y (x^2 + y^2) + \lambda \frac{d W_2}{d t}$

They’re the same as before, but with some new extra terms at the end: that’s the noise.

This could easily get a bit technical, but I don’t want it to. So, I’ll just say some buzzwords and let you click on the links if you want more detail. $W_1$ and $W_2$ are two independent Wiener processes, so they describe Brownian motion in the $x$ and $y$ coordinates. When we differentiate a Wiener process we get white noise. So, we’re adding some amount of white noise to the equations we had before, and the number $\lambda$ says precisely how much. That means that $x$ and $y$ are no longer specific functions of time: they’re random functions, also known as stochastic processes.

If this were a math course, I’d feel obliged to precisely define all the terms I just dropped on you. But it’s not, so I’ll just show you some pictures!

If $\beta = 1$ and $\lambda = 0.1$, here are some typical solutions:

They look similar to the solutions we saw before for $\beta = 1$, but now they have some random wiggles added on.

(You may be wondering what this picture really shows. After all, I said the solutions were random functions of time, not specific functions. But it’s tough to draw a "random function". So, to get one of the curves shown above, what Tim did is randomly choose a function according to some rule for computing probabilities, and draw that.)

If we turn up the noise, our solutions get more wiggly. If $\beta = 1$ and $\lambda = 0.3$, they look like this:

In these examples, $\beta > 0$, so we would have a limit cycle if there weren’t any noise — and you can see that even with noise, the solutions approximately tend towards the limit cycle. So, we can use an equation of this sort to describe systems that oscillate, but in a somewhat random way.

But now comes the really interesting part! Suppose $\beta \le 0$. Then we’ve seen that without noise, there’s no limit cycle: any solution quickly spirals in towards the origin. But with noise, something a bit different happens. If $\beta = -1/4$ and $\lambda = 0.1$ we get a picture like this:

We get irregular oscillations even though there’s no limit cycle! Roughly speaking, the noise keeps knocking the solution away from the stable fixed point at $x = y = 0$, so it keeps going round and round, but in an irregular way. It may seem to be spiralling in, but if we waited a bit longer it would get kicked out again.

This is a lot easier to see if we plot just $x$ as a function of $t$. Then we can run our solution for a longer time without the picture becoming a horrible mess:

If you compare this with the ENSO cycle, you’ll see they look roughly similar:

That’s nice. Of course it doesn’t prove that a model based on a Hopf bifurcation plus noise is "right" — indeed, we don’t really have a model until we’ve chosen variables for both $x$ and $y$. But it suggests that a model of this sort could be worth studying.

If you want to see how the Hopf bifurcation plus noise is applied to climate cycles, I suggest starting with the paper by Dijkstra, Frankcombe and von der Heydt. If you want to see it applied to the El Niño-Southern Oscillation, start with Section 6.3 of the ENSO theory paper, and then dig into the many references. Here it seems a model with $\beta > 0$ may work best. If so, noise is not required to keep the ENSO cycle going, but it makes the cycle irregular.

To a mathematician like me, what’s really interesting is how the addition of noise "smooths out" the Hopf bifurcation. When there’s no noise, the qualitative behavior of solutions jumps drastically at $\beta = 0$. For $\beta \le 0$ we have a stable equilibrium, while for $\beta > 0$ we have a stable limit cycle. But in the presence of noise, we get irregular cycles not only for $\beta > 0$ but also $\beta \le 0$. This is not really surprising, but it suggests a bunch of questions. Such as: what are some quantities we can use to describe the behavior of "irregular cycles", and how do these quantities change as a function of $\lambda$ and $\beta$?

You’ll see some answers to this question in Dijkstra, Frankcombe and von der Heydt’s paper. However, if you’re a mathematician, you’ll instantly think of dozens more questions — like, how can I prove what these guys are saying?

If you make any progress, let me know. If you don’t know where to start, you might try the Dijkstra et al. paper, and then learn a bit about the Hopf bifurcation, stochastic processes, and stochastic differential equations:

• John Guckenheimer and Philip Holmes, Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields, Springer, Berlin, 1983.

• Zdzisław Brzeźniak and Tomasz Zastawniak, Basic Stochastic Processes: A Course Through Exercises, Springer, Berlin, 1999.

• Bernt Øksendal, Stochastic Differential Equations: An Introduction with Applications, 6th edition, Springer, Berlin, 2003.

Now, about the Azimuth Code Project. Tim van Beek started it just recently, but the Azimuth Project seems to be attracting people who can program, so I have high hopes for it. Tim wrote:

My main objectives to start the Azimuth Code Project were:

• to have a central repository for the code used for simulations or data analysis on the Azimuth Project,

• to have an online free access repository and make all software open source, to enable anyone to use the software, for example to reproduce the results on the Azimuth Project. Also to show by example that this can and should be done for every scientific publication.

Of less importance is:

• to implement the software with an eye to software engineering principles.

This less important because the world of numerical high performance computing differs significantly from the rest of the software industry: it has special requirements and it is not clear at all which paradigms that are useful for the rest will turn out to be useful here. Nevertheless I’m confident that parts of the scientific community will profit from a closer interaction with software engineering.

So, if you like programming, I hope you’ll chat with us and consider joining in! Our next projects involve limit cycles in predator-prey models, stochastic resonance in some theories of the ice ages, and delay differential equations in ENSO models.

And in case you’re wondering, the code used for the pictures above is a simple implementation in Java of the Euler scheme, using random number generating algorithms from Numerical Recipes. Pictures were generated with gnuplot.

There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies. – C.A.R. Hoare

### 51 Responses to This Week’s Finds (Week 308)

1. John Baez says:

By the way, people reading this blog might prefer the version of This Week’s Finds on my website. You can see “week308” here, and latest edition can always be found here.

The version on my website doesn’t suffer from the “skinny column” format of the blog version — in particular, some of the pictures are bigger than 450 pixels across!

The equations also look better — at least to me, now that I’ve started using jsMath to render TeX on my website. If the math symbols on “week308” look smaller than the other letters, let me know — and let me know what browswer you’re using. To me, using Firefox on Windows, they look just right, but apparently that varies from browser to browser. They’re easy to adjust at my end, but I may not be able to make everyone happy.

On my website, you may see an annoying red box at the top of “week308” saying “no jsMath TeX fonts found”. This should not impede your ability to see the math symbols! The great thing about jsMath is that you don’t need to download any math symbol fonts. But the math symbols will show up faster, and maybe better, if you download the fonts. You can do this by clicking the little grey box saying “jsMath Control Panel” and then doing some (supposedly) obvious things.

• srp says:

Great post and a very worthy software project. Best of luck with it. BTW, this blog’s graphics look fine on Safari.

• some guy dotdotdot says:

needs your home directory, not just “home”!

2. John Baez says:

Tim van Beek notes that Terry Tao has a blog entry that explains Brownian motion in a rigorous way. This is a good way for mathematicians to get started on stochastic differential equations.

3. Arrow says:

I find the idea of using random noise to make a model of something quite absurd. A model should capture causal relations of the modeled phenomenon. A “model” which produces a similar output for completely different reasons then the ones governing the actual phenomenon is a failure.

If one thinks that there is some noise which for some reason is impossible to model then it would make much more sense to filter this noise out and then compare such filtered data to a model without any noise.

• srp says:

I’m not sure your precept is valid in general. If you believe your model has to have exogenous variables and you are trying to fit data where those exogenous variables arrive in an irregular and uncertain way, then you neither want to filter them or try to explain them (which would threaten an infinite regress to another set of exogenous variables).

It is true that introducing noise without a clear understanding of where it’s coming from (no specification of the exogenous variables) is less convincing, especially if it plays a major role in generating the structure of interest, as in some of the pictures above. But in that less-satisfying case, perhaps the results can be interpreted as a candidate pattern that any causal variable would have to match.

• John Baez says:

If a model increases our probability of making successful predictions, it’s a useful tool, regardless of its nature.
Scientists are in love with knowledge, and engaged in an all-out war against ignorance. All is fair in love and war.

Models that incorporate randomness are widely used in physics and engineering. For a brief history, try this.

In “week306” I interviewed Tim Palmer, who is a big proponent of stochastic (i.e., probabilistic) methods in weather and climate modelling. He explained how thanks to the “real butterfly effect”, it’s impossible to make accurate deterministic weather predictions beyond a certain short threshold. Unmeasured wind velocities, temperatures and pressures at short distance scales are examples of the ‘exogenous variables’ srp mentioned. This umeasured information at short distance scales quickly percolates up to affect larger distance scales. But we can model this unmeasured information stochastically, by treating it as random noise.

In short: just because we can’t know things for sure, doesn’t mean we should throw in the towel. It can help a lot to know the probability it’s going to rain in a week.

For more on Palmer’s ideas, try this:

• Steve Easterbrook, AGU talk: Tim Palmer on Building Probabilistic Climate Models, Serendipity, 19 December 2010.

My plan is to explain a tiny bit of this book he helped edit:

• Tim Palmer and Paul Williams, editors, Stochastic Physics and Climate Modelling, Cambridge U. Press, Cambridge, 2010.

So, after an intro to El Niño in “week307”, this week I talked about stochastic models of the El Niño and the Atlantic Multidecadal Oscillation, based on these two papers:

• Richard Kleeman, Stochastic theories for the irregularity of ENSO, in Stochastic Physics and Climate Modelling, eds. Tim Palmer and Paul Williams, Cambridge U. Press, Cambridge, 2010, pp. 248-265.

• H. A. Dijkstra, L. M. Frankcombe and A. S. von der Heydt, The Atlantic Multidecadal Oscillation: a stochastic dynamical systems view, in Stochastic Physics and Climate Modelling, eds. Tim Palmer and Paul Williams, Cambridge U. Press, Cambridge, 2010, pp. 287-306.

Of course I only scratched the surface: the Hopf bifurcation plus noise is pretty, but far from the whole story. These papers are really fun to read, so I hope folks dig into them!

In future episodes I want to talk about glacial cycles, and how the noise from short-term weather may act to amplify the effects of cycles in the Earth’s orbit
(Milankovitch cycles), via a phenomenon called stochastic resonance. Luckily for me, Tim van Beek has been studying stochastic resonance and writing programs to show how it works.

But before I get to that, I’ll probably give another example of a Hopf bifurcation plus noise, developed by Graham Jones over on the Azimuth Project. I won’t give the link because I want it to be a surprise!

If anyone reading this likes to program, I hope you join the fun: drop us a line on the Azimuth Forum!

Merry Christmas, everyone. My wife and I going to Vietnam in 2 hours. We’ll spend seven days in Hanoi and five days in Huế, which was the imperial capital of the Nguyễn Dynasty until 1945. I’ll be back on January 5th. Be good while I’m gone.

• Tim van Beek says:

Happy Holidays!

Arrow said:

A model should capture causal relations of the modeled phenomenon. A “model” which produces a similar output for completely different reasons then the ones governing the actual phenomenon is a failure.

The actual phenomenon is governed by an interaction of a slowly varying, long memory phenomenon, namely oscillations of temperature and other variables of the Pacific Ocean, and very short term, no memory influences from the atmosphere. The noise models the latter.

If one thinks that there is some noise which for some reason is impossible to model then it would make much more sense to filter this noise out and then compare such filtered data to a model without any noise.

The “noise” is used to include effects into the model which would otherwise not be included, therefore making the model more accurate. However, models can have different uses, for example:

1) make predictions,

2) explain phenomena qualitatively,

3) as a null hypothesis for other, more sophisticaed models.

The model in week 308 cannot be used to make predictions: to predict the next ENSO cycle one needs more sophisticated models that are closer to a full model using Navier-Stokes equations with real world data as initial and boundary data. But I think it can be useful for 2) and 3).

But: Note that “noise” can be and has been used very successfully for models that make very accurate predictions, examples can be found in financial mathematics (Black-Scholes formula), but also in engineering, for example in control theory.

There is a lot of literature out there, simply look for the keyword “stochastic control theory”.

• DavidTweed says:

Note that something subtle is going on: adding noise is using a model, it’s just that it’s a very weak model which where we’ve decided (due to either ignorance of the details, inability to measure inputs to the details, computational intractability of the details or some combination thereof) is that all the model tells us is the probabilistic distribution of possibilities. Using those to model small residuals after dominant contributions have been modeled more closely is typical of modeling, and use of models in science. Have you ever tried to come up with a measure of the coefficient of friction between two surfaces directly without measuring the atomic level details and computing all the interactions? That’s the same thing, except in this case — as with most cases in typical simple physics — the probability distribution is very, very tightly peaked so you can neglect the probabilistic dimension. What’s different here is that (analogies to) very complex, huge-number-of-component systems are being considered where the system behaves very non-linearly to small perturbations.

• John Baez says:

David wrote:

Note that something subtle is going on: adding noise is using a model…

Right: and this is very easy to see by noticing that there are infinitely many different kinds of noise, so building a model involving noise requires that you make assumptions about the nature of the noise involved.

In the simple “Hopf bifurcation plus noise” model described in this blog entry, we’re adding independent identically distributed white noise terms to the formulas for the time derivatives of both variables (x and y). This has the mathematical advantage of being as simple as possible, and not destroying the circular symmetry of the problem. But in any serious attempt to model a specific real-world phenomenon, one should try to make the noise realistic, not just mathematically elegant.

For example, in weather simulation, where the noise might model the effects of turbulence at length scales not included in the model, one would want to think hard about how turbulence actually works. Luckily, people have been thinking about this ever since Kolmogorov wrote his amazing paper back in 1941. So, a lot of the hard thinking has already been done.

• John F says:

Although I never worked in weather, the same issues affect any modeling w/noise. The unmeasured stuff, the residuals, the fast variables, uncertainty, the details of how the noise is designated is not very important. What is important is that what is being modeled be correspondable to what you want you want it to model! For instance, you wouldn’t want to model an audio amplifer as being only fully off (too quiet to hear) and wide open (too loud to hear). For one reason, because then what should be actual sound is instead noise in that model. In the context of control theory this modelability is controllability
http://en.wikipedia.org/wiki/Controllability

Anyway, in almost all applications (such as finance – where are those finance guys when you want them?) you have to use past measured behavior to predict future behavior, especially to make realistic noise aspects. Because you almost never have a clear idea of where the noise is coming from or what it’s properties should be, especially apart from those measurements.

What exactly do they use for the noise terms in weather forecasting? Are there trade secrets?

• Tim van Beek says:

John F. asked:

What exactly do they use for the noise terms in weather forecasting? Are there trade secrets?

So far I have not discovered a weather model using stochastic elements. People include subgrid phenomena via so called “parameterizations”, which are heuristic deterministic influences on each grid cell based on an estimation of concrete physical effects. They then use “ensemble runs” which means running the models multiple times with different parameterizations and different initial data, to find out which predictions of the models are stable and which are not.

This is significantly different from having a stochastic effect on every time step of the model, like one has in stochastic partial or ordinary differential equations.

Stochastic equations are used with great success in engineering, for example for controlling the flow of fluids through pipes, as an approximation to the unaccessible solution of Navier-Stokes equations. So, IMHO, this could turn out to be of use for weather and climate models, too, but if I’m correct there would be the need for a minor paradigm change in the weather modeling community.

• John F says:

Tim,
I’m surprised that weather models are only studied via sensitivity analyses
http://en.wikipedia.org/wiki/Sensitivity_analysis
without stochastic parameters. It seems blind to me.

4. Tim van Beek says:

John wrote:

To a mathematician like me, what’s really interesting is how the addition of noise “smooths out” the Hopf bifurcation.

The topic of “qualitative analysis of differential equations” is now over 100 years old, it was started by Henri Poincaré, and a lot of concepts bear his name, like Poincaré map. Of course, Poincaré could not resort to computer approximations to get an idea how solutions to certain differential equations look like. But although we have powerful computers today, the topic itself is still very important to supplement and cross check numerical solutions.

The topic of qualitative analysis of stochastic differential equations however lies idle, as far as I know (I don’t know of any interesting results or methods). The example in week 308 shows that there are additional phenomena that have to be investigated and taken care of, the influence of noise clearly adds a layer of complexity. Equillibrium methods, like the evolution of probability distributions described by Fokker-Planck equations, are of limited use here, since we are interested in the qualitative behaviour of single sample paths…

• Phil Henshaw says:

Would it count for what you’re looking for, as “interesting results or methods”, to find a result that energy conservation implies that the stochastic equations of natural processes need to incorporate finite developmental periods displaying accumulative proportional change?

Somebody’s going to get it some day, why not you guys??

http://www.synapse9.com/drtheo.pdf

5. srp says:

I own a wonderful desk toy called the Randomly Oscillating Magnetic Pendulum (ROMP). It gives some intuitive feel for how a simple system can exhibit limit points and cycles and can abruptly switch from one to another “for no good reason.” You can rearrange the magnets into any configuration on the plate below the pendulum, causing a new set of complex behaviors. Entertaining, enlightening, and slightly hypnotic.

6. Giampiero Campa says:

I think Simulink is the ideal tool to do these kind of simulations, however, this is the quick and dirty Matlab code to do this, if anyone wants to try:

beta=1;lambda=0.1;

dv=@(t,v) [-v(2);v(1)]+beta*v-v’*v*v+lambda*randn(2,1);

[t,v]=ode45(dv,[0 10],[0 2]);

plot(v(:,1),v(:,2));axis([-2 2 -1.5 2])

The first line sets up the parameters, the second sets up the dynamical system (where v is the vector having x and y as components), the third solves the ode from 0 to 10 seconds, with an initial condition of [0 2], and the fourth line plots the results (use plot(t,v(:,1)) to plot the first component versus time).

The plots are smoother than the one shown above, perhaps because I have assumed that the variance of the white noise is 1, and maybe a larger one was assumed above. Multiplying lambda by 10 produces very similar plots.

• Tim van Beek says:

Interesting!
The main reason why I did not use Matlab is that it is not free and I don’t have a license – I know that most universities have one, but I’m not affiliated with any university :-)

Besides that, Matlab is not open source, so I wonder

a) what algorithm does it use to generate random numbers? Does it always use the same seed i.e. does it always reproduce the same graphs?

b) what algorithm does it use to solve the SDE?

c) do different versions of Matlab produce the same graphics? (I doubt that, to be honest).

The plots are smoother than the one shown above, perhaps because I have assumed that the variance of the white noise is 1, and maybe a larger one was assumed above.

No, the variance of white noise is not a free parameter in an Itô stochastic differential equation driven by Brownian motion, which makes me wonder why you have to specify it at all and to what effect.

The graphics above were produced with the explicit Euler scheme with a step size of 0.001. Therefore, in every timestep one has to add noise in the form of a random variable drawn from $\lambda N(0, \sqrt(0.001))$, because the increments of Brownian motion $W_{t + h} - W_{t}$ are Gaussian with mean zero and variance h.

• Tim van Beek says:

Sorry, it should be:

in every timestep one has to add noise in the form of a random variable drawn from $N(0, 0.001)$ multiplied by $\lambda$.

• Giampiero Campa says:

Tim, are you sure about this ? If the variance is 0.001 and lambda is 0.1 then the noise should be pretty much non-existent, and the plots would be really smooth.

Maybe you meant that the variance should be 1/sqrt(0.001) ? (and in fact i get plots that are really similar to yours in that case).

In any case i have checked also with forward euler and results are essentially the same, so it comes down entirely to the noise level.

Let me know if you want me to send you the time histories for comparison.

• Tim van Beek says:

Giampiero said:

If the variance is 0.001 and lambda is 0.1 then the noise should be pretty much non-existent, and the plots would be really smooth.

It is important to keep in mind that the Euler scheme is a discrete approximation with a fixed stepsize h, and when we simulate a sample path we have to add noise at every step with variance h. So if the stepsize is rather small (like 0.001 in our sample) the variance of the noise added at each step is also very small. But the effect adds up!

Tim, are you sure about this ?

No, this is of course an important question! I don’t trust my intution to tell me what a certain noise level should look like – in fact I should have written some tests before the publication of week 308!

A good test for the noise level is this: It is possible to formulate the Fokker-Planck equation for our example in polar coordinates and find the stationary, rotationally invariant solution which describes the distribution of r, the radius, in the equillibrium. This can be calculated in closed form.

Then it is possible to create a histogram approximation of the distribution of r by sampling a very long run of the SDE solver, and compare this to the solution obtained from the Fokker-Planck equation.

I did this, and the results match! so, yes, know I’m pretty confident that the SDE solver for week 308 gets the noise right.

I will explain all of this in more detail on the Azimuth project once I have more time (maybe tomorrow).

Thanks for your help and for getting me to finally do these chross-checks!

• Giampiero Campa says:

Hi Tim, yes of course h multiplies everything on the right hand side if you use Euler approx. But i am talking about the variance of the noise in the equations given by John.

In other words, let’s suppose the vector [x y] is [1 1] and have a look at the derivatives:
The first term is [-1 1], the second is [1 1], and the third is -[2 2]. Now, here is the important part, we are adding a noise here and I want to be sure about the variance of the noise that we are adding. If you really add a noise of variance 0.001 and multiply it by 0.1 here in this equation (and i don’t think you meant to add it here), you are adding something that is 3 to 4 orders of magnitudes below the other terms so you will hardly see anything !

I don’t think there is a bug anywhere, i just think we are using different levels of noise. If you tell me the variance of noise that is supposed to be added to those equations (not to their Euler approx) then I’ll redo the plots and upload them to you so we will start comparing apples to apples :)

• Tim van Beek says:

Giampiero said:

I don’t think there is a bug anywhere…

I don’t think that there is a bug with the noise level, too, because the SDE solver reproduces the distribution of r pretty good :-)

In other words, let’s suppose the vector [x y] is [1 1] and have a look at the derivatives

I’m sorry, but I don’t understand you: Is [1 1] the starting point of the process and you calculate the derivatives?

…but I am talking about the variance of the noise in the equations given by John.

The only free parameters are $\beta$ and $\lambda$. As you can see on the page stochastic differential equation, the Euler scheme is a simple generalization of the classical deterministic Euler scheme, in every step the position of the process is changed according to the deterministic part + noise.

Once a step size h is fixed, the increment $W_{t+h} - W_t$ has a fixed distribution, namely $N(0, h)$. There is no further choice involved here.

(Higher order schemes like Runge-Kutta schemes use higher terms from the stochastic Taylor series, which involve iterated integrals of Brownian motion, which are a little bit more complicated to simulate, so it is not true that all discrete approximations use the simple increment of Brownian motion only.)

The deterministic part gets multiplied by h, where the noise has variance h. This scaling behaviour is the reason why Brownian motion has a.s. sample paths that are not of bounded variation, so that the Riemann-Stieltjes integral with respect to the sample paths does not converge a.s. This is the reason why there is a stochastic calculus at all :-)

But when we compare the influence of the deterministic path to the influence of additive noise in the Euler scheme, we have to keep in mind that the influence of the deterministic part gets multiplied by h, ergo it is not true that the noise term is negligible.

• Giampiero Campa says:

So the same code works fine in octave, (you just have to download ode45.m which you can just google for, you’ll find several open sourced versions, there is one at the university of Boston. There is also an ode78.m which implements Runge-Kutta with 7th order formulas).

About randn:
http://www.mathworks.com/help/techdoc/ref/randn.html
(yes you can reset the seed to repeat the same sequence).

By the way if you specifically want an sde solver there is an apposite sde toolbox here (free):
http://sdetoolbox.sourceforge.net/
(also, i think there are specific SDE objects and solvers in the econometrics toolbox, but i have never used them).

I have tried a couple of versions before 2008 and the graphs are identical, with R2010a it is slightly different, i think something changed in the number generator around that time.

• Giampiero Campa says:

Tim, I also meant to say that if you are using a simple forward Euler integration it’s already a big difference right there, obviously.

• Tim van Beek says:

Alright, I finally took the time to look at the Matlab script, here are my thoughts:

1. Matlab does not provide any SDE solvers, at least I did not find any description of one,

2. ode45 is a ODE solver, it chooses it’s stepsize dynamically,

3. it would seem that the line

“dv=@(t,v) [-v(2);v(1)]+beta*v-v’*v*v+lambda*randn(2,1)”

gets Matlab to add random numbers at every step that are distributed according to N(0, 1).

Now, with some luck ode45 will use a constant stepsize h, in this case you’ll get sample paths that look similar to those of TWF if $\lambda*h*N(0,1)$ looks close enough to $\lambda*N(0, h)$. But this is more of an accident if the stepsize $h$ is much smaller than the resolution of the graphic. Although ode45 chose 1.0 as the default initial stepsize, I suspect that it will calculate a much smaller one and use this according to the default error bounds, but I don’t know how to find out which one it actually uses.

If I’m correct then this is a very good example that one needs more than simple intuition to check if an SDE solver is correct :-)

• Giampiero Campa says:

Ok, got it! dW/dt is approximated by (W(t+h)-W(t))/h, so you have a noise with variance 0.001 that _is divided by h_.

Indeed using the line dv=@(t,v) [-v(2);v(1)]+beta*v-v’*v*v+lambda*sqrt(0.001)*randn(2,1)/0.001;gives very similar plots to yours, especially when you fix the step size to 0.001.

To double check, I have also tried a simple euler approximation, using the following 4 lines, which can be copied and pasted directly in Octave or Matlab, and don’t need any ode solver (and besides execute very fast):

beta=1;lambda=0.1;t=[0:0.001:10];z=[0 2]’*t;z(:,1)=[0 2]’;
dv=@(t,v) 0.001*([-v(2);v(1)]+beta*v-v’*v*v)+lambda*sqrt(0.001)*randn(2,1);
for i=1:length(t)-1,z(:,i+1)=z(:,i)+dv(t(i),z(:,i));end
plot(z(1,:),z(2,:));axis([-2 2 -1.5 2])

This also gives plots that are basically identical to yours, so problem solved.
Thanks Tim !

• Tim van Beek says:

Yes, that’s right :-)

There should be also several additional packages for Matlab that are not part of the core with SDE solvers, also with adaptive step size control – at least such algorithms exist. Of course the noise level has to be adapted to the chosen step size at each step.

• John F says:

Matlab fan here. I also constantly write numerical routines in Basic, Fortran, C, and IDL, but (except for OPC – Other People’s Code) I always develop numerical analysis algorithms in Matlab first.

By the way, almost all Matlab packages are just someone’s collections of ordinary Matlab routines, so you probably don’t need them or even want to have to learn them unless you’re going to be mostly focused on something specialized, like say Computational Biology.

• Tim van Beek says:

John F said:

By the way, almost all Matlab packages are just someone’s collections of ordinary Matlab routines, so you probably don’t need them or even want to have to learn them unless you’re going to be mostly focused on something specialized, like say Computational Biology.

I’m trying to learn a little bit more about climate models. I understand that Matlab is a good tool to quickly code a simple algorithm and play with it. Climate models need an effort of several man decades. I doubt that Matlab is suitable to handle models that are comprised of millions of lines of code and are developed by hundreds of people over several years. On the other hand, this is quite usual for applications using either the Java or the .NET tool stack.

• Tim van Beek says:

This is the first bug report for the Azimuth Code Project, so I created an issue with the issue tracker over at google projects here.

Thanks to Giampiero: It’s very important that we cross-check our results and fix all bugs before they have any impact.

(I’m pretty sure that the pictures give a qualitative correct impression of the behaviour of the solutions, even if it turns out that there is some scaling error in the noise level – so such an error would not be that bad now. It would be very bad if we used the code to calculate e.g. crossing times for pricing options :-).

• John Baez says:

Hi, Tim! I’m in Vietnam, which is a fascinating place, so I don’t want to spend spend much time blogging, but if you could fix the the pictures for “week308”, that would be great. That way I can put up fixed versions when I get back.

• Tim van Beek says:

Sure, no problem :-)

Every report of unusual behaviour should result in a bug report, or an issue (these are synonymous, especially if one uses the bug tracker of google project). This does not imply that there actually is a bug in the software.

In this case I became suspicious that I maybe did something wrong in the random number generator when I translated it from C++ to Java, so I wrote a chi-square-test for it – which it passed, so now I’m pretty sure that there isn’t a bug.

Nevertheless I’ll calculate the Fokker-Planck equation in polar coordinates and compare the stationary solution to the approximation I get from my SDE solver – this is another sanity check (I’ll try to get this done before new year’s eve, since I won’t have any time for that afterwardts :-(.

But right now I’m more supicious about what Matlab actually does when fed with the lines that Giampiero posted and could some help with that, since I don’t have and don’t know Matlab.

7. Tim van Beek says:

Giampiero has updated the issue over at the Azimuth Code Project, I need to take a closer look at it – meanwhile I had enough time to publish my plausibility test of the noise level of the SDE solver over at the Azimuth project:

Fokker-Planck equation, look for “Hopf-Bifurcation Example of TWF 308”.

The distribution of r is very sensitive to the noise level λ, so I find it hard to believe that a rather exact match of both calculated and simulated values as displayed in the graphic over there is an accident.

8. Tim van Beek says:

I just noted a typo, it should be:

Bernt Øksendal

His book contains many quite successful applications of stochastic differential equations to convince the sceptics that this topic is worthwhile to study :-)

• John Baez says:

Hi, I’m back in Singapore. I fixed this typo — thanks for catching it.

9. Chris Goddard says:

This post had me thinking about catastrophes. Interestingly, there were 7 types discussed in Thom’s investigations (including the “butterfly catastrophe”), which suggests to me 3-cat level complexity. If viewed with minimal structure & in a box model context, I believe this could be a fruitful direction of further work in this instance.

• John Baez says:

Hi, Chris! I’ve tried to fix the html in your post. To make a link in a comment on this blog, you should type something like this:

<a href = “http://math.ucr.edu/home/baez/music/”>John Baez’s music</a>

which produces this result:

John Baez’s music

You wrote:

This post had me thinking about catastrophes.

Indeed, I was wanting to say a lot more about catastrophe theory, but I restrained myself because I thought it would be too distracting. I decided it would be better to see some examples of catastrophe theory in climate science before chatting about the mathematical subject in its abstract glory!

This week we saw the Hopf bifurcation in the context of the El Niño-Southern Oscillation. Next week we’ll see it in the context of predator-prey models. Later we’ll see the fold catastrophe in the context of ‘Snowball Earth’ and the glacial cycles. But if you’re impatient, you can see the basic idea now:

• Azimuth Project, Snowball Earth.

In all cases these catastrophes display some extra interesting behavior when we add noise.

And yes, I’m very interested in thinking about them in terms of box models, because box models are really just applied category theory.

I’ll probably spend a year or two writing some papers that make the last sentence more precise. I’m eager to bring some category-theoretic tools into climate physics, systems ecology and the like. I don’t want to get too sidetracked by this theoretical desire — I doubt it will ‘help save the planet’ very much — but I do need to keep publishing math papers, and this is one obvious way.

10. Chris Goddard says:

Hi John,

Thanks for fix – this wordpress stuff is tricky business =|

Incidentally, related to your aside, procedural generation of music is something that I’d like to learn how to do at some stage (and the implementation of procedural techniques in general).

But I suspect that that would be sidetracking this discussion just a bit!

11. Nice post! I never thought adding noise could have this effect.

Just for fun I create an interactive Sage version of this phenomenon: you can play with the model at http://www.sagenb.org/home/pub/2630.

On a side-note, I think that Sage fits in very well with the Azimuth philosophy:

It is open source (it suffices to add ?? to any method name to view the implementation) and actively maintained. It doesn’t aim to reinvent the wheel but instead incorporates GAP, Maxima, scipy, … into one consistent interface.

It’s written in Python, which is easy to learn and (thanks to cython) can be as fast as compiled C. For instance, the whole interactive demo is no more than 10 lines.

Code that goes into Sage is peer reviewed and unit tested, so you don’t need to worry about stress-testing the code (though it is always good to be cautious).

• John Baez says:

Joris wrote:

Just for fun I created an interactive Sage version of this phenomenon: you can play with the model at http://www.sagenb.org/home/pub/2630.

For some reason that doesn’t work for me. I see some sliders at the bottom, and I can slide them, but I don’t see a plot of the solution.

Also, for some mysterious reason the jsmath doesn’t work for me either.

Does this stuff work for anyone else reading this? Maybe it’s just me.

• Nathan Urban says:

I have the same problems as John (using Mac Firefox 3.6.1 or Safari 5.0.3).

• Tim van Beek says:

Unfortunately it does not work for me, either, I have the same problems as John.
Over at the Azimuth forum Staffan Liljegren already pointed out Sage but I haven’t taken a closer look at it yet. The motivation of the authors of Sage is very similar to the motivation of the Azimuth project, so I do hope that we can use and maybe extend Sage for our purposes.

• Thanks for pointing this out! It seems that we have stumbled upon a limitation within Sage: apparently, to see the interactive elements in a worksheet one needs to be logged in. Rather than asking you all to create an account with Sage, I noticed that there are some patches lying around to enable this, and once I get back from this conference I’m attending, I’ll try to set up my own server running this worksheet.

Sorry for this unexpected setback! You can still see the niftyness by logging in though…

12. peter waaben says:

The Poincaré–Bendixson theorem states that any orbit that stays in a compact region of the state space of a 2-dimensional planar continuous dynamical system, all of whose fixed points are isolated, approaches its ω-limit set, which is either a fixed point, a periodic orbit, or is a connected set composed of a finite number of fixed points together with homoclinic and heteroclinic orbits connecting these. Thus chaotic behaviour can only arise in continuous dynamical systems whose phase space has three or more dimensions. However the theorem does not apply to discrete dynamical systems, where chaotic behaviour can arise in two or even one dimensional systems.

A weaker version of the theorem was originally conceived by Henri Poincaré, although he lacked a complete proof. Ivar Bendixson (1901) gave a rigorous proof of the full theorem.

I.e., this is a proof that discrete and continous systems are not-isomorphic thus care seems appropriate in the development of models etc.

13. […] Software for investigating the Hopf bifurcation and its stochastic version: see week308 of This Week’s […]

14. In This Weeks Finds, back in ‘week308′, we had a look at a model with ‘noise’. A lot of systems can be described by ordinary […]

15. lee bloomquist says:

— John wrote: “…there are infinitely many different kinds of noise, so building a model involving noise requires that you make assumptions about the nature of the noise involved.”

Since 1/f noise is found in some types of meteorological data, has anybody thought about using 1/f noise to model ENSO with Hopf bifurcation and noise? Since 1/f noise is said to model “long term memory effect” could it also be relevant to the current discussion in the El Nino project, part 6, between Steve Wenner and HyperG about Bayesian and frequentist interpretations of probability?

— In the above post Tim van Beek wrote “a model can have different uses…[for example] as a null hypothesis for other, more sophisticated models.”

Instead of a single null hypothesis I wonder if an algorithm based on multiple null hypotheses would be useful. Don’t know if that’s standard but here goes.

This is an idea about an optimal capital budgeting process based on software prediction of ENSOs (using an algorithm based on multiple null hypotheses).

Why this kind of approach to software specification? Because ENSOs have big economic costs and so we might be able to cost-justify developing some software for this kind of optimal budgeting process.

Banks should be interested because, on behalf of their shareholders, they would be able to bet their money on better ENSO predictions. Governments should be interested because, on behalf of the people under their governance, they would be able to make better expenditures e.g. on appropriate corn storage and so on.

The idea comes from looking at Steve’s spreadsheet, specifically the table for “Training.” And then, thinking of all those models from the “big guys,” that John showed us, as actually, null hypotheses. For what follows I assume that each null hypothesis from the list either predicts or does not predict an upcoming ENSO each year.

To make it easier (for me anyway), imagine a cartoon about a bank with a group of investment bankers, in three acts. Each investment banker is given just one dollar per year from a finite amount of dollars to invest based on predictions of an ENSO. After all it’s a just a cartoon.

Act one focuses on one of the investment bankers. As the curtain opens it is capital budgeting time and he is considering all of the null hypotheses. Scene one: he focuses on one of the null hypotheses. It is predicting that next year there will be an ENSO. The banker bets his one dollar on shorting a company that will be hurt by the predicted ENSO. Scene two: The predicted ENSO does not occur. The banker puts a mark on a blackboard for this particular null hypothesis labelled “buyer’s regret.” Because he regrets having listened to the prediction by this particular null hypothesis. Scene three: Time passes. Once again we see the banker focusing on the very same null hypothesis. It predicts an ENSO for next year. The banker goes somewhere else with his dollar. Scene four: As predicted, the ENSO occurs. The banker puts a mark on another blackboard next to the buyer’s regret blackboard for this null hypothesis. It is labelled “regret at lost opportunity.” Scene five: Time moves on like pages turning in a book. We see the banker putting more and more marks on each blackboard for the null hypothesis, keeping track of his two kinds of regret associated to the null hypothesis in question. Final scene of act one: the narrator tells us how the banker has balanced the counts. Through his choices the banker has evenly balanced his counts on the two blackboards assigned to the null hypothesis in question. Buyers regret balances regret from lost opportunity. Each mark of regret on betting this null hypothesis would be correct pushes the investment banker away from it. But each mark of regret on having lost an opportunity to invest in this null hypothesis pushes the banker toward it. He is pushed toward and away from the null hypothesis in question by these two opposing entropic forces. Equlibrium occurs for the null hypothesis in question when these two balance each other. Note: the bankers in this cartoon bet just on whether or not a particular null hypothesis is right in its prediction, regardless of whether the prediction is for or against an ENSO . (This would be the infamous probability learning algorithm if all the null hypothesis worked like the feeding spots in the probability learning experiment. Of course they don’t, but for now this is just a cartoon.)

Act two. Scene one: We see a the entire group of investors for the bank,each given a dollar bill per budgeting period, each balancing these two forces of regret for each null hypothesis in the collection that John gave us. That is, each banker has two blackboards for each of the null hypotheses. It’s a lot of blackboards. Scene two: In flurry of activity the bankers run around from one kind of blackboard to the other for each null hypothesis. Finally, some form emerges and we see that things settle out to there being for each null hypothesis a group of bankers, each group of varying size in proportion to the prediction accuracy of each null hypothesis. (It’s like the school of fish settling into a Nash equilibrium in a previous comment on biology.) Scene three: We focus on the original soliatry banker and see the cartoon bubble of thoughts in his mind. “I’d like to try out another one of those hypotheses, but I’m afraid of leaving my group here. It’s stable. Who knows, if I leave it they might not let me back in. And those other groups might not let me in. Think I’ll just stay here. I’m afraid to do anything else.”

Act three: the narrator speaks. “Socrates left us with the idea that self-knowledge is the basis of all other knowledge. We may never be able to know the perfect model, but we do know how all of these null hypotheses can make us experience regret. The self-knowledge in this case is knowing our own experience of regret as a kind of thirst or hunger not satisfied. It’s the basis of the algorithm.”

(curtain closes)

• davetweed says:

I believe that Tim was using “null hypothesis” more as a shorthand for “base model”, in the sense that whenever you create a model which you know will not be a full, correct model of the phenomenon (which philosophically might be the only thing science does, but that’s a tangential debate) you generally want to see if it performs better than your base model. For example, in a binary detection classifier one can compare against the classifiers “always true” and “always false”: if your model performs worse than that then it’s clearly no use and should (in its current incarnation) be abandoned.

As such, I think the only time you’d use them is when you’re coming up with a new classifier technique, and you compare against the base model (on your “validation set”) and discard one of the two models. (Actually, if you care about the kinds of error (false positive, false negative, etc) you might find different models work better for different regimes and you’re off doing ROC analysis as briefly discussed a little here. However, in that case you’d still only have one active model for each combination of error values.)

That’s related to what I think Tim was getting at; I’ll have to think some more about what you proposed as an idea in its own right.

• lee bloomquist says:

Dave, when you say ” … “null hypothesis” more as a shorthand for “base model”, in the sense that whenever you create a model which you know will not be a full, correct model of the phenomenon (which philosophically might be the only thing science does, but that’s a tangential debate)”–

I do agree. Not being a scientist but an engineer (ret.) I admit to bringing up something that is perhaps tangential. From an equation that explained to me a well-studied phenomenon in psychology called probability learning, I think I see a way to apply this equation for the lab animals’ behavior to guide investors’ or governments’ spending if and when ENSOs get worse and worse. Actually, my application of the equation wouldn’t care at all about the details of particular models. It doesn’t answer which model explains historical data better. The idea is just to give investors’ and governments’ some guidance for their spending behavior in real time– as a future unfolds in which the probability of ENSOs might well change over comparatively short periods of time.

Although I do think it would be interesting to use the equation in order to compare a number of differently initialized Hopf bifurcation models with 1/f noise against what is going to happen in the future. The algorithm might pick one or the other of the differently initialized Hopf models as time goes on. And in a separate running, their performance as the future unfolds could be compared to the same application of the equation to the models of the “big guys.” To me it looks like a competition between simple and complex models.