(The question mark was left behind after reformulating my comment.)

I would like to point out that a sort of information-theoretic “second-law–like” converse also holds, i.e., that any process that makes any ensemble of initial distributions less and less distinguishable over time is necessarily Markovian?

The question mark at the end confuses me. Are you pointing out a fact, or asking a question?

Wonderfulness! I’m looking forward to a detailed article comparing the classical methods described by Dutton with the modern approach espoused by Baez et al.

Hi John,

I computed Bayes probability intervals for the success probabilities of Ludescher’s predictions (and for my variations). My method is not perfect, since I assumed independence across years, which is surely not correct. However, taking into account the serial correlations would only increase the length of the error bars, and they are plenty long enough already! I’m not certain how to do better, but I suspect I would need to use Monte Carlo methods, and they have their own problems.

Ludescher’s method has an expected posterior probability of successfully predicting an El Nino initiation event, when one actually occurs, of 0.601. This is very close to the frequentist estimate (11 successes / 18 El Ninos = 0.611); so, the prior distribution has little effect on the estimate of the mean. The 95% confidence interval is from 0.387 to 0.798; so, the data and method did succeed in narrowing the prior uniform interval (see the next paragraph) that extends from 0.286 to 1. The intervals for “non” El Nino events are shorter: 0.768 to 0.951 for the probability of successfully predicting a “non” event; however, the prior for the non-event is from 0.714 to 1, so Ludescher’s method doesn’t narrow the range very much!

If we don’t condition on the outcome, then the estimate of the mean success probability is 0.795 using Ludescher’s method; but, if we simply use the “dumb” rule (always predict “no El Nino”) then we will be right with probability 0.714 – the data and Ludescher gain us very little!

Truncated Uniform Prior: I assume that any reasonable method will do at least as well as chance. Over many years we have experienced an El Nino initiation event in 28.6% of those years. So, a dumb method that simply declares “El Nino will happen next year!” with probability 28.6% and “will not happen!” with probability 71.4% will be successful in “predicting” 28.6% of all El Ninos. So, I set the minimum success probability at p0 = 28.6%, given that an El Nino actually occurs. Similarly, the dumb method successfully predicts 71.4% of the “no El Nino” years; so, I set the minimum success probability at p0 = 71.4% for any prediction method, given that the outcome is “no El Nino”. In both cases the upper limit is p1 = 1 for a perfect prediction method.

For a binomial sampling situation with a truncated uniform prior the posterior density is expressible with the help of the beta distribution function (normalized incomplete beta function). The formulas can be found in Bayesian Reliability Analysis by Martz & Waller, Wiley, 1982, pp. 262-264. The posterior mean has a closed form, but the Bayes probability intervals must be found by iterative methods (I used the Excel “solver” add-in).
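As a rough illustration of this kind of calculation (not the spreadsheet's actual formulas), here is a minimal Python sketch. It assumes a binomial likelihood with 11 successes out of 18 El Ninos and a uniform prior truncated to [0.286, 1], as described above. The function names, the use of Simpson's rule in place of the incomplete beta function, and the choice of an equal-tailed interval are all assumptions made for the sketch; the interval reported above may have been defined differently.

```python
def kernel(p, s, n):
    """Unnormalized posterior density for s successes in n trials (flat prior)."""
    return p ** s * (1.0 - p) ** (n - s)


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def posterior_summary(s, n, p0, p1=1.0, level=0.95):
    """Posterior mean and equal-tailed interval for a uniform prior on [p0, p1]."""
    norm = simpson(lambda p: kernel(p, s, n), p0, p1)
    mean = simpson(lambda p: p * kernel(p, s, n), p0, p1) / norm

    def cdf(x):
        return simpson(lambda p: kernel(p, s, n), p0, x) / norm

    def quantile(q):
        # Bisection on the posterior CDF (an iterative search, like a solver).
        lo, hi = p0, p1
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if cdf(mid) < q:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    tail = (1.0 - level) / 2.0
    return mean, quantile(tail), quantile(1.0 - tail)


# Counts and prior lower limit taken from the comment above.
mean, lower, upper = posterior_summary(s=11, n=18, p0=0.286)
print(f"posterior mean {mean:.3f}, 95% interval {lower:.3f} to {upper:.3f}")
```

The printed values can be compared with the 0.601 and the 0.387 to 0.798 quoted above. Like the original calculation, the sketch treats the years as independent.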

The details are on the spreadsheet. I greyed out superfluous stuff I used to help me get the formulas straight.

Cheers,

Steve

He added:

Hi, I just wanted to add a couple of comments:

The reason I used both the training and the validation data in estimating confidence limits was because the validation data show a better fit to the model than the training data; so, it seemed more fair to use both data sets for these calculations.

I did some rough calculations to estimate the increase in the error bars that might result if I were to take the serial correlations into account. For instance, I think that the lower limit for the probability of successfully predicting an El Nino with Ludescher’s method is actually closer to 0.366, rather than the 0.387 reported below, and the upper limit would increase from 0.798 to 0.815. Since my ideas for this adjustment are only half-baked, I won’t go into the details here.


Just in case: the matrix $P$ is formed by dividing each element by a column-dependent number, namely the sum of all the elements in that column.

There is another thing. For each column, take the sum of the elements of the matrix in that column, and let $P$ be the matrix whose elements are the original elements divided by their column sum. Then $P$ is a stochastic matrix: in each column the elements add up to 1. By Perron’s theorem, 1 is an eigenvalue and all other eigenvalues have norm less than 1.

The eigenvalue 1 is simple if the graph is connected. This should be the case. The corresponding eigenvector is an equilibrium measure for the stochastic process associated with $P$. The point is that this measure is precisely the one defined above.
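To make the construction concrete, here is a small Python sketch. The 3×3 matrix `A` is a made-up example, assumed symmetric with positive entries (as a matrix of link strengths might be); it is not data from the post. The sketch column-normalizes `A` and finds the equilibrium vector by power iteration. For a symmetric matrix, that vector turns out to be proportional to the column sums. Whether that matches the measure referred to above depends on how that measure was defined.

```python
def column_normalize(a):
    """Divide each element by the sum of its column; return (P, column sums)."""
    n = len(a)
    col_sums = [sum(a[i][j] for i in range(n)) for j in range(n)]
    p = [[a[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return p, col_sums


def equilibrium(p, iterations=500):
    """Power iteration: repeatedly apply P to a probability vector."""
    n = len(p)
    v = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(p[i][j] * v[j] for j in range(n)) for i in range(n)]
    return v


# Hypothetical symmetric matrix with positive entries (illustration only).
A = [
    [2.0, 1.0, 0.5],
    [1.0, 3.0, 1.5],
    [0.5, 1.5, 1.0],
]

P, col_sums = column_normalize(A)
v = equilibrium(P)

# Each column of P sums to 1, so P is (column) stochastic.
assert all(abs(sum(P[i][j] for i in range(3)) - 1.0) < 1e-12 for j in range(3))

# For symmetric A, the equilibrium vector is the normalized column sums.
expected = [s / sum(col_sums) for s in col_sums]
print(v)
print(expected)
```

Because every entry of this `A` is positive, power iteration converges to the unique eigenvector for eigenvalue 1; for a matrix with zeros, convergence would additionally need the connectivity condition mentioned above.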

Could there be a climatic interpretation of this? Flow of information?

I am very intrigued by the very weakly correlated white strip just south of the most correlated zone. I mean, most of the ocean is light blue, and there is a clearly white strip just south of the red, green and dark blue zone.

I enjoyed this very much.
