## Fires in Indonesia

2 November, 2015

I lived in Singapore for two years, and I go back to work there every summer. I love Southeast Asia, its beautiful landscapes, its friendly people, and its huge biological and cultural diversity. It’s a magical place.

But in 2013 there was a horrible haze from fires in nearby Sumatra. And this year it’s even worse. It makes me want to cry, thinking about how millions of people all over this region are being choked as the rain forest burns.

This part of the world has a dry season from May to October and then a wet season. In the dry season, Indonesian farmers slash down jungle growth, burn it, and plant crops. That is nothing new.

But now, palm oil plantations run by big companies do this on a massive scale. Jungles are disappearing at an astonishing rate. Some of this is illegal, but corrupt government officials are paid to look the other way. Whenever we buy palm oil—in soap, cookies, bread, margarine, detergents, and many other products—we become part of the problem.

This year the fires are worse. One reason is that we’re having an El Niño. That typically means more rain in California—which we desperately need. But it means less rain in Southeast Asia.

This summer it was very dry in Singapore. Then, in September, the haze started. We got used to rarely seeing the sun—only yellow-brown light filtering through the smoke. When it stinks outside, you try to stay indoors.

When I left on September 19th, the PSI index of air pollution had risen above 200, which is ‘very unhealthy’. Singapore had offered troops to help fight the fires, but Indonesia turned down the offer, saying they could handle the situation themselves. That was completely false: thousands of fires were burning out of control in Sumatra, Borneo and other Indonesian islands.

I believe the Indonesian government just didn’t want foreign troops on their land. Satellites could detect the many hot spots where fires were burning. But outrageously, the government refused to say who owned those lands.

A few days after I left, the PSI index in Singapore had shot above 300, which is ‘hazardous’. But in parts of Borneo the PSI had reached 1,986. The only name for that is hell.

By now Indonesia has accepted help from Singapore. Thanks to changing winds, the PSI in Singapore has been slowly dropping throughout October. In the last few days the rainy season has begun. Each time the rain clears the air, Singaporeans can see something beautiful and almost forgotten: a blue sky.

Rain is also helping in Borneo. But the hellish fires continue. There have been over 100,000 individual fires—mostly in Sumatra, Borneo and Papua. In many places, peat in the ground has caught on fire! It’s very hard to put out a peat fire.

If you care about the Earth, this is very disheartening. These fires have been putting over 15 million tons of carbon dioxide into the air per day – more than the whole US economy! And so far this year they’ve put out 1.5 billion tons of CO2. That’s more than Germany’s carbon emissions for the whole year—in fact, even more than Japan’s. How can we make progress on reducing carbon emissions with this going on?

For you and me, the first thing is to stop buying products with palm oil. The problem is largely one of government corruption driven by money from palm oil plantations. But the real heart of the problem lies in Indonesia. Luckily Widodo, the president of this country, may be part of the solution. But the solution will be difficult.

Widodo is Indonesia’s first president with a track record of efficient local governance in running two large cities. Strong action on the haze issue could help fulfill the promise of reform that motivated Indonesian voters to put him in office in October 2014.

The president has deployed thousands of firefighters and accepted international assistance. He has ordered a moratorium on new licenses to use peat land and ordered law enforcers to prosecute people and companies who clear land by burning forests.

“It must be stopped, we mustn’t allow our tropical rainforests to disappear because of monoculture plantations like oil palms,” Widodo said early in his administration.

Land recently burned and planted with palm trees is now under police investigation in Kalimantan [the Indonesian part of Borneo].

The problem of Indonesia’s illegal forest fires is so complex that it’s very hard to say exactly who is responsible for causing it.

Indonesia’s government has blamed both big palm oil companies and small freeholders. Poynton [executive director of the Forest Trust] says the culprits are often mid-sized companies with strong ties to local politicians. He describes them as lawless middlemen who pay local farmers to burn forests and plant oil palms, often on other companies’ concessions.

“There are these sort of low-level, Mafioso-type guys that basically say, ‘You get in there and clear the land, and I’ll then finance you to establish a palm oil plantation,'” he says.

The problem is exacerbated by ingrained government corruption, in which politicians grant land use permits for forests and peat lands to agribusiness in exchange for financial and political support.

“The disaster is not in the fires,” says independent Jakarta-based commentator Wimar Witoelar. “It’s in the way that past Indonesian governments have colluded with big palm oil businesses to make the peat lands a recipe for disaster.”

The quote is from here:

For how to avoid using palm oil, see for example:

• Lael Goodman, How many products with palm oil do I use in a day?

First, avoid processed foods. That’s smart for other reasons too.

Second, avoid stuff that contains stearic acid, sodium palmitate, sodium laureth sulfate, cetyl alcohol, glyceryl stearate and related compounds—various forms of artificial grease that are often made from palm oil. It takes work to avoid all this stuff, but at least be aware of it. These chemicals are not made in laboratories from pure carbon, hydrogen, oxygen and nitrogen! The raw ingredients often come from palm plantations, huge monocultures that are replacing the wonderful diversity of rainforest life.

For more nuanced suggestions, see the comments below. Right now I’m just so disgusted that I want to avoid palm oil.

For data on the carbon emissions of this and other fires, see:

1997 was the last really big El Niño.

This shows a man in Malaysia in September. Click on the pictures for more details. The picture at top shows a woman named Gaye Thavisin in Indonesia—perhaps in Kalimantan, the Indonesian part of Borneo, the third largest island in the world. Here is a bit of her story:

The Jungle River Cruise is run by Kalimantan Tour Destinations a foreign owned company set up by two women pioneering the introduction of ecotourism into a part of Central Kalimantan that to date has virtually no tourism.

Inspired by the untapped potential of Central Kalimantan’s mighty rivers, Gaye Thavisin and Lorna Dowson-Collins converted a traditional Kalimantan barge into a comfortable cruise boat with five double cabins, an inside sitting area and a upper viewing deck, bringing the first jungle cruises to the area.

Originally Lorna Dowson-Collins worked in Central Kalimantan with a local NGO on a sustainable livelihoods programme. The future livelihoods of the local people were under threat as logging left the land devastated with poor soils and no forest to fend from.

Kalimantan was teeming with the potential of her people and their fascinating culture, with beautiful forests of diverse flora and fauna, including the iconic orang-utan, and her mighty rivers providing access to these wonderful treasures.

An idea for a social enterprise emerged, which involved building a boat to journey guests to inaccessible places and provide comfortable accommodation.

Gaye Thavisin, an Australian expatriate, for 4 years operated an attractive, new hotel 36 km out of Palangkaraya in Kalimantan. Gaye was passionate about developing the tourism potential of Central Kalimantan and was also looking at the idea of boats. With her contract at the hotel coming to an end, the Jungle Cruise began to take shape!

## Information and Entropy in Biological Systems (Part 5)

30 May, 2015

John Harte of U. C. Berkeley spoke about the maximum entropy method as a method of predicting patterns in ecology. Annette Ostling of the University of Michigan spoke about some competing theories, such as the ‘neutral model’ of biodiversity—a theory that sounds much too simple to be right, yet fits the data surprisingly well!

We managed to get a video of Ostling’s talk, but not Harte’s. Luckily, you can see the slides of both. You can also see a summary of Harte’s book Maximum Entropy and Ecology:

• John Baez, Maximum entropy and ecology, Azimuth, 21 February 2013.

Here are his talk slides and abstract:

Abstract. Constrained maximization of information entropy (MaxEnt) yields least-biased probability distributions. In statistical physics, this powerful inference method yields classical statistical mechanics/thermodynamics under the constraints imposed by conservation laws. I apply MaxEnt to macroecology, the study of the distribution, abundance, and energetics of species in ecosystems. With constraints derived from ratios of ecological state variables, I show that MaxEnt yields realistic abundance distributions, species-area relationships, spatial aggregation patterns, and body-size distributions over a wide range of taxonomic groups, habitats and spatial scales. I conclude with a brief summary of some of the major opportunities at the frontier of MaxEnt-based macroecological theory.

Here is a video of Ostling’s talk, as well as her slides and some papers she recommended:

• Annette Ostling, The neutral theory of biodiversity and other competitors to maximum entropy.

Abstract: I am a bit of the odd man out in that I will not talk that much about information and entropy, but instead about neutral theory and niche theory in ecology. My interest in coming to this workshop is in part out of an interest in what greater insights we can get into neutral models and stochastic population dynamics in general using entropy and information theory.

I will present the niche and neutral theories of the maintenance of diversity of competing species in ecology, and explain the dynamics included in neutral models in ecology. I will also briefly explain how one can derive a species abundance distribution from neutral models. I will present the view that neutral models have the potential to serve as more process-based null models than previously used in ecology for detecting the signature of niches and habitat filtering. However, tests of neutral theory in ecology have not as of yet been as useful as tests of neutral theory in evolutionary biology, because they leave open the possibility that pattern is influenced by “demographic complexity” rather than niches. I will mention briefly some of the work I’ve been doing to try to construct better tests of neutral theory.

Finally I’ll mention some connections that have been made so far between predictions of entropy theory and predictions of neutral theory in ecology and evolution.

These papers present interesting relations between ecology and statistical mechanics. Check out the nice ‘analogy chart’ in the second one!

• M. G. Bowler, Species abundance distributions, statistical mechanics and the priors of MaxEnt, Theoretical Population Biology 92 (2014), 69–77.

Abstract. The methods of Maximum Entropy have been deployed for some years to address the problem of species abundance distributions. In this approach, it is important to identify the correct weighting factors, or priors, to be applied before maximising the entropy function subject to constraints. The forms of such priors depend not only on the exact problem but can also depend on the way it is set up; priors are determined by the underlying dynamics of the complex system under consideration. The problem is one of statistical mechanics and it is the properties of the system that yield the correct MaxEnt priors, appropriate to the way the problem is framed. Here I calculate, in several different ways, the species abundance distribution resulting when individuals in a community are born and die independently. In the usual formulation the prior distribution for the number of species over the number of individuals is 1/n; the problem can be reformulated in terms of the distribution of individuals over species classes, with a uniform prior. Results are obtained using master equations for the dynamics and separately through the combinatoric methods of elementary statistical mechanics; the MaxEnt priors then emerge a posteriori. The first object is to establish the log series species abundance distribution as the outcome of per capita guild dynamics. The second is to clarify the true nature and origin of priors in the language of MaxEnt. Finally, I consider how it may come about that the distribution is similar to log series in the event that filled niches dominate species abundance. For the general ecologist, there are two messages. First, that species abundance distributions are determined largely by population sorting through fractional processes (resulting in the 1/n factor) and secondly that useful information is likely to be found only in departures from the log series. For the MaxEnt practitioner, the message is that the prior with respect to which the entropy is to be maximised is determined by the nature of the problem and the way in which it is formulated.

• Guy Sella and Aaron E. Hirsh, The application of statistical physics to evolutionary biology, Proc. Nat. Acad. Sci. 102 (2005), 9541–9546.

A number of fundamental mathematical models of the evolutionary process exhibit dynamics that can be difficult to understand analytically. Here we show that a precise mathematical analogy can be drawn between certain evolutionary and thermodynamic systems, allowing application of the powerful machinery of statistical physics to analysis of a family of evolutionary models. Analytical results that follow directly from this approach include the steady-state distribution of fixed genotypes and the load in finite populations. The analogy with statistical physics also reveals that, contrary to a basic tenet of the nearly neutral theory of molecular evolution, the frequencies of adaptive and deleterious substitutions at steady state are equal. Finally, just as the free energy function quantitatively characterizes the balance between energy and entropy, a free fitness function provides an analytical expression for the balance between natural selection and stochastic drift.

## Biodiversity, Entropy and Thermodynamics

27 October, 2014

I’m giving a short 30-minute talk at a workshop on Biological and Bio-Inspired Information Theory at the Banff International Research Station.

I’ll say more about the workshop later, but here’s my talk, in PDF and video form:

Most of the people at this workshop study neurobiology and cell signalling, not evolutionary game theory or biodiversity. So, the talk is just a quick intro to some things we’ve seen before here. Starting from scratch, I derive the Lotka–Volterra equation describing how the distribution of organisms of different species changes with time. Then I use it to prove a version of the Second Law of Thermodynamics.

This law says that if there is a ‘dominant distribution’—a distribution of species whose mean fitness is at least as great as that of any population it finds itself amidst—then as time passes, the information any population has ‘left to learn’ always decreases!

Of course reality is more complicated, but this result is a good start.

This was proved by Siavash Shahshahani in 1979. For more, see:

• Lou Jost, Entropy and diversity.

• Marc Harper, The replicator equation as an inference dynamic.

• Marc Harper, Information geometry and evolutionary game theory.
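If you'd like to see the ‘information left to learn’ go down in a toy example, here is a minimal Python sketch. It rests on assumptions that are mine and not taken from the talk: I work with the fractions of each species and evolve them by the replicator equation using crude Euler steps, I use a made-up hawk-dove payoff matrix, and I measure the information left to learn as the relative entropy of the dominant distribution relative to the current one. For this particular matrix the 50/50 mixture is dominant in the sense above, which you can check by hand.

```python
import math

# Hypothetical hawk-dove payoffs: PAYOFF[i][j] is the payoff to type i meeting type j.
PAYOFF = [[-1.0, 2.0],
          [0.0, 1.0]]

# For this matrix the 50/50 mixture does at least as well as any population it meets.
DOMINANT = [0.5, 0.5]


def fitness(p):
    """Fitness of each type when the population fractions are p."""
    size = len(p)
    return [sum(PAYOFF[i][j] * p[j] for j in range(size)) for i in range(size)]


def replicator_step(p, dt):
    """One Euler step of dp_i/dt = p_i * (f_i - mean fitness)."""
    f = fitness(p)
    mean_f = sum(pi * fi for pi, fi in zip(p, f))
    new_p = [pi + dt * pi * (fi - mean_f) for pi, fi in zip(p, f)]
    total = sum(new_p)  # renormalize to absorb rounding error
    return [x / total for x in new_p]


def information_left(q, p):
    """Relative entropy of q with respect to p, in nats."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def simulate(p0, steps=2000, dt=0.01):
    """Return the information left to learn at each time step."""
    p = list(p0)
    history = [information_left(DOMINANT, p)]
    for _ in range(steps):
        p = replicator_step(p, dt)
        history.append(information_left(DOMINANT, p))
    return history


if __name__ == "__main__":
    history = simulate([0.1, 0.9])
    print("start:", history[0], "end:", history[-1])
```

Starting from 10% hawks, the printed quantity should shrink toward zero as the population approaches the dominant mixture. The step size is small enough here that the Euler scheme does not overshoot, but that is a property of this example, not a general guarantee.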

## Life’s Struggle to Survive

19 December, 2013

Here’s the talk I gave at the SETI Institute:

When pondering the number of extraterrestrial civilizations, it is worth noting that even after it got started, the success of life on Earth was not a foregone conclusion. In this talk, I recount some thrilling episodes from the history of our planet, some well-documented but others merely theorized: our collision with the planet Theia, the oxygen catastrophe, the snowball Earth events, the Permian-Triassic mass extinction event, the asteroid that hit Chicxulub, and more, including the massive environmental changes we are causing now. All of these hold lessons for what may happen on other planets!

To watch the talk, click on the video above.

Here’s a mistake in my talk that doesn’t appear in the slides: I suggested that Theia started at the Lagrange point in Earth’s orbit. After my talk, an expert said that at that time, the Solar System had lots of objects with orbits of high eccentricity, and Theia was probably one of these. He said the Lagrange point theory is an idiosyncratic theory, not widely accepted, that somehow found its way onto Wikipedia.

Another issue was brought up in the questions. In a paper in Science, Sherwood and Huber argued that:

Any exceedence of 35 °C for extended periods should induce hyperthermia in humans and other mammals, as dissipation of metabolic heat becomes impossible. While this never happens now, it would begin to occur with global-mean warming of about 7 °C, calling the habitability of some regions into question. With 11-12 °C warming, such regions would spread to encompass the majority of the human population as currently distributed. Eventual warmings of 12 °C are possible from fossil fuel burning.

However, the Paleocene-Eocene Thermal Maximum seems to have been even hotter:

So, the question is: where did mammals live during this period, which mammals went extinct, if any, and does the survival of other mammals call into question Sherwood and Huber’s conclusion?

## Monarch Butterflies

25 November, 2013

Have you ever seen one of these? It’s a Monarch Butterfly. Every spring, millions fly from Mexico and southern California to other parts of the US and southern Canada. And every autumn, they fly back. On the first of November, called the Day of the Dead, people celebrate the return of the monarchs to the mountainous fir forests of Central Mexico.

But their numbers are dropping. In 1997, there were 150 million. Last year there were only 60 million. One problem is the gradual sterilization of American farmlands thanks to powerful herbicides like Roundup. Monarch butterfly larvae eat a plant called milkweed. But the amount of this plant in Iowa, for example, has dropped between 60% and 90% over the last decade.

And this year was much worse for the monarchs. They came late to Mexico… and I think only 3 million have been seen so far! That’s a stunning decrease!

Some blame the intense drought that hit the US in recent years—the sort of drought we can expect to become more frequent as global warming proceeds.

Earlier this year, Michael Risnit wrote this in USA Today:

Illegal logging in the Mexican forests where they spend the winter, new climate patterns and the disappearance of milkweed—the only plant on which monarchs lay their eggs and on which their caterpillars feed—are being blamed for their shrinking numbers.

Brooke Beebe, former director of the Native Plant Center at Westchester Community College in Valhalla, N.Y., collects monarch eggs, raises them from caterpillar to butterfly and releases them.

“I do that when they’re here. They’re not here,” she said.

The alarm over disappearing monarchs intensified this spring when conservation organizations reported that the amount of Mexican forest the butterflies occupied was at its lowest in 20 years. The World Wildlife Fund, in partnership with a Mexican wireless company and Mexico’s National Commission of Protected Areas, found nine hibernating colonies occupied almost 3 acres during the 2012-13 winter, a 59% decrease from the previous winter.

Because the insects can’t be counted individually, the colonies’ total size is used. Almost 20 years ago, the colonies covered about 45 acres. A couple of acres contains millions of monarchs.

“The monarch population is pretty strong, except it’s not as strong as it used to be and we find out it keeps getting smaller and smaller,” said Travis Brady, the education director at the Greenburgh Nature Center here.

Monarchs arrived at the nature center later this year and in fewer numbers, Brady said.

The nature center’s butterfly house this summer was aflutter with red admirals, giant swallowtails, painted ladies and monarchs, among others. But the last were difficult to obtain because collectors supplying the center had trouble finding monarch eggs in the wild, he said.

No one is suggesting monarchs will become extinct. The concern is whether the annual migration will remain sustainable, said Jeffrey Glassberg, the North American Butterfly Association’s president.

The record low shouldn’t set off a panic, said Marianna T. Wright, executive director of the National Butterfly Center in Texas, a project of the butterfly association.

“It should certainly get some attention,” she said. “I do think the disappearance of milkweed nationwide needs to be addressed. If you want to have monarchs, you have to have milkweed.”

Milkweed is often not part of suburban landscape, succumbing to lawn mowers and weed whackers, monarch advocates point out. Without it, monarch eggs aren’t laid and monarch caterpillars can’t feed and develop into winged adults.

“Many people know milkweed, and many people like it,” said Brady at the nature center. “And a lot of people actively try to destroy it. The health of the monarch population is solely dependent on the milkweed plant.”

The widespread use of herbicide-resistant corn and soybeans, which has resulted in the loss of more than 80 million acres of monarch habitat in recent years, also threatens the plant, according to the website Monarch Watch. In spraying fields to eradicate unwanted plants, Midwest farmers also eliminate butterflies’ habitat.

The 2012 drought and wildfires in Texas also made butterfly life difficult. All monarchs heading to or from the eastern two-thirds of the country pass through the state.

So—check out Monarch Watch! Plant some milkweed and make your yard insect-friendly in other ways… like mine!

I may seem like a math nerd, but I’m out there every weekend gardening. My wife Lisa is the real driving force behind this operation, but I’ve learned to love working with plants, soil, and compost. The best thing we ever did is tear out the lawn. Lawns are boring, let native plants flourish! Even if you don’t like insects, birds eat them, and you’ve gotta like birds. Let the beauty of nature start right where you live.

## Maximum Entropy and Ecology

21 February, 2013

I already talked about John Harte’s book on how to stop global warming. Since I’m trying to apply information theory and thermodynamics to ecology, I was also interested in this book of his:

John Harte, Maximum Entropy and Ecology, Oxford U. Press, Oxford, 2011.

There’s a lot in this book, and I haven’t absorbed it all, but let me try to briefly summarize his maximum entropy theory of ecology. This aims to be “a comprehensive, parsimonious, and testable theory of the distribution, abundance, and energetics of species across spatial scales”. One great thing is that he makes quantitative predictions using this theory and compares them to a lot of real-world data. But let me just tell you about the theory.

It’s heavily based on the principle of maximum entropy (MaxEnt for short), and there are two parts:

Two MaxEnt calculations are at the core of the theory: the first yields all the metrics that describe abundance and energy distributions, and the second describes the spatial scaling properties of species’ distributions.

### Abundance and energy distributions

The first part of Harte’s theory is all about a conditional probability distribution

$R(n,\epsilon | S_0, N_0, E_0)$

which he calls the ecosystem structure function. Here:

$S_0$: the total number of species under consideration in some area.

$N_0$: the total number of individuals under consideration in that area.

$E_0$: the total rate of metabolic energy consumption of all these individuals.

Given this,

$R(n,\epsilon | S_0, N_0, E_0) \, d \epsilon$

is the probability that given $S_0, N_0, E_0,$ if a species is picked from the collection of species, then it has $n$ individuals, and if an individual is picked at random from that species, then its rate of metabolic energy consumption is in the interval $(\epsilon, \epsilon + d \epsilon).$

Here of course $d \epsilon$ is ‘infinitesimal’, meaning that we take a limit where it goes to zero to make this idea precise (if we’re doing analytical work) or take it to be very small (if we’re estimating $R$ from data).

I believe that when we ‘pick a species’ we’re treating them all as equally probable, not weighting them according to their number of individuals.

Clearly $R$ obeys some constraints. First, since it’s a probability distribution, it obeys the normalization condition:

$\displaystyle{ \sum_n \int d \epsilon \; R(n,\epsilon | S_0, N_0, E_0) = 1 }$

Second, since the average number of individuals per species is $N_0/S_0,$ we have:

$\displaystyle{ \sum_n \int d \epsilon \; n R(n,\epsilon | S_0, N_0, E_0) = N_0 / S_0 }$

Third, since the average over species of the total rate of metabolic energy consumption of individuals within the species is $E_0/ S_0,$ we have:

$\displaystyle{ \sum_n \int d \epsilon \; n \epsilon R(n,\epsilon | S_0, N_0, E_0) = E_0 / S_0 }$

Harte’s theory is that $R$ maximizes entropy subject to these three constraints. Here entropy is defined by

$\displaystyle{ - \sum_n \int d \epsilon \; R(n,\epsilon | S_0, N_0, E_0) \ln(R(n,\epsilon | S_0, N_0, E_0)) }$
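It may help to see what form this forces on $R.$ What follows is the standard Lagrange multiplier argument, not something copied from the book, and the names $\lambda_1, \lambda_2$ and $Z$ are my own notation. Introduce one multiplier for each constraint and set the variation of the entropy minus the constraint terms to zero. This gives

$\displaystyle{ -\ln R(n,\epsilon | S_0, N_0, E_0) - 1 - \lambda_0 - \lambda_1 n - \lambda_2 n \epsilon = 0 }$

so that

$\displaystyle{ R(n,\epsilon | S_0, N_0, E_0) = \frac{1}{Z} e^{-\lambda_1 n - \lambda_2 n \epsilon} }$

where the normalization condition fixes

$\displaystyle{ Z = \sum_n \int d \epsilon \; e^{-\lambda_1 n - \lambda_2 n \epsilon} }$

and $\lambda_1, \lambda_2$ have to be chosen so that the second and third constraints hold. These depend on $S_0, N_0$ and $E_0$ only through the ratios $N_0/S_0$ and $E_0/S_0,$ apart from whatever limits we put on the sum and integral.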

Harte uses this theory to calculate $R,$ and tests the results against data from about 20 ecosystems. For example, he predicts the abundance of species as a function of their rank, with rank 1 being the most abundant, rank 2 being the second most abundant, and so on. And he gets results like this:

The data here are from:

• Green, Harte, and Ostling’s work on a serpentine grassland,

• Luquillo’s work on a 10.24-hectare tropical forest, and

• Cocoli’s work on a 2-hectare wet tropical forest.

The fit looks good to me… but I should emphasize that I haven’t had time to study these matters in detail. For more, you can read this paper, at least if your institution subscribes to this journal:

• J. Harte, T. Zillio, E. Conlisk and A. Smith, Maximum entropy and the state-variable approach to macroecology, Ecology 89 (2008), 2700–2711.

### Spatial abundance distribution

The second part of Harte’s theory is all about a conditional probability distribution

$\Pi(n | A, n_0, A_0)$

This is the probability that $n$ individuals of a species are found in a region of area $A$ given that it has $n_0$ individuals in a larger region of area $A_0.$

$\Pi$ obeys two constraints. First, since it’s a probability distribution, it obeys the normalization condition:

$\displaystyle{ \sum_n \Pi(n | A, n_0, A_0) = 1 }$

Second, since the mean value of $n$ across regions of area $A$ equals $n_0 A/A_0,$ we have

$\displaystyle{ \sum_n n \Pi(n | A, n_0, A_0) = n_0 A/A_0 }$

Harte’s theory is that $\Pi$ maximizes entropy subject to these two constraints. Here entropy is defined by

$\displaystyle{- \sum_n \Pi(n | A, n_0, A_0)\ln(\Pi(n | A, n_0, A_0)) }$
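With only a normalization condition and a mean constraint, maximizing entropy gives a distribution proportional to $e^{-\lambda n}$ for some number $\lambda.$ Here is a rough Python sketch that finds $\lambda$ by bisection. I'm assuming $n$ runs from $0$ to $n_0,$ and the function names are invented for illustration, not taken from Harte's book.

```python
import math


def truncated_geometric(lam, n0):
    """Return Pi(n) proportional to exp(-lam * n) for n = 0, ..., n0."""
    exponents = [-lam * n for n in range(n0 + 1)]
    shift = max(exponents)  # subtract the largest exponent to avoid overflow
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def mean_count(dist):
    """Mean of n under a distribution given as a list indexed by n."""
    return sum(n * p for n, p in enumerate(dist))


def maxent_spatial(n0, area_fraction, tol=1e-12):
    """Maximum entropy Pi(n | A, n0, A0) with mean n0 * A / A0.

    area_fraction is A / A0 and should lie strictly between 0 and 1.
    """
    target = n0 * area_fraction
    lo, hi = -50.0, 50.0  # the mean decreases as lam increases
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_count(truncated_geometric(mid, n0)) > target:
            lo = mid  # mean too large, so lam must be bigger
        else:
            hi = mid
    return truncated_geometric(0.5 * (lo + hi), n0)


if __name__ == "__main__":
    dist = maxent_spatial(n0=20, area_fraction=0.25)
    print("mean:", mean_count(dist))
    print("probability of finding none:", dist[0])
```

One thing this makes easy to see: when $A = A_0/2$ the mean constraint is met with $\lambda = 0,$ so every value of $n$ from $0$ to $n_0$ comes out equally likely.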

Harte explains two approaches to use this idea to derive ‘scaling laws’ for how $n$ varies with $A$. And again, he compares his predictions to real-world data, and gets results that look good to my (amateur, hasty) eye!

I hope sometime I can dig deeper into this subject. Do you have any ideas, or knowledge about this stuff?

## The Mathematics of Biodiversity (Part 8)

14 July, 2012

Last time I mentioned that estimating entropy from real-world data is important not just for measuring biodiversity, but also for another area of biology: neurobiology!

When you look at something, neurons in your eye start firing. But how, exactly, is their firing related to what you see? Questions like this are hard! Answering them—‘cracking the neural code’—is a big challenge. To make progress, neuroscientists are using information theory. But as I explained last time, estimating information from experimental data is tricky.

Romain Brasselet, now a postdoc at the Max Planck Institute for Biological Cybernetics at Tübingen, is working on these topics. He sent me a nice email explaining this area.

This is a bit of a digression, but the Mathematics of Biodiversity program in Barcelona has been extraordinarily multidisciplinary, with category theorists rubbing shoulders with ecologists, immunologists and geneticists. One of the common themes is entropy and its role in biology, so I think it’s worth posting Romain’s comments here. This is what he has to say…

### Information in neurobiology

I will try to explain why neurobiologists are today very interested in reliable estimates of entropy/information and what are the techniques we use to obtain them.

The activity of sensory as well as more central neurons is known to be modulated by external stimulations. In 1926, in a seminal paper, Adrian observed that neurons in the sciatic nerve of the frog fire action potentials (or spikes) when some muscle in the hindlimb is stretched. In addition, he observed that the frequency of the spikes increases with the amplitude of the stretching.

• E.D. Adrian, The impulses produced by sensory nerve endings. (1926).

For another very nice example, in 1962, Hubel and Wiesel found neurons in the cat visual cortex whose activity depends on the orientation of a visual stimulus, a simple black line over white background: some neurons fire preferentially for one orientation of the line (Hubel and Wiesel were awarded the 1981 Nobel Prize in Physiology or Medicine for their work). This incidentally led to the concept of “receptive field” which is of tremendous importance in neurobiology—but though it’s fascinating, it’s a different topic.

Good, we are now able to define what makes a neuron tick. The problem is that neural activity is often very “noisy”: when the exact same stimulus is presented many times, the responses appear to be very different from trial to trial. Even careful observation cannot necessarily reveal correlations between the stimulations and the neural activity. So we would like a measure capable of capturing the statistical dependencies between the stimulation and the response of the neuron. That would tell us whether we can say something about the stimulation just by observing the response of a neuron, which is essentially the task of the brain. In particular, we want a fundamental measure that does not rely on any assumption about the functioning of the brain. Information theory provides the tools to do this, which is why we like to use it: we often try to measure the mutual information between stimuli and responses.
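To make ‘mutual information between stimuli and responses’ concrete, here is a small Python sketch of the naive ‘plug-in’ estimate: count how often each stimulus and response occur together, treat the observed frequencies as probabilities, and apply the usual formula. The table layout and function name are my own choices for illustration, and this is exactly the estimator that becomes unreliable when trials are scarce.

```python
import math


def mutual_information(counts):
    """Plug-in mutual information in bits from a table counts[stimulus][response]."""
    total = sum(sum(row) for row in counts)
    n_responses = len(counts[0])
    # Marginal frequencies of stimuli and of responses.
    p_stim = [sum(row) / total for row in counts]
    p_resp = [sum(row[r] for row in counts) / total for r in range(n_responses)]
    info = 0.0
    for s, row in enumerate(counts):
        for r, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing
                p_joint = count / total
                info += p_joint * math.log2(p_joint / (p_stim[s] * p_resp[r]))
    return info


if __name__ == "__main__":
    # Two stimuli, two responses, 50 trials in total.
    print(mutual_information([[25, 0], [0, 25]]))    # response determined by stimulus
    print(mutual_information([[13, 12], [12, 13]]))  # response nearly unrelated to stimulus
```

In the first table the response tells you the stimulus exactly, so you get one full bit. In the second the two are almost independent, and the estimate is close to zero.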

To my knowledge, the first paper using information theory in neuroscience was by MacKay and McCulloch in 1952:

• Donald M. MacKay and Warren S. McCulloch, The limiting information capacity of a neuronal link, Bulletin of Mathematical Biophysics 14 (1952), 127–135.

But information theory was not used in neuroscience much until the early 90’s. It started again with a paper by Bialek et al. in 1991:

• W. Bialek, F. Rieke, R. R. de Ruyter van Steveninck and D. Warland, Reading a neural code, Science 252 (1991), 1854–1857.

However, when applying information-theoretic methods to biological data, we often have a limited sampling of the neural response: we are usually very happy when we have 50 trials for a given stimulus. Why is this limited sample a problem?

During the major part of the 20th century, following Adrian’s finding, the paradigm for the neural code was the frequency of the spikes or, equivalently, the number of spikes in a window of time. But in the early 90’s, it was observed that the exact timing of spikes is (in some cases) reliable across trials. So instead of considering the neural response as a single number (the number of spikes), the temporal patterns of spikes started to be taken into account. But time is continuous, so to be able to do actual computations, time was discretized and a neural response became a binary string.

Now, if you consider relevant time-scales, say, a 100 millisecond time window with a 1 millisecond bin with a firing frequency of about 50 per second, then your response space is huge and the estimates of information with only 50 trials are not reliable anymore. That’s why a lot of effort has gone into overcoming the limited sampling bias.
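Just how huge is that response space? Here is a back-of-the-envelope Python sketch using the numbers above. The function name is made up, and I'm making a simplifying assumption: as a stand-in for the ‘typical’ responses I only count binary strings with exactly the expected number of spikes.

```python
import math


def response_space(window_ms=100, bin_ms=1, rate_hz=50):
    """Count the possible binary responses for a given window, bin and firing rate."""
    bins = window_ms // bin_ms                       # length of the binary string
    expected_spikes = rate_hz * window_ms / 1000     # mean number of spikes in the window
    all_words = 2 ** bins                            # every possible binary string
    # Strings with exactly the expected number of spikes (a crude 'typical set').
    typical_words = math.comb(bins, round(expected_spikes))
    return bins, expected_spikes, all_words, typical_words


if __name__ == "__main__":
    bins, spikes, all_words, typical_words = response_space()
    print("bins:", bins, "expected spikes:", spikes)
    print("all binary words:", all_words)
    print("words with the expected spike count:", typical_words)
    print("trials available:", 50)
```

Even the restricted count is in the tens of millions, so 50 trials cannot come close to sampling the distribution of responses.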

Now, getting at the techniques developed in this field, John already mentioned the work by Liam Paninski, but here are other very interesting references:

• Stefano Panzeri and Alessandro Treves, Analytical estimates of limited sampling biases in different information measures, Network: Computation in Neural Systems 7 (1996), 87–107.

They computed the first-order bias of the information (related to the Miller–Madow correction) and then used a Bayesian technique to estimate the number of responses not included in the sample but that would be in an infinite sample (a goal similar to that of Good’s rule of thumb).
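For readers who haven't met it, the Miller–Madow correction mentioned here is easy to state for a plain entropy estimate: the plug-in estimate is too low by roughly $(m-1)/2N$ nats, where $m$ is the number of distinct responses actually seen and $N$ is the number of samples, so you add that amount back. Here is a minimal Python sketch. It illustrates only this textbook correction, not the Bayesian method of Panzeri and Treves, and the function names are mine.

```python
import math
from collections import Counter


def plugin_entropy(samples):
    """Naive entropy estimate in nats, using observed frequencies as probabilities."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def miller_madow_entropy(samples):
    """Plug-in entropy plus the first-order bias correction (m - 1) / (2 N)."""
    observed_bins = len(set(samples))  # m: number of distinct responses seen
    return plugin_entropy(samples) + (observed_bins - 1) / (2 * len(samples))


if __name__ == "__main__":
    responses = ["0010", "0100", "0010", "1000", "0100", "0010"]
    print("plug-in:", plugin_entropy(responses))
    print("Miller-Madow:", miller_madow_entropy(responses))
```

The correction shrinks like $1/N,$ so it matters most in exactly the small-sample regime described above.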

• S.P. Strong, R. Koberle, R.R. de Ruyter van Steveninck, and W. Bialek, Entropy and information in neural spike trains, Phys. Rev. Lett. 80 (1998), 197–200.

The entropy (or if you prefer, information) estimate can be expanded in a power series in $1/N$ (where $N$ is the sample size) around the true value. By computing the estimate for various values of $N$ and fitting it with a parabola in $1/N$, it is possible to estimate the value of the entropy as $N \rightarrow \infty.$
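Here is a minimal Python sketch of that extrapolation step. It assumes you already have entropy estimates at three different sample sizes, and it takes the fitting variable to be $1/N,$ so that the limit $N \rightarrow \infty$ is the value of the parabola at zero. With exactly three points the parabola passes through all of them, so I use Lagrange interpolation in place of a least-squares fit. The function name and the numbers are made up for illustration.

```python
def extrapolate_to_infinite_sample(points):
    """Value at 1/N = 0 of the parabola through three (N, estimate) pairs."""
    if len(points) != 3:
        raise ValueError("this sketch needs exactly three (N, estimate) pairs")
    xs = [1.0 / n for n, _ in points]   # work in the variable x = 1/N
    ys = [estimate for _, estimate in points]
    intercept = 0.0
    for i in range(3):
        # Lagrange basis polynomial for point i, evaluated at x = 0.
        weight = 1.0
        for j in range(3):
            if j != i:
                weight *= (0.0 - xs[j]) / (xs[i] - xs[j])
        intercept += ys[i] * weight
    return intercept


if __name__ == "__main__":
    # Synthetic estimates that follow 2.0 - 3/N + 5/N^2 exactly.
    points = [(n, 2.0 - 3.0 / n + 5.0 / n ** 2) for n in (50, 100, 200)]
    print(extrapolate_to_infinite_sample(points))
```

On real data the estimates at each $N$ are noisy, so one would average over many subsamples and fit more than three points. This sketch only shows the idea.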

These approaches are also well-known:

• Ilya Nemenman, Fariel Shafee and William Bialek, Entropy and inference, revisited, 2002.

• Alexander Kraskov, Harald Stögbauer and Peter Grassberger, Estimating mutual information, Phys. Rev. E. 69 (2004), 066138.

Actually, Stefano Panzeri has quite a few impressive papers about this problem, and recently with colleagues he has made public a free Matlab toolbox for information theory (www.ibtb.org) implementing various correction methods.

Finally, the work by Jonathan Victor is worth mentioning, since he provided (to my knowledge again) the first estimate of mutual information using geometry. This is of particular interest with respect to the work by Christina Cobbold and Tom Leinster on measures of biodiversity that take the distance between species into account:

• J. D. Victor and K. P. Purpura, Nature and precision of temporal coding in visual cortex: a metric-space analysis, Journal of Neurophysiology 76 (1996), 1310–1326.

He introduced a distance between sequences of spikes and from this, derived a lower bound on mutual information.

• Jonathan D. Victor, Binless strategies for estimation of information from neural data, Phys. Rev. E. 66 (2002), 051903.

Taking inspiration from work by Kozachenko and Leonenko, he obtained an estimate of the information based on the distances between the closest responses.

Without getting too technical, that’s what we do in neuroscience about the limited sampling bias. The incentive is that obtaining reliable estimates is crucial to understand the ‘neural code’, the holy grail of computational neuroscientists.