Today the conference part of this program is starting:
• Research Program on the Mathematics of Biodiversity, June-July 2012, Centre de Recerca Matemàtica, Barcelona, Spain. Organized by Ben Allen, Silvia Cuadrado, Tom Leinster, Richard Reeve and John Woolliams.
Lou Jost kicked off the proceedings with an impassioned call to think harder about fundamental concepts.
Then Tom Leinster gave an introduction to some of these concepts, and Lou explained how they show up in ecology, genetics, economics and physics.
Suppose we have $n$ different species on an island. Suppose a fraction $p_i$ of the organisms belong to the $i$th species. So,
$$ \sum_{i=1}^n p_i = 1 $$
and mathematically we can treat these numbers as probabilities.
People have many ways to compute the ‘biodiversity’ from these numbers. Some of these can be wildly misleading when applied incorrectly, and this has led to shocking errors. For example, in genetics, a commonly used formula for determining when plants or animals on a bunch of islands will split into separate species is completely wrong.
In fact, if we’re not careful, some measures of biodiversity can fool us into thinking we’re saving most of the biodiversity when we’re actually losing almost all of it!
One good example involves measures of similarity between tropical butterflies in the canopy (the top of the forest) and the understory (the bottom). According to Lou Jost, some published studies say the similarity is about 95%. That sounds like the two communities are almost the same. However, almost no butterflies living in the canopy live in the understory, and vice versa! The problem is that mathematics is being used inappropriately.
Here are four famous measures of biodiversity:
• Species richness. This is just the number of species:
$$ n $$
• Shannon entropy. This is the expected amount of information you gain when someone tells you which species an organism belongs to:
$$ - \sum_{i=1}^n p_i \ln(p_i) $$
• The inverse Simpson index. This is the reciprocal of the probability that two randomly chosen organisms belong to the same species:
$$ 1 \Big/ \sum_{i=1}^n p_i^2 $$
The probability that two organisms belong to the same species is called the Simpson index:
$$ \sum_{i=1}^n p_i^2 $$
This is used in economics as a measure of the concentration of wealth, where $p_i$ is the fraction of wealth owned by the $i$th individual. Be careful: there’s a lot of different jargon in different fields, so it’s easy to get confused at first! For example, the probability that two organisms belong to different species is often called the Gini–Simpson index:
$$ 1 - \sum_{i=1}^n p_i^2 $$
• The Berger–Parker index. This is the fraction of organisms that belong to the most common species:
$$ \max_i \, p_i $$
So, unlike the other main ones I’ve listed, this quantity tends to go down when biodiversity goes up. To fix this we could take its reciprocal, as we did with the Simpson index.
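
To make the four measures concrete, here is a minimal Python sketch that computes them, along with the Simpson and Gini–Simpson variants, for one made-up community. The function names and the abundances in `p` are my own illustrative choices, not anything from the talks.

```python
import math

def species_richness(p):
    # Number of species actually present (nonzero relative abundance).
    return sum(1 for x in p if x > 0)

def shannon_entropy(p):
    # Expected information in nats; species with p_i = 0 contribute nothing.
    return -sum(x * math.log(x) for x in p if x > 0)

def simpson_index(p):
    # Probability that two randomly chosen organisms share a species.
    return sum(x * x for x in p)

def berger_parker_index(p):
    # Fraction of organisms in the most common species.
    return max(p)

# Hypothetical relative abundances for four species (they sum to 1).
p = [0.5, 0.25, 0.125, 0.125]

print(species_richness(p))      # 4
print(shannon_entropy(p))       # about 1.213 nats
print(simpson_index(p))         # 0.34375
print(1 / simpson_index(p))     # inverse Simpson index, about 2.909
print(1 - simpson_index(p))     # Gini-Simpson index, 0.65625
print(berger_parker_index(p))   # 0.5
```

Note that the Simpson and Berger–Parker indices are the two that shrink as the community becomes more even, which is why their reciprocals are the more natural diversity measures.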
What a mess, eh? But here’s some good news: all these quantities are functions of a single quantity, the Rényi entropy:
$$ H_q(p) = \frac{1}{1-q} \ln \sum_{i=1}^n p_i^q $$
for various values of the parameter $q$.
I’ve written about the Rényi entropies and their role in thermodynamics before on this blog. I’ll also talk about them later in this conference, and I’ll show you my slides. So, I won’t repeat that story here. Suffice it to say that Rényi entropies are fascinating but still a bit mysterious to me.
But one of Lou Jost’s main points is that we can make bad mistakes if we work with Rényi entropies when we should be working with their exponentials, which are called Hill numbers and denoted by a $D$, for ‘diversity’:
$$ {}^q D(p) = \left( \sum_{i=1}^n p_i^q \right)^{\frac{1}{1-q}} $$
These were introduced by M. O. Hill in 1973. One reason they’re good is that they are effective numbers. This means that if all the species are equally common, the Hill number equals the number of species, regardless of $q$:
$$ p_i = \frac{1}{n} \text{ for all } i \quad \Longrightarrow \quad {}^q D(p) = n $$
So, they’re a way of measuring an ‘effective’ number of species in situations where species are not all equally common.
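
Here is a minimal Python sketch of the Hill number, assuming $q \ge 0$ and handling the two awkward orders ($q = 1$ and $q = +\infty$) by their limiting formulas. The name `hill_number` and the choice of 8 species are mine, purely for illustration. The loop checks the effective-number property on a uniform distribution.

```python
import math

def hill_number(p, q):
    """Hill number of order q >= 0 for relative abundances p (a sketch)."""
    present = [x for x in p if x > 0]   # absent species never count
    if q == 1:
        # q = 1 is a removable singularity: use exp(Shannon entropy).
        return math.exp(-sum(x * math.log(x) for x in present))
    if math.isinf(q):
        # q = +infinity: reciprocal of the Berger-Parker index.
        return 1 / max(present)
    return sum(x ** q for x in present) ** (1 / (1 - q))

# Effective-number check: n equally common species give n for every q.
n = 8
uniform = [1 / n] * n
for q in (0, 0.5, 1, 2, 3, math.inf):
    print(q, hill_number(uniform, q))   # each value is 8, up to rounding
```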
A closely related fact is that the Hill numbers obey the replication principle. This means that if we have probability distributions on two finite sets, each with Hill number $X$ for some choice of $q$, and we combine them with equal weights to get a probability distribution on the disjoint union of those sets, the resulting distribution has Hill number $2X$.
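
The replication principle is easy to check numerically. In this sketch the two communities are hypothetical: the second uses the same abundances as the first in a different order, which guarantees the two have the same Hill number for every $q$, and we treat them as having no species in common.

```python
def hill_number(p, q):
    # General formula, valid for q >= 0 with q != 1.
    return sum(x ** q for x in p if x > 0) ** (1 / (1 - q))

# Two hypothetical communities with no species in common.
a = [0.5, 0.3, 0.2]
b = [0.2, 0.5, 0.3]   # same abundances, reordered, so same Hill number

# Combine with equal weights: each community supplies half the organisms.
combined = [x / 2 for x in a] + [x / 2 for x in b]

for q in (0, 0.5, 2, 3):
    # The Hill number of the combined community is twice that of either part.
    print(q, hill_number(a, q), hill_number(combined, q))
```

The reason is visible in the formula: halving every probability multiplies $\sum_i p_i^q$ by $2^{-q}$, doubling the number of terms multiplies it by 2, and raising $2^{1-q}$ to the power $1/(1-q)$ gives exactly 2.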
Another good fact is that the Hill numbers are as large as possible when all the probabilities are equal. They’re as small as possible, namely 1, when one of the $p_i$ equals 1 and the rest are zero.
Let’s see how all the measures of biodiversity I listed are either Hill numbers or can easily be converted to Hill numbers. We’ll also see that at $q = 0$ the Hill number treats all species that are present in an equal way, regardless of their abundance. As $q$ increases, it counts more abundant species more heavily, since we’re raising the probabilities $p_i$ to a bigger power. And when $q = \infty$, we only care about the most abundant species: none of the others matter at all!
• The species richness is the limit of the Hill numbers as $q \to 0$ from above:
$$ \lim_{q \to 0^+} {}^q D(p) = n $$
So, we can just call this ${}^0 D(p)$.
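• The exponential of the Shannon entropy is the limit of the Hill numbers as $q \to 1$:
$$ \lim_{q \to 1} {}^q D(p) = \exp\left( - \sum_{i=1}^n p_i \ln(p_i) \right) $$
So, we can just call this ${}^1 D(p)$.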
• The inverse Simpson index is the Hill number at $q = 2$:
$$ {}^2 D(p) = 1 \Big/ \sum_{i=1}^n p_i^2 $$
• The reciprocal of the Berger–Parker index is the limit of Hill numbers as $q \to +\infty$:
$$ \lim_{q \to +\infty} {}^q D(p) = 1 \Big/ \max_i \, p_i $$
so we can call this quantity ${}^\infty D(p)$.
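
These four special cases can be checked numerically. The sketch below evaluates the general formula at values of $q$ close to each limit and prints the result next to the corresponding classical measure. The abundances and the particular values of $q$ ($10^{-9}$, $1 + 10^{-6}$, 200) are arbitrary choices of mine.

```python
import math

def hill_number(p, q):
    # General formula, valid for q >= 0 with q != 1.
    return sum(x ** q for x in p if x > 0) ** (1 / (1 - q))

# Hypothetical relative abundances for four species, all present.
p = [0.5, 0.25, 0.125, 0.125]

richness = len(p)
exp_shannon = math.exp(-sum(x * math.log(x) for x in p))
inverse_simpson = 1 / sum(x * x for x in p)
inverse_berger_parker = 1 / max(p)

print(hill_number(p, 1e-9), richness)               # q -> 0 from above
print(hill_number(p, 1 + 1e-6), exp_shannon)        # q -> 1
print(hill_number(p, 2), inverse_simpson)           # q = 2 exactly
print(hill_number(p, 200), inverse_berger_parker)   # large q
```

The last pair agrees only roughly at $q = 200$, since convergence to the $q \to +\infty$ limit is slow.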
These facts mean that understanding Hill numbers will help us understand lots of measures of biodiversity! And the good properties of Hill numbers will help us avoid dangerous mistakes.
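
To see how working with an entropy instead of its exponential can mislead, consider a purely hypothetical scenario: 10,000 equally common species, of which only 100 survive, still equally common. The numbers are invented for illustration. This sketch compares the fraction of Shannon entropy that remains with the fraction of the Hill number at $q = 1$ that remains.

```python
import math

def shannon_entropy(p):
    # Shannon entropy in nats.
    return -sum(x * math.log(x) for x in p if x > 0)

# Hypothetical scenario: 10,000 equally common species, only 100 survive.
before = [1 / 10_000] * 10_000
after = [1 / 100] * 100

h_before, h_after = shannon_entropy(before), shannon_entropy(after)
d_before, d_after = math.exp(h_before), math.exp(h_after)   # Hill numbers at q = 1

print(h_after / h_before)   # about 0.5: entropy suggests half the diversity is left
print(d_after / d_before)   # about 0.01: the effective number of species fell by 99%
```

The entropy is logarithmic in the number of equally common species, so ratios of entropies say very little about what fraction of the species is left.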
For mathematicians, a good challenge is to find theorems uniquely characterizing the Hill numbers… preferably with assumptions that biologists will accept as plausible facts about ‘diversity’. Some theorems like this already exist for specific choices of $q$, but it will be better to characterize the function ${}^q D$ for all values of $q$ in one blow. Tom Leinster is working on such a theorem now.
Another important task is to generalize Hill numbers to take into account things like:
• ‘distances’ between species, measured either genetically, phylogenetically or functionally,
• ‘values’ for species, measured either economically or any other way.
There’s a lot of work on this, and many of the talks at this conference will discuss these generalizations.