Here’s how Fisher stated his fundamental theorem:
The rate of increase of fitness of any species is equal to the genetic variance in fitness.
But clearly this is only going to be true under some conditions!
A lot of early criticism of Fisher’s fundamental theorem centered on the fact that the fitness of a species can vary due to changing external conditions. For example: suppose the Sun goes supernova. The fitness of all organisms on Earth will suddenly drop. So the conclusions of Fisher’s theorem can’t hold under these circumstances.
I find this obvious and thus uninteresting. So, let’s postpone situations where the fitness changes due to changing external conditions. First let’s see what happens if the fitness isn’t changing for these external reasons.
What’s ‘fitness’, anyway? To define this we need a mathematical model of how populations change with time. We’ll start with a very simple, very general model. While it’s often used in population biology, it will have very little to do with biology per se. Indeed, the reason I’m digging into Fisher’s fundamental theorem is that it has a mathematical aspect that doesn’t require much knowledge of biology to understand. Applying it to biology introduces lots of complications and caveats, but that won’t be my main focus here. I’m looking for the simple abstract core.
The Lotka–Volterra equation
The Lotka–Volterra equation is a simplified model of how populations change with time. Suppose we have $n$ different types of self-replicating entity. We will call these entities replicators. We will call the types of replicators species, but they do not need to be species in the biological sense!
For example, the replicators could be organisms of one single biological species, and the types could be different genotypes. Or the replicators could be genes, and the types could be alleles. Or the replicators could be restaurants, and the types could be restaurant chains. In what follows these details won’t matter: we’ll just have different ‘species’ of ‘replicators’.
Let $P_i(t)$, or just $P_i$ for short, be the population of the $i$th species at time $t$. We will treat this population as a differentiable real-valued function of time, which is a reasonable approximation when the population is fairly large.
Let’s assume the population obeys the Lotka–Volterra equation:
$$ \frac{d P_i}{d t} = f_i(P_1, \dots, P_n) \, P_i $$
where each function $f_i$ depends in a differentiable way on all the populations. Thus each population $P_i$ changes at a rate proportional to $P_i$, but the ‘constant of proportionality’ need not be constant: it depends on the populations of all the species.
We call $f_i$ the fitness function of the $i$th species. Note: we are assuming this function does not depend on time.
To write the Lotka–Volterra equation more concisely, we can create a vector whose components are all the populations:
$$ P = (P_1, \dots, P_n) $$
Let’s call this the population vector. In terms of the population vector, the Lotka–Volterra equation becomes
$$ \dot{P}_i = f_i(P) \, P_i $$
where the dot stands for a time derivative.
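To get a feel for the Lotka–Volterra equation, here is a minimal numerical sketch. Everything specific in it is an assumption made for illustration: the fitness functions have the hypothetical form $f(P) = r + A P$ with made-up numbers `r` and `A`, and the integrator is a plain fourth-order Runge–Kutta step rather than anything the argument depends on.

```python
import numpy as np

def lotka_volterra_rhs(P, fitness):
    """Right-hand side of dP_i/dt = f_i(P) * P_i."""
    return fitness(P) * P

def rk4_step(P, dt, fitness):
    """One classical fourth-order Runge-Kutta step."""
    k1 = lotka_volterra_rhs(P, fitness)
    k2 = lotka_volterra_rhs(P + 0.5 * dt * k1, fitness)
    k3 = lotka_volterra_rhs(P + 0.5 * dt * k2, fitness)
    k4 = lotka_volterra_rhs(P + dt * k3, fitness)
    return P + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate(P0, fitness, dt=0.01, steps=1000):
    """Integrate the population vector, returning an array of shape (steps + 1, n)."""
    P = np.array(P0, dtype=float)
    trajectory = [P.copy()]
    for _ in range(steps):
        P = rk4_step(P, dt, fitness)
        trajectory.append(P.copy())
    return np.array(trajectory)

# Hypothetical fitness functions f(P) = r + A @ P (illustrative numbers only).
r = np.array([1.0, 0.5, -0.2])
A = np.array([[-0.010, -0.002,  0.000],
              [ 0.000, -0.010, -0.003],
              [ 0.004,  0.002, -0.005]])

def example_fitness(P):
    return r + A @ P

trajectory = simulate([10.0, 20.0, 5.0], example_fitness)
print(trajectory[-1])  # population vector at time t = 10
```

Passing the fitness function in as an argument keeps the integrator independent of any particular model, so the constant-fitness case discussed later is just `simulate(P0, lambda P: f)` for a fixed array `f`.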
To define concepts like ‘mean fitness’ or ‘variance in fitness’ we need to introduce probability theory, and the replicator equation.
The replicator equation
Starting from the populations $P_i$, we can work out the probability $p_i$ that a randomly chosen replicator belongs to the $i$th species. More precisely, this is the fraction of replicators belonging to that species:
$$ p_i = \frac{P_i}{\sum_j P_j} $$
As a mnemonic, remember that the big Population $P_i$ is being normalized to give a little probability $p_i$. I once had someone scold me for two minutes during a talk I was giving on this subject, for using lower-case and upper-case P’s to mean different things. But it’s my blog and I’ll do what I want to.
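How do these probabilities $p_i$ change with time? We can figure this out using the Lotka–Volterra equation. We pull out the trusty quotient rule and calculate:
$$ \dot{p}_i = \frac{\dot{P}_i \left( \sum_j P_j \right) - P_i \left( \sum_j \dot{P}_j \right)}{\left( \sum_j P_j \right)^2} $$
Then the Lotka–Volterra equation gives
$$ \dot{p}_i = \frac{f_i(P) \, P_i \left( \sum_j P_j \right) - P_i \left( \sum_j f_j(P) \, P_j \right)}{\left( \sum_j P_j \right)^2} $$
Using the definition of $p_i$ this simplifies and we get
$$ \dot{p}_i = f_i(P) \, p_i - \left( \sum_j f_j(P) \, p_j \right) p_i $$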
The expression in parentheses here has a nice meaning: it is the mean fitness. In other words, it is the average, or expected, fitness of a replicator chosen at random from the whole population. Let us write it thus:
$$ \overline{f}(P) = \sum_j f_j(P) \, p_j $$
This gives the replicator equation in its classic form:
$$ \dot{p}_i = \left( f_i(P) - \overline{f}(P) \right) p_i $$
where the dot stands for a time derivative. Thus, for the fraction of replicators of the $i$th species to increase, their fitness must exceed the mean fitness.
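The step from the quotient rule to the replicator equation is pure algebra, so it can be spot-checked at a single point. The sketch below does that with randomly chosen positive populations and random fitness values; the numbers, the seed, and the helper names are assumptions for illustration only.

```python
import numpy as np

def probabilities(P):
    """Fractions p_i = P_i / sum_j P_j."""
    return P / P.sum()

def mean_fitness(f, p):
    """Mean fitness: sum_j f_j p_j."""
    return np.dot(f, p)

rng = np.random.default_rng(0)
P = rng.uniform(1.0, 100.0, size=5)  # arbitrary positive populations
f = rng.normal(size=5)               # fitness values f_i(P) at this point

# Lotka-Volterra: dP_i/dt = f_i(P) P_i
P_dot = f * P

# Time derivative of p_i straight from the quotient rule
total = P.sum()
p_dot_quotient = (P_dot * total - P * P_dot.sum()) / total**2

# Time derivative of p_i from the replicator equation
p = probabilities(P)
p_dot_replicator = (f - mean_fitness(f, p)) * p

print(np.allclose(p_dot_quotient, p_dot_replicator))
print(p_dot_replicator.sum())  # the probabilities keep summing to 1
```

The second printed number should be zero up to rounding: the gains of the fitter-than-average species exactly balance the losses of the rest.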
The moral is clear:
To become numerous you have to be fit.
To become predominant you have to be fitter than average.
This picture by David Wakeham illustrates the idea:
The fundamental theorem
What does the fundamental theorem of natural selection say, in this context? It says the rate of increase in mean fitness is equal to the variance of the fitness. As an equation, it says this:
$$ \frac{d}{dt} \overline{f}(P) = \sum_j \left( f_j(P) - \overline{f}(P) \right)^2 p_j $$
The left hand side is the rate of increase in mean fitness—or decrease, if it’s negative. The right hand side is the variance of the fitness: the thing whose square root is the standard deviation. This can never be negative!
A little calculation suggests that there’s no way in the world that this equation can be true without extra assumptions!
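We can start computing the left hand side:
$$ \frac{d}{dt} \overline{f}(P) = \frac{d}{dt} \sum_j f_j(P) \, p_j = \sum_j \left( \frac{d f_j(P)}{dt} \, p_j + f_j(P) \, \frac{d p_j}{dt} \right) = \sum_j \left( \nabla f_j(P) \cdot \dot{P} \right) p_j + \sum_j f_j(P) \, \dot{p}_j $$
Before your eyes glaze over, let’s look at the two terms and think about what they mean. The first term says: the mean fitness will change since the fitnesses $f_j(P)$ depend on $P$, which is changing. The second term says: the mean fitness will change since the fraction $p_j$ of replicators that are in the $j$th species is changing.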
We could continue the computation by using the Lotka–Volterra equation for $\dot{P}$ and the replicator equation for $\dot{p}_j$. But it already looks like we’re doomed without invoking an extra assumption. The left hand side of Fisher’s fundamental theorem involves the gradients of the fitness functions, $\nabla f_j(P)$. The right hand side:
$$ \sum_j \left( f_j(P) - \overline{f}(P) \right)^2 p_j $$
does not!
This suggests an extra assumption we can make. Let’s assume those gradients $\nabla f_j$ vanish!
In other words, let’s assume that the fitness of each replicator is a constant, independent of the populations:
$$ f_j(P_1, \dots, P_n) = f_j $$
where $f_j$ at right is just a number.
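Then we can redo our computation of the rate of change of mean fitness. The gradient term doesn’t appear:
$$ \frac{d}{dt} \overline{f} = \frac{d}{dt} \sum_j f_j \, p_j = \sum_j f_j \, \dot{p}_j $$
We can use the replicator equation for $\dot{p}_j$ and get
$$ \frac{d}{dt} \overline{f} = \sum_j f_j \left( f_j - \overline{f} \right) p_j = \sum_j f_j^2 \, p_j - \overline{f}^2 $$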
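This is the mean of the squares of the $f_j$ minus the square of their mean. And if you’ve done enough probability theory, you’ll recognize this as the variance! Remember, the variance is
$$ \sum_j \left( f_j - \overline{f} \right)^2 p_j = \sum_j f_j^2 \, p_j - 2 \overline{f} \sum_j f_j \, p_j + \overline{f}^2 \sum_j p_j = \sum_j f_j^2 \, p_j - \overline{f}^2 $$
Same thing.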
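As a quick sanity check, here is the two-species case worked out by hand. This is only a sketch with $n = 2$ and constant fitnesses $f_1, f_2$; the shorthand $\Delta = f_1 - f_2$ is introduced just for this example. Since $p_1 + p_2 = 1$, we have $f_1 - \overline{f} = p_2 \Delta$ and $f_2 - \overline{f} = -p_1 \Delta$, so:

```latex
\begin{align*}
\frac{d}{dt}\overline{f}
  &= f_1 \dot{p}_1 + f_2 \dot{p}_2
   = \Delta \, \dot{p}_1
   && \text{since } \dot{p}_2 = -\dot{p}_1 \\
  &= \Delta \, (f_1 - \overline{f}) \, p_1
   = p_1 p_2 \, \Delta^2
   && \text{by the replicator equation} \\
\sum_j (f_j - \overline{f})^2 \, p_j
  &= (p_2 \Delta)^2 \, p_1 + (p_1 \Delta)^2 \, p_2
   = p_1 p_2 \, \Delta^2 \, (p_1 + p_2)
   = p_1 p_2 \, \Delta^2
\end{align*}
```

Both sides equal $p_1 p_2 (f_1 - f_2)^2$, which is largest when the two species are equally common and vanishes once one of them has taken over.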
So, we’ve gotten a simple version of Fisher’s fundamental theorem. Given all the confusion swirling around this subject, let’s summarize it very clearly.
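Theorem. Suppose the functions $P_i \colon \mathbb{R} \to (0,\infty)$ obey the equations
$$ \dot{P}_i = f_i \, P_i $$
for some constants $f_i$. Define probabilities by
$$ p_i = \frac{P_i}{\sum_j P_j} $$
Define the mean fitness by
$$ \overline{f} = \sum_j f_j \, p_j $$
and the variance of the fitness by
$$ \mathrm{Var}(f) = \sum_j \left( f_j - \overline{f} \right)^2 p_j $$
Then the time derivative of the mean fitness is the variance of the fitness:
$$ \frac{d}{dt} \overline{f} = \mathrm{Var}(f) $$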
This is nice, but as you can see, our extra assumption that the fitness functions are constants has trivialized the problem. The equations
$$ \dot{P}_i = f_i \, P_i $$
are easy to solve: all the populations change exponentially with time. We’re not seeing any of the interesting features of population biology, or even of dynamical systems in general. The theorem is just an observation about a collection of exponential functions growing or shrinking at different rates.
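Since the solutions are just exponentials, $P_i(t) = P_i(0) \, e^{f_i t}$, the constant-fitness theorem is easy to check numerically. The sketch below compares a finite-difference estimate of the rate of change of mean fitness with the variance at a few times; the fitness values, initial populations, and step size `h` are arbitrary choices made for illustration.

```python
import numpy as np

f = np.array([1.0, 0.3, -0.5])   # hypothetical constant fitnesses f_i
P0 = np.array([2.0, 5.0, 3.0])   # hypothetical initial populations P_i(0)

def populations(t):
    """Exact solution of dP_i/dt = f_i P_i."""
    return P0 * np.exp(f * t)

def mean_and_variance(t):
    """Mean fitness and variance of fitness at time t."""
    P = populations(t)
    p = P / P.sum()
    mean = np.dot(f, p)
    variance = np.dot((f - mean) ** 2, p)
    return mean, variance

def mean_fitness_rate(t, h=1e-5):
    """Central-difference estimate of d(mean fitness)/dt."""
    return (mean_and_variance(t + h)[0] - mean_and_variance(t - h)[0]) / (2 * h)

for t in [0.0, 1.0, 2.5]:
    rate = mean_fitness_rate(t)
    _, variance = mean_and_variance(t)
    print(f"t = {t}: d(mean)/dt = {rate:.6f}, variance = {variance:.6f}")
```

The two columns should agree to within the finite-difference error, and both shrink toward zero at late times as the fittest species comes to dominate.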
So, we should look for a more interesting theorem in this vicinity! And we will.
Before I bid you adieu, let’s record a result we almost reached, but didn’t yet state. It’s stronger than the one I just stated. In this version we don’t assume the fitness functions are constant, so we keep the term involving their gradient.
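Theorem. Suppose the functions $P_i \colon \mathbb{R} \to (0,\infty)$ obey the Lotka–Volterra equations:
$$ \dot{P}_i = f_i(P) \, P_i $$
for some differentiable functions $f_i \colon (0,\infty)^n \to \mathbb{R}$ called fitness functions. Define probabilities by
$$ p_i = \frac{P_i}{\sum_j P_j} $$
Define the mean fitness by
$$ \overline{f}(P) = \sum_j f_j(P) \, p_j $$
and the variance of the fitness by
$$ \mathrm{Var}(f(P)) = \sum_j \left( f_j(P) - \overline{f}(P) \right)^2 p_j $$
Then the time derivative of the mean fitness is the variance plus an extra term involving the gradients of the fitness functions:
$$ \frac{d}{dt} \overline{f}(P) = \mathrm{Var}(f(P)) + \sum_j \left( \nabla f_j(P) \cdot \dot{P} \right) p_j $$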
The proof just amounts to cobbling together the calculations we have already done, and not assuming the gradient term vanishes.
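Here is a small numerical spot-check of this more general statement, under assumptions chosen purely for illustration: the fitness functions have the hypothetical linear form $f(P) = r + A P$, so that $\nabla f_j$ is simply the $j$th row of `A`. The left hand side is estimated as a directional derivative of the mean fitness along $\dot{P}$, which is what the chain rule says the time derivative is.

```python
import numpy as np

# Hypothetical fitness functions f(P) = r + A @ P (illustrative numbers only).
r = np.array([1.0, 0.5, -0.2])
A = np.array([[-0.010, -0.002,  0.000],
              [ 0.000, -0.010, -0.003],
              [ 0.004,  0.002, -0.005]])

def fitness(P):
    return r + A @ P

def mean_fitness(P):
    """Mean fitness as a function of the population vector alone."""
    p = P / P.sum()
    return np.dot(fitness(P), p)

P = np.array([10.0, 20.0, 5.0])  # arbitrary positive populations
f = fitness(P)
P_dot = f * P                    # Lotka-Volterra equation
p = P / P.sum()

# Right hand side: variance plus the gradient term.
variance = np.dot((f - mean_fitness(P)) ** 2, p)
gradient_term = np.dot(A @ P_dot, p)  # sum_j (grad f_j . P_dot) p_j
rhs = variance + gradient_term

# Left hand side: d/dt of mean fitness, as a central difference along P_dot.
h = 1e-6
lhs = (mean_fitness(P + h * P_dot) - mean_fitness(P - h * P_dot)) / (2 * h)

print(lhs, rhs)
```

Setting `A` to zero makes the gradient term vanish and recovers the constant-fitness theorem.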
Acknowledgements
After writing this blog article I looked for a nice picture to grace it. I found one here:
• David Wakeham, Replicators and Fisher’s fundamental theorem, 30 November 2017.
I was mildly chagrined to discover that he said most of what I just said more simply and cleanly… in part because he went straight to the case where the fitness functions are constants. But my mild chagrin was instantly offset by this remark:
Fisher likened the result to the second law of thermodynamics, but there is an amusing amount of disagreement about what Fisher meant and whether he was correct. Rather than look at Fisher’s tortuous proof (or the only slightly less tortuous results of latter-day interpreters) I’m going to look at a simpler setup due to John Baez, and (unlike Baez) use it to derive the original version of Fisher’s theorem.
So, I’m just catching up with Wakeham, but luckily an earlier blog article of mine helped him avoid “Fisher’s tortuous proof” and the “only slightly less tortuous results of latter-day interpreters”. We are making progress here!
(By the way, a quiz show I listen to recently asked about the difference between “tortuous” and “torturous”. They mean very different things, but in this particular case either word would apply.)
My earlier blog article, in turn, was inspired by this paper:
• Marc Harper, Information geometry and evolutionary game theory.
The whole series:
• Part 1: the obscurity of Fisher’s original paper.
• Part 2: a precise statement of Fisher’s fundamental theorem of natural selection, and conditions under which it holds.
• Part 3: a modified version of the fundamental theorem of natural selection, which holds much more generally.
• Part 4: my paper on the fundamental theorem of natural selection.