Information Geometry (Part 13)

Last time I gave a sketchy overview of evolutionary game theory. Now let’s get serious.

I’ll start by explaining ‘Nash equilibria’ for 2-person games. These are situations where neither player can profit by changing what they’re doing. Then I’ll introduce ‘mixed strategies’, where the players can choose among several strategies with different probabilities. Then I’ll introduce evolutionary game theory, where we think of each strategy as a species, and its probability as the fraction of organisms that belong to that species.

Back in Part 9, I told you about the ‘replicator equation’, which says how these fractions change with time thanks to natural selection. Now we’ll see how this leads to the idea of an ‘evolutionarily stable strategy’. And finally, we’ll see that when evolution takes us toward such a stable strategy, the amount of information the organisms have ‘left to learn’ keeps decreasing!

Nash equilibria

We can describe a certain kind of two-person game using a payoff matrix, which is an n \times n matrix A_{ij} of real numbers. We think of A_{ij} as the payoff that either player gets if they choose strategy i and their opponent chooses strategy j.

Note that in this kind of game, there’s no significant difference between the ‘first player’ and the ‘second player': either player wins an amount A_{ij} if they choose strategy i and their opponent chooses strategy j. So, this kind of game is called symmetric even though the matrix A_{ij} may not be symmetric. Indeed, it’s common for this matrix to be antisymmetric, meaning A_{ij} = - A_{ji}, since in this case what one player wins, the other loses. Games with this extra property are called zero-sum games. But we won’t limit ourselves to those!

We say a strategy i is a symmetric Nash equilibrium if

A_{ii} \ge A_{ji}

for all j. This means that if both players use strategy i, neither gains anything by switching to another strategy.

For example, suppose our matrix is

\left( \begin{array}{rr} -1 & -12 \\  0 & -3 \end{array} \right)

Then we’ve got the Prisoner’s Dilemma exactly as described last time! Here strategy 1 is cooperate and strategy 2 is defect. If a player cooperates and so does his opponent, he wins

A_{11} = -1

meaning he gets one month in jail. We include a minus sign because ‘winning a month in jail’ is not a good thing. If the player cooperates but his opponent defects, he gets a whole year in jail:

A_{12} = -12

If he defects but his opponent cooperates, he doesn’t go to jail at all:

A_{21} = 0

And if they both defect, they both get three months in jail:

A_{22} = -3

You can see that defecting is a Nash equilibrium, since

A_{22} \ge A_{12}

So, oddly, if our prisoners know game theory and believe Nash equilibria are best, they’ll both be worse off than if they cooperate and don’t betray each other.

    

Nash equilibria for mixed strategies

So far we’ve been assuming that with 100% certainty, each player chooses one strategy i = 1,2,3,\dots, n. Since we’ll be considering more general strategies in a minute, let’s call these pure strategies.

Now let’s throw some probability theory into the stew! Let’s allow the players to pick different pure strategies with different probabilities. So, we define a mixed strategy to be a probability distribution on the set of pure strategies. In other words, it’s a list of n nonnegative numbers

p_i \ge 0

that sum to one:

\displaystyle{ \sum_{i=1}^n p_i = 1 }

Say I choose the mixed strategy p while you, my opponent, choose the mixed strategy q. Say our choices are made independently. Then the probability that I choose the pure strategy i while you chose j is

p_i q_j

so the expected value of my winnings is

\displaystyle{ \sum_{i,j = 1}^n p_i A_{ij} q_j }

or using vector notation

p \cdot A q

where the dot is the usual dot product on \mathbb{R}^n.

We can easily adapt the concept of Nash equilibrium to mixed strategies. A mixed strategy q is a symmetric Nash equilibrium if for any other mixed strategy p,

q \cdot A q \ge  p \cdot A q

This means that if both you and I are playing the mixed strategy q, I can’t improve my expected winnings by unilaterally switching to the mixed strategy p. And neither can you, because the game is symmetric!

If this were a course on game theory, I would now do some examples. But it’s not, so I’ll just send you to page 6 of Sandholm’s paper: he looks at some famous games like ‘hawks and doves’ and ‘rock paper scissors’.

Evolutionarily stable strategies

We’re finally ready to discuss evolutionarily stable strategies. To do this, let’s reinterpret the ‘pure strategies’ i = 1,2,3, \dots n as species. Here I don’t necessarily mean species in the classic biological sense: I just mean different kinds of self-replicating entities, or replicators. For example, they could be different alleles of the same gene.

Similarly, we’ll reinterpret the ‘mixed strategy’ p as describing a mixed population of replicators, where the fraction of replicators belonging to the ith species is p_i. These numbers are still probabilities: p_i is the probability that a randomly chosen replicator will belong to the ith species.

We’ll reinterpret the payoff matrix A_{ij} as a fitness matrix. In our earlier discussion of the replicator equation, we assumed that the population P_i of the ith species grew according to the replicator equation

\displaystyle{ \frac{d P_i}{d t} = f_i(P_1, \dots, P_n) P_i }

where the fitness function f_i is any smooth function of the populations of each kind of replicator.

But in evolutionary game theory it’s common to start by looking at a simple special case where

\displaystyle{f_i(P_1, \dots, P_n)   = \sum_{j=1}^n A_{ij} p_j }

where

\displaystyle{ p_j = \frac{P_j}{\sum_k P_k} }

is the fraction of replicators who belong to the jth species.

What does this mean? The idea is that we have a well-mixed population of game players—or replicators. Each one has its own pure strategy—or species. Each one randomly roams around and ‘plays games’ with each other replicator it meets. It gets to reproduce at a rate proportional to its expected winnings.

This is unrealistic in all sorts of ways, but it’s mathematically cute, and it’s been studied a lot, so it’s good to know about. Today I’ll explain evolutionarily stable strategies only in this special case. Later I’ll go back to the general case.

Suppose that we select a sample of replicators from the overall population. What is the mean fitness of the replicators in this sample? For this, we need to know the probability that a replicator from this sample belongs to the ith species. Say it’s q_j. Then the mean fitness of our sample is

\displaystyle{ \sum_{i,j=1}^n q_i A_{ij} p_j }

This is just a weighted average of the fitnesses in our earlier formula. But using the magic of vectors, we can write this sum as

q \cdot A p

We already saw this type of expression in the last section! It’s my expected winnings if I play the mixed strategy q and you play the mixed strategy p.

John Maynard Smith defined q to be evolutionarily stable strategy if when we add a small population of ‘invaders’ distributed according to any other probability distribution p, the original population is more fit than the invaders.

In simple terms: a small ‘invading’ population can’t do better than the population as a whole.

Mathematically, this means:

q \cdot A ((1-\epsilon)q + \epsilon p) >  p \cdot A ((1-\epsilon)q + \epsilon p)

for all mixed strategies p and all sufficiently small \epsilon \ge 0 . Here

(1-\epsilon)q + \epsilon p

is the population we get by replacing an \epsilon-sized portion of our original population by invaders.

Puzzle: Show that q is a weakly evolutionarily stable strategy if and only these two conditions hold for all mixed stategies p:

q \cdot A q \ge p \cdot A q

and also

q \cdot A q = p \cdot A q \; \implies \; q \cdot A p > p \cdot A p

The first condition says that q is a symmetric Nash equilibrium. In other words, the invaders can’t on average be better playing against the original population than members of the original population are. The second says that if the invaders are just as good at playing against the original population, they must be worse at playing against each other! The combination of these conditions means the invaders won’t take over.

Again, I should do some examples… but instead I’ll refer you to page 9 of Sandholm’s paper, and also these course notes:

• Samuel Alizon and Daniel Cownden, Evolutionary games and evolutionarily stable strategies.

• Samuel Alizon and Daniel Cownden, Replicator dynamics.

The decrease of relative information

Now comes the punchline… but with a slight surprise twist at the end. Last time we let

P = (P_1, \dots , P_n)

be a population that evolves with time according to the replicator equation, and we let p be the corresponding probability distribution. We supposed q was some fixed probability distribution. We saw that the relative information

I(q,p) = \displaystyle{ \sum_i \ln \left(\frac{q_i}{ p_i }\right) q_i }

obeys

\displaystyle{ \frac{d}{dt} I(q,p) =  (p - q) } \cdot f(P)

where f(P) is the vector of fitness functions. So, this relative information can never increase if

(p - q) \cdot f(P) \le 0

for all P.

We can adapt this to the special case we’re looking at now. Remember, right now we’re assuming

\displaystyle{f_i(P_1, \dots, P_n)   = \sum_{j=1}^n A_{ij} p_j }

so

f(P) = A p

Thus, the relative information will never increase if

(p - q) \cdot A p \le 0

or in other words,

q \cdot A p \ge p \cdot A p  \qquad \qquad \qquad \qquad \qquad \qquad  (1)

Now, this looks very similar to the conditions for an evolutionary stable strategy as stated in the Puzzle above. But it’s not the same! That’s the surprise twist.

Remember, the Puzzle says that q is an evolutionarily stable state if for all mixed strategies p we have

q \cdot A q \ge p \cdot A q  \qquad \qquad \qquad \qquad \qquad \qquad (2)

and also

q \cdot A q = p \cdot A q \; \implies \; q \cdot A p > p \cdot A p  \qquad \; (3)

Note that condition (1), the one we want, is neither condition (2) nor condition (3)! This drove me crazy for almost a day.

I kept thinking I’d made a mistake, like mixing up p and q somewhere. You’ve got to mind your p’s and q’s in this game!

But the solution turned out to be this. After Maynard Smith came up with his definition of ‘evolutionarily stable state’, another guy came up with a different definition:

• Bernhard Thomas, On evolutionarily stable sets, J. Math. Biology 22 (1985), 105–115.

For him, an evolutionarily stable strategy obeys

q \cdot A q \ge p \cdot A q  \qquad \qquad \qquad \qquad \qquad \qquad (2)

and also

q \cdot A p \ge p \cdot A p  \qquad \qquad \qquad \qquad \qquad \qquad  (1)

Condition (1) is stronger than condition (3), so he renamed Maynard Smith’s evolutionarily stable strategies weakly evolutionarily stable strategies. And condition (1) guarantees that the relative information I(q,p) can never increase. So, now we’re happy.

Except for one thing: why should we switch from Maynard Smith’s perfectly sensible concept of evolutionarily stable state to this new stronger one? I don’t really know, except that

• it’s not much stronger

and

• it lets us prove the theorem we want!

So, it’s a small mystery for me to mull over. If you have any good ideas, let me know.

6 Responses to Information Geometry (Part 13)

  1. Greg Egan says:

    There are some typos in the subscripts when you’re discussing the payoff matrix for the prisoner’s dilemma: A_{12} is used three times when you mean A_{21} and A_{22} for the last two equations.

  2. Marc Harper says:

    I think I can shed some light on the distinctions between the conditions (1) and (3). Many authors use the implication of condition (3) (i.e. strict inequality in condition (1)) as the definition of ESS. Indeed, one can show that the strict version of (1) holds in some neighborhood of q if and only if q is an ESS (for a proof, see theorem 6.4.1 in Hofbauer and Sigmond’s “Evolutionary Games and Population Dynamics”). So the question is why we want the inequality to be strict intuitively. I’ll give a few examples why strictness matters.

    The derivative of I being zero does not imply that the dynamic is stationary. Sometimes I is a constant of motion (e.g. for the rock-scissors-paper game), which has concentric orbits about the center of the simplex in dimension three. These orbits are not attractive (there is no limit cycle), so while the population is stuck on a particular cycle based on the initial point and is in some sense stable, it is not stable in the intuitive sense of evolutionary/selective stability. It makes sense that the relative entropy (with q being the center of all the cycles, which is a stable point, but not asymptotically so) is constant on these cycles — there is no information gain for otherwise the dynamic would not cycle.

    Another situation is that one can have an evolutionarily stable set for a particular game matrix A, for instance, a connected line of stable points such that the set is locally asymptotically stable, but that there is no motion along the line, so that no point in the line is distinguished as locally asymptotically stable. In three dimensions it would look something like this (http://ars.els-cdn.com/content/image/1-s2.0-S0370157307001810-gr4.jpg) if all those points were connected in a line across the ternary plot. The matrix given by equation (7.20) in Hofbauer and Sigmond produces such a set of equilibria.

    In this case, we don’t have have evolutionary stability in the sense of Maynard Smith because points q on the line are not resistant to invasion. In other words, condition (3) would not be satisfied because with the influx of a particular distribution of mutants (shifting along the line of equilibria from q to p), the population would not tend back to the distribution before the influx, and so the point q is not selectively stable, and this holds for every point on the line. If an evolutionarily stable set is a single point, then it would typically be an evolutionarily stable state.

    Finally, some more mathematical reasons: there are different conclusions from the Lyapunov stability theorem if the derivative of the relative entropy I is always negative versus just non-positive, namely that the equilibrium is locally asymptotically stable rather than just stable. Also, Ross Cresmann showed that evolutionary stability is equivalent to the dynamical system notion of strong stability, which many find to be intuitively satistifying.

  3. Ben Allen says:

    I’m in favor of John Maynard Smith’s definition. It comes from basic biological considerations. In his definition, a strategy is evolutionarily stable iff it is impossible for a small group of invaders with a different strategy to displace a resident population that is already using this strategy. In short, “evolutionarily stable” means “robust to small groups of invaders”. Of course, there is a biological reason for thinking about small groups of invaders: new types typically arise in small numbers, either through mutation or migration.

    In the replicator dynamics, Maynard Smith’s ESS definition implies local asymptotic stability.

    Bernhard Thomas’s definition, on the other hand, says that strategy q beats or ties any other strategy distribution it is competing with, whether rare or common. It is equivalent to saying that q is an ESS if for any other distribution p and any proportion x of strategy p,

    q.[(1-x)q+xp] >= p.[(1-x)q+xp]

    So, no matter what proportions q and p are blended in, q wins or ties.

    To me, this doesn’t fit the name “evolutionarily stable”. I think it should instead be something like “evolutionarily dominant”. In this view “evolutionarily stable” means you’re safe against small group of invaders, but another strategy could still displace you if it arrives in sufficiently large numbers. “Evolutionarily dominant” means that nothing can beat you. (This distinction is relevant in many biological situations.)

    Also, it’s not true that Thomas’s definition is strictly stronger than Maynard Smiths, at least not in the way you’ve written them. This is because the inequality in (3) is strict while the inequality in (1) is not.

    If we strengthen Thomas’s definition by making the inequality in (1) strict for all p != q, then this strengthened version of (1) implies that q is a global attractor for the replicator dynamics. So in some sense, Maynard Smith’s definition is about local stability, and Thomas’s definition is about global stability.

    • John Baez says:

      After having thought about this some more, I agree even more that Maynard Smith’s definition matches our intuition of ‘evolutionary stability’… but Thomas’ definition is important too, at least because we need it to prove this version of the 2nd law of thermodynamics. I like your idea of naming this concept ‘evolutionary dominance’. In my talk in Barcelona I used the term ‘evolutionary optimum’, but ‘evolutionary dominance’ captures the idea better. I’ll use that from now on!

      I should pay more attention to > versus \ge. In Marc Harper’s theorem I was content to have a non-strict inequality, so I used a non-strict inequality in the definition, but a strict inequality in the definition should give a strict inequality in the theorem, as long as p \ne q. Thanks!

  4. \displaystyle{  \sum_i f_i(P) (p_i - q_i) } \le 0

    If q makes this true for all p, we say q is an evolutionarily stable state. For some reasons why, see Part 13.

You can use HTML in your comments. You can also use LaTeX, like this: $latex E = m c^2 $. The word 'latex' comes right after the first dollar sign, with a space after it.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 2,849 other followers