Game Theory (Part 15)

In Part 13 we saw an example of a Nash equilibrium where both players use a mixed strategy: that is, make their choice randomly, using a certain probability distribution on their set of mixed strategies.

We found this Nash equilibrium using the oldest method known to humanity: we guessed it. Guessing is what mathematicians do when we don’t know anything better to do. When you’re trying to solve a new kind of problem, it’s often best to start with a very easy example, where you can guess the answer. While you’re doing this, you should try to notice patterns! These can give you tricks that are useful for harder problems, where pure guessing won’t usually work so well.

So now let’s try a slightly harder example. In Part 13 we looked at the game ‘matching pennies’, which was a zero-sum game. Now we’ll look at a modified version. It will still be a zero-sum game. We’ll find a Nash equilibrium by finding each player’s ‘maximin strategy’. This technique works for zero-sum two-player normal-form games, but not most other games.

Why the funny word ‘maximin’?

In this sort of strategy, you assume that no matter what you do, the other player will do whatever minimizes your expected payoff. This is a sensible assumption in a zero-sum game! They are trying to maximize their expected payoff, but in a zero-sum game whatever they win, you lose… so they’ll minimize your expected payoff.

So, in a maximin strategy you try to maximize your expected payoff while assuming that given whatever strategy you use, your opponent will use a strategy that minimizes your expected payoff.

Think about that sentence: it’s tricky! But it explains the funny word ‘maximin’. In brief, you’re trying to maximize the minimum of your expected payoff.

We’ll make this more precise in a while… but let’s dive in and look at a new game!

Matching pennies—modified version

In this new game, each player has a penny and must secretly turn the penny to either heads or tails. They then reveal their choices simultaneously. If the pennies do not match—one heads and one tails—B wins 10 cents from A. If the pennies are both heads, player A wins 10 cents from player B. And if the pennies are both tails, player A wins 20 cents from player B

Let’s write this game in normal form! If we say

• choice 1 is heads
• choice 2 is tails

then the normal form looks like this:

1 2
1    (10,-10)   (-10,10)  
2 (-10,10)   (20,-20)  

We can also break this into two matrices in the usual way, one showing the payoff for A:

A = \left( \begin{array}{rr} 10 & -10 \\ -10 & 20 \end{array} \right)

and one showing the payoff for B:

B = \left( \begin{array}{rr} -10 & 10 \\ 10 & -20 \end{array} \right)

Before we begin our real work, here are some warmup puzzles:

Puzzle 1. Is this a zero-sum game?

Puzzle 2. Is this a symmetric game?

Puzzle 3. Does this game have a Nash equilibrium if we consider only pure strategies?

Puzzle 4. Does this game have a dominant strategy for either player A or player B?

A’s expected payoff

Remember that we write A’s mixed strategy as

p = (p_1, p_2)

Here p_1 is the probability that A makes choice 1: that is, chooses heads. p_2 is the probability that A makes choice 2: that is, chooses tails. Similarly, we write B’s mixed strategy as

q = (q_1, q_2)

Here q_1 is the probability that B makes choice 1, and q_2 is the probability that B makes choice 2.

Given all this, the expected value of A’s payoff is

p \cdot A q

We saw this back in Part 12.

Now let’s actually calculate the expected value of A’s payoff. It helps to remember that probabilities add up to 1, so

p_1 + p_2 = 1, \qquad q_1 + q_2 = 1

It’s good to have fewer variables to worry about! So, we’ll solve for p_2 and q_2:

p_2 = 1 - p_1, \qquad q_2 = 1 - q_1

and write

p = (p_1, 1 - p_1)
q = (q_1, 1 - q_1)

Then we can calculate A’s expected payoff. We start by computing A q:

\begin{array}{ccl} A q &=&   \left( \begin{array}{rr} 10 & -10 \\ -10 & 20 \end{array} \right) \left( \begin{array}{c} q_1 \\ 1 - q_1 \end{array} \right) \\  \\ &=& \left( \begin{array}{c} 10 q_1 - 10(1-q_1) \\ -10 q_1 + 20(1 - q_1) \end{array} \right) \\  \\ &=& \left( \begin{array}{c} 20 q_1 - 10  \\ 20 - 30 q_1  \end{array} \right) \end{array}

This gives

\begin{array}{ccl} p \cdot A q &=& (p_1 , 1 - p_1) \cdot (20 q_1 - 10  ,  20 - 30 q_1) \\  \\  &=& p_1(20 q_1 - 10) + (1-p_1)(20 - 30 q_1) \\ \\ &=& 20 - 30 p_1 - 30 q_1 + 50 p_1 q_1 \end{array}

This is a bit complicated, so let’s graph A’s expected payoff in the extreme cases where B either makes choice 1 with probability 100%, or choice 2 with probability 100%.

If B makes choice 1 with probability 100%, then q_1 = 1. Put this in the formula above and we see A’s expected payoff is

p \cdot A q = 20 p_1 - 10

If we graph this as a function of p_1 we get a straight line:

On the other hand, if B makes choice 2 with probability 100%, then q_2 = 1 so q_1 = 0. Now A’s expected payoff is

p \cdot A q = 20 - 30 p_1

Graphing this, we get:

The point of these graphs becomes clearer if we draw them both together:

Some interesting things happen where the lines cross! We’ll get A’s maximin strategy by picking the value of p_1 where these lines cross! And in fact, there’s a Nash equilibrium where player A chooses this value of p_1, and B uses the same trick to choose his value of q_1.

But we can already see something simpler. We’ve drawn graphs of A’s expected payoff for two extreme cases of player B’s mixed strategy: q_1 = 1 and q_1 = 0. Suppose B uses some other mixed strategy, say q_1 = 2/5 for example. Now A’s expected payoff will be something between the two lines we’ve drawn. Let’s see what it is:

\begin{array}{ccl} p \cdot A q &=& 20 - 30 p_1 - 30 q_1 + 50 p_1 q_1 \\  \\  &=& 20 - 30 p_1 - 30 \cdot \frac{2}{5} + 50 p_1 \cdot \frac{2}{5} \\  \\  &=& 8 - 10p_1  \end{array}

If we graph this along with our other two lines, we get this:

We get a line between the other two, as we expect. But more importantly, all three lines cross at the same point!

That’s not a coincidence, that’s how it always works. If we draw lines for more different choices of B’s mixed strategy, they look like this:

The point where they all intersect looks important! It is. Soon we’ll see why.

A’s maximin strategy

Now we’ll find A’s maximin strategy. Let me explain the idea a bit more. The idea is that A is cautious, so he wants to choose p_1 that maximizes his expected payoff in the worst-case scenario. In other words, A wants to maximize his expected payoff assuming that B is trying to minimize A’s expected payoff.

Read that last sentence a few times, because it’s confusing at first.

Why would B try to minimize A’s expected payoff? B isn’t nasty: he’s just a rational agent trying to maximize his own expected payoff! But if you solved Puzzle 1, you know this game is a zero-sum game. So A’s payoff is minus B’s payoff. Thus, if B is trying to maximize his own expected payoff, he’s also minimizing A’s expected payoff.

Given this, let’s figure out what A should do. A should look at different mixed strategies B can use, and graph A’s payoff as a function of p_1. We did this:

Then, he should choose p_1 that maximizes his expected payoff in the worst-case scenario. To do this, he should focus attention on the lines that give him the smallest expected payoff:

This is the dark curve. It’s not very ‘curvy’, but mathematicians consider a broken line to be an example of a curve!

To find this dark curve, all that matters are extreme cases of B’s strategy: the case q_1 = 1 and the case q_1 = 0. So we can ignore the intermediate cases, and simplify our picture:

The dark curve is highest where the two lines cross! This happens when

-20 + 10 p_1 = 20 - 30 p_1

or in other words

p_1 = 3/5

So, A’s maximin strategy is

p = (3/5, 2/5)

It’s the mixed strategy that maximizes A’s expected payoff given that B is trying to minimize A’s expected payoff.

B’s expected payoff

Next let’s give the other guy a chance. Let’s work out B’s expected payoff and maximin strategy. The expected value of B’s payoff is

p \cdot B q

We could calculate this just like we calculated p \cdot A q , but it’s quicker to remember that this is a zero-sum game:

B = - A

so that

p \cdot B q = - p \cdot A q

and since we already saw that

p \cdot A q = 20 - 30 p_1 - 30 q_1 + 50 p_1 q_1

now we have

p \cdot B q = -20 + 30 p_1 + 30 q_1 - 50 p_1 q_1

B’s maximin strategy

To figure out B’s maximin strategy, we’ll copy what worked for player A. First we’ll work out B’s expected payoff in two extreme cases:

• The case where A always makes choice 1: p_1 = 1.

• The case where A always makes choice 2: p_1 = 0.

We’ll get two functions of q_1 whose graphs are lines. Then we’ll find where these lines intersect!

Let’s get to work. When p_1 = 1 we get

\begin{array}{ccl} p \cdot B q &=& -20 + 30 p_1 + 30 q_1 - 50 p_1 q_1  \\  \\  &=& 10 - 20 p_1 \end{array}

When p_1 = 0 we get

p \cdot B q = -20 + 30 q_1

I won’t graph these two lines, but they intersect when

10 - 20 q_1 = -20 + 30 q_1

or

q_1 = 3/5

Hmm, it’s sort of a coincidence that we’re getting the same number that we got for p_1; it won’t always work this way! But anyway, we see that B’s maximin strategy is

q = (3/5, 2/5)

A Nash equilibrium

Now let’s put the pieces together. What happens in a zero-sum game when player A and player B both choose a maximin strategy? It’s not too surprising: we get a Nash equilibrium!

Let’s see why in this example. (We can do a general proof later.) We have

p = (3/5, 2/5)

and

q = (3/5, 2/5)

To show that p and q form a Nash equilibrium, we need to check that neither player could improve their expected payoff by changing to a different mixed strategy. Remember:

Definition. Given a 2-player normal form game, a pair of mixed strategies (p,q), one for player A and one for player B, is a Nash equilibrium if:

1) For all mixed strategies p' for player A, p' \cdot A q \le p \cdot A q.

2) For all mixed strategies q' for player B, p \cdot B q' \le p \cdot B q.

I’ll check clause 1) and I’ll let you check clause 2), which is similar. For clause 1) we need to check

p' \cdot A q \le p \cdot A q

for all mixed strategies p'. Just like we did with p, we can write

p' = (p'_1, 1 - p'_1)

We can reuse one of our old calculations to see that

\begin{array}{ccl} p' \cdot A q &=& 20 - 30 p'_1 - 30 q_1 + 50 p'_1 q_1 \\   \\  &=& 20 - 30 p'_1 - 30 \cdot \frac{3}{5} + 50 p'_1 \cdot \frac{3}{5} \\   \\  &=& 2 \end{array}

Hmm, A’s expected payoff is always 2, no matter what mixed strategy he uses, as long as B uses his maximin strategy q! So of course we have

p' \cdot A q = p \cdot A q = 2

which implies that

p' \cdot A q \le p \cdot A q

If you don’t believe me, we can calculate p \cdot A q and see it equals p' \cdot A q:

\begin{array}{ccl} p \cdot A q &=& 20 - 30 p_1 - 30 q_1 + 50 p_1 q_1 \\   \\  &=& 20 - 30 \cdot \frac{3}{5} - 30 \cdot \frac{3}{5} + 50 \cdot \frac{3}{5} \cdot \frac{3}{5} \\   \\  &=& 2 \end{array}

Yup, it’s 2.

Puzzle 5. Finish showing that (p,q) is a Nash equilibrium by showing that for all mixed strategies q' for player B, p \cdot B q' \le p \cdot B q.

Puzzle 6. How high does this dark curve get?

You have the equations for the two lines, so you can figure this out. Explain the meaning of your answer!

5 Responses to Game Theory (Part 15)

  1. [...] Last time we looked at a zero-sum game and noticed that when both players use their maximin strategy, we get a Nash equilibrium. This isn’t a coincidence—it always works this way for zero-sum games! This fact is not obvious. It will take some work to prove it. This will be our first really big theorem. [...]

  2. alma Garcia says:

    Finding the dark curve (security level): how can you find the dark curve? Is it by just looking at the graph or by the equation?

    • John Baez says:

      In our example both player A and player B have just two choices. Then we can look at player B’s ‘extreme’ mixed strategies, where he plays one choice all the time. These are

      q = (1,0)

      and

      q = (0,1)

      For each of these we graph A’s expected payoff

      p \cdot A q

      as a function of p_1, where

      p = (p_1, 1 - p_1)

      We get two functions, whose graphs are straight lines. Then we take the minimum of these two functions and get a function whose graph may be a broken line. This is A’s security level.

      It’s easiest to find this by drawing.

      The same idea works if player B has more than two choices, but then we get more than two functions to take the minimum of.

You can use HTML in your comments. You can also use LaTeX, like this: $latex E = m c^2 $. The word 'latex' comes right after the first dollar sign, with a space after it.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 2,846 other followers