Game Theory (Part 16)

Last time we looked at a zero-sum game and noticed that when both players use their maximin strategy, we get a Nash equilibrium. This isn’t a coincidence—it always works this way for zero-sum games! This fact is not obvious. It will take some work to prove it. This will be our first really big theorem.

But first, we need to define a maximin strategy.

I already tried to give you the rough idea: it’s where you pick a mixed strategy that maximizes your expected payoff while assuming that no matter what mixed strategy you pick, the other player will pick the mixed strategy that minimizes your expected payoff. But to prove things about this concept, we need a more precise definition.

The setup

First, remember the setup. We’re talking about a zero-sum 2-player normal form game. So as usual, we assume:

• Player A has some set of choices i = 1, \dots, m.

• Player B has some set of choices j = 1, \dots, n.

• If player A makes choice i and player B makes choice j, the payoff to player A is A_{ij} and the payoff to player B is B_{ij}.

But, because it’s zero-sum game, we also assume

A_{ij} + B_{ij} = 0

for all choices i = 1, \dots, m and j = 1, \dots, n. In other words, the payoff matrices A and B, whose entries are these numbers A_{ij} and B_{ij}, sum to zero:

A + B = 0

We’ll let p to stand for A’s mixed strategy, and let q stand for B’s mixed strategy. These are probability distributions. So, p = (p_1, \dots, p_m) is a vector in \mathbb{R}^m obeying

0 \le p_i \le 1 , \quad \displaystyle{ \sum_{i = 1}^m p_i = 1 }

while q = (q_1 , \dots, q_m) is a vector in \mathbb{R}^n obeying

0 \le q_j \le 1 , \quad  \displaystyle{ \sum_{j=1}^n q_j = 1 }

Given these mixed strategies, A’s expected payoff is

p \cdot A q

while B’s expected payoff is

p \cdot B q = - p \cdot A q

Since B = -A, we will never mention the matrix B again!

Minima and maxima

As you might guess, we’re going to talk a lot about minima and maxima. So, we should be really clear about what they are!

Definition 1. Suppose f : X \to \mathbb{R} is any real-valued function defined on any set X. We say f has a maximum at x \in X if

f(x) \ge f(x')

for all x' \in X. In this case we call the number f(x) the maximum value of f, and we write

\displaystyle{ f(x) = \max_{x' \in X} f(x') }

Note that mathematicians use ‘maximum’ to mean an element x where the function f gets as big as possible… and use ‘maximum value’ to mean how big f gets there. This is a bit different than ordinary English!

Also note that a maximum may not exist! And if it exists, it may not be unique. For example, the function x^2 on the real line has no maximum, while the sine function has lots. So unless we know for sure a function has exactly one maximum, we should talk about a maximum instead of the maximum.

Similar stuff is true for minima, too:

Definition 1. Suppose f : X \to \mathbb{R} is any real-valued function defined on any set X. We say f has a minimum at x \in X if

f(x) \le f(x')

for all x' \in X. In this case we call the number f(x) the minimum value of f, and we write

\displaystyle{ f(x) = \min_{x' \in X} f(x') }

Security levels

Pretend you’re player A. Since it’s a zero-sum game, we know B will try to maximize their payoff… which means minimizing your payoff. So, no matter what your mixed strategy p is, you should expect that B will find a mixed strategy q' that’s a minimum of

p \cdot A q

So, you should only expect to get a payoff of

\displaystyle{ \min_{q' \in \{ \textrm{B's mixed strategies\}}} \; p \cdot A q' }

We call this player A’s security level. For short, let’s write it as

\displaystyle{ \min_{q'} \; p \cdot A q' }

A’s security level is a function of p. We graphed this function in the example we looked at last time. It’s the dark curve here:

The different lines show p \cdot A q for different choices of q. The minimum of all these gives the dark curve.

Maximin strategies

Last time we found A’s maximin strategy by finding the maximum of A’s security level—the place where that dark curve is highest!

Suppose p is a maximin strategy for player A. Since it maximizes A’s security level,

\displaystyle{ \min_{q'} \; p \cdot A q' \ge  \min_{q'} \; p' \cdot A q' }

for all mixed strategies p' for player A. In other words, we have

\displaystyle{ \min_{q'} \; p \cdot A q'  = \max_{p'} \min_{q'} \; p' \cdot A q' }

If you look at this formula, you can really see why we use the word ‘maximin’!

It’s a little-known false fact that the concept of maximin strategy was named after the Roman emperor Maximin. Such an emperor really does exist! But in fact, there were two Roman emperors named Maximin. So he exists, but he’s not unique.

 

Definitions

Now we’re ready for some definitions that summarize what we’ve learned.

Definition 3. Given a zero-sum 2-player normal form game and a mixed strategy p for player A, we define A’s security level to be

\displaystyle{ \min_{q'} \; p \cdot A q' }

Definition 4. Given a zero-sum 2-player normal form game, we say a mixed strategy p for player A is a maximin strategy if

\displaystyle{ \min_{q'} \; p \cdot A q'  = \max_{p'} \min_{q'} \; p' \cdot A q' }

We can also make up definitions that apply to player B. We just need to remember that there’s a minus sign in B’s expected payoff:

Definition 5. Given a zero-sum 2-player normal form game and a mixed strategy q for player B, we define B’s security level to be

\displaystyle{ \min_{p'} \; - p' \cdot A q }

Definition 6. Given a zero-sum 2-player normal form game, we say a mixed strategy q' for B is a minimax strategy for B if

\displaystyle{  \min_{p'} \; - p' \cdot A q = \max_{q'} \min_{p'} \; - p' \cdot A q' }

But there’s an easy fact about maxima and minima that lets us simplify this last definition. To make a function -f as big as possible is the same as making f as small as possible and then sticking a minus sign in front:

\displaystyle{ \max_{x \in X} -f(x) = - \min_{x \in X} f(x)}

Similarly, to minimize -f is the same as maximizing f and then taking the negative of that:

\displaystyle{ \min_{x \in X} -f(x) = - \max_{x \in X} f(x)}

Using these rules, we get this little theorem:

Theorem 1. Given a zero-sum 2-player normal form game, q is a minimax strategy for B if and only if:

\displaystyle{  \max_{p'} \; p' \cdot A q  = \min_{q'} \max_{p'} \; p' \cdot A q' }

Proof. Suppose that q is a minimax strategy for B. By Definition 6,

\displaystyle{  \min_{p'} \; - p' \cdot A q = \max_{q'} \min_{p'} \; - p' \cdot A q' }

Repeatedly using the rules for pushing minus signs through max and min, we see:

\displaystyle{ - \max_{p'} p \cdot A q = \max_{q'} \left( - \max_{p'} \; p \cdot A q \right) }

and then

\displaystyle{ - \max_{p'} p \cdot A q = - \min_{q'} \max_{p'} \; p' \cdot A q' }

Taking the negative of both sides, we get the equation we want:

\displaystyle{  \max_{p'} \; p' \cdot A q  = \min_{q'} \max_{p'} \; p' \cdot A q' }

We can also reverse our calculation and show that this equation implies q is a maximin strategy. So, this equation is true if and only if q is a maximin strategy for B.   █

This little theorem talks about a minimum of maxima instead of a maximum of minima. This is one reason people talk about ‘minimax strategies’. In fact what we’re calling a maximin strategy, people often call a minimax strategy!

Next time we’ll start proving some really important things, which were first shown by the great mathematician John von Neumann:

• If both players in a zero-sum game use a maximin strategy, we get a Nash equilibrium.

• In any Nash equilibrium for a zero-sum game, both players must be using a maximin strategy!

• For any zero-sum game, there exists a Nash equilibrium.

Now we’re talking about mixed strategies, of course. We already saw a while back that if we only consider pure strategies, there are games like rock-paper-scissors and matching pennies that don’t have a Nash equilibrium.

Before I quit, one more false fact. Just as the maximin strategy was named after the emperor Maximin, the minimax strategy was named after the emperor Minimax! I mentioned that Emperor Maximin really exists, but is not unique. The case of Emperor Minimax is even more interesting. He’s really unique… but he does not exist!

2 Responses to Game Theory (Part 16)

  1. Blake Stacey says:

    Another amazing false fact: the MaxEnt approach has its own patron saint!

  2. Last time we saw the official definition of maximin strategy. Now we’ll show something really important. In a Nash equilibrium for a zero-sum game, both players must be using a maximin strategy! [...]

You can use HTML in your comments. You can also use LaTeX, like this: $latex E = m c^2 $. The word 'latex' comes right after the first dollar sign, with a space after it.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 2,827 other followers