We’ve been looking at games where each player gets a payoff depending on the choices that both players make. The payoff is a real number, which I often call the number of **points**. When we play these games in class, these points go toward your grade. 10% of your grade depends on the total number of points you earn in quizzes and games. But what do these points mean in other games, like Prisoner’s Dilemma or Battle of the Sexes?

This leads us into some very interesting and deep questions. Let’s take a *very* quick look at them, without getting very deep.

### Maximizing the payoff

The main thing is this. When we’re studying games, we’ll assume *each player’s goal is to earn as many points as possible.* In other words, they are trying to maximize their payoff.

They are *not*, for example, trying to make their payoff bigger than the other player’s payoff. Indeed, in class you should *not* be trying to earn more points than me! One student said he was trying to do that. That’s a mistake. You should be happier if

• you get 10 points and I get 20

than if

• you get -10 points and I get -20.

After all, it’s only *your total number of points* that affects your grade, not whether it’s bigger than mine.

So, you should always try to maximize your payoff. And I promise to do the same thing: I’ll always try to maximize my payoff. You have to take my word on this, since my salary is not affected by my payoff! But I want to make your task very clear: you are trying to maximize your payoff, and *you can assume I am trying to maximize mine.*

(If I were doing something else, like sadistically trying to minimize your payoff, that would affect your decisions!)

### Rational agents and utility

We can’t understand how people actually play games unless we know what they are trying to do. In real life, people’s motives are very complicated and sometimes mysterious. But in mathematical game theory, we start by studying something simpler: **rational agents**. Roughly speaking, a rational agent is defined to be a person or animal or computer program or something that is *doing the best possible job of maximizing some quantity, given the information they have*.

This is a rough definition, which we will try to improve later.

You shouldn’t be fooled by the positive connotations of the word ‘rational’. We’re using it in a very specific technical way here. A madman in a movie theater who is trying to kill as many people as possible counts as ‘rational’ by our definition if they *maximize the number of people killed, given the information they have.*

The whole question of what really *should* count as ‘rationality’ is a very deep one. People have a lot of interesting ideas about it:

• Rationality, Wikipedia.

### Utility

So: we say a rational agent does the best possible job of maximizing their payoff given the information they have. But in economics, this payoff is often called **utility**.

That’s an odd word, but it comes from a moral philosophy called **utilitarianism**, which says, very roughly, that the goal of life is to maximize happiness. Perhaps because it’s a bit embarrassing to talk about maximizing happiness, these philosophers called it ‘utility’.

But be careful: while the moral philosophers often talk about agents trying to maximize *the total utility of everyone,* economists focus on rational agents trying to maximize their *own* utility.

This sounds very selfish. But it’s not necessarily. If you want other people to be happy, your utility depends on their utility. If you were a complete altruist, perhaps maximizing your utility would even be *the same* as maximizing the total utility of everyone!

Again, there are many deep problems here, which I won’t discuss. I’ll just mention one: in practice, it’s very hard to define utility in a way that’s precise enough to measure, much less add up! See here for a bit more:

• Utility, Wikipedia.

• Utilitarianism, Wikipedia.

### The assumption of mutual rationality

Game theory is simplest when

• *all players are rational agents*,

and

• *each player knows all the other players are rational agents*.

Of course, in the real world nobody is rational all the time, so things get much more complicated. If you’re playing against an irrational agent, you have to work harder to guess what they are going to do!

But in the games we play in class, I will try to be a rational agent: I will try my best to maximize my payoff. And you too should try to be a rational agent, and maximize your payoff—since that will help your grade. And you can assume I am a rational agent. And I will assume you are a rational agent.

So: I know that if I keep making the same choice, you will make the choice that maximizes your payoff given what I do.

And: you know that if you keep making the same choice, I will make the choice that maximizes my payoff given what you do.

Given this, we should both seek a Nash equilibrium. I won’t try to state this precisely and prove it as a theorem… but I hope it’s believable. You can see some theorems about this here:

• Robert Aumann and Adam Brandenburger, Epistemic conditions for Nash equilibrium.
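To make the idea concrete, here is a minimal sketch of a search for Nash equilibria in pure strategies: a pair of choices counts if neither player can do better by changing only their own choice. The function name `pure_nash_equilibria` and the Prisoner’s Dilemma payoff numbers are assumptions made up for illustration, not values fixed anywhere above.

```python
from itertools import product

def pure_nash_equilibria(payoffs_a, payoffs_b):
    """Return all (row, col) pairs where neither player gains by deviating alone.

    payoffs_a[r][c] is player A's payoff and payoffs_b[r][c] is player B's payoff
    when A makes choice r and B makes choice c.
    """
    rows = range(len(payoffs_a))
    cols = range(len(payoffs_a[0]))
    equilibria = []
    for r, c in product(rows, cols):
        # A cannot do better by switching rows while B's column stays fixed.
        a_is_best = all(payoffs_a[r][c] >= payoffs_a[r2][c] for r2 in rows)
        # B cannot do better by switching columns while A's row stays fixed.
        b_is_best = all(payoffs_b[r][c] >= payoffs_b[r][c2] for c2 in cols)
        if a_is_best and b_is_best:
            equilibria.append((r, c))
    return equilibria

# Hypothetical Prisoner's Dilemma payoffs: choice 0 = cooperate, 1 = defect.
pd_a = [[-1, -3],
        [0, -2]]
pd_b = [[-1, 0],
        [-3, -2]]

print(pure_nash_equilibria(pd_a, pd_b))
```

With these made-up numbers the only pair that survives is the one where both players defect, which is the usual story for Prisoner’s Dilemma.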

### Probabilities

All this is fine if a Nash equilibrium exists and is unique. But we’ve seen that in some games, a Nash equilibrium doesn’t exist, at least not if we only consider **pure strategies**, where each player makes the same choice every time. And in other games, a Nash equilibrium exists but there is more than one.

In games like this, saying that players will try to find a Nash equilibrium doesn’t settle all our questions! What should they do if there’s none, or more than one?

We’ve seen one example: rock-paper-scissors. If we only consider pure strategies, this game has no Nash equilibrium. But I’ve already suggested the solution to this problem. The players should use **mixed strategies**, where they randomly make different choices with different probabilities.
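Here is a small sketch that checks both claims for rock-paper-scissors. It assumes the usual scoring of +1 for a win, -1 for a loss and 0 for a tie (those numbers are an assumption, as are the function names). It uses the uniform mixed strategy, with probability 1/3 for each choice, as the candidate.

```python
from fractions import Fraction

# Row player's payoff, assuming +1 for a win, -1 for a loss, 0 for a tie.
# Choices are ordered rock, paper, scissors for both players.
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]

def has_pure_equilibrium(payoff):
    """Check a zero-sum game for a Nash equilibrium in pure strategies."""
    n = len(payoff)
    for r in range(n):
        for c in range(n):
            # Row player cannot gain by switching rows.
            row_ok = all(payoff[r][c] >= payoff[r2][c] for r2 in range(n))
            # Zero-sum: the column player's payoff is minus the row player's.
            col_ok = all(-payoff[r][c] >= -payoff[r][c2] for c2 in range(n))
            if row_ok and col_ok:
                return True
    return False

def expected_payoffs_against(mix, payoff):
    """Row player's expected payoff for each pure choice against a mixed column strategy."""
    return [sum(p * payoff[r][c] for c, p in enumerate(mix))
            for r in range(len(payoff))]

uniform = [Fraction(1, 3)] * 3

print(has_pure_equilibrium(RPS))
print(expected_payoffs_against(uniform, RPS) == [0, 0, 0])
```

Against the uniform mix, every pure choice earns the same expected payoff, so no choice is better than any other. That is the sense in which randomizing leaves the other player nothing to exploit.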

So, to make progress, we’ll need to learn a bit of probability theory! That’ll be our next topic.

For the amusement of the class, this reminds me of a line in “Gut Feelings” by G. Gigerenzer (also recommended by Tim van Beek in our Recommended reading)

So to maximize utility/happiness in real life, every rational agent should get a brain implant in their reward center and proceed to self-stimulate:

http://en.wikipedia.org/wiki/Brain_stimulation_reward

There might be a problem, since according to that article an overabundance of happiness can lead to starvation and death. I’m not sure if it was ever tested in humans though. If that is true and “maximum total happiness” is the goal, the implant should probably be limited somehow. For example, it could require a certain minimal level of blood sugar to work.

That makes me wonder if and when such technology will become popular. It seems to be within our technological capabilities now, but I have yet to see an internet ad for a brain implant. Of course, if widespread, such a practice would have far-reaching consequences.

I’m not a utilitarian, but people who are face some challenges trying to make this philosophy precise. When we say “maximum total happiness”, do we mean *summed over people and integrated over time*? If so, being very happy for a while and then starving to death might not be our goal. What about beings that aren’t people? Etcetera, etcetera. I think there are papers in philosophy journals carefully debating all these questions.

It’s very hard to even define happiness in a way that justifies summing it over people. What if I claim I’m always twice as happy as you under any conditions? If true, this means my happiness contributes twice as much to the total as yours, so it’s more important to please me than to please you, if we’re trying to maximize this total! If false, how do we show it’s false?

The von Neumann–Morgenstern utility theorem gives conditions under which we can numerically compute someone’s happiness (or strictly speaking, utility). However, the result is only well-defined up to an additive constant and a multiplicative constant. So, it doesn’t let us determine whether I am really twice as happy as you!

Furthermore, the conditions of this theorem are rarely true in ordinary life… much to the secret shame of economists.
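As a small illustration of the “additive constant and multiplicative constant” point: if we replace a utility function u by a·u + b with a > 0, the ranking of gambles by expected utility stays the same, but ratios of utilities change. The outcomes, utility values and constants below are all hypothetical, chosen only to show the effect.

```python
def expected_utility(lottery, utility):
    """lottery: list of (probability, outcome) pairs; utility: dict outcome -> number."""
    return sum(p * utility[outcome] for p, outcome in lottery)

# Hypothetical utilities for three outcomes.
u = {"bad": 0.0, "ok": 1.0, "great": 4.0}

# A rescaled version u' = a*u + b with a > 0 (a = 0.5 and b = 10 are arbitrary).
u_rescaled = {outcome: 0.5 * value + 10.0 for outcome, value in u.items()}

sure_thing = [(1.0, "ok")]
gamble = [(0.5, "bad"), (0.5, "great")]

# The preference between the two lotteries is the same under both scales...
prefers_gamble = expected_utility(gamble, u) > expected_utility(sure_thing, u)
prefers_gamble_rescaled = (expected_utility(gamble, u_rescaled)
                           > expected_utility(sure_thing, u_rescaled))
print(prefers_gamble, prefers_gamble_rescaled)

# ...but "how many times better" is not: the ratio depends on the scale chosen.
print(u["great"] / u["ok"], u_rescaled["great"] / u_rescaled["ok"])
```

Both scales describe the same preferences, so nothing in the theorem picks out one of them. This is why a statement like “twice as happy” has no meaning in this framework.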

This idea of maximizing happiness over time reminds me of a talk by Yonatan Loewenstein about intertemporal games. It’s about games where the players are “you”, but at different times.

For example, if you have a dental cavity that is painful (daily payoff = -2) but not as painful as going to the dentist to get it cured (daily payoff = -10), then the “present you” does not want to go to the dentist. The “next moment you” may wish you had done it before but won’t go to the dentist either.

And you will end up always postponing the appointment at the dentist.
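The arithmetic behind this can be sketched as follows. The payoffs -2 and -10 come from the example. The payoff of 0 on each day after treatment and the 30-day horizon are assumptions added here so that the totals can be compared.

```python
CAVITY_PAYOFF = -2    # daily payoff while the cavity is untreated (from the example)
DENTIST_PAYOFF = -10  # payoff on the day of the dentist visit (from the example)
HEALTHY_PAYOFF = 0    # assumed payoff on each day after treatment

def total_payoff(visit_day, horizon):
    """Total payoff over `horizon` days if the visit happens on `visit_day` (None = never)."""
    total = 0
    for day in range(horizon):
        if visit_day is None or day < visit_day:
            total += CAVITY_PAYOFF
        elif day == visit_day:
            total += DENTIST_PAYOFF
        else:
            total += HEALTHY_PAYOFF
    return total

def myopic_choice():
    """Each day's "you" compares only today's payoffs, so the answer never changes."""
    return "go" if DENTIST_PAYOFF > CAVITY_PAYOFF else "postpone"

print(myopic_choice())
print(total_payoff(0, 30), total_payoff(None, 30))
```

Under these assumptions, going on the first day costs 10 points in total, while never going costs 60 over 30 days. Yet the player who looks only at today’s payoff postpones every single day.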

I’m pretty sure I’m not a utilitarian either, if that means we sum “total happiness” (whatever the hell that would mean!) over all people and over all time, and we reach the same conclusion as Yudkowsky does here (scroll down a bit in the comments to see his conclusion). That this “summing of happiness” is even a meaningful operation is very far from clear (and yet it seems to be accepted by E.Y. without, er, “a blink of the eye”).

I’m scared of anyone who believes in ‘maximizing total utility’ to the point of willingness to torture one person to prevent a very large number of people from getting dust specks in their eyes. I’m even more scared of them if they believe in ‘maximizing expected total utility’, so that they’d be willing to torture someone to prevent a very very large number of people from having a *tiny chance* of getting dust specks in their eyes.

But ‘total utility’ is such an ill-defined concept that we can also reject the principle of ‘maximizing total utility’ on general theoretical grounds, not just on the grounds that it feels wrong.

That’s where the concept of Pareto-optimality shows its power. In Yudkowsky’s game, both (-10000, 1, 1, 1, …, 1) and (0, 0, 0, 0, …, 0) are Pareto-optimal, and there’s no way to say that either case is better without adding extra structure to the problem.

The concept of Pareto-optimality is the furthest one can go in the direction of “maximizing total happiness”.
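The claim that both outcomes are Pareto-optimal can be checked with a short sketch. The function names are hypothetical, and a small number of people (`n_others = 5`) stands in for the very large number in the original thought experiment.

```python
def pareto_dominates(x, y):
    """True if x is at least as good as y for everyone and strictly better for someone."""
    return (all(a >= b for a, b in zip(x, y))
            and any(a > b for a, b in zip(x, y)))

def pareto_optimal(outcomes):
    """Keep the outcomes that no other outcome in the list Pareto-dominates."""
    return [x for x in outcomes
            if not any(pareto_dominates(y, x) for y in outcomes)]

n_others = 5  # stand-in for the very large number of other people
torture = (-10000,) + (1,) * n_others
no_torture = (0,) * (n_others + 1)

print(pareto_optimal([torture, no_torture]))
```

Neither payoff vector dominates the other: the first is worse for one person and better for everyone else. So both survive, and comparing them any further requires extra structure, such as a rule for adding up utilities across people.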