The Planck Mission

22 March, 2013

Yesterday, the Planck Mission released a new map of the cosmic microwave background radiation:

380,000 years after the Big Bang, the Universe cooled down enough for protons and electrons to settle down and combine into hydrogen atoms. Protons and electrons are charged, so back when they were freely zipping around, no light could go very far without getting absorbed and then re-radiated. When they combined into neutral hydrogen atoms, the Universe soon switched to being almost transparent… as it is today. So the light emitted from that time is still visible now!

And it would look like this picture here… if you could see microwaves.

When this light was first emitted, it would have looked white to our eyes, since the temperature of the Universe was about 4000 kelvin. That’s the temperature when half the hydrogen atoms split apart into electrons and protons. 4200 kelvin looks like a fluorescent light; 2800 kelvin like an incandescent bulb, rather yellow.

But as the Universe expanded, this light got stretched out to orange, red, infrared… and finally a dim microwave glow, invisible to human eyes. The average temperature of this glow is very close to absolute zero, but it’s been measured very precisely: 2.725 kelvin.
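
Here’s a quick back-of-the-envelope check of that stretching, written as a little Python snippet. It isn’t from the original post; it just applies Wien’s displacement law to the two temperatures quoted above to see how far the peak wavelength of the glow has shifted.

# Wien's law: a blackbody at temperature T peaks at wavelength b/T.
b = 2.898e-3                    # Wien's displacement constant, in metre-kelvins
T_then, T_now = 4000.0, 2.725   # temperature at emission (the text's figure) and today

peak_then = b / T_then          # about 7.2e-7 m: roughly 720 nanometres, deep red
peak_now  = b / T_now           # about 1.1e-3 m: roughly a millimetre, i.e. microwaves

print("peak wavelength then: %.0f nm" % (peak_then * 1e9))
print("peak wavelength now:  %.2f mm" % (peak_now * 1e3))
print("stretch factor: %.0f" % (T_then / T_now))   # about 1470, using the 4000 K figure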

But the temperature of the glow is not the same in every direction! There are tiny fluctuations! You can see them in this picture. The colors here span a range of ±0.0002 kelvin.

These fluctuations are very important, because they were later amplified by gravity, with denser patches of gas collapsing under their own gravitational attraction (thanks in part to dark matter), and becoming even denser… eventually leading to galaxies, stars and planets, you and me.

But where did these fluctuations come from? I suspect they started life as quantum fluctuations in an originally completely homogeneous Universe. Quantum mechanics takes quite a while to explain – but in this theory a situation can be completely symmetrical, yet when you measure it, you get an asymmetrical result. The universe is then a ‘sum’ of worlds where these different results are seen. The overall universe is still symmetrical, but each observer sees just a part: an asymmetrical part.

If you take this seriously, there are other worlds where fluctuations of the cosmic microwave background radiation take all possible patterns… and form galaxies in all possible patterns. So while the universe as we see it is asymmetrical, with galaxies and stars and planets and you and me arranged in a complicated and seemingly arbitrary way, the overall universe is still symmetrical – perfectly homogeneous!

That seems very nice to me. But the great thing is, we can learn more about this, not just by chatting, but by testing theories against ever more precise measurements. The Planck Mission is a great improvement over the Wilkinson Microwave Anisotropy Probe (WMAP), which in turn was a huge improvement over the Cosmic Background Explorer (COBE):

Here is some of what they’ve learned:

• It now seems the Universe is 13.82 ± 0.05 billion years old. This is a bit higher than the previous estimate of 13.77 ± 0.06 billion years from the Wilkinson Microwave Anisotropy Probe.

• It now seems the rate at which the universe is expanding, known as Hubble’s constant, is 67.15 ± 1.2 kilometers per second per megaparsec. A megaparsec is roughly 3 million light-years. This is less than earlier estimates using space telescopes, such as NASA’s Spitzer and Hubble.

• It now seems the fraction of mass-energy in the Universe in the form of dark matter is 26.8%, up from 24%. Dark energy is now estimated at 68.3%, down from 71.4%. And normal matter is now estimated at 4.9%, up from 4.6%.

These cosmological parameters, and a bunch more, are estimated here:

Planck 2013 results. XVI. Cosmological parameters.
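
One small aside about the Hubble constant listed above (my own back-of-the-envelope conversion, not something taken from the paper): the reciprocal of Hubble’s constant is a timescale, and converting the units shows it comes out close to, though not equal to, the age of the Universe.

H0 = 67.15                  # kilometres per second per megaparsec
km_per_Mpc = 3.0857e19      # kilometres in one megaparsec
seconds_per_year = 3.156e7

hubble_time_years = km_per_Mpc / H0 / seconds_per_year
print("1/H0 is about %.1f billion years" % (hubble_time_years / 1e9))   # about 14.6 billion years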

It’s amazing how we’re getting more and more accurate numbers for these basic facts about our world! But the real surprises lie elsewhere…

A lopsided universe, with a cold spot?

 

The Planck Mission found two big surprises in the cosmic microwave background:

• This radiation is slightly different on opposite sides of the sky! This is not due to the fact that the Earth is moving relative to the average position of galaxies. That fact does make the radiation look hotter in the direction we’re moving. But that produces a simple pattern called a ‘dipole moment’ in the temperature map. If we subtract that out, it seems there are real differences between two sides of the Universe… and they are complex, interesting, and not explained by the usual theories!

• There is a cold spot that seems too big to be caused by chance. If this is for real, it’s the largest thing in the Universe.

Could these anomalies be due to experimental errors, or errors in data analysis? I don’t know! They were already seen by the Wilkinson Microwave Anisotropy Probe; for example, here is WMAP’s picture of the cold spot:

The Planck Mission seems to be seeing them more clearly with its better measurements. Paolo Natoli, from the University of Ferrara, writes:

The Planck data call our attention to these anomalies, which are now more important than ever: with data of such quality, we can no longer neglect them as mere artefacts and we must search for an explanation. The anomalies indicate that something might be missing from our current understanding of the Universe. We need to find a model where these peculiar traits are no longer anomalies but features predicted by the model itself.

For a lot more detail, see this paper:

Planck 2013 results. XXIII. Isotropy and statistics of the CMB.

(I apologize for not listing the authors on these papers, but there are hundreds!) Let me paraphrase the abstract for people who want just a little more detail:

Many of these anomalies were previously observed in the Wilkinson Microwave Anisotropy Probe data, and are now confirmed at similar levels of significance (around 3 standard deviations). However, we find little evidence for non-Gaussianity with the exception of a few statistical signatures that seem to be associated with specific anomalies. In particular, we find that the quadrupole-octopole alignment is also connected to a low observed variance of the cosmic microwave background signal. The dipolar power asymmetry is now found to persist to much smaller angular scales, and can be described in the low-frequency regime by a phenomenological dipole modulation model. Finally, it is plausible that some of these features may be reflected in the angular power spectrum of the data which shows a deficit of power on the same scales. Indeed, when the power spectra of two hemispheres defined by a preferred direction are considered separately, one shows evidence for a deficit in power, whilst its opposite contains oscillations between odd and even modes that may be related to the parity violation and phase correlations also detected in the data. Whilst these analyses represent a step forward in building an understanding of the anomalies, a satisfactory explanation based on physically motivated models is still lacking.

If you’re a scientist, your mouth should be watering now… your tongue should be hanging out! If this stuff holds up, it’s amazing, because it would call for real new physics.

I’ve heard that the difference between hemispheres might fit the simplest homogeneous but not isotropic solutions of general relativity, the Bianchi models. However, this is something one should carefully test using statistics… and I’m sure people will start doing this now.

As for the cold spot, the best explanation I can imagine is some sort of mechanism for producing fluctuations very early on… so that these fluctuations would get blown up to enormous size during the inflationary epoch, roughly between 10^{-36} and 10^{-32} seconds after the Big Bang. I don’t know what this mechanism would be!

There are also ways of trying to ‘explain away’ the cold spot, but even these seem jaw-droppingly dramatic. For example, an almost empty region 150 megaparsecs (500 million light-years) across would tend to cool down cosmic microwave background radiation coming through it. But it would still be the largest thing in the Universe! And such an unusual void would seem to beg for an explanation of its own.

Particle physics

The Planck Mission also shed a lot of light on particle physics, and especially on inflation. But, it mainly seems to have confirmed what particle physicists already suspected! This makes them rather grumpy, because these days they’re always hoping for something new, and they’re not getting it.

We can see this at Jester’s blog Résonaances, which also gives a very nice, though technical, summary of what the Planck Mission did for particle physics:

From a particle physicist’s point of view the single most interesting observable from Planck is the notorious N_{\mathrm{eff}}. This observable measures the effective number of degrees of freedom with sub-eV mass that coexisted with the photons in the plasma at the time when the CMB was formed (see e.g. my older post for more explanations). The standard model predicts N_{\mathrm{eff}} \approx 3, corresponding to the 3 active neutrinos. Some models beyond the standard model featuring sterile neutrinos, dark photons, or axions could lead to N_{\mathrm{eff}} > 3, not necessarily an integer. For a long time various experimental groups have claimed N_{\mathrm{eff}} much larger than 3, but with an error too large to blow the trumpets. Planck was supposed to sweep the floor and it did. They find

N_{\mathrm{eff}} = 3 \pm 0.5,

that is, no hint of anything interesting going on. The gurgling sound you hear behind the wall is probably your colleague working on sterile neutrinos committing a ritual suicide.

Another number of interest for particle theorists is the sum of neutrino masses. Recall that oscillation experiments tell us only about the mass differences, whereas the absolute neutrino mass scale is still unknown. Neutrino masses larger than 0.1 eV would produce an observable imprint into the CMB. [….] Planck sees no hint of neutrino masses and puts the 95% CL limit at 0.23 eV.

Literally, the most valuable Planck result is the measurement of the spectral index n_s, as it may tip the scale for the Nobel committee to finally hand out the prize for inflation. Simplest models of inflation (e.g., a scalar field φ with a φ^n potential slowly changing its vacuum expectation value) predict the spectrum of primordial density fluctuations that is adiabatic (the same in all components) and Gaussian (full information is contained in the 2-point correlation function). Much as previous CMB experiments, Planck does not see any departures from that hypothesis. A more quantitative prediction of simple inflationary models is that the primordial spectrum of fluctuations is almost but not exactly scale-invariant. More precisely, the spectrum is of the form

\displaystyle{ P \sim (k/k_0)^{n_s-1} }

with n_s close to but typically slightly smaller than 1, the size of n_s being dependent on how quickly (i.e. how slowly) the inflaton field rolls down its potential. The previous result from WMAP-9,

n_s=0.972 \pm 0.013

(n_s =0.9608 \pm 0.0080 after combining with other cosmological observables) was already a strong hint of a red-tilted spectrum. The Planck result

n_s = 0.9603 \pm 0.0073

(n_s =0.9608 \pm 0.0054 after combination) pushes the departure of n_s - 1 from zero past the magic 5 sigma significance. This number can of course also be fitted in more complicated models or in alternatives to inflation, but it is nevertheless a strong support for the most trivial version of inflation.

[….]

In summary, the cosmological results from Planck are really impressive. We’re looking into a pretty wide range of complex physical phenomena occurring billions of years ago. And, at the end of the day, we’re getting a perfect description with a fairly simple model. If this is not a moment to cry out “science works bitches”, nothing is. Particle physicists, however, can find little inspiration in the Planck results. For us, what Planck has observed is by no means an almost perfect universe… it’s rather the most boring universe.
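
Just to double-check the quoted numbers on the spectral index (this little computation is mine, not part of the quote): dividing the departure of n_s from 1 by the quoted error bar gives the significance in standard deviations, and the Planck numbers do indeed come out past 5 sigma.

results = [("WMAP-9",          0.972,  0.013),
           ("WMAP-9 combined", 0.9608, 0.0080),
           ("Planck",          0.9603, 0.0073),
           ("Planck combined", 0.9608, 0.0054)]

for label, n_s, sigma in results:
    print("%-16s (1 - n_s)/sigma = %.1f" % (label, (1 - n_s) / sigma))
# prints roughly 2.2, 4.9, 5.4 and 7.3 standard deviations respectively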

I find it hilarious to hear someone complain that the universe is “boring” on a day when astrophysicists say they’ve discovered the universe is lopsided and has a huge cold region, the largest thing ever seen by humans!

However, particle physicists seem so far rather skeptical of these exciting developments. Is this sour grapes, or are they being wisely cautious?

Time, as usual, will tell.


Meta-Rationality

15 March, 2013

On his blog, Eli Dourado writes something that’s very relevant to the global warming debate, and indeed most other debates.

He’s talking about Paul Krugman, but I think with small modifications we could substitute the name of almost any intelligent pundit. I don’t care about Krugman here, I care about the general issue:

Nobel laureate, Princeton economics professor, and New York Times columnist Paul Krugman is a brilliant man. I am not so brilliant. So when Krugman makes strident claims about macroeconomics, a complex subject on which he has significantly more expertise than I do, should I just accept them? How should we evaluate the claims of people much smarter than ourselves?

A starting point for thinking about this question is the work of another Nobelist, Robert Aumann. In 1976, Aumann showed that under certain strong assumptions, disagreement on questions of fact is irrational. Suppose that Krugman and I have read all the same papers about macroeconomics, and we have access to all the same macroeconomic data. Suppose further that we agree that Krugman is smarter than I am. All it should take, according to Aumann, for our beliefs to converge is for us to exchange our views. If we have common “priors” and we are mutually aware of each others’ views, then if we do not agree ex post, at least one of us is being irrational.

It seems natural to conclude, given these facts, that if Krugman and I disagree, the fault lies with me. After all, he is much smarter than I am, so shouldn’t I converge much more to his view than he does to mine?

Not necessarily. One problem is that if I change my belief to match Krugman’s, I would still disagree with a lot of really smart people, including many people as smart as or possibly even smarter than Krugman. These people have read the same macroeconomics literature that Krugman and I have, and they have access to the same data. So the fact that they all disagree with each other on some margin suggests that very few of them behave according to the theory of disagreement. There must be some systematic problem with the beliefs of macroeconomists.

In their paper on disagreement, Tyler Cowen and Robin Hanson grapple with the problem of self-deception. Self-favoring priors, they note, can help to serve other functions besides arriving at the truth. People who “irrationally” believe in themselves are often more successful than those who do not. Because pursuit of the truth is often irrelevant in evolutionary competition, humans have an evolved tendency to hold self-favoring priors and self-deceive about the existence of these priors in ourselves, even though we frequently observe them in others.

Self-deception is in some ways a more serious problem than mere lack of intelligence. It is embarrassing to be caught in a logical contradiction, as a stupid person might be, because it is often impossible to deny. But when accused of disagreeing due to a self-favoring prior, such as having an inflated opinion of one’s own judgment, people can and do simply deny the accusation.

How can we best cope with the problem of self-deception? Cowen and Hanson argue that we should be on the lookout for people who are “meta-rational,” honest truth-seekers who choose opinions as if they understand the problem of disagreement and self-deception. According to the theory of disagreement, meta-rational people will not have disagreements among themselves caused by faith in their own superior knowledge or reasoning ability. The fact that disagreement remains widespread suggests that most people are not meta-rational, or—what seems less likely—that meta-rational people cannot distinguish one another.

We can try to identify meta-rational people through their cognitive and conversational styles. Someone who is really seeking the truth should be eager to collect new information through listening rather than speaking, construe opposing perspectives in their most favorable light, and offer information of which the other parties are not aware, instead of simply repeating arguments the other side has already heard.

All this seems obvious to me, but it’s discussed much too rarely. Maybe we can figure out ways to encourage this virtue that Cowen and Hanson call ‘meta-rationality’? There are already too many mechanisms that reward people for aggressively arguing for fixed positions. If Krugman really were ‘meta-rational’, he might still have his Nobel Prize, but he probably wouldn’t be a popular newspaper columnist.

The Azimuth Project, and this blog, are already doing a lot of things to prevent people from getting locked into fixed positions and filtering out evidence that goes against their views. Most crucial seems to be the policy of forbidding insults, bullying, and overly repetitive restatement of the same views. These behaviors increase what I call the ‘heat’ in a discussion, and I’ve decided that, all things considered, it’s best to keep the heat fairly low.

Heat attracts many people, so I’m sure we could get a lot more people to read this blog by turning up the heat. A little heat is a good thing, because it engages people’s energy. But heat also makes it harder for people to change their minds. When the heat gets too high, changing one’s mind is perceived as a defeat, to be avoided at all costs. Even worse, people form ‘tribes’ who back each other up in every argument, regardless of the topic. Rationality goes out the window. And meta-rationality? Forget it!

Some Questions

Dourado talks about ways to “identify meta-rational people.” This is very attractive, but I think it’s better to talk about “identifying when people are behaving meta-rationally”. I don’t think we should spend too much of our time looking around for paragons of meta-rationality. First of all, nobody is perfect. Second of all, as soon as someone gets a big reputation for rationality, meta-rationality, or any other virtue, it seems they develop a fan club that runs a big risk of turning into a cult. This often makes it harder rather than easier for people to think clearly and change their minds!

I’d rather look for customs and institutions that encourage meta-rationality. So, my big question is:

How can we encourage rationality and meta-rationality, and make them more popular?

Of course science, and academia, are institutions that have been grappling with this question for centuries. Universities, seminars, conferences, journals, and so on—they all put a lot of work into encouraging the search for knowledge and examining the conditions under which it thrives.

And of course these institutions are imperfect: everything humans do is riddled with flaws.

But instead of listing cases where existing institutions failed to do their job optimally, I’d like to think about ways of developing new customs and institutions that encourage meta-rationality… and linking these to the existing ones.

Why? Because I feel the existing institutions don’t reach out enough to the ‘general public’, or ‘laymen’. The mere existence of these terms is a clue. There are a lot of people who consider academia as an ‘ivory tower’, separate from their own lives and largely irrelevant. And there are a lot of good reasons for this.

There’s one you’ve heard me talk about a lot: academia has let its journals get bought by big multimedia conglomerates, who then charge high fees for access. So, we have scientific research on global warming paid for by our tax dollars, and published by prestigious journals such as Science and Nature… which unfortunately aren’t available to the ‘general public’.

That’s like a fire alarm you have to pay to hear.

But there’s another problem: institutions that try to encourage meta-rationality seem to operate by shielding themselves from the broader sphere that favors ‘hot’ discussions. Meanwhile, the hot discussions don’t get enough input from ‘cooler’ forums… and vice versa!

For example: we have researchers in climate science who publish in refereed journals, which mostly academics read. We have conferences, seminars and courses where this research is discussed and criticized. These are again attended mostly by academics. Then we have journalists and bloggers who try to explain and discuss these papers in more easily accessed venues. There are some blogs written by climate scientists, who try to short-circuit the middlemen a bit. Unfortunately the heated atmosphere of some of these blogs makes meta-rationality difficult. There are also blogs by ‘climate skeptics’, many from outside academia. These often criticize the published papers, but—it seems to me—rarely get into discussions with the papers’ authors in conditions that make it easy for either party to change their mind. And on top of all this, we have various think tanks who are more or less pre-committed to fixed positions… and of course, corporations and nonprofits paying for advertisements pushing various agendas.

Of course, it’s not just the global warming problem that suffers from a lack of public forums that encourage meta-rationality. That’s just an example. There have got to be some ways to improve the overall landscape a little. Just a little: I’m not expecting miracles!

Details

Here’s the paper by Aumann:

• Robert J. Aumann, Agreeing to disagree, The Annals of Statistics 4 (1976), 1236-1239.

and here’s the one by Cowen and Hanson:

• Tyler Cowen and Robin Hanson, Are disagreements honest?, 18 August 2004.

Personally I find Aumann’s paper uninteresting, because he’s discussing agents that are not only rational Bayesians, but rational Bayesians that share the same priors to begin with! It’s unsurprising that such agents would have trouble finding things to argue about.

His abstract summarizes his result quite clearly… except that he calls these idealized agents ‘people’, which is misleading:

Abstract. Two people, 1 and 2, are said to have common knowledge of an event E if both know it, 1 knows that 2 knows it, 2 knows that 1 knows it, 1 knows that 2 knows that 1 knows it, and so on.

Theorem. If two people have the same priors, and their posteriors for an event A are common knowledge, then these posteriors are equal.

Cowen and Hanson’s paper is more interesting to me. Here are some key sections for what we’re talking about here:

How Few Meta-rationals?

We can call someone a truth-seeker if, given his information and level of effort on a topic, he chooses his beliefs to be as close as possible to the truth. A non-truth seeker will, in contrast, also put substantial weight on other goals when choosing his beliefs. Let us also call someone meta-rational if he is an honest truth-seeker who chooses his opinions as if he understands the basic theory of disagreement, and abides by the rationality standards that most people uphold, which seem to preclude self-favoring priors.

The theory of disagreement says that meta-rational people will not knowingly have self-favoring disagreements among themselves. They might have some honest disagreements, such as on values or on topics of fact where their DNA encodes relevant non-self-favoring attitudes. But they will not have dishonest disagreements, i.e., disagreements directly on their relative ability, or disagreements on other random topics caused by their faith in their own superior knowledge or reasoning ability.

Our working hypothesis for explaining the ubiquity of persistent disagreement is that people are not usually meta-rational. While several factors contribute to this situation, a sufficient cause that usually remains when other causes are removed is that people do not typically seek only truth in their beliefs, not even in a persistent rational core. People tend to be hypocritical in having self-favoring priors, such as priors that violate indexical independence, even though they criticize others for such priors. And they are reluctant to admit this, either publicly or to themselves.

How many meta-rational people can there be? Even if the evidence is not consistent with most people being meta-rational, it seems consistent with there being exactly one meta-rational person. After all, in this case there never appears a pair of meta-rationals to agree with each other. So how many more meta-rationals are possible?

If meta-rational people were common, and able to distinguish one another, then we should see many pairs of people who have almost no dishonest disagreements with each other. In reality, however, it seems very hard to find any pair of people who, if put in contact, could not identify many persistent disagreements. While this is an admittedly difficult empirical determination to make, it suggests that there are either extremely few meta-rational people, or that they have virtually no way to distinguish each other.

Yet it seems that meta-rational people should be discernible via their conversation style. We know that, on a topic where self-favoring opinions would be relevant, the sequence of alternating opinions between a pair of people who are mutually aware of both being meta-rational must follow a random walk. And we know that the opinion sequence between typical non-meta-rational humans is nothing of the sort. If, when responding to the opinions of someone else of uncertain type, a meta-rational person acts differently from an ordinary non-meta-rational person, then two meta-rational people should be able to discern one another via a long enough conversation. And once they discern one another, two meta-rational people should no longer have dishonest disagreements. (Aaronson (2004) has shown that regardless of the topic or their initial opinions, any two Bayesians have less than a 10% chance of disagreeing by more than a 10% after exchanging about a thousand bits, and less than a 1% chance of disagreeing by more than a 1% after exchanging about a million bits.)

Since most people have extensive conversations with hundreds of people, many of whom they know very well, it seems that the fraction of people who are meta-rational must be very small. For example, given N people, a fraction f of whom are meta-rational, let each person participate in C conversations with random others that last long enough for two meta-rational people to discern each other. If so, there should be on average f^2CN/2 pairs who no longer disagree. If, across the world, two billion people, one in ten thousand of whom are meta-rational, have one hundred long conversations each, then we should see one thousand pairs of people with only honest disagreements. If, within academia, two million people, one in ten thousand of whom are meta-rational, have one thousand long conversations each, we should see ten agreeing pairs of academics. And if meta-rational people had any other clues to discern one another, and preferred to talk with one another, there should be far more such pairs. Yet, with the possible exception of some cult-like or fan-like relationships, where there is an obvious alternative explanation for their agreement, we know of no such pairs of people who no longer disagree on topics where self-favoring opinions are relevant.

We therefore conclude that unless meta-rationals simply cannot distinguish each other, only a tiny non-descript percentage of the population, or of academics, can be meta-rational. Either few people have truth-seeking rational cores, and those that do cannot be readily distinguished, or most people have such cores but they are in control infrequently and unpredictably. Worse, since it seems unlikely that the only signals of meta-rationality would be purely private signals, we each seem to have little grounds for confidence in our own meta-rationality, however much we would like to believe otherwise.
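
As a quick check of the arithmetic in that passage (my own, not part of the quote): with N people, a fraction f of them meta-rational, and C long conversations each, the expected number of mutually agreeing pairs is roughly f^2CN/2, and the quoted figures do come out as stated.

def agreeing_pairs(N, f, C):
    # expected number of long conversations between two meta-rational people,
    # i.e. pairs who should end up with only honest disagreements
    return f**2 * C * N / 2

print(round(agreeing_pairs(N=2_000_000_000, f=1e-4, C=100)))    # the whole world: 1000 pairs
print(round(agreeing_pairs(N=2_000_000,     f=1e-4, C=1000)))   # academia: 10 pairs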

Personally, I think the failure to find ‘ten agreeing pairs of academics’ is not very interesting. Instead of looking for people who are meta-rational in all respects, which seems futile, I’m more interested in looking for contexts and institutions that encourage people to behave meta-rationally when discussing specific issues.

For example, there’s surprisingly little disagreement among mathematicians when they’re discussing mathematics and they’re on their best behavior—for example, talking in a classroom. Disagreements show up, but they’re often dismissed quickly when one or both parties realize their mistake. The same people can argue bitterly and endlessly over politics or other topics. They are not meta-rational people: I doubt such people exist. They are people who have been encouraged by an institution to behave meta-rationally in specific limited ways… because the institution rewards this behavior.

Moving on:

Personal policy implications

Readers need not be concerned about the above conclusion if they have not accepted our empirical arguments, or if they are willing to embrace the rationality of self-favoring priors, and to forgo criticizing the beliefs of others caused by such priors. Let us assume, however, that you, the reader, are trying to be one of those rare meta-rational souls in the world, if indeed there are any. How guilty should you feel when you disagree on topics where self-favoring opinions are relevant?

If you and the people you disagree with completely ignored each other’s opinions, then you might tend to be right more if you had greater intelligence and information. And if you were sure that you were meta-rational, the fact that most people were not might embolden you to disagree with them. But for a truth-seeker, the key question must be how sure you can be that you, at the moment, are substantially more likely to have a truth-seeking, in-control, rational core than the people you now disagree with. This is because if either of you have some substantial degree of meta-rationality, then your relative intelligence and information are largely irrelevant except as they may indicate which of you is more likely to be self-deceived about being meta-rational.

One approach would be to try to never assume that you are more meta-rational than anyone else. But this cannot mean that you should agree with everyone, because you simply cannot do so when other people disagree among themselves. Alternatively, you could adopt a “middle” opinion. There are, however, many ways to define middle, and people can disagree about which middle is best (Barns 1998). Not only are there disagreements on many topics, but there are also disagreements on how to best correct for one’s limited meta-rationality.

Ideally we would want to construct a model of the process of individual self-deception, consistent with available data on behavior and opinion. We could then use such a model to take the observed distribution of opinion, and infer where lies the weight of evidence, and hence the best estimate of the truth. [Ideally this model would also satisfy a reflexivity constraint: when applied to disputes about self-deception it should select itself as the best model of self-deception. If people reject the claim that most people are self-deceived about their meta-rationality, this approach becomes more difficult, though perhaps not impossible.]

A more limited, but perhaps more feasible, approach to relative meta-rationality is to seek observable signs that indicate when people are self-deceived about their meta-rationality on a particular topic. You might then try to disagree only with those who display such signs more strongly than you do. For example, psychologists have found numerous correlates of self-deception. Self-deception is harder regarding one’s overt behaviors, there is less self-deception in a galvanic skin response (as used in lie detector tests) than in speech, the right brain hemisphere tends to be more honest, evaluations of actions are less honest after those actions are chosen than before (Trivers 2000), self-deceivers have more self-esteem and less psychopathology, especially less depression (Paulhus 1986), and older children are better than younger ones at hiding their self-deception from others (Feldman & Custrini 1988). Each correlate implies a corresponding sign of self-deception.

Other commonly suggested signs of self-deception include idiocy, self-interest, emotional arousal, informality of analysis, an inability to articulate supporting arguments, an unwillingness to consider contrary arguments, and ignorance of standard mental biases. If verified by further research, each of these signs would offer clues for identifying other people as self-deceivers.

Of course, this is easier said than done. It is easy to see how self-deceiving people, seeking to justify their disagreements, might try to favor themselves over their opponents by emphasizing different signs of self-deception in different situations. So looking for signs of self-deception need not be an easier approach than trying to overcome disagreement directly by further discussion on the topic of the disagreement.

We therefore end on a cautionary note. While we have identified some considerations to keep in mind, were one trying to be one of those rare meta-rational souls, we have no general recipe for how to proceed. Perhaps recognizing the difficulty of this problem can at least make us a bit more wary of our own judgments when we disagree.


Game Theory (Part 20)

11 March, 2013

Last time we tackled von Neumann’s minimax theorem:

Theorem. For every zero-sum 2-player normal form game,

\displaystyle{\min_{q'} \max_{p'} \; p' \cdot A q' = \max_{p'} \min_{q'} \; p' \cdot A q'}

where p' ranges over player A’s mixed strategies and q' ranges over player B’s mixed strategies.

We reduced the proof to two geometrical lemmas. Now let’s prove those… and finish up the course!

But first, let me chat a bit about this theorem. Von Neumann first proved it in 1928. He later wrote:

As far as I can see, there could be no theory of games … without that theorem … I thought there was nothing worth publishing until the Minimax Theorem was proved.

Von Neumann gave several proofs of this result:

• Tinne Hoff Kjeldsen, John von Neumann’s conception of the minimax theorem: a journey through different mathematical contexts, Arch. Hist. Exact Sci. 56 (2001) 39–68.

In 1937 he gave a proof which became quite famous, based on an important result in topology: Brouwer’s fixed point theorem. This says that if you have a ball

B = \{ x \in \mathbb{R}^n : \|x\| \le 1 \}

and a continuous function

f: B \to B

then this function has a fixed point, meaning a point x \in B with

f(x) = x

You’ll often see Brouwer’s fixed point theorem in a first course on algebraic topology, though John Milnor came up with a proof using just multivariable calculus and a bit more.

After von Neumann proved his minimax theorem using Brouwer’s fixed point theorem, the mathematician Shizuo Kakutani proved another fixed-point theorem in 1941, which let him get the minimax theorem in a different way. This is now called the Kakutani fixed-point theorem.

In 1949, John Nash generalized von Neumann’s result to nonzero-sum games with any number of players: they all have Nash equilibria if we let ourselves use mixed strategies! His proof is just one page long, and it won him the Nobel prize!

Nash’s proof used the Kakutani fixed-point theorem. There is also a proof of Nash’s theorem using Brouwer’s fixed-point theorem; see here for the 2-player case and here for the n-player case.

Apparently when Nash explained his result to von Neumann, the latter said:

That’s trivial, you know. That’s just a fixed point theorem.

Maybe von Neumann was a bit jealous?

I don’t know a proof of Nash’s theorem that doesn’t use a fixed-point theorem. But von Neumann’s original minimax theorem seems to be easier. The proof I showed you last time comes from Andrew Colman’s book Game Theory and its Applications in the Social and Biological Sciences. In it, he writes:

In common with many people, I first encountered game theory in non-mathematical books, and I soon became intrigued by the minimax theorem but frustrated by the way the books tiptoed around it without proving it. It seems reasonable to suppose that I am not the only person who has encountered this problem, but I have not found any source to which mathematically unsophisticated readers can turn for a proper understanding of the theorem, so I have attempted in the pages that follow to provide a simple, self-contained proof with each step spelt out as clearly as possible both in symbols and words.

There are other proofs that avoid fixed-point theorems: for example, there’s one in Ken Binmore’s book Playing for Real. But this one uses transfinite induction, which seems a bit scary and distracting! So far, Colman’s proof seems simplest, but I’ll keep trying to do better.

The lemmas

Now let’s prove the two lemmas from last time. A lemma is an unglamorous result which we use to prove a theorem we’re interested in. The mathematician Paul Taylor has written:

Lemmas do the work in mathematics: theorems, like management, just take the credit.

Let’s remember what we were doing. We had a zero-sum 2-player normal-form game with an m \times n payoff matrix A. The entry A_{ij} of this matrix is player A’s payoff when player A makes choice i and player B makes choice j. We defined this set:

C = \{  A q' : \; q' \textrm{ is a mixed strategy for B} \} \subseteq \mathbb{R}^m

For example, if

\displaystyle{ A = \left( \begin{array}{rrr} 2 & 10 &  4 \\-2 & 1 & 6 \end{array} \right) }

then C looks like this:

We assumed that

\displaystyle{ \min_{q'} \max_{p'} \; p' \cdot A q' > 0}

This means that for each mixed strategy q' for player B, there exists a mixed strategy p' with

\displaystyle{  p' \cdot A q' > 0}

and this implies that at least one of the numbers (Aq')_i must be positive. So, if we define a set N by

\displaystyle{ N = \{(x_1, \dots, x_m) : x_i \le 0 \textrm{ for all } i\} \subseteq \mathbb{R}^m }

then Aq' can’t be in this set:

\displaystyle{ Aq' \notin N }

In other words, the set C \cap N is empty.

Here’s what C and N look like in our example:

Next, we choose a point in N and a point in C:

• let r be a point in N that’s as close as possible to C,

and

• let s be a point in C that’s as close as possible to r.

These points r and s need to be different, since C \cap N is empty. Here’s what these points and the vector s - r look like in our example:

To finish the job, we need to prove two lemmas:

Lemma 1. r \cdot (s-r) = 0, s_i - r_i \ge 0 for all i, and s_i - r_i > 0 for at least one i.

Proof. Suppose r' is any point in N whose coordinates are all the same as those of r, except perhaps one, namely the ith coordinate for one particular choice of i. By the way we’ve defined s and r, this point r' can’t be closer to s than r is:

\| r' - s \| \ge  \| r - s \|

This means that

\displaystyle{ \sum_{j = 1}^m  (r_j' - s_j)^2 \ge  \sum_{j = 1}^m  (r_j - s_j)^2  }

But since r_j' = r_j except when j = i, this implies

(r_i' - s_i)^2 \ge  (r_i - s_i)^2

Now, if s_i \le 0 we can take r'_i = s_i. In this case we get

0 \ge  (r_i - s_i)^2

so r_i = s_i. On the other hand, if s_i > 0 we can take r'_i = 0 and get

s_i^2 \ge  (r_i - s_i)^2

which simplifies to

2 r_i s_i \ge r_i^2

But r_i \le 0 and s_i > 0, so this can only be true if r_i = 0.

In short, we know that either

r_i = s_i

or

s_i > 0 and r_i = 0.

So, either way we get

(s_i - r_i) r_i = 0

Since i was arbitrary, this implies

\displaystyle{ (s - r) \cdot r = \sum_{i = 1}^m (s_i - r_i) r_i = 0 }

which is the first thing we wanted to show. Also, either way we get

s_i - r_i \ge 0

which is the second thing we wanted. Finally, s_i - r_i \ge 0 but we know s \ne r, so

s_i - r_i > 0

for at least one choice of i. And this is the third thing we wanted!   █

Lemma 2. If Aq' is any point in C, then

(s-r) \cdot Aq' \ge 0

Proof. Let’s write

Aq' = a

for short. For any number t between 0 and 1, the point

ta + (1-t)s

is on the line segment connecting the points a and s. Since both these points are in C, so is this point ta + (1-t)s, because the set C is convex. So, by the way we’ve defined s and r, this point can’t be closer to r than s is:

\| r - (ta + (1-t)s) \| \ge  \| r - s \|

This means that

\displaystyle{  (r - ta - (1-t)s) \cdot  (r - ta - (1-t)s) \ge (r - s) \cdot (r - s) }

With some algebra, this gives

\displaystyle{ 2 (a - s)\cdot (s - r) \ge -t (a - s) \cdot (a - s)  }

Since we can make t as small as we want, this implies that

\displaystyle{  (a - s)\cdot  (s - r) \ge 0  }

or

\displaystyle{ a \cdot (s - r) \ge  s \cdot (s - r)}

or

\displaystyle{ a \cdot (s - r) \ge  (s - r) \cdot (s - r) + r \cdot (s - r)}

By Lemma 1 we have r \cdot (s - r) = 0, and the dot product of any vector with itself is nonnegative, so it follows that

\displaystyle{ a \cdot (s - r) \ge 0}

And this is what we wanted to show!   █
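
Before wrapping up, here is a quick numerical sanity check of both lemmas for the example payoff matrix above. It’s my own sketch, assuming numpy and scipy are available. Since N consists of the points with all coordinates \le 0, the nearest point of N to any point x is obtained by clipping x above at zero, so we can find s by minimizing the distance from Aq' to N over the simplex of mixed strategies, and then take r to be the clipped version of s; this gives the same pair of points as the definitions in the post.

import numpy as np
from scipy.optimize import minimize

A = np.array([[ 2.0, 10.0, 4.0],
              [-2.0,  1.0, 6.0]])
n = A.shape[1]

def dist2_to_N(q):
    # squared distance from the point Aq (a point of C) to the set N:
    # the nearest point of N is np.minimum(Aq, 0), so only the positive
    # coordinates of Aq contribute
    x = A @ q
    return np.sum(np.maximum(x, 0.0) ** 2)

# minimize over the simplex of mixed strategies q for player B
res = minimize(dist2_to_N, np.full(n, 1.0 / n),
               bounds=[(0.0, 1.0)] * n,
               constraints=[{'type': 'eq', 'fun': lambda q: q.sum() - 1.0}])

s = A @ res.x              # the point of C closest to N
r = np.minimum(s, 0.0)     # the point of N closest to s (and hence to C)

print("r =", r, " s =", s)
print("r . (s - r) =", r @ (s - r))        # Lemma 1: should be zero
print("s - r =", s - r)                    # Lemma 1: all components >= 0, at least one > 0
print("(s - r) . A =", (s - r) @ A)        # Lemma 2: checking the columns of A suffices, all >= 0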

Conclusion

Proving lemmas is hard work, and unglamorous. But if you remember the big picture, you’ll see how great this stuff is.

We started with a very general concept of two-person game. Then we introduced probability theory and the concept of ‘mixed strategy’. Then we realized that the expected payoff of each player could be computed using a dot product! This brings geometry into the subject. Using geometry, we’ve seen that every zero-sum game has at least one ‘Nash equilibrium’, where neither player is motivated to change what they do—at least if they’re rational agents.

And this is how math works: by taking a simple concept and thinking about it very hard, over a long time, we can figure out things that are not at all obvious.

For game theory, the story goes much further than we went in this course. For starters, we should look at nonzero-sum games, and games with more than two players. John Nash showed these more general games still have Nash equilibria!

Then we should think about how to actually find these equilibria. Merely knowing that they exist is not good enough! For zero-sum games, finding the equilibria uses a subject called linear programming. This is a way to maximize a linear function given a bunch of linear constraints. It’s used all over the place—in planning, routing, scheduling, and so on.
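
To make that concrete, here is a minimal sketch (mine, not part of the course) of how the linear programming approach looks in practice, using the example payoff matrix from earlier in this post and scipy’s linprog. Each player’s maximin problem becomes a small linear program, and the two optimal values agree, as the minimax theorem says they must.

import numpy as np
from scipy.optimize import linprog

A = np.array([[ 2.0, 10.0, 4.0],
              [-2.0,  1.0, 6.0]])
m, n = A.shape

# Player A: maximize v subject to p . (column j of A) >= v for every j,
# with p a probability vector. Variables are (p_1, ..., p_m, v), and
# linprog minimizes, so we minimize -v.
res_A = linprog(c=np.r_[np.zeros(m), -1.0],
                A_ub=np.c_[-A.T, np.ones(n)],                 # v - p.A_j <= 0
                b_ub=np.zeros(n),
                A_eq=np.r_[np.ones(m), 0.0].reshape(1, -1),   # p sums to 1
                b_eq=[1.0],
                bounds=[(0, None)] * m + [(None, None)])

# Player B: minimize w subject to (row i of A) . q <= w for every i,
# with q a probability vector. Variables are (q_1, ..., q_n, w).
res_B = linprog(c=np.r_[np.zeros(n), 1.0],
                A_ub=np.c_[A, -np.ones(m)],                   # A_i.q - w <= 0
                b_ub=np.zeros(m),
                A_eq=np.r_[np.ones(n), 0.0].reshape(1, -1),   # q sums to 1
                b_eq=[1.0],
                bounds=[(0, None)] * n + [(None, None)])

print("maximin value for A:", -res_A.fun)    # both equal the value of the game,
print("minimax value for B:",  res_B.fun)    # which is 2 for this matrix
print("A's maximin strategy:", res_A.x[:m])
print("B's maximin strategy:", res_B.x[:n])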

Game theory is used a lot by economists, for example in studying competition between firms, and in setting up antitrust regulations. For that, try this book:

• Lynne Pepall, Dan Richards and George Norman, Industrial Organization: Contemporary Theory and Empirical Applications, Blackwell, 2008.

For these applications, we need to think about how people actually play games and make economic decisions. We aren’t always rational agents! So, psychologists, sociologists and economists do experiments to study what people actually do. The book above has a lot of case studies, and you can learn more here:

• Andrew Colman, Game Theory and its Applications in the Social and Biological Sciences, Routledge, London, 1982.

As this book title hints, we should also think about how game theory enters into biology. Evolution can be seen as a game where the winning genes reproduce and the losers don’t. But it’s not all about competition: there’s a lot of cooperation involved. Life is not a zero-sum game! Here’s a good introduction to some of the math:

• William H. Sandholm, Evolutionary game theory, 12 November 2007.

For more on the biology, get ahold of this classic text:

• John Maynard Smith, Evolution and the Theory of Games, Cambridge University Press, 1982.

And so on. We’ve just scratched the surface!


Geoengineering Report

11 March, 2013

I think we should start serious research on geoengineering schemes, including actual experiments, not just calculations and simulations. I think we should do this with an open mind about whether we’ll decide that these schemes are good ideas or bad. Either way, we need to learn more about them. Simultaneously, we need an intelligent, well-informed debate about the many ethical, legal and political aspects.

Many express the fear that merely researching geoengineering schemes will automatically legitimate them, however hare-brained they are. There’s some merit to that fear. But I suspect that public opinion on geoengineering will suddenly tip from “unthinkable!” to “let’s do it now!” as soon as global warming becomes perceived as a real and present threat. This is especially true because oil, coal and gas companies have a big interest in finding solutions to global warming that don’t make them stop digging.

So if we don’t learn more about geoengineering schemes, and we start getting heat waves that threaten widespread famine, we should not be surprised if some big government goes it alone and starts doing something cheap and easy like putting tons of sulfur into the upper atmosphere… even if it’s been inadequately researched.

It’s hard to imagine a more controversial topic. But I think there’s one thing most of us should be able to agree on: we should pay attention to what governments are doing about geoengineering! So, let me quote a bit of this report prepared for the US Congress:

• Kelsi Bracmort and Richard K. Lattanzio, Geoengineering: Governance and Technology Policy, CRS Report for Congress, Congressional Research Service, 2 January 2013.

Kelsi Bracmort is a specialist in agricultural conservation and natural resources policy, and Richard K. Lattanzio is an analyst in environmental policy.

I will delete references to footnotes, since they’re huge and I’m too lazy to include them all here. So, go to the original text for those!

Introduction

Climate change has received considerable policy attention in the past several years both internationally and within the United States. A major report released by the Intergovernmental Panel on Climate Change (IPCC) in 2007 found widespread evidence of climate warming, and many are concerned that climate change may be severe and rapid with potentially catastrophic consequences for humans and the functioning of ecosystems. The National Academies maintains that the climate change challenge is unlikely to be solved with any single strategy or by the people of any single country.

Policy efforts to address climate change use a variety of methods, frequently including mitigation and adaptation. Mitigation is the reduction of the principal greenhouse gas (GHG) carbon dioxide (CO2) and other GHGs. Carbon dioxide is the dominant greenhouse gas emitted naturally through the carbon cycle and through human activities like the burning of fossil fuels. Other commonly discussed GHGs include methane, nitrous oxide, hydrofluorocarbons, perfluorocarbons, and sulfur hexafluoride. Adaptation seeks to improve an individual’s or institution’s ability to cope with or avoid harmful impacts of climate change, and to take advantage of potential beneficial ones.

Some observers are concerned that current mitigation and adaptation strategies may not prevent change quickly enough to avoid extreme climate disruptions. Geoengineering has been suggested by some as a timely additional method to mitigation and adaptation that could be included in climate change policy efforts. Geoengineering technologies, applied to the climate, aim to achieve large-scale and deliberate modifications of the Earth’s energy balance in order to reduce temperatures and counteract anthropogenic (i.e., human-made) climate change; these climate modifications would not be limited by country boundaries. As an unproven concept, geoengineering raises substantial environmental and ethical concerns for some observers. Others respond that the uncertainties of geoengineering may only be resolved through further scientific and technical examination.

Proposed geoengineering technologies vary greatly in terms of their technological characteristics and possible consequences. They are generally classified in two main groups:

• Solar radiation management (SRM) method: technologies that would increase the reflectivity, or albedo, of the Earth’s atmosphere or surface, and

• Carbon dioxide removal (CDR) method: technologies or practices that would remove CO2 and other GHGs from the atmosphere.

Much of the geoengineering technology discussion centers on SRM methods (e.g., enhanced albedo, aerosol injection). SRM methods could be deployed relatively quickly if necessary, and their impact on the climate would be more immediate than that of CDR methods. Because SRM methods do not reduce GHG from the atmosphere, global warming could resume at a rapid pace if a deployed SRM method fails or is terminated at any time. At least one relatively simple SRM method is already being deployed with government assistance. [Enhanced albedo is one SRM effort currently being undertaken by the U.S. Environmental Protection Agency. See the Enhanced Albedo section for more information.] Other proposed SRM methods are at the conceptualization stage. CDR methods include afforestation, ocean fertilization, and the use of biomass to capture and store carbon.

The 112th Congress did not take any legislative action on geoengineering. In 2009, the House Science and Technology Committee of the 111th Congress held hearings on geoengineering that examined the “potential environmental risks and benefits of various proposals, associated domestic and international governance issues, evaluation mechanisms and criteria, research and development (R&D) needs, and economic rationales supporting the deployment of geoengineering activities.”

Some foreign governments, including the United Kingdom’s, as well as scientists from Germany and India, have begun considering engaging in the research or deployment of geoengineering technologies because of concern over the slow progress of emissions reductions, the uncertainties of climate sensitivity, the possible existence of climate thresholds (or “tipping points”), and the political, social, and economic impact of pursuing aggressive GHG mitigation strategies.

Congressional interest in geoengineering has focused primarily on whether geoengineering is a realistic, effective, and appropriate tool for the United States to use to address climate change. However, if geoengineering technologies are deployed by the United States, another government, or a private entity, several new concerns are likely to arise related to government support for, and oversight of, geoengineering as well as the transboundary and long-term effects of geoengineering. Such was the case in the summer of 2012, when an American citizen conducted a geoengineering experiment, specifically ocean fertilization, off the west coast of Canada that some say violated two international conventions.

This report is intended as a primer on the policy issues, science, and governance of geoengineering technologies. The report will first set the policy parameters under which geoengineering technologies may be considered. It will then describe selected technologies in detail and discuss their status. The third section provides a discussion of possible approaches to governmental involvement in, and oversight of, geoengineering, including a summary of domestic and international instruments and institutions that may affect geoengineering projects.

Geoengineering governance

Geoengineering technologies aim to modify the Earth’s energy balance in order to reduce temperatures and counteract anthropogenic climate change through large-scale and deliberate modifications. Implementation of some of the technologies may be controlled locally, while other technologies may require global input on implementation. Additionally, whether a technology can be controlled or not once implemented differs by technology type. Little research has been done on most geoengineering methods, and no major directed research programs are in place. Peer reviewed literature is scant, and deployment of the technology—either through controlled field tests or commercial enterprise—has been minimal.

Most interested observers agree that more research would be required to test the feasibility, effectiveness, cost, social and environmental impacts, and the possible unintended consequences of geoengineering before deployment; others reject exploration of the options as too risky. The uncertainties have led some policymakers to consider the need and the role for governmental oversight to guide research in the short term and to oversee potential deployment in the long term. Such governance structures, both domestic and international, could either support or constrain geoengineering activities, depending on the decisions of policymakers. As both technological development and policy considerations for geoengineering are in their early stages, several questions of governance remain in play:

• What risk factors and policy considerations enter into the debate over geoengineering activities and government oversight?

• At what point, if ever, should there be government oversight of geoengineering activities?

• If there is government oversight, what form should it take?

• If there is government oversight, who should be responsible for it?

• If there is publicly funded research and development, what should it cover and which disciplines should be engaged in it?

Risk Factors

As a new and emerging set of technologies potentially able to address climate change, geoengineering possesses many risk factors that must be taken into policy considerations. From a research perspective, the risk of geoengineering activities most often rests in the uncertainties of the new technology (i.e., the risk of failure, accident, or unintended consequences). However, many observers believe that the greater risk in geoengineering activities may lie in the social, ethical, legal, and political uncertainties associated with deployment. Given these risks, there is an argument that appropriate mechanisms for government oversight should be established before the federal government and its agencies take steps to promote geoengineering technologies and before new geoengineering projects are commenced. Yet, the uncertainty behind the technologies makes it unclear which methods, if any, may ever mature to the point of being deemed sufficiently effective, affordable, safe, and timely as to warrant potential deployment.

Some of the more significant risk factors associated with geoengineering are as follows:

Technology Control Dilemma. An analytical impasse inherent in all emerging technologies is that potential risks may be foreseen in the design phase but can only be proven and resolved through actual research, development, and demonstration. Ideally, appropriate safeguards are put in place during the early stages of conceptualization and development, but anticipating the evolution of a new technology can be difficult. By the time a technology is widely deployed, it may be impossible to build desirable oversight and risk management provisions without major disruptions to established interests. Flexibility is often required to both support investigative research and constrain potentially harmful deployment.

Reversibility. Risk mitigation relies on the ability to cease a technology program and terminate its adverse effects in a short period of time. In principle, all geoengineering options could be abandoned on short notice, with either an instant cessation of direct climate effects or a small time lag after abandonment.

However, the issue of reversibility applies to more than just the technologies themselves. Given the importance of internal adjustments and feedbacks in the climate system—still imperfectly understood—it is unlikely that all secondary effects from large-scale deployment would end immediately. Also, choices made regarding geoengineering methods may influence other social, economic, and technological choices regarding climate science. Advancing geoengineering options in lieu of effectively mitigating GHG emissions, for example, could result in a number of adverse effects, including ocean acidification, stresses on biodiversity, climate sensitivity shocks, and other irreversible consequences. Further, investing financially in the physical infrastructure to support geoengineering may create a strong economic resistance to reversing research and deployment activities.

Encapsulation. Risk mitigation also relies on whether a technology program is modular and contained or whether it involves the release of materials into the wider environment. The issue can be framed in the context of pollution (i.e., encapsulated technologies are often viewed as more “ethical” in that they are seen as non-polluting). Several geoengineering technologies are demonstrably non-encapsulated, and their release and deployment into the wider environment may lead to technical uncertainties, impacts on non-participants, and complex policy choices. But encapsulated technologies may still have localized environmental impacts, depending on the nature, size, and location of the application. The need for regulatory action may arise as much from the indirect impacts of activities on agro-forestry, species, and habitat as from the direct impacts of released materials in atmospheric or oceanic ecosystems.

Commercial Involvement. The role of private-sector engagement in the development and promotion of geoengineering may be debated. Commercial involvement, including competition, may be positive in that it mobilizes innovation and capital investment, which could lead to the development of more effective and less costly technologies at a faster rate than in the public sector.

However, commercial involvement could bypass or neglect social, economic, and environmental risk assessments in favor of what one commentator refers to as “irresponsible entrepreneurial behavior.” Private-sector engagement would likely require some form of public subsidies or GHG emission pricing to encourage investment, as well as additional considerations including ownership models, intellectual property rights, and trade and transfer mechanisms for the dissemination of the technologies.

Public Engagement. The consequences of geoengineering—including both benefits and risks discussed above—could affect people and communities across the world. Public attitudes toward geoengineering, and public engagement in the formation, development, and execution of proposed governance, could have a critical bearing on the future of the technologies. Perceptions of risks, levels of trust, transparency of actions, provisions for liabilities and compensation, and economies of investment could play a significant role in the political feasibility of geoengineering. Public acceptance may require a wider dialogue between scientists, policymakers, and the public.


Game Theory (Part 19)

7 March, 2013

Okay, we’re almost done! We’ve been studying Nash equilibria for zero-sum 2-player normal form games. We proved a lot of things about them, but now we’ll wrap up the story by proving this:

Grand Theorem. For every zero-sum 2-player normal-form game, a Nash equilibrium exists. Moreover, a pair of mixed strategies (p,q) for the two players is a Nash equilibrium if and only if each strategy is a maximin strategy.

Review

Let’s remember what we’ve proved in Part 16, Part 17 and Part 18:

Theorem 1. For any zero-sum 2-player normal form game,

\displaystyle{ \min_{q'} \max_{p'} p' \cdot A q' \ge \max_{p'} \min_{q'} \; p' \cdot A q'}

Theorem 2. Given a zero-sum 2-player normal form game for which a Nash equilibrium exists, we have

\displaystyle{\min_{q'} \max_{p'} \; p' \cdot A q' = \max_{p'} \min_{q'} \; p' \cdot A q'}     ★

Theorem 3. If (p,q) is a Nash equilibrium for a zero-sum 2-player normal-form game, then p is a maximin strategy for player A and q is a maximin strategy for player B.

Theorem 4. Suppose we have a zero-sum 2-player normal form game for which ★ holds. If p is a maximin strategy for player A and q is a maximin strategy for player B, then (p,q) is a Nash equilibrium.

The plan

Today we’ll prove two more results. The first one is easy if you know some topology. The second one is the real heart of the whole subject:

Theorem 5. For every zero-sum 2-player normal-form game, a maximin strategy exists for each player.

Theorem 6. For every zero-sum 2-player normal-form game, ★ holds.

Putting all these results together, it’s easy to get our final result:

Grand Theorem. For every zero-sum 2-player normal-form game, a Nash equilibrium exists. Moreover, a pair of mixed strategies (p,q) for the two players is a Nash equilibrium if and only if each strategy is a maximin strategy.

Proof. By Theorem 6 we know that ★ holds. By Theorem 5 we know that there exist maximin strategies for each player, say p and q. Theorem 4 says that if p and q are maximin strategies and ★ holds, then (p,q) is a Nash equilibrium. So, a Nash equilibrium exists.

Moreover, if (p,q) is any Nash equilibrium, Theorem 3 says p and q are maximin strategies. Conversely, since ★ holds, Theorem 4 says that if p and q are maximin strategies, (p,q) is a Nash equilibrium.   █

Maximin strategies exist

Okay, let’s dive in and get to work:

Theorem 5. For every zero-sum 2-player normal-form game, a maximin strategy exists for each player.

Proof. We’ll prove this only for player A, since the proof for player B is similar. Remember that a maximin strategy for player A is a mixed strategy that maximizes A’s security level, which is a function

\displaystyle{ f(p') = \min_{q'} p' \cdot A q' }

So, we just need to show that this function f really has a maximum. To do this, we note that

f : \{ \textrm{A's mixed strategies} \} \to \mathbb{R}

is a continuous function defined on a compact set. As mentioned at the start of Part 17, this guarantees that f has a maximum.   █

I apologize if this proof is hard to understand. All this stuff is standard if you know some topology, and a huge digression if you don’t, so I won’t go through the details. This is a nice example of how topology can be useful in other subjects!
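
If you’d like something more concrete, here is a small Python sketch—just an illustration, not part of the proof. It takes the 2×3 payoff matrix from the example later in this post, samples the security level f on a grid of player A’s mixed strategies (a segment, since A has only two pure strategies here), and finds where the maximum is attained:

import numpy as np

# The 2x3 payoff matrix from the example later in this post.
A = np.array([[ 2.0, 10.0, 4.0],
              [-2.0,  1.0, 6.0]])

def security_level(p):
    # f(p) = min over q' of p.Aq'.  Since p.Aq' is linear in q', its minimum
    # over the simplex of B's mixed strategies is attained at a vertex, so it
    # is enough to minimize over B's pure strategies, i.e. the columns of A.
    return np.min(p @ A)

# A's mixed strategies are (t, 1-t) for t in [0,1]: sample them on a grid.
ts = np.linspace(0.0, 1.0, 1001)
values = [security_level(np.array([t, 1.0 - t])) for t in ts]
best = int(np.argmax(values))
print("approximate maximin strategy:", (ts[best], 1.0 - ts[best]))
print("approximate security level: ", values[best])

On this grid the maximum sits at the pure strategy (1, 0), with security level 2.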

The key theorem

Now we finally reach the heart of the whole subject: von Neumann’s minimax theorem. Our proof will be a condensed version of the one in Andrew Colman’s 1982 book Game Theory and its Applications in the Social and Biological Sciences.

Theorem 6. For every zero-sum 2-player normal-form game,

\displaystyle{\min_{q'} \max_{p'} \; p' \cdot A q' = \max_{p'} \min_{q'} \; p' \cdot A q'}     ★

holds.

Proof. Let’s write

\displaystyle{  \max_{p'} \min_{q'} \; p' \cdot A q' = V}

and

\displaystyle{ \min_{q'} \max_{p'} \; p' \cdot A q' = W}

Our goal is to prove ★, which says V = W. By Theorem 1 we know

V \le W

So, we just need to prove

V \ge W

Here’s how we will do this. We will prove

\textrm{if } W > 0 \textrm{ then } V \ge 0

Since we’ll prove this for any game of the sort we’re studying, it’ll be true even if we add some real number c to each entry of the payoff matrix A_{ij}. Doing this adds c to the expected payoff p' \cdot A q', so it adds c to V and W. So, it will follow that

\textrm{if } W + c > 0 \textrm{ then } V + c \ge 0

for any real number c, and this implies

V \ge W
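
In case that last implication seems too quick, here it is spelled out. Pick any \epsilon > 0 and take c = \epsilon - W. Then W + c = \epsilon > 0, so V + c \ge 0, which says

\displaystyle{ V \ge W - \epsilon }

Since \epsilon > 0 was arbitrary, this gives V \ge W, as desired.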

So, let’s get going.

Assume W > 0. To prove that V \ge 0, remember that

\displaystyle{ V = \max_{p'} \min_{q'} \; p' \cdot A q'}

To show this is greater than or equal to zero, we just need to find some mixed strategy p for player A such that

\displaystyle{ \min_{q'} \; p \cdot A q' \ge 0}

In other words, we need to find p such that

\displaystyle{ p \cdot A q' \ge 0}     ★★

for all mixed strategies q' for player B.

How can we find p for which this ★★ is true? The key is to consider the set

C = \{  A q' : \; q' \textrm{ is a mixed strategy for B} \} \subseteq \mathbb{R}^m

For example, if

\displaystyle{ A = \left( \begin{array}{rrr} 2 & 10 &  4 \\-2 & 1 & 6 \end{array} \right) }

then C looks like this:

Since W > 0, for any Aq' \in C we have

\displaystyle{ \max_{p'} \; p' \cdot A q' \ge \min_{q'} \max_{p'} \; p' \cdot A q' = W > 0}

so there must exist p' with

\displaystyle{  p' \cdot A q' \ge W > 0}

Since p' = (p'_1, \dots, p'_m) is a mixed strategy, we have p'_i \ge 0 for all 1 \le i \le m. But since we’ve just seen

\displaystyle{ \sum_{i=1}^m p'_i (Aq')_i = p' \cdot A q' \ge W > 0}

at least one of the numbers (Aq')_i must be positive. In other words, if we define a set N by

\displaystyle{ N = \{(x_1, \dots, x_m) : x_i \le 0 \textrm{ for all } i\} \subseteq \mathbb{R}^m }

then Aq' can’t be in this set:

\displaystyle{ Aq' \notin N }

So, we’ve seen that no point in C can be in N:

C \cap N = \emptyset

Here’s what it looks like in our example:

Now the trick is to:

• let r be a point in N that’s as close as possible to C,

and

• let s be a point in C that’s as close as possible to r.

We need to use a bit of topology to be sure these points exist, since it means finding the minima of certain functions (namely, distances). But let’s not worry about that now! We’ll complete the proof with two lemmas:

Lemma 1. r \cdot (s-r) = 0, s_i - r_i \ge 0 for all i, and s_i - r_i > 0 for at least one i.

Lemma 2. If Aq' is any point in C, then

(s-r) \cdot Aq' \ge 0

Here’s what the points s and r and the vector s - r look like in our example:

Check to see that Lemmas 1 and 2 are true in this example! We’ll prove the lemmas later; right now let’s see how they get the job done.
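
In fact, you can get a computer to do the checking. Here’s a little Python snippet—just an illustration, not part of the proof—using the values of r and s you can read off the picture for this particular matrix: the point of N closest to C works out to be r = (0,-2), and the point of C closest to r is the corner s = (2,-2):

import numpy as np

A = np.array([[ 2.0, 10.0, 4.0],
              [-2.0,  1.0, 6.0]])

# For this example, the point of N closest to C is r = (0, -2), and the
# point of C closest to r is the corner s = (2, -2) = A(1,0,0).
r = np.array([0.0, -2.0])
s = np.array([2.0, -2.0])
d = s - r

# Lemma 1: r.(s-r) = 0, every s_i - r_i is >= 0, and at least one is > 0.
assert np.isclose(r @ d, 0.0)
assert np.all(d >= 0) and np.any(d > 0)

# Lemma 2: (s-r).Aq' >= 0 for every mixed strategy q' of player B.  Since the
# left side is linear in q', it is enough to check B's three pure strategies,
# i.e. the columns of A.
assert np.all(d @ A >= 0)
print("Lemmas 1 and 2 hold in this example; (s-r).A =", d @ A)

Rescaling s - r so its entries sum to 1 gives p = (1, 0): in this example, the strategy the proof hands to player A is simply ‘always play the first row’.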

First, by Lemma 1, the numbers s_i - r_i are nonnegative and at least one is positive. So, we can define a mixed strategy p for player A by defining

\displaystyle{ p_i = \frac{1}{c} (s_i - r_i) }

where c > 0 is a number chosen to make sure \sum_i p_i = 1. (Remember, the probabilities p_i must be \ge 0 and must sum to 1.) In other words,

\displaystyle{ p = \frac{1}{c} (s - r) }

Now, for any mixed strategy q' for player B, we have Aq' \in C and thus by Lemma 2

(s-r) \cdot Aq' \ge 0

Dividing by c, we get

p \cdot Aq' \ge 0

for all q'. But this is ★★, which is what we wanted to prove! So we are done!   █

I will give the proofs of Lemmas 1 and 2 in the next part.
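
And if you’d like to see the two sides of ★ agree numerically, here is one more Python sketch—again just an illustration. It uses the standard linear programming formulation of maximin strategies (not something we’ve discussed in this series) and scipy’s linprog to compute both sides of ★ for the example matrix above:

import numpy as np
from scipy.optimize import linprog

A = np.array([[ 2.0, 10.0, 4.0],
              [-2.0,  1.0, 6.0]])
m, n = A.shape

# Player A: maximize v subject to (p^T A)_j >= v for each column j,
# p_i >= 0 and sum_i p_i = 1.  Variables: (p_1, ..., p_m, v).
c = np.concatenate([np.zeros(m), [-1.0]])          # linprog minimizes, so minimize -v
A_ub = np.hstack([-A.T, np.ones((n, 1))])          # v - (p^T A)_j <= 0
b_ub = np.zeros(n)
A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
bounds = [(0, 1)] * m + [(None, None)]
resA = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], bounds=bounds)
p, V = resA.x[:m], resA.x[-1]

# Player B: minimize w subject to (A q)_i <= w for each row i,
# q_j >= 0 and sum_j q_j = 1.  Variables: (q_1, ..., q_n, w).
c = np.concatenate([np.zeros(n), [1.0]])
A_ub = np.hstack([A, -np.ones((m, 1))])            # (A q)_i - w <= 0
b_ub = np.zeros(m)
A_eq = np.concatenate([np.ones(n), [0.0]]).reshape(1, -1)
bounds = [(0, 1)] * n + [(None, None)]
resB = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], bounds=bounds)
q, W = resB.x[:n], resB.x[-1]

print("maximin strategy for A:", p, "  max min =", V)
print("maximin strategy for B:", q, "  min max =", W)

Both linear programs should report a value of 2 for this matrix, with player A’s maximin strategy concentrated on the first row—so the two sides of ★ really do agree here.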


Centre for Quantum Mathematics and Computation

6 March, 2013

This fall they’re opening a new Centre for Quantum Mathematics and Computation at Oxford University. They’ll be working on diagrammatic methods for topology and quantum theory, quantum gravity, and computation. You’ll understand what this means if you know the work of the people involved:

• Samson Abramsky
• Bob Coecke
• Christopher Douglas
• Kobi Kremnitzer
• Steve Simon
• Ulrike Tillmann
• Jamie Vicary

All these people are already at Oxford, so you may wonder what’s new about this center. I’m not completely sure, but they’ve gotten money from EPSRC (roughly speaking, the British NSF), and they’re already hiring a postdoc. Applications are due on March 11, so hurry up if you’re interested!

They’re having a conference October 1st to 4th to start things off. I’ll be speaking there, and they tell me that Steve Awodey, Alexander Beilinson, Lucien Hardy, Martin Hyland, Chris Isham, Dana Scott, and Anton Zeilinger have been invited too.

I’m really looking forward to seeing Chris Isham, since he’s one of the most honest and critical thinkers about quantum gravity and the big difficulties we have in understanding this subject—and he has trouble taking airplane flights, so it’s been a long time since I’ve seen him. It’ll also be great to see all the other people I know, and meet the ones I don’t.

For example, back in the 1990’s, I used to spend summers in Cambridge talking about n-categories with Martin Hyland and his students Eugenia Cheng, Tom Leinster and Aaron Lauda (who had been an undergraduate at U.C. Riverside). And more recently I’ve been talking a lot with Jamie Vicary about categories and quantum computation—since he was in Singapore some of the time while I was there. (Indeed, I’m going back there this summer, and so will he.)

I’m not as big on n-categories and quantum gravity as I used to be, but I’m still interested in the foundations of quantum theory and how it’s connected to computation, so I think I can give a talk with some new ideas in it.


Game Theory (Part 18)

5 March, 2013

We’re talking about zero-sum 2-player normal form games. Last time we saw that in a Nash equilibrium for a game like this, both players must use a maximin strategy. Now let’s try to prove the converse!

In other words: let’s try to prove that if both players use a maximin strategy, the result is a Nash equilibrium.

Today we’ll only prove this is true if a certain equation holds. It’s the cool-looking equation we saw last time:

\displaystyle{\min_{q'} \max_{p'} \; p' \cdot A q' = \max_{p'} \min_{q'} \; p' \cdot A q'}

Last time we showed this cool-looking equation is true whenever our game has a Nash equilibrium. In fact, this equation is always true. In other words: it’s true for any zero-sum two-player normal form game. The reason is that any such game has a Nash equilibrium. But we haven’t shown that yet.

So, let’s do what we can easily do.

Maximin strategies give Nash equilibria… sometimes

Let’s start by remembering some facts we saw in Part 16 and Part 17.

We’re studying a zero-sum 2-player normal form game. Player A’s payoff matrix is A, and player B’s payoff matrix is -A.

We saw that a pair of mixed strategies (p,q), one for player A and one for player B, is a Nash equilibrium if and only if

1) p \cdot A q \ge p' \cdot A q for all p'

and

2) p \cdot A q \le p \cdot A q' for all q'.

We saw that p is a maximin strategy for player A if and only if:

\displaystyle{ \min_{q'} \; p \cdot A q'  = \max_{p'} \min_{q'} \; p' \cdot A q' }

We also saw that q is a maximin strategy for player B if and only if:

\displaystyle{  \max_{p'} \; p' \cdot A q  = \min_{q'} \max_{p'} \; p' \cdot A q' }

With these in hand, we can easily prove our big result for the day. We’ll call it Theorem 4, continuing with the theorem numbers we started last time:

Theorem 4. Suppose we have a zero-sum 2-player normal form game for which

\displaystyle{\min_{q'} \max_{p'} \; p' \cdot A q' = \max_{p'} \min_{q'} \; p' \cdot A q'}      ★

holds. If p is a maximin strategy for player A and q is a maximin strategy for player B, then (p,q) is a Nash equilibrium.

Proof. Suppose that p is a maximin strategy for player A and q is a maximin strategy for player B. Thus:

\displaystyle{ \min_{q'} \; p \cdot A q'  = \max_{p'} \min_{q'} \; p' \cdot A q' }

and

\displaystyle{ \max_{p'} \; p' \cdot A q  = \min_{q'} \max_{p'} \; p' \cdot A q' }

But since ★ holds, the right sides of these two equations are equal. So, the left sides are equal too:

\displaystyle{ \min_{q'} \; p \cdot A q' = \max_{p'} \; p' \cdot A q  }     ★★

Now, since a function is always less than or equal to its maximum value, and greater than or equal to its minimum value, we have

\displaystyle{ \min_{q'} \; p \cdot A q' \le p \cdot A q \le \max_{p'} \; p' \cdot A q  }

But ★★ says the quantity at far left here equals the quantity at far right! So, the quantity in the middle must equal both of them:

\displaystyle{ \min_{q'} \; p \cdot A q' = p \cdot A q = \max_{p'} \; p' \cdot A q  }     ★★★

By the definition of minimum value, the first equation in ★★★:

\displaystyle{  p \cdot A q = \min_{q'} \; p \cdot A q' }

says that

\displaystyle{ p \cdot A q \le p \cdot A q' }

for all q'. This is condition 2) in the definition of Nash equilibrium. Similarly, by the definition of maximum value, the second equation in ★★★:

\displaystyle{ p \cdot A q = \max_{p'} \; p' \cdot A q }

says that

\displaystyle{ p \cdot A q \ge p' \cdot A q }

for all p'. This is condition 1) in the definition of Nash equilibrium. So, the pair (p,q) is a Nash equilibrium. █
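
If you like, you can check Theorem 4 numerically on a small example. Here’s a Python snippet—just an illustration, using a matrix that doesn’t appear in this post: matching pennies, where A = [[1,-1],[-1,1]] and both players’ maximin strategy is the 50-50 mix. Since p \cdot A q' and p' \cdot A q are linear in q' and p', it’s enough to test deviations to pure strategies:

import numpy as np

# Matching pennies (not a matrix from this post): player A wins 1 if the
# two choices match and loses 1 if they don't.
A = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])
p = np.array([0.5, 0.5])   # maximin strategy for player A
q = np.array([0.5, 0.5])   # maximin strategy for player B

value = p @ A @ q          # expected payoff to A, here 0

# Condition 1): p.Aq >= p'.Aq for all p'.  Testing pure strategies p' = e_i
# suffices, and e_i.Aq is just the i-th entry of Aq.
assert np.all(value >= A @ q - 1e-12)

# Condition 2): p.Aq <= p.Aq' for all q'.  Testing pure strategies q' = e_j
# suffices, and p.Ae_j is just the j-th entry of p.A.
assert np.all(value <= p @ A + 1e-12)

print("(p, q) passes both conditions, so it is a Nash equilibrium; value =", value)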

