Game Theory (Part 10)

5 February, 2013

Last time we solved some probability puzzles involving coin flips. This time we’ll look at puzzles involving cards.

Permutations

Example 1. How many ways are there to order 3 cards: a jack (J), a queen (Q), and a king (K)?

By ordering them I mean put one on top, then one in the middle, then one on the bottom. There are three choices for the first card: it can be J, Q, or K. That leaves two choices for what the second card can be, and just one for the third. So, there are

3 \times 2 \times 1 = 6

ways to order the cards.
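
Here's a quick way to check this count by computer—a minimal sketch in Python (my choice of language here; nothing in the course requires it):

    from itertools import permutations

    # List all orderings of the three cards.
    for ordering in permutations(['J', 'Q', 'K']):
        print(ordering)
    # This prints 3 * 2 * 1 = 6 orderings.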

Example 2. How many ways are there to order all 52 cards in an ordinary deck?

By the same reasoning, the answer is

52 \times 51 \times 50 \times \cdots \times 2 \times 1

This is a huge number. We call it 52 factorial, or 52! for short. I guess the exclamation mark emphasizes how huge this number is. In fact

52! \approx 8.06 \times 10^{67}

This is smaller than the number of atoms in the observable universe, which is about 10^{80}. But it’s much bigger than the number of galaxies in the observable universe, which is about 10^{11}, or even the number of stars in the observable universe, which is roughly 10^{22}. It’s impressive that we can hold such a big number in our hand… in the form of possible ways to order a deck of cards!
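
If you're curious what 52! looks like exactly, Python's integers have unlimited precision, so a tiny sketch suffices:

    import math

    n = math.factorial(52)   # an exact 68-digit integer
    print(n)
    print(f"{n:.3e}")        # 8.066e+67, in line with the estimate above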

A well-shuffled deck

Definition 1. We say a deck is well-shuffled if each of the possible ways of ordering the cards in the deck has the same probability.

Example 3. If a deck of cards is well-shuffled, what’s the probability that it’s in any one particular order?

Since all orders have the same probability, and there are 52! of them, the probability that they’re in any particular order is

\displaystyle{ \frac{1}{52!} }

So, the answer is

\displaystyle{ \frac{1}{52!} \approx 1.24 \times 10^{-68} }

A hand from a well-shuffled deck

Suppose you take the top k cards from a well-shuffled deck of n cards. You’ll get a subset of cards—though card players call this a hand of cards instead of a subset. And, there are n choose k possible hands you could get! Remember from last time:

Definition 2. The binomial coefficient

\displaystyle{ \binom{n}{k} = \frac{n(n-1)(n-2) \cdots (n-k+1)}{k(k-1)(k-2) \cdots 1}}

called n choose k, is the number of ways of choosing a subset of k things from a set of n things.

I guess card-players call a set a ‘deck’, and a subset a ‘hand’. But now we can write a cool new formula for n choose k. Just multiply the top and bottom of that big fraction by

\displaystyle{  (n-k)(n-k-1) \cdots 1}

We get

\begin{array}{ccl} \displaystyle{ \binom{n}{k}} &=& \displaystyle{  \frac{n(n-1)(n-2) \cdots 1}{(k(k-1)(k-2) \cdots 1)((n-k)(n-k-1) \cdots 1)} } \\ &=& \displaystyle{ \frac{n!}{k! (n-k)!} } \end{array}
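
Here's a quick sanity check of this identity—a Python sketch using the standard library (math.comb computes n choose k directly, so we can compare it with the factorial formula):

    import math

    for n in range(20):
        for k in range(n + 1):
            lhs = math.comb(n, k)
            rhs = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))
            assert lhs == rhs
    print("n! / (k! (n-k)!) matches n choose k for all n < 20")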

I won’t do it here, but here’s something you can prove using stuff I’ve told you. Suppose you have a well-shuffled deck of n cards and you draw a hand of k cards. Then each of these hands is equally probable!

Using this we can solve lots of puzzles.

Example 4. If you draw a hand of 5 cards from a well-shuffled standard deck, what’s the probability that you get the 10, jack, queen, king and ace of spades?

Since I’m claiming that all hands are equally probable, we just need to count the number of hands, and take the reciprocal of that.

There are

\displaystyle{ \binom{52}{5} = \frac{52 \times 51 \times 50 \times 49 \times 48}{5 \times 4 \times 3 \times 2 \times 1} }

5-card hands drawn from a 52-card deck. So, the probability of getting any particular hand is

\displaystyle{  \frac{1}{\binom{52}{5}} = \frac{5 \times 4 \times 3 \times 2 \times 1}{52 \times 51 \times 50 \times 49 \times 48} }

We can simplify this a bit since 50 is 5 × 10 and 48 is twice 4 × 3 × 2 × 1. So, the probability is

\displaystyle{  \frac{1}{52 \times 51 \times 10 \times 49 \times 2} = \frac{1}{2598960} \approx 3.85 \times 10^{-7}}
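
As a quick check of this arithmetic, here's a two-line Python sketch:

    import math

    hands = math.comb(52, 5)
    print(hands)       # 2598960
    print(1 / hands)   # about 3.85e-07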

A royal flush

The hand we just saw:

{10♠, J♠, Q♠, K♠, A♠}

is an example of a ‘royal flush’… the best kind of hand in poker!

Definition 3. A straight is a hand of five cards that can be arranged in a consecutive sequence, for example:

{7♥, 8♣, 9♠, 10♠, J♦}

Definition 4. A straight flush is a straight whose cards are all of the same suit, for example:

{7♣, 8♣, 9♣, 10♣, J♣}

Definition 5. A royal flush is a straight flush where the cards go from 10 to ace, for example:

{10♠, J♠, Q♠, K♠, A♠}

Example 5. If you draw a 5-card hand from a standard deck, what is the probability that it is a royal flush?

We have seen that each 5-card hand has probability

\displaystyle{ \frac{1}{\binom{52}{5}} = \frac{1}{2598960} }

There are just 4 royal flushes, one for each suit. So, the probability of getting a royal flush is

\displaystyle{ \frac{4}{\binom{52}{5}} = \frac{1}{649740} \approx 0.000154\%}
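
If you don't trust the counting argument, you can also estimate this probability by brute force. Here's a Monte Carlo sketch in Python—be warned that royal flushes are rare enough that you need millions of trials to see even a handful, so this takes a while to run:

    import math, random

    deck = [(rank, suit) for rank in range(1, 14) for suit in 'CDHS']
    royals = [{(1, s), (10, s), (11, s), (12, s), (13, s)} for s in 'CDHS']

    trials = 10_000_000
    hits = sum(set(random.sample(deck, 5)) in royals for _ in range(trials))
    print(hits / trials)           # should hover near the exact answer below
    print(4 / math.comb(52, 5))    # about 1.54e-06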

Puzzles

Suppose you have a well-shuffled standard deck of 52 cards, and you draw a hand of 5 cards.

Puzzle 1. What is the probability that the hand is a straight flush?

Puzzle 2. What is the probability that the hand is a straight flush but not a royal flush?

Puzzle 3. What is the probability that the hand is a straight?

Puzzle 4. What is the probability that the hand is a straight but not a straight flush?


Game Theory (Part 9)

5 February, 2013

Last time we talked about independence of a pair of events, but we can easily go on and talk about independence of a longer sequence of events. For example, suppose we have three coins. Suppose:

• the 1st coin has probability p_H of landing heads up and p_T of landing tails up;
• the 2nd coin has probability q_H of landing heads up and q_T of landing tails up;
• the 3rd coin has probability r_H of landing heads up and r_T of landing tails up.

Suppose we flip all of these coins: the 1st, then the 2nd, then the 3rd. What’s the probability that we get this sequence of results:

(H, T, T)

If the coin flips are independent, the probability is just this product:

p_H \, q_T \, r_T

See the pattern? We just multiply the probabilities. And there’s nothing special about coins here, or the number three. We could flip a coin, roll a die, pick a card, and see if it’s raining outside.

For example, what’s the probability that we get heads with our coin, the number 6 on our die, an ace of spades with our cards, and it’s raining? If these events are independent, we just calculate:

the probability that we get heads, times
the probability that we roll a 6, times
the probability that we get an ace of spades, times
the probability that it’s raining outside.

Let’s solve some puzzles using this idea!

Three flips of a fair coin

Example 1. Suppose you have a fair coin: this means it has a 50% chance of landing heads up and a 50% chance of landing tails up. Suppose you flip it three times and these flips are independent. What is the probability that it lands heads up, then tails up, then heads up?

We’re asking about the probability of this event:

(H, T, H)

Since the flips are independent this is

p_{(H,T,H)} = p_H \, p_T \, p_H

Since the coin is fair we have

\displaystyle{ p_H = p_T = \frac{1}{2} }

so

\displaystyle{ p_H p_T p_H = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8} }

So the answer is 1/8, or 12.5%.

Example 2. In the same situation, what’s the probability that the coin lands heads up exactly twice?

There are 2 × 2 × 2 = 8 events that can happen:

(H,H,H)
(H,H,T), \; (H,T,H), \; (T,H,H)
(H,T,T), \; (T,H,T), \; (T,T,H)
(T,T,T)

We can work out the probability of each of these events. For example, we’ve already seen that (H,T,H) is

\displaystyle{ p_{(H,T,H)} = p_H p_T p_H = \frac{1}{8} }

since the coin is fair and the flips are independent. In fact, all 8 probabilities work out the same way. We always get 1/8. In other words, each of the 8 events is equally likely!

But we’re interested in the probability that we get exactly two heads. That’s the probability of this subset:

S = \{ (T,H,H), (H,T,H), (H,H,T) \}

Using the rule we saw in Part 7, this probability is

\displaystyle{ p(S) = p_{(T,H,H)} + p_{(H,T,H)} + p_{(H,H,T)} = 3 \times \frac{1}{8} }

So the answer is 3/8, or 37.5%.

I could have done this a lot faster. I could say “there are 8 events that can happen, each equally likely, and three that give us two heads, so the probability is 3/8.” But I wanted to show you how we’re just following rules we’ve already seen!
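
Here's that brute-force enumeration written out as a Python sketch, just to confirm the count of 3 out of 8:

    from itertools import product

    outcomes = list(product('HT', repeat=3))                 # all 8 sequences
    two_heads = [o for o in outcomes if o.count('H') == 2]   # 3 of them
    print(len(two_heads) / len(outcomes))                    # 0.375, i.e. 3/8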

Three flips of a very unfair coin

Example 3. Now suppose we have an unfair coin with a 90% chance of landing heads up and 10% chance of landing tails up! What’s the probability that if we flip it three times, it lands heads up exactly twice? Again let’s assume the coin flips are independent.

Most of the calculation works exactly the same way, but now our coin has

\displaystyle{ p_H = 0.9, \quad p_T = 0.1 }

We’re interested in the events where the coin comes up heads twice, so we look at this subset:

S = \{ (T,H,H), (H,T,H), (H,H,T) \}

The probability of this subset is

\begin{array}{ccl} p(S) &=& p_{(T,H,H)} + p_{(H,T,H)} + p_{(H,H,T)} \\  &=& p_T \, p_H  \, p_H + p_H \, p_T \, p_H + p_H \, p_H \, p_T \\ &=& 3 p_T p_H^2 \\ &=& 3 \times 0.1 \times 0.9^2 \\ &=& 0.3 \times 0.81 \\ &=& 0.243 \end{array}

So now the probability is just 24.3%.
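
The same enumeration handles the unfair coin if we weight each sequence by its probability. Here's a sketch:

    from itertools import product

    p = {'H': 0.9, 'T': 0.1}
    total = sum(p[a] * p[b] * p[c]
                for a, b, c in product('HT', repeat=3)
                if (a, b, c).count('H') == 2)
    print(total)   # 0.243, up to floating-point rounding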

Six flips of a fair coin

Example 4. Suppose you have a fair coin. Suppose you flip it six times and these flips are independent. What is the probability that it lands heads up exactly twice?

We did a similar problem already, where we flipped the coin three times. Go back and look at that if you forget! The answer to that problem was

\displaystyle{ 3 \times \frac{1}{8} }

Why? Here’s why: there were 3 ways to get two heads when you flipped 3 coins, and each of these events had probability

\displaystyle{ \left(\frac{1}{2}\right)^3 = \frac{1}{8} }

We can do our new problem the same way. Count the number of ways to get two heads when we flip six coins. Then multiply this by

\displaystyle{ \left(\frac{1}{2}\right)^6 = \frac{1}{64} }

The hard part is to count how many ways we can get two heads when we flip six coins. To get good at probabilities, we have to get good at counting. It’s boring to list all the events we’re trying to count:

(H,H,T,T,T,T), (H,T,H,T,T,T), (H,T,T,H,T,T), …

So let’s try to come up with a better idea.

We have to pick 2 out of our 6 flips to be H’s. How many ways are there to do this?

There are 6 ways to pick one of the flips and draw a red H on it, and then 5 ways left over to pick another and draw a blue H on it… letting the rest be T’s. For example:

(T, H, T, T, H, T)

So, we’ve got 6 × 5 = 30 choices. But we don’t really care which H is red and which H is blue—that’s just a trick to help us solve the problem. For example, we don’t want to count the sequence

(T, H, T, T, H, T)

with a red H in the 2nd place and a blue H in the 5th as different from the same sequence with the colors swapped.

So, there aren’t really 30 ways to get two heads. There are only half as many! There are 15 ways.

So, the probability of getting two heads when we flip the coin six times is

\displaystyle{ 15 \times \frac{1}{64} = \frac{15}{64} \approx .234 }

where the squiggle means ‘approximately’. So: about 23.4%.

Binomial coefficients

Now for some jargon, which will help when we do harder problems like this. We say there are 6 choose 2 ways to choose 2 out of 6 things, and we write this as

\displaystyle{ \binom{6}{2} }

This sort of number is called a binomial coefficient.

We’ve just shown that

\displaystyle{ \binom{6}{2}  = \frac{6 \times 5}{2 \times 1} = 15 }

Why write it like this funky fraction: \frac{6 \times 5}{2 \times 1}? Because it’ll help us see the pattern for doing harder problems like this!

Nine flips of a fair coin

If we flip a fair coin 9 times, and the flips are independent, what’s the probability that we get heads exactly 6 times?

This works just like the last problem, only the numbers are bigger. So, I’ll do it faster!

When we flip the coin 9 times there are 2^9 possible events that can happen. Each of these is equally likely if it’s a fair coin and the flips are independent. So each has probability

\displaystyle{  \frac{1}{2^9} }

To get the answer, we need to multiply this by the number of ways we can get heads exactly 6 times. This number is called ‘9 choose 6’ or

\displaystyle{ \binom{9}{6}  }

for short. It’s the number of ways we can choose 6 things out of a collection of 9.

So we just need to know: what’s 9 choose 6? We can work this out as before. There are 9 ways to pick one of the flips and draw a red H on it, then 8 ways left to pick another and draw a blue H on it, and 7 ways left to pick a third and draw an orange H on it. That sounds like 9 × 8 × 7.

But we’ve overcounted! After all, we don’t care about the colors. We don’t want to count the sequence

(T, H, T, T, H, T, T, H, T)

with one coloring of its three H’s as different from the same sequence with the colors rearranged.

In fact we’ve counted each possibility 6 times! Why six? The first H could be red, blue or orange—that’s 3 choices. But then the second H could be either of the 2 remaining colors… and for the third, we just have 1 choice. So there are 3 × 2 × 1 = 6 ways to permute the colors.

So, the actual number of ways to get 6 heads out of 9 coin flips is

\displaystyle{ \frac{9 \times 8 \times 7}{3 \times 2 \times 1} }

In other words:

\displaystyle{ \binom{9}{6} = \frac{9 \times 8 \times 7}{3 \times 2 \times 1} }

To get the answer to our actual problem, remember we need to multiply 1/2^9 by this. So the answer is

\displaystyle{ \frac{1}{2^9} \times \binom{9}{6} }

If you’re a pure mathematician, you can say you’re done now. But normal people won’t understand this answer, so let’s calculate it out. I hope you know the powers of two up to 2^{10}: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024. So:

\displaystyle{ 2^9 = 512 }

I hope you can also do basic arithmetic like this:

\displaystyle{ \binom{9}{6} = \frac{9 \times 8 \times 7}{3 \times 2 \times 1} = 84}

So, the probability of getting 6 heads when you do 9 independent flips of a fair coin is

\displaystyle{ \frac{1}{2^9} \times \binom{9}{6}  = \frac{84}{512} = 0.1640625 }

or 16.40625%. I broke down and used a calculator at the last step. We’re becoming serious nerds here.
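
The whole calculation fits in a few lines of Python, using math.comb for the binomial coefficient—a small sketch:

    import math

    def prob_heads(n, k):
        """Probability of exactly k heads in n independent flips of a fair coin."""
        return math.comb(n, k) / 2**n

    print(prob_heads(3, 2))   # 0.375     = 3/8
    print(prob_heads(6, 2))   # 0.234375  = 15/64
    print(prob_heads(9, 6))   # 0.1640625 = 84/512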

Okay, that’s enough for now. We’ve been counting how many ways we can get a certain number of heads from a certain number of coin flips. What we’re really doing is taking a set of coin flips, say n of them, and choosing a subset of k of them to be heads. So, we say

Definition. The binomial coefficient

\displaystyle{ \binom{n}{k} }

called n choose k, is the number of ways of choosing a subset of k things from a set of n things.

We have seen in some examples that

\displaystyle{ \binom{n}{k} = \frac{n(n-1)(n-2) \cdots (n-k+1)}{k(k-1)(k-2) \cdots 1} }

Here there’s a product of k consecutive numbers on top, and a product of k numbers on the bottom too. We didn’t prove this is true in general, but it’s not hard to see, using the tricks we’ve used already.
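
If you want to compute binomial coefficients yourself, this formula translates directly into code. Here's a Python sketch, checked against the standard library's math.comb:

    import math

    def choose(n, k):
        # n (n-1) ... (n-k+1): a product of k consecutive numbers
        top = math.prod(range(n - k + 1, n + 1))
        # k (k-1) ... 1, i.e. k!
        bottom = math.factorial(k)
        return top // bottom

    assert all(choose(n, k) == math.comb(n, k)
               for n in range(15) for k in range(n + 1))
    print(choose(9, 6))   # 84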


Milankovich vs the Ice Ages

30 January, 2013

guest post by Blake Pollard

Hi! My name is Blake S. Pollard. I am a physics graduate student working under Professor Baez at the University of California, Riverside. I studied Applied Physics as an undergraduate at Columbia University. As an undergraduate my research was more on the environmental side: working as a researcher at the Water Center, a part of the Earth Institute at Columbia University, I developed methods using time-series satellite data to keep track of irrigated agriculture over northwestern India for the past decade.

I am passionate about physics, but have the desire to apply my skills in more terrestrial settings. That is why I decided to come to UC Riverside and work with Professor Baez on some potentially more practical cross-disciplinary problems. Before starting work on my PhD I spent a year surfing in Hawaii, where I also worked in experimental particle physics at the University of Hawaii at Manoa. My current interests (besides passing my classes) lie in exploring potential applications of the analogy between information and entropy, as well as in understanding parallels between statistical, stochastic, and quantum mechanics.

Glacial cycles are one essential feature of Earth’s climate dynamics on timescales of order 100 kiloyears (kyr). It is often accepted as common knowledge that these glacial cycles are in some way forced by variations in the Earth’s orbit. In particular, many have argued that the approximately 100 kyr period of glacial cycles corresponds to variations in the Earth’s eccentricity. As we saw in Professor Baez’s earlier posts, while the variation of eccentricity does affect the total insolation arriving at Earth, this variation is small. Thus many have proposed the existence of a nonlinear mechanism by which such small variations become amplified enough to drive the glacial cycles. Others have proposed that eccentricity is not primarily responsible for the 100 kyr period of the glacial cycles.

Here is a brief summary of some time series analysis I performed in order to better understand the relationship between the Earth’s Ice Ages and the Milankovich cycles.

I used publicly available data on the Earth’s orbital parameters computed by André Berger (see below for all references). This data includes an estimate of the insolation derived from these parameters, which is plotted below against the Earth’s temperature, as estimated using deuterium concentrations in an ice core from a site in the Antarctic called EPICA Dome C:

As you can see, it’s a complicated mess! However, I’m going to focus on the orbital parameters themselves, which behave more simply. Below you can see graphs of three important parameters:

• obliquity (tilt of the Earth’s axis),
• precession (direction the tilted axis is pointing),
• eccentricity (how much the Earth’s orbit deviates from being circular).


Richard Muller and Gordon MacDonald have argued that another astronomical parameter is important: the angle between the plane of the Earth’s orbit and the ‘invariant plane’ of the solar system. This invariant plane depends on the angular momenta of the planets, but roughly coincides with the plane of Jupiter’s orbit, from what I understand. Here is a plot of the orbital plane inclination for the past 800 kyr:

One can see from these plots, or from some spectral analysis, that the main periodicities of the orbital parameters are:

• Obliquity ~ 42 kyr
• Precession ~ 21 kyr
• Eccentricity ~100 kyr
• Orbital plane ~ 100 kyr

Of course the curves clearly are not simple sine waves with those frequencies. Fourier transforms give information regarding the relative power of different frequencies occurring in a time series, but they retain no information regarding the time dependence of these frequencies, since that dependence is integrated out in the transform.

The Gabor transform is a generalization of the Fourier transform, sometimes referred to as the ‘windowed’ Fourier transform. For the Fourier transform:

\displaystyle{ F(\omega) = \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t) e^{-i\omega t} \, dt}

one may think of e^{-i\omega t}, the ‘kernel function’, as the guy acting as your basis element in both spaces. For the Gabor transform, instead of e^{-i\omega t} one defines a family of functions,

g_{(b,\omega)}(t) = e^{i\omega(t-b)}g(t-b)

where g \in L^{2}(\mathbb{R}) is called the window function. Typical windows are square windows and triangular (Bartlett) windows, but the most common is the Gaussian:

\displaystyle{ g(t)= e^{-kt^2} }

which is used in the analysis below. The Gabor transform of a function f(t) is then given by

\displaystyle{ G_{f}(b,\omega) = \int_{-\infty}^\infty f(t) \overline{g(t-b)} e^{-i\omega(t-b)} \, dt }

Note that the output of a Gabor transform, like that of the Fourier transform, is a complex function. The modulus of this function indicates the strength of a particular frequency in the signal, while the phase carries information about the… well, phase.

For example the modulus of the Gabor transform of

\displaystyle{ f(t)=\sin\left(\dfrac{2\pi t}{100}\right) }

is shown below. For these I used the package Rwave, originally written in S by Rene Carmona and Bruno Torresani, with an R port by Brandon Whitcher.

You can see that the line centered at a frequency of .01 corresponds to the function’s period of 100 time units.

A Fourier transform would do okay for such a function, but consider now a sine wave whose frequency increases linearly. As you can see below, the Gabor transform of such a function shows the linear increase of frequency with time:

The window parameter in both of the above Gabor transforms is 100 time units. Adjusting this parameter affects the vertical blurriness of the Gabor transform. For example, here is the same plot as above, but with window parameters of 300, 200, 100, and 50 time units:

You can see that as you make the window smaller the line gets sharper, but only to a point. When the window becomes smaller than about one period of the signal, the line starts to blur again. This makes sense, because you can’t know the frequency of a signal precisely at a precise moment in time… just like you can’t precisely know both the momentum and position of a particle in quantum mechanics! The math is related, in fact.
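
If you'd like to play with this yourself: Blake used the Rwave package in R, but here is a rough Python sketch of the same idea, assuming numpy and scipy are available. A spectrogram with a Gaussian window is essentially the squared modulus of a discretized Gabor transform, and the chirp below is the 'sine wave whose frequency increases linearly' from the example above:

    import numpy as np
    from scipy.signal import chirp, spectrogram

    # A sine wave whose frequency rises linearly from 0.005 to 0.045 cycles/sample.
    t = np.arange(4000.0)
    x = chirp(t, f0=0.005, t1=4000.0, f1=0.045)

    # Gaussian window with standard deviation 100 samples: this plays the role
    # of the window parameter, trading frequency resolution against time resolution.
    f, tau, Sxx = spectrogram(x, fs=1.0, window=('gaussian', 100),
                              nperseg=512, noverlap=448)
    print(Sxx.shape)   # power at each (frequency, time) pair; plot with pcolormesh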

Now let’s look at the Earth’s temperature over the past 800 kyr, estimated from the EPICA ice core deuterium concentrations:

When you look at this, first you notice spikes occurring about every 100 kyr. You can also see that the last 5 of these spikes appear to be bigger and more dramatic than the ones occurring before 500 kyr ago. Roughly speaking, each of these spikes corresponds to rapid warming of the Earth, after which occurs slightly less rapid cooling, and then a slow decrease in temperature until the next spike occurs. These are the Earth’s glacial cycles.

At the bottom of the curve, where the temperature is about 4 °C cooler than the mean of this curve, glaciers are forming and extending down across the northern hemisphere. The relatively warm periods at the top of the spikes, about 10 °C hotter than the glacial periods, are called interglacials. You can see that we are currently in the middle of an interglacial, so the Earth is relatively warm compared to the rest of the glacial cycles.

Now we’ll take a look at the windowed Fourier transform, or the Gabor transform, of this data. The window size for these plots is 300 kyr.

Zooming in a bit, one can see a few interesting features in this plot:

We see one line at a frequency of about .024 which, given the sampling rate of one point per kyr, corresponds to a period of about 42 kyr, close to the period of obliquity. We also see a few things going on around a frequency of .01, corresponding to a 100 kyr period.

The band at .024 appears to be relatively horizontal, indicating an approximately constant frequency. Around the 100 kyr periods there is more going on. At a slightly higher frequency, about .015, there appears to be a band of slowly increasing frequency. Also, around .01 it’s hard to say what is really going on. It is possible that we see a combination of two frequency elements, one increasing, one decreasing, but almost symmetric. This may just be an artifact of the Gabor transform or the window and frequency parameters.

The window size for the plots below is slightly smaller, about 250 kyr. If we put the temperature and obliquity Gabor transforms side by side, we see this:

It’s clear the lines at .024 line up pretty well.

Doing the same with eccentricity:

Eccentricity does not line up well with temperature in this exercise, though both have bright bands above and below .01.

Now for temperature and orbital inclination:

One sees that the frequencies line up better for this than for eccentricity, but one has to keep in mind that there is a nonlinear transformation performed on the ‘raw’ orbital plane data to project this down into the ‘invariant plane’ of the solar system. While this is physically motivated, it surely nudges the spectrum.

The temperature data clearly has a component with a period of approximately 42 kyr, matching well with obliquity. If you tilt your head a bit you can also see an indication of a fainter response at a frequency a bit above .04, corresponding roughly to a period just below 25 kyr, close to that of precession.

As far as the 100 kyr period goes, which is the periodicity of the glacial cycles, this analysis confirms much of what is known: namely, that we can’t say for sure. Eccentricity seems to line up well with a periodicity of approximately 100 kyr, but on closer inspection there seem to be some discrepancies if you try to understand the glacial cycles as being forced by variations in eccentricity. The orbital plane inclination has a more similar Gabor transform modulus than does eccentricity.

A good next step would be to look at the relative phases of the orbital parameters versus the temperature, but that’s all for now.

If you have any questions or comments or suggestions, please let me know!

References

The orbital data used above is due to André Berger et al and can be obtained here:

Orbital variations and insolation database, NOAA/NCDC/WDC Paleoclimatology.

The temperature proxy is due to J. Jouzel et al, and it’s based on changes in deuterium concentrations from the EPICA Antarctic ice core dating back over 800 kyr. This data can be found here:

EPICA Dome C – 800 kyr deuterium data and temperature estimates, NOAA Paleoclimatology.

Here are the papers by Muller and MacDonald that I mentioned:

• Richard Muller and Gordon MacDonald, Glacial cycles and astronomical forcing, Science 277 (1997), 215–218.

• Richard Muller and Gordon MacDonald, Spectrum of 100-kyr glacial cycle: orbital inclination, not eccentricity, PNAS 94 (1997), 8329–8334.

They also have a book:

• Richard Muller and Gordon MacDonald, Ice Ages and Astronomical Causes, Springer, Berlin, 2002.

You can also get files of the data I used here:

Berger et al orbital parameter data, with explanatory text here.

Jouzel et al EPICA Dome C temperature data, with explanatory text here.

Muller and MacDonald’s orbital plane inclination data.


Game Theory (Part 8)

28 January, 2013

Last time we learned some rules for calculating probabilities. But we need a few more rules to get very far.

For example:

We say a coin is fair if it has probability 1/2 of landing heads up and probability 1/2 of landing tails up. What is the probability that if we flip two fair coins, both will land heads up?

Since each coin could land heads up or tails up, there are 4 events to consider here:

(H,H), (H,T),
(T,H), (T,T)

It seems plausible that each should be equally likely. If so, each has probability 1/4. So then the answer to our question would be 1/4.

But this is plausible only because we’re assuming that what one coin does doesn’t affect what the other one does! In other words, we’re assuming the two coin flips are ‘independent’.

If the coins were connected in some sneaky way, maybe each time one landed heads up, the other would land tails up. Then the answer to our question would be zero. Of course this seems silly. But it’s good to be very clear about this issue… because sometimes one event does affect another!

For example, suppose there’s a 5% probability of rain each day in the winter in Riverside. What’s the probability that it rains two days in a row? Remember that 5% is 0.05. So, you might guess the answer is

0.05 \times 0.05 = 0.0025

But this is wrong, because if it rains one day, that increases the probability that it will rain the next day. In other words, these events aren’t independent.

But if two events are independent, there’s an easy way to figure out the probability that they both happen: just multiply their probabilities! For example, if the chance that it will rain today in Riverside is 5% and the chance that it will rain tomorrow in Singapore is 60%, the chance that both these things will happen is

0.05 \times 0.6 = 0.03

or 3%, if these events are independent. I could try to persuade you that this is a good rule, and maybe I will… but for now let’s just state it in a general way.

Independence

So, let’s make a precise definition out of all this! Suppose we have two sets of events, X and Y. Remember that X \times Y, the Cartesian product of the sets X and Y, is the set of all ordered pairs (i,j) where i \in X and j \in Y:

X \times Y = \{ (i,j) : \; i \in X, j \in Y \}

So, an event in X \times Y consists of an event in X and an event in Y. For example, if

X = \{ \textrm{rain today}, \textrm{no rain today} \}

and

Y = \{ \textrm{rain tomorrow}, \textrm{no rain tomorrow} \}

then

X \times Y = \begin{array}{l} \{ \textrm{(rain today, rain tomorrow)}, \\ \textrm{(no rain today, rain tomorrow)}, \\   \textrm{(rain today, no rain tomorrow)}, \\ \textrm{(no rain today, no rain tomorrow)} \} \end{array}

Now we can define ‘independence’. It’s a rule for getting a probability distribution on X \times Y from probability distributions on X and Y:

Definition. Suppose p is a probability distribution on a set of events X, and q is a probability distribution on a set of events Y. If these events are independent, we use the probability distribution r on X \times Y given by

r_{(i,j)} = p_i q_j

People often call this probability distribution p \times q instead of r.
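
In code, this rule is just a nested loop. Here's a minimal Python sketch of forming p × q from two distributions stored as dictionaries:

    def product_distribution(p, q):
        """Joint distribution of independent events: r[(i, j)] = p[i] * q[j]."""
        return {(i, j): p[i] * q[j] for i in p for j in q}

    p = {'H': 0.5, 'T': 0.5}              # a fair coin
    print(product_distribution(p, p))      # each of the 4 pairs gets 0.25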

Examples

Example 1. Suppose we have a fair coin. This means we have a set of events

X = \{H, T \}

and a probability distribution p with

\displaystyle{ p_H = p_T = \frac{1}{2} }

Now suppose we flip it twice. We get a set of four events:

X \times X = \{(H,H), (H,T), (T,H), (T,T)\}

Suppose the two coin flips are independent. Then we describe the pair of coin flips using the probability measure r = p \times p on X \times X, with

\displaystyle{ r_{(H,H)} = p_H p_H = \frac{1}{4} }

\displaystyle{ r_{(H,T)} = p_H p_T = \frac{1}{4} }

\displaystyle{ r_{(T,H)} = p_T p_H = \frac{1}{4} }

\displaystyle{ r_{(T,T)} = p_T p_T = \frac{1}{4} }

So, each of the four events—“heads, heads” and so on—has probability 1/4. This is fairly boring: you should have known this already!

But now we can do a harder example:

Example 2. Suppose we have an unfair coin that has a 60% chance of landing heads up and a 40% chance of landing tails up. Now we have a new probability distribution on X, say q:

\displaystyle{ q_H = .6, \quad q_T = .4 }

Now say we flip this coin twice. What are the probabilities of the four different events that can happen? Let’s assume the two coin flips are independent. This means we should describe the pair of coin flips with a probability measure s = q \times q on X \times X. This tells us the answer to our question. We can work it out:

\displaystyle{ s_{(H,H)} = q_H q_H = 0.6 \times 0.6 = 0.36 }

\displaystyle{ s_{(H,T)} = q_H q_T = 0.6 \times 0.4 = 0.24 }

\displaystyle{ s_{(T,H)} = q_T q_H = 0.4 \times 0.6 = 0.24 }

\displaystyle{ s_{(T,T)} = q_T q_T = 0.4 \times 0.4 = 0.16 }

Puzzle 1. In this situation what is the probability that when we flip the coin twice it comes up heads exactly once?

Puzzle 2. In this situation what is the probability that when we flip the coin twice it comes up heads at least once?

For these puzzles you need to use what I told you in the section on ‘Probabilities of subsets’ near the end of Part 7.

Puzzle 3. Now suppose we have one fair coin and one coin that has a 60% chance of landing heads up. The first one is described by the probability distribution p, while the second is described by q. How likely is it that the first lands heads up and the second lands tails up? We can answer questions like this if the coin flips are independent. We do this by multiplying p and q to get a probability measure t = p \times q on X \times X. Remember the rule for how to do this:

t_{(i,j)} = p_i q_j

where each of i and j can be either H or T.

What are these probabilities:

\displaystyle{ t_{(H,H)} = ? }

\displaystyle{ t_{(H,T)} = ? }

\displaystyle{ t_{(T,H)} = ? }

\displaystyle{ t_{(T,T)} = ? }

Puzzle 4. In this situation what is the probability that exactly one coin lands heads up?

Puzzle 5. In this situation what is the probability that at least one coin lands heads up?

Next time we’ll go a lot further…


Game Theory (Part 7)

26 January, 2013

We need to learn a little probability theory to go further in our work on game theory.

We’ll start with some finite set X of ‘events’. The idea is that these are things that can happen—for example, choices you could make while playing a game. A ‘probability distribution’ on this set assigns to each event a number called a ‘probability’—which says, roughly speaking, how likely that event is. If we’ve got some event i, we’ll call its probability p_i.

For example, suppose we’re interested in whether it will rain today or not. Then we might look at a set of two events:

X = \{\textrm{rain}, \textrm{no rain} \}

If the weatherman says the chance of rain is 20%, then

p_{\textrm{rain} } = 0.2

since 20% is just a fancy way of saying 0.2. The chance of no rain will then be 80%, or 0.8, since the probabilities should add up to 1:

p_{\textrm{no rain}} = 0.8

Let’s make this precise with an official definition:

Definition. Given a finite set X of events, a probability distribution p assigns a real number p_i called a probability to each event i \in X, such that:

1) 0 \le p_i \le 1

and

2) \displaystyle{ \sum_{i \in X} p_i = 1}
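
For the computationally minded, here is this definition transcribed into a small Python checker—just a sketch, with a tolerance for floating-point rounding:

    def is_probability_distribution(p, tol=1e-9):
        """Check 1) each p_i lies in [0, 1] and 2) the p_i sum to 1."""
        return (all(0 <= x <= 1 for x in p.values())
                and abs(sum(p.values()) - 1) < tol)

    print(is_probability_distribution({'rain': 0.2, 'no rain': 0.8}))   # True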

Note that this official definition doesn’t say what an event really is, and it doesn’t say what probabilities really mean. But that’s how it should be! As usual with math definitions, the words in boldface could be replaced by any other words and the definition would still do its main job, which is to let us prove theorems involving these words. If we wanted, we could call an event a doohickey, and call a probability a schnoofus. All our theorems would still be true.

Of course we hope our theorems will be useful in real world applications. And in these applications, the probabilities p_i will be some way of measuring ‘how likely’ events are. But it’s actually quite hard to say precisely what probabilities really mean! People have been arguing about this for centuries. So it’s good that we separate this hard task from our definition above, which is quite simple and 100% precise.

Why is it hard to say what probabilities really are? Well, what does it mean to say “the probability of rain is 20%”? Suppose you see a weather report and read this. What does it mean?

A student suggests: “it means that if you looked at a lot of similar days, it would rain on 20% of them.”

Yes, that’s pretty good. But what counts as a “similar day”? How similar does it have to be? Does everyone have to wear the same clothes? No, that probably doesn’t matter, because it presumably doesn’t affect the weather. But what does affect the weather? A lot of things! Do all those things have to be exactly the same for it to count as a similar day?

And what counts as a “lot” of days? How many do we need?

And it won’t rain on exactly 20% of those days. How close do we need to get?

Imagine I have a coin and I claim it lands heads up 50% of the time. Say I flip it 10 times and it lands heads up every time. Does that mean I was wrong? Not necessarily. It’s possible that the coin will do this. It’s just not very probable.

But look: now we’re using the word ‘probable’, which is the word we’re trying to understand! It’s getting sort of circular: we’re saying a coin has a 50% probability of landing heads up if when you flip it a lot of times, it probably lands head up close to 50% of the time. That’s not very helpful if you don’t already have some idea what ‘probability’ means.

For all these reasons, and many more, it’s tricky to say exactly what probabilities really mean. People have made a lot of progress on this question, but we will sidestep it and focus on learning to calculate with probabilities.

If you want to dig in a bit deeper, try this:

Probability interpretations, Wikipedia.

Equally likely events

As I’ve tried to convince you, it can be hard to figure out the probabilities of events. But it’s easy if we assume all the events are equally likely.

Suppose we have a set X consisting of n events. And suppose that all the probabilities p_i are equal: say for some constant c we have

p_i = c

for all i \in X. Then by rule 2) above,

\displaystyle{ 1 = \sum_{i \in X} p_i = \sum_{i \in X} c = n c }

since we’re just adding the number c to itself n times. So,

\displaystyle{  c = \frac{1}{n} }

and thus

\displaystyle{ p_i = \frac{1}{n} }

for all i \in X.

I made this look harder than it really is. I was just trying to show you that it follows from the definitions, not any intuition. But it’s obvious: if you have n events that are equally likely, each one has probability 1/n.

Example 1. Suppose we have a coin that can land either heads up or tails up—let’s ignore the possibility that it lands on its edge! Then

X = \{ H, T\}

If we assume these two events are equally probable, we must have

\displaystyle{ p_H = p_T =  \frac{1}{2} }

Note I said “if we assume” these two events are equally probable. I didn’t say they actually are! Are they? Suppose we take a penny and flip it a zillion times. Will it land heads up almost exactly half a zillion times?

Probably not! The treasury isn’t interested in making pennies that do this. They’re interested in making the head look like Lincoln, and the tail look like the Lincoln Memorial:

Or at least they used to. Since the two sides are different, there’s no reason they should have the exact same probability of landing on top.

In fact nobody seems to have measured the difference between heads and tails in probabilities for flipping pennies. For hand-flipped pennies, it seems whatever side starts on top has roughly a 51% chance of landing on top! But if you spin a penny, it’s much more likely to land tails up:

The coin flip: a fundamentally unfair proposition?, Coding the Wheel.

Example 2. Suppose we have a standard deck of cards, well-shuffled, and assume that when I draw a card from this deck, each card is equally likely to be chosen. What is the probability that I draw the ace of spades?

If there’s no joker in the deck, there are 52 cards, so the answer is 1/52.

Let me remind you how a deck of cards works: I wouldn’t want someone to fail the course because they didn’t ever play cards! Here are the 52 cards in a standard deck.

As you can see, they come in 4 kinds, called suits. The suits are:

• clubs: ♣

• spades: ♠

• diamonds: ♦

• hearts: ♥

Two suits are black and two are red. Each suit has 13 cards in it, for a total of 4 × 13 = 52. The cards in each suit are numbered from 1 to 13, except that four of them are labeled with letters instead of numbers. They go like this:

A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K

A stands for ‘ace’, J for ‘jack’, Q for ‘queen’ and K for ‘king’.

Probabilities of subsets

If we know a probability distribution on a finite set X, we can define the probability that an event in some subset S \subseteq X will occur. We define this to be

\displaystyle{p(S) = \sum_{i \in S} p_i }

For example, I usually have one of three things for breakfast:

X = \{ \textrm{oatmeal}, \textrm{waffles}, \textrm{eggs} \}

I have an 86% chance of eating oatmeal for breakfast, a 10% chance of eating waffles, and a 4% chance of eating eggs. What’s the probability that I will eat oatmeal or waffles? These choices form the subset

S = \{ \textrm{oatmeal}, \textrm{waffles} \}

and the probability for this subset is

p(S) = p_{\textrm{oatmeal}} + p_{\textrm{waffles}} = 0.86 + 0.1 = 0.96
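
This rule is also one line of code. A Python sketch, using the breakfast example:

    def prob(p, S):
        """Probability of a subset S: the sum of p_i over i in S."""
        return sum(p[i] for i in S)

    p = {'oatmeal': 0.86, 'waffles': 0.10, 'eggs': 0.04}
    print(prob(p, {'oatmeal', 'waffles'}))   # 0.96, up to rounding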

Here’s an example from cards:

Example 3. Suppose we have a standard deck of cards, well-shuffled, and assume that when I draw a card from this deck, each card is equally likely to be chosen. What is the probability that I draw a card in the suit of hearts?

Since there are 13 cards in the suit of hearts, each with probability 1/52, we add up their probabilities and get

\displaystyle{ 13 \times \frac{1}{52} = \frac{1}{4} }

This should make sense, since there are 4 suits, each with the same number of cards.

Card tricks

This is just a fun digression. The deck of cards involves some weird numerology. For starters, it has 52 cards. That’s a strange number! Where else have you seen this number?

A student says: “It’s the number of weeks in a year.”

Right! And these 52 cards are grouped in 4 suits. What does the year have 4 of?

A student says: “Seasons!”

Right! And we have 52 = 4 × 13. So what are there 13 of?

A student says: “Weeks in a season!”

Right! I have no idea if this is a coincidence or not. And have you ever added up the values of all the cards in a suit, where we count the ace as 1, and so on? We get

1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13

And what’s that equal to?

After a long pause, a student says “91.”

Yes, that’s a really strange number. But let’s say we total up the values of all the cards in the deck, not just one suit. What do we get?

A student says “We get 4 × 91… or 364.”

Right. Three-hundred and sixty-four. Almost the number of days in year.

“So add one more: the joker! Then you get 365!”

Right, maybe that’s why they put an extra card called the joker in the deck:

One extra card for one extra day, joker-day… April Fool’s Day! That brings the total up to 365.

Again, I have no idea if this is a coincidence or not. But the people who invented the Tarot deck were pretty weird—they packed it with symbolism—so maybe the ordinary cards were designed this way on purpose too.

Puzzle. What are the prime factors of the number 91? You should know by now… and you should know what they have to do with the calendar!


Game Theory (Part 6)

25 January, 2013

We’ve been looking at games where each player gets a payoff depending on the choices that both players make. The payoff is a real number, which I often call the number of points. When we play these games in class, these points go toward your grade: 10% of your grade depends on the total number of points you earn in quizzes and games. But what do these points mean in other games, like Prisoner’s Dilemma or Battle of the Sexes?

This leads us into some very interesting and deep questions. Let’s take a very quick look at them, without getting very deep.

Maximizing the payoff

The main thing is this. When we’re studying games, we’ll assume each player’s goal is to earn as many points as possible. In other words, they are trying to maximize their payoff.

They are not, for example, trying to make their payoff bigger than the other player’s payoff. Indeed, in class you should not be trying to earn more points than me! One student said he was trying to do that. That’s a mistake. You should be happier if

• you get 10 points and I get 20

than if

• you get -10 points and I get -20.

After all, it’s only your total number of points that affects your grade, not whether it’s bigger than mine.

So, you should always try to maximize your payoff. And I promise to do the same thing: I’ll always try to maximize my payoff. You have to take my word on this, since my salary is not affected by my payoff! But I want to make your task very clear: you are trying to maximize your payoff, and you can assume I am trying to maximize mine.

(If I were doing something else, like sadistically trying to minimize your payoff, that would affect your decisions!)

Rational agents and utility

We can’t understand how people actually play games unless we know what they are trying to do. In real life, people’s motives are very complicated and sometimes mysterious. But in mathematical game theory, we start by studying something simpler: rational agents. Roughly speaking, a rational agent is defined to be a person or animal or computer program or something that is doing the best possible job of maximizing some quantity, given the information they have.

This is a rough definition, which we will try to improve later.

You shouldn’t be fooled by the positive connotations of the word ‘rational’. We’re using it in a very specific technical way here. A madman in a movie theater who is trying to kill as many people as possible counts as ‘rational’ by our definition if they maximize the number of people killed, given the information they have.

The whole question of what really should count as ‘rationality’ is a very deep one. People have a lot of interesting ideas about it:

Rationality, Wikipedia.

Utility

So: we say a rational agent does the best possible job of maximizing their payoff given the information they have. But in economics, this payoff is often called utility.

That’s an odd word, but it comes from a moral philosophy called utilitarianism, which says—very roughly—that the goal of life is to maximize happiness. Perhaps because it’s a bit embarrassing to talk about maximizing happiness, these philosophers called it ‘utility’.

But be careful: while the moral philosophers often talk about agents trying to maximize the total utility of everyone, economists focus on rational agents trying to maximize their own utility.

This sounds very selfish. But it’s not necessarily. If you want other people to be happy, your utility depends on their utility. If you were a complete altruist, perhaps maximizing your utility would even be the same as maximizing the total utility of everyone!

Again, there are many deep problems here, which I won’t discuss. I’ll just mention one: in practice, it’s very hard to define utility in a way that’s precise enough to measure, much less add up! See here for a bit more:

Utility, Wikipedia.

Utilitarianism, Wikipedia.

The assumption of mutual rationality

Game theory is simplest when

all players are rational agents,

and

each player knows all the other players are rational agents.

Of course, in the real world nobody is rational all the time, so things get much more complicated. If you’re playing against an irrational agent, you have to work harder to guess what they are going to do!

But in the games we play in class, I will try to be a rational agent: I will try my best to maximize my payoff. And you too should try to be a rational agent, and maximize your payoff—since that will help your grade. And you can assume I am a rational agent. And I will assume you are a rational agent.

So: I know that if I keep making the same choice, you will make the choice that maximizes your payoff given what I do.

And: you know that if you keep making the same choice, I will make the choice that maximizes my payoff given what you do.

Given this, we should both seek a Nash equilibrium. I won’t try to state this precisely and prove it as a theorem… but I hope it’s believable. You can see some theorems about this here:

• Robert Aumann and Adam Brandenburger, Epistemic conditions for Nash equilibrium.

Probabilities

All this is fine if a Nash equilibrium exists and is unique. But we’ve seen that in some games, a Nash equilibrium doesn’t exist—at least not if we only consider pure strategies, where each player makes the same choice every time. And in other games, the Nash equilibrium exists but there is more than one.

In games like this, saying that players will try to find a Nash equilibrium doesn’t settle all our questions! What should they do if there’s none, or more than one?

We’ve seen one example: rock-paper-scissors. If we only consider pure strategies, this game has no Nash equilibrium. But I’ve already suggested the solution to this problem. The players should use mixed strategies, where they randomly make different choices with different probabilities.

So, to make progress, we’ll need to learn a bit of probability theory! That’ll be our next topic.


Anasazi America (Part 2)

24 January, 2013

Last time I told you a story of the American Southwest, starting with the arrival of small bands of hunters around 10,000 BC. I focused on the Anasazi, or ‘ancient Pueblo people’, and I led up to the Late Basketmaker III Era, from 500 to 750 AD.

The big invention during this time was the bow and arrow. Before then, large animals were killed by darts thrown from slings, which required a lot more skill and luck. But even more important was the continuing growth of agriculture: the cultivation of corn, beans and squash. This fueled a period of dramatic population growth.

But this was just the start!

The Pueblo I and II Eras

The Pueblo I Era began around 750 AD. At this time people started living in ‘pueblos’: houses with flat roofs held up by wooden poles. Towns became bigger, holding up to 600 people. But these towns typically lasted only 30 years or so. It seems people needed to move when conditions changed.

Starting around 800 AD, the ancient Pueblo people started building ‘great houses’: multi-storied buildings with high ceilings, rooms much larger than those in domestic dwellings, and elaborate subterranean rooms called ‘kivas’. And around 900 AD, people started building houses with stone roofs. We call this the start of the Pueblo II Era.

The center of these developments was the Chaco Canyon area in New Mexico:

Chaco Canyon is 125 kilometers east of Canyon de Chelly.
Unfortunately, I didn’t see it on my trip—I wanted to, but we didn’t have time.

By 950 AD, there were pueblos on every ridge and hilltop of the Chaco Canyon area. Due to the high population density and unpredictable rainfall, this area could no longer provide enough meat to sustain the needs of the local population. Apparently they couldn’t get enough fat, salt and minerals from a purely vegan diet—a shortcoming we have now overcome!

Yet the population continued to grow until 1000 AD. In his book Anasazi America, David Stuart wrote:

Millions of us buy mutual funds, believing the risk is spread among millions of investors and a large “basket” of fund stocks. Millions divert a portion of each hard-earned paycheck to purchase such funds for retirement. “Get in! Get in!” hawk the TV ads. “The market is going up. Historically, it always goes up in the long haul. The average rate of return this century is 9 percent per year!” Every one of us who does that is a Californian at heart, believing in growth, risk, power. It works—until an episode of too-rapid expansion in the market, combined with brutal business competition, threatens to undo it.

That is about what it was like, economically, at Chaco Canyon in the year 1000—rapid agricultural expansion, no more land to be gotten, and deepening competition. Don’t think of it as “romantic” or “primitive”. Think of it as just like 1999 in the United States, when the Dow Jones Industrial Average hit 11,000 and 30 million investors held their breath to see what would happen next.

The Chaco phenomenon

In 1020 the rainfall became more predictable. There wasn’t more rain, it was simply less erratic. This was good for the ancient Pueblo people. At this point the ‘Chaco phenomenon’ began: an amazing flowering of civilization.

We see this in places like Pueblo Bonito, the largest great house in Chaco Canyon:

Pueblo Bonito was founded in the 800s. But starting in 1020 it grew immensely, and it kept growing until 1120. By this time it had 700 rooms, nearly half devoted to grain storage. It also had 33 kivas, which are the round structures you see here.

But Pueblo Bonito is just one of a dozen great houses built in Chaco Canyon by 1120. About 215 thousand ponderosa pine trees were cut down in this building spree! Stuart estimates that building these houses took over 2 million man-hours of work. They also built about 650 kilometers of roads! Most of these connect one great house to another… but some mysteriously seem to go to ‘nowhere’.

By 1080, however, the summer rainfall had started to decline. And by 1090 there were serious summer droughts lasting for five years. We know this sort of thing from tree rings: there are enough ponderosa logs and the like that archaeologists have built up a detailed year-by-year record.

Thanks to overpopulation and these droughts, Chaco Canyon civilization was in serious trouble at this point, but it charged ahead:

Parts of Chacoan society were already in deep trouble after AD 1050 as health and living conditions progressively eroded in the southern districts’ open farming communities. The small farmers in the south had first created reliable surpluses to be stored in the great houses. Ultimately, it was the increasingly terrible conditions of those farmers, the people who grew the corn, that had made Chacoan society so fatally vulnerable. They simply got back too little from their efforts to carry on.

[….]

Still, the great-house dwellers didn’t merely sit on their hands. As some farms failed, they used farm labor to expand roads, rituals, and great houses. This prehistoric version of a Keynesian growth model apparently alleviated enough of the stresses and strains to sustain growth through the 1070s. Then came the waning rainfall of the 1080s, followed by drought in the 1090s.

Circumstances in farming communities worsened quickly and dramatically with this drought; the very survival of many was at stake. The great-house elites at Chaco Canyon apparently responded with even more roads, rituals, and great houses. This was actually a period of great-house and road infrastructure “in-fill”, both in and near established open communities. In a few years, the rains returned. This could not help but powerfully reinforce the elites’ now well-established, formulaic response to problems.

But roads, rituals, and great houses simply did not do enough for the hungry farmers who produced corn and pottery. As the eleventh century drew to a close, even though the rains had come again, they walked away, further eroding the surpluses that had fueled the system. Imagine it: the elites must have believed the situation was saved, even as more farmers gave up in despair. Inexplicably, they never “exported” the modest irrigation system that had caught and diverted midsummer runoff from the mesa tops at Chaco Canyon and made local fields more productive. Instead, once again the elites responded with the sacred formula—more roads, more rituals, more great houses.

So, Stuart argues that the last of the Chaco Canyon building projects were “the desperate economic reactions of a fragile and frightened society”.

Regardless of whether this is true, we know that starting around 1100 AD, many of the ancient Pueblo people left the Chaco Canyon area. Many moved upland, to places with more rain and snow. Instead of great houses, many returned to building the simpler pit houses of old.

Tribes descending from the ancient Pueblo people still have myths about the decline of the Chaco civilization. While such tales should be taken with a huge grain of salt, these are too fascinating not to repeat. Here are two quotes:

In our history we talk of things that occurred a long time ago, of people who had enormous amounts of power, spiritual power and power over people. I think that those kinds of people lived here in Chaco…. Here at Chaco there were very powerful people who had a lot of spiritual power, and these people probably used their power in ways that caused things to change, and that may have been one of the reasons why the migrations were set to start again, because these people were causing changes that were never meant to occur.

My response to the canyon was that some sensibility other than my Pueblo ancestors had worked on the Chaco great houses. There were the familiar elements such as the nansipu (the symbolic opening into the underworld), kivas, plazas and earth materials, but they were overlain by a strictness and precision of design that was unfamiliar…. It was clear that the purpose of these great villages was not to restate their oneness with the earth but to show the power and specialness of humans… a desire to control human and natural resources… These were men who embraced a social-political-religious hierarchy and envisioned control and power over places, resources and people.

These quotes are from an excellent book on the changing techniques and theories of archaeologists of the American Southwest:

• Stephen H. Lekson, A History of the Ancient Southwest, School for Advanced Research, Santa Fe, New Mexico, 2008.

What these quotes show, I think, is that the sensibility of current-day Pueblo people is very different from that of the people who built the great houses of Chaco Canyon. According to David Stuart, the Chaco civilization was a ‘powerful’ culture, while their descendants became an ‘efficient’ culture:

… a powerful society (or organism) captures more energy and expends (metabolizes) it more rapidly than an efficient one. Such societies tend to be structurally more complex, more wasteful of energy, more competitive, and faster paced than an efficient one. Think of modern urban America as powerful, and you will get the picture. In contrast, an efficient society “metabolizes” its energy more slowly, and so it is structurally less complex, less wasteful, less competitive, and slower. Think of Amish farmers in Pennsylvania or contemporary Pueblo farms in the American Southwest.

In competitive terms, the powerful society has an enormous short-term advantage over the efficient one if enough energy is naturally available to “feed” it, or if its technology and trade can bring in energy rapidly enough to sustain it. But when energy (food, fuel and resources) becomes scarce, or when trade and technology fail, an efficient society is advantageous because its simpler, less wasteful structure is more easily sustained in times of scarcity.

The Pueblo III Era, and collapse

 

By 1150 AD, some of the ancient Pueblo people began building cliff dwellings at higher elevations—like Mesa Verde in Colorado, shown above. This marks the start of the Pueblo III Era. But this era lasted a short time. By 1280, Mesa Verde was deserted!

Some of the ruins in Canyon de Chelly also date to the Pueblo III Era. For example, the White House Ruins were built around 1200. Here are some of my pictures of this marvelous place:

But again, they were deserted by the end of the Pueblo III Era.

Why did the ancient Pueblo people move to cliff dwellings? And why did they move out so soon?

Nobody is sure. Cliff dwellings are easy to defend against attack. Built into the south face of a cliff, they catch the sun in winter to stay warm—it gets cold here in winter!—but they stay cool when the sun is straight overhead in summer. These are good reasons to build cliff dwellings. But these reasons don’t explain why cliff dwellings were so popular from 1150 to 1280, and then were abandoned!

One important factor seems to be this: there was a series of severe droughts starting around 1275. There were also raids from other tribes: speakers of Na-Dené languages, who eventually became the current-day Navajo inhabitants of this area.

But drought alone may be unable to explain what happened. There have been some fascinating attempts to model the collapse of the Anasazi culture. One is called the Artificial Anasazi Project. It used ‘agent-based modeling’ to study what the ancient Pueblo people did in Long House Valley, Arizona, from 200 to 1300. The Villages Project, a collaboration of Washington State University and the Crow Canyon Archaeological Center, focused on the region near Mesa Verde.

Quoting Stephen Lekson’s book:

Both projects mirrored actual settlement patterns from 800 to 1250 with admirable accuracy. Problems rose, however, with the abandonments of the regions, in both cases after 1250. There were unexplained exceptions, misfits between the models and reality.

Those misfits were not minor. Neither model predicted complete abandonment. Yet it happened. That’s perplexing. In the Scientific American summary of the Long House Valley model, Kohler, Gumerman, and Reynolds write, “We can only conclude that sociopolitical, ideological or environmental factors not included in our model must have contributed to the total depopulation of the valley.” Similar conundrums beset the Villages Project: “None of our simulations terminated with a population decline as dramatic as what actually happened in the Mesa Verde region in the late 1200s.”

These simulation projects look interesting! Of course they leave out many factors, but that’s okay: it suggests that one of those factors could be important in understanding the collapse.

For more info, click on the links. Also try this short review by the author of a famous book on why civilizations collapse:

• Jared Diamond, Life with the artificial Anasazi, Nature 419 (2002), 567–569.

From this article, here are the simulated versus ‘actual’ populations of the ancient Pueblo people in Long House Valley, Arizona, from 800 to 1350 AD:


The so-called ‘actual’ population is estimated using the number of house sites that were active at a given time, assuming five people per house.

This graph gives a shocking and dramatic ending to our tale! Let’s hope our current-day tale doesn’t end so abruptly, because in abrupt transitions much gets lost. But of course the ancient Pueblo people didn’t disappear. They didn’t all die. They became an ‘efficient’ society: they learned to make do with diminished resources.

