## An Entropy Challenge

29 August, 2012

If you like computer calculations, here’s a little challenge for you. Oscar Dahlsten may have solved it, but we’d love for you to check his work. It’s pretty important for the foundations of thermodynamics, but you don’t need to know any physics or even anything beyond a little algebra to tackle it! First I’ll explain it in really simple terms, then I’ll remind you a bit of why it matters.

We’re looking for two lists of nonnegative numbers, of the same length, listed in decreasing order:

$p_1 \ge p_2 \ge \cdots \ge p_n \ge 0$

$q_1 \ge q_2 \ge \cdots \ge q_n \ge 0$

that sum to 1:

$p_1 + \cdots + p_n = 1$

$q_1 + \cdots + q_n = 1$

and that obey this inequality:

$\displaystyle{ \frac{1}{1 - \beta} \ln \sum_{i=1}^n p_i^\beta \le \frac{1}{1 - \beta} \ln \sum_{i=1}^n q_i^\beta }$

for all $0 < \beta < \infty$ (ignoring $\beta = 1$), yet do not obey these inequalities:

$p_1 + \cdots + p_k \ge q_1 + \cdots + q_k$

for all $1 \le k \le n.$

Oscar’s proposed solution is this:

$p = (0.4, 0.29, 0.29, 0.02)$

$q = (0.39, 0.31, 0.2, 0.1)$

Can you see if this works? Is there a simpler example, like one with lists of just 3 numbers?
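Here is one way to start checking, as a minimal Python sketch. The helper names `renyi_entropy` and `majorizes`, the tolerance, and the grid of $\beta$ values are illustrative choices, not part of Oscar's proposal. A finite grid can only give evidence, never a proof for all $\beta$.

```python
import math

def renyi_entropy(p, beta):
    """Renyi entropy of the list p for beta > 0 (beta == 1 gives ordinary entropy)."""
    if beta == 1:
        return -sum(x * math.log(x) for x in p if x > 0)
    return math.log(sum(x ** beta for x in p if x > 0)) / (1 - beta)

def majorizes(p, q, tol=1e-12):
    """True if every partial sum of sorted p is >= the matching partial sum of sorted q."""
    sum_p = sum_q = 0.0
    for a, b in zip(sorted(p, reverse=True), sorted(q, reverse=True)):
        sum_p += a
        sum_q += b
        if sum_p < sum_q - tol:
            return False
    return True

p = (0.4, 0.29, 0.29, 0.02)
q = (0.39, 0.31, 0.2, 0.1)

# Hypothetical test grid: beta = 0.01, 0.02, ..., 20.00, skipping beta = 1.
betas = [k / 100 for k in range(1, 2001) if k != 100]
entropy_ok = all(renyi_entropy(p, b) <= renyi_entropy(q, b) for b in betas)

print("Renyi inequality holds on the grid:", entropy_ok)
print("p majorizes q:", majorizes(p, q))
print("q majorizes p:", majorizes(q, p))
```

The same two functions work for lists of any length, so they can also be used to hunt for a 3-number example.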

This question came up near the end of my post More Second Laws of Thermodynamics. I phrased the question with a bit more jargon, and said a lot more about its significance. Suppose we have two probability distributions on a finite set, say $p$ and $q.$ We say $p$ majorizes $q$ if

$p_1 + \cdots + p_k \ge q_1 + \cdots + q_k$

for all $1 \le k \le n,$ when we write both lists of numbers in decreasing order. This means $p$ is ‘less flat’ than $q$, so it should have less entropy. And indeed it does: not just for ordinary entropy, but also for Rényi entropy! The Rényi entropy of $p$ is defined by

$\displaystyle{ H_\beta(p) = \frac{1}{1 - \beta} \ln \sum_{i=1}^n p_i^\beta }$

where $0 < \beta < 1$ or $1 < \beta < \infty$. We can also define Rényi entropy for $\beta = 0, 1, \infty$ by taking a limit, and at $\beta = 1$ we get the ordinary entropy

$\displaystyle{ H_1(p) = - \sum_{i = 1}^n p_i \ln (p_i) }$

The question is whether majorization is more powerful than Rényi entropy as a tool to tell when one probability distribution is less flat than another. I know that if $p$ majorizes $q,$ its Rényi entropy is less than or equal to that of $q$ for all $0 \le \beta \le \infty.$ Your mission, should you choose to accept it, is to show the converse is not true.

## Tidbits of Geometry

15 March, 2012

Since I grew up reading Martin Gardner, I’ve often imagined it would be fun to write about math and physics in a way that nonexperts might enjoy. Right now I’m trying my hand at this on Google+. You can read that stuff here.

Google+ encourages brevity—not as much as Twitter, but more than a blog. So I’m posting things that feature a single catchy image, a brief explanation, and URL’s to visit for more details.

Lately I’ve been talking about geometry. I realized that these posts could be cobbled together into a kind of loose ‘story’, so here it is. I couldn’t resist expanding the posts a bit, but the only really new stuff is more about Leonardo Da Vinci and the golden ratio, and five puzzles—only one of which I know the answer to!

### The golden ratio

Sure, the golden ratio, Φ = (√5+1)/2, is cool… but if you think ancient Greeks ran around in togas talking about the “golden ratio” and writing it as “Φ”, you’re wrong. This number was named Φ after the Greek sculptor Phidias only in 1914, in a book called The Curves of Life by the artist Theodore Cook. And it was Cook who first started calling 1.618… the golden ratio. Before him, 1/Φ = 0.618… was called the golden ratio! Cook dubbed this number “φ”, the lower-case baby brother of Φ.

In fact, the whole “golden” terminology can only be traced back to 1826, when it showed up in a footnote to a book by one Martin Ohm, brother of Georg Ohm—the guy with the law about resistors. Before then, a lot of people called 1/Φ the “Divine Proportion”. And the guy who started that was Luca Pacioli, a pal of Leonardo da Vinci who translated Euclid’s Elements. In 1509, Pacioli published a 3-volume text entitled De Divina Proportione, advertising the virtues of this number.

Greek texts seem remarkably quiet about this number. The first recorded hint of it is Proposition 11 in Book II of Euclid’s Elements. It also shows up elsewhere in Euclid, especially Proposition 30 of Book VI, where the task is “to cut a given finite straight line in extreme and mean ratio”, meaning a ratio A:B such that A is to B as B is to A+B. This is later used in Proposition 17 of Book XIII to construct the pentagonal face of a regular dodecahedron.

The regular pentagon, and the pentagram inside it, is deeply connected to the golden ratio. If you look carefully, you’ll see no fewer than twenty long skinny isosceles triangles, in three different sizes but all the same shape!

They’re all ‘golden triangles’: the short side is φ times the length of the long sides.

And the picture here lets us see that φ is to 1 as 1 is to 1+φ. A little algebra then gives

$\varphi^2 + \varphi = 1$

which you can solve to get

$\varphi = \displaystyle{\frac{\sqrt{5}-1}{2}}$

and thus

$\Phi = \varphi + 1 = \displaystyle{\frac{\sqrt{5}+1}{2}}$
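If you don't trust the algebra, a few lines of Python will check it numerically. This is just a floating-point sanity check of the formulas above; the variable names are arbitrary.

```python
import math

phi = (math.sqrt(5) - 1) / 2   # the lower-case golden ratio, 0.618...
Phi = phi + 1                  # the upper-case one, 1.618...

print(phi ** 2 + phi)          # should be 1, up to rounding
print(Phi * phi)               # Phi and phi are reciprocals, so also 1
print(Phi - (math.sqrt(5) + 1) / 2)   # should be 0, up to rounding
```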

For more, see:

• John Baez, Tales of the Dodecahedron.

### Da Vinci and the golden ratio

Did Leonardo da Vinci use the golden ratio in his art? It would be cool if he did. Unfortunately, attempts to prove it by drawing rectangles on his sketches and paintings are unconvincing. Here are three attempts you can see on the web; click for details if you want:

The first two make me less inclined to believe Da Vinci was using the golden ratio, not more. The last one, the so-called Vitruvian Man, looks the most convincing, but only if you take on faith that the ratio being depicted is really the golden ratio!

Puzzle 1. Carefully measure the ratio here and tell us what you get, with error bars on your result.

It would be infinitely more convincing if Da Vinci had written about the golden ratio in his famous notebooks. But I don’t think he did. If he didn’t, that actually weighs against the whole notion.

Indeed, I thought the whole notion was completely hopeless until I discovered that Da Vinci did the woodcuttings for Pacioli’s book De Divina Proportione. And even lived with Pacioli while this project was going on! So, we can safely assume Da Vinci knew what was in this book.

It consists of 3 separate volumes. First a volume about the golden ratio, polygons, and perspective. Then one about the ideas of Vitruvius on math in architecture. (Apparently Vitruvius did not discuss the golden ratio.) Then one that’s mainly an Italian translation of Piero della Francesca’s Latin writings on polyhedra.

De Divina Proportione was popular in its day, but only two copies of the original edition survive. Luckily, it’s been scanned in!

• Luca Pacioli, De Divina Proportione.

The only picture I see that might be about using the golden ratio to draw the human figure is this:

The rectangles don’t look very ‘golden’! But the really important thing is to read the text around this picture, or for that matter the whole book. Unfortunately my Renaissance Italian is… ahem… a bit rusty. The text has been translated into German but apparently not English.

The picture above is on page 70 of the scanned-in file. Of course some scholar should have written a paper about this already… I just haven’t gotten around to searching the literature.

By the way, here’s something annoying. This picture on the Wikipedia article about De Divina Proportione purports to come from that book:

Again most of the rectangles don’t look very golden, even though it says “Divina Proportio” right on top. But here’s the big problem: I can’t find it in the online version of the book! Luca Luve, who spotted the online version for me in the first place, concurs.

Puzzle 3. Where is it really from?

### Luca Pacioli

Luca Pacioli had many talents: besides books on art, geometry and mathematics, he also wrote the first textbook on double-entry bookkeeping! This portrait of him multitasking gives some clue as to how he accomplished so much. He seems to be somberly staring at a hollow glass cuboctahedron half-filled with water while simultaneously drawing something completely different and reading a book:

Note the compass and the regular dodecahedron. The identity of the other figure in the painting is uncertain, and so is that of the painter, though people tend to say it’s Jacopo de’ Barbari.

### Piero della Francesca

This creepy painting shows three people calmly discussing something while Jesus is getting whipped in the background. It’s one of the first paintings to use mathematically defined rules of perspective, and it’s by Piero della Francesca, the guy whose pictures of polyhedra fill the third part of Pacioli’s De Divina Proportione.

Piero della Francesca seems like an interesting guy: a major artist who actually quit painting in the 1470s to focus on the mathematics of perspective and polyhedra. If you want to know how to draw a perfect regular pentagon in perspective using straightedge and compass, he’s your guy.

### Constructing the pentagon

I won’t tell you how to do it in perspective, but here’s how to construct a regular pentagon with straightedge and compass:

Just pay attention to how it starts. Say the radius of the circle is 1. We bisect it and get a segment of length 1/2, then consider a segment at right angles of length 1. But

$(1/2)^2 + 1^2 = 5/4$

so these are the sides of a right triangle whose hypotenuse has length √5/2, by the Pythagorean theorem!

Yes, I know I didn’t explain the whole construction… just the start. But the golden ratio is √5/2 + 1/2, so we’re clearly on the right track. If you’re ever stuck on a desert island with nothing to do but lots of sand and some branches, you can figure out the rest yourself.

Or if you’ve got the internet on your desert island, read this:

• Pentagon, Wikipedia.

But here’s the easy way to make a regular pentagon: just tie a simple overhand knot in a strip of paper!

### The pentagon-decagon-hexagon identity

The most bizarre fact in Euclid’s Elements is Proposition XIII.10. Take a circle and inscribe a regular pentagon, a regular hexagon, and a regular decagon. Take the edges of these shapes, and use them as the sides of a triangle. Then this is a right triangle!
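It's easy to test this numerically. A regular $n$-gon inscribed in a circle of radius $r$ has side length $2 r \sin(\pi/n)$, so a short Python sketch (the function name `side` is just for illustration) can check the Pythagorean relation directly:

```python
import math

def side(n, radius=1.0):
    """Side length of a regular n-gon inscribed in a circle of the given radius."""
    return 2 * radius * math.sin(math.pi / n)

pentagon, hexagon, decagon = side(5), side(6), side(10)

# If the triangle is right-angled, the pentagon's side must be the hypotenuse.
print(pentagon ** 2)
print(hexagon ** 2 + decagon ** 2)
```

Of course this only confirms the fact to floating-point accuracy; it says nothing about why it is true.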

How did anyone notice this??? It’s long been suspected that this fact first came up in studying the icosahedron. But nobody gave a proof using the icosahedron until I posed this as a challenge and Greg Egan took it up. The hard part is showing that the two right triangles here are congruent:

Then AB is the side of the pentagon, BC is the side of the decagon and AC’ is the radius of the circle itself, which is the side of the hexagon!

For details, see:

• John Baez, This Week’s Finds in Mathematical Physics (Week 283).

### The octahedron and icosahedron

Platonic solids are cool. A regular octahedron has 12 edges. A regular icosahedron has 12 vertices. Irrelevant coincidence? No! If you cleverly put a dot on each edge of the regular octahedron, you get the vertices of a regular icosahedron! But it doesn’t work if you put the dot right in the middle of the edge—you have to subdivide the edge in the exactly correct ratio. Which ratio? The golden ratio!

This picture comes from R. W. Gray.

According to Coxeter’s marvelous book Regular Polytopes, this fact goes back at least to an 1873 paper by a fellow named Schönemann.
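Here is a rough numerical check of the octahedron claim, sketched in Python. It assumes the octahedron has vertices at $(\pm 1, 0, 0)$, $(0, \pm 1, 0)$, $(0, 0, \pm 1)$, and it makes one particular 'clever' choice of which end of each edge gets the short piece, namely the cyclic pattern in the code. The test is that each of the 12 dots has exactly 5 nearest neighbors, all at the same distance, as the vertices of a regular icosahedron must.

```python
import itertools
import math

PHI = (1 + math.sqrt(5)) / 2

def dots_on_octahedron():
    """One dot on each of the 12 edges, cutting the edge in the ratio 1 : PHI."""
    a = 1 / (1 + PHI)      # short piece
    b = PHI / (1 + PHI)    # long piece, so a + b = 1 and b / a = PHI
    points = []
    for sign_a, sign_b in itertools.product((1, -1), repeat=2):
        x, y = sign_a * a, sign_b * b
        # Cyclic permutations of (0, x, y) keep the choice of ends consistent.
        points += [(0.0, x, y), (y, 0.0, x), (x, y, 0.0)]
    return points

def nearest_neighbors(points, tol=1e-9):
    """For each point, return (shortest distance, number of points at that distance)."""
    result = []
    for p in points:
        dists = [math.dist(p, q) for q in points if q is not p]
        shortest = min(dists)
        result.append((shortest, sum(1 for d in dists if abs(d - shortest) < tol)))
    return result

dots = dots_on_octahedron()
for shortest, count in nearest_neighbors(dots):
    print(round(shortest, 6), count)
```

Changing `a` and `b` to other values adding up to 1 is a quick way to experiment with other ratios.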

Puzzle 4. What do you get if you put each dot precisely in the center of the edge?

### The heptagon

The golden ratio Φ is great, but maybe it’s time to move on? The regular pentagon’s diagonal is Φ times its edge, and a little geometry shows the ratio of 1 to Φ equals the ratio of Φ to Φ+1. What about the regular heptagon? Here we get two numbers, ρ and σ, which satisfy four equations, written as ratios below! So, for example, the ratio of 1 to ρ equals the ratio of ρ to 1+σ, and so on.

For more see:

• Peter Steinbach, Golden fields: a case for the heptagon, Mathematics Magazine 70 (Feb., 1997), 22-31.

He works out the theory for every regular polygon. So, it’s not that the fun stops after the pentagon: it just gets more sophisticated!
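To play with these numbers, here is a small Python sketch. It assumes that ρ and σ are the lengths of the shorter and longer diagonals of a regular heptagon with edge 1, which is my reading of Steinbach's setup rather than something stated above. It checks only the one relation quoted in the text: 1 is to ρ as ρ is to 1+σ.

```python
import math

def diagonal_ratio(n, k):
    """Chord spanning k edges of a regular n-gon, divided by the edge length."""
    return math.sin(k * math.pi / n) / math.sin(math.pi / n)

Phi = diagonal_ratio(5, 2)      # pentagon: diagonal over edge
rho = diagonal_ratio(7, 2)      # heptagon: shorter diagonal over edge (assumed)
sigma = diagonal_ratio(7, 3)    # heptagon: longer diagonal over edge (assumed)

print(Phi ** 2, Phi + 1)        # 1 : Phi = Phi : Phi + 1
print(rho ** 2, 1 + sigma)      # 1 : rho = rho : 1 + sigma
```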

### Constructing the heptagon

You can’t use a straightedge and compass to construct a regular heptagon. But here’s a construction that seems to do just that!

If you watch carefully, the seeming paradox is explained. For more, see:

• Heptagon, Wikipedia.

### Trisecting the angle

When I was a kid, my uncle wowed me by trisecting an angle. He wasn’t a crackpot: he was bending the usual rules! He marked two dots on the ruler, A and B below, whose distance equaled the radius of the circle, namely OB. Then the trick below makes φ one third of θ.

Drawing dots on your ruler is called neusis, and the ancient Greeks knew about it. You can also use it to double the cube and construct a regular heptagon, both impossible with a compass and straightedge if you don’t draw dots on it. Oddly, it fell out of fashion. Maybe purity of method mattered more than solving lots of problems?

Nowadays we realize that if you only have a straightedge, you can only solve linear equations. Adding a compass to your toolkit lets you also take square roots, so you can solve quadratic equations. Adding neusis on top of that lets you take cube roots, which—together with the rest—lets you solve cubic equations. A fourth root is a square root of a square root, so you get those for free, and in fact you can even solve all quartic equations. But you can’t take fifth roots.

Puzzle 5. Did anyone ever build a mechanical gadget that lets you take fifth roots, or maybe even solve general quintics?

## Archimedean Tilings and Egyptian Fractions

5 February, 2012

Ever since I was a kid, I’ve loved Archimedean tilings of the plane: that is, tilings by regular polygons where all the edge lengths are the same and every vertex looks alike. Here’s my favorite:

There are also 11 others, two of which are mirror images of each other. But how do we know this? How do we list them all and be sure we haven’t left any out?

The interior angle of a regular $k$-sided polygon is obviously

$\displaystyle{\pi - \frac{2 \pi}{k}}$

since it’s a bit less than 180 degrees, or $\pi,$ and how much?— well, $1/k$ times a full turn, or $2 \pi.$ But these $\pi$’s are getting annoying: it’s easier to say ‘a full turn’ than write $2\pi.$ Then we can say the interior angle is

$\displaystyle{\frac{1}{2} - \frac{1}{k}}$

times a full turn.

Now suppose we have an Archimedean tiling where $n$ polygons meet: one with $k_1$ sides, one with $k_2$ sides, and so on up to one with $k_n$ sides. Their interior angles must add up to a full turn. So, we have

$\displaystyle{\left(\frac{1}{2} - \frac{1}{k_1}\right) + \cdots + \left(\frac{1}{2} - \frac{1}{k_n}\right) = 1 }$

or

$\displaystyle{\frac{n}{2} - \frac{1}{k_1} - \cdots - \frac{1}{k_n} = 1}$

or

$\displaystyle{ \frac{1}{k_1} + \cdots + \frac{1}{k_n} = \frac{n}{2} - 1 }$

So: to get an Archimedean tiling you need $n$ whole numbers whose reciprocals add up to one less than $n/2$.

Looking for numbers like this is a weird little math puzzle. The Egyptians liked writing numbers as sums of reciprocals, so they might have enjoyed this game if they’d known it. The tiling I showed you comes from this solution:

$\displaystyle{\frac{1}{4} + \frac{1}{6} + \frac{1}{12} = \frac{3}{2} - 1 }$

since it has 3 polygons meeting at each vertex: a 4-sided one, a 6-sided one and a 12-sided one.

Here’s another solution:

$\displaystyle{\frac{1}{3} + \frac{1}{4} + \frac{1}{4} + \frac{1}{6} = \frac{4}{2} - 1 }$

It gives us this tiling:

Hmm, now I think this one is my favorite, because my eye sees it as a bunch of linked 12-sided polygons, sort of like chain mail. Different tilings make my eyes move over them in different ways, and this one has a very pleasant effect.

Here’s another solution:

$\displaystyle{\frac{1}{3} + \frac{1}{3} + \frac{1}{3} + \frac{1}{3} + \frac{1}{6} = \frac{5}{2} - 1 }$

This gives two Archimedean tilings that are mirror images of each other!

Of course, whether you count these as two different Archimedean tilings or just one depends on what rules you choose. And by the way, people usually don’t say a tiling is Archimedean if all the polygons are the same, like this:

They instead say it’s regular. If modern mathematicians were inventing this subject, we’d say regular tilings are a special case of Archimedean tilings—but this math is all very old, and back then mathematicians treated special cases as not included in the general case. For example, the Greeks didn’t even consider the number 1 to be a number!

So here’s a fun puzzle: classify the Archimedean tilings! For starters, you need to find all ways to get $n$ whole numbers whose reciprocals add up to one less than $n/2$. That sounds hard, but luckily it’s obvious that

$n \le 6$

since an equilateral triangle has the smallest interior angle of any regular polygon, and you can only fit 6 of them around a vertex. If you think a bit, you’ll see this cuts the puzzle down to a finite search.
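If you'd rather let a computer do the finite search, here is a minimal Python sketch using exact fractions. The function name `vertex_types` and the pruning rule are illustrative choices. It lists the $k_i$ in nondecreasing order, so it finds each solution of the arithmetic puzzle once, ignoring the order in which the polygons go around the vertex.

```python
from fractions import Fraction

def vertex_types(n):
    """All tuples k_1 <= ... <= k_n with k_i >= 3 and sum of 1/k_i equal to n/2 - 1."""
    target = Fraction(n, 2) - 1
    found = []

    def search(prefix, remaining, slots):
        if slots == 0:
            if remaining == 0:
                found.append(tuple(prefix))
            return
        if remaining <= 0:
            return
        k_min = prefix[-1] if prefix else 3
        # The next k is the smallest one left, so slots/k must be at least `remaining`.
        k_max = int(slots / remaining)
        for k in range(k_min, k_max + 1):
            search(prefix + [k], remaining - Fraction(1, k), slots - 1)

    search([], target, n)
    return found

# n = 7 is included only to confirm that nothing shows up beyond n = 6.
solutions = [ks for n in range(3, 8) for ks in vertex_types(n)]
for ks in solutions:
    print(len(ks), ks)
```

This only solves the arithmetic. Deciding which solutions, and which cyclic orderings of them, really extend to a tiling of the whole plane is the geometric half of the puzzle.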

But you have to be careful, since there are some solutions that don’t give Archimedean tilings. As usual, the number 5 causes problems. We have

$\displaystyle{ \frac{1}{5} + \frac{1}{5} + \frac{1}{10} = \frac{3}{2} - 1 }$

but there’s no way to tile the plane so that 2 regular pentagons and 1 regular decagon meet at each vertex! Kepler seems to have tried; here’s a picture from his book Harmonices Mundi:

It works beautifully at one vertex, but not for a tiling of the whole plane. To save the day he had to add some stars, and some of the decagons overlap! The Islamic tiling artists, and later Penrose, went further in this direction.

If you get stuck on this puzzle, you can find the answer here:

• Michal Krížek, Jakub Šolc, and Alena Šolcová, Is there a crystal lattice possessing five-fold symmetry?, AMS Notices 59 (January 2012), 22-30.

### Not enough?

In short, all Archimedean tilings of the plane arise from finding $n$ whole numbers whose reciprocals sum to $n/2 - 1.$ But what if the total is not enough? Don’t feel bad: you might still get a tiling of the hyperbolic plane. For example,

$\displaystyle{ \frac{1}{7} + \frac{1}{7} + \frac{1}{7} < \frac{3}{2} - 1 }$

so you can’t tile the plane with 3 heptagons meeting at each corner… but you still get this tiling of the hyperbolic plane:

which happens to be related to a wonderful thing called Klein’s quartic curve.

You don’t always win… but sometimes you do, so the game is worth playing. For example,

$\displaystyle{ \frac{1}{3} + \frac{1}{3} + \frac{1}{3} + \frac{1}{3} + \frac{1}{3} + \frac{1}{4} < \frac{6}{2} - 1 }$

so you have a chance at a tiling of the hyperbolic plane where five equilateral triangles and a square meet at each vertex. And in this case, you luck out:

For more beautiful pictures like these, see:

• Uniform tilings in hyperbolic plane, Wikipedia.

• Don Hatch, Hyperbolic tesselations.

### Too much?

Similarly, if you’ve got $n$ reciprocals that add up to more than $n/2 -1$, you’ve got a chance at tiling the sphere. For example,

$\displaystyle{ \frac{1}{3} + \frac{1}{3} + \frac{1}{3} + \frac{1}{3} + \frac{1}{5} > \frac{5}{2} - 1 }$

and in this case we luck out and get the snub dodecahedron. I thought it was rude to snub a dodecahedron, but apparently not:

These tilings of the sphere are technically called Archimedean solids and (if all the polygons are the same) Platonic solids. Of these, only the snub dodecahedron and the ‘snub cube’ are different from their mirror images.
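The three cases (not enough, just right, too much) can be summed up in a few lines of Python. This is only a sketch of the arithmetic test; the function name `candidate_geometry` is made up for illustration, and as the pentagon-decagon example shows, passing the test gives you a chance at a tiling, not a guarantee.

```python
from fractions import Fraction

def candidate_geometry(ks):
    """Compare the sum of 1/k over the polygons at a vertex with n/2 - 1."""
    n = len(ks)
    total = sum(Fraction(1, k) for k in ks)
    target = Fraction(n, 2) - 1
    if total > target:
        return "sphere"
    if total == target:
        return "plane"
    return "hyperbolic plane"

for ks in [(4, 6, 12), (7, 7, 7), (3, 3, 3, 3, 3, 4), (3, 3, 3, 3, 5), (5, 5, 10)]:
    print(ks, candidate_geometry(ks))
```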

### Fancier stuff

In short, adding up reciprocals of whole numbers is related to Archimedean tilings of the plane, the sphere and the hyperbolic plane. But this is also how Egyptians would write fractions! In fact they even demanded that all the reciprocals be distinct, so instead of writing 2/3 as $\frac{1}{3} + \frac{1}{3}$, they’d write $\frac{1}{2} + \frac{1}{6}.$

It’s a lousy system—doubtless this is why King Tut died so young. But forget about the restriction that the reciprocals be distinct: that’s silly. If you can show that for every $n > 1$ the number $4/n$ can be written as $1/a + 1/b + 1/c$ for whole numbers $a,b,c,$ you’ll be famous! So far people have ‘only’ shown it’s true for $n$ up to a hundred trillion:

• Erdős–Straus conjecture, Wikipedia.
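To get a feel for the conjecture, here is a naive brute-force search in Python with exact fractions. The function name `erdos_straus` and the search order are illustrative choices, and this is nowhere near the clever methods needed to reach a hundred trillion. It looks for $a \le b \le c$ with $1/a + 1/b + 1/c = 4/n$.

```python
from fractions import Fraction

def erdos_straus(n):
    """Return (a, b, c) with a <= b <= c and 1/a + 1/b + 1/c == 4/n, or None."""
    target = Fraction(4, n)
    # The largest term 1/a is below 4/n and at least a third of it.
    for a in range(n // 4 + 1, 3 * n // 4 + 1):
        rest = target - Fraction(1, a)
        # 1/b is below `rest` and at least half of it.
        b_low = max(a, int(1 / rest) + 1)
        b_high = int(2 / rest)
        for b in range(b_low, b_high + 1):
            last = rest - Fraction(1, b)
            if last > 0 and last.numerator == 1:
                return a, b, last.denominator
    return None

for n in range(2, 30):
    print(n, erdos_straus(n))
```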

So, see if you can do better! But if you’re into fancy math, a less stressful activity might be to read about Egyptian fractions, tilings and ADE classifications:

• John Baez, This Week’s Finds in Mathematical Physics (Week 182).

This only gets into ‘Platonic’ or ‘regular’ tilings, not the more general ‘Archimedean’ or ‘semiregular’ ones I’m talking about today—so the arithmetic works a bit differently.

Also, my colleague Julie Bergner has talked about how Egyptian fractions show up in the study of ‘groupoid cardinality’:

• Julie Bergner, Groupoids and Egyptian fractions.

So, while nobody uses Egyptian fractions much anymore, they have a kind of eerie afterlife. For more on what the Egyptians actually did, try these:

• Ron Knott, Egyptian fractions.

• Egyptian fractions, Wikipedia.

$\frac{1}{3} + \frac{1}{12} + \frac{1}{12} = \frac{3}{2} - 1$

## Babylon and the Square Root of 2

2 December, 2011

joint with Richard Elwes

Sometimes you can learn a lot from an old piece of clay. This is a Babylonian clay tablet from around 1700 BC. It’s known as “YBC7289”, since it’s one of many in the Yale Babylonian Collection.

It’s a diagram of a square with one side marked as having length 1/2. They took this length, multiplied it by the square root of 2, and got the length of the diagonal. And our question is: what did they really know about the square root of 2?

Questions like this are tricky. It’s even hard to be sure the square’s side has length 1/2. Since the Babylonians used base 60, they thought of 1/2 as 30/60. But since they hadn’t invented anything like a “decimal point”, they wrote it as 30. More precisely, they wrote it as this:

Take a look.

So maybe the square’s side has length 1/2… but maybe it has length 30. How can we tell? We can’t. But this tablet was probably written by a beginner, since the writing is large. And for a beginner, or indeed any mathematician, it makes a lot of sense to take 1/2 and multiply it by $\sqrt{2}$ to get $\frac{1}{\sqrt{2}}$.

Once you start worrying about these things, there’s no end to it. How do we know the Babylonians wrote 1/2 as 30? One reason is that they really liked reciprocals. According to Jöran Friberg’s book A Remarkable Collection of Babylonian Mathematical Texts, there are tablets where a teacher has set some unfortunate student the task of inverting some truly gigantic numbers such as $3^{25} \cdot 5$. They even checked their answers the obvious way: by taking the reciprocal of the reciprocal! They put together tables of reciprocals and used these to tackle more general division problems. To calculate $\frac{a}{b}$ they would break $b$ up into factors, look up the reciprocal of each, and take the product of these together with $a$. This is cool, because modern algebra also sees reciprocals as logically preceding division, even if most non-mathematicians disagree!

So, we know from tables of reciprocals that Babylonians wrote 1/2 as 30. But let’s get back to our original question: what did they know about $\sqrt{2}$?

On this tablet, they used the value

$\displaystyle{ 1 + \frac{24}{60} + \frac{51}{60^2} + \frac{10}{60^3} \approx 1.41421296... }$

This is an impressively good approximation to

$\sqrt{2} \approx 1.41421356...$

But how did they get this approximation? Did they know it was just an approximation? And did they know $\sqrt{2}$ is irrational?

There seems to be no evidence that they knew about irrational numbers. One of the great experts on Babylonian mathematics, Otto Neugebauer, wrote:

… even if it were only due to our incomplete knowledge of the sources that we assume that the Babylonians did not know that $p^2 = 2q^2$ had no solution in integer numbers $p$ and $q$, even then the fact remains that the consequences of this result were never realized.

But there is evidence that the Babylonians knew their figure was just an approximation. In his book The Crest of the Peacock, George Gheverghese Joseph points out that a number very much like this shows up at the fourth stage of a fairly obvious recursive algorithm for approximating square roots! The first three approximations are

$1$

$\displaystyle{ \frac{3}{2} = 1.5 }$

and

$\displaystyle{ \frac{17}{12} \approx 1.41666... }$

The fourth is

$\displaystyle{ \frac{577}{408} \approx 1.41421568... }$

but if you work it out to 3 places in base 60, as the Babylonians seem to have done, you’ll get the number on this tablet!
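Working a fraction out in base 60 is mechanical: peel off the whole part, multiply what's left by 60, and repeat. Here is a small Python sketch with exact fractions. The function name `sexagesimal` is made up for illustration, and it truncates rather than rounds, which is an assumption about what 'working it out to 3 places' means.

```python
from fractions import Fraction

def sexagesimal(x, places):
    """Whole part of x, then its first `places` base-60 digits (truncated)."""
    whole = x.numerator // x.denominator
    frac = x - whole
    digits = []
    for _ in range(places):
        frac *= 60
        digit = frac.numerator // frac.denominator
        digits.append(digit)
        frac -= digit
    return whole, digits

whole, digits = sexagesimal(Fraction(577, 408), 3)
print(whole, digits)
```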

The number 577/408 also shows up as an approximation to $\sqrt{2}$ in the Shulba Sutras, a collection of Indian texts compiled between 800 and 200 BC. So, Indian mathematicians may have known the same algorithm.

But what is this algorithm, exactly? Joseph describes it, but Sridhar Ramesh told us about an easier way to think about it. Suppose you’re trying to compute the square root of 2 and you have a guess, say $a$. If your guess is exactly right then

$a^2 = 2$

so

$a = 2/a$

But if your guess isn’t right, $a$ won’t be quite equal to $2/a$. So it makes sense to take the average of $a$ and $2/a$, and use that as a new guess. If your original guess wasn’t too bad, and you keep using this procedure, you’ll get a sequence of guesses that converges to $\sqrt{2}$. In fact it converges very rapidly: at each step, the number of correct digits in your guess will approximately double!
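In modern terms the whole procedure fits in a few lines. Here is a minimal Python sketch using exact fractions, so no rounding sneaks in; the function name `babylonian_sqrt2` is of course an anachronism, and nothing here is a claim about how the Babylonians organized the work.

```python
from fractions import Fraction

def babylonian_sqrt2(steps, guess=Fraction(1)):
    """Repeatedly replace the guess a by the average of a and 2/a."""
    guesses = [guess]
    for _ in range(steps):
        guess = (guess + 2 / guess) / 2
        guesses.append(guess)
    return guesses

for a in babylonian_sqrt2(4):
    print(a, float(a), float(a * a - 2))
```

The last column shows how far the square of each guess is from 2, which makes the rapid convergence easy to see.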

Let’s see how it goes. We start with an obvious dumb guess, namely 1. Now 1 sure isn’t equal to 2/1, but we can average them and get a better guess:

$\displaystyle{ \frac{1}{2}(1 \;+ \; 2) = \frac{3}{2} }$

Next, let’s average 3/2 and 2/(3/2):

$\displaystyle{ \frac{1}{2}\left(\frac{3}{2} \; + \; \frac{2}{\frac{3}{2}}\right) = \frac{1}{2}\left(\frac{3}{2} \; + \; \frac{4}{3}\right) = \frac{1}{2}\left(\frac{3 \cdot 3 + 2 \cdot 4}{2 \cdot 3}\right) = \frac{9 + 8}{12} = \frac{17}{12} }$

We’re doing the calculation in painstaking detail for two reasons. First, we want to prove that we’re just as good at arithmetic as the ancient Babylonians: we don’t need a calculator for this stuff! Second, a cute pattern will show up if you pay attention.

Let’s do the next step. Now we’ll average 17/12 and 2/(17/12):

$\displaystyle{ \frac{1}{2}\left(\frac{17}{12} \; + \; \frac{2}{\frac{17}{12}}\right) = \frac{1}{2}\left(\frac{17}{12} \; + \; \frac{24}{17}\right) = \frac{1}{2}\left(\frac{17 \cdot 17 + 12 \cdot 24}{12 \cdot 17}\right) }$

Do you remember what 17 times 17 is? No? That’s bad. It’s 289. Do you remember what 12 times 24 is? Well, maybe you remember that 12 times 12 is 144. So, double that and get 288. Hmm. So, moving right along, we get

$\displaystyle{ \frac{1}{2}\left(\frac{289 + 288}{204}\right) = \frac{577}{408} }$

which is what the Babylonians seem to have used!

Do you see the cute pattern? No? Yes? Even if you do, it’s good to try another round of this game, to see if this pattern persists. Besides, it’ll be fun to beat the Babylonians at their own game and get a better approximation to $\sqrt{2}$.

So, let’s average 577/408 and 2/(577/408):

$\begin{array}{ccl} \displaystyle{ \frac{1}{2}\left(\frac{577}{408} \; + \; \frac{2}{\frac{577}{408}}\right) } &=& \displaystyle{ \frac{1}{2}\left(\frac{577}{408} \; + \; \frac{816}{577}\right) } \\ \\ &=& \displaystyle{ \frac{1}{2}\left(\frac{577 \cdot 577 + 816 \cdot 408}{408 \cdot 577}\right) } \end{array}$

Do you remember what 577 times 577 is? Heh, neither do we. In fact, right now a calculator is starting to look really good. Okay: it says the answer is 332,929. And what about 816 times 408? That’s 332,928. Just one less! And that’s the pattern we were hinting at: it’s been working like that every time. Continuing, we get

$\displaystyle{ \frac{1}{2}\left(\frac{332,929 + 332,928}{235,416}\right) = \frac{665,857}{470,832} }$

So that’s our new approximation of $\sqrt{2}$, which is even better than the best known in 1700 BC! Let’s see how good it is:

$\begin{array}{ccc} \displaystyle{ \frac{665,857}{470,832} }\; &\approx & 1.414213562375... \\ & & \\ \sqrt{2} \; &\approx & 1.414213562373...\end{array}$

So, it’s good to 11 decimals!

What about that pattern we saw? As you can see, we keep getting a square number that’s one more than twice some other square:

$3^2 = 2 \cdot 2^2 + 1$

$17^2 = 2 \cdot 12^2 + 1$

$577^2 = 2 \cdot 408^2 + 1$

and so on… at least if the pattern continues. So, while we can’t find integers $p$ and $q$ with

$p^2 = 2 q^2$

because $\sqrt{2}$ is irrational, it seems we can find infinitely many solutions to

$p^2 = 2 q^2 + 1$

and these give fractions $p/q$ that are really good approximations to $\sqrt{2}$. But can you prove this is really what’s going on?
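Before hunting for a proof, it's easy to gather more evidence. This Python sketch runs the averaging procedure with exact fractions, starting from 1, and prints $p^2 - 2q^2$ for each new guess $p/q$ in lowest terms. The function name `pell_defects` is just for illustration. A run of 1's is encouraging, but it proves nothing about what happens forever.

```python
from fractions import Fraction

def pell_defects(steps):
    """For each new guess p/q, record (p, q, p^2 - 2 q^2)."""
    a = Fraction(1)
    rows = []
    for _ in range(steps):
        a = (a + 2 / a) / 2
        p, q = a.numerator, a.denominator
        rows.append((p, q, p * p - 2 * q * q))
    return rows

for p, q, defect in pell_defects(6):
    print(defect, len(str(p)), "digits in p")
```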

We’ll leave this as a puzzle in case you’re ever stuck on a desert island, or stuck in the deserts of Iraq. And if you want even more fun, try simplifying these fractions:

$\displaystyle{ 1 + \frac{1}{2} }$

$\displaystyle{ 1 + \frac{1}{2 + \frac{1}{2}} }$

$\displaystyle{ 1 + \frac{1}{2 + \frac{1}{2 +\frac{1}{2}}} }$

$\displaystyle{ 1 + \frac{1}{2 + \frac{1}{2 +\frac{1}{2 + \frac{1}{2}}}} }$

and so on. Some will give you the fractions we’ve seen already, but others won’t. How far out do you need to go to get 577/408? Can you figure out the pattern and see when 665,857/470,832 will show up?
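If simplifying these by hand gets tedious, here is a Python sketch that does it exactly. The names `nested_fraction` and `first_depth` are made up for illustration; `twos` counts how many 2's appear in the expression, so `nested_fraction(1)` is $1 + \frac{1}{2}$.

```python
from fractions import Fraction

def nested_fraction(twos):
    """Simplify 1 + 1/(2 + 1/(2 + ...)) containing the given number of 2's."""
    tail = Fraction(2)
    for _ in range(twos - 1):
        tail = 2 + 1 / tail
    return 1 + 1 / tail

def first_depth(target, max_twos=100):
    """Smallest number of 2's that produces the target fraction, or None."""
    for twos in range(1, max_twos + 1):
        if nested_fraction(twos) == target:
            return twos
    return None

for twos in range(1, 6):
    print(twos, nested_fraction(twos))
```

Calling `first_depth(Fraction(577, 408))` answers the first question above, but it's more fun to guess the pattern first.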

If you get stuck, it may help to read about Pell numbers. We could say more, but we’re beginning to babble on.

#### References

You can read about YBC7289 and see more photos of it here:

• Duncan J. Melville, YBC7289.

• Bill Casselman, YBC7289.

If you want to check that the tablet really says what the experts claim it does, ponder these pictures:

The number “1 24 51 10” is base 60 for

$\displaystyle{ 1 + \frac{24}{60} + \frac{51}{60^2} + \frac{10}{60^3} \approx 1.41421296... }$

and the number “42 25 35” is presumably base 60 for what you get when you multiply this by 1/2 (we were too lazy to check). But can you read the clay tablet well enough to actually see these numbers? It’s not easy.
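For the less lazy, the check takes a few lines of Python with exact fractions. The helper name `from_sexagesimal` is made up for illustration, and where to put the 'sexagesimal point' is an assumption, since the tablet doesn't mark it: the sketch tries both the reading with side 1/2 and the reading with side 30.

```python
from fractions import Fraction

def from_sexagesimal(digits):
    """Value of base-60 digits, the first digit being the whole part."""
    return sum(Fraction(d, 60 ** i) for i, d in enumerate(digits))

root2 = from_sexagesimal([1, 24, 51, 10])

# Reading 1: side 1/2, diagonal 0;42,25,35
print(root2 / 2 == from_sexagesimal([0, 42, 25, 35]))

# Reading 2: side 30, diagonal 42;25,35
print(30 * root2 == from_sexagesimal([42, 25, 35]))
```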

For a quick intro to what Babylonian mathematicians might have known about the Pythagorean theorem, and how this is related to YBC7289, try:

• J. J. O’Connor and E. F. Robertson, Pythagoras’s theorem in Babylonian mathematics.

We got our table of Babylonian numerals from here:

• J. J. O’Connor and E. F. Robertson, Babylonian numerals.

For more details, try:

• D. H. Fowler and E. R. Robson, Square root approximations in Old Babylonian mathematics: YBC 7289 in context, Historia Mathematica 25 (1998), 366–378.

We also recommend this book, an easily readable introduction to the history of non-European mathematics that discusses YBC7289:

• George Gheverghese Joseph, The Crest of the Peacock: Non-European Roots of Mathematics, Princeton U. Press, Princeton, 2000.

To dig deeper, try these:

• Otto Neugebauer, The Exact Sciences in Antiquity, Dover Books, New York, 1969.

• Jöran Friberg, A Remarkable Collection of Babylonian Mathematical Texts, Springer, Berlin, 2007.

Here a sad story must indeed be told. While the field work has been perfected to a very high standard during the last half century, the second part, the publication, has been neglected to such a degree that many excavations of Mesopotamian sites resulted only in a scientifically executed destruction of what was left still undestroyed after a few thousand years. – Otto Neugebauer.

## A Math Puzzle Coming From Chemistry

23 October, 2011

I posed this puzzle a while back, and nobody solved it. That’s okay—now that I think about it, I’m not sure how to solve it either!

It seems to involve group theory. But instead of working on it, solving it and telling you the answer, I’d rather dump all the clues in your lap, so we can figure it out together.

Suppose we have an ethyl cation. We’ll pretend it looks like this:

As I explained before, it actually doesn’t—not in real life. But never mind! Realism should never stand in the way of a good puzzle.

Continuing on in this unrealistic vein, we’ll pretend that the two black carbon atoms are distinguishable, and so are the five white hydrogen atoms. As you can see, 2 of the hydrogens are bonded to one carbon, and 3 to the other. We don’t care how the hydrogens are arranged, apart from which carbon each hydrogen is attached to. Given this, there are

$2 \times \displaystyle{ \binom{5}{2} = 20 }$

ways to arrange the hydrogens. Let’s call these arrangements states.

Now draw a dot for each of these 20 states. Draw an edge connecting two dots whenever you can get from one state to another by having a hydrogen hop from the carbon with 3 hydrogens to the carbon with 2. You’ll get this picture, called the Desargues graph:

The red dots are states where the first carbon has 2 hydrogens attached to it; the blue ones are states where the second carbon has 2 hydrogens attached to it. So, each edge goes between a red and a blue dot. And there are 3 edges coming out of each dot, since there are 3 hydrogens that can make the jump!
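If you’d like to see these counts come out of a computer, here’s a small sketch in Python. It encodes a state as a pair `(c, pair)`, where `c` is the carbon holding 2 hydrogens and `pair` is the set of those 2 hydrogens. This encoding and the names `states`, `neighbors` and `edges` are choices made for this illustration, not anything canonical.

```python
from itertools import combinations

hydrogens = range(5)

# A state is (c, pair): carbon c (0 or 1) holds the 2 hydrogens in `pair`,
# and the other carbon holds the remaining 3.
states = [(c, frozenset(pair)) for c in (0, 1) for pair in combinations(hydrogens, 2)]

def neighbors(state):
    """States reached when one hydrogen hops from the carbon with 3 to the carbon with 2."""
    c, pair = state
    trio = frozenset(hydrogens) - pair
    # After hydrogen h hops, carbon c holds 3 and the other carbon keeps trio - {h}.
    return [(1 - c, trio - {h}) for h in trio]

edges = {frozenset((s, t)) for s in states for t in neighbors(s)}

print(len(states))                                           # 20
print(all(len(neighbors(s)) == 3 for s in states))           # True
print(len(edges))                                            # 30
print(all(len({c for c, _ in edge}) == 2 for edge in edges)) # True: every edge is red-blue
```

In this encoding two states are joined exactly when they sit on different carbons and their pairs of hydrogens are disjoint. That’s the usual description of the Desargues graph as the bipartite double cover of the Petersen graph.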

Now, the puzzle is to show that you can also get the Desargues graph from a different kind of molecule. Any molecule shaped like this will do:

The 2 balls on top and bottom are called axial, while the 3 around the middle are called equatorial.

There are various molecules like this. For example, phosphorus pentachloride. Let’s use that.

Like the ethyl cation, phosphorus pentachloride also has 20 states… but only if we count them a certain way! We have to treat all 5 chlorines as distinguishable, but think of two arrangements of them as the same if we can rotate one to get the other. Again, I’m not claiming this is physically realistic: it’s just for the sake of the puzzle.

Phosphorus pentachloride has 6 rotational symmetries, since you can turn it around its axis 3 ways, but also flip it over. So, it has

$\displaystyle{ \frac{5!}{6} = 20}$

states.

That’s good: exactly the number of dots in the Desargues graph! But how about the edges? We get these from certain transitions between states. These transitions are called pseudorotations, and they look like this:

Phosphorus pentachloride really does this! First the 2 axial guys move towards each other to become equatorial. Beware: now the equatorial ones are no longer in the horizontal plane: they’re in the plane facing us. Then 2 of the 3 equatorial guys swing out to become axial.

To get from one state to another this way, we have to pick 2 of the 3 equatorial guys to swing out and become axial. There are 3 choices here. So, we again get a graph with 20 vertices and 3 edges coming out of each vertex.
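Here’s a rough computational check of the counting so far, written in Python. It rests on a modeling assumption that isn’t spelled out in the text: a pseudorotation with a given equatorial chlorine as pivot is encoded as the relabeling of positions in `pseudorotate` below. The position numbering, the list `ROTATIONS` and all the function names are also invented for this sketch. It checks the number of states and edges, and nothing more.

```python
from itertools import permutations

# Positions 0 and 1 are axial; positions 2, 3, 4 are equatorial.
# A rotation g carries whatever sits at position i to position g[i].
ROTATIONS = [
    (0, 1, 2, 3, 4),  # identity
    (0, 1, 3, 4, 2),  # turn by 120 degrees about the axis
    (0, 1, 4, 2, 3),  # turn by 240 degrees about the axis
    (1, 0, 2, 4, 3),  # flip about the line through equatorial position 2
    (1, 0, 4, 3, 2),  # flip about the line through equatorial position 3
    (1, 0, 3, 2, 4),  # flip about the line through equatorial position 4
]

def rotate(labeling, g):
    """Move the chlorine at position i to position g[i]."""
    moved = [None] * 5
    for i, chlorine in enumerate(labeling):
        moved[g[i]] = chlorine
    return tuple(moved)

def canonical(labeling):
    """Pick one representative from each class of labelings related by rotations."""
    return min(rotate(labeling, g) for g in ROTATIONS)

def pseudorotate(labeling):
    """Assumed model of a pseudorotation with the chlorine at position 2 as pivot:
    the equatorial chlorines at 3 and 4 become axial, the axial ones become equatorial."""
    a0, a1, e2, e3, e4 = labeling
    return (e3, e4, e2, a1, a0)

def neighbors(state):
    """States reachable by one pseudorotation, one for each choice of pivot."""
    turns = ROTATIONS[:3]  # bring each equatorial chlorine to position 2 in turn
    return {canonical(pseudorotate(rotate(state, g))) for g in turns}

states = {canonical(p) for p in permutations(range(5))}
edges = {frozenset((s, t)) for s in states for t in neighbors(s)}

print(len(states))                                   # 20
print(all(len(neighbors(s)) == 3 for s in states))   # True
print(len(edges))                                    # 30
```

So under this encoding we do get 20 vertices with 3 edges each, 30 edges in all. Whether the graph is the Desargues graph is a separate question.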

Puzzle. Is this graph the Desargues graph? If so, show it is.

I read in some chemistry papers that it is. But is it really? And if so, why? David Corfield suggested a promising strategy. He pointed out that we just need to get a 1-1 correspondence between

states of the ethyl cation and states of phosphorus pentachloride,

together with a compatible 1-1 correspondence between

transitions of the ethyl cation and transitions of phosphorus pentachloride.

And he suggested that to do this, we should think of the split of hydrogens into a bunch of 2 and a bunch of 3 as analogous to the split of chlorines into a bunch of 2 (the ‘axial’ ones) and a bunch of 3 (the ‘equatorial’ ones).

It’s a promising idea. There’s a problem, though! In the ethyl cation, a single hydrogen hops from the bunch of 3 to the bunch of 2. But in a pseudorotation, two chlorines go from the bunch of 2 to the bunch of 3… and meanwhile, two go back from the bunch of 3 to the bunch of 2.

And if you think about it, there’s another problem too. In the ethyl cation, there are 2 distinguishable carbons. One of them has 3 hydrogens attached, and one doesn’t. But in phosphorus pentachloride it’s not like that. The 3 equatorial chlorines are just that: equatorial. They don’t have 2 choices about how to be that way. Or do they?

Well, there’s more to say, but this should already make it clear that getting ‘natural’ one-to-one correspondences is a bit tricky… if it’s even possible at all!

If you know some group theory, we could try solving the problem using the ideas behind Felix Klein’s ‘Erlangen program’. The group of permutations of 5 things, say $S_5,$ acts as symmetries of either molecule. For the ethyl cation the set of states will be $X = S_5/G$ for some subgroup $G.$ You can think of $X$ as a set of structures of some sort on a 5-element set. The group $S_5$ acts on $X,$ and the transitions will give an invariant binary relation on $X.$ For phosphorus pentachloride we’ll have some set of states $X' = S_5/G'$ for some other subgroup $G'$, and the transitions will give an invariant relation on $X'$.

We could start by trying to see if $G$ is the same as $G'$—or more precisely, conjugate. If they are, that’s a good sign. If not, it’s bad: it probably means there’s no ‘natural’ way to show the graph for phosphorus pentachloride is the Desargues graph.

I could say more, but I’ll stop here. In case you’re wondering, all this is just a trick to get more mathematicians interested in chemistry. A few may then go on to do useful things.

## Network Theory (Part 6)

16 April, 2011

Now for the fun part. Let’s see how tricks from quantum theory can be used to describe random processes. I’ll try to make this post self-contained. So, even if you skipped a bunch of the previous ones, this should make sense.

You’ll need to know a bit of math: calculus, a tiny bit of probability theory, and linear operators on vector spaces. You don’t need to know quantum theory, though you’ll have more fun if you do. What we’re doing here is very similar… but also strangely different, for reasons I explained last time.

#### Rabbits and quantum mechanics

Suppose we have a population of rabbits in a cage and we’d like to describe its growth in a stochastic way, using probability theory. Let $\psi_n$ be the probability of having $n$ rabbits. We can borrow a trick from quantum theory, and summarize all these probabilities in a formal power series like this:

$\Psi = \sum_{n = 0}^\infty \psi_n z^n$

The variable $z$ doesn’t mean anything in particular, and we don’t care if the power series converges. See, in math ‘formal’ means “it’s only symbols on the page, just follow the rules”. It’s like if someone says a party is ‘formal’, so you need to wear a white tie: you’re not supposed to ask what the tie means.

However, there’s a good reason for this trick. We can define two operators on formal power series, called the annihilation operator:

$a \Psi = \frac{d}{d z} \Psi$

and the creation operator:

$a^\dagger \Psi = z \Psi$

They’re just differentiation and multiplication by $z$, respectively. So, for example, suppose we start out being 100% sure we have $n$ rabbits for some particular number $n$. Then $\psi_n = 1$, while all the other probabilities are 0, so:

$\Psi = z^n$

If we then apply the creation operator, we obtain

$a^\dagger \Psi = z^{n+1}$

Voilà! One more rabbit!

The annihilation operator is more subtle. If we start out with $n$ rabbits:

$\Psi = z^n$

and then apply the annihilation operator, we obtain

$a \Psi = n z^{n-1}$

What does this mean? The $z^{n-1}$ means we have one fewer rabbit than before. But what about the factor of $n$? It means there were $n$ different ways we could pick a rabbit and make it disappear! This should seem a bit mysterious, for various reasons… but we’ll see how it works soon enough.

The creation and annihilation operators don’t commute:

$(a a^\dagger - a^\dagger a) \Psi = \frac{d}{d z} (z \Psi) - z \frac{d}{d z} \Psi = \Psi$

so for short we say:

$a a^\dagger - a^\dagger a = 1$

or even shorter:

$[a, a^\dagger] = 1$

where the commutator of two operators is

$[S,T] = S T - T S$

The noncommutativity of operators is often claimed to be a special feature of quantum physics, and the creation and annihilation operators are fundamental to understanding the quantum harmonic oscillator. There, instead of rabbits, we’re studying quanta of energy, which are peculiarly abstract entities obeying rather counterintuitive laws. So, it’s cool that the same math applies to purely classical entities, like rabbits!

In particular, the equation $[a, a^\dagger] = 1$ just says that there’s one more way to put a rabbit in a cage of rabbits, and then take one out, than to take one out and then put one in.
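To make this concrete, here’s a minimal sketch in Python that stores a (truncated) formal power series as a list of coefficients `[psi_0, psi_1, ...]`. The function names `annihilate`, `create` and `subtract` are just labels chosen for this example.

```python
def annihilate(psi):
    """a = d/dz acting on a coefficient list [psi_0, psi_1, ...]."""
    return [n * c for n, c in enumerate(psi)][1:]

def create(psi):
    """a_dagger = multiplication by z."""
    return [0] + list(psi)

def subtract(f, g):
    """Coefficientwise f - g, padding the shorter list with zeros."""
    size = max(len(f), len(g))
    f = list(f) + [0] * (size - len(f))
    g = list(g) + [0] * (size - len(g))
    return [x - y for x, y in zip(f, g)]

psi = [0, 0, 0, 1]            # z^3: we're sure there are 3 rabbits
print(create(psi))            # [0, 0, 0, 0, 1], which is z^4
print(annihilate(psi))        # [0, 0, 3], which is 3 z^2

poly = [5, 1, 4, 2]           # an arbitrary polynomial, not a probability distribution
commutator = subtract(annihilate(create(poly)), create(annihilate(poly)))
print(commutator)             # [5, 1, 4, 2]: the commutator acts as the identity
```

The last line is the relation $[a, a^\dagger] = 1$ checked on one example; the calculation with derivatives is what proves it in general.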

But how do we actually use this setup? We want to describe how the probabilities $\psi_n$ change with time, so we write

$\Psi(t) = \sum_{n = 0}^\infty \psi_n(t) z^n$

Then, we write down an equation describing the rate of change of $\Psi$:

$\frac{d}{d t} \Psi(t) = H \Psi(t)$

Here $H$ is an operator called the Hamiltonian, and the equation is called the master equation. The details of the Hamiltonian depend on our problem! But we can often write it down using creation and annihilation operators. Let’s do some examples, and then I’ll tell you the general rule.

#### Catching rabbits

Last time I told you what happens when we stand in a river and catch fish as they randomly swim past. Let me remind you of how that works. But today let’s use rabbits.

So, suppose an inexhaustible supply of rabbits are randomly roaming around a huge field, and each time a rabbit enters a certain area, we catch it and add it to our population of caged rabbits. Suppose that on average we catch one rabbit per unit time. Suppose the chance of catching a rabbit during any interval of time is independent of what happened before. What is the Hamiltonian describing the probability distribution of caged rabbits, as a function of time?

There’s an obvious dumb guess: the creation operator! However, we saw last time that this doesn’t work, and we saw how to fix it. The right answer is

$H = a^\dagger - 1$

To see why, suppose for example that at some time $t$ we have $n$ rabbits, so:

$\Psi(t) = z^n$

Then the master equation says that at this moment,

$\frac{d}{d t} \Psi(t) = (a^\dagger - 1) \Psi(t) = z^{n+1} - z^n$

Since $\Psi = \sum_{n = 0}^\infty \psi_n(t) z^n$, this implies that the coefficients of our formal power series are changing like this:

$\frac{d}{d t} \psi_{n+1}(t) = 1$
$\frac{d}{d t} \psi_{n}(t) = -1$

while all the rest have zero derivative at this moment. And that’s exactly right! See, $\psi_{n+1}(t)$ is the probability of having one more rabbit, and this is going up at rate 1. Meanwhile, $\psi_n(t)$ is the probability of having $n$ rabbits, and this is going down at the same rate.

Puzzle 1. Show that with this Hamiltonian and any initial conditions, the master equation predicts that the expected number of rabbits grows linearly.
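A numerical experiment is no substitute for the proof the puzzle asks for, but it’s a nice sanity check. The sketch below, in Python, integrates the master equation for $H = a^\dagger - 1$ with crude Euler steps on a truncated list of probabilities. The truncation size, the step size and the starting state of 3 rabbits are arbitrary choices made for this illustration.

```python
def step(psi, dt):
    """One Euler step of d(psi)/dt = (a_dagger - 1) psi on a truncated coefficient list."""
    shifted = [0.0] + psi[:-1]      # a_dagger psi, dropping the top coefficient
    return [p + dt * (s - p) for p, s in zip(psi, shifted)]

def mean(psi):
    """Expected number of rabbits."""
    return sum(n * p for n, p in enumerate(psi))

size, dt = 60, 0.001                # assumed truncation and time step
psi = [0.0] * size
psi[3] = 1.0                        # start with exactly 3 rabbits

for _ in range(2000):               # evolve to t = 2
    psi = step(psi, dt)

print(mean(psi))                    # close to 3 + 2 = 5
print(sum(psi))                     # close to 1: total probability is conserved
```

Starting from 3 rabbits, the expected number after time 2 comes out very close to 5, consistent with growth at rate 1.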

#### Dying rabbits

Don’t worry: no rabbits are actually injured in the research that Jacob Biamonte is doing here at the Centre for Quantum Technologies. He’s keeping them well cared for in a big room on the 6th floor. This is just a thought experiment.

Suppose a mean nasty guy had a population of rabbits in a cage and didn’t feed them at all. Suppose that each rabbit has a unit probability of dying per unit time. And as always, suppose the probability of this happening in any interval of time is independent of what happens before that time.

What is the Hamiltonian? Again there’s a dumb guess: the annihilation operator! And again this guess is wrong, but it’s not far off. As before, the right answer includes a ‘correction term’:

$H = a - N$

This time the correction term is famous in its own right. It’s called the number operator:

$N = a^\dagger a$

The reason is that if we start with $n$ rabbits, and apply this operator, it amounts to multiplication by $n$:

$N z^n = z \frac{d}{d z} z^n = n z^n$

Let’s see why this guess is right. Again, suppose that at some particular time $t$ we have $n$ rabbits, so

$\Psi(t) = z^n$

Then the master equation says that at this time

$\frac{d}{d t} \Psi(t) = (a - N) \Psi(t) = n z^{n-1} - n z^n$

So, our probabilities are changing like this:

$\frac{d}{d t} \psi_{n-1}(t) = n$
$\frac{d}{d t} \psi_n(t) = -n$

while the rest have zero derivative. And this is good! We’re starting with $n$ rabbits, and each has a unit probability per unit time of dying. So, the chance of having one less should be going up at rate $n$. And the chance of having the same number we started with should be going down at the same rate.

Puzzle 2. Show that with this Hamiltonian and any initial conditions, the master equation predicts that the expected number of rabbits decays exponentially.

#### Breeding rabbits

Suppose we have a strange breed of rabbits that reproduce asexually. Suppose that each rabbit has a unit probability per unit time of having a baby rabbit, thus effectively duplicating itself.

As you can see from the cryptic picture above, this ‘duplication’ process takes one rabbit as input and has two rabbits as output. So, if you’ve been paying attention, you should be ready with a dumb guess for the Hamiltonian: $a^\dagger a^\dagger a$. This operator annihilates one rabbit and then creates two!

But you should also suspect that this dumb guess will need a ‘correction term’. And you’re right! As always, the correction term makes the probability of things staying the same go down at exactly the rate that the probability of things changing goes up.

You should guess the correction term… but I’ll just tell you:

$H = a^\dagger a^\dagger a - N$

We can check this in the usual way, by seeing what it does when we have $n$ rabbits:

$H z^n = z^2 \frac{d}{d z} z^n - n z^n = n z^{n+1} - n z^n$

That’s good: since there are $n$ rabbits, the rate of rabbit duplication is $n$. This is the rate at which the probability of having one more rabbit goes up… and also the rate at which the probability of having $n$ rabbits goes down.

Puzzle 3. Show that with this Hamiltonian and any initial conditions, the master equation predicts that the expected number of rabbits grows exponentially.

#### Dueling rabbits

Let’s do some stranger examples, just so you can see the general pattern.

Here each pair of rabbits has a unit probability per unit time of fighting a duel with only one survivor. You might guess the Hamiltonian $a^\dagger a a,$ but in fact:

$H = a^\dagger a a - N(N-1)$

Let’s see why this is right! Let’s see what it does when we have $n$ rabbits:

$H z^n = z \frac{d^2}{d z^2} z^n - n(n-1)z^n = n(n-1) z^{n-1} - n(n-1)z^n$

That’s good: since there are $n(n-1)$ ordered pairs of rabbits, the rate at which duels take place is $n(n-1)$. This is the rate at which the probability of having one less rabbit goes up… and also the rate at which the probability of having $n$ rabbits goes down.

(If you prefer unordered pairs of rabbits, just divide the Hamiltonian by 2. We should talk about this more, but not now.)

#### Brawling rabbits

Now each triple of rabbits has a unit probability per unit time of getting into a fight with only one survivor! I don’t know the technical term for a three-way fight, but perhaps it counts as a small ‘brawl’ or ‘melee’. In fact the Wikipedia article for ‘melee’ shows three rabbits in suits of armor, fighting it out:

Now the Hamiltonian is:

$H = a^\dagger a^3 - N(N-1)(N-2)$

You can check that:

$H z^n = n(n-1)(n-2) z^{n-2} - n(n-1)(n-2) z^n$

and this is good, because $n(n-1)(n-2)$ is the number of ordered triples of rabbits. You can see how this number shows up from the math, too:

$a^3 z^n = \frac{d^3}{d z^3} z^n = n(n-1)(n-2) z^{n-3}$

#### The general rule

Suppose we have a process taking $k$ rabbits as input and having $j$ rabbits as output:

I hope you can guess the Hamiltonian I’ll use for this:

$H = {a^{\dagger}}^j a^k - N(N-1) \cdots (N-k+1)$

This works because

$a^k z^n = \frac{d^k}{d z^k} z^n = n(n-1) \cdots (n-k+1) z^{n-k}$

so that if we apply our Hamiltonian to $n$ rabbits, we get

$H z^n = n(n-1) \cdots (n-k+1) (z^{n+j-k} - z^n)$

See? As the probability of having $n+j-k$ rabbits goes up, the probability of having $n$ rabbits goes down, at an equal rate. This sort of balance is necessary for $H$ to be a sensible Hamiltonian in this sort of stochastic theory (an ‘infinitesimal stochastic operator’, to be precise). And the rate is exactly the number of ordered $k$-tuples taken from a collection of $n$ rabbits. This is called the $k$th falling power of $n$, and written as follows:

$n^{\underline{k}} = n(n-1) \cdots (n-k+1)$

Since we can apply functions to operators as well as numbers, we can write our Hamiltonian as:

$H = {a^{\dagger}}^j a^k - N^{\underline{k}}$
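Here’s a short Python sketch of the general rule. It represents $\Psi$ as a dictionary sending $n$ to the coefficient of $z^n$, and applies $H$ using the formula for $H z^n$ above. The names `falling` and `apply_hamiltonian`, and the keyword arguments `inputs` (for $k$) and `outputs` (for $j$), are invented for this example.

```python
from collections import defaultdict

def falling(n, k):
    """Falling power n(n-1)...(n-k+1): the number of ordered k-tuples from n rabbits."""
    result = 1
    for i in range(k):
        result *= n - i
    return result

def apply_hamiltonian(psi, inputs, outputs):
    """Apply H = (a_dagger)^j a^k - N(N-1)...(N-k+1), with k = inputs and j = outputs,
    to psi given as a dict {n: coefficient of z^n}."""
    out = defaultdict(int)
    for n, coeff in psi.items():
        rate = falling(n, inputs) * coeff
        out[n + outputs - inputs] += rate   # probability flowing into the new state
        out[n] -= rate                      # the same amount flowing out of the old one
    return {n: c for n, c in out.items() if c != 0}

five = {5: 1}   # exactly 5 rabbits
print(apply_hamiltonian(five, inputs=0, outputs=1))   # catching: {6: 1, 5: -1}
print(apply_hamiltonian(five, inputs=1, outputs=0))   # dying:    {4: 5, 5: -5}
print(apply_hamiltonian(five, inputs=1, outputs=2))   # breeding: {6: 5, 5: -5}
print(apply_hamiltonian(five, inputs=2, outputs=1))   # dueling:  {4: 20, 5: -20}
print(apply_hamiltonian(five, inputs=3, outputs=1))   # brawling: {3: 60, 5: -60}
```

In every case the coefficients of the result sum to zero, which is the balance condition mentioned above.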

#### Kissing rabbits

Let’s do one more example just to test our understanding. This time each pair of rabbits has a unit probability per unit time of bumping into one another, exchanging a friendly kiss and walking off. This shouldn’t affect the rabbit population at all! But let’s follow the rules and see what they say.

According to our rules, the Hamiltonian should be:

$H = {a^{\dagger}}^2 a^2 - N(N-1)$

However,

${a^{\dagger}}^2 a^2 z^n = z^2 \frac{d^2}{dz^2} z^n = n(n-1) z^n = N(N-1) z^n$

and since the monomials $z^n$ form a ‘basis’ for the formal power series, we see that:

${a^{\dagger}}^2 a^2 = N(N-1)$

so in fact:

$H = 0$

That’s good: if the Hamiltonian is zero, the master equation will say

$\frac{d}{d t} \Psi(t) = 0$

so the population, or more precisely the probability of having any given number of rabbits, will be constant.

There’s another nice little lesson here. Copying the calculation we just did, it’s easy to see that:

${a^{\dagger}}^k a^k = N^{\underline{k}}$

This is a cute formula for falling powers of the number operator in terms of annihilation and creation operators. It means that for the general transition we saw before:

we can write the Hamiltonian in two equivalent ways:

$H = {a^{\dagger}}^j a^k - N^{\underline{k}} = {a^{\dagger}}^j a^k - {a^{\dagger}}^k a^k$
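The identity ${a^{\dagger}}^k a^k = N^{\underline{k}}$ can also be spot-checked by brute force. This Python sketch differentiates $z^n$ a total of $k$ times, multiplies by $z$ a total of $k$ times, and compares with the falling power. It only tests small $n$ and $k$, so it’s evidence rather than a proof, and the helper names are made up for the occasion.

```python
def annihilate(psi):
    """a = d/dz acting on a coefficient list [psi_0, psi_1, ...]."""
    return [n * c for n, c in enumerate(psi)][1:]

def create(psi):
    """a_dagger = multiplication by z."""
    return [0] + list(psi)

def trim(psi):
    """Drop trailing zeros so that equal series compare equal."""
    psi = list(psi)
    while psi and psi[-1] == 0:
        psi.pop()
    return psi

def falling(n, k):
    """Falling power n(n-1)...(n-k+1)."""
    result = 1
    for i in range(k):
        result *= n - i
    return result

def check(n, k):
    """Does (a_dagger)^k a^k z^n equal the k-th falling power of n times z^n?"""
    psi = [0] * n + [1]                     # the monomial z^n
    for _ in range(k):
        psi = annihilate(psi)
    for _ in range(k):
        psi = create(psi)
    expected = [0] * n + [falling(n, k)]
    return trim(psi) == trim(expected)

print(all(check(n, k) for n in range(8) for k in range(5)))   # True
```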

Okay, that’s it for now! We can, and will, generalize all this stuff to stochastic Petri nets where there are things of many different kinds—not just rabbits. And we’ll see that the master equation we get matches the answer to the puzzle in Part 4. That’s pretty easy. But first, we’ll have a guest post by Jacob Biamonte, who will explain a more realistic example from population biology.

## Geometry Puzzle

11 October, 2010

We’re thinking about solar power over at the Azimuth Project, so Graham Jones wrote a page on solar radiation. This led him to a nice little geometry puzzle.

If you float in space near the Earth, and measure the power density of solar radiation, you’ll get 1366 watts per square meter. But because this radiation hits the Earth at an angle, and not at all during the night, the average global solar power density is a lot less: about 341.5 watts per square meter at the top of the atmosphere. And of this, only about 156 watts per square meter makes it down to the Earth’s surface. From 1366 down to 156: that’s almost an order of magnitude! This is why some people like the idea of space-based solar power.

But when I said “a lot less”, I was concealing a cute and simple fact: the average global solar power density is one quarter of the power density in outer space near Earth’s orbit:

1366/4 = 341.5

Why? Because the area of a sphere is four times the area of its circular shadow!

Anyone who remembers their high-school math can see why this is true. The area of a circle is $\pi r^2$, where $r$ is the radius of the circle. The surface area of a sphere is $4 \pi r^2$, where $r$ is the radius of the sphere. That’s where the factor of 4 comes from.

Cute and simple. But Graham Jones posed a nice followup puzzle: What’s the easiest way to understand this factor of 4? Maybe there’s a way that doesn’t require calculus — just geometry. Maybe with a little work we could just see that factor of 4. That would be really satisfying.

But I don’t know how to “just see it”. So this is not the sort of puzzle where I smile in a superior sort of way and chuckle to myself as you folks struggle to solve it. This is the sort where I’d really like to know the best answer.

But here’s something I do know: we can derive this factor of 4 from a nice but even less obvious fact which I believe was proved by Archimedes.

Take a sphere and slice it with a bunch of parallel planes, like chopping an apple with a cleaver. If two slices have the same thickness, they also have the same surface area!

(When I say “surface area” here, I’m only counting the red skin of the apple slices.)

There’s an interesting cancellation at work here. A slice from near the top or bottom of the sphere will be smaller, but it’s also more “sloped”. The magic fact is that these effects exactly cancel when we compute its surface area.

If you think a bit, you can see this is equivalent to another nice fact:

The surface area of any slice of a sphere matches the surface area of the corresponding slice of the cylinder with the same radius. If you don’t get what I mean, see the picture at Wolfram Mathworld.

And this in turn implies that the surface area of the sphere equals the surface area of the cylinder, not including top and bottom. But that’s the cylinder’s circumference times its height. So we get

$2 \pi r \times 2 r = 4 \pi r^2$

So we get that factor of 4 we wanted.
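If you’d like numerical reassurance about the equal-area fact, here’s a small Python sketch. It treats the sphere as a surface of revolution and adds up thin bands with the midpoint rule. That’s calculus in disguise, so it doesn’t answer the puzzle; it only confirms the numbers. The function name `slice_area` and the number of steps are arbitrary choices for this example.

```python
import math

def slice_area(r, z_low, z_high, steps=10000):
    """Midpoint-rule estimate of the sphere's skin area between heights z_low and z_high."""
    dz = (z_high - z_low) / steps
    total = 0.0
    for i in range(steps):
        z = z_low + (i + 0.5) * dz
        rho = math.sqrt(r * r - z * z)      # radius of the circular cross-section
        slope = -z / rho                     # d(rho)/dz: how "sloped" the skin is
        total += 2 * math.pi * rho * math.sqrt(1 + slope * slope) * dz
    return total

r = 1.0
near_equator = slice_area(r, 0.0, 0.2)
near_pole = slice_area(r, 0.8, 1.0)
print(near_equator, near_pole)                      # both close to 2 * pi * r * 0.2
print(slice_area(r, -r, r) / (math.pi * r * r))     # close to 4
```

The slice near the pole has a smaller radius `rho` but a bigger `slope`, and the two effects cancel in each band, just as described above.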

In fact, Archimedes was so proud of discovering this fact that he put it on his tomb! Cicero later saw this tomb and helped save it from obscurity. He wrote:

But from Dionysius’s own city of Syracuse I will summon up from the dust—where his measuring rod once traced its lines—an obscure little man who lived many years later, Archimedes. When I was questor in Sicily [in 75 BC, 137 years after the death of Archimedes] I managed to track down his grave. The Syracusians knew nothing about it, and indeed denied that any such thing existed. But there it was, completely surrounded and hidden by bushes of brambles and thorns. I remembered having heard of some simple lines of verse which had been inscribed on his tomb, referring to a sphere and cylinder modelled in stone on top of the grave. And so I took a good look round all the numerous tombs that stand beside the Agrigentine Gate. Finally I noted a little column just visible above the scrub: it was surmounted by a sphere and a cylinder. I immediately said to the Syracusans, some of whose leading citizens were with me at the time, that I believed this was the very object I had been looking for. Men were sent in with sickles to clear the site, and when a path to the monument had been opened we walked right up to it. And the verses were still visible, though approximately the second half of each line had been worn away.

I don’t know what the verses said.

It’s been said that the Roman contributions to mathematics were so puny that the biggest was Cicero’s discovery of this tomb. But Archimedes’ result doesn’t by itself give an easy intuitive way to see where that factor of 4 is coming from! You may or may not find it to be a useful clue.

(In fact, “equal surface areas for slices of equal thickness” is a special case of a principle called “Duistermaat-Heckman localization”. For that, try page 23 of chapter 2 of the book by Ginzburg, Guillemin and Karshon. But that’s much fancier stuff than I’m wondering about here, I think.)

## Probability Puzzles (Part 2)

5 September, 2010

Sometimes places become famous not because of what’s there, but because of the good times people have there.

There’s a somewhat historic bar in Singapore called the Colbar. Apparently that’s short for “Colonial Bar”. It’s nothing to look at: pretty primitive, basically a large shed with no air conditioning and a roofed-over patio made of concrete. Its main charm is that it’s “locked in a time warp”. It used to be set in the British army barracks, but it was moved in 2003. According to a food blog:

Thanks to the petitions of Colbar regulars and the subsequent intervention of the Jurong Town Council (JTC), who wanted to preserve its colourful history, Colbar was replicated and relocated just a stone’s throw away from the old site. Built brick by brick and copied to close exact, Colbar reopened its doors last year looking no different from what it used to be.

It’s now in one of the few remaining forested patches of Singapore. The Chinese couple who run it are apparently pretty well-off; they’ve been at it since the place opened in 1953, even before Singapore became a country.

Every Friday, a bunch of philosophers go there to drink beer, play chess, strum guitars and talk. Since my wife teaches in the philosophy department at NUS, we became part of this tradition, and it’s a lot of fun.

Anyway, the last time we went there, one of the philosophers posed this puzzle:

You know a woman who has two children. One day you see her walking by with one. You notice it’s a boy. What’s the probability that both her children are boys?

Of course I instantly thought of the probability puzzles we’ve discussed here. It’s not exactly any of the versions we have already talked about. So I thought you folks might enjoy it.

## Probability Puzzles

24 August, 2010

Today Greg Egan mailed me two puzzles in probability theory: a “simple” one, and a more complicated one that compares Bayesian and frequentist interpretations of probability theory.

Try your hand at the simple one first. Egan wrote:

A few months ago I read about a very simple but fun probability puzzle. Someone tells you:

“I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?”

Please give it a try before moving on. Or at least figure out what this is:

Of course, your first reaction should be “it’s irrelevant the boy was born on a Tuesday”. At least that was my first reaction. So I said:

I’d intuitively assume that the day Tuesday is not relevant, so I’d ignore that information – or else look at some hospital statistics to see if it is relevant. I’d also assume that boy/girl births act just like independently distributed fair coin flips — which is surely false, but I’m guessing the puzzle wants us to assume it’s true. And then I’d say there are 4 equally likely options: BB, BG, GB and GG.

If you tell me “one is a boy”, it’s very different from “the first one is a boy”. If one is a boy, we’re down to 3 equally likely options: BB, BG, and GB. So, the probability of two boys is 1/3.

But that’s not the answer Egan gives:

The usual answer to this puzzle — after people get over an initial intuitive sense that the “Tuesday” can’t possibly be relevant — is that the probability of having two sons is 13/27. If someone has two children, for each there are 14 possibilities as to boy/girl and weekday of birth, so if at least one child is a son born on a Tuesday there are 14 + 14 – 1 = 27 possibilities (subtracting 1 for the doubly-counted intersection, where both children are sons born on a Tuesday), of which 7 + 7 – 1 = 13 involve two sons.

If you find that answer unbelievable, read his essay! He does a good job of making it more intuitive:

• Greg Egan, Some thoughts on Tuesday’s child.
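The counting in Egan’s answer is easy to reproduce by brute force. This Python sketch enumerates all ordered pairs of children under the same idealized assumptions as above: sexes and weekdays independent and uniformly distributed. Labeling Tuesday as day 1 is an arbitrary choice, and the variable names are invented for this example.

```python
from fractions import Fraction
from itertools import product

SEXES = ("boy", "girl")
DAYS = range(7)
TUESDAY = 1      # arbitrary label; any fixed day gives the same counts

children = list(product(SEXES, DAYS))           # 14 equally likely kinds of child
families = list(product(children, repeat=2))    # 196 equally likely ordered pairs

def is_tuesday_boy(child):
    return child == ("boy", TUESDAY)

def both_boys(family):
    return all(sex == "boy" for sex, _ in family)

matching = [f for f in families if any(is_tuesday_boy(c) for c in f)]
two_boys = [f for f in matching if both_boys(f)]
print(len(matching), len(two_boys))              # 27 13
print(Fraction(len(two_boys), len(matching)))    # 13/27

# For comparison: conditioning only on "at least one boy".
some_boy = [f for f in families if any(sex == "boy" for sex, _ in f)]
print(Fraction(sum(both_boys(f) for f in some_boy), len(some_boy)))   # 1/3
```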

But then comes his deeper puzzle, or question:

That’s fine, but as a frequentist if someone asks me to take this probability seriously and start making bets, I will only do so if I can imagine some repetition of the experiment. Suppose someone offered me $81 if the parent had two sons, but I had to pay $54 if they had a son and a daughter. The expected gain from that bet for P(two sons)=13/27 would be $11. If I took up that bet, I would then resolve that in the future I’d only take the same bet again if the person each time had two children and at least one son born specifically on a TUESDAY. In fact, I’d insist on asking the parent myself “Do you have at least one son born on a Tuesday?” rather than having them volunteer the information (since someone with two sons born on different days might not mention the one born on a Tuesday). That way, I’d be sampling a subset of parents all meeting exactly the same conditions, and I’d be satisfied that my long-term expectation of gain really would be $11 per bet.

But I’m curious as to how a Bayesian, who is happier to think of a probability applying to a single event in isolation, would respond to the same situation. It seems to me (perhaps naively) that a Bayesian ought to be happy to take this bet any time, and then forget about what they did in the past — which ought to make them willing to take the bet on future offers even when the day of the week when the son was born changes. After all, P(two sons)=13/27 whatever day is substituted for Tuesday.

However, anyone who agreed to keep taking the bet regardless of the day of the week would lose money! Without pinning down the day to a particular choice, you’re betting on a sample of parents who simply have two children, at least one of whom is a son. That gives P(two sons)=1/3, and the expectation for the $81/$54 bet becomes a $9 loss.

Now, I understand how the difference between P(two sons)=13/27 and P(two sons)=1/3 arises, despite the perfect symmetry between the weekdays; the subsets with “at least one son born on day X” are not disjoint, so even though they are isomorphic, their union will have a different proportion of two-son families than the individual subsets.

What’s puzzling me is this: how does a Bayesian reason about the thought experiment I’ve described, in such a way that they don’t end up taking the bet every time and losing money?
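The betting arithmetic in Egan’s message can be confirmed with exact fractions. This Python sketch assumes, as in his setup, that you win 81 dollars when there are two sons and lose 54 dollars otherwise. The function name `expected_gain` is made up for this example.

```python
from fractions import Fraction

def expected_gain(p_two_sons, win=81, lose=54):
    """Expected payoff of the bet, given the probability of two sons."""
    return p_two_sons * win - (1 - p_two_sons) * lose

print(expected_gain(Fraction(13, 27)))   # 11
print(expected_gain(Fraction(1, 3)))     # -9
```

So the same bet earns 11 dollars on average when the day is pinned down in advance, and loses 9 dollars on average when it isn’t. This only checks the numbers; it doesn’t say how a Bayesian should reason about them.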