A Math Puzzle Coming From Chemistry

23 October, 2011

I posed this puzzle a while back, and nobody solved it. That’s okay—now that I think about it, I’m not sure how to solve it either!

It seems to involve group theory. But instead of working on it, solving it and telling you the answer, I’d rather dump all the clues in your lap, so we can figure it out together.

Suppose we have an ethyl cation. We’ll pretend it looks like this:

As I explained before, it actually doesn’t—not in real life. But never mind! Realism should never stand in the way of a good puzzle.

Continuing on in this unrealistic vein, we’ll pretend that the two black carbon atoms are distinguishable, and so are the five white hydrogen atoms. As you can see, 2 of the hydrogens are bonded to one carbon, and 3 to the other. We don’t care how the hydrogens are arranged, apart from which carbon each hydrogen is attached to. Given this, there are

\displaystyle{ 2 \times \binom{5}{2} = 20 }

ways to arrange the hydrogens. Let’s call these arrangements states.

Now draw a dot for each of these 20 states. Draw an edge connecting two dots whenever you can get from one state to another by having a hydrogen hop from the carbon with 3 hydrogens to the carbon with 2. You’ll get this picture, called the Desargues graph:

The red dots are states where the first carbon has 2 hydrogens attached to it; the blue ones are states where the second carbon has 2 hydrogens attached to it. So, each edge goes between a red and a blue dot. And there are 3 edges coming out of each dot, since there are 3 hydrogens that can make the jump!
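The counting above is easy to check by brute force. Here is a minimal sketch, assuming we label the hydrogens 0 to 4 and the carbons 0 and 1 (these labels, and the function names, are just illustrative choices). It builds the 20 states and the hop transitions, but it does not by itself prove the result is the Desargues graph.

```python
from itertools import combinations

HYDROGENS = frozenset(range(5))

def ethyl_states():
    """All (carbon, pair) states: `carbon` (0 or 1) holds the 2 hydrogens in `pair`."""
    return [(carbon, frozenset(pair))
            for carbon in (0, 1)
            for pair in combinations(sorted(HYDROGENS), 2)]

def ethyl_neighbors(state):
    """States reachable when one hydrogen hops from the 3-hydrogen carbon to the other."""
    carbon, pair = state
    trio = HYDROGENS - pair  # the 3 hydrogens on the other carbon
    # After hydrogen h hops over, the other carbon is left with 2 hydrogens,
    # so it becomes the new "carbon with 2".
    return [(1 - carbon, trio - {h}) for h in sorted(trio)]

ethyl_graph = {s: ethyl_neighbors(s) for s in ethyl_states()}
print(len(ethyl_graph))                              # number of dots
print({len(nbrs) for nbrs in ethyl_graph.values()})  # set of degrees
```

Every edge joins a state with `carbon == 0` to one with `carbon == 1`, which is the red/blue coloring described above.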

Now, the puzzle is to show that you can also get the Desargues graph from a different kind of molecule. Any molecule shaped like this will do:

The 2 balls on top and bottom are called axial, while the 3 around the middle are called equatorial.

There are various molecules like this. For example, phosphorus pentachloride. Let’s use that.

Like the ethyl cation, phosphorus pentachloride also has 20 states… but only if we count them a certain way! We have to treat all 5 chlorines as distinguishable, but think of two arrangements of them as the same if we can rotate one to get the other. Again, I’m not claiming this is physically realistic: it’s just for the sake of the puzzle.

Phosphorus pentachloride has 6 rotational symmetries, since you can turn it around its axis 3 ways, but also flip it over. So, it has

\displaystyle{ \frac{5!}{6} = 20 }

states.

That’s good: exactly the number of dots in the Desargues graph! But how about the edges? We get these from certain transitions between states. These transitions are called pseudorotations, and they look like this:

Phosphorus pentachloride really does this! First the 2 axial guys move towards each other to become equatorial. Beware: now the equatorial ones are no longer in the horizontal plane: they’re in the plane facing us. Then 2 of the 3 equatorial guys swing out to become axial.

To get from one state to another this way, we have to pick 2 of the 3 equatorial guys to swing out and become axial. There are 3 choices here. So, we again get a graph with 20 vertices and 3 edges coming out of each vertex.
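The same kind of brute-force check works here. The sketch below rests on an assumed convention, not anything fixed by the text: positions 0 and 1 are the top and bottom axial sites, positions 2, 3, 4 are the equatorial sites listed counterclockwise as seen from the top, and `PSEUDO` encodes a pseudorotation that keeps the chlorine at position 2 fixed as the pivot. It only counts vertices and edges, so it does not settle the puzzle.

```python
from itertools import permutations

# An arrangement L is a tuple with L[p] = label of the chlorine at position p.
SPIN = (0, 1, 3, 4, 2)    # rotate by 120 degrees about the vertical axis
FLIP = (1, 0, 2, 4, 3)    # rotate by 180 degrees about the line through position 2
PSEUDO = (3, 4, 2, 1, 0)  # assumed convention: old 3, 4 become axial, pivot stays at 2

def act(L, g):
    """Re-read arrangement L through the position map g."""
    return tuple(L[g[p]] for p in range(5))

def state(L):
    """Canonical representative of L under the 6 rotational symmetries."""
    orbit = set()
    for _ in range(3):
        L = act(L, SPIN)
        orbit.add(L)
        orbit.add(act(L, FLIP))
    return min(orbit)

def pcl5_neighbors(s):
    """The states reachable from s by one pseudorotation, one per choice of pivot."""
    out = set()
    L = s
    for _ in range(3):
        L = act(L, SPIN)  # bring each equatorial chlorine to the pivot position in turn
        out.add(state(act(L, PSEUDO)))
    return out

pcl5_states = {state(L) for L in permutations(range(5))}
pcl5_graph = {s: pcl5_neighbors(s) for s in pcl5_states}
print(len(pcl5_graph))                              # number of states
print({len(nbrs) for nbrs in pcl5_graph.values()})  # set of degrees
```

Since the 2 new axial chlorines always come from the 3 old equatorial ones, the axial pair of a neighboring state is disjoint from the axial pair we started with.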

Puzzle. Is this graph the Desargues graph? If so, show it is.

I read in some chemistry papers that it is. But is it really? And if so, why? David Corfield suggested a promising strategy. He pointed out that we just need to get a 1-1 correspondence between

states of the ethyl cation and states of phosphorus pentachloride,

together with a compatible 1-1 correspondence between

transitions of the ethyl cation and transitions of phosphorus pentachloride.

And he suggested that to do this, we should think of the split of hydrogens into a bunch of 2 and a bunch of 3 as analogous to the split of chlorines into a bunch of 2 (the ‘axial’ ones) and a bunch of 3 (the ‘equatorial’ ones).

It’s a promising idea. There’s a problem, though! In the ethyl cation, a single hydrogen hops from the bunch of 3 to the bunch of 2. But in a pseudorotation, two chlorines go from the bunch of 2 to the bunch of 3… and meanwhile, two go back from the bunch of 3 to the bunch of 2.

And if you think about it, there’s another problem too. In the ethyl cation, there are 2 distinguishable carbons. One of them has 3 hydrogens attached, and one doesn’t. But in phosphorus pentachloride it’s not like that. The 3 equatorial chlorines are just that: equatorial. They don’t have 2 choices about how to be that way. Or do they?

Well, there’s more to say, but this should already make it clear that getting ‘natural’ one-to-one correspondences is a bit tricky… if it’s even possible at all!

If you know some group theory, we could try solving the problem using the ideas behind Felix Klein’s ‘Erlangen program’. The group of permutations of 5 things, say S_5, acts as symmetries of either molecule. For the ethyl cation the set of states will be X = S_5/G for some subgroup G. You can think of X as a set of structures of some sort on a 5-element set. The group S_5 acts on X, and the transitions will give an invariant binary relation on X. For phosphorus pentachloride we’ll have some set of states X' = S_5/G' for some other subgroup G', and the transitions will give an invariant relation on X'.

We could start by trying to see if G is the same as G'—or more precisely, conjugate. If they are, that’s a good sign. If not, it’s bad: it probably means there’s no ‘natural’ way to show the graph for phosphorus pentachloride is the Desargues graph.

I could say more, but I’ll stop here. In case you’re wondering, all this is just a trick to get more mathematicians interested in chemistry. A few may then go on to do useful things.

Network Theory (Part 6)

16 April, 2011

Now for the fun part. Let’s see how tricks from quantum theory can be used to describe random processes. I’ll try to make this post self-contained. So, even if you skipped a bunch of the previous ones, this should make sense.

You’ll need to know a bit of math: calculus, a tiny bit of probability theory, and linear operators on vector spaces. You don’t need to know quantum theory, though you’ll have more fun if you do. What we’re doing here is very similar… but also strangely different, for reasons I explained last time.

Rabbits and quantum mechanics

Suppose we have a population of rabbits in a cage and we’d like to describe its growth in a stochastic way, using probability theory. Let \psi_n be the probability of having n rabbits. We can borrow a trick from quantum theory, and summarize all these probabilities in a formal power series like this:

\Psi = \sum_{n = 0}^\infty \psi_n z^n

The variable z doesn’t mean anything in particular, and we don’t care if the power series converges. See, in math ‘formal’ means “it’s only symbols on the page, just follow the rules”. It’s like if someone says a party is ‘formal’, so you need to wear a white tie: you’re not supposed to ask what the tie means.

However, there’s a good reason for this trick. We can define two operators on formal power series, called the annihilation operator:

a \Psi = \frac{d}{d z} \Psi

and the creation operator:

a^\dagger \Psi = z \Psi

They’re just differentiation and multiplication by z, respectively. So, for example, suppose we start out being 100% sure we have n rabbits for some particular number n. Then \psi_n = 1, while all the other probabilities are 0, so:

\Psi = z^n

If we then apply the creation operator, we obtain

a^\dagger \Psi = z^{n+1}

Voilà! One more rabbit!

The annihilation operator is more subtle. If we start out with n rabbits:

\Psi = z^n

and then apply the annihilation operator, we obtain

a \Psi = n z^{n-1}

What does this mean? The z^{n-1} means we have one fewer rabbit than before. But what about the factor of n? It means there were n different ways we could pick a rabbit and make it disappear! This should seem a bit mysterious, for various reasons… but we’ll see how it works soon enough.

The creation and annihilation operators don’t commute:

(a a^\dagger - a^\dagger a) \Psi = \frac{d}{d z} (z \Psi) - z \frac{d}{d z} \Psi = \Psi

so for short we say:

a a^\dagger - a^\dagger a = 1

or even shorter:

[a, a^\dagger] = 1

where the commutator of two operators is

[S,T] = S T - T S

The noncommutativity of operators is often claimed to be a special feature of quantum physics, and the creation and annihilation operators are fundamental to understanding the quantum harmonic oscillator. There, instead of rabbits, we’re studying quanta of energy, which are peculiarly abstract entities obeying rather counterintuitive laws. So, it’s cool that the same math applies to purely classical entities, like rabbits!

In particular, the equation [a, a^\dagger] = 1 just says that there’s one more way to put a rabbit in a cage of rabbits, and then take one out, than to take one out and then put one in.
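To make this concrete, here is a minimal sketch that stores a formal power series as a plain list of coefficients, so `psi[n]` is the coefficient of z^n. The function names are illustrative, not standard. It shows the creation and annihilation operators acting on z^3 and checks the commutation relation on that example.

```python
def create(psi):
    """a-dagger: multiply the series by z (shift every coefficient up one slot)."""
    return [0] + list(psi)

def annihilate(psi):
    """a: differentiate with respect to z."""
    return [n * c for n, c in enumerate(psi)][1:]

def subtract(p, q):
    """Coefficientwise p - q, padding the shorter list with zeros."""
    size = max(len(p), len(q))
    p = list(p) + [0] * (size - len(p))
    q = list(q) + [0] * (size - len(q))
    return [x - y for x, y in zip(p, q)]

def commutator(psi):
    """(a a-dagger - a-dagger a) applied to psi."""
    return subtract(annihilate(create(psi)), create(annihilate(psi)))

psi = [0, 0, 0, 1]      # z^3: we are sure there are 3 rabbits
print(create(psi))      # [0, 0, 0, 0, 1]
print(annihilate(psi))  # [0, 0, 3]
print(commutator(psi))  # [0, 0, 0, 1]
```

The last line gives back z^3 itself: there are 4 ways to add a rabbit and then remove one, but only 3 ways to remove one and then add one.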

But how do we actually use this setup? We want to describe how the probabilities \psi_n change with time, so we write

\Psi(t) = \sum_{n = 0}^\infty \psi_n(t) z^n

Then, we write down an equation describing the rate of change of \Psi:

\frac{d}{d t} \Psi(t) = H \Psi(t)

Here H is an operator called the Hamiltonian, and the equation is called the master equation. The details of the Hamiltonian depend on our problem! But we can often write it down using creation and annihilation operators. Let’s do some examples, and then I’ll tell you the general rule.

Catching rabbits

Last time I told you what happens when we stand in a river and catch fish as they randomly swim past. Let me remind you of how that works. But today let’s use rabbits.

So, suppose an inexhaustible supply of rabbits are randomly roaming around a huge field, and each time a rabbit enters a certain area, we catch it and add it to our population of caged rabbits. Suppose that on average we catch one rabbit per unit time. Suppose the chance of catching a rabbit during any interval of time is independent of what happened before. What is the Hamiltonian describing the probability distribution of caged rabbits, as a function of time?

There’s an obvious dumb guess: the creation operator! However, we saw last time that this doesn’t work, and we saw how to fix it. The right answer is

H = a^\dagger - 1

To see why, suppose for example that at some time t we have n rabbits, so:

\Psi(t) = z^n

Then the master equation says that at this moment,

\frac{d}{d t} \Psi(t) = (a^\dagger - 1) \Psi(t) =  z^{n+1} - z^n

Since \Psi = \sum_{n = 0}^\infty \psi_n(t) z^n, this implies that the coefficients of our formal power series are changing like this:

\frac{d}{d t} \psi_{n+1}(t) = 1
\frac{d}{d t} \psi_{n}(t) = -1

while all the rest have zero derivative at this moment. And that’s exactly right! See, \psi_{n+1}(t) is the probability of having one more rabbit, and this is going up at rate 1. Meanwhile, \psi_n(t) is the probability of having n rabbits, and this is going down at the same rate.
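For a numerical feel for this master equation, here is a rough sketch that integrates it with small Euler steps. Reading off coefficients, H = a^\dagger - 1 says the probability of n rabbits changes at rate \psi_{n-1} - \psi_n. The cutoff `N_MAX`, the step size and the starting state (an empty cage) are all assumptions made for the illustration.

```python
def catch_step(psi, dt):
    """One Euler step of d(psi_n)/dt = psi_{n-1} - psi_n, i.e. H = a-dagger - 1."""
    return [p + dt * ((psi[n - 1] if n > 0 else 0.0) - p)
            for n, p in enumerate(psi)]

def expected_rabbits(psi):
    """Mean number of rabbits for the distribution psi."""
    return sum(n * p for n, p in enumerate(psi))

N_MAX = 60    # truncation (assumption): ignore states with more than 60 rabbits
psi = [0.0] * (N_MAX + 1)
psi[0] = 1.0  # start with an empty cage
dt, steps = 0.001, 2000

means = []
for step in range(steps):
    psi = catch_step(psi, dt)
    if (step + 1) % 500 == 0:
        means.append(expected_rabbits(psi))

print(means)     # expected number of rabbits at t = 0.5, 1.0, 1.5, 2.0
print(sum(psi))  # total probability, which should stay very close to 1
```

The printed means should track the elapsed time, which is at least consistent with the linear growth claimed in the puzzle below.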

Puzzle 1. Show that with this Hamiltonian and any initial conditions, the master equation predicts that the expected number of rabbits grows linearly.

Dying rabbits

Don’t worry: no rabbits are actually injured in the research that Jacob Biamonte is doing here at the Centre for Quantum Technologies. He’s keeping them well cared for in a big room on the 6th floor. This is just a thought experiment.

Suppose a mean nasty guy had a population of rabbits in a cage and didn’t feed them at all. Suppose that each rabbit has a unit probability of dying per unit time. And as always, suppose the probability of this happening in any interval of time is independent of what happens before that time.

What is the Hamiltonian? Again there’s a dumb guess: the annihilation operator! And again this guess is wrong, but it’s not far off. As before, the right answer includes a ‘correction term’:

H = a - N

This time the correction term is famous in its own right. It’s called the number operator:

N = a^\dagger a

The reason is that if we start with n rabbits, and apply this operator, it amounts to multiplication by n:

N z^n = z \frac{d}{d z} z^n = n z^n

Let’s see why this guess is right. Again, suppose that at some particular time t we have n rabbits, so

\Psi(t) = z^n

Then the master equation says that at this time

\frac{d}{d t} \Psi(t) = (a - N) \Psi(t) = n z^{n-1} - n z^n

So, our probabilities are changing like this:

\frac{d}{d t} \psi_{n-1}(t) = n
\frac{d}{d t} \psi_n(t) = -n

while the rest have zero derivative. And this is good! We’re starting with n rabbits, and each has a unit probability per unit time of dying. So, the chance of having one less should be going up at rate n. And the chance of having the same number we started with should be going down at the same rate.

Puzzle 2. Show that with this Hamiltonian and any initial conditions, the master equation predicts that the expected number of rabbits decays exponentially.

Breeding rabbits

Suppose we have a strange breed of rabbits that reproduce asexually. Suppose that each rabbit has a unit probability per unit time of having a baby rabbit, thus effectively duplicating itself.

As you can see from the cryptic picture above, this ‘duplication’ process takes one rabbit as input and has two rabbits as output. So, if you’ve been paying attention, you should be ready with a dumb guess for the Hamiltonian: a^\dagger a^\dagger a. This operator annihilates one rabbit and then creates two!

But you should also suspect that this dumb guess will need a ‘correction term’. And you’re right! As always, the correction term makes the probability of things staying the same go down at exactly the rate that the probability of things changing goes up.

You should guess the correction term… but I’ll just tell you:

H = a^\dagger a^\dagger a - N

We can check this in the usual way, by seeing what it does when we have n rabbits:

H z^n =  z^2 \frac{d}{d z} z^n - n z^n = n z^{n+1} - n z^n

That’s good: since there are n rabbits, the rate of rabbit duplication is n. This is the rate at which the probability of having one more rabbit goes up… and also the rate at which the probability of having n rabbits goes down.

Puzzle 3. Show that with this Hamiltonian and any initial conditions, the master equation predicts that the expected number of rabbits grows exponentially.

Dueling rabbits

Let’s do some stranger examples, just so you can see the general pattern.

Here each pair of rabbits has a unit probability per unit time of fighting a duel with only one survivor. You might guess the Hamiltonian a^\dagger a a, but in fact:

H = a^\dagger a a - N(N-1)

Let’s see why this is right! Let’s see what it does when we have n rabbits:

H z^n = z \frac{d^2}{d z^2} z^n - n(n-1)z^n = n(n-1) z^{n-1} - n(n-1)z^n

That’s good: since there are n(n-1) ordered pairs of rabbits, the rate at which duels take place is n(n-1). This is the rate at which the probability of having one less rabbit goes up… and also the rate at which the probability of having n rabbits goes down.

(If you prefer unordered pairs of rabbits, just divide the Hamiltonian by 2. We should talk about this more, but not now.)

Brawling rabbits

Now each triple of rabbits has a unit probability per unit time of getting into a fight with only one survivor! I don’t know the technical term for a three-way fight, but perhaps it counts as a small ‘brawl’ or ‘melee’. In fact the Wikipedia article for ‘melee’ shows three rabbits in suits of armor, fighting it out:

Now the Hamiltonian is:

H = a^\dagger a^3 - N(N-1)(N-2)

You can check that:

H z^n = n(n-1)(n-2) z^{n-2} - n(n-1)(n-2) z^n

and this is good, because n(n-1)(n-2) is the number of ordered triples of rabbits. You can see how this number shows up from the math, too:

a^3 z^n = \frac{d^3}{d z^3} z^n = n(n-1)(n-2) z^{n-3}

The general rule

Suppose we have a process taking k rabbits as input and having j rabbits as output:

I hope you can guess the Hamiltonian I’ll use for this:

H = {a^{\dagger}}^j a^k - N(N-1) \cdots (N-k+1)

This works because

a^k z^n = \frac{d^k}{d z^k} z^n = n(n-1) \cdots (n-k+1) z^{n-k}

so that if we apply our Hamiltonian to n rabbits, we get

H z^n =  n(n-1) \cdots (n-k+1) (z^{n+j-k} - z^n)

See? As the probability of having n+j-k rabbits goes up, the probability of having n rabbits goes down, at an equal rate. This sort of balance is necessary for H to be a sensible Hamiltonian in this sort of stochastic theory (an ‘infinitesimal stochastic operator’, to be precise). And the rate is exactly the number of ordered k-tuples taken from a collection of n rabbits. This is called the kth falling power of n, and written as follows:
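The balance described here can be seen directly by writing H as a matrix in the basis z^0, z^1, z^2, … A minimal sketch, assuming we truncate to finitely many rabbits (the `size` cutoff and the function names are illustrative): column n of the matrix holds the coefficients of H z^n, and the balance condition says each column sums to zero.

```python
def falling_power(n, k):
    """n (n-1) ... (n-k+1): the number of ordered k-tuples of distinct rabbits."""
    result = 1
    for i in range(k):
        result *= n - i
    return result

def hamiltonian(k, j, size):
    """Matrix of H = (a-dagger)^j a^k - N(N-1)...(N-k+1) on z^0, ..., z^(size-1).

    H[m][n] is the coefficient of z^m in H z^n.
    """
    H = [[0] * size for _ in range(size)]
    for n in range(size):
        rate = falling_power(n, k)
        m = n + j - k
        if 0 <= m < size:  # transitions leaving the truncated range are dropped
            H[m][n] += rate
        H[n][n] -= rate
    return H

# Dying rabbits: k = 1 rabbit in, j = 0 rabbits out, so H = a - N.
for row in hamiltonian(1, 0, 3):
    print(row)
# [0, 1, 0]
# [0, -1, 2]
# [0, 0, -2]
```

For processes that shrink or preserve the population (j at most k) no transition is lost to the cutoff, so every column sums to exactly zero. For growing processes the last few columns lose their positive entry, which is an artifact of truncating, not of the rule itself.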

n^{\underline{k}} = n(n-1) \cdots (n-k+1)

Since we can apply functions to operators as well as numbers, we can write our Hamiltonian as:

H = {a^{\dagger}}^j a^k - N^{\underline{k}}

Kissing rabbits

Let’s do one more example just to test our understanding. This time each pair of rabbits has a unit probability per unit time of bumping into one another, exchanging a friendly kiss and walking off. This shouldn’t affect the rabbit population at all! But let’s follow the rules and see what they say.

According to our rules, the Hamiltonian should be:

H = {a^{\dagger}}^2 a^2 - N(N-1)

However:

{a^{\dagger}}^2 a^2 z^n = z^2 \frac{d^2}{dz^2} z^n = n(n-1) z^n = N(N-1) z^n

and since the monomials z^n form a ‘basis’ for the formal power series, we see that:

{a^{\dagger}}^2 a^2 = N(N-1)

so in fact:

H = 0

That’s good: if the Hamiltonian is zero, the master equation will say

\frac{d}{d t} \Psi(t) = 0

so the population, or more precisely the probability of having any given number of rabbits, will be constant.

There’s another nice little lesson here. Copying the calculation we just did, it’s easy to see that:

{a^{\dagger}}^k a^k = N^{\underline{k}}

This is a cute formula for falling powers of the number operator in terms of annihilation and creation operators. It means that for the general transition we saw before:

we can write the Hamiltonian in two equivalent ways:

H = {a^{\dagger}}^j a^k - N^{\underline{k}} =  {a^{\dagger}}^j a^k - {a^{\dagger}}^k a^k
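As a sanity check on the formula {a^{\dagger}}^k a^k = N^{\underline{k}}, the case k = 2 can also be derived without ever applying anything to z^n. The only input is the commutation relation [a, a^\dagger] = 1, used in the form a^\dagger a = a a^\dagger - 1:

{a^{\dagger}}^2 a^2 = a^\dagger (a^\dagger a) a = a^\dagger (a a^\dagger - 1) a = (a^\dagger a)(a^\dagger a) - a^\dagger a = N^2 - N = N(N-1)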

Okay, that’s it for now! We can, and will, generalize all this stuff to stochastic Petri nets where there are things of many different kinds—not just rabbits. And we’ll see that the master equation we get matches the answer to the puzzle in Part 4. That’s pretty easy. But first, we’ll have a guest post by Jacob Biamonte, who will explain a more realistic example from population biology.

Geometry Puzzle

11 October, 2010

We’re thinking about solar power over at the Azimuth Project, so Graham Jones wrote a page on solar radiation. This led him to a nice little geometry puzzle.

If you float in space near the Earth, and measure the power density of solar radiation, you’ll get 1366 watts per square meter. But because this radiation hits the Earth at an angle, and not at all during the night, the average global solar power density is a lot less: about 341.5 watts per square meter at the top of the atmosphere. And of this, only about 156 watts/meter² makes it down to the Earth’s surface. From 1366 down to 156: that’s almost an order of magnitude! This is why some people like the idea of space-based solar power.

But when I said “a lot less”, I was concealing a cute and simple fact: the average global solar power density is one quarter of the power density in outer space near Earth’s orbit:

1366/4 = 341.5

Why? Because the area of a sphere is four times the area of its circular shadow!

Anyone who remembers their high-school math can see why this is true. The area of a circle is πr², where r is the radius of the circle. The surface area of a sphere is 4πr², where r is the radius of the sphere. That’s where the factor of 4 comes from.

Cute and simple. But Graham Jones posed a nice followup puzzle: What’s the easiest way to understand this factor of 4? Maybe there’s a way that doesn’t require calculus — just geometry. Maybe with a little work we could just see that factor of 4. That would be really satisfying.

But I don’t know how to “just see it”. So this is not the sort of puzzle where I smile in a superior sort of way and chuckle to myself as you folks struggle to solve it. This is the sort where I’d really like to know the best answer.

But here’s something I do know: we can derive this factor of 4 from a nice but even less obvious fact which I believe was proved by Archimedes.

Take a sphere and slice it with a bunch of parallel planes, like chopping an apple with a cleaver. If two slices have the same thickness, they also have the same surface area!

(When I say “surface area” here, I’m only counting the red skin of the apple slices.)

There’s an interesting cancellation at work here. A slice from near the top or bottom of the sphere will be smaller, but it’s also more “sloped”. The magic fact is that these effects exactly cancel when we compute its surface area.

If you think a bit, you can see this is equivalent to another nice fact:

The surface area of any slice of a sphere matches the surface area of the corresponding slice of the cylinder with the same radius. If you don’t get what I mean, see the picture at Wolfram Mathworld.

And this in turn implies that the surface area of the sphere equals the surface area of the cylinder, not including top and bottom. But that’s the cylinder’s circumference times its height. So we get

2πr × 2r = 4πr²

So we get that factor of 4 we wanted.
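The “equal thickness, equal area” fact is easy to test numerically, even if that falls short of seeing it. Here is a rough sketch that approximates the skin of a slice by stacking many thin cone frustums. The function name and the number of pieces are arbitrary choices, and the result is an approximation, not an exact value.

```python
import math

def zone_area(r, z_low, z_high, pieces=2000):
    """Approximate curved area of the sphere slice between heights z_low and z_high."""
    area = 0.0
    dz = (z_high - z_low) / pieces
    for i in range(pieces):
        z1 = z_low + i * dz
        z2 = z_low + (i + 1) * dz
        rho1 = math.sqrt(max(r * r - z1 * z1, 0.0))  # slice radius at height z1
        rho2 = math.sqrt(max(r * r - z2 * z2, 0.0))  # slice radius at height z2
        slant = math.hypot(dz, rho2 - rho1)          # slant height of the thin frustum
        area += math.pi * (rho1 + rho2) * slant      # lateral area of the frustum
    return area

near_equator = zone_area(1.0, 0.0, 0.2)
near_pole = zone_area(1.0, 0.7, 0.9)
cylinder_band = 2 * math.pi * 1.0 * 0.2  # matching band of the cylinder
print(near_equator, near_pole, cylinder_band)
print(zone_area(1.0, -1.0, 1.0), 4 * math.pi)
```

The two slices have the same thickness, 0.2, and their areas should agree with each other and with the cylinder band to many decimal places. The last line compares the whole sphere with 4π.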

In fact, Archimedes was so proud of discovering this fact that he put it on his tomb! Cicero later saw this tomb and helped save it from obscurity. He wrote:

But from Dionysius’s own city of Syracuse I will summon up from the dust—where his measuring rod once traced its lines—an obscure little man who lived many years later, Archimedes. When I was questor in Sicily [in 75 BC, 137 years after the death of Archimedes] I managed to track down his grave. The Syracusians knew nothing about it, and indeed denied that any such thing existed. But there it was, completely surrounded and hidden by bushes of brambles and thorns. I remembered having heard of some simple lines of verse which had been inscribed on his tomb, referring to a sphere and cylinder modelled in stone on top of the grave. And so I took a good look round all the numerous tombs that stand beside the Agrigentine Gate. Finally I noted a little column just visible above the scrub: it was surmounted by a sphere and a cylinder. I immediately said to the Syracusans, some of whose leading citizens were with me at the time, that I believed this was the very object I had been looking for. Men were sent in with sickles to clear the site, and when a path to the monument had been opened we walked right up to it. And the verses were still visible, though approximately the second half of each line had been worn away.

I don’t know what the verses said.

It’s been said that the Roman contributions to mathematics were so puny that the biggest was Cicero’s discovery of this tomb. But Archimedes’ result doesn’t by itself give an easy intuitive way to see where that factor of 4 is coming from! You may or may not find it to be a useful clue.

(In fact, “equal surface areas for slices of equal thickness” is a special case of a principle called “Duistermaat-Heckman localization”. For that, try page 23 of chapter 2 of the book by Ginzburg, Guillemin and Karshon. But that’s much fancier stuff than I’m wondering about here, I think.)

Probability Puzzles (Part 2)

5 September, 2010

Sometimes places become famous not because of what’s there, but because of the good times people have there.

There’s a somewhat historic bar in Singapore called the Colbar. Apparently that’s short for “Colonial Bar”. It’s nothing to look at: pretty primitive, basically a large shed with no air conditioning and a roofed-over patio made of concrete. Its main charm is that it’s “locked in a time warp”. It used to be set in the British army barracks, but it was moved in 2003. According to a food blog:

Thanks to the petitions of Colbar regulars and the subsequent intervention of the Jurong Town Council (JTC), who wanted to preserve its colourful history, Colbar was replicated and relocated just a stone’s throw away from the old site. Built brick by brick and copied to close exact, Colbar reopened its doors last year looking no different from what it used to be.

It’s now in one of the few remaining forested patches of Singapore. The Chinese couple who run it are apparently pretty well-off; they’ve been at it since the place opened in 1953, even before Singapore became a country.

Every Friday, a bunch of philosophers go there to drink beer, play chess, strum guitars and talk. Since my wife teaches in the philosophy department at NUS, we became part of this tradition, and it’s a lot of fun.

Anyway, the last time we went there, one of the philosophers posed this puzzle:

You know a woman who has two children. One day you see her walking by with one. You notice it’s a boy. What’s the probability that both her children are boys?

Of course I instantly thought of the probability puzzles we’ve discussed here. It’s not exactly any of the versions we have already talked about. So I thought you folks might enjoy it.

What’s the answer?

Probability Puzzles (Part 1)

24 August, 2010

Today Greg Egan mailed me two puzzles in probability theory: a “simple” one, and a more complicated one that compares Bayesian and frequentist interpretations of probability theory.

Try your hand at the simple one first. Egan wrote:

A few months ago I read about a very simple but fun probability puzzle. Someone tells you:

“I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?”

Please give it a try before moving on. Or at least figure out what this is:

Of course, your first reaction should be “it’s irrelevant that the boy was born on a Tuesday”. At least that was my first reaction. So I said:

I’d intuitively assume that the day Tuesday is not relevant, so I’d ignore that information – or else look at some hospital statistics to see if it is relevant. I’d also assume that boy/girl births act just like independently distributed fair coin flips — which is surely false, but I’m guessing the puzzle wants us to assume it’s true. And then I’d say there are 4 equally likely options: BB, BG, GB and GG.

If you tell me “one is a boy”, it’s very different from “the first one is a boy”. If one is a boy, we’re down to 3 equally likely options: BB, BG, and GB. So, the probability of two boys is 1/3.

But that’s not the answer Egan gives:

The usual answer to this puzzle — after people get over an initial intuitive sense that the “Tuesday” can’t possibly be relevant — is that the probability of having two sons is 13/27. If someone has two children, for each there are 14 possibilities as to boy/girl and weekday of birth, so if at least one child is a son born on a Tuesday there are 14 + 14 – 1 = 27 possibilities (subtracting 1 for the doubly-counted intersection, where both children are sons born on a Tuesday), of which 7 + 7 – 1 = 13 involve two sons.
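Both counts can be checked by listing every case. Here is a minimal sketch under the usual idealization in the puzzle: each child is equally likely to be a boy or a girl and equally likely to be born on any of the 7 weekdays, independently. Taking weekday number 1 to stand for Tuesday is an arbitrary labeling.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 1  # arbitrary label for one of the 7 weekdays
children = list(product("BG", range(7)))      # 14 kinds of child: (sex, weekday)
families = list(product(children, repeat=2))  # 196 equally likely ordered pairs

def prob_two_boys(condition):
    """P(two boys | condition), by counting equally likely families."""
    kept = [f for f in families if condition(f)]
    both = [f for f in kept if all(sex == "B" for sex, _ in f)]
    return Fraction(len(both), len(kept))

def at_least_one_boy(family):
    return any(sex == "B" for sex, _ in family)

def at_least_one_tuesday_boy(family):
    return any(child == ("B", TUESDAY) for child in family)

print(prob_two_boys(at_least_one_boy))          # 1/3
print(prob_two_boys(at_least_one_tuesday_boy))  # 13/27
```

The second condition keeps 27 of the 196 families, 13 of which have two sons, matching the count in the quoted answer.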

If you find that answer unbelievable, read his essay! He does a good job of making it more intuitive:

• Greg Egan, Some thoughts on Tuesday’s child.

But then comes his deeper puzzle, or question:

That’s fine, but as a frequentist if someone asks me to take this probability seriously and start making bets, I will only do so if I can imagine some repetition of the experiment. Suppose someone offered me $81 if the parent had two sons, but I had to pay $54 if they had a son and a daughter. The expected gain from that bet for P(two sons)=13/27 would be $11.

If I took up that bet, I would then resolve that in the future I’d only take the same bet again if the person each time had two children and at least one son born specifically on a TUESDAY. In fact, I’d insist on asking the parent myself “Do you have at least one son born on a Tuesday?” rather than having them volunteer the information (since someone with two sons born on different days might not mention the one born on a Tuesday). That way, I’d be sampling a subset of parents all meeting exactly the same conditions, and I’d be satisfied that my long-term expectation of gain really would be $11 per bet.

But I’m curious as to how a Bayesian, who is happier to think of a probability applying to a single event in isolation, would respond to the same situation. It seems to me (perhaps naively) that a Bayesian ought to be happy to take this bet any time, and then forget about what they did in the past — which ought to make them willing to take the bet on future offers even when the day of the week when the son was born changes. After all, P(two sons)=13/27 whatever day is substituted for Tuesday.

However, anyone who agreed to keep taking the bet regardless of the day of the week would lose money! Without pinning down the day to a particular choice, you’re betting on a sample of parents who simply have two children, at least one of whom is a son. That gives P(two sons)=1/3, and the expectation for the $81/$54 bet becomes a $9 loss.

Now, I understand how the difference between P(two sons)=13/27 and P(two sons)=1/3 arises, despite the perfect symmetry between the weekdays; the subsets with “at least one son born on day X” are not disjoint, so even though they are isomorphic, their union will have a different proportion of two-son families than the individual subsets.

What’s puzzling me is this: how does a Bayesian reason about the thought experiment I’ve described, in such a way that they don’t end up taking the bet every time and losing money?
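The two expected values in Egan’s betting argument are quick to reproduce. This small sketch assumes the bet works exactly as described: you win $81 if the parent has two sons and pay $54 otherwise, where “otherwise” can only mean a son and a daughter, since at least one son is guaranteed. The function name is illustrative.

```python
from fractions import Fraction

def expected_gain(p_two_sons, win=81, loss=54):
    """Expected gain in dollars from one bet, given the probability of two sons."""
    return p_two_sons * win - (1 - p_two_sons) * loss

print(expected_gain(Fraction(13, 27)))  # 11
print(expected_gain(Fraction(1, 3)))    # -9
```

So the same bet goes from a gain of $11 to a loss of $9 depending only on which population of parents is being sampled.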