Today Greg Egan mailed me two puzzles in probability theory: a “simple” one, and a more complicated one that compares Bayesian and frequentist interpretations of probability theory.
Try your hand at the simple one first. Egan wrote:
A few months ago I read about a very simple but fun probability puzzle. Someone tells you:
“I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?”
Please give it a try before moving on.
Of course, your first reaction should be “it’s irrelevant that the boy was born on a Tuesday”. At least that was my first reaction. So I said:
I’d intuitively assume that the day Tuesday is not relevant, so I’d ignore that information – or else look at some hospital statistics to see if it is relevant. I’d also assume that boy/girl births act just like independently distributed fair coin flips — which is surely false, but I’m guessing the puzzle wants us to assume it’s true. And then I’d say there are 4 equally likely options: BB, BG, GB and GG.
If you tell me “one is a boy”, it’s very different from “the first one is a boy”. If one is a boy, we’re down to 3 equally likely options: BB, BG, and GB. So, the probability of two boys is 1/3.
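If you want to check that count by brute force, here is a minimal Python sketch. It assumes, as above, that boys and girls are equally likely and independent; the function names are just made up for illustration.

```python
from fractions import Fraction
from itertools import product

# All equally likely two-child families as (older, younger): BB, BG, GB, GG.
# Assumption: each child is a boy or a girl with probability 1/2, independently.
FAMILIES = list(product("BG", repeat=2))


def p_two_boys_given_one_is_a_boy():
    """P(two boys | at least one child is a boy)."""
    with_a_boy = [f for f in FAMILIES if "B" in f]
    two_boys = [f for f in with_a_boy if f == ("B", "B")]
    return Fraction(len(two_boys), len(with_a_boy))


def p_two_boys_given_first_is_a_boy():
    """P(two boys | the first child is a boy), for comparison."""
    first_is_boy = [f for f in FAMILIES if f[0] == "B"]
    two_boys = [f for f in first_is_boy if f == ("B", "B")]
    return Fraction(len(two_boys), len(first_is_boy))


print(p_two_boys_given_one_is_a_boy())    # 1/3
print(p_two_boys_given_first_is_a_boy())  # 1/2
```

The two functions differ only in the condition used to filter the families, which is exactly the difference between “one is a boy” and “the first one is a boy”.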
But that’s not the answer Egan gives:
The usual answer to this puzzle — after people get over an initial intuitive sense that the “Tuesday” can’t possibly be relevant — is that the probability of having two sons is 13/27. If someone has two children, for each there are 14 possibilities as to boy/girl and weekday of birth, so if at least one child is a son born on a Tuesday there are 14 + 14 – 1 = 27 possibilities (subtracting 1 for the doubly-counted intersection, where both children are sons born on a Tuesday), of which 7 + 7 – 1 = 13 involve two sons.
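The counting in that argument is easy to reproduce by listing all 14 × 14 = 196 families. The sketch below assumes sex and weekday are uniform and independent, and labels Tuesday as day 1; that label and the function name are arbitrary choices of mine, not anything from Egan's essay.

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)
TUESDAY = 1  # arbitrary label; any of the 7 days gives the same count

# Each child is one of 14 equally likely (sex, weekday) pairs.
CHILDREN = list(product("BG", DAYS))
# Each two-child family is an ordered pair of children: 196 cases.
FAMILIES = list(product(CHILDREN, repeat=2))


def tuesday_boy_counts(day=TUESDAY):
    """Count families with at least one boy born on `day`, and how many
    of those have two boys. Returns (matching, two_boys, probability)."""
    matching = [f for f in FAMILIES if ("B", day) in f]
    two_boys = [f for f in matching if all(sex == "B" for sex, _ in f)]
    return len(matching), len(two_boys), Fraction(len(two_boys), len(matching))


matching, two_boys, prob = tuesday_boy_counts()
print(matching, two_boys, prob)  # 27 13 13/27
```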
If you find that answer unbelievable, read his essay! He does a good job of making it more intuitive:
• Greg Egan, Some thoughts on Tuesday’s child.
But then comes his deeper puzzle, or question:
That’s fine, but as a frequentist if someone asks me to take this probability seriously and start making bets, I will only do so if I can imagine some repetition of the experiment. Suppose someone offered me $81 if the parent had two sons, but I had to pay $54 if they had a son and a daughter. The expected gain from that bet for P(two sons)=13/27 would be $11.
If I took up that bet, I would then resolve that in the future I’d only take the same bet again if the person each time had two children and at least one son born specifically on a TUESDAY. In fact, I’d insist on asking the parent myself “Do you have at least one son born on a Tuesday?” rather than having them volunteer the information (since someone with two sons born on different days might not mention the one born on a Tuesday). That way, I’d be sampling a subset of parents all meeting exactly the same conditions, and I’d be satisfied that my long-term expectation of gain really would be $11 per bet.
But I’m curious as to how a Bayesian, who is happier to think of a probability applying to a single event in isolation, would respond to the same situation. It seems to me (perhaps naively) that a Bayesian ought to be happy to take this bet any time, and then forget about what they did in the past — which ought to make them willing to take the bet on future offers even when the day of the week when the son was born changes. After all, P(two sons)=13/27 whatever day is substituted for Tuesday.
However, anyone who agreed to keep taking the bet regardless of the day of the week would lose money! Without pinning down the day to a particular choice, you’re betting on a sample of parents who simply have two children, at least one of whom is a son. That gives P(two sons)=1/3, and the expectation for the $81/$54 bet becomes a $9 loss.
Now, I understand how the difference between P(two sons)=13/27 and P(two sons)=1/3 arises, despite the perfect symmetry between the weekdays; the subsets with “at least one son born on day X” are not disjoint, so even though they are isomorphic, their union will have a different proportion of two-son families than the individual subsets.
What’s puzzling me is this: how does a Bayesian reason about the thought experiment I’ve described, in such a way that they don’t end up taking the bet every time and losing money?
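The arithmetic in Egan's thought experiment can also be checked by enumeration. The following is only a sketch of the two betting policies he describes, under the same uniform, independent assumptions as the puzzle; it reproduces the numbers but does not answer the question of how a Bayesian should reason. The names `accept`, `p_two_sons` and `expected_gain` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import product

WIN, LOSS = 81, 54  # dollars won on two sons, paid on a son and a daughter
DAYS = range(7)

CHILDREN = list(product("BG", DAYS))          # 14 (sex, weekday) pairs
FAMILIES = list(product(CHILDREN, repeat=2))  # 196 two-child families


def p_two_sons(accept):
    """P(two sons) among the families for which the bettor takes the bet.
    `accept` is a predicate on a family."""
    taken = [f for f in FAMILIES if accept(f)]
    wins = [f for f in taken if all(sex == "B" for sex, _ in f)]
    return Fraction(len(wins), len(taken))


def expected_gain(p):
    """Expected gain per bet when two sons occur with probability p."""
    return p * WIN - (1 - p) * LOSS


# Policy 1: only bet when there is at least one son born on one fixed day.
fixed_day = p_two_sons(lambda f: ("B", 1) in f)

# Policy 2: bet whenever there is a son born on some day, whichever day it is.
# This is the union of the seven overlapping "son born on day X" subsets.
any_day = p_two_sons(lambda f: any(("B", d) in f for d in DAYS))

print(fixed_day, expected_gain(fixed_day))  # 13/27 11
print(any_day, expected_gain(any_day))      # 1/3 -9
```

The second policy accepts every family with at least one son, because every son was born on some day, so the day no longer filters anything.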