## Shelves and the Infinite

6 May, 2016

A shelf is a set with a binary operation $\rhd$ that distributes over itself:

$a \rhd (b \rhd c) = (a \rhd b) \rhd (a \rhd c)$

There are lots of examples, the simplest being any group, where we define

$g \rhd h = g h g^{-1}$

They have a nice connection to knot theory, which you can see here if you think hard:

My former student Alissa Crans, who invented the term ‘shelf’, has written a lot about them, starting here:

• Alissa Crans, Lie 2-Algebras, Chapter 3.1: Shelves, Racks, Spindles and Quandles, Ph.D. thesis, U.C. Riverside, 2004.

I could tell you a long and entertaining story about this, including the tale of how shelves got their name. But instead I want to talk about something far more peculiar, which I understand much less well. There’s a strange connection between shelves, extremely large infinities, and extremely large finite numbers! It was first noticed by a logician named Richard Laver in the late 1980s, and it’s been developed further by Randall Dougherty.

It goes like this. For each $n,$ there’s a unique shelf structure on the numbers $\{1,2, \dots ,2^n\}$ such that

$a \rhd 1 = a + 1 \bmod 2^n$

So, the elements of our shelf are

$1$

$1 \rhd 1 = 2$

$2 \rhd 1 = 3$

and so on, until we get to

$2^n \rhd 1 = 1$

However, we can now calculate

$1 \rhd 1$

$1 \rhd 2$

$1 \rhd 3$

and so on. You should try it yourself for a simple example! You’ll need to use the self-distributive law. It’s not hard to do.

You’ll get a list of $2^n$ numbers, but this list will not contain all the numbers $\{1, 2, \dots, 2^n\}.$ Instead, it will repeat with some period $P(n).$

Here is where things get weird. The numbers $P(n)$ form this sequence:

1, 1, 2, 4, 4, 8, 8, 8, 8, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, …

It may not look like it, but the numbers in this sequence approach infinity!

if we assume an extra axiom, which goes beyond the usual axioms of set theory, but so far seems consistent!

This axiom asserts the existence of an absurdly large cardinal, called an I3 rank-into-rank cardinal.

On the other hand, Randall Dougherty has proved a lower bound on how far you have to go out in this sequence to reach the number 32.

And, it’s an incomprehensibly large number!

The third Ackermann function $A_3(n)$ is roughly 2 to the $n$th power. The fourth Ackermann function $A_4(n)$ is roughly 2 raised to itself $n$ times:

$2^{2^{2^{.^{.} }}}$

And so on: each Ackermann function is defined by iterating the previous one.

Dougherty showed that for the sequence $P(n)$ to reach 32, you have to go at least

$n = A(9,A(8,A(8,255)))$

This is an insanely large number!

I should emphasize that if we use just the ordinary axioms of set theory, the ZFC axioms, nobody has proved that the sequence $P(n)$ ever reaches 32. Neither is it known that this is unprovable if we only use ZFC.

I should admit that my definition of the Ackermann function is rough. In reality it’s defined like this:

$A(m, n) = \begin{cases} n+1 & \mbox{if } m = 0 \\ A(m-1, 1) & \mbox{if } m > 0 \mbox{ and } n = 0 \\ A(m-1, A(m, n-1)) & \mbox{if } m > 0 \mbox{ and } n > 0. \end{cases}$

And if you work this out, you’ll find it’s a bit annoying. Somehow the number 3 sneaks in:

$A(2,n) = 2 + (n+3) - 3$

$A(3,n) = 2 \cdot (n+3) - 3$

$A(4,n) = 2^{n+3} - 3$

$A(5,n) = 2\uparrow\uparrow(n+3) - 3$

where $a \uparrow\uparrow b$ means $a$ raised to itself $b$ times,

$A(6,n) = 2 \uparrow\uparrow\uparrow(n+3) - 3$

where $a \uparrow\uparrow\uparrow b$ means $a \uparrow\uparrow (a \uparrow\uparrow (a \uparrow\uparrow \cdots ))$ with the number $a$ repeated $b$ times, and so on.

However, these irritating 3’s scarcely matter, since Dougherty’s number is so large… and I believe he could have gotten an even larger upper bound if he wanted.

Perhaps I’ll wrap up by saying very roughly what an I3 rank-into-rank cardinal is.

In set theory the universe of all sets is built up in stages. These stages are called the von Neumann hierarchy. The lowest stage has nothing in it:

$V_0 = \emptyset$

Each successive stage is defined like this:

$V_{\lambda + 1} = P(V_\lambda)$

where $P(S)$ is the the power set of $S,$ that is, the set of all subsets of $S.$ For ‘limit ordinals’, that is, ordinals that aren’t of the form $\lambda + 1,$ we define

$\displaystyle{ V_\lambda = \bigcup_{\alpha < \lambda} V_\lambda }$

An I3 rank-into-rank cardinal is an ordinal $\lambda$ such that $V_\lambda$ admits a nontrivial elementary embedding into itself.

Very roughly, this means the infinity $\lambda$ is so huge that the collection of sets that can be built by this stage can mapped into itself, in a one-to-one but not onto way, into a smaller collection that’s indistinguishable from the original one when it comes to the validity of anything you can say about sets!

More precisely, a nontrivial elementary embedding of $V_\lambda$ into itself is a one-to-one but not onto function

$f: V_\lambda \to V_\lambda$

that preserves and reflects the validity of all statements in the language of set theory. That is: for any sentence $\phi(a_1, \dots, a_n)$ in the language of set theory, this statement holds for sets $a_1, \dots, a_n \in V_\lambda$ if and only if $\phi(f(a_1), \dots, f(a_n))$ holds.

I don’t know why, but an I3 rank-into-rank cardinal, if it’s even consistent to assume one exists, is known to be extraordinarily big. What I mean by this is that it automatically has a lot of other properties known to characterize large cardinals. It’s inaccessible (which is big) and ineffable (which is bigger), and measurable (which is bigger), and huge (which is even bigger), and so on.

How in the world is this related to shelves?

The point is that if

$f, g : V_\lambda \to V_\lambda$

are elementary embeddings, we can apply $f$ to any set in $V_\lambda.$ But in set theory, functions are sets too: sets of ordered pairs So, $g$ is a set. It’s not an element of $V_\lambda,$ but all its subsets $g \cap V_\alpha$ are, where $\alpha < \lambda.$ So, we can define

$f \rhd g = \bigcup_{\alpha < \lambda} f (g \cap V_\alpha)$

Laver showed that this operation distributes over itself:

$f \rhd (g \rhd h) = (f \rhd g) \rhd (f \rhd h)$

And, he showed that if we take one elementary embedding and let it generate a shelf by this this operation, we get the free shelf on one generator!

The shelf I started out describing, the numbers $\{1, \dots, 2^n \}$ with

$a \rhd 1 = a + 1 \bmod 2^n$

also has one generator namely the number 1. So, it’s a quotient of the free shelf on one generator by one relation, namely the above equation.

That’s about all I understand. I don’t understand how the existence of a nontrivial elementary embedding of $V_\lambda$ into itself implies that the function $P(n)$ goes to infinity, and I don’t understand Randall Dougherty’s lower bound on how far you need to go to reach $P(n) = 32.$ For more, read these:

• Richard Laver, The left distributive law and the freeness of an algebra of elementary embeddings, Adv. Math. 91 (1992), 209–231.

• Richard Laver, On the algebra of elementary embeddings of a rank into itself, Adv. Math. 110 (1995), 334–346.

• Randall Dougherty, Critical points in an algebra of elementary embeddings.

## Bleaching of the Great Barrier Reef

22 April, 2016

The chatter of gossip distracts us from the really big story, the Anthropocene: the new geological era we are bringing about. Here’s something that should be dominating the headlines: Most of the Great Barrier Reef, the world’s largest coral reef system, now looks like a ghostly graveyard.

Most corals are colonies of tiny genetically identical animals called polyps. Over centuries, their skeletons build up reefs, which are havens for many kinds of sea life. Some polyps catch their own food using stingers. But most get their food by symbiosis! They cooperate with single-celled organism called zooxanthellae. Zooxanthellae get energy from the sun’s light. They actually live inside the polyps, and provide them with food. Most of the color of a coral reef comes from these zooxanthellae.

When a polyp is stressed, the zooxanthellae living inside it may decide to leave. This can happen when the sea water gets too hot. Without its zooxanthellae, the polyp is transparent and the coral’s white skeleton is revealed—as you see here. We say the coral is bleached.

After they bleach, the polyps begin to starve. If conditions return to normal fast enough, the zooxanthellae may come back. If they don’t, the coral will die.

The Great Barrier Reef, off the northeast coast of Australia, contains over 2,900 reefs and 900 islands. It’s huge: 2,300 kilometers long, with an area of about 340,000 square kilometers. It can be seen from outer space!

With global warming, this reef has been starting to bleach. Parts of it bleached in 1998 and again in 2002. But this year, with a big El Niño pushing world temperatures to new record highs, is the worst.

Scientists have being flying over the Great Barrier Reef to study the damage, and divers have looked at some of the reefs in detail. Of the 522 reefs surveyed in the northern sector, over 80% are severely bleached and less than 1% are not bleached at all. The damage is less further south where the water is cooler—but most of the reefs are in the north:

The top expert on coral reefs in Australia, Terry Hughes, wrote:

I showed the results of aerial surveys of bleaching on the Great Barrier Reef to my students. And then we wept.

Imagine devoting your life to studying and trying to protect coral reefs, and then seeing this.

Some of the bleached reefs may recover. But as oceans continue to warm, the prospects look bleak. The last big El Niño was in 1998. With a lot of hard followup work, scientists showed that in the end, 16% of the world’s corals died in that event.

This year is quite a bit hotter.

So, global warming is not a problem for the future: it’s a problem now. It’s not good enough to cut carbon emissions eventually. We’ve got to get serious now.

I need to recommit myself to this. For example, I need to stop flying around to conferences. I’ve cut back, but I need to do much better. Future generations, living in the damaged world we’re creating, will not have much sympathy for our excuses.

## Statistical Laws of Darwinian Evolution

18 April, 2016

guest post by Matteo Smerlak

Biologists like Steven J. Gould like to emphasize that evolution is unpredictable. They have a point: there is absolutely no way an alien visiting the Earth 400 million years ago could have said:

Hey, I know what’s gonna happen here. Some descendants of those ugly fish will grow wings and start flying in the air. Others will walk the surface of the Earth for a few million years, but they’ll get bored and they’ll eventually go back to the oceans; when they do, they’ll be able to chat across thousands of kilometers using ultrasound. Yet others will grow arms, legs, fur, they’ll climb trees and invent BBQ, and, sooner or later, they’ll start wondering “why all this?”.

Nor can we tell if, a week from now, the flu virus will mutate, become highly pathogenic and forever remove the furry creatures from the surface of the Earth.

Evolution isn’t gravity—we can’t tell in which directions things will fall down.

One reason we can’t predict the outcomes of evolution is that genomes evolve in a super-high dimensional combinatorial space, which a ginormous number of possible turns at every step. Another is that living organisms interact with one another in a massively non-linear way, with, feedback loops, tipping points and all that jazz.

Life’s a mess, if you want my physicist’s opinion.

But that doesn’t mean that nothing can be predicted. Think of statistics. Nobody can predict who I’ll vote for in the next election, but it’s easy to tell what the distribution of votes in the country will be like. Thus, for continuous variables which arise as sums of large numbers of independent components, the central limit theorem tells us that the distribution will always be approximately normal. Or take extreme events: the max of $N$ independent random variables is distributed according to a member of a one-parameter family of so-called “extreme value distributions”: this is the content of the famous Fisher–Tippett–Gnedenko theorem.

So this is the problem I want to think about in this blog post: is evolution ruled by statistical laws? Or, in physics terms: does it exhibit some form of universality?

### Fitness distributions are the thing

One lesson from statistical physics is that, to uncover universality, you need to focus on relevant variables. In the case of evolution, it was Darwin’s main contribution to figure out the main relevant variable: the average number of viable offspring, aka fitness, of an organism. Other features—physical strength, metabolic efficiency, you name it—matter only insofar as they are correlated with fitness. If we further assume that fitness is (approximately) heritable, meaning that descendants have the same fitness as their ancestors, we get a simple yet powerful dynamical principle called natural selection: in a given population, the lineage with the highest fitness eventually dominates, i.e. its fraction goes to one over time. This principle is very general: it applies to genes and species, but also to non-living entities such as algorithms, firms or language. The general relevance of natural selection as a evolutionary force is sometimes referred to as “Universal Darwinism”.

The general idea of natural selection is pictured below (reproduced from this paper):

It’s not hard to write down an equation which expresses natural selection in general terms. Consider an infinite population in which each lineage grows with some rate $x$. (This rate is called the log-fitness or Malthusian fitness to contrast it with the number of viable offspring $w=e^{x\Delta t}$ with $\Delta t$ the lifetime of a generation. It’s more convenient to use $x$ than $w$ in what follows, so we’ll just call $x$ “fitness”). Then the distribution of fitness at time $t$ satisfies the equation

$\displaystyle{ \frac{\partial p_t(x)}{\partial t} =\left(x-\int d y\, y\, p_t(y)\right)p_t(x) }$

whose explicit solution in terms of the initial fitness distribution $p_0(x):$

$\displaystyle{ p_t(x)=\frac{e^{x t}p_0(x)}{\int d y\, e^{y t}p_0(y)} }$

is called the Cramér transform of $p_0(x)$ in large deviations theory. That is, viewed as a flow in the space of probability distributions, natural selection is nothing but a time-dependent exponential tilt. (These equations and the results below can be generalized to include the effect of mutations, which are critical to maintain variation in the population, but we’ll skip this here to focus on pure natural selection. See my paper referenced below for more information.)

An immediate consequence of these equations is that the mean fitness $\mu_t=\int dx\, x\, p_t(x)$ grows monotonically in time, with a rate of growth given by the variance $\sigma_t^2=\int dx\, (x-\mu_t)^2\, p_t(x)$:

$\displaystyle{ \frac{d\mu_t}{dt}=\sigma_t^2\geq 0 }$

The great geneticist Ronald Fisher (yes, the one in the extreme value theorem!) was very impressed with this relationship. He thought it amounted to an biological version of the second law of thermodynamics, writing in his 1930 monograph

Professor Eddington has recently remarked that “The law that entropy always increases—the second law of thermodynamics—holds, I think, the supreme position among the laws of nature”. It is not a little instructive that so similar a law should hold the supreme position among the biological sciences.

Unfortunately, this excitement hasn’t been shared by the biological community, notably because this Fisher “fundamental theorem of natural selection” isn’t predictive: the mean fitness $\mu_t$ grows according to the fitness variance $\sigma_t^2$, but what determines the evolution of $\sigma_t^2$? I can’t use the identity above to predict the speed of evolution in any sense. Geneticists say it’s “dynamically insufficient”.

### Two limit theorems

But the situation isn’t as bad as it looks. The evolution of $p_t(x)$ may be decomposed into the evolution of its mean $\mu_t$, of its variance $\sigma_t^2$, and of its shape or type

$\overline{p}_t(x)=\sigma_t p_t(\sigma_t x+\mu_t)$.

(We also call $\overline{p}_t(x)$ the “standardized fitness distribution”.) With Ahmed Youssef we showed that:

• If $p_0(x)$ is supported on the whole real line and decays at infinity as

$-\ln\int_x^{\infty}p_0(y)d y\underset{x\to\infty}{\sim} x^{\alpha}$

for some $\alpha > 1$, then $\mu_t\sim t^{\overline{\alpha}-1}$, $\sigma_t^2\sim t^{\overline{\alpha}-2}$ and $\overline{p}_t(x)$ converges to the standard normal distribution as $t\to\infty$. Here $\overline{\alpha}$ is the conjugate exponent to $\alpha$, i.e. $1/\overline{\alpha}+1/\alpha=1$.

• If $p_0(x)$ has a finite right-end point $x_+$ with

$p(x)\underset{x\to x_+}{\sim} (x_+-x)^\beta$

for some $\beta\geq0$, then $x_+-\mu_t\sim t^{-1}$, $\sigma_t^2\sim t^{-2}$ and $\overline{p}_t(x)$ converges to the flipped gamma distribution

$\displaystyle{ p^*_\beta(x)= \frac{(1+\beta)^{(1+\beta)/2}}{\Gamma(1+\beta)} \Theta[x-(1+\beta)^{1/2}] }$

$\displaystyle { e^{-(1+\beta)^{1/2}[(1+\beta)^{1/2}-x]}\Big[(1+\beta)^{1/2}-x\Big]^\beta }$

Here and below the symbol $\sim$ means “asymptotically equivalent up to a positive multiplicative constant”; $\Theta(x)$ is the Heaviside step function. Note that $p^*_\beta(x)$ becomes Gaussian in the limit $\beta\to\infty$, i.e. the attractors of cases 1 and 2 form a continuous line in the space of probability distributions; the other extreme case, $\beta\to0$, corresponds to a flipped exponential distribution.

The one-parameter family of attractors $p_\beta^*(x)$ is plotted below:

These results achieve two things. First, they resolve the dynamical insufficiency of Fisher’s fundamental theorem by giving estimates of the speed of evolution in terms of the tail behavior of the initial fitness distribution. Second, they show that natural selection is indeed subject to a form of universality, whereby the relevant statistical structure turns out to be finite dimensional, with only a handful of “conserved quantities” (the $\alpha$ and $\beta$ exponents) controlling the late-time behavior of natural selection. This amounts to a large reduction in complexity and, concomitantly, an enhancement of predictive power.

(For the mathematically-oriented reader, the proof of the theorems above involves two steps: first, translate the selection equation into a equation for (cumulant) generating functions; second, use a suitable Tauberian theorem—the Kasahara theorem—to relate the behavior of generating functions at large values of their arguments to the tail behavior of $p_0(x)$. Details in our paper.)

It’s useful to consider the convergence of fitness distributions to the attractors $p_\beta^*(x)$ for $0\leq\beta\leq \infty$ in the skewness-kurtosis plane, i.e. in terms of the third and fourth cumulants of $p_t(x)$.

The red curve is the family of attractors, with the normal at the bottom right and the flipped exponential at the top left, and the dots correspond to numerical simulations performed with the classical Wright–Fisher model and with a simple genetic algorithm solving a linear programming problem. The attractors attract!

### Conclusion and a question

Statistics is useful because limit theorems (the central limit theorem, the extreme value theorem) exist. Without them, we wouldn’t be able to make any population-level prediction. Same with statistical physics: it only because matter consists of large numbers of atoms, and limit theorems hold (the H-theorem, the second law), that macroscopic physics is possible in the first place. I believe the same perspective is useful in evolutionary dynamics: it’s true that we can’t predict how many wings birds will have in ten million years, but we can tell what shape fitness distributions should have if natural selection is true.

I’ll close with an open question for you, the reader. In the central limit theorem as well as in the second law of thermodynamics, convergence is driven by a Lyapunov function, namely entropy. (In the case of the central limit theorem, it’s a relatively recent result by Arstein et al.: the entropy of the normalized sum of $n$ i.i.d. random variables, when it’s finite, is a monotonically increasing function of $n$.) In the case of natural selection for unbounded fitness, it’s clear that entropy will also be eventually monotonically increasing—the normal is the distribution with largest entropy at fixed variance and mean.

Yet it turns out that, in our case, entropy isn’t monotonic at all times; in fact, the closer the initial distribution $p_0(x)$ is to the normal distribution, the later the entropy of the standardized fitness distribution starts to increase. Or, equivalently, the closer the initial distribution $p_0(x)$ to the normal, the later its relative entropy with respect to the normal. Why is this? And what’s the actual Lyapunov function for this process (i.e., what functional of the standardized fitness distribution is monotonic at all times under natural selection)?

In the plots above the blue, orange and green lines correspond respectively to

$\displaystyle{ p_0(x)\propto e^{-x^2/2-x^4}, \quad p_0(x)\propto e^{-x^2/2-.01x^4}, \quad p_0(x)\propto e^{-x^2/2-.001x^4} }$

### References

• S. J. Gould, Wonderful Life: The Burgess Shale and the Nature of History, W. W. Norton & Co., New York, 1989.

• M. Smerlak and A. Youssef, Limiting fitness distributions in evolutionary dynamics, 2015.

• R. A. Fisher, The Genetical Theory of Natural Selection, Oxford University Press, Oxford, 1930.

• S. Artstein, K. Ball, F. Barthe and A. Naor, Solution of Shannon’s problem on the monotonicity of entropy, J. Am. Math. Soc. 17 (2004), 975–982.

## Diamonds and Triamonds

11 April, 2016

The structure of a diamond crystal is fascinating. But there’s an equally fascinating form of carbon, called the triamond, that’s theoretically possible but never yet seen in nature. Here it is:

In the triamond, each carbon atom is bonded to three others at 120° angles, with one double bond and two single bonds. Its bonds lie in a plane, so we get a plane for each atom.

But here’s the tricky part: for any two neighboring atoms, these planes are different. In fact, if we draw the bond planes for all the atoms in the triamond, they come in four kinds, parallel to the faces of a regular tetrahedron!

If we discount the difference between single and double bonds, the triamond is highly symmetrical. There’s a symmetry carrying any atom and any of its bonds to any other atom and any of its bonds. However, the triamond has an inherent handedness, or chirality. It comes in two mirror-image forms.

A rather surprising thing about the triamond is that the smallest rings of atoms are 10-sided. Each atom lies in 15 of these 10-sided rings.

Some chemists have argued that the triamond should be ‘metastable’ at room temperature and pressure: that is, it should last for a while but eventually turn to graphite. Diamonds are also considered metastable, though I’ve never seen anyone pull an old diamond ring from their jewelry cabinet and discover to their shock that it’s turned to graphite. The big difference is that diamonds are formed naturally under high pressure—while triamonds, it seems, are not.

Nonetheless, the mathematics behind the triamond does find its way into nature. A while back I told you about a minimal surface called the ‘gyroid’, which is found in many places:

It turns out that the pattern of a gyroid is closely connected to the triamond! So, if you’re looking for a triamond-like pattern in nature, certain butterfly wings are your best bet:

• Matthias Weber, The gyroids: algorithmic geometry III, The Inner Frame, 23 October 2015.

Instead of trying to explain it here, I’ll refer you to the wonderful pictures at Weber’s blog.

### Building the triamond

I want to tell you a way to build the triamond. I saw it here:

• Toshikazu Sunada, Crystals that nature might miss creating, Notices of the American Mathematical Society 55 (2008), 208–215.

This is the paper that got people excited about the triamond, though it was discovered much earlier by the crystallographer Fritz Laves back in 1932, and Coxeter named it the Laves graph.

It’s called $\mathrm{K}_4,$ since it’s the complete graph on four vertices, meaning there’s one edge between each pair of vertices. The vertices correspond to four different kinds of atoms in the triamond: let’s call them red, green, yellow and blue. The edges of this graph have arrows on them, labelled with certain vectors

$e_1, e_2, e_3, f_1, f_2, f_3 \in \mathbb{R}^3$

Let’s not worry yet about what these vectors are. What really matters is this: to move from any atom in the triamond to any of its neighbors, you move along the vector labeling the edge between them… or its negative, if you’re moving against the arrow.

For example, suppose you’re at any red atom. It has 3 nearest neighbors, which are blue, green and yellow. To move to the blue neighbor you add $f_1$ to your position. To move to the green one you subtract $e_2,$ since you’re moving against the arrow on the edge connecting blue and green. Similarly, to go to the yellow neighbor you subtract the vector $f_3$ from your position.

Thus, any path along the bonds of the triamond determines a path in the graph $\mathrm{K}_4.$

Conversely, if you pick an atom of some color in the triamond, any path in $\mathrm{K}_4$ starting from the vertex of that color determines a path in the triamond! However, going around a loop in $\mathrm{K}_4$ may not get you back to the atom you started with in the triamond.

Mathematicians summarize these facts by saying the triamond is a ‘covering space’ of the graph $\mathrm{K}_4.$

Now let’s see if you can figure out those vectors.

Puzzle 1. Find vectors $e_1, e_2, e_3, f_1, f_2, f_3 \in \mathbb{R}^3$ such that:

A) All these vectors have the same length.

B) The three vectors coming out of any vertex lie in a plane at 120° angles to each other:

For example, $f_1, -e_2$ and $-f_3$ lie in a plane at 120° angles to each other. We put in two minus signs because two arrows are pointing into the red vertex.

C) The four planes we get this way, one for each vertex, are parallel to the faces of a regular tetrahedron.

If you want, you can even add another constraint:

D) All the components of the vectors $e_1, e_2, e_3, f_1, f_2, f_3$ are integers.

### Diamonds and hyperdiamonds

That’s the triamond. Compare the diamond:

Here each atom of carbon is connected to four others. This pattern is found not just in carbon but also other elements in the same column of the periodic table: silicon, germanium, and tin. They all like to hook up with four neighbors.

The pattern of atoms in a diamond is called the diamond cubic. It’s elegant but a bit tricky. Look at it carefully!

To build it, we start by putting an atom at each corner of a cube. Then we put an atom in the middle of each face of the cube. If we stopped there, we would have a face-centered cubic. But there are also four more carbons inside the cube—one at the center of each tetrahedron we’ve created.

If you look really carefully, you can see that the full pattern consists of two interpenetrating face-centered cubic lattices, one offset relative to the other along the cube’s main diagonal.

The face-centered cubic is the 3-dimensional version of a pattern that exists in any dimension: the Dn lattice. To build this, take an n-dimensional checkerboard and alternately color the hypercubes red and black. Then, put a point in the center of each black hypercube!

You can also get the Dn lattice by taking all n-tuples of integers that sum to an even integer. Requiring that they sum to something even is a way to pick out the black hypercubes.

The diamond is also an example of a pattern that exists in any dimension! I’ll call this the hyperdiamond, but mathematicians call it Dn+, because it’s the union of two copies of the Dn lattice. To build it, first take all n-tuples of integers that sum to an even integer. Then take all those points shifted by the vector (1/2, …, 1/2).

In any dimension, the volume of the unit cell of the hyperdiamond is 1, so mathematicians say it’s unimodular. But only in even dimensions is the sum or difference of any two points in the hyperdiamond again a point in the hyperdiamond. Mathematicians call a discrete set of points with this property a lattice.

If even dimensions are better than odd ones, how about dimensions that are multiples of 4? Then the hyperdiamond is better still: it’s an integral lattice, meaning that the dot product of any two vectors in the lattice is again an integer.

And in dimensions that are multiples of 8, the hyperdiamond is even better. It’s even, meaning that the dot product of any vector with itself is even.

In fact, even unimodular lattices are only possible in Euclidean space when the dimension is a multiple of 8. In 8 dimensions, the only even unimodular lattice is the 8-dimensional hyperdiamond, which is usually called the E8 lattice. The E8 lattice is one of my favorite entities, and I’ve written a lot about it in this series:

To me, the glittering beauty of diamonds is just a tiny hint of the overwhelming beauty of E8.

But let’s go back down to 3 dimensions. I’d like to describe the diamond rather explicitly, so we can see how a slight change produces the triamond.

It will be less stressful if we double the size of our diamond. So, let’s start with a face-centered cubic consisting of points whose coordinates are even integers summing to a multiple of 4. That consists of these points:

(0,0,0)   (2,2,0)   (2,0,2)   (0,2,2)

and all points obtained from these by adding multiples of 4 to any of the coordinates. To get the diamond, we take all these together with another face-centered cubic that’s been shifted by (1,1,1). That consists of these points:

(1,1,1)   (3,3,1)   (3,1,3)   (1,3,3)

and all points obtained by adding multiples of 4 to any of the coordinates.

(0,0,0)   (1,2,3)   (2,3,1)   (3,1,2)

and all the points obtain from these by adding multiples of 4 to any of the coordinates. To get the triamond, we take all these together with another copy of these points that’s been shifted by (2,2,2). That other copy consists of these points:

(2,2,2)   (3,0,1)   (0,1,3)   (1,3,0)

and all points obtained by adding multiples of 4 to any of the coordinates.

Unlike the diamond, the triamond has an inherent handedness, or chirality. You’ll note how we used the point (1,2,3) and took cyclic permutations of its coordinates to get more points. If we’d started with (3,2,1) we would have gotten the other, mirror-image version of the triamond.

### Covering spaces

I mentioned that the triamond is a ‘covering space’ of the graph $\mathrm{K}_4.$ More precisely, there’s a graph $T$ whose vertices are the atoms of the triamond, and whose edges are the bonds of the triamond. There’s a map of graphs

$p: T \to \mathrm{K}_4$

This automatically means that every path in $T$ is mapped to a path in $\mathrm{K}_4.$ But what makes $T$ a covering space of $\mathrm{K}_4$ is that any path in $T$ comes from a path in $\mathrm{K}_4,$ which is unique after we choose its starting point.

If you’re a high-powered mathematician you might wonder if $T$ is the universal covering space of $\mathrm{K}_4.$ It’s not, but it’s the universal abelian covering space.

What does this mean? Any path in $\mathrm{K}_4$ gives a sequence of vectors $e_1, e_2, e_3, f_1, f_2, f_3$ and their negatives. If we pick a starting point in the triamond, this sequence describes a unique path in the triamond. When does this path get you back where you started? The answer, I believe, is this: if and only if you can take your sequence, rewrite it using the commutative law, and cancel like terms to get zero. This is related to how adding vectors in $\mathbb{R}^3$ is a commutative operation.

For example, there’s a loop in $\mathrm{K}_4$ that goes “red, blue, green, red”. This gives the sequence of vectors

$f_1, -e_3, e_2$

We can turn this into an expression

$f_1 - e_3 + e_2$

However, we can’t simplify this to zero using just the commutative law and cancelling like terms. So, if we start at some red atom in the triamond and take the unique path that goes “red, blue, green, red”, we do not get back where we started!

Note that in this simplification process, we’re not allowed to use what the vectors “really are”. It’s a purely formal manipulation.

Puzzle 2. Describe a loop of length 10 in the triamond using this method. Check that you can simplify the corresponding expression to zero using the rules I described.

A similar story works for the diamond, but starting with a different graph:

The graph formed by a diamond’s atoms and the edges between them is the universal abelian cover of this little graph! This graph has 2 vertices because there are 2 kinds of atom in the diamond. It has 4 edges because each atom has 4 nearest neighbors.

Puzzle 3. What vectors should we use to label the edges of this graph, so that the vectors coming out of any vertex describe how to move from that kind of atom in the diamond to its 4 nearest neighbors?

There’s also a similar story for graphene, which is hexagonal array of carbon atoms in a plane:

Puzzle 4. What graph with edges labelled by vectors in $\mathbb{R}^2$ should we use to describe graphene?

I don’t know much about how this universal abelian cover trick generalizes to higher dimensions, though it’s easy to handle the case of a cubical lattice in any dimension.

Puzzle 5. I described higher-dimensional analogues of diamonds: are there higher-dimensional triamonds?

### References

The Wikipedia article is good:

• Wikipedia, Laves graph.

They say this graph has many names: the K4 crystal, the (10,3)-a network, the srs net, the diamond twin, and of course the triamond. The name triamond is not very logical: while each carbon has 3 neighbors in the triamond, each carbon has not 2 but 4 neighbors in the diamond. So, perhaps the diamond should be called the ‘quadriamond’. In fact, the word ‘diamond’ has nothing to do with the prefix ‘di-‘ meaning ‘two’. It’s more closely related to the word ‘adamant’. Still, I like the word ‘triamond’.

This paper describes various attempts to find the Laves graph in chemistry:

• Stephen T. Hyde, Michael O’Keeffe, and Davide M. Proserpio, A short history of an elusive yet ubiquitous structure in chemistry, materials, and mathematics, Angew. Chem. Int. Ed. 47 (2008), 7996–8000.

This paper does some calculations arguing that the triamond is a metastable form of carbon:

• Masahiro Itoh et al, New metallic carbon crystal, Phys. Rev. Lett. 102 (2009), 055703.

Abstract. Recently, mathematical analysis clarified that sp2 hybridized carbon should have a three-dimensional crystal structure ($\mathrm{K}_4$) which can be regarded as a twin of the sp3 diamond crystal. In this study, various physical properties of the $\mathrm{K}_4$ carbon crystal, especially for the electronic properties, were evaluated by first principles calculations. Although the $\mathrm{K}_4$ crystal is in a metastable state, a possible pressure induced structural phase transition from graphite to $\mathrm{K}_4$ was suggested. Twisted π states across the Fermi level result in metallic properties in a new carbon crystal.

The picture of the $\mathrm{K}_4$ crystal was placed on Wikicommons by someone named ‘Workbit’, under a Creative Commons Attribution-Share Alike 4.0 International license. The picture of the tetrahedron was made using Robert Webb’s Stella software and placed on Wikicommons. The pictures of graphs come from Sunada’s paper, though I modified the picture of $\mathrm{K}_4.$ The moving image of the diamond cubic was created by H.K.D.H. Bhadeshia and put into the public domain on Wikicommons. The picture of graphene was drawn by Dr. Thomas Szkopek and put into the public domain on Wikicommons.

## Computing the Uncomputable

2 April, 2016

I love the more mind-blowing results of mathematical logic:

Here’s a new one:

• Joel David Hamkins, Any function can be computable.

Let me try to explain it without assuming you’re an expert on mathematical logic. That may be hard, but I’ll give it a try. We need to start with some background.

First, you need to know that there are many different ‘models’ of arithmetic. If you write down the usual axioms for the natural numbers, the Peano axioms (or ‘PA’ for short), you can then look around for different structures that obey these axioms. These are called ‘models’ of PA.

One of them is what you think the natural numbers are. For you, the natural numbers are just 0, 1, 2, 3, …, with the usual way of adding and multiplying them. This is usually called the ‘standard model’ of PA. The numbers 0, 1, 2, 3, … are called the ‘standard’ natural numbers.

But there are also nonstandard models of arithmetic. These models contain extra numbers beside the standard ones! These are called ‘nonstandard’ natural numbers.

This takes a while to get used to. There are several layers of understanding to pass through.

For starters, you should think of these extra ‘nonstandard’ natural numbers as bigger than all the standard ones. So, imagine a whole bunch of extra numbers tacked on after the standard natural numbers, with the operations of addition and multiplication cleverly defined in such a way that all the usual axioms still hold.

You can’t just tack on finitely many extra numbers and get this to work. But there can be countably many, or uncountably many. There are infinitely many different ways to do this. They are all rather hard to describe.

To get a handle on them, it helps to realize this. Suppose you have a statement S in arithmetic that is neither provable nor disprovable from PA. Then S will hold in some models of arithmetic, while its negation not(S) will hold in some other models.

For example, the Gödel sentence G says “this sentence is not provable in PA”. If Peano arithmetic is consistent, neither G nor not(G) is provable in PA. So G holds in some models, while not(G) holds in others.

Thus, you can intuitively think of different models as “possible worlds”. If you have an undecidable statement, meaning one that you can’t prove or disprove in PA, then it holds in some worlds, while its negation holds in other worlds.

In the case of the Gödel sentence G, most mathematicians think G is “true”. Why the quotes? Truth is a slippery concept in logic—there’s no precise definition of what it means for a sentence in arithmetic to be “true”. All we can precisely define is:

1) whether or not a sentence is provable from some axioms

and

2) whether or not a sentence holds in some model.

Nonetheless, mathematicians are human, so they have beliefs about what’s true. Many mathematicians believe that G is true: indeed, in popular accounts one often hears that G is “true but unprovable in Peano arithmetic”. So, these mathematicians are inclined to say that any model where G doesn’t hold is nonstandard.

### The result

Anyway, what is Joel David Hamkins’ result? It’s this:

There is a Turing machine T with the following property. For any function $f$ from the natural numbers to the natural numbers, there is a model of PA such that in this model, if we give T any standard natural $n$ as input, it halts and outputs $f(n).$

So, take $f$ to be your favorite uncomputable function. Then there’s a model of arithmetic such that in this model, the Turing machine computes $f,$ at least when you feed the machine standard numbers as inputs.

So, very very roughly, there’s a possible world in which your uncomputable function becomes computable!

But you have to be very careful about how you interpret this result.

### The trick

What’s the trick? The proof is beautiful, but it would take real work to improve on Hamkins’ blog article, so please read that. I’ll just say that he makes extensive use of Rosser sentences, which say:

“For any proof of this sentence in theory T, there is a smaller proof of the negation of this sentence.”

Rosser sentences are already mind-blowing, but Hamkins uses an infinite sequence of such sentences and their negations, chosen in a way that depends on the function $f,$ to cleverly craft a model of arithmetic in which the Turing machine T computes this function on standard inputs.

But what’s really going on? How can using a nonstandard model make an uncomputable function become computable for standard natural numbers? Shouldn’t nonstandard models agree with the standard one on this issue? After all, the only difference is that they have extra nonstandard numbers tacked on after all the standard ones! How can that make a Turing machine succeed in computing $f$ on standard natural numbers?

I’m not 100% sure, but I think I know the answer. I hope some logicians will correct me if I’m wrong.

You have to read the result rather carefully:

There is a Turing machine T with the following property. For any function $f$ from the natural numbers to the natural numbers, there is a model of PA such that in this model, if we give T any standard natural $n$ as input, it halts and computes $f(n).$

When we say the Turing machine halts, we mean it halts after $N$ steps for some natural number $N.$ But this may not be a standard natural number! It’s a natural number in the model we’re talking about.

So, the Turing machine halts… but perhaps only after a nonstandard number of steps.

In short: you can compute the uncomputable, but only if you’re willing to wait long enough. You may need to wait a nonstandard amount of time.

It’s like that old Navy saying:

But the trick becomes more evident if you notice that one single Turing machine T computes different functions from the natural numbers to the natural numbers… in different models. That’s even weirder than computing an uncomputable function.

The only way to build a machine that computes $n+1$ in one model and $n+2$ in another to build a machine that doesn’t halt in a standard amount of time in either model. It only halts after a nonstandard amount of time. In one model, it halts and outputs $n+1.$ In another, it halts and outputs $n+2.$

### A scary possibility

To dig a bit deeper—and this is where it gets a bit scary—we have to admit that the standard model is a somewhat elusive thing. I certainly didn’t define it when I said this:

For you, the natural numbers are just 0, 1, 2, 3, …, with the usual way of adding and multiplying them. This is usually called the standard model of PA. The numbers 0, 1, 2, 3, … are called the ‘standard’ natural numbers.

The point is that “0, 1, 2, 3, …” here is vague. It makes sense if you already know what the standard natural numbers are. But if you don’t already know, those three dots aren’t going to tell you!

You might say the standard natural numbers are those of the form 1 + ··· + 1, where we add 1 to itself some finite number of times. But what does ‘finite number’ mean here? It means a standard natural number! So this is circular.

So, conceivably, the concept of ‘standard’ natural number, and the concept of ‘standard’ model of PA, are more subjective than most mathematicians think. Perhaps some of my ‘standard’ natural numbers are nonstandard for you!

I think most mathematicians would reject this possibility… but not all. Edward Nelson tackled it head-on in his marvelous book Internal Set Theory. He writes:

Perhaps it is fair to say that “finite” does not mean what we have always thought it to mean. What have we always thought it to mean? I used to think that I knew what I had always thought it to mean, but I no longer think so.

If we go down this road, Hamkins’ result takes on a different significance. It says that any subjectivity in the notion of ‘natural number’ may also infect what it means for a Turing machine to halt, and what function a Turing machine computes when it does halt.

## Shock Breakout

30 March, 2016

Here you can see the brilliant flash of a supernova as its core blasts through its surface. This is an animated cartoon made by NASA based on observations of a red supergiant star that exploded in 2011. It has been sped up by a factor of 240. You can see a graph of brightness showing the actual timescale at lower right.

When a star like this runs out of fuel for nuclear fusion, its core cools. That makes the pressure drop—so the core collapses under the force of gravity.

When the core of a supernova collapses, the infalling matter can reach almost a quarter the speed of light. So when it hits the center, this matter becomes very hot! Indeed, the temperature can reach 100 billion kelvin. That’s 6000 times the temperature of our Sun’s core!

For a supernova less than 25 solar masses, the collapse stops only when the core is compressed into a neutron star. As this happens, lots of electrons and protons become neutrons and neutrinos. Most of the resulting energy is instantly carried away by a ten-second burst of neutrinos. This burst can have an energy of 1046 joules.

It’s hard to comprehend this. It’s what you’d get if you suddenly converted the mass of 18,000 Earths into energy! Astronomers use a specially huge unit with such energies: the foe, which stands for ten to the fifty-one ergs.

That’s 1044 joules. So, a supernova can release 100 foe in neutrinos. By comparison, only 1 or 2 foe come out as light.

Why? Neutrinos can effortlessly breeze through matter. Light cannot! So it takes longer to actually see things happen at the star’s surface—especially since a red supergiant is large. This one was about 500 times the radius of our Sun.

So what happened? A shock wave rushed upward through the star. First it broke through the star’s surface in the form of finger-like plasma jets, which you can see in the animation.

20 minutes later, the full fury of the shock wave reached the surface—and the doomed star exploded in a blinding flash! This is called the shock breakout.

Then the star expanded as a blue-hot ball of plasma.

Here’s how the star’s luminosity changed with time, measured in multiples of the Sun’s luminosity:

Note that while the shock breakout seems very bright, it’s ultimately dwarfed by the luminosity of the expanding ball of plasma. So, KSN2011d was actually one of the first two supernovae for which the shock breakout was seen! For details, read this:

• P. M. Garnavich, B. E. Tucker, A. Rest, E. J. Shaya, R. P. Olling, D. Kasen and A. Villar, Shock breakout and early light curves of Type II-P supernovae observed with Kepler.

A Type II supernova is one that shows hydrogen in its spectral lines: these are commonly formed by the collapse of a star that has run out of fuel in its core, but retains hydrogen in its outer layers. A Type II-P is one that shows a plateau in its light curve: the P is for ‘plateau’. These are more common than the Type II-L, which show a more rapid (‘linear’) decay in their luminosity:

## Probability Puzzles (Part 3)

26 March, 2016

Here’s a puzzle based on something interesting that I learned from Greg Egan. I’ve dramatized it a bit.

Traditional Tom and Liberal Lisa are dating. They discuss their plans for having children:

Tom: I plan to keep having kids until I get two sons in a row.

Lisa: What?! That’s absurd. Why?

Tom: I want two to run my store when I get old.

Lisa: Even ignoring your insulting assumption that only boys can manage your shop, why in the world do you need two in a row?

Tom: From my own childhood, I’ve learned there’s a special bond between sons who are next to each other in age. They play together, they grow up together… they can run my shop together.

Lisa: Hmm. Well, then maybe I should have children until I have a girl followed directly by a boy!

Tom: What?!

Lisa: Well, I’ve observed that something special happens when a boy has an older sister, with no intervening siblings. They play together, they grow up together… and maybe he learns not to be such a sexist pig!

They decide they are incompatible, so they split up and each one separately tries to find a mate who will go along with their reproductive plan.

Now for some puzzles. In these puzzles, assume that each time someone has a child, they have a 50% chance of having either a daughter or a son. Also assume each event is independent: that is, the gender of any children so far has no effect on that of later ones. Also ignore twins and other tricky issues.

Puzzle 1. If Tom carries out his plan of having children until he has two consecutive sons, and then stops, what is the expected number of children he will have?

Puzzle 2. If Lisa carries out her plan of having children until she has a daughter followed directly by a son, and then stops, what is the expected number of children she will have?

Puzzle 3: Which is greater, Tom’s expected number of children or Lisa’s? Or are they equal?

For maximum benefit, try to answer Puzzle 3 before doing the calculations required to answer Puzzles 1 or 2.