Lebesgue’s Universal Covering Problem (Part 1)

8 December, 2013

I try to focus on serious problems in this blog, mostly environmental issues and the attempt to develop ‘green mathematics’. But I seem unable to resist talking about random fun stuff now and then. For example, the Lebesgue universal covering problem. It’s not important, it’s just strange… but for some reason I feel a desire to talk about it.

For starters, let’s say the diameter of a region in the plane is the maximum distance between two points in this region. You all know what a circle of diameter 1 looks like. But an equilateral triangle with edges of length 1 also has diameter 1:


After all, two points in this triangle are farthest apart when they’re at two corners. And you’ll notice, if you think about it, that this triangle doesn’t fit inside a circle of diameter 1:

In 1914, the famous mathematician Henri Lebesgue sent a letter to a pal named Pál. And in this letter he asked: what’s the smallest possible region that contains every set in the plane with diameter 1?

He was actually more precise. The phrase “smallest possible” is vague, but Lebesgue wanted the set with the least possible area. The phrase “contains” is also vague, at least as I used it. I didn’t mean that our region should literally contain every set with diameter 1. Only the whole plane does that! I meant that we can rotate or translate any set with diameter 1 until it fits in our region. So a more precise statement is:

What is the smallest possible area of a region S in the plane such that every set of diameter 1 in the plane can be rotated and translated to fit inside S?

You see why math gets a reputation for being dry: sometimes when you make a simple question precise, it gets longer.

Even this second version of the question is a bit wrong, since it’s presuming there exists a region with smallest possible area that does the job. Maybe you can do it with regions whose area gets smaller and smaller, closer and closer to some number, but you can’t do it with a region whose area equals that number! Also, the word ‘region’ is a bit vague. So if I were talking to a nitpicky mathematician, I might pose the puzzle this way:

What is the greatest lower bound of the measures of closed sets S \subseteq \mathbb{R}^2 such that every set T \subseteq \mathbb{R}^2 of diameter 1 can be rotated and translated to fit inside S?

Anyway, the reason this puzzle is famous is not that it gets dry when you state it precisely. It’s that it’s hard to solve!

It’s called Lebesgue’s universal covering problem. A region in the plane is called a universal covering if it does the job: any set in the plane of diameter 1 can fit inside it, in the sense I made precise.

Pál set out to find universal coverings, and in 1920 he wrote a paper on his results. He found a very nice one: a regular hexagon circumscribed around a circle of diameter 1. Do you see why you can fit the equilateral triangle with side length 1 inside this?

This hexagon has area

\sqrt{3}/2 \approx 0.86602540

But in the same paper, Pál showed you could safely cut off two corners of this hexagon and get a smaller universal covering! This one has area

2 - 2/\sqrt{3} \approx 0.84529946

He guessed this solution was optimal.
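Just to check the numbers, here's a quick back-of-envelope computation. It's a minimal sketch of my own using the shoelace formula, not anything from Pál's paper:

```python
import numpy as np

def shoelace_area(pts):
    """Area of a polygon from its vertices in order (shoelace formula)."""
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# A regular hexagon circumscribed around a circle of diameter 1 has
# inradius 1/2, hence circumradius (1/2)/cos(pi/6) = 1/sqrt(3).
R = 0.5 / np.cos(np.pi / 6)
angles = np.arange(6) * np.pi / 3
hexagon = np.column_stack([R * np.cos(angles), R * np.sin(angles)])

print(shoelace_area(hexagon))  # 0.86602540... = sqrt(3)/2, Pal's hexagon
print(2 - 2 / np.sqrt(3))      # 0.84529946..., the hexagon with two corners cut
```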

However, in 1936, a guy named Sprague sliced two tiny pieces off Pál’s proposed solution and found a universal covering with area

\sim 0.84413770

Here’s the idea:

The big hexagon is Pál's original solution. He then inscribed a regular 12-sided polygon inside this, and showed that you can remove two of the corners, say B_1B_2B and F_1F_2F, and get a smaller universal covering. But Sprague noticed that near D you can also remove the part outside the circle with radius 1 centered at B_1, as well as the part outside the circle with radius 1 centered at F_2.

Sprague conjectured that this is the best you can do.

But in 1975, Hansen showed you could slice two very tiny corners off Sprague’s solution, each of which reduces the area by just 6 \cdot 10^{-18}.

I think this microscopic advance, more than anything else, convinced people that Lebesgue’s universal covering problem was fiendishly difficult. Victor Klee, in a parody of the usual optimistic prophecies people like to make, wrote that:

… progress on this question, which has been painfully slow in the past, may be even more painfully slow in the future.

Indeed, my whole interest in this problem is rather morbid. I don’t know any reason that it’s important. I don’t see it as connected to lots of other beautiful math. It just seems astoundingly hard compared to what you might initially think. I admire people who work on it in the same way I admire people who decide to ski across the Antarctic. It’s fun to read about—from the comfort of my living room, sitting by the fire—but I can’t imagine actually doing it!

In 1980, a guy named Duff did a lot better. All the universal coverings so far were convex subsets of the plane. But he considered nonconvex universal coverings and found one with area:

\sim 0.84413570

This changed the game a lot. If we only consider convex universal coverings, it’s possible to prove there’s one with the least possible area—though we don’t know what it is. But if we allow nonconvex ones, we don’t even know this. There could be solutions whose area gets smaller and smaller, approaching some nonzero value, but never reaching it.

So at this point, Lebesgue’s puzzle split in two: one for universal coverings, and one for convex universal coverings. I’ll only talk about the second one, since I don’t know of further progress on the first.

Remember how Hansen improved Sprague’s universal covering by chopping off two tiny pieces? In 1992 he went further. He again sliced two corners off Sprague’s solution. One, the same shape as before, reduced the area by just 6 \cdot 10^{-18}. But the other was vastly larger: it reduced the area by a whopping 4 \cdot 10^{-11}.

I believe this is the smallest convex universal covering known so far.

But there’s more to say. In 2005, Peter Brass and Mehrbod Sharifi came up with a nice lower bound. They showed any convex universal covering must have an area of at least

0.832

They got this result by a rigorous computer-aided search for the smallest possible area of a convex set containing a circle, an equilateral triangle and a regular pentagon, each of diameter 1.

If you only want your convex set to contain a circle and equilateral triangle of diameter 1, you can get away with this:

This gives a lower bound of

0.825

But the pentagon doesn’t fit in this set! Here is a convex shape that also contains a pentagon of diameter 1, as found by Brass and Sharifi:

You’ll notice the triangle no longer has the same center as the circle!

To find this result, it was enough to keep the circle fixed, translate the triangle, and translate and rotate the pentagon. So, they searched through a 5-dimensional space, repeatedly computing the area of the convex hull of these three shapes to see how small they could make it. They considered 53,118,162 cases. Of these, 655,602 required further care—roughly speaking, they had to move the triangle and pentagon around slightly to see if they could make the area even smaller. They proved that they could not make it smaller than 0.832.
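Their computation was a rigorous case analysis, but you can get a feel for the idea with a rough numerical version. Here's a minimal sketch of my own: it approximates the circle by boundary points and hands the convex hull area to an off-the-shelf optimizer, so at best it finds a non-rigorous local minimum.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial import ConvexHull

def circle(n=200):
    """Boundary points of a circle of diameter 1 centered at the origin."""
    t = np.linspace(0, 2 * np.pi, n, endpoint=False)
    return 0.5 * np.column_stack([np.cos(t), np.sin(t)])

def ngon(k, x, y, phi):
    """Regular k-gon of diameter 1, translated by (x, y), rotated by phi."""
    r = 0.5 / np.cos(np.pi / (2 * k)) if k % 2 else 0.5  # circumradius
    t = phi + 2 * np.pi * np.arange(k) / k
    return np.column_stack([x + r * np.cos(t), y + r * np.sin(t)])

def hull_area(params):
    """Area of the convex hull of the circle, a translated triangle,
    and a translated-and-rotated pentagon."""
    tx, ty, px, py, pphi = params
    pts = np.vstack([circle(), ngon(3, tx, ty, 0.0), ngon(5, px, py, pphi)])
    return ConvexHull(pts).volume  # for 2D hulls, .volume is the area

# The same 5-dimensional search space that Brass and Sharifi used.
best = minimize(hull_area, x0=np.zeros(5), method='Nelder-Mead')
print(best.fun)  # a non-rigorous local minimum; their rigorous answer is >= 0.832
```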

They say they could have gotten a better result if they’d also included a fourth shape, but this was computationally infeasible, since it would take them from a 5-dimensional search space to an 8-dimensional one. It’s possible that one could tackle this using a distributed computing project where a lot of people contribute computer time, like they do in the search for huge prime numbers.

If you hear of more progress on this issue, please let me know! I hope that sometime—perhaps tomorrow, perhaps decades hence—someone will report good news.

References

Pál’s paper is here:

• Julius Pál, Ueber ein elementares Variationsproblem, Danske Mat.-Fys. Meddelelser III 2 (1920).

Sprague’s paper is here:

• R. Sprague, Über ein elementares Variationsproblem, Matematisk Tidsskrift B (1936), 96–99.

Hansen’s record-holding convex universal cover is here:

• H. Hansen, Small universal covers for sets of unit diameter, Geometriae Dedicata 42 (1992), 205–213.

It’s quite readable. This is where I got the picture of Pál’s solution and Sprague’s improvement.

The paper on the current best lower bound for convex universal coverings is also quite nice:

• Peter Brass and Mehrbod Sharifi, A lower bound for Lebesgue’s universal cover problem, International Journal of Computational Geometry & Applications 15 (2005), 537–544.

The picture of the equilateral triangle in the circle comes from an earlier paper, which got a lower bound of

0.8271

by considering the circle and regular 3^n-gons of diameter 1 for all n:

• Gy. Elekes, Generalized breadths, circular Cantor sets, and the least area UCC, Discrete & Computational Geometry 12 (1994), 439–449.

I have not yet managed to get ahold of Duff’s paper on the record-holding nonconvex universal covering:

• G. F. D. Duff, A smaller universal cover for sets of unit diameter, C. R. Math. Rep. Acad. Sci. Canada 2 (1980), 37–42.

One interesting thing is that in 1963, Eggleston found a universal covering that’s minimal in the following sense: no subset of it is a universal covering. However, it doesn’t have the least possible area!

• H. G. Eggleston, Minimal universal covers in \mathrm{E}^n, Israel J. Math. 1 (1963), 149–155.

Later, Kovalev showed that any ‘minimal’ universal covering in this sense is a star domain:

• M. D. Kovalev, A minimal Lebesgue covering exists (Russian), Mat. Zametki 40 (1986), 401–406, 430.

This means there’s a point x_0 inside the set such that for any other point x in the set, the line segment connecting x_0 and x is in the set. Any convex set is a star domain, but not conversely:




Talk at the SETI Institute

5 December, 2013

SETI means ‘Search for Extraterrestrial Intelligence’. I’m giving a talk at the SETI Institute on Tuesday December 17th, from noon to 1 pm. You can watch it live, watch it later on their YouTube channel, or actually go there and see it. It’s free, and you can just walk in at 189 San Bernardo Avenue in Mountain View, California, but please register if you can.

Life’s Struggle to Survive

When pondering the number of extraterrestrial civilizations, it is worth noting that even after it got started, the success of life on Earth was not a foregone conclusion. We recount some thrilling episodes from the history of our planet, some well-documented but others merely theorized: our collision with the planet Theia, the oxygen catastrophe, the snowball Earth events, the Permian-Triassic mass extinction event, the asteroid that hit Chicxulub, and more, including the global warming episode we are causing now. All of these hold lessons for what may happen on other planets.

If you know interesting things about these or other ‘close calls’, please tell me! I’m still preparing my talk, and there’s room for more fun facts. I’ll make my slides available when they’re ready.

The SETI Institute looks like an interesting place, and my host, Adrian Brown, is an expert on the poles of Mars. I’ve been fascinated by the water there, and I’ll definitely ask him about this paper:

• Adrian J. Brown, Shane Byrne, Livio L. Tornabene and Ted Roush, Louth crater: Evolution of a layered water ice mound, Icarus 196 (2008), 433–445.

Louth Crater is a fascinating place. Here’s a photo:

By the way, I’ll be in Berkeley from December 14th to 21st, except for a day trip down to Mountain View for this talk. I’ll be at the Machine Intelligence Research Institute talking to Eliezer Yudkowsky, Paul Christiano and others at a Workshop on Probability, Logic and Reflection. This invitation arose from my blog post here:

Probability theory and the undefinability of truth.

If you’re in Berkeley and you want to talk, drop me a line. I may be too busy, but I may not.


Rolling Hypocycloids

3 December, 2013

The hypocycloid with n cusps is the curve traced out by a point on a circle rolling inside a circle whose radius is n times larger.

The hypocycloid with 2 cusps is sort of strange:

It’s just a line segment! It’s called the Tusi couple.

The hypocycloid with 3 cusps is called the deltoid, because it’s shaped like the Greek letter Δ:

The hypocycloid with 4 cusps is called the astroid, because it reminds people of a star:
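In case you’d like to draw these yourself: a point on a unit circle rolling inside a circle of radius n traces out the curve z(t) = (n-1)e^{it} + e^{-i(n-1)t} in the complex plane. Here’s a minimal Python sketch of my own:

```python
import numpy as np
import matplotlib.pyplot as plt

def hypocycloid(n, num=1000):
    """Hypocycloid with n cusps: a point on a unit circle rolling
    inside a circle of radius n, as a curve in the complex plane."""
    t = np.linspace(0, 2 * np.pi, num)
    return (n - 1) * np.exp(1j * t) + np.exp(-1j * (n - 1) * t)

for n in (2, 3, 4):  # Tusi couple, deltoid, astroid
    z = hypocycloid(n)
    plt.plot(z.real, z.imag, label=f'{n} cusps')
plt.gca().set_aspect('equal')
plt.legend()
plt.show()
```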

I got interested in hypocycloids while writing some articles here:

Rolling circles and balls.

My goal was to explain a paper John Huerta and I wrote about the truly amazing things that happen when you roll a ball on a ball 3 times as big. But I got into some interesting digressions.

While pondering hypocycloids in the ensuing discussion, the mathematician, physicist, programmer and science fiction author Greg Egan created this intriguing movie:

It’s a deltoid rolling inside an astroid. It fits in a perfectly snug way, with all three corners touching the astroid at all times!

Why does it work? It’s related to some things that physicists like: SU(3) and SU(4).

SU(n) is the group of n × n unitary matrices with determinant 1. Physicists like Pauli and Heisenberg got interested in SU(2) when they realized it describes the rotational symmetries of an electron. You need it to understand the spin of an electron. Later, Gell-Mann got interested in SU(3) because he thought it described the symmetries of quarks. He won a Nobel prize for predicting a new particle based on this theory, which was then discovered.

We now know that SU(3) does describe symmetries of quarks, but not in the way Gell-Mann thought. It turns out that quarks come in 3 ‘colors’—not real colors, but jokingly called red, green and blue. Similarly, electrons come in 2 different spin states, called up and down. Matrices in SU(3) can change the color of a quark, just as matrices in SU(2) can switch an electron’s spin from up to down, or some mix of up and down.

SU(4) would be important in physics if quarks came in 4 colors. In fact there’s a theory saying they do, with electrons and neutrinos being examples of quarks in their 4th color state! It’s called the Pati–Salam theory, and you can see an explanation here:

• John Baez and John Huerta, The algebra of grand unified theories.

It’s a lot of fun, because it unifies leptons (particles like electrons and neutrinos) and quarks. There’s even a chance that it’s true. But it’s not very popular these days, because it has some problems. It predicts that protons decay, which we haven’t seen happen yet.

Anyway, the math of SU(3) and SU(4) is perfectly well understood regardless of their applications to physics. And here’s the cool part:

If you take a matrix in SU(3) and add up its diagonal entries, you can get any number in the complex plane that lies inside a deltoid. If you take a matrix in SU(4) and add up its diagonal entries, you can get any number in the complex plane that lies inside an astroid. And using how SU(3) fits inside SU(4), you can show that a deltoid moves snugly inside an astroid!
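You can check this numerically. Here’s a little sketch of my own: it draws Haar-distributed unitary matrices using SciPy and divides out an n-th root of the determinant to land in SU(n); the resulting traces should fill out the hypocycloid with n cusps.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import unitary_group

def su_traces(n, samples=20000):
    """Traces of random SU(n) matrices, obtained by drawing Haar-random
    U(n) matrices and dividing by an n-th root of the determinant."""
    us = unitary_group.rvs(n, size=samples)      # shape (samples, n, n)
    dets = np.linalg.det(us)
    return np.einsum('kii->k', us) / dets ** (1.0 / n)

n = 3                                            # 3 -> deltoid, 4 -> astroid
tr = su_traces(n)
t = np.linspace(0, 2 * np.pi, 500)
boundary = (n - 1) * np.exp(1j * t) + np.exp(-1j * (n - 1) * t)
plt.scatter(tr.real, tr.imag, s=1, alpha=0.3)    # traces fill the inside...
plt.plot(boundary.real, boundary.imag, 'r')      # ...of the hypocycloid
plt.gca().set_aspect('equal')
plt.show()
```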

The deltoid looks like it’s rolling, hence the title of this article. But Egan pointed out that it’s not truly ‘rolling’ in the sense of mechanics—it slides a bit as it rolls.

For the details, see this article on my other blog:

Deltoid rolling on an astroid.

But the really cool part—the new thing I want to show you today—is that this pattern continues!

For example, an astroid moves snugly inside a 5-pointed shape, thanks to how SU(4) sits inside SU(5). Here’s a movie of that, again made by Greg Egan:

In general, a hypocycloid with n cusps moves snugly inside a hypocycloid with n + 1 cusps. But this implies that you can have a hypocycloid with 2 cusps moving inside one with 3 cusps moving inside one with 4 cusps, etcetera! I’ve been wanting to see this for a while, but yesterday Egan created some movies showing this:

Depending on the details, you can get different patterns:

Egan explains:

In the first movie, every n-cusped hypocycloid moves inside the enclosing (n+1)-cusped hypocycloid at the same angular velocity and in the same direction, relative to the enclosing one … which is itself rotating, except for the very largest one. So the cumulative rotational velocity is highest for the line, lower for the deltoid, lower still for the astroid, in equal steps until you hit zero for the outermost hypocycloid.

In the second movie, the magnitude of the relative angular velocity is always the same between each hypocycloid and the one that encloses it, but the direction alternates as you go up each level.

Here’s one where he went up to a hypocycloid with 10 cusps:

Note how sometimes all the cusps meet!

Yonatan Zunger is the chief architect of Google+, but before that he used to work on string theory. Reading about this stuff on Google+, he had a number of interesting ideas, like this:

I wonder, can a deltoid roll in a 5-hypocycloid? I haven’t worked through the math of just how the SU(n) → SU(n+1) embedding guarantees a fit here.

(And also, are there corresponding pictures we could draw to illustrate the embeddings of other Lie groups? This could be a lovely way to illustrate a wide range of relationships if it could be done more generally.)

Egan figured out that yes, a deltoid can move nicely in a hypocycloid with 5 cusps:

He writes:

To make this, I took the border of a deltoid in “standard position” (as per the trace theorem) to be:

\mathrm{deltoid}(t) = 2 \exp(i t) + \exp(-2 i t)

Suppose I apply the linear function:

f(s, z) = \exp(-2 i s / 5) z + 2 \exp(3 i s / 5)

to the whole deltoid. Then at the point s=t on the deltoid, we have:

f(s, \mathrm{deltoid}(s)) = 4 \exp(3 i s / 5) + \exp(-4 (3 i s / 5))

which is a point on a 5-cusped hypocycloid in “standard position”. Of course with this choice of parameters you need to take s from 0 through to 10 \pi/3, not 2 \pi, to complete a cycle.
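Egan’s algebra is easy to verify numerically. Here’s a minimal sketch of my own:

```python
import numpy as np

def deltoid(t):
    """Deltoid in 'standard position': 2 e^{it} + e^{-2it}."""
    return 2 * np.exp(1j * t) + np.exp(-2j * t)

def f(s, z):
    """Egan's motion: rotate by -2s/5 and translate by 2 e^{3is/5}."""
    return np.exp(-2j * s / 5) * z + 2 * np.exp(3j * s / 5)

s = np.linspace(0, 10 * np.pi / 3, 1000)  # one full cycle
contact = f(s, deltoid(s))                # point of the moving deltoid at s
hypo5 = 4 * np.exp(3j * s / 5) + np.exp(-4 * (3j * s / 5))
print(np.abs(contact - hypo5).max())      # ~1e-15: the contact point stays
                                          # on the 5-cusped hypocycloid
```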

Using the same trick, you can get a hypocycloid with n cusps to move inside one with m cusps whenever n ≤ m. As for the more general question of how different Lie groups give different ‘generalized hypocycloids’, and how fitting one Lie group in another lets you roll one generalized hypocycloid in another, the current state of the art is here:

• N. Kaiser, Mean eigenvalues for simple, simply connected, compact Lie groups.

But there is still more left to do!

On a somewhat related note, check out this fractal obtained by rolling a circle inside a circle 4 times as big that’s rolling in a circle 4 times as big… and so on: astroidae ad infinitum!


Water

29 November, 2013

Over a year ago, I wrote here about ice. It has 16 known forms with different crystal geometries. The most common form on Earth, hexagonal ice I, is a surprisingly subtle blend of order and randomness:

Liquid water is even more complicated. It’s mainly a bunch of molecules like this jostling around:


The two hydrogens are tightly attached to the oxygen. But accidents do happen. On average, for every 555 million molecules of water, one is split into a negatively charged OH⁻ and a positively charged H⁺. And this actually matters a lot, in chemistry. It’s the reason we say water has pH 7.

Why? By definition, pH 7 means that for every liter of water, there’s 10⁻⁷ moles of H⁺. That’s where the 7 comes from. But there’s 55.5 moles of water in every liter, at least when the water is cold so its density is almost 1 kilogram/liter. So, do the math and you see that for every 555 million molecules of water, there’s only one H⁺.
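In code, if you like (just the back-of-envelope arithmetic above):

```python
import math

moles_water = 55.5          # moles of H2O per liter: ~1 kg/L over ~18 g/mol
moles_h = 1e-7              # moles of H+ per liter at pH 7, by definition
print(moles_water / moles_h)             # 5.55e8: one H+ per 555 million molecules
print(-math.log10(moles_water / 8800))   # one H+ per 8800 molecules gives pH ~ 2.2
```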

Acids have a lot more. For example, lemon juice has one H⁺ per 8800 water molecules.

But let’s think about this H⁺ thing. What is it, really? It’s a hydrogen atom missing its electron: a proton, all by itself!

But what happens when you’ve got a lone proton in water? It doesn’t just sit there. It quickly attaches to a water molecule, forming H₃O⁺. This is called a hydronium ion, and it looks like this:


But hydronium is still positively charged, so it will attract electrons in other water molecules! Different things can happen. Here you see a hydronium ion surrounded by three water molecules in a symmetrical way:


This is called an Eigen cation, with chemical formula H₉O₄⁺. I believe it’s named after the Nobel-prize-winning chemist Manfred Eigen—no relation to the ‘eigen’ in ‘eigenvector’, which is simply the German word for ‘own’ or ‘characteristic’.

And here you see a hydronium ion at lower right, attracted to a water molecule at left:


This is a Zundel cation, with chemical formula H₅O₂⁺. It’s named after Georg Zundel, the German expert on hydrogen bonds. The H⁺ in the middle looks more tightly connected to the water at right than to the water at left. But it should be completely symmetrical—at least, that’s the theory of how a Zundel cation works.

But the Eigen and Zundel cations are still positively charged, so they attract more water molecules, making bigger and bigger structures. Nowadays chemists are studying these using computer simulations, and comparing the results to experiments. In 2010, Evgenii Stoyanov, Irina Stoyanova and Christopher Reed used infrared spectroscopy to argue that a lone proton often attaches itself to 6 water molecules, forming H⁺(H₂O)₆, or H₁₃O₆⁺, like this:


As you can see, this forms when each hydrogen in a Zundel cation attracts an extra water molecule.

Even this larger structure attracts more water molecules:


But the positive charge, they claim, stays roughly within the dotted line.

Wait. Didn’t I say the lone proton was right in the middle? Isn’t that what the picture shows—the H in the middle?

Well, the picture is a bit misleading! First, everything is wiggling around a lot. And second, quantum mechanics says we don’t know the position of that proton precisely! Instead, it’s a ‘probability cloud’ smeared over a large region, ending roughly at the dashed line. (You can’t say precisely where a cloud ends.)

It seems that something about these subtleties makes the distance between the two oxygen nuclei at the center surprisingly large. In an ordinary water molecule, the distance between the hydrogen and oxygen is a bit less than 100 pm—that’s 100 picometers, or 100 × 10⁻¹² meters, or one angstrom (Å) in chemist’s units:


In ordinary ice, there are also weaker bonds called hydrogen bonds that attach neighboring water molecules. These bonds are a bit longer, as shown in this picture by Stephen Lower, who also drew that great picture of ice:

But the distance between the two central oxygens in H₁₃O₆⁺ is about 2.57 angstroms, or twice 1.28:


Stoyanov, Stoyanova and Reed put the exclamation mark here. I guess the big distance came as a big surprise!

I should emphasize that this work is new and still controversial. There’s some evidence, which I don’t understand, that 20 is a ‘magic number’: a lone proton is happiest when accompanied by 20 water molecules, forming H⁺(H₂O)₂₀. One possibility is that the proton is surrounded by a symmetrical cage of 20 water molecules shaped like a dodecahedron! But in 2005, a team of scientists did computer simulations and arrived at a different geometry, like this:

This is not symmetrical: there’s a Zundel cation highlighted at right, together with 20 water molecules.

Of course, in reality a number of different structures may predominate, in a rapidly changing and random way. Computer simulations should eventually let us figure this out. We’ve known the relevant laws of nature for over 80 years. But running them on a computer is not easy! Kieron Taylor did his PhD work on simulating water, and he wrote:

It’s a most vexatious substance to simulate in useful time scales. Including the proton exchange or even flexible multipoles requires immense computation.

It would be very interesting if the computational complexity of water were higher, in some precise sense, than that of many other liquids. It’s weird in other ways. It takes a lot of energy to heat water, it expands when it freezes, and its molecules have a large ‘dipole moment’—meaning the electric charge is distributed in a very lopsided way, thanks to the ‘Mickey Mouse’ way the two H’s are attached to the O.

I’ve been talking about the fate of the H⁺ when a water molecule splits into H⁺ and OH⁻. I should add that in heavy water, H⁺ could be something other than a lone proton. It could be a deuteron: a proton and a neutron stuck together. Or it could be a triton: a proton and two neutrons. For this reason, while most chemists call H⁺ simply a ‘proton’, the pedantically precise ones call it a hydron, which covers all the possibilities!

But what about the OH⁻? This is called a hydroxide ion:


But this, too, attracts other water molecules. First it grabs one and forms a bihydroxide ion, which is a chain like this:

H—O—H—O—H

with chemical formula H₃O₂⁻. And then the bihydroxide ion attracts other water molecules, perhaps like this:


Again, this is a guess—and certainly a simplified picture of a dynamic, quantum-mechanical system.

References and digressions

For more, see:

• Evgenii S. Stoyanov, Irina V. Stoyanova, Christopher A. Reed, The unique nature of H⁺ in water, Chemical Science 2 (2011), 462–472.

Abstract: The H⁺(aq) ion in ionized strong aqueous acids is an unexpectedly unique H₁₃O₆⁺ entity, unlike those in gas phase H⁺(H₂O)n clusters or typical crystalline acid hydrates. IR spectroscopy indicates that the core structure has neither H₉O₄⁺ Eigen-like nor typical H₅O₂⁺ Zundel-like character. Rather, extensive delocalization of the positive charge leads to a H₁₃O₆⁺ ion having an unexpectedly long central OO separation of 2.57 Å and four conjugated OO separations of 2.7 Å. These dimensions are in conflict with the shorter OO separations found in structures calculated by theory. Ultrafast dynamic properties of the five H atoms involved in these H-bonds lead to a substantial collapse of normal IR vibrations and the appearance of a continuous broad absorption (cba) across the entire IR spectrum. This cba is distinguishable from the broad IR bands associated with typical low-barrier H-bonds. The solvation shell outside of the H₁₃O₆⁺ ion defines the boundary of positive charge delocalization. At low acid concentrations, the H₁₃O₆⁺ ion is a constituent part of an ion pair that has contact with the first hydration shell of the conjugate base anion. At higher concentrations, or with weaker acids, one or two H₂O molecules of H₁₃O₆⁺ cation are shared with the hydration shell of the anion. Even the strongest acids show evidence of ion pairing.

Unfortunately this paper is not free, and my university doesn’t even subscribe to this journal. But I just discovered that Evgenii Stoyanov and Irina Stoyanova are here at U. C. Riverside! So, I may ask them some questions.

This picture:

came from here:

• Srinivasan S. Iyengar, Matt K. Petersen, Tyler J. F. Day, Christian J. Burnham, Virginia E. Teige and Gregory A. Voth, The properties of ion-water clusters. I. The protonated 21-water cluster, J. Chem. Phys. 123 (2005), 084309.

Abstract. The ab initio atom-centered density-matrix propagation approach and the multistate empirical valence bond method have been employed to study the structure, dynamics, and rovibrational spectrum of a hydrated proton in the “magic” 21 water cluster. In addition to the conclusion that the hydrated proton tends to reside on the surface of the cluster, with the lone pair on the protonated oxygen pointing “outwards,” it is also found that dynamical effects play an important role in determining the vibrational properties of such clusters. This result is used to analyze and complement recent experimental and theoretical studies.

This paper is free online! We live in a semi-barbaric age where science is probing the finest details of matter, space and time—but many of the discoveries, paid for by taxes levied on the hard-working poor, are snatched, hidden, and sold by profiteers. Luckily, a revolution is afoot…

There are other things in ‘pure water’ beside what I’ve mentioned. For example, there are some lone electrons! Since these are light, quantum mechanics says their probability cloud spreads out to be quite big. This picture by Michael Tauber shows what you should imagine:

He says:

Schematic representation of molecules in the first and second coordination shells around the solvated electron. First shell molecules are shown hydrogen bonded to the electron. Hydrogen bonds between molecules of 1st and 2nd shells are disrupted.


Monarch Butterflies

25 November, 2013


Have you ever seen one of these? It’s a Monarch Butterfly. Every spring, millions fly from Mexico and southern California to other parts of the US and southern Canada. And every autumn, they fly back. On the first of November, called the Day of the Dead, people celebrate the return of the monarchs to the mountainous fir forests of Central Mexico.

But their numbers are dropping. In 1997, there were 150 million. Last year there were only 60 million. One problem is the gradual sterilization of American farmlands thanks to powerful herbicides like Roundup. Monarch butterfly larvae eat a plant called milkweed. But the amount of this plant in Iowa, for example, has dropped by 60% to 90% over the last decade.

And this year was much worse for the monarchs. They came late to Mexico… and I think only 3 million have been seen so far! That’s a stunning decrease!

Some blame the intense drought that hit the US in recent years—the sort of drought we can expect to become more frequent as global warming proceeds.



Earlier this year, Michael Risnit wrote this in USA Today:

Illegal logging in the Mexican forests where they spend the winter, new climate patterns and the disappearance of milkweed—the only plant on which monarchs lay their eggs and on which their caterpillars feed—are being blamed for their shrinking numbers.

Brooke Beebe, former director of the Native Plant Center at Westchester Community College in Valhalla, N.Y., collects monarch eggs, raises them from caterpillar to butterfly and releases them.

“I do that when they’re here. They’re not here,” she said.

The alarm over disappearing monarchs intensified this spring when conservation organizations reported that the amount of Mexican forest the butterflies occupied was at its lowest in 20 years. The World Wildlife Fund, in partnership with a Mexican wireless company and Mexico’s National Commission of Protected Areas, found nine hibernating colonies occupied almost 3 acres during the 2012-13 winter, a 59% decrease from the previous winter.


Because the insects can’t be counted individually, the colonies’ total size is used. Almost 20 years ago, the colonies covered about 45 acres. A couple of acres contains millions of monarchs.

“The monarch population is pretty strong, except it’s not as strong as it used to be and we find out it keeps getting smaller and smaller,” said Travis Brady, the education director at the Greenburgh Nature Center here.

Monarchs arrived at the nature center later this year and in fewer numbers, Brady said.

The nature center’s butterfly house this summer was aflutter with red admirals, giant swallowtails, painted ladies and monarchs, among others. But the last were difficult to obtain because collectors supplying the center had trouble finding monarch eggs in the wild, he said.

No one is suggesting monarchs will become extinct. The concern is whether the annual migration will remain sustainable, said Jeffrey Glassberg, the North American Butterfly Association’s president.

The record low shouldn’t set off a panic, said Marianna T. Wright, executive director of the National Butterfly Center in Texas, a project of the butterfly association.

“It should certainly get some attention,” she said. “I do think the disappearance of milkweed nationwide needs to be addressed. If you want to have monarchs, you have to have milkweed.”

Milkweed is often not part of suburban landscape, succumbing to lawn mowers and weed whackers, monarch advocates point out. Without it, monarch eggs aren’t laid and monarch caterpillars can’t feed and develop into winged adults.

“Many people know milkweed, and many people like it,” said Brady at the nature center. “And a lot of people actively try to destroy it. The health of the monarch population is solely dependent on the milkweed plant.”

The widespread use of herbicide-resistant corn and soybeans, which has resulted in the loss of more than 80 million acres of monarch habitat in recent years, also threatens the plant, according to the website Monarch Watch. In spraying fields to eradicate unwanted plants, Midwest farmers also eliminate butterflies’ habitat.

The 2012 drought and wildfires in Texas also made butterfly life difficult. All monarchs heading to or from the eastern two-thirds of the country pass through the state.


So—check out Monarch Watch! Plant some milkweed and make your yard insect-friendly in other ways… like mine!

I may seem like a math nerd, but I’m out there every weekend gardening. My wife Lisa is the real driving force behind this operation, but I’ve learned to love working with plants, soil, and compost. The best thing we ever did is tear out the lawn. Lawns are boring, let native plants flourish! Even if you don’t like insects, birds eat them, and you’ve gotta like birds. Let the beauty of nature start right where you live.


Quantropy (Part 4)

11 November, 2013

There’s a new paper on the arXiv:

• John Baez and Blake Pollard, Quantropy.

Blake is a physics grad student at U. C. Riverside who plans to do his thesis with me.

If you have carefully read all my previous posts on quantropy (Part 1, Part 2 and Part 3), there’s only a little new stuff here. But still, it’s better organized, and less chatty.

And in fact, Blake came up with a lot of new stuff for this paper! He studied the quantropy of the harmonic oscillator, and tweaked the analogy between statistical mechanics and quantum mechanics in an interesting way. Unfortunately, we needed to put a version of this paper on the arXiv by a deadline, and our writeup of this new work wasn’t quite ready (my fault). So, we’ll put that other stuff in a new version—or, I’m thinking now, a separate paper.

But here are two new things.

First, putting this paper on the arXiv had the usual good effect of revealing some existing work on the same topic. Joakim Munkhammar emailed me and pointed out this paper, which is free online:

• Joakim Munkhammar, Canonical relational quantum mechanics from information theory, Electronic Journal of Theoretical Physics 8 (2011), 93–108.

You’ll see it cites Garrett Lisi’s paper and pushes forward in various directions. There seems to be a typo where he writes the path integral

Z = \displaystyle{ \int e^{-\alpha S(q) } D q}

and says

In order to fit the purpose Lisi concluded that the Lagrange multiplier value \alpha \equiv 1/i \hbar. In similarity with Lisi’s approach we shall also assume that the arbitrary scaling-part of the constant \alpha is in fact 1/\hbar.

I’m pretty sure he means 1/i\hbar, given what he writes later. However, he speaks of ‘maximizing entropy’, which is not quite right for a complex-valued quantity; Blake and I prefer to give this new quantity a new name, and speak of ‘finding a stationary point of quantropy’.

But in a way these are small issues; being a mathematician, I’m more quick to spot tiny technical defects than to absorb significant new ideas, and it will take a while to really understand Munkhammar’s paper.

Second, while writing our paper, Blake and I noticed another similarity between the partition function of a classical ideal gas and the partition function of a quantum free particle. Both are given by an integral like this:

Z = \displaystyle{\int e^{-\alpha S(q) } D q }

where S is a quadratic function of q \in \mathbb{R}^n. Here n is the number of degrees of freedom for the particles in the ideal gas, or the number of time steps for a free particle on a line (where we are discretizing time). The only big difference is that

\alpha = 1/kT

for the ideal gas, but

\alpha = 1/i \hbar

for the free particle.

In both cases there’s an ambiguity in the answer! The reason is that to do this integral, we need to pick a measure D q. The obvious guess is Lebesgue measure

dq = dq_1 \cdots dq_n

on \mathbb{R}^n. But this can’t be right, on physical grounds!

The reason is that the partition function Z needs to be dimensionless, but d q has units. To correct this, we need to divide dq by some dimensionful quantity to get D q.

In the case of the ideal gas, this dimensionful quantity involves the ‘thermal de Broglie wavelength’ of the particles in the gas. And this brings Planck’s constant into the game, even though we’re not doing quantum mechanics: we’re studying the statistical mechanics of a classical ideal gas!

That’s weird and interesting. It’s not the only place where we see that classical statistical mechanics is incomplete or inconsistent, and we need to introduce some ideas from quantum physics to get sensible answers. The most famous one is the ultraviolet catastrophe. What are all the rest?

In the case of the free particle, we need to divide by a quantity with dimensions of \mathrm{length}^n to make

dq = dq_1 \cdots dq_n

dimensionless, since each dq_i has dimensions of length. The easiest way is to introduce a length scale \Delta x and divide each dq_i by that. This is commonly done when people study the free particle. This length scale drops out of the final answer for the questions people usually care about… but not for the quantropy.

Similarly, Planck’s constant drops out of the final answer for some questions about the classical ideal gas, but not for its entropy!
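Here’s a toy version of this, a sketch of my own done in the ‘Euclidean’ setting where \alpha is real, so the Gaussian integral converges. The quadratic form below is a discretized free-particle action with endpoints pinned at zero. The point is that \ln Z shifts by n \ln \Delta x when you change the length scale, while an expectation value like \langle S \rangle = -\partial \ln Z / \partial \alpha doesn’t notice \Delta x at all:

```python
import numpy as np

def log_Z(alpha, A, dx):
    """ln Z for Z = (1/dx^n) * integral of exp(-alpha q.A.q / 2) d^n q,
    with A positive definite (a Gaussian integral, done exactly)."""
    n = A.shape[0]
    _, logdet = np.linalg.slogdet(A)
    return 0.5 * n * np.log(2 * np.pi / alpha) - 0.5 * logdet - n * np.log(dx)

# Discretized free particle: S = (1/2) sum (q_{i+1} - q_i)^2 with q_0 = q_{n+1} = 0,
# i.e. S = q.A.q / 2 for a tridiagonal A (units with m = dt = 1).
n = 10
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# ln Z depends on the arbitrary length scale dx...
print(log_Z(1.0, A, 0.1) - log_Z(1.0, A, 1.0))  # n * ln(10), about 23.03

# ...but <S> = -d(ln Z)/d(alpha) does not:
eps = 1e-6
for dx in (1.0, 0.1):
    print(-(log_Z(1.0 + eps, A, dx) - log_Z(1.0, A, dx)) / eps)  # n/2 = 5.0 both times
```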

So there’s an interesting question here, about what this new length scale \Delta x means, if anything. One might argue that quantropy is a bad idea, and the need for this new length scale to make it unambiguous is just proof of that. However, the mathematical analogy to quantum mechanics is so precise that I think it’s worth going a bit further out on this limb, and thinking a bit more about what’s going on.

Some weird sort of déjà vu phenomenon seems to be going on. Once upon a time, people tried to calculate the partition functions of classical systems. They discovered they were infinite or ambiguous until they introduced Planck’s constant, and eventually quantum mechanics. Then Feynman introduced the path integral approach to quantum mechanics. In this approach one is again computing partition functions, but now with a new meaning, and with complex rather than real exponentials. But these partition functions are again infinite or ambiguous… for very similar mathematical reasons! And at least in some cases, we can remove the ambiguity using the same trick as before: introducing a new constant. But then… what?

Are we stuck in an infinite loop here? What, if anything, is the meaning of this ‘second Planck’s constant’? Does this have anything to do with second quantization? (I don’t see how, but I can’t resist asking.)


Entropy and Information in Biological Systems (Part 1)

2 November, 2013

John Harte is an ecologist who uses maximum entropy methods to predict the distribution, abundance and energy usage of species. Marc Harper uses information theory in bioinformatics and evolutionary game theory. Harper, Harte and I are organizing a workshop on entropy and information in biological systems, and I’m really excited about it!

It’ll take place at the National Institute for Mathematical and Biological Synthesis in Knoxville, Tennessee. We are scheduling it for Wednesday-Friday, April 8-10, 2015. When the date gets confirmed, I’ll post an advertisement so you can apply to attend.

Writing the proposal was fun, because we got to pull together lots of interesting people who are applying information theory and entropy to biology in quite different ways. So, here it is!

Proposal

Ever since Shannon initiated research on information theory in 1948, there have been hopes that the concept of information could serve as a tool to help systematize and unify work in biology. The link between information and entropy was noted very early on, and it suggested that a full thermodynamic understanding of biology would naturally involve the information processing and storage that are characteristic of living organisms. However, the subject is full of conceptual pitfalls for the unwary, and progress has been slower than initially expected. Premature attempts at ‘grand syntheses’ have often misfired. But applications of information theory and entropy to specific highly focused topics in biology have been increasingly successful, such as:

• the maximum entropy principle in ecology,
• Shannon and Rényi entropies as measures of biodiversity,
• information theory in evolutionary game theory,
• information and the thermodynamics of individual cells.

Because they work in diverse fields, researchers in these specific topics have had little opportunity to trade insights and take stock of the progress so far. The aim of the workshop is to do just this.

In what follows, participants’ names are in boldface, while the main goals of the workshop are in italics.

Roderick Dewar is a key advocate of the principle of Maximum Entropy Production, which says that biological systems—and indeed all open, non-equilibrium systems—act to produce entropy at the maximum rate. Along with others, he has applied this principle to make testable predictions in a wide range of biological systems, from ATP synthesis [DJZ2006] to respiration and photosynthesis of individual plants [D2010] and plant communities. He has also sought to derive this principle from ideas in statistical mechanics [D2004, D2009], but it remains controversial.

The first goal of this workshop is to study the validity of this principle.

While they may be related, the principle of Maximum Entropy Production should not be confused with the MaxEnt inference procedure, which says that we should choose the probabilistic hypothesis with the highest entropy subject to the constraints provided by our data. MaxEnt was first explicitly advocated by Jaynes. He noted that it is already implicit in the procedures of statistical mechanics, but convincingly argued that it can also be applied to situations where entropy is more ‘informational’ than ‘thermodynamic’ in character.
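To make the MaxEnt procedure concrete, here is a minimal sketch, an illustrative toy example in the spirit of Jaynes’ famous dice problem rather than something drawn from the references below. Given the constraint that a die’s average roll is 4.5, MaxEnt picks the Gibbs distribution p_i \propto e^{-\beta i}, and we solve for the multiplier \beta numerically:

```python
import numpy as np
from scipy.optimize import brentq

x = np.arange(1, 7)                 # faces of a die

def mean_at(beta):
    """Mean of the Gibbs distribution p_i proportional to exp(-beta * i)."""
    w = np.exp(-beta * x)
    return np.dot(x, w) / w.sum()

target = 4.5                        # the observed average roll
beta = brentq(lambda b: mean_at(b) - target, -10, 10)
p = np.exp(-beta * x)
p /= p.sum()
print(p)                            # tilted toward the high faces
print(-(p * np.log(p)).sum())       # the maximum entropy given the constraint
```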

Recently John Harte has applied MaxEnt in this way to ecology, using it to make specific testable predictions for the distribution, abundance and energy usage of species across spatial scales and across habitats and taxonomic groups [Harte2008, Harte2009, Harte2011]. Annette Ostling is an expert on other theories that attempt to explain the same data, such as the ‘neutral model’ [AOE2008, ODLSG2009, O2005, O2012]. Dewar has also used MaxEnt in ecology [D2008], and he has argued that it underlies the principle of Maximum Entropy Production.

Thus, a second goal of this workshop is to familiarize all the participants with applications of the MaxEnt method to ecology, compare it with competing approaches, and study whether MaxEnt provides a sufficient justification for the principle of Maximum Entropy Production.

Entropy is not merely a predictive tool in ecology: it is also widely used as a measure of biodiversity. Here Shannon’s original concept of entropy naturally generalizes to ‘Rényi entropy’, which depends on a parameter \alpha \ge 0. This equals

\displaystyle{ H_\alpha(p) = \frac{1}{1-\alpha} \log \sum_i p_i^\alpha  }

where p_i is the fraction of organisms of the ith type (which could mean species, some other taxon, etc.). In the limit \alpha \to 1 this reduces to the Shannon entropy:

\displaystyle{  H(p) = - \sum_i p_i \log p_i }

As \alpha increases, we give less weight to rare types of organisms. Christina Cobbold and Tom Leinster have described a systematic and highly flexible system of biodiversity measurement, with Rényi entropy at its heart [CL2012]. They consider both the case where all we have are the numbers p_i, and the more subtle case where we take the distance between different types of organisms into account.
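As a small illustration, here is a sketch of my own (Cobbold and Leinster’s actual framework is more refined, since it can take the distances between types into account):

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Renyi entropy H_alpha(p); the alpha -> 1 limit is the Shannon entropy."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                        # discard types with no organisms
    if np.isclose(alpha, 1.0):
        return -(p * np.log(p)).sum()   # Shannon limit
    return np.log((p ** alpha).sum()) / (1.0 - alpha)

# A community dominated by one species: raising alpha discounts the rare types.
p = [0.7, 0.1, 0.1, 0.05, 0.05]
for a in (0.0, 0.5, 1.0, 2.0):
    print(a, renyi_entropy(p, a))       # alpha = 0 just gives log(# of types)
```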

John Baez has explained the role of Rényi entropy in thermodynamics [B2011], and together with Tom Leinster and Tobias Fritz he has proved other theorems characterizing entropy which explain its importance for information processing [BFL2011]. However, these ideas have not yet been connected to the widespread use of entropy in biodiversity studies. More importantly, the use of entropy as a measure of biodiversity has not been clearly connected to MaxEnt methods in ecology. Does the success of MaxEnt methods imply a tendency for ecosystems to maximize biodiversity subject to the constraints of resource availability? This seems surprising, but a more nuanced statement along these general lines might be correct.

So, a third goal of this workshop is to clarify relations between known characterizations of entropy, the use of entropy as a measure of biodiversity, and the use of MaxEnt methods in ecology.

As the amount of data to analyze in genomics continues to surpass the ability of humans to analyze it, we can expect automated experiment design to become ever more important. In Chris Lee and Marc Harper’s RoboMendel program [LH2013], a mathematically precise concept of ‘potential information’—how much information is left to learn—plays a crucial role in deciding what experiment to do next, given the data obtained so far. It will be useful for them to interact with William Bialek, who has expertise in estimating entropy from empirical data and using it to constrain properties of models [BBS, BNS2001, BNS2002], and Susanne Still, who applies information theory to automated theory building and biology [CES2010, PS2012].

However, there is another link between biology and potential information. Harper has noted that in an ecosystem where the population of each type of organism grows at a rate proportional to its fitness (which may depend on the fraction of organisms of each type), the quantity

\displaystyle{ I(q||p) = \sum_i q_i \ln(q_i/p_i) }

always decreases if there is an evolutionarily stable state [Harper2009]. Here p_i is the fraction of organisms of the ith genotype at a given time, while q_i is this fraction in the evolutionarily stable state. This quantity is often called the Shannon information of q ‘relative to’ p. But in fact, it is precisely the same as Lee and Harper’s potential information! Indeed, there is a precise mathematical analogy between evolutionary games and processes where a probabilistic hypothesis is refined by repeated experiments.
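Here is a toy illustration of this decrease, a minimal sketch with a game of my own choosing, not taken from Harper’s paper. For the ‘stable’ game with fitness f(p) = -p, the evolutionarily stable state q is uniform, and I(q||p) shrinks along replicator trajectories:

```python
import numpy as np

def relative_info(q, p):
    """I(q||p) = sum_i q_i ln(q_i / p_i)."""
    return np.sum(q * np.log(q / p))

n = 4
q = np.full(n, 1.0 / n)              # the evolutionarily stable state
p = np.array([0.7, 0.1, 0.1, 0.1])   # initial population fractions
dt = 0.01
for step in range(2001):
    if step % 500 == 0:
        print(step, relative_info(q, p))   # decreases monotonically toward 0
    f = -p                                 # fitness of each type
    p = p + dt * p * (f - np.dot(p, f))    # replicator dynamics (Euler step)
    p /= p.sum()                           # guard against numerical drift
```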

Thus, a fourth goal of this workshop is to develop the concept of evolutionary games as ‘learning’ processes in which information is gained over time.

We shall try to synthesize this with Carl Bergstrom and Matina Donaldson-Matasci’s work on the ‘fitness value of information’: a measure of how much increase in fitness a population can obtain per bit of extra information [BL2004, DBL2010, DM2013]. Following Harper, we shall consider not only relative Shannon entropy, but also relative Rényi entropy, as a measure of information gain [Harper2011].

A fifth and final goal of this workshop is to study the interplay between information theory and the thermodynamics of individual cells and organelles.

Susanne Still has studied the thermodynamics of prediction in biological systems [BCSS2012]. And in a celebrated related piece of work, Jeremy England used thermodynamic arguments to derive a lower bound for the amount of entropy generated during a process of self-replication of a bacterial cell [England2013]. Interestingly, he showed that E. coli comes within a factor of 3 of this lower bound.

In short, information theory and entropy methods are becoming powerful tools in biology, from the level of individual cells, to whole ecosystems, to experimental design, model-building, and the measurement of biodiversity. The time is ripe for an investigative workshop that brings together experts from different fields and lets them share insights and methods and begin to tackle some of the big remaining questions.

Bibliography

[AOE2008] D. Alonso, A. Ostling and R. Etienne, The assumption of symmetry and species abundance distributions, Ecology Letters 11 (2008), 93–105.

[TMMABB2012] D. Amodei, W. Bialek, M. J. Berry II, O. Marre, T. Mora, and G. Tkacik, The simplest maximum entropy model for collective behavior in a neural network, arXiv:1207.6319 (2012).

[B2011] J. Baez, Rényi entropy and free energy, arXiv:1102.2098 (2011).

[BFL2011] J. Baez, T. Fritz and T. Leinster, A characterization of entropy in terms of information loss, Entropy 13 (2011), 1945–1957.

[BS2012] J. Baez and M. Stay, Algorithmic thermodynamics, Math. Struct. Comp. Sci. 22 (2012), 771–787.

[BCSS2012] A. J. Bell, G. E. Crooks, S. Still and D. A Sivak, The thermodynamics of prediction, Phys. Rev. Lett. 109 (2012), 120604.

[BL2004] C. T. Bergstrom and M. Lachmann, Shannon information and biological fitness, in IEEE Information Theory Workshop 2004, IEEE, 2004, pp. 50-54.

[BBS] M. J. Berry II, W. Bialek and E. Schneidman, An information theoretic approach to the functional classification of neurons, in Advances in Neural Information Processing Systems 15, MIT Press, 2005.

[BNS2001] W. Bialek, I. Nemenman and N. Tishby, Predictability, complexity and learning, Neural Computation 13 (2001), 2409–2463.

[BNS2002] W. Bialek, I. Nemenman and F. Shafee, Entropy and inference, revisited, in Advances in Neural Information Processing Systems 14, MIT Press, 2002.

[CL2012] C. Cobbold and T. Leinster, Measuring diversity: the importance of species similarity, Ecology 93 (2012), 477–489.

[CES2010] J. P. Crutchfield, S. Still and C. Ellison, Optimal causal inference: estimating stored information and approximating causal architecture, Chaos 20 (2010), 037111.

[D2004] R. C. Dewar, Maximum entropy production and non-equilibrium statistical mechanics, in Non-Equilibrium Thermodynamics and Entropy Production: Life, Earth and Beyond, eds. A. Kleidon and R. Lorenz, Springer, New York, 2004, 41–55.

[DJZ2006] R. C. Dewar, D. Juretić, P. Županović, The functional design of the rotary enzyme ATP synthase is consistent with maximum entropy production, Chem. Phys. Lett. 430 (2006), 177–182.

[D2008] R. C. Dewar, A. Porté, Statistical mechanics unifies different ecological patterns, J. Theor. Bio. 251 (2008), 389–403.

[D2009] R. C. Dewar, Maximum entropy production as an inference algorithm that translates physical assumptions into macroscopic predictions: don’t shoot the messenger, Entropy 11 (2009), 931–944.

[D2010] R. C. Dewar, Maximum entropy production and plant optimization theories, Phil. Trans. Roy. Soc. B 365 (2010) 1429–1435.

[DBL2010] M. C. Donaldson-Matasci, C. T. Bergstrom and M. Lachmann, The fitness value of information, Oikos 119 (2010), 219–230.

[DM2013] M. C. Donaldson-Matasci, G. DeGrandi-Hoffman, and A. Dornhaus, Bigger is better: honey bee colonies as distributed information-gathering systems, Animal Behaviour 85 (2013), 585–592.

[England2013] J. L. England, Statistical physics of self-replication, J. Chem. Phys. 139 (2013), 121923.

[ODLSG2009] J. L. Green, J. K. Lake, J. P. O’Dwyer, A. Ostling and V. M. Savage, An integrative framework for stochastic, size-structured community assembly, PNAS 106 (2009), 6170–6175.

[Harper2009] M. Harper, Information geometry and evolutionary game theory, arXiv:0911.1383 (2009).

[Harper2011] M. Harper, Escort evolutionary game theory, Physica D 240 (2011), 1411–1415.

[Harte2008] J. Harte, T. Zillio, E. Conlisk and A. Smith, Maximum entropy and the state-variable approach to macroecology, Ecology 89 (2008), 2700–2711.

[Harte2009] J. Harte, A. Smith and D. Storch, Biodiversity scales from plots to biomes with a universal species-area curve, Ecology Letters 12 (2009), 789–797.

[Harte2011] J. Harte, Maximum Entropy and Ecology: A Theory of Abundance, Distribution, and Energetics, Oxford U. Press, Oxford, 2011.

[LH2013] M. Harper and C. Lee, Basic experiment planning via information metrics: the RoboMendel problem, arXiv:1210.4808 (2012).

[O2005] A. Ostling, Neutral theory tested by birds, Nature 436 (2005), 635.

[O2012] A. Ostling, Do fitness-equalizing tradeoffs lead to neutral communities?, Theoretical Ecology 5 (2012), 181–194.

[PS2012] D. Precup and S. Still, An information-theoretic approach to curiosity-driven reinforcement learning, Theory in Biosciences 131 (2012), 139–148.

