Here’s the idea. Everyone likes to say that biology is all about information. There’s something true about this—just think about DNA. But what does this insight actually do for us, quantitatively speaking? To figure this out, we need to do some work.

Biology is also about things that make copies of themselves. So it makes sense to figure out how information theory is connected to the replicator equation—a simple model of population dynamics for self-replicating entities.

To see the connection, we need to use ‘relative information’: the information of one probability distribution *relative to another*, also known as the Kullback–Leibler divergence. Then everything pops into sharp focus.
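To make ‘relative information’ concrete, here is a minimal sketch of the standard Kullback–Leibler formula, the sum of p_i log(p_i / q_i), for two finite probability distributions. The function name `relative_information` and the choice of natural logarithms (so the answer is in nats) are my own assumptions for illustration, not something taken from the talk.

```python
import math


def relative_information(p, q):
    """Information of distribution p relative to q (Kullback-Leibler
    divergence), in nats. p and q are sequences of probabilities over
    the same finite set of outcomes."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # by convention, 0 * log(0 / q) counts as 0
        if q_i == 0:
            return math.inf  # p allows an outcome that q rules out
        total += p_i * math.log(p_i / q_i)
    return total


# A distribution carries no information relative to itself.
print(relative_information([0.5, 0.5], [0.5, 0.5]))  # 0.0
```

Note that the quantity is not symmetric: swapping the two arguments generally changes the answer.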

It turns out that free energy (energy in forms that can actually be *used*, not just waste heat) is a special case of relative information. Since the decrease of free energy is what drives chemical reactions, biochemistry is founded on relative information.

But there’s a lot more to it than this! Using relative information we can also see evolution as a learning process, fix the problems with Fisher’s fundamental theorem of natural selection, and more.

So this is what I’ll talk about! You can see my slides here:

• John Baez, Biology as information dynamics.

but my talk will be videotaped, and it’ll eventually be put here:

• Stanford complexity group, YouTube.

You can already see lots of cool talks at this location!

]]>

A peptide is basically a small protein: a chain made of fewer than 50 amino acids. If you plot the number of peptides of different masses found in various organisms, you see peculiar oscillations:

These oscillations have a period of about 14 daltons, where a ‘dalton’ is roughly the mass of a hydrogen atom, or more precisely, 1/12 the mass of a carbon-12 atom.

Biologists had noticed these oscillations in databases of peptide masses. But they didn’t understand them.

Can you figure out what causes these oscillations?

It’s a math puzzle, actually.

Next I’ll give you the answer, so stop looking if you want to think about it first.

Almost all peptides are made of 20 different amino acids, which have different masses, which are almost integers. So, to a reasonably good approximation, the puzzle amounts to this: if you have 20 natural numbers $m_1, \dots, m_{20}$, how many ways can you write any natural number $N$ as a finite ordered sum of these numbers? Call it $F(N)$ and graph it. It oscillates! Why?

(We count *ordered* sums because the amino acids are stuck together in a linear way to form a protein.)

There’s a well-known way to write down a formula for $F(N)$. It obeys a linear recurrence:

$$F(N) = F(N - m_1) + F(N - m_2) + \cdots + F(N - m_{20})$$

and we can solve this using the ansatz

$$F(N) = x^N$$

Then the recurrence relation will hold if

$$x^N = x^{N - m_1} + x^{N - m_2} + \cdots + x^{N - m_{20}}$$

for all $N$. But this is fairly easy to achieve! If $m_{20}$ is the biggest mass, we just need this polynomial equation to hold:

$$x^{m_{20}} = x^{m_{20} - m_1} + x^{m_{20} - m_2} + \cdots + 1$$

There will be a bunch of solutions, about $m_{20}$ of them. (If there are repeated roots things get a bit more subtle, but let’s not worry about that.) To get the actual formula for $F(N)$ we need to find the right linear combination of functions $x^N$ where $x$ ranges over all the roots. That takes some work. Craciun and his collaborator Shane Hubler did that work.

But we can get a pretty good understanding with a lot less work. In particular, the root $x$ with the largest magnitude will make $x^N$ grow the fastest.
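If you’d like to see the numbers rather than the formula, the recurrence can be run directly. This is only a sketch: the function name `count_ordered_sums` is mine, I take the empty sum as the single way to write 0, and the mass list is the rounded one quoted later in this post, with the repeated masses kept so that each amino acid counts separately.

```python
def count_ordered_sums(masses, n_max):
    """Return [F(0), ..., F(n_max)], where F(N) counts the ordered sums of
    entries of `masses` (repetition allowed) that add up to N."""
    counts = [0] * (n_max + 1)
    counts[0] = 1  # assumption: the empty sum is the one way to reach 0
    for n in range(1, n_max + 1):
        # The last term of the sum is some mass m, leaving n - m to fill.
        counts[n] = sum(counts[n - m] for m in masses if m <= n)
    return counts


# Toy case with just two masses, 1 and 2.
print(count_ordered_sums([1, 2], 6))  # [1, 1, 2, 3, 5, 8, 13]

# Rounded amino acid masses from this post (repeats kept on purpose).
AMINO_ACID_MASSES = [57, 71, 87, 97, 99, 101, 103, 113, 113, 114, 115,
                     128, 128, 129, 131, 137, 147, 156, 163, 186]
peptide_counts = count_ordered_sums(AMINO_ACID_MASSES, 1000)
```

Plotting `peptide_counts` on a logarithmic scale should make both the overall growth and the wobble around it visible.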

If you haven’t thought about this sort of recurrence relation it’s good to look at the simplest case, where we just have two masses $m_1 = 1$ and $m_2 = 2$. Then the numbers $F(N)$, which count the ordered sums adding up to $N$, are the Fibonacci numbers. I hope you know this: with the usual indexing $1, 1, 2, 3, 5, \dots$, the $(N+1)$st Fibonacci number is the number of ways to write $N$ as the sum of an ordered list of 1’s and 2’s!

1

1+1, 2

1+1+1, 1+2, 2+1

1+1+1+1, 1+1+2, 1+2+1, 2+1+1, 2+2

If I drew edges between these sums in the right way, forming a ‘family tree’, you’d see the connection to Fibonacci’s original rabbit puzzle.

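In this example the recurrence gives the polynomial equation

$$x^2 = x + 1$$

and the root with largest magnitude is the golden ratio:

$$\Phi = \frac{1 + \sqrt{5}}{2} \approx 1.618$$

The other root is

$$1 - \Phi = \frac{1 - \sqrt{5}}{2} \approx -0.618$$

With a little more work you get an explicit formula for the Fibonacci numbers in terms of the golden ratio: counting from $1, 1, 2, 3, 5, \dots$, the $n$th Fibonacci number is

$$\frac{\Phi^n - (1 - \Phi)^n}{\sqrt{5}}$$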

But right now I’m more interested in the qualitative aspects! In this example both roots are real. The example from biology is different.

**Puzzle 1.** For which lists of natural numbers $m_1 < m_2 < \cdots < m_k$ are all the roots of

$$x^{m_k} = x^{m_k - m_1} + x^{m_k - m_2} + \cdots + 1$$

real?

I don’t know the answer. But apparently this kind of polynomial equation always has one root with the largest possible magnitude, which is real and has multiplicity one. I think it turns out that the number of ordered sums $F(N)$ is asymptotically proportional to $x^N$ where $x$ is this root.
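Here is a small sketch for finding that real root numerically. Dividing the polynomial equation by its highest power of $x$ turns it into the condition that the sum of $x^{-m}$ over all the masses $m$ equals 1, and the left side decreases steadily as $x$ grows past 1, so plain bisection finds the positive root. The function name `dominant_root`, the tolerance and the bracketing interval are my own choices, and the code assumes rather than proves that this positive root is the one with largest magnitude.

```python
def dominant_root(masses, tol=1e-12):
    """Positive real x with sum(x**(-m) for m in masses) == 1,
    found by bisection. Assumes at least two masses, all >= 1."""
    def excess(x):
        return sum(x ** (-m) for m in masses) - 1.0

    # excess(1) = len(masses) - 1 > 0, and at x = len(masses) + 1 every
    # term is at most 1 / (len(masses) + 1), so excess is negative there.
    lo, hi = 1.0, float(len(masses) + 1)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            lo = mid  # sum still too big, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


# With masses 1 and 2 this should come out close to the golden ratio.
print(dominant_root([1, 2]))
```

Feeding in the amino acid masses instead gives a root only slightly bigger than 1, which matches the fairly slow exponential growth of the peptide counts.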

But in the case that’s relevant to biology, there’s also a pair of roots with the *second* largest magnitude, which are *not* real: they’re complex conjugates of each other. And these give rise to the oscillations!

For the masses of the 20 amino acids most common in life, the roots look like this:

The aqua root at right has the largest magnitude and gives the dominant contribution to the exponential growth of $F(N)$. The red roots have the second largest magnitude. These give the main oscillations in $F(N)$, which have period 14.28.

For the full story, read this:

• Shane Hubler and Gheorghe Craciun, Periodic patterns in distributions of peptide masses, *BioSystems* **109** (2012), 179–185.

Most of the pictures here are from this paper.

My main question is this:

**Puzzle 2.** Suppose we take many lists of natural numbers $m_1 < m_2 < \cdots < m_k$ and draw all the roots of the equations

$$x^{m_k} = x^{m_k - m_1} + x^{m_k - m_2} + \cdots + 1$$

What pattern do we get in the complex plane?

I suspect that this picture is an approximation to the answer you’d get to Puzzle 2:

If you stare carefully at this picture, you’ll see some patterns, and I’m guessing those are hints of something very beautiful.

Earlier on this blog we looked at roots of polynomials whose coefficients are all 1 or -1:

The pattern is very nice, and it repays deep mathematical study. Here it is, drawn by Sam Derbyshire:

But now we’re looking at polynomials where the leading coefficient is 1 and all the rest are -1 or 0. How does that change things? A lot, it seems!

By the way, the 20 amino acids we commonly see in biology have masses ranging between 57 and 186. It’s not really true that all their masses are different. Here are their masses:

57, 71, 87, 97, 99, 101, 103, 113, 113, 114, 115, 128, 128, 129, 131, 137, 147, 156, 163, 186

I pretended that none of the masses are equal in Puzzle 2, and I left out the fact that only about 1/9th of the coefficients of our polynomial are nonzero. This may affect the picture you get!

]]>

The goal is to start a conversation about applications of category theory, not within pure math or fundamental physics, but to other branches of science and engineering—especially those where the use of category theory is not already well-established! For example, my students and I have been applying category theory to chemistry, electrical engineering, control theory and Markov processes.

Alas, we have no funds for travel and lodging. If you’re interested in giving a talk, please submit an abstract here:

• General information about abstracts, American Mathematical Society.

More precisely, please read the information there and then click on the link on that page to submit an abstract. It should then magically fly through cyberspace to me! Abstracts are due September 12th, but the sooner you submit one, the greater the chance that we’ll have space.

For the program of the whole conference, go here:

• Fall Western Sectional Meeting, U. C. Riverside, Riverside, California, 4–5 November 2017.

We’ll be having some interesting plenary talks:

• Paul Balmer, UCLA, An invitation to tensor-triangular geometry.

• Pavel Etingof, MIT, Double affine Hecke algebras and their applications.

• Monica Vazirani, U.C. Davis, Combinatorics, categorification, and crystals.

]]>

The positions are open to applicants from all research areas in mathematics who have a PhD or will have a PhD by the beginning of the term. The teaching load is six courses per year (i.e. 2 per quarter). In addition to teaching, the applicants will be responsible for attending advanced seminars and working on research projects.

This is initially a one-year appointment, and with successful annual teaching review, it is renewable for up to a third year term.

For more details, including how to apply, go here:

https://www.mathjobs.org/jobs/jobs/10162

]]>

In 49 hours, the National Park Service will stop taking comments on an important issue: whether to reintroduce grizzly bears into the North Cascades near Seattle. If you leave a comment on their website before then, you can help make this happen! Follow the easy directions here:

http://theoatmeal.com/blog/grizzlies_north_cascades

Please go ahead! Then tell your friends to join in, and give them this link. This can be your good deed for the day.

But if you want more details:

Grizzly bears are traditionally the apex predator in the North Cascades. Without the apex predator, the whole ecosystem is thrown out of balance. I know this from my childhood in northern Virginia, where deer are stripping the forest of all low-hanging greenery with no wolves to control them. *With* the top predator, the whole ecosystem springs to life and starts humming like a well-tuned engine! For example, when wolves were reintroduced in Yellowstone National Park, it seems that even riverbeds were affected:

There are several plans to restore grizzlies to the North Cascades. On the link I recommended, Matthew Inman supports **Alternative C — Incremental Restoration**. I’m not an expert on this issue, so I went ahead and supported that. There are actually 4 alternatives on the table:

**Alternative A — No Action.** They’ll keep doing what they’re already doing. The few grizzlies already there would be protected from poaching, the local population would be advised on how to deal with grizzlies, and the bears would be monitored. All other alternatives will do these things and more.

**Alternative B — Ecosystem Evaluation Restoration.** Up to 10 grizzly bears will be captured from source populations in northwestern Montana and/or south-central British Columbia and released at a single remote site on Forest Service lands in the North Cascades. This will take 2 years, and then they’ll be monitored for 2 years before deciding what to do next.

**Alternative C — Incremental Restoration.** 5 to 7 grizzly bears will be captured and released into the North Cascades each year over roughly 5 to 10 years, with a goal of establishing an initial population of 25 grizzly bears. Bears would be released at multiple remote sites. They can be relocated or removed if they cause trouble. Alternative C is expected to reach the restoration goal of approximately 200 grizzly bears within 60 to 100 years.

**Alternative D — Expedited Restoration.** 5 to 7 grizzly bears will be captured and released into the North Cascades each year until the population reaches about 200, which is what the area can easily support.

So, pick your own alternative if you like!

By the way, the remaining grizzly bears in the western United States live within six recovery zones:

• the Greater Yellowstone Ecosystem (GYE) in Wyoming and southwest Montana,

• the Northern Continental Divide Ecosystem (NCDE) in northwest Montana,

• the Cabinet-Yaak Ecosystem (CYE) in extreme northwestern Montana and the northern Idaho panhandle,

• the Selkirk Ecosystem (SE) in northern Idaho and northeastern Washington,

• the Bitterroot Ecosystem (BE) in central Idaho and western Montana,

• and the North Cascades Ecosystem (NCE) in northwestern and north-central Washington.

The North Cascades Ecosystem consists of 24,800 square kilometers in Washington, with an additional 10,350 square kilometers in British Columbia. In the US, 90% of this ecosystem is managed by the US Forest Service, the US National Park Service, and the State of Washington, and approximately 41% falls within Forest Service wilderness or the North Cascades National Park Service Complex.

For more, read this:

• National Park Service, *Draft Grizzly Bear Restoration Plan / Environmental Impact Statement: North Cascades Ecosystem*.

The picture of grizzlies is from this article:

• Ron Judd, Why returning grizzlies to the North Cascades is the right thing to do, *Pacific NW Magazine*, 23 November 2015.

If you’re worried about reintroducing grizzly bears, read it!

The map is from here:

• Krista Langlois, Grizzlies gain ground, *High Country News*, 27 August 2014.

Here you’ll see the huge obstacles this project has overcome so far.

]]>

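and the golden ratio:

$$\Phi = \frac{\sqrt{5} + 1}{2} = 1.6180339\dots$$

They’re related:

$$\pi = \frac{5}{\Phi} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \Phi}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2 + \Phi}}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2 + \sqrt{2 + \Phi}}}}} \cdots$$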

Greg Egan and I came up with this formula last weekend. It’s probably not new, and it certainly wouldn’t surprise experts, but it’s still fun coming up with a formula like this. Let me explain how we did it.

History has a fractal texture. It’s not exactly *self-similar*, but the closer you look at any incident, the more fine-grained detail you see. The simplified stories we learn about the history of math and physics in school are like blurry pictures of the Mandelbrot set. You can see the overall shape, but the really exciting stuff is hidden.

François Viète was a French mathematician who doesn’t show up in those simplified stories. He studied law at Poitiers, graduating in 1559. He began his career as an attorney at a quite high level, with cases involving the widow of King Francis I of France and also Mary, Queen of Scots. But his true interest was always mathematics. A friend said he could think about a single question for up to three days, his elbow on the desk, feeding himself without changing position.

Nonetheless, he was highly successful in law. By 1590 he was working for King Henry IV. The king admired his mathematical talents, and Viète soon confirmed his worth by cracking a Spanish cipher, thus allowing the French to read all the Spanish communications they were able to obtain.

In 1591, François Viète came out with an important book, introducing what is called the new algebra: a symbolic method for dealing with polynomial equations. This deserves to be much better known; it was very familiar to Descartes and others, and it was an important precursor to our modern notation and methods. For example, he emphasized care with the use of variables, and advocated denoting known quantities by consonants and unknown quantities by vowels. (Later people switched to using letters near the beginning of the alphabet for known quantities and letters near the end like $x, y, z$ for unknowns.)

In 1593 he came out with another book, *Variorum De Rebus Mathematicis Responsorum, Liber VIII*. Among other things, it includes a formula for pi. In modernized notation, it looks like this:

$$\frac{2}{\pi} = \frac{\sqrt{2}}{2} \cdot \frac{\sqrt{2 + \sqrt{2}}}{2} \cdot \frac{\sqrt{2 + \sqrt{2 + \sqrt{2}}}}{2} \cdots$$

This is remarkable! First of all, it looks cool. Second, it’s the earliest known example of an infinite product in mathematics. Third, it’s the earliest known formula for the exact value of pi. In fact, it seems to be the earliest formula representing a number as the result of an infinite process rather than of a finite calculation! So, Viète’s formula has been called the beginning of analysis. In his article “The life of pi”, Jonathan Borwein went even further and called Viète’s formula “the dawn of modern mathematics”.

How did Viète come up with his formula? I haven’t read his book, but the idea seems fairly clear. The area of the unit circle is pi. So, you can approximate pi better and better by computing the area of a square inscribed in this circle, and then an octagon, and then a 16-gon, and so on:

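If you compute these areas in a clever way, you get this series of numbers:

$$A_4 = 2$$

$$A_8 = 2 \cdot \frac{2}{\sqrt{2}}$$

$$A_{16} = 2 \cdot \frac{2}{\sqrt{2}} \cdot \frac{2}{\sqrt{2 + \sqrt{2}}}$$

$$A_{32} = 2 \cdot \frac{2}{\sqrt{2}} \cdot \frac{2}{\sqrt{2 + \sqrt{2}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2}}}}$$

and so on, where $A_n$ is the area of a regular *n*-gon inscribed in the unit circle. So, it was only a small step for Viète (though an infinite leap for mankind) to conclude that

$$\pi = 2 \cdot \frac{2}{\sqrt{2}} \cdot \frac{2}{\sqrt{2 + \sqrt{2}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2}}}} \cdots$$

or, if square roots in a denominator make you uncomfortable:

$$\frac{2}{\pi} = \frac{\sqrt{2}}{2} \cdot \frac{\sqrt{2 + \sqrt{2}}}{2} \cdot \frac{\sqrt{2 + \sqrt{2 + \sqrt{2}}}}{2} \cdots$$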

The basic idea here would not have surprised Archimedes, who rigorously proved that

$$\frac{223}{71} < \pi < \frac{22}{7}$$

by approximating the circumference of a circle using a regular 96-gon. Since $96 = 2^5 \times 3$, you can draw a regular 96-gon with ruler and compass by taking an equilateral triangle and bisecting its edges to get a hexagon, bisecting the edges of that to get a 12-gon, and so on up to 96. In a more modern way of thinking, you can figure out everything you need to know by starting with the angle $\pi/3$ and using half-angle formulas 4 times to work out the sine or cosine of $\pi/48$. And indeed, before Viète came along, Ludolph van Ceulen had computed pi to 35 digits using a regular polygon with $2^{62}$ sides! So Viète’s daring new idea was to give an *exact* formula for pi that involved an *infinite* process.

Now let’s see in detail how Viète’s formula works. Since there’s no need to start with a square, we might as well start with a regular *n*-gon inscribed in the circle and repeatedly bisect its sides, getting better and better approximations to pi. If we start with a pentagon, we’ll get a formula for pi that involves the golden ratio!

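We have

$$\pi = \lim_{n \to \infty} A_n$$

with $A_n$ the area of the inscribed regular *n*-gon as before, so we can also compute pi by starting with a regular *n*-gon and repeatedly doubling the number of vertices:

$$\pi = \lim_{k \to \infty} A_{2^k n}$$

The key trick is to write $A_{2^k n}$ as a ‘telescoping product’:

$$A_{2^k n} = A_n \cdot \frac{A_{2n}}{A_n} \cdot \frac{A_{4n}}{A_{2n}} \cdots \frac{A_{2^k n}}{A_{2^{k-1} n}}$$

Thus, taking the limit as $k \to \infty$ we get

$$\pi = A_n \cdot \frac{A_{2n}}{A_n} \cdot \frac{A_{4n}}{A_{2n}} \cdot \frac{A_{8n}}{A_{4n}} \cdots$$

where we start with the area of the *n*-gon and keep ‘correcting’ it to get the area of the *2n*-gon, the *4n*-gon, the *8n*-gon and so on.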

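There’s a simple formula for the area $A_n$ of a regular *n*-gon inscribed in the unit circle. You can chop it into $2n$ right triangles, each of which has base $\sin(\pi/n)$ and height $\cos(\pi/n)$, and thus area $\frac{1}{2} \sin(\pi/n) \cos(\pi/n)$.

Thus,

$$A_n = n \sin(\pi/n) \cos(\pi/n) = \frac{n}{2} \sin(2\pi/n)$$

This lets us understand how the area changes when we double the number of vertices:

$$\frac{A_n}{A_{2n}} = \frac{n \sin(\pi/n) \cos(\pi/n)}{n \sin(\pi/n)} = \cos(\pi/n)$$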

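This is nice and simple, but we really need a recursive formula for this quantity. Let’s define

$$R_n = 2 \frac{A_n}{A_{2n}} = 2 \cos(\pi/n)$$

Why the factor of 2? It simplifies our calculations slightly. We can express $R_{2n}$ in terms of $R_n$ using the half-angle formula for the cosine:

$$R_{2n} = 2 \cos\left(\frac{\pi}{2n}\right) = 2 \sqrt{\frac{1 + \cos(\pi/n)}{2}} = \sqrt{2 + R_n}$$

Now we’re ready for some fun! We have

$$\pi = A_n \cdot \frac{A_{2n}}{A_n} \cdot \frac{A_{4n}}{A_{2n}} \cdot \frac{A_{8n}}{A_{4n}} \cdots = A_n \cdot \frac{2}{R_n} \cdot \frac{2}{R_{2n}} \cdot \frac{2}{R_{4n}} \cdots$$

so using our recursive formula $R_{2n} = \sqrt{2 + R_n}$, which holds for any $n$, we get

$$\pi = A_n \cdot \frac{2}{R_n} \cdot \frac{2}{\sqrt{2 + R_n}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + R_n}}} \cdots$$

I think this deserves to be called the **generalized Viète formula**. And indeed, if we start with a square, we get

$$A_4 = \frac{4}{2} \sin(2\pi/4) = 2$$

and

$$R_4 = 2 \cos(\pi/4) = \sqrt{2}$$

giving Viète’s formula:

$$\pi = 2 \cdot \frac{2}{\sqrt{2}} \cdot \frac{2}{\sqrt{2 + \sqrt{2}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2}}}} \cdots$$

as desired!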

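But what if we start with a pentagon? For this it helps to remember a beautiful but slightly obscure trig fact:

$$\cos(\pi/5) = \frac{\Phi}{2}$$

and a slightly less beautiful one:

$$\sin(2\pi/5) = \frac{1}{2} \sqrt{2 + \Phi}$$

where $\Phi = (1 + \sqrt{5})/2$ is the golden ratio. It’s easy to prove these, and I’ll show you how later. For now, note that they imply

$$A_5 = \frac{5}{2} \sin(2\pi/5) = \frac{5}{4} \sqrt{2 + \Phi}$$

and

$$R_5 = 2 \cos(\pi/5) = \Phi$$

Thus, the formula

$$\pi = A_5 \cdot \frac{2}{R_5} \cdot \frac{2}{\sqrt{2 + R_5}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + R_5}}} \cdots$$

gives us

$$\pi = \frac{5}{4} \sqrt{2 + \Phi} \cdot \frac{2}{\Phi} \cdot \frac{2}{\sqrt{2 + \Phi}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \Phi}}} \cdots$$

or, cleaning it up a bit, the formula we want:

$$\pi = \frac{5}{\Phi} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \Phi}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2 + \Phi}}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2 + \sqrt{2 + \Phi}}}}} \cdots$$

Voilà!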
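If you want to watch these products converge, here is a minimal numerical sketch. It assumes the recipe described above: start from the polygon’s area, multiply by a correction factor 2/R at each step, and update R to the square root of 2 + R. The function name `generalized_viete` and the choice of 25 factors are mine; the starting values are 2 and √2 for the square, and (5/4)√(2 + Φ) and Φ for the pentagon.

```python
import math


def generalized_viete(area, r, factors):
    """Multiply the starting polygon area by `factors` correction terms
    2/R, replacing R by sqrt(2 + R) after each one."""
    approx = area
    for _ in range(factors):
        approx *= 2.0 / r           # area of the polygon with twice the sides
        r = math.sqrt(2.0 + r)      # half-angle step: R_{2n} = sqrt(2 + R_n)
    return approx


PHI = (1 + math.sqrt(5)) / 2

from_square = generalized_viete(2.0, math.sqrt(2.0), 25)
from_pentagon = generalized_viete(5 / 4 * math.sqrt(2 + PHI), PHI, 25)

print(from_square, from_pentagon, math.pi)
```

Each extra factor doubles the number of sides, so the error shrinks by roughly a factor of 4 per step, and the pentagon starts out a little closer to pi than the square does.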

There’s a lot more to say, but let me just explain the slightly obscure trigonometry facts we needed. To derive these, I find it nice to remember that a regular pentagon, and the pentagram inside it, contain lots of similar triangles:

Using the fact that all these triangles are similar, it’s easy to show that for any one, the ratio of the long side to the short side is $\Phi$ to 1, since

$$\Phi = 1 + \frac{1}{\Phi}$$

Another important fact is that the pentagram trisects the interior angle of the regular pentagon, breaking the interior angle of $108^\circ = 3\pi/5$ into 3 angles of $36^\circ = \pi/5$:

Again this is easy and fun to show.

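Combining these facts, we can prove that

$$\cos(2\pi/5) = \frac{1}{2\Phi}$$

and

$$\cos(\pi/5) = \frac{\Phi}{2}$$

To prove the first equation, chop one of those golden triangles into two right triangles and do things you learned in high school. To prove the second, do the same things to one of the short squat isosceles triangles:

Starting from these equations and using $\cos^2 \theta + \sin^2 \theta = 1$, we can show

$$\sin(2\pi/5) = \frac{1}{2} \sqrt{2 + \Phi}$$

and, just for completeness (we don’t need it here):

$$\sin(\pi/5) = \frac{1}{2} \sqrt{3 - \Phi}$$

These require some mildly annoying calculations, where it helps to use the identity

$$\frac{1}{\Phi^2} = 2 - \Phi$$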
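A quick floating-point sanity check of the golden-ratio trigonometry used in this post can be reassuring before doing the algebra by hand. This is a numerical check, not a proof. The identities listed in the code are the ones as I understand them, and the name `identities` is just for illustration.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # the golden ratio

# Each entry: (description, left-hand side, right-hand side).
identities = [
    ("cos(pi/5)  = PHI / 2", math.cos(math.pi / 5), PHI / 2),
    ("cos(2pi/5) = 1 / (2 PHI)", math.cos(2 * math.pi / 5), 1 / (2 * PHI)),
    ("sin(2pi/5) = sqrt(2 + PHI) / 2", math.sin(2 * math.pi / 5), math.sqrt(2 + PHI) / 2),
    ("sin(pi/5)  = sqrt(3 - PHI) / 2", math.sin(math.pi / 5), math.sqrt(3 - PHI) / 2),
    ("1 / PHI^2  = 2 - PHI", 1 / PHI ** 2, 2 - PHI),
    ("PHI        = 1 + 1 / PHI", PHI, 1 + 1 / PHI),
]

for description, lhs, rhs in identities:
    print(description, math.isclose(lhs, rhs, rel_tol=1e-12))
```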

Okay, that’s all for now! But if you want more fun, try a couple of puzzles:

**Puzzle 1.** We’ve gotten formulas for pi starting from a square or a regular pentagon. What formula do you get starting from an equilateral triangle?

**Puzzle 2.** Using the generalized Viète formula, prove Euler’s formula

$$\frac{\sin x}{x} = \cos\frac{x}{2} \cdot \cos\frac{x}{4} \cdot \cos\frac{x}{8} \cdots$$

Conversely, use Euler’s formula to prove the generalized Viète formula.

So, one might say that the real point of Viète’s formula, and its generalized version, is not any special property of pi, but Euler’s formula.

]]>

• Dawn Reeves, EPA preserves Obama-Era website but climate change data doubts remain, *InsideEPA.com*, 21 February 2017.

For those of us who are backing up climate data, the really important stuff is in red near the bottom.

The EPA has posted a link to an archived version of its website from Jan. 19, the day before President Donald Trump was inaugurated and the agency began removing climate change-related information from its official site, saying the move comes in response to concerns that it would permanently scrub such data.

However, the archived version notes that links to climate and other environmental databases will go to current versions of them—continuing the fears that the Trump EPA will remove or destroy crucial greenhouse gas and other data.

The archived version was put in place and linked to the main page in response to “numerous [Freedom of Information Act (FOIA)] requests regarding historic versions of the EPA website,” says an email to agency staff shared by the press office. “The Agency is making its best reasonable effort to 1) preserve agency records that are the subject of a request; 2) produce requested agency records in the format requested; and 3) post frequently requested agency records in electronic format for public inspection. To meet these goals, EPA has re-posted a snapshot of the EPA website as it existed on January 19, 2017.”

The email adds that the action is similar to the snapshot taken of the Obama White House website.

The archived version of EPA’s website includes a “more information” link that offers more explanation.

For example, it says the page is “not the current EPA website” and that the archive includes “static content, such as webpages and reports in Portable Document Format (PDF), as that content appeared on EPA’s website as of January 19, 2017.”

It cites technical limits for the database exclusions. “For example, many of the links contained on EPA’s website are to databases that are updated with the new information on a regular basis. These databases are not part of the static content that comprises the Web Snapshot.” Searches of the databases from the archive “will take you to the current version of the database,” the agency says.

“In addition, links may have been broken in the website as it appeared” on Jan. 19 and those will remain broken on the snapshot. Links that are no longer active will also appear as broken in the snapshot.

“Finally, certain extremely large collections of content… were not included in the Snapshot due to their size,” such as AirNow images, radiation network graphs, historic air technology transfer network information, and EPA’s searchable news releases.

## ‘Smart’ Move

One source urging the preservation of the data says the snapshot appears to be a “smart” move on EPA’s behalf, given the FOIA requests it has received, and notes that even though other groups like NextGen Climate and scientists have been working to capture EPA’s online information, having it on EPA’s site makes it official.

But it could also be a signal that big changes are coming to the official Trump EPA site, and it is unclear how long the agency will maintain the archived version.

The source says while it is disappointing that the archive may signal the imminent removal of EPA’s climate site, “at least they are trying to accommodate public concerns” to preserve the information.

A second source adds that while it is good that EPA is seeking “to address the widespread concern” that the information will be removed by an administration that does not believe in human-caused climate change, “on the other hand, it doesn’t address the primary concern of the data. It is snapshots of the web text.” Also, information “not included,” such as climate databases, is what is difficult to capture by outside groups and is what really must be preserved.

“If they take [information] down” that groups have been trying to preserve, then the underlying concern about access to data remains. “Web crawlers and programs can do things that are easy,” such as taking snapshots of text, “but getting the data inside the database is much more challenging,” the source says.

The first source notes that EPA’s searchable databases, such as those maintained by its Clean Air Markets Division, are used by the public “all the time.”

The agency’s Office of General Counsel (OGC) Jan. 25 began a review of the implications of taking down the climate page—a planned wholesale removal that was temporarily suspended to allow for the OGC review.

But EPA did remove some specific climate information, including links to the Clean Power Plan and references to President Barack Obama’s Climate Action Plan. Inside EPA captured this screenshot of the “What EPA Is Doing” page regarding climate change. Those links are missing on the Trump EPA site. The archive includes the same version of the page as captured by our screenshot.

Inside EPA first reported the plans to take down the climate information on Jan. 17.

After the OGC investigation began, a source close to the Trump administration said Jan. 31 that climate “propaganda” would be taken down from the EPA site, but that the agency is not expected to remove databases on GHG emissions or climate science. “Eventually… the propaganda will get removed…. Most of what is there is not data. Most of what is there is interpretation.”

The Sierra Club and Environmental Defense Fund both filed FOIA requests asking the agency to preserve its climate data, while attorneys representing youth plaintiffs in a federal climate change lawsuit against the government have also asked the Department of Justice to ensure the data related to its claims is preserved.

The Azimuth Climate Data Backup Project and other groups are making copies of actual databases, not just the visible portions of websites.

]]>

Next time I’ll tell you what our project has actually been doing. This time I just want to give a huge “thank you!” to all 627 people who contributed money on Kickstarter!

I sent out thank you notes to everyone, updating them on our progress and asking if they wanted their names listed. The blanks in the following list represent people who either didn’t reply, didn’t want their names listed, or backed out and decided not to give money. I’ll list people in chronological order: first contributors first.

Only 12 people backed out; the vast majority of blanks on this list are people who haven’t replied to my email. I noticed some interesting but obvious patterns. For example, people who contributed later are less likely to have answered my email yet—I’ll update this list later. People who contributed more money were more likely to answer my email.

The magnitude of contributions ranged from $2000 to $1. A few people offered to help in other ways. The response was international—this was really heartwarming! People from the US were more likely than others to ask not to be listed.

But instead of continuing to list statistical patterns, let me just *thank* everyone who contributed.

Daniel Estrada Ahmed Amer Saeed Masroor Jodi Kaplan John Wehrle Bob Calder Andrea Borgia L Gardner Uche Eke Keith Warner Dean Kalahan James Benson Dianne Hackborn Walter Hahn Thomas Savarino Noah Friedman Eric Willisson Jeffrey Gilmore John Bennett Glenn McDavid Brian Turner Peter Bagaric Martin Dahl Nielsen Broc Stenman Gabriel Scherer Roice Nelson Felipe Pait Kenneth Hertz Luis Bruno Andrew Lottmann Alex Morse Mads Bach Villadsen Noam Zeilberger Buffy Lyon Josh Wilcox Danny Borg Krishna Bhogaonker Harald Tveit Alvestrand Tarek A. Hijaz, MD Jouni Pohjola Chavdar Petkov Markus Jöbstl Bjørn Borud Sarah G William Straub Frank Harper Carsten Führmann Rick Angel Drew Armstrong Jesimpson Valeria de Paiva Ron Prater David Tanzer Rafael Laguna Miguel Esteves dos Santos Sophie Dennison-Gibby Randy Drexler Peter Haggstrom Jerzy Michał Pawlak Santini Basra Jenny Meyer John Iskra Bruce Jones Māris Ozols Everett Rubel Mike D Manik Uppal Todd Trimble Federer Fanatic Forrest Samuel, Harmos Consulting Annie Wynn Norman and Marcia Dresner Daniel Mattingly James W. Crosby Jennifer Booth Greg Randolph Dave and Karen Deeter Sarah Truebe Tieg Zaharia Jeffrey Salfen Birian Abelson Logan McDonald Brian Truebe Jon Leland Nicole Sarah Lim James Turnbull John Huerta Katie Mandel Bruce Bethany Summer Heather Tilert Anna C. Gladstone Naom Hart Aaron Riley Giampiero Campa Julie A. Sylvia Pace Willisson Bangskij Peter Herschberg Alaistair Farrugia Conor Hennessy Stephanie Mohr Torinthiel Lincoln Muri Anet Ferwerda Hanna Michelle Lee Guiney Ben Doherty Trace Hagemann Ryan Mannion Penni and Terry O'Hearn Brian Bassham Caitlin Murphy John Verran Susan Alexander Hawson Fabrizio Mafessoni Anita Phagan Nicolas Acuña Niklas Brunberg Adam Luptak V. 
Lazaro Zamora Branford Werner Niklas Starck Westerberg Luca Zenti and Marta Veneziano Ilja Preuß Christopher Flint George Read Courtney Leigh Katharina Spoerri Daniel Risse Hanna Charles-Etienne Jamme rhackman41 Jeff Leggett RKBookman Aaron Paul Mike Metzler Patrick Leiser Melinda Ryan Vaughn Kent Crispin Michael Teague Ben Fabian Bach Steven Canning Betsy McCall John Rees Mary Peters Shane Claridge Thomas Negovan Tom Grace Justin Jones Jason Mitchell Josh Weber Rebecca Lynne Hanginger Kirby Dawn Conniff Michael T. Astolfi Kristeva Erik Keith Uber Elaine Mazerolle Matthieu Walraet Linda Penfold Lujia Liu Keith Samar Tareem Henrik Almén Michael Deakin Rutger Ockhorst Erin Bassett James Crook Junior Eluhu Dan Laufer Carl Robert Solovay Silica Magazine Leonard Saers Alfredo Arroyo García Larry Yu John Behemonth Eric Humphrey Svein Halvor Halvorsen Karim Issa Øystein Risan Borgersen David Anderson Bell III Ole-Morten Duesend Adam North and Gabrielle Falquero Robert Biegler Qu Wenhao Steffen Dittmar Shanna Germain Adam Blinkinsop John WS Marvin (Dread Unicorn Games) Bill Carter Darth Chronis Lawrence Stewart Gareth Hodges Colin Backhurst Christopher Metzger Rachel Gumper Mariah Thompson Falk Alexander Glade Johnathan Salter Maggie Unkefer Shawna Maryanovich Wilhelm Fitzpatrick Dylan “ExoByte” Mayo Lynda Lee Scott Carpenter Charles D, Payet Vince Rostkowski Tim Brown Raven Daegmorgan Zak Brueckner Christian Page Adi Shavit Steven Greenberg Chuck Lunney Adriel Bustamente Natasha Anicich Bram De Bie Edward L Gray Detrick Robert Sarah Russell Sam Leavin Abilash Pulicken Isabel Olondriz James Pierce James Morrison April Daniels José Tremblay Champagne Chris Edmonds Hans & Maria Cummings Bart Gasiewiski Andy Chamard Andrew Jackson Christopher Wright Crystal Collins ichimonji10 Alan Stern Alison W Dag Henrik Bråtane Martin Nilsson William Schrade

]]>

There’s a *lot* going on! Here’s a news roundup. I will separately talk about what the Azimuth Climate Data Backup Project is doing.

I’ll start with the bad news, and then go on to some good news.

Scientists are keeping track of how the Trump administration is changing the Environmental Protection Agency website, with before-and-after photos, and analysis:

• Brian Kahn, Behold the “tweaks” Trump has made to the EPA website (so far), *Natural Resources Defense Council* blog, 3 February 2017.

There’s more about “adaptation” to climate change, and less about how it’s caused by carbon emissions.

All of this would be nothing compared to the new bill to eliminate the EPA, or Myron Ebell’s plan to fire most of the people working there:

• Joe Davidson, Trump transition leader’s goal is two-thirds cut in EPA employees, *Washington Post*, 30 January 2017.

If you want to keep track of this battle, I recommend getting a 30-day free subscription to this online magazine:

The Trump team is taking animal-welfare data offline. The US Department of Agriculture will no longer make lab inspection results and violations publicly available, citing privacy concerns:

• Sara Reardon, US government takes animal-welfare data offline, *Nature Breaking News*, 3 February 2017.

A new bill would prevent the US government from providing access to geospatial data if it helps people understand housing discrimination. It goes like this:

Notwithstanding any other provision of law, no Federal funds may be used to design, build, maintain, utilize, or provide access to a Federal database of geospatial information on community racial disparities or disparities in access to affordable housing.

For more on this bill, and the important ways in which such data has been used, see:

• Abraham Gutman, Scott Burris, and the Temple University Center for Public Health Law Research, Where will data take the Trump administration on housing?, *Philly.com*, 1 February 2017.

The Environmental Data and Governance Initiative or **EDGI** is working to archive public environmental data. They’re helping coordinate data rescue events. You can attend one and have fun eating pizza with cool people while saving data:

• 3 February 2017, Portland

• 4 February 2017, New York City

• 10-11 February 2017, Austin Texas

• 11 February 2017, U. C. Berkeley, California

• 18 February 2017, MIT, Cambridge Massachusetts

• 18 February 2017, Haverford Connecticut

• 18-19 February 2017, Washington DC

• 26 February 2017, Twin Cities, Minnesota

Or, work with EDGI to organize your own data rescue event! They provide some online tools to help download data.

I know there will also be another event at UCLA, so the above list is not complete, and it will probably change and grow over time. Keep up-to-date at their site:

• Environmental Data and Governance Initiative.

The pushback is so big it’s hard to list it all! For now I’ll just quote some of this article:

• Tabitha Powledge, The gag reflex: Trump info shutdowns at US science agencies, especially EPA, 27 January 2017.

THE PUSHBACK FROM SCIENCE HAS BEGUN

Predictably, counter-tweets claiming to come from rebellious employees at the EPA, the Forest Service, the USDA, and NASA sprang up immediately. At The Verge, Rich McCormick says there’s reason to believe these claims may be genuine, although none has yet been verified. A lovely head on this post: “On the internet, nobody knows if you’re a National Park.”

At Hit&Run, Ronald Bailey provides handles for several of these alt tweet streams, which he calls “the revolt of the permanent government.” (That’s a compliment.)

Bailey argues, “with exception perhaps of some minor amount of national security intelligence, there is no good reason that any information, data, studies, and reports that federal agencies produce should be kept from the public and press. In any case, I will be following the Alt_Bureaucracy feeds for a while.”

NeuroDojo Zen Faulkes posted on how to demand that scientific societies show some backbone. “Ask yourself: “Have my professional societies done anything more political than say, ‘Please don’t cut funding?’” Will they fight?,” he asked.

Scientists associated with the group *500 Women Scientists* donned lab coats and marched in DC as part of the Women’s March on Washington the day after Trump’s Inauguration, Robinson Meyer reported at the Atlantic. A wildlife ecologist from North Carolina told Meyer, “I just can’t believe we’re having to yell, ‘Science is real.’”

Taking a cue from how the Women’s March did its social media organizing, other scientists who want to set up a Washington march of their own have put together a closed Facebook group that claims more than 600,000 members, Kate Sheridan writes at STAT.

The #ScienceMarch Twitter feed says a date for the march will be posted in a few days. [The march will be on 22 April 2017.] The group also plans to release tools to help people interested in local marches coordinate their efforts and avoid duplication.

At *The Atlantic*, Ed Yong describes the political action committee 314Action. (314=the first three digits of pi.)

Among other political activities, it is holding a webinar on Pi Day—March 14—to explain to scientists how to run for office. Yong calls 314Action the science version of Emily’s List, which helps pro-choice candidates run for office. 314Action says it is ready to connect potential candidate scientists with mentors—and donors.

Other groups may be willing to step in when government agencies wimp out. A few days before the Inauguration, the Centers for Disease Control and Prevention abruptly and with no explanation cancelled a 3-day meeting on the health effects of climate change scheduled for February. Scientists told *Ars Technica*’s Beth Mole that CDC has a history of running away from politicized issues.

One of the conference organizers from the American Public Health Association was quoted as saying nobody told the organizers to cancel.

I believe it. Just one more example of the chilling effect on global warming. In politics, once the Dear Leader’s wishes are known, some hirelings will rush to gratify them without being asked.

The APHA guy said they simply wanted to head off a potential last-minute cancellation. Yeah, I guess an anticipatory pre-cancellation would do that.

But then—Al Gore to the rescue! He is joining with a number of health groups—including the American Public Health Association—to hold a one-day meeting on the topic Feb 16 at the Carter Center in Atlanta, CDC’s home base. Vox’s Julia Belluz reports that it is not clear whether CDC officials will be part of the Gore rescue event.

The Sierra Club, of which I’m a proud member, is using the Freedom of Information Act or **FOIA** to battle or at least slow the deletion of government databases. They wisely started even before Trump took power:

• Jennifer A Dlouhy, Fearing Trump data purge, environmentalists push to get records, *BloombergMarkets*, 13 January 2017.

Here’s how the strategy works:

U.S. government scientists frantically copying climate data they fear will disappear under the Trump administration may get extra time to safeguard the information, courtesy of a novel legal bid by the Sierra Club.

The environmental group is turning to open records requests to protect the resources and keep them from being deleted or made inaccessible, beginning with information housed at the Environmental Protection Agency and the Department of Energy. On Thursday [January 12th], the organization filed Freedom of Information Act requests asking those agencies to turn over a slew of records, including data on greenhouse gas emissions, traditional air pollution and power plants.

The rationale is simple: Federal laws and regulations generally block government agencies from destroying files that are being considered for release. Even if the Sierra Club’s FOIA requests are later rejected, the record-seeking alone could prevent files from being zapped quickly. And if the records are released, they could be stored independently on non-government computer servers, accessible even if other versions go offline.

]]>

• John Baez, Biology as information dynamics, talk for Biological Complexity: Can it be Quantified?, a workshop at the Beyond Center, 2 February 2017.

While preparing this talk, I discovered a cool fact. I doubt it’s new, but I haven’t exactly seen it elsewhere. I came up with it while trying to give a precise and general statement of ‘Fisher’s fundamental theorem of natural selection’. I *won’t* start by explaining that theorem, since my version looks rather different than Fisher’s, and I came up with mine precisely because I had trouble understanding his. I’ll say a bit more about this at the end.

Here’s my version:

The square of the rate at which a population learns information is the variance of its fitness.

This is a nice advertisement for the virtues of diversity: more variance means faster learning. But it requires some explanation!

Let’s start by assuming we have $n$ different kinds of self-replicating entities with populations $P_1, \dots, P_n.$ As usual, these could be all sorts of things:

• molecules of different chemicals

• organisms belonging to different species

• genes of different alleles

• restaurants belonging to different chains

• people with different beliefs

• game-players with different strategies

• etc.

I’ll call them **replicators** of different **species**.

Let’s suppose each population $P_i$ is a function of time that grows at a rate equal to this population times its ‘fitness’. I explained the resulting equation back in Part 9, but it’s pretty simple:

$$ \frac{d}{dt} P_i(t) = f_i(P_1(t), \dots, P_n(t)) \, P_i(t) $$

Here $f_i$ is a completely arbitrary smooth function of all the populations! We call it the **fitness** of the *i*th species.

This equation is important, so we want a short way to write it. I’ll often write $f_i(P_1(t), \dots, P_n(t))$ simply as $f_i,$ and $P_i(t)$ simply as $P_i.$ With these abbreviations, which any red-blooded physicist would take for granted, our equation becomes simply this:

$$ \frac{dP_i}{dt} = f_i \, P_i $$

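Next, let $p_i(t)$ be the probability that a randomly chosen organism is of the *i*th species:

$$ p_i(t) = \frac{P_i(t)}{\sum_j P_j(t)} $$

Starting from our equation describing how the populations evolve, we can figure out how these probabilities evolve. The answer is called the **replicator equation**:

$$ \frac{d}{dt} p_i(t) = \left( f_i(P(t)) - \langle f(P(t)) \rangle \right) p_i(t) $$

Here $P(t)$ is short for the list of populations $(P_1(t), \dots, P_n(t)),$ and $\langle f(P(t)) \rangle$ is the average fitness of all the replicators, or **mean fitness**:

$$ \langle f(P(t)) \rangle = \sum_j f_j(P(t)) \, p_j(t) $$

In what follows I’ll abbreviate the replicator equation as follows:

$$ \frac{dp_i}{dt} = (f_i - \langle f \rangle) \, p_i $$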
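For anyone who wants the step from populations to probabilities spelled out, here is a sketch of the calculation. It uses only the quotient rule, the population equation $dP_i/dt = f_i P_i,$ and the definition $p_i = P_i / \sum_j P_j.$ The symbol $N$ for the total population is shorthand introduced just for this sketch and is not used elsewhere in the post.

```latex
\begin{aligned}
N &= \sum_j P_j ,
\qquad
\frac{dN}{dt} = \sum_j f_j P_j = N \sum_j f_j \, p_j = N \langle f \rangle \\[4pt]
\frac{dp_i}{dt}
 &= \frac{d}{dt} \left( \frac{P_i}{N} \right)
  = \frac{1}{N} \frac{dP_i}{dt} - \frac{P_i}{N^2} \frac{dN}{dt} \\[4pt]
 &= f_i \, p_i - p_i \, \langle f \rangle
  = (f_i - \langle f \rangle) \, p_i
\end{aligned}
```

So the total population drops out entirely: only the fitness of each species compared to the mean matters for how the probabilities change.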

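Okay, now let’s figure out how fast the probability distribution

$$ p(t) = (p_1(t), \dots, p_n(t)) $$

changes with time. For this we need to choose a way to measure the length of the vector

$$ \frac{dp}{dt} = \left( \frac{d}{dt} p_1(t), \dots, \frac{d}{dt} p_n(t) \right) $$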

And here information geometry comes to the rescue! We can use the Fisher information metric, which is a Riemannian metric on the space of probability distributions.

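I’ve talked about the Fisher information metric in many ways in this series. The most important fact is that as a probability distribution $p(t)$ changes with time, its speed

$$ \left\| \frac{dp}{dt} \right\| $$

as measured using the Fisher information metric can be seen as the *rate at which information is learned*. I’ll explain that later. Right now I just want a simple *formula* for the Fisher information metric. Suppose $v$ and $w$ are two tangent vectors to the point $p$ in the space of probability distributions. Then the **Fisher information metric** is given as follows:

$$ \langle v, w \rangle = \sum_i \frac{1}{p_i} \, v_i w_i $$

Using this we can calculate the speed at which $p(t)$ moves when it obeys the replicator equation. Actually the square of the speed is simpler:

$$ \begin{array}{ccl} \displaystyle \left\| \frac{dp}{dt} \right\|^2 &=& \displaystyle \sum_i \frac{1}{p_i} \left( \frac{dp_i}{dt} \right)^2 \\ \\ &=& \displaystyle \sum_i \frac{1}{p_i} \left( (f_i - \langle f \rangle) \, p_i \right)^2 \\ \\ &=& \displaystyle \sum_i (f_i - \langle f \rangle)^2 \, p_i \end{array} $$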

The answer has a nice meaning, too! It’s just the variance of the fitness: that is, the square of its standard deviation.
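If you’d like to check this numerically, here is a minimal Python sketch. It assumes the Fisher information metric has the form $\langle v, w \rangle = \sum_i v_i w_i / p_i$ and that the velocity comes from the replicator equation $dp_i/dt = (f_i - \langle f \rangle) p_i.$ The function names and the random test values are made up for illustration.

```python
import random


def mean_fitness(f, p):
    """Mean fitness <f> = sum_i f_i p_i."""
    return sum(fi * pi for fi, pi in zip(f, p))


def replicator_velocity(f, p):
    """Velocity dp_i/dt = (f_i - <f>) p_i given by the replicator equation."""
    fbar = mean_fitness(f, p)
    return [(fi - fbar) * pi for fi, pi in zip(f, p)]


def fisher_speed_squared(v, p):
    """Squared length of a tangent vector v at p: sum_i v_i^2 / p_i."""
    return sum(vi * vi / pi for vi, pi in zip(v, p))


def fitness_variance(f, p):
    """Variance of the fitness: sum_i (f_i - <f>)^2 p_i."""
    fbar = mean_fitness(f, p)
    return sum((fi - fbar) ** 2 * pi for fi, pi in zip(f, p))


def random_distribution(n, rng):
    """A random probability distribution with all entries strictly positive."""
    weights = [rng.random() + 0.01 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed, only for reproducibility
    p = random_distribution(5, rng)
    f = [rng.uniform(-1.0, 1.0) for _ in range(5)]  # arbitrary fitness values
    v = replicator_velocity(f, p)
    # The two printed numbers should agree up to rounding error.
    print(fisher_speed_squared(v, p), fitness_variance(f, p))
```

The fitness values here are just numbers at one instant. That is all the identity needs, since it holds pointwise in time no matter how the fitness depends on the populations.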

So, if you’re willing to buy my claim that the speed is the rate at which our population learns new information, then we’ve seen that *the square of the rate at which a population learns information is the variance of its fitness!*

Now, how is this related to Fisher’s fundamental theorem of natural selection? First of all, what *is* Fisher’s fundamental theorem? Here’s what Wikipedia says about it:

It uses some mathematical notation but is not a theorem in the mathematical sense.

It states:

“The rate of increase in fitness of any organism at any time is equal to its genetic variance in fitness at that time.”

Or in more modern terminology:

“The rate of increase in the mean fitness of any organism at any time ascribable to natural selection acting through changes in gene frequencies is exactly equal to its genetic variance in fitness at that time”.

Largely as a result of Fisher’s feud with the American geneticist Sewall Wright about adaptive landscapes, the theorem was widely misunderstood to mean that the average fitness of a population would always increase, even though models showed this not to be the case. In 1972, George R. Price showed that Fisher’s theorem was indeed correct (and that Fisher’s proof was also correct, given a typo or two), but did not find it to be of great significance. The sophistication that Price pointed out, and that had made understanding difficult, is that the theorem gives a formula for part of the change in gene frequency, and not for all of it. This is a part that can be said to be due to natural selection

Price’s paper is here:

• George R. Price, Fisher’s ‘fundamental theorem’ made clear, *Annals of Human Genetics* **36** (1972), 129–140.

I don’t find it very clear, perhaps because I didn’t spend enough time on it. But I think I get the idea.

My result *is* a theorem in the mathematical sense, though quite an easy one. I assume a population distribution evolves according to the replicator equation and derive an equation whose right-hand side matches that of Fisher’s original equation: the variance of the fitness.

But my left-hand side is different: it’s the square of the speed of the corresponding probability distribution, where speed is measured using the ‘Fisher information metric’. This metric was discovered by the same guy, Ronald Fisher, but I don’t think he used it in *his* work on the fundamental theorem!

Something a bit similar to my statement appears as Theorem 2 of this paper:

• Marc Harper, Information geometry and evolutionary game theory.

and for that theorem he cites:

• Josef Hofbauer and Karl Sigmund, *Evolutionary Games and Population Dynamics*, Cambridge University Press, Cambridge, 1998.

However, his Theorem 2 really concerns the rate of increase of fitness, like Fisher’s fundamental theorem. Moreover, he assumes that the probability distribution flows along the gradient of a function, and I’m not assuming that. Indeed, my version applies to situations where the probability distribution moves round and round in periodic orbits!
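To illustrate the periodic case, here is a small Python sketch under assumptions that are not taken from the papers above. It uses three species playing rock-paper-scissors, with fitness $f_i = \sum_j A_{ij} p_j$ for an antisymmetric payoff matrix $A,$ integrated with a hand-written Runge–Kutta step. For this choice the mean fitness is always zero, so it never increases, yet the distribution keeps circling and the squared Fisher speed, which equals the variance of the fitness, stays positive.

```python
# Rock-paper-scissors payoffs: an illustrative assumption, chosen because
# the matrix is antisymmetric, which forces the mean fitness to be zero.
A = [[0.0, -1.0, 1.0],
     [1.0, 0.0, -1.0],
     [-1.0, 1.0, 0.0]]


def fitness(p):
    """Fitness of each species: f_i = sum_j A[i][j] * p[j]."""
    return [sum(a * pj for a, pj in zip(row, p)) for row in A]


def mean_fitness(p):
    """Mean fitness <f> = sum_i f_i p_i."""
    return sum(fi * pi for fi, pi in zip(fitness(p), p))


def velocity(p):
    """Right-hand side of the replicator equation: (f_i - <f>) p_i."""
    f = fitness(p)
    fbar = sum(fi * pi for fi, pi in zip(f, p))
    return [(fi - fbar) * pi for fi, pi in zip(f, p)]


def speed_squared(p):
    """Squared speed in the Fisher information metric: sum_i v_i^2 / p_i."""
    return sum(v * v / pi for v, pi in zip(velocity(p), p))


def rk4_step(p, dt):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def shifted(k, scale):
        return [pi + scale * ki for pi, ki in zip(p, k)]

    k1 = velocity(p)
    k2 = velocity(shifted(k1, 0.5 * dt))
    k3 = velocity(shifted(k2, 0.5 * dt))
    k4 = velocity(shifted(k3, dt))
    return [pi + dt * (a + 2 * b + 2 * c + d) / 6.0
            for pi, a, b, c, d in zip(p, k1, k2, k3, k4)]


def simulate(p0, dt=0.01, steps=2000):
    """Record (mean fitness, squared speed, p1*p2*p3) along the orbit."""
    p = list(p0)
    history = []
    for _ in range(steps):
        history.append((mean_fitness(p), speed_squared(p), p[0] * p[1] * p[2]))
        p = rk4_step(p, dt)
    return history


if __name__ == "__main__":
    for fbar, s2, product in simulate([0.5, 0.3, 0.2])[::400]:
        print(f"mean fitness {fbar:+.1e}  speed^2 {s2:.4f}  p1*p2*p3 {product:.6f}")
```

For this particular payoff matrix the product $p_1 p_2 p_3$ is conserved, which is why the orbit closes up instead of spiraling in or out. The third recorded column is a check that the integrator respects this.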

The key to generalizing Fisher’s fundamental theorem is thus to focus on the speed at which $p(t)$ moves, rather than the increase in fitness. Why do I call this speed the ‘rate at which the population learns information’? It’s because we’re measuring this speed using the Fisher information metric, which is closely connected to relative information, also known as relative entropy or the Kullback–Leibler divergence.

I explained this back in Part 7, but that explanation seems hopelessly technical to me now, so here’s a faster one, which I created while preparing my talk.

The information of a probability distribution $q$ **relative to** a probability distribution $p$ is

$$ I(q,p) = \sum_{i=1}^n q_i \log \left( \frac{q_i}{p_i} \right) $$

It says how much information you learn if you start with a hypothesis $p$ saying that the probability of the *i*th situation was $p_i,$ and then update this to a new hypothesis $q.$

Now suppose you have a hypothesis that’s changing with time in a smooth way, given by a time-dependent probability $p(t).$ Then a calculation shows that

$$ \left. \frac{d}{dt} I(p(t),p(t_0)) \right|_{t = t_0} = 0 $$

for all times $t_0.$ This seems paradoxical at first. I like to jokingly put it this way:

To first order, you’re never learning anything.

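However, as long as the velocity $\frac{d}{dt} p(t) \big|_{t = t_0}$ is nonzero, we have

$$ \left. \frac{d^2}{dt^2} I(p(t),p(t_0)) \right|_{t = t_0} > 0 $$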

so we can say

To second order, you’re always learning something… unless your opinions are fixed.

This lets us define a ‘rate of learning’—that is, a ‘speed’ at which the probability distribution moves. *And this is precisely the speed given by the Fisher information metric!*

In other words:

$$ \left\| \frac{dp}{dt}(t_0) \right\|^2 = \left. \frac{d^2}{dt^2} I(p(t),p(t_0)) \right|_{t = t_0} $$

where the length is given by the Fisher information metric. Indeed, this formula can be used to *define* the Fisher information metric. From this definition we can easily work out the concrete formula I gave earlier.
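Here is a sketch of that calculation, starting from the relative information $I(q,p) = \sum_i q_i \log(q_i/p_i).$ The dots for time derivatives are notation introduced only for this sketch.

```latex
\begin{aligned}
I(p(t), p(t_0)) &= \sum_i p_i(t) \, \log \frac{p_i(t)}{p_i(t_0)} \\[4pt]
\frac{d}{dt} I(p(t), p(t_0))
  &= \sum_i \dot{p}_i(t) \, \log \frac{p_i(t)}{p_i(t_0)} + \sum_i \dot{p}_i(t) \\[4pt]
\frac{d^2}{dt^2} I(p(t), p(t_0))
  &= \sum_i \ddot{p}_i(t) \, \log \frac{p_i(t)}{p_i(t_0)}
   + \sum_i \frac{\dot{p}_i(t)^2}{p_i(t)}
   + \sum_i \ddot{p}_i(t)
\end{aligned}
```

Since the probabilities always sum to 1, we have $\sum_i \dot{p}_i = \sum_i \ddot{p}_i = 0,$ and the logarithm vanishes at $t = t_0.$ So at $t = t_0$ the first derivative is zero, while the second derivative reduces to $\sum_i \dot{p}_i(t_0)^2 / p_i(t_0),$ which is the squared length of the velocity in the Fisher information metric.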

In summary: as a probability distribution moves around, the relative information between the new probability distribution and the original one grows approximately as the *square* of time, not linearly. So, to talk about a ‘rate at which information is learned’, we need to use the above formula, involving a second time derivative. This rate is just the speed at which the probability distribution moves, measured using the Fisher information metric. And when we have a probability distribution describing how many replicators are of different species, and it’s evolving according to the replicator equation, the square of this speed is just the variance of the fitness!
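Here is a rough numerical check of this summary in Python, under a simplifying assumption made only for the sketch: each species has a constant fitness $f_i,$ so the populations are $P_i(t) = w_i e^{f_i t}$ and the probabilities solve the replicator equation exactly. The function names, weights, and step size are hypothetical choices. Finite differences of the relative information $I(p(t), p(t_0))$ should give a first derivative near zero and a second derivative near the variance of the fitness.

```python
import math


def distribution(weights, fitness, t):
    """p_i(t) proportional to w_i * exp(f_i * t), for constant fitness f_i."""
    unnormalized = [w * math.exp(f * t) for w, f in zip(weights, fitness)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def relative_information(q, p):
    """Information of q relative to p: sum_i q_i log(q_i / p_i)."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p))


def fitness_variance(fitness, p):
    """Variance of the fitness: sum_i (f_i - <f>)^2 p_i."""
    fbar = sum(f * pi for f, pi in zip(fitness, p))
    return sum((f - fbar) ** 2 * pi for f, pi in zip(fitness, p))


def derivatives_at(weights, fitness, t0, h=1e-3):
    """Central-difference estimates of d/dt and d^2/dt^2 of I(p(t), p(t0)) at t0."""
    p0 = distribution(weights, fitness, t0)
    plus = relative_information(distribution(weights, fitness, t0 + h), p0)
    minus = relative_information(distribution(weights, fitness, t0 - h), p0)
    first = (plus - minus) / (2.0 * h)
    # The middle term of the second difference is I(p(t0), p(t0)) = 0.
    second = (plus + minus) / (h * h)
    return first, second


if __name__ == "__main__":
    weights = [1.0, 2.0, 3.0]   # arbitrary initial populations
    fitness = [0.5, -1.0, 2.0]  # arbitrary constant fitness values
    t0 = 0.3
    first, second = derivatives_at(weights, fitness, t0)
    variance = fitness_variance(fitness, distribution(weights, fitness, t0))
    print("first derivative  ~", first)
    print("second derivative ~", second)
    print("fitness variance   ", variance)
```

The agreement is only approximate, since finite differences carry an error that shrinks with the step size `h`.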

]]>