## Restoring the North Cascades Ecosystem

13 March, 2017

In 49 hours, the National Park Service will stop taking comments on an important issue: whether to reintroduce grizzly bears into the North Cascades near Seattle. If you leave a comment on their website before then, you can help make this happen! Follow the easy directions here:

But if you want more details:

Grizzly bears are traditionally the apex predator in the North Cascades. Without the apex predator, the whole ecosystem is thrown out of balance. I know this from my childhood in northern Virginia, where deer are stripping the forest of all low-hanging greenery with no wolves to control them. With the top predator, the whole ecosystem springs to life and starts humming like a well-tuned engine! For example, when wolves were reintroduced in Yellowstone National Park, it seems that even riverbeds were affected:

There are several plans to restore grizzlies to the North Cascades. On the link I recommended, Matthew Inman supports Alternative C — Incremental Restoration. I’m not an expert on this issue, so I went ahead and supported that. There are actually 4 alternatives on the table:

Alternative A — No Action. They’ll keep doing what they’re already doing. The few grizzlies already there would be protected from poaching, the local population would be advised on how to deal with grizzlies, and the bears would be monitored. All other alternatives will do these things and more.

Alternative B — Ecosystem Evaluation Restoration. Up to 10 grizzly bears will be captured from source populations in northwestern Montana and/or south-central British Columbia and released at a single remote site on Forest Service lands in the North Cascades. This will take 2 years, and then they’ll be monitored for 2 years before deciding what to do next.

Alternative C — Incremental Restoration. 5 to 7 grizzly bears will be captured and released into the North Casades each year over roughly 5 to 10 years, with a goal of establishing an initial population of 25 grizzly bears. Bears would be released at multiple remote sites. They can be relocated or removed if they cause trouble. Alternative C is expected to reach the restoration goal of approximately 200 grizzly bears within 60 to 100 years.

Alternative D — Expedited Restoration. 5 to 7 grizzly bears will be captured and released into the North Casades each year until the population reaches about 200, which is what the area can easily support.

So, pick your own alternative if you like!

By the way, the remaining grizzly bears in the western United States live within six recovery zones:

• the Greater Yellowstone Ecosystem (GYE) in Wyoming and southwest Montana,

• the Northern Continental Divide Ecosystem (NCDE) in northwest Montana,

• the Cabinet-Yaak Ecosystem (CYE) in extreme northwestern Montana and the northern Idaho panhandle,

• the Selkirk Ecosystem (SE) in northern Idaho and northeastern Washington,

• the Bitterroot Ecosystem (BE) in central Idaho and western Montana,

• and the North Cascades Ecosystem (NCE) in northwestern and north-central Washington.

The North Cascades Ecosystem consists of 24,800 square kilometers in Washington, with an additional 10,350 square kilometers in British Columbia. In the US, 90% of this ecosystem is managed by the US Forest Service, the US National Park Service, and the State of Washington, and approximately 41% falls within Forest Service wilderness or the North Cascades National Park Service Complex.

• National Park Service, Draft Grizzly Bear Restoration Plan / Environmental Impact Statement: North Cascades Ecosystem.

• Ron Judd, Why returning grizzlies to the North Cascades is the right thing to do, Pacific NW Magazine, 23 November 2015.

The map is from here:

• Krista Langlois, Grizzlies gain ground, High Country News, 27 August 2014.

Here you’ll see the huge obstacles this project has overcome so far.

## Pi and the Golden Ratio

7 March, 2017

Two of my favorite numbers are pi:

$\pi = 3.14159...$

and the golden ratio:

$\displaystyle{ \Phi = \frac{\sqrt{5} + 1}{2} } = 1.6180339...$

They’re related:

$\pi = \frac{5}{\Phi} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \Phi}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2 + \Phi}}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2 + \sqrt{2 + \Phi}}}}} \cdots$

Greg Egan and I came up with this formula last weekend. It’s probably not new, and it certainly wouldn’t surprise experts, but it’s still fun coming up with a formula like this. Let me explain how we did it.

History has a fractal texture. It’s not exactly self-similar, but the closer you look at any incident, the more fine-grained detail you see. The simplified stories we learn about the history of math and physics in school are like blurry pictures of the Mandelbrot set. You can see the overall shape, but the really exciting stuff is hidden.

François Viète is a French mathematician who doesn’t show up in those simplified stories. He studied law at Poitiers, graduating in 1559. He began his career as an attorney at a quite high level, with cases involving the widow of King Francis I of France and also Mary, Queen of Scots. But his true interest was always mathematics. A friend said he could think about a single question for up to three days, his elbow on the desk, feeding himself without changing position.

Nonetheless, he was highly successful in law. By 1590 he was working for King Henry IV. The king admired his mathematical talents, and Viète soon confirmed his worth by cracking a Spanish cipher, thus allowing the French to read all the Spanish communications they were able to obtain.

In 1591, François Viète came out with an important book, introducing what is called the new algebra: a symbolic method for dealing with polynomial equations. This deserves to be much better known; it was very familiar to Descartes and others, and it was an important precursor to our modern notation and methods. For example, he emphasized care with the use of variables, and advocated denoting known quantities by consonants and unknown quantities by vowels. (Later people switched to using letters near the beginning of the alphabet for known quantities and letters near the end like $x,y,z$ for unknowns.)

In 1593 he came out with another book, Variorum De Rebus Mathematicis Responsorum, Liber VIII. Among other things, it includes a formula for pi. In modernized notation, it looks like this:

$\displaystyle{ \frac2\pi = \frac{\sqrt 2}2 \cdot \frac{\sqrt{2+\sqrt 2}}2 \cdot \frac{\sqrt{2+\sqrt{2+\sqrt 2}}}{2} \cdots}$

This is remarkable! First of all, it looks cool. Second, it’s the earliest known example of an infinite product in mathematics. Third, it’s the earliest known formula for the exact value of pi. In fact, it seems to be the earliest formula representing a number as the result of an infinite process rather than of a finite calculation! So, Viète’s formula has been called the beginning of analysis. In his article “The life of pi”, Jonathan Borwein went even further and called Viète’s formula “the dawn of modern mathematics”.

How did Viète come up with his formula? I haven’t read his book, but the idea seems fairly clear. The area of the unit circle is pi. So, you can approximate pi better and better by computing the area of a square inscribed in this circle, and then an octagon, and then a 16-gon, and so on:

If you compute these areas in a clever way, you get this series of numbers:

$\begin{array}{ccl} A_4 &=& 2 \\ \\ A_8 &=& 2 \cdot \frac{2}{\sqrt{2}} \\ \\ A_{16} &=& 2 \cdot \frac{2}{\sqrt{2}} \cdot \frac{2}{\sqrt{2 + \sqrt{2}}} \\ \\ A_{32} &=& 2 \cdot \frac{2}{\sqrt{2}} \cdot \frac{2}{\sqrt{2 + \sqrt{2}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2}}}} \end{array}$

and so on, where $A_n$ is the area of a regular n-gon inscribed in the unit circle. So, it was only a small step for Viète (though an infinite leap for mankind) to conclude that

$\displaystyle{ \pi = 2 \cdot \frac{2}{\sqrt{2}} \cdot \frac{2}{\sqrt{2 + \sqrt{2}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2}}}} \cdots }$

or, if square roots in a denominator make you uncomfortable:

$\displaystyle{ \frac2\pi = \frac{\sqrt 2}2 \cdot \frac{\sqrt{2+\sqrt 2}}2 \cdot \frac{\sqrt{2+\sqrt{2+\sqrt 2}}}{2} \cdots}$

The basic idea here would not have surprised Archimedes, who rigorously proved that

$223/71 < \pi < 22/7$

by approximating the circumference of a circle using a regular 96-gon. Since $96 = 2^5 \times 3$, you can draw a regular 96-gon with ruler and compass by taking an equilateral triangle and bisecting its edges to get a hexagon, bisecting the edges of that to get a 12-gon, and so on up to 96. In a more modern way of thinking, you can figure out everything you need to know by starting with the angle $\pi/3$ and using half-angle formulas 4 times to work out the sine or cosine of $\pi/96$. And indeed, before Viète came along, Ludolph van Ceulen had computed pi to 35 digits using a regular polygon with $2^{62}$ sides! So Viète’s daring new idea was to give an exact formula for pi that involved an infinite process.

Now let’s see in detail how Viète’s formula works. Since there’s no need to start with a square, we might as well start with a regular n-gon inscribed in the circle and repeatedly bisect its sides, getting better and better approximations to pi. If we start with a pentagon, we’ll get a formula for pi that involves the golden ratio!

We have

$\displaystyle{ \pi = \lim_{k \to \infty} A_k }$

so we can also compute pi by starting with a regular n-gon and repeatedly doubling the number of vertices:

$\displaystyle{ \pi = \lim_{k \to \infty} A_{2^k n} }$

The key trick is to write $A_{2^k}{n}$ as a ‘telescoping product’:

$A_{2^k n} = A_n \cdot \frac{A_{2n}}{A_n} \cdot \frac{A_{4n}}{A_{2n}} \cdot \frac{A_{8n}}{A_{4n}}$

Thus, taking the limit as $k \to \infty$ we get

$\displaystyle{ \pi = A_n \cdot \frac{A_{2n}}{A_n} \cdot \frac{A_{4n}}{A_{2n}} \cdot \frac{A_{8n}}{A_{4n}} \cdots }$

where we start with the area of the n-gon and keep ‘correcting’ it to get the area of the 2n-gon, the 4n-gon, the 8n-gon and so on.

There’s a simple formula for the area of a regular n-gon inscribed in a circle. You can chop it into $2 n$ right triangles, each of which has base $\sin(\pi/n)$ and height $\cos(\pi/n)$, and thus area $n \sin(\pi/n) \cos(\pi/n)$:

Thus,

$A_n = n \sin(\pi/n) \cos(\pi/n) = \displaystyle{\frac{n}{2} \sin(2 \pi / n)}$

This lets us understand how the area changes when we double the number of vertices:

$\displaystyle{ \frac{A_{n}}{A_{2n}} = \frac{\frac{n}{2} \sin(2 \pi / n)}{n \sin(\pi / n)} = \frac{n \sin( \pi / n) \cos(\pi/n)}{n \sin(\pi / n)} = \cos(\pi/n) }$

This is nice and simple, but we really need a recursive formula for this quantity. Let’s define

$\displaystyle{ R_n = 2\frac{A_{n}}{A_{2n}} = 2 \cos(\pi/n) }$

Why the factor of 2? It simplifies our calculations slightly. We can express $R_{2n}$ in terms of $R_n$ using the half-angle formula for the cosine:

$\displaystyle{ R_{2n} = 2 \cos(\pi/2n) = 2\sqrt{\frac{1 + \cos(\pi/n)}{2}} = \sqrt{2 + R_n} }$

Now we’re ready for some fun! We have

$\begin{array}{ccl} \pi &=& \displaystyle{ A_n \cdot \frac{A_{2n}}{A_n} \cdot \frac{A_{4n}}{A_{2n}} \cdot \frac{A_{8n}}{A_{4n}} \cdots } \\ \\ & = &\displaystyle{ A_n \cdot \frac{2}{R_n} \cdot \frac{2}{R_{2n}} \cdot \frac{2}{R_{4n}} \cdots } \end{array}$

so using our recursive formula $R_{2n} = \sqrt{2 + R_n}$, which holds for any $n$, we get

$\pi = \displaystyle{ A_n \cdot \frac{2}{R_n} \cdot \frac{2}{\sqrt{2 + R_n}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + R_n}}} \cdots }$

I think this deserves to be called the generalized Viète formula. And indeed, if we start with a square, we get

$A_4 = \displaystyle{\frac{4}{2} \sin(2 \pi / 4)} = 2$

and

$R_4 = 2 \cos(\pi/4) = \sqrt{2}$

giving Viète’s formula:

$\pi = \displaystyle{ 2 \cdot \frac{2}{\sqrt{2}} \cdot \frac{2}{\sqrt{2 + \sqrt{2}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2}}}} \cdots }$

as desired!

But what if we start with a pentagon? For this it helps to remember a beautiful but slightly obscure trig fact:

$\cos(\pi / 5) = \Phi/2$

and a slightly less beautiful one:

$\displaystyle{ \sin(2\pi / 5) = \frac{1}{2} \sqrt{2 + \Phi} }$

It’s easy to prove these, and I’ll show you how later. For now, note that they imply

$A_5 = \displaystyle{\frac{5}{2} \sin(2 \pi / 5)} = \frac{5}{4} \sqrt{2 + \Phi}$

and

$R_5 = 2 \cos(\pi/5) = \Phi$

Thus, the formula

$\pi = \displaystyle{ A_5 \cdot \frac{2}{R_5} \cdot \frac{2}{\sqrt{2 + R_5}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + R_5}}} \cdots }$

gives us

$\pi = \displaystyle{ \frac{5}{4} \sqrt{2 + \Phi} \cdot \frac{2}{\Phi} \cdot \frac{2}{\sqrt{2 + \Phi}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \Phi}}} \cdots }$

or, cleaning it up a bit, the formula we want:

$\pi = \frac{5}{\Phi} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \Phi}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2 + \Phi}}}} \cdot \frac{2}{\sqrt{2 + \sqrt{2 + \sqrt{2 + \sqrt{2 + \Phi}}}}} \cdots$

Voilà!

There’s a lot more to say, but let me just explain the slightly obscure trigonometry facts we needed. To derive these, I find it nice to remember that a regular pentagon, and the pentagram inside it, contain lots of similar triangles:

Using the fact that all these triangles are similar, it’s easy to show that for any one, the ratio of the long side to the short side is $\Phi$ to 1, since

$\displaystyle{\Phi = 1 + \frac{1}{\Phi} }$

Another important fact is that the pentagram trisects the interior angle of the regular pentagon, breaking the interior angle of $108^\circ = 3\pi/5$ into 3 angles of $36^\circ = \pi/5$:

Again this is easy and fun to show.

Combining these facts, we can prove that

$\displaystyle{ \cos(2\pi/5) = \frac{1}{2\Phi} }$

and

$\displaystyle{ \cos(\pi/5) = \frac{\Phi}{2} }$

To prove the first equation, chop one of those golden triangles into two right triangles and do things you learned in high school. To prove the second, do the same things to one of the short squat isosceles triangles:

Starting from these equations and using $\cos^2 \theta + \sin^2 \theta = 1$, we can show

$\displaystyle{ \sin(2\pi/5) = \frac{1}{2}\sqrt{2 + \Phi}}$

and, just for completeness (we don’t need it here):

$\displaystyle{ \sin(\pi/5) = \frac{1}{2}\sqrt{3 - \Phi}}$

These require some mildly annoying calculations, where it helps to use the identity

$\displaystyle{\frac{1}{\Phi^2} = 2 - \Phi }$

Okay, that’s all for now! But if you want more fun, try a couple of puzzles:

Puzzle 1. We’ve gotten formulas for pi starting from a square or a regular pentagon. What formula do you get starting from an equilateral triangle?

Puzzle 2. Using the generalized Viète formula, prove Euler’s formula

$\displaystyle{ \frac{\sin x}{x} = \cos\frac{x}{2} \cdot \cos\frac{x}{4} \cdot \cos\frac{x}{8} \cdots }$

Conversely, use Euler’s formula to prove the generalized Viète formula.

So, one might say that the real point of Viète’s formula, and its generalized version, is not any special property of pi, but Euler’s formula.

## Saving Climate Data (Part 6)

23 February, 2017

Scott Pruitt, who filed legal challenges against Environmental Protection Agency rules fourteen times, working hand in hand with oil and gas companies, is now head of that agency. What does that mean about the safety of climate data on the EPA’s websites? Here is an inside report:

• Dawn Reeves, EPA preserves Obama-Era website but climate change data doubts remain, InsideEPA.com, 21 February 2017.

For those of us who are backing up climate data, the really important stuff is in red near the bottom.

The EPA has posted a link to an archived version of its website from Jan. 19, the day before President Donald Trump was inaugurated and the agency began removing climate change-related information from its official site, saying the move comes in response to concerns that it would permanently scrub such data.

However, the archived version notes that links to climate and other environmental databases will go to current versions of them—continuing the fears that the Trump EPA will remove or destroy crucial greenhouse gas and other data.

The archived version was put in place and linked to the main page in response to “numerous [Freedom of Information Act (FOIA)] requests regarding historic versions of the EPA website,” says an email to agency staff shared by the press office. “The Agency is making its best reasonable effort to 1) preserve agency records that are the subject of a request; 2) produce requested agency records in the format requested; and 3) post frequently requested agency records in electronic format for public inspection. To meet these goals, EPA has re-posted a snapshot of the EPA website as it existed on January 19, 2017.”

The email adds that the action is similar to the snapshot taken of the Obama White House website.

For example, it says the page is “not the current EPA website” and that the archive includes “static content, such as webpages and reports in Portable Document Format (PDF), as that content appeared on EPA’s website as of January 19, 2017.”

It cites technical limits for the database exclusions. “For example, many of the links contained on EPA’s website are to databases that are updated with the new information on a regular basis. These databases are not part of the static content that comprises the Web Snapshot.” Searches of the databases from the archive “will take you to the current version of the database,” the agency says.

“In addition, links may have been broken in the website as it appeared” on Jan. 19 and those will remain broken on the snapshot. Links that are no longer active will also appear as broken in the snapshot.

“Finally, certain extremely large collections of content… were not included in the Snapshot due to their size” such as AirNow images, radiation network graphs, historic air technology transfer network information, and EPA’s searchable news releases.”

#### ‘Smart’ Move

One source urging the preservation of the data says the snapshot appears to be a “smart” move on EPA’s behalf, given the FOIA requests it has received, and notes that even though other groups like NextGen Climate and scientists have been working to capture EPA’s online information, having it on EPA’s site makes it official.

But it could also be a signal that big changes are coming to the official Trump EPA site, and it is unclear how long the agency will maintain the archived version.

The source says while it is disappointing that the archive may signal the imminent removal of EPA’s climate site, “at least they are trying to accommodate public concerns” to preserve the information.

A second source adds that while it is good that EPA is seeking “to address the widespread concern” that the information will be removed by an administration that does not believe in human-caused climate change, “on the other hand, it doesn’t address the primary concern of the data. It is snapshots of the web text.” Also, information “not included,” such as climate databases, is what is difficult to capture by outside groups and is what really must be preserved.

“If they take [information] down” that groups have been trying to preserve, then the underlying concern about access to data remains. “Web crawlers and programs can do things that are easy,” such as taking snapshots of text, “but getting the data inside the database is much more challenging,” the source says.

The first source notes that EPA’s searchable databases, such as those maintained by its Clean Air Markets Division, are used by the public “all the time.”

The agency’s Office of General Counsel (OGC) Jan. 25 began a review of the implications of taking down the climate page—a planned wholesale removal that was temporarily suspended to allow for the OGC review.

But EPA did remove some specific climate information, including links to the Clean Power Plan and references to President Barack Obama’s Climate Action Plan. Inside EPA captured this screenshot of the “What EPA Is Doing” page regarding climate change. Those links are missing on the Trump EPA site. The archive includes the same version of the page as captured by our screenshot.

Inside EPA first reported the plans to take down the climate information on Jan. 17.

After the OGC investigation began, a source close to the Trump administration said Jan. 31 that climate “propaganda” would be taken down from the EPA site, but that the agency is not expected to remove databases on GHG emissions or climate science. “Eventually… the propaganda will get removed…. Most of what is there is not data. Most of what is there is interpretation.”

The Sierra Club and Environmental Defense Fund both filed FOIA requests asking the agency to preserve its climate data, while attorneys representing youth plaintiffs in a federal climate change lawsuit against the government have also asked the Department of Justice to ensure the data related to its claims is preserved.

The Azimuth Climate Data Backup Project and other groups are making copies of actual databases, not just the visible portions of websites.

## Azimuth Backup Project (Part 4)

18 February, 2017

The Azimuth Climate Data Backup Project is going well! Our Kickstarter campaign ended on January 31st and the money has recently reached us. Our original goal was $5000. We got$20,427 of donations, and after Kickstarter took its cut we received $18,590.96. Next time I’ll tell you what our project has actually been doing. This time I just want to give a huge “thank you!” to all 627 people who contributed money on Kickstarter! I sent out thank you notes to everyone, updating them on our progress and asking if they wanted their names listed. The blanks in the following list represent people who either didn’t reply, didn’t want their names listed, or backed out and decided not to give money. I’ll list people in chronological order: first contributors first. Only 12 people backed out; the vast majority of blanks on this list are people who haven’t replied to my email. I noticed some interesting but obvious patterns. For example, people who contributed later are less likely to have answered my email yet—I’ll update this list later. People who contributed more money were more likely to answer my email. The magnitude of contributions ranged from$2000 to \$1. A few people offered to help in other ways. The response was international—this was really heartwarming! People from the US were more likely than others to ask not to be listed.

But instead of continuing to list statistical patterns, let me just thank everyone who contributed.

Daniel Estrada
Ahmed Amer
Saeed Masroor
Jodi Kaplan
John Wehrle
Bob Calder
Andrea Borgia
L Gardner

Uche Eke
Keith Warner
Dean Kalahan
James Benson
Dianne Hackborn

Walter Hahn
Thomas Savarino
Noah Friedman
Eric Willisson
Jeffrey Gilmore
John Bennett
Glenn McDavid

Brian Turner

Peter Bagaric

Martin Dahl Nielsen
Broc Stenman

Gabriel Scherer
Roice Nelson
Felipe Pait
Kenneth Hertz

Luis Bruno

Andrew Lottmann
Alex Morse

Noam Zeilberger

Buffy Lyon

Josh Wilcox

Danny Borg

Krishna Bhogaonker
Harald Tveit Alvestrand

Tarek A. Hijaz, MD
Jouni Pohjola
Chavdar Petkov
Markus Jöbstl
Bjørn Borud

Sarah G

William Straub

Frank Harper
Carsten Führmann
Rick Angel
Drew Armstrong

Jesimpson

Valeria de Paiva
Ron Prater
David Tanzer

Rafael Laguna
Miguel Esteves dos Santos
Sophie Dennison-Gibby

Randy Drexler
Peter Haggstrom

Jerzy Michał Pawlak
Santini Basra
Jenny Meyer

John Iskra

Bruce Jones
Māris Ozols
Everett Rubel

Mike D
Manik Uppal
Todd Trimble

Federer Fanatic

Forrest Samuel, Harmos Consulting

Annie Wynn
Norman and Marcia Dresner

Daniel Mattingly
James W. Crosby

Jennifer Booth
Greg Randolph

Dave and Karen Deeter

Sarah Truebe

Tieg Zaharia
Jeffrey Salfen
Birian Abelson

Logan McDonald

Brian Truebe
Jon Leland

Nicole

Sarah Lim

James Turnbull

John Huerta
Katie Mandel Bruce
Bethany Summer

Heather Tilert

Naom Hart
Aaron Riley

Giampiero Campa

Julie A. Sylvia

Pace Willisson

Bangskij

Peter Herschberg

Alaistair Farrugia

Conor Hennessy

Stephanie Mohr

Torinthiel

Lincoln Muri
Anet Ferwerda

Hanna

Michelle Lee Guiney

Ben Doherty
Trace Hagemann

Ryan Mannion

Penni and Terry O'Hearn

Brian Bassham
Caitlin Murphy
John Verran

Susan

Alexander Hawson
Fabrizio Mafessoni
Anita Phagan
Nicolas Acuña
Niklas Brunberg

V. Lazaro Zamora

Branford Werner
Niklas Starck Westerberg
Luca Zenti and Marta Veneziano

Ilja Preuß
Christopher Flint

Courtney Leigh

Katharina Spoerri

Daniel Risse

Hanna
Charles-Etienne Jamme
rhackman41

Jeff Leggett

RKBookman

Aaron Paul
Mike Metzler

Patrick Leiser

Melinda

Ryan Vaughn
Kent Crispin

Michael Teague

Ben

Fabian Bach
Steven Canning

Betsy McCall

John Rees

Mary Peters

Shane Claridge
Thomas Negovan
Tom Grace
Justin Jones

Jason Mitchell

Josh Weber
Rebecca Lynne Hanginger
Kirby

Dawn Conniff

Michael T. Astolfi

Kristeva

Erik
Keith Uber

Elaine Mazerolle
Matthieu Walraet

Linda Penfold

Lujia Liu

Keith

Samar Tareem

Henrik Almén
Michael Deakin
Rutger Ockhorst

Erin Bassett
James Crook

Junior Eluhu
Dan Laufer
Carl
Robert Solovay

Silica Magazine

Leonard Saers
Alfredo Arroyo García

Larry Yu

John Behemonth

Eric Humphrey

Svein Halvor Halvorsen

Karim Issa

Øystein Risan Borgersen
David Anderson Bell III

Ole-Morten Duesend

Robert Biegler

Qu Wenhao

Steffen Dittmar

Shanna Germain

John WS Marvin (Dread Unicorn Games)

Bill Carter
Darth Chronis

Lawrence Stewart

Gareth Hodges

Colin Backhurst
Christopher Metzger

Rachel Gumper

Mariah Thompson

Johnathan Salter

Maggie Unkefer
Shawna Maryanovich

Wilhelm Fitzpatrick
Dylan “ExoByte” Mayo
Lynda Lee

Scott Carpenter

Charles D, Payet
Vince Rostkowski

Tim Brown
Raven Daegmorgan
Zak Brueckner

Christian Page

Steven Greenberg
Chuck Lunney

Natasha Anicich

Bram De Bie
Edward L

Gray Detrick
Robert

Sarah Russell

Sam Leavin

Abilash Pulicken

Isabel Olondriz
James Pierce
James Morrison

April Daniels

José Tremblay Champagne

Chris Edmonds

Hans & Maria Cummings
Bart Gasiewiski

Andy Chamard

Andrew Jackson

Christopher Wright

Crystal Collins

ichimonji10

Alan Stern
Alison W

Dag Henrik Bråtane

Martin Nilsson



## Saving Climate Data (Part 5)

6 February, 2017

There’s a lot going on! Here’s a news roundup. I will separately talk about what the Azimuth Climate Data Backup Project is doing.

### Tweaking the EPA website

Scientists are keeping track of how Trump administration is changing the Environmental Protection Agency website, with before-and-after photos, and analysis:

• Brian Kahn, Behold the “tweaks” Trump has made to the EPA website (so far), National Resources Defense Council blog, 3 February 2017.

All of this would be nothing compared to the new bill to eliminate the EPA, or Myron Ebell’s plan to fire most of the people working there:

• Joe Davidson, Trump transition leader’s goal is two-thirds cut in EPA employees, Washington Post, 30 January 2017.

If you want to keep track of this battle, I recommend getting a 30-day free subscription to this online magazine:

### Taking animal welfare data offline

The Trump team is taking animal-welfare data offline. The US Department of Agriculture will no longer make lab inspection results and violations publicly available, citing privacy concerns:

• Sara Reardon, US government takes animal-welfare data offline, Nature Breaking News, 3 Feburary 2017.

A new bill would prevent the US government from providing access to geospatial data if it helps people understand housing discrimination. It goes like this:

Notwithstanding any other provision of law, no Federal funds may be used to design, build, maintain, utilize, or provide access to a Federal database of geospatial information on community racial disparities or disparities in access to affordable housing._

For more on this bill, and the important ways in which such data has been used, see:

• Abraham Gutman, Scott Burris, and the Temple University Center for Public Health Law Research, Where will data take the Trump administration on housing?, Philly.com, 1 February 2017.

### The EDGI fights back

The Environmental Data and Governance Initiative or EDGI is working to archive public environmental data. They’re helping coordinate data rescue events. You can attend one and have fun eating pizza with cool people while saving data:

• 3 February 2017, Portland
• 4 February 2017, New York City
• 10-11 February 2017, Austin Texas
• 11 February 2017, U. C. Berkeley, California
• 18 February 2017, MIT, Cambridge Massachusetts
• 18 February 2017, Haverford Connecticut
• 18-19 February 2017, Washington DC
• 26 February 2017, Twin Cities, Minnesota

Or, work with EDGI to organize one your own data rescue event! They provide some online tools to help download data.

I know there will also be another event at UCLA, so the above list is not complete, and it will probably change and grow over time. Keep up-to-date at their site:

### Scientists fight back

The pushback is so big it’s hard to list it all! For now I’ll just quote some of this article:

• Tabitha Powledge, The gag reflex: Trump info shutdowns at US science agencies, especially EPA, 27 January 2017.

THE PUSHBACK FROM SCIENCE HAS BEGUN

Predictably, counter-tweets claiming to come from rebellious employees at the EPA, the Forest Service, the USDA, and NASA sprang up immediately. At The Verge, Rich McCormick says there’s reason to believe these claims may be genuine, although none has yet been verified. A lovely head on this post: “On the internet, nobody knows if you’re a National Park.”

At Hit&Run, Ronald Bailey provides handles for several of these alt tweet streams, which he calls “the revolt of the permanent government.” (That’s a compliment.)

Bailey argues, “with exception perhaps of some minor amount of national security intelligence, there is no good reason that any information, data, studies, and reports that federal agencies produce should be kept from the public and press. In any case, I will be following the Alt_Bureaucracy feeds for a while.”

NeuroDojo Zen Faulkes posted on how to demand that scientific societies show some backbone. “Ask yourself: “Have my professional societies done anything more political than say, ‘Please don’t cut funding?’” Will they fight?,” he asked.

Scientists associated with the group_ 500 Women Scientists _donned lab coats and marched in DC as part of the Women’s March on Washington the day after Trump’s Inauguration, Robinson Meyer reported at the Atlantic. A wildlife ecologist from North Carolina told Meyer, “I just can’t believe we’re having to yell, ‘Science is real.’”

Taking a cue from how the Women’s March did its social media organizing, other scientists who want to set up a Washington march of their own have put together a closed Facebook group that claims more than 600,000 members, Kate Sheridan writes at STAT.

The #ScienceMarch Twitter feed says a date for the march will be posted in a few days. [The march will be on 22 April 2017.] The group also plans to release tools to help people interested in local marches coordinate their efforts and avoid duplication.

At The Atlantic, Ed Yong describes the political action committee 314Action. (314=the first three digits of pi.)

Among other political activities, it is holding a webinar on Pi Day—March 14—to explain to scientists how to run for office. Yong calls 314Action the science version of Emily’s List, which helps pro-choice candidates run for office. 314Action says it is ready to connect potential candidate scientists with mentors—and donors.

Other groups may be willing to step in when government agencies wimp out. A few days before the Inauguration, the Centers for Disease Control and Prevention abruptly and with no explanation cancelled a 3-day meeting on the health effects of climate change scheduled for February. Scientists told Ars Technica’s Beth Mole that CDC has a history of running away from politicized issues.

One of the conference organizers from the American Public Health Association was quoted as saying nobody told the organizers to cancel.

I believe it. Just one more example of the chilling effect on global warming. In politics, once the Dear Leader’s wishes are known, some hirelings will rush to gratify them without being asked.

The APHA guy said they simply wanted to head off a potential last-minute cancellation. Yeah, I guess an anticipatory pre-cancellation would do that.

But then—Al Gore to the rescue! He is joining with a number of health groups—including the American Public Health Association—to hold a one-day meeting on the topic Feb 16 at the Carter Center in Atlanta, CDC’s home base. Vox’s Julia Belluz reports that it is not clear whether CDC officials will be part of the Gore rescue event.

### The Sierra Club fights back

The Sierra Club, of which I’m a proud member, is using the Freedom of Information Act or FOIA to battle or at least slow the deletion of government databases. They wisely started even before Trump took power:

• Jennifer A Dlouhy, Fearing Trump data purge, environmentalists push to get records, BloombergMarkets, 13 January 2017.

Here’s how the strategy works:

U.S. government scientists frantically copying climate data they fear will disappear under the Trump administration may get extra time to safeguard the information, courtesy of a novel legal bid by the Sierra Club.

The environmental group is turning to open records requests to protect the resources and keep them from being deleted or made inaccessible, beginning with information housed at the Environmental Protection Agency and the Department of Energy. On Thursday [January 9th], the organization filed Freedom of Information Act requests asking those agencies to turn over a slew of records, including data on greenhouse gas emissions, traditional air pollution and power plants.

The rationale is simple: Federal laws and regulations generally block government agencies from destroying files that are being considered for release. Even if the Sierra Club’s FOIA requests are later rejected, the record-seeking alone could prevent files from being zapped quickly. And if the records are released, they could be stored independently on non-government computer servers, accessible even if other versions go offline.

## Information Geometry (Part 16)

1 February, 2017

This week I’m giving a talk on biology and information:

• John Baez, Biology as information dynamics, talk for Biological Complexity: Can it be Quantified?, a workshop at the Beyond Center, 2 February 2017.

While preparing this talk, I discovered a cool fact. I doubt it’s new, but I haven’t exactly seen it elsewhere. I came up with it while trying to give a precise and general statement of ‘Fisher’s fundamental theorem of natural selection’. I won’t start by explaining that theorem, since my version looks rather different than Fisher’s, and I came up with mine precisely because I had trouble understanding his. I’ll say a bit more about this at the end.

Here’s my version:

The square of the rate at which a population learns information is the variance of its fitness.

This is a nice advertisement for the virtues of diversity: more variance means faster learning. But it requires some explanation!

### The setup

Let’s start by assuming we have $n$ different kinds of self-replicating entities with populations $P_1, \dots, P_n.$ As usual, these could be all sorts of things:

• molecules of different chemicals
• organisms belonging to different species
• genes of different alleles
• restaurants belonging to different chains
• people with different beliefs
• game-players with different strategies
• etc.

I’ll call them replicators of different species.

Let’s suppose each population $P_i$ is a function of time that grows at a rate equal to this population times its ‘fitness’. I explained the resulting equation back in Part 9, but it’s pretty simple:

$\displaystyle{ \frac{d}{d t} P_i(t) = f_i(P_1(t), \dots, P_n(t)) \, P_i(t) }$

Here $f_i$ is a completely arbitrary smooth function of all the populations! We call it the fitness of the ith species.

This equation is important, so we want a short way to write it. I’ll often write $f_i(P_1(t), \dots, P_n(t))$ simply as $f_i,$ and $P_i(t)$ simply as $P_i.$ With these abbreviations, which any red-blooded physicist would take for granted, our equation becomes simply this:

$\displaystyle{ \frac{dP_i}{d t} = f_i \, P_i }$

Next, let $p_i(t)$ be the probability that a randomly chosen organism is of the ith species:

$\displaystyle{ p_i(t) = \frac{P_i(t)}{\sum_j P_j(t)} }$

Starting from our equation describing how the populations evolve, we can figure out how these probabilities evolve. The answer is called the replicator equation:

$\displaystyle{ \frac{d}{d t} p_i(t) = ( f_i - \langle f \rangle ) \, p_i(t) }$

Here $\langle f \rangle$ is the average fitness of all the replicators, or mean fitness:

$\displaystyle{ \langle f \rangle = \sum_j f_j(P_1(t), \dots, P_n(t)) \, p_j(t) }$

In what follows I’ll abbreviate the replicator equation as follows:

$\displaystyle{ \frac{dp_i}{d t} = ( f_i - \langle f \rangle ) \, p_i }$

### The result

Okay, now let’s figure out how fast the probability distribution

$p(t) = (p_1(t), \dots, p_n(t))$

changes with time. For this we need to choose a way to measure the length of the vector

$\displaystyle{ \frac{dp}{dt} = (\frac{d}{dt} p_1(t), \dots, \frac{d}{dt} p_n(t)) }$

And here information geometry comes to the rescue! We can use the Fisher information metric, which is a Riemannian metric on the space of probability distributions.

I’ve talked about the Fisher information metric in many ways in this series. The most important fact is that as a probability distribution $p(t)$ changes with time, its speed

$\displaystyle{ \left\| \frac{dp}{dt} \right\|}$

as measured using the Fisher information metric can be seen as the rate at which information is learned. I’ll explain that later. Right now I just want a simple formula for the Fisher information metric. Suppose $v$ and $w$ are two tangent vectors to the point $p$ in the space of probability distributions. Then the Fisher information metric is given as follows:

$\displaystyle{ \langle v, w \rangle = \sum_i \frac{1}{p_i} \, v_i w_i }$

Using this we can calculate the speed at which $p(t)$ moves when it obeys the replicator equation. Actually the square of the speed is simpler:

$\begin{array}{ccl} \displaystyle{ \left\| \frac{dp}{dt} \right\|^2 } &=& \displaystyle{ \sum_i \frac{1}{p_i} \left( \frac{dp_i}{dt} \right)^2 } \\ \\ &=& \displaystyle{ \sum_i \frac{1}{p_i} \left( ( f_i - \langle f \rangle ) \, p_i \right)^2 } \\ \\ &=& \displaystyle{ \sum_i ( f_i - \langle f \rangle )^2 p_i } \end{array}$

The answer has a nice meaning, too! It’s just the variance of the fitness: that is, the square of its standard deviation.

So, if you’re willing to buy my claim that the speed $\|dp/dt\|$ is the rate at which our population learns new information, then we’ve seen that the square of the rate at which a population learns information is the variance of its fitness!

### Fisher’s fundamental theorem

Now, how is this related to Fisher’s fundamental theorem of natural selection? First of all, what is Fisher’s fundamental theorem? Here’s what Wikipedia says about it:

It uses some mathematical notation but is not a theorem in the mathematical sense.

It states:

“The rate of increase in fitness of any organism at any time is equal to its genetic variance in fitness at that time.”

Or in more modern terminology:

“The rate of increase in the mean fitness of any organism at any time ascribable to natural selection acting through changes in gene frequencies is exactly equal to its genetic variance in fitness at that time”.

Largely as a result of Fisher’s feud with the American geneticist Sewall Wright about adaptive landscapes, the theorem was widely misunderstood to mean that the average fitness of a population would always increase, even though models showed this not to be the case. In 1972, George R. Price showed that Fisher’s theorem was indeed correct (and that Fisher’s proof was also correct, given a typo or two), but did not find it to be of great significance. The sophistication that Price pointed out, and that had made understanding difficult, is that the theorem gives a formula for part of the change in gene frequency, and not for all of it. This is a part that can be said to be due to natural selection

Price’s paper is here:

• George R. Price, Fisher’s ‘fundamental theorem’ made clear, Annals of Human Genetics 36 (1972), 129–140.

I don’t find it very clear, perhaps because I didn’t spend enough time on it. But I think I get the idea.

My result is a theorem in the mathematical sense, though quite an easy one. I assume a population distribution evolves according to the replicator equation and derive an equation whose right-hand side matches that of Fisher’s original equation: the variance of the fitness.

But my left-hand side is different: it’s the square of the speed of the corresponding probability distribution, where speed is measured using the ‘Fisher information metric’. This metric was discovered by the same guy, Ronald Fisher, but I don’t think he used it in his work on the fundamental theorem!

Something a bit similar to my statement appears as Theorem 2 of this paper:

• Marc Harper, Information geometry and evolutionary game theory.

and for that theorem he cites:

• Josef Hofbauer and Karl Sigmund, Evolutionary Games and Population Dynamics, Cambridge University Press, Cambridge, 1998.

However, his Theorem 2 really concerns the rate of increase of fitness, like Fisher’s fundamental theorem. Moreover, he assumes that the probability distribution $p(t)$ flows along the gradient of a function, and I’m not assuming that. Indeed, my version applies to situations where the probability distribution moves round and round in periodic orbits!

### Relative information and the Fisher information metric

The key to generalizing Fisher’s fundamental theorem is thus to focus on the speed at which $p(t)$ moves, rather than the increase in fitness. Why do I call this speed the ‘rate at which the population learns information’? It’s because we’re measuring this speed using the Fisher information metric, which is closely connected to relative information, also known as relative entropy or the Kullback–Leibler divergence.

I explained this back in Part 7, but that explanation seems hopelessly technical to me now, so here’s a faster one, which I created while preparing my talk.

The information of a probability distribution $q$ relative to a probability distribution $p$ is

$\displaystyle{ I(q,p) = \sum_{i =1}^n q_i \log\left(\frac{q_i}{p_i}\right) }$

It says how much information you learn if you start with a hypothesis $p$ saying that the probability of the ith situation was $p_i,$ and then update this to a new hypothesis $q.$

Now suppose you have a hypothesis that’s changing with time in a smooth way, given by a time-dependent probability $p(t).$ Then a calculation shows that

$\displaystyle{ \left.\frac{d}{dt} I(p(t),p(t_0)) \right|_{t = t_0} = 0 }$

for all times $t_0$. This seems paradoxical at first. I like to jokingly put it this way:

To first order, you’re never learning anything.

However, as long as the velocity $\frac{d}{dt}p(t_0)$ is nonzero, we have

$\displaystyle{ \left.\frac{d^2}{dt^2} I(p(t),p(t_0)) \right|_{t = t_0} > 0 }$

so we can say

To second order, you’re always learning something… unless your opinions are fixed.

This lets us define a ‘rate of learning’—that is, a ‘speed’ at which the probability distribution $p(t)$ moves. And this is precisely the speed given by the Fisher information metric!

In other words:

$\displaystyle{ \left\|\frac{dp}{dt}(t_0)\right\|^2 = \left.\frac{d^2}{dt^2} I(p(t),p(t_0)) \right|_{t = t_0} }$

where the length is given by Fisher information metric. Indeed, this formula can be used to define the Fisher information metric. From this definition we can easily work out the concrete formula I gave earlier.

In summary: as a probability distribution moves around, the relative information between the new probability distribution and the original one grows approximately as the square of time, not linearly. So, to talk about a ‘rate at which information is learned’, we need to use the above formula, involving a second time derivative. This rate is just the speed at which the probability distribution moves, measured using the Fisher information metric. And when we have a probability distribution describing how many replicators are of different species, and it’s evolving according to the replicator equation, this speed is also just the variance of the fitness!

## Biology as Information Dynamics

31 January, 2017

This is my talk for the workshop Biological Complexity: Can It Be Quantified?

• John Baez, Biology as information dynamics, 2 February 2017.

Abstract. If biology is the study of self-replicating entities, and we want to understand the role of information, it makes sense to see how information theory is connected to the ‘replicator equation’—a simple model of population dynamics for self-replicating entities. The relevant concept of information turns out to be the information of one probability distribution relative to another, also known as the Kullback–Leibler divergence. Using this we can get a new outlook on free energy, see evolution as a learning process, and give a clean general formulation of Fisher’s fundamental theorem of natural selection.