## Biology as Information Dynamics (Part 2)

27 April, 2017

Here’s a video of the talk I gave at the Stanford Complexity Group:

You can see slides here:

Abstract. If biology is the study of self-replicating entities, and we want to understand the role of information, it makes sense to see how information theory is connected to the ‘replicator equation’ — a simple model of population dynamics for self-replicating entities. The relevant concept of information turns out to be the information of one probability distribution relative to another, also known as the Kullback–Liebler divergence. Using this we can get a new outlook on free energy, see evolution as a learning process, and give a clearer, more general formulation of Fisher’s fundamental theorem of natural selection.

I’d given a version of this talk earlier this year at a workshop on Quantifying biological complexity, but I’m glad this second try got videotaped and not the first, because I was a lot happier about my talk this time. And as you’ll see at the end, there were a lot of interesting questions.

## Complexity Theory and Evolution in Economics

24 April, 2017

This book looks interesting:

• David S. Wilson and Alan Kirman, editors, Complexity and Evolution: Toward a New Synthesis for Economics, MIT Press, Cambridge Mass., 2016.

You can get some chapters for free here. I’ve only looked carefully at this one:

• Joshua M. Epstein and Julia Chelen, Advancing Agent_Zero.

Agent_Zero is a simple toy model of an agent that’s not the idealized rational actor often studied in economics: rather, it has emotional, deliberative, and social modules which interact with each other to make decisions. Epstein and Chelen simulate collections of such agents and see what they do:

Abstract. Agent_Zero is a mathematical and computational individual that can generate important, but insufficiently understood, social dynamics from the bottom up. First published by Epstein (2013), this new theoretical entity possesses emotional, deliberative, and social modules, each grounded in contemporary neuroscience. Agent_Zero’s observable behavior results from the interaction of these internal modules. When multiple Agent_Zeros interact with one another, a wide range of important, even disturbing, collective dynamics emerge. These dynamics are not straightforwardly generated using the canonical rational actor which has dominated mathematical social science since the 1940s. Following a concise exposition of the Agent_Zero model, this chapter offers a range of fertile research directions, including the use of realistic geographies and population levels, the exploration of new internal modules and new interactions among them, the development of formal axioms for modular agents, empirical testing, the replication of historical episodes, and practical applications. These may all serve to advance the Agent_Zero research program.

It sounds like a fun and productive project as long as one keeps ones wits about one. It’s hard to draw conclusions about human behavior from such simplified agents. One can argue about this, and of course economists will. But regardless of this, one can draw conclusions about which kinds of simplified agents will engage in which kinds of collective behavior under which conditions.

Basically, one can start mapping out a small simple corner of the huge ‘phase space’ of possible societies. And that’s bound to lead to interesting new ideas that one wouldn’t get from either 1) empirical research on human and animal societies or 2) pure theoretical pondering without the help of simulations.

Here’s an article whose title, at least, takes a vastly more sanguine attitude toward benefits of such work:

• Kate Douglas, Orthodox economics is broken: how evolution, ecology, and collective behavior can help us avoid catastrophe, Evonomics, 22 July 2016.

I’ll quote just a bit:

For simplicity’s sake, orthodox economics assumes that Homo economicus, when making a fundamental decision such as whether to buy or sell something, has access to all relevant information. And because our made-up economic cousins are so rational and self-interested, when the price of an asset is too high, say, they wouldn’t buy—so the price falls. This leads to the notion that economies self-organise into an equilibrium state, where supply and demand are equal.

Real humans—be they Wall Street traders or customers in Walmart—don’t always have accurate information to hand, nor do they act rationally. And they certainly don’t act in isolation. We learn from each other, and what we value, buy and invest in is strongly influenced by our beliefs and cultural norms, which themselves change over time and space.

“Many preferences are dynamic, especially as individuals move between groups, and completely new preferences may arise through the mixing of peoples as they create new identities,” says anthropologist Adrian Bell at the University of Utah in Salt Lake City. “Economists need to take cultural evolution more seriously,” he says, because it would help them understand who or what drives shifts in behaviour.

Using a mathematical model of price fluctuations, for example, Bell has shown that prestige bias—our tendency to copy successful or prestigious individuals—influences pricing and investor behaviour in a way that creates or exacerbates market bubbles.

We also adapt our decisions according to the situation, which in turn changes the situations faced by others, and so on. The stability or otherwise of financial markets, for instance, depends to a great extent on traders, whose strategies vary according to what they expect to be most profitable at any one time. “The economy should be considered as a complex adaptive system in which the agents constantly react to, influence and are influenced by the other individuals in the economy,” says Kirman.

This is where biologists might help. Some researchers are used to exploring the nature and functions of complex interactions between networks of individuals as part of their attempts to understand swarms of locusts, termite colonies or entire ecosystems. Their work has provided insights into how information spreads within groups and how that influences consensus decision-making, says Iain Couzin from the Max Planck Institute for Ornithology in Konstanz, Germany—insights that could potentially improve our understanding of financial markets.

Take the popular notion of the “wisdom of the crowd”—the belief that large groups of people can make smart decisions even when poorly informed, because individual errors of judgement based on imperfect information tend to cancel out. In orthodox economics, the wisdom of the crowd helps to determine the prices of assets and ensure that markets function efficiently. “This is often misplaced,” says Couzin, who studies collective behaviour in animals from locusts to fish and baboons.

By creating a computer model based on how these animals make consensus decisions, Couzin and his colleagues showed last year that the wisdom of the crowd works only under certain conditions—and that contrary to popular belief, small groups with access to many sources of information tend to make the best decisions.

That’s because the individual decisions that make up the consensus are based on two types of environmental cue: those to which the entire group are exposed—known as high-correlation cues—and those that only some individuals see, or low-correlation cues. Couzin found that in larger groups, the information known by all members drowns out that which only a few individuals noticed. So if the widely known information is unreliable, larger groups make poor decisions. Smaller groups, on the other hand, still make good decisions because they rely on a greater diversity of information.

So when it comes to organising large businesses or financial institutions, “we need to think about leaders, hierarchies and who has what information”, says Couzin. Decision-making structures based on groups of between eight and 12 individuals, rather than larger boards of directors, might prevent over-reliance on highly correlated information, which can compromise collective intelligence. Operating in a series of smaller groups may help prevent decision-makers from indulging their natural tendency to follow the pack, says Kirman.

Taking into account such effects requires economists to abandon one-size-fits-all mathematical formulae in favour of “agent-based” modelling—computer programs that give virtual economic agents differing characteristics that in turn determine interactions. That’s easier said than done: just like economists, biologists usually model relatively simple agents with simple rules of interaction. How do you model a human?

It’s a nut we’re beginning to crack. One attendee at the forum was Joshua Epstein, director of the Center for Advanced Modelling at Johns Hopkins University in Baltimore, Maryland. He and his colleagues have come up with Agent_Zero, an open-source software template for a more human-like actor influenced by emotion, reason and social pressures. Collections of Agent_Zeros think, feel and deliberate. They have more human-like relationships with other agents and groups, and their interactions lead to social conflict, violence and financial panic. Agent_Zero offers economists a way to explore a range of scenarios and see which best matches what is going on in the real world. This kind of sophistication means they could potentially create scenarios approaching the complexity of real life.

Orthodox economics likes to portray economies as stately ships proceeding forwards on an even keel, occasionally buffeted by unforeseen storms. Kirman prefers a different metaphor, one borrowed from biology: economies are like slime moulds, collections of single-celled organisms that move as a single body, constantly reorganising themselves to slide in directions that are neither understood nor necessarily desired by their component parts.

For Kirman, viewing economies as complex adaptive systems might help us understand how they evolve over time—and perhaps even suggest ways to make them more robust and adaptable. He’s not alone. Drawing analogies between financial and biological networks, the Bank of England’s research chief Andrew Haldane and University of Oxford ecologist Robert May have together argued that we should be less concerned with the robustness of individual banks than the contagious effects of one bank’s problems on others to which it is connected. Approaches like this might help markets to avoid failures that come from within the system itself, Kirman says.

To put this view of macroeconomics into practice, however, might mean making it more like weather forecasting, which has improved its accuracy by feeding enormous amounts of real-time data into computer simulation models that are tested against each other. That’s not going to be easy.

## Stanford Complexity Group

19 April, 2017

Aaron Goodman of the Stanford Complexity Group invited me to give a talk there on Thursday April 20th. If you’re nearby—like in Silicon Valley—please drop by! It will be in Clark S361 at 4:20 pm.

Here’s the idea. Everyone likes to say that biology is all about information. There’s something true about this—just think about DNA. But what does this insight actually do for us, quantitatively speaking? To figure this out, we need to do some work.

Biology is also about things that make copies of themselves. So it makes sense to figure out how information theory is connected to the replicator equation—a simple model of population dynamics for self-replicating entities.

To see the connection, we need to use ‘relative information’: the information of one probability distribution relative to another, also known as the Kullback–Leibler divergence. Then everything pops into sharp focus.

It turns out that free energy—energy in forms that can actually be used, not just waste heat—is a special case of relative information Since the decrease of free energy is what drives chemical reactions, biochemistry is founded on relative information.

But there’s a lot more to it than this! Using relative information we can also see evolution as a learning process, fix the problems with Fisher’s fundamental theorem of natural selection, and more.

So this what I’ll talk about! You can see my slides here:

• John Baez, Biology as information dynamics.

but my talk will be videotaped, and it’ll eventually be put here:

You can already see lots of cool talks at this location!

## Periodic Patterns in Peptide Masses

6 April, 2017

Gheorghe Craciun is a mathematician at the University of Wisconsin who recently proved the Global Attractor Conjecture, which since 1974 was the most famous conjecture in mathematical chemistry. This week he visited U. C. Riverside and gave a talk on this subject. But he also told me about something else—something quite remarkable.

### The mystery

A peptide is basically a small protein: a chain of made of fewer than 50 amino acids. If you plot the number of peptides of different masses found in various organisms, you see peculiar oscillations:

These oscillations have a frequency of about 14 daltons, where a ‘dalton’ is roughly the mass of a hydrogen atom—or more precisely, 1/12 the mass of a carbon atom.

Biologists had noticed these oscillations in databases of peptide masses. But they didn’t understand them.

Can you figure out what causes these oscillations?

It’s a math puzzle, actually.

Next I’ll give you the answer, so stop looking if you want to think about it first.

### The solution

Almost all peptides are made of 20 different amino acids, which have different masses, which are almost integers. So, to a reasonably good approximation, the puzzle amounts to this: if you have 20 natural numbers $m_1, ... , m_{20},$ how many ways can you write any natural number $N$ as a finite ordered sum of these numbers? Call it $F(N)$ and graph it. It oscillates! Why?

(We count ordered sums because the amino acids are stuck together in a linear way to form a protein.)

There’s a well-known way to write down a formula for $F(N)$. It obeys a linear recurrence:

$F(N) = F(N - m_1) + \cdots + F(N - m_{20})$

and we can solve this using the ansatz

$F(N) = x^N$

Then the recurrence relation will hold if

$x^N = x^{N - m_1} + x^{N - m_2} + \dots + x^{N - m_{20}}$

for all $N.$ But this is fairly easy to achieve! If $m_{20}$ is the biggest mass, we just need this polynomial equation to hold:

$x^{m_{20}} = x^{m_{20} - m_1} + x^{m_{20} - m_2} + \dots + 1$

There will be a bunch of solutions, about $m_{20}$ of them. (If there are repeated roots things get a bit more subtle, but let’s not worry about.) To get the actual formula for $F(N)$ we need to find the right linear combination of functions $x^N$ where $x$ ranges over all the roots. That takes some work. Craciun and his collaborator Shane Hubler did that work.

But we can get a pretty good understanding with a lot less work. In particular, the root $x$ with the largest magnitude will make $x^N$ grow the fastest.

If you haven’t thought about this sort of recurrence relation it’s good to look at the simplest case, where we just have two masses $m_1 = 1, m_2 = 2.$ Then the numbers $F(N)$ are the Fibonacci numbers. I hope you know this: the $N$th Fibonacci number is the number of ways to write $N$ as the sum of an ordered list of 1’s and 2’s!

1

1+1,   2

1+1+1,   1+2,   2+1

1+1+1+1,   1+1+2,   1+2+1,   2+1+1,   2+2

If I drew edges between these sums in the right way, forming a ‘family tree’, you’d see the connection to Fibonacci’s original rabbit puzzle.

In this example the recurrence gives the polynomial equation

$x^2 = x + 1$

and the root with largest magnitude is the golden ratio:

$\Phi = 1.6180339...$

The other root is

$1 - \Phi = -0.6180339...$

With a little more work you get an explicit formula for the Fibonacci numbers in terms of the golden ratio:

$\displaystyle{ F(N) = \frac{1}{\sqrt{5}} \left( \Phi^{N+1} - (1-\Phi)^{N+1} \right) }$

But right now I’m more interested in the qualitative aspects! In this example both roots are real. The example from biology is different.

Puzzle 1. For which lists of natural numbers $m_1 < \cdots < m_k$ are all the roots of

$x^{m_k} = x^{m_k - m_1} + x^{m_k - m_2} + \cdots + 1$

real?

I don’t know the answer. But apparently this kind of polynomial equation always one root with the largest possible magnitude, which is real and has multiplicity one. I think it turns out that $F(N)$ is asymptotically proportional to $x^N$ where $x$ is this root.

But in the case that’s relevant to biology, there’s also a pair of roots with the second largest magnitude, which are not real: they’re complex conjugates of each other. And these give rise to the oscillations!

For the masses of the 20 amino acids most common in life, the roots look like this:

The aqua root at right has the largest magnitude and gives the dominant contribution to the exponential growth of $F(N).$ The red roots have the second largest magnitude. These give the main oscillations in $F(N),$ which have period 14.28.

For the full story, read this:

• Shane Hubler and Gheorghe Craciun, Periodic patterns in distributions of peptide masses, BioSystems 109 (2012), 179–185.

Most of the pictures here are from this paper.

My main question is this:

Puzzle 2. Suppose we take many lists of natural numbers $m_1 < \cdots < m_k$ and draw all the roots of the equations

$x^{m_k} = x^{m_k - m_1} + x^{m_k - m_2} + \cdots + 1$

What pattern do we get in the complex plane?

I suspect that this picture is an approximation to the answer you’d get to Puzzle 2:

If you stare carefully at this picture, you’ll see some patterns, and I’m guessing those are hints of something very beautiful.

Earlier on this blog we looked at roots of polynomials whose coefficients are all 1 or -1:

The pattern is very nice, and it repays deep mathematical study. Here it is, drawn by Sam Derbyshire:

But now we’re looking at polynomials where the leading coefficient is 1 and all the rest are -1 or 0. How does that change things? A lot, it seems!

By the way, the 20 amino acids we commonly see in biology have masses ranging between 57 and 186. It’s not really true that all their masses are different. Here are their masses:

57, 71, 87, 97, 99, 101, 103, 113, 113, 114, 115, 128, 128, 129, 131, 137, 147, 156, 163, 186

I pretended that none of the masses $m_i$ are equal in Puzzle 2, and I left out the fact that only about 1/9th of the coefficients of our polynomial are nonzero. This may affect the picture you get!

## Applied Category Theory

6 April, 2017

The American Mathematical Society is having a meeting here at U. C. Riverside during the weekend of November 4th and 5th, 2017. I’m organizing a session on Applied Category Theory, and I’m looking for people to give talks.

The goal is to start a conversation about applications of category theory, not within pure math or fundamental physics, but to other branches of science and engineering—especially those where the use of category theory is not already well-established! For example, my students and I have been applying category theory to chemistry, electrical engineering, control theory and Markov processes.

Alas, we have no funds for travel and lodging. If you’re interested in giving a talk, please submit an abstract here:

General information about abstracts, American Mathematical Society.

More precisely, please read the information there and then click on the link on that page to submit an abstract. It should then magically fly through cyberspace to me! Abstracts are due September 12th, but the sooner you submit one, the greater the chance that we’ll have space.

For the program of the whole conference, go here:

Fall Western Sectional Meeting, U. C. Riverside, Riverside, California, 4–5 November 2017.

We’ll be having some interesting plenary talks:

• Paul Balmer, UCLA, An invitation to tensor-triangular geometry.

• Pavel Etingof, MIT, Double affine Hecke algebras and their applications.

• Monica Vazirani, U.C. Davis, Combinatorics, categorification, and crystals.

## Jobs at U.C. Riverside

30 March, 2017

The Mathematics Department of the University of California at Riverside is trying to hire some visiting assistant professors. We plan to make decisions quite soon!

The positions are open to applicants who have PhD or will have a PhD by the beginning of the term from all research areas in mathematics. The teaching load is six courses per year (i.e. 2 per quarter). In addition to teaching, the applicants will be responsible for attending advanced seminars and working on research projects.

This is initially a one-year appointment, and with successful annual teaching review, it is renewable for up to a third year term.

For more details, including how to apply, go here:

https://www.mathjobs.org/jobs/jobs/10162

## Restoring the North Cascades Ecosystem

13 March, 2017

In 49 hours, the National Park Service will stop taking comments on an important issue: whether to reintroduce grizzly bears into the North Cascades near Seattle. If you leave a comment on their website before then, you can help make this happen! Follow the easy directions here:

But if you want more details:

Grizzly bears are traditionally the apex predator in the North Cascades. Without the apex predator, the whole ecosystem is thrown out of balance. I know this from my childhood in northern Virginia, where deer are stripping the forest of all low-hanging greenery with no wolves to control them. With the top predator, the whole ecosystem springs to life and starts humming like a well-tuned engine! For example, when wolves were reintroduced in Yellowstone National Park, it seems that even riverbeds were affected:

There are several plans to restore grizzlies to the North Cascades. On the link I recommended, Matthew Inman supports Alternative C — Incremental Restoration. I’m not an expert on this issue, so I went ahead and supported that. There are actually 4 alternatives on the table:

Alternative A — No Action. They’ll keep doing what they’re already doing. The few grizzlies already there would be protected from poaching, the local population would be advised on how to deal with grizzlies, and the bears would be monitored. All other alternatives will do these things and more.

Alternative B — Ecosystem Evaluation Restoration. Up to 10 grizzly bears will be captured from source populations in northwestern Montana and/or south-central British Columbia and released at a single remote site on Forest Service lands in the North Cascades. This will take 2 years, and then they’ll be monitored for 2 years before deciding what to do next.

Alternative C — Incremental Restoration. 5 to 7 grizzly bears will be captured and released into the North Casades each year over roughly 5 to 10 years, with a goal of establishing an initial population of 25 grizzly bears. Bears would be released at multiple remote sites. They can be relocated or removed if they cause trouble. Alternative C is expected to reach the restoration goal of approximately 200 grizzly bears within 60 to 100 years.

Alternative D — Expedited Restoration. 5 to 7 grizzly bears will be captured and released into the North Casades each year until the population reaches about 200, which is what the area can easily support.

So, pick your own alternative if you like!

By the way, the remaining grizzly bears in the western United States live within six recovery zones:

• the Greater Yellowstone Ecosystem (GYE) in Wyoming and southwest Montana,

• the Northern Continental Divide Ecosystem (NCDE) in northwest Montana,

• the Cabinet-Yaak Ecosystem (CYE) in extreme northwestern Montana and the northern Idaho panhandle,

• the Selkirk Ecosystem (SE) in northern Idaho and northeastern Washington,

• the Bitterroot Ecosystem (BE) in central Idaho and western Montana,

• and the North Cascades Ecosystem (NCE) in northwestern and north-central Washington.

The North Cascades Ecosystem consists of 24,800 square kilometers in Washington, with an additional 10,350 square kilometers in British Columbia. In the US, 90% of this ecosystem is managed by the US Forest Service, the US National Park Service, and the State of Washington, and approximately 41% falls within Forest Service wilderness or the North Cascades National Park Service Complex.

• National Park Service, Draft Grizzly Bear Restoration Plan / Environmental Impact Statement: North Cascades Ecosystem.