Entropy and Information in Biological Systems (Part 2)

4 July, 2014

John Harte, Marc Harper and I are running a workshop! Now you can apply here to attend:

Information and entropy in biological systems, National Institute for Mathematical and Biological Synthesis, Knoxville, Tennessee, Wednesday-Friday, 8-10 April 2015.

Click the link, read the stuff and scroll down to “CLICK HERE” to apply. The deadline is 12 November 2014.

Financial support for travel, meals, and lodging is available for workshop attendees who need it. We will choose among the applicants and invite 10-15 of them.

The idea

Information theory and entropy methods are becoming powerful tools in biology, from the level of individual cells, to whole ecosystems, to experimental design, model-building, and the measurement of biodiversity. The aim of this investigative workshop is to synthesize different ways of applying these concepts to help systematize and unify work in biological systems. Early attempts at “grand syntheses” often misfired, but applications of information theory and entropy to specific highly focused topics in biology have been increasingly successful. In ecology, entropy maximization methods have proven successful in predicting the distribution and abundance of species. Entropy is also widely used as a measure of biodiversity. Work on the role of information in game theory has shed new light on evolution. As a population evolves, it can be seen as gaining information about its environment. The principle of maximum entropy production has emerged as a fascinating yet controversial approach to predicting the behavior of biological systems, from individual organisms to whole ecosystems. This investigative workshop will bring together top researchers from these diverse fields to share insights and methods and address some long-standing conceptual problems.

So, here are the goals of our workshop:

• To study the validity of the principle of Maximum Entropy Production (MEP), which states that biological systems – and indeed all open, non-equilibrium systems – act to produce entropy at the maximum rate.

• To familiarize all the participants with applications to ecology of the MaxEnt method: choosing the probabilistic hypothesis with the highest entropy subject to the constraints of our data. We will compare MaxEnt with competing approaches and examine whether MaxEnt provides a sufficient justification for the principle of MEP.

• To clarify relations between known characterizations of entropy, the use of entropy as a measure of biodiversity, and the use of MaxEnt methods in ecology.

• To develop the concept of evolutionary games as “learning” processes in which information is gained over time.

• To study the interplay between information theory and the thermodynamics of individual cells and organelles.

For more details, go here.

If you’ve got colleagues who might be interested in this, please let them know. You can download a PDF suitable for printing and putting on a bulletin board by clicking on this:

Quantum Frontiers in Network Science

6 May, 2014

guest post by Jacob Biamonte

There’s going to be a workshop on quantum network theory in Berkeley this June. The event is being organized by some of my collaborators and will be a satellite of the biggest annual network science conference, NetSci.

A theme of the Network Theory series here on Azimuth has been to merge ideas appearing in quantum theory with other disciplines. Remember the first post by John which outlined the goal of a general theory of networks? Well, everyone’s been chipping away at this stuff for a few years now and I think you’ll agree that this workshop seems like an excellent way to push these topics even further, particularly as they apply to complex networks.

The event is being organized by Mauro Faccin, Filippo Radicchi and Zoltán Zimborás. You might recall when Tomi Johnson first explained to us some ideas connecting quantum physics with the concepts of complex networks (see Quantum Network Theory Part 1 and Part 2). Tomi’s going to be speaking at this event. I understand there is even still a little bit of space left to contribute talks and/or to attend. I suspect that those interested can sort this out by emailing the organizers or by following the instructions to submit an abstract.

They have named their event Quantum Frontiers in Network Science or QNET for short. Here’s their call.

Quantum Frontiers in Network Science

This year the biggest annual network science conference, NetSci, will take place in Berkeley, California on 2-6 June. We are organizing a one-day Satellite Workshop on Quantum Frontiers in Network Science (QNET).

A grand challenge in contemporary complex network science is to reconcile the staple “statistical mechanics based approach” with a theory based on quantum physics. When considering networks where quantum coherence effects play a non-trivial role, the predictive power of complex network science has been shown to break down. A new theory is now being developed which is based on quantum theory, from first principles. Network theory is a diverse subject which developed independently in several disciplines, each relying on graphs with additional structure to model complex systems. Network science has of course played a significant role in quantum theory, for example in topics such as tensor network states, chiral quantum walks on complex networks, categorical tensor networks, and categorical models of quantum circuits, to name only a few. However, the ideas of complex network science are only now starting to be united with modern quantum theory. In this respect, one aim of the workshop is to put in contact two big and generally not very well connected scientific communities: statistical and quantum physicists.

The topic of network science underwent a revolution when it was realized that systems such as social or transport networks could be interrelated through common network properties, but what are the relevant properties to consider when facing quantum systems? This question is particularly timely as there has been a recent push towards studying increasingly larger quantum mechanical systems, where the analysis is only beginning to undergo a shift towards embracing the concepts of complex networks.

For example, theoretical and experimental attention has turned to explaining transport in photosynthetic complexes comprising tens to hundreds of molecules and thousands of atoms using quantum mechanics. Likewise, in condensed matter physics using the language of “chiral quantum walks”, the topological structure of the interconnections comprising complex materials strongly affects their transport properties.

An ultimate goal is a mathematical theory and formal description which pinpoints the similarities and differences between the use of networks throughout the quantum sciences. This would give rise to a theory of networks augmenting the current statistical mechanics approach to complex network structure, evolution, and process with a new theory based on quantum mechanics.

Topics of special interest to the satellite include

• Quantum transport and chiral quantum walks on complex networks
• Detecting community structure in quantum systems
• Tensor algebra and multiplex networks
• Quantum information measures (such as entropy) applied to complex networks
• Quantum critical phenomena in complex networks
• Quantum models of network growth
• Quantum techniques for reaction networks
• Quantum algorithms for problems in complex network science
• Foundations of quantum theory in relation to complex networks and processes thereon
• Quantum inspired mathematics as a foundation for network science


QNET will be held at the NetSci Conference venue at the Clark Kerr Campus of the University of California, on June 2nd in the morning (8am-1pm).


• Main conference page: NetSci2014
• Call for abstracts and the program

It sounds interesting! You’ll notice that the list of topics seems reminiscent of some of the things we’ve been talking about right here on Azimuth! A general theme of the Network Theory series has been developing frameworks to describe networked systems through a common language, and then mapping the use of tools and results across disciplines. This workshop seems like a great place to talk about these ideas. Oh, and here’s a current list of the speakers:

Leonardo Banchi (UCL, London)
Ginestra Bianconi (London)
Silvano Garnerone (IQC, Waterloo)
Laetitia Gauvin (ISI Foundation)
Marco Javarone (Sassari)
Tomi Johnson (Oxford)

and again, the organizers are

Mauro Faccin (ISI Foundation)
Filippo Radicchi (Indiana University)
Zoltán Zimborás (UCL)

From the call, we can notice that a central discussion topic at QNET will be about contrasting stochastic and quantum mechanics. Here on Azimuth we like this stuff. You might remember that stochastic mechanics was formulated in the network theory series to mathematically resemble quantum theory (see e.g. Part 12). This formalism was then employed to produce several results, including a stochastic version of Noether’s theorem by John and Brendan in Parts 11 and 13—recently Ville has also written Noether’s Theorem: Quantum vs Stochastic. Several other results were produced by relating quantum field theory to Petri nets from population biology and to chemical reaction networks in chemistry (see the Network Theory homepage). It seems to me that people attending QNET will be interested in these sorts of things, as well as other related topics.

One of the features of complex network science is that it is often numerically based and geared directly towards interesting real-world applications. I suspect some interesting results should stem from the discussions that will take place at this workshop.

By the way, here’s a view of downtown San Francisco at dusk from the Berkeley Hills, California, taken from the NetSci homepage:

San Francisco

Programming with Chemical Reaction Networks

23 March, 2014


There will be a 5-day workshop on Programming with Chemical Reaction Networks: Mathematical Foundation at BIRS from Sunday, June 8 to Friday, June 13, 2014. It’s being organized by

Anne Condon (University of British Columbia)
David Doty (California Institute of Technology)
Chris Thachuk (University of Oxford).

BIRS is the Banff International Research Station, in the mountains west of Calgary, in Alberta, Canada.


Here’s the workshop proposal on the BIRS website. It’s a pretty interesting proposal, especially if you’ve already read Luca Cardelli’s description of computing with chemical reaction networks, at the end of our series of posts on chemical reaction networks. The references include a lot of cool papers, so I’ve created links to those to help you get ahold of them.

This workshop will explore three of the most important research themes concerning stochastic chemical reaction networks (CRNs). Below we motivate each theme and highlight key questions that the workshop will address. Our main objective is to bring together distinct research communities in order to consider new problems that could not be fully appreciated in isolation. It is also our aim to determine commonalities between different disciplines and bodies of research. For example, research into population protocols, vector addition systems, and Petri networks provides a rich body of theoretical results that may already address contemporary problems arising in the study of CRNs.

Computational power of CRNs

Before designing robust and practical systems, it is useful to know the limits to computing with a chemical soup. Some interesting theoretical results are already known for stochastic chemical reaction networks. The computational power of CRNs depends upon a number of factors, including: (i) whether the computation is deterministic or probabilistic, and (ii) whether the CRN has an initial context: certain species, independent of the input, that are initially present in some exact, constant count.

In general, CRNs with a constant number of species (independent of the input length) are capable of Turing universal computation [17], if the input is represented by the exact (unary) count of one molecular species, some small probability of error is permitted and an initial context in the form of a single-copy leader molecule is used.

Could the same result hold in the absence of an initial context? In a surprising result based on the distributed computing model of population protocols, it has been shown that if a computation must be error-free, then deterministic computation with CRNs having an initial context is limited to computing semilinear predicates [1], later extended to functions outputting natural numbers encoded by molecular counts [5].

Furthermore, any semilinear predicate or function can be computed by that class of CRNs in expected time polylogarithmic in the input length. Building on this result, it was recently shown that by incurring an expected time linear in the input length, the same result holds for “leaderless” CRNs [8] — CRNs with no initial context. Can this result be improved to sub-linear expected time? Which class of functions can be computed deterministically by a CRN without an initial context in expected time polylogarithmic in the input length?
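To make "deterministic computation by a leaderless CRN" concrete, here is a minimal Python sketch. The reaction set below is an illustrative example of a CRN that computes the semilinear function max(x1, x2) with no initial context; it is not taken from the proposal, and the species names (`X1`, `X2`, `Z1`, `Z2`, `K`, `Y`) and the helper `run_crn` are hypothetical. The scheduler simply picks any applicable reaction uniformly at random, which is an assumption made for simplicity and not the true stochastic kinetics. The point is only that the output count of `Y` is the same whatever order the reactions fire in.

```python
import random

# Each reaction is (reactants, products), both mapping species -> count.
# Illustrative CRN: Y ends up equal to max(x1, x2).
#   X1 -> Z1 + Y,  X2 -> Z2 + Y   (Y counts x1 + x2)
#   Z1 + Z2 -> K                  (K counts min(x1, x2))
#   K + Y -> nothing              (subtract the minimum from Y)
MAX_CRN = [
    ({"X1": 1}, {"Z1": 1, "Y": 1}),
    ({"X2": 1}, {"Z2": 1, "Y": 1}),
    ({"Z1": 1, "Z2": 1}, {"K": 1}),
    ({"K": 1, "Y": 1}, {}),
]


def run_crn(reactions, counts, rng):
    """Fire applicable reactions in random order until none can fire."""
    counts = dict(counts)
    while True:
        applicable = [
            (reactants, products)
            for reactants, products in reactions
            if all(counts.get(s, 0) >= n for s, n in reactants.items())
        ]
        if not applicable:
            return counts
        reactants, products = rng.choice(applicable)
        for s, n in reactants.items():
            counts[s] -= n
        for s, n in products.items():
            counts[s] = counts.get(s, 0) + n


if __name__ == "__main__":
    # Different seeds give different firing orders but the same output.
    for seed in range(3):
        final = run_crn(MAX_CRN, {"X1": 3, "X2": 7}, random.Random(seed))
        print(seed, final.get("Y", 0))
```

Because every reaction consumes something that is never replenished, the simulation always halts. The question raised in the proposal is how long such leaderless computations take in expectation under realistic kinetics, which this sketch does not model.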

While (restricted) CRNs are Turing-universal, current results use space proportional to the computation time. Using a non-uniform construction, where the number of species is proportional to the input length and each initial species is present in some constant count, it is known that any S(n) space-bounded computation can be computed by a logically-reversible tagged CRN, within a reaction volume of size poly(S(n)) [18]. Tagged CRNs were introduced to model explicitly the fuel molecules in physical realizations of CRNs such as DNA strand displacement systems [6] that are necessary to supply matter and energy for implementing reactions such as X → X + Y that violate conservation of mass and/or energy.

Thus, for space-bounded computation, there exist CRNs that are time-efficient or are space-efficient. Do there exist time- and space-efficient CRNs to compute any space-bounded function?

Designing and verifying robust CRNs

While CRNs provide a concise model of chemistry, their physical realizations are often more complicated and more granular. How can one be sure they accurately implement the intended network behaviour? Probabilistic model checking has already been employed to find and correct inconsistencies between CRNs and their DNA strand displacement system (DSD) implementations [9]. However, at present, model checking of arbitrary CRNs is only capable of verifying the correctness of very small systems. Indeed, verification of these types of systems is a difficult problem: probabilistic state reachability is undecidable [17, 20] and general state reachability is EXPSPACE-hard [4].

How can larger systems be verified? A deeper understanding of CRN behaviour may simplify the process of model checking. As a motivating example, there has been recent progress towards verifying that certain DSD implementations correctly simulate underlying CRNs [16, 7, 10]. This is an important step to ensuring correctness, prior to experiments. However, DSDs can also suffer from other errors when implementing CRNs, such as spurious hybridization or strand displacement. Can DSDs and more generally CRNs be designed to be robust to such predictable errors? Can error correcting codes and redundant circuit designs used in traditional computing be leveraged in these chemical computers? Many other problems arise when implementing CRNs. Currently, unique types of fuel molecules must be designed for every reaction type. This complicates the engineering process significantly. Can a universal type of fuel be designed to smartly implement any reaction?

Energy efficient computing with CRNs

Rolf Landauer showed that logically irreversible computation (computation as modeled by a standard Turing machine) dissipates an amount of energy proportional to the number of bits of information lost, such as previous state information, and therefore cannot be energy efficient [11]. However, Charles Bennett showed that, in principle, energy efficient computation is possible, by proposing a universal Turing machine that performs logically-reversible computation and identifying nucleic acids (RNA/DNA) as a potential medium to realize logically-reversible computation in a physical system [2].
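For a sense of scale, the usual quantitative form of Landauer's bound is that erasing one bit at temperature T dissipates at least k_B T ln 2 of energy. The short Python sketch below evaluates that bound; the function name `landauer_bound_joules` and the choice of 300 K as "room temperature" are assumptions made for illustration, not part of the proposal.

```python
import math

# Boltzmann constant in joules per kelvin (exact in the 2019 SI).
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_bound_joules(bits_erased, temperature_kelvin):
    """Minimum heat dissipated when erasing `bits_erased` bits at the given temperature."""
    if bits_erased < 0 or temperature_kelvin < 0:
        raise ValueError("bits_erased and temperature_kelvin must be non-negative")
    return bits_erased * BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    # One bit at an assumed room temperature of 300 K: roughly 2.9e-21 joules.
    print(landauer_bound_joules(1, 300.0))
```

The bound is linear in the number of bits erased, which is the proportionality the proposal refers to. A logically reversible computation erases no bits, so this lower bound places no restriction on it.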

There have been examples of logically-reversible DNA strand displacement systems — a physical realization of CRNs — that are, in theory, capable of complex computation [12, 19]. Are these systems energy efficient in a physical sense? How can this argument be made formally to satisfy both the computer science and the physics communities? Is a physical experiment feasible, or are these results merely theoretical footnotes?


[1] D. Angluin, J. Aspnes, and D. Eisenstat. Stably computable predicates are semilinear. In PODC, pages 292–299, 2006.

[2] C. H. Bennett. Logical reversibility of computation. IBM Journal of Research and Development, 17 (6):525–532, 1973.

[3] L. Cardelli and A. Csikasz-Nagy. The cell cycle switch computes approximate majority. Scientific Reports, 2, 2012.

[4] E. Cardoza, R. Lipton, A. R. Meyer. Exponential space complete problems for Petri nets and commutative semigroups (Preliminary Report). Proceedings of the Eighth Annual ACM Symposium on Theory of Computing, pages 50–54, 1976.

[5] H. L. Chen, D. Doty, and D. Soloveichik. Deterministic function computation with chemical reaction networks. DNA Computing and Molecular Programming, pages 25–42, 2012.

[6] A. Condon, A. J. Hu, J. Manuch, and C. Thachuk. Less haste, less waste: on recycling and its limits in strand displacement systems. Journal of the Royal Society: Interface Focus, 2 (4):512–521, 2012.

[7] Q. Dong. A bisimulation approach to verification of molecular implementations of formal chemical reaction network. Master’s thesis. SUNY Stony Brook, 2012.

[8] D. Doty and M. Hajiaghayi. Leaderless deterministic chemical reaction networks. In Proceedings of the 19th International Meeting on DNA Computing and Molecular Programming, 2013.

[9] M. R. Lakin, D. Parker, L. Cardelli, M. Kwiatkowska, and A. Phillips. Design and analysis of DNA strand displacement devices using probabilistic model checking. Journal of The Royal Society Interface, 2012.

[10] M. R. Lakin, D. Stefanovic and A. Phillips. Modular Verification of Two-domain DNA Strand Displacement Networks via Serializability Analysis. In Proceedings of the 19th Annual conference on DNA computing, 2013.

[11] R. Landauer. Irreversibility and heat generation in the computing process. IBM Journal of research and development, 5 (3):183–191, 1961.

[12] L. Qian, D. Soloveichik, and E. Winfree. Efficient Turing-universal computation with DNA polymers (extended abstract). In Proceedings of the 16th Annual conference on DNA computing, pages 123–140, 2010.

[13] L. Qian and E. Winfree. Scaling up digital circuit computation with DNA strand displacement cascades. Science, 332 (6034):1196–1201, 2011.

[14] L. Qian, E. Winfree, and J. Bruck. Neural network computation with DNA strand displacement cascades. Nature, 475 (7356):368–372, 2011.

[15] G. Seelig, D. Soloveichik, D.Y. Zhang, and E. Winfree. Enzyme-free nucleic acid logic circuits. Science, 314 (5805):1585–1588, 2006.

[16] S. W. Shin. Compiling and verifying DNA-based chemical reaction network implementations. Master’s thesis. California Institute of Technology, 2011.

[17] D. Soloveichik, M. Cook, E. Winfree, and J. Bruck. Computation with finite stochastic chemical reaction networks. Natural Computing, 7 (4):615–633, 2008.

[18] C. Thachuk. Space and energy efficient molecular programming. PhD thesis, University of British Columbia, 2012.

[19] C. Thachuk and A. Condon. Space and energy efficient computation with DNA strand displacement systems. In Proceedings of the 18th Annual International Conference on DNA computing and Molecular Programming, 2012.

[20] G. Zavattaro and L. Cardelli. Termination Problems in Chemical Kinetics. In Proceedings of the 2008 Conference on Concurrency Theory, pages 477–491, 2008.

Network Theory Overview

22 February, 2014


Here’s a video of a talk I gave yesterday, made by Brendan Fong. You can see the slides here—and then click the items in blue, and the pictures, for more information!

The idea: nature and the world of human technology are full of networks! People like to draw diagrams of networks. Mathematical physicists know that in principle these diagrams can be understood using category theory. But why should physicists have all the fun? This is the century of understanding living systems and adapting to life on a finite planet. Math isn’t the main thing we need for this, but it’s got to be part of the solution… so one thing we should do is develop a unified and powerful theory of networks.

We are still far from doing this. In this overview, I briefly described three parts of the jigsaw puzzle, and invited everyone to help fit them together:

• electrical circuits and signal-flow graphs.

• stochastic Petri nets, chemical reaction networks and Feynman diagrams.

• Bayesian networks, information and entropy.

In my talks coming up, I’ll go into more detail on each of these. With luck, you’ll be able to see videos here.

But if you’re near Oxford, you might as well actually attend! You can see dates, times, locations, my slides, and the talks themselves as they show up by going here.

Network Theory Talks at Oxford

7 February, 2014

I’m giving some talks at Oxford:

Network Theory

Nature and the world of human technology are full of networks. People like to draw diagrams of networks: flow charts, electrical circuit diagrams, signal-flow graphs, Bayesian networks, Feynman diagrams and the like. Mathematically minded people know that in principle these diagrams fit into a common framework: category theory. But we are still far from a unified theory of networks. After an overview, we will look at three portions of the jigsaw puzzle in three separate talks:

I. Electrical circuits and signal-flow graphs.

II. Stochastic Petri nets, chemical reaction networks and Feynman diagrams.

III. Bayesian networks, information and entropy.

If you’re nearby I hope you can come! All these talks will take place in Lecture Theatre B in the Computer Science Department—see the map below. Here are the times:

• Friday 21 February 2014, 2 pm: Network Theory: overview. See the slides or watch a video.

• Tuesday 25 February, 3:30 pm: Network Theory I: electrical circuits and signal-flow graphs. See the slides or watch a video.

• Tuesday 4 March, 3:30 pm: Network Theory II: stochastic Petri nets, chemical reaction networks and Feynman diagrams. See the slides or watch a video.

• Tuesday 11 March, 3:30 pm: Network Theory III: Bayesian networks, information and entropy. See the slides or watch a video.

The first talk will be part of the OASIS series, meaning the “Oxford Advanced Seminar on Informatic Structures”.

I thank Samson Abramsky, Bob Coecke and Jamie Vicary of the Computer Science Department for inviting me, and Ulrike Tillmann and Minhyong Kim of the Mathematical Institute for helping me get set up. I also thank all the people who helped do the work I’ll be talking about, most notably Jacob Biamonte, Jason Erbele, Brendan Fong, Tobias Fritz, Tom Leinster, Tu Pham, and Franciscus Rebro.

Ulrike Tillmann has also kindly invited me to give a topology seminar:

Operads and the Tree of Life

Trees are not just combinatorial structures: they are also biological structures, both in the obvious way and in the study of evolution. Starting from DNA samples from living species, biologists use increasingly sophisticated mathematical techniques to reconstruct the most likely “phylogenetic tree” describing how these species evolved from earlier ones. In their work on this subject, they have encountered an interesting example of an operad, which is obtained by applying a variant of the Boardman–Vogt “W construction” to the operad for commutative monoids. The operations in this operad are labelled trees of a certain sort, and it plays a universal role in the study of stochastic processes that involve branching. It also shows up in tropical algebra. This talk is based on work in progress with Nina Otter.

I’m not sure exactly where this will take place, but surely somewhere in the Mathematical Institute building:

• Monday 24 February, 3:30 pm, Operads and the Tree of Life. See the slides.

The Computer Science Department is shown in the map here:


The Mathematical Institute is a bit to the west:

Bio-Inspired Information Theory

31 January, 2014


There will be a 5-day workshop on Biological and Bio-Inspired Information Theory at BIRS from Sunday the 26th to Friday the 31st of October, 2014. It’s being organized by

Toby Berger (University of Virginia)
Andrew Eckford (York University)
Peter Thomas (Case Western Reserve University)

BIRS is the Banff International Research Station, a conference venue in a rather wild part of Alberta, in the mountains west of Calgary.


Here’s the workshop proposal on the BIRS website:

Currently, research in the community is organized into three streams:

• Information theory and biochemistry (including information theory and intercellular communication);

• Information theory and neuroscience; and

• Information-theoretic analysis of biologically-inspired communication systems (including nano-networking and design of biologically implemented information processing networks).

We propose a BIRS workshop to explore these streams, focusing on mathematical open problems that cut across the streams. The main objectives of this workshop would be: to bring together the most prominent researchers in this field; to discuss and review the current state of mathematical research in this field; to promote cross-pollination among the various streams of research to find common problems; and to collectively identify key future directions or open problems that would bring the greatest impact to this field over the next few years.

Expected impact

A BIRS workshop involving the field’s leading researchers would allow a review of the current state of the art, and would promote cross-pollination among these three streams of research. We expect to have these leading researchers in attendance. For example, Prof. Toby Berger (U. Virginia), a widely recognized pioneer in this field and a recipient of the Shannon Award (the top prize awarded by the IEEE Information Theory Society), is one of the co-organizers of the workshop. Moreover, we have approached many of the field’s most prominent mathematicians and scientists: a complete list is found elsewhere in this proposal, but among the most prominent confirmed participants are: Prof. Tadashi Nakano (Osaka U.), one of the earliest researchers in engineered molecular communication; Dr. Thomas D. Schneider (NIH – National Cancer Institute), inventor of the sequence logo and prominent researcher in genetic information theory; and Profs. William Bialek (Princeton U.) and Naftali Tishby (Hebrew U.), prominent experts on information theory in neural coding.

Although the focus of our workshop is on mathematical fundamentals, our list of expected participants includes a few experimental scientists, e.g. Raymond Cheong and Andre Levchenko (both from Johns Hopkins U.), in addition to mathematical scientists. This is because quantitative application of information theoretic analysis to biological systems typically requires empirical estimation of joint probability distributions for multiple input and output variables, often posing daunting data collection challenges (these experimentalists pioneered the use of high-throughput experimental methods to collect large data sets quantifying the input/output relationships for a specific biochemical signaling pathway). We believe a blended approach, emphasizing mathematics but including experimental perspectives, will enhance the impact of our workshop and increase the usefulness to our participants.

Given that publications in these research areas have achieved prominence in the past few years, the time is right for a general meeting among the key researchers to review the state of the field and develop future directions. Thus, our proposed workshop is timely and would be expected to have a tremendous impact on the field over the next several years.

Entropy and Information in Biological Systems (Part 1)

2 November, 2013

John Harte is an ecologist who uses maximum entropy methods to predict the distribution, abundance and energy usage of species. Marc Harper uses information theory in bioinformatics and evolutionary game theory. Harper, Harte and I are organizing a workshop on entropy and information in biological systems, and I’m really excited about it!

It’ll take place at the National Institute for Mathematical and Biological Synthesis in Knoxville, Tennessee. We are scheduling it for Wednesday-Friday, April 8-10, 2015. When the date gets confirmed, I’ll post an advertisement so you can apply to attend.

Writing the proposal was fun, because we got to pull together lots of interesting people who are applying information theory and entropy to biology in quite different ways. So, here it is!


Ever since Shannon initiated research on information theory in 1948, there have been hopes that the concept of information could serve as a tool to help systematize and unify work in biology. The link between information and entropy was noted very early on, and it suggested that a full thermodynamic understanding of biology would naturally involve the information processing and storage that are characteristic of living organisms. However, the subject is full of conceptual pitfalls for the unwary, and progress has been slower than initially expected. Premature attempts at ‘grand syntheses’ have often misfired. But applications of information theory and entropy to specific highly focused topics in biology have been increasingly successful, such as:

• the maximum entropy principle in ecology,
• Shannon and Rényi entropies as measures of biodiversity,
• information theory in evolutionary game theory,
• information and the thermodynamics of individual cells.

Because they work in diverse fields, researchers in these specific topics have had little opportunity to trade insights and take stock of the progress so far. The aim of the workshop is to do just this.

In what follows, participants’ names are in boldface, while the main goals of the workshop are in italics.

Roderick Dewar is a key advocate of the principle of Maximum Entropy Production, which says that biological systems—and indeed all open, non-equilibrium systems—act to produce entropy at the maximum rate. Along with others, he has applied this principle to make testable predictions in a wide range of biological systems, from ATP synthesis [DJZ2006] to respiration and photosynthesis of individual plants [D2010] and plant communities. He has also sought to derive this principle from ideas in statistical mechanics [D2004, D2009], but it remains controversial.

The first goal of this workshop is to study the validity of this principle.

While they may be related, the principle of Maximum Entropy Production should not be confused with the MaxEnt inference procedure, which says that we should choose the probabilistic hypothesis with the highest entropy subject to the constraints provided by our data. MaxEnt was first explicitly advocated by Jaynes. He noted that it is already implicit in the procedures of statistical mechanics, but convincingly argued that it can also be applied to situations where entropy is more ‘informational’ than ‘thermodynamic’ in character.
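As a small illustration of the MaxEnt procedure, consider the simplest kind of constraint: a known average value. The Python sketch below assumes a finite set of outcomes with numerical values and a single mean constraint. In that case the entropy maximizer has the Gibbs form p_i ∝ exp(-λ v_i), a standard Lagrange-multiplier result, and the code finds λ by bisection. The function name `maxent_distribution`, the bracketing interval for λ, and the die example are assumptions made for illustration only; applications in ecology use richer constraints.

```python
import math


def maxent_distribution(values, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Maximum-entropy distribution on `values` whose mean equals `target_mean`.

    Assumes the multiplier lam lies in [lo, hi]; widen the interval for
    values of very small magnitude or targets very close to the extremes.
    """
    if not min(values) < target_mean < max(values):
        raise ValueError("target_mean must lie strictly between min and max of values")

    def gibbs(lam):
        # p_i proportional to exp(-lam * v_i); shift exponents for numerical stability.
        exponents = [-lam * v for v in values]
        shift = max(exponents)
        weights = [math.exp(e - shift) for e in exponents]
        total = sum(weights)
        return [w / total for w in weights]

    def mean(lam):
        return sum(p * v for p, v in zip(gibbs(lam), values))

    # The mean is a decreasing function of lam, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) > target_mean:
            lo = mid
        else:
            hi = mid
    return gibbs(0.5 * (lo + hi))


if __name__ == "__main__":
    # A die whose long-run average is 4.5 instead of the fair value 3.5.
    faces = [1, 2, 3, 4, 5, 6]
    for face, prob in zip(faces, maxent_distribution(faces, 4.5)):
        print(face, round(prob, 4))
```

With the target mean set to 3.5 the procedure returns the uniform distribution: when the data impose no effective constraint, the least committal hypothesis treats all outcomes alike.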

Recently John Harte has applied MaxEnt in this way to ecology, using it to make specific testable predictions for the distribution, abundance and energy usage of species across spatial scales and across habitats and taxonomic groups [Harte2008, Harte2009, Harte2011]. Annette Ostling is an expert on other theories that attempt to explain the same data, such as the ‘neutral model’ [AOE2008, ODLSG2009, O2005, O2012]. Dewar has also used MaxEnt in ecology [D2008], and he has argued that it underlies the principle of Maximum Entropy Production.

Thus, a second goal of this workshop is to familiarize all the participants with applications of the MaxEnt method to ecology, compare it with competing approaches, and study whether MaxEnt provides a sufficient justification for the principle of Maximum Entropy Production.

Entropy is not merely a predictive tool in ecology: it is also widely used as a measure of biodiversity. Here Shannon’s original concept of entropy naturally generalizes to ‘Rényi entropy’, which depends on a parameter \alpha \ge 0. This equals

\displaystyle{ H_\alpha(p) = \frac{1}{1-\alpha} \log \sum_i p_i^\alpha  }

where p_i is the fraction of organisms of the ith type (which could mean species, some other taxon, etc.). In the limit \alpha \to 1 this reduces to the Shannon entropy:

\displaystyle{  H(p) = - \sum_i p_i \log p_i }

As \alpha increases, we give less weight to rare types of organisms. Christina Cobbold and Tom Leinster have described a systematic and highly flexible system of biodiversity measurement, with Rényi entropy at its heart [CL2012]. They consider both the case where all we have are the numbers p_i, and the more subtle case where we take the distance between different types of organisms into account.
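As a rough illustration of how the parameter \alpha changes what is measured, here is a small Python sketch that evaluates H_\alpha(p) for a made-up community. The abundance vector and function names are hypothetical. The exponential of the Rényi entropy, often called a Hill number or "effective number of types", is a standard companion quantity that I have added for readability; it is not defined in the text above.

```python
import math


def renyi_entropy(p, alpha):
    """Rényi entropy H_alpha(p) in nats, for alpha >= 0.

    At alpha = 1 the formula is singular, so we return its limit,
    the Shannon entropy.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    support = [x for x in p if x > 0]  # types that are actually present
    if abs(alpha - 1.0) < 1e-12:
        return -sum(x * math.log(x) for x in support)
    return math.log(sum(x ** alpha for x in support)) / (1.0 - alpha)


def effective_number(p, alpha):
    """exp(H_alpha): the number of equally common types with this entropy."""
    return math.exp(renyi_entropy(p, alpha))


# Hypothetical community: two common types and three increasingly rare ones.
community = [0.5, 0.3, 0.15, 0.04, 0.01]

for alpha in [0, 0.5, 1, 2, 10]:
    print(alpha, renyi_entropy(community, alpha), effective_number(community, alpha))
```

At \alpha = 0 every type that is present counts equally, so the entropy is just the logarithm of the number of types. As \alpha grows, the rare types contribute less and the entropy falls, which matches the remark above about rare organisms getting less weight.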

John Baez has explained the role of Rényi entropy in thermodynamics [B2011], and together with Tom Leinster and Tobias Fritz he has proved other theorems characterizing entropy which explain its importance for information processing [BFL2011]. However, these ideas have not yet been connected to the widespread use of entropy in biodiversity studies. More importantly, the use of entropy as a measure of biodiversity has not been clearly connected to MaxEnt methods in ecology. Does the success of MaxEnt methods imply a tendency for ecosystems to maximize biodiversity subject to the constraints of resource availability? This seems surprising, but a more nuanced statement along these general lines might be correct.

So, a third goal of this workshop is to clarify relations between known characterizations of entropy, the use of entropy as a measure of biodiversity, and the use of MaxEnt methods in ecology.

As the amount of data to analyze in genomics continues to surpass the ability of humans to analyze it, we can expect automated experiment design to become ever more important. In Chris Lee and Marc Harper’s RoboMendel program [LH2013], a mathematically precise concept of ‘potential information’—how much information is left to learn—plays a crucial role in deciding what experiment to do next, given the data obtained so far. It will be useful for them to interact with William Bialek, who has expertise in estimating entropy from empirical data and using it to constrain properties of models [BBS, BNS2001, BNS2002], and Susanne Still, who applies information theory to automated theory building and biology [CES2010, PS2012].

However, there is another link between biology and potential information. Harper has noted that in an ecosystem where the population of each type of organism grows at a rate proportional to its fitness (which may depend on the fraction of organisms of each type), the quantity

\displaystyle{ I(q||p) = \sum_i q_i \ln(q_i/p_i) }

always decreases if there is an evolutionarily stable state [Harper2009]. Here p_i is the fraction of organisms of the ith genotype at a given time, while q_i is this fraction in the evolutionarily stable state. This quantity is often called the Shannon information of q ‘relative to’ p. But in fact, it is precisely the same as Lee and Harper’s potential information! Indeed, there is a precise mathematical analogy between evolutionary games and processes where a probabilistic hypothesis is refined by repeated experiments.
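To see this decrease numerically, here is a minimal Python sketch of the replicator equation for a two-strategy hawk-dove game. Everything specific in it is an assumption made for illustration: the payoff values V = 2 and C = 4, the linear fitness f = Ap, the starting population, and the use of simple Euler steps. For these payoffs the evolutionarily stable state is q = (1/2, 1/2). The monotone decrease is a statement about the continuous-time dynamics; the discretized version only approximates it, so the step size is kept small.

```python
import math


def relative_information(q, p):
    """I(q||p) = sum_i q_i ln(q_i / p_i), skipping terms with q_i = 0."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def replicator_step(p, payoff, dt):
    """One Euler step of dp_i/dt = p_i * (f_i - mean fitness), with f = payoff * p."""
    n = len(p)
    fitness = [sum(payoff[i][j] * p[j] for j in range(n)) for i in range(n)]
    mean_fitness = sum(p[i] * fitness[i] for i in range(n))
    new_p = [p[i] + dt * p[i] * (fitness[i] - mean_fitness) for i in range(n)]
    total = sum(new_p)  # equals 1 up to rounding; renormalize to stay on the simplex
    return [x / total for x in new_p]


def simulate(p0, payoff, q, dt=0.01, steps=2000):
    """Return the list of I(q||p(t)) values along an Euler trajectory."""
    p = list(p0)
    history = [relative_information(q, p)]
    for _ in range(steps):
        p = replicator_step(p, payoff, dt)
        history.append(relative_information(q, p))
    return history


# Hawk-dove payoffs with illustrative values V = 2 (resource) and C = 4 (cost).
V, C = 2.0, 4.0
payoff = [[(V - C) / 2, V],
          [0.0, V / 2]]
q = [V / C, 1 - V / C]  # evolutionarily stable state for these payoffs

history = simulate([0.9, 0.1], payoff, q)
for step in range(0, len(history), 400):
    print(step, history[step])
```

The printed values shrink toward zero as the population approaches q, which is the sense in which the population can be said to "learn" its stable state.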

Thus, a fourth goal of this workshop is to develop the concept of evolutionary games as ‘learning’ processes in which information is gained over time.

We shall try to synthesize this with Carl Bergstrom and Matina Donaldson-Matasci’s work on the ‘fitness value of information’: a measure of how much increase in fitness a population can obtain per bit of extra information [BL2004, DBL2010, DM2013]. Following Harper, we shall consider not only relative Shannon entropy, but also relative Rényi entropy, as a measure of information gain [Harper2011].

A fifth and final goal of this workshop is to study the interplay between information theory and the thermodynamics of individual cells and organelles.

Susanne Still has studied the thermodynamics of prediction in biological systems [BCSS2012]. And in a celebrated related piece of work, Jeremy England used thermodynamic arguments to derive a lower bound for the amount of entropy generated during a process of self-replication of a bacterial cell [England2013]. Interestingly, he showed that E. coli comes within a factor of 3 of this lower bound.

In short, information theory and entropy methods are becoming powerful tools in biology, from the level of individual cells, to whole ecosystems, to experimental design, model-building, and the measurement of biodiversity. The time is ripe for an investigative workshop that brings together experts from different fields and lets them share insights and methods and begin to tackle some of the big remaining questions.


[AOE2008] D. Alonso, A. Ostling and R. Etienne, The assumption of symmetry and species abundance distributions, Ecology Letters 11 (2008), 93–105.

[TMMABB2012] D. Amodei, W. Bialek, M. J. Berry II, O. Marre, T. Mora, and G. Tkacik, The simplest maximum entropy model for collective behavior in a neural network, arXiv:1207.6319 (2012).

[B2011] J. Baez, Rényi entropy and free energy, arXiv:1102.2098 (2011).

[BFL2011] J. Baez, T. Fritz and T. Leinster, A characterization of entropy in terms of information loss, Entropy 13 (2011), 1945–1957.

[BS2012] J. Baez and M. Stay, Algorithmic thermodynamics, Math. Struct. Comp. Sci. 22 (2012), 771–787.

[BCSS2012] A. J. Bell, G. E. Crooks, S. Still and D. A. Sivak, The thermodynamics of prediction, Phys. Rev. Lett. 109 (2012), 120604.

[BL2004] C. T. Bergstrom and M. Lachmann, Shannon information and biological fitness, in IEEE Information Theory Workshop 2004, IEEE, 2004, pp. 50–54.

[BBS] M. J. Berry II, W. Bialek and E. Schneidman, An information theoretic approach to the functional classification of neurons, in Advances in Neural Information Processing Systems 15, MIT Press, 2005.

[BNS2001] W. Bialek, I. Nemenman and N. Tishby, Predictability, complexity and learning, Neural Computation 13 (2001), 2409–2463.

[BNS2002] W. Bialek, I. Nemenman and F. Shafee, Entropy and inference, revisited, in Advances in Neural Information Processing Systems 14, MIT Press, 2002.

[CL2012] C. Cobbold and T. Leinster, Measuring diversity: the importance of species similarity, Ecology 93 (2012), 477–489.

[CES2010] J. P. Crutchfield, S. Still and C. Ellison, Optimal causal inference: estimating stored information and approximating causal architecture, Chaos 20 (2010), 037111.

[D2004] R. C. Dewar, Maximum entropy production and non-equilibrium statistical mechanics, in Non-Equilibrium Thermodynamics and Entropy Production: Life, Earth and Beyond, eds. A. Kleidon and R. Lorenz, Springer, New York, 2004, 41–55.

[DJZ2006] R. C. Dewar, D. Juretić and P. Županović, The functional design of the rotary enzyme ATP synthase is consistent with maximum entropy production, Chem. Phys. Lett. 430 (2006), 177–182.

[D2008] R. C. Dewar and A. Porté, Statistical mechanics unifies different ecological patterns, J. Theor. Bio. 251 (2008), 389–403.

[D2009] R. C. Dewar, Maximum entropy production as an inference algorithm that translates physical assumptions into macroscopic predictions: don’t shoot the messenger, Entropy 11 (2009), 931–944.

[D2010] R. C. Dewar, Maximum entropy production and plant optimization theories, Phil. Trans. Roy. Soc. B 365 (2010), 1429–1435.

[DBL2010] M. C. Donaldson-Matasci, C. T. Bergstrom, and M. Lachmann, The fitness value of information, Oikos 119 (2010), 219–230.

[DM2013] M. C. Donaldson-Matasci, G. DeGrandi-Hoffman, and A. Dornhaus, Bigger is better: honey bee colonies as distributed information-gathering systems, Animal Behaviour 85 (2013), 585–592.

[England2013] J. L. England, Statistical physics of self-replication, J. Chem. Phys. 139 (2013), 121923.

[ODLSG2009] J. L. Green, J. K. Lake, J. P. O’Dwyer, A. Ostling and V. M. Savage, An integrative framework for stochastic, size-structured community assembly, PNAS 106 (2009), 6170–6175.

[Harper2009] M. Harper, Information geometry and evolutionary game theory, arXiv:0911.1383 (2009).

[Harper2011] M. Harper, Escort evolutionary game theory, Physica D 240 (2011), 1411–1415.

[Harte2008] J. Harte, T. Zillio, E. Conlisk and A. Smith, Maximum entropy and the state-variable approach to macroecology, Ecology 89 (2008), 2700–2711.

[Harte2009] J. Harte, A. Smith and D. Storch, Biodiversity scales from plots to biomes with a universal species-area curve, Ecology Letters 12 (2009), 789–797.

[Harte2011] J. Harte, Maximum Entropy and Ecology: A Theory of Abundance, Distribution, and Energetics, Oxford U. Press, Oxford, 2011.

[LH2013] M. Harper and C. Lee, Basic experiment planning via information metrics: the RoboMendel problem, arXiv:1210.4808 (2012).

[O2005] A. Ostling, Neutral theory tested by birds, Nature 436 (2005), 635.

[O2012] A. Ostling, Do fitness-equalizing tradeoffs lead to neutral communities?, Theoretical Ecology 5 (2012), 181–194.

[PS2012] D. Precup and S. Still, An information-theoretic approach to curiosity-driven reinforcement learning, Theory in Biosciences 131 (2012), 139–148.

