## Information Processing in Chemical Networks (Part 2)

Dynamics, Thermodynamics and Information Processing in Chemical Networks, 13-16 June 2017, Complex Systems and Statistical Mechanics Group, University of Luxembourg. Organized by Massimiliano Esposito and Matteo Polettini.

I’ll do it in the comments!

I explained the idea of this workshop here:

### 23 Responses to Information Processing in Chemical Networks (Part 2)

1. John Baez says:

The first talk:

Luca Peliti, On the value of information in gambling, evolution and thermodynamics.

Abstract. The connection between the information value of a message and capital gain was made by Kelly in 1953. In 1965 Kimura tried to evaluate the rate of information intake by a population undergoing Darwinian evolution by equating it with the substitutional load. Recently, the analogy between Kelly’s scheme and work extraction was pointed out in the context of stochastic thermodynamics. I shall try to connect these threads, highlighting analogies and differences between the meaning of information and its value in the different contexts.

• John Baez says:

I’m fond of Kelly’s work, where he showed that you can double your money if you can get people to bet on a topic where you know one bit of information that they don’t.

• Wikipedia, Kelly criterion.
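As a rough illustration of the "one bit doubles your money" claim, here is a minimal sketch. It assumes a setup that is not spelled out above: repeated even-odds bets on a fair binary outcome, where the gambler gets a tip that is correct with probability `q`. The function names are hypothetical. Under these assumptions the Kelly stake is the fraction `2q - 1` of current capital, and the resulting growth rate in bits per bet equals one minus the binary entropy of `q`.

```python
import math

def binary_entropy(q):
    """Shannon entropy, in bits, of a coin that lands heads with probability q."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def kelly_fraction(q):
    """Kelly stake for an even-odds bet when the tip is right with probability q >= 1/2."""
    return 2 * q - 1

def growth_rate(q, fraction):
    """Expected log2 growth of capital per bet when staking `fraction` on the tip."""
    outcomes = [(q, 1 + fraction), (1 - q, 1 - fraction)]
    # Skip zero-probability outcomes so that a perfect tip (q = 1) is well defined.
    return sum(prob * math.log2(mult) for prob, mult in outcomes if prob > 0)

perfect = growth_rate(1.0, kelly_fraction(1.0))    # tip carries one full bit
noisy = growth_rate(0.75, kelly_fraction(0.75))    # tip carries less than one bit
print(perfect, noisy, 1 - binary_entropy(0.75))
```

With a perfect tip the growth rate is one bit per bet, so capital doubles each round. With a noisy tip the growth rate drops to the information the tip actually carries.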

I don’t know what ‘substitutional load’ is.

• John Baez says:

I’ve been familiar with the analogy Peliti is explaining—the analogy between how Maxwell’s demon can extract extra work by knowing some information, and how Kelly’s gambler can win bets by knowing some ‘inside information’—for some time now. He says it’s due to Vinkler et al in 2015 and Rivoire in 2015. So, I missed the boat.

• John Baez says:

Peliti went on to cover a lot of very interesting material that’s new to me, e.g. the analogue of the Jarzynski equality for population dynamics.

I hear he’ll put his slides on the conference website; they contain a lot of cool references at the end. I’ll link to them here when they’re up. If I forget, post a comment to remind me!

• John Baez says:

Here’s one interesting reference:

• T. J. Kobayashi and Y. Sughiyama, Fluctuation relations of fitness and information in population dynamics, Phys. Rev. Lett. 115 (2015), 238102.

Abstract. Phenotype-switching with and without sensing environment is a ubiquitous strategy of organisms to survive in fluctuating environment. Fitness of a population of organisms with phenotype-switching may be constrained and restricted by hidden relations as the entropy production in a thermal system with and without sensing and feedback is well-characterized via fluctuation relations (FRs). In this work, we derive such FRs of fitness together with an underlying information-theoretic structure in selection. By using path-integral formulation of a multi-phenotype population dynamics, we clarify that the optimal switching strategy is characterized as a consistency condition for time-forward and backward path probabilities. Within the formulation, the selection is regarded as passive information compression, and the loss of fitness from the optimal strategy is shown to satisfy various FRs that constrain the average and fluctuation of the loss. These results are naturally extended to the situation that organisms can use an environmental signal by actively sensing the environment. FRs of fitness gain by sensing are derived in which the multivariate mutual information among the phenotype, the environment and the signal plays the role to quantify the relevant information in the signal for fitness gain.

2. John Baez says:

Next:

Hong Qian, The Mathematical Foundation of a landscape theory for living matter and life.

Abstract. The physicists’ notion of energy is derived from Newtonian mechanics. The theory of thermodynamics is developed based on that notion, and the realization of mechanical energy dissipation in terms of heat. Since the work of L. Boltzmann, who trusted that atoms were real as early as in 1884, the heat became intimately related to the stochastic motion of the invisible atoms and molecules. In this talk, starting from a stochastic description of a class of rather general dynamics that is not limited to mechanics, we show a notion of energy can be derived mathematically, in the limit of vanishing stochasticity, based on the Kullback-Leibler divergence, or relative entropy associated with the stochastic, Markov processes. With the emergent notion of an energy function, e.g., “landscape”, a mathematical structure inherent to the stochastic dynamics, which is akin to thermodynamics, is revealed. This analysis implies that an abstract “mathematicothermodynamics” structure exists, and can be formulated, for dynamics of complex systems independent of classical thermal physics, for example, in ecology.

• John Baez says:

A cool-looking paper:

• Yi-An Ma and Hong Qian, A thermodynamic theory of ecology: Helmholtz theorem for Lotka-Volterra equation, extended conservation law, and stochastic predator-prey dynamics, Proceedings of the Royal Society A 471 (2015), 20150456.

• John Baez says:

Hong Qian said that Jan Maas has worked on writing Markov processes in terms of gradient flow with respect to the Wasserstein metric on the space of probability distributions (a metric that I don’t yet understand, apparently). I should read this:

Abstract. Let K be an irreducible and reversible Markov kernel on a finite set X. We construct a metric W on the set of probability measures on X and show that with respect to this metric, the law of the continuous time Markov chain evolves as the gradient flow of the entropy. This result is a discrete counterpart of the Wasserstein gradient flow interpretation of the heat flow in $\mathbb{R}^n$ by Jordan, Kinderlehrer, and Otto (1998). The metric W is similar to, but different from, the $L^2$-Wasserstein metric, and is defined via a discrete variant of the Benamou-Brenier formula.
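The metric $W$ itself takes real work to construct, but one consequence of the gradient flow picture is easy to check numerically: along the flow of a reversible Markov chain, the entropy relative to the stationary distribution never increases. The sketch below does only that; it does not build the metric from the paper. The three-state chain, the stationary distribution `PI` and the symmetric weights `C` are made-up values chosen so that detailed balance holds.

```python
import math

PI = [0.5, 0.3, 0.2]                          # assumed stationary distribution
C = {(0, 1): 0.2, (1, 2): 0.1, (0, 2): 0.05}  # assumed symmetric edge weights

def rate_matrix(pi, c):
    """Rates q_ij = c_ij / pi_i, so that pi_i q_ij = pi_j q_ji (detailed balance)."""
    n = len(pi)
    Q = [[0.0] * n for _ in range(n)]
    for (i, j), w in c.items():
        Q[i][j] = w / pi[i]
        Q[j][i] = w / pi[j]
    for i in range(n):
        Q[i][i] = -sum(Q[i][j] for j in range(n) if j != i)
    return Q

def relative_entropy(p, pi):
    """Kullback-Leibler divergence of p from pi, in nats."""
    return sum(x * math.log(x / y) for x, y in zip(p, pi) if x > 0)

def evolve(p, Q, pi, dt, steps):
    """Euler steps of the master equation dp/dt = pQ, recording relative entropy."""
    n = len(p)
    history = [relative_entropy(p, pi)]
    for _ in range(steps):
        p = [p[j] + dt * sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]
        history.append(relative_entropy(p, pi))
    return p, history

Q = rate_matrix(PI, C)
p_final, history = evolve([1.0, 0.0, 0.0], Q, PI, dt=0.1, steps=500)
print(history[0], history[-1], p_final)
```

The step size is small enough that each Euler step is itself a stochastic matrix with `PI` as its fixed point, so the recorded relative entropy should decrease at every step rather than only in the continuous-time limit.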

3. John Baez says:

For ‘motifs’ in genetic regulatory networks, see:

Uri Alon, An Introduction to Systems Biology: Design Principles of Biological Circuits, Chapman, 2006.

Enrico Carlon is talking about one of these: the heterodimer auto-repression loop (or HAL) is one of several simple motifs that gives rise to oscillatory behavior. You can think of it as a chemical reaction network and write down its rate equation: a set of first-order differential equations.

• Enrico Carlon, A robust and flexible pulse-generating genetic module.

Abstract. The heterodimer autorepression loop (HAL) is a small genetic module in which a protein A acts as an autorepressor and binds to a second protein B to form an AB dimer. For suitable values of the rate constants, the HAL produces pulses of A alternating with pulses of B. By means of analytical and numerical calculations, we show that the duration of A pulses is extremely robust against variation of the rate constants while the duration of the B pulses can be flexibly adjusted. The HAL is thus a minimal genetic module generating robust pulses with a tunable duration, an interesting property for cellular signaling.

Here’s a paper on it:

• B. Lannoo, E. Carlon and M. Lefranc, The heterodimer auto-repression loop: a robust and flexible pulse-generating genetic module.

E. Yeger-Lotem has written a couple of papers in 2003-4 looking for the HAL motif in nature.

4. John Baez says:

Another talk by someone I know, whose work I like a lot, in part because it combines chemical reaction theory with graph rewrite theory:

Christoph Flamm, How to find mechanisms in large reaction networks? (Part 1.)

Abstract. Over the past years, we have developed a graph-grammar based formalism for the unified treatment of large chemical reaction networks. Molecules are abstracted to graph objects and reactions to graph transformations between these objects. The formalism is constructive in nature and makes it possible to systematically assemble the “chemical space” of the possible molecules and the reactions between them. The chemical space construction is primed by a set of educts and a “reaction chemistry” defined as a collection of graph transformation rules. Key features of chemical reaction systems such as mass conservation, or atom-to-atom maps between reactant and product molecules are inherently preserved properties in our formalism. The possibility to construct arbitrary chemical spaces paves the way to systematically attack questions such as “How is a molecule of interest constructed with a particular reaction chemistry?”, or “Does the chemical space harbor multiple, possibly competing, reaction mechanisms to a molecule of interest?”. The idea for answering such questions involves the rephrasing of the question in the language of integer flows on networks. Sub-networks, which conform to the formal flow specification can then be identified in the chemical space using mathematical optimization techniques. In contrast to a reaction mechanism, such flow solutions however, do not possess a causal ordering in terms of which reaction must happen before which other reaction, such that the educts are successively transformed, via reactions and intermediates, into the products. Therefore Petri net techniques are required to “interpret” a flow solution with respect to the causal order. Mechanisms identified with such a methodology are however not necessarily dynamically realizable. The approach also fails if the mechanism cannot be expressed as flow problem beforehand. In such cases causal reconstruction approaches developed in the realm of concurrent systems are required to find causally ordered and dynamically realizable reaction mechanisms. Based on illustrative examples I will walk the audience through the graph-grammar based mechanism reconstruction, followed by a schematic description of how to identify dynamically realizable reaction mechanisms from stochastic simulations.

After this will come a second part:

Daniel Merkle, Exploration and analysis of chemical spaces. (Part 2.)

Abstract. In this presentation I will introduce the mechanism finding problem, as scratched by Christoph Flamm, formally more rigorous. Methods to model chemistry on an atomic level with algebraic and graph rewriting approaches will be presented. We use a formalism, rooted in category theory, called the Double Pushout approach, which directly expresses the transition state of chemical reactions. Chemical spaces generated by graph grammars contain important transformation patterns such as so-called auto-catalytic sub-networks or alternative routes to molecules of interest. Such chemical motifs are usually hard to find due to the computational complexity of the underlying problem and the vastness of the chemical spaces. However, our algorithmic approaches combined with the explicitness of our models allows for detailed investigations within these spaces, which is the foundation for understanding function of biological systems. Approaches and results based on deterministic as well as stochastic simulations will be presented on a range of chemical systems including the non-oxidative part of the Pentose Phosphate pathway as well as Eschenmoser’s hypothetical relationship between HCN chemistry and constituents of the reductive citric acid cycle.

• John Baez says:

Merkle is speaking. This is the first talk I’ve seen where someone argues strongly for the usefulness of category theory, then explains the definition of pushouts, and then illustrates it with examples from organic chemistry! Good thing I’d explained the definition of ‘category’ earlier in the day! Double-pushout graph rewriting applied to chemistry, and implemented in software that creates pictures of molecules using TikZ:

• Jakob L. Andersen, Christoph Flamm, Daniel Merkle, Peter F. Stadler, A software package for chemically inspired graph transformation.

Abstract. Chemical reaction networks can be automatically generated from graph grammar descriptions, where rewrite rules model reaction patterns. Because a molecule graph is connected and reactions in general involve multiple molecules, the rewriting must be performed on multisets of graphs. We present a general software package for this type of graph rewriting system, which can be used for modelling chemical systems. The package contains a C++ library with algorithms for working with transformation rules in the Double Pushout formalism, e.g., composition of rules and a domain specific language for programming graph language generation. A Python interface makes these features easily accessible. The package also has extensive procedures for automatically visualising not only graphs and rewrite rules, but also Double Pushout diagrams and graph languages in form of directed hypergraphs. The software is available as an open source package, and interactive examples can be found on the accompanying webpage.

5. John Baez says:

I’m listening to this talk now:

Thomas Ouldridge, The thermodynamics of persistent information in biochemical systems.

Abstract. As discussed in the preceding talks, biochemical push-pull networks appear to make copies of receptor states in order to perform time integration. Furthermore, there is a trade-off between readout performance and energy consumption. These results hint at a tantalising analogy between such molecular systems and the fundamental limits on the thermodynamics of measurement, a topic that has generated considerable attention since Maxwell summoned his famous “demon”. But is this analogy rigorous, and does it provide any insight into the underlying operation of these molecular circuits? Can cellular copying networks approach the limit of optimal efficiency? We demonstrate that indeed, a concrete mapping relates the operation of push-pull motifs to computational copying. Moreover, just like a true computational copy, these biochemical circuits must do work to create long-lived (persistent) information between molecules; extracting no work from these correlations sets a lower bound on entropy generation. Autonomous biochemical networks can come close to this fundamental lower bound, even at high copying rate, but cannot reach it due to the existence of a constant thermodynamic drive. Similarly, we show how persistent information between molecular states can be exploited as a thermodynamic resource by a system based on the same underlying biochemistry.

Our analysis emphasises the importance of correlations between non-interacting molecules in the thermodynamics of molecular systems. Guided by this insight, we explore the biologically ubiquitous process of construction of persistent polymer copies from templates, occurring during DNA replication and protein synthesis. In such processes, a copy sequence is produced that must retain its sequence even after physical separation from its template. We show that this need to retain correlations after separation has a number of important effects. In particular, this persistence implies a resource cost for minimal replicators that grows with replication accuracy, and suggests that in autonomous contexts a non-equilibrium process is not only necessary to enhance copying accuracy, but to provide any accuracy at all.

One thing I like is that he considers some simple reaction networks as models of the concepts he’s studying. One of them is this:

X + ATP $\leftrightarrow$ X* + ADP
X + P $\leftrightarrow$ X*

This is simple enough that I can imagine understanding it! He shows how the rate equation for this is “the same as” (or can be mapped to, in some way that deserves investigation) the rate equation for a more abstract reaction network for data copying.
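For concreteness, here is a sketch of what the rate equation for these two reactions looks like under mass-action kinetics. The rate constants $\alpha_\pm$ and $\beta_\pm$ are hypothetical labels introduced here, not notation from the talk:

$\displaystyle{ \frac{d[X^*]}{dt} = \alpha_+ [X][\mathrm{ATP}] - \alpha_- [X^*][\mathrm{ADP}] + \beta_+ [X][\mathrm{P}] - \beta_- [X^*] }$

$\displaystyle{ \frac{d[X]}{dt} = - \frac{d[X^*]}{dt} }$

So $[X] + [X^*]$ is conserved. If one further assumes that ATP, ADP and P are held at fixed concentrations, these equations become linear in $[X]$ and $[X^*]$, describing a two-state system that switches back and forth at constant effective rates.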

6. John Baez says:

Eric Smith is talking now. He mentioned that Kappa is supposed to be an extensible language for systems biology. Worth looking into!

• John Baez says:

It’s the first part of a two-part talk:

• Eric Smith and Supriya Krishnamurthy, Stochastic chemical reaction networks in the Doi-Peliti representation: scaling, moment hierarchies, and duality under the non-equilibrium work relations and their generalizations.

Abstract. The common material in our two talks will be the class of discrete-state stochastic processes associated with Chemical Reaction Networks (CRNs), as these are studied using the Doi-Peliti (DP) 2-field functional integral representation. The CRNs are an interesting class for which the graphical models representing the stochastic process are computationally complex, and the extent to which their solution properties can be determined from network topology is a problem of ongoing interest. The DP formalism gives a general way to represent generating functions and functionals for such processes, and provides insight into both solution properties of longstanding interest within the CRN community, and into the meaning of the duality of the Jarzynski/Crooks Non-Equilibrium Work Relations (NEWRs) and their generalizations.

In the first segment, Eric Smith will review basics of CRN theory, including the similarities and differences to simpler diffusion processes, the sources of complexity, and the important topological characteristic known as “deficiency” due to Feinberg. He will show how representing stochastic CRNs with the DP formalism expresses simplicities and symmetries that are masked in more limited representations, and in particular how the duality associated with NEWRs is expressed. This talk will focus on the shift from causality to anti-causality that is generally associated with the NEWRs.

In the second segment, Supriya Krishnamurthy will show how the DP formalism combined with CRN theory helps in deriving a particularly simple representation of the hierarchy of moments. This representation provides a way to solve the moment hierarchies (in steady states) using matched asymptotic expansions. For some simple systems or for sub-hierarchies within more complex systems, the values of the moments can be solved from their asymptotic behaviours via direct numerical recursions.

7. John Baez says:

Eric Smith says “There is a native notion of the fluctuation-dissipation theorem built into Doi’s 2-field formalism”, citing Kamenev. This “2-field formalism” is the path integral approach to the chemical master equation.

8. John Baez says:

Stefan Schuster is talking about “elementary flux modes” in reaction networks. Here’s the talk abstract:

• Stefan Schuster, Jan Ewald, Maximilian Fichtner, Severin Sasso and Christoph Kaleta, Modelling the link between lipid and carbohydrate metabolism in various species.

Abstract. Elementary-modes analysis [1] has become a well-established theoretical tool in metabolic pathway analysis. It allows one to decompose complex metabolic networks into the smallest functional entities, which can be interpreted as biochemical pathways. It led to successful theoretical prediction of novel pathways, such as in carbohydrate metabolism in Escherichia coli and other bacteria [1, 2]. Metabolism is more complex than a graph in the sense of graph theory because of the presence of bimolecular reactions. Therefore, the existence of a connected route does not necessarily guarantee a net conversion along that route at steady state. This is here illustrated by tackling the question whether humans can convert fatty acids into sugar [3]. While, in agreement with biochemical dogma, no stoichiometrically balanced route for such a conversion can be found in human central metabolism, we did find several routes in a genome-scale network of human metabolism [4]. This is likely to be relevant for sports physiology, weight-reducing diets and other applications. In green plants, fungi and many bacteria, in contrast, the above-mentioned conversion is enabled by the glyoxylate shunt, which is absent from humans and most animals. That shunt is of special importance for pathogenic fungi such as Candida albicans. Finally, we present a method for enumerating fatty acids [5], with potential applications in lipidomics and synthetic biology. We show that the number of unmodified fatty acids grows according to the famous Fibonacci numbers when cis/trans isomerism is neglected. Under consideration of that isomerism or modification by hydroxy- or oxo groups, diversity can be described by generalized Fibonacci numbers (e.g. Pell numbers).

[1] S. Schuster, T. Dandekar, D.A. Fell, Detection of elementary flux modes in biochemical networks: A promising tool for pathway analysis and metabolic engineering, Trends Biotechnol. 17 (1999) 53-60.

[2] E. Fischer, U. Sauer: A novel metabolic cycle catalyzes glucose oxidation and anaplerosis in hungry Escherichia coli, J. Biol. Chem. 278 (2003) 46446–46451.

[3] L. F. de Figueiredo, S. Schuster, C. Kaleta, D.A. Fell: Can sugars be produced from fatty acids? A test case for pathway analysis tools, Bioinformatics 25 (2009) 152–158.

[4] C. Kaleta, L.F. de Figueiredo, S. Werner, R. Guthke, M. Ristow, S. Schuster: In silico evidence for gluconeogenesis from fatty acids in humans, PLoS Comp. Biol. 7 (2011) e1002116.

[5] S. Schuster, M. Fichtner, S. Sasso: Use of Fibonacci numbers in lipidomics – Enumerating various classes of fatty acids, Sci. Rep. 7 (2017) 39821.

• John Baez says:

Schuster defined an elementary mode of a chemical reaction network as “a minimal set of enzymes that can operate at steady state with all irreversible reactions used in the appropriate direction, with enzymes weighted by the relative flux they carry”. A more mathematical-sounding definition would help me a lot, and I’m sure one exists! I believe we’ve got a (directed multi)graph and we’re looking for a basis of the space of 1-chains obeying some property. There may be an assumption that all reactions involve an enzyme, meaning that they’re of the form

$A + X \to B + X$

or perhaps

$A + X \to B + C + X$

or

$A + B + X \to C + X$

for some species $X,$ the enzyme.

However they’re defined, the elementary modes are unique up to scaling, and they form a spanning set for some interesting vector space.

• John Baez says:

Here’s my attempt to formalize the idea. I fear I’m leaving out some important details.

A reaction network has a finite set of species $S$, a finite set of reactions $R,$ and source and target maps

$s, t \colon R \to \mathbb{N}^S$

The change in species due to a reaction $r \in R$ is $t(r) - s(r) \in \mathbb{Z}^S$ (an integer vector, since a reaction can consume species as well as produce them), and there is a linear map

$Y : \mathbb{R}^R \to \mathbb{R}^S,$

usually called the stoichiometric matrix, defined by the equation

$Y(r) = t(r) - s(r)$

(Here I’m thinking of any element of $\mathbb{R}^R$ as a linear combination of elements of $R$, which form a basis; then I’m defining $Y$ by its action on these basis elements.)

A flux mode is an element

$f \in \mathbb{N}^R \subseteq \mathbb{R}^R$

for which

$Y(f) = 0$

So, it’s a sum of reactions that, taken together, ‘don’t do anything’: it leaves the amount of each species unchanged.

An elementary flux mode is a nonzero flux mode $f$ that’s minimal, meaning there’s no nonzero flux mode $f' \ne f$ with

$f' \le f$

Here I’m using the obvious partial order on flux modes, where $f' \le f$ iff $f'(r) \le f(r)$ for each reaction $r.$
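The definitions above can be tried out on a tiny example. The following is a minimal brute-force sketch, not an established algorithm for elementary mode analysis. The four-reaction network, the function names and the search bound are all made up for illustration. Flux modes are compared componentwise, reaction by reaction, and only vectors with entries up to `bound` are searched, so the approach is only feasible for very small networks.

```python
from itertools import product

# Hypothetical toy network on species A, B, C (not taken from the talk):
#   r0: A -> B,  r1: B -> A,  r2: B -> C,  r3: C -> A
SPECIES = ["A", "B", "C"]
REACTIONS = [
    ({"A": 1}, {"B": 1}),
    ({"B": 1}, {"A": 1}),
    ({"B": 1}, {"C": 1}),
    ({"C": 1}, {"A": 1}),
]

def stoichiometric_matrix(species, reactions):
    """Rows are species, columns are reactions; each column is t(r) - s(r)."""
    return [[tgt.get(sp, 0) - src.get(sp, 0) for (src, tgt) in reactions]
            for sp in species]

def is_flux_mode(Y, f):
    """True if Y(f) = 0, i.e. f leaves the amount of every species unchanged."""
    return all(sum(y * x for y, x in zip(row, f)) == 0 for row in Y)

def elementary_flux_modes(Y, n_reactions, bound=2):
    """Minimal nonzero flux modes among all f with entries in 0..bound."""
    modes = [f for f in product(range(bound + 1), repeat=n_reactions)
             if any(f) and is_flux_mode(Y, f)]

    def strictly_below(g, f):
        return g != f and all(a <= b for a, b in zip(g, f))

    return sorted(f for f in modes
                  if not any(strictly_below(g, f) for g in modes))

Y = stoichiometric_matrix(SPECIES, REACTIONS)
efms = elementary_flux_modes(Y, len(REACTIONS))
print(efms)  # [(1, 0, 1, 1), (1, 1, 0, 0)]
```

The two results are the cycle A → B → C → A and the cycle A → B → A. Every other flux mode of this toy network is a sum of these two with natural number coefficients.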

I fear I’m leaving out some issues regarding enzymes: the reaction network I’m discussing might formally contain a reaction

$A \to B$

but really ‘secretly’ this reaction could be

$A + X \to B + X$

This wouldn’t significantly change the concept of flux mode, because the amount of the enzyme $X$ is unchanged, but in applications of flux modes we might want to know which flux modes are allowed if we put a constraint on how much of the enzyme $X$ is available.

Perhaps we should model this by a set $E$ of enzymes and a map

$Z : R \to \mathbb{N}^E$

saying how much of each enzyme is involved in each reaction.

• John Baez says:

Schuster describes how elementary mode analysis has been used to study an interesting question: can fatty acids be transformed into glucose? Humans can’t do it using the reactions that convert glucose into fatty acids, but green plants, fungi, many bacteria (including E. coli) and nematodes can do it. That’s because they have the ‘glyoxylate shunt’.

Can humans convert fatty acids into sugar using some more complicated mechanism? Yes, at least in principle! But the routes are very complicated: molecules need to cross the mitochondrial membrane three times! And yet, it seems they may be used by people playing soccer (for example), and by the Inuit, who eat a diet high in fat and low in sugar.

• John Baez says:

If you count fatty acids with n carbon atoms, excluding ‘allenic’ ones (those with two neighboring double bonds, which are rare in nature) and ignoring cis/trans isomerism, you get the Fibonacci numbers!
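This count can be checked by brute force under a simplified model. The model is an assumption made here, not necessarily the exact one from the talk: take an unbranched chain of n carbons, require the bond next to the carboxyl carbon to be single, let each of the remaining n − 2 carbon-carbon bonds be single or double, and forbid two double bonds in a row. The function names are hypothetical.

```python
from itertools import product

def count_fatty_acids(n):
    """Count bond patterns for an unbranched chain with n >= 2 carbons.

    Assumes n - 2 bonds are free to be single (0) or double (1),
    with no two neighboring double bonds (no allenes).
    """
    free_bonds = n - 2
    count = 0
    for bonds in product((0, 1), repeat=free_bonds):
        if all(not (a and b) for a, b in zip(bonds, bonds[1:])):
            count += 1
    return count

def fibonacci(n):
    """Fibonacci numbers with F(1) = F(2) = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

counts = [count_fatty_acids(n) for n in range(2, 11)]
print(counts)  # [1, 2, 3, 5, 8, 13, 21, 34, 55]
```

Under these assumptions the count for n carbons matches the nth Fibonacci number. The reason is that a valid chain either ends in a single bond (extending any valid chain with one fewer carbon) or in a double bond preceded by a single bond (extending any valid chain with two fewer carbons).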

9. Our paper on this stuff just got accepted, and it should appear soon:

• John Baez and Blake Pollard, A compositional framework for reaction networks, to appear in Reviews in Mathematical Physics.

But thanks to the arXiv, you don’t have to wait: beat the rush, click and download now!

Blake and I gave talks about this stuff in Luxembourg this June, at a nice conference called Dynamics, thermodynamics and information processing in chemical networks. So, if you’re the sort who prefers talk slides to big scary papers, you can look at those:

• John Baez, The mathematics of open reaction networks.

• Blake Pollard, Black-boxing open reaction networks.

10. The University of Southern Denmark wants to hire several postdocs who will use category theory to design enzymes. This sounds like a wonderful job for people who like programming, chemistry and categories—and especially double pushout rewriting. The application deadline is 20 March 2020. The project is described here and the official job announcement is here.

I’ve seen Christoph Flamm, Daniel Merkle and Peter Stadler give talks on this project in Luxembourg, and it’s really fascinating. They’re using double pushout rewriting (as shown in the picture above) and other categorical techniques to design sequences of chemical reactions that accomplish desired tasks.
