## Glycolysis (Part 1)

I’m trying to understand some biology. Being a mathematician I’m less interested in all the complicated details of life on this particular planet than in something a bit more abstract. I want to know ‘the language of life’: the right way to talk about living systems.

Of course, there’s no way to reach this goal without learning a lot of the complicated details. But I might as well be honest and state my goal, since it’s bound to put a strange spin on how I learn and talk about biology.

For example, when I heard people were using the pi-calculus to model a very simple bacterium, I wasn’t eager to know how close their model is to the Last Universal Ancestor, the primordial bug from which we all descend. Even though it’s a fascinating question, it’s not one I can help solve. Instead, I wanted to know if the pi-calculus is really the best language for this kind of model.

I also wanted to know what types of chemical reactions are needed for a cell to survive. I’ll never remember all the details of those reactions: I don’t have the right kind of mind for that. But I might manage to think about these reactions in abstract ways that biologists haven’t tried.

The minimal gene set prokaryote has been exhaustively described in the enhanced π-calculus. We represented the 237 genes, their relative products, and the metabolic pathways expressed and regulated by the genes, as the corresponding processes and channels. In particular: the glycolytic pathway, the pentose phosphate pathway, the pathways involved in nucleotide, aminoacids, coenzyme, lipids, and glycerol metabolism.

I instantly wanted to get an overall view of these reactions, without immersing myself in all the details.

Unfortunately I don’t know how to do this. Do you?

It might be like trying to learn grammar without learning vocabulary: not very easy, and perhaps unwise.

But I bet there’s a biochemistry textbook that would help me: one that focuses on the forest before getting into the names of all the trees. I may have even seen such a book! I’ve certainly tried to learn biochemistry. It’s a perfectly fascinating subject. But it’s only recently that I’ve gotten serious about chemical reaction networks and nonequilibrium thermodynamics. This may help guide my studies now.

Anyway, let me start with the ‘glycolytic pathway’. Glycolysis is the process of breaking down a sugar called glucose, thereby powering the formation of ATP, which holds energy in a form that the cell can easily use to do many things.

Glycolysis looks pretty complicated, at least if you’re a mathematician:

But when you’re trying to understand the activities of a complicated criminal network, a good slogan is ‘follow the money’. And for a chemical reaction network, you can ‘follow the conserved quantities’. We’ve got various kinds of atoms—hydrogen, carbon, nitrogen, oxygen, phosphorus—and the number of each kind is conserved. That should help us follow what’s going on.
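
‘Following the conserved quantities’ can be made concrete with a tiny bit of code. Here’s my own illustrative sketch (not from any textbook): count atoms on each side of the core transformation glucose → 2 pyruvic acid, and notice the four hydrogens left over—exactly what the NAD+/NADH system carries off:

```python
from collections import Counter

def atoms(species):
    """Total atom count for a list of (formula, multiplicity) pairs."""
    total = Counter()
    for formula, mult in species:
        for atom, n in formula.items():
            total[atom] += mult * n
    return total

glucose      = {'C': 6, 'H': 12, 'O': 6}
pyruvic_acid = {'C': 3, 'H': 4,  'O': 3}

lhs = atoms([(glucose, 1)])
rhs = atoms([(pyruvic_acid, 2)])

# Carbon and oxygen balance exactly; four hydrogens are left over,
# which is just what 2 NAD+ picks up (becoming 2 NADH + 2 H+).
print(dict(lhs - rhs))  # {'H': 4}
```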

Energy is also conserved, and that’s incredibly important in thermodynamics. Free energy—energy in forms that are actually useful—is not conserved. But it’s still very good to follow it, since while it can go away, turning into heat, it essentially never appears out of nowhere.

The usual definition of free energy is something like

$F = E - TS$

where $E$ is energy, $T$ is temperature and $S$ is entropy. You can think of this roughly as “energy minus energy in the form of heat”. There’s a lot more to say here, but I just want to add that free energy can also be interpreted as ‘relative information’, a purely information-theoretic concept. For an explanation, see Section 4 of this paper:

• John Baez and Blake Pollard, Relative entropy in biological systems. (Blog article here.)

Since I like abstract generalities, this information-theoretic way of understanding free energy appeals to me.
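
That interpretation can be checked numerically. Here’s a small self-contained sketch of my own (working in units where $kT = 1$, with made-up energy levels): for a system with fixed energies, the free energy of a distribution $p$ exceeds the equilibrium free energy by exactly $kT$ times the relative information $D(p\|p_{eq})$:

```python
import math

kT = 1.0  # units where Boltzmann's constant times temperature is 1

def free_energy(p, E):
    """Nonequilibrium free energy F = <E> - T S, with S = -k sum p ln p."""
    mean_E = sum(pi * Ei for pi, Ei in zip(p, E))
    neg_TS = kT * sum(pi * math.log(pi) for pi in p if pi > 0)
    return mean_E + neg_TS

def relative_information(p, q):
    """Kullback-Leibler divergence D(p||q) = sum p ln(p/q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

E = [0.0, 1.0, 2.0]                              # made-up energy levels
Z = sum(math.exp(-Ei / kT) for Ei in E)
p_eq = [math.exp(-Ei / kT) / Z for Ei in E]      # Boltzmann equilibrium

p = [0.5, 0.3, 0.2]                              # some nonequilibrium distribution

lhs = free_energy(p, E) - free_energy(p_eq, E)
rhs = kT * relative_information(p, p_eq)
print(abs(lhs - rhs) < 1e-12)  # True: the two agree
```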

And of course free energy is useful, so an organism should care about it—and we should be able to track what an organism actually does with it. This is one of my main goals: understanding better what it means for a system to ‘do something with free energy’.

In glycolysis, some of the free energy of glucose gets transferred to ATP. ATP is a bit like ‘money’: it carries free energy in a way that the cell can easily ‘spend’ to do interesting things. So, at some point I want to look at an example of how the cell actually spends this money. But for now I want to think about glycolysis—which may be more like ‘cashing a check and getting money’.

First, let’s see what we get if we ‘black-box’ glycolysis. I’ve written about black-boxing electrical circuits and Markov processes: it’s a way to ignore their inner workings and focus on the relation between inputs and outputs.

Blake Pollard and I are starting to study the black-boxing of chemical reaction networks. If we black-box glycolysis, we get this:

glucose + 2 NAD+ + 2 ADP + 2 Pi → 2 pyruvate + 2 NADH + 2 H+ + 2 ATP + 2 H2O

A molecule of glucose has more free energy than 2 pyruvate molecules plus 2 water molecules. On the other hand, ADP + phosphate has less free energy than ATP. So, glycolysis is taking free energy from glucose and putting some of it into the handy form of ATP molecules. And a natural question is: how efficient is this reaction? How much free energy gets wasted?

Here’s an interesting paper that touches indirectly on this question:

• Daniel A. Beard, Eric Babson, Edward Curtis and Hong Qian, Thermodynamic constraints for biochemical networks, Journal of Theoretical Biology 228 (2004), 327–333.

They develop a bunch of machinery for studying chemical reaction networks, which I hope to explain someday. (Mathematicians will be delighted to hear that they use matroids, a general framework for studying linear dependence. Biochemists may be less delighted.) Then they apply this machinery to glycolysis, using computers to do some calculations, and they conclude:

Returning to the original problem of ATP production in energy metabolism, and searching for the flux vector that maximizes ATP production while satisfying the mass balance constraint and the thermodynamic constraint, we find that at most 20.5 ATP are produced for each glucose molecule consumed.

So, they’re getting some upper bound on how good glycolysis could actually be!

Puzzle 1. What upper bounds can you get simply from free energy considerations?

For example, ignore NADH and NAD+ for a second, and ask how much ATP you could make from turning a molecule of glucose into pyruvate and water if free energy were the only consideration. To answer this, you could take the free energy of a mole of glucose minus the free energy of the corresponding amount of pyruvate and water, and divide it by the free energy of a mole of ATP minus the free energy of the corresponding amount of ADP and phosphate. What do you get?
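
For what it’s worth, here’s the back-of-envelope arithmetic using rough textbook values: about −146 kJ/mol for glucose + 2 NAD+ → 2 pyruvate + 2 NADH + 2 H+ (so this figure actually folds in the NADH bookkeeping of Puzzle 2) and about −30.5 kJ/mol for ATP hydrolysis. Both are standard-condition values, not actual cellular ones, so treat the answer as illustrative:

```python
# Back-of-envelope bound, using rough standard-condition free energies:
dG_glycolysis_core = -146.0  # kJ/mol: glucose + 2 NAD+ -> 2 pyruvate + 2 NADH + 2 H+ (approx.)
dG_ATP_hydrolysis  = -30.5   # kJ/mol: ATP -> ADP + Pi (approx.)

max_ATP_per_glucose = dG_glycolysis_core / dG_ATP_hydrolysis
print(round(max_ATP_per_glucose, 1))  # about 4.8, versus the 2 ATP glycolysis actually makes
```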

Puzzle 2. How do NADH and NAD+ fit into the story? In the last paragraph I ignored those. We shouldn’t really do that! NAD+ is an oxidized form of nicotinamide adenine dinucleotide. NADH is the reduced form of the same chemical. In our cells, NADH has more free energy than NAD+. So, besides producing ‘free energy money’ in the form of ATP, glycolysis is producing it in the form of NADH! This should improve our upper bound on how much ATP could be produced by glycolysis.

However, the cell uses NADH for more than just ‘money’. It uses NADH to oxidize other chemicals and NAD+ to reduce them. Reduction and oxidation are really important in chemistry, including biochemistry. I need to understand this whole redox business better. Right now my guess is that it’s connected to yet another conserved quantity, which I haven’t mentioned so far.

Puzzle 3. What conserved quantity is that?

### 36 Responses to Glycolysis (Part 1)

1. Puzzle 3: Charge?

• John Baez says:

Yes, I think so! According to Wikipedia:

Oxidation is the loss of electrons or an increase in oxidation state by a molecule, atom, or ion.

Reduction is the gain of electrons or a decrease in oxidation state by a molecule, atom, or ion.

Redox reactions, or oxidation-reduction reactions, have a number of similarities to acid–base reactions. Like acid–base reactions, redox reactions are a matched set, that is, there cannot be an oxidation reaction without a reduction reaction happening simultaneously. The oxidation alone and the reduction alone are each called a half-reaction, because two half-reactions always occur together to form a whole reaction. When writing half-reactions, the gained or lost electrons are typically included explicitly in order that the half-reaction be balanced with respect to electric charge.

That sounds very promising.

The key terms involved in redox are often confusing to students. For example, an element that is oxidized loses electrons; however, that element is referred to as the reducing agent. Likewise, an element that is reduced gains electrons and is referred to as the oxidizing agent.

This is just like saying that giving a gift is the opposite of getting one.
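
As a sanity check on the charge bookkeeping, take the standard half-reaction that turns NAD+ into NADH, written with the electrons explicit:

```python
# Electric charge bookkeeping for the half-reaction
#   NAD+ + H+ + 2 e-  ->  NADH
charge = {'NAD+': +1, 'H+': +1, 'e-': -1, 'NADH': 0}

lhs = charge['NAD+'] + charge['H+'] + 2 * charge['e-']
rhs = charge['NADH']
print(lhs == rhs)  # True: both sides have total charge zero
```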

• Graham says:

It seems like redox potential (https://en.wikipedia.org/wiki/Reduction_potential) should be mentioned here.

2. Walter Blackstock says:

You may like Biological Physics by Philip Nelson (W. H. Freeman & Co.). My copy is ten years old, but there is a 2015 edition. Considering also your interest in water, have a look at Life Depends upon Two Kinds of Water by Philippa Wiggins, PLoS ONE 2008.

• John Baez says:

Thanks! I’ll give Biological Physics a try.

That paper by Wiggins is a bit scary, since it suggests we have a lot of basic things left to understand about something one might have hoped was reasonably well-understood: the role of water in the chemical reactions crucial to life. I’m curious to know how controversial her views are. Knowing how complicated water is, I can easily imagine she’s onto something, but I’d like to see more recent papers, if there are any. The model of ‘two kinds of water’ would have to be a simplification, as she willingly admits—but perhaps a useful one, as she claims.

For here, I’ll just quote some of her remarks about ‘energy transduction’. I didn’t quite have the time to make the idea explicit in my post here, but it was sitting right beneath the surface. The idea is that a reaction like

A + B → C + D

is thermodynamically favored if the free energy decrease in

A → C

is more than the free energy gain of

B → D

The issue is how these reactions get ‘coupled’, so that the second happens along with the first. It’s like getting two gears into contact so that when one turns, so does the other.
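
The ‘gears’ picture has a tidy quantitative counterpart: equilibrium constants satisfy $K = e^{-\Delta G/RT}$, so coupling two reactions adds their free energy changes and multiplies their equilibrium constants. A quick sketch with made-up ΔG values (the +10 and −30 kJ/mol are purely illustrative):

```python
import math

RT = 8.314e-3 * 310  # kJ/mol at body temperature, roughly 2.58 kJ/mol

def K(dG):
    """Equilibrium constant for a reaction with free energy change dG (kJ/mol)."""
    return math.exp(-dG / RT)

dG_uphill   = +10.0  # B -> D on its own: unfavorable
dG_downhill = -30.0  # A -> C on its own: strongly favorable

# Coupling adds the free energies, i.e. multiplies the equilibrium constants:
print(K(dG_uphill) < 1)                                                    # True
print(math.isclose(K(dG_uphill) * K(dG_downhill), K(dG_uphill + dG_downhill)))  # True
```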

The third line of evidence comes from a scrutiny of biological processes, especially enzyme reactions. Like the outstanding anomalies of the pure liquid, a full mechanistic explanation seemed to require more than the random hydrogen bonding of one-state water. In particular, the ability of some enzymes to hydrolyse ATP, peptides and polynucleotides and others to synthesise ATP and biopolymers, a pervading phenomenon in biochemistry, had no detailed molecular mechanism. The concept of energy transduction which was proposed in the 1960s has survived to achieve the status of a textbook fact, but it does not constitute a molecular mechanism. The free energy change of a reaction indicates how far the reaction, as written, is from equilibrium and, therefore, whether or not the reaction will proceed spontaneously. For example, the free energy change of the reaction:

MgATP + H2O → ADP + Mg++ + Pi

is approximately −30 kJ/mol. Here Pi means the mixture of H2PO4−/HPO42− that exists at the prevailing pH. This is a rather high negative free energy change which means that the reaction, as written, will go spontaneously from left to right, and that the reverse reaction from right to left has a free energy change of +30 kJ/mol and will not go spontaneously. The magnitude of the free energy change shows that the reaction is far from equilibrium. As the reaction proceeds from left to right, 30 kJ/mol of energy is dissipated as heat. The concept of energy transduction states that the negative free energy of a spontaneous reaction, such as hydrolysis of ATP, can be used to drive an uphill reaction, such as movement of cations from low to high concentrations, provided that the two reactions are coupled in a single enzyme active site [12]. The authors take the case where the uphill reaction has a positive standard free energy change:

A → B       ΔGo = +10 kJ/mole

The spontaneous reaction has a larger negative standard free energy change:

C → D       ΔGo = -30 kJ/mole

When the two reactions are coupled the overall reaction is:

A + C → B + D

and the free energy change is ΔGo = +10 − 30 = −20 kJ/mol, i.e. when the reactions are coupled, the sum of the two standard free energy changes is negative, and the uphill reaction can proceed. The uphill reaction might be movement of Na+ from a low concentration inside a cell to a higher concentration outside the cell, and the spontaneous reaction hydrolysis of ATP. This is called energy transduction. It is taken to mean that free energy stored in the bonds of ATP can be converted to other forms of energy and harnessed to do work.

This concept was vigorously debated in the 1960s. Physical chemists pointed out that, however and wherever ATP was hydrolysed, the free energy of that hydrolysis was dissipated as heat and could not be diverted to or bestowed upon another unrelated reaction, however hungry that reaction might be for kJ [13]. True coupling of reactions requires that a reactant of the uphill reaction is a product of the spontaneous reaction, so that it is so rapidly scavenged that the reaction can proceed. Biochemists, in their turn, pointed out that whenever the two reactions occurred together in the active site of the sodium pump, ATP was hydrolysed and work of transport performed. Without hydrolysis of ATP no transport occurred. The fallacy in their argument was that they believed that each step in the overall process must have a negative free energy change. In fact, the criterion that a reaction will proceed is simply that the free energy of the products is more negative than that of the reactants. There can be ups and downs on the way from reactant to products, but as long as all the steps considered contribute to the overall unitary process, all that matters is the beginning and the end, irrespective of how the system gets there.

Thanks! I am indebted to you, see my longish comment below.

3. Graham says:

I’d like an overall view of metabolism too! Though I want a different one to you. I’m mainly interested in evolutionary biology – not so much ‘what was the LUCA like?’ as understanding things like how life can acquire new metabolic pathways. I’d like to get a bird’s eye view of how a simple or not-so-simple bacterium fits into a larger space of possible metabolisms. I have a book Arrival of the Fittest by Andreas Wagner (http://arrival-of-the-fittest.com/) which is about this sort of thing. I find the book interesting, but also irritating.

Some things I found interesting:

E. coli can live in an austere environment of just seven small molecules (p61). It can use glucose, or any one of 80-odd other molecules, as its only source of both carbon and energy. But it also thrives in mammalian guts, where everything is provided, against competition from more specialist organisms. Obviously E. coli is far too complicated to start with!

There are about 5000 known chemical reactions used by some or other organism to produce the building blocks of life (p69). (I wonder how many unknown ones there are?) By the building blocks he means the 20 amino acids, 4 nucleotides, etc. He says there are 60 or so in total.

He says the citric acid cycle (https://en.wikipedia.org/wiki/Citric_acid_cycle) stands out as the best candidate for a very early metabolic pathway (p53). It is used by animals, plants, and microbes, including ones around hydrothermal vents. ‘If you sought one metabolic core from which you could build what life needs, the citric acid cycle would be it.’

The citric acid cycle works both ways. ‘Clockwise’, it is part of the process of turning food into ATP. That’s what our cells do with it. ‘Anticlockwise’, it uses energy to convert one citric acid molecule into two. That’s how bacteria around hydrothermal vents use it.

I’d like to see a Petri net representing a cycle (preferably simpler than the ten-reaction citric acid cycle) which you can put into different environments (how to do that is your problem!) and make it cycle different ways.

• John Baez says:

Thanks a lot! I hope to talk about the citric acid cycle soon in my mini-tour of the key chemical reactions that power life—though I’m old enough that I still call it the Krebs cycle, at least in my own mind.

I’ve been thinking about the mathematics of such cycles, so I should be able to meet your request for that Petri net fairly soon. This is the kind of thing I’ve been reading:

• Matteo Polettini and Massimiliano Esposito, Irreversible thermodynamics of open chemical networks I: Emergent cycles and broken conservation laws.

Abstract. In this and a companion paper we outline a general framework for the thermodynamic description of open chemical reaction networks, with special regard to metabolic networks regulating cellular physiology and biochemical functions. We first introduce closed networks “in a box”, whose thermodynamics is subjected to strict physical constraints: the mass-action law, elementarity of processes, and detailed balance. We further digress on the role of solvents and on the seemingly unacknowledged property of network independence of free energy landscapes. We then open the system by assuming that the concentrations of certain substrate species (the chemostats) are fixed, whether because promptly regulated by the environment via contact with reservoirs, or because nearly constant in a time window. As a result, the system is driven out of equilibrium. A rich algebraic and topological structure ensues in the network of internal species: Emergent irreversible cycles are associated to nonvanishing affinities, whose symmetries are dictated by the breakage of conservation laws. These central results are resumed in the relation $a+b=s^Y$ between the number of fundamental affinities $a,$ that of broken conservation laws $b$ and the number of chemostats $s^Y$. We decompose the steady state entropy production rate in terms of fundamental fluxes and affinities in the spirit of Schnakenberg’s theory of network thermodynamics, paving the way for the forthcoming treatment of the linear regime, of efficiency and tight coupling, of free energy transduction and of thermodynamic constraints for network reconstruction.

But it should be possible to cook up examples of what you want without getting into all the details of this!
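
To get a feel for that relation $a + b = s^Y$, here is a toy computation on the simplest open network I could think of. The rank-based formulas below are my own reading of the definitions, not the paper’s actual algorithm, so treat this as a sketch:

```python
import numpy as np

# Toy open network: A <-> B <-> C, with A and C chemostatted.
# Rows are species (A, B, C); columns are the two reactions.
S = np.array([[-1,  0],
              [ 1, -1],
              [ 0,  1]])

internal = [1]          # only B remains a dynamical (internal) species
S_X = S[internal, :]    # stoichiometric matrix restricted to internal species

rank = np.linalg.matrix_rank
n_species, n_reactions = S.shape

# Emergent cycles: flux vectors stationary for the open network but not the closed one.
a = (n_reactions - rank(S_X)) - (n_reactions - rank(S))
# Broken conservation laws: laws of the closed network lost upon chemostatting.
b = (n_species - rank(S)) - (len(internal) - rank(S_X))
s_Y = n_species - len(internal)   # number of chemostats

print(a, b, s_Y)      # 1 1 2
print(a + b == s_Y)   # True
```

Here the one emergent cycle is the steady flow A → B → C through the cell, and the broken law is conservation of total ‘stuff’ A + B + C, which fails once A and C are held fixed by reservoirs.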

4. Graham says:

I can’t answer puzzles 1 and 2, but I can copy this from Wikipedia. It is about glycolysis plus citric acid cycle and oxidative phosphorylation.

The theoretical maximum yield of ATP through oxidation of one molecule of glucose in glycolysis, citric acid cycle, and oxidative phosphorylation is 38 (assuming 3 molar equivalents of ATP per equivalent NADH and 2 ATP per FADH2). In eukaryotes, two equivalents of NADH are generated in glycolysis, which takes place in the cytoplasm. Transport of these two equivalents into the mitochondria consumes two equivalents of ATP, thus reducing the net production of ATP to 36. Furthermore, inefficiencies in oxidative phosphorylation due to leakage of protons across the mitochondrial membrane and slippage of the ATP synthase/proton pump commonly reduces the ATP yield from NADH and FADH2 to less than the theoretical maximum yield.[15] The observed yields are, therefore, closer to ~2.5 ATP per NADH and ~1.5 ATP per FADH2, further reducing the total net production of ATP to approximately 30.[16] An assessment of the total ATP yield with newly revised proton-to-ATP ratios provides an estimate of 29.85 ATP per glucose molecule.

• John Baez says:

Thanks, that’s cool! The free energy payoff is a lot bigger when you oxidize glucose than when you merely split it into pyruvate, as in glycolysis. That’s why it pays to breathe.

The upper bound I quoted for glycolysis, 20.5 ATPs per glucose, seems sort of ridiculous compared to the 2 ATPs that apparently are actually produced. It may be due to the weak assumptions used to derive this bound.
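
For the record, here’s the bookkeeping behind the Wikipedia numbers you quoted, using the standard textbook tallies of NADH and FADH2 per glucose (these counts are the usual ones for glycolysis plus pyruvate oxidation plus the citric acid cycle):

```python
# Per glucose, the standard textbook tallies:
NADH, FADH2 = 10, 2
substrate_level_ATP = 4   # 2 from glycolysis + 2 (as GTP) from the citric acid cycle
transport_cost = 2        # moving cytoplasmic NADH into mitochondria (eukaryotes)

theoretical = substrate_level_ATP + 3 * NADH + 2 * FADH2
print(theoretical)                   # 38
print(theoretical - transport_cost)  # 36

revised = substrate_level_ATP + 2.5 * NADH + 1.5 * FADH2 - transport_cost
print(revised)                       # 30.0
```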

5. Russ Abbott says:

I’m continually floored by your intellectual energy.

• John Baez says:

Thanks! Some things in life are just chores—but for others, the more energy you put in, the more you get out.

6. jim says:

That diagram does indeed look very complicated. What one cannot see on it (without the mathematical models) is all the oscillations due to the nonlinear dynamics. I think the original model of glycolytic oscillation was by Sel’kov in 1968: http://onlinelibrary.wiley.com/doi/10.1111/j.1432-1033.1968.tb00175.x/abstract.
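
For anyone who wants to see those oscillations without chasing down the paper, here is a minimal Euler integration of Sel’kov’s two-variable model. The parameter values are illustrative ones commonly used for the oscillatory regime, not taken from the 1968 paper:

```python
# Sel'kov's glycolysis model, in the rescaled two-variable form
#   x' = -x + a*y + x**2 * y
#   y' =  b - a*y - x**2 * y
# Parameters below are illustrative values inside the oscillatory regime.
a, b = 0.08, 0.6
x, y = 1.0, 1.0
dt, steps = 0.01, 200_000

xs = []
for _ in range(steps):
    dx = -x + a * y + x * x * y
    dy = b - a * y - x * x * y
    x, y = x + dt * dx, y + dt * dy
    xs.append(x)

# On a limit cycle, x keeps swinging between high and low values even late in the run:
tail = xs[steps // 2:]
print(max(tail) - min(tail) > 0.1)  # True: sustained oscillation
```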

7. hydrobates says:

The model considered by Selkov is relatively simple but despite that its dynamics does not seem to be completely understood. Here is some more background:

• Alan Rendall, The Higgins–Selkov oscillator, Hydrobates, 14 May 2014.

• John Baez says:

Thanks! This is especially nice to see because we used to both work on quantum gravity, and last summer we met at a conference called Trends in Reaction Network Theory, and we discovered we were both interested in chemistry. Now I seem to be following in your footsteps. I’ll have to read your article and really absorb it.

8. Wolfgang says:

One main principle in which biochemistry differs from chemistry is the notion of “non-equilibrium” processes. Classical thermodynamics models equilibrium and most chemical reactions are treated as equilibrium reactions. But life requires, for instance, that the creation and destruction of biomolecules DOES NOT follow the same reaction pathways, otherwise these would form a network of equilibrium reactions and only one direction in such a network can yield a gain in energy, but not both. Thus a lot of the complexity of biochemical networks is due to the fact that Nature had to find two very distinct ways for essentially the same reaction in forward and backward direction.

Another major principle is enzyme catalysis. That means the optimization of a suitable surrounding for some reaction by means of protein synthesis (and evolution of course). Biochemical reactions are therefore totally different even if the educts and products are identical to a known chemical process. Another large chunk of complexity is due to this fine-tuning of reaction environments (think of the photosystems to harness the power of light, very complex because of all the dirty details without which nothing would work).

A third major principle is control. Biochemical pathways are necessarily very complex, because they need to be finely controlled, partly by autocatalysis or negative feedback, or simply by spatial compartmentalization (requiring signalling, transporter proteins and so on). Skip the control steps and everything gets simpler, yet will not work any longer in reality. The most ‘horrible’ example of control is DNA replication and protein synthesis, all these cofactors and error correction and the like.

These are reasons why biochemical pathways/networks need to have a certain complexity, and they explain many curious-looking steps (curious from the physical chemistry viewpoint, or from a more abstract mathematical one).

9. Marshall Hampton says:

I’m a mathematician interested in biochemistry, and I found this book (Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology) very helpful in getting an overview:

Although somewhat out of date (it’s from 2001, first edition 1991), I found da Silva and Williams’ “The Biological Chemistry of the Elements” helpful in understanding some redox issues.

• John Baez says:

Thanks! I’m less interested in the latest subtleties than in the ‘big picture’, so both those books sound great. Time for a trip to the library.

10. Bruce Smith says:

I have a book called “The Machinery of Life” (David S. Goodsell, Springer-Verlag, 1993) which has lots of nice illustrations of the sorts of things that exist in a bacterium (from large molecules through the whole cell, and various size scales between). I haven’t read it, just skimmed the pictures, but I guess it’s more about mechanical structure than chemistry. (There is a lot of “machinery”, and more generally, complex structure in the sense of assemblies of things, though many of them “form as needed, do something, then fall apart” — it’s not by any means just a bag of chemicals. So any theory that assumed it was, would not be able to account for its actual reaction rates and dynamics.)

Unless I’m mixing it up with the above, I believe I used to have a different book called “The Molecules of Life” which was a more chemical survey, though still about structure rather than reaction pathways. Maybe it was produced by Scientific American, not sure.

And I had a book which was a good survey of reaction pathways from the point of view of energy supply and transduction (which I actually read, and thought was very good), which might have been called Bioenergetics. But I think I gave it away.

• John Baez says:

Bruce wrote:

There is a lot of “machinery”, and more generally, complex structure in the sense of assemblies of things, though many of them “form as needed, do something, then fall apart” — it’s not by any means just a bag of chemicals. So any theory that assumed it was, would not be able to account for its actual reaction rates and dynamics.

I’d like to know roughly how many different kinds of ‘machinery’ there are for the simplest bacteria. I’m hoping that describing these is quite a bit simpler for prokaryotes (like bacteria) than for eukaryotes (like animals and plants). Prokaryotes have a lot fewer membranes in their cells than eukaryotes, in part because the latter seem to have formed by ‘endosymbiosis’ from several prokaryotes. So, any theory of the eukaryotic cell needs to be quite good at describing the time evolution of membranes.

A prokaryote:

Some of the structures in a eukaryote:

A plant cell:

11. Bruce Smith says:

It’s certainly “simpler”, but not “simple”. That “cytoplasm”, which looks like a “simple fluid” in the picture, is very dense with “stuff”, including lots of proteins, many of them highly organized into larger structures (albeit dynamically). The overall cell shape, and the positions of pieces, are more or less controlled by a dense crosslinked network of protein fibers (eg microtubules), which also change dynamically but not randomly. (Wikipedia says microtubules are found in “eukaryotic cells, as well as some Bacteria” — so, not all bacteria. But there are other kinds of fibers too — I’m not sure a typical bacterium has none of them. In fact, DNA counts, since many proteins move along it, effectively crosslink it, etc.)

That said, if you just want to understand the “basic abstract principles of life”, there’s no reason I know of that a “simple bag of chemicals” (provided “pieces of the bag itself” can also participate in reactions) could not be alive — it just couldn’t compete (in a quantitative sense) with a “bag of more organized stuff”. In fact, I’d be really surprised if you couldn’t make up an abstract model with a small number of kinds of molecules, presumed fully mixed except when part of the bag skin, which “acted alive in a way reminiscent of a bacterium”. (I’d even be surprised if this hasn’t already been done.)

• Bruce Smith says:

As for your more specific question, “how many kinds of machinery are there in the simplest bacterium”, I’d like to know too! But I have no good idea. Maybe someone like Craig Venter has an idea.

But the answer depends on the level of description, and what you include — all known forms of life use ribosomes (which are fairly complex at the most detailed known level, but can be modeled very simply if you don’t care about that detail and take “1-dimensional tape-like information molecules” for granted), and either all or almost all use DNA replication (similar comments apply).

But if you want to “take for granted” everything that processes DNA and uses its structure, and the “outer bag structure” and its growth from subunits, then maybe everything else can be modeled as “fully mixed simple molecules” — if you also don’t care about quantitative predictions, whether dynamics of reaction networks are stable, etc.

But I’m not sure there is any “bright dividing line” as a boundary of life vs nonlife. If besides all the above, you abstract away even more, you can get to something as simple as “a flame”, which in suitable conditions does consume energy and thereby maintain its existence, and can reproduce, and yet in certain environments can maintain a stable form (though its doing so doesn’t much matter for whether it survives and reproduces, which might be one reason no one normally calls it “alive”). In a very different direction, you might abstract away other things instead (but retain the information content of DNA and how that gets processed, then idealize that), and eventually be left with something like a “formal system” (in which theorems are considered alive).

In other words, you’d better include in your model whatever it is which, if you left it out, would make you consider the resulting system “too simple or unreal to be called alive”. But exactly what that means you should include seems to be partly subjective, and certainly to have more than one reasonable choice.

• John Baez says:

Bruce wrote:

As for your more specific question, “how many kinds of machinery are there in the simplest bacterium”, I’d like to know too! But I have no good idea. Maybe someone like Craig Venter has an idea.

When I get the energy I’ll look into this more carefully:

The simulation of the complete life cycle of the pathogen, Mycoplasma genitalium, was presented on Friday in the journal Cell. The scientists called it a “first draft” but added that the effort was the first time an entire organism had been modeled in such detail — in this case, all of its 525 genes.

The simulation, which runs on a cluster of 128 computers, models the complete life span of the cell at the molecular level, charting the interactions of 28 categories of molecules — including DNA, RNA, proteins and small molecules known as metabolites, which are generated by cell processes.

[…]

For the new computer simulation, the researchers had the advantage of extensive scientific literature on the bacterium. They were able to use data taken from more than 900 scientific papers to validate the accuracy of their software model.

Still, they said, the model of the simplest biological system was pushing the limits of their computers.

“Right now, running a simulation for a single cell to divide only one time takes around 10 hours and generates half a gigabyte of data,” Dr. Covert wrote. “I find this fact completely fascinating, because I don’t know that anyone has ever asked how much data a living thing truly holds. We often think of the DNA as the storage medium, but clearly there is more to it than that.”

• Bruce Smith says:

I am intrigued by a quote at the end of that article you linked to as “this” (John Markoff, 2012):

“… what happens when we bring this to a bigger organism, like E. coli, yeast or even eventually a human cell?” Dr. Covert said….
“I’ll have the answer in a couple of years.”

• John Baez says:

Well, it’s been a couple of years, so I should see what he’s done! But projects always take longer than expected.

• John Baez says:

Bruce wrote:

But I’m not sure there is any “bright dividing line” as a boundary of life vs nonlife.

Neither am I. Luckily I’m not really looking for such a thing. My goal is really to keep getting better at nonequilibrium thermodynamics, network theory and the like, using bits of biology as a way to keep challenging my understanding. Living systems do lots of interesting things that I’d like to understand, but I want to tackle them slowly, one step at a time. That’s the main reason I’m curious about simple life forms.

Right now I’m at the point where I can get quite good at understanding open chemical reaction networks—so, for example, a primitive bag-like organism that absorbs glucose and excretes some waste product through its cell membrane, using the free energy from glycolysis to power other reactions. When I say “get quite good at understanding” I mean rather specific things that I don’t want to reveal in detail until they’re ready—but basically it means generalizing the linear theory here:

• John Baez, Brendan Fong and Blake Pollard, A compositional framework for Markov processes. (Blog article here.)

• Blake Pollard, Open Markov processes: A compositional perspective on non-equilibrium steady states in biology. (Blog article here.)

to more interesting nonlinear cases. This will take a while. Meanwhile, I’m trying to figure out exactly what I should do next.
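To give a feel for the linear theory being generalized, here is a toy sketch entirely of my own devising (made-up states and rate constants, not anything from the papers above): a three-state open Markov process with constant inflow at one boundary state and a first-order outflow at another, relaxed to its nonequilibrium steady state by Euler integration of the master equation.

```python
# Toy open Markov process: states A, B, C with made-up transition rates,
# constant inflow at A, first-order leak at C.  At the nonequilibrium
# steady state the throughput matches the inflow: leak_rate * p[C] = 1.0.
rates = {("A", "B"): 2.0, ("B", "A"): 1.0, ("B", "C"): 3.0, ("C", "B"): 0.5}
states = ["A", "B", "C"]
inflow = {"A": 1.0}          # constant injection of population at A
leak = {"C": 2.0}            # first-order leak rate at C

p = {s: 0.0 for s in states}
dt = 0.001
for _ in range(200_000):     # integrate the master equation to t = 200
    dp = {s: inflow.get(s, 0.0) - leak.get(s, 0.0) * p[s] for s in states}
    for (i, j), k in rates.items():
        flow = k * p[i]      # probability current along the edge i -> j
        dp[i] -= flow
        dp[j] += flow
    for s in states:
        p[s] += dt * dp[s]
```

At steady state the leak at C must balance the inflow at A, so p["C"] = 1.0/2.0 = 0.5, and the interior populations are fixed by balancing the currents along each edge. The nonlinear cases of interest replace these linear currents k·p[i] with mass-action terms.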

12. Bruce Smith says:

Martin Buliga has some kind of artificial life system called chemlambda, which I haven’t studied at all, but you might like to see it. He often posts demos of it on Google+. One of his home pages: http://www.imar.ro/~mbuliga/

“I instantly wanted to get an overall view of these reactions, without immersing myself in all the details.

Unfortunately I don’t know how to do this. Do you?”

Maybe the comment I posted on the “Biology and the Pi-Calculus” thread is even more apt here. I am reviewing the latest Astrobiology meeting, and a popular description of life there was as an opportunistic geological process. Other descriptions are selfish genes (that control metabolism) and energy and matter utilization (controlled metabolism). In both cases the described system is nothing but historical details heaped on one another, with no guiding principle for the participating populations other than that “it works”.

For example, to utilize matter, cells have to order it by pushing a disequilibrium closer to equilibrium. But to utilize energy, cells may have to scavenge for electrons, which ultimately may lead to increased disequilibrium (the oxygen atmosphere). It is an interplay between reduction and oxidation, creating and erasing disequilibria, producing and consuming biochemicals, et cetera. The research edge is – still – here, because you can hear plenary sessions asking for principles of life (and sometimes sloppily, anthropomorphically, referring to it as ‘purpose’: ‘what is the problem life solves?’). [ http://www.hou.usra.edu/meetings/abscicon2015/pdf/program.pdf ; “PLENARY SESSION: PLANETS IN PERSPECTIVE: WHERE’S THE ENERGY?” http://www.hou.usra.edu/meetings/abscicon2015/pdf/sess501.pdf ]

TL;DR: It is complicated, the Maxwellian demon may or may not be in the details, and life has no free lunch.

Linking Philippa Wiggins’s paper was timely: I had forgotten about it, and see below why the reminder was welcome!

I don’t know how controversial it is. Certainly people such as Russell, of the vent theory of life’s emergence, describe energy transduction in electron-bifurcating catalysts where Wiggins’s dissipation concerns do not seem to apply, elevating them (erroneously, I think) to redox engines. (And see the plenary session I linked to for one of the proponents.)

However, just today there is yet another paper out that tries to predict water’s characteristics from chemistry, and it seems to support Wiggins’s general model:

“In a review paper published in Nature Communications, physicists Anders Nilsson and Lars G.M. Pettersson from Stockholm University have pulled together the results from dozens of papers published over the past several years that have investigated water’s molecular structure, often with the help of cutting-edge experimental tools and simulations.

In their interpretation of the data, the researchers have proposed a picture of water in which its unique properties arise from its heterogeneous structure.

The researchers propose that, in the pressure (P) and temperature (T) region where water exhibits its anomalous behaviors (the funnel-like region in the P-T phase diagram above), water coexists in two different types of structures: a highly ordered, low-density structure with strong hydrogen bonds, and a somewhat smashed, high-density structure with distorted hydrogen bonds. The origins of water’s anomalous properties arise because these two types of structures are constantly fluctuating between one another in this heterogeneous phase, resulting in many small spatially separated regions of different structures.”

One unanswered question, for instance, is why does water’s anomalous region occur at the same temperatures and pressures that sustain life? It seems likely that water’s anomalous region served to place constraints on the conditions required for life to exist. A better understanding of this overlap could have implications for understanding life on a fundamental level.

[ and the article then goes into necessary tests; http://phys.org/news/2016-01-uncover-unusual-properties.html#jCp ; my bold.

Disclaimer: Swede here, but SU is not my alma mater. My interest is rather here:]

If Wiggins solves the remaining two non-spontaneous steps in an enzyme free pathway between Keller et al’s Hadean pentoses & amino acids and the purine products, vent theory has its RNA world basis.

Oh yeah, I can as well provide the purported answer to ‘what problem life solves’. It is the geological continuation of high-temperature processes as a terrestrial planet cools, processes that try to oxidize the mantle against the planet’s hydrogen surplus.

If so, it is very loosely helpful for understanding why enzymes kick in and why hydrogen-based metabolism is the root metabolism, not so much with the many, many, …, many other historical details.

14. domenico says:

I think that life is an engine near a critical point (near a phase transition, in the case of a thermodynamic system).
If the definition of a physical engine can be generalized to any object that generates a flow of information, or that transfers entropy to its environment, then life could be generalized to systems of every kind: chemical, electrical, mechanical, a computer program, a database, because the use of entropy is common to every scientific field (so that life could exist in each of them).
Usually the temperature of a living thing is greater than that of its environment, because of its inner engine. I am thinking that life could use the sleep-wake cycle to reach the phase transition point, so there could exist life forms with a temperature lower than the environment’s, with a physical engine that works like a refrigerator across the sleep-wake transition.

15. Tim Silverman says:

I want to say something fairly concrete about glycolysis, and then say something about the combinatorial and control/computational generalities involved.

1) A way to “read” glycolysis—an overview to make it a bit less “complicated”.

I’d divide this reaction sequence into three main sections with different subgoals.

a) First, we want to convert a hexose (6-carbon sugar, i.e. glucose) into two identical triose (3-carbon) phosphates.

The first 5 steps are therefore: stick a phosphate on one end of the molecule, do some internal rearrangement, stick another phosphate on the other end, split it into two different 3-carbon phosphates, convert one of the 3-carbon phosphates to be the same as the other.

b) We then have a 2-step section which goes: oxidise one of the carbons while sticking a phosphate on it (to hang onto the oxidation energy), then extract that energy into ATP by taking the phosphate off again.

c) We then have a few steps to ‘activate’ the remaining phosphate (the one added in section a). We have two neighbouring oxidised carbons, one with a phosphate on it, the other with just a hydroxyl, and the reactions try to, as it were, push the oxidatedness of the hydroxyl carbon onto the phosphate carbon, to make the phosphate-carbon arrangement more energetic. Then the high energy bond can be used to make ATP in the final step.
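Tim’s three sections also make the standard energy bookkeeping easy to check. A minimal sketch (the stoichiometry is just the textbook picture of glycolysis; the section labels follow his a/b/c, and everything after the split is counted twice, once per triose):

```python
# Net ATP/NADH bookkeeping per glucose, following the three sections above.
# Section (a) spends ATP on the two phosphorylations; sections (b) and (c)
# each run twice, once for each 3-carbon phosphate.
sections = [
    ("a: hexose -> 2 triose phosphates", {"ATP": -2}),
    ("b: oxidise and extract into ATP",  {"ATP": +2, "NADH": +2}),  # 2 trioses
    ("c: activate remaining phosphate",  {"ATP": +2}),              # 2 trioses
]
totals = {"ATP": 0, "NADH": 0}
for _name, delta in sections:
    for metabolite, n in delta.items():
        totals[metabolite] += n
# totals is now the familiar net yield: 2 ATP and 2 NADH per glucose
```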

2) Organisation of the substrate structures

a) One general theme here is manipulating the reactiveness/inertness of particular sites of the molecule. C-H and C-C single bonds are pretty inert; C=C double bonds are more attackable due to the soft, floppy pi-bonds sticking out to the sides; C-O bonds are attackable because the electronegative O pulls electrons off the C, making it vulnerable to nucleophilic attack.

There’s a correlation here (I’m not sure how coincidental it is) with oxidation states—adding oxygens (possibly while eliminating water to give double bonds) both oxidises the carbons and increases their reactivity. A lot of what is going on here involves what might very loosely be called internal redox reactions, along with dehydrations and other rearrangements, with the goal of making particular sites on the molecule more or less reactive.

From this point of view, different locations in the carbon skeleton of a biomolecule, and different types of reactive group within it (hydroxyl, carbonyl, carboxylate, phosphate, etc) have their own character or identity. Tracking individual atoms is maybe not the best way to think about this, although it’s obviously necessary from a bookkeeping point of view.

b) Phosphates obviously play a central role in moving stored energy between molecules, and a key goal is assembling high-energy phosphate-containing groups at crucial stages of the reaction.

3) This reaction sequence starts with an irreversible reaction that adds a phosphate, and ends with an irreversible reaction which removes one. This sort of thing is typical.

a) Adding a phosphate can tag a molecule for entry into a particular pathway.

b) Adding a phosphate may sometimes prime a molecule with energy needed in later steps of the pathway.

c) Irreversible steps can help partition the total reaction network into manageable subsections.

d) Relatedly, irreversible steps are a good target for enzyme regulation, whether by allosteric interactions in which products or substrates directly attach to the enzyme and alter its activity, or by an (enzyme-mediated) tagging of the enzyme (for instance by adding or removing a phosphate group) to alter its activity in response to a signal from elsewhere.

This way of thinking is all a lot more combinatorial than thermodynamic, but maybe is at least more mathematical than looking at particular molecular structures.
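Point (c) above can be made concrete with a tiny sketch: list the glycolytic steps as edges, flag the three classically irreversible ones (hexokinase, phosphofructokinase, pyruvate kinase), and let each irreversible step start a new “pool” of metabolites connected only by reversible reactions. (The chain below collapses the aldolase/isomerase split into one step for simplicity.)

```python
# Each tuple is (substrate, product, reversible?).  The three irreversible
# steps cut the chain into pools of near-equilibrated metabolites.
steps = [
    ("Glc",    "G6P",    False),  # hexokinase (irreversible)
    ("G6P",    "F6P",    True),
    ("F6P",    "F1,6BP", False),  # phosphofructokinase (irreversible)
    ("F1,6BP", "GAP",    True),   # aldolase + isomerase, collapsed
    ("GAP",    "1,3BPG", True),
    ("1,3BPG", "3PG",    True),
    ("3PG",    "2PG",    True),
    ("2PG",    "PEP",    True),
    ("PEP",    "Pyr",    False),  # pyruvate kinase (irreversible)
]

pools = [[steps[0][0]]]            # start with a pool holding glucose
for substrate, product, reversible in steps:
    if reversible:
        pools[-1].append(product)  # same pool: joined by a reversible step
    else:
        pools.append([product])    # irreversible step: start a fresh pool
# three irreversible steps -> four pools
```

Each pool is a candidate “manageable subsection”: within it, concentrations can come close to equilibrium, while the irreversible edges set the overall direction of flow.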

• John Baez says:

This is really, really nice.

It reminds me of how Balaban applies graph theory to chemistry at both levels: the level of individual molecules (which can be drawn rather abstractly as graphs) and at the level of reaction networks (which are also graphs). I would love it if there were something profound about ‘graphs of transformations of graphs’ going on here. I don’t see what that would be. But you have a delightfully clear way of thinking about why glycolysis works the way it does.

Over on the Azimuth Forum I just told a chemist:

My real aim, ultimately, is to treat things at such a high level of abstraction that I don’t focus on any details of particular molecules or chemical reactions. You can imagine, if you like, that I want to understand biochemistry not only on all possible planets, but in all possible universes.

This may sound strange, but there are some reasons for doing it. One is that I can never match biochemists for knowledge of specifics; I can only do something new by doing something different, and I hope that some aspects of how life works involve quite abstract and general patterns that just happen to be implemented in certain specific ways in the life we see here on our planet.

Of course I’m also fascinated by the gory specifics of what we happen to actually see.

My friend Tim Silverman, who usually tells me wonderful things about group theory (he’s a programmer who has become an expert on finite simple groups), has written a really nice comment on the Azimuth blog, in which he describes glycolysis in a very nice conceptual way that still grapples with the chemical details. I like this middle ground. But I would need to know a lot more chemistry to understand things at this level!

16. If you put yeast cells in water containing a constant low concentration of glucose, they convert it into alcohol at a constant rate. But if you increase the concentration of glucose something funny happens. The alcohol output starts to oscillate!

It’s not that the yeast is doing something clever and complicated. If you break down the yeast cells, killing them, this effect still happens. People think these oscillations are inherent to the chemical reactions in glycolysis.

I learned this after writing Part 1, thanks to Alan Rendall.
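These glycolytic oscillations have a classic minimal model: Sel’kov’s 1968 two-variable system, with x playing roughly the role of ADP and y of fructose-6-phosphate (both dimensionless). Here is a hedged sketch using parameter values often quoted for the oscillatory regime (a = 0.08, b = 0.6); for these the unique steady state is unstable and the trajectory settles onto a limit cycle:

```python
# Sel'kov model of glycolytic oscillations:
#   dx/dt = -x + a*y + x^2*y
#   dy/dt =  b - a*y - x^2*y
# Forward-Euler integration; crude, but adequate for these gentle dynamics.
a, b = 0.08, 0.6
x, y = 1.0, 1.0
dt = 0.01
xs = []
for step in range(60_000):          # integrate to t = 600
    dx = -x + a * y + x * x * y
    dy = b - a * y - x * x * y
    x += dt * dx
    y += dt * dy
    if step >= 30_000:              # discard the transient
        xs.append(x)

amplitude = max(xs) - min(xs)       # nonzero: sustained oscillation
```

The steady state sits at x* = b, y* = b/(a + b²); for smaller b the same code converges to it instead of oscillating, which mirrors the glucose-concentration threshold described above: a low, constant feed gives a steady output, while a higher feed pushes the system past a Hopf bifurcation into oscillation.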