I’m at this workshop:
• Programming with Chemical Reaction Networks: Mathematical Foundations, Banff International Research Station, 8-13 June 2014.
Luca Cardelli wrote about computation with chemical reactions in Part 26 of the network theory series here on this blog. So, it’s nice to meet him and many other researchers, learn more, and try to solve some problems together!
The first tutorial was this:
• David Soloveichik, U.C. San Francisco, The computational power of chemical reaction networks.
David works at the Center for Systems and Synthetic Biology, and their website says:
David did his graduate work with Erik Winfree at Caltech, focusing on algorithmic self-assembly and on synthetic networks of nucleic-acid interactions based on strand displacement cascades. He is interested in “molecular programming”: the systematic design of complex molecular systems based on the principles of computer science and distributed computing. More generally, he is trying to create a theoretical foundation of chemical computation applicable to both synthetic and natural systems.
According to his webpage, Soloveichik’s research interests are:
Wet-lab: the rational design of molecular interactions for synthetic biology, nanotechnology, and bioengineering. The goal is to engineer autonomous molecular systems that can sense, compute, and perform various actions. Using nucleic-acid “strand displacement cascades” as the molecular primitive, we are able to attain freedom of design that hasn’t been previously possible.
Theory: The theoretical foundation of chemical computation. Once we have a way to program molecular interactions, what programming language shall we use? How molecules can process information and carry out computation is still not well-understood; however, a formal connection to models of concurrent computation may allow systematic and scalable design, rigorous analysis and verification. Further, computational principles may elucidate the design of biological regulatory networks.
Here are my notes on his tutorial.
We’ve got people here from different backgrounds:
• computational complexity theory
• wetlab / experimental science
• pure and applied mathematics
• software verification
CRNs (chemical reaction networks) show up in:
• population biology
• sensor networks
○ vector addition systems
○ Petri nets
○ commutative semigroups
○ bounded context-free languages
○ uniform recurrence equations
Why use them for computation? People want to go beyond the von Neumann architecture for computation. People also want to understand how cells process information. However, with a few exceptions, the computational perspective in this talk has not yet proved relevant in biology. So, there is a lot left to learn.
The model of computation here will be the master equation for a chemical reaction network… since this has been explained starting Part 4 of the network theory series, I won’t review it!
Can all chemical reaction networks, even those without any conservation laws, be realized by actual chemical systems?
Though this is a subtle question, one answer is “yes, using strand displacement cascades”. This is a trick for getting DNA to simulate other chemical reactions. It’s been carried out in the lab! See this paper and many subsequent ones:
• Soloveichik, Seelig and Winfree, DNA as a universal substrate for chemical kinetics.
Abstract: Molecular programming aims to systematically engineer molecular and chemical systems of autonomous function and ever-increasing complexity. A key goal is to develop embedded control circuitry within a chemical system to direct molecular events. Here we show that systems of DNA molecules can be constructed that closely approximate the dynamic behavior of arbitrary systems of coupled chemical reactions. By using strand displacement reactions as a primitive, we construct reaction cascades with effectively unimolecular and bimolecular kinetics. Our construction allows individual reactions to be coupled in arbitrary ways such that reactants can participate in multiple reactions simultaneously, reproducing the desired dynamical properties. Thus arbitrary systems of chemical equations can be compiled into real chemical systems. We illustrate our method on the Lotka–Volterra oscillator, a limit-cycle oscillator, a chaotic system, and systems implementing feedback digital logic and algorithmic behavior.
However, even working with the master equation for a CRN, there are various things we might mean by having it compute something:
• uniform vs non-uniform: is a single CRN supposed to handle all inputs, or do we allow adding extra reactions for larger inputs? It’s a bit like Turing machines vs Boolean circuits.
• deterministic vs probabilistic: is the correct output guaranteed or merely likely?
• halting vs stabilizing: does the CRN ‘know’ when it has finished, or not? In the ‘halting’ case the CRN irreversibly produces some molecules that signal that the computation is done. In the ‘stabilizing’ case, it eventually stabilizes to the right answer, but we may not know how long to wait.
These distinctions dramatically affect the computational power. In the case of uniform computation:
• deterministic and halting: this has finite computational power.
• deterministic and stabilizing: this can decide semilinear predicates.
• probabilistic and halting: this is Turing-universal.
• probabilistic and stabilizing: this can decide predicates, which are more general than computable ones. (Indeed, if we use Turing machines but don’t require them to signal when they’ve halted, the resulting infinitely long computations can ‘compute’ stuff that’s not computable in the usual sense.)
Deterministic stabilizing computations
Let’s look at the deterministic stabilizing computations in a bit more detail. We’ll look at decision problems. We have a subset and we want to answer this question: is the vector in the set
To do this, we represent the vector as a bunch of molecules: of the first kind, of the second kind, and so on. We call this an input. We may also include a fixed collection of additional molecules in our input, to help the reactions run.
Then we choose a chemical reaction network, and we let it run on our input. The answer to our question will be encoded in some molecules called Y and N. If is in we want our chemical reaction to produce Y molecules. If it’s not, we want our reaction to produce N’s.
To make this more precise, we need to define what counts as an output. If we’ve got a bunch of molecules that
• contains Y but not N: then the output is YES.
• contains N but not Y: then the output is NO.
Otherwise the output is undefined.
Output-stable states are states with YES or NO output such that all states reachable from them via our chemical reactions give the same output. We say an output-stable-state is correct if this output is the correct answer to the question: is in .
Our chemical reaction network gives a deterministic stabilizing computation if for any input, and choosing any state reachable from that input, we can do further chemical reactions to reach a correct output-stable state.
In other words: starting from our input, and letting the chemical reactions <run any way they want, we will eventually stabilize at an output that gives the right answer to the question “is in ?”
This sounds a bit complicated, but it’s really not. Let’s look at some examples!
Example 1. Suppose you want to check two numbers and see if one is greater than or equal to another. Here
How can you decide if a pair of numbers is in this set?
You start with molecules of type molecules of type and one molecule of type . Then you use a chemical reaction network with these reactions:
If you let these reactions run, the switches to a each time the reactions destroy an . But the switches back to a each time the reactions destroy a
When no more reactions are possible, we are left with either one or one , which is the correct answer to your question!
Example 2. Suppose you want to check two numbers and see if one is equal to another. Here
How can you decide if a pair of numbers is in here?
This is a bit harder! As before, you start with molecules of type molecules of type and one molecule of type Then you use a chemical reaction network with these reactions:
The first reaction lets an and a cancel out, producing a . If you only run this reaction, you’ll eventually be left with either a bunch of or a bunch of or nothing but .
If you have your numbers were equal. The other reactions deal with the cases where you have or left over. But the key thing to check is that no matter what order we run the reactions, we’ll eventually get the right answer! In the end, you’ll have either or not both, and this will provide the yes-or-no answer to the question of whether
What deterministic stabilizing computations can do
We’ve looked at some examples of deterministic stabilizing computations. The big question is: what kind of questions can they answer?
More precisely, for what subsets can we build a deterministic stabilizing computation that ends with output YES if the input lies in and with output NO otherwise?
The answer is: the ‘semilinear’ subsets!
• Dana Angluin, James Aspnes and David Eistenstat, Stably computable predicates are semilinear.
A set is linear if it’s of the form
for some fixed vectors of natural numbers
A set semilinear if it’s a finite union of linear sets.
How did Angluin, Aspnes and Eisenstat prove their theorem? Apparently the easy part is showing that membership in any semilinear set can be decided by a chemical reaction network. David sketched the proof of the converse. I won’t go into it, but it used a very nice fact:
Dickson’s Lemma. Any subset of has a finite set of minimal elements, where we define if for all .
For example, the region above and to the right of the hyperbola here has five minimal elements:
If you know some algebra, Dickson’s lemma should remind you of the Hilbert basis theorem, saying (for example) that every ideal in a ring of multivariable polynomials over a field is finitely generated. And in fact, Paul Gordan used Dickson’s Lemma in 1899 to help give a proof of Hilbert’s basis theorem.
It’s very neat to see how this lemma applies to chemical reaction networks! You can see how it works in Angluin, Aspnes and Eistenstat’s paper. But they call it “Higman’s lemma” for some reason.
Here are some of David Soloveichik’s recent talks:
• An introduction to strand displacement cascades for the Foresight Institute Conference (Palo Alto, Jan 2013): An artificial “biochemistry” with DNA.
• Paper presented at DNA Computing and Molecular Programming 18 (Aarhus, Denmark, Aug 2012): Deterministic function computation with chemical reaction networks.
• Tutorial talk for DNA Computing and Molecular Programming 17 (Pasadena, Aug 2011): The programming language of chemical kinetics, and how to discipline your DNA molecules using strand displacement cascades.
• High-level introduction to algorithmic self-assembly and stochastic chemical reaction networks as computer-theoretic models: Computer-theoretic abstractions for molecular programming.
• On algorithmic behavior in chemical reaction networks and implementing arbitrary chemical reaction networks with DNA: programming well-mixed chemical kinetics.