The Computational Power of Chemical Reaction Networks

10 June, 2014

I’m at this workshop:

Programming with Chemical Reaction Networks: Mathematical Foundations, Banff International Research Station, 8-13 June 2014.

Luca Cardelli wrote about computation with chemical reactions in Part 26 of the network theory series here on this blog. So, it’s nice to meet him and many other researchers, learn more, and try to solve some problems together!

The first tutorial was this:

• David Soloveichik, U.C. San Francisco, The computational power of chemical reaction networks.

David works at the Center for Systems and Synthetic Biology, and their website says:

David did his graduate work with Erik Winfree at Caltech, focusing on algorithmic self-assembly and on synthetic networks of nucleic-acid interactions based on strand displacement cascades. He is interested in “molecular programming”: the systematic design of complex molecular systems based on the principles of computer science and distributed computing. More generally, he is trying to create a theoretical foundation of chemical computation applicable to both synthetic and natural systems.

According to his webpage, Soloveichik’s research interests are:

Wet-lab: the rational design of molecular interactions for synthetic biology, nanotechnology, and bioengineering. The goal is to engineer autonomous molecular systems that can sense, compute, and perform various actions. Using nucleic-acid “strand displacement cascades” as the molecular primitive, we are able to attain freedom of design that hasn’t been previously possible.

Theory: The theoretical foundation of chemical computation. Once we have a way to program molecular interactions, what programming language shall we use? How molecules can process information and carry out computation is still not well-understood; however, a formal connection to models of concurrent computation may allow systematic and scalable design, rigorous analysis and verification. Further, computational principles may elucidate the design of biological regulatory networks.

Here are my notes on his tutorial.

Motivation

We’ve got people here from different backgrounds:

• computational complexity theory
• wetlab / experimental science
• pure and applied mathematics
• software verification

CRNs (chemical reaction networks) show up in:

• chemistry
• population biology
• sensor networks
• math:
    ○ vector addition systems
    ○ Petri nets
    ○ commutative semigroups
    ○ bounded context-free languages
    ○ uniform recurrence equations

Why use them for computation? People want to go beyond the von Neumann architecture for computation. People also want to understand how cells process information. However, with a few exceptions, the computational perspective in this talk has not yet proved relevant in biology. So, there is a lot left to learn.

The model

The model of computation here will be the master equation for a chemical reaction network… since this has been explained starting Part 4 of the network theory series, I won’t review it!

Can all chemical reaction networks, even those without any conservation laws, be realized by actual chemical systems?

Though this is a subtle question, one answer is “yes, using strand displacement cascades”. This is a trick for getting DNA to simulate other chemical reactions. It’s been carried out in the lab! See this paper and many subsequent ones:

• Soloveichik, Seelig and Winfree, DNA as a universal substrate for chemical kinetics.

Abstract: Molecular programming aims to systematically engineer molecular and chemical systems of autonomous function and ever-increasing complexity. A key goal is to develop embedded control circuitry within a chemical system to direct molecular events. Here we show that systems of DNA molecules can be constructed that closely approximate the dynamic behavior of arbitrary systems of coupled chemical reactions. By using strand displacement reactions as a primitive, we construct reaction cascades with effectively unimolecular and bimolecular kinetics. Our construction allows individual reactions to be coupled in arbitrary ways such that reactants can participate in multiple reactions simultaneously, reproducing the desired dynamical properties. Thus arbitrary systems of chemical equations can be compiled into real chemical systems. We illustrate our method on the Lotka–Volterra oscillator, a limit-cycle oscillator, a chaotic system, and systems implementing feedback digital logic and algorithmic behavior.

However, even working with the master equation for a CRN, there are various things we might mean by having it compute something:

• uniform vs non-uniform: is a single CRN supposed to handle all inputs, or do we allow adding extra reactions for larger inputs? It’s a bit like Turing machines vs Boolean circuits.

• deterministic vs probabilistic: is the correct output guaranteed or merely likely?

• halting vs stabilizing: does the CRN ‘know’ when it has finished, or not? In the ‘halting’ case the CRN irreversibly produces some molecules that signal that the computation is done. In the ‘stabilizing’ case, it eventually stabilizes to the right answer, but we may not know how long to wait.

These distinctions dramatically affect the computational power. In the case of uniform computation:

• deterministic and halting: this has finite computational power.

• deterministic and stabilizing: this can decide semilinear predicates.

• probabilistic and halting: this is Turing-universal.

• probabilistic and stabilizing: this can decide \Delta_2^0 predicates, which are more general than computable ones. (Indeed, if we use Turing machines but don’t require them to signal when they’ve halted, the resulting infinitely long computations can ‘compute’ stuff that’s not computable in the usual sense.)

Deterministic stabilizing computations

Let’s look at the deterministic stabilizing computations in a bit more detail. We’ll look at decision problems. We have a subset S \subseteq \mathbb{N}^d, and we want to answer this question: is the vector X \in \mathbb{N}^d in the set S?

To do this, we represent the vector as a bunch of molecules: X_1 of the first kind, X_2 of the second kind, and so on. We call this an input. We may also include a fixed collection of additional molecules in our input, to help the reactions run.

Then we choose a chemical reaction network, and we let it run on our input. The answer to our question will be encoded in some molecules called Y and N. If X is in S, we want our chemical reaction to produce Y molecules. If it’s not, we want our reaction to produce N’s.

To make this more precise, we need to define what counts as an output. If we’ve got a bunch of molecules that

• contains Y but not N: then the output is YES.

• contains N but not Y: then the output is NO.

Otherwise the output is undefined.

Output-stable states are states with YES or NO output such that all states reachable from them via our chemical reactions give the same output. We say an output-stable-state is correct if this output is the correct answer to the question: is X in S.

Our chemical reaction network gives a deterministic stabilizing computation if for any input, and choosing any state reachable from that input, we can do further chemical reactions to reach a correct output-stable state.

In other words: starting from our input, and letting the chemical reactions <run any way they want, we will eventually stabilize at an output that gives the right answer to the question “is X in S?”

Examples

This sounds a bit complicated, but it’s really not. Let’s look at some examples!

Example 1. Suppose you want to check two numbers and see if one is greater than or equal to another. Here

S = \{(X_1,X_2) : X_2 \ge X_1 \}

How can you decide if a pair of numbers (X_1,X_2) is in this set?

You start with X_1 molecules of type A, X_2 molecules of type B, and one molecule of type Y. Then you use a chemical reaction network with these reactions:

A + N \to Y
B + Y \to N

If you let these reactions run, the Y switches to a N each time the reactions destroy an A. But the N switches back to a Y each time the reactions destroy a B.

When no more reactions are possible, we are left with either one Y or one N, which is the correct answer to your question!

Example 2. Suppose you want to check two numbers and see if one is equal to another. Here

S = \{(X_1,X_2) : X_2 = X_1 \}

How can you decide if a pair of numbers (X_1,X_2) is in here?

This is a bit harder! As before, you start with X_1 molecules of type A, X_2 molecules of type B, and one molecule of type Y. Then you use a chemical reaction network with these reactions:

A + B \to Y
Y + N \to Y
A + Y \to A + N
B + Y \to B + N

The first reaction lets an A and a B cancel out, producing a Y. If you only run this reaction, you’ll eventually be left with either a bunch of A\mathrm{s} or a bunch of B\mathrm{s} or nothing but Y\mathrm{s}.

If you have Y\mathrm{s}, your numbers were equal. The other reactions deal with the cases where you have A\mathrm{s} or B\mathrm{s} left over. But the key thing to check is that no matter what order we run the reactions, we’ll eventually get the right answer! In the end, you’ll have either Y\mathrm{s} or N\mathrm{s}, not both, and this will provide the yes-or-no answer to the question of whether X_1 = X_2.

What deterministic stabilizing computations can do

We’ve looked at some examples of deterministic stabilizing computations. The big question is: what kind of questions can they answer?

More precisely, for what subsets A \subseteq \mathbb{N}^d can we build a deterministic stabilizing computation that ends with output YES if the input X lies in A and with output NO otherwise?

The answer is: the ‘semilinear’ subsets!

• Dana Angluin, James Aspnes and David Eistenstat, Stably computable predicates are semilinear.

A set S \subseteq \mathbb{N}^d is linear if it’s of the form

\{u_0 + n_1 u_1 + \cdots + n_p u_p : n_i \in \mathbb{N}  \}

for some fixed vectors of natural numbers u_i \in \mathbb{N}^d.

A set S \subseteq \mathbb{N}^d semilinear if it’s a finite union of linear sets.

How did Angluin, Aspnes and Eisenstat prove their theorem? Apparently the easy part is showing that membership in any semilinear set can be decided by a chemical reaction network. David sketched the proof of the converse. I won’t go into it, but it used a very nice fact:

Dickson’s Lemma. Any subset of \mathbb{N}^d has a finite set of minimal elements, where we define x \le y if x_i \le y_i for all i.

For example, the region above and to the right of the hyperbola here has five minimal elements:

If you know some algebra, Dickson’s lemma should remind you of the Hilbert basis theorem, saying (for example) that every ideal in a ring of multivariable polynomials over a field is finitely generated. And in fact, Paul Gordan used Dickson’s Lemma in 1899 to help give a proof of Hilbert’s basis theorem.

It’s very neat to see how this lemma applies to chemical reaction networks! You can see how it works in Angluin, Aspnes and Eistenstat’s paper. But they call it “Higman’s lemma” for some reason.

References

Here are some of David Soloveichik’s recent talks:

• An introduction to strand displacement cascades for the Foresight Institute Conference (Palo Alto, Jan 2013): An artificial “biochemistry” with DNA.

• Paper presented at DNA Computing and Molecular Programming 18 (Aarhus, Denmark, Aug 2012): Deterministic function computation with chemical reaction networks.

• Tutorial talk for DNA Computing and Molecular Programming 17 (Pasadena, Aug 2011): The programming language of chemical kinetics, and how to discipline your DNA molecules using strand displacement cascades.

• High-level introduction to algorithmic self-assembly and stochastic chemical reaction networks as computer-theoretic models: Computer-theoretic abstractions for molecular programming.

• On algorithmic behavior in chemical reaction networks and implementing arbitrary chemical reaction networks with DNA: programming well-mixed chemical kinetics.


The Stochastic Resonance Program (Part 1)

10 May, 2014

guest post by David Tanzer

At the Azimuth Code Project, we are aiming to produce educational software that is relevant to the Earth sciences and the study of climate. Our present software takes the form of interactive web pages, which allow you to experiment with the parameters of models and view their outputs. But to fully understand the meaning of a program, we need to know about the concepts and theories that inform it. So we will be writing articles to explain both the programs themselves and the math and science behind them.

In this two-part series, I’ll explain this program:

Stochastic resonance.

Check it out—it runs on your browser! It was created by Allan Erskine and Glyn Adgie. In the Azimuth blog article Increasing the Signal-to-Noise Ratio with More Noise, Glyn Adgie and Tim van Beek give a nice explanation of the idea of stochastic resonance, which includes some clear and exciting graphs.

My goal today is give a compact, developer-oriented introduction to stochastic resonance, which will set the context for the next blog article, where I’ll dissect the program itself. By way of introduction, I am a software developer with research training in computer science. It’s a new area for me, and any clarifications will be welcome!

The concept of stochastic resonance

Stochastic resonance is a phenomenon, occurring under certain circumstances, in which a noise source may amplify the effect of a weak signal. This concept was used in an early hypothesis about the timing of ice-age cycles, and has since been applied to a wide range of phenomena, including neuronal detection mechanisms and patterns of traffic congestion.

Suppose we have a signal detector whose internal, analog state is driven by an input signal, and suppose the analog states are partitioned into two regions, called “on” and “off” — this is a digital state, abstracted from the analog state. With a light switch, we could take the force as the input signal, the angle as the analog state, and the up/down classification of the angle as the digital state.

Consider the effect of a periodic input signal on the digital state. Suppose the wave amplitude is not large enough to change the digital state, yet large enough to drive the analog state close to the digital state boundary. Then, a bit of random noise, occurring near the peak of an input cycle, may “tap” the system over to the other digital state. So we will see a probability of state-transitions that is synchronized with the input signal. In a complex way, the noise has amplified the input signal.

But it’s a pretty funky amplifier! Here is a picture from the Azimuth library article on stochastic resonance:

Stochastic resonance has been found in the signal detection mechanisms of neurons. There are, for example, cells in the tails of crayfish that are tuned to low-frequency signals in the water caused by predator motions. These signals are too weak to cross the firing threshold for the neurons, but with the right amount of noise, they do trigger the neurons.

See:

Stochastic resonance, Azimuth Library.

Stochastic resonance in neurobiology, David Lyttle.

Bistable stochastic resonance and Milankovitch theories of ice-age cycles

Stochastic resonance was originally formulated in terms of systems that are bistable — where each digital state is the basin of attraction of a stable equilibrium.

An early application of stochastic resonance was to a hypothesis, within the framework of bistable climate dynamics, about the timing of the ice-age cycles. Although it has not been confirmed, it remains of interest (1) historically, (2) because the timing of ice-age cycles remains an open problem, and (3) because the Milankovitch hypothesis upon which it rests is an active part of the current research.

In the bistable model, the climate states are a cold, “snowball” Earth and a hot, iceless Earth. The snowball Earth is stable because it is white, and hence reflects solar energy, which keeps it frozen. The iceless Earth is stable because it is dark, and hence absorbs solar energy, which keeps it melted.

The Milankovitch hypothesis states that the drivers of climate state change are long-duration cycles in the insolation — the solar energy received in the northern latitudes — caused by periodic changes in the Earth’s orbital parameters. The north is significant because that is where the glaciers are concentrated, and so a sufficient “pulse” in northern temperatures could initiate a state change.

Three relevant astronomical cycles have been identified:

• Changing of the eccentricity of the Earth’s elliptical orbit, with a period of 100 kiloyears

• Changing of the obliquity (tilt) of the Earth’s axis, with a period of 41 kiloyears

• Precession (swiveling) of the Earth’s axis, with a period of 23 kiloyears

In the stochastic resonance hypothesis, the Milankovitch signal is amplified by random events to produce climate state changes. In more recent Milankovitch theories, a deterministic forcing mechanism is used. In a theory by Didier Paillard, the climate is modeled with three states, called interglacial, mild glacial and full glacial, and the state changes depend on the volume of ice as well as the insolation.

See:

Milankovitch cycle, Azimuth Library.

Mathematics of the environment (part 10), John Baez. This gives an exposition of Paillard’s theory.

Bistable systems defined by a potential function

Any smooth function with two local minima can be used to define a bistable system. For instance, consider the function V(x) = x^4/4 - x^2/2:

To define the bistable system, construct a differential equation where the time derivative of x is set to the negative of the derivative of the potential at x:

dx/dt = -V'(x) = -x^3 + x = x(1 - x^2)

So, for instance, where the potential graph is sloping upward as x increases, -V'(x) is negative, and this sends X(t) ‘downhill’ towards the minimum.

The roots of V'(x) yield stable equilibria at 1 and -1, and an unstable equilibrium at 0. The latter separates the basins of attraction for the stable equilibria.

Discrete stochastic resonance

Now let’s look at a discrete-time model which exhibits stochastic resonance. This is the model used in the Azimuth demo program.

We construct the discrete-time derivative, using the potential function, a sampled sine wave, and a normally distributed random number:

\Delta X_t = -V'(X_t) * \Delta t + \mathrm{Wave}(t) + \mathrm{Noise}(t) =
X_t (1 - X_t^2) \Delta t + \alpha * \sin(\omega t) + \beta * \mathrm{GaussianSample}(t)

where \Delta t is a constant and t is restricted to multiples of \Delta t.

This equation is the discrete-time counterpart to a continuous-time stochastic differential equation.

Next time, we will look into the Azimuth demo program itself.


Programming with Chemical Reaction Networks

23 March, 2014

 

There will be a 5-day workshop on Programming with Chemical Reaction Networks: Mathematical Foundation at BIRS from Sunday, June 8 to Friday June 13, 2014 It’s being organized by

Anne Condon (University of British Columbia)
David Doty (California Institute of Technology)
Chris Thachuk (University of Oxford).

BIRS is the Banff International Research Station, in the mountains west of Calgary, in Alberta, Canada.

Description

Here’s the workshop proposal on the BIRS website. It’s a pretty interesting proposal, especially if you’ve already read Luca Cardelli’s description of computing with chemical reaction networks, at the end of our series of posts on chemical reaction networks. The references include a lot of cool papers, so I’ve created links to those to help you get ahold of them.

This workshop will explore three of the most important research themes concerning stochastic chemical reaction networks (CRNs). Below we motivate each theme and highlight key questions that the workshop will address. Our main objective is to bring together distinct research communities in order to consider new problems that could not be fully appreciated in isolation. It is also our aim to determine commonalities between different disciplines and bodies of research. For example, research into population protocols, vector addition systems, and Petri networks provide a rich body of theoretical results that may already address contemporary problems arising in the study of CRNs.

Computational power of CRNs

Before designing robust and practical systems, it is useful to know the limits to computing with a chemical soup. Some interesting theoretical results are already known for stochastic chemical reaction networks. The computational power of CRNs depend upon a number of factors, including: (i) is the computation deterministic, or probabilistic, and (ii) does the CRN have an initial context — certain species, independent of the input, that are initially present in some exact, constant count.

In general, CRNs with a constant number of species (independent of the input length) are capable of Turing universal computation [17], if the input is represented by the exact (unary) count of one molecular species, some small probability of error is permitted and an initial context in the form of a single-copy leader molecule is used.

Could the same result hold in the absence of an initial context? In a surprising result based on the distributed computing model of population protocols, it has been shown that if a computation must be error-free, then deterministic computation with CRNs having an initial context is limited to computing semilinear predicates [1], later extended to functions outputting natural numbers encoded by molecular counts [5].

Furthermore, any semilinear predicate or function can be computed by that class of CRNs in expected time polylogarithmic in the input length. Building on this result, it was recently shown that by incurring an expected time linear in the input length, the same result holds for “leaderless” CRNs [8] — CRNs with no initial context. Can this result be improved to sub-linear expected time? Which class of functions can be computed deterministically by a CRN without an initial context in expected time polylogarithmic in the input length?

While (restricted) CRNs are Turing-universal, current results use space proportional to the computation time. Using a non-uniform construction, where the number of species is proportional to the input length and each initial species is present in some constant count, it is known that any S(n) space-bounded computation can be computed by a logically-reversible tagged CRN, within a reaction volume of size poly(S(n)) [18]. Tagged CRNs were introduced to model explicitly the fuel molecules in physical realizations of CRNs such as DNA strand displacement systems [6] that are necessary to supply matter and energy for implementing reactions such as X → X + Y that violate conservation of mass and/or energy.

Thus, for space-bounded computation, there exist CRNs that are time-efficient or are space-efficient. Does there exist time- and space-efficient CRNs to compute any space-bounded function?

Designing and verifying robust CRNs

While CRNs provide a concise model of chemistry, their physical realizations are often more complicated and more granular. How can one be sure they accurately implement the intended network behaviour? Probabilistic model checking has already been employed to find and correct inconsistencies between CRNs and their DNA strand displacement system (DSD) implementations [9]. However, at present, model checking of arbitrary CRNs is only capable of verifying the correctness of very small systems. Indeed, verification of these types of systems is a difficult problem: probabilistic state reachability is undecidable [17, 20] and general state reachability is EXPSPACE-hard [4].

How can larger systems be verified? A deeper understanding of CRN behaviour may simplify the process of model checking. As a motivating example, there has been recent progress towards verifying that certain DSD implementations correctly simulate underlying CRNs [16, 7, 10]. This is an important step to ensuring correctness, prior to experiments. However, DSDs can also suffer from other errors when implementing CRNs, such as spurious hybridization or strand displacement. Can DSDs and more generally CRNs be designed to be robust to such predictable errors? Can error correcting codes and redundant circuit designs used in traditional computing be leveraged in these chemical computers? Many other problems arise when implementing CRNs. Currently, unique types of fuel molecules must be designed for every reaction type. This complicates the engineering process significantly. Can a universal type of fuel be designed to smartly implement any reaction?

Energy efficient computing with CRNs

Rolf Landauer showed that logically irreversible computation — computation as modeled by a standard Turing machine — dissipates an amount of energy proportional to the number of bits of information lost, such as previous state information, and therefore cannot be energy efficient [11]. However, Charles Bennett showed that, in principle, energy efficient computation is possible, by proposing a universal Turing machine to perform logically-reversible computation and identified nucleic acids (RNA/DNA) as a potential medium to realize logically-reversible computation in a physical system [2].

There have been examples of logically-reversible DNA strand displacement systems — a physical realization of CRNs — that are, in theory, capable of complex computation [12, 19]. Are these systems energy efficient in a physical sense? How can this argument be made formally to satisfy both the computer science and the physics communities? Is a physical experiment feasible, or are these results merely theoretical footnotes?

References

[1] D. Angluin, J. Aspnes, and D. Eisenstat. Stably computable predicates are semilinear. In PODC, pages 292–299, 2006.

[2] C. H. Bennett. Logical reversibility of computation. IBM Journal of Research and Development, 17 (6):525–532, 1973.

[3] L. Cardelli and A. Csikasz-Nagy. The cell cycle switch computes approximate majority. Scientific Reports, 2, 2012.

[4] E. Cardoza, R. Lipton, A. R. Meyer. Exponential space complete problems for Petri nets and commutative semigroups (Preliminary Report). Proceedings of the Eighth Annual ACM Symposium on Theory of Computing, pages 507–54, 1976.

[5] H. L. Chen, D. Doty, and D. Soloveichik. Deterministic function computation with chemical reaction networks. DNA Computing and Molecular Programming, pages 25–42, 2012.

[6] A. Condon, A. J. Hu, J. Manuch, and C. Thachuk. Less haste, less waste: on recycling and its limits in strand displacement systems. Journal of the Royal Society: Interface Focus, 2 (4):512–521, 2012.

[7] Q. Dong. A bisimulation approach to verification of molecular implementations of formal chemical reaction network. Master’s thesis. SUNY Stony Brook, 2012.

[8] D. Doty and M. Hajiaghayi. Leaderless deterministic chemical reaction networks. In Proceedings of the 19th International Meeting on DNA Computing and Molecular Programming, 2013.

[9] M. R. Lakin, D. Parker, L. Cardelli, M. Kwiatkowska, and A. Phillips. Design and analysis of DNA strand displacement devices using probabilistic model checking. Journal of The Royal Society Interface, 2012.

[10] M. R. Lakin, D. Stefanovic and A. Phillips. Modular Verification of Two-domain DNA Strand Displacement Networks via Serializability Analysis. In Proceedings of the 19th Annual conference on DNA computing, 2013.

[11] R. Landauer. Irreversibility and heat generation in the computing process. IBM Journal of research and development, 5 (3):183–191, 1961.

[12] L. Qian, D. Soloveichik, and E. Winfree. Efficient Turing-universal computation with DNA polymers (extended abstract) . In Proceedings of the 16th Annual conference on DNA computing, pages 123–140, 2010.

[13] L. Qian and E. Winfree. Scaling up digital circuit computation with DNA strand displacement cascades. Science, 332 (6034):1196–1201, 2011.

[14] L. Qian, E. Winfree, and J. Bruck. Neural network computation with DNA strand displacement cascades. Nature, 475 (7356):368–372, 2011.

[15] G. Seelig, D. Soloveichik, D.Y. Zhang, and E. Winfree. Enzyme-free nucleic acid logic circuits. Science, 314 (5805):1585–1588, 2006.

[16] S. W. Shin. Compiling and verifying DNA-based chemical reaction network implementations. Master’s thesis. California Insitute of Technology, 2011.

[17] D. Soloveichik, M. Cook, E. Winfree, and J. Bruck. Computation with finite stochastic chemical reaction networks. Natural Computing, 7 (4):615–633, 2008.

[18] C. Thachuk. Space and energy efficient molecular programming. PhD thesis, University of British Columbia, 2012.

[19] C. Thachuk and A. Condon. Space and energy efficient computation with DNA strand displacement systems. In Proceedings of the 18th Annual International Conference on DNA computing and Molecular Programming, 2012.

[20] G. Zavattaro and L. Cardelli. Termination Problems in Chemical Kinetics. In Proceedings of the 2008 Conference on Concurrency Theory, pages 477–491, 2008.


Network Theory Overview

22 February, 2014

 

Here’s a video of a talk I gave yesterday, made by Brendan Fong. You can see the slides here—and then click the items in blue, and the pictures, for more information!

The idea: nature and the world of human technology are full of networks! People like to draw diagrams of networks. Mathematical physicists know that in principle these diagrams can be understood using category theory. But why should physicists have all the fun? This is the century of understanding living systems and adapting to life on a finite planet. Math isn’t the main thing we need for this, but it’s got to be part of the solution… so one thing we should do is develop a unified and powerful theory of networks.

We are still far from doing this. In this overview, I briefly described three parts of the jigsaw puzzle, and invited everyone to help fit them together:

• electrical circuits and signal-flow graphs.

• stochastic Petri nets, chemical reaction networks and Feynman diagrams.

• Bayesian networks, information and entropy.

In my talks coming up, I’ll go into more detail on each of these. With luck, you’ll be able to see videos here.

But if you’re near Oxford, you might as well actually attend! You can see dates, times, locations, my slides, and the talks themselves as they show up by going here.


Network Theory Talks at Oxford

7 February, 2014

I’m giving some talks at Oxford:

Network Theory

Nature and the world of human technology are full of networks. People like to draw diagrams of networks: flow charts, electrical circuit diagrams, signal-flow graphs, Bayesian networks, Feynman diagrams and the like. Mathematically minded people know that in principle these diagrams fit into a common framework: category theory. But we are still far from a unified theory of networks. After an overview, we will look at three portions of the jigsaw puzzle in three separate talks:

I. Electrical circuits and signal-flow graphs.

II. Stochastic Petri nets, chemical reaction networks and Feynman diagrams.

III. Bayesian networks, information and entropy.

If you’re nearby I hope you can come! All these talks will take place in Lecture Theatre B in the Computer Science Department—see the map below. Here are the times:

• Friday 21 February 2014, 2 pm: Network Theory: overview. See the slides or watch a video.

• Tuesday 25 February, 3:30 pm: Network Theory I: electrical circuits and signal-flow graphs. See the slides or watch a video.

• Tuesday 4 March, 3:30 pm: Network Theory II: stochastic Petri nets, chemical reaction networks and Feynman diagrams. See the slides or watch a video.

• Tuesday 11 March, 3:30 pm: Network Theory III: Bayesian networks, information and entropy. See the slides or watch a video

The first talk will be part of the OASIS series, meaning the “Oxford Advanced Seminar on Informatic Structures”.

I thank Samson Abramsky, Bob Coecke and Jamie Vicary of the Computer Science Department for inviting me, and Ulrike Tillmann and Minhyong Kim of the Mathematical Institute for helping me get set up. I also thank all the people who helped do the work I’ll be talking about, most notably Jacob Biamonte, Jason Erbele, Brendan Fong, Tobias Fritz, Tom Leinster, Tu Pham, and Franciscus Rebro.

Ulrike Tillmann has also kindly invited me to give a topology seminar:

Operads and the Tree of Life

Trees are not just combinatorial structures: they are also biological structures, both in the obvious way but also in the study of evolution. Starting from DNA samples from living species, biologists use increasingly sophisticated mathematical techniques to reconstruct the most likely “phylogenetic tree” describing how these species evolved from earlier ones. In their work on this subject, they have encountered an interesting example of an operad, which is obtained by applying a variant of the Boardmann–Vogt “W construction” to the operad for commutative monoids. The operations in this operad are labelled trees of a certain sort, and it plays a universal role in the study of stochastic processes that involve branching. It also shows up in tropical algebra. This talk is based on work in progress with Nina Otter.

I’m not sure exactly where this will take place, but surely somewhere in the Mathematical Institute building:

• Monday 24 February, 3:30 pm, Operads and the Tree of Life. See the slides.

The Computer Science Department is shown in the map here:

 

The Mathematical Institute is a bit to the west:


Category Theory for Better Spreadsheets

5 February, 2014

Since one goal of Azimuth is to connect mathematicians to projects that can more immediately help the world, I want to pass this on. It’s a press release put out by Jocelyn Paine, who has blogged about applied category theory on the n-Category Café. I think he’s a serious guy, so I hope we can help him out!

Spreadsheet researcher Jocelyn Ireson-Paine has launched an Indiegogo campaign to fund a project to make spreadsheets safer. It will show how to write spreadsheets that are easier to read and less error-prone than when written in Excel. This is important because spreadsheet errors have cost some companies millions of pounds, even causing resignations and share-price crashes. An error in one spreadsheet, an economic model written in 2010 by Harvard economists Carmen Reinhart and Kenneth Rogoff, has even been blamed for tax rises and public-sector cuts. If he gets funding, Jocelyn will re-engineer this spreadsheet. He hopes that, because of its notoriety, this will catch public attention.

Reinhart and Rogoff’s spreadsheet was part of a paper on the association between debt and economic growth. They concluded that in countries where debt exceeds 90% of gross domestic product, growth is notably lower. But in spring 2013, University of Massachusetts student Thomas Herndon found they had omitted data when calculating an average. Because their paper’s conclusion supported governments’ austerity programmes, much criticism followed. They even received hate email blaming them for tax rises and public-sector cuts.

Jocelyn said, “The error probably didn’t change the results much. But better software would have made the nature of the error clearer, as well as the economics calculations, thus averting ill-informed and hurtful media criticism. Indeed, it might have avoided the error altogether.”

Jocelyn’s project will use two ideas. One is “literate programming”. Normally, a programmer writes a program first, then adds comments explaining how it works. But in literate programming, the programmer becomes an essayist. He or she first writes the explanation, then inserts the calculations as if putting equations into a maths essay. In ordinary spreadsheets, you’re lucky to get any documentation at all; in literate spreadsheets, documentation comes first.

The other idea is “modularity”. This means building spreadsheets from self-contained parts which can be developed, tested, and documented independently of one another. This gives the spreadsheet’s author less to think about, making mistakes less likely. It also makes it easier to replace parts that do have mistakes.

Jocelyn has embodied these ideas in a piece of software named Excelsior. He said, “‘Excelsior’ means ‘higher’ in Latin, and ‘upwards!’ in Longfellow’s poem. I think of it as meaning ‘upwards from Excel’. In fact, though, it’s the name of a wonderful Oxford café where I used to work on my ideas.”

Jocelyn also wants to show how advanced maths benefits computing. Some of his inspiration came from a paper he found on a friend’s desk in the Oxford University Department of Computer Science. Written by professor Joseph Goguen, this used a branch of maths called category theory to elucidate what it means for something to be part of a system, and how the behaviour of a system arises from the behaviours of its parts. Jocelyn said, “The ideas in the paper were extremely general, applying to many different areas. And when you think of modules as parts, they even apply to spreadsheets. This shows the value of abstraction.”

For more

For pictures or more information, please contact Jocelyn Ireson-Paine:

Postal: 23 Stratfield Road, Oxford, OX2 7BG, UK.
Tel: 07768 534 091.
Email: make-spreadsheets-safe@j-paine.org

Campaign Web site: http://igg.me/at/safe-spreadsheets

Campaign blog: http://blog.make-spreadsheets-safe.co.uk/

Jocelyn’s bio: http://www.j-paine.org/bio.html

Jocelyn’s personal website, for academic and general stuff: http://www.j-paine.org/

Background information: links to all topics mentioned can be found at the end of Paine’s campaign text at

http://igg.me/at/safe-spreadsheets

These include literate programming, modularity, the Reinhart-Rogoff spreadsheet, category theory, and many horror stories about the damage caused by spreadsheet errors.


The Rarest Things in the Universe

27 January, 2014

guest post by Leonard Adleman

About 50 years ago Kolomogorov assigned to each finite binary string \sigma a non-negative integer that measured that string’s ‘descriptive complexity’. Informally, K(\sigma) is the length (in binary) of the shortest (Turing machine) program that with the empty string as input, outputs \sigma and halts. A related measure of descriptive complexity is M(\sigma)=\frac{K(\sigma)}{|\sigma|}, where |\sigma| denotes the length of \sigma. A simple string like:

\sigma_0=\overbrace{111\ldots 111}^{1,000,000}

can be produced by a very short program; hence M(\sigma_0) is near 0. But if \sigma is a ‘random string’ (e.g. obtained by flipping coins), then, with high probability, it cannot be produced by a program significantly shorter than \sigma itself; hence M(\sigma) will be near 1.

If I asked you to produce (using a computer) a thousand strings of length one million with M near 1, it would be easy to do; just flip a lot of coins. If I asked you to produce a thousand strings with M near 0, that would also be easy. For example, you could start with a short random string and repeat it a lot. Actually, if I chose my favorite \alpha\in [0,1] and wanted a thousand strings of length one million with M near \alpha, then a mix of the preceding approaches can be used to produce them. So strings with a desired M are not rare.

Now let’s consider ‘deep strings’*. I will be informal here, but the underlying theory can be found in my Time space and randomness.

For all binary strings \sigma, we assign a value that measures the ‘depth’ of \sigma. D(\sigma) is obtained by considering both the size and the number of steps used by each program that with the empty string as input, outputs \sigma and halts**. D(\sigma) has the following properties:

  • If there is no short program to produce \sigma, then D(\sigma) is small.
  • If there is a short program to produce \sigma and it uses few steps, then D(\sigma) is small.
  • If there is a short program to produce \sigma, but all short programs to produce \sigma use lots of steps, then D(\sigma) is large. Roughly speaking, the more steps small programs use to produce \sigma, the larger D(\sigma) will be.

Informally, we call a string with large D ‘deep’ and one with small D ‘shallow’. A few examples may help.

Consider a string \sigma obtained by flipping coins. With high probability there is no short program to produce \sigma, hence \sigma is shallow.

Now consider \sigma_0 above. Since there is a short program to produce \sigma_0 and that program uses few steps, \sigma_0 is shallow.

Now treat \sigma_0 as a number in binary (i.e. 2^{1,000,000}-1) and consider the prime factorization. The fundamental theorem tells us it exists and will be about one million bits long. But, unless 2^{1,000,000}-1 is somehow special (e.g. a prime times a very smooth number), its prime factorization may be very very deep. A short program can generate the prime factorization (just generate one million 1s with a short program and then give it to a short factoring program). But if it turns out that factoring can’t be done in polynomial time, then perhaps all short programs that generate the prime factorization use a huge number of steps. So the prime factorization would have a very large D. Conceivably, since steps on a computer use time and energy, the prime factorization can never be realized. It is not a long string (only one million bits) but it may exist only in theory and not in reality.

If I asked you to produce even one string of length one million with D near that of the prime factorization of 2^{1,000,000}-1, could you do it? I would not know how and I suspect that as a practical matter it cannot be done. So strings with such a D are rare.

Here is a string that does exist in our universe and that (I suspect) is quite deep:

46817226351072265620777670675006972301618979214252832875068976303839400413682313
921168154465151768472420980044715745858522803980473207943564433*5277396428112339
17558838216073534609312522896254707972010583175760467054896492872702786549764052
643493511382273226052631979775533936351462037464331880467187717179256707148303247

In fact, this string is the prime factorization of 2^{1061}-1. We should not expect it to be as deep as the prime factorization of 2^{1,000,000}-1, but we should still expect it to have considerable depth. There is even some experimental evidence that supports this view. How did I get this prime factorization? I got it from: Factorization of a 1061-bit number by the Special Number Field Sieve, by Greg Childers. So it was easy to get. Well, easy for me anyway; not so easy for Childers. He reports that the factorization took about `3 CPU-centuries’ using the Special Number Field Sieve. Other than by factoring 2^{1061}-1, how could you ever come to write this pair of primes? I venture to say that since the Special Number Field Sieve is the fastest known algorithm for factoring numbers of this kind, no method available today could have written these primes (for the first time) using fewer steps (and hence less time/energy).

The situation might be compared to that of atomic physics. I am not a physicist, but I suppose it is possible to theorize about an atomic nucleus with a million protons. But what if I want to create one? It appears that producing transuranic elements takes huge amounts of time/energy and the greater the number of protons, the more time/energy it takes. It is even conceivable (to me at least) that there is not enough time/energy available (at least on earth) to actually produce one. Like the prime factorization of 2^{1,000,000}-1, it may exist in theory but not in reality. On the other hand, physicists from Russia and America, using lots of time/energy, have created an atomic nucleus with 118 protons called Ununoctium. Ununoctium is analogous to Childers’ prime factorization; both exist in reality; both were very costly to create.

But my focus here is neither ontology nor physics; it is money. Recently Bitcoins and Bitcoin-like things have captured the world’s attention. I suspect that this kind of currency will revolutionize the nature of economies, and consequently the nature of governance. Personally, I am agnostic on whether this is a good thing or a bad thing. But, in any case, I am wondering if ‘depth’ might form part of a theoretical basis for understanding new currencies. I must confess to knowing few details about Bitcoins; consequently, I will start from first principles and consider some of the properties of currency:

  • Mass: A currency may have mass or may be massless. U.S. dollars and gold have mass. Bitcoins are massless. The efficiencies of the Internet market require a massless currency. Imagine running eBay or Amazon using gold. U.S. dollars have mass, so we do not actually use them for transactions on the Internet, rather we use credit card numbers to create massless IOUs that promise to deliver (but, in fact, seldom actually deliver) U.S. dollars in the future. These IOUs are essentially a massless currency.
  • Production: Who can produce or destroy the currency? This has been an important political, legal and economic issue for millennia. In the west, coins began to appear at about the time Solon ruled Athens. Solon realized that he could manipulate the value of coins. And so he did. Solon’s lesson has not been lost on governments ever since. In its simplest form: if you owe K dollars, you print K dollars, and voila! no more debt. Creditors don’t like debtors to do that, so they want to be paid in a currency that no one can produce more of. Gold comes close. You can produce more gold but you have to spend a lot of time/energy in a mine to do so. Production also includes counterfeiting. Counterfeiting comes in at least two important forms: de novo-counterfeiting (build a press) and duplication-counterfeiting (get a xerox machine). For now, all massless currencies are in digital form and are vulnerable to duplication-counterfeiting since computers make it cheap and easy to create perfect copies of digital notes (a unit of a currency will be called a ‘note’). As a result, massless currencies will typically be associated with systems that mitigate this threat. Perhaps the elaborate ledgers implemented by the creators of Bitcoin are an attempt to deal with the threat of duplication-counterfeiting.
  • Abundance: As is often said, money can be used to ‘lubricate the economy’. To accomplish this a currency has to be sufficiently abundant. For example, Ununoctium, might be attractive to creditors because, compared to gold, it is far more costly in time/energy to produce more. However, it is not desirable for lubricating the economy because it is so costly to produce that less than 100 atoms have ever been created.

The transition from mass to masslessness will lead to currencies with uses and properties we do not normally associate with money. For example, using the techniques of secret-sharing, it becomes possible to create digital currencies where a single note can be jointly owned by 1000 individuals; any 501 of whom could cooperate to spend it, while any 500 of whom would be unable to do so.

What is the perfect currency? This is probably the wrong question, rather we should ask what properties may currency have, in theory and in reality. Let’s consider massless currencies.

Is there a massless currency similar to the U.S. dollar? I think yes. For example, the Government could simply publish a set of numbers and declare the numbers to be currency. Put another way, make the U.S. dollar smaller and smaller to decrease mass until asymptotically all that is left is the serial number. With regard to abundance, like the U.S. dollar, the Government is free to determining the number of notes available. With regard to production, as with the U.S. dollar, the Government can print more by simply publishing new numbers to be added to the set (or destroy some by declaring that some numbers have been removed from the set). With regard to counterfeiting, the U.S. dollar has some advantages. The mass of the U.S. dollar turns counterfeiting from a digital problem to a physical one. This provides the Government with the ability to change the U.S. dollar physically to defeat technology that might arise to produce faithful (either de novo or duplication) counterfeits.

Is there a massless currency similar to gold? I think yes. I think this is what Bitcoin-like currencies are all about. They are ‘deep-currencies’, sets of deep strings. With regard to abundance, they are superior to gold. The total number of deep strings in the set can be chosen at initiation by the creator of the currency. The abundance of gold on the other hand has already been set by nature (and, at least as currently used, gold is not sufficiently abundant to lubricate the world economy). With regard to production, as with gold, making notes requires time/energy. With regard to counterfeiting, gold has an advantage. Counterfeit gold is an absurdity, since all gold (by the ounce, not numismatically), no matter how it arises, is the same atomically and is perceived to have the same value. On the other hand, as stated above, massless currencies are vulnerable to duplication-counterfeiting. Interestingly, deep-currencies may be resistant to de novo-counterfeiting, since the creator of the deep-currency is free to choose the depth of the notes, and consequently the cost of producing new notes.

The value of a massless currency in our information based world is clear. Deep-currencies such as Bitcoin offer an attractive approach. But there is an interesting issue that may soon arise. The issue stems from the fact that the production of each note in currencies like Bitcoin requires a large investment of time/energy, and as with gold, this deprives governments of the prerogative to print money. Creditors may like this, but governments will not. What will governments do? Perhaps they will want to create a ‘Dual-currency’. A Dual-currency should be massless. It should come with a secret key. If you do not possess the secret key, then, like gold, it should be very costly to produce a new note, but if you possess the secret key, then, like the U.S. dollar, it should be inexpensive to produce a new note. Is a Dual-currency possible? Here is an example of an approach I call aurum:

  • Generate a pair of RSA keys: a public key \langle E,N\rangle, and a secret key \langle D,N\rangle. Publish the public key \langle E,N\rangle.
  • Declare that notes are exactly those integers G such that 2\leq G\leq N-1 and (the least positive residue of) G^E\mbox{Mod}(N) is less than or equal to \frac{N}{10^9}.

So, choosing a random integer V such that 2\leq V\leq N-1, and then computing V^E\mbox{Mod}(N) has about one chance in a billion of producing a note. Hence the expected number of modular exponentiations to produce a note is about one billion. On the other hand, those who possess the secret key can chose an integer W such that 2\leq W\leq \frac{N}{10^9} and calculate W^D\mbox{Mod}(N) to produce a note after just one modular exponentiation. There are many bells and whistles that can be accommodated with aurum, but here is what is ‘really’ going on. The depth of a string \sigma is obtained by considering programs running with the empty string as input, but we can consider a more general concept: ‘relative depth’. Given a pair of strings \sigma and \tau, the depth of \sigma relative to \tau is obtained by considering programs running with \tau as the input. Hence depth as we have been discussing it is the same as depth relative to the empty string. In the example above, we have made `dual-strings’; strings that are deep relative to the public key, but shallow relative to the secret key.

One of the interesting phenomena of theoretical computer science is that you can sometimes turn bad news into good news. If factoring is hard, then we are deprived of the ability to do something we (at least number theorists) would like to do. Bad news. But surprisingly, we acquire public-key cryptography and the ability to preserve our privacy. Good news. Similarly, strings that are very hard to produce seem useless, but Bitcoins have revealed that such strings can provide a new and useful form of currency. Now that we are aware that deep strings can have value, I expect that clever people will find many new uses for them.

After thinking about deep strings for many years, I see them everywhere – I invite you to do the same. I will finish with one of my favorite observations. The watchmaker analogy is well known and is frequently used in arguing the existence of a creator. If you stumble upon a watch, you recognize from its complexity that it did not form by chance, and you conclude that there must have been a watchmaker. A human is more complex than a watch, so there must have been a ‘human-maker’ – a creator. An alternative view is that the complexity you recognize is actually ‘depth’ and the conclusion you should reach is that there must have been a computational process sustained through a great many steps. In the case of humans, the process is 3.6 billion years of evolution and the depth can be read in the genome. The watch is deep as well, but much of its depth is acquired from the human genome relative to which it is not nearly so deep.

*It has been brought to my attention that others have considered similar concepts including Charles Bennett, Murray Gell-Mann, and Luis Antunes, Lance Fortnow, Dieter Van Melkebeek, and N. Variyam Vinodchandran. Indeed, the latter group has even used the term ‘deep string’, hence it is to them we owe the name.

**For all \sigma\in\{0,1\}^{*}:

  • T(\sigma) denotes the set of Turing machine programs that with the empty string as input, output \sigma and halt.
  • E(\sigma) denotes \min\{\max\{|P|,\log_{2}(\vec{P})\}|P\in T(\sigma)\}, where |P| denotes the length of P, and \vec{P} denotes the number of steps used by P with the empty string as input.

It may be convenient for the reader to assume that D(\sigma) is approximately \frac{E(\sigma)}{K(\sigma)}; however, the proper theory of deep strings extends the notion of depth to sets of strings and accommodates the use of randomness in computation. Depth in the context of quantum computation may also be of interest.


Follow

Get every new post delivered to your Inbox.

Join 3,094 other followers