This week Lisa and I are visiting her 90-year-old mother in Montréal. Friday I’m giving a talk at the Université du Québec à Montréal. The main person I know there is André Joyal, an expert on category theory and algebraic topology. So, I decided to give a talk explaining how some ideas from these supposedly ‘pure’ branches of math show up in biology.

My talk is called ‘Operads and the Tree of Life’.

#### Trees

In biology, trees are very important:

So are trees of a more abstract sort: phylogenetic trees describe the history of evolution. The biggest phylogenetic tree is the ‘Tree of Life’. It includes all the organisms on our planet, alive now or anytime in the past. Here’s a rough sketch of this enormous tree:

Its structure is far from fully understood. So, biologists typically study smaller phylogenetic trees, like this tree of dog-like species made by Elaine Ostrander:

Abstracting still further, we can also think of a tree as a kind of purely mathematical structure, like this:

Trees are important in combinatorics, but also in algebraic topology. The reason is that in algebraic topology we get pushed into studying spaces equipped with enormous numbers of operations. We’d get hopelessly lost without a good way of drawing these operations. We can draw an operation with *n* inputs and one output as a little tree like this:

We can also draw the various ways of composing these operations. Composing them is just like building a big tree out of little trees!

An operation with *n* inputs and one output is called an ** n-ary operation**. In the late 1960s, various mathematicians including Boardmann and Vogt realized that spaces with tons of

*n*-ary operations were crucial to algebraic topology. To handle all these operations, Peter May invented the concept of an operad. This formalizes the way operations can be drawn as trees. By now operads are a standard tool, not just in topology, but also in algebraic geometry, string theory and many other subjects.

But how do operads show up in biology?

When attending a talk by Susan Holmes on phylogenetic trees, I noticed that her work on phylogenetic trees was closely related to a certain operad. And when I discussed her work here, James Griffin pointed out that this operad can be built using a slight variant of a famous construction due to Boardman and Vogt: their so-called ‘W construction’!

I liked the idea that trees and operads in topology might be related to phylogenetic trees. And thinking further, I found that the relation was real, and far from a coincidence. In fact, phylogenetic trees can be seen as operations in a certain operad… and this operad is closely related to the way computational biologists model DNA evolution as a branching sort of random walk.

That’s what I’d like to explain now.

I’ll be a bit sketchy, because I’d rather get across the basic ideas than the technicalities. I could even be wrong about some fine points, and I’d be glad to talk about those in the comments. But the overall picture is solid.

#### Phylogenetic trees

First, let’s ponder the mathematical structure of a phylogenetic tree. First, it’s a **tree**: a connected graph with no circuits. Second, it’s a **rooted tree**, meaning it has one vertex which is designated the **root**. And third, the leaves are labelled.

I should explain the third part! For any rooted tree, the vertices with just one edge coming out of them are called **leaves**. If the root is drawn at the bottom of the tree, the leaves are usually drawn at the top. In biology, the leaves are labelled by names of species: these labels matter. In mathematics, we can label the leaves by numbers where is the number of leaves.

Summarizing all this, we can say a phylogenetic tree should at least be a **leaf-labelled rooted tree**.

That’s not all there is to it. But first, a comment. When you see a phylogenetic tree drawn by a biologist, it’ll pretty much always a **binary** tree, meaning that as we move up any edge, away from the root, it either branches into two new edges or ends in a leaf. The reason is that while species often split into two as they evolve, it is less likely for a species to split into *three or more* new species all at once.

So, the phylogenetic trees we see in biology are usually leaf-labeled rooted binary trees. However, we often want to *guess* such a tree from some data. In this game, trees that aren’t binary become important too!

Why? Well, here another fact comes into play. In a phylogenetic tree, typically each edge can be labeled with a number saying how much evolution occurred along that edge. But as this number goes to zero, we get a tree that’s not binary anymore. So, we think of non-binary trees as conceptually useful ‘borderline cases’ between binary trees.

So, it’s good to think about phylogenetic trees that aren’t necessarily binary… and have edges labelled by numbers. Let’s make this into a formal definition:

**Definition** A **phylogenetic tree** is a leaf-labeled rooted tree where each edge not touching a leaf is labeled by a positive real number called its **length**.

By the way, I’m not claiming that biologists actually use this definition. I’ll write for the set of phylogenetic trees with leaves. This becomes a topological space in a fairly obvious way, where we can trace out a continuous path by continuously varying the edge lengths of a tree. But when some edge lengths approach zero, our graph converges to one where the vertices at ends of these edges ‘fuse into one’, leaving us with a graph with fewer vertices.

Here’s an example for you to check your understanding of what I just said. With the topology I’m talking about, there’s a continuous path in that looks like this:

These trees are upside-down, but don’t worry about that. You can imagine this path as a process where biologists slowly change their minds about a phylogenetic tree as new data dribbles in. As they change their minds, the tree changes shape in a continuous way.

For more on the space of phylogenetic trees, see:

• Louis Billera, Susan Holmes and Karen Vogtmann, Geometry of the space of phylogenetic trees, *Advances in Applied Mathematics* **27** (2001), 733-767.

#### Operads

How are phylogenetic trees related to operads? I have three things to say about this. First, they are the operations of an operad:

**Theorem 1.** There is an operad called the **phylogenetic operad**, or whose space of *n*-ary operations is

If you don’t know what an operad is, I’d better tell you now. They come in different flavors, and technically I’ll be using ‘symmetric topological operads’. But instead of giving the full definition, which you can find on the *n*Lab, I think it’s better if I sketch some of the key points.

For starters, an operad consists of a topological space for each . The point in are called the ** n-ary operations** of You can visualize an

*n*-ary operation as a black box with input wires and one output wire:

Of course, this also looks like a tree.

We can permute the inputs of an *n*-ary operation and get a new *n*-ary operation, so we have an action of the permutation group on . You visualize this as permuting input wires:

More importantly, we can compose operations! If we have an *n*-ary operation , and *n* more operations, say , we can **compose** with all the rest and get an operation called

Here’s how you should imagine it:

Composition and permutation must obey some laws, all of which are completely plausible if you draw them as pictures. For example, the **associative law** makes a composite of composites like this well-defined:

Now, these pictures look a lot like trees. So it shouldn’t come as a shock that phylogenetic trees are the operations of some operad But let’s sketch why it’s true.

First, we can permute the ‘inputs’—meaning the labels on the leaves—of any phylogenetic tree and get a new phylogenetic tree. This is obvious.

Second, and more importantly, we can ‘compose’ phylogenetic trees. How do we do this? Simple: we glue the roots of a bunch of phylogenetic trees to the leaves of another and get a new one!

More precisely, suppose we have a phylogenetic tree with leaves, say . And suppose we have more, say Then we can glue the roots of to the leaves of to get a new phylogenetic tree called

Third and finally, all the operad laws hold. Since these laws all look obvious when you draw them using pictures, this is really easy to show.

If you’ve been paying careful attention, you should be worrying about something now. In operad theory, we think of an operation as having *n* inputs and one output. For example, this guy has 3 inputs and one output:

But in biology, we think of a phylogenetic tree as having one input and *n* outputs. We start with one species (or other grouping of organisms) at the bottom of the tree, let it evolve and branch, and wind up with *n* of them!

In other words, operad theorists read a tree from top to bottom, while biologists read it from bottom to top.

Luckily, this isn’t a serious problem. Mathematicians often use a formal trick where they take an operation with *n* inputs and one output and think of it as having one input and *n* outputs. They use the prefix ‘co-‘ to indicate this formal trick.

So, we could say that phylogenetic trees stand for ‘co-operations’ rather than operations. Soon this trick will come in handy. But not just yet!

#### The W construction

Boardman and Vogt had an important construction for getting new operads for old, called the ‘W construction’. Roughly speaking, if you start with an operad , this gives a new operad whose operations are leaf-labelled rooted trees where:

1) all vertices except leaves are labelled by operations of and a vertex with *n* input edges must be labelled by an *n*-ary operation of

and

2) all edges except those touching the leaves are labelled by numbers in .

If you think about it, the operations of are strikingly similar to phylogenetic trees, except that:

1) in phylogenetic trees the vertices don’t seem to be labelled by operations of a operad,

and

2) we use arbitrary nonnegative numbers to label edges, instead of numbers in .

The second point is a real difference, but it doesn’t matter much: if Boardman and Vogt had used nonnegative numbers instead of numbers in to label edges in the W construction, it would have worked just as well. Technically, they’d get a ‘weakly equivalent’ operad.

The first point is not a real difference. You see, there’s an operad called which has exactly one operation of each arity. So, labelling vertices by operations of is a completely trivial process.

As a result, we conclude:

**Theorem 2.** The phylogenetic operad is weakly equivalent to .

If you’re not an expert on operads (such a person is called an ‘operadchik’), you may be wondering what stands for. The point is that operads have ‘algebras’, where the abstract operations of the operad are realized as actual operations on some topological space. And the algebras of are precisely commutative topological monoids: that is, topological spaces equipped with a commutative associative product!

#### Branching Markov processes and evolution

By now, if you haven’t fallen asleep, you should be brimming with questions, such as:

1) *What does it mean* that phylogenetic trees are the operations of some operad ? Why should we care?

2) *What does it mean* to apply the W construction to the operad ? What’s the significance of doing this?

3) *What does it mean* that is weakly equivalent to ? You can see the definition of weak equivalence here, but it’s pretty technical, so it needs some explanation.

The answers to questions 2) and 3) take us quickly into fairly deep waters of category theory and algebraic topology—deep, that is, if you’ve never tried to navigate them. However, these waters are well-trawled by numerous experts, and I have little to say about questions 2) and 3) that they don’t already know. So given how long this talk already is, I’ll instead try to answer question 1). This is where some ideas from *biology* come into play.

I’ll summarize my answer in a theorem, and then explain what the theorem means:

**Theorem 3.** Given any continuous-time Markov process on a finite set , the vector space whose basis is naturally becomes a coalgebra of the phylogenetic operad.

Impressive, eh? But this theorem is really just saying that *biologists are already secretly using the phylogenetic operad.*

Biologists who try to infer phylogenetic trees from present-day genetic data often use simple models where the genotype of each species follows a ‘random walk’. Also, species branch in two at various times. These models are called Markov models.

The simplest Markov model for DNA evolution is the Jukes–Cantor model. Consider a genome of fixed length: that is, one or more pieces of DNA having a total of base pairs. For example, this tiny genome has base pairs, just enough to illustrate the 4 possible choices, which are called A, T, C and G:

Since there are 4 possible choices for each base pair, there are possible genotypes with base pairs. In the human genome, is about . So, there are about

genotypes of this length. That’s a lot!

As time passes, the Jukes–Cantor model says that the human genome randomly walks through this enormous set of possibilities, with each base pair having the same rate of randomly flipping to any other base pair.

Biologists have studied many ways to make this model more realistic in many ways, but in a Markov model of DNA evolution we’ll typically have some finite set of possible genotypes, together with some random walk on this set. But the term ‘random walk’ is a bit imprecise: what I really mean is a ‘continuous-time Markov process’. So let me define that.

Fix a finite set . For each time and pair of points *i*, *j* in , a **continuous-time Markov process** gives a number saying the probability that starting at the point *i* at time zero, the random walk will go to the point *j* at time . We can think of these numbers as forming an square matrix at each time . We demand that four properties hold:

1) depends continuously on .

2) For all *s*, *t* we have .

3) is the identity matrix.

4) For all *j* and *t* we have:

.

All these properties make a lot of sense if you think a bit, though condition 2) says that the random walk does not change character with the passage of time, which would be false given external events like, say, ice ages. As far as math jargon goes, conditions 1)-3) say that is a **continuous one-parameter semigroup**, while condition 4) together with the fact that says that at each time, is a **stochastic matrix**.

Let be the vector space whose basis is . To avoid getting confused, let’s write for the basis vector corresponding to . Any probability distribution on gives a vector in . Why? Because it gives a probability for each , and we can think of these as the components of a vector .

Similarly, for any time , we can think of the matrix as a linear operator

So, if we start with some probability distribution of genotypes, and let them evolve for a time according to our continuous-time Markov process, by the end the probability distribution will be .

But species do more than evolve this way: they also branch! A phylogenetic tree describes a way for species to evolve and branch.

So, you might hope that any phylogenetic tree gives a ‘co-operation’ that takes one probability distribution as input and returns *n* probability distributions as output.

That’s true. But these *n* probability distributions will be correlated, so it’s better to think of them as a single probability distribution on the set This can be seen as a vector in the vector space , the tensor product of *n* copies of

So, any phylogenetic tree gives a linear operator from to . We’ll call it

because we’ll build it starting from the Markov process

Here’s a sketch of how we build it—I’ll give a more precise account in the next and final section. A phylogenetic tree is made of a bunch of vertices and edges. So, I just need to give you an operator for each vertex and each edge, and you can compose them and tensor them to get the operator :

1) For each vertex with one edge coming in and *n* coming out:

we need an operator

that describes what happens when one species branches into *n* species. This operator takes the probability distribution we put in and makes *n* identical and perfectly correlated copies. To define this operator, we use the fact that the vector space has a basis labelled by the genotypes Here’s how the operator is defined:

2) For each edge of length , we need an operator that describes a random walk of length This operator is provided by our continuous-time Markov process: it’s

And that’s it! By combining these two kinds of operators, one for ‘branching’ and one for ‘random walking’, we get a systematic way to take any phylogenetic tree and get an operator

In fact, these operators obey just the right axioms to make into what’s called a ‘coalgebra’ of the phylogenetic operad. But to see this—that is, to prove Theorem 3—it helps to use a bit more operad technology.

#### The proof

I haven’t even defined coalgebras of operads yet. And I don’t think I’ll bother. Why not? Well, while the proof of Theorem 3 is fundamentally trivial, it’s sufficiently sophisticated that only operadchiks would enjoy it without a lengthy warmup. And you’re probably getting tired by now.

So, to most of you reading this: bye! It was nice seeing you! And I hope you sensed the real point of this talk:

**Some of the beautiful structures used in algebraic topology are also lurking in biology. These structures may or may not be useful in biology… but we’ll never know if we don’t notice them and say what they are! So, it makes sense for mathematicians to spend some time looking for them.**

Now, let me sketch a proof of Theorem 3. It follows from a more general theorem:

**Theorem 4.** Suppose is an object in some symmetric monoidal topological category . Suppose that is equipped with an action of the additive monoid . Suppose also that is a cocommutative coalgebra. Then naturally becomes a coalgebra of the phylogenetic operad.

How does this imply Theorem 3? In Theorem 3, is the category of finite-dimensional real vector space. The action of on is the continuous-time Markov process. And becomes a cocommutative coalgebra because it’s a vector space with a distinguished basis, namely the finite set . This makes into a cocommutative coalgebra in the usual way, where the comultiplication:

‘duplicates’ basis vectors:

while the counit:

‘deletes’ them:

These correspond to species splitting in two and species going extinct, respectively. (Biologists trying to infer phylogenetic trees often ignore extinction, but it’s mathematically and biologically natural to include it.) So, all the requirements are met to apply Theorem 4 and make into coalgebra of the phylogenetic operad.

But how do we prove Theorem 4? It follows immediately from Theorem 5:

**Theorem 5.** The phylogenetic operad is the coproduct of the operad and the additive monoid , viewed as an operad with only 1-ary operations.

Given how coproducts works, this means that an algebra of both and is automatically an algebra of . In other words, any commutative algebra with an action of is an algebra of . Dualizing, it follows that any cocommutative *coalgebra* with an action of is an *coalgebra* of And that’s Theorem 4!

But why is Theorem 5 true? First of all, I should emphasize that the idea of using it was suggested by Tom Leinster in our last blog conversation on the phylogenetic operad. And in fact, Tom proved a result very similar to Theorem 5 here:

• Tom Leinster, Coproducts of operads, and the W-construction, 14 September 2000.

He gives an explicit description of the coproduct of an operad and a monoid, viewed as an operad with only unary operations. He works with non-symmetric, non-topological operads, but his ideas also work for symmetric, topological ones. Applying his ideas to the coproduct of and , we see that we get the phylogenetic operad!

And so, phylogenetic trees turn out to be related to coproducts of operads. Who’d have thought it? But we really don’t have as many fundamentally different ideas as you might think: it’s hard to have new ideas. So if you see biologists and algebraic topologists both drawing pictures of trees, you should expect that they’re related.

Wow John, It’s me Jake, from CQT, I’m in Waterloo Ontario right now — I thought for a second you got on a plane to track me to down to get back to work on those rabbit pictures. I’ll be back in Singapore in a month. Don’t forget about me.

I wanted to mention to others something that we noticed when you were teaching me stuff at CQT some time ago. You explained to me what an operad was, even drew a slick table showing how these tie to other structures. I have that table still. The two things I found the most interesting (warning seemingly way off topic from the post) are:

* In classical switching function theory, operads are called read-once formula. These are precisely the class of switching functions built from gate sets with no fan-in — that is, the gate set can be composed only to build a tree.

* In quantum network theory, they correspond to MERA networks, that can be contracted efficiently. Each of the three leg tensors (in this case) is an isomentry. To measure anything in quantum mechanics you take an inner product, when you take an inner product, you contract a state (here represented as a tree) with a conjugated copy of itself. Each of the three legged tensors will vanish when contracted to its isometric pair. This means the network can be contracted, and hence values can be measured efficiently.

* And from this, we can conclude that read-once formula are necessarily satisfiable. That means they correspond to constraint equations that can be satisfied.

This is a bit off topic, but every time I hear about operads I think about these cases. cheers!

Hi, Jake!

You may think I’ve been slacking off on our network theory project. It’s sort of true. But I claim that this operad stuff is just another aspect of that project. As usual, I’m taking ideas from quantum theory and adapting them to stochastic processes. Amplitudes become probabilities, unitary matrices become stochastic matrices, becomes , and so on. Even the Fock space idea shows up, though I didn’t make that explicit here.

In particular, the pictures of trees here can be seen as Feynman diagrams! But these are a bit different than the ones we’ve been talking about. The edges are ‘propagators’ where time evolution is described by the continuous-time Markov process The vertices are ‘interactions’, but only of a very limited sort: the vector space is a cocommutative coalgebra, so it comes with a ‘duplication’ operator

and a ‘deletion’ operator

These correspond to ‘speciation’ (where a species splits in two) and ‘extinction’.

I haven’t completely figured out how this new stuff is connected to stochastic Petri nets, but it’s clearly part of the same story.

This may seem a bit nit-picky when you’re obviously more concerned with the big picture, but I’m confused so I’ll go ahead and ask and hope you’ll indulge me in what are probably stupid questions. See, I had to deal with C0-semigroups pretty extensively in a past life, but I was never fortunate enough to have ones that were either Markov or acting on a finite dimensional space, so I’m struggling a bit with your condition 4 above, namely, for all j and t, where is defined to be the probability of being at j if you started at i at t=0. If I try to put this into words, I get something like: “If you find yourself at j at time t, then you must have come from from some i at time 0, and this is true for all j.” Is that a fair summary?

Personally, I tend to think in terms of initial conditions with C0-semigroups, so I would expect the condition for all i and t to hold, which I would summarize as “If you start at i, then you must end up at some j at time t, and this is true for all i and t.” Does this condition follow from the other in some way? Or is it meant to be a trivial consequence of your definition of as “the probability that starting at the point i at time zero, the random walk will go to the point j at time t?”

Thanks.

Dan.

Dan wrote:

Whoops—you caught a typo. I switched around an

iand ajhere. Actually for me is defined to be the probability of being atiif you started atjat .Some other people may use the opposite convention—this might add to your confusion.

I’ll fix this. Thanks!

No, these conditions are independent.

People like to talk about

left stochasticmatrices, which are matrices of nonnegative numbers where thecolumnssum to 1, and alsoright stochasticmatrices, which are matrices of nonnegative numbers where therowssum to 1.One of these describes random processes where starting at any point you have a total probability 1 of

goingto some point or other.The other describes random processes where ending at any point you have a total probability 1 of

comingfrom some point or other.However, which is which depends on a convention, namely whether you multiply column vectors by matrices on the left, or row vectors by matrices on the right.

In this talk I’m interested in processes where starting at any point you have a total probability 1 of

goingto some point or other. And, I’m multiplying column vectors by matrices on the left.So, for all I care, there could be genotypes that our random walk will never get to—that’s okay. But every genotype has to go somewhere as it randomly walks around.

A matrix that’s both left and right stochastic is called ‘doubly stochastic’.

This is a good intro:

• Stochastic matrix, Wikipedia.

Hi John,

Thanks for the enlightening discussion about operads and their applications to phylogenetic trees. I see several theorems about the correspondence between operads and phylogenetic trees, but I would really be interested in how the correspondence is used. I.e., is there a theorem that is stated only in terms of general operads, which we can pull through the correspondence, to get a theorem stated only in terms of phylogenetic trees and Markov processes?

You may be interested in the book

Algebraic Statistics for Computational Biologyand in particular in this review: “The Mathematics of Phylogenomics”.Ruchira wrote:

Me too! But I just made it up; I haven’t used it for anything.

I don’t know yet—I haven’t gotten that far. Basically what happened is that I listened to Susan Holmes talk about the geometry of the space of n-leaved phylogenetic trees, and realized that these spaces form an operad. It seemed awfully familiar from my studies of algebraic topology. Then James Griffin and Tom Leinster figured out exactly how an algebraic topologist, or operad theorist, would think about this operad. Operads have ‘algebras’, so I then started wondering what the algebras of this operad were. I soon realized that the

coalgebras of this operad were the branching Markov processes beloved by computational biologists working on phylogenetic trees! And that’s where I am now.In short, so far I’ve noticed that computational biologists are using some quite interesting operads without knowing it. I don’t know yet if it will help them to know it! But I think these connections are always worth noting. If X is an example of a Y, I think it’s always worth pointing it out, unless it’s so obvious it doesn’t need pointing out. These connections have a tendency to pay off, sometimes in unexpected ways.

It’s possible that operad theory could turn out to be useful in studying phylogenetic trees. That would be great. But it’s also possible that work on phylogenetic trees could inspire new work on operads which then gets applied to something else, like algebraic geometry or string theory! That would be cool too.

For people who don’t know operads, this stuff I’m talking about is probably not ‘ready for prime time’. But for people who do, it’ll be a shockingly practical example of an operad.

Thanks! If I learn more about this subject and what problems people are struggling with, that’ll improve my chance of making useful connections.

Do biologists have a way to pass between trees in the enormous, but simple, space of genotypes to phylogenetic trees? Markov processes for the former may involve enormous transition matrices, but at least one could use a simple rate of genetic mutation at each site. But then interpreting that model in terms of phenotypes, speciation, etc. can’t be easy.

David wrote:

I’m no expert on this, so take the following with a huge grain of salt, and hope that someone more knowledgeable chimes in:

In most of what I’ve seen, they aren’t observing trees in the space of genotypes: they’re observing the genotypes at the

leavesof those trees and trying to infer the rest of the tree. In other words, trying to guess the past given the present. There is a huge amount of work devoted to that. A bunch of it uses Markov process models combined with maximum likelihood or Bayesian methods to guess the best tree. They don’t usually use the whole genotype: just some parts of the DNA. There’s a lot of software for doing this kind of inference, like PhyML, FastML, RaxML and Mr. Bayes.And in most of the work I’ve seen, the tree that’s inferred is simply

calledthe phylogenetic tree. One assumes the edges are taxonomically significant units (e.g. species) and the vertices are taxonomically significant branching events (e.g. speciation events).I don’t know if that answers your question.

By the way, I said that mostly we observe DNA data now and try to infer the tree in the past, but one exception I know involves the HIV virus. This mutates fast enough, and has been studied carefully enough, that people have been able to watch it evolves as time passes. Viruses don’t have ‘species’ in the conventional sense of being able to interbreed, but there are medically significant branches forming in the phylogenetic tree as we watch.

Another exception would involve situations where we have access to fossil DNA, like Neanderthal or woolly mammoth DNA.

And by the way, the whole ‘tree’ concept is a simplification of reality. For example, humans and Neanderthals interbred, and among bacteria there’s a lot of ‘lateral transmission’ of genetic material via plasmids and the like.

Just a micro comment: did you notice the striking resemblance between this image (HIV phylogenetic tree, evolving as we observe) and Charles Darwin’s very first drawing of the “tree of life” in his notebook (in the Museum of Natural history in NYC)?:

Did they do so on purpose?

Hey, Carlo—it’s fun to see you here!

I don’t know they made that tree looks similar on purpose. At first I was going to say that all trees look sort of similar, but I think you’re right, there’s something more than that going on here, either coincidentally or not.

By the way, nobody except me can post images here, but if someone includes a link to an interesting image I can edit their comment to make it show up… and that’s what I just did. Of course I’m too lazy to do this very often.

Here is a link to a recent conference “Phylogenetics: New Data, New Phylogenetic Challenges”

http://www.newton.ac.uk/programmes/PLG/plgw05p.html

The talks were recorded. A lot of them are mathematical, and some are quite abstract, but I haven’t spotted anyone using operads yet.

Thanks! No operads in this one, but it offers us operadchiks a lot of food for thought:

• K. St. John, Exploring treespace.

The trees here have no labellings on edges, so the metrics are metric on discrete sets… but still interesting!

John wrote:

Interresting, do you have any documentations about that? I could imagine that something like an educational video could bring in handy all the mutations.

Hi, Nad. I don’t have any really nice graphics illustrating HIV evolution, but here’s a good general overview of HIV evolution:

• Andrew Rambaut,David Posada, Keith A. Crandall and Edward C.Holmes, The causes and consequences of HIV evolution.

My friend Chris Lee is an expert on this stuff:

• Chris Lee, HIV positive selection mutation database.

and this database allowed him to calculate in detail the effect of various drugs on HIV evolution.

There’s a lot about the evolution of HIV in the opening chapter, or perhaps the introduction, of Steve Jones’s book

Almost Like a Whale. (In the US it goes by the title ofDarwin’s Ghost. It’s a retelling of theOrigin of Species.) If you’re after a non-technical description, you might like that.Another virus that evolves very fast is influenza. That’s one of the things that makes vaccination difficult: the vaccines constantly have to be updated.

Thanks for the references John and Tom.

Yes and in the end the vaccine might be so poisonous that it kills the sick one instead of the flu.

I think fears of vaccines are greatly overstated. I don’t know of anyone dying from a ‘poisonous vaccine’, though almost everything happens at least once.

Here’s a nice article on it:

http://www.bunniestudios.com/blog/?p=353

Excellent post – it must have been an excellent talk too!

I do take slight issue with your idea that Operad theorists have much to say (as yet) on issues ‘what does it mean …’

The point being that there are two distinct interesting questions with the same English wording

1) What are the mathematical definitions and consequences …. This is what you are thinking

2) What does it tell us about biology? A question that we should start to address now :)

Roger wrote:

Thanks! But the talk is tomorrow, so while it may be predestined for excellence, I can’t be sure of that yet.

Actually I was thinking about the mathematical

meaning, which is not always easily apparent, even if one has seen the definitions and theorems. I distinguish between being able to state results and having them fully integrated into ones way of thinking. Only the latter lets one make significant progress.In particular, for these questions:

it takes a nontrivial change of worldview to appreciate why the W construction and weak equivalence are important and inevitable notions. They’re both aspects of the ‘homotopification’ of mathematics, where instead of demanding that equations hold, one demands that equations hold ‘up to coherent homotopy’. Operad theorists have a

lotto say about what this means and why it’s important.But yes, then there’s another layer: what does all this have to with

biology?I doubt anyone knows.Hey John,

I am thinking that your statement “Phyl is weakly equivalent to W(Comm)” is way weaker than what you would want to say. This is tantamount to saying “Phyl is weakly equivalent to the point.”

While true, it seems to me that the whole point you would like to make is that Phyl is actually

isomorphicto W(Comm) (degreewise homeomorphic).Urs wrote:

Hmm, I guess you’re right. Let me make sure I understand. Is every topological operad whose space of -ary operations is contractible weakly equivalent to the terminal topological operad? The action of on the space of -ary operations is irrelevant here?

I’m afraid something slightly weaker is true. The lengths of edges in the trees that give operations of W(Comm) are positive real numbers less than or equal to 1, while for Phyl they are positive real numbers. Since (0,1] isn’t homeomorphic to (0,∞), I’m afraid Phyl isn’t degreewise homeomorphic to W(Comm). They’re “almost homeomorphic” in some way that’s hard for me to state clearly.

As spaces, is some sort of compactification of .

Yes. There is necessarily a unique morphism to Comm and this is by assumption then a degreewise a weak homotopy equivalence of topological spaces. By this theorem these are weak equivalences of topological operads.

The action needs to be respected by the morphism of course, but for morphisms to Comm this is automatic. But given a morphism, its property of being a weak equivalence is just that it is degreewise a weak homotopy equivalence of topological spaces.

Notice that every topological operad which is 1. degreewise contractible and 2. “Sigma cofibrant” (has free permutation group action) is an operad.

Oh, okay. Right, so it slightly fails to be isomorphic to W(Comm) then. But not in a very interesting way. As you say in the second message: you could just add as an admissible tree length, if you wanted to “fix” this.

Urs wrote:

When it comes to biology, adding as an admissible edge length for phylogenetic trees amounts to requiring that our Markov process settles down to some limit as . More precisely: an algebra of the phylogenetic operad extends to an algebra of the larger operad that allows infinite edge lengths iff exists for all , where is the corresponding Markov process.

This isn’t true for all Markov processes, but it is for the ones that show up in Markov models of DNA evolution. So that’s okay. Then there’s another little nuance…

Boardman and Vogt’s W construction requires choosing a way to make into a topological monoid. I don’t have their book on me, so I don’t remember how they do it, but other sources seem to suggest two methods. One is:

and the other is:

The second one has the small advantage of being

obviouslyassociative, but the first is also associative (I just checked!), and it seems to have a bigger advantage, as far as I’m concerned.Namely, I

believe(but haven’t bothered to check) that the operation makes isomorphic, as a topological monoid, to with its usual addition (defined so that ).If so, we can embed with its usual addition as a dense submonoid of with the operation.

And if so, this should make the phylogenetic operad into a dense suboperad of where the inclusion is a homotopy equivalence.

That’s a fairly crisp statement of how they’re ‘almost isomorphic’.

And of course even if Boardman and Vogt’s way of making the closed interval into a topological monoid doesn’t make it isomorphic to with its usual addition, such a way

exists.John wrote:

We have

where . So is a homomorphism from on to ordinary multiplication on . And under multiplication is isomorphic to under addition via . So, you were right.

Great, thanks!!!

My talk seemed to go well. Afterwards, André Joyal made some interesting remarks. I’d like to write them down here so I don’t forget them.

First, a nice simple observation. He said that the phylogenetic operad formalizes a notion of time, ‘branching time’, which marches forward like the usual notion of time (described by the real numbers) except for certain points where it splits.

A species can split in two and evolve in two different ways. We wondered in what other situations the concept of ‘branching time’ could apply. An obvious candidate is the branching scenario in some versions of the ‘many-worlds interpretation’ of quantum mechanics, but I don’t know how to make this precise.

More technically, we talked about how the phylogenetic operad is related to Boardman and Vogt’s W construction.

My talk discussed the coproduct of the operads and . More generally, we can take the coproduct of any operad and any monoid . The operations in can be drawn as (roughly) trees with vertices labelled by operations in and edges labelled by elements of

If we take this gives the edges nonnegative real ‘lengths’.

But we can also take . Then the edges have integral lengths. We can draw this by marking each edge with a number of ‘ticks’: little marks representing the ticks of a clock.

The operad is interesting because it’s the operad freely generated by the operad and an extra unary operation. Let’s call this unary operation In his book on algebraic set theory with Moerdijk, Joyal considered the

monadgenerated by a monad and extra unary operation. This idea was previously studied by Bénabou and Jibladze.Moving in this circle of ideas, we saw that the operad contains a copy of but also

twocopies of the free operad on the underlying collection of operations of .One way to get the latter is to take each operation of and postcomposing it with the operation The resulting operations

generate a copy of the free operad on the underlying collection of operations of . The ticks serve as ‘parentheses’ separating the operations of .

Another way is to take each operation of and postcomposing it a bunch of copies of the operation The resulting operations

generate a second copy of the free operad on the underlying collection of operations of .

The same trick works for , taking our ‘tick’ to be any positive number.

The same trick also works for here becomes a monoid with the obvious concept of addition, such that

As we’ve seen, is isomorphic to with the product

From what we’ve seen earlier in this conversation, it follows that the operad is closely related to the Boardman–Vogt construction . I now understand this a bit better. Since

there’s yet another nice way that the operad contains a copy of the free operad on the underlying set of operations of . Namely, we take each operation of and both precompose

andpostcompose it with :I believe Boardman and Vogt use this trick in their book. Of course they talk about instead of the isomorphic monoid , but that doesn’t matter. What matters is that they’re sandwiching the operations in with an idempotent unary operation that’s not in . We need to do this to have a chance for to be a cofibrant replacement of $O$.

Tom Leinster’s discussion of the construction here doesn’t mention this point. He describes and writes:

All this is true except that the element 1 in the monoid (corresponding to the element in ) does play a special role in the Boardman–Vogt construction, by virtue of being a nontrivial idempotent.

John, the stuff you’re doing prompted me to make a pdf version of my note available, since most people (including me) prefer pdf to ps these days. It’s here: http://www.maths.ed.ac.uk/~tl/w.pdf.

Very neat!

I don’t have any suggestions for what the W-construction and weak equivalence have to do with biology, but it seems clear that the reason coproducts of operads appear is that you’re considering the process of evolution of each species, and the process of branching of species, to be totally unrelated. Right?

Something seems a little weird, though: the behavior of evolution of a single species is controlled by the Markov process, but the behavor of branching of species is controlled by the operad. In other words, we choose a particular n-ary (co)operation in the phylogenetic operad which specifies all of the species branching that is to occur, with specified time periods in between branching events, and then the corresponding coaction morphism specifies a probability distribution over the genotypes of the resulting n species. Wouldn’t it be more realistic to also assign probabilities to the different kinds of branching that could occur?

Mike wrote:

Right. And here “you” means not just little old me, but also biologists who are trying to infer phylogenetic trees from data about the DNA of various species we see today.

They do things like this: fix a Markov process and then seek the phylogenetic tree (in the sense I’ve defined here) and the genotype for the species at the root of this tree that maximize the likelihood that the species at the leaves have the genotypes we see today.

So, for example, Robert Wayne must have taken snippets of DNA from various kinds of dog-like animals and run some maximum likelihood or Bayesian algorithm to guess this tree:

If we were trying to simulate evolution including the branching of species, we might try that. That could be a lot of fun.

For the applications described above, I don’t think it would be very practical. It might be practical if we had a good guess as to the probability per unit time that an organism with a given genotype would branch into two species. But since we don’t, we’d probably be forced into a very simple dumb guess, namely that the probability is some constant. And this, I believe, would not affect the results of the above sort of calculation.

Actually biologists do assume a probability distribution on the branching patterns, and the choice can affect the results. You are right that little is known about speciation rates and extinction rates, so guesses have to be made. In a Bayesian context, the distribution can be seen as part of the prior. You hope that the molecular data will overwhelm the prior, but if the signal in the data is weak, it may not.

I was thinking more along the lines of the Bayesian calculation inferring what sort of branching was likely to have occurred, as Graham suggests. But I think now I see that I asked the wrong question, since what people are really doing is drawing inferences about the branching. I should have said, “wouldn’t it be more realistic to also draw inferences about the Markov process (like in an HMM), rather than fixing a choice at the outset?”

I’m guessing that some people are already trying that too. But I was having a bit of fun trying to think about how to “mix up” the two aspects operadically.

Yes, people already do that too, and more. A lot of Bayesian methods can be characterised as “co-estimate everything”.

The inference is like that for a HMM. The likelihood of the observed data at the leaves (the multiple sequence alignment, see http://en.wikipedia.org/wiki/Multiple_sequence_alignment) is calculated by working back recursively towards the root.

I only have a hazy idea about operads, and here is a very hazy idea: Statistical inference often seems to work ‘backwards’, using co-operations where the model uses operations, or vice-versa.

I was about to ask innocently, “what’s an HMM?” Then I googled it and figured it referred to this. But I guess it wouldn’t hurt to record this fact for the similarly innocent. :-)

Graham wrote:

Thanks for the correction! I only know a little about phylogenomics, so it’s nice to know that when I say something wrong, there’s a chance you’ll appear and correct me.

What branching patterns might be particularly likely and/or unlikely?

Also, you mention ‘extinction rates’. One person told me that extinction events were completely ignored in the process of guessing a phylogenetic tree from present-day DNA data. In other words, that you can work only with trees whose branches all make it to the present, without any harm. I’ve been hoping to find some context in which this actually causes problems. Do you know about that?

(Of course we need to think about extinction events when we also use DNA data from extinct species… but I’m not talking about that now.)

“What branching patterns might be particularly likely and/or unlikely?”

Trees have a topology and they can have branch lengths or node times. As far as the topology alone is concerned, the main observation is the phylogenetic trees are unbalanced. I have just added an image to

http://www.azimuthproject.org/azimuth/show/Tree+of+life

to show what I mean. This imbalance show up at every level down to trees with 4 leaves which is the smallest size that imbalance can occur. All the obvious mathematical models (eg a birth-death process assuming constant rates of birth and death) make more balanced trees than this.

As far as node times are concerned it will depend on context. For example, for gene trees within a species, coalescent theory

(http://en.wikipedia.org/wiki/Coalescent_theory) gives a useful model. Looking backward in time, the coalescences happen very quickly, then slow down. For a gene sampled from say 10 individuals, and assuming a constant population size, the expected time for the first coalescence is proportional to 1/(10*9), the additional time to the next coalescence is proportional to 1/(9*8), and so on down the the last pair to meet with expected time proportional to 1/(2*1). Looking forwards in time, this gives a tree that grows faster then exponential.

“In other words, that you can work only with trees whose branches all make it to the present, without any harm. I’ve been hoping to find some context in which this actually causes problems. Do you know about that?”

It can cause problems if you want to estimate dates. For example, there are 22 Crocodylla (crocodiles, alligators and gharials), and they separated from the rest of the tree of life a very long time ago (lets say 100My for the sake of argument). If you assumed there were no extinctions, you would estimate the time of the most recent common ancestor of extant Crocodylla (that is, the time of the first speciation in the phylogenetic tree for Crocodylla ) to be a very long time ago as well. Unlike the coalescence case, the expected times go like 1/21, 1/20, … 1/2, 1/1 going back in time, so – very roughly – you might estimate this time as 1/(1+1/2+…1/21) = 1/3.64 = .27 of 100My since the ancestor species separated from the rest of the tree of life, that is 73My ago. More realistically, there will have been many extinctions and quite likely there were once many more than 22 in this group. In this case, it could easily be that the most recent common ancestor of extant Crocodylla is very recent.

It also causes problems if you know some dates from fossils and want to estimate speciation and extinction rates. This article might be a good place to start.

“Estimating diversification rates from phylogenetic information”. By Ricklefs R E, Trends Ecol Evol. 2007.

Finally, ‘phylogenetics’ means estimating trees from genes and ‘phylogenomics’ means estimating trees from whole genomes.

Thanks for all that, Graham! As you noticed, I don’t even know the difference between phylogenetics and phylogenomics. Well, I do now. But I have a lot of catching up to do.

By the way, when I gave my talk at UQAM the fellow in charge of the combinatorics seminar, Franco Saliola, said he had been to a nice talk by the mathematical biologist Lior Pachter. Have you heard of him? His website says:

[…] Operads and the Tree of Life « Azimuth. […]

Thanks to budget cuts, the website with the canid phylogenetic tree is no longer available. The Internet Archive managed to grab a copy, however. You may want to update your post.

Wow, I’m surprised you caught that now—this is a pretty old post! I’ve replaced the phylogenetic tree of doggies with a different one I happen to have.

A few years ago, after hearing Susan Holmes speak about the mathematics of phylogenetic trees, I became interested in their connection to algebraic topology. I wrote an article about this here:

• John Baez, Operads and the tree of life, 6 July 2011.

In trying to the make the ideas precise I recruited the help of Nina Otter, who was then a graduate student at ETH Zürich, and who is now a grad student at Oxford working on mathematical biology with Heather Harrington.

It took us quite a while to work out all the details, and I could never have done it myself. But now we’re done! Here’s our paper:

• John Baez and Nina Otter, Operads and phylogenetic trees.