Network Theory (Part 18)

21 November, 2011

joint with Jacob Biamonte

Last time we explained how ‘reaction networks’, as used in chemistry, are just another way of talking about Petri nets. We stated an amazing result, the deficiency zero theorem, which settles quite a number of questions about chemical reactions. Now let’s illustrate it with an example.

Our example won’t show how powerful this theorem is: it’s too simple. But it’ll help explain the ideas involved.

Diatomic molecules

A diatomic molecule consists of two atoms of the same kind, stuck together:

At room temperature there are 5 elements that are diatomic gases: hydrogen, nitrogen, oxygen, fluorine, chlorine. Bromine is a diatomic liquid, but easily evaporates into a diatomic gas:

Iodine is a crystal at room temperature:

but if you heat it a bit, it becomes a diatomic liquid and then a gas:

so people often list it as a seventh member of the diatomic club.

When you heat any diatomic gas enough, it starts becoming a ‘monatomic’ gas as molecules break down into individual atoms. However, just as a diatomic molecule can break apart into two atoms:

A_2 \to A + A

two atoms can recombine to form a diatomic molecule:

A + A \to A_2

So in equilibrium, the gas will be a mixture of diatomic and monatomic forms. The exact amount of each will depend on the temperature and pressure, since these affect the likelihood that two colliding atoms stick together, or a diatomic molecule splits apart. The detailed nature of our gas also matters, of course.

But we don’t need to get into these details here! Instead, we can just write down the ‘rate equation’ for the reactions we’re talking about. All the details we’re ignoring will be hiding in some constants called ‘rate constants’. We won’t try to compute these; we’ll leave that to our chemist friends.

A reaction network

To write down our rate equation, we start by drawing a ‘reaction network’. For this, we can be a bit abstract and call the diatomic molecule B instead of A_2. Then it looks like this:

We could write down the same information using a Petri net:

But today let’s focus on the reaction network! Staring at this picture, we can read off various things:

Species. The species are the different kinds of atoms, molecules, etc. In our example the set of species is S = \{A, B\}.

Complexes. A complex is a finite sum of species, like A, or A + A, or for a fancier example using more efficient notation, 2 A + 3 B. So, we can think of a complex as a vector v \in \mathbb{R}^S. The complexes that actually show up in our reaction network form a set C \subseteq \mathbb{R}^S. In our example, C = \{A+A, B\}.

Reactions. A reaction is an arrow going from one complex to another. In our example we have two reactions: A + A \to B and B \to A + A.

Chemists define a reaction network to be a triple (S, C, T) where S is a set of species, C is the set of complexes that appear in the reactions, and T is the set of reactions v \to w where v, w \in C. (Stochastic Petri net people call reactions transitions, hence the letter T.)

So, in our example we have:

• Species S = \{A,B\}.

• Complexes: C= \{A+A, B\}.

• Reactions: T =  \{A+A\to B, B\to A+A\}.

To get the rate equation, we also need one more piece of information: a rate constant r(\tau) for each reaction \tau \in T. This is a nonnegative real number that affects how fast the reaction goes. All the details of how our particular diatomic gas behaves at a given temperature and pressure are packed into these constants!

The rate equation

The rate equation says how the expected numbers of the various species—atoms, molecules and the like—change with time. This equation is deterministic. It’s a good approximation when the numbers are large and any fluctuations in these numbers are negligible by comparison.

Here’s the general form of the rate equation:

\displaystyle{ \frac{d}{d t} x_i =  \sum_{\tau\in T} r(\tau) \, (n_i(\tau)-m_i(\tau)) \, x^{m(\tau)}  }

Let’s take a closer look. The quantity x_i is the expected population of the ith species. So, this equation tells us how that changes. But what about the right hand side? As you might expect, it’s a sum over reactions. And:

• The term for the reaction \tau is proportional to the rate constant r(\tau).

• Each reaction \tau goes between two complexes, so we can write it as m(\tau) \to n(\tau). Among chemists the input m(\tau) is called the reactant complex, and the output is called the product complex. The difference n_i(\tau)-m_i(\tau) tells us how many items of species i get created, minus how many get destroyed. So, it’s the net amount of this species that gets produced by the reaction \tau. The term for the reaction \tau is proportional to this, too.

• Finally, the law of mass action says that the rate of a reaction is proportional to the product of the concentrations of the species that enter as inputs. More precisely, if we have a reaction \tau whose input is the complex m(\tau), we define x^{m(\tau)} = x_1^{m_1(\tau)} \cdots x_k^{m_k(\tau)}. The law of mass action says the term for the reaction \tau is proportional to this, too! (The little code sketch below spells all this out.)
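If you like seeing formulas turned into code, here is a minimal Python sketch of the general rate equation. The bookkeeping here—encoding each reaction as a triple (rate constant, input complex, output complex), with complexes written as vectors of counts—is just our own choice, not standard notation:

import numpy as np

def rate_equation_rhs(x, reactions):
    # dx/dt = sum over reactions tau of r(tau) * (n(tau) - m(tau)) * x^m(tau)
    x = np.asarray(x, dtype=float)
    dxdt = np.zeros_like(x)
    for r, m, n in reactions:
        m = np.asarray(m, dtype=float)
        n = np.asarray(n, dtype=float)
        dxdt += r * (n - m) * np.prod(x ** m)
    return dxdt

# The diatomic example coming up next: species (A, B), with rate constant
# alpha for B -> A + A and beta for A + A -> B, as in the text below.
alpha, beta = 1.0, 1.0
reactions = [(beta,  [2, 0], [0, 1]),   # A + A -> B
             (alpha, [0, 1], [2, 0])]   # B -> A + A
print(rate_equation_rhs([3.0, 3.0], reactions))
# gives (-12, 6): that is, 2*alpha*x2 - 2*beta*x1**2 and -alpha*x2 + beta*x1**2 at x = (3, 3)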

Let’s see what this says for the reaction network we’re studying:

Let’s write x_1(t) for the number of A atoms and x_2(t) for the number of B molecules. Let the rate constant for the reaction B \to A + A be \alpha, and let the rate constant for A + A \to B be \beta. Then the rate equation is this:

\displaystyle{\frac{d}{d t} x_1 =  2 \alpha x_2 - 2 \beta x_1^2 }

\displaystyle{\frac{d}{d t} x_2 = -\alpha x_2 + \beta x_1^2 }

This is a bit intimidating. However, we can solve it in closed form thanks to something very precious: a conserved quantity.

We’ve got two species, A and B. But remember, B is just an abbreviation for a molecule made of two A atoms. So, the total number of A atoms is conserved by the reactions in our network. This is the number of A’s plus twice the number of B’s: x_1 + 2x_2. So, this should be a conserved quantity: it should not change with time. Indeed, by adding the first equation above to twice the second, we see:

\displaystyle{\frac{d}{d t} \left( x_1 + 2x_2 \right) = 0 }

As a consequence, any solution will stay on a line

x_1 + 2 x_2 = c

for some constant c. We can use this fact to rewrite the rate equation just in terms of x_1:

\displaystyle{ \frac{d}{d t} x_1 = \alpha (c - x_1) - 2 \beta x_1^2 }

This is a separable differential equation, so we can solve it if we can figure out how to do this integral

\displaystyle{ t = \int \frac{d x_1}{\alpha (c - x_1) - 2 \beta x_1^2 }  }

and then solve for x_1.
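If you’re curious, a computer algebra system will happily grind out this antiderivative. Here’s a minimal sympy sketch—we won’t need the explicit answer below, and we make no promises that the output is pretty:

import sympy as sp

# the integrand from the separated rate equation above
x1, alpha, beta, c = sp.symbols('x1 alpha beta c', positive=True)
antiderivative = sp.integrate(1 / (alpha*(c - x1) - 2*beta*x1**2), x1)
print(antiderivative)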

This sort of trick won’t work for more complicated examples.
But the idea remains important: the numbers of atoms of various kinds—hydrogen, helium, lithium, and so on—are conserved by chemical reactions, so a solution of the rate equation can’t roam freely in \mathbb{R}^S. It will be trapped in some lower-dimensional affine subspace, which is called a ‘stoichiometric compatibility class’. And this is very important.

We don’t feel like doing the integral required to solve our rate equation in closed form, because this idea doesn’t generalize too much. On the other hand, we can always solve the rate equation numerically. So let’s try that!

For example, suppose we set \alpha = \beta = 1. We can plot the solutions for three different choices of initial conditions, say (x_1,x_2) = (0,3), (4,0), and (3,3). We get these graphs:
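If you want to reproduce these graphs yourself, here’s a minimal scipy sketch; the time range and plotting details are just our guesses at what the figures show:

import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

alpha, beta = 1.0, 1.0

def rhs(t, x):
    # the rate equation for (x1, x2) written out above
    x1, x2 = x
    return [2*alpha*x2 - 2*beta*x1**2,
            -alpha*x2 + beta*x1**2]

t_eval = np.linspace(0.0, 5.0, 200)
for x0 in [(0.0, 3.0), (4.0, 0.0), (3.0, 3.0)]:
    sol = solve_ivp(rhs, (0.0, 5.0), x0, t_eval=t_eval)
    plt.plot(sol.t, sol.y[0], label='x1, start %s' % (x0,))
    plt.plot(sol.t, sol.y[1], '--', label='x2, start %s' % (x0,))

plt.xlabel('t')
plt.legend()
plt.show()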

It looks like the solution always approaches an equilibrium. We seem to be getting different equilibria for different initial conditions, and the pattern is a bit mysterious. However, something nice happens when we plot the ratio x_1^2 / x_2:

Apparently it always converges to 1. Why should that be? It’s not terribly surprising. With both rate constants equal to 1, the reaction A + A \to B proceeds at a rate equal to the square of the number of A’s, namely x_1^2. The reverse reaction proceeds at a rate equal to the number of B’s, namely x_2. So in equilibrium, we should have x_1^2 = x_2.

But why is the equilibrium stable? In this example we could see that using the closed-form solution, or maybe just common sense. But it also follows from a powerful theorem that handles a lot of reaction networks.

The deficiency zero theorem

It’s called Feinberg’s deficiency zero theorem, and we saw it last time. Very roughly, it says that if our reaction network is ‘weakly reversible’ and has ‘deficiency zero’, the rate equation will have equilibrium solutions that behave about as nicely as you could want.

Let’s see how this works. We need to remember some jargon:

Weakly reversible. A reaction network is weakly reversible if for every reaction v \to w in the network, there exists a path of reactions in the network starting at w and leading back to v.

Reversible. A reaction network is reversible if for every reaction v \to w in the network, w \to v is also a reaction in the network. Any reversible reaction network is weakly reversible. Our example is reversible, since it consists of reactions A + A \to B, B \to A + A.

But what about ‘deficiency zero’? We defined that concept last time, but let’s review:

Connected component. A reaction network gives a kind of graph with complexes as vertices and reactions as edges. Two complexes lie in the same connected component if we can get from one to the other by a path of reactions, where at each step we’re allowed to go either forward or backward along a reaction. Chemists call a connected component a linkage class. In our example there’s just one:

Stoichiometric subspace. The stoichiometric subspace is the subspace \mathrm{Stoch} \subseteq \mathbb{R}^S spanned by the vectors of the form w - v for all reactions v \to w in our reaction network. This subspace describes the directions in which a solution of the rate equation can move. In our example, it’s spanned by B - 2 A and 2 A - B, or if you prefer, (-2,1) and (2,-1). These vectors are linearly dependent, so the stoichiometric subspace has dimension 1.

Deficiency. The deficiency of a reaction network is the number of complexes, minus the number of connected components, minus the dimension of the stoichiometric subspace. In our example there are 2 complexes, 1 connected component, and the dimension of the stoichiometric subspace is 1. So, our reaction network has deficiency 2 - 1 - 1 = 0.

So, the deficiency zero theorem applies! What does it say? To understand it, we need a bit more jargon. First of all, a vector x \in \mathbb{R}^S tells us how much we’ve got of each species: the amount of species i \in S is the number x_i. And then:

Stoichiometric compatibility class. Given a vector v\in \mathbb{R}^S, its stoichiometric compatibility class is the subset of all vectors that we could reach using the reactions in our reaction network:

\{ v + w \; : \; w \in \mathrm{Stoch} \}

In our example, where the stoichiometric subspace is spanned by (2,-1), the stoichiometric compatibility class of the vector (v_1,v_2) is the line consisting of points

(x_1, x_2) = (v_1,v_2) + s(2,-1)

where the parameter s ranges over all real numbers. Notice that this line can also be written as

x_1 + 2x_2 = c

where c = v_1 + 2 v_2.

We’ve already seen that if we start with initial conditions on such a line, the solution will stay on this line. And that’s how it always works: as time passes, any solution of the rate equation stays in the same stoichiometric compatibility class!

In other words: the stoichiometric subspace is defined by a bunch of linear equations, one for each linear conservation law that all the reactions in our network obey.

Here a linear conservation law is a law saying that some linear combination of the numbers of species does not change.

Next:

Positivity. A vector in \mathbb{R}^S is positive if all its components are positive; this describes a container of chemicals where all the species are actually present. The positive stoichiometric compatibility class of x\in \mathbb{R}^S consists of all positive vectors in its stoichiometric compatibility class.

We finally have enough jargon in our arsenal to state the zero deficiency theorem. We’ll only state the part we need today:

Zero Deficiency Theorem (Feinberg). If a reaction network has deficiency zero and is weakly reversible, and the rate constants are positive, the rate equation has exactly one equilibrium solution in each positive stoichiometric compatibility class. Any sufficiently nearby solution that starts in the same class will approach this equilibrium as t \to +\infty.

In our example, this theorem says there’s just one positive
equilibrium (x_1,x_2) in each line

x_1 + 2x_2 = c

We can find it by setting the time derivatives to zero:

\displaystyle{ \frac{d}{d t}x_1 =  2 \alpha x_2 - 2 \beta x_1^2 = 0 }

\displaystyle{ \frac{d}{d t} x_2 = -\alpha x_2 + \beta x_1^2 = 0 }

Solving these, we get

\displaystyle{ \frac{x_1^2}{x_2} = \frac{\alpha}{\beta} }

So, these are our equilibrium solutions. It’s easy to verify that indeed, there’s one of these in each stoichiometric compatibility class x_1 + 2x_2 = c. And the zero deficiency theorem also tells us that any sufficiently nearby solution that starts in the same class will approach this equilibrium as t \to \infty.
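In fact it’s easy to make this explicit. Substituting x_2 = (c - x_1)/2 into \beta x_1^2 = \alpha x_2 gives

2 \beta x_1^2 + \alpha x_1 - \alpha c = 0

so on the line x_1 + 2x_2 = c the equilibrium has

\displaystyle{ x_1 = \frac{-\alpha + \sqrt{\alpha^2 + 8 \alpha \beta c}}{4 \beta} }

and this lies strictly between 0 and c whenever c > 0, so x_2 = (c - x_1)/2 is positive too: each positive stoichiometric compatibility class really does contain exactly one positive equilibrium.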

This partially explains what we saw before in our graphs. It shows that in the case \alpha = \beta = 1, any solution that starts by nearly having

\displaystyle{\frac{x_1^2}{x_2} = 1}

will actually have

\displaystyle{\lim_{t \to +\infty} \frac{x_1^2}{x_2} = 1 }

But in fact, in this example we don’t even need to start near the equilibrium for our solution to approach the equilibrium! What about in general? We don’t know, but just to get the ball rolling, we’ll risk the following wild guess:

Conjecture. If a reaction network has deficiency zero and is weakly reversible, and the rate constants are positive, the rate equation has exactly one equilibrium solution in each positive stoichiometric compatibility class, and any positive solution that starts in the same class will approach this equilibrium as t \to +\infty.

If anyone knows a proof or counterexample, we’d be interested. If this result were true, it would really clarify the dynamics of reaction networks in the zero deficiency case.

Next time we’ll talk about this same reaction network from a stochastic point of view, where we think of the atoms and molecules as reacting in a probabilistic way. And we’ll see how the conservation laws we’ve been talking about today are related to Noether’s theorem for Markov processes!


Wild Cats of Sumatra

17 November, 2011

Sumatra is just half an hour from here. I’ve never visited it, but I’m awfully curious. So, I was excited to hear today that the World Wildlife Fund put cameras in the forest there and caught pictures of 5 species of wild cats! You can see them here:

• World Wildlife Fund, Remarkable images of big cats urge forest protection.

I’ve been curious about the smaller wild cats of Asia ever since I met this absurdly sweet thing which belongs to a friend of mine named Julia Strauss:


Julia lives in London, but she went all the way to Wales to buy this cat. Why? Because it’s a Bengal. That means it’s a crossbreed of an ordinary domestic cat with a leopard cat!

The leopard cat, Prionailurus bengalensis, is the most widespread of the Asian small cats. It has a huge range, from the Amur region in the Russian Far East through Korea, China, Indochina, India… all the way to Pakistan in the west… and to the Philippines and some islands in Indonesia in the south. It’s listed as ‘least concern’ when it comes to extinction risk.

So, it’s not surprising that leopard cats are one of the kinds the WWF saw in Sumatra. Here’s one in the Berlin Zoo, photographed by F. Spangenberg:

Ain’t it cute? It’s about the size of a domestic cat, but it has a different number of chromosomes than Felis domesticus, so it’s a bit remarkable that they can interbreed. The resulting Bengals share some traits with the leopard cat: for example, leopard cats like to fish, and Bengals like to play around in their water bowls!

Another cat the WWF saw in Sumatra is the marbled cat, Pardofelis marmorata. It’s again about the same size as a house cat, but it likes to hunt while climbing around in trees!

This feisty fellow is listed as ‘vulnerable’—there are probably about 10,000 of them in the world, not counting kittens. They live from the Himalayan foothills westward into Nepal and eastward into southwest China, and also on Sumatra and Borneo.

Then there’s the Asian golden cat, Pardofelis temminckii. These guys are two to three times as big as a domestic cat! I saw one at the Night Safari in Singapore—a kind of zoo for nocturnal animals. Here’s a picture of one taken by Karen Stout:

They live all the way from Tibet, Nepal, and India to Thailand, Cambodia, Laos, and Vietnam, to down here around Malaysia and Sumatra. However, they’re listed as ‘near threatened’, due to hunting and habitat loss. They’re hunted for the illegal wildlife trade, and some people kill them for eating poultry—and also supposedly sheep, goats and buffalo calves.

Moving further up the size ladder, we meet the Sunda clouded leopard, Neofelis diardi. Here’s a great photo by ‘spencer77’:

These guys are special! They only live on Borneo and Sumatra, they’re listed as “vulnerable”, and scientists only realized they’re a separate species in 2006! Before that, people thought they were the same as the ordinary kind of clouded leopard, Neofelis nebulosa. But genetic testing showed that they diverged from that species about 1.4 million years ago, after having crossed a now submerged land bridge to reach Borneo and Sumatra.

I’ve seen the ordinary kind of clouded leopard at the Night Safari. But calling them ‘ordinary’ is not really fair: they’re beautiful, mysterious, well-camouflaged beasts—very hard to see even if you know they’re right in front of you! Indeed, very little is known about either kind of clouded leopard, because they’re so elusive and reclusive.

And finally, the biggest kitty on the island: the Sumatran tiger, Panthera tigris sumatrae! It’s a subspecies of tiger that only lives on Sumatra. It’s listed as “critically endangered”. The World Wildlife Fund estimates there are fewer than 500 of these tigers left in the wild—maybe a lot fewer.

This beauty was photographed in the Berlin Zoo by ‘Captain Herbert’:

Sumatran tigers have webbing between their toes, which makes them really good swimmers! They get up to 2.5 meters long, but they’re the smallest of the tigers, as you might expect from a species on a hot tropical island. (The biggest is the Siberian tiger, which I talked about earlier.)

There are lots of palm oil plantations in Sumatra. People burn down the jungle to plant palms, and the smoke sometimes creates a thick smelly haze even here in Singapore. It’s horrible. This deforestation is the main threat to the Sumatran Tiger. Also, many tigers are killed every year by poachers.

On the bright side, in 2006 the Indonesia Forestry Service, the Natural Resources and Conservational Agency, and the Sumatran Tiger Conservation Program sat down with companies including Asia Pulp & Paper and set up the Senepis Buluhala Tiger Sanctuary, which is 106,000 hectares in size. There’s also a large area for tigers called the Tambling Wildlife Nature Conservation on the southern tip of Sumatra. And the Australia Zoo has a program of reintroducing tigers to their natural habitat in Sumatra.

Okay, that’s it for now. I’ve got you all softened up, and I’m not even going to ask you for donations. Just be nice to cats, okay? Or better yet, be nice to life in general. We’re all in this together.


Eskimo Words for Snow

15 November, 2011

As I was reading about global warming and its effect on the Arctic and the people who live there, I couldn’t help bumping into some words in West Greenlandic. This is the main Inuit language spoken in Greenland. The people who actually speak it call it ‘Kalaallisut’. In June 2009 it was made the official language of the Greenlandic autonomous territory.

For example, I read about Sermersuaq. This is the Northern Hemisphere’s widest ‘tidewater glacier’: one that begins on land but terminates in water. It stretches 90 kilometers across!

This glacier is also called the Humboldt Glacier, but with all due respect to Humboldt, I’d rather call this magnificent, intensely forbidding realm by the name used by the people who can manage to live there! And so, I’d rather use the Kalaallisut word: Sermersuaq.

To see why Sermersuaq is a big deal, check out these photos taken by Nick Cobbing of Greenpeace:


 

Here’s a photo NASA took of Sermersuaq in 2000:


And here’s a photo they took in 2008, with the 2000 terminus again marked:


It lost 175 square kilometers of ice during that time.

Anyway, after seeing words like Sermersuaq, Kangerdlugssuaq, and so on, I started wondering about a famous urban legend.

You’ve probably heard that the Eskimos have lots of words for snow. And maybe you’ve heard other people say “no, that’s not true”.

But the whole dispute starts seeming rather silly when you find out that the Eskimo — or more precisely, the speakers of Kalaallisut — have a word for “once again they tried to build a giant radio station, but it was apparently only on the drawing board.” It’s

nalunaarasuartaatilioqateeraliorfinnialikkersaatiginialikkersaatilillaranatagoorunarsuarooq.

When I learned this, I decided I wanted to learn a bit more about Kalaallisut!

For starters, Kalaallisut is just one of several closely related Inuit languages spoken in Greenland and Canada. Here is a map showing these languages:


In my attempts to learn more, I bumped into this piece:

• Mick Mallon, Inuktitut Linguistics for Technocrats, Ittukuluuk Language Programs, Iqaluit, 2000.

It’s about Inuktitut, which is the collective name for a group of Inuit languages spoken in Eastern Canada. You can see them on the map: they’re called Qikiqtaaluk nigiani, Nunavimmiutitut, and Nunatsiavummiutut.

This language is very different from English: it’s polysynthetic, meaning that words can be composed of many pieces.

For example, verbs can be singular, dual, or plural:

takujunga — I see
takujuguk — we two see
takujugut — we several see

But instead of using words like “because”, “if” or “whether”, they use different suffixes:

takugama — because I see
takugunnuk — if we two see
takungmangaatta — whether we several see

The object of the verb can be attached as a suffix:

takujagit — I see you
takujara — I see him
takugakku — because I see him

There are also suffixes that turn verbs to nouns, and suffixes that turn nouns to verbs… and you can even use both in a single complicated word!

There are also ways to indicate whether something is stationary or moving, expected or unexpected:

tavva! — Here it is! (stationary and expected)
avva! — There it is over there! (mobile and unexpected)

There are ways to add spatial information:

tavvani — at this (expected) spot
maangat — from this (unexpected) area
tappaunga — to that (expected) area up there
kanuuna — through that (unexpected) spot down there

And all this is just scratching the surface! Words can easily become huge—and in a typical written text, only a minority of words are ever repeated.

Whether despite this or because of it, I think we must admit that the Inuit do have a marvelously terse way of describing lots of concepts related to snow and ice. For example, here’s a word list taken from Fortescue’s text on Kalaallisut:

• ‘sea-ice’ — siku (in plural = drift ice)
• ‘pack-ice/large expanses of ice in motion’ — sikursuit, pl. (compacted drift ice/ice field = sikut iqimaniri)
• ‘new ice’ — sikuliaq/sikurlaaq (solid ice cover = nutaaq)
• ‘thin ice’ — sikuaq (in plural = thin ice floes)
• ‘rotten (melting) ice floe’ — sikurluk
• ‘iceberg’ — iluliaq (ilulisap itsirnga = part of iceberg below waterline)
• ‘(piece of) fresh-water ice’ — nilak
• ‘lumps of ice stranded on the beach’ — issinnirit, pl.
• ‘glacier’ (also ice forming on objects) — sirmiq (sirmirsuaq = inland ice)
• ‘snow blown in (e.g. doorway)’ — sullarniq
• ‘rime/hoar-frost’ — qaqurnak/kanirniq/kaniq
• ‘frost (on inner surface of e.g. window)’ — iluq
• ‘icy mist’ — pujurak/pujuq kanirnartuq
• ‘hail’ — nataqqurnat
• ‘snow (on ground)’ — aput (aput sisurtuq = avalanche)
• ‘slush (on ground)’ — aput masannartuq
• ‘snow in air/falling’ — qaniit (qanik = snowflake)
• ‘air thick with snow’ — nittaalaq (nittaallat, pl. = snowflakes; nittaalaq nalliuttiqattaartuq = flurries)
• ‘hard grains of snow’ — nittaalaaqqat, pl.
• ‘feathery clumps of falling snow’ — qanipalaat
• ‘new fallen snow’ — apirlaat
• ‘snow crust’ — pukak
• ‘snowy weather’ — qannirsuq/nittaatsuq
• ‘snowstorm’ — pirsuq/pirsirsursuaq
• ‘large ice floe’ — iluitsuq
• ‘snowdrift’ — apusiniq
• ‘ice floe’ — puttaaq
• ‘hummocked ice/pressure ridges in pack ice’ — maniillat/ingunirit, pl.
• ‘drifting lump of ice’ — kassuq (dirty lump of glacier-calved ice = anarluk)
• ‘ice-foot (left adhering to shore)’ — qaannuq
• ‘icicle’ — kusugaq
• ‘opening in sea ice’ — imarnirsaq/ammaniq (open water amidst ice = imaviaq)
• ‘lead (navigable fissure) in sea ice’ — quppaq
• ‘rotten snow/slush on sea’ — qinuq
• ‘wet snow falling’ — imalik
• ‘rotten ice with streams forming’ — aakkarniq
• ‘snow patch (on mountain, etc.)’ — aputitaq
• ‘wet snow on top of ice’ — putsinniq/puvvinniq
• ‘smooth stretch of ice’ — manirak (stretch of snow-free ice = quasaliaq)
• ‘lump of old ice frozen into new ice’ — tuaq
• ‘new ice formed in crack in old ice’ — nutarniq
• ‘bits of floating ice’ — naggutit, pl.
• ‘hard snow’ — mangiggal/mangikaajaaq
• ‘small ice floe (not large enough to stand on)’ — masaaraq
• ‘ice swelling over partially frozen river, etc. from water seeping up to the surface’ — siirsinniq
• ‘piled-up ice-floes frozen together’ — tiggunnirit
• ‘mountain peak sticking up through inland ice’ — nunataq
• ‘calved ice (from end of glacier)’ — uukkarnit
• ‘edge of the (sea) ice’ — sinaaq

Pretty cool, eh?

I made some music and art based on these glacial themes. You can see and hear it here.

If you like this icy ambient music — which is, I readily admit, not exactly dripping with catchy riffs— you’ll love these classic albums by Thomas Köner, which are even more minimal and chilly:

Nunatak.
Teimo.
Permafrost.

You can hear them for free now!


Network Theory (Part 17)

12 November, 2011

joint with Jacob Biamonte

We’ve seen how Petri nets can be used to describe chemical reactions. Indeed our very first example came from chemistry:

However, chemists rarely use the formalism of Petri nets. They use a different but entirely equivalent formalism, called ‘reaction networks’. So now we’d like to tell you about those.

You may wonder: why bother with another formalism, if it’s equivalent to the one we’ve already seen? Well, one goal of this network theory program is to get people from different subjects to talk to each other—or at least be able to. This requires setting up some dictionaries to translate between formalisms. Furthermore, lots of deep results on stochastic Petri nets are being proved by chemists—but phrased in terms of reaction networks. So you need to learn this other formalism to read their papers. Finally, this other formalism is actually better in some ways!

Reaction networks

Here’s a reaction network:

This network involves 5 species: that is, different kinds of things. They could be atoms, molecules, ions or whatever: chemists call all of these species, and there’s no need to limit the applications to chemistry: in population biology, they could even be biological species! We’re calling them A, B, C, D, and E, but in applications, we’d call them by specific names like CO2 and HCO3, or ‘rabbit’ and ‘wolf’, or whatever.

This network also involves 5 reactions, which are shown as arrows. Each reaction turns one bunch of species into another. So, written out more longwindedly, we’ve got these reactions:

A \to B

B \to A

A + C \to D

B + E \to A + C

B + E \to D

If you remember how Petri nets work, you can see how to translate any reaction network into a Petri net, or vice versa. For example, the reaction network we’ve just seen gives this Petri net:

Each species corresponds to a state of this Petri net, drawn as a yellow circle. And each reaction corresponds to a transition of this Petri net, drawn as a blue square. The arrows say how many things of each species appear as input or output to each transition. There’s less explicit emphasis on complexes in the Petri net notation, but you can read them off if you want them.

In chemistry, a bunch of species is called a ‘complex’. But what do I mean by ‘bunch’, exactly? Well, I mean that in a given complex, each species can show up 0,1,2,3… or any natural number of times. So, we can formalize things like this:

Definition. Given a set S of species, a complex of those species is a function C: S \to \mathbb{N}.

Roughly speaking, a reaction network is a graph whose vertices are labelled by complexes. Unfortunately, the word ‘graph’ means different things in mathematics—appallingly many things! Everyone agrees that a graph has vertices and edges, but there are lots of choices about the details. Most notably:

• We can either put arrows on the edges, or not.

• We can either allow more than one edge between vertices, or not.

• We can either allow edges from a vertex to itself, or not.

If we say ‘no’ in every case we get the concept of ‘simple graph’, which we discussed last time. At the other extreme, if we say ‘yes’ in every case we get the concept of ‘directed multigraph’, which is what we want now. A bit more formally:

Definition. A directed multigraph consists of a set V of vertices, a set E of edges, and functions s,t: E \to V saying the source and target of each edge.

Given this, we can say:

Definition. A reaction network is a set of species together with a directed multigraph whose vertices are labelled by complexes of those species.

You can now prove that reaction networks are equivalent to Petri nets:

Puzzle 1. Show that any reaction network gives a Petri net, and vice versa.

In a stochastic Petri net each transition is labelled by a rate constant: that is, a number in [0,\infty). This lets us write down some differential equations saying how species turn into each other. So, let’s make this definition (which is not standard, but will clarify things for us):

Definition. A stochastic reaction network is a reaction network where each reaction is labelled by a rate constant.
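Here’s one minimal way to encode a stochastic reaction network in Python—our own ad hoc encoding, nothing standard—just to make the definition concrete:

from dataclasses import dataclass
from typing import Dict, List, Tuple

Complex = Dict[str, int]    # a complex: species name -> natural number

@dataclass
class StochasticReactionNetwork:
    species: List[str]
    # each reaction: (input complex, output complex, rate constant)
    reactions: List[Tuple[Complex, Complex, float]]

# the example network above, with made-up rate constants:
net = StochasticReactionNetwork(
    species=['A', 'B', 'C', 'D', 'E'],
    reactions=[
        ({'A': 1}, {'B': 1}, 1.0),                  # A -> B
        ({'B': 1}, {'A': 1}, 1.0),                  # B -> A
        ({'A': 1, 'C': 1}, {'D': 1}, 1.0),          # A + C -> D
        ({'B': 1, 'E': 1}, {'A': 1, 'C': 1}, 1.0),  # B + E -> A + C
        ({'B': 1, 'E': 1}, {'D': 1}, 1.0),          # B + E -> D
    ],
)

The set of complexes is implicit here: it consists of all the dicts that appear as inputs or outputs. Translating this into a stochastic Petri net is just a matter of reading each dict as the multiplicities of arrows into or out of a transition.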

Now you can do this:

Puzzle 2. Show that any stochastic reaction network gives a stochastic Petri net, and vice versa.

For extra credit, show that in each of these puzzles we actually get an equivalence of categories! For this you need to define morphisms between Petri nets, morphisms between reaction networks, and similarly for stochastic Petri nets and stochastic reaction networks. If you get stuck, ask Eugene Lerman for advice. There are different ways to define morphisms, but he knows a cool one.

We’ve been downplaying category theory so far, but it’s been lurking beneath everything we do, and someday it may rise to the surface.

The deficiency zero theorem

You may have already noticed one advantage of reaction networks over Petri nets: they’re quicker to draw. This is true even for tiny examples. For example, this reaction network:

2 X_1 + X_2 \leftrightarrow 2 X_3

corresponds to this Petri net:

But there’s also a deeper advantage. As we saw in Part 8, any stochastic Petri net gives two equations:

• The master equation, which says how the probability that we have a given number of things of each species changes with time.

• The rate equation, which says how the expected number of things in each state changes with time.

The simplest solutions of these equations are the equilibrium solutions, where nothing depends on time. Back in Part 9, we explained when an equilibrium solution of the rate equation gives an equilibrium solution of the master equation. But when is there an equilibrium solution of the rate equation in the first place?

Feinberg’s ‘deficiency zero theorem’ gives a handy sufficient condition. And this condition is best stated using reaction networks! But to understand it, we need to understand the ‘deficiency’ of a reaction network. So let’s define that, and then say what all the words in the definition mean:

Definition. The deficiency of a reaction network is:

• the number of vertices

minus

• the number of connected components

minus

• the dimension of the stoichiometric subspace.

The first two concepts here are easy. A reaction network is a graph (okay, a directed multigraph). So, it has some number of vertices, and also some number of connected components. Two vertices lie in the same connected component iff you can get from one to the other by a path where you don’t care which way the arrows point. For example, this reaction network:

has 5 vertices and 2 connected components.

So, what’s the ‘stoichiometric subspace’? ‘Stoichiometry’ is a scary-sounding word. According to the Wikipedia article:

‘Stoichiometry’ is derived from the Greek words στοιχεῖον (stoicheion, meaning element) and μέτρον (metron, meaning measure.) In patristic Greek, the word Stoichiometria was used by Nicephorus to refer to the number of line counts of the canonical New Testament and some of the Apocrypha.

But for us, stoichiometry is just the art of counting species. To do this, we can form a vector space \mathbb{R}^S where S is the set of species. A vector in \mathbb{R}^S is a function from species to real numbers, saying how much of each species is present. Any complex gives a vector in \mathbb{R}^S, because it’s actually a function from species to natural numbers.

Definition. The stoichiometric subspace of a reaction network is the subspace \mathrm{Stoch} \subseteq \mathbb{R}^S spanned by vectors of the form x - y where x and y are complexes connected by a reaction.

‘Complexes connected by a reaction’ makes sense because vertices in the reaction network are complexes, and edges are reactions. Let’s see how it works in our example:

Each complex here can be seen as a vector in \mathbb{R}^S, which is a vector space whose basis we can call A, B, C, D, E. Each reaction gives a difference of two vectors coming from complexes:

• The reaction A \to B gives the vector B - A.

• The reaction B \to A gives the vector A - B.

• The reaction A + C \to D gives the vector D - A - C.

• The reaction B + E \to A + C gives the vector A + C - B - E.

• The reaction B + E \to D gives the vector D - B - E.

The pattern is obvious, I hope.

These 5 vectors span the stoichiometric subspace. But this subspace isn’t 5-dimensional, because these vectors are linearly dependent! The first vector is the negative of the second one. The last is the sum of the previous two. And those are all the linear dependencies, so the stoichiometric subspace is 3 dimensional. For example, it’s spanned by these 3 linearly independent vectors: A - B, D - A - C, and D - B - E.
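If you want to let a computer check that, here’s a quick numpy calculation, with the basis ordered as A, B, C, D, E:

import numpy as np

differences = np.array([
    [-1,  1,  0,  0,  0],   # B - A
    [ 1, -1,  0,  0,  0],   # A - B
    [-1,  0, -1,  1,  0],   # D - A - C
    [ 1, -1,  1,  0, -1],   # A + C - B - E
    [ 0, -1,  0,  1, -1],   # D - B - E
])
print(np.linalg.matrix_rank(differences))   # prints 3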

I hope you see the moral of this example: the stoichiometric subspace is the space of ways to move in \mathbb{R}^S that are allowed by the reactions in our reaction network! And this is important because the rate equation describes how the amount of each species changes as time passes. So, it describes a point moving around in \mathbb{R}^S.

Thus, if \mathrm{Stoch} \subseteq \mathbb{R}^S is the stoichiometric subspace, and x(t) \in \mathbb{R}^S is a solution of the rate equation, then x(t) always stays within the set

x(0) + \mathrm{Stoch} = \{ x(0) + y \colon \; y \in \mathrm{Stoch} \}

Mathematicians would call this set the coset of x(0), but chemists call it the stoichiometric compatibility class of x(0).

Anyway: what’s the deficiency of the reaction network in our example? It’s

5 - 2 - 3 = 0

since there are 5 complexes, 2 connected components and the dimension of the stoichiometric subspace is 3.
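Since the recipe is completely mechanical, here’s a little Python sketch that computes the deficiency straight from the definition. The encoding of a reaction as a pair of dicts (input complex, output complex) is the same made-up one as before:

import numpy as np

def deficiency(species, reactions):
    # the complexes are the distinct inputs and outputs of the reactions
    complexes = []
    for v, w in reactions:
        for c in (v, w):
            if c not in complexes:
                complexes.append(c)
    key = lambda c: tuple(sorted(c.items()))
    index = {key(c): i for i, c in enumerate(complexes)}

    # connected components, ignoring the direction of the arrows (union-find)
    parent = list(range(len(complexes)))
    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i
    for v, w in reactions:
        parent[find(index[key(v)])] = find(index[key(w)])
    components = len({find(i) for i in range(len(complexes))})

    # dimension of the stoichiometric subspace = rank of the difference vectors
    vec = lambda c: np.array([c.get(s, 0) for s in species], dtype=float)
    diffs = np.array([vec(w) - vec(v) for v, w in reactions])
    dim = np.linalg.matrix_rank(diffs)

    return len(complexes) - components - dim

reactions = [({'A': 1}, {'B': 1}),
             ({'B': 1}, {'A': 1}),
             ({'A': 1, 'C': 1}, {'D': 1}),
             ({'B': 1, 'E': 1}, {'A': 1, 'C': 1}),
             ({'B': 1, 'E': 1}, {'D': 1})]
print(deficiency(['A', 'B', 'C', 'D', 'E'], reactions))   # prints 0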

But what’s the deficiency zero theorem? You’re almost ready for it. You just need to know one more piece of jargon! A reaction network is weakly reversible if whenever there’s a reaction going from a complex x to a complex y, there’s a path of reactions going back from y to x. Here the paths need to follow the arrows.

So, this reaction network is not weakly reversible:

since we can get from A + C to D but not back from D to A + C, and from B+E to D but not back, and so on. However, the network becomes weakly reversible if we add a reaction going back from D to B + E:

If a reaction network isn’t weakly reversible, one complex can turn into another, but not vice versa. In this situation, what typically happens is that as time goes on we have less and less of one species. We could have an equilibrium where there’s none of this species. But we have little right to expect an equilibrium solution of the rate equation that’s positive, meaning that it sits at a point x \in (0,\infty)^S, where there’s a nonzero amount of every species.

My argument here is not watertight: you’ll note that I fudged the difference between species and complexes. But it can be made so when our reaction network has deficiency zero:

Deficiency Zero Theorem (Feinberg). Suppose we are given a reaction network with a finite set of species S, and suppose its deficiency is zero. Then:

(i) If the network is not weakly reversible and the rate constants are positive, the rate equation does not have a positive equilibrium solution.

(ii) If the network is not weakly reversible and the rate constants are positive, the rate equation does not have a positive periodic solution, that is, a periodic solution lying in (0,\infty)^S.

(iii) If the network is weakly reversible and the rate constants are positive, the rate equation has exactly one equilibrium solution in each positive stoichiometric compatibility class.
Any sufficiently nearby solution that starts in the same stoichiometric compatibility class will approach this equilibrium as t \to +\infty. Furthermore, there are no other positive periodic solutions.

This is quite an impressive result. We’ll look at an easy example next time.

References and remarks

The deficiency zero theorem was published here:

• Martin Feinberg, Chemical reaction network structure and the stability of complex isothermal reactors: I. The deficiency zero and deficiency one theorems, Chemical Engineering Science 42 (1987), 2229–2268.

These other explanations are also very helpful:

• Martin Feinberg, Lectures on reaction networks, 1979.

• Jeremy Gunawardena, Chemical reaction network theory for in-silico biologists, 2003.

At first glance the deficiency zero theorem might seem to settle all the basic questions about the dynamics of reaction networks, or stochastic Petri nets… but actually, it just means that deficiency zero reaction networks don’t display very interesting dynamics in the limit as t \to +\infty. So, to get more interesting behavior, we need to look at reaction networks that don’t have deficiency zero.

For example, in biology it’s interesting to think about ‘bistable’ chemical reactions: reactions that have two stable equilibria. An electrical switch of the usual sort is a bistable system: it has stable ‘on’ and ‘off’ positions. A bistable chemical reaction can serve as a kind of biological switch:

• Gheorghe Craciun, Yangzhong Tang and Martin Feinberg, Understanding bistability in complex enzyme-driven reaction networks, PNAS 103 (2006), 8697-8702.

It’s also interesting to think about chemical reactions with stable periodic solutions. Such a reaction can serve as a biological clock:

• Daniel B. Forger, Signal processing in cellular clocks, PNAS 108 (2011), 4281-4285.


Azimuth on Google Plus (Part 4)

11 November, 2011

Again, some eye candy to start the show. Stare fixedly at the + sign here until the pink dots completely disappear:

In a semiconductor, a ‘hole’ is the absence of an electron, and it can move around as if it were a particle. If you have a hole moving to the right, in reality you have electrons moving to the left. Here pink dots moving counterclockwise look like a green dot moving clockwise!

A related puzzle: what happens when you hold a helium balloon on a string while you’re driving in a car with the windows closed… and then you make a sharp right turn? I’ve done it, so I know from experience.

Now for the real stuff:

• Tom Murphy, a physics professor at U.C. San Diego, has a blog worth visiting: Do the Math. He uses physics and math to make informed guesses about the future of energy production. Try out his overview on ‘peak oil’.

• Hundreds of top conservation scientists took a survey, and 99.5% felt that a serious loss of biodiversity is either ‘likely’, ‘very likely’, or ‘virtually certain’. Tropical coral ecosystems were perceived as the most seriously affected. A slim majority think we need to decide on rules for ‘triage’: deciding which species to save and which to give up on.

• Climate change is causing a massive change in tree species across Western USA. “Ecosystems are always changing at the landscape level, but normally the rate of change is too slow for humans to notice,” said Steven Running, a co-author of a study on this at the University of Montana. “Now the rate of change is fast enough we can see it.” The study used remote sensing of large areas over a four-year period.

• The James Dyson Award calls on design and engineering students to create innovative, practical, elegant solutions to the challenges that face us. This year, Edward Linacre won for a self-powering device that extracts water from the air for irrigation purposes. Linacre comes from the drought-afflicted continent of Australia. But his invention borrows some tricks from the Namib beetle, which survives some of the driest deserts in Africa by harvesting the moisture that condenses on its back during the early morning. That’s called biomimicry.


• The New York Times has a great profile of Jeremy Grantham. He heads a successful firm managing $100 billion in assets, and now he’s 72. So why is he saying this?

… it’s very important to me to make a lot of money now, much more than when I was 40 or 50.

Not because he has a brand new gold-digger ‘trophy wife’ or spendthrift heirs. No, he puts all the money into the Grantham Foundation for the Protection of the Environment. He’s famous for his quarterly letters on future trends—you can read them free online! And thanks to this, he has some detailed ideas about what’s coming up, and what we should do about it:

Energy “will give us serious and sustained problems” over the next 50 years as we make the transition from hydrocarbons—oil, coal, gas—to solar, wind, nuclear and other sources, but we’ll muddle through to a solution to Peak Oil and related challenges. Peak Everything Else will prove more intractable for humanity. Metals, for instance, “are entropy at work . . . from wonderful metal ores to scattered waste,” and scarcity and higher prices “will slowly increase forever,” but if we scrimp and recycle, we can make do for another century before tight constraint kicks in.

Agriculture is more worrisome. Local water shortages will cause “persistent irritation”—wars, famines. Of the three essential macro nutrient fertilizers, nitrogen is relatively plentiful and recoverable, but we’re running out of potassium and phosphorus, finite mined resources that are “necessary for all life.” Canada has large reserves of potash (the source of potassium), which is good news for Americans, but 50 to 75 percent of the known reserves of phosphate (the source of phosphorus) are located in Morocco and the western Sahara. Assuming a 2 percent annual increase in phosphorus consumption, Grantham believes the rest of the world’s reserves won’t last more than 50 years, so he expects “gamesmanship” from the phosphate-rich.

And he rates soil erosion as the biggest threat of all. The world’s population could reach 10 billion within half a century—perhaps twice as many human beings as the planet’s overtaxed resources can sustainably support, perhaps six times too many.

It’s not that he doesn’t take climate change seriously. However, he seems to have almost given up on the US political establishment doing anything about it. So he’s shifted his focus:

Grantham put his own influence and money behind the climate-change bill passed by the House in 2009. “But even $100 million wouldn’t have gotten it through the Senate,” he said. “The recession more or less ruled it out. It pushed anything having to do with the environment down 10 points, across the board. Unemployment and interest in environmental issues move inversely.”

Having missed a once-in-a-generation legislative opportunity to address climate change, American environmentalists are looking for new strategies. Grantham believes that the best approach may be to recast global warming, which depresses crop yields and worsens soil erosion, as a factor contributing to resource depletion. “People are naturally much more responsive to finite resources than they are to climate change,” he said. “Global warming is bad news. Finite resources is investment advice.” He believes this shift in emphasis plays to Americans’ strength. “Americans are just about the worst at dealing with long-term problems, down there with Uzbekistan,” he said, “but they respond to a market signal better than almost anyone. They roll the dice bigger and quicker than most.”

Let’s wrap up with some more fun stuff: impressive volcanos!

Morgan Abbou explains:

Volcanic lightning photograph by Francisco Negroni. In a scene no human could have witnessed, an apocalyptic agglomeration of lightning bolts illuminates an ash cloud above Chile’s Puyehue volcano in June 2011. The minutes-long exposure shows individual bolts as if they’d all occurred at the same moment and, due to the Earth’s rotation, renders stars (left) as streaks. Lightning to the right of the ash cloud appears to have illuminated nearby clouds, hence the apparent absence of stars on that side of the picture. After an ominous series of earthquakes on the previous day, the volcano erupted that afternoon, convincing authorities to evacuate some 3,500 area residents. Eruptions over the course of the weekend resulted in heavy ashfalls, including in Argentine towns 60 miles (a hundred kilometers) away.

Here’s another shot of the same volcano:

And here’s Mount Etna blowing out a smoke ring in March of 2000. By its shadow, this ring was estimated to be 200 meters in diameter!


Measuring Biodiversity

7 November, 2011

guest post by Tom Leinster

Even if there weren’t a global biodiversity crisis, we’d want to know how to put a number on biodiversity. As Lord Kelvin famously put it:

When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, your knowledge is of a meagre and unsatisfactory kind: it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science.

In this post, I’ll talk about what happens when you take a mass of biological data and try to turn it into a single number, intended to measure biodiversity.

There have been more than 50 years of debate about how to measure diversity. While the idea of putting a number on biological diversity goes back to the 1940s at least, the debate really seems to have got going in the wake of pioneering work by the great ecologist Robert Whittaker in the 1960s.

There followed several decades in which progress was made… but there was a lot of talking at cross-purposes. In fact, there was so much confusion that some people gave up on the diversity concept altogether. The mood is summed up by the title of an excellent and much-cited paper of Stuart Hurlbert:

• S. H. Hurlbert, The nonconcept of species diversity: A critique and alternative parameters. Ecology 52:577–586, 1971.

So why all the confusion?

One reason is that the word “diversity” is used by different people in many different ways. We all know that diversity is important: so if you found a quantity that seemed to measure biological variation in a sensible way, you might be tempted to call it “diversity” and publish a paper promoting your quantity over all other quantities that have ever been given that name. There are literally dozens of measures of diversity in the literature. Here are two simple ones:

  • Species richness is simply the number of species in the community concerned.
  • The Shannon entropy is -\sum_{i = 1}^S p_i \log(p_i), where our community consists of S species in proportions p_1, \ldots, p_S.

Which quantity should we call “diversity”? Do all these quantities really measure the same kind of thing? If community A has greater species richness than community B, but lower Shannon entropy, what does it mean?

Another cause for confusion is a blurring between the questions

Which quantities deserve to be called diversity?

and

Which quantities are we capable of measuring experimentally?

For example, we might all agree that species richness is an important quantity, but that doesn’t mean that species richness is easy to measure in practice. (In fact, it’s not, more on which below.) My own view is that the two questions should be kept separate:

The statistical problem of designing appropriate estimators becomes relevant only after the measure to be estimated is accepted to be meaningful.

(Hans-Rolf Gregorius, Elizabeth M. Gillet, Generalized Simpson-diversity, Ecological Modelling 211:90–96, 2008.)

The problems involved in quantifying diversity are of three types: practical, statistical and conceptual. I’ll say a little about the first two, and rather more about the third.

Practical  Suppose that you’re doing a survey of the vertebrates in a forest. Perhaps one important species is brightly coloured and noisy, while another is silent, shy, and well-camouflaged. How do you prevent the first from being recorded disproportionately?

Or suppose that you’re carrying out a survey, with multiple people doing the fieldwork. Different people have a tendency to spot different things: for example, one person might be short-sighted and another long-sighted. How do you ensure that this doesn’t affect your results?

Statistical  Imagine that you want to know how many distinct species of insect live in a particular area — the “species richness”, in the terminology introduced above. You go out collecting, and you come back with 100 specimens representing 10 species.

But your survey might have missed some species altogether, so you go out and get a bigger sample. This time, you get 200 specimens representing 15 species. Does this help you discover how many species there really are?

Logically, not at all. The only certainty is that there are at least 15 species. Maybe there are thousands of species, but almost all of them are extremely rare. Or maybe there are really only 15. Unless you collect all the insects, you’ll never know for sure exactly how many species there are.

However, it may be that you can make reasonable assumptions about the frequency distribution of the species. People sometimes do exactly this, to try to overcome the difficulty of estimating species richness.

Conceptual  This is what I really want to talk about.

I mentioned earlier that different people mean different things by “diversity”. Here’s an example.

Consider two bird communities. The first looks like this:

It contains four species, one of which is responsible for most of the population, and three of which are quite rare. The second looks like this:

It has only three species, but they’re evenly balanced.

Which community is the more diverse? It’s a matter of opinion. Mostly in the press, and in many scholarly articles too, “biodiversity” is used as a synonym for “species richness”. On this count, the first community is more diverse. But if you’re more concerned with the healthy functioning of the whole community, the presence of rare species might not be particularly important: it’s balance that matters, and the second community has more of that.

Different people using the word “diversity” attach different amounts of significance to rare species. There’s a spectrum of points of view, ranging from those who give rare species the same weight as common ones (as in the definition of species richness) to those who are only interested in the most common species of all. Every point on this spectrum of viewpoints is reasonable. None should have a monopoly on the word “diversity”.

At least, that’s what Christina Cobbold and I argue in our new paper:

• Tom Leinster, Christina A. Cobbold, Measuring diversity: the importance of species similarity, Ecology, in press (doi:10.1890/10-2402.1).

But that’s not actually our main point. As the title suggests, the real purpose of our paper is to show how to measure diversity in a way that reflects the varying differences between species. I’ll explain.

Most of the existing approaches to measuring biodiversity go like this.

We have a “community” of organisms — the fish in a lake, the fungi in a forest, or the bacteria on your skin. This community is divided into S groups, conventionally called species, though they needn’t be species in the ordinary sense.

We assume that we know the relative abundances, or relative frequencies, of the species. Write them as p_1, \ldots, p_S. Thus, p_i is the proportion of the total population that belongs to the ith species, where “proportion” is measured in any way you think sensible (number of individuals, total mass, etc).

We only care about relative abundances here, not absolute abundances: so p_1 + \cdots + p_S = 1. If half of a forest is destroyed, it might be a catastrophe, but on the (unrealistic) assumption that all the flora and fauna in the forest were distributed homogeneously, it won’t actually change the biodiversity. (That’s not a statement about what’s important in life; it’s only a statement about the usage of a word.)

This model is common but crude. It can’t detect the difference between a community of six dramatically different species and a community consisting of six species of barnacle.

So, Christina and I use a refined model, as follows. We assume that we also have a measure of the similarity between each pair of species. This is a real number between 0 and 1, with 0 indicating that the species are as dissimilar as could be, and 1 indicating that they’re identical. Writing the similarity between the ith and jth species as Z_{ij}, this gives an S \times S matrix \mathbf{Z}. Our only assumption on \mathbf{Z} is that its diagonal entries are all 1: every species is identical to itself.

There are many ways of measuring inter-species similarity. Probably the most familiar approach is genetic, as in “you share 98% of your DNA with a chimpanzee”. But there are many other possibilities: functional, phylogenetic, morphological, taxonomic, …. Diversity is a measure of the variety of life; having to choose a measure of similarity forces you to get clear exactly what you mean by “variety”.

Christina and I are by no means the first people to incorporate species similarity into the model of an ecological community. The main new thing in our paper is this measure of the community’s diversity:

{}^q D^{\mathbf{Z}}(\mathbf{p}) = ( \sum_i p_i (\mathbf{Z}\mathbf{p})_i^{q - 1} )^{1/(1 - q)}.

What does this mean?

  • {}^q D^{\mathbf{Z}}(\mathbf{p}) is what we call the diversity of order q of the community. Here q is a parameter between 0 and \infty, which you get to choose. Different values of q represent different points on the spectrum of viewpoints described above. Small values of q give high importance to rare species; large values of q give high importance to common species.
  • \mathbf{p} is shorthand for the relative abundances p_1, \ldots, p_S, and \mathbf{Z} is the matrix of similarities.
  • (\mathbf{Z}\mathbf{p})_i means \sum_j Z_{ij} p_j.

The expression doesn’t make sense if q = 1 or q = \infty, but can be made sense of by taking limits. For q = 1, this gives

{}^1 D^{\mathbf{Z}}(\mathbf{p}) = \frac{1}{(\mathbf{Z p})_1^{p_1} (\mathbf{Z p})_2^{p_2} \cdots (\mathbf{Z p})_S^{p_S}} = \exp(-\sum_i p_i \log(\mathbf{Z p})_i)

If you want to know the value at q = \infty, or any of the other mathematical details, you can read this post at the n-Category Café, or of course our paper. In both places, you’ll also find an explanation of what motivates this formula. What’s more, you’ll see that many existing measures of diversity are special cases of ours, obtained by taking particular values for q and/or \mathbf{Z}.
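To make the formula concrete, here’s a minimal numpy sketch of it—our own code, not something from the paper. The q = 1 case uses the limiting formula just displayed, and we leave out q = \infty:

import numpy as np

def qDZ(p, Z, q):
    # diversity of order q of a community with relative abundances p
    # and similarity matrix Z
    p = np.asarray(p, dtype=float)
    Zp = np.asarray(Z, dtype=float) @ p
    present = p > 0            # ignore species with zero abundance
    p, Zp = p[present], Zp[present]
    if np.isclose(q, 1.0):
        return float(np.exp(-np.sum(p * np.log(Zp))))
    return float(np.sum(p * Zp ** (q - 1))) ** (1.0 / (1.0 - q))

# with the naive similarity matrix Z = I, order 0 is species richness
# and order 1 is the exponential of Shannon entropy:
p = [0.7, 0.1, 0.1, 0.1]
print(qDZ(p, np.eye(4), 0))   # 4.0, the number of species
print(qDZ(p, np.eye(4), 1))   # the exponential of the Shannon entropy of p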

But I won’t talk about any of that here. Instead, I’ll tell you how taking species similarity into account can radically alter the assessment of diversity.

I’ll do this using an example: butterflies of subfamily Charaxinae at a site in an Ecuadorian rainforest. The data is from here:

• P. J. DeVries, D. Murray, R. Lande, Species diversity in vertical, horizontal and temporal dimensions of a fruit-feeding butterfly community in an Ecuadorian rainforest. Biological Journal of the Linnean Society 62:343–364, 1997.

They measured the butterfly abundances in both the canopy (top level) and understorey (lower level) at this site, with the following results:

Species                   Canopy   Understorey
Prepona laertes               15             0
Archaeoprepona demophon       14            37
Zaretis itys                  25            11
Memphis arachne               89            23
Memphis offa                  21             3
Memphis xenocles              32             8

Which is more diverse: canopy or understorey?

We’ve already seen that the answer is going to depend on what exactly we mean by “diverse”.

First let’s answer the question under the (crude!) assumption that different species have nothing whatsoever in common. This means taking our similarity matrix \mathbf{Z} to be the identity matrix: if i \neq j then Z_{ij} = 0 (totally dissimilar), and if i = j then Z_{ii} = 1 (totally identical).

Now, remember that there’s a spectrum of viewpoints on how much importance to give to rare species when measuring diversity. Rather than choosing a particular viewpoint, we’ll calculate the diversity from all viewpoints, and display it on a graph. In other words, we’ll draw the graph of {}^q D^{\mathbf{Z}}(\mathbf{p}) (the diversity of order q) against q (the viewpoint). Here’s what we get:

(Figure: the diversity {}^q D^{\mathbf{Z}}(\mathbf{p}) plotted against q for the canopy and the understorey; the horizontal axis is labelled by q.)

Conclusion: from all viewpoints, the butterfly population in the canopy is at least as diverse as that in the understorey.

Now let’s do it again, but this time taking account of the varying similarities between species of butterflies. We don’t have much to go on: how do we know whether Prepona laertes is very similar to, or very different from, Archaeoprepona demophon? With only the data above, we don’t. So what can we do?

All we have to go on is the taxonomy. Remember your high school biology: for the butterfly Prepona laertes, the genus is Prepona and the species is laertes. We’d expect species in the same genus to have more in common than species in different genera. So let’s define the similarity between two species as follows:

  • the similarity is 1 if the species are the same
  • the similarity is 0.5 if the species are different but in the same genus
  • the similarity is 0 if they are not even in the same genus.

This is still crude, but in the absence of further information, it’s about the best we can do. And it’s better than the first approach, where we ignored the taxonomy entirely. Throwing away biologically relevant information is unlikely to lead to a better assessment of diversity.
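For concreteness, here's how this taxonomic similarity matrix and the diversity profiles of the two strata might be computed, reusing the diversity function sketched earlier (this snippet assumes that definition); the abundances are copied straight from the table above, and the similarity values just encode the three rules in the list.

import numpy as np

# Species in the order of the table above; the genus is the first word of the name.
species = ["Prepona laertes", "Archaeoprepona demophon", "Zaretis itys",
           "Memphis arachne", "Memphis offa", "Memphis xenocles"]
genus = [name.split()[0] for name in species]

canopy      = np.array([15, 14, 25, 89, 21, 32], dtype=float)
understorey = np.array([ 0, 37, 11, 23,  3,  8], dtype=float)

# Taxonomic similarity: 1 for the same species, 0.5 for distinct species in
# the same genus, 0 otherwise.
n = len(species)
Z = np.array([[1.0 if i == j else (0.5 if genus[i] == genus[j] else 0.0)
               for j in range(n)] for i in range(n)])

# Diversity of both strata at a few sample viewpoints q.
for q in [0, 0.5, 1, 2, 4]:
    print(q,
          diversity(q, Z, canopy / canopy.sum()),
          diversity(q, Z, understorey / understorey.sum()))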

Using this taxonomic matrix \mathbf{Z}, and the same abundances, the diversity graphs become:

This is more interesting! For q > 1, the understorey looks more diverse than the canopy — the opposite conclusion to our first approach.

It’s not hard to see why. Look again at the table of abundances, but paying attention to the genera of the butterflies. In the canopy, nearly three-quarters of the butterflies are of genus Memphis. So when we take into account the fact that species in the same genus tend to be somewhat similar, the canopy looks much less diverse than it did before. In the understorey, however, the species are spread more evenly between genera, so taking similarity into account leaves its diversity relatively unchanged.

Taking account of species similarity opens up a world of uncertainty. How should we measure similarity? There are as many possibilities as there are quantifiable characteristics of living organisms. It’s much more reassuring to stay in the black-and-white world where distinct species are always assigned a similarity of 0, no matter how similar they might actually be. (This is, effectively, what most existing measures do.) But that’s just hiding from reality.

Maybe you disagree! If so, try the Discussion section of our paper, where we lay out our arguments in more detail. Or let me know by leaving a comment.


Network Theory (Part 16)

4 November, 2011

We’ve been comparing two theories: stochastic mechanics and quantum mechanics. Last time we saw that any graph gives us an example of both theories! It’s a bit peculiar, but today we’ll explore the intersection of these theories a little further, and see that it has another interpretation. It’s also the theory of electrical circuits made of resistors!

That’s nice, because I’m supposed to be talking about ‘network theory’, and electrical circuits are perhaps the most practical networks of all:

I plan to talk a lot about electrical circuits. I’m not quite ready to dive in, but I can’t resist dipping my toe in the water today. Why don’t you join me? It’s not too cold!

Dirichlet operators

Last time we saw that any graph gives us an operator called the ‘graph Laplacian’ that’s both infinitesimal stochastic and self-adjoint. That means we get both:

• a Markov process describing the random walk of a classical particle on the graph.

and

• a 1-parameter unitary group describing the motion of a quantum particle on the graph.

That’s sort of neat, so it’s natural to wonder what are all the operators that are both infinitesimal stochastic and self-adjoint. They’re called ‘Dirichlet operators’, and at least in the finite-dimensional case we’re considering, they’re easy to completely understand. Even better, it turns out they describe electrical circuits made of resistors!

Today let’s take a lowbrow attitude and think of a linear operator H : \mathbb{C}^n \to \mathbb{C}^n as an n \times n matrix with entries H_{i j}. Then:

H is self-adjoint if it equals the conjugate of its transpose:

H_{i j} = \overline{H}_{j i}

H is infinitesimal stochastic if its columns sum to zero and its off-diagonal entries are real and nonnegative:

\displaystyle{ \sum_i H_{i j} = 0 }

i \ne j \Rightarrow H_{i j} \ge 0

H is a Dirichlet operator if it’s both self-adjoint and infinitesimal stochastic.

What are Dirichlet operators like? Suppose H is a Dirichlet operator. Then its off-diagonal entries are \ge 0, and since

\displaystyle{ \sum_i H_{i j} = 0}

its diagonal entries obey

\displaystyle{ H_{i i} = - \sum_{ i \ne j} H_{i j} \le 0 }

So all the entries of the matrix H are real, which in turn implies it’s symmetric:

H_{i j} = \overline{H}_{j i} = H_{j i}

So, we can build any Dirichlet operator H as follows:

• Choose the entries above the diagonal, H_{i j} with i < j, to be arbitrary nonnegative real numbers.

• The entries below the diagonal, H_{i j} with i > j, are then forced on us by the requirement that H be symmetric: H_{i j} = H_{j i}.

• The diagonal entries are then forced on us by the requirement that the columns sum to zero: H_{i i} = - \sum_{ i \ne j} H_{i j}.

Note that because the entries are real, we can think of a Dirichlet operator as a linear operator H : \mathbb{R}^n \to \mathbb{R}^n. We’ll do that for the rest of today.
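Here's that recipe as a small sketch in Python (assuming NumPy); the helper name dirichlet_operator and the dictionary-of-edges input format are just conventions for this example.

import numpy as np

def dirichlet_operator(upper, n):
    """Build an n x n Dirichlet operator from a dict {(i, j): value} with
    i < j and nonnegative values giving the entries above the diagonal."""
    H = np.zeros((n, n))
    for (i, j), value in upper.items():
        assert i < j and value >= 0
        H[i, j] = H[j, i] = value      # symmetry forces the entries below the diagonal
    H -= np.diag(H.sum(axis=0))        # diagonal entries make every column sum to zero
    return H

H = dirichlet_operator({(0, 1): 1.0, (0, 2): 2.0, (1, 2): 0.5}, n=3)
print(np.allclose(H, H.T))             # self-adjoint (real and symmetric)
print(np.allclose(H.sum(axis=0), 0.0)) # infinitesimal stochastic: columns sum to zero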

Circuits made of resistors

Now for the fun part. We can easily draw any Dirichlet operator! To do this, we draw n dots, connect each pair of distinct dots with an edge, and label the edge connecting the ith dot to the jth with any number H_{i j} \ge 0:

This contains all the information we need to build our Dirichlet operator. To make the picture prettier, we can leave out the edges labelled by 0:

Like last time, the graphs I’m talking about are simple: undirected, with no edges from a vertex to itself, and at most one edge from one vertex to another. So:

Theorem. Any finite simple graph with edges labelled by positive numbers gives a Dirichlet operator, and conversely.

We already talked about a special case last time: if we label all the edges by the number 1, our operator H is called the graph Laplacian. So, now we’re generalizing that idea by letting the edges have more interesting labels.
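As a quick numerical sanity check (a sketch assuming NumPy and SciPy, not anything from the post itself), here's the graph Laplacian of a triangle graph with every edge labelled 1, together with confirmation that exp(tH) is a stochastic matrix and exp(-itH) is unitary, just as the two bullet points above promise.

import numpy as np
from scipy.linalg import expm

# Graph Laplacian of a triangle graph: every edge labelled 1.
H = np.array([[-2.,  1.,  1.],
              [ 1., -2.,  1.],
              [ 1.,  1., -2.]])

t = 0.7
stochastic = expm(t * H)        # evolution of the randomly walking classical particle
unitary    = expm(-1j * t * H)  # evolution of the quantum particle

print(np.all(stochastic >= -1e-12))               # entries are nonnegative
print(np.allclose(stochastic.sum(axis=0), 1.0))   # columns sum to one
print(np.allclose(unitary @ unitary.conj().T, np.eye(3)))  # unitary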

What’s the meaning of this trick? Well, we can think of our graph as an electrical circuit where the edges are wires. What do the numbers labelling these wires mean? One obvious possibility is to put a resistor on each wire, and let that number be its resistance. But that doesn’t make sense, since we’re leaving out wires labelled by 0. If we leave out a wire, that’s not like having a wire of zero resistance: it’s like having a wire of infinite resistance! No current can go through when there’s no wire. So the number labelling an edge should be the conductance of the resistor on that wire. Conductance is the reciprocal of resistance.

So, our Dirichlet operator above gives a circuit like this:

Here Ω is the symbol for an ‘ohm’, a unit of resistance… but the upside-down version, namely ℧, is the symbol for a ‘mho’, a unit of conductance that’s the reciprocal of an ohm.

Let’s see if this cute idea leads anywhere. Think of a Dirichlet operator H : \mathbb{R}^n \to \mathbb{R}^n as a circuit made of resistors. What could a vector \psi \in \mathbb{R}^n mean? It assigns a real number to each vertex of our graph. The only sensible option is for this number to be the electric potential at that point in our circuit. So let’s try that.

Now, what’s

\langle \psi, H \psi \rangle  ?

In quantum mechanics this would be a very sensible thing to look at: it would give us the expected value of the Hamiltonian H in a state \psi. But what does it mean in the land of electrical circuits?

Up to a constant fudge factor, it turns out to be the power consumed by the electrical circuit!

Let’s see why. First, remember that when a current flows along a wire, power gets consumed. In other words, electrostatic potential energy gets turned into heat. The power consumed is

P = V I

where V is the voltage across the wire and I is the current flowing along the wire. If we assume our wire has resistance R we also have Ohm’s law:

I = V / R

so

\displaystyle{ P = \frac{V^2}{R} }

If we write this using the conductance instead of the resistance R, we get

P = \textrm{conductance} \; V^2

But our electrical circuit has lots of wires, so the power it consumes will be a sum of terms like this. We’re assuming H_{i j} is the conductance of the wire from the ith vertex to the jth, or zero if there’s no wire connecting them. And by definition, the voltage across this wire is the difference in electrostatic potentials at the two ends: \psi_i - \psi_j. So, the total power consumed is

\displaystyle{ P = \sum_{i \ne j}  H_{i j} (\psi_i - \psi_j)^2 }

This is nice, but what does it have to do with \langle \psi , H \psi \rangle?

The answer is here:

Theorem. If H : \mathbb{R}^n \to \mathbb{R}^n is any Dirichlet operator, and \psi \in \mathbb{R}^n is any vector, then

\displaystyle{ \langle \psi , H \psi \rangle = -\frac{1}{2} \sum_{i \ne j}  H_{i j} (\psi_i - \psi_j)^2 }

Proof. Let’s start with the formula for power:

\displaystyle{ P = \sum_{i \ne j}  H_{i j} (\psi_i - \psi_j)^2 }

Note that this sum includes the condition i \ne j, since we only have wires going between distinct vertices. But the summand is zero if i = j, so we also have

\displaystyle{ P = \sum_{i, j}  H_{i j} (\psi_i - \psi_j)^2 }

Expanding the square, we get

\displaystyle{ P = \sum_{i, j} \left( H_{i j} \psi_i^2 - 2 H_{i j} \psi_i \psi_j + H_{i j} \psi_j^2 \right) }

The middle term looks promisingly similar to \langle \psi, H \psi \rangle, but what about the other two terms? Because H_{i j} = H_{j i}, they’re equal:

\displaystyle{ P = \sum_{i, j} \left( - 2 H_{i j} \psi_i \psi_j + 2 H_{i j} \psi_j^2 \right) }

And in fact they’re zero! Since H is infinitesimal stochastic, we have

\displaystyle{ \sum_i H_{i j} = 0 }

so

\displaystyle{ \sum_i H_{i j} \psi_j^2 = 0 }

and it’s still zero when we sum over j. We thus have

\displaystyle{ P = - 2 \sum_{i, j} H_{i j} \psi_i \psi_j }

But since \psi_i is real, this is -2 times

\displaystyle{ \langle \psi, H \psi \rangle  = \sum_{i, j}  H_{i j} \overline{\psi}_i \psi_j }

So, we’re done.   █

An instant consequence of this theorem is that a Dirichlet operator has

\langle \psi , H \psi \rangle \le 0

for all \psi. Actually, most people use the opposite sign convention in defining infinitesimal stochastic operators. This makes the off-diagonal entries H_{i j} \le 0, which is mildly annoying, but it gives

\langle \psi , H \psi \rangle \ge 0

which is nice. When H is a Dirichlet operator, defined with this opposite sign convention, \langle \psi , H \psi \rangle is called a Dirichlet form.

The big picture

Maybe it’s a good time to step back and see where we are.

So far we’ve been exploring the analogy between stochastic mechanics and quantum mechanics. Where do networks come in? Well, they’ve actually come in twice so far:

1) First we saw that Petri nets can be used to describe stochastic or quantum processes where things of different kinds randomly react and turn into other things. A Petri net is a kind of network like this:

The different kinds of things are the yellow circles; we called them states, because sometimes we think of them as different states of a single kind of thing. The reactions where things turn into other things are the blue squares: we called them transitions. We label the transitions with numbers giving the rates at which they occur.

2) Then we looked at stochastic or quantum processes where in each transition a single thing turns into a single thing. We can draw these as Petri nets where each transition has just one state as input and one state as output. But we can also draw them as directed graphs with edges labelled by numbers:

Now the dark blue boxes are states and the edges are transitions!

Today we looked at a special case of the second kind of network: the Dirichlet operators. For these the ‘forward’ transition rate H_{i j} equals the ‘reverse’ rate H_{j i}, so our graph can be undirected: no arrows on the edges. And for these the rates H_{i i} are determined by the rest, so we can omit the edges from vertices to themselves:

The result can be seen as an electrical circuit made of resistors! So we’re building up a little dictionary:

• Stochastic mechanics: \psi_i is a probability and H_{i j} is a transition rate (probability per time).

• Quantum mechanics: \psi_i is an amplitude and H_{i j} is a transition rate (amplitude per time).

• Circuits made of resistors: \psi_i is a voltage and H_{i j} is a conductance.

This dictionary may seem rather odd—especially the third item, which looks completely different than the first two! But that’s good: when things aren’t odd, we don’t get many new ideas. The whole point of this ‘network theory’ business is to think about networks from many different viewpoints and let the sparks fly!

Actually, this particular oddity is well-known in certain circles. We’ve been looking at the discrete version, where we have a finite set of states. But in the continuum, the classic example of a Dirichlet operator is the Laplacian H = \nabla^2. And then we have:

• The heat equation:

\frac{d}{d t} \psi = \nabla^2 \psi

is fundamental to stochastic mechanics.

• The Schrödinger equation:

\frac{d}{d t} \psi = -i \nabla^2 \psi

is fundamental to quantum mechanics.

• The Poisson equation:

\nabla^2 \psi = -\rho

is fundamental to electrostatics.

Briefly speaking, electrostatics is the study of how the electric potential \psi depends on the charge density \rho. The theory of electrical circuits made of resistors can be seen as a special case, at least when the current isn’t changing with time.

I’ll say a lot more about this… but not today! If you want to learn more, this is a great place to start:

• P. G. Doyle and J. L. Snell, Random Walks and Electric Networks, Mathematical Association of America, Washington DC, 1984.

This free online book explains, in a really fun, informal way, how random walks on graphs are related to electrical circuits made of resistors. To dig deeper into the continuum case, try:

• M. Fukushima, Dirichlet Forms and Markov Processes, North-Holland, Amsterdam, 1980.

