The Quagga

13 February, 2016

The quagga was a subspecies of zebra found only in South Africa’s Western Cape region. After the Dutch invaded South Africa, they hunted the quagga to extinction. While some were taken to zoos in Europe, breeding programs failed. The last wild quagga died in 1878, and the very last quagga died in an Amsterdam zoo in 1883.

Only one was ever photographed—the mare shown above, in London. Only 23 stuffed and mounted quagga specimens exist. There was one more, but it was destroyed in Königsberg, Germany, during World War II. There is also a mounted head and neck, a foot, 7 complete skeletons, and samples of various tissues.


The quagga was the first extinct animal to have its DNA analyzed. It used to be thought that the quagga was a species distinct from the plains zebra. After some argument, a genetic study published in 2005 convinced most people that the quagga is a subspecies of the plains zebra. It showed that the quagga diverged from the other plains zebra subspecies only between 120,000 and 290,000 years ago, during the Pleistocene.

In 1987, a natural historian named Reinhold Rau started the Quagga Project. His goal was to breed zebras into quaggas by selecting for quagga-like traits, most notably the lack of stripes on the back half of its body.

The founding population consisted of 19 zebras from Namibia and South Africa, chosen because they had reduced striping on the rear body and legs. The first foal was born in 1988.

By now, members of the Quagga Project believe they have recreated the quagga. Here they are:

Rau-quagga (zebra subspecies)

The new quaggas are called ‘rau–quaggas’ to distinguish them from the original ones. Do they look the same as the originals? It’s hard for me to decide. Old paintings show quite a bit of variability:

This is an 1804 illustration by Samuel Daniell, which served as the basis of a claimed subspecies of quagga, Equus quagga danielli. Perhaps they just have variable coloring.

Why try to resurrect the quagga? Rau is no longer alive, but Eric Harley, a retired professor of chemical pathology at the University of Cape Town, had this to say:

It’s an attempt to try and repair ecological damage that was done a long time ago in some sort of small way. It is also to try and get a representation back of a charismatic animal that used to live in South Africa.

We don’t do genetic engineering, we aren’t cloning, we aren’t doing any particularly clever sort of embryo transfers—it is a very simple project of selective breeding. If it had been a different species the whole project would have been unjustifiable.

The current Quagga Project chairman, Mike Gregor, has this to say:

I think there is controversy with all programmes like this. There is no way that all scientists are going to agree that this is the right way to go. We are a bunch of enthusiastic people trying to do something to replace something that we messed up many years ago.

What we’re not doing is selecting some fancy funny colour variety of zebra, as is taking place in other areas, where funny mutations have taken place with strange colouring which may look amusing but is rather frowned upon in conservation circles.

What we are trying to do is get sufficient animals—ideally get a herd of up to 50 full-blown rau-quaggas in one locality, breeding together, and then we would have a herd we could say at the very least represents the original quagga.

We obviously want to keep them separate from other populations of plains zebra otherwise we simply mix them up again and lose the characteristic appearance.

The quotes are from here:

• Lawrence Bartlett, South Africa revives ‘extinct’ zebra subspecies, Phys.org, 12 February 2016.

This project is an example of ‘resurrection biology’, or ‘de-extinction’:

• Wikipedia, De-extinction.

Needless to say, it’s a controversial idea.


Rumors of Gravitational Waves

7 February, 2016

The Laser Interferometer Gravitational-Wave Observatory or LIGO is designed to detect gravitational waves—ripples of curvature in spacetime moving at the speed of light. It’s recently been upgraded, and it will either find gravitational waves soon or something really strange is going on.

Rumors are swirling that LIGO has seen gravitational waves produced by two black holes, of 29 and 36 solar masses, spiralling towards each other—and then colliding to form a single 62-solar-mass black hole!

You’ll notice that 29 + 36 is more than 62. So, it’s possible that three solar masses were turned into energy, mostly in the form of gravitational waves!

According to these rumors, the statistical significance of the signal is supposedly very high: better than 5 sigma! That means there’s at most a 0.000057% probability this event is a random fluke – assuming nobody made a mistake.
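
Here’s a quick check of that number, as a sketch in Python using the scipy library (this is just my own arithmetic, nothing from the LIGO analysis): the 0.000057% figure corresponds to the two-sided tail probability of a 5-sigma fluctuation.

# Two-sided tail probability of a 5-sigma fluctuation of a normal distribution.
# A sanity check of the 0.000057% figure quoted above.
from scipy.stats import norm

p = 2 * norm.sf(5)       # sf(x) = 1 - cdf(x), so this is P(|Z| > 5)
print(p)                 # roughly 5.7e-7
print(100 * p)           # roughly 0.000057, as a percentage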

If these rumors are correct, we should soon see an official announcement. If the discovery holds up, someone will win a Nobel prize.

The discovery of gravitational waves is completely unsurprising, since they’re predicted by general relativity, a theory that’s passed many tests already. But it would open up a new window to the universe – and we’re likely to see interesting new things, once gravitational wave astronomy becomes a thing.

Here’s the tweet that launched the latest round of rumors:

[Screenshot of a tweet by Cliff Burgess]

For background on this story, try this:

• Tale of a doomed galaxy, Azimuth, 8 November 2015.

The first four sections of that long post discuss gravitational waves created by black hole collisions—but the last section is about LIGO and an earlier round of rumors, so I’ll quote it here!


LIGO stands for Laser Interferometer Gravitational Wave Observatory. The idea is simple. You shine a laser beam down two very long tubes and let it bounce back and forth between mirrors at the ends. You use this to compare the lengths of these tubes. When a gravitational wave comes by, it stretches space in one direction and squashes it in another direction. So, we can detect it.

Sounds easy, eh? Not when you run the numbers! We’re trying to see gravitational waves that stretch space just a tiny bit: about one part in 10^{23}. At LIGO, the tubes are 4 kilometers long. So, we need to see their length change by an absurdly small amount: one-thousandth the diameter of a proton!

It’s amazing to me that people can even contemplate doing this, much less succeed. They use lots of tricks:

• They bounce the light back and forth many times, effectively increasing the length of the tubes to 1800 kilometers.

• There’s no air in the tubes—just a very good vacuum.

• They hang the mirrors on quartz fibers, making each mirror part of a pendulum with very little friction. This means it vibrates very well at one particular frequency, and very badly at frequencies far from that. This damps out the shaking of the ground, which is a real problem.

• This pendulum is hung on another pendulum.

• That pendulum is hung on a third pendulum.

• That pendulum is hung on a fourth pendulum.

• The whole chain of pendulums is sitting on a device that detects vibrations and moves in a way to counteract them, sort of like noise-cancelling headphones.

• There are 2 of these facilities, one in Livingston, Louisiana and another in Hanford, Washington. Only if both detect a gravitational wave do we get excited.

I visited the LIGO facility in Louisiana in 2006. It was really cool! Back then, the sensitivity was good enough to see collisions of black holes and neutron stars up to 50 million light years away.

Here I’m not talking about the supermassive black holes that live in the centers of galaxies. I’m talking about the much more common black holes and neutron stars that form when stars go supernova. Sometimes a pair of stars orbiting each other will both blow up, and form two black holes—or two neutron stars, or a black hole and neutron star. And eventually these will spiral into each other and emit lots of gravitational waves right before they collide.

50 million light years is big enough that LIGO could see about half the galaxies in the Virgo Cluster. Unfortunately, with that many galaxies, we only expect to see one neutron star collision every 50 years or so.

They never saw anything. So they kept improving the machines, and now we’ve got Advanced LIGO! This should now be able to see collisions up to 225 million light years away… and after a while, three times further.

They turned it on September 18th. Soon we should see more than one gravitational wave burst each year.

In fact, there’s a rumor that they’ve already seen one! But they’re still testing the device, and there’s a team whose job is to inject fake signals, just to see if they’re detected. Davide Castelvecchi writes:

LIGO is almost unique among physics experiments in practising ‘blind injection’. A team of three collaboration members has the ability to simulate a detection by using actuators to move the mirrors. “Only they know if, and when, a certain type of signal has been injected,” says Laura Cadonati, a physicist at the Georgia Institute of Technology in Atlanta who leads the Advanced LIGO’s data-analysis team.

Two such exercises took place during earlier science runs of LIGO, one in 2007 and one in 2010. Harry Collins, a sociologist of science at Cardiff University, UK, was there to document them (and has written books about it). He says that the exercises can be valuable for rehearsing the analysis techniques that will be needed when a real event occurs. But the practice can also be a drain on the team’s energies. “Analysing one of these events can be enormously time consuming,” he says. “At some point, it damages their home life.”

The original blind-injection exercises took 18 months and 6 months respectively. The first one was discarded, but in the second case, the collaboration wrote a paper and held a vote to decide whether they would make an announcement. Only then did the blind-injection team ‘open the envelope’ and reveal that the events had been staged.

Aargh! The disappointment would be crushing.

But with luck, Advanced LIGO will soon detect real gravitational waves. And I hope life here in the Milky Way thrives for a long time – so that when the gravitational waves from the doomed galaxy PG 1302-102 reach us, hundreds of thousands of years in the future, we can study them in exquisite detail.

For Castelvecchi’s whole story, see:

• Davide Castelvecchi, Has giant LIGO experiment seen gravitational waves?, Nature, 30 September 2015.

For pictures of my visit to LIGO, see:

• John Baez, This week’s finds in mathematical physics (week 241), 20 November 2006.

For how Advanced LIGO works, see:

• The LIGO Scientific Collaboration, Advanced LIGO, 17 November 2014.


Aggressively Expanding Civilizations

5 February, 2016

Ever since I became an environmentalist, the potential destruction wrought by aggressively expanding civilizations has been haunting my thoughts. Not just here and now, where it’s easy to see, but in the future.

In October 2006, I wrote this in my online diary:

A long time ago on this diary, I mentioned my friend Bruce Smith’s nightmare scenario. In the quest for ever faster growth, corporations evolve toward ever faster exploitation of natural resources. The Earth is not enough. So, ultimately, they send out self-replicating von Neumann probes that eat up solar systems as they go, turning the planets into more probes. Different brands of probes will compete among each other, evolving toward ever faster expansion. Eventually, the winners will form a wave expanding outwards at nearly the speed of light—demolishing everything behind them, leaving only wreckage.

The scary part is that even if we don’t let this happen, some other civilization might.

The last point is the key one. Even if something is unlikely, in a sufficiently large universe it will happen, as long as it’s possible. And then it will perpetuate itself, as long as it’s evolutionarily fit. Our universe seems pretty darn big. So, even if a given strategy is hard to find, if it’s a winning strategy it will get played somewhere.

So, even in this nightmare scenario of "spheres of von Neumann probes expanding at near lightspeed", we don’t need to worry about a bleak future for the universe as a whole—any more than we need to worry that viruses will completely kill off all higher life forms. Some fraction of civilizations will probably develop defenses in time to repel the onslaught of these expanding spheres.

It’s not something I stay awake worrying about, but it’s a depressingly plausible possibility. As you can see, I was trying to reassure myself that everything would be okay, or at least acceptable, in the long run.

Even earlier, S. Jay Olson and I wrote a paper together on the limitations that quantum gravity puts on accurately measuring distances. If you try to measure a distance too accurately, you’ll need to concentrate so much energy in such a small space that you’ll create a black hole!

That was in 2002. Later I lost touch with him. But now I’m happy to discover that he’s doing interesting work on quantum gravity and quantum information processing! He is now at Boise State University in Idaho, his home state.

But here’s the cool part: he’s also studying aggressively expanding civilizations.

Expanding bubbles

What will happen if some civilizations start aggressively expanding through the Universe at a reasonable fraction of the speed of light? We don’t have to assume most of them do. Indeed, there can’t be too many, or they’d already be here! More precisely, the density of such civilizations must be low at the present time. The number of them could be infinite, since space is apparently infinite. But none have reached us. We may eventually become such a civilization, but we’re not one yet.

Each such civilization will form a growing ‘bubble’: an expanding sphere of influence. And occasionally, these bubbles will collide!

Here are some pictures from a simulation he did:





As he notes, the math of these bubbles has already been studied by researchers interested in inflationary cosmology, like Alan Guth. These folks have considered the possibility that in the very early Universe, most of space was filled with a ‘false vacuum’: a state of matter that resembles the actual vacuum, but has higher energy density.

A false vacuum could turn into the true vacuum, liberating energy in the form of particle-antiparticle pairs. However, it might not do this instantly! It might be ‘metastable’, like ball number 1 in this picture:

It might need a nudge to ‘roll over the hill’ (metaphorically) and down into the lower-energy state corresponding to the true vacuum, shown as ball number 3. Or, thanks to quantum mechanics, it might ‘tunnel’ through this hill.

The balls and the hill are just an analogy. What I mean is that the false vacuum might need to go through a stage of having even higher energy density before it could turn into the true vacuum. Random fluctuations, either quantum-mechanical or thermal, could make this happen. Such a random fluctuation could happen in one location, forming a ‘bubble’ of true vacuum that—under certain conditions—would rapidly expand.

It’s actually not very different from bubbles of steam forming in superheated water!

But here’s the really interesting thing Jay Olson noted in his first paper on this subject. Research on bubbles in inflationary cosmology could actually be relevant to aggressively expanding civilizations!

Why? Just as a bubble of expanding true vacuum has different pressure than the false vacuum surrounding it, the same might be true for an aggressively expanding civilization. If they are serious about expanding rapidly, they may convert a lot of matter into radiation to power their expansion. And while energy is conserved in this process, the pressure of radiation in space is a lot bigger than the pressure of matter, which is almost zero.

General relativity says that energy density slows the expansion of the Universe. But also—and this is probably less well-known among nonphysicists—it says that pressure has a similar effect. Also, as the Universe expands, the energy density and pressure of radiation drops at a different rate than the energy density of matter.

So, the expansion of the Universe itself, on a very large scale, could be affected by aggressively expanding civilizations!

The fun part is that Jay Olson actually studies this in a quantitative way, making some guesses about the numbers involved. Of course there’s a huge amount of uncertainty in all matters concerning aggressively expanding high-tech civilizations, so he actually considers a wide range of possible numbers. But if we assume a civilization turns a large fraction of matter into radiation, the effects could be significant!

The effect of the extra pressure due to radiation would be to temporarily slow the expansion of the Universe. But the expansion would not be stopped. The radiation will gradually thin out. So eventually, dark energy—which has negative pressure, and does not thin out as the Universe expands—will win. Then the Universe will expand exponentially, as it is already beginning to do now.

(Here I am ignoring speculative theories where dark energy has properties that change dramatically over time.)
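
To see the scaling argument in numbers, here’s a toy sketch in Python (made-up density parameters, arbitrary units, not taken from Olson’s papers). Matter dilutes as a^{-3}, radiation as a^{-4}, and dark energy not at all, so any radiation produced by expanding civilizations only slows the expansion for a while before the exponential phase takes over.

# Toy Friedmann equation: H(a)^2 = H0^2 (Om/a^3 + Or/a^4 + OL).
# The radiation term dies off fastest as the scale factor a grows,
# so its effect on the expansion is only temporary.
H0 = 1.0                       # Hubble rate today, arbitrary units
Om, Orad, OL = 0.3, 0.1, 0.6   # matter, radiation, dark energy fractions (made up)

def hubble(a):
    return H0 * (Om / a**3 + Orad / a**4 + OL) ** 0.5

a = 1.0
steps_per_unit = 1000
dt = 1.0 / steps_per_unit
# crude Euler integration of da/dt = a * H(a)
for step in range(1, 5 * steps_per_unit + 1):
    a += a * hubble(a) * dt
    if step % steps_per_unit == 0:
        total = Om / a**3 + Orad / a**4 + OL
        print(f"t = {step * dt:.0f}   a = {a:7.2f}   radiation share of energy density = {Orad / a**4 / total:.6f}")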

Jay Olson’s work

Here are his papers on this subject. The abstracts sketch his results, but you have to look at the papers to see how nice they are. He’s thought quite carefully about these things.

• S. Jay Olson, Homogeneous cosmology with aggressively expanding civilizations, Classical and Quantum Gravity 32 (2015) 215025.

Abstract. In the context of a homogeneous universe, we note that the appearance of aggressively expanding advanced life is geometrically similar to the process of nucleation and bubble growth in a first-order cosmological phase transition. We exploit this similarity to describe the dynamics of life saturating the universe on a cosmic scale, adapting the phase transition model to incorporate probability distributions of expansion and resource consumption strategies. Through a series of numerical solutions spanning several orders of magnitude in the input assumption parameters, the resulting cosmological model is used to address basic questions related to the intergalactic spreading of life, dealing with issues such as timescales, observability, competition between strategies, and first-mover advantage. Finally, we examine physical effects on the universe itself, such as reheating and the backreaction on the evolution of the scale factor, if such life is able to control and convert a significant fraction of the available pressureless matter into radiation. We conclude that the existence of life, if certain advanced technologies are practical, could have a significant influence on the future large-scale evolution of the universe.

• S. Jay Olson, Estimates for the number of visible galaxy-spanning civilizations and the cosmological expansion of life.

Abstract. If advanced civilizations appear in the universe with a desire to expand, the entire universe can become saturated with life on a short timescale, even if such expanders appear but rarely. Our presence in an untouched Milky Way thus constrains the appearance rate of galaxy-spanning Kardashev type III (K3) civilizations, if it is assumed that some fraction of K3 civilizations will continue their expansion at intergalactic distances. We use this constraint to estimate the appearance rate of K3 civilizations for 81 cosmological scenarios by specifying the extent to which humanity could be a statistical outlier. We find that in nearly all plausible scenarios, the distance to the nearest visible K3 is cosmological. In searches where the observable range is limited, we also find that the most likely detections tend to be expanding civilizations who have entered the observable range from farther away. An observation of K3 clusters is thus more likely than isolated K3 galaxies.

• S. Jay Olson, On the visible size and geometry of aggressively expanding civilizations at cosmological distances.

Abstract. If a subset of advanced civilizations in the universe choose to rapidly expand into unoccupied space, these civilizations would have the opportunity to grow to a cosmological scale over the course of billions of years. If such life also makes observable changes to the galaxies they inhabit, then it is possible that vast domains of life-saturated galaxies could be visible from the Earth. Here, we describe the shape and angular size of these domains as viewed from the Earth, and calculate median visible sizes for a variety of scenarios. We also calculate the total fraction of the sky that should be covered by at least one domain. In each of the 27 scenarios we examine, the median angular size of the nearest domain is within an order of magnitude of a percent of the whole celestial sphere. Observing such a domain would likely require an analysis of galaxies on the order of a giga-lightyear from the Earth.

Here are the main assumptions in his first paper:

1. At early times (relative to the appearance of life), the universe is described by the standard cosmology – a benchmark Friedmann-Robertson-Walker (FRW) solution.

2. The limits of technology will allow for self-reproducing spacecraft, sustained relativistic travel over cosmological distances, and an efficient process to convert baryonic matter into radiation.

3. Control of resources in the universe will tend to be dominated by civilizations that adopt a strategy of aggressive expansion (defined as a frontier which expands at a large fraction of the speed of the individual spacecraft involved), rather than those expanding diffusively due to the conventional pressures of population dynamics.

4. The appearance of aggressively expanding life in the universe is a spatially random event and occurs at some specified, model-dependent rate.

5. Aggressive expanders will tend to expand in all directions unless constrained by the presence of other civilizations, will attempt to gain control of as much matter as is locally available for their use, and once established in a region of space, will consume mass as an energy source (converting it to radiation) at some specified, model-dependent rate.


Corelations in Network Theory

2 February, 2016

Category theory reduces a large chunk of math to the clever manipulation of arrows. One of the fun things about this is that you can often take a familiar mathematical construction, think of it category-theoretically, and just turn around all the arrows to get something new and interesting!

In math we love functions. If we have a function

f: X \to Y

we can formally turn around the arrow to think of f as something going from Y back to X. But this something is usually not a function: it’s called a ‘cofunction’. A cofunction from Y to X is simply a function from X to Y.

Cofunctions are somewhat interesting, but they’re really just functions viewed through a looking glass, so they don’t give much new—at least, not by themselves.

The game gets more interesting if we think of functions and cofunctions as special sorts of relations. A relation from X to Y is a subset

R \subseteq X \times Y

It’s a function when for each x \in X there’s a unique y \in Y with (x,y) \in R. It’s a cofunction when for each y \in Y there’s a unique x \in X with (x,y) \in R.

Just as we can compose functions, we can compose relations. Relations have certain advantages over functions: for example, we can ‘turn around’ any relation R from X to Y and get a relation R^\dagger from Y to X:

R^\dagger = \{(y,x) : \; (x,y) \in R \}

If we turn around a function we get a cofunction, and vice versa. But we can also do other fun things: for example, since both functions and cofunctions are relations, we can compose a function and a cofunction and get a relation.
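
Here’s a little sketch in Python (my own toy encoding, with a relation stored as a set of ordered pairs, not notation from any of the papers below) showing composition, the dagger, and how composing a function with a cofunction gives a more general relation:

# A relation from X to Y is stored as a set of pairs (x, y).
def compose(R, S):
    """Composite of R (from X to Y) and S (from Y to Z): a relation from X to Z."""
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

def dagger(R):
    """Turn a relation from X to Y around, giving a relation from Y to X."""
    return {(y, x) for (x, y) in R}

# a function f: X -> Y, viewed as a relation
f = {(1, 'a'), (2, 'a'), (3, 'b')}
# turning it around gives a cofunction from Y to X
g = dagger(f)

# composing the function with the cofunction gives a relation that is
# neither a function nor a cofunction in general
print(compose(f, g))   # contains (1,1), (1,2), (2,1), (2,2), (3,3)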

Of course, relations also have certain disadvantages compared to functions. But it’s utterly clear by now that the category \mathrm{FinRel}, where the objects are finite sets and the morphisms are relations, is very important.

So far, so good. But what happens if we take the definition of ‘relation’ and turn all the arrows around?

There are actually several things I could mean by this question, some more interesting than others. But one of them gives a very interesting new concept: the concept of ‘corelation’. And two of my students have just written a very nice paper on corelations:

• Brandon Coya and Brendan Fong, Corelations are the prop for extraspecial commutative Frobenius monoids.

Here’s why this paper is important for network theory: corelations between finite sets are exactly what we need to describe electrical circuits made of ideal conductive wires! A corelation from a finite set X to a finite set Y can be drawn this way:

I have drawn more wires than strictly necessary: I’ve drawn a wire between two points whenever I want current to be able to flow between them. But there’s a reason I did this: a corelation from X to Y simply tells us when current can flow from one point in either of these sets to any other point in these sets.

Of course circuits made solely of conductive wires are not very exciting for electrical engineers. But in an earlier paper, Brendan introduced corelations as an important stepping-stone toward more general circuits:

• John Baez and Brendan Fong, A compositional framework for passive linear circuits. (Blog article here.)

The key point is simply that you use conductive wires to connect resistors, inductors, capacitors, batteries and the like and build interesting circuits—so if you don’t fully understand the math of conductive wires, you’re limited in your ability to understand circuits in general!

In their new paper, Brendan teamed up with Brandon Coya, and they figured out all the rules obeyed by the category \mathrm{FinCorel}, where the objects are finite sets and the morphisms are corelations. I’ll explain these rules later.

This sort of analysis had previously been done for \mathrm{FinRel}, and it turns out there’s a beautiful analogy between the two cases! Here is a chart displaying the analogy:

Spans | Cospans
extra bicommutative bimonoids | special commutative Frobenius monoids
Relations | Corelations
extraspecial bicommutative bimonoids | extraspecial commutative Frobenius monoids

I’m sure this will be cryptic to the nonmathematicians reading this, and even many mathematicians—but the paper explains what’s going on here.

I’ll actually say what an ‘extraspecial commutative Frobenius monoid’ is later in this post. This is a terse way of listing all the rules obeyed by corelations between finite sets—and thus, all the rules obeyed by conductive wires.

But first, let’s talk about something simpler.

What is a corelation?

Just as we can define functions as relations of a special sort, we can also define relations in terms of functions. A relation from X to Y is a subset

R \subseteq X \times Y

but we can think of this as an equivalence class of one-to-one functions

i: R \to X \times Y

Why an equivalence class? The image of i is our desired subset of X \times Y. The set R here could be replaced by any isomorphic set; its only role is to provide ‘names’ for the elements of X \times Y that are in the image of i.

Now we have a relation described as an arrow, or really an equivalence class of arrows. Next, let’s turn the arrow around!

There are different things I might mean by that, but we want to do it cleverly. When we turn arrows around, the concept of product (for example, cartesian product X \times Y of sets) turns into the concept of sum (for example, disjoint union X + Y of sets). Similarly, the concept of monomorphism (such as a one-to-one function) turns into the concept of epimorphism (such as an onto function). If you don’t believe me, click on the links!

So, we should define a corelation from a set X to a set Y to be an equivalence class of onto functions

p: X + Y \to C

Why an equivalence class? The set C here could be replaced by any isomorphic set; its only role is to provide ‘names’ for the sets of elements of X + Y that get mapped to the same thing via p.

In simpler terms, a corelation from a set X to a set Y is just a partition of the disjoint union X + Y. So, it looks like this:

If we like, we can then draw a line connecting any two points that lie in the same part of the partition:

These lines determine the corelation, so we can also draw a corelation this way:

This is why corelations describe circuits made solely of wires!
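
Since a corelation is just a partition of the disjoint union, it’s easy to play with corelations on a computer. Here’s a minimal sketch in Python (my own encoding, not anything from Brandon and Brendan’s paper): composing two corelations amounts to gluing the wire diagrams along the shared middle boundary and then forgetting it, which a union-find structure handles nicely.

# A corelation from X to Y is a partition of the disjoint union X + Y.
# Tag elements to keep the copies apart: ('X', x), ('Y', y), ('Z', z).
def find(parent, a):
    while parent[a] != a:
        parent[a] = parent[parent[a]]   # path compression
        a = parent[a]
    return a

def union(parent, a, b):
    parent[find(parent, a)] = find(parent, b)

def compose(cor1, cor2, X, Y, Z):
    """cor1: partition of X+Y, cor2: partition of Y+Z, each a list of sets
    of tagged elements.  Returns their composite, a partition of X+Z."""
    elems = [('X', x) for x in X] + [('Y', y) for y in Y] + [('Z', z) for z in Z]
    parent = {e: e for e in elems}
    for block in cor1 + cor2:
        block = list(block)
        for e in block[1:]:
            union(parent, block[0], e)   # wire together everything in a block
    classes = {}
    for e in elems:
        if e[0] != 'Y':                  # forget the middle boundary
            classes.setdefault(find(parent, e), set()).add(e)
    return list(classes.values())

# Wires {x1, y1} and a bare terminal {x2}, composed with wires {y1, z1, z2}:
X, Y, Z = ['x1', 'x2'], ['y1'], ['z1', 'z2']
cor1 = [{('X', 'x1'), ('Y', 'y1')}, {('X', 'x2')}]
cor2 = [{('Y', 'y1'), ('Z', 'z1'), ('Z', 'z2')}]
print(compose(cor1, cor2, X, Y, Z))
# x1, z1, z2 end up connected; x2 stays on its own

(Notice that any block containing only Y elements simply disappears in the composite; this is the kind of behaviour the ‘extra’ law described below is about.)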

The rules governing corelations

The main result in Brandon and Brendan’s paper is that \mathrm{FinCorel} is equivalent to the PROP for extraspecial commutative Frobenius monoids. That’s a terse way of stating the laws governing \mathrm{FinCorel}.

Let me just show you the most important laws. In each of these laws I’ll draw two circuits made of wires, and write an equals sign asserting that they give the same corelation from a set X to a set Y. The inputs X of each circuit are on top, and the outputs Y are at the bottom. I’ll draw 3-way junctions as little triangles, but don’t worry about that. When we compose two corelations we may get a wire left in mid-air, not connected to the inputs or outputs. We draw the end of the wire as a little circle.

There are some laws called the ‘commutative monoid’ laws:

and an upside-down version called the ‘cocommutative comonoid’ laws:

Then we have ‘Frobenius laws’:

and finally we have the ‘special’ and ‘extra’ laws:

All other laws can be derived from these in a systematic way.

Commutative Frobenius monoids obey the commutative monoid laws, the cocommutative comonoid laws and the Frobenius laws. They play a fundamental role in 2d topological quantum field theory. Special Frobenius monoids are also well-known. But the ‘extra’ law, which says that a little piece of wire not connected to anything can be thrown away with no effect, is less well studied. Jason Erbele and I gave it this name in our work on control theory:

• John Baez and Jason Erbele, Categories in control. (Blog article here.)

For more

David Ellerman has spent a lot of time studying what would happen to mathematics if we turned around a lot of arrows in a certain systematic way. In particular, just as the concept of relation would be replaced by the concept of corelation, the concept of subset would be replaced by the concept of partition. You can see how it fits together: just as a relation from X to Y is a subset of X \times Y, a corelation from X to Y is a partition of X + Y.

There’s a lattice of subsets of a set:

In logic these subsets correspond to propositions, and the lattice operations are the logical operations ‘and’ and ‘or’. But there’s also a lattice of partitions of a set:

In Ellerman’s vision, this lattice of partitions gives a new kind of logic. You can read about it here:

• David Ellerman, Introduction to partition logic, Logic Journal of the Interest Group in Pure and Applied Logic 22 (2014), 94–125.

As mentioned, the main result in Brandon and Brendan’s paper is that \mathrm{FinCorel} is equivalent to the PROP for extraspecial commutative Frobenius monoids. After they proved this, they noticed that the result has also been stated in other language and proved in other ways by two other authors:

• Fabio Zanasi, Interacting Hopf Algebras—the Theory of Linear Systems, PhD thesis, École Normale Supérieure de Lyon, 2015.

• K. Dosen and Z. Petrić, Syntax for split preorders, Annals of Pure and Applied Logic 164 (2013), 443–481.

Unsurprisingly, I prefer Brendan and Brandon’s approach to deriving the result. But it’s nice to see different perspectives!


Among the Bone Eaters

30 January, 2016

Anthropologists sometimes talk about the virtues and dangers of ‘going native’: doing the same things as the people they’re studying, adopting their habits and lifestyles—and perhaps even their beliefs and values. The same applies to field biologists: you sometimes see it happen to people who study gorillas or chimpanzees.

It’s more impressive to see someone go native with a pack of hyenas:

• Marcus Baynes-Rock, Among the Bone Eaters: Encounters with Hyenas in Harar, Penn State University Press, 2015.

I’ve always been scared of hyenas, perhaps because they look ill-favored and ‘mean’ to me, or perhaps because their jaws have incredible bone-crushing force:

This is a spotted hyena, the species of hyena that Marcus Baynes-Rock befriended in the Ethiopian city of Harar. Their bite force has been estimated at 220 pounds!

(As a scientist I should say 985 newtons, but I have trouble imagining what it’s like to have teeth pressing into my flesh with a force of 985 newtons. If you don’t have a feeling for ‘pounds’, just imagine a 100-kilogram man standing on a hyena tooth that is pressing into your leg.)

So, you don’t want to annoy a hyena, or look too edible. However, the society of hyenas is founded on friendship! It’s the bonds of friendship that will make one hyena rush in to save another from an attacking lion. So, if you can figure out how to make hyenas befriend you, you’ve got some heavy-duty pals who will watch your back.

In Harar, people have been associating with spotted hyenas for a long time. At first they served as ‘trash collectors’, but later the association deepened. According to Wikipedia:

Written records indicate that spotted hyenas have been present in the walled Ethiopian city of Harar for at least 500 years, where they sanitise the city by feeding on its organic refuse.

The practice of regularly feeding them did not begin until the 1960s. The first to put it into practice was a farmer who began to feed hyenas in order to stop them attacking his livestock, with his descendants having continued the practice. Some of the hyena men give each hyena a name they respond to, and call to them using a “hyena dialect”, a mixture of English and Oromo. The hyena men feed the hyenas by mouth, using pieces of raw meat provided by spectators. Tourists usually organize to watch the spectacle through a guide for a negotiable rate. As of 2002, the practice is considered to be on the decline, with only two practicing hyena men left in Harar.


Hyena man — picture by Gusjer

According to local folklore, the feeding of hyenas in Harar originated during a 19th-century famine, during which the starving hyenas began to attack livestock and humans. In one version of the story, a pure-hearted man dreamed of how the Hararis could placate the hyenas by feeding them porridge, and successfully put it into practice, while another credits the revelation to the town’s Muslim saints convening on a mountaintop. The anniversary of this pact is celebrated every year on the Day of Ashura, when the hyenas are provided with porridge prepared with pure butter. It is believed that during this occasion, the hyenas’ clan leaders taste the porridge before the others. Should the porridge not be to the lead hyenas’ liking, the other hyenas will not eat it, and those in charge of feeding them make the requested improvements. The manner in which the hyenas eat the porridge on this occasion are believed to have oracular significance; if the hyena eats more than half the porridge, then it is seen as portending a prosperous new year. Should the hyena refuse to eat the porridge or eat all of it, then the people will gather in shrines to pray, in order to avert famine or pestilence.

Marcus Baynes-Rock went to Harar to learn about this. He wound up becoming friends with a pack of hyenas:

He would play with them and run with them through the city streets at night. In the end he ‘went native’: he would even be startled, like the hyenas, when they came across a human being!

To get a feeling for this, I think you have to either read his book or listen to this:

In a city that welcomes hyenas, an anthropologist makes friends, Here and Now, National Public Radio, 18 January 2016.

Nearer the beginning of this quest, he wrote this:

The Old Town of Harar in eastern Ethiopia is enclosed by a wall built 500 years ago to protect the town’s inhabitants from hostile neighbours after a religious conflict that destabilised the region. Historically, the gates would be opened every morning to admit outsiders into the town to buy and sell goods and perhaps worship at one of the dozens of mosques in the Muslim city. Only Muslims were allowed to enter. And each night, non-Hararis would be evicted from the town and the gates locked. So it is somewhat surprising that this endogamous, culturally exclusive society incorporated holes into its defensive wall, through which spotted hyenas from the surrounding hills could access the town at night.

Spotted hyenas could be considered the most hated mammal in Africa. Decried as ugly and awkward, associated with witches and sorcerers and seen as contaminating, spotted hyenas are a public relations challenge of the highest order. Yet in Harar, hyenas are not only allowed into the town to clean the streets of food scraps, they are deeply embedded in the traditions and beliefs of the townspeople. Sufism predominates in Harar and at last count there were 121 shrines in and near the town dedicated to the town’s saints. These saints are said to meet on Mt Hakim every Thursday to discuss any pressing issues facing the town and it is the hyenas who pass the information from the saints on to the townspeople via intermediaries who can understand hyena language. Etymologically, the Harari word for hyena, ‘waraba’ comes from ‘werabba’ which translates literally as ‘news man’. Hyenas are also believed to clear the streets of jinn, the unseen entities that are a constant presence for people in the town, and hyenas’ spirits are said to be like angels who fight with bad spirits to defend the souls of spiritually vulnerable people.

[…]

My current research in Harar is concerned with both sides of the relationship. First is the collection of stories, traditions, songs and proverbs of which there are many and trying to understand how the most hated mammal in Africa can be accommodated in an urban environment; to understand how a society can tolerate the presence of a potentially dangerous species. Second is to understand the hyenas themselves and their participation in the relationship. In other parts of Ethiopia, and even within walking distance of Harar, hyenas are dangerous animals and attacks on people are common. Yet, in the old town of Harar, attacks are unheard of and it is not unusual to see hyenas, in search of food scraps, wandering past perfectly edible people sleeping in the streets. This localised immunity from attack is reassuring for a researcher spending nights alone with the hyenas in Harar’s narrow streets and alleys.

But this sounds like it was written before he went native!

Social networks

By the way: people have even applied network theory to friendships among spotted hyenas:

• Amiyaal Ilany, Andrew S. Booms and Kay E. Holekamp, Topological effects of network structure on long-term social network dynamics in a wild mammal, Ecology Letters, 18 (2015), 687–695.

The paper is not open-access, but there’s an easy-to-read summary here:

Scientists puzzled by ‘social network’ of spotted hyenas, Sci-News.com, 18 May 2015.

The scientists collected more than 55,000 observations of social interactions of spotted hyenas (also known as laughing hyenas) over a 20-year period in Kenya, making this one of the largest studies to date of social network dynamics in any non-human species.

They found that cohesive clustering of the kind where an individual bonds with friends of friends, something scientists call ‘triadic closure,’ was the most consistent factor influencing the long-term dynamics of the social structure of these mammals.

Individual traits, such as sex and social rank, and environmental effects, such as the amount of rainfall and the abundance of prey, also matter, but the ability of individuals to form and maintain social bonds in triads was key.

“Cohesive clusters can facilitate efficient cooperation and hence maximize fitness, and so our study shows that hyenas exploit this advantage. Interestingly, clustering is something done in human societies, from hunter-gatherers to Facebook users,” said Dr Ilany, who is the lead author on the study published in the journal Ecology Letters.

Hyenas, which can live up to 22 years, typically live in large, stable groups known as clans, which can comprise more than 100 individuals.

According to the scientists, hyenas can discriminate maternal and paternal kin from unrelated hyenas and are selective in their social choices, tending to not form bonds with every hyena in the clan, rather preferring the friends of their friends.

They found that hyenas follow a complex set of rules when making social decisions. Males follow rigid rules in forming bonds, whereas females tend to change their preferences over time. For example, a female might care about social rank at one time, but then later choose based on rainfall amounts.

“In spotted hyenas, females are the dominant sex and so they can be very flexible in their social preferences. Females also remain in the same clan all their lives, so they may know the social environment better,” said study co-author Dr Kay Holekamp of Michigan State University.

“In contrast, males disperse to new clans after reaching puberty, and after they disperse they have virtually no social control because they are the lowest ranking individuals in the new clan, so we can speculate that perhaps this is why they are obliged to follow stricter social rules.”

If you like math, you might like this way of measuring ‘triadic closure’:

Triadic closure, Wikipedia.

For a program to measure triadic closure, click on the picture:
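
Separately from that program, here’s a rough sketch in Python (assuming the networkx library) of one standard way to quantify triadic closure: the fraction of connected triples of nodes that close up into triangles.

import networkx as nx

G = nx.Graph()
# a tiny made-up 'clan': each edge is a social bond
G.add_edges_from([
    ('A', 'B'), ('B', 'C'), ('A', 'C'),   # a closed triad
    ('C', 'D'), ('D', 'E'),               # an open chain
])

# global measure: 3 * (number of triangles) / (number of connected triples)
print(nx.transitivity(G))
# local version, averaged over nodes
print(nx.average_clustering(G))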


The Internal Model Principle

27 January, 2016

“Every good key must be a model of the lock it opens.”

That sentence states an obvious fact, but perhaps also a profound insight if we interpret it generally enough.

That sentence is also the title of a paper:

• Daniel L. Scholten, Every good key must be a model of the lock it opens (the Conant & Ashby Theorem revisited), 2010.

Scholten gives a lot of examples, including these:

• A key is a model of a lock’s keyhole.

• A city street map is a model of the actual city streets.

• A restaurant menu is a model of the food the restaurant prepares and sells.

• Honey bees use a kind of dance to model the location of a source of nectar.

• An understanding of some phenomenon (for example a physicist’s understanding of lightning) is a mental model of the actual phenomenon.

This line of thought has an interesting application to control theory. It suggests that to do the best job of regulating some system, a control apparatus should include a model of that system.

Indeed, much earlier, Conant and Ashby tried to turn this idea into a theorem, the ‘good regulator theorem’:

• Roger C. Conant and W. Ross Ashby, Every good regulator of a system must be a model of that system, International Journal of Systems Science 1 (1970), 89–97.

Scholten’s paper is heavily based on this earlier paper. He summarizes it as follows:

What all of this means, more or less, is that the pursuit of a goal by some dynamic agent (Regulator) in the face of a source of obstacles (System) places at least one particular and unavoidable demand on that agent, which is that the agent’s behaviors must be executed in such a reliable and predictable way that they can serve as a representation (Model) of that source of obstacles.

It’s not clear that this is true, but it’s an appealing thought.

A particularly self-referential example arises when the regulator is some organism and the System is the world it lives in, including itself. In this case, it seems the regulator should include a model of itself! This would lead, ultimately, to self-awareness.

It all sounds great. But Scholten raises an obvious question: if Conant and Ashby’s theorem is so great, why isn’t it more well-known? Scholten puts it quite vividly:

Given the preponderance of control-models that are used by humans (the evidence for this preponderance will be surveyed in the latter part of the paper), and especially given the obvious need to regulate that system, one might guess that the C&A theorem would be at least as famous as, say, the Pythagorean Theorem (a^2 + b^2 = c^2), the Einstein mass-energy equivalence (E = mc^2, which can be seen on T-shirts and bumper stickers), or the DNA double helix (which actually shows up in TV crime dramas and movies about super heroes). And yet, it would appear that relatively few lay-persons have ever even heard of C&A’s important prerequisite to successful regulation.

There could be various explanations. But here’s mine: when I tried to read Conant and Ashby’s paper, I got stuck. They use some very basic mathematical notation in nonstandard ways, and they don’t clearly state the hypotheses and conclusion of their theorem.

Luckily, the paper is short, and the argument, while mysterious, seems simple. So, I immediately felt I should be able to dream up the hypotheses, conclusion, and argument based on the hints given.

Scholten’s paper didn’t help much, since he says:

Throughout the following discussion I will assume that the reader has studied Conant & Ashby’s original paper, possesses the level of technical competence required to understand their proof, and is familiar with the components of the basic model that they used to prove their theorem […]

However, I have a guess about the essential core of Conant and Ashby’s theorem. So, I’ll state that, and then say more about their setup.

Needless to say, I looked around to see if someone else had already done the work of figuring out what Conant and Ashby were saying. The best thing I found was this:

• B. A. Francis and W. M. Wonham, The internal model principle of control theory, Automatica 12 (1976) 457–465.

This paper works in a more specialized context: linear control theory. They’ve got a linear system or ‘plant’ responding to some input, a regulator or ‘compensator’ that is trying to make the plant behave in a desired way, and a ‘disturbance’ that affects the plant in some unwanted way. They prove that to perfectly correct for the disturbance, the compensator must contain an ‘internal model’ of the disturbance.

I’m probably stating this a bit incorrectly. This paper is much more technical, but it seems to be more careful in stating assumptions and conclusions. In particular, they seem to give a precise definition of an ‘internal model’. And I read elsewhere that the ‘internal model principle’ proved here has become a classic result in control theory!

This paper says that Conant and Ashby’s paper provided “plausibility arguments in favor of the internal model idea”. So, perhaps Conant and Ashby inspired Francis and Wonham, and were then largely forgotten.

My guess

My guess is that Conant and Ashby’s theorem boils down to this:

Theorem. Let R and S be finite sets, and fix a probability distribution p on S. Suppose q is any probability distribution on R \times S such that

\displaystyle{ p(s) = \sum_{r \in R} q(r,s)  \; \textrm{for all} \; s \in S}

Let H(p) be the Shannon entropy of p and let H(q) be the Shannon entropy of q. Then

H(q) \ge H(p)

and equality is achieved if there is a function

h: S \to R

such that

q(r,s) = \left\{\begin{array}{cc} p(s)  &  \textrm{if} \; r = h(s) \\                                             0  & \textrm{otherwise}  \end{array} \right.       █

Note that this is not an ‘if and only if’.

The proof of this is pretty easy for anyone who knows a bit about probability theory and entropy. I can restate it using a bit of standard jargon, which may make it more obvious to experts. We’ve got an S-valued random variable, say \textbf{s}. We want to extend it to an R \times S-valued random variable (\textbf{r}, \textbf{s}) whose entropy is as small as possible. Then we can achieve this by choosing a function h: S \to R, and letting \textbf{r} = h(\textbf{s}).

Here’s the point: if we make \textbf{r} be a function of \textbf{s}, we aren’t adding any extra randomness, so the entropy doesn’t go up.
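
Here’s a quick numerical check of that claim, as a Python sketch with made-up numbers (my notation, not Conant and Ashby’s):

# Extending an S-valued random variable to an (R x S)-valued one cannot lower
# the entropy, and making r a deterministic function h(s) keeps it exactly the same.
from math import log2

def H(dist):
    """Shannon entropy of a distribution given as a dict of probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

S, R = ['s1', 's2', 's3'], ['r1', 'r2']
p = {'s1': 0.5, 's2': 0.3, 's3': 0.2}              # distribution on S

h = {'s1': 'r1', 's2': 'r2', 's3': 'r1'}           # a function h: S -> R
q_det = {(h[s], s): p[s] for s in S}               # q(r,s) = p(s) if r = h(s), else 0

# for comparison, choose r at random, independently of s
q_rand = {(r, s): p[s] * (0.7 if r == 'r1' else 0.3) for r in R for s in S}

print(H(p))       # about 1.49
print(H(q_det))   # equal to H(p): no extra randomness added
print(H(q_rand))  # strictly larger than H(p)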

What in the world does this have to do with a good regulator containing a model of the system it’s regulating?

Well, I can’t explain that as well as I’d like—sorry. But the rough idea seems to be this. Suppose that S is a system with a given random behavior, and R is another system, the regulator. If we want the combination of the system and regulator to behave as ‘nonrandomly’ as possible, we can let the state of the regulator be a function of the state of the system.

This theorem is actually a ‘lemma’ in Conant and Ashby’s paper. Let’s look at their setup, and the ‘good regulator theorem’ as they actually state it.

Their setup

Conant and Ashby consider five sets and three functions. In a picture:


The sets are these:

• A set Z of possible outcomes.

• A goal: some subset G \subseteq Z of good outcomes.

• A set D of disturbances, which I might prefer to call ‘inputs’.

• A set S of states of some system that is affected by the disturbances.

• A set R of states of some regulator that is also affected by the disturbances.

The functions are these:

• A function \phi : D \to S saying how a disturbance determines a state of the system.

• A function \rho: D \to R saying how a disturbance determines a state of the regulator.

• A function \psi: S \times R \to Z saying how a state of the system and a state of the regulator determine an outcome.

Of course we want some conditions on these maps. What we want, I guess, is for the outcome to be good regardless of the disturbance. I might say that as follows: for every d \in D we have

\psi(\phi(d), \rho(d)) \in G

Unfortunately Conant and Ashby say they want this:

\rho \subset  [\psi^{-1}(G)]\phi

I can’t parse this: they’re using math notation in ways I don’t recognize. Can you figure out what they mean, and whether it matches my guess above?
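
For concreteness, here’s a toy illustration in Python of the condition I guessed above, with made-up finite sets and maps (nothing from Conant and Ashby’s paper):

# Toy version of the setup: disturbances D, system states S, regulator states R,
# outcomes Z, and a set G of good outcomes.  The guessed condition is that
# psi(phi(d), rho(d)) lands in G for every disturbance d.
D = ['calm', 'gust', 'storm']
G = {'ok'}                                                  # good outcomes, a subset of Z

phi = {'calm': 'low', 'gust': 'medium', 'storm': 'high'}    # phi: D -> S
rho = {'calm': 'idle', 'gust': 'damp', 'storm': 'damp'}     # rho: D -> R

def psi(s, r):
    # psi: S x R -> Z, how system and regulator states combine into an outcome
    return 'ok' if s == 'low' or r == 'damp' else 'bad'

print(all(psi(phi[d], rho[d]) in G for d in D))             # True: this regulator succeeds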

Then, after a lot of examples and stuff, they state their theorem:

Theorem. The simplest optimal regulator R of a reguland S produces events R which are related to events S by a mapping h: S \to R.

Clearly I’ve skipped over too much! This barely makes any sense at all.

Unfortunately, looking at the text before the theorem, I don’t see these terms being explained. Furthermore, their ‘proof’ introduces extra assumptions that were not mentioned in the statement of the theorem. It begins:

The sets R, S, and Z and the mapping \psi: R \times S \to Z are presumed given. We will assume that over the set S there exists a probability distribution p(S) which gives the relative frequencies of the events in S. We will further assume that the behaviour of any particular regulator R is specified by a conditional distribution p(R|S) giving, for each event in S, a distribution on the regulatory events in R.

Get it? Now they’re saying the state of the regulator R depends on the state of the system S via a conditional probability distribution p(r|s) where r \in R and s \in S. It’s odd that they didn’t mention this earlier! Their picture made it look like the state of the regulator is determined by the ‘disturbance’ via the function \rho: D \to R. But okay.

They’re also assuming there’s a probability distribution on S. They use this and the above conditional probability distribution to get a probability distribution on R.

In fact, the set D and the functions out of this set seem to play no role in their proof!

It’s unclear to me exactly what we’re given, what we get to choose, and what we’re trying to optimize. They do try to explain this. Here’s what they say:

Now p(S) and p(R|S) jointly determine p(R,S) and hence p(Z) and H(Z), the entropy in the set of outcomes:

\displaystyle{ H(Z) = - \sum_{z \in Z} p(z) \log (p(z)) }

With p(S) fixed, the class of optimal regulators therefore corresponds to the class of optimal distributions p(R|S) for which H(Z) is minimal. We will call this class of optimal distributions \pi.

I could write a little essay on why this makes me unhappy, but never mind. I’m used to the habit of using the same letter p to stand for probability distributions on lots of different sets: folks let the argument of p say which set they have in mind at any moment. So, they’re starting with a probability distribution on S and a conditional probability distribution on r \in R given s \in S. They’re using these to determine a probability distribution on R \times S. Then, presumably using the map \psi: S \times R \to Z, they get a probability distribution on Z. H(Z) is the entropy of the probability distribution on Z, and for some reason they are trying to minimize this.

(Where did the subset G \subseteq Z of ‘good’ outcomes go? Shouldn’t that play a role? Oh well.)

I believe the claim is that when this entropy is minimized, there’s a function h : S \to R such that

p(r|s) = 1 \; \textrm{if} \; r = h(s)

This says that the state of the regulator should be completely determined by the state of the system. And this, I believe, is what they mean by

Every good regulator of a system must be a model of that system.

I hope you understand: I’m not worrying about whether the setup is a good one, e.g. sufficiently general for real-world applications. I’m just trying to figure out what the setup actually is, what Conant and Ashby’s theorem actually says, and whether it’s true.

I think I’ve just made a lot of progress. Surely this was no fun to read. But I found it useful to write it.


Ken Caldeira on What To Do

25 January, 2016

Famous climate scientist Ken Caldeira has a new article out:

• Ken Caldeira, Stop Emissions!, Technology Review, January/February 2016, 41–43.

Let me quote a bit:

Many years ago, I protested at the gates of a nuclear power plant. For a long time, I believed it would be easy to get energy from biomass, wind, and solar. Small is beautiful. Distributed power, not centralized.

I wish I could still believe that.

My thinking changed when I worked with Marty Hoffert of New York University on research that was first published in Nature in 1998. It was the first peer-reviewed study that examined the amount of near-zero-emission energy we would need in order to solve the climate problem. Unfortunately, our conclusions still hold. We need massive deployment of affordable and dependable near-zero-emission energy, and we need a major research and development program to develop better energy and transportation systems.

It’s true that wind and solar power have been getting much more attractive in recent years. Both have gotten significantly cheaper. Even so, neither wind nor solar is dependable enough, and batteries do not yet exist that can store enough energy at affordable prices to get a modern industrial society through those times when the wind is not blowing and the sun is not shining.

Recent analyses suggest that wind and solar power, connected by a continental-scale electric grid and using natural-gas power plants to provide backup, could reduce greenhouse-gas emissions from electricity production by about two-thirds. But generating electricity is responsible for only about one-third of total global carbon dioxide emissions, which are increasing by more than 2 percent a year. So even if we had this better electric sector tomorrow, within a decade or two emissions would be back where they are today.

We need to bring much, much more to bear on the climate problem. It can’t be solved unless it is addressed as seriously as we address national security. The politicians who go to the Paris Climate Conference are making commitments that fall far short of what would be needed to substantially reduce climate risk.

Daunting math

Four weeks ago, a hurricane-strength cyclone smashed into Yemen, in the Arabian Peninsula, for the first time in recorded history. Also this fall, a hurricane with the most powerful winds ever measured slammed into the Pacific coast of Mexico.

Unusually intense storms such as these are a predicted consequence of global warming, as are longer heat waves and droughts and many other negative weather-related events that we can expect to become more commonplace. Already, in the middle latitudes of the Northern Hemisphere, average temperatures are increasing at a rate that is equivalent to moving south about 10 meters (30 feet) each day. This rate is about 100 times faster than most climate change that we can observe in the geologic record, and it gravely threatens biodiversity in many parts of the world. We are already losing about two coral reefs each week, largely as a direct consequence of our greenhouse-gas emissions.

Recently, my colleagues and I studied what will happen in the long term if we continue pulling fossil carbon out of the ground and releasing it into the atmosphere. We found that it would take many thousands of years for the planet to recover from this insult. If we burn all available fossil-fuel resources and dump the resulting carbon dioxide waste in the sky, we can expect global average temperatures to be 9 °C (15 °F) warmer than today even 10,000 years into the future. We can expect sea levels to be about 60 meters (200 feet) higher than today. In much of the tropics, it is possible that mammals (including us) would not be able to survive outdoors in the daytime heat. Thus, it is essential to our long-term well-being that fossil-fuel carbon does not go into our atmosphere.

If we want to reduce the threat of climate change in the near future, there are actions to take now: reduce emissions of short-lived pollutants such as black carbon, cut emissions of methane from natural-gas fields and landfills, and so on. We need to slow and then reverse deforestation, adopt electric cars, and build solar, wind, and nuclear plants.

But while existing technologies can start us down the path, they can’t get us to our goal. Most analysts believe we should decarbonize electricity generation and use electricity for transportation, industry, and even home heating. (Using electricity for heating is wildly inefficient, but there may be no better solution in a carbon-constrained world.) This would require a system of electricity generation several times larger than the one we have now. Can we really use existing technology to scale up our system so dramatically while markedly reducing emissions from that sector?

Solar power is the only energy source that we know can power civilization indefinitely. Unfortunately, we do not have global-scale electricity grids that could wheel solar energy from day to night. At the scale of the regional electric grid, we do not have batteries that can balance daytime electricity generation with nighttime demand.

We should do what we know how to do. But all the while, we need to be thinking about what we don’t know how to do. We need to find better ways to generate, store, and transmit electricity. We also need better zero-carbon fuels for the parts of the economy that can’t be electrified. And most important, perhaps, we need better ways of using energy.

Energy is a means, not an end. We don’t want energy so much as we want what it makes possible: transportation, entertainment, shelter, and nutrition. Given United Nations estimates that the world will have at least 11 billion people by the end of this century (50 percent more than today), and given that we can expect developing economies to grow rapidly, demand for services that require energy is likely to increase by a factor of 10 or more over the next century. If we want to stabilize the climate, we need to reduce total emissions from today’s level by a factor of 10. Put another way, if we want to destroy neither our environment nor our economy, we need to reduce the emissions per energy service provided by a factor of 100. This requires something of an energy miracle.

The essay continues.

Near the end, he writes “despite all these reasons for despair, I’m hopeful”. He is hopeful that a collective change of heart is underway that will enable humanity to solve this problem. But he doesn’t claim to know any workable solution to the problem. In fact, he mostly lists reasons why various possible solutions won’t be enough.

