Probability Puzzles

24 August, 2010

Today Greg Egan mailed me two puzzles in probability theory: a “simple” one, and a more complicated one that compares Bayesian and frequentist interpretations of probability theory.

Try your hand at the simple one first. Egan wrote:

A few months ago I read about a very simple but fun probability puzzle. Someone tells you:

“I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?”

Please give it a try before moving on.

Of course, your first reaction should be “it’s irrelevant that the boy was born on a Tuesday”. At least that was my first reaction. So I said:

I’d intuitively assume that the day Tuesday is not relevant, so I’d ignore that information – or else look at some hospital statistics to see if it is relevant. I’d also assume that boy/girl births act just like independently distributed fair coin flips — which is surely false, but I’m guessing the puzzle wants us to assume it’s true. And then I’d say there are 4 equally likely options: BB, BG, GB and GG.

If you tell me “one is a boy”, it’s very different from “the first one is a boy”. If one is a boy, we’re down to 3 equally likely options: BB, BG, and GB. So, the probability of two boys is 1/3.

But that’s not the answer Egan gives:

The usual answer to this puzzle — after people get over an initial intuitive sense that the “Tuesday” can’t possibly be relevant — is that the probability of having two sons is 13/27. If someone has two children, for each there are 14 possibilities as to boy/girl and weekday of birth, so if at least one child is a son born on a Tuesday there are 14 + 14 – 1 = 27 possibilities (subtracting 1 for the doubly-counted intersection, where both children are sons born on a Tuesday), of which 7 + 7 – 1 = 13 involve two sons.

If you find that answer unbelievable, read his essay! He does a good job of making it more intuitive:

• Greg Egan, Some thoughts on Tuesday’s child.
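If you’d rather let a computer do the counting, here’s a quick brute-force check, assuming (as above) that sexes and weekdays act like independent fair coins and dice:

    from itertools import product

    days = range(7)                                # say day 0 = Tuesday
    children = list(product("BG", days))           # 14 possibilities per child
    families = list(product(children, children))   # 196 equally likely families

    # families with at least one boy born on a Tuesday:
    tuesday_boy = [f for f in families if ("B", 0) in f]
    two_boys = [f for f in tuesday_boy if f[0][0] == "B" and f[1][0] == "B"]
    print(len(two_boys), "/", len(tuesday_boy))    # 13 / 27

    # families with at least one boy, ignoring the day of the week:
    any_boy = [f for f in families if "B" in (f[0][0], f[1][0])]
    both = [f for f in any_boy if f[0][0] == "B" and f[1][0] == "B"]
    print(len(both), "/", len(any_boy))            # 49 / 147, i.e. 1/3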

But then comes his deeper puzzle, or question:

That’s fine, but as a frequentist if someone asks me to take this probability seriously and start making bets, I will only do so if I can imagine some repetition of the experiment. Suppose someone offered me $81 if the parent had two sons, but I had to pay $54 if they had a son and a daughter. The expected gain from that bet for P(two sons)=13/27 would be $11.

If I took up that bet, I would then resolve that in the future I’d only take the same bet again if the person each time had two children and at least one son born specifically on a TUESDAY. In fact, I’d insist on asking the parent myself “Do you have at least one son born on a Tuesday?” rather than having them volunteer the information (since someone with two sons born on different days might not mention the one born on a Tuesday). That way, I’d be sampling a subset of parents all meeting exactly the same conditions, and I’d be satisfied that my long-term expectation of gain really would be $11 per bet.

But I’m curious as to how a Bayesian, who is happier to think of a probability applying to a single event in isolation, would respond to the same situation. It seems to me (perhaps naively) that a Bayesian ought to be happy to take this bet any time, and then forget about what they did in the past — which ought to make them willing to take the bet on future offers even when the day of the week when the son was born changes. After all, P(two sons)=13/27 whatever day is substituted for Tuesday.

However, anyone who agreed to keep taking the bet regardless of the day of the week would lose money! Without pinning down the day to a particular choice, you’re betting on a sample of parents who simply have two children, at least one of whom is a son. That gives P(two sons)=1/3, and the expectation for the $81/$54 bet becomes a $9 loss.

Now, I understand how the difference between P(two sons)=13/27 and P(two sons)=1/3 arises, despite the perfect symmetry between the weekdays; the subsets with “at least one son born on day X” are not disjoint, so even though they are isomorphic, their union will have a different proportion of two-son families than the individual subsets.

What’s puzzling me is this: how does a Bayesian reason about the thought experiment I’ve described, in such a way that they don’t end up taking the bet every time and losing money?
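Before you answer, it may help to see the trap spring in simulation. Here’s a small Monte Carlo sketch of Egan’s scenario (same independence assumptions as before): conditioning on “a son born on a Tuesday” really does give about an $11 gain per bet, while betting against every parent who merely has a son gives about a $9 loss per bet.

    import random

    def family():
        # two children; sex and weekday independent and uniform (day 0 = Tuesday)
        return [(random.choice("BG"), random.randrange(7)) for _ in range(2)]

    def gain(f):
        # the $81 / $54 bet: win if both children are boys
        return 81 if all(sex == "B" for sex, day in f) else -54

    pop = [family() for _ in range(10**6)]

    # the frequentist discipline: bet only when some son was born on a Tuesday
    tuesday = [gain(f) for f in pop if ("B", 0) in f]
    print(sum(tuesday) / len(tuesday))    # about +11 dollars per bet

    # betting whenever there is at least one son, whatever the day:
    any_son = [gain(f) for f in pop if any(sex == "B" for sex, day in f)]
    print(sum(any_son) / len(any_son))    # about -9 dollars per bet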


Control of Cold Molecular Ions

22 August, 2010

On Tuesday, Dzmitry Matsukevich gave a talk on “Control and Manipulation of Cold Molecular Ions”. He just arrived here at the CQT, coming from Christopher Monroe’s Trapped Ion Quantum Information Group at the University of Maryland.



Cold molecules can be used to study:

• Quantum information
• Precision measurements
• Quantum chemistry
• Strongly interacting degenerate gases

But most work uses neutral molecules; work on cold molecular ions is a bit new. The advantage of working with ions is that since they’re electrically charged, they can be trapped in radio-frequency Paul traps. This allows them to be isolated from the environment for days or weeks, and methods developed for ion trap quantum computations can be applied to them. On the downside, it’s hard to find spectroscopic data on most molecular ions.

Molecules and molecular ions can wiggle in various ways, with different characteristic frequencies:

Vibrational modes: about 30 terahertz

Rotational modes: about 10-100 gigahertz

Hyperfine modes: about 1 gigahertz

Room temperature corresponds to a frequency of about 6.25 terahertz, so lots of modes are excited at this temperature. To make molecular ions easier to understand and manipulate, we’d prefer them to jump around between just a few modes: those of least energy. For this, we need to cool them down.
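As a quick sanity check of that conversion (the frequency matching temperature T is k_B T / h):

    k_B = 1.380649e-23     # Boltzmann constant, J/K
    h = 6.62607015e-34     # Planck constant, J*s
    print(k_B * 300 / h / 1e12)   # about 6.25, in terahertz

So the rotational and hyperfine modes are thoroughly excited at room temperature, while the 30-terahertz vibrational modes are largely frozen out.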

How can we do this? Use sympathetic cooling: our molecules can lose energy by interacting with trapped atomic ions that we keep cool using a laser!

(It might not be obvious that you can use a laser to cool something, but you can. The most popular method is called Doppler cooling. It’s basically a trick to make moving atoms more likely to emit photons than atoms that are standing still.)

Once our molecular ions are cold, how can we get them into specific desired states? Use a mode locked pulsed laser to drive stimulated Raman transitions.

Huh? As far as I can tell, this means “blast our molecular ion with an extremely brief pulse of light: it can then absorb a photon and emit a photon of a different energy, while itself jumping to a state of higher or lower energy.”

Here “extremely brief” can mean anywhere from picoseconds (10^{-12} seconds) to femtoseconds (10^{-15} seconds).

Once we’ve got our molecular ion in a specific state, it’ll get entangled with neighboring atomic ions thanks to their collective motion. This lets us try to implement quantum logic operations. There’s a large available Hilbert space: many qubits can be stored in a single molecule.

This paper shows how to use stimulated Raman transitions to create entangled atomic qubits:

• D. Hayes, D. N. Matsukevich, P. Maunz, D. Hucul, Q. Quraishi, S. Olmschenk, W. Campbell, J. Mizrahi, C. Senko, and C. Monroe, Entanglement of atomic qubits using an optical frequency comb, Phys. Rev. Lett. 104 (2010) 140501.

Precision control of molecular ions also lets us do precision measurements! Hyperfine modes depend on the mass of the electron and the fine structure constant. Vibrational and rotational modes depend on the mass of the proton. This allows accurate measurement of the proton-to-electron mass ratio.

(People looking at quasars in different parts of the sky see different drifts in the fine structure constant. One observation in the Northern hemisphere sees the fine structure constant changing, while one in the Southern hemisphere sees that it’s not. It’ll probably turn out nothing real is happening — at least that’s my conservative opinion — but it’s worth studying.)

What molecular ions are good to use?

• ionized silicon monoxide, SiO+: convenient transition wavelengths, no hyperfine structure, and… umm… almost diagonal Franck-Condon factors.

• ionized molecular chlorine, Cl2+: the fine structure splitting is close to the vibrational splitting, so this is good for precision measurements of the variation of the fine structure constant.

Matsukevich’s goals in the next 3 years:

• build an ion trap apparatus for simultaneously trapping Yb+ atomic ions and SiO+ molecular ions: the ytterbium lets us do sympathetic cooling.

• develop methods to load them, cool them, and do cool things with them!


Dying Coral Reefs

18 August, 2010


Global warming has been causing the "bleaching" of coral reefs. A bleached coral reef has lost its photosynthesizing symbiotic organisms, called zooxanthellae. It may look white as a ghost — as in the picture above — but it is not yet dead. If the zooxanthellae come back, the reef can recover.

With this year’s record high temperatures, many coral reefs are actually dying:

• Dan Charles, Massive coral die-off reported in Indonesia, Morning Edition, August 17, 2010.

DAN CHARLES: This past spring and early summer, the Andaman Sea, off the coast of Sumatra, was three, five, even seven degrees [Fahrenheit] warmer than normal. That can be dangerous to coral, so scientists from the Wildlife Conservation Society went out to the reefs to take a look. At that time, about 60 percent of the coral had turned white – it was under extreme stress but still alive.

Caleb McClennen from the Wildlife Conservation Society says they just went out to take a look again.

DR. CALEB MCCLENNEN: The shocking situation, now, is that about 80 percent of those that were bleached have now died.

CHARLES: That’s just in the area McClennen’s colleagues were able to survey. They’re asking other scientists to check on coral in other areas of the Andaman Sea.

Similar mass bleaching events have been observed this year in Sri Lanka, Thailand, Malaysia, and other parts of Indonesia.

For more, see:

• Environmental news service, Corals bleached and dying in overheated south Asian waters, August 16, 2010.

It’s interesting to look back at the history of corals.



Corals have been around for a long time. But the corals we see now are completely different from those that ruled the seas before the Permian-Triassic extinction event 250 million years ago. Those earlier corals, in turn, are completely different from those that dominated before the Ordovician began around 490 million years ago. A major group of corals called the Heliolitida died out in the Late Devonian extinction. And so on.

Why? Corals live near the surface of the ocean and are thus particularly sensitive not only to temperature changes but also to changes in sea levels and in the amount of dissolved CO2, which makes seawater more acidic.

We are now starting to see what the Holocene extinction will do to corals. Not only the warming but also the acidification of the oceans is hurting them. Indeed, seawater is reaching the point where aragonite, the mineral from which corals build their skeletons, becomes more soluble in water.

This paper reviews the issue:

• O. Hoegh-Guldberg, P. J. Mumby, A. J. Hooten, R. S. Steneck, P. Greenfield, E. Gomez, C. D. Harvell, P. F. Sale, A. J. Edwards, K. Caldeira, N. Knowlton, C. M. Eakin, R. Iglesias-Prieto, N. Muthiga, R. H. Bradbury, A. Dubi and M. E. Hatziolos, Coral reefs under rapid climate change and ocean acidification, Science 318 (14 December 2007), 1737-1742.

Chris Colose has a nice summary of what this paper predicts under three scenarios:

1) If CO2 is stabilized today, at 380 ppm-like conditions, corals will change a bit but areas will remain coral dominated. Hoegh-Guldberg et al. emphasize the importance of solving regional problems such as fishing pressure and air/water quality, which are human-induced but not directly linked to climate change/ocean acidification.

2) Increases of CO2 to 450-500 ppmv under the current >1 ppmv/yr scenario will cause significant declines in coral populations. Natural adaptive shifts to symbionts with a +2°C resistance may delay the demise of some reefs, and this will differ by area. Carbonate-ion concentrations will drop below the 200 µmol kg^{-1} threshold and coral erosion will outweigh calcification, with significant impacts on marine biodiversity.

3) In the words of the study, a scenario of >500 ppmv and +2°C sea surface temperatures “will reduce coral reef ecosystems to crumbling frameworks with few calcareous corals”. Due to latitudinally decreasing aragonite concentrations and projected atmospheric CO2 increases, adaptation by migrating to higher latitudes with more thermal tolerance is unlikely. Coral reefs exist within a narrow band of temperature, light, and aragonite saturation states, and expected rises in SSTs will produce many changes on timescales of decades to centuries (Hoegh-Guldberg 2005). Rising sea levels may also harm reefs, which need shallow water. Under business-as-usual to higher-range scenarios used by the IPCC, corals will become rare in the tropics, with huge impacts on biodiversity and the ecosystem services reefs provide.

The chemistry of coral is actually quite subtle. Here’s a nice introduction, at least for people who aren’t scared by section headings like “Why don’t corals simply pump more protons?”:

• Anne L. Cohen and Michael Holcomb, Why corals care about ocean acidification: uncovering the mechanism, Oceanography 22 (2009), 118-127.


Quantum Control Theory

16 August, 2010

The environmental thrust of this blog will rise to the surface again soon, I promise. I’m just going to a lot of talks on quantum technology, condensed matter physics, and the like. Ultimately the two threads should merge in a larger discourse that ranges from highly theoretical to highly practical. But right now you’re probably just confused about the purpose of this blog — it’s smeared out all across the intellectual landscape.

Anyway, to add to the confusion: I just got a nice email from Giampiero Campa, who in week294 had pointed me to the fascinating papers on control theory by Jan Willems. Control theory is the art of getting open systems — systems that interact with their environment — to behave in ways you want.

Since complex systems like ecosystems or the entire Earth are best understood as made of many interacting open systems, and/or being open systems themselves, I think ideas from control theory could become very important in understanding the Earth and how our actions affect it. But I’m also fascinated by control theory because of how it combines standard ideas in physics with new ideas that are best expressed using category theory — a branch of math I happen to know and like. (See week296 and subsequent issues for more on this.) And quantum control theory — the art of getting quantum systems to do what you want — is the sort of thing people here at the CQT may find interesting.

In short, control theory seems like a promising meeting-place for some of my disparate interests. Not necessarily the most important thing for ‘saving the planet’, by any means! But the kind of thing I can’t resist thinking about.

In his email, Campa pointed me to two new papers on this subject:

• Anthony M. Bloch, Roger W. Brockett, and Chitra Rangan, Finite controllability of infinite-dimensional quantum systems, IEEE Transactions on Automatic Control 55 (August 2010), 1797-1805.

• Matthew James and John E. Gough, Quantum dissipative systems and feedback control design by interconnection, IEEE Transactions on Automatic Control 55 (August 2010), 1806-1821.

The second one is related to the ideas of Jan Willems:

Abstract: The purpose of this paper is to extend J.C. Willems’ theory of dissipative systems to open quantum systems described by quantum noise models. This theory, which combines ideas from quantum physics and control theory, provides useful methods for analysis and design of dissipative quantum systems. We describe the interaction of the plant and a class of external systems, called exosystems, in terms of feedback networks of interconnected open quantum systems. Our results include an infinitesimal characterization of the dissipation property, which generalizes the well-known Positive Real and Bounded Real Lemmas, and is used to study some properties of quantum dissipative systems. We also show how to formulate control design problems using network models for open quantum systems, which implements Willems’ “control by interconnection” for open quantum systems. This control design formulation includes, for example, standard problems of stabilization, regulation, and robust control.

I don’t have anything intelligent to say about these papers yet. Does anyone out there know if ideas from quantum control theory have been used to tackle the problems that decoherence causes in quantum computation? The second article makes me wonder about this:

In the physics literature, methods have been developed to model energy loss and decoherence (loss of quantum coherence) arising from the interaction of a system with an environment. These models may be expressed using tools which include completely positive maps, Lindblad generators, and master equations. In the 1980s it became apparent that a wide range of open quantum systems, such as those found in quantum optics, could be described within a new unitary framework of quantum stochastic differential equations, where quantum noise is used to represent the influence of large heat baths and boson fields (which includes optical and phonon fields). Completely positive maps, Lindblad generators, and master equations are obtained by taking expectations.

Quantum noise models cover a wide range of situations involving light and matter. In this paper, we use quantum noise models for boson fields, as occur in quantum optics, mesoscopic superconducting circuits, and nanomechanical systems, although many of the ideas could be extended to other contexts. Quantum noise models can be used to describe an optical cavity, which consists of a pair of mirrors (one of which is partially transmitting) supporting a trapped mode of light. This cavity mode may interact with a free external optical field through the partially transmitting mirror. The external field consists of two components: the input field, which is the field before it has interacted with the cavity mode, and the output field, being the field after interaction. The output field may carry away energy, and in this way the cavity system dissipates energy. This quantum system is in some ways analogous to the RLC circuit discussed above, which stores electromagnetic energy in the inductor and capacitor, but loses energy as heat through the resistor. The cavity also stores electromagnetic energy, quantized as photons, and these may be lost to the external field…


Trends in Quantum Information Processing

16 August, 2010

This week the scientific advisory board of the CQT is coming to town, so we’re having lots and lots of talks.

Right now Michele Mosca, deputy director of the Institute for Quantum Computing in Waterloo, Canada, is speaking about “Trends in Quantum Information Processing” from a computer science / mathematics perspective.

(Why mathematics? He says he will always consider himself a mathematician…)

Alice and Bob

Mosca began by noting some of the cultural / linguistic differences that become important in large interdisciplinary collaborations. For example, computer scientists like to talk about people named Alice and Bob playing bizarre games. These games are usually an attempt to distill some aspect of a hard problem into a simple story.

Impossibility

Mosca also made a remark on “impossibility”. Namely: proofs that things are impossible, often called “no-go theorems”, are very important, but we shouldn’t overinterpret them. Just because one approach to doing something doesn’t work, doesn’t mean we can’t do it! From an optimistic viewpoint, no-go theorems simply tell us what assumptions we need to get around. There is, however, a difference between optimism and insanity.

For example: “We can’t build a quantum computer because of decoherence…”

Knill, Laflamme and Milburn initially set out to prove that linear optics alone would not give a universal quantum computer… but then they discovered it could.

• E. Knill, R. Laflamme, and G. J. Milburn, A scheme for efficient quantum computation with linear optics, Nature 409 (2001), 46. Also see their paper Efficient linear optics quantum computation on the arXiv.

Another example: “Quantum black-box algorithms give at most a polynomial speedup for total functions…”

If this result — proved by whom? — had been discovered before Shor’s algorithm, it might have discouraged Shor from finding this algorithm. But his algorithm involves a partial function.

Yet another: “Nuclear magnetic resonance quantum computing is not scalable…”

So far each argument for this can be gotten around, though nobody knows how to get around all of them simultaneously. Attempts to do this have led to important discoveries.

From theory to practice

Next, Mosca showed a kind of flow chart. Abstract algorithms spawn various models of quantum computation: circuit models, measurement-based models, adiabatic models, topological models, continuous-time models, cellular automaton models, and so on. These spawn specific architectures for (so far hypothetical) quantum computers; a lot of work still needs to be done to develop fault-tolerant architectures. Then come specific physical realizations that might implement these architectures: involving trapped ions, photon qubits, superconducting circuit qubits, spin qubits and so on. We’re just at the beginning of a long path.

A diversity of quantum algorithms

There are still lots of people who say there are just two interesting quantum algorithms:

Shor’s algorithm (for factoring integers in a time that’s polynomial as a function of their number of digits), and

Grover’s algorithm for searching lists in a time that’s less than linear as a function of the number of items in the list.

Mosca said:

The next time you hear someone say this, punch ’em in the head!

For examples of many different quantum algorithms, see Stephen Jordan’s ‘zoo’, and this paper:

• Michele Mosca, Quantum algorithms.

Then Mosca discussed three big trends…

1. Working with untrusted quantum apparatus

This trend started with:

• Artur Ekert, Quantum cryptography based on Bell’s theorem, Phys. Rev. Lett. 67 (1991), 661-663.

Then:

• Dominic Mayers and Andrew Yao, Quantum cryptography with imperfect apparatus.

And then much more, thanks in large part to people at the CQT! The idea is that if you have a quantum system whose behavior you don’t entirely trust, you can still do useful stuff with it. This idea is related to the subject of multi-prover interactive proofs, which however are studied by a different community.

This year some of these ideas got implemented in an actual experiment:

• S. Pironio, A. Acin, S. Massar, A. Boyer de la Giroday, D. N. Matsukevich, P. Maunz, S. Olmschenk, D. Hayes, L. Luo, T. A. Manning, and C. Monroe, Random numbers certified by Bell’s theorem, Nature 464 (2010), 1021.

(Yay! Someone who had the guts to put a paper for Nature on the arXiv!)

2. Ideas from topological quantum computing

The seeds here were planted in the late 1990s by:

• Michael Freedman, P/NP, and the quantum field computer, Proc. Natl. Acad. Sci. USA 95 (1998), 98-101.

This suggested that a topological quantum computer could efficiently compute the Jones polynomial at certain points — a #P-hard problem. That would mean quantum computers can solve NP problems in polynomial time! But it turned out that limitations in precision prevent this. With a great sigh of relief, computer scientists decided they didn’t need to learn topological quantum field theory… but then came:

• A. Kitaev, Fault-tolerant quantum computation by anyons, Ann. Phys. 303 (2003), 2-30.

This sort of computer may or may not ever be built, but it’s very interesting either way. And in 2005, Raussendorf, Harrington and Goyal used ideas from topological quantum computing to build fault-tolerance into another approach to quantum computing, based on “cluster states”:

• Robert Raussendorf, Jim Harrington, and Kovid Goyal, Topological fault-tolerance in cluster state quantum computation, 2007.

Subsequently there’s been a huge amount of work along these lines. (Physics junkies should check out the Majorana fermion codes of Bravyi, Leemhuis and Terhal.)

3. Semidefinite programming

The seeds here were planted about 10 years ago by Kitaev and Watrous. There’s been a wide range of applications since then. If you search quant-ph for abstracts containing the buzzword “semidefinite” you’ll get almost a hundred hits!

So what is semidefinite programming? It’s a relative of linear programming, an optimization method that’s widely used in microeconomics and the management of production, transportation and the like. I don’t understand it, but I guess it’s like a quantum version of linear programming!

I can parrot the definition:

A semidefinite program is a triple (\Phi, a,b). Here \Phi is a linear map that sends linear operators on some Hilbert space X to linear operators on some Hilbert space Y. Such a linear map is called a superoperator, since it operates on operators! We assume \Phi takes self-adjoint operators to self-adjoint operators. What about a and b? They are self-adjoint operators on X and Y, respectively.

This data gives us two problems:

Primal problem: find x, a positive semidefinite operator on X that minimizes \langle a,x \rangle subject to the constraint \Phi(x) \ge b.

Dual problem: find y, a positive semidefinite operator on Y that maximizes \langle b, y\rangle subject to the constraint \Phi^*(y) \le a.

This should remind you of duality in linear programming. In linear programming, the dual of the dual problem is the original ‘primal’ problem! Also, if you know the solution to either problem, you know the solution to both. Is this stuff true for semidefinite programming too?
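I can at least check half of that story numerically: “weak duality”, which says that any x obeying the primal constraints and any y obeying the dual constraints satisfy \langle a, x \rangle \ge \langle b, y \rangle. (One-line proof: \langle a,x \rangle \ge \langle \Phi^*(y), x \rangle = \langle y, \Phi(x) \rangle \ge \langle y, b \rangle.) Here’s a sketch where, purely for illustration, I take \Phi to be conjugation by a fixed matrix K — one easy way to get a superoperator that preserves self-adjointness:

    import numpy as np

    rng = np.random.default_rng(0)

    def rand_psd(n):
        A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
        return A @ A.conj().T                    # positive semidefinite

    def inner(A, B):
        return np.trace(A.conj().T @ B).real     # Hilbert-Schmidt inner product

    n, m = 3, 4
    K = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
    Phi = lambda x: K @ x @ K.conj().T           # superoperator from X to Y
    Phi_adj = lambda y: K.conj().T @ y @ K       # its adjoint, from Y to X

    # build feasible points by construction:
    x = rand_psd(n); b = Phi(x) - rand_psd(m)       # so Phi(x) >= b
    y = rand_psd(m); a = Phi_adj(y) + rand_psd(n)   # so Phi_adj(y) <= a

    print(inner(a, x) >= inner(b, y))            # weak duality: always True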

Example: given a bunch of density matrices, find an optimal bunch of measurements that distinguish them.

Example: ‘Quantum query algorithms’ are a generalization of Grover’s search algorithm. Reichardt used semidefinite programming to show there is always an optimal algorithm that uses just 2 reflections:

• Ben W. Reichardt, Reflections for quantum query algorithms.

There are lots of other trends, of course! These are just a few. Mosca apologized to anyone whose important work wasn’t mentioned, and I apologize even more, since I left out a lot of what he said….


The Geometry of Quantum Phase Transitions

13 August, 2010

Today at the CQT, Paolo Zanardi from the University of Southern California is giving a talk on “Quantum Fidelity and the Geometry of Quantum Criticality”. Here are my rough notes…

The motto from the early days of quantum information theory was “Information is physical.” You need to care about the physical medium in which information is encoded. But we can also turn it around: “Physics is informational”.

In a “classical phase transition”, thermal fluctuations play a crucial role. At zero temperature these go away, but there can still be different phases depending on other parameters. A transition between phases at zero temperature is called a quantum phase transition. One way to detect a quantum phase transition is simply to notice that the ground state depends very sensitively on the parameters near such a point. We can do this mathematically using a precise way of measuring distances between states: the Fubini-Study metric, which I’ll define below.

Suppose that M is a manifold parametrizing Hamiltonians for a quantum system, so each point x \in M gives a self-adjoint operator H(x) on some finite-dimensional Hilbert space, say \mathbb{C}^n. Of course in the thermodynamic limit (the limit of infinite volume) we expect our quantum system to be described by an infinite-dimensional Hilbert space, but let’s start out with a finite-dimensional one.

Furthermore, let’s suppose each Hamiltonian has a unique ground state, or at least a chosen ground state, say \psi(x). Here x does not indicate a point in space: it’s a point in M, our space of Hamiltonians!

This ground state \psi(x) is really defined only up to phase, so we should think of it as giving an element of the projective space \mathbb{C P}^{n-1}. There’s a god-given metric on projective space, called the Fubini-Study metric. Since we have a map from M to projective space, sending each point x to the state \psi(x) (modulo phase), we can pull back the Fubini-Study metric via this map to get a metric on M.

But, the resulting metric may not be smooth, because \psi(x) may not depend smoothly on x. The metric may have singularities at certain points, especially after we take the thermodynamic limit. We can think of these singular points as being ‘phase transitions’.

If what I said in the last two paragraphs makes no sense, perhaps a version in something more like plain English will be more useful. We’ve got a quantum system depending on some parameters, and there may be points where the ground state of this quantum system depends in a very drastic way on slight changes in the parameters.

But we can also make the math a bit more explicit. What’s the Fubini-Study metric? Given two unit vectors in a Hilbert space, say \psi and \psi', their Fubini-Study distance is just the angle between them:

d(\psi, \psi') = \cos^{-1}|\langle \psi, \psi' \rangle|

This is an honest Riemannian metric on the projective version of the Hilbert space. And in case you’re wondering about the term ‘quantum fidelity’ in the title of Zanardi’s talk, the quantity

|\langle \psi, \psi' \rangle|

is called the fidelity. The fidelity ranges between 0 and 1, and it’s 1 when two unit vectors are the same up to a phase. To convert this into a distance we take the arc-cosine.

When we pull the Fubini-Study metric back to M, we get a Riemannian metric away from the singular points, and in local coordinates this metric is given by the following cool formula:

g_{\mu \nu} = {\rm Re}\left(\langle \partial_\mu \psi, \partial_\nu \psi \rangle - \langle \partial_\mu \psi, \psi \rangle \langle \psi, \partial_\nu \psi \rangle \right)

where \partial_\mu \psi is the derivative of the ground state \psi(x) as we move x in the \mu-th coordinate direction.

But Michael Berry came up with an even cooler formula for g_{\mu \nu}. Let’s call the eigenstates of the Hamiltonian \psi_n(x), so that

H(x) \psi_n(x) = E_n(x) \, \psi_n(x)

And let’s rename the ground state \psi_0(x), so

\psi(x) = \psi_0(x)

and

H(x) \psi_0(x) = E_0(x) \, \psi_0(x)

Then a calculation like those you’d see in first-order perturbation theory shows that

g_{\mu \nu} = {\rm Re} \sum_{n \neq 0} \langle \psi_0 , \partial_\mu H \; \psi_n \rangle \langle \partial_\nu H \; \psi_n, \psi_0 \rangle / (E_n - E_0)^2

This is nice because it shows g_{\mu \nu} is likely to become singular at points where the ground state becomes degenerate, i.e. where two different states both have minimal energy, so some difference E_n - E_0 becomes zero.
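To make this concrete, here’s a minimal numerical sketch checking that the perturbation-theory formula agrees with directly measuring angles between nearby ground states — not for the XY model below, but for a one-qubit toy Hamiltonian H(\lambda, \gamma) = \lambda \sigma^z + \gamma \sigma^x that I picked just for illustration:

    import numpy as np

    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)

    def eigensystem(lam, gam):
        # np.linalg.eigh sorts eigenvalues ascending: column 0 is the ground state
        return np.linalg.eigh(lam * sz + gam * sx)

    def g_from_fidelity(lam, gam, eps=1e-5):
        # |<psi, psi'>| = cos d, so 1 - |<psi, psi'>| ~ (1/2) g eps^2
        _, V0 = eigensystem(lam, gam)
        comps = []
        for dl, dg in [(eps, 0.0), (0.0, eps)]:
            _, V1 = eigensystem(lam + dl, gam + dg)
            f = abs(np.vdot(V0[:, 0], V1[:, 0]))
            comps.append(2.0 * (1.0 - f) / eps**2)
        return comps

    def g_from_spectrum(lam, gam):
        # the sum over excited states, with dH/d(lambda) = sz, dH/d(gamma) = sx
        E, V = eigensystem(lam, gam)
        return [sum(abs(np.vdot(V[:, n], dH @ V[:, 0]))**2 / (E[n] - E[0])**2
                    for n in range(1, len(E)))
                for dH in (sz, sx)]

    print(g_from_fidelity(1.0, 0.5))   # both print roughly [0.04, 0.16]
    print(g_from_spectrum(1.0, 0.5))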

To illustrate these ideas, Zanardi did an example: the XY model in an external magnetic field. This is a ‘spin chain’: a bunch of spin-1/2 particles in a row, each interacting with their nearest neighbors. So, for a chain of length L, the Hilbert space is a tensor product of L copies of \mathbb{C}^2:

\mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2

The Hamiltonian of the XY model depends on two real parameters \lambda and \gamma. The parameter \lambda describes a magnetic field pointing in the z direction:

H(\lambda, \gamma) = \sum_i \left(\frac{1+\gamma}{2}\right) \, \sigma^x_i \sigma^x_{i+1} \; + \; \left(\frac{1-\gamma}{2}\right) \, \sigma^y_i \sigma^y_{i+1} \; + \; \lambda \sigma_i^z

where the \sigma‘s are the ever-popular Pauli matrices. The first term makes the x components of the spins of neighboring particles want to point in opposite directions when \gamma is big. The second term makes y components of neighboring spins want to point in the same direction when \gamma is big. And the third term makes all the spins want to point up (resp. down) in the z direction when \lambda is big and negative (resp. positive).

What’s our poor spin chain to do, faced with such competing directives? At zero temperature it seeks the state of lowest energy. When \lambda is less than -1 all the spins get polarized in the spin-up state; when it’s bigger than 1 they all get polarized in the spin-down state. For \lambda in between, there is also some sort of phase transition at \gamma = 0. What’s this like? Some sort of transition between ferromagnetic and antiferromagnetic?

We can use a transformation to express this as a fermionic system and solve it exactly. Physicists love exactly solvable systems, so there have been thousands of papers about the XY model. In the thermodynamic limit (L \to +\infty) the ground state can be computed explicitly, so we can explicitly work out the metric d on the parameter space that has \lambda, \gamma as coordinates!

I will not give the formulas — Zanardi did, but they’re too scary for me. I’ll skip straight to the punchline. Away from phase transitions, we see that for nearby values of parameters, say

x = (\lambda, \gamma)

and

x' = (\lambda', \gamma')

the ground states have

|\langle \psi(x), \psi(x') \rangle| \sim \exp(-c L)

for some constant c. That’s not surprising: even though the two ground states are locally very similar, since we have a total of L spins in our spin chain, the overall inner product goes like \exp(-c L).

But at phase transitions, the inner product |\langle \psi(x), \psi(x') \rangle| decays even faster with L:

|\langle \psi(x), \psi(x') \rangle| \sim \exp(-c' L^2)

for some other constant c'.

This is called enhanced orthogonalization since it means the ground states at slightly different values of our parameters get close to orthogonal even faster as L grows. Or in other words: their distance as measured by the metric g_{\mu \nu} grows even faster.

This sort of phase transition is an example of a “quantum phase transition”. Note: we’re detecting this phase transition not by looking at the ground state expectation value of a given observable, but by how the ground state itself changes drastically as we change the parameters governing the Hamiltonian.

The exponent of L here — namely the 2 in L^2 — is ‘universal’: i.e., it’s robust with respect to changes in the parameters and even the detailed form of the Hamiltonian.

Zanardi concluded with an argument showing that not every quantum phase transition can be detected by enhanced orthogonalization. For more details, try:

• Silvano Garnerone, N. Tobias Jacobson, Stephan Haas and Paolo Zanardi, Fidelity approach to the disordered quantum XY model.

• Silvano Garnerone, N. Tobias Jacobson, Stephan Haas and Paolo Zanardi, Scaling of the fidelity susceptibility in a disordered quantum spin chain.

For more on the basic concepts, start here:

• Lorenzo Campos Venuti and Paolo Zanardi, Quantum critical scaling of the geometric tensors, Phys. Rev. Lett. 99 (2007), 095701.

As a final little footnote, I should add that Paolo Zanardi said the metric g_{\mu \nu} defined as above is analogous to the Fisher information metric. So, David Corfield should like this…


This Week’s Finds in Mathematical Physics (Week 300)

11 August, 2010

This is the last of the old series of This Week’s Finds. Soon the new series will start, focused on technology and environmental issues — but still with a hefty helping of math, physics, and other science.

When I decided to do something useful for a change, I realized that the best way to start was by interviewing people who take the future and its challenges seriously, but think about it in very different ways. So far, I’ve done interviews with:

Tim Palmer on climate modeling and predictability.

Thomas Fischbacher on sustainability and permaculture.

Eliezer Yudkowsky on artificial intelligence and the art of rationality.

I hope to do more. I think it’ll be fun having This Week’s Finds be a dialogue instead of a monologue now and then.

Other things are changing too. I started a new blog! If you’re interested in how scientists can help save the planet, I hope you visit:

1) Azimuth, https://johncarlosbaez.wordpress.com

This is where you can find This Week’s Finds, starting now.

Also, instead of teaching math in hot dry Riverside, I’m now doing research at the Centre for Quantum Technologies in hot and steamy Singapore. This too will be reflected in the new This Week’s Finds.

But now… the grand finale of This Week’s Finds in Mathematical Physics!

I’d like to take everything I’ve been discussing so far and wrap it up in a nice neat package. Unfortunately that’s impossible – there are too many loose ends. But I’ll do my best: I’ll tell you how to categorify the Riemann zeta function. This will give us a chance to visit lots of our old friends one last time: the number 24, string theory, zeta functions, torsors, Joyal’s theory of species, groupoidification, and more.

Let me start by telling you how to count.

I’ll assume you already know how to count elements of a set, and move right along to counting objects in a groupoid.

A groupoid is a gadget with a bunch of objects and a bunch of isomorphisms between them. Unlike an element of a set, an object of a groupoid may have symmetries: that is, isomorphisms between it and itself. And unlike an element of a set, an object of a groupoid doesn’t always count as “1 thing”: when it has n symmetries, it counts as “1/nth of a thing”. That may seem strange, but it’s really right. We also need to make sure not to count isomorphic objects as different.

So, to count the objects in our groupoid, we go through it, take one representative of each isomorphism class, and add 1/n to our count when this representative has n symmetries.

Let’s see how this works. Let’s start by counting all the n-element sets!

Now, you may have thought there were infinitely many sets with n elements, and that’s true. But remember: we’re not counting the set of n-element sets – that’s way too big. So big, in fact, that people call it a “class” rather than a set! Instead, we’re counting the groupoid of n-element sets: the groupoid with n-element sets as objects, and one-to-one and onto functions between these as isomorphisms.

All n-element sets are isomorphic, so we only need to look at one. It has n! symmetries: all the permutations of n elements. So, the answer is 1/n!.

That may seem weird, but remember: in math, you get to make up the rules of the game. The only requirements are that the game be consistent and profoundly fun – so profoundly fun, in fact, that it seems insulting to call it a mere “game”.

Now let’s be more ambitious: let’s count all the finite sets. In other words, let’s work out the cardinality of the groupoid where the objects are all the finite sets, and the isomorphisms are all the one-to-one and onto functions between these.

There’s only one 0-element set, and it has 0! symmetries, so it counts for 1/0!. There are tons of 1-element sets, but they’re all isomorphic, and they each have 1! symmetries, so they count for 1/1!. Similarly the 2-element sets count for 1/2!, and so on. So the total count is

1/0! + 1/1! + 1/2! + … = e

The base of the natural logarithm is the number of finite sets! You learn something new every day.
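If you don’t believe me, here’s a quick sanity check of that sum — one term 1/n! per isomorphism class, truncated at 20 terms, which is plenty:

    from math import factorial, e

    # one representative per isomorphism class: an n-element set, with n! symmetries
    print(sum(1 / factorial(n) for n in range(20)), e)   # both print 2.718281828...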

Spurred on by our success, you might want to find a groupoid whose cardinality is π. It’s not hard to do: you can just find a groupoid whose cardinality is 3, and a groupoid whose cardinality is .1, and a groupoid whose cardinality is .04, and so on, and lump them all together to get a groupoid whose cardinality is 3.14… But this is a silly solution: it doesn’t shed any light on the nature of π.

I don’t want to go into it in detail now, but the previous problem really does shed light on the nature of e: it explains why this number is related to combinatorics, and it gives a purely combinatorial proof that the derivative of e^x is e^x, and lots more. Try these books to see what I mean:

2) Herbert Wilf, Generatingfunctionology, Academic Press, Boston, 1994. Available for free at http://www.cis.upenn.edu/~wilf/.

3) F. Bergeron, G. Labelle, and P. Leroux, Combinatorial Species and Tree-Like Structures, Cambridge, Cambridge U. Press, 1998.

For example: if you take a huge finite set, and randomly pick a permutation of it, the chance every element is mapped to a different element is close to 1/e. It approaches 1/e in the limit where the set gets larger and larger. That’s well-known – but the neat part is how it’s related to the cardinality of the groupoid of finite sets.
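Here’s a little simulation of that claim, if you want to watch the 1/e emerge (a sketch; the permutation size and trial count are arbitrary):

    import random

    N, trials = 100, 100_000

    def no_fixed_points():
        p = random.sample(range(N), N)      # a uniformly random permutation
        return all(p[i] != i for i in range(N))

    print(sum(no_fixed_points() for _ in range(trials)) / trials)   # close to 1/e = 0.3679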

Anyway, I have not succeeded in finding a really illuminating groupoid whose cardinality is π, but recently James Dolan found a nice one whose cardinality is π^2/6, and I want to lead up to that.

Here’s a not-so-nice groupoid whose cardinality is π^2/6. You can build a groupoid as the “disjoint union” of a collection of groups. How? Well, you can think of a group as a groupoid with one object: just one object having that group of symmetries. And you can build more complicated groupoids as disjoint unions of groupoids with one object. So, if you give me a collection of groups, I can take their disjoint union and get a groupoid.

So give me this collection of groups:

Z/1×Z/1, Z/2×Z/2, Z/3×Z/3, …

where Z/n is the integers mod n, also called the “cyclic group” with n elements. Then I’ll take their disjoint union and get a groupoid, and the cardinality of this groupoid is

1/1^2 + 1/2^2 + 1/3^2 + … = π^2/6

This is not as silly as the trick I used to get a groupoid whose cardinality is π, but it’s still not perfectly satisfying, because I haven’t given you a groupoid of “interesting mathematical gadgets and isomorphisms between them”, as I did for e. Later we’ll see Jim’s better answer.

We might also try taking various groupoids of interesting mathematical gadgets and computing their cardinality. For example, how about the groupoid of all finite groups? I think that’s infinite – there are just “too many”. How about the groupoid of all finite abelian groups? I’m not sure, that could be infinite too.

But suppose we restrict ourselves to abelian groups whose size is some power of a fixed prime p? Then we’re in business! The answer isn’t a famous number like π, but it was computed by Philip Hall here:

4) Philip Hall, A partition formula connected with Abelian groups, Comment. Math. Helv. 11 (1938), 126-129.

We can write the answer using an infinite product:

1/(1-p^{-1})(1-p^{-2})(1-p^{-3}) …

Or, we can write the answer using an infinite sum:

p(0)/p^0 + p(1)/p^1 + p(2)/p^2 + …

Here p(n) is the number of “partitions” of n: that is, the number of ways to write it as a sum of positive integers in decreasing order. For example, p(4) = 5 since we can write 4 as a sum in 5 ways like this:

4 = 4
4 = 3+1
4 = 2+2
4 = 2+1+1
4 = 1+1+1+1

If you haven’t thought about this before, you can have fun proving that the infinite product equals the infinite sum. It’s a cute fact, and quite famous.
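Or, if you’d like numerical evidence before hunting for the proof, here’s a quick check — a sketch truncating both sides at order 60, with p = 2 (any p works the same way):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def partitions(n, least=1):
        # number of ways to write n as a sum of integers >= least
        if n == 0:
            return 1
        return sum(partitions(n - j, j) for j in range(least, n + 1))

    p, N = 2.0, 60

    product = 1.0
    for k in range(1, N + 1):
        product /= 1 - p**(-k)      # the factor 1/(1 - p^{-k})

    total = sum(partitions(n) / p**n for n in range(N + 1))

    print(product, total)   # agree to about a dozen decimal places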

But Hall proved something even cuter. This number

p(0)/p^0 + p(1)/p^1 + p(2)/p^2 + …

is also the cardinality of another, really different groupoid. Remember how I said you can build a groupoid as the “disjoint union” of a collection of groups? To get this other groupoid, we take the disjoint union of all the abelian groups whose size is a power of p.

Hall didn’t know about groupoid cardinality, so here’s how he said it:

The sum of the reciprocals of the orders of all the Abelian groups of order a power of p is equal to the sum of the reciprocals of the orders of their groups of automorphisms.

It’s pretty easy to see that the sum of the reciprocals of the orders of all the Abelian groups of order a power of p is

p(0)/p^0 + p(1)/p^1 + p(2)/p^2 + …

To do this, you just need to show that there are p(n) abelian groups with p^n elements. If I show you how it works for n = 4, you can guess how the proof works in general:

4 = 4                 Z/p^4

4 = 3+1           Z/p^3 × Z/p

4 = 2+2           Z/p^2 × Z/p^2

4 = 2+1+1       Z/p^2 × Z/p × Z/p

4 = 1+1+1+1   Z/p × Z/p × Z/p × Z/p

So, the hard part is showing that

p(0)/p^0 + p(1)/p^1 + p(2)/p^2 + …

is also the sum of the reciprocals of the sizes of the automorphism groups of all abelian groups whose size is a power of p.

I learned of Hall’s result from Aviv Censor, a colleague who is an expert on groupoids. He had instantly realized this result had a nice formulation in terms of groupoid cardinality. We went through several proofs, but we haven’t yet been able to extract any deep inner meaning from them:

5) Avinoam Mann, Philip Hall’s “rather curious” formula for abelian p-groups, Israel J. Math. 96 (1996), part B, 445-448.

6) Francis Clarke, Counting abelian group structures, Proceedings of the AMS, 134 (2006), 2795-2799.

However, I still have hopes, in part because the math is related to zeta functions… and that’s what I want to turn to now.

Let’s do another example: what’s the cardinality of the groupoid of semisimple commutative rings with n elements?

What’s a semisimple commutative ring? Well, since we’re only talking about finite ones, I can avoid giving the general definition and take advantage of a classification theorem. Finite semisimple commutative rings are the same as finite products of finite fields. There’s a finite field with p^n elements whenever p is prime and n is a positive integer. This field is called F_{p^n}, and it has n symmetries. And that’s all the finite fields! In other words, they’re all isomorphic to these.

This is enough to work out the cardinality of the groupoid of semisimple commutative rings with n elements. Let’s do some examples, starting with n = 6.

This one is pretty easy. The only way to get a finite product of finite fields with 6 elements is to take the product of F_2 and F_3:

F_2 × F_3

This has just one symmetry – the identity – since that’s all the symmetries either factor has, and there’s no symmetry that interchanges the two factors. (Hmm… you may need to check this, but it’s not hard.)

Since we have one object with one symmetry, the groupoid cardinality is

1/1 = 1

Let’s try a more interesting one, say n = 4. Now there are two options:

F_4

F_2 × F_2

The first option has 2 symmetries: remember, F_{p^n} has n symmetries. The second option also has 2 symmetries, namely the identity and the symmetry that switches the two factors. So, the groupoid cardinality is

1/2 + 1/2 = 1

But now let’s try something even more interesting, like n = 16. Now there are 5 options:

F_{16}

F_8 × F_2

F_4 × F_4

F_4 × F_2 × F_2

F_2 × F_2 × F_2 × F_2

The field F_{16} has 4 symmetries because 16 = 2^4, and any field F_{p^n} has n symmetries. F_8 × F_2 has 3 symmetries, coming from the symmetries of the first factor. F_4 × F_4 has 2 symmetries in each factor and 2 coming from permutations of the factors, for a total of 2 × 2 × 2 = 8. F_4 × F_2 × F_2 has 2 symmetries coming from those of the first factor, and 2 symmetries coming from permutations of the last two factors, for a total of 2 × 2 = 4 symmetries. And finally, F_2 × F_2 × F_2 × F_2 has 4! = 24 symmetries coming from permutations of the factors. So, the cardinality of this groupoid works out to be

1/4 + 1/3 + 1/8 + 1/4 + 1/24

Hmm, let’s put that on a common denominator:

6/24 + 8/24 + 3/24 + 6/24 + 1/24 = 24/24 = 1

So, we’re getting the same answer again: 1.

Is this just a weird coincidence? No: this is what we always get! For any positive integer n, the groupoid of n-element semisimple commutative rings has cardinality 1. For a proof, see:

7) John Baez and James Dolan, Zeta functions, at http://ncatlab.org/johnbaez/show/Zeta+functions
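If you want to watch a computer verify this before reading the proof, here’s a brute-force sketch. It lists the multisets of finite fields whose sizes multiply to n, computes each automorphism group’s size the way we did above (Galois symmetries within each factor, permutations of equal factors), and adds reciprocals:

    from fractions import Fraction
    from math import factorial
    from collections import Counter

    def prime_power(q):
        # return (p, k) if q = p**k for a prime p, else None
        for p in range(2, q + 1):
            if q % p == 0:
                k = 0
                while q % p == 0:
                    q //= p
                    k += 1
                return (p, k) if q == 1 else None
        return None

    def field_multisets(n, least=2):
        # multisets of prime powers (sizes of finite fields) with product n
        if n == 1:
            yield ()
            return
        for q in range(least, n + 1):
            if n % q == 0 and prime_power(q) is not None:
                for rest in field_multisets(n // q, q):
                    yield (q,) + rest

    def groupoid_cardinality(n):
        total = Fraction(0)
        for fields in field_multisets(n):
            aut = 1
            for q, mult in Counter(fields).items():
                k = prime_power(q)[1]             # F_{p^k} has k symmetries
                aut *= factorial(mult) * k**mult  # and equal factors can be permuted
            total += Fraction(1, aut)
        return total

    for n in [4, 6, 16, 30, 64]:
        print(n, groupoid_cardinality(n))         # always 1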

Now, you might think this fact is just a curiosity, but actually it’s a step towards categorifying the Riemann zeta function. The Riemann zeta function is

ζ(s) = ∑_{n > 0} n^{-s}

It’s an example of a “Dirichlet series”, meaning a series of this form:

∑_{n > 0} a_n n^{-s}

In fact, any reasonable way of equipping finite sets with extra stuff gives a Dirichlet series – and if this extra stuff is “being a semisimple commutative ring”, we get the Riemann zeta function.

To explain this, I need to remind you about “stuff types”, and then explain how they give Dirichlet series.

A stuff type is a groupoid Z where the objects are finite sets equipped with “extra stuff” of some type. More precisely, it’s a groupoid with a functor to the groupoid of finite sets. For example, Z could be the groupoid of finite semisimple commutative rings – that’s the example we care about now. Here the functor forgets that we have a semisimple commutative ring, and only remembers the underlying finite set. In other words, it forgets the “extra stuff”.

In this example, the extra stuff is really just extra structure, namely the structure of being a semisimple commutative ring. But we could also take Z to be the groupoid of pairs of finite sets. A pair of finite sets is a finite set equipped with honest-to-goodness extra stuff, namely another finite set!

Structure is a special case of stuff. If you’re not clear on the difference, try this:

8) John Baez and Mike Shulman, Lectures on n-categories and cohomology, Sec. 2.4: Stuff, structure and properties, in n-Categories: Foundations and Applications, eds. John Baez and Peter May, Springer, Berlin, 2009. Also available as arXiv:math/0608420.

Then you can tell your colleagues: “I finally understand stuff.” And they’ll ask: “What stuff?” And you can answer, rolling your eyes condescendingly: “Not any particular stuff – just stuff, in general!”

But it’s not really necessary to understand stuff in general here. Just think of a stuff type as a groupoid where the objects are finite sets equipped with extra bells and whistles of some particular sort.

Now, if we have a stuff type, say Z, we get a list of groupoids Z(n). How? Simple! Objects of Z are finite sets equipped with some particular type of extra stuff. So, we can take the objects of Z(n) to be the n-element sets equipped with that type of extra stuff. The groupoid Z will be a disjoint union of these groupoids Z(n).

We can encode the cardinalities of all these groupoids into a Dirichlet series:

z(s) = ∑_{n > 0} |Z(n)| n^{-s}

where |Z(n)| is the cardinality of Z(n). In case you’re wondering about the minus sign: it’s just a dumb convention, but I’m too overawed by the authority of tradition to dream of questioning it, even though it makes everything to come vastly more ugly.

Anyway: the point is that a Dirichlet series is like the “cardinality” of a stuff type. To show off, we say stuff types categorify Dirichlet series: they contain more information, and they’re objects in a category (or something even better, like a 2-category) rather than elements of a set.

Let’s look at an example. When Z is the groupoid of finite semisimple commutative rings, then

|Z(n)| = 1

so the corresponding Dirichlet series is the Riemann zeta function:

z(s) = ζ(s)

So, we’ve categorified the Riemann zeta function! Using this, we can construct an interesting groupoid whose cardinality is

ζ(2) = ∑_{n > 0} n^{-2} = π^2/6

How? Well, let’s step back and consider a more general problem. Any stuff type Z gives a Dirichlet series

z(s) = ∑_{n > 0} |Z(n)| n^{-s}

How can we use this to concoct a groupoid whose cardinality is z(s) for some particular value of s? It’s easy when that value is a negative integer (here that minus sign raises its ugly head). Suppose S is a set with s elements:

|S| = s

Then we can define a groupoid as follows:

Z(-S) = ∑_{n > 0} Z(n) × n^S

Here we are playing some notational tricks: n^S means “the set of functions from S to our favorite n-element set”, the symbol × stands for the product of groupoids, and ∑ stands for what I’ve been calling the “disjoint union” of groupoids (known more technically as the “coproduct”). So, Z(-S) is a groupoid. But this formula is supposed to remind us of a simpler one, namely

z(-s) = ∑_{n > 0} |Z(n)| n^s

and indeed it’s a categorified version of this simpler formula.

In particular, if we take the cardinality of the groupoid Z(-S), we get the number z(-s). To see this, you just need to check each step in this calculation:

|Z(-S)| = |∑ Z(n) × n^S|

= ∑ |Z(n) × n^S|

= ∑ |Z(n)| × |n^S|

= ∑ |Z(n)| × n^s

= z(-s)

The notation is supposed to make these steps seem plausible.

Even better, the groupoid Z(-S) has a nice description in plain English: it’s the groupoid of finite sets equipped with Z-stuff and a map from the set S.

Well, okay – I’m afraid that’s what passes for plain English among mathematicians! We don’t talk to ordinary people very often. But the idea is really simple. Z is some sort of stuff that we can put on a finite set. So, we can do that and also choose a map from S to that set. And there’s a groupoid of finite sets equipped with all this extra baggage, and isomorphisms between those.

If this sounds too abstract, let’s do an example. Say our favorite example, where Z is the groupoid of finite semisimple commutative rings. Then Z(-S) is the groupoid of finite semisimple commutative rings equipped with a map from the set S.

If this still sounds too abstract, let’s do an example. Do I sound repetitious? Well, you see, category theory is the subject where you need examples to explain your examples – and n-category theory is the subject where this process needs to be repeated n times. So, suppose S is a 1-element set – we can just write

S = 1

Then Z(-1) is a groupoid where the objects are finite semisimple commutative rings with a chosen element. The isomorphisms are ring isomorphisms that preserve the chosen element. And the cardinality of this groupoid is

|Z(-1)| = ζ(-1) = 1 + 2 + 3 + …

Whoops – it diverges! Luckily, people who study the Riemann zeta function know that

1 + 2 + 3 + … = -1/12

They get this crazy answer by analytically continuing the Riemann zeta function ζ(s) from values of s with a big positive real part, where it converges, over to values where it doesn’t. And it turns out that this trick is very important in physics. In fact, back in "week124" – "week126", I explained how this formula

ζ(-1) = -1/12

is the reason bosonic string theory works best when our string has 24 extra dimensions to wiggle around in besides the 2 dimensions of the string worldsheet itself.

So, if we’re willing to allow this analytic continuation trick, we can say that

THE GROUPOID OF FINITE SEMISIMPLE COMMUTATIVE RINGS WITH A CHOSEN ELEMENT HAS CARDINALITY -1/12

Someday people will see exactly how this is related to bosonic string theory. Indeed, it should be just a tiny part of a big story connecting number theory to string theory… some of which is explained here:

9) J. M. Luck, P. Moussa, and M. Waldschmidt, eds., Number Theory and Physics, Springer Proceedings in Physics, Vol. 47, Springer-Verlag, Berlin, 1990.

10) C. Itzykson, J. M. Luck, P. Moussa, and M. Waldschmidt, eds, From Number Theory to Physics, Springer, Berlin, 1992.

Indeed, as you’ll see in these books (or in "week126"), the function we saw earlier:

1/(1-p^{-1})(1-p^{-2})(1-p^{-3}) … = p(0)/p^0 + p(1)/p^1 + p(2)/p^2 + …

is also important in string theory: it shows up as a “partition function”, in the physical sense, where the number p(n) counts the number of ways a string can have energy n if it has one extra dimension to wiggle around in besides the 2 dimensions of its worldsheet.

But it’s the 24th power of this function that really matters in string theory – because bosonic string theory works best when our string has 24 extra dimensions to wiggle around in. For more details, try:

11) John Baez, My favorite numbers: 24. Available at http://math.ucr.edu/home/baez/numbers/24.pdf

But suppose we don’t want to mess with divergent sums: suppose we want a groupoid whose cardinality is, say,

ζ(2) = 1^{-2} + 2^{-2} + 3^{-2} + … = π^2/6

Then we need to categorify the evalution of Dirichlet series at positive integers. We can only do this for certain stuff types – for example, our favorite one. So, let Z be the groupoid of finite semisimple commutative rings, and let S be a finite set. How can we make sense of

Z(S) = ∑_{n > 0} Z(n) × n^{-S} ?

The hard part is n^{-S}, because this has a minus sign in it. How can we raise an n-element set to the -Sth power? If we could figure out some sort of groupoid that serves as the reciprocal of an n-element set, we’d be done, because then we could take that to the Sth power. Remember, S is a finite set, so to raise something (even a groupoid) to the Sth power, we just multiply a bunch of copies of that something – one copy for each element of S.

So: what’s the reciprocal of an n-element set? There’s no answer in general – but there’s a nice answer when that set is a group, because then that group gives a groupoid with one object, and the cardinality of this groupoid is just 1/n.

Here is where our particular stuff type Z comes to the rescue. Each object of Z(n) is a semisimple commutative ring with n elements, so it has an underlying additive group – which is a group with n elements!

So, we don’t interpret Z(n) × n^(-S) as an ordinary product, but something a bit sneakier, a “twisted product”. An object in Z(n) × n^(-S) is just an object of Z(n), that is, an n-element semisimple commutative ring. But we define a symmetry of such an object to be a symmetry of this ring together with an S-tuple of elements of its underlying additive group. We compose these symmetries with the help of addition in this group. Since there are n^s choices of the S-tuple, each object now has n^s times as many symmetries, so it contributes n^s times less to the groupoid cardinality. This ensures that

|Z(n) × n^(-S)| = |Z(n)| × n^(-s)

when |S| = s. And this in turn means that

|Z(S)| = |∑_(n > 0) Z(n) × n^(-S)|

= ∑_(n > 0) |Z(n) × n^(-S)|

= ∑_(n > 0) |Z(n)| × n^(-s)

= ζ(s)
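The last step works because |Z(n)| = 1 for every n: here Z(n) again means the groupoid of n-element semisimple commutative rings, and saying all these groupoids have cardinality 1 is just another way of saying the Dirichlet series of Z is the Riemann zeta function. If you want to double-check this with a computer, here's a throwaway Python script – my own sanity check, using two facts: a finite semisimple commutative ring is a product of finite fields, and the field F_(p^d) has exactly d automorphisms (its Galois group):

    from collections import Counter
    from fractions import Fraction
    from math import factorial

    def partitions(k, max_part=None):
        # Yield the partitions of k as weakly decreasing lists of parts.
        if max_part is None:
            max_part = k
        if k == 0:
            yield []
            return
        for d in range(min(k, max_part), 0, -1):
            for rest in partitions(k - d, d):
                yield [d] + rest

    def Z_card(k):
        # Groupoid cardinality of semisimple commutative rings with p^k elements.
        # Such a ring is a product of fields F_(p^d) with the degrees summing to k,
        # and its automorphisms are Galois automorphisms of each factor together
        # with permutations of identical factors: |Aut| is the product of
        # d^m * m! over the degrees d occurring with multiplicity m.
        total = Fraction(0)
        for lam in partitions(k):
            aut = 1
            for d, m in Counter(lam).items():
                aut *= d**m * factorial(m)
            total += Fraction(1, aut)
        return total

    for k in range(1, 9):
        print(k, Z_card(k))  # the answer is 1 every time

Since an n-element semisimple commutative ring splits uniquely into factors of prime power order, and automorphisms respect that splitting, checking prime powers is enough.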

So, in particular, if S is a 2-element set, we can write

S = 2

for short and get

|Z(2)| = ζ(2) = π^2/6

Can we describe the groupoid Z(2) in simple English, suitable for a nice bumper sticker? It’s a bit tricky. One reason is that I haven’t described the objects of Z(2) as mathematical gadgets of an appealing sort, as I did for Z(-1). Another closely related reason is that I only described the symmetries of any object in Z(2) – or more technically, morphisms from that object to itself. It’s much better if we also describe morphisms from one object to another.

For this, it’s best to define Z(n) × n-S with the help of torsors. The idea of a torsor is that you can take the one-object groupoid associated to any group G and find a different groupoid, which is nonetheless equivalent, and which is a groupoid of appealing mathematical gadgets. These gadgets are called “G-torsors”. A “G-torsor” is just a nonempty set on which G acts freely and transitively:

12) John Baez, Torsors made easy, http://math.ucr.edu/home/baez/torsors.html

All G-torsors are isomorphic – the most hands-on example is G itself, acting on itself by left translation – and the group of symmetries of any G-torsor is G.
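If you like brute force, here's a quick Python check of that last claim for the cyclic group Z/5 – my throwaway example, with the torsor taken to be the group itself acting by translation:

    from itertools import permutations

    n = 5  # the cyclic group Z/n, acting on itself by translation

    def is_equivariant(f):
        # f commutes with every translation x -> x + g (mod n)
        return all(f[(x + g) % n] == (f[x] + g) % n
                   for x in range(n) for g in range(n))

    autos = [f for f in permutations(range(n)) if is_equivariant(f)]
    print(len(autos))  # 5: the symmetries are exactly the translations
                       # f(x) = x + c, an anonymous copy of Z/5 itself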

Now, any ring R has an underlying additive group, which I will simply call R. So, the concept of “R-torsor” makes sense. This lets us define an object of Z(n) × n-S to be an n-element semisimple commutative ring R together with an S-tuple of R-torsors.

But what about the morphisms between these? We define a morphism between these to be a ring isomorphism together with an S-tuple of torsor isomorphisms. There’s a trick hiding here: a ring isomorphism f: R → R’ lets us take any R-torsor and turn it into an R’-torsor, or vice versa. So, it lets us talk about an isomorphism from an R-torsor to an R’-torsor – a concept that at first might have seemed nonsensical.

Anyway, it’s easy to check that this definition is compatible with our earlier one. So, we see:

THE GROUPOID OF FINITE SEMISIMPLE COMMUTATIVE RINGS EQUIPPED WITH AN n-TUPLE OF TORSORS HAS CARDINALITY ζ(n)

I did a silly change of variables here: I thought this bumper sticker would sell better if I said “n-tuple” instead of “S-tuple”. Here n is any positive integer.

While we’re selling bumper stickers, we might as well include this one:

THE GROUPOID OF FINITE SEMISIMPLE COMMUTATIVE RINGS EQUIPPED WITH A PAIR OF TORSORS HAS CARDINALITY π^2/6

Now, you might think this fact is just a curiosity. But I don’t think so: it’s actually a step towards categorifying the general theory of zeta functions. You see, the Riemann zeta function is just one of many zeta functions. As Hasse and Weil discovered, every sufficiently nice commutative ring R has a zeta function. The Riemann zeta function is just the simplest example: the one where R is the ring of integers. And the cool part is that all these zeta functions come from stuff types using the recipe I described!

How does this work? Well, from any commutative ring R, we can build a stuff type Z_R as follows: an object of Z_R is a finite semisimple commutative ring together with a homomorphism from R to that ring. Then it turns out the Dirichlet series of this stuff type, say

ζ_R(s) = ∑_(n > 0) |Z_R(n)| n^(-s)

is the usual Hasse-Weil zeta function of the ring R!
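Here's a quick worked example – mine, not one from the paper. Take R = F_p. A ring homomorphism F_p → k exists exactly when p = 0 in k, and then it's unique. So Z_(F_p)(n) is empty unless n is a power of p, and when n = p^m it's just the groupoid of p^m-element semisimple commutative rings (these all have p = 0), which has cardinality 1. So

ζ_(F_p)(s) = ∑_(m ≥ 0) p^(-ms) = 1/(1 - p^(-s))

which is precisely the classical Hasse-Weil zeta function of the one-point scheme Spec F_p.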

Of course, that fact is vastly more interesting if you already know and love Hasse-Weil zeta functions. You can find a definition of them either in my paper with Jim, or here:

13) Jean-Pierre Serre, Zeta and L functions, Arithmetical Algebraic Geometry (Proc. Conf. Purdue Univ., 1963), Harper and Row, 1965, pp. 82–92.

But the basic idea is simple. You can think of any commutative ring R as the functions on some space – a funny sort of space called an “affine scheme”. You’re probably used to spaces where all the points look alike – just little black dots. But the points of an affine scheme come in many different colors: for starters, one color for each prime power p^k! The Hasse-Weil zeta function of R is a clever trick for encoding the information about the numbers of points of these different colors in a single function.

Why do we get points of different colors? I explained this back in "week205". The idea is that for any commutative ring k, we can look at the homomorphisms

f: R → k

and these are called the “k-points” of our affine scheme. In particular, we can take k to be a finite field, say F_(p^n). So, we get a set of points for each prime power p^n. The Hasse-Weil zeta function is a trick for keeping track of how many F_(p^n)-points there are for each prime power p^n.
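To make this concrete, here's a tiny Python illustration – my example, with the ring R = Z[x]/(x^2 + 1), which isn't one discussed in this post. A homomorphism R → F_p is determined by where x goes, and x must go to a square root of -1, so counting F_p-points amounts to counting roots:

    # F_p-points of the affine scheme of R = Z[x]/(x^2 + 1):
    def count_Fp_points(p):
        return sum(1 for a in range(p) if (a * a + 1) % p == 0)

    for p in [2, 3, 5, 7, 11, 13, 17, 19]:
        print(p, count_Fp_points(p))
    # One point at p = 2, two points when p = 1 mod 4, none when p = 3 mod 4.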

Given all this, you shouldn’t be surprised that we can get the Hasse-Weil zeta function of R by taking the Dirichlet series of the stuff type Z_R, where an object is a finite semisimple commutative ring k together with a homomorphism f: R → k. Especially if you remember that finite semisimple commutative rings are built from finite fields!

In fact, this whole theory of Hasse-Weil zeta functions works for gadgets much more general than commutative rings – that is, affine schemes. They can be defined for “schemes of finite type over the integers”, and that’s how Serre and other algebraic geometers usually do it. But Jim and I do it even more generally, in a way that doesn’t require any expertise in algebraic geometry. Which is good, because we don’t have any.

I won’t explain that here – it’s in our paper.

I’ll wrap up by making one more connection explicit – it’s sort of lurking in what I’ve said, but maybe it’s not quite obvious.

First of all, this idea of getting Dirichlet series from stuff types is part of the groupoidification program. Stuff types are a generalization of “structure types”, often called “species”. André Joyal developed the theory of species and showed how any species gives rise to a formal power series called its generating function. I told you about this back in "week185" and "week190". The recipe gets even simpler when we go up to stuff types: the generating function of a stuff type Z is just

∑_(n ≥ 0) |Z(n)| z^n

Since we can also describe states of the quantum harmonic oscillator as power series, with z^n corresponding to the nth energy level, this lets us view stuff types as states of a categorified quantum harmonic oscillator! This explains the combinatorics of Feynman diagrams:

14) Jeffrey Morton, Categorified algebra and quantum mechanics, Theory and Applications of Categories 16 (2006), 785-854, available at http://www.emis.de/journals/TAC/volumes/16/29/16-29abs.html Also available as arXiv:math/0601458.
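As a sanity check on this recipe, here's a baby example – mine, not Morton's. Let Z(n) be the groupoid of all n-element sets. It has one isomorphism class of objects, each with n! symmetries, so |Z(n)| = 1/n!, and the generating function is

∑_(n ≥ 0) z^n/n! = e^z

which is exactly the generating function species people assign to the structure type “being a finite set”.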

And, it’s a nice test case of the groupoidification program, where we categorify lots of algebra by saying “wherever we see a number, let’s try to think of it as the cardinality of a groupoid”:

15) John Baez, Alex Hoffnung and Christopher Walker, Higher-dimensional algebra VII: Groupoidification, available as arXiv:0908.4305

But now I’m telling you something new! I’m saying that any stuff type also gives a Dirichlet series, namely

∑_(n > 0) |Z(n)| n^(-s)

This should make you wonder what’s going on. My paper with Jim explains it – at least for structure types. The point is that the groupoid of finite sets has two monoidal structures: + and ×. This gives the category of structure types two monoidal structures, using a trick called “Day convolution”. The first of these categorifies the usual product of formal power series, while the second categorifies the usual product of Dirichlet series. People in combinatorics love the first one, since they love chopping a set into two disjoint pieces and putting a structure on each piece. People in number theory secretly love the second one, without fully realizing it, because they love taking a number and decomposing it into prime factors. But they both fit into a single picture!
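At the decategorified level it's easy to see the two products side by side. Here's a small Python sketch – my illustration, nothing more. The Cauchy convolution multiplies power series, splitting n as an ordered sum n = i + j; the Dirichlet convolution multiplies Dirichlet series, splitting n as a product n = d·e:

    def cauchy(a, b):
        # Product of power series: c_n = sum of a_i * b_j over i + j = n.
        N = min(len(a), len(b))
        return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(N)]

    def dirichlet(a, b):
        # Product of Dirichlet series: c_n = sum of a_d * b_e over d * e = n.
        # Coefficients are indexed from 1: a[n-1] is the coefficient of n^(-s).
        N = min(len(a), len(b))
        c = [0] * N
        for d in range(1, N + 1):
            for e in range(1, N // d + 1):
                c[d * e - 1] += a[d - 1] * b[e - 1]
        return c

    ones = [1] * 12
    print(cauchy(ones, ones))     # [1, 2, 3, ...]: coefficients of 1/(1-z)^2
    print(dirichlet(ones, ones))  # [1, 2, 2, 3, 2, 4, ...]: zeta(s)^2, whose nth
                                  # coefficient is the number of divisors of n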

There’s a lot more to say about this, because actually the category of structure types has five monoidal structures, all fitting together in a nice way. You can read a bit about this here:

16) nLab, Schur functors, http://ncatlab.org/nlab/show/Schur+functor

This is something Todd Trimble and I are writing, which will eventually evolve into an actual paper. We consider structure types for which there is a vector space of structures for each finite set instead of a set of structures. But much of the abstract theory is similar. In particular, there are still five monoidal structures.

Someday soon, I hope to show that two of the monoidal structures on the category of species make it into a “ring category”, while the other two – the ones I told you about, in fact! – are better thought of as “comonoidal” structures, making it into a “coring category”. Putting these together, the category of species should become a “biring category”. Then the fifth monoidal structure, called “plethysm”, should make it into a monoid in the monoidal bicategory of biring categories!

This sounds far-out, but it’s all been worked out at a decategorified level: rings, corings, birings, and monoids in the category of birings:

17) David Tall and Gavin Wraith, Representable functors and operations on rings, Proc. London Math. Soc. (3) 20 (1970), 619-643.

18) James Borger and Ben Wieland, Plethystic algebra, Advances in Mathematics 194 (2005), 246-283. Also available at http://wwwmaths.anu.edu.au/~borger/papers/03/paper03.html

19) Andrew Stacey and Sarah Whitehouse, The hunting of the Hopf ring, Homology, Homotopy and Applications 11 (2009), 75-132, available at http://intlpress.com/HHA/v11/n2/a6/ Also available as arXiv:0711.3722.

Borger and Wieland call a monoid in the category of birings a “plethory”. The star example is the algebra of symmetric functions. But this is basically just a decategorified version of the category of Vect-valued species. So, the whole story should categorify.

In short: starting from very simple ideas, we can very quickly find a treasure trove of rich structures. Indeed, these structures are already staring us in the face – we just need to open our eyes. They clarify and unify a lot of seemingly esoteric and disconnected things that mathematicians and physicists love.



I think we are just beginning to glimpse the real beauty of math and physics. I bet it will be both simpler and more startling than most people expect.

I would love to spend the rest of my life chasing glimpses of this beauty. I wish we lived in a world where everyone had enough of the bare necessities of life to do the same if they wanted – or at least a world that was safely heading in that direction, a world where politicians were looking ahead and tackling problems before they became desperately serious, a world where the oceans weren’t dying.

But we don’t.


Certainty of death. Small chance of success. What are we waiting for?
– Gimli

