Probabilities Versus Amplitudes

5 December, 2011

Here are the slides of the talk I’m giving at the CQT Annual Symposium on Wednesday afternoon, which is Tuesday morning for a lot of you. If you catch mistakes, I’d love to hear about them before then!

Probabilities versus amplitudes.

Abstract: Some ideas from quantum theory are just beginning to percolate back to classical probability theory. For example, there is a widely used and successful theory of “chemical reaction networks”, which describes the interactions of molecules in a stochastic rather than quantum way. If we look at it from the perspective of quantum theory, this turns out to involve creation and annihilation operators, coherent states and other well-known ideas—but with a few big differences. The stochastic analogue of quantum field theory is also used in population biology, and here the connection is well-known. But what does it mean to treat wolves as fermions or bosons?

Liquid Light

28 November, 2011

Elisabeth Giacobino works at the Ecole Normale Supérieure in Paris. Last week she gave a talk at the Centre for Quantum Technologies. It was about ‘polariton condensates’. You can see a video of her talk here.

What’s a polariton? It’s a strange particle: a blend of matter and light. Polaritons are mostly made of light… with just enough matter mixed in so they can form a liquid! This liquid can form eddies just like water. Giacobino and her team of scientists have actually gotten pictures:

Physicists call this liquid a ‘polariton condensate’, but normal people may better appreciate how wonderful it is if we call it liquid light. That’s not 100% accurate, but it’s close—you’ll see what I mean in a minute.

Here’s a picture of Elisabeth Giacobino (at right) and her coworkers in 2010—not exactly the same team who is working on liquid light, but the best I can find:

How to make liquid light

How do you make liquid light?

First, take a thin film of some semiconductor like gallium arsenide. It’s full of electrons roaming around, so imagine a sea of electrons, like water. If you knock out an electron with enough energy, you’ll get a ‘hole’ which can move around like a particle of its own. Yes, the absence of a thing can act like a thing. Imagine an air bubble in the sea.

All this so far is standard stuff. But now for something more tricky: if you knock an electron just a little, it won’t go far from the hole it left behind. They’ll be attracted to each other, so they’ll orbit each other!

What you’ve got now is like a hydrogen atom—but instead of an electron and a proton, it’s made from an electron and a hole! It’s called an exciton. In Giacobino’s experiments, the excitons are 200 times as big as hydrogen atoms.

Excitons are exciting, but not exciting enough for us. So next, put a mirror on each side of your thin film. Now light can bounce back and forth. The light will interact with the excitons. If you do it right, this lets a particle of light, called a photon, blend with an exciton and form a new particle called a polariton.

How does a photon ‘blend’ with an exciton? Umm, err… this involves quantum mechanics. In quantum mechanics you can take two possible situations and add them and get a new one, a kind of ‘blend’ called a ‘superposition’. ‘Schrödinger’s cat’ is what you get when you blend a live cat and a dead cat. People like to argue about why we don’t see half-live, half-dead cats. But never mind: we can see a blend of a photon and an exciton! Giacobino and her coworkers have done just that.

The polaritons they create are mostly light, with just a teeny bit of exciton blended in. Photons have no mass at all. So, perhaps it’s not surprising that their polaritons have a very small mass: about 10^{-5} times as heavy as an electron!

They don’t last very long: just about 4-10 picoseconds. A picosecond is a trillionth of a second, or 10^{-12} seconds. After that they fall apart. However, this is long enough for polaritons to do lots of interesting things.

For starters, polaritons interact with each other enough to form a liquid. But it’s not just any ordinary liquid: it’s often a superfluid, like very cold liquid helium. This means, among other things, that it has almost no viscosity.

So: it’s even better than liquid light: it’s superfluid light!

The flow of liquid light

What can you do with liquid light?

For starters, you can watch it flow around obstacles. Semiconductors have ‘defects’: little flaws in the crystal structure. These act as obstacles to the flow of polaritons. And Giacobino and her team have seen the flow of polaritons around defects in the semiconductor:

The two pictures at left are two views of the polariton condensate flowing smoothly around a defect. In these pictures the condensate is a superfluid.

The two pictures in the middle show a different situation. Here the polariton condensate is viscous enough so that it forms a trail of eddies as it flows past the defect. Yes, eddies of light!

And the two pictures at right show yet another situation. In every fluid, we can have waves of pressure. This is called… ‘sound’. Yes, this is how ordinary sound works in air, or under water. But we can also have sound in a polariton condensate!

That’s pretty cool: sound in liquid light! But wait. We haven’t gotten to the really cool part yet. Whenever you have a fluid moving past an obstacle faster than the speed of sound, you get a ‘shock wave’: the obstacle leaves an expanding trail of sound in its wake, behind it, because the sound can’t catch up. That’s why jets flying faster than sound leave a sonic boom behind them.

And that’s what you’re seeing in the pictures at right. The polariton condensate is flowing past the defect faster than the speed of sound, which happens to be around 850,000 meters per second in this experiment. We’re seeing the shock wave it makes. So, we’re seeing a sonic boom in liquid light!

It’s possible we’ll be able to use polariton condensates for interesting new technologies. Giacobino and her team are also considering using them to study Hawking radiation: the feeble glow that black holes emit according to Hawking’s predictions. There aren’t black holes in polariton condensates, but it may be possible to create a similar kind of radiation. That would be really cool!

But to me, just being able to make a liquid consisting mostly of light, and study its properties, is already a triumph: just for the beauty of it.

Scary technical details

All the pictures of polariton condensates flowing around a defect came from here:

• A. Amo, S. Pigeon, D. Sanvitto, V. G. Sala, R. Hivet, I. Carusotto, F. Pisanello, G. Lemenager, R. Houdre, E. Giacobino, C. Ciuti, and A. Bramati, Hydrodynamic solitons in polariton superfluids.

and this is the paper to read for more details.

I tried to be comprehensible to ordinary folks, but there are a few more things I can’t resist saying.

First, there are actually many different kinds of polaritons. In general, polaritons are quasiparticles formed by the interaction of photons and matter. For example, in some crystals sound acts like it’s made of particles, and these quasiparticles are called ‘phonons’. But sometimes phonons can interact with light to form quasiparticles—and these are called ‘phonon-polaritons’. I’ve only been talking about ‘exciton-polaritons’.

If you know a bit about superfluids, you may be interested to hear that the wavy patterns show the phase of the order parameter ψ in the Landau-Ginzburg theory of superfluids:

If you know about quantum field theory, you may be interested to know that the Hamiltonian describing photon-exciton interactions involves terms roughly like

\alpha a^\dagger a + \beta b^\dagger b + \gamma (a^\dagger b + b^\dagger a)

where a is the annihilation operator for photons, b is the annihilation operator for excitons, the Greek letters are various constants, and the third term describes the interaction of photons and excitons. We can simplify this Hamiltonian by defining new particles that are linear combinations of photons and excitons. It’s just like diagonalizing a matrix; we get something like

\delta c^\dagger c + \epsilon d^\dagger d

where c and d are certain linear combinations of a and b. These act as annihilation operators for our new particles… and one of these new particles is the very light ‘polariton’ I’ve been talking about!
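Here is a minimal numerical sketch of that "diagonalizing a matrix" step. It treats the coefficients as the symmetric 2×2 matrix [[α, γ], [γ, β]] and lets NumPy find δ, ε and the mixing coefficients. The function name and the particular values of α, β, γ are made up for illustration; they are not taken from the experiment.

```python
import numpy as np

def polariton_modes(alpha, beta, gamma):
    """Diagonalize the coefficient matrix [[alpha, gamma], [gamma, beta]].

    Returns (energies, mixing). energies = (delta, epsilon) in ascending
    order. Column k of `mixing` gives the coefficients of a (photon) and
    b (exciton) in the k-th new annihilation operator (c, then d).
    """
    h = np.array([[alpha, gamma], [gamma, beta]], dtype=float)
    energies, mixing = np.linalg.eigh(h)
    return energies, mixing

# Hypothetical constants in arbitrary units, chosen only for illustration.
alpha, beta, gamma = 1.0, 1.5, 0.2
energies, mixing = polariton_modes(alpha, beta, gamma)
delta, epsilon = energies

# Squared coefficients of a in c and d: how "photon-like" each new mode is.
photon_fraction = mixing[0, :] ** 2

print("delta, epsilon:", delta, epsilon)
print("photon fraction of c and d:", photon_fraction)
```

With γ ≠ 0 the two new energies are pushed apart: δ lies below both α and β, and ε lies above both. The photon fractions of the two modes add up to 1, since the change of basis is orthogonal.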

Is Life Improbable?

31 May, 2011

Mine? Yes. And maybe you’ve wondered just how improbable your life is. But that’s not really the question today…

Here at the Centre for Quantum Technologies, Dagomir Kaszlikowski asked me to give a talk on this paper:

• John Baez, Is life improbable?, Foundations of Physics 19 (1989), 91-95.

This was the second paper I wrote, right after my undergraduate thesis. Nobody ever seemed to care about it, so it’s strange—but nice—to finally be giving a talk on it.

My paper does not try to settle the question its title asks. Rather, it tries to refute the argument here:

• Eugene P. Wigner, The probability of the existence of a self-reproducing unit, Symmetries and Reflections, Indiana University Press, Bloomington, 1967, pp. 200-208.

According to Wigner, his argument

purports to show that, according to standard quantum mechanical theory, the probability is zero for the existence of self-reproducing states, i.e., organisms.

Given how famous Eugene Wigner is (he won a Nobel prize, after all) and how earth-shattering his result would be if true, it’s surprising how little criticism his paper has received. David Bohm mentioned it approvingly in 1969. In 1974 Hubert Yockey cited it saying

for all physics has to offer, life should never have appeared and if it ever did it would soon die out.

As you’d expect, there are some websites mentioning Wigner’s argument as evidence that some supernatural phenomenon is required to keep life going. Wigner himself believed it was impossible to formulate quantum theory in a fully consistent way without referring to consciousness. Since I don’t believe either of these claims, I think it’s good to understand the flaw in Wigner’s argument.

So, let me start by explaining his argument. Very roughly, it purports to show that if there are many more ways a chunk of matter can be ‘dead’ than ‘living’, the chance is zero that we can choose some definition of ‘living’ and a suitable ‘nutrient’ state such that every ‘living’ chunk of matter can interact with this ‘nutrient’ state to produce two ‘living’ chunks.

In making this precise, Wigner considers more than just two chunks of matter: he also allows there to be an ‘environment’. So, he considers a quantum system made of three parts, and described by a Hilbert space

H = H_1 \otimes H_1 \otimes H_2

Here the first H_1 corresponds to a chunk of matter. The second H_1 corresponds to another chunk of matter. The space H_2 corresponds to the ‘environment’. Suppose we wait for a certain amount of time and see what the system does; this will be described by some unitary operator

S: H \to H

Wigner asks: if we pick this operator S in a random way, what’s the probability that there’s some n-dimensional subspace of ‘living organism’ states in H_1, and some ‘nutrient plus environment’ state in H_1 \otimes H_2, such that the time evolution sends any living organism together with the nutrient plus environment to two living organisms and some state of the environment?

A bit more precisely: suppose we pick S in a random way. Then what’s the probability that there exists an n-dimensional subspace

V \subseteq H_1

and a state

w \in H_1 \otimes H_2

such that S maps every vector in V \otimes \langle w \rangle to a vector in V \otimes V \otimes H_2? Here \langle w \rangle means the 1-dimensional subspace spanned by the vector w.

And his answer is: if

\mathrm{dim}(H_1) \gg n

then this probability is zero.

You may need to reread the last few paragraphs a couple times to understand Wigner’s question, and his answer. In case you’re still confused, I should say that V \subseteq H_1 is what I’m calling the space of ‘living organism’ states of our chunk of matter, while w \in H_1 \otimes H_2 is the ‘nutrient plus environment’ state.

Now, Wigner did not give a rigorous proof of his claim, nor did he say exactly what he meant by ‘probability’: he didn’t specify a probability measure on the space of unitary operators on H. But if we use the obvious choice (called ‘normalized Haar measure’) his argument can most likely be turned into a proof.
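To make "picking S in a random way" concrete, here is a sketch of one standard recipe for sampling a unitary matrix from normalized Haar measure: take the QR decomposition of a matrix of independent complex Gaussians and fix the phases of the diagonal of R. The function name and the small dimensions are just illustrative choices, and nothing here reproduces Wigner's own calculation.

```python
import numpy as np

def haar_unitary(n, rng):
    """Sample an n x n unitary matrix from normalized Haar measure."""
    # Matrix of independent standard complex Gaussian entries.
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    # QR is only unique up to phases; rescaling column j of q by the phase
    # of r[j, j] removes that ambiguity.
    phases = np.diagonal(r) / np.abs(np.diagonal(r))
    return q * phases

rng = np.random.default_rng(0)
dim_h1, dim_h2 = 3, 2              # hypothetical small dimensions
dim_h = dim_h1 * dim_h1 * dim_h2   # dim(H) = dim(H_1)^2 * dim(H_2)
S = haar_unitary(dim_h, rng)

print(S.shape, np.allclose(S.conj().T @ S, np.eye(dim_h)))
```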

So, I don’t want to argue with his math. I want to argue with his interpretation of the math. He concludes that

the chances are nil for the existence of a set of ‘living’ states for which one can find a nutrient of such nature that interaction always leads to multiplication.

The problem is that he fixed the decomposition of the Hilbert space H as a tensor product

H = H_1 \otimes H_1 \otimes H_2

before choosing the time evolution operator S. There is no good reason to do that. It only makes sense to split up a physical system into parts this way after we have some idea of what the dynamics is. An abstract Hilbert space doesn’t come with a favored decomposition as a tensor product into three parts!

If we let ourselves pick this decomposition after picking the operator S, the story changes completely. My paper shows:

Theorem 1. Let H, H_1 and H_2 be finite-dimensional Hilbert spaces with H \cong H_1 \otimes H_1 \otimes H_2. Suppose S : H \to H is any unitary operator, suppose V is any subspace of H_1, and suppose w is any unit vector in H_1 \otimes H_2. Then there is a unitary isomorphism

U: H \to H_1 \otimes H_1 \otimes H_2

such that if we identify H with H_1 \otimes H_1 \otimes H_2 using U, the operator S maps V \otimes \langle w \rangle into V \otimes V \otimes H_2.

In other words, if we allow ourselves to pick the decomposition after picking S, we can always find a ‘living organism’ subspace of any dimension we like, together with a ‘nutrient plus environment’ state that allows our living organism to reproduce.

However, if you look at the proof in my paper, you’ll see it’s based on a kind of cheap trick (as I forthrightly admit). Namely, I pick the ‘nutrient plus environment’ state to lie in V \otimes H_2, so the nutrient actually consists of another organism!

This goes to show that you have to be very careful about theorems like this. To prove that life is improbable, you need to find some necessary conditions for what counts as life, and show that these are improbable (in some sense, and of course it matters a lot what that sense is). Refuting such an argument does not prove that life is probable: for that you need some sufficient conditions for what counts as life. And either way, if you prove a theorem using a ‘cheap trick’, it probably hasn’t gotten to grips with the real issues.

I also show that as the dimension of H approaches infinity, the probability approaches 1 that we can get reproduction with a 1-dimensional ‘living organism’ subspace and a ‘nutrient plus environment’ state that lies in the orthogonal complement of V \otimes H_2. In other words, the ‘nutrient’ is not just another organism sitting there all ready to go!

More precisely:

Theorem 2. Let H, H_1 and H_2 be finite-dimensional Hilbert spaces with \mathrm{dim}(H) = \mathrm{dim}(H_1)^2 \cdot \mathrm{dim}(H_2). Let \mathbf{S'} be the set of unitary operators S: H \to H with the following property: there’s a unit vector v \in H_1, a unit vector w \in V^\perp \otimes H_2, and a unitary isomorphism

U: H \to H_1 \otimes H_1 \otimes H_2

such that if we identify H with H_1 \otimes H_1 \otimes H_2 using U, the operator S maps v \otimes w into \langle v\rangle \otimes \langle v \rangle \otimes H_2. Then the normalized Haar measure of \mathbf{S'} approaches 1 as \mathrm{dim}(H) \to \infty.

Here V = \langle v \rangle is the 1-dimensional subspace of H_1 spanned by v, and V^\perp is its orthogonal complement; that is, the space of all vectors in H_1 perpendicular to V.

I won’t include the proofs of these theorems, since you can see them in my paper.

Just to be clear: I certainly don’t think these theorems prove that life is probable! You can’t have theorems without definitions, and I think that coming up with a good general definition of ‘life’, or even supposedly simpler concepts like ‘entity’ and ‘reproduction’, is extremely tough. The formalism discussed here is oversimplified for dozens of reasons, a few of which are listed at the end of my paper. So far we’re only in the first fumbling stages of addressing some very hard questions.

All my theorems do is point out that Wigner’s argument has a major flaw: he’s choosing a way to divide the world into chunks of matter and the environment before choosing his laws of physics. This doesn’t make much sense, and reversing the order dramatically changes the conclusions.

By the way: I just started looking for post-1989 discussions of Wigner’s paper. So far I haven’t found any interesting ones. Here’s a more recent paper that’s somewhat related, which doesn’t mention Wigner’s work:

• Indranil Chakrabarty and Prashant, Non existence of quantum mechanical self replicating machine, 2005.

The considerations here seem more closely related to the Wootters–Zurek no-cloning theorem.

Quantum Information Processing 2011 (Day 2)

12 January, 2011

Here are some very fragmentary notes on the second day of QIP 2011. You can see arXiv references, slides, and videos of the talks here. I’ll just give links to the slides, and again I’ll only mention 3 talks.

Stephanie Wehner gave a talk on the relation between the uncertainty principle and nonlocality in quantum theory. There’s a general framework for physical theories, called “generalized probabilistic theories”, which includes classical and quantum mechanics as special cases. In this framework we can see that while quantum theory is “nonlocal” in the sense made famous by John Bell, even more nonlocal theories are logically possible!

For example, while quantum theory violates the Clauser–Horne–Shimony–Holt inequality, which is obeyed by any local hidden variables theory, it doesn’t violate it to the maximum possible extent. There is a logically conceivable gadget, the Popescu–Rohrlich box, which violates the CHSH inequalities to the maximum extent allowed by a theory that prohibits faster-than-light signalling. However, such a gadget would give us implausibly godlike computational powers.
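To put numbers on those three levels of nonlocality, here is a small sketch that evaluates the CHSH combination E(0,0) + E(0,1) + E(1,0) − E(1,1) in three cases: deterministic local strategies, a singlet state with one well-known choice of measurement angles, and a Popescu–Rohrlich box. The helper names and the angles are my own choices for illustration. Checking only deterministic strategies is enough for the local bound, because any local hidden variables theory is a mixture of them.

```python
import itertools
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def spin(theta):
    """Spin observable along an axis at angle theta in the x-z plane."""
    return np.cos(theta) * Z + np.sin(theta) * X

def chsh(E):
    """CHSH combination for a correlation function E(x, y), x, y in {0, 1}."""
    return E(0, 0) + E(0, 1) + E(1, 0) - E(1, 1)

# Local deterministic strategies: Alice answers a[x], Bob answers b[y].
classical_max = max(
    abs(chsh(lambda x, y: a[x] * b[y]))
    for a in itertools.product([-1, 1], repeat=2)
    for b in itertools.product([-1, 1], repeat=2)
)

# Quantum: singlet state with illustrative measurement angles.
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
alice_angles = [0.0, np.pi / 2]
bob_angles = [np.pi / 4, -np.pi / 4]

def quantum_E(x, y):
    op = np.kron(spin(alice_angles[x]), spin(bob_angles[y]))
    return np.vdot(singlet, op @ singlet).real

quantum_value = abs(chsh(quantum_E))

# Popescu-Rohrlich box: perfectly correlated unless x = y = 1, then
# perfectly anticorrelated.
pr_box_value = chsh(lambda x, y: -1 if (x == 1 and y == 1) else 1)

print(classical_max, quantum_value, pr_box_value)
```

The local bound comes out as 2, the singlet reaches 2√2 ≈ 2.83, and the box reaches 4.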

In Wehner’s talk, she explained another reason not to hope for more nonlocality than quantum theory provides. Namely, given the “steering” ability we have in quantum theory (that is, our ability to prepare a state at one location by doing a measurement at another), the theory cannot be more nonlocal than it is while still obeying the uncertainty principle.

Jérémie Roland gave a talk on generalizations of Grover’s search algorithm. Grover’s algorithm is one of the implausibly godlike powers that quantum computers might give us, if only we could build the bloody things: it’s a way to search a list of N items for a given item in a time that’s only on the order of N^{1/2}. This algorithm assumes we can freely jump from place to place on the list, so instead of a linearly ordered “list” it’s probably better to visualize this data structure as a complete graph with N vertices. Roland’s talk explained a way to generalize this idea to arbitrary graphs.

He began by considering a Markov chain on the graph — that is, a way to blunder randomly from vertex to vertex along the graph, where you can only go from one vertex to another in one step if there’s an edge connecting them. He assumed that it’s reversible and ergodic. Then, starting from this, he described how to fashion a quantum process that finds the desired vertex (or vertices) in about the square root of the time that the Markov chain would take.
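For the original complete-graph case, the square-root speedup is easy to see in a classical simulation of the amplitudes. The sketch below is plain Grover search, not Roland's generalization to arbitrary graphs; the function name, the list size, and the marked index are arbitrary choices for illustration.

```python
import numpy as np

def grover_search(n_items, marked):
    """Simulate Grover's algorithm on a state vector of n_items amplitudes.

    Returns (number of iterations used, probability of finding `marked`).
    """
    # Start in the uniform superposition over all items.
    state = np.full(n_items, 1 / np.sqrt(n_items))
    steps = int(np.floor(np.pi / 4 * np.sqrt(n_items)))
    for _ in range(steps):
        state[marked] *= -1               # oracle: flip the sign of the marked item
        state = 2 * state.mean() - state  # diffusion: reflect about the mean amplitude
    return steps, state[marked] ** 2

steps, success = grover_search(n_items=1024, marked=123)
print(steps, success)
```

For N = 1024 this uses 25 iterations, roughly (π/4)·N^{1/2}, and ends with nearly all the probability on the marked item. Checking items one at a time would take about N/2 = 512 tries on average.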

Robin Kothari gave a talk on using quantum computation to efficiently detect certain properties of graphs. He considered “minor-closed properties” of graphs. Let me just tell you what those properties are, and tell you about a fascinating older result about them.

The word graph means many slightly different things, but in this blog entry I mean a finite set V whose elements are called vertices, together with a set E of unordered pairs of distinct vertices, which we call edges. So, these are undirected finite graphs without self-loops or multiple edges connecting vertices.

A minor of a graph is another graph that can be obtained from the first one by repeatedly:

1) removing edges,

2) removing vertices that have no edges connected to them, and

3) contracting edges, that is, shrinking them to nothing and then identifying the vertices at both ends, like this:

For example, this graph:

is a minor of this one:

A property of graphs is minor-closed if whenever one graph has it, all its minors also have it.

What are some minor-closed properties? An obvious one is lacking cycles, that is, being a forest. You can get rid of cycles by getting rid of edges and vertices, or contracting edges, but you can’t create them!
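If you like to see such things in code, here is a rough sketch of edge contraction for graphs in the sense used above (finite, undirected, no self-loops, no multiple edges), together with a cycle check. The representation and function names are my own. Contracting an edge of a path leaves a forest, while contracting an edge of a 4-cycle gives a triangle, which still has a cycle.

```python
def make_graph(n, pairs):
    """Graph on vertices 0..n-1 with edges given as unordered pairs."""
    return set(range(n)), {frozenset(p) for p in pairs}

def contract_edge(vertices, edges, edge):
    """Contract edge (u, v): remove it and merge vertex v into vertex u."""
    u, v = edge
    new_edges = set()
    for e in edges:
        renamed = frozenset(u if x == v else x for x in e)
        if len(renamed) == 2:       # the contracted edge itself collapses and is dropped
            new_edges.add(renamed)  # the set merges any parallel edges
    return vertices - {v}, new_edges

def is_forest(vertices, edges):
    """True if the graph has no cycles (checked with union-find)."""
    parent = {x: x for x in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for e in edges:
        a, b = (find(x) for x in e)
        if a == b:          # both ends already connected: this edge closes a cycle
            return False
        parent[a] = b
    return True

path = make_graph(4, [(0, 1), (1, 2), (2, 3)])
square = make_graph(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
triangle = contract_edge(*square, (0, 1))

print(is_forest(*path), is_forest(*contract_edge(*path, (1, 2))))
print(is_forest(*square), is_forest(*triangle))
```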

A more interesting minor-closed property is planarity. If you can draw a graph on the plane, you can clearly also draw the graph you get by removing an edge, or removing a lone vertex, or contracting an edge.

Now, Kuratowski showed that planar graphs are precisely those that don’t have this graph:

or this one:

as minors.

Similarly, graphs that lack cycles are precisely those that don’t have a triangle as a minor!

So, we could ask if this pattern generalizes. Given a minor-closed property of graphs, is it equivalent to a property saying that there’s some finite set of graphs that don’t show up as minors?

This is called Wagner’s conjecture, after Klaus Wagner. He published it in 1970.

Wagner’s conjecture is true! It was proved by Neil Robertson and Paul Seymour in 2004. But the really interesting thing, to me, is that their proof takes about 400 or 500 pages!

I find this quite surprising… but then, I wouldn’t have guessed the four-color theorem was so hard to prove, either.

To make sure you understand Wagner’s conjecture, check that if we dropped the word “finite”, it would have a one-sentence proof.

Quantum Information Processing 2011 (Day 1)

11 January, 2011

This year’s session of the big conference on quantum computation, quantum cryptography, and so on is being held in Singapore:

QIP 2011, the 14th Workshop on Quantum Information Processing, 8-14 January 2011, Singapore.

Because the battery on my laptop is old and not very energetic, and I can’t find any sockets in the huge ballroom where the talks are being delivered, I can’t live-blog the talks. So, my reports here will be quite incomplete.

Here are microscopic summaries of just three talks from Monday’s session. You can see arXiv references, slides, and videos of the talks here. I’ll just give links to the slides.

Christian Kurtsiefer gave a nice talk on how to exploit the physics of photodetectors to attack quantum key distribution systems! By cutting the optical fiber and shining a lot of light down both directions, evil Eve can ‘blind’ Alice and Bob’s photodetectors. Then, by shining a quick, even brighter pulse of light, she can fool one of their photodetectors into thinking it’s seen a single photon. She can even fool them into thinking they’ve seen a violation of Bell’s inequality, by purely classical means, thanks to the fact that only a small percentage of photons make it down the cable in the first place. Christian and his collaborators have actually done this trick in an experiment here at the CQT!

Tzu-Chieh Wei and Akimasa Miyake gave a two-part joint talk on how the AKLT ground state is universal for measurement-based quantum computation. The AKLT ground state works like this: you’ve got a hexagonal lattice with three spin-1/2 particles at each vertex. Think of each particle as attached to one of the three edges coming out of that vertex. In the ground state, you start by putting the pair of particles at the ends of each edge in the spin-0 (also known as “singlet”, or antisymmetrized) state, and then you project the three particles at each vertex down to the spin-3/2 (completely symmetrized) state. This is indeed the ground state of a cleverly chosen antiferromagnetic Hamiltonian. But has anyone ever prepared this sort of system in the lab?

David Poulin gave a talk on how to efficiently compute time evolution given a time-dependent quantum Hamiltonian. The trickiness arises from Hamiltonians that change very rapidly with time. In a naive evenly spaced discretization of the time-ordered exponential, this would require you to use lots of tiny time steps to get a good approximation. But using a clever randomly chosen discretization you can avoid this problem, at least for uniformly bounded Hamiltonians, those obeying:

\| H(t) \| \le K

for all times t. The key is that the high-frequency part of a time-dependent Hamiltonian only couples faraway energy levels, but a uniformly bounded Hamiltonian doesn’t have faraway energy levels.

A couple more things — really just notes to myself:

Steve Flammia told me about this paper relating the Cramer-Rao bound (which involves Fisher information) to the time-energy uncertainty principle:

• Sergio Boixo, Steven T. Flammia, Carlton M. Caves, and J.M. Geremia, Generalized limits for single-parameter quantum estimation.

Markus Müller told me about a paper mentioning relations between Maxwell’s demon and algorithmic entropy. I need to get some references on this work — it might help me make progress on algorithmic thermodynamics. It’s probably one of these:

• Markus Müller, Quantum Kolmogorov complexity and the quantum Turing machine (PhD thesis).

• Markus Müller, On the quantum Kolmogorov complexity of classical strings, Int. J. Quant. Inf. 7 (2009), 701-711.

Hmm — the first one says:

A concrete proposal for an application of quantum Kolmogorov complexity is to analyze a quantum version of the thought experiment of Maxwell’s demon. In one of the versions of this thought experiment, some microscopic device tries to decrease the entropy of some gas in a box, without the expense of energy, by intelligently opening or closing some little door that separates both halves of the box. It is clear that a device like this cannot work as described, since its existence would violate the second law of thermodynamics. But then, the question is what prevents such a little device (or “demon”) from operating.

Roughly, the answer is that the demon has to make observations to decide whether to close or open the door, and these observations accumulate information. From time to time, the demon must erase this additional information, which is only possible at the expense of energy, due to Landauer’s principle. In Li and Vitanyi’s book An Introduction to Kolmogorov Complexity and Its Applications, this cost of energy is analyzed under very weak assumptions with the help of Kolmogorov complexity. Basically, the energy that the demon can extract from the gas is limited by the difference of the entropy of the gas, plus the difference of the Kolmogorov complexity of the demon’s memory before and after the demon’s actions. The power of this analysis is that it even encloses the case that the demon has a computer to do clever calculations, e.g. to compress the accumulated information before erasing it.

So, I guess I need to reread Li and Vitanyi’s book!

Quantum Foundations Mailing List

9 December, 2010

Bob Coecke and Jamie Vicary have started a mailing list on “quantum foundations”.

They write:

It was agreed by many that the existence of a quantum foundations mailing list, with a wide scope and involving the broad international community, was long overdue. This moderated list (to avoid spam or abuse) will mainly distribute announcements of conferences and other international events in the area, as well as other relevant adverts such as jobs in the area. It is set up at Oxford University, which should provide a guarantee of stability and sustainability. The scope ranges from the mathematical end of quantum foundations research to the purely philosophical issues.


To subscribe to the list, send a blank email to

To unsubscribe from the list, send a blank email to

Any complaints etc can be sent to Bob Coecke and Jamie Vicary.

I have deleted their email addresses here, along with the address for posting articles to the list, to lessen the amount of spam these addresses get. But it’s easy enough to find Bob and Jamie’s addresses, and presumably when you subscribe you’ll be told how to post messages!

Solèr’s Theorem

1 December, 2010

Here’s another post on the foundations of quantum theory:

Solèr’s Theorem.

It’s about an amazing result, due to Maria Pia Solèr, which singles out real, complex and quaternionic Hilbert spaces as special. If you want to talk about it, please join the conversation over on the n-Category Café.

All these recent highly mathematical blog posts are a kind of spinoff of a paper I’m writing on quantum theory and division algebras. That paper is almost done. Then our normal programming will continue: I’ll keep going through Pacala and Socolow’s “stabilization wedges”, and also do a This Week’s Finds where I interview Tim Palmer.

State-Observable Duality

25 November, 2010

It’s confusing having two blogs if you only have one life. I post about my work at the Centre for Quantum Technologies here. I post about abstract algebra at the n-Category Café. But what do I do when my work at the Centre for Quantum Technologies starts using a lot of abstract algebra?

I guess this time I’ll do posts over there, but link to them here:

State-Observable Duality (Part 1).

State-Observable Duality (Part 2).

State-Observable Duality (Part 3).

This is a 3-part series on the foundations of quantum theory, leading up to a discussion of a concept I call ‘state-observable duality’. The first part talks about normed division algebras. The second talks about the Jordan-von Neumann-Wigner paper on Jordan algebras in quantum theory. The third talks about state-observable duality and the Koecher-Vinberg theorem.

I think I’ll take comments over there, so our discussion of environmental issues here doesn’t get interrupted!

Entropy and Uncertainty

19 October, 2010

I was going to write about a talk at the CQT, but I found a preprint lying on a table in the lecture hall, and it was so cool I’ll write about that instead:

• Mario Berta, Matthias Christandl, Roger Colbeck, Joseph M. Renes, Renato Renner, The uncertainty principle in the presence of quantum memory, Nature Physics, July 25, 2010.

Actually I won’t talk about the paper per se, since it’s better if I tell you a more basic result that I first learned from reading this paper: the entropic uncertainty principle!

Everyone loves the concept of entropy, and everyone loves the uncertainty principle. Even folks who don’t understand ’em still love ’em. They just sound so mysterious and spooky and dark. I love ’em too. So, it’s nice to see a mathematical relation between them.

I explained entropy back here, so let me say a word about the uncertainty principle. It’s a limitation on how accurately you can measure two things at once in quantum mechanics. Sometimes you can only know a lot about one thing if you don’t know much about the other. This happens when those two things “fail to commute”.

Mathematically, the usual uncertainty principle says this:

\Delta A \cdot \Delta B \ge \frac{1}{2} |\langle [A,B] \rangle |

In plain English: the uncertainty in A times the uncertainty in B is at least half the absolute value of the expected value of their commutator

[A,B] = A B - B A

Whoops! That started off as plain English, but it degenerated into plain gibberish near the end… which is probably why most people don’t understand the uncertainty principle. I don’t think I’m gonna cure that today, but let me just nail down the math a bit.

Suppose A and B are observables — and to keep things really simple, by observable I’ll just mean a self-adjoint n \times n matrix. Suppose \psi is a state: that is, a unit vector in \mathbb{C}^n. Then the expected value of A in the state \psi is the average answer you get when you measure that observable in that state. Mathematically it’s equal to

\langle A \rangle = \langle \psi, A \psi \rangle

Sorry, there are a lot of angle brackets running around here: the ones at right stand for the inner product in \mathbb{C}^n, which I’m assuming you understand, while the ones at left are being defined by this equation. They’re just a shorthand.

Once we can compute averages, we can compute standard deviations, so we define the standard deviation of an observable A in the state \psi to be \Delta A where

(\Delta A)^2 = \langle A^2 \rangle - \langle A \rangle^2

Got it? Just like in probability theory. So now I hope you know what every symbol here means:

\Delta A \cdot \Delta B \ge \frac{1}{2} |\langle [A,B] \rangle |

and if you’re a certain sort of person you can have fun going home and proving this. Hint: it takes an inequality to prove an inequality. Other hint: what’s the most important inequality in the universe?
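If you’d rather check the inequality numerically before proving it, here’s a minimal sketch in plain Python. The choice of the Pauli matrices X and Y as observables, and the helper names `expect`, `std_dev`, `robertson_sides` and `qubit`, are my assumptions for illustration; nothing here replaces the proof.

```python
import cmath
import math

def matmul(P, Q):
    """Product of two square matrices given as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expect(A, psi):
    """<psi, A psi>, with the inner product conjugate-linear in the first slot."""
    n = len(psi)
    A_psi = [sum(A[i][k] * psi[k] for k in range(n)) for i in range(n)]
    return sum(psi[i].conjugate() * A_psi[i] for i in range(n))

def std_dev(A, psi):
    """Standard deviation: sqrt(<A^2> - <A>^2), clipped at zero for rounding."""
    var = expect(matmul(A, A), psi).real - expect(A, psi).real ** 2
    return math.sqrt(max(var, 0.0))

def robertson_sides(A, B, psi):
    """Return (Delta A * Delta B, 1/2 |<[A,B]>|) for the state psi."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    comm = [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]
    return std_dev(A, psi) * std_dev(B, psi), 0.5 * abs(expect(comm, psi))

# Example observables (an assumption of this sketch): Pauli X and Y.
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]

def qubit(theta, phi):
    """A unit vector in C^2 parametrized by two angles."""
    return [complex(math.cos(theta)), cmath.exp(1j * phi) * math.sin(theta)]

if __name__ == "__main__":
    for theta, phi in [(0.0, 0.0), (0.3, 1.1), (1.2, 2.5)]:
        lhs, rhs = robertson_sides(X, Y, qubit(theta, phi))
        print(f"theta={theta}, phi={phi}: {lhs:.6f} >= {rhs:.6f}")
```

For the state with theta = 0, both sides equal 1, so the inequality is saturated there.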

But now for the fun part: entropy!

Whenever you have an observable A and a state \psi, you get a probability distribution: the distribution of outcomes when you measure that observable in that state. And this probability distribution has an entropy! Let’s call the entropy S(A). I’ll define it a bit more carefully later.

But the point is: this entropy is really a very nice way to think about our uncertainty, or ignorance, of the observable A. It’s better, in many ways, than the standard deviation. For example, it doesn’t change if we multiply A by 2. The standard deviation doubles, but we’re not twice as ignorant!
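Here’s a small sketch of that scaling remark. To avoid numerical diagonalization I describe the observable by its eigenvalues and unit eigenvectors, which is an assumption of this example, as are the particular state and the function names. The entropy uses the Shannon formula that appears later in this post.

```python
import math

def outcome_probs(eigenvectors, psi):
    """p_i = |<phi_i, psi>|^2 for each unit eigenvector phi_i."""
    return [abs(sum(a.conjugate() * b for a, b in zip(phi, psi))) ** 2
            for phi in eigenvectors]

def std_dev(eigenvalues, probs):
    """Standard deviation of the measurement outcomes."""
    mean = sum(p * a for p, a in zip(probs, eigenvalues))
    return math.sqrt(sum(p * (a - mean) ** 2 for p, a in zip(probs, eigenvalues)))

def entropy(probs):
    """Shannon entropy in nats; terms with p = 0 contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

s = 1 / math.sqrt(2)
eigenvalues = [1.0, -1.0]
eigenvectors = [[s, s], [s, -s]]          # hypothetical observable A
psi = [math.sqrt(0.9), math.sqrt(0.1)]    # hypothetical state

probs = outcome_probs(eigenvectors, psi)
doubled = [2 * a for a in eigenvalues]    # 2A: same eigenvectors, doubled eigenvalues

if __name__ == "__main__":
    print("std of A: ", std_dev(eigenvalues, probs))
    print("std of 2A:", std_dev(doubled, probs))
    print("entropy of either:", entropy(probs))
```

Doubling A changes only the eigenvalues, so the probabilities, and hence the entropy, stay put while the standard deviation doubles.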

Entropy is invariant under lots of transformations of our observables. So we should want an uncertainty principle that only involves entropy. And here it is, the entropic uncertainty principle:

S(A) + S(B) \ge \mathrm{log} \, \frac{1}{c}

Here c is defined as follows. To keep things simple, suppose that A is nondegenerate, meaning that all its eigenvalues are distinct. If it’s not, we can tweak it a tiny bit and it will be. Let its unit eigenvectors be called \phi_i. Similarly, suppose B is nondegenerate and call its unit eigenvectors \chi_j. Then we let

c = \mathrm{max}_{i,j} |\langle \phi_i, \chi_j \rangle|^2

Note this becomes 1 when there’s an eigenvector of A that’s also an eigenvector of B. In this case it’s possible to find a state where we know both observables precisely, and in this case also

\mathrm{log}\, \frac{1}{c} = 0

And that makes sense: in this case S(A) + S(B), which measures our ignorance of both observables, is indeed zero when \psi is that shared eigenvector.

But if there’s no eigenvector of A that’s also an eigenvector of B, then c is smaller than 1, so

\mathrm{log} \, \frac{1}{c} > 0

so the entropic uncertainty principle says we really must have some ignorance about either A or B (or both).

So the entropic uncertainty principle makes intuitive sense. But let me define the entropy S(A), to make the principle precise. If \phi_i are the eigenvectors of A, the probabilities of getting various outcomes when we measure A in the state \psi are

p_i = |\langle \phi_i, \psi \rangle|^2

So, we define the entropy by

S(A) = - \sum_i p_i \; \mathrm{log}\, p_i

Here you can use any base for your logarithm, as long as you’re consistent. Mathematicians and physicists use e, while computer scientists, who prefer integers, settle for the best known integer approximation: 2.

Just kidding! Darn — now I’ve insulted all the computer scientists. I hope none of them reads this.
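To see the entropic uncertainty principle in action, here’s a minimal sketch for a single qubit. I’m assuming A and B have the eigenbases of the Pauli matrices \sigma_z and \sigma_x, and I’m using natural logarithms; the function names are made up for this example.

```python
import cmath
import math

def inner(u, v):
    """Inner product on C^n, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def entropy(basis, psi):
    """S(A) = -sum_i p_i log p_i with p_i = |<phi_i, psi>|^2."""
    probs = [abs(inner(phi, psi)) ** 2 for phi in basis]
    return -sum(p * math.log(p) for p in probs if p > 0)

def overlap_c(basis_a, basis_b):
    """c = max over i, j of |<phi_i, chi_j>|^2."""
    return max(abs(inner(phi, chi)) ** 2 for phi in basis_a for chi in basis_b)

s = 1 / math.sqrt(2)
Z_BASIS = [[1, 0], [0, 1]]      # unit eigenvectors of sigma_z
X_BASIS = [[s, s], [s, -s]]     # unit eigenvectors of sigma_x

def qubit(theta, phi):
    """A unit vector in C^2 parametrized by two angles."""
    return [complex(math.cos(theta)), cmath.exp(1j * phi) * math.sin(theta)]

if __name__ == "__main__":
    bound = math.log(1 / overlap_c(Z_BASIS, X_BASIS))
    for theta, phi in [(0.0, 0.0), (0.4, 0.9), (1.0, 2.0)]:
        psi = qubit(theta, phi)
        total = entropy(Z_BASIS, psi) + entropy(X_BASIS, psi)
        print(f"S(A) + S(B) = {total:.6f}, bound = {bound:.6f}")
```

Here c = 1/2, so the bound is log 2. The state with theta = 0 is an eigenvector of \sigma_z, so S(A) = 0 and S(B) = log 2, and the bound is attained.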

Who came up with this entropic uncertainty principle? I’m not an expert on this, so I’ll probably get this wrong, but I gather it came from an idea of Deutsch:

• David Deutsch, Uncertainty in quantum measurements, Phys. Rev. Lett. 50 (1983), 631-633.

Then it got improved and formulated as a conjecture by Kraus:

• K. Kraus, Complementary observables and uncertainty relations, Phys. Rev. D 35 (1987), 3070-3075.

and then that conjecture was proved here:

• H. Maassen and J. B. Uffink, Generalized entropic uncertainty relations, Phys. Rev. Lett. 60 (1988), 1103-1106.

The paper I found in the lecture hall proves a more refined version where the system being measured (let’s call it X) is entangled with the observer’s memory apparatus (let’s call it O). In this situation they show

S(A|O) + S(B|O) \ge S(X|O) + \mathrm{log} \, \frac{1}{c}

where I’m using a concept of “conditional entropy”: the entropy of something given something else. Here’s their abstract:

The uncertainty principle, originally formulated by Heisenberg, clearly illustrates the difference between classical and quantum mechanics. The principle bounds the uncertainties about the outcomes of two incompatible measurements, such as position and momentum, on a particle. It implies that one cannot predict the outcomes for both possible choices of measurement to arbitrary precision, even if information about the preparation of the particle is available in a classical memory. However, if the particle is prepared entangled with a quantum memory, a device that might be available in the not-too-distant future, it is possible to predict the outcomes for both measurement choices precisely. Here, we extend the uncertainty principle to incorporate this case, providing a lower bound on the uncertainties, which depends on the amount of entanglement between the particle and the quantum memory. We detail the application of our result to witnessing entanglement and to quantum key distribution.
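
To get a feel for this bound, here’s a quick worked case. I’m assuming the standard definition of conditional entropy in terms of von Neumann entropies, which this post doesn’t spell out:

S(X|O) = S(XO) - S(O)

Take X to be a single qubit that is maximally entangled with O, and let A and B have the eigenbases of the Pauli matrices \sigma_z and \sigma_x (my choice of example). The joint state is pure, so S(XO) = 0, while the memory alone is maximally mixed, so S(O) = \mathrm{log}\, 2. Every eigenvector of \sigma_z has overlap squared 1/2 with every eigenvector of \sigma_x, so c = 1/2. Then

S(X|O) + \mathrm{log} \, \frac{1}{c} = - \mathrm{log}\, 2 + \mathrm{log}\, 2 = 0

so the bound no longer forbids predicting both measurements perfectly, which matches the claim in the abstract.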

By the way, on a really trivial note…

My wisecrack about 2 being the best known integer approximation to e made me wonder: since 3 is actually closer to e, are there some applications where ternary digits would theoretically be better than binary ones? I’ve heard of “trits” but I don’t actually know any applications where they’re optimal.

Oh — here’s one.

Quantum Entanglement from Feedback Control

28 September, 2010

Now André Carvalho from the physics department at Australian National University in Canberra is talking about “Quantum feedback control for entanglement production”. He’s in a theory group with strong connections to the atom laser experimental group at ANU. This theory group works on measurement and control theory for Bose-Einstein condensates and atom lasers.

The good news: recent advances in real-time monitoring allow the control of quantum systems using feedback.

The big question: can we use feedback to design the system dynamics to produce and stabilize entangled states?

The answer: yes.

Start by considering two atoms in a cavity, interacting with a laser. Think of each atom as a 2-state system — so the Hilbert space of the pair of atoms is

\mathbb{C}^2 \otimes \mathbb{C}^2

We’ll say what the atoms are doing using not a pure state (a unit vector) but a mixed state (a density matrix). The atoms’ time evolution will be described by Lindbladian mechanics. This is a generalization of Hamiltonian mechanics that allows for dissipative processes — processes that increase entropy! A bit more precisely, we’re talking here about the quantum analogue of a Markov process. Even more precisely, we’re talking about the Lindblad equation: the most general equation describing a time evolution for density matrices that is time-translation-invariant, Markovian, trace preserving and completely positive.
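
For reference, here’s the usual way the Lindblad equation is written. This is the standard textbook form rather than anything specific to the talk, in units where \hbar = 1:

\frac{d\rho}{dt} = -i [H, \rho] + \sum_k \left( L_k \rho L_k^\dagger - \frac{1}{2} \{ L_k^\dagger L_k , \rho \} \right)

Here \rho is the density matrix, H is a self-adjoint matrix (the Hamiltonian), the L_k are arbitrary matrices describing the dissipative processes, and \{X, Y\} = XY + YX. When all the L_k vanish, this reduces to ordinary Hamiltonian evolution.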

As time passes, an initially entangled 2-atom state will gradually ‘decohere’, losing its entanglement.

But next, introduce feedback. Can we do this in a way that makes the entanglement become large as time passes?

With ‘homodyne monitoring’, you can do pretty well. But with ‘photodetection monitoring’, you can do great! As time passes, every state will evolve to approach the maximally entangled state: the ‘singlet state’. This is the density matrix

| \psi \rangle \langle \psi |

corresponding to the pure state

|\psi \rangle = \frac{1}{\sqrt{2}} (\uparrow \otimes \downarrow - \downarrow \otimes \uparrow)

So: the system dynamics can be engineered using feedback to produce and stabilize highly entangled states. In fact this is true not just for 2-atom systems, but multi-atom systems! And at least for 2-atom systems, this scheme is robust against imperfections and detection inefficiencies. The question of robustness is still under study for multi-atom systems.
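
What does “maximally entangled” mean concretely for the singlet state? One standard test is to look at either atom alone: its reduced density matrix should be completely mixed, with entropy log 2. Here’s a minimal sketch of that check in plain Python. The basis ordering and the helper names are my assumptions for illustration.

```python
import math

s = 1 / math.sqrt(2)
# Basis order (an assumption of this sketch): up-up, up-down, down-up, down-down.
psi = [0.0, s, -s, 0.0]

# Density matrix |psi><psi| of the singlet state.
rho = [[a * b.conjugate() for b in psi] for a in psi]

def reduced_first_atom(rho):
    """Partial trace over the second atom of a 4 x 4 density matrix."""
    return [[sum(rho[2 * a + b][2 * a2 + b] for b in range(2))
             for a2 in range(2)] for a in range(2)]

def entropy_2x2(m):
    """Von Neumann entropy (in nats) of a 2 x 2 density matrix,
    using the closed-form eigenvalues from the trace and determinant."""
    t = (m[0][0] + m[1][1]).real
    d = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    gap = math.sqrt(max(t * t - 4 * d, 0.0))
    eigenvalues = [(t + gap) / 2, (t - gap) / 2]
    return -sum(p * math.log(p) for p in eigenvalues if p > 0)

rho_a = reduced_first_atom(rho)

if __name__ == "__main__":
    print("reduced density matrix:", rho_a)
    print("entropy:", entropy_2x2(rho_a), "log 2:", math.log(2))
```

Up to rounding, the reduced density matrix is half the identity, which is the largest entropy a 2-state system can have.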

For more details, try:

• A. R. R. Carvalho, A. J. S. Reid, and J. J. Hope, Controlling entanglement by direct quantum feedback.

We discuss the generation of entanglement between electronic states of two atoms in a cavity using direct quantum feedback schemes. We compare the effects of different control Hamiltonians and detection processes in the performance of entanglement production and show that the quantum-jump-based feedback proposed by us in Phys. Rev. A 76 010301(R) (2007) can protect highly entangled states against decoherence. We provide analytical results that explain the robustness of jump feedback, and also analyse the perspectives of experimental implementation by scrutinising the effects of imperfections and approximations in our model.

How do homodyne and photodetection feedback work? I’m not exactly sure, but this quote helps:

In the homodyne-based scheme, the detector registers a continuous photocurrent, and the feedback Hamiltonian is constantly applied to the system. Conversely, in the photocounting-based strategy, the absence of signal predominates and the control is only triggered after a detection click, i.e. a quantum jump, occurs.