I’ll start you off with two puzzles. Their relevance should become clear by the end of this post:
• Puzzle 1. Suppose I have a box of jewels. The average value of a jewel in the box is $10. I randomly pull one out of the box. What’s the probability that its value is at least $100?
• Puzzle 2. Suppose I have a box full of numbers—they can be arbitrary real numbers. Their average is zero, and their standard deviation is 10. I randomly pull one out. What’s the probability that it’s at least 100?
Before you complain, I’ll admit: in both cases, you can’t actually tell me the probability. But you can say something about the probability! What’s the most you can say?
Some good news: Brendan Fong, who worked here with me, has now gotten a scholarship to do his PhD at the University of Oxford! He’s talking with people like Bob Coecke and Jamie Vicary, who work on diagrammatic and category-theoretic approaches to quantum theory.
But we’ve also finished a paper on good old-fashioned probability theory:
• John Baez and Brendan Fong, A Noether theorem for Markov processes.
This is based on a result Brendan proved in the network theory series on this blog. But we go further in a number of ways.
What’s the basic idea?
For months now I’ve been pushing the idea that we can take ideas from quantum mechanics and push them over to ‘stochastic mechanics’, which differs in that we work with probabilities rather than amplitudes. Here we do this for Noether’s theorem.
I should warn you: here I’m using ‘Noether’s theorem’ in an extremely general way to mean any result relating symmetries and conserved quantities. There are many versions. We prove a version that applies to Markov processes, which are random processes of the nicest sort: those where the rules don’t change with time, and the state of the system in the future only depends on its state now, not the past.
In quantum mechanics, there’s a very simple relation between symmetries and conserved quantities: an observable commutes with the Hamiltonian if and only if its expected value remains constant in time for every state. For Markov processes this is no longer true. But we show the next best thing: an observable commutes with the Hamiltonian if and only if both its expected value and standard deviation are constant in time for every state!
Now, we explained this stuff very simply and clearly back in Part 11 and Part 13 of the network theory series. We also tried to explain it clearly in the paper. So now let me explain it in a complicated, confusing way, for people who prefer that.
(Judging from the papers I read, that’s a lot of people!)
I’ll start by stating the quantum theorem we’re trying to mimic, and then state the version for Markov processes.
Noether’s theorem: quantum versions
For starters, suppose both our Hamiltonian and the observable are bounded self-adjoint operators. Then we have this:
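Noether’s Theorem, Baby Quantum Version. Let $H$ and $O$ be bounded self-adjoint operators on some Hilbert space. Then
$$[H, O] = 0$$
if and only if for all states $\psi(t)$ obeying Schrödinger’s equation
$$\frac{d}{dt}\psi(t) = -iH\psi(t)$$
the expected value $\langle \psi(t), O\psi(t) \rangle$ is constant as a function of $t$.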
What if $O$ is an unbounded self-adjoint operator? That’s no big deal: we can get a bounded one by taking $f(O)$ where $f$ is any bounded measurable function. But Hamiltonians are rarely bounded for fully realistic quantum systems, and we can’t mess with the Hamiltonian without changing Schrödinger’s equation! So, we definitely want a version of Noether’s theorem that lets $H$ be unbounded.
It’s a bit tough to make the equation $[H, O] = 0$ precise in a useful way when $H$ is unbounded, because then $H$ is only densely defined. If $O$ doesn’t map the domain of $H$ to itself, it’s hard to know what $[H, O] = HO - OH$ even means! We could demand that $O$ does preserve the domain of $H$, but a better workaround is instead to say that
$$[\exp(-itH), O] = 0$$
for all $t$. Then we get this:
Noether’s Theorem, Full-fledged Quantum Version. Let $H$ and $O$ be self-adjoint operators on some Hilbert space, with $O$ being bounded. Then
$$[\exp(-itH), O] = 0$$
for all $t$ if and only if for all states
$$\psi(t) = \exp(-itH)\psi$$
the expected value $\langle \psi(t), O\psi(t) \rangle$ is constant as a function of $t$.
Here of course we’re using the fact that $\exp(-itH)\psi$ is what we get when we solve Schrödinger’s equation with initial data $\psi$.
But in fact, this version of Noether’s theorem follows instantly from a simpler one:
Noether’s Theorem, Simpler Quantum Version. Let $U$ be a unitary operator and let $O$ be a bounded self-adjoint operator on some Hilbert space. Then
$$[U, O] = 0$$
if and only if for all states $\psi$,
$$\langle U\psi, OU\psi \rangle = \langle \psi, O\psi \rangle.$$
This version applies to a single unitary operator $U$ instead of the 1-parameter unitary group
$$U(t) = \exp(-itH).$$
It’s incredibly easy to prove. And this is the easiest version to copy over to the Markov case! However, the proof over there is not quite so easy.
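Here is one way to see why the quantum proof is so short. This is a sketch, not necessarily the argument in the paper; it assumes a complex Hilbert space, writes $U^\dagger$ for the adjoint of the unitary operator, and uses the standard fact that a bounded operator $A$ with $\langle \psi, A\psi \rangle = 0$ for all $\psi$ must vanish:

```latex
\begin{align*}
\langle U\psi, O U\psi \rangle = \langle \psi, O\psi \rangle \ \text{for all } \psi
  &\iff \langle \psi, (U^\dagger O U - O)\psi \rangle = 0 \ \text{for all } \psi \\
  &\iff U^\dagger O U = O \\
  &\iff O U = U O \\
  &\iff [U, O] = 0 .
\end{align*}
```

The third step multiplies on the left by $U$ and uses $U U^\dagger = 1$. Nothing like the second step is available for stochastic operators, which is one way to understand why the Markov case needs more work.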
Noether’s theorem: stochastic versions
In stochastic mechanics we describe states using probability distributions, not vectors in a Hilbert space. We also need a new concept of ‘observable’, and unitary operators will be replaced by ‘stochastic operators’.
Suppose that $X$ is a $\sigma$-finite measure space with a measure we write simply as $dx$. Then probability distributions $\psi$ on $X$ lie in $L^1(X)$. Let’s define an observable $O$ to be any element of the dual space $L^\infty(X)$, allowing us to define the expected value of $O$ in the probability distribution $\psi$ to be
$$\langle O, \psi \rangle = \int_X O(x)\,\psi(x)\,dx.$$
The angle brackets are supposed to remind you of quantum mechanics, but we don’t have an inner product on a Hilbert space anymore! Instead, we have a pairing between $L^1(X)$ and $L^\infty(X)$. Probability distributions live in $L^1(X)$, while observables live in $L^\infty(X)$. But we can also think of an observable $O$ as a bounded operator on $L^1(X)$, namely the operator of multiplying by the function $O$.
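Let’s say an operator
$$U \colon L^1(X) \to L^1(X)$$
is stochastic if it’s bounded and it maps probability distributions to probability distributions. Equivalently, $U$ is stochastic if it’s linear and it obeys
$$\int_X (U\psi)(x)\,dx = \int_X \psi(x)\,dx$$
and
$$\psi \ge 0 \implies U\psi \ge 0$$
for all $\psi \in L^1(X)$.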
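A Markov process, or technically a Markov semigroup, is a collection of operators
$$U(t) \colon L^1(X) \to L^1(X)$$
for $t \ge 0$ such that:
• $U(t)$ is stochastic for all $t \ge 0$.
• $U(t)$ depends continuously on $t$.
• $U(s+t) = U(s)U(t)$ for all $s, t \ge 0$.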
By the Hille–Yosida theorem, any Markov semigroup may be written as
$$U(t) = \exp(tH)$$
for some operator $H$, called its Hamiltonian. However, $H$ is typically unbounded and only densely defined. This makes it difficult to work with the commutator $[H, O]$. So, we should borrow a trick from quantum mechanics and work with the commutator $[\exp(tH), O]$ instead. This amounts to working directly with the Markov semigroup instead of its Hamiltonian. And then we have:
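Noether’s Theorem, Full-fledged Stochastic Version. Suppose $X$ is a $\sigma$-finite measure space and
$$U(t) \colon L^1(X) \to L^1(X)$$
is a Markov semigroup. Suppose $O$ is an observable. Then
$$[U(t), O] = 0$$
for all $t \ge 0$ if and only if for all probability distributions $\psi$ on $X$, $\langle O, U(t)\psi \rangle$ and $\langle O^2, U(t)\psi \rangle$ are constant as a function of $t$.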
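When $X$ is a finite set, everything in this theorem is a matrix or a vector, so it’s easy to experiment. Here is a minimal Python sketch under that assumption. The matrix `H`, the observable `O` and the helper names are all made up for illustration, and `expm` is a crude truncated Taylor series rather than a serious matrix exponential:

```python
# Finite-state sketch: X = {0, 1, 2}, so L^1(X) is R^3 and operators are 3x3 matrices.
# H, O and all helper names are hypothetical examples, not taken from the paper.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(H, t, terms=60):
    """Approximate exp(tH) by a truncated Taylor series (small matrices, moderate t)."""
    n = len(H)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes (tH)^k / k!
        term = matmul(term, [[t * h / k for h in row] for row in H])
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def act(U, psi):
    """Apply the matrix U to the probability vector psi."""
    return [sum(U[i][j] * psi[j] for j in range(len(psi))) for i in range(len(U))]

def moment(O, psi, n):
    """<O^n, psi> = sum over x of O(x)^n psi(x)."""
    return sum(o ** n * p for o, p in zip(O, psi))

# Columns sum to zero and off-diagonal entries are nonnegative, so exp(tH) is stochastic.
# States 0 and 1 exchange probability; state 2 never changes.
H = [[-1.0, 2.0, 0.0],
     [1.0, -2.0, 0.0],
     [0.0, 0.0, 0.0]]

# The observable is constant on {0, 1}, so multiplication by O commutes with exp(tH).
O = [5.0, 5.0, 7.0]

psi = [0.2, 0.3, 0.5]
for t in (0.0, 0.5, 2.0):
    psi_t = act(expm(H, t), psi)
    print(t, moment(O, psi_t, 1), moment(O, psi_t, 2))
```

Up to rounding, the first moment stays at 6 and the second at 37 for every $t$, as the theorem predicts for an observable that commutes with time evolution.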
In plain English: time evolution commutes with an observable if and only if the mean and standard deviation of that observable never change with time, whatever probability distribution we start with. As in the quantum case, this result follows instantly from a simpler one, which applies to a single stochastic operator:
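Noether’s Theorem, Simpler Stochastic Version. Suppose $X$ is a $\sigma$-finite measure space and
$$U \colon L^1(X) \to L^1(X)$$
is a stochastic operator. Suppose $O$ is an observable. Then
$$[U, O] = 0$$
if and only if for all probability distributions $\psi$ on $X$,
$$\langle O, U\psi \rangle = \langle O, \psi \rangle$$
and
$$\langle O^2, U\psi \rangle = \langle O^2, \psi \rangle.$$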
It looks simple, but the proof is a bit tricky! It’s easy to see that $[U, O] = 0$ implies those other equations; the work lies in showing the converse. The reason is that $[U, O] = 0$ implies
$$\langle O^n, U\psi \rangle = \langle O^n, \psi \rangle$$
for all $n$, not just 1 and 2. The expected values of the powers of $O$ are more or less what people call its moments. So, we’re saying all the moments of $O$ are unchanged when we apply $U$ to an arbitrary probability distribution, given that we know this fact for the first two.
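To see why the second moment can’t be dropped, it helps to look at a three-state example. The following Python sketch is my own illustration, assuming a finite state space so that a stochastic operator is a matrix whose column `j` is the distribution you land in when starting from state `j`. The matrices `U_walk` and `U_mix` and the helper names are hypothetical:

```python
from fractions import Fraction as F

def commutator(U, O):
    """Entries of UO - OU, where O acts by multiplication:
    (UO - OU)[i][j] = U[i][j] * (O[j] - O[i])."""
    n = len(O)
    return [[U[i][j] * (O[j] - O[i]) for j in range(n)] for i in range(n)]

def commutes(U, O):
    return all(entry == 0 for row in commutator(U, O) for entry in row)

def moment_preserved(U, O, n):
    """<O^n, U psi> = <O^n, psi> for all psi. By linearity it is enough to
    check point masses, i.e. sum_i O[i]^n U[i][j] == O[j]^n for each j."""
    size = len(O)
    return all(sum(O[i] ** n * U[i][j] for i in range(size)) == O[j] ** n
               for j in range(size))

# From state 1, hop to state 0 or state 2 with probability 1/2 each;
# states 0 and 2 stay put. The observable is the position.
U_walk = [[F(1), F(1, 2), F(0)],
          [F(0), F(0),    F(0)],
          [F(0), F(1, 2), F(1)]]
O_position = [F(0), F(1), F(2)]

# Mix states 0 and 1, which have the same value of the observable.
U_mix = [[F(1, 2), F(1, 2), F(0)],
         [F(1, 2), F(1, 2), F(0)],
         [F(0),    F(0),    F(1)]]
O_block = [F(1), F(1), F(2)]

for name, U, O in [("walk", U_walk, O_position), ("mix", U_mix, O_block)]:
    print(name, commutes(U, O),
          moment_preserved(U, O, 1), moment_preserved(U, O, 2))
```

For `U_walk` the mean position is preserved, but the second moment is not, and the commutator is nonzero. For `U_mix` both moments are preserved and the commutator vanishes. So a conserved expected value alone doesn’t force an observable to commute with a stochastic operator.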
The proof is fairly technical but also sort of cute: we use Chebyshev’s inequality, which says that the probability of a random variable taking a value at least $k$ standard deviations away from its mean is less than or equal to $1/k^2$. I’ve always found this to be an amazing fact, but now it seems utterly obvious. You can figure out the proof yourself if you do the puzzles at the start of this post.
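If you’d like to poke at Chebyshev’s inequality numerically, here is a small Python sketch for a random variable taking finitely many values. The distribution and function names are my own choices for illustration. It compares squared deviations so that no square roots are needed:

```python
from fractions import Fraction as F

def mean_and_variance(values, probs):
    mean = sum(v * p for v, p in zip(values, probs))
    variance = sum((v - mean) ** 2 * p for v, p in zip(values, probs))
    return mean, variance

def tail_probability(values, probs, k):
    """P(|X - mean| >= k standard deviations), computed as
    P((X - mean)^2 >= k^2 * variance)."""
    mean, variance = mean_and_variance(values, probs)
    return sum(p for v, p in zip(values, probs)
               if (v - mean) ** 2 >= k ** 2 * variance)

# A hypothetical distribution with mean 0 and variance 1.
values = [F(-3), F(0), F(3)]
probs = [F(1, 18), F(16, 18), F(1, 18)]

for k in (2, 3):
    print(k, tail_probability(values, probs, k), F(1, k ** 2))
```

For this particular distribution the tail probability at $k = 3$ equals $1/9$, so the bound $1/k^2$ can’t be improved in general.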
But now I’ll let you read our paper! And I’m really hoping you’ll spot mistakes, or places it can be improved.