jointly written with Brendan Fong
Noether proved lots of theorems, but when people talk about Noether’s theorem, they always seem to mean her result linking symmetries to conserved quantities. Her original result applied to classical mechanics, but today we’d like to present a version that applies to ‘stochastic mechanics’—or in other words, Markov processes.
What’s a Markov process? We’ll say more in a minute—but in plain English, it’s a physical system where something hops around randomly from state to state, where its probability of hopping anywhere depends only on where it is now, not its past history. Markov processes include, as a special case, the stochastic Petri nets we’ve been talking about.
Our stochastic version of Noether’s theorem is modeled after a well-known quantum version. It’s yet another example of how we can exploit the analogy between stochastic mechanics and quantum mechanics. But for now we’ll just present the stochastic version. Next time we’ll compare it to the quantum one.
Markov processes
We should and probably will be more general, but let’s start by considering a finite set of states, say $X$. To describe a Markov process we then need a matrix of real numbers $(H_{ij})_{i, j \in X}$.

The idea is this: suppose right now our system is in the state $j$. Then the probability of being in some state $i$ changes as time goes by—and $H_{ij}$ is defined to be the time derivative of this probability right now.

So, if $\psi_i(t)$ is the probability of being in the state $i$ at time $t$, we want the master equation to hold:

$$ \frac{d}{dt} \psi_i(t) = \sum_{j \in X} H_{ij} \psi_j(t) $$
This motivates the definition of ‘infinitesimal stochastic’, which we recall from Part 5:
Definition. Given a finite set $X$, a matrix of real numbers $(H_{ij})_{i, j \in X}$ is infinitesimal stochastic if

$$ H_{ij} \ge 0 \quad \text{for } i \ne j $$

and

$$ \sum_{i \in X} H_{ij} = 0 $$

for all $j \in X$.
The inequality says that if we start in the state $i$, the probability of being found in some other state, which starts at 0, can’t go down, at least initially. The equation says that the total probability of being somewhere or other doesn’t change. Together, these facts imply that

$$ H_{ii} \le 0 $$

That makes sense: the probability of being in the state $i$, which starts at 1, can’t go up, at least initially.
Using the magic of matrix multiplication, we can rewrite the master equation as follows:

$$ \frac{d}{dt} \psi(t) = H \psi(t) $$

and we can solve it like this:

$$ \psi(t) = \exp(t H) \psi(0) $$

If $H$ is an infinitesimal stochastic operator, we will call the collection of operators $\exp(tH)$ a Markov process, and $H$ its Hamiltonian.

(Actually, most people call this collection of operators a Markov semigroup, and reserve the term Markov process for another way of looking at the same idea. So, be careful.)
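To make this concrete, here is a small numerical sketch—not part of the original post—assuming numpy and scipy are available. It builds a randomly chosen infinitesimal stochastic matrix, evolves a probability distribution via $\psi(t) = \exp(tH)\psi(0)$, and checks that the result stays a stochastic state:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n = 4

# An infinitesimal stochastic matrix: nonnegative off-diagonal entries,
# with each column summing to zero.
H = rng.random((n, n))
np.fill_diagonal(H, 0.0)
np.fill_diagonal(H, -H.sum(axis=0))

psi0 = np.full(n, 1.0 / n)          # a stochastic state: nonnegative entries summing to 1

for t in [0.0, 0.5, 1.0, 5.0]:
    psi_t = expm(t * H) @ psi0      # psi(t) = exp(tH) psi(0)
    print(t, psi_t, psi_t.sum())    # entries stay nonnegative; the total stays 1
```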
Noether’s theorem is about ‘conserved quantities’, that is, observables whose expected values don’t change with time. To understand this theorem, you need to know a bit about observables. In stochastic mechanics an observable $O$ is simply a function assigning a real number $O_i$ to each state $i \in X$.

However, in quantum mechanics we often think of observables as matrices, so it’s nice to do that here, too. It’s easy: we just create a matrix whose diagonal entries are the values of the function $O$, and whose off-diagonal entries are zero. And just to confuse you, we’ll also call this matrix $O$. So:

$$ (O \psi)_i = O_i \psi_i $$
One advantage of this trick is that it lets us ask whether an observable commutes with the Hamiltonian. Remember, the commutator of matrices is defined by

$$ [O, H] = O H - H O $$

Noether’s theorem will say that $[O, H] = 0$ if and only if $O$ is ‘conserved’ in some sense. What sense? First, recall that a stochastic state is just our fancy name for a probability distribution $\psi$ on the set $X$. Second, the expected value of an observable $O$ in the stochastic state $\psi$ is defined to be

$$ \sum_{i \in X} O_i \psi_i $$
In Part 5 we introduced the notation

$$ \int_X \phi = \sum_{i \in X} \phi_i $$

for any function $\phi$ on $X$. The reason is that later, when we generalize $X$ from a finite set to a measure space, the sum at right will become an integral over $X$. Indeed, a sum is just a special sort of integral!

Using this notation and the magic of matrix multiplication, we can write the expected value of $O$ in the stochastic state $\psi$ as

$$ \int_X O \psi $$

We can calculate how this changes in time if $\psi(t)$ obeys the master equation… and we can write the answer using the commutator $[O, H]$:
Lemma. Suppose $H$ is an infinitesimal stochastic operator and $O$ is an observable. If $\psi(t)$ obeys the master equation, then

$$ \frac{d}{dt} \int_X O \psi(t) = \int_X [O, H] \psi(t) $$
Proof. Using the master equation we have

$$ \frac{d}{dt} \int_X O \psi(t) = \int_X O H \psi(t) \qquad (1) $$

But since $H$ is infinitesimal stochastic, $\sum_{i \in X} H_{ij} = 0$, so for any function $\phi$ on $X$ we have

$$ \int_X H \phi = \sum_{i, j \in X} H_{ij} \phi_j = 0 $$

and in particular

$$ \int_X H O \psi(t) = 0 \qquad (2) $$

Since $[O, H] = OH - HO$, we conclude from (1) and (2) that

$$ \frac{d}{dt} \int_X O \psi(t) = \int_X [O, H] \psi(t) $$

as desired. █
The commutator doesn’t look like it’s doing much here, since we also have

$$ \frac{d}{dt} \int_X O \psi(t) = \int_X O H \psi(t) $$

which is even simpler. But the commutator will become useful when we get to Noether’s theorem!
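If you like, you can check the Lemma numerically. Here is a minimal sketch, again assuming numpy and scipy, with $H$, $O$ and $\psi(0)$ chosen at random: it compares a finite-difference estimate of $\frac{d}{dt}\int_X O\psi(t)$ with $\int_X [O,H]\psi(t)$.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n = 4

# An infinitesimal stochastic H: columns sum to zero, off-diagonal entries are nonnegative.
H = rng.random((n, n))
np.fill_diagonal(H, 0.0)
np.fill_diagonal(H, -H.sum(axis=0))

O = np.diag(rng.random(n))          # an observable, as a diagonal matrix
psi0 = rng.random(n)
psi0 /= psi0.sum()                  # a stochastic state

def integral(phi):                  # the "integral over X" is just the sum of components
    return phi.sum()

psi = lambda s: expm(s * H) @ psi0  # solution of the master equation
t, dt = 1.0, 1e-6

lhs = (integral(O @ psi(t + dt)) - integral(O @ psi(t - dt))) / (2 * dt)
rhs = integral((O @ H - H @ O) @ psi(t))
print(lhs, rhs)                     # the two numbers agree up to O(dt^2) error
```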
Noether’s theorem
Here’s a version of Noether’s theorem for Markov processes. It says an observable commutes with the Hamiltonian iff the expected values of that observable and its square don’t change as time passes:
Theorem. Suppose $H$ is an infinitesimal stochastic operator and $O$ is an observable. Then

$$ [O, H] = 0 $$

if and only if

$$ \frac{d}{dt} \int_X O \psi(t) = 0 $$

and

$$ \frac{d}{dt} \int_X O^2 \psi(t) = 0 $$

for all $\psi(t)$ obeying the master equation.
If you know Noether’s theorem from quantum mechanics, you might be surprised that in this version we need not only the observable but also its square to have an unchanging expected value! We’ll explain this, but first let’s prove the theorem.
Proof. The easy part is showing that if $[O, H] = 0$ then

$$ \frac{d}{dt} \int_X O \psi(t) = 0 \quad \text{and} \quad \frac{d}{dt} \int_X O^2 \psi(t) = 0 . $$

In fact there’s nothing special about these two powers of $O$; we’ll show that

$$ \frac{d}{dt} \int_X O^n \psi(t) = 0 $$

for all $n$. The point is that since $H$ commutes with $O$, it commutes with all powers of $O$:

$$ [O^n, H] = 0 $$

So, applying the Lemma to the observable $O^n$, we see

$$ \frac{d}{dt} \int_X O^n \psi(t) = \int_X [O^n, H] \psi(t) = 0 $$

The backward direction is a bit trickier. We now assume that

$$ \frac{d}{dt} \int_X O \psi(t) = 0 \quad \text{and} \quad \frac{d}{dt} \int_X O^2 \psi(t) = 0 $$

for all solutions $\psi(t)$ of the master equation. This implies

$$ \int_X O H \psi(t) = 0 \quad \text{and} \quad \int_X O^2 H \psi(t) = 0 , $$

or since this holds for all solutions, and in particular for the solution starting at the state concentrated on any chosen $j$,

$$ \sum_{i \in X} O_i H_{ij} = 0 \quad \text{and} \quad \sum_{i \in X} O_i^2 H_{ij} = 0 \qquad (3) $$

for all $j \in X$. We wish to show that $[O, H] = 0$.
First, recall that we can think of $O$ as a diagonal matrix with $O_{ii} = O_i$ and all off-diagonal entries zero. So, we have

$$ [O, H]_{ij} = O_i H_{ij} - H_{ij} O_j = (O_i - O_j) H_{ij} $$

To show this is zero for each pair of elements $i, j \in X$, it suffices to show that when $H_{ij} \ne 0$, then $O_i = O_j$. That is, we need to show that if the system can move from state $j$ to state $i$, then the observable takes the same value on these two states.
In fact, it’s enough to show that this sum is zero for any $j \in X$:

$$ \sum_{i \in X} (O_i - O_j)^2 H_{ij} = 0 $$

Why? When $i = j$, $(O_i - O_j)^2 = 0$, so that term in the sum vanishes. But when $i \ne j$, $(O_i - O_j)^2$ and $H_{ij}$ are both non-negative—the latter because $H$ is infinitesimal stochastic. So if they sum to zero, they must each be individually zero. Thus for all $i \ne j$, we have

$$ (O_i - O_j)^2 H_{ij} = 0 . $$

But this means that either $H_{ij} = 0$ or $O_i = O_j$, which is what we need to show.

So, let’s take that sum and expand it:

$$ \sum_{i \in X} (O_i - O_j)^2 H_{ij} = \sum_{i \in X} \left( O_j^2 - 2 O_j O_i + O_i^2 \right) H_{ij} $$

which in turn equals

$$ O_j^2 \sum_{i \in X} H_{ij} \; - \; 2 O_j \sum_{i \in X} O_i H_{ij} \; + \; \sum_{i \in X} O_i^2 H_{ij} $$

The three terms here are each zero: the first because $H$ is infinitesimal stochastic, and the latter two by equation (3). So, we’re done! █
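To see the forward direction in action, here is a small numerical sketch (assuming numpy and scipy; the matrices below are only illustrative). If $H$ only allows hops between states with the same value of $O$, then $[O, H] = 0$, and the expected values of $O$ and $O^2$ stay constant along solutions of the master equation:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
O_values = np.array([1.0, 1.0, 2.0, 2.0])       # observable constant on the blocks {0,1} and {2,3}
O = np.diag(O_values)

# Infinitesimal stochastic H with hops only inside each block, so [O, H] = 0.
H = np.zeros((4, 4))
for a, b in [(0, 1), (2, 3)]:
    r_ab, r_ba = rng.random(2)
    H[a, b] = r_ab; H[b, b] -= r_ab             # hop b -> a at rate r_ab
    H[b, a] = r_ba; H[a, a] -= r_ba             # hop a -> b at rate r_ba

assert np.allclose(O @ H, H @ O)                # O commutes with H

psi0 = np.array([0.1, 0.2, 0.3, 0.4])           # a stochastic state
for t in [0.0, 1.0, 10.0]:
    psi_t = expm(t * H) @ psi0
    print((O @ psi_t).sum(), (O @ O @ psi_t).sum())   # both expected values stay constant
```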
Markov chains
So that’s the proof… but why do we need both $O$ and its square to have an expected value that doesn’t change with time to conclude $[O, H] = 0$? There’s an easy counterexample if we leave out the condition involving $O^2$. However, the underlying idea is clearer if we work with Markov chains instead of Markov processes.
In a Markov process, time passes by continuously. In a Markov chain, time comes in discrete steps! We get a Markov process by forming the operators $\exp(tH)$, where $H$ is an infinitesimal stochastic operator. We get a Markov chain by forming the powers $U, U^2, U^3, \dots$ where $U$ is a ‘stochastic operator’. Remember:

Definition. Given a finite set $X$, a matrix of real numbers $(U_{ij})_{i, j \in X}$ is stochastic if

$$ U_{ij} \ge 0 $$

for all $i, j \in X$ and

$$ \sum_{i \in X} U_{ij} = 1 $$

for all $j \in X$.

The idea is that $U$ describes a random hop, with $U_{ij}$ being the probability of hopping to the state $i$ if you start at the state $j$. These probabilities are nonnegative and sum to 1.
Any stochastic operator $U$ gives rise to a Markov chain $U, U^2, U^3, \dots$ And in case it’s not clear, that’s how we’re defining a Markov chain: the sequence of powers of a stochastic operator. There are other definitions, but they’re equivalent.

We can draw a Markov chain by drawing a bunch of states and arrows labelled by transition probabilities, which are the matrix elements $U_{ij}$:
Here is Noether’s theorem for Markov chains:
Theorem. Suppose $U$ is a stochastic operator and $O$ is an observable. Then

$$ [O, U] = 0 $$

if and only if

$$ \int_X O U \psi = \int_X O \psi $$

and

$$ \int_X O^2 U \psi = \int_X O^2 \psi $$

for all stochastic states $\psi$.

In other words, an observable commutes with $U$ iff the expected values of that observable and its square don’t change when we evolve our state one time step using $U$.
You can probably prove this theorem by copying the proof for Markov processes:
Puzzle. Prove Noether’s theorem for Markov chains.
But let’s see why we need the condition on the square of the observable! That’s the intriguing part. Here’s a nice little Markov chain:

[figure: states 0, 1 and 2, with arrows labelled 1/2 from state 1 to state 0 and from state 1 to state 2, and arrows labelled 1 from states 0 and 2 to themselves]

where we haven’t drawn arrows labelled by 0. So, state 1 has a 50% chance of hopping to state 0 and a 50% chance of hopping to state 2; the other two states just sit there. Now, consider the observable $O$ with

$$ O_i = i $$

It’s easy to check that the expected value of this observable doesn’t change with time:

$$ \int_X O U \psi = \int_X O \psi $$

for all stochastic states $\psi$. The reason, in plain English, is this. Nothing at all happens if you start at states 0 or 2: you just sit there, so the expected value of $O$ doesn’t change. If you start at state 1, the observable equals 1. You then have a 50% chance of going to a state where the observable equals 0 and a 50% chance of going to a state where it equals 2, so its expected value doesn’t change: it still equals 1.
On the other hand, we do not have $[O, U] = 0$ in this example, because we can hop between states where $O$ takes different values. Furthermore,

$$ \int_X O^2 U \psi \ne \int_X O^2 \psi $$

for some stochastic states $\psi$. After all, if you start at state 1, $O^2$ equals 1 there. You then have a 50% chance of going to a state where $O^2$ equals 0 and a 50% chance of going to a state where it equals 4, so its expected value changes!

So, that’s why

$$ \int_X O U \psi = \int_X O \psi $$

for all stochastic states $\psi$ is not enough to guarantee $[O, U] = 0$. The same sort of counterexample works for Markov processes, too.
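Here is that counterexample worked out numerically—a sketch assuming numpy. The stochastic matrix $U$ below encodes the little Markov chain above, with column $j$ giving the probabilities of hopping from state $j$, and the observable is $O_i = i$:

```python
import numpy as np

# Columns are the starting states 0, 1, 2; entries are hopping probabilities.
# States 0 and 2 sit still; state 1 hops to 0 or 2 with probability 1/2 each.
U = np.array([[1.0, 0.5, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.5, 1.0]])

O = np.diag([0.0, 1.0, 2.0])        # the observable O_i = i, as a diagonal matrix
psi = np.array([0.2, 0.5, 0.3])     # some stochastic state

print((O @ U @ psi).sum(), (O @ psi).sum())            # equal: <O> is conserved
print((O @ O @ U @ psi).sum(), (O @ O @ psi).sum())    # differ: <O^2> is not
print(np.allclose(O @ U, U @ O))                       # False: [O, U] != 0
```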
Finally, we should add that there’s nothing terribly sacred about the square of the observable. For example, we have:

Theorem. Suppose $H$ is an infinitesimal stochastic operator and $O$ is an observable. Then

$$ [O, H] = 0 $$

if and only if

$$ \frac{d}{dt} \int_X f(O) \psi(t) = 0 $$

for all smooth $f \colon \mathbb{R} \to \mathbb{R}$ and all $\psi(t)$ obeying the master equation.

Theorem. Suppose $U$ is a stochastic operator and $O$ is an observable. Then

$$ [O, U] = 0 $$

if and only if

$$ \int_X f(O) U \psi = \int_X f(O) \psi $$

for all smooth $f \colon \mathbb{R} \to \mathbb{R}$ and all stochastic states $\psi$.
These make the ‘forward direction’ of Noether’s theorem stronger… and in fact, the forward direction, while easier, is probably more useful! However, if we ever use Noether’s theorem in the ‘reverse direction’, it might be easier to check a condition involving only $O$ and its square.
In the first definition you need sigma H = 1, not sigma H = 0
Thanks, but no: I think you’re mixing up ‘infinitesimal stochastic’ and ‘stochastic’: see Part 5 for both those concepts.
Later in this post you’ll see a stochastic matrix $U$, which obeys $\sum_{i \in X} U_{ij} = 1$.
Nice post. However I have some issues with the terminology used:
1) “the number $H_{ij}$ describes the probability per unit time of hopping from the state $j$ to the state $i$” is not quite correct. $H_{ij}$ is the rate at which the system hops from state $j$ to state $i$. In other words, for an infinitesimal time $dt$ the probability for jumping is $H_{ij}\,dt$. The difference between the two is similar to the difference between the interest rate and the AER (annual equivalent rate, for non-UK readers).
2) “Together, they imply that the probability of staying in the same place goes down: $H_{ii} \le 0$.” is not clear to me. What is this “probability of staying in the same place”, and is it going down as what changes? I think it would be clearer to say that $-H_{ii}$ is the rate at which a system in state $i$ leaves that state.

3) “we call $\exp(tH)$ a Markov process”. The term “Markov process” already has a different well-defined mathematical meaning. The group of operators $\exp(tH)$ is often referred to as the “Markov semigroup”. I am happy with the term “Hamiltonian” for the generator of this semigroup.
In principle I like the approach taken here of giving physicists’ quantum mechanical names to probability theory concepts. It makes the theory more accessible to physicists. In that vein I would go even further and use Dirac’s bra-ket notation rather than the integral notation you employ. So, for example, instead of $\int_X O\psi$ I would write $\langle 1 | O | \psi \rangle$, where $\langle 1 |$ satisfies $\langle 1 | i \rangle = 1$ for every state $i$. I think the integral notation is unsatisfying to both mathematicians and physicists (physicists will be wondering where the dx went and mathematicians will want to know what measure is used).
Gustav wrote:
This issue comes up over and over when I write about these things. I feel I have trouble explaining this concept both accurately and very quickly.
I completely understand the problem with what I said: it might seem like $H_{ij}$ is the probability that the system hops from state j to state i in, say, one second. As you know, what I really mean is to take the probability that the system hops from state j to state i in $\Delta t$ seconds, divide it by $\Delta t$, and then take the limit as $\Delta t \to 0$. But that takes a while to say!

I’m always afraid that calling this quantity “the rate at which the system hops from state j to state i” will confuse people, because this description doesn’t mention probabilities. This rate is a “probabilistic rate”, and the phrase “probabilistic rate” is not part of everyday English. I guess people say things like “on average, a bus comes by every 10 minutes”. But if you say “the rate at which buses come by is 1 per hour”, I bet they won’t guess you mean a probabilistic rate.

I think that leaving out the word “unit” would help: “the number $H_{ij}$ describes the probability per time of hopping from the state j to the state i.”

Or I could say “the number $H_{ij}$ describes the average rate at which the system hops from the state j to the state i.”
What do people think is clearest? This time I’ll add a lengthy precise description, but I don’t want to always have to give such a long description. I want a clear short description that nonexperts can understand.
John, this discussion has been very instructive for me. It opened my eyes to the problems one runs into when one wants to be precise and colloquial at the same time. In particular this is difficult if one wants a blog post to be readable to someone who has not read the previous blog posts in the series.
I would vote for “probabilistic rate”. You are right that this is not part of everyday English. Therefore I suspect it also does not carry any misleading connotations. Most readers will probably just swallow it and the curious ones will be tempted to read your earlier posts with the precise explanations. Initially I had felt that “stochastic rate” might work, but I now realise that the word “stochastic” might sound technical.
One of my books on probability uses “probability intensity of transition from state i to state j”. What you call the Hamiltonian I would call a “rate matrix”, or an “instantaneous rate matrix” if I thought the former was likely to confuse.
“Probability intensity” is an interesting phrase. I’m not sure most people would instantly understand it, but they could learn.
“Rate matrix” is certainly clearer than “Hamiltonian”, so I should mention that in the (dreamt-of) final polished version of these notes. “Hamiltonian” is mainly good for helping physicists see that all this stuff is a lot like quantum mechanics. In quantum mechanics we have

$$ \frac{d}{dt}\psi(t) = -iH\psi(t) $$

while here we have

$$ \frac{d}{dt}\psi(t) = H\psi(t) $$

While we’re comparing conventions, I should add that lots of people prefer

$$ \frac{d}{dt}\psi(t) = -H\psi(t) $$

and this indeed has advantages. But I thought that sticking in a minus sign would seem peculiar to beginners.
Gustav wrote:
These comments are really useful, because while you and I both know what I really meant to say, I plan to turn these posts into a paper or book someday, and then it’s important that they be clear.
So, here’s what I mean. The probability of staying in some particular place, say place $i$, is the matrix element $(\exp(tH))_{ii}$, and this goes down as time passes:

$$ \frac{d}{dt}\,(\exp(tH))_{ii}\Big|_{t=0} = H_{ii} \le 0 $$
But again, the problem comes when I try to say this very quickly and informally but still clearly.
Again, I avoided saying this because this rate is a “probabilistic rate”, a rate of change of probabilities, and what you say here doesn’t make that clear. “The rate at which the probability of staying in the state $i$ diminishes” is perhaps more precise—but it sounds stilted, not conversational.

Also, the minus sign looks like it’s inserted ad hoc when we say things this way. What’s uniformly true is that $(\exp(tH))_{ij}$ is the probability of hopping from state $j$ to state $i$ after time $t$, and

$$ \frac{d}{dt}\,(\exp(tH))_{ij}\Big|_{t=0} = H_{ij} $$

So $H_{ij}$ is the “instantaneous rate of change, at $t = 0$, of the probability of hopping from state $j$ to state $i$ after time $t$.”
But I’d like a way to say this that’s quick, informal, yet clear. Of course I need to explain this idea clearly and patiently somewhere. But then there will be times I need to remind people of it—and those reminders should be terse but not misleading.
Thank you John, that was helpful. While reading that sentence that gave me difficulties I had not realised that the probability you were talking about was given as a simple exponential and therefore I had not made the connection between the decrease in that probability and the sign of $H_{ii}$.
I know I am in a pedantic mood. But I think being pedantic is fun. So here I go again. There is a difference between two probabilities. $(\exp(tH))_{ii}$ gives the probability of _being_ in state $i$ at time $t$ given that we start in state $i$ at time $0$. The probability of staying (in the sense of never leaving) in state $i$ until at least time $t$ is $e^{tH_{ii}}$. Luckily they both have the same derivative $H_{ii}$ at $t = 0$, so they both go down at $t = 0$. The probability of staying has the added benefit of going down also at $t > 0$. The probability of being in the state on the other hand could conceivably go up again later. So it is good that you chose to talk of the probability of staying.
Gustav wrote:
Hmm, when I read the definition of Markov process, it sounds like a long-winded way of describing a Markov semigroup. Isn’t there a one-to-one correspondence between Markov processes and Markov semigroups? If there is, I can just insert a little note saying that I’m abusing language a bit.
(I’m a bit of a radical, I’m afraid: I think the world needs people who try using terminology in new ways… as long as they define it. Such people are nuisances, I know. But they provide the lubrication needed to eventually find the optimal terminology: otherwise things get locked in place at suboptimal local maxima.)
Good, because I want you to be happy, and that’s what I’m going to use.
I can’t do that, because as I explained in Part 5, the fundamental structure here is not the Hilbert space $L^2(X)$ but rather the space $L^1(X)$, which doesn’t have an inner product on it! This is the big philosophical point I’m trying to make throughout these notes. I’ll quote myself and then say a bit more:
Privately I often use angle brackets like this: $\langle O\psi \rangle$ to denote the operation I’m publicly calling the integral $\int_X O\psi$. This heightens the resemblance to Dirac’s bracket notation: quantum mechanics uses $\langle \psi | O | \psi \rangle$ for the expected value of an observable, while stochastic mechanics uses $\langle O\psi \rangle$.

However, I’m sure that writing $\langle O\psi \rangle$ for the expected value of an observable $O$ in the state $\psi$ would annoy lots of people. For one thing, lots of people use $\langle O \rangle$, sweeping the state $\psi$ under the carpet. This is sort of stupid, but it’s completely entrenched.

So, for now I’m using $\int_X O\psi$ instead. And this has the advantage of having a fairly self-evident meaning: I’m integrating the function $O\psi$ over the space $X$.
The proof of Noether’s theorem is neat, but I wonder if it can be cleaned up? For example, is there a way to prove it without resorting to components?
Good point! Someone try it!
John, I am not yet sure why you do not like the notation $\langle 1 | O | \psi \rangle$. I am using it simply in the spirit of Dirac, who wasn’t worrying about mathematical subtleties like whether wavefunctions live in $L^1$ or $L^2$.

I guess, in the language of $L^p$ spaces I would have to say that the kets live in $L^1$ and the bras in $L^\infty$, but for the purpose of these blog posts we can follow Dirac and ignore such details.
Everyone has their own notation and nobody likes anyone else’s. I take that for granted as a condition of life. I don’t want to know why other people hate my notation; I don’t expect them to care why I hate theirs. I prefer to discuss more interesting things. So, this comment will be somewhat grumpy in tone.
I am trying to clarify and exploit the logical relation between probability theory and quantum theory. This involves noting the similarities but also respecting the differences.
I am not at all interested in ‘following the spirit of Dirac’, if that means ‘glossing over mathematical subtleties’. However, I don’t want to scare my readers by introducing too much formalism too soon—especially if I haven’t worked out the details!
So far in this series of posts, I’m pursuing the philosophy that quantum theory is about Hilbert spaces while probability theory is about vector spaces equipped with some other structure. This extra structure is something like that of an integration algebra… but that may not be quite right, so I’d rather not talk about it yet.
So, instead, I’m saying that quantum theory is about $L^2$ while probability theory is about $L^1$. This is easier for everyone to understand.

Given this, I want to write the integral of the function $O\psi$ as $\int_X O\psi$, rather than trying to artificially force probability theory into looking like quantum theory by writing it as $\langle 1 | O | \psi \rangle$.
There is a certain quaint charm in using an integral sign to denote integration, after all.
But if someone held a gun to my head and forced me to use Dirac notation here, I would write $\langle 1 | O\psi \rangle$, which at least makes some sense: as you note, we can say we’re pairing the element $O\psi$ of $L^1$ with the element $1$ of $L^\infty$.
But if we try to understand the relation between quantum theory and probability theory this way, I believe we’ll get quite confused.
Anyway, there are lots of interesting issues to discuss here, but I think it will be easiest if we decouple them from the question of what notation to use.
John Baez wrote:
A lot of people do this — it’s the first way I’d seen the technology set up, though of course that doesn’t mean it’s the best or the most illuminating choice.
Do any of these people reflect out loud about what this approach means? It means something like: there’s a god-given ‘default state’ called 1, and the expectation value of an observable $O$ in the state $\psi$ is the transition amplitude $\langle 1 | O | \psi \rangle$. But actually it’s weirder than that, since if we have $\int \psi = 1$ then the $\psi$ will hardly ever count as a quantum state, since typically $\int |\psi|^2 \ne 1$, and similarly, unless our measure space is a probability measure space the default state will be neither a stochastic state: $\int 1 \ne 1$, nor a quantum state: $\int |1|^2 \ne 1$.

So it’s all very weird. Basically, it ignores the fact that quantum states should have $\int |\psi|^2 = 1$ while stochastic states are a very different beast, with $\int \psi = 1$.

We can get a stochastic state from a quantum state $\psi$ by forming $|\psi|^2$: we all learn about this in school, when people discuss the probability interpretation of the wavefunction. Conversely (though I never hear anyone talk about this) we can get a quantum state from a stochastic state $\psi$ by forming $\sqrt{\psi}$. But in the approach where we talk about $\langle 1 | O | \psi \rangle$, it seems we are simply pretending a stochastic state is a quantum state, while neglecting all the problems this raises!

Believe me, I’d be fascinated if someone could tell a coherent story about this… I’m not trying to nip a nascent idea in the bud… but so far all my thoughts about this suggest it’s a wrong road.
Does it still count as a “nascent idea” if it’s been around since 1976? :-P
More seriously, I think the main issue is that most of the people involved just weren’t that concerned with quantum-to-classical transitions. If the smallest thing you’re considering is a rabbit, a sand grain or even a clump of cells in a human neocortex, going from a probability distribution to a quantum density matrix or vice versa isn’t a top priority. So, while being able to lift tools out of the quantum toolbox is nice, relating a stochastic description of a system to a quantum description of the same physical system isn’t a goal.
Cardy (1996) is typical:
Second, I think that the people who study diffusion-limited reactions, active-to-absorbing phase transitions, directed percolation and the like are generally eager to skip past the first steps of defining the formalism and get to a Lagrangian they can play with. A better notation at the beginning may obviate the need for a few awkwardnesses further along (e.g., field redefinitions); I’ll have to look into that. The stuff they seem to spend the most time worrying over comes after they’ve a stochastic Hamiltonian in the coherent-state representation: renormalization, estimating critical exponents, etc.
Blake wrote:
I’d say the mathematical trick has been around since 1976. The nascent idea lurking in this trick is that we can think of a probability distribution as a quantum state if we normalize it in a nonstandard way and promise to only ask about its transition amplitudes to a certain ‘default’ state $1$. Mathematical tricks often conceal ideas that are too strange for people to say in words.
Yes, that’s one part of it. But even if we don’t try to describe the same system both classically and quantumly, there’s also the question of the logical relation between the classical and quantum descriptions: that’s what I’m especially interested in. But this is not the sort of question that ‘practical’ people tend to enjoy—perhaps because they can’t imagine what one might do with the answer.
Right. For me the murky beginning steps are the most interesting part, because they hint at a relation between quantum mechanics and probability theory that seems a bit different than the ‘obvious’ one, where $|\psi|^2$ rather than the wavefunction $\psi$ acts like a probability distribution. I’ve got a bunch of ideas about this that I’ll reveal as soon as I can.
Could someone please go here and tell me whether you see what you should near the beginning of the section Markov processes, namely:
or what I see on my browser at work, namely
This is a really annoying bug!
I use Firefox 7.0.1 and see the first version.
With Google chrome and IE, what I see is mathematically correct, but very messy. The slash through the equals sign is too far to the right, and the arrow is made of an equals sign and an arrow which don’t line up.
That’s what I used to get, back when things worked for me. I thought that was bad… but I don’t mind an ugly $\ne$ sign nearly as much as one that looks exactly like $=$!
By the way, I don’t know why your 3 attempts to post this ran into trouble.
I get the first too, although it’s not nicely rendered. I’m using Firefox 7.0.
Thanks, guys! Does anyone see the ≠ as an equal sign? I now think it could be because on this computer I’ve downloaded fonts so that jsmath doesn’t need to grab them from somewhere else. I assume you guys are getting the little message on top, about jsmath?
It looks fine to me (Firefox 3.6.22). The “not equals” sign is perfect. There’s a slight wobble in the shaft of the “implies” sign are slightly misaligned, but I’d hardly have noticed.
Oops, there was a wobble in my English composition there, too. But I think you know what I mean.
Opera 10.0, which is notorious for getting maths wrong.
It shows an extra ‘=’ prepended to the => sign (is this to make a long implication arrow?) In Opera itself the ‘=’ is at a slight angle but when I did a screen grab it came out straight, except that you can see the join, so I have some kind of optical illusion as well.
Anyhow, I am wondering if the extra ‘=’ is being moved around somehow.
John Beattie wrote:
Graham wrote:
Tom wrote approximately:
I think you’re all describing the same thing, which I’m also seeing on my laptop at home. I think that’s because I risked a \Longrightarrow instead of a mere \Rightarrow.
Luckily, none of you see the ≠ coming out as an =, and neither do I, here at home. Only my computer at work commits that heinous crime!
So I will relax, somewhat, and change the \Longrightarrow to \Rightarrow. (In case you’re wondering, jsmath doesn’t recognize \implies.)
Thanks, everyone!!!
Over on the Forum, Eric Forgy came up with a relative of Noether’s theorem that goes like this.
Let’s assume that $\psi(t)$ obeys the master equation, so

$$ \frac{d}{dt}\psi(t) = H\psi(t) $$

Then we have

$$ \frac{d}{dt}\big(O\psi(t)\big) = OH\psi(t) $$

Now, if $O\psi(t)$ also obeys the master equation then we also have

$$ \frac{d}{dt}\big(O\psi(t)\big) = HO\psi(t) $$

From these we can conclude $[O,H]\psi(t) = 0$. Conversely if $[O,H]\psi(t) = 0$, then

$$ \frac{d}{dt}\big(O\psi(t)\big) = OH\psi(t) = HO\psi(t) $$

so $O\psi(t)$ obeys the master equation.

So, we have:

Proposition: if $\psi(t)$ obeys the master equation, then $[O,H]\psi(t) = 0$ iff $O\psi(t)$ obeys the master equation.
Eric Forgy also lured me into thinking about the Schrödinger versus Heisenberg pictures in stochastic mechanics.
So far I’ve been using time-independent observables and letting states evolve in time via

$$ \psi(t) = \exp(tH)\psi(0) $$

This is the Schrödinger picture. However, we may also use time-independent states and let observables evolve in time via

$$ O(t) = O\exp(tH) $$

This is the Heisenberg picture. These pictures are compatible in that we may use either one to compute the expected value of an observable $O$ measured in the state $\psi$ after waiting a time $t$, and we get the same answer:

$$ \int_X O\,\psi(t) = \int_X O(t)\,\psi $$

In the Schrödinger picture we have the master equation

$$ \frac{d}{dt}\psi(t) = H\psi(t) $$

while in the Heisenberg picture we have

$$ \frac{d}{dt}O(t) = O(t)\,H $$

This is amusingly different than quantum mechanics. In quantum mechanics we define a time-dependent version of either the state $\psi$ or the observable $O$ by setting

$$ \psi(t) = \exp(-itH)\psi $$

and

$$ O(t) = \exp(itH)\,O\exp(-itH) $$

and we obtain the compatibility equation

$$ \langle \psi(t), O\,\psi(t)\rangle = \langle \psi, O(t)\,\psi\rangle $$

and also the equation

$$ \frac{d}{dt}O(t) = i[H, O(t)] $$
There’s nothing like posting something publicly to stir up thoughts that make that post seem ill-considered and rash! I’ve changed my mind a bit about the Heisenberg picture in stochastic mechanics. While nothing I said above seems mathematically incorrect, it’s upsetting that while the product of observables is an observable, we have

$$ O(t)\,O'(t) \ne (OO')(t) $$

if, as above, we define $O(t) = O\exp(tH)$.
So, suppose $H$ is infinitesimal stochastic. Also suppose our set $X$ of states is finite, to avoid subtleties of analysis I’d rather postpone thinking about. Then $\exp(tH)$ is defined for negative $t$ as well as positive $t$, and while it’s usually not stochastic for negative times, we have

$$ \int_X \exp(tH)\phi = \int_X \phi $$

for both positive and negative times.

Then, we can either use time-independent observables and let states evolve in time via

$$ \psi(t) = \exp(tH)\psi $$

or use time-independent states and let observables evolve in time via

$$ O(t) = \exp(-tH)\,O\exp(tH) $$

These pictures are compatible in that we may use either one to compute the expected value of the observable $O$ measured in the state $\psi$ after waiting a time $t$, and we get the same answer:

$$ \int_X O(t)\,\psi = \int_X O\,\psi(t) $$

The reason is that

$$ \int_X O(t)\,\psi = \int_X \exp(-tH)\,O\exp(tH)\psi = \int_X O\exp(tH)\psi = \int_X O\,\psi(t) $$

where the second step uses the remarks I made earlier.

Now we have

$$ (OO')(t) = O(t)\,O'(t) $$

and also

$$ (O + O')(t) = O(t) + O'(t) $$

for any observables $O$ and $O'$.
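Here is a quick numerical check of the compatibility equation above—a minimal sketch assuming numpy and scipy, with $H$, $O$ and $\psi$ chosen at random and $O(t)$ formed by conjugation as above:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
n = 4

# An infinitesimal stochastic H on a finite state space.
H = rng.random((n, n))
np.fill_diagonal(H, 0.0)
np.fill_diagonal(H, -H.sum(axis=0))

O = np.diag(rng.random(n))          # an observable
psi = rng.random(n)
psi /= psi.sum()                    # a stochastic state

t = 0.7
schrodinger = (O @ expm(t * H) @ psi).sum()      # integral of O psi(t)
O_t = expm(-t * H) @ O @ expm(t * H)             # Heisenberg-picture O(t)
heisenberg = (O_t @ psi).sum()                   # integral of O(t) psi
print(schrodinger, heisenberg)                   # the two agree
```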
There’s more to say, but not now! It’s dinnertime!
Another option is to let
evolve as an operator. I’m writing some notes on that idea on the forum as we speak.
Dear John,
I don’t get one thing. You write “That is, we need to show that if the system can move from state j to state i, then the observable takes the same value on these two states.”
So, if I understand well, if the graph is connected, observable O will take the same value on all of the states. Otherwise, it will have different constant values in each component (but I do not consider disconnected graphs as really interesting, as each component is completely independent of another: each process happens on its own).
So, maybe what you proved is not “Noether’s theorem”, but the (still nice) result that
“If a time-independent observable’s average and variance do not vary in time, then the observable is uniform on the vertex set”.
Observations:
– I wonder if this could have anything to do with Discrete Analytic Functions (http://www.cs.elte.hu/~lovasz/analytic.pdf), which are constant on any compact discretized Riemann surface.
– The fact that first and second moments play the crucial role for Markov processes resonates with the continuous variable case, where the underlying stochastic processes have at each time gaussian distributions – the gaussian has only first and second nonvanishing moments – and something similar happens in Pawula’s theorem for the truncation of the Kramers-Moyal expansion after the second term.
Tomate wrote:
That’s right. By the way, for people who don’t understand what you said, let me add that you’re taking the points of our set $X$ as the vertices of a directed graph, and drawing an edge from $j$ to $i$ whenever $H_{ij}$ is nonzero.

Well, you may not consider it interesting, but that’s what a conserved quantity $O$ does: it splits the set $X$ into a disjoint union of subsets on which $O$ takes different constant values, and our Markov process then becomes a ‘disjoint union’ of Markov processes on these subsets. It’s exactly like in quantum mechanics, where a conserved quantity splits the Hilbert space up as a direct sum of eigenspaces, and time evolution separately preserves each eigenspace.

Personally I consider this very interesting: this is how conserved quantities let us simplify physics problems! And they arise quite often: for example, in the reversible reaction we considered last time, the total number of particles of types 1 and 2 is conserved. This explains how from a single Poisson equilibrium state we were able to extract a lot of different equilibrium states in which that number took different values. I’ll work out this example in detail sometime, for people who need a bit of help.
OK, I buy it. But still I prefer the formulation “If a time-independent observable’s average and variance do not vary in time, then the observable is uniform over the vertex set (of a connected graph)”. I think it does have something deep in it related to the key role of the first and second moment for stochastic processes.
(Also because in QM you can build entangled wave functions over factorized subspaces, while here the superposition between probabilities, or populations, is always what one would call a “mixture” in the QM case.)
On the Schrödinger/Heisenberg picture: I’ve seen people using a sort of “interaction picture”, where the hamiltonian $H$ is split into a waiting-time contribution and an interaction hamiltonian, and one then takes care of these two pieces when exponentiating $H$. It’s very useful for guessing the correct path measure, for example. I myself had a complete discussion of this procedure in my master’s thesis, but it is in Italian. However, I’ve never seen it discussed in relation to the evolution of a conjugate observable $O(t)$. It would be interesting to see what happens if one discharges part of the evolution (the “free” one) on an observable and part (the “interacting”) on the probability measure itself. Maybe it would make calculations easier.
Hi! Great to see you here again! James Dolan had suggested to me the idea of using an interaction picture of precisely this sort. I’d never seen it before. What I find amusing is that the particle’s probability of staying where it is decays before the particle jumps somewhere else… as if it’s dreaming of the jump before it goes.

This is different than the interaction picture in quantum mechanics, where the free hamiltonian by itself is already self-adjoint, so that the free evolution is unitary between the ‘jumps’.
I’ve always been here – but with too little time for discussion.
This is precisely what I had in mind. I can send you via email a couple of pages from my master’s thesis if you want: they are in Italian, but the formulas are quite clear. Funnily, I don’t have references… I wrote that chapter out of some personal notes of my professor, which didn’t have references either. In the field, it’s like everybody knows about it but nobody knows exactly where it comes from…
“as if it’s dreaming of the jump before it goes”: this is always the effect it has when we project statistical arguments onto the individuals, like when people play long-overdue numbers at the lotteries…
Tomate wrote:
Oh, good! Sometimes I think everyone is leaving, or falling asleep.
If it mainly says what we’ve already discussed, I guess I won’t make you bother. I guess this is some sort of ‘folk wisdom’.
Okay, good point!
Brendan Fong proved the stochastic version of Noether’s theorem in Part 11. Now let’s do the quantum version […]
Since nobody did the puzzle this time, I’ll have to do it myself.
Puzzle. Suppose $U$ is a stochastic operator and $O$ is an observable. Show that $O$ commutes with $U$ iff the expected values of $O$ and its square don’t change when we evolve our state one time step using $U$.

In other words, show that

$$ [O, U] = 0 $$

if and only if

$$ \int_X O U\psi = \int_X O\psi $$

and

$$ \int_X O^2 U\psi = \int_X O^2\psi $$

for all stochastic states $\psi$.

Answer. One direction is easy: if $[O, U] = 0$, then

$$ [O^n, U] = 0 $$

for all $n$, so

$$ \int_X O^n U\psi = \int_X U O^n\psi = \int_X O^n\psi $$

where in the last step we use the fact that $U$ is stochastic.

For the converse direction we can use the same tricks that worked for Markov processes. Assume that

$$ \int_X O U\psi = \int_X O\psi \qquad (1) $$

and

$$ \int_X O^2 U\psi = \int_X O^2\psi \qquad (2) $$

for all stochastic states $\psi$. These imply that

$$ \sum_{i \in X} O_i U_{ij} = O_j $$

and

$$ \sum_{i \in X} O_i^2 U_{ij} = O_j^2 $$

for all $j \in X$, since we can take $\psi$ to be concentrated at the state $j$.

We wish to show that $[O, U] = 0$. Note that

$$ [O, U]_{ij} = (O_i - O_j) U_{ij} $$

To show this is always zero, we’ll show that when $U_{ij} \ne 0$, then $O_i = O_j$. This says that when our system can hop from one state to another, the observable $O$ must take the same value on these two states.

For this, in turn, it’s enough to show that the following sum vanishes for any $j \in X$:

$$ \sum_{i \in X} (O_i - O_j)^2 U_{ij} = 0 $$

Why? The matrix elements $U_{ij}$ are nonnegative since $U$ is stochastic. Thus the sum can only vanish if each term vanishes, meaning that $O_i = O_j$ whenever $U_{ij} \ne 0$.

To show the sum vanishes, let’s expand it:

$$ \sum_{i \in X} (O_i - O_j)^2 U_{ij} = \sum_{i \in X} O_i^2 U_{ij} - 2 O_j \sum_{i \in X} O_i U_{ij} + O_j^2 \sum_{i \in X} U_{ij} $$

Now, since (1) and (2) hold for all stochastic states $\psi$, this equals

$$ O_j^2 - 2 O_j^2 + O_j^2 \sum_{i \in X} U_{ij} $$

But this is zero because $U$ is stochastic, which implies

$$ \sum_{i \in X} U_{ij} = 1 $$

So, we’re done!
Word spreads fast! Here’s an announcement of a talk at the Oxford OASIS series. That stands for Oxford Advanced Seminar on Informatic Structures.
Philosophers of physics being as they are, the phrase “what they call” makes me afraid he’s planning to chide me for using the term “Noether’s theorem” in a very extended sense, not very close to that of her original 1918 paper. Physicists being as they are, such chiding wouldn’t stop me. But I’m curious to hear what he actually says. The talk will be videotaped and put on the OASIS website. Furthermore, Brendan is now at Oxford and can hear the talk in person!
Very cool. I look forward to seeing that.
[…] some applications of discrete calculus. In this post, I reformulate some of the material in Part 11 pertaining to Noether’s […]
In the 2 theorems at the end, why is $f \colon \mathbb{R} \to \mathbb{R}$, when $f$ takes and gives only expressions in $O$?

Also, when we have $f(O)$, does $f$ being smooth mean that it can be expanded in the form of a power series in $O$?
The functional calculus allows you to apply any function $f \colon \mathbb{R} \to \mathbb{R}$ to any self-adjoint $n \times n$ matrix (and thus any self-adjoint operator on a finite-dimensional Hilbert space), or any holomorphic function $f \colon \mathbb{C} \to \mathbb{C}$ to any $n \times n$ matrix (and thus to any linear operator on a finite-dimensional space).

We can do that when $f$ is holomorphic. This means that

$$ f(z) = \sum_{n = 0}^\infty c_n z^n $$

and the power series converges for all $z \in \mathbb{C}$. You can then prove that

$$ \sum_{n = 0}^\infty c_n A^n $$

converges for all matrices $A$, so we can define $f(A)$ to be this.

When $A$ is self-adjoint we can go further, and define $f(A)$ for any function $f \colon \mathbb{R} \to \mathbb{R}$, simply by saying that $f(A)$ has the same eigenvectors as $A$, and

$$ f(A)\,v = f(\lambda)\,v \quad \text{whenever} \quad A v = \lambda v $$

We need $A$ to be self-adjoint here because that guarantees that we can choose a basis of eigenvectors of $A$.

For some reason I chose to focus on the case where $f$ is smooth, but there was no need to do this. Since $O$ is self-adjoint, I could have assumed $f$ is any function whatsoever.
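As a small illustration of that recipe—a sketch assuming numpy, not from the original comment—here is $f$ applied to a self-adjoint matrix by acting on its eigenvalues:

```python
import numpy as np

def apply_function(f, A):
    """Apply f to a real symmetric (self-adjoint) matrix A by acting on its eigenvalues."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    return eigenvectors @ np.diag(f(eigenvalues)) @ eigenvectors.T

O = np.diag([0.0, 1.0, 2.0])                        # a diagonal observable is self-adjoint
print(apply_function(np.square, O))                 # recovers O^2
print(apply_function(lambda x: np.exp(-x), O))      # e^{-O}, defined eigenvalue by eigenvalue
```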