Geometric Quantization (Part 1)

1 December, 2018

I can’t help thinking about geometric quantization. I feel it holds some lessons about the relation between classical and quantum mechanics that we haven’t fully absorbed yet. I want to play my cards fairly close to my chest, because there are some interesting ideas I haven’t fully explored yet… but still, there are also plenty of ‘well-known’ clues that I can afford to explain.

The first one is this. As beginners, we start by thinking of geometric quantization as a procedure for taking a symplectic manifold and constructing a Hilbert space: that is, taking a space of classical states and constructing the corresponding space of quantum states. We soon learn that this procedure requires additional data as its input: a symplectic manifold is not enough. We learn that it works much better to start with a Kähler manifold equipped with a holomorphic hermitian line bundle with a connection whose curvature is the imaginary part of the Kähler structure. Then the space of holomorphic sections of that line bundle gives the Hilbert space we seek.

That’s quite a mouthful—but it makes for such a nice story that I’d love to write a bunch of blog articles explaining it with lots of examples. Unfortunately I don’t have time, so try these:

• Matthias Blau, Symplectic geometry and geometric quantization.

• A. Echeverria-Enriquez, M.C. Munoz-Lecanda, N. Roman-Roy, C. Victoria-Monge, Mathematical foundations of geometric quantization.

But there’s a flip side to this story which indicates that something big and mysterious is going on. Geometric quantization is not just a procedure for converting a space of classical states into a space of quantum states. It also reveals that a space of quantum states can be seen as a space of classical states!

To reach this realization, we must admit that quantum states are not really vectors in a Hilbert space H; from a certain point of view they are really 1-dimensional subspaces of a Hilbert space, so the set of quantum states I’m talking about is the projective space PH. But this projective space, at least when it’s finite-dimensional, turns out to be the simplest example of that complicated thing I mentioned: a Kähler manifold equipped with a holomorphic hermitian line bundle whose curvature is the imaginary part of the Kähler structure!

So a space of quantum states is an example of a space of classical states—equipped with precisely all the complicated extra structure that lets us geometrically quantize it!

At this point, if you don’t already know the answer, you should be asking: and what do we get when we geometrically quantize it?

The answer is exciting only in that it’s surprisingly dull: when we geometrically quantize PH, we get back the Hilbert space H.

You may have heard of ‘second quantization’, where we take a quantum system, treat it as classical, and quantize it again. In the usual story of second quantization, the new quantum system we get is more complicated than the original one… and we can repeat this procedure again and again, and keep getting more interesting things:

• John Baez, Nth quantization.

The story I’m telling now is different. I’m saying that when we take a quantum system with Hilbert space H, we can think of it as a classical system whose symplectic manifold of states is PH, but then we can geometrically quantize this and get H back.

The two stories are not in contradiction, because they rely on two different notions of what it means to ‘think of a quantum system as classical’. In today’s story that means getting a symplectic manifold PH from a Hilbert space H. In the other story we use the fact that H itself is a symplectic manifold!

I should explain the relation of these two stories, but that would be a big digression from today’s intended blog article: indeed I’m already regretting having drifted off course. I only brought up this other story to heighten the mystery I’m talking about now: namely, that when we geometrically quantize the space PH, we get H back.

The math is not mysterious here; it’s the physical meaning of the math that’s mysterious. The math seems to be telling us that contrary to what they say in school, quantum systems are special classical systems, with the special property that when you quantize them nothing new happens!

This idea is not mine; it goes back at least to Kibble, the guy who with Higgs invented the method whereby the Higgs boson does its work:

• Tom W. B. Kibble, Geometrization of quantum mechanics, Comm. Math. Phys. 65 (1979), 189–201.

This led to a slow, quiet line of research that continues to this day. I find this particular paper especially clear and helpful:

• Abhay Ashtekar, Troy A. Schilling, Geometrical formulation of quantum mechanics, in On Einstein’s Path, Springer, Berlin, 1999, pp. 23–65.

so if you’re wondering what the hell I’m talking about, this is probably the best place to start. To whet your appetite, here’s the abstract:

Abstract. States of a quantum mechanical system are represented by rays in a complex Hilbert space. The space of rays has, naturally, the structure of a Kähler manifold. This leads to a geometrical formulation of the postulates of quantum mechanics which, although equivalent to the standard algebraic formulation, has a very different appearance. In particular, states are now represented by points of a symplectic manifold (which happens to have, in addition, a compatible Riemannian metric), observables are represented by certain real-valued functions on this space and the Schrödinger evolution is captured by the symplectic flow generated by a Hamiltonian function. There is thus a remarkable similarity with the standard symplectic formulation of classical mechanics. Features—such as uncertainties and state vector reductions—which are specific to quantum mechanics can also be formulated geometrically but now refer to the Riemannian metric—a structure which is absent in classical mechanics. The geometrical formulation sheds considerable light on a number of issues such as the second quantization procedure, the role of coherent states in semi-classical considerations and the WKB approximation. More importantly, it suggests generalizations of quantum mechanics. The simplest among these are equivalent to the dynamical generalizations that have appeared in the literature. The geometrical reformulation provides a unified framework to discuss these and to correct a misconception. Finally, it also suggests directions in which more radical generalizations may be found.

Personally I’m not interested in the generalizations of quantum mechanics: I’m more interested in what this circle of ideas means for quantum mechanics.

One rather cynical thought is this: when we start our studies with geometric quantization, we naively hope to extract a space of quantum states from a space of classical states, e.g. a symplectic manifold. But we then discover that to do this in a systematic way, we need to equip our symplectic manifold with lots of bells and whistles. Should it really be a surprise that when we’re done, the bells and whistles we need are exactly what a space of quantum states has?

I think this indeed dissolves some of the mystery. It’s a bit like the parable of ‘stone soup’: you can make a tasty soup out of just a stone… if you season it with some vegetables, some herbs, some salt and such.

However, perhaps because by nature I’m an optimist, I also think there are interesting things to be learned from the tight relation between quantum and classical mechanics that appears in geometric quantization. And I hope to talk more about those in future articles.

Noether’s Theorem

12 September, 2018


I’ve been spending the last month at the Centre of Quantum Technologies, getting lots of work done. This Friday I’m giving a talk, and you can see the slides now:

• John Baez, Getting to the bottom of Noether’s theorem.

Abstract. In her paper of 1918, Noether’s theorem relating symmetries and conserved quantities was formulated in terms of Lagrangian mechanics. But if we want to make the essence of this relation seem as self-evident as possible, we can turn to a formulation in terms of Poisson brackets, which generalizes easily to quantum mechanics using commutators. This approach also gives a version of Noether’s theorem for Markov processes. The key question then becomes: when, and why, do observables generate one-parameter groups of transformations? This question sheds light on why complex numbers show up in quantum mechanics.

At 5:30 on Saturday October 6th I’ll talk about this stuff at this workshop in London:

The Philosophy and Physics of Noether’s Theorems, 5-6 October 2018, Fischer Hall, 1-4 Suffolk Street, London, UK. Organized by Bryan W. Roberts (LSE) and Nicholas Teh (Notre Dame).

This workshop celebrates the 100th anniversary of Noether’s famous paper connecting symmetries to conserved quantities. Her paper actually contains two big theorems. My talk is only about the more famous one, Noether’s first theorem, and I’ll change my talk title to make that clear when I go to London, to avoid getting flak from experts. Her second theorem explains why it’s hard to define energy in general relativity! This is one reason Einstein admired Noether so much.

I’ll also give this talk at DAMTP—the Department of Applied Mathematics and Theoretical Physics, in Cambridge—on Thursday October 4th at 1 pm.

The organizers of the London workshop on the philosophy and physics of Noether’s theorems have asked me to write a paper, so my talk can be seen as the first step toward that. My talk doesn’t contain any hard theorems, but the main point (that the complex numbers arise naturally from wanting a correspondence between observables and symmetry generators) can be expressed in some theorems, which I hope to explain in my paper.


The Philosophy and Physics of Noether’s Theorems

11 August, 2018


I’ll be speaking at a conference celebrating the centenary of Emmy Noether’s work connecting symmetries and conservation laws:

The Philosophy and Physics of Noether’s Theorems, 5-6 October 2018, Fischer Hall, 1-4 Suffolk Street, London, UK. Organized by Bryan W. Roberts (LSE) and Nicholas Teh (Notre Dame).

They write:

2018 brings with it the centenary of a major milestone in mathematical physics: the publication of Amalie (“Emmy”) Noether’s theorems relating symmetry and physical quantities, which continue to be a font of inspiration for “symmetry arguments” in physics, and for the interpretation of symmetry within philosophy.

In order to celebrate Noether’s legacy, the University of Notre Dame and the LSE Centre for Philosophy of Natural and Social Sciences are co-organizing a conference that will bring together leading mathematicians, physicists, and philosophers of physics in order to discuss the enduring impact of Noether’s work.

There’s a registration fee, which you can see on the conference website, along with a map showing the conference location, a schedule of the talks, and other useful stuff.

Here are the speakers:

John Baez (UC Riverside)

Jeremy Butterfield (Cambridge)

Anne-Christine Davis (Cambridge)

Sebastian De Haro (Amsterdam and Cambridge)

Ruth Gregory (Durham)

Yvette Kosmann-Schwarzbach (Paris)

Peter Olver (UMN)

Sabrina Pasterski (Harvard)

Oliver Pooley (Oxford)

Tudor Ratiu (Shanghai Jiao Tong and Geneva)

Kasia Rejzner (York)

Robert Spekkens (Perimeter)

I’m looking forward to analyzing the basic assumptions behind various generalizations of Noether’s first theorem, the one that shows symmetries of a Lagrangian give conserved quantities. Having generalized it to Markov processes, I know there’s a lot more to what’s going on here than just the wonders of Lagrangian mechanics:

• John Baez and Brendan Fong, A Noether theorem for Markov processes, J. Math. Phys. 54 (2013), 013301. (Blog article here.)

I’ve been trying to get to the bottom of it ever since.

The Behavioral Approach to Systems Theory

19 June, 2018


Two more students in the Applied Category Theory 2018 school wrote a blog article about something they read:

• Eliana Lorch and Joshua Tan, The behavioral approach to systems theory, The n-Category Café, 15 June 2018.

Eliana Lorch is a mathematician based in San Francisco. Joshua Tan is a grad student in computer science at the University of Oxford and one of the organizers of Applied Category Theory 2018.

They wrote a great summary of this paper, which has been an inspiration to me and many others:

• Jan Willems, The behavioral approach to open and interconnected systems, IEEE Control Systems 27 (2007), 46–99.

They also list many papers influenced by it, and raise a couple of interesting problems with Willems’ idea, which can probably be handled by generalizing it.

Dynamical Systems and Their Steady States

17 June, 2018


As part of the Applied Category Theory 2018 school, Maru Sarazola wrote a blog article on open dynamical systems and their steady states. Check it out:

• Maru Sarazola, Dynamical systems and their steady states, The n-Category Café, 2 April 2018.

She compares two papers:

• David Spivak, The steady states of coupled dynamical systems compose according to matrix arithmetic.

• John Baez and Blake Pollard, A compositional framework for reaction networks, Reviews in Mathematical Physics 29 (2017), 1750028.
(Blog article here.)

It’s great, because I’d never really gotten around to understanding the precise relationship between these two approaches. I wish I knew the answers to the questions she raises at the end!


2 June, 2018

Big news! An experiment called MiniBooNE at Fermilab, near Chicago, has found more evidence that neutrinos are not acting as the Standard Model says they should:

• The MiniBooNE Collaboration, Observation of a significant excess of electron-like events in the MiniBooNE short-baseline neutrino experiment.

In brief, the experiment creates a beam of muon neutrinos (or antineutrinos—they can do either one). Then they check, with a detector 541 meters away, to see if any of these particles have turned into electron neutrinos (or antineutrinos). They’ve been doing this since 2002, and they’ve found a small tendency for this to happen.

This seems to confirm findings of the Liquid Scintillator Neutrino Detector or ‘LSND’ at Los Alamos, which did a similar experiment in the 1990s. People in the MiniBooNE collaboration claim that if you take both experiments into account, the results have a statistical significance of 6.1 σ.

This means that if the Standard Model is correct and there’s no experimental error or other mistake, the chance of seeing what these experiments saw is about 1 in 1,000,000,000.
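To see where a number like this comes from, here is a minimal sketch of the usual conversion from a significance in σ to a probability. It assumes the test statistic is Gaussian and uses the two-sided convention; the function name is made up for illustration, and the collaboration's own statistical treatment may differ in detail.

```python
import math

def two_sided_p_value(n_sigma):
    """Chance that a Gaussian variable lands at least n_sigma standard
    deviations away from its mean, in either direction."""
    return math.erfc(n_sigma / math.sqrt(2.0))

p = two_sided_p_value(6.1)
print(f"p-value at 6.1 sigma: {p:.2e}")
print(f"roughly 1 in {1.0 / p:.2e}")
```

With the one-sided convention the probability would be half as large, so the order of magnitude does not depend on this choice.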

There are 3 known kinds of neutrinos: electron, muon and tau neutrinos. Neutrinos of any kind are already known to turn into those of other kinds: these are called neutrino oscillations, and they were first discovered in the 1960’s, when it was found that 1/3 as many electron neutrinos were coming from the Sun as expected.

At the time this was a big surprise, because people thought neutrinos were massless, moved at the speed of light, and thus didn’t experience the passage of time. Back then, the Standard Model looked like this:

The neutrinos stood out as weird in two ways: we thought they were massless, and we thought they only come in a left-handed form—meaning roughly that they spin clockwise around the axis they’re moving along.

People did a bunch of experiments and wound up changing the Standard Model. Now we know neutrinos have nonzero mass. Their masses, and also neutrino oscillations, are described using a 3×3 matrix called the lepton mixing matrix. This is not a wacky idea: in fact, quarks are described using a similar 3×3 matrix called the quark mixing matrix. So, the current-day Standard Model is more symmetrical than the earlier version: leptons are more like quarks.

There is, however, still a big difference! We haven’t seen right-handed neutrinos.

MiniBooNE and LSND are seeing muon neutrinos turn into electron neutrinos much faster than the Standard Model theory of neutrino oscillations predicts. There seems to be no way to adjust the parameters of the lepton mixing matrix to fit the data from all the other experiments people have done, and also the MiniBooNE–LSND data. If this is really true, we need a new theory of physics.

And this is where things get interesting.

The most conservative change to the Standard Model would be to add three right-handed neutrinos to go along with the three left-handed ones. This would not be an ugly ad hoc trick: it would make the Standard Model more symmetrical, by making leptons even more like quarks.

If we do this in the most beautiful way—making leptons as similar to quarks as we can get away with, given their obvious differences—the three new right-handed neutrinos will be ‘sterile’. This means that they will interact only with the Higgs boson and gravity: not electromagnetism, the weak force or the strong force. This is great, because it would mean there’s a darned good reason we haven’t seen them yet!

Neutrinos are already very hard to detect, since they don’t interact with electromagnetism or the strong force. They only interact with the Higgs boson (that’s what creates their mass, and oscillations), gravity (because they have energy), and the weak force (which is how we create and detect them). A ‘sterile’ neutrino—one that also didn’t interact with the weak force—would be truly elusive!

In practice, the main way to detect sterile neutrinos would be via oscillations. We could create an ordinary neutrino, and it might turn into a sterile neutrino, and then back into an ordinary neutrino. This would create new kinds of oscillations.

And indeed, MiniBooNE and LSND seem to be seeing new oscillations, much more rapid than those predicted by the Standard Model and our usual best estimate of the lepton mixing matrix.

So, people are getting excited! We may have found sterile neutrinos.

There’s a lot more to say. For example, the SO(10) grand unified theory predicts right-handed neutrinos in a very beautiful way, so I’m curious about what the new data implies about that. There are also questions about whether a sterile neutrino could explain dark matter… or what limits astronomical observations place on the properties of sterile neutrinos. One should also wonder about the possibility of experimental error!

I would enjoy questions that probe deeper into this subject, since they might force me to study and learn more. Right now I have to go to Joshua Tree! But I’ll come back and answer your questions tomorrow morning.

Effective Thermodynamics for a Marginal Observer

8 May, 2018

guest post by Matteo Polettini

Suppose you receive an email from someone who claims “here is the design of a machine that runs forever and ever and produces energy for free!” Obviously he must be a crackpot. But he may be well-intentioned. You opt for not being rude, roll up your sleeves, and put your hands into the dirt, holding the Second Law as your lodestar.

Keep in mind that there are two fundamental sources of error: either he is not considering certain input currents (“hey, what about that tiny hidden cable entering your machine from the electrical power line?!”, “uh, ah, that’s just to power the “ON” LED”, “mmmhh, you sure?”), or else he is not measuring the energy input correctly (“hey, why are you using a Geiger counter to measure input voltages?!”, “well, sir, I ran out of voltmeters…”).

In other words, the observer might only have partial information about the setup, either in quantity or quality. Because he has been marginalized by society (most crackpots believe they are misunderstood geniuses) we will call such an observer “marginal,” which incidentally is also the word that mathematicians use when they focus on the probability of a subset of stochastic variables.

In fact, our modern understanding of thermodynamics as embodied in statistical mechanics and stochastic processes is founded (and funded) on ignorance: we never really have “complete” information. If we actually did, all energy would look alike: it would not come in “more refined” and “less refined” forms, there would not be differentials of order/disorder (using Paul Valery’s beautiful words), and that would end thermodynamic reasoning, the energy problem, and generous research grants altogether.

Even worse, within this statistical approach we might be missing chunks of information because some parts of the system are invisible to us. But then, what warrants that we are doing things right, and he (our correspondent) is the crackpot? Couldn’t it be the other way around? Here I would like to present some recent ideas I’ve been working on together with some collaborators on how to deal with incomplete information about the sources of dissipation of a thermodynamic system. I will do this in a quite theoretical manner, but somehow I will mimic the guidelines suggested above for debunking crackpots. My three buzzwords will be: marginal, effective, and operational.

“Complete” thermodynamics: an out-of-the-box view

The laws of thermodynamics that I address are:

• The good ol’ Second Law (2nd)

• The Fluctuation-Dissipation Relation (FDR), and the Reciprocal Relation (RR) close to equilibrium.

• The more recent Fluctuation Relation (FR)1 and its corollary the Integral Fluctuation Relation (IFR), which have been discussed on this blog in a remarkable post by Matteo Smerlak.

The list above is all in the “area of the second law”. How about the other laws? Well, thermodynamics has long been a phenomenological science, a patchwork. So-called stochastic thermodynamics is trying to put some order in it by systematically grounding thermodynamic claims in (mostly Markov) stochastic processes. But it’s not an easy task, because the different laws of thermodynamics live on somewhat different conceptual planes. And it’s not even clear if they are theorems, prescriptions, or habits (a bit like in jurisprudence2).

Within stochastic thermodynamics, the Zeroth Law is so easy nobody cares to formulate it (I do, so stay tuned…). The Third Law: no idea, let me know. As regards the First Law (or, better, “laws”, as many as there are conserved quantities across the system/environment interface…), we will assume that all related symmetries have been exploited from the outset to boil down the description to a minimum.


This minimum is as follows. We identify a system that is well separated from its environment. The system evolves in time, while the environment is so large that its state does not evolve within the timescales of the system3. When tracing out the environment from the description, an uncertainty falls upon the system’s evolution. We assume the system’s dynamics to be described by a stochastic Markovian process.

How exactly the system evolves and what is the relationship between system and environment will be described in more detail below. Here let us take an “out of the box” view. We resolve the environment into several reservoirs labeled by index \alpha. Each of these reservoirs is “at equilibrium” on its own (whatever that means4). Now, the idea is that each reservoir tries to impose “its own equilibrium” on the system, and that their competition leads to a flow of currents across the system/environment interface. Each time an amount of the reservoir’s resource crosses the interface, a “thermodynamic cost” has to be paid or gained (be it a chemical potential difference for a molecule to go through a membrane, or a temperature gradient for photons to be emitted/absorbed, etc.).

The fundamental quantities of stochastic thermodynamic modeling thus are:

• On the “-dynamic” side: the time-integrated currents \Phi^t_\alpha, independent among themselves5. Currents are stochastic variables distributed with joint probability density P(\{\Phi_\alpha\}_\alpha).


• On the “thermo-” side: The so-called thermodynamic forces or “affinities”6 \mathcal{A}_\alpha (collectively denoted \mathcal{A}). These are tunable parameters that characterize reservoir-to-reservoir gradients, and they are not stochastic. For convenience, we conventionally take them all positive.

Dissipation is quantified by the entropy production:

\sum \mathcal{A}_\alpha \Phi^t_\alpha

We are finally in the position to state the main results. Be warned that in the following expressions the exact treatment of time and its scaling would require a lot of specifications, but keep in mind that all these relations hold true in the long-time limit, and that all cumulants scale linearly with time.

FR: The probability of observing positive currents is exponentially favoured with respect to negative currents according to

P(\{\Phi_\alpha\}_\alpha) / P(\{-\Phi_\alpha\}_\alpha) = \exp \sum \mathcal{A}_\alpha \Phi^t_\alpha

Comment: This is not trivial: it follows from the explicit expression of the path integral, see below.

IFR: The average of the exponential of minus the entropy production is unity

\big\langle  \exp - \sum \mathcal{A}_\alpha \Phi^t_\alpha  \big\rangle_{\mathcal{A}} =1

Homework: Derive this relation from the FR in one line.
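For readers who want to check their answer, here is one possible route, sketched under the assumption that the FR holds exactly for the joint density P and that P is normalized. Multiply the FR by P(\{-\Phi_\alpha\}_\alpha) e^{-\sum_\alpha \mathcal{A}_\alpha \Phi^t_\alpha}, integrate over all currents, and change variables \Phi_\alpha \to -\Phi_\alpha in the last step:

```latex
\big\langle e^{-\sum_\alpha \mathcal{A}_\alpha \Phi^t_\alpha} \big\rangle_{\mathcal{A}}
  = \int \prod_\alpha d\Phi_\alpha \, P(\{\Phi_\alpha\}_\alpha) \, e^{-\sum_\alpha \mathcal{A}_\alpha \Phi^t_\alpha}
  = \int \prod_\alpha d\Phi_\alpha \, P(\{-\Phi_\alpha\}_\alpha)
  = 1
```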

2nd Law: The average entropy production is not negative

\sum \mathcal{A}_\alpha \left\langle \Phi^t_\alpha \right\rangle_{\mathcal{A}} \geq 0

Homework: Derive this relation using Jensen’s inequality.
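A sketch of this step, with the shorthand \sigma = \sum_\alpha \mathcal{A}_\alpha \Phi^t_\alpha introduced here only for brevity: since the exponential is convex, Jensen's inequality gives \langle e^{-\sigma} \rangle \geq e^{-\langle \sigma \rangle}, and combining this with the IFR yields

```latex
1 = \big\langle e^{-\sigma} \big\rangle_{\mathcal{A}}
  \geq e^{-\langle \sigma \rangle_{\mathcal{A}}}
\quad \Longrightarrow \quad
\langle \sigma \rangle_{\mathcal{A}}
  = \sum_\alpha \mathcal{A}_\alpha \left\langle \Phi^t_\alpha \right\rangle_{\mathcal{A}} \geq 0
```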

Equilibrium: Average currents vanish if and only if affinities vanish:

\left\langle \Phi^t_\alpha \right\rangle_{\mathcal{A}} \equiv 0, \forall \alpha \iff  \mathcal{A}_\alpha \equiv 0, \forall \alpha

Homework: Derive this relation taking the first derivative w.r.t. {\mathcal{A}_\alpha} of the IFR. Notice that the average itself also depends on the affinities.

S-FDR: At equilibrium, it is impossible to tell whether a current is due to a spontaneous fluctuation (quantified by its variance) or to an external perturbation (quantified by the response of its mean). In a symmetrized (S-) version:

\left.  \frac{\partial}{\partial \mathcal{A}_\alpha}\left\langle \Phi^t_{\alpha'} \right\rangle \right|_{0} + \left.  \frac{\partial}{\partial \mathcal{A}_{\alpha'}}\left\langle \Phi^t_{\alpha} \right\rangle \right|_{0} = \left. \left\langle \Phi^t_{\alpha} \Phi^t_{\alpha'} \right\rangle \right|_{0}

Homework: Derive this relation taking the mixed second derivatives w.r.t. {\mathcal{A}_\alpha} of the IFR.
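One way to organize this calculation, assuming derivatives can be exchanged with the integral over currents: write the IFR as \int P_{\mathcal{A}} \, e^{-\sum_\alpha \mathcal{A}_\alpha \Phi^t_\alpha} = 1, where the subscript on P is added here to recall that the density depends on the affinities. Differentiating twice and then setting all affinities to zero gives

```latex
0 = \left. \frac{\partial^2}{\partial \mathcal{A}_{\alpha'} \partial \mathcal{A}_\alpha}
      \int \prod_\beta d\Phi_\beta \, P_{\mathcal{A}} \, e^{-\sum_\beta \mathcal{A}_\beta \Phi^t_\beta} \right|_{0}
  = \underbrace{\left. \frac{\partial^2}{\partial \mathcal{A}_{\alpha'} \partial \mathcal{A}_\alpha} \int \prod_\beta d\Phi_\beta \, P_{\mathcal{A}} \right|_{0}}_{=\,0 \text{ by normalization}}
  - \left. \frac{\partial}{\partial \mathcal{A}_\alpha} \left\langle \Phi^t_{\alpha'} \right\rangle \right|_{0}
  - \left. \frac{\partial}{\partial \mathcal{A}_{\alpha'}} \left\langle \Phi^t_{\alpha} \right\rangle \right|_{0}
  + \left. \left\langle \Phi^t_{\alpha} \Phi^t_{\alpha'} \right\rangle \right|_{0}
```

Moving the two response terms to the other side reproduces the S-FDR.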

RR: The reciprocal response of two different currents to a perturbation of the reciprocal affinities close to equilibrium is symmetrical:

\left.  \frac{\partial}{\partial \mathcal{A}_\alpha}\left\langle \Phi^t_{\alpha'} \right\rangle \right|_{0} - \left.  \frac{\partial}{\partial \mathcal{A}_{\alpha'}}\left\langle \Phi^t_{\alpha} \right\rangle \right|_{0} = 0

Homework: Derive this relation taking the mixed second derivatives w.r.t. {\mathcal{A}_\alpha} of the FR.

Notice the implication scheme: FR ⇒ IFR ⇒ 2nd, IFR ⇒ S-FDR, FR ⇒ RR.
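As a numerical sanity check of this chain, here is a minimal sketch on a toy model that is not taken from the discussion above: a single current \Phi equal to the difference of two independent Poisson counts, with jumps forward at rate `w_plus` and backward at rate `w_minus`, and affinity log(w_plus / w_minus). All names and parameter values are made up for illustration. In this toy model the relations happen to hold at any finite time, not only in the long-time limit.

```python
import math

def poisson_pmf(k, mean):
    # Evaluated in log space to avoid overflow in k!.
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def current_distribution(w_plus, w_minus, t, n_max=200):
    """P(phi) for phi = N_plus - N_minus, with both counts truncated below n_max."""
    p_plus = [poisson_pmf(k, w_plus * t) for k in range(n_max)]
    p_minus = [poisson_pmf(k, w_minus * t) for k in range(n_max)]
    dist = {}
    for a, pa in enumerate(p_plus):
        for b, pb in enumerate(p_minus):
            dist[a - b] = dist.get(a - b, 0.0) + pa * pb
    return dist

w_plus, w_minus, t = 2.0, 1.0, 5.0
affinity = math.log(w_plus / w_minus)
dist = current_distribution(w_plus, w_minus, t)

# FR: P(phi) / P(-phi) should equal exp(affinity * phi).
fr_ratio = dist[3] / dist[-3]
# IFR: the average of exp(-affinity * phi) should be 1.
ifr_average = sum(p * math.exp(-affinity * phi) for phi, p in dist.items())
# 2nd Law: the average entropy production should not be negative.
mean_entropy_production = affinity * sum(p * phi for phi, p in dist.items())

print(fr_ratio, math.exp(3 * affinity))
print(ifr_average)
print(mean_entropy_production)
```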

“Marginal” thermodynamics (still out-of-the-box)

Now we assume that we can only measure a marginal subset of currents \{\Phi_\mu^t\}_\mu \subset \{\Phi_\alpha^t\}_\alpha (index \mu always has a smaller range than \alpha), distributed with joint marginal probability

P(\{\Phi_\mu\}_\mu) = \int \prod_{\alpha \neq \mu} d\Phi_\alpha \, P(\{\Phi_\alpha\}_\alpha)


Notice that a state where these marginal currents vanish might not be an equilibrium, because other currents might still be whirling around. We call this a stalling state.

\mathrm{stalling:} \qquad \langle \Phi_\mu \rangle \equiv 0,  \quad \forall \mu

My central question is: can we associate to these currents some effective affinity \mathcal{Q}_\mu in such a way that at least some of the results above still hold true? And, are all definitions involved just a fancy mathematical construct, or are they operational?

First the bad news: In general the FR is violated for all choices of effective affinities:

P(\{\Phi_\mu\}_\mu) / P(\{-\Phi_\mu\}_\mu) \neq \exp \sum \mathcal{Q}_\mu \Phi^t_\mu

This is not surprising: nobody would expect the FR to hold in general for a partial set of currents. How about the IFR?

Marginal IFR: There are effective affinities such that

\left\langle \exp - \sum \mathcal{Q}_\mu \Phi^t_\mu \right\rangle_{\mathcal{A}} =1

Mmmhh. Yeah. Take a closer look at this expression: can you see why there actually exist infinitely many choices of “effective affinities” that would make that average equal 1? And that average is just a number, so who even cares? So this can’t be the point.

The fact is, the IFR per se is hardly of any practical interest, as are all “absolutes” in physics. What matters is “relatives”: in our case, response. But then we need to specify how the effective affinities depend on the “real” affinities. And here a crucial technicality steps in, whose precise argumentation is a pain. Based on reasonable assumptions7, we demonstrate that the IFR holds for the following choice of effective affinities:

\mathcal{Q}_\mu = \mathcal{A}_\mu - \mathcal{A}^{\mathrm{stalling}}_\mu,

where \mathcal{A}^{\mathrm{stalling}} is the set of values of the affinities that make marginal currents stall. Notice that this latter formula gives an operational definition of the effective affinities that could in principle be reproduced in a laboratory (just go out there and tune the tunable until everything stalls, and measure the difference). Obviously:

Stalling: Marginal currents vanish if and only if effective affinities vanish:

\left\langle \Phi^t_\mu \right\rangle_{\mathcal{A}} \equiv 0, \forall \mu \iff \mathcal{Q}_\mu \equiv 0, \forall \mu

Now, according to the inference scheme illustrated above, we can also prove that:

Effective 2nd Law: The average marginal entropy production is not negative

\sum \mathcal{Q}_\mu \left\langle \Phi^t_\mu \right\rangle_{\mathcal{A}} \geq 0

S-FDR at stalling:

\left. \frac{\partial}{\partial \mathcal{A}_\mu}\left\langle \Phi^t_{\mu'} \right\rangle \right|_{\mathcal{A}^{\mathrm{stalling}}} + \left. \frac{\partial}{\partial \mathcal{A}_{\mu'}}\left\langle \Phi^t_{\mu} \right\rangle \right|_{\mathcal{A}^{\mathrm{stalling}}} = \left. \left\langle \Phi^t_{\mu} \Phi^t_{\mu'} \right\rangle \right|_{\mathcal{A}^{\mathrm{stalling}}}

Notice instead that the RR is gone at stalling. This is a clear-cut prediction of the theory that can be tested with basically the same apparatus with which response theory has been experimentally studied so far (not that I actually know what these apparatus are…): at stalling states, unlike at equilibrium states, the S-FDR still holds, but the RR does not.

Into the box

You’ve definitely gotten enough at this point, and you can give up here. Please exit through the gift shop.

If you’re stubborn, let me tell you what’s inside the box. The system’s dynamics is modeled as a continuous-time, discrete configuration-space Markov “jump” process. The state space can be described by a graph G=(I, E) where I is the set of configurations, E is the set of possible transitions or “edges”, and there exists some incidence relation between edges and pairs of configurations. The process is determined by the rates w_{i \gets j} of jumping from one configuration to another.

We choose these processes because they allow some nice network analysis and because the path integral is well defined! A single realization of such a process is a trajectory

\omega^t = (i_0,\tau_0) \to (i_1,\tau_1) \to \ldots \to (i_N,\tau_N)

A “Markovian jumper” waits at some configuration i_n for some time \tau_n, distributed with the exponentially decaying probability density w_{i_n} \exp(- w_{i_n} \tau_n), where w_i = \sum_k w_{k \gets i} is the exit rate, then instantaneously jumps to a new configuration i_{n+1} with transition probability w_{i_{n+1} \gets {i_n}}/w_{i_n}. The overall probability density of a single trajectory is given by

P(\omega^t) = \delta \left(t - \sum_{n=0}^{N} \tau_n \right) e^{- w_{i_N}\tau_N} \prod_{n=0}^{N-1} w_{i_{n+1} \gets i_n} e^{- w_{i_n} \tau_n}

One can in principle obtain the probability distribution function of any observable defined along the trajectory by taking the marginal of this measure (though in most cases this is technically impossible). Where does this expression come from? For a formal derivation, see the very beautiful review paper by Weber and Frey, but be aware that this is what one would intuitively come up with if one had to simulate with the Gillespie algorithm.
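To make the connection with simulation concrete, here is a minimal sketch of a Gillespie-style sampler for such a jump process. It is only an illustration under stated assumptions: the function name, the dictionary layout `rates[j][i]` for the rate of jumping from j to i, and the three-configuration ring are all made up here.

```python
import random

def gillespie_trajectory(rates, start, t_total, rng):
    """Sample one trajectory of a continuous-time Markov jump process.

    rates[j][i] is the rate w_{i <- j} of jumping from configuration j to i.
    Returns a list of (configuration, waiting_time) pairs whose waiting
    times add up to t_total.
    """
    trajectory = []
    state, elapsed = start, 0.0
    while True:
        targets = list(rates[state])
        exit_rate = sum(rates[state][k] for k in targets)
        # Exponential waiting time with density exit_rate * exp(-exit_rate * tau).
        wait = rng.expovariate(exit_rate) if exit_rate > 0 else float("inf")
        if elapsed + wait >= t_total:
            # The last configuration only has to survive until t_total.
            trajectory.append((state, t_total - elapsed))
            return trajectory
        trajectory.append((state, wait))
        elapsed += wait
        # Jump to k with probability w_{k <- state} / exit_rate.
        weights = [rates[state][k] for k in targets]
        state = rng.choices(targets, weights=weights)[0]

# Hypothetical three-configuration ring, biased in the direction 0 -> 1 -> 2 -> 0.
ring_rates = {
    0: {1: 2.0, 2: 1.0},
    1: {2: 2.0, 0: 1.0},
    2: {0: 2.0, 1: 1.0},
}
trajectory = gillespie_trajectory(ring_rates, start=0, t_total=10.0, rng=random.Random(0))
print(len(trajectory) - 1, "jumps in total time", sum(tau for _, tau in trajectory))
```

Each sampled waiting time contributes a factor w e^{-w \tau} and each jump a factor of the form rate over exit rate, so the exit rates cancel and one is left with a product of jump rates and exponentials, plus a bare survival factor for the final, truncated wait.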

The dynamics of the Markov process can also be described by the probability p_i(t) of being at some configuration i at time t, which evolves via the master equation (writing w_{ij} as shorthand for w_{i \gets j})

\dot{p}_i(t) = \sum_j \left[ w_{ij} p_j(t) - w_{ji} p_i(t) \right].

We call this probability distribution the system’s state, and we assume that the system relaxes to a uniquely defined steady state p = \lim_{t \to \infty} p(t).
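For a small network the steady state can be found by setting the right-hand side of the master equation to zero and solving the resulting linear system together with the normalization condition. The sketch below does this with plain Gaussian elimination; the matrix layout `w[i][j]` for w_{ij} and the rates are assumptions made for the example.

```python
def steady_state(w):
    """Solve the stationary master equation for rates w[i][j] = w_{ij}.

    Assumes a connected network with positive rates, so that the steady
    state is unique.
    """
    n = len(w)
    # Generator: off-diagonal entries are rates, diagonal entries are
    # minus the exit rates.
    a = [[w[i][j] if i != j else -sum(w[k][j] for k in range(n) if k != j)
          for j in range(n)] for i in range(n)]
    # Replace the last balance equation with the normalization sum_i p_i = 1.
    a[n - 1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            b[r] -= f * b[col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    # Back substitution.
    p = [0.0] * n
    for r in reversed(range(n)):
        p[r] = (b[r] - sum(a[r][c] * p[c] for c in range(r + 1, n))) / a[r][r]
    return p

# Illustrative rates: w[i][j] is the rate of jumping from j to i.
w = [[0.0, 1.0, 2.0],
     [2.0, 0.0, 1.0],
     [1.0, 3.0, 0.0]]
p = steady_state(w)  # for these rates p = (9/26, 7/26, 10/26)
```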

A time-integrated current along a single trajectory is a linear combination of the net number of jumps \#^t between configurations in the network:

\Phi^t_\alpha = \sum_{ij} C^{ij}_\alpha \left[ \#^t(i \gets j) - \#^t(j\gets i) \right]

The idea here is that one or several transitions within the system occur because of the “absorption” or the “emission” of some environmental degrees of freedom, each with different intensity. However, for the moment let us simplify the picture and require that only one transition contributes to a current, that is, that there exist i_\alpha,j_\alpha such that

C^{ij}_\alpha = \delta^i_{i_\alpha} \delta^j_{j_\alpha}.
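Counting a time-integrated current along a trajectory is just bookkeeping, as the following sketch suggests. The function name and the dictionary of coefficients are hypothetical; only the sequence of visited configurations matters, not the waiting times.

```python
def integrated_current(configs, coeff):
    """Time-integrated current along a sequence of visited configurations.

    configs: visited configurations i_0, i_1, ..., i_N in order.
    coeff: dict mapping an oriented pair (i, j), read as the jump i <- j,
           to its weight C^{ij}; the reverse jump j <- i counts negatively.
    """
    total = 0.0
    for source, target in zip(configs, configs[1:]):
        total += coeff.get((target, source), 0.0)  # jump target <- source
        total -= coeff.get((source, target), 0.0)  # same jump seen against the orientation
    return total

configs = [0, 1, 2, 0, 1, 0]
# Single observable edge 1 <- 0: two jumps along it, one against it.
single = integrated_current(configs, {(1, 0): 1.0})
# A current supported on two edges, with weights 5 and 7.
combined = integrated_current(configs, {(1, 0): 5.0, (0, 2): 7.0})
```

With a single nonzero coefficient equal to one, this reduces to the simplified case above, where the current is the net number of jumps across one chosen edge.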

Now, what does it mean for such a set of currents to be “complete”? Here we get inspiration from Kirchhoff’s Current Law in electrical circuits: the continuity of the trajectory at each configuration of the network implies that after a sufficiently long time, cycle or loop or mesh currents completely describe the steady state. There is a standard procedure to identify a set of cycle currents: take a spanning tree T of the network; then the currents flowing along the edges E\setminus T left out from the spanning tree form a complete set.
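The spanning-tree procedure can be sketched in a few lines. Here I assume an undirected edge list and use a union-find pass (Kruskal's algorithm without weights), which is one of several standard ways to extract a spanning tree; the network is made up.

```python
def spanning_tree_and_chords(nodes, edges):
    """Split the edges of a connected graph into a spanning tree and its chords.

    An edge joins the tree if it connects two components; otherwise it
    closes a cycle and is recorded as a chord.
    """
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree, chords = [], []
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri == rj:
            chords.append((i, j))
        else:
            parent[ri] = rj
            tree.append((i, j))
    return tree, chords

nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 0)]
tree, chords = spanning_tree_and_chords(nodes, edges)
```

For this network the tree is (0,1), (1,2), (2,3) and the chords are (2,0) and (3,0). The number of chords is always |E| - |I| + 1 for a connected graph, which is the number of independent cycle currents.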

The last ingredients you need to know are the affinities. They can be constructed as follows. Consider the Markov process on the network where the observable edges are removed, G' = (I,T). Calculate the steady state of its associated master equation (p^{\mathrm{eq}}_i)_i, which is necessarily an equilibrium (since there cannot be cycle currents in a tree…). Then the affinities are given by

\mathcal{A}_\alpha = \log \frac{w_{i_\alpha j_\alpha} \, p^{\mathrm{eq}}_{j_\alpha}}{w_{j_\alpha i_\alpha} \, p^{\mathrm{eq}}_{i_\alpha}}.
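Here is a small sketch of this construction on a three-configuration cycle. Since the steady state on a tree satisfies detailed balance, w_{ij} p_j = w_{ji} p_i on every tree edge, I compute it by propagating probability ratios outward from a root rather than by solving a linear system. The function names, the dictionary layout `w[(i, j)]` for w_{ij} and the rates are assumptions made for the example.

```python
import math

def tree_equilibrium(w, tree, root):
    """Equilibrium state on a spanning tree via detailed balance.

    w[(i, j)] is the rate w_{ij} of jumping from j to i (assumed layout).
    """
    neighbours = {}
    for i, j in tree:
        neighbours.setdefault(i, []).append(j)
        neighbours.setdefault(j, []).append(i)
    weight = {root: 1.0}
    stack = [root]
    while stack:
        j = stack.pop()
        for i in neighbours.get(j, []):
            if i not in weight:
                # Detailed balance on the tree edge: p_i / p_j = w_{ij} / w_{ji}.
                weight[i] = weight[j] * w[(i, j)] / w[(j, i)]
                stack.append(i)
    norm = sum(weight.values())
    return {i: x / norm for i, x in weight.items()}

def affinity(w, p_eq, i, j):
    """A = log( w_{ij} p_j / (w_{ji} p_i) ) for the removed edge between i and j."""
    return math.log(w[(i, j)] * p_eq[j] / (w[(j, i)] * p_eq[i]))

# Cycle 0 -> 1 -> 2 -> 0 with illustrative rates; the edge between 0 and 2
# carries the observable current and is left out of the tree.
w = {(1, 0): 2.0, (0, 1): 1.0, (2, 1): 3.0, (1, 2): 1.0, (0, 2): 2.0, (2, 0): 1.0}
p_eq = tree_equilibrium(w, tree=[(0, 1), (1, 2)], root=0)
a = affinity(w, p_eq, 0, 2)
```

In this single-cycle example the affinity coincides with the logarithm of the product of the rates around the cycle in one direction divided by the product in the opposite direction, here log 12.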

Now you have all that is needed to formulate the complete theory and prove the FR.

Homework: (Difficult!) With the above definitions, prove the FR.

How about the marginal theory? To define the effective affinities, take the set E_{\mathrm{mar}} = \{i_\mu j_\mu, \forall \mu\} of edges that carry observable currents. Notice that what is left after removing the observable edges, the hidden edge set E_{\mathrm{hid}} = E \setminus E_{\mathrm{mar}}, is not in general a spanning tree: there might be cycles that are not accounted for by our observations. However, we can still consider the Markov process on the hidden space and calculate its stalling steady state p^{\mathrm{st}}_i, and ta-taaa: the effective affinities are given by

\mathcal{Q}_\mu = \log \frac{w_{i_\mu j_\mu} \, p^{\mathrm{st}}_{j_\mu}}{w_{j_\mu i_\mu} \, p^{\mathrm{st}}_{i_\mu}}.
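The following sketch computes an effective affinity on a made-up four-configuration network whose hidden part still contains a cycle. The steady state is obtained by simply relaxing the master equation with small Euler steps, which is crude but short; names, data layout and rates are again assumptions. As a sanity check that follows directly from the definition, if the observed forward rate is rescaled so that the effective affinity vanishes, the stalling state also solves the full master equation and the mean current on the observed edge is zero.

```python
import math

def steady_state(w, nodes, steps=20000):
    """Relax the master equation to its steady state by explicit Euler steps.

    w[(i, j)] is the rate of jumping from j to i (assumed layout). The
    network is assumed connected, so the limit is unique.
    """
    exit_rate = {j: sum(r for (_, src), r in w.items() if src == j) for j in nodes}
    dt = 0.5 / max(exit_rate.values())
    p = {i: 1.0 / len(nodes) for i in nodes}
    for _ in range(steps):
        flow = {i: -exit_rate[i] * p[i] for i in nodes}
        for (i, j), r in w.items():
            flow[i] += r * p[j]
        p = {i: p[i] + dt * flow[i] for i in nodes}
    return p

def effective_affinity(w, nodes, i, j):
    """Q = log( w_{ij} p_j / (w_{ji} p_i) ), with p the stalling state of the hidden network."""
    hidden = {e: r for e, r in w.items() if e not in {(i, j), (j, i)}}
    p_st = steady_state(hidden, nodes)
    return math.log(w[(i, j)] * p_st[j] / (w[(j, i)] * p_st[i]))

nodes = [0, 1, 2, 3]
w = {(1, 0): 2.0, (0, 1): 1.0, (2, 1): 3.0, (1, 2): 1.0, (0, 2): 2.0,
     (2, 0): 1.0, (3, 2): 1.0, (2, 3): 2.0, (0, 3): 3.0, (3, 0): 1.0}
q = effective_affinity(w, nodes, 1, 0)  # observed edge between 0 and 1

# Rescale the observed forward rate so that the effective affinity vanishes;
# the full network then stalls on that edge.
stalled = dict(w)
stalled[(1, 0)] = w[(1, 0)] * math.exp(-q)
p = steady_state(stalled, nodes)
mean_current = stalled[(1, 0)] * p[0] - stalled[(0, 1)] * p[1]
```

Note that the hidden cycle through configurations 0, 2 and 3 generally keeps carrying a current at stalling: only the observed edge is quiet, which is what distinguishes stalling from equilibrium.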

Proving the marginal IFR is far more complicated than proving the complete FR. In fact, very often in my field we do not work with the currents’ probability density itself; we prefer to take its two-sided Laplace transform and work with the currents’ cumulant generating function. There things take on a quite different and more elegant look.

Many other questions and possibilities open up now. The most important one left open is: can we generalize the theory to the (physically relevant) case where the current is supported on several edges? For example, for a current defined like \Phi^t = 5 \Phi^t_{12} + 7 \Phi^t_{34}? Well, it depends: the theory holds provided that the stalling state is not “internally alive”, meaning that if the observable current vanishes on average, then so should \Phi^t_{12} and \Phi^t_{34} separately. This turns out to be a physically meaningful but quite strict condition.

Is all of thermodynamics “effective”?

Let me conclude with some more of those philosophical considerations that sadly I have to leave out of papers…

Stochastic thermodynamics strongly depends on the identification of physical and information-theoretic entropies, something that I did not openly talk about but that lurks behind the whole construction. Throughout my short experience as a researcher I have been pursuing a program of “relativization” of thermodynamics, by making the role of the observer more and more evident and movable. Inspired by Einstein’s Gedankenexperimenten, I also tried to make the theory operational. This program may raise eyebrows here and there: many thermodynamicians embrace a naive materialistic world-view whereby all that matters are “real” physical quantities like temperature and pressure, and all the rest of the information-theoretic discourse is at best mathematical speculation or a fascinating analogy with no fundamental bearing. According to some, information as a physical concept lingers alarmingly close to certain extreme postmodern claims in the social sciences that “reality” does not exist unless observed, a position deemed dangerous at times when the authoritativeness of science is threatened by all sorts of anti-scientific waves.

I think, on the contrary, that making concepts relative and effective, and summoning the observer explicitly, is a secular and prudent position that serves as an antidote to radical subjectivity. The other way around (clinging to the objectivity of a preferred observer, which is implied in any materialistic interpretation of thermodynamics, e.g. by assuming that the most fundamental degrees of freedom are the positions and velocities of a gas’s molecules) is the dangerous position, especially when the role of such a preferred observer is passed around from the scientist to the technician and eventually to the technocrat, who would be induced to believe there are simple technological fixes to complex social problems.

How do we reconcile observer-dependency and the laws of physics? The object and the subject? On the one hand, much like the position of an object depends on the reference frame, so entropy and entropy production depend on the observer and on the particular apparatus that he controls or experiment he is involved with. On the other hand, much like motion is ultimately independent of position and is agreed upon by all observers that share compatible measurement protocols, so the laws of thermodynamics are independent of that particular observer’s quantification of entropy and entropy production (e.g., the effective Second Law holds independently of how much the marginal observer knows of the system, if he operates according to our phenomenological protocol…). This is the case even in the everyday thermodynamics practiced by energy engineers et al., where there are lots of choices to gauge upon, and there is no other external warrant that the amount of dissipation being quantified is the “true” one (whatever that means…): there can only be trust in one’s own good practices and methodology.

So in this sense, I like to think that all observers are marginal, that this effective theory serves as a dictionary by which different observers practice and communicate thermodynamics, and that we should not revere the laws of thermodynamics as “true” idols, but rather as tools of good scientific practice.


• M. Polettini and M. Esposito, Effective fluctuation and response theory, arXiv:1803.03552.

In this work we give the complete theory and numerous references to work by other people along the same lines. We employ a “spiral” approach to the presentation of the results, inspired by the pedagogical principle of Albert Baez.

• M. Polettini and M. Esposito, Effective thermodynamics for a marginal observer, Phys. Rev. Lett. 119 (2017), 240601, arXiv:1703.05715.

This is a shorter version of the story.

• B. Altaner, M. Polettini and M. Esposito, Fluctuation-dissipation relations far from equilibrium, Phys. Rev. Lett. 117 (2016), 180601, arXiv:1604.0883.

An early version of the story, containing the FDR results but not the full-fledged FR.

• G. Bisker, M. Polettini, T. R. Gingrich and J. M. Horowitz, Hierarchical bounds on entropy production inferred from partial information, J. Stat. Mech. (2017), 093210, arXiv:1708.06769.

Some extras.

• M. F. Weber and E. Frey, Master equations and the theory of stochastic path integrals, Rep. Progr. Phys. 80 (2017), 046601, arXiv:1609.02849.

Great reference if one wishes to learn about path integrals for master equation systems.


1 There are as many so-called “Fluctuation Theorems” as there are authors working on them, so I decided not to call them by any name. Furthermore, notice I prefer to distinguish between a relation (a formula) and a theorem (a line of reasoning). I lingered more on this here.

2 “Just so you know, nobody knows what energy is.”—Richard Feynman.

I cannot help but mention here the beautiful book by Shapin and Schaffer, Leviathan and the Air-Pump, about the Boyle vs. Hobbes dispute over what constitutes a “matter of fact,” and Bruno Latour’s interpretation of it in We Have Never Been Modern. Latour argues that “modernity” is a process of separation of the human and natural spheres, and within each of these spheres a process of purification of the unit facts of knowledge and the unit facts of politics, of the object and the subject. At the same time we live in a world where these two spheres are never truly separated, a world of “hybrids” that are at the same time necessary “for all practical purposes” and inconceivable according to the myths that sustain the narration of science, of the State, and even of religion. In fact, despite these myths, we cannot conceive of a scientific fact outside the contextual “network” where this fact is produced and replicated, nor can we conceive of society outside the material needs that shape it: so in this sense “we have never been modern”, and we are not so different from all those societies that we take pleasure in studying with the tools of anthropology. Within the scientific community Latour is widely despised; probably he is also misread. While it is really difficult to see how his analysis applies to, say, high-energy physics, I find that thermodynamics and its ties to the industrial revolution perfectly embody this tension between the natural and the artificial, the matter of fact and the matter of concern. Such great thinkers as Einstein and Ehrenfest thought of the Second Law as the only physical law that would never be replaced, and I believe this is revealing.
A second thought on the Second Law, a systematic and precise definition of all its terms and circumstances, reveals that the only formulations that make sense are those phenomenological statements such as Kelvin-Planck’s or similar, which require a lot of contingent definitions regarding the operation of the engine, while fetishized and universal statements are nonsensical (such as that masterwork of confusion that is “the entropy of the Universe cannot decrease”). In this respect, it is neither a purely natural law, as the moderns argue, nor a purely social construct, as the postmoderns argue. One simply has to give up making this separation. While I do not have a definite answer to this problem, I like to think of the Second Law as a practice, a consistency check of the thermodynamic discourse.

3 This assumption really belongs to a time, the XIXth century, when resources were virtually infinite on planet Earth…

4 As we will see shortly, we define equilibrium as that state where there are no currents at the interface between the system and the environment, so what is the environment’s own definition of equilibrium?!

5 This is because we have already exploited the First Law.

6 This nomenclature comes from alchemy via chemistry (think of Goethe’s Elective Affinities…); it propagated in the XXth century via De Donder and Prigogine, and it is still present in the language we use in Luxembourg because in some way we come from the “late Brussels school”.

7 Basically, we ask that the tunable parameters are environmental properties, such as temperatures, chemical potentials, etc. and not internal properties, such as the energy landscape or the activation barriers between configurations.