The Philosophy and Physics of Noether’s Theorems

11 August, 2018


I’ll be speaking at a conference celebrating the centenary of Emmy Noether’s work connecting symmetries and conservation laws:

The Philosophy and Physics of Noether’s Theorems, 5-6 October 2018, Fischer Hall, 1-4 Suffolk Street, London, UK. Organized by Bryan W. Roberts (LSE) and Nicholas Teh (Notre Dame).

They write:

2018 brings with it the centenary of a major milestone in mathematical physics: the publication of Amalie (“Emmy”) Noether’s theorems relating symmetry and physical quantities, which continue to be a font of inspiration for “symmetry arguments” in physics, and for the interpretation of symmetry within philosophy.

In order to celebrate Noether’s legacy, the University of Notre Dame and the LSE Centre for Philosophy of Natural and Social Sciences are co-organizing a conference that will bring together leading mathematicians, physicists, and philosophers of physics in order to discuss the enduring impact of Noether’s work.

There’s a registration fee, which you can see on the conference website, along with a map showing the conference location, a schedule of the talks, and other useful stuff.

Here are the speakers:

John Baez (UC Riverside)

Jeremy Butterfield (Cambridge)

Anne-Christine Davis (Cambridge)

Sebastian De Haro (Amsterdam and Cambridge)

Ruth Gregory (Durham)

Yvette Kosmann-Schwarzbach (Paris)

Peter Olver (UMN)

Sabrina Pasterski (Harvard)

Oliver Pooley (Oxford)

Tudor Ratiu (Shanghai Jiao Tong and Geneva)

Kasia Rejzner (York)

Robert Spekkens (Perimeter)

I’m looking forward to analyzing the basic assumptions behind various generalizations of Noether’s first theorem, the one that shows symmetries of a Lagrangian give conserved quantities. Having generalized it to Markov processes, I know there’s a lot more to what’s going on here than just the wonders of Lagrangian mechanics:

• John Baez and Brendan Fong, A Noether theorem for Markov processes, J. Math. Phys. 54 (2013), 013301. (Blog article here.)

I’ve been trying to get to the bottom of it ever since.

The Behavioral Approach to Systems Theory

19 June, 2018


Two more students in the Applied Category Theory 2018 school wrote a blog article about something they read:

• Eliana Lorch and Joshua Tan, The behavioral approach to systems theory, The n-Category Café, 15 June 2018.

Eliana Lorch is a mathematician based in San Francisco. Joshua Tan is a grad student in computer science at the University of Oxford and one of the organizers of Applied Category Theory 2018.

They wrote a great summary of this paper, which has been an inspiration to me and many others:

• Jan Willems, The behavioral approach to open and interconnected systems, IEEE Control Systems 27 (2007), 46–99.

They also list many papers influenced by it, and raise a couple of interesting problems with Willems’ idea, which can probably be handled by generalizing it.

Dynamical Systems and Their Steady States

17 June, 2018


As part of the Applied Category Theory 2018 school, Maru Sarazola wrote a blog article on open dynamical systems and their steady states. Check it out:

• Maru Sarazola, Dynamical systems and their steady states, The n-Category Café, 2 April 2018.

She compares two papers:

• David Spivak, The steady states of coupled dynamical systems compose according to matrix arithmetic.

• John Baez and Blake Pollard, A compositional framework for reaction networks, Reviews in Mathematical Physics 29 (2017), 1750028.
(Blog article here.)

It’s great, because I’d never really gotten around to understanding the precise relationship between these two approaches. I wish I knew the answers to the questions she raises at the end!


2 June, 2018

Big news! An experiment called MiniBooNE at Fermilab in Chicago has found more evidence that neutrinos are not acting as the Standard Model says they should:

• The MiniBooNE Collaboration, Observation of a significant excess of electron-like events in the MiniBooNE short-baseline neutrino experiment.

In brief, the experiment creates a beam of muon neutrinos (or antineutrinos—they can do either one). Then they check, with a detector 541 meters away, to see if any of these particles have turned into electron neutrinos (or antineutrinos). They’ve been doing this since 2002, and they’ve found a small tendency for this to happen.

This seems to confirm findings of the Liquid Scintillator Neutrino Detector or ‘LSND’ at Los Alamos, which did a similar experiment in the 1990s. People in the MiniBooNE collaboration claim that if you take both experiments into account, the results have a statistical significance of 6.1 σ.

This means that if the Standard Model is correct and there’s no experimental error or other mistake, the chance of seeing what these experiments saw is about 1 in 1,000,000,000.

There are 3 known kinds of neutrinos: electron, muon and tau neutrinos. Neutrinos of any kind are already known to turn into those of other kinds: these are called neutrino oscillations, and they were first discovered in the 1960’s, when it was found that 1/3 as many electron neutrinos were coming from the Sun as expected.

At the time this was a big surprise, because people thought neutrinos were massless, moved at the speed of light, and thus didn’t experience the passage of time. Back then, the Standard Model looked like this:

The neutrinos stood out as weird in two ways: we thought they were massless, and we thought they only come in a left-handed form—meaning roughly that they spin clockwise around the axis they’re moving along.

People did a bunch of experiments and wound up changing the Standard Model. Now we know neutrinos have nonzero mass. Their masses, and also neutrino oscillations, are described using a 3×3 matrix called the lepton mixing matrix. This is not a wacky idea: in fact, quarks are described using a similar 3×3 matrix called the quark mixing matrix. So, the current-day Standard Model is more symmetrical than the earlier version: leptons are more like quarks.

There is, however, still a big difference! We haven’t seen right-handed neutrinos.

MiniBooNE and LSND are seeing muon neutrinos turn into electron neutrinos much faster than the Standard Model theory of neutrino oscillations predicts. There seems to be no way to adjust the parameters of the lepton mixing matrix to fit the data from all the other experiments people have done, and also the MiniBooNE–LSND data. If this is really true, we need a new theory of physics.

And this is where things get interesting.

The most conservative change to the Standard Model would be to add three right-handed neutrinos to go along with the three left-handed ones. This would not be an ugly ad hoc trick: it would make the Standard Model more symmetrical, by making leptons even more like quarks.

If we do this in the most beautiful way—making leptons as similar to quarks as we can get away with, given their obvious differences—the three new right-handed neutrinos will be ‘sterile’. This means that they will interact only with the Higgs boson and gravity: not electromagnetism, the weak force or the strong force. This is great, because it would mean there’s a darned good reason we haven’t seen them yet!

Neutrinos are already very hard to detect, since they don’t interact with electromagnetism or the strong force. They only interact with the Higgs boson (that’s what creates their mass, and oscillations), gravity (because they have energy), and the weak force (which is how we create and detect them). A ‘sterile’ neutrino—one that also didn’t interact with the weak force—would be truly elusive!

In practice, the main way to detect sterile neutrinos would be via oscillations. We could create an ordinary neutrino, and it might turn into a sterile neutrino, and then back into an ordinary neutrino. This would create new kinds of oscillations.

And indeed, MiniBooNE and LSND seem to be seeing new oscillations, much more rapid than those predicted by the Standard Model and our usual best estimate of the lepton mixing matrix.

So, people are getting excited! We may have found sterile neutrinos.

There’s a lot more to say. For example, the SO(10) grand unified theory predicts right-handed neutrinos in a very beautiful way, so I’m curious about what the new data implies about that. There are also questions about whether a sterile neutrino could explain dark matter… or what limits astronomical observations place on the properties of sterile neutrinos. One should also wonder about the possibility of experimental error!

I would enjoy questions that probe deeper into this subject, since they might force me to study and learn more. Right now I have to go to Joshua Tree! But I’ll come back and answer your questions tomorrow morning.

Effective Thermodynamics for a Marginal Observer

8 May, 2018

guest post by Matteo Polettini

Suppose you receive an email from someone who claims “here is the project of a machine that runs forever and ever and produces energy for free!” Obviously he must be a crackpot. But he may be well-intentioned. You opt for not being rude, roll your sleeves, and put your hands into the dirt, holding the Second Law as lodestar.

Keep in mind that there are two fundamental sources of error: either he is not considering certain input currents (“hey, what about that tiny hidden cable entering your machine from the electrical power line?!”, “uh, ah, that’s just to power the “ON” LED”, “mmmhh, you sure?”), or else he is not measuring the energy input correctly (“hey, why are you using a Geiger counter to measure input voltages?!”, “well, sir, I ran out of voltmeters…”).

In other words, the observer might only have partial information about the setup, either in quantity or quality. Because he has been marginalized by society (most crackpots believe they are misunderstood geniuses) we will call such observer “marginal,” which incidentally is also the word that mathematicians use when they focus on the probability of a subset of stochastic variables.

In fact, our modern understanding of thermodynamics as embodied in statistical mechanics and stochastic processes is founded (and funded) on ignorance: we never really have “complete” information. If we actually had, all energy would look alike, it would not come in “more refined” and “less refined” forms, there would not be a differentials of order/disorder (using Paul Valery’s beautiful words), and that would end thermodynamic reasoning, the energy problem, and generous research grants altogether.

Even worse, within this statistical approach we might be missing chunks of information because some parts of the system are invisible to us. But then, what warrants that we are doing things right, and he (our correspondent) is the crackpot? Couldn’t it be the other way around? Here I would like to present some recent ideas I’ve been working on together with some collaborators on how to deal with incomplete information about the sources of dissipation of a thermodynamic system. I will do this in a quite theoretical manner, but somehow I will mimic the guidelines suggested above for debunking crackpots. My three buzzwords will be: marginal, effective, and operational.

“Complete” thermodynamics: an out-of-the-box view

The laws of thermodynamics that I address are:

• The good ol’ Second Law (2nd)

• The Fluctuation-Dissipation Relation (FDR), and the Reciprocal Relation (RR) close to equilibrium.

• The more recent Fluctuation Relation (FR)1 and its corollary the Integral Fluctuation Relation (IFR), which have been discussed on this blog in a remarkable post by Matteo Smerlak.

The list above is all in the “area of the second law”. How about the other laws? Well, thermodynamics has for long been a phenomenological science, a patchwork. So-called stochastic thermodynamics is trying to put some order in it by systematically grounding thermodynamic claims in (mostly Markov) stochastic processes. But it’s not an easy task, because the different laws of thermodynamics live in somewhat different conceptual planes. And it’s not even clear if they are theorems, prescriptions, or habits (a bit like in jurisprudence2).

Within stochastic thermodynamics, the Zeroth Law is so easy nobody cares to formulate it (I do, so stay tuned…). The Third Law: no idea, let me know. As regards the First Law (or, better, “laws”, as many as there are conserved quantities across the system/environment interface…), we will assume that all related symmetries have been exploited from the offset to boil down the description to a minimum.


This minimum is as follows. We identify a system that is well separated from its environment. The system evolves in time, the environment is so large that its state does not evolve within the timescales of the system3. When tracing out the environment from the description, an uncertainty falls upon the system’s evolution. We assume the system’s dynamics to be described by a stochastic Markovian process.

How exactly the system evolves and what is the relationship between system and environment will be described in more detail below. Here let us take an “out of the box” view. We resolve the environment into several reservoirs labeled by index \alpha. Each of these reservoirs is “at equilibrium” on its own (whatever that means4). Now, the idea is that each reservoir tries to impose “its own equilibrium” on the system, and that their competition leads to a flow of currents across the system/environment interface. Each time an amount of the reservoir’s resource crosses the interface, a “thermodynamic cost” has to be to be paid or gained (be it a chemical potential difference for a molecule to go through a membrane, or a temperature gradient for photons to be emitted/absorbed, etc.).

The fundamental quantities of stochastic thermodynamic modeling thus are:

• On the “-dynamic” side: the time-integrated currents \Phi^t_\alpha, independent among themselves5. Currents are stochastic variables distributed with joint probability density


• On the “thermo-” side: The so-called thermodynamic forces or “affinities”6 \mathcal{A}_\alpha (collectively denoted \mathcal{A}). These are tunable parameters that characterize reservoir-to-reservoir gradients, and they are not stochastic. For convenience, we conventionally take them all positive.

Dissipation is quantified by the entropy production:

\sum \mathcal{A}_\alpha \Phi^t_\alpha

We are finally in the position to state the main results. Be warned that in the following expressions the exact treatment of time and its scaling would require a lot of specifications, but keep in mind that all these relations hold true in the long-time limit, and that all cumulants scale linearly with time.

FR: The probability of observing positive currents is exponentially favoured with respect to negative currents according to

P(\{\Phi_\alpha\}_\alpha) / P(\{-\Phi_\alpha\}_\alpha) = \exp \sum \mathcal{A}_\alpha \Phi^t_\alpha

Comment: This is not trivial, it follows from the explicit expression of the path integral, see below.

IFR: The exponential of minus the entropy production is unity

\big\langle  \exp - \sum \mathcal{A}_\alpha \Phi^t_\alpha  \big\rangle_{\mathcal{A}} =1

Homework: Derive this relation from the FR in one line.

2nd Law: The average entropy production is not negative

\sum \mathcal{A}_\alpha \left\langle \Phi^t_\alpha \right\rangle_{\mathcal{A}} \geq 0

Homework: Derive this relation using Jensen’s inequality.

Equilibrium: Average currents vanish if and only if affinities vanish:

\left\langle \Phi^t_\alpha \right\rangle_{\mathcal{A}} \equiv 0, \forall \alpha \iff  \mathcal{A}_\alpha \equiv 0, \forall \alpha

Homework: Derive this relation taking the first derivative w.r.t. {\mathcal{A}_\alpha} of the IFR. Notice that also the average depends on the affinities.

S-FDR: At equilibrium, it is impossible to tell whether a current is due to a spontaneous fluctuation (quantified by its variance) or to an external perturbation (quantified by the response of its mean). In a symmetrized (S-) version:

\left.  \frac{\partial}{\partial \mathcal{A}_\alpha}\left\langle \Phi^t_{\alpha'} \right\rangle \right|_{0} + \left.  \frac{\partial}{\partial \mathcal{A}_{\alpha'}}\left\langle \Phi^t_{\alpha} \right\rangle \right|_{0} = \left. \left\langle \Phi^t_{\alpha} \Phi^t_{\alpha'} \right\rangle \right|_{0}

Homework: Derive this relation taking the mixed second derivatives w.r.t. {\mathcal{A}_\alpha} of the IFR.

RR: The reciprocal response of two different currents to a perturbation of the reciprocal affinities close to equilibrium is symmetrical:

\left.  \frac{\partial}{\partial \mathcal{A}_\alpha}\left\langle \Phi^t_{\alpha'} \right\rangle \right|_{0} - \left.  \frac{\partial}{\partial \mathcal{A}_{\alpha'}}\left\langle \Phi^t_{\alpha} \right\rangle \right|_{0} = 0

Homework: Derive this relation taking the mixed second derivatives w.r.t. {\mathcal{A}_\alpha} of the FR.

Notice the implication scheme: FR ⇒ IFR ⇒ 2nd, IFR ⇒ S-FDR, FR ⇒ RR.

“Marginal” thermodynamics (still out-of-the-box)

Now we assume that we can only measure a marginal subset of currents \{\Phi_\mu^t\}_\mu \subset \{\Phi_\alpha^t\}_\alpha (index \mu always has a smaller range than \alpha), distributed with joint marginal probability

P(\{\Phi_\mu\}_\mu) = \int \prod_{\alpha \neq \mu} d\Phi_\alpha \, P(\{\Phi_\alpha\}_\alpha)


Notice that a state where these marginal currents vanish might not be an equilibrium, because other currents might still be whirling around. We call this a stalling state.

\mathrm{stalling:} \qquad \langle \Phi_\mu \rangle \equiv 0,  \quad \forall \mu

My central question is: can we associate to these currents some effective affinity \mathcal{Q}_\mu in such a way that at least some of the results above still hold true? And, are all definitions involved just a fancy mathematical construct, or are they operational?

First the bad news: In general the FR is violated for all choices of effective affinities:

P(\{\Phi_\mu\}_\mu) / P(\{-\Phi_\mu\}_\mu) \neq \exp \sum \mathcal{Q}_\mu \Phi^t_\mu

This is not surprising and nobody would expect that. How about the IFR?

Marginal IFR: There are effective affinities such that

\left\langle \exp - \sum \mathcal{Q}_\mu \Phi^t_\mu \right\rangle_{\mathcal{A}} =1

Mmmhh. Yeah. Take a closer look this expression: can you see why there actually exists an infinite choice of “effective affinities” that would make that average cross 1? Which on the other hand is just a number, so who even cares? So this can’t be the point.

The fact is, the IFR per se is hardly of any practical interest, as are all “absolutes” in physics. What matters is “relatives”: in our case, response. But then we need to specify how the effective affinities depend on the “real” affinities. And here steps in a crucial technicality, whose precise argumentation is a pain. Basing on reasonable assumptions7, we demonstrate that the IFR holds for the following choice of effective affinities:

\mathcal{Q}_\mu = \mathcal{A}_\mu - \mathcal{A}^{\mathrm{stalling}}_\mu,

where \mathcal{A}^{\mathrm{stalling}} is the set of values of the affinities that make marginal currents stall. Notice that this latter formula gives an operational definition of the effective affinities that could in principle be reproduced in laboratory (just go out there and tune the tunable until everything stalls, and measure the difference). Obviously:

Stalling: Marginal currents vanish if and only if effective affinities vanish:

\left\langle \Phi^t_\mu \right\rangle_{\mathcal{A}} \equiv 0, \forall \mu \iff \mathcal{A}_\mu \equiv 0, \forall \mu

Now, according to the inference scheme illustrated above, we can also prove that:

Effective 2nd Law: The average marginal entropy production is not negative

\sum \mathcal{Q}_\mu \left\langle \Phi^t_\mu \right\rangle_{\mathcal{A}} \geq 0

S-FDR at stalling:

\left. \frac{\partial}{\partial \mathcal{A}_\mu}\left\langle \Phi^t_{\mu'} \right\rangle \right|_{\mathcal{A}^{\mathrm{stalling}}} + \left. \frac{\partial}{\partial \mathcal{A}_{\mu'}}\left\langle \Phi^t_{\mu} \right\rangle \right|_{\mathcal{A}^{\mathrm{stalling}}} = \left. \left\langle \Phi^t_{\mu} \Phi^t_{\mu'} \right\rangle \right|_{\mathcal{A}^{\mathrm{stalling}}}

Notice instead that the RR is gone at stalling. This is a clear-cut prediction of the theory that can be experimented with basically the same apparatus with which response theory has been experimentally studied so far (not that I actually know what these apparatus are…): at stalling states, differing from equilibrium states, the S-FDR still holds, but the RR does not.

Into the box

You’ve definitely gotten enough at this point, and you can give up here. Please exit through the gift shop.

If you’re stubborn, let me tell you what’s inside the box. The system’s dynamics is modeled as a continuous-time, discrete configuration-space Markov “jump” process. The state space can be described by a graph G=(I, E) where I is the set of configurations, E is the set of possible transitions or “edges”, and there exists some incidence relation between edges and couples of configurations. The process is determined by the rates w_{i \gets j} of jumping from one configuration to another.

We choose these processes because they allow some nice network analysis and because the path integral is well defined! A single realization of such a process is a trajectory

\omega^t = (i_0,\tau_0) \to (i_1,\tau_1) \to \ldots \to (i_N,\tau_N)

A “Markovian jumper” waits at some configuration i_n for some time \tau_n with an exponentially decaying probability w_{i_n} \exp - w_{i_n} \tau_n with exit rate w_i = \sum_k w_{k \gets i}, then instantaneously jumps to a new configuration i_{n+1} with transition probability w_{i_{n+1} \gets {i_n}}/w_{i_n}. The overall probability density of a single trajectory is given by

P(\omega^t) = \delta \left(t - \sum_n \tau_n \right) e^{- w_{i_N}\tau_{i_N}} \prod_{n=0}^{N-1} w_{j_n \gets i_n} e^{- w_{i_n} \tau_{i_n}}

One can in principle obtain the probability distribution function of any observable defined along the trajectory by taking the marginal of this measure (though in most cases this is technically impossible). Where does this expression come from? For a formal derivation, see the very beautiful review paper by Weber and Frey, but be aware that this is what one would intuitively come up with if one had to simulate with the Gillespie algorithm.

The dynamics of the Markov process can also be described by the probability of being at some configuration i at time t, which evolves via the master equation

\dot{p}_i(t) = \sum_j \left[ w_{ij} p_j(t) - w_{ji} p_i(t) \right].

We call such probability the system’s state, and we assume that the system relaxes to a uniquely defined steady state p = \mathrm{lim}_{t \to \infty} p(t).

A time-integrated current along a single trajectory is a linear combination of the net number of jumps \#^t between configurations in the network:

\Phi^t_\alpha = \sum_{ij} C^{ij}_\alpha \left[ \#^t(i \gets j) - \#^t(j\gets i) \right]

The idea here is that one or several transitions within the system occur because of the “absorption” or the “emission” of some environmental degrees of freedom, each with different intensity. However, for the moment let us simplify the picture and require that only one transition contributes to a current, that is that there exist i_\alpha,j_\alpha such that

C^{ij}_\alpha = \delta^i_{i_\alpha} \delta^j_{j_\alpha}.

Now, what does it mean for such a set of currents to be “complete”? Here we get inspiration from Kirchhoff’s Current Law in electrical circuits: the continuity of the trajectory at each configuration of the network implies that after a sufficiently long time, cycle or loop or mesh currents completely describe the steady state. There is a standard procedure to identify a set of cycle currents: take a spanning tree T of the network; then the currents flowing along the edges E\setminus T left out from the spanning tree form a complete set.

The last ingredient you need to know are the affinities. They can be constructed as follows. Consider the Markov process on the network where the observable edges are removed G' = (I,T). Calculate the steady state of its associated master equation (p^{\mathrm{eq}}_i)_i, which is necessarily an equilibrium (since there cannot be cycle currents in a tree…). Then the affinities are given by

\mathcal{A}_\alpha = \log  w_{i_\alpha j_\alpha} p^{\mathrm{eq}}_{j_\alpha} / w_{j_\alpha i_\alpha} p^{\mathrm{eq}}_{i_\alpha}.

Now you have all that is needed to formulate the complete theory and prove the FR.

Homework: (Difficult!) With the above definitions, prove the FR.

How about the marginal theory? To define the effective affinities, take the set E_{\mathrm{mar}} = \{i_\mu j_\mu, \forall \mu\} of edges where there run observable currents. Notice that now its complement obtained by removing the observable edges, the hidden edge set E_{\mathrm{hid}} = E \setminus E_{\mathrm{mar}}, is not in general a spanning tree: there might be cycles that are not accounted for by our observations. However, we can still consider the Markov process on the hidden space, and calculate its stalling steady state p^{\mathrm{st}}_i, and ta-taaa: The effective affinities are given by

\mathcal{Q}_\mu = \log w_{i_\mu j_\mu} p^{\mathrm{st}}_{j_\mu} / w_{j_\mu i_\mu} p^{\mathrm{st}}_{i_\mu}.

Proving the marginal IFR is far more complicated than the complete FR. In fact, very often in my field we will not work with the current’ probability density itself, but we prefer to take its bidirectional Laplace transform and work with the currents’ cumulant generating function. There things take a quite different and more elegant look.

Many other questions and possibilities open up now. The most important one left open is: Can we generalize the theory the (physically relevant) case where the current is supported on several edges? For example, for a current defined like \Phi^t = 5 \Phi^t_{12} + 7 \Phi^t_{34}? Well, it depends: the theory holds provided that the stalling state is not “internally alive”, meaning that if the observable current vanishes on average, then also should \Phi^t_{12} and \Phi^t_{34} separately. This turns out to be a physically meaningful but quite strict condition.

Is all of thermodynamics “effective”?

Let me conclude with some more of those philosophical considerations that sadly I have to leave out of papers…

Stochastic thermodynamics strongly depends on the identification of physical and information-theoretic entropies — something that I did not openly talk about, but that lurks behind the whole construction. Throughout my short experience as researcher I have been pursuing a program of “relativization” of thermodynamics, by making the role of the observer more and more evident and movable. Inspired by Einstein’s Gedankenexperimenten, I also tried to make the theory operational. This program may raise eyebrows here and there: Many thermodynamicians embrace a naive materialistic world-view whereby what only matters are “real” physical quantities like temperature, pressure, and all the rest of the information-theoretic discourse is at best mathematical speculation or a fascinating analog with no fundamental bearings. According to some, information as a physical concept lingers alarmingly close to certain extreme postmodern claims in the social sciences that “reality” does not exist unless observed, a position deemed dangerous at times when the authoritativeness of science is threatened by all sorts of anti-scientific waves.

I think, on the contrary, that making concepts relative and effective and by summoning the observer explicitly is a laic and prudent position that serves as an antidote to radical subjectivity. The other way around—clinging to the objectivity of a preferred observer, which is implied in any materialistic interpretation of thermodynamics, e.g. by assuming that the most fundamental degrees of freedom are the positions and velocities of gas’s molecules—is the dangerous position, expecially when the role of such preferred observer is passed around from the scientist to the technician and eventually to the technocrat, who would be induced to believe there are simple technological fixes to complex social problems

How do we reconcile observer-dependency and the laws of physics? The object and the subject? On the one hand, much like the position of an object depends on the reference frame, so much so entropy and entropy production do depend on the observer and the particular apparatus that he controls or experiment he is involved with. On the other hand, much like motion is ultimately independent of position and it is agreed upon by all observers that share compatible measurement protocols, so much so the laws of thermodynamics are independent of that particular observer’s quantification of entropy and entropy production (e.g., the effective Second Law holds independently of how much the marginal observer knows of the system, if he operates according to our phenomenological protocol…). This is the case even in the every-day thermodynamics as practiced by energetic engineers et al., where there are lots of choices to gauge upon, and there is no other external warrant that the amount of dissipation being quantified is the “true” one (whatever that means…)—there can only be trust in one’s own good practices and methodology.

So in this sense, I like to think that all observers are marginal, that this effective theory serves as a dictionary by which different observers practice and communicate thermodynamics, and that we should not revere the laws of thermodynamics as “true” idols, but rather as tools of good scientific practice.


• M. Polettini and M. Esposito, Effective fluctuation and response theory, arXiv:1803.03552.

In this work we give the complete theory and numerous references to work of other people that was along the same lines. We employ a “spiral” approach to the presentation of the results, inspired by the pedagogical principle of Albert Baez.

• M. Polettini and M. Esposito, Effective thermodynamics for a marginal observer, Phys. Rev. Lett. 119 (2017), 240601, arXiv:1703.05715.

This is a shorter version of the story.

• B. Altaner, M. Polettini and M. Esposito, Fluctuation-dissipation relations far from equilibrium, Phys. Rev. Lett. 117 (2016), 180601, arXiv:1604.0883.

An early version of the story, containing the FDR results but not the full-fledged FR.

• G. Bisker, M. Polettini, T. R. Gingrich and J. M. Horowitz, Hierarchical bounds on entropy production inferred from partial information, J. Stat. Mech. (2017), 093210, arXiv:1708.06769.

Some extras.

• M. F. Weber and E. Frey, Master equations and the theory of stochastic path integrals, Rep. Progr. Phys. 80 (2017), 046601, arXiv:1609.02849.

Great reference if one wishes to learn about path integrals for master equation systems.


1 There are as many so-called “Fluctuation Theorems” as there are authors working on them, so I decided not to call them by any name. Furthermore, notice I prefer to distinguish between a relation (a formula) and a theorem (a line of reasoning). I lingered more on this here.

2 “Just so you know, nobody knows what energy is.”—Richard Feynman.

I cannot help but mention here the beautiful book by Shapin and Schaffer, Leviathan and the Air-Pump, about the Boyle vs. Hobbes diatribe about what constitutes a “matter of fact,” and Bruno Latour’s interpretation of it in We Have Never Been Modern. Latour argues that “modernity” is a process of separation of the human and natural spheres, and within each of these spheres a process of purification of the unit facts of knowledge and the unit facts of politics, of the object and the subject. At the same time we live in a world where these two spheres are never truly separated, a world of “hybrids” that are at the same time necessary “for all practical purposes” and unconceivable according to the myths that sustain the narration of science, of the State, and even of religion. In fact, despite these myths, we cannot conceive a scientific fact out of the contextual “network” where this fact is produced and replicated, and neither we can conceive society out of the material needs that shape it: so in this sense “we have never been modern”, we are not quite different from all those societies that we take pleasure of studying with the tools of anthropology. Within the scientific community Latour is widely despised; probably he is also misread. While it is really difficult to see how his analysis applies to, say, high-energy physics, I find that thermodynamics and its ties to the industrial revolution perfectly embodies this tension between the natural and the artificial, the matter of fact and the matter of concern. Such great thinkers as Einstein and Ehrenfest thought of the Second Law as the only physical law that would never be replaced, and I believe this is revelatory. A second thought on the Second Law, a systematic and precise definition of all its terms and circumstances, reveals that the only formulations that make sense are those phenomenological statements such as Kelvin-Planck’s or similar, which require a lot of contingent definitions regarding the operation of the engine, while fetishized and universal statements are nonsensical (such as that masterwork of confusion that is “the entropy of the Universe cannot decrease”). In this respect, it is neither a purely natural law—as the moderns argue, nor a purely social construct—as the postmodern argue. One simply has to renounce to operate this separation. While I do not have a definite answer on this problem, I like to think of the Second Law as a practice, a consistency check of the thermodynamic discourse.

3 This assumption really belongs to a time, the XIXth century, when resources were virtually infinite on planet Earth…

4 As we will see shortly, we define equilibrium as that state where there are no currents at the interface between the system and the environment, so what is the environment’s own definition of equilibrium?!

5 This because we have already exploited the First Law.

6 This nomenclature comes from alchemy, via chemistry (think of Goethe’s The elective affinities…), it propagated in the XXth century via De Donder and Prigogine, and eventually it is still present in language in Luxembourg because in some way we come from the “late Brussels school”.

7 Basically, we ask that the tunable parameters are environmental properties, such as temperatures, chemical potentials, etc. and not internal properties, such as the energy landscape or the activation barriers between configurations.


6 May, 2018

A new journal! We’ve been working on it for a long time, but we finished sorting out some details at ACT2018, and now we’re ready to tell the world!

It’s free to read, free to publish in, and it’s about building big things from smaller parts. Here’s the top of the journal’s home page right now:

Here’s the official announcement:

We are pleased to announce the launch of Compositionality, a new diamond open-access journal for research using compositional ideas, most notably of a category-theoretic origin, in any discipline. Topics may concern foundational structures, an organizing principle, or a powerful tool. Example areas include but are not limited to: computation, logic, physics, chemistry, engineering, linguistics, and cognition. To learn more about the scope and editorial policies of the journal, please visit our website at

Compositionality is the culmination of a long-running discussion by many members of the extended category theory community, and the editorial policies, look, and mission of the journal have yet to be finalized. We would love to get your feedback about our ideas on the forum we have established for this purpose:

Lastly, the journal is currently receiving applications to serve on the editorial board; submissions are due May 31 and will be evaluated by the members of our steering board: John Baez, Bob Coecke, Kathryn Hess, Steve Lack, and Valeria de Paiva.

We will announce a call for submissions in mid-June.

We’re looking forward to your ideas and submissions!

Best regards,

Brendan Fong, Nina Otter, and Joshua Tan

Props in Network Theory

27 April, 2018

Long before the invention of Feynman diagrams, engineers were using similar diagrams to reason about electrical circuits and more general networks containing mechanical, hydraulic, thermodynamic and chemical components. We can formalize this reasoning using ‘props’: a certain kind of categories whose objects are natural numbers, with the tensor product of objects given by addition. In this approach, each kind of network corresponds to a prop, and each network of this kind is a morphism in that prop. A network with m inputs and n outputs is a morphism from m to n. Putting networks together in series is composition, and setting them side by side is tensoring.

In this paper, we study the props for various kinds of electrical circuits:

• John Baez, Brandon Coya and Franciscus Rebro, Props in network theory.

We start with circuits made solely of ideal perfectly conductive wires. Then we consider circuits with passive linear components like resistors, capacitors and inductors. Finally we turn on the power and consider circuits that also have voltage and current sources.

And here’s the cool part: each kind of circuit corresponds to a prop that pure mathematicians would eventually invent on their own! So, what’s good for engineers is often mathematically natural too.

We describe the ‘behavior’ of these various kinds of circuits using morphisms between props. In particular, we give a new proof of the black-boxing theorem proved earlier with Brendan Fong. Unlike the original proof, this new one easily generalizes to circuits with nonlinear components! We also use a morphism of props to clarify the relation between circuit diagrams and the signal-flow diagrams in control theory.

Here’s a quick sketch of the main ideas.

Props in network theory

In his 1963 thesis, Lawvere introduced functorial semantics: the use of categories with specified extra structure as ‘theories’ whose ‘models’ are structure-preserving functors into other such categories:

• F. W. Lawvere, Functorial semantics of algebraic theories, Ph.D. thesis, Department of Mathematics, Columbia University, 1963. Also in Reprints in Theory and Applications of Categories 5 (2003), 1–121.

In particular, a Lawvere theory is a category with finite cartesian products and a distinguished object X such that every object is a power X^n. These can serve as theories of mathematical structures that are sets X equipped with n-ary operations

f \colon X^n \to X

obeying equational laws. However, structures of a more linear-algebraic nature are often vector spaces equipped with operations of the form

f \colon X^{\otimes m} \to X^{\otimes n}

To extend functorial semantics to these, Mac Lane introduced props—or as he called them, ‘PROPs’. The acronym stands for ‘products and permutations’:

• Saunders Mac Lane, Categorical algebra, Bulletin of the American Mathematical Society 71 (1965), 40–106.

A prop is a symmetric monoidal category equipped with a distinguished object X such that every object is a tensor power X^{\otimes n}. Working with tensor products rather than cartesian products puts operations having multiple outputs on an equal footing with operations having multiple inputs.

Already in 1949 Feynman had introduced his famous diagrams, which he used to describe theories of elementary particles. For a theory with just one type of particle, Feynman’s method amounts to specifying a prop where an operation f \colon X^{\otimes m} \to X^{\otimes n} describes a process with m particles coming in and n going out. Although Feynman diagrams quickly caught on in physics, only in the 1980s did it become clear that they were a method of depicting morphisms in symmetric monoidal categories. A key step was the work of Joyal and Street, which rigorously justified reasoning in any symmetric monoidal category using ‘string diagrams’—a generalization of Feynman diagrams:

• André Joyal and Ross Street, The geometry of tensor calculus I, Advances in Mathematics 88 (1991), 55–112.

By now, many mathematical physicists are aware of props and the general idea of functorial semantics. In constrast, props seem to be virtually unknown in engineering!

But engineers have been using diagrammatic methods ever since the rise of electrical circuits. And in the 1940s, Olson explained how to apply circuit diagrams to networks of mechanical, hydraulic, thermodynamic and chemical components:

• Harry F. Olson, Dynamical Analogies, Van Nostrand, New York, 1943.

By 1961, Paynter had made the analogies between these various systems mathematically precise using ‘bond graphs’:

• Henry M. Paynter, Analysis and Design of Engineering Systems, MIT Press, Cambridge, Massachusetts, 1961.

Here he shows a picture of a hydroelectric power plant, and the bond graph that abstractly describes it:

By 1963, Forrester was using circuit diagrams in economics:

• Jay Wright Forrester, Industrial Dynamics, MIT Press, Cambridge, Massachusetts, 1961.

In 1984, Odum published a beautiful and influential book on their use in biology and ecology:

• Howard T. Odum, Ecological and General Systems: An Introduction to Systems Ecology, Wiley, New York, 1984.

Energy Systems Symbols

We can use props to study circuit diagrams of all these kinds! The underlying mathematics is similar in each case, so we focus on just one example: electrical circuits. For other examples, take a look at this:

• John Baez, Network theory (part 29), Azimuth, 23 April 2013.

In our new paper, we illustrate the usefulness of props by giving a new, shorter proof of the ‘black-boxing theorem’ proved here:

• John Baez and Brendan Fong, A compositional framework for passive linear networks. (Blog article here.)

A ‘black box’ is a system with inputs and outputs whose internal mechanisms are unknown or ignored. A simple example is the lock on a doorknob: one can insert a key and try to turn it; it either opens the door or not, and it fulfills this function without us needing to know its inner workings. We can treat a system as a black box through a process called ‘black-boxing’, which forgets its inner workings and records only the relation it imposes between its inputs and outputs. Systems with inputs and outputs can be seen as morphisms in a category, where composition uses the outputs of the one system as the inputs of another. We thus expect black-boxing to be a functor out of a category of this sort. A ‘black-boxing theorem’ makes this intuition precise.

In an electrical circuit, associated to each wire there is a pair of variables called the potential \phi and current I. When we black-box such a circuit, we record only the relation it imposes between the variables on its input and output wires. Since these variables come in pairs, this is a relation between even-dimensional vector spaces. But these vector spaces turn out to be equipped with extra structure: they are symplectic vector spaces, meaning they are equipped with a nondegenerate antisymmetric bilinear form. Black-boxing gives a relation that respects this extra structure: it is a ‘Lagrangian’ relation.

Why does symplectic geometry show up when we black-box an electrical circuit? The first proof of the black-boxing theorem answered this question. A circuit made of linear resistors acts to minimize the total power dissipated in the form of heat. More generally, any circuit made of linear resistors, inductors and capacitors obeys a generalization of this ‘principle of minimum power’. Whenever a system obeys a minimum principle, it establishes a Lagrangian relation between input and output variables. This fact was first noticed in classical mechanics, where systems obey the ‘principle of least action’. Indeed, symplectic geometry has its origins in classical mechanics. But it applies more generally: for any sort of system governed by a minimum principle, black-boxing should give a functor to some category where the morphisms are Lagrangian relations.

The first step toward such a theorem for electrical circuits is to treat circuits as morphisms in a suitable category. We start with circuits made only of ideal perfectly conductive wires. These are morphisms in a prop we call \mathrm{Circ}, defined in Section 3 of our paper. In Section 8 we construct a black-boxing functor

\blacksquare \colon \mathrm{Circ} \to \mathrm{LagRel}_k

sending each such circuit to the relation it defines between its input and output potentials and currents. Here \mathrm{LagRel}_k is a prop with symplectic vector spaces of the form k^{2n} as objects and linear Lagrangian relations as morphisms, and \blacksquare is a morphism of props. We work in a purely algebraic fashion, so k here can be any field.

In Section 9 we extend black-boxing to a larger class of circuits that include linear resistors, inductors and capacitors. This gives a new proof of the black-boxing theorem that Brendan and I proved: namely, there is a morphism of props

\blacksquare \colon \mathrm{Circ}_k \to \mathrm{LagRel}_k

sending each such linear circuit to the Lagrangian relation it defines between its input and output potentials and currents. The ease with which we can extend the black-boxing functor is due to the fact that all our categories with circuits as morphisms are props. We can describe these props using generators and relations, so that constructing a black-boxing functor simply requires that we choose where it sends each generator, and check that the all the relations hold. In Section 10 we explain how electric circuits are related to signal-flow diagrams, used in control theory. Finally, in Section 11, we illustrate how props can be used to study nonlinear circuits.

Outline of the results

The paper is pretty long, so here’s a more detailed outline of the results.

In Section 1 we explain a general notion of `L-circuit’ that was first introduced under a different name here:

• R. Rosebrugh, N. Sabadini and R. F. C. Walters, Generic commutative separable algebras and cospans of graphs, Theory and Applications of Categories 15 (2005), 164–177.

An L-circuit is a cospan of finite sets where the apex is the set of nodes of a graph whose edges are labelled by elements of some set L. In applications to electrical engineering, the elements of L describe different ‘circuit elements’ such as resistors, inductors and capacitors. We discuss a symmetric monoidal category \mathrm{Circ}_L whose objects are finite sets and whose morphisms are (isomorphism classes of) L-circuits.

In Section 2 we study \mathrm{Circ}_L when L is a 1-element set. We call this category simply \mathrm{Circ}. In applications to engineering, a morphism in \mathrm{Circ} describes circuit made solely of ideal conductive wires. We show how such a circuit can be simplified in two successive stages, described by symmetric monoidal functors:

\mathrm{Circ} \stackrel{G}{\longrightarrow} \mathrm{FinCospan} \stackrel{H}{\longrightarrow} \mathrm{FinCorel}

Here \mathrm{FinCospan} is the category of cospans of finite sets, while \mathrm{FinCorel} is the category of ‘corelations’ between finite sets. Corelations, categorically dual to relations, are already known to play an important role in network theory:

• Brendan Fong, Decorated corelations.

Just as a relation can be seen as a jointly monic span, a corelation can be seen as a jointly epic cospan. The functor G crushes any graph down to its set of components, while H reduces any cospan to a jointly epic one.

In Section 4 we turn to props. Propositions 11 and 12, proved in Appendix A.1 with the help of Steve Lack, characterize which symmetric monoidal categories are equivalent to props and which symmetric monoidal functors are isomorphic to morphisms of props. We use these to find props equivalent to \mathrm{Circ}_L, \mathrm{Circ}, \mathrm{FinCospan} and \mathrm{FinCorel}. This lets us reinterpret the arrows here as morphisms of props:

\mathrm{Circ} \stackrel{G}{\longrightarrow} \mathrm{FinCospan} \stackrel{H}{\longrightarrow} \mathrm{FinCorel}

In Section 5 we discuss presentations of props. Proposition 19, proved in Appendix A.2 using a result of Todd Trimble, shows that the category of props is monadic over the category of signatures, \mathrm{Set}^{\mathbb{N} \times \mathbb{N}}. This lets us work with props using generators and relations. We conclude by recalling a presentation of \mathrm{FinCospan} due to Lack and a presentation of \mathrm{FinCorel} due to Coya and Fong:

• Steve Lack, Composing PROPs, Theory and Applications of Categories 13 (2004), 147–163.

• Brandon Coya and Brendan Fong, Corelations are the prop for extraspecial commutative Frobenius monoids, Theory and Applications of Categories 32 (2017), 380–395. (Blog article here.)

In Section 6 we introduce the prop \mathrm{FinRel}_k. This prop is equivalent to the symmetric monoidal category with finite-dimensional vector spaces over the field k as objects and linear relations as morphisms, with direct sum as its tensor product. A presentation of this prop was given in these papers:

• Filippo Bonchi, Pawel Sobociénski and Fabio Zanasi, Interacting Hopf algebras, Journal of Pure and Applied Algebra 221 (2017), 144–184.

• John Baez and Jason Erbele, Categories in control, Theory and Application of Categories 30 (2015), 836–881. (Blog article here.)

In Section 7 we state the main result in the paper by Rosebrugh, Sabadini and Walters. This gives a presentation of \mathrm{Circ}_L. Equivalently, it says that \mathrm{Circ}_L is the coproduct, in the category of props, of \mathrm{FinCospan} and the free prop on a set of unary operations, one for each element of L. This result makes it easy to construct morphisms from \mathrm{Circ}_L to other props.

In Section 8 we introduce the prop \mathrm{LagRel}_k where morphisms are Lagrangian linear relations between symplectic vector spaces, and we construct the black-boxing functor \blacksquare \colon \mathrm{Circ} \to \mathrm{LagRel}_k. Mathematically, this functor is the composite

\mathrm{Circ} \stackrel{G}{\longrightarrow} \mathrm{FinCospan} \stackrel{H}{\longrightarrow} \mathrm{FinCorel} \stackrel{K}{\longrightarrow} \mathrm{LagRel}_k

where K is a symmetric monoidal functor defined by its action on the generators of \mathrm{FinCorel}. In applications to electrical engineering, the black-boxing functor maps any circuit of ideal conductive wires to its ‘behavior’: that is, to the relation that it imposes on the potentials and currents at its inputs and outputs.

In Section 9 we extend the black-boxing functor to circuits that include resistors, inductors, capacitors and certain other linear circuit elements. The most elegant prop having such circuits as morphisms is \mathrm{Circ}_k, meaning \mathrm{Circ}_L with the label set L taken to be a field $k.$ We characterize the black-boxing functor

\blacksquare \colon \mathrm{Circ}_k \to \mathrm{LagRel}_k

in Theorem 41.

In Section 10 we expand the scope of inquiry to include ‘signal-flow diagrams’, which are important in control theory. We explain how signal-flow diagrams serve as a syntax for linear relations. Concretely, this means that signal-flow diagrams are morphisms in a free prop \mathrm{SigFlow}_k with the same generators as \mathrm{FinRel}_k, but no relations. There is thus a morphism of props

\square \colon \mathrm{SigFlow}_k \to \mathrm{FinRel}_k

mapping any signal-flow diagrams to the linear relation that it denotes. It is natural to wonder how this is related to the black-boxing functor

\blacksquare \colon \mathrm{Circ}_k \to \mathrm{LagRel}_k

The answer involves the free prop \widetilde{\mathrm{Circ}}_k which arises when we take the simplest presentation of \mathrm{Circ}_k and omit the relations. This comes with a map

P \colon \widetilde{\mathrm{Circ}}_k \to \mathrm{Circ}_k

which reinstates those relations, and in Theorem 44 we show there is a map of props T \colon \widetilde{\mathrm{Circ}}_k \to \mathrm{SigFlow}_k making this diagram commute:

Putting everything together, we get this grand commuting diagram relating circuit diagrams, linear circuits, signal flow diagrams, cospans, corelations, Lagrangian relations, and linear relations:

Finally, in Section 11 we illustrate how props can also be used to study nonlinear circuits. Namely, we show how to include voltage and current sources. Black-boxing these gives Lagrangian affine relations between symplectic vector spaces! Eventually we’ll get around to studying more general nonlinear circuits… but not today.