## Energy and the Environment – What Physicists Can Do

25 April, 2013

The Perimeter Institute is a futuristic-looking place where over 250 physicists are thinking about quantum gravity, quantum information theory, cosmology and the like. Since I work on some of these things, I was recently invited to give the weekly colloquium there. But I took the opportunity to try to rally them into action:

Energy and the Environment: What Physicists Can Do. Watch the video or read the slides.

Abstract. The global warming crisis is part of a bigger transformation in which humanity realizes that the Earth is a finite system and that our population, energy usage, and the like cannot continue to grow exponentially. While politics and economics pose the biggest challenges, physicists are in a good position to help make this transition a bit easier. After a quick review of the problems, we discuss a few ways physicists can help.

On the video you can hear me say a lot of stuff that’s not on the slides: it’s more of a coherent story. The advantage of the slides is that anything in blue, you can click on to get more information. So for example, when I say that solar power capacity has been growing annually by 75% in recent years, you can see where I got that number.

I was pleased by the response to this talk. Naturally, it was not a case of physicists saying “okay, tomorrow I’ll quit working on the foundations of quantum mechanics and start trying to improve quantum dot solar cells.” It’s more about getting them to see that huge problems are looming ahead of us… and to see the huge opportunities for physicists who are willing to face these problems head-on, starting now. Work on energy technologies, the smart grid, and ‘ecotechnology’ is going to keep growing. I think a bunch of the younger folks, at least, could see this.

However, perhaps the best immediate outcome of this talk was that Lee Smolin introduced me to Manjana Milkoreit. She’s at the school of international affairs at Waterloo University, practically next door to the Perimeter Institute. She works on “climate change governance, cognition and belief systems, international security, complex systems approaches, especially threshold behavior, and the science-policy interface.”

So, she knows a lot about the all-important human and political side of climate change. Right now she’s interviewing diplomats involved in climate treaty negotiations, trying to see what they believe about climate change. And it’s very interesting!

In my next post, I’ll talk about something she pointed me to. Namely: what we can do to hold the temperature increase to 2 °C or less, given that the pledges made by various nations aren’t enough.

## Network Theory (Part 29)

23 April, 2013

I’m talking about electrical circuits, but I’m interested in them as models of more general physical systems. Last time we started seeing how this works. We developed an analogy between electrical circuits and physical systems made of masses and springs, with friction:

| Electronics | Mechanics |
|---|---|
| charge: $Q$ | position: $q$ |
| current: $I = \dot{Q}$ | velocity: $v = \dot{q}$ |
| flux linkage: $\lambda$ | momentum: $p$ |
| voltage: $V = \dot{\lambda}$ | force: $F = \dot{p}$ |
| inductance: $L$ | mass: $m$ |
| resistance: $R$ | damping coefficient: $r$ |
| inverse capacitance: $1/C$ | spring constant: $k$ |

But this is just the first of a large set of analogies. Let me list some, so you can see how wide-ranging they are!

### More analogies

People in system dynamics often use effort as a term to stand for anything analogous to force or voltage, and flow as a general term to stand for anything analogous to velocity or electric current. They call these variables $e$ and $f.$

To me it’s important that force is the time derivative of momentum, and velocity is the time derivative of position. Following physicists, I write momentum as $p$ and position as $q.$ So, I’ll usually write effort as $\dot{p}$ and flow as $\dot{q}$.

Of course, ‘position’ is a term special to mechanics; it’s nice to have a general term, applicable in any context, for the thing whose time derivative is flow. People in system dynamics seem to use displacement as that general term.

It would also be nice to have a general term for the thing whose time derivative is effort… but I don’t know one. So, I’ll use the word momentum.

Now let’s see the analogies! Let’s see how displacement $q$, flow $\dot{q},$ momentum $p$ and effort $\dot{p}$ show up in several subjects:

| | displacement: $q$ | flow: $\dot q$ | momentum: $p$ | effort: $\dot p$ |
|---|---|---|---|---|
| Mechanics: translation | position | velocity | momentum | force |
| Mechanics: rotation | angle | angular velocity | angular momentum | torque |
| Electronics | charge | current | flux linkage | voltage |
| Hydraulics | volume | flow | pressure momentum | pressure |
| Thermal Physics | entropy | entropy flow | temperature momentum | temperature |
| Chemistry | moles | molar flow | chemical momentum | chemical potential |

We’d been considering mechanics of systems that move along a line, via translation, but we can also consider mechanics for systems that turn round and round, via rotation. So, there are two rows for mechanics here.

There’s a row for electronics, and then a row for hydraulics, which is closely analogous. In this analogy, a pipe is like a wire. The flow of water plays the role of current. Water pressure plays the role of electrostatic potential. The difference in water pressure between two ends of a pipe is like the voltage across a wire. When water flows through a pipe, the power equals the flow times this pressure difference—just as in an electrical circuit the power is the current times the voltage across the wire.
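To make the bookkeeping concrete, here is a minimal Python sketch of the two power formulas side by side. The function names and the numbers are made up for illustration; only the rule ‘power equals flow times effort’ comes from the analogy.

```python
def electrical_power(current_amps, voltage_volts):
    """Power in watts: current (A) times the voltage across the wire (V)."""
    return current_amps * voltage_volts


def hydraulic_power(flow_m3_per_s, pressure_difference_pa):
    """Power in watts: volume flow (m^3/s) times pressure difference (Pa)."""
    return flow_m3_per_s * pressure_difference_pa


# Illustrative values only (hypothetical, not from the text).
p_elec = electrical_power(2.0, 12.0)
p_hydro = hydraulic_power(0.001, 24000.0)
print(p_elec, p_hydro)
```

In SI units both products come out in watts, since a pascal times a cubic meter per second is a newton-meter per second.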

A resistor is like a narrowed pipe:

An inductor is like a heavy turbine placed inside a pipe: this makes the water tend to keep flowing at the same rate it’s already flowing! In other words, it provides a kind of ‘inertia’ analogous to mass.

A capacitor is like a tank with pipes coming in from both ends, and a rubber sheet dividing it in two lengthwise:

When studying electrical circuits as a kid, I was shocked when I first learned that capacitors don’t let the electrons through: it didn’t seem likely you could do anything useful with something like that! But of course you can. Similarly, this gizmo doesn’t let the water through.

A voltage source is like a compressor set up to maintain a specified pressure difference between the input and output:

Similarly, a current source is like a pump set up to maintain a specified flow.

Finally, just as voltage is the time derivative of a fairly obscure quantity called ‘flux linkage’, pressure is the time derivative of an even more obscure quantity which has no standard name. I’m calling it ‘pressure momentum’, thanks to the analogy

momentum: force :: pressure momentum: pressure

Just as pressure has units of force per area, pressure momentum has units of momentum per area!

People invented this analogy back when they were first struggling to understand electricity, before electrons had been observed:

Hydraulic analogy, Wikipedia.

The famous electrical engineer Oliver Heaviside pooh-poohed this analogy, calling it the “drain-pipe theory”. I think he was making fun of William Henry Preece. Preece was another electrical engineer, who liked the hydraulic analogy and disliked Heaviside’s fancy math. In his inaugural speech as president of the Institution of Electrical Engineers in 1893, Preece proclaimed:

True theory does not require the abstruse language of mathematics to make it clear and to render it acceptable. All that is solid and substantial in science and usefully applied in practice, have been made clear by relegating mathematic symbols to their proper store place—the study.

According to the judgement of history, Heaviside made more progress in understanding electromagnetism than Preece. But there’s still a nice analogy between electronics and hydraulics. And I’ll eventually use the abstruse language of mathematics to make it very precise!

But now let’s move on to the row called ‘thermal physics’. We could also call this ‘thermodynamics’. It works like this. Say you have a physical system in thermal equilibrium and all you can do is heat it up or cool it down ‘reversibly’—that is, while keeping it in thermal equilibrium all along. For example, imagine a box of gas that you can heat up or cool down. If you put a tiny amount $dE$ of energy into the system in the form of heat, then its entropy increases by a tiny amount $dS.$ And they’re related by this equation:

$dE = TdS$

where $T$ is the temperature.

Another way to say this is

$\displaystyle{ \frac{dE}{dt} = T \frac{dS}{dt} }$

where $t$ is time. On the left we have the power put into the system in the form of heat. But since power should be ‘effort’ times ‘flow’, on the right we should have ‘effort’ times ‘flow’. It makes some sense to call $dS/dt$ the ‘entropy flow’. So temperature, $T,$ must play the role of ‘effort’.

This is a bit weird. I don’t usually think of temperature as a form of ‘effort’ analogous to force or torque. Stranger still, our analogy says that ‘effort’ should be the time derivative of some kind of ‘momentum’. So, we need to introduce temperature momentum: the integral of temperature over time. I’ve never seen people talk about this concept, so it makes me a bit nervous.

But when we have a more complicated physical system like a piston full of gas in thermal equilibrium, we can see the analogy working. Now we have

$dE = TdS - PdV$

The change in energy $dE$ of our gas now has two parts. There’s the change in heat energy $TdS$, which we saw already. But now there’s also the change in energy due to compressing the piston! When we change the volume of the gas by a tiny amount $dV,$ we put in energy $-PdV.$

Now look back at the analogy chart above! It says that pressure is a form of ‘effort’, while volume is a form of ‘displacement’. If you believe that, the equation above should help convince you that temperature is also a form of effort, while entropy is a form of displacement.

But what about the minus sign? That’s no big deal: it’s the result of some arbitrary conventions. $P$ is defined to be the outward pressure of the gas on our piston. If this is positive, reducing the volume of the gas takes a positive amount of energy, so we need to stick in a minus sign. I could eliminate this minus sign by changing some conventions—but if I did, the chemistry professors at UCR would haul me away and increase my heat energy by burning me at the stake.
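As a small worked step, we can divide this equation by $dt$, just as we did for the heat term alone. This is only a restatement of the equation above, assuming everything varies smoothly in time:

$\displaystyle{ \frac{dE}{dt} = T \frac{dS}{dt} - P \frac{dV}{dt} }$

Each term on the right has the form ‘effort’ times ‘flow’: temperature times entropy flow, and pressure times the rate of change of volume, with the sign fixed by the convention for $P$ described above.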

Speaking of chemistry: here’s how the chemistry row in the analogy chart works. Suppose we have a piston full of gas made of different kinds of molecules, and there can be chemical reactions that change one kind into another. Now our equation gets fancier:

$\displaystyle{ dE = TdS - PdV + \sum_i \mu_i dN_i }$

Here $N_i$ is the number of molecules of the ith kind, while $\mu_i$ is a quantity called a chemical potential. The chemical potential simply says how much energy it takes to increase the number of molecules of a given kind. So, we see that chemical potential is another form of effort, while number of molecules is another form of displacement.

But chemists are too busy to count molecules one at a time, so they count them in big bunches called ‘moles’. A mole is the number of atoms in 12 grams of carbon-12. That’s roughly

602,214,150,000,000,000,000,000

atoms. This is called Avogadro’s constant. If we used 1 gram of hydrogen, we’d get a very close number called ‘Avogadro’s number’, which leads to lots of jokes:

(He must be desperate because he looks so weird… sort of like a mole!)

So, instead of saying that the displacement in chemistry is called ‘number of molecules’, you’ll sound more like an expert if you say ‘moles’. And the corresponding flow is called molar flow.

The truly obscure quantity in this row of the chart is the one whose time derivative is chemical potential! I’m calling it chemical momentum simply because I don’t know another name.

Why are linear and angular momentum so famous compared to pressure momentum, temperature momentum and chemical momentum?

I suspect it’s because the laws of physics are symmetrical under translations and rotations. When the assumptions of Noether’s theorem hold, this guarantees that the total momentum and angular momentum of a closed system are conserved. Apparently the laws of physics lack the symmetries that would make the other kinds of momentum be conserved.

This suggests that we should dig deeper and try to understand more deeply how this chart is connected to ideas in classical mechanics, like Noether’s theorem or symplectic geometry. I will try to do that sometime later in this series.

More generally, we should try to understand what gives rise to a row in this analogy chart. Are there lots of rows I haven’t talked about yet, or just a few? There are probably lots. But are there lots of practically important rows that I haven’t talked about—ones that can serve as the basis for new kinds of engineering? Or does something about the structure of the physical world limit the number of such rows?

### Mildly defective analogies

Engineers care a lot about dimensional analysis. So, they often make a big deal about the fact that while effort and flow have different dimensions in different rows of the analogy chart, the following four things are always true:

• $pq$ has dimensions of action (= energy × time)

• $\dot{p} q$ has dimensions of energy

• $p \dot{q}$ has dimensions of energy

• $\dot{p} \dot{q}$ has dimensions of power (= energy / time)

In fact any one of these things implies all the rest.
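Here is a rough Python sketch that checks these four facts for the six rows of the analogy chart. It assumes SI units and treats ‘amount of substance’ (moles) as a base dimension; the tuple encoding and all names are illustrative choices, not anything standard in system dynamics.

```python
# Dimensions as exponent tuples over SI base quantities:
# (mass, length, time, current, temperature, amount of substance)
ACTION = (1, 2, -1, 0, 0, 0)   # energy x time
ENERGY = (1, 2, -2, 0, 0, 0)
POWER = (1, 2, -3, 0, 0, 0)    # energy / time


def mul(a, b):
    """Dimensions of a product: add the exponents."""
    return tuple(x + y for x, y in zip(a, b))


def per_time(a):
    """Dimensions of a time derivative: lower the time exponent by one."""
    return a[:2] + (a[2] - 1,) + a[3:]


# (displacement q, momentum p) for each row, under the SI assumption.
ROWS = {
    "mechanics: translation": ((0, 1, 0, 0, 0, 0), (1, 1, -1, 0, 0, 0)),
    "mechanics: rotation": ((0, 0, 0, 0, 0, 0), (1, 2, -1, 0, 0, 0)),
    "electronics": ((0, 0, 1, 1, 0, 0), (1, 2, -2, -1, 0, 0)),
    "hydraulics": ((0, 3, 0, 0, 0, 0), (1, -1, -1, 0, 0, 0)),
    "thermal physics": ((1, 2, -2, 0, -1, 0), (0, 0, 1, 0, 1, 0)),
    "chemistry": ((0, 0, 0, 0, 0, 1), (1, 2, -1, 0, 0, -1)),
}


def check_row(q, p):
    """True if all four dimensional facts hold for this (q, p) pair."""
    q_dot, p_dot = per_time(q), per_time(p)
    return (
        mul(p, q) == ACTION
        and mul(p_dot, q) == ENERGY
        and mul(p, q_dot) == ENERGY
        and mul(p_dot, q_dot) == POWER
    )


results = {name: check_row(q, p) for name, (q, p) in ROWS.items()}
for name, ok in results.items():
    print(name, ok)
```

Because taking a time derivative just lowers the time exponent by one, the code also makes it plain why any one of the four facts implies the others.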

These facts are important when designing ‘mixed systems’, which combine different rows in the chart. For example, in mechatronics, we combine mechanical and electronic elements in a single circuit! And in a hydroelectric dam, power is converted from hydraulic to mechanical and then electric form:

One goal of network theory should be to develop a unified language for studying mixed systems! Engineers have already done most of the hard work. And they’ve realized that thanks to conservation of energy, working with pairs of flow and effort variables whose product has dimensions of power is very convenient. It makes it easy to track the flow of energy through these systems.

However, people have tried to extend the analogy chart to include ‘mildly defective’ examples where effort times flow doesn’t have dimensions of power. The two most popular are these:

| | displacement: $q$ | flow: $\dot q$ | momentum: $p$ | effort: $\dot p$ |
|---|---|---|---|---|
| Heat flow | heat | heat flow | temperature momentum | temperature |
| Economics | inventory | product flow | economic momentum | product price |

The heat flow analogy comes up because people like to think of heat flow as analogous to electrical current, and temperature as analogous to voltage. Why? Because an insulated wall acts a bit like a resistor! The current flowing through a resistor is a function of the voltage across it. Similarly, the heat flowing through an insulated wall is about proportional to the difference in temperature between the inside and the outside.

However, there’s a difference. Current times voltage has dimensions of power. Heat flow times temperature does not have dimensions of power. In fact, heat flow by itself already has dimensions of power! So, engineers feel somewhat guilty about this analogy.

Being a mathematical physicist, I see a possible way out: use units where temperature is dimensionless! In fact such units are pretty popular in some circles. But I don’t know if this solution is a real one, or whether it causes some sort of trouble.

In the economic example, ‘energy’ has been replaced by ‘money’. In other words, ‘inventory’ times ‘product price’ has units of money. And so does ‘product flow’ times ‘economic momentum’! I’d never heard of economic momentum before I started studying these analogies, but I didn’t make up that term. It’s the thing whose time derivative is ‘product price’. Apparently economists have noticed a tendency for rising prices to keep rising, and falling prices to keep falling… a tendency toward ‘conservation of momentum’ that doesn’t fit into their models of rational behavior.

I’m suspicious of any attempt to make economics seem like physics. Unlike elementary particles or rocks, people don’t seem to be very well modelled by simple differential equations. However, some economists have used the above analogy to model economic systems. And I can’t help but find that interesting—even if intellectually dubious when taken too seriously.

### An auto-analogy

Besides the analogy I’ve already described between electronics and mechanics, there’s another one, called ‘Firestone’s analogy’:

• F.A. Firestone, A new analogy between mechanical and electrical systems, Journal of the Acoustical Society of America 4 (1933), 249–267.

Alain Bossavit pointed this out in the comments to Part 27. The idea is to treat current as analogous to force instead of velocity… and treat voltage as analogous to velocity instead of force!

In other words, switch your $p$’s and $q$’s:

| Electronics | Mechanics (usual analogy) | Mechanics (Firestone’s analogy) |
|---|---|---|
| charge | position: $q$ | momentum: $p$ |
| current | velocity: $\dot{q}$ | force: $\dot{p}$ |
| flux linkage | momentum: $p$ | position: $q$ |
| voltage | force: $\dot{p}$ | velocity: $\dot{q}$ |

This new analogy is not ‘mildly defective’: the product of effort and flow variables still has dimensions of power. But why bother with another analogy?

It may be helpful to recall this circuit from last time:

It’s described by this differential equation:

$L \ddot{Q} + R \dot{Q} + C^{-1} Q = V$

We used the ‘usual analogy’ to translate it into a classical mechanics problem, and we got a problem where an object of mass $m = L$ is hanging from a spring with spring constant $k = 1/C$ and damping coefficient $r = R,$ and feeling an additional external force $F = V:$

$m \ddot{q} + r \dot{q} + k q = F$
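Since the two equations have the same form, one piece of code can integrate both. Here is a minimal Python sketch, assuming a constant voltage (or force) switched on at time zero with the system starting at rest. The semi-implicit Euler integrator, the function name `simulate` and all the numbers are illustrative choices, not something taken from the circuit discussed above.

```python
def simulate(inertia, damping, stiffness, drive, dt=0.001, steps=40000):
    """Integrate inertia*x'' + damping*x' + stiffness*x = drive from rest.

    Uses the semi-implicit Euler scheme: update the velocity first,
    then use the new velocity to update the position.
    """
    x, v = 0.0, 0.0
    for _ in range(steps):
        a = (drive - damping * v - stiffness * x) / inertia
        v += dt * a
        x += dt * v
    return x


# Electrical reading: inductance L, resistance R, capacitance C, voltage V.
L, R, C, V = 1.0, 1.0, 0.25, 2.0
charge = simulate(L, R, 1.0 / C, V)

# Mechanical reading under the usual analogy: m = L, r = R, k = 1/C, F = V.
m, r, k, F = L, R, 1.0 / C, V
position = simulate(m, r, k, F)

print(charge, position)
```

After the oscillations die away, the charge settles near $CV$ and the position settles near $F/k$, which are the same number under the analogy.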

And that’s fine. But there’s an intuitive sense in which all three forces are acting ‘in parallel’ on the mass, rather than in series. In other words, all side by side, instead of one after the other.

Using Firestone’s analogy, we get a different classical mechanics problem, where the three forces are acting in series. The spring is connected to a source of friction, which in turn is connected to an external force.

This may seem a bit mysterious. But instead of trying to explain it, I’ll urge you to read his paper, which is short and clearly written. I instead want to make a somewhat different point, which is that we can take a mechanical system, convert it to an electrical one following the usual analogy, and then convert back to a mechanical one using Firestone’s analogy. This gives us an ‘auto-analogy’ between mechanics and itself, which switches $p$ and $q.$

And although I haven’t been able to figure out why from Firestone’s paper, I have other reasons for feeling sure this auto-analogy should contain a minus sign. For example:

$p \mapsto q, \qquad q \mapsto -p$

In other words, it should correspond to a 90° rotation in the $(p,q)$ plane. There’s nothing sacred about whether we rotate clockwise or counterclockwise; we can equally well do this:

$p \mapsto -q, \qquad q \mapsto p$

But we need the minus sign to get a so-called symplectic transformation of the $(p,q)$ plane. And from my experience with classical mechanics, I’m pretty sure we want that. If I’m wrong, please let me know!
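Here is a small Python check of that claim, as a sketch. It assumes the standard definition: a linear map $M$ of the plane is symplectic when $M^T J M = J$, where $J$ is the matrix of the symplectic form. Ordering the coordinates as $(q, p)$ and the helper names are my own choices for illustration.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


# Matrix of the symplectic form in coordinates (q, p).
J = [[0, 1], [-1, 0]]


def is_symplectic(m):
    """True if m preserves the symplectic form: m^T J m == J."""
    return matmul(matmul(transpose(m), J), m) == J


# Each matrix sends the column vector (q, p) to the new (q, p).
plain_swap = [[0, 1], [1, 0]]       # q -> p,  p -> q
rotation_a = [[0, -1], [1, 0]]      # q -> -p, p -> q
rotation_b = [[0, 1], [-1, 0]]      # q -> p,  p -> -q

for name, m in [("plain swap", plain_swap),
                ("rotation a", rotation_a),
                ("rotation b", rotation_b)]:
    print(name, is_symplectic(m))
```

The plain swap fails the test because it reverses the sign of the symplectic form, while both 90° rotations pass. For 2×2 matrices the condition is the same as having determinant 1.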

I have a feeling we should revisit this issue when we get more deeply into the symplectic aspects of circuit theory. So, I won’t go on now.

### References

The analogies I’ve been talking about are studied in a branch of engineering called system dynamics. You can read more about it here:

• Dean C. Karnopp, Donald L. Margolis and Ronald C. Rosenberg, System Dynamics: a Unified Approach, Wiley, New York, 1990.

• Forbes T. Brown, Engineering System Dynamics: a Unified Graph-Centered Approach, CRC Press, Boca Raton, 2007.

• Francois E. Cellier, Continuous System Modelling, Springer, Berlin, 1991.

System dynamics already uses lots of diagrams of networks. One of my goals in weeks to come is to explain the category theory lurking behind these diagrams.

## Petri Net Programming (Part 3)

19 April, 2013

guest post by David Tanzer

### The role of differential equations

Last time we looked at stochastic Petri nets, which use a random event model for the reactions. Individual entities are represented by tokens that flow through the network. When the token counts get large, we observed that they can be approximated by continuous quantities, which opens the door to the application of continuous mathematics to the analysis of network dynamics.

A key result of this approach is the “rate equation,” which gives a law of motion for the expected sizes of the populations. Equilibrium can then be obtained by solving for zero motion. The rate equations are applied in chemistry, where they give the rates of change of the concentrations of the various species in a reaction.

But before discussing the rate equation, here I will talk about the mathematical form of this law of motion, which consists of differential equations. This form is naturally associated with deterministic systems involving continuous magnitudes. This includes the equations of motion for the sine wave:

and the graceful ellipses that are traced out by the orbits of the planets around the sun:

This post provides some mathematical context to programmers who have not worked on scientific applications. My goal is to get as many of you on board as possible, before setting sail with Petri net programming.

### Three approaches to equations: theoretical, formula-based, and computational

Let’s first consider the major approaches to equations in general. We’ll illustrate with a Diophantine equation

$x^9 + y^9 + z^9 = 2$

where $x, y$ and $z$ are integer variables.

In the theoretical approach (aka “qualitative analysis”), we start with the meaning of the equation and then proceed to reason about its solutions. Here are some simple consequences of this equation: in any solution, the variables can’t all be zero, can’t all be positive, can’t all be negative, can’t all be even, and can’t all be odd.

In the formula-based approach, we seek formulas to describe the solutions. Here is an example of a formula (which does not solve our equation):

$\{(x,y,z) \mid x = n^3, \; y = 2n - 4, \; z = 4n, \; 1 \leq n \leq 5 \}$

Such formulas are nice to have, but the pursuit of them is diabolically difficult. In fact, for Diophantine equations, even the question of whether an arbitrarily chosen equation has any solutions whatsoever has been proven to be algorithmically undecidable.

Finally, in the computational approach, we seek algorithms to enumerate or numerically approximate the solutions to the equations.
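For the Diophantine example, a brute-force enumeration is about the simplest algorithm of this kind. The following Python sketch searches a small box of integers; the function name and the bound are arbitrary choices for illustration, and a finite search like this says nothing about solutions outside the box.

```python
from itertools import product


def solutions(bound):
    """All integer triples with |x|, |y|, |z| <= bound solving x^9 + y^9 + z^9 = 2."""
    values = range(-bound, bound + 1)
    return [(x, y, z)
            for x, y, z in product(values, repeat=3)
            if x**9 + y**9 + z**9 == 2]


found = solutions(5)
print(found)

# The simple consequences derived by reasoning can be checked on whatever was found.
for x, y, z in found:
    assert (x, y, z) != (0, 0, 0)
    assert not (x > 0 and y > 0 and z > 0)
    assert not (x % 2 == 0 and y % 2 == 0 and z % 2 == 0)
    assert not (x % 2 == 1 and y % 2 == 1 and z % 2 == 1)
```

Within this box the search turns up only the three rearrangements of $(1, 1, 0)$.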

### The three approaches to differential equations

Let’s apply the preceding classification to differential equations.

#### Theoretical approach

A differential equation is one that constrains the rates at which the variables are changing. This can include constraints on the rates at which the rates are changing (second-order equations), etc. The equation is ordinary if there is a single independent variable, such as time, otherwise it is partial.

Consider the equation stating that a variable increases at a rate equal to its current value. The bigger it gets, the faster it increases. Given a starting value, this determines a process — the solution to the equation — which here is exponential growth.

Let $X(t)$ be the value at time $t,$ and let’s initialize it to 1 at time 0. So we have:

$X(0) = 1$

$X'(t) = X(t)$

These are first-order equations, because the derivative is applied at most once to any variable. They are linear equations, because the terms on each side of the equations are linear combinations of either individual variables or derivatives (in this case all of the coefficients are 1). Note also that a system of differential equations may in general have zero, one, or multiple solutions. This example belongs to a class of equations which are proven to have a unique solution for each initial condition.

You could imagine more complex systems of equations, involving multiple dependent variables, all still depending on time. That includes the rate equations for a Petri net, which have one dependent variable for each of the population sizes. The ideas for such systems are an extension of the ideas for a single-variable system. Then, a state of the system is a vector of values, with one component for each of the dependent variables. For first-order systems, such as the rate equations, where the derivatives appear on the left-hand sides, the equations determine, for each possible state of the system, a “direction” and rate of change for the state of the system.

Now here is a simple illustration of what I called the theoretical approach. Can $X(t)$ ever become negative? No, because it starts out positive at time 0, and in order to later become negative, it must be decreasing at a time $t_1$ when it is still positive. That is to say, $X(t_1) > 0$, and $X'(t_1) < 0$. But that contradicts the assumption $X'(t) = X(t)$. The general lesson here is that we don’t need a solution formula in order to make such inferences.

For the rate equations, the theoretical approach leads to substantial theorems about the existence and structure of equilibrium solutions.

#### Formula-based approach

It is natural to look for concise formulas to solve our equations, but the results of this overall quest are largely negative. The exponential differential equation cannot be solved by any formula that involves a finite combination of simple operations. So the solution function must be treated as a new primitive, and given a name, say $\exp(t)$. But even when we extend our language to include this new symbol, there are many differential equations that remain beyond the reach of finite formulas. So an endless collection of primitive functions is called for. (As standard practice, we always include $\exp(t),$ and its complex extensions to the trigonometric functions, as primitives in our toolbox.)

But the hard mathematical reality does not end here, because even when solution formulas do exist, finding them may call for an ace detective. Only for certain classes of differential equations, such as the linear ones, do we have systematic solution methods.

The picture changes, however, if we let the formulas contain an infinite number of operations. Then the arithmetic operators give a far-reaching base for defining new functions. In fact, as you can verify, the power series

$X(t) = 1 + t + t^2/2! + t^3/3! + ...$

which we view as an “infinite polynomial” over the time parameter t, exactly satisfies our equations for exponential motion, $X(0) = 1$ and $X'(t) = X(t).$ This power series therefore defines $\exp(t).$ By the way, applying it to the input 1 produces a definition for the transcendental number $e$:

$e = X(1) = 1 + 1 + 1/2 + 1/6 + 1/24 + 1/120 + ... \approx 2.71828$

#### Computational approach

Let’s leave aside our troubles with formulas, and consider the computational approach. For broad classes of differential equations, there are approximation algorithms that can be successfully applied.

For starters, any power series that satisfies a differential equation may work for a simple approximation method. If a series is known to converge over some range of inputs, then one can approximate the value at those points by stopping the computation after a finite number of terms.
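As a minimal sketch of this truncation idea, here is the power series for the exponential function, cut off after a fixed number of terms. The function name and the choice of 20 terms are arbitrary; the comparison against Python's `math.e` is only a sanity check.

```python
import math


def exp_series(t, terms):
    """Sum the first `terms` terms of 1 + t + t^2/2! + t^3/3! + ..."""
    total = 0.0
    term = 1.0                 # the n = 0 term, t^0 / 0!
    for n in range(terms):
        total += term
        term *= t / (n + 1)    # turn t^n / n! into t^(n+1) / (n+1)!
    return total


approx_e = exp_series(1.0, 20)
print(approx_e, math.e, abs(approx_e - math.e))
```

Computing each term from the previous one avoids recomputing factorials and powers from scratch.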

But the standard methods work directly with the equations, provided that they can be put into the right form. The simplest one is called Euler’s method. It works over a sampling grid of points separated by some small number $\epsilon$. Let’s take the case where we have a first-order equation in explicit form, which means that $X'(t) = f(X(t))$ for some function $f.$

We begin with the initial value $X(0)$. Applying $f,$ we get $X'(0) = f(X(0)).$ Then for the interval from 0 to $\epsilon$, we use a linear approximation for $X(t)$, by assuming that the derivative remains constant at $X'(0).$ That gives $X(\epsilon) = X(0) + \epsilon \cdot X'(0).$ Next, $X'(\epsilon) = f(X(\epsilon)),$ and $X(2 \epsilon) = X(\epsilon) + \epsilon \cdot X'(\epsilon),$ etc. Formally,

$X(0) = \textrm{initial}$

$X((n+1) \epsilon) = X(n \epsilon) + \epsilon f(X(n \epsilon))$

Applying this to our exponential equation, where $f(X(t)) = X(t),$ we get:

$X(0) = 1$

$X((n+1) \epsilon) = X(n \epsilon) + \epsilon X(n \epsilon) = X(n \epsilon) (1 + \epsilon)$

Hence:

$X(n \epsilon) = (1 + \epsilon) ^ n$

So the approximation method gives a discrete exponential growth, which converges to a continuous exponential in the limit as the mesh size goes to zero.
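Here is a short Python sketch of Euler's method as described above, applied to the exponential equation on the interval from 0 to 1. The names `euler` and `euler_exp` and the particular step counts are illustrative choices.

```python
import math


def euler(f, x0, eps, steps):
    """Euler's method for X'(t) = f(X(t)): return the estimate of X(steps * eps)."""
    x = x0
    for _ in range(steps):
        x = x + eps * f(x)
    return x


def euler_exp(n):
    """Estimate X(1) for X' = X, X(0) = 1, using n steps of size 1/n."""
    return euler(lambda x: x, 1.0, 1.0 / n, n)


step_counts = [10, 100, 1000, 10000]
errors = [abs(euler_exp(n) - math.e) for n in step_counts]
for n, err in zip(step_counts, errors):
    print(n, euler_exp(n), err)
```

Each estimate agrees with $(1 + \epsilon)^n$ up to rounding, and the error shrinks as the mesh gets finer.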

Note, the case we just considered has more generality than might appear at first, because (1) the ideas here are easily extended to systems of explicit first order equations, and (2) higher-order equations that are “explicit” in an extended sense—meaning that the highest-order derivative is expressed as a function of time, of the variable, and of the lower-order derivatives—can be converted into an equivalent system of explicit first-order equations.
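To illustrate both points, here is a sketch that takes the second-order equation $X''(t) = -X(t)$, rewrites it as the first-order system $X' = V,$ $V' = -X,$ and applies Euler's method to the pair $(X, V)$. The equation, the initial values and the function names are chosen just for illustration; with $X(0) = 0$ and $X'(0) = 1$ the exact solution is $\sin(t).$

```python
import math


def euler_system(f, state, eps, steps):
    """Euler's method for a system: `state` is a tuple, f returns its derivatives."""
    for _ in range(steps):
        derivs = f(state)
        state = tuple(s + eps * d for s, d in zip(state, derivs))
    return state


def oscillator(state):
    """X'' = -X written as a first-order system in (X, V)."""
    x, v = state
    return (v, -x)


eps = 0.0001
steps = 10000                       # integrate from t = 0 to t = 1
x1, v1 = euler_system(oscillator, (0.0, 1.0), eps, steps)
print(x1, math.sin(1.0))
print(v1, math.cos(1.0))
```

One caveat: for this equation each Euler step multiplies $X^2 + V^2$ by $1 + \epsilon^2,$ so the numerical solution slowly spirals outward instead of staying on a circle. Over short time spans with a small $\epsilon$ the effect is tiny.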

### The challenging world of differential equations

So, is our cup half-empty or half-full? We have no toolbox of primitive formulas for building the solutions to all differential equations by finite compositions. And even for those which can be solved by formulas, there is no general method for finding the solutions. That is how the cookie crumbles. But on the positive side, there is an array of theoretical tools for analyzing and solving important classes of differential equations, and numerical methods can be applied in many cases.

The study of differential equations leads to some challenging problems, such as the Navier-Stokes equations, which describe the flow of fluids.

These are partial differential equations involving flow velocity, pressure, density and external forces (such as gravity), all of which vary over space and time. There are non-linear (multiplicative) interactions between these variables and their spatial and temporal derivatives, which leads to complexity in the solutions.

At high flow rates, this complexity can produce chaotic solutions, which involve complex behavior at a wide range of resolution scales. This is turbulence. Here is an insightful portrait of turbulence, by Leonardo da Vinci, whose studies in turbulence date back to the 15th Century.

Turbulence, which has been described by Richard Feynman as the most important unsolved problem of classical physics, also presents a mathematical puzzle. The general existence of solutions to the Navier-Stokes equations remains unsettled. This is one of the “Millennium Prize Problems”, for which a one million dollar prize is offered: in three dimensions, given initial values for the velocity and scalar fields, does there exist a solution that is smooth and everywhere defined? There are also complications with grid-based numerical methods, which will fail to produce globally accurate results if the solutions contain details at a smaller scale than the grid mesh. So the ubiquitous phenomenon of turbulence, which is so basic to the movements of the atmosphere and the seas, remains an open case.

But fortunately we have enough traction with differential equations to proceed directly with the rate equations for Petri nets. There we will find illuminating equations, which are the subject of both major theorems and open problems. They are non-linear and intractable by formula-based methods, yet, as we will see, they are well handled by numerical methods.

## Milankovitch Cycles and the Earth’s Climate

13 April, 2013

Here are the slides for a talk I’m giving at the Cal State Northridge Climate Science Seminar:

It’s a gentle introduction to these ideas, and it presents a lot of what Blake Pollard and I have said about Milankovitch cycles, in a condensed way. Of course when I give the talk, I’ll add more words, especially about the different famous ‘puzzles’.

If you have any corrections, please let me know!

I’m eager to visit Cal State Northridge and especially David Klein in their math department, since I’d like to incorporate some climate science in our math curriculum the way they’ve done there.

## Network Theory (Part 28)

10 April, 2013

Last time I left you with some puzzles. One was to use the laws of electrical circuits to work out what this one does:

If we do this puzzle, and keep our eyes open, we’ll see an analogy between electrical circuits and classical mechanics! And this is the first of a huge set of analogies. The same math shows up in many different subjects, whenever we study complex systems made of interacting parts. So, it should become part of any general theory of networks.

This simple circuit is very famous: it’s called a series RLC circuit, because it has a resistor of resistance $R,$ an inductor of inductance $L,$ and a capacitor of capacitance $C,$ all hooked up ‘in series’, meaning one after another. But to understand this circuit, it’s good to start with an even simpler one, where we leave out the voltage source:

This has three edges, so reading from top to bottom there are 3 voltages $V_1, V_2, V_3,$ and 3 currents $I_1, I_2, I_3,$ one for each edge. The white and black dots are called ‘nodes’, and the white ones are called ‘terminals’: current can flow in or out of those.

The voltages and currents obey a bunch of equations:

• Kirchhoff’s current law says the current flowing into each node that’s not a terminal equals the current flowing out:

$I_1 = I_2 = I_3$

• Kirchhoff’s voltage law says there are potentials $\phi_0, \phi_1, \phi_2, \phi_3$, one for each node, such that:

$V_1 = \phi_0 - \phi_1$

$V_2 = \phi_1 - \phi_2$

$V_3 = \phi_2 - \phi_3$

In this particular problem, Kirchhoff’s voltage law doesn’t say much, since we can always find potentials obeying this, given the voltages. But in other problems it can be important. And even here it suggests that the sum $V_1 + V_2 + V_3$ will be important; this is the ‘total voltage across the circuit’.

Next, we get one equation for each circuit element:

• The law for a resistor says:

$V_1 = R I_1$

• The law for an inductor says:

$\displaystyle{ V_2 = L \frac{d I_2}{d t} }$

• The law for a capacitor says:

$\displaystyle{ I_3 = C \frac{d V_3}{d t} }$

These are all our equations. What should we do with them? Since $I_1 = I_2 = I_3,$ it makes sense to call all these currents simply $I$ and solve for each voltage in terms of this. Here’s what we get:

$V_1 = R I$

$\displaystyle{ V_2 = L \frac{d I}{d t} }$

$\displaystyle {V_3 = C^{-1} \int I \, dt }$

So, if we know the current flowing through the circuit we can work out the voltage across each circuit element!

Well, not quite: in the case of the capacitor we only know it up to a constant, since there’s a constant of integration. This may seem like a minor objection, but it’s worth taking seriously. The point is that the charge on the capacitor’s plate is proportional to the voltage across the capacitor:

$\displaystyle{V_3 = C^{-1} Q }$

When electrons move on or off the plate, this charge changes, and we get a current:

$\displaystyle{I = \frac{d Q}{d t} }$

So, we can work out the time derivative of $V_3$ from the current $I$, but to work out $V_3$ itself we need the charge $Q.$

Treat these as definitions if you like, but they’re physical facts too! And they let us rewrite our trio of equations:

$V_1 = R I$

$\displaystyle{ V_2 = L \frac{d I}{d t} }$

$\displaystyle{V_3 = C^{-1} \int I \, dt }$

in terms of the charge, as follows:

$V_1 = R \dot{Q}$

$V_2 = L \ddot{Q}$

$V_3 = C^{-1} Q$

Then if we add these three equations, we get

$V_1 + V_2 + V_3 = L \ddot Q + R \dot Q + C^{-1} Q$

So, if we define the total voltage by

$V = V_1 + V_2 + V_3 = \phi_0 - \phi_3$

we get

$L \ddot Q + R \dot Q + C^{-1} Q = V$

And this is great!

Why? Because this equation is famous! If you’re a mathematician, you know it as the most general second-order linear ordinary differential equation with constant coefficients. But if you’re a physicist, you know it as the damped driven oscillator.

### The analogy between electronics and mechanics

Here’s an example of a damped driven oscillator:

We’ve got an object hanging from a spring with some friction, and an external force pulling it down. Here the external force is gravity, so it’s constant in time, but we can imagine fancier situations where it’s not. So in a general damped driven oscillator:

• the object has mass $m$ (and the spring is massless),

• the spring constant is $k$ (this says how strong the spring force is),

• the damping coefficient is $r$ (this says how much friction there is),

• the external force is $F$ (in general a function of time).

Then Newton’s law says

$m \ddot{q} + r \dot{q} + k q = F$

And apart from the use of different letters, this is exactly like the equation for our circuit! Remember, that was

$L \ddot Q + R \dot Q + C^{-1} Q = V$

So, we get a wonderful analogy relating electronics and mechanics! It goes like this:

| Electronics | Mechanics |
|---|---|
| charge: $Q$ | position: $q$ |
| current: $I = \dot{Q}$ | velocity: $v = \dot{q}$ |
| voltage: $V$ | force: $F$ |
| inductance: $L$ | mass: $m$ |
| resistance: $R$ | damping coefficient: $r$ |
| inverse capacitance: $1/C$ | spring constant: $k$ |

If you understand mechanics, you can use this to get intuition about electronics… or vice versa. I’m more comfortable with mechanics, so when I see this circuit:

I imagine a current of electrons whizzing along, ‘forced’ by the voltage across the circuit, getting slowed by the ‘friction’ of the resistor, wanting to continue their motion thanks to the inertia or ‘mass’ of the inductor, and getting stuck on the plate of the capacitor, where their mutual repulsion pushes back against the flow of current—just like a spring fights back when you pull on it! This lets me know how the circuit will behave: I can use my mechanical intuition.

The only mildly annoying thing is that the inverse of the capacitance $C$ is like the spring constant $k.$ But this makes perfect sense. A capacitor is like a spring: you ‘pull’ on it with voltage and it ‘stretches’ by building up electric charge on its plate. If its capacitance is high, it’s like an easily stretchable spring. But this means the corresponding spring constant is low.

Besides letting us transfer intuition and techniques, the other great thing about analogies is that they suggest ways of extending themselves. For example, we’ve seen that current is the time derivative of charge. But if we hadn’t, we could still have guessed it, because current is like velocity, which is the time derivative of something important.

Similarly, force is analogous to voltage. But force is the time derivative of momentum! We don’t have momentum on our chart. Our chart is also missing the thing whose time derivative is voltage. This thing is called flux linkage, and sometimes denoted $\lambda.$ So we should add this, and momentum, to our chart:

| Electronics | Mechanics |
|---|---|
| charge: $Q$ | position: $q$ |
| current: $I = \dot{Q}$ | velocity: $v = \dot{q}$ |
| flux linkage: $\lambda$ | momentum: $p$ |
| voltage: $V = \dot{\lambda}$ | force: $F = \dot{p}$ |
| inductance: $L$ | mass: $m$ |
| resistance: $R$ | damping coefficient: $r$ |
| inverse capacitance: $1/C$ | spring constant: $k$ |

### Fourier transforms

But before I get carried away talking about analogies, let’s try to solve the equation for our circuit:

$L \ddot Q + R \dot Q + C^{-1} Q = V$

This instantly tells us the voltage $V$ as a function of time if we know the charge $Q$ as a function of time. So, ‘solving’ it means figuring out $Q$ if we know $V.$ You may not care about $Q$—it’s the charge of the electrons stuck on the capacitor—but you should certainly care about the current $I = \dot{Q},$ and figuring out $Q$ will get you that.

Besides, we’ll learn something good from solving this equation.

We could solve it using either the Laplace transform or the Fourier transform. They’re very similar. For some reason electrical engineers prefer the Laplace transform—does anyone know why? But I think the Fourier transform is conceptually preferable, slightly, so I’ll use that.

The idea is to write any function of time as a linear combination of oscillating functions $\exp(i\omega t)$ with different frequencies $\omega.$ More precisely, we write our function $f$ as an integral

$\displaystyle{ f(t) = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^\infty \hat{f}(\omega) e^{i\omega t} \, d\omega }$

Here the function $\hat{f}$ is called the Fourier transform of $f$, and it’s given by

$\displaystyle{ \hat{f}(\omega) = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^\infty f(t) e^{-i\omega t} \, dt }$

There is a lot one could say about this, but all I need right now is that differentiating a function has the effect of multiplying its Fourier transform by $i\omega.$ To see this, we simply take the Fourier transform of $\dot{f}$:

$\begin{array}{ccl} \hat{\dot{f}}(\omega) &=& \displaystyle{ \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^\infty \frac{df(t)}{dt} \, e^{-i\omega t} \, dt } \\ \\ &=& \displaystyle{ -\frac{1}{\sqrt{2 \pi}} \int_{-\infty}^\infty f(t) \frac{d}{dt} e^{-i\omega t} \, dt } \\ \\ &=& \displaystyle{ i\omega \; \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^\infty f(t) e^{-i\omega t} \, dt } \\ \\ &=& i\omega \hat{f}(\omega) \end{array}$

where in the second step we integrate by parts. So,

$\hat{\dot{f}}(\omega) = i\omega \hat{f}(\omega)$
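If you want to see this rule in action, here is a rough numerical check. It is only a sketch: the test function (a Gaussian), the integration window, and the helper name `fourier_transform` are all choices made for illustration, and the integral is approximated by a plain trapezoid rule with the same $1/\sqrt{2\pi}$ convention as above.

```python
import cmath
import math

def fourier_transform(f, omega, t_max=12.0, n=4000):
    """Approximate (1/sqrt(2 pi)) * integral of f(t) exp(-i omega t) dt
    over [-t_max, t_max] using the trapezoid rule with n subintervals."""
    dt = 2 * t_max / n
    total = 0j
    for k in range(n + 1):
        t = -t_max + k * dt
        weight = 0.5 if k in (0, n) else 1.0   # trapezoid end-point weights
        total += weight * f(t) * cmath.exp(-1j * omega * t)
    return total * dt / math.sqrt(2 * math.pi)

def f(t):
    # Illustrative test function: a Gaussian, which decays fast enough
    # that truncating the integral at |t| = 12 is harmless.
    return math.exp(-t * t / 2)

def f_dot(t):
    # Its time derivative, computed by hand.
    return -t * math.exp(-t * t / 2)

# Compare the transform of the derivative with i*omega times the transform.
errors = []
for omega in (0.5, 1.0, 2.0):
    lhs = fourier_transform(f_dot, omega)
    rhs = 1j * omega * fourier_transform(f, omega)
    errors.append(abs(lhs - rhs))

print(errors)
```

The printed differences should be tiny, limited only by floating-point rounding and the truncation of the integral.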

The Fourier transform is linear, too, so we can start with our differential equation:

$L \ddot Q + R \dot Q + C^{-1} Q = V$

and take the Fourier transform of each term, getting

$\displaystyle{ \left((i\omega)^2 L + (i\omega) R + C^{-1}\right) \hat{Q}(\omega) = \hat{V}(\omega) }$

We can now solve for the charge in a completely painless way:

$\displaystyle{ \hat{Q}(\omega) = \frac{1}{((i\omega)^2 L + (i\omega) R + C^{-1})} \, \hat{V}(\omega) }$

Well, we actually solved for $\hat{Q}$ in terms of $\hat{V}.$ But if we’re good at taking Fourier transforms, this is good enough. And it has a deep inner meaning.

To see its inner meaning, note that the Fourier transform of an oscillating function $\exp(i \omega_0 t)$ is a delta function at the frequency $\omega = \omega_0.$ This says that this oscillating function is purely of frequency $\omega_0,$ like a laser beam of one pure color, or a sound of one pure pitch.

Actually there’s a little fudge factor due to how I defined the Fourier transform: if

$f(t) = e^{i\omega_0 t}$

then

$\displaystyle{ \hat{f}(\omega) = \sqrt{2 \pi} \, \delta(\omega - \omega_0) }$

But it’s no big deal. (You can define your Fourier transform so the $2\pi$ doesn’t show up here, but it’s bound to show up somewhere.)

Also, you may wonder how the complex numbers got into the game. What would it mean to say the voltage is $\exp(i \omega t)?$ The answer is: don’t worry, everything in sight is linear, so we can take the real or imaginary part of any equation and get one that makes physical sense.

Anyway, what does our relation

$\displaystyle{ \hat{Q}(\omega) = \frac{1}{((i\omega)^2 L + (i\omega) R + C^{-1})} \hat{V}(\omega) }$

mean? It means that if we put an oscillating voltage of frequency $\omega_0$ across our circuit, like this:

$V(t) = e^{i \omega_0 t}$

then we’ll get an oscillating charge at the same frequency, like this:

$\displaystyle{ Q(t) = \frac{1}{((i\omega_0)^2 L + (i\omega_0) R + C^{-1})} e^{i \omega_0 t} }$

To see this, just use the fact that the Fourier transform of $\exp(i \omega_0 t)$ is essentially a delta function at $\omega_0,$ and juggle the equations appropriately!

But the magnitude and phase of this oscillating charge $Q(t)$ depend on the function

$\displaystyle{ \frac{1}{((i\omega_0)^2 L + (i\omega_0) R + C^{-1})} }$

For example, $Q(t)$ will be big when $\omega_0$ is near a pole of this function! We can use this to study the resonant frequency of our circuit.
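Here is a small sketch of that idea. The component values are made up for illustration, and the names `denominator`, `transfer`, `poles` and `peak` are just labels for this example. It finds the poles by solving the quadratic $(i\omega)^2 L + (i\omega) R + C^{-1} = 0$ for $\omega,$ then scans real frequencies to see where the response is largest.

```python
import cmath

def denominator(omega, L, R, C):
    """(i omega)^2 L + (i omega) R + 1/C, for real or complex omega."""
    return (1j * omega) ** 2 * L + (1j * omega) * R + 1.0 / C

def transfer(omega, L, R, C):
    """Ratio of Q-hat to V-hat at angular frequency omega."""
    return 1.0 / denominator(omega, L, R, C)

# Illustrative component values (assumptions, not taken from the circuit above).
L_, R_, C_ = 1.0, 0.2, 1.0

# Setting the denominator to zero and solving the quadratic in omega gives
# omega = (i R +/- sqrt(4 L / C - R^2)) / (2 L).
root = cmath.sqrt(4 * L_ / C_ - R_ ** 2)
poles = [(1j * R_ + root) / (2 * L_), (1j * R_ - root) / (2 * L_)]

# Scan real driving frequencies and find where |Q-hat / V-hat| is largest.
omegas = [k * 0.001 for k in range(1, 3001)]
peak = max(omegas, key=lambda w: abs(transfer(w, L_, R_, C_)))

print("poles:", poles)
print("largest response near omega =", peak)
```

With these values the poles sit slightly off the real axis, with imaginary part $R/2L,$ so a real driving frequency can only get near them. That is why the response is large but finite, and why more resistance means a less dramatic peak.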

The same idea works for many more complicated circuits, and other things too. The function up there is an example of a transfer function: it describes the response of a linear, time-invariant system to an input of a given frequency. Here the ‘input’ is the voltage and the ‘response’ is the charge.
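As a sanity check on the formula for $Q(t),$ here is a minimal sketch that integrates the differential equation directly and compares the result with what the transfer function predicts. Everything specific here is an assumption for illustration: the component values, the driving frequency, the choice of a fourth-order Runge-Kutta integrator, and the decision to drive the circuit with the real part $\cos(\omega_0 t)$ of $e^{i\omega_0 t}.$

```python
import cmath
import math

L_, R_, C_ = 1.0, 0.5, 2.0   # illustrative component values
omega0 = 1.0                 # illustrative driving frequency

def transfer(omega):
    """Ratio of Q-hat to V-hat for the series RLC circuit."""
    return 1.0 / ((1j * omega) ** 2 * L_ + (1j * omega) * R_ + 1.0 / C_)

def deriv(t, q, i):
    # Rewrite L Q'' + R Q' + Q/C = V as a first-order system:
    # Q' = I  and  I' = (V - R I - Q/C) / L,  with V(t) = cos(omega0 t).
    return i, (math.cos(omega0 * t) - R_ * i - q / C_) / L_

def simulate(t_end, dt=0.01):
    """Classical fourth-order Runge-Kutta, starting from Q = 0 and I = 0."""
    t, q, i = 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = deriv(t, q, i)
        k2 = deriv(t + dt / 2, q + dt / 2 * k1[0], i + dt / 2 * k1[1])
        k3 = deriv(t + dt / 2, q + dt / 2 * k2[0], i + dt / 2 * k2[1])
        k4 = deriv(t + dt, q + dt * k3[0], i + dt * k3[1])
        q += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        i += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
    return t, q

# Run long enough for the start-up transient to die away.
t, q_numeric = simulate(t_end=60.0)

# Real part of the predicted response to exp(i omega0 t).
q_formula = (transfer(omega0) * cmath.exp(1j * omega0 * t)).real

print(q_numeric, q_formula)
```

The two numbers agree only after the transient from the initial conditions has decayed. The transfer function describes the steady oscillation, not the start-up behavior.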

### Impedance

Taking this idea to its logical conclusion, we can see inductors and capacitors as being resistors with a frequency-dependent, complex-valued resistance! This generalized resistance is called ‘impedance’. Let’s see how it works.

Suppose we have an electrical circuit. Consider any edge $e$ of this circuit:

• If our edge $e$ is labelled by a resistor of resistance $R$:

then

$V_e = R I_e$

Taking Fourier transforms, we get

$\hat{V}_e = R \hat{I}_e$

so nothing interesting here: our resistor acts like a resistor of resistance $R$ no matter what the frequency of the voltage and current is!

• If our edge $e$ is labelled by an inductor of inductance $L$:

then

$\displaystyle{ V_e = L \frac{d I_e}{d t} }$

Taking Fourier transforms, we get

$\hat{V}_e = (i\omega L) \hat{I}_e$

This is interesting: our inductor acts like a resistor of resistance $i \omega L$ when the frequency of the current and voltage is $\omega.$ So, we say the ‘impedance’ of the inductor is $i \omega L.$

• If our edge $e$ is labelled by a capacitor of capacitance $C$:

we have

$\displaystyle{ I_e = C \frac{d V_e}{d t} }$

Taking Fourier transforms, we get

$\hat{I}_e = (i\omega C) \hat{V}_e$

or

$\displaystyle{ \hat{V}_e = \frac{1}{i \omega C} \hat{I_e} }$

So, our capacitor acts like a resistor of resistance $1/(i \omega C)$ when the frequency of the current and voltage is $\omega.$ We say the ‘impedance’ of the capacitor is $1/(i \omega C).$

It doesn’t make sense to talk about the impedance of a voltage source or current source, since these circuit elements don’t give a linear relation between voltage and current. But whenever an element is linear and its properties don’t change with time, the Fourier transformed voltage will be some function of frequency times the Fourier transformed current. And in this case, we call that function the impedance of the element. The symbol for impedance is $Z,$ so we have

$\hat{V}_e(\omega) = Z(\omega) \hat{I}_e(\omega)$

or

$\hat{V}_e = Z \hat{I}_e$

for short.
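To tie this back to the series circuit from earlier, here is a small sketch. It assumes the three impedances simply add when the elements are in series, which follows from the fact that the same current flows through each and the voltages add. The function names and component values are invented for this example.

```python
def z_resistor(R):
    """Impedance of a resistor: the constant R at every frequency."""
    return lambda omega: complex(R)

def z_inductor(L):
    """Impedance of an inductor: i omega L."""
    return lambda omega: 1j * omega * L

def z_capacitor(C):
    """Impedance of a capacitor: 1 / (i omega C)."""
    return lambda omega: 1.0 / (1j * omega * C)

def z_series(*impedances):
    # In series the current is shared and the voltages add,
    # so the impedances add too.
    return lambda omega: sum(z(omega) for z in impedances)

# Illustrative component values and test frequency (assumptions).
R_, L_, C_ = 0.5, 1.0, 2.0
omega = 1.3

z_total = z_series(z_resistor(R_), z_inductor(L_), z_capacitor(C_))

# Since I = dQ/dt, we have I-hat = (i omega) Q-hat, so
# V-hat = Z I-hat = (i omega Z) Q-hat.  This should match the
# factor multiplying Q-hat in the Fourier-transformed circuit equation.
from_impedance = 1j * omega * z_total(omega)
from_equation = (1j * omega) ** 2 * L_ + (1j * omega) * R_ + 1.0 / C_

print(from_impedance, from_equation)
```

So the transfer function found earlier is just $1/(i \omega Z)$ where $Z$ is the total impedance of the three elements in series.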

### The big picture

In case you’re getting lost in the details, here are the big lessons for today:

• There’s a detailed analogy between electronics and mechanics, which we’ll later extend to many other systems.

• The study of linear time-independent elements can be reduced to the study of resistors if we generalize resistance to impedance by letting it be a complex-valued function instead of a real number.

One thing we’re doing is preparing for a general study of linear time-independent open systems. We’ll use linear algebra, but the field—the number system in our linear algebra—will consist of complex-valued functions, rather than real numbers.

### Puzzle

Let’s not forget our original problem:

This is closely related to the problem we just solved. All the equations we derived still hold! But if you do the math, or use some intuition, you’ll see the voltage source ensures that the voltage we’ve been calling $V$ is a constant. So, the current $I$ flowing around the wire obeys the same equation we got before:

$L \ddot Q + R \dot Q + C^{-1} Q = V$

where $\dot Q = I.$ The only difference is that now $V$ is constant.

Puzzle. Solve this equation for $Q(t).$

There are lots of ways to do this. You could use a Fourier transform, which would give a satisfying sense of completion to this blog article. Or, you could do it some other way.

## Network Theory (Part 27)

3 April, 2013

This quarter my graduate seminar at UCR will be about network theory. I have a few students starting work on this, so it seems like a good chance to think harder about the foundations of the subject. I’ve decided that bicategories of spans play a basic role, so I want to talk about those.

If you haven’t read the series up to now, don’t worry! Nothing I do for a while will rely on that earlier stuff. I want a fresh start. But just for a minute, I want to talk about the big picture: how the new stuff will relate to the old stuff.

So far this series has been talking about three closely related kinds of networks:

but there are many other kinds of networks, and I want to bring some more into play:

These come from the world of control theory and engineering—especially electrical engineering, but also mechanical, hydraulic and other kinds of engineering.

My goal is not to tour different formalisms, but to integrate them into a single framework, so we can easily take ideas and theorems from one discipline and apply them to another.

For example, in Part 16 we saw that a special class of Markov processes can also be seen as a special class of circuit diagrams: namely, electrical circuits made of resistors. Also, in Part 17 we saw that stochastic Petri nets and stochastic reaction networks are just two different ways of talking about the same thing. This allows us to take results from chemistry—where they like stochastic reaction networks, which they call ‘chemical reaction networks’—and apply them to epidemiology, where they like stochastic Petri nets, which they call ‘compartmental models’.

As you can see, fighting through the thicket of terminology is half the battle here! The problem is that people in different applied subjects keep reinventing the same mathematics, using terminologies specific to their own interests… making it harder to see how generally applicable their work actually is. But we can’t blame them for doing this. It’s the job of mathematicians to step in, learn all this stuff, and extract the general ideas.

We can see a similar thing happening when writing was invented in ancient Mesopotamia, around 3000 BC. Different trades invented their own numbering systems! A base-60 system, the S system, was used to count most discrete objects, such as sheep or people. But for ‘rations’ such as cheese or fish, they used a base-120 system, the B system. Another system, the ŠE system, was used to measure quantities of grain. There were about a dozen such systems! Only later did they get standardized.

### Circuit diagrams

But enough chit-chat; let’s get to work. I want to talk about circuit diagrams—diagrams of electrical circuits. They can get really complicated:

This is a 10-watt audio amplifier with bass boost. It looks quite intimidating. But I’ll start with a simple class of circuit diagrams, made of just a few kinds of parts:

• resistors,
• inductors,
• capacitors,
• voltage sources

and maybe some others later on. I’ll explain how you can translate any such diagram into a system of differential equations that describes how the voltages and currents along the wires change with time.

This is something you’d learn in a basic course on electrical engineering, at least back in the old days before analogue circuits had been largely replaced by digital ones. But my goal is different. I’m not mainly interested in electrical circuits per se: to me the important thing is how circuit diagrams provide a pictorial way of reasoning about differential equations… and how we can use the differential equations to describe many kinds of systems, not just electrical circuits.

So, I won’t spend much time explaining why electrical circuits do what they do—see the links for that. I’ll focus on the math of circuit diagrams, and how they apply to many different subjects, not just electrical circuits.

This describes a current flowing around a loop of wire with 4 elements on it: a resistor, an inductor, a capacitor, and a voltage source—for example, a battery. Each of these elements is designated by a cute symbol, and each has a real number associated to it:

• This is a resistor:

and it comes with a number $R,$ called its resistance.

• This is an inductor:

and it comes with a number $L,$ called its inductance.

• This is a capacitor:

and it comes with a number $C,$ called its capacitance.

• This is a voltage source:

and it comes with a number $V,$ called its voltage.

You may wonder why inductance got called $L$ instead of $I.$ Well, it’s probably because $I$ stands for ‘current’. And then you’ll ask why current is called $I$ instead of $C.$ I don’t know: maybe because $C$ stands for ‘capacitance’. If every word started with its own unique letter, we wouldn’t have these problems. But then we wouldn’t need words.

Here’s another example:

This example has two new features. First, it has places where wires meet, drawn as black dots. These dots are often called nodes, or sometimes vertices. Since ‘vertex’ starts with V and so does ‘voltage’, let’s call the dots ‘nodes’. Roughly speaking, a graph is a thing with nodes and edges, like this:

This suggests that in our circuit, the wires with elements on them should be seen as edges of a graph. Or perhaps just the wires should be seen as edges, and the elements should be seen as nodes! This is an example of a ‘design decision’ we have to make when formalizing the theory of circuit diagrams. There are also various different precise definitions of ‘graph’, and we need to try to choose the best one.

A second new feature of this example is that it has some white dots called terminals, where wires end. Mathematically these terminals are also nodes in our graph, but they play a special role: they are places where we are allowed to connect this circuit to another circuit. You’ll notice this circuit doesn’t have a voltage source. So, it’s like a piece of electrical equipment without its own battery. We need to plug it in for it to do anything interesting!

This is very important. Big complicated electrical circuits are often made by hooking together smaller ones. The pieces are best thought of as ‘open systems’: that is, physical systems that interact with the outside world. Traditionally, a lot of physics focuses on ‘closed systems’, which don’t interact with the outside world—the part of the world we aren’t modeling. But network theory is all about how we can connect open systems together to form larger open systems (or closed systems). And this is one place where category theory shows up. As we’ll see, we can think of an open system as a ‘morphism’ going from some inputs to some outputs, and we can ‘compose’ morphisms to get new morphisms by hooking them together.

### Differential equations from circuit diagrams

Let me sketch how to get a bunch of ordinary differential equations from a circuit diagram. These equations will say what the circuit does.

We start with a graph having some set $N$ of nodes and some set $E$ of edges. To say how much current is flowing along each edge it will be helpful to give each edge a direction, like this:

So, define a graph to consist of two functions

$s,t : E \to N$

Then each edge $e$ will have some node $s(e)$ as its source, or starting-point, and some node $t(e)$ as its target, or endpoint:

(This kind of graph is often called a directed multigraph or quiver, to distinguish it from other kinds, but I’ll just say ‘graph’.)

Next, each edge is labelled by one of four elements: resistor, capacitor, inductor or voltage source. It’s also labelled by a real number, which we call the resistance, capacitance, inductance or voltage of that element. We will make this part prettier later on, so we can easily introduce more kinds of elements without any trouble.

Finally, we specify a subset $T \subseteq N$ and call these nodes terminals.
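If it helps to see this data laid out concretely, here is one possible way to store it. This is only a sketch: the class name `Circuit`, its field names, and the example values are invented here, and dictionaries stand in for the functions $s$ and $t.$

```python
from dataclasses import dataclass, field

# The four kinds of element an edge can be labelled by.
KINDS = {"resistor", "inductor", "capacitor", "voltage source"}

@dataclass
class Circuit:
    nodes: set                  # the set N
    source: dict                # s : E -> N, stored as edge -> node
    target: dict                # t : E -> N, stored as edge -> node
    label: dict                 # edge -> (kind of element, real number)
    terminals: set = field(default_factory=set)   # T, a subset of N

    def __post_init__(self):
        # s, t and the labelling must all be defined on the same set E.
        if not (self.source.keys() == self.target.keys() == self.label.keys()):
            raise ValueError("s, t and the labels must share one edge set")
        for e in self.source:
            if self.source[e] not in self.nodes or self.target[e] not in self.nodes:
                raise ValueError(f"edge {e!r} has an endpoint outside N")
            if self.label[e][0] not in KINDS:
                raise ValueError(f"edge {e!r} has an unknown kind of element")
        if not self.terminals <= self.nodes:
            raise ValueError("T must be a subset of N")

    @property
    def edges(self):
        """The set E, recovered from the domain of s."""
        return set(self.source)

# A resistor, inductor and capacitor in series, with a terminal at each end.
# The numerical values are placeholders.
series_rlc = Circuit(
    nodes={0, 1, 2, 3},
    source={"e1": 0, "e2": 1, "e3": 2},
    target={"e1": 1, "e2": 2, "e3": 3},
    label={
        "e1": ("resistor", 0.5),
        "e2": ("inductor", 1.0),
        "e3": ("capacitor", 2.0),
    },
    terminals={0, 3},
)

print(sorted(series_rlc.edges))
```

Because edges are keys rather than pairs of nodes, nothing stops two edges from sharing the same source and target, which is what makes this a multigraph.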

Our goal now is to write down some ordinary differential equations that say how a bunch of variables change with time. These variables come in two kinds:

• Each edge $e$ has a current running along it, which is a function of time denoted $I_e$. So, for each $e \in E$ we have a function

$I_e : \mathbb{R} \to \mathbb{R}$

• Each edge $e$ also has a voltage across it, which is a function of time denoted $V_e$. So, for each $e \in E$ we have a function

$V_e : \mathbb{R} \to \mathbb{R}$

We now write down a bunch of equations obeyed by these currents and voltages. First there are some equations called Kirchhoff’s laws:

Kirchhoff’s current law says that for each node that is not a terminal, the total current flowing into that node equals the total current flowing out. In other words:

$\displaystyle{ \sum_{e: t(e) = n} I_e = \sum_{e: s(e) = n} I_e }$

for each node $n \in N - T.$ We don’t impose Kirchhoff’s current law at terminals, because we want to allow current to flow in or out there!

Kirchhoff’s voltage law says that we can choose for each node a potential $\phi_n,$ which is a function of time:

$\phi_n : \mathbb{R} \to \mathbb{R}$

such that

$V_e = \phi_{s(e)} - \phi_{t(e)}$

for each $e \in E.$ In other words, the voltage across each edge is the difference of potentials at the two ends of this edge. This is a slightly nonstandard way to state Kirchhoff’s voltage law, but it’s equivalent to the usual one.
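Here is a rough sketch of both laws at a single instant of time, so that currents and potentials are just numbers. The function names and the sample values are made up for illustration, and the graph is the three-edge series chain with terminals at both ends.

```python
def current_law_violations(nodes, terminals, s, t, current, tol=1e-12):
    """Non-terminal nodes where total inflow differs from total outflow."""
    bad = []
    for n in sorted(nodes - terminals):
        inflow = sum(current[e] for e in s if t[e] == n)    # edges ending at n
        outflow = sum(current[e] for e in s if s[e] == n)   # edges starting at n
        if abs(inflow - outflow) > tol:
            bad.append(n)
    return bad

def voltages_from_potentials(s, t, phi):
    """Kirchhoff's voltage law: V_e = phi_{s(e)} - phi_{t(e)} for each edge."""
    return {e: phi[s[e]] - phi[t[e]] for e in s}

# Three edges in a row: 0 -> 1 -> 2 -> 3, with terminals at the ends.
nodes = {0, 1, 2, 3}
terminals = {0, 3}
s = {"e1": 0, "e2": 1, "e3": 2}
t = {"e1": 1, "e2": 2, "e3": 3}

# Equal currents on all edges satisfy the current law at nodes 1 and 2.
ok = current_law_violations(nodes, terminals, s, t, {"e1": 2.0, "e2": 2.0, "e3": 2.0})

# Unequal currents break it at node 2, where e2 flows in and e3 flows out.
broken = current_law_violations(nodes, terminals, s, t, {"e1": 2.0, "e2": 2.0, "e3": 1.0})

# Any choice of potentials gives voltages obeying the voltage law.
phi = {0: 5.0, 1: 3.0, 2: 2.5, 3: 0.0}
V = voltages_from_potentials(s, t, phi)

print(ok, broken, V)
```

Note that the voltages along the chain add up to the difference of the potentials at the two terminals, since the intermediate potentials cancel.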

In addition to Kirchhoff’s laws, there’s an equation for each edge, relating the current and voltage on that edge. The details of this equation depend on the element labelling that edge, so we consider the four cases in turn:

• If our edge $e$ is labelled by a resistor of resistance $R$:

we write the equation

$V_e = R I_e$

This is called Ohm’s law.

• If our edge $e$ is labelled by an inductor of inductance $L$:

we write the equation

$\displaystyle{ V_e = L \frac{d I_e}{d t} }$

I don’t know a name for this equation, but you can read about it here.

• If our edge $e$ is labelled by a capacitor of capacitance $C$:

we write the equation

$\displaystyle{ I_e = C \frac{d V_e}{d t} }$

I don’t know a name for this equation, but you can read about it here.

• If our edge $e$ is labelled by a voltage source of voltage $V$:

we write the equation

$V_e = V$

This explains the term ‘voltage source’.
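The whole recipe is mechanical enough that a short program can write the equations out. Here is a minimal sketch that produces them as plain strings. The function name, the string format, and the use of symbolic labels like `"R"` are choices made for this example only.

```python
# One equation template per kind of element, as in the four cases above.
ELEMENT_LAWS = {
    "resistor": "V_{e} = {v} * I_{e}",
    "inductor": "V_{e} = {v} * dI_{e}/dt",
    "capacitor": "I_{e} = {v} * dV_{e}/dt",
    "voltage source": "V_{e} = {v}",
}

def circuit_equations(nodes, terminals, s, t, label):
    """List the equations for a labelled graph, as human-readable strings."""
    equations = []
    # Kirchhoff's current law at each node that is not a terminal.
    for n in sorted(nodes - terminals):
        inflow = " + ".join(f"I_{e}" for e in sorted(s) if t[e] == n) or "0"
        outflow = " + ".join(f"I_{e}" for e in sorted(s) if s[e] == n) or "0"
        equations.append(f"{inflow} = {outflow}")
    # Kirchhoff's voltage law on each edge.
    for e in sorted(s):
        equations.append(f"V_{e} = phi_{s[e]} - phi_{t[e]}")
    # The law for the element labelling each edge.
    for e in sorted(s):
        kind, value = label[e]
        equations.append(ELEMENT_LAWS[kind].format(e=e, v=value))
    return equations

# A resistor, inductor and capacitor in series, with terminals at the ends.
equations = circuit_equations(
    nodes={0, 1, 2, 3},
    terminals={0, 3},
    s={"1": 0, "2": 1, "3": 2},
    t={"1": 1, "2": 2, "3": 3},
    label={"1": ("resistor", "R"), "2": ("inductor", "L"), "3": ("capacitor", "C")},
)

for line in equations:
    print(line)
```

This only writes the equations down. Simplifying or solving them is a separate job.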

### Puzzles

Next time we’ll look at some examples and see how we can start polishing up this formalism into something prettier. But you can get to work now:

Puzzle 1. Starting from the rules above, write down and simplify the equations for this circuit:

Puzzle 2. Do the same for this circuit:

Puzzle 3. If we added a fifth kind of element, our rules for getting equations from circuit diagrams would have more symmetry between voltages and currents. What is this extra element?

## Probability Theory and the Undefinability of Truth

31 March, 2013

In 1936 Tarski proved a fundamental theorem of logic: the undefinability of truth. Roughly speaking, this says there’s no consistent way to extend arithmetic so that it talks about ‘truth’ for statements about arithmetic. Why not? Because if we could, we could cook up a statement that says “I am not true.” This would lead to a contradiction, the Liar Paradox: if this sentence is true then it’s not, and if it’s not then it is.

This is why the concept of ‘truth’ plays a limited role in most modern work on logic… surprising as that might seem to novices!

However, suppose we relax a bit and allow probability theory into our study of arithmetic. Could there be a consistent way to say, within arithmetic, that a statement about arithmetic has a certain probability of being true?

We can’t let ourselves say a statement has a 100% probability of being true, or a 0% probability of being true, or we’ll get in trouble with the undefinability of truth. But suppose we only let ourselves say that a statement has some probability greater than $a$ and less than $b$, where $0 < a < b < 1.$ Is that okay?

Yes it is, according to this draft of a paper:

• Paul Christiano, Eliezer Yudkowsky, Marcello Herreshoff and Mihaly Barasz, Definability of “Truth” in Probabilistic Logic (early draft), 28 March 2013.

But there’s a catch, or two. First, there are many self-consistent ways to assess the probability of truth of arithmetic statements. This suggests that the probability is somewhat ‘subjective’. But that’s fine if you think probabilities are inherently subjective—for example, if you’re a subjective Bayesian.

A bit more problematic is this: their proof that there exists a self-consistent way to assess probabilities is not constructive. In other words, you can’t use it to actually get your hands on a consistent assessment.

Fans of game theory will be amused to hear why: the proof uses Kakutani’s fixed point theorem! This is the result that John Nash used to prove games have equilibrium solutions, where nobody can improve their expected payoff by changing their strategy. And this result is not constructive.

In game theory, we use Kakutani’s fixed point theorem by letting each player update their strategy, improving it based on everyone else’s, and showing this process has a fixed point. In probabilistic logic, the process is instead that the thinker reflects on what they know, and updates their assessment of probabilities.

### The statement

I have not yet carefully checked the proof of Barasz, Christiano, Herreshoff and Yudkowsky’s result. Some details have changed in the draft since I last checked, so it’s probably premature to become very nitpicky. But just to encourage technical discussions of this subject, let me try stating the result a bit more precisely. If you don’t know Tarski’s theorem, go here:

Tarski’s undefinability theorem, Wikipedia.

I’ll assume you know that and are ready for the new stuff!

The context of this work is first-order logic. So, consider any language $L$ in first-order logic that lets us talk about natural numbers and also rational numbers. Let $L'$ be the language $L$ with an additional function symbol $\mathbb{P}$ thrown in. We require that $\mathbb{P}(n)$ be a rational number whenever $n$ is a natural number. We want $\mathbb{P}(n)$ to stand for the probability of the truth of the sentence whose Gödel number is $n.$ This will give a system that can reflect on the probability that what it’s saying is true.

So, suppose $T$ is some theory in the language $L'.$ How can we say that the probability function $\mathbb{P}$ has ‘reasonable opinions’ about truth, assuming that the axioms of $T$ are true?

The authors have a nice way of answering this. First they consider any function $P$ assigning a probability to each sentence of $L'.$ They say that $P$ is coherent if there is a probability measure on the set of models of $L'$ such that $P(\phi)$ is the measure of the set of models in which $\phi$ is satisfied. They show that $P$ is coherent iff these three conditions hold:

1) $P(\phi) = P(\phi \wedge \psi) + P(\phi \wedge \lnot \psi)$ for all sentences $\phi, \psi.$

2) $P(\phi) = 1$ for each tautology.

3) $P(\phi) = 0$ for each contradiction.

(By the way, it seems to me that 1) and 2) imply $P(\phi) + P(\lnot \phi) = 1$ and thus 3). So either they’re giving a slightly redundant list of conditions because they feel in the mood for it, or they didn’t notice this list was redundant, or it’s not and I’m confused. It’s good to always say a list of conditions is redundant if you know it is. You may be trying to help your readers a bit, and it may seem obvious to you, but if you don’t come out and admit the redundancy, you’ll make some of your readers doubt their sanity.)

(Also by the way, they don’t say how they’re making the set of all models into a measurable space. But I bet they’re using the σ-algebra where all subsets are measurable, and I think there’s no problem with the fact that this set is very large: a proper class, I guess! If you believe in the axiom of universes, you can just restrict attention to ‘small’ models… and your probability measure will be supported on a countable set of models, since an uncountable sum of positive numbers always diverges, so the largeness of the set of these models is largely irrelevant.)
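To get a feel for coherence, here is a toy version. It is a deliberately simplified sketch: instead of first-order sentences of $L'$ it uses propositional sentences in two atoms, so there are only four ‘models’, and the weights on them are arbitrary numbers chosen for illustration. All the names in the code are invented for this example.

```python
from fractions import Fraction
from itertools import product

# Toy setting: two propositional atoms, so a 'model' is a truth assignment.
ATOMS = ("p", "q")
MODELS = [dict(zip(ATOMS, values)) for values in product([False, True], repeat=2)]

# An arbitrary probability measure on the four models (weights sum to 1).
WEIGHTS = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]

def P(sentence):
    """Measure of the set of models in which the sentence is satisfied.
    A 'sentence' here is just a function from models to truth values."""
    return sum((w for m, w in zip(MODELS, WEIGHTS) if sentence(m)), Fraction(0))

def Not(a):
    return lambda m: not a(m)

def And(a, b):
    return lambda m: a(m) and b(m)

def Or(a, b):
    return lambda m: a(m) or b(m)

p = lambda m: m["p"]
q = lambda m: m["q"]

phi = Or(p, q)
psi = p
tautology = Or(p, Not(p))
contradiction = And(p, Not(p))

# Condition 1, checked for every pair drawn from a few sample sentences.
samples = [p, q, phi, Not(phi), tautology, contradiction]
condition_1 = all(
    P(a) == P(And(a, b)) + P(And(a, Not(b))) for a in samples for b in samples
)

print(condition_1, P(tautology), P(contradiction), P(phi) + P(Not(phi)))
```

Any $P$ built from a measure on models in this way passes the three conditions automatically. The content of the authors' result is the converse direction, which this toy does not test.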

So, let’s demand that $P$ be coherent. And let’s demand that $P(\phi) = 1$ whenever the sentence $\phi$ is one of the axioms of $T.$

At this point, we’ve got this thing $P$ that assigns a probability to each sentence in our language. We’ve also got this thing $\mathbb{P}$ in our language, such that $\mathbb{P}(n)$ is trying to be the probability of the truth of the sentence whose Gödel number is $n.$ But so far these two things aren’t connected.

To connect them, they demand a reflection principle: for any sentence $\phi$ and any rational numbers $0 < a < b < 1,$

$a < P(\phi) < b \implies P(a < \mathbb{P}(\ulcorner \phi \urcorner) < b) = 1$

Here $\ulcorner \phi \urcorner$ is the Gödel number of the sentence $\phi.$ So, this principle says that if a sentence has some approximate probability of being true, the thinker—as described by $\mathbb{P}$—knows this. They can’t know precise probabilities, or we’ll get in trouble. Also, making the reflection principle into an if and only if statement:

$a < P(\phi) < b \iff P(a < \mathbb{P}(\ulcorner \phi \urcorner) < b) = 1$

is too strong. It leads to contradictions, very much as in Tarski’s original theorem on the undefinability of truth! However, in the latest draft of the paper, the authors seem to have added a weak version of the converse to their formulation of the reflection principle.

Anyway, the main theorem they’re claiming is this:

Theorem (Barasz, Christiano, Herreshoff and Yudkowsky). There exists a function $P$ assigning a probability to each sentence of $L',$ such that

1) $P$ is coherent,

2) $P(\phi) = 1$ whenever the sentence $\phi$ is one of the axioms of $T,$

and

3) the reflection principle holds. 