Petri Net Programming (Part 3)

guest post by David Tanzer

The role of differential equations

Last time we looked at stochastic Petri nets, which use a random event model for the reactions. Individual entities are represented by tokens that flow through the network. When the token counts get large, we observed that they can be approximated by continuous quantities, which opens the door to the application of continuous mathematics to the analysis of network dynamics.

A key result of this approach is the “rate equation,” which gives a law of motion for the expected sizes of the populations. Equilibrium can then be obtained by solving for zero motion. The rate equations are applied in chemistry, where they give the rates of change of the concentrations of the various species in a reaction.

But before discussing the rate equation, here I will talk about the mathematical form of this law of motion, which consists of differential equations. This form is naturally associated with deterministic systems involving continuous magnitudes. This includes the equations of motion for the sine wave:

and the graceful ellipses that are traced out by the orbits of the planets around the sun:

This post provides some mathematical context to programmers who have not worked on scientific applications. My goal is to get as many of you on board as possible, before setting sail with Petri net programming.

Three approaches to equations: theoretical, formula-based, and computational

Let’s first consider the major approaches to equations in general. We’ll illustrate with a Diophantine equation

x^9 + y^9 + z^9 = 2

where x, y and z are integer variables.

In the theoretical approach (aka “qualitative analysis”), we start with the meaning of the equation and then proceed to reason about its solutions. Here are some simple consequences of this equation. They can’t all be zero, can’t all be positive, can’t all be negative, can’t all be even, and can’t all be odd.

In the formula-based approach, we seek formulas to describe the solutions. Here is an example of a formula (which does not solve our equation):

\{(x,y,z) | x = n^3, y = 2n - 4, z = 4 n | 1 \leq n \leq 5 \}

Such formulas are nice to have, but the pursuit of them is diabolically difficult. In fact, for Diophantine equations, even the question of whether an arbitrarily chosen equation has any solutions whatsoever has been proven to be algorithmically undecidable.

Finally, in the computational approach, we seek algorithms to enumerate or numerically approximate the solutions to the equations.

The three approaches to differential equations

Let’s apply the preceding classification to differential equations.

Theoretical approach

A differential equation is one that constrains the rates at which the variables are changing. This can include constraints on the rates at which the rates are changing (second-order equations), etc. The equation is ordinary if there is a single independent variable, such as time, otherwise it is partial.

Consider the equation stating that a variable increases at a rate equal to its current value. The bigger it gets, the faster it increases. Given a starting value, this determines a process — the solution to the equation — which here is exponential growth.

Let X(t) be the value at time t, and let’s initialize it to 1 at time 0. So we have:

X(0) = 1

X'(t) = X(t)

These are first-order equations, because the derivative is applied at most once to any variable. They are linear equations, because the terms on each side of the equations are linear combinations of either individual variables or derivatives (in this case all of the coefficients are 1). Note also that a system of differential equations may in general have zero, one, or multiple solutions. This example belongs to a class of equations which are proven to have a unique solution for each initial condition.

You could imagine more complex systems of equations, involving multiple dependent variables, all still depending on time. That includes the rate equations for a Petri net, which have one dependent variable for each of the population sizes. The ideas for such systems are an extension of the ideas for a single-variable system. Then, a state of the system is a vector of values, with one component for each of the dependent variables. For first-order systems, such as the rate equations, where the derivatives appear on the left-hand sides, the equations determine, for each possible state of the system, a “direction” and rate of change for the state of the system.

Now here is a simple illustration of what I called the theoretical approach. Can X(t) ever become negative? No, because it starts out positive at time 0, and in order to later become negative, it must be decreasing at a time t_1 when it is still positive. That is to say, X(t_1) > 0, and X'(t_1) < 0. But that contradicts the assumption X'(t) = X(t). The general lesson here is that we don’t need a solution formula in order to make such inferences.

For the rate equations, the theoretical approach leads to substantial theorems about the existence and structure of equilibrium solutions.

Formula-based approach

It is natural to look for concise formulas to solve our equations, but the results of this overall quest are largely negative. The exponential differential equation cannot be solved by any formula that involves a finite combination of simple operations. So the solution function must be treated as a new primitive, and given a name, say \exp(t). But even when we extend our language to include this new symbol, there are many differential equations that remain beyond the reach of finite formulas. So an endless collection of primitive functions is called for. (As standard practice, we always include exp(t), and its complex extensions to the trigonometric functions, as primitives in our toolbox.)

But the hard mathematical reality does not end here, because even when solution formulas do exist, finding them may call for an ace detective. Only for certain classes of differential equations, such as the linear ones, do we have systematic solution methods.

The picture changes, however, if we let the formulas contain an infinite number of operations. Then the arithmetic operators give a far-reaching base for defining new functions. In fact, as you can verify, the power series

X(t) = 1 + t + t^2/2! + t^3/3! + ...

which we view as an “infinite polynomial” over the time parameter t, exactly satisfies our equations for exponential motion, X(0) = 1 and X'(t) = X(t). This power series therefore defines \exp(t). By the way, applying it to the input 1 produces a definition for the transcendental number e:

e = X(1) = 1 + 1 + 1/2 + 1/6 + 1/24 + 1/120 + ... \approx 2.71828

Computational approach

Let’s leave aside our troubles with formulas, and consider the computational approach. For broad classes of differential equations, there are approximation algorithms that be successfully applied.

For starters, any power series that satisfies a differential equation may work for a simple approximation method. If a series is known to converge over some range of inputs, then one can approximate the value at those points by stopping the computation after a finite number of terms.

But the standard methods work directly with the equations, provided that they can be put into the right form. The simplest one is called Euler’s method. It works over a sampling grid of points separated by some small number \epsilon. Let’s take the case where we have a first-order equation in explicit form, which means that X'(t) = f(X(t)) for some function f.

We begin with the initial value X(0). Applying f, we get X'(0) = f(X(0)). Then for the interval from 0 to \epsilon,we use a linear approximation for X(t), by assuming that the derivative remains constant at X'(0). That gives X(\epsilon) = X(0) + \epsilon \cdot X'(0). Next, X'(\epsilon) = f(X(\epsilon)), and X(2 \epsilon) = X(\epsilon) + \epsilon \cdot X'(\epsilon), etc. Formally,

X(0) = \textrm{initial}

X((n+1) \epsilon) = X(n \epsilon) + \epsilon f(X(n \epsilon))

Applying this to our exponential equation, where f(X(t)) = X(t), we get:

X(0) = 1

X((n+1) \epsilon) = X(n \epsilon) + \epsilon X(n \epsilon) = X(n \epsilon) (1 + \epsilon)

Hence:

X(n \epsilon) = (1 + \epsilon) ^ n

So the approximation method gives a discrete exponential growth, which converges to a continuous exponential in the limit as the mesh size goes to zero.

Note, the case we just considered has more generality than might appear at first, because (1) the ideas here are easily extended to systems of explicit first order equations, and (2) higher-order equations that are “explicit” in an extended sense—meaning that the highest-order derivative is expressed as a function of time, of the variable, and of the lower-order derivatives—can be converted into an equivalent system of explicit first-order equations.

The challenging world of differential equations

So, is our cup half-empty or half-full? We have no toolbox of primitive formulas for building the solutions to all differential equations by finite compositions. And even for those which can be solved by formulas, there is no general method for finding the solutions. That is how the cookie crumbles. But on the positive side, there is an array of theoretical tools for analyzing and solving important classes of differential equations, and numerical methods can be applied in many cases.

The study of differential equations leads to some challenging problems, such as the Navier-Stokes equations, which describe the flow of fluids.

These are partial differential equations involving flow velocity, pressure, density and external forces (such as gravity), all of which vary over space and time. There are non-linear (multiplicative) interactions between these variables and their spatial and temporal derivatives, which leads to complexity in the solutions.

At high flow rates, this complexity can produce chaotic solutions, which involve complex behavior at a wide range of resolution scales. This is turbulence. Here is an insightful portrait of turbulence, by Leonardo da Vinci, whose studies in turbulence date back to the 15th Century.

Turbulence, which has been described by Richard Feynman as the most important unsolved problem of classical physics, also presents a mathematical puzzle. The general existence of solutions to the Navier-Stokes equations remains unsettled. This is one of the “Millennium Prize Problems”, for which a one million dollar prize is offered: in three dimensions, given initial values for the velocity and scalar fields, does there exist a solution that is smooth and everywhere defined? There are also complications with grid-based numerical methods, which will fail to produce globally accurate results if the solutions contain details at a smaller scale than the grid mesh. So the ubiquitous phenomenon of turbulence, which is so basic to the movements of the atmosphere and the seas, remains an open case.

But fortunately we have enough traction with differential equations to proceed directly with the rate equations for Petri nets. There we will find illuminating equations, which are the subject of both major theorems and open problems. They are non-linear and intractable by formula-based methods, yet, as we will see, they are well handled by numerical methods.

11 Responses to Petri Net Programming (Part 3)

  1. I think this is a very well written and extremely clear post! Had i still been a teacher i would have certainly recommended it to my students. Thanks David.

    One thing that i’d like to mention about the formula-based approach is that in certain cases the formula exist but become so long (pages) that it becomes practically useless (this happens for example in Robotics).

    On the other hand the computational approach abstracts away the solution process, thereby focusing more on the differential equations that define the system, so i personally like it a little bit more from a pedagogical perspective.

    • David Tanzer says:

      Hi Giampiero, I am curious about the long formulas that you mention. What kinds of equations do you find in Robotics that have long solution formulas, and what kinds of problems lead to these equations?

      • Giampiero Campa says:

        Hi David, that’s easy. Try to write the equations of motion for a 4 (or more) DOF robotic arm, and then you’ll see by yourself …

        For example, have a look at this old blog post that describes just a double pendulum, and imagine what happens when you add a couple or more links to it, and maybe some translational and rotational joints …

        You can derive the equations with symbolic tools (e.g Mathematica, Mupad, Maple), and there are people successfully doing that. However, as you can imagine, an equation that span 3 full pages is not that insightful. Also, from a purely computational perspective, you quickly hit a wall where just evaluating that formula takes more time than actually integrating the ODEs or DAEs (at least that was true when i was more involved in these things about 15 years ago, but i suspect that it’s still true, even though i hear that symbolic-based algorithms have advanced in the meantime).

        Finally, imagine your robotic arm has an end effector (an hand) that at some point picks something up. When that happens, the mass changes, and you have to recalculate the numerical parameters of your symbolic equations, while if you are just integrating an ODE you can just go on with a different value of the mass parameter for your last link. Similarly, if your arm (or leg) touches the ground, the structure of your equations drastically and instantly changes, so if you are using a symbolic approach you have to recalculate all the equations, while if you are integrating the ODEs, you can just more or less change the structure on the fly (to a DAE) and then keep on with the integration.

        • Eugene says:

          What’s a DAE? Is it a differential-algebraic equation? In other words, an implicit differential equation?

        • Yes, a Differential Algebraic Equation, sorry should have specified it above. Also DOF means degrees of freedom, in the mechanical sense.

        • Giampiero Campa says:

          By the way, not sure why you call it “implicit”. To me a DAE is just an ODE with additional (typically equality) constraints, which force the solution into a subspace or subset. In the first case (subspace) this is equivalent to applying a projection operator, which results in a lower dimensional ODE.

        • Eugene says:

          I don’t know much about DAEs, but I was under the impression that equations of the form f(y, y', y'', \ldots y^{(n)})) =0, where f is a function of n+1 variables, are considered to be differential algebraic equations

  2. John Baez says:

    I like your trinity of approaches, David: theoretical, formula-based and computational. Some people may think of the first two as ‘math’ and the third as ‘computer science’. So, it’s perhaps worth noting that the theoretical approach is closely linked to the computational approach. You gave a great example here. Your simple algorithm for approximately computing the solution of

    X'(t) = X(t), \qquad X(0) = 1

    is closely linked to the proof that a solution exists. To convert the algorithm into a proof of existence, we need to show that the approximate solution with step size \epsilon gets closer and closer to an actual solution when \epsilon \to 0. Conversely, we need to show this to be sure our algorithm is a good one: if chose a bad algorithm, the ‘approximate solution’ might get worse and worse as \epsilon \to 0.

    Furthermore, we’d really like to go a bit further, and get an estimate of how close our approximation is to the actual solution, depending on the step size we choose.

    All this is fairly easy for this particular differential equation… mainly because it’s so famous that its solution has a name, and we all learn lots of properties of this function. For example, every sufficiently detailed textbook on calculus includes a proof that

    \displaystyle{ \lim_{n \to +\infty} (1 + \frac{1}{n})^n = e}

    where e may be defined some other way, e.g. by the property

    \displaystyle{ \int_1^e \frac{1}{x} \, dx = 1}

    (By the way, did you know that Euler gave e its name in a paper where the first variable was called A, the second one B, and so on, until the fifth one was the famous number E? Coincidence… or clever branding strategy?)

    However, many of the same ideas generalize to differential equations of the form

    X'(t) = f(X(t))

    where f is any ‘sufficiently nice’ function, for example any Lipschitz function. We can even generalize them to equations of the form

    X'(t) = f(X(t), t)

    where f is continuous in both variables, and Lipschitz in the first. This is the Picard existence and uniqueness theorem, which also lets us create algorithms for solving differential equations like this, and prove their properties.

    And as you note, this idea can be used to study higher-order differential equations, since those can be viewed as systems of first-order ones. It can also be used to study a bunch of nonlinear partial differential equations, since those can be viewed as ordinary differential equations for functions taking value in a space of functions!

    (In fact this is what I used to work on, back when I was a postdoc. When U. C. Riverside hired me, they were hiring a guy who worked on nonlinear partial differential equations.)

    The case of the Navier–Stokes equations is very important and frustrating. On the theoretical side, we don’t know solutions exist for all times, because the nonlinearity is too strong to apply conservation laws to prove the solutions don’t blow up. On the computational side, weather forecasters have trouble knowing what’s the best approach to subgrid-scale modeling. Subgrid-scale modeling is a name for ‘guessing how to model fluid flow on length scales smaller than our grid size’. If we knew more about the theory of the Navier–Stokes equations, we might have more systematic ways to do subgrid-scale modeling. But apparently this area holds many mysteries.

    • domenico says:

      I don’t know if exist already a my old toy model.
      I tried to solve a general differential equation, using the approximation of the surface in the derivative space with a plane: this is a mix of formula-based approach, and a computational approach:
      0 = F(y,\dot y, \ddot y,\cdots)
      then near the initial value:
      0 =\left.\frac{\partial F}{\partial y}\right|_0 (y-y_0)+\left.\frac{\partial F}{\partial \dot y}\right|_0 (\dot y-\dot y_0)+\left.\frac{\partial F}{\partial \ddot y}\right|_0(\ddot y-\ddot y_0)+\cdots
      so that is possible the solution with the power series:
      y(t) = y_0+\dot y_0+\frac{\ddot y}{2}+\sum_n a_n \frac{t^n}{n!}
      but is simple to obtain that:
      a_n = -\frac{\left.\frac{\partial F}{\partial \dot y}\right|_0}{\left.\frac{\partial F}{\partial \ddot y}\right|_0} a_{n-1}-\frac{\left.\frac{\partial F}{\partial y}\right|_0}{\left.\frac{\partial F}{\partial \dot y}\right|_0} a_{n-2}
      it is possible, for iteration, to obtain the solution as power series near the initial point.
      I think that can be generalized to higher dimension space (with more variable y,z and more derivative), but I have never verified (for example with Navier-Stokes equations).

      • domenico says:

        I am thinking today that there is a differential equation, and so a function, for each

        0=F(y,\dot y,\ddot y)

        have the local solution

        a_n =-\frac{\left.\frac{\partial F}{\partial \dot y}\right|_0}{\left.\frac{\partial F}{\partial \ddot y}\right|_0} a_{n-1}-\frac{\left.\frac{\partial F}{\partial y}\right|_0}{\left.\frac{\partial F}{\partial \ddot y}\right|_0} a_{n-2}

        If the parameters are constant, and equal to 1, then these are Fibonacci numbers (or more complex sequences):

        a_n =a_{n-1}+a_{n-2}

        then the parameters are (using matrix calculus to obtain the eigenvalues):

        a_n=\frac{1}{\sqrt{5}}\left[\left(\frac{1+\sqrt{5}}{2}\right)^{n+1}-\left(\frac{1-\sqrt{5}}{2}\right)^{n+1}\right]

        then the differential equation (Fibonacci differential equation) is:

        0 = \ddot y -\dot y- y

        and the solution is:

        y(t)=A\ e^{\frac{1-\sqrt{5}}{2}t}+ B e^{\frac{1+\sqrt{5}}{2}t}

        there is a countable number of differential equation, and the nth number can be obtained (for each differential equation) from the solution of the differential equation:

        a_n = \left.\frac{\partial y(t)}{\partial t}\right|_0

        If all is right, it seem to me interesting.

  3. [...] Here is a post (via John Baez’s Google+ share) that describes the three approaches to equations: what the post calls theoretical and formula based can also be called as  analysis and seeking analytical solution respectively. There are, in a similar manner, three approaches to doing science itself: theoretical, experimental and computational. The best approach, be it equations or science, is, of course, to combine all the three (in proportions according to taste). [...]

You can use HTML in your comments. You can also use LaTeX, like this: $latex E = m c^2 $. The word 'latex' comes right after the first dollar sign, with a space after it.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 2,805 other followers