Network Theory (Part 15)

Last time we saw how to get a graph whose vertices are states of a molecule and whose edges are transitions between states. We focused on two beautiful but not completely realistic examples that both give rise to the same highly symmetrical graph: the ‘Desargues graph’.

Today I’ll start with a few remarks about the Desargues graph. Then I’ll get to work showing how any graph gives:

• A Markov process, namely a random walk on the graph.

• A quantum process, where instead of having a probability to hop from vertex to vertex as time passes, we have an amplitude.

The trick is to use an operator called the ‘graph Laplacian’, a discretized version of the Laplacian which happens to be both infinitesimal stochastic and self-adjoint. As we saw in Part 12, such an operator will give rise both to a Markov process and a quantum process (that is, a one-parameter unitary group).

The most famous operator that’s both infinitesimal stochastic and self-adjoint is the Laplacian, \nabla^2. Because it’s both, the Laplacian shows up in two important equations: one in stochastic mechanics, the other in quantum mechanics.

• The heat equation:

\displaystyle{ \frac{d}{d t} \psi = \nabla^2 \psi }

describes how the probability \psi(x) of a particle being at the point x smears out as the particle randomly walks around:

The corresponding Markov process is called ‘Brownian motion’.

• The Schrödinger equation:

\displaystyle{ \frac{d}{d t} \psi = -i \nabla^2 \psi }

describes how the amplitude \psi(x) of a particle being at the point x wiggles about as the particle ‘quantumly’ walks around.

Both these equations have analogues where we replace space by a graph, and today I’ll describe them.

Drawing the Desargues graph

First I want to show you a nice way to draw the Desargues graph. For this it’s probably easiest to go back to our naive model of an ethyl cation:

Even though ethyl cations don’t really look like this, and we should be talking about some trigonal bipyramidal molecule instead, it won’t affect the math to come. Mathematically, the two problems are isomorphic! So let’s stick with this nice simple picture.

We can be a bit more abstract, though. A state of the ethyl cation is like having 5 balls, with 3 in one pile and 2 in the other. And we can focus on the first pile and forget the second, because whatever isn’t in the first pile must be in the second.

Of course a mathematician calls a pile of things a ‘set’, and calls those things ‘elements’. So let’s say we’ve got a set with 5 elements. Draw a red dot for each 2-element subset, and a blue dot for each 3-element subset. Draw an edge between a red dot and a blue dot whenever the 2-element subset is contained in the 3-element subset. We get the Desargues graph.

That’s true by definition. But I never proved that any of the pictures I showed you are correct! For example, this picture shows the Desargues graph:

but I never really proved this fact—and I won’t now, either.

To draw a picture we know is correct, it’s actually easier to start with a big graph that has vertices for all the subsets of our 5-element set. If we draw an edge whenever an n-element subset is contained in an (n+1)-element subset, the Desargues graph will be sitting inside this big graph.

Here’s what the big graph looks like:

This graph has 2^5 vertices. It’s actually a picture of a 5-dimensional hypercube! The vertices are arranged in columns. There’s

• one 0-element subset,

• five 1-element subsets,

• ten 2-element subsets,

• ten 3-element subsets,

• five 4-element subsets,

• one 5-element subset.

So, the numbers of vertices in each column go like this:

1 \quad 5 \quad 10 \quad 10 \quad 5 \quad 1

which is a row in Pascal’s triangle. We get the Desargues graph if we keep only the vertices corresponding to 2- and 3-element subsets, like this:

It’s less pretty than our earlier picture, but at least there’s no mystery to it. Also, it shows that the Desargues graph can be generalized in various ways. For example, there’s a theory of bipartite Kneser graphs H(n,k). The Desargues graph is H(5,2).

Desargues’ theorem

I can’t resist answering this question: why is it called the ‘Desargues graph’? This name comes from Desargues’ theorem, a famous result in projective geometry. Suppose you have two triangles ABC and abc, like this:

Suppose the lines Aa, Bb, and Cc all meet at a single point, the ‘center of perspectivity’. Then the point of intersection of ab and AB, the point of intersection of ac and AC, and the point of intersection of bc and BC all lie on a single line, the ‘axis of perspectivity’. The converse is true too. Quite amazing!

The Desargues configuration consists of all the actors in this drama:

• 10 points: A, B, C, a, b, c, the center of perspectivity, and the three points on the axis of perspectivity


• 10 lines: Aa, Bb, Cc, AB, AC, BC, ab, ac, bc and the axis of perspectivity

Given any configuration of points and lines, we can form a graph called its Levi graph by drawing a vertex for each point or line, and drawing edges to indicate which points lie on which lines. And now for the punchline: Levi graph of the Desargues configuration is the ‘Desargues-Levi graph’!—or Desargues graph, for short.

Alas, I don’t know how this is relevant to anything I’ve discussed. For now it’s just a tantalizing curiosity.

A random walk on the Desargues graph

Back to business! I’ve been telling you about the analogy between quantum mechanics and stochastic mechanics. This analogy becomes especially interesting in chemistry, which lies on the uneasy borderline between quantum and stochastic mechanics.

Fundamentally, of course, atoms and molecules are described by quantum mechanics. But sometimes chemists describe chemical reactions using stochastic mechanics instead. When can they get away with this? Apparently whenever the molecules involved are big enough and interacting with their environment enough for ‘decoherence’ to kick in. I won’t attempt to explain this now.

Let’s imagine we have a molecule of iron pentacarbonyl with—here’s the unrealistic part, but it’s not really too bad—distinguishable carbonyl groups:

Iron pentacarbonyl is liquid at room temperatures, so as time passes, each molecule will bounce around and occasionally do a maneuver called a ‘pseudorotation’:

We can approximately describe this process as a random walk on a graph whose vertices are states of our molecule, and whose edges are transitions between states—namely, pseudorotations. And as we saw last time, this graph is the Desargues graph:

Note: all the transitions are reversible here. And thanks to the enormous amount of symmetry, the rates of all these transitions must be equal.

Let’s write V for the set of vertices of the Desargues graph. A probability distribution of states of our molecule is a function

\displaystyle{ \psi : V \to [0,\infty) }


\displaystyle{ \sum_{x \in V} \psi(x) = 1 }

We can think of these probability distributions as living in this vector space:

L^1(V) = \{ \psi: V \to \mathbb{R} \}

I’m calling this space L^1 because of the general abstract nonsense explained in Part 12: probability distributions on any measure space live in a vector space called L^1. Today that notation is overkill, since every function on V lies in L^1. But please humor me.

The point is that we’ve got a general setup that applies here. There’s a Hamiltonian:

H : L^1(V) \to L^1(V)

describing the rate at which the molecule randomly hops from one state to another… and the probability distribution \psi \in L^1(X) evolves in time according to the equation:

\displaystyle{ \frac{d}{d t} \psi(t) = H \psi(t) }

But what’s the Hamiltonian H? It’s very simple, because it’s equally likely for the state to hop from any vertex to any other vertex that’s connected to that one by an edge. Why? Because the problem has so much symmetry that nothing else makes sense.

So, let’s write E for the set of edges of the Desargues graph. We can think of this as a subset of V \times V by saying (x,y) \in E when x is connected to y by an edge. Then

\displaystyle{ (H \psi)(x) =  \sum_{y \,\, \textrm{such that} \,\, (x,y) \in E} \!\!\!\!\!\!\!\!\!\!\! \psi(y) \quad - \quad 3 \psi(x) }

We’re subtracting 3 \psi(x) because there are 3 edges coming out of each vertex x, so this is the rate at which the probability of staying at x decreases. We could multiply this Hamiltonian by a constant if we wanted the random walk to happen faster or slower… but let’s not.

The next step is to solve this discretized version of the heat equation:

\displaystyle{ \frac{d}{d t} \psi(t) = H \psi(t) }

Abstractly, the solution is easy:

\psi(t) = \exp(t H) \psi(0)

But to actually compute \exp(t H), we might want to diagonalize the operator H. In this particular example, we could take advantage of the enormous symmetry of the Desargues graph. Its symmetry group includes the permutation group S_5, so we could take the vector space L^1(V) and break it up into irreducible representations of S_5. Each of these will give an eigenspace of H, so by this method we can diagonalize H. I’d sort of like to try this… but it’s a big digression, so I won’t. At least, not today!

Graph Laplacians

The Hamiltonian we just saw is an example of a ‘graph Laplacian’. We can write down such a Hamiltonian for any graph, but it gets a tiny bit more complicated when different vertices have different numbers of edges coming out of them.

The word ‘graph’ means lots of things, but right now I’m talking about simple graphs. Such a graph has a set of vertices V and a set of edges E \subseteq V \times V, such that

(x,y) \in E \implies (y,x) \in E

which says the edges are undirected, and

(x,x) \notin E

which says there are no loops. Let d(x) be the degree of the vertex x, meaning the number of edges coming out of it.

Then the graph Laplacian is this operator on L^1(V):

\displaystyle{ (H \psi)(x) =  \sum_{y \,\, \textrm{such that} \, \,(x,y) \in E} \!\!\!\!\!\!\!\!\!\!\! \Psi(y) \quad - \quad d(x) \Psi(x) }

There is a huge amount to say about graph Laplacians! If you want, you can get started here:

• Michael William Newman, The Laplacian Spectrum of Graphs, Masters Thesis, Department of Mathematics, University of Manitoba, 2000.

But for now, let’s just say that \exp(t H) is a Markov process describing a random walk on the graph, where hopping from one vertex to any neighboring vertex has unit probability per unit time. We can make the hopping faster or slower by multiplying H by a constant. And here is a good time to admit that most people use a graph Laplacian that’s the negative of ours, and write time evolution as \exp(-t H). The advantage is that then the eigenvalues of the Laplacian are \ge 0.

But what matters most is this. We can write the operator H as a matrix whose entry H_{x y} is 1 when there’s an edge from x to y and 0 otherwise, except when x = y, in which case the entry is -d(x). And then:

Puzzle 1. Show that for any finite graph, the graph Laplacian H is infinitesimal stochastic, meaning that:

\displaystyle{ \sum_{x \in V} H_{x y} = 0 }


x \ne y \implies  H_{x y} \ge 0

This fact implies that for any t \ge 0, the operator \exp(t H) is stochastic—just what we need for a Markov process.

But we could also use H as a Hamiltonian for a quantum system, if we wanted. Now we think of \psi(x) as the amplitude for being in the state x \in V. But now \psi is a function

\psi : V \to \mathbb{C}


\displaystyle{ \sum_{x \in V} |\psi(x)|^2 = 1 }

We can think of this function as living in the Hilbert space

L^2(V) = \{ \psi: V \to \mathbb{C} \}

where the inner product is

\langle \phi, \psi \rangle = \displaystyle{ \sum_{x \in V} \overline{\phi(x)} \psi(x) }

Puzzle 2. Show that for any finite graph, the graph Laplacian H: L^2(V) \to L^2(V) is self-adjoint, meaning that:

H_{x y} = \overline{H}_{y x}

This implies that for any t \in \mathbb{R}, the operator \exp(-i t H) is unitary—just what we need for one-parameter unitary group. So, we can take this version of Schrödinger’s equation:

\displaystyle{ \frac{d}{d t} \psi = -i H \psi }

and solve it:

\displaystyle{ \psi(t) = \exp(-i t H) \psi(0) }

and we’ll know that time evolution is unitary!

So, we’re in a dream world where we can do stochastic mechanics and quantum mechanics with the same Hamiltonian. I’d like to exploit this somehow, but I’m not quite sure how. Of course physicists like to use a trick called Wick rotation where they turn quantum mechanics into stochastic mechanics by replacing time by imaginary time. We can do that here. But I’d like to do something new, special to this context.

Maybe I should learn more about chemistry and graph theory. Of course, graphs show up in at least two ways: first for drawing molecules, and second for drawing states and transitions, as I’ve been doing. These books are supposed to be good:

• Danail Bonchev and D.H. Rouvray, eds., Chemical Graph Theory: Introduction and Fundamentals, Taylor and Francis, 1991.

• Nenad Trinajstic, Chemical Graph Theory, CRC Press, 1992.

• R. Bruce King, Applications of Graph Theory and Topology in Inorganic Cluster Coordination Chemistry, CRC Press, 1993.

The second is apparently the magisterial tome of the subject. The prices of these books are absurd: for example, Amazon sells the first for $300, and the second for $222. Luckily the university here should have them…

41 Responses to Network Theory (Part 15)

  1. jamievicary says:

    Suppose our function \phi(x), be it a stochastic probability distribution or a quantum wavefunction, is defined at t=0 by \phi(p)=1, and \phi(x)=0 otherwise. Then, after some small amount of time \epsilon, at some point x away from p, we’ll have

    \phi_\epsilon(x) = -C \epsilon (H \phi)(x) = C \epsilon \delta _{(p,x) \in E}

    where C=1 in the stochastic case and C= - i in the quantum case.

    Now, suppose we measure the system, to see which vertex our particle is at. In the classical stochastic case, the probability of finding our particle at x \neq p is just \phi_\epsilon(x) as computed above – so it’s \epsilon \delta _{(p_x) \in E}. But in the quantum case, we have to to take the absolute value squared! So the probability is \epsilon^2 \delta_{(p,x)\in E}, and it seems the wavefunction will spread out more slowly in the quantum case.

    This suggests to me that even though these classical and quantum systems are described so neatly by the same operator, you can’t obtain the classical system by implementing the quantum system and measuring it very often, even in the limit that you measure it infinitely rapidly. Maybe there’s some other way to do it?

    Also, you seem to have some errant p: symbols underneath your summation signs. Should they be there? And I think it’s only the first term after the \Sigma that’s intended to be included in the summation, but that’s not very clear — maybe they would look better if they were grouped together by brackets.

    • John Baez says:

      Hi, Jamie! Here are some trivial remarks… I’ll say some nontrivial things after I go to the gym, eat dinner, and perhaps even sleep.

      Jamie wrote:

      Also, you seem to have some errant p: symbols underneath your summation signs. Should they be there?

      I don’t see any of those. I see y:, meaning “y such that”. I wrote

      \sum_{y: (x,y) \in E}

      to make it clear that we’re summing over all y‘s with (x,y) \in E, as opposed to

      \sum_{(x,y) \in E}

      which might mean summing over all x‘s and y‘s with (x,y) \in E.

      And I think it’s only the first term after the \Sigma that’s intended to be included in the summation, but that’s not very clear — maybe they would look better if they were grouped together by brackets.

      Really? Even though I put in an enormous amount of space:

      \displaystyle{ (H \psi)(x) =  \sum_{y: (x,y) \in E} \Psi(y) \quad - \quad d(x) \Psi(x) }

      you still think someone will think I’m summing over the second part? It’s no fair! I tried writing

      \displaystyle{ (H \psi)(x) =  ( \sum_{y: (x,y) \in E} \Psi(y)) - d(x) \Psi(x) }

      but I thought it looked stupid. I’ve always been a bit annoyed by how some people think a summation sign mercilessly applies to everything that comes after it, no matter how far away. Perhaps on the opposite page, perhaps in other books in the room…

      • jamievicary says:

        :) To be fair to me, John, the distance between the ‘\Sigma‘ and the ‘\Psi(y)‘ is pretty similar to the distance between the ‘\Psi(y)‘ and the ‘{}-{}‘. The \Psi(y) has been moved to the right by the long subscript beneath the \Sigma. If you could suck it back in somehow, that would improve things.

      • Aaron says:

        Why not put the -d(x) \psi(x) term before the sum? That shouldn’t look too ugly, but still won’t confuse those who don’t know your conventions for how far the summation extends.

        • John Baez says:

          Thanks! I considered that, but alas, I have a grumpy, perhaps irrational reluctance to write -1+100 when I’m thinking 100-1. I think of the term being subtracted off here as a ‘correction’ to the ‘main idea’, so mentally it comes second.

          Anyway, when I publish these notes I’ll probably put in parentheses around the sum, but make sure they don’t look too ghastly.

    • jamievicary says:

      I meant `y: symbols’, not `p: symbols’.

  2. jamievicary says:

    Sorry the maths isn’t coming out right, John, hopefully you can fix that.

    It seems to me that if the graph wasn’t symmetric, H would still be infinitesimal stochastic, but it wouldn’t be self-adjoint. I wonder if there’s some way to fix it up the quantum version.

    • John Baez says:

      I massively fixed your TeX. You can’t do stuff like \[ \] or \emph here, this isn’t really TeX-land.

      You can only put things in single dollar signs, and the word ‘latex’ must appear directly after the opening dollar sign to let the system know you’re entering TeX mode. So, if you type

      $latex E = mc^2 $

      the world will see

      E = mc^2

      and think you’re Einstein, but if you type

      $ E = mc^2 $

      the world will see

      $ E = mc^2 $

      and think you’re Einstein still trying to learn how TeX works on WordPress blogs.

      I’ll post a comment on more substantial issues later!

  3. Greg Egan says:

    Here’s a simple proof that this picture of the Desargues graph with red dots for 2-element subsets and blue dots for 3-element subsets is correct:

    For i=0,...,4, and with all the additions below modulo 5, define five pairs of red dots as:

    \{i, i+1\}, \{i+1,i+4\}

    and five pairs of blue dots as:

    \{i, i+1, i+4\}, \{i+1, i+2, i+4\}

    The union of the ith pair of red dots is the first of the ith pair of blue dots, and the union of the second of the ith pair of red dots and the first of the (i+1)th pair of red dots is the second of the ith pair of blue dots. So if we form a 20-sided polygon whose vertices are alternating red and blue dots generated in this order, all the edges of the polygon will join red dots to blue dots of which they are subsets:

    The first red dot of the ith pair is also a subset of the first blue dot of the (i+1)th pair:

    \{i+1, i+2, i\}

    which gives the five short chords in the picture, while the second red dot of the ith pair is a subset of the second blue dot of the (i+2)th pair:

    \{i+3, i+4, i+1\}

    which gives the five long chords in the picture.

    • John Baez says:

      That’s great, Greg! You solved a puzzle I hadn’t even thought of making into a puzzle! When I get around to publishing these notes someday, I’ll add this as a puzzle and include your solution. (Of course, all puzzle solvers will be credited.)

      If anyone out there desires further puzzles of this sort, they can try to find an elegant proof that this, too, is the Desargues graph:

    • John Baez says:

      I added a picture Greg emailed to me to his comment above. This helps explain his argument.

      I’ve also taken this whole business and made it ‘Puzzle 2’ in the website version of Part 14.

    • Arjun Jain says:

      How did you think of defining the dots as:

      \{i, i+1\}, \{i+1,i+4\} ; \{i, i+1, i+4\}, \{i+1, i+2, i+4\}

    • Greg Egan says:

      How did you think of defining the dots as {i,i+1} etc.

      Trial and error! The picture was symmetrical enough that it seemed likely that some systematic assignment of sets to the vertices would work, and {i,i+1} for five of the 2-element sets was a simple thing to start with. From there, I just played around with similar expressions for the other dots, until I found ones that met all the requirements.

  4. Greg Egan says:

    The most obvious eigenvector of the Desargues graph Hamiltonian is the equilibrium state, with equal probability at every vertex and an eigenvalue of 0.

    I guess all the other eigenvectors will be stochastic “states” that undergo exponential decay at various rates … but since probability is conserved, any vector that undergoes exponential decay must have terms that sum to zero. So a vector like that won’t represent a proper stochastic state, but rather a deviation from equilibrium. So it should be possible to write any proper stochastic state as the equilibrium state plus a sum of terms that decay with different rate constants.

    If you put a value of 1 on every red dot and -1 on every blue dot, you get an eigenvector of the Hamiltonian with eigenvalue -6 (-3 for the diagonal entry for each vertex, and -3 for the sum over the neighbours).

    By trial and error, it’s not too hard to find examples of variations on this where some vertices have a value of zero. Every vertex with zero either has its neighbours all zero, or two neighbours of opposite signs.

    Eigenvalue -5: every non-zero vertex has two neighbours with the opposite value to it, and one neighbour of zero.

    Eigenvalue -4: every non-zero vertex has one neighbour with the opposite value to it, and two neighbours of zero.

    Eigenvalue -2: every non-zero vertex has one neighbour with the same value, and two neighbours of zero.

    Eigenvalue -1: every non-zero vertex has two neighbours with the same value, and one neighbour of zero.

    So the general pattern is what you’d expect: the more neighbours of equal value each non-zero vertex has, the more slowly that term will decay.

    • John Baez says:

      Brilliant! I’d been hoping to use the representation theory of S_5 to work out the eigenspaces, but that takes some work so I was going to put it off until Jim Dolan told me how to break down this particular representation (which is of a specially nice kind) into irreps. You more boldly went ahead and just figured it out. I’m sort of wimpy when it comes to concrete calculations like this.

      I’ll try to write this up and add it to the website version of these lecture notes, which will become a bit different from the blog version.

      I’m amazed that the spectrum of the graph Laplacian is all integers. It must mean something, but I don’t know what. If I had the energy to investigate this, which I don’t, I’d try looking at some other cases. The Desargues graph is based on the fact that 2+3=5. I’d try the analogous graph based on 1+2=3, and see if that too had an integer spectrum. If it did, I might boldly try 3+4=7.

      Actually, being lazy, I might instead look for information using the keywords “bipartite kneser graph laplacian”.

    • Greg Egan says:

      The bivalent graph with six vertices corresponding to the subsets of size 1 and 2 of {0,1,2} has eigenvalues:

      0 and -4 with 1-dimensional eigenspaces
      -1 and -3 with 2-dimensional eigenspaces

      The tetravalent graph with 70 vertices corresponding to the subsets of size 3 and 4 of {0,1,2,3,4,5,6} has eigenvalues:

      0 and -8 with 1-dimensional eigenspaces

      -1 and -7 with 6-dimensional eigenspaces

      -2 and -6 with 14-dimensional eigenspaces

      -3 and -5 with 14-dimensional eigenspaces

      and the original example, the trivalent graph with 20 vertices corresponding to the subsets of size 2 and 3 of {0,1,2,3,4}, has eigenvalues:

      0 and -6 with 1-dimensional eigenspaces

      -1 and -5 with 4-dimensional eigenspaces

      -2 and -4 with 5-dimensional eigenspaces

      I’m not surprised that some of the eigenvalues are integers, because it’s easy to see that you can form some eigenvectors by arranging an equal number of ones and minus ones on the vertices in a symmetrical fashion such that every non-zero vertex has equal counts for (i) neighbours with the same value, (ii) neighbours with the opposite value, and (iii) neighbours that are zero. Those kind of eigenvectors will clearly have integer eigenvalues. But it’s not obvious to me why these kinds of eigenvectors span the whole space.

      • John Baez says:

        Cool! At this point there are some pretty conjectures for any mathematicians lurking out there without enough to prove. But again, it’s quite possible these conjectures have already been proved, since there’s an enormous literature on graph Laplacians. So don’t anyone work on this without checking the literature, unless you don’t mind reproving known stuff.

      • Greg Egan says:

        Working in the (n+1)-dimensional subspace given by the vectors S_s defined above, the squared adjacency matrix J^2 is:

        \left[ \begin{array}{cccccc} 1 \cdot (2n+1)  & n^2 & 0 & 0 & ... \\ 1 \cdot 2 & 2 \cdot (2n-1) & (n-1)^2 & 0 & ... \\ 0 & 2 \cdot 3 & 3 \cdot (2n-3) & (n-2)^2 & ... \\ ... & ... & ... & ... & ... \end{array} \right]

        It’s easy to see at least one eigenvector of this matrix:

        J^2 (1,1,1,...,1) = (n+1)^2 (1,1,1,...,1)

    • John Baez says:

      Greg sent me this picture illustrating the eigenvectors of the Laplacian of the Desargues graph:

      He wrote:

      On the eigenvectors, I guess the only thing I wanted to say in reference to the diagram was that you can span the eigenspaces by rotating the particular eigenvectors I drew by successive multiples of 1/5 of a rotation, i.e. 4 dots.

      If you do this for the examples with eigenvalues -2 and -4, it’s not too hard to see that all five rotated diagrams are linearly independent. If you do it for the examples with eigenvalues -1 and -5, four rotated diagrams are linearly independent. So along with the two 1-dimensional spaces, that’s enough to prove that you’ve exhausted the 20-dimensional space.

      I’ve used this to rewrite the website version of Part 15. I moved all the stuff about drawing the Desargues graph to Part 14, where it really belongs.

    • John Baez says:

      I don’t really want to think about this question—the eigenvalues of the Laplacian for the Desargues graph. This is the kind of math I used to like, that I’m trying to get away from! But I can’t help it, because it’s closely related to stuff Jim Dolan and I used to talk about: Young diagrams and ‘the fundamental theorem of Hecke operators’. Just to get it out of my system, let me say a bit about it!

      Greg showed the Laplacian of the Desargues graph has 6 eigenvalues.

      We’d know it has at most 6 eigenvalues if we knew that L^2(V) breaks up into 6 irreducible representations of S_5 \times S_2. Here L^2(V) is the space of functions on the set of vertices of the Desargues graph: a 20-dimensional complex vector space. S_5 \times S_2 is the symmetry group of the Desargues graph. The Laplacian of the Desargues graph, which we’re calling H, commutes with all the symmetries. Thus, by Schur’s Lemma, it can have at most one eigenvalue for each irreducible representation of S_5 \times S_2 sitting in L^2(V).

      Now there are different ways to break L^2(V) into irreducible representations, but the fun way would be to use the theory of ‘Hecke operators’ described here.

      Suppose we have a relation R on vertices of the Desargues graph that’s invariant under all the symmetries of this graph. We can think of R as a subset of V \times V that’s invariant under the action of the symmetry group. Or, we can think of R as a V \times V-shaped matrix of 0’s and 1’s, where 0 means ‘false’ and 1 means ‘true’. We can reinterpret this matrix as an operator

      R: L^2(V) \to L^2(V)

      and then this operator will commute with the action of the symmetry group. In other words, it’ll be an intertwining operator. Jim and I call an intertwining operator obtained this way a ‘Hecke operator’.

      The fundamental (but incredibly easy) theorem is that every intertwining operator is a linear combination of Hecke operators. Since the graph Laplacian H is also an intertwining operator, we can write it as a linear combination of Hecke operators. That’s easy, in fact, and I’ll say how in a minute.

      But we may also be able to use this idea to prove things about H: for example, facts about how many eigenvalues it has, or maybe facts about what its eigenvalues are! And if so, these ideas should generalize from the Desargues graph to the other bipartite Kneser graphs Greg studied here.

      What are some invariant relations on vertices of the Desargues graph?

      The most obvious one is ‘being connected by an edge’. Let’s call this J, for ‘jump’. We have

      J_{i j} = 1

      if you can get from vertex j to vertex i in one jump (along an edge), and

      J_{i j} = 0

      otherwise. This is the most important Hecke operator. The Laplacian of the Desargues graph is just

      H = J - 3

      so understanding the eigenvalues of H is equivalent to understanding the eigenvalues of J.

      Greg showed that H has eigenvalues 0,-1,-2,-4,-5,-6. In other words, J has eigenvalues 3, 2, 1, -1, -2, -3. This new way of saying it already looks nicer.

      But it would be nice to understand this conceptually. If our understanding is ‘conceptual’, we should be able to generalize it to all the other bipartite Kneser graphs that Greg was considering.

      To show J has these eigenvalues, it would suffice to show

      (J - 3)(J - 2)(J - 1)(J + 1)(J + 2)(J + 3) = 0

      This is some sort of relation in the algebra of intertwining operators. I suspect this algebra is 6-dimensional and generated by J. If so, this relation says the algebra of intertwining operators is the polynomial algebra on a formal variable J, modulo this relation:

      (J - 3)(J - 2)(J - 1)(J + 1)(J + 2)(J + 3) = 0

      This relations lets us rewrite the 6th power of J, or any higher power, in terms of lower powers.

      What’s so important about the 6th power? Well, if you look at the Desargues graph:

      you’ll see that if you start at any vertex, and try to get to any other, the longest it can take is 5 jumps. For example, getting from {0,1,2} to {3,4} takes 5 jumps. So there’s some sense in which a 6th jump is ‘redundant’, and this should give the relation… somehow.

      How can we exploit this vague idea? So far I only know a very dumb and laborious approach.

      Consider the operator J^n. The matrix element

      (J^n)_{i j}

      counts how many ways you can get from vertex j to vertex i in n jumps… but here we do allow ourselves to retrace our steps! If we count these ‘paths’ from j to i we can work out the matrix elements of J^n.

      When we get to n = 6, it should somehow be obvious that can rewrite J^6 as a linear combination of lower powers of J. Why? I don’t know, but it should be related to the fact that a 6th jump is ‘redundant’. In any event, even if it’s not obvious, it’s true! So, it must be true that by counting paths we can show that

      (J - 3)(J - 2)(J - 1)(J + 1)(J + 2)(J + 3) = 0

      But of course we don’t want to do this counting in a stupid brute-force way! There should be some pattern at work, which makes this relation obvious.

      Why am I so sure of that? Well, it’s the simplest explanation of the pattern Greg found here. If we replace the Desargues graph by a similar graph based on splittings of a 7-element set {0,1,2,3,4,5,6} into 3- and 4-element subsets, Greg’s calculations are equivalent to this fact: the jump operator J obeys

      (J-4)(J-3)(J-2)(J-1)(J+1)(J+2)(J+3)(J+4) = 0

      This relation expresses the 8th power of J in terms of lower powers. And, if you think about it, you can see that it takes at most 7 jumps to get from any vertex to any other. For example, it take 7 jumps to get from {0,1,2,3} to {4,5,6}.

      Now, I seem to be claiming that the number of eigenvalues of the graph Laplacian is at most one more than the ‘diameter’ of a graph, meaning the maximum number of jumps it takes to get between vertices. If that’s true, I’m sure it’s known. But I’m not really claiming it’s true for all graphs: I’m working with a class of specially symmetrical graphs, and my intuitions are based on some ideas from group representation theory.

    • Greg Egan says:

      I can’t yet see how to make the general result we’re seeking obviously true, but here are some ideas that at least simplify some specific calculations.

      Assume we’re working with a bipartite Kneser graph H(2n+1,n) whose vertices are the subsets of size n and size n+1 of the set \{0,1,2,...,2n\}. Define J to be the adjacency matrix of this graph.

      We can really get everything we need by working only with the “small” vertices, the subsets of size n, and J^2, which takes a vector whose support is only on small vertices to another such vector. What’s more, by the symmetry of the problem we can work with the vectors S_s (defined for s=0,1,...n), which assign a value of 1 to every small vertex whose intersection with \{0,1,...,n-1\} contains exactly s elements, and assigns 0 to every other vertex. J^2 will map each S_s to linear combinations of other vectors of this form. And in fact, with a few simple combinatorial calculations it’s not hard to show that:

      J^2 S_s = (n-s+1)^2 S_{s-1} + (s+1)(2n-2s+1) S_s + (s+1)(s+2) S_{s+1}

      When s=0 the first term is missing and when s=n the last term is missing.

      For the Desargues graph, n=2, this gives us:

      J^2 S_2 = S_1 + 3 S_2

      J^2 S_1 = 4 S_0 + 6 S_1 + 6 S_2

      J^2 S_0 = 5 S_0 + 2 S_1

      It’s then a short calculation to show that:

      (J^2-9)(J^2-4)(J^2-1) S_2 = 0

      which from the symmetry of the graph is enough to imply that:

      (J^2-9)(J^2-4)(J^2-1) = 0

      I suppose that with a bit more effort it shouldn’t be too hard to use this approach to prove the general case:

      (J^2-(n+1)^2)...(J^2-4)(J^2-1) = 0

      Or maybe there’s a more conceptual approach that bypasses that calculation, but I can’t see what it is.

  5. John Baez says:

    Jamie wrote:

    This suggests to me that even though these classical and quantum systems are described so neatly by the same operator, you can’t obtain the classical system by implementing the quantum system and measuring it very often, even in the limit that you measure it infinitely rapidly.

    Right! You’re talking about the quantum Zeno effect, or ‘watched pot never boils’ effect. I just found out, to my delight, that Alan Turing discovered this effect in 1954. He wrote:

    It is easy to show using standard theory that if a system starts in an eigenstate of some observable, and measurements are made of that observable N times a second, then, even if the state is not a stationary one, the probability that the system will be in the same state after, say, one second, tends to one as N tends to infinity; that is, that continual observations will prevent motion…

    This effect has been seen in experiments!

    Maybe there’s some other way to do it?

    How about frequent ‘weak measurements’? Instead of just observing our system very often, let’s flip a suitably weighted coin each time to decide whether to observe it. We’ll observe it with some small probability \epsilon.

    Let me make a precise conjecture about this. I don’t know if it’s true, but at least it’s something we can figure out!

    As you know, there’s a projection p from the space of square matrices down to the subspace of matrices that are diagonal in our chosen basis. This is related to ‘collapsing the wavefunction by means of an observation’, where we use the density matrix description of mixed states.

    I think your procedure amounts to doing this:

    Evolve a matrix D for a short amount of time 1/N in the usual way:

    D \mapsto \exp(i H / N) D \exp(-i H /N )

    Then project down to the diagonal:

    D \mapsto p(D)

    Alternate these two steps N t times, where t is the time we want to evolve for. Then take the limit as N \to \infty.

    If I’m not getting mixed up, you showed this complicated-sounding operation on matrices should be the identity: in the limit where we look at the pot ever more often, the watched pot never boils.

    So now let’s change the procedure a bit. Evolve a matrix D for a short amount of time 1/N in the usual way:

    D \mapsto \exp(i H / n) D \exp(-i H /n )

    Then do

    D \mapsto \epsilon p(D) + (1-\epsilon) D

    where the probability \epsilon depends on N in some cleverly chosen way. Then take the limit as N \to \infty.

    I’ll conjecture that if we choose \epsilon as a function of N correctly, this operation applied to a diagonal density matrix D coming from a probability distribution \psi as follows:

    D_{i j} = \psi(i) \delta_{i j}

    will give the diagonal matrix coming from the probability distribution \exp(t H) \psi, at least when H is both self-adjoint and infinitesimal stochastic.

    In a short slogan suitable for bumper stickers: frequently repeated weak measurements turn a quantum process into a Markov process.

    I don’t know if this slogan is true, but unlike most slogans it only takes a calculation to prove or disprove it! I wouldn’t be surprised if someone had already done this calculation and settled this question.

    Thanks for goading me into this guess.

  6. jamievicary says:

    I like this idea. I played around with it for a bit. Ultimately, I couldn’t get it to work.

    Here is some intuition that it won’t work out as you hope, just to play devil’s advocate. Suppose you have a quantum system evolving under the Hamiltonian H, and you perform an infinite sequence of weak measurements, each governed by a small measurement probability \epsilon as you suggest, spaced apart by some time T. As long you ultimately take some limit where T becomes small, this is well-described by the statement “I will perform measurements over time according to a Poisson distribution, with mean time T/\epsilon between measurements”, however cleverly you choose \epsilon. But we know what happens if we measure our quantum system in such a way — the result still behaves as a quantum system, only reaching classical (and stationary) behaviour when the mean time between measurements approaches zero. Certainly, none of the evolutions are described by classical stochastic evolution under the operator H interpreted as a generator of stochastic transformations.

    Let me know whether this moves you.

    • John Baez says:

      Jamie wrote:

      But we know what happens if we measure our quantum system in such a way…

      Actually I didn’t (and don’t) know that! But over the last few days I’ve been thinking my conjecture was probably wrong. After all, if I measure things too weakly, I won’t manage to keep my density matrix diagonal! And if I measure it strongly enough to keep it diagonal, well… I bet the quantum Zeno effect will kick in. (That’s just a guess; I’ve been too lazy to do the calculation.)

      Now I’m curious: do we get any interesting time evolution intermediate between \exp(- i t H) time evolution and no time evolution at all, if we do the “I will perform measurements over time according to a Poisson distribution, with mean time T/\epsilon between measurements” trick?

      Since you say “we know”, it sounds like you know what’ll happen.

      Here’s another idea. Take a free quantum particle obeying

      \displaystyle{ \frac{d}{d t} \psi = \frac{i}{2 m} \nabla^2 \psi }

      Then, increase the mass m while letting our particle interact with a ‘bath’ of many light particles, say via some billiard-ball-like interaction. As the mass increases the particle becomes more ‘classical’, but all those light particles should make it undergo Brownian motion. So, in some appropriate limit, we should get a stochastic system described by

      \displaystyle{ \frac{d}{d t} \psi = k \nabla^2 \psi }

      namely the heat equation. But maybe this new \psi is the old |\psi|^2 .

      Is these some way to make this precise and also more abstract and general, so it applies to the big class of processes we’re talking about now?

      Hmm, I’m imagining some sort of stochastic Petri net, or its quantum analogue, which describes the interaction of our ‘favorite’ particle with the ‘bath’ of lighter ones.

      • jamievicary says:

        If we measure our quantum system at some particular rate, in a Markovian manner, then I don’t know exactly what will happen, in the sense that I could tell you precisely how it evolves as a function of time. But certainly the system’s density matrix will not be diagonal at all times, unless the system’s Hamiltonian is diagonal in the basis in which we’re doing our measurements.

        Actually, having said that, maybe I do know exactly what will happen. I bet it’s something like this:

        \rho(t) = e^{(i (H\otimes H_*) + a J) t} (\rho(0))

        Here, H is the Hamiltonian of the system, and so H \otimes H_* can be seen as an operator on the space of density matrices, where H_* is the conjugate of H. The operator J is the infinitesimal stochastic matrix, acting on the space of density matrices, describing our measurement process, and a is a positive real constant describing the rate at which our measurements will perform.

        So, I’m suggesting that if H generates quantum evolution, and J generates stochastic evolution, you simply add them to get the resulting operator. I’m claiming this will produce a well-behaved 1-parameter family of completely positive maps, which are the things which take density matrices to density matrices.

        Why should it work like this? Take the series expansion of the exponential – and then the individual terms describe all the possible histories of the system after some time t, each weighted by an appropriate factor.

        Of course, this gives us a great question. If self-adjoint matrices generate 1-parameter families of unitary maps, and infinitesimal stochastic matrices generate 1-parameter families of stochastic maps, then what sort of beast generates 1-parameter families of completely positive maps? It has to be something pretty bizarre, if this class of maps includes not only the previous two classes, but also their span!

        I’ll say more later on about the other interesting thing you suggested.

      • jamievicary says:

        I made a mistake in that comment: the operator J isn’t infinitestimal stochastic, it’s a ‘projection’ operator, which simply sets all the off-diagonal entries of a density matrix to zero, in some particular basis. We could imagine something more general, where we not only kill off-diagonal entries, but also apply an infinitesimal stochastic transformation to the remaining diagonal entries. Such a J should also give rise to a 1-parameter family of completely positive maps.

  7. Laurens Gunnarsen says:

    It seems to me that at least some footnotes are needed to explain how the graph depicted here

    can possibly be the Desargues graph.

    First of all, the above graph appears to have only fourteen vertices, while of course the Desargues graph has twenty. And second, the six (yellow) vertices in the center of the above graph have valence 4, while the remaining 8 (red) vertices each have valence 3; obviously all vertices of the Desargues graph must have valence 3.

    I suppose the six yellow vertices at the center of the above graph might in some sense have “multiplicity two” (since this would bring the vertex count in each “column” up to ten.) But this would seem to single them out among the other vertices. And isn’t such a privileged subset of vertices inconsistent with the homogeneity of the Desargues graph?

    • John Baez says:

      Laurens wrote:

      It seems to me that at least some footnotes are needed to explain how the graph depicted here can possibly be the Desargues graph.

      Yeah, sorry—I was so busy thinking about it that I didn’t actually look at it.

      I explained how it works already. First you draw a 5-dimensional hypercube. It has 25 vertices, one for each subset of a 5-element set. It has an edge between vertices precisely when one subset contains (or is contained in) a subset with one more element. Then you throw out all the vertices except those corresponding to the 2- and 3-element subsets. You keep the edges between these. That’s the Desargues graph, basically by definition.

      Unfortunately the picture of a 5-dimensional hypercube that I got from Wikipedia is drawn in such a way that some vertices lie directly behind others!

      The columns here actually consist of 1, 5, 10, 10, 5, and 1 dots, but you can’t see all those dots. Each ‘yellow’ dot—they look orange to me—is actually sitting right directly in front of another dot.

      In particular, a 5-element set has ten 2-element subsets and ten 3-element subsets. These form the two middle columns in the above picture. But we only see 7 dots in each of those columns, not 10, since 3 in each column are blocking the view of 3 more. When we throw away the other columns, we’re left with this:

      That’s also why these yellow dots look like they have more than 3 edges coming out of them!

      So, the concept is clear but the picture sucks, at least as drawn here.

      I’m probably too lazy to fix the picture, but I’ll explain what’s going on in the website version of this course. Luckily, Greg Egan gave a nice proof that this other graph really shows all the 2- and 3-element subsets of a 5-element set:

      Here’s his proof in a picture:

      By the way, I urge you to read the website versions of Part 14 and Part 15, not the blog versions: they’ve been drastically improved with help from Greg!

    • John Baez says:

      Laurens sent me a corrected version of the Desargues graph, where some dots are moved so they don’t lie directly in front of others:

      I’ll add this to the website version of these notes! Thanks, Laurens!

  8. Dave DeBenedetto says:


    I would like to read the Network Theory series from the beginning, but my searches only return installments going back to part 4. Can you help?

    Thank You,
    Dave DeBenedetto

  9. We have been comparing two theories: stochastic mechanics and quantum mechanics. Last time we saw that any graph gives us an example of both theories! It’s a bit peculiar, but today […]

  10. There are many examples of irreducible Dirichlet operators. For instance, in Part 15 we talked about graph Laplacians. The Laplacian of a connected simple graph is always irreducible. […]

  11. There’s a way to get a Hamiltonian for a Markov process from any graph with edges labelled by positive numbers. I defined this Hamiltonian back in Part 15, but now I see a slicker way to write it […]

  12. As I’ve grown interested in network theory, I’ve gotten more and more interested in ‘graph Laplacians’. There’s been some important progress on these recently…

  13. Arjun Jain says:

    Why are we calling H, the laplacian of the graph? Apart from it being self adjoint and infinitesimal stochastic, is there a better reason to call it a laplacian? Does this somehow give us the usual laplacian when V is a set like the real numbers?- if so, how?

    In QM, H has a laplacian term and a potential term. Are we considering the \displaystyle{ d(x) \Psi(x) } in

    \displaystyle{ (H \psi)(x) = \sum_{y: (x,y) \in E} \Psi(y) \quad - \quad d(x) \Psi(x) }

    to be the potential term?

    I did not understand how we are taking the same H for the corresponding SM and QM problems.

    • John Baez says:

      Arjun wrote:

      Why are we calling H the laplacian of the graph?

      When our graph is a n-dimensional lattice, my formula for H gives us the usual discrete approximation to the Laplacian on \mathbb{R}^n. This is how you numerically compute a Laplacian using a computer. I explained this in a comment on my article about graph Laplacians.

      In QM, H has a laplacian term and a potential term. Are we considering the \displaystyle{ d(x) \Psi(x) } in

      \displaystyle{ (H \psi)(x) = \sum_{y: (x,y) \in E} \Psi(y) \quad - \quad d(x) \Psi(x) }

      to be the potential term ?

      No, all this is the Laplacian term. I’m considering a free particle, with no potential. But we can add a potential term if we want, too:

      (V \psi)(x) = V(x) \psi(x)

      for some real-valued function V defined on the vertices of the graph.

      I did not understand how we are taking the same H for the corresponding SM and QM problems?

      It’s a generalization of how we use the Laplacian both in Schrödinger’s equation

      \displaystyle{ \frac{d}{d t} \psi = -\nabla^2 \psi }

      (for a free particle in quantum mechanics) and the heat equation

      \displaystyle{ \frac{d}{d t} \psi = \nabla^2 \psi }

      (for a free particle in stochastic mechanics). I explained this at the start of this post.

      Mathematically, the only difference between Schrödinger’s equation and the heat equation is the factor of -i. But one describes how amplitudes change with time, while the other describes how probabilities change with time! This is well-known, but it’s a fascinating and mysterious thing. This whole series of notes is about taking techniques from quantum mechanics and applying them to stochastic mechanics, so it’s good to study their ‘overlap’.

      In Part 16 I’ll explain that there’s a large class of operators H such that \exp(-i t H) is unitary and \exp(t H) is stochastic. The most important example is when H is the Laplacian. Graph Laplacians are also examples.

You can use Markdown or HTML in your comments. You can also use LaTeX, like this: $latex E = m c^2 $. The word 'latex' comes right after the first dollar sign, with a space after it.

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s