In the comments on this blog post I’m taking some notes on this conference:
• Applied Algebraic Topology 2017, August 8-12, 2017, Hokkaido University, Sapporo, Japan.
Unfortunately these notes will not give you a good summary of the talks—and almost nothing about applications of algebraic topology. Instead, I seem to be jotting down random cool math facts that I’m learning and don’t want to forget.
A young mathematician came up after my talk and asked me: “What did Gromov mean when he said the brain is a 2-category?”
She was not able to offer any clues, so I said I had no idea. I said I might ask him. If anyone knows where he said this, that would help a lot.
I found this: “Probably, the full concept of ”ergo-brain“ can be expressed in a language similar to that of n-categories” (p. 53 at https://www.ihes.fr/~gromov/PDF/ergobrain.pdf)
Thanks! That’s a quite interesting screed by Gromov—I hadn’t seen it. It has some nice remarks on category theory, but I haven’t seen any real detail on n-categories. The full sentence is:
where he hasn’t put in the reference for ???.
I’m listening to a talk about biochemical networks, how to simplify them, how to extract dynamical systems from them, and how to test these models:
• Konstantin Mischaikov, Towards a computational, discrete geometric foundation for nonlinear dynamics.
If there’s a binomial identity like
where all the numbers on the bottom are distinct (in this case 0, 3, 1 and 2), then you can write a protocol for ‘weak symmetry breaking’ that uses just 3 communication steps: that is, a way to get
people to collectively make a binary choice under certain constraints, including that they can send messages just 3 times:
• D. N. Kozlov, Combinatorial topology of the standard chromatic subdivision and Weak Symmetry Breaking for 6 processes.
• D. N. Kozlov, All binomial identities are orderable.
Such identities exist for infinitely many
but they also fail to exist for infinitely many 
When we have an identity of this sort, it means there’s a bijection between some collection
of subsets of an
-element and some other collection
In the above example, there’s a bijection between the collection of 0- and 3-element subsets of a 6-element set, and the collection of 1- and 2-element subsets.
That much is obvious. Less obviously, Kozylov proved that when we have such an identity we can choose a bijection, say
such for any
we either have
or
That’s pretty cool!
Here’s a juicy little fact from this talk:
• Yasuaki Hiraoka, Materials topological data analysis and random/statistical topology.
Suppose you take the complete graph on
vertices and randomly assign a number between 0 and 1 to each edge. Call these numbers lengths, and let them be independent and uniformly distributed random variables.
What is the total length of the minimal (= shortest possible) spanning tree?
Of course, this is a random variable. But Frieze showed that as
it converges in probability to
In other words, for any
the probability that this total length differs from
by more than
approaches zero as 
I find this to be a wonderfully fundamental appearance of Apery’s constant
It was proved by Frieze in 1983, and later published here:
• Alan M. Frieze, On the value of a random minimum spanning tree problem, Discrete Applied Mathematics 10 (1985), 47–56.
He actually proved a more general result that applies to non-uniform probability distributions on
obeying some conditions, and this was generalized further here:
• J. Michael Steele, On Frieze’s zeta(3) limit for lengths of minimal spanning trees, Discrete Applied Mathematics 18 (1987), 99–103.
One gets a simple explicit formula involving
that only depends on the probability distribution for very small lengths. The last part makes perfect sense, since small lengths are the most important for finding the shortest spanning tree! But the appearance of
is still mysterious to me.
Over on G+, we had an interesting conversation about this. I wrote:
Akira Bergman wrote:
I wrote:
Matt McIrvin wrote:
I wrote:
The link to Simplicial and Cellular Trees appears to be broken.
Thanks—I’ve fixed it. It had an unnecessary period at the end.
So then I read that paper….
It turns out Gil Kalai did some crucial work in figuring out the generalization of spanning trees to higher dimensions, and also Kirchhoff’s matrix tree theorem, and also the right generalization of a complete graph, and the generalization of Cayley’s formula counting the spanning trees in a complete graph.
The d-dimensional generalization of a complete graph on n vertices is pretty simple: take all the d-dimensional faces of an (n-1)-simplex. The generalization of a spanning tree is not so simple: a beautiful generalization of Cayley’s formula says there should be
of them, but with a naive definition of this formula fails with n = 6, d = 2: we have
while there are only 46620 ‘spanning hypertrees’: ways of choosing triangles in the 5-simplex such that every edge lies in some triangle and the integral 2nd homology of the resulting space vanishes.
Thus, Kalai had to invent a subtler definition to save the formula. In a blog article on this topic he writes:
The answer is that we should call a collection of d-dimensional faces of a simplex a spanning hypertree if:
1) every (d-1)-dimensional face lies on one of these d-dimensional faces (that’s the spanning part), and
2) the dth rational homology of the space formed by these d-dimensional faces vanishes.
The subtlety is that we use rational homology: homology with rational coefficients (or if you prefer, real coefficients). The homology with integer coefficients may still be nonzero! If the dth rational homology vanishes, the dth integral homology is a finite group G, and we should count each spanning hypertree |G|2 times if we want the total number of spanning hypertrees of an (n-1)-simplex to come out to
This is Kalai’s result.
So maybe someone should try this:
Take an (n-1)-simplex and label all its d-dimensional faces with numbers in [0,1], randomly chosen from independent uniform distributions. Define the volume of a spanning hypertree to be the sum of these numbers over all d-dimensional faces of the hypertree. Define a minimal spanning hypertree to be a spanning hypertree of minimal volume.
Calculate the expected value of the volume of a minimal spanning hypertree. Take the limit as n → ∞.
Here’s how
shows up as the expected length of the minimal spanning tree:
I don’t really understand how the left hand side shows up in this problem, but Cayley’s formula says that
is the number of spanning trees with
vertices.
Thanks John. I’m pretty sure the math behind the spanning tree result is beyond me but have some elementary observations which I hope aren’t all wet.
1) Given one of these complete length-assigned n-graphs, if we try to choose a spanning subset of arcs by picking the n vertices in random order, choosing for each vertex the shortest arc from that vertex to any other vertex, then the expected total length of the chosen arcs will be n/(n+1) (right?), which approaches 1 for large n. Alas in general the chosen arcs won’t all be connected, so we’re not guaranteed a spanning tree.
2) If we modify the above algorithm by replacing “any other” with “already-picked”, we do get a spanning tree, but now the chosen arcs have an expected total length = the sum over k=1 to n of 1/(k+1) , which diverges as log n. Of course there’s no reason to expect this ‘greedy’ spanning tree algorithm to be optimal, so no conflict with the result you reported.
3) What is surprising to me is that the optimal spanning tree (whose expected length obviously lies between the above two values, 1 and log n) comes so close to the former: The constraint of having to span all n vertices has only a fixed cost relative to the unconstrained case #1 (and a modest one at that, only 20%).
I’m sorry, I don’t have much intuition for this stuff, so I don’t have anything interesting to say! Maybe someone else does?
I think you can get a rough estimate using the theory about giant components in random graphs. See Theorem 16.1 from https://people.eecs.berkeley.edu/~sinclair/cs271/n16.pdf. (In statement 2, that should be (0,1] not [0,1] because b=0 is always a solution.)
First use all edges smaller than 2c/n. There will be about cn of them, so for c>1, there will be a giant component C of size bn where b + exp(-bc) – 1 = 0. Delete edges from C until it is a tree T. The sum of weights in T is about bn(c/n) = bc.
Next, use edges with weights in [2c/n,1] to join G-T to T, using the edge with smallest weight for each vertex in G-T. There are about bn choices for each one, so these weights add up to about (1-b)n (2c/n + 1/(bn)) = (1-b)(2c_1/b). The total is bc + 2c – 2cb + 1/b – 1 = 2c – 1 – bc – 1/b. For example, if c=2, b is about .8 and this is 3-1.6+1.25=2.65.
Thanks, Graham! That gives a nice intuition for why the answer works out to be finite. Now that I reread Frieze’s paper I see some of this intuition lurking between the lines:
• Alan M. Frieze, On the value of a random minimum spanning tree problem, Discrete Applied Mathematics 10 (1985), 47–56.
Perhaps this was before people knew about, or talked about, the ‘giant component’ of a random graph.
I thought I’d heard about the ‘phase transition’ at c=1 as edges are added to an empty graph ages ago, possibly via you! So I did a little digging. Erdős and Rényi (1960) mention ‘giant’ components. https://www.renyi.hu/~p_erdos/1961-15.pdf
Thank you Graham and John. Graham you seem to have lowered the comprehension bar sufficiently that on a good day I may be able to clear it. Maybe this Saturday:-)
• Satohiro Tajima, Toru Yanagawa, Naotaka Fujii and Taro Toyoizumi, Untangling brain-wide dynamics in consciousness by cross-embedding, PLOS Computational Biology.
This thread is full of fun and interesting things!
Thanks! I just added an apology at the top, saying that I’m not really summarizing the talks—just taking notes on random cool facts that I’m learning here and don’t want to forget. Glad you like ’em.
Dear John, do you know if there is likely to be a video link for the talks? I know it’s not always the case in Japan, but if you happen to know I would appreciate it. I cannot tell from the workshop website.
Hi! No, unfortunately they are not making videos of the talks or even collecting the speakers’ slides on a website.
I got an email today from the conference organizers asking for my slides. So, many speakers’ slides will eventually appear on the conference website.
Nice work on network motifs in neuroscience, good for me to learn so I can start thinking about the network dynamics of neurons:
• Sen Song, Per Jesper Sjöström, Markus Reigl, Sacha Nelson, and Dmitri B Chklovskii, Highly nonrandom features of synaptic connectivity in local cortical circuits, PLOS Biology (2005).
• Michael W. Reimann, Max Nolte, Martina Scolamiero, Katharine Turner, Rodrigo Perin, Giuseppe Chindemi, Paweł Dłotko, Ran Levi, Kathryn Hess and Henry Markram, Cliques of neurons bound into cavities provide a missing link between structure and function, Frontiers in Computational Neuroscience 11 (2017), 48.
One coauthor on the second one is my friend the homotopy theorist Kathryn Hess, who invited me to this conference!
I heard about the above papers in this great talk:
• Carina Curto, Topological organization of neural networks.
It was a great talk because she gave a simple recipe to get ODEs from directed graphs, based on ideas of how neurons affect the firing of their neighbors, and has figured out how to read off qualitative properties of the ODE from the graph! She calls these ODEs, or maybe the graphs they come from, combinatorial threshold-linear networks.
Theorem. If the graph has no sinks, then the ODE has no stable fixed points.
Theorem. A clique is the support of a stable fixed point iff it is a clique with no target (i.e. a vertex outside the clique such that every vertex in the clique has an edge going to that vertex).
Her work is here:
• Katherine Morrison, Anda Degeratu, Vladimir Itskov and Carina Curto, Diversity of emergent dynamics in competitive threshold-linear networks: a preliminary report.
and here’s a poster summarizing it:
• Katherine Morrison, Anda Degeratu, Vladimir Itskov and Carina Curto, Emergent dynamics from network connectivity: a minimal model.
Thank you as usual for writing on topics like this! Of related interest might be “The Dream Child Hypothesis”– that “the self” only enters when the proprioception associated with breathing begins, and leaves when breathing stops. Could algebraic topology help test the hypothesis?
https://www.ncbi.nlm.nih.gov/pubmed/28127277
https://en.m.wikipedia.org/wiki/Neural_basis_of_self
Carina Curto also has a link to tunes made by neural networks.
I worked on motifs in (undirected) graphs, ages ago. The best idea I had was the following:
Let
be the probability that two vertices chosen at random from a graph are connected by an edge, and let
be the clustering coefficient, i.e., the probability that two vertices which have a common neighbor are themselves directly linked by an edge. Then, given a pattern of connections among a set of four vertices, one can make an educated guess about how many times that pattern will occur in the graph overall, by multiplying together some combination of
and
and
and
, with some symmetry factor. I think I got the formulae right in the end, because they made pretty good predictions for motif abundances in a certain class of graphs. Specifically, this worked well for random spatial graphs, made by sprinkling vertices randomly throughout a 2- or 3-dimensional box and connecting them when they fall within a threshold distance of each other. I never made a satisfactory write-up of this, mostly because I couldn’t find where to go with the idea next, but I did give a couple talks about it. In fact, back in 2013 I gave an hour-long lecture on “Mesoscale Structure in Complex Networks”, in which I described several “solutions looking for a problem”: ways of characterizing graphs that sounded interesting and seemed well-motivated at the time, but for which I’d never found a truly incisive application. And the general idea of a system having “structure at a scale of two components or three components, but no further structure at scale four” did feed into stuff that I worked on later.
You might be interested in this:
• John Baez, Networks and population biology (Part 4), 6 May 2011.
It’s about an attempt by Persi Diaconis to define random graphs where certain features show up. Here’s a teaser, which leaves out the shocking conclusion:
His approach seems very well-motivated by statistical mechanics. However, it doesn’t work! But the way it fails is itself interesting.
Interesting failures can be quite instructive!
And, funnily enough, the two papers I linked at the end there bring me around to the subject of Tom Leinster’s talk.
While I was working on the “multiscale information theory” effort described in the aforelinked papers, I tried to come up with a simpler setting for the basic ideas, one that would illustrate the basics and wouldn’t require taking logarithms of probabilities all over the place. Take a set
of questions, all of which pertain to some physical system—to keep it simple, make them all binary questions. We suppose that each question
carries one unit of information; or, to say it another way, answering any
removes one unit of uncertainty. Furthermore, we suppose that the questions are independent, in that answering any
does not help us to answer any other
.
Our physical system may contain multiple pieces, or components; call them
,
,
and so forth. Some of the questions in
may pertain to part
, but not all of them have to, so let’s call
the subset of
consisting of questions that each pertain to component
. Likewise for
and
. So, the total information content of component
is
: This is the total number of questions which we must get the answers to in order to remove our uncertainty about
. Then, the mutual information between components
and
is
We can go further and define a tertiary mutual information
in the same manner, and we can do even fancier things as the number of components in the system grows larger. This provides a kind of “toy version” of information theory: It doesn’t have the richness of Shannon theory, but it does furnish functions that satisfy some of the important properties of Shannon information.
So, keeping the familiar concepts of Shannon theory firmly set aside for the moment, in this more abstract setting, a natural puzzle arises. What if the questions in the set
are informative about each other? Maybe we want to keep probability theory at a distance for now, because it doesn’t feel natural to apply. Perhaps our notion of “similarity” between questions is something like, “Given the answer to question
, we could conceivably compute the answer to question
, given sufficient computer time and memory.” So, the first thing we try is to quantify this with a notion of “distance”
. Why should this work like a distance? Well, if
indicates how hard it is to calculate the answer to
given an answer to
, and
quantifies the difficulty of computing an answer to
given one for
, then the worst-case scenario for calculating an answer for
starting with
feels like it should be the sum total difficulty for the two-step path,
.
Thus, we put a metric on
, and we can talk about the magnitude of
, as well as of the subspaces within it. All our measures of information pick up a scale parameter
, which indicates the cost of computation.
Fun to think about, maybe, as a setting where various abstract ideas can be applied.