Networks and Population Biology (Part 4)

Today was the last day of the tutorials on discrete mathematics and probability in networks and population biology. Persi Diaconis gave two talks, one on ‘exponential families’ of random graphs and one on ‘exchangeability’. Since there’s way too much to summarize, I’ll focus on explaining ideas from the first talk, leaving you to read about the second here:

• Persi Diaconis and Svante Janson, Graph limits and exchangeable random graphs.

Susan Holmes also gave two talks. The first was on metagenomics and the human microbiome—very cool stuff. Did you know that your body contains 100 trillion bacteria, and only 10 trillion human cells? And you’ve got 1000 species of bacteria in your gut? Statistical ecologists are getting very interested in this.

Her second talk was about doing statistics when you’ve got lots of data of different kinds that need to be integrated: numbers, graphs and trees, images, spatial information, and so on. This is clearly the wave of the future. You can see the slides for this talk here:

• Susan Holmes, Heterogeneous data challenge: combining complex data.

The basic idea of Persi Diaconis’ talk was simple and shocking. Suppose you choose a random graph in the most obvious way designed to heighten the chance that it contains a triangle, or some other figure. Then in fact all you’ve done is change the chance that there’s an edge between any given pair of vertices!

But to make this precise—to make it even true—we need to say what the rules are.

For starters, let me point you back to part 2 for Persi’s definitions of ‘graph’ and ‘graph homomorphism’. If we fix a finite set \{1,\dots, n\}, there will be a big set \mathcal{G}_n of graphs with exactly these vertices. To define a kind of ‘random graph’, we first pick a probability measure on each set \mathcal{G}_n. Then, we demand that these probability measures converge in a certain sense as n \to \infty.

However, we can often describe random graphs in a more intuitive way! For example, the simplest random graphs are the Erdős–Rényi random graphs. These depend on a parameter p \in [0,1]. The idea here is that we take our set of vertices and for each pair we flip a coin that lands heads up with probability p. If it lands heads up, we stick in an edge between those vertices; otherwise not. So, the presence or absence of each edge is an independent random variable.
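
In code, the recipe looks like this (a minimal Python sketch of my own, not something from the talk):

```python
import random

def erdos_renyi(n, p, seed=None):
    """Sample an Erdős–Rényi random graph G(n, p): for each pair of
    vertices, flip an independent coin that lands heads with probability p."""
    rng = random.Random(seed)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p}

edges = erdos_renyi(10, 0.3, seed=42)
print(len(edges), "edges out of", 10 * 9 // 2, "possible pairs")
```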

Here’s a picture of an Erdős–Rényi random graph drawn by von Frisch, with a 1% chance of an edge between any two vertices. But it’s been drawn so that the best-connected vertices are near the middle, so it doesn’t look as random as it actually is:

People have studied the Erdős–Rényi random graphs very intensively, so now people are eager to study random graphs with more interesting correlations. For example, consider the graph where we draw an edge between any two people who are friends. If you’re my friend and I’m friends with someone else, that improves the chances that you’re friends with them! In other words, friends tend to form ‘triangles’. But in an Erdős–Rényi random graph there’s no effect like that.

‘Exponential families’ of random graphs seem like a way around this problem. The idea here is to pick a specific collection of graphs H_1, \dots, H_k and say how commonly we want these to appear in our random graph. If we only use one graph H_1, and we take this to be two vertices connected by an edge, we’ll get an Erdős–Rényi random graph. But, if we also want our graph to contain a lot of triangles, we can pick H_2 to be a triangle.

More precisely, remember from part 2 that t(H_i,G) is the fraction of functions from the vertices of H_i to the vertices of G that are actually graph homomorphisms. This is the smart way to keep track of how often H_i shows up inside G. So, we pick some numbers \beta_1, \dots , \beta_k and define a probability measure on \mathcal{G}_n as follows: the probability of any particular graph G \in \mathcal{G}_n should be proportional to

\displaystyle{\exp \left( \beta_1 \, t(H_1, G) + \cdots + \beta_k \, t(H_k, G) \right)}

If you’re a physicist you’ll call this a ‘Gibbs state’, and you’ll know this is the way to get a probability distribution that maximizes entropy while holding the expected values of t(H_i, G) constant. Statisticians like to call the whole family of Gibbs states, as we vary the numbers \beta_i, an ‘exponential family’. But the cool part, for me, is that we can apply ideas from physics—namely, statistical mechanics—to graph theory.
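
To make this concrete, here is a brute-force sketch in Python (my own illustration, only feasible for very small graphs, and not code from the talk). It computes the homomorphism density t(H, G) and the unnormalized Gibbs weight; actually normalizing would mean summing these weights over all of \mathcal{G}_n:

```python
import math
from itertools import product

def hom_density(H_edges, h, G_edges, g):
    """t(H, G): the fraction of all maps from the h vertices of H to the
    g vertices of G that send every edge of H to an edge of G."""
    G_set = {frozenset(e) for e in G_edges}
    good = sum(
        all(frozenset((f[a], f[b])) in G_set for a, b in H_edges)
        for f in product(range(g), repeat=h)
    )
    return good / g**h

def gibbs_weight(Hs, betas, G_edges, g):
    """Unnormalized weight exp(beta_1 t(H_1,G) + ... + beta_k t(H_k,G)).
    Hs is a list of (edge list, number of vertices) pairs."""
    return math.exp(sum(
        b * hom_density(He, h, G_edges, g) for b, (He, h) in zip(betas, Hs)
    ))

# Example: the edge + triangle family, evaluated on a 4-cycle.
edge = [(0, 1)]
triangle = [(0, 1), (1, 2), (2, 0)]
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(hom_density(triangle, 3, square, 4))  # 0.0: a 4-cycle has no triangles
print(gibbs_weight([(edge, 2), (triangle, 3)], [1.0, 2.0], square, 4))
```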

So far we’ve got a probability measure on our set \mathcal{G}_n of graphs with n vertices. These probability measures converge in a certain sense as n \to \infty. But Diaconis and Chatterjee proved a shocking theorem: for almost all choices of the graphs H_i and numbers \beta_i > 0, these probability measures converge to an Erdős–Rényi random graph! And in the other cases, they converge to a probabilistic mixture of Erdős–Rényi random graphs.

In short, as long as the numbers \beta_i are positive, exponential families don’t buy us much. We could just work with Erdős–Rényi random graphs, or probabilistic mixtures of these. The exponential families are still very interesting to study, but they don’t take us into truly new territory.

The theorem is here:

• Sourav Chatterjee and Persi Diaconis, Estimating and understanding random graph models.

To reach new territory, we can try letting some \beta_i be negative. The paper talks about this too. Here many questions remain open!

18 Responses to Networks and Population Biology (Part 4)

  1. John Baez says:

    I guess this post was a real conversation-stopper.

    I just thought it was incredibly cool that if you try to choose a random graph in the most obvious sort of way designed to boost the chances of it including a triangle (or any other sort of shape), you wind up simply changing the chance that there’s an edge between any pair of vertices.

    However, I was probably too eager to state this idea precisely to do a good job of explaining it.

    I’ve rewritten the article and tried to explain the idea a tiny bit better now. There’s even a picture now! There should be lots, but I’m too lazy to draw them.

    • Web Hub Tel says:

      I guess this post was a real conversation-stopper.

      This blog is interesting in that responses seem to dribble in over time.

      My question is whether this Erdős–Rényi graph has ever been observed in real life, other than by simulation? Is it that the various forms of clustering prevent it from ever occurring?

      • John Baez says:

        Web Hub Tel wrote:

        This blog is interesting in that responses seem to dribble in over time.

        The other blog I participate in, the n-Category Café, works even more that way. There we talk about some fairly abstruse pure mathematics. Math seems to have a long half-life: sometimes we’ll get stuck on a question and then a year or two later someone will answer it! I really like that.

        My question is whether this Erdős–Rényi graph has ever been observed in real life, other than by simulation? Is it that the various forms of clustering prevent it from ever occurring?

        While they’re mathematically very pretty and one of the first kinds of random graphs to be studied, Erdős–Rényi random graphs don’t seem at all common in nature. It’s hard to get a large number of things each of which has an equal probability of being connected to each other one. If they’re distributed in space, for example, it’s more likely for them to be connected to their neighbors. Even in other situations, like webpages with links to each other, it’s very common that if A is connected to B and B is connected to C, then A has an enhanced probability of being connected to C.

        So you’re exactly right: in real life, unlike in Erdős–Rényi random graphs, clusters tend to form.

        The exponential families of random graphs discussed here are an interesting failed attempt to create clustering.

        Also, Erdős–Rényi random graphs are ‘dense’ graphs, meaning that they have lots of edges. There’s been a lot of interest lately in ‘sparse’ graphs, which seem more common in nature. An especially popular topic of research is graphs where the fraction of vertices with d edges goes like 1/d^{p} for some power p. Such a power law has been reported for the world-wide web. For this, try:

        • William Aiello, Fan Chung and Linyuan Lu, A random graph model for power law graphs.

        (What does ‘lots’ of edges mean, exactly? It seems people haven’t made up their minds. However, the average number of edges coming out of any vertex in an Erdős–Rényi random graph is proportional to the number of other vertices. That counts as ‘lots’.)
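
        Here’s a tiny numerical illustration of that last point (a sketch of my own using networkx, not from the discussion): at fixed p, the average degree grows linearly with the number of vertices.

        ```python
        import networkx as nx

        p = 0.01  # fixed edge probability, as in the picture above
        for n in [200, 400, 800]:
            g = nx.erdos_renyi_graph(n, p)
            avg_deg = 2 * g.number_of_edges() / n
            print(n, round(avg_deg, 2), "vs expected", p * (n - 1))
        ```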

        • Web Hub Tel says:

          Yes, I have been studying the “actual” web linking graphs. A few years ago, when the web was still growing, Cosma Shalizi said that the connectivity was not a power law, based on the data then collected. Yet over time it has converged to a power law fairly close to -2. Actually -2.15 for link connectivity and -1.8 for link strength. I wrote a section on this behavior in http://TheOilConundrum.com. The scientific citation link connectivity is also really close to -2 for the cumulative distribution, or -3 for the PDF. This is a very good data set for exploring, taken from the Institute of Scientific Information.

    • DavidTweed says:

      My personal impression is that the problem is you were too good at explaining it. One of the counter-intuitive things is that if you post something everyone understands and agrees with, relatively few people will post replies saying “That was a well-written post that I agree with”; conversely, if someone either doesn’t understand something or disagrees, they’re more likely to post a question or a comment. (One thing about the n-cafe is that most of the readers there are in the rare position of knowing enough and being quick enough to actually post questions about ramifications beyond the immediate post.)

      Personally, I currently feel like I ought to do some more reading to understand enough that I can formulate a coherent question. My vague thoughts are that I really ought to understand in what sense the measures “converge”, because clearly a graph that has “interesting” structure is often dominated by the “uninteresting” bits. (This wording is unclear, but consider a general tree graph: an interesting bit is the degrees of the nodes but, except for certain pathological trees, the largest number of nodes are the “leaves”, which are (relatively) uninteresting.) So does the convergence happen because of a combination of

      1. in a “Gibbs state” graph the fraction of “interesting” bits declines as the no of nodes increases

      2. these uninteresting bits can be mapped to uninteresting bits of an Erdős–Rényi graph?

      I know I’m not being at all clear here: it’s sort of like the difference between f_n and g_n converging in the \int (f_n(x)-g_n(x))^2 \, dx norm and in the \max_x (f_n(x)-g_n(x))^2 norm.

      • DavidTweed says:

        Back on the topic of blog responses, they may follow a phenomenon similar to Parkinson’s bikeshed effect. The wikipedia summary is (in the context of an imagined meeting with a nuclear power plant and a bikeshed on the agenda):

        A nuclear reactor is used because it is so vastly expensive and complicated that an average person cannot understand it, so they assume that those working on it understand it. Even those with strong opinions often withhold them for fear of being shown to be insufficiently informed. On the other hand, everyone understands a bicycle shed (or thinks he or she does), so building one can result in endless discussions because everyone involved wants to add his or her touch and show that they have contributed. While discussing the bikeshed, debate emerges over whether the best choice of roofing is aluminium, asbestos, or galvanized iron, rather than whether the shed is a good idea or not.

        I don’t think this applies as is to blogs, but I suspect there’s a similarish thing happening. (Parkinson’s book is an amusing read which sometimes puts its finger on what IMO are real effects.)

        • Web Hub Tel says:

          Thanks for that info. I knew such a law had to exist, but I didn’t know what it was called. I witnessed a recent example of this when a flurry of discussion ensued on an otherwise complex topic after someone commented on Europeans’ use of a comma as a decimal point indicator. (I hope that doesn’t happen here :)

      • John Baez says:

        Dave wrote:

        (One thing about the n-cafe is that most of the readers there are in the rare position of knowing enough and being quick enough to actually post questions about ramifications beyond the immediate post.)

        Yes, that’s what I love about that blog. David Corfield is remarkably good at thinking of questions that would be worth pursuing—sometimes I think he’s a mathematician who poses as a philosopher in order to get other people to fill in the details. But there are plenty of other people there who also generate new ideas.

        I’m trying to make this blog into a similar place, and it’s really not doing so bad, but the subject matter is more spread out, so it’s proving a bit harder to round up a crew of experts who will jump in at the slightest provocation and push the conversation ahead. I’m also worried that talking about topics like graph theory throws cold water on the people who are mainly interested in ‘save the planet’ type issues.

        Also, I think having some co-bloggers would help. I’ve tried to round up some, so far with limited success. I’m looking for scientists with a strong interest in environmental and energy issues, who like to post fact-filled blog articles!

        I wish I could kidnap Steve Easterbrook and force him to post on this blog, for example.

  2. David Corfield says:

    That makes the link to exchangeability clearer now. I like the result that if the probability you assign to any sequence of 0s and 1s emerging as the outcomes of an experiment is invariant under permutation, then your degrees of belief in the sequences are a probabilistic mixture of Bernoulli distributions, for some distribution over [0, 1].

    But I’m still stumbling here

    …for almost all choices of the graphs H_i and numbers \beta_i > 0, these probability measures converge to an Erdős-Rényi random graph! And in the other cases, they converge to a probabilistic mixture of Erdős-Rényi random graphs.

    Oh, isn’t there something about the random graph not depending on the edge probability, so long as it isn’t 0 or 1? Then I’m not sure what your statement meant by ‘the other cases’. Could you elaborate?

    • John Baez says:

      David wrote:

      Oh, isn’t there something about the random graph not depending on the edge probability, so long as it isn’t 0 or 1?

      No, the Erdős-Rényi random graphs for different edge probabilities p \in [0,1] are all quite different.

      What I mean is that for almost all choices of graphs H_i and numbers \beta_i, the trick I describe for creating a random graph merely manages to spit out an Erdős-Rényi random graph. For example: you take an Erdős-Rényi random graph and you exponentially suppress or enhance the presence of triangles in this graph… but all you manage to get is another Erdős-Rényi random graph.

      That should be utterly shocking! If it doesn’t shock you, you don’t get it yet. It’s as if you tried to hire people with higher IQs and discovered the effect was exactly the same as if you’d tried to hire people with worse eyesight. Sure, they may be correlated, but…

      Then I’m not sure what your statement meant by ‘the other cases’.

      Well, the scenario I just described is only true for almost all choices of graphs H_i and numbers \beta_i. If you fine-tune those numbers \beta_i, you may luck out. Then something slightly more interesting happens: you get a probabilistic mixture of Erdős-Rényi random graphs.

      Persi said this was first noticed in numerical experiments. People would repeatedly try to produce random graphs according to a specific recipe, and 30% of the time (say) the graph they got looked like an Erdős-Rényi random graph with a 1% chance of having an edge between any given two vertices, while 70% of the time it looked like one with a 20% chance of having an edge between vertices. They must have thought there was a strange bug in their program!
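
      Here’s a rough Python sketch (my own illustration, not code from the talk or the paper) of the kind of simulation Persi described: a Metropolis chain for the edge–triangle exponential family. One detail worth flagging: Chatterjee and Diaconis scale the exponent by n^2, which is what makes the large-n behaviour nontrivial, so this sketch does too. Running many independent chains and comparing their final edge densities is one way the ‘mixture’ phenomenon shows up.

      ```python
      import numpy as np

      def densities(A, n):
          """Homomorphism densities of an edge and a triangle in the graph
          with adjacency matrix A (counts divided by n^2 and n^3)."""
          t_edge = A.sum() / n**2             # each edge gives 2 homomorphisms
          t_tri = np.trace(A @ A @ A) / n**3  # each triangle gives 6
          return t_edge, t_tri

      def metropolis(n, beta1, beta2, steps, seed=0):
          """Sample with weight exp(n^2 (beta1 t(edge,G) + beta2 t(triangle,G)))
          by proposing single edge flips."""
          rng = np.random.default_rng(seed)
          A = np.zeros((n, n), dtype=int)     # start from the empty graph
          log_w = n**2 * np.dot([beta1, beta2], densities(A, n))
          for _ in range(steps):
              i, j = rng.choice(n, size=2, replace=False)
              A[i, j] ^= 1; A[j, i] ^= 1      # propose flipping edge {i, j}
              new_log_w = n**2 * np.dot([beta1, beta2], densities(A, n))
              if np.log(rng.random()) < new_log_w - log_w:
                  log_w = new_log_w           # accept the flip
              else:
                  A[i, j] ^= 1; A[j, i] ^= 1  # reject: undo it
          return A

      A = metropolis(n=30, beta1=-1.0, beta2=1.0, steps=20000)
      print("edge density:", A.sum() / (30 * 29))
      ```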

    • David Corfield says:

      Oh, I was thinking of countable graphs, where you get the same Rado graph whatever the p. I thought you were talking about the limit of graphs as the vertex number n \to \infty. I’d better read it again.

      Is anything known about fine-tuning the \beta_i?

      • John Baez says:

        I am talking about the limit of graphs as the vertex number n \to \infty. The ‘shocking result’ holds in this limit. Part of why it’s less shocking, in the end, is that it only holds in this limit.

        I don’t know what a Rado graph is, and right now I’m too distracted to find out. I was talking about Erdös-Rényi random graphs. These are neither finite graphs nor countable graphs, but ‘graph limits’ in the sense I explained in part 2.

        (Well, maybe there’s some way to think of them as countable graphs, but Diaconis was thinking of them as limits, in a certain sense I explained, of finite graphs.)

        David wrote:

        Is anything known about fine-tuning the \beta_i?

        Yes, there are beautiful color pictures of this in the paper:

        • Sourav Chatterjee and Persi Diaconis, Estimating and understanding random graph models.

        You’ll see a phase transition, and right at that phase transition you get a probabilistic mix of Erdös-Rényi random graphs, just like how water right at its melting point is a mix of liquid water and ice. It’s statistical mechanics, but with graphs. The name \beta_i is chosen to remind you of inverse temperature, but now there’s one kind of ‘temperature’ for each finite graph.

      • David Corfield says:

        You knew what the Rado graph is 18 months ago, and it’s mentioned on the Erdős–Rényi model page you link to, where it says

        Rado graph, the graph formed by extending the G(n, p) model to graphs with a countably infinite number of vertices. Unlike in the finite case, the result of this infinite random process is always (with probability 1) the same graph, up to isomorphism.

        The result I was mentioning is at Rado Graph:

        One may form an infinite graph in the Erdős–Rényi model by choosing, independently and with probability 1/2 for each pair of vertices, whether to connect the two vertices by an edge. With probability 1 the resulting graph has the extension property, and is therefore isomorphic to the Rado graph. This construction also works if any fixed probability p not equal to 0 or 1 is used in place of 1/2. This result, shown by Paul Erdős and Alfréd Rényi in 1963, justifies the definite article in the common alternative name “the random graph” for the Rado graph: for finite graphs, repeatedly drawing a graph from the Erdős–Rényi model will often lead to different graphs, but for countably infinite graphs the model almost always produces the same graph.

        • John Baez says:

          David wrote:

          You knew what the Rado graph is 18 months ago…

          I’ve learned more since coming to Singapore 10 months ago and switching fields than in any comparable period for many years. I guess I had to trash some old knowledge.

          (Maybe that’s it. But in fact, I always wrote This Week’s Finds so that I could look back at it when I forgot stuff.)

          Anyway, that’s interesting. It goes to show why graph limits are useful: they let us make finer distinctions than if we were simply working with an infinite graph! One wants to say that a situation where you have a 10% chance of knowing each other person in an infinite hotel is different from a situation where you have just a 1% chance. But due to the ‘Hilbert hotel’ effect, these almost surely give isomorphic infinite graphs: to make the 1% situation look isomorphic to the 10% situation, you just put more of the more popular guys in the rooms with lower room numbers.

  3. physjam says:

    That’s interesting! An often used measure in network theory (particularly in social networks) is the clustering coefficient (which is proportional to the number of graph homomorphisms of the complete graph on three vertices). I think this implies that the clustering coefficient and mean degree don’t produce a new distribution of graphs as the number of nodes goes to infinity, so in this limit the clustering coefficient is somehow unimportant.

    It is important to remember, as Chatterjee and Diaconis point out, that there are many measures of graphs other than embeddings of subgraphs; for instance there is mean shortest path length, or spectral properties. I wouldn’t expect them all to lead to Erdős–Rényi random graphs.

    For instance, Watts and Strogatz constructed a model whose mean degree, mean shortest path length and clustering coefficient are inconsistent with those of an Erdős-Rényi random graph on the same number of vertices. So I would expect that an exponential graph with these three parameters wouldn’t converge to an Erdős-Rényi graph.

    I think it would also be interesting to ask about exponential random graphs on a graph with a prescribed degree distribution. In (often directed) graphs there is the notion of network motifs, which are the number of occurrences of a small subgraph in a graph relative to the number in a random graph with the same degree distribution. (The number of occurrences of a subgraph is not determined by the number of graph homomorphisms between them; but I think prescribing the proportions of occurrences of all subgraphs with fewer than m vertices is the same as prescribing the proportion of graph homomorphisms of all graphs with fewer than m vertices.) It would be interesting to see if in this case exponential random graphs lead to new types of graph distributions.
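
    As a quick empirical check of the Watts–Strogatz point above, here is a small sketch using networkx (my own illustration; the parameter values are arbitrary). The small-world graph has far higher clustering than an Erdős–Rényi graph with the same mean degree:

    ```python
    import networkx as nx

    n, k, rewire = 1000, 10, 0.1
    ws = nx.watts_strogatz_graph(n, k, rewire)  # ring lattice with rewiring
    er = nx.erdos_renyi_graph(n, k / (n - 1))   # same expected mean degree k

    print("Watts-Strogatz clustering:", nx.average_clustering(ws))
    print("Erdos-Renyi clustering:   ", nx.average_clustering(er))
    # Typically about 0.49 here, versus about 0.01.
    ```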

  4. We talked here before about Erdős–Rényi random graphs […]

  5. To what extent are E. coli populations kept under control by phages, or perhaps somehow by other viruses?

    It depends on the environment, but they’re mostly kept in check by predators and host immune mechanisms (phagocytosis, oxidative damage, antibodies, complement, etc.).

    • If we released a strain of virus-resistant E. coli into the wild, could it take over, thanks to this advantage?

    It would still be susceptible to the above, so most of the usual eco-feedback would apply (abundant species become the favorite snack) — but it could have an advantage over other E. coli, so we are making the strain dependent on a chemical not found in nature and testing it in the lab for a variety of environmental scenarios.

    • What could the effects be? For example, if the E. coli in my gut became virus-resistant, would their populations grow enough to make me notice?

    E.coli is generally only 0.1% of your gut flora, and making this the (hypothetically) best E.coli wouldn’t make it the best of all bacteria, but anyway I would discourage that experiment (at least until we learn and engineer it a bit more).

    • What are some of the coolest possible applications of this new MAGE/CAGE technology?

    Optimizing production and lowering costs of chemicals and therapeutics. Also virus-resistant industrial, bioenergy and agricultural species.

    • What did people actually do with that strain of E. coli that ‘reads through’ amber?

    Beginning to do all of the above.

    • How could such a strain be viable, anyway? Does it mostly avoid using the amber codon, or does it somehow survive having a lot of big proteins where a normal E. coli would have smaller ones?

    All of the amber (UAG) codons in the genome at the ends of protein coding regions are changed to “synonymous” UAA codons.

    • Do humans use selenoproteins containing selenocysteine?

    Yes. They do: http://nar.oxfordjournals.org/content/37/18/6259.full.pdf+html

    • How does our body tell when opal is getting used to code for selenocysteine, and when it’s getting used as a stop codon?

    The secondary structure of mRNA near the opal codon.

    • Are there any cool theories about how life evolved to use selenium, and how the opal codon got hijacked for this secondary purpose?

    It’s not a particularly rare element and it’s useful for oxidation-reduction reactions (but I probably wasn’t involved in the hijacking — too young — alibi).

    The paper is here: http://arep.med.harvard.edu/gmc_pub.html

    • John Baez says:

      Yay! Thanks! This is a great example of how, whenever I ask some hard questions, eventually some smart person comes along and answers them. It took a while, but you did it. I was too stupid to think of deliberately changing the amber codons to another stop codon while creating a strain of E. coli that ‘reads through’ amber. I’m still a bit amazed that the body has a special way of telling when to treat the opal codon as a stop codon and when to treat it as a code for selenocysteine.

      By the way, yesterday I discovered something interesting about another essential trace element: molybdenum. Turns out molybdenum concentrations in black shales shot up between 663 and 551 million years ago! This is “the clearest confirmation that there was a steep rise in atmospheric oxygen at the end of the Proterozoic.” I read this in a section on molybdenum in Revolutions that Made the Earth, a wonderful book by Tim Lenton and Andrew Watson. They don’t mention molybdenum’s role in biology, but I’ll stick my neck out and conjecture that this role first arose after that time.
