• Relative Entropy (Part 1): how various structures important in probability theory arise naturally when you do linear algebra using only the nonnegative real numbers.

• Relative Entropy (Part 2): a category related to statistical inference, and how relative entropy defines a functor on this category.

But then Tobias Fritz noticed a big problem.

This covers the case when We also can’t figure out if isn’t in the image of

But now I see that second case never comes up! In this situation

we can show that the function is onto. The paper I’m writing with Tobias will explain this…

And this is something we really don’t want, usually. Taken together with

that would imply a very strong condition on our function

Can anyone see what this condition amounts to?

I think that the equation ‘q then s yields the same distribution on X as p does’ answers your exercise (in other words, as you imply above, if its diagram commutes a hypothesis is optimal).

“unless there are observations that happen with probability zero—that is, unless there are ”: Do you mean instead of ?

Yes, thanks—I’ll fix that. Deliberately calling an element of “” would be a heinous mathematical crime… here in Singapore I’d probably get caned for it.

I’m really glad you found the exposition clear! I’m warming up to write a paper about this, so I need to know what works and what doesn’t.

I’ll tell you what it means for a diagram to commute. Here’s a simple example:

A diagram commutes whenever, given two ways to get from one object to another by following a chain of arrows, those two ways are equal. In this case we have two ways to get from to There's a direct way using just and an indirect way using first and then But they are equal—that's what the equation below the diagram says. If I were talking to category theorists, I could skip the equation and just say "the diagram commutes".
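If it helps to see this concretely, here is a minimal Python sketch of a commuting triangle of functions between finite sets. The names `f`, `g`, `h` and the sets they act on are made up for illustration; they are not the symbols from the diagram above. Each function is stored as a dict, and the triangle commutes exactly when doing `f` and then `g` gives the same function as `h`.

```python
# Hypothetical triangle: f goes from A to B, g from B to C, h straight from A to C.
# A finite function is stored as a dict sending each input to its output.

def compose(g, f):
    """Return the composite 'first f, then g' as a dict."""
    return {a: g[f[a]] for a in f}

def triangle_commutes(f, g, h):
    """True when the indirect route (f then g) equals the direct route h."""
    return compose(g, f) == h

f = {1: "x", 2: "y", 3: "x"}            # A = {1, 2, 3}, B = {"x", "y"}
g = {"x": True, "y": False}             # C = {True, False}

h_good = {1: True, 2: False, 3: True}   # agrees with 'f then g'
h_bad = {1: True, 2: True, 3: True}     # disagrees at the input 2

print(triangle_commutes(f, g, h_good))
print(triangle_commutes(f, g, h_bad))
```

With `h_good` the two routes agree at every input, so the check succeeds; with `h_bad` they differ at one input, so it fails. All of that checking is what gets packed into the single phrase "the diagram commutes".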

This pays off in situations where we have big complicated diagrams like this:

(from the proof of the ‘zig-zag lemma’ on Wikipedia) or this:

(a typical sort of thing you see on an algebraic topologist’s whiteboard, drawn by Patrick Orson.) Instead of writing down dozens of equations, you get a nice visual depiction of what’s going on, and you can learn to reason very quickly with these diagrams.

Unfortunately my post is not a great introduction to commutative diagrams, for two reasons.

First, I’m heavily using *two kinds of arrows*, straight ones and wiggly ones. This is a bit nonstandard; it means my diagrams involve not just one category but two: the category of finite sets and stochastic maps (wiggly arrows), and a subcategory of that, the category of finite sets and functions (straight arrows). If you’re just getting started at this game, you’d want to start with one kind of arrow.

Second, lots of my diagrams *don’t* commute! Or at least they don’t completely commute, so I have to say which equations *do* hold by writing them below the diagram, like here:

As an exercise you can try to guess an equation not listed here that *would* hold if the diagram *did* commute.
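If you'd like to play with the two kinds of arrows on a computer, here is a rough Python sketch. It assumes a stochastic map from a finite set X to a finite set Y is stored as a dict sending each point of X to a probability distribution on Y, and that a function is the special case where each distribution puts probability 1 on a single point. The helper names (`compose_stochastic`, `from_function`, `is_stochastic`) and the example sets are my own, chosen just for illustration.

```python
from fractions import Fraction

def compose_stochastic(t, s):
    """Composite 'first s, then t' of stochastic maps stored as {input: {output: prob}}."""
    result = {}
    for x, dist_y in s.items():
        out = {}
        for y, p in dist_y.items():
            for z, q in t[y].items():
                # Sum over all intermediate points y.
                out[z] = out.get(z, Fraction(0)) + p * q
        result[x] = out
    return result

def is_stochastic(s):
    """Check that every input is sent to a genuine probability distribution."""
    return all(
        sum(dist.values()) == 1 and all(p >= 0 for p in dist.values())
        for dist in s.values()
    )

def from_function(f):
    """Turn a function (straight arrow) into a stochastic map (wiggly arrow)."""
    return {x: {y: Fraction(1)} for x, y in f.items()}

# A wiggly arrow from {"a", "b"} to {"u", "v"}.
s = {
    "a": {"u": Fraction(1, 2), "v": Fraction(1, 2)},
    "b": {"v": Fraction(1)},
}

# Two straight arrows: f from {"u", "v"} to {0, 1}, then g from {0, 1} to {"p"}.
f = {"u": 0, "v": 1}
g = {0: "p", 1: "p"}

# Wiggly followed by straight is still a stochastic map.
composite = compose_stochastic(from_function(f), s)
print(is_stochastic(composite))

# Composing functions first, or converting first and then composing, agree.
g_after_f = {x: g[f[x]] for x in f}
print(from_function(g_after_f) == compose_stochastic(from_function(g), from_function(f)))
```

The last check is the sense in which functions form a subcategory: composing two straight arrows as functions gives the same result as treating them as wiggly arrows and composing those. Exact fractions are used so the equality tests aren't disturbed by floating-point rounding.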