I want to keep telling you about information geometry… but I got sidetracked into thinking about something slightly different, thanks to some fascinating discussions here at the CQT.
There are a lot of people interested in entropy here, so some of us — Oscar Dahlsten, Mile Gu, Elisabeth Rieper, Wonmin Son and me — decided to start meeting more or less regularly. I call it the Entropy Club. I’m learning a lot of wonderful things, and I hope to tell you about them someday. But for now, here’s a little idea I came up with, triggered by our conversations:
• John Baez, Rényi entropy and free energy.
In 1960, Alfréd Rényi defined a generalization of the usual Shannon entropy that depends on a parameter. If $p$ is a probability distribution on a finite set, its Rényi entropy of order $\beta$ is defined to be

$$ H_\beta = \frac{1}{1 - \beta} \ln \sum_i p_i^\beta $$

where $0 \le \beta < \infty$. This looks pretty weird at first, and we need $\beta \ne 1$ to avoid dividing by zero, but you can show that the Rényi entropy approaches the Shannon entropy as $\beta$ approaches $1$:

$$ \lim_{\beta \to 1} H_\beta = - \sum_i p_i \ln p_i $$

(A fun puzzle, which I leave to you.) So, it’s customary to define $H_1$ to be the Shannon entropy… and then the Rényi entropy generalizes the Shannon entropy by allowing an adjustable parameter $\beta$.
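If you like checking such things numerically, here's a minimal sketch in Python (the function name renyi_entropy and the example distribution are my own choices):

```python
import numpy as np

def renyi_entropy(p, beta):
    """Rényi entropy of order beta, in nats: ln(sum_i p_i^beta) / (1 - beta)."""
    p = np.asarray(p, dtype=float)
    if np.isclose(beta, 1.0):
        return -np.sum(p * np.log(p))  # the beta -> 1 limit: Shannon entropy
    return np.log(np.sum(p ** beta)) / (1.0 - beta)

p = [0.5, 0.3, 0.2]
for beta in [0.5, 0.99, 1.0, 1.01, 2.0]:
    print(beta, renyi_entropy(p, beta))
# The values at beta = 0.99 and 1.01 bracket the Shannon entropy at beta = 1.
```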
But what does it mean?
If you ask people what’s good about the Rényi entropy, they’ll usually say: it’s additive! In other words, when you combine two independent probability distributions into a single one, their Rényi entropies add. And that’s true — but there are other quantities that have the same property. So I wanted a better way to think about Rényi entropy, and here’s what I’ve come up with so far.
Any probability distribution can be seen as the state of thermal equilibrium for some Hamiltonian at some fixed temperature, say $T = 1$. And that Hamiltonian is unique. Starting with that Hamiltonian, we can then compute the free energy $F$ at any temperature $T$, and up to a certain factor this free energy turns out to be the Rényi entropy $H_\beta$, where $\beta = 1/T$. More precisely:

$$ H_\beta = \frac{F}{1 - T} $$

So, up to the fudge factor $1 - T$, Rényi entropy is the same as free energy. It seems like a good thing to know — but I haven’t seen anyone say it anywhere! Have you?
Let me show you why it’s true — the proof is pathetically simple. We start with our probability distribution $p$. We can always write

$$ p_i = e^{-E_i} $$

for some real numbers $E_i$ (at least when every $p_i$ is nonzero). Let’s think of these numbers as energies. Then the state of thermal equilibrium, also known as the canonical ensemble or Gibbs state at inverse temperature $\beta$ (I’m using units where Boltzmann’s constant is $1$, so $\beta = 1/T$), is the probability distribution

$$ \frac{e^{-\beta E_i}}{Z(\beta)} $$

where $Z(\beta)$ is the partition function:

$$ Z(\beta) = \sum_i e^{-\beta E_i} $$

Since $Z(\beta) = 1$ when $\beta = 1$, the Gibbs state reduces to our original probability distribution at $\beta = 1$.
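Here's a quick numerical illustration of this setup, again a sketch with names of my own choosing:

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])   # a distribution with all entries nonzero
E = -np.log(p)                  # energies chosen so that p_i = exp(-E_i)

def gibbs(E, beta):
    """Gibbs state at inverse temperature beta for the energies E."""
    w = np.exp(-beta * E)
    return w / w.sum()          # w.sum() is the partition function Z(beta)

print(gibbs(E, 1.0))            # recovers p, since Z(1) = sum_i p_i = 1
print(gibbs(E, 2.0))            # colder: more sharply peaked on the likeliest outcome
```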
Now in thermodynamics, the quantity

$$ F = -\frac{1}{\beta} \ln Z(\beta) $$
is called the free energy. It’s important, because it equals the total expected energy of our system, minus the energy in the form of heat. Roughly speaking, it’s the energy that you can use.
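In code, under the same conventions as before (free_energy is my name for it, and $T = 1/\beta$):

```python
import numpy as np

def free_energy(E, T):
    """Free energy F = -T ln Z(1/T), in units where Boltzmann's constant is 1."""
    Z = np.exp(-E / T).sum()    # partition function at inverse temperature 1/T
    return -T * np.log(Z)

E = -np.log(np.array([0.5, 0.3, 0.2]))
print(free_energy(E, 1.0))      # 0.0: Z(1) = 1, so the free energy vanishes at T = 1
print(free_energy(E, 2.0))      # negative, since Z > 1 when T > 1
```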
Let’s see how the Rényi entropy is related to the free energy. The proof is a trivial calculation:

$$ H_\beta = \frac{1}{1 - \beta} \ln \sum_i p_i^\beta = \frac{1}{1 - \beta} \ln \sum_i e^{-\beta E_i} = \frac{1}{1 - \beta} \ln Z(\beta) = \frac{\beta}{\beta - 1} F $$

at least for $\beta \ne 1$. But you can also check that both sides of this equation have well-defined limits as $\beta \to 1$.
The relation between free energy and Rényi entropy looks even neater if we solve for $F$ and write the answer using the temperature $T = 1/\beta$ instead of $\beta$:

$$ F = (1 - T) \, H_{1/T} $$
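Here's a sketch that checks this identity numerically for a few temperatures (again with made-up variable names):

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])
E = -np.log(p)                                    # p_i = exp(-E_i)

for T in [0.25, 0.5, 2.0, 4.0]:
    beta = 1.0 / T
    H = np.log(np.sum(p ** beta)) / (1.0 - beta)  # Rényi entropy of order beta
    F = -T * np.log(np.sum(np.exp(-beta * E)))    # free energy at temperature T
    print(T, F, (1.0 - T) * H)                    # the last two columns agree
```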
So, what’s this fact good for? I’m not sure yet! In my paper, I combine it with this equation:

$$ F = \langle E \rangle - T S $$

Here $\langle E \rangle$ is the expected energy in the Gibbs state at temperature $T$:

$$ \langle E \rangle = \sum_i E_i \, \frac{e^{-\beta E_i}}{Z(\beta)} $$

while $S$ is the usual Shannon entropy of this Gibbs state. I also show that all this stuff works quantum-mechanically as well as classically. But so far, it seems the main benefit is that Rényi entropy has become a lot less mysterious. It’s not a mutant version of Shannon entropy: it’s just a familiar friend in disguise.
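If you want to see that last equation in action, here's one more little sketch (my own variable names again) confirming $F = \langle E \rangle - T S$ for the Gibbs state above:

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])
E = -np.log(p)

T = 2.0
Z = np.sum(np.exp(-E / T))              # partition function at inverse temperature 1/T
g = np.exp(-E / T) / Z                  # Gibbs state at temperature T
mean_E = np.dot(g, E)                   # expected energy <E>
S = -np.sum(g * np.log(g))              # Shannon entropy of the Gibbs state
print(-T * np.log(Z), mean_E - T * S)   # F and <E> - T S agree
```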