My fundamental interest in this geometry is its applications to classical statistical mechanics. This type of geometry was inspired by Ruppeiner’s geometry of thermodynamics. The metric tensor of that formulation was modified by replacing the usual derivatives with Levi-Civita covariant derivatives:

$g_{ij}(x) = -D_{i} D_{j} S(x)$.

Here, $S$ is the thermodynamic entropy. Since the entropy is a scalar function, the above relation guarantees the covariance of the metric tensor. Moreover, the thermodynamic entropy enters into the mathematical expression of Einstein’s postulate of classical fluctuation theory, which should be extended as follows:

$dp(x) = A \exp\left[S(x)/k_{B}\right]\sqrt{\left|g_{ij}(x)\right|}\,dx$

to guarantee the scalar character of the entropy. The Gaussian distribution that I commented on above is an exact improvement of the Gaussian approximation of the classical fluctuation theory of statistical mechanics.

The Levi-Civita affine connections are obtained from the metric tensor in the usual way:

$\Gamma^{k}_{ij} = \frac{1}{2} g^{kl}\left(\partial_{i} g_{jl} + \partial_{j} g_{il} - \partial_{l} g_{ij}\right),$

and they are the only torsion-less affine connections whose covariant derivative obeys the condition of Levi-Civita parallelism:

$D_{k}\, g_{ij} = 0$.
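This parallelism condition is easy to check symbolically. The following small sympy sketch (my own illustration, using the flat plane in polar coordinates) builds the Christoffel symbols from a metric and verifies that the covariant derivative of the metric vanishes:

```python
# Symbolic check (sympy) that the Christoffel symbols built from a metric
# give a connection satisfying Levi-Civita parallelism:
# D_k g_ij = d_k g_ij - Gamma^l_{ki} g_lj - Gamma^l_{kj} g_il = 0.
import sympy as sp

r, t = sp.symbols('r t', positive=True)
x = [r, t]
g = sp.Matrix([[1, 0], [0, r**2]])   # Euclidean plane in polar coordinates
ginv = g.inv()
n = 2

def christoffel(k, i, j):
    # Gamma^k_{ij} = (1/2) g^{kl} (d_i g_{lj} + d_j g_{li} - d_l g_{ij})
    return sp.simplify(sum(
        ginv[k, l] * (sp.diff(g[l, j], x[i]) + sp.diff(g[l, i], x[j])
                      - sp.diff(g[i, j], x[l])) / 2
        for l in range(n)))

def cov_deriv_metric(k, i, j):
    # D_k g_{ij}: partial derivative minus the two connection terms
    return sp.simplify(
        sp.diff(g[i, j], x[k])
        - sum(christoffel(l, k, i) * g[l, j] for l in range(n))
        - sum(christoffel(l, k, j) * g[i, l] for l in range(n)))

assert all(cov_deriv_metric(k, i, j) == 0
           for k in range(n) for i in range(n) for j in range(n))
print("Levi-Civita parallelism D_k g_ij = 0 holds for the polar metric")
```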

The relation between the metric tensor and the probability density represents a set of covariant partial differential equations of second order in terms of the metric tensor. This equation is non-linear and self-consistent and, as expected, is difficult to solve in most practical situations. However, this ansatz allows one to introduce a measure for the relative entropy considering only the probability density of interest.
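To illustrate the self-consistent character of these equations, here is a minimal one-dimensional sketch (my own toy example and notation, not taken from the paper): for a Gaussian density, a constant metric solves the equation exactly, since the Christoffel symbol of a constant metric vanishes:

```python
# One-dimensional sanity check (sympy): for a Gaussian density rho the
# constant metric g = 1/sigma^2 solves g = -(covariant Hessian of log rho)
# exactly, because the Christoffel symbol of a constant metric is zero.
import sympy as sp

x, sigma = sp.symbols('x sigma', positive=True)
rho = sp.exp(-x**2 / (2 * sigma**2)) / sp.sqrt(2 * sp.pi * sigma**2)

g = 1 / sigma**2                       # candidate metric (constant in x)
Gamma = g.diff(x) / (2 * g)            # 1D Christoffel symbol; zero here
cov_hessian = sp.diff(sp.log(rho), x, 2) - Gamma * sp.diff(sp.log(rho), x)

# The self-consistent equation g = -D d(log rho) holds exactly:
assert sp.simplify(-cov_hessian - g) == 0

# The density then takes the Gaussian form exp(-l^2/2) sqrt(g/2pi) dx,
# with geodesic distance l = sqrt(g) * x from the peak at xbar = 0:
l = sp.sqrt(g) * x
assert sp.simplify(rho - sp.exp(-l**2 / 2) * sp.sqrt(g / (2 * sp.pi))) == 0
print("constant metric solves the 1D self-consistent equation")
```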

Surprisingly, the value of this relative entropy can be expressed exactly as follows:

$\mathcal{S} = \frac{n}{2} + \log \mathcal{Z},$

where $n$ is the dimension of the manifold and $\mathcal{Z}$ a certain normalization constant. Precisely, a direct consequence of this set of covariant differential equations is the possibility of rewriting the original distribution function

$dp(x) = \rho(x)\,dx$

as follows:

$dp(x) = \frac{1}{\mathcal{Z}}\exp\left[-\frac{1}{2}\ell^{2}(x,\bar{x})\right]d\mu(x).$

Here, $d\mu(x)$ is the invariant measure:

$d\mu(x) = \sqrt{\left|g_{ij}(x)/2\pi\right|}\,dx\,;$

$\bar{x}$ denotes the most likely point of the distribution, while $\ell(x,\bar{x})$ denotes the arc length of the geodesic that connects the points $x$ and $\bar{x}$. Formally, this is a Gaussian distribution defined on a Riemannian manifold $\mathcal{M}$. The normalization constant $\mathcal{Z}$ reduces to unity if the manifold is diffeomorphic to the $n$-dimensional Euclidean real space $\mathbb{R}^{n}$, and it takes different values when the manifold exhibits a curved geometry. Consequently, this type of relative entropy is a global measure of the curvature of the manifold where the random variables are defined. To my present understanding, curvature accounts for the existence of irreducible statistical correlations, something analogous to the irreducible character of gravity in different reference frames due to its connection with curvature.
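The behaviour of the normalization constant can be illustrated with a small numerical toy computation (my own sketch, assuming a Gaussian form $\exp(-\ell^{2}/2)$ in the geodesic distance $\ell$ from the peak, with measure $\sqrt{|g/2\pi|}\,dx$): on the flat plane the constant equals unity, while on the unit sphere it comes out smaller than one:

```python
# Toy numerical illustration: the normalization constant
# Z = integral of exp(-l^2/2) dmu equals 1 on the Euclidean plane
# but is below 1 on the (positively curved) unit sphere.
import math

def integrate(f, a, b, n=200000):
    # simple midpoint rule
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Flat plane in polar coordinates: l = r, dmu = (r / 2pi) dr dphi.
Z_flat = 2 * math.pi * integrate(
    lambda r: math.exp(-r * r / 2) * r / (2 * math.pi), 0.0, 40.0)

# Unit sphere, peak at the north pole: l = theta,
# dmu = (sin(theta) / 2pi) dtheta dphi.
Z_sphere = 2 * math.pi * integrate(
    lambda t: math.exp(-t * t / 2) * math.sin(t) / (2 * math.pi),
    0.0, math.pi)

print(Z_flat)    # ~ 1.0 in the flat case
print(Z_sphere)  # below 1: the curved geometry shifts the constant
assert abs(Z_flat - 1.0) < 1e-6
assert Z_sphere < 1.0
```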

Thanks!

]]>Hi! How are you constructing your Levi-Civita connection? I know how to construct a Levi-Civita connection starting from a metric, but you’ve defined your metric starting from a Levi-Civita connection. The fact that you speak of ‘the’ Levi-Civita connection makes me a little nervous, since there are many.

]]>The relative entropy was proposed early on by Jaynes to extend the notion of information entropy to the framework of continuous distributions. However, I think that the relative entropy is not a fully satisfactory generalization of this concept. A natural question here is how to introduce the measure

$d\mu(x)$

when no other information is available, except the continuous distribution

$dp(x) = \rho(x)\,dx.$

Recently, I proposed a way to overcome this difficulty in the framework of the Riemannian geometry of fluctuation theory:

http://iopscience.iop.org/1751-8121/45/17/175002/article

This geometric approach introduces a distance notion

$ds^{2} = g_{ij}(x)\,dx^{i}dx^{j}$

between two infinitely close points $x$ and $x + dx$, where the metric tensor is obtained from the probability density as follows:

$g_{ij}(x) = -\partial_{i}\partial_{j}\log\rho(x) + \Gamma^{k}_{ij}(x)\,\partial_{k}\log\rho(x).$

Here, the symbol $\Gamma^{k}_{ij}$ denotes the Levi-Civita affine connections. Formally, this is a set of covariant partial differential equations of second order in terms of the metric tensor. The measure can be defined as follows:

$d\mu(x) = \sqrt{\left|g_{ij}(x)/2\pi\right|}\,dx.$

Apparently, the relative entropy that follows from this ansatz is a global measure of the curvature of the manifold where the continuous variables are defined. Some preliminary analyses suggest that curvature is closely related to the existence of irreducible statistical correlations among the random variables $x$, that is, statistical correlations that survive any coordinate transformation $x \to y(x)$.

]]>David wrote:

Still it seems odd where you write

The information gain as we go from $p$ to $q$ is …

I think I’d prefer it the other way around.

Oh, definitely! That was just a typo — I’ve fixed it now, thanks. I should mind my $p$’s and $q$’s.

There’s an argument that relative entropy is the square of a distance, given a few comments below.

Only after you symmetrize it, of course. The relative entropy is not symmetric:

$S(p\,\|\,q) \neq S(q\,\|\,p),$

so it can’t give a metric (of the traditional symmetric sort) when you take its square root. But Suresh Venkat is claiming that the symmetrized gadget, which he calls the ‘Jensen–Shannon distance’:

$JS(p,q) = \frac{1}{2}\, S\!\left(p \,\Big\|\, \frac{p+q}{2}\right) + \frac{1}{2}\, S\!\left(q \,\Big\|\, \frac{p+q}{2}\right)$

*does* give a metric when you take its square root. I’ll have to check this, or read about it somewhere if I give up. You just need to check the triangle inequality… the rest is obvious.
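A quick numerical experiment (my own sketch; base-2 logarithms, random distributions on five points) at least fails to find a counterexample to the triangle inequality for the square root:

```python
# Check that sqrt(Jensen-Shannon divergence) satisfies the triangle
# inequality on many random triples of probability distributions.
import math, random

def kl(p, q):
    # relative entropy S(p||q), base-2 logarithms
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    # Jensen-Shannon divergence: average relative entropy to the midpoint
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def rand_dist(n, rng):
    w = [rng.random() for _ in range(n)]
    s = sum(w)
    return [wi / s for wi in w]

rng = random.Random(0)
for _ in range(2000):
    p, q, r = (rand_dist(5, rng) for _ in range(3))
    d = lambda a, b: math.sqrt(jsd(a, b))
    assert d(p, r) <= d(p, q) + d(q, r) + 1e-12
print("no triangle-inequality violations found")
```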

But as your n-Café comment notes, there’s also another nice metric floating around. If we define a version of relative entropy based on Rényi entropy instead of the usual Shannon kind, we get something called the ‘Rényi divergence’, which is symmetric *only for the special case $\alpha = 1/2$*… and while it still doesn’t obey the triangle inequality in that case, it’s a function of something that does. Not-so-coincidentally, I was just reading about this today. This thesis has a nice chapter on Rényi entropy and the corresponding version of relative entropy:

• T. A. L. van Erven, *When Data Compression and Statistics Disagree: Two Frequentist Challenges for the Minimum Description Length Principle*, Chap. 6: Rényi Divergence, Ph.D. thesis, Leiden University, 2010.

Yeah, it’s on page 179:

Only for $\alpha = 1/2$ is Rényi divergence symmetric in its arguments. Although not itself a metric, it is a function of the square of the Hellinger distance

[Gibbs and Su, 2002].
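Here’s a small numerical check of both statements (my own sketch; natural logarithms, with the convention $\mathrm{Hel}^{2}(p,q) = 1 - \sum_i \sqrt{p_i q_i}$, which may differ from the thesis by constant factors):

```python
# The Rényi divergence D_a(p||q) = log(sum p^a q^(1-a)) / (a - 1) is
# symmetric at a = 1/2, where it equals -2 log(1 - Hel^2) with the
# squared Hellinger distance Hel^2 = 1 - sum sqrt(p q).
import math, random

def renyi(p, q, alpha):
    return math.log(sum(pi**alpha * qi**(1 - alpha)
                        for pi, qi in zip(p, q))) / (alpha - 1)

def hellinger_sq(p, q):
    return 1 - sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

rng = random.Random(1)
weights = lambda: [rng.random() + 0.01 for _ in range(4)]
norm = lambda v: [x / sum(v) for x in v]

for _ in range(1000):
    p, q = norm(weights()), norm(weights())
    # symmetric in its arguments at alpha = 1/2 ...
    assert abs(renyi(p, q, 0.5) - renyi(q, p, 0.5)) < 1e-12
    # ... and a function of the squared Hellinger distance there
    assert abs(renyi(p, q, 0.5)
               + 2 * math.log(1 - hellinger_sq(p, q))) < 1e-12
print("alpha = 1/2 Rényi divergence is symmetric and a function of Hel^2")
```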

Anyway, my goal in this post was pretty limited. I just wanted to explain the concept of relative entropy a little bit, so people can follow what I’m saying when I explain how it’s related to the Fisher information metric.

]]>