Cool! Intransitive lizards!

]]>I think there can only be a smallest nonzero amount of information if there’s a smallest nonzero probability: if you believe you have a fair n-sided die and I tell you the first side has not landed up, you’ve received

$$\log_2 \frac{n}{n-1}$$

bits of information, and this can be arbitrarily small if n can be arbitrarily large.
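To get a feel for how quickly this shrinks, here is a minimal Python sketch. It assumes the information received is minus the base-2 logarithm of the probability of what you were told, which here is (n-1)/n. The function name is made up for illustration.

```python
import math

def bits_from_ruling_out_one_side(n: int) -> float:
    """Bits gained on learning that one given side of a fair n-sided die did not come up."""
    if n < 2:
        raise ValueError("need at least two sides")
    # The event has probability (n-1)/n = 1 - 1/n, so the information is
    # -log2(1 - 1/n). log1p keeps this accurate when 1/n is tiny.
    return -math.log1p(-1.0 / n) / math.log(2)

for n in (2, 6, 100, 10**6, 10**12):
    print(n, bits_from_ruling_out_one_side(n))
```

For n = 2 this is one full bit, and for large n it behaves like 1/(n ln 2), so it gets as close to zero as you like without ever reaching it.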

I doubt we can build a fair die with a googolplex sides, but I also don’t feel I’ll learn much about information theory by pondering this issue—at least, I won’t learn much very soon.

]]>I miss talking to you too! I’ve got lots of stuff I’d love to talk about, including that book.

I agree that a “bit” is not really the smallest amount of information in any fundamental sense. It’s more like the smallest kind of question you can ask to get some information — the answer to a yes/no question. How much information is contained in the *answer* to a yes/no question depends a lot on the question.

However, your talk got me wondering whether, in some finite universe, there really can be a “smallest possible” question that could be asked, and thus a basic unit of information in that universe. I’m so far just idly wondering, and haven’t tried working it out; maybe it’s obviously nonsense…

]]>Hi! Glad you liked my talk! Great to hear from you! I really miss talking to you. Someday we should finish that book on classical mechanics.

Despite what computer scientists seem to think, there’s no reason to think information comes in integer multiples of bits. For example, if we transmit data in base 3, it’s easy to transmit a **trit** of information, which is $\log_2 3 \approx 1.585$ bits.

You’ll note I cleverly didn’t choose a base for my logarithms near the start of my talk; this is why. In this setup, log 2 of information is one bit of information regardless of the base of the logarithm. The only reason I did calculations where the answers came out to be integer multiples of a bit is to make it easy for people to follow the calculation. Perhaps this was misleading!

Later, when talking about physics, I switched to using logarithms base e. Then information gets measured in **nats**, which I wish were called ‘nits’.
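The bookkeeping between these units is just a change of logarithm base. Here is a small Python sketch of it; the table of units and the function name are my own choices for illustration, not anything from the talk.

```python
import math

# Each unit is the information in one choice among `base` equally likely options.
UNIT_BASE = {"bit": 2.0, "trit": 3.0, "nat": math.e}

def convert(amount: float, from_unit: str, to_unit: str) -> float:
    """Convert an amount of information from one unit to another."""
    # amount * ln(from_base) is the information in nats;
    # dividing by ln(to_base) re-expresses it in the target unit.
    return amount * math.log(UNIT_BASE[from_unit]) / math.log(UNIT_BASE[to_unit])

print(convert(1, "trit", "bit"))  # log2(3), about 1.585
print(convert(1, "bit", "nat"))   # ln(2), about 0.693
```

Nothing here forces the result to be a whole number in any unit, which is the point: the choice of unit is only a choice of scale.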

By the way, base 3 seems to have certain information-theoretic advantages over all other integer bases, coming from the fact that the closest integer to e is 3.
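One standard way to make this precise is "radix economy", and I am assuming that is the advantage meant here. If storing a digit in base b costs something proportional to b, then storing a number N costs roughly b · ln(N)/ln(b), so the base-dependent factor is b/ln(b). Over the real numbers this is smallest at b = e. A quick Python check over integer bases:

```python
import math

def radix_economy(base: int) -> float:
    """Base-dependent cost factor b / ln(b), with the common ln(N) factor dropped."""
    return base / math.log(base)

costs = {b: radix_economy(b) for b in range(2, 11)}
best = min(costs, key=costs.get)

for b, c in costs.items():
    print(b, round(c, 4))
print("cheapest integer base:", best)
```

Under this cost model base 3 wins, while bases 2 and 4 tie just behind it. Other cost models could rank the bases differently.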

]]>Then I happened to also be reading this post on Tim Gowers’s blog today, and realized that the lizards are more like **intransitive dice**!

This is like a probabilistic version of rock-paper-scissors in which, for example, rocks *usually* smash scissors, but sometimes the scissors manage to cut up a rock.
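For anyone who hasn’t met intransitive dice, here is a short Python sketch using one well-known set of three dice. The particular face values are a textbook example I am supplying, not something taken from the talk or from Gowers’s post.

```python
from fractions import Fraction
from itertools import product

# A classic intransitive triple (an illustrative choice of faces).
DICE = {
    "A": (2, 2, 4, 4, 9, 9),
    "B": (1, 1, 6, 6, 8, 8),
    "C": (3, 3, 5, 5, 7, 7),
}

def win_probability(x, y) -> Fraction:
    """Exact probability that a roll of die x is strictly higher than a roll of die y."""
    wins = sum(1 for a, b in product(x, y) if a > b)
    return Fraction(wins, len(x) * len(y))

for first, second in (("A", "B"), ("B", "C"), ("C", "A")):
    print(first, "beats", second, "with probability",
          win_probability(DICE[first], DICE[second]))
```

Each die beats the next one in the cycle more often than not, but not always, which is the “usually smashes, sometimes gets cut up” behaviour.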

One really basic question about the relative information $I(q,p) = \sum_i q_i \log(q_i/p_i)$:

You picked your examples so that this came out as some multiple of log(2), which you interpreted as a “bit” of gained information. But, what if I(q,p) isn’t just an (integer) number of bits of this size?

For example, suppose you roll a 6-sided die, and then tell me you didn’t roll a “6”. I’ll update my prior from the uniform distribution to a distribution that’s uniform on the outcomes “1” through “5”, and zero for the outcome “6”. This seems to give me

$$I(q,p) = \log \frac{6}{5}$$

but how am I to interpret this?

I suppose the “bits” just have a different size in this situation? Maybe that’s the point of information being “relative”? Is there a systematic way of figuring out what constitutes one bit of relative information in a given situation? Or, is it rather that this kind of information doesn’t actually come in discrete chunks?
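The die example can be checked numerically. This Python sketch assumes I(q,p) is the usual relative entropy, the sum over outcomes of q_i log(q_i/p_i), with terms where q_i = 0 counted as zero. The function name and the `base` parameter are illustrative.

```python
import math

def relative_information(q, p, base=2.0):
    """Information of q relative to p: sum of q_i * log(q_i / p_i), in units set by `base`."""
    if len(q) != len(p):
        raise ValueError("q and p must have the same number of outcomes")
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0:
            continue  # 0 * log(0 / p_i) is taken to be 0
        if pi == 0:
            raise ValueError("q must be zero wherever p is zero")
        total += qi * math.log(qi / pi)
    return total / math.log(base)

prior = [1 / 6] * 6              # uniform over six faces
posterior = [1 / 5] * 5 + [0.0]  # after learning "not a 6"

print(relative_information(posterior, prior))          # in bits
print(relative_information(posterior, prior, math.e))  # in nats
```

The answer is log(6/5), roughly a quarter of a bit, so this quantity varies continuously with the distributions rather than coming in whole-bit chunks.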

]]>Shouldn’t it rather be:

Yes. Everyone knows this! I can’t believe I made such a typo and nobody pointed it out during my talk! I’ll check, and fix it on my slides if needed.

I’ll reply to your less distressing comments later.

]]>You had:

but then the unit in the expression would be the exponential of entropy squared. Shouldn’t it rather be:

to make it unit free?

Isn’t it pretty much always the case that the mathematically simplest situations are those that are the most symmetric and finely tuned?

So this all works out to a lot of evolutionary pressure, meaning quick fixes have to be found. Those fixes may not be globally optimal, and they may have problems, but they could be sufficient to give you an edge. Only if the situation “cools down” do you actually have the time to think about all this in more detail, so you can fix and improve things much more easily. However, if things are *too* cool, with almost no evolutionary pressure, you will likely also lose interest and nothing evolves at all, right?

I think that should be relevant to innovation theory: is there some sort of “optimal temperature” at which innovation is ideally balanced between finding new stuff and optimizing old stuff? Or, if not that, is there an “optimal temperature distribution” (meaning it’s allowed to fluctuate, but in a specific way)? And if there is, could we approximate it for real life, given a lack of information? (For instance, we don’t and can’t know what there is left to be invented, and it’s hard to even properly quantify innovation in retrospect, let alone in the moment.)

Really nice talk!

]]>