First, the concept of (Shannon) entropy applies only to random sources, not to given "files" (i.e., deterministic data). Broadly speaking, we often treat such a file as a realization of some ideal random source. In this case, we might consider that the first file was generated by a white-noise source, and the second by a source that outputs zeroes with probability (almost?) one.
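To make that concrete (this notation is an illustration, not part of the original argument): for a memoryless source $X$ over bytes, the Shannon entropy is

$$H(X) = -\sum_x p(x)\log_2 p(x).$$

A white-noise source with $p(x)=1/256$ for every byte value gives $H(X) = -256 \cdot \tfrac{1}{256}\log_2 \tfrac{1}{256} = 8$ bits per byte, while a source that always emits the same byte gives $H(X)=0$.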
Assuming the above, then, yes, the first one has the highest entropy rate (8 bits per byte), while the second has the lowest (0 bits per byte).
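If you want to see this numerically, here is a minimal sketch (not from the original answer) that estimates the zeroth-order empirical entropy of a file's byte frequencies, in bits per byte. Note that this measures the byte histogram of a given file, not the true entropy of the underlying source; still, a "white noise" file comes out near 8 and an all-zero file comes out as 0. The file names in the usage comments are hypothetical.

    import math
    from collections import Counter

    def byte_entropy(path):
        """Empirical entropy of the file's byte distribution, in bits/byte."""
        data = open(path, "rb").read()
        n = len(data)
        if n == 0:
            return 0.0
        counts = Counter(data)  # frequency of each byte value 0..255
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    # print(byte_entropy("random.bin"))  # ~8.0 for bytes from a uniform source
    # print(byte_entropy("zeros.bin"))   # 0.0 for a file of identical bytes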
"less entropy, which means more information. Is that correct?"
No, of course the second one has less information (actually, no information at all).
Low entropy (in the extreme, a deterministic source) has low information content.
If you have trouble digesting the idea that "pure noise" has maximum information content, you might read this question.