10

In A Mathematical Theory of Communication, Shannon states:

If $N(t)$ represents the number of sequences of duration $t$ we have $$N(t) = N(t-t_1) + N(t-t_2) + \dots + N(t-t_n).$$ The total number is equal to the sum of the numbers of sequences ending in $S_1,S_2,\dots,S_n$ and these are $N(t-t_1),N(t-t_2),\dots,N(t-t_n)$, respectively. According to a well-known result in finite differences, $N(t)$ is then asymptotic for large $t$ to $AX_0^t$ where $A$ is constant and $X_0$ is the largest real solution of the characteristic equation: $$X^{-t_1}+X^{-t_2}+\cdots+X^{-t_n}=1$$

Where can I find a reference for this well-known result?

Suzu Hirose
  • 11,949

2 Answers

10

To begin with, assume the numbers $t_i$ are integers, with $t_i<t_{i+1}$, and assume $t$ is also an integer. Then we have a difference equation for $N(t)$ that is linear and homogeneous with constant coefficients:

$$N(t) - N(t-t_1) - N(t-t_2) - \cdots - N(t-t_n)=0 \tag 1$$

The theory of this is quite elementary. One postulates a solution in the form

$$ N(t)= a^t \tag 2$$ for some (in general complex) number $a \ne 0$. Substituting into $(1)$ and dividing by $a^t$, then multiplying through by $a^{t_n}$, one gets

$$ 1 - a^{-t_1} - a^{-t_2} - \cdots - a^{-t_n} = 0 \iff a^{t_n} - a^{t_n-t_1} - a^{t_n-t_2} - \cdots -1 = 0 \tag{3}$$

This is a polynomial in $a$ of degree $t_n$ (the "characteristic polynomial"), so in general it has $t_n$ roots; call them $a_i$. If the roots are distinct, the general solution is

$$N(t) = \sum c_i a_i^t$$
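As a small worked case (chosen here for illustration, not taken from Shannon's paper), take $n=2$ with $t_1=1$ and $t_2=2$. Then $(1)$ is the Fibonacci recurrence and $(3)$ becomes a quadratic:

$$N(t) = N(t-1) + N(t-2), \qquad a^2 - a - 1 = 0, \qquad a_{1,2} = \frac{1 \pm \sqrt 5}{2}$$

so that

$$N(t) = c_1 \left(\frac{1+\sqrt 5}{2}\right)^t + c_2 \left(\frac{1-\sqrt 5}{2}\right)^t$$

The constants $c_1, c_2$ are fixed by two starting values, for example $N(0)=N(1)=1$ (an assumed convention that counts the empty sequence once). Since $|a_2| < 1 < a_1$, the second term dies out and $N(t) \approx c_1 a_1^t$ with $a_1 \approx 1.618$.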

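If only real solutions make sense, we stick to the real roots. For large $t$ the dominant term is the one whose root has the largest modulus. Because all the coefficients moved to the right-hand side of $(1)$ are non-negative, no root exceeds the positive real root in modulus, so the dominant term corresponds to the largest real root of the polynomial.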

What if the numbers $t_i$ are not integers? Well, if they are rational (or at least commensurable, i.e., they can be expressed as rational multiples of the smallest term), the derivation basically stands. If they are not commensurable, things are more complex (that's where the answer at MO applies), but we are dealing with engineering here, not pure math, hence this scenario is rather unimportant.
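The claim can be checked numerically. The following is a minimal Python sketch, not a reference implementation: `growth_rate` and `sequence_counts` are hypothetical helper names, the durations are arbitrary illustrative values, and $N(0)=1$ (the empty sequence) is an assumed starting convention. It finds $X_0$ by bisection on $\sum_i X^{-t_i} = 1$, which also works for non-integer $t_i$, and compares it with the ratio $N(t)/N(t-1)$ obtained from the recurrence.

```python
def growth_rate(durations, tol=1e-12):
    """Largest real root X0 of sum(X ** -d for d in durations) == 1.

    Assumes every duration is positive. The left-hand side is strictly
    decreasing for X > 0, so bisection finds the single positive root.
    """
    def excess(x):
        return sum(x ** -d for d in durations) - 1.0

    lo, hi = 1.0, 2.0          # excess(1) = n - 1 >= 0 for n >= 1 symbols
    while excess(hi) > 0:      # widen the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def sequence_counts(durations, t_max):
    """N(0), ..., N(t_max) for integer durations, with N(0) = 1 assumed."""
    counts = [1] + [0] * t_max
    for t in range(1, t_max + 1):
        counts[t] = sum(counts[t - d] for d in durations if d <= t)
    return counts


if __name__ == "__main__":
    durations = (2, 4, 5, 7, 8, 10)   # illustrative values only
    counts = sequence_counts(durations, 200)
    print("X0 from the characteristic equation:", growth_rate(durations))
    print("N(200) / N(199) from the recurrence:", counts[200] / counts[199])
    print("X0 for non-integer durations (1.5, 2.5):", growth_rate((1.5, 2.5)))
```

Bisection is used here only because it needs nothing beyond the standard library; a polynomial root finder would serve equally well when the $t_i$ are integers.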

leonbloy
  • 66,202
  • Muchas gracias! Any clue on how one could have arrived at the ansatz $N(t) = a^t$ a priori? – Fernando Martin Aug 10 '22 at 22:44
  • This is very standard in differential equations and, by extension, in difference equations. Basically, they are the eigenfunctions: https://math.stackexchange.com/questions/2525261/can-exponential-functions-be-thought-of-as-eigenfunctions-for-the-derivative-ope – leonbloy Aug 10 '22 at 23:44
0

There is an answer on the MathOverflow site. I could not make head or tail of any of those replies.

There is a reply on a Usenet post from 2007 here.

Someone on this forum refers to a section of the Wikipedia article on Recurrence relations which has since been removed.

This reddit discussion again points to the Wikipedia article on recurrence relations. I'll copy and paste it here:

Of course, the Wikipedia page is not the best introduction to the topic; you might be better off looking for a PDF of generatingfunctionology, which was an introduction to the topic used for an MIT three-week house course.

  1. The numbers 2, 4, 5, 7, 8, and 10 are arbitrary, just for the example. To write out a little longer what Shannon was talking about:

Imagine that you have a telegraph line, or channel. You might use Morse code to send messages. In Morse code, not all symbols (letters) have the same length or take the same time to send: an 'e' is a single dot, and a '0' is five dashes. For the sake of the example, assume your entire alphabet has six symbols, and the first symbol takes 2 seconds to tap out, the second takes 4 seconds, the third takes 5 seconds, and so on. It may be pretty hard to calculate the efficiency of the channel directly, but let's pretend there is a solution: a function $N(t)$, which we don't know, giving the number of sequences of length $t$. If enough time has passed, or $t$ is large enough, we know that whatever $N(t)$ is, it will fit this equation:

$$N(t) = N(t-2) + N(t-4) + N(t-5) + N(t-7) + N(t-8) + N(t-10)$$

and the reasoning is this: at time t, the most recent symbol could have been any of the symbols of length 2, 4, 5, 7, 8, or 10. We can sort all of the sequences $N(t)$ by whether the most recent symbol took 2s to transmit, or 4s, or 5s, and so on. If the most recent symbol was the 2s symbol, all of those sequences can be represented by all of the $N(t-2)$ sequences followed by that 2s symbol, and similarly, all of the sequences where the 4s symbol was transmitted recently consist of all of the $N(t-4)$ sequences followed by that 4s symbol.
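The counting argument above can be checked by brute force for small $t$. This is a rough Python sketch under stated assumptions: the function names are made up for illustration, $t$ is a whole number of seconds, and $N(0)=1$ (the empty sequence) is taken as the starting value. Because each of the six symbols has a different duration, a sequence of durations identifies a sequence of symbols.

```python
from itertools import product

DURATIONS = (2, 4, 5, 7, 8, 10)  # symbol durations in seconds, from the example


def count_by_enumeration(t, durations=DURATIONS):
    """Count sequences of total duration exactly t by listing every candidate."""
    max_symbols = t // min(durations)   # no sequence can hold more symbols
    total = 0
    for length in range(max_symbols + 1):
        for seq in product(durations, repeat=length):
            if sum(seq) == t:
                total += 1
    return total


def count_by_recurrence(t, durations=DURATIONS):
    """Same count via N(t) = sum of N(t - d) over durations d, with N(0) = 1."""
    counts = [1] + [0] * t
    for s in range(1, t + 1):
        # Sort sequences of duration s by their most recent symbol.
        counts[s] = sum(counts[s - d] for d in durations if d <= s)
    return counts[t]


if __name__ == "__main__":
    for t in range(13):
        print(t, count_by_enumeration(t), count_by_recurrence(t))
```

The enumeration grows exponentially with $t$, so it is only useful as a sanity check on the recurrence for small values.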

It would be quite a bit of work to explain how to get the polynomial from the recurrence relation (or really, why it all works), but the theory of recurrence relations will explain how to solve for the sequence $N(t)$.

The reason $t$ has to tend toward $\infty$ is that in 8 seconds (small $t$) you can send exactly one symbol of length 7 or one of length 8 (and no other symbol), so they seemingly cost the same; but over longer periods of time, you will be able to send closer and closer to 8/7 times as many 7's as 8's.
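A tiny Python illustration of that last point, assuming the channel repeats a single symbol back to back (`symbols_sent` is a hypothetical helper, not part of the quoted discussion):

```python
def symbols_sent(t, duration):
    """How many whole symbols of the given duration fit into t seconds."""
    return t // duration


if __name__ == "__main__":
    for t in (8, 80, 800, 8000, 80000):
        sevens = symbols_sent(t, 7)
        eights = symbols_sent(t, 8)
        # The ratio starts at 1 for t = 8 and approaches 8/7 as t grows.
        print(t, sevens, eights, sevens / eights)
```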

Suzu Hirose
  • 11,949