
Consider a random $n \times n$ matrix $A$ whose entries are chosen independently and uniformly from $\{0,1\}$, and a random $n$-dimensional vector $x$ whose entries are also chosen independently and uniformly from $\{0,1\}$. Assume $n$ is large.

What is the (base 2) Shannon entropy of $Ax$? That is, can we give a large $n$ approximation for $H(Ax)$?

It feels like $H(Ax)$ should be at least about $n$, as that is the entropy of $x$ and $A$ is very likely to be non-singular. We also know $H(Ax) \leq n \log_2(n+1) \approx n \log_2{n}$, since each entry of $Ax$ lies in $\{0, 1, \dots, n\}$ and can therefore be encoded in $\log_2(n+1)$ bits.

Is the entropy of order $n$, of order $n\log_2{n}$, or something in between?
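For very small $n$, $H(Ax)$ can be computed exactly by brute force, which gives a feel for the numbers. Below is a minimal Python sketch along those lines (the helper name `exact_entropy_of_Ax` is just illustrative); it enumerates all $2^{n^2+n}$ pairs $(A, x)$, so it is only feasible for tiny $n$.

```python
from collections import Counter
from itertools import product
from math import log2

def exact_entropy_of_Ax(n: int) -> float:
    """Exact Shannon entropy (in bits) of y = A x, where A is an n x n 0/1
    matrix and x a 0/1 vector, all entries independent and uniform."""
    counts = Counter()
    for a_bits in product((0, 1), repeat=n * n):          # every 0/1 matrix A
        A = [a_bits[i * n:(i + 1) * n] for i in range(n)]
        for x in product((0, 1), repeat=n):               # every 0/1 vector x
            y = tuple(sum(A[i][j] * x[j] for j in range(n)) for i in range(n))
            counts[y] += 1                                 # tally the outcome y = A x
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

if __name__ == "__main__":
    for n in (1, 2, 3):                                    # n = 4 already takes a while
        print(n, exact_entropy_of_Ax(n))
```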

1 Answer


Let $A$ have size $m\times n$ (slightly more general), let $y=Ax$ with $y=(y_1, y_2, \cdots, y_m)$, and let $s=\sum_{i=1}^n x_i$ be the number of ones in $x$.

Then $$H(y)= H(y \mid s) + H(s) - H(s \mid y) \tag{1}$$ and we can bound:

$$ H(y \mid s) \le H(y) \le H(y \mid s) + H(s) \tag{2} $$ which follows from $(1)$ because $0 \le H(s \mid y) \le H(s)$. To compute $H(y \mid s)$, note that while $y_1, y_2, \cdots, y_m$ are not independent, they are independent conditioned on $s$: given $x$, the $y_i$ are independent (one per row of $A$), and their conditional law depends on $x$ only through $s$. Hence

$$H(y \mid s) = m \, H(y_1 \mid s)$$

Further, $y_1 \mid s \sim B(s,1/2)$ (Binomial), since $y_1$ is a sum of the $s$ entries of the first row of $A$ selected by the ones of $x$, each an independent Bernoulli$(1/2)$; and $s$ itself is Binomial, $s \sim B(n,1/2)$. Hence

$$H(y \mid s) = m \sum_{s=0}^n \frac{1}{2^n}{n \choose s} h_B(s) \tag{3}$$

$$H(s) = h_B(n) \tag{4}$$

where $$h_B(t)= - \frac{1}{2^t} \sum_{k=0}^t {t\choose k} \log\left(\frac{1}{2^t}{t \choose k}\right) = t - \frac{1}{2^t} \sum_{k=0}^t {t \choose k} \log\left({t \choose k}\right) \tag{5}$$ is the entropy of a Binomial of size $t$ and $p=1/2$.

(all logs are in base $2$ here).
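As a numerical check (a small sketch of the formulas above, not part of the derivation itself), the following Python code evaluates $(3)$–$(5)$ exactly and prints the bounds $(2)$ on $H(y)$ for the square case $m = n$:

```python
from math import comb, log2

def h_B(t: int) -> float:
    """Entropy in bits of a Binomial(t, 1/2) variable, expression (5)."""
    return t - sum(comb(t, k) * log2(comb(t, k)) for k in range(t + 1)) / 2 ** t

def H_y_given_s(n: int, m: int) -> float:
    """Expression (3): m * E[h_B(s)] with s ~ Binomial(n, 1/2)."""
    return m * sum(comb(n, s) * h_B(s) for s in range(n + 1)) / 2 ** n

def bounds_on_H_y(n: int, m: int) -> tuple[float, float]:
    """Lower/upper bounds on H(y) from (2), using H(s) = h_B(n) from (4)."""
    lower = H_y_given_s(n, m)
    return lower, lower + h_B(n)

if __name__ == "__main__":
    for n in (4, 16, 64):
        lo, hi = bounds_on_H_y(n, n)
        print(f"n = {n:3d}:  {lo:9.3f} <= H(y) <= {hi:9.3f}")
```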

Expressions $(3)$ and $(4)$, together with $(2)$, provide exact bounds. We can obtain an approximation by replacing the sum in $(3)$ with its central term $s = n/2$ and using the asymptotic $h_B(t) \approx \frac{1}{2} \log(t \, \pi e /2)$. We then get

$$H(y|s) \approx \frac{m}{2} \log(n \pi e /4) \tag{6}$$

$$H(s) \approx \frac{1}{2} \log(n \pi e /2) \tag{7}$$

This strongly suggests that, when $m=n$, $H(y)$ grows as $\frac{n}{2} \log(n)$: the two sides of $(2)$ differ only by $H(s) \approx \frac{1}{2}\log(n \pi e /2)$, which is negligible compared with $(6)$.
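For concreteness, here is a small tabulation (a sketch added for illustration, with $m = n$) of the approximate lower bound $(6)$, the corresponding approximate upper bound $(6)+(7)$, and the leading term $\frac{n}{2}\log_2 n$:

```python
from math import e, pi, log2

def approx_lower(n: int) -> float:
    """Approximation (6) to H(y | s), with m = n."""
    return n / 2 * log2(n * pi * e / 4)

def approx_H_s(n: int) -> float:
    """Approximation (7) to H(s)."""
    return 0.5 * log2(n * pi * e / 2)

if __name__ == "__main__":
    for n in (16, 64, 256, 1024):
        lower = approx_lower(n)                # approximate lower bound in (2)
        upper = lower + approx_H_s(n)          # approximate upper bound in (2)
        print(f"n = {n:5d}:  lower ~ {lower:9.1f}   upper ~ {upper:9.1f}"
              f"   (n/2)*log2(n) = {n / 2 * log2(n):9.1f}")
```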

The graph below shows both bounds and the approximation $(6)$ for the lower bound.

[Plot: the exact bounds from $(2)$–$(4)$ and the approximation $(6)$, as functions of $n$]

leonbloy
  • Thank you for this. Is your best guess currently that it is closer to $n$ or $n\log_2{n}$? –  Dec 17 '15 at 12:58
  • Closer to the second. I'd bet on $\frac{n}{2} \log_2(n \, \pi e /4)$. – leonbloy Dec 17 '15 at 21:54
  • This is a very nice answer! Thank you. Your trick of conditioning on $s$ suddenly makes the problem elegantly tractable. I have asked a related question at http://math.stackexchange.com/questions/1580679/entropy-of-the-sum-of-matrix-vector-products which I hope you might also find interesting. –  Dec 18 '15 at 13:20