
Let $x \in \mathbb R^n$ be a vector of iid random variables drawn from the normal distribution with mean zero and variance $1$, and let $A$ be a symmetric $n \times n$ real matrix.

I want to show that $\operatorname{Var}(x^TAx) = 2\|A\|_F^2$

What I tried:

First, I'll mention that I have already proved that $E[x^TAx] = \operatorname{Tr}(A)$; I'll use that here.

$\operatorname{Var}[x^TAx] = E[(x^TAx)^2] - (E[x^TAx])^2 = E[x^TAxx^TAx] - (\operatorname{Tr}(A))^2$

Since $A$ is symmetric, there are an orthogonal matrix $U$ and a diagonal matrix $D$ such that $A = U^TDU$:

$E[x^TAxx^TAx] - (\operatorname{Tr}(A))^2 = E[x^TU^TDUxx^TU^TDUx] - (\operatorname{Tr}(A))^2 = \\ E[y^TDyy^TDy] - (\operatorname{Tr}(A))^2$

where $y := Ux$, which is also a vector of iid random variables with zero mean and unit variance (since $x$ is standard normal and $U$ is orthogonal).

$\displaystyle E[y^TDyy^TDy] - (\operatorname{Tr}(A))^2 = E[\sum_{i=1}^{n}\lambda_iy_i^2 \cdot\sum_{j=1}^{n}\lambda_j y_j^2] - (\operatorname{Tr}(A))^2 = \\ \displaystyle \sum_{i=1}^{n}\sum_{j=1}^{n}\lambda_i\lambda_jE[y_i^2y_j^2] - (\operatorname{Tr}(A))^2$

Since $y_i$ and $y_j$ are iid, $E[y_i^2 y_j^2] = E[y_i^2]E[y_j^2] = 1$, and so we finally have

$\displaystyle \sum_{i=1}^{n}\sum_{j=1}^{n}\lambda_i\lambda_j - (\operatorname{Tr}(A))^2$

Now maybe I'm crazy, but isn't this zero?
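For what it's worth, a quick Monte Carlo sanity check (a numpy sketch; the test matrix, seed, and sample count are arbitrary choices) suggests the variance is far from zero and does match $2\|A\|_F^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Arbitrary symmetric test matrix (chosen only for illustration)
M = rng.standard_normal((n, n))
A = (M + M.T) / 2

# Draw many iid standard-normal vectors and form the quadratic form x^T A x
X = rng.standard_normal((200_000, n))
q = np.einsum('si,ij,sj->s', X, A, X)

print(np.var(q))                        # empirical Var(x^T A x): clearly nonzero
print(2 * np.linalg.norm(A, 'fro')**2)  # claimed value 2 * ||A||_F^2
```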

Oria Gruber
  • Note that $$\sum_{i=1}^{n}\sum_{j=1}^{n}\lambda_i\lambda_j = \sum_{i=1}^{n}\lambda_i \sum_{j=1}^{n}\lambda_j = \left(\sum_{i=1}^{n}\lambda_i \right)\left(\sum_{j=1}^{n}\lambda_j \right)=\left(\sum_{k=1}^{n}\lambda_k\right)^2.$$ – Minus One-Twelfth Dec 17 '19 at 21:02
  • Which is basically $(\operatorname{Tr}(A))^2$. So the result is zero, which is not what we wanted. – Oria Gruber Dec 17 '19 at 21:03
  • Ah yes, I misread, thought you thought it wasn't $0$. – Minus One-Twelfth Dec 17 '19 at 21:03
  • I think the error is in assuming that $y_i$ and $y_j$ are iid. – Minus One-Twelfth Dec 17 '19 at 21:06
  • Definitely not wrong. If entries of $x$ are iid normal variables with zero mean and 1 variance, and $U$ is orthogonal, then entries of $Ux$ have the same properties. – Oria Gruber Dec 17 '19 at 21:08
  • I agree with @MinusOne-Twelfth. They may be identically distributed but not independent. Let $y = Ux$; then $y_1 = u_{11}x_1 + u_{12} x_2$ and $y_2 = u_{21}x_1 + u_{22} x_2$. Then $\mathbb{E}[y_1] = \mathbb{E}[y_2] = 0$, but $\operatorname{Cov}(y_1, y_2) = \mathbb{E}[y_1y_2] = u_{11} u_{21} + u_{22} u_{12} \neq 0$. – sudeep5221 Dec 17 '19 at 21:12
  • Ah, I noticed I didn't write that $x$ is normally distributed. Orthogonal transformations preserve that distribution, so the question still stands. – Oria Gruber Dec 17 '19 at 21:13
  • It's related to the variance calculation done here http://blog.shakirm.com/2015/09/machine-learning-trick-of-the-day-3-hutchinsons-trick/ – Oria Gruber Dec 17 '19 at 21:27

3 Answers


The error is assuming that $y_i$ and $y_j$ are iid for all $i$ and $j$.

This is true if $i\ne j$, but you should consider separately the case where $i=j$.

  • Not sure I understand... If $i=j$ then $y_i = y_j$ and they are obviously not independent. At any rate, the claim is true: https://math.stackexchange.com/questions/3179763/multiplication-of-normal-distributed-variables-with-orthonormal-matrix – Oria Gruber Dec 17 '19 at 21:36
  • @OriaGruber The point is that if $i=j$ then $E(y_i^2y_j^2)=E(y_i^4)$, which doesn't equal $1$. – grand_chat Dec 17 '19 at 21:40
  • $\newcommand{\E}{\mathbb{E}}$When you have $i=j$ in the double sum, your $y_i$ and $y_j$ are no longer independent, and your assumption that $\E\left[ y_i^2y_j^2\right] = \E\left[ y_i^2\right]\E\left[y_j^2\right]$ is false in this case. In fact, in this case, you can show that $\E\left[ y_i^2y_j^2\right] = \E\left[ y_i^4\right] = \color{blue}3$. – Minus One-Twelfth Dec 17 '19 at 21:40
  • OK, now I understand what you mean and where the mistake is. I didn't know $E(y_i^4)$ is not $1$. Wow, how can I see why it's $3$? – Oria Gruber Dec 17 '19 at 21:43
  • Try the MGF for the standard normal: $m_y(t) = e^{\frac{1}{2} t^2}$. – Gregory Dec 17 '19 at 21:46
  • Or integration by parts: https://math.stackexchange.com/q/1982634/215011 – grand_chat Dec 17 '19 at 21:47
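The $E[y^4]=3$ fact discussed above is easy to see numerically; here is a tiny Monte Carlo sketch (numpy, with an arbitrary seed and sample size):

```python
import numpy as np

rng = np.random.default_rng(42)
y = rng.standard_normal(2_000_000)

# Fourth moment of a standard normal: E[y^4] = 3, whereas E[y^2]^2 = 1
print(np.mean(y**4))     # close to 3
print(np.mean(y**2)**2)  # close to 1
```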

Start from $$\sum_i \sum_j \lambda_i \lambda_j E[y_i^2 y_j^2] - \operatorname{Tr}(A)^2$$ and split into the cases $i=j$ and $i\neq j$. The sum then breaks into two parts: $$\sum_{i\neq j} \lambda_i \lambda_j + 3 \sum_i \lambda_i^2 - \operatorname{Tr}(A)^2,$$ where we have used the identity $E[y_i^4] = 3$. Now, $$\sum_{i\neq j} \lambda_i \lambda_j = \sum_i \sum_j \lambda_i \lambda_j - \sum_i \lambda_i^2 = \left(\sum_i \lambda_i \right)^2 - \sum_i \lambda_i^2. $$ Plugging this in, the $\left(\sum_i \lambda_i\right)^2 = \operatorname{Tr}(A)^2$ terms cancel, leaving $$2 \sum_i \lambda_i^2 = 2 \| A \|_F^2.$$
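The last step uses the fact that $\sum_i \lambda_i^2 = \|A\|_F^2$ for a real symmetric matrix. A quick numerical check (numpy sketch; the symmetric matrix is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2            # arbitrary symmetric matrix for the check

lam = np.linalg.eigvalsh(A)  # real eigenvalues of the symmetric A
print(2 * np.sum(lam**2))               # 2 * sum of squared eigenvalues
print(2 * np.linalg.norm(A, 'fro')**2)  # 2 * ||A||_F^2, equal up to rounding
```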

Gregory

Others have identified the error, so let's do a concise corrected calculation, using the Einstein summation convention (repeated indices are summed). Since $E[x_ix_j]=\delta_{ij}$ and, by Isserlis' theorem, $E[x_ix_jx_kx_l]=\delta_{ij}\delta_{kl}+\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk}$,$$\begin{align}\operatorname{Var}(A_{ij}x_ix_j)&=A_{ij}A_{kl}(\delta_{ij}\delta_{kl}+\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk})-(A_{ij}\delta_{ij})^2\\&=A_{ij}A_{kl}(\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk})\\&=A_{ij}A_{ij}+A_{ij}A_{ji}.\end{align}$$By the symmetry of $A$, this simplifies to $2A_{ij}A_{ij}=2\Vert A\Vert_F^2$.
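The four-index moment identity used above can also be checked entrywise by Monte Carlo; here is a numpy sketch (seed, dimension, and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3
X = rng.standard_normal((500_000, n))

# Empirical fourth moments E[x_i x_j x_k x_l]
emp = np.einsum('si,sj,sk,sl->ijkl', X, X, X, X) / X.shape[0]

# Wick/Isserlis prediction: delta_ij delta_kl + delta_ik delta_jl + delta_il delta_jk
d = np.eye(n)
wick = (np.einsum('ij,kl->ijkl', d, d)
        + np.einsum('ik,jl->ijkl', d, d)
        + np.einsum('il,jk->ijkl', d, d))

print(np.max(np.abs(emp - wick)))  # small Monte Carlo error
```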

J.G.