      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]  1.0 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5
[2,] -0.5  1.0 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5
[3,] -0.5 -0.5  1.0 -0.5 -0.5 -0.5 -0.5 -0.5
[4,] -0.5 -0.5 -0.5  1.0 -0.5 -0.5 -0.5 -0.5
[5,] -0.5 -0.5 -0.5 -0.5  1.0 -0.5 -0.5 -0.5
[6,] -0.5 -0.5 -0.5 -0.5 -0.5  1.0 -0.5 -0.5
[7,] -0.5 -0.5 -0.5 -0.5 -0.5 -0.5  1.0 -0.5
[8,] -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5  1.0

I know this because the matrix has a negative eigenvalue, and variance-covariance matrices must be positive semi-definite.

My thinking was to set all the variances to one, so that the covariances equal the correlations and are thus $-0.5$.

Is there something in the theory that I am missing? I understand that the matrix is not positive semi-definite and how to show this, but I am more curious which assumptions, in terms of probability/statistics, it violates.

I went to generate MVN data with this variance-covariance structure, realized the matrix wasn't positive semi-definite, and became curious about what is inherently wrong with it.
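For concreteness, a minimal R sketch that reproduces the problem (the MASS::mvrnorm call is just one way to attempt the simulation; the exact error message may vary by version):

# Proposed 8x8 covariance matrix: 1 on the diagonal, -0.5 elsewhere
n <- 8
Sigma <- matrix(-0.5, n, n)
diag(Sigma) <- 1

eigen(Sigma, symmetric = TRUE)$values
# [1]  1.5  1.5  1.5  1.5  1.5  1.5  1.5 -2.5   <- one negative eigenvalue

# MASS::mvrnorm(100, mu = rep(0, n), Sigma = Sigma) is therefore
# rejected as not positive definite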

  • I find it a very fascinating question. If you don't get more responses here you might try at stats.stackexchange.com – Vincent Mar 21 '19 at 14:21
  • I have a bit of a resolution to this, I'll try to post it (and possibly an addendum to this question elsewhere like at stats) and be sure to leave a link if so! – OGV Mar 23 '19 at 04:25
  • If $E[\vec{X}] = \vec{0}$, then $\Sigma = E[\vec{X} \vec{X}^\top]$ (expected value of an "outer product"). Hence if $\Sigma \vec{v} = \lambda \vec{v}$, we have $\lambda |\vec{v}|^2 = \vec{v}^\top (\lambda \vec{v}) = \vec{v}^\top \Sigma \vec{v} = E[\vec{v}^\top \vec{X} \vec{X}^\top \vec{v}] = E[|\vec{X}^\top \vec{v}|^2] \geq 0$. That is, a negative eigenvalue yields a projection $\vec{X}^\top \vec{v} = \sum_i v_i X_i$ with negative variance, a contradiction. – Joshua P. Swanson Dec 17 '24 at 01:43
  • A simple intuitive reason is that if you have one variable that is negatively correlated with other random variables, that tends to push those other variables to be more positively correlated with each other, almost by definition. A matrix of that form is still possible for 3 variables, but only just. The answers you got add more detail to this. – QuantumWiz Dec 17 '24 at 07:49

4 Answers


Suppose we have a set of real r.v.s $\{X_k\}_{k=1}^n$, each with mean zero (so that second moments are exactly the variances and covariances), such that $\text{E}[X_k^2]=1$ for all $k$ and $\text{E}[X_jX_k]=-1/2$ for all $j\neq k$. Then the variance of $X:=\sum_{k=1}^n X_k$ would be

\begin{align} \text{E}[X^2] &=\text{E}\left[\left(\sum_{k=1}^n X_k\right)^2\right]\\ &=\sum_{k=1}^n \text{E}[X_k^2]+2\sum_{1\le j<k\le n}\text{E}[X_j X_k]\\ &=n\cdot 1+2\cdot\frac{n(n-1)}{2}\cdot\left(-\frac12\right)=\frac{n}{2}(3-n) \end{align} If $n>3$, then $\text{E}[X^2]<0$, which is impossible. Hence no set of more than three random variables can have a covariance matrix of this form.
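A quick numerical check of this formula in R (a sketch; note that sum(Sigma) is exactly $\mathbf{1}^\top\Sigma\mathbf{1}$, the variance of the sum):

# Compare the sum of all entries of Sigma (= Var of the sum) with n(3-n)/2
check <- function(n) {
  Sigma <- matrix(-0.5, n, n)
  diag(Sigma) <- 1
  c(n = n, sum_of_entries = sum(Sigma), formula = n * (3 - n) / 2)
}
t(sapply(2:8, check))
# The two columns agree: 1, 0, -2, -5, -9, -14, -20 -- negative for n > 3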

Suppose more generally that our random variables have variance-covariance matrix $\Sigma$ with entries $\Sigma_{jk}=\text{E}[X_j X_k]$ for $j,k=1,\dots,n$. Then a linear combination of the form $X_v:=\sum_{k=1}^n v_k X_k$ has variance

$$ \text{E}[X_v^2] =\text{E}\left[\sum_{j,k=1}^n v_j X_j v_k X_k \right] =\sum_{j,k=1}^n v_j \text{E}[X_j X_k] v_k =\sum_{j,k=1}^n v_j \Sigma_{jk} v_k =v^\top \Sigma v $$ Hence the requirement that $\Sigma$ be PSD ($v^\top \Sigma v\geq 0$ for all $v$) amounts to requiring that all such variances are nonnegative; any vector which violates this condition corresponds directly to a linear combination with a nonsensical (negative) variance.
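In R, the offending linear combination for the $8\times 8$ example can be exhibited directly (a sketch; $v=\mathbf{1}$ happens to be an eigenvector for the negative eigenvalue):

n <- 8
Sigma <- matrix(-0.5, n, n); diag(Sigma) <- 1
v <- rep(1, n)
drop(t(v) %*% Sigma %*% v)
# [1] -20   <- the "variance" of X_1 + ... + X_8, which is nonsense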

Semiclassical

There have been two good answers on the impossibility of the covariance matrix. On the other hand, OP asked whether there is something "inherently wrong" with the matrix, which suggests that OP is looking for intuition. We can try an extreme example.

Let $n$ be the number of random variables and the covariance matrix be

$$C_n(a, b) = \begin{bmatrix} a & b & \ldots & b\\ b & a & \ldots & b\\ \vdots & \vdots & \ddots & \vdots\\ b & b & \ldots & a \end{bmatrix}$$

Now $C_2(1, -1)$ is clearly intuitively possible (you need two perfectly anti-correlated random variables). But $C_3(1,-1)$ is impossible: if the second RV is perfectly anti-correlated with the first, and the third is also perfectly anti-correlated with the first, then the second must be perfectly correlated with the third, so the two of them cannot also be anti-correlated with each other.

According to the linear algebra in the other answers, $C_3(1,-1/2)$ is barely possible, while $C_n(1,-1/2)$ is impossible for $n>3$.

The intuition is that if you have many other RVs that are each sufficiently anti-correlated with the first one, then some of those RVs must be somewhat positively correlated with each other.
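To see that $C_3(1,-1/2)$ really is (just barely) attainable, here is a small R sketch based on the 120-degree geometry mentioned in the comment below: projecting a standard bivariate normal onto three unit vectors at $120°$ gives covariances $\cos(120°)=-1/2$.

# Three unit vectors in the plane at 120-degree angles
angles <- c(0, 2 * pi / 3, 4 * pi / 3)
U <- cbind(cos(angles), sin(angles))   # 3 x 2, rows are unit vectors

U %*% t(U)                             # exactly C_3(1, -1/2)

# Project bivariate standard normal draws onto the three directions
Z <- matrix(rnorm(2e5), ncol = 2)      # rows ~ N(0, I_2)
X <- Z %*% t(U)                        # columns are X_1, X_2, X_3
round(cov(X), 2)                       # approx. 1 on the diagonal, -0.5 off it
rowSums(X)[1:3]                        # 0 up to rounding: the sum has zero variance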

  • There's also a cute geometric POV here. If we interpret the covariance between two random variables as an inner product, then $C_n(1,-1)$ would amount to having $n$ unit vectors which are all mutually anti-parallel. But there can only be at most two anti-parallel vectors. (This idea goes back to at least a paper of de Finetti---see http://www.brunodefinetti.it/Opere/AboutCorrelations.pdf.) For three vectors in the plane, the best we can do is place them at 120 degree angles to get inner products of $-1/2$. – Semiclassical Dec 16 '24 at 00:28

As shown here, an $n\times n$ matrix with $a$ on the diagonal and $b$ elsewhere, that is, the matrix $$\begin{bmatrix} a & b & \ldots & b\\ b & a & \ldots & b\\ \vdots & \vdots & \ddots & \vdots\\ b & b & \ldots & a\end{bmatrix}$$ has $\color{blue}{a +(n-1)b}$ as one of its eigenvalues. For your particular matrix, we have $a = 1, b = -0.5, n =8$, so $$a + (n-1)b = 1 + 7\times (-0.5) = -2.5< 0.$$ Hence your symmetric matrix has a negative eigenvalue ($-2.5$), so cannot be positive semi-definite. This implies that it is not a valid variance-covariance matrix.
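A quick R check of this eigenvalue formula with some arbitrary values of $a$, $b$, and $n$ (a sketch):

a <- 2; b <- 0.3; n <- 5
M <- matrix(b, n, n); diag(M) <- a
eigen(M, symmetric = TRUE)$values
# [1] 3.2 1.7 1.7 1.7 1.7   <- a + (n-1)b = 3.2; a - b = 1.7 with multiplicity n-1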

EDIT: Just realised you said you already know this...

  • Yup! Sorry, the title was misleading people, I think. I already knew for a fact it wasn't positive semi-definite, but I thought random variables could theoretically have this structure. Something specific is violated, in terms of probability, that makes it not positive semi-definite, and I was curious what it might be. – OGV Mar 20 '19 at 20:43

The covariance matrix $\ \Sigma\ $ you propose has the form $$ \Sigma=aI- b\mathbf{1}\mathbf{1}^T $$ (where $\ a=1.5\ $ and $\ b=0.5\ $ in your case). If this is the covariance matrix of a random vector $\ \mathbf{X}\ $ of dimension $\ n\ $ with mean $\ \vec{\mu}\ ,$ then, as Semiclassical points out in an earlier answer, the fact that the variance of the random variable $\ \mathbf{1}^T\mathbf{X}\ $ must be non-negative imposes a limit on the relative sizes of $\ a,b\ $ and $\ n\ :$ \begin{align} 0&\le\text{Var}\big(\mathbf{1}^T\mathbf{X}\big)\\ &=\mathbb{E}\big(\big(\mathbf{1}^T(\mathbf{X}-\vec{\mu})\big)^2\big)\\ &=\mathbb{E}\big(\mathbf{1}^T(\mathbf{X}-\vec{\mu})(\mathbf{X}-\vec{\mu})^T\mathbf{1}\big)\\ &=\mathbf{1}^T\big(aI- b\mathbf{1}\mathbf{1}^T\big)\mathbf{1}\\ &=a\mathbf{1}^T\mathbf{1}-b\big(\mathbf{1}^T\mathbf{1}\big)^2\\ &=an-bn^2\ , \end{align} so the inequality $\ bn\le a\ $ must be satisfied. In fact, since the eigenvalues of $\ \Sigma\ $ are $\ a\ $ and $\ a-bn\ ,$ the inequality $\ a\ge\max(0,bn)\ $ is necessary and sufficient for $\ \Sigma\ $ to be a covariance matrix of some random vector. In your case, if you want all the diagonal entries of your $\ 8\times8\ $ matrix to be $1$ and all the off-diagonal entries to be the same, then you'll need to choose the value $\ {-}v\ ,$ say, of those off-diagonal entries so that $$ 8v\le1+v\ , $$ that is, $$ v\le\frac{1}{7}\ . $$
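Tying this back to the original goal of generating MVN data: with off-diagonal entries $-1/7$, the $8\times 8$ matrix sits exactly on the PSD boundary (smallest eigenvalue $0$), and a sampler such as MASS::mvrnorm should accept it, since its default tolerance permits a numerically-zero eigenvalue. A hedged R sketch, assuming the MASS package is available:

n <- 8
Sigma <- matrix(-1/7, n, n); diag(Sigma) <- 1
min(eigen(Sigma, symmetric = TRUE)$values)   # 0 up to rounding: on the PSD boundary

X <- MASS::mvrnorm(1e4, mu = rep(0, n), Sigma = Sigma)
round(cor(X), 2)   # off-diagonal entries near -1/7, i.e. approx. -0.14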