
Let $n$ be a positive integer, and let $(p_i)_{i \in \{1,\cdots,n\}}$ be a finite sequence of nonnegative real numbers that sum to $1$. Let us denote by $\mathbf{p}$ the column vector of the $p_i$'s.

We consider the $n\times n$ matrix $CV_p := \operatorname{Diag}(p_1,\cdots,p_n) - \mathbf{p}\mathbf{p}^T$.

Let also $Y$ be a random variable such that for all $i \in \{1, \cdots, n\}$, $\mathbb{P}[Y = i] = p_i$. Let us denote by $X$ the dummy-variable encoding of $Y$, that is, $X_i := \delta_{Y,i}$.

Then $CV_p$ is the covariance matrix of $X$: indeed, if $i,j \in \{1,\cdots,n\}$, $\mathbb{E}[(\delta_{Y,i} - \mathbb{E}[\delta_{Y,i}] )(\delta_{Y,j} - \mathbb{E}[\delta_{Y,j}])]$ equals $p_i(1-p_i)$ if $i=j$ and $-p_ip_j$ if $i \neq j$.

Indeed, $\mathbb{E}[\delta_{Y,i}] = p_i$, and

$$\begin{aligned}
\mathbb{E}[(\delta_{Y,i} - \mathbb{E}[\delta_{Y,i}])(\delta_{Y,j} - \mathbb{E}[\delta_{Y,j}])]
&= \mathbb{E}[(\delta_{Y,i} - p_i)(\delta_{Y,j} - p_j)] \\
&= \mathbb{E}[\delta_{Y,i}\delta_{Y,j}] - p_i p_j - p_j p_i + p_i p_j \\
&= \mathbb{E}[\delta_{Y,i}\delta_{Y,j}] - p_i p_j.
\end{aligned}$$

Now, if $i=j$, $\mathbb{E}[\delta_{Y,i}\delta_{Y,j}] = \mathbb{E}[\delta_{Y,i}] = p_i$, so we get $\mathbb{E}[(\delta_{Y,i} - \mathbb{E}[\delta_{Y,i}] )(\delta_{Y,j} - \mathbb{E}[\delta_{Y,j}])] = \mathbb{E}[\delta_{Y,i}\delta_{Y,j}] - p_ip_j = p_i(1-p_i)$.

And if $i\neq j$, $\delta_{Y,i}\delta_{Y,j} = 0$, so $\mathbb{E}[(\delta_{Y,i} - \mathbb{E}[\delta_{Y,i}] )(\delta_{Y,j} - \mathbb{E}[\delta_{Y,j}])] = \mathbb{E}[\delta_{Y,i}\delta_{Y,j}] - p_ip_j = -p_ip_j$.
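
As a quick numerical sanity check (a minimal sketch, not part of the derivation; numpy is assumed, and the particular $\mathbf{p}$ and sample size are arbitrary), the empirical covariance of the one-hot encoding indeed matches $CV_p$:

```python
import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.1, 0.2, 0.3, 0.4])    # any probability vector
n = len(p)

Y = rng.choice(n, size=200_000, p=p)  # samples of Y (0-indexed for convenience)
X = np.eye(n)[Y]                      # one-hot rows: X[k, i] = delta_{Y_k, i}

CV_p = np.diag(p) - np.outer(p, p)
print(np.round(np.cov(X, rowvar=False), 3))  # empirical covariance of X
print(np.round(CV_p, 3))                     # agrees up to sampling noise
```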

Is it possible to provide explicit (in terms of the $p_i$'s) and beautiful formulae for the spectrum of $CV_p$? And for an orthogonal matrix that diagonalizes it?

  • This is an instance of a generally intractable problem (see here and here), so I suspect that the answer is no. – Ben Grossmann Jan 30 '23 at 17:58
  • See these notes for how the BNS formula can be used here – Ben Grossmann Jan 30 '23 at 18:04
  • How did you come to the conclusion that $\Bbb E[\delta_{Y,i}\delta_{Y,j}] = 0$? Shouldn't it be the case that $\Bbb P[Y = i \text{ and } Y = j] = 0$ when $i \neq j$? – Ben Grossmann Jan 30 '23 at 18:09
  • Regardless of its derivation, the matrix $CV_p$ is indeed a covariance matrix of some kind, and it has rank $n-1$ as long as all $p_i$ are positive. The row-sums of $CV_p$ are all equal to $0$, so $CV_p$ will have $(1,1,\dots,1)$ as an eigenvector associated with $0$. – Ben Grossmann Jan 30 '23 at 18:29
  • I recommend that you consider the similar matrix $M = W^*(CV_p)W$, where $W$ denotes a DFT matrix and $W^*$ denotes its conjugate-transpose. Note that the first row and column of $M$ are $0$, corresponding to the zero eigenvalue of $CV_p$. – Ben Grossmann Jan 30 '23 at 18:43
  • Thank you for the references! If you gather everything into an answer, I'll accept it! What's wrong with the formulas? I added details to my derivation. – Plop Jan 30 '23 at 22:14
  • Your work is correct; I had made a mistake when computing the covariance matrix for your random variable – Ben Grossmann Jan 30 '23 at 23:12

1 Answer


This is an instance of a generally intractable problem (see here and here). Numerically, we can make some use of the structure of the matrix in order to quickly find eigenvalues using the BNS algorithm, as is explained here.
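
To sketch what this looks like in practice (a hedged illustration, not a full BNS implementation; scipy is assumed, along with distinct, strictly positive $p_i$): since $CV_p$ is a rank-one downdate of a diagonal matrix, its eigenvalues are the roots of the secular function $f(\lambda) = 1 - \sum_i p_i^2/(p_i - \lambda)$. One root is $0$ (because $\sum_i p_i = 1$), and each of the remaining $n-1$ roots lies strictly between consecutive sorted $p_i$'s:

```python
import numpy as np
from scipy.optimize import brentq

p = np.array([0.1, 0.2, 0.3, 0.4])   # distinct and strictly positive
d = np.sort(p)                       # poles of the secular function

def secular(lam):
    # f(lam) = 1 - sum_i p_i^2 / (p_i - lam); its roots are the eigenvalues
    # of Diag(p) - p p^T, a rank-one "downdate" of a diagonal matrix.
    return 1.0 - np.sum(d**2 / (d - lam))

# lambda = 0 is always a root (since sum_i p_i = 1); the other n-1 roots
# interlace the sorted p_i's, one in each open interval (d_k, d_{k+1}).
eigs = [0.0]
for k in range(len(d) - 1):
    eigs.append(brentq(secular, d[k] + 1e-12, d[k + 1] - 1e-12))

print(np.sort(eigs))
print(np.linalg.eigvalsh(np.diag(p) - np.outer(p, p)))  # cross-check
```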

A few things can be said about $CV_p$: it has rank $n-1$ as long as all $p_i$ are non-zero (more generally, the rank equals the size of the support of the distribution of $Y$), and in that case its kernel is spanned by the vector $(1,1,\dots,1)$.
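
Both claims are easy to check numerically (a minimal sketch, assuming numpy; the probability vector is arbitrary):

```python
import numpy as np

p = np.array([0.1, 0.2, 0.3, 0.4])   # all p_i > 0
CV_p = np.diag(p) - np.outer(p, p)

print(np.linalg.matrix_rank(CV_p))   # n - 1 = 3
print(CV_p @ np.ones(len(p)))        # ~0: (1,...,1) spans the kernel
```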

One strategy that might yield some insight is looking at the similar matrix $M = W^* CV_p W$, where $W$ denotes the (unitary) DFT matrix. The fact that the first column of $W$ is proportional to $(1,1,\dots,1)$, which spans the kernel, means that the first row and column of $M$ will be zero. Empirically, there seems to be some kind of structure in the entries of the resulting matrix (for instance, repeated entries along the diagonal).
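
A sketch of that experiment (scipy is assumed; `scale='sqrtn'` makes $W$ unitary, and its first column is then proportional to the all-ones vector):

```python
import numpy as np
from scipy.linalg import dft

p = np.array([0.1, 0.2, 0.3, 0.4])
CV_p = np.diag(p) - np.outer(p, p)

W = dft(len(p), scale='sqrtn')   # unitary DFT matrix
M = W.conj().T @ CV_p @ W        # similar to CV_p
print(np.round(M, 4))            # first row and column vanish
```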

Ben Grossmann