
Let $\boldsymbol{p}$ be an $n$-dimensional real vector with $p_i \geq 0$ and $\sum_i p_i = 1$.
Is there a general analytical solution for, or a simple way to compute, the eigendecomposition
$\boldsymbol{I}_p - \boldsymbol{p}\boldsymbol{p}^T = \boldsymbol{U \Lambda U^T}$, where $\boldsymbol{I}_p$ is the $n \times n$ diagonal matrix with entries $p_i$ on the diagonal?
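
I do not have a closed form; for concreteness, here is a minimal numerical sketch (NumPy, with an arbitrary example vector $\boldsymbol{p}$) of the decomposition I am after:

```python
# Minimal numerical route (no closed form): eigendecomposition of diag(p) - p p^T.
import numpy as np

p = np.array([0.1, 0.2, 0.3, 0.4])          # example probability vector (sums to 1)
M = np.diag(p) - np.outer(p, p)             # the matrix I_p - p p^T from the question

lam, U = np.linalg.eigh(M)                  # M is symmetric, so eigh returns Lambda and U
print(lam)                                  # eigenvalues; one of them is numerically zero
print(np.allclose(U @ np.diag(lam) @ U.T, M))   # U Lambda U^T reconstructs M -> True
```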


The multinomial distribution can be approximated by the multivariate normal distribution. Let $\boldsymbol{p}$ be the $n$-dimensional probability vector of the multinomial and $k$ the number of trials. Then the mean vector of the approximating multivariate normal distribution is $k\boldsymbol{p}=(kp_1,kp_2,\ldots,kp_n)$, and its covariance matrix is the $n \times n$ symmetric matrix with diagonal elements $kp_i(1-p_i),\ i=1,2,\ldots,n$, and off-diagonal elements $-kp_ip_j,\ i\neq j$.
Hence the covariance matrix is exactly the term $k (\boldsymbol{I}_p - \boldsymbol{p}\boldsymbol{p}^T)$ above.
Its eigendecomposition $\boldsymbol{U \Lambda U^T}$ then lets me generate $\boldsymbol{Y}=\boldsymbol{U}\sqrt{k \boldsymbol{\Lambda}}\, \boldsymbol{X}$, where $\boldsymbol{X} \sim \mathcal{N}(0,\boldsymbol{I})$ is a column vector of $n$ standard normal RVs, i.e. $n$ RVs $Y_i=\sum_{j=1}^n u_{i,j} \sqrt{k\lambda_j}\, X_j$ with the desired covariance.
I believe the $u_{i,j}$ have been analytically determined for this multinomial-normal relation.
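
As a sanity check of this construction (only a sketch; $k$, $\boldsymbol{p}$ and the number of draws below are arbitrary illustrative choices), one can draw $\boldsymbol{Y}$ as above and compare its sample covariance, and that of actual multinomial draws, with $k(\boldsymbol{I}_p - \boldsymbol{p}\boldsymbol{p}^T)$:

```python
# Sketch: draw Y = U sqrt(k Lambda) X and check its sample covariance against
# k (diag(p) - p p^T).  k, p and the number of draws are illustrative choices only.
import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.1, 0.2, 0.3, 0.4])
k = 50
cov = k * (np.diag(p) - np.outer(p, p))     # target covariance k (I_p - p p^T)

klam, U = np.linalg.eigh(cov)               # cov = U diag(k*lambda_i) U^T
klam = np.clip(klam, 0.0, None)             # clip tiny negative round-off to zero

X = rng.standard_normal((len(p), 100_000))  # columns are independent N(0, I) vectors
Y = U @ (np.sqrt(klam)[:, None] * X)        # Y = U sqrt(k Lambda) X, one draw per column
print(np.abs(np.cov(Y) - cov).max())        # small; shrinks as the number of draws grows

# Compare with the empirical covariance of actual multinomial draws as well:
Z = rng.multinomial(k, p, size=100_000).T
print(np.abs(np.cov(Z) - cov).max())        # also small: cov is the multinomial covariance
```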

M. Noll
  • Can you compute the determinant, to start with? The only similar formula I know is this one, but the similarity may be only psychological, due to the fact that your $\mathbf{I}_p$ "looks like" the identity... – Giuseppe Negro Oct 20 '22 at 11:15
  • Thx - looked into it. This gave me the idea to look for the determination of eigenvalues of a sum of two matrices. It would appear this is not a straightforward operation. I'll add to my post a bit more background info on how I came up with my stated problem. The problem originates from an elementary relationship to the Gaussian, which is reason enough for me to believe that someone has already solved it. – M. Noll Oct 20 '22 at 11:33
  • We can compute the determinant by the matrix determinant lemma: $\det (I_p+pp^\top) = (1+p^\top I_p^{-1}p) \det I_p =(1+p^\top\mathbf 1)\prod_i p_i=(1+\sum_i p_i) \prod_i p_i$, where $\mathbf 1=(1,1,\ldots,1)^\top$. See https://en.wikipedia.org/wiki/Matrix_determinant_lemma – Jochen Oct 20 '22 at 12:04
  • How handy! Didn't know this one. Thx a lot Jochen. – M. Noll Oct 20 '22 at 12:11
  • I'm pretty sure you can compute $p^T \mathbf 1 = 1$ and $I_p \mathbf 1 =p$, so $(I_p - pp^T)\mathbf 1 = p - p = 0$, i.e. $\mathbf 1$ is in the kernel. I don't know how to reconcile my observation with @Jochen's though, in the case that all $p_i$ are nonzero. – Dustan Levenstein Oct 20 '22 at 12:34
  • Oh, you need to replace every $+$ with $-$ to correct their application of the matrix determinant lemma. – Dustan Levenstein Oct 20 '22 at 12:40
  • Thx for the remark @Dustan. That would actually result in $\det(I_p - p p^T) = (1-1) \prod_i p_i = 0$. – M. Noll Oct 20 '22 at 12:59
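
A small numerical check reconciling the last two comments (a sketch with an arbitrary example $\boldsymbol{p}$): with the sign corrected, the lemma gives $0$, which matches $\mathbf 1$ lying in the kernel:

```python
# Sketch: numerically reconcile the two observations above for an example p.
import numpy as np

p = np.array([0.1, 0.2, 0.3, 0.4])
M = np.diag(p) - np.outer(p, p)

# Matrix determinant lemma with the sign corrected:
# det(diag(p) - p p^T) = (1 - p^T diag(p)^{-1} p) det(diag(p)) = (1 - sum_i p_i) prod_i p_i = 0
lemma_det = (1.0 - np.sum(p)) * np.prod(p)  # p^T diag(p)^{-1} p = sum_i p_i = 1
print(lemma_det, np.linalg.det(M))          # both are (numerically) zero

# ...which matches the kernel observation: (diag(p) - p p^T) 1 = p - p = 0
print(np.allclose(M @ np.ones_like(p), 0.0))    # True
```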

0 Answers