
Given a $k$-variate normal distribution with mean $\mathbf \mu$ and covariance matrix $\mathbf \Sigma$, what is the variance of a 1-draw sample (which will be of size $k$) from this distribution? (I am not even sure how to begin calculating this.)

Add 1

So, given one such draw $\mathbf X = (X_1,X_2,...,X_k)^T$, the sample variance (or variance of the sample?) is given by

$$\frac{1}{k}\sum_{i=1}^{k}\left({X_i}-\frac{1}{k}\sum_{i=1}^{k}{X_i} \right)^2$$

but what about "population"? Do we "just" take the expectation of the above?

Add 2

If we expand the above to express it as

$$\frac{1}{k}\sum_{i=1}^{k}\left({X_i}\right)^2 - \left(\frac{1}{k}\sum_{i=1}^{k}{X_i} \right)^2$$

then the second term is a square of a normal RV with mean $m = \frac{1}{k}\sum_{i=1}^{k}{\mu_i}$ and variance (if I am not mistaken) $\sigma^2=\frac{1}{k^2} \mathbf 1^T \mathbf \Sigma \mathbf 1$.
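The claimed mean and variance of the sample-mean term can be sanity-checked numerically (a sketch, assuming NumPy is available; the particular $\mu$ and $\Sigma$ below are arbitrary choices, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
k = 3
mu = np.array([1.0, 2.0, 4.0])           # arbitrary mean vector
A = rng.standard_normal((k, k))
Sigma = A @ A.T                          # arbitrary positive semi-definite covariance

# Many independent "one-draw samples" from N(mu, Sigma)
draws = rng.multivariate_normal(mu, Sigma, size=200_000)
sample_means = draws.mean(axis=1)        # (1/k) * sum_i X_i for each draw

m_theory = mu.mean()                                  # m = (1/k) * sum_i mu_i
var_theory = np.ones(k) @ Sigma @ np.ones(k) / k**2   # (1/k^2) * 1^T Sigma 1

# Both differences should be near zero, up to Monte Carlo error
print(sample_means.mean() - m_theory)
print(sample_means.var() - var_theory)
```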

  • Well, it doesn't make much sense to talk about the "variance" of a random vector. If you mean its covariance matrix, then it should just be $\mathbf\Sigma$, since the "one draw sample" has just the distribution in question as its distribution. – Vim Jul 20 '17 at 16:24
  • No, I do mean the variance of the elements within the vector. Each time I draw one observation from a $k$-variate distribution, I get a vector of $k$ elements. What is the variance of those elements? – Confounded Jul 20 '17 at 16:31
  • If you want the variance of each element, just look at the diagonal of the covariance matrix. – Vim Jul 20 '17 at 16:36
  • No, not of each element. Please see the edit, hopefully it makes it clearer. Thanks – Confounded Jul 20 '17 at 16:42
  • I see. If you want the average variance or "population variance" then I think you have to take expectation of this expression. I can't see an easy closed form at this moment though. – Vim Jul 20 '17 at 16:53

1 Answer


Clearly,

$$\left( X_i - \frac{1}{k} \sum_{j=1}^k X_j \right)^2 = X_i^2 - \frac{2}{k} \sum_{j=1}^k X_i X_j + \frac{1}{k^2} \sum_{j=1}^k \sum_{\ell=1}^k X_j X_{\ell}.$$

Summing over $i=1,\ldots,k$, dividing by $k$, and taking the expectation, we find

$$\begin{align*} \mathbb{E} \left( \frac{1}{k} \sum_{i=1}^k \left( X_i - \frac{1}{k} \sum_{j=1}^k X_j \right)^2 \right) &= \frac{1}{k} \sum_{i=1}^k \mathbb{E}(X_i^2) - \frac{2}{k^2} \sum_{i=1}^k \sum_{j=1}^k \mathbb{E}(X_i X_j) + \frac{1}{k^2} \sum_{j=1}^k \sum_{\ell=1}^k \mathbb{E}(X_j X_{\ell}) \\ &= \frac{1}{k} \sum_{i=1}^k \mathbb{E}(X_i^2) - \frac{1}{k^2} \sum_{i=1}^k \sum_{j=1}^{k} \mathbb{E}(X_i X_j). \end{align*}$$

The right-hand side can be expressed in terms of the covariance matrix $\Sigma$ and the mean $\mu$.
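Concretely (a routine computation, assuming I have not slipped): since $\mathbb{E}(X_i^2) = \Sigma_{ii} + \mu_i^2$ and $\mathbb{E}(X_i X_j) = \Sigma_{ij} + \mu_i \mu_j$, the expectation above becomes

$$\mathbb{E} \left( \frac{1}{k} \sum_{i=1}^k \left( X_i - \frac{1}{k} \sum_{j=1}^k X_j \right)^2 \right) = \frac{1}{k} \operatorname{tr}(\Sigma) - \frac{1}{k^2} \mathbf 1^T \Sigma \mathbf 1 + \frac{1}{k} \sum_{i=1}^k (\mu_i - \bar\mu)^2, \qquad \bar\mu := \frac{1}{k} \sum_{i=1}^k \mu_i.$$

The first two terms are the average within-vector variance coming from $\Sigma$ after centering, and the last term accounts for the spread of the component means.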

saz
  • Thank you for your reply. Does taking the expectation of sample variance, however, define the population variance? For example, we know that, as stated, it would be a biased estimate of variance in case of IIDs. – Confounded Jul 20 '17 at 17:26
  • @Confounded No, it's not biased in case of IIDs; see also this question: https://math.stackexchange.com/q/1924343/ – saz Jul 20 '17 at 18:07
  • Isn't it true that in the case of IIDs one needs to use Bessel's correction and use $k-1$ instead of $k$ in the denominator of the outermost sum in order to have an unbiased estimate? – Confounded Jul 21 '17 at 08:14
  • @Confounded Sorry, my mistake; let me clarify. Set $$S_k := \frac{1}{k} \sum_{i=1}^k \left( X_i - \bar{X} \right)^2,$$ then $S_k \to \sigma^2$ almost surely as $k \to \infty$. You are right, however, that $S_k$ is not unbiased, as $\mathbb{E}(S_k) < \sigma^2$. In contrast, $T_k := \frac{k}{k-1} S_k$ is unbiased and we still have $T_k \to \sigma^2$ as $k \to \infty$. So if you are looking for an unbiased estimate, then you should indeed use $T_k$ and not $S_k$. – saz Jul 21 '17 at 13:25
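Plugging $\mathbb{E}(X_i X_j) = \Sigma_{ij} + \mu_i \mu_j$ into the answer's last display gives (if I have computed correctly) the closed form $\frac{1}{k}\operatorname{tr}(\Sigma) - \frac{1}{k^2}\mathbf 1^T \Sigma \mathbf 1 + \frac{1}{k}\sum_{i}(\mu_i - \bar\mu)^2$ with $\bar\mu$ the average of the $\mu_i$. A Monte Carlo sketch checking this (NumPy assumed; the particular $\mu$ and $\Sigma$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
k = 4
mu = np.array([0.0, 1.0, -2.0, 3.0])     # arbitrary mean vector
A = rng.standard_normal((k, k))
Sigma = A @ A.T                          # arbitrary positive semi-definite covariance

draws = rng.multivariate_normal(mu, Sigma, size=200_000)
# Within-vector variance of each draw, with denominator k as in the question
S = draws.var(axis=1)

ones = np.ones(k)
theory = (np.trace(Sigma) / k
          - ones @ Sigma @ ones / k**2
          + np.sum((mu - mu.mean())**2) / k)

# The empirical average of S should match the closed form, up to Monte Carlo error
print(S.mean(), theory)
```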