I'm looking for a proof (a simple one, if possible) of why Hotelling's $T^2$ is chi-squared distributed for large $n$. I understand and can show that the Mahalanobis distance is chi-squared distributed (as below), but I have trouble showing the same for Hotelling's $T^2$, since it contains the factor $n$ and I'm not sure what to do with it.

Hotelling's $T^2$: $n(\bar{\boldsymbol{X}} - \boldsymbol{\mu})^T\boldsymbol{S}^{-1}(\bar{\boldsymbol{X}} - \boldsymbol{\mu})$

I know that for large $n$ we can assume $\boldsymbol{S}^{-1} \approx \boldsymbol{\Sigma}^{-1}$, and that $\boldsymbol{\Sigma}^{-1} = \boldsymbol{\Sigma}^{-\frac{1}{2}}\boldsymbol{\Sigma}^{-\frac{1}{2}}$, so for large $n$ we can rewrite Hotelling's $T^2$ as:

$n(\bar{\boldsymbol{X}} - \boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-1}(\bar{\boldsymbol{X}} - \boldsymbol{\mu})$

and expand it to

$n(\bar{\boldsymbol{X}} - \boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-\frac{1}{2}}\boldsymbol{\Sigma}^{-\frac{1}{2}}(\bar{\boldsymbol{X}} - \boldsymbol{\mu})$

Mahalanobis distance proof (for $\boldsymbol{X} \sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, so that $\boldsymbol{Y} = \boldsymbol{\Sigma}^{-\frac{1}{2}}(\boldsymbol{X} - \boldsymbol{\mu}) \sim N_p(\boldsymbol{0}, \boldsymbol{I})$): $$ \begin{align} D &= (\boldsymbol{X} - \boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-1}(\boldsymbol{X} - \boldsymbol{\mu}) \\ &= (\boldsymbol{X} - \boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-\frac{1}{2}}\boldsymbol{\Sigma}^{-\frac{1}{2}}(\boldsymbol{X} - \boldsymbol{\mu}) \\ &= \big(\boldsymbol{\Sigma}^{-\frac{1}{2}}(\boldsymbol{X} - \boldsymbol{\mu})\big)^T\big(\boldsymbol{\Sigma}^{-\frac{1}{2}}(\boldsymbol{X} - \boldsymbol{\mu})\big)\\ &= \boldsymbol{Y}^T\boldsymbol{Y} \\ &= ||\boldsymbol{Y}||^2\\ &= \sum \limits_{k=1}^pY_k^2 \\ D &\sim \chi_p^2 \end{align} $$

I also know that $\frac{n-p}{(n-1)p}T^2 \sim F_{p, n-p}$ and that, for large $n$ and fixed $p$, $p\,F_{p, n-p}$ is approximately $\chi_p^2$ distributed, but when I try to connect these facts into a proof I end up lost and frustrated.

1 Answer

The answer turned out to be rather simple. Hotelling's $T^2$ is the multivariate analogue of the squared t-test statistic.

\begin{align} t &= \frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}} \\ t^2 &= \bigg( \frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}} \bigg)^2 \\ &= \bigg( \frac{n^{\frac{1}{2}}(\bar{X}-\mu)}{\sigma} \bigg)\bigg( \frac{n^{\frac{1}{2}}(\bar{X}-\mu)}{\sigma} \bigg) \end{align}

Now we can view this formula in the multivariate case, keeping in mind that in practice we work with sample statistics rather than population ones. However, as $n \rightarrow \infty$ the sample variance-covariance matrix $\boldsymbol{S}$ converges in probability to the population variance-covariance matrix $\boldsymbol{\Sigma}$.
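A sketch of why replacing $\boldsymbol{S}$ by $\boldsymbol{\Sigma}$ does not change the limiting distribution, assuming i.i.d. observations with finite second moments and a nonsingular $\boldsymbol{\Sigma}$ (the symbols $\boldsymbol{Z}_n$, $\boldsymbol{A}_n$ and $\boldsymbol{Z}$ are introduced here only for illustration, with $\boldsymbol{\Sigma}^{\frac{1}{2}}$ the symmetric square root of $\boldsymbol{\Sigma}$):

\begin{align} \boldsymbol{Z}_n &= \sqrt{n}\,\boldsymbol{\Sigma}^{-\frac{1}{2}}(\bar{\boldsymbol{X}} - \boldsymbol{\mu}) \xrightarrow{d} \boldsymbol{Z} \sim N_p(\boldsymbol{0}, \boldsymbol{I}) && \text{(multivariate CLT)} \\ \boldsymbol{A}_n &= \boldsymbol{\Sigma}^{\frac{1}{2}}\boldsymbol{S}^{-1}\boldsymbol{\Sigma}^{\frac{1}{2}} \xrightarrow{p} \boldsymbol{I} && \text{(consistency of } \boldsymbol{S}\text{)} \\ n(\bar{\boldsymbol{X}} - \boldsymbol{\mu})^T\boldsymbol{S}^{-1}(\bar{\boldsymbol{X}} - \boldsymbol{\mu}) &= \boldsymbol{Z}_n^T\boldsymbol{A}_n\boldsymbol{Z}_n \xrightarrow{d} \boldsymbol{Z}^T\boldsymbol{Z} \sim \chi_p^2 && \text{(Slutsky)} \end{align}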

Since the square root matrix of $\boldsymbol{\Sigma}$ is also symmetric, we have $\boldsymbol{\Sigma}^{-1} = \boldsymbol{\Sigma}^{-\frac{1}{2}}\boldsymbol{\Sigma}^{-\frac{1}{2}}$. Then we have

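\begin{align} T^2 &\approx n(\bar{\boldsymbol{X}} - \boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-1}(\bar{\boldsymbol{X}} - \boldsymbol{\mu}) \\ &= n^{\frac{1}{2}}(\bar{\boldsymbol{X}} - \boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-\frac{1}{2}}\boldsymbol{\Sigma}^{-\frac{1}{2}}n^{\frac{1}{2}}(\bar{\boldsymbol{X}} - \boldsymbol{\mu}) \\ &= \big(n^{\frac{1}{2}}\boldsymbol{\Sigma}^{-\frac{1}{2}}(\bar{\boldsymbol{X}} - \boldsymbol{\mu})\big)^T\big(n^{\frac{1}{2}}\boldsymbol{\Sigma}^{-\frac{1}{2}}(\bar{\boldsymbol{X}} - \boldsymbol{\mu})\big)\\ &= \boldsymbol{Y}^T\boldsymbol{Y} \\ &= ||\boldsymbol{Y}||^2\\ &= \sum \limits_{k=1}^pY_k^2 \\ &\sim \chi_p^2 \end{align}

Here $\boldsymbol{Y} = n^{\frac{1}{2}}\boldsymbol{\Sigma}^{-\frac{1}{2}}(\bar{\boldsymbol{X}} - \boldsymbol{\mu})$ is approximately $N_p(\boldsymbol{0}, \boldsymbol{I})$ for large $n$ by the multivariate central limit theorem (exactly so when the data are normal), so the $Y_k$ are approximately independent standard normals and their sum of squares is approximately $\chi_p^2$.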
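As a rough numerical check of the chi-squared limit, here is a minimal simulation sketch using only the Python standard library. Everything in it is an illustrative assumption rather than part of the derivation: the bivariate case $p = 2$, the normal data with correlation 0.6, the sample size, and the helper names `hotelling_t2`, `draw_sample` and `coverage`. It estimates how often $T^2$ falls below the 0.95 quantile of $\chi_2^2$, which has the closed form $-2\ln(0.05)$.

```python
import math
import random


def hotelling_t2(sample, mu):
    """Hotelling's T^2 = n (xbar - mu)^T S^{-1} (xbar - mu) for bivariate data (p = 2)."""
    n = len(sample)
    xbar = [sum(row[j] for row in sample) / n for j in range(2)]
    # Unbiased sample covariance matrix S, stored as its entries s11, s12, s22.
    s11 = sum((r[0] - xbar[0]) ** 2 for r in sample) / (n - 1)
    s22 = sum((r[1] - xbar[1]) ** 2 for r in sample) / (n - 1)
    s12 = sum((r[0] - xbar[0]) * (r[1] - xbar[1]) for r in sample) / (n - 1)
    det = s11 * s22 - s12 ** 2
    d0, d1 = xbar[0] - mu[0], xbar[1] - mu[1]
    # Quadratic form d^T S^{-1} d via the closed-form inverse of a 2x2 matrix.
    quad = (s22 * d0 * d0 - 2 * s12 * d0 * d1 + s11 * d1 * d1) / det
    return n * quad


def draw_sample(rng, n, mu, rho):
    """Draw n bivariate normal rows with unit variances and correlation rho."""
    rows = []
    for _ in range(n):
        z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append((mu[0] + z0, mu[1] + rho * z0 + math.sqrt(1 - rho ** 2) * z1))
    return rows


def coverage(n=200, reps=1000, seed=0):
    """Fraction of simulated T^2 values below the 0.95 quantile of chi-squared(2)."""
    rng = random.Random(seed)
    mu = (1.0, -2.0)  # hypothetical true mean, chosen arbitrarily
    crit = -2.0 * math.log(0.05)  # chi-squared(2) has CDF 1 - exp(-x / 2)
    hits = sum(
        hotelling_t2(draw_sample(rng, n, mu, 0.6), mu) <= crit for _ in range(reps)
    )
    return hits / reps


if __name__ == "__main__":
    print(coverage())
```

If the approximation holds, the printed fraction should be close to 0.95, and slightly below it for moderate $n$ because the exact $F$-based distribution of $T^2$ has a heavier tail than $\chi_p^2$.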

  • The link StubbornAtom posted was also helpful; it offers a very convincing answer based on Slutsky's theorem and on $T^2$ being F distributed. However, the answer I posted here shows the similarity between $T^2$ and the Mahalanobis distance, and how to prove the result using the whitening transformation, which is what I needed. – Willian Leite Dec 02 '22 at 08:59