
If $X_1, X_2\ldots,X_n$ are independent variables with $X_i \sim \mathcal N(i\theta,1)$, $\theta$ is an unknown parameter. What is a one dimensional sufficient statistic $T$ of this sample?

I have an intuitive guess that the answer is $\frac{1}{n}\sum_{i=1}^n \frac{X_i}{i}$, but I don't know how to prove it from the definition or derive it using the factorization theorem. Can anyone give me a hint on how to derive it?

Thanks!

Lesley
  • Maybe you can define $X_i=Y_i\cdot i$ with $Y_i \sim \mathcal N(\theta,1)$. – callculus42 Aug 29 '15 at 18:13
  • Your intuition is completely back-to-front: knowing $X_{100}=201$ is more helpful than knowing $X_1=3$ and so $X_{100}$ should have a higher weighting than $X_1$. So intuitively $\displaystyle \sum_{i=1}^n iX_i$ is more likely to be a sufficient statistic, or $\dfrac{ \sum_i iX_i}{ \sum_i i^2}$ if you want an unbiased estimator – Henry Aug 29 '15 at 18:15
  • @calculus: you would have $Y_i \sim \mathcal N(\theta,1/i^2)$ which may not make things easier – Henry Aug 29 '15 at 18:18
  • I have deleted my answer for now since I suspect an error on these grounds: it makes sense to use as an estimate of a mean a weighted average of observations with the weights proportional to the reciprocals of the variances. (Here I have in mind the expected value $\theta$ of $X_i/i$.) Among linear combinations of the observations, that one has the smallest mean squared error. And Lehmann--Scheffe tells us that the estimator with the smallest mean squared error should be a function of the sufficient statistic. (This doesn't account for estimators that are not linear combinations of${},\ldots$ – Michael Hardy Aug 29 '15 at 19:15
  • $\ldots,{}$the observations; hence there are some uncertainties I need to clear up.) ${}\qquad{}$ – Michael Hardy Aug 29 '15 at 19:15
  • @MichaelHardy: If the variance of $X_i/i$ is $1/i^2$ then its reciprocal (the weights) is $i^2$ so the weighted mean of the $X_i/i$ would be $\dfrac{\sum_i i^2 X_i/i}{\sum_i i^2} = \dfrac{\sum_i i X_i}{\sum_i i^2}$. This is indeed a function of $ \sum_i i X_i$. So this makes sense to me. – Henry Aug 29 '15 at 23:03

1 Answer


$$f_{X_1,\ldots,X_n}(x_1,\ldots,x_n \mid \theta) = (2\pi)^{-n/2} \prod_{i=1}^n \exp\left( -\frac{1}{2} (x_i - i\theta)^2 \right) $$

$$= (2\pi)^{-n/2} \exp\left( -\frac{1}{2} \sum_{i=1}^n(x_i - i\theta)^2 \right) $$

$$= (2\pi)^{-n/2} \exp\left( -\frac{1}{2} \left(\sum_{i=1}^n x_i^2 -2\theta\sum_{i=1}^n ix_i + \theta^2\sum_{i=1}^n i^2 \right)\right) $$

$$= (2\pi)^{-n/2} \exp\left( -\frac{\displaystyle\sum_{i=1}^n i^2}{2} \left(\frac{\displaystyle\sum_{i=1}^n x_i^2}{\displaystyle\sum_{i=1}^n i^2}-\left(\frac{\displaystyle\sum_{i=1}^n ix_i}{\displaystyle\sum_{i=1}^n i^2}\right)^2 +\left(\frac{\displaystyle\sum_{i=1}^n ix_i}{\displaystyle\sum_{i=1}^n i^2}\right)^2 -2\theta\frac{\displaystyle\sum_{i=1}^n ix_i}{\displaystyle\sum_{i=1}^n i^2} + \theta^2 \right)\right) $$

$$= (2\pi)^{-n/2} \exp\left( -\frac{1}{2} \left(\displaystyle\sum_{i=1}^n x_i^2-\frac{\left(\displaystyle\sum_{i=1}^n ix_i\right)^2}{\displaystyle\sum_{i=1}^n i^2} \right)\right)\exp\left( -\frac{\displaystyle\sum_{i=1}^n i^2}{2} \left(\frac{\displaystyle\sum_{i=1}^n ix_i}{\displaystyle\sum_{i=1}^n i^2} - \theta\right)^2 \right). $$

By the factorization theorem, the first factor does not involve $\theta$, and the last $\exp$ factor depends on the data only through $\displaystyle\sum_{i=1}^n ix_i$, so $\displaystyle\sum_{i=1}^n ix_i$ is a sufficient statistic for $\theta$.
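As a quick sanity check (a simulation sketch, not part of the derivation above), one can compare the estimator $\frac{\sum_i iX_i}{\sum_i i^2}$ built from this sufficient statistic with the equal-weight guess $\frac{1}{n}\sum_i \frac{X_i}{i}$ from the question. Both are unbiased for $\theta$, but the weighted one attains the smaller variance $1/\sum_i i^2$, as the comments predict:

```python
import random

random.seed(0)
theta = 2.0   # true parameter, chosen arbitrarily for the simulation
n = 5
trials = 20000

sum_i_sq = sum(i * i for i in range(1, n + 1))  # sum of i^2 = 55 for n = 5

est_suff = []   # estimator based on the sufficient statistic: sum(i*X_i) / sum(i^2)
est_guess = []  # the question's guess: (1/n) * sum(X_i / i)
for _ in range(trials):
    x = [random.gauss(i * theta, 1.0) for i in range(1, n + 1)]
    est_suff.append(sum(i * xi for i, xi in enumerate(x, start=1)) / sum_i_sq)
    est_guess.append(sum(xi / i for i, xi in enumerate(x, start=1)) / n)

mean_suff = sum(est_suff) / trials
var_suff = sum((e - mean_suff) ** 2 for e in est_suff) / trials
mean_guess = sum(est_guess) / trials
var_guess = sum((e - mean_guess) ** 2 for e in est_guess) / trials

# Both means should be close to theta = 2.0; the weighted estimator's
# variance should be close to 1/55 ~ 0.018, smaller than the guess's
# variance (1/n^2) * sum(1/i^2) ~ 0.059.
print(mean_suff, var_suff)
print(mean_guess, var_guess)
```

Note the weights $i$ in the sufficient statistic are proportional to $i = \operatorname{Cov}(X_i, \theta\text{-signal})/\operatorname{Var}(X_i)$; equivalently, averaging the unbiased pieces $X_i/i$ with weights proportional to the reciprocals of their variances $1/i^2$ gives the same estimator, exactly as in the comment thread.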

Henry