
My uni professor has taught us the following:

If the likelihood formed on the basis of a random sample from a distribution belongs to the regular exponential family, then the likelihood equation for finding the ML estimate of the parameter vector $\boldsymbol{\theta}$ is given by $$\mathop{\mathbb{E}}(\boldsymbol{T}(\boldsymbol{X_1}, ..., \boldsymbol{X_n}))=\boldsymbol{T}(\boldsymbol{x_1}, ..., \boldsymbol{x_n})\tag{1}$$ That is, the likelihood equation can be obtained by setting the expectation of $\boldsymbol{T}(\boldsymbol{X_1}, ..., \boldsymbol{X_n})$ equal to its observed value.
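For context, my rough understanding of where equation (1) comes from (a sketch written in the canonical form $f(x;\boldsymbol{\eta})=h(x)\exp(\boldsymbol{\eta}^\top\boldsymbol{T}(x)-A(\boldsymbol{\eta}))$, which may not match my professor's notation exactly): the log-likelihood of the sample is $\ell(\boldsymbol{\eta}) = \sum_{i=1}^n \log h(\boldsymbol{x_i}) + \boldsymbol{\eta}^\top\sum_{i=1}^n \boldsymbol{T}(\boldsymbol{x_i}) - nA(\boldsymbol{\eta})$, so the score equation is
$$\nabla_{\boldsymbol{\eta}}\,\ell(\boldsymbol{\eta}) = \sum_{i=1}^n \boldsymbol{T}(\boldsymbol{x_i}) - n\nabla A(\boldsymbol{\eta}) = \boldsymbol{T}(\boldsymbol{x_1},\dots,\boldsymbol{x_n}) - \mathop{\mathbb{E}}(\boldsymbol{T}(\boldsymbol{X_1},\dots,\boldsymbol{X_n})) = \boldsymbol{0},$$
using that $\nabla A(\boldsymbol{\eta}) = \mathop{\mathbb{E}}(\boldsymbol{T}(\boldsymbol{X_1}))$ and that the joint statistic is the sum $\boldsymbol{T}(\boldsymbol{x_1},\dots,\boldsymbol{x_n})=\sum_{i=1}^n \boldsymbol{T}(\boldsymbol{x_i})$.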

I am having some trouble interpreting the observed values.


If we take the normal distribution (unknown mean $\mu$, known variance $\sigma^2$) for example, we get that the sufficient statistic $\boldsymbol{T}(\boldsymbol{X_1}, ..., \boldsymbol{X_n})=\frac{x}{\sigma}$.

Calculating the LHS of equation 1: $\mathop{\mathbb{E}}(\boldsymbol{T}(\boldsymbol{X_1}, ..., \boldsymbol{X_n}))=\mathop{\mathbb{E}}(\frac{X}{\sigma})=\frac{1}{\sigma}\mathop{\mathbb{E}}(X)=\frac{\mu}{\sigma}$.

I'm not exactly sure how to calculate $\boldsymbol{T}(\boldsymbol{x_1}, ..., \boldsymbol{x_n})$.

Can anyone provide some guidance? (I haven't been able to find any resources that use this result.)

Wivaviw

1 Answer


It is a bit unclear what you mean by $T(X_1,\dots,X_n)=\frac{x}{\sigma}$. I believe it should be $T(X_1,\dots,X_n)=\frac{1}{n\sigma} \sum_{i=1}^n X_i$, which is a sufficient statistic. It is then correct that $$\mathbb{E}[T(X_1,\dots,X_n)] = \frac{\mu}{\sigma}.$$

The quantity $T(x_1,\dots,x_n)$ is computed simply as $T(x_1,\dots,x_n)=\frac{1}{n\sigma} \sum_{i=1}^n x_i$, that is, by replacing the random variables $X_i$ with the observed values $x_i$.

The likelihood equation is then $$T(x_1,\dots,x_n) = \frac{1}{n\sigma} \sum_{i=1}^n x_i = \frac{\mu}{\sigma},$$ which yields the well-known maximum likelihood estimate $$\hat{\mu} = \frac{1}{n} \sum_{i=1}^n x_i$$ for normally distributed data.
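A quick numerical sanity check of this (a minimal sketch, not part of the original derivation; `mu_true`, `sigma`, and `n` are arbitrary illustrative values):

```python
import numpy as np

# Sketch: verify that solving E[T] = T(x_1, ..., x_n) for mu recovers
# the sample mean, for N(mu, sigma^2) with known sigma.
rng = np.random.default_rng(0)
mu_true, sigma, n = 2.0, 1.5, 10_000        # illustrative values
x = rng.normal(mu_true, sigma, size=n)

T_obs = x.sum() / (n * sigma)               # T(x_1, ..., x_n) = (1/(n*sigma)) * sum x_i
mu_hat = sigma * T_obs                      # solve mu/sigma = T_obs for mu

print(T_obs)               # close to mu_true / sigma
print(mu_hat, x.mean())    # mu_hat coincides with the sample mean
```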

  • I think I'm a little confused by the notation. The exponential family form I use to find the sufficient statistic is $f(x; \theta)= h(x) c(\theta) \exp(\sum_{i=1}^k w_i(\theta)t_i(x))$, where I thought the $t_i(x)$ give the sufficient statistic. Am I supposed to include a summation? – Wivaviw Feb 28 '22 at 12:50
  • @Wivaviw Note that if $X_1,\dots,X_n$ are independent with the same density $f(x_i;\theta) := h(x_i)c(\theta)\exp(\sum_{j=1}^k w_j(\theta)t_j(x_i))$, then the joint density of $(X_1,\dots,X_n)$ becomes $$f(x_1,\dots,x_n ; \theta) = \prod_{i=1}^n f(x_i ; \theta) = h(x_1)\dots h(x_n)c(x_1)\dots c(x_n) \exp\left(\sum_{j=1}^k w_j(\theta)(t_j(x_1)+t_j(x_2)+\dots+t_j(x_n))\right);$$ in particular, it is clear that the joint distribution is again an exponential family, but the new sufficient statistic is given by $T(x_1)+\dots+T(x_n)$. – Leander Tilsted Kristensen Feb 28 '22 at 13:47
  • Thank you for your explanation. I just wanted to clarify a couple of things about your expression for the joint density. Do you mean to have $c(\theta)^n$, since $c$ doesn't depend on $x_i$? Also, within the exponent, wouldn't it be $\sum_{i=1}^n \sum_{j=1}^k w_j(\theta)t_j(x_i)$? From what I could work out in my research, I'm pretty sure the $k$ corresponds to the number of unknown parameters we're working with. In the case of this example, it's just $k=1$ for the unknown mean $\mu$. – Wivaviw Feb 28 '22 at 16:25
  • Yes, that was a typo; it should have been $c(\theta)^n$ in my previous comment. We have that $$\sum_{i=1}^n \sum_{j=1}^k w_j(\theta)t_j(x_i) = \sum_{j=1}^k w_j(\theta)\left(\sum_{i=1}^n t_j(x_i)\right),$$ where the expression on the right-hand side has the advantage that we can read off the sufficient statistic to be $\sum_{i=1}^n t_j(x_i)$. Yes, for this example $k=1$. – Leander Tilsted Kristensen Feb 28 '22 at 16:39
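
To tie the comments back to the original example (a sketch using the factorization discussed above, with $w_1(\mu)=\mu/\sigma$ and $t_1(x)=x/\sigma$ for $N(\mu,\sigma^2)$ with known $\sigma$): the joint sufficient statistic is $$T(x_1,\dots,x_n)=\sum_{i=1}^n t_1(x_i) = \frac{1}{\sigma}\sum_{i=1}^n x_i,$$ and equation (1) becomes $$\mathbb{E}\left[\frac{1}{\sigma}\sum_{i=1}^n X_i\right] = \frac{n\mu}{\sigma} = \frac{1}{\sigma}\sum_{i=1}^n x_i \;\Longrightarrow\; \hat{\mu} = \frac{1}{n}\sum_{i=1}^n x_i,$$ matching the answer above (up to the constant factor $\frac{1}{n}$ in how $T$ is normalized).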