Suppose $X_1, \dots , X_n$ are IID with density $f_{\theta}(x) = e^{-(x-\theta)}$ for $x > \theta$ (a shifted exponential).

I'm interested in the conditional expectation $E[X_1 | X_{(1)} = t]$, where $X_{(1)} = \min(X_1, \dots, X_n)$.

Let $f_{X_1|T} (x|t)$ be the conditional pdf of $X_1$ given $X_{(1)} = t$. By definition:

\begin{equation} f_{X_1|T} (x|t) = \frac{f_{X_1,T} (x,t)}{f_T(t)} \end{equation}

I have $f_T(t) = ne^{-n(t-\theta)}$ for $t \in (\theta, \infty)$, since $P(X_{(1)} > t) = P(X_1 > t)^n = e^{-n(t-\theta)}$.

For the joint pdf $f_{X_1,T}$ I consider two cases.

If $X_1 = X_{(1)}$, then I argue that $f_{X_1,T}(x,t)$ is $0$ almost everywhere, because it is supported only on the diagonal where $x=t$.

If $X_1 \neq X_{(1)}$, then $X_{(1)}$ equals some $X_i$ with $i \neq 1$, and thus $X_1$ and $X_{(1)}$ are independent. In this case the joint pdf factors as $f_{X_1,T}(x,t) = f_\theta(x) f_T(t)$, so

\begin{equation} f_{X_1|T} (x|t) = \frac{f_\theta(x)f_T(t)}{f_T(t)} = f_\theta(x) \end{equation}

Thus $E[X_1 | X_{(1)} = t] = E[X_1] = \int_{\theta}^\infty xe^{-(x-\theta)}dx = [-xe^{-(x-\theta)} - e^{-(x-\theta)}]_\theta^\infty = \theta + 1$.

Intuitively, this would mean knowing the minimum of the sample doesn't give me any information about $X_1$.

Is this argument correct?

Bastiza
  • Three mistakes: 1. If $X_1 = X_{(1)}$, then $X_1 = t$; we can't just ignore that case. 2. If $X_1\neq X_{(1)}$, this doesn't make $X_1$ and $X_{(1)}$ independent; in particular, we know that $X_1 > X_{(1)}$. (A hint for this part is to use the memoryless property of the exponential.) 3. By your reasoning, you need to split $E[X_1\mid X_{(1)} = t]$ into the conditions where $X_1 = X_{(1)}$ and $X_1 \neq X_{(1)}$ using the law of total expectation. – user806050 Sep 14 '24 at 21:14
  • I'd recommend doing sanity checks on your solution. Like if $t > \theta +1$, then this answer obviously doesn't make sense (how could the expectation of $X_1$ given the minimum be less than the minimum?). – user806050 Sep 14 '24 at 21:19

1 Answer

There is no joint density of $(X_1,X_{(1)})$ in the usual sense (with respect to Lebesgue measure) because $X_1=X_{(1)}$ occurs with a positive probability of $\frac1n$.

Here, as @user806050 suggests, you can use the law of total expectation.

For any $t\in [\theta,\infty)$, you have

\begin{align} E\left[X_1\mid X_{(1)}=t\right]&=E\left[X_1\mid X_1=t\right]\cdot\frac1n + E\left[X_1\mid X_1>t\right]\cdot\left(1-\frac1n\right) \\&=\frac{t}{n}+(1+t)\left(1-\frac1n\right) \\&=t+1-\frac1n. \end{align}

That $E\left[X_1\mid X_1>t\right]=1+t$ can be shown directly from $E\left[X_1\mid X_1>t\right]=\frac{E\left[X_1\mathbf1_{X_1>t}\right]}{P(X_1>t)}$ or equivalently by noting that the conditional distribution of $X_1$ given $X_1>t$ is another shifted exponential with shift $t$:

$$f_{X_1\mid X_1>t}(x)=e^{-(x-t)}\mathbf1_{x>t}.$$
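Carrying out the direct computation with this density:

$$E\left[X_1\mid X_1>t\right]=\int_t^\infty x\,e^{-(x-t)}\,dx=\Big[-(x+1)\,e^{-(x-t)}\Big]_t^\infty=t+1.$$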

Hence, $$E\left[X_1\mid X_{(1)}\right]=X_{(1)}+1-\frac1n \quad,\text{ a.e. }$$
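As a quick numerical sanity check of this formula (of the kind suggested in the comments), here is a minimal Monte Carlo sketch. It assumes numpy, and the values of `n`, `theta`, `t`, and the window width `eps` are arbitrary illustrative choices, not anything fixed by the problem:

```python
import numpy as np

# Check E[X_1 | X_(1) = t] = t + 1 - 1/n by simulation.
rng = np.random.default_rng(0)
n, theta, reps = 5, 2.0, 1_000_000  # arbitrary illustrative choices

# Shifted exponential draws: X = theta + Exp(1)
x = theta + rng.exponential(1.0, size=(reps, n))
x1, xmin = x[:, 0], x.min(axis=1)

# Approximate the measure-zero event {X_(1) = t} by a narrow window.
t, eps = 2.3, 0.01
mask = (xmin >= t) & (xmin < t + eps)
print("simulated:", x1[mask].mean())   # should be close to the formula
print("formula:  ", t + 1 - 1/n)       # t + 1 - 1/n = 3.1 here
```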

There are also other indirect ways of finding this conditional expectation (compare with "Computing the UMVUE for Uniform$(0,\theta)$"). For more formal derivations, have a look at "Finding an efficient estimator for $\theta$ in $U[0, \theta]$ in terms of the sample maximum" and "Finding $E\left(|X_1| \mid \max|X_i|\right)$" for similar discussions on a $\text{Uniform}(0,\theta)$ model.

Edit:

In case it is not clear, the law of total expectation is used in the following way:

$$E\left[X_1\mid X_{(1)}=t\right]=E\left[X_1\mid X_{(1)}=t,A \right]P(A\mid X_{(1)}=t)+E\left[X_1\mid X_{(1)}=t,A^c\right]P(A^c\mid X_{(1)}=t)\,,$$ where $A$ is the event $\{X_1=X_{(1)}\}$.
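Each piece is then easy to evaluate: on $A$ we have $X_1=X_{(1)}=t$; $P(A\mid X_{(1)}=t)=\frac1n$ because the continuous iid coordinates are exchangeable; and given $A^c$ and $X_{(1)}=t$, the coordinate $X_1$ has the law of $X_1$ given $X_1>t$. Hence

$$E\left[X_1\mid X_{(1)}=t\right]=t\cdot\frac1n+(t+1)\left(1-\frac1n\right)=t+1-\frac1n\,,$$

recovering the computation above.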


As an aside, here $X_1-X_{(1)}$ happens to be independent of $X_{(1)}$, which can be argued using Basu's theorem. This immediately gives $$E\left[X_1\mid X_{(1)}\right]=E\left[X_1-X_{(1)}\mid X_{(1)}\right]+X_{(1)}=E\left[X_1-X_{(1)}\right]+X_{(1)}\,,$$ and all that remains is to find $E\left[X_1\right]$ and $E\left[X_{(1)}\right]$.
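Both remaining expectations are immediate here: $X_1-\theta\sim\text{Exp}(1)$, and $X_{(1)}-\theta$, being the minimum of $n$ iid standard exponentials, is $\text{Exp}(n)$. Therefore

$$E\left[X_1\right]=\theta+1\,,\qquad E\left[X_{(1)}\right]=\theta+\frac1n\,,\qquad E\left[X_1\mid X_{(1)}\right]=X_{(1)}+1-\frac1n\,,$$

in agreement with the direct computation.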

StubbornAtom
  • Could you please explain how we know $P[X_1 = X_{(1)}] = \frac{1}{n}$? Is it because the $X_i$ are IID and thus each one has an equal chance of being the smallest? – Bastiza Sep 14 '24 at 22:53
  • It is because $X_i$'s are iid and continuous. – StubbornAtom Sep 14 '24 at 22:57
  • I have $P[X_1 = X_{(1)}] = P[X_2 > X_1 , \dots , X_n > X_1] = \int_\theta^\infty P[X_2 > x , \dots , X_n > x] dx$, but the computation is giving me $\frac{1}{n-1}$... not sure where I am going wrong – Bastiza Sep 14 '24 at 23:02
  • @Bastiza You could say $P[X_1 = X_{(1)}] = \int_\theta^\infty P[X_2 > x , \dots , X_n > x]\, f(x)\, dx = \int_\theta^\infty \left(e^{-(x-\theta)}\right)^{n-1} e^{-(x-\theta)}\, dx = \int_\theta^\infty e^{-n(x-\theta)}\, dx = \frac1n$, though the exchangeability argument may be faster (see also the simulation sketch after these comments). – Henry Sep 14 '24 at 23:09
  • @Henry Makes sense, I forgot the $f(x)$ term, I guess that is the "$P[X_1=x]$" part – Bastiza Sep 14 '24 at 23:14
  • The formula $$E\left[X_1\mid X_{(1)}=t\right]=E\left[X_1\mid X_1=t\right]\cdot\frac1n + E\left[X_1\mid X_1>t\right]\cdot\left(1-\frac1n\right)$$ is difficult to understand; can it be demonstrated? – Speltzu Sep 15 '24 at 10:25
  • @Speltzu Conditioning on the minimum $X_{(1)}=t$, either $X_1=X_{(1)}=t$ with probability $\frac1n$ or $X_1>X_{(1)}=t$ with probability $1-\frac1n$. – Henry Sep 15 '24 at 17:45
  • I still don't understand. What formula or theorem applies? – Speltzu Sep 15 '24 at 20:00
  • Right, but then we will have to prove that $E[X_1\mid X_{(1)}=t, A]=E[X_1\mid X_1=t]$ and $E[X_1\mid X_{(1)}=t, A^c]=E[X_1\mid X_1>t]$. – Speltzu Sep 16 '24 at 21:23
  • @Speltzu $E[X_1\mid X_{(1)}=t,A]=E[X_1\mid X_1=t,X_{(1)}=t]=E[X_1\mid X_1=t]$. You need to ask a separate question if your query persists. – StubbornAtom Sep 16 '24 at 22:18
  • Then $E[X_1\mid X_{(1)}=t,X_2= X_{(1)}]=E[X_1\mid X_2=t,X_{(1)}=t] = E[X_1\mid X_2=t]$? – Speltzu Sep 17 '24 at 06:53
  • You are right: $P(X_1-X_{(1)}\in A\mid X_{(1)})=\frac{1}{n}I_A(0)+\frac{n-1}{n}\int_A e^{-x}dx$ – Speltzu Sep 18 '24 at 09:13
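Following up on the comments above, a second minimal simulation sketch (again assuming numpy, with arbitrary illustrative values of `n` and `theta`) checks both $P\left(X_1=X_{(1)}\right)=\frac1n$ and $E\left[X_1-X_{(1)}\right]=1-\frac1n$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, theta, reps = 5, 2.0, 1_000_000  # arbitrary illustrative choices

x = theta + rng.exponential(1.0, size=(reps, n))
xmin = x.min(axis=1)

# With continuous iid coordinates there are no ties (a.s.), so exact
# equality with the row minimum identifies the event {X_1 = X_(1)}.
print("P(X1 = min):", np.mean(x[:, 0] == xmin), "vs 1/n =", 1/n)
print("E[X1 - min]:", np.mean(x[:, 0] - xmin), "vs 1 - 1/n =", 1 - 1/n)
```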