
From Casella & Berger, exercise 5.65: let $X \sim f$. Suppose we generate $m$ i.i.d. random variables $Y_1, \dots, Y_m$ from another distribution $g$.

Define

$$q_i = \frac{\frac{f(Y_i)}{g(Y_i)}}{\sum_{j = 1}^{m}\frac{f(Y_j)}{g(Y_j)}}$$

Now we generate random variables $X^\star$ from the discrete distribution with $P(X^\star = Y_i) = q_i$. This technique seems to be called "Sampling/Importance Resampling (SIR)" or the "weighted bootstrap".
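For concreteness, here is a minimal Python sketch of the procedure as I understand it (the standard normal target $f$ and Student-$t$ proposal $g$ are arbitrary choices, purely for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
m = 10_000

# Proposal g: Student-t with 3 df (arbitrary choice); target f: standard normal.
Y = stats.t(df=3).rvs(size=m, random_state=rng)
w = stats.norm.pdf(Y) / stats.t(df=3).pdf(Y)   # f(Y_i) / g(Y_i)
q = w / w.sum()                                # normalized weights q_i

# Resample: X* takes the value Y_i with probability q_i.
X_star = rng.choice(Y, size=m, replace=True, p=q)
```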

I need to show that $X_1^\star, X_2^\star, \dots, X_r^\star, \dots$ are approximately a sample from $f$.

The hint given in the textbook says: "Show that $P(X^\star \leq x) = \sum_{i = 1}^m q_iI(Y_i \leq x)$. From there use the WLLN."

The hint itself seems wrong: we should not have indicator random variables $I(Y_i \leq x)$ in an expression for $P(X^\star \leq x)$.

The solution manual I found online looks wrong to me as well. Here is my own reasoning:

$$P(X^\star \leq x) = \sum_{i = 1}^m P(X^\star \leq x | X^\star = Y_i)P(X^\star = Y_i) = \sum_{i = 1}^m P(Y_i \leq x)q_i = P(Y_i \leq x)\sum_{i = 1}^m q_i$$

This looks nothing like the required equality in the hint. I arrive at the conclusion that $F_{X^\star}(x) = G_Y(x)$, which just takes us back to the definition of $G$...

How can this be shown, preferably using the WLLN?

John
    The hint is indeed strange, because $P(X^\star \le x)$ is a real number whereas $\sum_{i=1}^m q_i I(Y_i \le x)$ is a random variable... – jII Jul 23 '21 at 22:06
  • The solution manual on the internet is correct if you define $P(X^\star \le x)$ to instead be the Monte Carlo approximation of $\Pr[Y \le x]$ for $Y \sim f$. – jII Jul 24 '21 at 12:45
  • @jII, which one are you using? Can you give a link? – John Jul 24 '21 at 12:57
  • There is an issue in your reasoning, in particular the final equality: you cannot extract $P(Y_i \le x)$ from the sum, since it is indexed by $i$. – jII Jul 24 '21 at 13:09
  • @jII, all the $Y_i$'s come from the same distribution $g(y)$ and are independent, so this probability should be the same. – John Jul 24 '21 at 19:36

1 Answer


From the basic properties of importance sampling, we have

$$ \mathbb{E}_f\left[ I(Y \le x) \right] = \mathbb{E}_g\left[ \frac{f(Y)}{g(Y)} I(Y \le x) \right] = \frac{\mathbb{E}_g\left[ \frac{f(Y)}{g(Y)} I(Y \le x) \right]}{\mathbb{E}_g\left[ \frac{f(Y)}{g(Y)} \right]}, $$

where the first expression is the CDF of the target distribution $f$ evaluated at $x$. The first equality is the standard importance sampling identity, and the final equality follows from the fact that the denominator satisfies $\mathbb{E}_g\left[ \frac{f(Y)}{g(Y)} \right] = 1$. (Dividing by this denominator is what "auto-normalized" importance sampling does; it is typically, but not necessarily, used when the weight $f(Y)/g(Y)$ can only be computed up to a normalizing constant.)
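As a quick numerical sanity check on this identity (a minimal sketch; the standard normal target and Student-$t$ proposal are arbitrary illustrative choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
m, x = 100_000, 0.5

# Draw from the proposal g and form the importance weights f(Y)/g(Y).
Y = stats.t(df=3).rvs(size=m, random_state=rng)
w = stats.norm.pdf(Y) / stats.t(df=3).pdf(Y)

lhs = stats.norm.cdf(x)          # E_f[I(Y <= x)], available in closed form here
rhs = np.mean(w * (Y <= x))      # Monte Carlo estimate of E_g[(f/g) I(Y <= x)]
print(lhs, rhs)                  # the two should agree up to Monte Carlo error
```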

Now, suppose that $Y_1, \dots, Y_m$ are i.i.d. from $g$, and let $X^\star$ take the value $Y_i$ with probability $q_i := \displaystyle \frac{f(Y_i)/g(Y_i)}{\sum_{j=1}^mf(Y_j)/g(Y_j)}$. Consider the random variable

$$ \begin{aligned} \mathbb{E}\left[ I(X^\star \le x)\mid Y_{1:m} \right] &= \sum_{i=1}^{m}\mathbb{E}\left[ I(X^\star \le x)\mid X^\star=Y_i,\, Y_{1:m} \right] \cdot P(X^\star=Y_i \mid Y_{1:m}) \\ &= \sum_{i=1}^{m}I(Y_i \le x)\,q_i \\ &= \sum_{i=1}^{m} \frac{f(Y_i)/g(Y_i)}{\sum_{j=1}^m f(Y_j)/g(Y_j)}\,I(Y_i \le x) \\ &= \frac{ \frac{1}{m}\sum_{i=1}^m [f(Y_i)/g(Y_i)]\, I(Y_i\le x)}{ \frac{1}{m}\sum_{j=1}^m f(Y_j)/g(Y_j)} \\ & \overset{m \to \infty}{\longrightarrow} \frac{\mathbb{E}_g\left[ \frac{f(Y)}{g(Y)} I(Y \le x) \right]}{\mathbb{E}_g\left[ \frac{f(Y)}{g(Y)} \right]} \qquad \mbox{(WLLN)} \\ &= \mathbb{E}_f\left[ I(Y \le x) \right]. \end{aligned} $$

Thus, we have shown that the sequence of random variables $\{ \mathbb{E}\left[ I(X^\star \le x)\mid Y_{1:m} \right] \}_{m=1}^{\infty}$ converges in probability to the desired probability of $Y \le x$ under the target distribution $f$. (The WLLN gives convergence in probability of the numerator and the denominator separately; convergence of the ratio then follows from the continuous mapping theorem.) Note that the estimator is biased but consistent. Further, the argument above generalizes to the expectation of any integrable function under $f$, not only $I(Y \le x)$.
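To see the consistency numerically, one can compare the self-normalized estimate $\sum_{i=1}^m q_i I(Y_i \le x)$ to $F_f(x)$ for increasing $m$ (again a sketch, with an arbitrary standard normal target and Student-$t$ proposal):

```python
import numpy as np
from scipy import stats

x = 0.5
for m in (100, 1_000, 10_000, 100_000):
    rng = np.random.default_rng(2)
    Y = stats.t(df=3).rvs(size=m, random_state=rng)
    w = stats.norm.pdf(Y) / stats.t(df=3).pdf(Y)
    q = w / w.sum()
    est = np.sum(q * (Y <= x))           # E[I(X* <= x) | Y_{1:m}]
    print(m, est, stats.norm.cdf(x))     # estimate approaches F_f(x) as m grows
```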

jII