Intuition behind rejection sampling proof

Question

I have a quick question about the proof of rejection sampling.

Suppose we know how to sample from a distribution $Y$ with pdf $q$, and want to sample from a distribution $X$ with (known) pdf $\pi$. Suppose we also know that $\pi(x) \leq M q(x)$ for some constant $M>0$ and for all $x$. Then rejection sampling works as follows: if we first sample $x$ from $Y$, and then accept this value only with probability $\pi(x)/(Mq(x))$, then the accepted outputs have distribution $\pi$, as required.
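To make the setup concrete, here is a minimal sketch of the procedure. The specific choices are assumptions for illustration only and are not part of the question: the target $\pi$ is taken to be the Beta(2,2) density $6x(1-x)$ on $[0,1]$, the proposal $q$ is uniform on $[0,1]$, and $M = 1.5$ (the maximum of $\pi/q$). The function names are hypothetical.

```python
import random

def target_pdf(x):
    # Assumed target pi: Beta(2, 2) density on [0, 1] (illustrative choice).
    return 6.0 * x * (1.0 - x)

def proposal_pdf(x):
    # Assumed proposal q: uniform density on [0, 1].
    return 1.0

M = 1.5  # pi(x) <= M * q(x) for all x in [0, 1]; the maximum of pi is 1.5 at x = 0.5

def rejection_sample(rng):
    """Repeat independent tries until one is accepted.

    Returns the accepted value and the number of tries it took.
    """
    tries = 0
    while True:
        tries += 1
        y = rng.random()  # draw y from the proposal Y
        accept_prob = target_pdf(y) / (M * proposal_pdf(y))  # always in [0, 1]
        if rng.random() < accept_prob:
            return y, tries

rng = random.Random(0)
results = [rejection_sample(rng) for _ in range(50_000)]
samples = [value for value, _ in results]

sample_mean = sum(samples) / len(samples)              # Beta(2, 2) has mean 0.5
mean_tries = sum(t for _, t in results) / len(results)  # should be close to M

print(f"sample mean: {sample_mean:.3f}")
print(f"average tries per output: {mean_tries:.3f}")
```

Note that the loop only ever returns accepted values, so the output $J$ is defined only after an acceptance has occurred; under these assumptions the average number of tries per output should come out close to $M$.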

Now the standard proof of this is trivial; it amounts to showing that if the output of our sampler has distribution $J$, then $J$ and $X$ have the same distribution, i.e. $P(J \in A) = P(X \in A)$ for every Borel measurable set $A$. This is shown by noting that $P(J=x)=P(Y=x | \text{Value accepted})$, and then using Bayes' law to get $P(Y=x | \text{Value accepted})=\pi(x)$.

I want to clarify why $P(J=x)=P(Y=x | \text{Value accepted})$ is obvious. In particular, why can we not say that $P(J=x)=P(Y=x \text{ and Value accepted})$? Depending on the "order" in which you think of the two events (sampling from $Y$, then accepting or rejecting) as occurring, I don't see why this is obvious. One way of thinking about it (i.e. first consider acceptance/rejection, then the output) makes this seem obvious. However, if we consider the situation in the order of first looking at the output from $Y$, and then whether we accept or reject it, why does this not give $P(J=x)=P(Y=x \text{ and Value accepted})$? I want to get an intuitive understanding of the situation. Thanks.

Note that $P(J=x)=P(J=x|\text{Value accepted})$ since there is no output unless the value is accepted. — A.S., Dec 10 '15 at 14:20
Yes, that's one way of thinking about it (i.e. first consider acceptance/rejection, then the output). However, if we consider the situation in the order of first looking at the output, and then whether we accept or reject it, why does this not give $P(J=x)=P(Y=x \text{ and Value accepted})$? — guest, Dec 10 '15 at 14:22
For a single try, $P(J=x)=P(Y=x,\text{accepted})$ is correct, but you also get a big fat $P(J\text{ undefined})=1-P(\text{accepted})$. Since you are running the algorithm until acceptance happens, you sum a geometric series, which results in $P(J=x)=P(Y=x|\text{accepted})$. — A.S., Dec 10 '15 at 14:26
Perhaps see http://math.stackexchange.com/questions/1635250/ for more on acc-rej method. — BruceET, Feb 01 '16 at 17:35
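The geometric-series argument in A.S.'s comment can be written out as follows. This is a sketch under the assumptions that the tries are independent and identically distributed and that $\pi$ and $q$ are normalized densities; the symbol $p$ for the single-try acceptance probability is introduced here for convenience. For a Borel set $A$, a single try gives

$$
P(Y \in A,\ \text{accepted}) = \int_A q(y)\,\frac{\pi(y)}{M q(y)}\,dy = \frac{1}{M}\int_A \pi(y)\,dy,
\qquad
p = P(\text{accepted}) = \frac{1}{M}.
$$

The output $J$ lands in $A$ exactly when the first $n-1$ tries are rejected and the $n$-th try is accepted with a value in $A$, for some $n \geq 1$. Summing over $n$:

$$
P(J \in A) = \sum_{n=1}^{\infty} (1-p)^{n-1}\, P(Y \in A,\ \text{accepted})
= \frac{P(Y \in A,\ \text{accepted})}{p}
= P(Y \in A \mid \text{accepted})
= \int_A \pi(y)\,dy.
$$

So the joint probability is the right answer for one try, and dividing by $p$ (which is what conditioning on acceptance does) is exactly the effect of retrying until something is accepted.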
