
Let $Y=g(X)$ be a nonlinear transformation of some continuous random variable $X$. Assume $Y$ does not have any well-defined moments, e.g. $Y=1/X$ with $X\sim\mathcal N(\mu,1)$ and $\mu\neq 0$. If we expand $Y$ as a Taylor polynomial of order one about $\mu_X$, we obtain a new random variable $$ Y^\ast= g(\mu_X)+g^\prime(\mu_X)(X-\mu_X). $$ Now, if $X$ falls with high probability in a sufficiently small neighborhood centered on $\mu_X$, then we can conclude that $Y\approx Y^\ast$ in the sense of distribution, that is, the distributions of $Y$ and $Y^\ast$ will be similar.
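To make the claim concrete, here is a minimal simulation sketch (the constants $\mu_X=4$, $\sigma=0.05$ are arbitrary choices) comparing empirical quantiles of $Y=1/X$ with those of its linearization $Y^\ast = 1/\mu_X - (X-\mu_X)/\mu_X^2$:

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 4.0, 0.05      # X concentrated near mu, well away from 0
n = 100_000
x = rng.normal(mu, sigma, n)

y = 1.0 / x                              # Y = g(X) with g(x) = 1/x
y_star = 1.0 / mu - (x - mu) / mu**2     # first-order Taylor expansion about mu

# Compare a few empirical quantiles of Y and Y*
qs = np.quantile(y, [0.1, 0.5, 0.9])
qs_star = np.quantile(y_star, [0.1, 0.5, 0.9])
print(np.max(np.abs(qs - qs_star)))      # small when sigma is small
```

Shrinking `sigma` further makes the quantile gap smaller still, while enlarging it (so that $X$ gets appreciable mass near $0$) makes the two distributions visibly disagree.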

What I am looking for is a formal proof of this fact. Some sort of limit theorem. I would think such a limit theorem would be very closely related to the delta method.

For example, it is easy to see (this is probably an abuse of notation, don't shoot me) that if $\mathsf{Var}\,X\to0$, then $X\overset{d}{\to}\mu_X$ and we have $Y\overset{d}{\to} Y^\ast\overset{d}{\to}g(\mu_X)$ so long as $g(\mu_X)$ exists. But how would we formally state this?
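The degenerate limit above can also be checked empirically: $P(|Y-g(\mu_X)|>\varepsilon)$ shrinks as $\mathsf{Var}\,X\to0$. A sketch with $g(x)=1/x$ (the constants $\mu=2$, $\varepsilon=0.05$, and the grid of $\sigma$'s are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, eps, n = 2.0, 0.05, 50_000

def g(x):
    return 1.0 / x

probs = []
for sigma in [0.5, 0.1, 0.02]:
    x = rng.normal(mu, sigma, n)
    # empirical P(|Y - g(mu)| > eps), which should vanish as sigma -> 0
    probs.append(np.mean(np.abs(g(x) - g(mu)) > eps))

print(probs)   # decreasing toward 0
```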

  • Would you mind clarifying how this differs from the proof of the delta method? [Not saying that they are the same, I'm just trying to understand.] – angryavian Feb 14 '22 at 19:28
  • @angryavian That's a great question...a question I was expecting to be asked. In short, I'm not sure how it's different. I think the difference here is that we don't necessarily have convergence to a normal distribution. Imagine if $X$ is not normal. Then $Y^\ast$ is also non-normal. Yet, if $\mathsf{Var}\,X$ is small enough we see that the distributions of $Y$ and $Y^\ast$ are very similar but also non-normal. – Aaron Hendrickson Feb 14 '22 at 19:36

1 Answer


Some thoughts which may or may not be what you're looking for.


In the proof of the delta method, one relies on the fact that $X_n \overset{p}{\to} \mu$. (This is implied by the delta method's condition $\sqrt{n}(X_n - \mu) \overset{d}{\to} Z$ for some distribution $Z$. Separately, note that this is also implied by your condition $\text{Var}(X_n) \to 0$ via Chebyshev's inequality.)


Suppose that $X_n \overset{p}{\to} \mu$, and that $g$ is differentiable.

By the mean value theorem, $Y_n := g(X_n) = g(\mu) + g'(\xi_{X_n})(X_n-\mu)$ where $\xi_{X_n}$ lies between $X_n$ and $\mu$.

Thus, $$Y_n - Y_n^* = g(X_n) - (g(\mu) + g'(\mu)(X_n - \mu)) = (g'(\xi_{X_n}) - g'(\mu))(X_n - \mu).$$

If $g'$ is continuous, then $g'(\xi_{X_n}) \overset{p}{\to} g'(\mu)$ by virtue of $X_n \overset{p}{\to} \mu$. This then implies $Y_n - Y^*_n \overset{p}{\to} 0$.
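This convergence $Y_n - Y_n^* \overset{p}{\to} 0$ is easy to check numerically for the running example $g(x)=1/x$. A sketch (the values of $\mu$, $\varepsilon$, and the grid of $\sigma$'s are arbitrary; shrinking $\sigma$ plays the role of $n\to\infty$):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, eps, n = 2.0, 1e-3, 50_000

def remainder_prob(sigma):
    """Empirical P(|Y_n - Y_n^*| > eps) for g(x) = 1/x, X ~ N(mu, sigma^2)."""
    x = rng.normal(mu, sigma, n)
    y = 1.0 / x                              # Y_n = g(X_n)
    y_star = 1.0 / mu - (x - mu) / mu**2     # Y_n^* (first-order expansion)
    return np.mean(np.abs(y - y_star) > eps)

ps = [remainder_prob(s) for s in (0.2, 0.05, 0.01)]
print(ps)   # decreasing toward 0 as sigma shrinks
```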

– angryavian