Getting $p_y(y) = p_x(g^{-1}(y)) \left| \frac{\partial{x}}{\partial{y}} \right|$ by solving $| p_y(g(x)) \ dy | = | p_x (x) \ dx |$?

Question

My textbook has a very brief section that introduces some concepts from measure theory:

Another technical detail of continuous variables relates to handling continuous random variables that are deterministic functions of one another. Suppose we have two random variables, $\mathbf{x}$ and $\mathbf{y}$, such that $\mathbf{y} = g(\mathbf{x})$, where $g$ is an invertible, continuous, differentiable transformation. One might expect that $p_y(\mathbf{y}) = p_x(g^{-1} (\mathbf{y}))$. This is actually not the case.

As a simple example, suppose we have scalar random variables $x$ and $y$. Suppose $y = \dfrac{x}{2}$ and $x \sim U(0,1)$. If we use the rule $p_y(y) = p_x(2y)$, then $p_y$ will be $0$ everywhere except the interval $\left[ 0, \dfrac{1}{2} \right]$, and it will be $1$ on this interval. This means

$$\int p_y(y) \ dy = \dfrac{1}{2},$$

which violates the definition of a probability distribution. This is a common mistake. The problem with this approach is that it fails to account for the distortion of space introduced by the function $g$. Recall that the probability of $\mathbf{x}$ lying in an infinitesimally small region with volume $\delta \mathbf{x}$ is given by $p(\mathbf{x}) \delta \mathbf{x}$. Since $g$ can expand or contract space, the infinitesimal volume surrounding $\mathbf{x}$ in $\mathbf{x}$ space may have different volume in $\mathbf{y}$ space.

To see how to correct the problem, we return to the scalar case. We need to preserve the property

$$| p_y(g(x)) \ dy | = | p_x (x) \ dx |$$

Solving from this, we obtain

$$p_y(y) = p_x(g^{-1}(y)) \left| \dfrac{\partial{x}}{\partial{y}} \right|$$

or equivalently

$$p_x(x) = p_y(g(x)) \left| \dfrac{\partial{g(x)}}{\partial{x}} \right|$$

How do they get $p_y(y) = p_x(g^{-1}(y)) \left| \dfrac{\partial{x}}{\partial{y}} \right|$ or equivalently $p_x(x) = p_y(g(x)) \left| \dfrac{\partial{g(x)}}{\partial{x}} \right|$ by solving $| p_y(g(x)) \ dy | = | p_x (x) \ dx |$?

Can someone please demonstrate this and explain the steps?

this is chain rule + an integration, i.e. integration by substitution — Calvin Khor, Sep 08 '18 at 08:37
Have you seen e.g. this before? https://proofwiki.org/wiki/Integration_by_Substitution — Calvin Khor, Sep 08 '18 at 09:44
Not at a computer but $\int_A p_X(x)dx= P(X \in A) = P(Y \in g(A)) = \int_{g(A)} p_Y(y)dy$ then apply substitution $y=g(x)$ and conclude with the arbitrariness of $A$. — Calvin Khor, Sep 08 '18 at 13:32
I'm not sure if that helps or you want me to put it in the notation you have — , Sep 13 '18 at 15:10
@RHowe I haven't read it yet, but It looks good! Thank you for taking the time to post it. I will read it soon, don't worry! :) — , Sep 13 '18 at 15:11
This has already been answered here, but in the 1-D case: https://math.stackexchange.com/questions/2591703/understanding-why-f-psi-psi-f-thetag-1-psi-left-fracdd-psig/ — Clarinetist, Sep 14 '18 at 12:19

score 3 · Accepted Answer · edited Sep 21 '18 at 16:14

$p_X(x)dx$ represents the probability measure $\mathbb{P}_X$, which is the probability distribution of the random variable $X$. It is defined by its action on measurable positive functions $f$ by

$$\mathbb{E}(f(X))=\int_{\Omega}f(X)d\mathbb{P}=\int_{\mathbb{R}}f(x)d\mathbb{P}_X(x)=\int_{\mathbb{R}}f(x)p_X(x)dx.$$

Now we consider a new random variable $Y=g(X)$ (with some conditions on $g$), and we seek $p_Y$, the probability density of $Y$. So we calculate, for an arbitrary measurable positive function $f$, the expectation $\mathbb{E}(f(Y))$ in two ways. First,

$$\mathbb{E}(f(Y))=\int_{\mathbb{R}}f(y)\color{red}{p_Y(y)dy}\tag1$$

Second, using the change of variables $y=g(x)$, that is $x=g^{-1}(y)$,

$$\eqalignno{\mathbb{E}(f(Y))&=\mathbb{E}(f(g(X)))\cr &=\int_{\mathbb{R}}f(g(x))p_X(x)dx\qquad\text{now a change of variables}\cr &=\int_{\mathbb{R}}f(y)\color{red}{p_X(g^{-1}(y))\left|\frac{dx}{dy}\right|dy}&(2) }$$

Now, because $f$ is arbitrary, comparing (1) and (2) we get

$$p_Y(y)=p_X(x)\left|\frac{dx}{dy}\right|, \quad\text{where $y=g(x)$.}$$

Or, better,

$$p_Y(y)=p_X(g^{-1}(y))\left|\frac{1}{g'(g^{-1}(y))}\right|\iff p_Y(g(x))|g'(x)|=p_X(x).$$
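The textbook's infinitesimal identity can also be "solved" directly. The following is only a heuristic sketch, assuming $g$ is differentiable and strictly monotone, so that $x = g^{-1}(y)$ is well defined and $dy = g'(x)\,dx$ with $g'(x) \neq 0$. Treating $dx$ and $dy$ as small nonzero increments is an informal device here, not a rigorous argument:

$$\begin{aligned} | p_y(g(x)) \ dy | &= | p_x (x) \ dx | \\ p_y(y) \, |dy| &= p_x(g^{-1}(y)) \, |dx| && \text{densities are nonnegative, and } y = g(x) \\ p_y(y) &= p_x(g^{-1}(y)) \left| \dfrac{dx}{dy} \right| && \text{divide both sides by } |dy| \end{aligned}$$

Dividing by $|dx|$ instead of $|dy|$ in the second line gives the other form, $p_x(x) = p_y(g(x)) \left| \dfrac{dy}{dx} \right| = p_y(g(x)) \, |g'(x)|$.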

I'm curious, why do you choose to represent it as $p_Y(y)=p_X(g^{-1}(y))\left|\frac{1}{g'(g^{-1}(y))}\right|$? It seems it would be simpler (and equivalent) to write $p_Y(y)=p_X(g^{-1}(y))\left|\frac{d}{dy}g^{-1}(y)\right|$. (Is it in order to use the derivative of $g$ rather than its inverse? To me it just seems a little odd, since $g'$ is a derivative with respect to $x$, so you're hiding an $x$ in an equation where $y$ is the variable... ) — postylem, Feb 21 '23 at 00:30

score 1 · Answer 2 · answered Sep 12 '18 at 15:49

This is called the method of transformations. It is detailed on this site (https://www.probabilitycourse.com/chapter4/4_1_3_functions_continuous_var.php). The density of a function of a random variable has to be rescaled so that its CDF still reaches $1$. A demonstration follows.

Suppose that $X \sim \textrm{Unif}(0,1)$ and let $Y = e^{X}$

Note that cdf of $X$ is given by

$$ F_{X}(x) = \begin{cases} 0 & x < 0 \\ x & 0 \leq x \leq 1 \\ 1 & x > 1 \end{cases} \tag{1}$$

Then to find the cdf of $Y$

$$\begin{aligned} F_{Y}(y) &= P(Y \leq y) \\ &= P(e^{X} \leq y) \\ &= P(X \leq \ln(y)) \\ &= F_{X}(\ln(y)) = \ln(y) \qquad \text{for } 1 \leq y \leq e \end{aligned} \tag{2}$$

$$ F_{Y}(y) = \begin{cases} 0 & y < 1 \\ \ln(y) & 1 \leq y \leq e \\ 1 & y > e \end{cases} \tag{3}$$

To obtain the pdf we take the derivative

$$f_{Y}(y) = F_{Y}'(y) = \begin{cases} \frac{1}{y} & 1 \leq y \leq e \\ 0 & \textrm{otherwise} \end{cases} \tag{4}$$
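As a sanity check on this example, here is a minimal numerical sketch using only the Python standard library. The helper names (`pdf_x`, `pdf_y`, `integrate`) are made up for illustration, and the midpoint rule is just one convenient way to approximate the integrals. It builds the density of $Y = e^{X}$ from the change-of-variables formula and checks that it integrates to about $1$ on $[1, e]$ and reproduces $F_Y(2) = \ln 2$.

```python
import math

def pdf_x(x):
    """Density of X ~ Unif(0, 1)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def pdf_y(y):
    """Density of Y = exp(X): p_X(g^{-1}(y)) * |d g^{-1}(y) / dy| with g^{-1}(y) = ln(y)."""
    if y <= 0.0:
        return 0.0  # ln(y) is undefined here, and Y = exp(X) is always positive
    return pdf_x(math.log(y)) * abs(1.0 / y)

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

total = integrate(pdf_y, 1.0, math.e)   # should be close to 1
cdf_at_2 = integrate(pdf_y, 1.0, 2.0)   # should be close to ln(2)

print("integral of p_Y over [1, e]:", total)
print("F_Y(2) numerically:", cdf_at_2, " ln(2):", math.log(2.0))
```

Dropping the factor `abs(1.0 / y)` from `pdf_y` makes the first integral come out as $e - 1$ instead of $1$, which is the same kind of failure the textbook describes.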

Concerning the problem above, suppose that $ X \sim \textrm{Unif}(0,1)$ and that $ Y = \frac{X}{2}$

The CDF for $X$ is the same as above. Let's look at the cdf of $Y$. We note that $R_{X} =[0,1]$ so then $ R_{Y}=[0,\frac{1}{2}]$

$$\begin{aligned} F_{Y}(y) &= P(Y \leq y) \\ &= P\left(\frac{X}{2} \leq y\right) \\ &= P(X \leq 2y) \\ &= F_{X}(2y) = 2y \qquad \text{for } 0 \leq y \leq \frac{1}{2} \end{aligned} \tag{5}$$

The factor $2$ is simply the reciprocal of the scaling factor $\frac{1}{2}$. The full CDF is

$$ F_{Y}(y) = \begin{cases} 0 & y < 0 \\ 2y & 0 \leq y \leq \frac{1}{2} \\ 1 & y > \frac{1}{2} \end{cases} \tag{6}$$

To find the pdf, we differentiate:

$$f_{Y}(y) = F_{Y}'(y) = \begin{cases} 2 & 0 \leq y \leq \frac{1}{2} \\ 0 & \textrm{otherwise} \end{cases} \tag{7}$$
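The textbook's point about $Y = \frac{X}{2}$ can be checked numerically. This is a small standard-library sketch; the function names (`naive_pdf_y`, `corrected_pdf_y`, `integrate`) are hypothetical, chosen only for this illustration. It compares the naive rule $p_y(y) = p_x(2y)$ with the corrected rule that includes the factor $\left| \frac{dx}{dy} \right| = 2$.

```python
def pdf_x(x):
    """Density of X ~ Unif(0, 1)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def naive_pdf_y(y):
    """Naive rule p_y(y) = p_x(g^{-1}(y)) with g^{-1}(y) = 2y (missing the Jacobian factor)."""
    return pdf_x(2.0 * y)

def corrected_pdf_y(y):
    """Corrected rule p_y(y) = p_x(g^{-1}(y)) * |dx/dy|, where dx/dy = 2."""
    return pdf_x(2.0 * y) * abs(2.0)

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Integrate over an interval wider than the support [0, 1/2] of Y.
naive_total = integrate(naive_pdf_y, -1.0, 1.0)
corrected_total = integrate(corrected_pdf_y, -1.0, 1.0)

print("naive rule integrates to:", naive_total)          # about 0.5
print("corrected rule integrates to:", corrected_total)  # about 1.0
```

The naive density loses half of the probability mass, while the corrected one agrees with the pdf in (7).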

Visually, the difference between the two uniform distributions can be seen below.

$$ X\sim \textrm{Unif}(0,1) \tag{8} $$

$$ X\sim \textrm{Unif}(0,1) , Y = \frac{X}{2} , Y \sim \textrm{Unif}(0,\frac{1}{2}) \tag{9} $$

It is not clear to me how this part, equation (3), makes sense: $F_{Y}(y) = \begin{cases} 0 & y < 1 \\ \ln(y) & 1 \leq y \leq e \\ 1 & y > e \end{cases}$ – Sep 15 '18 at 03:39
Why is $F_Y(y) = 0$ when $y < 1$? Same with all of the other values. – Sep 15 '18 at 03:39
I'll give you the bounty, since it's running out, but I'll wait before I accept it as the answer. – Sep 15 '18 at 03:40
I may have written something wrong. I can fix it. Let me look back over it. – Sep 15 '18 at 03:41
It's probably just my misunderstanding, since you're more experienced. – Sep 15 '18 at 03:51
Ok, if we had originally taken the CDF of $Y=e^{X}$, the pdf of $X$ is simply $1$, so we would have had $e$, right? In order to make the new variable have a CDF of $1$ we need to take $\ln(y)$; the maximal value is $e$ – Sep 15 '18 at 03:51
we are taking it from $[1,e]$ to $[0,1]$ – Sep 15 '18 at 03:52
I chose this problem from the site. There is a link at the top. I may be misinterpreting that. see on line a. https://www.probabilitycourse.com/chapter4/4_1_3_functions_continuous_var.php – Sep 15 '18 at 03:56
Sorry, you're right. I was misinterpreting the domain of the uniform distribution. I will continue studying your answer now. – Sep 15 '18 at 03:58
Ok, I finished reading and understood everything. Thank you for the post, it was enlightening. The problem is, you addressed a different aspect of the passage than what I was asking: I asked how the authors got $p_y(y) = p_x(g^{-1}(y)) \left| \dfrac{\partial{x}}{\partial{y}} \right|$, or equivalently $p_x(x) = p_y(g(x)) \left| \dfrac{\partial{g(x)}}{\partial{x}} \right|$, by solving $| p_y(g(x)) \ dy | = | p_x (x) \ dx |$. This is what I wanted demonstrated. – Sep 15 '18 at 04:32
But, as I said, your answer was still very enlightening with regards to a different aspect of the passage, so I think it was worth the bounty. I will start another for the other part. – Sep 15 '18 at 04:33
Alrighty then like you mean with regards to the notation? I will attempt to put it in that manner if I can. – Sep 15 '18 at 05:34
I'm not referring to notation -- unless I'm misunderstanding your answer. I want to know how the authors got $p_y(y) = p_x(g^{-1}(y)) \left| \dfrac{\partial{x}}{\partial{y}} \right|$, or equivalently $p_x(x) = p_y(g(x)) \left| \dfrac{\partial{g(x)}}{\partial{x}} \right|$, by solving $| p_y(g(x)) \ dy | = | p_x (x) \ dx |$. So how is $| p_y(g(x)) \ dy | = | p_x (x) \ dx |$ "solved" to get $p_y(y) = p_x(g^{-1}(y)) \left| \dfrac{\partial{x}}{\partial{y}} \right|$ or, equivalently, $p_x(x) = p_y(g(x)) \left| \dfrac{\partial{g(x)}}{\partial{x}} \right|$? – Sep 15 '18 at 09:04
