
I'm learning probability, specifically transformations of random variables, and need help to understand the solution to the following exercise:

Consider the continuous random variable $X$ with probability density function $$f(x) = \begin{cases} \frac{1}{3}x^2 \quad -1 \leq x \leq 2, \\ 0 \quad \quad \text{elsewhere}. \end{cases}$$ Find the cumulative distribution function of the random variable $Y = X^2$.

The author gives the following solution:

For $0 \leq y \leq 1: F_Y(y) = P(Y \leq y) = P(X^2 \leq y) \stackrel{?}{=} P(-\sqrt y \leq X \leq \sqrt y) = \int_{-\sqrt y}^{\sqrt y}\frac{1}{3}x^2\, dx = \frac{2}{9}y\sqrt y.$

For $1 \leq y \leq 4: F_Y(y) = P(Y \leq y) = P(X^2 \leq y) \stackrel{?}{=} P(-1 \leq X \leq \sqrt y) = \int_{-1}^{\sqrt y}\frac{1}{3}x^2\, dx = \frac{1}{9} + \frac{1}{9}y\sqrt y.$

For $y > 4: F_{Y}(y) = 1.$


Before this exercise, I managed to follow the solutions of two similar (obviously simpler) problems, one for a strictly increasing and one for a strictly decreasing function of $X$. In this problem, however, I don't understand the computations being done, specifically:

  • How are the three intervals $0 \leq y \leq 1$, $1 \leq y \leq 4$ and $y > 4$ determined? In the two previous problems I've encountered, we only considered one interval, which was identical to the interval where $f(x)$ was non-zero.
  • In the case where $0 \leq y \leq 1$, why does $P(X^2 \leq y) = P(-\sqrt y \leq X \leq \sqrt y)$ and not $P(X \leq \sqrt y)$? I have put question marks above the equalities that I don't understand.

I think I have not understood the theory well enough. I'm looking for an answer that will help me understand the solution to this problem and possibly make the theory clearer.

2 Answers


Let's start by seeing what the density function $f_X$ of $X$ tells us about the cumulative distribution function $F_X$ of $X$. Since $f_X(x) = 0$ for $-\infty < x < -1$, we see that $$F_X(x) = \int_{-\infty}^x f_X(t) \, dt \equiv 0 $$ in this range. Similarly, since $f_X(x) = 0$ in the range $2 < x < \infty$, we see that $$F_X(x) = \int_{-\infty}^x f_X(t) \, dt = \int_{-\infty}^{\infty} f_X(t) \, dt \equiv 1$$ in this range. In other words, the random variable is "supported on the interval $[-1,2]$" in the sense that $P(X \notin [-1,2]) = 0$.
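
As a quick sanity check, using nothing beyond the density given in the question, $F_X$ can also be written down inside the support. For $-1 \leq x \leq 2$,

$$F_X(x) = \int_{-1}^{x} \frac{1}{3}t^2 \, dt = \frac{x^3 + 1}{9},$$

so $F_X(-1) = 0$ and $F_X(2) = 1$, which agrees with the two constant pieces above.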

Now let us consider $Y = X^2$. This variable is clearly non-negative and since $X$ is supported on $[-1,2]$, we must have that $Y$ is supported on $[0, \max((-1)^2,2^2)] = [0,4]$. This is intuitively clear because the variable $X$ (with probability $1$) takes values in $[-1,2]$ and so $X^2$ takes values in $[0,\max((-1)^2,(2)^2)]$. So we only need to understand $F_Y(y)$ in the range $y \in [0,4]$. Now, for $y \geq 0$, we always have

$$ F_Y(y) = P(Y < y) = P(X^2 < y) = P(-\sqrt{y} < X < \sqrt{y}) = \int_{-\sqrt{y}}^{\sqrt{y}} f_X(t) \, dt $$

but since $f_X$ is defined piecewise, to proceed at this point we need to analyze several cases. We already know that $F_Y(y) = 0$ if $y \leq 0$ and $F_Y(y) = 1$ if $y \geq 4$.

If $0 \leq y \leq 1$ then $[-\sqrt{y},\sqrt{y}]$ is contained in $[-1,1]$ and on $[-1,1]$ the density function is $f_X(x) = \frac{1}{3}x^2$ so we can write

$$ F_Y(y) = \int_{-\sqrt{y}}^{\sqrt{y}} \frac{1}{3} t^2 \, dt. $$

However, if $1 < y \leq 4$ then $-\sqrt{y} < -1$ and so the interval of integration splits as $[-\sqrt{y}, -1] \cup [-1,\sqrt{y}]$. Over the left part $[-\sqrt{y},-1]$, the density function is zero, so the integral there is zero and we are left only with calculating the integral over the right part:

$$ F_Y(y) = \int_{-\sqrt{y}}^{-1} f_X(t) \, dt + \int_{-1}^{\sqrt{y}} f_X(t) \, dt = \int_{-1}^{\sqrt{y}} \frac{1}{3}t^2 \, dt. $$
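
Carrying out the two integrals gives the following (a worked sketch; the results match the author's solution quoted in the question):

$$ F_Y(y) = \begin{cases} 0 & \text{if } y < 0, \\ \left[\frac{t^3}{9}\right]_{-\sqrt{y}}^{\sqrt{y}} = \frac{2}{9}y\sqrt{y} & \text{if } 0 \leq y \leq 1, \\ \left[\frac{t^3}{9}\right]_{-1}^{\sqrt{y}} = \frac{1}{9} + \frac{1}{9}y\sqrt{y} & \text{if } 1 < y \leq 4, \\ 1 & \text{if } y > 4. \end{cases} $$

The two middle pieces agree at $y = 1$ (both give $\frac{2}{9}$) and the third piece reaches $1$ at $y = 4$. As a concrete instance of the second case, take $y = 2.25$: then $\sqrt{y} = 1.5$, the interval $[-1.5, 1.5]$ sticks out past $-1$ on the left, the piece $[-1.5,-1]$ carries no probability, and $F_Y(2.25) = \frac{1}{9} + \frac{1}{9}\cdot 3.375 = \frac{35}{72} \approx 0.486$.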

levap
  • 67,610
  • 1
    I could not hope for a better answer! I was able to follow your explanation with ease and everything makes perfect sense now. I had a hard time understanding the notion of the support of a random variable which is now perfectly clear. Thank you very much! –  Feb 06 '17 at 17:43
  • @Elix: You're welcome, glad I could help. – levap Feb 06 '17 at 17:48
  • I understand the part that says that if the support of $X$ is $[-1,2]$ then that of $Y$ is $[0,4]$. But I'm lost beginning from: "If $0\le y \le 1$ then $[-\sqrt{y},\sqrt{y}]$ is contained in $[-1,1]$." How did $[-1,1]$ come about? Next is "However, if $1< y \le 4$ then $-\sqrt{y} <-1$"; I can't see how that connection/conclusion was made. Could anyone chip in with an easy-to-follow explanation of these parts? Thanks in advance! – Ab2020 Feb 05 '23 at 07:22
  • @Ab2020 If $0\leq y\leq 1$, then the interval $[-\sqrt y,\sqrt y ]$ is largest when $y$ is $1$. All (positive) values of $y$ below $1$ make the interval smaller. The other part I do not understand either. I've noticed that this exercise is almost identical to the exercise we have worked on, so you can do it in the same way. Have you understood the steps I posted? I hope so. – callculus42 Feb 05 '23 at 08:30

If the square of a number is between $0$ and $1$ then the number itself has to be between $-1$ and $1$. For example, the squares of $-0.5$ and $0.5$ are both $0.25$. However, no real number has a negative square. This is independent of the nature of the number; it can be a fixed one or a randomly selected one. So,

$$P(X^2<y)= \begin{cases} 0&\text{ if }& \ y<0\\ P(-\sqrt y < X <\sqrt y)&\text{ if }& 0\leq y<1 \end{cases}.$$

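If $y\geq 1$ then $-\sqrt y \leq -1$, and since $X$ never falls below $-1$, the condition $-\sqrt y < X < \sqrt y$ reduces to $-1 < X < \sqrt y$: the square of every number between $-1$ and $\sqrt y$ will be less than $y$. So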

$$P(X^2<y)=P(-1<X<\sqrt y).$$ If, however, $y\geq 4$ then the square of any number between $-1$ and $2$ will be less than $y$, that is

$$P(X^2<y)=1$$ if $y\geq 4$ because all of our random numbers are less than two; their squares are less than $4$.

This is why

$$P(X^2<y)= \begin{cases} 0&\text{ if }& \ y<0\\ P(-\sqrt y < X <\sqrt y)&\text{ if }& 0\leq y<1\\ P(-1 < X <\sqrt y)&\text{ if }& 1\leq y<4\\ 1&\text{ if }&y\geq 4. \end{cases}$$

The rest is given by integrating the pdf over the respective domains.
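
If you want to convince yourself numerically, here is a minimal Python sketch that compares the piecewise formula with a simulation. The function names `cdf_y`, `sample_x` and `empirical_cdf_y` are my own, and the sampler assumes $F_X(x) = (x^3+1)/9$ on $[-1,2]$ (obtained by integrating the given density), which it inverts to draw $X$.

```python
import math
import random


def cdf_y(y):
    """Piecewise CDF of Y = X^2 when f_X(x) = x^2 / 3 on [-1, 2]."""
    if y < 0:
        return 0.0
    if y < 1:
        # P(-sqrt(y) < X < sqrt(y)) = (2/9) * y * sqrt(y)
        return 2.0 * y * math.sqrt(y) / 9.0
    if y < 4:
        # P(-1 < X < sqrt(y)) = 1/9 + (1/9) * y * sqrt(y)
        return (1.0 + y * math.sqrt(y)) / 9.0
    return 1.0


def sample_x(rng):
    """Draw X by inverse transform: solve (x^3 + 1) / 9 = u for x."""
    v = 9.0 * rng.random() - 1.0  # v plays the role of x^3, in [-1, 8)
    # Signed real cube root (v may be negative).
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def empirical_cdf_y(y, n=200_000, seed=0):
    """Fraction of simulated values of X^2 that are <= y."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if sample_x(rng) ** 2 <= y)
    return hits / n


if __name__ == "__main__":
    for y in (0.25, 1.0, 2.25, 4.0):
        print(y, cdf_y(y), empirical_cdf_y(y))
```

The simulated fractions should land close to the formula's values, up to the usual Monte Carlo noise of order $1/\sqrt{n}$.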

zoli
  • 20,817