
Possibly related question: Making sense of measure-theoretic definition of random variable

Given a random variable $X$ on $(\Omega, \mathscr{F}, \mathbb{P})$, its law $\mathcal{L}_X$ and a Borel function $g: \mathbb{R} \to \mathbb{R}$,

  1. $$E[g(X)] := \int_{\Omega} g(X(\omega)) d\mathbb{P}(\omega)$$

  2. The change of variables theorem allows us to compute this as follows:

$$E[g(X)] = \int_{\mathbb{R}} g(t) d\mathcal{L}_X(t)$$

Dumb question: without using the change of variables theorem, how do we compute $E[g(X)]$?

-

Side question: is the point of the change of variables to get back to Riemann or Riemann–Stieltjes integrals, so as to avoid working with the abstract Lebesgue integral directly?

-

I guess the answer is to use the measure-theoretic definition of expectation for measurable functions. But since the proof of the change of variables formula itself proceeds through indicator, simple, nonnegative and general measurable functions, it seems we would end up reinventing the wheel. Humour me anyway, please: how exactly would we be reinventing the wheel?


Say, for example, $g(x) = x^2$ and $X \sim \text{Unif}([0,1])$. Then how do we compute

$$\int_{\Omega} X(\omega)^2 d\mathbb{P}(\omega) \tag{*}$$

?


Here's what I got so far.

$$ (*) = \int_{\Omega} (X(\omega)^2)^{+} d\mathbb{P}(\omega) - \int_{\Omega} (X(\omega)^2)^{-} d\mathbb{P}(\omega)$$

(and since $X^2 \ge 0$, the negative part $(X^2)^{-}$ is identically $0$, so the second integral vanishes)

where we compute $$\int_{\Omega} (X(\omega)^2)^{+} d\mathbb{P}(\omega) = \sup_{h \in SF^{+}, h \le (X^2)^{+}}\{\int_{\Omega} h d \mathbb P\}$$

and where we compute $$\int_{\Omega} h \, d \mathbb P = \int_{\Omega} \left(a_1 1_{A_1} + \cdots + a_n 1_{A_n}\right) d \mathbb P = \int_{\Omega} a_1 1_{A_1} \, d \mathbb P + \cdots + \int_{\Omega} a_n 1_{A_n} \, d \mathbb P$$

where $A_1, \dots, A_n \in \mathscr F$,

and finally where we compute

$$\int_{\Omega} a_1 1_{A_1} \, d \mathbb P = a_1\int_{\Omega} 1_{A_1} \, d \mathbb P = a_1 \mathbb P(A_1).$$
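For concreteness, here is a rough numerical sketch (Python) of this sup-over-simple-functions recipe, assuming the canonical choice asked about below: $\Omega = [0,1]$, $\mathbb P = \lambda$ and $X(\omega) = \omega$. The dyadic simple functions $h_n$ used here are just one convenient choice of approximating sequence.

```python
import math

def lower_simple_integral(n):
    """Integral of the simple function h_n that equals k/2^n on
    A_k = {w in [0,1] : k/2^n <= w^2 < (k+1)/2^n}, for k = 0, ..., 2^n - 1.
    With X(w) = w on ([0,1], B[0,1], lambda), P(A_k) = sqrt((k+1)/2^n) - sqrt(k/2^n)."""
    total = 0.0
    for k in range(2 ** n):
        a, b = k / 2 ** n, (k + 1) / 2 ** n
        total += a * (math.sqrt(b) - math.sqrt(a))  # value of h_n on A_k times P(A_k)
    return total

for n in (2, 4, 8, 12):
    print(n, lower_simple_integral(n))  # increases towards 1/3
```

The printed values increase towards $\tfrac13$, which is what the change of variables formula $\int_0^1 t^2 \, dt$ gives directly.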


Without using the change of variables formula, would we have to come up with indicator and simple functions that build up to a uniformly distributed random variable?

If so, what are these indicator and simple functions that lead to a uniform distribution, please?

If not, what to do?


As for the probability space, I was thinking that $X$ being distributed as 'Unif(0,1)' means $X$ is defined on $(\Omega, \mathscr F, \mathbb P) = ([0,1], \mathscr B[0,1], \lambda)$ or $([0,1], \mathscr M[0,1], \lambda)$?


Actually, I was hoping there would be a way to define $X$ explicitly. For a discrete uniform distribution, say, where $X$ represents toss of a fair die, I guess we would have

$(\Omega, \mathscr F, \mathbb P) = (\{1, \dots ,6\}, 2^{\Omega}, \mathbb P)$ with $\mathbb P(\{\omega\}) = \frac16$ for each $\omega$, and $X(\omega) = \sum_{n=1}^{6} n \cdot 1_{\{n\}}(\omega)$

Then

$$E[X] = \int_{\Omega}\int_0^1 n\, 1_{\{\omega=n\}}(\omega)\,dn\,d\mathbb P(\omega)$$

$$ = \int_0^1 n \int_{\Omega} 1_{\{\omega=n\}}(\omega)\,d\mathbb P(\omega)\,dn \tag{by Fubini's?}$$

$$ = \int_0^1 n\, \mathbb P(\{\omega = n\})\, dn$$

$$ = \int_0^1 n f_X(n) dn$$

$$ = \int_0^1 n \cdot \frac{1}{1-0}\, dn$$

$$ = \int_0^1 n\, dn$$

$$=\left.\frac{n^2}{2}\right|_{0}^{1}$$

$$=\frac12 - 0 = \frac12$$

As for the second moment,

$$E[X^2] = \int_{\Omega} (\int_0^1 n 1_{\{n = \omega\}}(\omega)dn)^2 d\mathbb P(\omega)$$

$$E[X^2] = \int_{\Omega} \int_0^1 n 1_{\{n = \omega\}}(\omega)dn \int_0^1 m 1_{\{m = \omega\}}(\omega)dm d\mathbb P(\omega)$$

$$E[X^2] = \int_{\Omega} \int_0^1 \int_0^1 n m 1_{\{n = m = \omega\}}(\omega)dn dm d\mathbb P(\omega)$$

$$E[X^2] = \int_{\Omega} \int_0^1 \int_0^1 n^2 1_{\{n = n = \omega\}}(\omega)dn dn d\mathbb P(\omega) \tag{??}$$

$$E[X^2] = \int_0^1 \int_0^1 n^2 dn dn \tag{??}$$

$$E[X^2] = \frac13$$

I think I can do similarly for the discrete uniform, but the discrete uniform is a simple random variable and this continuous uniform has an obvious explicit form. What does $X \sim N(\mu,\sigma^2)$ look like? I guess it would be $X = X^+ - X^-$, where each of $X^{\pm}$ is a pointwise supremum of simple functions. Should/can we use the central limit theorem? I'm thinking a Bernoulli is an indicator, a binomial is simple, and then binomials approximate the normal?
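For what such an approximation could look like, here is a hedged Python sketch. It assumes the quantile-transform construction $X(\omega) = F^{-1}(\omega)$ on $((0,1), \mathscr B(0,1), \lambda)$ (the Skorokhod representation that comes up in the answers below), and the parameters $\mu = 1$, $\sigma = 2$, the truncation level $M$ and the dyadic staircase $X_n = 2^{-n}\lfloor 2^n X \rfloor$ are illustrative choices, not canonical ones.

```python
import numpy as np
from scipy.stats import norm

mu, sigma = 1.0, 2.0  # illustrative parameters for X ~ N(mu, sigma^2)

def simple_approx_mean(n, M=10.0):
    """E[X_n] for the dyadic staircase X_n = 2^{-n} * floor(2^n * X), keeping only
    the finitely many values in [-M, M).  Realizing X on ((0,1), B(0,1), lambda) via
    the quantile transform X(w) = F^{-1}(w) gives lambda({X_n = v}) = F(v + 2^{-n}) - F(v)."""
    step = 2.0 ** -n
    values = np.arange(-M, M, step)  # the finitely many values X_n is allowed to take
    probs = norm.cdf(values + step, mu, sigma) - norm.cdf(values, mu, sigma)
    return float(np.sum(values * probs))  # sum of v * P(X_n = v)

for n in (1, 3, 6, 10):
    print(n, simple_approx_mean(n))  # tends to mu = 1.0 as n grows
```

If this is right, then a normal random variable is not simple, but it is a limit of simple functions of exactly the kind used in the definition of the integral, and no central limit theorem would be needed.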

I guess I'm not making much sense, but what references/topics can I look up for something similar that does make sense? For example, where can I read about explicit representations of random variables, or approximations of them by simple functions, for computing such integrals without the change of variables formula?

BCLC
  • Edit: h should be simple and not simply constant times indicator – BCLC Mar 14 '18 at 14:50
  • As I understand things, there is no single universal probability space - they're essentially specified up to measure preserving transformations. The change of variables is just one, convenient way to express this fact, especially when the prob. space is un-/under-specified as in the above case. In case you want to 'freeze' one prob. space as canonical, and want to integrate only in that, well, you need to find an explicit description of it, at which point you can define $X$ explicitly as a map from it to $[0,1],$ and then explicitly compute all relevant integrals. – stochasticboy321 Mar 14 '18 at 22:18
  • @stochasticboy321 Understand only partially: Some integrals cannot be computed because something is lacking? If so, which integrals cannot be computed because something is lacking, and then what exactly is lacking please? If not, what do you mean please? – BCLC Mar 15 '18 at 08:36
  • @stochasticboy321 Made lots of edits – BCLC Mar 15 '18 at 10:08
  • In your toss of a fair die, $\mathbb P(\{\omega\}) = \frac{1}{|\Omega|}$ and $X(\omega) = \omega$. In general, if $X$ is uniform, $X$ is the 'identity'. – Fimpellizzeri Mar 17 '18 at 19:19
  • @Fimpellizieri As I thought. Thanks! Comments on other stuff? – BCLC Mar 19 '18 at 07:26
  • Well, something is definitely off since the expectation of a fair die roll is not $1/2$. – Fimpellizzeri Mar 23 '18 at 18:36
  • @Fimpellizieri 1/2 is for continuous uniform. Thanks! – BCLC Mar 24 '18 at 03:32

2 Answers


This is too long for a comment, so I'll post here in an attempt to make this as basic as possible. For your die roll example, let $\Omega = \{1,2,\dots, 6\}$, $\mathscr F = 2^\Omega$ and $\mathbb P$ be the (normalized) counting measure.

We may define the random variable $X:\Omega \longrightarrow [0,+\infty)$ as $X(\omega) = \omega$. In other words, $X$ is the result of a die roll and it is uniform because of the probability measure we've chosen. We'd have

\begin{align} \mathbb E(X) &= \int_{\Omega} X(\omega) \,d\mathbb P(\omega) \\&= \int_0^\infty\mathbb P\Big(X^{-1}\big((t, +\infty)\big)\Big)\, dt \\&= \int_{0}^1\mathbb P\big(\{1,2,3,4,5,6\}\big)\, dt +\int_{1}^2\mathbb P\big(\{2,3,4,5,6\}\big)\, dt +\int_{2}^3\mathbb P\big(\{3,4,5,6\}\big)\, dt \\&\quad+\int_{3}^4\mathbb P\big(\{4,5,6\}\big)\, dt +\int_{4}^5\mathbb P\big(\{5,6\}\big)\, dt +\int_{5}^6\mathbb P\big(\{6\}\big)\, dt \\&= 1+\frac56+\frac46+\frac36+\frac26+\frac16=3.5 \end{align}
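If it helps, here is a tiny numerical sanity check of this layer-cake computation (only a sketch; it exploits the fact that $\mathbb P(X > t)$ is constant on each interval $[k, k+1)$):

```python
import numpy as np

# Layer-cake check for the fair die: E[X] = integral over t >= 0 of P(X > t),
# and P(X > t) is constant on each [k, k+1), so the integral is a finite sum.
omega = np.arange(1, 7)                          # Omega = {1, ..., 6}
tails = [np.mean(omega > k) for k in range(6)]   # P(X > k) under the normalized counting measure
print(sum(tails))                                # 1 + 5/6 + 4/6 + 3/6 + 2/6 + 1/6 = 3.5
```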

That said, I think the formalization of probability is in general very messy and I may not be able to help with harder examples.


In a similar vein, for the 'Unif(0,1)' example we have $\Omega = [0,1]$, $\mathscr F$ can be either the Borel or the Lebesgue-measurable subsets of $[0,1]$, and $\mathbb P$ is the Lebesgue measure $\mu$.
The random variable $X : \Omega \longrightarrow [0,+\infty)$ is defined as $X(\omega) = \omega$. Then

\begin{align} \mathbb E(X) &= \int_{\Omega} X(\omega) \,d\mathbb P(\omega) \\&= \int_0^\infty\mathbb P\Big(X^{-1}\big((t, +\infty)\big)\Big)\, dt \\&= \int_{0}^1\mu\big((t,1]\big)\, dt \\&= \int_0^1 (1-t) \,dt = {\left[t-\frac{t^2}2\right]}_0^1 = 1-\frac12 = \frac12 \end{align}
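Again, a crude Riemann-sum sketch of the last integral, just as a sanity check:

```python
import numpy as np

# With X(w) = w on ([0,1], B, lambda), the tail is P(X > t) = lambda((t, 1]) = 1 - t.
dt = 1e-6
t = np.arange(0.0, 1.0, dt)
print(np.sum((1.0 - t) * dt))  # ~ 0.5 = E[X]
```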

Fimpellizzeri
  • Thanks Fimpellizieri! Are you sure you didn't use change of variable? – BCLC Mar 24 '18 at 03:35
  • If you're referring to the change from the first line to the second line, I used the 'definition' of Lebesgue integral. – Fimpellizzeri Mar 24 '18 at 14:26
  • oh right. Looks like I'll have to do some revision there and get back to you. Thanks Fimpellizieri! – BCLC Mar 24 '18 at 14:37
  • "the formalization of probability is in general very messy" ?? Any specific examples of "messiness"? – Did Mar 24 '18 at 20:58
  • I got it after I re-discovered Skorokhod representation. I posted an answer of my own. Again, thanks! Happy 4th week of Easter! ^-^ – BCLC Apr 28 '18 at 14:08

This was actually pretty basic (as I suspected): use the Skorokhod representation (so called in David Williams' Probability with Martingales). (*)

For a given cdf $F$, a random variable with that distribution can be represented explicitly on the probability space $((0,1),\mathscr B(0,1), \mu)$, where $\mu$ is Lebesgue measure, by $$X(\omega) = \sup\{y \in \mathbb{R}: F(y) < \omega\}.$$

E.g. for the exponential distribution, $X \sim \text{Exp}(\lambda)$:

$$F(y) < \omega$$

$$\iff 1-e^{-\lambda y} < \omega$$

$$\iff y < \frac{1}{\lambda} \ln\Big(\frac{1}{1-\omega}\Big) \quad \text{(for } y \ge 0\text{; for } y < 0 \text{ both sides hold trivially since } F(y) = 0 < \omega\text{)}$$

Thus, $$X(\omega) = \sup\{y \in \mathbb{R}: F(y) < \omega\} = \sup\left(-\infty,\frac{1}{\lambda} \ln\Big(\frac{1}{1-\omega}\Big)\right) = \frac{1}{\lambda} \ln\Big(\frac{1}{1-\omega}\Big)$$

Hence $$E[X] = \int_0^1 \frac{1}{\lambda} \ln\Big(\frac{1}{1-\omega}\Big) d\mu(\omega) = \int_0^1 \frac{1}{\lambda} \ln\Big(\frac{1}{1-\omega}\Big) d\omega$$

It can be verified that this integral is the same as

$$E[X] = \int_{\mathbb R} \lambda x e^{-\lambda x} 1_{(0,\infty)}(x)\, dx,$$ both being equal to $\frac{1}{\lambda}$.
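As a sanity check, here is a short Python sketch comparing the two integrals numerically; the rate $\lambda = 2$, the grid steps and the truncation of $(0,\infty)$ at $50$ are arbitrary illustrative choices.

```python
import numpy as np

lam = 2.0  # illustrative rate for Exp(lam)

# E[X] computed directly on the Skorokhod space ((0,1), B(0,1), Lebesgue):
dw = 1e-6
w = np.arange(dw / 2, 1.0, dw)  # midpoint rule; stays away from the blow-up at w = 1
lhs = np.sum((1.0 / lam) * np.log(1.0 / (1.0 - w)) * dw)

# E[X] computed against the density lam * exp(-lam * x) on (0, infinity), truncated at 50:
dx = 1e-4
x = np.arange(dx / 2, 50.0, dx)
rhs = np.sum(x * lam * np.exp(-lam * x) * dx)

print(lhs, rhs, 1.0 / lam)  # all three ~ 0.5
```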

E.g. for the continuous uniform distribution, $U \sim \text{Unif}((a,b))$:

$$F(y) < \omega$$

$$\iff \frac{y-a}{b-a} < \omega$$

$$\iff y < a + \omega(b-a)$$

Thus, $$U(\omega) = a + \omega(b-a)$$

Hence $$E[U] = \int_0^1 \big(a + \omega(b-a)\big)\, d\mu(\omega) = \int_0^1 \big(a + \omega(b-a)\big)\, d\omega$$

It can be verified that this integral is the same as

$$E[U] = \int_{\mathbb R} u\, 1_{(a,b)}(u)\,\frac{1}{b-a}\, du,$$ both being equal to $\frac{a+b}{2}$.
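And the same kind of check for the uniform case (again only a sketch; the endpoints $a = -1$, $b = 3$ are arbitrary):

```python
import numpy as np

a, b = -1.0, 3.0  # illustrative endpoints for Unif(a, b)

dw = 1e-6
w = np.arange(dw / 2, 1.0, dw)
lhs = np.sum((a + w * (b - a)) * dw)  # integral over the Skorokhod space (0, 1)

du = 1e-5
u = np.arange(a + du / 2, b, du)
rhs = np.sum(u / (b - a) * du)        # integral against the Unif(a, b) density

print(lhs, rhs, (a + b) / 2)          # all ~ 1.0
```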


(*) This is also called the canonical representation (MAT 235A/235B: Probability, lecture notes of Prof. Roman Vershynin, typeset by Edward D. Kim), or the Skorokhod representation of random variables using quantile transforms (Optimal Transport Methods in Economics by Alfred Galichon).

Skorokhod representations relate to quantile functions, which are defined similarly:

$$Q(p) = \inf\{x \in \mathbb R | F(x) \ge p\}$$

The Wikipedia page for random variables says, under 'Distribution functions':

The probability distribution "forgets" about the particular probability space used to define X and only records the probabilities of various values of X. [...] In practice, one often disposes of the space $\Omega$ altogether and just puts a measure on $\mathbb {R}$ that assigns measure 1 to the whole real line, i.e., one works with probability distributions instead of random variables.

BCLC