
To be honest, I'm really struggling with the intuition of conditional expectation where we condition on a sub-$\sigma$-algebra. The definition in my lecture notes is as follows:

Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space, $\mathcal{G}\subseteq\mathcal{F}$ a sub-$\sigma$-algebra, and $X\in\mathcal{L}^1(\Omega,\mathcal{F},\mathbb{P})$. A conditional expectation $\mathbb{E}\left[X\mid\mathcal{G}\right]$ is any random variable $Y\in\mathcal{L}^1(\Omega,\mathcal{G},\mathbb{P})$ (so in particular $Y$ is $\mathcal{G}$-measurable) for which: $$\int_G Y\,d\mathbb{P}=\int_G X\,d\mathbb{P}\text{ for all }G\in\mathcal{G}$$
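
For concreteness, here is a standard toy instance of this definition (the example is for illustration, not from the notes): roll a fair die, so $\Omega=\{1,\dots,6\}$ with uniform $\mathbb{P}$, let $X(\omega)=\omega$, and take $\mathcal{G}=\{\emptyset,\{1,3,5\},\{2,4,6\},\Omega\}$. Then
$$\mathbb{E}\left[X\mid\mathcal{G}\right](\omega)=\begin{cases}3, & \omega\in\{1,3,5\},\\ 4, & \omega\in\{2,4,6\},\end{cases}$$
and indeed $\int_{\{1,3,5\}}Y\,d\mathbb{P}=3\cdot\tfrac{1}{2}=\tfrac{1+3+5}{6}=\int_{\{1,3,5\}}X\,d\mathbb{P}$, and similarly on the even set.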

Although I technically understand the requirement in the definition, I have no intuition at all for what a conditional expectation in this sense means. If $\mathbb{E}\left[X\mid\mathcal{G}\right]$ is a random variable, then what is $\mathbb{E}\left[X\mid\mathcal{G}\right](\omega)$? What does it mean for conditional expectations to be independent?

  • $\mathbb{E}[X \mid \mathcal{G}]$ is intuitively the best approximation to $X(\omega)$ that you can give knowing whether $\omega \in A$ for all $A \in \mathcal{G}$ with $\mathbb{P}(A)>0$. The sense of this approximation is that they have the same mean and the $L^2$ norm of their difference is minimal among $\mathcal{G}$-measurable random variables. – Ian Nov 03 '21 at 12:34

1 Answer


Consider the case when $\mathcal{G}$ is generated by a measurable partition $G_1,G_2,\ldots$ of $\Omega$. In this case, the conditions for $Y$ to be the conditional expectation of $X$ given $\mathcal{G}$ can be stated as:

  1. $Y$ is constant on each $G_i$, and
  2. $\mathsf{E}[Y1_{G_i}]=\mathsf{E}[X1_{G_i}]$ for each $i\ge 1$.

Together these force $Y=\mathsf{E}[X1_{G_i}]/\mathsf{P}(G_i)$ on each $G_i$ with $\mathsf{P}(G_i)>0$; that is, $Y$ is the average of $X$ on each set in the partition. See also this question.
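
To make this concrete, here is a minimal Monte Carlo sketch in Python (the choice of $X$, the four-set partition, and all variable names are illustrative assumptions, not part of the answer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample Omega: identify omega with a uniform draw u in [0, 1).
N = 200_000
u = rng.uniform(0.0, 1.0, N)
X = np.sin(2 * np.pi * u) + u ** 2   # an arbitrary integrable X

# Partition of [0, 1) into quarters G_0, ..., G_3.
labels = np.minimum((4 * u).astype(int), 3)

# E[X | G]: constant on each G_i, equal to the average of X there.
Y = np.empty(N)
for i in range(4):
    mask = labels == i
    Y[mask] = X[mask].mean()

# Check the defining property E[Y 1_G] = E[X 1_G] on a set G in the
# generated sigma-algebra, e.g. G = G_0 ∪ G_2.
G = (labels == 0) | (labels == 2)
print((Y * G).mean(), (X * G).mean())   # agree up to floating-point rounding
```

The two printed values agree by construction, since on each $G_i$ the sums of $Y$ and of $X$ coincide. This also illustrates the comment above: among $\mathcal{G}$-measurable $Z$ (here, functions constant on each quarter), this $Y$ minimizes $\mathsf{E}[(X-Z)^2]$, because the cell mean minimizes squared error on each cell.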