In probability theory, we start with a probability triple $(\Omega , \mathcal{F} , \mathbb{P})$. The purpose of defining $\mathcal{F}$ is clear to me: we postulate three intuitive axioms for $\mathbb{P}$ and, to avoid paradoxes, restrict our consideration to certain subsets, $\mathcal{F} \subset 2^\Omega$, which satisfy the properties:

  1. $\Omega \in \mathcal{F},$
  2. $A \in \mathcal{F} \implies A'\in \mathcal{F},$
  3. $A_1,A_2,A_3,\dots \in \mathcal{F} \implies A_1 \cup A_2 \cup A_3 \cup \dots \in \mathcal{F}.$

These properties ensure that the axioms for $\mathbb{P}$ are well-defined. For example, closure under countable unions allows us to make sense of statements like $$\mathbb{P}(A_1 \cup A_2 \cup A_3 \cup \dots) = \mathbb{P}(A_1)+\mathbb{P}(A_2) + \mathbb{P}(A_3)+\dots$$ for pairwise disjoint $A_1, A_2, A_3, \dots \in \mathcal{F}$. When we move to the next step and define functions $f$ from $\Omega$ to $\mathbb{R}$, it seems clear that, because $\mathbb{P}$ is only defined on subsets in $\mathcal{F}$, we should require that for every subset $B \subset \mathbb{R}$ the preimage $f^{-1}(B) = \{\omega\in \Omega : f(\omega)\in B\}$ lies in $\mathcal{F}$. However, I don’t fully understand why we restrict the subsets of the codomain and, specifically, why we require the codomain to have a $\sigma$-algebra as well. Could you explain the motivation for this part of the definition of a measurable function?

S.H.W
  • Well, one reason is that, if we do not restrict the subsets of the codomain, not even the identity map would be measurable... – Célio Augusto Nov 06 '24 at 17:03
  • @CélioAugusto Are you referring to the case $\Omega = \mathbb{R}$? – S.H.W Nov 06 '24 at 17:38
  • Yes, if you take $\Omega=\mathbb{R}$ and let $\mathcal{F}$ be the Borel $\sigma$-algebra, the identity map would not be measurable. – Célio Augusto Nov 06 '24 at 18:58
  • @CélioAugusto I see. In that case, it's necessary to restrict the subsets in the codomain in some way, though it's still unclear why the codomain should require a $\sigma$-algebra. The idea doesn’t feel entirely intuitive to me; perhaps other collections of subsets could also serve as suitable candidates. – S.H.W Nov 06 '24 at 20:03
  • Also maybe in the general case where $\Omega$ is an uncountable set, it can be proved that many "good" functions can't be measurable without restricting the subsets of the codomain. I don't know how to make this precise. – S.H.W Nov 06 '24 at 20:06
  • I'm not sure what you mean by "require" here. If you have a measurable space $(\Omega, \mathcal{F})$ and a function $f$ from $\Omega$ to some set $E$ (for example $E=\mathbb{R}$), then the collection $\mathcal{E}$ of subsets $B\subseteq E$ such that $f^{-1}(B)\in\mathcal{F}$ must be a $\sigma$-algebra - this follows from the assumption that $\mathcal{F}$ itself is a $\sigma$-algebra. – Nicholas Burbank Nov 06 '24 at 23:53
  • So if $\mathcal{E}$ contains a collection $\mathcal{A}$ of subsets of $E$, it must in fact contain $\sigma(\mathcal{A})$. For example, in the case $E=\mathbb{R}$, if $f^{-1}(A)\in\mathcal{F}$ for all $A$ of the form $(-\infty, x]$, then in fact $f^{-1}(B)\in\mathcal{F}$ for all "Borel measurable" $B$. – Nicholas Burbank Nov 07 '24 at 00:38
  • @NicholasBurbank Thanks. So by using the $\sigma$-algebra $\mathcal{F}$, we can get a $\sigma$-algebra $\mathcal{E}$ on the codomain and in the case $E=\mathbb{R}$, $\mathcal{E}$ can coincide with Borel $\sigma$-algebra. Is my understanding correct? If you post it as an answer, I can accept it. – S.H.W Nov 07 '24 at 09:12
  • @NicholasBurbank Does this $\sigma$-algebra have a specific name? It seems to me that $\mathcal{E}$ is complementary to $\sigma$-algebra generated by a random variable. – S.H.W Nov 07 '24 at 09:17
  • Just accept that to speak of the distribution of a random variable (which is a probability measure on the codomain) we need a $\sigma$-algebra there. Bountying such a question and demanding an answer from a "reputable source" is a waste of resources. – Kurt G. Nov 08 '24 at 08:06
  • @KurtG. I don’t quite understand why you see it as a waste of resources. My goal is to develop a deeper understanding of probability theory, and exploring basic concepts thoroughly is essential for that. I often ask myself, If I didn’t know this definition, how would I arrive at it? While some definitions took years to refine, asking such foundational questions helps me build a more solid understanding. – S.H.W Nov 08 '24 at 08:58
  • The question itself is legit. What almost infuriates me is the bounty plus asking someone to write a formal answer. – Kurt G. Nov 08 '24 at 09:44
  • @KurtG. You’re right about requesting a formal response. I should have selected the "draw attention" option for the bounty. – S.H.W Nov 08 '24 at 09:57
  • Let me finish this discussion with some personal opinions: In fact, as a fairly new MSE member @NicholasBurbank deserves the bounty you offered. What I find very odd is that in the subsequent comment you were not sure if your understanding was correct but willing to accept the answer once it was formally delivered. Where is the math in here? Have nice day. – Kurt G. Nov 08 '24 at 10:05

2 Answers

Just for clarification: it is not true that a random variable $f$ must satisfy $f^{-1}(\mathcal{P}(\mathbb{R})) \subseteq \mathcal{F}$. Such an $f$ is of course a random variable, but this is not the definition.

Coming to the question: the collection of subsets of the codomain whose preimages belong to $\mathcal{F}$ is itself a $\sigma$-algebra. This follows from simple set-theoretic computations (see Rudin, Real and Complex Analysis, Theorem 1.12(a)). Hence a $\sigma$-algebra arises naturally on the codomain whenever we map out of a measurable space.

SSAD

With the insights provided by Nicholas Burbank's comments, I believe I’ve found a satisfying explanation.

As mentioned earlier, we define the $\sigma$-algebra $\mathcal{F}$ on the input space $\Omega$ and seek to define a mapping $X$ from the input space to the output space $\Omega'$, which is typically $\mathbb{R}$. Since our goal is to work with the measure $\mathbb{P}$, we can only consider subsets of $\Omega'$ whose preimages lie in $\mathcal{F}$. The collection of such subsets is $$X(\mathcal{F}) = \{E\subseteq \Omega' \,:\, X^{-1}(E)\in \mathcal{F}\}.$$ Using the properties of preimages, it can be shown that $X(\mathcal{F})$ forms a $\sigma$-algebra on $\Omega'$, known as the $\sigma$-algebra induced by $X$ and $\mathcal{F}$, or the pushforward $\sigma$-algebra.

  1. Containment of $\Omega'$: As $X$ is a mapping, it's clear that $X^{-1}(\Omega') = \Omega\in \mathcal{F}$. Hence $\Omega' \in X(\mathcal{F}).$
  2. Closure under Complements: Take $E\in X(\mathcal{F})$. So there exists $A \in \mathcal{F}$ such that $A = X^{-1}(E)$. We have $X^{-1}(\Omega' \setminus E) = X^{-1}(\Omega') \setminus X^{-1}(E) = \Omega \setminus X^{-1}(E) = \Omega \setminus A \in \mathcal{F}$. This shows that $\Omega' \setminus E\in X(\mathcal{F}).$
  3. Closure under Countable Unions: Let $E_1,E_2,E_3, \dots \in X(\mathcal{F})$. So there exist $A_1,A_2,A_3,\dots \in \mathcal{F}$ such that $A_1 = X^{-1}(E_1),A_2 = X^{-1}(E_2),A_3 = X^{-1}(E_3),\dots$. We have $X^{-1}(E_1 \cup E_2 \cup E_3 \cup \dots ) = X^{-1}(E_1) \cup X^{-1}(E_2) \cup X^{-1}(E_3) \cup \dots = A_1 \cup A_2 \cup A_3 \cup \dots \in \mathcal{F}$, which shows $E_1 \cup E_2 \cup E_3 \cup \dots \in X(\mathcal{F})$.
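
To make the three steps above concrete, here is a minimal Python sketch on a finite toy example. The sets `omega`, `omega_prime`, the $\sigma$-algebra `F`, and the map `X` are all hypothetical choices made only for illustration; on a finite space, closure under countable unions reduces to closure under pairwise unions, which is what the checker tests.

```python
from itertools import chain, combinations


def powerset(s):
    """Return all subsets of s as frozensets."""
    items = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def preimage(X, E):
    """X^{-1}(E) for a map X given as a dict from inputs to outputs."""
    return frozenset(w for w in X if X[w] in E)


def pushforward(X, F, codomain):
    """All subsets E of the codomain whose preimage under X lies in F."""
    return {E for E in powerset(codomain) if preimage(X, E) in F}


def is_sigma_algebra(collection, space):
    """Check the three axioms; on a finite space pairwise unions suffice."""
    space = frozenset(space)
    return (space in collection
            and all(space - A in collection for A in collection)
            and all(A | B in collection
                    for A in collection for B in collection))


# Hypothetical toy example (not from the discussion above).
omega = {1, 2, 3, 4}
F = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), frozenset(omega)}
omega_prime = {"a", "b", "c"}
X = {1: "a", 2: "b", 3: "c", 4: "c"}

XF = pushforward(X, F, omega_prime)
print(sorted(sorted(E) for E in XF))
# [[], ['a', 'b'], ['a', 'b', 'c'], ['c']]
print(is_sigma_algebra(XF, omega_prime))  # True
```

Note that $\{a\}$ is excluded because its preimage $\{1\}$ is not in `F`, while $\{a,b\}$ is included because its preimage $\{1,2\}$ is.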

Thus, it is natural to equip $\Omega'$ with the $\sigma$-algebra $X(\mathcal{F})$. If we want to deal with several mappings at once, it is natural to equip $\Omega'$ with a single $\sigma$-algebra that works for all of them, that is, one contained in $X(\mathcal{F})$ for every mapping $X$ under consideration. In addition, we want as many mappings as possible to be measurable, and to achieve that we should make the $\sigma$-algebra on $\Omega'$ as small as possible. The two answers quoted below give more details.

Nate Eldredge's answer:

The moral is this: To get as many $(\mathcal{B}_X,\mathcal{B}_Y)$-measurable functions $f : X \to Y$ as possible, one wants $\mathcal{B}_X$ to be as large as possible, so it makes sense to use a complete $\sigma$-algebra there. (You already know some of the nice properties of this, e.g. an a.e. limit of measurable functions is measurable.) But one wants $\mathcal{B}_Y$ to be as small as possible. When $Y$ is a topological space, we usually want to be able to compose $f$ with continuous functions $g : Y \to Y$, so $\mathcal{B}_Y$ had better contain the open sets (and hence the Borel $\sigma$-algebra), but we should stop there.

Michael Greinecker's answer:

On a more conceptual note, the less measurable sets you have in your codomain, the easier it is for a function to be measurable. And if a random variable should represent a random quantity, then all empirically interesting questions can be formulated in terms of simple intervals and their combinations. For, say, statistical applications there is no empirical difference between Borel sets and a Borel set modified by a null set. The distributions (on the reals) commonly applied can usually be given by a cumulative distribution function and such a function essentially determines the probability of intervals.

In the case where $\Omega' = \mathbb{R}$, to answer practical questions, we need all intervals $[a,b]$ with $a,b \in \mathbb{R}$ to be included in the $\sigma$-algebra. This implies that the $\sigma$-algebra must contain the $\sigma$-algebra generated by these intervals, which is the Borel $\sigma$-algebra. At this point, we can stop, as we have a suitable $\sigma$-algebra on $\mathbb{R}$. If we enlarge the $\sigma$-algebra further, fewer mappings will be measurable. For example, using the Lebesgue $\sigma$-algebra on the codomain $\mathbb{R}$ imposes additional constraints: there are continuous functions $f:\mathbb{R}\to\mathbb{R}$ for which the preimage of some Lebesgue measurable set is not Lebesgue measurable, so even continuity would no longer guarantee measurability.
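
The trade-off can be seen in a finite analogue of Célio Augusto's comment about the identity map. The following is only a sketch under assumed toy data: `omega`, `F`, `E_small`, and `E_large` are hypothetical, with `E_large` playing the role of "all subsets of the codomain".

```python
from itertools import chain, combinations


def powerset(s):
    """Return all subsets of s as frozensets."""
    items = list(s)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))}


def preimage(f, B, domain):
    """f^{-1}(B) for a function f defined on a finite domain."""
    return frozenset(w for w in domain if f(w) in B)


def is_measurable(f, domain, F, E):
    """f is (F, E)-measurable iff f^{-1}(B) is in F for every B in E."""
    return all(preimage(f, B, domain) in F for B in E)


def identity(w):
    return w


def constant(w):
    return 1


# Hypothetical toy data: F is generated by the partition {1,2} | {3,4}.
omega = frozenset({1, 2, 3, 4})
F = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), omega}

E_small = set(F)          # same sigma-algebra on the codomain
E_large = powerset(omega)  # every subset of the codomain

print(is_measurable(identity, omega, F, E_small))  # True
print(is_measurable(identity, omega, F, E_large))  # False
print(is_measurable(constant, omega, F, E_large))  # True
```

With the larger codomain $\sigma$-algebra the identity map fails, because the preimage of $\{1\}$ is $\{1\} \notin$ `F`; only maps that are constant on the blocks $\{1,2\}$ and $\{3,4\}$ remain measurable.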

S.H.W