
Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space. What does it mean (in the most formal and rigorous sense possible) to "draw a sample" $\omega \in \Omega$ from this space? Intuitively, I think I understand what is happening, but I am looking for a precise mathematical way of describing the process of sampling.

Kind regards and thank you very much!

Joker

Joker123

3 Answers


I'm going to jump right in here and give a non-answer, since none of the experts seem to have anything to say. I asked almost exactly the same question here (What is a sample of a random variable?), and the answers I got were quite useful.

One short version of the answer is "What are you going to use your sample for?"

Suppose you say "Well, I've got a random variable $X$ defined on $\Omega$, and I'd like to know whether, on average, $X$ for my sample will be larger than $17$."

In that case, I'd say "Then you should compute $\Bbb P\{X > 17\}$; you don't need to mention samples at all."

In fact, it doesn't take long to get good at removing the word "sample" from most questions just like that --- it's a little like learning not to talk about the ether when you're discussing physics. :)
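
To make that concrete, here is a minimal sketch; the distribution $X \sim \mathcal N(15, 3^2)$ and all the numbers in it are assumptions of mine, purely for illustration. The quantity $\Bbb P\{X > 17\}$ is computed directly from the distribution, and "drawing samples" only reappears as a Monte Carlo approximation of the same number.

```python
# Minimal sketch: X ~ Normal(15, 3^2) is a made-up illustrative choice,
# not anything from the question itself.
import numpy as np
from scipy.stats import norm

mu, sigma = 15.0, 3.0

# Direct route: P(X > 17) = 1 - F(17); no "samples" are mentioned at all.
p_exact = norm.sf(17, loc=mu, scale=sigma)  # sf(x) = 1 - cdf(x)

# Sampling route: the same number, recovered only approximately.
rng = np.random.default_rng(0)
draws = rng.normal(mu, sigma, size=100_000)
p_mc = (draws > 17).mean()

print(p_exact, p_mc)  # agree up to Monte Carlo error
```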

John Hughes

There's probably no precise answer to this question, since "taking a sample", "observing the occurrence of a random variable" and "performing a random experiment" are just expressions we use in reference to real-life actions that we interpret as taking note of the result of an experiment (in a broad sense, i.e. a procedure) whose result we can't completely predict beforehand. For this to make sense we first need to agree on the aspect of the final result that we are interested in; we could also discuss the cause of the uncertainty. For instance, when throwing a coin in the air we could:

  • check if it eventually comes down in a reasonable amount of time (maybe a very long one, like an hour);
  • check if it touches the ground in a very precise amount of time (maybe a very small interval instead of an exact number), which we could calculate using the laws of physics, given that we know the initial height and velocity (a worked instance follows this list);
  • check whether, upon touching the ground, it is heads or tails that faces upwards, maybe given that we know every detail about velocity, height, initial position, the point where the force is applied and the actual force, etc.;
  • the same, but without all that information.
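
To make the second bullet concrete (under the idealizing assumption of free fall with no air resistance), a coin released from height $h$ with initial upward velocity $v_0$ reaches the ground when $h + v_0 t - \frac{1}{2} g t^2 = 0$, i.e. at $$t = \frac{v_0 + \sqrt{v_0^2 + 2gh}}{g},$$ which is the sense in which the "very precise amount of time" is computable from the initial data.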

Which of these experiments are random and which are not? Well, perhaps there's no right answer. Of course, I tried to describe them in order of decreasing predictability (if we can regard randomness as a matter of degree), but maybe we could agree that the first one is not random, while the last one is; the second might be, but the third one is difficult to actually perform, given the chaotic dynamics involved. In fact, Bohr and Einstein had a famous controversy regarding this subject: basically, Einstein would say that the third one would not be random if we were good enough with our theories and predictions, while Bohr would say that all of them (maybe even the first one) are random.

All this is just to explain that there's no clear notion, much less a definition, of what "random" means. In a sense, it is just a consequence of our lack of knowledge or imprecision, although it could also be a fundamental characteristic of the universe and the way it works.

So when we refer to a random experiment and a random event we can look at it in two ways:

  • before the experiment is performed, the event $A$ is an element of the $\sigma$-algebra $\mathcal A$, and so the value $P(A)$ is defined, where $P\colon \mathcal A\longrightarrow [0,1]$ is a function satisfying the usual properties of a probability;
  • after the experiment is performed we get as a result a specific element of $\Omega$, say $\omega_0$, and we say that $A$ "occurred", "happened", etc., if $\omega_0\in A$, and that it didn't otherwise.

This all comes down to the interpretation of probability theory, and usually there's not much formalization surrounding it; but what we can formalize is that performing a random experiment is selecting an element $\omega_0$ of the sample space $\Omega$ (or maybe, letting the universe do it in a way that "respects" the probability law, whatever that means).

In the same sense, observing the value of a random variable $X$ or registering the value of an occurrence of $X$ can be represented as the value $X(\omega_0)$, where $\omega_0$ is the result of the random experiment.
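
As a minimal sketch of these last two paragraphs (the six-sided die and all names here are my own illustrative choices), one can model a finite space $\Omega$, compute $P(A)$ before any experiment, let a random number generator play the role of "the universe selecting $\omega_0$", and observe a random variable by evaluating it at $\omega_0$:

```python
# Toy finite probability space: a fair six-sided die (an illustrative choice).
import random

omega = [1, 2, 3, 4, 5, 6]               # the sample space Ω
prob = {w: 1 / 6 for w in omega}         # P({w}) for each atom

def P(A):
    """P(A) for an event A ⊆ Ω; defined before any experiment is performed."""
    return sum(prob[w] for w in A)

def X(w):
    """A random variable is just a (measurable) function X: Ω → R."""
    return w ** 2

A = {5, 6}                               # the event "at least a five"
print(P(A))                              # 1/3, no sampling needed

# "Performing the experiment": select ω₀ in a way that respects P.
omega_0 = random.choices(omega, weights=[prob[w] for w in omega])[0]

print(omega_0 in A)                      # did A occur?
print(X(omega_0))                        # the observed value of X
```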

And since a "random sample of size $n$" is just a collection of $n$ independent and identically distributed random variables, say $$X_1,X_2,\ldots,X_n$$ (also representable as a random vector), "drawing a sample" is performing the experiment (*) underlying the definition of the space $(\Omega,\mathcal A,P)$ in such a way that the independence and identical distribution assumptions are valid, to obtain a result $\omega_0$. The observed sample is not a collection of random variables, but the $n$-tuple $$\big(X_1(\omega_0),X_2(\omega_0),\ldots,X_n(\omega_0)\big) \in \mathbb R^n.$$


(*) Here, we must consider that the random experiment is, for instance, selecting $n$ persons who will answer a question, or any other action that yields $n$ results/elements of $\Omega$.

It is true that we could also think that there's only one variable $X$ defined on the space and that we perform the experiment $n$ times in order to get the $n$ results $$\omega_1,\omega_2,\ldots,\omega_n,$$ so that drawing a sample gives us the $n$-tuple $$\big(X(\omega_1),X(\omega_2),\ldots,X(\omega_n)\big).$$ But even in this case, we could define a new space by taking $$\tilde\Omega=\Omega^n=\Omega \times \Omega \times \ldots \times \Omega \quad\text{($n$ times)},$$ with $\tilde {\mathcal A}$ the product $\sigma$-algebra and $\tilde P$ the corresponding product measure. In this way, performing the $n$ successive experiments represented by $(\Omega,\mathcal A,P)$ would be equivalent to performing just once the experiment represented by $(\tilde\Omega,\tilde{\mathcal A},\tilde P)$.
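
Continuing the toy die sketch above (again with names and numbers chosen by me for illustration), the two viewpoints line up in code: repeating the experiment of $(\Omega,\mathcal A,P)$ $n$ times is the same as performing the product-space experiment once, observed through the coordinate variables.

```python
# Sketch of the product construction Ω̃ = Ω^n for the die example (illustrative).
import random

omega = [1, 2, 3, 4, 5, 6]
n = 3

def X(w):
    return w ** 2

# Viewpoint 1: one variable X, the experiment performed n times.
omegas = [random.choice(omega) for _ in range(n)]        # ω₁, ..., ω_n
sample_repeated = tuple(X(w) for w in omegas)

# Viewpoint 2: a single draw ω̃₀ from Ω̃ = Ω × ... × Ω (n times),
# observed through the coordinate random variables X_i(ω̃) = X(ω̃[i]).
omega_tilde_0 = tuple(random.choice(omega) for _ in range(n))
sample_product = tuple(X(omega_tilde_0[i]) for i in range(n))

# Either way, the observed sample is an n-tuple of numbers,
# not a collection of random variables.
print(sample_repeated, sample_product)
```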


My two cents:

The difficulty of rigorously defining "drawing a sample from a distribution $\mathbb{P}$" comes from the fact that, ultimately, probability theory is only an attempt to describe or model the reality of a "random phenomenon / experiment", and such a "random phenomenon / experiment" is difficult to further pin down mathematically (see @Alejandro's answer). It is this "random experiment" (e.g., a toss of a coin, a draw of a card, an execution of the RNG algorithm on your computer) that produces a random outcome (which we model as an element $\omega \in \Omega$), not the probability measure $\mathbb{P}$. In other words, the random experiment/phenomenon exists by itself (from which we can draw a sample), independently of what probability space $(\Omega, \Sigma, \mathbb{P})$ we choose to describe it (which only reflects our knowledge of reality).

Now, how do we associate a probability model $(\Omega, \Sigma, \mathbb{P})$ with a random phenomenon? This is often an art (e.g., we model a fair coin toss with $\mathrm{Bernoulli}(0.5)$ because of symmetry) and falls into the realm of statistics and machine learning. But in the ideal case when our probability model perfectly matches the random phenomenon, we should see that in the limit of infinitely many repeated runs of the same random experiment (i.e., as we collect i.i.d. samples), the frequency with which an outcome falls within any (measurable) set $B \subset \Omega$ converges to $\mathbb{P}(B)$ for our choice of probability measure $\mathbb{P}$. (We can also flip the question on its head: instead of trying to verify that our model $\mathbb{P}$ perfectly captures the probabilistic behavior of a given random process, we can ask "is my sampling algorithm truly drawing a sample from a target distribution $\mathbb{P}$?", and this is studied in the design of good (pseudo)random number generators.)
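
Here is a small numerical sketch of that frequency statement (the fair-coin model and the run sizes are illustrative assumptions): repeatedly run the experiment modeled by $\mathrm{Bernoulli}(0.5)$ and watch the empirical frequency of the event $B = \{\text{heads}\}$ approach $\mathbb{P}(B) = 0.5$.

```python
# Illustrative check: empirical frequencies approach P(B) for a fair-coin
# model; the probability 0.5 and the run sizes are assumptions for the sketch.
import random

p_B = 0.5                                  # P(B) under the model, B = "heads"
for n in (100, 10_000, 1_000_000):
    hits = sum(random.random() < p_B for _ in range(n))
    print(n, hits / n)                     # → 0.5 as n grows (law of large numbers)
```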

So when we "draw a sample $\omega$ from a distribution $\mathbb{P}$", it is as if there were an underlying random experiment whose behavior we can perfectly capture with $\mathbb{P}$, such that the resulting outcome $\omega$ belongs to any given measurable set $B$ (i.e., "event $B$ occurs") with probability $\mathbb{P}(B)$.

P.S. Terence Tao has a nice discussion on the distinction between probabilistic concepts and our models for them: "sample spaces (and their attendant structures) will be used to model probabilistic concepts, rather than to actually be the concepts themselves." -- https://terrytao.wordpress.com/2015/09/29/275a-notes-0-foundations-of-probability-theory/

Yibo Yang