Gaussian processes are generally introduced as families of random variables whose finite-dimensional marginals are all multivariate normal. However, they are also sometimes described as "distributions over functions." I'd like to formalize this second claim using the language of measure theory on Banach spaces of functions like $L^p$.

I know that Brownian motion (a specific type of GP) can be seen as a "$C[0,1]$-valued random variable," that is, a measurable function whose output is a random function. I also know that we cannot write PDFs for arbitrary distributions over infinite-dimensional Banach spaces like $L^p$ because there is no Lebesgue measure on such spaces. My questions are: 1) over what exact space of functions, $F$, a GP with a specific mean and covariance function $\mu, K$ is a distribution (probability measure), 2) whether we can write a CDF for such a GP, since every random variable must have a CDF, and 3) how I would find the probability that a random variable $f \sim GP(\mu, K)$ is contained in some subset of $F$, the space of functions the GP is defined over. For instance, if $F$ were $C[0,1]$, then I would want to know which subsets of $C[0,1]$ are more probable to be drawn from, and which are less, by looking at $\mu, K$.

I have taken a course in measure-theoretic probability, so measure theory, where illuminating, is welcome. Also note that my question is about GPs as distributions over function space, not about GP regression. I have read this, this and this, which do some of what I ask and outline the Kolmogorov existence theorem, but my questions are more specifically about PDFs/CDFs and about writing down the space of functions that the distribution of a GP is over, which is not addressed in those answers.

  • For a real-valued random variable $X$, the CDF is $F(c) := \mathbb{P}(X \le c)$. If $X$ is function-valued, this definition doesn't really make sense, which is why you can't write out a CDF for it. If $X$ is function-valued, one could, in principle, look at $F: C([0,1]) \rightarrow \mathbb{R}$ defined by $F(f) := \mathbb{P}(X \le f) = \mathbb{P}(X_t \le f(t) \text{ for all }t \in [0,1])$, but computing that is quite non-trivial. – user6247850 Mar 20 '23 at 19:53
  • Thanks, this is helpful. Any thoughts on my questions 1) and 3) @user6247850? – Tanishq Kumar Mar 20 '23 at 19:58
  • For the Brownian motion case (Wiener measure): https://en.wikipedia.org/wiki/Classical_Wiener_space – Stefan Perko Mar 20 '23 at 20:11

1 Answer

There is a lot to unpack in this question! (1) For a given GP (say defined over $[0,1]$ just to make things simple), it is very hard to tell immediately which function spaces its sample paths belong to. First and foremost, a GP is a stochastic process, so each realization (i.e. fixing the sample point $\omega$ in the sample space $\Omega$) results in a function (sample path) of the GP. Say the GP is $z: [0,1] \times \Omega \to \mathbb{R}$; then $z(\cdot,\omega)$ is the sample path, and is a function on $[0,1]$. As such, for any "nice" set of functions $A$, we can ask what is $$ P(z \in A) = P(\{\omega \in \Omega : z(\cdot,\omega) \in A\}). $$
This probability, as a function of the sets $A$, is the distribution of the process $z$; it plays the role that a CDF plays for a real-valued random variable (answering question (2)). Of course, which sets $A$ we can put in here will depend on the nature of the process $z$. A typical approach is to consider sets $A$ that are Borel subsets of a metric space, usually a standard function space.
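To make $P(z \in A)$ concrete, here is a rough Monte Carlo sketch. Everything specific in it is my own assumption for illustration: the zero mean, the squared-exponential covariance with length scale 0.2, the grid of 100 points, and the set $A = \{f : \sup_t |f(t)| \le 2\}$. It only samples the finite-dimensional marginal on the grid, so it approximates the function-space probability rather than computing it.

```python
import numpy as np

def sample_gp_paths(mean_fn, cov_fn, t, n_paths, rng, jitter=1e-6):
    """Draw GP sample paths restricted to the grid t (a finite-dimensional marginal)."""
    m = mean_fn(t)
    K = cov_fn(t[:, None], t[None, :])
    # Small diagonal jitter keeps the Cholesky factorization numerically stable.
    L = np.linalg.cholesky(K + jitter * np.eye(len(t)))
    xi = rng.standard_normal((len(t), n_paths))
    return (m[:, None] + L @ xi).T  # shape (n_paths, len(t))

def estimate_probability(paths, in_A):
    """Monte Carlo estimate of P(z in A): fraction of sampled paths lying in A."""
    return float(np.mean([in_A(path) for path in paths]))

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)

# Assumed example GP: zero mean, squared-exponential covariance, length scale 0.2.
mean_fn = lambda s: np.zeros_like(s)
cov_fn = lambda s, u: np.exp(-0.5 * (s - u) ** 2 / 0.2 ** 2)

paths = sample_gp_paths(mean_fn, cov_fn, t, 2000, rng)

# Assumed example set A: the sup-norm ball of radius 2, checked on the grid only.
p_ball = estimate_probability(paths, lambda f: np.max(np.abs(f)) <= 2.0)
print(p_ball)
```

Swapping the lambda passed to `estimate_probability` changes the set $A$; any property that can be checked on the grid (monotone, positive, bounded derivative estimate) works the same way, with the caveat that the grid can miss behaviour between grid points.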

For example, Brownian motion lives in $C[0,1]$ (and hence also in $L^p[0,1]$), so we might look at the distribution of Brownian motion as a function of Borel subsets of these spaces. But Brownian motion does not live in, for instance, the Sobolev space $W^{1,2}$. There are, though, GPs that are differentiable and take values in $W^{1,2}$ with probability 1. It is natural to consider the distribution of Brownian motion as a function of the Borel sets of the spaces that do contain its paths. For $C[0,1]$ this leads to the classical Wiener space, the prototypical example of an abstract Wiener space.
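One way to see numerically why Brownian paths fall outside $W^{1,2}$ is to look at the discrete analogue of $\int_0^1 |B'(t)|^2\,dt$ built from difference quotients. The sketch below is only a heuristic illustration under my own choices (the helper name `discrete_h1_energy`, the grid sizes, and an independent simulated path for each grid size); it is not a proof. Since the squared increments sum to roughly 1, the discrete energy grows roughly like the number of steps instead of converging.

```python
import numpy as np

def discrete_h1_energy(n_steps, rng):
    """Riemann-sum analogue of the integral of |B'(t)|^2 over [0, 1]."""
    dt = 1.0 / n_steps
    # Brownian increments B(t + dt) - B(t) are independent N(0, dt).
    increments = rng.standard_normal(n_steps) * np.sqrt(dt)
    # Difference quotients stand in for the (non-existent) derivative.
    return float(np.sum((increments / dt) ** 2) * dt)

rng = np.random.default_rng(0)

# Each grid size uses a freshly simulated path (an assumption made for simplicity).
energies = {n: discrete_h1_energy(n, rng) for n in (100, 1_000, 10_000, 100_000)}
for n, energy in energies.items():
    print(n, energy)
```

For a path that really is in $W^{1,2}$ (a smooth GP sample, say), the same quantity would stabilize as the grid is refined.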

Computing the probabilities (question 3) is done on a case-by-case basis: it is not easy to read them off from the mean and covariance alone, since the answer also depends on which subsets $A$ you are considering and on their nature.
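As one case where the computation can be done by hand: for standard Brownian motion and $A = \{f : \max_{t \in [0,1]} f(t) \le c\}$ with $c \ge 0$, the reflection principle gives $P(B \in A) = 2\Phi(c) - 1$, where $\Phi$ is the standard normal CDF. The sketch below compares that closed form against a simulation. The choice $c = 1$, the step count, and the number of paths are my own; the simulated maximum is taken over a grid, so it slightly underestimates the true maximum and the Monte Carlo estimate tends to sit a little above the exact value.

```python
import math
import numpy as np

def brownian_paths(n_paths, n_steps, rng):
    """Simulate standard Brownian motion on [0, 1] at n_steps equally spaced times."""
    dt = 1.0 / n_steps
    increments = rng.standard_normal((n_paths, n_steps)) * math.sqrt(dt)
    return np.cumsum(increments, axis=1)  # values at times dt, 2*dt, ..., 1

rng = np.random.default_rng(0)
c = 1.0  # assumed threshold for the example set A

paths = brownian_paths(n_paths=4000, n_steps=1000, rng=rng)

# Monte Carlo estimate of P(max_t B_t <= c), using the grid maximum.
p_mc = float(np.mean(paths.max(axis=1) <= c))

# Reflection principle: 2 * Phi(c) - 1, which equals erf(c / sqrt(2)).
p_exact = math.erf(c / math.sqrt(2.0))

print(p_mc, p_exact)
```

For a general GP and a general set $A$ there is usually no such formula, which is why simulation or bounds tailored to the set are the norm.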

LostStatistician18