
Introduction

Usually, Bayesian networks are defined as follows. Let $(\mathcal{X},\mathcal{P})$ be a probability space, and let $X_1,\dots,X_n$ be random variables on this space with values in $\mathbb{R}$. Let $G$ be a DAG whose vertices are in bijection with the $X_i$'s. Then the space is said to be Bayesian with respect to $G$ (or Markovian, etc.) if $$P(\forall i\ X_i=x_i)=\prod_i P(X_i=x_i\mid PA_i=pa_i),$$ where $PA_i$ denotes the parents of $X_i$ in the graph, $pa_i$ denotes the values among $x_1,\dots,x_n$ taken by $PA_i$, and the equality is required to hold whenever the conditional probabilities are defined.
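
For concreteness (this example is not part of the original question), take the chain $X_1\to X_2\to X_3$, so that $PA_1=\varnothing$, $PA_2=\{X_1\}$ and $PA_3=\{X_2\}$. For discrete variables the factorization then reads $$P(X_1=x_1,X_2=x_2,X_3=x_3)=P(X_1=x_1)\,P(X_2=x_2\mid X_1=x_1)\,P(X_3=x_3\mid X_2=x_2).$$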

Problem

The definition above only makes sense for discrete variables; otherwise the event $\{PA_i=pa_i\}$ can have probability $0$ for every value $pa_i$, so the conditional probability $P(X_i=x_i\mid PA_i=pa_i)$ is undefined. For example, if a parent $X_j$ has a continuous distribution, then $P(X_j=x_j)=0$ for every $x_j$. Therefore this is not a rigorous definition. (This is the definition written on Wikipedia, despite the conditioning on probability-$0$ events.)
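
The usual textbook workaround (mentioned here only for context, not part of the original question) is to assume the joint distribution has a density $p$ and to impose the factorization on densities instead, $$p(x_1,\dots,x_n)=\prod_i p(x_i\mid pa_i),$$ but this presupposes real-valued variables with densities, which is exactly what the guidelines below try to avoid.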

Guidelines

I am looking for a definition that is as natural as possible: ideally one that does not refer to pdfs, does not assume that the $X_i$'s take values in the reals, etc. I presume this will use the concept of a conditional mean, which usually solves the problem of turning non-rigorous definitions into rigorous ones when passing from discrete to continuous variables, but I couldn't figure out how to do it.

Abstraction and rigor are central to my question.

  • Isn't the Local Markov Property definition given on Wikipedia agnostic of density/mass functions? It only relies on the notion of conditional independence which should be abstract enough for your purposes. – angryavian Oct 22 '22 at 02:23
  • Using Markov kernels this model extends to Borel spaces, meaning any measurable space with countably generated $\sigma$-algebra. And I think that this further extends to countably many random objects, under mild assumptions. I discussed related questions here and here. – Matija Oct 22 '22 at 19:07
  • As noted in the comments, if you define the distribution of $X_0$ and if you define the conditional distribution of $X_n \mid X_1, \dots, X_{n - 1}$ for each $n \geq 1$, then that defines a sequence $(X_n)_{n \geq 1}$ with the desired conditional distributions. Here the $X_i$s can take values in arbitrary measure spaces $\Omega_i$. Intuitively, this sequence is the result of a random experiment with infinitely many stages, where each stage only uses results from the previous stage. – Mason Oct 24 '22 at 02:43
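
A minimal sketch, in my own words, of the kernel-based construction described in the comments above (assuming the vertices are topologically ordered so that parents come before children, and that each $X_i$ takes values in a Borel space $\Omega_i$): specify for each $i$ a Markov kernel $K_i$ from the parent coordinates to $\Omega_i$ (just a probability measure when $PA_i=\varnothing$), and define the joint law on $\Omega_1\times\cdots\times\Omega_n$ by $$P(X_1\in A_1,\dots,X_n\in A_n)=\int_{A_1}K_1(dx_1)\int_{A_2}K_2(pa_2;dx_2)\cdots\int_{A_n}K_n(pa_n;dx_n),$$ where $pa_i$ denotes the coordinates $x_j$ with $X_j$ a parent of $X_i$. One could then call the space Bayesian with respect to $G$ when its joint law admits such a representation; this avoids both densities and conditioning on probability-$0$ events.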

0 Answers