Let $(\mathcal{X}, \|\cdot\|_p)$ be a metric space, where $\mathcal{X}\subseteq \mathbb{R}^n$, and let $\mathcal{P}(\mathcal{X})$ be the set of probability measures on that space. A paper I am reading defines the Total Variation Distance between two distributions $P, Q \in \mathcal{P}(\mathcal{X})$ as follows: $$\text{TV}(P, Q) = \frac{1}{2} \int_\mathcal{X} |dP(x) - dQ(x)|$$ I'm used to integrals like $\int f(x) dx$ and found this post, which explains the notion behind something like $\int f(x) dP(x)$. However, I don't understand the $|dP(x) - dQ(x)|$ part: how would we integrate the absolute value of the difference of two infinitesimally small elements taken from two different distributions? I would be grateful if someone could provide some intuition behind this, or some keywords/topics I would need to read about to understand it.

1 Answer

I think it refers to the following:

The set of finite signed measures on a measurable space can be made into a normed space. The norm of a measure $\mu$ (its total variation norm) is given by $$|\mu|=\sup \sum_i |\mu(A_i)|$$ where the supremum is taken over all countable partitions $(A_i)_{i \in \mathbb{N}}$ of the sample space into measurable sets.
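
For intuition, here is a small worked case, assuming a finite sample space $\{x_1, \dots, x_n\}$ (an assumption made only for illustration; the question is about $\mathcal{X}\subseteq\mathbb{R}^n$). By the triangle inequality, merging two cells of a partition can only decrease the sum, so the supremum is attained by the partition into singletons: $$\sup \sum_i |\mu(A_i)| = \sum_{k=1}^{n} |\mu(\{x_k\})|.$$ For example, taking the illustrative values $P = (0.5, 0.3, 0.2)$ and $Q = (0.2, 0.3, 0.5)$ on three points, the difference $\mu = P - Q$ has $$\sum_{k=1}^{3} |\mu(\{x_k\})| = |0.3| + |0| + |-0.3| = 0.6,$$ and half of that is $0.3$.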

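Hence, you can think of $\mathrm{d}P-\mathrm{d}Q$ as $\mathrm{d}\mu$ for a new signed measure $\mu = P - Q$, and the TV distance then reduces to $\frac{1}{2}$ times the norm of this measure. In particular, if $P$ and $Q$ have densities $p$ and $q$ with respect to a common measure $\lambda$ (for example, Lebesgue measure), this becomes $$\text{TV}(P, Q) = \frac{1}{2} \int_\mathcal{X} |p(x) - q(x)| \,\mathrm{d}\lambda(x).$$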
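
To make this concrete, here is a minimal numerical sketch, assuming a finite sample space so that each distribution is just a list of probabilities. The function names and the example values are hypothetical, chosen for illustration. It compares half the sum of absolute differences with the other common characterization of total variation, the largest gap $|P(A) - Q(A)|$ over all events $A$, found here by brute force.

```python
from itertools import combinations


def tv_half_l1(p, q):
    """Half the sum of |p_i - q_i|: the discrete analogue of (1/2) * integral |dP - dQ|."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def tv_sup_over_events(p, q):
    """Largest |P(A) - Q(A)| over all subsets A of the (finite) sample space.

    Brute force over all 2^n subsets, so only suitable for very small n.
    """
    n = len(p)
    best = 0.0
    for size in range(n + 1):
        for event in combinations(range(n), size):
            gap = abs(sum(p[i] - q[i] for i in event))
            best = max(best, gap)
    return best


# Illustrative distributions on a three-point space (made-up values).
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

print(tv_half_l1(p, q))          # approximately 0.3
print(tv_sup_over_events(p, q))  # approximately 0.3, up to floating-point rounding
```

The two numbers agree because the best event $A$ collects exactly the points where $P$ puts more mass than $Q$, which accounts for half of the total absolute difference.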

P.Jo