I'm interested in the information gained from using a joint distribution compared to just one of its marginals, that is:

$$\begin{align}g_x &= D_{KL}(P_{(X,Y)} | Q_{(X,Y)}) - D_{KL}(P_X | Q_X) \\\quad \text{and} \quad g_y &= D_{KL}(P_{(X,Y)} | Q_{(X,Y)}) - D_{KL}(P_Y | Q_Y)\end{align}$$

where $P$ and $Q$ are distributions.

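Clearly we have $$\max\{D_{KL}(P_X | Q_X), D_{KL}(P_Y | Q_Y)\} \le D_{KL}(P_{(X,Y)} | Q_{(X,Y)}),$$ and so $g_x \ge 0$ and $g_y \ge 0$. When $X$ and $Y$ are independent under both $P$ and $Q$, the joint divergence equals $D_{KL}(P_X | Q_X) + D_{KL}(P_Y | Q_Y)$, which gives $g_x = D_{KL}(P_Y | Q_Y)$ and $g_y = D_{KL}(P_X | Q_X)$. Without independence the sum is not an upper bound in general: if $Q$ is a product distribution and $P$ is not, the joint divergence exceeds the sum by the mutual information $\operatorname{I}_P(X;Y)$.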
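As a quick numerical sanity check, here is a minimal sketch that computes $g_x$ and $g_y$ for a small discrete example and compares $g_x$ with the chain-rule expression $\sum_{x,y} p(x,y) \log \frac{p(y|x)}{q(y|x)}$. The helper names (`kl`, `marginal`, `conditional_kl`) and the distributions `P` and `Q` are made up for illustration, and the code assumes $Q$ is strictly positive wherever $P$ is.

```python
import math
from collections import defaultdict


def kl(p, q):
    """KL divergence D(p || q) in nats; p and q map outcomes to probabilities."""
    return sum(pv * math.log(pv / q[k]) for k, pv in p.items() if pv > 0)


def marginal(joint, axis):
    """Marginalize a joint dict keyed by (x, y) onto coordinate axis (0 = X, 1 = Y)."""
    out = defaultdict(float)
    for key, v in joint.items():
        out[key[axis]] += v
    return dict(out)


def conditional_kl(p, q):
    """Sum over (x, y) of p(x, y) * log(p(y|x) / q(y|x))."""
    px, qx = marginal(p, 0), marginal(q, 0)
    total = 0.0
    for (x, y), pxy in p.items():
        if pxy > 0:
            total += pxy * math.log((pxy / px[x]) / (q[(x, y)] / qx[x]))
    return total


# Hypothetical joint distributions on {0, 1} x {0, 1} (illustrative values only).
P = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
Q = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

g_x = kl(P, Q) - kl(marginal(P, 0), marginal(Q, 0))
g_y = kl(P, Q) - kl(marginal(P, 1), marginal(Q, 1))

print("g_x =", g_x, " conditional KL =", conditional_kl(P, Q))
print("g_y =", g_y)
```

On this example the two values printed on the first line should agree up to floating-point error, which is what the chain rule for KL divergence predicts.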

Superficially they seem similar to the uncertainty coefficients: $$ C_{XY} = \frac{\operatorname{I}(X;Y)}{H(Y)} ~~~~\mbox{and}~~~~ C_{YX} = \frac{\operatorname{I}(X;Y)}{H(X)}. $$

I'm wondering whether these have been studied before, and/or have a name I can refer to them by?

RobPratt
Thomas Ahle
  • $g_x$ is the conditional KL divergence $D(P_{Y|X}|Q_{Y|X}|P_X)$ - see ch 2 (particularly pg 24) here. – stochasticboy321 Mar 24 '20 at 00:03
  • That's great! Am I right to assume that the $|P_X$ at the end is a more "correct" notation used by the authors, but that most sources ignore it and simply write $D(P_{Y|X} | Q_{Y|X}) $? – Thomas Ahle Mar 24 '20 at 09:41
  • I haven't seen that notation, but the trouble with it at face value is that it leaves things undetermined. $P_{Y|X}$ and $Q_{Y|X}$ have a random variable $X$ in them. The functional includes the conditional laws for $Y$, but what is $X$'s law? Do I not need it to define this quantity? – stochasticboy321 Mar 25 '20 at 22:30
  • In this question it appears to be "implicit": https://mathoverflow.net/questions/97755/ while in this one it appears to be a random variable: https://math.stackexchange.com/questions/3016421/ – Thomas Ahle Apr 07 '20 at 21:49
  • $D(P_{XY} \,|\, P_X Q_{Y|X})$ looks like another nice version. @stochasticboy321 If you add an answer I'll mark it as accepted. – Thomas Ahle Apr 08 '20 at 11:27

0 Answers