
I would like to calculate the functional derivative of the Rényi divergence with respect to its first argument $q$:

\begin{align} D_\alpha(q||p)=\frac{1}{\alpha-1}\log\int q^\alpha(x) p^{1-\alpha}(x)dx \end{align}

I would proceed as follows:

\begin{align} \frac{\delta D_\alpha(q||p)}{\delta q}&=\frac{1}{\alpha-1}\frac{\delta}{\delta q}\log\int q^\alpha(x) p^{1-\alpha}(x)dx\\ &=\frac{\alpha}{1-\alpha}\frac{q^{\alpha-1}(x)p^{1-\alpha}(x)}{\int q^\alpha(x) p^{1-\alpha}(x)dx} \end{align} where I have applied the chain rule to the logarithm and then the Euler-Lagrange equation to its argument.

However, I am not sure if the above is correct. For example, I'd expect to obtain the functional derivative of the KL divergence for $\alpha\to1$.

1 Answer


Let $p$ and $q$ be densities and let $\xi$ be a function with $\int_{\mathbb R^n} \xi(x) \, \text{d}x = 0$, so that $p + \varepsilon \xi$ still integrates to one. Note that, relative to the question, the roles of the symbols are swapped here: $p$ is the first argument and $q$ the second. Then (assuming sufficient regularity so that we can apply Lebesgue's dominated convergence theorem in $(\star)$) we obtain
\begin{align*}
\frac{\text{d}}{\text{d} \varepsilon}\bigg|_{\varepsilon = 0} D_{\alpha}(p + \varepsilon \xi \mid q)
& = \frac{1}{\alpha - 1} \frac{\text{d}}{\text{d} \varepsilon}\bigg|_{\varepsilon = 0} \ln\left(\int_{\mathbb R^n} \big(p(x) + \varepsilon \xi(x)\big)^{\alpha} q(x)^{1 - \alpha} \, \text{d}x\right) \\
& = \frac{1}{\alpha - 1} \frac{1}{e^{(\alpha - 1) D_{\alpha}(p \mid q)}} \frac{\text{d}}{\text{d} \varepsilon}\bigg|_{\varepsilon = 0} \int_{\mathbb R^n} \big(p(x) + \varepsilon \xi(x)\big)^{\alpha} q(x)^{1 - \alpha} \, \text{d}x \\
&\overset{(\star)}{=} \frac{1}{\alpha - 1} e^{(1 - \alpha) D_{\alpha}(p \mid q)} \int_{\mathbb R^n} \left[\frac{\text{d}}{\text{d} \varepsilon}\bigg|_{\varepsilon = 0} \big(p(x) + \varepsilon \xi(x)\big)^{\alpha}\right] q(x)^{1 - \alpha} \, \text{d}x \\
& = \frac{\alpha}{\alpha - 1} e^{(1 - \alpha) D_{\alpha}(p \mid q)} \int_{\mathbb R^n} p(x)^{\alpha - 1} q(x)^{1 - \alpha} \xi(x) \, \text{d}x.
\end{align*}
Hence $$\frac{\delta}{\delta p} D_{\alpha}(p \mid q) = \frac{\alpha}{\alpha - 1} e^{(1 - \alpha) D_{\alpha}(p \mid q)} p(x)^{\alpha - 1} q(x)^{1 - \alpha}.$$
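The formula for $\frac{\delta}{\delta p} D_{\alpha}(p \mid q)$ can be sanity-checked numerically. The following is only a minimal sketch under stated assumptions: the densities are one-dimensional Gaussians on a truncated grid, integrals are replaced by Riemann sums, and the perturbation is $\xi = p \sin x$ (which integrates to zero by symmetry and keeps $p + \varepsilon \xi$ positive). The names `gaussian`, `renyi` and `renyi_derivative`, as well as all parameter values, are illustrative choices and not part of the derivation.

```python
import math

def gaussian(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def renyi(p, q, dx, alpha):
    """Riemann-sum approximation of D_alpha(p | q) on a uniform grid."""
    z = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q)) * dx
    return math.log(z) / (alpha - 1.0)

def renyi_derivative(p, q, dx, alpha):
    """Closed-form functional derivative w.r.t. the first argument, pointwise on the grid."""
    scale = alpha / (alpha - 1.0) * math.exp((1.0 - alpha) * renyi(p, q, dx, alpha))
    return [scale * pi ** (alpha - 1.0) * qi ** (1.0 - alpha) for pi, qi in zip(p, q)]

# Assumed test setup: symmetric grid on [-10, 10], two Gaussians, alpha = 0.5.
n = 4001
xs = [-10.0 + 20.0 * i / (n - 1) for i in range(n)]
dx = xs[1] - xs[0]
alpha = 0.5
p = [gaussian(x, 0.0, 1.0) for x in xs]
q = [gaussian(x, 1.0, 1.5) for x in xs]

# Perturbation with zero integral: p is even, sin is odd.
xi = [pi * math.sin(x) for pi, x in zip(p, xs)]

# Directional derivative by central finite differences in epsilon.
eps = 1e-4
p_plus = [pi + eps * v for pi, v in zip(p, xi)]
p_minus = [pi - eps * v for pi, v in zip(p, xi)]
numeric = (renyi(p_plus, q, dx, alpha) - renyi(p_minus, q, dx, alpha)) / (2.0 * eps)

# The same directional derivative from the closed-form expression.
analytic = sum(g * v for g, v in zip(renyi_derivative(p, q, dx, alpha), xi)) * dx

print(numeric, analytic)
```

If the formula is right, the two printed numbers should agree up to the finite-difference error. Other values of `alpha` can be tried, as long as $\int p^{\alpha} q^{1-\alpha} \, \text{d}x$ stays finite for the chosen densities.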

Similarly, one obtains $$ \frac{\delta D_{\alpha}(q \mid p)}{\delta p} = - e^{(1 - \alpha) D_{\alpha}(q \mid p)} \left(\frac{q}{p}\right)^{\alpha}, $$ which is also mentioned in Example 6 of Convex Optimization on Functionals of Probability Densities by Tomohiro Nishiyama.
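Regarding the limit $\alpha \to 1$ raised in the question, here is a sketch of a consistency check for the formula for $\frac{\delta}{\delta p} D_{\alpha}(p \mid q)$, assuming the first-order expansions in $\alpha - 1$ may be taken under the integral sign. Write $Z_{\alpha} := \int_{\mathbb R^n} p(x)^{\alpha} q(x)^{1 - \alpha} \, \text{d}x = e^{(\alpha - 1) D_{\alpha}(p \mid q)}$ and let $\operatorname{KL}(p \mid q) = \int_{\mathbb R^n} p(x) \ln\frac{p(x)}{q(x)} \, \text{d}x$. Then
\begin{align*}
\left(\frac{p}{q}\right)^{\alpha - 1} & = 1 + (\alpha - 1) \ln\frac{p}{q} + O\big((\alpha - 1)^2\big), \\
Z_{\alpha} & = 1 + (\alpha - 1) \operatorname{KL}(p \mid q) + O\big((\alpha - 1)^2\big),
\end{align*}
so that $$\frac{\delta}{\delta p} D_{\alpha}(p \mid q) = \frac{\alpha}{\alpha - 1} \frac{1}{Z_{\alpha}} \left(\frac{p}{q}\right)^{\alpha - 1} = \frac{\alpha}{\alpha - 1} + \alpha \left(\ln\frac{p(x)}{q(x)} - \operatorname{KL}(p \mid q)\right) + O(\alpha - 1).$$ The divergent term $\frac{\alpha}{\alpha - 1}$ does not depend on $x$, so it contributes nothing when integrated against a perturbation $\xi$ with $\int \xi \, \text{d}x = 0$; on the set of densities the functional derivative is only determined up to an additive constant. Modulo such constants, the limit is $\ln\frac{p(x)}{q(x)}$, which agrees with $\frac{\delta}{\delta p} \operatorname{KL}(p \mid q) = \ln\frac{p(x)}{q(x)} + 1$.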

ViktorStein