
What nice ways do you know of proving Jensen's inequality for integrals? I'm looking for a variety of approaches.
Suppose that $\varphi$ is a convex function on the real line and that $f$ is an integrable real-valued function. Then:

$$\varphi\left(\int_a^b f\right) \leqslant \int_a^b \varphi(f).$$

Riemann
user 1591719

5 Answers


First of all, Jensen's inequality requires a domain, $X$, where $$ \int_X\,\mathrm{d}\mu=1\tag{1} $$

Next, suppose that $\varphi$ is convex on the convex hull of the range of $f$, $\mathcal{K}(f(X))$; this means that for any $t_0\in \mathcal{K}(f(X))$, $$ \frac{\varphi(t)-\varphi(t_0)}{t-t_0}\tag{2} $$ is non-decreasing for $t\in\mathcal{K}(f(X))\setminus\{t_0\}$. This means that we can find a $\Phi$ so that $$ \sup_{t<t_0}\frac{\varphi(t)-\varphi(t_0)}{t-t_0}\le\Phi\le\inf_{t>t_0}\frac{\varphi(t)-\varphi(t_0)}{t-t_0}\tag{3} $$ and therefore, for all $t$, we have $$ (t-t_0)\Phi\le\varphi(t)-\varphi(t_0)\tag{4} $$

Now, let $t=f(x)$ and set $$ t_0=\int_Xf(x)\,\mathrm{d}\mu\tag{5} $$ so that $(4)$ becomes $$ \left(f(x)-\int_Xf(x)\,\mathrm{d}\mu\right)\Phi\le\varphi(f(x))-\varphi\left(\int_Xf(x)\,\mathrm{d}\mu\right)\tag{6} $$

Integrating both sides of $(6)$ while remembering $(1)$ yields $$ \left(\int_Xf(x)\,\mathrm{d}\mu-\int_Xf(x)\,\mathrm{d}\mu\right)\Phi\le\int_X\varphi(f(x))\,\mathrm{d}\mu-\varphi\left(\int_Xf(x)\,\mathrm{d}\mu\right)\tag{7} $$ which, upon rearranging, becomes $$ \varphi\left(\int_Xf(x)\,\mathrm{d}\mu\right)\le\int_X\varphi(f(x))\,\mathrm{d}\mu\tag{8} $$
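The argument can also be sanity-checked numerically (a sketch only, with illustrative choices not taken from the answer: $X=[0,1]$ with Lebesgue measure, $\varphi(t)=t^2$, $f(x)=x$, and $\Phi=\varphi'(t_0)$, which satisfies $(3)$ here since $\varphi$ is differentiable):

```python
# Numerical sanity check of the supporting-line argument (illustrative choices):
# X = [0,1] with Lebesgue measure, so (1) holds; φ(t) = t², f(x) = x.
phi = lambda t: t**2
f = lambda x: x

n = 100_000
xs = [(i + 0.5) / n for i in range(n)]          # midpoint sample of X = [0,1]

t0 = sum(f(x) for x in xs) / n                  # (5): t0 = ∫_X f dμ  (≈ 1/2)
Phi = 2 * t0                                    # a slope satisfying (3): φ'(t0)

# (4): (t - t0)·Φ ≤ φ(t) - φ(t0) at every sampled t = f(x)
assert all((f(x) - t0) * Phi <= phi(f(x)) - phi(t0) + 1e-12 for x in xs)

# (8): φ(∫_X f dμ) ≤ ∫_X φ(f) dμ
lhs = phi(t0)                                   # ≈ 1/4
rhs = sum(phi(f(x)) for x in xs) / n            # ≈ ∫_0^1 x² dx = 1/3
assert lhs <= rhs
```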

robjohn
  • You are also taking $x=f(x)$, which is a confusing notation. – M Turgeon Jul 16 '12 at 18:57
  • @MTurgeon: I have changed to $t=f(x)$. – robjohn Jul 16 '12 at 19:13
  • @robjohn I corrected a small typo that had me scratching my head for a few moments. I hope you don't mind. – Potato Jan 27 '13 at 05:48
  • @Potato: thanks. That was a residual error from the previous edit. – robjohn Jan 27 '13 at 09:17
  • @robjohn would any steps in the proof change if $\varphi$ was convex on $\mathbb{R}^n$ instead of $\mathbb{R}$? – cap Nov 15 '15 at 08:26
  • @cap: I don't think this proof is easily extended to multiple dimensions; it is very one-dimensional. However, if a multidimensional function is convex in each variable separately, then one should be able to apply Jensen one variable at a time. – robjohn Nov 15 '15 at 15:23
  • Why does $\Phi$ for $t_0$ exist? Don’t you need $f(X) \subseteq [a, b] \subseteq (c, d)$ and $\phi$ is convex on $[c, d]$? – froyooo Nov 07 '19 at 17:39
  • The left side of $(3)$ is $\le$ the right side. Thus, a $\Phi$ exists, but it may not be unique. – robjohn Nov 07 '19 at 19:37
  • Right before $(2)$ we have $t_0\in f(X)$ and then, at $(5)$, $t_0 = \int_Xf$, which may not belong in $f(X)$. The actual requirement for $t_0$ is to be in the domain of $\varphi$. So, what is missing here is the reason why the integral of $f$ is in the domain of $\varphi$. – Leandro Caniglia May 02 '20 at 21:14
  • Because $\int_X1\,\mathrm{d}x=1$, it seems we need to extend the convexity of $\varphi$ to the convex hull of $f(X)$ for full generality. – robjohn May 03 '20 at 02:18
  • @robjohn Thanks for reviewing your answer. Please, take a look at the clarification I just added. – Leandro Caniglia May 03 '20 at 08:57
  • @Matematleta: what mistake did you correct? – robjohn Jul 26 '22 at 22:47
  • @Matematleta: Ah, that just showed up an hour ago. Thanks. – robjohn Jul 27 '22 at 03:04

One way would be to apply the finite Jensen's inequality $$\varphi\left(\frac{\sum a_i x_i}{\sum a_j}\right) \le \frac{\sum a_i \varphi (x_i)}{\sum a_j}$$ to each Riemann sum. The finite inequality is itself easily proved by induction on the number of points, using the definition of convexity.
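This approach is easy to sketch numerically (an illustration only, with hypothetical choices: $\varphi(t)=e^t$, $f(x)=x$ on $[0,1]$, and equal weights $a_i=1/n$ at midpoint sample points):

```python
# Sketch of the Riemann-sum approach to Jensen (illustrative choices:
# φ(t) = e^t, f(x) = x on [0,1], equal weights a_i = 1/n).
import math

phi = math.exp
f = lambda x: x

for n in (10, 100, 1000):
    xs = [(i + 0.5) / n for i in range(n)]      # midpoint Riemann sample
    lhs = phi(sum(f(x) for x in xs) / n)        # φ(Σ aᵢ f(xᵢ) / Σ aⱼ)
    rhs = sum(phi(f(x)) for x in xs) / n        # Σ aᵢ φ(f(xᵢ)) / Σ aⱼ
    assert lhs <= rhs                           # finite Jensen at every n
    print(n, lhs, rhs)
```

At every $n$ the finite inequality holds, and the two sides converge to $\varphi\left(\int_0^1 f\right)=e^{1/2}\approx 1.6487$ and $\int_0^1 e^x\,dx=e-1\approx 1.7183$ respectively.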


I like this, maybe it is what you want ...

Let $E$ be a separable Banach space, let $\mu$ be a probability measure defined on $E$, let $f : E \to \mathbb R$ be convex and (lower semi-)continuous. Then $$ f\left(\int_E x\,d\mu(x)\right) \le \int_E f(x)\,d\mu(x) . $$ Of course we assume $\int_E x\,d\mu(x)$ exists, say for example $\mu$ has bounded support.

For the proof, use Hahn-Banach. Write $y = \int_E x\,d\mu(x)$. The super-graph $S=\{(x,t) : t \ge f(x)\}$ is closed and convex. (Closed, because $f$ is lower semicontinuous; convex, because $f$ is convex.) So for any $\epsilon > 0$, by Hahn-Banach I can separate $(y,f(y)-\epsilon)$ from $S$. That is, there is a continuous linear functional $\phi$ on $E$ and a scalar $s$ so that $t \ge \phi(x)+s$ for all $(x,t) \in S$ and $\phi(y)+s > f(y)-\epsilon$. So: $$ f(y) -\epsilon < \phi(y)+s = \phi\left(\int_E x\,d\mu(x)\right)+s = \int_E (\phi(x)+s)\,d\mu(x) \le \int_E f(x)\,d\mu(x) . $$ This is true for all $\epsilon > 0$, so we have the conclusion.
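In finite dimensions the separating functional is just a supporting hyperplane, which makes the argument concrete (a sketch with illustrative choices, not from the answer: $E=\mathbb R^2$, a discrete probability measure, and $f(x)=\lVert x\rVert^2$, whose gradient supplies $\phi$ and $s$ explicitly):

```python
# Finite-dimensional sketch of the separation argument (illustrative data:
# E = R², a discrete probability measure μ, and the convex continuous
# f(x) = |x|²; points and weights are hypothetical).
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
wts = [0.1, 0.2, 0.3, 0.4]                       # μ: probability weights

f = lambda p: p[0]**2 + p[1]**2

# y = ∫_E x dμ(x), the barycenter of μ
y = (sum(w * p[0] for w, p in zip(wts, pts)),
     sum(w * p[1] for w, p in zip(wts, pts)))

# A supporting hyperplane of the super-graph S at (y, f(y)): for f(x) = |x|²
# the gradient at y is 2y, so φ(x) = ⟨2y, x⟩ and s = f(y) - φ(y) give
# f(x) ≥ φ(x) + s for all x, with equality at x = y.
grad = (2 * y[0], 2 * y[1])
phi = lambda p: grad[0] * p[0] + grad[1] * p[1]
s = f(y) - phi(y)
assert all(f(p) >= phi(p) + s - 1e-12 for p in pts)

# Jensen: f(y) = φ(y) + s = ∫(φ(x)+s) dμ ≤ ∫ f dμ
lhs = f(y)
rhs = sum(w * f(p) for w, p in zip(wts, pts))
assert lhs <= rhs
```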


GEdgar

Here's a nice proof:

Step 1: Let $\varphi$ be a convex function on the interval $(a,b)$. For $t_0\in (a,b)$, prove that there exists $\beta\in\mathbb{R}$ such that $\varphi(t)-\varphi(t_0)\geq\beta(t-t_0)$ for all $t\in(a,b)$.

Step 2: Take $t_0=\int_a^b f\,dx$ and $t=f(x)$ (normalizing so that $b-a=1$), and integrate with respect to $x$ to prove the desired inequality.
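The two steps can be illustrated numerically (illustrative choices only: $\varphi(t)=\lvert t-\tfrac12\rvert$, which is convex but not differentiable at $t_0$, so $\beta$ genuinely has to be chosen between the one-sided slopes; $f(x)=x$ on $(0,1)$):

```python
# Numerical illustration of the two steps (illustrative choices:
# φ(t) = |t - 1/2|, convex but not differentiable at 1/2; f(x) = x on (0,1)).
phi = lambda t: abs(t - 0.5)
f = lambda x: x

n = 100_000
xs = [(i + 0.5) / n for i in range(n)]           # midpoint sample of (0,1)
t0 = sum(f(x) for x in xs) / n                   # Step 2's t0 = ∫_0^1 f dx ≈ 1/2

# Step 1: any β between the one-sided difference-quotient limits works;
# at t0 = 1/2 these are -1 and +1, so β = 0 is a valid choice.
beta = 0.0
assert all(phi(f(x)) - phi(t0) >= beta * (f(x) - t0) for x in xs)

# Step 2: integrate the supporting-line inequality with respect to x.
lhs = phi(t0)                                    # ≈ 0
rhs = sum(phi(f(x)) for x in xs) / n             # ≈ ∫_0^1 |x - 1/2| dx = 1/4
assert lhs <= rhs
```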

M Turgeon
  • 10,785

Issue

This answer assumes that the integral belongs to the domain of $\varphi$, i.e., $$ \int_X\!f(x)\;dx \in {\rm dom}(\varphi). \tag{1} $$ Even more, the statement we want to prove involves evaluating $\varphi$ at the value of the integral. So a question remains: why does $(1)$ hold?

Let $a<b$ be the ends of the interval $I$ where $\varphi$ is defined (they may or may not belong to $I$). In order to show $(1)$ we need a hypothesis, namely $$ \int_{\{f\notin I\}}dx = 0, \tag2 $$ i.e., $f\in I\rm\ a.e.$ (which does happen if $f(X)\subseteq{\rm dom}(\varphi)=I$)

Because of hypothesis $(2)$, to prove $(1)$ it is enough to show $$ \int_{\{f \in I\}}f(x)\;dx \in I. $$ Since $$ 1 = \int_X dx = \int_{\{f\in I\}}dx + \int_{\{f\notin I\}}dx = \int_{\{f\in I\}}dx, $$ we obtain $$ a = \int_{\{f\in I\}} a\;dx \le \int_{\{f\in I\}}f(x)\;dx $$ Now, assume $a\notin I$ and $$ \int_{\{f\in I\}} f(x)\;dx = \int_Xf(x)\;dx = a.\tag3 $$ Then we would also have $$ \int_{\{f \ge a\}}f(x)\;dx = \int_{\{f\in I\}}f(x)\;dx = a = \int_{\{f\ge a\}}a\,dx.\tag4 $$ But $(4)$ means that $f = a,\rm\ a.e.$, which contradicts our hypothesis because $a\notin I$.

Thus either $\int_X f(x)\,dx > a$, or $\int_X f(x)\,dx = a$ and $a\in I$. Similarly, either $\int_X f(x)\,dx < b$, or $\int_X f(x)\,dx = b$ and $b\in I$.

In any case $(1)$ does hold and we can proceed as in the answer of reference.
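A toy discrete check of the point being made (the numbers are hypothetical: one atom carries weight $0$ and an $f$-value outside $I$, mimicking the measure-zero set $\{f\notin I\}$; the convex hull of $f(X)$ is enlarged, but the integral still lands in $I$):

```python
# Toy check that ∫ f lands in I when f ∈ I a.e. (hypothetical data:
# the value 5.0 lies outside I but carries measure zero, so it enlarges
# the hull of f(X) without affecting the integral).
I = (0.0, 1.0)                                   # open interval (a, b) = dom(φ)
vals = [0.2, 0.7, 5.0]                           # f-values; 5.0 ∉ I
wts  = [0.4, 0.6, 0.0]                           # the offending value has measure 0

integral = sum(w * v for w, v in zip(wts, vals))
assert I[0] < integral < I[1]                    # (1): ∫ f ∈ dom(φ) = I
```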

  • It looks as if you are showing that $\int_Xf(x)\,\mathrm{d}x\in\mathcal{K}(f(X))$. – robjohn May 03 '20 at 09:31
  • @robjohn It depends on your hypothesis, because ${\cal K}(f(X))$ could be "artificially" enlarged by means of a set of measure zero. This is why I've chosen the weaker assumption that $\{f\notin I\}$ has measure $0$. Not a big deal anyway, I just wanted to clarify your answer a little further. – Leandro Caniglia May 03 '20 at 09:41