
Bennett's Inequality is stated with a rather unintuitive function,

$$ h(u) = (1+u) \log(1+u) - u $$

See here. I have seen in multiple places that Bernstein's Inequality, while slightly weaker, can be obtained by bounding $h(u)$ from below,

$$ h(u) \ge \frac{ u^2 }{ 2 + \frac{2}{3} u} $$

and plugging it back into Bennett's Inequality. However, I can't see where this expression comes from. Could someone point me in the right direction?
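Before looking for a derivation, it can help to check the claimed bound numerically. The following is only a sanity check on a finite grid, not a proof; the function names `h` and `bernstein_lower` and the grid itself are choices made for this illustration.

```python
import math

def h(u):
    """Bennett's function h(u) = (1 + u) log(1 + u) - u, for u > -1."""
    return (1.0 + u) * math.log1p(u) - u

def bernstein_lower(u):
    """The claimed lower bound u^2 / (2 + 2u/3)."""
    return u * u / (2.0 + 2.0 * u / 3.0)

# Grid of u values in [0.01, 100]; an arbitrary choice for illustration.
grid = [0.01 * k for k in range(1, 10001)]
gaps = [h(u) - bernstein_lower(u) for u in grid]

# If the bound holds on the grid, the smallest gap is non-negative.
print(min(gaps) >= 0.0)
```

A non-negative minimum gap is consistent with the inequality on this grid, but says nothing about points outside it.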

River Li
  • 49,125
duckworthd
  • 1,145

5 Answers

2

Actually, the OP is asking where $\frac{u^2}{2 + \frac23 u} $ comes from.

The answer is: $f(u) = \frac{u^2}{2 + \frac23 u}$ is the Padé $(2,1)$ approximation of $h(u)$ at $u = 0$.

As for the proof that the Padé $(2,1)$ approximation is a lower bound, let $$g(u) = \ln (1 + u) - \frac{1}{1 + u}\left(u + \frac{u^2}{2 + \frac23 u}\right).$$ We have $$g'(u) = \frac{u^3}{(1 + u)^2(3 + u)^2}.$$ Note that $g'(u) < 0$ on $(-1, 0)$, $g'(u) > 0$ on $(0, \infty)$, and $g(0) = 0$. Thus, $g(u) \ge 0$ on $(-1, \infty)$. Since $h(u) - f(u) = (1 + u)\, g(u)$ and $1 + u > 0$, it follows that $h(u) \ge f(u)$ on $(-1, \infty)$. We are done.
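To see the Padé claim concretely, here is a minimal sketch that solves for the coefficients by hand in exact rational arithmetic. It assumes the Taylor coefficients of $h$ at $0$ are $c_0 = c_1 = 0$ and $c_n = (-1)^n / (n(n-1))$ for $n \ge 2$; the variable names are made up for this illustration.

```python
from fractions import Fraction

def taylor_coeff(n):
    """Coefficient of u^n in the Taylor series of h(u) at u = 0."""
    if n < 2:
        return Fraction(0)
    return Fraction((-1) ** n, n * (n - 1))

c = [taylor_coeff(n) for n in range(4)]

# Pade (2,1) form: (a0 + a1 u + a2 u^2) / (1 + b1 u).
# Multiply out (1 + b1 u)(c0 + c1 u + c2 u^2 + c3 u^3) and match powers of u:
# the u^3 coefficient must vanish, which fixes b1.
b1 = -c[3] / c[2]
a0 = c[0]
a1 = c[1] + b1 * c[0]
a2 = c[2] + b1 * c[1]

print(a0, a1, a2, b1)
```

This gives $a_0 = a_1 = 0$, $a_2 = \frac12$, $b_1 = \frac13$, i.e. $\frac{u^2/2}{1 + u/3} = \frac{u^2}{2 + \frac23 u}$.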

River Li
  • 49,125
2

This is more like a comment. Define $f(x)=(1+x) \log(1+x) - x- \frac{ x^2 }{ 2 + \frac{2}{3} x}$. You can see that $f(0)=f'(0)=f''(0)=0$. Moreover, $f''(x)$ is a ratio of polynomials, and one can check that $f''(x)>0$ for $x>0$. Hence $f'(x)>0$ and $f(x)>0$ for $x>0$ as well.

This is an ugly way to prove it but I do not see another way for the moment.

Arash
  • 11,307
2

Here is how the approximation could be derived, using the Taylor series of $\log(1+u)$ (valid for $|u| < 1$).

$\begin{align}
h(u) &=(1+u)\log(1+u)-u\\
&=(1+u)\sum_{n=1}^{\infty} \dfrac{(-1)^{n-1}u^n}{n}-u\\
&=\sum_{n=1}^{\infty} \dfrac{(-1)^{n-1}u^n}{n} +u\sum_{n=1}^{\infty} \dfrac{(-1)^{n-1}u^n}{n}-u\\
&=\sum_{n=1}^{\infty} \dfrac{(-1)^{n-1}u^n}{n} +\sum_{n=1}^{\infty} \dfrac{(-1)^{n-1}u^{n+1}}{n}-u\\
&=\sum_{n=1}^{\infty} \dfrac{(-1)^{n-1}u^n}{n} +\sum_{n=2}^{\infty} \dfrac{(-1)^{n}u^{n}}{n-1}-u\\
&=u+\sum_{n=2}^{\infty} \dfrac{(-1)^{n-1}u^n}{n} +\sum_{n=2}^{\infty} \dfrac{(-1)^{n}u^{n}}{n-1}-u\\
&=\sum_{n=2}^{\infty} u^n\left(\dfrac{(-1)^{n-1}}{n} +\dfrac{(-1)^{n}}{n-1}\right)\\
&=\sum_{n=2}^{\infty} (-1)^{n}u^n\left(\dfrac{-1}{n} +\dfrac{1}{n-1}\right)\\
&=\sum_{n=2}^{\infty} (-1)^{n}u^n\left(\dfrac{1}{(n-1)n}\right)\\
&=\sum_{n=2}^{\infty} \dfrac{(-1)^{n}u^n}{(n-1)n}\\
&= \dfrac{u^2}{2}-\dfrac{u^3}{6}+\dfrac{u^4}{12} -\dfrac{u^5}{20}\pm \cdots
\end{align}$

If we just look at the first two terms,

$\dfrac{u^2}{2}-\dfrac{u^3}{6} =\dfrac{u^2}{2}\left(1-\dfrac{u}{3}\right) $, and since $\dfrac1{1+z} =1-z+z^2 \mp \cdots $, we have $1-\dfrac{u}{3} \sim \dfrac1{1+u/3} $ to first order in $u$.

Therefore $h(u) \sim \dfrac{u^2}{2}\dfrac1{1+u/3} = \dfrac{u^2}{2+2u/3} $.

This shows how the approximation could be derived. The next step is to see how accurate the approximation is. Let $h^*(u) = \dfrac{u^2}{2}\dfrac1{1+u/3} = \dfrac{u^2}{2+2u/3} $.

Expanding the approximation,

$\begin{align}
h^*(u) &= \dfrac{u^2}{2}\dfrac1{1+u/3}\\
&=\dfrac{u^2}{2}\left(1-\dfrac{u}{3}+\dfrac{u^2}{9} -\dfrac{u^3}{27}\pm \cdots\right)\\
&=\dfrac{u^2}{2}-\dfrac{u^3}{6}+\dfrac{u^4}{18} -\dfrac{u^5}{54}\pm \cdots
\end{align}$

Therefore

$\begin{align} h(u)-h^*(u) &=(\dfrac{u^4}{12} -\dfrac{u^5}{20}\pm ...) -(\dfrac{u^4}{18} -\dfrac{u^5}{54}\pm ...)\\ &=u^4(\dfrac{1}{12}-\dfrac{1}{18}) -u^5(\dfrac{1}{20}-\dfrac{1}{54})\pm ...\\ &=\dfrac{u^4}{36} -\dfrac{17u^5}{540}\pm ... \\ \end{align} $

It certainly looks like $h(u) > h^*(u)$, at least for small $u > 0$. Once you have this conjecture, you can try to prove it. For small $u > 0$, a proof can be developed by looking at the successive terms in the difference and showing that they alternate in sign and decrease in magnitude. Larger $u$ needs a different argument, since the series for $h$ diverges for $u > 1$.

Note that if we let $h^{**}(u) = \dfrac{u^2}{2}-\dfrac{u^3}{6} $, $h(u)-h^{**}(u) =\dfrac{u^4}{12} -\dfrac{u^5}{20}\pm ... $, and this appears to have an error about three times as large.
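The coefficient arithmetic above can be double-checked with exact fractions. This is just a sketch: `h_coeff` and `h_star_coeff` are names invented here, using the coefficient $(-1)^n/((n-1)n)$ for $h$ and the geometric-series coefficient $(-1)^n/(2 \cdot 3^{n-2})$ for $h^*$.

```python
from fractions import Fraction

def h_coeff(n):
    """Coefficient of u^n in h(u), for n >= 2."""
    return Fraction((-1) ** n, n * (n - 1))

def h_star_coeff(n):
    """Coefficient of u^n in h*(u) = (u^2 / 2) / (1 + u/3), for n >= 2."""
    return Fraction((-1) ** n, 2 * 3 ** (n - 2))

# Coefficients of h(u) - h*(u) for the first few powers of u.
diff = {n: h_coeff(n) - h_star_coeff(n) for n in range(2, 8)}
for n, d in diff.items():
    print(n, d)

# Ratio of the leading error terms of h** and h*.
print(h_coeff(4) / diff[4])
```

The $u^2$ and $u^3$ coefficients cancel, the next two come out as $\frac{1}{36}$ and $-\frac{17}{540}$, and the last ratio is $3$, matching the "about three times as large" remark.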

marty cohen
  • 110,450
1

One rather clean solution (which happens to generalise pretty nicely to some other relevant settings) is to write

\begin{align} h(u) &= (1+u) \log(1+u) - u \\ &= \int_{0\leq v \leq u} \log(1+v) \, \mathrm{d}v \\ &=\int_{0\leq w\leq v \leq u} \frac{\mathrm{d}w \mathrm{d} v}{1+w} \\ &\geq \int_{0\leq w\leq v \leq u} \frac{\mathrm{d}w \mathrm{d} v}{\left( 1+\frac{w}{3} \right)^3} \\ &= \frac{\frac{1}{2} u^2}{1+\frac{u}{3}}, \end{align}

where the passage from line 3 to 4 uses Bernoulli's inequality (i.e. that $(1 + x)^p \geq 1 + p x$ for $x \geq -1, p \geq 1$), applied with $x = w/3$ and $p = 3$ to get $\left(1 + \frac{w}{3}\right)^3 \geq 1 + w$, and the remainder follows from elementary calculus.
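For readers who want to check the "elementary calculus" part, here is a rough numerical sketch. It assumes the inner integral $\int_0^v (1 + w/3)^{-3}\,\mathrm{d}w = \frac32\left(1 - (1 + v/3)^{-2}\right)$, then integrates that in $v$ with a plain midpoint rule; the helper names and the test point $u = 2.5$ are arbitrary.

```python
def inner_exact(v):
    """Closed form of the integral of (1 + w/3)^(-3) over w in [0, v]."""
    return 1.5 * (1.0 - (1.0 + v / 3.0) ** -2)

def midpoint_integral(func, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return width * sum(func(a + (k + 0.5) * width) for k in range(steps))

u = 2.5  # arbitrary test point
outer = midpoint_integral(inner_exact, 0.0, u)
closed_form = 0.5 * u * u / (1.0 + u / 3.0)
print(outer, closed_form)

# Bernoulli's inequality with x = w/3, p = 3, checked on a small grid.
bernoulli_ok = all((1 + w / 3) ** 3 >= 1 + w for w in [0.1 * k for k in range(101)])
print(bernoulli_ok)
```

The two printed numbers should agree up to the discretisation error of the midpoint rule.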

This is admittedly not so different from looking at the derivatives of $h$, but the specific choice to compare to something of the form $\left( a + b \cdot u \right)^{-3}$ is particularly useful when working with concentration estimates of this character. There are some related exercises in the book of Boucheron-Lugosi-Massart, if I remember correctly.

πr8
  • 11,040
0

While the answer by @Arash outlines the general strategy of deriving the inequality, the explicit computation is missing. I am providing it here in the hope that it will be helpful for future readers.

The actual computation of the derivative is greatly simplified by first rewriting the last term in the definition of
$$ f(x) = (1 + x) \ln(1 + x) - x - \frac{x^2}{2 + \frac{2}{3} x} $$
as follows:
$$ \frac{x^2}{2 + \frac{2}{3} x} = \frac{3}{2} \frac{x^2}{3 + x} = \frac{3}{2} \frac{x \cdot (3 + x) - 3 x}{3 + x} = \frac{3}{2} x - \frac{9}{2} \frac{x}{x + 3}. $$
Therefore, we see
\begin{align*}
\require{cancel}
f(x) & = (1 + x) \ln (1 + x) - \frac{5}{2} x + \frac{9}{2} \frac{x}{x + 3} , \\
f'(x) & = \ln(1 + x) + 1 - \frac{5}{2} + \frac{9}{2} \frac{x+3 - x}{(x+3)^2} \\
& = \ln(1 + x) - \frac{3}{2} + \frac{3^3}{2} (x+3)^{-2}, \\
f''(x) & = \frac{1}{1 + x} - 3^3 \cdot (x+3)^{-3} \\
& = \frac{(x+3)^3 - 27 \cdot (1 + x)}{(1 + x) (3+x)^3} \\
& = \frac{x^{3} + 9 x^{2} + \bcancel{27 x} + \cancel{27} - \cancel{27} - \bcancel{27 x}} {(1 + x) (3+x)^3} \\
& = \frac{x^{3} + 9 x^{2}} {(1 + x) (3+x)^3} \geq 0
\end{align*}
for $x \geq 0$, which is the only case we are interested in.

Since $f(0) = 0$ and $f'(0) = 0$, this implies $f' \geq 0$ and then $f \geq 0$ on $[0,\infty)$.
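The algebra for $f''$ can be spot-checked in exact arithmetic. This sketch compares the unsimplified and simplified forms of $f''(x)$ at a handful of rational points (chosen arbitrarily) and checks the non-logarithmic part of $f'(0)$; the function names are invented for this illustration.

```python
from fractions import Fraction

def f2_direct(x):
    """f''(x) as first obtained: 1/(1 + x) - 27/(x + 3)^3."""
    return 1 / (1 + x) - 27 / (x + 3) ** 3

def f2_factored(x):
    """f''(x) after simplification: (x^3 + 9 x^2) / ((1 + x)(3 + x)^3)."""
    return (x ** 3 + 9 * x ** 2) / ((1 + x) * (3 + x) ** 3)

# Rational test points 0, 1/4, 1/2, ..., 10.
points = [Fraction(k, 4) for k in range(41)]
forms_agree = all(f2_direct(x) == f2_factored(x) for x in points)
nonnegative = all(f2_factored(x) >= 0 for x in points)
print(forms_agree, nonnegative)

# f'(0) = ln(1) - 3/2 + (27/2) * 3^(-2); the part without the log is:
f1_at_zero = Fraction(-3, 2) + Fraction(27, 2) * Fraction(1, 9)
print(f1_at_zero)
```

Agreement at finitely many points is of course not a proof of the identity, but it catches sign or coefficient slips quickly.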

PhoemueX
  • 36,211