
Let $f : \mathbb{R} \rightarrow \mathbb{R}$ be a differentiable function. Given the following two definitions of convexity of $f$, prove that (i) implies (ii):

(i) $\forall x, y \in \mathbb{R} : f(x) \ge f(y) + f'(y)(x - y)$

(ii) $\forall x, y \in \mathbb{R}, \forall \lambda \in [0, 1] : f(\lambda x + (1 - \lambda)y) \le \lambda f(x) + (1 - \lambda)f(y)$


First I saw that, for $x > y$, (i) can be rewritten as $$f'(y) \leq \frac{f(x)-f(y)}{x-y} \,\,\, (*)$$ So the slope of the secant from $y$ to a greater point $x$ is always at least the slope of the tangent at $y$.
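(For $x < y$, dividing by the negative $x - y$ flips the inequality, so there one gets $$f'(y) \geq \frac{f(x)-f(y)}{x-y}$$ instead.)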

I have tried writing $f(\lambda x + (1 - \lambda)y) = f(y+\lambda(x-y))$ as $f(y)$ plus, so to speak, the sum of all $f'(y + \epsilon \cdot n) \cdot \epsilon$, where $\epsilon \rightarrow 0$ and $n$ needs to be defined correctly, of course. So just the "starting point" $f(y)$ and then every point with its slope until we reach $f(y+\lambda(x-y))$. That slope I would estimate using $(*)$ and get an inequality. But this doesn't work, since I would need to apply $(*)$ to the actual points $x$ and $y$, and that of course doesn't work.
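(Made precise, and under the extra assumption that $f'$ is continuous so the fundamental theorem of calculus applies, this idea would be $$f(y+\lambda(x-y)) = f(y) + \int_0^{\lambda} f'\big(y + t(x-y)\big)\,(x-y)\,dt,$$ and I would then try to bound the integrand using $(*)$.)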


I can't come up with a different attempt, and the other questions on the internet don't have a derivative in them. They all use different definitions of convexity than the ones here.

  • in (i) set $y:= \lambda \hat x+ (1-\lambda) \hat y$ and $x:=\hat x$, $x:=\hat y$ in turn, where $\hat x$, $\hat y$ refer to the points in property (ii) – daw May 27 '24 at 15:08
  • honestly, this was certainly asked before on this site. I could not find a duplicate, though. – daw May 27 '24 at 15:08
  • @daw I don't understand. Set $x$ to $\hat x$? Or to $\hat y$? Or both? I tried both and I just get a long inequality which does not lead me to anything useful – mathematics-and-caffeine May 27 '24 at 15:39

1 Answer


We want to show that (i) implies (ii). Assume that (ii) does not hold. Then there exist two points $a, b$ and a scalar $\mu \in (0, 1)$ such that $$ f(x_\mu) > \mu f(a) +(1 -\mu)f(b) $$ with $x_\mu = \mu a +(1 -\mu) b$. We want to show that this is impossible. We first apply (i) to $a$ and $x_\mu$ and then to $b$ and $x_\mu$. We have $$ f(a) \geq f(x_\mu) +f'(x_\mu)(a -x_\mu) = f(x_\mu) +(1 -\mu)f'(x_\mu)(a -b) \tag{$\ast$} $$ and $$ f(b) \geq f(x_\mu) +f'(x_\mu)(b -x_\mu) = f(x_\mu) -\mu f'(x_\mu)(a -b). \tag{$\ast\ast$} $$ We multiply $(\ast)$ by $\mu$ and $(\ast\ast)$ by $1 -\mu$ and then take their sum. We obtain $$ \mu f(a) +(1 -\mu)f(b) \geq f(x_\mu), $$ which contradicts the initial assumption.
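Spelling out that last step, the derivative terms cancel: $$ \mu f(a) +(1 -\mu)f(b) \;\geq\; f(x_\mu) + \big[\mu(1-\mu) - (1-\mu)\mu\big]\, f'(x_\mu)(a-b) \;=\; f(x_\mu). $$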

Daniel N
  • I don't understand how we obtain $(\ast\ast)$. Since $b-x_{\mu}$ is negative, we actually would subtract something. By replacing it with the derivative term, we subtract something smaller than before, so the result would be bigger. In $(\ast)$ we have everything $\geq 0$, so that works, but in $(\ast\ast)$ I don't see it working! – mathematics-and-caffeine May 28 '24 at 10:16
  • Being positive or negative does not matter here; it's a computation. You have $b -x_\mu = b -\mu a -(1 -\mu)b = \mu (b -a)$. – Daniel N May 28 '24 at 18:43