
Consider a real-valued function $f : \mathbb{R} \to \mathbb{R}$. Given any $x \in \mathbb{R}$, we can compute $f(x)$. We do not know the analytical form of $f$, so it should be treated as a black box. However, we do know that $f$ is first-order differentiable and can (numerically) compute its derivative $f'$.

Question: Is there an efficient way to numerically check whether $f$ is convex?

Ideally, I would use something like a second-derivative test; however, there is no guarantee that $f$ is twice differentiable, and the method needs to generalize to arbitrary $C^1$ functions.


One "brute force" idea was to literally apply the definition of convexity, $$ f(\lambda x_1 + (1-\lambda)x_2) \leq \lambda f(x_1) + (1-\lambda)f(x_2) $$ for all $\lambda \in [0, 1]$ and $x_1, x_2 \in \mathbb{R}$. However, this is somewhat impractical: it may require testing many points across $\mathbb{R}$, and there might be local regions where convexity is violated that such a test fails to detect.
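To illustrate, here is a rough sketch of this sampling test (the domain, trial count, and tolerance below are arbitrary choices on my part):

```python
import numpy as np

def find_convexity_violation(f, domain=(-10.0, 10.0), n_trials=100_000,
                             tol=1e-9, seed=0):
    """Randomly sample the defining inequality of convexity.

    Returns a witness (lam, x1, x2) if a violation is found, otherwise
    None. Note that None does NOT certify convexity; it only means no
    violation was found among the sampled points.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_trials):
        x1, x2 = rng.uniform(*domain, size=2)
        lam = rng.uniform()
        # Check f(lam*x1 + (1-lam)*x2) <= lam*f(x1) + (1-lam)*f(x2)
        if f(lam * x1 + (1 - lam) * x2) > lam * f(x1) + (1 - lam) * f(x2) + tol:
            return lam, x1, x2
    return None

# e.g. f(x) = x**4 - x**2 is non-convex near 0; sampling will usually
# find a witness, but is not guaranteed to.
print(find_convexity_violation(lambda x: x**4 - x**2))
```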

Adam
  • This seems like a hard problem. Even if $f$ is twice differentiable, the second derivative could be positive everywhere except on a very small interval which, if you only have query access to $f''$, will be hard to find. If you're looking for approximate methods, I think it's best to use the first-order characterization of convexity instead of the usual definition. – George Giapitzakis Mar 11 '25 at 03:16
  • If you restricted the set of functions of interest to polynomial functions, you could use quantifier elimination (QE), which is in the realm of computational real algebraic geometry, rather than numerical analysis. However, QE may not meet your expectations regarding efficiency. – Rodrigo de Azevedo Mar 14 '25 at 10:38

4 Answers


No. It's impossible for any finite algorithm. The reason is exactly the one you anticipated: there may be a tiny interval where the algorithm never evaluated $f$ or $f'$, and where the function is non-convex.

It's easy to formally prove that no such algorithm can exist, by a standard adversary argument. Assume for simplicity that your algorithm is deterministic (the argument extends to randomized algorithms as well, with a slightly messier proof). Let $f$ be some convex function. Suppose your algorithm has evaluated $f$ and $f'$ at some set of points, say $x_1 < x_2 < \dots < x_n$, and has terminated with the conclusion that $f$ is convex.

Now pick any index $i$ and define a new point $x'=(x_i+x_{i+1})/2$. Define a new function $g$ that is identical to $f$ except on a tiny interval $(x'-\epsilon,x'+\epsilon)$, with $\epsilon$ small enough that the interval contains none of the $x_j$, and make $g$ non-convex on that interval ($g$ can even be kept $C^1$, e.g. by adding a small smooth bump to $f$ there). Running the algorithm on $g$ will evaluate $g$ and $g'$ at exactly the same points $x_1,\dots,x_n$, and at those points $f$ and $g$ are identical, hence indistinguishable. Therefore, when run on $g$, your algorithm will also terminate with the conclusion that $g$ is convex. But $g$ is by construction not convex, so your algorithm produces the wrong answer when run on $g$. This proves that no algorithm can be correct on all $C^1$ functions while always terminating after finitely many steps.
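To make the adversary concrete, here is a small numerical illustration (the choice of $f(x)=x^2$, the dent location, the bump shape, and the grid are all just for illustration):

```python
import numpy as np

# Concrete adversary: f is convex, g = f + a tiny C^1 bump is not,
# yet g agrees with f at every point a fixed grid ever queries.
eps = 1e-4            # half-width of the dent
c = 0.501             # dent centre, chosen to fall between grid points
delta = eps**2        # bump height, enough to make g'' < 0 at the centre

def f(x):
    return x * x      # genuinely convex

def g(x):
    t = (x - c) / eps
    bump = np.where(np.abs(t) < 1, (1 - t**2) ** 2, 0.0)
    return x * x + delta * bump   # C^1, non-convex near c

grid = np.linspace(-1.0, 1.0, 1001)        # spacing 0.002 >> 2*eps
print(bool(np.all(f(grid) == g(grid))))    # True: identical at every query

# The midpoint test inside the dent exposes the non-convexity of g:
a, b = c - eps / 2, c + eps / 2
print(bool(g((a + b) / 2) <= (g(a) + g(b)) / 2))   # False
```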

D.W.
  • This argument was very intuitive and clear. I suppose I'll have to use an approximate method. – Adam Mar 12 '25 at 09:41
  • One wrinkle in this is that the "adversary" will not necessarily have knowledge of the algorithm being run when trying to construct the function that will beat it. In any case, your broad point is correct. – Ben Mar 13 '25 at 11:14
  • @Ben, Thank you for the kind words! The proof is valid, and that's not an issue. I am sketching a formal proof that the algorithm is not correct on all inputs, i.e., for every convex function $f$ on which the algorithm is correct, there exists a function $g$ on which the algorithm is incorrect. The algorithm is fixed. So for the purposes of what I am proving (i.e., proving the existence of $g$), it is totally fair game for $g$ to "depend" on the algorithm. – D.W. Mar 13 '25 at 16:01
  • Yes, my point is more around what happens when you instead treat this as a (turn based) strategic game between an algorithm-designer and an adversary. (It is still easy for the latter to win, but they can't just construct a function with knowledge of the algorithm being used against them.) – Ben Mar 13 '25 at 20:28

Brute force is as effective as you will get, but it hinges on the choice of the "test set"

Since your function is a black-box, and you are unwilling to stipulate any structure for it beyond the fact that it is first-order differentiable, any computational check of convexity is essentially going to boil down to a "brute force" check of convexity over some limited number of distinct input points $x_1 < \cdots < x_n$ (which I will call the "test set").

In your case you can compute the function value and the first derivative at each point, so there are essentially two things you need to confirm: (1) that the function is convex with respect to every triple of points;$^\dagger$ and (2) that the first derivative is non-decreasing across consecutive points. It is worth noting that the first test is significantly less computationally intensive than your question suggests. Because you are dealing with a univariate function, you only need to check the convexity of consecutive triples of points: for a univariate function, if all consecutive triples are convex then all triples are convex. This means you need $n-2$ convexity checks and $n-1$ derivative checks, so the total cost is linear in $n$. (A sketch of both checks appears after the footnote below.)

Suppose you do that and you confirm convexity over the test set. That is great, but of course the test set is only a finite set of values in the domain of the function. Ultimately, you will have to decide what counts as a "good" way to generate the test set, which requires some assumptions about which parts of the domain are "better" to test. With no further information about the function this is an arbitrary choice, though you might prefer a test set that is highly diffuse (i.e., has high variance). If all you have is a black-box function over the entire real line, then nothing can really mitigate this problem: no matter how many points you test, there will still be infinitely long regions of the real line that you did not test, and there will still be gaps between the points that you did test. There is always the possibility that convexity fails in an area you did not examine (either beyond the boundary points of the test set or within the gaps between its points).


$^\dagger$ To be clear, what I mean here by testing convexity for a triple of points is checking the inequality constraint that defines convexity, with the middle point of the triple as the interpolated point. For any triple of points $a<b<c$ in the test set, this means checking that:

$$f(b) \leqslant \frac{c-b}{c-a} \cdot f(a) + \frac{b-a}{c-a} \cdot f(c).$$
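For concreteness, here is a minimal sketch of the two tests described above (the tolerance and the vectorization choices are arbitrary):

```python
import numpy as np

def check_convexity_on_test_set(f, fprime, xs, tol=1e-12):
    """Linear-time convexity check over a sorted test set xs.

    Runs the n-2 consecutive-triple checks from the footnote and the
    n-1 derivative-monotonicity checks. Passing both certifies
    convexity only *on the test set*, not on the whole real line.
    """
    xs = np.sort(np.asarray(xs, dtype=float))
    fx = np.array([f(x) for x in xs])
    dfx = np.array([fprime(x) for x in xs])

    a, b, c = xs[:-2], xs[1:-1], xs[2:]
    fa, fb, fc = fx[:-2], fx[1:-1], fx[2:]
    # (1) chord test for each consecutive triple a < b < c
    chord = ((c - b) * fa + (b - a) * fc) / (c - a)
    triples_ok = bool(np.all(fb <= chord + tol))
    # (2) first derivative non-decreasing across consecutive points
    derivs_ok = bool(np.all(np.diff(dfx) >= -tol))
    return triples_ok and derivs_ok

# Example: exp is convex, and exp is its own derivative
xs = np.linspace(-5.0, 5.0, 1_000)
print(check_convexity_on_test_set(np.exp, np.exp, xs))  # True
```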

Ben
  • I'm convinced that an approximate solution will have to suffice. Would you mind explaining why we need to confirm (1) as well? If I recall correctly, a differentiable function $f$ is (weakly) convex on an interval $[a, b]$ if and only if $f'$ is monotonically non-decreasing on that interval. Would it be sufficient to check only (2), e.g. generate a grid over $[a, b]$ with a reasonably large number of points and check $f'$? – Adam Mar 12 '25 at 09:44
  • @Adam: That is true, but you are not checking (and cannot check) the derivative over a continuum. You can only compute the derivative at a finite number of distinct points, so you are only ever checking the endpoints of any interval. Taking any interval $[x_k,x_{k+1}]$ in the above case, even if $f'(x_k) \leqslant f'(x_{k+1})$ (and even if these endpoints come from a "fine grid"), there is no guarantee that the derivative does not decrease somewhere in the interior of the interval, so there is no guarantee that (1) is implied. – Ben Mar 12 '25 at 18:46
  • (Of course, if you are willing to make assumptions giving an upper bound on how quickly the derivatives can change over a grid of points, then that changes things.) – Ben Mar 12 '25 at 18:47
  • Thanks, that makes sense, but my initial impression from your answer was that this problem (convexity being violated in the interior of intervals) was unavoidable. By choosing a very "fine grid", we may be able to make these regions small enough for an approximate solution to be reasonable, but it's still always a problem. How does (1) help mitigate it? – Adam Mar 12 '25 at 21:02
  • I'm also not fully understanding what you mean by (1), "function is convex with respect to every triple of points". Do you mean checking, for example, $f(x^*_k) \geq f(x_k) + f'(x_{k+1})(x_k-x_{k+1})$, where $x^*_k \in [x_k, x_{k+1}]$ is some additional point in each interval? In other words, we generate a grid with intervals $\{[x_k, x_{k+1}]\}$, but the grid might not be fine enough, so we check an additional point $x^*_k$ in each interval for additional robustness? – Adam Mar 12 '25 at 21:06
  • Yes, the problem is unavoidable. Ultimately, if you just have a black-box function over the entire real line, then nothing can really mitigate it. No matter how many points you test, there will still be infinitely long regions of the real line that you did not test, and there will still be gaps between the points that you did test. Test (1) helps because it tests convexity over the points in the test set: if you find that convexity does not hold for some triple of points, then you have falsified convexity of the function. – Ben Mar 12 '25 at 21:35
  • (I have added a footnote in the answer to explain what I mean by testing a triple of points for convexity.) – Ben Mar 12 '25 at 21:44

Depending on your $f$, probably no; but if you have very strong information about the function, then it is possible, and there are techniques in rigorous computation that allow you to do this. As a couple of people have mentioned, if you know nothing about the function then it is impossible, since one can simply modify the function on a small interval that "evades" your mesh.

If you know something like analyticity, for instance, then something may be possible. There is a theorem that is widely used in rigorous computation. Suppose that you are interested in a function $f: \mathbb{R} \rightarrow \mathbb{R}$ on an interval $I$. Suppose that $f$ is analytic in an ellipse in $\mathbb{C}$ containing $I$, and that you can bound the maximum modulus of $f$ on the ellipse. Then you get a bound on the error of Chebyshev interpolation of the function on $I$ that decays exponentially in the number of nodes. This can be used to give a rigorous computer-assisted proof that (for instance) the second derivative is strictly positive.
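Here is a small non-rigorous illustration of the exponential bound (my toy choice of $f(x)=e^x$ on $[-1,1]$ with ellipse parameter $\rho = 2$; a genuine computer-assisted proof would replace floating point with interval arithmetic):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Toy illustration with f(x) = exp(x) on I = [-1, 1]. For the
# Bernstein ellipse E_rho with rho = 2, |exp(z)| is maximized on the
# real axis, at z = (rho + 1/rho)/2, giving the modulus bound M.
rho = 2.0
M = np.exp((rho + 1 / rho) / 2)

for n in (4, 8, 16):
    coeffs = C.chebinterpolate(np.exp, n)        # degree-n interpolant
    xs = np.linspace(-1.0, 1.0, 10_001)
    err = np.max(np.abs(np.exp(xs) - C.chebval(xs, coeffs)))
    bound = 4 * M * rho ** (-n) / (rho - 1)      # exponential decay in n
    print(f"n={n:2d}  measured error={err:.2e}  bound={bound:.2e}")
```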

For instance, suppose your $f(\lambda)$ arose from solving a linear ODE in which $\lambda$ enters as a parameter. Then $f(\lambda)$ is entire, and it is relatively straightforward to estimate $f$ in the complex $\lambda$-plane.

There is a paper by Tadmor in SIAM J. Numer. Anal. from 1986 (Volume 23, No. 1) that gives some relevant theorems.

Geardaddy

As others have noted, it's impossible to check this exactly, because small intervals can hide erratic behavior; but if you're just trying to do something reasonable, it's pretty easy to run an approximate empirical check.

If you only need to check $f$ on some interval $[a,b]$, you can subdivide it into equal intervals with endpoints $x_i=a+(b-a)\cdot i/N$ and then check adjacent triples, i.e. $f(x_{i-1})+f(x_{i+1})\geq 2f(x_i)$ for all $i$.

If you can compute $f'$ and it's not much more work, you can also check whether $f'$ is non-decreasing ($f'(x_i)\leq f'(x_{i+1})$). I don't think this gets you much more than checking triples, though, and it's probably easier to just do the former check with smaller intervals.

If you want any kind of guarantee, you're going to need to bound $f'''$ from below. If your estimates give that $f''\geq A$ (or, in terms of $f$ alone, that $f(x_{i-1})+f(x_{i+1})-2f(x_i)\geq A(x_{i+1}-x_i)^2$), and you can somehow bound $f'''\geq -B$, then grid spacing smaller than $A/B$ is enough to guarantee convexity. If you want to be clever and save computation, you can subdivide more finely in regions where the second difference is close to $0$ or where the third derivative is more negative, but that seems like more effort than it's worth.
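Here's a rough sketch of that certified check (the function and parameter names are mine, $f$ is assumed to accept numpy arrays, and the $h < A/B$ condition is the idealized version, since each second difference samples $f''$ near, not exactly at, a node):

```python
import numpy as np

def certify_convexity(f, a, b, B, N):
    """Sketch of the certified grid check, assuming f''' >= -B on [a, b].

    Each second difference equals f'' at some nearby point (Taylor),
    so if all of them are >= A > 0 and the spacing h is well below
    A / B, then f'' stays positive between samples: f'' >= A - B*h.
    """
    xs, h = np.linspace(a, b, N + 1, retstep=True)
    fx = f(xs)
    A = np.min((fx[:-2] - 2 * fx[1:-1] + fx[2:]) / h**2)
    return bool(A > 0 and h < A / B)

# Example: cosh has f'' = cosh >= 1 and f''' = sinh >= -sinh(2) on [-2, 2]
print(certify_convexity(np.cosh, -2.0, 2.0, B=np.sinh(2.0), N=10_000))  # True
```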

Eric