Gradient descent and stochastic gradient descent are frequently used to find stationary points (and in some cases even local minima) of nonconvex functions. I was wondering whether the same can be said about the subgradient method: would a subgradient method applied to a nonconvex nonsmooth function also find a stationary point? I know that the set of subgradients (in the convex-analysis sense) can be empty at some points in the nonconvex case, but I was thinking we could use a more general notion of gradient such as the Clarke subdifferential. I have a nonconvex function which is differentiable almost everywhere, and I was wondering whether such a generalized subgradient method would reach at least a local minimum. But I have not been successful in finding theory that supports my little experiment. Any help would be appreciated!
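To be concrete, here is a minimal sketch of the kind of experiment I mean, on a toy function of my own choosing (not my actual objective): f(x) = |x^2 - 1| is nonconvex, nonsmooth at x = ±1, and differentiable almost everywhere, so at almost every iterate the Clarke subdifferential is just the ordinary gradient.

```python
def f(x):
    # nonconvex, nonsmooth at x = +/-1, differentiable almost everywhere
    return abs(x**2 - 1)

def clarke_subgradient(x):
    # at points of differentiability the Clarke subdifferential is a
    # singleton containing the gradient; at x = +/-1 any element of the
    # subdifferential may be returned -- here we just pick one side
    return 2 * x if x**2 >= 1 else -2 * x

x = 3.0
for k in range(1, 2001):
    x -= (0.1 / k) * clarke_subgradient(x)  # diminishing step size 0.1/k

print(x)  # ends up near the minimizer x = 1 from this starting point
```

In my runs of sketches like this the iterates do approach a nonsmooth minimizer, but I have no theory telling me this must happen in general.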
- Do you need a method with theoretical convergence guarantees, or do you just want a method that performs well in practice? – littleO Jun 14 '22 at 10:14
1 Answer
The subgradient method has indeed been used to optimize nonconvex nonsmooth functions; this goes back to work done in the Soviet Union in the 1970s. These methods rely on the assumption that an oracle can provide one subgradient of the objective function at each point in the domain. The best reference for such methods is probably Shor's monograph, *Minimization Methods for Non-Differentiable Functions*.
The biggest issue with such methods is that they are non-monotone, so there is no clear way to define a stopping criterion. Moreover, convergence can be slow if the step sizes are chosen offline; adaptive selection of step sizes (tailored to the objective function) is one possible remedy.
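To illustrate the non-monotonicity, and the usual workaround of tracking the best iterate seen so far, here is a minimal sketch on a toy nonsmooth nonconvex function; the function, the offline step-size schedule, and the iteration count are all my own choices, not something prescribed by the method:

```python
import math

def f(x):
    # toy nonconvex nonsmooth objective, kinks at x = +/-1
    return abs(x**2 - 1)

def subgrad(x):
    # one element of the Clarke subdifferential at each point
    return 2 * x if x**2 >= 1 else -2 * x

x = 3.0
x_best, f_best = x, f(x)   # track the best iterate, since f(x_k) is not monotone
increases = 0
for k in range(1, 501):
    x -= (0.2 / math.sqrt(k)) * subgrad(x)  # offline schedule 0.2/sqrt(k)
    if f(x) < f_best:
        x_best, f_best = x, f(x)
    else:
        increases += 1  # counts steps where the objective failed to improve

print(f_best, increases)  # f_best is small, yet many steps did not improve f
```

The raw sequence f(x_k) goes up on many iterations even as the best value found keeps improving, which is exactly why one reports min_k f(x_k) rather than the last iterate.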
Hikaru