About gradient descent on non-convex functions

Asked Mar 06 '18 at 23:49

There is a "folklore" result that gradient descent on a non-convex function takes $O(\frac{n}{\epsilon^2})$ steps to reach a point whose gradient norm is below $\epsilon$, and that SGD takes $O(\frac{1}{\epsilon^4})$ steps to do the same.

Can someone share a reference where this is proven?

I am aware of the recent references where these bounds have been improved, but I have not been able to locate a pedagogical presentation of these older results.
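To make the quantity being counted concrete, here is a minimal sketch of the deterministic case only. Everything in it is an assumption chosen for illustration: the one-dimensional test function `f`, the helper name `steps_to_stationarity`, the starting point, and the step size $\eta = 1/L$ with $L = 8$ (a bound on $|f''|$ for this particular `f`). It counts the iterations plain gradient descent needs before $|f'(x)| \le \epsilon$ and compares that count with the value $2(f(x_0) - f^*)/(\eta \epsilon^2)$ that the standard descent-lemma argument for $L$-smooth functions gives. It does not cover the SGD rate.

```python
import math


def f(x):
    # Hypothetical non-convex test function: f''(x) = 2 + 6*cos(2x) takes
    # values in [-4, 8], so f is 8-smooth but not convex. Its minimum is f(0) = 0.
    return x ** 2 + 3.0 * math.sin(x) ** 2


def grad_f(x):
    # Derivative of f: 2x + 6*sin(x)*cos(x) = 2x + 3*sin(2x).
    return 2.0 * x + 3.0 * math.sin(2.0 * x)


def steps_to_stationarity(x0, eps, step_size=1.0 / 8.0, max_steps=1_000_000):
    """Run gradient descent from x0 and return (t, x), where t is the number
    of updates performed before |grad_f(x)| <= eps first holds."""
    x = x0
    for t in range(max_steps):
        if abs(grad_f(x)) <= eps:
            return t, x
        x -= step_size * grad_f(x)
    raise RuntimeError("gradient norm did not drop below eps within max_steps")


if __name__ == "__main__":
    x0, eta = 3.0, 1.0 / 8.0
    for eps in (1e-1, 1e-2, 1e-3):
        t, _ = steps_to_stationarity(x0, eps, step_size=eta)
        # Descent-lemma bound with f* = 0 for this f.
        bound = 2.0 * f(x0) / (eta * eps ** 2)
        print(f"eps={eps:g}: steps={t}, bound={bound:.3g}")
```

On a well-behaved example like this one the observed step count will typically sit far below the bound; the $1/\epsilon^2$ figure is a worst-case guarantee, not a prediction for a specific function.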

asked Mar 06 '18 at 23:49

gradstudent

0 Answers