
Why is gradient descent on the problem
$$ \min_\mathbf{x} \|\mathbf{y}-\mathbf{A}\mathbf{x}\|_2^2, $$
run with an extremely small step size and started at $\mathbf{x}_0=\mathbf{0}$, equivalent to
$$ \min_\mathbf{x} \|\mathbf{x}\|_2^2 \quad \text{s.t.} \quad \mathbf{y}=\mathbf{A}\mathbf{x}, $$
where $\mathbf{x}$ and $\mathbf{y}$ are vectors and $\mathbf{A}$ is a fat (wide, underdetermined) matrix?

It seems to be a common result, but I cannot tell why. Could anyone help me with that?

  • Minimizing $\|y - Ax\|^2$ using gradient descent does not necessarily find the least-norm solution to $Ax = y$. If the first iterate $x^0$ of gradient descent already satisfies $Ax^0 = y$, then gradient descent converges to $x^0$ (and the gradient descent iterates are all equal to $x^0$). – littleO Nov 24 '20 at 11:36
  • I'd be curious to see what you're referring to when you say this is a common result. – littleO Nov 24 '20 at 11:37
  • Oops, I forgot to mention that the gradient descent starts from $x = 0$ (zero initialization). – user853186 Nov 24 '20 at 11:41
  • The same question was asked here, and apparently a proof was given, though I haven't checked it: https://math.stackexchange.com/questions/3451272/does-gradient-descent-converge-to-a-minimum-norm-solution-in-least-squares-probl – littleO Nov 24 '20 at 11:50
  • This isn't true, for trivial reasons: for some step sizes gradient descent won't converge at all. There is not "one" gradient descent algorithm you can universally refer to in this context. – Jürgen Sukumaran Nov 24 '20 at 11:53
  • That proof works! Thanks so much! – user853186 Nov 24 '20 at 12:09
  • Where's the compressed sensing problem? – Rodrigo de Azevedo Nov 25 '20 at 03:16

1 Answer


Take a look at Rodrigo de Azevedo's answer to Does gradient descent converge to a minimum-norm solution in least-squares problems? (https://math.stackexchange.com/questions/3451272/does-gradient-descent-converge-to-a-minimum-norm-solution-in-least-squares-probl).
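
For convenience, here is a minimal sketch of one standard argument (not a verbatim restatement of the linked answer), assuming $\mathbf{A}$ has full row rank and the constant step size $\mu$ is small enough for the iteration to converge. The gradient of $f(\mathbf{x}) = \|\mathbf{y}-\mathbf{A}\mathbf{x}\|_2^2$ is $\nabla f(\mathbf{x}) = -2\mathbf{A}^\top(\mathbf{y}-\mathbf{A}\mathbf{x})$, so each update
$$ \mathbf{x}_{k+1} = \mathbf{x}_k + 2\mu\,\mathbf{A}^\top(\mathbf{y}-\mathbf{A}\mathbf{x}_k) $$
adds a vector lying in $\operatorname{range}(\mathbf{A}^\top)$, the row space of $\mathbf{A}$. Starting from $\mathbf{x}_0=\mathbf{0}$, every iterate therefore stays in the row space. For a sufficiently small step size the iterates converge to a minimizer of $f$, and since $\mathbf{A}$ is fat with full row rank the system is consistent, so every minimizer satisfies $\mathbf{A}\mathbf{x}=\mathbf{y}$. Any solution of $\mathbf{A}\mathbf{x}=\mathbf{y}$ splits into a row-space component plus a null-space component; the unique solution with no null-space component is $\mathbf{x}^\star = \mathbf{A}^\top(\mathbf{A}\mathbf{A}^\top)^{-1}\mathbf{y} = \mathbf{A}^{+}\mathbf{y}$, which is also the solution of smallest norm. The gradient descent limit is a solution lying in the row space, so it must equal $\mathbf{x}^\star$, i.e. the solution of $\min_\mathbf{x} \|\mathbf{x}\|_2^2$ s.t. $\mathbf{y}=\mathbf{A}\mathbf{x}$.

A small numerical sanity check of this claim (my own illustration, not from the linked post; the matrix sizes, step size, and iteration count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 8                      # fat matrix: fewer rows than columns
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

x = np.zeros(n)                  # x_0 = 0 (zero initialization)
mu = 1e-3                        # small constant step size
for _ in range(200_000):
    x = x + 2 * mu * A.T @ (y - A @ x)   # gradient step on ||y - A x||_2^2

x_min_norm = np.linalg.pinv(A) @ y       # minimum-norm solution A^+ y

print(np.linalg.norm(A @ x - y))         # ~0: the constraint y = A x holds at the limit
print(np.linalg.norm(x - x_min_norm))    # ~0: the limit matches the minimum-norm solution
```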

Royi