This is a follow-up to this post: Does gradient descent converge to a minimum-norm solution in least-squares problems?, but with matrices throughout instead of vectors.
Suppose we are given a $p \times n$ matrix $\mathbf{X}$ and a $q \times n$ matrix $\mathbf{Y}$. We would like to find a $q \times p$ matrix $\mathbf{C}$ such that the loss function
$$\| \mathbf{Y} - \mathbf{C} \mathbf{X}\|_F^2 $$
is minimized. How can we prove that gradient descent used to estimate $\mathbf{C}$ (multivariate regression) converges, and does it again converge to the minimum-norm solution, as in the vector case?
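For reference, writing $f(\mathbf{C}) = \|\mathbf{Y} - \mathbf{C}\mathbf{X}\|_F^2$, the gradient is
$$\nabla_{\mathbf{C}} f = 2\,(\mathbf{C}\mathbf{X} - \mathbf{Y})\,\mathbf{X}^\top,$$
so any minimizer satisfies the normal equations $\mathbf{C}\mathbf{X}\mathbf{X}^\top = \mathbf{Y}\mathbf{X}^\top$, and the minimum-Frobenius-norm minimizer is $\mathbf{C}^* = \mathbf{Y}\mathbf{X}^+$.

Here is a minimal NumPy sketch of what I mean (the sizes, step size, and iteration count are my own illustrative choices, not part of the question): it runs gradient descent from $\mathbf{C}_0 = \mathbf{0}$ on a rank-deficient $\mathbf{X}$ and compares the result with $\mathbf{Y}\mathbf{X}^+$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): X is built to be rank-deficient
# (rank 3 < p), so the minimizer of ||Y - C X||_F^2 is not unique.
p, n, q = 6, 8, 3
X = rng.standard_normal((p, 3)) @ rng.standard_normal((3, n))  # p x n
Y = rng.standard_normal((q, n))                                # q x n

# Gradient of f(C) = ||Y - C X||_F^2 with respect to C (a q x p matrix).
def grad(C):
    return 2.0 * (C @ X - Y) @ X.T

# The gradient is Lipschitz with constant L = 2 * sigma_max(X)^2, so a
# step size below 1/L guarantees monotone decrease of the loss.
eta = 0.9 / (2.0 * np.linalg.norm(X, 2) ** 2)

C = np.zeros((q, p))  # start at zero, as in the vector case
for _ in range(20000):
    C -= eta * grad(C)

# Minimum-Frobenius-norm least-squares solution via the pseudoinverse.
C_star = Y @ np.linalg.pinv(X)
print(np.linalg.norm(C - C_star))  # should be ~0: the iterates match Y X^+
```

Empirically the gap is tiny, which matches the vector-case intuition: starting from $\mathbf{C}_0 = \mathbf{0}$, the rows of every iterate stay in the column space of $\mathbf{X}$, and $\mathbf{Y}\mathbf{X}^+$ is the unique minimizer with that property. What I am missing is a proof.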
@Rodrigo de Azevedo provided a nice example in response to the question linked above.
Thanks.