Suppose that $H\succ 0$ and we have a sequence $\{x_j\}_{j\ge 1}$ such that each one is the unique solution of the following problem:

$$ x_{j} = \text{argmin}_{x\in \mathbb R^n}(x-x_{j-1})^T\nabla f(x_{j-1})+\frac{1}{2}(x-x_{j-1})^TH(x-x_{j-1})+\frac{1}{2}\rho_j\|x-y_j\|_2^2 .$$

Note that $\rho_j \to \infty$ as $j \to \infty$. Further, $x_0$ can be arbitrary and we know the sequence $\{y_j\}_j$ is bounded.

I want to show the sequence $$(x_j-x_{j-1})^T\nabla f(x_{j-1})+\frac{1}{2}(x_j-x_{j-1})^TH(x_j-x_{j-1}), \, j\in \mathbb N$$ is bounded below.

It turns out that if $f$ is convex and Lipschitz continuous on the whole space, there is a lower bound (see "Existence of a lower-bound for an interesting function!" and "Lipschitz implies bounded gradient").

Unfortunately, I cannot find practical functions $f$ that satisfy both conditions (convexity and Lipschitz continuity on the entire space) at once. So I would like to know whether a lower bound can be obtained under less restrictive, or at least more practical, conditions.

Thank you for your time!

Sam
  • The sequence currently involves $x$. I think $x$ should be replaced by $x_j$ or $x_{j+1}$. – Amir Apr 21 '24 at 08:26
  • Thanks @Amir. Edited. – Sam Apr 21 '24 at 11:06
  • What sort of 'practical' functions (as in 'practical functions $f$ that satisfy convexity and Lipschitz continuity') are you looking for? Linear functions $f(x) = b^T x$ ($b \in \mathbb{R}^n$) or the hyperboloid $f(x) = \sqrt{1 + |x|^2}$ are smooth, Lipschitz and convex (strictly convex for the latter). Are these practical for you? – Jordan Payette Apr 26 '24 at 19:57
  • @JordanPayette The ones that often appear in sparse optimization. In particular, an objective function $f$ whose minimizer we need to find under a cardinality constraint! – Sam Apr 27 '24 at 07:37
  • @JordanPayette One important example is $f(x)=\|y-Ax\|_2^2$ where $A$ is underdetermined, coming from compressive sensing. – Sam Apr 27 '24 at 19:36

1 Answer

I assume the $\mathrm{argmin}$ is supposed to define $x_j$ instead of $x_{j+1}$.


In general, without further constraints, the sequence may be unbounded below. For instance, in dimension $1$, take $H = 1$, $x_0 = 0$, $f(x) = -x - (x+1)^4/4$, $y_j = 0$ and $\rho_j = j^2$ for all $j \ge 1$. One can show that $x_j = j$ for all $j \ge 0$, and $$(x_j - x_{j-1})\nabla f(x_{j-1}) + (1/2)H(x_j - x_{j-1})^2 = -(1 + j^3) + (1/2).$$
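
The counterexample above can be checked numerically. The following is a minimal sketch in plain Python; the helper names `grad_f`, `step` and `run` are illustrative and not part of the original argument. It iterates the closed-form minimiser of the one-dimensional problem and records the quantity of interest at each step.

```python
def grad_f(x):
    # f(x) = -x - (x + 1)**4 / 4, hence f'(x) = -1 - (x + 1)**3
    return -1.0 - (x + 1.0) ** 3


def step(x_prev, rho, y=0.0, H=1.0):
    # Minimiser of (x - x_prev) f'(x_prev) + (H/2)(x - x_prev)^2 + (rho/2)(x - y)^2,
    # obtained by setting the derivative in x to zero.
    return (H * x_prev + rho * y - grad_f(x_prev)) / (H + rho)


def run(n_steps):
    # x_0 = 0, y_j = 0, rho_j = j^2, H = 1, as in the counterexample.
    x = 0.0
    values = []
    for j in range(1, n_steps + 1):
        x_new = step(x, rho=float(j ** 2))
        d = x_new - x
        values.append(d * grad_f(x) + 0.5 * d * d)
        x = x_new
    return x, values


if __name__ == "__main__":
    x_final, values = run(10)
    print(x_final)     # iterate after 10 steps
    print(values[-1])  # quantity of interest at j = 10
```

The recorded values decrease like $-j^3$, which is the unboundedness claimed above.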


In a positive direction, let's assume that there exists $K > 0$ such that $\| \nabla f(x)\| \le K$ for $x$ inside the closed ball of radius $3$ centred at $0$, and $\| \nabla f(x)\| \le K \|x\|$ for $x$ outside this ball. Then the sequence $(x_j - x_{j-1})^T \nabla f(x_{j-1}) + \frac{1}{2} (x_j - x_{j-1})^T H (x_j - x_{j-1})$ is bounded from below (by a possibly $x_0$-dependent constant).

We set $c := \min_{\|x\| = 1} x^T H x > 0$ and $C := \max_{\|x\| = 1} x^T H x > 0$. Without loss of generality (dilating/contracting $\mathbb{R}^n$ if necessary), we assume $\|y_j\| \le 1$ for all $j$. Since we are really only concerned with the case of large $j$ and $\rho_j \to \infty$, we may also assume that $\rho_j \ge 3(3C+K)$ for all $j$.

We compute that $$ (\clubsuit) \quad x_j = (H + \rho_j I)^{-1} \left( Hx_{j-1} + \rho_j y_j - \nabla f(x_{j-1}) \right) .$$ Note that $\|(H + \rho_j I)^{-1}\| \le (c + \rho_j)^{-1}$ and $\|(H + \rho_j I)^{-1}H\| \le (c + \rho_j)^{-1}C$.
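
For completeness, here is a sketch of where ($\clubsuit$) and the two norm bounds come from, assuming (as $H \succ 0$ suggests) that $H$ is symmetric. The objective defining $x_j$ is strictly convex in $x$, so its unique minimiser is the point where the gradient vanishes: \begin{align} 0 &= \nabla f(x_{j-1}) + H(x_j - x_{j-1}) + \rho_j (x_j - y_j) \\ \iff \quad (H + \rho_j I)\, x_j &= H x_{j-1} + \rho_j y_j - \nabla f(x_{j-1}). \end{align} The eigenvalues $\lambda$ of $H$ lie in $[c, C]$, so the eigenvalues of $(H + \rho_j I)^{-1}$ and of $(H + \rho_j I)^{-1} H$ satisfy \begin{align} \frac{1}{\lambda + \rho_j} \le \frac{1}{c + \rho_j}, \qquad \frac{\lambda}{\lambda + \rho_j} \le \frac{C}{c + \rho_j}, \end{align} which gives the stated operator-norm bounds. In the same way, $\|(H + \rho_j I)^{-1} \rho_j y_j\| \le \frac{\rho_j}{c + \rho_j} \|y_j\| \le 1$, which accounts for the "$+1$" in the estimates that follow.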

  • Suppose $\|x_{j-1}\| > 3$. From ($\clubsuit$), we get \begin{align} \|x_j\| \le (c + \rho_j)^{-1}(C+K)\|x_{j-1}\| + 1 \le (1/3) \|x_{j-1}\| + 1 \le (2/3) \|x_{j-1}\|. \end{align} Since $(2/3)^m \to 0$ as $m \to \infty$, we see that there exists $j_0$ such that $\|x_{j_0}\| \le 3$.

  • Whenever $\|x_{j-1}\| \le 3$, from ($\clubsuit$), we get \begin{align} \|x_j\| \le (c + \rho_j)^{-1}(C \|x_{j-1}\| +K) + 1 \le 2 < 3. \end{align}

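Therefore, regardless of $x_0$, the sequence $x_j$ eventually enters and stays in the ball of radius $3$ centred at $0$. Moreover, whenever $\|x_{j-1}\| \le 3$, we have $\|\nabla f(x_{j-1})\| \le K$ and hence the lower bound \begin{align} (x_j - x_{j-1})^T \nabla f(x_{j-1}) &+ \frac{1}{2} (x_j - x_{j-1})^T H (x_j - x_{j-1}) \\ &\ge - \|x_j - x_{j-1}\| K + (c/2) \|x_j - x_{j-1}\|^2 \\ &\ge - \frac{K^2}{2c}, \end{align} where the last step minimises $t \mapsto -Kt + (c/2)t^2$ over $t \ge 0$ (the minimum is attained at $t = K/c$). The finitely many terms with $\|x_{j-1}\| > 3$ are trivially bounded below by their minimum, which is where the possible dependence on $x_0$ comes from.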
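
As a sanity check, the argument can be simulated in dimension $1$. The sketch below is illustrative only and every concrete choice in it is an assumption: $H = 2$ (so $c = C = 2$), $f'(x) = \sin(x^2)$ (so $K = 1$ works both inside and outside the ball of radius $3$), $y_j = \cos j$, $\rho_j = 3(3C+K) + j$ and $x_0 = 100$. The names `grad_f` and `simulate` are hypothetical.

```python
import math

H = 2.0                        # 1-D "matrix", so c = C = H (assumed value)
K = 1.0                        # |f'(x)| = |sin(x^2)| <= 1
RHO_MIN = 3.0 * (3.0 * H + K)  # threshold rho_j >= 3(3C + K) from the proof


def grad_f(x):
    # Assumed test function: any primitive f of sin(x^2).
    return math.sin(x * x)


def simulate(x0, n_steps):
    # Returns the iterates x_0..x_n and the n values of the quantity of interest.
    xs, values = [x0], []
    for j in range(1, n_steps + 1):
        rho = RHO_MIN + j      # increasing, unbounded penalty parameter
        y = math.cos(j)        # bounded sequence with |y_j| <= 1
        x_prev = xs[-1]
        # Closed-form minimiser of the 1-D subproblem.
        x_new = (H * x_prev + rho * y - grad_f(x_prev)) / (H + rho)
        d = x_new - x_prev
        values.append(d * grad_f(x_prev) + 0.5 * H * d * d)
        xs.append(x_new)
    return xs, values


if __name__ == "__main__":
    xs, values = simulate(100.0, 200)
    print(max(abs(x) for x in xs[5:]))   # size of the later iterates
    print(min(values), -K * K / (2 * H)) # smallest value vs. the bound -K^2/(2c)
```

With these choices the iterates should enter the ball of radius $3$ after a few steps and stay there, and every value computed from a point inside the ball should be at least $-K^2/(2c)$; only the first few values, which depend on $x_0$, may fall below that bound.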

Jordan Payette
  • Thank you for your time. I am trying to validate your proof and then accept it. It seems amazing so far. I think that a sufficient condition for what you found is $\nabla f(x)$ being Lipschitz continuous on $\mathbb R^n$, right? Which is amazing because I do not need convexity anymore. – Sam Apr 29 '24 at 17:40
  • @Sam Yes, $\nabla f$ being Lipschitz implies the above condition (which is strictly more general though). Besides, since the $\rho_j$s grow to infinity, the quantity you are minimising is (under the conditions of my answer on $f$) a (relatively) small perturbation of the convex function $\frac{1}{2}\rho_j \|x-y_j\|_2^2$, so morally the fact that the sequence of $x_j$s is bounded is due to the convexity of that function. – Jordan Payette Apr 29 '24 at 18:02
  • I just wanted to say that I went through your proof again and everything works out very nicely, and I very much appreciate that. This is very helpful. By the way, I mentioned $\nabla f$ being Lipschitz continuous because I am not sure what other commonly used assumption in the literature satisfies your conditions. Also, sorry that it took some time to reply; I was busy with finals. – Sam May 03 '24 at 07:24
  • @Sam No problem, glad if I've been of help! (Good success on your finals!) My assumption is equivalent to $\nabla f$ being 'Lipschitz from the origin', i.e. $| \nabla f(x) - \nabla f(0) | \le K | x |$. It's more general than Lipschitz, as it includes for instance any primitive $f$ of the function $f'(x) = \sin(x^2)$. – Jordan Payette May 03 '24 at 14:10
  • Thank you and it is interesting to notice that! The fact that they are equivalent makes this so neat! I am not sure this condition of Lipschitz from the origin is a regular assumption in the literature, is it? – Sam May 06 '24 at 17:04
  • It seems to me that Lipschitz from the origin is more general than that of your solution! – Sam May 06 '24 at 18:54
  • @Sam I'm not familiar with the literature in your field, but indeed I haven't seen anything like 'Lipschitz from the origin' (however the condition is evocative of some seminorms on Schwartz spaces). You're right, 'Lipschitz from the origin' isn't equivalent, it's a special case! The condition from my answer can be rephrased as (C) $| \nabla f (x)|\le A+B|x|$ for some $A,B>0$. By the triangle inequality, 'Lipschitz from the origin' implies $| \nabla f(x)|\le |\nabla f(0)|+K|x|$, hence (C). However, (C) doesn't force continuity of $\nabla f$ near $0$; if we assume that, then we have equivalence. – Jordan Payette May 06 '24 at 20:11