Questions tagged [subgradient]

This tag is for questions relating to subgradients and the subgradient method, an iterative method for solving convex minimization problems, used predominantly in nondifferentiable optimization for functions that are convex but nondifferentiable. The subgradient method is a very simple algorithm for minimizing a nondifferentiable convex function in settings where Newton's method and simple linear programming approaches do not work.

The subgradient of a function (closely related to the subderivative and the subdifferential) is a way of generalizing, or approximating, the derivative of a convex function at points where the function is not differentiable.

Definition: A vector $g \in \mathbb{R}^n$ is a subgradient of $f : \mathbb{R}^n \to \mathbb{R}$ at $x \in \operatorname{dom} f$ if, for all $z \in \operatorname{dom} f$, $$f(z) \ge f(x) + g^{T}(z - x).$$

Note: If $f$ is convex and differentiable at $x$, then its gradient $\nabla f(x)$ is a subgradient of $f$ at $x$. But a subgradient can exist even when $f$ is not differentiable at $x$.
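For example (a standard illustration, not taken from the references below): let $f(x) = |x|$ on $\mathbb{R}$. At any $x \neq 0$ the function is differentiable and its only subgradient is $f'(x) = \operatorname{sign}(x)$, while at $x = 0$ every $g$ with $|g| \le 1$ satisfies the defining inequality $$|z| \ge |0| + g\,(z - 0) = g z \quad \text{for all } z \in \mathbb{R},$$ so the subdifferential there is the whole interval $\partial f(0) = [-1, 1]$.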

With suitable step-size rules, subgradient methods converge even when applied to a non-differentiable objective function. When the objective function is differentiable, subgradient methods for unconstrained problems use the same search direction as the method of steepest descent. Subgradient methods are slower than Newton's method when applied to minimize twice continuously differentiable convex functions; however, Newton's method fails to converge on problems that have non-differentiable kinks.
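Below is a minimal sketch of the subgradient iteration in Python. The objective $f(x) = \|Ax - b\|_1$, the data `A` and `b`, the starting point `x0`, and the diminishing step size $\alpha_k = 1/k$ are illustrative choices for this example, not something prescribed by the method itself.

```python
import numpy as np

def subgradient_method(A, b, x0, n_iters=500):
    """Minimize the convex, nondifferentiable f(x) = ||A x - b||_1."""

    def f(x):
        return np.sum(np.abs(A @ x - b))

    x = x0.copy()
    x_best, f_best = x.copy(), f(x)
    for k in range(1, n_iters + 1):
        g = A.T @ np.sign(A @ x - b)   # a subgradient of f at the current x
        x = x - (1.0 / k) * g          # diminishing step size alpha_k = 1/k
        if f(x) < f_best:              # not a descent method, so keep track
            x_best, f_best = x.copy(), f(x)  # of the best iterate seen so far
    return x_best, f_best

# Example usage on small random data
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
x_star, f_star = subgradient_method(A, b, np.zeros(5))
print(f_star)
```

Because an individual step may increase the objective, the sketch tracks the best iterate seen so far; with step sizes that go to zero but are not summable (such as $1/k$), the best objective value converges to the optimum.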

For more details, see the following references:

https://see.stanford.edu/materials/lsocoee364b/01-subgradients_notes.pdf

https://people.csail.mit.edu/dsontag/courses/ml16/slides/notes_convexity16.pdf

https://optimization.mccormick.northwestern.edu/index.php/Subgradient_optimization

https://en.wikipedia.org/wiki/Subgradient_method

275 questions
15 votes · 3 answers

The Proximal Operator of the $ {L}_{\infty} $ (Infinity Norm)

What is the proximal operator of the $ \left\| x \right\|_{\infty} $ norm: $$ \operatorname{Prox}_{\lambda \left\| \cdot \right\|_{\infty}} \left( v \right) = \arg \min_{x} \frac{1}{2} \left\| x - v \right\|_{2}^{2} + \lambda \left\| x…
10 votes · 2 answers

How equality in Fenchel-Young inequality characterizes subdifferential?

I am not able to see why equality in the Fenchel-Young inequality characterizes subgradients. As per Fenchel-Young inequality: \begin{equation} f(x)+f^*(u) \geq \langle x,u \rangle \end{equation} while the definition of subdifferential set…
CKM • 1,642
8 votes · 1 answer

Subgradient of convex minimization duality

$$\min f_0(x)$$ $$\text{s.t. } f_i(x) \le y_i, \quad i = 1, \ldots, m$$ $$f_i : \text{convex};\quad x : \text{variable}$$ It is also considered that $g(y)$ is the optimal value of the problem and $\lambda^*$ is the optimal dual variable. Then,…
8 votes · 2 answers

Proximal Mapping of Least Squares with $ {L}_{1} $ and $ {L}_{2} $ Norm Terms Regularization (Similar to Elastic Net)

I was trying to solve $$\min_x \frac{1}{2} \|x - b\|^2_2 + \lambda_1\|x\|_1 + \lambda_2\|x\|_2,$$ where $ b \in \mathbb{R}^n$ is a fixed vector, and $\lambda_1,\lambda_2$ are fixed scalars. Let $f = \lambda_1\|x\|_1 + \lambda_2\|x\|_2$, that is to…
8 votes · 1 answer

Prove the subdifferential of $f(x)=\max_i f_i(x)$ is $\partial f(x) = \operatorname{conv}\left( \bigcup \partial f_i(x)\right) $

How to prove that the subdifferential of $f(x) = \max_{i=1,\dots, n} f_i(x)$ satisfies \begin{align} \partial f(x) = \operatorname{conv}\left( \bigcup \partial f_i(x) \right) \end{align} How am I supposed to utilize the property…
good2know • 675
8 votes · 2 answers

Subgradient of the $\ell_0$ "norm"

I am trying to characterize the sub-gradient of $\ell_0$ "norm" $$f(x) := \|x\|_0 := \sum_{i=1}^n 1\{{x_i \neq 0}\}$$ At first, since it satisfies the triangle inequality, I thought that the $\ell_0$ "norm" is convex and non-smooth. Then, I tried to…
7 votes · 1 answer

When do two functions have the same subdifferentials?

For two functions $f$ and $g$, if $\nabla f(x) = \nabla g(x)$ for all $x$, then $f = g + c$ for some constant $c$. Does the same hold if the gradient is replaced by the (convex) subdifferential, i.e. $\partial f(x) = \partial g(x)$ for all $x$? And, as a stronger…
7 votes · 1 answer

Sum of subgradients belongs to subgradient of sums?

I was going through this page : https://www.stats.ox.ac.uk/~lienart/blog_opti_basics.html , and at the end of part 1 "Subgradient and First-order Optimality Condition", the author says: Before moving on, it is useful to note (and not too hard to…
Ojas • 2,206
7 votes · 1 answer

Geometric concept of $A$-orthogonality, $A\succ 0$

Assume the following is in $\mathbb{R}^n$: 1. If $d_i,d_j$ are orthogonal with $i \neq j$, it means $d_i^Td_j=0$. 2. If $d_i,d_j$ are $A$-orthogonal with $i \neq j$, it means $d_i^TAd_j=0$. In many lectures, it can be viewed as the…
sleeve chen • 8,576
6 votes · 1 answer

Smooth approximation of maximum using softmax?

Look at the Wiki page for Softmax function (section "Smooth approximation of maximum"): https://en.wikipedia.org/wiki/Softmax_function It is saying that the following is a smooth approximation to the softmax:…
Daniel • 2,760
6 votes · 4 answers

How to compute the convex conjugate of norm raised to the power of $2$, $f(x) = \|x\|^2/2$

How would one find the conjugate of the following : $$f(x) = \|x\|^2 /2$$ The conjugate function is defined as $ f^*(y) = \max_x y^Tx - f(x)$ I am stuck at how I can derive the explicit form for $x$. So far, here are my steps: To maximize I take…
Ireth • 135
6 votes · 2 answers

Derivative of the Prox / Proximal Operator

Consider a proximal operator, $$ \operatorname{Prox}_{ \lambda f } \left( \mu x \right) := \arg \min_{u} \lambda f \left( u \right) + \frac{1}{2} {\left\| u - \mu x \right\|}_{2}^{2}.$$ What is the partial derivative of the proximal operator w.r.t.…
6 votes · 2 answers

How to show $\partial f(x) =\{\nabla f(x) \}$ for a convex differentiable function?

I want to show that if $f:\mathbb{E}\rightarrow\mathbb{R}$ is convex, and differentiable at $x$, then $\partial f(x) = \{ \nabla f(x) \} $ . I understand that for a convex function, we have the following: $$f(y) \ge f(x)+\nabla f(x)^T(y-x) \quad \forall x,y…
6 votes · 3 answers

Subdifferential of a convex differentiable function

Let $f:\, \mathbb{R}^n \rightarrow \mathbb{R}$ be a convex function that is differentiable at a point $x_0$. If $\partial f(x_0)$ denotes the subdifferential of $f$, I would like to prove that the only element in it is given by the gradient…
5 votes · 0 answers

KKT conditions for non-differentiable constraints

So I know that for the problem: $$ \begin{align*} \text{minimize} \quad & f_0(x) \\ \text{subject to} \quad & f_i(x) \leq 0, \quad i = 1, 2, \ldots, m \\ \end{align*} $$ we have the following necessary and sufficient KKT conditions, when we assume…