
I want to solve the following problem: $$ \arg\min_x |f(x)|_\mu + \frac{1}{2\sigma} |x-x^k|^2 $$

where

$$|x|_\mu = \begin{cases} \frac{|x|^2}{2\mu}, & |x|<\mu \\ |x|-\frac \mu 2 & |x|\geq \mu \end{cases}. $$

If $f(x)=x$, I can solve it (see Proximal operator to Huber function). However, if $f(x)=ax+b$ (e.g., the data-fitting term in denoising or optical flow), I don't know how to proceed.

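For reference, here is a minimal sketch of the closed-form solution in the $f(x)=x$ case, applied elementwise (the function name is only illustrative):

```python
import numpy as np

def prox_huber(v, sigma, mu):
    # argmin_x |x|_mu + 1/(2*sigma) * (x - v)^2, applied elementwise.
    # If |v| <= mu + sigma the quadratic branch is active and the minimizer
    # is mu*v/(mu+sigma); otherwise it is v - sigma*sign(v). Both cases are
    # captured by the shrinkage formula below.
    return (1.0 - sigma / np.maximum(np.abs(v), mu + sigma)) * v
```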

  • Without assuming some peculiar structure in the linear operator $a$, you'll have to solve the problem iteratively, e.g. via a fast proximal-gradient scheme like FISTA. As a special case, if $a = \text{diag}(a_1, \ldots, a_p)$ is diagonal and invertible, then you can apply the change of variables $x = a^{-1}(z - b) = ((z_1-b_1)/a_1,\ldots,(z_p-b_p)/a_p)$ and reduce your problem to computing the prox of a Huber function (with step size $\tau_j = \sigma a_j^2$ for each dimension $j \in [1,p]$), for which a closed-form solution exists (as you already figured out). – dohmatob Aug 01 '16 at 01:19
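A minimal sketch of the reduction suggested in the comment above, assuming the diagonal entries of $a$ are stored as a 1-D array and the Huber term is separable across coordinates (function names are illustrative):

```python
import numpy as np

def prox_huber(v, tau, mu):
    # Elementwise prox of the Huber function |.|_mu with step tau.
    return (1.0 - tau / np.maximum(np.abs(v), mu + tau)) * v

def prox_huber_affine_diag(a, b, xk, sigma, mu):
    # Per coordinate, substituting z = a*x + b turns
    #   |a*x + b|_mu + 1/(2*sigma) * (x - xk)^2
    # into
    #   |z|_mu + 1/(2*sigma*a**2) * (z - (a*xk + b))^2,
    # i.e. a Huber prox with coordinate-wise step sigma*a**2.
    z = prox_huber(a * xk + b, sigma * a**2, mu)
    return (z - b) / a
```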

1 Answer


I believe the issue with your function is common to every "proximable" function, i.e., a function whose proximal mapping is explicitly computable. Usually, even if you know how to efficiently compute $$\mbox{prox}_{\sigma g}(x) = \mbox{argmin}_z\{g(z) + \tfrac{1}{2\sigma}\|x-z\|^2\},$$ you are still unable to efficiently compute $$\mbox{prox}_{\sigma (g\circ L)}(x) = \mbox{argmin}_z\{g(Lz) + \tfrac{1}{2\sigma}\|x-z\|^2\},$$ where $L$ is a linear map, unless $L$ has a very peculiar structure such as $L L^* = \nu I$ for some $\nu>0$; see for example Bauschke, Combettes, Proposition 23.32. If $g$ is separable, that is $g(x) = \sum_i g_i(x_i)$, then you can also compute the prox of $(g\circ L)$ whenever $L\succ 0$ is diagonal.
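For reference (recalling the statement from memory, so worth checking against the cited proposition), in the case $LL^* = \nu I$ the composition rule reads $$\mbox{prox}_{\sigma (g\circ L)}(x) = x + \nu^{-1}L^*\big(\mbox{prox}_{\sigma\nu\, g}(Lx) - Lx\big).$$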

For general $L$, either you compute the prox of $(g\circ L)$ numerically (therefore only approximately) or you somehow reformulate your original optimization problem to get rid of $L$ in the proximable term: for example, instead of solving $$ \mbox{minimize}\quad f(x) + g(Lx) $$ where $f$ is some other term in the cost, you may prefer to solve $$ \mbox{minimize}\quad f(x) + g(z)\quad \mbox{subject to}\quad Lx = z$$ by tackling the dual problem. Depending on what $f$ is, you can tackle the above equality-constrained problem using ADMM or the (accelerated) alternating minimization method (AMM), which is the dual (fast) proximal-gradient method. In both methods you only ever apply the prox of $g$, never that of $(g\circ L)$. Of course, this depends on what your original problem is; it might not be in the form I just mentioned.
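For your specific problem, with $f(x)=\tfrac{1}{2\sigma}\|x-x^k\|^2$, $g=|\cdot|_\mu$ summed elementwise, and the splitting $z = ax + b$ for a matrix $a$, a minimal ADMM sketch could look as follows; the function names, the penalty $\rho$, and the fixed iteration count are only illustrative choices:

```python
import numpy as np

def prox_huber(v, tau, mu):
    # Elementwise prox of the Huber function |.|_mu with step tau:
    # prox_{tau*h}(v) = (1 - tau / max(|v|, mu + tau)) * v.
    return (1.0 - tau / np.maximum(np.abs(v), mu + tau)) * v

def admm_huber_affine(A, b, xk, sigma, mu, rho=1.0, n_iter=200):
    """ADMM for  min_x  sum_i |(A x + b)_i|_mu + 1/(2*sigma) * ||x - xk||^2
    via the splitting z = A x + b, with scaled dual variable u."""
    m, n = A.shape
    x = xk.copy()
    z = A @ x + b
    u = np.zeros(m)
    M = np.eye(n) / sigma + rho * A.T @ A   # x-update system matrix (fixed)
    for _ in range(n_iter):
        # x-update: minimize 1/(2*sigma)||x - xk||^2 + rho/2 ||A x + b - z + u||^2
        x = np.linalg.solve(M, xk / sigma + rho * A.T @ (z - u - b))
        # z-update: prox of the Huber term with step 1/rho
        z = prox_huber(A @ x + b + u, 1.0 / rho, mu)
        # dual update
        u = u + A @ x + b - z
    return x
```

Note that only the prox of the Huber function itself appears in the iterations; the linear map $A$ enters only through matrix-vector products and the (fixed) linear system in the x-update.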