
I need help solving the following convex optimization problem: \begin{equation} \mathbf{X}^{\star} = \operatorname*{argmin}_{\mathbf{X}} \left\{ \|\mathbf{X} - \mathbf{L}\|_F^2 \,\,\,+\,\, \lambda \,\|\mathbf{X} - \mathbf{F}\|_1 \right\} ~~~~~\color{red}{(1)} \end{equation}

where $\mathbf{X}, \mathbf{L}, \mathbf{F} \in \mathbb{R}^{n \times p}$, $\|\cdot\|_F$ denotes the Frobenius norm, and $\|\cdot\|_1$ is the $L_1$-norm.

$\bf{Before~you~give~your~opinions}:~$ Since the objective has no coupling between entries, one can expect to recast equation (1) as $np$ independent scalar problems. That is: \begin{equation} X_{i,j} = \operatorname*{argmin}_{X_{i,j}} \left\{ (X_{i,j} - L_{i,j})^2 + \lambda |X_{i,j} - F_{i,j}| \right\} ~~~\color{red} {(2)} \end{equation} where $i = 1, \cdots, n$ and $j = 1, \cdots, p$.

In this case, differentiating (2) with respect to $X_{i,j}$ (wherever $X_{i,j} \neq F_{i,j}$, so that the absolute value is differentiable), we get:

$2 (X_{i,j} - L_{i,j}) ~\pm~ \lambda = 0$. Hence we obtain a soft-thresholding solution for $X_{i,j}$.
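Spelling out the full first-order condition, including the kink at $X_{i,j} = F_{i,j}$ where the derivative is undefined, I expect the case analysis:

$$ X_{i,j} = \begin{cases} L_{i,j} - \frac{\lambda}{2}, & L_{i,j} - F_{i,j} > \frac{\lambda}{2} \\ F_{i,j}, & \left| L_{i,j} - F_{i,j} \right| \le \frac{\lambda}{2} \\ L_{i,j} + \frac{\lambda}{2}, & L_{i,j} - F_{i,j} < -\frac{\lambda}{2} \end{cases} $$

i.e. $X_{i,j} = F_{i,j} + \mathcal{S}_{\lambda / 2} \left( L_{i,j} - F_{i,j} \right)$, with $\mathcal{S}$ the soft-threshold operator.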

Is this the correct way to solve problem (1)?

  • You are right, if there is no constraint on $\mathbf{X}$ that links its component entries, then the objective function (1) can be minimized by minimizing (2) componentwise. I think the minimization might be a little more complicated than your differentiation of (2), since a critical point can also be a place where a derivative is undefined (not just zero). – hardmath Mar 17 '17 at 16:39
  • If the $ {L}_{1} $ norm above treats the matrix entrywise, as if both matrices were stacked into vectors, then you can solve the problem as though both matrices were vectors. – Royi Nov 17 '23 at 18:32

1 Answer


The way you interpreted the ${L}_{1}$ norm for matrices (entrywise) means the problem can be reformulated in terms of vectors:

$$ \arg \min_{\boldsymbol{x}} \frac{1}{2} {\left\| \boldsymbol{x} - \boldsymbol{l} \right\|}_{2}^{2} + \lambda {\left\| \boldsymbol{x} - \boldsymbol{f} \right\|}_{1} $$

Where $\boldsymbol{x} = \operatorname{Vec} \left( \boldsymbol{X} \right)$ (and likewise $\boldsymbol{l} = \operatorname{Vec} \left( \boldsymbol{L} \right)$, $\boldsymbol{f} = \operatorname{Vec} \left( \boldsymbol{F} \right)$). The transformation uses the Vectorization Operator and is valid because of the element wise nature of the problem. Note the factor $\frac{1}{2}$ in front of the quadratic term: to match (1) exactly, replace $\lambda$ by $\frac{\lambda}{2}$ in what follows.

In case $\boldsymbol{f} = \boldsymbol{0}$, the above is exactly the Proximal Operator of the ${L}_{1}$ Norm. See Derivation of Soft Thresholding Operator / Proximal Operator of L1 Norm.
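For reference, the soft-threshold operator $\mathcal{S}_{\lambda}$ used below acts element wise as

$$ \mathcal{S}_{\lambda} \left( x \right) = \operatorname{sign} \left( x \right) \max \left( \left| x \right| - \lambda, 0 \right) $$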

This shifted case requires some Prox Calculus: given the prox of a function $\phi \left( \cdot \right)$, the prox of the shifted function $\phi \left( \cdot - \boldsymbol{a} \right)$ is

$$ \operatorname{Prox}_{\lambda \phi \left( \cdot - \boldsymbol{a} \right)} \left( \boldsymbol{x} \right) = \operatorname{Prox}_{\lambda \phi} \left( \boldsymbol{x} - \boldsymbol{a} \right) + \boldsymbol{a} $$
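This follows from the change of variables $\boldsymbol{u} = \boldsymbol{v} - \boldsymbol{a}$ inside the prox minimization:

$$ \operatorname{Prox}_{\lambda \phi \left( \cdot - \boldsymbol{a} \right)} \left( \boldsymbol{x} \right) = \arg \min_{\boldsymbol{v}} \frac{1}{2} {\left\| \boldsymbol{v} - \boldsymbol{x} \right\|}_{2}^{2} + \lambda \phi \left( \boldsymbol{v} - \boldsymbol{a} \right) = \boldsymbol{a} + \arg \min_{\boldsymbol{u}} \frac{1}{2} {\left\| \boldsymbol{u} - \left( \boldsymbol{x} - \boldsymbol{a} \right) \right\|}_{2}^{2} + \lambda \phi \left( \boldsymbol{u} \right) $$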

So in the case above the minimizer is $\boldsymbol{x}^{\star} = \boldsymbol{f} + \mathcal{S}_{\lambda} \left( \boldsymbol{l} - \boldsymbol{f} \right)$, where $\mathcal{S}_{\lambda}$ is the Soft Threshold operator (with $\lambda$ replaced by $\frac{\lambda}{2}$ if you keep the original scaling of (1)).
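A minimal numerical sanity check of this closed form, assuming the entrywise interpretation above (a sketch in NumPy; the brute force grid minimizer and all variable names are mine, for verification only):

```python
import numpy as np

def soft_threshold(x, t):
    """Element wise soft threshold: sign(x) * max(|x| - t, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def closed_form(L, F, lam):
    """Minimizer of ||X - L||_F^2 + lam * ||X - F||_1 (entrywise L1).

    The missing 1/2 in front of the quadratic term of (1) turns the
    threshold level into lam / 2."""
    return F + soft_threshold(L - F, lam / 2.0)

rng = np.random.default_rng(0)
n, p, lam = 4, 3, 0.7
L = rng.normal(size=(n, p))
F = rng.normal(size=(n, p))

X_star = closed_form(L, F, lam)

# Brute force check on a fine 1-D grid, one entry at a time
# (valid because the objective separates over entries).
grid = np.linspace(-5.0, 5.0, 200001)
for i in range(n):
    for j in range(p):
        obj = (grid - L[i, j]) ** 2 + lam * np.abs(grid - F[i, j])
        assert abs(grid[np.argmin(obj)] - X_star[i, j]) < 1e-3

print("closed form matches the brute force minimizer")
```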
