3

I have a problem similar to Linear Matrix Least Squares with Linear Equality Constraint - Minimize $ {\left\| A - B \right\|}_{F}^{2} $ Subject to $ B x = v $, except that here the matrix $N$ is additionally constrained to be symmetric. That is, I want to solve

$$ \min_N \frac{1}{2}\|N - M\|_F^2 \quad \text{subject to} \quad Nd = g, \quad N = N^T $$

I've tried to write the Lagrange function (handling the symmetry constraint as a quadratic penalty) as

$$ L(N, \lambda) = \frac{1}{2}\|N - M\|_F^2 - \lambda^T(Nd - g) - \frac{\gamma}{4}\|N - N^T\|^2_F $$

Taking the derivative with respect to $N$, I got

$$ \frac{\partial L}{\partial N} = N - M - \lambda d^T - \gamma (N - N^T) $$

I got stuck here (please point out if I went wrong in the steps above). Does anyone have an idea of how to proceed, or is there another way out?

**$M$ is a symmetric matrix in this case.**

Thanks in advance.

  • I suggest that you solve $\min_{\bar{N}} \ \frac{1}{2} \| \bar{N}+\bar{N}^T - M \|_F^2 \ \text{s.t.} \ (\bar{N}+\bar{N}^T)d = g$. It seems easier to me. – Marc Dinh Apr 20 '20 at 09:07
  • Thanks for the help. It now seems easy and that's quite a useful trick. I've posted the solution below; please point out if there is a mistake. – Kaiwen Sheng Apr 20 '20 at 09:30
  • Really nice question. I liked deriving my 2 answers to it. – Royi Apr 21 '20 at 10:26
  • By the way, you could easily derive an iterative method - Projected Gradient Descent - for this problem. Since the set of solutions to the linear equality constraint isn't a subspace you can't use POCS, but you can use Dykstra's algorithm. See https://math.stackexchange.com/questions/1492095. – Royi Apr 21 '20 at 12:00

3 Answers

2

Regarding the approach of adding the transpose, it should be as follows:

$$\begin{aligned} \arg \min_{X} \quad & \frac{1}{2} {\left\| X - Y \right\|}_{F}^{2} \\ \text{subject to} \quad & X \in \mathcal{S}^{n} \\ & X a = b \end{aligned} \\ \Updownarrow \\ \begin{aligned} \arg \min_{X} \quad & \frac{1}{2} {\left\| X + {X}^{T} - Y \right\|}_{F}^{2} \\ \text{subject to} \quad & \left( X + {X}^{T} \right) a = b \end{aligned} $$

The Lagrangian is given by:

$$ L \left( X, v \right) = \frac{1}{2} {\left\| X + {X}^{T} - Y \right\|}_{F}^{2} + {v}^{T} \left( \left( X + {X}^{T} \right) a - b \right) $$

Now, setting the gradient to zero gives:

$$ {\nabla}_{X} L \left( X, v \right) = 2 X + 2 {X}^{T} - Y - {Y}^{T} + a {v}^{T} + v {a}^{T} = 0 \Leftrightarrow X + {X}^{T} = \frac{1}{2} \left( Y + {Y}^{T} - v {a}^{T} - a {v}^{T} \right) $$

Multiplying on the right by $ a $ and using the constraint $ \left( X + {X}^{T} \right) a = b $ yields:

$$\begin{aligned} b & = \frac{1}{2} \left( Y + {Y}^{T} - v {a}^{T} - a {v}^{T} \right) a \\ & = \frac{1}{2} \left( Y + {Y}^{T} \right) a - \frac{1}{2} \left( v {a}^{T} a + a {v}^{T} a \right) \\ & = \frac{1}{2} \left( Y + {Y}^{T} \right) a - \frac{1}{2} \left( {a}^{T} a v + \left( {a}^{T} \otimes a \right) v \right) \\ & = \frac{1}{2} \left( Y + {Y}^{T} \right) a - \frac{1}{2} \left( {a}^{T} a I + {a}^{T} \otimes a \right) v \\ & = \frac{1}{2} \left( Y + {Y}^{T} \right) a - \frac{1}{2} \left( {a}^{T} a I + a {a}^{T} \right) v \end{aligned}$$

Hence $ v = {\left( {a}^{T} a I + a {a}^{T} \right)}^{-1} \left( \left( Y + {Y}^{T} \right) a - 2 b \right) $.

Substituting $ v $ back then implies:

$$ X + {X}^{T} = \frac{1}{2} \left( Y + {Y}^{T} - {\left( {a}^{T} a I + a {a}^{T} \right)}^{-1} \left( \left( Y + {Y}^{T} \right) a - 2 b \right) {a}^{T} - a {\left( {\left( {a}^{T} a I + a {a}^{T} \right)}^{-1} \left( \left( Y + {Y}^{T} \right) a - 2 b \right) \right)}^{T} \right) $$
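For illustration, here is a minimal NumPy sketch of this closed form (my own code, not the MATLAB implementation referenced below; the function name and test data are illustrative):

```python
import numpy as np

def symmetric_lse(Y, a, b):
    """Closed-form solution of min 0.5*||X - Y||_F^2 s.t. X = X^T, X a = b.

    Sketch of the formulas derived above; Y need not be symmetric.
    """
    n = Y.shape[0]
    S = Y + Y.T
    # v = (a^T a I + a a^T)^{-1} ((Y + Y^T) a - 2 b)
    v = np.linalg.solve((a @ a) * np.eye(n) + np.outer(a, a), S @ a - 2 * b)
    # X + X^T = 0.5 * (Y + Y^T - v a^T - a v^T) is the symmetric solution
    return 0.5 * (S - np.outer(v, a) - np.outer(a, v))

rng = np.random.default_rng(0)
n = 5
Y, a, b = rng.standard_normal((n, n)), rng.standard_normal(n), rng.standard_normal(n)
X = symmetric_lse(Y, a, b)
print(np.allclose(X, X.T), np.allclose(X @ a, b))  # both should print True
```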

I implemented both methods in MATLAB and verified the code vs. CVX. The MATLAB Code is accessible in my StackExchange Mathematics Q3631718 GitHub Repository.

Remark: In this solution $ Y $ isn't assumed to be a symmetric matrix.

Royi
  • 10,050
1

I would like to propose a different approach. When optimizing over the Frobenius norm we're basically working with vectors.

So, writing the problem as:

$$\begin{aligned} \arg \min_{X} \quad & \frac{1}{2} {\left\| X - Y \right\|}_{F}^{2} \\ \text{subject to} \quad & X \in \mathcal{S}^{n} \\ & X a = b \end{aligned}$$

Where $ \mathcal{S}^{n} $ is the set of Symmetric Matrices of size $ n $.

Let's define $ x = \operatorname{vec} \left( X \right) $ and $ y = \operatorname{vec} \left( Y \right) $, where $ \operatorname{vec} \left( \cdot \right) $ is the Vectorization Operator. Using it we can rewrite the problem as:

$$\begin{aligned} \arg \min_{x} \quad & \frac{1}{2} {\left\| x - y \right\|}_{2}^{2} \\ \text{subject to} \quad & \left( U - L \right) x = \boldsymbol{0} \\ & \left( {a}^{T} \otimes I \right) x = b \end{aligned}$$

Where $ \otimes $ is the Kronecker Product. In order to convert $ X a = b $ into $ \left( {a}^{T} \otimes I \right) x = b $ I used the Kronecker Product property (See Kronecker Product - Matrix Equations). The matrix $ L $ extracts the lower triangle of the matrix $ X $ from $ x $, and $ U $ extracts the upper triangle, so the constraint $ \left( U - L \right) x = \boldsymbol{0} $ enforces symmetry; a small example is given below.
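For concreteness, a small example of my own (not part of the original answer): with $ n = 2 $ and column-major vectorization, $ x = \left( X_{11}, X_{21}, X_{12}, X_{22} \right)^{T} $, symmetry requires $ X_{21} = X_{12} $, so a single row suffices:

$$ U = \begin{bmatrix} 0 & 0 & 1 & 0 \end{bmatrix}, \quad L = \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix}, \quad \left( U - L \right) x = X_{12} - X_{21} = 0 $$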

By setting $ C = \begin{bmatrix} U - L \\ {a}^{T} \otimes I \end{bmatrix} $ and $ d = \begin{bmatrix} \boldsymbol{0} \\ b \end{bmatrix} $ the problem can be written as:

$$\begin{aligned} \arg \min_{x} \quad & \frac{1}{2} {\left\| x - y \right\|}_{2}^{2} \\ \text{subject to} \quad & C x = d \end{aligned}$$

Now you have a simple Linear Least Squares Problem with Equality Constraints.

So all that's needed is to solve the following KKT system:

$$ \begin{bmatrix} I & {C}^{T} \\ {C} & 0 \end{bmatrix} \begin{bmatrix} \hat{x} \\ \hat{\nu} \end{bmatrix} = \begin{bmatrix} y \\ d \end{bmatrix} $$

Though the system is much larger, all matrices are sparse.
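As a sanity check, here is a hedged SciPy sketch of this KKT system (my own code, not the author's MATLAB implementation; it assumes column-major $ \operatorname{vec} $, so that $ X a = b $ becomes $ \left( {a}^{T} \otimes I \right) x = b $):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def symmetric_lse_kkt(Y, a, b):
    """Solve min 0.5*||x - y||_2^2 s.t. (U - L) x = 0 and (a^T kron I) x = b."""
    n = Y.shape[0]
    y = Y.flatten(order="F")  # y = vec(Y), column-major
    # One symmetry row per strict upper-triangle entry (i, j), i < j:
    # U picks x_{ij} (index i + j*n), L picks x_{ji} (index j + i*n).
    pairs = [(i, j) for j in range(n) for i in range(j)]
    m = len(pairs)
    rows, ones = np.arange(m), np.ones(m)
    U = sp.csr_matrix((ones, (rows, [i + j * n for i, j in pairs])), shape=(m, n * n))
    L = sp.csr_matrix((ones, (rows, [j + i * n for i, j in pairs])), shape=(m, n * n))
    A = sp.kron(sp.csr_matrix(a.reshape(1, -1)), sp.identity(n))  # (a^T kron I)
    C = sp.vstack([U - L, A])
    d = np.concatenate([np.zeros(m), b])
    # KKT system [I, C^T; C, 0] [x; nu] = [y; d], all blocks sparse
    K = sp.bmat([[sp.identity(n * n), C.T], [C, None]], format="csc")
    sol = spla.spsolve(K, np.concatenate([y, d]))
    return sol[: n * n].reshape((n, n), order="F")
```

On random inputs this should return the same symmetric $ X $ as the closed form in the other answer.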

I implemented both methods in MATLAB and verified the code vs. CVX. The MATLAB Code is accessible in my StackExchange Mathematics Q3631718 GitHub Repository.

Remark: In this solution $ Y $ isn't assumed to be a symmetric matrix.

Royi
  • 10,050
0

Thanks to Marc for the help. Here is the solution following his hint.

The Lagrange function can now be written as:

$$ L(\hat N, \lambda) = \frac{1}{2} \|\hat N + \hat N^T - M\|_F^2 - \lambda^T \left( (\hat N + \hat N^T) d - g \right) $$

Setting the derivative to zero (and using $M = M^T$) gives:

$$ \frac{\partial L}{\partial \hat N} = 2 (\hat N + \hat N^T - M) - (\lambda d^T + d \lambda^T) = 0 \\ \Rightarrow N = \hat N + \hat N^T = M + \frac{1}{2} (\lambda d^T + d \lambda^T) $$

Substituting this back into the secant condition $Nd = g$, we get:

$$ Nd = Md + \frac{1}{2} (\lambda d^T + d \lambda^T)d = g \\ \Rightarrow \lambda = 2(d^TdI + dd^T)^{-1}(g - Md) $$
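For illustration, a minimal NumPy sketch of this closed form (my own code, not from the original post; it assumes $M$ is symmetric, as stated in the question):

```python
import numpy as np

def solve_secant(M, d, g):
    """N = M + 0.5*(lam d^T + d lam^T), lam = 2 (d^T d I + d d^T)^{-1} (g - M d)."""
    n = M.shape[0]
    lam = 2.0 * np.linalg.solve((d @ d) * np.eye(n) + np.outer(d, d), g - M @ d)
    return M + 0.5 * (np.outer(lam, d) + np.outer(d, lam))

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n)); M = M + M.T  # symmetric M, as assumed
d, g = rng.standard_normal(n), rng.standard_normal(n)
N = solve_secant(M, d, g)
print(np.allclose(N, N.T), np.allclose(N @ d, g))  # both should print True
```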

  • There is a minus sign error in your last equation. Also, the solution is the same as in the link you provided. Another way is to prove that the solution in the link is already symmetric when $M$ is symmetric, so it is also the solution to the symmetric case. – Marc Dinh Apr 20 '20 at 12:51
  • (I have no clue how to do this however). – Marc Dinh Apr 20 '20 at 12:59
  • I think your derivative with respect to $ \hat{N} $ is wrong. I will solve it with this approach later. At the moment I solved it with another approach (I just love the Kronecker product). – Royi Apr 20 '20 at 15:34
  • If $M$ is symmetric, as I supposed in my question, your solution is the same as mine. But your solution should be the more general one :) Also note that my solution had some sign mistakes, which I've since fixed. – Kaiwen Sheng Apr 21 '20 at 07:11
  • You haven't suggested anywhere that $ M $ is symmetric. Anyhow, I gave you 2 solutions. I think my approach of vectorizing the problem will be able to support larger problems. – Royi Apr 21 '20 at 09:49
  • I did suggest it, and I've now made it bold in case there is any misunderstanding. Anyway, I really appreciate your solutions and your code :) Happy to see all of them. – Kaiwen Sheng Apr 21 '20 at 14:13