Given matrices $A, B \in \mathbb{C}^{(n-s) \times n}$ and matrix $\Sigma \in \mathbb{C}^{n \times n}$, I want to solve the following equality-constrained minimization problem
$$\begin{array}{ll} \underset{X \in \mathbb{C}^{n \times n}}{\text{minimize}} & \| \Sigma - X \cdot X^H \|_F\\ \text{subject to} & A X = B\end{array}$$
where $X^H = X^*$ is the hermitian conjugate (transpose and complex conjugate) and $\| \cdot \|_F$ is the Frobenius norm.
First, note that $\| Y \|_F^2 = \operatorname{tr}(YY^H)$, so minimizing $$ \operatorname{tr}\left( \Sigma\Sigma^H - \Sigma X X^H - X X^H \Sigma^H + X X^H X X^H \right) $$ is equivalent to minimizing the original objective (squaring the norm does not change the minimizer).
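As a sanity check, the identity $\|Y\|_F^2 = \operatorname{tr}(YY^H)$ and the expanded trace can be verified numerically with NumPy on made-up random test matrices (the sizes below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Hypothetical random complex test matrices with the shapes from the problem.
Sigma = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

R = Sigma - X @ X.conj().T
P = X @ X.conj().T  # note P^H = P

# ||Y||_F^2 = tr(Y Y^H)
lhs = np.linalg.norm(R, "fro") ** 2
rhs = np.trace(R @ R.conj().T).real

# Expanded form: tr(Sigma Sigma^H - Sigma X X^H - X X^H Sigma^H + X X^H X X^H)
expanded = np.trace(
    Sigma @ Sigma.conj().T - Sigma @ P - P @ Sigma.conj().T + P @ P
).real

assert np.isclose(lhs, rhs) and np.isclose(lhs, expanded)
```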
I first checked whether this constrained optimization problem admits an analytic (closed-form) solution, and the natural thing to try is the method of Lagrange multipliers: $$ f(X) = \operatorname{tr}\bigl( (\Sigma - XX^H)(\Sigma - XX^H)^H \bigr) + \lambda (A X - B) $$ but I ran into two problems:
- What is $\lambda$? It cannot be a scalar, because the second term is a matrix while the first is a scalar; moreover, no matrix pre- or post-multiplying the constraint term turns it into a scalar. One possible fix is to write out the $(n-s) \times n$ scalar equations separately, assign each its own multiplier $\lambda_{i,j}$ with $i=1,\dots,n-s$ and $j=1,\dots,n$, and sum them (giving $(n-s)n$ extra terms in $f(X)$). Another is to replace the term with the inner product $\langle \lambda , AX - B \rangle$, where $\lambda \in \mathbb{C}^{(n-s)\times n}$.
- The objective contains $X X^H$, i.e. terms of the form $z \cdot \bar{z}$, so it is not differentiable in the complex sense (not holomorphic). This hampers the Lagrange multiplier method and other gradient-based algorithms, at least in their naive complex form.
See "How to set up Lagrangian optimization with matrix constrains" for discussion of the first problem.
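For reference, the inner-product form of the Lagrangian mentioned in the first bullet would read (this is a standard convention, not something derived here):

$$ \mathcal{L}(X, \Lambda) = \operatorname{tr}\!\left( (\Sigma - XX^H)(\Sigma - XX^H)^H \right) + \operatorname{Re}\operatorname{tr}\!\left( \Lambda^H (AX - B) \right), \qquad \Lambda \in \mathbb{C}^{(n-s)\times n}, $$

with stationarity taken in the Wirtinger sense, i.e. treating $X$ and $\overline{X}$ as independent variables and setting $\partial\mathcal{L}/\partial\overline{X} = 0$, which is the usual way around the non-holomorphy noted in the second bullet.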
Am I wrong here? Or do these two arguments indeed show that obtaining an analytic expression via Lagrange multipliers is not feasible?
Another idea is to use the (Moore–Penrose) pseudo-inverse of $A$: setting $X = A^+ B$ satisfies the constraint, but it pins down $X$ completely, which cannot be the right solution (if $X$ were determined uniquely by the constraint, there would be no minimization problem left). Note that since $B \in \mathbb{C}^{(n-s) \times n}$ there are more unknowns ($X_{i,j}$) than equations, so there are $n^2 - (n-s) n = sn$ degrees of freedom left in $X$.
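Assuming $A$ has full row rank (an assumption not stated above), the complete solution set of $AX = B$ can be parametrized as $X = A^+ B + (I - A^+ A) Z$ with $Z \in \mathbb{C}^{n \times n}$ arbitrary, which accounts exactly for the $sn$ degrees of freedom. A quick NumPy check of that parametrization, on hypothetical random data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, s = 6, 2
m = n - s

# Random full-row-rank constraint data (hypothetical test values).
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
B = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

A_pinv = np.linalg.pinv(A)
X0 = A_pinv @ B                   # particular (minimum-norm) solution
P_null = np.eye(n) - A_pinv @ A   # projector onto the null space of A

# Any Z in C^{n x n} yields a feasible X: these are the s*n degrees of freedom.
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = X0 + P_null @ Z

assert np.allclose(A @ X, B)
```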
If there is no analytic solution, what is an algorithmic way to solve this constrained minimization problem? I want to be able to program it and check it in Python using packages such as NumPy and SciPy. (Remark: I suspect gradient-based algorithms may run into trouble because the term $XX^H$ is not differentiable in the complex sense.) Numerical optimization algorithms are fine too.
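One common numerical workaround (a sketch under stated assumptions, not a vetted implementation) is to eliminate the constraint via the pseudo-inverse parametrization and hand `scipy.optimize.minimize` the real and imaginary parts of the free parameter $Z$ as one real vector; the objective then becomes an ordinary smooth real-valued function of real variables, so the holomorphy issue does not arise. All sizes and data below are made-up test values, and `Sigma` is chosen Hermitian positive semidefinite so that a good fit exists:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n, s = 4, 1
m = n - s

# Hypothetical test data.
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
B = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Sigma = G @ G.conj().T

# Eliminate the constraint: X = X0 + P_null @ Z is feasible for every Z
# (assuming A has full row rank).
A_pinv = np.linalg.pinv(A)
X0 = A_pinv @ B
P_null = np.eye(n) - A_pinv @ A

def unpack(v):
    # scipy optimizes over real vectors, so stack Re(Z) and Im(Z).
    re, im = v[: n * n], v[n * n :]
    return (re + 1j * im).reshape(n, n)

def objective(v):
    X = X0 + P_null @ unpack(v)
    return np.linalg.norm(Sigma - X @ X.conj().T, "fro") ** 2

v0 = np.zeros(2 * n * n)
res = minimize(objective, v0, method="L-BFGS-B")  # finite-difference gradient
X_opt = X0 + P_null @ unpack(res.x)

assert np.allclose(A @ X_opt, B)  # constraint holds by construction
```

The objective is non-convex in $Z$, so this only finds a local minimum; restarting from several random `v0` values is a cheap hedge against bad local minima.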
Related questions:
- How to set up Lagrangian optimization with matrix constrains
- Minimize $\mbox{tr}(AX)$ where $X$ is an orthogonal matrix
- Lagrangian multipliers with complex valued vectors and matrices
General resources on complex optimization: