Given $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{n \times d}$ and $C \in \mathbb{R}^{m \times d}$, I want to find an analytical solution for $X \in \mathbb{R}^{n \times n}$ that minimizes $$ \lVert A X B - C \rVert^2_F$$ subject to the constraint that $X ^ T X = I_n$ (i.e. that $X$ is orthogonal).

This seems like a simple extension of the standard orthogonal Procrustes problem, but I'm finding it difficult to solve, let alone find a resource which provides a solution.

Using the Kronecker-$\textrm{vec}$ trick doesn't seem to get me anywhere, because I can't figure out how to enforce orthogonality on $\textrm{vec}(X)$. Similarly, following the standard Procrustes derivation by expanding the inner product only reduces the problem to minimizing $$ \lVert A X B \rVert^2_F - 2 \langle A X B, C \rangle,$$ which is not equivalent to maximizing the inner product over $X$ as in the standard Procrustes problem.
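
As a sketch of why the quadratic term does not drop out, using only the definitions above and the cyclic property of the trace: $$ \lVert A X B \rVert^2_F = \operatorname{Tr}\left(B^T X^T A^T A X B\right) = \operatorname{Tr}\left(X^T A^T A \, X \, B B^T\right). $$ In the standard problem ($A = I$) this equals $\operatorname{Tr}(X^T X B B^T) = \lVert B \rVert^2_F$, a constant, so only the inner product matters. Here it stays constant only in special cases, for example when $A^T A$ or $B B^T$ is a scalar multiple of the identity; otherwise it varies with $X$.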

tommym
  • Seems to be quite nontrivial. Google "unbalanced Procrustes problem". This gives for example https://www.jstor.org/stable/43693402. This seems to be an open problem in general (though I am not at all an expert on these kinds of things). – Severin Schraven Dec 03 '24 at 22:13
  • Hmm, that's a shame. Even for square $X$ as stated here? – tommym Dec 04 '24 at 03:07

1 Answer

Let us write your problem more formally: $$ \min_X \|AXB - C\|_F^2 \qquad \text{s.t.} \quad X^TX=I $$ Relaxing the constraint with a quadratic penalty of weight $\lambda > 0$ (strictly speaking a penalty formulation rather than a Lagrangian), we can rewrite this problem as follows: $$ \min_X \mathcal{L}(X) = \|AXB - C\|_F^2 + \lambda\|X^TX - I\|_F^2 $$ Now we can look for a stationary point of this unconstrained problem by taking its derivative with respect to $X$ and setting it to zero.

For the first term, we can use the trace operator to write: $$ \|AXB - C\|_F^2 = \operatorname{Tr}\left[(AXB - C)^T(AXB - C)\right], $$ Taking the derivative of this term with respect to $X$ yields: $$ \frac{d \operatorname{Tr}\left[(AXB - C)^T(AXB - C)\right]}{dX}=2A^T(AXB-C)B^T $$ For the second term, we have $$ \frac{d \|X^TX - I\|_F^2}{d X} = 4X(X^TX - I) $$

Putting the two terms together, we obtain $$ \nabla \mathcal{L}(X) = 2A^T(AXB-C)B^T + 4\lambda X(X^TX - I) = 0 $$ This equation has no apparent closed-form solution. Thus, we can use an iterative technique such as gradient descent with step size $\alpha$ to solve it, i.e., $$ X^t = X^{t-1} - \alpha \nabla \mathcal{L}(X^{t-1}) $$
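
A minimal NumPy sketch of this iteration, for illustration only. It uses the gradient $2A^T(AXB-C)B^T + 4\lambda X(X^TX - I)$ of the penalized objective. The function names, the identity starting point, and the default values of `lam`, `alpha` and `steps` are assumptions made for the example, not part of the derivation; a fixed step size this small is chosen for safety on small random inputs and would need tuning in practice.

```python
import numpy as np


def penalized_objective(X, A, B, C, lam):
    """Value of ||A X B - C||_F^2 + lam * ||X^T X - I||_F^2."""
    n = X.shape[0]
    fit = np.linalg.norm(A @ X @ B - C, "fro") ** 2
    penalty = np.linalg.norm(X.T @ X - np.eye(n), "fro") ** 2
    return fit + lam * penalty


def penalized_gradient(X, A, B, C, lam):
    """Gradient of the penalized objective with respect to X."""
    n = X.shape[0]
    residual = A @ X @ B - C
    return 2.0 * A.T @ residual @ B.T + 4.0 * lam * X @ (X.T @ X - np.eye(n))


def gradient_descent(A, B, C, lam=1.0, alpha=1e-4, steps=500):
    """Fixed-step gradient descent, started from the identity (an assumption)."""
    X = np.eye(A.shape[1])
    for _ in range(steps):
        X = X - alpha * penalized_gradient(X, A, B, C, lam)
    return X


# Small random instance (hypothetical data, only to exercise the functions).
rng = np.random.default_rng(0)
m, n, d = 5, 4, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, d))
C = rng.standard_normal((m, d))

X_hat = gradient_descent(A, B, C)
print("objective at identity:  ", penalized_objective(np.eye(n), A, B, C, 1.0))
print("objective after descent:", penalized_objective(X_hat, A, B, C, 1.0))
print("orthogonality residual: ", np.linalg.norm(X_hat.T @ X_hat - np.eye(n)))
```

Note that with a finite $\lambda$ the result is only approximately orthogonal; the last printed value shows how far $X^TX$ is from $I$.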

  • Right, I'm familiar with the derivation. The standard orthogonal Procrustes problem reduces in much the same way and appears to have no closed-form solution. However, one can observe before taking the derivative that the minimization is equivalent to maximizing the inner product term, which is linear in $X$, and thus show that the solution can be found through the SVD. I was hoping there was another elegant insight here that I was missing. – tommym Dec 04 '24 at 03:11
  • $+1$. Also, after every iteration you can project $X$ back into $SO(n)$, i.e. $X_t = X_t\,(X_t^TX_t)^{-1/2}$. This ensures that the $X(X^TX-I)$ term is always zero, so you can omit it from $\nabla{\cal L}(X)$. – greg Dec 05 '24 at 19:42
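
The projection suggested in the last comment can be sketched in NumPy as follows. For a full-rank $X$ with SVD $X = U \Sigma V^T$, the product $X (X^TX)^{-1/2}$ equals $U V^T$, so the sketch computes the projection through the SVD instead of a matrix inverse square root. The function names, the identity starting point, and the default `alpha` and `steps` are assumptions for illustration; nothing here guarantees convergence to a global minimizer.

```python
import numpy as np


def project_to_orthogonal(X):
    """Polar factor U V^T of X = U S V^T, equal to X (X^T X)^{-1/2} for full-rank X."""
    U, _, Vt = np.linalg.svd(X)
    return U @ Vt


def projected_gradient_descent(A, B, C, alpha=1e-4, steps=500):
    """Gradient step on ||A X B - C||_F^2 followed by projection onto orthogonal matrices."""
    X = np.eye(A.shape[1])  # starting point is an assumption
    for _ in range(steps):
        # The penalty term is omitted because X is orthogonal after each projection.
        grad = 2.0 * A.T @ (A @ X @ B - C) @ B.T
        X = project_to_orthogonal(X - alpha * grad)
    return X


# Small random instance (hypothetical data, only to exercise the functions).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4))
B = rng.standard_normal((4, 3))
C = rng.standard_normal((5, 3))

X_hat = projected_gradient_descent(A, B, C)
print("orthogonality residual:", np.linalg.norm(X_hat.T @ X_hat - np.eye(4)))
print("objective:", np.linalg.norm(A @ X_hat @ B - C, "fro") ** 2)
```

The factor $U V^T$ is orthogonal, and for nonsingular $X$ its determinant has the same sign as $\det X$, so the iterate stays in $SO(n)$ as long as the matrix being projected keeps a positive determinant.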