
Let $A\in\mathbb{R}^{N\times N}$ be a given constant symmetric matrix. Consider the equation:

$$X' A X = 0$$

in the unknown $X$, a rectangular $N\times M$ matrix.

If $A$ is positive definite, then of course the only solution is the zero matrix, $X = 0$. So I'll assume that's not the case: $A$ has both some negative and some positive eigenvalues.

Now consider another rectangular matrix $Y\in\mathbb{R}^{N\times M}$. My goal is to project $Y$ onto the solution space of the above equation. In other words, given $A, Y$, I must find the $X$ closest to $Y$ that satisfies the above equation:

$$\underset{X}{\mathrm{argmin}} \| X - Y \|^2 \qquad \text{subject to} \qquad X'AX=0$$

Here $\| X - Y \|^2$ denotes $\sum_{ij} (X_{ij}-Y_{ij})^2$, i.e. the squared Frobenius distance.

Note that differentiating the Lagrangian of this problem leads to a Sylvester equation (hence the tag), which doesn't look very helpful.
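For reference, here is a sketch of that computation (with a symmetric matrix of Lagrange multipliers $\Lambda \in \mathbb{R}^{M\times M}$; the notation is mine). The Lagrangian is

$$\mathcal{L}(X,\Lambda) = \|X-Y\|^2 + \operatorname{tr}\left(\Lambda\, X'AX\right),$$

and setting its gradient with respect to $X$ to zero (using that $A$ and $\Lambda$ are symmetric) gives

$$2(X-Y) + 2AX\Lambda = 0, \qquad \text{i.e.} \qquad X + AX\Lambda = Y,$$

which, for fixed $\Lambda$, is a Sylvester-type equation in $X$, coupled to the constraint $X'AX=0$.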

a06e
    You are using the term "solution space" but the set of all $X$ such that $X^TAX=0$ isn't a (vector) subspace of any space. – Jean Marie Jan 14 '22 at 15:41
  • I never meant to imply that it was a vector space. It isn't. Though I'm not sure what kind of space it is. – a06e Jan 14 '22 at 19:35
  • Yes, by "projection on a set" I would mean to find the closest point. Should be clear from the formulation I give in terms of the argmin, right? – a06e Jan 14 '22 at 20:25
  • Note, as far as I'm aware, it is valid to speak of projection also for non-linear target spaces, see for instance https://www.sciencedirect.com/science/article/abs/pii/S0045782508000649. – a06e Jan 14 '22 at 20:27

1 Answer


Before looking at projection properties (we will do so in the last paragraph), it is essential to have an idea of what the matrices $X$ such that $X^TAX=0$ "look like".

In the case $N=M$ the set of such matrices $X$ is rather easy to characterize.

Let us illustrate it in the case $N=M=3$ for an indefinite matrix, taking $A=\operatorname{diag}(1,1,-1)$, so that:

$$X^TAX=\begin{pmatrix}a&b&c\\d&e&f\\g&h&i \end{pmatrix}\begin{pmatrix}1& 0&0\\0&1&0\\0&0&-1 \end{pmatrix}\begin{pmatrix}a&d&g\\b&e&h\\ c&f&i\end{pmatrix}=0\tag{1}$$

This is equivalent to the set of 6 equations:

$$\begin{cases} a^2+b^2-c^2&=&0&(Eq. 1)\\ d^2+e^2-f^2&=&0&(Eq. 2)\\ g^2+h^2-i^2&=&0&(Eq. 3)\\ ad+be-cf&=&0&(Eq. 4)\\ ag+bh-ci&=&0&(Eq. 5)\\ dg+eh-fi&=&0&(Eq. 6) \end{cases}\tag{2}$$
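(As a sanity check, the entries of (2) can be reproduced symbolically; here is a minimal sketch using sympy, with the symbol names of (1).)

```python
import sympy as sp

# Symbols for the entries of X, following the notation of (1).
a, b, c, d, e, f, g, h, i = sp.symbols('a b c d e f g h i')

A = sp.diag(1, 1, -1)
X = sp.Matrix([[a, d, g],
               [b, e, h],
               [c, f, i]])   # columns u, v, w

# The entries of X^T A X are exactly the left-hand sides of (2);
# e.g. the (2,3) entry is d*g + e*h - f*i.
print((X.T * A * X).expand())
```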

Let

$$u:=\begin{pmatrix}a\\b\\c \end{pmatrix}, \ \ v:= \begin{pmatrix}d\\e\\f \end{pmatrix}, \ \ w:= \begin{pmatrix}g\\h\\i \end{pmatrix}$$

Let $(C)$ be the cone with equation $x^2+y^2-z^2=0$.

The first 3 relations in (2) express the fact that the points $u,v,w$ belong to $(C)$.

Let us take the case of the tangent plane $(T_u)$ to $(C)$ at $u$.

Its equation is $xa+yb-zc=0$.

(Eq. 4) and (Eq. 5) above express the fact that $v \in (T_u)$ and $w \in (T_u)$. Since $v$ and $w$ also lie on $(C)$, and the intersection of $(C)$ with its tangent plane at $u$ is the generatrix line through $u$, this is possible if and only if $u,v,w$ are proportional vectors (i.e., belong to a same generatrix line of the cone). (Eq. 6) is then automatically satisfied.

As a consequence, the set of matrices $X$ such that $X^TAX=0$ is the set of rank-one matrices described by the formula:

$$X=\begin{pmatrix}a&pa&qa\\b&pb&qb\\ c&pc&qc\end{pmatrix}=\begin{pmatrix}a\\b\\c\end{pmatrix}(1 \ \ p \ \ q)\tag{3}$$
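(A quick numerical check of this characterization, with hypothetical values of $p,q$ and the column $(a,b,c)^T$ chosen on the cone.)

```python
import numpy as np

# u = (a, b, c) on the cone a^2 + b^2 - c^2 = 0; p, q arbitrary.
u = np.array([3.0, 4.0, 5.0])       # 9 + 16 - 25 = 0
p, q = 0.7, -2.0

X = np.outer(u, [1.0, p, q])        # the rank-one form (3)
A = np.diag([1.0, 1.0, -1.0])

print(np.allclose(X.T @ A @ X, 0))  # True: X satisfies X^T A X = 0
```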

In this case, we can characterize the projection of a $3 \times 3$ matrix $Y$. Let us recall a general result based on the SVD (Singular Value Decomposition): the projection of $Y$ onto the family of rank-one matrices is obtained by taking $\sigma_1 U_1V_1^T$, where $\sigma_1$ is the largest singular value of $Y$ and $U_1, V_1$ are the associated unit left and right singular vectors, respectively.
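(A minimal numerical sketch of that rank-one projection with numpy; the matrix $Y$ below is just an arbitrary illustration.)

```python
import numpy as np

Y = np.array([[1.0, 2.0, 0.5],
              [0.0, 1.0, 3.0],
              [2.0, 0.0, 1.0]])

# Best rank-one approximation of Y (Eckart-Young): sigma_1 * U_1 * V_1^T.
U, s, Vt = np.linalg.svd(Y)
X1 = s[0] * np.outer(U[:, 0], Vt[0, :])

print(np.linalg.norm(Y - X1))   # distance from Y to its rank-one projection
```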

Edit: The cases where $M>N$ appear to be amenable to the previous case. Let us consider, once again for the sake of simplicity, the case $N=3$ and $M=N+1=4$. One can write $X=[X_1,X_2]$ where $X_1$ is a $3 \times 3$ matrix and $X_2$ a "column" vector of $\mathbb R^3$.

$$X^TAX=0 \iff \begin{pmatrix}X_1^T\\X_2^T\end{pmatrix}A\begin{pmatrix}X_1&X_2\end{pmatrix}=0 \iff \begin{pmatrix}X_1^TAX_1&X_1^TAX_2\\X_2^TAX_1&X_2^TAX_2\end{pmatrix}=\begin{pmatrix}0&0\\0&0\end{pmatrix}\tag{4}$$

In other words:

$$\begin{cases}X_1^TAX_1&=&0\\X_1^TAX_2&=&0\\X_2^TAX_2&=&0\end{cases}\tag{5}$$

Among the $3$ constraints given by (5), the first one has already been dealt with above. The third one, once again with the cone interpretation, means that $X_2$ lies on the cone, and the second one that $X_2$ belongs to all the tangent planes at the columns of matrix $X_1$. As a consequence, $X_2$ belongs to the (one-dimensional) column space of matrix $X_1$, i.e. $X_2 = r\,(a \ \ b \ \ c)^T$ for some scalar $r$. Therefore, using (3):

$$X=\begin{pmatrix}a\\b\\c\end{pmatrix}(1 \ \ p \ \ q \ \ r)\tag{6}$$
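(Again a quick numerical check of (6), with hypothetical values $p,q,r$ and the column $(a,b,c)^T$ on the cone.)

```python
import numpy as np

u = np.array([3.0, 4.0, 5.0])            # (a, b, c) on the cone
p, q, r = 0.7, -2.0, 1.3

X = np.outer(u, [1.0, p, q, r])          # 3 x 4 matrix of the form (6)
A = np.diag([1.0, 1.0, -1.0])

print(np.allclose(X.T @ A @ X, 0))       # True
```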

Jean Marie
  • Thanks! This looks very insightful! I'll for sure read it carefully tomorrow. But just a quick question now: you only solve the square case with M = N = 3. Is this because you think the same method of solution works for general M, N? If so I can try to work it out tomorrow. I did try something similar, I think, but got stuck when there was more than one positive and more than one negative eigenvalue. But again thanks! I'll read it carefully tomorrow morning, it's late now. – a06e Jan 15 '22 at 00:23
  • Well good night then! I'll read this tomorrow. Thanks. – a06e Jan 15 '22 at 00:26
  • I'm not sure how to generalize this argument to higher dimensions (or if it's possible to do so). – a06e Jan 23 '22 at 19:44
  • See the Edit I just wrote. – Jean Marie Jan 24 '22 at 11:45
  • Sorry I don't see any edit. Are you sure you posted it? Thanks! – a06e Jan 24 '22 at 20:46
  • Thanks! Unfortunately the work I'm doing where I am applying this actually has M < N typically. – a06e Jan 25 '22 at 08:42
  • I did some literature search and found this paper from last year: https://www.sciencedirect.com/science/article/abs/pii/S009630032100552X. It handles a similar "non-homogeneous" problem, $X'AX = C$, where $A$ is posdef. – a06e Jan 25 '22 at 08:42
  • Regarding the solution space of $X'AX = 0$, we can split into two subspaces, corresponding to the positive and negative eigenvalues of $A$. Then this equation says that the dot products of the columns of $X$ computed in the two subspaces coincide. This will still be the case after arbitrary rotations in either of the two subspaces (multiplying with some matrix orthogonal with respect to $A^\pm$). Since the overall norm of $X$ doesn't matter, I think the projection is just a matter of deciding how to allocate $X$ among these two subspaces. It's a very intuitive picture I'm trying to formalize .... – a06e Jan 25 '22 at 08:46
  • OK, I wasn't aware of this condition $M<N$... I understand your comments: working separately in these 2 subspaces ought to be a good idea. Nevertheless, I am almost sure that, afterwards, the SVD must be used in these 2 subspaces ("low rank approximation"). – Jean Marie Jan 25 '22 at 09:21
  • I'm happy it makes sense to you. I'll try to formalize it better and see if it gets me somewhere. – a06e Jan 25 '22 at 09:41
  • If I abandon the desire to find a projection (in the sense of the minimum distance point), and just focus on finding any parameterization of the solution space of this equation, would that be any easier? – a06e May 06 '22 at 08:04