Proof of the CS (cosine-sine) matrix decomposition

Question

The CS decomposition is a way to write the singular value decomposition of a matrix with orthonormal columns. More specifically, taking the notation from these notes (pdf alert), consider an $(n_1+n_2)\times p$ matrix $Q$, with $$Q=\begin{bmatrix}Q_1 \\ Q_2\end{bmatrix},$$ where $Q_1$ has dimensions $n_1\times p$ and $Q_2$ has dimensions $n_2\times p$. Assume $Q$ has orthonormal columns, that is, $Q_1^\dagger Q_1+Q_2^\dagger Q_2=I$.

Then the CS decomposition essentially tells us that the SVDs of $Q_1$ and $Q_2$ are related. More specifically, there are unitaries $V, U_1, U_2$ such that $$\begin{aligned} U_1^\dagger Q_1 V&=\operatorname{diag}(c_1,\dots,c_p), \\ U_2^\dagger Q_2 V&=\operatorname{diag}(s_1,\dots,s_q), \end{aligned}$$ with $c_i^2+s_i^2=1$ (from which the name of the decomposition comes). As far as I understand, this means that there is a set of orthonormal vectors $\{v_k\}_k$ such that both $\{Q_1 v_k\}_k$ and $\{Q_2 v_k\}_k$ are orthogonal sets of vectors (with some relations between their norms).
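The claim can be sanity-checked numerically. Below is a minimal NumPy sketch, not a proof: it assumes a real random $Q$ with $n_1\ge p$, and the function name `cs_check` and the sizes are made up for illustration. It takes $V$ from the SVD of $Q_1$ and checks that the vectors $Q_2 v_k$ are orthogonal and that $c_k^2+s_k^2=1$.

```python
import numpy as np


def cs_check(n1=4, n2=5, p=3, seed=0):
    """Return (c, s, gram) for a random Q with orthonormal columns.

    Assumes n1 >= p so that Q1 has p singular values.
    """
    rng = np.random.default_rng(seed)
    # Reduced QR of a random matrix gives orthonormal columns.
    Q, _ = np.linalg.qr(rng.standard_normal((n1 + n2, p)))
    Q1, Q2 = Q[:n1], Q[n1:]

    # SVD of the top block: Q1 = U1 @ diag(c) @ Vh.
    _, c, Vh = np.linalg.svd(Q1, full_matrices=False)
    V = Vh.conj().T

    # Gram matrix of the vectors Q2 v_k; it is diagonal iff they are orthogonal.
    W = Q2 @ V
    gram = W.conj().T @ W
    # Norms of the vectors Q2 v_k (clipped to guard against tiny negatives).
    s = np.sqrt(np.clip(np.diag(gram).real, 0.0, None))
    return c, s, gram


c, s, gram = cs_check()
print(np.allclose(c**2 + s**2, 1.0))               # c_k^2 + s_k^2 = 1
print(np.allclose(gram, np.diag(np.diag(gram))))   # Q2 v_k mutually orthogonal
```

The Gram matrix equals $V^\dagger(I-Q_1^\dagger Q_1)V = I-\operatorname{diag}(c_k^2)$, which is why both checks are expected to pass up to rounding.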

To prove that this is the case, I start by writing down the SVDs of $Q_1$ and $Q_2$, which tell us that there are unitaries $U_1, U_2, V_1, V_2$, and diagonal matrices $D_1, D_2$ with non-negative entries, such that $$\begin{aligned} Q_1&= U_1 D_1 V_1^\dagger, \\ Q_2&= U_2 D_2 V_2^\dagger. \end{aligned}$$ The condition $Q_1^\dagger Q_1+Q_2^\dagger Q_2=I$ then translates into $$V_1 D_1^2 V_1^\dagger + V_2 D_2^2 V_2^\dagger=I.$$ Denoting with $v^{(i)}_k$ the $k$-th column of $V_i$, and $P^{(i)}_k\equiv v^{(i)}_k v^{(i)\dagger}_k$ the associated projector, this condition can be seen to be equivalent to $$\sum_k (d^{(1)}_k)^2 P_k^{(1)}+\sum_k (d^{(2)}_k)^2 P_k^{(2)}=I,\tag A$$ where $d^{(i)}_k\equiv (D_i)_{kk}$.

Now, however, I'm a bit stuck on how to proceed from (A). It seems to be a generalisation of the things proved in this post and links therein, which show that if a sum of projectors gives the identity then the projectors must be orthogonal, but I'm not sure how to prove this in this case.

score 4 · Answer 1 · answered Nov 11 '19 at 22:24

To proceed from $(A)$ and show that it reduces to $c_i^2 + s_i^2 = 1$, we need to show that $V_1^\dagger = V_2^\dagger$.

To get there, consider the "$QR$" decomposition of the matrix $Q_2V_1$. We can write it as: $$\begin{aligned} Q_2V_1 &= U_2R \\ Q_2 &= U_2RV_1^\dagger \end{aligned}$$ where $U_2$ is a unitary matrix and $R$ is an upper triangular matrix.

We have $Q_2Q_2^\dagger = I$ ($Q_2$ is full column rank with orthonormal columns). Therefore: $$\begin{aligned} (U_2RV_1^\dagger)(V_1R^\dagger U_2^\dagger) &= I \\ U_2 R R^\dagger U_2^\dagger &= I \\ R R^\dagger &= U_2^\dagger U_2 = I \end{aligned}$$

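Since $R$ is upper triangular and satisfies $RR^\dagger = I$, it is unitary, and a triangular unitary matrix is necessarily diagonal. Hence $R$ must be a diagonal matrix; let's call it $D_2$. Rewriting $Q_2$ we get $$\begin{aligned} Q_2V_1 &= U_2D_2 \\ Q_2 &= U_2D_2V_1^\dagger \end{aligned}$$ which is the same as the SVD $Q_2 = U_2D_2V_2^\dagger$. Therefore $V_2^\dagger = V_1^\dagger$.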

Now using the condition $Q_1^\dagger Q_1+Q_2^\dagger Q_2=I$, we get: $$\begin{aligned} (V_1D_1^\dagger U_1^\dagger)(U_1D_1V_1^\dagger) + (V_1D_2^\dagger U_2^\dagger)(U_2D_2V_1^\dagger) &= I \\ V_1 D_1^\dagger D_1 V_1^\dagger + V_1 D_2^\dagger D_2 V_1^\dagger &= I \\ V_1(D_1^\dagger D_1 + D_2^\dagger D_2)V_1^\dagger &= I \\ D_1^\dagger D_1 + D_2^\dagger D_2 &= V_1^\dagger V_1 = I \\ (d^{(1)}_k)^2 + (d^{(2)}_k)^2 &= 1 \quad \text{for each } k \end{aligned}$$

If $d^{(1)}_i = c_i$ and $d^{(2)}_i = s_i$, then $c_i^2 + s_i^2 = 1$ for $i = 1, 2, \dots, p$.

how do you know that $Q_2$ has orthonormal rows (i.e. $Q_2 Q_2^\dagger=I$)? You could have e.g. $Q_1=I_2$ and $Q_2=0$ the zero $2\times 2$ matrix, and then this would not be true — glS, Dec 01 '19 at 14:06

Druidris · Answer 2 · 2020-05-05T12:28:34.560

If you insert $Q_1=U_1 D_1 V_1^\dagger$ and the QR decomposition from the previous post (https://math.stackexchange.com/q/3431715), $Q_2V_1=U_2R$ or $Q_2=U_2RV_1^\dagger$, into the orthogonality condition you will get $D_1^2 + R^\dagger R = I$ or equivalently $$R^\dagger R = I - D_1^2.$$ Since the right-hand side (RHS) is diagonal, $R^\dagger R$ must be diagonal as well. If you consider that $R$ is an upper triangular matrix, then by inspection of the product $R^\dagger R$ you will see that $R$ must have zero off-diagonal elements (you could probably do some proof by induction examining the row-results). On reflection, this argument only holds if $R$ has non-zero diagonal elements, which is the case if $Q_2V_1$ has full column rank. In addition, note that $\|Q\|_2=1$ so $\|Q_1\|_2\leq 1$, hence the RHS is non-negative.
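The QR step can be checked numerically. This is a rough NumPy sketch under stated assumptions: a real random $Q$ with $n_1\ge p$ and $n_2\ge p$, for which $Q_2V_1$ generically has full column rank. The helper name `qr_route` and the sizes are hypothetical.

```python
import numpy as np


def qr_route(n1=4, n2=5, p=3, seed=1):
    """Return (R, D1) with Q1 = U1 D1 V1^dagger and Q2 V1 = U2 R.

    Assumes n1 >= p and n2 >= p so that D1 and R are both p x p.
    """
    rng = np.random.default_rng(seed)
    # Random matrix with orthonormal columns, split into two blocks.
    Q, _ = np.linalg.qr(rng.standard_normal((n1 + n2, p)))
    Q1, Q2 = Q[:n1], Q[n1:]

    _, d1, V1h = np.linalg.svd(Q1, full_matrices=False)
    V1 = V1h.conj().T

    # Reduced QR of Q2 V1: U2 has orthonormal columns, R is upper triangular.
    _, R = np.linalg.qr(Q2 @ V1)
    return R, np.diag(d1)


R, D1 = qr_route()
# R^dagger R should equal I - D1^2 ...
print(np.allclose(R.conj().T @ R, np.eye(R.shape[0]) - D1 @ D1))
# ... and, in the full-rank case, R itself should be diagonal up to rounding.
print(np.allclose(R, np.diag(np.diag(R)), atol=1e-8))
```

Note that `np.linalg.qr` may return negative diagonal entries in `R`, so $D_2$ corresponds to their absolute values, with the signs absorbed into $U_2$.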

As in the previous post, define $D_2 := \sqrt{R^\dagger R}$ and you can state that one possible singular value decomposition (SVD) of $Q_2$ is: $$Q_2 = U_2 D_2 V_1^\dagger $$

The rest follows from substituting $Q_1$ and the obtained SVD of $Q_2$ in the orthogonality condition again. You can find more accurate statements in Matrix Computations by Golub and Van Loan.

Which previous post are you referring to? — glS, Mar 31 '20 at 12:18

score 0 · Answer 3 · answered Mar 31 '20 at 12:52

Upon further reflection, I realised that the answer is actually rather trivial.

Denote with $\mathbf v_k,\mathbf w_k$ the right singular vectors of $Q_1$ and $Q_2$, respectively, and with $s_k,t_k\ge0$ the corresponding singular values. Let us also denote with $P_{\mathbf v}\equiv \mathbf v\mathbf v^\dagger$ the operator projecting onto the vector $\mathbf v$.

As discussed in the OP, we have the condition $$\sum_k s_k^2 P_{\mathbf v_k} + \sum_k t_k^2 P_{\mathbf w_k}=I.$$ This is an expression of the form $A+B=I$ with $A,B\ge0$. As discussed in this other post, this means that $A,B$ are simultaneously diagonalisable, and therefore their eigenvalues must sum up to $1$ in each common eigenspace. In our case, $A,B$ are already given in diagonal form, and their eigenvalues are $s_k^2$ and $t_k^2$.
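For completeness, here is a short sketch of why $A+B=I$ forces a common eigenbasis, using only $B=I-A$ (the symbols $\lambda$ and $\mathbf u$ are introduced here just for illustration): $$\begin{aligned} AB &= A(I-A) = A-A^2 = (I-A)A = BA, \\ A\mathbf u=\lambda\mathbf u \;&\Longrightarrow\; B\mathbf u=(I-A)\mathbf u=(1-\lambda)\mathbf u. \end{aligned}$$ So every eigenvector of $A$ with eigenvalue $\lambda$ is also an eigenvector of $B$ with eigenvalue $1-\lambda$, and $A,B\ge0$ gives $0\le\lambda\le1$.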

In the easy case of both matrices being nondegenerate, $s_j\neq s_k$ and $t_j\neq t_k$ for all $j\neq k$, we can then conclude that, up to some relabelling, we must have $\mathbf v_k=\mathbf w_k$ for all $k$, and that there are angles $\theta_k\in\mathbb R$ such that $s_k=\cos\theta_k$ and $t_k=\sin\theta_k$.
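The relabelling can be seen in a small numerical experiment. This is a sketch under assumptions, not part of the argument: a real random $Q$ with $n_1,n_2\ge p$ (generically nondegenerate), and a made-up helper name `compare_right_singular_vectors`. Since NumPy sorts singular values in descending order for both blocks, the matching between $\mathbf v_k$ and $\mathbf w_k$ is expected to be the order-reversing permutation.

```python
import numpy as np


def compare_right_singular_vectors(n1=4, n2=5, p=3, seed=2):
    """Return (s, t, overlap), where overlap[j, k] = |<v_j, w_k>|.

    Assumes n1 >= p and n2 >= p, and that the random draw is nondegenerate.
    """
    rng = np.random.default_rng(seed)
    # Random matrix with orthonormal columns, split into two blocks.
    Q, _ = np.linalg.qr(rng.standard_normal((n1 + n2, p)))
    Q1, Q2 = Q[:n1], Q[n1:]

    # Independent SVDs of the two blocks (singular values sorted descending).
    _, s, Vh = np.linalg.svd(Q1, full_matrices=False)
    _, t, Wh = np.linalg.svd(Q2, full_matrices=False)

    # Rows of Vh and Wh are the conjugated right singular vectors.
    overlap = np.abs(Vh @ Wh.conj().T)
    return s, t, overlap


s, t, overlap = compare_right_singular_vectors()
print(np.round(overlap, 6))                    # expected: a permutation pattern
print(np.allclose(s**2 + t[::-1]**2, 1.0))     # cos^2 + sin^2 after relabelling
```

The absolute value in `overlap` removes the sign ambiguity of singular vectors; in a degenerate case the overlap would instead be block-structured.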

Similar arguments apply when $Q_1,Q_2$ are degenerate, except that we have to work directly on the (possibly more-than-one-dimensional) eigenspaces.
