10

What's an efficient algorithm to get the SVD of $2\times 2$ matrices?

I've found papers about doing SVD on $2\times 2$ triangular matrices, and I've seen the analytic formula to get the singular values of a $2\times 2$ matrix. But how do I use either of these to get the SVD of an arbitrary $2\times 2$ matrix?

Are the general algorithms built on these, or are these just some special cases?

shinjin
  • 411
  • For reference, essentially the same question was asked here: https://scicomp.stackexchange.com/questions/8899/robust-algorithm-for-2-times-2-svd and here: https://math.stackexchange.com/questions/861674/decompose-a-2d-arbitrary-transform-into-only-scaling-and-rotation and there are some good answers (the one from @petiaccja and the ones based on Blinn's article). – Don Hatch Jul 12 '24 at 16:04

3 Answers

10

There's a ridiculously easy way to apply the method you already know, starting from your matrix $\mathbf A$. Consider the QR decomposition

$$\mathbf A=\mathbf Q\mathbf R$$

where $\mathbf Q$ is orthogonal and $\mathbf R$ is upper triangular. You say you already know how to take the SVD of a triangular matrix:

$$\mathbf R=\mathbf W\mathbf \Sigma\mathbf V^\top$$

where both $\mathbf W$ and $\mathbf V^\top$ are orthogonal and $\mathbf \Sigma$ is diagonal. You know (or at least are supposed to know) that the product of two orthogonal matrices is again an orthogonal matrix, so letting $\mathbf U=\mathbf Q\mathbf W$, you should be able to obtain your desired singular value decomposition $\mathbf U\mathbf \Sigma\mathbf V^\top$.

In general, preprocessing your matrix with a QR decomposition makes for an easier SVD computation. For a general $m\times n$ matrix, one can do a "thin QR decomposition" (with column pivoting if needed) where $\mathbf Q$ has the same dimensions as the original matrix, and $\mathbf R$ is square and triangular. One can then take the SVD of $\mathbf R$ and multiply out the orthogonal matrices (and the permutation matrices if you did pivoting; remember that permutation matrices are also orthogonal!) to obtain the singular value decomposition of your original matrix.
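For concreteness, here's a minimal numpy sketch of that composition step (`np.linalg.svd` here just stands in for whatever triangular-SVD routine you already have; the point is the recombination):

```python
import numpy as np

def svd_via_qr(A):
    """SVD of A by first reducing to triangular form with QR.

    Returns U, s, Vt such that A = U @ np.diag(s) @ Vt.
    """
    Q, R = np.linalg.qr(A)       # A = Q R, with R upper triangular
    W, s, Vt = np.linalg.svd(R)  # R = W diag(s) Vt (your triangular SVD goes here)
    U = Q @ W                    # product of orthogonal matrices is orthogonal
    return U, s, Vt

# quick sanity check on a random 2x2 matrix
A = np.random.randn(2, 2)
U, s, Vt = svd_via_qr(A)
assert np.allclose(U @ np.diag(s) @ Vt, A)
```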


FWIW, in the $2\times 2$ case, you're charmed, since you can use an appropriately constructed Givens rotation matrix for both the QR and SVD stages. The details should be in e.g. Golub and Van Loan's book.
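For the QR stage, the Givens rotation is easy to write down explicitly; here's a sketch (the helper name is mine), using the convention $G = \left[\begin{array}{cc} c & s \\ -s & c \end{array}\right]$:

```python
import numpy as np

def givens_qr_2x2(A):
    """QR of a 2x2 matrix with a single Givens rotation.

    Picks c, s so that G = [[c, s], [-s, c]] zeroes A[1, 0];
    then Q = G.T and R = G @ A is upper triangular.
    """
    r = np.hypot(A[0, 0], A[1, 0])
    if r == 0.0:                      # first column is zero; nothing to rotate
        c, s = 1.0, 0.0
    else:
        c, s = A[0, 0] / r, A[1, 0] / r
    G = np.array([[c, s], [-s, c]])   # G @ A[:, 0] == [r, 0]
    return G.T, G @ A                 # Q, R with A = Q @ R

A = np.array([[1.0, 2.0], [3.0, 4.0]])
Q, R = givens_qr_2x2(A)
assert np.allclose(Q @ R, A) and abs(R[1, 0]) < 1e-12
```

The SVD stage uses a similarly constructed rotation pair; Golub and Van Loan spell out the details.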

9

The SVD of a $2\times 2$ matrix has a closed-form formula, which can be worked out by writing the rotation matrices in terms of a single unknown angle each, and then solving for those angles as well as the singular values.

It is worked out here, for instance.
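For reference, one common version of that closed form (essentially the construction in Blinn's article mentioned in the comments; treat this as a sketch, with $\mathbf U$ and $\mathbf V$ written as rotations by angles $\phi$ and $\theta$): for $A = \left[\begin{array}{cc} a & b \\ c & d \end{array}\right]$, let

$$ E = \frac{a+d}{2},\quad F = \frac{a-d}{2},\quad G = \frac{c+b}{2},\quad H = \frac{c-b}{2}, \qquad Q = \sqrt{E^2+H^2},\quad R = \sqrt{F^2+G^2}. $$

Then the singular values are $\sigma_1 = Q + R$ and $\sigma_2 = Q - R$ (a negative $\sigma_2$, which occurs when $\det A < 0$, means a sign must be absorbed into one column of $\mathbf U$ or $\mathbf V$), and the angles follow from

$$ \phi + \theta = \operatorname{atan2}(G, F), \qquad \phi - \theta = \operatorname{atan2}(H, E). $$

Note that when the singular values (nearly) coincide, $F$ and $G$ (nearly) vanish and $\phi + \theta$ is ill-determined, which is the degeneracy Don Hatch's comment below warns about.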

user7530
  • 50,625
  • This method always bothered me since it involves a lot of unnecessary trigonometry. – wcochran Jul 16 '23 at 06:02
  • @wcochran It has bigger problems than that :-) (1) in the case that one of the s.v.'s is much smaller than the other, the computation of $\sigma_2 = \sqrt F$ (where F is (SUsum-SUdif)/2 in the reference code) may blow up due to slightly negative F; clamping to non-negative mitigates the blowup but still results in poor accuracy; (2) in the case that the s.v.'s are equal or almost equal, it chooses $U$ and $V$ independently with no effort to make sure $U V^T$ is the correct rotation. E.g. with input $[[0,-1],[1,0]]$ (simple 90 degree rotation) it outputs $U=V=I$ which is wildly wrong. – Don Hatch Jun 28 '24 at 17:08
2

A fairly straightforward algorithm reduces computing the SVD of $A$ to finding the eigenvalues and eigenvectors of $A^T A$:

$$ \begin{align} A &= U \Sigma V^T \\ A^T A &= (U \Sigma V^T)^T U \Sigma V^T \\ &= V \Sigma U^T U \Sigma V^T \\ &= V \Sigma^2 V^T \\ A^T A V & = V \Sigma^2 \end{align} $$

The columns of $V$ are the eigenvectors of $A^T A$, and the squared singular values $\sigma_0^2$ and $\sigma_1^2$ are the eigenvalues $\lambda_0$ and $\lambda_1.$ Finding the eigenvalues amounts to finding the roots of a quadratic, which are guaranteed to be real and non-negative:

$$ A^T A= \left[\begin{array}{cc} a & b \\ b & c \end{array}\right] = \left[\begin{array}{cc} A_{00}^2 + A_{10}^2 & A_{00} A_{01} + A_{10} A_{11} \\ A_{00} A_{01} + A_{10} A_{11} & A_{01}^2 + A_{11}^2 \end{array}\right] $$

$$ \begin{align} q &= \sqrt{(a - c)^2 + 4b^2} \\ \lambda_0 &= ((a + c) + q)/2,\ \ \ \lambda_1 = ((a + c) - q)/2 \\ \sigma_0 &= \sqrt{\lambda_0},\ \ \ \sigma_1 = \sqrt{\lambda_1} \end{align} $$

If $b = 0$ then $A^T A$ is already diagonal, and $V$ is either the identity matrix or the exchange (permutation) matrix, depending on the relative magnitude of the eigenvalues:

$$ V = \left\{\begin{array}{ll} I, & a \geq c \\ \left[\begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array}\right], & a < c \end{array}\right. $$

Otherwise, to compute $V$, we first compute the eigenvectors of $A^T A.$ At this point we know $|b| > 0$ and either $a > 0$ or $c > 0.$ For each eigenvector we pick the formula whose denominator is the largest value in the set $S_i = \{|b|,|\lambda_i - a|,|\lambda_i - c|\}$, to steer clear of division by zero:

$$ {\bf e}_i = \left\{\begin{array}{ll} \left[\begin{array}{c} 1 \\ (\lambda_i - a)/b \end{array}\right], & |b| = \max(S_i) \\ \left[\begin{array}{c} b/(\lambda_i - a) \\ 1 \end{array}\right], & |\lambda_i - a| = \max(S_i) \\ \left[\begin{array}{c} 1 \\ b/(\lambda_i - c) \end{array}\right], & |\lambda_i - c| = \max(S_i) \end{array}\right. $$ and normalize: $$ \hat{\bf e}_0 = {\bf e}_0/\| {\bf e}_0 \|,\ \ \ \hat{\bf e}_1 = {\bf e}_1/\| {\bf e}_1 \|. $$ We then set $V = [\hat{\bf e}_0\ \ \hat{\bf e}_1]$.

We then solve $A = U \Sigma V^T$ for $U.$ If $\sigma_1 > 0$ then we simply have $U = A V \Sigma^{-1}.$ If $\sigma_0 = \sigma_1 = 0$ then $U = I$ suffices. Otherwise $\sigma_0 > \sigma_1 = 0$ (the rank-one case, where $A = \sigma_0 {\bf u}_0 {\bf v}_0^T$) and we build the first column of $U$ by dividing a column of $A$ by whichever component of ${\bf v}_0$ is larger in magnitude: $$ \left[\begin{array}{c} U_{00} \\ U_{10} \end{array}\right] = \left\{\begin{array}{ll} \left[\begin{array}{c} A_{00}/(\sigma_0 V_{00})\\ A_{10}/(\sigma_0 V_{00}) \end{array}\right], & |V_{00}| \geq |V_{10}| \\ \left[\begin{array}{c} A_{01}/(\sigma_0 V_{10})\\ A_{11}/(\sigma_0 V_{10}) \end{array}\right], & |V_{00}| < |V_{10}|. \\ \end{array}\right. $$ The second column is simply made orthogonal to the first: $$ \left[\begin{array}{c} U_{01} \\ U_{11} \end{array}\right] = \left[\begin{array}{c} +U_{10} \\ -U_{00} \end{array}\right]. $$
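If a concrete version helps, here is a numpy transcription of the steps above (the function name `svd2x2_eig` is mine, and, as the comment below points out, forming $A^T A$ squares the condition number, so treat this as illustrative rather than robust):

```python
import numpy as np

def svd2x2_eig(A):
    """2x2 SVD via the eigendecomposition of A^T A, per the steps above.

    Returns U, (s0, s1), V with A = U @ np.diag([s0, s1]) @ V.T.
    """
    # entries of the symmetric matrix A^T A
    a = A[0, 0]**2 + A[1, 0]**2
    b = A[0, 0]*A[0, 1] + A[1, 0]*A[1, 1]
    c = A[0, 1]**2 + A[1, 1]**2

    # eigenvalues from the quadratic; clamp round-off that dips below zero
    q = np.hypot(a - c, 2.0*b)
    lam0 = (a + c + q) / 2.0
    lam1 = max((a + c - q) / 2.0, 0.0)
    s0, s1 = np.sqrt(lam0), np.sqrt(lam1)

    if b == 0.0:
        # A^T A already diagonal: identity or exchange matrix
        V = np.eye(2) if a >= c else np.array([[0.0, 1.0], [1.0, 0.0]])
    else:
        cols = []
        for lam in (lam0, lam1):
            # pick the eigenvector formula with the largest denominator
            m = max(abs(b), abs(lam - a), abs(lam - c))
            if m == abs(b):
                e = np.array([1.0, (lam - a) / b])
            elif m == abs(lam - a):
                e = np.array([b / (lam - a), 1.0])
            else:
                e = np.array([1.0, b / (lam - c)])
            cols.append(e / np.linalg.norm(e))
        V = np.column_stack(cols)

    if s1 > 0.0:
        U = (A @ V) / np.array([s0, s1])   # U = A V Sigma^{-1}, column by column
    elif s0 == 0.0:
        U = np.eye(2)                      # A = 0: any orthogonal U works
    else:
        # rank-one case: first column from A = s0 * u0 * v0^T,
        # dividing by the larger-magnitude component of v0
        j = 0 if abs(V[0, 0]) >= abs(V[1, 0]) else 1
        u0 = A[:, j] / (s0 * V[j, 0])
        U = np.column_stack([u0, [u0[1], -u0[0]]])
    return U, (s0, s1), V
```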

wcochran
  • 862
  • I don't think any method that extracts the singular values from $A^T A$ has much hope of accuracy. Consider the case $A=[[1,1],[0,1e-4]]$. Then, in single precision, $A^T A = [[1,1],[1,1]]$ and so all information about the small singular value is already lost. – Don Hatch Jun 28 '24 at 06:33
  • @DonHatch Probably using rotations to zero the off diagonal elements is the more numerically savvy approach. – wcochran Jun 28 '24 at 16:34