
To compute $A^{-1}B$ using numerical packages like numpy, it is recommended to use solve(A, B) rather than inv(A).dot(B). I know that the former can be sped up with an LU factorization, but why is it more stable than the second choice? How do the two work differently internally?
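
For concreteness, here is a minimal sketch of the two calls I am comparing (the matrices are just random placeholders):

import numpy as np

rng = np.random.default_rng(0)          # placeholder data, only to show the two calls
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 2))

X1 = np.linalg.solve(A, B)              # recommended: solves A X = B via LU
X2 = np.linalg.inv(A).dot(B)            # explicit inverse, then multiply

print(np.abs(X1 - X2).max())            # agree here; can differ for ill-conditioned A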

Many thanks.

Roger
  • Just to clarify: LU factorization (or QR, for that matter) is not any faster than a naive Gauss algorithm, but it can be much more stable numerically (depending on the pivoting strategy, which actually makes it slightly slower). Also: I would expect inv(A) to be computed pretty much like solve(A, I), i.e., with the same decomposition/pivoting applied to the identity. – Simon Oct 19 '16 at 13:11
  • @Simon thanks for your reply. I understand that pivoting rules make a difference. But I do not understand why the numerical accuracy differs in my program... – Roger Oct 19 '16 at 13:37
  • From a computational efficiency standpoint, computing the LU factorization is typically much more efficient when dealing with sparse matrices. Since the inverse of a sparse matrix is typically dense, it may not even be possible to store the inverse of a large sparse matrix in memory. Not to mention that you then have to do a dense matrix-vector multiply (see the sketch after these comments). – K. Miller Oct 19 '16 at 13:40
  • @K.Miller Thanks a lot. This explains the computational efficiency. But why does the quality of the returned solutions differ? – Roger Oct 19 '16 at 14:11
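
A quick sketch of the sparse-matrix point from the comments (assuming scipy is available; the tridiagonal matrix is just an illustrative example, not from the discussion):

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

n = 1000
A = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format='csc')
b = np.ones(n)

x = spsolve(A, b)                      # sparse LU; the inverse is never formed

Ainv = np.linalg.inv(A.toarray())      # the inverse of this sparse A is fully dense
print(A.nnz, np.count_nonzero(Ainv))   # ~3n nonzeros vs. n^2 nonzeros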

1 Answer


If $x$ is computed via an LU factorization, the residual can be bounded by $$ \|b-Ax\|\leq cu\||L||U|\|\|x\|. $$ If instead $x$ is computed as $A^{-1}b$, then, assuming for simplicity that $A^{-1}$ itself is computed exactly and that the only source of error is the matrix-vector multiplication, we have $$ \|b-Ax\|\leq cu\||A||A^{-1}|\|\|b\|. $$ Here, $c$ is a moderate constant, $u$ is the unit round-off, and $\|\cdot\|$ denotes the $\infty$-norm.

Often, $\||L||U|\|\approx\|A\|$ (e.g., with a suitable pivoting strategy in LU). The bounds indicate that you might get a much smaller residual with the LU factorization provided that $$\|x\|\approx\|A^{-1}b\|\ll\|A^{-1}\|\|b\|.$$
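
As a rough numerical check of the two factors in these bounds (a sketch in numpy/scipy; the constant $c$ is ignored and exact numbers depend on the LU routine), for the matrix used in the example below:

import numpy as np
from scipy.linalg import lu

n = 32
A = 2 * np.eye(n) - np.triu(np.ones((n, n)))              # same A as in the example below

P, L, U = lu(A)                                           # A = P L U with partial pivoting
Ainv = np.linalg.inv(A)

print(np.linalg.norm(np.abs(L) @ np.abs(U), np.inf))      # || |L||U| ||, about ||A|| = n here
print(np.linalg.norm(np.abs(A) @ np.abs(Ainv), np.inf))   # || |A||A^{-1}| ||, huge for this A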

For illustration, consider this piece of Matlab code:

n = 32;
A = 2 * eye(n) - triu(ones(n));   % unit upper triangular, highly ill-conditioned
[U, S, V] = svd(A);
b = U(:,1);                       % b along the first left singular vector of A

x_LU  = A \ b;                    % backslash: solve via LU with partial pivoting
x_INV = inv(A) * b;               % explicit inverse, then matrix-vector multiply

norm(b - A * x_LU,  inf)   % 5.551e-17
norm(b - A * x_INV, inf)   % 9.543e-10

The choice of $A$ and $b$ is quite extreme ($A$ is highly ill-conditioned and $b$ points in the direction of its first left singular vector), but it clearly illustrates the issue here (disregarding, of course, computational efficiency).
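
For readers coming from Python, the same experiment translated to numpy (a sketch; the exact residual values will vary slightly with the BLAS/LAPACK build):

import numpy as np

n = 32
A = 2 * np.eye(n) - np.triu(np.ones((n, n)))
U, S, Vt = np.linalg.svd(A)
b = U[:, 0]                                   # direction of the first left singular vector

x_lu  = np.linalg.solve(A, b)                 # LU-based solve
x_inv = np.linalg.inv(A) @ b                  # explicit inverse, then multiply

print(np.linalg.norm(b - A @ x_lu,  np.inf))  # O(u), like the Matlab result above
print(np.linalg.norm(b - A @ x_inv, np.inf))  # many orders of magnitude larger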