
To compute $A^{-1}B$ using numerical packages like numpy, it is recommended to use solve(A, B) rather than inv(A).dot(B). I know that the former can be sped up with an LU factorization, but why is it more stable than the second choice? How do the two work differently internally?
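
For concreteness, here is a minimal sketch of the two calls I am comparing (the matrices are just random placeholders):

import numpy as np

rng = np.random.default_rng(0)          # placeholder data, only to show the two calls
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 2))

X1 = np.linalg.solve(A, B)              # recommended: solves A X = B via LU
X2 = np.linalg.inv(A).dot(B)            # explicit inverse, then multiply

print(np.abs(X1 - X2).max())            # agree here; can differ for ill-conditioned A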

Many thanks.

Roger
  • Just to clarify: LU factorization (or QR, for that matter) is not any faster than a naive Gauss algorithm, but it can be much more stable numerically (depending on the pivoting strategy, which actually makes it slightly slower). Also: I would expect inv(A) to be computed pretty much like solve(A, I), i.e., with the same decomposition/pivoting applied to the identity. – Simon Oct 19 '16 at 13:11
  • @Simon thanks for your reply. I understand that pivoting rules make a difference. But I do not understand why the numerical accuracy differs in my program... – Roger Oct 19 '16 at 13:37
  • From a computational efficiency standpoint, computing the LU factorization is typically much more efficient when dealing with sparse matrices. Since the inverse of a sparse matrix is typically dense, it may not even be possible to store the inverse of a large sparse matrix in memory. Not to mention that you then have to do a dense matrix-vector multiply (see the sketch after these comments). – K. Miller Oct 19 '16 at 13:40
  • @K.Miller Thanks a lot. This explains the computational efficiency. But why does the quality of the returned solutions differ? – Roger Oct 19 '16 at 14:11
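
A quick sketch of the sparse-matrix point from the comments (assuming scipy is available; the tridiagonal matrix is just an illustrative example, not from the discussion):

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

n = 1000
A = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format='csc')
b = np.ones(n)

x = spsolve(A, b)                      # sparse LU; the inverse is never formed

Ainv = np.linalg.inv(A.toarray())      # the inverse of this sparse A is fully dense
print(A.nnz, np.count_nonzero(Ainv))   # ~3n nonzeros vs. n^2 nonzeros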

1 Answer


If $x$ is computed via an LU factorization, the residual can be bounded by $$ \|b-Ax\|\leq cu\||L||U|\|\|x\|. $$ If instead $x$ is computed as $A^{-1}b$, then, assuming for simplicity that $A^{-1}$ itself is computed exactly and that the only source of error is the matrix-vector multiplication, we have $$ \|b-Ax\|\leq cu\||A||A^{-1}|\|\|b\|. $$ Here, $c$ is a moderate constant, $u$ is the unit round-off, and $\|\cdot\|$ denotes the $\infty$-norm.

Often, $\||L||U|\|\approx\|A\|$ (e.g., with a suitable pivoting strategy in LU). The bounds indicate that you might get a much smaller residual with the LU factorization provided that $$\|x\|\approx\|A^{-1}b\|\ll\|A^{-1}\|\|b\|.$$
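
As a rough numerical check of the two factors in these bounds (a sketch in numpy/scipy; the constant $c$ is ignored and exact numbers depend on the LU routine), for the matrix used in the example below:

import numpy as np
from scipy.linalg import lu

n = 32
A = 2 * np.eye(n) - np.triu(np.ones((n, n)))              # same A as in the example below

P, L, U = lu(A)                                           # A = P L U with partial pivoting
Ainv = np.linalg.inv(A)

print(np.linalg.norm(np.abs(L) @ np.abs(U), np.inf))      # || |L||U| ||, about ||A|| = n here
print(np.linalg.norm(np.abs(A) @ np.abs(Ainv), np.inf))   # || |A||A^{-1}| ||, huge for this A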

For illustration, consider this piece of Matlab code:

n = 32;
A = 2 * eye(n) - triu(ones(n));   % unit upper triangular, highly ill-conditioned
[U, S, V] = svd(A);
b = U(:,1);                       % b along the first left singular vector of A

x_LU  = A \ b;                    % backslash: solve via LU with partial pivoting
x_INV = inv(A) * b;               % explicit inverse, then matrix-vector multiply

norm(b - A * x_LU,  inf)   % 5.551e-17
norm(b - A * x_INV, inf)   % 9.543e-10

The choice of $A$ and $b$ is quite extreme ($A$ is highly ill-conditioned and $b$ points in the direction of its first left singular vector), but it clearly illustrates the issue here (disregarding, of course, computational efficiency).
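
For readers coming from Python, the same experiment translated to numpy (a sketch; the exact residual values will vary slightly with the BLAS/LAPACK build):

import numpy as np

n = 32
A = 2 * np.eye(n) - np.triu(np.ones((n, n)))
U, S, Vt = np.linalg.svd(A)
b = U[:, 0]                                   # direction of the first left singular vector

x_lu  = np.linalg.solve(A, b)                 # LU-based solve
x_inv = np.linalg.inv(A) @ b                  # explicit inverse, then multiply

print(np.linalg.norm(b - A @ x_lu,  np.inf))  # O(u), like the Matlab result above
print(np.linalg.norm(b - A @ x_inv, np.inf))  # many orders of magnitude larger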