
I was reading a few proofs for the Sherman-Morrison Formula, which states that if $A$ is invertible and $M = A + \mathbf{u}\mathbf{v}^T$, then $M^{-1}$ is given by:

$$A^{-1} - A^{-1}\mathbf{u} \mathbf{v}^T A^{-1}/(1+\mathbf{v}^TA^{-1}\mathbf{u}).$$

There is a proof (verification) of this on Wikipedia as well as here, but neither of them justifies why $(1+\mathbf{v}^TA^{-1}\mathbf{u})$ is a scalar. Why is it a scalar?

Basically, I am trying to prove this formula somewhat rigorously, without too many assumptions. Can I prove the formula without assuming it is true, i.e. using only the facts that $A$ is invertible and $M = A + \mathbf{u}\mathbf{v}^T$?

  • You see that $1$ is a scalar, so look at the other term, $v^T A^{-1} u$. Check the size of the factors and you'll find it is a $1\times 1$ matrix, i.e. a scalar. Of course the inverse only exists if the scalar is nonzero. – hardmath Mar 20 '16 at 13:49
  • Does this mean that A must have size $n \times n$? What is the explanation for this? If A has size $n \times n$ then u and v must have size $n \times 1$, right? – user6005857 Mar 20 '16 at 13:59
  • Yes, here $A$ is $n\times n$ and $u,v$ are $n\times 1$ matrices (column vectors). – hardmath Mar 20 '16 at 14:04
  • Is there an explanation for that? How can I justify that $A$ must be $n \times n$? – user6005857 Mar 20 '16 at 14:08
  • It is a square matrix, right? Otherwise it doesn't make sense to say $A$ is invertible (much less to ask if the update to $A$ by a rank one term $uv^T$ results in a rank one update to $A^{-1}$). – hardmath Mar 20 '16 at 14:15
  • Makes sense, thank you. I should have realized that A must be square. – user6005857 Mar 20 '16 at 14:29
  • https://math.stackexchange.com/q/252367/321264 – StubbornAtom Sep 06 '20 at 07:22

2 Answers


The idea here is to prove a formula for the inverse of $A+uv^T$, a "rank one update" of an invertible matrix. The formula shows that the inverse is a rank one update of $A^{-1}$, so there's a nice bilateral relationship.

The verification of the Sherman-Morrison formula is straightforward but not terribly elegant. We want to show:

$$ (A + uv^T)\left(A^{-1} - \frac{A^{-1}uv^TA^{-1}}{1 + v^T A^{-1} u}\right) = I $$

where $A$ is an $n\times n$ invertible matrix and $u,v$ are $n\times 1$ (column) vectors. The reader is invited to verify that all the terms in the formula have compatible dimensions, e.g. $uv^T$ is an $n\times n$ matrix that can properly be added to $A$ (respectively, multiplied by $A^{-1}$).
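The dimension bookkeeping above can be sketched in a few lines of Python. This is only an illustration: the helper `product_shape` and the choice $n = 5$ are assumptions made for the example, not part of the original argument.

```python
def product_shape(*shapes):
    """Shape of a matrix product, given the (rows, cols) of each factor in order."""
    rows, cols = shapes[0]
    for r, c in shapes[1:]:
        # Adjacent factors must agree on their inner dimension.
        if cols != r:
            raise ValueError(f"incompatible inner dimensions {cols} and {r}")
        cols = c
    return (rows, cols)

n = 5  # arbitrary illustrative size
A_inv, u, vT = (n, n), (n, 1), (1, n)

print(product_shape(vT, A_inv, u))         # (1, 1): v^T A^{-1} u is a scalar
print(product_shape(u, vT))                # (5, 5): u v^T can be added to A
print(product_shape(A_inv, u, vT, A_inv))  # (5, 5): same shape as A^{-1}
```

The $1\times 1$ result for $v^T A^{-1} u$ is what lets it be treated as a scalar in the denominator.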

The validity of the formula depends on the scalar $1 + v^T A^{-1} u$ being nonzero, since the indicated "division" by this term actually means multiplying by the reciprocal of that scalar. Our algebra will be somewhat simplified if we replace that scalar reciprocal temporarily by a variable, say $c$, and then substitute the correct value at the end. In what follows we freely use the commutativity of scalar multiplication with matrix multiplication.

We begin by distributing (matrix multiplication over matrix addition):

$$ \begin{align*} (A + uv^T)(A^{-1} - cA^{-1}uv^TA^{-1}) &= AA^{-1} - cAA^{-1}uv^TA^{-1} + uv^T A^{-1} - cu(v^T A^{-1}u)v^T A^{-1} \\ &= I - cI uv^T A^{-1} + uv^T A^{-1} - c(v^T A^{-1}u)uv^T A^{-1} \\ &= I + (-c + 1 -cv^T A^{-1}u) uv^T A^{-1} \end{align*} $$

where in the last step we have grouped together all three terms that are scalar multiples of the matrix term $uv^T A^{-1}$. Clearly the final right-hand side in this last step is just $I$ precisely when the combined scalar coefficient of that matrix term is zero:

$$ -c + 1 -cv^T A^{-1}u = 0 $$

But this is equivalent to:

$$ 1 = c (1 + v^T A^{-1} u) $$

$$ c = (1 + v^T A^{-1} u)^{-1} $$

Therefore when $c$ is assigned this value (the reciprocal of the scalar $1 + v^T A^{-1} u$), then the Sherman-Morrison formula is valid (because carrying out the matrix multiplication indicated above gives us the identity $I$ as the result).
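As a sanity check, the verification can be repeated with exact arithmetic on a small example. The following is a minimal sketch in plain Python; the helper names `matmul` and `sherman_morrison` and the particular $2\times 2$ data are assumptions chosen for illustration.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def sherman_morrison(A_inv, u, v):
    """Return (A + u v^T)^{-1} from A^{-1}; u and v are n x 1 column vectors."""
    n = len(A_inv)
    vT = [[row[0] for row in v]]                 # 1 x n row vector
    vA = matmul(vT, A_inv)                       # 1 x n
    denom = 1 + matmul(vA, u)[0][0]              # 1 x 1 product, used as a scalar
    if denom == 0:
        raise ZeroDivisionError("1 + v^T A^{-1} u is zero, so A + u v^T is singular")
    outer = matmul(matmul(A_inv, u), vA)         # n x n matrix A^{-1} u v^T A^{-1}
    return [[A_inv[i][j] - outer[i][j] / denom for j in range(n)] for i in range(n)]

# Illustrative data: A = diag(2, 4), u = (1, 2)^T, v = (3, 1)^T.
A_inv = [[Fraction(1, 2), Fraction(0)], [Fraction(0), Fraction(1, 4)]]
u = [[1], [2]]
v = [[3], [1]]
M = [[5, 1], [6, 6]]                             # A + u v^T, computed by hand

M_inv = sherman_morrison(A_inv, u, v)
print(matmul(M, M_inv) == [[1, 0], [0, 1]])      # True
```

Here $v^T A^{-1} u = 2$, so the scalar in the denominator is $3$ and $c = 1/3$.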

hardmath

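Actually, there is an elegant proof of the formula:

$$ \begin{align*} (A+uv^T)^{-1} &= [A(I+A^{-1}uv^T)]^{-1} \\ &= (I+A^{-1}uv^T)^{-1}A^{-1} \\ &= (I-A^{-1}uv^T+A^{-1}u(v^TA^{-1}u)v^T-\cdots)A^{-1} \\ &= [I-A^{-1}u(1-(v^TA^{-1}u)+(v^TA^{-1}u)^2-\cdots)v^T]A^{-1} \\ &= A^{-1}-\frac{A^{-1}uv^TA^{-1}}{1+v^TA^{-1}u} \end{align*} $$

Note that the expansion in the third line is a geometric series, so this argument requires $|v^TA^{-1}u| < 1$ for convergence.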
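The series argument can be checked numerically by comparing partial sums of $I - X + X^2 - \cdots$, with $X = A^{-1}uv^T$, against the closed form. The sketch below assumes $A = I$ and picks $u$, $v$ so that $v^T A^{-1} u = 0.5$, which keeps the series convergent; the helper names and the data are illustrative assumptions.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def neumann_inverse(X, terms):
    """Partial sum I - X + X^2 - ... with `terms` summands, approximating (I + X)^{-1}."""
    n = len(X)
    neg_X = [[-x for x in row] for row in X]
    total = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # (-X)^0
    for _ in range(terms):
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
        power = matmul(power, neg_X)             # next power of -X
    return total

# Illustrative data with A = I, so X = u v^T and s = v^T u.
u = [[1.0], [1.0]]
vT = [[0.25, 0.25]]
s = matmul(vT, u)[0][0]                          # 0.5, inside the convergence region
X = matmul(u, vT)

approx = neumann_inverse(X, 60)
exact = [[(1.0 if i == j else 0.0) - X[i][j] / (1 + s) for j in range(2)]
         for i in range(2)]                      # closed form I - u v^T / (1 + s)
max_error = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(max_error < 1e-12)                         # True
```

Choosing $u$, $v$ with $|v^T A^{-1} u| \ge 1$ makes the partial sums fail to settle, even though the closed-form expression is still valid whenever $1 + v^T A^{-1} u \neq 0$.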

ryuzaki