
The projection matrix that projects onto the row space of a matrix $A$ with full row rank (so that $AA^*$ is invertible) is

$$P_A = A^* (A A^*)^{-1} A$$

Let $B$ be $A$ with a new row vector $v$ appended. Is there an efficient way to compute $P_B$ from $P_A$ without starting from scratch? Something like the matrix determinant lemma or Sherman–Morrison formula?
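
For concreteness, a minimal NumPy sketch of the from-scratch computation above (assuming a real matrix $A$ with full row rank; the helper name `row_space_projector` is just for illustration):

```python
import numpy as np

def row_space_projector(A):
    """Projection onto the row space of A, i.e. A^T (A A^T)^{-1} A.

    Assumes A is real with full row rank, so that A A^T is invertible.
    """
    # Solve (A A^T) X = A rather than forming the inverse explicitly.
    X = np.linalg.solve(A @ A.T, A)
    return A.T @ X
```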

user76284

1 Answer


The easiest way is to replace $v$ with $w = (I - P_A)v$, which lies in the orthogonal complement of the rowspace of $A$. Assuming $v \notin \operatorname{rowspace} A$, we have $w \neq 0$ (if instead $v \in \operatorname{rowspace} A$, then $w = 0$ and simply $P_B = P_A$). If $B$ is the matrix $A$ with $v$ (or $w$) appended as a new row, then $$P_B = P_A + \frac{ww^\top}{w^\top w}, \tag{$\star$}$$ i.e. we add the rank-$1$ projection matrix onto the span of $w$.
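
In code, the update $(\star)$ might look like the following NumPy sketch (real case only; the tolerance test for $w \approx 0$ is an implementation choice, not part of the answer):

```python
import numpy as np

def append_row(P_A, v, tol=1e-12):
    """Given P_A (projection onto rowspace(A)) and a new row v,
    return P_B via the rank-1 update (*)."""
    v = np.asarray(v, dtype=float)
    w = v - P_A @ v               # component of v orthogonal to rowspace(A)
    ww = w @ w
    if ww <= tol * (v @ v):       # v is (numerically) already in rowspace(A)
        return P_A
    return P_A + np.outer(w, w) / ww
```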

Why does this work? We just need to verify two things for every $x$: that $P_Bx \in \operatorname{rowspace}B$, and that $x - P_Bx \in (\operatorname{rowspace}B)^\perp$.

The former is simple enough: \begin{align*} P_Ax + \frac{ww^\top}{w^\top w}x &= P_Ax + \frac{w^\top x}{w^\top w} w \\ &\in \operatorname{rowspace} A + \operatorname{span} \{w\} \\ &= \operatorname{rowspace} A + \operatorname{span} \{v\} \\ &= \operatorname{rowspace} B. \end{align*}

For the latter, suppose $a + \lambda w \in \operatorname{rowspace} B$, where $a \in \operatorname{rowspace} A$. We need to show $$(a + \lambda w)^\top\left(x - P_Ax - \frac{w w^\top}{w^\top w} x\right) = 0.$$ Recall that $w \perp \operatorname{rowspace} A$, so when expanding, we can drop every cross term that pairs a multiple of $w$ (including $\frac{w w^\top}{w^\top w} x$, which is a scalar multiple of $w$) with an element of $\operatorname{rowspace} A$. Expanding, we are left with the equivalent statement: $$a^\top x - a^\top P_Ax + \lambda w^\top x - \lambda w^\top \frac{ww^\top}{w^\top w} x = 0.$$ But $\lambda w^\top \frac{ww^\top}{w^\top w} x = \lambda \frac{(w^\top w)(w^\top x)}{w^\top w} = \lambda w^\top x$, so the equation reduces to $$a^\top x - a^\top P_Ax = 0 \iff a^\top(I - P_A)x = 0,$$ which is true because $(I - P_A)x \in (\operatorname{rowspace} A)^\perp$.

So, the formula $(\star)$ is correct.
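
As a quick numerical sanity check, here is a NumPy sketch reusing the illustrative helpers `row_space_projector` and `append_row` from above, with randomly generated data:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 6))    # 3 rows in R^6, full row rank almost surely
v = rng.standard_normal(6)         # new row, almost surely not in rowspace(A)

P_A = row_space_projector(A)
P_B_update = append_row(P_A, v)                       # formula (*)
P_B_direct = row_space_projector(np.vstack([A, v]))   # from scratch

print(np.allclose(P_B_update, P_B_direct))  # True, up to rounding
```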

Theo Bendit
  • A warning is that this is essentially the Gram–Schmidt process, which can be subject to large rounding errors when implemented in floating-point arithmetic on a computer. It may be fine for appending one vector $v$, but I would be concerned about iterating this process for several steps. – eepperly16 Oct 16 '22 at 14:46
  • @eepperly16 Yes, good observation. Obviously, doing this exactly, the difference between $P_B$ and $P_A$ must be the rank-$1$ matrix $(ww^\top)/(w^\top w)$. However, since this isn't my forte, do you have any suggestions for a more numerically stable way to build approximations for these projection matrices vector by vector? (A sketch of the iterated, vector-by-vector update follows these comments.) – Theo Bendit Oct 16 '22 at 23:30
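
To illustrate the vector-by-vector process being discussed, here is a sketch that builds the projector by iterating the rank-$1$ update $(\star)$; the optional second application of $I - P$ is the standard "reorthogonalize" remedy for the Gram–Schmidt rounding issue eepperly16 mentions, offered here as an assumption rather than anything from the thread:

```python
import numpy as np

def projector_from_rows(rows, reorthogonalize=True, tol=1e-12):
    """Build the projection onto span(rows) one vector at a time by
    iterating the rank-1 update (*).  This is essentially classical
    Gram-Schmidt in projector form, as noted in the comments."""
    rows = np.atleast_2d(np.asarray(rows, dtype=float))
    n = rows.shape[1]
    P = np.zeros((n, n))
    for v in rows:
        w = v - P @ v
        if reorthogonalize:
            w = w - P @ w          # second pass: cheap re-orthogonalization
        ww = w @ w
        if ww > tol * max(v @ v, 1.0):
            P += np.outer(w, w) / ww
    return P
```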