
I wish to verify the following claim$^\color{magenta}{\dagger}$.

For any $A \in \mathbb{R}^{m \times n}$, $Ax = b \iff x = A^\dagger b + v, v \in N(A),$ where $A^\dagger$ is the pseudo-inverse of $A$ and $N(A)$ is the nullspace of $A$.

I am comfortable with the fact that we can write,

$$Ax = b,$$ as

$$A(x - v) = b,$$ where $Av = 0$ or $v \in N(A)$, $N$ the nullspace of $A$.

But I am not comfortable with the next operation, which is $$x - v = A^\dagger b \implies x = A^\dagger b + v.$$

The reason is that, from my understanding, the pseudo-inverse only exists under very specific circumstances. Namely, $n > m$ and $\text{rank}(A) = m$, for which $A^\dagger = A^\top(AA^\top)^{-1}$; or $m > n$ and $\text{rank}(A) = n$, for which $A^\dagger = (A^\top A)^{-1}A^\top$; or $\text{rank}(A) = m = n$, for which $A^\dagger = A^{-1}$. Without these rank conditions, even the pseudo-inverse may not exist.
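For concreteness, here is a small NumPy check of the first formula (the matrix is an arbitrary full-row-rank example of my own choosing):

```python
import numpy as np

# Quick check (arbitrary example) of the full-row-rank formula:
# for n > m with rank(A) = m, A^dagger = A^T (A A^T)^{-1}.
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0]])               # 2 x 3, rank 2
A_dag = A.T @ np.linalg.inv(A @ A.T)          # right-inverse formula
print(np.allclose(A_dag, np.linalg.pinv(A)))  # True
```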

Thoughts:

  • Since the claim does not specify any rank condition on the matrix $A$, I am hesitant to accept it as true.

  • Is it possible that the rank condition is always implicitly true? I think not, because $A$ could be a zero matrix of dimension $m \times n$, in which case the rank condition is violated and, by my reasoning above, the pseudo-inverse would not exist.

  • Question: Is the claim true? If not, is there some way to fix it?


$\color{magenta}{\dagger}$ Chong-Yung Chi, Wei-Chiang Li, Chia-Hsiang Lin, Convex Optimization for Signal Processing and Communications: From Fundamentals to Applications.

3 Answers


Let us answer your query by addressing some of your comments and your final thoughts.

> The reason is that, from my understanding, the pseudo-inverse only exists under very specific circumstances.

In fact, the pseudo-inverse always exists and is unique, for any matrix. In particular, if $A\in\mathbb{R}^{m\times n}$, then the following properties hold: $AA^\dagger=P_{\operatorname{ran}A}$ and $A^\dagger A=P_{\operatorname{ran}A^\dagger}$, where $P_C$ denotes the orthogonal projection onto the subspace $C$.

> Namely, $n>m$ and $\operatorname{rank}(A)=m$, for which $A^\dagger = A^\intercal(AA^\intercal)^{-1}$; or $m>n$ and $\operatorname{rank}(A)=n$, for which $A^\dagger = (A^\intercal A)^{-1}A^\intercal$; or $\operatorname{rank}(A)=m=n$, for which $A^\dagger=A^{-1}$.

These are examples of cases where the pseudo-inverse can be computed by an "easy" formula. In general, one can compute it using the SVD. If $A=U\Sigma V^\intercal$ is its singular value decomposition, then $A^\dagger=V\Sigma^\dagger U^\intercal$, where $(\Sigma^\dagger)_{ii}=1/\Sigma_{ii}$ if $\Sigma_{ii}\neq0$ and zero otherwise.
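For instance, one can sketch this computation in NumPy as follows (the matrix is an arbitrary rank-1 example, so none of the "easy" formulas apply):

```python
import numpy as np

# Sketch: build A^dagger from the SVD and compare with np.linalg.pinv.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 0.0]])                    # 3 x 2, rank 1

U, s, Vt = np.linalg.svd(A)                   # A = U @ Sigma @ Vt
tol = 1e-12
s_dag = np.array([1.0 / x if x > tol else 0.0 for x in s])
Sigma_dag = np.zeros((A.shape[1], A.shape[0]))
Sigma_dag[:len(s), :len(s)] = np.diag(s_dag)  # invert only nonzero singular values
A_dag = Vt.T @ Sigma_dag @ U.T
print(np.allclose(A_dag, np.linalg.pinv(A)))  # True
```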

> Is the claim true?

Based on the exact formulation given in the question, it is not true in general. The implication ($\Rightarrow$), however, always holds.

Proof of ($\Rightarrow$): Starting from $Ax=b$, we apply the pseudo-inverse on both sides of the equation to get $A^\dagger Ax=A^\dagger b$. Now, recall that $A^\dagger A=P_{\operatorname{ran}A^\dagger}$. Another property of the pseudo-inverse is that $\operatorname{ran}A^\dagger=\operatorname{ran}A^*=(\ker A)^\perp$. Hence, $A^\dagger A=P_{(\ker A)^\perp}=I-P_{\ker A}$, where $I$ is the identity matrix. Denoting $v:=P_{\ker A}(x)\in\ker A$, we get that $A^\dagger b=A^\dagger Ax=(I-P_{\ker A})(x)=x-v$. $\square$
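A quick numerical sanity check of this proof (a NumPy sketch; the matrix and $x$ are arbitrary, and $b$ is constructed so that the system is consistent):

```python
import numpy as np

# Sketch: for a consistent system, A^+ b = x - v with v = P_ker(A) x.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
x = np.array([3.0, -1.0, 2.0])
b = A @ x                                     # makes Ax = b consistent

A_dag = np.linalg.pinv(A)
P_ker = np.eye(3) - A_dag @ A                 # A^+ A = I - P_ker(A)
v = P_ker @ x
print(np.allclose(A_dag @ b, x - v))          # True
print(np.allclose(A @ v, np.zeros(2)))        # True: v lies in N(A)
```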

If we follow the same reasoning for ($\Leftarrow$), we get, from $x=A^\dagger b+v$ and applying $A$ on both sides, $Ax=AA^\dagger b+Av$. On the one hand, since $v\in N(A)$, we have $Av=0$. On the other hand, $AA^\dagger=P_{\operatorname{ran}A}$. Thus, we obtain the equation: $$Ax=P_{\operatorname{ran}A}b.$$ Since the statement does not assume $b\in\operatorname{ran}A$, it may happen that $P_{\operatorname{ran}A}b\neq b$, in which case $Ax\neq b$.
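A small NumPy sketch of this failure mode (the matrix and vectors below are an arbitrary choice with $b\notin\operatorname{ran}A$):

```python
import numpy as np

# Sketch: the converse fails when b lies outside ran A.
A = np.array([[1.0, 0.0],
              [0.0, 0.0]])                    # ran A = span{e1}, N(A) = span{e2}
b = np.array([1.0, 1.0])                      # NOT in ran A

A_dag = np.linalg.pinv(A)
v = np.array([0.0, 3.0])                      # an element of N(A)
x = A_dag @ b + v
print(A @ x)                                  # [1. 0.] = P_ran(A) b, not b
```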

> is there some way to fix it?

If we assume that $b\in\operatorname{ran}A$, then the claim in your question is true. This holds, for example, whenever $A$ is surjective, i.e., $\operatorname{rank}(A)=m$ (which requires $n\ge m$).

> Is it possible that the rank condition is always implicitly true?

After this analysis, we conclude that it is not. Consider your example $A=0$, for which $A^\dagger=0$. If $b\neq 0$, then every $x=A^\dagger b+v=v\in N(A)$ satisfies $Ax=0\neq b$.

Remarks: Good references for pseudo-inverses are Chapter II of Generalized Inverses of Linear Operators by Groetsch and Section 3.2 of Convex Analysis and Monotone Operator Theory in Hilbert Spaces (2nd edition) by Bauschke and Combettes. Both develop the theory of pseudo-inverses in Hilbert spaces, but it translates to matrices very easily. Also, note that the equation $Ax=P_{\operatorname{ran}A}b$ is the fundamental idea behind $A^\dagger$, and it is treated in both references.

Ikeroy
  • The Moore-Penrose pseudo inverse always exists and it is unique. – Mittens Apr 17 '25 at 14:26
  • Thank you for your suggestion. The previous sentence was badly expressed. It is changed now. – Ikeroy Apr 17 '25 at 14:36
  • If you are referring to the gray text on the second line of my answer, then this is the quote from the original post. Right after the quote, I clarify that it exists and it is unique as you state. – Ikeroy Apr 17 '25 at 16:04

Suppose $A^g$ is any generalized inverse of $A$, that is, any matrix satisfying $AA^gA=A$. In general there are infinitely many such generalized inverses.

Proposition: The equation $$Ay=b\tag{0}\label{zero}$$ has a solution iff $$b=AA^gb\tag{1}\label{one}$$

Indeed, if $y$ is a solution to \eqref{zero}, then $b=Ay=AA^gAy=AA^gb$. Conversely, if \eqref{one} holds, then $x_g:=A^gb$ is a solution to \eqref{zero}.
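A quick numerical illustration of the proposition (a sketch taking $A^g=A^+$, the Moore-Penrose inverse, which is in particular a generalized inverse; the matrix and vectors are an arbitrary choice):

```python
import numpy as np

# Sketch of the solvability test b == A A^g b from the proposition.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])                    # rank 1, ran A = span{(1, 2)}
A_g = np.linalg.pinv(A)                       # one particular generalized inverse

b_good = np.array([1.0, 2.0])                 # in ran A
b_bad = np.array([1.0, 0.0])                  # not in ran A
print(np.allclose(A @ A_g @ b_good, b_good))  # True:  Ay = b_good is solvable
print(np.allclose(A @ A_g @ b_bad, b_bad))    # False: Ay = b_bad is not
```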

From this, it follows that

Corollary: If $Ay=b$, then $y-A^gb\in\operatorname{N}(A)=\{v:Av=0\}$.

The converse of the corollary above is not true in general unless it is assumed that $b\in\operatorname{R}(A)=\{Aw:w\in\mathbb{R}^n\}$, not even when $A^g=A^+$, the Moore-Penrose pseudo-inverse. For if $b\notin\operatorname{R}(A)$, no $y\in\mathbb{R}^n$ satisfies $Ay=b$.

The Moore-Penrose pseudo-inverse $A^+$ of $A$ always exists and is unique. Furthermore, for any $b\in\mathbb{R}^m$, $x_+:=A^+b$ solves the problem $$x_+=\operatorname{arg\,min}\{\|x\|_2: \|Ax-b\|_2=\min_w\|Aw-b\|_2\}\tag{2}\label{two}$$ where $\|\cdot\|_2$ is the Euclidean norm.
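For illustration, a NumPy sketch of \eqref{two} (the matrix is an arbitrary rank-deficient example; np.linalg.lstsq also returns the minimum-norm least-squares solution in this case, so the two should agree):

```python
import numpy as np

# Sketch: A^+ b is the minimum-norm least-squares solution (2).
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])                    # rank 1
b = np.array([1.0, 0.0])                      # not in ran A

x_plus = np.linalg.pinv(A) @ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_plus, x_lstsq))           # True
```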

What is true is the following:

For any $y\in\mathbb{R}^n$, $y-A^+b\in\operatorname{N}(A)$ iff $y$ is a solution to the projection problem $$\operatorname{arg\,min}\{\|Aw-b\|_2: w\in\mathbb{R}^n\}.$$

Mittens

The claim is true if and only if $b\in\mathrm{im}\,A$.

I will use the inner-product-space definition of the pseudo-inverse (it is equivalent to the SVD-based one): $$T^\dagger=(T|_{(\mathrm{ker}\,T)^\perp})^{-1}P_{\mathrm{im}\,T},$$ where $P$ denotes the orthogonal projection operator. You can verify that $A^\dagger A=P_{(\mathrm{ker}\,A)^\perp}$. (I will not distinguish between matrices and linear maps.)

Suppose $Ax=b$. Then $A^\dagger Ax=A^\dagger b$, which implies that $P_{(\mathrm{ker}\,A)^\perp}x=A^\dagger b$. That immediately indicates that $x-A^\dagger b\in\mathrm{ker}\,A$, as desired.

For the other direction, you only need the identity $AA^\dagger=P_{\mathrm{im}\,A}$ (again, you can verify it yourself): if $x=A^\dagger b+v$ with $v\in\mathrm{ker}\,A$, then $Ax=AA^\dagger b=P_{\mathrm{im}\,A}b$, which equals $b$ precisely when $b\in\mathrm{im}\,A$.
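If you want to see these two identities concretely, here is a small NumPy sketch (the matrix is a random rank-2 example of my own choosing):

```python
import numpy as np

# Sketch: check A^+ A = P_{(ker A)^perp} and A A^+ = P_{im A} numerically.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) @ np.diag([1.0, 1.0, 0.0])  # rank 2

P1 = np.linalg.pinv(A) @ A                    # candidate for P_{(ker A)^perp}
P2 = A @ np.linalg.pinv(A)                    # candidate for P_{im A}
for P in (P1, P2):
    print(np.allclose(P, P.T), np.allclose(P @ P, P))  # symmetric, idempotent
print(np.allclose(P2 @ A, A))                 # P_{im A} fixes the columns of A
```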