
Let $A \in \mathbb{F}^{m \times n}$. How do you prove that the row rank of a matrix equals its column rank?

This question has been addressed here and here, but the explanation was descriptive in one case and somewhat involved in the other. The answer below is pitched at the level of an introductory linear algebra course.


3 Answers


There is an elegant proof here: https://en.wikipedia.org/wiki/Rank_(linear_algebra)#First_proof

But working through an example is better for intuition.

There are two equivalent views of the relationship between the vectors:

  • The column view: the columns as vectors
  • The row view: the rows, whose entries are the coefficients used to combine the column vectors

Take: $$ A = \begin{bmatrix}1 & 2 & 3 & -2\\0 & 1 & 1 & -1\end{bmatrix} $$ Note that the first two columns are independent and the next two columns are linear combinations of the first two. The structure looks like: $$ A=\begin{bmatrix}\alpha_1 & \beta_1 & \gamma(\alpha_1,\beta_1) & \delta(\beta_1)\\\alpha_2 & \beta_2 & \gamma(\alpha_2,\beta_2) & \delta(\beta_2)\end{bmatrix} $$ where $\gamma(x,y)=x+y$ and $\delta(x)=-x$.
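If you want to check these column relations concretely, here is a quick NumPy sketch (the code and variable names are mine, just an illustration, not part of the argument):

```python
import numpy as np

# Check the claimed column relations of the example matrix:
# column 3 = column 1 + column 2, and column 4 = -(column 2).
A = np.array([[1, 2, 3, -2],
              [0, 1, 1, -1]])

c1, c2, c3, c4 = A.T                 # unpacking A.T yields the columns of A
print(np.array_equal(c3, c1 + c2))   # True: gamma(x, y) = x + y
print(np.array_equal(c4, -c2))       # True: delta(x) = -x
```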

The column space

The column space $C(A)$ lives in $\mathbb{R}^2$ and is the set of all $A\vec x$, where $\vec x=\pmatrix{\alpha&\beta&\gamma&\delta}^T$ is a vector of 4 coefficients applied to the columns: $$ A\vec x = \alpha \pmatrix{1\\0}+\beta \pmatrix{2\\1}+\gamma\pmatrix{3\\1}+\delta\pmatrix{-2\\-1}\\ A\vec x = \alpha \pmatrix{1\\0}+\beta \pmatrix{2\\1}+\gamma\pmatrix{1+2\\0+1}-\delta\pmatrix{2\\1} \\ A\vec x = (\alpha+\gamma) \pmatrix{1\\0}+(\beta+\gamma-\delta) \pmatrix{2\\1} $$ So $C(A)$ is a 2-dimensional space spanned by the first 2 columns. Notice how only 2 coefficients out of 4 are really needed ...
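Here is a small sketch of the same collapse with arbitrary coefficients (again my own check, not part of the answer):

```python
import numpy as np

# For any coefficients (alpha, beta, gamma, delta), A @ x reduces to a
# combination of the first two columns only, with the coefficients derived above.
A = np.array([[1, 2, 3, -2],
              [0, 1, 1, -1]])
c1, c2 = A[:, 0], A[:, 1]

rng = np.random.default_rng(0)
alpha, beta, gamma, delta = rng.integers(-5, 6, size=4)
x = np.array([alpha, beta, gamma, delta])

print(np.array_equal(A @ x, (alpha + gamma) * c1 + (beta + gamma - delta) * c2))  # True
```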

Looking at the constraints on coefficients

The vector space of the coefficients $\vec x=\pmatrix{\alpha&\beta&\gamma&\delta}^T$ lives in $\mathbb{R}^4$, but we should ask ourselves: why was column 3 a linear combination of columns 1 and 2? It is because in every row, the 3rd entry (the $\gamma$) is the same linear combination ($\alpha+\beta$) of the first two entries of that row. If you add another row to $A$ where the 3rd entry is not exactly the sum of the first two, then column 3 would no longer be a linear combination of columns 1 and 2. Likewise, column 4 is a linear combination of the first two columns only because in every row the 4th entry (the $\delta$) is the opposite of the 2nd entry.

So now we have another, strictly equivalent way to say that columns 3 and 4 are linear combinations of columns 1 and 2, but in terms of rows:

All rows $(\alpha,\ \beta,\ \gamma,\ \delta)$ have the same structure $(\alpha,\ \beta,\ \alpha+\beta,\ -\beta)$
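This row-wise restatement is easy to verify mechanically (a hedged sketch on the same matrix, written by me for illustration):

```python
import numpy as np

# Every row (a, b, c, d) of A satisfies c = a + b and d = -b,
# which is exactly the structure (alpha, beta, alpha + beta, -beta).
A = np.array([[1, 2, 3, -2],
              [0, 1, 1, -1]])

for a, b, c, d in A:
    print(c == a + b, d == -b)   # True True for every row
```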

The row space

Note that the sum of 2 rows with structure $(\alpha,\ \beta,\ \alpha+\beta,\ -\beta)$ still has this structure, and that multiplying by a scalar also keeps the structure. So every vector in the row space, denoted $C(A^T)$, has this structure: $$ \pmatrix{\alpha\\ \beta\\ \alpha+\beta \\-\beta} = \alpha\pmatrix{1\\0\\1\\0} + \beta\pmatrix{0\\1\\1\\-1} $$

The whole row space is generated by 2 vectors and hence has dimension at most 2. In particular, $\text{row }1 = 1\cdot(1,0,1,0) + 2 \cdot (0,1,1,-1)$ and $\text{row }2 = 0\cdot(1,0,1,0) + 1 \cdot (0,1,1,-1)$. Because of the placement of the zeros and ones (they form an identity matrix in the first two positions), those 2 vectors are independent.
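Here is the same observation as a quick check (my sketch):

```python
import numpy as np

# The two structure vectors reproduce both rows of A with the stated
# coefficients, and they are independent (the leading 1/0 pattern).
v1 = np.array([1, 0, 1, 0])
v2 = np.array([0, 1, 1, -1])

print(1 * v1 + 2 * v2)   # [ 1  2  3 -2]  = row 1 of A
print(0 * v1 + 1 * v2)   # [ 0  1  1 -1]  = row 2 of A
print(np.linalg.matrix_rank(np.vstack([v1, v2])))   # 2: v1 and v2 are independent
```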

Sum up

So let's go back to the proof that $\dim(\text{row space}) = \dim(\text{column space})$, applied to our matrix $A$.

  1. We start with $\dim(\text{column space})=r=2$ for $A$. Suppose this is because column 3 is some linear combination $\gamma(x,y)$ of columns 1 and 2, and column 4 is some linear combination $\delta(x,y)$ of the same columns 1 and 2.
  2. Columns 3 and 4 are combinations of columns 1 and 2 exactly because all rows have the structure $(\alpha,\ \beta,\ \gamma(\alpha,\beta),\ \delta(\alpha,\beta))$, where $\gamma = c_1\alpha + c_2\beta$ and $\delta = k_1\alpha + k_2\beta$.
  3. The row space is precisely the set of all 4-dimensional vectors with structure $(\alpha ,\ \beta , \ c_1\alpha + c_2\beta , \ k_1\alpha + k_2\beta)$, which is $\alpha\,(1,\ 0 , \ c_1, \ k_1) + \beta\,(0,\ 1,\ c_2,\ k_2)$.
  4. Because the row space can be generated by $r$ vectors, its dimension is at most $r=2$, so that $\dim(\text{row space}) \leq r = \dim(\text{column space})$.
  5. Now you can either apply the same proof to $A^T$ (to get $\dim(\text{column space}) \leq r = \dim(\text{row space})$) or notice that the 2 row-space vectors are trivially independent because of the placement of the zeros.

It's easy to generalize.
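As a sanity check of the general statement (not a proof; the random matrices below are just an illustration of my own), you can compare the two ranks numerically:

```python
import numpy as np

# Compare the column rank of A with the column rank of A^T (= row rank of A)
# on a few random integer matrices.
rng = np.random.default_rng(1)
for _ in range(5):
    m, n = rng.integers(2, 7, size=2)
    A = rng.integers(-3, 4, size=(m, n))
    print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T))   # True every time
```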


Let $A \in \mathbb{F}^{m \times n}$ and let $R = \operatorname{RREF}(A)$.


The non-zero rows of $R$ are obtained from the rows of $A$ by invertible row operations. This means we can go back and forth between the rows of $A$ and the rows of $R$, so the non-zero rows of $R$ span the row space of $A$. Also, the non-zero rows of $R$ are linearly independent (any dependent rows are reduced to zero rows). Therefore, the non-zero rows of $R$ form a basis for the row space of $A$.

# of non-zero rows = # of leading $1$s in $R$ = dimension(row space of $A$) = row rank($A$)
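A small SymPy sketch of this first half (the $3\times 4$ matrix is my own example, with a deliberately redundant third row):

```python
import sympy as sp

# Row rank via RREF: count the non-zero rows, which equals the number of leading 1s.
A = sp.Matrix([[1, 2, 3, -2],
               [0, 1, 1, -1],
               [1, 3, 4, -3]])   # row 3 = row 1 + row 2, so the row rank is 2

R, pivot_cols = A.rref()
nonzero_rows = sum(1 for i in range(R.rows) if any(R.row(i)))
print(nonzero_rows, len(pivot_cols))   # 2 2: non-zero rows = leading 1s = row rank
```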


$R = EA$, where $E$ is a product of elementary matrices and hence invertible. Let $B = E^{-1}$, let $r_j$ be the $j$th pivot column of $R$, and let $a_j$ be the corresponding column of $A$.

$r_j = E a_j \Rightarrow a_j = B r_j$. Since $r_j$ is the $j$th pivot column of $R$, $r_j = I_j$ (the $j$th column of the identity matrix). Therefore, $a_j = B I_j$. Since the columns of $I$ are independent and $B$ is invertible, the $a_j$'s are independent (proving this is easy -- start from the definition of linear independence for the $I_j$'s and show that multiplication by the invertible matrix $B$ does not affect the relationship). Therefore, the pivot columns of $A$ are linearly independent.

Each non-pivot column of $R$ is linearly dependent on the pivot columns of $R$ (otherwise it would have been a pivot column). Since $a_j = B r_j$ for every column, each non-pivot column of $A$ is linearly dependent on the pivot columns of $A$. Therefore, the pivot columns of $A$ span the column space of $A$.

Therefore, the pivot columns of $A$ form a basis for the column space of $A$.

$\Rightarrow$ column rank($A$) = dimension(col space($A$)) = # of pivot columns = # of leading $1$s in $R$ = row rank ($A$)
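And a sketch of the second half on the same example matrix (again my own illustration): the pivot columns of $A$ flagged by RREF are independent and span the column space.

```python
import sympy as sp

# Pivot columns of A form a basis of the column space of A.
A = sp.Matrix([[1, 2, 3, -2],
               [0, 1, 1, -1],
               [1, 3, 4, -3]])

R, pivot_cols = A.rref()
P = A[:, list(pivot_cols)]             # the pivot columns of A itself
print(P.rank() == len(pivot_cols))     # True: they are linearly independent
print(P.rank() == A.rank())            # True: they already span the column space
```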



I will give two of my personal favorite proofs here for matrices over real numbers (as is done in a first course in linear algebra). Importantly, these proofs avoid the rather tedious description of echelon matrix structures and can be presented quickly as independent proofs.

The first is from an absolutely delightful article by George Mackiw in Mathematics Magazine (https://doi.org/10.1080%2F0025570X.1995.11996337).

Let $A$ be an $m\times n$ matrix whose row rank is $r$. Therefore, the dimension of the row space of $A$ is $r$ and let $x_1,\ldots,x_r$ be a basis of the row space of $A$. We claim that the vectors $Ax_1,\ldots,Ax_r$ are linearly independent. To prove this, we consider a linear homogeneous relation, $$ c_1Ax_1 + \cdots + c_rAx_r = 0 \Longrightarrow A(c_1x_1 + \cdots + c_rx_r) = 0, $$ and we prove that $c_1 = \cdots = c_r = 0$.

Let $v = c_1x_1 + \cdots + c_rx_r$. Then, $Av = 0$ which means that the dot product of $v$ with each row vector of $A$ is zero. Therefore, $v$ is orthogonal to each of the rows of $A$ and so $v$ is also orthogonal to any vector in the row space of $A$ (because any vector in the row space of $A$ is a linear combination of row vectors of $A$). But note that $v$ is itself in the row space of $A$ because it is a linear combination of a basis of the row space of $A$. This means that $v$ is orthogonal to itself, which means $v=0$. Therefore, $$ c_1x_1 + \cdots + c_rx_r = 0 \Longrightarrow c_1 = \cdots = c_r = 0 $$ because $x_1,\ldots,x_r$ were taken to be a basis of the row space (hence linearly independent). We have now proved that the coefficients in a linear combination of $Ax_1,\ldots, Ax_r$ are all equal to zero, so the $Ax_i$'s are linearly independent.

Now, each $Ax_i$ is obviously a vector in the column space of $A$, so $\{Ax_1,\ldots, Ax_r\}$ is a set of $r$ linearly independent vectors in the column space of $A$. So the dimension of the column space of $A$ (i.e., the column rank of $A$) must be at least as big as $r$. This proves that the row rank of $A$ is no larger than the column rank of $A$: $$ \mbox{row rank}(A) \leq \mbox{column rank}(A)\;. $$
Since this holds for any matrix, we can also apply it to the transpose of $A$ to conclude that the row rank of $A^{t}$ is no larger than the column rank of $A^{t}$. But the row (column) rank of $A^{t}$ is the column (row) rank of $A$. This yields the reverse inequality and the proof is complete.
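A numerical illustration of this argument on the $2\times 4$ matrix from the first answer (my sketch; any basis of the row space would do):

```python
import numpy as np

# Take a basis x1, x2 of the row space of A (here the two rows themselves,
# which are independent) and check that A @ x1, A @ x2 are independent.
A = np.array([[1, 2, 3, -2],
              [0, 1, 1, -1]], dtype=float)

x1, x2 = A
images = np.column_stack([A @ x1, A @ x2])
print(np.linalg.matrix_rank(images))   # 2: the A x_i are linearly independent
print(np.linalg.matrix_rank(A))        # 2: so column rank >= row rank here
```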


A second proof is posted here:

Sudipto Banerjee (https://math.stackexchange.com/users/19430/sudipto-banerjee), Looking for an intuitive explanation why the row rank is equal to the column rank for a matrix, URL (version: 2023-03-25): https://math.stackexchange.com/q/4367250

Briefly, define the rank of $A$ to be its column rank: $\operatorname{col rank}(A) = \dim \{Ax: x \in \mathbb{R}^n\}$. First we show that $A^{t}Ax = 0$ if and only if $Ax = 0$. If $Ax = 0$, then multiplying both sides by $A^{t}$ shows $A^{t}Ax = 0$. For the other direction, argue as follows: $$A^{t}Ax=0 \implies x^{t}A^{t}Ax=0 \implies (Ax)^{t}(Ax) = 0 \implies Ax = 0.$$ Therefore, the null spaces of $A$ and $A^{t}A$ are the same.
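This equality of null spaces is easy to check on a small example (a hedged sketch; the matrix is borrowed from the first answer and has a non-trivial null space):

```python
import sympy as sp

# The null spaces of A and A^T A coincide.
A = sp.Matrix([[1, 2, 3, -2],
               [0, 1, 1, -1]])

N_A = A.nullspace()
N_AtA = (A.T * A).nullspace()
print(len(N_A), len(N_AtA))                             # 2 2: same nullity
print(all(A.T * A * v == sp.zeros(4, 1) for v in N_A))  # True: N(A) is inside N(A^T A)
```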

Applying the rank plus nullity theorem (https://en.wikipedia.org/wiki/Rank%E2%80%93nullity_theorem) to $A$ and $A^{t}A$, we obtain that the sum of the dimensions of the column space and the null space of each of these matrices equals $n$ (the number of columns in $A$ and also in $A^{t}A$). Since the null spaces of $A$ and $A^{t}A$ are the same, we conclude that $\operatorname{col rank}(A) = \operatorname{col rank}(A^{t}A)$.

Therefore, $\operatorname{col rank}(A) = \operatorname{col rank}(A^{t}A) \leq \operatorname{col rank}(A^{t})$, where the last inequality follows from the fact that the columns of $A^{t}A$ are linear combinations of the columns of $A^{t}$ and, hence, in the column space of $A^{t}$. This proves that $\operatorname{col rank}(A) \leq \operatorname{col rank}(A^{t})$ for any matrix $A$. Applying this inequality to the matrix $A^{t}$ gives the reverse inequality and we conclude $\operatorname{col rank}(A) = \operatorname{col rank}(A^{t})$.
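Finally, a numerical illustration of the chain $\operatorname{col rank}(A) = \operatorname{col rank}(A^{t}A) \leq \operatorname{col rank}(A^{t})$ (my sketch; the random matrix below has rank 3 with probability 1):

```python
import numpy as np

# rank(A) = rank(A^T A), and both match rank(A^T).
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 7))   # rank 3 almost surely

print(np.linalg.matrix_rank(A))         # 3
print(np.linalg.matrix_rank(A.T @ A))   # 3
print(np.linalg.matrix_rank(A.T))       # 3
```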