
For a 2x2 matrix $$A = \begin{bmatrix}a & b \\ c & d\end{bmatrix}$$ its inverse is $$A^{-1} = (\det A)^{-1} \begin{bmatrix}d & -b \\ -c & a\end{bmatrix}.$$

This is easy to derive using Gaussian elimination and $\det A = ad - bc$. But what is the intuition behind it?

Let's write $$\tilde A :=\begin{bmatrix}d & -b \\ -c & a\end{bmatrix}$$ so that $$A^{-1} = (\det A)^{-1} \tilde A.$$ Intuiting the $(\det A)^{-1}$ scalar is simple: $A^{-1}$ must undo the expansion of area of $A$, and since $\tilde A$ has the same determinant (expansion of area) as $A$, we need to scale $\tilde A$ so that its determinant becomes the reciprocal of $\det A$. That is, we use $\lambda$ such that $\det (\lambda \tilde A) = (\det A)^{-1}$; since $\det (\lambda \tilde A) = \lambda^2\det \tilde A = \lambda^2\det A$, this forces $\lambda = (\det A)^{-1}$.
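As a quick numeric sanity check (my own addition, with an arbitrary example matrix), we can confirm that $\tilde A$ has the same determinant as $A$ and that $(\det A)^{-1}\tilde A$ really inverts $A$:

```python
# Sanity check: A_tilde = [[d, -b], [-c, a]] has det equal to det(A),
# and (det A)^{-1} * A_tilde is the inverse of A.
import numpy as np

A = np.array([[3.0, 1.0],
              [2.0, 4.0]])          # arbitrary invertible example
(a, b), (c, d) = A

A_tilde = np.array([[d, -b],
                    [-c, a]])
det_A = a * d - b * c

assert np.isclose(np.linalg.det(A_tilde), det_A)      # same determinant
assert np.allclose(A @ (A_tilde / det_A), np.eye(2))  # inverse property
```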

But what is the intuition behind $\tilde A$? Geometrically, how does $\tilde A$ relate to $A$?

SRobertJames
  • If you're happy with the determinant part, you can w.l.o.g. assume $\det(A)=1$, i.e. $A\in SL_2$. Further, one might focus on the cases $A$ diagonal and $A = \begin{pmatrix}\lambda & b\\ 0 & \lambda\end{pmatrix}$ (now w.l.o.g. with $\lambda =\pm1$), both of which should not be too hard to visualize. – Torsten Schoeneberg Jan 13 '25 at 22:45
  • By the way, this matrix is called the adjugate, or in some contexts the classical adjugate, and denoted $\operatorname{adj} A$. It satisfies the property that $A \, (\operatorname{adj} A) = (\det A) I$, which is an equation that doesn't involve any division and works even when the matrix is singular. Its entries are computed with cofactors, which are themselves determinants too! More here. – Sammy Black Jan 13 '25 at 23:47
  • You might find my video helpful as it provides a geometrical explanation of the adjugate. – blargoner Jan 14 '25 at 03:48
  • @blargoner Thanks. Could you please post a summary (in words or pictures) of your video as an answer? That will help me and everyone else who comes to this page (or Googles the question). – SRobertJames Jan 14 '25 at 13:20
  • "But what is the intuition behind it?" You are looking for a matrix $A^{-1}$ that satisfies $AA^{-1}=A^{-1}A=I$. For a $2\times 2$ nonsingular matrix, using the Gauss-Jordan method, you get $$ A^{-1} = \begin{bmatrix} \frac{d}{ad-bc} & \frac{-b}{ad-bc} \\ \frac{-c}{ad-bc} & \frac{a}{ad-bc} \end{bmatrix}. $$ If you take out the scalar $\frac{1}{ad-bc}$ (i.e. $\det(A)^{-1}$), you get the adjugate matrix. Take a look at What is the intuitive meaning of the adjugate matrix? – CroCo Jan 15 '25 at 21:36

1 Answer


Just expanding on my comment per request. I'll work in $V=\mathbb{R}^2$.

If $f:V\to\mathbb{R}$ is a nonzero linear functional, then $f$ can be viewed as measuring oriented lengths in $V$ (along a dual basis vector). There's a unique vector $v\in V$ such that $f$ actually measures oriented areas of parallelograms in $V$ relative to $v$: $$f(x)=\det(x,v)\tag{1}$$

Indeed, consider $D:V^2\to V$ defined by $$D(x,y)=f(x)y-f(y)x$$ Clearly $D$ is bilinear and alternating, so by the universal property of the determinant there is $v\in V$ unique with $D(x,y)=\det(x,y)v$. If $x=0$, then (1) holds trivially, and if $x\ne 0$ there is $y\in V$ with $\det(x,y)=1$, in which case $v=D(x,y)$ and hence $$\det(x,v)=\det(x,f(x)y-f(y)x)=f(x)\det(x,y)-f(y)\det(x,x)=f(x)$$ so (1) still holds.

Now given an invertible linear map (or matrix) $A:V\to V$, we have the induced nonzero linear functional $f_A=f\circ A$, and $f_A$ also measures oriented lengths. Again, there is $v'\in V$ so that $$f_A(x)=\det(x,v')$$ A very natural question is: what's the relationship between $v$ and $v'$? More specifically, when we apply $A$ before measuring lengths, how does the vector relative to which we're measuring areas change?

The answer is: $v'=\mathrm{adj}(A)v$ (or $v'=\tilde{A}v$ in your notation). In other words, $$\det(Ax,v)=\det(x,\mathrm{adj}(A)v)\tag{2}$$ and this holds for all $x,v\in V$ and fully characterizes $\mathrm{adj}(A)$.

If we take $v=Ay$ in (2), then for all $x\in V$, $$\begin{align*} \det(x,\mathrm{adj}(A)Ay)&=\det(Ax,Ay)\\ &=\det(A)\det(x,y)\\ &=\det(x,\det(A)Iy) \end{align*}$$ which implies $\mathrm{adj}(A)A=\det(A)I$ and $A^{-1}=\det(A)^{-1}\mathrm{adj}(A)$, so this is indeed the familiar adjugate.
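Both identities are easy to check numerically (my own sketch, with an assumed example matrix): equation (2) for random $x, v$, and the resulting $\mathrm{adj}(A)A=\det(A)I$.

```python
# Numeric check of det(Ax, v) = det(x, adj(A) v)  -- equation (2) --
# and of adj(A) A = det(A) I, with adj(A) = [[d, -b], [-c, a]].
import numpy as np

A = np.array([[3.0, 1.0],
              [2.0, 4.0]])          # arbitrary example
adj_A = np.array([[A[1, 1], -A[0, 1]],
                  [-A[1, 0], A[0, 0]]])
det = lambda x, y: x[0] * y[1] - x[1] * y[0]

rng = np.random.default_rng(0)
for _ in range(5):
    x, v = rng.standard_normal(2), rng.standard_normal(2)
    assert np.isclose(det(A @ x, v), det(x, adj_A @ v))   # equation (2)

assert np.allclose(adj_A @ A, np.linalg.det(A) * np.eye(2))
```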

I think this is more interesting to look at in the case of $\mathbb{R}^3$ (and higher dimension), but I leave the details of that to my video.

blargoner
  • Thank you. I want to confirm I understand this correctly. You're stating: any linear functional of $x \in \mathbb R^2$ can be thought of as measuring the area of a parallelogram with one side $x$ and the other a particular vector $v_f$, with $f \mapsto v_f$ a bijection. Then if $A$ is a matrix in $\mathbb R^2$ and $g(x) := f(Ax)$, then $v_g = \operatorname{adj}(A)v_f$. Is that accurate? – SRobertJames Jan 16 '25 at 20:30
  • @SRobertJames That's the idea, just note if the functional or vector is zero we're talking about degenerate geometric objects in that case. – blargoner Jan 16 '25 at 21:40
  • Fascinating. Can you prove that? Or at least do you have a diagram showing it? I can't see how to get this result. – SRobertJames Jan 16 '25 at 22:36
  • Oriented area ... – Ted Shifrin Jan 16 '25 at 23:49