Given a functor $F:C \to D$, we would like to define what it means for $G: D\to C$ to be a pseudo-inverse to $F$.
The most pedagogical way I have found is through the universal arrow definition (see it on the nLab here):
On objects
- Start with $F:C\to D$ and fix a $d\in D$ whose pseudo-inverse we want to find.
- If $d$ is uniquely covered by $F$, i.e. if $d=F(c)$ for a unique $c\in C$, we're done and $G(d):=c$ is our inverse.
- If $d$ is not covered by $F$, it is natural to look for (the) closest $F(c)\in D$ to it.
By "closest" I mean a shortest arrow between $d$ and the range of $F$ (the objects $F(c)$ for $c\in C$).
- Here we have to make a choice: either look for the closest above, i.e. a shortest arrow $d\to F(c)$, or the closest below, i.e. a shortest arrow $F(c)\to d$.
- Once we've found an object $F(c)\in D$ that we'll use to go back to $C$, we need to pick a preimage. In case there are many, we again have two natural choices: pick the least or the largest among them (if it exists), in agreement with the choice made in the previous step. Inverse of $d$ found, problem solved.
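For a concrete feel of the steps above, here is a minimal sketch in the simplest setting I know: posets viewed as categories, where an arrow $x\to y$ just means $x\le y$. The example (the inclusion of the integers into the rationals) and the names `closest_above` and `closest_below` are my own choices for illustration, not part of the construction itself.

```python
import math
from fractions import Fraction


def F(c: int) -> Fraction:
    """Inclusion of the integers into the rationals: monotone, not surjective."""
    return Fraction(c)


def closest_above(d: Fraction) -> int:
    """Least integer c with d <= F(c), i.e. the 'shortest arrow d -> F(c)'."""
    return math.ceil(d)


def closest_below(d: Fraction) -> int:
    """Greatest integer c with F(c) <= d, i.e. the 'shortest arrow F(c) -> d'."""
    return math.floor(d)


d = Fraction(7, 2)  # not in the range of F
print(closest_above(d), closest_below(d))  # prints: 4 3
```

When $d$ is already of the form $F(c)$, both functions return that same $c$, which matches the "uniquely covered" case in the list.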
In short, there are two kinds of problems that may occur with respect to invertibility of $F$: it is either not surjective or not injective:
- If some $d\in D$ has no preimages via $F:C\to D$, find the closest $d'\in D$ (above or below) that has one.
- If some $d$ has many preimages via $F$, select the best (initial or terminal) among them.
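The second problem (many preimages) can be sketched the same way, again in a poset where arrows are just $\le$. The map `F(c) = c // 2` and the finite search window are assumptions I picked for illustration; the brute-force search only stands in for "select the initial or terminal candidate".

```python
def F(c: int) -> int:
    """Monotone but not injective: 2k and 2k+1 are both sent to k."""
    return c // 2


# Hypothetical finite window standing in for all integers.
WINDOW = range(-20, 21)


def least_preimage_above(d: int) -> int:
    """Least c in the window with d <= F(c) (the initial choice)."""
    return min(c for c in WINDOW if d <= F(c))


def greatest_preimage_below(d: int) -> int:
    """Greatest c in the window with F(c) <= d (the terminal choice)."""
    return max(c for c in WINDOW if F(c) <= d)


print(least_preimage_above(3), greatest_preimage_below(3))  # prints: 6 7
```

Both answers are genuine preimages of $3$; the two universal properties simply pick opposite ends of the fibre.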
Fortunately, the notion of initial arrows captures both finding the shortest arrows of the form $d\to F(c)$ and selecting the least $c$ that works (dually for terminal arrows).
So a left or right adjoint of $F$ is given on objects by selecting universal arrows $d\to F(c)$ or $F(c)\to d$ (called units/counits) that give the closest invertible object and defining $G(d):=c$.
Hence (nLab):
The left part of a pair of adjoint functors is one of two best approximations to a weak inverse of the other functor of the pair.
On arrows
We defined $G$ on objects above; to make it a functor it has to act on arrows.
Take an arrow $d_1\to d_2$ in $D$. If both $d_1,d_2$ admit universal arrows to (from) $F$, we have a diagram that we want to complete:
$$\require{AMScd}
\begin{CD}
d_1 @>>> Fc_1 \\
@VVV @V{?}V{?}V \\
d_2 @>>> Fc_2\\
\end{CD}$$
The goal is to find a pseudo-preimage in $C$ of $d_1\to d_2$ through $F$. The $Fc_1\to Fc_2$ arrow is the unique factorization of the composite $d_1\to d_2 \to Fc_2$ through the initial arrow $d_1\to Fc_1$.
This may seem hard if you're not well-versed in initial arrows $d\to Fc$ (I certainly was not, and got stuck unable to prove functoriality of $G$ for a while; that's why I decided to add this to the answer later :)). In particular, such arrows are compared (in the comma category $d\downarrow F$) not just by any arrows $Fc_1\to Fc_2$, but by arrows $c_1\to c_2$ sent through $F$.
So in the above diagram, initiality of $d_1\to Fc_1$ means not just that $d_1\to d_2 \to Fc_2$ factors via some $Fc_1\to Fc_2$, but that there exists a unique arrow $c_1 \to c_2$ in $C$ whose image under $F$, the arrow $Fc_1\to Fc_2$ in $D$, makes the square commute.
Thus a natural candidate for the preimage of $d_1\to d_2$ is precisely that arrow $c_1\to c_2$, and we have defined $G$ on arrows. To call it a functor it remains to show that it respects composition and identities, which is now easy: both follow from the uniqueness of the factorization.
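Spelled out, the check that $G$ respects composition goes as follows. Write $\eta_i: d_i\to Fc_i$ for the chosen initial arrows (the subscripted name $\eta_i$ is my notation), and take $f:d_1\to d_2$ and $g:d_2\to d_3$. By definition, $Gf$ is the unique arrow $c_1\to c_2$ with $F(Gf)\circ\eta_1=\eta_2\circ f$, and similarly $F(Gg)\circ\eta_2=\eta_3\circ g$. Then

$$\begin{aligned}
F(Gg\circ Gf)\circ\eta_1 &= F(Gg)\circ F(Gf)\circ\eta_1 \\
&= F(Gg)\circ\eta_2\circ f \\
&= \eta_3\circ g\circ f,
\end{aligned}$$

so $Gg\circ Gf$ satisfies the equation that defines $G(g\circ f)$, and uniqueness gives $G(g\circ f)=Gg\circ Gf$. In the same way, $F(\mathrm{id}_{c_1})\circ\eta_1=\eta_1\circ\mathrm{id}_{d_1}$ gives $G(\mathrm{id}_{d_1})=\mathrm{id}_{c_1}$.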
For proof of correctness and equivalence to the unit-counit definition, here's the nlab link again.
--
There is a beautiful story that we can see in this paragraph, too. While above we picked an object and searched for the closest invertible (via $F$) object to it, here we picked an arrow and found the closest invertible (via $F$) arrow to it.
In fact, picture any commuting square (or rectangle) in $D$ having $d_1\to d_2$ at the bottom and some arrow $F(c_1 \to c_2)$ on top. If among all such "tall" shapes there's a shortest one (comparison done in the comma category as above), we have a natural best preimage of the $D$ arrow. If such smallest squares exist for all arrows in $D$, we get a functor that we call the left or right adjoint depending on whether we took the closest from above or the closest from below, respectively.
The usual "intuitive" explanation (e.g. youtube videos) starts with the definition of an inverse
$$F\circ G=\textrm{Id}_D \;\textrm{and}\; G\circ F=\textrm{Id}_C$$ and proceeds by first relaxing equality to natural isomorphisms and then to natural transformations $\eta:\textrm{Id}\Rightarrow R\circ L$ and $\epsilon:L\circ R\Rightarrow \textrm{Id}$, where $L$ denotes the left and $R$ the right adjoint.
I put "intuitive" in quotes because this makes a seemingly arbitrary choice about the direction of the arrows. To me this generalization is not "natural" unless one explains why the arrows should be thus oriented (which I have not seen done). It also requires that one comes up with the triangle identities that I find hard to motivate a priori.
The above view also gives very specific meaning to the unit $\eta$ and counit $\epsilon$: they point to/from a closest invertible object (which is kinda obvious from their signatures and the triangle identities, but I rarely ponder on these and had not realized this before).
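To see the unit and counit "pointing to the closest invertible object" in a toy case, here is a hedged sketch with posets as categories (an arrow $x\to y$ means $x\le y$). I assume $F$ is the inclusion of the integers into the rationals and $G$ is the ceiling, which plays the role of the left adjoint; the helper names are hypothetical. In a poset any two parallel arrows coincide, so the triangle identities hold automatically and what remains visible on objects is $FGF=F$ and $GFG=G$.

```python
import math
from fractions import Fraction


def F(c: int) -> Fraction:
    """Inclusion of the integers into the rationals (here the right adjoint)."""
    return Fraction(c)


def G(d: Fraction) -> int:
    """Ceiling: least integer c with d <= F(c) (here the left adjoint)."""
    return math.ceil(d)


def unit_holds(d: Fraction) -> bool:
    """Unit at d: an arrow d -> F(G(d)), i.e. d <= F(G(d))."""
    return d <= F(G(d))


def counit_holds(c: int) -> bool:
    """Counit at c: an arrow G(F(c)) -> c, i.e. G(F(c)) <= c."""
    return G(F(c)) <= c


def triangles_hold(c: int, d: Fraction) -> bool:
    """Object-level shadow of the triangle identities: FGF = F and GFG = G."""
    return F(G(F(c))) == F(c) and G(F(G(d))) == G(d)


sample_d = [Fraction(n, 3) for n in range(-9, 10)]
sample_c = list(range(-3, 4))
print(all(unit_holds(d) for d in sample_d))    # prints: True
print(all(counit_holds(c) for c in sample_c))  # prints: True
```

Here the unit $d\le\lceil d\rceil$ is exactly the shortest arrow from $d$ up to the range of $F$, while the counit is an identity because $F$ is injective.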