Understanding $\text{Ad}\circ \exp = \exp \circ \text{ad}$, domains and ranges

Question

$\newcommand{\ad}[0]{\text{ad}}$ $\newcommand{\Ad}[0]{\text{Ad}}$ $\newcommand{\R}[0]{\mathbb{R}}$ $\newcommand{\GL}[0]{\text{GL}}$ $\newcommand{\End}[0]{\text{End}}$ $\newcommand{\Aut}[0]{\text{Aut}}$ $\newcommand{\Der}[0]{\text{Der}}$

I am studying $$ \Ad\circ\exp = \exp \circ \ad $$ I'm following Hall's Lie Groups, Lie Algebras, and Representations: An Elementary Introduction (second edition). For now, I am only interested in identifying these maps in the context of matrix Lie groups/algebras.

Hall first presents this identity in chapter 3 after introducing the concept of the Lie algebra corresponding to a Lie group. He also has a nice exercise in chapter three exercise 14 about this where $$ \Ad_{e^X}Y = e^{\ad_X}Y $$ for $X\in M_n(\R)$.

I am trying to work out the domains and ranges of the four functions that appear in this identity: $\Ad$, $\ad$, $\exp$ and $\exp$. In the general context of Lie theory I think \begin{align*} \Ad: G &\to H\\ \ad: \mathfrak{g} &\to \mathfrak{h}\\ \exp: \mathfrak{g} &\to G\\ \exp: \mathfrak{h} &\to H \end{align*} which is very nice. But I'm trying to figure out how to spell this out in the matrix case.

edit: The above note is suspect. I think it would be better suited for a general Lie group homomorphism $\Phi: G\to H$ and the corresponding Lie algebra homomorphism $\phi: \mathfrak{g}\to \mathfrak{h}$. For $\Ad$ and $\ad$ specifically I think we need to specify $H = \Aut(\mathfrak{g})$ and $\mathfrak{h} = \text{Lie}(\Aut(\mathfrak{g}))$, but it's not clear to me what these last two sets are in the case of matrix Lie groups/algebras.

In Hall, in the matrix case, $\Ad$ is said to map $G\to \GL(\mathfrak{g})$. In the case where $X$ is any matrix, e.g. $X\in M_n(\R)$, then I think $\mathfrak{g} = M_n(\R)$ and $G = \GL(n;\R)$.

He then identifies $H$ with $\GL(\mathfrak{g})$ thinking of $\mathfrak{g} = M_n(\R)$ as a vector space with some dimension $k=n^2$ in this case.

I'm wondering if this can be presented more symmetrically as follows. Instead of using $M_n(\R)$ and $\GL(n; \R)$ what about using \begin{align*} M_n(\R) =& \End(\R^n)\\ \GL(n; \R) =& \Aut(\R^n) \end{align*} Then I think we have \begin{align*} G =& \Aut(\R^n)\\ \mathfrak{g} =& \End(\R^n)\\ H =& \Aut(\End(\R^n))\\ \mathfrak{h} =& \End(\End(\R^n)) \end{align*} and \begin{align*} \Ad: \Aut(\R^n) \to& \Aut(\End(\R^n))\\ \ad: \End(\R^n) \to& \End(\End(\R^n))\\ \exp: \End(\R^n) \to& \Aut(\R^n)\\ \exp: \End(\End(\R^n)) \to& \Aut(\End(\R^n)) \end{align*}

Of course \begin{align*} \Ad(X)(Y) =& XYX^{-1}\\ \ad(X)(Y) =& [X, Y]\\ \exp(X) =& \sum_{n=0}^{\infty} \frac{X^n}{n!} \end{align*} where the last definition holds whether $X\in\End(\R^n)$ or $X\in \End(\End(\R^n))$.

Then we can see that both sides of $$ \Ad\circ \exp = \exp \circ \ad $$ map $\End(\R^n) \to \Aut(\End(\R^n)).$ That is, both sides take a matrix and return an automorphism on the set of matrices.

My main question: Am I correct in all of my identifications and interpretations?

I know from e.g. the Wikipedia page on the Adjoint Representation, that in the general Lie theory the Lie algebra is thought of as the tangent space of the Lie group (thought of as a differential manifold) at the identity and in this case $\ad$ maps $\mathfrak{g}$ into "derivation algebra" of $\mathfrak{g}$ rather than into $\End(\mathfrak{g})$ as I've indicated here. But like I said, I'm just trying to keep it simple sticking with matrix Lie algebras. Maybe the derivation algebra of $\mathfrak{g}$ is a subspace of $\End(\mathfrak{g})$ meaning my description above is still a correct way to understand why $\Ad\circ \exp = \exp \circ \ad$ makes sense? Maybe the modification taking this into account would look like

\begin{align*} G =& \Aut(\R^n)\\ \mathfrak{g} =& \Der(\R^n)\\ H =& \Aut(\Der(\R^n))\\ \mathfrak{h} =& \Der(\Der(\R^n)) \end{align*} and \begin{align*} \Ad: \Aut(\R^n) \to& \Aut(\Der(\R^n))\\ \ad: \Der(\R^n) \to& \Der(\Der(\R^n))\\ \exp: \Der(\R^n) \to& \Aut(\R^n)\\ \exp: \Der(\Der(\R^n)) \to& \Aut(\Der(\R^n)) \end{align*} where perhaps $\Der(\R^n)$ is equal to $M_n(\R)$ as a set, but with the commutator taken as the product operation rather than the usual matrix multiplication product? Then both sides of $\Ad\circ \exp = \exp \circ \ad$ map $\Der(\R^n) \to \Aut(\Der(\R^n))$.

Where you write "perhaps", I can say that $Der(L)\cong M_n(K)$ as vector spaces for an abelian Lie algebra $L$ with $\dim_K L=n$, because every linear map $D$ satisfies $D([x,y])=[x, D(y)]+[D(x),y]$ for the zero Lie bracket. — Dietrich Burde, Jan 12 '25 at 12:36
Why would we need $H$ to be $\operatorname{Aut}(\mathfrak{g})$ rather than simply $GL(\mathfrak{g})$? We only need the codomain of a function, not the image (and $\operatorname{Aut}(\mathfrak{g})$ is certainly not guaranteed to be the image in general; the image is the adjoint group $\operatorname{Ad}G$). — Callum, Jan 12 '25 at 20:14
Also $\operatorname{Aut}(\mathbb{R}^n)$ is a little unclear. "Automorphism" means we are preserving some structure so using this notation for $\mathbb{R}^n$ suggests we are trying to preserve additional structure (e.g. Euclidean) since we already have established notation for the group preserving its vector space structure $\operatorname{GL}(n,\mathbb{R})$ — Callum, Jan 12 '25 at 20:14
@Callum yeah.. my notation in the final section was a bit speculative. I agree that basically a lot of what I said or notated there is nonsensical. — Jagerber48, Jan 13 '25 at 00:36
@Callum what is the difference between $\text{GL}(\mathfrak{g})$ and $\text{Aut}(\mathfrak{g})$? — Jagerber48, Jan 13 '25 at 00:37
One is the General Linear group and one is the Automorphism group. The latter is a subgroup of the former. It is the subgroup of elements which preserve the Lie bracket. — Callum, Jan 13 '25 at 01:08
@Callum got it, that is helpful. Well to your point, we can basically relax the codomains of $\text{Ad}_X$ and $\text{ad}_X$ all the way to $\mathfrak{g}^\mathfrak{g}$, just all functions from $\mathfrak{g}\to\mathfrak{g}$. This is sufficient to define and prove the result on matrices. It is only when we want to highlight the additional Lie structure that we want to bother with $\text{GL}(\mathfrak{g})$ or $\text{Aut}(\mathfrak{g})$. Do you agree with what I say here? — Jagerber48, Jan 13 '25 at 02:37
No it is important that $\operatorname{GL}(\mathfrak{g})$ is a Lie group and $\operatorname{End}(\mathfrak{g})$ is its Lie algebra. We certainly don't want to go all the way down to $\mathfrak{g}^\mathfrak{g}$. — Callum, Jan 13 '25 at 07:05

Jagerber48 · Answer 1 · 2025-01-13T01:21:51.430

$\newcommand{\Ad}[0]{\text{Ad}}$ $\newcommand{\ad}[0]{\text{ad}}$ $\newcommand{\R}[0]{\mathbb{R}}$ $\newcommand{\GL}[0]{\text{GL}}$ $\newcommand{\Aut}[0]{\text{Aut}}$ $\newcommand{\Der}[0]{\text{Der}}$ $\newcommand{\Lie}[0]{\text{Lie}}$ $\newcommand{\mf}[1]{\mathfrak{#1}}$

The OP is interested in proving \begin{align*} \tag{1} \Ad_{e^X}(Y) =& e^{\ad_X}(Y)\\ e^X Y e^{-X} =& \sum_{m=0}^{\infty}\frac{1}{m!}[\underbrace{X, [\ldots, [X}_{m\text{ times}}, Y]\ldots]] \end{align*} with $$ \Ad_X(Y) = XYX^{-1} $$ for invertible $X$ and any square $Y$ and $$ \ad_X(Y) = [X, Y] = XY - YX $$ for any square $X, Y$ and with $\exp$ defined as $$ \exp(X) = e^X = \sum_{m=0}^{\infty} \frac{X^m}{m!} $$ whether $X$ is a square matrix or a function on square matrices. This expression can be condensed as $$ \tag{*} \Ad\circ \exp = \exp \circ \ad, $$ but in condensing it like this, care must be taken that the domains and ranges of all the functions are compatible. The question asks how the domains and ranges of $\ad$, $\Ad$, $\exp$ and $\exp$ (one version of $\exp$ for matrices and one for functions on matrices) should be defined so that $(*)$ is sensible.

I will answer at three levels of sophistication, at each level increasing the generality and connection with general Lie theory.

Level One

At the first level we take $(1)$ at face value as a matrix equation. At this level things are surprisingly simple (at least relative to the full Lie theory machinery). We introduce

$M_n(\R)$ is the set of all $n\times n$ matrices which represent linear transformations on $\R^n$,
$\GL(n; \R)$ is the subset of $M_n(\R)$ corresponding to invertible matrices (those with non-zero determinant)
$\mathcal{F}(X, Y)=Y^X$ is the set of all functions from $X$ to $Y$.

We then have

\begin{align*} \Ad:&\: \GL(n; \R) \to \mathcal{F}(M_n(\R), M_n(\R))\\ \exp:&\: M_n(\R) \to \GL(n; \R)\\ \ad:&\: M_n(\R) \to \mathcal{F}(M_n(\R), M_n(\R))\\ \exp:&\: \mathcal{F}(M_n(\R), M_n(\R)) \to \mathcal{F}(M_n(\R), M_n(\R)) \end{align*} In this case we can see that both sides of $$ \Ad\circ \exp = \exp \circ \ad $$ map a matrix in $M_n(\R)$ to a matrix function in $\mathcal{F}(M_n(\R), M_n(\R))$. At this level the theorem can be proven with induction, combinatorics and some calculus on infinite series, as in Hall chapter 3 exercise 14. This already essentially answers the OP's question.
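As a sanity check on $(1)$ at this matrix-only level, here is a minimal numerical sketch in plain Python. It is not Hall's proof; it only compares truncated series for one arbitrarily chosen pair of $2\times 2$ matrices. The helper names (`expm`, `exp_ad`, `max_abs_diff`) and the truncation at 40 terms are assumptions made for illustration.

```python
# Numerical check of e^X Y e^{-X} = sum_m (1/m!) ad_X^m(Y) for small matrices.
# All helpers are illustrative; matrices are plain lists of lists.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def commutator(X, Y):
    # ad_X(Y) = XY - YX
    return add(matmul(X, Y), scale(-1.0, matmul(Y, X)))

def expm(X, terms=40):
    # Truncated power series sum_{m < terms} X^m / m!
    result = identity(len(X))
    term = identity(len(X))
    for m in range(1, terms):
        term = scale(1.0 / m, matmul(term, X))
        result = add(result, term)
    return result

def exp_ad(X, Y, terms=40):
    # Truncated series sum_{m < terms} ad_X^m(Y) / m!
    result = Y
    term = Y
    for m in range(1, terms):
        term = scale(1.0 / m, commutator(X, term))
        result = add(result, term)
    return result

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Arbitrary test matrices (an assumption; any small X, Y should do).
X = [[0.3, -0.5], [0.2, 0.1]]
Y = [[1.0, 2.0], [3.0, 4.0]]

lhs = matmul(matmul(expm(X), Y), expm(scale(-1.0, X)))  # Ad_{e^X}(Y)
rhs = exp_ad(X, Y)                                      # e^{ad_X}(Y)
print(max_abs_diff(lhs, rhs))  # should be at floating-point rounding level
```

Note that `exp_ad` only ever applies the series to the linear map $\ad_X$, which is all that $(1)$ requires.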

Level Two

The first level made no connection to Lie theory; rather, it just looked at $(*)$ as an expression on matrices. In this section we connect with matrix Lie theory as in Hall.

In chapter 3, Hall proves that if $G$ and $H$ are Lie groups with corresponding Lie algebras $\mf{g}$ and $\mf{h}$ and $\Phi: G\to H$ is a (Lie group) homomorphism, then there exists a (Lie algebra) homomorphism $\phi: \mf{g}\to\mf{h}$ such that, for all $X\in \mf{g}$,

$$ \Phi(e^X) = e^{\phi(X)} $$

At this level of sophistication we will have that $\Phi=\Ad$ and $\phi=\ad$. In our specific case we have \begin{align*} G =& \GL(n; \R)\\ \mathfrak{g} =& M_n(\R). \end{align*} This agrees with the previous level of sophistication and also checks out with the fact that $M_n(\R) = \Lie(\GL(n;\R))$ with the Lie bracket given by the matrix commutator is indeed the Lie algebra corresponding to $\GL(n; \R)$.

But it remains to figure out what $H$ and $\mf{h}$ are. Here we have $$ \Ad_X(Y) = XYX^{-1} $$ with $X\in \GL(n;\R)$ so it makes sense to write down $X^{-1}$. It is again the case that $Y\in M_n(\R)$ is any matrix, but this time we understand it to be an element of $\mf{g}$ and, importantly, $\Ad_X$ preserves the commutator. Furthermore, $\Ad_X$ is the identity if $X=I\in \GL(n;\R)$ and $\Ad_X$ is inverted by $\Ad_{X^{-1}}$. This means that $\Ad_X: \mf{g}\to \mf{g}$ is a Lie algebra automorphism. If $\Ad$ maps $X\mapsto \Ad_X$ then what space does $\Ad_X$ live in? It's not totally clear what the notation for this space should be, but I suggest $\Aut(\mf{g})$, as used on the Wikipedia page for the adjoint representation. $\Aut(\mf{g})$ is the set of Lie algebra automorphisms of the Lie algebra $\mathfrak{g}$. We may also write $\Aut(M_n(\R))$. Hall uses $\GL(\mf{g})$.
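The claim that $\Ad_X$ preserves the commutator and is inverted by $\Ad_{X^{-1}}$ can be spot-checked numerically. The following is only a sketch for one hand-picked invertible $2\times 2$ matrix; the helper names (`inverse_2x2`, `Ad`, `max_abs_diff`) are hypothetical and not from Hall.

```python
# Spot check: Ad_X([Y, Z]) = [Ad_X(Y), Ad_X(Z)] and Ad_{X^{-1}} undoes Ad_X.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def commutator(Y, Z):
    return sub(matmul(Y, Z), matmul(Z, Y))

def inverse_2x2(A):
    # Explicit inverse; only valid for 2x2 matrices with non-zero determinant.
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def Ad(X, Y):
    # Ad_X(Y) = X Y X^{-1}
    return matmul(matmul(X, Y), inverse_2x2(X))

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Arbitrary choices (assumptions): X has determinant 1, so it lies in GL(2; R).
X = [[2.0, 1.0], [1.0, 1.0]]
Y = [[0.0, 1.0], [2.0, 3.0]]
Z = [[1.0, -1.0], [0.5, 2.0]]

bracket_then_conjugate = Ad(X, commutator(Y, Z))
conjugate_then_bracket = commutator(Ad(X, Y), Ad(X, Z))
round_trip = Ad(inverse_2x2(X), Ad(X, Y))

print(max_abs_diff(bracket_then_conjugate, conjugate_then_bracket))
print(max_abs_diff(round_trip, Y))
```

Both printed differences should be zero up to rounding, which is what being a Lie algebra automorphism requires in this example.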

For $\ad = \phi$ we know $\mf{g} = M_n(\R)$ is the domain, but what range does $\ad_X$ live in? $\ad_X$ takes an element of $\mf{g}$ and returns a new one. We expect that the range of $\ad$ is $\mf{h}$, but what is this? We can characterize $\mf{h}$ a few ways. One is by noting, from the theorem on $\Phi$ and $\phi$, that $\mf{h} = \Lie(H)$ is the Lie algebra corresponding to the Lie group $H= \Aut(\mf{g})$ so we could write $$ \mf{h} = \Lie(H) = \Lie(\Aut(\mf{g})). $$ We can then note, either by inspection of $\ad_X$ or by studying $\text{Lie}(\Aut(\mf{g}))$ as in <<Add Link!>> that $\ad_X$ is a derivation on $\mf{g}$ meaning $$ \ad_X([Y, Z]) = [\ad_X(Y), Z] + [Y, \ad_X(Z)]. $$ This means we can write $\ad_X\in \text{Der}(\mathfrak{g})$. We then have

\begin{align*} G =& \GL(n; \R)\\ \mf{g} =& M_n(\R)\\ H =& \Aut(\mf{g}) = \Aut(M_n(\R))\\ \mf{h} =& \Der(\mf{g}) = \Der(M_n(\R)) \end{align*} with \begin{align*} \Ad:&\: \GL(n; \R) \to \Aut(\mf{g})\\ \exp:&\: M_n(\R) \to \GL(n;\R)\\ \ad:&\: M_n(\R) \to \Der(M_n(\R))\\ \exp:&\: \Der(M_n(\R)) \to \Aut(M_n(\R)) \end{align*} noting that the elements of $\Aut(M_n(\R))$ preserve the matrix commutator on $M_n(\R)$ and $\Der(M_n(\R))$ is the set of derivations on $M_n(\R)$ as a Lie algebra. Both sides of $$ \Ad \circ \exp = \exp \circ \ad $$ then map a matrix in $M_n(\R)$ to a Lie algebra automorphism in $\Aut(\mf{g}) = \Aut(M_n(\R))$.
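The derivation property of $\ad_X$ used above is the Jacobi identity in disguise, and it can be checked exactly with integer matrices. This is a small illustrative sketch; the matrices are arbitrary and the helper names are my own.

```python
# Exact check that ad_X is a derivation of the commutator bracket:
# ad_X([Y, Z]) = [ad_X(Y), Z] + [Y, ad_X(Z)].
# Integer entries keep the arithmetic exact, so == comparison is safe here.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def commutator(A, B):
    return sub(matmul(A, B), matmul(B, A))

def ad(X, Y):
    # ad_X(Y) = [X, Y]
    return commutator(X, Y)

# Arbitrary 3x3 integer matrices (assumptions chosen for illustration).
X = [[1, 2, 0], [0, -1, 3], [4, 0, 2]]
Y = [[0, 1, 1], [2, 0, -1], [1, 3, 0]]
Z = [[2, 0, 1], [-1, 1, 0], [0, 2, 5]]

lhs = ad(X, commutator(Y, Z))
rhs = add(commutator(ad(X, Y), Z), commutator(Y, ad(X, Z)))
print(lhs == rhs)
```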

Note that $\Aut(M_n(\R)), \Der(M_n(\R)) \subset \mathcal{F}(M_n(\R), M_n(\R))$ so this level of sophistication is consistent with the last.
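To make the codomains concrete, one can write $\ad_X$ and $\Ad_{e^X}$ as $n^2\times n^2$ matrices acting on the $n^2$-dimensional vector space $M_n(\R)$ and exponentiate the first one as an ordinary matrix. The sketch below does this for $n=2$, using the basis of elementary matrices $E_{ij}$ in row-major order (a choice of mine, not something fixed by Hall); `matrix_of` and the other helpers are hypothetical names.

```python
# Represent linear maps on M_2(R) as 4x4 matrices and check exp(ad_X) = Ad_{e^X}.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(X, terms=40):
    # Truncated power series; works for any square size, including 4x4.
    result, term = identity(len(X)), identity(len(X))
    for m in range(1, terms):
        term = scale(1.0 / m, matmul(term, X))
        result = add(result, term)
    return result

def commutator(X, Y):
    return add(matmul(X, Y), scale(-1.0, matmul(Y, X)))

def basis(n):
    # Elementary matrices E_ij in row-major order.
    mats = []
    for i in range(n):
        for j in range(n):
            E = [[0.0] * n for _ in range(n)]
            E[i][j] = 1.0
            mats.append(E)
    return mats

def flatten(M):
    # Coordinates of M in the E_ij basis (row-major).
    return [a for row in M for a in row]

def matrix_of(linear_map, n):
    # Column j holds the coordinates of linear_map applied to the j-th basis matrix.
    columns = [flatten(linear_map(E)) for E in basis(n)]
    k = n * n
    return [[columns[j][i] for j in range(k)] for i in range(k)]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.3, -0.5], [0.2, 0.1]]  # arbitrary test matrix (assumption)
eX, e_minus_X = expm(X), expm(scale(-1.0, X))

ad_matrix = matrix_of(lambda Y: commutator(X, Y), 2)                 # 4x4
Ad_matrix = matrix_of(lambda Y: matmul(matmul(eX, Y), e_minus_X), 2)  # 4x4

gap = max_abs_diff(expm(ad_matrix), Ad_matrix)
print(gap)  # should be at floating-point rounding level
```

Here the outer $\exp$ is just the matrix exponential on $4\times 4$ matrices, so no exponential of a general function on $M_n(\R)$ is needed.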

Level Three

In the third level of sophistication we now make contact with general Lie theory, rather than just matrix Lie theory. In the matrix case we could define $\Ad_X$ and $\ad_X$ as above in terms of matrix conjugation and the matrix commutator. But in the general theory we don't, a priori, have an action of the Lie group on the Lie algebra, which is required for the definition of $\Ad$, nor do we have a concrete definition of $\ad$ as the matrix commutator.

Here we directly follow the Wikipedia page for the adjoint representation. We start only with a general Lie group $G$. For each $g\in G$ we have the inner automorphism $\Psi_g$ defined by group conjugation as

\begin{align*} \Psi_g: G \to& G\\ h \mapsto& ghg^{-1} \end{align*} so that \begin{align*} \Psi: G \to& \Aut(G)\\ g \mapsto& \Psi_g \end{align*} We then define $$ \Ad_g = (d\Psi_g)_e: T_eG \to T_eG $$ which means $\Ad_g: \mf{g} \to \mf{g}$ where we've identified $T_eG$ with $\mf{g} = \Lie(G)$ with the Lie bracket given by $[X, Y]$ for $X, Y \in \mf{g}$ (thinking of $X$ and $Y$ as derivations on functions on the manifold). $\Psi_g$ is a Lie group automorphism of $G$. This means $\Ad_g = (d\Psi_g)_e$ is a Lie algebra automorphism (it preserves the Lie algebra structure) so that $\Ad_g \in \Aut(\mf{g})$ and \begin{align*} \Ad: G \to& \Aut(\mf{g})\\ g \mapsto& \Ad_g \end{align*} To define $\ad$ we again take a derivative so that $$ \ad = (d\Ad)_e: \mf{g} \to \Der(\mf{g}). $$ It can be shown (I should learn this) that $\ad_X(Y) = [X, Y]$ so that $\ad_X\in \Der(\mf{g})$.
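In the matrix case, the statement $\ad = (d\Ad)_e$ with $\ad_X(Y) = [X, Y]$ amounts to $\frac{d}{dt}\big|_{t=0}\, e^{tX} Y e^{-tX} = XY - YX$. The sketch below is not a proof and says nothing about general Lie groups; it only approximates that derivative with a central finite difference for one arbitrary pair of matrices. The step size `t` and the helper names are assumptions.

```python
# Finite-difference illustration: d/dt Ad_{e^{tX}}(Y) at t = 0 is close to [X, Y].

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(X, terms=40):
    # Truncated power series sum_{m < terms} X^m / m!
    result, term = identity(len(X)), identity(len(X))
    for m in range(1, terms):
        term = scale(1.0 / m, matmul(term, X))
        result = add(result, term)
    return result

def commutator(X, Y):
    return add(matmul(X, Y), scale(-1.0, matmul(Y, X)))

def conjugate(t, X, Y):
    # Ad_{e^{tX}}(Y) = e^{tX} Y e^{-tX}
    return matmul(matmul(expm(scale(t, X)), Y), expm(scale(-t, X)))

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.3, -0.5], [0.2, 0.1]]  # arbitrary test matrices (assumption)
Y = [[1.0, 2.0], [3.0, 4.0]]

t = 1e-4  # finite-difference step, chosen by hand
difference = add(conjugate(t, X, Y), scale(-1.0, conjugate(-t, X, Y)))
derivative = scale(1.0 / (2.0 * t), difference)

error = max_abs_diff(derivative, commutator(X, Y))
print(error)  # small, limited by the O(t^2) truncation error
```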

We then have all of the identifications as in the previous section. However this time $\mf{g}$, $H=\Aut(\mf{g})$ and $\mf{h} = \Der(\mf{g})$ and $\Ad$ and $\ad$ were all derived from only the Lie group structure on $G$.

I think you are still overcomplicating this a little. It is true that the image of $\operatorname{ad}$ is contained in $\operatorname{Der}(\mathfrak{g})$ and the image of $\operatorname{Ad}$ is contained in $\operatorname{Aut}(\mathfrak{g})$ but neither of these is the image in general so Hall is perfectly reasonable in using $\operatorname{GL}(\mathfrak{g})$ and $\mathfrak{gl}(\mathfrak{g})$ (aka $\operatorname{End}(\mathfrak{g})$) as the codomains. Making the codomains the set of all functions from $\mathfrak{g}$ to $\mathfrak{g}$ though actively takes out important structure. — Callum, Jan 14 '25 at 11:09
Ok, I'm getting the $\text{Aut}$ and $\text{Der}$ stuff from Wikipedia. I need to study more to understand the distinctions you're making on those fronts, or you can help clarify more for me. About functions from $\mathfrak{g}\to \mathfrak{g}$, I'm trying to simplify as much as possible so that someone familiar with matrices but NOT familiar with group theory could understand. The point is, a physicist who doesn't know the word "Lie group" or even "group" could understand $e^X Y e^{-X} = \sum_m \frac{1}{m!} [X, [\ldots, [X, Y]]]$ and its combinatorial proof in Hall. For this, we need — Jagerber48, Jan 14 '25 at 11:47
@Callum only specify functions $\mathfrak{g} \to \mathfrak{g}$. I agree with you that important structure is lost. That's why I also try to present levels 2 and 3 as higher levels of understanding. Level 1 is for the person who doesn't know what a group is, but does know what matrices, the exponential, and the matrix commutator are. — Jagerber48, Jan 14 '25 at 11:48
But functions from $\mathfrak{g}$ to itself are not matrices in general and the images of $\operatorname{Ad}$ and $\operatorname{ad}$ explicitly are (which is true even when the original group is not a matrix group). We certainly need them to be matrices when we start considering $\operatorname{Ad}_X(Y)$ to be a product $XYX^{-1}$. Perhaps "important" was the wrong word, "absolutely necessary" is better. — Callum, Jan 14 '25 at 13:16
@Callum ok, but in this answer in level 1 I don't actually use the set of all functions from $\mathfrak{g} \to \mathfrak{g}$. I say that $\text{Ad}$ maps a matrix from $\text{GL}(n; \mathbb{R})$ to a function in $\mathcal{F}(M_n(\mathbb{R}), M_n(\mathbb{R}))$, that is from matrices to matrices. So I do capture that fact. I guess this is the functions from $\mathfrak{g}$ to $\mathfrak{g}$, but the notation I'm using emphasizes rather that it takes matrices to matrices which seems sufficient to define the function for someone who doesn't know what groups are. — Jagerber48, Jan 14 '25 at 13:29
My point is that the image of $\operatorname{Ad}$ is itself a linear function on $\mathfrak{g} = M_n(\mathbb{R})$. The set of all functions between a set and itself is a much more complicated beast and we should avoid invoking it here. I certainly don't want to be doing any calculus on it if I can help it. — Callum, Jan 14 '25 at 13:42
