My statistics text states this theorem as if it works for any function $g$:
Let $\tau = g(\theta)$ be a function of $\theta$. Let $\hat{\theta}_n$ be the MLE (Maximum Likelihood Estimator) of $\theta$. Then $\hat{\tau}_n = g(\hat{\theta}_n)$ is the MLE of $g(\theta)$.
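To convince myself the statement holds in at least one concrete case, I ran a quick numerical check (my own sketch, not from the text): for an Exponential($\lambda$) sample the MLE is $\hat{\lambda}_n = 1/\bar{x}$, and with the invertible map $g(\lambda) = 1/\lambda$ (the mean), maximizing the likelihood directly in the $\tau$ parametrization does land on $g(\hat{\lambda}_n) = \bar{x}$:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=500)  # true lambda = 0.5, so true mean = 2.0

# MLE of lambda for an Exponential(lambda) sample: 1 / sample mean
lam_hat = 1.0 / x.mean()

# Reparametrize by tau = g(lambda) = 1/lambda and maximize the likelihood
# directly in tau, i.e. use f(x; h(tau)) with h(tau) = 1/tau.
def neg_log_lik_tau(tau):
    lam = 1.0 / tau  # h = g^{-1}
    return -(len(x) * np.log(lam) - lam * x.sum())

res = minimize_scalar(neg_log_lik_tau, bounds=(1e-6, 50.0), method="bounded")

print(1.0 / lam_hat)  # g(lam_hat) = sample mean
print(res.x)          # numerically the same value
```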
And it offers this proof, which seems to assume $g$ has an inverse:
Proof. Let $h = g^{-1}$ denote the inverse of $g$. Then $\hat{\theta}_n = h(\hat{\tau}_n)$. For any $\tau$, $\mathcal{L}_n(\tau) = \prod_i f(x_i; h(\tau)) = \prod_i f(x_i; \theta) = \mathcal{L}_n(\theta)$, where $\theta = h(\tau)$. Hence, for any $\tau$, $\mathcal{L}_n(\tau) = \mathcal{L}_n(\theta) \leq \mathcal{L}_n(\hat{\theta}_n) = \mathcal{L}_n(\hat{\tau}_n)$.
Is an inverse actually required, or is the author just assuming one to keep the proof simple? I'm also not sure where the final inequality comes from.
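To make the first question concrete, I also tried a non-invertible map: $g(\theta) = \theta^2$ applied to the mean of a Normal($\theta$, 1) sample (all of this is my own choice of example, not from the text). Since two values $\pm\sqrt{\tau}$ map to the same $\tau$, the only sensible likelihood for $\tau$ I could think of was to score each $\tau$ by the best likelihood over its preimage; even with that guess, the maximizer comes out at $g(\hat{\theta}_n)$, which is part of why I suspect the inverse isn't essential:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=1.5, scale=1.0, size=500)  # true theta = 1.5

theta_hat = x.mean()  # MLE of a Normal(theta, 1) mean

def log_lik(theta):
    return -0.5 * np.sum((x - theta) ** 2)  # up to an additive constant

# g(theta) = theta^2 has no inverse: tau's preimage is {+sqrt(tau), -sqrt(tau)}.
# Score each tau by the best log-likelihood over its preimage, maximize on a grid.
taus = np.linspace(1e-6, 9.0, 20_000)
induced = np.array([max(log_lik(np.sqrt(t)), log_lik(-np.sqrt(t))) for t in taus])
tau_hat = taus[np.argmax(induced)]

print(theta_hat ** 2)  # g(theta_hat)
print(tau_hat)         # agrees up to grid resolution
```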
I tried reading the Wikipedia article on equivariant maps (my statistics text was my first exposure to the term), but it relies on too much material I haven't learned yet.