Let's say I have a function $f:\mathbb{R}^2 \rightarrow \mathbb{R}$ of two variables $\mathbf{x}=[x, y]$. The hessian matrix $H$ associated with $f$, $$ H(\mathbf{x}):= \begin{pmatrix} \partial^2_{x} f(\mathbf{x})&\partial_{x}\partial_{y} f(\mathbf{x})\\ \partial_{x}\partial_{y} f(\mathbf{x})&\partial^2_{y} f(\mathbf{x}) \end{pmatrix}, $$ is in general not diagonal.
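To illustrate, here is a quick SymPy sketch (the test function is an arbitrary choice of mine) showing that a generic $f$ has a non-diagonal hessian:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 * y + sp.sin(x * y)     # arbitrary test function

H = sp.hessian(f, [x, y])        # 2x2 hessian matrix of f
print(H)                         # the off-diagonal entries are nonzero
```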
I am looking for a change of variables $\mathbf{u}=[u(\mathbf{x}), v(\mathbf{x})]$ such that the hessian matrix $\hat{H}$ associated with $\hat{f}$ is diagonal, where $\hat{f}$ is defined by $\hat{f}(\mathbf{u}(\mathbf{x}))=f(\mathbf{x})$. The hessian $\hat{H}$ should therefore look like this:
$$ \hat{H}(\mathbf{u}):= \begin{pmatrix} \partial^2_{u} \hat{f}(\mathbf{u})&\partial_{u}\partial_{v} \hat{f}(\mathbf{u})\\ \partial_{u}\partial_{v} \hat{f}(\mathbf{u})&\partial^2_{v} \hat{f}(\mathbf{u}) \end{pmatrix}= \begin{pmatrix} \partial^2_{u} \hat{f}(\mathbf{u})&0\\ 0&\partial^2_{v} \hat{f}(\mathbf{u}) \end{pmatrix}. $$
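To make the goal concrete, here is a toy example (my own, just for illustration): for $f(x,y)=xy$ the off-diagonal entries of $H$ equal $1$, but the linear substitution $u=x+y$, $v=x-y$ gives $$ x=\frac{u+v}{2},\quad y=\frac{u-v}{2},\qquad \hat{f}(u,v)=\frac{u^2-v^2}{4},\qquad \hat{H}=\begin{pmatrix} \tfrac{1}{2}&0\\ 0&-\tfrac{1}{2} \end{pmatrix}, $$ which is exactly the diagonal form I am after.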
Any ideas/tips are welcome.
My idea / what I have tried
I tried to apply the chain rule in an attempt to express $\hat{H}$ as a function of the original hessian $H$ and the new variables $\mathbf{u}$. This would help determine conditions on the change of variables that make the off-diagonal entries of $\hat{H}$ vanish.
To that end, I first expressed the gradient of $f$: \begin{align*} \partial_{x}f&=\partial_{u} \hat{f}\,\partial_{x}u+\partial_{v} \hat{f}\,\partial_{x}v\\ \partial_{y}f&=\partial_{u} \hat{f}\,\partial_{y}u+\partial_{v} \hat{f}\,\partial_{y}v. \end{align*} Then I differentiated the gradient again (w.r.t. $\mathbf{x}$) to get the original hessian $H$. Using the chain rule and the product rule, I get \begin{align*} \partial^2_x f=& \partial_x u\,(\partial^2_u\hat{f}\,\partial_x u + \partial_u\partial_v\hat{f}\,\partial_x v) + \partial^2_x u\, \partial_u\hat{f}\\ +& \partial_x v\,(\partial^2_v\hat{f}\,\partial_x v + \partial_u\partial_v\hat{f}\,\partial_x u) + \partial^2_x v\, \partial_v\hat{f} \end{align*}
\begin{align*} \partial^2_y f=& \partial_y u\,(\partial^2_u\hat{f}\,\partial_y u + \partial_u\partial_v\hat{f}\,\partial_y v) + \partial^2_y u\, \partial_u\hat{f}\\ +& \partial_y v\,(\partial^2_v\hat{f}\,\partial_y v + \partial_u\partial_v\hat{f}\,\partial_y u) + \partial^2_y v\, \partial_v\hat{f} \end{align*}
\begin{align*} \partial_x\partial_y f=& \partial_y u\,(\partial^2_u\hat{f}\,\partial_x u + \partial_u\partial_v\hat{f}\,\partial_x v) + \partial_x\partial_y u\, \partial_u\hat{f}\\ +& \partial_y v\,(\partial^2_v\hat{f}\,\partial_x v + \partial_u\partial_v\hat{f}\,\partial_x u) + \partial_x\partial_y v\, \partial_v\hat{f} \end{align*}
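As a sanity check, these formulas can be verified symbolically. A minimal SymPy sketch (the concrete $u$, $v$ and $\hat{f}$ below are arbitrary test choices of mine; it checks the $\partial^2_x f$ formula):

```python
import sympy as sp

x, y, u, v = sp.symbols('x y u v')

U = x + y**2                          # arbitrary test u(x, y)
V = x * y                             # arbitrary test v(x, y)
Fhat = u**3 + sp.exp(u) * v + v**2    # arbitrary test fhat(u, v)

f = Fhat.subs({u: U, v: V})           # f(x, y) = fhat(u(x,y), v(x,y))

# partial derivatives of fhat, evaluated at (u(x,y), v(x,y))
def d(*syms):
    return sp.diff(Fhat, *syms).subs({u: U, v: V})

fu, fv, fuu, fvv, fuv = d(u), d(v), d(u, u), d(v, v), d(u, v)
Ux, Vx = sp.diff(U, x), sp.diff(V, x)

rhs = Ux*(fuu*Ux + fuv*Vx) + sp.diff(U, x, 2)*fu \
    + Vx*(fvv*Vx + fuv*Ux) + sp.diff(V, x, 2)*fv

print(sp.simplify(sp.diff(f, x, 2) - rhs))   # prints 0
```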
This can be written more compactly by defining the jacobian $J$ of the transformation $\mathbf{u}$ as $$ J:=\frac{\partial \mathbf{u}}{\partial \mathbf{x}}:= \begin{pmatrix} \partial_x u&\partial_y u\\ \partial_xv&\partial_y v \end{pmatrix} $$ and the third-order array $L$ of second derivatives of $\mathbf{u}$ (the hessians of the components of $\mathbf{u}$, stacked) as $$ L_{ijk}:=\left(\frac{\partial^2 \mathbf{u}}{\partial \mathbf{x}^2}\right)_{ijk}:=\partial_{x_i}\partial_{x_j} u_k. $$
With these definitions, the original hessian $H$ is given by \begin{align*} H_{ij}&=\left( J^\text{T}\hat{H}J\right)_{ij} + \sum_k L_{ijk}\,\partial_{u_k} \hat{f}\\ &=\sum_k \sum_l J_{ki} \hat{H}_{kl} J_{lj} + \sum_k L_{ijk}\,\partial_{u_k} \hat{f}. \end{align*}
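This compact identity can also be checked symbolically; here is a sketch reusing the same arbitrary test transformation and $\hat{f}$ as above (again my own choices, not part of the problem):

```python
import sympy as sp

x, y, u, v = sp.symbols('x y u v')
X, W = [x, y], [u, v]

U = [x + y**2, x * y]                 # arbitrary test transformation (u, v)
Fhat = u**3 + sp.exp(u) * v + v**2    # arbitrary test fhat
sub = {u: U[0], v: U[1]}
f = Fhat.subs(sub)                    # f(x, y) = fhat(u(x,y), v(x,y))

H = sp.hessian(f, X)                                # original hessian
Hhat = sp.hessian(Fhat, W).subs(sub)                # hessian of fhat, at u(x)
grad_hat = [sp.diff(Fhat, w).subs(sub) for w in W]  # d fhat / d u_k, at u(x)

# J_{ki} = d u_k / d x_i, matching the definition above
J = sp.Matrix(2, 2, lambda k, i: sp.diff(U[k], X[i]))

# second-derivative term: sum_k L_{ijk} * (d fhat / d u_k)
Lterm = sp.Matrix(2, 2, lambda i, j: sum(
    sp.diff(U[k], X[i], X[j]) * grad_hat[k] for k in range(2)))

print(sp.simplify(H - (J.T * Hhat * J + Lterm)))   # zero matrix
```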
I am now stuck because I would like to isolate $\hat{H}$ and I do not see how to proceed. If I assume that $L_{ijk}=0$, i.e. that the change of variables is linear in $x$ and $y$, then $$ H=J^\text{T} \hat{H} J. $$ Since $H$ is symmetric, it can be diagonalized as $H=Q\Lambda Q^\text{T}$ with $Q$ orthogonal, so taking $J=Q^\text{T}=Q^{-1}$ (the inverse of the eigenvector matrix of $H$) gives $\hat{H}=\Lambda$, which is diagonal. But $H$ generally depends on $\mathbf{x}$, hence so do its eigenvectors, so a change of variables derived this way will not be linear in general, contradicting the assumption $L_{ijk}=0$. Now I am running out of ideas and asking for help :).
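For the constant-hessian case at least, the eigenvector recipe does check out numerically. A minimal NumPy sketch (my own construction, assuming a quadratic $f(\mathbf{x})=\tfrac{1}{2}\mathbf{x}^\text{T}A\mathbf{x}$ so that $H=A$ is constant):

```python
import numpy as np

# Quadratic test function f(x) = 0.5 * x^T A x, so H = A is constant
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

eigvals, Q = np.linalg.eigh(A)   # A = Q @ diag(eigvals) @ Q.T, Q orthogonal

# Linear change of variables u = J x with J = Q^T (= Q^{-1}, Q is orthogonal)
J = Q.T
Jinv = np.linalg.inv(J)
Hhat = Jinv.T @ A @ Jinv         # solve H = J^T Hhat J for Hhat
print(np.round(Hhat, 12))        # diagonal, entries = eigvals
```

So the linear/constant case works as expected; the difficulty is precisely the $\mathbf{x}$-dependence of the eigenvectors.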