
Let $\Phi:\mathbb{R}^n\times \mathbb{R}^k\rightarrow \mathbb{R}^n$ and suppose $\Phi(x_0,y_0)=0$ with $\det\left(\frac{\partial \Phi}{\partial x}(x_0,y_0)\right)\neq 0$. By the implicit function theorem, locally at $(x_0,y_0)$ the equation $\Phi(x,y)=0$ lets us express the $x_i$ as functions of $y$.

Next, we can compute partial derivatives of $x$ as \begin{equation}\tag{*}\frac{\partial x_i}{\partial y_j}=-\frac{\det\left(\left[\frac{\partial \Phi}{\partial x_1},\dots,\frac{\partial \Phi}{\partial x_{i-1}}, \frac{\partial \Phi}{\partial y_j}, \frac{\partial \Phi}{\partial x_{i+1}},\dots, \frac{\partial \Phi}{\partial x_n}\right]\right)}{\det\left(\frac{\partial \Phi}{\partial x}\right)}.\end{equation} This is known. What I wonder is:
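To see ($*$) in action, here is a minimal symbolic check (the toy $\Phi$ and all names are my own, chosen only for illustration): formula ($*$) is just Cramer's rule applied to $\frac{\partial \Phi}{\partial x}\,\frac{\partial x}{\partial y}=-\frac{\partial \Phi}{\partial y}$, so the column-replacement determinants must agree with $-\left[\frac{\partial \Phi}{\partial x}\right]^{-1}\frac{\partial \Phi}{\partial y}$.

```python
import sympy as sp

x1, x2, y = sp.symbols('x1 x2 y')
# toy map Phi: R^2 x R -> R^2 (my own example, not from the question)
Phi = sp.Matrix([x1**2 + x2 - y, x1 + x2**3 - 2*y])
X = [x1, x2]
J = Phi.jacobian(X)        # dPhi/dx, 2x2
Py = Phi.diff(y)           # dPhi/dy, 2x1 column

def dxi_dy(i):
    """Formula (*): replace column i of dPhi/dx by dPhi/dy, take -det ratio."""
    M = J.copy()
    M[:, i] = Py
    return -M.det() / J.det()

# compare with the matrix formula dx/dy = -(dPhi/dx)^{-1} dPhi/dy
v = -(J.inv() * Py)
for i in range(2):
    assert sp.simplify(dxi_dy(i) - v[i]) == 0
print("formula (*) matches -(Phi_x)^{-1} Phi_y on this example")
```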

Q: is it possible to compute second order partial derivatives in a systematic way?

I tried to differentiate the determinants using the Jacobi formula, but this leads to very complicated expressions that I cannot handle. I also expanded the determinants in ($*$) along the $i$-th column (the column in which the respective matrices differ) and tried some other approaches, but they do not seem to bring me any further.

On the other hand, if I go the straightforward way and differentiate $\Phi(x,y)$ twice, I get expressions involving tensors, or rather multi-index notation, because neither the second-order partial derivatives nor the derivatives of type $\frac{\partial x^i}{\partial y^j}$ are actually tensors.

My hope is that maybe it is still possible to extract some nice tractable expression similar to how we got ($*$) from $\frac{\partial x}{\partial y}=-\left[\frac{\partial \Phi}{\partial x}\right]^{-1}\frac{\partial \Phi}{\partial y}$?

Here is a related question.

UPDATE: It seems that the problem turned out to be more difficult than I expected (although many people told me that it must have been solved by somebody). Since the hope of getting a conclusive answer fades and the bounty will expire in a couple of days, I'd gladly grant it to anybody who could point out a way to approach (if not solve) this problem.

UPDATE 2: Let me expand a bit on the above. To illustrate my problem let's differentiate $\left[\frac{\partial \Phi}{\partial x}\right]^{-1}$ w.r.t. $y_i$: \begin{multline*}\frac{\partial}{\partial y_i}\left[\frac{\partial \Phi}{\partial x}\right]^{-1}=-\left[\frac{\partial \Phi}{\partial x}\right]^{-1}\frac{\partial}{\partial y_i}\left[\frac{\partial \Phi}{\partial x}\right]\left[\frac{\partial \Phi}{\partial x}\right]^{-1}\\ =-\left[\frac{\partial \Phi}{\partial x}\right]^{-1}\left[\frac{\partial^2 \Phi}{\partial x\partial x}\right]\frac{\partial x}{\partial y_i}\left[\frac{\partial \Phi}{\partial x}\right]^{-1}-\left[\frac{\partial \Phi}{\partial x}\right]^{-1}\left[\frac{\partial^2 \Phi}{\partial y_i\partial x}\right]\left[\frac{\partial \Phi}{\partial x}\right]^{-1}.\end{multline*} So, what is $\left[\frac{\partial^2 \Phi}{\partial x\partial x}\right]\frac{\partial x}{\partial y_i}$? A 3D matrix multiplied with a vector? How to treat these expressions? To make things even more complicated, we should now substitute $\frac{\partial x}{\partial y_i}$ with the respective expression for the first-order partial derivatives. It becomes completely obscure and I cannot recognize any structure in it.
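For what it's worth, the "3D matrix times a vector" is just a contraction over one index: with $H[l,a,b]=\frac{\partial^2 \Phi_l}{\partial x_a\partial x_b}$ and $v[b]=\frac{\partial x_b}{\partial y_i}$, contracting over $b$ yields an ordinary $n\times n$ matrix. A small numerical sketch with random toy data (all names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
H = rng.standard_normal((n, n, n))  # H[l, a, b] = d^2 Phi_l / dx_a dx_b (toy data)
v = rng.standard_normal(n)          # v[b] = dx_b / dy_i (toy data)

# contract the last index: the result is an ordinary n x n matrix
M = np.einsum('lab,b->la', H, v)
assert M.shape == (n, n)

# equivalent explicit-loop form, for comparison
M2 = np.array([[sum(H[l, a, b] * v[b] for b in range(n))
                for a in range(n)] for l in range(n)])
assert np.allclose(M, M2)
```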

Dmitry
    Why not differentiate the matrix equation you put at the end? You need the product rule, chain rule, and the formula for the derivative of the matrix function $f(A)=A^{-1}$. – Ted Shifrin Apr 21 '19 at 00:43
  • Dear @Ted, please see the update at the end of the question. In short: yes, I can write the solution in this way, but I do not see a little bit of structure in it. It becomes either a very complex matrix expression involving 3D matrices (which I do not know how to deal with) or a bacchanalia of indices. – Dmitry Apr 21 '19 at 15:45
  • Your last term in the edit doesn't belong there. The first term involves the Hessian matrix (the matrix of second partials); there's nothing 3D about it. – Ted Shifrin Apr 21 '19 at 15:57
  • It would be a Hessian if $\Phi$ was a scalar function, but $\Phi$ is a vector-valued function. And sorry, what's wrong about the second term? $\frac{\partial \Phi}{\partial x}$ does depend on $y$ as well as on $x$. – Dmitry Apr 21 '19 at 16:02
  • Yes, of course you're right on both counts. I was too hasty. You can think about it one component of $\Phi$ at a time, though, so you can think of an $\Bbb R^n$-valued Hessian. ... My main point in commenting was that you don't want to use those Cramer's rule formulas for the inverse computation. – Ted Shifrin Apr 21 '19 at 16:06

1 Answer


I'll post a partial answer.

Pretend your functions are given by Taylor series to the needed (second) order.

So, we write

$$x_p=h_p(y)=\sum_k \frac{\partial h_p}{\partial y_k} y_k + \sum_{i,j} \frac{1}{2}\frac{\partial^2 h_p}{\partial y_i \partial y_j} y_i y_j$$

$$\Phi_l(x,y)= \sum_k \frac{\partial \Phi_l}{\partial x_k} x_k + \sum_p \frac{\partial \Phi_l}{\partial y_p} y_p + \sum_{i,j} \frac{1}{2}\frac{\partial^2 \Phi_l}{\partial x_i \partial x_j} x_i x_j+ \sum_{q,r} \frac{1}{2}\frac{\partial^2 \Phi_l}{\partial y_q \partial y_r} y_q y_r+\sum_{p,k} \frac{1}{2} \frac{\partial^2 \Phi_l}{\partial y_p \partial x_k} y_p x_k+ \sum_{p,k} \frac{1}{2} \frac{\partial^2 \Phi_l}{\partial x_k \partial y_p} x_k y_p $$

Now plug in and equate the coefficients of $y_i y_j$ to zero. There are 5 kinds of terms in $\Phi_l$ (the two mixed sums being symmetric, they are split into items 5 and 6 below). They contribute (in the case of $i\neq j$, so summing the "$y_iy_j$" and the "$y_jy_i$" contributions; there are extra $1/2$ factors throughout if $i=j$):

1) Nothing (the term $\sum_p \frac{\partial \Phi_l}{\partial y_p} y_p$ is linear in $y$ and contributes only at first order).

2) $\sum_p \frac{\partial \Phi_l}{\partial x_p} \frac{\partial^2 h_p}{\partial y_i \partial y_j} $

3) $\frac{\partial^2 \Phi_l}{\partial y_i \partial y_j}$

4)$ \sum_{q,r} \frac{\partial^2 \Phi_l}{\partial x_q \partial x_r} \frac{\partial h_q}{\partial y_i} \frac{\partial h_r}{\partial y_j}$

5) $\sum_{p} \frac{\partial^2 \Phi_l}{\partial x_p \partial y_j} \frac{\partial h_p}{\partial y_i} $

6) $\sum_{p} \frac{\partial^2 \Phi_l}{ \partial y_i \partial x_p} \frac{\partial h_p}{\partial y_j} $

Varying $l$, one gets $n$ linear equations (labeled by $l$) in the $n$ unknowns $\frac{\partial^2 h_p}{\partial y_i \partial y_j} $ (labeled by $p$), which can therefore be written as $A z=b$, with the matrix $A$ of the linear system given by $\Phi_{x}$. Hence these equations can be solved. The only trouble is in writing the vector $b$, which is minus the sum of terms 3)–6), in a "vector" format.
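The recipe above can be sketched symbolically (a minimal sketch assuming sympy; the toy $\Phi$, whose explicit solution is $x_1=y^2,\ x_2=y^3$, and all names are my own). For a single $y$ variable, terms 5) and 6) combine into $2\sum_p \frac{\partial^2\Phi_l}{\partial x_p\partial y}\frac{\partial h_p}{\partial y}$, so $b$ is minus the sum of terms 3)–6) and $h''=A^{-1}b$ with $A=\Phi_x$:

```python
import sympy as sp

x1, x2, y = sp.symbols('x1 x2 y')
X = sp.Matrix([x1, x2])
# toy Phi with known explicit solution x1 = y^2, x2 = y^3
Phi = sp.Matrix([x1 - y**2, x2 - x1*y])

A = Phi.jacobian(X)              # Phi_x, the matrix of the linear system
h1 = -(A.inv() * Phi.diff(y))    # first derivatives dx/dy

# b_l = -( Phi_yy + sum Phi_{x_q x_r} h1_q h1_r + 2 sum Phi_{x_p y} h1_p )
n = 2
b = sp.zeros(n, 1)
for l in range(n):
    term = Phi[l].diff(y, 2)                               # term 3)
    for q in range(n):
        for r in range(n):
            term += Phi[l].diff(X[q]).diff(X[r]) * h1[q] * h1[r]   # term 4)
        term += 2 * Phi[l].diff(X[q]).diff(y) * h1[q]      # terms 5) + 6)
    b[l] = -term
h2 = sp.simplify(A.inv() * b)    # second derivatives d^2 x / dy^2

# check against the explicit solution: x1'' = 2, x2'' = 6y
sol = {x1: y**2, x2: y**3}
assert sp.simplify(h2[0].subs(sol) - 2) == 0
assert sp.simplify(h2[1].subs(sol) - 6*y) == 0
```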

Maybe a better way to do bookkeeping for this is to use tree-speak like here...

Max
  • Dear @Max, shouldn't your first expression be $x_p=h_p(y)=\sum \frac{\partial h_p}{\partial y_k} y_k + \sum_{i,j} \frac{1}{2}\frac{\partial^2 h_p}{\partial y_i \partial y_j} y_i y_j$? And why do you equate the coeff's at $x_ix_j$ to zero? – Dmitry Apr 22 '19 at 09:04
  • I'm just too used to the "opposite" variable conventions, trying to solve for $y=h(x)$ rather than $x=h(y)$. I have changed to the $x=h(y)$ convention in the answer, but may have introduced new mistakes along the way, sorry. As for why we set the coefficients to zero: well, the whole (polynomial in this case) function $\Phi(h(y), y)$ has to be zero, so any monomial coefficient is zero. – Max Apr 22 '19 at 10:03
  • On the second issue: sure. I've just misinterpreted you. I have written the equations (where $\frac{\partial h_p}{\partial y_i}$ are to be obtained by setting to 0 the terms at $y_i$). -- It is interesting to note that if we fix $p$ then all the respective second-order partial derivatives can be obtained without solving the algebraic equations. -- Ideally, I'd like to be able to write the resulting expressions using vector-matrix notation, but apparently there is no way to do that. -- And thank you for pointing to me the Faa di Bruno formula and related results. I didn't know about it. – Dmitry Apr 22 '19 at 12:46