
Let $\Phi:\mathbb{R}^n\times \mathbb{R}^k\rightarrow \mathbb{R}^n$ and suppose $\Phi(x_0,y_0)=0$ with $\det\left(\frac{\partial \Phi}{\partial x}(x_0,y_0)\right)\neq 0$. By the implicit function theorem, locally at $(x_0,y_0)$ the equation $\Phi(x,y)=0$ lets us express the $x_i$ as functions of $y$.

Next, we can compute partial derivatives of $x$ as \begin{equation}\tag{*}\frac{\partial x_i}{\partial y_j}=-\frac{\det\left(\left[\frac{\partial \Phi}{\partial x_1},\dots,\frac{\partial \Phi}{\partial x_{i-1}}, \frac{\partial \Phi}{\partial y_j}, \frac{\partial \Phi}{\partial x_{i+1}},\dots, \frac{\partial \Phi}{\partial x_n}\right]\right)}{\det\left(\frac{\partial \Phi}{\partial x}\right)}.\end{equation} This is known. What I wonder is:
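To see ($*$) in action, here is a minimal symbolic check (the toy $\Phi$ and all names are my own, chosen only for illustration): formula ($*$) is just Cramer's rule applied to $\frac{\partial \Phi}{\partial x}\,\frac{\partial x}{\partial y}=-\frac{\partial \Phi}{\partial y}$, so the column-replacement determinants must agree with $-\left[\frac{\partial \Phi}{\partial x}\right]^{-1}\frac{\partial \Phi}{\partial y}$.

```python
import sympy as sp

x1, x2, y = sp.symbols('x1 x2 y')
# toy map Phi: R^2 x R -> R^2 (my own example, not from the question)
Phi = sp.Matrix([x1**2 + x2 - y, x1 + x2**3 - 2*y])
X = [x1, x2]
J = Phi.jacobian(X)        # dPhi/dx, 2x2
Py = Phi.diff(y)           # dPhi/dy, 2x1 column

def dxi_dy(i):
    """Formula (*): replace column i of dPhi/dx by dPhi/dy, take -det ratio."""
    M = J.copy()
    M[:, i] = Py
    return -M.det() / J.det()

# compare with the matrix formula dx/dy = -(dPhi/dx)^{-1} dPhi/dy
v = -(J.inv() * Py)
for i in range(2):
    assert sp.simplify(dxi_dy(i) - v[i]) == 0
print("formula (*) matches -(Phi_x)^{-1} Phi_y on this example")
```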

Q: is it possible to compute second order partial derivatives in a systematic way?

I tried to differentiate the determinants using the Jacobi formula, but this leads to very complicated expressions that I cannot handle. I also expanded the determinants in ($*$) along the $i$-th column (the column in which the respective matrices differ) and tried some other approaches, but they do not seem to bring me any further.

On the other hand, if I go the straightforward way and differentiate $\Phi(x,y)$ twice, I get expressions involving tensors, or rather multi-index notation, because neither the second-order partial derivatives nor the derivatives of type $\frac{\partial x^i}{\partial y^j}$ are actually tensors.

My hope is that maybe it is still possible to extract some nice tractable expression similar to how we got ($*$) from $\frac{\partial x}{\partial y}=-\left[\frac{\partial \Phi}{\partial x}\right]^{-1}\frac{\partial \Phi}{\partial y}$?

Here is a related question.

UPDATE: It seems that the problem turned out to be more difficult than I expected (although many people told me that it must have been solved by somebody). Since the hope of getting a conclusive answer fades and the bounty will expire in a couple of days, I'd gladly grant it to anybody who could point out a way to approach (if not solve) this problem.

UPDATE 2: Let me expand a bit on the above. To illustrate my problem let's differentiate $\left[\frac{\partial \Phi}{\partial x}\right]^{-1}$ w.r.t. $y_i$: \begin{multline*}\frac{\partial}{\partial y_i}\left[\frac{\partial \Phi}{\partial x}\right]^{-1}=-\left[\frac{\partial \Phi}{\partial x}\right]^{-1}\frac{\partial}{\partial y_i}\left[\frac{\partial \Phi}{\partial x}\right]\left[\frac{\partial \Phi}{\partial x}\right]^{-1}\\ =-\left[\frac{\partial \Phi}{\partial x}\right]^{-1}\left[\frac{\partial^2 \Phi}{\partial x\partial x}\right]\frac{\partial x}{\partial y_i}\left[\frac{\partial \Phi}{\partial x}\right]^{-1}-\left[\frac{\partial \Phi}{\partial x}\right]^{-1}\left[\frac{\partial^2 \Phi}{\partial y_i\partial x}\right]\left[\frac{\partial \Phi}{\partial x}\right]^{-1}.\end{multline*} So, what is $\left[\frac{\partial^2 \Phi}{\partial x\partial x}\right]\frac{\partial x}{\partial y_i}$? A 3D matrix multiplied with a vector? How to treat these expressions? To make things even more complicated, we should now substitute $\frac{\partial x}{\partial y_i}$ with the respective expression for the first-order partial derivatives. It becomes completely obscure and I cannot recognize any structure in it.
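For what it's worth, the "3D matrix times a vector" is just a contraction over one index: with $H[l,a,b]=\frac{\partial^2 \Phi_l}{\partial x_a\partial x_b}$ and $v[b]=\frac{\partial x_b}{\partial y_i}$, contracting over $b$ yields an ordinary $n\times n$ matrix. A small numerical sketch with random toy data (all names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
H = rng.standard_normal((n, n, n))  # H[l, a, b] = d^2 Phi_l / dx_a dx_b (toy data)
v = rng.standard_normal(n)          # v[b] = dx_b / dy_i (toy data)

# contract the last index: the result is an ordinary n x n matrix
M = np.einsum('lab,b->la', H, v)
assert M.shape == (n, n)

# equivalent explicit-loop form, for comparison
M2 = np.array([[sum(H[l, a, b] * v[b] for b in range(n))
                for a in range(n)] for l in range(n)])
assert np.allclose(M, M2)
```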

Dmitry
    Why not differentiate the matrix equation you put at the end? You need the product rule, chain rule, and the formula for the derivative of the matrix function $f(A)=A^{-1}$. – Ted Shifrin Apr 21 '19 at 00:43
  • Dear @Ted, please see the update at the end of the question. In short: yes, I can write the solution in this way, but I do not see a little bit of structure in it. It becomes either a very complex matrix expression involving 3D matrices (which I do not know how to deal with) or a bacchanalia of indices. – Dmitry Apr 21 '19 at 15:45
  • Your last term in the edit doesn't belong there. The first term involves the Hessian matrix (the matrix of second partials); there's nothing 3D about it. – Ted Shifrin Apr 21 '19 at 15:57
  • It would be a Hessian if $\Phi$ was a scalar function, but $\Phi$ is a vector-valued function. And sorry, what's wrong about the second term? $\frac{\partial \Phi}{\partial x}$ does depend on $y$ as well as on $x$. – Dmitry Apr 21 '19 at 16:02
  • Yes, of course you're right on both counts. I was too hasty. You can think about it one component of $\Phi$ at a time, though, so you can think of an $\Bbb R^n$-valued Hessian. ... My main point in commenting was that you don't want to use those Cramer's rule formulas for the inverse computation. – Ted Shifrin Apr 21 '19 at 16:06

1 Answer


I'll post a partial answer.

Pretend your functions are given by Taylor series to the needed (second) order.

So, we write

$$x_p=h_p(y)=\sum_k \frac{\partial h_p}{\partial y_k} y_k + \sum_{i,j} \frac{1}{2}\frac{\partial^2 h_p}{\partial y_i \partial y_j} y_i y_j$$

$$\Phi_l(x,y)= \sum_k \frac{\partial \Phi_l}{\partial x_k} x_k + \sum_p \frac{\partial \Phi_l}{\partial y_p} y_p + \sum_{i,j} \frac{1}{2}\frac{\partial^2 \Phi_l}{\partial x_i \partial x_j} x_i x_j+ \sum_{q,r} \frac{1}{2}\frac{\partial^2 \Phi_l}{\partial y_q \partial y_r} y_q y_r+\sum_{p,k} \frac{1}{2} \frac{\partial^2 \Phi_l}{\partial y_p \partial x_k} y_p x_k+ \sum_{p,k} \frac{1}{2} \frac{\partial^2 \Phi_l}{\partial x_k \partial y_p} x_k y_p $$

Now plug in and equate the coefficients of $y_i y_j$ to zero. There are 5 kinds of terms in $\Phi_l$ (the two mixed sums being symmetric, they are split into items 5 and 6 below). They contribute (in the case of $i\neq j$, so summing the "$y_iy_j$" and the "$y_jy_i$" contributions; there are extra $1/2$ factors throughout if $i=j$):

1) Nothing (the term $\sum_p \frac{\partial \Phi_l}{\partial y_p} y_p$ is linear in $y$ and contributes only at first order).

2) $\sum_p \frac{\partial \Phi_l}{\partial x_p} \frac{\partial^2 h_p}{\partial y_i \partial y_j} $

3) $\frac{\partial^2 \Phi_l}{\partial y_i \partial y_j}$

4)$ \sum_{q,r} \frac{\partial^2 \Phi_l}{\partial x_q \partial x_r} \frac{\partial h_q}{\partial y_i} \frac{\partial h_r}{\partial y_j}$

5) $\sum_{p} \frac{\partial^2 \Phi_l}{\partial x_p \partial y_j} \frac{\partial h_p}{\partial y_i} $

6) $\sum_{p} \frac{\partial^2 \Phi_l}{ \partial y_i \partial x_p} \frac{\partial h_p}{\partial y_j} $

Varying $l$, one gets $n$ linear equations (labeled by $l$) in the $n$ unknowns $\frac{\partial^2 h_p}{\partial y_i \partial y_j} $ (labeled by $p$), which can therefore be written as $A z=b$, with the matrix $A$ of the linear system given by $\Phi_{x}$. Hence these equations can be solved. The only trouble is in writing the vector $b$, which is minus the sum of terms 3)–6), in a "vector" format.
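The recipe above can be sketched symbolically (a minimal sketch assuming sympy; the toy $\Phi$, whose explicit solution is $x_1=y^2,\ x_2=y^3$, and all names are my own). For a single $y$ variable, terms 5) and 6) combine into $2\sum_p \frac{\partial^2\Phi_l}{\partial x_p\partial y}\frac{\partial h_p}{\partial y}$, so $b$ is minus the sum of terms 3)–6) and $h''=A^{-1}b$ with $A=\Phi_x$:

```python
import sympy as sp

x1, x2, y = sp.symbols('x1 x2 y')
X = sp.Matrix([x1, x2])
# toy Phi with known explicit solution x1 = y^2, x2 = y^3
Phi = sp.Matrix([x1 - y**2, x2 - x1*y])

A = Phi.jacobian(X)              # Phi_x, the matrix of the linear system
h1 = -(A.inv() * Phi.diff(y))    # first derivatives dx/dy

# b_l = -( Phi_yy + sum Phi_{x_q x_r} h1_q h1_r + 2 sum Phi_{x_p y} h1_p )
n = 2
b = sp.zeros(n, 1)
for l in range(n):
    term = Phi[l].diff(y, 2)                               # term 3)
    for q in range(n):
        for r in range(n):
            term += Phi[l].diff(X[q]).diff(X[r]) * h1[q] * h1[r]   # term 4)
        term += 2 * Phi[l].diff(X[q]).diff(y) * h1[q]      # terms 5) + 6)
    b[l] = -term
h2 = sp.simplify(A.inv() * b)    # second derivatives d^2 x / dy^2

# check against the explicit solution: x1'' = 2, x2'' = 6y
sol = {x1: y**2, x2: y**3}
assert sp.simplify(h2[0].subs(sol) - 2) == 0
assert sp.simplify(h2[1].subs(sol) - 6*y) == 0
```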

Maybe a better way to do bookkeeping for this is to use tree-speak like here...

Max
  • Dear @Max, shouldn't your first expression be $x_p=h_p(y)=\sum \frac{\partial h_p}{\partial y_k} y_k + \sum_{i,j} \frac{1}{2}\frac{\partial^2 h_p}{\partial y_i \partial y_j} y_i y_j$? And why do you equate the coeff's at $x_ix_j$ to zero? – Dmitry Apr 22 '19 at 09:04
  • I'm just too used to the "opposite" variable conventions, trying to solve for $y=h(x)$ rather than $x=h(y)$. I have changed to the $x=h(y)$ convention in the answer, but may have introduced new mistakes along the way, sorry. As for why we set the coefficients to zero: well, the whole (polynomial in this case) function $\Phi(h(y), y)$ has to be zero, so any monomial coefficient is zero. – Max Apr 22 '19 at 10:03
  • On the second issue: sure. I've just misinterpreted you. I have written the equations (where $\frac{\partial h_p}{\partial y_i}$ are to be obtained by setting to 0 the terms at $y_i$). -- It is interesting to note that if we fix $p$ then all the respective second-order partial derivatives can be obtained without solving the algebraic equations. -- Ideally, I'd like to be able to write the resulting expressions using vector-matrix notation, but apparently there is no way to do that. -- And thank you for pointing to me the Faa di Bruno formula and related results. I didn't know about it. – Dmitry Apr 22 '19 at 12:46