I'm trying to understand the proof of the Chain Rule for functions of one independent variable and two intermediate variables.
Here's my reasoning, step-by-step:
- The book reasons that the proof "consists of showing that if $x$ and $y$ are differentiable at $t=t_0$, then $w$ is differentiable at $t_0$ and $\frac{d w}{d t}\left(t_{0}\right)=\frac{\partial w}{\partial x}\left(P_{0}\right) \frac{d x}{d t}\left(t_{0}\right)+\frac{\partial w}{\partial y}\left(P_{0}\right) \frac{d y}{d t}\left(t_{0}\right)$". That makes exactly zero sense to me. Why do we have to start like this? Is there no geometric proof, akin to 3Blue1Brown's beautiful, intuitive picture of sliders for the independent, intermediate, and dependent variables?
- OK, say I buy into that -- that $w$ is differentiable if its inner functions are differentiable. Now the book moves on to $\Delta w=\frac{\partial w}{\partial x}\left(P_{0}\right) \Delta x+\frac{\partial w}{\partial y}\left(P_{0}\right) \Delta y+\varepsilon_{1} \Delta x+\varepsilon_{2} \Delta y$, where $\Delta x$, $\Delta y$, and $\Delta w$ are the increments that result from changing $t$ from $t_0$ to $t_0+\Delta t$. What?! Where does the $\varepsilon_{1} \Delta x+\varepsilon_{2}\Delta y$ even come from? There must be a geometrical way to understand this.
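
To make the increment formula concrete, here is a small numerical sketch of what I think is going on. The function $f(x,y)=xy$, the point $P_0=(2,3)$, and the helper name `increment_residual` are my own assumptions for illustration, not from the book. The code computes the leftover $\Delta w-\frac{\partial w}{\partial x}(P_0)\Delta x-\frac{\partial w}{\partial y}(P_0)\Delta y$, which is what $\varepsilon_{1}\Delta x+\varepsilon_{2}\Delta y$ has to account for.

```python
def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x * y


def increment_residual(x0, y0, dx, dy):
    """Return dw minus its linear part at (x0, y0) for f(x, y) = x * y."""
    fx = y0  # partial derivative of x*y with respect to x, evaluated at P0
    fy = x0  # partial derivative of x*y with respect to y, evaluated at P0
    dw = f(x0 + dx, y0 + dy) - f(x0, y0)
    return dw - (fx * dx + fy * dy)


# Shrink the increments and watch the leftover shrink faster than they do.
for h in (0.1, 0.01, 0.001):
    r = increment_residual(2.0, 3.0, h, h)
    print(h, r, r / h)
```

For this particular $f$ the leftover works out algebraically to $\Delta x\,\Delta y$, so one possible choice is $\varepsilon_1=\Delta y$ and $\varepsilon_2=0$, both of which shrink to 0 with the increments. Whether that is the picture the book intends in general is part of what I'm asking.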
Everything after that is trivial, save for the final statement that
$\frac{d w}{d t}=\frac{\partial w}{\partial x} \frac{d x}{d t}+\frac{\partial w}{\partial y} \frac{d y}{d t}$
Why do we use $\frac{d w}{d t}$ even though $w=f(x(t),y(t))$ is a function of two variables, not one? Shouldn't it be $\frac{\partial w}{\partial t}$?
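
Here is a rough numerical check I put together for the final formula. The specific choices $x(t)=\cos t$, $y(t)=\sin t$, $f(x,y)=xy$, and the helper names are my own assumptions, not the book's. It compares the right-hand side of the Chain Rule against a finite-difference derivative of the composite $w(t)=f(x(t),y(t))$.

```python
import math


def x(t):
    return math.cos(t)


def y(t):
    return math.sin(t)


def f(x_val, y_val):
    # Hypothetical two-variable function, chosen only for illustration.
    return x_val * y_val


def w(t):
    # The composite takes a single input t.
    return f(x(t), y(t))


def chain_rule_derivative(t):
    """Evaluate (df/dx)(dx/dt) + (df/dy)(dy/dt) along the path (x(t), y(t))."""
    fx = y(t)           # partial of x*y with respect to x
    fy = x(t)           # partial of x*y with respect to y
    dx_dt = -math.sin(t)
    dy_dt = math.cos(t)
    return fx * dx_dt + fy * dy_dt


def central_difference(g, t, h=1e-6):
    """Approximate g'(t) with a symmetric difference quotient."""
    return (g(t + h) - g(t - h)) / (2 * h)


t0 = 0.5
print(chain_rule_derivative(t0), central_difference(w, t0))
```

The two printed numbers should agree to several decimal places. Note that `w` ends up as a plain one-argument function of `t` here, which may be related to why the book writes an ordinary derivative, but I'd like confirmation.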
And as always, constructive criticism > downvotes. I think MSE understands this better than SO.