
As one may gather from my previous questions, I am trying to understand the differences between the real and the complex derivative.

My question is:

For a function $f:\mathbb{R}\to\mathbb{R}$, the derivative at a point $c\in\mathbb{R}$ is defined as $$f'(c)=\lim_{h\rightarrow 0}\frac{f(c+h)-f(c)}{h}$$

For a complex function $f:\mathbb{C}\longrightarrow\mathbb{C}$ the complex derivative at a point $z$ is defined as $$f'(z)=\lim_{h\rightarrow 0}\frac{f(z+h)-f(z)}{h}.$$

But for a function $f:\mathbb{R}^2\longrightarrow \mathbb{R}^2$, why isn't the derivative at a point $(x,y)$ defined as $$f'(x,y)=\lim_{(h,k)\rightarrow (0,0)}\frac{f((x,y)+(h,k))-f(x,y)}{(h,k)}?$$

Also, in this question, Differences between the complex derivative and the multivariable derivative, they say that it is because division cannot be defined in $\mathbb{R}^2$. But can we not define the inverse of $(h,k)$ as $(\frac{1}{h}, \frac{1}{k})$ for $h\neq 0, k\neq 0$, and define $(a,b){(c,d)}^{-1}$ this way?

  • What does $\frac{f((x,y)+(h,k))-f(x,y)}{(h,k)}$ mean? – José Carlos Santos Dec 17 '21 at 18:13
  • How do you propose to "divide" two elements of $\mathbb R^2$? – jlammy Dec 17 '21 at 18:16
  • Defining $1/(h,k)$ as $(1/h, 1/k)$ gives one particular way that $h \to 0$ and $k \to 0$, but you can construct examples where the limit exists for this path, but not say for $(1/h, 2/k)$. – David Kraemer Dec 17 '21 at 18:27
  • The first rule of studying "how a multivariable function changes" is to hold every variable but one constant, i.e., to "control the variables" in the sense of lab science. But that's precisely saying "make $h = 0$ or make $k = 0$." This causes problems with the proposed notion of inverse. (This is not the only problem with the proposal, unfortunately, but it's the snag most directly related to the body of this question.) – Andrew D. Hwang Dec 17 '21 at 19:53
  • @Andrew D. Hwang can you give a good example showcasing the problems created? – user332905 Dec 18 '21 at 02:18
  • Multivariate calculus isn't only about functions whose domain and range have the same dimension. What about $f:\mathbb R^2\to\mathbb R$ or $f:\mathbb R^2\to\mathbb R^3$? – David K Dec 18 '21 at 02:31
  • @David K Yes, I totally agree with you. For the cases you mentioned it will definitely not work. I just wanted some concrete examples to convince myself that the division I defined for $f:\mathbb{R}^2\longrightarrow \mathbb{R}^2$ will not work for finding the derivative of any such function. – user332905 Dec 18 '21 at 03:00

3 Answers


$\newcommand{\d}{\mathrm{d}}$The reasons why your definition fails have been discussed in the comments, but as you asked, I will provide an explicit example of its failure. I will also showcase the true definition, and link it to the derivatives with which you're familiar.

Counterexample:

Let $f(x,y)=(x^2+y^2,x^2-y^2)$ and let's differentiate it at $(0,0)$. Its Jacobian, i.e. its proper derivative (differential) according to the standard definition, is:
$$\d f=\begin{pmatrix}2x&2y\\2x&-2y\end{pmatrix}\overset{(0,0)}=\begin{pmatrix}0&0\\0&0\end{pmatrix}$$

Now let's attempt your definition. Consider the numerator of your quotient:
$$f((0,0)+(h,k))-f(0,0)=(h^2+k^2,h^2-k^2)$$

And now let's "divide" it by $(h,k)$, using your pointwise operation and allowing for dodgy notation:
$$\frac{\Delta f}{\Delta(x,y)}=\left(\frac{h^2+k^2}{h},\frac{h^2-k^2}{k}\right)=(h+k^2/h,h^2/k-k)$$

What happens as $(h,k)\to(0,0)$? The limit exists only if it exists along every path to $(0,0)$ and takes the same value along each of them. The path $h=k\to0$ clearly has limit $(0,0)$, but the path $h=k^2$ has:
$$\lim_{k\to0}(k^2+1,k^3-k)=(1,0)$$

So the limit does not exist, as two different paths to $(0,0)$ give two different answers. Moreover, even if it did exist, in what way would it be a derivative? Your answer would be a vector, but a derivative needs to be a linear map, unless you really want to shake up the definitions...
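In fact the proposed quotient fails even for a linear function, which ought to be its own derivative everywhere. Take $f(x,y)=(x+y,x-y)$, whose differential at every point is the constant matrix $\begin{pmatrix}1&1\\1&-1\end{pmatrix}$. At $(0,0)$ the componentwise quotient is $$\left(\frac{h+k}{h},\frac{h-k}{k}\right)=\left(1+\frac{k}{h},\ \frac{h}{k}-1\right),$$ which tends to $(2,0)$ along $h=k$ but to $(3/2,1)$ along $h=2k$, so once again no limit exists.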

In the spirit of generalising the derivative as a linear approximation, consider this definition:

If $f:X\to Y$, where $X,Y$ are normed vector spaces, $f$ is defined to be differentiable at $p_0\in X$ if there exist a (necessarily unique) continuous linear map $\d f:X\to Y$ and a function $\psi:X\to Y$, defined near $0$, with $\|\psi(p-p_0)\|\in o(\|p-p_0\|)$, so that: $$f(p)-f(p_0)=\d f(p-p_0)+\psi(p-p_0)$$

If $X,Y=\Bbb R^n,\Bbb R^m$, then $\d f$ would be the Jacobian matrix - remember that linear maps over finite dimensional vector spaces can always be represented by matrices.
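To see the definition in action on the example above: for $f(x,y)=(x^2+y^2,x^2-y^2)$ at $p_0=(0,0)$ we may take $\d f$ to be the zero matrix and $\psi(h,k)=(h^2+k^2,h^2-k^2)$, since $$\|\psi(h,k)\|\le\sqrt2\,(h^2+k^2)=\sqrt2\,\|(h,k)\|^2\in o(\|(h,k)\|).$$ So $f$ is perfectly differentiable at the origin in this sense, with derivative zero, even though the componentwise quotient has no limit.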

If that definition is new to you, I reiterate that the motivation is that this takes the notion of local linear approximation (like the "gradient of a line" in the single variable case) and generalises it easily to fairly arbitrary situations. It keeps the idea that if I zoom in on the graph, or here generally if I zoom in on the surface, there comes a point where it appears “flat”, and the derivative (or more often, the differential), is the linear map that resembles this “flat” section of the surface - recall that matrices can describe planes and other higher dimensional surfaces.

Notice that if $X=Y=\Bbb R$, then $\d f$ is just multiplication by a real scalar, let's call it $f'(p_0)$, and we can write:

$$f(p)-f(p_0)=f'(p_0)(p-p_0)+\psi(p-p_0)\implies\frac{f(p)-f(p_0)}{p-p_0}\approx f'(p_0)$$

where the approximation becomes exact as $p\to p_0$, since $\|\psi(p-p_0)\|\in o(\|p-p_0\|)$. Notice that this is precisely the definition of the derivative that you use for $f:\Bbb R\to\Bbb R$.

What about the complex derivative?

Well, if $f:\Bbb C\to\Bbb C$ is to be differentiable at $z$, it must in particular be differentiable as a function $\Bbb R^2\to\Bbb R^2$, and the differential in the sense of $\Bbb R^2$ is a Jacobian matrix with respect to the real and imaginary parts of $f(z)=u+iv$, and the real and imaginary parts of the input $z=x+iy$:

$$f'(z)=f'(x,y)=\begin{pmatrix}\partial_x u&\partial_y u\\\partial_x v&\partial_y v\end{pmatrix}$$

But really this is a complex function, so for $f$ to be complex differentiable this matrix must represent a complex scalar. Recall that a complex number $a+ib$ has the matrix representation:

$$a+ib\simeq\begin{pmatrix}a&-b\\b&a\end{pmatrix}$$

Equating the matrices, the derivative $f'=a+ib$ has real part $a=\partial_x u=\partial_y v$, and imaginary part $b=\partial_x v=-\partial_y u$. These are precisely the Cauchy-Riemann equations, which are quite restrictive; $f$'s differentiability as a function of $\Bbb R^2$ does not imply differentiability as a function of $\Bbb C$.
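For instance, $f(z)=z^2$ has $u=x^2-y^2$ and $v=2xy$, so $\partial_x u=2x=\partial_y v$ and $\partial_x v=2y=-\partial_y u$: the Cauchy-Riemann equations hold everywhere, and the Jacobian represents the complex scalar $2x+2iy=2z$, i.e. $f'(z)=2z$ as expected. By contrast, $f(z)=\bar z$ has $u=x$, $v=-y$, so $\partial_x u=1\neq-1=\partial_y v$, and $\bar z$ is nowhere complex differentiable even though it is a perfectly smooth map of $\Bbb R^2$.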

If the notion of matrix representations of $\Bbb C$ is unfamiliar to you, just try writing two complex variables in rectangular form and test adding and multiplying them. Do the same with their matrix representations as above. You should find that they are the same! Then the matrix representation perfectly captures the basic arithmetic operations. Whether the representation holds true in the context of more complicated operations I’m unsure of, but that level of representation theory isn’t relevant here. Things like the matrix exponential and analytic matrix functions do exist however...
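Concretely, for multiplication: $(a+ib)(c+id)=(ac-bd)+i(ad+bc)$, while $$\begin{pmatrix}a&-b\\b&a\end{pmatrix}\begin{pmatrix}c&-d\\d&c\end{pmatrix}=\begin{pmatrix}ac-bd&-(ad+bc)\\ad+bc&ac-bd\end{pmatrix},$$ which is exactly the matrix representing $(ac-bd)+i(ad+bc)$. Addition works the same way, entry by entry.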

Hopefully that explains the multivariable derivative's link to single variable derivatives.

FShrike
  • Hi. My confusion is in the first paragraph of your answer. I have read and understood the rest of your answer; I am actually aware of it. I just want a clear argument/example showing that the division that I have defined will not help define the derivative for any $f:\mathbb{R}^2\longrightarrow \mathbb{R}^2$ the same way as we do for $f:\mathbb{R}\longrightarrow\mathbb{R}$. The comments say such a division will create problems, but what and how? I just wanted a clear answer stating that. – user332905 Dec 18 '21 at 15:34
  • @user332905 I left it out as I thought the comments addressed it. I have now given an explicit example – FShrike Dec 18 '21 at 16:14
  • Thank you. This is exactly what I was looking for. – user332905 Dec 18 '21 at 19:04

In the general case, a derivative is defined as a linear map which serves as a local approximation of a function. In the 1D case you can say that $f(x+h) - f(x) \approx f^\prime h$, with $f^\prime h$ being a linear function of $h$. But in the multivariable case $f^\prime$ is represented by a matrix, and $x$, $h$ and $f(x)$ must be vectors.

You can still give a similar definition in vector notation: the result of applying $\mathbf f^\prime$ to a vector $\mathbf v$ can be defined as $$ \mathbf f^\prime \mathbf v = \lim_{h\rightarrow 0} \frac{\mathbf f(\mathbf x + h\mathbf v) - \mathbf f (\mathbf x)}{h} $$ BTW, the elements of the matrix $\mathbf f^\prime$, which are called partial derivatives, can still be defined the same way as for single variables: $$\frac{\partial f_i}{\partial x_j} = \lim_{h\rightarrow 0} \frac{f_i(x_1,\dots,x_j+h, \dots, x_n) - f_i(x_1,\dots,x_j, \dots, x_n)}{h}$$
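As a quick sanity check of this definition, take for example $\mathbf f(x,y)=(x^2, xy)$, whose Jacobian is $\begin{pmatrix}2x&0\\y&x\end{pmatrix}$. For $\mathbf v=(v_1,v_2)$, $$\frac{\mathbf f(x+hv_1, y+hv_2)-\mathbf f(x,y)}{h}=\left(2xv_1+hv_1^2,\ xv_2+yv_1+hv_1v_2\right)\xrightarrow{h\to0}(2xv_1,\ xv_2+yv_1),$$ which is exactly the Jacobian matrix applied to $\mathbf v$.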

blamocur

Well, the problem is that there are "too many" differentiable functions $\mathbb R^2\to \mathbb R^2$ compared with $\mathbb C \to \mathbb C$.
I'll give an example: if we look at $f(x,y)=(x,-y)$, the function looks perfectly well behaved, as each component has a derivative, so one expects it to have a nice derivative.
But when we look at the corresponding complex map $x+iy\mapsto x-iy$, i.e. $f(z)=z^*$, we encounter a problem: it is not complex differentiable, since $$\lim_{h\to0} \frac{(z+h)^*-z^*}{h}=\lim_{h\to0} \frac{h^*}{h}=\lim_{h\to0} \frac{(h^*)^2}{|h|^2}$$ depends on the direction from which we approach $0$: along the real axis $h=\varepsilon$ the quotient is $1$, while along the imaginary axis $h=i\varepsilon$ it is $-1$.
The problem encountered here is that complex differentiability is a geometric condition on the function, not merely a strengthened smoothness condition.
Geometric Interpretation
If the derivative of $f$ at $z_0$ is $w$, then a lot about $f$'s behaviour near $z_0$ can be known, regardless of the particular value of $w$,
because near $z_0$: $$f(z_0+h) \approx f(z_0)+wh$$ This means, for example, that if adding a small real value to $z_0$ moves $f(z_0)$ in some direction, then adding a small imaginary value of the same size moves it in the direction rotated 90 degrees counterclockwise (multiplication by $i$ is a 90 degree rotation).
For $z^*$ we would instead move in the direction rotated 90 degrees clockwise,
and in general, a nonzero complex derivative implies that $f$ preserves angles between curves.
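One way to see the angle preservation: writing $w=re^{i\theta}$ with $r\neq0$, the approximation $$f(z_0+h)-f(z_0)\approx wh=re^{i\theta}h$$ says that near $z_0$ the map $f$ scales every small displacement by the same factor $r$ and rotates it by the same angle $\theta$. Since all directions are rotated by the same angle, the angle between any two curves crossing at $z_0$ is unchanged.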
Partial Derivatives
If we look at it through the lens of partial derivatives, writing $$f(x+yi)=u(x,y)+iv(x,y),$$ the geometric condition is phrased as the Cauchy–Riemann equations:
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$$

razivo