
I recently learnt about dual numbers, of the form $u + v\varepsilon$ where $u, v \in \mathbb R$ and $\varepsilon \notin \mathbb R$ is such that $\varepsilon^2 = 0$. These numbers are used for automatic differentiation as they allegedly satisfy $f'(x) = \operatorname{Dual}(f(x+\varepsilon))$, where $\operatorname{Dual}$ is the coefficient of $\varepsilon$ in the Cartesian expansion of a dual number. However, all the resources I came across only showed this property for a few simple examples, usually polynomials.
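For concreteness, here is a minimal Python sketch of that property on a polynomial. The `Dual` class, the `derivative` helper and the test polynomial `p` are names made up for this illustration (they are not from any library), and only addition and multiplication are implemented, which is all a polynomial needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A dual number re + du*eps with eps**2 == 0 (hypothetical helper class)."""
    re: float
    du: float = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.re + other.re, self.du + other.du)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, because eps^2 = 0
        return Dual(self.re * other.re, self.re * other.du + self.du * other.re)

    __rmul__ = __mul__


def derivative(f, x):
    """Return the coefficient of eps in f(x + eps)."""
    return f(Dual(x, 1.0)).du


def p(t):
    # p(t) = 3 t^2 + 2 t + 1, so p'(t) = 6 t + 2
    return 3 * t * t + 2 * t + 1


print(derivative(p, 2.0))  # 14.0
```

This only confirms the polynomial case, where $f(x+\varepsilon)$ is unambiguous because it uses nothing but ring operations; it says nothing yet about more general functions.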

I tried to prove or disprove this property for a more general class of functions. So let's assume $f : (I \subseteq \mathbb R) \to \mathbb R$ is differentiable and $\delta$ is a small non-zero real number. We can then write $f(x+\delta) = f(x) + \delta f'(x) + \delta o(1)$, where $o(1)$ goes to $0$ as $\delta$ goes to $0$. Assuming we manage to make $\varepsilon$ behave like a small non-zero real, we get $f(x+\varepsilon) = f(x) + \varepsilon f'(x) + \varepsilon o(1)$, and if $o(1) = \varepsilon \times \text{something}$, then $\varepsilon o(1) = \varepsilon^2 \times \text{something} = 0$ and we can conclude that the dual part of $f(x+\varepsilon)$ is indeed $f'(x)$. This happens when $f$ can be expressed as a power series, which already covers a nice range of functions.

However:

  • being differentiable likely isn't enough to have a first-order Taylor expansion with a remainder of the form $\varepsilon^2 \times \text{stuff}$, but can we make weaker assumptions than analyticity?
  • can we even extend real functions to dual reals in a way that guarantees that $f(x+\varepsilon)$ looks like $f(x+\delta)$ for small $\delta$?

As an example for the second point, defining the absolute value seems quite non-trivial. Assuming $|x+\varepsilon|$ is a square root of $(x+\varepsilon)^2$, calculations show that it must equal $\pm(x+\varepsilon)$ if $x \neq 0$ and $0$ otherwise. To be able to find the derivative of $x \mapsto |x|$ and $x \mapsto x|x|$, we figure out that the best choice is probably to set $|x + \varepsilon| = x+\varepsilon$ if $x > 0$, $-(x+\varepsilon)$ if $x < 0$ and $0$ if $x = 0$.
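A minimal Python sketch of this sign-based choice, to check that it gives the expected derivatives for both test functions. Dual numbers are represented here as plain `(real, eps)` tuples, and `dual_abs`, `dual_mul`, `d_abs` and `d_x_abs` are hypothetical helper names introduced only for this illustration.

```python
def dual_abs(x, dx):
    """Extend |.| to x + dx*eps with the sign-based choice described above.

    Returns the pair (real part, eps coefficient).
    """
    if x > 0:
        return (x, dx)
    if x < 0:
        return (-x, -dx)
    return (0.0, 0.0)  # the choice |0 + eps| = 0


def dual_mul(a, b):
    # (a0 + a1 eps)(b0 + b1 eps) = a0 b0 + (a0 b1 + a1 b0) eps
    return (a[0] * b[0], a[0] * b[1] + a[1] * b[0])


def d_abs(x):
    """eps coefficient of |x + eps|."""
    return dual_abs(x, 1.0)[1]


def d_x_abs(x):
    """eps coefficient of (x + eps) * |x + eps|."""
    return dual_mul((x, 1.0), dual_abs(x, 1.0))[1]


for x in (-3.0, 0.0, 2.0):
    print(x, d_abs(x), d_x_abs(x))
```

Away from $0$ this reproduces $\operatorname{sign}(x)$ and $2|x|$. At $x = 0$ it returns $0$ for $x|x|$, which is the true derivative there, and also $0$ for $|x|$, where no derivative exists, so the value at that point is a convention rather than a derivative.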

Rócherz
  • You may be interested in the Transfer Principle for the hyperreals. It's what allows us to extend real functions to hyperreal ones. https://en.wikipedia.org/wiki/Transfer_principle – CyclotomicField Apr 20 '25 at 17:28
  • @CyclotomicField this question is not about hyperreals. – Anixx Apr 20 '25 at 17:52
  • Thanks @CyclotomicField, I am familiar with the transfer principle for the hyperreals. However, as far as my limited knowledge of hyperreals goes, there is no way of representing them in a computer, so they are not useful for automatic differentiation. Dual numbers are of interest because they can easily be represented as a pair of reals (or rather, floating-point numbers or fractions). – user8171079 Apr 20 '25 at 18:23
  • Note that the real absolute value is usually replaced by the following "complex" modulus: $|x+\varepsilon|^2 = (x+\varepsilon)(x-\varepsilon) = x^2$, hence simply $|x+\varepsilon| = |x|$. – Abezhiko Apr 20 '25 at 18:26
  • @Abezhiko That's useful as a "geometric" absolute value for dual numbers, but it's not a good extension of real absolute value here, since the derivative of $|x|$ via dual numbers would be computed as $0$ for all $x$. – Misha Lavrov Apr 20 '25 at 18:40
  • In general the desired extension would have to be defined by $f(x + \varepsilon) = f(x) + \varepsilon f'(x)$, but of course this isn't useful for your goal of computing derivatives via dual numbers. – lily Apr 20 '25 at 18:52
  • It is certainly not defined, as written, until you define it. All we can really say is that the definition "makes sense" for a wide class of functions, like analytic functions. – Thomas Andrews Apr 20 '25 at 19:46
  • Maybe look at functions of the form $x^a\sin(1/x^b)$. – mr_e_man Apr 24 '25 at 15:47

1 Answer


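As far as I know, what you are looking for is true for any holomorphic function (on the real line, any real-analytic function). Indeed, suppose $f$ has the Taylor series

$$ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x-a)^n $$

around a certain point $a$. Defining $f(x+\varepsilon)$ by substituting $x+\varepsilon$ into this series and expanding with the binomial theorem gives

$$ f(x+\varepsilon) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} \sum_{k=0}^n \binom{n}{k}\varepsilon^k(x-a)^{n-k} $$

Since $\varepsilon^k = 0$ for $k \geq 2$, only the $k = 0$ and $k = 1$ terms survive, so the dual part is

$$ \mathfrak{D}(f(x+\varepsilon)) = \sum_{n=1}^{\infty} \frac{f^{(n)}(a)}{n!} \binom{n}{1}(x-a)^{n-1} = f'(x) $$

for every $x$ inside the interval of convergence, where the series can be differentiated term by term.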
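As a quick worked instance of this computation, take $f = \exp$ and $a = 0$, with the usual Taylor coefficients $1/n!$ (this particular example is an illustration, not something taken from the question):

$$ \exp(x+\varepsilon) = \sum_{n=0}^{\infty} \frac{(x+\varepsilon)^n}{n!} = \sum_{n=0}^{\infty} \frac{x^n}{n!} + \varepsilon \sum_{n=1}^{\infty} \frac{n\,x^{n-1}}{n!} = e^x + \varepsilon\, e^x $$

so $\mathfrak{D}(\exp(x+\varepsilon)) = e^x = \exp'(x)$, as expected.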

In general, I cannot guarantee that every differentiable function satisfies $\mathfrak{D}(f(x+\varepsilon)) = f'(x)$ at every point $x$ where $f$ is differentiable.