numerically stable log1pexp calculation

Question

What are good approximations for computing log1pexp for single precision and double precision floating point numbers?

Note: log1pexp(x) is log(1 + exp(x))

I have found few implementations of log1pexp for double precision but they don't provide an explanation on how they arrived at the approximations. Hence, I am not able to implement log1exp for single precision numbers (without converting to double precision intermediates of course).

Reference implementation: https://github.com/davisking/dlib/blob/master/dlib/dnn/utilities.h#L16-L29

Пафнутий Чебышев · Accepted Answer · 2019-06-30T17:49:39.120

Let $0 < \varepsilon \lll 1$ be the relative error bound of the floating-point system—$2^{-53}$ in IEEE 754 binary64 arithmetic.

First, the naive formula log1p(exp(x)) always gives a good approximation unless exp(x) overflows: If exp(x) computes $(1 + \delta_1) e^x$, and if log1p(y) computes $(1 + \delta_2) \log(1 + y)$, where $|\delta_1|, |\delta_2| \leq \varepsilon$, then log1p(exp(x)) yields

\begin{equation} (1 + \delta_2) \log\bigl[1 + (1 + \delta_1) e^x\bigr] = (1 + \delta_2) (1 + \delta_1') \log(1 + e^x) \end{equation}

Exercise: Compute a bound on $|\delta_1'/\delta_1|$.

But you can do better, with less computation (only one transcendental evaluation, not two), in some special cases:

If $x \lll 0$, then $e^x$ is near 0 so $\log(1 + e^x) \approx e^x$. Specifically, $0 < e^x < 1$, so:
- $\log(1 + e^x) \geq e^x/2$, since $y \mapsto \log(1 + y)$ is a concave function with derivative $y \mapsto 1/(1 + y)$ bounded below by $1/2$ for $y < 1$; and
- the Taylor expansion of $y \mapsto \log(1 + y)$ converges absolutely at $e^x$.
Thus the relative error of $e^x$ from $\log(1 + e^x)$ is:

\begin{align} \frac{|e^x - \log(1 + e^x)|}{\left|\log(1 + e^x)\right|} &\leq 2 e^{-x} |e^x - \log(1 + e^x)| \\ &= 2 e^{-x} |e^x - (e^x - e^{2x}/2 + e^{3x}/3 - \cdots)| \\ &= 2 e^{-x} |e^{2x}/2 - e^{3x}/3 + e^{4x}/4 - \cdots| \\ &= 2 |e^x/2 - e^{2x}/3 + e^{3x}/4 - \cdots| \\ &< 2 |e^x/2| \\ &= e^x. \end{align}

For $x \leq \log \varepsilon$, this relative error is bounded by $\varepsilon$, which justifies the formula exp(x).

In IEEE 754 binary64 arithmetic, $\log \varepsilon \approx -37$.
If $x \gg 0$, we can use the identity

\begin{equation} \log(1 + e^x) = \log\bigl[e^x (1 + e^{-x})\bigr] = x + \log(1 + e^{-x}) \end{equation}

to rewrite $\log(1 + e^x)$ in terms of $\log(1 + y)$ for $0 < y < 1$, with $y = e^{-x}$, as in case (1). The value of $\log(1 + e^x)$ lies between $x + e^{-x}$ and $x + e^{-x} - e^{-2x}/2$, so the relative error of $x + e^{-x}$ from $\log(1 + e^x)$ is below the relative error of $x + e^{-x}$ from $x + e^{-x} - e^{-2x}/2$, which is

\begin{equation} \frac{|x + e^{-x} - (x + e^{-x} - e^{-2x}/2)|}{|x + e^{-x} - e^{-2x}/2|} = \frac{e^{-2x}/2}{|x + e^{-x} - e^{-2x}/2|} \leq e^{-2x}/2. \end{equation}

For $x \geq \log(1/\sqrt{2\varepsilon})$, this is bounded by $\varepsilon$, which justifies the formula x + exp(-x).

In IEEE 754 binary64 arithmetic, $\log(1/\sqrt{2\varepsilon}) \approx 18$.
If $x \ggg 0$, then $x + \log(1 + e^{-x})$ may simply be rounded to $x$. Specifically, the relative error of $x$ from $\log(1 + e^x) = x + \log(1 + e^{-x})$ is

\begin{equation} \frac{\bigl|x - \bigl[x + \log(1 + e^{-x})\bigr]\bigr|}{\left|\log(1 + e^x)\right|} = \frac{\log(1 + e^{-x})}{\log(1 + e^x)} < \log(1 + e^{-x}), \end{equation}

as long as $e^x \geq e - 1$ so that the denominator $\log(1 + e^x)$ is at least $1$.

For $x \geq \log[1/(e^\varepsilon - 1)]$, then this error is below $\varepsilon$, which justifies the formula x.

In IEEE 754 binary64 arithmetic, $\log[1/(e^\varepsilon - 1)] \approx 37$.

Exercise: Find a nice formula for a bound on $x$ that corresponds to the cutoff $33.3$ in the source you cited, by taking the denominator $\log(1 + e^x)$ into account.

numerically stable log1pexp calculation

1 Answers1