32

Is it possible to revert the softmax function in order to obtain the original values $x_i$?

$$S_i=\frac{e^{x_i}}{\sum_j e^{x_j}} $$

In the case of 3 input variables, this problem boils down to finding $a$, $b$, and $c$ given $x$, $y$, and $z$:

$$\begin{cases} \frac{a}{a+b+c} &= x \\ \frac{b}{a+b+c} &= y \\ \frac{c}{a+b+c} &= z \end{cases}$$

Is this problem solvable?

jojeck
  • 1,363

2 Answers

29

Note that in your three equations you must have $x+y+z=1$. The general solution to your three equations is $a=kx$, $b=ky$, and $c=kz$, where $k$ is any scalar.

So if you want to recover $x_i$ from $S_i$, note that $\sum_i S_i = 1$, so by the same reasoning $e^{x_i} = k S_i$ for some scalar $k > 0$; taking logarithms gives $x_i = \log (S_i) + c$ for all $i$, where $c = \log k$ is some constant.
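A minimal NumPy sketch of this point (the example values and the `softmax` helper are illustrative, not part of the answer): adding any constant to $\log(S_i)$ produces inputs whose softmax is again $S$, so the original $x_i$ are recoverable only up to that constant.

```python
import numpy as np

def softmax(x):
    # Shift by the max for numerical stability; softmax is invariant to shifts.
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([1.0, 2.0, 3.0])
S = softmax(x)

# Any constant c gives candidate inputs that map back to the same S.
for c in (0.0, 5.0, -2.7):
    x_rec = np.log(S) + c
    assert np.allclose(softmax(x_rec), S)

# The recovered values differ from the original x by one common constant.
print(x - np.log(S))  # every entry equals log(sum(exp(x)))
```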

angryavian
  • 93,534
  • 1
    So it’s solvable up to a constant. Thank you! – jojeck May 18 '18 at 17:39
  • 1
    Which constant $c$ should I use? Is there any way of calculating it? – Joel Carneiro Feb 07 '19 at 17:16
  • 4
    @JoelCarneiro Any $c$ will work; the solution is not unique. – angryavian Feb 07 '19 at 17:58
  • Any $c$ will work; one choice is to augment the $x_i$ vector as $(0, x_1,...,x_n)$, which induces a particular $c$. Note that the corresponding log-sum-exp -- the gradient of which is the softmax -- would also be convex (https://en.wikipedia.org/wiki/LogSumExp). – Josh Albert Jul 29 '19 at 10:59
  • 3
    In case anybody like me spends too much time figuring out $c$: If you know your 3 input variables have to sum to 1, then your $c = (1 - \log(x \cdot y \cdot z))/3$. – Rasmus Ø. Pedersen Sep 14 '20 at 13:04
  • 1
    Usually they normalize the $x_i$ to sum to zero, in which case the inverse softmax would be log(x)-mean(log(x)). Or they code the first $x_i$ to be zero, in which case the inverse softmax would be log(x)-log(x[[1]]). (Both conventions are illustrated in the sketch after these comments.) – Tom Wenseleers Aug 24 '22 at 20:12
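To make the conventions from the last two comments concrete, here is a hedged NumPy sketch (the array values are arbitrary): under a sum-to-zero normalization the inverse is $\log(S) - \operatorname{mean}(\log(S))$, and with the first input fixed at zero it is $\log(S) - \log(S_1)$.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Convention 1: inputs normalized to sum to zero.
x = np.array([0.5, -1.5, 1.0])           # sums to zero
S = softmax(x)
assert np.allclose(np.log(S) - np.mean(np.log(S)), x)

# Convention 2: first input coded as zero.
y = np.array([0.0, 0.7, -0.3])           # y[0] == 0
T = softmax(y)
assert np.allclose(np.log(T) - np.log(T[0]), y)
```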
19

The softmax function is defined as:

$$S_i = \frac{\exp(x_i)}{\sum_{j} \exp(x_j)}$$

Taking the natural logarithm of both sides:

$$\ln(S_i) = x_i - \ln(\sum_{j} \exp(x_j))$$

Rearranging the equation:

$$x_i = \ln(S_i) + \ln(\sum_{j} \exp(x_j))$$

The second term on the right-hand side is a constant over all $i$ and can be written as $C$. Therefore, we can write:

$$x_i = \ln(S_i) + C$$
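A quick numerical check of this derivation, as a sketch with arbitrary example values: the difference $x_i - \ln(S_i)$ is the same for every $i$ and equals $\ln\left(\sum_j \exp(x_j)\right)$.

```python
import numpy as np

x = np.array([0.2, -1.0, 3.5])
S = np.exp(x) / np.exp(x).sum()

C = np.log(np.exp(x).sum())      # the constant from the derivation
assert np.allclose(x, np.log(S) + C)
print(x - np.log(S))             # every entry equals C
```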

This answer is adapted from this post on Reddit.

trudolf
  • 103