1

What is a formula to approximate the (left-tail) inverse of the cumulative distribution function?

I've tried the following which produced incorrect results. I can't use an exact (as it will be done in code so integrations won't be valid) method. I only need a precision of roughly 5 decimal places.

From https://en.wikipedia.org/wiki/Error_function#Inverse_functions:

erf^-1(z) = 0.5 * sqrt(M_PI) * (
z
+ pi * z^3
+ (7 * pi^2 * z^) / 6
+ (127 * pi^3 * z^7) / 90
+ (4369 * pi^4 * z^9) / 2520
+ (34807 * pi^5 * z^11) / 16200
+ (20036983 * pi^6 * z^13) / 7484400
+ (2280356863 * pi^7 * z^15) / 681080400
+ (49020204823 * pi^8 * z^17) / 11675664000
+ (65967241200001 * pi^9 * z^19) / 12504636144000
+ (15773461423793767 * pi^10 * z^21) / 2375880867360000
+ (655889589032992201 * pi^11 * z^23) / 78404068622880000
+ (94020690191035873697 * pi^12 * z^25) / 8910391798788480000
+ (655782249799531714375489 * pi^13 * z^27) / 49229914688306352000000
+ (44737200694996264619809969 * pi^14 * z^29) / 2658415393168543008000000)

Derived from 0.5(1+erf((x-mean)/sqrt(2* variance)) = cdf(x):

inverseCdf(x) = sqrt(2 * variance) * erf^-1(2x - 1) + mean
UKB
  • 119
  • 1
    Why reinvent the wheel? This has already been done efficiently by others. – Ian Nov 30 '17 at 16:52
  • That's what I'm asking for. I was trying to implement a method I'd seen done by others and it didn't work, so was asking for either examples where it's been done before or what's wrong with this one. – UKB Nov 30 '17 at 17:40
  • I'm saying that you should be able to use a black box routine for this without writing your own implementation at all, even in C. – Ian Nov 30 '17 at 17:52
  • It's for PHP, and there are only two very-unmaintained packages (one extension, one userland library) for this kind of stats, neither of which are ideally suited (one hasn't been touched since 2006, the other is maintained but isn't very good (doesn't handle precision at all and is not well written/easily fixable). – UKB Nov 30 '17 at 18:17
  • There is a whole section on this on the Wikipedia page:

    https://en.wikipedia.org/wiki/Normal_distribution#Numerical_approximations_for_the_normal_CDF

    The Zelen and Severo algorithm seems to be precise enough for what you're doing, with error below 10^-8. You can probably find a way to make it faster by doing some computationl tricks with the evaluation of the polynomial function of t.

    – Ryan Warnick Dec 03 '17 at 13:03
  • I’m voting to close this question because it has a reasonable answer. – Kurt G. Jan 09 '24 at 07:05

2 Answers2

1

I tried implementing the example on Wikipedia, and I also got an issue. Did some testing of the waters, and turns out the expression on Wikipedia is wrong. It's supposed to be $\sqrt{\frac{\pi}{2}}$, not $\frac{\sqrt{\pi}}{2}$. Change that and it should work.

0

Since you only need a precision of roughly 5 decimal places I'd suggest the approach used in an ancient HP pocket calculator: use an approximation for the cumulative distribution function and use it with a root finder to compute the inverse. With today's hardware and only for five digits precision, it's fast enough.

m-stgt
  • 575