False position:
For functions which are not convex at the root, such as $\sqrt[3]x$ or $\arctan(x)$, false position will give tight bounds without either bound getting stuck. For simple roots, the behavior becomes equivalent to the secant method, giving fast convergence.
For functions which are convex at the root, such as $e^x-1$ or $\ln(x)$, one bound will remain fixed. The convergence becomes linear for simple roots and worse otherwise. This means that, for convex functions, it at best converges linearly at a rate which is a constant multiple of bisection's, better or worse depending on the size of the initial interval and the curvature of the function over that interval.
This drawback alone is a major red flag, because it means that in most situations false position performs only about as well as bisection. There are also weaker guarantees on the behavior of false position. Even when it converges linearly, it may converge very slowly (e.g. $x^{10}-1$ on the initial interval $[0,10]$). For roots which are not simple, the convergence is usually sublinear.
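The stuck-bound behavior is easy to see in code. The sketch below is a minimal false position implementation; the test function, bracket, and tolerance are illustrative choices, not from the answer.

```python
import math

def false_position(f, a, b, tol=1e-10, max_iter=100):
    """Regula falsi: interpolate a secant-style point, keep the bracket."""
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "f must change sign on [a, b]"
    for _ in range(max_iter):
        # Zero of the chord through (a, fa) and (b, fb).
        c = b - fb * (b - a) / (fb - fa)
        fc = f(c)
        if abs(fc) < tol:
            return c
        if fa * fc < 0:      # root lies in [a, c]
            b, fb = c, fc
        else:                # root lies in [c, b]
            a, fa = c, fc
    return c

# On the convex f(x) = e^x - 1, the upper bound stays stuck at b = 1
# while only a creeps toward the root at 0, giving linear convergence.
root = false_position(lambda x: math.exp(x) - 1, -1.0, 1.0)
```

Printing the iterates shows $b$ never moves while $a$ approaches $0$ with a roughly constant error ratio per step, which is exactly the linear behavior described above.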
Conclusion:
Overall, the benefit of sometimes converging as fast as the secant method is usually outweighed by how slow and unreliable false position can be.
Modifications:
As you point out, there are modifications of false position which attempt to remedy this issue, most famously the Illinois method. The Illinois method has the advantage of superlinear convergence for simple roots, with an order of convergence of $\sqrt[3]3\approx1.44$ for convex functions and $\varphi\approx1.62$ for non-convex functions.
This is actually comparable to Newton's method, giving a higher order of convergence$^{[1]}$ for simple roots. Another advantage is that it usually gives tight bounds on the root and is guaranteed to converge. The drawback is that the function has to change sign between the initial endpoints.
Similar to Newton's method and false position, convergence, especially initial convergence, may be slow if the initial points are chosen poorly and the function exhibits bad behaviors such as large curvature (e.g. $x^{10}-1$ on the initial interval $[0,10]$). In such cases, especially for low accuracy estimations, it is better to use hybrid modifications involving bisection.
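The Illinois modification is a one-line change to false position: whenever the same endpoint is retained twice in a row, its stored function value is halved, which prevents the bound from getting stuck on convex functions. A minimal sketch (the test function, bracket, and tolerance are again illustrative):

```python
import math

def illinois(f, a, b, tol=1e-12, max_iter=100):
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "f must change sign on [a, b]"
    side = 0  # which endpoint was retained on the previous step
    for _ in range(max_iter):
        c = b - fb * (b - a) / (fb - fa)
        fc = f(c)
        if abs(fc) < tol:
            return c
        if fa * fc < 0:          # root in [a, c]: a is retained
            b, fb = c, fc
            if side == -1:       # a retained twice in a row
                fa *= 0.5        # halve its stored value to unstick it
            side = -1
        else:                    # root in [c, b]: b is retained
            a, fa = c, fc
            if side == 1:
                fb *= 0.5
            side = 1
    return c

# e^x - 1 is convex, yet Illinois still converges superlinearly to 0,
# unlike plain false position on the same bracket.
root = illinois(lambda x: math.exp(x) - 1, -1.0, 1.0)
```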
Conclusion:
The Illinois method is comparable to both the secant and Newton methods in terms of speed and is also guaranteed to converge but requires a change in the function's sign. For more extreme cases, it is better to use hybrid methods using bisection which guarantee a minimum speed of convergence.
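One simple way to realize such a hybrid is to take the false-position point but fall back to a bisection step whenever the bracket fails to halve, which caps the worst case at bisection's linear rate. This is a rough sketch of the idea (not Brent's or any other named method); all names and thresholds are illustrative.

```python
def hybrid(f, a, b, tol=1e-12, max_iter=200):
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "f must change sign on [a, b]"
    for _ in range(max_iter):
        width = b - a
        c = b - fb * (b - a) / (fb - fa)   # false-position candidate
        if not (a < c < b):                # guard against rounding issues
            c = 0.5 * (a + b)
        fc = f(c)
        if abs(fc) < tol or (b - a) < tol:
            return c
        if fa * fc < 0:
            b, fb = c, fc
        else:
            a, fa = c, fc
        if (b - a) > 0.5 * width:          # bracket didn't halve: bisect
            m = 0.5 * (a + b)
            fm = f(m)
            if fa * fm < 0:
                b, fb = m, fm
            else:
                a, fa = m, fm
    return c

# The pathological example from above: plain false position crawls here,
# but the bisection fallback keeps the bracket shrinking geometrically.
root = hybrid(lambda x: x**10 - 1, 0.0, 10.0)  # root at x = 1
```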
$[1]:$ This is per function evaluation, which is where avoiding the derivative comes in: counting evaluations makes Newton effectively slower than its theoretical order of convergence suggests.