A B-tree has one significant disadvantage on modern machines with fast, deep cache hierarchies: it depends on pointers. As the tree grows, each access carries a greater and greater risk of causing a cache miss or TLB miss.
Effectively, the constant factor K per access becomes about z*x, where x is the expected number of cache/TLB misses per access (the L1/TLB miss rate typically scales with tree size divided by total cache size) and z is roughly the access time of the smallest cache level, or main memory, that can hold the entire tree.
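To put rough numbers on that, here is a back-of-envelope sketch; the latency and miss count are illustrative assumptions, not measurements:

```cpp
#include <cstdio>

int main() {
    // Assumptions: the tree no longer fits in the last-level cache, so z is
    // roughly DRAM latency, and each lookup chases a few node pointers that
    // miss cache/TLB. Compare with ~1-4 ns for an L1 hit.
    const double z_ns = 80.0;   // access time of the level that holds the tree
    const double x    = 4.0;    // expected cache/TLB misses per access
    std::printf("effective per-access constant K ~= %.0f ns\n", z_ns * x);
    return 0;
}
```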
On the other hand, the average-case quicksort streams memory at maximum prefetcher speed. The only drawback is that the average case also produces a stream that has to be written back. And after a few partitions the entire active set sits in cache and gets streamed even faster.
Both algorithms suffer heavily from branch mispredictions, but quicksort just needs to back up a bit; the B-tree additionally needs to read a new address to fetch from, because it has a data dependency that quicksort doesn't.
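A minimal sketch of the two access patterns (a plain binary node stands in for a B-tree node here just to show the pointer chase; a real B-tree node holds many keys and children, but the descent is still a chain of dependent loads):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Deliberately simplified node used only to illustrate the pointer chase.
struct Node {
    int key;
    Node* left  = nullptr;
    Node* right = nullptr;
};

// Each iteration needs the pointer loaded by the previous one before it can
// issue the next load, so a cache/TLB miss stalls the whole chain.
const Node* tree_find(const Node* n, int key) {
    while (n && n->key != key)
        n = (key < n->key) ? n->left : n->right;   // data-dependent address
    return n;
}

// The next address is always v[i + 1], known far in advance, so the hardware
// prefetcher can stream the data; a mispredicted branch only costs a short
// pipeline flush, with no extra memory round trip.
std::size_t partition_scan(std::vector<int>& v, int pivot) {
    std::size_t store = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] < pivot)
            std::swap(v[store++], v[i]);
    return store;
}
```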
Few algorithms are implemented as pure theoretical functions. Nearly all have some heuristics to fix their worst problems, Timsort excepted, since it is built out of heuristics.
Merge sort and quicksort implementations often check for already sorted ranges, just like Timsort. Both also fall back to insertion sort for small sets, typically fewer than 16 elements; Timsort is essentially built up from such small runs.
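Roughly what those two heuristics look like at the entry of a sort routine; the cutoff value and the structure are assumptions for illustration, not any particular library's code:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

constexpr std::ptrdiff_t kSmall = 16;   // commonly cited small-range cutoff

// Returns true if the range was fully handled by the cheap heuristics,
// so the caller can skip the real partition/merge step.
bool handled_by_heuristics(std::vector<int>& v, std::ptrdiff_t lo, std::ptrdiff_t hi) {
    if (std::is_sorted(v.begin() + lo, v.begin() + hi))
        return true;                                 // already sorted: nothing to do
    if (hi - lo < kSmall) {                          // small set: insertion sort
        for (std::ptrdiff_t i = lo + 1; i < hi; ++i)
            for (std::ptrdiff_t j = i; j > lo && v[j] < v[j - 1]; --j)
                std::swap(v[j], v[j - 1]);
        return true;
    }
    return false;
}
```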
In practice, C++ std::sort is a quicksort hybrid with insertion sort, plus an additional fallback for worst-case behaviour: if the partitioning depth exceeds roughly twice the expected depth, it switches to heapsort.
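A minimal introsort-style sketch of that control flow (not the actual libstdc++/libc++ code; the cutoff, the middle-element pivot and the 2*log2(n) depth limit are just the commonly described values):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

using Iter = std::vector<int>::iterator;

void intro_sort(Iter first, Iter last, int depth_limit) {
    while (last - first > 16) {                       // big enough to partition
        if (depth_limit-- == 0) {                     // too many unbalanced splits:
            std::make_heap(first, last);              //   finish this range with
            std::sort_heap(first, last);              //   heapsort
            return;
        }
        int pivot = *(first + (last - first) / 2);
        // Three-way split via two std::partition passes so equal keys are
        // excluded from both recursive calls and progress is guaranteed.
        Iter mid1 = std::partition(first, last, [pivot](int x) { return x < pivot; });
        Iter mid2 = std::partition(mid1, last, [pivot](int x) { return x == pivot; });
        intro_sort(mid2, last, depth_limit);          // recurse into one side,
        last = mid1;                                  // loop on the other
    }
    // leftover small range: insertion sort
    for (Iter it = first; it != last; ++it)
        for (Iter j = it; j != first && *j < *(j - 1); --j)
            std::iter_swap(j, j - 1);
}

void sort_with_fallback(std::vector<int>& v) {
    int depth_limit = v.empty() ? 0 : 2 * static_cast<int>(std::log2(v.size()));
    intro_sort(v.begin(), v.end(), depth_limit);
}
```

Once the depth limit is hit, heapsort bounds the worst case at O(n log n), at the price of giving up quicksort's streaming access pattern for that range.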
The original quicksort used the first element of the array as the pivot. This was quickly abandoned in favour of a (pseudo)random element, often simply the middle one. Some implementations switched to a median-of-three (of random elements) to get a better pivot; more recently a median-of-5-medians (over all elements) has been used, and the last I saw, in a presentation by Alexandrescu, was a median-of-3-medians (over all elements) to get a pivot close to the actual median (within 1/3 or 1/5 of the span).
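For reference, a median-of-three pick looks roughly like this (sampling first/middle/last is one common variant; the text above mentions random elements as another):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Median of the first, middle and last element of v[lo, hi) -- a cheap way to
// avoid the worst case on already sorted or reverse-sorted input.
int median_of_three(const std::vector<int>& v, std::size_t lo, std::size_t hi) {
    int a = v[lo];
    int b = v[lo + (hi - lo) / 2];
    int c = v[hi - 1];
    return std::max(std::min(a, b), std::min(std::max(a, b), c));
}
```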