
I just started taking a course on data structures and algorithms, and my teaching assistant gave us the following pseudo-code for sorting an array of integers:

void F3() {
    for (int i = 1; i < n; i++) {
        if (A[i-1] > A[i]) {   // adjacent pair out of order
            swap(i-1, i);
            i = 0;             // restart the scan (i becomes 1 after i++)
        }
    }
}

It may not be clear, but here $n$ is the size of the array A that we are trying to sort.

In any case, the teaching assistant explained to the class that this algorithm runs in $\Theta(n^3)$ time (worst case, I believe), but no matter how many times I trace through it with a reverse-sorted array, it seems to me that it should be $\Theta(n^2)$, not $\Theta(n^3)$.

Would someone be able to explain to me why this is $\Theta(n^3)$ and not $\Theta(n^2)$?

D.W.

5 Answers


This algorithm can be rewritten as follows:

  1. Scan A until you find an inversion.
  2. If you find one, swap and start over.
  3. If there is none, terminate.

Now there can be at most $\binom{n}{2} \in \Theta(n^2)$ inversions, and finding each one requires a linear-time scan, so the worst-case running time is $\Theta(n^3)$. A beautiful teaching example, as it trips up the pattern-matching approach many succumb to!
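To make the reformulation concrete, here is a minimal C sketch of it (the function name sort_restart, the explicit parameters, and the inline swap are my own scaffolding, not part of the original pseudo-code):

    void sort_restart(int A[], int n) {
        for (;;) {
            int found = 0;
            for (int i = 1; i < n; i++) {        /* 1. scan for an inversion */
                if (A[i-1] > A[i]) {
                    int t = A[i-1]; A[i-1] = A[i]; A[i] = t;
                    found = 1;                   /* 2. swap and start over   */
                    break;
                }
            }
            if (!found) return;                  /* 3. none left: sorted     */
        }
    }

Each trip through the outer loop performs one linear-time scan and removes exactly one inversion, which is precisely the accounting used above.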

Nota bene: one has to be a little careful here: some inversions appear early and some appear late, so it is not a priori trivial that the costs add up as claimed (for the lower bound). You also need to observe that swaps never introduce new inversions. A more detailed analysis of the inversely sorted array will then yield something like the quadratic case of Gauss' formula.
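To spell that remark out for the inversely sorted array (my own arithmetic, anticipating David E's answer below): the element that settles into position $N$ costs on the order of $1 + 2 + \dots + N$ scan steps, so the total is

$$\sum_{N=1}^{n-1} \frac{N(N+1)}{2} = \frac{(n-1)n(n+1)}{6} \in \Theta(n^3),$$

which is already cubic, not quadratic.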

As @gnasher729 aptly comments, it's easy to see that the worst-case running time is $\Omega(n^3)$ by analyzing the running time when sorting the input $[1, 2, \dots, n, 2n, 2n-1, \dots, n+1]$: every inversion lies in the second half, so each of the $\Theta(n^2)$ swaps is preceded by a scan across the already-sorted first half, costing $\Omega(n)$ per swap (though this input is probably not the worst case).

Be careful: don't assume that a reverse-sorted array is necessarily the worst-case input for every sorting algorithm. That depends on the algorithm: for some sorting algorithms a reverse-sorted array isn't the worst case, and it can even be close to the best case.

D.W.
Raphael

An alternative way of thinking about this is to ask what the maximum value of i becomes before it is reset. This, as it turns out, makes it more straightforward to reason about how the prior sort order of A affects the running time of the algorithm.

In particular, observe that whenever i attains a new maximum value, call it N, the prefix [A[0], ..., A[N-1]] is sorted in ascending order.

So what happens when we add the element A[N] to the mix?

The mathematics:

Well, let's say it fits at position $p_N$. Then we need $N$ loop iterations (which I'll count as $\text{steps}$) to move it to place $N-1$, another $N-1$ iterations to move it to place $N-2$, and in general:

$$\text{steps}_N(p_N) = N + (N-1) + (N-2) + \dots + (p_N+1) = \tfrac{1}{2}(N(N+1) - p_N(p_N+1))$$
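As a quick sanity check (my own numbers, not part of the original answer): for $N = 3$ and $p_3 = 1$, the element moves from place 3 down to place 1, and indeed

$$\text{steps}_3(1) = 3 + 2 = 5 = \tfrac{1}{2}(3 \cdot 4 - 1 \cdot 2).$$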

For a randomly sorted array, $p_N$ is uniformly distributed on $\{0, 1, \dots, N\}$ for each $N$, with:

$$\mathbb{E}(\text{steps}_N(p_N)) = \sum_{a=0}^{N} \mathbb{P}(p_N = a)\,\text{steps}_N(a) = \sum_{a=0}^{N}\tfrac{1}{N+1}\cdot\tfrac{1}{2}\bigl(N(N+1) - a(a+1)\bigr) = \tfrac{1}{2}\Bigl(N(N+1) - \tfrac{1}{3}N(N+2)\Bigr) = \tfrac{1}{6}N(2N+1) = \Theta(N^2)$$

The inner sum $\sum_{a=0}^{N} a(a+1) = \tfrac{1}{3}N(N+1)(N+2)$ can be evaluated using Faulhaber's formula or checked with Wolfram Alpha.

For an inversely sorted array, $p_N=0$ for all $N$, and we get:

$$\text{steps}_N(p_N) = \tfrac{1}{2}N(N+1)$$

exactly, taking strictly longer than any other value of $p_N$.

For an already sorted array, $p_N = N$ and $\text{steps}_N(p_N) = 0$, with the lower-order terms becoming relevant.

Total time:

To get the total time, we sum up the steps over all the $N$. (If we were being super careful, we would sum up the swaps as well as the loop iterations, and take care of the start and end conditions, but it is reasonably easy to see they don't contribute to the complexity in most cases).

And again, using linearity of expectation and Faulhaber's formula:

$$\text{Expected Total Steps} = \mathbb{E}(\sum_{N=1}^n \text{steps}_N(p_N)) = \sum_{N=1}^n \mathbb{E}(\text{steps}_N(p_N)) = \Theta(n^3)$$
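For completeness, plugging the per-element expectation $\tfrac{1}{6}N(2N+1)$ from above into the sum (my own arithmetic):

$$\sum_{N=1}^{n} \tfrac{1}{6}N(2N+1) = \frac{n(n+1)(4n+5)}{36} \sim \frac{n^3}{9} \in \Theta(n^3).$$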

Of course, if for some reason $\text{steps}_N(p_N)$ is not $\Theta(N^2)$ (e.g., if the arrays we are drawing from are already very close to sorted), then this need not be the case. But it takes a very specific distribution on $p_N$ to achieve that!


David E

Disclaimer:

This is not a proof (it seems that some people think I posted it as if it were). It is only a small experiment that the OP could perform to resolve their doubts about the assignment:

no matter how many times I trace through it with a reverse-sorted array, it seems to me that it should be $\Theta(n^2)$, not $\Theta(n^3)$.

With code this simple, the difference between $\Theta(n^2)$ and $\Theta(n^3)$ shouldn't be hard to spot, and in many practical cases this is a useful way to check hunches or adjust expectations.
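The measurement program itself was only linked, not shown; here is a minimal C sketch of such an experiment (all names are mine, and the choice of reverse-sorted inputs is an assumption). It counts loop iterations of F3 on arrays of doubling size, producing (n, steps) pairs one could fit as described below:

    #include <stdio.h>
    #include <stdlib.h>

    /* F3 instrumented with a step counter. */
    static long long f3_steps(int *A, int n) {
        long long steps = 0;
        for (int i = 1; i < n; i++) {
            steps++;
            if (A[i-1] > A[i]) {
                int t = A[i-1]; A[i-1] = A[i]; A[i] = t;
                i = 0;            /* restart; i becomes 1 after i++ */
            }
        }
        return steps;
    }

    int main(void) {
        for (int n = 100; n <= 1600; n *= 2) {
            int *A = malloc(n * sizeof *A);
            for (int i = 0; i < n; i++) A[i] = n - i;   /* reverse-sorted */
            printf("%d %lld\n", n, f3_steps(A, n));
            free(A);
        }
        return 0;
    }

Doubling n should multiply the step count by roughly 8 if the running time is cubic, but only by 4 if it were quadratic.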


@Raphael answered your question already, but just for kicks, fitting this program's output to $f(x) = a\cdot x^b + c\cdot x$ using this gnuplot script reported exponent values of $2.99796166833222$ and $2.99223727692339$ and produced the following plots (the first is normal scale and the second is log-log scale):

[plots of the fitted data: normal scale and log-log scale]

I hope this helps $\ddot\smile$

dtldarek

Assume you have an array.

int a[11] = {10,8,9,6,7,4,5,2,3,0,1};

Your algorithm does the following

Scan(1) - Swap (10,8) => {8,10,9,6,7,4,5,2,3,0,1}  // keep following the 10
Scan(2) - Swap (10,9) => {8,9,10,6,7,4,5,2,3,0,1}
...
Scan(10) - Swap(10,1) => {8,9,6,7,4,5,2,3,0,1,10}

Basically it moves the largest element to the end of the array, and in doing so it starts over at each scan, effectively doing $O(n^2)$ loop iterations just for that one element. However, there are $n$ elements, so we have to repeat this $n$ times. This isn't a formal proof, but it helps understand in an informal way why the running time is $O(n^3)$.
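To attach numbers to this informal argument (my own arithmetic): bubbling one maximal element to the end takes scans of length roughly $1, 2, \dots, n$, i.e.

$$\sum_{k=1}^{n} k = \frac{n(n+1)}{2} = O(n^2)$$

iterations for that single element, and repeating this for all $n$ elements gives $n \cdot O(n^2) = O(n^3)$.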

D.W.
CoffeDeveloper

The logic sorts the elements of the array in ascending order.

Suppose the smallest number is at the end of the array, at A[n-1]. For it to reach its correct place, $(n + (n-1) + (n-2) + \dots + 3 + 2 + 1) = \tfrac{1}{2}n(n+1) = O(n^2)$ operations are required.

So a single element can require $O(n^2)$ operations; for $n$ elements it is $O(n^3)$.

Sandeep