Science & Technology

Why do Quicksort’s worst-case and average-case running times differ?

Quicksort algorithm was developed by a British computer scientist Tony Hoare in 1959. “Quick Sort” is capable of sorting a list of data elements significantly faster (twice or thrice faster) than any of the common sorting algorithms and that’s why it is the favourite topic of interviewers in programming interviews. It is one of the most efficient sorting algorithms and is an in-place sorting algorithm.

Quicksort(A, start, end) { 
	if (start < end) { 
		partitionIndex = partition(A, start, end); 
		Quicksort(A, start, partitionIndex-1); 
		Quicksort(A, partitionIndex+1, end); 
	} 
} 

Now partition procedure, rearranges the array such that pivot element is at its correct position. The total time taken to rearrange the array is always ?(?) where n = end-start+1.

partition (arr[], low, high{
    // pivot - Element at end position (right most)
    pivot = arr[high];  
    i = (low - 1); 
    for (j = low; j <= high-1; j++) {
        // If current element is smaller than the pivot, swap the element with pivot
        // to the left
        if (arr[j] < pivot) {
            i++;    // increment index of smaller element
            swap(arr[i], arr[j]);
        }
    }
    swap(arr[i + 1], arr[high]);
    return (i + 1);
}

Let us suppose that the pivot has divided the array into two parts, one of size k and other of size n-k. Both of these parts will need to be sorted. This gives us the following relation


?(?) = ?(?) + ?(?−?) + Θ(?)

where ?(?) is time taken by algorithm to sort n elements and Θ(?) is the cpu time to rearrange elements.

Now, let’s analyse, worst case and best case.

Worst case

Consider the case when pivot happened to be the least element of array, we have k = 1.

?(?) = ?(1) + ?(?−1) + Θ(?)

Solving recurrence by substitution:

?(?) = ?(1) + ?(?−2) + Θ(?−1) + ?(1) + Θ(?)

?(?) = ?(?−2) + 2?(1) + Θ(?−1 + ?)

Likewise,

?(?) = ?(?−3) + 3?(1) + Θ(?−2+?−1+?)

?(?) = ?(?−?) + ??(1) + Θ\sum\limits_{j=0}^{i-1} (n-j)

Now, such a recurrence can only go upto i = n-1, because otherwise n-i would be less than 1. Therefore,

T(n) = T(1) + (n-1)T(1) + Θ\sum\limits_{j=0}^{n-2} (n-j)

T(n) = T(1) + (n-1)T(1) + Θ((n^2+n-2)/2)

which is O(n^2) .

This happens when the array is already sorted.

Also read: N Queen Problem Analysis

Best case

The best case occurs when the pivot divides the array into two equal parts in every step. Thus we have, k = n/2.

Therefore the recurrence becomes,

?(?) = 2?(?/2) + Θ(?)

Using substitution:

?(?) = 2(2(?(?/4) + Θ(?/2))) + Θ(?)

T(n) = 2^2 T(n/4) + 2Θ(n)

Similarly,

T(n) = 2^k T(n/2^k) + kΘ(n)

This will continue until n = 2^k i.e. until k = logn . Thus,

?(?) = ??(1) + Θ(?)????

which is ?(?????).