`
sunwinner
  • 浏览: 202202 次
  • 性别: Icon_minigender_1
  • 来自: 上海
社区版块
存档分类
最新评论

基础数据结构和算法六:Quick sort

阅读更多

Quick sort is probably used more widely than any other. It is popular because it is not difficult to implement, works well for a variety of different kinds of input data, and is substantially faster than any other sorting method in typical applications. The quicksort algorithm’s desirable features are that it is in-place (uses only a small auxiliary stack) and that it requires time proportional to N * logN on the average to sort an array of length N. None of the algorithms that we have so far considered combine these two properties. Furthermore, quicksort has a shorter inner loop than most other sorting algorithms, which means that it is fast in practice as well as in theory. Its primary drawback is that it is fragile in the sense that some care is involved in the implementation to be sure to avoid bad performance. Numerous examples of mistakes leading to quadratic performance in practice are documented in the literature. Fortunately, the lessons learned from these mistakes have led to various improvements to the algorithm that make it of even broader utility, as we shall see.

 

 

Quicksort is a divide-and-conquer method for sorting. It works by partitioning an array into two subarrays, then sorting the subarrays independently. Quicksort is complementary to mergesort: for mergesort, we break the array into two subarrays to be sorted and then combine the ordered subarrays to make the whole ordered array; for quicksort, we rearrange the array such that, when the two subarrays are sorted, the whole array is ordered. In the first instance, we do the two recursive calls before working on the whole array; in the second instance, we do the two recursive calls after working on the whole array. For mergesort, the array is divided in half; for quicksort, the position of the partition depends on the contents of the array.

 

The crux of the method is the partitioning process, which rearranges the array to make the following three conditions hold:

■ The entry a[j] is in its final place in the array, for some j.

■ No entry in a[lo] through a[j-1] is greater than a[j].

■ No entry in a[j+1] through a[hi] is less than a[j].

 

We achieve a complete sort by partitioning, then recursively applying the method.

 

public class Quick {

    // quicksort the array
    public static void sort(Comparable[] a) {
        StdRandom.shuffle(a);
        sort(a, 0, a.length - 1);
    }

    // quicksort the subarray from a[lo] to a[hi]
    private static void sort(Comparable[] a, int lo, int hi) {
        if (hi <= lo) return;
        int j = partition(a, lo, hi);
        sort(a, lo, j - 1);
        sort(a, j + 1, hi);
        assert isSorted(a, lo, hi);
    }

    // partition the subarray a[lo .. hi] by returning an index j
    // so that a[lo .. j-1] <= a[j] <= a[j+1 .. hi]
    private static int partition(Comparable[] a, int lo, int hi) {
        int i = lo;
        int j = hi + 1;
        Comparable v = a[lo];
        while (true) {

            // find item on lo to swap
            while (less(a[++i], v))
                if (i == hi) break;

            // find item on hi to swap
            while (less(v, a[--j]))
                if (j == lo) break;      // redundant since a[lo] acts as sentinel

            // check if pointers cross
            if (i >= j) break;

            exch(a, i, j);
        }

        // put v = a[j] into position
        exch(a, lo, j);

        // with a[lo .. j-1] <= a[j] <= a[j+1 .. hi]
        return j;
    }

    /**
     * ********************************************************************
     * Rearranges the elements in a so that a[k] is the kth smallest element,
     * and a[0] through a[k-1] are less than or equal to a[k], and
     * a[k+1] through a[n-1] are greater than or equal to a[k].
     * *********************************************************************
     */
    public static Comparable select(Comparable[] a, int k) {
        if (k < 0 || k >= a.length) {
            throw new IndexOutOfBoundsException("Selected element out of bounds");
        }
        StdRandom.shuffle(a);
        int lo = 0, hi = a.length - 1;
        while (hi > lo) {
            int i = partition(a, lo, hi);
            if (i > k) hi = i - 1;
            else if (i < k) lo = i + 1;
            else return a[i];
        }
        return a[lo];
    }


    /**
     * ********************************************************************
     * Helper sorting functions
     * *********************************************************************
     */

    // is v < w ?
    private static boolean less(Comparable v, Comparable w) {
        return (v.compareTo(w) < 0);
    }

    // exchange a[i] and a[j]
    private static void exch(Object[] a, int i, int j) {
        Object swap = a[i];
        a[i] = a[j];
        a[j] = swap;
    }


    /**
     * ********************************************************************
     * Check if array is sorted - useful for debugging
     * *********************************************************************
     */
    private static boolean isSorted(Comparable[] a) {
        return isSorted(a, 0, a.length - 1);
    }

    private static boolean isSorted(Comparable[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++)
            if (less(a[i], a[i - 1])) return false;
        return true;
    }
}

 

 

There are several subtle issues with respect to implementing quicksort that are reflected in this code and worthy of mention, because each either can lead to incorrect code or can significantly impact performance. Next, we discuss several of these issues. Later in this section, we will consider three important higher-level algorithmic improvements.

 

Partitioning in place. If we use an extra array, partitioning is easy to implement, but not so much easier that it is worth the extra cost of copying the partitioned version back into the original. A novice Java programmer might even create a new spare array within the recursive method, for each partition, which would drastically slow down the sort.

Stayinginbounds. If the smallest item or the largest item in the array is the partitioning item, we have to take care that the pointers do not run off the left or right ends of the array, respectively. Our partition() implementation has explicit tests to guard againstthiscircumstance.Thetest(j == lo)isredundant,since the partitioning item is at a[lo] and not less than itself. With a similar technique on the right it is not difficult to eliminate both tests.

 

Preserving randomness. The random shuffle puts the array in random order. Since it treats all items in the subarrays uniformly, our implementation of quick sort has the property that its two subarrays are also in random order. This fact is crucial to the predictability of the algorithm’s running time. An alternate way to preserve randomness is to choose a random item for partitioning within partition().

 

Terminating the loop. Experienced programmers know to take special care to ensure that any loop must always terminate, and the partitioning loop for quicksort is no exception. Properly testing whether the pointers have crossed is a bit trickier than it might seem at first glance. A common error is to fail to take into account that the array might contain other items with the same key value as the partitioning item.

 

Handling items with keys equal to the partitioning item’s key. It is best to stop the left scan for items with keys greater than or equal to the partitioning item’s key and the right scan for items with key less than or equal to the partitioning item’s key, as in our implementation. Even though this policy might seem to create unnecessary exchanges involving items with keys equal to the partitioning item’s key, it is crucial to avoiding quadratic running time in certain typical applications. Later, we discuss a better strategy for the case when the array contains a large number of items with equal keys.

 

 

Terminating the recursion. Experienced programmers also know to take special care to ensure that any recursive method must always terminate, and quicksort is again no exception. For instance, a common mistake in implementing quicksort involves not ensuring that one item is always put into position, then falling into an infinite recursive loop when the partitioning item happens to be the largest or smallest item in the array.

 

 

Quicksort uses ~ 2N * lnN compares (and one-sixth that many exchanges) on the average to sort an array of length N with distinct keys. Quicksort uses ~ N^2/2 compares in the worst case, but random shuffling protects against this case.

 

 

Algorithmic improvements

 

If your sort code is to be used a great many times or to sort a huge array (or, in particular, if it is to be used as a library sort that will be used to sort arrays of unknown characteristics), then it is worthwhile to consider the improvements that are discussed in the next few paragraphs.

 

Cutoff to insertion sort. As with most recursive algorithms, an easy way to improve the performance of quicksort is based on the following two observations:

■ Quicksort is slower than insertion sort for tiny subarrays.

■ Being recursive, quicksort’s sort() is certain to call itself for tiny subarrays. Accordingly, it pays to switch to insertion sort for tiny subarrays. A simple change to Algorithm 2.5 accomplishes this improvement: replace the statement

        if (hi <= lo) return;

in sort() with a statement that invokes insertion sort for small subarrays:

        if (hi <= lo + M) { Insertion.sort(a, lo, hi); return; }

The optimum value of the cutoff M is system-dependent, but any value between 5 and 15 is likely to work well in most situations

 

Median-of-three partitioning. A second easy way to improve the performance of quicksort is to use the median of a small sample of items taken from the subarray as the partitioning item. Doing so will give a slightly better partition, but at the cost of com- puting the median. It turns out that most of the available improvement comes from choosing a sample of size 3 and then partitioning on the middle item. As a bonus, we can use the sample items as sentinels at the ends of the array and remove both array bounds tests in partition().

 

public class QuickX {
    private static final int CUTOFF = 8;  // cutoff to insertion sort, must be >= 1

    public static void sort(Comparable[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(Comparable[] a, int lo, int hi) {
        int N = hi - lo + 1;

        // cutoff to insertion sort
        if (N <= CUTOFF) {
            insertionSort(a, lo, hi);
            return;
        }

        // use median-of-3 as partitioning element
        else if (N <= 40) {
            int m = median3(a, lo, lo + N / 2, hi);
            exch(a, m, lo);
        }

        // use Tukey ninther as partitioning element
        else {
            int eps = N / 8;
            int mid = lo + N / 2;
            int m1 = median3(a, lo, lo + eps, lo + eps + eps);
            int m2 = median3(a, mid - eps, mid, mid + eps);
            int m3 = median3(a, hi - eps - eps, hi - eps, hi);
            int ninther = median3(a, m1, m2, m3);
            exch(a, ninther, lo);
        }

        // Bentley-McIlroy 3-way partitioning
        int i = lo, j = hi + 1;
        int p = lo, q = hi + 1;
        while (true) {
            Comparable v = a[lo];
            while (less(a[++i], v))
                if (i == hi) break;
            while (less(v, a[--j]))
                if (j == lo) break;
            if (i >= j) break;
            exch(a, i, j);
            if (eq(a[i], v)) exch(a, ++p, i);
            if (eq(a[j], v)) exch(a, --q, j);
        }
        exch(a, lo, j);

        i = j + 1;
        j = j - 1;
        for (int k = lo + 1; k <= p; k++) exch(a, k, j--);
        for (int k = hi; k >= q; k--) exch(a, k, i++);

        sort(a, lo, j);
        sort(a, i, hi);
    }


    // sort from a[lo] to a[hi] using insertion sort
    private static void insertionSort(Comparable[] a, int lo, int hi) {
        for (int i = lo; i <= hi; i++)
            for (int j = i; j > lo && less(a[j], a[j - 1]); j--)
                exch(a, j, j - 1);
    }


    // return the index of the median element among a[i], a[j], and a[k]
    private static int median3(Comparable[] a, int i, int j, int k) {
        return (less(a[i], a[j]) ?
                (less(a[j], a[k]) ? j : less(a[i], a[k]) ? k : i) :
                (less(a[k], a[j]) ? j : less(a[k], a[i]) ? k : i));
    }

    /**
     * ********************************************************************
     * Helper sorting functions
     * *********************************************************************
     */

    // is v < w ?
    private static boolean less(Comparable v, Comparable w) {
        return (v.compareTo(w) < 0);
    }

    // does v == w ?
    private static boolean eq(Comparable v, Comparable w) {
        return (v.compareTo(w) == 0);
    }

    // exchange a[i] and a[j]
    private static void exch(Object[] a, int i, int j) {
        Object swap = a[i];
        a[i] = a[j];
        a[j] = swap;
    }


    /**
     * ********************************************************************
     * Check if array is sorted - useful for debugging
     * *********************************************************************
     */
    private static boolean isSorted(Comparable[] a) {
        for (int i = 1; i < a.length; i++)
            if (less(a[i], a[i - 1])) return false;
        return true;
    }


    // test client
    public static void main(String[] args) {

        // generate array of N random reals between 0 and 1
        int N = Integer.parseInt(args[0]);
        Double[] a = new Double[N];
        for (int i = 0; i < N; i++) {
            a[i] = Math.random();
        }

        // sort the array
        sort(a);

        // display results
        for (int i = 0; i < N; i++) {
            System.out.println(a[i]);
        }
        System.out.println("isSorted = " + isSorted(a));
    }
}

 

 

Entropy-optimal sorting. Arrays with large numbers of duplicate keys arise frequently in applications. For example, we might wish to sort a large personnel file by year of birth, or perhaps to separate females from males. In such situations, the quicksort implementation that we have considered has acceptable performance, but it can be substantially improved. For example, a subarray that consists solely of items that are equal (just one key value) does not need to be processed further, but our implementation keeps partitioning down to small subarrays. In a situation where there are large numbers of duplicate keys in the input array, the recursive nature of quicksort ensures that subarrays consisting solely of items with keys that are equal will occur often. There is potential for significant improvement, from the linearithmic-time performance of the implementations seen so far to linear-time performance.

 

One straightforward idea is to partition the array into three parts, one each for items with keys smaller than, equal to, and larger than the partitioning item’s key. Accomplishing this partitioning is more complicated than the 2-way partitioning that we have been using, and various different methods have been suggested for the task. It was a classical programming exercise popularized by E. W. Dijkstra as the Dutch National Flag problem, because it is like sorting an array with three possible key values, which might correspond to the three colors on the flag.

Dijkstra’s solution to this problem leads to the remarkably simple partition code shown on the next page. It is based on a single left-to-right pass through the array that maintains a pointer lt such that a[lo..lt-1] is less than v, a pointer gt such that a[gt+1, hi] is greater than v,and a pointer i such that a [lt..i-1]are equal to v and a[i..gt] are not yet examined. Starting with i equal to lo, we process a[i] using the 3-way comparison given us by the Comparable interface (instead of using less()) to directly handle the three possible cases:

 

■ a[i] less than v: exchange a[lt] with a[i] and increment both lt and i

■ a[i] greater than v: exchange a[i] with a[gt] and decrement gt

■ a[i] equal to v: increment i

 

Each of these operations both maintains the invariant and decreases the value of gt-i (so that the loop terminates). Furthermore, every item encountered leads to an exchange except for those items with keys equal to the partitioning item’s key.

public class Quick3way {

    // quicksort the array a[] using 3-way partitioning
    public static void sort(Comparable[] a) {
        StdRandom.shuffle(a);
        sort(a, 0, a.length - 1);
        assert isSorted(a);
    }

    // quicksort the subarray a[lo .. hi] using 3-way partitioning
    private static void sort(Comparable[] a, int lo, int hi) {
        if (hi <= lo) return;
        int lt = lo, gt = hi;
        Comparable v = a[lo];
        int i = lo;
        while (i <= gt) {
            int cmp = a[i].compareTo(v);
            if (cmp < 0) exch(a, lt++, i++);
            else if (cmp > 0) exch(a, i, gt--);
            else i++;
        }

        // a[lo..lt-1] < v = a[lt..gt] < a[gt+1..hi]. 
        sort(a, lo, lt - 1);
        sort(a, gt + 1, hi);
        assert isSorted(a, lo, hi);
    }


    /**
     * ********************************************************************
     * Helper sorting functions
     * *********************************************************************
     */

    // is v < w ?
    private static boolean less(Comparable v, Comparable w) {
        return (v.compareTo(w) < 0);
    }

    // does v == w ?
    private static boolean eq(Comparable v, Comparable w) {
        return (v.compareTo(w) == 0);
    }

    // exchange a[i] and a[j]
    private static void exch(Object[] a, int i, int j) {
        Object swap = a[i];
        a[i] = a[j];
        a[j] = swap;
    }


    /**
     * ********************************************************************
     * Check if array is sorted - useful for debugging
     * *********************************************************************
     */
    private static boolean isSorted(Comparable[] a) {
        return isSorted(a, 0, a.length - 1);
    }

    private static boolean isSorted(Comparable[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++)
            if (less(a[i], a[i - 1])) return false;
        return true;
    }

}

 

 

分享到:
评论

相关推荐

    数据结构与算法教程(C++版)实验和课程设计

    - **性能评估**:对不同的数据结构和算法进行时间和空间复杂度的分析,评估其效率。 - **团队合作**:在项目开发过程中,培养团队协作能力,共同完成一个大型项目的设计与实现。 通过本教程的学习,不仅能够掌握...

    c#数据结构和算法源码

    在IT领域,掌握数据结构和算法是至关重要的,特别是对于编程语言如C#的开发者来说。数据结构是存储和组织数据的方式,而算法是解决问题或执行任务的特定步骤。本资源"**c#数据结构和算法源码**"提供了一个实践学习的...

    php数据结构算法

    在IT领域,尤其是在编程中,数据结构和算法是至关重要的组成部分。PHP,作为一种广泛使用的服务器端脚本语言,虽然在处理动态网站和Web应用程序方面表现出色,但它同样支持实现各种数据结构和算法。以下是根据提供的...

    数据结构与算法分析(Java英文版)

    《数据结构与算法分析(Java英文版)》是一本介绍如何利用Java语言处理实际问题中的数据结构和算法的专业书籍。本书由Robert Lafore编写,通过丰富的插图和浅显易懂的语言,帮助读者理解并掌握数据结构和算法的核心...

    数据结构与算法-作者Aho

    通过对《数据结构与算法》这本书的深入学习,读者不仅能够掌握各种算法和数据结构的基本概念,还能学会如何根据实际问题选择合适的解决方案。无论是计算机科学专业的学生还是从事软件开发工作的工程师,都能从本书中...

    数据结构排序算法

    ### 数据结构排序算法详解 #### 冒泡排序(Bubble Sort) 冒泡排序是一种简单的排序算法。它重复地遍历要排序的数列,一次比较两个元素,如果他们的顺序错误就把他们交换过来。遍历数列的工作是重复进行的,直到没有...

    数据结构与算法C版本

    通过以上对数据结构、算法以及C++实现方面的详细阐述,可以看出《数据结构与算法C版本》这本书不仅覆盖了理论基础,还深入到了具体的实现细节和技术要点,对于学习和掌握数据结构与算法具有很高的参考价值。...

    基于java的数据结构与算法的实现(持续更新)

    Java作为一种广泛应用的编程语言,提供了丰富的工具和库来实现各种数据结构和算法。本项目基于Java语言,深入探讨并实现了多种常用的数据结构与算法,旨在帮助开发者巩固理论知识,提高实际编程能力。 首先,我们要...

    数据结构8中算法排序,配源码和动画演示.rar

    数据结构是计算机科学中的核心部分,它探讨了如何有效地存储和组织数据,以便进行高效的查询、更新和操作。在这个压缩包文件中,我们...通过学习和实践这些排序算法,你将能够更好地掌握数据结构和算法,提升编程能力。

    2021年Acm竞赛常用算法与数据结构.doc

    该文档提供了Acm竞赛中常用的算法与数据结构的知识点总结,包括算法、数据结构、算法与数据结构的结合和实践经验等方面的内容。该文档对Acm竞赛的参赛者和关心算法与数据结构的人来说具有重要的参考价值。

    基于Java的数据结构与算法实现.zip

    通过这些实现,用户可以深入理解各种算法的原理和应用场景,同时也可以作为学习和实践数据结构与算法的参考。 项目的主要特性和功能 1. 排序算法 冒泡排序(Bubble Sort) 选择排序(Selection Sort) 插入...

    数据结构与算法分析 C++语言描述

    ### 数据结构与算法分析——C++语言描述 #### 核心知识点概览 根据所提供的文件信息,本资料涉及《数据结构与...对于想要深入了解数据结构与算法的学生和开发者来说,《数据结构与算法分析》是一本不可多得的好书。

    数据结构常用算法c++实现

    数据结构常用算法c++实现,程序目录如下: Array shuffle Prime test(trial division) Prime test(Miller-Rabin's method) 2D Array Arbitary Integer Linear congruential generator Maximum subarray problem Bit...

Global site tag (gtag.js) - Google Analytics