`
sunwinner
  • 浏览: 202503 次
  • 性别: Icon_minigender_1
  • 来自: 上海
社区版块
存档分类
最新评论

基础数据结构和算法一:UnionFind

阅读更多

 

The problem that we consider is not a toy problem; it is a fundamental computational task, and the solution that we develop is of use in a variety of applications, from percolation in physical chemistry to connectivity in communications networks. We start with a simple solution, then seek to understand that solution’s performance characteristics, which help us to see how to improve the algorithm.

 

 

Dynamic connectivity We start with the following problem specification: The input is a sequence of pairs of integers, where each integer represents an object of some type and we are to interpret the pair p q as meaning "p is connected to q." We assume that "is connected to" is an equivalence relation, which means that it is:

■ Reflexive : p is connected to p.

■ Symmetric : If p is connected to q, then q is connected to p.

 

■ Transitive : If p is connected to q and q is connected to r, then p is connected to r.

 

 

An equivalence relation partitions the objects into equivalence classes. In this case, two objects are in the same equivalence class if and only if they are connected. Our goal is to write a program to filter out extraneous pairs (pairs where both objects are in the same equivalence class) from the sequence. In other words, when the program reads a pair p q from the input, it should write the pair to the output only if the pairs it has seen to that point do not imply that p is connected to q. If the previous pairs do imply that p is connected to q, then the program should ignore the pair p q and proceed to read in the next pair. The figure on the facing page gives an example of this process. To achieve the desired goal, we need to devise a data structure that can remember sufficient information about the pairs it has seen to be able to decide whether or not a new pair of objects is connected. Informally, we refer to the task of designing such a method as the dynamic connectivity problem. This problem arises applications such as the following:

 

Networks. The integers might represent computers in a large network, and the pairs might represent connections in the network. Then, our program determines whether we need to establish a new direct connection for p and q to be able to communicate or whether we can use existing connections to set up a communications path. Or, the integers might represent contact sites in an electrical circuit, and the pairs might represent wires connecting the sites. Or, the integers might represent people in a social network, and the pairs might represent friendships. In such applications, we might need to process millions of objects and billions of connections.

 

 

Variable-name equivalence. In certain programming environ- ments, it is possible to declare two variable names as being equivalent (references to the same object). After a sequence of such declarations, the system needs to be able to determine whether two given names are equivalent. This application is an early one (for the FORTRAN programming language) that motivated the development of the algorithms that we are about to consider.

 

 

Mathematical sets. On a more abstract level, you can think of the integers as belonging to mathematical sets. When we process a pair p q, we are asking whether they belong to the same set.If not, we unite p’s set and q’s set, putting them in the same set.

 

 

To specify the problem, we develop an API that encapsulates the basic operations that we need: initialize, add a connection between two sites, identify the component containing a site, determine whether two sites are in the same component, and count the number of components. Thus, we articulate the following API:

public class UF {
    public UF(int N) {} // initialize N sites with integer names (0 to N-1)
    void union(int p, int q) {} // add connection between p and q
    int find(int p) {} // component identifier for p (0 to N-1)
    // return true if p and q are in the same component
    boolean connected(int p, int q) {}  
    int count() {} // number of components
}

 

 

The union() operation merges two components if the two sites are in different com- ponents, the find() operation returns an integer component identifier for a given site, the connected() operation determines whether two sites are in the same component, and the count() method returns the number of components. We start with N components, and each union() that merges two different components decrements the number of components by 1.

 

public class UF {
    private int[] id;    // id[i] = parent of i
    private int[] sz;    // sz[i] = number of objects in subtree rooted at i
    private int count;   // number of components

    /**
     * Create an empty union find data structure with N isolated sets.
     *
     * @throws java.lang.IllegalArgumentException
     *          if N < 0
     */
    public UF(int N) {
        if (N < 0) throw new IllegalArgumentException();
        count = N;
        id = new int[N];
        sz = new int[N];
        for (int i = 0; i < N; i++) {
            id[i] = i;
            sz[i] = 1;
        }
    }

    /**
     * Return the id of component corresponding to object p.
     *
     * @throws java.lang.IndexOutOfBoundsException
     *          unless 0 <= p < N
     */
    public int find(int p) {
        if (p < 0 || p >= id.length) throw new IndexOutOfBoundsException();
        while (p != id[p]) {
            p = id[p];
        }
        return p;
    }

    /**
     * Return the number of disjoint sets.
     */
    public int count() {
        return count;
    }


    /**
     * Are objects p and q in the same set?
     *
     * @throws java.lang.IndexOutOfBoundsException
     *          unless both 0 <= p < N and 0 <= q < N
     */
    public boolean connected(int p, int q) {
        return find(p) == find(q);
    }


    /**
     * Replace sets containing p and q with their union.
     *
     * @throws java.lang.IndexOutOfBoundsException
     *          unless both 0 <= p < N and 0 <= q < N
     */
    public void union(int p, int q) {
        int i = find(p);
        int j = find(q);
        if (i == j) return;

        // make smaller root point to larger one
        if (sz[i] < sz[j]) {
            id[i] = j;
            sz[j] += sz[i];
        } else {
            id[j] = i;
            sz[i] += sz[j];
        }
        count--;
    }
}

 Analysis of complexity:

 

分享到:
评论

相关推荐

    数据结构:并查集 Union-Find算法(不相交集)原理及C++实现.zip

    数据结构、算法相关的资源 数据结构、算法相关的资源 数据结构、算法相关的资源 数据结构、算法相关的资源 数据结构、算法相关的资源 数据结构、算法相关的资源 数据结构、算法相关的资源 数据结构、算法相关的资源 ...

    数据结构与算法分析——C++语言描述第三版习题答案

    《数据结构与算法分析——C++描述 第3版》是一本全面覆盖数据结构与算法基础知识的经典教材。通过本书的学习,不仅可以掌握数据结构与算法的基本概念,还能深入了解各种数据结构的特点及其适用场景,同时掌握算法...

    UnionFind算法

    **UnionFind算法**,也称为并查集算法,是一种用于解决动态连通性问题的数据结构。该算法主要用于判断图中的两个节点是否属于同一个连通分量,并且可以高效地合并不同的连通分量。UnionFind算法在多种场景下都有广泛...

    算法与数据结构:并查集

    并查集是一种用于处理不相交集合操作的数据结构,它主要包含两个基本操作:查找(Find)和合并(Union)。在并查集中,通常用森林的结构来表示各个集合,每棵树代表一个集合,树的根节点代表集合的标识。 **查找...

    数据结构与算法分析第二版 ---C语言描述(附加答案)

    不相交集ADT8.1 等价关系8.2 动态等价性问题8.3 基本数据结构8.4 灵巧求并算法8.5 路径压缩8.6 按秩求并和路径压缩的最坏情形8.6.1 Union/Find算法分析8.7 一个应用总结练习参考文献第9章 图论算法9.1 若干定义9.1.1...

    数据结构与算法分析+C+++描述+第三版》答案

    在计算机科学领域,《数据结构与算法分析》是一门基础而重要的课程,它主要研究如何有效地组织和处理数据。本书旨在为读者提供全面的数据结构和算法知识,帮助读者理解并掌握这些核心概念。 #### 二、算法分析 **...

    西安电子科技大学网络与继续教育学院数据结构与算法全真试题

    在“西安电子科技大学网络与继续教育学院数据结构与算法全真试题”中,我们可以期待涵盖一系列关键概念和技能的测试题目,这些题目可能涉及到以下几个方面: 1. **基本数据结构**:包括数组、链表、栈、队列、哈希...

    数据结构及其算法详细

    数据结构与算法是计算机科学的基础,对于理解和设计高效的软件至关重要。在这个标题为“数据结构及其算法详细”的资源中,我们可预见到它将全面深入地探讨这个领域,如同一本字典,提供了丰富的知识和信息。 首先,...

    UnionFind-2x2.pdf

    Union-Find数据结构是一种特殊的数据结构,用于管理一组不相交的集合(Disjoint-sets),支持三个基本操作:MAKE-SET、FIND和UNION。这种数据结构广泛应用于解决动态连通性问题,例如图形算法、编译器设计等领域。 ...

    数据结构与算法分析—c语言描述_课后答案

    - **并查集**(Union-Find):用于维护一系列不相交集合的数据结构。 - **路径压缩**(Path Compression):一种优化技术,减少查找根节点的操作时间。 不相交集合在图论、网络分析等领域有着广泛的应用。 ### 第9...

    数据结构和算法的标程库(C/C++实现)

    数据结构和算法是计算机科学的基础,对于理解和解决复杂问题至关重要。这个标程库提供了C/C++语言实现的主要数据结构和算法,涵盖了多个重要的计算概念。下面将详细解释这些知识点: 1. 高精度计算:在C和C++中实现...

    数据结构与算法分析 c++描述(第三版)的答案

    - **列表**:一种线性数据结构,支持在任意位置插入和删除操作。 - **栈**:遵循后进先出(LIFO)原则的数据结构。 - **队列**:遵循先进先出(FIFO)原则的数据结构。 ##### 第4章 树 本章介绍了树形数据结构的...

    算法工程师思维导图—数据结构与算法.pdf

    本资源总共涵盖了12个知识点,涵盖了数据结构和算法的基础知识,涵盖了栈、二维矩阵、直方图、链表、LRU Cache、并查集、岛屿个数、树、二叉树前序、中序、后序遍历、二分查找树、AVL树和红黑树等知识点。

    基于Union-Find的Kruskal算法C++实现

    - `class UnionFind`:自定义的并查集类,包括 `find()`(查找顶点所属集合的根节点)和 `unionSet()`(合并两个集合)等方法。 - `void kruskalMST()`:Kruskal算法的主要函数,进行上述步骤。 在代码设计和变量...

Global site tag (gtag.js) - Google Analytics