`
leonzhx
  • 浏览: 786249 次
  • 性别: Icon_minigender_1
  • 来自: 上海
社区版块
存档分类
最新评论
阅读更多

1.  The Union-Find Data Structure

     -- Goal : Maintain a partition of a set X.

     -- FIND: Given x in X, return name of x's group.

     -- UNION: Given x & y, merge groups containing them.

     Basic Solution :

     -- Each x in X points directly to the "leader" of its group.

     -- O(1) FIND [just return x's leader]

     -- O(n log n) total work for n UNIONS [when 2 groups merge, smaller group inherits leader of larger one]

 

2.  Lazy Unions :

    -- Update only one pointer each merge!

    -- When two groups merge in a UNION, make one group's leader (i.e., root of the tree) a child of the other one.

    Pro: UNION reduces to 2 FINDS [r1=FIND(x), r2=FIND(y)] and O(1) extra work [link r1 , r2 together]

    Con: To recover leader of an object, need to follow a path of parent pointers [not just one!] (FIND takes more time)


 

 

3.  The Lazy Union Implementation :

    -- Initially: For all x, parent[x]=x;

    -- FIND(x): Traverse parent pointers from x until you hit the root.

    -- UNION(x; y): s1 =FIND(x); s2 =FIND(y); Reset parent of one of s1 , s2 to be the other.

 

4.  Union by Rank :

    -- Ranks: For each x in X, maintain field rank[x]. (In general rank[x]=1+(max rank of x's children) )

    -- Invariant : For all x in X, rank[x]=maximum number of hops from some leaf to x.

    -- Initially, rank[x]=0 for all x in X

    -- UNION(x, y) : 

        a) s1=FIND(x), s2=FIND(y)

        b) If rank[s1]>rank[s2] then set parent[s2] to s1 else set parent[s1] to s2.

        c) No change to rank of s1 and s2 unless ranks of s1 , s2 were equal, in which case s2's rank goes up by 1

 

5.  Properties of Ranks : 

    (1) For all objects x, rank[x] only goes up over time

    (2) Only ranks of roots can go up [once x a non-root, rank[x] frozen for evermore]

    (3) Ranks strictly increase along a path to the root

 

6.  Rank Lemma :

    -- Consider an arbitrary sequence of UNION (+FIND) operations. For every r in {0 , 1 , 2 , ... }, there are at most n/2^r objects with rank r.

    -- Max rank always <= log2 n

    -- Worst-case running time of FIND, UNION is O(log n). (with Union by Rank)

    Proof :

    -- Claim 1: If x , y have the same rank r , then their subtrees (objects from which can reach x , y) are disjoint.

    Suppose subtrees of x , y have object z in common => exists path z --> x, z --> y => One of x , y is an ancestor of the other => The ancestor has strictly larger rank. Contridiction!

    -- Claim 2: The subtree of a rank-r object has size >= 2^r.

    Base case: Initially all ranks = 0, all subtree sizes = 1

    Inductive step: Nothing to prove unless the rank of some object changes (subtree sizes only go up).

    Interesting case: UNION(x , y), with s1=FIND(x), s2=FIND(y), and rank[s1]=rank[s2]=r => s2's new rank = r + 1 => s2's new subtree size = s2's old subtree size + s1's old subtree size (each at least 2^r by the inductive hypothesis) 2^(r+1).

    -- Claim 1 + Claim 2 imply the Rank Lemma.

 

7.  Path Compression :

    -- Idea: Why bother traversing a leaf-root path multiple times?

    -- Path compression: After FIND(x), install shortcuts (i.e., revise parent pointers) to x's root all along the x --> root path.

    -- Con: Constant-factor overhead to FIND (from "multitasking").

    -- Pro: Speeds up subsequent FINDs.



 

 

8.  Ranks for Path Compression

    -- Important: Maintain all rank fields EXACTLY as without path compression.

    -- Ranks initially all 0

    -- In UNION, new root = old root with bigger rank

    -- When merging two nodes of common rank r , reset new root's rank to r + 1

    -- Rank Lemma still holds ( n=2^r objects with rank r )

    -- Still always have rank[parent[x]]>rank[x] for all non-roots x

 

9.  Hopcroft-Ullman Theorem :

    -- With Union by Rank and path compression, m Union+Find operations take O(m log* n) time, where log* n = the number of times you need to apply log to n before the result is <= 1.

    Proof :

    -- Rank blocks: {0} , {1} , {2} , {3 , 4} , {5 , 6 , ... , 2^4} , { 17 , 18 , ... , 2^16} , {65537 , ... , 2^65536} , ... , { ... , n} (Note: There are O(log* n) different rank blocks because the largest number of  i+1 th block is 2 to the biggest number of ith block ... tower of 2 )

    -- Semantics: Traversal x --> parent(x) is "fast progress" <==> rank[parent[x]] in larger block than rank[x]

    -- Definition: At a given point in time, call object x good if

        a) x or x's parent is a root OR

        b) rank[parent[x]] in larger block than rank[x]

        x is bad otherwise. 

    -- Every FIND visits only O(log* n) good nodes [2 + # of rank blocks = O(log* n)]

    -- Consider a rank block {k + 1; k + 2; : : : ; 2^k}. When a bad node is visited => its parent is changed to one with strictly larger rank => Can only happen 2^k times before x becomes good (forevermore). Due to Rank Lemma: Total number of objects x with final rank in this rank block is Sum(i = k+1 to 2^k ) {n/2^i} <= n/2^k  ==> total visits to bad objects in this rank block <= n

    -- There are only O(log * n) rank blocks , so the total work = visits to good nodes + visits to bad nodes = m * O (log * n) + n * O(log * n) = O((m+n) log * n) = O(m log * n) when m = O(n)

 

10.  The Ackermann Function

    -- Many dierent denitions, all more or less equivalent.

    -- Define Ak (r) for all integers k and r >= 1. (recursively)

        a)  Base case: A0(r ) = r + 1 for all r >= 1.

        b)  In general: For k , r >= 1: Ak (r ) = Apply Ak-1 r times to r

 

    -- A1(r ) = 2r , A2(r) = r 2^r , ...

 

11.  The Inverse Ackermann Function

    -- Denition: For every n >= 4, alpha(n) = minimum value of k such that Ak(2) >= n.

 

12.  Tarjan's Bound

    --  With Union by Rank and path compression, m UNION+FIND operations take O(m alpha(n)) time, where alpha(n) is the inverse Ackerman function.

    Proof :

    -- Define delta(x) = max value of k such that rank[parent[x]] >= Ak (rank[x]) (Note delta(x) only goes up over time)

    -- For all objects x with rank[x] >= 2, then delta(x) <= alpha(n) [Since A alpha(n)(2)>=n while the max rank can be log n]

    -- Denition: An object x is bad if all of the following hold:

    (1) x is not a root

    (2) parent(x) is not a root

    (3) rank(x) >= 2

    (4) x has an ancestor y with alpha(y) = alpha(x)

    x is good otherwise.

    -- the maximum number of good objects on an object-root path <= 1 root + 1 child of root + 1 object with rank 0 + 1 object with rank 1 + 1 object with delta(x) = k for each k = 0, 1, 2, ... , alpha(n) =  O(alpha(n))

    -- Suppose a FIND operation visits a bad object x which has an ancestor y with alpha(y) = alpha(x), p is the parent of x and p' is the parent of y. After FIND , rank[x's new parent] >= rank[p'] >= Ak(rank[y]) >= Ak(rank[p]) >= Ak( Ak (rank[x]) )

    -- Point: Path compression (at least) applies the Ak function to rank[x's parent]

    -- If r=rank[x] ( >=2 ), then after r such pointer updates we have rank[x's parent] >= Ak(Ak(...Ak(r)..)) = Ak+1(r) ==> while x is bad , every r visits increase delta(x) , since delta(x) <= alpha(x) ==> at most r alpha(n) visits to x while it's bad.

    -- Total work of m operations = visits to good objects and visits to bad objects <= O(m alpha(n) ) + Sum( all objects x) { rank[x] alpha(n) } = O(m alpha(n) ) + alpha(n) Sum ( r >= 0 ) r (# of objects with rank r) = O(m alpha(n) ) + alpha(n) Sum(r >0) r n /2^r = O(m alpha(n) ) + alpha(n) n O (1) = O( (m + n) alpha(n) )

 

  • 大小: 37.4 KB
  • 大小: 17.6 KB
分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics