Disjoint Sets Data Structure

eben

Presentation Transcript


  1. Disjoint Sets Data Structure

  2. Disjoint Sets • Some applications require maintaining a collection of disjoint sets. • A disjoint-set structure S is a collection of sets $S = \{S_1, \dots, S_k\}$ where $S_i \cap S_j = \emptyset$ for all $i \ne j$ • Each set has a representative, which is a member of the set (usually the minimum, if the elements are comparable)

  3. Disjoint Set Operations • Make-Set(x) – Creates a new set whose only element is x (so x is the representative of the set). • Union(x,y) – Replaces the sets $S_x$ and $S_y$ containing x and y by $S_x \cup S_y$; one of the elements of $S_x \cup S_y$ becomes the representative of the new set. • Find(x) – Returns the representative of the set containing x

  4. Analyzing Operations • We usually analyze a sequence of m operations in total (Make_Set, Find, and Union), n of which are Make_Set operations • Each Union operation decreases the number of sets in the data structure by one, so there cannot be more than n-1 Union operations

  5. Applications • Equivalence Relations (e.g., Connected Components) • Minimum Spanning Trees

  6. Connected Components • Given a graph G, we first preprocess G to build its set of connected components: CONNECTED_COMPONENTS(G) • Later, a series of queries can be executed to check whether two vertices are part of the same connected component: SAME_COMPONENT(u,v)

  7. Connected Components
     CONNECTED_COMPONENTS(G)
       for each vertex v in V[G]
         do MAKE_SET(v)
       for each edge (u,v) in E[G]
         do if FIND_SET(u) != FIND_SET(v)
           then UNION(u,v)

  8. Connected Components
     SAME_COMPONENT(u,v)
       return FIND_SET(u) == FIND_SET(v)
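As a rough illustration, the two procedures can be sketched in Python. The dictionary-based `parent` map and the helper name `find_set(parent, x)` are assumptions made for this sketch, not something the slides specify, and no balancing heuristic is applied here.

```python
def find_set(parent, x):
    """Follow parent links until reaching the representative of x."""
    while parent[x] != x:
        x = parent[x]
    return x


def connected_components(vertices, edges):
    """Build a parent map in which every component shares one representative."""
    parent = {v: v for v in vertices}      # MAKE_SET for every vertex
    for u, v in edges:
        ru, rv = find_set(parent, u), find_set(parent, v)
        if ru != rv:
            parent[ru] = rv                # UNION: hang one root under the other
    return parent


def same_component(parent, u, v):
    """True when u and v have the same representative."""
    return find_set(parent, u) == find_set(parent, v)


parent = connected_components("abcd", [("a", "b"), ("c", "d")])
print(same_component(parent, "a", "b"))    # a and b are joined by an edge
print(same_component(parent, "a", "c"))    # a and c are in different components
```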

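  9. Example [figure: graph on the vertices a, b, c, d, e, f, g, h, i, j]

  10. (b,d) [figure: the same vertices, edge (b,d) processed]

  11. (e,g) [figure: the same vertices, edge (e,g) processed]

  12. (a,c) [figure: the same vertices, edge (a,c) processed]

  13. (h,i) [figure: the same vertices, edge (h,i) processed]

  14. (a,b) [figure: the same vertices, edge (a,b) processed]

  15. (e,f) [figure: the same vertices, edge (e,f) processed]

  16. (b,c) [figure: the same vertices, edge (b,c) processed]

  17. Result [figure: the resulting connected components]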
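The edge sequence listed on the example slides, (b,d), (e,g), (a,c), (h,i), (a,b), (e,f), (b,c), can be replayed in a few lines of Python to see how the collection of sets evolves. This is only a sketch: it stores each set as a plain Python `set` shared by its members, and the function name `replay` is hypothetical.

```python
vertices = "abcdefghij"
edges = [("b", "d"), ("e", "g"), ("a", "c"), ("h", "i"),
         ("a", "b"), ("e", "f"), ("b", "c")]


def replay(vertices, edges):
    """Return the collection of sets after each processed edge."""
    sets = {v: {v} for v in vertices}          # every vertex starts alone
    history = []
    for u, v in edges:
        if sets[u] is not sets[v]:             # different sets: merge them
            merged = sets[u] | sets[v]
            for w in merged:
                sets[w] = merged
        distinct = {frozenset(s) for s in sets.values()}
        history.append(sorted("".join(sorted(s)) for s in distinct))
    return history


history = replay(vertices, edges)
for (u, v), snapshot in zip(edges, history):
    print(f"({u},{v})", snapshot)
```

Under these assumptions the last snapshot groups the vertices as abcd, efg, hi and j; the final edge (b,c) changes nothing because b and c are already in the same set.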

  18. Connected Components • During the execution of CONNECTED-COMPONENTS on an undirected graph G = (V, E) with k connected components, how many times is FIND-SET called? How many times is UNION called? Express your answers in terms of |V|, |E|, and k.

  19. Solution • FIND-SET is called 2|E| times. FIND-SET is called twice on line 4, which is executed once for each edge in E[G]. • UNION is called |V| - k times. Lines 1 and 2 create |V| disjoint sets. Each UNION operation decreases the number of disjoint sets by one. At the end there are k disjoint sets, so UNION is called |V| - k times.
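The two counts can be checked empirically with a small instrumented version of the procedure. The sketch below is an assumption-laden illustration (a bare `parent` dictionary, a hypothetical `count_calls` helper) and uses the ten-vertex example graph from the earlier slides.

```python
def count_calls(vertices, edges):
    """Run CONNECTED_COMPONENTS while counting FIND_SET and UNION calls."""
    parent = {v: v for v in vertices}
    calls = {"find_set": 0, "union": 0}

    def find_set(x):
        calls["find_set"] += 1
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find_set(u), find_set(v)      # two FIND_SET calls per edge
        if ru != rv:
            calls["union"] += 1
            parent[ru] = rv
    k = sum(1 for v in vertices if parent[v] == v)   # one root per component
    return calls, k


vertices = "abcdefghij"
edges = [("b", "d"), ("e", "g"), ("a", "c"), ("h", "i"),
         ("a", "b"), ("e", "f"), ("b", "c")]
calls, k = count_calls(vertices, edges)
print(calls, k)
```

For this graph |V| = 10, |E| = 7 and k = 4, so the formulas predict 14 FIND_SET calls and 6 UNION calls.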

  20. Linked List implementation • We maintain a collection of linked lists; each list corresponds to a single set. • All elements of a set point to the first element of its list, which is the representative • A pointer to the tail is maintained, so elements are inserted at the end of the list [figure: a list holding a, b, c, d]

  21. Union with linked lists

  22. Analysis • Using linked lists, MAKE_SET and FIND_SET are constant-time operations; UNION, however, must update the representative pointer of every element of one of the two sets, and is therefore linear in the worst case • A series of m operations could take $\Theta(m^2)$

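  23. Analysis • Let n be the number of MAKE_SET operations and q-1 the number of UNION operations. A series of n MAKE_SET operations followed by q-1 UNION operations will take $\Theta(n + q^2)$, since the i-th UNION may update i elements and $\sum_{i=1}^{q-1} i = \Theta(q^2)$ • q and n are both of order m, so in total we get $\Theta(m^2)$, which is an amortized cost of $\Theta(m)$ for each operation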
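The quadratic behaviour can be reproduced with a short experiment. The sketch below assumes the unfavourable order in which each UNION appends the ever-growing list onto a fresh single-element list; Python lists stand in for the linked lists, and `naive_chain_updates` is a hypothetical name.

```python
def naive_chain_updates(q):
    """Count representative updates for q MAKE_SETs followed by q-1 UNIONs."""
    rep = {i: i for i in range(q)}             # element -> representative
    members = {i: [i] for i in range(q)}       # representative -> its list
    updates = 0
    for i in range(1, q):
        src, dst = rep[i - 1], rep[i]          # the long list and the new singleton
        for z in members[src]:                 # every element of the long list moves
            rep[z] = dst
            updates += 1
        members[dst].extend(members.pop(src))
    return updates


for q in (10, 100, 1000):
    print(q, naive_chain_updates(q), q * (q - 1) // 2)
```

The i-th UNION moves i elements, so the count matches q(q-1)/2 exactly in this arrangement.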

  24. Improvement – Weighted Union • Always append the shorter list to the longer list. A series of m operations will now cost only $O(m + n \log n)$ • MAKE_SET and FIND_SET are constant time and there are m operations. • For UNION, an element will not change its representative more than log(n) times, because each time it does, the set containing it at least doubles in size. So each element can be updated no more than log(n) times, resulting in $O(n \log n)$ for all UNION operations
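A minimal sketch of the weighted-union heuristic follows. Python lists stand in for the linked lists, the class name `ListDisjointSets` and the `updates` counter are assumptions added for illustration, and ties are broken arbitrarily.

```python
class ListDisjointSets:
    """List-based disjoint sets with the weighted-union heuristic."""

    def __init__(self):
        self.rep = {}        # element -> representative
        self.members = {}    # representative -> list of its elements
        self.updates = 0     # number of representative pointers rewritten

    def make_set(self, x):
        self.rep[x] = x
        self.members[x] = [x]

    def find_set(self, x):
        return self.rep[x]                     # constant time

    def union(self, x, y):
        rx, ry = self.rep[x], self.rep[y]
        if rx == ry:
            return
        if len(self.members[rx]) < len(self.members[ry]):
            rx, ry = ry, rx                    # make rx the longer list
        for z in self.members[ry]:             # only the shorter list is relabelled
            self.rep[z] = rx
            self.updates += 1
        self.members[rx].extend(self.members.pop(ry))


ds = ListDisjointSets()
for i in range(16):
    ds.make_set(i)
for i in range(15):
    ds.union(i, i + 1)                         # same chain as the bad case
print(ds.updates)                              # one update per union here
```

With the weighting in place, the chain that was quadratic before now relabels a single element per UNION.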

  25. Disjoint-Set Forests • Maintain a collection of trees in which each element points to its parent. The root of each tree is the representative of the set • We use two strategies for improving running time • Union by Rank • Path Compression [figure: a tree on the elements c, a, b, f, d]

  26. Make Set
     MAKE_SET(x)
       p(x) = x
       rank(x) = 0

  27. Find Set
     FIND_SET(d)
       if d != p[d]
         p[d] = FIND_SET(p[d])
       return p[d]
     [figure: a tree on the elements c, a, b, f, d]

  28. Union
     UNION(x,y)
       link(findSet(x), findSet(y))
     link(x,y)
       if rank(x) > rank(y)
         then p(y) = x
         else p(x) = y
              if rank(x) = rank(y)
                then rank(y)++
     [figure: two trees, before and after the union]

  29. Analysis • In Union we attach the tree of smaller rank to the tree of larger rank, which results in logarithmic depth. • Path compression can cause a very deep tree to become very shallow • Combining both ideas gives us (without proof) a sequence of m operations in $O(m \, \alpha(n))$, where $\alpha$ is the very slowly growing inverse Ackermann function
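Put together, MAKE_SET, FIND_SET, UNION and link from the preceding slides might look as follows in Python. This is a sketch rather than a reference implementation: the class name is an assumption, and the `x == y` guard in `link` is an addition not shown on the slides, included so that a union of two elements already in the same set does not inflate the rank.

```python
class DisjointSetForest:
    """Disjoint-set forest with union by rank and path compression."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        self.parent[x] = x
        self.rank[x] = 0

    def find_set(self, x):
        if self.parent[x] != x:
            # Path compression: point x directly at the root.
            self.parent[x] = self.find_set(self.parent[x])
        return self.parent[x]

    def link(self, x, y):
        if x == y:                 # guard added for this sketch
            return
        if self.rank[x] > self.rank[y]:
            self.parent[y] = x
        else:
            self.parent[x] = y
            if self.rank[x] == self.rank[y]:
                self.rank[y] += 1

    def union(self, x, y):
        self.link(self.find_set(x), self.find_set(y))


forest = DisjointSetForest()
for v in range(8):
    forest.make_set(v)
forest.union(0, 1)
forest.union(2, 3)
forest.union(0, 2)
print(forest.find_set(0) == forest.find_set(3))
```

The recursive `find_set` mirrors the pseudocode; for very deep trees an iterative two-pass version would avoid Python's recursion limit.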

  30. Exercise • Describe a data structure that supports the following operations: • find(x) – returns the representative of x • union(x,y) – unifies the groups of x and y • min(x) – returns the minimal element in the group of x

  31. Solution • We modify the disjoint set data structure so that each group's representative keeps a reference to the minimal element of the group. • The find operation does not change (O(log n)) • The union operation is similar to the original union operation, and the minimal element of the merged group is the smaller of the minimal elements of the two groups

  32. Example • Executing find(5) [figure]

  33. Example • Executing union(4,6) [figure]
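One possible shape for this structure is sketched below. The class name `MinDisjointSets`, the `minimum` dictionary, and the use of path compression in `find` are assumptions of the sketch; the slides only require that the representative stores the group's minimum.

```python
class MinDisjointSets:
    """Disjoint sets in which each root remembers the minimum of its group."""

    def __init__(self, elements):
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}
        self.minimum = {x: x for x in elements}   # only meaningful at roots

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] > self.rank[ry]:
            rx, ry = ry, rx                       # rx is now the lower-rank root
        if self.rank[rx] == self.rank[ry]:
            self.rank[ry] += 1
        self.parent[rx] = ry
        # The new minimum is the smaller of the two old minima.
        self.minimum[ry] = min(self.minimum[rx], self.minimum[ry])

    def min(self, x):
        return self.minimum[self.find(x)]


groups = MinDisjointSets(range(1, 8))
groups.union(4, 6)
groups.union(6, 2)
print(groups.min(4))
```

Only the root's entry in `minimum` is kept current, so `min` costs one `find` plus a dictionary lookup.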

  34. Exercise • Describe a data structure that supports the following operations: • find(x) – returns the representative of x • union(x,y) – unifies the groups of x and y • deUnion() – undo the last union operation

  35. Solution • We modify the disjoint set data structure by adding a stack that keeps the pairs of representatives merged by each union operation, most recent on top • The find operation stays the same, but we cannot use path compression, since we do not want to modify the structure after union operations

  36. Solution • The union operation is the regular operation plus an additional push of (x,y) onto the stack • The deUnion operation is as follows • (x,y) ← s.pop() • parent(x) ← x • parent(y) ← y

  37. Example • An example of why we cannot use path compression. • Union(8,4) • Find(2) • Find(6) • DeUnion()
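A sketch of the undoable structure, under a few stated assumptions: the class and method names are hypothetical, union by rank is kept (so the stack also records whether a rank was incremented, which the slides do not mention), and a union of two elements already in the same group pushes a `None` marker so that every union has exactly one matching undo.

```python
class UndoableDisjointSets:
    """Disjoint sets with union by rank, no path compression, and undo."""

    def __init__(self, elements):
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}
        self.stack = []                        # one record per union call

    def find(self, x):
        while self.parent[x] != x:             # read-only walk: no compression
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            self.stack.append(None)            # nothing changed, nothing to undo
            return
        if self.rank[rx] > self.rank[ry]:
            rx, ry = ry, rx                    # rx becomes the child root
        bumped = self.rank[rx] == self.rank[ry]
        self.parent[rx] = ry
        if bumped:
            self.rank[ry] += 1
        self.stack.append((rx, ry, bumped))

    def de_union(self):
        record = self.stack.pop()
        if record is None:
            return
        child, root, bumped = record
        self.parent[child] = child             # detach the child root again
        if bumped:
            self.rank[root] -= 1


s = UndoableDisjointSets(range(1, 9))
s.union(1, 2)
s.union(3, 4)
s.union(2, 4)
s.de_union()                                   # undoes union(2, 4) only
print(s.find(1) == s.find(2), s.find(1) == s.find(3))
```

Because `find` never rewrites parent pointers, detaching the recorded child root restores exactly the forest that existed before the last union.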
