10

I was studying union-find, and according to Wikipedia there are two ways to perform the union: union by rank and union by size. My question is: what is the runtime difference between the two, if any?

Intuitively, it feels like union by size would always be better: each time we merge, we increase the depth of every node in one tree by 1, so to minimize overall runtime we want that tree to be the one with fewer nodes, even though the final tree height might be greater.

But does this make a big difference in runtime?

Caleb Stanford
timg

1 Answer

11

If you combine union by rank or union by size with, e.g., path compression, the amortized complexity is the same: $O(m\alpha(m,n))$ for $m$ operations on $n$ elements. Note that Wikipedia uses union by rank to prove the upper bound $O(m\log^*(n))$ because union by rank is simpler to handle in the proof. On the other hand, if you are implementing such a data structure, you usually want to let the user access the sizes of the sets in the structure. In that case union by size is the natural choice: the size array you already maintain doubles as the linking criterion, so you don't waste space on an extra rank array.
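To make the comparison concrete, here is a minimal Python sketch of a union-find structure with path compression that supports either linking rule. The class name `DisjointSet`, the `by` parameter, and the `aux` array are names chosen for this illustration, not taken from Wikipedia or the lecture. The two variants differ only in what the single auxiliary array stores and how it is updated after a link.

```python
class DisjointSet:
    """Union-find with path compression; `by` selects the linking heuristic."""

    def __init__(self, n, by="size"):
        if by not in ("size", "rank"):
            raise ValueError("by must be 'size' or 'rank'")
        self.by = by
        self.parent = list(range(n))
        # One auxiliary array either way: set sizes (all 1) or ranks (all 0).
        self.aux = [1] * n if by == "size" else [0] * n

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, point every node on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # already in the same set
        # Keep the root with the larger size (or higher rank) on top.
        if self.aux[ra] < self.aux[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.by == "size":
            self.aux[ra] += self.aux[rb]   # sizes always add up
        elif self.aux[ra] == self.aux[rb]:
            self.aux[ra] += 1              # rank grows only on ties
        return True

    def size(self, x):
        # Only the size variant can answer this without extra bookkeeping.
        if self.by != "size":
            raise NotImplementedError("set sizes are only tracked with by='size'")
        return self.aux[self.find(x)]
```

For example, after `ds = DisjointSet(6)` and the calls `ds.union(0, 1)`, `ds.union(2, 3)`, `ds.union(0, 2)`, the query `ds.size(3)` reports a set of four elements, which the rank variant could not answer without a second array.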


It's worth mentioning that if we don't combine them with, e.g., path compression, both take $O(\log n)$ time per operation, and this bound holds in the worst case, not only amortized.
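A short sketch of why the logarithmic bound holds for both rules without path compression, writing $s(r)$ for the number of nodes in the tree rooted at $r$ and $d(x)$ for the depth of node $x$ (notation introduced here for illustration):

$$
\begin{aligned}
&\text{Union by size: } d(x) \text{ grows by 1 only when the tree containing } x \text{ is linked under a tree at least as large,}\\
&\qquad \text{so the size of the tree containing } x \text{ at least doubles each time, hence } d(x) \le \lfloor \log_2 n \rfloor.\\
&\text{Union by rank: a root of rank } k+1 \text{ arises only from linking two roots of rank } k,\\
&\qquad \text{so by induction } s(r) \ge 2^{\operatorname{rank}(r)}, \text{ hence } \operatorname{rank}(r) \le \lfloor \log_2 n \rfloor.
\end{aligned}
$$

Without path compression the rank of a root equals the height of its tree, so in both cases a find follows at most $\lfloor \log_2 n \rfloor$ parent pointers.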


This is a good lecture on this topic. It contains references to all the papers proving bounds for the problem. It also explains what $O(m\log^*(n))$ and $O(m\alpha(m,n))$ mean.

plshelp