
I am trying to create a data structure for handling subsets of the real line built from half-open intervals of the form $[x,y)$. That is, suppose $X \subseteq \mathbb{R}$ and the data structure supports two types of operations: $add(X, [x,y))$ and $remove(X,[x,y))$. Each of these two queries returns the number of disjoint half-open intervals in $X$. For instance,

> add(X, [2, 10))
> 1
> add(X, [3, 9))
> 1
> add(X, [-1, 1))
> 2
> remove(X, [2, 10))
> 1

I suspect that this can be realised with a binary search tree, but I could not work out the details. I still suspect it can be done in such a way that each query runs in $O(\log n)$ time, where $n$ is the total number of queries. Can you suggest anything?

Yuval Filmus
kissanpentu

1 Answer


Since all stored intervals are non-overlapping, an interval tree is unnecessary. We store the intervals in an AVL tree $T$ sorted by start point and use the fact that bulk deletion of a set of contiguous keys $q_i, \dots, q_{i+k-1}$ can be done in $O(\log n + \log k)$ amortized time (see the references at the end of this answer, one paywalled and one open access).

Let $x=[a,b)$ be an interval with $start(x)=a$ (also written $begin(x)$) and $end(x)=b$. We define $pred(x)$ and $succ(x)$ to be the previous and next intervals in $T$, ordered by start point. Each of these queries takes $O(\log n)$ time, where $n$ is the number of intervals in the tree. The number of disjoint intervals is the number of keys stored in $T$ (the number of leaves, if $T$ keeps its keys only in the leaves).
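As a rough illustration of the two queries, here is a minimal sketch that replaces the AVL tree with two parallel sorted Python lists (`starts` and `ends`, both hypothetical names) and uses binary search from the standard `bisect` module. It assumes that "previous" means the last stored interval whose start is at most the query point and "next" means the first stored interval whose start is strictly greater; the answer does not pin this down.

```python
from bisect import bisect_right

def pred(starts, ends, a):
    """Last stored interval whose start is <= a, or None if there is none."""
    i = bisect_right(starts, a) - 1
    return (starts[i], ends[i]) if i >= 0 else None

def succ(starts, ends, a):
    """First stored interval whose start is > a, or None if there is none."""
    i = bisect_right(starts, a)
    return (starts[i], ends[i]) if i < len(starts) else None

# Stored set: [-1, 1) and [2, 10), kept sorted by start point.
starts, ends = [-1, 2], [1, 10]
print(pred(starts, ends, 3))  # interval starting at or before 3
print(succ(starts, ends, 0))  # interval starting after 0
```

Both lookups are $O(\log n)$ binary searches, matching the bound stated for the tree version.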

Insertion

Let us first examine what happens when we insert an interval $x$ into the tree. We can determine the range of intervals it overlaps in $O(\log n)$ time by performing predecessor and successor queries. We then need to perform one of the following three operations on $T$:

  1. Add a new disjoint interval.
  2. Extend an existing interval.
  3. Combine and potentially extend $k$ existing intervals.

(1a) The first case happens if the inserted interval $x$ doesn't overlap any of the intervals in our set, i.e. $end(pred(x)) \leq start(x) \land start(succ(x)) \geq end(x)$. Since no intervals need to be updated, we simply insert $x$ into $T$.

(2a) The second case happens if $x$ overlaps, contains, or is contained in exactly one other interval $q$. To update the tree, delete $q$ and insert $q'$, where $$q'=[\min(start(x), start(q)), \max(end(x), end(q)))$$

(3a) In the third case, let $Q=\{q_i, q_{i+1}, \dots, q_{i+k-1}\}$ be all $k$ of the intervals that $x$ intersects. The first and last members of $Q$ can be located with predecessor and successor queries around $start(x)$ and $end(x)$. We can again update the tree by deleting all of $Q$ from $T$ and inserting the single interval $$q'=[\min(start(x), start(q_i)), \max(end(x), end(q_{i+k-1})))$$

The first and second cases have $O(\log n)$ time complexity. The third case at first looks like it takes $O(k\log n)$ time, which is undesirable since $k$ can be as large as $n$. However, it has been shown that bulk deletion of a set of contiguous keys in the range $[L, R]=\{q_i, q_{i+1}, \dots, q_{i+k-1}\}$ can be done in amortized $O(\log n + \log k)$ time. Since $k \leq n$, insertion therefore has amortized $O(\log n)$ complexity. Note that the number of disjoint intervals in the tree can be updated after each insertion in constant time: it changes by $1-k$, where $k$ is the number of intervals removed ($k=0$ in case (1a)).
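The three insertion cases can be sketched in a few lines. This is only an illustration under stated assumptions, not the AVL-based structure described above: it keeps the intervals in two parallel sorted Python lists (`starts` and `ends`, hypothetical names), so locating the overlapped range is $O(\log n)$ but the list splice is $O(n)$ in the worst case. Following the conditions in case (1a), intervals that merely touch, such as $[0,1)$ and $[1,2)$, are treated as disjoint.

```python
from bisect import bisect_left, bisect_right

def add(starts, ends, a, b):
    """Insert [a, b) into the disjoint intervals [starts[i], ends[i]).

    Both lists are sorted and modified in place.
    Returns the number of disjoint intervals afterwards.
    """
    if a >= b:                       # empty interval: nothing to do
        return len(starts)
    lo = bisect_right(ends, a)       # first interval with end > a
    hi = bisect_left(starts, b)      # first interval with start >= b
    # Intervals lo .. hi-1 are exactly the k = hi - lo intervals that overlap [a, b).
    if lo < hi:                      # cases (2a) and (3a): merge with the overlapped range
        a = min(a, starts[lo])
        b = max(b, ends[hi - 1])
    # Replace the k overlapped intervals by the single merged one
    # (a plain insertion when k == 0, which is case (1a)).
    starts[lo:hi] = [a]
    ends[lo:hi] = [b]
    return len(starts)

starts, ends = [], []
print(add(starts, ends, 2, 10))   # 1
print(add(starts, ends, 3, 9))    # 1
print(add(starts, ends, -1, 1))   # 2
```

The printed counts match the first three queries of the example in the question. Swapping the two lists for a balanced tree with bulk deletion is what would bring the update cost down to the amortized bound quoted above.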

Deletion

When deleting an interval $x$, we either:

  1. Split an interval into two.
  2. Shorten one or two intervals and delete those in between (if any).

(1b) The first case happens if $x$ is nested within an interval $q$, i.e. $$start(q) \leq start(x) \land end(x) \leq end(q) $$ We then delete $q$ and insert the two resulting intervals $[start(q), start(x))$ and $[end(x), end(q))$, skipping either one if it is empty.

(2b) In the second case, let $Q=\{q_i, q_{i+1}, \dots, q_{i+k-1}\}$ be the intervals that $x$ overlaps. We can delete all of $Q$ from the tree and then reinsert the trimmed versions of $q_i$ and $q_{i+k-1}$ if they were not completely nested in $x$. As in case (3a), this is a bulk deletion that can be done in amortized $O(\log n + \log k)$ time.

As with insertions, we can keep track of the number of disjoint intervals after each deletion in constant time.
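Deletion can be sketched the same way. Again this is a minimal illustration assuming two parallel sorted Python lists (`starts` and `ends`, hypothetical names) in place of the AVL tree, so the splice costs $O(n)$ in the worst case rather than the amortized bound above. Cases (1b) and (2b) fall out of the same code: when only one interval is overlapped, both leftover pieces come from that interval.

```python
from bisect import bisect_left, bisect_right

def remove(starts, ends, a, b):
    """Remove [a, b) from the disjoint intervals [starts[i], ends[i]).

    Both lists are sorted and modified in place.
    Returns the number of disjoint intervals afterwards.
    """
    if a >= b:                       # empty interval: nothing to do
        return len(starts)
    lo = bisect_right(ends, a)       # first interval with end > a
    hi = bisect_left(starts, b)      # first interval with start >= b
    keep_starts, keep_ends = [], []
    if lo < hi:
        if starts[lo] < a:           # left piece of the first overlapped interval survives
            keep_starts.append(starts[lo])
            keep_ends.append(a)
        if ends[hi - 1] > b:         # right piece of the last overlapped interval survives
            keep_starts.append(b)
            keep_ends.append(ends[hi - 1])
    # Drop the overlapped intervals and put back the trimmed pieces, if any.
    starts[lo:hi] = keep_starts
    ends[lo:hi] = keep_ends
    return len(starts)

starts, ends = [-1, 2], [1, 10]
print(remove(starts, ends, 2, 10))  # 1
```

Starting from the state after the three insertions in the question, this reproduces the final query's answer of 1.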

Amortized Complexity of Bulk Updates in AVL-Trees

Bulk Updates and Cache Sensitivity in Search Trees

Throckmorton