3

Given a sequence of intervals $I_1, I_2, \ldots$, is there an efficient way to detect whether some interval $I_i$ is completely contained in the union of the preceding intervals $I_1, \ldots, I_{i-1}$?

For example, in the sequence $(0, 5), (5, 10), (2, 6)$, the interval $(2,6)$ is covered by the previous intervals, which cover the full range $(0, 10)$.

In contrast, in the sequence $(0, 5), (6, 10), (2, 6)$, the interval $(2,6)$ is not covered by the previous intervals, because the sub-interval $(5,6)$ is not covered.

My initial idea was to use an interval or segment tree to maintain the set of previous intervals, and check before adding each interval whether it is fully contained in the set. However, most algorithms construct the tree efficiently by first pre-sorting the intervals. In contrast, it seems like I need an on-line algorithm that efficiently merges intervals. I also don't need to maintain the original intervals (they can be merged), and lookup speed is not important.

The application here is exception tables in bytecode. I want to determine whether a given bytecode range is already covered by the ranges that precede it.

MattDs17

3 Answers

2

The intersection of a sequence of intervals $I_1, \ldots, I_p$ is $$\left[ \max_{i \leq p} s(I_i), \min_{i \leq p} f(I_i) \right],$$ where $s(I)$ and $f(I)$ denote the start and finish of an interval $I$.

From there it should be simple to find out if the next interval is contained in the intersection.
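As a rough illustration of the formula above, here is a minimal Python sketch. The names `common_intersection` and `contained_in` are made up for this example, and intervals are assumed to be `(start, finish)` tuples.

```python
def common_intersection(intervals):
    """Return (max of starts, min of finishes), or None if that range is empty."""
    start = max(s for s, _ in intervals)
    finish = min(f for _, f in intervals)
    return (start, finish) if start <= finish else None


def contained_in(interval, outer):
    """Check whether `interval` lies inside `outer` (None means an empty range)."""
    if outer is None:
        return False
    return outer[0] <= interval[0] and interval[1] <= outer[1]
```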

Ainsley H.
1

I think I have worked out a solution that takes $O(n \log n)$ time. It relies on a regular interval tree which has $O(\log n + m)$ query (for $m$ overlapping intervals) and $O(\log n)$ insertion and deletion. (Please correct me if you see an error in the algorithm or analysis.)

The algorithm works by iteratively constructing an interval tree and merging overlapping intervals at each step. For each new interval, we find the intervals it overlaps with. If those intervals cover the new interval, we're done; otherwise, we remove the overlaps from the tree and insert one large "merged" interval, then continue:

T <- empty tree
for interval in I:
  overlaps <- query_overlaps(T, interval)
  if sorted(overlaps) is contiguous and spans interval:
    return COVERED // interval lies inside the union of the earlier intervals
  for overlap in overlaps:
    remove(T, overlap)
  new_start <- min(i.start for i in [interval] + overlaps)
  new_end <- max(i.end for i in [interval] + overlaps)
  merged_interval <- (new_start, new_end)
  insert(T, merged_interval)

In each iteration of the loop, we perform:

  1. $O(\log n + m)$ work to query the overlapping intervals (for $n$ total intervals and $m$ values returned)
  2. $O(m \log m + m) = O(m \log m)$ work to sort the $m$ overlapping intervals and check whether they're contiguous (because we merge overlapping intervals, these intervals are guaranteed to be disjoint)
  3. $O(m \log n)$ work to remove the overlapping intervals
  4. $O(m)$ work to iterate the overlaps and find the merged interval
  5. $O(\log n)$ work to insert the new interval

We can show that $\sum_{i=1}^{n} m_i$ is $O(n)$, where $n = |I|$ and $m_i$ is the value of $m$ at step $i$, as follows. Represent the merges as a forest whose leaves are the input intervals and whose internal nodes are the merged intervals. Each step with $m_i \geq 1$ creates one internal node with $m_i + 1$ children (the $m_i$ overlaps plus the new interval), so the number of edges is $\sum_{i : m_i \geq 1} (m_i + 1)$. Each internal node has at least 2 children (you cannot merge a single interval), so the forest has at most $n + \frac{n}{2} + \frac{n}{4} + \ldots = O(n)$ total nodes and consequently $O(n)$ edges, which gives $\sum_{i=1}^{n} m_i = O(n)$.

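Therefore, writing $m_i$ for the number of overlaps found at step $i$, the amount of work done across all iterations is $$ \sum_{i=1}^{|I|} \Big[ O(\log n + m_i) + O(m_i \log m_i) + O(m_i \log n) + O(m_i) + O(\log n) \Big] = O(n \log n), $$ using $\sum_i m_i = O(n)$ and $m_i \leq n$.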
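For concreteness, here is a minimal Python sketch of the merge-as-you-go loop. It is not the interval tree described above: it keeps the merged intervals in two sorted lists and uses `bisect`, so a lookup is $O(\log n)$ but each list update can cost $O(n)$. The name `first_covered` is made up for illustration, and I'm assuming that intervals which merely touch, such as $(0,5)$ and $(5,10)$, count as contiguous, as in the question's first example.

```python
from bisect import bisect_left, bisect_right


def first_covered(intervals):
    """Return the index of the first interval covered by the union of the
    earlier ones, or None if no interval is covered.

    Assumes each interval is a (start, end) tuple with start < end.
    """
    # Merged intervals seen so far, kept sorted, disjoint and non-touching.
    starts, ends = [], []
    for idx, (s, e) in enumerate(intervals):
        # Stored intervals lo..hi-1 overlap or touch (s, e).
        lo = bisect_left(ends, s)     # first stored interval with end >= s
        hi = bisect_right(starts, e)  # number of stored intervals with start <= e
        # Stored intervals are maximal, so (s, e) is covered exactly when a
        # single stored interval contains it.
        if hi - lo == 1 and starts[lo] <= s and e <= ends[lo]:
            return idx
        if lo < hi:
            s = min(s, starts[lo])
            e = max(e, ends[hi - 1])
        # Replace the overlapping run (possibly empty) with one merged interval.
        starts[lo:hi] = [s]
        ends[lo:hi] = [e]
    return None
```

On the two sequences from the question, `first_covered([(0, 5), (5, 10), (2, 6)])` returns `2` and `first_covered([(0, 5), (6, 10), (2, 6)])` returns `None`. Swapping the two lists for a balanced tree or an interval tree would recover the $O(\log n)$ update cost assumed in the analysis.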

MattDs17
0

Note that $$I_i \subseteq \bigcup_{k < i} I_k \iff I_i \cap \overline{\bigcup_{k < i} I_k} = \emptyset \iff \Big(\big((I_i \setminus I_1) \setminus I_2\big) \cdots \setminus I_{i - 1}\Big) = \emptyset$$ You should be able to decide this in linear time by subtracting $I_1, I_2, \ldots, I_{i - 1}$ from $I_i$ one at a time and checking whether the resulting set is empty.
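A minimal Python sketch of the subtraction idea, under a few assumptions: `is_covered` is a made-up name, intervals are `(start, end)` tuples with `start < end`, and single endpoints are ignored, so $(0,5)$ followed by $(5,10)$ leaves no gap. Since subtracting an interval from the middle of another leaves two pieces, the sketch tracks a list of leftover pieces rather than a single interval, and it rescans that list on every subtraction, so it is not tuned for speed.

```python
def is_covered(target, previous):
    """Return True if `target` lies inside the union of the intervals in `previous`."""
    remaining = [target]  # pieces of the target not yet covered
    for ps, pe in previous:
        next_remaining = []
        for s, e in remaining:
            if pe <= s or e <= ps:
                # No overlap: the piece survives unchanged.
                next_remaining.append((s, e))
                continue
            if s < ps:
                next_remaining.append((s, ps))  # part left of the subtracted interval
            if pe < e:
                next_remaining.append((pe, e))  # part right of the subtracted interval
        remaining = next_remaining
        if not remaining:
            return True  # nothing left, so the target is fully covered
    return not remaining
```

With the question's examples, `is_covered((2, 6), [(0, 5), (5, 10)])` is `True` and `is_covered((2, 6), [(0, 5), (6, 10)])` is `False`.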

Knogger