合并重叠间隔的算法 [英] Algorithm for merging overlapping intervals

查看:114
本文介绍了合并重叠间隔的算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在寻找一种有效的算法来将重叠的间隔合并到动态间隔数组中.例如,明智的(开始时间,结束时间)

I have been searching for an efficient algorithm to merge overlapping intervals on a dynamic array of intervals. For example, (start time, end time) wise,

[(1, 2), (4, 8), (3, 10)]

成为

[(1, 2), (3, 10)]

,因为(4,8)和(3,10)重叠.重叠意味着两个时间间隔的任何部分共享同一时刻.

after merging because (4, 8) and (3, 10) are overlapped. Overlapping means any part of the two intervals share the same moment.

我知道在给定完整数组时,可以按照开始时间升序对间隔进行排序后使用堆栈进行操作(参考:geeksforgeeks ).但是该算法仅在给定数组不是动态数组的情况下才有效,但是我正在搜索对动态数组有效的数组.例如,数组间隔将被更新并频繁插入,并且间隔应在每次操作中合并.

I know when a full array is given the operation can be done with a stack after sorting the intervals on start time ascending order (reference: geeksforgeeks). But this algorithm is effectively applicable only when given array is non dynamic, but I am searching which will be efficient for dynamic array. For example, array intervals will be updated and inserted frequently and intervals should be merged on each operation.

推荐答案

保留间隔的二进制搜索树(BST),其中关键字为间隔的起点.

Keep a binary search tree (BST) of intervals with the key being the starting point of the interval.

要插入任何新间隔:

  • 在BST中找到小于新间隔起点的最大值(可以在O(log n)中完成).

  • Find the biggest value in the BST smaller than the starting point of the new interval (can be done in O(log n)).

该时间间隔或下一个时间间隔将与新的时间间隔重叠,或者不存在任何重叠(在这种情况下,我们只需执行插入操作即可).

Either that interval or the next interval will overlap with the new interval, or there will be no overlap (in which case we just do an insert).

可能有更多与新间隔重叠的间隔,因此从这里开始,我们需要遍历BST的其余部分(从上面找到的点开始),并将该间隔与任何重叠的间隔合并.

There can be more intervals overlapping with the new interval, so from here we need to iterate through the rest of the BST (starting from the point found above) and merge the interval with any overlapping intervals.

在最坏的情况下(如果间隔与其他间隔重叠的情况),任何给定插入可以取O(n log n),则每个插入的摊销时间将为O(log n)(因为我们可以计算删除元素所完成的工作是插入元素所完成的工作的一部分,该工作总计为O(log n).)

While any given insert can take O(n log n) in the worst case (in case that interval overlaps with e.g. every other interval), the amortised time will be O(log n) per insert (since we can count the work done to delete an element as part of the work done to insert it, which is O(log n) work in total).

一些经过严格测试的C ++代码可以做到这一点:

Some lightly-tested C++ code doing this:

// set<pair<int, int> > can also be used, but this way seems conceptually simpler
map<int, pair<int, int> > intervals;

void insert(int left, int right)
{
  if (!intervals.empty())
  {
    // get the next bigger element
    auto current = intervals.upper_bound(left);
    // checking if not found is not needed because of decrement and how C++ iterators work
    // decrement to get next smaller one instead, but only if we're not that the beginning
    if (current != intervals.begin())
        --current;
    // skip current if it's entirely to the left of the new interval
    if (current->second.second < left)
        ++current;
    // update new starting point if there's an overlap and current is more to the left
    if (current != intervals.end() && current->first <= right)
        left = min(left, current->first);
    // iterate through while there's an overlap (deleting as we go)
    for (; current != intervals.end() && current->first <= right;
           current = intervals.erase(current))
        // update the end point of new interval
        right = max(right, current->second.second);
  }
  // insert our updated interval
  intervals[left] = make_pair(left, right);
}

// FYI: current->first is the start, current->second.second is the end

实时演示.

这篇关于合并重叠间隔的算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆