找到“最大”值。 O(nlog(n))中的重叠区间对 [英] Finding "maximum" overlapping interval pair in O(nlog(n))

查看:133
本文介绍了找到“最大”值。 O(nlog(n))中的重叠区间对的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

问题陈述

输入
n个间隔; {[s_1,t_1],[s_2,t_2],...,[s_n,t_n]}。

Input set of n intervals; {[s_1,t_1], [s_2,t_2], ... ,[s_n,t_n]}.

输出
对间隔; {[s_i,t_i],[s_j,t_j]},在所有间隔对中最大重叠。

示例

输入间隔:{[1,10],[2,6],[3,15],[5,9]}

input intervals : {[1, 10], [2, 6], [3,15], [5, 9]}

->可能有6个间隔对。在这些对中,[1,10]和[3,15]的最大重叠量为7。

-> There are possible 6 interval pairs. Among those pairs, [1,10] & [3,15] has the largest possible overlap of 7.

输出:{[1,10],[3,15]}

output : {[1,10],[3,15]}

天真算法将是一种蛮力方法,其中所有n个间隔都将相互比较,同时跟踪当前的最大重叠值。在这种情况下,时间复杂度为O(n ^ 2)。

A naive algorithm will be a brute force method where all n intervals get compared to each other, while the current maximum overlap value being tracked. The time complexity would be O(n^2) for this case.

我能够找到许多有关间隔树最大重叠间隔数和最大不重叠间隔集,但此问题一无所获。也许我可以使用上述算法中给出的想法,但是我无法提出一个。

I was able to find many procedures regarding interval trees, maximum number of overlapping intervals and maximum set of non-overlapping intervals, but nothing on this problem. Maybe I would be able to use the ideas given in the above algorithms, but I wasn't able to come up with one.

我花了很多时间试图找出一个好的解决方案,但是我认为我现在需要一些帮助。

I spent many hours trying to figure out a nice solution, but I think I need some help at this point.

任何建议都会有所帮助!

Any suggestions will help!

推荐答案

首先,对时间间隔进行排序:首先按升序从左端点开始,然后—作为次要标准—按右端点以降序排列。对于此答案的其余部分,我将假定间隔已经按照排序的顺序。

First, sort the intervals: first by left endpoint in increasing order, then — as a secondary criterion — by right endpoint in decreasing order. For the rest of this answer, I'll assume that the intervals are already in sorted order.

现在,对于最大可能重叠有两种可能性:

Now, there are two possibilities for what the maximum possible overlap might be:


  • 它可能在一个间隔和它完全覆盖的下一个间隔之间。

  • 它可能在一个区间和它完全不会覆盖的下一个区间之间。

我们可以覆盖通过在时间间隔上进行迭代,在O( n )时间中对这两种情况进行跟踪,并跟踪以下内容:

We can cover both cases in O(n) time by iterating over the intervals, keeping track of the following:


  • 到目前为止,我们已经看到了最大的重叠,以及相关的间隔对。

  • 我们看到的最新间隔,称为 L ,不是完全被其任何前任所涵盖。 (为此,主要的见解是,由于对间隔进行了排序,我们可以很容易地判断出间隔是否被其前任所有对象完全覆盖了,因此是否需要更新 L &mdash ;只需检查当前 L 是否完全覆盖了它,因此我们可以在O(1)时间保持 L 为最新。)

  • the greatest overlap we've seen so far, and the relevant pair of intervals.
  • the latest interval we've seen, call it L, that wasn't completely covered by any of its predecessors. (For this, the key insight is that thanks to the ordering of the intervals, we can easily tell if an interval is completely covered by any of its predecessors — and therefore if we need to update L — by simply checking if it's completely covered by the current L. So we can keep L up-to-date in O(1) time.)

并计算每个区间与 L 的重叠。

and computing each interval's overlap with L.

所以:

result := []
max_overlap := 0
L := sorted_intervals[1]
for interval I in sorted_intervals[2..n]:
    overlap := MIN(L.right, I.right) - I.left
    if overlap >= max_overlap:
        result := [L, I]
        max_overlap := overlap
    if I.right > L.right:
        L := I

所以总成本就是分拣成本时间间隔,可能是O( n log n )时间,但如果可以使用bucket-sort,则可能是O( n )或基数排序或类似内容。

So the total cost is the cost of sorting the intervals, which is likely to be O(n log n) time but may be O(n) if you can use bucket-sort or radix-sort or similar.

这篇关于找到“最大”值。 O(nlog(n))中的重叠区间对的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆