找到一组数字落入的范围 [英] finding a set of ranges that a number fall in

查看:80
本文介绍了找到一组数字落入的范围的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个200k行的数字范围列表,例如start_position,stop position。
列表中除了不重叠以外,还包括所有类型的重叠。



列表看起来像这样




  • [3,5]

  • [10,30]

  • [15,25]

  • [5,15]

  • [25,35]

  • ...



我需要找到给定数字的范围。并将其重复10万个数字。
例如,如果上面的列表中给定的数字为18,则该函数应返回
[10,30]
[15,25]



我正在使用bisect进行过分复杂的操作,任何人都可以提供有关如何以更快的方式进行操作的线索。



谢谢

解决方案

似乎是覆盖范围问题。由于您要处理大量查询,因此需要尽快给出结果。有一种与此问题相关的算法,您可以查看细分树。。 p>

我们的想法是首先根据给定的间隔构建一个细分树,然后对于每个查询,您都可以像 log(N)一样快,其中 N 是间隔数。



要获取所有可能的 k 个时间间隔,请首先找到覆盖所有具有 log(n),然后遍历其子级以检索所有k个间隔。因此,检索给定数字的所有时间间隔的时间复杂度受 log(N)+ O(k)的限制,其中 k<< n



此算法的运行速度可能会慢到 O(N)当所有间隔都包含给定数字时。在这种情况下,由于输出大小为 N ,因此不存在更好的解决方案。因为算法的复杂度必须至少等于或大于输出大小。



希望它会有所帮助。


I have a 200k lines list of number ranges like start_position,stop position. The list includes all kinds of overlaps in addition to nonoverlapping ones.

the list looks like this

  • [3,5]
  • [10,30]
  • [15,25]
  • [5,15]
  • [25,35]
  • ...

I need to find the ranges that a given number fall in. And will repeat it for 100k numbers. For example if 18 is the given number with the list above then the function should return [10,30] [15,25]

I am doing it in a overly complicated way using bisect, can anybody give a clue on how to do it in a faster way.

Thanks

解决方案

It seems a range coverage problem. Since you have a large amount queries to process, you need to give the result as quickly as possible. There is an algorithm relating to this problem, you can take a look at the Segment Tree.

The idea is to build a Segment Tree first based on the intervals given, and then for each query, you can do as fast as log(N), where N is the number of intervals.

To get all possible k intervals, first find the parent node covering all sub intervals with log(n), and then traverse its children to retrieve all k intervals. So the time complexity of retrieving all intervals for a given number is bounded by log(N) + O(k), where k << n.

This algorithm could be deteriorated to as slow as O(N) when all intervals contains the given number. In this case, since the size of output is N, there does not exist any better solution. Because the complexity of the algorithm must be at least at the same order or higher than the output size.

Hope it helps.

这篇关于找到一组数字落入的范围的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆