合并重叠的跨度/范围 [英] Merging overlapping spans/ranges

查看:69
本文介绍了合并重叠的跨度/范围的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我写的是免费查找时间。日历功能。有很多

的时间跨度有开始结束时间,有些重叠,有些则没有。


要查找空闲时间跨度,我首先需要转换将事件转换为

非重叠时间跨度列表meta-span。


这个漂亮的ascii图应该显示我的意思。


1)---

2)---

3)---

4) -----

5)-----

I am writing a "find-free-time" function for a calendar. There are a lot
of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a
list of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----

------- --- ----- #meta span


我可以遍历元跨度并找到非繁忙时间。


我有写下面的课,但它更像O ^ 2,所以我想知道是否有人想知道更好的方法吗?

######## ##############################

# - * - 编码:latin-1 - * -


""

1)---

2)---

3)---

4)-----

5) -----

------- --------
------- -------- # meta spans
I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if
anybody has an idea for a better approach?
######################################
# -*- coding: latin-1 -*-

"""
1) ---
2) ---
3) ---
4) -----
5) -----
------- --------



"""


类MetaSpans:


"""

填充span元组列表[(开始,结束)],它将使

" meta"

跨度,重叠跨度折叠成一个跨度。

" "


def __init __(自我):

self.spans = []


def add (self,span):

start,end = span

重叠= [span]

non_overlapping = []
$ b self.spans中spn的$ b:

spn_start,spn_end = spn

#跨越跨度的跨度规则

starts_before = spn_start< =开始

ends_after = spn_end> =结束

is_outside = starts_before和ends_after

starts_inside = start< = spn_start< = end

ends_inside = start< = spn_end< = end

overlapps = starts_inside或ends_inside or is _outside

如果重叠:

overlap.append(spn)

else:

non_overlapping.append(spn)

#重叠跨度改为一个跨度

starts = []

ends = []

开始,以重叠结束:

starts.append(开始)

ends.append(结束)

min_start = min(开始)

max_end = max(结束)

non_overlapping.append((min_start,max_end))

self.spans = non_overlapping

def findFreeTime(self,duration):

self.spans.sort()


if __name__ ==''__ main __'':


ms = MetaSpans()

ms.add((0,3))

ms.add((4,7) ))

ms.add((2,5))

ms.add((9,14))

ms.add( (12,17))

打印ms.spans
从日期时间导入日期时间


ms = MetaSpans()

ms.add((datetime(2005,1,1,0,0,0),datetime(2005,1,1,3,0,0)))

ms.add(( datetime(2005年,1 ,1,4,0,0),datetime(2005,1,1,7,0,0)))

ms.add((datetime(2005,1,1,2,0) ,0),datetime(2005,1,1,5,0,0)))

ms.add((datetime(2005,1,1,9,0,0),datetime( 2005,1,1,14,0,0)))

ms.add((datetime(2005,1,1,12,0,0),datetime(2005,1,1, 17,10,

0)))

打印ms.spans


-


hilsen / respect Max M,Denmark

http:// www.mxm.dk/

IT的疯狂科学


"""

class MetaSpans:

"""
Populate with a list of span tuples [(start,end)], and it will make
"meta"
spans, with overlapping spans folded into one span.
"""

def __init__(self):
self.spans = []

def add(self, span):
start, end = span
overlapping = [span]
non_overlapping = []
for spn in self.spans:
spn_start, spn_end = spn
# span rules for iterated spans
starts_before = spn_start <= start
ends_after = spn_end >= end
is_outside = starts_before and ends_after
starts_inside = start <= spn_start <= end
ends_inside = start <= spn_end <= end
overlaps = starts_inside or ends_inside or is_outside
if overlaps:
overlapping.append(spn)
else:
non_overlapping.append(spn)
# overlapping spans are changed to one span
starts = []
ends = []
for start, end in overlapping:
starts.append(start)
ends.append(end)
min_start = min(starts)
max_end = max(ends)
non_overlapping.append( (min_start, max_end) )
self.spans = non_overlapping
def findFreeTime(self, duration):
self.spans.sort()


if __name__ == ''__main__'':

ms = MetaSpans()
ms.add((0,3))
ms.add((4,7))
ms.add((2,5))
ms.add((9,14))
ms.add((12,17))
print ms.spans
from datetime import datetime
ms = MetaSpans()
ms.add((datetime(2005, 1, 1, 0, 0, 0), datetime(2005, 1, 1, 3, 0, 0)))
ms.add((datetime(2005, 1, 1, 4, 0, 0), datetime(2005, 1, 1, 7, 0, 0)))
ms.add((datetime(2005, 1, 1, 2, 0, 0), datetime(2005, 1, 1, 5, 0, 0)))
ms.add((datetime(2005, 1, 1, 9, 0, 0), datetime(2005, 1, 1, 14, 0, 0)))
ms.add((datetime(2005, 1, 1, 12, 0, 0), datetime(2005, 1, 1, 17, 0,
0)))
print ms.spans

--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT''s Mad Science

推荐答案

这是问题所在在

区间图中查找连接的组件。你可以自己实现算法,你可以在这里使用我的图形数据结构:

http://sourceforge.net/projects/pynetwork/

图表方法:

createIntervalgGraph

和:

connectedComponents

可以很快,算法和

解决你的问题几行代码。

如果你需要更多的帮助,那就请求它,我会给小代码

需要。


熊抱,

Bearophile

This is the problem of finding the connected components inside an
interval graph. You can implement the algorithms yourself, of you can
use my graph data structure here:

http://sourceforge.net/projects/pynetwork/

The graph methods:
createIntervalgGraph
And:
connectedComponents
can probably solve your problem quite fast, algorithmically, and with
few lines of code.
If you need more help, then ask for it and I''ll give the little code
needed.

Bear hugs,
Bearophile




Max M写道:

Max M wrote:
我正在写一个find-free-time日历功能。有很多时间跨度和开始结束时间,有些重叠,有些则没有。

要查找空闲时间跨度,我首先需要将事件转换为
a列表非重叠时间跨度元跨度。

这个漂亮的ascii图应该显示我的意思。

1)---
2) - -
3)---
4)-----
5)-----
I am writing a "find-free-time" function for a calendar. There are a lot of time spans with start end times, some overlapping, some not.

To find the free time spans, I first need to convert the events into a list of non overlapping time spans "meta-spans".

This nice ascii graph should show what I mean.

1) ---
2) ---
3) ---
4) -----
5) -----
>> ------- -------- #meta span
然后我可以遍历元跨度并找到非繁忙的时间。

我写过下面的课程,但它更像是O ^ 2,所以我想知道
,如果有人想要更好的方法吗?

############## ########################
# - * - 编码:latin-1 - * -

" "
1)---
2)---
3)---
4)-----
5) - ---
>> ------- --------"""

类MetaSpans:

"""
填充一个跨度元组列表[(开始,结束)],它将
makemeta
跨度,重叠跨度折叠成一个跨度。
""" ;

def __init __(自我):
self.spans = []

def add(self,span):
start,end = span
重叠= [span]
non_overlapping = []
对于self.spans中的spn:
spn_start,spn_end = spn
#span规则的迭代跨度
starts_before = spn_start< = start
ends_after = spn_end> =结束
is_outside = starts_before和ends_after
starts_inside = start< = spn_start< = end
ends_inside = start < = spn_end< = end
overlapps = starts_inside或ends_inside or is_outside
如果重叠:
overlap.append(spn)
否则:
non_overlapping.append(spn)
#重叠跨度更改为一个跨度
开始= []
结束= []
开始,结束重叠:
starts.append(开始)
ends.append(结束)
min_start = min (开始)
max_end = max(结束)
non_overlapping.append((min_start,max_end))
self.spans = non_overlapping

def findFreeTime(self,duration ):
self.spans.sort()

如果__name__ ==''__ main__'':

ms = MetaSpans()
ms.add((2,5))
ms.add((9 ,14))
ms.add((12,17))
打印ms.spans
从日期时间导入日期时间

ms = MetaSpans()
ms.add ((datetime(2005,1,1,0,0,0),datetime(2005,1,1,3,
0,0)))ms.add((datetime(2005,1,1, 4,0,0),datetime(2005,1,1,7,
0,0)))ms.add((datetime(2005,1,1,2,0,0),datetime(2005) ,1,1,5,
0,0)))ms.add((datetime(2005,1,1,9,0,0),datetime(2005,1,1,14,
0,0)))ms.add((datetime(2005,1,1,12,0,0),datetime(2005,1,1,17,
0,0))) print ms.spans
>> ------- -------- # meta spans
I can then iterate through the meta-spans and find non-busy times.

I have written the class below, but it is rather O^2, so I wondered if anybody has an idea for a better approach?
######################################
# -*- coding: latin-1 -*-

"""
1) ---
2) ---
3) ---
4) -----
5) -----
>> ------- -------- """

class MetaSpans:

"""
Populate with a list of span tuples [(start,end)], and it will make "meta"
spans, with overlapping spans folded into one span.
"""

def __init__(self):
self.spans = []

def add(self, span):
start, end = span
overlapping = [span]
non_overlapping = []
for spn in self.spans:
spn_start, spn_end = spn
# span rules for iterated spans
starts_before = spn_start <= start
ends_after = spn_end >= end
is_outside = starts_before and ends_after
starts_inside = start <= spn_start <= end
ends_inside = start <= spn_end <= end
overlaps = starts_inside or ends_inside or is_outside
if overlaps:
overlapping.append(spn)
else:
non_overlapping.append(spn)
# overlapping spans are changed to one span
starts = []
ends = []
for start, end in overlapping:
starts.append(start)
ends.append(end)
min_start = min(starts)
max_end = max(ends)
non_overlapping.append( (min_start, max_end) )
self.spans = non_overlapping
def findFreeTime(self, duration):
self.spans.sort()


if __name__ == ''__main__'':

ms = MetaSpans()
ms.add((0,3))
ms.add((4,7))
ms.add((2,5))
ms.add((9,14))
ms.add((12,17))
print ms.spans
from datetime import datetime
ms = MetaSpans()
ms.add((datetime(2005, 1, 1, 0, 0, 0), datetime(2005, 1, 1, 3, 0, 0))) ms.add((datetime(2005, 1, 1, 4, 0, 0), datetime(2005, 1, 1, 7, 0, 0))) ms.add((datetime(2005, 1, 1, 2, 0, 0), datetime(2005, 1, 1, 5, 0, 0))) ms.add((datetime(2005, 1, 1, 9, 0, 0), datetime(2005, 1, 1, 14, 0, 0))) ms.add((datetime(2005, 1, 1, 12, 0, 0), datetime(2005, 1, 1, 17, 0, 0)))
print ms.spans




我认为以下代码可以满足您的需求。它应该是O(n log n)

- 至少我希望Python能够对字符串列表进行排序:)

当然我已经假设你了您可以一次性使用这些空间

作为列表,并且不需要像在

原始代码中那样一次添加一个。


def get_metaspans(spans):

"""给定一个span元组列表[(start,end)],将生成所有

元跨度,重叠跨度折叠成一个跨度。

""

spans.sort()

spans = iter(spans)

metaspan = spans.next()
跨度跨度


start,end = span

m_start,m_end = metaspan

如果开始> m_end:

收益分数

metaspan = span

elif end> m_end:

metaspan =(m_start,end)

#一旦跨度列表耗尽,需要产生最终的metaspan

yield metaspan


def get_breaks(metaspans):

"""获取metaspans序列之间的所有间隔"""

metaspans = iter(metaspans)

_,prev_end = metaspans.next()
元数据中metaspan的


start,end = metaspan

收益率(prev_end,start)

prev_end =结束


我必须承认我有点生气勃勃,我会倾向于在任何给定的机会下抛出收益

:)最后不得不再次收益

的get_metaspans循环似乎有点不优雅,也许它可能会更好地完成。但它很好处理空列表的方式

优雅 - .next()调用抛出的StopIteration只是

吸收了,呃,生成器。


一点点测试:



I think the following code does what you want. It should be O(n log n)
- at least I hope thats what Python takes to sort the list of spans :)
Of course I''ve assumed you have the spans available to you all at once
as a list, and dont need to add them one at a time as you did in your
original code.

def get_metaspans(spans):
"""Given a list of span tuples [(start,end)], will generate all
meta spans, with overlapping spans folded into one span.
"""
spans.sort()
spans = iter(spans)
metaspan = spans.next()
for span in spans:
start, end = span
m_start, m_end = metaspan
if start > m_end:
yield metaspan
metaspan = span
elif end > m_end:
metaspan = (m_start, end)
# Need to yield the final metaspan once the span list is exhausted
yield metaspan

def get_breaks(metaspans):
"""Gets all the breaks in between a sequence of metaspans"""
metaspans = iter(metaspans)
_, prev_end = metaspans.next()
for metaspan in metaspans:
start, end = metaspan
yield (prev_end, start)
prev_end = end

I must admit I''m a bit of a generatoraholic, I''ll tend to throw yields
at anything given half a chance :) Having to yield once more at the end
of the get_metaspans loop seems a little inelegant, maybe it could be
done a bit better. But its nice the way empty lists are handled
gracefully - the StopIteration thrown by the .next() calls are just
absorbed by the, uh, generatorness.

A little bit of testing:

spans = [(12,13),(0,3),(2,5),(1) ,4),(4,6),(1,2),(8,9),(9,
10)]打印列表(get_metaspans(spans))
[(0,6) ,(8,10),(12,13)]打印列表(get_breaks(get_metaspans(spans)))
spans = [(12, 13), (0,3), (2,5), (1,4), (4,6), (1,2), (8,9), (9, 10)] print list(get_metaspans(spans)) [(0, 6), (8, 10), (12, 13)] print list(get_breaks(get_metaspans(spans)))



[(6,8) ,(10,12)]


这或多或少是需要的吗?


[(6, 8), (10, 12)]

Is that more or less what was required?


线性方法:


你创建一个数组 - 每分钟一个bool。一天24 * 60的参赛作品

就足够了。跨度(开始,结束)是从午夜开始的几分钟。对于每个输入范围,将数组

范围内的(开始,结束)插槽设置为True。

扫描数组并找到metaspans - 连续的False序列。 br />
The linear method:

You create an array - one bool per minute. For one day 24 * 60 entries
is enough. Spans (Start, End) are in minutes from midnight. Set array
slots in range(Start, End) to True for each input span.

Scan the array and find metaspans - contiguous sequences of False.


这篇关于合并重叠的跨度/范围的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆