具有子集匹配维的间隔树? [英] Interval tree with added dimension of subset matching?

查看:97
本文介绍了具有子集匹配维的间隔树?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这是一个关于复杂问题的算法问题.基础是这样的:

This is an algorithmic question about a somewhat complex problem. The foundation is this:

基于可用空位保留空位的调度系统.插槽具有某些条件,我们称它们为 tags .如果可用插槽的标签集是保留插槽的超集,则保留将通过这些标签将其与可用插槽匹配.

A scheduling system based on available slots and reserved slots. Slots have certain criteria, let's call them tags. A reservation is matched to an available slot by those tags, if the available slot's tag set is a superset of the reserved slot.

作为一个具体的例子,采取这种情况:

As a concrete example, take this scenario:

11:00  12:00  13:00
+--------+
| A, B   |
+--------+
       +--------+
       | C, D   |
       +--------+

可以在11:00到12:30的时间之间为标签AB进行预订,从12:00到13:30可以预订CD,并且有一个从大约12:00到12:30重叠.

Between the times of 11:00 to 12:30 reservations for the tags A and B can be made, from 12:00 to 13:30 C and D is available, and there's an overlap from about 12:00 to 12:30.

11:00  12:00  13:00
+--------+
| A, B   |
+--------+
       +--------+
       | C, D   |
       +--------+
  xxxxxx
  x A  x
  xxxxxx

这里已经预订了A,因此在11:15-ish和12:00-ish之间不能再预订AB.

Here a reservation for A has been made, so no other reservations for A or B can be made between 11:15-ish and 12:00-ish.

简而言之,这就是想法.可用插槽没有具体限制:

That's the idea in a nutshell. There are no specific limitations for the available slots:

  • 可用插槽可以包含任意数量的标签
  • 任意数量的插槽可以随时重叠
  • 插槽的长度是任意的
  • 预订中可以包含任意数量的标签

系统中唯一需要遵守的规则是:

The only rule that needs to be obeyed in the system is:

  • 添加预订时,至少有一个剩余的可用插槽必须与预订中的所有标签匹配

要澄清一下:如果同时有两个带有标记A的可用插槽,则可以同时为A保留两个,但不能再保留.

To clarify: when there are two available slots at the same time with, say, tag A, then two reservations for A can be made at that time, but no more.

我可以通过修改间隔树的实现来工作.作为快速概述:

I have that working with a modified implementation of an interval tree; as a quick overview:

  • 将所有可用的插槽添加到间隔树(保留重复项/重叠项)
  • 所有保留的插槽都被迭代,并且:
    • 从树中查询所有与预订时间匹配的可用空位
    • 将与预订标签匹配的第一个切片,并从树中删除该切片
    • all available slots are added to the interval tree (duplicates/overlaps are preserved)
    • all reserved slots are iterated and:
      • all available slots matching the time of the reservation are queried from the tree
      • the first of those matching the reservation's tags is sliced and the slice removed from the tree

      该过程完成后,剩下的就是可用插槽的剩余部分,我可以查询是否可以在特定时间进行新的预订并添加它.

      When that process is finished, what's left are the remaining slices of available slots, and I can query whether a new reservation can be made for a particular time and add it.

      数据结构看起来像这样:

      Data structures look something like this:

      {
        type: 'available', 
        begin: 1497857244, 
        end: 1497858244, 
        tags: [{ foo: 'bar' }, { baz: 42 }]
      }
      {
        type: 'reserved', 
        begin: 1497857345, 
        end: 1497857210, 
        tags: [{ foo: 'bar' }]
      }
      

      标签本身就是键值对象,它们的列表是标签集".如果有帮助,可以将其序列化;到目前为止,我使用的是Python set类型,它使比较容易.插槽开始/结束时间是树中的UNIX时间戳.我并不特别喜欢这些特定的数据结构,如果有用的话,可以对其进行重构.

      Tags are themselves key-value objects, a list of them is a "tag set". Those could be serialised if it helps; so far I'm using a Python set type which makes comparison easy enough. Slot begin/end times are UNIX time stamps within the tree. I'm not particularly married to these specific data structures and can refactor them if it's useful.

      我面临的问题是,这无法正常运行.保留偶尔会潜入与其他保留相冲突的系统中,我还无法弄清楚这种情况如何发生.当标签以复杂的方式重叠时,它也不太聪明,需要计算最佳分布,以便所有预留都可以尽可能地适合可用的插槽.实际上,目前尚不确定在重叠方案中如何将预留与可用插槽匹配.

      The problem I'm facing is that this doesn't work bug-free; every once in a while a reservation sneaks its way into the system that conflicts with other reservations, and I couldn't yet figure out how that can happen exactly. It's also not very clever when tags overlap in a complex way where the optimal distribution needs to be calculated so all reservations can be fit into the available slots as best as possible; in fact currently it's non-deterministic how reservations are matched to available slots in overlapping scenarios.

      我想知道的是:间隔树最适合此用途,但是我当前的将标签集匹配作为附加维度添加的系统笨拙且固定; 是否有可以优雅处理的数据结构或算法?

      What I want to know is: interval trees are mostly great for this purpose, but my current system to add tag set matching as an additional dimension to this is clunky and bolted-on; is there a data structure or algorithm that can handle this in an elegant way?

      必须支持的操作:

      1. 向系统查询与某些标签集匹配的可用插槽(考虑到可能会降低可用性但本身不属于所述标签集的预留;例如,在上面的示例中,查询B的可用性).
      2. 确保无法将没有匹配的可用插槽的预订添加到系统中.
      1. Querying the system for available slots that match certain tag sets (taking into account reservations that may reduce availability but are not themselves part of said tag set; e.g. in the example above querying for an availability for B).
      2. Ensuring no reservations can be added to the system which don't have a matching available slot.

      推荐答案

      可以使用约束编程来解决您的问题.在python中,可以使用 python-constraint 库来实现.

      Your problem can be solved using constraint programming. In python this can be implemented using the python-constraint library.

      首先,我们需要一种方法来检查两个插槽是否彼此一致.如果两个插槽共享一个标签并且它们的框重叠,则此函数返回true.在python中,可以使用以下函数实现

      First, we need a way to check if two slots are consistent with each other. this is a function that returns true if two slots share a tag and their rimeframes overlap. In python this can be implemented using the following function

      def checkNoOverlap(slot1, slot2):
          shareTags = False
          for tag in slot1['tags']:
              if tag in slot2['tags']:
                  shareTags = True
                  break    
          if not shareTags: return True
          return not (slot2['begin'] <= slot1['begin'] <= slot2['end'] or 
                      slot2['begin'] <= slot1['end'] <= slot2['end'])
      

      我不确定您是否希望标记是完全相同的(例如{foo:bar}等于{foo:bar})还是仅是键(例如{foo:bar}等于{foo:qux}),但是您可以在上面的函数中进行更改.

      I was not sure whether you wanted the tags to be completely the same (like {foo: bar} equals {foo: bar}) or only the keys (like {foo: bar} equals {foo: qux}), but you can change that in the function above.

      我们可以将python-constraint模块用于您请求的两种功能.

      We can use the python-constraint module for the two kinds of functionality you requested.

      第二个功能是最简单的.为了实现这一点,我们可以使用函数isConsistent(set),该函数将提供的数据结构中的插槽列表作为输入.然后该函数会将所有插槽提供给python-constraint,并检查插槽列表是否一致(没有2个不应重叠,重叠的插槽)并返回一致性.

      The second functionality is the easiest. To implement this, we can use the function isConsistent(set) which takes a list of slots in the provided data structure as input. The function will then feed all the slots to python-constraint and will check if the list of slots is consistent (no 2 slots that shouldn't overlap, overlap) and return the consistency.

      def isConsistent(set):
              #initialize python-constraint context
              problem = Problem()
              #add all slots the context as variables with a singleton domain
              for i in range(len(set)):
                  problem.addVariable(i, [set[i]])        
              #add a constraint for each possible pair of slots
              for i in range(len(set)):
                  for j in range(len(set)):
                      #we don't want slots to be checked against themselves
                      if i == j:
                          continue
                      #this constraint uses the checkNoOverlap function
                      problem.addConstraint(lambda a,b: checkNoOverlap(a, b), (i, j))
              # getSolutions returns all the possible combinations of domain elements
              # because all domains are singleton, this either returns a list with length 1 (consistent) or 0 (inconsistent)
              return not len(problem.getSolutions()) == 0
      

      每当用户想要添加预留位置时,都可以调用此函数.可以将输入插槽添加到已经存在的插槽列表中,并可以检查其一致性.如果一致,则保留新的插槽.否则,新广告位会重叠,因此应该拒绝.

      This function can be called whenever a user wants to add a reservation slot. The input slot can be added to the list of already existing slots and the consistency can be checked. If it is consistent, the new slot an be reserverd. Else, the new slot overlaps and should be rejected.

      这个问题有点棘手.我们可以使用与上面相同的功能,并进行一些重大更改.现在,我们希望将所有可能的插槽添加到现有的插槽中,而不是将新的插槽与现有的插槽一起添加.然后,我们可以检查所有可能的插槽与保留的插槽的一致性,并向约束系统询问是否有一致的组合.

      This problem is a bit trickier. We can use the same functionality as above with a few significant changes. Instead of adding the new slot together with the existing slot, we now want to add all possible slots to the already existing slots. We can then check the consistency of all those possible slots with the reserved slots and ask the constraint system for the combinations that are consistent.

      因为如果不加任何限制,可能的插槽数将是无限的,因此我们首先需要为程序声明一些参数:

      Because the number of possible slots would be infinite if we didn't put any restrictions on it, we first need to declare some parameters for the program:

      MIN = 149780000 #available time slots can never start earlier then this time
      MAX = 149790000 #available time slots can never start later then this time
      GRANULARITY = 1*60 #possible time slots are always at least one minut different from each other
      

      我们现在可以继续执行主要功能.看起来很像一致性检查,但是我们现在添加一个变量来发现所有可用的插槽,而不是用户的新插槽.

      We can now continue to the main function. It looks a lot like the consistency check, but instead of the new slot from the user, we now add a variable to discover all available slots.

      def availableSlots(tags, set):
          #same as above
          problem = Problem()
          for i in range(len(set)):
              problem.addVariable(i, [set[i]])
          #add an extra variable for the available slot is added, with a domain of all possible slots
          problem.addVariable(len(set), generatePossibleSlots(MIN, MAX, GRANULARITY, tags))
          for i in range(len(set) +1):
              for j in range(len(set) +1):
                  if i == j:
                      continue
                  problem.addConstraint(lambda a, b: checkNoOverlap(a, b), (i, j))
          #extract the available time slots from the solution for clean output
          return filterAvailableSlots(problem.getSolutions())
      

      我使用一些辅助功能来使代码更整洁.它们包含在这里.

      I use some helper functions to keep the code cleaner. They are included here.

      def filterAvailableSlots(possibleCombinations):
          result = []
          for slots in possibleCombinations:
              for key, slot in slots.items():
                  if slot['type'] == 'available':
                      result.append(slot)
      
          return result
      
      def generatePossibleSlots(min, max, granularity, tags):
          possibilities = []
          for i in range(min, max - 1, granularity):
              for j in range(i + 1, max, granularity):
                  possibleSlot = {
                                    'type': 'available',
                                    'begin': i,
                                    'end': j,
                                    'tags': tags
                  }
                  possibilities.append(possibleSlot)
          return tuple(possibilities)
      

      您现在可以将getAvailableSlots(tags,set)函数与需要可用插槽的标签以及一组已经保留的插槽一起使用.请注意,此函数实际上会返回所有一致的可能的时隙,因此不会费力查找最大长度之一或进行其他优化.

      You can now use the function getAvailableSlots(tags, set) with the tags for which you want the available slots and a set of already reserved slots. Note that this function really return all the consistent possible slots, so no effort is done to find the one of maximum lenght or for other optimalizations.

      希望这会有所帮助! (按照您在我的pycharm中所述,它可以正常工作)

      Hope this helps! (I got it to work as you described in my pycharm)

      这篇关于具有子集匹配维的间隔树?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆