将对象平均分布在多个集合中 [英] Spread objects evenly over multiple collections

查看:91
本文介绍了将对象平均分布在多个集合中的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

场景是存在 n 个对象,它们的大小不同,不均匀地分布在 m 个存储桶中。值区的大小是它包含的所有对象大小的总和。现在碰巧出现了存储桶的大小变化很大的情况。

The scenario is that there are n objects, of different sizes, unevenly spread over m buckets. The size of a bucket is the sum of all of the object sizes that it contains. It now happens that the sizes of the buckets are varying wildly.

如果我想将这些对象均匀地分布在这些存储桶上,以使总大小成为一个好的算法每个水桶大约是一样的?如果该算法在平均分布范围内朝着较小的移动量倾斜,那将是很好的选择。

What would be a good algorithm if I want to spread those objects evenly over those buckets so that the total size of each bucket would be about the same? It would be nice if the algorithm leaned towards less move size over a perfectly even spread.

我在Ruby中有这种幼稚,无效且错误的解决方案。

I have this naïve, ineffective, and buggy solution in Ruby.

buckets = [ [10, 4, 3, 3, 2, 1], [5, 5, 3, 2, 1], [3, 1, 1], [2] ]

avg_size = buckets.flatten.reduce(:+) / buckets.count + 1

large_buckets = buckets.take_while {|arr| arr.reduce(:+) >= avg_size}.to_a

large_buckets.each do |large|
  smallest = buckets.last

  until ((small_sum = smallest.reduce(:+)) >= avg_size)
    break if small_sum + large.last >= avg_size
    smallest << large.pop
  end

  buckets.insert(0, buckets.pop)
end

=> [[3, 1, 1, 1, 2, 3], [2, 1, 2, 3, 3], [10, 4], [5, 5]]


推荐答案

我相信这是装箱问题,因此这是NP难题。您的答案本质上是首次拟合递减启发式算法的变体,这是一个相当不错的启发式算法。也就是说,我相信以下内容会带来更好的结果。

I believe this is a variant of the bin packing problem, and as such it is NP-hard. Your answer is essentially a variant of the first fit decreasing heuristic, which is a pretty good heuristic. That said, I believe that the following will give better results.


  • 使用平衡的二叉树按降序对每个存储桶进行排序。

  • 计算平均大小。

  • 按大小降序对大小小于平均值的存储桶(过小的存储桶)进行排序。平衡的二叉树。

  • 使用平衡的二叉树按大小最大元素的大小(最大元素的大小)排序。

  • Pass1:从具有最大元素的存储桶中移除最大元素;将{9,1}的存储桶放在首位,将{8,5}的存储桶放在第二位。如果将其大小减小到平均值以下,则替换已删除的元素,然后从太大的存储桶的平衡二叉树中删除存储桶;否则将元素放置在最小的存储桶中,然后对两个经过修改的存储桶重新编制索引,以反映新的最小存储桶和具有最大元素的新过大存储桶。继续进行迭代,直到您删除了所有太大的存储桶。

  • Pass2:从最小到最大依次遍历过小的存储桶,然后选择最合适的存储桶从最大的太大的水桶中选取元素,而不会使其变成太大的水桶;从最大到最小依次遍历其余的太大的水桶,从中删除最适合的元素,而不会导致它们变成太小的水桶。对其余的太小桶执行相同的操作。此变体的结果不会像更复杂的变体那样好,因为它不会将存储桶从太大类别转移到太小类别,反之亦然(因此搜索空间会较小),但这也意味着它具有更简单的停止条件(只需遍历所有太小的存储桶然后停止),而如果您不小心,那么复杂的变量可能会导致无限循环。 / li>
  • Sort each individual bucket in descending size order, using a balanced binary tree.
  • Calculate average size.
  • Sort the buckets with size less than average (the "too-small buckets") in descending size order, using a balanced binary tree.
  • Sort the buckets with size greater than average (the "too-large buckets") in order of the size of their greatest elements, using a balanced binary tree (so the bucket with {9, 1} would come first and the bucket with {8, 5} would come second).
  • Pass1: Remove the largest element from the bucket with the largest element; if this reduces its size below the average, then replace the removed element and remove the bucket from the balanced binary tree of "too-large buckets"; else place the element in the smallest bucket, and re-index the two modified buckets to reflect the new smallest bucket and the new "too-large bucket" with the largest element. Continue iterating until you've removed all of the "too-large buckets."
  • Pass2: Iterate through the "too-small buckets" from smallest to largest, and select the best-fitting elements from the largest "too-large bucket" without causing it to become a "too-small bucket;" iterate through the remaining "too-large buckets" from largest to smallest, removing the best-fitting elements from them without causing them to become "too-small buckets." Do the same for the remaining "too-small buckets." The results of this variant won't be as good as they are for the more complex variant because it won't shift buckets from the "too-large" to the "too-small" category or vice versa (hence the search space will be smaller), but this also means that it has much simpler halting conditions (simply iterate through all of the "too-small" buckets and then halt), whereas the complex variant might cause an infinite loop if you're not careful.

这个想法是通过移动Pass1中最大的元素,您可以更轻松地更精确地匹配Pass2中存储桶的大小。您可以使用平衡的二叉树,以便在删除或添加元素后可以快速为存储桶或存储桶的树重新索引,但是可以使用链接列表(平衡的二叉树在最坏情况下的性能更好,但是链接列表可能具有更好的平均用例性能)。通过在Pass2中执行最佳拟合而不是首次拟合,您不太可能执行无用的移动(例如,将大小为10的对象从比平均值大5的桶中移动到比平均值小5的桶中-首先拟合会盲目地播放电影,最合适的选择是查询下一个太大的存储桶以寻找更大的对象,或者从存储桶树中删除太小的存储桶。

The idea is that by moving the largest elements in Pass1 you make it easier to more precisely match up the buckets' sizes in Pass2. You use balanced binary trees so that you can quickly re-index the buckets or the trees of buckets after removing or adding an element, but you could use linked lists instead (the balanced binary trees would have better worst-case performance but the linked lists might have better average-case performance). By performing a best-fit instead of a first-fit in Pass2 you're less likely to perform useless moves (e.g. moving a size-10 object from a bucket that's 5 greater than average into a bucket that's 5 less than average - first fit would blindly perform the movie, best-fit would either query the next "too-large bucket" for a better-sized object or else would remove the "too-small bucket" from the bucket tree).

这篇关于将对象平均分布在多个集合中的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆