Python: given a numpy array of weights, find indices which split the array so that the sum of each split is less than a value

Problem description

I have a 1D array of weights, w, and an array of capacities, c, of the same shape as w. I need to find the smallest array of indices such that when w is split by these indices, the cumsums of the split arrays are less than the corresponding capacities in c. Given an array of weights and capacities as follows:

w = [1,2,3,4,5,6]; c = [3, 12, 7, 6, 12]

I need to find the smallest number of indices 'i' so that the cumsums of the split arrays are less than the corresponding capacities in c. In this case,

i = [2, 3, 5]

The cumsums of split arrays of w formed by i are

[1, 1+2, 1+2+3, 4, 5, 5+6]

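Each element is clearly less than the corresponding element of c. The cumsums are calculated as described in this question: http://stackoverflow.com/questions/34525118/find-cumsum-of-subarrays-split-by-indices-for-numpy-array-efficiently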
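As a rough illustration of those per-piece cumsums, here is a minimal vectorized sketch. The helper name `split_cumsums` is hypothetical, and it assumes the indices are given in `np.split` style (the position where each new piece starts). Under that assumption, and reading i = [2, 3, 5] as the last position of each piece, the split above corresponds to `[3, 4]`.

```python
import numpy as np

def split_cumsums(w, split_indices):
    """Cumulative sums of w that restart at every index in split_indices.

    split_indices follows the np.split convention: each entry is the
    position at which a new piece begins.
    """
    w = np.asarray(w)
    idx = np.asarray(split_indices, dtype=int)
    total = np.cumsum(w)
    # Running total accumulated before each piece starts (0 for the first piece).
    base = np.concatenate(([0], total[idx - 1]))
    # Number of elements in each piece.
    lengths = np.diff(np.concatenate(([0], idx, [len(w)])))
    # Subtract each piece's starting total from every element of that piece.
    return total - np.repeat(base, lengths)

print(split_cumsums([1, 2, 3, 4, 5, 6], [3, 4]).tolist())  # [1, 3, 6, 4, 5, 11]
```

Given a capacity array c with one entry per element of w, the constraint could then be checked with `np.all(split_cumsums(w, idx) < c)`.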

An approximation of the required indices is also fine. But the cumsums should be strictly less than corresponding elements in c.

w is a very large array (100000 elements), so I need a vectorized solution for it to be efficient. As said before, approximations are fine as long as the cumsums are less than c.

Here is what I've tried. I've assumed that the entire c matrix is just one element repeated multiple times (I'm trying to solve a simpler case first and then add complexities). In this case, I just have to ensure that each split array has to have sum less than a given value (somewhat similar to bin packing). I find the indices as follows.

import numpy as np

weights = np.random.randint(1, 21, size=20)  # integers from 1 to 20 inclusive
capacity = 100

# Find cumulative sums and divide by capacity. This gives an approximation of the indices. All elements in the
# first split array have values between 0 and 1, those in the second array have values between 1 and 2,
# and so on. Whenever the integer part changes, a new split array begins. Find the indices from this.
# After taking the ceiling of all elements, values between 0 and 1 become 1, values between
# 1 and 2 become 2, and so on. The places where the values change give the indices. Take diff to find the
# boundaries (of change).
indices = np.diff(np.ceil(np.cumsum(weights) / capacity))
# 0s mark repeated values, nonzero entries mark places where the value changes. Find those indices.
indices = np.where(indices != 0)[0] + 1

This gives me the indices. One caveat is that it can give wrong indices, because the cumulative sum is computed from the start of the array rather than restarting at each split. For example, the cumsum of [1,2,3,2,3] is [1,3,6,8,11]. With a capacity of 5, dividing the cumsum by 5 and taking the ceiling gives [1, 1, 2, 2, 3], which corresponds to the pieces [1,2], [3,2], [3] (np.split indices [2, 4]). But the piece [3,2] sums to exactly 5, which is not strictly less than the capacity; the pieces actually needed are [1,2], [3], [2], [3] (np.split indices [2, 3, 4]). I'm handling this by reducing the capacity: if my actual capacity is 5, I treat it as 4 and then do the above (the value 4 is a pure guess; to be on the safe side I might decrease the capacity even further).
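The caveat can be checked directly. The sketch below wraps the cumsum/ceil heuristic in a hypothetical helper, `ceil_split_indices`, and uses `np.split` to sum each resulting piece. Both helper names and the second test array are illustrative assumptions, not part of the original code.

```python
import numpy as np

def ceil_split_indices(weights, capacity):
    """Cumsum/ceil heuristic for a single constant capacity (np.split-style indices)."""
    bins = np.ceil(np.cumsum(weights) / capacity)
    return np.where(np.diff(bins) != 0)[0] + 1

def piece_sums(weights, indices):
    """Total weight of each piece obtained by splitting weights at indices."""
    return [int(piece.sum()) for piece in np.split(np.asarray(weights), indices)]

weights = [1, 2, 3, 2, 3]
indices = ceil_split_indices(weights, capacity=5)
print(indices.tolist(), piece_sums(weights, indices))  # [2, 4] [3, 5, 3]

# Because the cumsum never restarts, a piece can also exceed the capacity outright.
heavy = [3, 3, 3]
print(piece_sums(heavy, ceil_split_indices(heavy, capacity=5)))  # [3, 6]
```

In the first case the middle piece reaches the capacity exactly; in the second, a piece overshoots it, so the heuristic's output should be verified before it is relied on.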

But I'm not able to extend this method to the case where the capacities are varying. That is, if I have a capacity array of shape (1,5) then I would have to use a different approach, as this approach wouldn't work.
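For reference, a plain-Python sequential baseline for the varying-capacity case might look like the sketch below. It is not the vectorized solution asked for; it only assumes that `c[k]` bounds the running sum of the current piece at position k, and that a new piece may start at any position. The function name `greedy_split_varying` and the sample data are hypothetical.

```python
def greedy_split_varying(weights, capacities):
    """Start a new piece whenever the running sum would reach capacities[pos].

    Returns np.split-style indices. Raises ValueError if a single weight
    is not strictly below its own capacity, since no split can fix that.
    """
    if len(weights) != len(capacities):
        raise ValueError("weights and capacities must have the same length")
    indices, running = [], 0
    for pos, (weight, cap) in enumerate(zip(weights, capacities)):
        if weight >= cap:
            raise ValueError(f"weight at position {pos} cannot fit under its capacity")
        if running + weight >= cap:
            # Extending the current piece would violate the strict bound,
            # so begin a new piece at this position.
            indices.append(pos)
            running = 0
        running += weight
    return indices

# Illustrative data, not taken from the question.
print(greedy_split_varying([2, 3, 4, 1, 5], [4, 6, 5, 10, 7]))  # [2, 4]
```

A loop like this is linear in the length of w, so it can serve as a correctness check for any faster vectorized approximation.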

Solution

w = [1,2,3,1,6,6]; c = [1,3,5, 1, 6, 12]

The only solution to this is

i=[2,3,4,5]

Consider the greedy solution (which, to my understanding, means taking elements until you cannot take any more).

It starts off with a split at 2, so that the cumsums [1, 1+2] are <= [1, 3] in c. However, if the next split is at 4 (as the greedy solution leads to), you run into trouble, since nothing can satisfy the capacity of 1. We should instead have split at 2 and 3.

I suggested using backtracking to look back when this happens, but the running time could spiral out of control. The 100k size seems to call for a linear solution, or an O(n log n) solution at worst. I have ideas for how to do this with dynamic programming, but I'm still figuring out some specifics. I will hopefully update this, or discard the answer after a while. :)
