Iterating through a generator of itertools.combinations object takes forever


Question

Edit:

After all these discussions with juanpa and fusion here in the comments, and with Kevin on Python chat, I have come to the conclusion that iterating through a generator takes the same time as iterating through any other object, because the generator itself produces those combinations on the fly. Moreover, fusion's approach worked great for len(arr) up to 1000. It may work up to 5k, but there it is terminated by a timeout on an online judge. Please note that the timeout is not caused by finding min_variance_sub alone: I also have to compute the sum of absolute differences of all possible pairs in min_variance_sub. I am going to accept fusion's approach as the answer to this question, because it answered the question.

But I will also create a new question for that problem statement, more like a Q&A, where I will also answer the question for future visitors. I got the answer from submissions by other candidates, an editorial by the problem setter, and code by the problem setter himself, though I do not understand the approach they used. I will link to the other question when I create it :)

It's HERE

I'm using itertools.combinations on an array, so first I tried something like

aList = [list(x) for x in list(cmb(arr, k))]

where cmb = itertools.combinations, arr is the list, and k is an int. This works fine for len(arr) < 20 or so, but it raised a MemoryError when len(arr) became 50 or more.
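To see why the list version runs out of memory, it helps to count the combinations before building them. The sketch below is only an illustration: the question does not state k, so k = n // 2 is an assumed worst case, and math.comb requires Python 3.8 or later.

```python
import math

# Assumption: k is half the list length, which maximises the number of
# combinations; the question does not give the real value of k.
counts = {}
for n in (20, 50):
    k = n // 2
    counts[n] = math.comb(n, k)  # number of k-element combinations of n items
    print(f"n={n}, k={k}: {counts[n]} combinations")
```

Every one of those combinations becomes a separate list inside aList, so the memory needed grows with the count itself, not with len(arr).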

On a suggestion by Kevin on Python chat, I used a generator, and it was amazingly fast at generating those combinations, like this

aGen = (list(x) for x in cmb(arr, k))

But it's so slow to iterate through this generator object. I tried something like

for p in aGen:
    continue

and even this code seems to take forever.
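A small timing sketch may make the behaviour above easier to see. Creating a generator expression does no work; the combinations are only produced while iterating, so the cost moves from construction to the loop. The input size here (20 elements, k = 10) is an assumption chosen so the example finishes quickly.

```python
import itertools
import time

arr = list(range(20))  # assumed small input so the loop finishes quickly
k = 10                 # assumed subset size

t0 = time.perf_counter()
a_gen = (list(x) for x in itertools.combinations(arr, k))  # nothing is computed yet
t1 = time.perf_counter()

count = sum(1 for _ in a_gen)  # this is where every combination is actually built
t2 = time.perf_counter()

print(f"create generator: {t1 - t0:.6f}s")
print(f"iterate {count} combinations: {t2 - t1:.3f}s")
```

The generator avoids the MemoryError because only one combination exists at a time, but the total number of loop steps is unchanged.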

Kevin also suggested an answer about the kth combination, which was nice, but in my case I actually want to test all the possible combinations and select the one with minimum variance.

So what would be a memory-efficient way of checking all the possible combinations of an array (a list) to find the one with minimum variance? (To be precise, I only need to consider subarrays with exactly k elements.)
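For small inputs, the exhaustive check described here can be streamed so that only one combination is held in memory at a time. This is a minimal sketch, not a solution for large arrays: it still visits every combination, and the sample values of arr and k are assumptions for illustration.

```python
import itertools
import statistics

arr = [1, 10, 2, 9, 3]  # assumed sample data
k = 3                   # assumed subset size

# min() consumes the combinations lazily and keeps only the current best,
# so memory use stays constant while running time still grows with C(n, k).
best = min(itertools.combinations(arr, k), key=statistics.pvariance)
print(best, statistics.pvariance(best))
```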

Thanks for your help.

Recommended answer

You can sort the list of n elements first.

Then slide a moving window of length k along the sorted list.

Then find the minimum variance among the n-k+1 windows.

That minimum is the minimum over all k-element combinations.

 
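def myvar(arr):
    # Population variance of a non-empty sequence.
    n = len(arr)
    mean = sum(arr) / n
    return sum((x - mean) ** 2 for x in arr) / n


input_list = [.......]
# k is the subset size from the question and must be set before the loop runs.

# Sort once so that values close to each other become neighbours.
sorted_list = sorted(input_list)

variance = None
min_variance_sub = None
# Slide a window of k consecutive elements over the sorted list.
for i in range(len(sorted_list) - k + 1):
    sub = sorted_list[i:i+k]
    var = myvar(sub)
    if variance is None or var < variance:
        variance = var
        min_variance_sub = sub
print(min_variance_sub)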
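The loop above recomputes the variance of every window from scratch, which costs roughly n*k operations. One possible variation, not part of the original answer, keeps running sums of the window's values and squares so each step is constant time. The function name min_variance_window_rolling is hypothetical, and the sketch is exact for integers; with floats, the subtraction k*s2 - s1*s1 can lose precision.

```python
def min_variance_window_rolling(values, k):
    """Return (window, variance) for the k-element window of sorted(values)
    with the smallest population variance, using running sums."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    ordered = sorted(values)
    s1 = sum(ordered[:k])                 # sum of the current window
    s2 = sum(x * x for x in ordered[:k])  # sum of squares of the current window
    best_start = 0
    best_score = k * s2 - s1 * s1         # equals k**2 times the population variance
    for start in range(1, len(ordered) - k + 1):
        leaving = ordered[start - 1]       # element dropping out on the left
        entering = ordered[start + k - 1]  # element coming in on the right
        s1 += entering - leaving
        s2 += entering * entering - leaving * leaving
        score = k * s2 - s1 * s1
        if score < best_score:
            best_start, best_score = start, score
    window = ordered[best_start:best_start + k]
    return window, best_score / (k * k)


# Usage with assumed sample data.
window, variance = min_variance_window_rolling([1, 10, 2, 9, 3], 3)
print(window, variance)
```

Sorting dominates the running time here, so the whole search is about n log n instead of n*k.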
