Convert itertools array into numpy array


Question

I am creating this array:

A=itertools.combinations(range(6),2)

and I have to manipulate this array with numpy, like:

A.reshape(..

If the dimension of A is high, the command list(A) is too slow.

Update 1: I've tried hpaulj's solution; in this specific situation it is a little bit slower. Any idea why?

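import time
import itertools as it
import numpy as np

# Approach 1: build a list of tuples, then convert it to an array
start=time.perf_counter()

A=it.combinations(range(495),3)
A=np.array(list(A))
print(A)

stop=time.perf_counter()
print(stop-start)

# Approach 2: flatten the tuples and feed the scalars to fromiter
start=time.perf_counter()

A=np.fromiter(it.chain(*it.combinations(range(495),3)),dtype=int).reshape(-1,3)
print(A)

stop=time.perf_counter()
print(stop-start)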

Result:

[[  0   1   2]
 [  0   1   3]
 [  0   1   4]
 ..., 
 [491 492 494]
 [491 493 494]
 [492 493 494]]
10.323822
[[  0   1   2]
 [  0   1   3]
 [  0   1   4]
 ..., 
 [491 492 494]
 [491 493 494]
 [492 493 494]]
12.289898

Recommended answer

I'm reopening this because I dislike the linked answer. The accepted answer suggests using

np.array(list(A))  # producing a (15,2) array

But the OP apparently has already tried list(A), and found it to be slow.

Another answer suggests using np.fromiter. But buried in its comments is the note that fromiter builds a 1d array, so the iterable has to yield scalars rather than tuples.

In [102]: A=itertools.combinations(range(6),2)
In [103]: np.fromiter(A,dtype=int)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-103-29db40e69c08> in <module>()
----> 1 np.fromiter(A,dtype=int)

ValueError: setting an array element with a sequence.

So using fromiter with this itertools iterator requires somehow flattening it first.
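One way to do that flattening can be sketched as follows. This is a minimal illustration, not the answer's exact code: it uses itertools.chain.from_iterable, a lazy variant of the starred chain(*...) call, and the variable names are chosen here only for the example.

```python
import itertools
import numpy as np

# combinations yields tuples such as (0, 1), (0, 2), ...
pairs = itertools.combinations(range(6), 2)

# chain.from_iterable walks each tuple in turn, so fromiter
# only ever receives scalar integers.
flat = itertools.chain.from_iterable(pairs)

# fromiter returns a 1d array; reshape restores one row per pair.
arr = np.fromiter(flat, dtype=int).reshape(-1, 2)
print(arr.shape)
```

The reshape works because every combination has the same length, so the flat array's size is always a multiple of that length.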

A quick set of timings suggests that list isn't the slow step. It's converting the list to an array that is slow:

In [104]: timeit itertools.combinations(range(6),2)
1000000 loops, best of 3: 1.1 µs per loop
In [105]: timeit list(itertools.combinations(range(6),2))
100000 loops, best of 3: 3.1 µs per loop
In [106]: timeit np.array(list(itertools.combinations(range(6),2)))
100000 loops, best of 3: 14.7 µs per loop
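Outside IPython, a comparable measurement could be made with the standard timeit module. The sketch below is only an assumption about how one might reproduce the comparison; the helper names make_list and make_array are hypothetical, and the numbers will differ by machine and NumPy version.

```python
import itertools
import timeit
import numpy as np

def make_list():
    # Only materialise the combinations as a list of tuples
    return list(itertools.combinations(range(6), 2))

def make_array():
    # Materialise the list and then convert it to an array
    return np.array(list(itertools.combinations(range(6), 2)))

n = 10_000
t_list = timeit.timeit(make_list, number=n) / n
t_array = timeit.timeit(make_array, number=n) / n

print(f"list:  {t_list * 1e6:.2f} µs per call")
print(f"array: {t_array * 1e6:.2f} µs per call")
```

The difference between the two figures approximates the cost of the list-to-array conversion alone.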

I think the fastest way to use fromiter is to flatten the combinations with an idiomatic use of itertools.chain:

In [112]: timeit np.fromiter(itertools.chain(*itertools.combinations(range(6),2)),dtype=int).reshape(-1,2)
100000 loops, best of 3: 12.1 µs per loop

Not much of a time savings, at least on this small size. (fromiter also takes a count, which shaves off another µs.) With a larger case, range(60), the fromiter version takes half the time of array.
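The count argument mentioned above could be supplied like this. It is a sketch under the assumption that pairs are wanted (k = 2) and that Python 3.8+ is available for math.comb; n, k and n_rows are names introduced for the example.

```python
import itertools
import math
import numpy as np

n, k = 60, 2
n_rows = math.comb(n, k)  # number of k-combinations of n items

flat = itertools.chain.from_iterable(itertools.combinations(range(n), k))

# count is the total number of scalars to read; knowing it up front
# lets fromiter allocate the output once instead of growing it.
arr = np.fromiter(flat, dtype=int, count=n_rows * k).reshape(n_rows, k)
print(arr.shape)
```

If count is larger than the number of items the iterator actually yields, fromiter raises a ValueError, so the size has to be computed exactly.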

A quick search on [numpy] itertools turns up a number of suggestions of pure numpy ways of generating all combinations. itertools is fast for generating pure Python structures, but converting those to arrays is a slow step.
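As one illustration of a pure numpy approach, pairs (combinations of size 2 only) can be read off the upper triangle of an n-by-n index grid with np.triu_indices. This is a sketch of one such technique, not necessarily one of the suggestions that search turns up, and it does not generalise directly to larger combination sizes.

```python
import itertools
import numpy as np

n = 6

# Row and column indices strictly above the diagonal (k=1),
# i.e. every (i, j) with i < j, in row-major order.
rows, cols = np.triu_indices(n, k=1)
pairs = np.column_stack((rows, cols))

# Same rows, in the same order, as the itertools version
expected = np.array(list(itertools.combinations(range(n), 2)))
print(np.array_equal(pairs, expected))
```

No Python-level tuples are created here, so there is no conversion step to pay for.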

A picky point about the question.

A is a generator-like iterator, not an array. list(A) does produce a nested list (a list of tuples) that can be described loosely as an array. But it isn't a np.array, and does not have a reshape method.
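The distinction can be checked directly. The short sketch below uses names (nested, arr, leftover) introduced only for this example, and also shows that the iterator is exhausted after a single pass.

```python
import itertools
import numpy as np

A = itertools.combinations(range(6), 2)
print(hasattr(A, "reshape"))       # the iterator has no reshape method

nested = list(A)                   # list of tuples
print(hasattr(nested, "reshape"))  # neither does a plain list

arr = np.array(nested)             # only now is it an ndarray
print(arr.reshape(3, 10).shape)

leftover = list(A)                 # A was consumed by the first list(A)
print(leftover)
```

Because the iterator is consumed by the first pass, any timing or conversion experiment needs to recreate the combinations object each time.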
