Convert itertools array into numpy array
Question
I'm creating this array:
A=itertools.combinations(range(6),2)
and I have to manipulate this array with numpy, like:
A.reshape(..
If the dimension of A is high, the command list(A) is too slow.
Update 1: I've tried hpaulj's solution. In this specific situation it is a little bit slower. Any idea why?
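import itertools as it
import time
import numpy as np

# Approach 1: build a list of tuples, then convert it to an array
start = time.perf_counter()
A = it.combinations(range(495), 3)
A = np.array(list(A))
print(A)
stop = time.perf_counter()
print(stop - start)

# Approach 2: flatten with chain, feed fromiter, then reshape
start = time.perf_counter()
A = np.fromiter(it.chain(*it.combinations(range(495), 3)), dtype=int).reshape(-1, 3)
print(A)
stop = time.perf_counter()
print(stop - start)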
Result:
[[ 0 1 2]
[ 0 1 3]
[ 0 1 4]
...,
[491 492 494]
[491 493 494]
[492 493 494]]
10.323822
[[ 0 1 2]
[ 0 1 3]
[ 0 1 4]
...,
[491 492 494]
[491 493 494]
[492 493 494]]
12.289898
Recommended answer
I'm reopening this because I dislike the linked answer. The accepted answer suggests using
np.array(list(A)) # producing a (15,2) array
But the OP apparently has already tried list(A), and found it to be slow.
Another answer suggests using np.fromiter. But buried in its comments is the note that fromiter only builds a 1d array, so it expects an iterable of scalars rather than tuples.
In [102]: A=itertools.combinations(range(6),2)
In [103]: np.fromiter(A,dtype=int)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-103-29db40e69c08> in <module>()
----> 1 np.fromiter(A,dtype=int)
ValueError: setting an array element with a sequence.
So using fromiter with this itertools object requires somehow flattening the iterator.
A quick set of timings suggests that list isn't the slow step. It's converting the list to an array that is slow:
In [104]: timeit itertools.combinations(range(6),2)
1000000 loops, best of 3: 1.1 µs per loop
In [105]: timeit list(itertools.combinations(range(6),2))
100000 loops, best of 3: 3.1 µs per loop
In [106]: timeit np.array(list(itertools.combinations(range(6),2)))
100000 loops, best of 3: 14.7 µs per loop
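The same comparison can be reproduced outside IPython with the standard timeit module. This is only a rough sketch: the helper names make_list and make_array and the sizes n=60, k=3 are illustrative choices, not from the original answer, and the absolute numbers will vary by machine and NumPy version.

```python
import itertools
import timeit

import numpy as np


def make_list(n=60, k=3):
    # Only materialize the combinations as a Python list of tuples.
    return list(itertools.combinations(range(n), k))


def make_array(n=60, k=3):
    # Materialize the list and then convert it to a 2d NumPy array.
    return np.array(make_list(n, k))


runs = 5
t_list = min(timeit.repeat(make_list, number=runs, repeat=3))
t_array = min(timeit.repeat(make_array, number=runs, repeat=3))

print(f"list only:    {t_list / runs:.6f} s per call")
print(f"list + array: {t_array / runs:.6f} s per call")
```

The gap between the two figures is the cost of the list-to-array conversion alone.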
I think the fastest way to use fromiter is to flatten the combinations with an idiomatic use of itertools.chain:
In [112]: timeit np.fromiter(itertools.chain(*itertools.combinations(range(6),2)),dtype=int).reshape(-1,2)
100000 loops, best of 3: 12.1 µs per loop
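As a plain script, the flattening step might look like the sketch below. It swaps in itertools.chain.from_iterable for chain(*...), which is my substitution rather than what was timed above: chain(*...) unpacks every tuple into an argument list before chaining, while from_iterable consumes the combinations lazily. The variable names are illustrative.

```python
import itertools

import numpy as np

k = 2
pairs = itertools.combinations(range(6), k)

# Flatten the stream of k-tuples into a stream of scalars, lazily.
flat = itertools.chain.from_iterable(pairs)

# fromiter builds a 1d array from the scalars; reshape restores the rows.
arr = np.fromiter(flat, dtype=int).reshape(-1, k)

print(arr.shape)  # (15, 2)
```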
Not much of a time savings, at least on this small size. (fromiter also takes a count argument, which shaves off another µs.) With a larger case, range(60), the fromiter version takes half the time of np.array.
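A sketch of how the count argument could be supplied, assuming Python 3.8+ for math.comb. The wrapper name combinations_array is hypothetical. Knowing the total number of items in advance lets fromiter allocate the output once instead of growing it as it reads.

```python
import itertools
import math

import numpy as np


def combinations_array(n, k):
    """Return all k-combinations of range(n) as a (C(n, k), k) integer array."""
    n_rows = math.comb(n, k)
    flat = itertools.chain.from_iterable(itertools.combinations(range(n), k))
    # count is the total number of scalars fromiter should read.
    out = np.fromiter(flat, dtype=int, count=n_rows * k)
    return out.reshape(n_rows, k)


big = combinations_array(60, 2)
print(big.shape)  # (1770, 2)
```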
A quick search on [numpy] itertools turns up a number of suggestions of pure numpy ways of generating all combinations. itertools is fast at generating pure Python structures, but converting those to arrays is the slow step.
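One such pure numpy approach, for the pairs case only (k=2), is to take the index pairs strictly above the diagonal of an n×n matrix with np.triu_indices. This is an illustration of the idea, not one of the answers that search turns up, and it does not generalize directly to larger k.

```python
import itertools

import numpy as np

n = 6

# Row and column indices of entries strictly above the main diagonal,
# in row-major order: (0,1), (0,2), ..., (0,5), (1,2), ...
rows, cols = np.triu_indices(n, k=1)
pairs = np.column_stack((rows, cols))

print(pairs.shape)  # (15, 2)
```

The rows come out in the same lexicographic order that itertools.combinations(range(n), 2) produces, with no Python-level tuples created along the way.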
A picky point about the question.
A is an iterator (generator-like), not an array. list(A) does produce a list of tuples, which can loosely be described as an array. But it isn't a np.array, and does not have a reshape method.
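A short check of that distinction; the variable names nested and arr are illustrative:

```python
import itertools

import numpy as np

A = itertools.combinations(range(6), 2)
print(hasattr(A, "reshape"))       # False: A is an iterator

nested = list(A)                   # consumes the iterator
print(hasattr(nested, "reshape"))  # False: still a plain Python list

arr = np.array(nested)             # only now is there an ndarray
print(arr.reshape(3, 10).shape)    # (3, 10)

leftover = list(A)                 # the iterator is already exhausted
print(leftover)                    # []
```

The last line also shows why A cannot be reused after list(A): a second pass yields nothing.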