What is the quickest way to iterate through a numpy array?
Question
I noticed a meaningful difference between iterating through a numpy array "directly" and iterating through it via the tolist method. See the timings below:
directly
[i for i in np.arange(10000000)]
via tolist
[i for i in np.arange(10000000).tolist()]
Given that I've discovered one way to go faster, I wanted to ask what else might make it go faster.
What is the fastest way to iterate through a numpy array?
Recommended answer
These are my timings on a slower machine:
In [1034]: timeit [i for i in np.arange(10000000)]
1 loop, best of 3: 2.16 s per loop
If I generate the range directly (Py3, so this is a lazy range object, not a list), times are much better. Take this as a baseline for a list comprehension of this size.
In [1035]: timeit [i for i in range(10000000)]
1 loop, best of 3: 1.26 s per loop
tolist converts the arange to a list first; that takes a bit longer, but the iteration is still over a list:
In [1036]: timeit [i for i in np.arange(10000000).tolist()]
1 loop, best of 3: 1.6 s per loop
Using list() takes the same time as direct iteration on the array, which suggests that direct iteration effectively does the same work first:
In [1037]: timeit [i for i in list(np.arange(10000000))]
1 loop, best of 3: 2.18 s per loop
In [1038]: timeit np.arange(10000000).tolist()
1 loop, best of 3: 927 ms per loop
About the same time as iterating over the .tolist() result:
In [1039]: timeit list(np.arange(10000000))
1 loop, best of 3: 1.55 s per loop
In general, if you must loop, working on a list is faster. Access to the elements of a list is simpler.
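The comparisons above can be reproduced outside IPython with the standard timeit module. This is a minimal sketch, not the original benchmark: the array size N is an assumption (much smaller than the 10,000,000 used above so it finishes quickly), and the helper names direct, via_tolist and via_list are hypothetical.

```python
import timeit

import numpy as np

N = 100_000  # assumption: far smaller than the 10,000,000 in the timings above
a = np.arange(N)


def direct():
    # Iterate over the array itself; each element is a numpy scalar.
    return [i for i in a]


def via_tolist():
    # Convert to a plain list of Python ints first, then iterate.
    return [i for i in a.tolist()]


def via_list():
    # list(a) builds a list of numpy scalars, then iterate over it.
    return [i for i in list(a)]


for fn in (direct, via_tolist, via_list):
    # Best of 3 repeats, 5 calls each, reported per call.
    best = min(timeit.repeat(fn, number=5, repeat=3))
    print(f"{fn.__name__:12s} {best / 5 * 1e3:8.2f} ms per loop")
```

Absolute numbers will differ by machine and numpy version; only the relative ordering is of interest.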
Look at the elements returned by indexing.
a[0] is another numpy object; it is constructed from the values in a, but it is not simply a fetched value.
list(a)[0] is the same type; the list is just [a[0], a[1], a[2]].
In [1043]: a = np.arange(3)
In [1044]: type(a[0])
Out[1044]: numpy.int32
In [1045]: ll=list(a)
In [1046]: type(ll[0])
Out[1046]: numpy.int32
But tolist converts the array into a pure list, in this case a list of ints. It does more work than list(), but does it in compiled code.
In [1047]: ll=a.tolist()
In [1048]: type(ll[0])
Out[1048]: int
In general, don't use list(anarray). It rarely does anything useful, and it is not as powerful as tolist().
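One way to see the difference in power is with a 2-D array. The sketch below uses a small example array a2 (a hypothetical name, not from the timings above): list() only splits along the first axis, while tolist() converts all the way down to Python scalars.

```python
import numpy as np

a2 = np.arange(6).reshape(2, 3)  # small illustrative 2-D array

outer = list(a2)       # a list of two 1-D numpy arrays (first axis only)
nested = a2.tolist()   # a nested list of plain Python ints

print(type(outer[0]))                        # a numpy ndarray
print(type(nested[0]), type(nested[0][0]))   # a list, then a Python int
```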
What's the fastest way to iterate through an array? None. At least not in Python; in C code there are fast ways.
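In practice, the usual way to reach that compiled code from Python is to let numpy operate on the whole array rather than looping over elements. The sketch below is only an illustration with an assumed task (a sum of squares) and an assumed size; it is not taken from the question.

```python
import numpy as np

# Explicit int64 so the squares cannot overflow a 32-bit default integer.
a = np.arange(100_000, dtype=np.int64)

# Python-level loop over a plain list (the faster of the looping options).
total_loop = 0
for x in a.tolist():
    total_loop += x * x

# Whole-array expression: the iteration happens inside numpy's compiled code.
total_vec = int((a * a).sum())

print(total_loop == total_vec)
```

Whether such a rewrite is possible depends on what the loop body actually has to do.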
a.tolist() is the fastest, vectorized way of creating a list of integers from an array. It iterates, but does so in compiled code.
But what is your real goal?