Speed of copying a NumPy array


Question

I am wondering whether there is any downside to using b = np.array(a) rather than b = np.copy(a) to copy a NumPy array a into b. When I run %timeit, the former can be up to 100% faster (about half the time per call).

In both cases b is a evaluates to False, and I can modify b while leaving a intact, so I suppose np.array does what is expected from .copy().
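The check described above can be sketched as follows. This is a minimal illustration rather than the asker's actual session; the variable names and the use of np.shares_memory to confirm that the buffers are separate are assumptions added here.

```python
import numpy as np

a = np.arange(5)
b = np.array(a)   # copy via the array constructor (copy defaults to True)
c = np.copy(a)    # copy via np.copy

# Neither result is the same Python object as a ...
print(b is a, c is a)

# ... and neither shares a's underlying data buffer.
print(np.shares_memory(a, b), np.shares_memory(a, c))

# Writing into the copies leaves the original untouched.
b[0] = 100
c[1] = 200
print(a)
```

If either call had returned a view instead of a copy, np.shares_memory would report True and the writes would show up in a.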

Am I missing anything? Is there anything improper about using np.array to copy an array?

The timings below are from Python 3.6.5 and NumPy 1.14.2 for a small array; the speed difference closes rapidly for larger sizes:

a = np.arange(1000)

%timeit np.array(a)
501 ns ± 30.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit np.copy(a)  
1.1 µs ± 35.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Recommended answer

From the documentation for numpy.copy:

This is equivalent to:

>>> np.array(a, copy=True)

Also, if you look at the source code:

def copy(a, order='K'):
    return array(a, order=order, copy=True)
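The equivalence suggested by this source can be checked empirically. The sketch below is an illustration added here, not part of the original answer: it uses a Fortran-ordered input (an arbitrary choice) so that the order='K' default has something to preserve, and the helper names are hypothetical.

```python
import numpy as np

# A Fortran-ordered (column-major) input, to exercise the order='K' default.
a = np.asfortranarray(np.arange(6).reshape(2, 3))

candidates = {
    "np.array(a)": np.array(a),                       # defaults: copy=True, order='K'
    "np.array(a, order='K', copy=True)": np.array(a, order='K', copy=True),
    "np.copy(a)": np.copy(a),                         # default order='K'
}

checks = {}
for name, arr in candidates.items():
    same_values = np.array_equal(arr, a)          # identical contents
    independent = not np.shares_memory(arr, a)    # a real copy, not a view
    keeps_layout = arr.flags.f_contiguous         # memory layout preserved
    checks[name] = (same_values, independent, keeps_layout)
    print(f"{name:<36} {checks[name]}")
```

All three calls should report the same result, which is consistent with np.copy being a thin wrapper around np.array.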

Some timings:

In [1]: import numpy as np

In [2]: a = np.ascontiguousarray(np.random.randint(0, 20000, 1000))

In [3]: %timeit b = np.array(a)
562 ns ± 10.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [4]: %timeit b = np.array(a, order='K', copy=True)
1.1 µs ± 10.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [5]: %timeit b = np.copy(a)
1.21 µs ± 9.28 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [6]: a = np.ascontiguousarray(np.random.randint(0, 20000, 1000000))

In [7]: %timeit b = np.array(a)
310 µs ± 6.31 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [8]: %timeit b = np.array(a, order='K', copy=True)
311 µs ± 2.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [9]: %timeit b = np.copy(a)
313 µs ± 4.33 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [10]: print(np.__version__)
1.13.3


It is unexpected that explicitly setting parameters to their default values changes the execution speed of np.array(). On the other hand, merely processing these explicit arguments may add enough execution time to make a difference for small arrays. Indeed, the source code for numpy.array() shows that many more checks and more processing are performed when keyword arguments are provided; for example, see goto full_path. When keyword parameters are not set, execution skips all the way down to goto finish. This overhead of processing keyword arguments is what you detect in the timings for small arrays. For larger arrays, the overhead is insignificant compared with the time spent actually copying the data.
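The same comparison can be reproduced outside IPython with the standard timeit module. This is a rough sketch: the time_per_call helper, the repeat counts, and the two array sizes are choices made here for illustration, and the absolute numbers (and possibly the size of the gap) will vary with the NumPy version and the machine.

```python
import timeit
import numpy as np

def time_per_call(stmt, a, number):
    """Return the best average time in seconds per call over three repeats."""
    timer = timeit.Timer(stmt, globals={"np": np, "a": a})
    return min(timer.repeat(repeat=3, number=number)) / number

statements = [
    "np.array(a)",
    "np.array(a, order='K', copy=True)",
    "np.copy(a)",
]

results = {}
# Small array: fixed per-call overhead dominates.
# Large array: the time to copy the data dominates.
for size, number in [(1_000, 10_000), (1_000_000, 20)]:
    a = np.ascontiguousarray(np.random.randint(0, 20000, size))
    for stmt in statements:
        results[(size, stmt)] = time_per_call(stmt, a, number)
        print(f"n={size:>9,}  {stmt:<36} {results[(size, stmt)] * 1e6:10.2f} µs")
```

Any fixed per-call difference between the three statements should be visible only in the small-array rows; in the large-array rows it is expected to be lost in the cost of the copy itself.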
