Performance of row vs column operations in NumPy


Question


There are a few articles showing that MATLAB prefers column operations to row operations, and that depending on how you lay out your data, performance can vary significantly. This is apparently because MATLAB uses a column-major order for representing arrays.


I remember reading that Python (NumPy) uses a row-major order. With this, my questions are:

  1. Can one expect a similar difference in performance when working with NumPy?
  2. If the answer to the above is yes, what would be some examples that highlight this difference?

Answer


Like many benchmarks, this really depends on the particulars of the situation. It's true that, by default, numpy creates arrays in C-contiguous (row-major) order, so, in the abstract, operations that scan over columns should be faster than those that scan over rows. However, the shape of the array, the performance of the ALU, and the underlying cache on the processor all have a huge impact on the details.
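You can verify the default layout directly by inspecting an array's flags and strides; a minimal sketch:

```python
import numpy as np

# By default numpy allocates C-contiguous (row-major) memory:
# moving along the last axis steps through adjacent bytes.
x = np.ones((100, 100), dtype=np.float64)
print(x.flags['C_CONTIGUOUS'])  # True
print(x.flags['F_CONTIGUOUS'])  # False

# Strides in bytes: stepping to the next row skips a full row
# (100 elements * 8 bytes), stepping to the next column skips 8 bytes.
print(x.strides)  # (800, 8)
```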


For instance, on my MacBook Pro, with a small array the row and column sums take similar times, but the small integer type is significantly slower than the float type:

>>> x = numpy.ones((100, 100), dtype=numpy.uint8)
>>> %timeit x.sum(axis=0)
10000 loops, best of 3: 40.6 us per loop
>>> %timeit x.sum(axis=1)
10000 loops, best of 3: 36.1 us per loop

>>> x = numpy.ones((100, 100), dtype=numpy.float64)
>>> %timeit x.sum(axis=0)
10000 loops, best of 3: 28.8 us per loop
>>> %timeit x.sum(axis=1)
10000 loops, best of 3: 28.8 us per loop


With larger arrays the absolute differences become larger but, at least on my machine, remain smaller for the larger datatype:

>>> x = numpy.ones((1000, 1000), dtype=numpy.uint8)
>>> %timeit x.sum(axis=0)
100 loops, best of 3: 2.36 ms per loop
>>> %timeit x.sum(axis=1)
1000 loops, best of 3: 1.9 ms per loop

>>> x = numpy.ones((1000, 1000), dtype=numpy.float64)
>>> %timeit x.sum(axis=0)
100 loops, best of 3: 2.04 ms per loop
>>> %timeit x.sum(axis=1)
1000 loops, best of 3: 1.89 ms per loop


You can tell numpy to create a Fortran-contiguous (column-major) array using the order='F' keyword argument to numpy.asarray, numpy.ones, numpy.zeros, and the like, or by converting an existing array using numpy.asfortranarray. As expected, this ordering swaps the efficiency of the row or column operations:

In [10]: y = numpy.asfortranarray(x)
In [11]: %timeit y.sum(axis=0)
1000 loops, best of 3: 1.89 ms per loop
In [12]: %timeit y.sum(axis=1)
100 loops, best of 3: 2.01 ms per loop
