Python: is iterating over a multidimensional array super slow?

Question

I have to iterate over all items in a two-dimensional array of integers and change each value (according to some rule, not important).

I'm surprised how significant the performance difference is between the Python runtime and the C# or Java runtime. Did I write totally wrong Python code (v2.7.2)?

import numpy
a = numpy.ndarray((5000,5000), dtype = numpy.int32)
for x in numpy.nditer(a.T):
    x = 123

>python -m timeit -n 2 -r 2 -s "import numpy; a = numpy.ndarray((5000,5000), dtype=numpy.int32)" "for x in numpy.nditer(a.T):" "  x = 123"
2 loops, best of 2: 4.34 sec per loop

For example, the C# code takes only 50 ms, i.e. Python is almost 100 times slower! (Assume the matrix variable is already initialized.)

for (y = 0; y < 5000; y++)
for (x = 0; x < 5000; x++)
    matrix[y][x] = 123;

Recommended answer

Python is a much more dynamic language than C or C#. The main reason the loop is so slow is that on every pass the CPython interpreter does extra work that costs time: it binds the name x to the next object from the iterator, and then the assignment rebinds x to 123. Note that rebinding x this way never writes to the array itself.
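The name-binding behaviour described above can be seen without NumPy at all. The following is a minimal sketch that uses plain nested lists as a stand-in for the array (an assumption made only so that it runs anywhere); the names grid and unchanged are illustrative and do not come from the question.

```python
# Plain nested lists stand in for the 2-D array (an assumption for this sketch).
grid = [[0, 0, 0], [0, 0, 0]]

# Rebinding the loop variable: x is pointed at 123 on each pass,
# but nothing is stored back into the lists.
for row in grid:
    for x in row:
        x = 123

unchanged = [row[:] for row in grid]  # snapshot taken after the first loop

# Assigning through an index stores into the container itself.
for row in grid:
    for i in range(len(row)):
        row[i] = 123
```

After the first loop the data is still all zeros; only the indexed assignment in the second loop changes it. Both loops still pay the per-iteration interpreter cost discussed above.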

As @Sven Marnach noted, you can call the array's fill() method (for example a.fill(123)) and it is fast. That method is compiled C code, and it simply loops over the memory of the numpy.ndarray and fills in the value. It is much less dynamic than Python, which is good for this simple case.
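As a rough sketch of the options, the snippet below fills an array in three ways: with the fill() method, with a broadcast slice assignment, and with an element-wise numpy.nditer loop that actually writes to the array. The array sizes are reduced from the question's 5000 x 5000 to keep the sketch quick, and the context-manager form of nditer assumes a reasonably recent NumPy release.

```python
import numpy

# Compiled fill: the loop runs inside NumPy, not in the interpreter.
a = numpy.zeros((500, 500), dtype=numpy.int32)
a.fill(123)

# Equivalent slice assignment; the scalar is broadcast over the whole array.
b = numpy.zeros((500, 500), dtype=numpy.int32)
b[...] = 123

# Element-wise write through nditer: correct, but every element still
# goes through the Python interpreter, so it stays slow on large arrays.
c = numpy.zeros((3, 4), dtype=numpy.int32)
with numpy.nditer(c, op_flags=['readwrite']) as it:
    for x in it:
        x[...] = 123  # writes into the array element instead of rebinding x
```

The first two forms are the ones to prefer for a constant fill; the nditer loop is shown only to make the contrast with the original `x = 123` loop explicit.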

But now consider PyPy. When you run your program under PyPy, a JIT analyzes what your code is actually doing. In this example, it notices that the name x isn't used for anything but the assignment, so it can optimize away binding the name. This example should be one that PyPy speeds up tremendously; PyPy will likely be ten times faster than plain Python (so only one-tenth as fast as C, rather than 1/100 as fast).

http://pypy.org

As I understand it, PyPy won't be working with Numpy for a while yet, so you can't just run your existing Numpy code under PyPy yet. But the day is coming.

I'm excited about PyPy. It offers the hope that we can write in a very high-level language (Python) and yet get nearly the performance of writing in "portable assembly language" (C). For examples like this one, Numpy might even beat the performance of naive C code by using SIMD instructions from the CPU (SSE2, NEON, or whatever). For this example, with SIMD, you could set four integers to 123 on each pass of the loop, and that would be faster than a plain C loop. (Unless the C compiler also applied a SIMD optimization, which, come to think of it, is likely for this case. So we are back to "nearly the speed of C" rather than faster for this example. But we can imagine trickier cases that the C compiler isn't smart enough to optimize, where a future PyPy might.)

But never mind PyPy for now. If you will be working with Numpy, it is a good idea to learn the functions and methods, such as ndarray.fill(), that are there to speed up your code.
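The question mentions changing values "according to some rule". As a hedged illustration of how such a rule can often be pushed into NumPy instead of a Python loop, the sketch below uses a made-up rule (values greater than 5 become 123), which is purely hypothetical and not taken from the question.

```python
import numpy

a = numpy.arange(12, dtype=numpy.int32).reshape(3, 4)

# Hypothetical rule: replace every value greater than 5 with 123.
# numpy.where builds a new array and leaves a untouched.
result = numpy.where(a > 5, 123, a)

# The same rule applied in place with a boolean mask.
a[a > 5] = 123
```

Whether a given rule can be expressed this way depends on the rule; when it can, the per-element work happens in compiled code rather than in the interpreter.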
