My Python code is generally very slow; is this normal?
Question
I recently began teaching myself Python, and have been using the language for an online course in algorithms. For some reason, much of the code I have written for this course is very slow (relative to the C/C++ and Matlab code I have written in the past), and I'm starting to worry that I am not using Python properly.
Here is a simple Python and Matlab snippet to compare their speed.
MATLAB
for i = 1:100000000
    a = 1 + 1
end
Python
for i in list(range(0, 100000000)):
    a = 1 + 1
The Matlab code takes about 0.3 seconds, and the Python code takes about 7 seconds. Is this normal? My Python code for much more complex problems is very slow. For example, as a homework assignment, I'm running depth-first search on a graph with about 900,000 nodes, and it is taking forever. Thank you.
Recommended answer
On performance:

"Don't fret too much about performance--plan to optimize later when needed."

That is one of the reasons why Python integrates with many high-performance computing backends, such as numpy, OpenBLAS and even CUDA, to name a few.

If you want better performance, the best way forward is to let high-performance libraries do the heavy lifting for you. Optimizing loops within Python (for example, by using xrange instead of range in Python 2.7) won't give you very dramatic results.

Here is a bit of code that compares different approaches (the listing is at the end of this answer).

Results, run on Python 2.7.13 on OSX: [Figure: bar chart of execution time in seconds for each approach; smaller is better]
The reason that Numpy performs faster than the CUDA solution is that, at this problem size, the overhead of using CUDA outweighs any gain over Python+Numpy. For larger, floating-point calculations, CUDA does even better than Numpy.

Note that the Numpy solution performs more than 80 times faster than your original solution. If your timings are correct, this would even be faster than Matlab...

A final note on DFS (depth-first search): here is an interesting article on DFS in Python.
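To make the DFS note concrete, here is a minimal sketch of an iterative depth-first search that uses an explicit stack instead of recursion. It is not the asker's code, and the adjacency-list format, the function name `dfs_iterative` and the toy graph are assumptions made for illustration. The motivation is that CPython's default recursion limit is about 1000 frames, so a recursive DFS may fail on a graph with hundreds of thousands of nodes; whether that is what slowed down the asker's homework is not known.

```python
def dfs_iterative(graph, start):
    """Return the nodes reachable from `start` in depth-first preorder.

    `graph` is assumed to be a dict mapping each node to a list of
    neighbours (a hypothetical format chosen for this sketch).
    """
    visited = set()      # set membership tests are O(1) on average
    order = []
    stack = [start]      # explicit stack replaces the call stack
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push neighbours in reverse so they are popped in listed order.
        for neighbour in reversed(graph.get(node, [])):
            if neighbour not in visited:
                stack.append(neighbour)
    return order


# Hypothetical toy graph: node -> list of neighbours.
toy_graph = {1: [2, 3], 2: [4], 3: [4], 4: []}
print(dfs_iterative(toy_graph, 1))  # prints [1, 2, 4, 3]
```

Using a set for `visited` rather than a list matters at this scale: checking membership in a list is linear in its length, which alone can make a traversal of 900,000 nodes impractically slow.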
import timeit
import matplotlib.pyplot as mplplt

# Number of times timeit executes each snippet
iter = 100

# Snippets to compare; each performs one million additions
testcode = [
    "for i in list(range(1000000)): a = 1+1",
    "for i in xrange(1000000): a = 1+1",
    "for _ in xrange(1000000): a = 1+1",
    "import numpy; one = numpy.ones(1000000); a = one+one",
    "import pycuda.gpuarray as gpuarray; import pycuda.driver as cuda; import pycuda.autoinit; import numpy;" \
    "one_gpu = gpuarray.GPUArray((1000000),numpy.int16); one_gpu.fill(1); a = (one_gpu+one_gpu).get()"
]
labels = ["list(range())", "i in xrange()", "_ in xrange()", "numpy", "numpy and CUDA"]

# Total wall-clock time in seconds for `iter` runs of each snippet
timings = [timeit.timeit(t, number=iter) for t in testcode]
print labels, timings  # Python 2 print statement

# Bar chart with one bar per approach
label_idx = range(len(labels))
mplplt.bar(label_idx, timings)
mplplt.xticks(label_idx, labels)
mplplt.ylabel('Execution time (sec)')
mplplt.title('Timing of integer addition in python 2.7\n(smaller value is better performance)')
mplplt.show()
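The listing above needs Python 2.7 plus numpy, pycuda and matplotlib. As a rough illustration of the same idea (move the loop out of interpreted Python and into compiled code), the following sketch runs on Python 3 with only the standard library. It is not part of the original answer: the function names and the default values of `n` and `runs` are assumptions, and it compares a hand-written loop with the built-in `sum` rather than with numpy. Timings depend on the machine, so no expected output is shown.

```python
import timeit


def manual_sum(n):
    """Add up 0..n-1 with an explicit Python-level loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n):
    """Same result, but the loop runs inside the C-implemented built-in."""
    return sum(range(n))


def compare(n=200000, runs=5):
    """Return {label: seconds} for `runs` executions of each approach."""
    candidates = {
        "manual loop": lambda: manual_sum(n),
        "built-in sum": lambda: builtin_sum(n),
    }
    return {label: timeit.timeit(func, number=runs)
            for label, func in candidates.items()}


if __name__ == "__main__":
    for label, seconds in compare().items():
        print("%-12s %.4f s" % (label, seconds))
```

Both functions return the same value, so any difference in the reported times comes from where the loop executes, not from the work being done.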