Python Function calls are really slow

Problem description

This is mostly to make sure my methodology is correct, but my basic question is whether it's worth checking, outside of a function, if I need to call the function at all. I know, I know, premature optimization, but in many cases it's the difference between putting an if statement inside the function to determine whether the rest of the code needs to run, or putting it before the function call. In other words, it takes no extra effort to do it one way or the other. Right now all the checks are mixed between both, and I'd like to get it all nice and standardized.
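Roughly, the two layouts I'm deciding between look like this (the names and the work here are just placeholders):

def needs_processing(item):
    # stand-in for the cheap check
    return item is not None

# option 1: the check lives inside the function, so callers always just call it
def process_checked(item):
    if not needs_processing(item):
        return None
    return item * 2  # stand-in for the real work

# option 2: the check happens before the call, so the call can be skipped entirely
def process(item):
    return item * 2  # stand-in for the real work

result_a = process_checked(5)
result_b = process(5) if needs_processing(5) else None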

The main reason I asked is because the other answers I saw mostly referenced timeit, but that gave me negative numbers, so I switched to this:

import timeit
import cProfile

def aaaa(idd):
    return idd

def main():
    #start = timeit.timeit()
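    # (note: timeit.timeit() with no arguments times an empty 'pass' statement
    # a million times rather than reading a clock, so end - start here is mostly
    # noise, which would explain the negative numbers mentioned above)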
    for i in range(9999999):
        a = 5
    #end = timeit.timeit()
    #print("1", end - start)

def main2():
    #start = timeit.timeit()
    for i in range(9999999):
        aaaa(5)
    #end = timeit.timeit()
    #print("2", end - start)

cProfile.run('main()', sort='cumulative')
cProfile.run('main2()', sort='cumulative')

and got this output:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.310    0.310 {built-in method exec}
        1    0.000    0.000    0.310    0.310 <string>:1(<module>)
        1    0.310    0.310    0.310    0.310 test.py:7(main)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.044    2.044 {built-in method exec}
        1    0.000    0.000    2.044    2.044 <string>:1(<module>)
        1    1.522    1.522    2.044    2.044 test.py:14(main2)
  9999999    0.521    0.000    0.521    0.000 test.py:4(aaaa)

To me that shows that not calling the function takes .31 seconds, and calling it takes 1.52 seconds, which is almost 5 times slower. But like I said, I got negative numbers with timeit, so I want to make sure it's actually that slow.

Also, from what I gather, the reason function calls are so slow is that Python needs to look the function up to make sure it still exists before it can run it, or something? Isn't there any way to just tell it to assume that everything is still there, so that it doesn't have to do the unnecessary work that (apparently) slows it down 5x?

Solution

You are comparing apples and pears here. One method does simple assignment, the other calls a function. Yes, function calls will add overhead.

You should strip this down to the bare minimum for timeit:

>>> import timeit
>>> timeit.timeit('a = 5')
0.03456282615661621
>>> timeit.timeit('foo()', 'def foo(): a = 5')
0.14389896392822266

Now all we did was add a function call (foo does the same thing), so you can measure the extra time a function call takes. You cannot state that this is nearly 4 times slower; no, the function call adds a 0.11 second overhead per 1,000,000 iterations.

If instead of a = 5 we do something that takes 0.5 seconds for one million iterations, moving it into a function won't make things take 2 seconds. It'll now take 0.61 seconds, because the function overhead doesn't grow.
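One way to see that the overhead is fixed rather than multiplicative is to time the same, heavier statement both inline and behind a function call; the gap between the two results should stay in the same ballpark as the ~0.11 seconds above, no matter how much work the statement itself does. A minimal sketch (heavier is just an illustrative name, and the numbers you get will depend on your machine):

import timeit

# the same statement, timed inline and wrapped in a function call; the
# difference between the two timings is the per-call overhead
inline = timeit.timeit('a = sum(range(50))')
wrapped = timeit.timeit('heavier()', 'def heavier(): a = sum(range(50))')
print(inline, wrapped, wrapped - inline)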

A function call needs to manipulate the stack, pushing the local frame onto it, creating a new frame, and then clearing it all up again when the function returns.

In other words, moving statements to a function adds a small overhead, and the more statements you move to that function, the smaller the overhead becomes as a percentage of the total work done. A function never makes those statements themselves slower.

A Python function is just an object stored in a variable; you can assign a function to a different variable, replace it with something completely different, or delete it at any time. When you invoke a function, you first reference the name it is stored under (foo) and then invoke the function object (the (arguments) part); that lookup has to happen every single time in a dynamic language.
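A quick illustration at the interactive prompt (spam is just a throwaway name here):

>>> def spam():
...     return 'original'
... 
>>> spam()
'original'
>>> spam = lambda: 'replaced'   # the name now points at a different object
>>> spam()
'replaced'
>>> del spam                    # and now at nothing at all
>>> spam()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'spam' is not defined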

You can see this in the bytecode generated for a function:

>>> def foo():
...     pass
... 
>>> def bar():
...     return foo()
... 
>>> import dis
>>> dis.dis(bar)
  2           0 LOAD_GLOBAL              0 (foo)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        

The LOAD_GLOBAL opcode looks up the name (foo) in the global namespace (basically a hash table lookup), and pushes the result onto the stack. CALL_FUNCTION then invokes whatever is on the stack, replacing it with the return value. RETURN_VALUE returns from a function call, again taking whatever is topmost on the stack as the return value.
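If the global lookup itself is what bothers you, one common micro-optimization (only worth considering in very tight loops) is to bind the function to a local name, for example via a default argument, so the lookup becomes a LOAD_FAST, an index into the frame's local slots, instead of a dictionary lookup. bar_local is just an illustrative name; on the same Python version as above, the disassembly would look roughly like this:

>>> def bar_local(foo=foo):   # foo is captured once, when the def runs
...     return foo()
... 
>>> dis.dis(bar_local)
  2           0 LOAD_FAST                0 (foo)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE

The call itself still has to set up and tear down a frame, though, so this only removes the name lookup, not the bulk of the per-call cost.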
