Python: any() unexpected performance

Question

I am comparing the performance of the any() built-in function with the equivalent pure-Python implementation that the docs suggest:

I am looking for an element greater than 0 in the following list:

lst = [0 for _ in range(1000000)] + [1]

This is the supposedly equivalent function:

def gt_0(lst):
    for elm in lst:
        if elm > 0:
            return True
    return False

And these are the results of the performance tests:

>> %timeit any(elm > 0 for elm in lst)
>> 10 loops, best of 3: 35.9 ms per loop

>> %timeit gt_0(lst)
>> 100 loops, best of 3: 16 ms per loop

I would expect both of them to have exactly the same performance; however, any() is two times slower. Why?

Recommended answer

The reason is that you have passed a generator expression to any(). A generator expression is compiled into a generator, and any() has to call its __next__() method once per item, resuming the generator each time to produce the next value; that per-item overhead is why it runs slower. In your manually defined function, by contrast, the for loop iterates over the list directly, and all of its items already exist.
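As a rough illustration of the point above, the sketch below spells out what any() does with a generator. The names any_ and gen_gt_0 are hypothetical helpers introduced only for this example: any_ follows the pure-Python equivalent shown in the docs, and gen_gt_0 is approximately what the generator expression `(elm > 0 for elm in lst)` amounts to.

```python
def any_(iterable):
    # Pure-Python equivalent of the built-in any(), as given in the docs.
    for element in iterable:
        if element:
            return True
    return False


def gen_gt_0(lst):
    # Hypothetical helper: roughly what `(elm > 0 for elm in lst)` does.
    # Each value is produced only when the consumer asks for the next item,
    # so the generator is resumed once per element.
    for elm in lst:
        yield elm > 0


lst = [0] * 1000 + [1]

# Both the built-in and the pure-Python version pull items one at a time.
builtin_result = any(gen_gt_0(lst))
pure_python_result = any_(gen_gt_0(lst))
print(builtin_result, pure_python_result)
```

In gt_0 there is only one loop in one function, whereas here every item crosses the boundary between the generator and its consumer, which is where the extra cost comes from.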

You can see the difference better by passing any() a list built with a list comprehension rather than a generator expression (note that the time to build test_list is not included in the measurement):

In [4]: %timeit any(elm > 0 for elm in lst)
10 loops, best of 3: 66.8 ms per loop

In [6]: test_list = [elm > 0 for elm in lst]

In [7]: %timeit any(test_list)
100 loops, best of 3: 4.93 ms per loop

Another bottleneck in your code, one that costs more than the extra calls to next, is the way you do the comparison. As mentioned in the comments, a better equivalent of your manual function is:

any(True for elm in lst if elm > 0)

In this case the comparison is done inside the generator expression, so only matching elements are yielded, and it runs in almost the same time as your manually defined function (the slight remaining difference is because of the generator, I guess). For a deeper understanding of the underlying reasons, read Ashwini's answer.
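To reproduce the comparison outside IPython, something like the following sketch with the standard timeit module should work. The list is shorter than in the question so that it finishes quickly, and the candidates dictionary and the number/repeat values are arbitrary choices made for this example. Absolute timings depend on the machine and the Python version, so no expected numbers are given.

```python
import timeit

# Shorter than the list in the question so the benchmark finishes quickly.
lst = [0] * 100_000 + [1]


def gt_0(lst):
    # The manual loop from the question.
    for elm in lst:
        if elm > 0:
            return True
    return False


# The three variants discussed in the question and the answer.
candidates = {
    "any(elm > 0 for elm in lst)": lambda: any(elm > 0 for elm in lst),
    "any(True for elm in lst if elm > 0)": lambda: any(True for elm in lst if elm > 0),
    "gt_0(lst)": lambda: gt_0(lst),
}

# Check that every variant gives the same answer before timing it.
results = {name: fn() for name, fn in candidates.items()}

# Best of 3 runs, 10 calls per run, reported per call.
calls = 10
timings = {
    name: min(timeit.repeat(fn, number=calls, repeat=3)) / calls
    for name, fn in candidates.items()
}

for name, seconds in timings.items():
    print(f"{name:40s} {seconds * 1e3:.2f} ms per call")
```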
