List vs generator comprehension speed with join function

Problem description

I took these examples from the official timeit documentation: https://docs.python.org/2/library/timeit.html

What exactly makes the first example (generator expression) slower than the second (list comprehension)?

>>> import timeit
>>> timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
0.8187260627746582
>>> timeit.timeit('"-".join([str(n) for n in range(100)])', number=10000)
0.7288308143615723

Solution

The str.join method converts its iterable parameter to a list if it's not a list or tuple already. This lets the joining logic iterate over the items multiple times (it makes one pass to calculate the size of the result string, then a second pass to actually copy the data).
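To make the two passes concrete, here is a rough pure-Python model of the strategy. The name join_sketch and the list-of-characters buffer are hypothetical stand-ins chosen for illustration; the real implementation is written in C, works on raw character buffers, and also rejects non-string items, which this sketch does not.

```python
def join_sketch(separator, iterable):
    """Illustrative model of a two-pass join; hypothetical, not CPython's code."""
    # Materialize the input so it can be traversed twice.
    items = list(iterable)
    if not items:
        return ""

    # Pass 1: compute the exact length of the result.
    total = sum(len(item) for item in items) + len(separator) * (len(items) - 1)

    # Pass 2: copy every piece into a buffer of exactly that size.
    buffer = [""] * total  # stand-in for a preallocated character buffer
    position = 0
    for index, item in enumerate(items):
        if index > 0:
            buffer[position:position + len(separator)] = separator
            position += len(separator)
        buffer[position:position + len(item)] = item
        position += len(item)

    # Turning the buffer back into a str is a detail of this model only.
    return "".join(buffer)
```

Calling join_sketch("-", (str(n) for n in range(5))) returns "0-1-2-3-4", the same as "-".join would. The point is that the first pass needs every item up front, so a generator cannot be consumed lazily.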

You can see this in the CPython source code:

PyObject *
PyUnicode_Join(PyObject *separator, PyObject *seq)
{
    /* lots of variable declarations at the start of the function omitted */

    fseq = PySequence_Fast(seq, "can only join an iterable");

    /* ... */
}

The PySequence_Fast function in the C API does just what I described. It converts an arbitrary iterable into a list (essentially by calling list on it), unless it already is a list or tuple.
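In Python terms, the behavior described above looks roughly like the sketch below. The function name sequence_fast_sketch is made up for this example, and the exact-type check is my reading of the C implementation rather than something guaranteed by the documentation.

```python
def sequence_fast_sketch(obj):
    """Approximate Python model of PySequence_Fast (hypothetical helper)."""
    # Lists and tuples are returned unchanged, with no copy made.
    if type(obj) in (list, tuple):
        return obj
    # Anything else is consumed in full and stored in a new list.
    return list(obj)


existing = ["0", "1", "2"]
same_object = sequence_fast_sketch(existing) is existing

generator = (str(n) for n in range(3))
materialized = sequence_fast_sketch(generator)
leftover = list(generator)  # the generator has already been exhausted
```

Here same_object is True, materialized is a fresh list holding all three strings, and leftover is empty because the generator was fully consumed by the conversion.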

The conversion of the generator expression to a list means that the usual benefits of generators (a smaller memory footprint and the potential for short-circuiting) don't apply to str.join, and so the (small) additional overhead that the generator has makes its performance worse.
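If you want to check this on your own machine, a small benchmark along these lines may help. It assumes Python 3; the third statement, which wraps the generator in list() explicitly, is my own addition to show what the generator version pays for in either case. The helper name measure and its default arguments are arbitrary choices, and absolute timings will vary by interpreter version and hardware.

```python
import timeit

# The first two statements come from the question; the third is an added
# variant that builds the list from the generator explicitly.
STATEMENTS = {
    "generator expression": '"-".join(str(n) for n in range(100))',
    "list comprehension": '"-".join([str(n) for n in range(100)])',
    "list(generator)": '"-".join(list(str(n) for n in range(100)))',
}


def measure(number=10000, repeat=3):
    """Return the best time in seconds for each statement (hypothetical helper)."""
    results = {}
    for label, statement in STATEMENTS.items():
        # Taking the minimum of several runs reduces noise from other processes.
        results[label] = min(timeit.repeat(statement, number=number, repeat=repeat))
    return results


if __name__ == "__main__":
    for label, seconds in measure().items():
        print(f"{label:22s} {seconds:.4f} s")
```

I would expect the generator expression and the explicit list(generator) variant to land close to each other, with the list comprehension somewhat ahead, but treat that as a hypothesis to test rather than a guaranteed result.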
