Generators vs List Comprehension performance in Python


Problem description

I am currently learning about generators and list comprehensions. While messing around with the profiler to see what the performance gains are, I stumbled into this cProfile output for a sum of even numbers in a large range, computed both ways.

I can see that in the generator version the <string>:1(<genexpr>) line has a much shorter cumulative time than its list counterpart, but the second line is what baffles me. It is making calls that I think are the check on each number, but then isn't there supposed to be another <string>:1(<module>) line in the list comprehension profile?

Am I missing something in the profile?

In [8]: cProfile.run('sum((number for number in xrange(9999999) if number % 2 == 0))')
         5000004 function calls in 1.111 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  5000001    0.760    0.000    0.760    0.000 <string>:1(<genexpr>)
        1    0.000    0.000    1.111    1.111 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.351    0.351    1.111    1.111 {sum}



In [9]: cProfile.run('sum([number for number in xrange(9999999) if number % 2 == 0])')
         3 function calls in 1.123 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.075    1.075    1.123    1.123 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.048    0.048    0.048    0.048 {sum}

Solution

First of all, the calls are to the next method (__next__ in Python 3) of the generator object, not to some even-number check.

In Python 2 you will not get an additional line for a list comprehension (LC), because an LC does not create a separate code object there. In Python 3 you will, because an additional code object (<listcomp>) is now created for an LC as well, to make it similar to a generator expression. (Python 3.12 and later inline comprehensions again, so the <listcomp> line disappears there.)
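To make the point about next concrete, here is a minimal sketch that sums a generator expression by hand. The helper manual_sum is a hypothetical stand-in for what sum does internally (the real sum is implemented in C); it only illustrates that every item costs one next call, plus one more call that ends with StopIteration.

```python
def manual_sum(iterator):
    """Sum an iterator by calling next() by hand and count the calls.

    Hypothetical helper for illustration only; the built-in sum does
    the equivalent work in C.
    """
    total = 0
    calls = 0
    while True:
        calls += 1
        try:
            value = next(iterator)  # resumes the generator frame
        except StopIteration:
            break  # the final call produces no item
        total += value
    return total, calls


# Five even numbers (0, 2, 4, 6, 8), so five items plus one final call.
evens = (number for number in range(10) if number % 2 == 0)
total, calls = manual_sum(evens)
print(total, calls)  # 20 6
```

The even-number check itself is a plain expression evaluated inside the generator frame, so it never shows up as a separate call in the profile.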

>>> cProfile.run('sum([number for number in range(9999999) if number % 2 == 0])')
         5 function calls in 1.751 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.601    1.601    1.601    1.601 <string>:1(<listcomp>)
        1    0.068    0.068    1.751    1.751 <string>:1(<module>)
        1    0.000    0.000    1.751    1.751 {built-in method exec}
        1    0.082    0.082    0.082    0.082 {built-in method sum}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

>>> cProfile.run('sum((number for number in range(9999999) if number % 2 == 0))')
         5000005 function calls in 2.388 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  5000001    1.873    0.000    1.873    0.000 <string>:1(<genexpr>)
        1    0.000    0.000    2.388    2.388 <string>:1(<module>)
        1    0.000    0.000    2.388    2.388 {built-in method exec}
        1    0.515    0.515    2.388    2.388 {built-in method sum}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
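Whether a separate code object exists can also be checked without the profiler. The sketch below compiles both expressions and lists the code objects nested in the result; nested_code_names is a name made up for this example. On the Python 3 versions that produced the output above, the list comprehension should report <listcomp>; on Python 3.12 and later, where comprehensions are inlined, that list is expected to be empty.

```python
import sys
import types


def nested_code_names(source):
    """Return the names of code objects nested directly in the compiled source."""
    code = compile(source, "<string>", "eval")
    return [
        const.co_name
        for const in code.co_consts
        if isinstance(const, types.CodeType)
    ]


genexpr_names = nested_code_names("sum((n for n in range(10) if n % 2 == 0))")
listcomp_names = nested_code_names("sum([n for n in range(10) if n % 2 == 0])")

print("Python", sys.version_info[:2])
print("generator expression:", genexpr_names)
print("list comprehension:", listcomp_names)
```

Each nested code object that runs as its own frame gets its own line in the cProfile report, which is why the report layout depends on the interpreter version.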

The number of calls differs, though: 1 for the LC compared to 5000001 for the generator expression. This is mostly because sum is consuming the iterator and hence has to call its __next__ method 5000000 + 1 times (the last call is probably the one that raises StopIteration to end the iteration). For a list comprehension, all the magic happens inside its code object, where the LIST_APPEND instruction appends the items one by one to the list, i.e. there are no calls visible to cProfile.
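The call counts and the LIST_APPEND instruction can both be checked programmatically. This is a sketch under some assumptions: N is an arbitrary small range chosen for speed, profiled_calls and opnames are helper names invented here, and the code reads the stats attribute of pstats.Stats, whose tuple layout is an implementation detail of CPython rather than a documented contract.

```python
import cProfile
import dis
import pstats
import types

N = 1000  # small range (an arbitrary choice) so the sketch runs quickly


def profiled_calls(source):
    """Run source under cProfile and return {function name: call count}."""
    namespace = {"N": N}
    profiler = cProfile.Profile()
    profiler.runctx(source, namespace, namespace)
    stats = pstats.Stats(profiler)
    # Assumed layout: (filename, lineno, name) -> (cc, nc, tt, ct, callers)
    return {key[2]: value[1] for key, value in stats.stats.items()}


def opnames(source):
    """Collect opcode names from the compiled source and nested code objects."""
    pending = [compile(source, "<string>", "eval")]
    names = set()
    while pending:
        code = pending.pop()
        names.update(instr.opname for instr in dis.get_instructions(code))
        pending.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    return names


GEN_SOURCE = "sum(n for n in range(N) if n % 2 == 0)"
LC_SOURCE = "sum([n for n in range(N) if n % 2 == 0])"

# One profiler-visible call per yielded item, plus the call that ends iteration.
gen_calls = profiled_calls(GEN_SOURCE).get("<genexpr>", 0)
# At most one call for the whole list comprehension (none if it is inlined).
lc_calls = profiled_calls(LC_SOURCE).get("<listcomp>", 0)

print("genexpr calls:", gen_calls)
print("listcomp calls:", lc_calls)
print("LIST_APPEND in LC bytecode:", "LIST_APPEND" in opnames(LC_SOURCE))
print("YIELD_VALUE in genexpr bytecode:", "YIELD_VALUE" in opnames(GEN_SOURCE))
```

With N = 1000 there are 500 even numbers, so the generator expression is expected to show about 501 calls, mirroring the 5000000 + 1 in the full-size run.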
