When is not a good time to use Python generators?


Question

This is rather the inverse of "What can you use Python generator functions for?": Python generators, generator expressions, and the itertools module are some of my favorite features of Python these days. They're especially useful when setting up chains of operations to perform on a big pile of data; I often use them when processing DSV files.

So when is it not a good time to use a generator, a generator expression, or an itertools function?


  • When should I prefer zip() over itertools.izip(), or
  • range() over xrange(), or
  • [x for x in foo] over (x for x in foo)?

Obviously, we eventually need to "resolve" a generator into actual data, usually by creating a list or iterating over it with a non-generator loop. Sometimes we just need to know the length. This isn't what I'm asking.

We use generators so that we don't allocate new lists in memory for interim data. This especially makes sense for large datasets. Does it make sense for small datasets too? Is there a noticeable memory/CPU trade-off?
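As a rough way to see the memory side of that trade-off, here is a minimal Python 3 sketch (in Python 3, range() and zip() are already lazy, so the comparison is between a list comprehension and a generator expression). The size n and the variable names are arbitrary choices for illustration, and sys.getsizeof reports only the container itself, not the integers it refers to.

```python
import sys

n = 100_000  # arbitrary size chosen for illustration

as_list = [x * 2 for x in range(n)]   # all results allocated up front
as_gen = (x * 2 for x in range(n))    # results produced one at a time

# getsizeof measures the container only, not the int objects it references.
list_bytes = sys.getsizeof(as_list)
gen_bytes = sys.getsizeof(as_gen)
print(f"list: {list_bytes} bytes, generator: {gen_bytes} bytes")

# Consuming the generator still visits every element, just without storing them.
total = sum(as_gen)
```

The generator object stays the same small size no matter how large n is, while the list grows with n. Whether that matters for a small dataset is best settled by measuring the actual workload.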

I'm especially interested if anyone has done some profiling on this, in light of the eye-opening discussion of list comprehension performance vs. map() and filter() (http://www.gossamer-threads.com/lists/python/python/76267).

Recommended answer

Use a list instead of a generator when:

1) You need to access the data multiple times (i.e. cache the results instead of recomputing them):

for i in outer:           # used once, okay to be a generator or return a list
    for j in inner:       # used multiple times, reusing a list is better
        ...
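A minimal Python 3 sketch of why the inner sequence matters: a generator can be consumed only once, so after the first outer pass it yields nothing. The helper squares() and the variable names are hypothetical, introduced only for this illustration.

```python
def squares(n):
    """Hypothetical helper: lazily yield 0, 1, 4, ... for n values."""
    for k in range(n):
        yield k * k

outer = range(3)

# The generator is exhausted during the first outer iteration,
# so later outer iterations see an empty inner loop.
inner_gen = squares(3)
pairs_from_gen = [(i, j) for i in outer for j in inner_gen]

# A list can be walked again on every outer iteration.
inner_list = list(squares(3))
pairs_from_list = [(i, j) for i in outer for j in inner_list]

print(len(pairs_from_gen), len(pairs_from_list))
```

The generator version silently produces fewer pairs rather than raising an error, which makes this mistake easy to miss.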

2) You need random access (or any access other than forward sequential order):

for i in reversed(data): ...     # generators aren't reversible

s[i], s[j] = s[j], s[i]          # generators aren't indexable
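The two lines above can be checked with a short Python 3 sketch. The generator function source() and its values are made up for illustration; the point is only that reversed() and indexing both raise TypeError on a generator, while a list supports both.

```python
def source():
    """Hypothetical generator used only for this illustration."""
    yield from (10, 20, 30)

gen = source()

try:
    reversed(gen)
    reversible = True
except TypeError:          # generators define neither __reversed__ nor __len__/__getitem__
    reversible = False

try:
    gen[0]
    indexable = True
except TypeError:          # generators are not subscriptable
    indexable = False

# Materializing the data as a list makes both operations available.
data = list(source())
backwards = list(reversed(data))
data[0], data[2] = data[2], data[0]   # in-place swap by index
```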

3) You need to join strings (which requires two passes over the data):

s = ''.join(data)                # lists are faster than generators in this use case
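A minimal Python 3 timing sketch for the join case, assuming CPython, where str.join() first materializes a non-list argument into a sequence before joining, so a generator gains nothing here. The input words and the repetition count are arbitrary, and timings vary by machine and interpreter version, so no specific numbers are claimed.

```python
import timeit

words = [str(n) for n in range(1000)]  # arbitrary sample input

# Both forms produce the same string.
joined_from_list = ''.join([w for w in words])
joined_from_gen = ''.join(w for w in words)

# Compare the two forms; results depend on the interpreter and the machine.
list_time = timeit.timeit(lambda: ''.join([w for w in words]), number=200)
gen_time = timeit.timeit(lambda: ''.join(w for w in words), number=200)
print(f"list: {list_time:.4f}s  generator: {gen_time:.4f}s")
```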

4) You are using PyPy, which sometimes can't optimize generator code as well as it can optimize normal function calls and list manipulations.
