Why don't generators execute until first yield?


Problem description

Hi!

First a bit of context.

Yesterday I spent a lot of time debugging the following method in a
rather slim database abstraction layer we've developed:

,----
| def selectColumn(self, table, column, where={}, order_by=[], group_by=[]):
|     """Performs a SQL select query returning a single column
|
|     The column is returned as a list. An exception is thrown if the
|     result is not a single column."""
|     query = build_select(table, [column], where, order_by, group_by)
|     result = DBResult(self.rawQuery(query))
|     if result.colcount != 1:
|         raise QueryError("Query must return exactly one column", query)
|     for row in result.fetchAllRowsAsList():
|         yield row[0]
`----

I'd just rewritten the method as a generator rather than returning a
list of results. The following test then failed:

,----
| def testSelectColumnMultipleColumns(self):
|     res = self.fdb.selectColumn('db3ut1', ['c1', 'c2'],
|                                 {'c1': (1, 2)}, order_by='c1')
|     self.assertRaises(db3.QueryError, self.fdb.selectColumn,
|                       'db3ut1', ['c1', 'c2'], {'c1': (1, 2)}, order_by='c1')
`----

I expected this to raise a QueryError due to the result.colcount != 1
constraint being violated (as was the case before), but that isn't the
case. The constraint is not violated until I get the first result from
the generator.

Now to the main point. When a generator function is run, it immediately
returns a generator, and it does not run any code inside the generator.
Not until generator.next() is called is any code inside the generator
executed, giving it traditional lazy evaluation semantics. Why don't
generators follow the usual eager evaluation semantics of Python and
immediately execute up until right before the first yield instead?
Giving generators special case semantics for no good reason is a really
bad idea, so I'm very curious if there is a good reason for it being
this way. With the current semantics it means that errors can pop up at
unexpected times rather than the code failing fast.

Martin
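
The behaviour described in the question can be reproduced without a database. The following is a minimal sketch in Python 3 syntax, where generator.next() is spelled next(generator). The names QueryError and select_column are hypothetical stand-ins for the thread's db3.QueryError and selectColumn, and the column count is passed in directly instead of coming from a real query.

```python
class QueryError(Exception):
    """Hypothetical stand-in for the thread's db3.QueryError."""


def select_column(rows, colcount):
    # Stand-in for selectColumn: the check lives in a generator body,
    # so it does not run when the function is called.
    if colcount != 1:
        raise QueryError("Query must return exactly one column")
    for row in rows:
        yield row[0]


# Calling the generator function only builds a generator object.
gen = select_column([(1, 2), (3, 4)], colcount=2)  # no exception here

try:
    next(gen)  # the body starts running only now
except QueryError as exc:
    first_error = str(exc)
else:
    first_error = None


def consume():
    # Consuming the generator inside the checked callable makes the
    # exception surface where a test expects it.
    return list(select_column([(1, 2)], colcount=2))
```

In unittest terms, one way to keep the original test meaningful would be to assert on consumption rather than on the call, for example `self.assertRaises(QueryError, list, gen)`.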

Recommended answer

On Wed, May 7, 2008 at 2:29 AM, Martin Sand Christensen <ms*@es.aau.dk> wrote:

> Now to the main point. When a generator function is run, it immediately
> returns a generator, and it does not run any code inside the generator.
> Not until generator.next() is called is any code inside the generator
> executed, giving it traditional lazy evaluation semantics. Why don't
> generators follow the usual eager evaluation semantics of Python and
> immediately execute up until right before the first yield instead?
> Giving generators special case semantics for no good reason is a really
> bad idea, so I'm very curious if there is a good reason for it being
> this way. With the current semantics it means that errors can pop up at
> unexpected times rather than the code failing fast.



Isn't lazy evaluation sort of the whole point of replacing a list with
an iterator? Besides which, running up to the first yield when
instantiated would make the generator's first iteration inconsistent
with the remaining iterations. Consider this somewhat contrived
example:

    def printing_iter(stuff):
        for item in stuff:
            print(item)
            yield item

Clearly, the idea here is to create a generator that wraps another
iterator and prints each item as it yields it. But using your
suggestion, this would instead print the first item at the time the
generator is created, rather than when the first item is actually
iterated over.

If you really want a generator that behaves the way you describe, I
suggest doing something like this:

    def myGenerator(args):
        # Runs immediately, when myGenerator() is called.
        immediate_setup_code()

        def generator():
            # Runs lazily, as the caller iterates.
            for item in actual_generator_loop():
                yield item
        return generator()
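
Applied to the column check from the question, the pattern above might look like the following sketch. QueryError and select_column are hypothetical stand-ins rather than the original db3 code, and the column count is passed in directly so the example runs without a database.

```python
class QueryError(Exception):
    """Hypothetical stand-in for the thread's db3.QueryError."""


def select_column(rows, colcount):
    # Ordinary function: everything up to the return runs at call time,
    # so an invalid query fails immediately.
    if colcount != 1:
        raise QueryError("Query must return exactly one column")

    def generate():
        # Only the row-by-row work is deferred.
        for row in rows:
            yield row[0]

    return generate()


try:
    select_column([(1, 2)], colcount=2)
except QueryError:
    failed_on_call = True
else:
    failed_on_call = False

column = select_column([(1,), (2,)], colcount=1)
```

With this shape the original `assertRaises(QueryError, select_column, ...)` style of test passes unchanged. For a loop this simple, returning a generator expression such as `(row[0] for row in rows)` would have the same effect as the inner function.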


Martin Sand Christensen <ms*@es.aau.dk> wrote:

> Now to the main point. When a generator function is run, it immediately
> returns a generator, and it does not run any code inside the generator.
> Not until generator.next() is called is any code inside the generator
> executed, giving it traditional lazy evaluation semantics. Why don't
> generators follow the usual eager evaluation semantics of Python and
> immediately execute up until right before the first yield instead?




You mean you expect the semantics of generators to be that when you
create them, or every time you call next() they run until they hit yield
and then (except for the initial run) return the result that was yielded
the time before? It is easy enough to implement that, but could be a bit
confusing for the user.


>>> def greedy(fn):
...     def greedygenerator(*args, **kw):
...         def delayed():
...             it = iter(fn(*args, **kw))
...             try:
...                 res = next(it)      # run fn up to its first yield
...             except StopIteration:
...                 yield None
...                 return
...             yield None              # pause until the caller iterates
...             for value in it:        # always stays one item ahead
...                 yield res
...                 res = value
...             yield res
...         it = delayed()
...         next(it)                    # consume the placeholder None
...         return it
...     return greedygenerator
...
>>> @greedy
... def mygen(n):
...     for i in range(n):
...         print(i)
...         yield i
...
>>> x = mygen(3)
0
>>> list(x)
1
2
[0, 1, 2]
>>> x = mygen(0)
>>> list(x)
[]
>>>




Now try:

    for command in getCommandsFromUser():
        print("the result of that command was", execute(command))

where getCommandsFromUser is a greedy generator that reads from stdin,
and see why generators don't work that way.
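
The stdin problem can be seen without a terminal by recording the order of events. The sketch below is an illustration under stated assumptions: it uses a Python 3 rendering of the greedy decorator from this post, and a scripted list stands in for user input. The names run, commands and events are hypothetical.

```python
import functools


def greedy(fn):
    """Python 3 rendering of the one-item-ahead decorator from the post."""
    @functools.wraps(fn)
    def greedygenerator(*args, **kw):
        def delayed():
            it = iter(fn(*args, **kw))
            try:
                res = next(it)
            except StopIteration:
                yield None
                return
            yield None
            for value in it:
                yield res
                res = value
            yield res
        it = delayed()
        next(it)  # run the wrapped generator up to its first yield
        return it
    return greedygenerator


def run(wrap, scripted_input):
    """Record the order of reads and command executions."""
    events = []

    def commands():
        # Stand-in for reading from stdin: each item models one prompt.
        for line in scripted_input:
            events.append("read " + line)
            yield line

    for command in wrap(commands)():
        events.append("ran " + command)
    return events


lazy_events = run(lambda gen: gen, ["a", "b"])
greedy_events = run(greedy, ["a", "b"])
```

With the plain generator each command is read and then run before the next one is read. With the greedy wrapper the second read happens before the first command has run, so an interactive user would be prompted for the next command before seeing the result of the previous one.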


>>>>> "Ian" == Ian Kelly <ia*********@gmail.com> writes:

Ian> Isn't lazy evaluation sort of the whole point of replacing a list
Ian> with an iterator? Besides which, running up to the first yield when
Ian> instantiated would make the generator's first iteration
Ian> inconsistent with the remaining iterations.

That wasn't my idea, although that may not have come across quite
clearly enough. I wanted the generator to immediately run until right
before the first yield so that the first call to next() would start with
the first yield.

My objection is that generators _by default_ have different semantics
than the rest of the language. Lazy evaluation as a concept is great for
all the benefits it can provide, but, as I've illustrated, strictly lazy
evaluation semantics can be somewhat surprising at times and lead to
problems that are hard to debug if you don't constantly bear the
difference in mind. In this respect, it seems to me that my suggestion
would be an improvement. I'm not any kind of expert on languages,
though, and I may very well be missing a part of the bigger picture that
makes it obvious why things should be as they are.

As for code to slightly change the semantics of generators, that doesn't
really address the issue as I see it: if you're going to apply such code
to your generators, you're probably doing it exactly because you're
aware of the difference in semantics, and you're not going to be
surprised by it. You may still want to change the semantics, but for
reasons that are irrelevant to my point.

Martin
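
The semantics Martin describes can only be approximated from outside the generator, because a wrapper cannot stop just before a yield. The sketch below runs the body up to and including the first yield at call time and then replays that first value. The decorator name run_to_first_yield and the example function checked_column are hypothetical, and this is an illustration, not something proposed in the thread.

```python
import functools
import itertools


def run_to_first_yield(fn):
    """Hypothetical decorator: start the generator body at call time."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        gen = fn(*args, **kwargs)
        try:
            # Runs the body up to and including the first yield, so any
            # setup code and early checks execute immediately.
            first = next(gen)
        except StopIteration:
            return iter(())  # the body finished without yielding
        # Replay the value already produced, then continue lazily.
        return itertools.chain([first], gen)
    return wrapper


@run_to_first_yield
def checked_column(rows, colcount):
    if colcount != 1:
        raise ValueError("Query must return exactly one column")
    for row in rows:
        yield row[0]


try:
    checked_column([(1, 2)], colcount=2)
except ValueError:
    failed_on_call = True
else:
    failed_on_call = False
```

Two differences from a native language feature are worth noting: the first value is computed eagerly rather than on the first next() call, and the returned object is an itertools.chain iterator, so generator methods such as send() and close() are not available on it.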

