Python for loop and iterator behavior
Question
I wanted to understand a bit more about iterators, so please correct me if I'm wrong.
An iterator is an object which has a pointer to the next object and is read as a buffer or stream (i.e. a linked list). They're particularly efficient because all they do is tell you what is next by reference instead of using indexing.
However, I still don't understand why the following behavior happens:
In [1]: iter = (i for i in range(5))

In [2]: for _ in iter:
   ....:     print _
   ....:
0
1
2
3
4

In [3]: for _ in iter:
   ....:     print _
   ....:

In [4]:
After a first loop through the iterator (In [2]) it's as if it was consumed and left empty, so the second loop (In [3]) prints nothing.
However, I never assigned a new value to the iter variable.
What is really happening under the hood of the for loop?
Recommended answer
Your suspicion is correct: the iterator has been consumed.
More precisely, your iterator is a generator, which is an object that can be iterated through only once.
type((i for i in range(5)))  # says it's type generator

def another_generator():
    yield 1  # the yield expression makes this a generator function, not a regular function

type(another_generator())  # also a generator
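To make the one-shot behavior concrete, here is a minimal Python 3 sketch. The names gen, numbers, and the *_pass variables are illustrative and not part of the original question; the point is only that a second pass over a generator finds nothing, while a list can be traversed again.

```python
# A generator can be consumed only once; a list can be iterated repeatedly.
gen = (i for i in range(5))
first_pass = list(gen)    # consumes every item the generator will ever produce
second_pass = list(gen)   # the generator is already exhausted

numbers = list(range(5))
list_first = [n for n in numbers]
list_second = [n for n in numbers]  # a list hands out a fresh iterator each time

print(first_pass)   # [0, 1, 2, 3, 4]
print(second_pass)  # []
print(list_second)  # [0, 1, 2, 3, 4]
```

If the values are needed more than once, one option is to store them in a list first, or to re-create the generator before each loop.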
The reason they are efficient has nothing to do with telling you what is next "by reference." They are efficient because they only generate the next item upon request; the items are not all generated at once. In fact, you can have an infinite generator:
def my_gen():
    while True:
        yield 1  # again: yield means this is a generator function, not a regular function

for _ in my_gen(): print(_)  # hit Ctrl+C to stop this infinite loop!
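An infinite generator does not have to be stopped with Ctrl+C. As a small sketch (the squares function and first_five variable are hypothetical names chosen for illustration), itertools.islice can request just the first few items, and nothing beyond those items is ever computed:

```python
from itertools import count, islice

def squares():
    """Yield 0, 1, 4, 9, ... forever; each value is computed only when requested."""
    for n in count():
        yield n * n

# islice asks the generator for five items and then stops asking.
first_five = list(islice(squares(), 5))
print(first_five)  # [0, 1, 4, 9, 16]
```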
Some other corrections to help improve your understanding:
- The generator is not a pointer, and it does not behave like the pointers you may be familiar with from other languages.
- One of the differences from other languages: as said above, each result of the generator is generated on the fly. The next result is not produced until it is requested.
- The keyword combination for ... in accepts an iterable object as its second argument.
- The iterable object can be a generator, as in your example, but it can also be any other iterable object, such as a list, a dict, a str object (string), or a user-defined type that provides the required functionality.
- The iter function is applied to the object to get an iterator (by the way: don't use iter as a variable name in Python, as you have done; it is not a reserved keyword, but the assignment shadows the built-in function). Actually, to be more precise, the object's __iter__ method is called (which is, for the most part, all the iter function does anyway; __iter__ is one of Python's so-called "magic methods").
- If the call to __iter__ is successful, the function next() is applied to the resulting iterator over and over again, in a loop, and the first variable supplied to for ... in is assigned the result of each next() call. (Remember: the iterator could be a generator, a container object's iterator, or any other iterator.) Actually, to be more precise: it calls the iterator object's __next__ method, which is another "magic method".
- The for loop ends when next() raises the StopIteration exception (https://docs.python.org/3/library/exceptions.html#StopIteration), which usually happens when the iterator has no further object to yield when next() is called.
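The steps in the list above can be tried by hand with the built-in iter() and next() functions. The following is a minimal sketch with illustrative variable names (letters, list_iter_1, list_iter_2, gen, exhausted); it also shows why a generator cannot be restarted: calling iter() on a list returns a new iterator every time, whereas calling iter() on a generator returns the generator itself.

```python
letters = ["a", "b"]

list_iter_1 = iter(letters)
list_iter_2 = iter(letters)
print(list_iter_1 is list_iter_2)  # False: each call gives a new, independent iterator

gen = (c for c in letters)
print(iter(gen) is gen)  # True: a generator is its own iterator

print(next(list_iter_1))  # a
print(next(list_iter_1))  # b
try:
    next(list_iter_1)
    exhausted = False
except StopIteration:
    exhausted = True  # this exception is the signal that ends a for loop
print(exhausted)  # True
```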
You can "manually" implement a for loop in Python this way (probably not perfect, but close enough):
iterable = [1, 2, 3]  # example input so the snippet runs; any object can be tried here

try:
    temp = iterable.__iter__()
except AttributeError:  # catch the exception class, not an instance of it
    raise TypeError("'{}' object is not iterable".format(type(iterable).__name__))
else:
    while True:
        try:
            _ = temp.__next__()
        except StopIteration:
            break
        except AttributeError:
            raise TypeError("iter() returned non-iterator of type '{}'".format(type(temp).__name__))
        # this is the "body" of the for loop
        continue
There is pretty much no difference between the above and your example code.
Actually, the more interesting part of a for loop is not the for, but the in. Using in by itself produces a different effect than for ... in, but it is very useful to understand what in does with its arguments, since for ... in implements very similar behavior.
When used by itself, the in keyword first calls the object's __contains__ method, which is yet another "magic method" (note that this step is skipped when using for ... in). Using in by itself on a container, you can do things like this:
1 in [1, 2, 3] # True
'He' in 'Hello' # True
3 in range(10) # True
'eH' in 'Hello'[::-1] # True
If the object is NOT a container (i.e. it doesn't have a __contains__ method), the in operator next tries to call the object's __iter__ method. As was said previously, the __iter__ method returns what is known in Python as an iterator. Basically, an iterator is an object that you can use the built-in generic function next() on (see note 1). A generator is just one type of iterator.
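One consequence of this fallback is worth seeing in action: a generator has no __contains__ method, so a membership test has to iterate over it, and that consumes items. A minimal sketch, with illustrative variable names (found, remaining, missing):

```python
gen = (i for i in range(5))
found = 2 in gen        # no __contains__, so Python iterates: consumes 0, 1 and 2
remaining = list(gen)   # only the items after the match are left
print(found)      # True
print(remaining)  # [3, 4]

gen = (i for i in range(5))
missing = 10 in gen     # never found, so the whole generator is consumed
print(missing)    # False
print(list(gen))  # []
```

For an infinite generator, a membership test for a value that never appears would not terminate, so this form is best kept to finite iterators.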
If you wish to create your own object type to iterate over (i.e. you can use for ... in, or just in, on it), it's useful to know about the yield keyword, which is used in generators (as mentioned above).
class MyIterable():
    def __iter__(self):
        yield 1

m = MyIterable()
for _ in m: print(_)  # 1
1 in m  # True
The presence of yield turns a function or method into a generator instead of a regular function/method. You don't need the __next__ method if you use a generator (it brings __next__ along with it automatically).
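For comparison, here is a sketch of the same countdown written both ways. Both class names (CountdownWithYield, CountdownByHand) are hypothetical and chosen only for illustration. The version with yield gets __next__ for free and creates a new generator on every __iter__ call, so it can be looped over repeatedly; the hand-written iterator returns itself from __iter__ and is therefore exhausted after one pass, just like the generator in the question.

```python
class CountdownWithYield:
    """Hypothetical iterable: __iter__ is a generator method."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        n = self.start
        while n > 0:
            yield n
            n -= 1


class CountdownByHand:
    """Hypothetical iterator: implements the protocol without yield."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # an iterator returns itself from __iter__

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


with_yield = CountdownWithYield(3)
by_hand = CountdownByHand(3)
print(list(with_yield), list(with_yield))  # [3, 2, 1] [3, 2, 1]
print(list(by_hand), list(by_hand))        # [3, 2, 1] []
```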
If you wish to create your own container object type (i.e. you can use in on it by itself, but NOT for ... in), you just need the __contains__ method.
class MyUselessContainer():
    def __contains__(self, obj):
        return True

m = MyUselessContainer()
1 in m  # True
'Foo' in m  # True
TypeError in m  # True
None in m  # True
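To confirm that such a container really cannot be used with for ... in, here is a slightly less useless sketch. EvenNumbers is a hypothetical class invented for this example; it answers membership tests but defines no __iter__, so looping over it raises TypeError.

```python
class EvenNumbers:
    """Hypothetical container: 'contains' every even integer, but cannot be iterated."""
    def __contains__(self, obj):
        return isinstance(obj, int) and obj % 2 == 0


evens = EvenNumbers()
print(4 in evens)  # True
print(7 in evens)  # False

try:
    for _ in evens:  # there is no __iter__, so the for loop cannot start
        pass
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

In CPython the message reads along the lines of "'EvenNumbers' object is not iterable", which is the same error the manual for loop implementation above raises.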
1 Note that, to be an iterator, an object must implement the iterator protocol. This only means that both the __next__ and __iter__ methods must be correctly implemented (generators come with this functionality "for free", so you don't need to worry about it when using them). Also note that the __next__ method is actually called next (no underscores) in Python 2.
2 See this answer for the different ways to create iterable classes.