itertools.flatten()? and copying generators/iterators.


Problem description


Below is an implementation of a 'flattening' recursive generator (take a nested
iterator and remove all its nesting). Is this possibly general and useful
enough to be included in itertools? (I know *I* wanted something like it...)

Very basic examples:

>>> rl = [1, [2, 3, [4, 5], '678', 9]]
>>> list(flatten(rl))
[1, 2, 3, 4, 5, '6', '7', '8', 9]
>>> notstring = lambda obj: not isinstance(obj, type(''))
>>> list(flatten(rl, notstring))
[1, 2, 3, 4, 5, '678', 9]
>>> isstring = lambda obj: not notstring(obj)
>>> list(flatten(rl, isstring))
[1, [2, 3, [4, 5], '678', 9]]
>>> # The string is within a list, so we never descend that far.
>>> car_is_2 = lambda obj: isinstance(obj, type([])) and obj[0] == 2
>>> list(flatten(rl, car_is_2))
[1, 2, 3, [4, 5], '678', 9]
>>> rls = ['Here', 'are', ['some', ['nested'], 'strings']]
>>> list(flatten(rls))
['H', 'e', 'r', 'e', 'a', 'r', 'e', 's', 'o', 'm', 'e', 'n', 'e', 's', 't',
'e', 'd', 's', 't', 'r', 'i', 'n', 'g', 's']
>>> list(flatten(rls, notstring))
['Here', 'are', 'some', 'nested', 'strings']
>>> rli = iter([1, 2, iter(['abc', iter('ABC')]), 4])
>>> list(flatten(rli))
[1, 2, 'a', 'b', 'c', 'A', 'B', 'C', 4]
>>> list(flatten(rli, notstring))
[]
>>> # rli is an iterator, remember!
>>> rli = iter([1, 2, iter(['abc', iter('ABC')]), 4])
>>> list(flatten(rli, notstring))
[1, 2, 'abc', 'A', 'B', 'C', 4]
>>> # The following I'm not sure what to do about...
>>> empty = [1, [], 3]
>>> emptyiter = [1, iter([]), 3]
>>> list(flatten(empty))
[1, [], 3]
>>> list(flatten(emptyiter))
[1, 3]



I tried hard to get it to work with iterator and generator objects, too, and
it mostly does. However, I'm having some problems determining whether a
given object will iterate infinitely, if that object is already a
generator/iterator. Basically, I'm unable to copy an iterator (why?). See
isemptyorself() below for details. Aside from that, I'm just generally
unsure what the proper behavior should be when an iterator/generator is
encountered.
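The difficulty described above can be illustrated with a short sketch. It assumes Python 3 (the thread itself targets Python 2.2/2.3), and countdown() is a hypothetical generator invented only for this illustration. It shows that an iterator is its own iterator, that peeking at it discards a value, and that the copy module rejects generator objects.

```python
import copy


def countdown(n):
    """Hypothetical generator used only for this illustration."""
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)

# An iterator (or generator) returns itself from iter(), so there is
# no fresh iterator to probe without disturbing the original.
same_object = iter(gen) is gen

# Peeking consumes a value: the 3 is gone for every later consumer.
first = next(gen)
remaining = list(gen)

# The copy module refuses generator objects (TypeError in CPython 3).
try:
    copy.copy(countdown(3))
    copy_failed = False
except TypeError:
    copy_failed = True

print(same_object, first, remaining, copy_failed)
```

Containers such as lists do not have this problem, because each call to iter() on a list builds a new, independent iterator.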

Also, why is the iterator type not included in the types module or described
in the language reference (Python 2.2)?

--- Code ---
def isiterable(obj):
    try: iter(obj)
    except TypeError: return False
    else: return True

def isemptyorself(iterable):
    """True if iterable yields nothing or itself."""
    it = iter(iterable)

    # KLUDGE! This test must modify the object in order to test
    # it. This means that a value of iterable will be discarded!
    # Most likely, iterable is itself an iterator or generator,
    # because id(iter(GENR or ITER)) == id(GENR or ITER).
    # Unfortunately, we can't copy generators and iterators using
    # the copy module, so we must just assume that this iterator
    # doesn't yield itself or nothing....

    if it is iterable:
        return False

    try: res = it.next()
    except StopIteration: #Yields nothing
        return True
    else:
        if res == iterable: #Yields itself
            return True
    return False

def flatten(iterable, isnested=None):
    """Iterate items in iterable, descending into nested items.

    isnested is a function that returns true if the element of
    iterable should be descended into. The default is to
    consider iterable anything that iter() thinks is iterable (unless
    doing so would cause an infinite recursion).

    """
    if isnested is None:
        isnested = lambda obj: True #Always descend

    for item in iterable:
        if isiterable(item) and not isemptyorself(item) \
           and isnested(item):
            for subitem in flatten(item, isnested):
                yield subitem
        else:
            yield item
--
Francis Avila
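For readers on a current interpreter, here is a minimal sketch of the code above ported to Python 3. This is an assumption-laden rendering, not the poster's own code: it.next() becomes next(it), the nested loop becomes yield from, and the string test uses str instead of type(''). The behavior is intended to match the examples in the post.

```python
def isiterable(obj):
    """True if iter() accepts obj."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


def isemptyorself(iterable):
    """True if a re-iterable container is empty or yields itself first."""
    it = iter(iterable)
    if it is iterable:
        # Already an iterator: probing would consume a value, so assume
        # it is neither empty nor self-yielding.
        return False
    try:
        first = next(it)  # Python 3 spelling of it.next()
    except StopIteration:
        return True  # yields nothing
    return first == iterable  # e.g. 'a' yields 'a'


def flatten(iterable, isnested=None):
    """Yield items of iterable, descending where isnested(item) is true."""
    if isnested is None:
        isnested = lambda obj: True  # always descend
    for item in iterable:
        if isiterable(item) and not isemptyorself(item) and isnested(item):
            yield from flatten(item, isnested)
        else:
            yield item


notstring = lambda obj: not isinstance(obj, str)
rl = [1, [2, 3, [4, 5], '678', 9]]
print(list(flatten(rl)))
print(list(flatten(rl, notstring)))
print(list(flatten([1, iter([]), 3])))
```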

Solution

[Francis Avila]

> Below is an implementation of a 'flattening' recursive generator (take a nested
> iterator and remove all its nesting). Is this possibly general and useful
> enough to be included in itertools? (I know *I* wanted something like it...)

Interesting post!

I'll add your idea to the list of itertool suggestions received so far. My
first take is that it is more likely to be added to the "recipes" in the
examples section.

Core itertools should be primitive building blocks that combine with one
another to make other tools. Also, I'm trying to keep the toolset as small
as possible by excluding new tools that can be easily and efficiently coded
in pure Python.

I would code it like this:

    def flatten(s):
        try:
            iter(s)
        except TypeError:
            yield s
        else:
            for elem in s:
                for subelem in flatten(elem):
                    yield subelem

As your examples show, it does have some surprising behavior in that
strings get split apart instead of staying intact. That is easily taken care of
by an AtomicString subclass:

    class AtomicString(str):
        def __iter__(self):
            raise TypeError

    a = [1, 2, AtomicString('abc'), 4]
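A minimal sketch of this recipe under Python 3 (an assumption; the thread predates it) shows the AtomicString trick in action. One caveat worth checking for yourself: a plain string does more than split into characters here. A one-character string yields itself when iterated, so the recursion does not terminate and the interpreter raises RecursionError.

```python
def flatten(s):
    """Python 3 rendering of the recursive flatten shown above."""
    try:
        iter(s)
    except TypeError:
        yield s
    else:
        for elem in s:
            yield from flatten(elem)


class AtomicString(str):
    """A str that refuses iteration, so flatten() treats it as a leaf."""
    def __iter__(self):
        raise TypeError("AtomicString is not iterable")


result = list(flatten([1, [2, [3]], AtomicString('abc'), 4]))
print(result)

# A plain str is different: iterating 'a' yields 'a' again, so the
# recursion never bottoms out and Python 3 raises RecursionError.
try:
    list(flatten(['abc']))
    plain_string_recursed = False
except RecursionError:
    plain_string_recursed = True
print(plain_string_recursed)
```

Wrapping every string in AtomicString puts the burden on the caller, which is the trade-off the later replies in this thread try to avoid.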

> # The following I'm not sure what to do about...
> empty = [1, [], 3]
> emptyiter = [1, iter([]), 3]

The above code definition handles this without a problem.

> I'm having some problems determining whether a
> given object will iterate infinitely, if that object is already a
> generator/iterator.

In general, that is not knowable.

> Basically, I'm unable to copy an iterator (why?).

Alex Martelli just wrote a PEP about making many iterators copyable.
See www.python.org/peps/pep-0323.html

Until that is adopted, your best bet is to use tee():

    from itertools import count   # needed for count() below

    def tee(iterable):
        "Return two independent iterators from a single iterable"
        def gen(next, data={}, cnt=[0]):
            dpop = data.pop
            for i in count():
                if i == cnt[0]:
                    item = data[i] = next()
                    cnt[0] += 1
                else:
                    item = dpop(i)
                yield item
        next = iter(iterable).next
        return (gen(next), gen(next))
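Current Python ships a tee() in the standard itertools module, so the recipe above no longer has to be copied by hand. The sketch below assumes Python 3 and uses it to test an iterator for emptiness without losing a value. peek_is_empty() is a hypothetical helper named for this illustration only, not part of any library.

```python
from itertools import tee


def peek_is_empty(iterator):
    """Return (is_empty, replacement) without losing any value.

    The caller must use the returned replacement iterator afterwards;
    the original iterator should not be touched again.
    """
    probe, keep = tee(iterator)
    try:
        next(probe)           # advances only the probe branch
    except StopIteration:
        return True, keep
    return False, keep


empty, it = peek_is_empty(iter([]))
print(empty, list(it))

empty, it = peek_is_empty(iter([1, 2, 3]))
print(empty, list(it))
```

The cost is buffering: tee() has to remember every item one branch has seen and the other has not, so this is a poor fit for very long or unbounded iterators.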

> Also, why is the iterator type not included in the types module or described
> in the language reference (Python 2.2)?

There is no iterator type. Iterators can be any object that supports the
iterator protocol: __iter__() and next(). Generators, container iterators,
and each of the itertools define their own type.
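As a rough illustration of the protocol, here is a sketch assuming Python 3, where the method is spelled __next__() rather than next(). Countdown is a hypothetical class written for this example. The isinstance() check against collections.abc.Iterator is a later addition to the language that recognizes any object with both methods; it is not something available in Python 2.2.

```python
from collections.abc import Iterator


class Countdown:
    """Hypothetical iterator: any class with these two methods qualifies."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self            # iterators return themselves

    def __next__(self):        # spelled next() in Python 2
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


it = Countdown(3)
values = list(it)
print(values)

# Each kind of iterator has its own concrete type; what they share is
# the protocol, which the Iterator ABC detects structurally.
list_iter_type = type(iter([]))
tuple_iter_type = type(iter(()))
print(list_iter_type is tuple_iter_type)
print(isinstance(Countdown(1), Iterator), isinstance(iter([]), Iterator))
```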

This was a long but highly instructive pair of posts. I hope it becomes
widely read. Your work ventured into previously uncharted
territory and touched on a number of frontier issues like copyability,
determining whether something is iterable, the various iterator types,
and the deep mathematical subject of determining whether a given
process is finite.
Raymond Hettinger


Raymond Hettinger wrote:

> Core itertools should be primitive building blocks that combine with one
> another to make other tools. Also, I'm trying to keep the toolset as
> small as possible by excluding new tools that can be easily and
> efficiently coded in pure Python.
>
> I would code it like this:
>
>     def flatten(s):
>         try:
>             iter(s)
>         except TypeError:
>             yield s
>         else:
>             for elem in s:
>                 for subelem in flatten(elem):
>                     yield subelem
>
> As your examples show, it does have some surprising behavior in that
> strings get split apart instead of staying intact. That is easily taken
> care of by an AtomicString subclass:
>
>     class AtomicString(str):
>         def __iter__(self):
>             raise TypeError
>
>     a = [1, 2, AtomicString('abc'), 4]
>
> > # The following I'm not sure what to do about...
> > empty = [1, [], 3]
> > emptyiter = [1, iter([]), 3]



I suggest a minor modification:

    def flatten(s, toiter=iter):
        try:
            it = toiter(s)
        except TypeError:
            yield s
        else:
            for elem in it:
                for subelem in flatten(elem, toiter):
                    yield subelem

    def keepstrings(seq):
        if isinstance(seq, basestring):
            raise TypeError
        return iter(seq)

    sample = [1, 2, [3, "abc def".split()], 4]
    print sample
    print list(flatten(sample, keepstrings))
    print list(flatten([1, [], 3]))
    print list(flatten([1, iter([]), 3]))

The above handles strings in a way that is nonintrusive on client code.
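The same idea can be sketched for Python 3. Two things here are assumptions of this port rather than part of the post: basestring no longer exists, so the check uses str and bytes, and the nested loop is written with yield from.

```python
def flatten(s, toiter=iter):
    """Flatten s, using toiter to decide what may be iterated."""
    try:
        it = toiter(s)
    except TypeError:
        yield s                # toiter refused: treat s as a leaf
    else:
        for elem in it:
            yield from flatten(elem, toiter)


def keepstrings(seq):
    """Like iter(), but refuses strings so they stay intact."""
    if isinstance(seq, (str, bytes)):   # Python 3 has no basestring
        raise TypeError("strings are treated as atoms")
    return iter(seq)


sample = [1, 2, [3, "abc def".split()], 4]
print(list(flatten(sample, keepstrings)))
print(list(flatten([1, [], 3])))
print(list(flatten([1, iter([]), 3])))
```

Because the policy lives in the toiter argument, callers can pass ordinary strings and choose per call whether they are atoms, with no wrapper class needed.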

Peter


Peter Otten wrote:
> ...
>     def keepstrings(seq):
>         if isinstance(seq, basestring):
>             raise TypeError
>         return iter(seq)
> ...
> The above handles strings in a way that is nonintrusive on client code.



Yes, very nice. I'd make keepstrings the default (one RARELY wants
to treat strings as nonatomic) AND replace the typetest with an
attempt to do a seq+'' (if that raises, seq ain't a string),
but that's just me:-).
Alex
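One way to read this suggestion, sketched for Python 3: try seq + '' and treat anything that supports it as an atom. stringlike_aware_iter() is a hypothetical name chosen for this illustration, and making it the default is the variation proposed in the post, not an established recipe. The point of the concatenation test is that it also catches string-like classes that do not inherit from str, such as collections.UserString.

```python
from collections import UserString


def stringlike_aware_iter(seq):
    """Hypothetical toiter: refuse anything that concatenates with ''."""
    try:
        seq + ''
    except TypeError:
        # Not string-like: iterate normally. iter() itself raises
        # TypeError for non-iterables such as integers.
        return iter(seq)
    raise TypeError("string-like objects are treated as atoms")


def flatten(s, toiter=stringlike_aware_iter):
    """Flatten s, keeping string-like objects intact by default."""
    try:
        it = toiter(s)
    except TypeError:
        yield s
    else:
        for elem in it:
            yield from flatten(elem, toiter)


print(list(flatten([1, ['ab', ['cd']], 'ef'])))
print(list(flatten([1, [UserString('xy')]])))
```

Note that bytes objects do not concatenate with '' in Python 3, so under this test they are iterated into integers; whether that is desirable depends on the caller.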

