How to turn an itertools "grouper" object into a list

Question

I am trying to learn how to use itertools.groupby in Python, and I wanted to find the size of each group of characters. At first I tried to see if I could find the length of a single group:

from itertools import groupby
len(list(list( groupby("cccccaaaaatttttsssssss") )[0][1]))

I get 0 every time.

I did a little research and found out that other people were doing it this way:

from itertools import groupby
for key,grouper in groupby("cccccaaaaatttttsssssss"):
    print(key, len(list(grouper)))

This works great. What confuses me is why the latter code works but the former does not. If I wanted to get only the nth group, as I was trying to do in my original code, how would I do that?

Recommended answer

The reason that your first approach doesn't work is that the groups get "consumed" when you create that list with

list(groupby("cccccaaaaatttttsssssss"))

To quote the groupby docs:

The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible.
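
A minimal sketch of what that quote means in practice, stepping through the groupby object by hand with next(). The variable names are only for illustration:

```python
from itertools import groupby

groups = groupby("cccccaaaaa")

# Take the first (key, group) pair but do not read the group yet.
key1, group1 = next(groups)

# Advancing the groupby object skips past the remaining 'c' items
# in the shared source in order to reach the next key.
key2, group2 = next(groups)

first_items = list(group1)   # the earlier group is no longer visible
second_items = list(group2)  # the current group is still readable

print(key1, first_items)
print(key2, second_items)
```

Here the first group comes back empty, while the second one still yields its five 'a' characters.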

Let's break it down into stages.

from itertools import groupby

a = list(groupby("cccccaaaaatttttsssssss"))
print(a)
b = a[0][1]
print(b)
print('So far, so good')
print(list(b))
print('What?!')

Output

[('c', <itertools._grouper object at 0xb715104c>), ('a', <itertools._grouper object at 0xb715108c>), ('t', <itertools._grouper object at 0xb71510cc>), ('s', <itertools._grouper object at 0xb715110c>)]
<itertools._grouper object at 0xb715104c>
So far, so good
[]
What?!

Our itertools._grouper object at 0xb715104c is empty because it shares its contents with the "parent" iterator returned by groupby, and those items are now gone because that first list call iterated over the parent.
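
For the "nth group" part of the question, one possible approach (a sketch, not the only option) is to turn each group into a list while the groupby object is still positioned on it. The helper name nth_group below is hypothetical, introduced only for this example:

```python
from itertools import groupby, islice

s = "cccccaaaaatttttsssssss"

# Materialize every group as it is produced, so later indexing is safe.
groups = [(key, list(group)) for key, group in groupby(s)]
first_key, first_items = groups[0]
print(first_key, len(first_items))


def nth_group(iterable, n):
    """Return (key, items) for the n-th group (0-based) without storing the others."""
    # islice advances groupby to the n-th pair and then stops,
    # so that pair's group is still readable when we list it.
    pair = next(islice(groupby(iterable), n, None), None)
    if pair is None:
        raise IndexError("group index out of range")
    key, group = pair
    return key, list(group)


print(nth_group(s, 3))
```

The list comprehension is the simpler choice when all groups are needed; the islice variant avoids keeping the other groups in memory when only one is wanted.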

This is really no different from what happens if you try to iterate twice over any iterator, e.g. a simple generator expression.

g = (c for c in 'python')
print(list(g))
print(list(g))

Output

['p', 'y', 't', 'h', 'o', 'n']
[]


BTW, here's another way to get the length of a groupby group if you don't actually need its contents; it's a little cheaper (and uses less RAM) than building a list just to find its length.

from itertools import groupby

for k, g in groupby("cccccaaaaatttttsssssss"):
    print(k, sum(1 for _ in g))

Output

c 5
a 5
t 5
s 7
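
If you would rather keep the sizes than print them, the same counting idea can be collected in a list comprehension. A small sketch (the name sizes is just illustrative):

```python
from itertools import groupby

s = "cccccaaaaatttttsssssss"

# Count each group while groupby is still positioned on it,
# so nothing has been consumed by the time we index the result.
sizes = [(key, sum(1 for _ in group)) for key, group in groupby(s)]

print(sizes)
print(sizes[0][1])  # size of the first group
```

Indexing into sizes afterwards is safe because it holds plain tuples, not live group iterators.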
