itertools groupby object not outputting correctly


Question

I was trying to use itertools.groupby to group a list of integers into runs by sign (positive or negative). For example:

the input

[1,2,3, -1,-2,-3, 1,2,3, -1,-2,-3] 

should return

[[1,2,3],[-1,-2,-3],[1,2,3],[-1,-2,-3]]

But if I do this, the groups come back empty or nearly empty:

import itertools

nums = [1,2,3, -1,-2,-3, 1,2,3, -1,-2,-3]
group_list = list(itertools.groupby(nums, key=lambda x: x>=0))
print(group_list)
for k, v in group_list:
    print(list(v))
>>>
[]
[-3]
[]
[]

But if I don't call list() on the groupby object, it works fine:

nums = [1,2,3, -1,-2,-3, 1,2,3, -1,-2,-3]
group_list = itertools.groupby(nums, key=lambda x: x>=0)
for k, v in group_list:
    print(list(v))
>>>
[1, 2, 3]
[-1, -2, -3]
[1, 2, 3]
[-1, -2, -3]

What I don't understand is this: a groupby object is an iterator that yields pairs of a key and a _grouper object, so shouldn't calling list() on the groupby object leave the _grouper objects unconsumed?

And even if it did consume them, how did I get [-3] from the second element?

Recommended answer

The docs explicitly note that advancing the groupby object renders the previous group unusable (in practice, empty):


The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.
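To make the quoted behavior concrete, here is a minimal sketch that advances the groupby object by hand with next(). The variable names (groups, stale, fresh) are illustrative only and are not part of the original question.

```python
import itertools

nums = [1, 2, 3, -1, -2, -3]
groups = itertools.groupby(nums, key=lambda x: x >= 0)

# Take the first (key, group) pair, but do not read the group yet.
first_key, first_group = next(groups)

# Advancing the groupby object moves the shared iterator past the first group.
second_key, second_group = next(groups)

stale = list(first_group)    # the previous group is no longer visible
fresh = list(second_group)   # the current group can still be read

print(first_key, stale)
print(second_key, fresh)
```

Calling list() on the whole groupby object does the same thing repeatedly: it advances past every group before any of them has been read.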

Basically, instead of converting directly with the list constructor, you need a list comprehension that converts each group iterator to a list before the groupby object advances. Replace:

group_list = list(itertools.groupby(nums, key=lambda x: x>=0))

with:

group_list = [(k, list(g)) for k, g in itertools.groupby(nums, key=lambda x: x>=0)]
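Put together with the input from the question, the fix can be sketched as a complete script. The helper name is_non_negative and the variable runs are assumptions added here for readability; they do not appear in the original post.

```python
import itertools

nums = [1, 2, 3, -1, -2, -3, 1, 2, 3, -1, -2, -3]


def is_non_negative(x):
    # Hypothetical helper, equivalent to the question's lambda x: x >= 0.
    return x >= 0


# Materialize each group while it is still the current one.
group_list = [(k, list(g)) for k, g in itertools.groupby(nums, key=is_non_negative)]

# Drop the keys to get the nested list the question asked for.
runs = [values for _, values in group_list]
print(runs)
```

Because each group is copied into a list inside the comprehension, the data survives after the groupby object moves on.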

Most itertools types are designed to avoid storing data implicitly, because they are meant to work with potentially huge inputs. If every grouper stored a copy of its data from the input (and the groupby object had to retroactively populate them), it would get ugly and could blow up memory by accident. By forcing you to store the values explicitly, the design keeps you from unintentionally holding unbounded amounts of data, per the Zen of Python:


Explicit is better than implicit.
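As an illustration of why this lazy design matters, the sketch below runs groupby over an endless counter. The grouping key (n // 3) and the names endless, blocks, and first_three are assumptions chosen for the example; only the groups that are explicitly converted to lists are ever held in memory.

```python
import itertools

# An unbounded source: 0, 1, 2, 3, ...
endless = itertools.count()

# Group consecutive integers into blocks of three by integer division.
blocks = itertools.groupby(endless, key=lambda n: n // 3)

# Take only the first three groups, storing each one explicitly as a list
# while it is still the current group.
first_three = [(k, list(g)) for k, g in itertools.islice(blocks, 3)]
print(first_three)
```

If groupby stored every group implicitly, this kind of input could never be processed; here the script finishes after reading only the first few values.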
