How do I use itertools.groupby()?


Question

I haven't been able to find an understandable explanation of how to actually use Python's itertools.groupby() function. What I'm trying to do is this:

  • Take a list - in this case, the children of an objectified lxml element
  • Divide it into groups based on some criteria
  • Then later iterate over each of these groups separately.

I've reviewed the documentation, but I've had trouble applying it to anything beyond a simple list of numbers.

So, how do I use itertools.groupby()? Is there another technique I should be using? Pointers to good "prerequisite" reading would also be appreciated.

Recommended answer

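IMPORTANT NOTE: You have to sort your data first, using the same key function you will group by. groupby() only groups consecutive items that share a key.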
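
To see why the sort matters, here is a minimal sketch. The `words` list and the choice of `len` as the key function are made up for illustration and are not from the question. On unsorted input the same key shows up more than once, because a new group starts every time the key changes:

```python
from itertools import groupby

# Hypothetical sample data, chosen so that equal keys are not adjacent
words = ["bear", "cactus", "duck", "bus", "cat"]

# Without sorting: a new group starts each time the key (word length) changes
unsorted_keys = [key for key, group in groupby(words, key=len)]

# Sort by the same key function first, then group
sorted_groups = {
    key: list(group)
    for key, group in groupby(sorted(words, key=len), key=len)
}

print(unsorted_keys)   # key 4 appears twice
print(sorted_groups)   # one entry per distinct length
```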

The part I didn't get is that in the example construction

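from itertools import groupby

# data and keyfunc are placeholders: any iterable and a key function for it
groups = []
uniquekeys = []
for k, g in groupby(data, keyfunc):
    groups.append(list(g))    # Store group iterator as a list
    uniquekeys.append(k)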

k is the current grouping key, and g is an iterator that you can use to iterate over the group defined by that grouping key. In other words, the groupby iterator itself returns iterators.
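
Each group iterator shares the underlying data with groupby(), so it is only usable until the outer loop moves on to the next key. That is why the construction above copies each group into a list. A small sketch with made-up data of what goes wrong if you keep the raw iterators instead:

```python
from itertools import groupby

data = [1, 1, 2, 3, 3]  # illustrative data, already sorted

# Keeping the raw group iterators: by the time we read them,
# groupby has already advanced past those groups
lazy = [(k, g) for k, g in groupby(data)]
lazy_contents = [list(g) for k, g in lazy]

# Copying each group into a list while it is still the current group
eager = [(k, list(g)) for k, g in groupby(data)]

print(lazy_contents[0])  # the first group comes back empty
print(eager)
```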

Here's an example of that, using clearer variable names:

from itertools import groupby

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"), ("vehicle", "speed boat"), ("vehicle", "school bus")]

for key, group in groupby(things, lambda x: x[0]):
    for thing in group:
        print("A %s is a %s." % (thing[1], key))
    print("")
    

This will give you the output:

A bear is a animal.
A duck is a animal.

A cactus is a plant.

A speed boat is a vehicle.
A school bus is a vehicle.

In this example, things is a list of tuples where the first item in each tuple is the group the second item belongs to.

The groupby() function takes two arguments: (1) the data to group and (2) the function to group it with. The second argument is optional; without it, each item is used as its own key.
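
As a quick sketch of the one-argument form (the string "aaabbc" is just illustrative input), leaving out the key function groups runs of identical items:

```python
from itertools import groupby

# With no key function, each item is its own key,
# so consecutive equal characters end up in the same group
runs = [(char, len(list(group))) for char, group in groupby("aaabbc")]

print(runs)
```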

Here, lambda x: x[0] tells groupby() to use the first item in each tuple as the grouping key.
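
The key function does not have to be a lambda. As a sketch of an equivalent spelling, operator.itemgetter(0) from the standard library picks out the same first item. The things list is repeated here only so the snippet runs on its own:

```python
from itertools import groupby
from operator import itemgetter

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# itemgetter(0) returns the first element of each tuple, like lambda x: x[0]
keys_lambda = [key for key, _ in groupby(things, lambda x: x[0])]
keys_getter = [key for key, _ in groupby(things, itemgetter(0))]

print(keys_getter)
```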

In the above for statement, groupby yields three (key, group iterator) pairs, one for each unique key. You can use the returned iterator to iterate over each individual item in that group.

Here's a slightly different example with the same data, using a list comprehension:

for key, group in groupby(things, lambda x: x[0]):
    listOfThings = " and ".join([thing[1] for thing in group])
    print(key + "s: " + listOfThings + ".")

This will give you the output:

animals: bear and duck.
plants: cactus.
vehicles: speed boat and school bus.
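
On the question of whether another technique might fit: if the goal is to build the groups now and iterate over them later, one common alternative (not part of the answer above) is a dictionary of lists. A minimal sketch, assuming data shaped like things but deliberately out of order, since this approach needs no prior sort:

```python
from collections import defaultdict

# Same shape as the things data, but unsorted on purpose
things = [("vehicle", "speed boat"), ("animal", "bear"), ("plant", "cactus"),
          ("animal", "duck"), ("vehicle", "school bus")]

# Map each key to the list of items that belong to it
grouped = defaultdict(list)
for kind, name in things:
    grouped[kind].append(name)

# The groups are plain lists, so they can be iterated any number of times later
for kind, names in grouped.items():
    print(kind + "s: " + " and ".join(names) + ".")
```

Unlike groupby(), this holds every group in memory at once, which is usually fine for a list of element children.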
