How do I use Python's itertools.groupby()?
Question
I haven't been able to find an understandable explanation of how to actually use Python's itertools.groupby() function. What I'm trying to do is this:
- Take a list (in this case, the children of an objectified lxml element)
- Divide it into groups based on some criteria
- Then later iterate over each of these groups separately.
I've reviewed the documentation and the examples, but I've had trouble trying to apply them beyond a simple list of numbers.
So, how do I use itertools.groupby()? Is there another technique I should be using? Pointers to good "prerequisite" reading would also be appreciated.
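Recommended answer
IMPORTANT NOTE: You have to sort your data first. groupby() only groups consecutive items that share a key, so sort with the same key function you group by.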
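A minimal sketch of why the sort matters. The `words` list and the `first_letter` helper are made up for illustration and are not part of the original post:

```python
from itertools import groupby

# Hypothetical sample data: words to group by their first letter.
words = ["apple", "bear", "avocado", "banana"]

def first_letter(word):
    return word[0]

# Without sorting, groupby starts a new group every time the key changes,
# so the same key can show up more than once.
unsorted_keys = [key for key, _ in groupby(words, first_letter)]

# Sorting by the same key function puts equal keys next to each other.
sorted_words = sorted(words, key=first_letter)
sorted_groups = [(key, list(group)) for key, group in groupby(sorted_words, first_letter)]

print(unsorted_keys)   # ['a', 'b', 'a', 'b']
print(sorted_groups)   # [('a', ['apple', 'avocado']), ('b', ['bear', 'banana'])]
```

Because Python's sort is stable, items keep their original relative order inside each group.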
The part I didn't get is that in the example construction
from itertools import groupby

groups = []
uniquekeys = []
for k, g in groupby(data, keyfunc):  # data must already be sorted by keyfunc
    groups.append(list(g))  # Store group iterator as a list
    uniquekeys.append(k)
k is the current grouping key, and g is an iterator that you can use to iterate over the group defined by that grouping key. In other words, the groupby iterator itself returns iterators.
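One consequence worth seeing in code: each group iterator shares the underlying data with the groupby iterator, so a group is only usable until groupby moves on to the next key. The sketch below uses a made-up string, `letters`, as sample data:

```python
from itertools import groupby

# Hypothetical sample data, already ordered so equal keys are adjacent.
letters = "AAB"

# Collecting the (key, group) pairs first advances groupby past every group,
# so the first group iterator has nothing left to yield afterwards.
pairs = list(groupby(letters))
first_key, first_group = pairs[0]
leftover = list(first_group)

# Turning each group into a list while it is current keeps its items.
saved = {key: list(group) for key, group in groupby(letters)}

print(first_key, leftover)  # A []
print(saved)                # {'A': ['A', 'A'], 'B': ['B']}
```

This is why the snippet above calls list(g) inside the loop instead of storing g itself.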
Here's an example of that, using clearer variable names:
from itertools import groupby

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"), ("vehicle", "speed boat"), ("vehicle", "school bus")]

for key, group in groupby(things, lambda x: x[0]):
    for thing in group:
        print("A %s is a %s." % (thing[1], key))
    print()  # Blank line between groups
This will give you the output:
A bear is a animal.
A duck is a animal.

A cactus is a plant.

A speed boat is a vehicle.
A school bus is a vehicle.
In this example, things is a list of tuples where the first item in each tuple is the group the second item belongs to.
The groupby() function takes two arguments: (1) the data to group and (2) the function to group it with.
Here, lambda x: x[0] tells groupby() to use the first item in each tuple as the grouping key.
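As a side note, the key function does not have to be a lambda. A small sketch, assuming a shortened copy of the `things` list, shows that `operator.itemgetter(0)` from the standard library picks out the same grouping key:

```python
from itertools import groupby
from operator import itemgetter

# Shortened copy of the `things` list, for illustration only.
things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus")]

# Both key functions return the first item of each tuple.
by_lambda = [(k, list(g)) for k, g in groupby(things, lambda x: x[0])]
by_itemgetter = [(k, list(g)) for k, g in groupby(things, key=itemgetter(0))]

print(by_lambda == by_itemgetter)       # True
print([k for k, _ in by_itemgetter])    # ['animal', 'plant']
```

Which form to use is mostly a matter of taste; the lambda is shown in the answer because it makes the indexing explicit.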
In the above for statement, groupby returns three (key, group iterator) pairs, one for each unique key. You can use the returned iterator to iterate over each individual item in that group.
Here's a slightly different example with the same data, using a list comprehension:
for key, group in groupby(things, lambda x: x[0]):
    list_of_things = " and ".join([thing[1] for thing in group])
    print(key + "s: " + list_of_things + ".")
This will give you the output:
animals: bear and duck.
plants: cactus.
vehicles: speed boat and school bus.
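On the question of whether another technique might fit: when the data is not sorted and fits in memory, a plain dictionary of lists is a common alternative. The sketch below is not from the original answer; it uses `collections.defaultdict` and a deliberately shuffled copy of the sample data:

```python
from collections import defaultdict

# Same kind of data as `things`, deliberately out of order.
things = [("vehicle", "speed boat"), ("animal", "bear"), ("plant", "cactus"),
          ("animal", "duck"), ("vehicle", "school bus")]

# Each missing key starts out as an empty list, so no sorting is needed.
grouped = defaultdict(list)
for category, name in things:
    grouped[category].append(name)

for category, names in grouped.items():
    print(category + "s: " + " and ".join(names) + ".")
```

The trade-off is that every group is held in memory at once, whereas groupby() can stream through sorted input one group at a time.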