Group list of tuples by item

Problem description

I have the following list as an example:

[(148, Decimal('3.0')), (325, Decimal('3.0')), (148, Decimal('2.0')), (183, Decimal('1.0')), (308, Decimal('1.0')), (530, Decimal('1.0')), (594, Decimal('1.0')), (686, Decimal('1.0')), (756, Decimal('1.0')), (806, Decimal('1.0'))]

Now I want to group by the id, so I will use itemgetter(0):

import itertools
import operator
from decimal import Decimal

test = [(148, Decimal('3.0')), (325, Decimal('3.0')), (148, Decimal('2.0')), (183, Decimal('1.0')), (308, Decimal('1.0')), (530, Decimal('1.0')), (594, Decimal('1.0')), (686, Decimal('1.0')), (756, Decimal('1.0')), (806, Decimal('1.0'))]

# Group by the first element of each tuple (the id).
for _k, data in itertools.groupby(test, operator.itemgetter(0)):
    print(list(data))

I don't know why, but I am getting this wrong output:

[(148, Decimal('3.0'))]
[(325, Decimal('3.0'))]
[(148, Decimal('2.0'))]
[(183, Decimal('1.0'))]
[(308, Decimal('1.0'))]
[(530, Decimal('1.0'))]
[(594, Decimal('1.0'))]
[(686, Decimal('1.0'))]
[(756, Decimal('1.0'))]
[(806, Decimal('1.0'))]

As you can see, the output is not grouped by id. However, the code above works fine if I use itemgetter(1): the output is then grouped by the Decimal value.

[(148, Decimal('3.0')), (325, Decimal('3.0'))]
[(148, Decimal('2.0'))]
[(183, Decimal('1.0')), (308, Decimal('1.0')), (530, Decimal('1.0')), (594, Decimal('1.0')), (686, Decimal('1.0')), (756, Decimal('1.0')), (806, Decimal('1.0'))]

What am I missing here?

Recommended answer

You first need to sort the data for groupby to work, because it only groups consecutive elements that share the key you provide:

import itertools
import operator
from decimal import Decimal

test = [(148, Decimal('3.0')), (325, Decimal('3.0')), (148, Decimal('2.0')), (183, Decimal('1.0')), (308, Decimal('1.0')), (530, Decimal('1.0')), (594, Decimal('1.0')), (686, Decimal('1.0')), (756, Decimal('1.0')), (806, Decimal('1.0'))]

# Sort by id first so equal ids are adjacent, then group on the same key.
get_id = operator.itemgetter(0)
for _k, data in itertools.groupby(sorted(test, key=get_id), get_id):
    print(list(data))
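
To see why the sort matters, here is a minimal sketch on a smaller, made-up list (the `pairs` data and the variable names are illustrative and not from the question). It compares the keys that groupby emits before and after sorting:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data: id 148 appears twice, but not adjacently.
pairs = [(148, 'a'), (325, 'b'), (148, 'c')]
get_id = itemgetter(0)

# Without sorting, groupby starts a new group every time the key changes,
# so 148 shows up as two separate groups.
unsorted_keys = [k for k, _ in groupby(pairs, key=get_id)]

# After sorting on the same key, equal ids are adjacent and form one group.
sorted_groups = [(k, list(g)) for k, g in groupby(sorted(pairs, key=get_id), key=get_id)]

print(unsorted_keys)  # [148, 325, 148]
print(sorted_groups)  # [(148, [(148, 'a'), (148, 'c')]), (325, [(325, 'b')])]
```

This also suggests why itemgetter(1) appeared to work in the question: in that list, equal Decimal values already sit next to each other.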

But you would be better off using a dict to group, which avoids an unnecessary O(n log n) sort:

from collections import defaultdict

d = defaultdict(list)

for t in test:
    d[t[0]].append(t)

for v in d.values():
    print(v)

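Both give you the same groupings, just not necessarily in the same order: the sorted groupby version yields groups in ascending id order, while the dict yields them in the order each id is first seen (on Python 3.7+, where dicts preserve insertion order).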
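
As a rough check of that claim, the sketch below runs both approaches on the list from the question and compares the results. The helper names `group_with_sort` and `group_with_dict` are made up for illustration, sorting by id only (rather than by the whole tuple) is an assumption that keeps tuples in their original order within each group, and the key-order comments assume Python 3.7+.

```python
from collections import defaultdict
from decimal import Decimal
from itertools import groupby
from operator import itemgetter

test = [
    (148, Decimal('3.0')), (325, Decimal('3.0')), (148, Decimal('2.0')),
    (183, Decimal('1.0')), (308, Decimal('1.0')), (530, Decimal('1.0')),
    (594, Decimal('1.0')), (686, Decimal('1.0')), (756, Decimal('1.0')),
    (806, Decimal('1.0')),
]

def group_with_sort(items):
    """Group by id using a stable sort followed by groupby: O(n log n)."""
    get_id = itemgetter(0)
    return {k: list(g) for k, g in groupby(sorted(items, key=get_id), key=get_id)}

def group_with_dict(items):
    """Group by id in a single pass with a defaultdict: O(n)."""
    groups = defaultdict(list)
    for item in items:
        groups[item[0]].append(item)
    return dict(groups)

by_sort = group_with_sort(test)
by_dict = group_with_dict(test)

# Dict equality ignores key order, so this compares only the groupings.
print(by_sort == by_dict)  # True
print(list(by_sort))       # [148, 183, 308, 325, 530, 594, 686, 756, 806]
print(list(by_dict))       # [148, 325, 183, 308, 530, 594, 686, 756, 806]
```

If the groups are needed in id order anyway, sorting the much smaller set of keys afterwards (for example `sorted(by_dict)`) is usually cheaper than sorting the whole input.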
