Python group by


Problem description

Suppose I have a set of pairs like this, where index 0 is the value and index 1 is the type:

input = [
          ('11013331', 'KAT'), 
          ('9085267',  'NOT'), 
          ('5238761',  'ETH'), 
          ('5349618',  'ETH'), 
          ('11788544', 'NOT'), 
          ('962142',   'ETH'), 
          ('7795297',  'ETH'), 
          ('7341464',  'ETH'), 
          ('9843236',  'KAT'), 
          ('5594916',  'ETH'), 
          ('1550003',  'ETH')
        ]

I want to group them by their type (the string at index 1), like this:

result = [
           {
             'type': 'KAT',
             'items': ['11013331', '9843236']
           },
           {
             'type': 'NOT',
             'items': ['9085267', '11788544']
           },
           {
             'type': 'ETH',
             'items': ['5238761', '5349618', '962142', '7795297', '7341464', '5594916', '1550003']
           }
         ]

How can I achieve this in an efficient way?

Thanks

Solution

Do it in 2 steps. First, create a dictionary.

>>> input = [('11013331', 'KAT'), ('9085267', 'NOT'), ('5238761', 'ETH'), ('5349618', 'ETH'), ('11788544', 'NOT'), ('962142', 'ETH'), ('7795297', 'ETH'), ('7341464', 'ETH'), ('9843236', 'KAT'), ('5594916', 'ETH'), ('1550003', 'ETH')]
>>> from collections import defaultdict
>>> res = defaultdict(list)
>>> for v, k in input: res[k].append(v)
...

Then, convert that dictionary into the expected format.

>>> [{'type':k, 'items':v} for k,v in res.items()]
[{'items': ['9085267', '11788544'], 'type': 'NOT'}, {'items': ['5238761', '5349618', '962142', '7795297', '7341464', '5594916', '1550003'], 'type': 'ETH'}, {'items': ['11013331', '9843236'], 'type': 'KAT'}]
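
The two steps above can also be wrapped in a single function. This is only a minimal sketch: the names `group_by_type` and `pairs` are illustrative choices, not part of the original answer, and `pairs` is used instead of `input` to avoid shadowing the built-in.

```python
from collections import defaultdict

def group_by_type(pairs):
    """Group (value, type) pairs into a list of {'type', 'items'} dicts."""
    grouped = defaultdict(list)          # missing keys start as empty lists
    for value, type_ in pairs:
        grouped[type_].append(value)     # step 1: collect values per type
    # step 2: convert the dictionary into the requested list-of-dicts format
    return [{'type': t, 'items': items} for t, items in grouped.items()]

pairs = [
    ('11013331', 'KAT'), ('9085267', 'NOT'), ('5238761', 'ETH'),
    ('5349618', 'ETH'), ('11788544', 'NOT'), ('962142', 'ETH'),
    ('7795297', 'ETH'), ('7341464', 'ETH'), ('9843236', 'KAT'),
    ('5594916', 'ETH'), ('1550003', 'ETH'),
]
result = group_by_type(pairs)
```

Each pair is visited once, so the grouping takes a single pass over the data.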


It is also possible with itertools.groupby but it requires the input to be sorted first.

>>> from itertools import groupby
>>> from operator import itemgetter
>>> sorted_input = sorted(input, key=itemgetter(1))
>>> groups = groupby(sorted_input, key=itemgetter(1))
>>> [{'type':k, 'items':[x[0] for x in v]} for k, v in groups]
[{'items': ['5238761', '5349618', '962142', '7795297', '7341464', '5594916', '1550003'], 'type': 'ETH'}, {'items': ['11013331', '9843236'], 'type': 'KAT'}, {'items': ['9085267', '11788544'], 'type': 'NOT'}]
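
The sort is needed because `groupby` only merges consecutive items that share a key. The small sketch below illustrates this on a made-up three-element list; `pairs` and `group_runs` are hypothetical names used only for this example.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical toy data: the two 'X' entries are not adjacent.
pairs = [('a1', 'X'), ('b1', 'Y'), ('a2', 'X')]

def group_runs(seq):
    """Return (key, values) for each run of consecutive equal keys."""
    return [(key, [value for value, _ in run])
            for key, run in groupby(seq, key=itemgetter(1))]

# Without sorting, 'X' shows up as two separate groups.
unsorted_groups = group_runs(pairs)

# After sorting by the key, each key forms exactly one group.
sorted_groups = group_runs(sorted(pairs, key=itemgetter(1)))

print(unsorted_groups)
print(sorted_groups)
```

Sorting costs O(n log n), so for large inputs the dictionary-based approach is likely the cheaper of the two.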


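Note that neither of these preserves the original order of the keys: groupby yields the groups in sorted key order, and before Python 3.7 a plain dict (including defaultdict) does not guarantee insertion order. Use an OrderedDict if you need to keep the order.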

>>> from collections import OrderedDict
>>> res = OrderedDict()
>>> for v, k in input:
...   if k in res: res[k].append(v)
...   else: res[k] = [v]
... 
>>> [{'type':k, 'items':v} for k,v in res.items()]
[{'items': ['11013331', '9843236'], 'type': 'KAT'}, {'items': ['9085267', '11788544'], 'type': 'NOT'}, {'items': ['5238761', '5349618', '962142', '7795297', '7341464', '5594916', '1550003'], 'type': 'ETH'}]
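
On Python 3.7 and later, built-in dicts are guaranteed to keep insertion order, so a plain dict with `setdefault` should give the same ordering as the OrderedDict version. The sketch below assumes such a Python version; `group_preserving_order` and `pairs` are illustrative names, not part of the original answer.

```python
def group_preserving_order(pairs):
    """Group (value, type) pairs, keeping types in first-seen order.

    Assumes Python 3.7+, where dict insertion order is guaranteed.
    """
    grouped = {}
    for value, type_ in pairs:
        # setdefault creates the list the first time a type is seen
        grouped.setdefault(type_, []).append(value)
    return [{'type': t, 'items': items} for t, items in grouped.items()]

pairs = [
    ('11013331', 'KAT'), ('9085267', 'NOT'), ('5238761', 'ETH'),
    ('5349618', 'ETH'), ('11788544', 'NOT'), ('962142', 'ETH'),
    ('7795297', 'ETH'), ('7341464', 'ETH'), ('9843236', 'KAT'),
    ('5594916', 'ETH'), ('1550003', 'ETH'),
]
ordered = group_preserving_order(pairs)
```

If the code also has to run on older interpreters, the OrderedDict version above remains the safer choice.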
