Split a list of tuples into sub-lists of the same tuple field

Question

I have a huge list of tuples in this format. The second field of each tuple is the category field.

    [(1, 'A', 'foo'),
    (2, 'A', 'bar'),
    (100, 'A', 'foo-bar'),

    ('xx', 'B', 'foobar'),
    ('yy', 'B', 'foo'),

    (1000, 'C', 'py'),
    (200, 'C', 'foo'),
    ..]

What is the most efficient way to break it down into sub-lists of the same category (A, B, C, etc.)?

Recommended answer

Use itertools.groupby:

import itertools
import operator

data=[(1, 'A', 'foo'),
    (2, 'A', 'bar'),
    (100, 'A', 'foo-bar'),

    ('xx', 'B', 'foobar'),
    ('yy', 'B', 'foo'),

    (1000, 'C', 'py'),
    (200, 'C', 'foo'),
    ]

# Group consecutive tuples by their second field (index 1).
for key, group in itertools.groupby(data, operator.itemgetter(1)):
    print(list(group))

yields

[(1, 'A', 'foo'), (2, 'A', 'bar'), (100, 'A', 'foo-bar')]
[('xx', 'B', 'foobar'), ('yy', 'B', 'foo')]
[(1000, 'C', 'py'), (200, 'C', 'foo')]

Or, to create one list with each group as a sublist, you could use a list comprehension:

[list(group) for key,group in itertools.groupby(data,operator.itemgetter(1))]


The second argument to itertools.groupby is a function which itertools.groupby applies to each item in data (the first argument). It is expected to return a key. itertools.groupby then groups together all contiguous items with the same key.
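To make the role of the key function concrete, here is a minimal sketch that collects the groups into a dict keyed by category. The name `groups` and the shortened `data` list are illustrative and not part of the original answer, and the sketch assumes the tuples are already ordered by category.

```python
import itertools
import operator

# Shortened sample data, already ordered by category (illustrative only).
data = [(1, 'A', 'foo'), (2, 'A', 'bar'), ('xx', 'B', 'foobar'), (1000, 'C', 'py')]

# The key function is called on each tuple; consecutive tuples that
# return the same key are placed in the same group.
groups = {key: list(group)
          for key, group in itertools.groupby(data, operator.itemgetter(1))}

print(groups['A'])  # [(1, 'A', 'foo'), (2, 'A', 'bar')]
```

Each `group` is a lazy iterator, so it has to be turned into a list (as above) before moving on to the next group if the items are needed later.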

operator.itemgetter(1) picks out the second item of a sequence.

For example, if

row=(1, 'A', 'foo')

then

operator.itemgetter(1)(row)

equals 'A'.
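As a small illustration, operator.itemgetter(1) behaves like a function that indexes its argument with [1]. The names `get_category` and `get_category_lambda` below are hypothetical and chosen only for this sketch.

```python
import operator

row = (1, 'A', 'foo')

# itemgetter(1) builds a callable that returns element 1 of its argument.
get_category = operator.itemgetter(1)

# An equivalent hand-written key function, shown for comparison.
get_category_lambda = lambda r: r[1]

print(get_category(row))         # A
print(get_category_lambda(row))  # A
```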

As @eryksun points out in the comments, if the categories of the tuples appear in some random order, then you must sort data first before applying itertools.groupby. This is because itertools.groupby only collects contiguous items with the same key into groups.
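The effect of unordered input can be seen in a short sketch. The `shuffled` list is made-up sample data with the categories interleaved, not data from the original post.

```python
import itertools
import operator

# Illustrative data: the two 'A' tuples are not adjacent.
shuffled = [(1, 'A', 'foo'), ('xx', 'B', 'foobar'), (2, 'A', 'bar')]

# groupby starts a new group every time the key changes,
# so 'A' shows up as two separate groups.
keys = [key for key, _ in itertools.groupby(shuffled, operator.itemgetter(1))]

print(keys)  # ['A', 'B', 'A']
```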

To sort the tuples by category, use:

data2=sorted(data,key=operator.itemgetter(1))
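Putting the two steps together, a sketch of sorting and then grouping might look like this. The `shuffled` input and the names `by_category` and `sublists` are assumptions made for illustration.

```python
import itertools
import operator

# Illustrative data with interleaved categories (not from the original post).
shuffled = [(1, 'A', 'foo'), ('xx', 'B', 'foobar'), (2, 'A', 'bar')]
by_category = operator.itemgetter(1)

# Sort on the category only, so the mixed int/str first fields are never compared.
data2 = sorted(shuffled, key=by_category)

# With equal categories now adjacent, groupby yields one group per category.
sublists = [list(group) for _, group in itertools.groupby(data2, by_category)]

print(sublists)
# [[(1, 'A', 'foo'), (2, 'A', 'bar')], [('xx', 'B', 'foobar')]]
```

Because sorted is stable, tuples within the same category keep their original relative order.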
