Split a list of tuples into sub-lists of the same tuple field
Question
I have a huge list of tuples in this format. The second field of each tuple is the category field.
[(1, 'A', 'foo'),
(2, 'A', 'bar'),
(100, 'A', 'foo-bar'),
('xx', 'B', 'foobar'),
('yy', 'B', 'foo'),
(1000, 'C', 'py'),
(200, 'C', 'foo'),
..]
What is the most efficient way to break it down into sub-lists of the same category (A, B, C, etc.)?
Recommended answer
Use itertools.groupby:
import itertools
import operator
data=[(1, 'A', 'foo'),
(2, 'A', 'bar'),
(100, 'A', 'foo-bar'),
('xx', 'B', 'foobar'),
('yy', 'B', 'foo'),
(1000, 'C', 'py'),
(200, 'C', 'foo'),
]
for key, group in itertools.groupby(data, operator.itemgetter(1)):
    print(list(group))
yields
[(1, 'A', 'foo'), (2, 'A', 'bar'), (100, 'A', 'foo-bar')]
[('xx', 'B', 'foobar'), ('yy', 'B', 'foo')]
[(1000, 'C', 'py'), (200, 'C', 'foo')]
Or, to create one list with each group as a sublist, you could use a list comprehension:
[list(group) for key,group in itertools.groupby(data,operator.itemgetter(1))]
The second argument to itertools.groupby is a function which itertools.groupby applies to each item in data (the first argument). It is expected to return a key. itertools.groupby then groups together all contiguous items with the same key.
operator.itemgetter(1) picks out the second item in a sequence. For example, if
row=(1, 'A', 'foo')
then
operator.itemgetter(1)(row)
equals 'A'.
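A minimal sketch of the same idea, in case it helps to see itemgetter next to plain indexing. The name get_category is only an illustrative choice, not something from the original answer.

```python
import operator

row = (1, 'A', 'foo')

# get_category is a hypothetical name for the key function.
# operator.itemgetter(1) returns a callable that fetches index 1 of its argument.
get_category = operator.itemgetter(1)

# For a single index this behaves like: lambda item: item[1]
category = get_category(row)
print(category)             # A
print(category == row[1])   # True
```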
As @eryksun points out in the comments, if the categories of the tuples appear in some random order, then you must sort data first before applying itertools.groupby. This is because itertools.groupby only collects contiguous items with the same key into groups.
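To illustrate the point about contiguity, here is a small sketch with a made-up, out-of-order list (the name shuffled and its contents are assumptions for the demo). Because the two 'A' tuples are not adjacent, groupby reports 'A' twice.

```python
import itertools
import operator

# Hypothetical sample in which category 'A' is interrupted by 'B'.
shuffled = [(1, 'A', 'foo'), ('xx', 'B', 'foobar'), (2, 'A', 'bar')]

# Collect only the keys to see how many groups groupby produces.
keys = [key for key, group in itertools.groupby(shuffled, operator.itemgetter(1))]
print(keys)  # ['A', 'B', 'A']
```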
To sort the tuples by category, use:
data2=sorted(data,key=operator.itemgetter(1))
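Putting the two steps together, one possible sketch sorts by category and then groups, returning a dict keyed by category. The helper name split_by_category and the sample list are assumptions for illustration. Since sorted is stable, tuples keep their original relative order within each category, and sorting only on the category avoids comparing the mixed int/str first fields.

```python
import itertools
import operator

def split_by_category(rows):
    """Hypothetical helper: map each category (field 1) to its list of tuples."""
    get_category = operator.itemgetter(1)
    # Sort on the category only, so equal keys become contiguous for groupby.
    ordered = sorted(rows, key=get_category)
    return {key: list(group)
            for key, group in itertools.groupby(ordered, get_category)}

# Made-up input with categories in mixed order.
mixed = [('yy', 'B', 'foo'), (1, 'A', 'foo'), (1000, 'C', 'py'),
         (2, 'A', 'bar'), ('xx', 'B', 'foobar')]

groups = split_by_category(mixed)
for category, members in groups.items():
    print(category, members)
```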