Comparing the first element of consecutive tuples in a list in Python


Question

I have a list of tuples, each containing two elements. Several of the tuples share the same first element. I want to compare the first elements and, for each distinct first element, collect the corresponding second elements into one group. Here is my list:

myList=[(1,2),(1,3),(1,4),(1,5),(2,6),(2,7),(2,8),(3,9),(3,10)]

I would like to make a list of lists out of it, which looks something like this:

NewList=[(2,3,4,5),(6,7,8),(9,10)]

Is there an efficient way to do this?

Recommended answer

You can use an OrderedDict to group the elements by the first subelement of each tuple:

myList=[(1,2),(1,3),(1,4),(1,5),(2,6),(2,7),(2,8),(3,9),(3,10)]

from collections import OrderedDict

od  = OrderedDict()

for a,b in myList:
    od.setdefault(a,[]).append(b)

print(list(od.values()))
[[2, 3, 4, 5], [6, 7, 8], [9, 10]]

If you really want tuples:

print(list(map(tuple,od.values())))
[(2, 3, 4, 5), (6, 7, 8), (9, 10)]
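As a minimal sketch, the same grouping can be wrapped in a reusable helper. The function name `group_by_first` is hypothetical and not part of the original answer. It assumes Python 3.7+, where a plain dict also preserves insertion order, so an OrderedDict is not strictly required there.

```python
def group_by_first(pairs):
    """Group the second items of (key, value) pairs by key, in first-seen key order."""
    groups = {}  # assumption: Python 3.7+, where plain dicts keep insertion order
    for key, value in pairs:
        # setdefault creates the list the first time a key is seen
        groups.setdefault(key, []).append(value)
    return [tuple(values) for values in groups.values()]


myList = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 6), (2, 7), (2, 8), (3, 9), (3, 10)]
print(group_by_first(myList))
# [(2, 3, 4, 5), (6, 7, 8), (9, 10)]
```

Because the grouping is keyed on a dict, the input does not need to be sorted; groups come out in the order their keys first appear.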

If you do not care about the order in which the elements appear and just want the most efficient way to group, you can use a collections.defaultdict:

from collections import defaultdict

od  = defaultdict(list)

for a,b in myList:
    od[a].append(b)

print(list(od.values()))
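To illustrate that the defaultdict approach does not depend on the input being sorted, here is a small sketch on shuffled data. The helper name `group_with_defaultdict` and the sample list `shuffled` are made up for this example. On Python 3.7+ a defaultdict, being a dict subclass, also keeps keys in first-seen order, but the sketch does not rely on that.

```python
from collections import defaultdict


def group_with_defaultdict(pairs):
    """Map each first element to the list of second elements seen with it."""
    groups = defaultdict(list)  # missing keys start out as an empty list
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)  # convert so later lookups of missing keys raise KeyError


shuffled = [(2, 6), (1, 2), (3, 9), (1, 3), (2, 7)]
grouped = group_with_defaultdict(shuffled)
print(grouped[1])
# [2, 3]
print(grouped[2])
# [6, 7]
```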

Lastly, if your data is already in order as in your input example, i.e. sorted, you can simply use itertools.groupby to group by the first subelement of each tuple and extract the second element from the grouped tuples:

from itertools import groupby
from operator import itemgetter
print([tuple(t[1] for t in v) for k,v in groupby(myList,key=itemgetter(0))])

Output:

[(2, 3, 4, 5), (6, 7, 8), (9, 10)]

Note that groupby will only work if your data is sorted by at least the first element.
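The sorting requirement comes from the fact that itertools.groupby only groups consecutive items with equal keys. The following sketch shows what happens on unsorted input and how sorting by the first element fixes it. The helper name `group_consecutive` and the sample data are assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter


def group_consecutive(pairs):
    """Group second elements of runs of consecutive pairs sharing a first element."""
    return [
        tuple(value for _, value in group)
        for _, group in groupby(pairs, key=itemgetter(0))
    ]


unsorted_pairs = [(1, 2), (2, 6), (1, 3)]

# Key 1 is split into two groups because its tuples are not adjacent.
print(group_consecutive(unsorted_pairs))
# [(2,), (6,), (3,)]

# Sorting by the first element (a stable sort) makes equal keys adjacent.
print(group_consecutive(sorted(unsorted_pairs, key=itemgetter(0))))
# [(2, 3), (6,)]
```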

Some timings on a reasonably sized list:

In [33]: myList = [(randint(1,10000),randint(1,10000)) for _ in range(100000)]

In [34]: myList.sort()

In [35]: timeit ([tuple(t[1] for t in v) for k,v in groupby(myList,key=itemgetter(0))])
10 loops, best of 3: 44.5 ms per loop

In [36]: %%timeit
od = defaultdict(list)
for a,b in myList:
    od[a].append(b)
   ....: 
10 loops, best of 3: 33.8 ms per loop

In [37]: %%timeit
dictionary = OrderedDict()
for x, y in myList:
    if x not in dictionary:
        dictionary[x] = [] # new empty list
    dictionary[x].append(y)
   ....: 
10 loops, best of 3: 63.3 ms per loop

In [38]: %%timeit   
od = OrderedDict()
for a,b in myList:
    od.setdefault(a,[]).append(b)
   ....: 
10 loops, best of 3: 80.3 ms per loop
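The IPython session above can be approximated outside IPython with the standard timeit module. This is only a sketch: the function names, the fixed random seed, and the repeat count are assumptions, and the absolute numbers will differ by machine and Python version, so no timings are shown here.

```python
import random
import timeit
from collections import OrderedDict, defaultdict
from itertools import groupby
from operator import itemgetter

random.seed(0)  # assumption: fixed seed so repeated runs use the same data
pairs = sorted(
    (random.randint(1, 10000), random.randint(1, 10000)) for _ in range(100000)
)


def with_groupby():
    # Requires the data to be sorted by the first element
    return [tuple(t[1] for t in v) for k, v in groupby(pairs, key=itemgetter(0))]


def with_defaultdict():
    od = defaultdict(list)
    for a, b in pairs:
        od[a].append(b)
    return od


def with_ordereddict():
    od = OrderedDict()
    for a, b in pairs:
        od.setdefault(a, []).append(b)
    return od


for func in (with_groupby, with_defaultdict, with_ordereddict):
    # Average seconds per call over a few runs
    seconds = timeit.timeit(func, number=3) / 3
    print(f"{func.__name__}: {seconds * 1000:.1f} ms per call")
```

Note that `with_groupby` builds tuples while the two dict-based functions return lists, mirroring what the session above measured.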

If order matters and the data is sorted, go with groupby. Its timing gets even closer to the defaultdict approach if you also need to convert all the grouped lists to tuples in the defaultdict version.

If the data is not sorted or you don't care about any order, you won't find a faster way to group than using the defaultdict approach.
