Yet another merging list of lists, but most pythonic way


Question

I tried the approaches from a few accepted answers, but they don't seem to work. I have to merge a list of lists.

[[283, 311], [311, 283, 316], [150, 68], [86, 119], [259, 263], [263, 259, 267], [118, 87], [262, 264], [264, 262], [85, 115], [115, 85], [244, 89], [140, 76, 82, 99], [236, 111], [330, 168], [76, 63, 107, 124, 128, 135, 140], [131, 254], [254, 131], [21, 38], [38, 21], [220, 291], [291, 220], [296, 46], [64, 53, 57, 61, 63, 65, 66, 76, 96, 100, 103, 114, 123, 127, 128, 130, 148, 149], [274, 240], [157, 225, 234], [225, 247], [233, 44], [89, 244], [80, 101], [210, 214], [78, 155], [55, 139], [102, 74, 75, 132], [105, 252], [149, 55, 59, 63, 71, 73, 81, 100, 102, 116, 122, 138, 146], [97, 231], [231, 97], [155, 78], [239, 305], [305, 239], [145, 94, 248], [147, 150], [61, 64], [152, 219], [219, 152], [246, 250], [252, 105], [223, 235], [235, 223], [237, 60, 344], [344, 237], [182, 129], [331, 117], [12, 2, 8, 10, 13, 15], [250, 246]]

For example, every element of [283, 311] also occurs in [311, 283, 316], so the two should be merged, leaving a single list in place of both. In general, I need to merge a list into another one whenever it is contained in that other list.

Please note, I can do it using a loop inside a loop, but I am looking for a Pythonic way to achieve this. Also, if you know how to merge lists sharing at least one common element, please share.

Please don't forget, I am looking for a Pythonic approach. I suspect it is not possible using any core Pythonic approach; in that case, what would be the next best solution using loops? I need efficiency, as I have to merge a list of lists with more than a million items every half an hour. I tried a for loop inside a for loop, comparing two items at a time, but it takes too much time.

Recommended answer

Here's an alternative approach that I think should be faster than @mgilson's if you have a large number of small sets:

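from collections import defaultdict

# `lists` is the input list of lists from the question; duplicates such as
# [262, 264] and [264, 262] collapse into a single frozenset here
sets = set(map(frozenset, lists))

def remove_subsets(sets):
    """Yield each set that is not a subset of any other set in `sets`."""
    # map each element to the sets in which it occurs
    sets_containing = defaultdict(set)
    for s in sets:
        for x in s:
            sets_containing[x].add(s)

    for s in sets:
        # sets containing every element of s: s itself plus its supersets
        supersets = set.intersection(*(sets_containing[x] for x in s))
        # only s itself is left, so no other set contains s
        if len(supersets) == 1:
            yield s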
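
To see what the generator produces, here is a small self-contained run on a handful of lists taken from the question. The names `sample_lists` and `merged` are only illustrative, and the final sorting is added purely to make the printed result deterministic; it is not part of the approach itself.

```python
from collections import defaultdict

def remove_subsets(sets):
    # map each element to the sets in which it occurs
    sets_containing = defaultdict(set)
    for s in sets:
        for x in s:
            sets_containing[x].add(s)

    for s in sets:
        # s itself plus every set that contains all elements of s
        supersets = set.intersection(*(sets_containing[x] for x in s))
        if len(supersets) == 1:
            yield s

# A few entries copied from the question's data (illustrative subset)
sample_lists = [[283, 311], [311, 283, 316], [150, 68], [262, 264], [264, 262]]

# Order inside each list is ignored, so [262, 264] and [264, 262] become one set
sets = set(map(frozenset, sample_lists))

# Sort only to get a stable, readable result
merged = sorted(sorted(s) for s in remove_subsets(sets))
print(merged)
```

Here [283, 311] disappears because it is contained in [311, 283, 316], and the two orderings of [262, 264] are reduced to one entry. Note that the function expects non-empty sets: calling `set.intersection` with no arguments raises a `TypeError`.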

The difference is mainly in the final loop, which runs not through all n(n-1)/2 distinct pairs of sets, but only through the n sets in the outer loop and, for each of them, through the candidate supersets that contain some element of the set under consideration. It can be optimized further by stopping the intersection early, as soon as no candidate other than the set itself is left.
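
The early-stopping idea could be sketched roughly as follows. This is one possible reading of it, not the answerer's code: the function name `remove_subsets_early_stop` is hypothetical, and visiting the rarest elements first is an extra heuristic assumed here to shrink the candidate pool quickly. Because every set contains itself, the candidate pool never becomes empty; the loop stops once only the set itself remains.

```python
from collections import defaultdict

def remove_subsets_early_stop(sets):
    # map each element to the sets in which it occurs
    sets_containing = defaultdict(set)
    for s in sets:
        for x in s:
            sets_containing[x].add(s)

    for s in sets:
        candidates = None
        # Heuristic (an assumption of this sketch): try the elements that
        # occur in the fewest sets first
        for x in sorted(s, key=lambda e: len(sets_containing[e])):
            if candidates is None:
                candidates = set(sets_containing[x])  # copy, do not mutate the index
            else:
                candidates &= sets_containing[x]
            if len(candidates) == 1:
                # only s itself is left; further intersections cannot change that
                break
        # An empty s leaves candidates as None and is dropped,
        # since the empty set is a subset of every other set
        if candidates is not None and len(candidates) == 1:
            yield s

example = set(map(frozenset, [[1, 2], [2, 1, 3], [4, 5], [5]]))
kept = sorted(sorted(s) for s in remove_subsets_early_stop(example))
print(kept)
```

Whether the extra sort pays off depends on the data; with many very small sets it may be cheaper to iterate over the elements in their natural order.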
