替换列表&QUOT名单;凝聚"列表中,同时保持顺序列表 [英] Replace list of list with "condensed" list of list while maintaining order

查看:82
本文介绍了替换列表&QUOT名单;凝聚"列表中,同时保持顺序列表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有列表的列表,在code口连接。我想链接的每个子列表中,如果有任何共同的价值观。那么我想替换名单列表清单的精简列表。 例如::如果我有一个列表 [1,2,3],[3,4]] 我想 [1,2,3,4] 。如果我有 [4,3],[1,2,3]] 我想 [4,3,1,2] 。如果我有 [1,2,3],[A,B],[3,4],[B,C] 我想 [1,2,3,4],[A,B,C]] [A,B,C],[1,2,3, 4] 我不在乎是哪一个。

I have a list of list as in the code I attached. I want to link each sub list if there are any common values. I then want to replace the list of list with a condensed list of list. Examples: if I have a list [[1,2,3],[3,4]] I want [1,2,3,4]. If I have [[4,3],[1,2,3]] I want [4,3,1,2]. If I have [[1,2,3],[a,b],[3,4],[b,c]] I want [[1,2,3,4],[a,b,c]] or [[a,b,c],[1,2,3,4]] I don't care which one.

我几乎没有...

我的问题是,当我有一样的情况下, [1,2,3],[10.5],[3,8,5]] 我要 [1,2,3,10,5,8] ,但我目前的code,我得到 [1,2,3,8 ,10.5]

My problem is when I have a case like [[1,2,3],[10,5],[3,8,5]] I want [1,2,3,10,5,8] but with my current code I get [1,2,3,8,10,5]

下面是我的code:

import itertools

a = [1,2,3]
b = [3,4]
i = [21,22]
c = [88,7,8]
e = [5,4]
d = [3, 50]
f = [8,9]
g=  [9,10]
h = [20,21]

lst = [a,b,c,i,e,d,f,g,h,a,c,i]*1000  
#I have a lot of list but not very many different lists

def any_overlap(a, b):
  sb = set(b)
  return any(itertools.imap(sb.__contains__, a))

def find_uniq(lst):
    ''' return the uniq parts of lst'''
    seen = set()
    seen_add = seen.add
    return [ x for x in lst if x not in seen and not seen_add(x)]

def overlap_inlist(o_lst, lstoflst):
    '''
    Search for overlap, using "any_overlap", of a list( o_lst) in a list of lists (lstoflst).
    If there is overlap add the uniq part of the found list to the search list, and keep 
    track of where that list was found 
    '''
    used_lst =[ ]
    n_lst =[ ]
    for lst_num, each_lst in enumerate(lstoflst):
        if any_overlap(o_lst,each_lst):
            n_lst.extend(each_lst)
            used_lst.append(lst_num)
    n_lst= find_uniq(n_lst)
    return  n_lst, used_lst

def comb_list(lst):
    '''
    For each list in a list of list find all the overlaps using 'ovelap_inlist'.
    Update the list each time to delete the found lists. Return the final combined list. 
    '''
    for updated_lst in lst:
        n_lst, used_lst = overlap_inlist(updated_lst,lst)
        lst[:] = [x for i,x in enumerate(lst) if i not in used_lst]
        lst.insert(0,n_lst)
    return lst
comb_lst = comb_list(lst)
print comb_lst

从这个脚本放的输出是:

The out put from this script is:

[[88, 7, 8, 9, 10], [1, 2, 3, 4, 50, 5], [21, 22, 20]]

我想它所以关键是像原来的顺序:

I want it so the key are in the original order like:

[[88, 7, 8, 9, 10], [1, 2, 3, 4, 5, 50,], [21, 22, 20]]

5和50切换在新的LST [2]

The 5 and 50 are switched in new lst[2]

我有点新的蟒蛇。我想AP preciate对我目前的code中的问题或意见,任何解决方案。我不是计算机科学家,我想有可能是某种算法,迅速做到这一点(也许从集合论?)。如果有这样的算法,请点我朝着正确的方向。

I am somewhat new to python. I would appreciate any solutions to the problem or comments on my current code. I am not a computer scientists, I imagine there may be some kind of algorithm that does this quickly( maybe from set theory? ). If there is such an algorithm please point me to the right direction.

我可能会使得这种方式比较复杂,然后就... 谢谢!!

I may be making this way more complicated then it is... Thank you!!

推荐答案

下面是一个强力的方法(可能更容易理解):

Here's a brute-force approach (it might be easier to understand):

from itertools import chain

def condense(*lists):
    # remember original positions
    positions = {}
    for pos, item in enumerate(chain(*lists)):
        if item not in positions:
            positions[item] = pos

    # condense disregarding order
    sets = condense_sets(map(set, lists))

    # restore order
    result = [sorted(s, key=positions.get) for s in sets]
    return result if len(result) != 1 else result[0]

def condense_sets(sets):
    result = []
    for candidate in sets:
        for current in result:
            if candidate & current:   # found overlap
                current |= candidate  # combine (merge sets)

                # new items from candidate may create an overlap
                # between current set and the remaining result sets
                result = condense_sets(result) # merge such sets
                break
        else:  # no common elements found (or result is empty)
            result.append(candidate)
    return result

示例

>>> condense([1,2,3], [10,5], [3,8,5])
[1, 2, 3, 10, 5, 8]
>>> a = [1,2,3]
>>> b = [3,4]
>>> i = [21,22]
>>> c = [88,7,8]
>>> e = [5,4]
>>> d = [3, 50]
>>> f = [8,9]
>>> g=  [9,10]
>>> h = [20,21]
>>> condense(*[a,b,c,i,e,d,f,g,h,a,c,i]*1000)
[[1, 2, 3, 4, 5, 50], [88, 7, 8, 9, 10], [21, 22, 20]]
>>> condense([1], [2, 3, 2])
[[1], [2, 3]]

性能比较

凝聚_ *()函数都是从这个问题的答案。 lst_OP 的问题(不同大小)输入列表, lst_BK - 从@Blckknght's回答(不同大小)。请参阅<一href="https://gist.github.com/4babbce1179d77272b68/e4d5e174b09a3298766178f252d2c17c5f0619fb#file_measure_performance_condense_lists.py">the来源。

Performance comparison

condense_*() functions are from the answers to this question. lst_OP input list from the question (different size), lst_BK - the test list from @Blckknght's answer (different size). See the source.

测量表明,基于不相交的集合和无向图的连通分支解决方案的概念进行这两种类型的输入类似。

Measurements show that solutions based on "disjoint sets" and "connected components of undirected graph" concepts perform similar on both types of input.

name                       time ratio comment
condense_hynekcer     5.79 msec  1.00 lst_OP
condense_hynekcer2     7.4 msec  1.28 lst_OP
condense_pillmuncher2 11.5 msec  1.99 lst_OP
condense_blckknght    16.8 msec  2.91 lst_OP
condense_jfs            26 msec  4.49 lst_OP
condense_BK           30.5 msec  5.28 lst_OP
condense_blckknght2   30.9 msec  5.34 lst_OP
condense_blckknght3   32.5 msec  5.62 lst_OP


name                       time  ratio comment
condense_blckknght     964 usec   1.00 lst_BK
condense_blckknght2   1.41 msec   1.47 lst_BK
condense_blckknght3   1.46 msec   1.51 lst_BK
condense_hynekcer2    1.95 msec   2.03 lst_BK
condense_pillmuncher2  2.1 msec   2.18 lst_BK
condense_hynekcer     2.12 msec   2.20 lst_BK
condense_BK           3.39 msec   3.52 lst_BK
condense_jfs           544 msec 563.66 lst_BK


name                       time ratio comment
condense_hynekcer     6.86 msec  1.00 lst_rnd
condense_jfs          16.8 msec  2.44 lst_rnd
condense_blckknght    28.6 msec  4.16 lst_rnd
condense_blckknght2   56.1 msec  8.18 lst_rnd
condense_blckknght3   56.3 msec  8.21 lst_rnd
condense_BK           70.2 msec 10.23 lst_rnd
condense_pillmuncher2  324 msec 47.22 lst_rnd
condense_hynekcer2     334 msec 48.61 lst_rnd

要重现的结果克隆要点并运行<一href="https://gist.github.com/4babbce1179d77272b68#file_measure_performance_condense_lists.py"><$c$c>measure-performance-condense-lists.py

To reproduce results clone gist and run measure-performance-condense-lists.py

这篇关于替换列表&QUOT名单;凝聚&QUOT;列表中,同时保持顺序列表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆