Python: intersection of 2 lists keeping duplicates from both lists


Question

I want to efficiently find the intersection of two lists, keeping duplicates from both, e.g. A = [1,1,2,3], B = [1,1,2,4] should return [1,1,1,1,2].

I know a similar question has been asked before (Python intersection of two lists keeping duplicates), but it does not help me because only the duplicates from one list are retained.

The following works:

def intersect(A,B):
    C=[]
    for a in A:
        for b in B:
            if a==b:
                C.append(a)
    return C

However, it isn't efficient enough for what I'm doing. To speed things up, I tried sorting the lists:

def intersect(A,B):
    A.sort()
    B.sort()
    C=[]
    i=0
    j=0
    while i<len(A) and j<len(B):
        if A[i]<=B[j]:
            if A[i]==B[j]: 
                C.append(A[i])
            i+=1
        else:
            j=j+1
    return C

However, this only keeps the duplicates from one of the lists. Any suggestions?

Recommended answer

Here is the answer to your question as asked:

import collections
for A,B,expected_output in (
    ([1,1,2,3], [1,1,2,4], [1,1,1,1,2]),
    ([1,1,2,3], [1,2,4], [1,1,2])):
    cntA = collections.Counter(A)
    cntB = collections.Counter(B)
    output = [
        x for x in sorted(set(A) & set(B)) for i in range(cntA[x]*cntB[x])]
    assert output == expected_output
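
The same idea can be wrapped in a reusable function. This is only a sketch: the name intersect_keep_duplicates is hypothetical (it does not appear in the answer), and sorted() assumes the list elements are hashable and mutually comparable.

```python
from collections import Counter

def intersect_keep_duplicates(a, b):
    """Return each common value repeated count_a * count_b times, in sorted order."""
    count_a = Counter(a)
    count_b = Counter(b)
    result = []
    # Only values present in both lists contribute to the output.
    for value in sorted(count_a.keys() & count_b.keys()):
        result.extend([value] * (count_a[value] * count_b[value]))
    return result

print(intersect_keep_duplicates([1, 1, 2, 3], [1, 1, 2, 4]))  # [1, 1, 1, 1, 2]
```

Building the two counters takes one pass over each list, so the work is roughly linear in the input sizes plus the size of the output, instead of the nested loop's comparison of every pair.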

Here is the answer to the question as originally interpreted by myself and two others:

import collections
A=[1,1,2,3]
B=[1,1,2,4]
expected_output = [1,1,1,1,2,2]
cntA = collections.Counter(A)
cntB = collections.Counter(B)
# Adding two Counters sums the counts of each element.
cnt_sum = cntA + cntB
output = [x for x in sorted(set(A) & set(B)) for i in range(cnt_sum[x])]
assert output == expected_output
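
The two interpretations give the same result for some inputs and different results for others, so it may help to run them side by side. In the sketch below, common_product and common_sum are hypothetical helper names introduced only for this comparison.

```python
from collections import Counter

def common_product(a, b):
    # Each common value appears count_a * count_b times (the question as asked).
    ca, cb = Counter(a), Counter(b)
    return [x for x in sorted(ca.keys() & cb.keys()) for _ in range(ca[x] * cb[x])]

def common_sum(a, b):
    # Each common value appears count_a + count_b times (the original interpretation).
    ca, cb = Counter(a), Counter(b)
    return [x for x in sorted(ca.keys() & cb.keys()) for _ in range(ca[x] + cb[x])]

A = [1, 1, 2, 3]
B = [1, 2, 4]
print(common_product(A, B))  # [1, 1, 2]
print(common_sum(A, B))      # [1, 1, 1, 2, 2]
```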

The collections.Counter documentation is part of the Python standard library docs. collections is a great module, and I highly recommend reading the documentation for the whole module.

I realized you don't actually need to find the intersection of the sets, because the "count of a missing element is zero" according to the documentation:

import collections
for A,B,expected_output in (
    ([1,1,2,3], [1,1,2,4], [1,1,1,1,2]),
    ([1,1,2,3], [1,2,4], [1,1,2])):
    cntA = collections.Counter(A)
    cntB = collections.Counter(B)
    output = [
        x for x in sorted(set(A)) for i in range(cntA[x]*cntB[x])]
    assert output == expected_output
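
A quick way to see why dropping the set intersection is safe: looking up a missing element in a Counter returns 0 instead of raising KeyError, so the product of the counts is 0 and range(0) yields nothing. The variable name cnt_b below is just for illustration.

```python
from collections import Counter

cnt_b = Counter([1, 1, 2, 4])
print(cnt_b[3])    # 0: no KeyError for a missing element
print(3 in cnt_b)  # False: the lookup did not add the key
```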
