Python script to remove unique elements from a list and print the list with repeated elements in proper order


Question

I have written a script that removes all unique elements from a list and prints the list with only the repeated elements, in their original order.

Below are some examples of the expected output list for a given input list:

Input list1:
1,2,1,1,3,5,3,4,3,1,6,7,8,5

Output List1:
1,1,1,3,5,3,3,1,5

Input list2:
1,2,1,1,3,3,4,3,1,6,5

Output List2:
1,1,1,3,3,3,1

#! /bin/python

def remove_unique(*n):
    dict1={}
    list1=[]
    for i in range(len(n)):
        for j in range(i+1,len(n)):
            if n[i] == n[j]:
               dict1[j]=n[j]
               dict1[i]=n[i]
    for x in range(len(n)):
        if x in dict1.keys():
           list1.append(dict1[x])
    return list1

lst1=remove_unique(1,2,1,1,3,5,3,4,3,1,6,7,8,5)
for n in lst1:
    print(n, end=" ")

The script above works exactly as expected when tested with a few small lists. However, I would like some ideas on how to optimize it (considering both time and space complexity) for much longer input lists (50,000 <= len(list) <= 50M).

Recommended answer

Your script has a number of issues:

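  • the classic if x in dict1.keys() should be written if x in dict1. In Python 2, keys() builds a list, so the membership test is linear; in Python 3 it returns a view and the test is already a hash lookup, but the shorter form is still the idiomatic one.
  • no list comprehension: calling append in a loop is slower than building the list with a comprehension.
  • O(n^2) time complexity because of the double loop, which is what makes the script impractical for lists with tens of thousands of elements or more.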

My approach:

You could count the elements using collections.Counter, then build a new list with a list comprehension that keeps only the elements whose number of occurrences is greater than one:

from collections import Counter

list1 = [1,2,1,1,3,5,3,4,3,1,6,7,8,5]

c = Counter(list1)
new_list1 = [k for k in list1 if c[k]>1]

print(new_list1)

Result:

[1, 1, 1, 3, 5, 3, 3, 1, 5]
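
As a quick check, the same idea can be wrapped in a function and run against both examples from the question. This is only a sketch: the function name keep_repeated is hypothetical and does not appear in the original answer.

```python
from collections import Counter


def keep_repeated(items):
    """Return the elements of items that occur more than once, in original order.

    keep_repeated is an illustrative name, not part of the original answer.
    """
    counts = Counter(items)  # one pass: value -> number of occurrences
    return [x for x in items if counts[x] > 1]  # second pass keeps the order


# The two example inputs from the question.
print(keep_repeated([1, 2, 1, 1, 3, 5, 3, 4, 3, 1, 6, 7, 8, 5]))
print(keep_repeated([1, 2, 1, 1, 3, 3, 4, 3, 1, 6, 5]))
```

Both calls should reproduce the output lists given in the question. Note that the elements must be hashable, since Counter stores them as dictionary keys.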

The complexity of this approach is O(n) on average: one linear scan of the list to build the Counter, plus one linear scan in the list comprehension, with each element costing one hash and dictionary lookup that takes constant time on average. The extra memory is proportional to the number of distinct values (for the counts) plus the size of the output list. So it performs well, and far better than the O(n^2) double loop.
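
For the list sizes mentioned in the question (up to 50M elements), the second list can be avoided when the result only needs to be printed or consumed once. The sketch below is a generator variant of the same Counter idea; the name iter_repeated is hypothetical, and it assumes the input is a list (or another object that can be iterated twice), not a one-shot iterator.

```python
from collections import Counter


def iter_repeated(items):
    """Yield the elements of items that occur more than once, in original order.

    iter_repeated is an illustrative name. items must support being iterated
    twice (for example a list or tuple).
    """
    counts = Counter(items)  # first pass: memory grows with the number of distinct values
    for x in items:          # second pass: elements are yielded one at a time
        if counts[x] > 1:
            yield x


data = [1, 2, 1, 1, 3, 5, 3, 4, 3, 1, 6, 7, 8, 5]
print(*iter_repeated(data))
```

The only extra memory held here is the Counter itself, so the saving matters most when the input has few distinct values relative to its length.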
