Removing duplicates in lists

Problem description

I need to write a program that checks whether a list has any duplicates and, if it does, removes them and returns a new list with the items that weren't duplicated/removed. This is what I have tried, but honestly I don't know what to do.

def remove_duplicates():
    t = ['a', 'b', 'c', 'd']
    t2 = ['a', 'c', 'd']
    for t in t2:
        t.append(t.remove())
    return t


Recommended answer

The common approach to get a unique collection of items is to use a set. Sets are unordered collections of distinct objects. To create a set from any iterable, you can simply pass it to the built-in set() function. If you later need a real list again, you can similarly pass the set to the list() function.

The following example should cover whatever you are trying to do:

>>> t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> t
[1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> list(set(t))
[1, 2, 3, 5, 6, 7, 8]
>>> s = [1, 2, 3]
>>> list(set(t) - set(s))
[8, 5, 6, 7]

As you can see from the example result, the original order is not maintained. As mentioned above, sets themselves are unordered collections, so the order is lost. When a set is converted back to a list, the resulting order is arbitrary.
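If you only need a deterministic order rather than the original one, sorting the set is one option. The following is a minimal sketch, assuming the items are mutually comparable (as the integers in the example are); the variable names are illustrative only:

```python
t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
s = [1, 2, 3]

# sorted() accepts any iterable, including a set, and always returns a list.
unique_sorted = sorted(set(t))
difference_sorted = sorted(set(t) - set(s))

print(unique_sorted)      # [1, 2, 3, 5, 6, 7, 8]
print(difference_sorted)  # [5, 6, 7, 8]
```

Note that this gives sorted order, not the order in which the items first appeared in t.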

If order is important to you, then you will have to use a different mechanism. A very common solution for this is to rely on OrderedDict, which keeps keys in the order they were inserted:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]

Starting with Python 3.7, the built-in dictionary is guaranteed to maintain insertion order as well, so you can also use it directly if you are on Python 3.7 or later (or on CPython 3.6, where this was an implementation detail):

>>> list(dict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]

Note that this may have some overhead, since a dictionary is created first and a list is then created from it. If you don't actually need to preserve the order, you're often better off using a set, especially because it gives you a lot more operations to work with. Check out this question for more details and alternative ways to preserve the order when removing duplicates.
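One commonly used alternative for preserving order is to track the items already seen in a set while building the result list. This is a minimal sketch, not taken from the answer itself; the function name unique_in_order is hypothetical, and the items are still assumed to be hashable:

```python
def unique_in_order(items):
    """Return a new list with duplicates removed, keeping first occurrences."""
    seen = set()    # items encountered so far (requires hashable items)
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
print(unique_in_order(t))  # [1, 2, 3, 5, 6, 7, 8]
```

The input list is left unchanged, and each membership check against the set takes constant time on average.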

Finally, note that both the set and the OrderedDict/dict solutions require your items to be hashable. This usually means that they have to be immutable. If you have to deal with items that are not hashable (e.g. list objects), then you will have to use a slow approach in which you basically compare every item with every other item in a nested loop.
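The slow approach for unhashable items could look roughly like the sketch below. The function name dedupe_unhashable is hypothetical; the nested loop is implicit in the `in` check, which scans the result list and compares with ==:

```python
def dedupe_unhashable(items):
    """Return a new list without duplicates, keeping first occurrences.

    Works for unhashable items such as lists, at the cost of a
    quadratic number of equality comparisons in the worst case.
    """
    result = []
    for item in items:
        # `item not in result` is a linear scan using ==, so no hashing is needed.
        if item not in result:
            result.append(item)
    return result


data = [[1, 2], [3], [1, 2], [4, 5], [3]]
print(dedupe_unhashable(data))  # [[1, 2], [3], [4, 5]]
```

Because every item may be compared against every item kept so far, this is only practical for fairly small lists.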
