How do you remove duplicates from a list whilst preserving order?
Question
Is there a built-in that removes duplicates from a list in Python, whilst preserving order? I know that I can use a set to remove duplicates, but that destroys the original order. I also know that I can roll my own like this:
def uniq(input):
    output = []
    for x in input:
        if x not in output:
            output.append(x)
    return output
(Thanks to unwind for that code sample.)
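For example (a quick sketch; the input list is made up for illustration), the helper keeps the first occurrence of each element, in order:

```python
def uniq(input):
    # Same helper as above, repeated so this snippet runs on its own.
    output = []
    for x in input:
        if x not in output:  # linear scan: O(n) per lookup, O(n^2) overall
            output.append(x)
    return output

print(uniq([1, 2, 1, 3, 2]))  # → [1, 2, 3]
```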
But I'd like to avail myself of a built-in or a more Pythonic idiom if possible.
Answer
Here you have some alternatives: http://www.peterbe.com/plog/uniqifiers-benchmark

Fastest one:

def f7(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

Why assign seen.add to seen_add instead of just calling seen.add? Python is a dynamic language, and resolving seen.add each iteration is more costly than resolving a local variable. seen.add could have changed between iterations, and the runtime isn't smart enough to rule that out. To play it safe, it has to check the object each time.

If you plan on using this function a lot on the same dataset, perhaps you would be better off with an ordered set: http://code.activestate.com/recipes/528878/

O(1) insertion, deletion and member-check per operation.

(Small additional note: seen.add() always returns None, so the or above is there only as a way to attempt a set update, and not as an integral part of the logical test.)
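As a quick sanity check (the input list here is just an illustrative example), f7 keeps first occurrences in order, and the fact that set.add() returns None is exactly what lets it hide inside the or expression:

```python
def f7(seq):
    seen = set()
    seen_add = seen.add  # bind the method once; avoids re-resolving seen.add each iteration
    # `x in seen or seen_add(x)` is falsy for unseen x (None), truthy for seen x,
    # so `not (...)` keeps only first occurrences while adding them to `seen`.
    return [x for x in seq if not (x in seen or seen_add(x))]

print(f7([1, 2, 1, 3, 2, 4]))  # → [1, 2, 3, 4]
print(set().add(0) is None)    # → True: add() returns None, so it never short-circuits the filter
```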