How to speed up list comprehension
Question
Below is my list:
col = [['red', 'yellow', 'blue', 'red', 'green', 'yellow'],
['pink', 'orange', 'brown', 'pink', 'brown']
]
My goal is to eliminate the items that appear only once within each list.
Here is my code:
eliminate = [[w for w in c if c.count(w) > 1] for c in col]
Output: [['red', 'yellow', 'red', 'yellow'], ['pink', 'brown', 'pink', 'brown']]
The code works fine for a small dataset such as the list above; however, my dataset is very large. Each list contains up to 1000 items.
Is there a way to make the above code run faster? For example, by breaking it down into two or more for-loops, since my understanding is that a normal for-loop is faster than a list comprehension.
Any suggestions? Thanks.
Recommended answer
I'd have a go at trying an OrderedCounter to avoid the repeated .count() calls:
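from collections import OrderedDict, Counter

col = [['red', 'yellow', 'blue', 'red', 'green', 'yellow'],
       ['pink', 'orange', 'brown', 'pink', 'brown']]

class OrderedCounter(Counter, OrderedDict):
    """A Counter that remembers the order in which keys were first seen."""
    pass

# Count each list once, then keep the keys seen more than once.
# (Python 3: .items(); the original Python 2 code used .iteritems().)
new = [[k for k, v in OrderedCounter(el).items() if v != 1] for el in col]
# [['red', 'yellow'], ['pink', 'brown']]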
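Note that the version above lists each repeated item once, whereas the output shown in the question keeps every occurrence. If the question's output is what is wanted, the same counting idea can be sketched as follows. This is only a minimal sketch under that assumption; `keep_repeated` is a hypothetical helper name, not something from the original answer.

```python
from collections import Counter

col = [['red', 'yellow', 'blue', 'red', 'green', 'yellow'],
       ['pink', 'orange', 'brown', 'pink', 'brown']]

def keep_repeated(items):
    """Hypothetical helper: keep every occurrence of items seen more than once."""
    counts = Counter(items)  # one pass over the list to count everything
    # A dict lookup per element replaces the full-list scan done by items.count(w).
    return [w for w in items if counts[w] > 1]

eliminate = [keep_repeated(c) for c in col]
print(eliminate)
# [['red', 'yellow', 'red', 'yellow'], ['pink', 'brown', 'pink', 'brown']]
```

Calling c.count(w) for every element rescans the whole list each time, so the work grows quadratically with the list length; counting once up front keeps it roughly linear.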
And if we just wish to iterate once, then (similar to Martijn's approach, but with less juggling of sets):
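from itertools import count

def unique_plurals(iterable):
    # Each element gets its own counter: next() gives 0 on the first
    # occurrence and 1 on the second, so an item is emitted once, at its second occurrence.
    seen = {}
    return [el for el in iterable if next(seen.setdefault(el, count())) == 1]

new = list(map(unique_plurals, col))  # list() is needed on Python 3, where map() is lazy
# [['red', 'yellow'], ['pink', 'brown']]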
This is more flexible in terms of specifying how many occurrences are required, and keeps one dict instead of multiple sets.
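To illustrate that flexibility, here is one possible way to make the required number of occurrences a parameter. It is a sketch only: the function name `unique_with_at_least` and the parameter `n` are hypothetical and do not appear in the original answer. Each per-element counter starts at 1 so that the value returned by next() equals the occurrence number.

```python
from itertools import count

def unique_with_at_least(iterable, n=2):
    """Hypothetical variant: emit each element once, at its n-th occurrence."""
    seen = {}
    # count(1) yields 1 on the first occurrence, 2 on the second, and so on.
    return [el for el in iterable
            if next(seen.setdefault(el, count(1))) == n]

col = [['red', 'yellow', 'blue', 'red', 'green', 'yellow'],
       ['pink', 'orange', 'brown', 'pink', 'brown']]

twice_or_more = [unique_with_at_least(c, 2) for c in col]
print(twice_or_more)
# [['red', 'yellow'], ['pink', 'brown']]

three_or_more = unique_with_at_least(['a', 'b', 'a', 'a', 'b', 'a'], 3)
print(three_or_more)
# ['a']
```

With n=2 this behaves like unique_plurals above; the list is still traversed only once and a single dict holds all the state.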