Filtering / iterating through very large lists in Python
Question
If I have a list of, say, 10 million objects, how can I filter it quickly? A complete pass through the list with a list comprehension takes about 4-5 seconds. Are there any efficient data structures or libraries for this in Python? Or is Python not suited to large data sets?
Recommended answer
The itertools module is designed for efficient looping. In particular, you might find that ifilter suits your purpose (ifilter is the Python 2 name; in Python 3 the built-in filter is already lazy and returns an iterator). Iterating through a large data structure is always expensive, but if you only need some of the data at a time, lazy evaluation can help a lot.
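As a rough illustration of the lazy filtering described above, here is a minimal Python 3 sketch. The predicate is_match and the sample data are hypothetical stand-ins, not part of the original question; the point is only that filter does no work until items are requested, so taking a few matches does not scan the whole list.

```python
from itertools import islice

def is_match(n):
    # Hypothetical predicate; replace with the real filtering condition.
    return n % 1000 == 0

# Stand-in data; the question's list would hold 10 million objects.
data = list(range(1_000_000))

# filter() returns a lazy iterator in Python 3 (itertools.ifilter in Python 2).
lazy_matches = filter(is_match, data)

# Only as many elements are examined as are needed to yield five matches.
first_five = list(islice(lazy_matches, 5))
print(first_five)
```

If every match is eventually needed, laziness does not reduce the total work; it mainly avoids building a second large list in memory.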
You can also try generator expressions, which are written almost identically to their list comprehension counterparts (parentheses instead of square brackets), although they are used differently: they produce items one at a time and can be consumed only once. A generator function is another option. Both give you the benefits of lazy evaluation.
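The difference between the three forms can be sketched as follows. The even-number condition and the name even_numbers are assumptions chosen purely for illustration.

```python
data = list(range(1_000_000))  # stand-in for the large list

# List comprehension: builds the entire filtered list in memory at once.
evens_list = [n for n in data if n % 2 == 0]

# Generator expression: same syntax in parentheses, evaluated lazily.
evens_gen = (n for n in data if n % 2 == 0)

# A generator can be consumed only once; here sum() exhausts it.
total = sum(evens_gen)

# Laziness pays off when stopping early: next() stops at the first match.
first_even_over_10 = next(n for n in data if n % 2 == 0 and n > 10)

def even_numbers(items):
    # Generator function: the equivalent written with yield.
    for n in items:
        if n % 2 == 0:
            yield n

total_from_function = sum(even_numbers(data))
```

The generator versions avoid allocating the intermediate list, which matters most when the filtered result is itself large or when only the first few matches are needed.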