Best way to handle large lists?


Question


I have a system that has a few lists that are very large (thousands or
tens of thousands of entries) and some that are rather small. Many times
I have to produce the difference between a large list and a small one,
without destroying the integrity of either list. I was wondering if
anyone has any recommendations on how to do this and keep performance
high? Is there a better way than

[ i for i in bigList if i not in smallList ]

Thanks.
Chaz

Answers

Chaz Ginger <cg********@hotmail.com> wrote:

> I have a system that has a few lists that are very large (thousands or
> tens of thousands of entries) and some that are rather small. Many times
> I have to produce the difference between a large list and a small one,
> without destroying the integrity of either list. I was wondering if
> anyone has any recommendations on how to do this and keep performance
> high? Is there a better way than
>
> [ i for i in bigList if i not in smallList ]

How about:

smallSet = set(smallList)
something = [ i for i in bigList if i not in smallSet ]

Use timeit.py on some representative data to see what difference that
makes.
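
The timing comparison suggested above could be sketched roughly as follows. The list sizes and the names big_list, small_list, diff_with_list and diff_with_set are assumptions made for illustration, not part of the original post; replace the sample data with lists that resemble your own.

```python
import random
import timeit

# Hypothetical sample data; substitute lists that resemble your own.
random.seed(0)
big_list = list(range(20_000))
small_list = random.sample(big_list, 200)


def diff_with_list():
    # Each membership test scans small_list from the start.
    return [i for i in big_list if i not in small_list]


def diff_with_set():
    # Build the set once; each membership test is then a hash lookup.
    small_set = set(small_list)
    return [i for i in big_list if i not in small_set]


# Both approaches must agree before their timings are worth comparing.
assert diff_with_list() == diff_with_set()

list_time = timeit.timeit(diff_with_list, number=3)
set_time = timeit.timeit(diff_with_set, number=3)
print(f"list membership: {list_time:.4f}s")
print(f"set membership:  {set_time:.4f}s")
```

The set version requires the elements to be hashable. The actual timings depend on your machine and data, so no expected numbers are shown here.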


Chaz Ginger <cg********@hotmail.com> writes:

> I have a system that has a few lists that are very large (thousands or
> tens of thousands of entries) and some that are rather small. Many times
> I have to produce the difference between a large list and a small one,
> without destroying the integrity of either list. I was wondering if
> anyone has any recommendations on how to do this and keep performance
> high? Is there a better way than
>
> [ i for i in bigList if i not in smallList ]

diff = list(set(bigList) - set(smallList))

Note that this doesn't give you the elements in any particular order, and any duplicates in bigList are collapsed.
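
To make the ordering caveat concrete, here is a small sketch with made-up sample values (the data and the names unordered and ordered are illustrative assumptions). It contrasts the plain set difference with a comprehension that only uses a set for the lookups.

```python
big_list = [5, 3, 5, 9, 1, 3, 7]
small_list = [3, 7]

# Set difference: order is arbitrary and duplicates collapse.
unordered = list(set(big_list) - set(small_list))

# Comprehension with a set lookup: keeps the original order and duplicates.
small_set = set(small_list)
ordered = [i for i in big_list if i not in small_set]

print(sorted(unordered))  # [1, 5, 9]
print(ordered)            # [5, 5, 9, 1]

# Neither approach modifies the input lists.
print(big_list)           # [5, 3, 5, 9, 1, 3, 7]
```

If order or duplicates matter, the comprehension form is the safer choice; if they do not, the set difference is shorter.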


Chaz Ginger wrote:

> I have a system that has a few lists that are very large (thousands or
> tens of thousands of entries) and some that are rather small. Many times
> I have to produce the difference between a large list and a small one,
> without destroying the integrity of either list. I was wondering if
> anyone has any recommendations on how to do this and keep performance
> high? Is there a better way than
>
> [ i for i in bigList if i not in smallList ]
>
> Thanks.
> Chaz

Hi!

If you have big lists, you can use dbm-like databases.
They are very quick: BSDDB, flashdb, etc. See SleepyCat, or see the Python help.

The "in" operator is very slow on large lists, but bsddb uses hash values, so it
is very quick.
The SleepyCat database has many extras; you can set the cache size and
many other parameters.
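
As a rough illustration of the dbm idea, the sketch below uses dbm.dumb from the Python 3 standard library as a stand-in, since the bsddb module mentioned here is no longer part of the standard library. The sample data, the file name small_keys and the variable names are assumptions. dbm.dumb is a simple pure-Python backend, so this shows the shape of the approach, not its speed.

```python
import dbm.dumb
import os
import tempfile

# Hypothetical sample data.
big_list = ["apple", "banana", "cherry", "date"]
small_list = ["banana", "date"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "small_keys")
    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.dumb.open(path, "c") as db:
        # dbm stores keys and values as bytes; the value is a placeholder.
        for item in small_list:
            db[item.encode("utf-8")] = b"1"
        # Membership is checked against the on-disk index, not a Python list.
        diff = [item for item in big_list if item.encode("utf-8") not in db]

print(diff)  # ['apple', 'cherry']
```

For lists of a few thousand entries that already fit in memory, an in-memory set is likely the simpler option; an on-disk database is more relevant when the data must persist or does not fit in memory.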

Or, if you don't like dbm-style databases, you can use SQLite. It is also
quick, and you can use SQL commands.
It is a little slower than bsddb, but it works like an SQL server. You can improve
the speed with special parameters.
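
The SQLite suggestion might look something like this with the standard-library sqlite3 module. The table names big and small, the column name value, and the sample data are all assumptions for the sake of the sketch, and an in-memory database is used so nothing is written to disk.

```python
import sqlite3

# Hypothetical sample data.
big_list = [5, 3, 9, 1, 7]
small_list = [3, 7]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (value INTEGER)")
# The primary key gives SQLite an index to use for the lookups.
conn.execute("CREATE TABLE small (value INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO big VALUES (?)", [(v,) for v in big_list])
conn.executemany(
    "INSERT OR IGNORE INTO small VALUES (?)", [(v,) for v in small_list]
)

# Rows of big that have no match in small, in insertion order.
rows = conn.execute(
    "SELECT value FROM big "
    "WHERE value NOT IN (SELECT value FROM small) "
    "ORDER BY rowid"
).fetchall()
diff = [row[0] for row in rows]
conn.close()

print(diff)  # [5, 9, 1]
```

Loading the lists into tables has its own cost, so this is mainly worth considering when the data already lives in a database or is too large to hold comfortably in memory.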

dd

