Python中小集合的性能 [英] Performance of small sets in Python
问题描述
我正在寻找最有效的方法来表示Python中给定范围(例如0-10)的小整数集.在这种情况下,效率意味着快速构建(从未排序的列表开始),快速查询(对每个集合进行几个查询)以及合理快速构建已排序的版本(可能每十个集合一次).先验的候选人使用的是Python的内置集合类型(快速查询),使用排序数组(可能更快地进行构造?)或使用位数组(如果我在C中,则可以使所有内容更快...但是我怀疑Python是否会高效(?)).有任何建议可供选择吗?
I am looking for the most efficient way to represent small sets of integers in a given range (say 0-10) in Python. In this case, efficiency means fast construction (from an unsorted list), fast query (a couple of queries on each set), and reasonably fast construction of a sorted version (perhaps once per ten sets or so). A priori the candidates are using Python's builtin set type (fast query), using a sorted array (perhaps faster to constrct?), or using a bit-array (fast everything if I was in C... but I doubt Python will be that efficient (?)). Any advice of which one to choose?
谢谢.
推荐答案
我将使用位图并将集合"的成员存储在int
中,这实际上可能比内置的要快.在这种情况下,set
类型-尽管我还没有测试过.肯定会需要更少的存储空间.
I'd use a bitmapping and store the members of a "set" in an int
...which might actually be faster than the built-in set
type in this case -- although I haven't tested that. It would definitely require less storage.
更新
我现在没有时间进行全套的实现,并根据Python的内置类对其进行基准测试,但是我认为这是一个说明我的建议的可行示例.我想您会同意的,代码看起来相当快,而且内存效率很高.
I don't have the time right now to do a full set-like implementation and benchmark it against Python's built-in class, but here's what I believe is a working example illustrating my suggestion. As I think you'd agree, the code looks fairly fast as well as memory efficient.
鉴于Python几乎透明的无限"长整数功能,编写的内容将自动使用比所需范围大得多的整数值,尽管这样做可能会使速度变慢. ;)
Given Python's almost transparent "unlimited" long integer capabilities, what is written will automatically work with integer values in a much larger range than you need, although doing so would likely slow things down a bit. ;)
class BitSet(object):
def __init__(self, *bitlist):
self._bitmap = 0
for bitnum in bitlist:
self._bitmap |= (1 << bitnum)
def add(self, bitnum):
self._bitmap |= (1 << bitnum)
def remove(self, bitnum):
if self._bitmap & (1 << bitnum):
self._bitmap &= ~(1 << bitnum)
else:
raise KeyError
def discard(self, bitnum):
self._bitmap &= ~(1 << bitnum)
def clear(self):
self._bitmap = 0
def __contains__(self, bitnum):
return bool(self._bitmap & (1 << bitnum))
def __int__(self):
return self._bitmap
if __name__ == '__main__':
bs = BitSet()
print '28 in bs:', 28 in bs
print 'bs.add(28)'
bs.add(28)
print '28 in bs:', 28 in bs
print
print '5 in bs:', 5 in bs
print 'bs.add(5)'
bs.add(5)
print '5 in bs:', 5 in bs
print
print 'bs.remove(28)'
bs.remove(28)
print '28 in bs:', 28 in bs
这篇关于Python中小集合的性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!