Python中小集合的性能 [英] Performance of small sets in Python

查看:86
本文介绍了Python中小集合的性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找最有效的方法来表示Python中给定范围(例如0-10)的小整数集.在这种情况下,效率意味着快速构建(从未排序的列表开始),快速查询(对每个集合进行几个查询)以及合理快速构建已排序的版本(可能每十个集合一次).先验的候选人使用的是Python的内置集合类型(快速查询),使用排序数组(可能更快地进行构造?)或使用位数组(如果我在C中,则可以使所有内容更快...但是我怀疑Python是否会高效(?)).有任何建议可供选择吗?

I am looking for the most efficient way to represent small sets of integers in a given range (say 0-10) in Python. In this case, efficiency means fast construction (from an unsorted list), fast query (a couple of queries on each set), and reasonably fast construction of a sorted version (perhaps once per ten sets or so). A priori the candidates are using Python's builtin set type (fast query), using a sorted array (perhaps faster to constrct?), or using a bit-array (fast everything if I was in C... but I doubt Python will be that efficient (?)). Any advice of which one to choose?

谢谢.

推荐答案

我将使用位图并将集合"的成员存储在int中,这实际上可能比内置的要快.在这种情况下,set类型-尽管我还没有测试过.肯定会需要更少的存储空间.

I'd use a bitmapping and store the members of a "set" in an int...which might actually be faster than the built-in set type in this case -- although I haven't tested that. It would definitely require less storage.

更新

我现在没有时间进行全套的实现,并根据Python的内置类对其进行基准测试,但是我认为这是一个说明我的建议的可行示例.我想您会同意的,代码看起来相当快,而且内存效率很高.

I don't have the time right now to do a full set-like implementation and benchmark it against Python's built-in class, but here's what I believe is a working example illustrating my suggestion. As I think you'd agree, the code looks fairly fast as well as memory efficient.

鉴于Python几乎透明的无限"长整数功能,编写的内容将自动使用比所需范围大得多的整数值,尽管这样做可能会使速度变慢. ;)

Given Python's almost transparent "unlimited" long integer capabilities, what is written will automatically work with integer values in a much larger range than you need, although doing so would likely slow things down a bit. ;)

class BitSet(object):
    def __init__(self, *bitlist):
        self._bitmap = 0
        for bitnum in bitlist:
            self._bitmap |= (1 << bitnum)

    def add(self, bitnum):
        self._bitmap |= (1 << bitnum)

    def remove(self, bitnum):
        if self._bitmap & (1 << bitnum):
            self._bitmap &= ~(1 << bitnum)
        else:
            raise KeyError

    def discard(self, bitnum):
       self._bitmap &= ~(1 << bitnum)

    def clear(self):
        self._bitmap = 0

    def __contains__(self, bitnum):
        return bool(self._bitmap & (1 << bitnum))

    def __int__(self):
        return self._bitmap

if __name__ == '__main__':

    bs = BitSet()

    print '28 in bs:', 28 in bs
    print 'bs.add(28)'
    bs.add(28)
    print '28 in bs:', 28 in bs

    print
    print '5 in bs:', 5 in bs
    print 'bs.add(5)'
    bs.add(5)
    print '5 in bs:', 5 in bs

    print
    print 'bs.remove(28)'
    bs.remove(28)
    print '28 in bs:', 28 in bs

这篇关于Python中小集合的性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆