Modern, high performance Bloom filter in Python?


Question

I'm looking for a production-quality Bloom filter implementation in Python that can handle fairly large numbers of items (say 100M to 1B items at a 0.01% false positive rate).

Pybloom is one option, but it seems to be showing its age: it regularly emits DeprecationWarning on Python 2.5. Joe Gregorio also has an implementation.

The requirements are fast lookup performance and stability. I'm also open to writing Python interfaces to particularly good C/C++ implementations, or even to using Jython if there's a good Java implementation.

Failing that, any recommendations for a bit array / bit vector representation that can handle ~16E9 bits?
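To put the numbers in the question in context, here is a rough sketch of the standard Bloom filter sizing formulas together with a minimal bit vector backed by a `bytearray`. The names `bloom_parameters` and `BitArray` are made up for illustration and do not come from any of the libraries mentioned. Note that a flat byte buffer for ~16E9 bits is about 2 GB, so in practice a memory-mapped file or a C-backed bit array may be a better fit than a pure-Python structure.

```python
import math


def bloom_parameters(n_items, fp_rate):
    """Return (bits, hash_count) from the textbook sizing formulas.

    bits   = -n * ln(p) / (ln 2)^2
    hashes = (bits / n) * ln 2
    """
    bits = math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))
    hashes = max(1, round(bits / n_items * math.log(2)))
    return bits, hashes


class BitArray:
    """Hypothetical fixed-size bit vector stored in a bytearray (8 bits per byte)."""

    def __init__(self, n_bits):
        self.n_bits = n_bits
        self._bytes = bytearray((n_bits + 7) // 8)  # round up to whole bytes

    def set(self, i):
        self._bytes[i >> 3] |= 1 << (i & 7)

    def get(self, i):
        return bool(self._bytes[i >> 3] & (1 << (i & 7)))


if __name__ == "__main__":
    bits, hashes = bloom_parameters(1_000_000_000, 0.0001)
    print(f"1B items at 0.01%: {bits / 1e9:.1f}e9 bits, {hashes} hash functions")
```

Under these formulas, 1B items at a 0.01% false positive rate need a little over 19e9 bits (roughly 2.4 GB) and 13 hash functions, so a 16E9-bit array would hold somewhat fewer items at that error rate.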

Recommended answer

Eventually I found pybloomfiltermap. I haven't used it, but it looks like it would fit the bill.
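For readers who want to see what such a library does under the hood, below is a minimal pure-Python sketch of a Bloom filter. It is not the API of pybloomfiltermap or any other package: the class name `SimpleBloomFilter`, its parameters, and the choice of SHA-256 with double hashing are all assumptions made for illustration, and it would be far too slow and memory-hungry for 100M to 1B items.

```python
import hashlib
import math


class SimpleBloomFilter:
    """Illustrative Bloom filter for str items; not a production implementation."""

    def __init__(self, capacity, error_rate):
        # Textbook sizing: bits = -n ln(p) / (ln 2)^2, hashes = (bits / n) ln 2
        self.n_bits = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        self.n_hashes = max(1, round(self.n_bits / capacity * math.log(2)))
        self._bits = bytearray((self.n_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all k positions from two 64-bit values
        # taken from a single SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        for i in range(self.n_hashes):
            yield (h1 + i * h2) % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self._bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self._bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(item)
        )


if __name__ == "__main__":
    bf = SimpleBloomFilter(capacity=1000, error_rate=0.0001)
    bf.add("apple")
    print("apple" in bf)
```

A membership test can return a false positive but never a false negative, which is why the filter only ever sets bits and never clears them.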
