Python hash table for fuzzy matching
Problem description
I am trying to implement a data structure which allows rapid look-ups based on keys. The Python dict is great when my look-ups involve an equality (e.g. key == somevalue translates to datadict[somevalue]).
The problem is that I also need to be able to efficiently look up keys based on a more complex comparison, e.g. key > 50, or key.startswith('abc').
Obviously I can't use the same solution in both cases, but at the moment I can't figure out how to solve either case. Can anyone suggest a way of doing this?
It doesn't sound like you want a hash table here. A hash only supports exact-match look-ups, so what you need instead is some ordered structure: a form of binary tree, or even a sorted list that you search with the bisect module. It'd be worth looking at: Python's standard library - is there a module for balanced binary tree?
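To make the sorted-list idea concrete, here is a minimal sketch of how bisect could answer both kinds of query from the question. The class name SortedIndex and its methods are hypothetical names chosen for illustration, and the sketch assumes the keys are mutually comparable and that the data is built once rather than updated frequently.

```python
import bisect


class SortedIndex:
    """Hypothetical helper: keeps keys sorted so bisect can locate ranges."""

    def __init__(self, items):
        # Keep the dict for exact look-ups and a sorted key list for ranges.
        self._data = dict(items)
        self._keys = sorted(self._data)

    def get(self, key):
        # Plain equality look-up still goes through the dict.
        return self._data[key]

    def greater_than(self, bound):
        # bisect_right finds the first position strictly after `bound`.
        start = bisect.bisect_right(self._keys, bound)
        return self._keys[start:]

    def with_prefix(self, prefix):
        # In sorted order, all strings sharing a prefix are contiguous and
        # begin at the first key that is >= the prefix itself.
        start = bisect.bisect_left(self._keys, prefix)
        matches = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            matches.append(key)
        return matches


numbers = SortedIndex({10: "a", 50: "b", 75: "c", 60: "d"})
words = SortedIndex({"abc1": 1, "abd": 2, "abcz": 3, "ab": 4, "b": 5})
over_fifty = numbers.greater_than(50)
abc_keys = words.with_prefix("abc")
```

Locating the start of the range takes O(log n); returning the matches costs time proportional to how many there are. Inserting into a sorted list is O(n), so a tree-based structure may suit data that changes often.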
Another option (depending on your data) would be to use an in-memory sqlite3 database and create appropriate indices for the look-ups you need, but you'll pay in performance, memory, and having to write SQL in exchange for that flexibility.
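A rough sketch of the sqlite3 route might look like the following. The table name, column names, and sample rows are made up for illustration, and whether SQLite actually uses an index for a given query depends on its query planner, so treat this as a starting point rather than a tuned solution.

```python
import sqlite3

# In-memory database; contents vanish when the connection is closed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (num INTEGER, name TEXT, payload TEXT)")
# One index per column that will be searched (hypothetical schema).
conn.execute("CREATE INDEX idx_num ON records (num)")
conn.execute("CREATE INDEX idx_name ON records (name)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        (10, "abcdef", "first"),
        (55, "abcxyz", "second"),
        (70, "xyz", "third"),
        (50, "ab", "fourth"),
    ],
)


def payloads_where_num_greater(bound):
    # Equivalent of the `key > 50` style comparison.
    rows = conn.execute(
        "SELECT payload FROM records WHERE num > ? ORDER BY num", (bound,)
    )
    return [row[0] for row in rows]


def payloads_with_name_prefix(prefix):
    # Equivalent of key.startswith(prefix). GLOB is case-sensitive; this
    # assumes the prefix itself contains no GLOB wildcard characters.
    rows = conn.execute(
        "SELECT payload FROM records WHERE name GLOB ? ORDER BY name",
        (prefix + "*",),
    )
    return [row[0] for row in rows]


big = payloads_where_num_greater(50)
abc = payloads_with_name_prefix("abc")
```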