Python hash table for fuzzy matching


Problem description




I am trying to implement a data structure which allows rapid look-ups based on keys.

The Python dict is great when my look-ups involve an equality
(e.g. key == somevalue translates to datadict[somevalue]).

The problem is that I also need to be able to efficiently look up keys based on a more complex comparison, e.g. key > 50, or key.startswith('abc').

Obviously I can't use the same solution in both cases, but at the moment I can't figure out how to solve either case. Can anyone suggest a way of doing this?

Solution

It doesn't sound like you want a hash algorithm. What you want instead is some form of binary tree, or even a sorted list that you search with the bisect module. It'd be worth looking at the question "Python's standard library - is there a module for balanced binary tree?"
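
As a rough illustration of the sorted-list idea, here is a minimal sketch using the standard bisect module. The helper names (keys_greater_than, keys_with_prefix) and the sample data are made up for this example and are not from the question. It assumes the keys are kept in a sorted list, and that the prefix lookup is only used on string keys.

```python
import bisect

# Hypothetical sample data: values live in a dict, keys are also kept sorted.
data = {10: "a", 42: "b", 51: "c", 75: "d", 99: "e"}
sorted_keys = sorted(data)


def keys_greater_than(keys, threshold):
    """Return every key strictly greater than threshold (keys must be sorted)."""
    # bisect_right gives the index just past any entries equal to threshold.
    start = bisect.bisect_right(keys, threshold)
    return keys[start:]


def keys_with_prefix(keys, prefix):
    """Return every string key that starts with prefix (keys must be sorted)."""
    # In sorted order, all strings sharing a prefix form one contiguous run
    # that begins at the first key >= prefix.
    start = bisect.bisect_left(keys, prefix)
    end = start
    while end < len(keys) and keys[end].startswith(prefix):
        end += 1
    return keys[start:end]


big_keys = keys_greater_than(sorted_keys, 50)
big_values = [data[k] for k in big_keys]

names = sorted(["abacus", "abc", "abcdef", "abd", "xyz"])
abc_names = keys_with_prefix(names, "abc")
```

Finding the start of each range is a binary search, so the cost is logarithmic plus the number of matches. The catch is that inserting into a Python list is linear, so this suits data that is built once and queried often.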

Another option (depending on your data) would be to use an in-memory sqlite3 database and create appropriate indices for the lookups you expect, but you'd be trading performance and memory, and taking on SQL syntax, for flexibility...
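
A minimal sketch of the sqlite3 route might look like the following. The table name, column names, helper functions and sample rows are all invented for illustration. The prefix query assumes a non-empty ASCII prefix and SQLite's default case-sensitive text comparison, and it is written as a range comparison rather than a LIKE pattern.

```python
import sqlite3

# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (key_num INTEGER, key_text TEXT, value TEXT)")
conn.execute("CREATE INDEX idx_items_num ON items (key_num)")
conn.execute("CREATE INDEX idx_items_text ON items (key_text)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (10, "abacus", "a"),
        (42, "abc", "b"),
        (51, "abcdef", "c"),
        (75, "abd", "d"),
        (99, "xyz", "e"),
    ],
)


def values_where_num_greater(conn, threshold):
    """Return values whose numeric key is strictly greater than threshold."""
    cur = conn.execute(
        "SELECT value FROM items WHERE key_num > ? ORDER BY key_num",
        (threshold,),
    )
    return [row[0] for row in cur]


def values_with_text_prefix(conn, prefix):
    """Return values whose text key starts with prefix (non-empty ASCII assumed)."""
    # Upper bound: the prefix with its last character bumped by one,
    # e.g. "abc" -> "abd", so the match becomes a half-open range.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    cur = conn.execute(
        "SELECT value FROM items WHERE key_text >= ? AND key_text < ? "
        "ORDER BY key_text",
        (prefix, upper),
    )
    return [row[0] for row in cur]


greater = values_where_num_greater(conn, 50)
prefixed = values_with_text_prefix(conn, "abc")
```

Both queries are plain range conditions on indexed columns, which is the kind of lookup an index is meant to serve. Whether this beats a sorted list depends on the data size and on how often it changes.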
