哈希表和子串匹配 [英] Hash Table and Substring Matching

查看:206
本文介绍了哈希表和子串匹配的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有上百个按键,例如像:

I have hundreds of keys for example like:

  • redapple
  • maninred
  • foraman
  • blueapple

我必须与这些密钥的数据,数据是一个字符串,并在结束有关关键。

i have data related to these keys, data is a string and has related key at the end.

  • redapple:该树,有-redapple
  • maninred:她锯最maninred
  • foraman:他们买来的,任何─present-foraman
  • blueapple:这 - 是 - 令人吃惊的,但是,它 - 是 - 一 - blueapple

我期望使用哈希表和哈希函数根据项来记录数据,我期望能够从表retieve数据

i am expected to use hash table and hash function to record the data according to keys and i am expected to be able to retieve data from table.

我知道要使用的散列函数和散列表,就没有问题这里。

i know to use hash function and hash table, there is no problem here.

但是;

我有望给该程序的字符串,发生子字符串和检索数据匹配的钥匙。

i am expected to give the program a string which takes place as a substring and retrieve the data for the matching keys.

例如:

我必须给并必须能够获得

i must give "red" and must be able to get

  • redapple:该树,有-redapple
  • maninred:她锯最maninred

作为输出

我必须给苹果并必须能够获得

i must give "apple" and must be able to get

  • redapple:该树,有-redapple
  • blueapple:这 - 是 - 令人吃惊的,但是,它 - 是 - 一 - blueapple

作为输出

我只能想到,如果他们有一个匹配的子字符串,是有一些其他的解决办法来搜索所有的钥匙?如果我搜索的所有关键字符串每个查询,使用哈希是不必要的,毫无意义的,是吧?

但是,搜索所有键子是O(N),我有望解决带O的问题(1)。

But, searching all keys for substring is O(N), i am expected to solve the problem with O(1).

使用哈希我可以凑一键如redapple来如 943 ,然后maninred来如 332

With hashing i can hash a key e.g. "redapple" to e.g. 943, and "maninred" to e.g. 332.

和查询人给字符串红色我怎样才能找到从 943 332 ,该键有红的子?这是从我的CS思维能力。

And query man give the string "red" how can i found out from 943 and 332 that the keys has "red" substring? It is out of my cs thinking skills.

感谢您的任何意见,想法。

Thanks for any advise, idea.

推荐答案

可能你应该使用反转指标正格拉姆,同样的方法用于拼写校正。对于字 redapple 您将有下面的一组3 gramms 红,EDA,DAP,应用程序,PPL,PLE 。对于每一个n-格拉姆你将有字符串的列表,其中包含它。例如,对于红色将是

Possible you should use the invert index for n-gramm, the same approach is used for spell correction. For word redapple you will have following set of 3-gramms red, eda, dap, app, ppl, ple. For each n-gramm you will have a list of string in which contains it. For example for red it will be

红色 - > maninred,redapple

在这个名单的话必须是有序的。当你想找到一个包含AA给予子的所有字符串,你潜入该子上正格拉姆和拦截的单词列表正格拉姆。

words in this list must be ordered. When you want to find the all string that contains a a give substring, you dived the substring on n-gramm and intercept the list of words for n-gramm.

这alogriphm不是为O(n),但实践它有足够的速度。

This alogriphm is not O(n), but it practice it has enough speed.

这篇关于哈希表和子串匹配的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆