字谜算法使用哈希表和/或尝试 [英] Anagram Algorithm using a hashtable and/or tries

查看:185
本文介绍了字谜算法使用哈希表和/或尝试的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在寻找的上网一段时间,现在的步骤,找到一个字符串(单词)的所有字谜(IE团队生产的词驯服)使用哈希表和字典树。所有我这里的SO发现是验证2个字字谜。我希望把它更进一步,找到英语的算法,这样我可以用Java编程。

例如,

  • 通过所有字符循环。
  • 对于每一个独特的性格插入到哈希表。
  • 等等。

我不想要一个完整的程序。是的,我练了一次采访。如果这个问题来了,然后我就知道了它,并知道如何解释它不只是记住它。

解决方案

最简洁的答案,由于一些人在编程珠玑一书引用的是(意译):

这样(手波水平左向右)排序,然后这样(手波垂直上下)

这意味着,从一列的表(字)开始,创建一个两列的表:(sorted_word,字),然后排序第一列

现在找到一个单词字谜,第一计算排序字和做二进制搜索为表中的第一列的第一次出现,并读出第二列的值,而第一列是相同的。

输入(不需要进行排序):

 伴侣
驯服
微尘
球队
对我来说
 

排序这样(水平):

  aemt,交配
aemt,驯服
EMOT,微尘
aemt,团队
EMOT,希瑞
 

排序这样(垂直):

  aemt,交配
aemt,驯服
aemt,团队
EMOT,微尘
EMOT,希瑞
 

查找团队 - >aemt

  aemt,交配
aemt,驯服
aemt,团队
 

至于哈希表/试他们只接触到的图片,如果你想有一个稍微更快的查找。使用哈希表可以分区2列垂直排序表为k个分区基于第一列的哈希值。这会给你一个常数因子加速,因为你必须只在一个分区做一个二进制搜索。尝试是帮助你避免做过多的字符串比较优化的一种不同的方式,你挂完第一行的索引表为线索的每个终端的相应部分。

I have been searching the internet for awhile now for steps to find all the anagrams of a string (word) (i.e. Team produces the word tame) using a hashtable and a trie. All I have found here on SO is to verify 2 words are anagrams. I would like to take it a step further and find an algorithm in english so that I can program it in Java.

For example,

  • Loop through all the characters.
  • For each unique character insert into the hashtable.
  • and so forth.

I don't want a complete program. Yes, I am practicing for an interview. If this question comes up then I will know it and know how to explain it not just memorize it.

解决方案

the most succinct answer due to some guy quoted in the "programming pearls" book is (paraphrasing):

"sort it this way (waves hand horizontally left to right), and then that way (waves hand vertically top to bottom)"

this means, starting from a one-column table (word), create a two column table: (sorted_word, word), then sort it on the first column.

now to find anagrams of a word, first compute sorted word and do a binary search for its first occurrence in the first column of the table, and read off the second column values while the first column is the same.

input (does not need to be sorted):

mate
tame
mote
team
tome

sorted "this way" (horizontally):

aemt, mate
aemt, tame
emot, mote
aemt, team
emot, tome

sorted "that way" (vertically):

aemt, mate
aemt, tame
aemt, team
emot, mote
emot, tome

lookup "team" -> "aemt"

aemt, mate
aemt, tame
aemt, team

As far as hashtables/tries they only come into the picture if you want a slightly speedier lookup. Using hash tables you can partition the 2-column vertically sorted table into k-partitions based on the hash of the first column. this will give you a constant factor speedup because you have to do a binary search only within one partition. tries are a different way of optimizing by helping you avoid doing too many string comparisons, you hang off the index of the first row for the appropriate section of the table for each terminal in the trie.

这篇关于字谜算法使用哈希表和/或尝试的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆