Get list of anagrams from a dictionary
Question
Basically, anagrams are like permutations of a string. E.g. stack, sackt and stakc are all anagrams of stack (though the above words aren't meaningful). Anyway, you will have understood what I basically mean.
Now, I want a list of anagrams given a million words, or simply, from a dictionary.
My basic question is: how do I find the total number of unique anagrams in a dictionary?
Sorting and comparing won't work, as its time complexity is pretty bad.
I thought of using a hash table, with the string as the key.
But the problem is: what should the hash function be? It would be helpful if some pseudocode were provided. Other approaches that are better than the ones mentioned would also be helpful.
Thanks.
Recommended answer
The obvious solution is to map each character to a prime number and multiply the prime numbers. So if 'a' -> 2 and 'b' -> 3, then
- 'ab' -> 6
- 'ba' -> 6
- 'bab' -> 18
- 'abba' -> 36
- 'baba' -> 36
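The mapping above can be sketched in Python as follows. This is a minimal sketch, not the answerer's implementation: it assumes words contain only lowercase a-z, and the names `PRIMES`, `prime_key` and `group_anagrams` are illustrative. Because every integer has a unique prime factorisation, two words get the same product exactly when they contain the same letters with the same counts. Python integers are arbitrary precision, so overflow does not arise here.

```python
from collections import defaultdict

# First 26 primes, assigned to 'a'..'z' in alphabetical order
# (an illustrative assignment; any one-to-one assignment works).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]

def prime_key(word):
    """Multiply the primes of all letters; anagrams share the same product."""
    key = 1
    for ch in word:
        key *= PRIMES[ord(ch) - ord('a')]
    return key

def group_anagrams(words):
    """Map each prime product to the list of words that produce it."""
    groups = defaultdict(list)
    for w in words:
        groups[prime_key(w)].append(w)
    return groups
```

With this sketch, `len(group_anagrams(words))` gives the number of distinct anagram classes in `words`, using one pass over each word and no sorting.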
To minimise the chance of overflow, the smallest primes could be assigned to the more frequent letters (e,t,i,a,n). Note: The 26th prime is 101.
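One way to apply the frequency idea is to derive the letter ranking from the word list itself rather than from a fixed table. The sketch below does that under stated assumptions: the words use at most 26 distinct letters, and the helper names `build_prime_table`, `prime_key` and `fits_in_uint64` are hypothetical. The 64-bit check is only there to illustrate the overflow concern for languages with fixed-width integers.

```python
from collections import Counter

# First 26 primes; the 26th is 101.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]

def build_prime_table(words):
    """Assign the smallest primes to the letters that occur most often."""
    counts = Counter(ch for w in words for ch in w)
    # most_common() lists letters in descending order of count.
    ranked = [ch for ch, _ in counts.most_common()]
    return {ch: PRIMES[i] for i, ch in enumerate(ranked)}

def prime_key(word, table):
    """Product of the primes assigned to each letter of the word."""
    key = 1
    for ch in word:
        key *= table[ch]
    return key

def fits_in_uint64(key):
    """True if the product would fit in an unsigned 64-bit integer."""
    return key < 2**64
```

For scale: a letter mapped to 101 and repeated ten times gives 101**10, about 1.1e20, which already exceeds 2**64 (about 1.8e19). Keeping large primes on rare letters makes such products less likely for real words.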
Update: an implementation can be found here.