Get a list of anagrams from a dictionary


Question

Basically, anagrams are permutations of a string. E.g. stack, sackt, and stakc are all anagrams of stack (though the words above aren't meaningful). Anyway, you probably get the basic idea.

Now, I want a list of anagrams given a million words, or simply put, from a dictionary.

My basic question is: how do I find the total number of unique anagrams in a dictionary?

Sorting and comparing won't work, as its time complexity is pretty bad.

I thought of using a hash table with the string as the key.

But the problem is: what should the hash function be? It would be helpful if some pseudocode were provided. Other approaches that are better than the ones mentioned would also be helpful.

Thanks.

Recommended answer

The obvious solution is to map each character to a prime number and multiply the prime numbers. So if 'a' -> 2 and 'b' -> 3, then

  • 'ab' -> 6
  • 'ba' -> 6
  • 'bab' -> 18
  • 'abba' -> 36
  • 'baab' -> 36

To minimise the chance of overflow, the smallest primes could be assigned to the more frequent letters (e, t, i, a, n). Note: the 26th prime is 101.
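A minimal Python sketch of the prime-product idea might look like the following. It is not the implementation the answer links to. The names `LETTERS_BY_FREQUENCY`, `first_primes`, `anagram_key`, and `group_anagrams` are made up for illustration, and the letter ordering is only a rough assumption about English letter frequency. Since every integer has a unique prime factorisation, two words get the same product exactly when they use the same letters the same number of times. Python integers do not overflow, so the frequency ordering here only keeps the keys small; in a language with fixed-width integers, long words could still overflow.

```python
from collections import defaultdict

# Assumed ordering: letters from roughly most to least frequent in English.
# Any fixed ordering of the 26 letters gives correct keys.
LETTERS_BY_FREQUENCY = "etaoinshrdlcumwfgypbvkjxqz"


def first_primes(count):
    """Return the first `count` primes using simple trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


# Smallest primes go to the most frequent letters to keep products small.
PRIME_FOR = dict(zip(LETTERS_BY_FREQUENCY, first_primes(26)))


def anagram_key(word):
    """Multiply the primes of all letters; anagrams share the same product.

    Assumes the word contains only the letters a-z (case-insensitive);
    any other character raises KeyError.
    """
    key = 1
    for ch in word.lower():
        key *= PRIME_FOR[ch]
    return key


def group_anagrams(words):
    """Group words by key using a hash table (dict) keyed on the product."""
    groups = defaultdict(list)
    for word in words:
        groups[anagram_key(word)].append(word)
    return list(groups.values())


if __name__ == "__main__":
    groups = group_anagrams(["stack", "sackt", "stakc", "tea", "eat", "dog"])
    print(groups)
    print("distinct anagram classes:", len(groups))
```

The number of distinct anagram classes the question asks about is then simply the number of keys in the table.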

Update: an implementation can be found here
