生成字谜的算法 [英] Algorithm to generate anagrams

查看:36
本文介绍了生成字谜的算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

生成字谜的最佳策略是什么.

<块引用>

字谜是一种文字游戏,是重新排列字母的结果一个词或词组产生一个新词或词组,使用所有原来的只写一次;前任.

  • 十一加二十二加一
  • 小数点I'm a dot in place
  • 的字谜
  • AstronomersMoon starers
  • 的字谜

起初看起来很简单,只是将字母打乱并生成所有可能的组合.但是,仅生成字典中的单词的有效方法是什么.

我看到了这个页面,在 Ruby 中解决字谜.

但是你的想法是什么?

解决方案

这些答案中的大多数都非常低效和/或只会给出一个词的解决方案(没有空格).我的解决方案可以处理任意数量的单词并且非常有效.

你想要的是一个特里数据结构.这是一个完整的 Python 实现.您只需要一个保存在名为 words.txt 的文件中的单词列表您可以在此处尝试 Scrabble 词典单词列表:

http://www.isc.ro/lists/twl06.zip

MIN_WORD_SIZE = 4 # 输出中单词的最小大小类节点(对象):def __init__(self, letter='', final=False, depth=0):self.letter = 字母self.final = 最终self.depth = 深度self.children = {}定义添加(自我,字母):节点=自我对于索引,枚举中的字母(字母):如果字母不在 node.children 中:node.children[letter] = Node(letter, index==len(letters)-1, index+1)节点 = node.children[信件]def anagram(自我,字母):瓷砖 = {}对于字母中的字母:瓷砖[信件] = 瓷砖.get(信件, 0) + 1min_length = len(字母)返回 self._anagram(tiles, [], self, min_length)def _anagram(自我,瓷砖,路径,根,min_length):如果 self.final 和 self.depth >= MIN_WORD_SIZE:word = ''.join(路径)长度 = len(word.replace(' ', ''))如果长度 >= min_length:屈服词path.append(' ')对于 root._anagram(tiles, path, root, min_length) 中的单词:屈服词路径.pop()对于字母,self.children.iteritems() 中的节点:计数=tiles.get(字母,0)如果计数 == 0:继续瓷砖[字母] = 计数 - 1路径.附加(字母)对于 node._anagram(tiles, path, root, min_length) 中的单词:屈服词路径.pop()瓷砖[字母] =计数def load_dictionary(path):结果 = 节点()对于打开的行(路径,'r'):word = line.strip().lower()结果.添加(字)返回结果定义主():打印正在加载单词列表".words = load_dictionary('words.txt')而真:字母 = raw_input('请输入字母:')字母 = 字母.lower()字母 = 字母.replace(' ', '')如果不是字母:休息计数 = 0对于words.anagram(字母)中的单词:打印字计数 += 1打印%d 个结果."% 数数如果 __name__ == '__main__':主要的()

当你运行程序时,单词被加载到内存中的一个树中.之后,只需输入您要搜索的字母,它就会打印结果.它只会显示使用所有输入字母的结果,不会更短.

它从输出中过滤短词,否则结果数量巨大.随意调整 MIN_WORD_SIZE 设置.请记住,如果 MIN_WORD_SIZE 为 1,仅使用astronomers"作为输入会得到 233,549 个结果.也许您可以找到一个更短的单词列表,只包含更常见的英语单词.

此外,除非您将im"添加到字典并将 MIN_WORD_SIZE 设置为 2,否则收缩I'm"(来自您的示例之一)不会显示在结果中.

获取多个单词的技巧是在搜索中遇到完整单词时跳回树中的根节点.然后你继续遍历特里直到所有的字母都用完.

What would be the best strategy to generate anagrams.

An anagram is a type of word play, the result of rearranging the letters
of a word or phrase to produce a new  word or phrase, using all the original
letters exactly once; 
ex.

  • Eleven plus two is anagram of Twelve plus one
  • A decimal point is anagram of I'm a dot in place
  • Astronomers is anagram of Moon starers

At first it looks straightforwardly simple, just to jumble the letters and generate all possible combinations. But what would be the efficient approach to generate only the words in dictionary.

I came across this page, Solving anagrams in Ruby.

But what are your ideas?

解决方案

Most of these answers are horribly inefficient and/or will only give one-word solutions (no spaces). My solution will handle any number of words and is very efficient.

What you want is a trie data structure. Here's a complete Python implementation. You just need a word list saved in a file named words.txt You can try the Scrabble dictionary word list here:

http://www.isc.ro/lists/twl06.zip

MIN_WORD_SIZE = 4 # min size of a word in the output

class Node(object):
    def __init__(self, letter='', final=False, depth=0):
        self.letter = letter
        self.final = final
        self.depth = depth
        self.children = {}
    def add(self, letters):
        node = self
        for index, letter in enumerate(letters):
            if letter not in node.children:
                node.children[letter] = Node(letter, index==len(letters)-1, index+1)
            node = node.children[letter]
    def anagram(self, letters):
        tiles = {}
        for letter in letters:
            tiles[letter] = tiles.get(letter, 0) + 1
        min_length = len(letters)
        return self._anagram(tiles, [], self, min_length)
    def _anagram(self, tiles, path, root, min_length):
        if self.final and self.depth >= MIN_WORD_SIZE:
            word = ''.join(path)
            length = len(word.replace(' ', ''))
            if length >= min_length:
                yield word
            path.append(' ')
            for word in root._anagram(tiles, path, root, min_length):
                yield word
            path.pop()
        for letter, node in self.children.iteritems():
            count = tiles.get(letter, 0)
            if count == 0:
                continue
            tiles[letter] = count - 1
            path.append(letter)
            for word in node._anagram(tiles, path, root, min_length):
                yield word
            path.pop()
            tiles[letter] = count

def load_dictionary(path):
    result = Node()
    for line in open(path, 'r'):
        word = line.strip().lower()
        result.add(word)
    return result

def main():
    print 'Loading word list.'
    words = load_dictionary('words.txt')
    while True:
        letters = raw_input('Enter letters: ')
        letters = letters.lower()
        letters = letters.replace(' ', '')
        if not letters:
            break
        count = 0
        for word in words.anagram(letters):
            print word
            count += 1
        print '%d results.' % count

if __name__ == '__main__':
    main()

When you run the program, the words are loaded into a trie in memory. After that, just type in the letters you want to search with and it will print the results. It will only show results that use all of the input letters, nothing shorter.

It filters short words from the output, otherwise the number of results is huge. Feel free to tweak the MIN_WORD_SIZE setting. Keep in mind, just using "astronomers" as input gives 233,549 results if MIN_WORD_SIZE is 1. Perhaps you can find a shorter word list that only contains more common English words.

Also, the contraction "I'm" (from one of your examples) won't show up in the results unless you add "im" to the dictionary and set MIN_WORD_SIZE to 2.

The trick to getting multiple words is to jump back to the root node in the trie whenever you encounter a complete word in the search. Then you keep traversing the trie until all letters have been used.

这篇关于生成字谜的算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆