Algorithm to generate anagrams


Problem description

What would be the best strategy to generate anagrams?

An anagram is a type of word play: the result of rearranging the letters
of a word or phrase to produce a new word or phrase, using all the original
letters exactly once.
For example:

  • "Eleven plus two" is an anagram of "Twelve plus one"
  • "A decimal point" is an anagram of "I'm a dot in place"
  • "Astronomers" is an anagram of "Moon starers"
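The "all the original letters exactly once" rule can be checked mechanically by comparing letter counts. The snippet below is only a sketch: the helper names `letter_counts` and `is_anagram` are made up for illustration, and it assumes that case, spaces and punctuation (such as the apostrophe in "I'm") are ignored.

```python
from collections import Counter


def letter_counts(text):
    """Count the letters in text, ignoring case, spaces and punctuation."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def is_anagram(first, second):
    """Return True if both phrases use exactly the same letters."""
    return letter_counts(first) == letter_counts(second)


print(is_anagram("Eleven plus two", "Twelve plus one"))
print(is_anagram("A decimal point", "I'm a dot in place"))
print(is_anagram("Astronomers", "Moon starers"))
```

This only verifies a candidate pair; it does not generate anagrams.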

At first it looks straightforward: just jumble the letters and generate all possible combinations. But what would be an efficient approach that generates only words found in a dictionary?
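To get a feel for why plain jumbling scales badly, one can count how many distinct orderings a single word has. This is a rough illustration only; `distinct_permutations` is a hypothetical helper, and it counts single-string arrangements without considering where spaces could go.

```python
from collections import Counter
from math import factorial


def distinct_permutations(letters):
    """Number of distinct orderings of letters, accounting for repeats."""
    total = factorial(len(letters))
    for repeats in Counter(letters).values():
        total //= factorial(repeats)
    return total


# 11 letters with three repeated pairs (s, r, o): 11! / (2! * 2! * 2!)
print(distinct_permutations("astronomers"))  # 4989600
```

Almost none of those orderings are dictionary words, so generating them all and filtering afterwards wastes most of the work.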

I came across this page, Solving anagrams in Ruby.

But what are your ideas?

Solution

Most of these answers are horribly inefficient and/or will only give one-word solutions (no spaces). My solution will handle any number of words and is very efficient.

What you want is a trie data structure. Here's a complete Python implementation. You just need a word list saved in a file named words.txt. You can try the Scrabble dictionary word list here:

http://www.isc.ro/lists/twl06.zip

# Written for Python 3.
MIN_WORD_SIZE = 4  # min size of a word in the output

class Node(object):
    def __init__(self, letter='', final=False, depth=0):
        self.letter = letter
        self.final = final  # True if a dictionary word ends at this node
        self.depth = depth  # number of letters between the root and this node
        self.children = {}

    def add(self, letters):
        node = self
        for index, letter in enumerate(letters):
            if letter not in node.children:
                node.children[letter] = Node(letter, False, index + 1)
            node = node.children[letter]
        if letters:
            # Mark the end of the word even when its last node already
            # existed (e.g. "cat" added after "cats").
            node.final = True

    def anagram(self, letters):
        # Count how many of each letter ("tile") is available.
        tiles = {}
        for letter in letters:
            tiles[letter] = tiles.get(letter, 0) + 1
        min_length = len(letters)
        return self._anagram(tiles, [], self, min_length)

    def _anagram(self, tiles, path, root, min_length):
        if self.final and self.depth >= MIN_WORD_SIZE:
            word = ''.join(path)
            length = len(word.replace(' ', ''))
            if length >= min_length:
                # Every input letter has been used.
                yield word
            # A complete word ends here: restart from the root for the next word.
            path.append(' ')
            yield from root._anagram(tiles, path, root, min_length)
            path.pop()
        for letter, node in self.children.items():
            count = tiles.get(letter, 0)
            if count == 0:
                continue
            # Use one tile, descend, then put the tile back (backtracking).
            tiles[letter] = count - 1
            path.append(letter)
            yield from node._anagram(tiles, path, root, min_length)
            path.pop()
            tiles[letter] = count

def load_dictionary(path):
    result = Node()
    with open(path, 'r') as word_file:
        for line in word_file:
            word = line.strip().lower()
            result.add(word)
    return result

def main():
    print('Loading word list.')
    words = load_dictionary('words.txt')
    while True:
        letters = input('Enter letters: ')
        letters = letters.lower()
        letters = letters.replace(' ', '')
        if not letters:
            break
        count = 0
        for word in words.anagram(letters):
            print(word)
            count += 1
        print('%d results.' % count)

if __name__ == '__main__':
    main()

When you run the program, the words are loaded into a trie in memory. After that, just type in the letters you want to search with and it will print the results. It will only show results that use all of the input letters, nothing shorter.

It filters short words out of the output; otherwise the number of results is huge. Feel free to tweak the MIN_WORD_SIZE setting. Keep in mind that just using "astronomers" as input gives 233,549 results if MIN_WORD_SIZE is 1. Perhaps you can find a shorter word list that only contains the more common English words.

Also, the contraction "I'm" (from one of your examples) won't show up in the results unless you add "im" to the dictionary and set MIN_WORD_SIZE to 2.

The trick to getting multiple words is to jump back to the root node in the trie whenever you encounter a complete word in the search. Then you keep traversing the trie until all letters have been used.
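The jump-back-to-root idea can be seen in isolation in a smaller sketch. This is not the answer's implementation: it assumes a nested-dict trie with a made-up `'$'` end-of-word marker, the names `build_trie` and `multiword_anagrams` are hypothetical, there is no minimum word size, and the four-word list is only there to keep the example self-contained.

```python
from collections import Counter


def build_trie(words):
    """Build a nested-dict trie; the key '$' marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for letter in word:
            node = node.setdefault(letter, {})
        node['$'] = True
    return root


def multiword_anagrams(root, letters):
    """Yield every phrase made of trie words that uses all of letters."""
    tiles = Counter(letters)  # letters still available

    def search(node, path, remaining):
        if '$' in node:
            if remaining == 0:
                yield ''.join(path)
            else:
                # A complete word ends here: jump back to the root and
                # keep consuming the remaining letters for the next word.
                path.append(' ')
                yield from search(root, path, remaining)
                path.pop()
        for letter, child in node.items():
            if letter == '$' or tiles[letter] == 0:
                continue
            tiles[letter] -= 1
            path.append(letter)
            yield from search(child, path, remaining - 1)
            path.pop()
            tiles[letter] += 1

    yield from search(root, [], len(letters))


trie = build_trie(["moon", "starers", "astronomers", "star"])
print(sorted(multiword_anagrams(trie, "astronomers")))
```

Because each word order is a separate path through the trie, "moon starers" and "starers moon" are both reported, which is one reason the result count grows so quickly with a full dictionary.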
