Efficent字争夺算法 [英] Efficent word scramble algorithm

查看:160
本文介绍了Efficent字争夺算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在寻找一个有效的算法进行加扰的一组字母到含有单词的最大数量的置换。

I'm looking for an efficient algorithm for scrambling a set of letters into a permutation containing the maximum number of words.

举例来说,说我给字母列表:{E,E,H,R,S,T}。我需要订购他们以这样一种方式,以包含单词的最大数量。如果我为了这些字母为孤单,它包含单词的,有,她,这里,和ERE。使实施例可以有5分,因为它含有5个字。我想订购以这样的方式的字母以具有最高分(包含最字)。

For example, say I am given the list of letters: {e, e, h, r, s, t}. I need to order them in such a way as to contain the maximum number of words. If I order those letters into "theres", it contain the words "the", "there", "her", "here", and "ere". So that example could have a score of 5, since it contains 5 words. I want to order the letters in such a way as to have the highest score (contain the most words).

一个天真的算法将尝试得分,每次改变。我相信这是O(n!),所以720不同的排列将被审判就在6个字母以上(包括部分重复的,因为这个例子还有一项电子的两倍)。欲了解更多字母,天真的解决方案迅速成为当然是不可能的。

A naive algorithm would be to try and score every permutation. I believe this is O(n!), so 720 different permutations would be tried for just the 6 letters above (including some duplicates, since the example has e twice). For more letters, the naive solution quickly becomes impossible, of course.

该算法不必实际产生最好的解决方案,但它应该发现在合理的时间量良好的解决方案。对于我的应用程序,仅仅是猜测(蒙特卡洛)在几百万排列作品相当不好,所以这是目前该商标被击败。

The algorithm doesn't have to actually produce the very best solution, but it should find a good solution in a reasonable amount of time. For my application, simply guessing (Monte Carlo) at a few million permutations works quite poorly, so that's currently the mark to beat.

我目前使用的是阿霍Corasick 算法得分排列。它搜索在短短一次通过文字字典中的每个字,所以我认为这是相当有效的。这也意味着我有全都储存在线索的话,但如果另一种算法,需要不同的存储这没关系。我不担心有关设置字典,实际订货刚刚运行时间和搜索。如果需要,即使是一个模糊的字典可以使用,如布隆过滤

I am currently using the Aho-Corasick algorithm to score permutations. It searches for each word in the dictionary in just one pass through the text, so I believe it's quite efficient. This also means I have all the words stored in a trie, but if another algorithm requires different storage that's fine too. I am not worried about setting up the dictionary, just the run time of the actual ordering and searching. Even a fuzzy dictionary could be used if needed, like a Bloom Filter.

有关我的应用程序,给出字母列表是100左右,而字典有超过10万项。这本字典永远不会改变,但有几个不同的字母列表需要订购。

For my application, the list of letters given is about 100, and the dictionary contains over 100,000 entries. The dictionary never changes, but several different lists of letters need to be ordered.

我正在考虑尝试一种寻路算法。我相信我可以先参观一下从列表中随机字母为起点。然后每个剩余的字母将被用于创造的道路。我认为这将很好地与阿霍Corasick评分算法,因为分数可以建立一个字母的时间。我没有尝试过的路径还发现,虽然;也许这不是一个连一个好主意吗?我不知道哪个路径寻找算法可能是最好的。

I am considering trying a path finding algorithm. I believe I could start with a random letter from the list as a starting point. Then each remaining letter would be used to create a "path." I think this would work well with the Aho-Corasick scoring algorithm, since scores could be built up one letter at a time. I haven't tried path finding yet though; maybe it's not a even a good idea? I don't know which path finding algorithm might be best.

另一种算法,我想也开始用随机字母。那么字典特里将搜索包含仍然字母富的分支。含有不可用字母字典分支机构将被修剪。我对如何做到这一点的工作完全细节有点云里雾里,但它可以完全消除得分排列。

Another algorithm I thought of also starts with a random letter. Then the dictionary trie would be searched for "rich" branches containing the remain letters. Dictionary branches containing unavailable letters would be pruned. I'm a bit foggy on the details of how this would work exactly, but it could completely eliminate scoring permutations.

任何建议,意见或算法将大大AP preciated。谢谢!

Any suggestions, advice, or algorithms will be greatly appreciated. Thanks!

推荐答案

您可以尝试模拟退火< /一>,它已被成功地用于在多个域中的复杂的优化问题。基本上你随机爬山,同时逐步减少随意性。既然你已经有阿霍Corasick得分,你所做的大部分工作已经。所有你需要的是一种能够产生邻居排列;对于一些简单的像换了一对字母应该很好地工作。

You might try simulated annealing, which has been used successfully for complex optimization problems in a number of domains. Basically you do randomized hill-climbing while gradually reducing the randomness. Since you already have the Aho-Corasick scoring you've done most of the work already. All you need is a way to generate neighbor permutations; for that something simple like swapping a pair of letters should work fine.

这篇关于Efficent字争夺算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆