"Anagram solver" based on statistics rather than a dictionary/table?

Question

My problem is conceptually similar to solving anagrams, except I can't just use a dictionary lookup. I am trying to find plausible words rather than real words.

I have created an N-gram model (for now, N=2) based on the letters in a bunch of text. Now, given a random sequence of letters, I would like to permute them into the most likely sequence according to the transition probabilities. I thought I would need the Viterbi algorithm when I started this, but as I look deeper, the Viterbi algorithm optimizes a sequence of hidden random variables based on the observed output. I am trying to optimize the output sequence.
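As a concrete reference for the setup described above, here is a minimal sketch of a letter-bigram model and the score one would try to maximize over orderings. The names `train_bigram` and `score`, the add-`alpha` smoothing, and the choice to count pairs only within words are all assumptions made for illustration, not details given in the question.

```python
import math
import re
from collections import Counter

def train_bigram(text, alpha=1.0):
    """Return logp(a, b), a smoothed estimate of log P(b follows a)."""
    pair_counts = Counter()
    first_counts = Counter()
    alphabet = set()
    # Assumption: only count letter pairs inside a word, not across spaces.
    for word in re.findall(r"[a-z]+", text.lower()):
        alphabet.update(word)
        pair_counts.update(zip(word, word[1:]))
        first_counts.update(word[:-1])
    v = max(len(alphabet), 1)

    def logp(a, b):
        # Add-alpha smoothing keeps unseen pairs from scoring log(0).
        return math.log((pair_counts[(a, b)] + alpha) / (first_counts[a] + alpha * v))

    return logp

def score(seq, logp):
    """Sum of log transition probabilities along the sequence."""
    return sum(logp(a, b) for a, b in zip(seq, seq[1:]))

logp = train_bigram("the then them there")
print(score("the", logp) > score("hte", logp))
```

Working in log space turns the product of transition probabilities into a sum, which avoids underflow on longer sequences.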

Is there a well-known algorithm for this that I can read about? Or am I on the right track with Viterbi and I'm just not seeing how to apply it?

Update

I have added a bounty to ask for more insight into this problem. (Analysis explaining why an efficient approach isn't possible, other heuristics/approximations besides simulated annealing, etc.)

Recommended answer

If I understand your problem correctly, you are searching all permutations of the letters in a word for the one with the highest product of 2-gram probabilities.
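For short inputs, the exhaustive search can be written directly. The sketch below picks the ordering that the model scores as most likely. The `PROBS` table and the `FLOOR` value for unseen pairs are hypothetical toy numbers; a real table would be estimated from a corpus.

```python
import math
from itertools import permutations

# Hypothetical toy bigram probabilities, for illustration only.
PROBS = {("t", "h"): 0.5, ("h", "e"): 0.5, ("e", "r"): 0.2}
FLOOR = 0.01  # assumed probability for any pair missing from PROBS

def score(seq):
    """Log of the product of bigram probabilities along seq."""
    return sum(math.log(PROBS.get(pair, FLOOR)) for pair in zip(seq, seq[1:]))

def best_permutation(letters):
    """Try every distinct ordering and keep the highest-scoring one."""
    candidates = set(permutations(letters))  # set() drops duplicate orderings
    return "".join(max(candidates, key=score))

print(best_permutation("eht"))
```

The number of orderings grows as n!, so this is only practical for roughly ten letters or fewer.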

If your word is too long to simply brute-force all permutations, I've found that stochastic optimization algorithms produce good results in a short time. I (having a mathematical background) have done some work on simulated annealing, which I think would fit your problem nicely. It is also pretty easy to implement.
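A minimal sketch of simulated annealing for this problem might look like the following. It is not the answerer's own implementation. The swap-two-letters move, the geometric cooling schedule, the parameter defaults, and the toy `PROBS` table are all assumptions chosen for illustration.

```python
import math
import random

# Hypothetical toy bigram probabilities, for illustration only.
PROBS = {("t", "h"): 0.5, ("h", "e"): 0.5, ("e", "r"): 0.3, ("r", "e"): 0.3}
FLOOR = 0.01  # assumed probability for any pair missing from PROBS

def score(seq):
    """Log of the product of bigram probabilities along seq."""
    return sum(math.log(PROBS.get(pair, FLOOR)) for pair in zip(seq, seq[1:]))

def anneal(letters, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Search orderings of letters by random swaps with a cooling temperature."""
    rng = random.Random(seed)
    current = list(letters)
    if len(current) < 2:
        return "".join(current)
    current_score = score(current)
    best, best_score = current[:], current_score
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        t = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(len(current)), 2)
        current[i], current[j] = current[j], current[i]
        delta = score(current) - current_score
        # Always accept improvements; accept worse moves with prob exp(delta / t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current_score += delta
            if current_score > best_score:
                best, best_score = current[:], current_score
        else:
            current[i], current[j] = current[j], current[i]  # undo the swap
    return "".join(best)

print(anneal("rehte"))
```

The function returns the best ordering seen during the run, which is not guaranteed to be the global optimum. Rerunning with different seeds or more steps is a cheap way to check how stable the result is.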
