How do I determine if a random string sounds like English?


Question


I have an algorithm that generates strings based on a list of input words. How do I keep only the strings that sound like English words? For example, discard RDLO while keeping LORD.

EDIT: To clarify, they do not need to be actual dictionary words. They just need to sound like English. For example, KEAL would be accepted.

Solution

You can build a Markov chain from a large body of English text.

Afterwards, you can feed a word into the Markov chain and check how probable its sequence of characters is. A high probability suggests that the word sounds like English.

See here: http://en.wikipedia.org/wiki/Markov_chain

At the bottom of the page you can see the Markov text generator. What you want is exactly the reverse of it: instead of generating text from the chain, you score existing text against it.

In a nutshell: for each character, the Markov chain stores the probabilities of which character will follow next. You can extend this idea to contexts of two or three characters if you have enough memory.
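
As a rough illustration of the idea, here is a minimal Python sketch of a character-level Markov chain used as a scorer. The function names (`train`, `score`), the `^`/`$` start and end markers, the smoothing constant `alpha`, the alphabet size of 27 and the tiny word list are all assumptions made for this example. In practice you would train on a large English word list or text and choose an acceptance threshold empirically.

```python
import math
from collections import defaultdict

def train(words, order=2):
    """Count which character follows each `order`-character context."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        # '^' pads the start and '$' marks the end (markers chosen for this sketch)
        padded = "^" * order + word.lower() + "$"
        for i in range(order, len(padded)):
            counts[padded[i - order:i]][padded[i]] += 1
    return counts

def score(word, counts, order=2, alphabet_size=27, alpha=0.1):
    """Average log-probability per transition; higher means more English-like."""
    padded = "^" * order + word.lower() + "$"
    total = 0.0
    for i in range(order, len(padded)):
        followers = counts.get(padded[i - order:i], {})
        seen = followers.get(padded[i], 0)
        # Additive smoothing: unseen transitions get a small non-zero probability.
        # alphabet_size=27 assumes 26 letters plus the end marker.
        p = (seen + alpha) / (sum(followers.values()) + alpha * alphabet_size)
        total += math.log(p)
    return total / (len(padded) - order)

# Toy training data, far too small for real use (illustration only).
corpus = ["lord", "word", "ford", "cord", "load", "lore", "real", "deal",
          "keel", "seal", "meal", "keen", "keep", "lead", "lend",
          "order", "older", "border"]
model = train(corpus)

for candidate in ["LORD", "KEAL", "RDLO"]:
    print(candidate, round(score(candidate, model), 2))
```

With this toy data, LORD and KEAL should score noticeably higher than RDLO. Dividing by the number of transitions keeps long and short strings comparable, and raising `order` to 3 gives sharper judgments at the cost of more memory and more training text.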
