How do I determine if a random string sounds like English?


Question

I have an algorithm that generates strings based on a list of input words. How do I keep only the strings that sound like English words? I.e., discard RDLO while keeping LORD.

EDIT: To clarify, they do not need to be actual words in the dictionary. They just need to sound like English. For example KEAL would be accepted.

Solution

You can build a Markov chain from a large body of English text.

Afterwards, you can feed candidate words into the Markov chain and check how probable each one is under the model. The higher the probability, the more the word looks like English.
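As a rough illustration of that idea, here is a minimal Python sketch of a character-level Markov chain used as a scorer. Everything in it is an assumption made for the example: the names `train` and `score`, the tiny `SAMPLE_WORDS` list standing in for a large English corpus, the `^` and `$` word-boundary markers, and the add-one smoothing. The score is the average log-probability per transition, so words of different lengths can be compared.

```python
import math
from collections import defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
START, END = "^", "$"  # hypothetical word-boundary markers

# Toy stand-in for a large English corpus (assumption for this sketch only).
SAMPLE_WORDS = (
    "lord word ford cord load road lore lost long loan more core "
    "bore store order border real deal keel seal meal heal kept keen"
).split()

def _symbols(word):
    """Lower-case the word, drop non-letters, and add boundary markers."""
    return [START] + [c for c in word.lower() if c in ALPHABET] + [END]

def train(words):
    """Count how often each character is followed by each other character."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        chars = _symbols(word)
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts

def score(word, counts):
    """Average log-probability per transition, with add-one smoothing."""
    chars = _symbols(word)
    vocab = len(ALPHABET) + 1  # 26 letters plus the end marker
    total = 0.0
    for prev, nxt in zip(chars, chars[1:]):
        row = counts.get(prev, {})
        seen = row.get(nxt, 0)
        total += math.log((seen + 1) / (sum(row.values()) + vocab))
    return total / (len(chars) - 1)

model = train(SAMPLE_WORDS)
for candidate in ("LORD", "KEAL", "RDLO"):
    print(candidate, round(score(candidate, model), 3))
```

On this toy word list, LORD and KEAL both score higher than RDLO. To accept or reject strings you would still have to pick a cutoff score, which depends on the training text and is probably best chosen by inspecting scores of known good and bad examples.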

See here: http://en.wikipedia.org/wiki/Markov_chain

At the bottom of that page you can find Markov text generators. What you want is exactly the reverse: instead of generating text from the chain, you score existing strings with it.

In a nutshell: for each character, the Markov chain stores the probabilities of which character follows it. If you have enough memory, you can extend this idea so that the preceding two or three characters determine the next one.
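The extension to longer contexts could be sketched as follows in Python. The function names `train_ngram` and `score_ngram`, the `order` parameter, the `^`/`$` padding, the add-one smoothing, and the small `SAMPLE_WORDS` list are all assumptions for illustration; a real model would be trained on a large English text. The memory remark shows up directly: with 26 letters plus a boundary marker, the table can hold on the order of 27^order distinct contexts.

```python
import math
from collections import Counter, defaultdict

# Toy stand-in for a large English corpus (assumption for this sketch only).
SAMPLE_WORDS = (
    "lord word ford cord load road lore lost long loan more core "
    "bore store order border real deal keel seal meal heal kept keen"
).split()

def train_ngram(words, order=2):
    """Map each context of `order` characters to counts of the next character."""
    counts = defaultdict(Counter)
    for word in words:
        padded = "^" * order + word.lower() + "$"
        for i in range(order, len(padded)):
            counts[padded[i - order:i]][padded[i]] += 1
    return counts

def score_ngram(word, counts, order=2, vocab_size=27):
    """Average log-probability per predicted character, add-one smoothed."""
    padded = "^" * order + word.lower() + "$"
    total = 0.0
    for i in range(order, len(padded)):
        row = counts.get(padded[i - order:i], {})
        seen = row.get(padded[i], 0)
        total += math.log((seen + 1) / (sum(row.values()) + vocab_size))
    return total / (len(padded) - order)

model2 = train_ngram(SAMPLE_WORDS, order=2)
print(score_ngram("lord", model2) > score_ngram("rdlo", model2))
```

A larger `order` captures more of English spelling but needs more training text, because many contexts will otherwise never be seen and the smoothing term will dominate.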
