Building an OpenEars-compatible language model

Question

I am doing some development on speech-to-text and text-to-speech, and I have found the OpenEars API very useful.

The principle of this CMU-SLM based API is that it uses a language model to map the speech heard by the iPhone device. So I decided to find a large English language model to feed the API's speech recognizer engine. But I could not work out the format of the VoxForge English data model for use with OpenEars.

Does anyone have any idea how I can get .languagemodel and .dic files for English that work with OpenEars?

Answer

Old question, but the answer may still be of interest. OpenEars now has built-in language model generation, so one option is to create models dynamically in your app as you need them, using the LanguageModelGenerator class, which uses the MITLM library and NSScanner to accomplish the same task as the CMU toolkit mentioned above. Processing a corpus of more than 5,000 words on the iPhone will take a very long time, but you can always run it once in the Simulator, take the output out of the documents folder, and keep it.
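As a rough illustration, in-app generation looks something like the sketch below. This follows the OpenEars 1.x-era API from memory; the method signature has changed between OpenEars versions, and the word list and file name here are placeholders, so check the current OpenEars documentation before relying on it:

```objc
#import <OpenEars/LanguageModelGenerator.h>

// Sketch only: selector and error convention follow the OpenEars 1.x
// API and may differ in the version you are using.
- (void)generateModel {
    LanguageModelGenerator *generator = [[LanguageModelGenerator alloc] init];

    // The vocabulary the recognizer should listen for (placeholder words).
    NSArray *words = [NSArray arrayWithObjects:@"HELLO", @"OPEN", @"CLOSE", @"QUIT", nil];

    // Writes MyModel.languagemodel and MyModel.dic into the app's
    // Documents folder.
    NSError *error = [generator generateLanguageModelFromArray:words
                                                withFilesNamed:@"MyModel"];
    if ([error code] != noErr) {
        NSLog(@"Language model generation failed: %@", [error localizedDescription]);
    }
}
```

The resulting .languagemodel and .dic paths in the Documents folder are then what you hand to the recognizer.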

Another option for large-vocabulary recognition is explained here:

Creating ARPA language model file with 50,000 words
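For context, building an ARPA model offline is typically done with the CMU-Cambridge SLM toolkit. A sketch of that pipeline, assuming cmuclmtk and sphinxbase are installed and `corpus.txt` is a placeholder name for your training text:

```shell
# Build word frequencies, then a vocabulary, from the corpus.
text2wfreq < corpus.txt | wfreq2vocab > corpus.vocab

# Count n-grams over the corpus using that vocabulary.
text2idngram -vocab corpus.vocab -idngram corpus.idngram < corpus.txt

# Estimate the language model and write it in ARPA format.
idngram2lm -vocab_type 0 -idngram corpus.idngram \
           -vocab corpus.vocab -arpa corpus.arpa

# Optionally convert the ARPA file to the binary DMP format that
# Pocketsphinx also accepts (sphinx_lm_convert ships with sphinxbase).
sphinx_lm_convert -i corpus.arpa -o corpus.lm.DMP
```

The .arpa (or .DMP) file plus a matching pronunciation dictionary is what the recognizer ultimately consumes.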

Having said that, I need to point out, as the OpenEars developer, that the CMU tool's limit of 5,000 words corresponds fairly closely to the maximum vocabulary size likely to have decent accuracy and processing speed on the iPhone when using Pocketsphinx. So the last suggestion would be either to reconceptualize your task so that it doesn't absolutely require large-vocabulary recognition (for instance, since OpenEars allows you to switch models on the fly, you may find that you don't need one enormous model but can get by with multiple smaller ones that you switch in in different contexts), or to use a network-based API that can do large-vocabulary recognition on a server (or make your own API that uses Sphinx4 on your own server). Good luck!
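Switching to a smaller, context-specific model at runtime looked roughly like this in the OpenEars 1.x API; this is a hypothetical sketch (the selector is from memory, `MenuModel` is a placeholder name, and newer OpenEars versions rename the controller), so verify against the current docs:

```objc
// Hypothetical sketch: point the running recognizer at a small model
// that was generated earlier into the Documents folder.
NSString *docs = [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory,
                                                      NSUserDomainMask, YES) objectAtIndex:0];
[self.pocketsphinxController
    changeLanguageModelToFile:[docs stringByAppendingPathComponent:@"MenuModel.languagemodel"]
               withDictionary:[docs stringByAppendingPathComponent:@"MenuModel.dic"]];
```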
