Restricting speech recognition results on Android


Question

I'm making an app that allows people to speak and select between a few options (Strings). I'm having a little problem making the Android Speech Recognizer fit my idea.

Is there a way to pass the SpeechRecognizer only the options that are "valid" and have it select the "best" match among them?

I don't need the code, I just need some guidance as my google-fu seems to be failing me today.

Recommended answer

Our solution to this problem is described at http://kaljurand.github.io/Grammars/. See, for example, the paper linked from that page:

Kaarel Kaljurand, Tanel Alumäe. Controlled Natural Language in Speech Recognition Based User Interfaces (CNL 2012)

The basic idea is:

  1. Don't use Google's speech recognizer, because you cannot (currently) pass a language model (e.g. a grammar) to it. In our case it also didn't support the input language that we wanted to use.
  2. Instead, implement your own speech recognizer (e.g. based on Sphinx) and make it accept grammars as part of the input.
  3. Implement the grammar. If it's a simple list of acceptable phrases, then JSGF will do as the grammar description language. For more complex grammars I recommend Grammatical Framework, which you can automatically compile to JSGF or finite-state automata.
  4. Implement an Android app that extends the RecognizerIntent API by adding a way to pass the grammar to the recognizer. You can base it e.g. on Kõnele.
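For the simple case in step 3, where the grammar is just a list of acceptable phrases, the JSGF text can be generated from the option strings. The following is a minimal sketch in plain Java; the class name `JsgfBuilder`, the method `fromPhrases`, and the rule name `<option>` are made up for illustration and are not part of Sphinx, Kõnele, or any other library. It also assumes the phrases contain only plain words, with no JSGF special characters that would need escaping.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: turns a list of acceptable phrases into a JSGF grammar string.
final class JsgfBuilder {

    /** Builds a grammar whose single public rule accepts exactly one of the phrases. */
    static String fromPhrases(String grammarName, List<String> phrases) {
        if (phrases.isEmpty()) {
            throw new IllegalArgumentException("at least one phrase is required");
        }
        StringBuilder sb = new StringBuilder();
        sb.append("#JSGF V1.0;\n");                               // JSGF self-identifying header
        sb.append("grammar ").append(grammarName).append(";\n");  // grammar name declaration
        sb.append("public <option> = ")                            // one rule listing the alternatives
          .append(String.join(" | ", phrases))
          .append(";\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(fromPhrases("options", Arrays.asList("red", "green", "dark blue")));
    }
}
```

A grammar-aware recognizer would then be given this string (or a file containing it) as its language model; how that is passed depends on the recognizer you build in step 2.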

All of this might be overkill in your case. Post-processing Google's results (as @gregm suggests) is certainly easier to implement. But if you want to scale to more complex and/or multilingual language models, our approach provides the required modularity and expressive power.
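The post-processing route could look roughly like the sketch below. It assumes you already have the recognizer's hypotheses as a list of strings (on Android these typically come from `RecognizerIntent.EXTRA_RESULTS`) and picks the valid option with the smallest edit distance to any hypothesis. The class `OptionMatcher` and its methods are hypothetical names for illustration, and Levenshtein distance is just one possible similarity measure, not something the answer prescribes.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper: maps free-form recognition hypotheses onto a fixed set of options.
final class OptionMatcher {

    /** Classic Levenshtein distance, computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows so prev always holds the latest one
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the valid option closest to any hypothesis, or null if either list is empty. */
    static String bestMatch(List<String> hypotheses, List<String> validOptions) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String hypothesis : hypotheses) {
            String h = hypothesis.toLowerCase(Locale.ROOT).trim();
            for (String option : validOptions) {
                int d = editDistance(h, option.toLowerCase(Locale.ROOT));
                if (d < bestDistance) {
                    bestDistance = d;
                    best = option;
                }
            }
        }
        return best;
    }
}
```

This sketch always returns some option when both lists are non-empty. A real app would probably also reject the result when the best distance exceeds a threshold, so that unrelated speech is not forced onto one of the options.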
