Speech recognition and getUserMedia


Question



I'm building a web application and plan on using both speechRecognition and navigator.getUserMedia for audio input.

I noticed that my desktop browser (Chrome on Mac, v. 31.0.1650.63) asks twice for permission to use the microphone. While this may be a little bit annoying for the user, both voice recognition and audio input seem to work.

However, if I open the same page on Android (Nexus 7, Android v4.4.2; Chrome v31.0.1650.59), it also asks twice for permission to use my microphone, but I can only use one of the two (whichever was started first). Sometimes I also get a speech recognition "not-allowed" error, even though I gave permission to access the microphone.

I made a jsFiddle, here: http://jsfiddle.net/5xBpW/

My question is: Is there a way to perform speech recognition on an input stream? Or is there any other way to have both functionalities work on Chrome for Android?
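For readers without access to the fiddle, the setup described above can be sketched roughly as follows. This is not the code from the jsFiddle; it is a minimal illustration that starts microphone capture and speech recognition on the same page and records which of the two fails. The function names `detectAudioFeatures` and `startBoth`, and the `env` and `log` parameters, are hypothetical and exist only so the sketch can be exercised outside a browser. It also assumes the promise-based `navigator.mediaDevices.getUserMedia`; older Chrome releases such as the one in the question exposed a prefixed, callback-based `navigator.webkitGetUserMedia` instead.

```javascript
// Minimal sketch (assumption: not the author's jsFiddle code).
// `env` is a hypothetical injection point; in a page it defaults to the global object.

function detectAudioFeatures(env = globalThis) {
  const mediaDevices = env.navigator && env.navigator.mediaDevices;
  return {
    // True when the promise-based getUserMedia is available.
    getUserMedia: Boolean(mediaDevices && typeof mediaDevices.getUserMedia === "function"),
    // Chrome exposes the recognizer under a webkit prefix.
    Recognition: env.SpeechRecognition || env.webkitSpeechRecognition || null,
  };
}

async function startBoth(env = globalThis, log = console.log) {
  const features = detectAudioFeatures(env);
  const result = { stream: null, recognition: null, errors: [] };

  // 1. Raw audio input (triggers the first permission prompt).
  if (features.getUserMedia) {
    try {
      result.stream = await env.navigator.mediaDevices.getUserMedia({ audio: true });
    } catch (err) {
      result.errors.push("getUserMedia: " + err.name);
    }
  } else {
    result.errors.push("getUserMedia: unsupported");
  }

  // 2. Speech recognition (may trigger a second prompt).
  if (features.Recognition) {
    const recognition = new features.Recognition();
    recognition.onerror = (event) => {
      // "not-allowed" is the error code reported in the question.
      result.errors.push("speech: " + event.error);
      log("speech recognition error:", event.error);
    };
    recognition.onresult = (event) => {
      log("transcript:", event.results[0][0].transcript);
    };
    recognition.start();
    result.recognition = recognition;
  } else {
    result.errors.push("speech: unsupported");
  }

  return result;
}
```

In a page, calling `startBoth()` from a click handler and inspecting `errors` afterwards shows which of the two features was refused on a given device. Swapping the order of the two steps is a quick way to check the "whichever was started first" behaviour.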

Answer

Have you considered other tools? There is an exciting new tool/product from Nuance (founded by Ray K, now head of Google Engineering) that translates voice data into actions using proprietary learning algorithms, e.g. machine intelligence.

This tool understands context and can apply that to specific actions so the user doesn't have to use an exact phrase:

https://developer.nuance.com/public/index.php?task=mix

Tour: https://developer.nuance.com/views/templates/mix/howDoesMixWork/phone/index.html

The downside is that you are relying on a third party, but since the API you are looking at is also experimental, this could be of interest.
