How does Google Keep do Speech Recognition while saving the audio recording at the same time?

Question

Android's SpeechRecognizer apparently doesn't allow you to record the input on which you're doing speech recognition into an audio file. That is, either you record voice using a MediaRecorder (or AudioRecord, for that matter), or you do speech recognition with a SpeechRecognizer, in which case the audio isn't recorded into a file (at least not one you can access); you can't do both at the same time.

The question of how to record audio and do speech recognition at the same time on Android has been asked several times, and the most popular "solution" is to record a FLAC file and use Google's unofficial Speech API, which allows you to send a FLAC file via a POST request and obtain a JSON response with the transcription. http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/ (outdated Android version) https://github.com/katchsvartanian/voiceRecognition/tree/master/VoiceRecognition http://mikepultz.com/2013/07/google-speech-api-full-duplex-php-version/

That works pretty well but has a huge limitation: it can't be used with files longer than about 10-15 seconds (the exact limit is not clear and may depend on file size or perhaps the number of words). This makes it unsuitable for my needs.

Also, slicing the audio file into smaller files is NOT a possible solution. Even setting aside the difficulty of splitting the file at the right positions (not in the middle of a word), many consecutive requests to the above-mentioned web service API will randomly result in empty responses (Google says there's a usage limit of 50 requests per day, but as usual they don't disclose the details of the real usage limits, which clearly restrict bursts of requests).

So, all this would seem to indicate that getting a transcription of speech while at the same time recording the input into an audio file on Android is IMPOSSIBLE.

HOWEVER, the Google Keep Android app does exactly that. It allows you to speak, transcribes what you said into text, and saves both the text and the audio recording (it's not clear where it stores the recording, but you can replay it). And it has no length limitation.

So the question is: DOES ANYBODY HAVE AN IDEA OF HOW GOOGLE KEEP DOES IT? I would look at the source code, but it doesn't seem to be available, is it?

I sniffed the packets Google Keep sends and receives while doing speech recognition, and it definitely does NOT use the Speech API mentioned above. All the traffic is TLS and (from the outside) it looks pretty much the same as when you're using SpeechRecognizer.

So is there perhaps a way to "split" (i.e. duplicate, or multiplex) the microphone input stream into two streams, and feed one of them to a SpeechRecognizer and the other to a MediaRecorder?
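To make the "split" idea concrete, here is a minimal plain-Java sketch of duplicating one input stream into two sinks. The class name `StreamTee` and its `tee` method are hypothetical names chosen for illustration. It only shows the duplication half of the idea: as described above, SpeechRecognizer did not appear to offer a way to feed it such a stream, so this is not a working solution by itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper (not an Android API): copies one stream into two sinks. */
final class StreamTee {
    private StreamTee() {}

    /**
     * Reads the source until end of stream and writes every chunk to both sinks.
     * Returns the total number of bytes copied.
     */
    static long tee(InputStream source, OutputStream first, OutputStream second)
            throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int read;
        while ((read = source.read(buffer)) != -1) {
            // The same bytes go to both consumers, in the same order.
            first.write(buffer, 0, read);
            second.write(buffer, 0, read);
            total += read;
        }
        first.flush();
        second.flush();
        return total;
    }
}
```

On a device the source would presumably be raw PCM read from AudioRecord, with one sink writing to a file; what the second sink would be is exactly the open part of the question.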

Recommended answer

Google Keep launches RecognizerIntent with certain undocumented extras and expects the resulting intent to contain the URI of the recorded audio. If RecognizerIntent is serviced by Google Voice Search, then it all works out and Keep gets the audio.
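As a rough sketch of what such a call might look like, the block below collects the intent extras in a plain map so that it compiles without the Android SDK. The two `GET_AUDIO` keys and the `audio/AMR` value are an assumption: they are the undocumented extras reported in the linked StackOverflow question, not something this answer confirms, and they may stop working at any time. The class name `KeepStyleRecognizerExtras` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for the recognizer extras; not part of any Android API. */
final class KeepStyleRecognizerExtras {
    private KeepStyleRecognizerExtras() {}

    // Documented string values of constants in android.speech.RecognizerIntent.
    static final String ACTION_RECOGNIZE_SPEECH = "android.speech.action.RECOGNIZE_SPEECH";
    static final String EXTRA_LANGUAGE_MODEL = "android.speech.extra.LANGUAGE_MODEL";
    static final String LANGUAGE_MODEL_FREE_FORM = "free_form";

    // ASSUMPTION: undocumented keys as reported in the linked question.
    static final String EXTRA_GET_AUDIO = "android.speech.extra.GET_AUDIO";
    static final String EXTRA_GET_AUDIO_FORMAT = "android.speech.extra.GET_AUDIO_FORMAT";

    /** Builds the extras to copy onto the recognizer intent. */
    static Map<String, Object> build() {
        Map<String, Object> extras = new LinkedHashMap<>();
        extras.put(EXTRA_LANGUAGE_MODEL, LANGUAGE_MODEL_FREE_FORM);
        extras.put(EXTRA_GET_AUDIO_FORMAT, "audio/AMR"); // assumed format value
        extras.put(EXTRA_GET_AUDIO, Boolean.TRUE);
        return extras;
    }

    // In an Activity, roughly (Android code, shown as comments only):
    //   Intent intent = new Intent(ACTION_RECOGNIZE_SPEECH);
    //   ...call intent.putExtra(key, value) for each entry of build()...
    //   startActivityForResult(intent, REQUEST_CODE);   // REQUEST_CODE: your own int
    // and in onActivityResult(requestCode, resultCode, data):
    //   Uri audioUri = data.getData();   // may be null if the extras are ignored
    //   InputStream audio = getContentResolver().openInputStream(audioUri);
}
```

Whether `data.getData()` actually carries an audio URI depends on which app services the intent, so a null check and a fallback path would be needed in practice.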

See "record/save audio from voice recognition intent" (http://stackoverflow.com/questions/23047433/record-save-audio-from-voice-recognition-intent/) for more information and a code sample that calls the recognizer in the same way as Keep (probably) does.

Note that this behavior is not part of Android. It's simply the current, undocumented way in which two closed-source Google apps communicate with each other.
