How does Google Keep do speech recognition while saving the audio recording at the same time?

Question

Android's SpeechRecognizer apparently doesn't allow you to record the input on which you're doing speech recognition into an audio file. That is, either you record voice using a MediaRecorder (or AudioRecord for that matter), or you do speech recognition with a SpeechRecognizer, in which case the audio isn't recorded into a file (at least not one you can access); you can't do both at the same time.

The question of how to record audio and do speech recognition at the same time in Android has been asked several times, and the most popular "solution" is to record a FLAC file and use Google's unofficial Speech API, which lets you send the FLAC file via a POST request and get back a JSON response with the transcription.

http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/
(outdated Android version) https://github.com/katchsvartanian/voiceRecognition/tree/master/VoiceRecognition
http://mikepultz.com/2013/07/google-speech-api-full-duplex-php-version/

That works pretty well but has a huge limitation: it can't be used with files longer than about 10-15 seconds (the exact limit is not clear and may depend on file size or perhaps the number of words). This makes it unsuitable for my needs.

Also, slicing the audio file into smaller files is NOT a possible solution. Even setting aside the difficulty of splitting the file at the right positions (not in the middle of a word), many consecutive requests to the above-mentioned web service API will randomly result in empty responses (Google says there's a usage limit of 50 requests per day, but as usual they don't disclose the details of the real usage limits, which clearly restrict bursts of requests).

So all this would seem to indicate that getting a transcription of speech while at the same time recording the input into an audio file in Android is IMPOSSIBLE.

HOWEVER, the Google Keep Android app does exactly that. It allows you to speak, transcribes what you said into text, and saves both the text and the audio recording (it's not clear where it stores the recording, but you can replay it). And it has no length limitation.

So the question is: DOES ANYBODY HAVE AN IDEA OF HOW GOOGLE KEEP DOES IT? I would look at the source code, but it doesn't seem to be available. Is it?

I sniffed the packets Google Keep sends and receives while doing speech recognition, and it definitely does NOT use the Speech API mentioned above. All the traffic is TLS and (from the outside) it looks pretty much the same as when you're using SpeechRecognizer.

So is there perhaps a way to "split" (i.e. duplicate, or multiplex) the microphone input stream into two streams, and feed one of them to a SpeechRecognizer and the other to a MediaRecorder?
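
To make the "split" idea concrete, here is a minimal sketch in plain Java (deliberately not Android code) that copies every chunk read from one source stream to two sinks. The class name StreamTee and the method copyToBoth are hypothetical names chosen for illustration, and the byte array stands in for microphone data. Whether a recognizer can actually consume such a duplicated stream is exactly the open question here, so this only illustrates the duplication step, not a working solution.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamTee {
    // Reads the source until end-of-stream and writes each chunk to both sinks.
    // Returns the total number of bytes copied to each sink.
    public static long copyToBoth(InputStream source, OutputStream first, OutputStream second)
            throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int read;
        while ((read = source.read(buffer)) != -1) {
            first.write(buffer, 0, read);
            second.write(buffer, 0, read);
            total += read;
        }
        first.flush();
        second.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for raw audio data; a real app would read from the microphone.
        byte[] fakeAudio = new byte[10000];
        for (int i = 0; i < fakeAudio.length; i++) {
            fakeAudio[i] = (byte) i;
        }

        // Hypothetical sinks: one for the saved recording, one for recognition.
        ByteArrayOutputStream recordingSink = new ByteArrayOutputStream();
        ByteArrayOutputStream recognitionSink = new ByteArrayOutputStream();

        long copied = copyToBoth(new ByteArrayInputStream(fakeAudio), recordingSink, recognitionSink);
        System.out.println("copied " + copied + " bytes to each sink");
    }
}
```

Both sinks end up with identical bytes, which is all that "duplicating" the input means. The hard part on Android is not the copy itself but getting the recognizer to accept the second copy.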

Recommended answer

Google Keep launches RecognizerIntent with certain undocumented extras and expects the resulting intent to contain the URI of the recorded audio. If the RecognizerIntent is serviced by Google Voice Search, then it all works out and Keep gets the audio.

See "record/save audio from voice recognition intent" for more information and a code sample that calls the recognizer in the same way as Keep (probably) does.

Note that this behavior is not part of Android. It's simply the current, undocumented way in which two closed-source Google apps communicate with each other.
