Voice recognition fails to work when the voice is under recording
Problem Description
I am working on a function such that when a button is pressed, it launches voice recognition and at the same time records what the user says. The code is as follows:
button_start.setOnTouchListener(new View.OnTouchListener()
{
    @Override
    public boolean onTouch(View arg0, MotionEvent event)
    {
        if (event.getAction() == MotionEvent.ACTION_DOWN)
        {
            if (pressed == false)
            {
                Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
                intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
                intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, "voice.recognition.test");
                intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "zh-HK");
                intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 1);
                sr.startListening(intent);
                Log.i("111111", "11111111");
                pressed = true;
            }
            recordAudio();
        }
        if (event.getAction() == MotionEvent.ACTION_UP || event.getAction() == MotionEvent.ACTION_CANCEL)
        {
            stopRecording();
        }
        return false;
    }
});
public void recordAudio()
{
    isRecording = true;
    try
    {
        mediaRecorder = new MediaRecorder();
        mediaRecorder.setAudioSource(MediaRecorder.AudioSource.MIC);
        mediaRecorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
        mediaRecorder.setOutputFile(audioFilePath);
        mediaRecorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
        mediaRecorder.prepare();
        mediaRecorder.start(); // start only if prepare() succeeded
    }
    catch (Exception e)
    {
        e.printStackTrace();
    }
}
public void stopRecording()
{
    if (isRecording)
    {
        mediaRecorder.stop();
        mediaRecorder.reset();   // set state to idle
        mediaRecorder.release();
        mediaRecorder = null;
        isRecording = false;
    }
    else
    {
        mediaPlayer.reset();     // reset before release; calling reset() after release() is invalid
        mediaPlayer.release();
        mediaPlayer = null;
    }
}
class listener implements RecognitionListener
{
    // standard callbacks: onReadyForSpeech, onBeginningOfSpeech, etc.
}
Questions:
I built the app step by step. At first the app had no recording function, and the voice recognition worked perfectly.
After testing many times and concluding that the voice recognition was fine, I started to incorporate the recording function using the MediaRecorder.
When I then tested it, as soon as button_start was pressed, an ERROR3 AUDIO message appeared immediately, even before I tried to speak.
I played back the recording; the voice was recorded and saved properly too.
What is happening? Why can't I record at the same time as using voice recognition?
Thanks!
--EDIT-- module for Opus-Record WHILE Speech-Recognition also runs
--EDIT-- 'V1BETA1' streaming, continuous recognition with a minor change to the sample project. Alter 'readData()' so that the raw PCM in 'sData' is shared by 2 threads (the fileSink thread and the recognizerAPI thread from the sample project). For the sink, just hook up an encoder using a PCM stream refreshed at each 'sData' IO. Remember to close the stream and it will work. Review 'writeAudioDataToFile()' for more on the fileSink...
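A minimal, Android-free sketch of that sharing idea (the class name, `publish()`, and the two queues are invented for illustration; the real sample project works through `readData()` and `sData`): a single reader thread copies each raw PCM chunk into two independent queues, so the fileSink thread and the recognizer thread each drain their own copy instead of contending for the mic.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical fan-out of raw PCM chunks to two consumer threads.
public class PcmFanout {
    // One queue per consumer: fileSink thread and recognizer thread.
    static final BlockingQueue<byte[]> sinkQueue = new LinkedBlockingQueue<>();
    static final BlockingQueue<byte[]> recognizerQueue = new LinkedBlockingQueue<>();

    // Called by the single reader thread for every chunk read from the mic.
    static void publish(byte[] chunk) {
        // clone() so the two consumers never share a mutable buffer
        sinkQueue.add(chunk.clone());
        recognizerQueue.add(chunk.clone());
    }

    public static void main(String[] args) throws InterruptedException {
        publish(new byte[]{1, 2, 3});           // pretend this chunk came from readData()
        byte[] forSink = sinkQueue.take();      // fileSink thread would encode and write this
        byte[] forStt = recognizerQueue.take(); // recognizer thread would stream this
        System.out.println(forSink.length + " " + forStt.length); // prints "3 3"
    }
}
```

The point of the copies is that neither consumer can stall or corrupt the other; only the single reader ever touches the mic buffer itself.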
--EDIT-- see this thread
There is going to be a basic conflict over the HAL and the microphone buffer when you try to do:
speechRecognizer.startListening(recognizerIntent); // <-- needs mutex use of mic
and
mediaRecorder.start(); // <-- needs mutex use of mic
You can only choose one or the other of the above actions to own the audio APIs underlying the mic!
If you want to mimic the functionality of Google Keep, where you talk only once and, as output from that one input process (your speech into the mic), you get 2 separate types of output (STT and a fileSink of, say, an MP3), then you must split the audio as it exits the HAL layer from the mic.
For example:

- Pick up the RAW audio as PCM 16 coming out of the mic's buffer
- Split the above buffer's bytes (you can get a stream from the buffer and pipe the stream to 2 places)
- STRM 1 goes to the API for STT, either before or after you encode it (there are STT APIs accepting both raw PCM 16 and encoded audio)
- STRM 2 goes to an encoder, then to the fileSink for your capture of the recording

The split can operate either on the actual buffer produced by the mic or on a derivative stream of those same bytes.
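As a plain-Java illustration of that split (a hypothetical "tee", not an Android or existing-library API): one OutputStream that duplicates every byte it receives into two downstream sinks, standing in for the STT pipe and the encoder/fileSink pipe.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical tee: every write is duplicated to two downstream sinks.
public class TeeOutputStream extends OutputStream {
    private final OutputStream sttSink;  // would feed the STT API
    private final OutputStream fileSink; // would feed the encoder / file

    public TeeOutputStream(OutputStream sttSink, OutputStream fileSink) {
        this.sttSink = sttSink;
        this.fileSink = fileSink;
    }

    @Override
    public void write(int b) throws IOException {
        sttSink.write(b);
        fileSink.write(b);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        sttSink.write(buf, off, len);
        fileSink.write(buf, off, len);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream stt = new ByteArrayOutputStream();
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        try (OutputStream tee = new TeeOutputStream(stt, file)) {
            tee.write(new byte[]{10, 20, 30}); // one mic buffer, two destinations
        }
        System.out.println(stt.size() + " " + file.size()); // prints "3 3"
    }
}
```

In a real pipeline the two ByteArrayOutputStreams would be replaced by whatever consumes PCM for recognition and whatever encodes to the file sink; the tee itself stays this simple.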
For what you are getting into, I recommend you look at getCurrentRecording() and consumeRecording() here.
STT API reference: Google "pultz speech-api". Note that there are use-cases for the APIs mentioned there.