流输入到System.Speech.Recognition.SpeechRecognitionEngine [英] Streaming input to System.Speech.Recognition.SpeechRecognitionEngine
问题描述
我试图做流在C#中的语音识别从TCP套接字。我遇到的问题是,SpeechRecognitionEngine.SetInputToAudioStream()似乎需要一个定义的长度可以寻求流。现在我能想到的,使这项工作的唯一方法是在一个MemoryStream反复运行识别为多个输入进来
I am trying to do "streaming" speech recognition in C# from a TCP socket. The problem I am having is that SpeechRecognitionEngine.SetInputToAudioStream() seems to require a Stream of a defined length which can seek. Right now the only way I can think to make this work is to repeatedly run the recognizer on a MemoryStream as more input comes in.
下面是一些代码来说明:
Here's some code to illustrate:
SpeechRecognitionEngine appRecognizer = new SpeechRecognitionEngine();
System.Speech.AudioFormat.SpeechAudioFormatInfo formatInfo = new System.Speech.AudioFormat.SpeechAudioFormatInfo(8000, System.Speech.AudioFormat.AudioBitsPerSample.Sixteen, System.Speech.AudioFormat.AudioChannel.Mono);
NetworkStream stream = new NetworkStream(socket,true);
appRecognizer.SetInputToAudioStream(stream, formatInfo);
// At the line above a "NotSupportedException" complaining that "This stream does not support seek operations."
有谁知道如何解决这个问题?它必须支持流某种投入,因为它使用SetInputToDefaultAudioDevice()工作正常使用麦克风。
Does anyone know how to get around this? It must support streaming input of some sort, since it works fine with the microphone using SetInputToDefaultAudioDevice().
谢谢,肖恩
推荐答案
我得到了现场语音识别通过重写流类工作:
I got live speech recognition working by overriding the stream class:
class SpeechStreamer : Stream
{
private AutoResetEvent _writeEvent;
private List<byte> _buffer;
private int _buffersize;
private int _readposition;
private int _writeposition;
private bool _reset;
public SpeechStreamer(int bufferSize)
{
_writeEvent = new AutoResetEvent(false);
_buffersize = bufferSize;
_buffer = new List<byte>(_buffersize);
for (int i = 0; i < _buffersize;i++ )
_buffer.Add(new byte());
_readposition = 0;
_writeposition = 0;
}
public override bool CanRead
{
get { return true; }
}
public override bool CanSeek
{
get { return false; }
}
public override bool CanWrite
{
get { return true; }
}
public override long Length
{
get { return -1L; }
}
public override long Position
{
get { return 0L; }
set { }
}
public override long Seek(long offset, SeekOrigin origin)
{
return 0L;
}
public override void SetLength(long value)
{
}
public override int Read(byte[] buffer, int offset, int count)
{
int i = 0;
while (i<count && _writeEvent!=null)
{
if (!_reset && _readposition >= _writeposition)
{
_writeEvent.WaitOne(100, true);
continue;
}
buffer[i] = _buffer[_readposition+offset];
_readposition++;
if (_readposition == _buffersize)
{
_readposition = 0;
_reset = false;
}
i++;
}
return count;
}
public override void Write(byte[] buffer, int offset, int count)
{
for (int i = offset; i < offset+count; i++)
{
_buffer[_writeposition] = buffer[i];
_writeposition++;
if (_writeposition == _buffersize)
{
_writeposition = 0;
_reset = true;
}
}
_writeEvent.Set();
}
public override void Close()
{
_writeEvent.Close();
_writeEvent = null;
base.Close();
}
public override void Flush()
{
}
}
...并使用该实例作为输入到SetInputToAudioStream方法流。只要流返回的长度或返回的计数小于所要求的识别引擎认为输入已完成。这样就建立了,从来没有完成一个循环缓冲区。
... and using an instance of that as the stream input to the SetInputToAudioStream method. As soon as the stream returns a length or the returned count is less than that requested the recognition engine thinks the input has finished. This sets up a circular buffer that never finishes.
这篇关于流输入到System.Speech.Recognition.SpeechRecognitionEngine的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!