Using System.Speech with Kinect


Problem description


I am developing a prototype speech-to-text captioning application for a university project. I will be using gesture recognition in the project later on, so I thought it would be a good idea to use the Kinect as the microphone source rather than an additional microphone. The idea of my application is to recognize spontaneous speech, such as long and complex sentences (I understand that the speech dictation will not be perfect, however). I have seen many Kinect speech samples that reference Microsoft.Speech, but none that reference System.Speech. As I need to train the speech engine and load a DictationGrammar into the speech recognition engine, System.Speech is the only option for me.

I have managed to get it working while using the Kinect as the direct microphone audio source, but since I am loading the Kinect for the video preview and gesture recognition, I am unable to access it as a direct microphone.

This is the code that accesses the microphone directly, without loading the Kinect hardware for gestures etc., and it works perfectly:

private void InitializeSpeech()
{
    var speechRecognitionEngine = new SpeechRecognitionEngine();
    speechRecognitionEngine.SetInputToDefaultAudioDevice();
    speechRecognitionEngine.LoadGrammar(new DictationGrammar());
    speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
    speechRecognitionEngine.SpeechRecognized += (s, args) => MessageBox.Show(args.Result.Text);
}

And this is where I need to access the audio source via the Kinect once it has been loaded, which isn't doing anything at all. This is what I am trying to do:

using (var audioSource = new KinectAudioSource())
{
    audioSource.FeatureMode = true;
    audioSource.AutomaticGainControl = false;
    audioSource.SystemMode = SystemMode.OptibeamArrayOnly;

    var recognizerInfo = GetKinectRecognizer();
    var speechRecognitionEngine = new SpeechRecognitionEngine(recognizerInfo.Id);

    speechRecognitionEngine.LoadGrammar(new DictationGrammar());
    speechRecognitionEngine.SpeechRecognized += (s, args) => MessageBox.Show(args.Result.Text);

    using (var s = audioSource.Start())
    {
        speechRecognitionEngine.SetInputToAudioStream(s, new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));
        speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
    }
}

So the question is, is it even possible to use System.Speech instead of Microsoft.Speech with the current Kinect SDK, and what am I doing wrong in the 2nd code sample?

GetKinectRecognizer Method

private static RecognizerInfo GetKinectRecognizer()
{
    Func<RecognizerInfo, bool> matchingFunc = r =>
    {
        string value;
        r.AdditionalInfo.TryGetValue("Kinect", out value);
        return "True".Equals(value, StringComparison.InvariantCultureIgnoreCase) && "en-US".Equals(r.Culture.Name, StringComparison.InvariantCultureIgnoreCase);
    };

    return SpeechRecognitionEngine.InstalledRecognizers().Where(matchingFunc).FirstOrDefault();
}

Solution

From my own experimentation, I can tell you that you can in fact use both libraries simultaneously.

Try this code instead of your current code (make sure that you add a reference to System.Speech, obviously):

using (var audioSource = new KinectAudioSource())
{
    audioSource.FeatureMode = true;
    audioSource.AutomaticGainControl = false;
    audioSource.SystemMode = SystemMode.OptibeamArrayOnly;

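    // Fully qualify the type so that the System.Speech version is used even when Microsoft.Speech is also referenced.
    System.Speech.Recognition.RecognizerInfo ri = GetKinectRecognizer();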
    var speechRecognitionEngine = new SpeechRecognitionEngine(ri.Id);

    speechRecognitionEngine.LoadGrammar(new DictationGrammar());
    speechRecognitionEngine.SpeechRecognized += (s, args) => MessageBox.Show(args.Result.Text);

    using (var s = audioSource.Start())
    {
        speechRecognitionEngine.SetInputToAudioStream(s, new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));
        speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
    }
}
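
One thing that may be worth checking if the engine still stays silent: GetKinectRecognizer ends in FirstOrDefault(), which returns null when no installed recognizer matches, and ri.Id would then throw. Whether a Kinect-tagged recognizer is visible to System.Speech on a given machine is something to verify rather than assume. The following is only a minimal sketch, assuming a console project with a reference to System.Speech; the class name RecognizerCheck is made up for illustration. It lists what System.Speech reports as installed, using the same "Kinect" key that GetKinectRecognizer looks for:

```csharp
using System;
using System.Speech.Recognition;

// Hypothetical helper, not part of the Kinect SDK or the original code.
static class RecognizerCheck
{
    static void Main()
    {
        // Enumerate every recognizer that System.Speech can see on this machine.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            // Same lookup as in GetKinectRecognizer; the value stays null if the key is absent.
            string kinect;
            info.AdditionalInfo.TryGetValue("Kinect", out kinect);

            Console.WriteLine("{0} | {1} | Kinect={2}",
                info.Name, info.Culture.Name, kinect ?? "(not set)");
        }
    }
}
```

If no line shows Kinect=True for en-US, GetKinectRecognizer will return null under System.Speech, and a null check before creating the SpeechRecognitionEngine would make that failure visible.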

Good Luck!!!
