Microsoft Speech Recognition: Alternate results with confidence score?

Question

I'm new to working with the Microsoft.Speech recognizer (using Microsoft Speech Platform SDK Version 11) and I'm trying to have it output the n-best recognition matches from a simple grammar, along with the confidence score for each.

According to the documentation (and as mentioned in the answer to this question: http://stackoverflow.com/questions/17247473/c-sharp-system-speech-recognition-alternate-words), one should be able to use e.Result.Alternates to access the recognized words other than the top-scoring one. However, even after resetting the confidence rejection threshold to 0 (which should mean nothing is rejected), I still only get one result, and no alternates (although the SpeechHypothesized events indicate that at least one of the other words does seem to be recognized with non-zero confidence at some point).

My question: Can anyone explain to me why I only get one recognized word, even when the confidence rejection threshold is set to zero? How can I get the other possible matches and their confidence scores? What am I missing here?

Below is my code. Thanks in advance to anyone who can help :)

In the sample below, the recognizer is sent a wav file of the word "news", and has to select from similar words ("noose", "newts"). I want to extract a list of the recognizer's confidence score for EACH word (they should all be non-zero), even though it will only return the best one ("news") as the result.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Speech.Recognition;

namespace SimpleRecognizer
{
    class Program
    {
        static readonly string[] settings = new string[] {
            "CFGConfidenceRejectionThreshold",
            "HighConfidenceThreshold", 
            "NormalConfidenceThreshold",
            "LowConfidenceThreshold"};

        static void Main(string[] args)
        {
            // Create a new SpeechRecognitionEngine instance.
            SpeechRecognitionEngine sre = new SpeechRecognitionEngine(); //en-US SRE

            // Configure the input to the recognizer.
            sre.SetInputToWaveFile(@"C:\Users\Anjana\Documents\news.wav");

            // Display Recognizer Settings (Confidence Thresholds)
            Console.WriteLine("Original recognizer settings:");
            ListSettings(sre);

            // Set Confidence Threshold to Zero (nothing should be rejected)
            sre.UpdateRecognizerSetting("CFGConfidenceRejectionThreshold", 0);
            sre.UpdateRecognizerSetting("HighConfidenceThreshold", 0);
            sre.UpdateRecognizerSetting("NormalConfidenceThreshold", 0);
            sre.UpdateRecognizerSetting("LowConfidenceThreshold", 0);

            // Display New Recognizer Settings
            Console.WriteLine("Updated recognizer settings:");
            ListSettings(sre);

            // Build a simple Grammar with three choices
            Choices topics = new Choices();
            topics.Add(new string[] { "news", "newts", "noose" });
            GrammarBuilder gb = new GrammarBuilder();
            gb.Append(topics);
            Grammar g = new Grammar(gb);
            g.Name = "g";

            // Load the Grammar
            sre.LoadGrammar(g);

            // Register handlers for Grammar's SpeechRecognized Events
            g.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(gram_SpeechRecognized);

            // Register a handler for the recognizer's SpeechRecognized event.
            sre.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(sre_SpeechRecognized);

            // Register Handler for SpeechHypothesized
            sre.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(sre_SpeechHypothesized);

            // Start recognition.
            sre.Recognize();

            Console.ReadKey(); //wait to close

        }
        static void gram_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
        {
            Console.WriteLine("\nNumber of Alternates from Grammar {1}: {0}", e.Result.Alternates.Count.ToString(), e.Result.Grammar.Name);
            foreach (RecognizedPhrase phrase in e.Result.Alternates)
            {
                Console.WriteLine(phrase.Text + ", " + phrase.Confidence);
            }
        }
        static void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
        {
            Console.WriteLine("\nSpeech recognized: " + e.Result.Text + ", " + e.Result.Confidence);
            Console.WriteLine("Number of Alternates from Recognizer: {0}", e.Result.Alternates.Count.ToString());
            foreach (RecognizedPhrase phrase in e.Result.Alternates)
            {
                Console.WriteLine(phrase.Text + ", " + phrase.Confidence);
            }
        }
        static void sre_SpeechHypothesized(object sender, SpeechHypothesizedEventArgs e)
        {
            Console.WriteLine("Speech from grammar {0} hypothesized: {1}, {2}", e.Result.Grammar.Name, e.Result.Text, e.Result.Confidence);
        }
        private static void ListSettings(SpeechRecognitionEngine recognizer)
        {
            foreach (string setting in settings)
            {
                try
                {
                    object value = recognizer.QueryRecognizerSetting(setting);
                    Console.WriteLine("  {0,-30} = {1}", setting, value);
                }
                catch
                {
                    Console.WriteLine("  {0,-30} is not supported by this recognizer.",
                      setting);
                }
            }
            Console.WriteLine();
        }
    }
}

This gives the following output:

Original recognizer settings:
  CFGConfidenceRejectionThreshold = 20
  HighConfidenceThreshold        = 80
  NormalConfidenceThreshold      = 50
  LowConfidenceThreshold         = 20

Updated recognizer settings:
  CFGConfidenceRejectionThreshold = 0
  HighConfidenceThreshold        = 0
  NormalConfidenceThreshold      = 0
  LowConfidenceThreshold         = 0

Speech from grammar g hypothesized: noose, 0.2214646
Speech from grammar g hypothesized: news, 0.640804

Number of Alternates from Grammar g: 1
news, 0.9208503

Speech recognized: news, 0.9208503
Number of Alternates from Recognizer: 1
news, 0.9208503

I also tried implementing this with a separate phrase for each word (instead of one phrase with three choices), and even with a separate grammar for each word/phrase. The results are basically the same: only one "alternate".

Recommended answer

I believe this is another place where SAPI lets you ask for things that the SR engine doesn't really support.

Both Microsoft.Speech.Recognition and System.Speech.Recognition use the underlying SAPI interfaces to do their work; the only difference is which SR engine gets used. (Microsoft.Speech.Recognition uses the Server engine; System.Speech.Recognition uses the Desktop engine.)

Alternates are primarily designed for dictation, not context-free grammars. You can always get one alternate for a CFG, but the alternate generation code looks like it won't expand the alternates for CFGs.

Unfortunately, the Microsoft.Speech.Recognition engine doesn't support dictation. (It does, however, work with much lower quality audio, and it doesn't need training.)
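If the engine really does not expand alternates for a CFG, one possible workaround is to collect the interim hypotheses yourself, since the question's output shows that SpeechHypothesized does report "noose" with a non-zero confidence. The sketch below is an assumption, not something the answer proposes: HypothesisTracker is a hypothetical helper class (it is not part of Microsoft.Speech) that keeps the highest confidence seen for each phrase. Note that hypothesis confidences are interim values and may not be comparable to the final score, and that a word the engine never hypothesizes (such as "newts" in the output above) will not appear at all.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of Microsoft.Speech): remembers the highest
// confidence reported for each phrase during one recognition pass.
public sealed class HypothesisTracker
{
    private readonly Dictionary<string, float> best =
        new Dictionary<string, float>(StringComparer.OrdinalIgnoreCase);

    // Intended to be called from the SpeechHypothesized and SpeechRecognized
    // handlers, for example:
    //   tracker.Record(e.Result.Text, e.Result.Confidence);
    public void Record(string text, float confidence)
    {
        if (string.IsNullOrEmpty(text))
        {
            return; // nothing to track
        }

        float current;
        if (!best.TryGetValue(text, out current) || confidence > current)
        {
            best[text] = confidence;
        }
    }

    // Phrases ordered from most to least confident.
    public IReadOnlyList<KeyValuePair<string, float>> Ranked()
    {
        return best.OrderByDescending(p => p.Value).ToList();
    }

    // Reset between recognitions so scores from different utterances do not mix.
    public void Clear()
    {
        best.Clear();
    }
}
```

In the question's program, this would mean creating one static HypothesisTracker, calling Record from sre_SpeechHypothesized and sre_SpeechRecognized, and printing Ranked() after sre.Recognize() returns.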
