How to get Microsoft Azure Speech To Text to start transcribing when program is run? (Unity, C#)
Question
I am trying to build a simple app using Microsoft Azure's Cognitive Services Speech To Text SDK in Unity3D. I've been following this tutorial, and it worked quite well. The only problem with the tutorial is that the speech-to-text is activated by a button. When you press the button, it transcribes for the duration of one sentence, and you have to press the button again to transcribe the next one. I'd like it to start transcribing as soon as the program is run in Unity, rather than having to press a button each time I want to transcribe a sentence.
Here is the code.
    public async void ButtonClick()
    {
        // Creates an instance of a speech config with specified subscription key and service region.
        // Replace with your own subscription key and service region (e.g., "westus").
        var config = SpeechConfig.FromSubscription("[My API Key]", "westus");

        // Make sure to dispose the recognizer after use!
        using (var recognizer = new SpeechRecognizer(config))
        {
            lock (threadLocker)
            {
                waitingForReco = true;
            }

            // Starts speech recognition, and returns after a single utterance is recognized. The end of a
            // single utterance is determined by listening for silence at the end or until a maximum of 15
            // seconds of audio is processed. The task returns the recognition text as result.
            // Note: Since RecognizeOnceAsync() returns only a single utterance, it is suitable only for single
            // shot recognition like command or query.
            // For long-running multi-utterance recognition, use StartContinuousRecognitionAsync() instead.
            var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);

            // Checks result.
            string newMessage = string.Empty;
            if (result.Reason == ResultReason.RecognizedSpeech)
            {
                newMessage = result.Text;
            }
            else if (result.Reason == ResultReason.NoMatch)
            {
                newMessage = "NOMATCH: Speech could not be recognized.";
            }
            else if (result.Reason == ResultReason.Canceled)
            {
                var cancellation = CancellationDetails.FromResult(result);
                newMessage = $"CANCELED: Reason={cancellation.Reason} ErrorDetails={cancellation.ErrorDetails}";
            }

            lock (threadLocker)
            {
                message = newMessage;
                waitingForReco = false;
            }
        }
    }

    void Start()
    {
        if (outputText == null)
        {
            UnityEngine.Debug.LogError("outputText property is null! Assign a UI Text element to it.");
        }
        else if (startRecoButton == null)
        {
            message = "startRecoButton property is null! Assign a UI Button to it.";
            UnityEngine.Debug.LogError(message);
        }
        else
        {
            // Continue with normal initialization, Text and Button objects are present.
        }
    }

    void Update()
    {
        lock (threadLocker)
        {
            if (startRecoButton != null)
            {
                startRecoButton.interactable = !waitingForReco && micPermissionGranted;
            }
        }
    }
I've tried removing the Button object, but then the speech-to-text won't run.
Any tips or advice would be amazing. Thank you.
Recommended answer
Per the comments in the script of the tutorial you referenced:
// Starts speech recognition, and returns after a single utterance is recognized. The end of a
// single utterance is determined by listening for silence at the end or until a maximum of 15
// seconds of audio is processed. The task returns the recognition text as result.
// Note: Since RecognizeOnceAsync() returns only a single utterance, it is suitable only for single
// shot recognition like command or query.
// For long-running multi-utterance recognition, use StartContinuousRecognitionAsync() instead.
But it's not as simple as replacing 'RecognizeOnceAsync' with 'StartContinuousRecognitionAsync', because the two behave differently. RecognizeOnceAsync basically turns on your mic for a single utterance (at most 15 seconds of audio) and then stops listening.
Instead, turn the button into a 'should I listen continuously or not?' toggle using StartContinuousRecognitionAsync and StopContinuousRecognitionAsync, and change your Start function to simply create a recognizer once and subscribe to its recognition event, so results arrive through the event handler. Below is the script I used to enable this functionality:
    using UnityEngine;
    using UnityEngine.UI;
    using Microsoft.CognitiveServices.Speech;

    public class HelloWorld : MonoBehaviour
    {
        public Text outputText;
        public Button startRecordButton;

        // PULLED OUT OF BUTTON CLICK
        SpeechRecognizer recognizer;
        SpeechConfig config;

        private object threadLocker = new object();
        private bool speechStarted = false; // checking to see if you've started listening for speech
        private string message;
        private bool micPermissionGranted = false;

        // Raised by the SDK (off the main thread) with intermediate results while speech is in progress.
        private void RecognizingHandler(object sender, SpeechRecognitionEventArgs e)
        {
            lock (threadLocker)
            {
                message = e.Result.Text;
            }
        }

        // The button now toggles continuous listening on and off.
        public async void ButtonClick()
        {
            if (speechStarted)
            {
                // this stops the listening when you click the button, if it's already on
                await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
                lock (threadLocker)
                {
                    speechStarted = false;
                }
            }
            else
            {
                // this will start the listening when you click the button, if it's already off
                await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);
                lock (threadLocker)
                {
                    speechStarted = true;
                }
            }
        }

        void Start()
        {
            startRecordButton.onClick.AddListener(ButtonClick);
            // Replace with your own subscription key and service region.
            config = SpeechConfig.FromSubscription("KEY", "REGION");
            recognizer = new SpeechRecognizer(config);
            recognizer.Recognizing += RecognizingHandler;
        }

        // Copies the latest text to the UI on the main thread.
        void Update()
        {
            lock (threadLocker)
            {
                if (outputText != null)
                {
                    outputText.text = message;
                }
            }
        }
    }
And below is a gif of me using this functionality. You'll note that I don't click the button at all (it was clicked only once, before the gif was recorded). Also, sorry for the strange sentences; my coworkers kept interrupting to ask who I was talking to.
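If you want to drop the button entirely, one option is to call StartContinuousRecognitionAsync from Start itself. The following is only a minimal sketch of that idea, not the tutorial's code: the class name AutoStartTranscriber is made up for illustration, "KEY" and "REGION" are placeholders, and it assumes microphone access is already granted (as in the editor or a desktop build; on mobile you would still need to request the permission first). It also subscribes to Recognized and Canceled so that final results and errors show up, and stops the recognizer in OnDestroy.

    using UnityEngine;
    using UnityEngine.UI;
    using Microsoft.CognitiveServices.Speech;

    // Hypothetical component name; attach it to any GameObject in the scene.
    public class AutoStartTranscriber : MonoBehaviour
    {
        public Text outputText;

        private readonly object threadLocker = new object();
        private SpeechRecognizer recognizer;
        private string message = string.Empty;

        // Intermediate hypotheses while the speaker is still talking.
        private void RecognizingHandler(object sender, SpeechRecognitionEventArgs e)
        {
            lock (threadLocker) { message = e.Result.Text; }
        }

        // Final text for a completed utterance.
        private void RecognizedHandler(object sender, SpeechRecognitionEventArgs e)
        {
            if (e.Result.Reason == ResultReason.RecognizedSpeech)
            {
                lock (threadLocker) { message = e.Result.Text; }
            }
        }

        // Surfaces errors such as a bad key or a lost connection.
        private void CanceledHandler(object sender, SpeechRecognitionCanceledEventArgs e)
        {
            lock (threadLocker)
            {
                message = $"CANCELED: Reason={e.Reason} ErrorDetails={e.ErrorDetails}";
            }
        }

        // Unity calls Start once when the component becomes active,
        // so listening begins without any button press.
        private async void Start()
        {
            try
            {
                var config = SpeechConfig.FromSubscription("KEY", "REGION"); // placeholders
                recognizer = new SpeechRecognizer(config);
                recognizer.Recognizing += RecognizingHandler;
                recognizer.Recognized += RecognizedHandler;
                recognizer.Canceled += CanceledHandler;
                await recognizer.StartContinuousRecognitionAsync();
            }
            catch (System.Exception ex)
            {
                Debug.LogError($"Could not start continuous recognition: {ex.Message}");
            }
        }

        // SDK events arrive on a background thread, so the UI is updated here.
        private void Update()
        {
            lock (threadLocker)
            {
                if (outputText != null) { outputText.text = message; }
            }
        }

        // Stop listening and release the recognizer when the object goes away.
        private async void OnDestroy()
        {
            var r = recognizer;
            recognizer = null;
            if (r == null) { return; }
            try
            {
                await r.StopContinuousRecognitionAsync();
            }
            finally
            {
                r.Dispose();
            }
        }
    }

The event handlers only write to a string under the lock, and Update copies it into the Text element, because Unity UI objects should only be touched from the main thread. If you still want a pause/resume toggle, the ButtonClick method from the script above can be added back unchanged.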