Watson speech to text live stream C# code example
Question
I'm trying to build an app in C# that takes an audio stream (from a file for now, but later a web stream) and returns transcriptions from Watson in real time as they become available, similar to the demo at https://speech-to-text-demo.mybluemix.net/
Does anyone know where I can find some sample code, preferably in C#, that could help me get started?
I tried the following, based on the limited documentation at https://github.com/watson-developer-cloud/dotnet-standard-sdk/tree/development/src/IBM.WatsonDeveloperCloud.SpeechToText.v1, but I get a BadRequest error when I call RecognizeWithSession. I'm not sure whether I'm on the right path here.
static void StreamingRecognize(string filePath)
{
    SpeechToTextService _speechToText = new SpeechToTextService();
    _speechToText.SetCredential(<user>, <pw>);
    var session = _speechToText.CreateSession("en-US_BroadbandModel");

    // returns initialized
    var recognizeStatus = _speechToText.GetSessionStatus(session.SessionId);

    // set up observe
    var taskObserveResult = Task.Factory.StartNew(() =>
    {
        var result = _speechToText.ObserveResult(session.SessionId);
        return result;
    });

    // get results
    taskObserveResult.ContinueWith((antecedent) =>
    {
        var results = antecedent.Result;
    });

    var metadata = new Metadata();
    metadata.PartContentType = "audio/wav";
    metadata.DataPartsCount = 1;
    metadata.Continuous = true;
    metadata.InactivityTimeout = -1;

    var taskRecognizeWithSession = Task.Factory.StartNew(() =>
    {
        using (FileStream fs = File.OpenRead(filePath))
        {
            _speechToText.RecognizeWithSession(session.SessionId, "audio/wav", metadata, fs, "chunked");
        }
    });
}
Recommended answer
Each Watson Developer Cloud SDK, whichever programming language you use, contains a folder called Examples, where you can find an example of using Speech to Text.
The SDK has support for WebSockets, which would satisfy your requirement of transcribing in closer to real time rather than uploading an audio file.
static void Main(string[] args)
{
    Transcribe();
    Console.WriteLine("Press any key to exit");
    Console.ReadLine();
}

// http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/getting_started/gs-credentials.shtml
static String username = "<username>";
static String password = "<password>";

static String file = @"c:\audio.wav";

static Uri url = new Uri("wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize");

// these should probably be private classes that use DataContractJsonSerializer
// see https://msdn.microsoft.com/en-us/library/bb412179%28v=vs.110%29.aspx
// or the ServiceState class at the end
static ArraySegment<byte> openingMessage = new ArraySegment<byte>(Encoding.UTF8.GetBytes(
    "{\"action\": \"start\", \"content-type\": \"audio/wav\", \"continuous\" : true, \"interim_results\": true}"
));
static ArraySegment<byte> closingMessage = new ArraySegment<byte>(Encoding.UTF8.GetBytes(
    "{\"action\": \"stop\"}"
));
// ... more in the link below
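The rest of the answer's code is not reproduced here. As a rough sketch of how the messages above might be used with the standard System.Net.WebSockets.ClientWebSocket class, something like the following could work. The class name WatsonStreamingSketch, the Chunk and IsListeningState helpers, and the 8192-byte chunk size are all hypothetical. It assumes that the service accepts basic-auth credentials on the WebSocket handshake and that it replies with a {"state": "listening"} message both when it is ready for audio and again after the final results. Both assumptions should be checked against the service documentation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;
using System.Text;

public static class WatsonStreamingSketch
{
    // Hypothetical helper: splits a buffer into segments of at most chunkSize bytes.
    public static IEnumerable<ArraySegment<byte>> Chunk(byte[] data, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            yield return new ArraySegment<byte>(data, offset, Math.Min(chunkSize, data.Length - offset));
        }
    }

    // Crude check for the assumed {"state": "listening"} message; a JSON parser would be more robust.
    public static bool IsListeningState(string json)
    {
        return json.Contains("\"state\"") && json.Contains("\"listening\"");
    }

    public static async Task TranscribeAsync(Uri url, string username, string password, string file,
        ArraySegment<byte> openingMessage, ArraySegment<byte> closingMessage)
    {
        using (var ws = new ClientWebSocket())
        {
            // Assumption: the service accepts these credentials during the handshake.
            ws.Options.Credentials = new NetworkCredential(username, password);
            await ws.ConnectAsync(url, CancellationToken.None);

            // Send the start message, then wait until the service reports that it is listening.
            await ws.SendAsync(openingMessage, WebSocketMessageType.Text, true, CancellationToken.None);
            await ReadUntilListeningAsync(ws);

            // Print results while the audio is still being sent, so interim results show up live.
            Task receiving = ReadUntilListeningAsync(ws);
            foreach (ArraySegment<byte> segment in Chunk(File.ReadAllBytes(file), 8192))
            {
                await ws.SendAsync(segment, WebSocketMessageType.Binary, true, CancellationToken.None);
            }
            await ws.SendAsync(closingMessage, WebSocketMessageType.Text, true, CancellationToken.None);
            await receiving;

            if (ws.State == WebSocketState.Open)
            {
                await ws.CloseAsync(WebSocketCloseStatus.NormalClosure, "Close", CancellationToken.None);
            }
        }
    }

    // Prints every complete text message until a "listening" state message or a close frame arrives.
    static async Task ReadUntilListeningAsync(ClientWebSocket ws)
    {
        var buffer = new byte[8192];
        using (var message = new MemoryStream())
        {
            while (ws.State == WebSocketState.Open)
            {
                WebSocketReceiveResult result =
                    await ws.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
                if (result.MessageType == WebSocketMessageType.Close) return;

                message.Write(buffer, 0, result.Count);
                if (!result.EndOfMessage) continue;

                string json = Encoding.UTF8.GetString(message.ToArray());
                message.SetLength(0);
                Console.WriteLine(json);
                if (IsListeningState(json)) return;
            }
        }
    }
}
```

A method such as the answer's Transcribe() could then call WatsonStreamingSketch.TranscribeAsync(url, username, password, file, openingMessage, closingMessage).Wait(). For a live web stream, the File.ReadAllBytes call would be replaced by reading from the stream in a loop and sending each buffer as it arrives.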