Google Speech Api get text from audio file returning {"result":[]} in C#
I'm trying to create a Windows application that takes an audio file I have and transcribes the speech in it to a text file using the Google Speech Recognition API. Here is what I did:
1) I went here https://groups.google.com/a/chromium.org/forum/?fromgroups#!forum/chromium-dev and became a member.
2) I went to my Google Developers Console and generated an API key successfully.
3) I got some code online and ran it:
private void btnGoogle_Click(object sender, EventArgs e)
{
    string path = @"Z:\path\to\audio\file\good-morning-google.flac";
    try
    {
        FileStream fileStream = File.OpenRead(path);
        MemoryStream memoryStream = new MemoryStream();
        memoryStream.SetLength(fileStream.Length);
        fileStream.Read(memoryStream.GetBuffer(), 0, (int)fileStream.Length);
        byte[] BA_AudioFile = memoryStream.GetBuffer();
        HttpWebRequest _HWR_SpeechToText = null;
        _HWR_SpeechToText =
            (HttpWebRequest)HttpWebRequest.Create(
                "https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=your-api-key-here");
        _HWR_SpeechToText.Credentials = CredentialCache.DefaultCredentials;
        _HWR_SpeechToText.Method = "POST";
        _HWR_SpeechToText.ContentType = "audio/x-flac; rate=44100";
        _HWR_SpeechToText.ContentLength = BA_AudioFile.Length;
        Stream stream = _HWR_SpeechToText.GetRequestStream();
        stream.Write(BA_AudioFile, 0, BA_AudioFile.Length);
        stream.Close();
        HttpWebResponse HWR_Response = (HttpWebResponse)_HWR_SpeechToText.GetResponse();
        if (HWR_Response.StatusCode == HttpStatusCode.OK)
        {
            Console.WriteLine("looks ok...");
            StreamReader SR_Response = new StreamReader(HWR_Response.GetResponseStream());
            Console.WriteLine(SR_Response.ReadToEnd());
            Console.WriteLine(SR_Response.ReadToEnd());
            Console.WriteLine("Done");
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.ToString());
    }
    Console.ReadLine();
}
The code above runs. It gives me the following output:
looks ok...
{"result":[]}
Thus I know I am getting an HttpStatusCode.OK response, because the "looks ok..." log line executes.
However, the result is totally empty... Why is that? Am I doing something wrong?
EDIT: Here is where I got the audio file: https://github.com/gillesdemey/google-speech-v2
First of all, your code is more complex than needed. I used this:
string api_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
string path = @"C:\temp\good-morning-google.flac";

// Read the whole FLAC file into memory.
byte[] bytes = System.IO.File.ReadAllBytes(path);

// WebClient is disposable, so wrap it in a using block.
using (WebClient client = new WebClient())
{
    // The rate parameter declares the sample rate of the uploaded audio.
    client.Headers.Add("Content-Type", "audio/x-flac; rate=44100");
    byte[] result = client.UploadData(string.Format(
        "https://www.google.com/speech-api/v2/recognize?client=chromium&lang=en-us&key={0}", api_key), "POST", bytes);
    string s = client.Encoding.GetString(result);
}
The second issue is your audio file! It's in 32-bit stereo, but it should be 16-bit PCM mono. So convert it to mono and drop it to 16-bit. I used Audacity (http://www.audacityteam.org/) to convert your file.
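If you want to check a file before uploading it, the sample rate, channel count and bit depth can be read from the FLAC header. This is only a rough sketch: it assumes the file starts with the standard fLaC marker followed directly by the STREAMINFO block (so no ID3 tag in front), and FlacInfo and ReadStreamInfo are names I made up for illustration.

```csharp
using System;
using System.IO;

public static class FlacInfo
{
    // Parses the STREAMINFO block from the first 22 bytes of a FLAC stream.
    public static (int SampleRate, int Channels, int BitsPerSample) ReadStreamInfo(byte[] header)
    {
        if (header.Length < 22 ||
            header[0] != 'f' || header[1] != 'L' || header[2] != 'a' || header[3] != 'C')
            throw new InvalidDataException("Not a FLAC stream (missing fLaC marker).");

        // Bytes 4-7 are the metadata block header; the low 7 bits of byte 4
        // hold the block type, and 0 means STREAMINFO.
        if ((header[4] & 0x7F) != 0)
            throw new InvalidDataException("First metadata block is not STREAMINFO.");

        // Bytes 8-17 hold the min/max block and frame sizes (skipped here).
        // After that: 20 bits sample rate, 3 bits (channels - 1),
        // 5 bits (bits per sample - 1).
        int sampleRate = (header[18] << 12) | (header[19] << 4) | (header[20] >> 4);
        int channels = ((header[20] >> 1) & 0x07) + 1;
        int bitsPerSample = (((header[20] & 0x01) << 4) | (header[21] >> 4)) + 1;
        return (sampleRate, channels, bitsPerSample);
    }
}
```

To use it, read the first 22 bytes of the file into a byte array and pass them to FlacInfo.ReadStreamInfo. If Channels is not 1 or BitsPerSample is not 16, convert the file first. The SampleRate value is also what I would put in the rate part of the Content-Type header.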
Then I got this response:
{"result":[]}
{"result":[{"alternative":[{"transcript":"good morning Google how are you feeling today","confidence":0.987629}],"final":true}],"result_index":0}
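Note that the response body above is two JSON objects on separate lines, and the first one has an empty result array, so you cannot parse the whole body as one JSON document. Here is a minimal sketch of pulling the transcript out. It assumes the body always looks like the one above (one JSON object per line) and that System.Text.Json is available (.NET Core 3.0 or later). SpeechResponse and FirstTranscript are made-up names.

```csharp
using System;
using System.Text.Json;

public static class SpeechResponse
{
    // Returns the first transcript found in a line-delimited JSON body,
    // or null if every line has an empty result array.
    public static string FirstTranscript(string body)
    {
        string[] lines = body.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            using (JsonDocument doc = JsonDocument.Parse(line))
            {
                if (!doc.RootElement.TryGetProperty("result", out JsonElement results))
                    continue;

                foreach (JsonElement result in results.EnumerateArray())
                {
                    if (!result.TryGetProperty("alternative", out JsonElement alternatives))
                        continue;

                    foreach (JsonElement alt in alternatives.EnumerateArray())
                    {
                        if (alt.TryGetProperty("transcript", out JsonElement transcript))
                            return transcript.GetString();
                    }
                }
            }
        }
        return null;
    }
}
```

With the string s from the upload code, SpeechResponse.FirstTranscript(s) skips the empty first line and returns the transcript from the second one.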