Speech to Text: new API parameter for Ogg format does not work


Question

https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-speech-to-text

A month ago, the Ogg format used 'Content-type': 'audio/ogg; codec=audio/pcm; samplerate=16000', and it worked fine.

However, you have now changed that to "audio/ogg; codecs=opus", and it doesn't work at all!

This is the Python code:

import requests
import json
def saf():
    with open('d:/a.ogg', 'rb') as f:
        while 1:
            data=f.read(1024)
            if not data:
                break
            yield data
 
headers = {
    'Transfer-Encoding': 'chunked',
    'Content-type': 'audio/ogg; codecs=opus',
    'Ocp-Apim-Subscription-Key': '***',
    'Accept': 'application/json'    
}
 
response = requests.post('https://westus.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=zh-CN', headers=headers, verify=False, data=saf())
 
print(response.content)
#results=json.loads(response.content)
#print(results)

The output message is b'{"Message":"Unsupported audio format"}'.

Recommended answer

Hello John,

The SDK currently supports only the WAV format with the PCM codec.

Since you are using REST, the audio format should be supported. Could you please try 'codecs=audio/opus' once and check? If there are further issues, I would recommend raising a Git issue at the documentation link so the concerned product group team can address it.

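Only the SDK supports MP3 and Opus/Ogg audio files as stream input. This feature is available only on Linux for C++ and C#, and is currently in beta (see the details here).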

-Rohit
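
The suggestion above to try 'codecs=audio/opus' could be checked with a small sketch like the following. It reuses the endpoint and chunked-upload approach from the question and sends the same file once per candidate Content-Type value. The helper names (iter_chunks, build_headers, try_content_types) are hypothetical and introduced only for illustration. Whether the service accepts either value is not confirmed here; the sketch only makes the two attempts easy to compare. It omits the explicit 'Transfer-Encoding' header on the assumption that requests applies chunked encoding itself when the body is a generator.

```python
import io

# Endpoint taken from the question; the language parameter is the asker's choice.
ENDPOINT = ("https://westus.stt.speech.microsoft.com/speech/recognition/"
            "conversation/cognitiveservices/v1?language=zh-CN")

# Candidate values to compare: the one from the question and the one
# suggested in the answer. Neither is guaranteed to be accepted.
CANDIDATE_CONTENT_TYPES = [
    "audio/ogg; codecs=opus",
    "audio/ogg; codecs=audio/opus",
]


def iter_chunks(stream, chunk_size=1024):
    """Yield successive chunks from a binary stream until it is exhausted."""
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield data


def build_headers(subscription_key, content_type):
    """Build the request headers for one Content-Type candidate."""
    return {
        "Content-Type": content_type,
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Accept": "application/json",
    }


def try_content_types(path, subscription_key, url=ENDPOINT):
    """Post the audio file once per candidate and collect (status, body)."""
    import requests  # third-party; imported lazily so the helpers above work without it

    results = {}
    for content_type in CANDIDATE_CONTENT_TYPES:
        with open(path, "rb") as f:
            response = requests.post(
                url,
                headers=build_headers(subscription_key, content_type),
                data=iter_chunks(f),  # a generator body is sent chunked
            )
        results[content_type] = (response.status_code, response.text)
    return results


# Offline check of the chunking helper (no network needed).
demo = io.BytesIO(b"\x00" * 2500)
print([len(c) for c in iter_chunks(demo)])  # chunk sizes: [1024, 1024, 452]
```

Calling try_content_types('d:/a.ogg', '***') with a real subscription key would return one (status code, response body) pair per candidate, which shows at a glance whether either value gets past the "Unsupported audio format" message.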

