Can Microsoft Bing Speech be configured to return only numbers / letters?

Question

Can the Microsoft Bing Speech API be configured to only return numbers and letters, as opposed to full words?

The use case is transcribing spoken Canadian postal codes, e.g. M 1 B 0 R 3. Microsoft may return "Em 1 Be 0 Are 3".

Our audio file is 8000 Hz and encoded with "M-ULAW". We have no flexibility in changing the sample rate or encoding. We are using the "SMD" scenario, but I can't find any documentation on what this scenario does. Base request URI:

https://speech.platform.bing.com/recognize?scenarios=smd&appid=D4D52672-91D7-4C74-8AD8-42B1D98141A5&device.os=your_device_os&version=3.0

Is there a way to get a more accurate response from Microsoft for this use case?

Thanks

Recommended answer

You could try using Microsoft's Custom Speech Service (previously known as the Custom Recognition Intelligent Service, or CRIS) to create and use a custom language model.

The guidelines for transcription of custom language models say "Common acronyms can be left as a single entity without periods or spaces between the letters, but all other acronyms should be written out in separate letters, with each letter separated by a single space" and include this example:

Original text               After normalization
-----------------------     ---------------------------
play OU812 by Van Halen     play O U 8 1 2 by Van Halen

So following their guidelines, your custom language model will be a file where each line looks something like this:

M 1 B 0 R 3
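As a minimal sketch of that normalization, assuming your source data holds postal codes in the usual compact form (such as "M1B 0R3"), a small helper can put each one into the spaced form shown above. The function name `to_training_line` is hypothetical and is not part of any Microsoft tool:

```python
def to_training_line(postal_code: str) -> str:
    """Return the code with every letter and digit separated by one space.

    Hypothetical helper: follows the "separate letters" normalization
    guideline quoted above; it does not validate the postal code.
    """
    # Drop spaces, hyphens, and other punctuation, then upper-case the rest.
    chars = [c.upper() for c in postal_code if c.isalnum()]
    return " ".join(chars)


if __name__ == "__main__":
    print(to_training_line("m1b 0r3"))
```

Each call yields one line for the language model file.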

You can easily generate a file containing thousands of examples of Canadian postal codes based on the structure of the codes, which in regular expression format looks like this:

[ABCEGHJKLMNPRSTVXY][0-9][ABCEGHJKLMNPRSTVWXYZ][0-9][ABCEGHJKLMNPRSTVWXYZ][0-9]

(The expression above comes from an answer about validating postal codes.)
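One way to build such a file is sketched below. It assumes that randomly sampled codes which merely match the pattern are good enough as training text (they are not checked against real, assigned postal codes), and the names `generate_lines` and `write_training_file`, the default count, and the output path are all illustrative choices rather than anything the Custom Speech Service prescribes:

```python
import random
import re

# Character classes taken from the regular expression above.
FIRST_LETTERS = "ABCEGHJKLMNPRSTVXY"
OTHER_LETTERS = "ABCEGHJKLMNPRSTVWXYZ"
DIGITS = "0123456789"

POSTAL_CODE_RE = re.compile(
    r"[ABCEGHJKLMNPRSTVXY][0-9][ABCEGHJKLMNPRSTVWXYZ][0-9][ABCEGHJKLMNPRSTVWXYZ][0-9]"
)


def random_postal_code(rng: random.Random) -> str:
    """Pick one character per position: letter, digit, letter, digit, letter, digit."""
    pools = [FIRST_LETTERS, DIGITS, OTHER_LETTERS, DIGITS, OTHER_LETTERS, DIGITS]
    return "".join(rng.choice(pool) for pool in pools)


def generate_lines(count: int, seed: int = 0) -> list[str]:
    """Return `count` distinct codes, each written as 'M 1 B 0 R 3'."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    codes: set[str] = set()
    while len(codes) < count:
        codes.add(random_postal_code(rng))
    # Joining a string with " " puts one space between every character.
    return [" ".join(code) for code in sorted(codes)]


def write_training_file(path: str, count: int = 5000) -> None:
    """Write one spaced postal code per line to `path` (hypothetical file name)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(generate_lines(count)) + "\n")


if __name__ == "__main__":
    write_training_file("postal_codes.txt")
```

If you have a list of postal codes that callers actually use, feeding those in instead of random samples would probably match real speech more closely.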

By doing this you're telling the recognizer what sort of things you're expecting people to say, and helping it choose when there are multiple possibilities for a sound (e.g. "U" vs. "you"). I think it will make a huge difference in the results you get.
