Google Cloud Speech API word hints

Question

Can you give an example of using word hints in the Google Cloud Speech API? I tried the REST API executor with brook.flac. I supplied the phrase "Brooklin" (instead of "Brooklyn"), but the result was the same. Do hints work at all?

Recommended answer

From https://cloud.google.com/speech/docs/basics#phrase-hints

For any given recognition task, you may also pass a speechContext (of type SpeechContext) that provides information to aid in processing the given audio. Currently, a context can hold a list of phrases to act as "hints" to the recognizer; these phrases can boost the probability that such words or phrases will be recognized.

You may use these phrase hints in a few ways:

- Improve the accuracy for specific words and phrases that tend to be overrepresented in your audio data. For example, if specific commands are typically spoken by the user, you can provide these as phrase hints. Such additional phrases may be particularly useful if the supplied audio contains noise or the contained speech is not very clear.
- Add additional words to the vocabulary of the recognition task. The Cloud Speech API includes a very large vocabulary. However, if proper names or domain-specific words are out-of-vocabulary, you can add them to the phrases provided to your request's speechContext.

Phrases may be provided either as small groups of words or as single words. (See Content Limits for limits on the number and size of these phrases.) When provided as multi-word phrases, hints boost the probability of recognizing those words in sequence but also, to a lesser extent, boost the probability of recognizing portions of the phrase, including individual words.

For example, this shwazil_hoful.flac file contains some made-up words. If recognition is performed without supplying these out-of-vocabulary words, the recognizer will not return the desired transcript, but instead return words that are in vocabulary, such as: "it's a swallow whole day".

{
  "config": {
    "encoding":"FLAC",
    "sampleRateHertz": 16000,
    "languageCode":"en-US"
  },
  "audio":{
    "uri":"gs://speech-demo/shwazil_hoful.flac"
  }
}

However, when these out-of-vocabulary words are supplied with the recognition request, the recognizer will return the desired transcript: "it's a shwazil hoful day".

{
  "config": {
    "encoding":"FLAC",
    "sampleRateHertz": 16000,
    "languageCode":"en-US",
    "speechContexts": [
      {
        "phrases":["hoful","shwazil"]
      }
    ]
  },
  "audio":{
    "uri":"gs://speech-demo/shwazil_hoful.flac"
  }
}
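
The request body above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name build_recognize_request and its parameters are hypothetical, and it only builds and prints the JSON body without sending it. It assumes the v1 REST schema, where speechContexts is a list of SpeechContext objects.

```python
import json


def build_recognize_request(audio_uri, phrases=None, language_code="en-US",
                            encoding="FLAC", sample_rate_hertz=16000):
    """Build a recognize request body as a plain dict (hypothetical helper)."""
    config = {
        "encoding": encoding,
        "sampleRateHertz": sample_rate_hertz,
        "languageCode": language_code,
    }
    if phrases:
        # Assumption: speechContexts is a list of SpeechContext objects,
        # each carrying its own "phrases" list.
        config["speechContexts"] = [{"phrases": list(phrases)}]
    return {"config": config, "audio": {"uri": audio_uri}}


body = build_recognize_request(
    "gs://speech-demo/shwazil_hoful.flac",
    phrases=["hoful", "shwazil"],
)
# Serialize the body; sending it (with credentials) is left to the caller.
print(json.dumps(body, indent=2))
```

Omitting phrases yields the plain request without hints, which makes it easy to compare the two transcripts for the same audio file.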

Alternatively, if certain words are typically said together in a phrase, they can be grouped together, which may further increase the confidence that they will be recognized.

{
  "config": {
    "encoding":"FLAC",
    "sampleRateHertz": 16000,
    "languageCode":"en-US",
    "speechContexts": [
      {
        "phrases":["shwazil hoful day"]
      }
    ]
  },
  "audio":{
    "uri":"gs://speech-demo/shwazil_hoful.flac"
  }
}

In general, be sparing when providing speech context hints. Better recognition accuracy can be achieved by limiting phrases to only those expected to be spoken. For example, if there are multiple dialog states or device operating modes, provide only the hints that correspond to the current state, rather than always supplying hints for all possible states.
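
One way to follow this advice is to keep a separate hint list per dialog state and attach only the list for the current state. The sketch below is illustrative only: the state names, the phrases, and the helper speech_contexts_for are all made up for this example, and the list-of-objects shape for speechContexts is an assumption based on the v1 REST schema.

```python
# Hypothetical dialog states and the phrases expected in each one.
HINTS_BY_STATE = {
    "main_menu": ["check balance", "transfer money"],
    "confirm": ["yes", "no", "cancel"],
}


def speech_contexts_for(state):
    """Return the speechContexts value for one dialog state (hypothetical helper)."""
    phrases = HINTS_BY_STATE.get(state, [])
    # Send no context at all when the state has no expected phrases.
    return [{"phrases": list(phrases)}] if phrases else []


config = {
    "encoding": "FLAC",
    "sampleRateHertz": 16000,
    "languageCode": "en-US",
}
contexts = speech_contexts_for("confirm")
if contexts:
    config["speechContexts"] = contexts
```

Keeping the mapping in one place means a state change only swaps the hint list, and states without hints send a request with no speechContexts at all.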
