Google Speech Recognition API: timestamp for each word?


Question

It is possible to do speech recognition by sending a request to http://www.google.com/speech-api/v2/recognize?...

Example: I said "one two three four five" in a WAV file. The Google API returns this:

{
  u'alternative':
  [
    {u'transcript': u'12345'},
    {u'transcript': u'1 2 3 4 5'},
    {u'transcript': u'one two three four five'}
  ],
  u'final': True
}

Question: is it possible to get the time (in seconds) at which each word was spoken?

With my example:

['one', 0.23, 0.80], ['two', 1.03, 1.45], ['three', 1.79, 2.35], etc.

That is, the word "one" was spoken between 00:00:00.23 and 00:00:00.80,
and the word "two" between 00:00:01.03 and 00:00:01.45 (start and end offsets in seconds).

PS: I am looking for an API that supports languages other than English, especially French.

Recommended answer

I believe the other answer is now out of date. This is now possible with the Google Cloud Speech API: https://cloud.google.com/speech/docs/async-time-offsets
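To make that concrete, here is a minimal sketch of how word time offsets might be requested and reshaped into the `[word, start, end]` form asked for in the question. It assumes the Cloud Speech REST JSON format, where `enableWordTimeOffsets` is set in the recognition config and each returned word carries `startTime` and `endTime` as strings such as `"1.030s"`. The helper names, the bucket URI, and the sample response are made up for illustration, and the actual HTTP call and authentication are left out.

```python
def build_request_body(gcs_uri, language_code="fr-FR"):
    """Build a JSON body for a Cloud Speech recognize request.

    gcs_uri is a hypothetical Cloud Storage path to the audio file.
    Other config fields (encoding, sample rate) are omitted for brevity.
    """
    return {
        "config": {
            "languageCode": language_code,
            # Ask the service to return start/end offsets for each word.
            "enableWordTimeOffsets": True,
        },
        "audio": {"uri": gcs_uri},
    }


def parse_seconds(duration):
    """Convert a duration string such as '1.030s' to a float of seconds."""
    return float(duration.rstrip("s"))


def word_timestamps(response):
    """Flatten a recognize response into [word, start, end] triples.

    Only the first (most likely) alternative of each result is used.
    """
    words = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        for info in alternatives[0].get("words", []):
            words.append([
                info["word"],
                parse_seconds(info["startTime"]),
                parse_seconds(info["endTime"]),
            ])
    return words


# Hand-written sample in the assumed response shape (not real API output).
sample_response = {
    "results": [{
        "alternatives": [{
            "transcript": "one two",
            "words": [
                {"word": "one", "startTime": "0.230s", "endTime": "0.800s"},
                {"word": "two", "startTime": "1.030s", "endTime": "1.450s"},
            ],
        }],
    }],
}

if __name__ == "__main__":
    print(build_request_body("gs://my-bucket/numbers.wav"))
    print(word_timestamps(sample_response))
```

The default `language_code` is set to `fr-FR` only because the question mentions French; any language code the service supports could be passed instead.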
