How to determine the position of words recognized by SpeechRecognitionEngine?


Question

I am exploring the capabilities of SpeechRecognitionEngine. My end goal is to input a WAV file together with a transcription of that WAV file, and to output the position in the WAV file where each word begins (and, ideally, where it ends).

I can get the engine to recognize the phrase successfully, but I cannot work out how to retrieve the audio position at which each word starts, as opposed to the time at which the recognition was hypothesized, recognized, and so on.

If you're curious what the point of this is, it is for automating lip-sync animation workflows.

Thank you for your time.

Recommended answer

Proper audio-to-text alignment is a task that requires specific algorithms, different from those used for speech recognition. You can emulate some alignment functionality with an ASR engine, but it will not work well.
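To make the difference from plain recognition concrete, here is a minimal toy sketch of the idea behind forced alignment, written in Python. It is not the algorithm used by CMUSphinx or by SpeechRecognitionEngine. It assumes you already have a hypothetical table `scores[t][w]` saying how well audio frame `t` matches word `w` of the known transcript (a real aligner derives such scores from an acoustic model), and it uses dynamic programming to find the best in-order split of frames into words. The function name `align` and the `frame_seconds` parameter are made up for this illustration.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def align(scores: List[List[float]], frame_seconds: float = 0.01) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds for each word, in transcript order.

    scores[t][w] is a hypothetical log-score of frame t belonging to word w.
    Every word must cover at least one frame, and words cannot be reordered.
    """
    n_frames = len(scores)
    n_words = len(scores[0])
    if n_frames < n_words:
        raise ValueError("need at least one frame per word")

    # best[t][w]: best total score of a path that ends in word w at frame t.
    best = [[NEG_INF] * n_words for _ in range(n_frames)]
    # advanced[t][w]: True if the best path entered word w exactly at frame t.
    advanced = [[False] * n_words for _ in range(n_frames)]
    best[0][0] = scores[0][0]

    for t in range(1, n_frames):
        for w in range(n_words):
            stay = best[t - 1][w]
            move = best[t - 1][w - 1] if w > 0 else NEG_INF
            if move > stay:
                best[t][w] = move + scores[t][w]
                advanced[t][w] = True
            elif stay > NEG_INF:
                best[t][w] = stay + scores[t][w]

    # Backtrack from the last frame of the last word to label every frame.
    word_of_frame = [0] * n_frames
    w = n_words - 1
    for t in range(n_frames - 1, -1, -1):
        word_of_frame[t] = w
        if advanced[t][w]:
            w -= 1

    # Convert frame labels into (start, end) times; end is exclusive.
    spans = []
    for w in range(n_words):
        frames = [t for t, label in enumerate(word_of_frame) if label == w]
        spans.append((frames[0] * frame_seconds, (frames[-1] + 1) * frame_seconds))
    return spans
```

Called with one row of scores per audio frame and one column per transcript word, `align` returns one `(start, end)` pair per word. Because the transcript is fixed in advance, the search only has to decide where the word boundaries fall, which is why an aligner can give per-word positions that a general recognizer does not expose.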

For implementations of alignment algorithms, you can look at the CMUSphinx speech recognition toolkit:

http://cmusphinx.sourceforge.net/?s=long+audio+alignment

http://www.bluevincent.com/2011/02/speech-to-text-using-java.html

Alternatively, you can try a commercial service, such as the one from Nexiwave:

http://nexiwave.com/index.php/applications/transcription-timestamping
