Create subtitles for audio books
Question
I want to add timestamps to the sentences of a book so that they line up with the corresponding audiobook, ideally in various languages.
Here is an example:
Pride and Prejudice
Text from Project Gutenberg
Audio from LibriVox: http://ia700408.us.archive.org/8/items/pride_prejudice_1102_librivox/prideandprejudice_01_austen_64kb.mp3
My idea was to find a speech recognition tool that puts timestamps on sentences (step 1), and then to map the messy transcription back to the original text using Levenshtein distances (step 2).
The website https://speechlogger.appspot.com/ offers a solution for step 1, but it limits how many characters it outputs. I could theoretically use web automation to get the job done by starting a new recording every minute or so, but that is a really dirty workaround.
I scripted step 2 in R and tested it on a sample I got from speechlogger. It works okayish, but it could be greatly improved if the program knew the text in advance, like when you read a known passage aloud to train speech recognition software. By transcribing first, I'm not using all the information I have.
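For illustration, step 2 can be sketched roughly as follows. This is written in Python rather than R and is not the script mentioned above (which is not shown). It assumes the recognizer returns a list of `(start_time, text)` segments; the function names, the normalization rule, and the sample timestamps are all hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, two rows)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so only the words are compared."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return " ".join(cleaned.split())


def map_timestamps(sentences, segments):
    """Give each original sentence the start time of its closest segment.

    `segments` is a list of (start_time, recognized_text) pairs. Closeness is
    the edit distance divided by the longer length, so that long and short
    sentences are scored on the same scale.
    """
    result = []
    for sentence in sentences:
        target = normalize(sentence)

        def score(segment):
            candidate = normalize(segment[1])
            longest = max(len(target), len(candidate), 1)
            return levenshtein(target, candidate) / longest

        best = min(segments, key=score)
        result.append((best[0], sentence))
    return result


# Original text (two sentences from Pride and Prejudice).
sentences = [
    "It is a truth universally acknowledged, that a single man in "
    "possession of a good fortune, must be in want of a wife.",
    '"My dear Mr. Bennet," said his lady to him one day, '
    '"have you heard that Netherfield Park is let at last?"',
]

# Made-up recognizer output with made-up timestamps, for illustration only.
segments = [
    (0.0, "it is a truth universally acknowledge that a single man in "
          "possession of a good fortune must be in want of a wife"),
    (7.4, "my dear mister bennet said his lady to him one day have you "
          "heard that netherfield park is led at last"),
]

for start, sentence in map_timestamps(sentences, segments):
    print(f"{start:6.1f}s  {sentence}")
```

Comparing every sentence against every segment is quadratic and ignores ordering; for a whole book it would probably be better to only search a small window around the previously matched segment.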
So my questions are: what alternative ways are there to timestamp audio files, and is there a way to make my process smarter by letting the recognition engine know what it is supposed to recognize?
Recommended answer
There are many nice software packages developed for this, with various levels of accuracy:
Aligner Demo in Sphinx4 - CMUSphinx toolkit in Java: https://github.com/cmusphinx/sphinx4/blob/master/sphinx4-samples/src/main/java/edu/cmu/sphinx/demo/aligner/AlignerDemo.java
SAIL align - HTK-based aligner, a sizable pack of Perl scripts.
Gentle - Kaldi-based aligner, works as a service.
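
Aligners of this kind typically report timings for individual words, so one more step is needed to get sentence-level subtitles. The sketch below groups word timings into SubRip (SRT) cues. The input shape is an assumption made for illustration (one `(start, end)` tuple per whitespace-separated token of the text, or `None` where a word could not be aligned); it is not the actual output format of any of the tools listed above, and the function names are hypothetical.

```python
def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def sentences_to_srt(sentences, word_timings):
    """Build SRT text from sentences and per-word timings.

    Assumes `word_timings` holds exactly one entry per whitespace-separated
    token of the sentences, in reading order. Each entry is (start, end) in
    seconds, or None for a word the aligner failed to place.
    """
    cues = []
    position = 0
    for sentence in sentences:
        count = len(sentence.split())
        timings = [t for t in word_timings[position:position + count]
                   if t is not None]
        position += count
        if not timings:
            continue  # no word of this sentence was aligned, so skip it
        start = timings[0][0]   # start of the first aligned word
        end = timings[-1][1]    # end of the last aligned word
        index = len(cues) + 1
        cues.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n"
                    f"{sentence}\n")
    return "\n".join(cues)


# Made-up timings for illustration only.
demo_sentences = ["It is a truth.", "My dear Mr. Bennet."]
demo_timings = [(0.0, 0.2), (0.2, 0.4), None, (0.5, 1.0),
                (1.5, 1.8), (1.8, 2.1), (2.1, 2.4), (2.4, 3.0)]

print(sentences_to_srt(demo_sentences, demo_timings))
```

A real aligner may tokenize differently from a plain whitespace split (for example around hyphens or punctuation), in which case the word counts per sentence would have to be derived from the aligner's own tokens instead.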