How to split video or audio by silent parts


Question

I need to automatically split a video of a speech into words, so that every word is a separate video file. Do you know of any way to do this?

My plan was to detect the silent parts and use them as word separators. But I didn't find any tool that does this, and it looks like ffmpeg is not the right tool for it.

Recommended answer

You could first use ffmpeg to detect intervals of silence, like this:

ffmpeg -i "input.mov" -af silencedetect=noise=-30dB:d=0.5 -f null - 2> vol.txt

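The filter reports every stretch where the audio stays below -30 dB for at least 0.5 seconds. Because stderr is redirected (2> vol.txt), the console output lands in vol.txt, with readings that look like this: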

[silencedetect @ 00000000004b02c0] silence_start: -0.0306667
[silencedetect @ 00000000004b02c0] silence_end: 1.42767 | silence_duration: 1.45833
[silencedetect @ 00000000004b02c0] silence_start: 2.21583
[silencedetect @ 00000000004b02c0] silence_end: 2.7585 | silence_duration: 0.542667
[silencedetect @ 00000000004b02c0] silence_start: 3.1315
[silencedetect @ 00000000004b02c0] silence_end: 5.21833 | silence_duration: 2.08683
[silencedetect @ 00000000004b02c0] silence_start: 5.3895
[silencedetect @ 00000000004b02c0] silence_end: 7.84883 | silence_duration: 2.45933
[silencedetect @ 00000000004b02c0] silence_start: 8.05117
[silencedetect @ 00000000004b02c0] silence_end: 10.0953 | silence_duration: 2.04417
[silencedetect @ 00000000004b02c0] silence_start: 10.4798
[silencedetect @ 00000000004b02c0] silence_end: 12.4387 | silence_duration: 1.95883
[silencedetect @ 00000000004b02c0] silence_start: 12.6837
[silencedetect @ 00000000004b02c0] silence_end: 14.5572 | silence_duration: 1.8735
[silencedetect @ 00000000004b02c0] silence_start: 14.9843
[silencedetect @ 00000000004b02c0] silence_end: 16.5165 | silence_duration: 1.53217

You then generate commands to cut from each silence end to the next silence start. You will probably want to add handles of, say, 250 ms on each side, so each clip will be 2 * 250 ms longer than the detected segment.

ffmpeg -ss <silence_end - 0.25> -t <next_silence_start - silence_end + 2 * 0.25> -i input.mov word-N.mov

(I have skipped specifying audio/video parameters.)
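As a worked example of the arithmetic in the command template, here is a minimal Python sketch that turns one pair of readings into the -ss and -t values. The function name cut_args and the clamping of the start time at zero are assumptions added for illustration; the 0.25 second handle and the sample numbers come from the answer and the log above.

```python
def cut_args(silence_end, next_silence_start, handle=0.25):
    """Return (start, duration) in seconds for one spoken segment.

    cut_args is a hypothetical helper, not part of ffmpeg.
    """
    # Begin a little before the speech starts, but never before time zero
    # (the clamp is an assumption; the original template does not mention it).
    start = max(0.0, silence_end - handle)
    # Stop a little after the speech ends.
    duration = (next_silence_start + handle) - start
    return start, duration


# First spoken segment in the sample log: silence ends at 1.42767,
# the next silence starts at 2.21583.
start, duration = cut_args(1.42767, 2.21583)
print(f"-ss {start:.5f} -t {duration:.5f}")  # -ss 1.17767 -t 1.28816
```

When the start is not clamped, the duration equals next_silence_start - silence_end + 2 * 0.25, matching the template.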

You'll want to write a script that scrapes the console log and generates a structured (maybe CSV) file with the timecodes, one pair per line: a silence_end and the silence_start that follows it. Then write another script that generates a command from each pair of numbers.
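The two scripts could be sketched in Python roughly as follows. This is only one possible shape: the function names, the regular expression, the three-decimal rounding, and the word-N.mov naming loop are assumptions for illustration, and the generated commands still omit audio/video parameters, as in the template above.

```python
import csv
import io
import re
import shlex

# Picks out the number after "silence_start:" or "silence_end:".
# "silence_duration:" is deliberately not matched.
READING = re.compile(
    r"silence_(start|end):\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def speech_pairs(log_text):
    """Step 1a: return (silence_end, next silence_start) pairs from the log."""
    pairs = []
    last_end = None
    for kind, value in READING.findall(log_text):
        t = float(value)
        if kind == "end":
            last_end = t
        elif last_end is not None:
            # A silence_start that follows a silence_end closes one segment.
            pairs.append((last_end, t))
            last_end = None
    return pairs


def pairs_to_csv(pairs):
    """Step 1b: serialize the pairs as CSV text, one pair per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(pairs)
    return buf.getvalue()


def commands_from_csv(csv_text, source="input.mov", handle=0.25):
    """Step 2: build one ffmpeg command line per CSV row."""
    commands = []
    for n, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        seg_start, seg_end = float(row[0]), float(row[1])
        start = max(0.0, seg_start - handle)  # clamp is an assumption
        duration = seg_end + handle - start
        commands.append(
            f"ffmpeg -ss {start:.3f} -t {duration:.3f} "
            f"-i {shlex.quote(source)} word-{n}.mov"
        )
    return commands


if __name__ == "__main__":
    # vol.txt is the file written by the silencedetect command above.
    with open("vol.txt", encoding="utf-8", errors="replace") as f:
        table = pairs_to_csv(speech_pairs(f.read()))
    for command in commands_from_csv(table):
        print(command)
```

Printing the commands rather than running them makes it easy to inspect the cut points first. A leading silence_start with no preceding silence_end (as in the sample log) is skipped, so audio before the first detected silence is not exported by this sketch.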
