Silence out regions of audio based on a list of timestamps, using sox and python

Views: 322 | Published: 2020/9/13 21:23:53 | Tags: python audio sox


Question

I have an audio file.
I have a bunch of [start, end] timestamp segments.

What I want to achieve: say the audio is 6:00 minutes long.
The segments I have are: [[0.0,4.0], [8.0,12.0], [16.0,20.0], [24.0,28.0]]

After I pass these two to sox + python, the output should be audio that is 6 minutes long, but has audio only in the times covered by the segments.

i.e. I want to pass the timestamps and the original audio to sox + python so that it generates audio with everything silenced out except for the portions corresponding to the passed segments.

I couldn't achieve the above, but came somewhat close to the opposite. After days of googling I have this:

Updated, more concise code + example:
A sox command that takes padding and trimming, like this:

SOX__SILENCE = 'sox "{inputaudio}" -c 1 "{outputaudio}" {padding}{trimming}'
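As a minimal sketch of how this template might be filled in from Python: the helper name `build_sox_command` is hypothetical, and the file names are just the ones used in the terminal examples in this question. It only builds the argument list; it does not run sox. Note that the template puts `{padding}` and `{trimming}` back to back, so it relies on the trimming string starting with a space.

```python
import shlex

SOX__SILENCE = 'sox "{inputaudio}" -c 1 "{outputaudio}" {padding}{trimming}'


def build_sox_command(inputaudio, outputaudio, padding="", trimming=""):
    """Fill in the template and split it into an argument list (hypothetical helper)."""
    command = SOX__SILENCE.format(
        inputaudio=inputaudio,
        outputaudio=outputaudio,
        padding=padding,
        trimming=trimming,
    )
    # shlex.split honours the double quotes around the file names.
    return shlex.split(command)


args = build_sox_command("dinners.mp3", "testlongpad.mp3",
                         padding="pad 4.0@0.0 4.0@8.0")
print(args)
```

The resulting list could then be handed to `subprocess.run(args, check=True)` on a machine where sox is installed.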

# Random segments for testing:
A = [[0.0, 16.0]]
b = [[1.0, 2.0]]
z = [[1.6, 8.3], [13.2, 33.7], [35.0, 38.0], [42.0, 51.0], [70.2, 73.7], [90.0, 99.2], [123.0, 131.1]]
q = [[0.0, 4.0], [8.0, 12.0], [16.0, 20.0], [24.0, 28.0]]

A small Python script to generate the padding and trimming arguments.

Padding:

def get_pad_pattern_from_timestamps(my_segments):
    # Build a sox "pad" argument: one "duration@position" entry per segment.
    padding = 'pad'
    for segment in my_segments:
        # Round to keep float noise (e.g. 6.700000000000001) out of the output.
        duration = str(round(segment[1] - segment[0], 3))
        padding = padding + ' ' + duration + '@' + str(segment[0])
    return padding

print(get_pad_pattern_from_timestamps(A))
print(get_pad_pattern_from_timestamps(b))
print(get_pad_pattern_from_timestamps(z))
print(get_pad_pattern_from_timestamps(q))

Output from the above:

pad 16.0@0.0
pad 1.0@1.0
pad 6.7@1.6 20.5@13.2 3.0@35.0 9.0@42.0 3.5@70.2 9.2@90.0 8.1@123.0
pad 4.0@0.0 4.0@8.0 4.0@16.0 4.0@24.0

Trimming:

def get_trimm_pattern_from_timestamps(my_segments):
    # Build one " trim ..." effect per segment; each piece starts with a space.
    trimming = ''
    for segment in my_segments:
        # Round to keep float noise (e.g. 6.700000000000001) out of the output.
        duration = str(round(segment[1] - segment[0], 3))
        trimming = trimming + ' trim 0 ' + str(segment[0]) + ' 0 ' + duration + ' ' + duration
    return trimming

print(get_trimm_pattern_from_timestamps(A))
print(get_trimm_pattern_from_timestamps(b))
print("\n")
print(get_trimm_pattern_from_timestamps(z))
print("\n")
print(get_trimm_pattern_from_timestamps(q))
print("\n")

Output of trimming:

trim 0 0.0 0 16.0 16.0
 trim 0 1.0 0 1.0 1.0


 trim 0 1.6 0 6.7 6.7 trim 0 13.2 0 20.5 20.5 trim 0 35.0 0 3.0 3.0 trim 0 42.0 0 9.0 9.0 trim 0 70.2 0 3.5 3.5 trim 0 90.0 0 9.2 9.2 trim 0 123.0 0 8.1 8.1


 trim 0 0.0 0 4.0 4.0 trim 0 8.0 0 4.0 4.0 trim 0 16.0 0 4.0 4.0 trim 0 24.0 0 4.0 4.0

Running SoX using the above outputs from a terminal:

Padding:  

    sox dinners.mp3 -c 1 testlongpad.mp3 pad 4.0@0.0 4.0@8.0 4.0@16.0 4.0@24.0

Trimming:  

    sox dinners.mp3 -c 1 testrim.mp3 trim 0 0.0 0 16.0 16.0

Pad and trim:

    sox dinners.mp3 -c 1 testlongpadtrim.mp3 pad 4.0@0.0 4.0@8.0 4.0@16.0 4.0@24.0 trim 0 0.0 0 4.0 4.0 trim 0 8.0 0 4.0 4.0 trim 0 16.0 0 4.0 4.0 trim 0 24.0 0 4.0 4.0

If S is my set of segments, then NS is everything else. In the approach above I am passing NS, and NS is getting removed from the audio.

What I want to achieve is still the same, but in a different way: I want to pass S so that only the portions of audio corresponding to S are retained.
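To make the relationship between S and NS concrete, here is a minimal sketch that derives NS from S and the total length of the audio. The function name `complement_segments` is hypothetical, and it assumes the segments lie within `[0, total_duration]` and that the total duration is known in seconds.

```python
def complement_segments(segments, total_duration):
    """Return the [start, end] gaps (NS) not covered by `segments` (S)."""
    gaps = []
    cursor = 0.0  # end of the region covered so far
    for start, end in sorted(segments):
        if start > cursor:
            gaps.append([cursor, start])
        cursor = max(cursor, end)  # also handles overlapping segments
    if cursor < total_duration:
        gaps.append([cursor, total_duration])
    return gaps


q = [[0.0, 4.0], [8.0, 12.0], [16.0, 20.0], [24.0, 28.0]]
print(complement_segments(q, 360.0))
# [[4.0, 8.0], [12.0, 16.0], [20.0, 24.0], [28.0, 360.0]]
```

With this, either list can be computed from the other, so the choice of passing S or NS becomes a matter of convenience.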

PS: My question is very specific; I am new to audio processing and unsure how to proceed. Kindly don't close the question as being too broad. I'd be happy to provide more details for clarification. Lastly, this is not a homework question; it is for a personal project.

Sample audio: https://www.dropbox.com/s/1p27nfwney42ka2/LAZY_SALON_-03-_Hot_Dinners.mp3?dl=0

Sample segments [[start, end], ...]: [[1.6, 8.3], [13.2, 33.7], [35.0,38.0], [42.0,51.0], [70.2,73.7], [90.0,99.2], [123.0,131.1]]

So when these timestamps are passed to sox/python together with the audio, everything in the audio except the portions inside the supplied segments should be silenced out.
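The desired behaviour can be illustrated without sox at all. The sketch below is plain Python, not a sox solution: it works on raw PCM bytes and assumes signed 16-bit mono samples, where zero bytes mean silence. The function name `silence_outside_segments` is hypothetical, and decoding the mp3 to PCM is not covered here.

```python
def silence_outside_segments(pcm, sample_rate, segments, sample_width=2):
    """Return a copy of mono PCM bytes with everything outside `segments` zeroed.

    Assumes signed samples (e.g. 16-bit), for which zero bytes are silence.
    """
    out = bytearray(len(pcm))  # all zeros, i.e. digital silence
    total_frames = len(pcm) // sample_width
    for start, end in segments:
        # Convert seconds to frame indices, clamped to the audio length.
        first = min(int(round(start * sample_rate)), total_frames)
        last = min(int(round(end * sample_rate)), total_frames)
        if last > first:
            lo, hi = first * sample_width, last * sample_width
            out[lo:hi] = pcm[lo:hi]  # copy only the retained region
    return bytes(out)


# Ten one-sample "seconds" at 1 Hz, each with value 1; keep [2, 4) and [6, 7).
pcm = b"\x01\x00" * 10
result = silence_outside_segments(pcm, 1, [[2, 4], [6, 7]])
print(len(result) == len(pcm))  # output keeps the original length
```

Because the output buffer has the same length as the input, a 6-minute input stays 6 minutes long, which matches the requirement stated in the question.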
