Silence out regions of audio based on a list of time stamps, using sox and python


Problem description

I have an audio file.
I have a bunch of [start, end] time stamp segments.

WHAT I WANT TO ACHIEVE: Say the audio is 6:00 minutes long.
The segments I have are: [[0.0,4.0], [8.0,12.0], [16.0,20.0], [24.0,28.0]]

After I pass these two to sox + python, the output should be audio that is still 6 minutes long but has sound only in the time ranges given by the segments.

That is, I want to pass the time stamps and the original audio to sox + python so that it generates audio with everything silenced out except the portions corresponding to the passed segments.

I couldn't achieve the above, but I came somewhat close to the opposite. After days of googling I have this:

UPDATED, MORE CONCISE CODE + EXAMPLE:
A sox command template that takes padding and trimming arguments like this:

SOX__SILENCE = 'sox "{inputaudio}" -c 1 "{outputaudio}" {padding}{trimming}'

Random segments for testing:

# random segments:
A = [[0.0, 16.0]]
b = [[1.0, 2.0]]
z = [[1.6, 8.3], [13.2, 33.7], [35.0, 38.0], [42.0, 51.0], [70.2, 73.7], [90.0, 99.2], [123.0, 131.1]]
q = [[0.0, 4.0], [8.0, 12.0], [16.0, 20.0], [24.0, 28.0]]

A small python script to generate the padding and trimming arguments.

Padding:

def get_pad_pattern_from_timestamps(my_segments):
    # Build a sox "pad" argument string with one "duration@position" entry per segment.
    padding = 'pad'
    for segment in my_segments:
        # round() suppresses floating-point noise from the subtraction
        duration = str(round(segment[1] - segment[0], 6))
        padding = padding + ' ' + duration + '@' + str(segment[0])
    return padding

print(get_pad_pattern_from_timestamps(A))
print(get_pad_pattern_from_timestamps(b))
print(get_pad_pattern_from_timestamps(z))
print(get_pad_pattern_from_timestamps(q))

Output from the above:

pad 16.0@0.0
pad 1.0@1.0
pad 6.7@1.6 20.5@13.2 3.0@35.0 9.0@42.0 3.5@70.2 9.2@90.0 8.1@123.0
pad 4.0@0.0 4.0@8.0 4.0@16.0 4.0@24.0

Trimming:

def get_trimm_pattern_from_timestamps(my_segments):
    # Build one " trim ..." clause per segment; the leading space lets the
    # result be appended directly after the pad arguments.
    trimming = ''
    for segment in my_segments:
        # round() suppresses floating-point noise from the subtraction
        duration = str(round(segment[1] - segment[0], 6))
        trimming = trimming + ' trim 0 ' + str(segment[0]) + ' 0 ' + duration + ' ' + duration
    return trimming

print(get_trimm_pattern_from_timestamps(A))
print(get_trimm_pattern_from_timestamps(b))
print("\n")
print(get_trimm_pattern_from_timestamps(z))
print("\n")
print(get_trimm_pattern_from_timestamps(q))
print("\n")

Output for trimming:

 trim 0 0.0 0 16.0 16.0
 trim 0 1.0 0 1.0 1.0


 trim 0 1.6 0 6.7 6.7 trim 0 13.2 0 20.5 20.5 trim 0 35.0 0 3.0 3.0 trim 0 42.0 0 9.0 9.0 trim 0 70.2 0 3.5 3.5 trim 0 90.0 0 9.2 9.2 trim 0 123.0 0 8.1 8.1


 trim 0 0.0 0 4.0 4.0 trim 0 8.0 0 4.0 4.0 trim 0 16.0 0 4.0 4.0 trim 0 24.0 0 4.0 4.0

RUNNING SOX from a terminal using the above outputs:

Padding:  

    sox dinners.mp3 -c 1 testlongpad.mp3 pad 4.0@0.0 4.0@8.0 4.0@16.0 4.0@24.0

Trimming:  

    sox dinners.mp3 -c 1 testrim.mp3 trim 0 0.0 0 16.0 16.0

Pad and trim:

    sox dinners.mp3 -c 1 testlongpadtrim.mp3 pad 4.0@0.0 4.0@8.0 4.0@16.0 4.0@24.0 trim 0 0.0 0 4.0 4.0 trim 0 8.0 0 4.0 4.0 trim 0 16.0 0 4.0 4.0 trim 0 24.0 0 4.0 4.0
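
One way the command template and the two pattern generators might be wired together from Python is sketched below. The names `pad_pattern`, `trim_pattern` and `build_sox_command` are hypothetical shorthand for this example, not part of the original script. The sketch only builds the argument list; actually running it assumes sox is installed with MP3 support and that the file names contain no double quotes.

```python
import shlex

# Template taken from the question.
SOX__SILENCE = 'sox "{inputaudio}" -c 1 "{outputaudio}" {padding}{trimming}'


def pad_pattern(segments):
    """Hypothetical compact version of the pad generator."""
    parts = ['pad']
    for start, end in segments:
        parts.append('{}@{}'.format(round(end - start, 6), start))
    return ' '.join(parts)


def trim_pattern(segments):
    """Hypothetical compact version of the trim generator (keeps the leading space)."""
    pattern = ''
    for start, end in segments:
        duration = round(end - start, 6)
        pattern += ' trim 0 {} 0 {} {}'.format(start, duration, duration)
    return pattern


def build_sox_command(inputaudio, outputaudio, segments):
    """Fill the template and split it into an argument list."""
    command = SOX__SILENCE.format(
        inputaudio=inputaudio,
        outputaudio=outputaudio,
        padding=pad_pattern(segments),
        trimming=trim_pattern(segments),
    )
    return shlex.split(command)


q = [[0.0, 4.0], [8.0, 12.0], [16.0, 20.0], [24.0, 28.0]]
args = build_sox_command('dinners.mp3', 'testlongpadtrim.mp3', q)
print(' '.join(args))

# To execute the command (requires sox on the PATH):
# import subprocess
# subprocess.run(args, check=True)
```

Passing a list of arguments rather than a single shell string avoids a second round of shell quoting when the command is eventually executed.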

If S is my set of segments, then NS is everything else. In the approach above I'm passing NS, and NS is getting removed from the audio.

What I want to achieve is still the same, but in a different way: I want to pass S so that only the portions of audio corresponding to S are retained.

PS: My question is very specific. I am new to audio processing and unsure how to proceed, so kindly don't close the question as being too broad or something. I'd be happy to provide more details for clarification. Lastly, this is not a homework question. This is for a personal project.

Sample audio: https://www.dropbox.com/s/1p27nfwney42ka2/LAZY_SALON_-03-_Hot_Dinners.mp3?dl=0

Sample segments [[start, end], [,]]: [[1.6, 8.3], [13.2, 33.7], [35.0,38.0], [42.0,51.0], [70.2,73.7], [90.0,99.2], [123.0,131.1]]

So when these time stamps are passed to sox/python along with the audio, everything in the audio except the portions inside the supplied segments should be silenced out.
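
To make the target concrete, here is a minimal sketch of the desired result at the sample level, independent of sox. It assumes the audio has already been decoded into a plain list of samples with a known sample rate; the function name `silence_outside` and the toy data are hypothetical and only illustrate the intended behaviour, not a way to process the MP3 itself.

```python
def silence_outside(samples, sample_rate, segments):
    """Return a copy of `samples` in which everything outside `segments` is zero.

    `segments` is a list of [start, end] pairs in seconds. The output has the
    same length as the input, so the total duration is unchanged.
    """
    out = [0] * len(samples)
    for start, end in segments:
        # Convert seconds to sample indices and clamp them to the valid range.
        first = max(0, int(round(start * sample_rate)))
        last = min(len(samples), int(round(end * sample_rate)))
        if first < last:
            out[first:last] = samples[first:last]
    return out


# Toy example: 10 "samples" at 1 sample per second.
samples = list(range(1, 11))
kept = silence_outside(samples, 1, [[2.0, 4.0], [6.0, 7.0]])
print(kept)  # [0, 0, 3, 4, 0, 0, 7, 0, 0, 0]
```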

Recommended answer

I was able to implement this with a workaround.

See: creating a new list from a list, and sorting a list in python by grouping.

What I did was create a new list containing the regions between the segments and pass that to sox. At the moment, whatever I pass to sox gets removed, so I calculated the regions to be removed and passed those instead. It worked pretty well.
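
The answer does not include its code, so the following is only a sketch of how the regions between segments could be computed. The function name `complement_segments` and the explicit `total_duration` argument are assumptions made for this example, and the segments are assumed to lie within the audio.

```python
def complement_segments(segments, total_duration):
    """Return the [start, end] gaps of [0, total_duration] not covered by `segments`."""
    gaps = []
    cursor = 0.0  # end of the region covered so far
    for start, end in sorted(segments):
        if start > cursor:
            gaps.append([cursor, start])
        # max() also handles overlapping or nested segments
        cursor = max(cursor, end)
    if cursor < total_duration:
        gaps.append([cursor, total_duration])
    return gaps


# Segments from the question, with a 6-minute (360 s) file.
q = [[0.0, 4.0], [8.0, 12.0], [16.0, 20.0], [24.0, 28.0]]
print(complement_segments(q, 360.0))
# [[4.0, 8.0], [12.0, 16.0], [20.0, 24.0], [28.0, 360.0]]
```

The resulting list plays the role of NS in the question: it is what gets passed to the pad and trim generators in place of the segments to keep.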

The solution is still inverted, but I don't have to change anything in the sox command.

I won't accept my own answer. I'm hoping someone can come up with a solution that involves modifying the sox commands, without having to recalculate the segments like I did.
