Expand time ranges into more steps of smaller increments with accurate labels using Python
Question
I have a text file with timestamps and labels like this:
0.000000 14.463912 tone
14.476425 16.891247 noise
16.891247 21.232923 not_music
21.232923 23.172289 not_music
23.172289 29.128018 not_music
Suppose I specify a step size of 1 second. I want this list to expand into time frames that are each 1 second long but still carry the nearest label. How do I break the time ranges into smaller steps while keeping accurate labels?
For example, if my step were 1 second, the first line would become about 14 lines like:
0.0 1.0 tone
1.0 2.0 tone
.
.
.
13.0 14.0 tone
[14.0, 14.46] and [14.47, 15.0]  # these fall in a grey zone; I don't know what to do
15.0 16.0 noise
So far I have managed to read in the text file and store its rows in a list like this:
my_segments = []
# each line of the file is "start<TAB>end<TAB>label"
with open('./data/annotate.txt') as f:
    for line in f:
        start, end, label = line.split("\t")
        my_segments.append((float(start), float(end), label.strip()))

# print the parsed (start, end, label) tuples
for segment in my_segments:
    print(segment)
I looked at https://stackoverflow.com/a/18265979/4932791 by @Jared, which details how to create a range between two numbers with a given step size using numpy, like so:
>>> numpy.arange(11, 17, 0.5)
array([ 11. , 11.5, 12. , 12.5, 13. , 13.5, 14. , 14.5, 15. ,
15.5, 16. , 16.5])
I am unable to figure out how to do something similar on a range of ranges.
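One way to apply `numpy.arange` to a range of ranges is to call it once per segment and clip the last piece to the segment's end. The sketch below is only an illustration: `split_segment` and the sample `segments` list are hypothetical names and data, not part of the question. Note that it does not keep a fixed global grid, so the final piece of each segment may be shorter than the step.

```python
import numpy as np

def split_segment(start, end, label, step):
    """Split one (start, end, label) range into pieces at most `step` long."""
    # Left edges: start, start + step, ... (all strictly below end).
    lefts = np.arange(start, end, step)
    # Right edges: one step further, clipped so they never pass the segment end.
    rights = np.minimum(lefts + step, end)
    return [(float(l), float(r), label) for l, r in zip(lefts, rights)]

# Hypothetical sample data, chosen so the float arithmetic is exact.
segments = [(0.0, 2.5, "tone"), (2.5, 4.0, "noise")]
pieces = [p for seg in segments for p in split_segment(*seg, step=1.0)]
for p in pieces:
    print(p)
```

With step sizes that are not exactly representable as floats, `numpy.arange` can produce one piece more or fewer than expected, so the result is worth checking at segment ends.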
The pseudocode/algorithm I managed to come up with is:
- step 1: take a step size.
- step 2: assign a left_variable and a right_variable that together span one step.
- step 3: move this step like a window over each range and check whether the step falls within the range. If it does, assign it the corresponding label.
- step 4: update left and right by one step.
- step 5: repeat from step 3 until the end of the file is reached.
I think that to handle edge cases, I should reduce the step size to 0.25 seconds or so and add a condition: if the current step overlaps a range by at least 40 or 50%, I assign that range's label to it.
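The windowing steps and the overlap condition described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `label_windows`, the `min_overlap` parameter, and the choice to return `None` for windows below the threshold are all hypothetical. Overlap is summed per label, so adjacent ranges with the same label count together.

```python
import math

def label_windows(segments, step, min_overlap=0.5):
    """Label fixed-size windows by the label covering most of each window.

    segments: list of (start, end, label) tuples.
    min_overlap: assumed threshold, as a fraction of the window length;
                 windows whose best label covers less than this get None.
    """
    total_end = max(end for _, end, _ in segments)
    n_windows = int(math.ceil(total_end / step))
    result = []
    for k in range(n_windows):
        # Multiply instead of repeatedly adding to avoid accumulated float error.
        left, right = k * step, (k + 1) * step
        overlaps = {}
        for start, end, label in segments:
            overlap = min(right, end) - max(left, start)
            if overlap > 0:
                overlaps[label] = overlaps.get(label, 0.0) + overlap
        best = max(overlaps, key=overlaps.get) if overlaps else None
        if best is not None and overlaps[best] < min_overlap * step:
            best = None  # grey zone: no label covers enough of the window
        result.append((left, right, best))
    return result

segments = [
    (0.000000, 14.463912, "tone"),
    (14.476425, 16.891247, "noise"),
    (16.891247, 21.232923, "not_music"),
    (21.232923, 23.172289, "not_music"),
    (23.172289, 29.128018, "not_music"),
]
windows = label_windows(segments, step=1.0, min_overlap=0.5)
for w in windows[13:16]:
    print(w)
```

The inner loop scans every segment for every window, which is fine for a small annotation file; for long files a running index into the sorted segments would avoid the repeated scan.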
Update: my non-working solution:
sliding_window = 0
#st,en = [0.0,1.0]
jumbo = []
for i in range(len(hold_segments)):
    if sliding_window > hold_segments[i][0] and sliding_window+1 < hold_segments[i][1]:
        jumbo.append((sliding_window, sliding_window+1, hold_segments[i][2]))
        sliding_window = sliding_window+1
        print(hold_segments[i][2])
Recommended answer
I hope the comments make it clear what the code does. It also works well for non-integer step sizes.
from __future__ import division
import numpy as np

my_segments = [
    (0, 14.46, "ringtone"),
    (14.46, 16.89, "noise"),
    (16.89, 21.23, "not_music"),
]

def expand(segments, stepsize):
    result = []
    levels = [x[0] for x in segments] + [segments[-1][1]]  # 0, 14.46, 16.89, 21.23
    i = 0  # tracks the index in segments that we need at the current step
    for step in np.arange(0, levels[-1], stepsize):
        # first check if the index needs to be updated
        # update when the next level will be reached at the next 'stepsize / 2'
        # (this effectively rounds to the nearest level)
        if i < len(levels) - 2 and (step + stepsize / 2) > levels[i+1]:
            i += 1
        # now append the values
        result.append((step, step + stepsize, segments[i][2]))
    return result

stepsize = 0.02
print(len(expand(my_segments, stepsize)))
print(my_segments[-1][1] / stepsize)
>>> 1062 # steps are rounded up
>>> 1061.5
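The same midpoint rounding idea can also be written without an explicit index variable by looking up each step's midpoint with `numpy.searchsorted`. The sketch below is an alternative under stated assumptions, not the answer's own code: `expand_vectorized` is a hypothetical name, and it assumes the segments are sorted and contiguous (a gap is assigned to the preceding segment). It may differ from `expand` when a midpoint falls exactly on a boundary.

```python
import numpy as np

def expand_vectorized(segments, stepsize):
    """Label each step with the segment that contains the step's midpoint."""
    starts = np.array([s[0] for s in segments], dtype=float)
    labels = [s[2] for s in segments]
    lefts = np.arange(0, segments[-1][1], stepsize)
    mids = lefts + stepsize / 2
    # Index of the last segment whose start is <= the midpoint.
    idx = np.searchsorted(starts, mids, side="right") - 1
    # Midpoints before the first start fall back to the first segment.
    idx = np.clip(idx, 0, len(segments) - 1)
    return [(float(l), float(l + stepsize), labels[i]) for l, i in zip(lefts, idx)]

my_segments = [
    (0, 14.46, "ringtone"),
    (14.46, 16.89, "noise"),
    (16.89, 21.23, "not_music"),
]
steps = expand_vectorized(my_segments, 1.0)
for row in steps[13:18]:
    print(row)
```

Unlike the loop, this lookup can skip over several short segments within a single step, which matters when the step size is larger than some segments.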