提取数值大于阈值的numpy数组的子数组 [英] Extract subarrays of numpy array whose values are above a threshold

查看:163
本文介绍了提取数值大于阈值的numpy数组的子数组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个声音信号,以numpy数组的形式导入,我想将其切成numpy数组的大块.但是,我希望这些块仅包含高于阈值的元素.例如:

I have a sound signal, imported as a numpy array and I want to cut it into chunks of numpy arrays. However, I want the chunks to contain only elements above a threshold. For example:

threshold = 3
signal = [1,2,6,7,8,1,1,2,5,6,7]

应输出两个数组

vec1 = [6,7,8]
vec2 = [5,6,7]

好的,以上是列表,但您明白我的意思.

Ok, the above are lists, but you get my point.

这是我到目前为止所尝试的,但这只会杀死我的RAM

Here is what I tried so far, but this just kills my RAM

def slice_raw_audio(audio_signal, threshold=5000):

    signal_slice, chunks = [], []

    for idx in range(0, audio_signal.shape[0], 1000):
        while audio_signal[idx] > threshold:
            signal_slice.append(audio_signal[idx])
         chunks.append(signal_slice)
    return chunks

推荐答案

这是一种方法-

def split_above_threshold(signal, threshold):
    mask = np.concatenate(([False], signal > threshold, [False] ))
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    return [signal[idx[i]:idx[i+1]] for i in range(0,len(idx),2)]

样品运行-

In [48]: threshold = 3
    ...: signal = np.array([1,1,7,1,2,6,7,8,1,1,2,5,6,7,2,8,7,2])
    ...: 

In [49]: split_above_threshold(signal, threshold)
Out[49]: [array([7]), array([6, 7, 8]), array([5, 6, 7]), array([8, 7])]

运行时测试

其他方法-

Runtime test

Other approaches -

# @Psidom's soln
def arange_diff(signal, threshold):
    above_th = signal > threshold
    index, values = np.arange(signal.size)[above_th], signal[above_th]
    return np.split(values, np.where(np.diff(index) > 1)[0]+1)

# @Kasramvd's soln   
def split_diff_step(signal, threshold):   
    return np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[1::2]

时间-

In [67]: signal = np.random.randint(0,9,(100000))

In [68]: threshold = 3

# @Kasramvd's soln 
In [69]: %timeit split_diff_step(signal, threshold)
10 loops, best of 3: 39.8 ms per loop

# @Psidom's soln
In [70]: %timeit arange_diff(signal, threshold)
10 loops, best of 3: 20.5 ms per loop

In [71]: %timeit split_above_threshold(signal, threshold)
100 loops, best of 3: 8.22 ms per loop

这篇关于提取数值大于阈值的numpy数组的子数组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆