Selecting Random Windows from Multidimensional NumPy Array Rows
Question
I have a large array where each row is a time series and thus needs to stay in order.
I want to select a random window of a given size for each row.
>>> import numpy as np
>>> arr = np.array(range(42)).reshape(6, 7)
>>> arr
array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34],
       [35, 36, 37, 38, 39, 40, 41]])
>>> # What I want to do (one window per row; offsets are random):
>>> select_random_windows(arr, window_size=3)
array([[ 1,  2,  3],
       [11, 12, 13],
       [14, 15, 16],
       [22, 23, 24],
       [29, 30, 31],
       [38, 39, 40]])
What an ideal solution would look like to me:
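def select_random_windows(arr, window_size):
    # One start offset per row, drawn from the valid column range (upper bound is exclusive)
    offsets = np.random.randint(0, arr.shape[1] - window_size + 1, size=arr.shape[0])
    # Not valid NumPy: slice bounds must be scalars, not arrays
    return arr[:, offsets: offsets + window_size]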
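But unfortunately this does not work: a slice needs scalar bounds, not arrays. What does work is an explicit loop over the rows: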
def select_random_windows(arr, window_size):
    result = []
    # One start offset per row; +1 because randint's upper bound is exclusive
    offsets = np.random.randint(0, arr.shape[1] - window_size + 1, size=arr.shape[0])
    for row, offset in enumerate(offsets):
        result.append(arr[row][offset: offset + window_size])
    return np.array(result)
Sure, I could do the same with a list comprehension (and get a minimal speed boost), but I was wondering whether there is some smart, vectorized NumPy way to do this.
Recommended answer
Here's one leveraging np.lib.stride_tricks.as_strided:
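def random_windows_per_row_strided(arr, W=3):
    # One random start offset per row; the upper bound is exclusive
    idx = np.random.randint(0, arr.shape[1] - W + 1, arr.shape[0])
    strided = np.lib.stride_tricks.as_strided
    m, n = arr.shape
    s0, s1 = arr.strides
    # View of every length-W window in every row, shape (m, n-W+1, W); no data is copied
    windows = strided(arr, shape=(m, n - W + 1, W), strides=(s0, s1, s1))
    # Pick window idx[i] from row i; fancy indexing returns a copy of shape (m, W)
    return windows[np.arange(len(idx)), idx]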
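On NumPy 1.20 or later, the same window view can also be built with np.lib.stride_tricks.sliding_window_view, which works out the shape and strides itself and returns a read-only view. The following is a minimal sketch of that variant and is not part of the original answer; the function name random_windows_per_row_swv and the rng parameter are made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def random_windows_per_row_swv(arr, W=3, rng=None):
    """Pick one random length-W window from every row of a 2-D array."""
    # `rng` is an illustrative parameter so that results can be made reproducible.
    rng = np.random.default_rng() if rng is None else rng
    m, n = arr.shape
    # Read-only view of shape (m, n - W + 1, W); no data is copied here.
    windows = sliding_window_view(arr, W, axis=1)
    # One start offset per row; `high` is exclusive, so the last window stays reachable.
    idx = rng.integers(0, n - W + 1, size=m)
    # Fancy indexing picks window idx[i] from row i and returns a copy of shape (m, W).
    return windows[np.arange(m), idx]


arr = np.arange(42).reshape(6, 7)
out = random_windows_per_row_swv(arr, W=3, rng=np.random.default_rng(0))
print(out)
```

Each row of the result is a contiguous slice of the matching input row, so the time-series order within a row is preserved.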
Runtime test on a bigger array with 100,000 rows:
In [469]: arr = np.random.rand(100000,100)
# @Psidom's soln
In [470]: %timeit select_random_windows(arr, window_size=3)
100 loops, best of 3: 7.41 ms per loop
In [471]: %timeit random_windows_per_row_strided(arr, W=3)
100 loops, best of 3: 6.84 ms per loop
# @Psidom's soln
In [472]: %timeit select_random_windows(arr, window_size=30)
10 loops, best of 3: 26.8 ms per loop
In [473]: %timeit random_windows_per_row_strided(arr, W=30)
100 loops, best of 3: 9.65 ms per loop
# @Psidom's soln
In [474]: %timeit select_random_windows(arr, window_size=50)
10 loops, best of 3: 41.8 ms per loop
In [475]: %timeit random_windows_per_row_strided(arr, W=50)
100 loops, best of 3: 10 ms per loop
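The timings above depend on the machine and the NumPy version. A minimal harness for repeating the comparison outside IPython might look like the sketch below. It uses the standard-library timeit module and a smaller array so that it finishes quickly. The loop-based baseline is written for this example and is not necessarily the select_random_windows implementation timed above; the names windows_loop, windows_strided and best_time are hypothetical.

```python
import timeit

import numpy as np


def windows_loop(arr, W):
    # Baseline: slice one random window per row in a Python loop.
    offsets = np.random.randint(0, arr.shape[1] - W + 1, size=arr.shape[0])
    return np.array([arr[i, o:o + W] for i, o in enumerate(offsets)])


def windows_strided(arr, W):
    # Strided approach: view all windows, then pick one per row.
    idx = np.random.randint(0, arr.shape[1] - W + 1, size=arr.shape[0])
    m, n = arr.shape
    s0, s1 = arr.strides
    windows = np.lib.stride_tricks.as_strided(
        arr, shape=(m, n - W + 1, W), strides=(s0, s1, s1)
    )
    return windows[np.arange(m), idx]


def best_time(fn, arr, W, repeat=3, number=5):
    # Best per-call time in seconds over `repeat` runs of `number` calls each.
    return min(timeit.repeat(lambda: fn(arr, W), repeat=repeat, number=number)) / number


arr = np.random.rand(2000, 100)  # smaller than the array timed above, to keep the run short
for W in (3, 30, 50):
    t_loop = best_time(windows_loop, arr, W)
    t_strided = best_time(windows_strided, arr, W)
    print(f"W={W:2d}  loop: {t_loop * 1e3:.2f} ms  strided: {t_strided * 1e3:.2f} ms")
```

No output is shown here because the numbers vary from run to run.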