How to efficiently select multiple slices from an array?
Question
Given an array
d = np.random.randn(100)
and an array of indices
i = np.random.random_integers(low=3, high=d.size - 5, size=20)
how can I efficiently create a 2D array r with
r.shape = (20, 8)
such that, for all j = 0 .. 19,
r[j] = d[i[j]-3:i[j]+5]
In my case the arrays are quite large (about 200000 elements instead of 100 and 20), so something fast would be useful.
Recommended answer
You can create a windowed view of your data, i.e. a (93, 8) array wd in which item wd[k, j] is item d[k+j] of your original array:
>>> from numpy.lib.stride_tricks import as_strided
>>> wd = as_strided(d, shape=(len(d)-8+1, 8), strides=d.strides*2)
You can now extract your desired slices as:
>>> r = wd[i-3]
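The two steps above can be put together into a self-contained check. This is a minimal sketch, not part of the original answer: the seeded generator `rng`, the `endpoint=True` argument, and the `writeable=False` flag are assumptions added here for reproducibility and safety, and `expected` is a hypothetical name for the slice-by-slice reference result.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Seeded generator (an assumption, used only so the sketch is reproducible)
rng = np.random.default_rng(0)
d = rng.standard_normal(100)

# Indices drawn from [3, d.size - 5] inclusive, the same bounds as in the question
i = rng.integers(low=3, high=d.size - 5, size=20, endpoint=True)

# Windowed view: wd[k, j] is d[k + j]. No data is copied at this point.
# writeable=False guards against accidental writes through overlapping windows.
wd = as_strided(d, shape=(len(d) - 8 + 1, 8), strides=d.strides * 2,
                writeable=False)

# Fancy indexing selects one window per index and copies the selected rows
r = wd[i - 3]

# Reference result built slice by slice, as defined in the question
expected = np.stack([d[k - 3:k + 5] for k in i])

assert r.shape == (20, 8)
assert np.array_equal(r, expected)
```

The same result could presumably also be obtained with a broadcast index such as `d[i[:, None] + np.arange(-3, 5)]`, at the cost of building a (20, 8) index array first.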
Note that wd is simply a view of your original data, so it takes no extra memory. The moment you extract r with arbitrary indices, the data is copied. So depending on how you want to use your r array, you may want to delay that as much as possible, or maybe even avoid it altogether: you can always access what would be row r[j] as wd[i[j]-3] without triggering a copy.
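The difference between the view and the copy can be checked with `np.shares_memory`. The sketch below swaps in `sliding_window_view` for `as_strided`; this is not the call used in the answer, but an alternative that builds the same (93, 8) window and returns it read-only, assuming NumPy 1.20 or later. The small arrays `d` and `i` are illustrative values, not the question's data.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Illustrative data (an assumption): d[k] == k makes the windows easy to read
d = np.arange(100, dtype=float)
i = np.array([3, 10, 95])

# Same (93, 8) window as the as_strided call, returned as a read-only view
wd = sliding_window_view(d, 8)

# Basic (integer) indexing keeps a view: this is what r[1] would contain
row = wd[i[1] - 3]

# Fancy (array) indexing allocates a new array and copies the rows
r = wd[i - 3]

print(np.shares_memory(row, d))  # the single row still points into d
print(np.shares_memory(r, d))    # r owns its own data
```

With around 200000 indices and a window of 8, the copy made for `r` holds about 1.6 million elements, so reading rows through `wd` on demand avoids that allocation when only a few rows are needed at a time.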