How to efficiently index into a 1D numpy array via slice ranges


Question

I have a big 1D array of data, and a starts array of indexes into that data where important things happened. I want an array of ranges, so that I get one window of length L for each starting point in starts. Bogus sample data:

import numpy as np

data = np.linspace(0,10,50)
starts = np.array([0,10,21])
length = 5

My instinct is to write something like

data[starts:starts+length]

But really, I need to turn starts into a 2D array of range "windows". Coming from functional languages, I think of it as a map from a list to a list of lists, like:

np.apply_along_axis(lambda i: np.arange(i,i+length), 0, starts)

But that won't work, because apply_along_axis passes 1-D slices of the array to the function rather than individual elements; along axis 0 of starts, the slice is the whole array.

You could do this:

pairs = np.vstack([starts, starts + length]).T
ranges = np.apply_along_axis(lambda p: np.arange(*p), 1, pairs)
data[ranges]

Or you can do it with a list comprehension:

data[np.array([np.arange(i,i+length) for i in starts])]

Or you can do it iteratively. (Bleh.)

Is there a concise, idiomatic way to slice into an array at certain start points like this? (Pardon the numpy newbie-ness.)

Recommended answer

data = np.linspace(0,10,50)
starts = np.array([0,10,21])
length = 5

For a NumPy-only way of doing this, you can use numpy.meshgrid() as described here:

http://docs.scipy.org/doc/numpy/reference/generation/numpy.meshgrid.html

As hpaulj pointed out in the comments, meshgrid isn't actually needed for this problem, since you can use array broadcasting:

http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
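To make the broadcasting step concrete, here is a minimal sketch of the shapes involved. The names offsets, column, grid_offsets and grid_starts are introduced only for illustration and are not from the original answer; the check at the end confirms that the broadcast sum matches the sum of the two meshgrid outputs.

```python
import numpy as np

starts = np.array([0, 10, 21])
length = 5

offsets = np.arange(length)       # shape (5,): 0, 1, 2, 3, 4
column = starts[:, np.newaxis]    # shape (3, 1): one start per row
indices = column + offsets        # broadcasts to shape (3, 5)

# meshgrid builds both (3, 5) grids explicitly; broadcasting skips that step.
grid_offsets, grid_starts = np.meshgrid(offsets, starts)
assert np.array_equal(indices, grid_offsets + grid_starts)
```

Each row of indices is one window: the row's start value plus every offset from 0 to length - 1.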

# indices = sum(np.meshgrid(np.arange(length), starts))

indices = np.arange(length) + starts[:, np.newaxis]
# array([[ 0,  1,  2,  3,  4],
#        [10, 11, 12, 13, 14],
#        [21, 22, 23, 24, 25]])
data[indices]

Returns

array([[ 0.        ,  0.20408163,  0.40816327,  0.6122449 ,  0.81632653],
       [ 2.04081633,  2.24489796,  2.44897959,  2.65306122,  2.85714286],
       [ 4.28571429,  4.48979592,  4.69387755,  4.89795918,  5.10204082]])
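As a possible alternative that is not part of the original answer, the same windows can be taken from a sliding-window view of data. This sketch assumes NumPy 1.20 or later, where numpy.lib.stride_tricks.sliding_window_view is available. The valid_starts filter is an added safeguard for starts whose window would run past the end of data, and the names all_windows and windows are illustrative only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

data = np.linspace(0, 10, 50)
starts = np.array([0, 10, 21])
length = 5

# Assumption: windows that would run past the end of data are dropped.
valid_starts = starts[starts + length <= data.shape[0]]

# all_windows[i] is a read-only view of data[i:i + length]; shape (46, 5).
all_windows = sliding_window_view(data, length)

# Indexing with an integer array copies the selected rows into a (3, 5) array.
windows = all_windows[valid_starts]
```

With the sample data, all three starts are valid, so windows holds the same values as data[indices] from the broadcasting approach.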
