NumPy: Selecting n points every m points

Question

If I have a numpy.ndarray that's, say, 300 points in size (1 x 300 for now), and I want to select 10 points out of every 30 points, how would I do that?

In other words: I want the first 10 points, then skip 20, then grab 10 more, then skip 20 again, and so on until the end of the array.

Recommended answer

To select 10 elements from each block of 30 elements, we can simply reshape the array to 2D and slice out the first 10 columns of each row -

a.reshape(-1,30)[:,:10]

The benefit is that the output is a view into the input, so it is virtually free and carries no extra memory overhead. A sample run to demonstrate this -

In [43]: np.random.seed(0)

In [44]: a = np.random.randint(0,9,(1,300))

In [48]: np.shares_memory(a,a.reshape(10,30)[:,:10])
Out[48]: True
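
One practical consequence of getting a view, sketched below under the assumption of a C-contiguous input: writing through the sliced result changes the original array. The variable names are illustrative only and are not part of the answer.

```python
import numpy as np

a = np.arange(300)                 # contiguous 1D input, values 0..299
view = a.reshape(-1, 30)[:, :10]   # first 10 of every 30, shape (10, 10)

# The slice owns no data of its own; it points into a's buffer.
assert np.shares_memory(a, view)

# Writing through the view is therefore visible in the original array.
view[1, 0] = -1                    # row 1, column 0 corresponds to a[30]
print(a[30])                       # -1
```

If the original array must stay untouched, take an explicit `.copy()` of the slice before modifying it.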

If you need a flattened version, use .ravel() -

a.reshape(-1,30)[:,:10].ravel()
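
Flattening the sliced result cannot stay a view, because the selected elements are not evenly spaced in memory, so `.ravel()` has to copy here. A minimal check, with illustrative variable names:

```python
import numpy as np

a = np.arange(300)
sliced = a.reshape(-1, 30)[:, :10]   # 2D view, rows are 30 elements apart
flat = sliced.ravel()                # 1D result, forces a copy

print(sliced.flags['C_CONTIGUOUS'])  # False
print(np.shares_memory(a, flat))     # False
print(flat[:12].tolist())            # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31]
```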

Timings -

In [38]: a = np.random.randint(0,9,(300))

# @sacul's soln
In [39]: %%timeit
    ...: msk = [True] * 10 + [False] * 20
    ...: out = a[np.tile(msk, len(a)//len(msk))]
100000 loops, best of 3: 7.6 µs per loop

# From this post
In [40]: %timeit a.reshape(-1,30)[:,:10].ravel()
1000000 loops, best of 3: 1.07 µs per loop

In [41]: a = np.random.randint(0,9,(3000000))

# @sacul's soln
In [42]: %%timeit
    ...: msk = [True] * 10 + [False] * 20
    ...: out = a[np.tile(msk, len(a)//len(msk))]
100 loops, best of 3: 3.66 ms per loop

# From this post
In [43]: %timeit a.reshape(-1,30)[:,:10].ravel()
100 loops, best of 3: 2.32 ms per loop

# If you are okay with `2D` output, it is virtually free
In [44]: %timeit a.reshape(-1,30)[:,:10]
1000000 loops, best of 3: 519 ns per loop
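
For readers without IPython, the comparison above could be reproduced roughly as follows with the standard `timeit` module. This is only a sketch: the function names `mask_based` and `reshape_based` are made up for illustration, and absolute timings will differ by machine and NumPy version.

```python
import timeit
import numpy as np

a = np.random.randint(0, 9, 3000)

def mask_based(a):
    # Boolean mask: keep 10, drop 20, tiled over the array length
    msk = [True] * 10 + [False] * 20
    return a[np.tile(msk, len(a) // len(msk))]

def reshape_based(a):
    # Reshape into rows of 30, keep the first 10 columns, flatten
    return a.reshape(-1, 30)[:, :10].ravel()

# Both approaches should select exactly the same elements
assert np.array_equal(mask_based(a), reshape_based(a))

for fn in (mask_based, reshape_based):
    # Best of 3 runs of 1000 calls each, reported per call
    t = min(timeit.repeat(lambda: fn(a), number=1000, repeat=3))
    print(f"{fn.__name__}: {t / 1000 * 1e6:.2f} µs per call")
```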


Generic case with 1D array

A. Number of elements being a multiple of the block length

For a 1D array a whose number of elements is a multiple of n, to select m elements from each block of n elements and get a 1D array output, we would have -

a.reshape(-1,n)[:,:m].ravel()

Note that the ravel() flattening step makes a copy here. So, if possible, keep the unflattened 2D version for memory efficiency.

Sample run -

In [59]: m,n = 2,5

In [60]: N = 25

In [61]: a = np.random.randint(0,9,(N))

In [62]: a
Out[62]: 
array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7, 7, 8, 1, 5, 8, 4,
       3, 0, 3])

# Select 2 elements off each block of 5 elements
In [63]: a.reshape(-1,n)[:,:m].ravel()
Out[63]: array([5, 0, 3, 5, 6, 8, 7, 7, 8, 4])
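
The one-liner assumes the length is an exact multiple of n; otherwise reshape raises an error. One way to make that precondition explicit is a small wrapper. The name `select_m_of_n` and its `flatten` parameter are hypothetical, not part of the answer or of NumPy.

```python
import numpy as np

def select_m_of_n(a, m, n, flatten=True):
    """Hypothetical helper: first m elements of every block of n."""
    a = np.asarray(a)
    if a.ndim != 1:
        raise ValueError("expected a 1D array")
    if n < 1 or not 0 <= m <= n:
        raise ValueError("need n >= 1 and 0 <= m <= n")
    if a.size % n != 0:
        raise ValueError("array length must be a multiple of n")
    out = a.reshape(-1, n)[:, :m]   # a view when `a` is contiguous
    return out.ravel() if flatten else out

a = np.array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7, 7, 8, 1, 5,
              8, 4, 3, 0, 3])
print(select_m_of_n(a, 2, 5))   # [5 0 3 5 6 8 7 7 8 4]
```

Passing `flatten=False` returns the 2D slice and so avoids the copy that `ravel()` makes.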

B. Generic number of elements

We would leverage np.lib.stride_tricks.as_strided, inspired by this post, to select m elements from each block of n elements -

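import numpy as np

def skipped_view(a, m, n):
    # Read-only 2D view: one row per block of n, keeping the first m columns.
    # The last row may extend past the end of `a` when a.size % n != 0.
    s = a.strides[0]
    strided = np.lib.stride_tricks.as_strided
    shp = ((a.size+n-1)//n,n)
    return strided(a,shape=shp,strides=(n*s,s), writeable=False)[:,:m]

def slice_m_everyn(a, m, n):
    a_slice2D = skipped_view(a,m,n)
    # Number of valid elements contributed by the trailing partial block
    extra = min(m,len(a)-n*(len(a)//n))
    # Total number of valid output elements
    L = m*(len(a)//n) + extra
    # ravel() copies; trimming to L drops anything read past the end of `a`
    return a_slice2D.ravel()[:L]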

Note that skipped_view gives us a view into the input array and, when the length is not a multiple of n, possibly into memory beyond the input array's buffer. After that, we flatten (which makes a copy) and slice to restrict the result to the desired output.

Sample run -

In [170]: np.random.seed(0)
     ...: a = np.random.randint(0,9,(16))

In [171]: a
Out[171]: array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7])

# Select 2 elements off each block of 5 elements
In [172]: slice_m_everyn(a, m=2, n=5)
Out[172]: array([5, 0, 3, 5, 6, 8, 7])

In [173]: np.random.seed(0)
     ...: a = np.random.randint(0,9,(19))

In [174]: a
Out[174]: array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7, 7, 8, 1])

# Select 2 elements off each block of 5 elements
In [175]: slice_m_everyn(a, m=2, n=5)
Out[175]: array([5, 0, 3, 5, 6, 8, 7, 7])
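
If reading past the end of the buffer is a concern, a plain boolean mask gives the same selection without as_strided. This is a different technique from the one above, sketched here with a hypothetical function name; it builds a temporary index array and has not been timed against the strided version.

```python
import numpy as np

def slice_m_everyn_mask(a, m, n):
    """Hypothetical alternative: keep positions whose offset within a block is < m."""
    a = np.asarray(a)
    keep = (np.arange(a.size) % n) < m   # True for the first m slots of each block
    return a[keep]                       # boolean indexing returns a copy

a16 = np.array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7])
a19 = np.array([5, 0, 3, 3, 7, 3, 5, 2, 4, 7, 6, 8, 8, 1, 6, 7, 7, 8, 1])

print(slice_m_everyn_mask(a16, m=2, n=5))   # [5 0 3 5 6 8 7]
print(slice_m_everyn_mask(a19, m=2, n=5))   # [5 0 3 5 6 8 7 7]
```

Both results match the sample runs above for the 16-element and 19-element arrays.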
