NumPy-vectorized function to repeat blocks of consecutive elements

Problem description

NumPy has a repeat function that repeats each element of an array a given (per-element) number of times.
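For reference, here is a minimal sketch of that per-element behaviour. The array values and the name counts are illustrative only, not taken from the question.

```python
import numpy as np

a = np.array([10, 20, 30])
counts = np.array([2, 0, 3])  # illustrative per-element repeat counts

# Element a[i] appears counts[i] times in the result; a count of 0 drops it
repeated = np.repeat(a, counts)
print(repeated)  # [10 10 30 30 30]
```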

I want to implement a function that does something similar, but repeats variably sized blocks of consecutive elements rather than individual elements. Essentially, I want the following function:

import numpy as np

def repeat_blocks(a, sizes, repeats):
    b = []    
    start = 0
    for i, size in enumerate(sizes):
        end = start + size
        b.extend([a[start:end]] * repeats[i])
        start = end
    return np.concatenate(b)

For example, given

a = np.arange(20)
sizes = np.array([3, 5, 2, 6, 4])
repeats = np.array([2, 3, 2, 1, 3])

then

repeat_blocks(a, sizes, repeats)

returns

array([ 0,  1,  2, 
        0,  1,  2,

        3,  4,  5,  6,  7, 
        3,  4,  5,  6,  7, 
        3,  4,  5,  6,  7, 

        8,  9, 
        8,  9,

        10, 11, 12, 13, 14, 15,

        16, 17, 18, 19,
        16, 17, 18, 19,
        16, 17, 18, 19 ])

I want to push these loops into NumPy for the sake of performance. Is this possible? If so, how?

Recommended answer

Here's one approach using cumsum:

# Get the group ID of each output sequence: group i appears repeats[i] times
r1 = np.repeat(np.arange(len(sizes)), repeats)

# Get total size of output array, as needed to initialize output indexing array
N = (sizes*repeats).sum() # or np.dot(sizes, repeats)

# Initialize indexing array with ones as we need to setup incremental indexing
# within each group when cumulatively summed at the final stage. 
# Two steps here:
# 1. Within each group, we have multiple sequences, so at the start of each
# repeated sequence, step back by the length of the sequence preceding it.
id_ar = np.ones(N, dtype=int)
id_ar[0] = 0
insert_index = sizes[r1[:-1]].cumsum()
insert_val = (1-sizes)[r1[:-1]]

# 2. Where a new group begins, indexing should simply continue on to that
# group's first element. So, assign 1s there.
insert_val[r1[1:] != r1[:-1]] = 1

# Assign index-offsetting values
id_ar[insert_index] = insert_val

# Finally, index into the input array to get the block-repeated output
out = a[id_ar.cumsum()]
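
As a sanity check, the steps above can be wrapped in a function and compared against the loop-based version from the question. This is only a sketch: the names repeat_blocks_loop and repeat_blocks_vectorized are hypothetical, and it assumes non-empty input in which every entry of sizes and repeats is at least 1 (a zero in either array would break the offset bookkeeping).

```python
import numpy as np

def repeat_blocks_loop(a, sizes, repeats):
    # Reference implementation, following the loop from the question
    b = []
    start = 0
    for size, rep in zip(sizes, repeats):
        end = start + size
        b.extend([a[start:end]] * int(rep))
        start = end
    return np.concatenate(b)

def repeat_blocks_vectorized(a, sizes, repeats):
    # Hypothetical wrapper around the cumsum steps shown above.
    # Assumption: every entry of sizes and repeats is >= 1.
    sizes = np.asarray(sizes)
    repeats = np.asarray(repeats)
    r1 = np.repeat(np.arange(len(sizes)), repeats)  # group ID per output sequence
    N = np.dot(sizes, repeats)                      # total output length
    id_ar = np.ones(N, dtype=int)
    id_ar[0] = 0
    insert_index = sizes[r1[:-1]].cumsum()          # start of each later sequence
    insert_val = (1 - sizes)[r1[:-1]]               # jump back to the group start
    insert_val[r1[1:] != r1[:-1]] = 1               # or step into the next group
    id_ar[insert_index] = insert_val
    return a[id_ar.cumsum()]

a = np.arange(20)
sizes = np.array([3, 5, 2, 6, 4])
repeats = np.array([2, 3, 2, 1, 3])

expected = repeat_blocks_loop(a, sizes, repeats)
result = repeat_blocks_vectorized(a, sizes, repeats)
print(np.array_equal(result, expected))  # True
```

Only the small NumPy calls run at the Python level here, so the cost no longer grows with a Python-level loop over the blocks. Whether that is actually faster for a given input is worth measuring rather than assuming.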
