NumPy - Vectorizing bincount over 2D array column wise with weights


Problem description

I've been looking at the solutions here and here, but I fail to see how to apply them to my structures.

I have 3 arrays: an (M, N) array of zeros, a (P,) array of indices (some repeated), and a (P, N) array of values.

I can accomplish it with a loop:

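# a: (M, N) array of zeros (accumulator)
# b: (P, N) array of values (weights)
# ix: (P,) array of row indices into a, one per row of b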
for i in range(N):
    a[:, i] += np.bincount(ix, weights=b[:, i], minlength=M)

I haven't seen any examples that use indices in this manner or with the weights keyword. I understand that I need to bring everything into a 1D array to vectorize it, but I am struggling to figure out how to accomplish that.

Recommended answer

The basic idea is the same as discussed in some detail in those linked posts: create a 2D array of bins, with a separate offset for each "1D data" slice to be processed (each column, in this case), so that every column gets its own range of bins. With that in mind, we end up with something like this:

# Extent of bins per column (largest index + 1, so n <= M)
n = ix.max() + 1

# Number of columns
N = b.shape[1]

# 2D bins for per-column processing: column j is offset by j*n
ix2D = ix[:, None] + n * np.arange(N)

# Finally, use bincount with those 2D bins flattened and with
# flattened b as weights. Reshaping is needed to add back into "a".
a[:n] += np.bincount(ix2D.ravel(), weights=b.ravel(), minlength=n*N).reshape(N, -1).T
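
As a quick sanity check, the offset-bin approach can be compared against the per-column loop from the question. The sketch below is illustrative only: the function names, array sizes and random data are made up, and it uses M rather than ix.max() + 1 as the bin extent so that the result comes back with the full (M, N) shape.

```python
import numpy as np

def bincount_columns_loop(ix, b, M):
    """Reference version: one np.bincount call per column of b."""
    out = np.zeros((M, b.shape[1]))
    for i in range(b.shape[1]):
        out[:, i] += np.bincount(ix, weights=b[:, i], minlength=M)
    return out

def bincount_columns_vectorized(ix, b, M):
    """Single np.bincount call using per-column bin offsets."""
    N = b.shape[1]
    # Column j uses bins j*M .. j*M + M - 1, so columns never collide.
    ix2D = ix[:, None] + M * np.arange(N)
    flat = np.bincount(ix2D.ravel(), weights=b.ravel(), minlength=M * N)
    # flat is laid out column block by column block: reshape to (N, M), then transpose.
    return flat.reshape(N, M).T

# Hypothetical sizes and data, for illustration only
M, N, P = 5, 3, 8
rng = np.random.default_rng(0)
ix = rng.integers(0, M, size=P)   # (P,) indices in [0, M), repeats allowed
b = rng.random((P, N))            # (P, N) values

expected = bincount_columns_loop(ix, b, M)
result = bincount_columns_vectorized(ix, b, M)
assert np.allclose(result, expected)
```

Using M as the extent allocates a few more bins than ix.max() + 1 when the largest index is below M - 1, but it avoids the a[:n] slice when adding the result back.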
