2D ID数组和1D权重的加权numpy bincount [英] weighted numpy bincount for 2D IDs array and 1D weights

查看:123
本文介绍了2D ID数组和1D权重的加权numpy bincount的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用numpy_indexed来应用矢量化的numpy bincount,如下所示:

I am using numpy_indexed for applying a vectorized numpy bincount, as follows:

import numpy as np
import numpy_indexed as npi
rowidx, colidx = np.indices(index_tri.shape)
(cols, rows), B = npi.count((index_tri.flatten(), rowidx.flatten()))

其中index_tri是以下矩阵:

index_tri = np.array([[ 0,  0,  0,  7,  1,  3],
       [ 1,  2,  2,  9,  8,  9],
       [ 3,  1,  1,  4,  9,  1],
       [ 5,  6,  6, 10, 10, 10],
       [ 7,  8,  9,  4,  3,  3],
       [ 3,  8,  6,  3,  8,  6],
       [ 4,  3,  3,  7,  8,  9],
       [10, 10, 10,  5,  6,  6],
       [ 4,  9,  1,  3,  1,  1],
       [ 9,  8,  9,  1,  2,  2]])

然后我将合并的值映射到以下初始化矩阵m的相应位置:

Then I map the binned values in the corresponding position of the following initialized matrix m:

m = np.zeros((10,11))
m 
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

m[rows, cols] = B
m
array([[3., 1., 0., 1., 0., 0., 0., 1., 0., 0., 0.],
       [0., 1., 2., 0., 0., 0., 0., 0., 1., 2., 0.],
       [0., 3., 0., 1., 1., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 1., 2., 0., 0., 0., 3.],
       [0., 0., 0., 2., 1., 0., 0., 1., 1., 1., 0.],
       [0., 0., 0., 2., 0., 0., 2., 0., 2., 0., 0.],
       [0., 0., 0., 2., 1., 0., 0., 1., 1., 1., 0.],
       [0., 0., 0., 0., 0., 1., 2., 0., 0., 0., 3.],
       [0., 3., 0., 1., 1., 0., 0., 0., 0., 1., 0.],
       [0., 1., 2., 0., 0., 0., 0., 0., 1., 2., 0.]])

但是,这认为index_tri中每列的每个值的权重为1.现在,如果我有一个weights数组,则在index_tri中每列中提供相应的权重值,而不是1:

However, this considers that the weight of each value in index_tri per column is 1. Now if I have a weights array, providing a corresponding weight value per column in index_tri instead of 1:

weights = np.array([0.7, 0.8, 1.5, 0.6, 0.5, 1.9])

如何应用加权二进制计数,以便我的输出矩阵m变为:

how to apply a weighted bincount so that my output matrix m becomes as follows:

array([[3., 0.5, 0., 1.9, 0., 0., 0., 0.6, 0., 0., 0.],
       [0., 0.7, 2.3, 0., 0., 0., 0., 0., 0.5, 2.5, 0.],
       [0., 4.2, 0., 0.7, 0.6, 0., 0., 0., 0., 0.5, 0.],
       [0., 0., 0., 0., 0., 0.7, 2.3, 0., 0., 0., 3.],
       [0., 0., 0., 2.4, 0.6, 0., 0., 0.7, 0.8, 1.5, 0.],
       [0., 0., 0., 2.3, 0., 0., 2.4, 0., 1.3, 0., 0.],
       [0., 0., 0., 2.3, 0.7, 0., 0., 0.6, 0.5, 1.9, 0.],
       [0., 0., 0., 0., 0., 0.6, 2.4, 0., 0., 0., 3.],
       [0., 3.9, 0., 0.6, 0.7, 0., 0., 0., 0., 0.8, 0.],
       [0., 0.6, 2.4, 0., 0., 0., 0., 0., 0.8, 2.2, 0.]])

有什么主意吗?

通过使用for循环和numpy bincount()我可以按以下方式解决它:

By using a for loop and the numpy bincount() I could solve it as follows:

for i in range(m.shape[0]):
   m[i, :] = np.bincount(index_tri[i, :], weights=weights, minlength=m.shape[1])

我正在尝试从此处此处,但是我无法弄清楚ix2D变量在第一个链接中所对应的含义.有人可以详细说明一下吗.

I am trying to adapt the vectorized provided solution from here and here respectively but I cannot figure out what the ix2D variable corresponds to in the first link. Could someone elaborate a bit if possible.

更新(解决方案):

基于下面的@Divakar解决方案,这是一个更新的版本,如果您的索引输入矩阵不能覆盖输出初始化矩阵的全部范围,则它会额外使用一个输入参数:

Based on the @Divakar's solution below, here is an updated version where it takes an extra input parameter in case that your indices input matrix does not cover the full range of the output initialized matrix:

    def bincount2D(id_ar_2D, weights_1D, sz=None):
        # Inputs : 2D id array, 1D weights array

        # Extent of bins per col
        if sz == None:
            n = id_ar_2D.max() + 1
            N = len(id_ar_2D)
        else:
            n = sz[1]
            N = sz[0]

        # add offsets to the original values to be used when we apply raveling later on
        id_ar_2D_offsetted = id_ar_2D + n * np.arange(N)[:, None]

        # Finally use bincount with those 2D bins as flattened and with
        # flattened b as weights. Reshaping is needed to add back into "a".
        ids = id_ar_2D_offsetted.ravel()
        W = np.tile(weights_1D, N)
        return np.bincount(ids, W, minlength=n * N).reshape(-1, n)

推荐答案

this post -

def bincount2D(id_ar_2D, weights_1D):
    # Inputs : 2D id array, 1D weights array
    
    # Extent of bins per col
    n = id_ar_2D.max()+1
    
    N = len(id_ar_2D)
    id_ar_2D_offsetted = id_ar_2D + n*np.arange(N)[:,None]
    
    # Finally use bincount with those 2D bins as flattened and with
    # flattened b as weights. Reshaping is needed to add back into "a".
    ids = id_ar_2D_offsetted.ravel()
    W = np.tile(weights_1D,N)
    return np.bincount(ids, W, minlength=n*N).reshape(-1,n)

out = bincount2D(index_tri, weights)

这篇关于2D ID数组和1D权重的加权numpy bincount的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆