Numpy: conditional sum

Problem description

I have the following numpy array:

import numpy as np
arr = np.array([[1,2,3,4,2000],
                [5,6,7,8,2000],
                [9,0,1,2,2001],
                [3,4,5,6,2001],
                [7,8,9,0,2002],
                [1,2,3,4,2002],
                [5,6,7,8,2003],
                [9,0,1,2,2003]
              ])

I understand that np.sum(arr, axis=0) gives the result:

array([   40,    28,    36,    34, 16012])

What I would like to do (without a for loop) is sum the columns, grouped by the value of the last column, so that the result is:

array([[   6,    8,   10,   12, 4000],
       [  12,    4,    6,    8, 4002],
       [   8,   10,   12,    4, 4004],
       [  14,    6,    8,   10, 4006]])

I realize that it may be a stretch to do without a loop, but hoping for the best...

If a for loop must be used, then how would that work?

I tried np.sum(arr[:, 4]==2000, axis=0) (where I would substitute 2000 with the variable from the for loop), but it gave a result of 2.

Recommended answer

You can do this in pure numpy using np.diff and np.add.reduceat. np.diff gives the differences between consecutive entries of the rightmost column, which are nonzero exactly where that column changes:

d = np.diff(arr[:, -1])

np.where will convert the nonzero entries of d into the integer indices that np.add.reduceat expects:

d = np.where(d)[0]
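To make these two steps concrete, here is a small sketch that runs them on the last column alone. The names `years` and `boundaries` are illustrative only and do not appear in the original answer.

```python
import numpy as np

# Last column of the example array (the grouping key).
years = np.array([2000, 2000, 2001, 2001, 2002, 2002, 2003, 2003])

# Differences between neighbours: nonzero wherever the key changes.
d = np.diff(years)              # values: 0, 1, 0, 1, 0, 1, 0

# np.where treats nonzero entries as True and returns their positions.
boundaries = np.where(d)[0]     # values: 1, 3, 5
```

Each entry of `boundaries` is the index of the last row of a group, which is why the next step shifts everything by one.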

reduceat will also expect to see a zero index, and everything needs to be shifted by one:

indices = np.r_[0, d + 1]

Using np.r_ here is a bit more convenient than np.concatenate because it allows scalars. The sum then becomes:

result = np.add.reduceat(arr, indices, axis=0)
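As a rough illustration of what np.add.reduceat does with these start indices, the sketch below applies it to a simple 1-D array. The array `x` and the hard-coded `boundaries` are made up for the example; they mirror the group boundaries found in the question's data.

```python
import numpy as np

x = np.arange(8)                    # values: 0, 1, 2, 3, 4, 5, 6, 7
boundaries = np.array([1, 3, 5])    # last index of every group but the final one

# Prepend 0 and shift by one to get the first index of each group.
indices = np.r_[0, boundaries + 1]              # values: 0, 2, 4, 6
# np.concatenate gives the same result, but the scalar must be wrapped.
same = np.concatenate(([0], boundaries + 1))

# reduceat sums x[indices[i]:indices[i + 1]]; the last slice runs to the end.
sums = np.add.reduceat(x, indices)  # 0+1, 2+3, 4+5, 6+7 -> 1, 5, 9, 13
```

With axis=0 on a 2-D array, the same slicing is applied to blocks of rows, so each output row is the sum of one group.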

This can be combined into a one-liner of course:

>>> result = np.add.reduceat(arr, np.r_[0, np.where(np.diff(arr[:, -1]))[0] + 1], axis=0)
>>> result
array([[   6,    8,   10,   12, 4000],
       [  12,    4,    6,    8, 4002],
       [   8,   10,   12,    4, 4004],
       [  14,    6,    8,   10, 4006]])
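For the loop-based variant asked about in the question, one possible sketch uses np.unique to find the distinct keys and a boolean mask to select the rows of each group. This is not part of the original answer; the names `mask`, `keys` and `looped` are illustrative. It also shows why the attempt in the question returned 2: summing the mask itself only counts the matching rows.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 2000],
                [5, 6, 7, 8, 2000],
                [9, 0, 1, 2, 2001],
                [3, 4, 5, 6, 2001],
                [7, 8, 9, 0, 2002],
                [1, 2, 3, 4, 2002],
                [5, 6, 7, 8, 2003],
                [9, 0, 1, 2, 2003]])

mask = arr[:, -1] == 2000
count = np.sum(mask)               # 2: the number of rows that match
group_sum = arr[mask].sum(axis=0)  # values: 6, 8, 10, 12, 4000

# One masked sum per distinct value of the last column.
keys = np.unique(arr[:, -1])
looped = np.array([arr[arr[:, -1] == k].sum(axis=0) for k in keys])
```

The loop does not care about row order, whereas the np.diff approach relies on rows with the same last-column value being adjacent, as they are in this example.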
