Sum array by number in numpy

Question

Assuming I have a numpy array like [1,2,3,4,5,6] and another array [0,0,1,2,2,1], I want to sum the items in the first array by group (the group labels come from the second array) and obtain one result per group, in group-number order (in this case the result would be [3, 9, 9]). How do I do this in numpy?

Recommended answer

There's more than one way to do this, but here's one way:

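import numpy as np
data = np.arange(1, 7)                 # [1, 2, 3, 4, 5, 6]
groups = np.array([0,0,1,2,2,1])       # group label for each item in data

unique_groups = np.unique(groups)      # sorted unique group labels
sums = []
for group in unique_groups:
    # The boolean mask selects the items that belong to this group
    sums.append(data[groups == group].sum())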
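As a quick check against the question's example, the loop above can be wrapped in a small function. This is only a sketch: the name `group_sums` is hypothetical (it does not appear in the original answer), and the result is returned as an array rather than a list for convenience.

```python
import numpy as np

def group_sums(data, groups):
    """Sum `data` per label in `groups`, ordered by sorted unique label.

    Hypothetical helper for illustration; same logic as the loop above.
    """
    data = np.asarray(data)
    groups = np.asarray(groups)
    unique_groups = np.unique(groups)  # sorted unique group labels
    # One boolean mask per group; memory use stays at one mask at a time
    return np.array([data[groups == g].sum() for g in unique_groups])

result = group_sums([1, 2, 3, 4, 5, 6], [0, 0, 1, 2, 2, 1])
print(result)  # [3 9 9]
```

Because it iterates over `np.unique(groups)`, this version should also work when the labels are not consecutive integers.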

You can vectorize things so that there's no for loop at all, but I'd recommend against it. It becomes unreadable, and it requires a couple of 2D temporary arrays, which could take large amounts of memory if you have a lot of data.

Here's one way you could entirely vectorize. Keep in mind that this may (and likely will) be slower than the version above. (And there may be a better way to vectorize this, but it's late and I'm tired, so this is just the first thing to pop into my head...)

However, keep in mind that this is a bad example... You're really better off (both in terms of speed and readability) with the loop above...

import numpy as np
data = np.arange(1, 7)
groups = np.array([0,0,1,2,2,1])

unique_groups = np.unique(groups)

# Forgive the bad naming here...
# I can't think of more descriptive variable names at the moment...
x, y = np.meshgrid(groups, unique_groups)
data_stack = np.tile(data, (unique_groups.size, 1))

data_in_group = np.zeros_like(data_stack)
data_in_group[x==y] = data_stack[x==y]

sums = data_in_group.sum(axis=1)
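For comparison, here is a sketch of two other vectorized routes that avoid the 2D temporaries. These are not part of the original answer; they are swapped-in alternatives using `np.bincount` and `np.add.at`, and both assume the group labels are non-negative integers.

```python
import numpy as np

data = np.arange(1, 7)
groups = np.array([0, 0, 1, 2, 2, 1])

# np.bincount with weights adds each data item into the bin given by its label.
# Assumption: labels are non-negative integers. The result is float64.
sums_bincount = np.bincount(groups, weights=data)

# np.add.at accumulates in place and handles repeated indices correctly,
# so the output keeps the dtype of `data`.
sums_add_at = np.zeros(groups.max() + 1, dtype=data.dtype)
np.add.at(sums_add_at, groups, data)

print(sums_bincount)  # [3. 9. 9.]
print(sums_add_at)    # [3 9 9]
```

Note that both variants return one entry for every integer from 0 to the largest label, so a label that never occurs shows up as a sum of 0, whereas the loop over `np.unique(groups)` skips it.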
