Calculate partitioned sum efficiently with CuPy or NumPy

Question

I have a very long array* of length L (let's call it values) that I want to sum over, and a sorted 1D array of the same length L whose entries take N distinct integer values (0 to N-1 in the loop below), which I use to partition the original array; let's call this array labels.

What I'm currently doing is this (module being either cupy or numpy):

result = module.empty(N)
for i in range(N):
    result[i] = values[labels == i].sum()

But this can't be the most efficient way of doing it: each iteration scans all L elements, so the loop costs O(N·L) overall. It should be possible to get rid of the for loop, but how? Since labels is sorted, I could easily determine the break points and use those indices as start/stop points, but I don't see how this solves the for loop problem.
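The break-point idea mentioned above can be sketched in NumPy with np.add.reduceat, which sums each slice between consecutive start indices in one vectorized call. This is only an illustration under stated assumptions: labels is sorted and 1D, the function name partitioned_sum_sorted is made up for this example, and whether an equivalent reduceat is available in CuPy is not assumed here.

```python
import numpy as np

def partitioned_sum_sorted(values, labels):
    """Sum `values` within each run of equal entries in sorted `labels`.

    Returns (present_labels, sums); labels that never occur are not listed.
    """
    values = np.asarray(values)
    labels = np.asarray(labels)
    if labels.size == 0:
        return labels[:0], values[:0]
    # A new segment starts at index 0 and wherever the label changes.
    change = np.flatnonzero(labels[1:] != labels[:-1]) + 1
    starts = np.concatenate(([0], change))
    # reduceat sums values[starts[k]:starts[k+1]]; the last segment runs to the end.
    sums = np.add.reduceat(values, starts)
    return labels[starts], sums

values = np.array([1, 2, 3, 4, 5, 6])
labels = np.array([0, 0, 1, 3, 3, 3])   # label 2 never occurs
present, sums = partitioned_sum_sorted(values, labels)

# Scatter into a dense length-N result so that result[i] matches the loop version.
N = 4
result = np.zeros(N, dtype=sums.dtype)
result[present] = sums
print(result.tolist())
```

The only temporaries are the index array of segment starts and the per-segment sums, so nothing of size N x L is created.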

Note that I would like to avoid creating an array of size N x L along the way, if possible, since L is very large.
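One way to stay within that memory constraint, sketched here as an alternative rather than as the approach of the answer below, is np.bincount with the weights argument: it accumulates values[j] into bin labels[j] and allocates only a length-N output. The sketch assumes values is 1D and labels holds non-negative integers; the helper name partitioned_sum_bincount is hypothetical. To the best of my knowledge CuPy offers a bincount with the same arguments, but that port is untested here.

```python
import numpy as np

def partitioned_sum_bincount(values, labels, n_labels):
    """Return an array r of length >= n_labels with r[i] = values[labels == i].sum()."""
    # weights=values turns the per-label count into a per-label sum.
    # The result is float64 regardless of the dtype of `values`.
    return np.bincount(labels, weights=values, minlength=n_labels)

values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
labels = np.array([0, 0, 1, 3, 3, 3])   # label 2 never occurs
result = partitioned_sum_bincount(values, labels, 4)
print(result.tolist())
```

Unlike the break-point approach, this does not require labels to be sorted, and labels that never occur get a sum of zero.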

I'm working in cupy, but any numpy solution is welcome too and could probably be ported. Within cupy, this looks like a case for a ReductionKernel, but I don't quite see how to write it.

* In my case, values is 1D, but I assume the solution wouldn't depend on this.

Recommended answer

You are describing a groupby sum aggregation. You could write a CuPy RawKernel for this, but it would be much easier to use the existing groupby aggregations implemented in cuDF, the GPU dataframe library. CuPy and cuDF can interoperate without requiring you to copy the data. If you call .values on the resulting cuDF Series, it will give you a CuPy array.

If you went back to the CPU, you could do the same thing with pandas.
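A minimal CPU sketch of that pandas route, mirroring the cuDF call used in this answer. The small input arrays and the reindex step are additions for illustration: groupby only reports labels that actually occur, so reindexing over range(N) fills the missing ones with zero.

```python
import numpy as np
import pandas as pd

values = np.array([1, 2, 3, 4, 5, 6])
labels = np.array([0, 0, 1, 3, 3, 3])   # label 2 never occurs
N = 4

df = pd.DataFrame({"values": values, "labels": labels})
# One sum per label that is present, indexed by label.
sums = df.groupby("labels")["values"].sum()
# Make the result dense so that result[i] corresponds to label i.
result = sums.reindex(range(N), fill_value=0).to_numpy()
print(result.tolist())
```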

import cupy as cp
import pandas as pd

N = 100
values = cp.random.randint(0, N, 1000)
labels = cp.sort(cp.random.randint(0, N, 1000))

L = len(values)
result = cp.empty(N)  # one sum per label
for i in range(N):
    result[i] = values[labels == i].sum()
    
result[:5]
array([547., 454., 402., 601., 668.])

import cudf

df = cudf.DataFrame({"values": values, "labels": labels})
df.groupby(["labels"])["values"].sum().values[:5]
array([547, 454, 402, 601, 668])
