Numba - 如何并行填充二维数组 [英] Numba - How to fill 2D array in parallel

查看：111 发布时间：2021/6/10 19:05:34 python vectorization numba

本文介绍了Numba - 如何并行填充二维数组的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我有一个对 float64(x,y) 上的二维矩阵进行运算的函数.基本概念:对于每个行的组合(行数选择 2)，计算减法后正值的数量(行 1 - 行 2).在 int64(y,y) 的 2Dmatrix 中，如果值高于某个阈值，则将该值存储在索引 [row1,row2] 中，如果低于某个阈值，则将该值存储在索引 [row2,row1] 中.

I have a function that operates on a 2D matrix on float64(x,y). Basic concept: for each combination of rows (no. rows choose 2) count the number of positiv values after subtraction (row1 - row2). In a 2Dmatrix of int64(y,y) store this value in index [row1,row2] if value is above a certain threshold and [row2,row1] if below.

我已经实现并用@njit(parallel=False) 装饰它，效果很好@njit(parallel=True) 似乎没有加速.为了加快整个过程，我查看了@guvectorize，它也有效.但是，在这种情况下，我也无法弄清楚如何将 @guvectorize 与 parallel true 一起使用.

I've implemented that and decorated it with @njit(parallel=False), that works fine @njit(parallel=True) seems to give no speedup. Trying to speed up the whole thing I had a look at @guvectorize, that works as well. However I'm not able to figure out how to use @guvectorize with parallel true in this case either.

我查看了 numba guvectorize target='parallel' 慢比 target='cpu' ，其中解决方案是使用 @vecorize，但我无法将解决方案转移到我的问题，因此我现在正在寻求帮助 :)

I had a look at numba guvectorize target='parallel' slower than target='cpu' , where the solution was to use @vecorize instead, but I can not transfer the solution to my problem, therefore I am now seeking help :)

基本的jitted和guvectorized实现

Basic jitted and guvectorized implementation

import numpy as np
from numba import jit, guvectorize, prange
import timeit

@jit(parallel=False)
def check_pairs_sg(raw_data):
    # 2D array to be filled
    result = np.full((len(raw_data), len(raw_data)), -1)

    # Iterate over all possible gene combinations
    for r1 in range(0, len(raw_data)):
        for r2 in range(r1+1, len(raw_data)):
            diff = np.subtract(raw_data[:, r1], raw_data[:, r2])

            num_pos = len(np.where(diff > 0)[0])

            # Arbitrary check to illustrate
            if num_pos >= 5: 
               result[r1,r2] = num_pos
            else:
               result[r2,r1] = num_pos

    return result

@jit(parallel=True)
def check_pairs_multi(raw_data):
    # 2D array to be filled
    result = np.full((len(raw_data), len(raw_data)), -1)

    # Iterate over all possible gene combinations
    for r1 in range(0, len(raw_data)):
        for r2 in prange(r1+1, len(raw_data)):
            diff = np.subtract(raw_data[:, r1], raw_data[:, r2])

            num_pos = len(np.where(diff > 0)[0])

            # Arbitrary check to illustrate
            if num_pos >= 5: 
               result[r1,r2] = num_pos
            else:
               result[r2,r1] = num_pos

    return result

@guvectorize(["void(float64[:,:], int64[:,:])"],
             "(n,m)->(m,m)", target='cpu')
def check_pairs_guvec_sg(raw_data, result):
    for r1 in range(0, len(result)):
        for r2 in range(r1+1, len(result)):
            diff = np.subtract(raw_data[:, r1], raw_data[:, r2])

            num_pos = len(np.where(diff > 0)[0])

            # Arbitrary check to illustrate
            if num_pos >= 5: 
               result[r1,r2] = num_pos
            else:
               result[r2,r1] = num_pos

@guvectorize(["void(float64[:,:], int64[:,:])"],
             "(n,m)->(m,m)", target='parallel')
def check_pairs_guvec_multi(raw_data, result):
    for r1 in range(0, len(result)):
        for r2 in range(r1+1, len(result)):
            diff = np.subtract(raw_data[:, r1], raw_data[:, r2])

            num_pos = len(np.where(diff > 0)[0])

            # Arbitrary check to illustrate
            if num_pos >= 5: 
               result[r1,r2] = num_pos
            else:
               result[r2,r1] = num_pos

if __name__=="__main__":
     np.random.seed(404)
     a = np.random.random((512,512)).astype(np.float64)
     res = np.full((len(a), len(a)), -1)

并用

%timeit check_pairs_sg(a)
%timeit check_pairs_multi(a)
%timeit check_pairs_guvec_sg(a, res)
%timeit check_pairs_guvec_multi(a, res)

导致:

614 ms ± 2.54 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
507 ms ± 6.87 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
622 ms ± 3.88 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
671 ms ± 4.35 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

我一直在思考如何将其实现为 @vectorized 或适当的并行 @guvectorize 以真正并行地填充生成的二维数组.

I cat wrap my head around on how to implement this as @vectorized or a proper parallel @guvectorize to fill the resulting 2D array truely in parallel.

我想这是我尝试将其进一步推向 gpu 之前的第一步.

I guess this is my first step before trying to taking this further to gpu.

非常感谢任何帮助.

写Numba代码时考虑其他编译语言

例如，想想这些行的或多或少完全等效的实现

Think of other compiled languages when writing Numba code

For example think of a more or less exact equivalent implementation of the lines

diff = np.subtract(raw_data[:, r1], raw_data[:, r2])
num_pos = len(np.where(diff > 0)[0])

在 C++ 中.

伪代码

分配一个数组差异，循环raw_data[i*size_dim_1+r1](循环索引为i)
分配一个布尔数组，循环遍历整个数组 diff 并检查是否 diff[i]>0
循环遍历布尔数组，获取 b_arr==True 的索引，并通过 vector::push_back() 将它们保存到向量中.
检查向量的大小

您代码中的主要问题是:

为简单操作创建临时数组
非连续内存访问

移除临时数组并简化

@nb.njit(parallel=False)
def check_pairs_simp(raw_data):
    # 2D array to be filled
    result = np.full((raw_data.shape[0],raw_data.shape[1]), -1)
    
    # Iterate over all possible gene combinations
    for r1 in range(0, raw_data.shape[1]):
        for r2 in range(r1+1, raw_data.shape[1]):
            num_pos=0
            for i in range(raw_data.shape[0]):
                if (raw_data[i,r1]>raw_data[i,r2]):
                    num_pos+=1
            
            # Arbitrary check to illustrate
            if num_pos >= 5: 
               result[r1,r2] = num_pos
            else:
               result[r2,r1] = num_pos
    
    return result

移除临时数组和简化 + 连续内存访问

@nb.njit(parallel=False)
def check_pairs_simp_rev(raw_data_in):
    #Create a transposed array not just a view 
    raw_data=np.ascontiguousarray(raw_data_in.T)
    
    # 2D array to be filled
    result = np.full((raw_data.shape[0],raw_data.shape[1]), -1)
    
    # Iterate over all possible gene combinations
    for r1 in range(0, raw_data.shape[0]):
        for r2 in range(r1+1, raw_data.shape[0]):
            num_pos=0
            for i in range(raw_data.shape[1]):
                if (raw_data[r1,i]>raw_data[r2,i]):
                    num_pos+=1
            
            # Arbitrary check to illustrate
            if num_pos >= 5: 
               result[r1,r2] = num_pos
            else:
               result[r2,r1] = num_pos
    
    return result

移除临时数组和简化 + 连续内存访问 + 并行化

@nb.njit(parallel=True,fastmath=True)
def check_pairs_simp_rev_p(raw_data_in):
    #Create a transposed array not just a view 
    raw_data=np.ascontiguousarray(raw_data_in.T)
    
    # 2D array to be filled
    result = np.full((raw_data.shape[0],raw_data.shape[1]), -1)
    
    # Iterate over all possible gene combinations
    for r1 in nb.prange(0, raw_data.shape[0]):
        for r2 in range(r1+1, raw_data.shape[0]):
            num_pos=0
            for i in range(raw_data.shape[1]):
                if (raw_data[r1,i]>raw_data[r2,i]):
                    num_pos+=1
            
            # Arbitrary check to illustrate
            if num_pos >= 5: 
               result[r1,r2] = num_pos
            else:
               result[r2,r1] = num_pos
    
    return result

时间

%timeit check_pairs_sg(a)
488 ms ± 8.68 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit check_pairs_simp(a)
186 ms ± 3.83 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit check_pairs_simp_rev(a)
12.1 ms ± 226 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit check_pairs_simp_rev_p(a)
5.43 ms ± 49.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

这篇关于Numba - 如何并行填充二维数组的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

Numba - 如何并行填充二维数组 [英] Numba - How to fill 2D array in parallel

问题描述

推荐答案

写Numba代码时考虑其他编译语言

Think of other compiled languages when writing Numba code

相关文章

Python最新文章

热门教程

热门工具

登录关闭

Numba - 如何并行填充二维数组 [英] Numba - How to fill 2D array in parallel

问题描述

推荐答案

写Numba代码时考虑其他编译语言

Think of other compiled languages when writing Numba code

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭