How to call a `@guvectorize` function inside another `@guvectorize` function in Numba?


Question

I'm trying to call a @guvectorize function inside another @guvectorize function, but I get this error:

Untyped global name 'regNL_nb': cannot determine Numba type of <class 'numpy.ufunc'>

File "di.py", line 12:
def H2Delay_nb(S1, S2, R2):
    H1 = regNL_nb(S1, S2)
    ^

Here is an MRE:

import numpy as np
from numba import guvectorize, float64, int64, njit, cuda, jit

@guvectorize(["float64[:], float64[:], float64[:]"], '(n),(n)->(n)')
def regNL_nb(S1, S2, h2):
    for i in range(len(S1)):
        h2[i] = S1[i] + S2[i]

@guvectorize(["float64[:], float64[:],  float64[:]"], '(n),(n)->(n)',nopython=True)
def H2Delay_nb(S1, S2, R2):
    H1 = regNL_nb(S1, S2)
    H2 = regNL_nb(S1, S2,)
    for i in range(len(S1)):
        R2[i] =  H1[i] + H2[i]


S1 = np.array([1,2,3,4,5,6,7,8,9])
S2 = np.array([1,2,3,4,5,6,7,8,9])
H2 = H2Delay_nb(S1, S2)
print(H2)

I don't know how to tell Numba that the function regNL_nb is a guvectorize function.

Recommended answer

My answer applies only if you are fine with replacing @guvectorize with @njit. The code stays essentially the same and should run at a comparable speed; it just takes slightly more code to write.

The error is probably a limitation of nopython mode: a @guvectorize-d function is exposed as a NumPy ufunc object, and, as the message says, Numba cannot determine a type for it when it is called from inside another compiled function.

Numba does, however, handle regular @njit functions called from other @njit functions without trouble. So you can rewrite your functions with @njit, and from the outside they are still called the same way as the @guvectorize versions: pass the inputs, get the result array back. The @njit version just has to allocate the output itself with np.empty_like(...) and return it.

As a reminder, @njit always compiles in nopython mode, so the @njit code should be about as fast as @guvectorize with nopython=True.

I also provide a CUDA solution as the second code snippet.

You can also use @njit only for the internal helper function; the outer function should still be able to stay @guvectorize. And if you want a generic function that accepts any input types, just remove the signature 'f8[:](f8[:], f8[:])' from the @njit decorator; the signature is then resolved automatically on the first call.

The final code:

import numpy as np
from numba import njit

@njit('f8[:](f8[:], f8[:])', cache = True)
def regNL_nb(S1, S2):
    h2 = np.empty_like(S1)
    for i in range(len(S1)):
        h2[i] = S1[i] + S2[i]
    return h2
        
@njit('f8[:](f8[:], f8[:])', cache = True)
def H2Delay_nb(S1, S2):
    H1 = regNL_nb(S1, S2)
    H2 = regNL_nb(S1, S2)
    R2 = np.empty_like(H1)
    for i in range(len(S1)):
        R2[i] =  H1[i] + H2[i]
    return R2

S1 = np.array([1,2,3,4,5,6,7,8,9], dtype = np.float64)
S2 = np.array([1,2,3,4,5,6,7,8,9], dtype = np.float64)
H2 = H2Delay_nb(S1, S2)
print(H2)

Output:

[ 4.  8. 12. 16. 20. 24. 28. 32. 36.]

A CUDA variant of the same code. It needs extra wrapper code if you want the result array to be created and returned automatically, because a CUDA kernel cannot return a value:

import numpy as np
from numba import cuda

# Device function: callable only from other CUDA code. Unlike a kernel it
# may return a value. It works on one element, because device code cannot
# allocate temporary arrays with np.empty_like.
@cuda.jit(device=True)
def regNL_nb_dev(s1, s2):
    return s1 + s2

# Kernel: one thread per element, result written into R2 (no return value).
@cuda.jit
def H2Delay_nb_cu(S1, S2, R2):
    i = cuda.grid(1)
    if i < S1.shape[0]:
        H1 = regNL_nb_dev(S1[i], S2[i])
        H2 = regNL_nb_dev(S1[i], S2[i])
        R2[i] = H1 + H2

# Plain Python wrapper (not @njit: kernels are launched from host Python
# code with an explicit [blocks, threads] configuration).
def H2Delay_nb(S1, S2):
    R2 = np.empty_like(S1)
    threads = 128
    blocks = (S1.shape[0] + threads - 1) // threads
    H2Delay_nb_cu[blocks, threads](S1, S2, R2)
    return R2

S1 = np.array([1,2,3,4,5,6,7,8,9], dtype = np.float64)
S2 = np.array([1,2,3,4,5,6,7,8,9], dtype = np.float64)
H2 = H2Delay_nb(S1, S2)
print(H2)
