Making a vectorized numpy function behave like a ufunc


Question

Let's suppose that we have a Python function that takes in NumPy arrays and returns another array:

import numpy as np

def f(x, y, method='p'):
    """Parameters:  x (np.ndarray) , y (np.ndarray), method (str)
    Returns: np.ndarray"""
    z = x.copy()    
    if method == 'p':
        mask = x < 0
    else:
        mask = x > 0
    z[mask] = 0
    return z*y

The actual implementation does not matter, though. We can assume that x and y will always be arrays of the same shape, and that the output has the same shape as x.

The question is: what would be the simplest / most elegant way of wrapping such a function so that it works with both N-D arrays (N>1) and scalar arguments, in a manner somewhat similar to universal functions in NumPy?

For instance, the expected output for the above function would be:

In [1]: f_ufunc(np.arange(-1,2), np.ones(3), method='p') 
Out[1]: array([ 0.,  0.,  1.]) # random array input -> output of the same shape

In [2]: f_ufunc(np.array([1]), np.array([1]), method='p') 
Out[2]: array([1])   # array input of len 1 -> output of len 1

In [3]: f_ufunc(1, 1, method='p')
Out[3]: 1  # scalar input -> scalar output

  • The function f cannot be changed, and it will fail if given a scalar argument for x or y.

  • When x and y are scalars, we transform them to 1D arrays, do the calculation, then transform the result back to a scalar at the end.

  • f is optimized to work with arrays, scalar input being mostly a convenience. So writing a function that works with scalars and then using np.vectorize or np.frompyfunc would not be acceptable.

A first attempt at an implementation could be:

def atleast_1d_inverse(res, x):
    # this function fails in some cases (see point 1 below).
    if res.shape[0] == 1:
        return res[0]
    else:
        return res

def ufunc_wrapper(func, args=[]):
    """ func:  the wrapped function
        args:  arguments of func to which we apply np.atleast_1d """

    # this needs to be generated dynamically depending on the definition of func
    def wrapper(x, y, method='p'):
        # we apply np.atleast_1d to the variables given in args
        x = np.atleast_1d(x)
        y = np.atleast_1d(y)

        res = func(x, y, method=method)

        return atleast_1d_inverse(res, x)

    return wrapper

f_ufunc = ufunc_wrapper(f, args=['x', 'y'])

which mostly works, but will fail test 2 above, producing a scalar output instead of a vector one. If we want to fix that, we would need to add more checks on the type of the input (e.g. isinstance(x, np.ndarray), x.ndim>0, etc.), but I'm afraid of missing some corner cases there. Furthermore, the above implementation is not generic enough to wrap a function with a different number of arguments (see point 2 below).

This seems to be a rather common problem when working with Cython / f2py functions, and I was wondering whether there is a generic solution for this somewhere.

Edit: some clarification following @hpaulj's comments. Essentially, I'm looking for

  1. A function that would be the inverse of np.atleast_1d, such that atleast_1d_inverse( np.atleast_1d(x), x) == x, where the second argument is only used to determine the type or the number of dimensions of the original object x. Returning numpy scalars (i.e. arrays with ndim = 0) instead of a python scalar is ok.

  2. A way to inspect the function f and generate a wrapper that is consistent with its definition. For instance, such a wrapper could be used as

    f_ufunc = ufunc_wrapper(f, args=['x', 'y'])

    and then if we have a different function def f2(x, option=2): return x**2, we could also use

    f2_ufunc = ufunc_wrapper(f2, args=['x']).

Note: the analogy with ufuncs might be a bit limited, as this corresponds to the opposite problem. Instead of having a scalar function that we transform to accept both vector and scalar input, I have a function designed to work with vectors (that can be seen as something that was previously vectorized), that I would like to accept scalars again, without changing the original function.

Answer

This doesn't fully answer the question of making a vectorized function truly behave like a ufunc, but I did recently run into a slight annoyance with numpy.vectorize that sounds similar to your issue. That wrapper insists on returning an array (with ndim=0 and shape=()) even if passed scalar inputs.
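The behavior is easy to check directly. Below is a minimal sketch; the function `double` is only a hypothetical stand-in for illustration and is not part of the question. It shows that np.vectorize hands back a 0-d array for scalar input, and that indexing with an empty tuple unwraps a 0-d array while leaving higher-dimensional arrays alone.

```python
import numpy as np


def double(x):
    # Hypothetical scalar function, used only to exercise np.vectorize.
    return 2 * x


vdouble = np.vectorize(double)

out = vdouble(3.0)
# For scalar input the result is a 0-d ndarray, not a plain scalar.
print(type(out), out.ndim, out.shape)

# Indexing with an empty tuple turns a 0-d array into a NumPy scalar ...
print(type(out[()]))

# ... and returns the array itself (as a view) when ndim >= 1.
print(vdouble([1.0, 2.0])[()])
```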

But it appears that the following does the right thing. In this case I am vectorizing a simple function to return a floating point value to a certain number of significant digits.

import numpy as np

def signif(x, digits):
    return round(x, digits - int(np.floor(np.log10(abs(x)))) - 1)

def vectorize(f):
    vf = np.vectorize(f)

    def newfunc(*args, **kwargs):
        return vf(*args, **kwargs)[()]
    return newfunc

vsignif = vectorize(signif)

This gives

>>> vsignif(0.123123, 2)
0.12
>>> vsignif([[0.123123, 123.2]], 2)
array([[   0.12,  120.  ]])
>>> vsignif([[0.123123, 123.2]], [2, 1])
array([[   0.12,  100.  ]])
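For the array-first case described in the question, a similar idea might work: record whether the inputs were 0-d before applying np.atleast_1d, and undo the promotion only in that case. The following is a minimal sketch, not an existing NumPy facility. It assumes, as the question states, that the output has the same shape as the inputs, and it uses inspect.signature so the wrapper does not need to know the wrapped function's argument list in advance. The name ufunc_wrapper and its args parameter are taken from the question; the body is my own guess at how it could be filled in.

```python
import functools
import inspect

import numpy as np


def ufunc_wrapper(func, args):
    """Let the arguments of `func` named in `args` also be scalars (sketch)."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*a, **kw):
        bound = sig.bind(*a, **kw)
        passed = [name for name in args if name in bound.arguments]
        # True only if every listed argument that was passed is 0-d.
        all_scalar = bool(passed) and all(
            np.ndim(bound.arguments[name]) == 0 for name in passed
        )
        for name in passed:
            bound.arguments[name] = np.atleast_1d(bound.arguments[name])
        res = func(*bound.args, **bound.kwargs)
        # Undo atleast_1d only when all inputs were scalars.
        return res[0] if all_scalar else res

    return wrapper


def f(x, y, method='p'):
    # Same function as in the question: it only accepts arrays.
    z = x.copy()
    if method == 'p':
        mask = x < 0
    else:
        mask = x > 0
    z[mask] = 0
    return z * y


f_ufunc = ufunc_wrapper(f, args=['x', 'y'])

print(f_ufunc(np.arange(-1, 2), np.ones(3), method='p'))    # array in, array out
print(f_ufunc(np.array([1]), np.array([1]), method='p'))    # length-1 array stays an array
print(f_ufunc(1, 1, method='p'))                            # scalar in, NumPy scalar out
```

Because the decision is based on the dimensionality of the original inputs rather than on the shape of the result, a length-1 array input is not mistaken for a scalar. Mixed scalar/array calls are passed through as arrays and rely on whatever broadcasting the wrapped function supports.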
