Broadcasting a python function on to numpy arrays


Question

Let's say we have a particularly simple function like

import numpy as np

def func(x, y):
    return x + y

This function evidently works for several built-in Python types of x and y, such as str, list, int and float, as well as for numpy arrays. Since we are particularly interested in arrays, we consider two arrays:

x = np.array([-2, -1, 0, 1, 2])
y = np.array([-2, -1, 0, 1, 2])

xx = x[:, np.newaxis]
yy = y[np.newaxis, :]

>>> func(xx, yy)

returns

array([[-4, -3, -2, -1,  0],
       [-3, -2, -1,  0,  1],
       [-2, -1,  0,  1,  2],
       [-1,  0,  1,  2,  3],
       [ 0,  1,  2,  3,  4]])

just as we would expect.
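
As a minimal sketch of why this works, the snippet below prints the shapes involved. It assumes the same five-element arrays as above; the variable name result is only illustrative.

```python
import numpy as np

x = np.array([-2, -1, 0, 1, 2])
y = np.array([-2, -1, 0, 1, 2])

xx = x[:, np.newaxis]   # column vector, shape (5, 1)
yy = y[np.newaxis, :]   # row vector, shape (1, 5)

# Broadcasting stretches both operands to the common shape (5, 5),
# so result[i, j] == x[i] + y[j].
result = xx + yy
print(xx.shape, yy.shape, result.shape)
```

The same table could also be produced with np.add.outer(x, y), which makes the "every x against every y" intent explicit.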

Now what if one wants to throw in arrays as the inputs for the following function?

def func2(x, y):
    if x > y:
        return x + y
    else:
        return x - y

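Executing >>> func2(xx, yy) raises an error (ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()), because the if statement needs a single boolean while x > y yields a whole array of booleans.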
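
The failure can be reproduced with a short sketch like the following. The try/except wrapper and the name error_message are additions for illustration only; the function body is the scalar func2 from the question.

```python
import numpy as np

def func2(x, y):
    # Scalar-style branch: `if` needs one boolean, not an array of them.
    if x > y:
        return x + y
    else:
        return x - y

xx = np.array([-2, -1, 0, 1, 2])[:, np.newaxis]
yy = np.array([-2, -1, 0, 1, 2])[np.newaxis, :]

try:
    func2(xx, yy)
    error_message = None
except ValueError as exc:
    # xx > yy is a (5, 5) boolean array, which has no single truth value.
    error_message = str(exc)

print(error_message)
```

With plain numbers the function still behaves as intended, for example func2(3, 1) takes the x + y branch.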

The first obvious method that comes to mind is the vectorize function in scipy/numpy (np.vectorize). This method, however, has proved to be not very efficient. Can anyone think of a more robust way of broadcasting an arbitrary function on to numpy arrays?

If re-writing the code in an array-friendly fashion is the only way, it would help if you could mention it here too.

Recommended answer

np.vectorize is a general way to convert Python functions that operate on numbers into numpy functions that operate on ndarrays.

However, as you point out, it isn't very fast, since it is using a Python loop "under the hood".
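
For reference, a minimal sketch of the np.vectorize route looks like this. The names func2_scalar and func2_vec are hypothetical, chosen only for this example.

```python
import numpy as np

def func2_scalar(x, y):
    # Works on one pair of numbers at a time.
    return x + y if x > y else x - y

# np.vectorize wraps the scalar function so it accepts arrays and
# broadcasts them, but it still calls func2_scalar once per element.
func2_vec = np.vectorize(func2_scalar)

x = np.array([-2, -1, 0, 1, 2])
result = func2_vec(x[:, np.newaxis], x[np.newaxis, :])
print(result)
```

This is convenient when the scalar function is hard to rewrite, at the cost of one Python-level call per output element.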

To achieve better speed, you have to hand-craft a function that expects numpy arrays as input and takes advantage of that numpy-ness:

import numpy as np

def func2(x, y):
    return np.where(x > y, x + y, x - y)

x = np.array([-2, -1, 0, 1, 2])
y = np.array([-2, -1, 0, 1, 2])

xx = x[:, np.newaxis]
yy = y[np.newaxis, :]

print(func2(xx, yy))
# [[ 0 -1 -2 -3 -4]
#  [-3  0 -1 -2 -3]
#  [-2 -1  0 -1 -2]
#  [-1  0  1  0 -1]
#  [ 0  1  2  3  0]]

Regarding performance:

test.py:

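import numpy as np

def func2a(x, y):
    # Pick x+y where x>y, otherwise x-y, in a single vectorized call.
    return np.where(x > y, x + y, x - y)

def func2b(x, y):
    # Fill a preallocated result through a boolean mask and its complement.
    ind = x > y
    z = np.empty(ind.shape, dtype=x.dtype)
    z[ind] = (x + y)[ind]
    z[~ind] = (x - y)[~ind]
    return z

def func2c(x, y):
    # Start from x+y and overwrite the entries where x<=y with x-y.
    # x, y = x[:, None], y[None, :]
    A, L = x + y, x <= y
    A[L] = (x - y)[L]
    return A

N = 40  # array length; the timings below were taken with N=30 and N=1000
x = np.random.random(N)
y = np.random.random(N)

xx = x[:, np.newaxis]
yy = y[np.newaxis, :]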

Running it:

With N=30:

% python -mtimeit -s'import test' 'test.func2a(test.xx,test.yy)'
1000 loops, best of 3: 219 usec per loop

% python -mtimeit -s'import test' 'test.func2b(test.xx,test.yy)'
1000 loops, best of 3: 488 usec per loop

% python -mtimeit -s'import test' 'test.func2c(test.xx,test.yy)'
1000 loops, best of 3: 248 usec per loop

With N=1000:

% python -mtimeit -s'import test' 'test.func2a(test.xx,test.yy)'
10 loops, best of 3: 93.7 msec per loop

% python -mtimeit -s'import test' 'test.func2b(test.xx,test.yy)'
10 loops, best of 3: 367 msec per loop

% python -mtimeit -s'import test' 'test.func2c(test.xx,test.yy)'
10 loops, best of 3: 186 msec per loop

This seems to suggest that func2a is slightly faster than func2c at N=30 and about twice as fast at N=1000 (and func2b is the slowest in both cases).
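
Before comparing speeds it is worth confirming that the three variants return the same values. The sketch below does that and then times each one with the standard timeit module. The size N=200, the fixed seed and the repeat count are arbitrary assumptions made for this example, and the printed timings will differ from machine to machine.

```python
import timeit
import numpy as np

def func2a(x, y):
    return np.where(x > y, x + y, x - y)

def func2b(x, y):
    ind = x > y
    z = np.empty(ind.shape, dtype=x.dtype)
    z[ind] = (x + y)[ind]
    z[~ind] = (x - y)[~ind]
    return z

def func2c(x, y):
    A, L = x + y, x <= y
    A[L] = (x - y)[L]
    return A

N = 200  # arbitrary size chosen for this sketch
rng = np.random.default_rng(0)
xx = rng.random(N)[:, np.newaxis]
yy = rng.random(N)[np.newaxis, :]

# All three should produce identical (N, N) arrays.
ra, rb, rc = func2a(xx, yy), func2b(xx, yy), func2c(xx, yy)
same = np.array_equal(ra, rb) and np.array_equal(ra, rc)
print("all three agree:", same)

# Average wall-clock time per call over a small number of runs.
for f in (func2a, func2b, func2c):
    seconds = timeit.timeit(lambda: f(xx, yy), number=20) / 20
    print(f"{f.__name__}: {seconds * 1e6:.1f} usec per call")
```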
