Broadcasting a Python function onto NumPy arrays

Question

Let's say we have a particularly simple function like this:

import numpy as np

def func(x, y):
    return x + y

This function evidently works for several built-in Python data types of x and y, such as strings, lists, ints and floats, as well as for arrays. Since we are particularly interested in arrays, we consider two arrays:

x = np.array([-2, -1, 0, 1, 2])
y = np.array([-2, -1, 0, 1, 2])

xx = x[:, np.newaxis]
yy = y[np.newaxis, :]

>>> func(xx, yy)

This returns

array([[-4, -3, -2, -1,  0],
       [-3, -2, -1,  0,  1],
       [-2, -1,  0,  1,  2],
       [-1,  0,  1,  2,  3],
       [ 0,  1,  2,  3,  4]])

just as we would expect.
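
The result is 5x5 because the two operands have shapes (5, 1) and (1, 5), and each length-1 axis is stretched to match the other. A minimal sketch for inspecting this, assuming NumPy is imported directly as np (the variable result_shape is introduced here only for illustration):

```python
import numpy as np

x = np.array([-2, -1, 0, 1, 2])
xx = x[:, np.newaxis]   # column vector, shape (5, 1)
yy = x[np.newaxis, :]   # row vector, shape (1, 5)

# np.broadcast reports the common shape both operands are stretched to
result_shape = np.broadcast(xx, yy).shape
print(xx.shape, yy.shape, result_shape)  # (5, 1) (1, 5) (5, 5)
```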

Now what if one wants to pass arrays as inputs to the following function?

def func2(x, y):
    if x > y:
        return x + y
    else:
        return x - y

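Executing >>> func2(xx, yy) raises a ValueError, because the if statement cannot reduce the boolean array xx > yy to a single True or False.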
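
The failure can be reproduced in a few lines. This is a minimal sketch, assuming NumPy is imported as np; the name error_message is introduced here only to capture the exception text:

```python
import numpy as np

def func2(x, y):
    if x > y:
        return x + y
    else:
        return x - y

x = np.array([-2, -1, 0, 1, 2])
xx = x[:, np.newaxis]
yy = x[np.newaxis, :]

try:
    func2(xx, yy)
    error_message = None
except ValueError as exc:
    # `if` needs one bool, but xx > yy is a 5x5 array of bools
    error_message = str(exc)

print(error_message)
```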

The first obvious method that comes to mind is the vectorize function in NumPy (np.vectorize). This method, however, has proven not to be very efficient. Can anyone think of a more robust way of broadcasting an arbitrary function onto NumPy arrays?

If rewriting the code in an array-friendly fashion is the only way, it would help if you could mention that here too.

Recommended answer

np.vectorize is a general way to convert Python functions that operate on numbers into NumPy functions that operate on ndarrays.
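
As a sketch of what that looks like for the scalar func2 from the question (the name vfunc2 is an illustrative choice, not from the original post):

```python
import numpy as np

def func2(x, y):
    # scalar-only logic: the `if` works on one pair of numbers at a time
    if x > y:
        return x + y
    else:
        return x - y

# Wrap the scalar function so it accepts arrays and follows broadcasting rules
vfunc2 = np.vectorize(func2)

x = np.array([-2, -1, 0, 1, 2])
result = vfunc2(x[:, np.newaxis], x[np.newaxis, :])
print(result)
```

The wrapped function broadcasts its arguments just like a built-in NumPy operation, so the (5, 1) and (1, 5) inputs produce a (5, 5) result.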

However, as you point out, it isn't very fast, since it uses a Python loop "under the hood".
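
One way to see that per-element loop is to count how often the wrapped function is called. The following is a rough sketch; the counter dictionary is a hypothetical helper added for illustration, and otypes is passed so that np.vectorize does not need a trial call to infer the output type:

```python
import numpy as np

counter = {"calls": 0}  # illustrative call counter

def func2(x, y):
    counter["calls"] += 1
    return x + y if x > y else x - y

x = np.array([-2, -1, 0, 1, 2])
xx, yy = x[:, np.newaxis], x[np.newaxis, :]

# The Python function runs for every element of the broadcast result
result = np.vectorize(func2, otypes=[int])(xx, yy)
print(result.size, counter["calls"])
```

Each of those calls carries the full overhead of a Python function call, which is where the time goes for large arrays.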

To achieve better speed, you have to hand-craft a function that expects NumPy arrays as input and takes advantage of that numpy-ness:

import numpy as np

def func2(x, y):
    # element-wise: take x + y where x > y, otherwise x - y
    return np.where(x > y, x + y, x - y)

x = np.array([-2, -1, 0, 1, 2])
y = np.array([-2, -1, 0, 1, 2])

xx = x[:, np.newaxis]
yy = y[np.newaxis, :]

print(func2(xx, yy))
# [[ 0 -1 -2 -3 -4]
#  [-3  0 -1 -2 -3]
#  [-2 -1  0 -1 -2]
#  [-1  0  1  0 -1]
#  [ 0  1  2  3  0]]


Regarding performance:

test.py:

import numpy as np

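def func2a(x, y):
    # select element-wise between the two whole-array results
    return np.where(x > y, x + y, x - y)

def func2b(x, y):
    # fill an empty result array through a boolean mask and its complement
    ind = x > y
    z = np.empty(ind.shape, dtype=x.dtype)
    z[ind] = (x + y)[ind]
    z[~ind] = (x - y)[~ind]
    return z

def func2c(x, y):
    # x, y = x[:, None], y[None, :]
    # start from x + y and overwrite the entries where x <= y
    A, L = x + y, x <= y
    A[L] = (x - y)[L]
    return A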

N=40
x = np.random.random(N)
y = np.random.random(N)

xx = x[:, np.newaxis]
yy = y[np.newaxis, :]

Running:

With N=30:

% python -mtimeit -s'import test' 'test.func2a(test.xx,test.yy)'
1000 loops, best of 3: 219 usec per loop

% python -mtimeit -s'import test' 'test.func2b(test.xx,test.yy)'
1000 loops, best of 3: 488 usec per loop

% python -mtimeit -s'import test' 'test.func2c(test.xx,test.yy)'
1000 loops, best of 3: 248 usec per loop

With N=1000:

% python -mtimeit -s'import test' 'test.func2a(test.xx,test.yy)'
10 loops, best of 3: 93.7 msec per loop

% python -mtimeit -s'import test' 'test.func2b(test.xx,test.yy)'
10 loops, best of 3: 367 msec per loop

% python -mtimeit -s'import test' 'test.func2c(test.xx,test.yy)'
10 loops, best of 3: 186 msec per loop

This seems to suggest that func2a is slightly faster than func2c at N=30 and roughly twice as fast at N=1000, while func2b is by far the slowest in both cases.
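
Before comparing speeds, it is worth confirming that the three variants actually compute the same thing. Here is a minimal sketch that checks this and times them from within Python using the timeit module; the seed, N=100, and the repeat counts are arbitrary choices for illustration, and absolute timings will differ by machine and NumPy version:

```python
import timeit
import numpy as np

def func2a(x, y):
    return np.where(x > y, x + y, x - y)

def func2b(x, y):
    ind = x > y
    z = np.empty(ind.shape, dtype=x.dtype)
    z[ind] = (x + y)[ind]
    z[~ind] = (x - y)[~ind]
    return z

def func2c(x, y):
    A, L = x + y, x <= y
    A[L] = (x - y)[L]
    return A

rng = np.random.default_rng(0)  # fixed seed so runs are repeatable
N = 100                         # illustrative size
xx = rng.random(N)[:, np.newaxis]
yy = rng.random(N)[np.newaxis, :]

# All three variants should produce identical arrays
expected = func2a(xx, yy)
all_agree = all(np.array_equal(f(xx, yy), expected) for f in (func2b, func2c))
print("results agree:", all_agree)

# Best-of-3 wall time for 20 calls of each variant
timings = {
    f.__name__: min(timeit.repeat(lambda f=f: f(xx, yy), number=20, repeat=3))
    for f in (func2a, func2b, func2c)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f} s per 20 calls")
```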
