Most efficient way to map function over numpy array
Question
What is the most efficient way to map a function over a numpy array? The way I've been doing it in my current project is as follows:
import numpy as np
x = np.array([1, 2, 3, 4, 5])
# Obtain array of square of each element in x
squarer = lambda t: t ** 2
squares = np.array([squarer(xi) for xi in x])
However, this seems like it is probably very inefficient, since I am using a list comprehension to construct the new array as a Python list before converting it back to a numpy array.
Can we do better?
Recommended answer
I've tested all suggested methods plus np.array(list(map(f, x))) with perfplot (a small project of mine).
Message #1: If you can use numpy's native functions, do that.
If the function you're trying to vectorize is already vectorized (like the x**2 example in the original post), using it directly is much faster than anything else (the benchmark plot in the original answer uses a log scale).
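As a minimal sketch of this point, the squaring example from the question can be written directly with array arithmetic. The variable names below are illustrative and not taken from the original post.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# Element-wise loop in Python (the approach from the question)
squares_loop = np.array([xi ** 2 for xi in x])

# Native vectorized expression: the loop runs inside numpy's compiled code
squares_native = x ** 2

# np.square is the equivalent ufunc
squares_ufunc = np.square(x)
```

All three produce the same array; only the first one pays the per-element cost of the Python interpreter.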
If you actually need vectorization, it doesn't really matter much which variant you use.
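For illustration, here is a sketch of three such variants applied to a function that only accepts scalars. The function scalar_f is a hypothetical example, not part of the original answer. Note that np.vectorize is documented by numpy as a convenience wrapper (essentially a for loop), so it is not expected to outperform the explicit loops.

```python
import math
import numpy as np

def scalar_f(t):
    # Hypothetical scalar-only function: the branch and math.sqrt
    # both fail when given an array with more than one element.
    return math.sqrt(t) if t > 0 else 0.0

x = np.array([0.0, 1.0, 4.0, 9.0])

# Variant 1: list comprehension, then convert to an array
via_list = np.array([scalar_f(xi) for xi in x])

# Variant 2: np.fromiter consumes a generator; count lets it preallocate
via_fromiter = np.fromiter((scalar_f(xi) for xi in x), dtype=float, count=len(x))

# Variant 3: np.vectorize wraps the Python-level loop; otypes fixes the output dtype
via_vectorize = np.vectorize(scalar_f, otypes=[float])(x)
```

All three variants call scalar_f once per element from Python, which is why their timings end up close to each other.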
Code to reproduce the plot:
import numpy as np
import perfplot
import math


def f(x):
    # return math.sqrt(x)
    return np.sqrt(x)


# Build the vectorized wrapper once, outside the timed kernels
vf = np.vectorize(f)


def array_for(x):
    return np.array([f(xi) for xi in x])


def array_map(x):
    return np.array(list(map(f, x)))


def fromiter(x):
    return np.fromiter((f(xi) for xi in x), x.dtype)


def vectorize(x):
    # Includes the cost of constructing the np.vectorize wrapper
    return np.vectorize(f)(x)


def vectorize_without_init(x):
    return vf(x)


perfplot.show(
    setup=lambda n: np.random.rand(n),
    n_range=[2 ** k for k in range(20)],
    kernels=[f, array_for, array_map, fromiter, vectorize, vectorize_without_init],
    xlabel="len(x)",
)
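If perfplot is not installed, a rough comparison can be sketched with the standard library's timeit module. This is not the benchmark from the answer: the array size, the repeat counts, and the candidates dictionary are assumptions chosen for illustration, and absolute numbers will vary by machine.

```python
import timeit
import numpy as np

x = np.random.rand(10_000)  # assumed size; the original plot sweeps many sizes


def f(t):
    return np.sqrt(t)


# Hypothetical set of candidates mirroring some of the kernels above
candidates = {
    "native": lambda: f(x),
    "list comprehension": lambda: np.array([f(xi) for xi in x]),
    "fromiter": lambda: np.fromiter((f(xi) for xi in x), x.dtype),
    "vectorize": lambda: np.vectorize(f)(x),
}

timings = {}
for name, fn in candidates.items():
    # Best of 3 repeats, 5 calls each, to reduce timing noise
    timings[name] = min(timeit.repeat(fn, number=5, repeat=3)) / 5

# Print the candidates from fastest to slowest
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:20s} {seconds * 1e3:.3f} ms per call")
```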