Numpy Pure Functions for performance, caching
Question
I'm writing some moderately performance-critical code in numpy. This code will be in the innermost loop of a computation whose run time is measured in hours. A quick calculation suggests that this code will be executed up to something like 10^12 times, in some variations of the calculation.
So one function is to calculate sigmoid(X) and another to calculate its derivative (gradient).
Sigmoid has the property that for
y = sigmoid(x), dy/dx = y(1 - y)
In Python with numpy this looks like:
from numpy import vectorize, exp

sigmoid = vectorize(lambda x: 1.0 / (1.0 + exp(-x)))
grad_sigmoid = vectorize(lambda x: sigmoid(x) * (1 - sigmoid(x)))
As can be seen, both functions are pure (without side effects), so they are ideal candidates for memoization. At least in the short term, though, I have some worries about caching every single call to sigmoid ever made: storing 10^12 floats would take several terabytes of RAM.
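(For what it's worth, Python does not work this out on its own: memoization has to be opted into explicitly, e.g. with `functools.lru_cache`, and it only applies to hashable arguments such as scalars, since numpy arrays are unhashable. A minimal sketch of scalar memoization, just to illustrate the mechanism:)

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded: every distinct argument is kept forever
def sigmoid_scalar(x):
    """Memoized scalar sigmoid -- works only because floats are hashable."""
    return 1.0 / (1.0 + math.exp(-x))

sigmoid_scalar(2.0)
sigmoid_scalar(2.0)            # second call with the same argument is a cache hit
info = sigmoid_scalar.cache_info()
print(info.hits, info.misses)  # -> 1 1
```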
Is there a good way to optimize this?
Will Python work out that these are pure functions and cache them for me as appropriate?
Am I worrying about nothing?
Answer
These functions already exist in scipy. The sigmoid function is available as scipy.special.expit.
In [36]: from scipy.special import expit
Compare expit to the vectorized sigmoid function:
In [38]: x = np.linspace(-6, 6, 1001)
In [39]: %timeit y = sigmoid(x)
100 loops, best of 3: 2.4 ms per loop
In [40]: %timeit y = expit(x)
10000 loops, best of 3: 20.6 µs per loop
expit is also faster than implementing the formula yourself:
In [41]: %timeit y = 1.0 / (1.0 + np.exp(-x))
10000 loops, best of 3: 27 µs per loop
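(A quick sanity check, my addition rather than part of the timings above: expit agrees with the hand-written formula to floating-point precision; it is a compiled ufunc, so the speedup comes from avoiding Python-level overhead, not from computing something different.)

```python
import numpy as np
from scipy.special import expit

x = np.linspace(-6, 6, 1001)
manual = 1.0 / (1.0 + np.exp(-x))

# same values, much less per-call Python overhead
print(np.allclose(expit(x), manual))  # -> True
```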
The CDF of the logistic distribution is the sigmoid function. It is available as the cdf method of scipy.stats.logistic, but cdf eventually calls expit, so there is no point in using that method. You can use the pdf method to compute the derivative of the sigmoid function, or the _pdf method which has less overhead, but "rolling your own" is faster:
In [44]: def sigmoid_grad(x):
....: ex = np.exp(-x)
....: y = ex / (1 + ex)**2
....: return y
Timing (x has length 1001):
In [45]: from scipy.stats import logistic
In [46]: %timeit y = logistic._pdf(x)
10000 loops, best of 3: 73.8 µs per loop
In [47]: %timeit y = sigmoid_grad(x)
10000 loops, best of 3: 29.7 µs per loop
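(As a sketch, my addition: the closed form in sigmoid_grad is the same quantity as the y(1 - y) identity from the question. With ex = exp(-x) and y = 1/(1 + ex), we get y(1 - y) = ex/(1 + ex)**2, which a numerical check confirms:)

```python
import numpy as np

def sigmoid_grad(x):
    ex = np.exp(-x)
    return ex / (1 + ex)**2

x = np.linspace(-6, 6, 1001)
y = 1.0 / (1.0 + np.exp(-x))

# dy/dx = y * (1 - y) should agree with ex / (1 + ex)**2
print(np.allclose(sigmoid_grad(x), y * (1 - y)))  # -> True
```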
Be careful with your implementation if you are going to use values that are far into the tails. The exponential function can overflow pretty easily. logistic._pdf is a bit more robust than my quick implementation of sigmoid_grad:
In [60]: sigmoid_grad(-500)
/home/warren/anaconda/bin/ipython:3: RuntimeWarning: overflow encountered in double_scalars
import sys
Out[60]: 0.0
In [61]: logistic._pdf(-500)
Out[61]: 7.1245764067412855e-218
An implementation using sech**2 (1/cosh**2) is a bit slower than the above sigmoid_grad:
In [101]: def sigmoid_grad_sech2(x):
.....: y = (0.5 / np.cosh(0.5*x))**2
.....: return y
.....:
In [102]: %timeit y = sigmoid_grad_sech2(x)
10000 loops, best of 3: 34 µs per loop
But it handles the tails better:
In [103]: sigmoid_grad_sech2(-500)
Out[103]: 7.1245764067412855e-218
In [104]: sigmoid_grad_sech2(500)
Out[104]: 7.1245764067412855e-218