Improve performance of function without parallelization

Published: 2016/6/3 10:37:34 Tags: python arrays performance numpy


Problem description

Some weeks ago I posted a question (Speed up nested for loop with elements exponentiation, http://stackoverflow.com/questions/21269833/speed-up-nested-for-loop-with-elements-exponentiation) which got a very good answer from abarnert. This question is related to that one, since it builds on the performance improvements that user suggested.

I need to improve the performance of a function that computes three factors and then applies an exponential to them.

Here's an MWE of my code:

import numpy as np
import timeit

def random_data(N):
    # Generate some random data.
    return np.random.uniform(0., 10., N)

# Data lists.
array1 = np.array([random_data(4) for _ in range(1000)])
array2 = np.array([random_data(3) for _ in range(2000)])

# Function.
def func():
    # Empty list that holds all values obtained in for loop.    
    lst = []
    for elem in array1:
        # Avoid numeric errors if one of these values is 0.            
        e_1, e_2 = max(elem[0], 1e-10), max(elem[1], 1e-10)
        # Obtain three parameters.
        A = 1./(e_1*e_2)
        B = -0.5*((elem[2]-array2[:,0])/e_1)**2
        C = -0.5*((elem[3]-array2[:,1])/e_2)**2
        # Apply exponential.
        value = A*np.exp(B+C)
        # Store value in list.
        lst.append(value)

    return lst

# time function.
func_time = timeit.timeit(func, number=100)
print(func_time)

Is it possible to speed up func without having to resort to parallelization?

Recommended answer

Here's what I have so far. My approach is to do as much of the math as possible across numpy arrays.

Optimizations:

Calculate the A values (As) within numpy
Refactor the calculation of B and C by splitting them into factors, some of which can be computed within numpy
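
The refactoring of B and C rests on the identity -0.5*((x - y)/e)**2 == (x - y)**2 * (-0.5/e**2): the part that depends only on e can be computed once for every row instead of inside the loop. A minimal check of that identity follows. The names x, y and e and their values are illustrative assumptions, standing in for elem[2], array2[:,0] and e_1 from the question.

```python
import numpy as np

rng = np.random.RandomState(0)
x = 3.7                           # stands in for elem[2]
y = rng.uniform(0., 10., 2000)    # stands in for array2[:, 0]
e = 2.5                           # stands in for e_1

# Form used in the question: divide first, then square.
original = -0.5 * ((x - y) / e) ** 2

# Refactored form: the e-dependent factor is pulled out of the array math.
factor = -0.5 / e ** 2
refactored = (x - y) ** 2 * factor

print(np.allclose(original, refactored))
```

The two forms are algebraically equal, so any difference is floating-point rounding only.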

Code:

def optfunc():
    e0 = array1[:, 0]
    e1 = array1[:, 1]
    e2 = array1[:, 2]
    e3 = array1[:, 3]

    ar0 = array2[:, 0]
    ar1 = array2[:, 1]

    As = 1./(e0 * e1)
    Bfactors = -0.5 * (1 / e0**2)
    Cfactors = -0.5 * (1 / e1**2)

    lst = []
    for i, elem in enumerate(array1):
        B = ((elem[2] - ar0) ** 2) * Bfactors[i]
        C = ((elem[3] - ar1) ** 2) * Cfactors[i]

        value = As[i]*np.exp(B+C)

        lst.append(value)

    return lst

print(np.allclose(optfunc(), func()))

# time function.
func_time = timeit.timeit(func, number=10)
opt_func_time = timeit.timeit(optfunc, number=10)
print("%.3fs --> %.3fs" % (func_time, opt_func_time))

Results:

True
0.759s --> 0.485s
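
Note that optfunc as written drops the max(..., 1e-10) guard that func applies to the first two columns, so the two only agree when none of those values falls below 1e-10 (which the np.allclose check confirms for this random data). If the guard matters, np.maximum can apply it to a whole column at once. A sketch, assuming a small hand-made array (array1_demo is a hypothetical name, not part of the original code):

```python
import numpy as np

# Hypothetical data with zeros in the first two columns to exercise the guard.
array1_demo = np.array([[0.0, 2.0, 1.0, 1.0],
                        [4.0, 0.0, 1.0, 1.0],
                        [5.0, 8.0, 1.0, 1.0]])

# Element-wise equivalent of max(elem[0], 1e-10) and max(elem[1], 1e-10).
e0 = np.maximum(array1_demo[:, 0], 1e-10)
e1 = np.maximum(array1_demo[:, 1], 1e-10)

As = 1. / (e0 * e1)
Bfactors = -0.5 / e0 ** 2
Cfactors = -0.5 / e1 ** 2

# No division by zero, so every factor stays finite.
print(np.isfinite(As).all())
```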

At this point I'm stuck. I managed to do it entirely without Python for loops, but it is slower than the version above, for a reason I do not yet understand:

def optfunc():
    x = array1
    y = array2

    x0 = x[:, 0]
    x1 = x[:, 1]
    x2 = x[:, 2]
    x3 = x[:, 3]

    y0 = y[:, 0]
    y1 = y[:, 1]

    A = 1./(x0 * x1)
    Bfactors = -0.5 * (1 / x0**2)
    Cfactors = -0.5 * (1 / x1**2)

    B = (np.transpose([x2]) - y0)**2 * np.transpose([Bfactors])
    C = (np.transpose([x3]) - y1)**2 * np.transpose([Cfactors])

    return np.transpose([A]) * np.exp(B + C)

Results:

True
0.780s --> 0.558s

Note, however, that the latter gives you an np.array whereas the former only gives you a Python list. This might account for the difference, but I'm not sure.
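
One way to probe that guess is to make both versions return the same kind of object and time them side by side. The sketch below is an assumption-laden experiment, not a claim about which is faster: loop_to_array and broadcast are hypothetical names, the data is regenerated with a fixed seed, and the 1e-10 guard from the question is applied in both. The broadcast version allocates several (1000, 2000) temporaries, which may be one factor in its cost, but only measuring on your own machine will tell.

```python
import timeit
import numpy as np

rng = np.random.RandomState(0)
array1 = rng.uniform(0., 10., (1000, 4))
array2 = rng.uniform(0., 10., (2000, 3))

def loop_to_array():
    # Loop over rows, but write into a preallocated 2-D array, not a list.
    e0 = np.maximum(array1[:, 0], 1e-10)
    e1 = np.maximum(array1[:, 1], 1e-10)
    As = 1. / (e0 * e1)
    Bfactors = -0.5 / e0 ** 2
    Cfactors = -0.5 / e1 ** 2
    ar0, ar1 = array2[:, 0], array2[:, 1]
    out = np.empty((len(array1), len(array2)))
    for i in range(len(array1)):
        B = (array1[i, 2] - ar0) ** 2 * Bfactors[i]
        C = (array1[i, 3] - ar1) ** 2 * Cfactors[i]
        out[i] = As[i] * np.exp(B + C)
    return out

def broadcast():
    # Column vectors of shape (1000, 1) broadcast against rows of shape (2000,).
    e0 = np.maximum(array1[:, 0], 1e-10)[:, np.newaxis]
    e1 = np.maximum(array1[:, 1], 1e-10)[:, np.newaxis]
    x2 = array1[:, 2][:, np.newaxis]
    x3 = array1[:, 3][:, np.newaxis]
    B = (x2 - array2[:, 0]) ** 2 * (-0.5 / e0 ** 2)
    C = (x3 - array2[:, 1]) ** 2 * (-0.5 / e1 ** 2)
    return (1. / (e0 * e1)) * np.exp(B + C)

print(np.allclose(loop_to_array(), broadcast()))
loop_time = timeit.timeit(loop_to_array, number=3)
bcast_time = timeit.timeit(broadcast, number=3)
print("loop: %.3fs, broadcast: %.3fs" % (loop_time, bcast_time))
```

Both functions return an array of shape (1000, 2000), so any remaining timing gap cannot be attributed to the list-versus-array difference.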
