Rewriting a for loop in pure NumPy to decrease execution time

Problem description

I recently asked how to optimize a Python loop for a scientific application and received [...]. However, the calculation of the B value is actually nested within a few other loops, because it is evaluated on a regular grid of positions. Is there a similarly smart NumPy rewrite to shave time off this procedure?

I suspect the performance gain for this part would be less marked. The disadvantages would presumably be that progress could not be reported back to the user during the calculation, that the results could not be written to the output file until the calculation had finished, and possibly that doing everything in one enormous step would have memory implications. Is it possible to circumvent any of these?

import numpy as np
import time

def reshape_vector(v):
    # Copy the first three components of v into a (3, 1) column vector
    return np.array(v[:3], dtype=float).reshape(3, 1)

def unit_vectors(r):
    # Normalize each column of r to unit length
    return r / np.sqrt((r*r).sum(0))

def calculate_dipole(mu, r_i, mom_i):
    relative = mu - r_i
    r_unit = unit_vectors(relative)
    A = 1e-7

    num = A*(3*np.sum(mom_i*r_unit, 0)*r_unit - mom_i)
    den = np.sqrt(np.sum(relative*relative, 0))**3
    B = np.sum(num/den, 1)
    return B

N = 20000 # number of dipoles
r_i = np.random.random((3,N)) # positions of dipoles
mom_i = np.random.random((3,N)) # moments of dipoles
a = np.random.random((3,3)) # three basis vectors for this crystal
n = [10,10,10] # points at which to evaluate sum
gamma_mu = 135.5 # a constant

t_start = time.perf_counter()  # time.clock() was removed in Python 3.8
for i in range(n[0]):
    r_frac_x = i / n[0]  # np.float was removed in NumPy 1.24; plain division suffices
    r_test_x = r_frac_x * a[0]
    for j in range(n[1]):
        r_frac_y = j / n[1]
        r_test_y = r_frac_y * a[1]
        for k in range(n[2]):
            r_frac_z = k / n[2]
            r_test = r_test_x + r_test_y + r_frac_z * a[2]
            r_test_fast = reshape_vector(r_test)
            B = calculate_dipole(r_test_fast, r_i, mom_i)
            omega = gamma_mu * np.sqrt(np.dot(B, B))
            # write r_test, B and omega to a file
    # fraction of the outer loop completed so far (reaches 1.0 on the last pass)
    frac_done = (i + 1) / n[0]
    t_elapsed = time.perf_counter() - t_start
    t_remain = (1 - frac_done) * t_elapsed / frac_done
    print(frac_done * 100, '% done in', t_elapsed / 60., 'minutes...approximately', t_remain / 60., 'minutes remaining')
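
One possible middle ground between the nested loops above and a single enormous array operation is to evaluate the grid in chunks. The sketch below is an illustration under stated assumptions, not part of the original post: `calculate_dipole_batch`, `grid_points`, `field_on_grid`, and the `chunk` parameter are hypothetical names. It broadcasts the same formula as `calculate_dipole` over several test points at once and yields results chunk by chunk, so progress can still be reported, results can be written as they arrive, and the largest temporary array stays at roughly 3 × chunk × N floats. It is meant to show the memory and progress trade-off; it may not give a large speed-up on its own.

```python
import numpy as np


def calculate_dipole_batch(mu, r_i, mom_i):
    """Hypothetical batched variant of calculate_dipole.

    mu:    (3, M) test positions
    r_i:   (3, N) dipole positions
    mom_i: (3, N) dipole moments
    Returns the field B at every test position, shape (3, M).
    """
    A = 1e-7
    relative = mu[:, :, None] - r_i[:, None, :]            # (3, M, N)
    dist = np.sqrt(np.sum(relative * relative, axis=0))    # (M, N)
    r_unit = relative / dist                               # (3, M, N)
    m = mom_i[:, None, :]                                  # (3, 1, N)
    m_dot_r = np.sum(m * r_unit, axis=0)                   # (M, N)
    num = A * (3 * m_dot_r * r_unit - m)                   # (3, M, N)
    return np.sum(num / dist**3, axis=2)                   # (3, M)


def grid_points(a, n):
    """Hypothetical helper: all grid positions as columns of a (3, M) array.

    The ordering matches the nested loops (i slowest, k fastest).
    """
    fracs = [np.arange(count) / count for count in n]
    F = np.stack(np.meshgrid(*fracs, indexing="ij"), axis=0).reshape(3, -1)
    # Each column is fx * a[0] + fy * a[1] + fz * a[2]
    return a.T @ F


def field_on_grid(a, n, r_i, mom_i, gamma_mu, chunk=50):
    """Yield (positions, B, omega) for successive chunks of grid points."""
    points = grid_points(a, n)
    for start in range(0, points.shape[1], chunk):
        mu = points[:, start:start + chunk]
        B = calculate_dipole_batch(mu, r_i, mom_i)
        omega = gamma_mu * np.sqrt(np.sum(B * B, axis=0))
        yield mu, B, omega


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N = 200                      # kept small so the demo runs quickly
    r_i = rng.random((3, N))
    mom_i = rng.random((3, N))
    a = rng.random((3, 3))
    n = [4, 4, 4]
    total = n[0] * n[1] * n[2]
    done = 0
    for mu, B, omega in field_on_grid(a, n, r_i, mom_i, 135.5, chunk=16):
        done += mu.shape[1]
        # results for this chunk could be written to a file here
        print(f"{100 * done / total:.0f}% done")
```

The `chunk` size is the knob that trades memory for fewer Python-level iterations: with N = 20000 and `chunk=50`, each (3, chunk, N) temporary holds 3 million floats, about 24 MB in double precision.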

Recommended answer

If you profile your code, you'll see that 99% of the running time is spent in calculate_dipole, so reducing the time spent in this looping really won't give a noticeable reduction in execution time. You still need to focus on calculate_dipole if you want to make this faster. I tried my Cython code for calculate_dipole on this and got a reduction of about a factor of 2 in the overall time. There might be other ways to improve the Cython code too.
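
The profiling step mentioned above can be reproduced with the standard library's `cProfile` and `pstats` modules. The following is a minimal sketch, assuming Python 3 and a simplified version of the question's loop; `scan_grid` and `profile_scan` are hypothetical helper names introduced here, and the exact share of time reported will depend on the machine and on N.

```python
import cProfile
import io
import pstats

import numpy as np


def calculate_dipole(mu, r_i, mom_i):
    # Same formula as in the question, repeated so the sketch is self-contained.
    relative = mu - r_i
    dist = np.sqrt(np.sum(relative * relative, 0))
    r_unit = relative / dist
    num = 1e-7 * (3 * np.sum(mom_i * r_unit, 0) * r_unit - mom_i)
    return np.sum(num / dist**3, 1)


def scan_grid(a, n, r_i, mom_i):
    # Hypothetical helper: the question's triple loop without timing or output.
    results = []
    for i in range(n[0]):
        for j in range(n[1]):
            for k in range(n[2]):
                r_test = (i / n[0]) * a[0] + (j / n[1]) * a[1] + (k / n[2]) * a[2]
                results.append(calculate_dipole(r_test.reshape(3, 1), r_i, mom_i))
    return results


def profile_scan(a, n, r_i, mom_i, top=10):
    # Hypothetical helper: profile scan_grid and return the report as text,
    # sorted by cumulative time so the most expensive call chains come first.
    profiler = cProfile.Profile()
    profiler.enable()
    scan_grid(a, n, r_i, mom_i)
    profiler.disable()
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(top)
    return stream.getvalue()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N = 20000
    report = profile_scan(rng.random((3, 3)), [3, 3, 3],
                          rng.random((3, N)), rng.random((3, N)))
    print(report)
```

Comparing the cumulative time of `calculate_dipole` with that of `scan_grid` in the report shows how much of the run is spent inside the field calculation rather than in the loop itself.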
