Can I increase the speed of array multiplication in Python 3?


Question

I need to speed up the simple code below. I already use numba and pypy, and one execution takes nearly 0.00018 s. However, I need to reduce the execution time further. Is there any way to do it?

Edit 1

I have a huge matrix, roughly 250000x6000. For each element I have to run the code below. I use parallel processing with 10 cores, so the total comes to (250000*6000*0.00018 s / 10), which is about 7 or 8 hours.
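The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation in the question; it assumes the work scales perfectly across the 10 cores, and the variable names are illustrative.

```python
# Back-of-the-envelope estimate of the total runtime described in the question.
rows, cols = 250_000, 6_000      # matrix dimensions from the question
seconds_per_element = 0.00018    # measured time for one kernel evaluation
cores = 10                       # parallel workers (assumes perfect scaling)

total_seconds = rows * cols * seconds_per_element / cores
total_hours = total_seconds / 3600
print(f"{total_seconds:.0f} s = {total_hours:.1f} h")
```

The result is about 7.5 hours, consistent with the "7 or 8 hours" quoted.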

Edit 2

For example: i goes from 0 to 3000
Bn is a 3001x1 float array
value, part and normx are float scalars
leg is a 3001x1 float array

i = np.arange(Lmin,Lmax+1)
kernel = np.sum(((2*i+1)/part)*((value)**(i+1))*leg[i]*Bn[i]*((i-1)/normx))

What I have tried so far (the fastest version):

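 @njit
 def trial(normx, Lmin, Lmax, Bn):
     kernel = 0.0
     part = something*4*np.pi   # "something" is left unspecified in the question
     value = some_value/normx   # "some value" is left unspecified in the question
     leg = some.funtions()      # placeholder for the routine that builds leg
     for i in range(Lmin, Lmax+1):
         kernel += ((2*i+1)/part)*((value)**(i+1))*leg[i]*Bn[i]
     return kernel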

Recommended answer

The multiplications are not what takes the time in this calculation. The exponentiation ((value)**(i+1)), however, is very costly and most likely unnecessary. It is also not clear whether your function uses global variables. If it does, avoid them and pass the values in as arguments.

Original implementation

import numba as nb
import numpy as np

@nb.njit(fastmath=True,error_model="numpy")
def trial_orig(part,value,Lmin,Lmax,Bn,leg):
    kernel = 0.
    # Start at Lmin, as in the loop from the question
    for i in range(Lmin,Lmax+1):
        kernel += ((2*i+1)/part)*((value)**(i+1))*leg[i]*Bn[i]
    return kernel

Avoiding exponentiation

Please note that repeatedly multiplying fact by value is only algebraically the same as computing the power directly. Since this is floating-point arithmetic, the results may differ slightly.

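@nb.njit(fastmath=True,error_model="numpy")
def trial_mod(part,value,Lmin,Lmax,Bn,leg):
    # Assumes Lmin is never negative
    assert Lmin>=0
    kernel = 0.
    fact=value**(Lmin+1)  # the only exponentiation, done once
    # Lmax+1 so that the last term is included, matching trial_orig
    for i in range(Lmin,Lmax+1):
        kernel += ((2*i+1)/part)*fact*leg[i]*Bn[i]
        fact*=value  # now holds value**(i+2) for the next iteration
    return kernel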
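To see how closely the running-product form tracks direct exponentiation, the two loops can be compared in plain Python. This is a minimal sketch without numba; the function names kernel_pow and kernel_running are hypothetical, the inputs are random test data, and both loops are assumed to cover Lmin through Lmax inclusive.

```python
import math
import random

def kernel_pow(part, value, Lmin, Lmax, Bn, leg):
    """Direct form: evaluates value**(i+1) for every term."""
    total = 0.0
    for i in range(Lmin, Lmax + 1):
        total += ((2 * i + 1) / part) * value ** (i + 1) * leg[i] * Bn[i]
    return total

def kernel_running(part, value, Lmin, Lmax, Bn, leg):
    """Running-product form: one extra multiplication per term, no power."""
    total = 0.0
    fact = value ** (Lmin + 1)
    for i in range(Lmin, Lmax + 1):
        total += ((2 * i + 1) / part) * fact * leg[i] * Bn[i]
        fact *= value  # fact now equals value**(i+2), up to rounding
    return total

# Random test data with the sizes given in the question (3001 entries).
random.seed(0)
Lmin, Lmax = 0, 3000
leg = [random.random() for _ in range(Lmax + 1)]
Bn = [random.random() for _ in range(Lmax + 1)]

a = kernel_pow(15.0, 0.8, Lmin, Lmax, Bn, leg)
b = kernel_running(15.0, 0.8, Lmin, Lmax, Bn, leg)
print(a, b, math.isclose(a, b, rel_tol=1e-9))
```

The two sums need not be bit-identical, so the comparison uses a relative tolerance rather than ==.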

Timings

leg=np.random.rand(3001)
Bn=np.random.rand(3001)
Lmin=0
Lmax=3000
part=15.
value=0.8

%timeit trial_orig(part,value,Lmin,Lmax,Bn,leg)
100 µs ± 1.11 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit trial_mod(part,value,Lmin,Lmax,Bn,leg)
4.25 µs ± 21.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
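As a rough illustration of what these timings would mean for the full 250000x6000 matrix, the arithmetic can be sketched as follows. It assumes the measured per-call times carry over unchanged, that leg and Bn are already available, and that the work scales perfectly across 10 cores; none of this is confirmed by the question.

```python
# Speedup and projected total runtime from the %timeit figures above.
t_orig = 100e-6   # seconds per call, trial_orig
t_mod = 4.25e-6   # seconds per call, trial_mod

speedup = t_orig / t_mod

rows, cols, cores = 250_000, 6_000, 10   # figures from the question
projected_minutes = rows * cols * t_mod / cores / 60
print(f"speedup ~ {speedup:.1f}x, projected total ~ {projected_minutes:.1f} min")
```

Under these assumptions the kernel evaluations alone would take on the order of ten minutes rather than hours.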
