Can I increase the speed of array multiplication in Python 3?
Problem description
I need to speed up the simple code below. I already use numba and pypy, and one execution takes about 0.00018 s. However, I need to reduce the execution time further. Is there any way to do that?
Edit 1
I have a huge matrix, about 250000x6000. For each element I have to run the code below. I use parallel processing with 10 cores, which means (250000*6000*0.00018 s / 10), about 7 or 8 hours.
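The estimate above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, and it assumes the work scales perfectly across the 10 cores.

```python
# Back-of-the-envelope runtime estimate using the numbers from the question.
rows, cols = 250_000, 6_000      # matrix size
seconds_per_call = 0.00018       # measured time for one kernel evaluation
cores = 10                       # assumes ideal parallel scaling

total_seconds = rows * cols * seconds_per_call / cores
total_hours = total_seconds / 3600
print(f"{total_seconds:.0f} s, about {total_hours:.1f} hours")
```

Because the total is proportional to the per-call time, any speedup of the kernel shortens the whole run by the same factor.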
Edit 2
For example:
i goes from 0 to 3000
Bn is a 3001x1 float array
value, part and normx are float scalars
leg is a 3001x1 float array
i = np.arange(Lmin,Lmax+1)
kernel = np.sum(((2*i+1)/part)*((value)**(i+1))*leg[i]*Bn[i]*((i-1)/normx))
What I have tried so far (the fastest version)
@njit
def trial(normx, Lmin, Lmax, Bn):
    kernel = 0
    # The right-hand sides of the next three lines are placeholders from the original post
    part = something*4*np.pi
    value = some value/normx
    leg = some.functions()
    for i in range(Lmin, Lmax+1):
        kernel += ((2*i+1)/part)*((value)**(i+1))*leg[i]*Bn[i]
    return kernel
Recommended answer
Multiplication is not the time-consuming part of this calculation. The exponentiation ((value)**(i+1)) is very costly and likely unnecessary. It is also not clear whether you use global variables in your function. If so, avoid them.
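Original implementation
import numba as nb
import numpy as np

@nb.njit(fastmath=True,error_model="numpy")
def trial_orig(part,value,Lmin,Lmax,Bn,leg):
    kernel = 0.
    # One exponentiation per term, as in the question
    for i in range(Lmin,Lmax+1):
        kernel += ((2*i+1)/part)*((value)**(i+1))*leg[i]*Bn[i]
    return kernel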
Avoiding exponentiation
Please note that repeatedly multiplying fact by value is only algebraically the same as computing the power directly. Since this is floating-point arithmetic, the results may differ slightly.
@nb.njit(fastmath=True,error_model="numpy")
def trial_mod(part,value,Lmin,Lmax,Bn,leg):
    # Assumes that Lmin is always >= 0
    assert Lmin>=0
    kernel = 0.
    # First power, computed once
    fact=value**(Lmin+1)
    for i in range(Lmin,Lmax+1):
        kernel += ((2*i+1)/part)*fact*leg[i]*Bn[i]
        # Next power via a single multiplication
        fact*=value
    return kernel
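To get a feel for how much the two approaches differ numerically, the following sketch compares them in plain Python without Numba. The function names kernel_pow and kernel_running are hypothetical, the random inputs are only illustrative, and both loops cover the inclusive range Lmin..Lmax as in the question.

```python
import numpy as np

def kernel_pow(part, value, Lmin, Lmax, Bn, leg):
    """Reference version: one exponentiation per term."""
    kernel = 0.0
    for i in range(Lmin, Lmax + 1):
        kernel += ((2 * i + 1) / part) * value ** (i + 1) * leg[i] * Bn[i]
    return kernel

def kernel_running(part, value, Lmin, Lmax, Bn, leg):
    """Running-product version: one multiplication per term instead of a power."""
    kernel = 0.0
    fact = value ** (Lmin + 1)
    for i in range(Lmin, Lmax + 1):
        kernel += ((2 * i + 1) / part) * fact * leg[i] * Bn[i]
        fact *= value
    return kernel

# Illustrative inputs (assumed, not the asker's real data)
rng = np.random.default_rng(0)
leg = rng.random(3001)
Bn = rng.random(3001)

a = kernel_pow(15.0, 0.8, 0, 3000, Bn, leg)
b = kernel_running(15.0, 0.8, 0, 3000, Bn, leg)
rel_diff = abs(a - b) / abs(a)
print(a, b, rel_diff)
```

For inputs like these the relative difference should be tiny, but it is worth repeating the check with real data, since rounding errors in the running product accumulate with the number of terms.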
Timings
leg=np.random.rand(3001)
Bn=np.random.rand(3001)
Lmin=0
Lmax=3000
part=15.
value=0.8
%timeit trial_orig(part,value,Lmin,Lmax,Bn,leg)
100 µs ± 1.11 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit trial_mod(part,value,Lmin,Lmax,Bn,leg)
4.25 µs ± 21.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
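The same idea can probably be carried over to the NumPy expression from the question, where np.cumprod can build the successive powers. This is a sketch only and not part of the original answer: the function names are hypothetical, the arrays are assumed to be 1-D, Lmin <= Lmax is assumed, and whether it is actually faster should be measured on the real data.

```python
import numpy as np

def kernel_numpy_pow(part, value, normx, Lmin, Lmax, Bn, leg):
    """Vectorized expression as written in the question (uses a power)."""
    i = np.arange(Lmin, Lmax + 1)
    return np.sum(((2 * i + 1) / part) * value ** (i + 1)
                  * leg[i] * Bn[i] * ((i - 1) / normx))

def kernel_numpy_cumprod(part, value, normx, Lmin, Lmax, Bn, leg):
    """Same sum, with the powers built by a cumulative product."""
    i = np.arange(Lmin, Lmax + 1)
    factors = np.full(i.size, value, dtype=np.float64)
    factors[0] = value ** (Lmin + 1)   # first power, computed once
    powers = np.cumprod(factors)       # value**(Lmin+1), value**(Lmin+2), ...
    return np.sum(((2 * i + 1) / part) * powers
                  * leg[i] * Bn[i] * ((i - 1) / normx))

# Illustrative inputs (assumed values)
rng = np.random.default_rng(1)
leg = rng.random(3001)
Bn = rng.random(3001)
k_pow = kernel_numpy_pow(15.0, 0.8, 2.0, 0, 3000, Bn, leg)
k_cum = kernel_numpy_cumprod(15.0, 0.8, 2.0, 0, 3000, Bn, leg)
print(k_pow, k_cum)
```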