Python improving function speed
Question
I am writing my own script to calculate the relation between two signals. To do that I use the mlab.csd and mlab.psd functions to compute the CSD and PSD of the signals. My array x has the shape (120,68,68,815). The script runs for several minutes, and this function is the hotspot that accounts for most of that time.
Does anyone have an idea what I should do? I am not that familiar with improving script performance. Thanks!
import pickle
import numpy as np
from matplotlib import mlab

# to read the list of stcs for all the epochs
with open('/home/daniel/Dropbox/F[...]', 'rb') as f:
    label_ts = pickle.load(f)
x = np.asarray(label_ts)
nfft = 512
n_freqs = nfft/2+1
n_epochs = len(x) # in this case there are 120 epochs
channels = 68
sfreq = 1017.25

def compute_mean_psd_csd(x, n_epochs, nfft, sfreq):
    '''Computes mean of PSD and CSD for signals.'''
    Rxy = np.zeros((n_epochs, channels, channels, n_freqs), dtype=complex)
    Rxx = np.zeros((n_epochs, channels, channels, n_freqs))
    Ryy = np.zeros((n_epochs, channels, channels, n_freqs))
    for i in xrange(0, n_epochs):
        print('computing connectivity for epoch %s'%(i+1))
        for j in xrange(0, channels):
            for k in xrange(0, channels):
                Rxy[i,j,k], freqs = mlab.csd(x[j], x[k], NFFT=nfft, Fs=sfreq)
                Rxx[i,j,k], _____ = mlab.psd(x[j], NFFT=nfft, Fs=sfreq)
                Ryy[i,j,k], _____ = mlab.psd(x[k], NFFT=nfft, Fs=sfreq)
    Rxy_mean = np.mean(Rxy, axis=0, dtype=np.float32)
    Rxx_mean = np.mean(Rxx, axis=0, dtype=np.float32)
    Ryy_mean = np.mean(Ryy, axis=0, dtype=np.float32)
    return freqs, Rxy, Rxy_mean, np.real(Rxx_mean), np.real(Ryy_mean)
Recommended answer
Something that could help if the csd and psd methods are computationally intensive: you could probably cache the results of previous calls and reuse them instead of computing the same thing multiple times.
By the look of it, the body of the innermost loop runs 120 * 68 * 68 = 554880 times.
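A quick count with the sizes from the question (120 epochs, 68 channels) gives a rough idea of how much of that work is repeated. The variable names below are illustrative only.

```python
# Sizes taken from the question.
n_epochs = 120
channels = 68

# One pass through the innermost loop body per (epoch, j, k) triple.
iterations = n_epochs * channels * channels

# Each pass calls mlab.psd twice and mlab.csd once.
psd_calls = 2 * iterations
csd_calls = iterations

# As the loop is written, psd only ever receives x[j] or x[k],
# so there are at most `channels` distinct inputs.
distinct_psd_inputs = channels

print(iterations, psd_calls, distinct_psd_inputs)
```

So the psd calls in particular are dominated by recomputation of results that have already been produced.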
In the case of the psd calculation, it should be possible to cache the values without problems, as the method depends on only one signal.
Store the value in a dict under a key that identifies x[j] or x[k] (for example the index, since a NumPy array itself is not hashable), and check whether the value exists. If it doesn't exist, compute it and store it. If it does exist, skip the computation and reuse the stored value.
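# cache_psd = {} must be created once, before the loops.
# NumPy arrays are not hashable, so the cache is keyed on the index, not on x[j].
if j not in cache_psd:
    cache_psd[j], ____ = mlab.psd(x[j], NFFT=nfft, Fs=sfreq)
Rxx[i,j,k] = cache_psd[j]
if k not in cache_psd:
    cache_psd[k], ____ = mlab.psd(x[k], NFFT=nfft, Fs=sfreq)
Ryy[i,j,k] = cache_psd[k]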
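A minimal, self-contained sketch of the same idea follows. It keys the cache on the channel index because an array cannot be used as a dict key. Here expensive_spectrum is a hypothetical stand-in for mlab.psd, and the small signal list is made up for illustration.

```python
# Hypothetical stand-in for mlab.psd: any pure function of one signal.
calls = {"count": 0}

def expensive_spectrum(signal):
    calls["count"] += 1
    return [v * v for v in signal]

# Made-up data: 3 channels, 4 samples each.
x = [[1, 2, 3, 4], [0, 1, 0, 1], [2, 2, 2, 2]]
channels = len(x)
n_epochs = 5

cache_psd = {}  # channel index -> cached result

def cached_spectrum(j):
    # Compute on the first request, reuse afterwards.
    if j not in cache_psd:
        cache_psd[j] = expensive_spectrum(x[j])
    return cache_psd[j]

for i in range(n_epochs):
    for j in range(channels):
        for k in range(channels):
            rxx = cached_spectrum(j)
            ryy = cached_spectrum(k)

print(calls["count"])  # number of real computations performed
```

This is only valid while the cached input does not change between epochs. If each epoch has its own data, the key would need to include the epoch index as well, for example (i, j).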
You can do the same with the csd method. I don't know enough about it to say more. If the order of the parameters doesn't matter, you can store the two parameters in sorted order to prevent duplicates such as 2, 1 and 1, 2.
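For a cross-spectral density the argument order does matter, but only up to complex conjugation: mathematically, the result for (y, x) is the complex conjugate of the result for (x, y). A sketch of a cache that stores each unordered pair once and conjugates on the way out, assuming that symmetry; cross_spectrum is a hypothetical stand-in for mlab.csd and the data is made up:

```python
calls = {"count": 0}

def cross_spectrum(a, b):
    # Hypothetical stand-in: conj(a) * b, element by element.
    calls["count"] += 1
    return [complex(p).conjugate() * q for p, q in zip(a, b)]

# Made-up data: 3 channels, 2 complex samples each.
x = [[1 + 2j, 3 - 1j], [0 + 1j, 2 + 2j], [1 - 1j, -1 + 0j]]
channels = len(x)

cache_csd = {}  # (low index, high index) -> cached result

def cached_cross_spectrum(j, k):
    # Sort the pair so (2, 1) and (1, 2) share one cache entry.
    key = (j, k) if j <= k else (k, j)
    if key not in cache_csd:
        cache_csd[key] = cross_spectrum(x[key[0]], x[key[1]])
    result = cache_csd[key]
    # Swapped order: return the conjugate of the stored value.
    return result if j <= k else [v.conjugate() for v in result]

for j in range(channels):
    for k in range(channels):
        cached_cross_spectrum(j, k)

print(calls["count"])  # pairs actually computed
```

With this scheme roughly half of the pairwise computations are skipped, on top of any savings from not repeating them per epoch.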
Using the cache will make the code faster only if looking up and storing a value takes less time than computing it. This fix could easily be added with a module that does memoization.
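In Python 3, one such module is functools from the standard library, whose lru_cache decorator memoizes a function by its arguments. The question's code is Python 2 (it uses xrange), so this is a sketch of the approach and not a drop-in change; channel_spectrum and the data are hypothetical. The arguments must be hashable, which is why the function takes an index and not an array.

```python
from functools import lru_cache

# Made-up data standing in for the per-channel signals.
x = [[1, 2, 3], [4, 5, 6]]

@lru_cache(maxsize=None)
def channel_spectrum(j):
    # Hypothetical stand-in for an expensive call such as mlab.psd(x[j], ...).
    return tuple(v * v for v in x[j])

# Repeated requests for the same channel hit the cache.
for _ in range(100):
    for j in range(len(x)):
        channel_spectrum(j)

info = channel_spectrum.cache_info()
print(info.misses, info.hits)
```

The cache_info() counters make it easy to check whether the cache is actually being hit before relying on it for a speed-up.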
Here's an article about memoization for further reading:
http://www.python-course.eu/python3_memoization.php