How to perform 100000 2D FFTs faster using Python?

Views: 88  Published: 2021/5/6 20:58:38  Tags: python, numpy, parallel-processing, fft

Problem description

I have a 3D NumPy array with a shape of (100000, 256, 256), and I'd like to apply a 2D FFT to every 2D slice of the stack, which means 100000 FFTs.

I have tested the speed for a single slice and for stacked data with the minimal code below.

import numpy as np
a = np.random.random((256, 256))
b = np.random.random((10, 256, 256))

%timeit np.fft.fft2(a)

%timeit np.fft.fftn(b, axes=(1, 2,))

which gives the following:

872 µs ± 19.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

6.46 ms ± 227 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
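
As a quick sanity check, the stacked call performs one independent 2D FFT per slice, so it can replace a Python loop over slices. A minimal sketch (the seeded generator and the variable names are illustrative choices, not part of the original question):

```python
import numpy as np

rng = np.random.default_rng(0)
b = rng.random((10, 256, 256))

# One call over the whole stack, transforming axes 1 and 2 only.
stacked = np.fft.fftn(b, axes=(1, 2))

# The same computation, one slice at a time.
looped = np.stack([np.fft.fft2(s) for s in b])

# np.fft.fft2 defaults to the last two axes, so it accepts the stack directly.
default_axes = np.fft.fft2(b)

same_as_loop = np.allclose(stacked, looped)
same_as_fft2 = np.allclose(stacked, default_axes)
print(same_as_loop, same_as_fft2)
```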

100000 FFTs will take more than one minute.
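
The estimate follows directly from the two timings reported above. A short worked calculation, assuming the per-call cost stays constant over the whole data set:

```python
# Timings reported above by %timeit, in seconds.
t_single = 872e-6        # one 256x256 slice via np.fft.fft2
t_stack_of_10 = 6.46e-3  # ten slices via np.fft.fftn
n_slices = 100_000

total_single = n_slices * t_single             # calling fft2 once per slice
total_stacked = n_slices / 10 * t_stack_of_10  # calling fftn on stacks of 10

print(f"one slice per call: {total_single:.1f} s")
print(f"stacks of 10:       {total_stacked:.1f} s")
```

Both figures are above 60 seconds, and stacking alone only saves about a quarter of the time.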

Is there a faster way to perform multiple FFTs or IFFTs at the same time?

Update: After a bit of searching, I found cupy, which seems like it could help.
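
For reference, a hedged sketch of what the cupy route might look like. It assumes cupy is installed with a usable CUDA device and falls back to NumPy otherwise so the snippet still runs; the names xp, d_batch and d_out are illustrative, and no speedup is implied by the code itself:

```python
import numpy as np

try:
    import cupy as cp
    if cp.cuda.runtime.getDeviceCount() < 1:
        raise RuntimeError("no CUDA device available")
    xp = cp
except Exception:
    cp = None
    xp = np  # fallback so the sketch still runs without a GPU

batch = np.random.random((10, 256, 256))

# Move the batch to the device (returns the same array when xp is NumPy).
d_batch = xp.asarray(batch)

# cupy.fft.fft2 mirrors numpy.fft.fft2; axes=(1, 2) transforms each slice.
d_out = xp.fft.fft2(d_batch, axes=(1, 2))

# Bring the result back to host memory as a NumPy array.
out = cp.asnumpy(d_out) if cp is not None else d_out
print(out.shape, out.dtype)
```

Host-to-device transfers are part of the cost, so any timing comparison should include them.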


Recommended answer

pyfftw, which wraps the FFTW library, is likely faster than the FFTPACK library wrapped by np.fft and scipy.fftpack. After all, FFTW stands for Fastest Fourier Transform in the West.

A minimal example:
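import multiprocessing
import numpy as np
import pyfftw

b = np.random.random((100, 256, 256))

# Aligned buffers: real input, complex output (last axis holds 256 // 2 + 1 = 129 entries)
bb = pyfftw.empty_aligned((100, 256, 256), dtype='float64')
bf = pyfftw.empty_aligned((100, 256, 129), dtype='complex128')

# Plan a real-to-complex 2D FFT over axes 1 and 2 of the whole stack
fft_object_b = pyfftw.FFTW(bb, bf, axes=(1, 2), flags=('FFTW_MEASURE',),
                           direction='FFTW_FORWARD', threads=multiprocessing.cpu_count())

# Fill the input buffer after planning; "bb = b" would only rebind the name
bb[:] = b
fft_object_b()  # the result is written to bf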
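
The output buffer in the code above has shape (100, 256, 129) rather than (100, 256, 256) because the input is real: a real-to-complex transform stores only the non-redundant half of the last axis, 256 // 2 + 1 = 129 entries. NumPy's np.fft.rfft2 follows the same convention, so it can illustrate the layout without pyfftw (the small stack size here is just for illustration):

```python
import numpy as np

b = np.random.random((4, 256, 256))

full = np.fft.fft2(b, axes=(1, 2))   # complex output, shape (4, 256, 256)
half = np.fft.rfft2(b, axes=(1, 2))  # real-input transform, shape (4, 256, 129)

print(full.shape, half.shape)

# The half-spectrum equals the first 129 columns of the full spectrum.
matches = np.allclose(half, full[:, :, :129])
print(matches)
```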

Here is extended code timing the execution of np.fft and pyfftw:
import multiprocessing
from timeit import default_timer as timer

import numpy as np
import pyfftw

a = np.random.random((256, 256))
b = np.random.random((100, 256, 256))

# np.fft on a single slice
start = timer()
for i in range(10):
    np.fft.fft2(a)
end = timer()
print("np.fft.fft2, 1 slice", (end - start) / 10)

# np.fft on a stack of 100 slices
start = timer()
for i in range(10):
    bf = np.fft.fftn(b, axes=(1, 2))
end = timer()
print("np.fft.fftn, 100 slices", (end - start) / 10)
print("bf[3,42,42]", bf[3, 42, 42])

# Aligned input/output buffers for the real-to-complex pyfftw transforms
aa = pyfftw.empty_aligned((256, 256), dtype='float64')
af = pyfftw.empty_aligned((256, 129), dtype='complex128')
bb = pyfftw.empty_aligned((100, 256, 256), dtype='float64')
bf = pyfftw.empty_aligned((100, 256, 129), dtype='complex128')
n_threads = multiprocessing.cpu_count()
print('number of threads:', n_threads)

fft_object_a = pyfftw.FFTW(aa, af, axes=(0, 1), flags=('FFTW_MEASURE',),
                           direction='FFTW_FORWARD', threads=n_threads)

fft_object_b = pyfftw.FFTW(bb, bf, axes=(1, 2), flags=('FFTW_MEASURE',),
                           direction='FFTW_FORWARD', threads=n_threads)

# Fill the buffers after planning; "aa = a" would only rebind the names
aa[:] = a
bb[:] = b

# pyfftw on a single slice
start = timer()
for i in range(10):
    fft_object_a()
end = timer()
print("pyfftw, 1 slice", (end - start) / 10)

# pyfftw on a stack of 100 slices
start = timer()
for i in range(10):
    fft_object_b()
end = timer()
print("pyfftw, 100 slices", (end - start) / 10)
print("bf[3,42,42]", bf[3, 42, 42])

Finally, the outcome is a significant speedup: pyfftw proves about 10 times faster than np.fft on my computer, using 2 threads.
np.fft.fft2, 1 slice 0.00459032058716
np.fft.fftn, 100 slices 0.478203487396
bf[3,42,42] (-38.190256258791734+43.03902512127183j)
number of threads: 2
pyfftw, 1 slice 0.000421094894409
pyfftw, 100 slices 0.0439268112183
bf[3,42,42] (-38.19025625879178+43.03902512127183j)

Your computer seems much better than mine!
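
For the full problem, a float64 array of shape (100000, 256, 256) takes about 52 GB, so in practice the slices would probably be transformed in batches whichever backend is used. A minimal sketch with NumPy standing in for the FFT backend; the helper name fft2_in_batches, the batch size and the small stand-in array are hypothetical choices for illustration:

```python
import numpy as np

def fft2_in_batches(stack, batch_size=100):
    """Yield (start_index, spectrum) for consecutive batches of 2D slices."""
    for start in range(0, stack.shape[0], batch_size):
        batch = stack[start:start + batch_size]
        # Real-input 2D FFT of every slice in the batch.
        yield start, np.fft.rfft2(batch, axes=(1, 2))

# Small stand-in for the (100000, 256, 256) array.
data = np.random.random((25, 256, 256))

starts = []
n_done = 0
for start, spectrum in fft2_in_batches(data, batch_size=10):
    starts.append(start)
    n_done += spectrum.shape[0]  # consume each batch here (save, reduce, ...)

print(starts, n_done)
```

With pyfftw, the planned FFTW object from the answer could be reused inside the loop by copying each batch into the aligned input buffer before calling it.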
