Comparatively slow python numpy 3D Fourier Transformation


Question

For my work I need to perform discrete Fourier transforms (DFTs) on large images. In the current example I need a 3D FFT of a 1921 x 512 x 512 image (along with 2D FFTs of 512 x 512 images). Right now I am using the numpy package and its function np.fft.fftn(). The snippet below times a 2D FFT on an equal-sized grid and a 3D FFT on a somewhat smaller grid, both filled with random numbers:

import time

import numpy as np

# Build the random test grids, timing each allocation.
tas = time.time()
a = np.random.rand(512, 512)
tab = time.time()
b = np.random.rand(100, 512, 512)
tbfa = time.time()

# Time the 2D transform, then the 3D transform.
fa = np.fft.fft2(a)
tfafb = time.time()
fb = np.fft.fftn(b)
tfbe = time.time()

print("initializing 512 x 512 grid:", tab - tas)
print("initializing 100 x 512 x 512 grid:", tbfa - tab)
print("2D FFT on 512 x 512 grid:", tfafb - tbfa)
print("3D FFT on 100 x 512 x 512 grid:", tfbe - tfafb)

Output:

initializing 512 x 512 grid: 0.00305700302124
initializing 100 x 512 x 512 grid: 0.301637887955
2D FFT on 512 x 512 grid: 0.0122730731964
3D FFT on 100 x 512 x 512 grid: 3.88418793678

The problem is that I will need this process quite often, so the time spent per image should be short. On my own computer (a mid-range laptop with 2 GB of RAM allocated to a virtual machine, hence the smaller test grid), the 3D FFT takes about 5 s (order of magnitude), as you can see. At work the machines are far better (cluster/grid-architecture systems) and the FFTs are much faster. In both cases the 2D FFTs finish almost instantaneously.

However, with 1921 x 512 x 512, np.fft.fftn() takes about 5 min. I guess scipy's implementation is not much faster, and in MATLAB, FFTs of same-sized grids finish within about 5 s, so my question is whether there is a way to speed the process up to, or close to, MATLAB times. My knowledge of FFTs is limited, but apparently MATLAB uses the FFTW library, which numpy does not. Is there a reasonable chance that I would get similar times with a package such as pyFFTW? Also, 1921 seems an unlucky choice, since it has only two prime factors (17 and 113), so I assume this also plays a role. On the other hand, 512 is a well-suited power of two. Are MATLAB-like times achievable, if possible without zero-padding up to 2048?

I'm asking because I'll have to use FFTs a lot (to the point where such differences will have a huge influence!), and if there is no way to reduce computation times in python, I'd have to switch to other, faster implementations.
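As a rough illustration of the size concern raised above, the sketch below factorizes the axis lengths and searches for the next length whose only prime factors are small. The helper names `prime_factors` and `next_smooth_length` are hypothetical, and the choice of 2, 3 and 5 as "small" primes is an assumption; it is not taken from the question or from any particular FFT library.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order, with multiplicity."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever is left is itself prime
    return factors


def next_smooth_length(n, primes=(2, 3, 5)):
    """Return the smallest m >= n whose prime factors all lie in `primes`.

    The default set (2, 3, 5) is an illustrative assumption.
    """
    m = n
    while True:
        rest = m
        for p in primes:
            while rest % p == 0:
                rest //= p
        if rest == 1:
            return m
        m += 1


if __name__ == "__main__":
    print(prime_factors(1921))       # [17, 113]
    print(prime_factors(512))        # nine factors of 2
    print(next_smooth_length(1921))  # 1944 = 2**3 * 3**5
```

Under these assumptions, a padded length of 1944 would already avoid the large prime factor 113, without going all the way to 2048. Whether padding is acceptable depends on the application, because the padded transform is not the same quantity as the original one.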

Recommended answer

Yes, there is a chance that using FFTW through the pyfftw interface will reduce your computation time compared to numpy.fft or scipy.fftpack. The performance of these DFT implementations can be compared in benchmarks; some interesting results are reported in "Improving FFT performance in Python".

I suggest the following code for a test:

import time

import numpy
import pyfftw
import pyfftw.interfaces.numpy_fft
import scipy.fftpack

# Byte-aligned input array (empty_aligned replaces the deprecated n_byte_align_empty).
f = pyfftw.empty_aligned((127, 512, 512), dtype='complex128', n=16)
# f = pyfftw.empty_aligned((33, 128, 128), dtype='complex128', n=16)  # smaller test size
f[:] = numpy.random.randn(*f.shape)

# The first call takes longer because the FFTW plan is created here.
# With FFTW_MEASURE planning, many 3D DFTs are computed to choose the fastest algorithm.
fftf = pyfftw.interfaces.numpy_fft.fftn(f)

tas = time.time()
fftf = pyfftw.interfaces.numpy_fft.fftn(f)  # planning is already done; this mostly measures execution
tas = time.time() - tas
print("3D FFT, pyfftw:", tas)

# Fresh input for the numpy and scipy runs.
f = pyfftw.empty_aligned((127, 512, 512), dtype='complex128', n=16)
f[:] = numpy.random.randn(*f.shape)

tas = time.time()
fftf = numpy.fft.fftn(f)
tas = time.time() - tas
print("3D FFT, numpy:", tas)

tas = time.time()
fftf = scipy.fftpack.fftn(f)
tas = time.time() - tas
print("3D FFT, scipy/fftpack:", tas)

# Padded size: 128 instead of 127 along the first axis.
# The array is left uninitialized because only the timing matters here.
# Again, the first call pays for plan creation.
f = pyfftw.empty_aligned((128, 512, 512), dtype='complex128', n=16)
fftf = pyfftw.interfaces.numpy_fft.fftn(f)

tas = time.time()
fftf = pyfftw.interfaces.numpy_fft.fftn(f)  # planning is already done; this mostly measures execution
tas = time.time() - tas
print("3D padded FFT, pyfftw:", tas)

For a size of 127 x 512 x 512, on my modest computer, I got (times in seconds):

3D FFT, pyfftw: 3.94130897522
3D FFT, numpy: 16.0487070084
3D FFT, scipy/fftpack: 19.001199007
3D padded FFT, pyfftw: 2.55221295357

So pyfftw is significantly faster than numpy.fft and scipy.fftpack: about 4 s versus 16 s and 19 s in this test. Padding to 128 is faster still, but what is computed is different: it is the DFT of the zero-extended array, not of the original one.
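To make the caveat about padding concrete, here is a minimal sketch using numpy.fft on tiny arrays (sizes chosen only for illustration). It shows that a zero-padded transform has a different shape, and therefore a different frequency grid, from the unpadded one. In the special case where a 1D length is exactly doubled, every second bin of the padded result coincides with the original spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1D case: pad a length-4 signal with zeros to length 8.
x = rng.standard_normal(4)
X = np.fft.fft(x)                # 4 frequency bins
X_padded = np.fft.fft(x, n=8)    # 8 frequency bins of the zero-extended signal

# With exact doubling, the even-indexed bins match the original spectrum;
# the odd-indexed bins are new, interpolated frequencies.
doubling_matches = np.allclose(X_padded[::2], X)

# 3D case: pad the first axis from 3 to 4, analogous to 127 -> 128.
vol = rng.standard_normal((3, 4, 4))
F = np.fft.fftn(vol)
F_padded = np.fft.fftn(vol, s=(4, 4, 4), axes=(0, 1, 2))

# Only the zero-frequency bin (the plain sum of all samples) is guaranteed to agree.
dc_matches = np.isclose(F_padded[0, 0, 0], F[0, 0, 0])

if __name__ == "__main__":
    print(X.shape, X_padded.shape, doubling_matches)
    print(F.shape, F_padded.shape, dc_matches)
```

If the downstream analysis needs the spectrum on the original frequency grid, the padded result cannot simply be substituted for it.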

Lastly, pyfftw may seem slower on the first run because, according to the documentation, it uses the FFTW_MEASURE flag. That is a good thing if and only if many DFTs of the same size are computed in succession.
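The difference between the first call and later calls can be measured with a small harness like the one below. This is only a sketch: the function name `time_first_and_repeat` is hypothetical, and the code falls back to numpy.fft when pyfftw is not installed, in which case no planning step exists and the two timings should be similar. Enabling `pyfftw.interfaces.cache` is an addition that the answer's code does not make. It keeps the planned transform objects alive between calls.

```python
import time

import numpy as np

try:
    # Optional dependency: use pyfftw's numpy-compatible interface if available.
    import pyfftw
    import pyfftw.interfaces.numpy_fft as fft_backend

    pyfftw.interfaces.cache.enable()  # reuse planned transforms across calls
    BACKEND = "pyfftw"
except ImportError:
    import numpy.fft as fft_backend

    BACKEND = "numpy"


def time_first_and_repeat(shape, repeats=3):
    """Return (first_call_seconds, best_repeat_seconds) for a complex fftn."""
    rng = np.random.default_rng(0)
    data = rng.standard_normal(shape) + 0j  # complex128 input

    t0 = time.perf_counter()
    fft_backend.fftn(data)  # with pyfftw, this call includes plan creation
    first = time.perf_counter() - t0

    later = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fft_backend.fftn(data)  # same shape and dtype as the first call
        later.append(time.perf_counter() - t0)
    return first, min(later)


if __name__ == "__main__":
    first, best = time_first_and_repeat((32, 128, 128))
    print(BACKEND, "first call:", first, "best repeat:", best)
```

When only a handful of transforms of a given size are needed, the planning cost may outweigh the gain. The pyfftw interface functions accept a `planner_effort` argument (for example `'FFTW_ESTIMATE'`) for that situation.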
