FFTW与Matlab FFT [英] FFTW vs Matlab FFT

查看：159 发布时间：2020/5/6 12:26:27 c++ matlab mex fftw

本文介绍了FFTW与Matlab FFT的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我在matlab Central上发布了此内容，但没有得到任何回应，因此我认为我将在此处重新发布.

I posted this on matlab central but didn't get any responses so I figured I'd repost here.

我最近在Matlab中编写了一个简单的例程，该例程在for循环中使用FFT； FFT支配了计算.我出于实验目的在mex中编写了相同的例程，该例程称为FFTW 3.3库.事实证明，对于非常大的阵列，matlab例程的运行速度比mex例程的运行速度快(大约快一倍).混合例程使用智慧并执行相同的FFT计算.我也知道matlab使用FFTW，但是它们的版本是否可能会稍微优化一些?我什至使用了FFTW_EXHAUSTIVE标志，对于大型数组，它的速度仍然是MATLAB对应标志的两倍.此外，我确保使用的matlab是带有"-singleCompThread"标志的单线程，并且使用的mex文件不在调试模式下.只是想知道是否是这种情况-还是在我不知道的情况下matlab正在使用某些优化.谢谢.

I recently wrote a simple routine in Matlab that uses an FFT in a for-loop; the FFT dominates the calculations. I wrote the same routine in mex just for experimentation purposes and it calls the FFTW 3.3 library. It turns out that the matlab routine runs faster than the mex routine for very large arrays (about twice as fast). The mex routine uses wisdom and and performs the same FFT calculations. I also know matlab uses FFTW, but is it possible their version is slightly more optimized? I even used the FFTW_EXHAUSTIVE flag and its still about twice as slow for large arrays than the MATLAB counterpart. Furthermore I ensured the matlab I used was single threaded with the "-singleCompThread" flag and the mex file I used was not in debug mode. Just curious if this was the case - or if there are some optimizations matlab is using under the hood that I dont know about. Thanks.

这是混合部分:

void class_cg_toeplitz::analysis() {
// This method computes CG iterations using FFTs
    // Check for wisdom
    if(fftw_import_wisdom_from_filename("cd.wis") == 0) {
        mexPrintf("wisdom not loaded.\n");
    } else {
        mexPrintf("wisdom loaded.\n");
    }

    // Set FFTW Plan - use interleaved FFTW
    fftw_plan plan_forward_d_buffer;    
    fftw_plan plan_forward_A_vec;       
    fftw_plan plan_backward_Ad_buffer;
    fftw_complex *A_vec_fft;
    fftw_complex *d_buffer_fft;
    A_vec_fft = fftw_alloc_complex(n);
    d_buffer_fft = fftw_alloc_complex(n);

    // CREATE MASTER PLAN - Do this on an empty vector as creating a plane 
    // with FFTW_MEASURE will erase the contents; 
    // Use d_buffer
    // This is somewhat dangerous because Ad_buffer is a vector; but it does not
    // get resized so &Ad_buffer[0] should work
    plan_forward_d_buffer = fftw_plan_dft_r2c_1d(d_buffer.size(),&d_buffer[0],d_buffer_fft,FFTW_EXHAUSTIVE);
    plan_forward_A_vec = fftw_plan_dft_r2c_1d(A_vec.height,A_vec.value,A_vec_fft,FFTW_WISDOM_ONLY);
    // A_vec_fft.*d_buffer_fft will overwrite d_buffer_fft
    plan_backward_Ad_buffer = fftw_plan_dft_c2r_1d(Ad_buffer.size(),d_buffer_fft,&Ad_buffer[0],FFTW_EXHAUSTIVE);

    // Get A_vec_fft
    fftw_execute(plan_forward_A_vec);

    // Find initial direction - this is the initial residual
    for (int i=0;i<n;i++) {
        d_buffer[i] = b.value[i];
        r_buffer[i] = b.value[i];
    }    

    // Start CG iterations
    norm_ro = norm(r_buffer);
    double fft_reduction = (double)Ad_buffer.size(); // Must divide by size of vector because inverse FFT does not do this
    while (norm(r_buffer)/norm_ro > relativeresidual_cutoff) {        
        // Find Ad - use fft
        fftw_execute(plan_forward_d_buffer);    
        // Get A_vec_fft.*fft(d) - A_vec_fft is only real, but d_buffer_fft
        // has complex elements; Overwrite d_buffer_fft        
        for (int i=0;i<n;i++) {
            d_buffer_fft[i][0] = d_buffer_fft[i][0]*A_vec_fft[i][0]/fft_reduction;
            d_buffer_fft[i][1] = d_buffer_fft[i][1]*A_vec_fft[i][0]/fft_reduction;
        }        
        fftw_execute(plan_backward_Ad_buffer); 

        // Calculate r'*r
        rtr_buffer = 0;
        for (int i=0;i<n;i++) {
            rtr_buffer = rtr_buffer + r_buffer[i]*r_buffer[i];
        }    

        // Calculate alpha
        alpha = 0;
        for (int i=0;i<n;i++) {
            alpha = alpha + d_buffer[i]*Ad_buffer[i];
        }    
        alpha = rtr_buffer/alpha;

        // Calculate new x
        for (int i=0;i<n;i++) {
            x[i] = x[i] + alpha*d_buffer[i];
        }   

        // Calculate new residual
        for (int i=0;i<n;i++) {
            r_buffer[i] = r_buffer[i] - alpha*Ad_buffer[i];
        }   

        // Calculate beta
        beta = 0;
        for (int i=0;i<n;i++) {
            beta = beta + r_buffer[i]*r_buffer[i];
        }  
        beta = beta/rtr_buffer;

        // Calculate new direction vector
        for (int i=0;i<n;i++) {
            d_buffer[i] = r_buffer[i] + beta*d_buffer[i];
        }  

        *total_counter = *total_counter+1;
        if(*total_counter >= iteration_cutoff) {
            // Set total_counter to -1, this indicates failure
            *total_counter = -1;
            break;
        }
    }

    // Store Wisdom
    fftw_export_wisdom_to_filename("cd.wis");

    // Free fft alloc'd memory and plans
    fftw_destroy_plan(plan_forward_d_buffer);
    fftw_destroy_plan(plan_forward_A_vec);
    fftw_destroy_plan(plan_backward_Ad_buffer);
    fftw_free(A_vec_fft);
    fftw_free(d_buffer_fft);
};

这是matlab部分:

Here's the matlab portion:

% Take FFT of A_vec.
A_vec_fft = fft(A_vec); % Take fft once

% Find initial direction - this is the initial residual 
x = zeros(n,1); % search direction
r = zeros(n,1); % residual
d = zeros(n+(n-2),1); % search direction; pad to allow FFT
for i = 1:n
    d(i) = b(i); 
    r(i) = b(i); 
end

% Enter CG iterations
total_counter = 0;
rtr_buffer = 0;
alpha = 0;
beta = 0;
Ad_buffer = zeros(n+(n-2),1); % This holds the product of A*d - calculate this once per iteration and using FFT; only 1:n is used
norm_ro = norm(r);

while(norm(r)/norm_ro > 10^-6)
    % Find Ad - use fft
    Ad_buffer = ifft(A_vec_fft.*fft(d)); 

    % Calculate rtr_buffer
    rtr_buffer = r'*r;

    % Calculate alpha    
    alpha = rtr_buffer/(d(1:n)'*Ad_buffer(1:n));

    % Calculate new x
    x = x + alpha*d(1:n);

    % Calculate new residual
    r = r - alpha*Ad_buffer(1:n);

    % Calculate beta
    beta = r'*r/(rtr_buffer);

    % Calculate new direction vector
    d(1:n) = r + beta*d(1:n);      

    % Update counter
    total_counter = total_counter+1; 
end

就时间而言，对于N = 50000和b = 1:n，使用mex大约需要10.5秒，而使用matlab大约需要4.4秒.我正在使用R2011b.谢谢

In terms of time, for N = 50000 and b = 1:n it takes about 10.5 seconds with mex and 4.4 seconds with matlab. I'm using R2011b. Thanks

FFTW与Matlab FFT [英] FFTW vs Matlab FFT

问题描述

推荐答案

相关文章

C/C++开发最新文章

热门教程

热门工具

登录关闭

FFTW与Matlab FFT [英] FFTW vs Matlab FFT

问题描述

推荐答案

相关文章

C/C++开发最新文章

热门教程

热门工具

登录 关闭

登录关闭