How to get cufftComplex magnitude and phase fast

Problem description

I have a cufftComplex data block that is the result of a CUDA FFT (R2C). I know the data is stored as a structure with a real part followed by an imaginary part. Now I want to get the amplitude = sqrt(R*R + I*I) and the phase = arctan(I/R) of each complex element in a fast way (not a for loop). Is there a good way to do that, or a library that can do it?

Solution

Since cufftExecR2C operates on data that is on the GPU, the results are already on the GPU (before you copy them back to the host, if you are doing that).

It should be straightforward to write your own CUDA kernel to accomplish this. The amplitude you're describing is the value returned by cuCabs (double precision) or cuCabsf (single precision) in the cuComplex.h header file. By looking at the functions in that header file, you should be able to figure out how to write your own function that computes the phase angle. You'll note that cufftComplex is just a typedef of cuComplex.

Let's say your cufftExecR2C call left some results of type cufftComplex in an array data of size sz. Your kernel might look like this:

#include <stdio.h>
#include <math.h>
#include <cuComplex.h>
#include <cufft.h>
#define nTPB 256    // threads per block for kernel
#define sz 100000   // or whatever your output data size is from the FFT
                    // (a 1D R2C transform of N real samples yields N/2+1 complex values)
...

__host__ __device__ float carg(const cuComplex& z) {return atan2f(cuCimagf(z), cuCrealf(z));} // polar angle, single precision

__global__ void magphase(cufftComplex *data, float *mag, float *phase, int dsz){
  int idx = threadIdx.x + blockDim.x*blockIdx.x;
  if (idx < dsz){
    mag[idx]   = cuCabsf(data[idx]);
    phase[idx] = carg(data[idx]);
  }
}

...
int main(){
...
    /* Use the CUFFT plan to transform the signal in place. */
    /* Your code might be something like this already:      */
    if (cufftExecR2C(plan, (cufftReal*)data, data) != CUFFT_SUCCESS){
      fprintf(stderr, "CUFFT error: ExecR2C Forward failed\n");
      return 1;   // main returns int, so a bare `return;` would not compile
    }
    /* then you might add:                                  */
    float *h_mag, *h_phase, *d_mag, *d_phase;
    // malloc your h_ arrays using host malloc first, then...
    cudaMalloc((void **)&d_mag, sz*sizeof(float));
    cudaMalloc((void **)&d_phase, sz*sizeof(float));
    // one thread per element; the grid size is rounded up to cover all sz elements
    magphase<<<(sz+nTPB-1)/nTPB, nTPB>>>(data, d_mag, d_phase, sz);
    // these blocking copies also wait for the kernel to finish
    cudaMemcpy(h_mag, d_mag, sz*sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(h_phase, d_phase, sz*sizeof(float), cudaMemcpyDeviceToHost);
    ...
}

You can also do this using Thrust by creating functors for the magnitude and phase functions, and passing these functors along with data, mag and phase to thrust::transform.
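A minimal sketch of the Thrust approach might look like the following. The functor names magnitude_op and phase_op are made up for illustration, and the four hand-written input values merely stand in for real FFT output; it needs to be compiled with nvcc as a .cu file.

```cpp
#include <cstdio>
#include <cuComplex.h>
#include <cufft.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/transform.h>

// Hypothetical functor names; they are not part of Thrust or cuFFT.
struct magnitude_op {
  __host__ __device__ float operator()(const cufftComplex &z) const {
    return cuCabsf(z);                         // sqrt(re*re + im*im)
  }
};

struct phase_op {
  __host__ __device__ float operator()(const cufftComplex &z) const {
    return atan2f(cuCimagf(z), cuCrealf(z));   // same formula as carg
  }
};

int main() {
  // Stand-in for the FFT output: a few hand-picked complex values.
  thrust::host_vector<cufftComplex> h_data(4);
  h_data[0] = make_cuFloatComplex( 3.0f,  4.0f);
  h_data[1] = make_cuFloatComplex(-1.0f,  1.0f);
  h_data[2] = make_cuFloatComplex( 0.0f, -2.0f);
  h_data[3] = make_cuFloatComplex( 1.0f,  0.0f);

  // Copy to the device and allocate the two output vectors there.
  thrust::device_vector<cufftComplex> d_data = h_data;
  thrust::device_vector<float> d_mag(d_data.size());
  thrust::device_vector<float> d_phase(d_data.size());

  // One element-wise pass per output, both executed on the GPU.
  thrust::transform(d_data.begin(), d_data.end(), d_mag.begin(), magnitude_op());
  thrust::transform(d_data.begin(), d_data.end(), d_phase.begin(), phase_op());

  // Copy the results back to the host and print them.
  thrust::host_vector<float> h_mag = d_mag;
  thrust::host_vector<float> h_phase = d_phase;
  for (size_t i = 0; i < h_mag.size(); ++i)
    printf("mag = %f  phase = %f\n", h_mag[i], h_phase[i]);
  return 0;
}
```

If the FFT result already lives in a raw device pointer obtained from cudaMalloc, as in the kernel version, wrapping it with thrust::device_pointer_cast(data) should let the same thrust::transform calls run on it without an extra copy.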

I'm sure you can probably do it with CUBLAS as well, using a combination of vector add and vector multiply operations.

This question/answer may be of interest as well. I lifted my phase function carg from there.
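Note that carg uses atan2(I, R) rather than the arctan(I/R) written in the question. A host-only sketch of why that matters, and of a CPU reference you could compare GPU results against, might look like this. The helper names magnitude_ref, phase_ref and phase_naive are hypothetical.

```cpp
#include <cmath>

// Hypothetical reference helpers for checking GPU results on the host.

// Magnitude sqrt(re*re + im*im), computed without intermediate overflow.
float magnitude_ref(float re, float im) { return std::hypot(re, im); }

// Phase in (-pi, pi], correct in all four quadrants and when re == 0.
float phase_ref(float re, float im) { return std::atan2(im, re); }

// The literal arctan(I/R) from the question: it only covers (-pi/2, pi/2)
// and divides by zero when re == 0.
float phase_naive(float re, float im) { return std::atan(im / re); }
```

For the second-quadrant value -1 + 1i, phase_ref returns 3π/4 while phase_naive returns -π/4, because the quotient I/R loses the individual signs of the real and imaginary parts.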
