Is it possible to overlap batched FFTs with CUDA's cuFFT library and cufftPlanMany?

Question

I am trying to parallelize the FFT transforms of an acoustic fingerprinting library known as Chromaprint. It works by "splitting the original audio into many overlapping frames and applying the Fourier transform on them." Chromaprint uses a frame size of 4096 with a 2/3 overlap, so each frame starts about one third of a frame after the previous one. For instance, the first frame consists of elements [0 ... 4095], and the second frame is something like [1365 ... 5460].

With cufftPlanMany, I know that you can specify transforms of size 4096 in a batch, which will process the contiguous blocks [0 ... 4095], [4096 ... 8191], etc. Is there some way to make the batched transforms overlap, or should I consider another approach that doesn't use batched execution?

Recommended answer

If you use the Advanced Data Layout, the idist parameter should allow you to set an arbitrary offset between the starting points of two successive transform input sets.

For the 1D case, the input elements are selected according to the following expression, based on the parameters you pass:

input[ b * idist + x * istride]

(where b is the index of the batch currently being processed, i.e. b = 0, 1, 2, ..., batch - 1, and x is the element index within that transform, i.e. x = 0, 1, ..., n - 1)
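To make the indexing rule concrete, here is a minimal host-side sketch. The helper names `input_index` and `complete_frames` are hypothetical (they are not part of cuFFT), and the hop of 1365 samples used in the usage note is an assumption derived from the frame size of 4096 and the 2/3 overlap described in the question.

```cpp
#include <cstddef>

// Position in the input array of element x of batch b,
// following input[b * idist + x * istride].
constexpr std::size_t input_index(std::size_t b, std::size_t x,
                                  std::size_t idist, std::size_t istride) {
    return b * idist + x * istride;
}

// Number of complete frames of length n that fit into num_samples
// when consecutive frames start idist samples apart (istride == 1).
constexpr std::size_t complete_frames(std::size_t num_samples,
                                      std::size_t n, std::size_t idist) {
    return num_samples < n ? 0 : (num_samples - n) / idist + 1;
}
```

With `idist = 1365` and `istride = 1`, batch 1 covers `input_index(1, 0, 1365, 1)` through `input_index(1, 4095, 1365, 1)`, i.e. samples 1365 to 5460, so it overlaps batch 0 by roughly two thirds. The value returned by `complete_frames` is what one would pass as the `batch` argument so that the last transform does not read past the end of the buffer.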

"The idist and odist parameters indicate the distance between the first element of two consecutive batches in the input and output data."
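As a rough illustration of the idea, the following untested sketch plans a batch of overlapping real-to-complex transforms by setting idist smaller than the transform length. The signal length, the hop of 1365 samples, and all variable names are assumptions made for the example, not values taken from Chromaprint or from the original answer. Note that inembed and onembed must be non-NULL, because cuFFT ignores the stride and distance arguments otherwise, and that the transform has to be out of place since the input frames share samples.

```cpp
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cufft.h>

int main() {
    const int frame = 4096;             // samples per transform
    const int hop = frame / 3;          // 1365, assumed distance between frame starts
    const int num_samples = 1 << 20;    // hypothetical signal length
    const int batch = (num_samples - frame) / hop + 1;  // complete frames only
    const int bins = frame / 2 + 1;     // R2C output length per frame

    std::vector<cufftReal> h_signal(num_samples, 0.0f);  // placeholder audio data

    cufftReal* d_in = nullptr;
    cufftComplex* d_out = nullptr;
    cudaMalloc(reinterpret_cast<void**>(&d_in), sizeof(cufftReal) * num_samples);
    cudaMalloc(reinterpret_cast<void**>(&d_out),
               sizeof(cufftComplex) * static_cast<size_t>(bins) * batch);
    cudaMemcpy(d_in, h_signal.data(), sizeof(cufftReal) * num_samples,
               cudaMemcpyHostToDevice);

    int n[1] = {frame};
    int inembed[1] = {frame};  // must be non-NULL to enable the advanced layout
    int onembed[1] = {bins};

    cufftHandle plan;
    // idist = hop makes consecutive input frames overlap;
    // odist = bins keeps the output spectra densely packed.
    cufftResult status = cufftPlanMany(&plan, 1, n,
                                       inembed, 1, hop,
                                       onembed, 1, bins,
                                       CUFFT_R2C, batch);
    if (status != CUFFT_SUCCESS) {
        std::fprintf(stderr, "cufftPlanMany failed: %d\n", static_cast<int>(status));
        return 1;
    }

    status = cufftExecR2C(plan, d_in, d_out);
    if (status != CUFFT_SUCCESS) {
        std::fprintf(stderr, "cufftExecR2C failed: %d\n", static_cast<int>(status));
        return 1;
    }
    cudaDeviceSynchronize();

    cufftDestroy(plan);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

If a window function has to be applied to each frame before the transform, it cannot be applied in place on the shared input buffer, because neighbouring frames read the same samples; that step would need separate handling.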
