Computing the discrete Fourier transform of audio data with FFTW

Question

I am quite new to signal processing, so forgive me if I rant on a bit. I have downloaded and installed FFTW for Windows. The documentation is OK, but I still have some questions.

My overall aim is to capture raw audio data sampled at 44100 samples/sec from the computer's sound card (this part is already implemented using libraries and my own code), and then perform a DFT on blocks of this audio data.

I am only interested in finding a range of frequency components in the audio and I will not be performing any inverse DFT. In this case, is a real to real transformation all that is necessary, hence the fftw_plan_r2r_1d() function?

My blocks of data to be transformed are 11025 samples long. My function is called as shown below. This will result in a spectrum array of 11025 bins. How do I know the maximum frequency component in the result?

I believe that the bin spacing is Fs/n = 44100/11025 = 4 Hz. Does that mean the array will hold a frequency spectrum from 0 Hz all the way up to 44100 Hz in steps of 4 Hz, or only up to the Nyquist frequency (half the sample rate, 22050 Hz)?

This would be a problem for me as I only wish to search for frequencies from 60Hz up to 3000Hz. Is there some way to limit the transform range?

I don't see any arguments for the function, or maybe there is another way?

Thanks in advance for any help.

p = fftw_plan_r2r_1d(11025, audioData, spectrum, FFTW_REDFT00, FFTW_ESTIMATE);

Recommended answer

To answer some of your individual questions from the above:

  • you need a real-to-complex transform, not real-to-real
  • you will calculate the magnitude of the complex output bins at the frequencies of interest (magnitude = sqrt(re*re + im*im))
  • the frequency resolution is indeed Fs / N = 44100 / 11025 = 4 Hz, i.e. the width of each output bin is 4 Hz
  • for a real-to-complex transform you get N/2 + 1 output bins giving you frequencies from 0 to Fs / 2
  • you just ignore frequencies in which you are not interested - the FFT is very efficient so you can afford to "waste" unwanted output bins (unless you are only interested in a relatively small number of output frequencies)

Additional notes:

  • plan creation does not actually perform an FFT - typically you create a plan once and then use it many times (by calling fftw_execute)
  • for performance you probably want to use the single precision calls (e.g. fftwf_execute rather than fftw_execute, and similarly for plan creation etc)

Some useful related questions/answers on StackOverflow:

How to get frequency from fft result?

How to generate the audio spectrum using fft in C++?

There are many more similar questions and answers which you might also want to read - search for the fft and fftw tags.

Also note that dsp.stackexchange.com is the preferred site for questions on DSP theory, as opposed to specific programming problems.
