Using the Apple FFT and Accelerate Framework

Question

Has anybody used the Apple FFT for an iPhone app yet, or know where I might find a sample application showing how to use it? I know that Apple has some sample code posted, but I'm not really sure how to implement it in an actual project.

Recommended answer

I just got the FFT code working for an iPhone project:

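  • Create a new project
  • Delete all the files except for main.m and xxx_info.plist
  • Go to the project settings, search for pch, and stop it from trying to load a .pch (seeing as we have just deleted it)
  • Copy and paste the code example over whatever you have in main.m
  • Remove the line that #includes Carbon. Carbon is for OS X.
  • Delete all the frameworks, and add the Accelerate framework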

You might also need to remove an entry from info.plist that tells the project to load a xib, but I'm 90% sure you don't need to bother with that.

NOTE: The program outputs to the console. The results come out as 0.000; that's not an error, it's just very, very fast.

This code is really stupidly obscure; it is generously commented, but the comments don't actually make life any easier.

Basically, at the heart of it is:

vDSP_fft_zrip(setupReal, &A, stride, log2n, FFT_FORWARD);
vDSP_fft_zrip(setupReal, &A, stride, log2n, FFT_INVERSE);

That is an FFT on n real floats, followed by the inverse to get back to where we started. 'ip' stands for in-place, which means &A gets overwritten. That's the reason for all this special packing malarkey -- so that we can squash the return value into the same space as the send value.

To give some perspective (as in: why would we be using this function in the first place?), let's say we want to perform pitch detection on microphone input, and we have set it up so that some callback gets triggered every time the microphone delivers 1024 floats. Supposing the microphone sampling rate is 44.1kHz, that's ~44 frames/sec (more precisely, 44100 / 1024 ≈ 43.07).

So our time window is however long 1024 samples last, i.e. about 1/44 s (roughly 23 ms).

So we would pack A with 1024 floats from the mic, set log2n=10 (2^10=1024), precalculate some bobbins (setupReal) and:

vDSP_fft_zrip(setupReal, &A, stride, log2n, FFT_FORWARD);

Now A will contain n/2 complex numbers. These represent n/2 frequency bins:

  • bin[1].idealFreq = 44Hz -- i.e. the lowest frequency we can reliably detect is ONE complete wave within that window, i.e. a 44Hz wave.
  • bin[2].idealFreq = 2 * 44Hz
  • bin[512].idealFreq = 512 * 44Hz -- the highest frequency we can detect (known as the Nyquist frequency) is where every pair of points represents a wave, i.e. 512 complete waves within the window, i.e. 512 * 44Hz, or: n/2 * bin[1].idealFreq. (44Hz is a rounded figure; with the exact spacing of 44100 / 1024 ≈ 43.07Hz this comes to 22050Hz, half the sampling rate.)
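The bin list above boils down to one formula: bin k sits at k * sampleRate / n. Here is a minimal plain-C sketch of that arithmetic; bin_frequency is a hypothetical helper name of mine, not part of vDSP.

```c
#include <assert.h>

/* Centre frequency in Hz of FFT bin k, for an n-point transform of a
   signal sampled at sample_rate Hz. Meaningful for 0 <= k <= n/2.
   (Hypothetical helper, not a vDSP function.) */
double bin_frequency(unsigned k, unsigned n, double sample_rate)
{
    assert(n > 0 && k <= n / 2);
    return (double)k * sample_rate / (double)n;
}
```

For the 1024-sample, 44.1kHz case, bin_frequency(1, 1024, 44100.0) gives the exact bin spacing (about 43.07Hz, which the text rounds to 44Hz) and bin_frequency(512, 1024, 44100.0) gives the Nyquist frequency.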

Actually there is an extra bin, Bin[0], which is often referred to as 'DC Offset'. It so happens that Bin[0] and Bin[n/2] will always have an imaginary component of 0, so A.realp[0] is used to store Bin[0] and A.imagp[0] is used to store Bin[n/2].

And the magnitude of each complex number tells you how much energy is vibrating around that frequency.

So, as you can see, it wouldn't be a very great pitch detector, as its granularity is nowhere near fine enough. There is a cunning trick, described in "Extracting precise frequencies from FFT Bins using phase change between frames", for getting the precise frequency for a given bin.

OK, now onto the code:

Note the 'ip' in vDSP_fft_zrip = 'in place', i.e. the output overwrites A ('r' means it takes real inputs).

Look at the documentation on vDSP_fft_zrip:

Real data is stored in split complex form, with odd reals stored on the imaginary side of the split complex form and even reals stored on the real side.

This is probably the hardest thing to understand. We are using the same container (&A) all the way through the process. So in the beginning we want to fill it with n real numbers. After the FFT it is going to be holding n/2 complex numbers. We then throw that into the inverse transform, and hopefully get out our original n real numbers.

Now, the structure of A is set up for complex values, so vDSP needs to standardise how to pack real numbers into it.

So first we generate n real numbers: 1, 2, ..., n

for (i = 0; i < n; i++)
    originalReal[i] = (float) (i + 1);

Next we pack them into A as n/2 complex numbers:

// 1. masquerades n real #s as n/2 complex #s = {1+2i, 3+4i, ...}
// 2. splits them into
//   A.realp = {1,3,...} (n/2 elts)
//   A.imagp = {2,4,...} (n/2 elts)
//
vDSP_ctoz(
          (COMPLEX *) originalReal, 
          2,                            // stride 2, as each complex # is 2 floats
          &A, 
          1,                            // stride 1 in A.realp & A.imagp
          nOver2);                      // n/2 elts

You would really need to look at how A is allocated to get this; maybe look up COMPLEX_SPLIT in the documentation.

A.realp = (float *) malloc(nOver2 * sizeof(float));
A.imagp = (float *) malloc(nOver2 * sizeof(float));
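If the packing still seems opaque, here is a plain-C sketch of what the allocation and the ctoz step amount to. SplitComplex is a hypothetical stand-in for COMPLEX_SPLIT, and split_alloc, split_free and pack_reals are made-up names, not vDSP calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for vDSP's split-complex struct. */
typedef struct {
    float *realp;
    float *imagp;
} SplitComplex;

/* Allocates both halves; returns 0 on success, -1 on failure. */
int split_alloc(SplitComplex *a, size_t n_over_2)
{
    a->realp = malloc(n_over_2 * sizeof *a->realp);
    a->imagp = malloc(n_over_2 * sizeof *a->imagp);
    if (!a->realp || !a->imagp) {
        free(a->realp);
        free(a->imagp);
        a->realp = a->imagp = NULL;
        return -1;
    }
    return 0;
}

void split_free(SplitComplex *a)
{
    free(a->realp);
    free(a->imagp);
    a->realp = a->imagp = NULL;
}

/* Even-indexed reals go to realp, odd-indexed reals go to imagp. */
void pack_reals(const float *in, size_t n, SplitComplex *a)
{
    assert(n % 2 == 0);
    for (size_t i = 0; i < n / 2; i++) {
        a->realp[i] = in[2 * i];
        a->imagp[i] = in[2 * i + 1];
    }
}
```

With the input 1, 2, ..., 8 this leaves realp = {1, 3, 5, 7} and imagp = {2, 4, 6, 8}, which is the layout the comments in the vDSP_ctoz call describe.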

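Next we do a pre-calculation.

A quick DSP course for the mathematically inclined: Fourier theory takes a long time to get your head around (I've been looking at it on and off for several years now).

A cisoid is:

z = exp(i.theta) = cos(theta) + i.sin(theta)

i.e. a point on the unit circle in the complex plane.

When you multiply complex numbers, the angles add. So z^k will keep hopping around the unit circle; z^k can be found at angle k.theta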

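  • Choose z1 = 0+1i, i.e. a quarter turn from the real axis, and notice that z1^2, z1^3 and z1^4 each give another quarter turn, so that z1^4 = 1
  • Choose z2 = -1, i.e. a half turn. Also z2^4 = 1, but z2 has completed 2 cycles at this point (z2^2 is also = 1). So you could think of z1 as the fundamental frequency and z2 as the first harmonic
  • Similarly z3 = the 'three quarters of a revolution' point, i.e. -i, completes exactly 3 cycles, but actually going forwards 3/4 each time is the same as going backwards 1/4 each time, i.e. z3 is just z1 but in the opposite direction. This is called aliasing
  • z2 is the highest meaningful frequency, as we have chosen 4 samples to hold a full wave
  • z0 = 1+0i, z0^(anything) = 1; this is the DC offset

You can express any 4-point signal as a linear combination of z0, z1 and z2, i.e. you're projecting it onto these basis vectors.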

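But I hear you asking: "what does it mean to project a signal onto a cisoid?"

You can think of it this way: the needle spins round the cisoid, so at sample k the needle is pointing in direction k.theta, and its length is signal[k]. A signal that matches the frequency of the cisoid exactly will bulge the resulting shape in some direction. So if you add up all the contributions, you'll get a strong resultant vector. If the frequency nearly matches, then the bulge will be smaller and will move slowly around the circle. For a signal that doesn't match the frequency, the contributions will cancel one another out.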

http://complextoreal.com/tutorials/tutorial-4-fourier-analysis-made-easy-part-1/ will help you get an intuitive understanding.

But the gist is: if we have chosen to project 1024 samples onto {z0,...,z512}, we would have precalculated z0 through z512, and that's what this precalculation step is.

Note that if you are doing this in real code you probably want to do this once when the app loads and call the complementary release function once when it quits. DON'T do it lots of times -- it is expensive.

// let's say log2n = 8, so n=2^8=256 samples, or 'harmonics' or 'terms'
// if we pre-calculate the 256th roots of unity (of which there are 256) 
// that will save us time later.
//
// Note that this call creates an array which will need to be released 
// later to avoid leaking
setupReal = vDSP_create_fftsetup(log2n, FFT_RADIX2);

It's worth noting that if we set log2n to e.g. 8, you can throw these precalculated values into any FFT function that uses a resolution <= 2^8. So (unless you want ultimate memory optimisation) just create one set for the highest resolution you're going to need, and use it for everything.

Now the actual transforms, making use of the stuff we just precalculated:

vDSP_fft_zrip(setupReal, &A, stride, log2n, FFT_FORWARD);

At this point A will contain n/2 complex numbers, except that the first one is actually two real numbers (DC offset, Nyquist value) masquerading as a complex number. The documentation overview explains this packing. It is quite neat -- basically it allows the (complex) results of the transform to be packed into the same memory footprint as the (real, but weirdly packaged) inputs.

vDSP_fft_zrip(setupReal, &A, stride, log2n, FFT_INVERSE);

And back again... We will still need to unpack our original array from A. Then we compare, just to check that we have got back exactly what we started out with, release our precalculated bobbins and we're done!

But wait! Before you unpack, there is one final thing that needs to be done:

// Need to see the documentation for this one...
// in order to optimise, different routines return values 
// that need to be scaled by different amounts in order to 
// be correct as per the math
// In this case...
scale = (float) 1.0 / (2 * n);

vDSP_vsmul(A.realp, 1, &scale, A.realp, 1, nOver2);
vDSP_vsmul(A.imagp, 1, &scale, A.imagp, 1, nOver2);
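The unpacking step itself is just the packing run backwards. Here is a plain-C sketch that does the 1/(2n) scaling and the unpacking in one pass, assuming the forward-then-inverse round trip described above; scale_and_unpack is a hypothetical name and the two input arrays stand in for A.realp and A.imagp.

```c
#include <assert.h>
#include <stddef.h>

/* Scales a round-tripped (forward then inverse) packed result by 1/(2n)
   and interleaves it back into n reals. realp and imagp hold n/2 floats
   each; out needs room for n. (Hypothetical helper, not a vDSP call.) */
void scale_and_unpack(const float *realp, const float *imagp,
                      size_t n, float *out)
{
    assert(n > 0 && n % 2 == 0);
    const float scale = 1.0f / (2.0f * (float)n);
    for (size_t i = 0; i < n / 2; i++) {
        out[2 * i]     = realp[i] * scale;  /* even-indexed reals */
        out[2 * i + 1] = imagp[i] * scale;  /* odd-indexed reals  */
    }
}
```

After this, out should match the original 1, 2, ..., n array to within floating-point rounding, which is the comparison mentioned earlier.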
