Using iOS Accelerate Framework for 2D Signal Processing on Non-Power-of-Two images?

Question

// EDIT ...

I'm editing my question slightly to address working specifically with non-power-of-two (NPOT) images. I've got a basic structure that works with square grayscale images with sizes like 256x256 or 1024x1024, but I can't see how to generalize it to arbitrarily sized images. The FFT functions seem to want the log2 of the width and height, but then it's unclear how to unpack the resulting data, or whether the data is just getting scrambled. I suppose the obvious thing to do would be to center the NPOT image within a larger, all-black image and then ignore any values in those positions when looking at the data. But I'm wondering if there's a less awkward way to work with NPOT data.

// ... END EDIT


I'm having a bit of trouble with the Accelerate Framework documentation. I would normally use FFTW3, but I'm having trouble getting that to compile on an actual iOS device (see this question). Can anybody point me to a super simple implementation using Accelerate that does something like the following:

1) Turns image data into an appropriate data structure that can be passed to Accelerate's FFT methods.
In FFTW3, at its simplest, using a grayscale image, this involves placing the unsigned bytes into a "fftw_complex" array, which is simply a struct of two floats, one holding the real value and the other the imaginary (and where the imaginary is initialized to zero for each pixel).

2) Takes this data structure and performs an FFT on it.

3) Prints out the magnitude and phase.

4) Performs an IFFT on it.

5) Recreates the original image from the data resulting from the IFFT.

Although this is a very basic example, I am having trouble using the documentation from Apple's site. The SO answer by Pi here is very helpful, but I am still somewhat confused about how to use Accelerate to do this basic functionality using a grayscale (or color) 2D image.

Anyhow, any pointers or especially some simple working code that processes a 2D image would be extremely helpful!

\\\ EDIT \\\

Okay, after taking some time to dive into the documentation and some very helpful code on SO as well as on pkmital's github repo, I've got some working code that I thought I'd post since 1) it took me a while to figure it out and 2) since I have a couple of remaining questions...

Initialize FFT "plan". Assuming a square power-of-two image:

#include <Accelerate/Accelerate.h>
...
UInt32 N = log2(length*length);
UInt32 log2nr = N / 2; 
UInt32 log2nc = N / 2;
UInt32 numElements = 1 << ( log2nr + log2nc );
float SCALE = 1.0/numElements;
SInt32 rowStride = 1; 
SInt32 columnStride = 0;
FFTSetup setup = create_fftsetup(MAX(log2nr, log2nc), FFT_RADIX2);

Pass in a byte array for a square power-of-two grayscale image and turn it into a COMPLEX_SPLIT:

COMPLEX_SPLIT in_fft;
in_fft.realp = ( float* ) malloc ( numElements * sizeof ( float ) );
in_fft.imagp = ( float* ) malloc ( numElements * sizeof ( float ) );

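for ( UInt32 i = 0; i < numElements; i++ ) {
    if (i < t->width * t->height) {
      in_fft.realp[i] = t->data[i] / 255.0;
    } else {
      in_fft.realp[i] = 0.0;  // zero any elements beyond the image data (malloc leaves them uninitialized)
    }
    in_fft.imagp[i] = 0.0;
}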

Run the FFT on the transformed image data, then grab the magnitude and phase:

COMPLEX_SPLIT out_fft;
out_fft.realp = ( float* ) malloc ( numElements * sizeof ( float ) );
out_fft.imagp = ( float* ) malloc ( numElements * sizeof ( float ) );

fft2d_zop ( setup, &in_fft, rowStride, columnStride, &out_fft, rowStride, columnStride, log2nc, log2nr, FFT_FORWARD );

magnitude = (float *) malloc(numElements * sizeof(float));
phase = (float *) malloc(numElements * sizeof(float));

for (int i = 0; i < numElements; i++) {
   magnitude[i] = sqrt(out_fft.realp[i] * out_fft.realp[i] + out_fft.imagp[i] * out_fft.imagp[i]) ;
   phase[i] = atan2(out_fft.imagp[i],out_fft.realp[i]);
}

Now you can run an IFFT on the out_fft data to get the original image...

COMPLEX_SPLIT out_ifft;
out_ifft.realp = ( float* ) malloc ( numElements * sizeof ( float ) );
out_ifft.imagp = ( float* ) malloc ( numElements * sizeof ( float ) );
fft2d_zop (setup, &out_fft, rowStride, columnStride, &out_ifft, rowStride, columnStride, log2nc, log2nr, FFT_INVERSE);   

// vsmul takes a pointer to the scalar, not the scalar itself
vsmul( out_ifft.realp, 1, &SCALE, out_ifft.realp, 1, numElements );
vsmul( out_ifft.imagp, 1, &SCALE, out_ifft.imagp, 1, numElements );

Or you can run an IFFT on the magnitude to get an autocorrelation...

COMPLEX_SPLIT in_ifft;
in_ifft.realp = ( float* ) malloc ( numElements * sizeof ( float ) );
in_ifft.imagp = ( float* ) malloc ( numElements * sizeof ( float ) );
for (int i = 0; i < numElements; i++) {
  in_ifft.realp[i] = (magnitude[i]);
  in_ifft.imagp[i] = 0.0;
}

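// the input here is in_ifft (the magnitude data built above), not the original in_fft
fft2d_zop ( setup, &in_ifft, rowStride, columnStride, &out_ifft, rowStride, columnStride, log2nc, log2nr, FFT_INVERSE );

// vsmul takes a pointer to the scalar, not the scalar itself
vsmul( out_ifft.realp, 1, &SCALE, out_ifft.realp, 1, numElements );
vsmul( out_ifft.imagp, 1, &SCALE, out_ifft.imagp, 1, numElements );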

Finally, you can put the ifft results back into an image array:

for ( UInt32 i = 0; i < numElements; i++ ) {
  t->data[i] = (int) (out_ifft.realp[i] * 255.0);
}     
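
A note on the conversion back to bytes: the cast truncates rather than rounds, and it does not clamp, so values that drift slightly outside [0, 1] after the inverse transform can wrap around when stored in a byte. A minimal sketch of a rounding, clamping conversion, assuming the float values are nominally in [0, 1]. The helper names `float_to_byte` and `floats_to_bytes` are made up for illustration and are not part of Accelerate:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a nominal [0, 1] float to a 0..255 byte, rounding to nearest
 * and clamping anything out of range. (Hypothetical helper.) */
static uint8_t float_to_byte(float v) {
    float scaled = v * 255.0f + 0.5f;       /* +0.5 so the cast rounds */
    if (!(scaled > 0.0f)) return 0;         /* negatives and NaN clamp to 0 */
    if (scaled >= 255.0f) return 255;       /* overshoot clamps to 255 */
    return (uint8_t)scaled;
}

/* Convert `count` floats from src into bytes in dst. */
static void floats_to_bytes(const float *src, uint8_t *dst, size_t count) {
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; i++) {
        dst[i] = float_to_byte(src[i]);
    }
}
```

In the loop above this would amount to calling `floats_to_bytes(out_ifft.realp, t->data, count)`, assuming `t->data` is an array of unsigned bytes and `count` is the number of pixels to write.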

I haven't figured out how to use the Accelerate framework to handle non-power-of-two images. If I allocate enough memory in the setup, then I can do an FFT followed by an IFFT and get my original image back. But if I try to do an autocorrelation (with the magnitude of the FFT), then my image gets wonky results. I'm not sure of the best way to pad the image appropriately, so hopefully someone has an idea of how to do this. (Or can share a working version of the vDSP_conv method!)
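
Two general DFT facts may be relevant to the wonky autocorrelation, offered as possible causes rather than a diagnosis. First, the autocorrelation is the inverse transform of the power spectrum (magnitude squared), whereas the code above feeds in the plain magnitude. Second, what an FFT-based autocorrelation produces is circular: unless the signal is zero-padded to at least 2N-1 samples along each axis, shifted copies wrap around and overlap. The sketch below shows the wrap-around effect directly in the time domain in 1D, without any FFT. `circular_autocorr` is a hypothetical name and plain C, not a vDSP call:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Circular autocorrelation of x viewed as a length-n sequence that is
 * zero beyond x[len-1]:  r[lag] = sum_j x[j] * x[(j + lag) mod n].
 * This is the time-domain quantity that a normalized inverse DFT of the
 * power spectrum |X[k]|^2 corresponds to. With n >= 2*len - 1 the zero
 * padding keeps wrapped samples from overlapping, so r[0..len-1] equals
 * the ordinary (linear) autocorrelation.
 */
static void circular_autocorr(const double *x, size_t len, size_t n, double *r) {
    assert(x != NULL && r != NULL && len >= 1 && n >= len);
    for (size_t lag = 0; lag < n; lag++) {
        double sum = 0.0;
        for (size_t j = 0; j < len; j++) {
            size_t idx = (j + lag) % n;     /* wrap around the period n */
            if (idx < len) {                /* samples past len are padding (zero) */
                sum += x[j] * x[idx];
            }
        }
        r[lag] = sum;
    }
}
```

For x = {1, 2, 3}, the linear autocorrelation at lag 1 is 1*2 + 2*3 = 8. With n = 3 (no padding) the wrapped term 3*1 leaks in and the result at lag 1 becomes 11; with n = 8 it is 8 again. The same reasoning applies along each axis of a 2D image.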

Recommended answer

I would say that in order to perform work on arbitrary image sizes, all you have to do is size your input value array appropriately to the next power of 2.
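
A minimal sketch of that sizing step in plain C, under some assumptions of mine: the image is 8-bit grayscale in row-major order, the padding value is zero, and the original data is placed in the top-left corner (centering it is an equally valid choice). `ceil_log2` and `pad_to_pow2` are hypothetical helper names, not Accelerate functions. The two exponents it returns are the kind of log2 values the 2D FFT call in the question expects:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Smallest exponent e such that (1 << e) >= n, for n >= 1. */
static unsigned ceil_log2(size_t n) {
    assert(n >= 1);
    unsigned e = 0;
    while (((size_t)1 << e) < n) {
        e++;
    }
    return e;
}

/*
 * Copy a width x height 8-bit grayscale image into the top-left corner
 * of a zero-filled float buffer whose dimensions are the next powers of
 * two. Pixel values are scaled to [0, 1]. Each source row is copied to
 * the start of a padded row, so rows stay aligned.
 * Returns the buffer (caller frees) or NULL if allocation fails.
 */
static float *pad_to_pow2(const uint8_t *pixels, size_t width, size_t height,
                          unsigned *log2_cols, unsigned *log2_rows) {
    assert(pixels != NULL && log2_cols != NULL && log2_rows != NULL);
    assert(width >= 1 && height >= 1);

    *log2_cols = ceil_log2(width);
    *log2_rows = ceil_log2(height);
    size_t padded_cols = (size_t)1 << *log2_cols;
    size_t padded_rows = (size_t)1 << *log2_rows;

    float *padded = calloc(padded_cols * padded_rows, sizeof(float));
    if (padded == NULL) {
        return NULL;
    }
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            padded[y * padded_cols + x] = pixels[y * width + x] / 255.0f;
        }
    }
    return padded;
}
```

For a 900x900 image this gives a 1024x1024 buffer with both exponents equal to 10. Note that a flat copy of the first width*height bytes into the larger buffer would not keep rows aligned once the padded width differs from the image width, which is why the copy goes row by row.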

The hard part is deciding where to place your original image data and what to fill the rest with. What you are actually trying to do to the image, or extract from it, is crucial to that choice.

In the linked PDF below, pay particular attention to the paragraph just above 12.4.2 http://www.mathcs.org/java/programs/FFT/FFTInfo/c12-4.pdf

While the above speaks about manipulation along two axes, we could potentially apply a similar idea along the first dimension and then carry it on to the second. If I'm correct, then this example could apply (and this is by no means an exact algorithm yet):

Say we have an image that is 900 by 900. First we could split the image into vertical strips of widths 512, 256, 128, and 4. We would then process four 1D FFTs for each row: one for the first 512 pixels, the next for the following 256 pixels, the next for the following 128, and the last for the remaining 4. Since the output of the FFT is essentially the popularity of each frequency, these could simply be added (from the frequency-ONLY perspective, not the angular offset). We could then push this same technique toward the second dimension. At this point we would have taken every input pixel into consideration without actually having to pad.
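
The strip widths in this example are just the binary expansion of 900. A small sketch of that split in plain C, using a hypothetical helper name `split_pow2`. It only illustrates how the widths are chosen and says nothing about whether summing the per-strip spectra gives a useful result, which the answer itself leaves untested:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Split `length` into power-of-two strip widths, largest first
 * (the set bits of its binary expansion). Writes the widths into
 * `widths`, which must have room for `max_strips` entries, and
 * returns how many strips were written.
 */
static size_t split_pow2(size_t length, size_t *widths, size_t max_strips) {
    assert(widths != NULL);
    size_t count = 0;
    /* Start at the highest bit of size_t and walk down. */
    size_t bit = (size_t)1 << (sizeof(size_t) * CHAR_BIT - 1);
    for (; bit != 0; bit >>= 1) {
        if (length & bit) {
            assert(count < max_strips);
            widths[count++] = bit;
        }
    }
    return count;
}
```

Calling `split_pow2(900, widths, 16)` fills `widths` with 512, 256, 128, 4 and returns 4, matching the strips described above.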

This is really just food for thought; I have not tried this myself, and indeed should research it myself. If you are truly doing this kind of work right now, you may have more time for it than I do at this point, though.
