Fast 2D convolution for DSP

Question

I want to implement some image-processing algorithms that are intended to run on a BeagleBoard. These algorithms use convolutions extensively. I'm trying to find a good C implementation of 2D convolution (probably using the Fast Fourier Transform). I also want the algorithm to be able to run on the BeagleBoard's DSP, because I've heard that the DSP is optimized for these kinds of operations (with its multiply-accumulate instruction).

I have no background in the field, so I don't think it would be a good idea to implement the convolution myself (I probably won't do it as well as someone who understands all the math behind it). I believe a good C convolution implementation for DSP exists somewhere, but I wasn't able to find it.

Can anybody help?

Edit: It turns out the kernel is pretty small. Its dimensions are either 2x2 or 3x3, so I guess I'm not looking for an FFT-based implementation. I searched the web for the definition of convolution so that I could implement it in a straightforward way (I don't really know what convolution is). All I've found is something with multiplied integrals, and I have no idea how to do that with matrices. Could somebody give me a piece of code (or pseudocode) for the 2x2 kernel case?

Recommended answer

What are the dimensions of the image and the kernel? If the kernel is large, you can use FFT-based convolution; otherwise, for small kernels, just use direct convolution.
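For reference, here is what "direct convolution" means for sampled images, where the integrals mentioned in the question become finite sums. The symbols are assumptions chosen for illustration and do not come from the original post: f is the input image, h is a K x K kernel indexed from 0, and [i, j] is an output pixel.

```latex
(f * h)[i, j] \;=\; \sum_{a=0}^{K-1} \sum_{b=0}^{K-1} h[a, b]\, f[i - a,\; j - b]
```

Each output pixel costs K^2 multiply-accumulates, which is 4 for a 2x2 kernel and 9 for a 3x3 kernel. The minus signs mean the kernel is applied rotated by 180 degrees; for a kernel that is symmetric under that rotation, this gives the same result as sliding the kernel over the image unflipped.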

The DSP might not be the best way to do this though - just because it has a MAC instruction doesn't mean that it will be more efficient. Does the ARM CPU on the BeagleBoard have NEON SIMD? If so, that might be the way to go (and more fun too).

For a small kernel, you can do direct convolution like this:

// in, out are m x n images (integer data)
// K is the kernel size (KxK) - currently needs to be an odd number, e.g. 3
// coeffs[K][K] is a 2D array of integer coefficients
// scale is a scaling factor to normalise the filter gain
// Note: the kernel is applied without flipping, which matches true
// convolution whenever the kernel is symmetric under 180-degree rotation.
// Border pixels that lack a full KxK neighbourhood are not written.

for (int i = K / 2; i < m - K / 2; ++i) // iterate through image rows
{
  for (int j = K / 2; j < n - K / 2; ++j) // iterate through image columns
  {
    int sum = 0; // sum will be the sum of input data * coeff terms

    for (int ii = -(K / 2); ii <= K / 2; ++ii) // iterate over kernel rows
    {
      for (int jj = -(K / 2); jj <= K / 2; ++jj) // iterate over kernel columns
      {
        int data = in[i + ii][j + jj];
        int coeff = coeffs[ii + K / 2][jj + K / 2];

        sum += data * coeff;
      }
    }
    out[i][j] = sum / scale; // scale sum of convolution products and store in output
  }
}

You can modify this to support even values of K; it just takes a little care with the upper/lower limits on the two inner loops.
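One possible way to handle those limits, covering the 2x2 case asked about in the question, is sketched below. This is not the answerer's code: the function name `filter2d`, the flat row-major image layout, and the choice to anchor the kernel at index (K - 1) / 2 are all assumptions made for illustration. With that anchor, a 2x2 kernel covers rows i..i+1 and columns j..j+1, and an odd K behaves as in the loops above.

```c
#include <assert.h>

/* Hypothetical helper: direct filtering with a K x K kernel, K odd or even.
 * in, out : m x n images stored row-major in flat arrays (assumed layout)
 * coeffs  : K x K kernel stored row-major
 * scale   : divisor that normalises the filter gain (must be non-zero)
 * The kernel is not flipped, as in the loops above. Border pixels that
 * lack a full K x K window are left untouched in out. */
void filter2d(const int *in, int *out, int m, int n,
              const int *coeffs, int K, int scale)
{
    assert(K > 0 && scale != 0);

    const int lo = (K - 1) / 2; /* kernel taps before the anchor */
    const int hi = K / 2;       /* kernel taps after the anchor  */

    for (int i = lo; i < m - hi; ++i) {
        for (int j = lo; j < n - hi; ++j) {
            int sum = 0;
            for (int ii = -lo; ii <= hi; ++ii) {
                for (int jj = -lo; jj <= hi; ++jj) {
                    sum += in[(i + ii) * n + (j + jj)]
                         * coeffs[(ii + lo) * K + (jj + lo)];
                }
            }
            out[i * n + j] = sum / scale;
        }
    }
}
```

For example, a 2x2 averaging filter would pass a kernel of four ones with scale set to 4. For odd K, lo and hi are both K / 2, so the loop bounds reduce to the ones in the original answer.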
