Image convolution in spatial domain


Problem description



I am trying to replicate the outcome of this link using linear convolution in the spatial domain.

Images are first converted to 2-D double arrays and then convolved. The image and the kernel are of the same size. The image is padded before the convolution and cropped accordingly afterwards.
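The pad → convolve → crop pipeline described above can be sketched in 1-D (a hypothetical pure-Python illustration; the function and variable names are mine, not from the C# code below):

```python
def valid_convolve(padded, mask):
    """'Valid' convolution: slide the mask over every fully covered window.

    Note this is a correlation-style sum (the mask is not flipped),
    matching the question's Sum() helper.
    """
    n, m = len(padded), len(mask)
    return [sum(padded[i + j] * mask[j] for j in range(m))
            for i in range(n - m + 1)]

signal = [1.0, 2.0, 3.0, 4.0]
mask = [1.0, 0.0, -1.0]
half = len(mask) // 2

padded = [0.0] * half + signal + [0.0] * half  # zero-pad by mask//2 per side
out = valid_convolve(padded, mask)             # same length as signal again
```

For a symmetric mask the missing flip does not matter; for an asymmetric one, correlation and true convolution give different results.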

As compared to the FFT-based convolution, the output is weird and incorrect.

How can I solve the issue?

Note that I obtained the following image output from Matlab which matches my C# FFT output:


Update-1: Following @Ben Voigt's comment, I changed the Rescale() function to replace 255.0 with 1, and the output improved substantially. But the output still doesn't match the FFT output (which is the correct one).


Update-2: Following @Cris Luengo's comment, I have padded the image by stitching and then performed the spatial convolution. The outcome is as follows:

So, this output is worse than the previous one. But it resembles the 2nd output of the linked answer, which means a circular convolution is not the solution.
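If padding "by stitching" means tiling the image periodically, the spatial convolution turns into a circular convolution, which would explain the resemblance to the linked answer's 2nd output. A hypothetical 1-D sketch of the difference:

```python
def circular_convolve(x, h):
    """Circular convolution: indices wrap around modulo len(x)."""
    n = len(x)
    return [sum(x[(i - j) % n] * h[j] for j in range(len(h)))
            for i in range(n)]

def linear_convolve(x, h):
    """Linear convolution: out-of-range samples are treated as zero."""
    n, m = len(x), len(h)
    out = [0.0] * (n + m - 1)
    for i in range(n):
        for j in range(m):
            out[i + j] += x[i] * h[j]
    return out

x = [1.0, 2.0, 3.0, 4.0]
h = [1.0, 1.0]
circ = circular_convolve(x, h)  # last sample wraps into the first one
lin = linear_convolve(x, h)     # one sample longer, no wrap-around
```

The `(i - j) % n` index is exactly the wrap-around that zero-padding is meant to eliminate.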


Update-3: I have used the Sum() function proposed by @Cris Luengo's answer. The result is an improvement over **Update-1**:

But it is still not 100% identical to the FFT version.


Update-4: Following @Cris Luengo's comment, I have subtracted the two outcomes to see the difference:

1. spatial minus frequency domain
2. frequency minus spatial domain

The difference looks substantial, which means the spatial convolution is not being done correctly.


Source Code:

(Let me know if you need to see more of the source code.)

    public static double[,] LinearConvolutionSpatial(double[,] image, double[,] mask)
    {
        int maskWidth = mask.GetLength(0);
        int maskHeight = mask.GetLength(1);

        double[,] paddedImage = ImagePadder.Pad(image, maskWidth);

        double[,] conv = Convolution.ConvolutionSpatial(paddedImage, mask);

        int cropSize = (maskWidth/2);

        double[,] cropped = ImageCropper.Crop(conv, cropSize);

        return cropped;
    } 
    static double[,] ConvolutionSpatial(double[,] paddedImage1, double[,] mask1)
    {
        int imageWidth = paddedImage1.GetLength(0);
        int imageHeight = paddedImage1.GetLength(1);

        int maskWidth = mask1.GetLength(0);
        int maskHeight = mask1.GetLength(1);

        int convWidth = imageWidth - ((maskWidth / 2) * 2);
        int convHeight = imageHeight - ((maskHeight / 2) * 2);

        double[,] convolve = new double[convWidth, convHeight];

        for (int y = 0; y < convHeight; y++)
        {
            for (int x = 0; x < convWidth; x++)
            {
                int startX = x;
                int startY = y;

                convolve[x, y] = Sum(paddedImage1, mask1, startX, startY);
            }
        }

        Rescale(convolve);

        return convolve;
    } 

    static double Sum(double[,] paddedImage1, double[,] mask1, int startX, int startY)
    {
        double sum = 0;

        int maskWidth = mask1.GetLength(0);
        int maskHeight = mask1.GetLength(1);

        for (int y = startY; y < (startY + maskHeight); y++)
        {
            for (int x = startX; x < (startX + maskWidth); x++)
            {
                double img = paddedImage1[x, y];
                double msk = mask1[x - startX, y - startY];
                sum = sum + (img * msk);
            }
        }

        return sum;
    }

    static void Rescale(double[,] convolve)
    {
        int imageWidth = convolve.GetLength(0);
        int imageHeight = convolve.GetLength(1);

        double maxAmp = 0.0;

        for (int j = 0; j < imageHeight; j++)
        {
            for (int i = 0; i < imageWidth; i++)
            {
                maxAmp = Math.Max(maxAmp, convolve[i, j]);
            }
        }

        if (maxAmp == 0.0)
            return; // nothing to rescale; avoid division by zero

        double scale = 1.0 / maxAmp;

        for (int j = 0; j < imageHeight; j++)
        {
            for (int i = 0; i < imageWidth; i++)
            {
                double d = convolve[i, j] * scale;
                convolve[i, j] = d;
            }
        }
    } 

    public static Bitmap ConvolveInFrequencyDomain(Bitmap image1, Bitmap kernel1)
    {
        Bitmap outcome = null;

        Bitmap image = (Bitmap)image1.Clone();
        Bitmap kernel = (Bitmap)kernel1.Clone();

        //linear convolution: sum. 
        //circular convolution: max
        uint paddedWidth = Tools.ToNextPow2((uint)(image.Width + kernel.Width));
        uint paddedHeight = Tools.ToNextPow2((uint)(image.Height + kernel.Height));

        Bitmap paddedImage = ImagePadder.Pad(image, (int)paddedWidth, (int)paddedHeight);
        Bitmap paddedKernel = ImagePadder.Pad(kernel, (int)paddedWidth, (int)paddedHeight);

        Complex[,] cpxImage = ImageDataConverter.ToComplex(paddedImage);
        Complex[,] cpxKernel = ImageDataConverter.ToComplex(paddedKernel);

        // call the complex function
        Complex[,] convolve = Convolve(cpxImage, cpxKernel);

        outcome = ImageDataConverter.ToBitmap(convolve);

        outcome = ImageCropper.Crop(outcome, (kernel.Width/2)+1);

        return outcome;
    } 
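The frequency-domain routine above pads both inputs to a common power-of-two size before transforming; the point is that a DFT product always yields a *circular* convolution, which only equals the linear one when the padded length is at least image + kernel − 1. A tiny 1-D check with a naive DFT (hypothetical sketch; a real implementation would use an FFT library):

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) DFT, good enough for a tiny correctness check."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * i * k / n)
               for k in range(n)) for i in range(n)]
    return [v / n for v in out] if inverse else out

x = [1.0, 2.0, 3.0, 4.0]
h = [1.0, 1.0]
size = len(x) + len(h) - 1        # enough room, so nothing wraps around
xp = x + [0.0] * (size - len(x))
hp = h + [0.0] * (size - len(h))

spectrum = [a * b for a, b in zip(dft(xp), dft(hp))]
y = [v.real for v in dft(spectrum, inverse=True)]
# y approximates the direct linear convolution of x and h
```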

Solution

I have found the solution from this link. The main clue was to introduce an offset and a factor.

  • factor is the sum of all values in the kernel.
  • offset is an arbitrary value to fix the output further.
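Concretely, the linked fix divides each output sample by the kernel's sum (the factor) so overall brightness is preserved, then shifts by the offset and clamps to the displayable 0..255 range. A hypothetical Python sketch of that normalization (the names mirror the ConvMatrix fields; integer division stands in for the C# int arithmetic):

```python
def normalize(value, factor, offset):
    """Divide by the kernel sum, shift by the offset, clamp to 0..255."""
    if factor == 0:       # avoid division by zero, as in the linked code
        factor = 1
    return min(max(value // factor + offset, 0), 255)

kernel = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]                      # a 3x3 blur-like kernel
factor = sum(sum(row) for row in kernel)  # sums to 16
sample = normalize(400, factor, 0)        # 400 // 16 = 25
```

For kernels whose values sum to zero (e.g. edge detectors), the factor guard kicks in, and a non-zero offset is typically what makes negative responses visible.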


@Cris Luengo's answer also raised a valid point.


The following source code is supplied in the given link:

    private void SafeImageConvolution(Bitmap image, ConvMatrix fmat) 
    { 
        //Avoid division by 0 
        if (fmat.Factor == 0) 
            return; 

        Bitmap srcImage = (Bitmap)image.Clone(); 

        int x, y, filterx, filtery; 
        int s = fmat.Size / 2; 
        int r, g, b; 
        Color tempPix; 

        for (y = s; y < srcImage.Height - s; y++) 
        { 
            for (x = s; x < srcImage.Width - s; x++) 
            { 
                r = g = b = 0; 

                // Convolution 
                for (filtery = 0; filtery < fmat.Size; filtery++) 
                { 
                    for (filterx = 0; filterx < fmat.Size; filterx++) 
                    { 
                        tempPix = srcImage.GetPixel(x + filterx - s, y + filtery - s); 

                        r += fmat.Matrix[filtery, filterx] * tempPix.R; 
                        g += fmat.Matrix[filtery, filterx] * tempPix.G; 
                        b += fmat.Matrix[filtery, filterx] * tempPix.B; 
                    } 
                } 

                r = Math.Min(Math.Max((r / fmat.Factor) + fmat.Offset, 0), 255); 
                g = Math.Min(Math.Max((g / fmat.Factor) + fmat.Offset, 0), 255); 
                b = Math.Min(Math.Max((b / fmat.Factor) + fmat.Offset, 0), 255); 

                image.SetPixel(x, y, Color.FromArgb(r, g, b)); 
            } 
        } 
    } 
