Convolving image with kernel in Fourier domain

Question

I'm zero-padding my image and my convolution kernel, transforming both to the Fourier domain, multiplying them, and applying the inverse transform to get the convolved image (see the code below). The result, however, is wrong. I was expecting a blurred image, but the output consists of four shifted quarters. Why is the output wrong, and how can I fix the code?

Input image:

Convolution result:

from PIL import Image,ImageDraw,ImageOps,ImageFilter
import numpy as np 
from scipy import fftpack
from copy import deepcopy
import imageio
## STEP 1 ##
im1=Image.open("pika.jpeg")
im1=ImageOps.grayscale(im1)
im1.show()
print("s",im1.size)
## working on this image array
im_W=np.array(im1).T
print("before",im_W.shape)
if(im_W.shape[0]%2==0):
    im_W=np.pad(im_W, ((1,0),(0,0)), 'constant')
if(im_W.shape[1]%2==0):
    im_W=np.pad(im_W, ((0,0),(1,0)), 'constant')
print("after",im_W.shape)
Boxblur=np.array([[1/9,1/9,1/9],[1/9,1/9,1/9],[1/9,1/9,1/9]])
dim=Boxblur.shape[0]

##padding before frequency domain multipication
pad_size=(Boxblur.shape[0]-1)/2
pad_size=int(pad_size)
##padded the image(starts here)

p_im=np.pad(im_W, ((pad_size,pad_size),(pad_size,pad_size)), 'constant')
t_b=(p_im.shape[0]-dim)/2
l_r=(p_im.shape[1]-dim)/2
t_b=int(t_b)
l_r=int(l_r)

##padded the image(ends here)

## padded the kernel(starts here)
k_im=np.pad(Boxblur, ((t_b,t_b),(l_r,l_r)), 'constant')
print("hjhj",k_im)
print("kernel",k_im.shape)

##fourier transforms image and kernel
fft_im = fftpack.fftshift(fftpack.fft2(p_im))
fft_k  = fftpack.fftshift(fftpack.fft2(k_im))
con_in_f=fft_im*fft_k
ifft2 = abs(fftpack.ifft2(fftpack.ifftshift(con_in_f)))
convolved=(np.log(abs(ifft2))* 255 / np.amax(np.log(abs(ifft2)))).astype(np.uint8)
final=Image.fromarray(convolved.T)
final.show()
u=im1.filter(ImageFilter.Kernel((3,3), [1/9,1/9,1/9,1/9,1/9,1/9,1/9,1/9,1/9], scale=None, offset=0))
u.show()

Recommended answer

The Discrete Fourier transform (DFT) and, by extension, the FFT (which computes the DFT) have their origin in the first element (for an image, the top-left pixel), for both the input and the output. This is why we often apply the fftshift function to the output: it shifts the origin to a location more familiar to us (the middle of the image).
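As a rough illustration of what "origin in the first element" means in practice, the sketch below convolves a small random array with a unit impulse placed either at [0, 0] or in the middle. It uses numpy.fft rather than scipy.fftpack (an assumption made only to keep the example self-contained), and the array `img` and the helper `fft_convolve` are hypothetical names, not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((8, 8))  # stand-in for a grayscale image (hypothetical data)

# Unit impulse at the DFT origin, i.e. element [0, 0].
delta_origin = np.zeros_like(img)
delta_origin[0, 0] = 1.0

# Unit impulse where we intuitively expect the "middle" of a kernel to be.
delta_center = np.zeros_like(img)
delta_center[4, 4] = 1.0

def fft_convolve(a, b):
    """Circular convolution of two equally sized arrays via the FFT."""
    return np.real(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(b)))

same = fft_convolve(img, delta_origin)     # impulse at the origin: image unchanged
shifted = fft_convolve(img, delta_center)  # impulse in the middle: circular shift

print(np.allclose(same, img))
print(np.allclose(shifted, np.roll(img, (4, 4), axis=(0, 1))))
```

Both checks should print True. The second one is the "four shifted quarters" effect from the question: a kernel centered in the middle of the array shifts the result circularly by half the image size.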

This means that a 3x3 uniformly weighted blurring kernel needs to look like this before it is passed to the FFT function:

1/9  1/9  0  0  ... 0  1/9
1/9  1/9  0  0  ... 0  1/9
  0    0  0  0  ... 0    0
...  ...               ...
  0    0  0  0  ... 0    0
1/9  1/9  0  0  ... 0  1/9

That is, the middle of the kernel is at the top-left corner of the image, with the pixels above and to the left of the middle wrapping around and appearing at the right and bottom ends of the image.
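The wrapped layout can be reproduced on a small grid. The following is a minimal sketch, assuming a 6x6 target size chosen only for readability and numpy.fft.ifftshift in place of the fftpack function; the variable names are illustrative.

```python
import numpy as np

n = 6                          # small, even target size (illustrative)
kernel = np.ones((3, 3)) / 9

# Place the kernel so that its middle lands on index n // 2 = 3,
# which means it occupies rows and columns 2..4.
padded = np.zeros((n, n))
padded[2:5, 2:5] = kernel

# ifftshift moves element [n // 2, n // 2] to [0, 0]; everything above and
# to the left of the middle wraps around to the bottom and right edges.
wrapped = np.fft.ifftshift(padded)

nonzero = (wrapped > 0).astype(int)  # 1 where the kernel has weight 1/9
print(nonzero)
# [[1 1 0 0 0 1]
#  [1 1 0 0 0 1]
#  [0 0 0 0 0 0]
#  [0 0 0 0 0 0]
#  [0 0 0 0 0 0]
#  [1 1 0 0 0 1]]
```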

We can do this with the ifftshift function, applied to the kernel after padding. When padding the kernel, we need to make sure that the origin (the middle of the kernel) ends up at location k_im.shape // 2 (integer division) within the padded kernel image k_im. Initially the origin is at [3,3]//2 == [1,1]. Usually, the image whose size we're matching is even in size, for example [256,256]. The origin there will be at [256,256]//2 == [128,128]. This means that we need to pad by a different amount on the left than on the right (and on the top than on the bottom). We need to be careful when computing this padding:

sz = img.shape  # the sizes we're matching
kernel = np.ones((3,3)) / 9
sz = (sz[0] - kernel.shape[0], sz[1] - kernel.shape[1])  # total amount of padding
kernel = np.pad(kernel, (((sz[0]+1)//2, sz[0]//2), ((sz[1]+1)//2, sz[1]//2)), 'constant')
kernel = fftpack.ifftshift(kernel)
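The padding rule above can be wrapped in a small helper and checked for both even and odd image sizes. This is only a sketch: the function name `pad_kernel_for_fft` is hypothetical, numpy.fft.ifftshift stands in for the fftpack version, and the kernel is assumed to have odd side lengths so that it has a well-defined middle pixel.

```python
import numpy as np

def pad_kernel_for_fft(kernel, shape):
    """Zero-pad `kernel` to `shape` and move its middle to element [0, 0].

    Assumes an odd-sized kernel that is no larger than `shape`.
    """
    pads = []
    for size, k in zip(shape, kernel.shape):
        total = size - k  # total amount of padding along this axis
        if total < 0:
            raise ValueError("kernel is larger than the target shape")
        # The extra pixel (if any) goes before the kernel, so that the
        # kernel's middle ends up at index size // 2.
        pads.append(((total + 1) // 2, total // 2))
    padded = np.pad(kernel, pads, 'constant')
    return np.fft.ifftshift(padded)

# A 3x3 "kernel" with a single 1 in the middle makes the origin easy to track.
marker = np.zeros((3, 3))
marker[1, 1] = 1.0

for shape in [(256, 256), (255, 257), (4, 7)]:
    out = pad_kernel_for_fft(marker, shape)
    print(shape, out.shape == shape, out[0, 0] == 1.0)
```

For each size the output should have the requested shape, and the marker should sit at [0, 0].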

Note that the input image, img, does not need to be padded (though you can pad it if you want to enforce a size for which the FFT is cheaper). There is also no need to apply fftshift to the result of the FFT before the multiplication and then reverse this shift right afterwards; these two shifts cancel out and are redundant. You should use fftshift only if you want to display the Fourier-domain image. Finally, applying logarithmic scaling to the filtered image is wrong.
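To make the redundancy of the shifts concrete, the sketch below computes the filtered image with and without the fftshift/ifftshift pair and compares both against a direct 3x3 mean with wrap-around borders. The test image is random data, and numpy.fft is used here instead of scipy.fftpack; both choices are assumptions made for the sake of a self-contained example.

```python
import numpy as np

rng = np.random.default_rng(1)
img = rng.random((9, 12))  # stand-in image with one odd and one even side

# Pad the kernel to the image size and move its middle to [0, 0].
kernel = np.ones((3, 3)) / 9
sz = (img.shape[0] - kernel.shape[0], img.shape[1] - kernel.shape[1])
kernel = np.pad(kernel, (((sz[0] + 1) // 2, sz[0] // 2),
                         ((sz[1] + 1) // 2, sz[1] // 2)), 'constant')
kernel = np.fft.ifftshift(kernel)

F_img = np.fft.fft2(img)
F_ker = np.fft.fft2(kernel)

# Without any shifting in the Fourier domain.
plain = np.real(np.fft.ifft2(F_img * F_ker))

# With fftshift before the product and ifftshift after it.
with_shifts = np.real(np.fft.ifft2(
    np.fft.ifftshift(np.fft.fftshift(F_img) * np.fft.fftshift(F_ker))))

# Direct reference: average of the 9 circularly shifted copies of the image.
direct = sum(np.roll(img, (dy, dx), axis=(0, 1))
             for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9

print(np.allclose(plain, with_shifts))
print(np.allclose(plain, direct))
```

Both comparisons should print True. The reference uses wrap-around borders because multiplication in the DFT domain corresponds to circular convolution.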

The resulting code is (I'm using pyplot for display, not using PIL at all):

import numpy as np
from scipy import misc
from scipy import fftpack
import matplotlib.pyplot as plt

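# Sample image, first (red) channel only. In recent SciPy releases the sample
# image is provided by scipy.datasets.face() instead of scipy.misc.face().
img = misc.face()[:,:,0]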

kernel = np.ones((3,3)) / 9
sz = (img.shape[0] - kernel.shape[0], img.shape[1] - kernel.shape[1])  # total amount of padding
kernel = np.pad(kernel, (((sz[0]+1)//2, sz[0]//2), ((sz[1]+1)//2, sz[1]//2)), 'constant')
kernel = fftpack.ifftshift(kernel)

filtered = np.real(fftpack.ifft2(fftpack.fft2(img) * fftpack.fft2(kernel)))
plt.imshow(filtered, vmin=0, vmax=255)
plt.show()

Note that I am taking the real part of the inverse FFT. The imaginary part should contain only values very close to zero, which are the result of rounding errors in the computations. Taking the absolute value, though common, is incorrect. For example, you might want to apply a filter to an image that contains negative values, or apply a filter that produces negative values. Taking the absolute value there would create artefacts. If the output of the inverse FFT contains imaginary values that differ significantly from zero, then there is an error in the way the filtering kernel was padded.
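A filter that produces negative values shows the difference between the real part and the absolute value. The sketch below is an illustration under stated assumptions: the ramp image and the horizontal difference kernel are made up for this example, the kernel is written directly in the wrapped layout (middle at [0, 0]), and numpy.fft replaces scipy.fftpack.

```python
import numpy as np

n = 8
img = np.tile(np.arange(float(n)), (n, 1))  # horizontal ramp 0..7 in every row

# Horizontal difference kernel [1, 0, -1], already in wrapped layout:
# the middle is at [0, 0], its left neighbour wraps to the last column.
kernel = np.zeros((n, n))
kernel[0, -1] = 1.0   # offset -1 (left of the middle)
kernel[0, 1] = -1.0   # offset +1 (right of the middle)

result = np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(kernel))

max_imag = np.abs(result.imag).max()  # rounding error only
real_part = result.real               # signed derivative estimate
abs_part = np.abs(result)             # sign information is lost

print(max_imag < 1e-9)
print(real_part.min())
print(np.allclose(real_part, abs_part))
```

Inside the ramp the result is a constant positive slope, but at the wrap-around columns the jump from 7 back to 0 gives negative values. The real part keeps that sign, whereas the absolute value turns it into a positive response, so the final comparison should print False.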

Also note that the kernel here is tiny, and consequently the blurring effect is tiny too. To see the blurring more clearly, use a larger kernel, for example np.ones((7,7)) / 49.
