Matlab Template Matching Using FFT


Problem description

I am struggling with template matching in the Fourier domain in Matlab. Here are my images (the artist is RamalamaCreatures on DeviantArt):

My aim is to place a bounding box around the ear of the possum, like this example (where I performed template matching using normxcorr2):

Here is the Matlab code I am using:

clear all; close all;

template = rgb2gray(imread('possum_ear.jpg'));
background = rgb2gray(imread('possum.jpg'));

%% calculate padding
bx = size(background, 2); 
by = size(background, 1);
tx = size(template, 2); % used for bbox placement
ty = size(template, 1);

%% fft
c = real(ifft2(fft2(background) .* fft2(template, by, bx)));

%% find peak correlation
[max_c, imax]   = max(abs(c(:)));
[ypeak, xpeak] = find(c == max(c(:)));
figure; surf(c), shading flat; % plot correlation 

%% display best match
hFig = figure;
hAx  = axes;
position = [xpeak(1)-tx, ypeak(1)-ty, tx, ty];
imshow(background, 'Parent', hAx);
imrect(hAx, position);

The code is not functioning as intended - it is not identifying the correct region. This is the failed result - the wrong area is boxed:

This is the surface plot of the correlations for the failed match:

Hope you can help! Thanks.

Solution

What your code computes is actually not correlation at all: it convolves the input image with the template. By the convolution theorem, multiplying the spectra of two signals is equivalent to (circularly) convolving the two signals in the time/spatial domain. Correlation requires multiplying one spectrum by the complex conjugate of the other instead.
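The difference between the two products can be checked on a tiny 1-D signal. This is only a sketch with made-up data: the signal `a`, the pattern `t`, and the variable names are illustrative assumptions, not part of the original code.

```matlab
% Illustrative 1-D data (hypothetical): the pattern [1 2 3] sits at indices 3..5.
a = [0 0 1 2 3 0 0 0];
t = [1 2 3];
N = numel(a);

A = fft(a);
T = fft(t, N);                        % zero-pad the pattern to the signal length

conv_out = real(ifft(A .* T));        % plain product: circular convolution
corr_out = real(ifft(A .* conj(T)));  % conjugated product: circular cross-correlation

[~, iconv] = max(conv_out);           % peak of the convolution
[~, icorr] = max(corr_out);           % peak of the correlation
fprintf('convolution peak at %d, correlation peak at %d\n', iconv, icorr);
```

Here the correlation peaks at index 3, where the pattern starts, while the convolution peaks at index 6, which does not mark the start of the pattern.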

Basically, you are using the template as a kernel to filter the image, then taking the maximum response of the output as the template location. The boxed region makes sense: that part of the image is almost entirely white (values of ~255 or so), and because convolution is a weighted sum, a kernel applied to a uniformly bright region produces a very large output. That is most likely why this area was identified as the maximum response. Conversely, applying the template to a dark area of the image produces a small output, so the true match is missed, because the template itself consists mostly of dark pixels.
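A small 1-D sketch of this effect, using made-up numbers (the signal, the bright run of 9s, and the names are assumptions for illustration only): the filtered output peaks inside the bright run rather than at the pattern.

```matlab
% Hypothetical signal: the pattern [1 2 3] at indices 3..5, a bright flat run at 10..12.
a = [0 0 1 2 3 0 0 0 0 9 9 9 0 0 0 0];
t = [1 2 3];                          % template used as a filter kernel
N = numel(a);

% Product of spectra = circular convolution of the signal with the template.
resp = real(ifft(fft(a) .* fft(t, N)));

[peak, ipeak] = max(resp);            % the weighted sum is largest over the bright run
fprintf('max response %.0f at index %d\n', peak, ipeak);
```

The response over the bright run reaches 9*(1+2+3) = 54, far above the 12 produced around the actual pattern, so taking the maximum picks the wrong place.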


However, you can certainly use the Fourier Transform to locate the template; I would recommend Phase Correlation for this. Instead of multiplying the two spectra, you compute the cross power spectrum. The cross power spectrum R between two signals in the frequency domain is defined as:

$$R = \frac{G_a \circ G_b^*}{\left|G_a \circ G_b^*\right|}$$

Source: Wikipedia

G_a and G_b are the original image and the template in the frequency domain, and * denotes the complex conjugate. The ∘ is the Hadamard product, or element-wise product. The division of the numerator by the denominator is also element-wise. Take the inverse Fourier transform of R and find the (x,y) location that produces the maximum response: this is where the template should be located in the background image.
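As a sanity check of the cross power spectrum, here is a minimal sketch on synthetic data. It assumes the idealised case where the "background" is just the "template" circularly shifted; the random test array, the shift values `dy` and `dx`, and the `eps` guard in the denominator are all assumptions added for illustration.

```matlab
% Idealised check: the background is the template content, circularly shifted.
t  = rand(8);                  % stand-in template (random test data)
dy = 2; dx = 3;                % hypothetical shift: 2 rows down, 3 columns right
a  = circshift(t, [dy dx]);    % stand-in background

Ga = fft2(a);
Gb = fft2(t);
R  = Ga .* conj(Gb);
R  = R ./ (abs(R) + eps);      % element-wise normalisation; eps avoids division by zero
c  = real(ifft2(R));

[~, imax] = max(c(:));
[ypeak, xpeak] = ind2sub(size(c), imax);
fprintf('peak at row %d, column %d\n', ypeak, xpeak);
```

With 1-based indexing the peak falls at row dy + 1 and column dx + 1, i.e. at the top-left corner of the shifted content, and its value is very close to 1 while every other entry is near zero.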

As such, you simply need to change the line of code that computes the "correlation" so that it computes the cross power spectrum instead. However, I'd like to point out something very important about where the peak lands. The template is zero-padded into the top-left corner of an array the size of the image, and the correlation compares it with windows of the template's size whose top-left corner is the origin. The location of the peak is therefore with respect to the top-left corner of the matched window. (normxcorr2 behaves differently: its output is larger than the image, and its peak marks the bottom-right corner of the match, which is why code written for normxcorr2 subtracts the template size.) To find the centre of the match, you add half of the template's rows and half of its columns to the peak location.

Because we are more or less doing the same operations for template matching (sliding windows, correlation, etc.) with the FFT / frequency domain, you must take this offset into account once you have found the peak in the correlation array. However, your call to imrect takes the top-left corner of the bounding box anyway, so there's no need for an offset here. As such, we're going to modify that code slightly, but keep the offset logic in mind if you later want to find the centre location of the match.
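If the centre of the match is wanted, the offset arithmetic might look like the sketch below. The numbers are hypothetical placeholders for the peak location and template size, and treating pixel centres as integer coordinates is an assumption.

```matlab
% Hypothetical values standing in for the detected peak and the template size.
xpeak = 4; ypeak = 3;     % top-left corner of the match (1-based column, row)
tx = 5; ty = 7;           % template width and height in pixels

% The window covers columns xpeak..xpeak+tx-1 and rows ypeak..ypeak+ty-1,
% so its centre is half the template extent away from the top-left pixel.
xcentre = xpeak + (tx - 1) / 2;
ycentre = ypeak + (ty - 1) / 2;
fprintf('centre at (x, y) = (%g, %g)\n', xcentre, ycentre);
```

For even template sizes this gives a half-pixel coordinate; round it if an integer pixel index is needed.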


I've modified your code as well to read in the images directly from StackOverflow so that it's reproducible:

clear all; close all;

template = rgb2gray(imread('http://i.stack.imgur.com/6bTzT.jpg'));
background = rgb2gray(imread('http://i.stack.imgur.com/FXEy7.jpg'));

%% calculate padding
bx = size(background, 2); 
by = size(background, 1);
tx = size(template, 2); % used for bbox placement
ty = size(template, 1);

%% fft
%c = real(ifft2(fft2(background) .* fft2(template, by, bx)));

%// Change - Compute the cross power spectrum
Ga = fft2(background);
Gb = fft2(template, by, bx);
R = Ga .* conj(Gb);
c = real(ifft2(R ./ (abs(R) + eps))); % eps guards against division by zero

%% find peak correlation
[max_c, imax] = max(c(:));
[ypeak, xpeak] = ind2sub(size(c), imax); % row and column of the peak
figure; surf(c), shading flat; % plot correlation

%% display best match
hFig = figure;
hAx  = axes;

%// New - no need to offset the coordinates anymore
%// xpeak and ypeak are already the top left corner of the matched window
position = [xpeak(1), ypeak(1), tx, ty];
imshow(background, 'Parent', hAx);
imrect(hAx, position);

With that, I get the following image:

I also get the following when showing a surface plot of the phase correlation output (the inverse transform of the cross power spectrum):

There is one clearly defined peak, while the rest of the output has a very small response. That's a property of Phase Correlation: the location of the maximum value is unambiguous, and this is where the template is located.
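Why the peak is so sharp can be sketched for an idealised case. Assume the background g_a is exactly the template content g_b circularly shifted by (Δx, Δy) in an array of size M by N (M along x, N along y); the zero-padded template used in the code only approximates this, so the real output is a sharp peak rather than an exact impulse. Under that assumption:

$$
g_a(x, y) = g_b(x - \Delta x,\; y - \Delta y)
\;\Longrightarrow\;
G_a(u, v) = G_b(u, v)\, e^{-2\pi i \left(\frac{u\,\Delta x}{M} + \frac{v\,\Delta y}{N}\right)}
$$

$$
R = \frac{G_a \circ G_b^*}{\left|G_a \circ G_b^*\right|}
  = \frac{\left|G_b\right|^2 e^{-2\pi i \left(\frac{u\,\Delta x}{M} + \frac{v\,\Delta y}{N}\right)}}{\left|G_b\right|^2}
  = e^{-2\pi i \left(\frac{u\,\Delta x}{M} + \frac{v\,\Delta y}{N}\right)}
$$

$$
r(x, y) = \mathcal{F}^{-1}\{R\} = \delta(x - \Delta x,\; y - \Delta y)
$$

The normalisation removes all magnitude information, so only the phase difference (the shift) survives, and its inverse transform is a single impulse at the shift.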


Hope this helps!
