Matlab Template Matching Using FFT


Problem description

I am struggling with template matching in the Fourier domain in Matlab. Here are my images (the artist is RamalamaCreatures on DeviantArt):

My aim is to place a bounding box around the ear of the possum, like this example (where I performed template matching using normxcorr2):

Here is the Matlab code I am using:

clear all; close all;

template = rgb2gray(imread('possum_ear.jpg'));
background = rgb2gray(imread('possum.jpg'));

%% calculate padding
bx = size(background, 2); 
by = size(background, 1);
tx = size(template, 2); % used for bbox placement
ty = size(template, 1);

%% fft
c = real(ifft2(fft2(background) .* fft2(template, by, bx)));

%% find peak correlation
[max_c, imax]   = max(abs(c(:)));
[ypeak, xpeak] = find(c == max(c(:)));
figure; surf(c), shading flat; % plot correlation 

%% display best match
hFig = figure;
hAx  = axes;
position = [xpeak(1)-tx, ypeak(1)-ty, tx, ty];
imshow(background, 'Parent', hAx);
imrect(hAx, position);

The code is not functioning as intended - it is not identifying the correct region. This is the failed result - the wrong area is boxed:

This is the surface plot of the correlations for the failed match:

Hope you can help! Thanks.

Solution

What you're doing in your code is actually not correlation at all. You are convolving the template with the input image. Recall from the convolution theorem that multiplying the spectra of two signals is equivalent to convolving the two signals in the time/spatial domain (circularly, in the case of the discrete Fourier transform).
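To make the distinction concrete, here is a minimal 1-D sketch. The signal values are toy numbers chosen for illustration, not taken from the question. It suggests that a plain product of spectra gives circular convolution, while conjugating one spectrum gives circular cross-correlation:

```matlab
% Circular convolution vs. circular cross-correlation via the FFT (toy 1-D case).
a = [1 2 3 4];          % stands in for the image
b = [0 1 0 0];          % stands in for the template: unit impulse at sample 2

conv_fft = real(ifft(fft(a) .* fft(b)));        % plain product      -> convolution
corr_fft = real(ifft(fft(a) .* conj(fft(b))));  % conjugated product -> correlation

% Direct circular sums for comparison (0-based lag k, indices wrapped with mod).
N = numel(a);
conv_direct = zeros(1, N);
corr_direct = zeros(1, N);
for k = 0:N-1
    for m = 0:N-1
        conv_direct(k+1) = conv_direct(k+1) + a(mod(k-m, N)+1) * b(m+1);
        corr_direct(k+1) = corr_direct(k+1) + a(mod(k+m, N)+1) * b(m+1);
    end
end

disp(conv_fft);   % a shifted right by one sample: 4 1 2 3
disp(corr_fft);   % a shifted left by one sample:  2 3 4 1
```

The only difference between the two results is the `conj` call, which is exactly what the original line of code is missing.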

Basically, you are using the template as a kernel and filtering the image with it. You then find the maximum response of this output, and that is deemed to be where the template is. The location that got boxed makes sense: that region is entirely white, and filtering an entirely white region with the template as the kernel gives a very large response, which is most likely why that area was identified as the maximum. Specifically, the region contains many high values (~255 or so), and because convolution is a weighted sum, convolving the template patch with this region produces a very large output. Conversely, applying the template to a dark area of the image produces a small output, which is misleading because the template itself consists of dark pixels.
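The brightness bias can be reproduced on synthetic data. The sketch below is an illustration under assumed values (a 64x64 dark textured image, an added all-white block, and a template cut from the dark part); none of these come from the original images.

```matlab
% Plain spectrum product: the peak follows the brightest region, not the template.
bg = 10 + 5 * mod((1:64)' * (1:64), 7);   % dark deterministic texture, values 10..40
bg(40:60, 40:60) = 255;                   % hypothetical all-white block
t  = bg(5:12, 5:12);                      % template cut from the dark area at (5,5)

% Same operation as the question's code: multiply the spectra (convolution).
c = real(ifft2(fft2(bg) .* fft2(t, size(bg,1), size(bg,2))));

[~, imax] = max(c(:));
[ypeak, xpeak] = ind2sub(size(c), imax);
fprintf('true top-left: (5, 5), reported peak: (%d, %d)\n', ypeak, xpeak);
```

Because every template weight is positive, the weighted sum is largest wherever the window overlaps only white pixels, so the reported peak falls inside the white block rather than at the true location.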


However, you can certainly use the Fourier Transform to locate the template, but I would recommend Phase Correlation. Rather than multiplying the two spectra, you compute the cross power spectrum. The cross power spectrum R between two signals in the frequency domain is defined as:

$$R = \frac{G_a \circ G_b^*}{\left|G_a \circ G_b^*\right|}$$

Source: Wikipedia

Ga and Gb are the original image and the template in the frequency domain, and * denotes the complex conjugate. The o is the Hadamard product, also known as the element-wise product. The division of the numerator by the denominator is also element-wise. To use the cross power spectrum, transform it back to the spatial domain and find the (x,y) location that produces the absolute maximum response; this is where the template should be located in the background image.
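As a self-contained check of this idea, the following sketch runs phase correlation on synthetic data instead of the possum images. The random background, the patch position `r0`, `c0`, and the `eps` guard against division by zero are all assumptions added for illustration.

```matlab
% Phase correlation on synthetic data: recover where a patch was cut from.
rng(0);                                   % fixed seed so the sketch is repeatable
background = randn(128, 128);             % synthetic zero-mean "image"
r0 = 40; c0 = 75;                         % hypothetical top-left corner of the patch
template = background(r0:r0+31, c0:c0+31);

Ga = fft2(background);
Gb = fft2(template, size(background,1), size(background,2));  % zero-padded template
X  = Ga .* conj(Gb);                      % element-wise product with the conjugate
R  = X ./ max(abs(X), eps);               % cross power spectrum; eps avoids 0/0
c  = real(ifft2(R));                      % back to the spatial domain

[~, imax] = max(c(:));
[ypeak, xpeak] = ind2sub(size(c), imax);
fprintf('cut from (%d, %d), peak found at (%d, %d)\n', r0, c0, ypeak, xpeak);
```

Using `max` with a linear index followed by `ind2sub` returns a single peak, which sidesteps the multiple matches that `find(c == max(c(:)))` can return when values tie.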

As such, you simply need to change the line of code that computes the "correlation" so that it computes the cross power spectrum instead. However, I'd like to point out something very important. When you perform normxcorr2, the correlation starts right at the top-left corner of the image. The template matching starts at this location, where the template is compared with a window of the same size whose top-left corner is the origin. When finding the location of the template match, the location is with respect to the top-left corner of the matched window. Once you compute normxcorr2, you traditionally add half of the rows and half of the columns to the location of the maximum response to find the centre location.

Because we are more or less doing the same operations for template matching (sliding windows, correlation, etc.) with the FFT / frequency domain, you must also take this into account once you have found the peak in this correlation array. However, your call to imrect, which draws a rectangle around where the template matches, takes the top-left corner of a bounding box anyway, so there's no need to apply the offset here. As such, we're going to modify that code slightly, but keep the offset logic in mind if you later want to find the centre location of the match.
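The offset logic might look like the sketch below. The peak coordinates and template size are hypothetical stand-ins for the values the script computes, and the centre is taken as the mid-point of the pixel indices the box covers, which is one reasonable convention rather than the only one.

```matlab
% From the top-left corner of the match to a bounding box and a centre point.
xpeak = 75; ypeak = 40;     % hypothetical top-left corner of the match (column, row)
tx = 32; ty = 32;           % hypothetical template width and height in pixels

position = [xpeak, ypeak, tx, ty];                    % [x y width height], as imrect expects
centre   = [xpeak + (tx - 1)/2, ypeak + (ty - 1)/2];  % mid-point of the covered pixels

fprintf('box covers columns %d..%d, rows %d..%d\n', ...
        xpeak, xpeak + tx - 1, ypeak, ypeak + ty - 1);
fprintf('centre at (x, y) = (%.1f, %.1f)\n', centre(1), centre(2));
```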


I've modified your code as well to read in the images directly from StackOverflow so that it's reproducible:

clear all; close all;

template = rgb2gray(imread('http://i.stack.imgur.com/6bTzT.jpg'));
background = rgb2gray(imread('http://i.stack.imgur.com/FXEy7.jpg'));

%% calculate padding
bx = size(background, 2); 
by = size(background, 1);
tx = size(template, 2); % used for bbox placement
ty = size(template, 1);

%% fft
%c = real(ifft2(fft2(background) .* fft2(template, by, bx)));

%// Change - Compute the cross power spectrum
Ga = fft2(background);
Gb = fft2(template, by, bx);
c = real(ifft2((Ga.*conj(Gb))./abs(Ga.*conj(Gb))));

%% find peak correlation
[max_c, imax]   = max(abs(c(:)));
[ypeak, xpeak] = find(c == max(c(:)));
figure; surf(c), shading flat; % plot correlation    

%% display best match
hFig = figure;
hAx  = axes;

%// New - no need to offset the coordinates anymore
%// xpeak and ypeak are already the top left corner of the matched window
position = [xpeak(1), ypeak(1), tx, ty];
imshow(background, 'Parent', hAx);
imrect(hAx, position);

With that, I get the following image:

I also get the following when showing a surface plot of the cross power spectrum:

There is a clearly defined peak, while the rest of the output has a very small response. That's actually a property of Phase Correlation, so the location of the maximum value is unambiguous, and this is where the template is located.


Hope this helps!
