Extract a page from a uniform background in an image


Problem description

If I have an image in which a page of text was shot on a uniform background, how can I automatically detect the boundaries between the paper and the background?

An example of the kind of image I want to process is shown below. The images I will be dealing with consist of a single page on a uniform background, and the page can be rotated at any angle.

Recommended answer

One simple method would be to convert the image to grayscale and then threshold it by some known value. The problem with that approach is that the threshold is global: if you make it too high, some of the paper at the bottom of the image will be lost. If you make it too low, you'll certainly get the paper, but you'll also include a lot of background pixels, and it will probably be difficult to remove those pixels with post-processing.
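To make that trade-off concrete, here is a minimal sketch on a synthetic image rather than the asker's photo. The image size, the reflectance values, the lighting falloff and both threshold values are assumptions chosen only for illustration.

```matlab
% Synthetic stand-in for a photographed page (hypothetical data): a bright
% "paper" block on a darker background, with lighting that fades toward
% the bottom of the frame.
rows = 200; cols = 200;
paper = false(rows, cols);
paper(21:180, 41:160) = true;              % ground-truth paper region

falloff = linspace(1.0, 0.3, rows)';       % 1.0 at the top row, 0.3 at the bottom
gray = 90 * ones(rows, cols);              % background brightness before lighting
gray(paper) = 220;                         % paper brightness before lighting
gray = gray .* repmat(falloff, 1, cols);   % apply the uneven lighting

% Two global thresholds (values picked for this synthetic image only)
maskHigh = gray > 150;                     % "too high"
maskLow  = gray > 80;                      % "too low"

missedPaper     = nnz(paper & ~maskHigh);  % paper pixels lost by the high threshold
falseBackground = nnz(~paper & maskLow);   % background pixels kept by the low threshold

fprintf('High threshold loses %d paper pixels\n', missedPaper);
fprintf('Low threshold keeps %d background pixels\n', falseBackground);
```

In this synthetic case the darkest paper pixel is darker than the brightest background pixel, so no single global value separates the two cleanly.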

One thing I can suggest is to use an adaptive thresholding algorithm. One that has worked for me in the past is the Bradley-Roth adaptive thresholding algorithm. You can read about it in a post I commented on a while back:

Bradley adaptive thresholding - confused (questions)

However, if you want the gist of it: an integral image of the grayscale version of the image is computed first. The integral image is important because it lets you calculate the sum of the pixels within any rectangular window in O(1) time. Computing the integral image itself takes time proportional to the number of pixels (O(n^2) for an n x n image), but you only have to do that once. With the integral image, you scan neighbourhoods of pixels of size s x s, and if the current pixel's intensity is at least t% lower than the average intensity within its s x s window, that pixel is classified as background. Otherwise, it is classified as part of the foreground. This is adaptive because the thresholding is done using local pixel neighbourhoods rather than a global threshold.
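The constant-time window sum can be checked on a small matrix. This is a sketch under assumptions: the test matrix, the window coordinates and the pixel being tested are arbitrary, and the zero padding is a convenience of this example (the implementation further down clamps coordinates instead).

```matlab
A = magic(6);                              % any small test matrix works
intA = cumsum(cumsum(A, 2), 1);            % integral image (summed-area table)

% Pad with a leading row and column of zeros so that windows touching the
% top or left border need no special case
P = zeros(size(A) + 1);
P(2:end, 2:end) = intA;

% Window covering rows r1..r2 and columns c1..c2 (inclusive)
r1 = 2; r2 = 4; c1 = 3; c2 = 5;

% Four lookups give the window sum, regardless of the window size
windowSum = P(r2+1, c2+1) - P(r1, c2+1) - P(r2+1, c1) + P(r1, c1);

% Brute-force sum over the same window, for comparison
directSum = sum(sum(A(r1:r2, c1:c2)));

% Local mean, and the t% test applied to the pixel at the window centre
t = 15;
windowMean = windowSum / ((r2 - r1 + 1) * (c2 - c1 + 1));
isBackground = A(3, 4) <= windowMean * (100 - t) / 100;
```

Here windowSum and directSum agree, but the first costs four lookups no matter how large the window is, which is what makes scanning an s x s neighbourhood around every pixel affordable.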

I've coded an implementation of the Bradley-Roth algorithm here for you. By default, s is 1/8th of the width of the image and t is 15%. Therefore, you can call it this way to use the default parameters:

out = adaptiveThreshold(im);

im is the input image and out is a binary image that denotes what belongs to the foreground (logical true) or the background (logical false). You can play around with the second and third input parameters, s being the size of the thresholding window and t the percentage we talked about above, by calling the function like so:

out = adaptiveThreshold(im, s, t);

The code for the algorithm looks like this:

function [out] = adaptiveThreshold(im, s, t)

%// Error checking of the input
%// Too few or too many arguments? Check this first, because the default
%// for s below reads from im
if nargin == 0, error('Too few arguments'); end
if nargin >= 4, error('Too many arguments'); end

%// Default value for s is 1/8th the width of the image
%// Must make sure that this is a whole number
if nargin <= 1, s = round(size(im,2) / 8); end

%// Default value for t is 15
%// t is used to determine whether the current pixel is t% lower than the
%// average in the particular neighbourhood
if nargin <= 2, t = 15; end

%// Convert to grayscale if necessary then cast to double to ensure no
%// saturation
if size(im, 3) == 3
    im = double(rgb2gray(im));
elseif size(im, 3) == 1
    im = double(im);
else
    error('Incompatible image: Must be a colour or grayscale image');
end

%// Compute integral image
intImage = cumsum(cumsum(im, 2), 1);

%// Define grid of points
[rows, cols] = size(im);
[X,Y] = meshgrid(1:cols, 1:rows);

%// Ensure s is even so that we are able to index the image properly
s = s + mod(s,2);

%// Access the four corners of each neighbourhood
x1 = X - s/2; x2 = X + s/2;
y1 = Y - s/2; y2 = Y + s/2;

%// Ensure no co-ordinates are out of bounds
x1(x1 < 1) = 1;
x2(x2 > cols) = cols;
y1(y1 < 1) = 1;
y2(y2 > rows) = rows;

%// Count how many pixels there are in each neighbourhood
count = (x2 - x1) .* (y2 - y1);

%// Compute row and column co-ordinates to access each corner of the
%// neighbourhood for the integral image
f1_x = x2; f1_y = y2;
f2_x = x2; f2_y = y1 - 1; f2_y(f2_y < 1) = 1;
f3_x = x1 - 1; f3_x(f3_x < 1) = 1; f3_y = y2;
f4_x = f3_x; f4_y = f2_y;

%// Compute 1D linear indices for each of the corners
ind_f1 = sub2ind([rows cols], f1_y, f1_x);
ind_f2 = sub2ind([rows cols], f2_y, f2_x);
ind_f3 = sub2ind([rows cols], f3_y, f3_x);
ind_f4 = sub2ind([rows cols], f4_y, f4_x);

%// Calculate the areas for each of the neighbourhoods
sums = intImage(ind_f1) - intImage(ind_f2) - intImage(ind_f3) + ...
    intImage(ind_f4);

%// Determine whether the summed area surpasses a threshold
%// Set this output to 0 if it doesn't
locs = (im .* count) <= (sums * (100 - t) / 100);
out = true(size(im));
out(locs) = false;

end






When I use your image and set s = 500 and t = 5, this is the code I run to get the thresholded image:

im = imread('http://i.stack.imgur.com/MEcaz.jpg');
out = adaptiveThreshold(im, 500, 5);
imshow(out);

You can see that there are some spurious white pixels at the bottom of the image, and there are some holes inside the paper that we need to fill. As such, let's use some morphology: declare a structuring element that's a 15 x 15 square, perform an opening to remove the noisy pixels, then fill in the holes when we're done:

se = strel('square', 15);
out = imopen(out, se);
out = imfill(out, 'holes');
imshow(out);

This is what I get:

Not bad, eh? Now if you really want to see what the image looks like with the paper segmented, we can use this mask and multiply it with the original image. This way, any pixels that belong to the paper are kept, while those that belong to the background are set to zero:

out_colour = bsxfun(@times, im, uint8(out));
imshow(out_colour);
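
As a side note, on MATLAB R2016b or later the same masking can be written with implicit expansion instead of bsxfun. The sketch below checks that the two forms agree on a small random image; the image, the mask and the variable names are hypothetical stand-ins, not the asker's data.

```matlab
% Hypothetical stand-in data: a tiny random colour image and a mask
im = uint8(randi([0 255], 4, 5, 3));
mask = false(4, 5);
mask(2:3, 2:4) = true;                       % pretend this block is the paper

viaBsxfun = bsxfun(@times, im, uint8(mask)); % form used in the answer
viaExpand = im .* uint8(mask);               % implicit expansion, R2016b or later

% Both forms should agree, with every background pixel zeroed out
sameResult = isequal(viaBsxfun, viaExpand);
backgroundZero = all(viaExpand(repmat(~mask, [1 1 3])) == 0);
```

On older releases, bsxfun remains the way to multiply an M x N mask into an M x N x 3 image without replicating the mask by hand.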

We get:

You'll have to play around with the parameters until it works for you; the values above are the ones that worked for the particular page you showed us. Image processing is all about trial and error, and about putting processing steps in the right sequence until you get something good enough for your purposes.

Happy image filtering!
