How can I detect edges on an image of a document, and cut sections into separate images?


Problem description

The task is to take an image of a document and use the straight lines surrounding its different 'sections' to split the image into separate documents for further parsing. The size of the sections varies completely from page to page (we're dealing with several thousand pages). Here is what one of these images looks like:

Example of the document layout:

Image analysis and manipulation are completely new to me. So far I've tried the Scikit-image edge detection algorithms to find the 'boxes', hoping to use their coordinates to cut up the image. However, the two algorithms I've tried (Canny, Hough) pick up lines of text as 'edges' at high sensitivity, and miss the lines I actually want at low sensitivity. I could write something custom and low-level to detect the boxes myself, but I have to assume this is a solved problem.

Is my approach headed in the right direction? Thank you!

Recommended answer

You don't seem to be getting any OpenCV answers, so I had a try with ImageMagick, just in the Terminal at the command line. ImageMagick is installed on most Linux distros and is available for macOS and Windows for free. The technique is pretty readily adaptable to OpenCV, so you can port it across if it works for you.

My first step was to apply a 5x5 box filter and threshold at 80% to get rid of noise and scanning artefacts, and then invert the result (probably because I was planning on using morphology, but didn't in the end):

convert news.jpg -depth 16 -statistic mean 5x5 -threshold 80% -negate z.png
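If you do port this step, the logic is small enough to sketch in plain Python. This is only an illustration of the idea (mean filter, threshold at 80%, invert), not what ImageMagick does internally: the function names are made up for this example, the image is assumed to be a list of rows of 8-bit grey values, and replicating edge pixels at the border is an assumption.

```python
def box_mean(img, size=5):
    """Mean filter over a size x size window; border pixels are replicated
    (an assumption about edge handling, chosen for simplicity)."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy in range(-r, r + 1):
                yy = min(max(y + dy, 0), h - 1)  # clamp row index to the image
                for dx in range(-r, r + 1):
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    total += img[yy][xx]
            out[y][x] = total / (size * size)
    return out


def threshold_negate(img, fraction=0.8, max_value=255):
    """Threshold at fraction * max_value, then invert.
    Bright (paper) pixels end up as 0, dark (ink) pixels as 1."""
    cut = fraction * max_value
    return [[0 if v > cut else 1 for v in row] for row in img]


def preprocess(img):
    """Hypothetical helper chaining the two steps above."""
    return threshold_negate(box_mean(img))
```

The averaging is what removes specks: a single dark pixel barely moves the mean of its 25-pixel window, so it stays above the threshold and is treated as paper, while a solid ruled line drags the mean well below it.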

I then ran that through "Connected Components Analysis" and discarded all blobs with too small an area (under 2000 pixels):

convert news.jpg -depth 16 -statistic mean 5x5 -threshold 80% -negate  \
   -define connected-components:verbose=true                           \
   -define connected-components:area-threshold=2000                    \
   -connected-components 4 -auto-level output.png

Output

Objects (id: bounding-box centroid area mean-color):
  110: 1254x723+59+174 686.3,536.0 901824 srgb(0,0,0)
  2328: 935x723+59+910 526.0,1271.0 676005 srgb(0,0,0)
  0: 1370x1692+0+0 685.2,712.7 399651 srgb(0,0,0)
  2329: 303x722+1007+911 1158.0,1271.5 218766 srgb(0,0,0)
  25: 1262x40+54+121 685.2,140.5 49820 srgb(255,255,255)
  109: 1265x735+54+168 708.3,535.0 20601 srgb(255,255,255)
  1: 1274x64+48+48 675.9,54.5 16825 srgb(255,255,255)
  2326: 945x733+54+905 526.0,1271.0 16660 srgb(255,255,255)  
  2327: 312x732+1003+906 1169.9,1271.5 9606 srgb(255,255,255)  <--- THIS ONE
  421: 403x15+328+342 528.6,350.1 4816 srgb(255,255,255)
  7: 141x23+614+74 685.5,85.2 2831 srgb(255,255,255)

In the output above, the fields are labelled in the first line, but the interesting ones are the second field (the bounding-box geometry) and the fourth field (the blob area). As you can see, there are 11 lines, so it has found 11 blobs in the image. The second field, AxB+C+D, means a rectangle A pixels wide by B pixels tall, with its top-left corner C pixels from the left edge of the image and D pixels down from the top.
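For anyone porting the labelling step, here is a rough pure-Python sketch of 4-connected component labelling that reports each blob in the same AxB+C+D form. It is an assumption-laden illustration rather than ImageMagick's implementation: the function name and dictionary keys are invented for this example, and blobs below the area limit are simply dropped.

```python
from collections import deque


def connected_components(mask, min_area=2000):
    """Label 4-connected regions of equal value in a 2D grid and return
    those with at least min_area pixels, largest first."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y0 in range(h):
        for x0 in range(w):
            if seen[y0][x0]:
                continue
            value = mask[y0][x0]
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            area = 0
            left = right = x0
            top = bottom = y0
            while queue:  # breadth-first flood fill of one region
                x, y = queue.popleft()
                area += 1
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and not seen[ny][nx] and mask[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if area >= min_area:
                width, height = right - left + 1, bottom - top + 1
                blobs.append({
                    "geometry": f"{width}x{height}+{left}+{top}",
                    "area": area,
                    "value": value,
                })
    return sorted(blobs, key=lambda b: b["area"], reverse=True)
```

Run on a mask containing a ruled frame, this yields one blob for the frame itself and another, slightly smaller, for the area it encloses. The listing above appears to show the same pattern, for example the white blob 2327 (312x732+1003+906) and the black blob 2329 (303x722+1007+911) just inside it.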

Let's look at the one I have marked with an arrow, which starts 2327: 312x732+1003+906, and draw a rectangle over it:

convert news.jpg -fill "rgba(255,0,0,0.5)" -draw "rectangle 1003,906 1315,1638" oneArticle.png

Image: https://i.stack.imgur.com/uoQO1.jpg
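The corner coordinates in that -draw command follow directly from the geometry: the second corner is the top-left corner plus the width and height. A small sketch of that arithmetic, with helper names made up for this example:

```python
import re

# Matches geometry strings of the form WIDTHxHEIGHT+X+Y, e.g. 312x732+1003+906
GEOMETRY = re.compile(r"^(\d+)x(\d+)\+(\d+)\+(\d+)$")


def parse_geometry(text):
    """Return (width, height, x, y) from a WxH+X+Y geometry string."""
    match = GEOMETRY.match(text.strip())
    if match is None:
        raise ValueError(f"not a WxH+X+Y geometry: {text!r}")
    width, height, x, y = map(int, match.groups())
    return width, height, x, y


def rectangle_corners(text):
    """Top-left and bottom-right corners, computed as in the answer:
    (x, y) and (x + width, y + height)."""
    width, height, x, y = parse_geometry(text)
    return (x, y), (x + width, y + height)


def draw_argument(text):
    """Build the string passed to -draw for this geometry."""
    (x0, y0), (x1, y1) = rectangle_corners(text)
    return f"rectangle {x0},{y0} {x1},{y1}"
```

For 312x732+1003+906 this gives 1003+312 = 1315 and 906+732 = 1638, matching the rectangle drawn above.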

If you want to crop that article out into a new image:

convert news.jpg -crop 312x732+1003+906 article.jpg
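With several thousand pages you would not want to copy geometries by hand. One possible way to automate it, sketched here under assumptions, is to parse the verbose listing and build one crop command per blob you decide to keep. The function names and the article_<id>.jpg output naming are hypothetical, and which blobs count as articles is left to the caller because that depends on your pages.

```python
import re

# One blob line looks like:
#   2327: 312x732+1003+906 1169.9,1271.5 9606 srgb(255,255,255)
LINE = re.compile(
    r"^\s*(?P<id>\d+):\s+(?P<geometry>\d+x\d+\+\d+\+\d+)\s+"
    r"(?P<cx>[\d.]+),(?P<cy>[\d.]+)\s+(?P<area>\d+)\s+(?P<color>\S+)"
)


def parse_listing(text):
    """Turn the verbose connected-components listing into a list of dicts.
    Lines that do not look like blob entries (such as the header) are skipped."""
    blobs = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            blobs.append({
                "id": int(match["id"]),
                "geometry": match["geometry"],
                "area": int(match["area"]),
                "color": match["color"],
            })
    return blobs


def crop_commands(blobs, source="news.jpg"):
    """Build one ImageMagick argument list per blob (nothing is executed here).
    +repage resets the virtual canvas after cropping."""
    return [
        ["convert", source, "-crop", blob["geometry"], "+repage",
         f"article_{blob['id']}.jpg"]
        for blob in blobs
    ]
```

Each returned list could then be passed to subprocess.run once you have checked that the selection looks right on a few sample pages.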

If we draw in all the other boxes, we get:
