Image processing to remove lines

Question

I am trying to end up with an image of only text. My code would take this image as greyscale and remove all the long lines from it and everything except for text/symbols/measurements.


Is this something that can be accomplished using techniques such as Gaussian blur, erosion, etc.? Or perhaps by iterating through the raw pixel data, somehow determining whether a pixel belongs to a line rather than a text element, and then turning the pixels that are not part of text to white? I have begun to look at the ImageMagick library as well as some other solutions for removing lines. I am new to image processing techniques, so any help or path towards execution would be really helpful. I am working in a Java environment.

Recommended answer

I have a few ideas and I may develop some of them further, or may not!

1. Use colour

The diagram looks like you have generated it from some package, including Ghostscript, rather than scanning it from paper, so I presume you can control its generation. If so, the very simplest and cleanest option is probably to insert a command into the PostScript to change the colour of all text, or alternatively of all lines and circles. Then you can just use colour to extract the text.
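As a rough illustration of the colour idea, here is a minimal pure-Python sketch. It assumes, purely for the example, that the text has been re-coloured pure red while the lines stay black on a white background; the function name `keep_colour` and the toy image are hypothetical and not part of the original answer.

```python
# Toy image: rows of (R, G, B) pixels.
# Assumption for this sketch: text is pure red, lines are black, background is white.
WHITE, BLACK, RED = (255, 255, 255), (0, 0, 0), (255, 0, 0)

def keep_colour(image, target, background=WHITE, tolerance=0):
    """Return a copy in which every pixel not within `tolerance` of `target`
    (per channel) is replaced by `background`."""
    def close(pixel):
        return all(abs(a - b) <= tolerance for a, b in zip(pixel, target))
    return [[p if close(p) else background for p in row] for row in image]

image = [
    [BLACK, BLACK, BLACK, BLACK],   # a horizontal line
    [WHITE, RED,   RED,   WHITE],   # part of a letter
    [WHITE, WHITE, RED,   BLACK],   # letter pixel next to a line pixel
]
text_only = keep_colour(image, RED)
for row in text_only:
    print(row)
```

With anti-aliased output the text edges will not be exactly the target colour, so a non-zero `tolerance` would probably be needed in practice.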

2. Use a filter

You could use a long horizontal probing element and a median to determine horizontal lines and a long vertical probing element to remove vertical lines. Obviously, you can fiddle with the lengths etc, but that would look like this:

convert drawing.png                                              \
  \( -clone 0 -threshold 50% -negate -statistic median 200x1 \)  \
 -compose lighten -composite                                     \
  \( -clone 0 -threshold 50% -negate -statistic median 1x200 \)  \
 -composite result.png

I tried with 200 as the length:

And with 500 as the length:
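To see why the length of the probing element matters, here is a toy one-dimensional sketch of the median step on a single row of a thresholded, negated image (1 = ink, 0 = background). It is not ImageMagick's implementation: the function name `running_median_binary` is hypothetical, and padding the edges with background is a simplifying assumption. For a 0/1 row with an odd window length, the median is 1 exactly when more than half of the window is ink, so only runs longer than about half the window survive.

```python
def running_median_binary(row, length):
    """Median of a 0/1 row over a centred window of `length` pixels.
    Pixels outside the row are treated as 0 (an assumption of this sketch)."""
    half = length // 2
    out = []
    for i in range(len(row)):
        window = [row[j] if 0 <= j < len(row) else 0
                  for j in range(i - half, i - half + length)]
        # For binary data the median is 1 when ones are the strict majority.
        out.append(1 if sum(window) * 2 > length else 0)
    return out

# A 2-pixel stroke (part of a letter) followed by a 9-pixel run (a line).
row = [0, 1, 1, 0, 0, 0] + [1] * 9 + [0]

detected = running_median_binary(row, 5)   # 1 where a long horizontal run is found
cleaned = [0 if d else p for p, d in zip(row, detected)]  # erase detected line pixels

print(detected)
print(cleaned)
```

Here the short stroke is ignored by the median and survives in `cleaned`, while the long run is detected and erased. A longer window leaves more short strokes untouched but also misses shorter line segments, which is the trade-off between the two lengths tried above.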

3. Use Connected Component Analysis or Blob Analysis

The idea would be to find all the blobs in the image, and then remove those blobs larger than the size of the letters you wish to retain. I extracted a portion of your image to play with this approach:

convert extract.png -colorspace gray -negate -threshold 50% \
   -define connected-components:verbose=true                \
   -connected-components 8 -auto-level output.png 

Output

Objects (id: bounding-box centroid area mean-color):
  2: 943x660+77+0 553.0,296.5 536272 srgb(0,0,0)
  0: 73x660+0+0 36.0,329.3 48150 srgb(0,0,0)
  10: 279x176+376+484 507.5,582.9 42374 srgb(0,0,0)
  8: 167x99+488+413 574.9,458.8 8939 srgb(0,0,0)
  5: 291x253+370+407 517.6,486.0 8121 srgb(255,255,255)
  7: 166x83+397+413 477.3,450.4 7479 srgb(0,0,0)
  9: 77x90+578+436 628.7,491.1 3511 srgb(0,0,0)
  6: 81x67+376+413 403.5,438.0 3197 srgb(0,0,0)
  1: 4x660+73+0 74.5,329.5 2640 srgb(255,255,255)
  3: 221x154+124+328 213.8,440.1 2225 srgb(255,255,255)
  4: 198x154+686+378 798.3,488.4 2133 srgb(255,255,255)
  11: 38x59+136+559 154.5,588.1 1094 srgb(255,255,255)
  12: 37x59+790+559 808.0,588.0 955 srgb(255,255,255)
  13: 37x59+837+559 855.0,588.0 955 srgb(255,255,255)
  15: 37x58+230+560 248.6,588.2 888 srgb(255,255,255)
  16: 37x58+742+560 760.6,588.2 888 srgb(255,255,255)
  14: 39x58+180+560 201.5,587.8 862 srgb(255,255,255)   <--- Let's look at this one
  19: 23x45+844+566 855.0,588.0 848 srgb(0,0,0)
  18: 23x45+797+566 808.0,588.0 848 srgb(0,0,0)
  20: 24x22+143+589 154.5,599.5 420 srgb(0,0,0)
  17: 18x16+146+566 154.5,573.6 227 srgb(0,0,0)
  21: 8x11+114+606 117.5,611.0 72 srgb(255,255,255)
  22: 8x11+720+606 723.5,611.0 72 srgb(255,255,255)
  23: 2x20+0+628 0.3,637.5 30 srgb(255,255,255)

The fields are titled at the start of the output, but basically looking at blob 14:

 14: 39x58+180+560 201.5,587.8 862 srgb(255,255,255)

It is 39 pixels wide and 58 pixels tall, and it is located at offset 180,560 from the top-left corner. It is white (255,255,255), which, since I negated the image, means it is black in the original image, so it corresponds to the size of a letter of your text (50x70 or so).

Just by way of explanation (not necessary for actual processing), let's draw that as a rectangle onto the extract:

convert extract.png -fill red -draw "rectangle 180,560 219,617" aBlob.png

Note that the blob listing gives an image offset plus a width and height, whereas the -draw rectangle command takes the top-left and bottom-right corners, so we need to add the width and height to the offset to get the bottom-right corner.
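The offset-plus-size arithmetic can be sketched as follows. The helper name `geometry_to_corners` is hypothetical; it simply parses a `WxH+X+Y` bounding-box string of the kind shown in the listing and adds the width and height to the offset.

```python
import re

def geometry_to_corners(geometry):
    """Convert a 'WxH+X+Y' bounding box into (x0, y0, x1, y1),
    where (x1, y1) is the offset plus the width and height."""
    match = re.fullmatch(r"(\d+)x(\d+)\+(\d+)\+(\d+)", geometry)
    if match is None:
        raise ValueError(f"not a WxH+X+Y geometry: {geometry!r}")
    width, height, x, y = map(int, match.groups())
    return x, y, x + width, y + height

print(geometry_to_corners("39x58+180+560"))  # (180, 560, 219, 618)
```

If the last pixel inside the box is wanted rather than offset plus size, subtract one from each of the bottom-right coordinates.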

Good, so we can now make a mask of all the letters!

convert extract.png -colorspace gray -negate -threshold 50% \
   -define connected-components:verbose=true                \
   -connected-components 8 -auto-level output.png |
   awk -F"[ x+]" '/255,255,255/ && $4<=50 && $5<=80 {
      printf "fill white rectangle %d,%d %d,%d\n", $6, $7, $6+$4, $7+$5
   }' > draw.txt

Output (in file draw.txt)

fill white rectangle 136,559 174,618
fill white rectangle 790,559 827,618
fill white rectangle 837,559 874,618
fill white rectangle 230,560 267,618
fill white rectangle 742,560 779,618
fill white rectangle 180,560 219,618
fill white rectangle 114,606 122,617
fill white rectangle 720,606 728,617
fill white rectangle 0,628 2,648

Here is how to put all those protected blobs into a mask:

convert -size 1020x660 xc:black -draw @draw.txt mask.png

Resulting in this mask:

Then we can apply the mask to the image:

convert extract.png mask.png -compose copyopacity -composite result.png
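For readers who would rather do this in their own code than through the command line, the blob-filtering idea can be sketched in pure Python as below. This is an illustrative variant, not what ImageMagick does internally: it labels 8-connected blobs with a breadth-first search and keeps the pixels of blobs whose bounding box fits within a size limit, instead of painting bounding-box rectangles into a mask. The function names and the toy grid are hypothetical.

```python
from collections import deque

def components(grid):
    """Yield (pixels, width, height) for each 8-connected blob of 1s."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                # Visit all 8 neighbours (the centre is already marked seen).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            yield pixels, max(xs) - min(xs) + 1, max(ys) - min(ys) + 1

def keep_small_blobs(grid, max_w, max_h):
    """Return a grid containing only blobs whose bounding box is at most max_w x max_h."""
    out = [[0] * len(grid[0]) for _ in grid]
    for pixels, width, height in components(grid):
        if width <= max_w and height <= max_h:
            for y, x in pixels:
                out[y][x] = 1
    return out

grid = [
    [1, 1, 1, 1, 1, 1, 1, 1],   # a long line: bounding box too wide
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0],   # two small "letters"
    [0, 1, 0, 0, 0, 1, 1, 0],
]
result = keep_small_blobs(grid, max_w=3, max_h=3)
for row in result:
    print(row)
```

The size limits play the same role as the `$4<=50 && $5<=80` test in the awk filter. Letters that touch a line form a single blob with it and would be removed along with the line, so this approach works best where text and lines are separated.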
