Recognizing Visio shapes in an image

Problem description

When delivering SCADA solutions, we often receive end-user specifications as Structured Control Diagrams (Visio-like flow diagrams, as in the example below), usually submitted in PDF format or as images.

To work with these in C#, I was hoping to use one of the OpenCV libraries.

I was looking at template recognition, but training a machine learning algorithm to recognize the already-known, specific shapes of boxes and arrows seems like the wrong fit.

The libraries I've looked at have some polyedge functions. However, as the example below shows, there is a danger that the system will treat the whole thing as one large polygon when there is no spacing between elements.

The annotations may be rotated by any multiple of 90 degrees, and I would like to identify them, as well as the contents of the rectangles, using OCR.

I have no experience in this, which should be apparent by now, so I hope somebody can point me toward the appropriate rabbit hole. If there are multiple approaches, please choose the least math-heavy one.

Update: This is an example of the type of image I'm talking about.

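The problems to solve are:

  • Identification of the red rectangles, with the text in the cells (OCR).
  • Identification of arrows, including direction and end-point annotations. Line type, if possible.
  • Template matching of the components.
  • Fallback to some polyline entity, or something similar, if template matching fails.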

Recommended answer

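I'm sure you realize this is an active research field. The algorithms and methods described in this post are the fundamentals; there may be better or more specific solutions, either completely heuristic or built on these fundamental ones.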

I'll try to describe some methods that I have used before with good results in a similar situation (we worked on simple CAD drawings to find the logical graph of an electrical grid), and I hope they will be useful.

Identification of the red rectangles, with the text in the cells (OCR).

This one is trivial in your case because your documents are high quality, and you can easily adapt any current free OCR engine (e.g. Tesseract) for your purpose. Text rotated by 90, 180, ... degrees should not be a problem: engines like Tesseract can detect it (you need to configure the engine, and in some cases you should extract the detected boundaries and pass them to the OCR engine individually). You may just need some training and fine-tuning to achieve maximum accuracy.
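
If the engine's own orientation handling turns out not to be enough, one fallback is to crop each detected region and try all four 90-degree rotations yourself, keeping the most confident result. The following is only a minimal sketch in plain Python: the `ocr` callable is a hypothetical stand-in for whatever engine call you end up using (assumed to return a text and a confidence), and the crop is modelled as a simple list of pixel rows.

```python
from typing import Callable, List, Tuple

Grid = List[List[int]]  # a cropped region as rows of pixel values


def rotate90_cw(grid: Grid) -> Grid:
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def best_rotation(
    crop: Grid,
    ocr: Callable[[Grid], Tuple[str, float]],  # hypothetical: returns (text, confidence)
) -> Tuple[int, str, float]:
    """Run OCR on the crop rotated by 0, 90, 180 and 270 degrees (clockwise)
    and return (angle, text, confidence) for the most confident attempt."""
    best = (0, "", float("-inf"))
    current = crop
    for angle in (0, 90, 180, 270):
        text, confidence = ocr(current)
        if confidence > best[2]:
            best = (angle, text, confidence)
        current = rotate90_cw(current)
    return best
```

The returned angle tells you how far the crop had to be turned clockwise before the text read correctly, which is also useful later when you attach annotations to lines.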

Template matching of the components.

Most template-matching algorithms are sensitive to scale, and the scale-invariant ones are very complex, so I don't think you will get very accurate results from simple template matching if your documents vary in scale and size.

Also, the features of your shapes are too similar and too sparse to get good results and distinctive features from algorithms such as SIFT and SURF.
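
To make the scale problem concrete, here is a toy illustration in plain Python (not how a real library implements it): a naive pixel-by-pixel template match on binary grids. The function names and the grid representation are my own assumptions for the example.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # binary image as rows of 0/1 values


def match_template(
    image: Grid, template: Grid, max_mismatches: int = 0
) -> Optional[Tuple[int, int]]:
    """Slide the template over the image and return the (row, col) of the
    first window that differs in at most max_mismatches pixels, else None."""
    th, tw = len(template), len(template[0])
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            mismatches = sum(
                image[r + i][c + j] != template[i][j]
                for i in range(th)
                for j in range(tw)
            )
            if mismatches <= max_mismatches:
                return (r, c)
    return None


def scale_up(grid: Grid, factor: int) -> Grid:
    """Nearest-neighbour upscaling by an integer factor."""
    return [
        [value for value in row for _ in range(factor)]
        for row in grid
        for _ in range(factor)
    ]
```

A small hollow box used as the template is found exactly where it was drawn, but the same box drawn at twice the size (`scale_up(box, 2)`) no longer matches anywhere, even though it is the same shape. That is the sensitivity I mean.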

I suggest you use contours. Your shapes are simple, and your components are built by combining these simple shapes. With contours you can find the simple shapes (e.g. rectangles and triangles) and then check them against previously gathered ones for each component shape. For example, if one of your components is made of four rectangles, you can store the relative contours together for it and check them against your documents later, in the detection phase.
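
As a rough sketch of the "hold relative contours together" idea, assume the contour step has already given you bounding rectangles (in OpenCV this would typically come from `findContours` followed by `approxPolyDP` or `boundingRect`). A component can then be stored as the set of its rectangles expressed relative to the group's bounding box, which makes the comparison independent of position and scale. The names `signature` and `matches` and the tolerance value are assumptions for illustration only.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # x, y, width, height


def signature(rects: Sequence[Rect]) -> List[Rect]:
    """Express every rectangle relative to the bounding box of the whole
    group, so the result does not depend on position or scale.
    Assumes the rectangles have positive width and height."""
    min_x = min(x for x, _, _, _ in rects)
    min_y = min(y for _, y, _, _ in rects)
    max_x = max(x + w for x, _, w, _ in rects)
    max_y = max(y + h for _, y, _, h in rects)
    span_x, span_y = max_x - min_x, max_y - min_y
    return sorted(
        ((x - min_x) / span_x, (y - min_y) / span_y, w / span_x, h / span_y)
        for x, y, w, h in rects
    )


def matches(
    candidate: Sequence[Rect], reference: Sequence[Rect], tolerance: float = 0.05
) -> bool:
    """True if both groups have the same number of rectangles and every
    normalized coordinate agrees within the tolerance."""
    if len(candidate) != len(reference):
        return False
    pairs = zip(signature(candidate), signature(reference))
    return all(
        abs(a - b) <= tolerance
        for rect_c, rect_r in pairs
        for a, b in zip(rect_c, rect_r)
    )
```

Pairing the rectangles by sort order is the simplest thing that works on clean drawings; with noisy detections two nearly identical rectangles can swap places, so a real implementation would need a more careful assignment. Because x and y are normalized separately, this also tolerates a component that has been stretched in one direction, which may or may not be what you want.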

There are lots of articles about contour analysis on the net. I suggest you have a look at these; they will give you a clue on how to use contours to detect simple and complex shapes:

http://www.emgu.com/wiki/index.php/Shape_%28Triangle,_Rectangle,_Circle,_Line%29_Detection_in_CSharp

http://www.codeproject.com/Articles/196168/Contour-Analysis-for-Image-Recognition-in-C

http://opencv-code.com/tutorials/detecting-simple-shapes-in-an-image/

http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_imgproc/py_contours/py_contours_begin/py_contours_begin.html

By the way, porting the code to C# using EmguCV is trivial, so don't worry about it.

Identification of arrows, including direction and end-point annotations. Line type, if possible.

There are several methods for finding line segments (e.g. the Hough transform). The main problem in this part is the other components, as their edges are normally detected as lines too. So if we find the components first and remove them from the document, detecting lines becomes a lot easier, with far fewer false detections.
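
To show what the Hough transform actually does, here is a bare-bones voting version in plain Python. In practice you would use the library routine (OpenCV provides `HoughLines` and `HoughLinesP`); this sketch only illustrates the idea, and the function name and parameters are my own.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple


def hough_lines(
    points: Iterable[Tuple[int, int]], min_votes: int, angle_step_deg: int = 1
) -> List[Tuple[int, int, int]]:
    """Every foreground pixel (x, y) votes for all lines through it, written
    as rho = x*cos(theta) + y*sin(theta). Returns (theta_deg, rho, votes)
    for every cell with at least min_votes, strongest first."""
    votes: Counter = Counter()
    for x, y in points:
        for theta_deg in range(0, 180, angle_step_deg):
            theta = math.radians(theta_deg)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(theta_deg, rho)] += 1
    peaks = [(t, r, n) for (t, r), n in votes.items() if n >= min_votes]
    return sorted(peaks, key=lambda peak: -peak[2])
```

A horizontal run of pixels at y = 3 shows up as a peak at theta = 90, rho = 3, and a vertical run at x = 5 as a peak at theta = 0, rho = 5. Neighbouring cells (e.g. theta = 89) collect votes as well, which is why real implementations add peak suppression. It also shows why the box edges of undeleted components get reported as lines: they vote exactly like the connectors do.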

Approach

1- Split the document into layers based on color, and run the following phases on every layer of interest.

2- Detect and extract text using OCR, then remove the text regions and recreate the document without text.

3- Detect components, based on contour analysis and the gathered component database, then remove the detected components (both known and unknown types, as unknown shapes would increase false detections in the next phases) and recreate the document without components. At this point, if detection went well, only lines should remain.

4- Detect lines.

5- At this point you can create a logical graph from the extracted components, lines and tags, based on their detected positions.
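
The last step can be sketched roughly as follows, assuming the earlier phases have produced named component bounding boxes and straight line segments whose start and end already reflect the arrow direction. All names, the data shapes and the `max_gap` threshold are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # x, y, width, height
Point = Tuple[float, float]
Segment = Tuple[Point, Point]  # (start, end), i.e. tail and head of an arrow


def nearest_component(
    point: Point, components: Dict[str, Box], max_gap: float
) -> Optional[str]:
    """Return the name of the component whose box is closest to the point,
    or None if no box lies within max_gap."""
    best_name, best_dist = None, max_gap
    px, py = point
    for name, (x, y, w, h) in components.items():
        # Distance from the point to the box (0 if the point is inside it).
        dx = max(x - px, 0.0, px - (x + w))
        dy = max(y - py, 0.0, py - (y + h))
        dist = (dx * dx + dy * dy) ** 0.5
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name


def build_graph(
    components: Dict[str, Box], segments: List[Segment], max_gap: float = 5.0
) -> List[Tuple[str, str]]:
    """Turn each segment into a directed edge between the components found
    at its two ends; segments with a dangling end are skipped."""
    edges = []
    for start, end in segments:
        source = nearest_component(start, components, max_gap)
        target = nearest_component(end, components, max_gap)
        if source is not None and target is not None and source != target:
            edges.append((source, target))
    return edges
```

Connectors that bend are made of several segments, so in a real document you would first chain segments that share an end point into polylines and only then look up the components at the two outer ends.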

Hope this helps.
