Can OCR software reliably read values from a table?

Problem description

Would OCR Software be able to reliably translate an image such as the following into a list of values?

UPDATE:

In more detail the task is as follows:

We have a client application in which the user can open a report. The report contains a table of values, but not every report looks the same: fonts, spacing and colors differ, and a report may contain several tables with different numbers of rows and columns.

Using the mouse, the user selects an area of the report that contains a table.

Now we want to convert the selected table into values - using our OCR tool.

When the user selects the rectangular area, I can ask for extra information to help the OCR process, and afterwards ask for confirmation that the values have been recognised correctly.

It will initially be an experimental project, so we will most likely use an open-source OCR tool - or at least one that costs nothing for experimental purposes.

Solution

The simple answer is yes; you just need to choose the right tools.

I don't know whether open source can ever get close to 100% accuracy on those images, but based on the other answers here it probably can, if you spend some time on training and solve the table analysis problem and things like that.

When we talk about commercial OCR such as ABBYY, it will give you 99%+ accuracy out of the box and detect tables automatically. No training, no anything, it just works. The drawback is that you have to pay for it. Some would object that with open source you pay with your time to set it up and maintain it, but everyone decides that for themselves.

However, if we talk about commercial tools, there is actually more choice, and it depends on what you want. Boxed products like FineReader are aimed at converting input documents into editable documents such as Word or Excel files. Since you actually want the data, not a Word document, you may need to look into a different product category: Data Capture, which is essentially OCR plus some additional logic to find the necessary data on the page. For an invoice that could be the company name, total amount, due date, line items in the table, etc.

Data Capture is a complicated subject and requires some learning, but used properly it can give guaranteed accuracy when capturing data from documents. It applies rules such as data cross-checks and database lookups, and when necessary it sends data for manual verification. Enterprises widely use Data Capture applications to enter millions of documents every month and rely heavily on the extracted data in their everyday workflow.
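
As a rough illustration of the kind of cross-check rule mentioned above, here is a minimal Python sketch. It assumes an invoice-like table whose line items should add up to a total; the function names `parse_amount` and `cross_check` are made up for this example and do not come from any Data Capture product. Anything that fails a rule would be sent to the operator for manual verification.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(text):
    """Parse an OCR'd amount such as '1,250.00'; return None if it is not a number."""
    try:
        return Decimal(text.replace(",", "").strip())
    except InvalidOperation:
        return None


def cross_check(line_items, total_text):
    """Return a list of problems; an empty list means every rule passed.

    Hypothetical rule set: each cell must be numeric and the line items
    must add up to the recognised total.
    """
    problems = []
    amounts = []
    for row, text in enumerate(line_items, start=1):
        amount = parse_amount(text)
        if amount is None:
            problems.append(f"row {row}: {text!r} is not a number")
        else:
            amounts.append(amount)

    total = parse_amount(total_text)
    if total is None:
        problems.append(f"total: {total_text!r} is not a number")
    elif not problems and sum(amounts) != total:
        # Only compare sums when every cell parsed, otherwise the sum is meaningless.
        problems.append(f"line items sum to {sum(amounts)}, but total reads {total}")
    return problems
```

A typical OCR confusion such as the letter O in place of a zero fails the numeric rule, so `cross_check(["100.00", "5O.00"], "150.00")` reports row 2 instead of silently storing a wrong value.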

And of course there are also OCR SDKs, which give you API access to the recognition results so that you can program what to do with the data.

If you describe your task in more detail, I can advise you on which direction is easier to go.

UPDATE

So what you are building is basically a Data Capture application, though not a fully automated one, using the so-called "click to index" approach. There are a number of applications like that on the market: you scan images, the operator clicks on text in the image (or draws a rectangle around it), and the application populates fields in a database. It is a good approach when the number of images to process is relatively small and the manual workload is not big enough to justify the cost of a fully automated application (yes, there are fully automated systems that can handle images with different fonts, spacing, layouts, numbers of rows in the tables and so on).

If you decide to develop this yourself instead of buying, then all you need is to choose an OCR SDK. You are going to write all of the UI yourself, right? The big choice is open source or commercial.

The best open-source option is Tesseract OCR, as far as I know. It is free, and although it may have real problems with table analysis, that should not be a problem with a manual zoning approach. As for OCR accuracy: people often train the OCR engine on a font to increase accuracy, but that is not an option for you, since the fonts vary. So just try Tesseract and see what accuracy you get - this will determine the amount of manual correction needed.
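
To make the manual zoning idea concrete, here is a hedged Python sketch. It assumes the `pytesseract` wrapper and Pillow are installed alongside Tesseract, and that the selection rectangle arrives as pixel coordinates. The helper names `ocr_words` and `group_into_rows` are invented for this example. It treats every recognised word as one cell, so cells containing several words would need extra merging logic.

```python
def ocr_words(image_path, box):
    """OCR only the user-selected rectangle; box is (left, top, right, bottom) in pixels."""
    # Imported here so the grouping helper below works without these packages.
    from PIL import Image
    import pytesseract

    region = Image.open(image_path).crop(box)
    # --psm 6 asks Tesseract to treat the region as a single uniform block of text.
    data = pytesseract.image_to_data(
        region, config="--psm 6", output_type=pytesseract.Output.DICT
    )
    words = []
    for text, left, top, height in zip(
        data["text"], data["left"], data["top"], data["height"]
    ):
        if text.strip():  # skip the empty entries Tesseract emits for layout levels
            words.append({"text": text, "left": left, "top": top, "height": height})
    return words


def group_into_rows(words, tolerance=0.5):
    """Group word boxes into table rows by vertical position, then sort left to right."""
    rows = []
    for word in sorted(words, key=lambda w: w["top"]):
        center = word["top"] + word["height"] / 2
        last = rows[-1] if rows else None
        # A word joins the current row when its vertical centre is close enough.
        if last and abs(center - last["center"]) <= tolerance * word["height"]:
            last["words"].append(word)
        else:
            rows.append({"center": center, "words": [word]})
    return [
        [w["text"] for w in sorted(row["words"], key=lambda w: w["left"])]
        for row in rows
    ]
```

Calling `group_into_rows(ocr_words("report.png", (40, 120, 600, 380)))` (with a made-up file name and rectangle) would return one list of cell strings per table row, which the UI can then show to the user for confirmation.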

Commercial OCR will give higher accuracy but will cost you money. I think you should take a look anyway to see whether it is worth it or whether Tesseract is good enough for you. The simplest way would be to download a trial version of a boxed OCR product such as FineReader. That will give you a good idea of the accuracy you would get from an OCR SDK.
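
One way to put a number on "good enough" is to type in the correct values for a few sample tables by hand and compare each engine's output against them. The sketch below is only one possible metric, character accuracy based on edit distance, and the function names are made up for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


def character_accuracy(ocr_cells, true_cells):
    """1 minus (total edit distance / number of ground-truth characters)."""
    if len(ocr_cells) != len(true_cells):
        raise ValueError("both lists must contain the same number of cells")
    total = sum(len(t) for t in true_cells)
    if total == 0:
        raise ValueError("ground truth must contain at least one character")
    errors = sum(edit_distance(o, t) for o, t in zip(ocr_cells, true_cells))
    return 1 - errors / total
```

Running this over the same sample tables for Tesseract and for a commercial trial gives two comparable figures, and the gap between them indicates how much manual correction the cheaper option would cost.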
