How do I develop OCR software that recognizes specific forms?


Question


I perform a lot of data entry at work, which is time-consuming and tedious.

I want to make OCR software that can recognize recurring forms that I routinely see, for example a tax return. I want the software to recognize numbers in the fields/areas I specify and return those numbers to an Excel form that I have already created.

I envision the process like this: I scan the document and save it as a PDF (or some similar format); I choose the type of scanned document from a list of templates I have created; the software extracts the numbers from the fields that template specifies; and the extracted numbers are placed into an Excel document in the order I specify for that template.

I have no programming experience beyond creating macros in Excel. I'm self-taught on that, and I'm willing to learn if I can be shown where to read or look. I have no issue making this a 3-6 month learning project that gets me into coding. I have nothing but time on my hands if this will save me hours and hours and make all my coworkers jealous.

Recommended answer

A problem I see with that is that if it doesn't correctly recognize the input, then your data will be wrong.

I recently tried to scan data to notepad and it got confused on the font for "k" and other letters and symbols.
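One way to guard against that kind of misread, at least for numeric fields, might be to check every OCR result before it reaches the spreadsheet and flag anything that does not look like a number for manual review. The sketch below is only an illustration: `parse_numeric_field` and the `CONFUSIONS` table are made-up names, and the list of confusable characters is an assumption, not something taken from a particular OCR engine.

```python
import re
from decimal import Decimal

# Hypothetical table of letters that OCR output may contain in place of digits.
CONFUSIONS = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"})

def parse_numeric_field(raw):
    """Return a Decimal for an OCR'd numeric field, or None if it needs manual review."""
    # Swap the assumed look-alike letters for digits.
    text = raw.strip().translate(CONFUSIONS)
    # Drop thousands separators, a dollar sign, and stray spaces.
    text = re.sub(r"[,$\s]", "", text)
    # Anything that is still not a plain number is rejected rather than guessed.
    if not re.fullmatch(r"-?\d+(\.\d+)?", text):
        return None
    return Decimal(text)
```

A value that comes back as `None` would be left blank or highlighted so a person can type it in. Note that silently swapping letters for digits can hide a genuine misread, so a stricter version might flag those cases too instead of correcting them.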

You could start with something like this to give you an idea of the scale of the project: find and look at some of the current APIs.
http://en.wikipedia.org/wiki/Comparison_of_optical_character_recognition_software
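
To get a feel for the template idea described in the question, here is a minimal sketch of how the pieces could fit together. Everything in it is an assumption made for illustration: the `TEMPLATES` layout, the field names, the pixel coordinates, and the `ocr_region` callback, which stands in for whichever OCR engine or API you end up choosing. It writes a CSV row, which Excel can open, rather than a native Excel file.

```python
import csv
import io

# Hypothetical template: field name -> (left, top, right, bottom) box in pixels.
TEMPLATES = {
    "tax_return": {
        "total_income": (400, 120, 560, 150),
        "tax_due": (400, 300, 560, 330),
    },
}

def extract_fields(page, template_name, ocr_region):
    """Run the caller-supplied OCR function on each box of the chosen template."""
    template = TEMPLATES[template_name]
    return {name: ocr_region(page, box).strip() for name, box in template.items()}

def to_csv_row(fields, column_order):
    """Format the extracted values as one CSV row in the requested column order."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow([fields[name] for name in column_order])
    return buffer.getvalue()
```

The point of keeping `ocr_region` as a parameter is that the template and spreadsheet logic can be written and tested with a fake OCR function first, and the real engine plugged in later.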

