Office Automation Document Scan, Parse, Archive, and Track Application Design
Question
Hi all,
I am looking for ideas on building or using a Windows or web-based application for scanning documents (primarily text), converting them to text in Word/PDF format, archiving them to a "repository", and saving some metadata about the documents in a database.
Basic workflow
A set of documents for a particular account is collected and fed through a scanner. There needs to be a way to identify the beginning and end of each document.
The scanned documents are stored, parsed to some text format, and made ready for conversion to PDF, Word, etc.
The PDF/Word files are "archived" to a folder location in a repository, or even in a database.
The name of the document and the account is tracked in a database.
Thanks for any help, Paul
Recommended answer
Sounds like you are describing the beginnings of COLD (Computer Output to Laser Disc) storage.
OnBase is a product from Hyland that will do it.
Optika used to offer this; they were bought by Stellent Inc., which was in turn acquired by Oracle. So Oracle presumably now has a product that does this, though I can't find it at the moment. If you are looking to do it from scratch, you will need something to do OCR.
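If you do go the from-scratch route, here is a rough sketch of the steps after OCR, not a finished design. It assumes an OCR engine (Tesseract, for example) has already produced one text string per scanned page, and that each document in the stack is preceded by a separator sheet carrying a known marker. The marker text, the function names, the table layout, and the plain-text output format are all made up for illustration; a real system would write PDF or Word files instead.

```python
import sqlite3
from pathlib import Path

# Hypothetical marker printed on a separator sheet placed before each document.
SEPARATOR = "### NEW DOCUMENT ###"


def split_documents(pages):
    """Group per-page OCR text into documents; a separator page starts a new one."""
    documents, current = [], []
    for text in pages:
        if SEPARATOR in text:
            # Close the document collected so far (if any) and drop the separator page.
            if current:
                documents.append(current)
            current = []
        else:
            current.append(text)
    if current:
        documents.append(current)
    return documents


def archive_documents(account, documents, repo_dir, db_path):
    """Write each document under repo_dir/account/ and record its metadata in SQLite."""
    target = Path(repo_dir) / account
    target.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(db_path)
    try:
        # Illustrative schema: one row per archived document.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS documents ("
            "id INTEGER PRIMARY KEY, account TEXT, name TEXT, path TEXT, pages INTEGER)"
        )
        names = []
        for index, doc_pages in enumerate(documents, start=1):
            name = f"{account}-{index:04d}.txt"
            path = target / name
            # Pages are joined with a form feed so page breaks stay recoverable.
            path.write_text("\n\f\n".join(doc_pages), encoding="utf-8")
            conn.execute(
                "INSERT INTO documents (account, name, path, pages) VALUES (?, ?, ?, ?)",
                (account, name, str(path), len(doc_pages)),
            )
            names.append(name)
        conn.commit()
        return names
    finally:
        conn.close()
```

Calling `split_documents` on the page texts for one account and passing the result to `archive_documents` gives you the files in a per-account folder plus one database row per document. Separator sheets are only one way to find document boundaries; barcodes or fixed page counts per document type would slot into the same place.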