How to extract handwritten and printed text from complex images using C#


Question


I am working on a C# project to extract text from real-world scanned images such as magazines, newspapers, prescriptions, and notebooks. I have finished the character recognition part. Now I want to extract each word from the input image, segment each word into characters, and send those characters to my character recognition algorithm. My input images may contain both handwritten and printed text, and they may have complex backgrounds.


My question is: how can I identify the regions of an image where text is present? I don't have any idea how to approach this problem. Please help. Thanks in advance.

Recommended answer

I'll give you the usual disclaimers first:

Language recognition does not work reliably. Even 95% accuracy is considered good, and that is for a specific, previously known language, with either a very select vocabulary to match against or black-and-white printed fonts with good contrast.
Recognizing handwriting is effectively impossible.

Moreover, differentiating between separate words is already part of any existing algorithm. They are given full image scans. Those scans must fulfill certain minimum requirements for machine readability (good contrast, known language, known font color).
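
To illustrate what that word-separation step can look like, here is a minimal C# sketch of one classic technique, a vertical projection profile. It is not taken from any particular OCR engine. It assumes the image has already been binarized and cropped to a single text line, stored as a bool[,] where true means ink. The names ProjectionSegmenter, SplitColumns and minGap are made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ProjectionSegmenter
{
    // Scans a binarized text line column by column and returns the
    // [Start, End) column ranges that contain ink. Runs of ink separated by
    // fewer than minGap blank columns are merged into one span.
    // ink[y, x] == true means the pixel at row y, column x is ink (assumption).
    public static List<(int Start, int End)> SplitColumns(bool[,] ink, int minGap)
    {
        int rows = ink.GetLength(0), cols = ink.GetLength(1);
        var spans = new List<(int Start, int End)>();
        int start = -1;    // first column of the open span, -1 if none is open
        int blankRun = 0;  // blank columns seen since the last ink column

        for (int x = 0; x < cols; x++)
        {
            // Vertical projection: does this column contain any ink at all?
            bool hasInk = false;
            for (int y = 0; y < rows && !hasInk; y++)
                hasInk = ink[y, x];

            if (hasInk)
            {
                if (start < 0) start = x;
                blankRun = 0;
            }
            else if (start >= 0 && ++blankRun >= minGap)
            {
                // The gap is wide enough: close the span after its last ink column.
                spans.Add((start, x - blankRun + 1));
                start = -1;
                blankRun = 0;
            }
        }

        // Close a span that runs to the right edge (minus trailing blanks).
        if (start >= 0) spans.Add((start, cols - blankRun));
        return spans;
    }
}
```

With a small minGap the spans roughly correspond to characters, and with a larger one to words. This only has a chance on clean printed text with a plain background. Touching or slanted handwriting and busy backgrounds will break it.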
And even when everything goes well, a human still has to manually correct all uncertainties.
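
One way to organize that manual step, assuming your OCR solution reports a confidence score per word (check whether yours does): collect everything below a threshold for a human to review. The RecognizedWord type, the ReviewQueue class and the 0-to-1 confidence scale are assumptions for this sketch, not part of any specific library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical result type: one recognized word plus the engine's
// confidence in it, assumed here to be on a 0.0 to 1.0 scale.
public sealed class RecognizedWord
{
    public string Text { get; }
    public double Confidence { get; }

    public RecognizedWord(string text, double confidence)
    {
        Text = text;
        Confidence = confidence;
    }
}

public static class ReviewQueue
{
    // Returns the words whose confidence is below minConfidence,
    // in their original order, so a person can check them by hand.
    public static List<RecognizedWord> NeedsReview(
        IEnumerable<RecognizedWord> words, double minConfidence)
    {
        return words.Where(w => w.Confidence < minConfidence).ToList();
    }
}
```

The threshold is a trade-off you have to tune on your own scans: set it higher and more words go to a human, set it lower and more errors slip through.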

You want to look for a complete OCR solution to use, and you have to accept whatever limitations (supported languages, reliability, image requirements) it imposes on you.
Language identification is the job of scientists, not programmers.
