Does anyone know any API for OCRing 7-Segment Display for Windows Phone?


Problem description

I'm trying to develop a Windows Phone 8.1 app, but I need to recognize numbers shown on various displays.

I was following this example: http://bsubramanyamraju.blogspot.com/2014/08/windowsphone-81-optical-character.html

That is using the Microsoft OCR Runtime Library: https://www.nuget.org/packages/Microsoft.Windows.Ocr/

However, it doesn't work when I try to recognize these kinds of pictures. I also found this site: https://www.unix-ag.uni-kl.de/~auerswal/ssocr/

Does anyone have a recommendation? Or does anyone know of any code related to this?

Thanks for sharing your knowledge.

Solution

I wish the answer to your question were "Sure, here it is" with a link to a black-box, process-anything OCR tool, but there are several aspects involved, which are best considered separately.

First, there is some image pre-processing work to do BEFORE you even consider any OCR. Your image samples differ drastically and cover the full range of issues.

SAMPLE 1 has low contrast, so when it is binarized to a black-and-white layer, which most OCR engines do internally at some stage, there are no characters left to process. It looks like this after binarization:

See this OCR Blog post for additional details on image pre-processing: http://www.ocr-it.com/guide-to-better-mobile-images-from-cell-phone-camera-for-higher-quality-ocr.
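To make the contrast problem concrete, here is a minimal Python sketch. The pixel values, the fixed threshold of 128, and the helper names `binarize` and `stretch_contrast` are illustrative assumptions, not taken from any particular OCR engine; real engines pick their thresholds in their own ways.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 = dark ink, 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def stretch_contrast(gray):
    """Linearly rescale pixel values so the darkest becomes 0 and the brightest 255."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:
        # Completely flat image: nothing to stretch.
        return [[0 for _ in row] for row in gray]
    return [[(px - lo) * 255 // (hi - lo) for px in row] for row in gray]


# Hypothetical low-contrast patch: "segment" pixels at 150, background at 180.
low_contrast = [
    [180, 150, 150, 180],
    [180, 180, 180, 180],
    [180, 150, 150, 180],
]

naive = binarize(low_contrast)                    # every pixel is above the threshold
fixed = binarize(stretch_contrast(low_contrast))  # segments survive after stretching

print(sum(map(sum, naive)), "ink pixels without pre-processing")
print(sum(map(sum, fixed)), "ink pixels after contrast stretching")
```

With a fixed threshold the whole patch ends up as background, which matches the "no characters to process" situation described for SAMPLE 1; stretching the contrast first keeps the segment pixels.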

Secondly, the image has no dpi information in the header, which some OCR technologies use to determine the appropriate scaling of the image. Without header information, some OCR programs fall back to a default dpi, which may or may not match your image, thus affecting the OCR result. This is NOT critical, but it is preferable to set the dpi when the picture is created, if possible.
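As a rough way to check whether a JPEG carries resolution information, the sketch below reads the density fields of a JFIF APP0 header using only the standard library. The function name `jfif_density` and the synthetic `sample` bytes are assumptions made for illustration. It only handles files that begin with a JFIF segment; photos that store their resolution in Exif metadata would need a different parser.

```python
import struct


def jfif_density(data):
    """Return (units, x_density, y_density) from a JFIF APP0 header, or None.

    units: 0 = aspect ratio only (no dpi), 1 = dots per inch, 2 = dots per cm.
    """
    # A JFIF file starts with SOI (FFD8) followed directly by an APP0 (FFE0) segment.
    if len(data) < 18 or data[:4] != b"\xff\xd8\xff\xe0":
        return None
    if data[6:11] != b"JFIF\x00":
        return None
    # Layout after the identifier: version (2 bytes), units (1), Xdensity (2), Ydensity (2).
    units, x_density, y_density = struct.unpack(">BHH", data[13:18])
    return units, x_density, y_density


# Synthetic header for demonstration only: JFIF 1.01, 300 x 300 dots per inch.
sample = (
    b"\xff\xd8\xff\xe0"
    + struct.pack(">H", 16)
    + b"JFIF\x00"
    + b"\x01\x01"
    + struct.pack(">BHH", 1, 300, 300)
    + b"\x00\x00"
)

print(jfif_density(sample))
```

A `units` value of 0 means the header only describes the pixel aspect ratio, which is the "no dpi information" case discussed above.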

SAMPLE 2 has sufficient contrast, and adaptive binarization returns a clear image. It is also missing the dpi resolution value in the header.
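For readers unfamiliar with adaptive binarization, here is a minimal sketch of one common variant, a local-mean threshold. The window radius, the offset, and the pixel values are assumptions chosen for the toy example, and the helper name `adaptive_binarize` is hypothetical; this is not the method of any specific OCR product.

```python
def adaptive_binarize(gray, radius=1, offset=5):
    """Mark a pixel as ink (1) when it is darker than its local mean by more than `offset`."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Collect the neighbourhood, clipped at the image borders.
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [gray[j][i] for j in ys for i in xs]
            local_mean = sum(window) / len(window)
            row.append(1 if gray[y][x] < local_mean - offset else 0)
        out.append(row)
    return out


# Hypothetical scan line with uneven lighting: the background brightens from
# left to right, and two darker "segment" pixels sit at columns 1 and 6.
shaded = [[100, 65, 130, 145, 160, 175, 140, 205]]

global_result = [[1 if px < 128 else 0 for px in row] for row in shaded]
adaptive = adaptive_binarize(shaded)

print(global_result)
print(adaptive)
```

On this toy data the single global threshold marks a background pixel as ink and misses the second segment, while the local threshold picks out exactly the two segment pixels.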

SAMPLE 3 has very clear contrast, but it also has no dpi resolution in the header.

Once you have images that are optimized for OCR processing, the next step is to look at OCR technologies.

I did NOT test the ones you mentioned, assuming you had a correct implementation and still had no success with them. I tested other OCR tools I have used in the past.

In general, there is no 7-segment OCR known to me. However, I was able to adapt a generic OCR engine for this specialized task. Every OCR I tried 'out-of-box', or with default settings, was unable to handle this recognition, which is logical and expected. Why? Because most generic OCR engines are written to recognize an unbroken pixel pattern for each character. This is related to the "character separability" principle used to split words into individual characters. In other words, the internal OCR algorithms look for connected strokes that make up each character. More powerful commercial OCR engines allow some breaks in pixel patterns, but they expect them to be minimal to none, like defects in print or scan that may result in missing character pieces.

A 7-segment display by nature has multiple breaks in each character, which conflicts with the character separability principle.
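The conflict can be illustrated by counting connected components, a step many segmentation pipelines use in some form. The sketch below is an assumption-laden toy: the two ASCII glyphs are hand-drawn, and `count_components` is a hypothetical helper using 8-connectivity, not code from any OCR engine.

```python
def count_components(rows):
    """Count 8-connected groups of '#' pixels in a list of equal-length strings."""
    grid = [[ch == "#" for ch in row] for row in rows]
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if not grid[y][x] or seen[y][x]:
                continue
            # Found an unvisited ink pixel: flood-fill everything connected to it.
            count += 1
            stack = [(y, x)]
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count


# A printed-style "8": all strokes touch each other.
solid = ["######", "#....#", "#....#", "######", "#....#", "#....#", "######"]

# A 7-segment-style "8": every segment is separated from its neighbours.
segmented = ["..##..", "#....#", "#....#", "..##..", "#....#", "#....#", "..##.."]

print(count_components(solid), "component(s) in the printed glyph")
print(count_components(segmented), "component(s) in the 7-segment glyph")
```

An engine that expects roughly one connected pattern per character sees the 7-segment digit as seven unrelated fragments, which is why default settings tend to fail.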

More powerful OCR technologies have a) more tolerance for breaks in pixel patterns and/or b) special settings to handle these cases.

For further testing I used the OCR-IT web-based OCR API platform, which I know well: I worked as a developer on its OCR capabilities, and I also use it extensively in my own iOS and Android apps. The OCR-IT API is based on a strong commercial OCR engine, so it has good tolerance for character imperfections as well as some controls to help in this case.

SAMPLE 3. This is the easiest sample to process, so I tested it first. Using the OCR-IT API with default settings and requesting TXT output, I get the following:

It appears that the OCR is a) segmenting the characters into two separate lines and b) trying to read the resulting patterns as the closest valid characters.

Based on this quick analysis, making one adjustment to the OCR settings results in the following recognition:

The setting that made a substantial difference in the OCR result is switching from the default print type to "DotMatrix", which appears in the middle of this complete OCR-IT API settings XML:

<Job>
  <InputURL>http://i.stack.imgur.com/wOtFx.jpg</InputURL>
  <CleanupSettings>
    <Deskew>false</Deskew>
    <RemoveGarbage>false</RemoveGarbage>
    <RemoveTexture>false</RemoveTexture>
    <RotationType>NoRotation</RotationType>
  </CleanupSettings>
  <OCRSettings>
    <PrintType>DotMatrix</PrintType>
    <OCRLanguage>English</OCRLanguage>
    <SpeedOCR>false</SpeedOCR>
    <AnalysisMode>MixedDocument</AnalysisMode>
    <LookForBarcodes>false</LookForBarcodes>
  </OCRSettings>
  <OutputSettings>
    <ExportFormat>Text</ExportFormat>
  </OutputSettings>
</Job>

Using the DotMatrix print type turns on the algorithms needed to increase tolerance for breaks in character structure, which commonly occur in dot-matrix prints by the nature of dot-matrix printers. Alternatively, a "Typewriter" print type could be used, since character breaks are also expected in typewritten fonts and are therefore handled automatically by the OCR.

One more possible change to the API settings is to run OCR using the "Digits" character set (language), effectively eliminating any possibility of misreading 1 as I, etc.
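If the job settings need to be generated from code, a minimal Python sketch could look like the following. It only builds the XML document, using the element names from the settings shown in this answer; how the job is submitted is not covered here. The helper name `build_job_xml` is hypothetical, and passing "Digits" as the `OCRLanguage` value is an assumption based on the wording above, so check the OCR-IT API documentation for the exact value.

```python
import xml.etree.ElementTree as ET


def build_job_xml(input_url, print_type="DotMatrix", language="English"):
    """Build the job settings XML with a configurable print type and language."""
    job = ET.Element("Job")
    ET.SubElement(job, "InputURL").text = input_url

    cleanup = ET.SubElement(job, "CleanupSettings")
    for name, value in [("Deskew", "false"), ("RemoveGarbage", "false"),
                        ("RemoveTexture", "false"), ("RotationType", "NoRotation")]:
        ET.SubElement(cleanup, name).text = value

    ocr = ET.SubElement(job, "OCRSettings")
    for name, value in [("PrintType", print_type), ("OCRLanguage", language),
                        ("SpeedOCR", "false"), ("AnalysisMode", "MixedDocument"),
                        ("LookForBarcodes", "false")]:
        ET.SubElement(ocr, name).text = value

    output = ET.SubElement(job, "OutputSettings")
    ET.SubElement(output, "ExportFormat").text = "Text"

    return ET.tostring(job, encoding="unicode")


job_xml = build_job_xml("http://i.stack.imgur.com/wOtFx.jpg")
# Assumed value for the digits-only character set mentioned above.
digits_xml = build_job_xml("http://i.stack.imgur.com/wOtFx.jpg", language="Digits")
print(job_xml)
```

Keeping the print type and language as parameters makes it easy to try DotMatrix, Typewriter, and the digits-only setting against the same image.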

SAMPLE 2. In this sample, the gaps in each character's structure are much wider. Even the standard algorithms for handling DotMatrix or Typewriter print types cannot accommodate such wide gaps. Trying all possible setting variations returned something like this:

Character segmentation seems to be the issue. One technical solution goes back to image pre-processing. A simple algorithm can be implemented to fill in the gaps between the segments of each 7-segment character. It does not have to be very precise, something like this:

But that is enough to produce a perfect OCR result.

Since it may not be known in advance which 7-segment LCD displays need their gaps filled and which do not, I recommend applying this algorithm to all 7-segment LCD images, with small or large gaps. I would limit the size of the gap to be filled to no wider than the width of a segment. Given that these screens come in various background and segment colors, this pre-processing algorithm can be substantially simplified if it is performed on a binarized (black-and-white) image.
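One simple way to approximate such gap filling on a binarized image is morphological dilation, which grows every ink pixel outward so that nearby segments merge. This is a sketch under that assumption, not necessarily the algorithm used to produce the result above; the helper name `dilate`, the toy glyph, and the radius are all illustrative.

```python
def dilate(binary, radius=1):
    """Grow every ink pixel (1) into a square of side 2 * radius + 1."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x]:
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = 1
    return out


# Toy binarized 7-segment "8": 1-pixel strokes with 1-pixel gaps between segments.
glyph_rows = [
    "..###..",
    ".......",
    "#.....#",
    "#.....#",
    "#.....#",
    ".......",
    "..###..",
    ".......",
    "#.....#",
    "#.....#",
    "#.....#",
    ".......",
    "..###..",
]
glyph = [[1 if ch == "#" else 0 for ch in row] for row in glyph_rows]

filled = dilate(glyph, radius=1)
for row in filled:
    print("".join("#" if px else "." for px in row))
```

A radius of r bridges gaps of up to 2r background pixels, so keeping r small relative to the segment width follows the limit suggested above and keeps the holes inside the digit open. The strokes also get thicker as a side effect, which is usually acceptable here since the result does not have to be precise.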

Overall, this task is possible with OCR and near out-of-box functionality, provided that some image pre-processing is performed. In general, I believe that any OCR-related project requires image pre-processing tailored to that project anyway.

If you have any further questions about OCR or image pre-processing, pm me.
