Recognizable numbers using PHP

Question

I'm trying to extract some numbers in the range 1-99 from a picture. I've tried several OCR methods using PHP, but eventually my script fails, because the numbers are occasionally rotated about 5% to the left or right. This makes the picture unrecognizable.

I've now installed Ocropus http://code.google.com/p/ocropus/ as a test. Unfortunately, it does not give me the correct numbers every time. This leads me to think that my pictures are not optimized enough.

Does anyone have tips or ideas on how to improve the readability of the numbers? I would also be grateful for ideas on how to locate the numbers in the picture.

Answer

It seems that Tesseract and Ocropus are getting confused by the skew. It could be that multiple skewed numbers on the same line are what confuses them.
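If skew is the problem, one crude workaround is to retry the OCR over a few small rotation angles and keep the first result that looks like a number from 1 to 99. The sketch below assumes PHP with the GD extension, a `tesseract` binary on the PATH, PNG input with dark digits on a white background, and one number per image. The function names, the angle range, and the step are hypothetical choices for illustration, not values from the question. This is a brute-force retry, not a real skew estimation.

```php
<?php
// Sketch only: angle range and step are assumptions, not values from the question.

/** Candidate correction angles in degrees, smallest corrections first. */
function candidateAngles(float $maxAngle, float $step): array
{
    $angles = [];
    for ($a = -$maxAngle; $a <= $maxAngle + 1e-9; $a += $step) {
        $angles[] = round($a, 2);
    }
    usort($angles, fn($x, $y) => abs($x) <=> abs($y));
    return $angles;
}

/**
 * Rotate the image by each candidate angle and run the tesseract CLI on it.
 * Returns the first result in the range 1-99, or null if none is found.
 */
function ocrWithRotation(
    string $imagePath,
    string $tmpPath,
    float $maxAngle = 6.0,
    float $step = 2.0
): ?int {
    $image = imagecreatefrompng($imagePath);
    if ($image === false) {
        throw new RuntimeException("Cannot read $imagePath");
    }
    // Work in truecolor so the fill colour can always be allocated.
    imagepalettetotruecolor($image);
    $white = imagecolorallocate($image, 255, 255, 255);

    foreach (candidateAngles($maxAngle, $step) as $angle) {
        // Corners uncovered by the rotation are filled with white (assumed background).
        $rotated = imagerotate($image, $angle, $white);
        if ($rotated === false) {
            continue;
        }
        imagepng($rotated, $tmpPath);
        // --psm 8 treats the image as a single word; the whitelist restricts output to digits.
        $cmd = 'tesseract ' . escapeshellarg($tmpPath)
            . ' stdout --psm 8 -c tessedit_char_whitelist=0123456789';
        $text = trim((string) shell_exec($cmd));
        if (preg_match('/^[1-9][0-9]?$/', $text)) {
            return (int) $text;
        }
    }
    return null;
}
```

A call might look like `ocrWithRotation('number.png', '/tmp/rotated.png')`, where both paths are placeholders. Note that the `--psm` spelling and the digit whitelist depend on the Tesseract version installed, so check them against your build.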

Are you passing in the whole image as a grid of numbers? Have you tried sending each box (number) to the OCR engine individually, as a separate image? You may find you get better results.
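Sending each box separately could be sketched roughly as follows. This assumes PHP with the GD extension, a `tesseract` binary on the PATH, a PNG input, and a grid whose cells are all the same size. The function names and the grid geometry are hypothetical, since the question does not describe the layout of the picture.

```php
<?php
// Sketch only: an evenly spaced grid of equal-sized cells is an assumption.

/**
 * Compute the crop rectangle of every cell in an evenly spaced grid.
 * Returns a list of ['x' => int, 'y' => int, 'width' => int, 'height' => int].
 */
function gridCellRects(int $imageWidth, int $imageHeight, int $rows, int $cols): array
{
    $cellWidth = intdiv($imageWidth, $cols);
    $cellHeight = intdiv($imageHeight, $rows);
    $rects = [];
    for ($r = 0; $r < $rows; $r++) {
        for ($c = 0; $c < $cols; $c++) {
            $rects[] = [
                'x' => $c * $cellWidth,
                'y' => $r * $cellHeight,
                'width' => $cellWidth,
                'height' => $cellHeight,
            ];
        }
    }
    return $rects;
}

/**
 * Crop each cell with GD, save it as a PNG, and run the tesseract CLI on it.
 * Returns one entry per cell: the recognised number (1-99) or null.
 */
function ocrGridCells(string $imagePath, int $rows, int $cols, string $tmpDir): array
{
    $image = imagecreatefrompng($imagePath);
    if ($image === false) {
        throw new RuntimeException("Cannot read $imagePath");
    }
    $results = [];
    $rects = gridCellRects(imagesx($image), imagesy($image), $rows, $cols);
    foreach ($rects as $i => $rect) {
        $cell = imagecrop($image, $rect);
        if ($cell === false) {
            $results[] = null;
            continue;
        }
        $cellPath = $tmpDir . DIRECTORY_SEPARATOR . "cell_$i.png";
        imagepng($cell, $cellPath);
        // --psm 8 treats the image as a single word; the whitelist restricts output to digits.
        $cmd = 'tesseract ' . escapeshellarg($cellPath)
            . ' stdout --psm 8 -c tessedit_char_whitelist=0123456789';
        $text = trim((string) shell_exec($cmd));
        // Accept only values in the expected 1-99 range.
        $results[] = preg_match('/^[1-9][0-9]?$/', $text) ? (int) $text : null;
    }
    return $results;
}
```

For example, `ocrGridCells('grid.png', 5, 5, sys_get_temp_dir())` would process a 5 by 5 grid, where the file name and grid size are placeholders. Rejecting anything outside 1-99 makes misreads show up as `null` instead of silently wrong numbers.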

Have you tried any other OCR engines? Do you need it to be open source?

I ran the image through a low-cost commercial OCR engine and all the numbers were recognised correctly. So another option is to wrap a commercial OCR engine in a small C# or C++ interface, which can be done fairly quickly, to get better results.
