Google Cloud Vision - Numbers and Numerals OCR


Question

I've been trying to implement an OCR program in Python that reads numbers in a specific format, XXX-XXX. I used text recognition from Google's Cloud Vision API, but the results were unreliable. Out of 30 high-contrast 1280 x 1024 BMP images, only a handful produced the correct output, or at least included the correct output somewhere in the results. The program tends to omit some digits, return text in non-English languages, or sneak in a few special characters.

The goal is to get at least the correct digits in sequence; it doesn't matter if the results are sprinkled with other junk. Is there a way to help the program recognize numbers better, for example by limiting the results to a specific format, or to digits only?

Recommended answer

At the moment it is not possible to add constraints to Vision API requests or to give them a specific expected number format, as mentioned here (by the Project Manager of Cloud Vision API).
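Since the request itself cannot carry a format constraint, one workaround is to filter the result on the client side. The following is a minimal sketch, assuming the raw OCR text is already available as a string. The helper name `extract_codes` and the sample string are hypothetical; the pattern simply encodes the XXX-XXX format from the question.

```python
import re
from typing import List

# Three digits, a hyphen, three digits, not embedded in a longer run of digits.
CODE_PATTERN = re.compile(r"(?<!\d)\d{3}-\d{3}(?!\d)")


def extract_codes(ocr_text: str) -> List[str]:
    """Return every XXX-XXX match found in the OCR output, in order of appearance."""
    return CODE_PATTERN.findall(ocr_text)


if __name__ == "__main__":
    # Hypothetical noisy output: junk characters around one valid code.
    noisy = "№ 123-456 ~~ ref 12-3456"
    print(extract_codes(noisy))
```

This tolerates junk around the number, but it cannot recover digits that the OCR dropped entirely.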

You can also check all the possible request parameters (in the API reference); none of them lets you specify a number format. Currently the only options are:

  • latLongRect: specifies the geographic location (a latitude/longitude bounding box) of the image
  • languageHints: indicates the expected language for text_detection (list of supported languages here)
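To illustrate where `languageHints` sits, here is a sketch of a request body for the REST `images:annotate` method, built with the standard library only. The helper name `build_request` and the placeholder image bytes are hypothetical, and nothing is sent over the network. Passing `"en"` is an assumption about what might discourage non-English output; it is a hint to the API, not a guarantee.

```python
import base64
import json
from typing import List


def build_request(image_bytes: bytes, hints: List[str]) -> dict:
    """Build an images:annotate body asking for TEXT_DETECTION with language hints."""
    return {
        "requests": [
            {
                # Image content is sent as a base64-encoded string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
                "imageContext": {"languageHints": hints},
            }
        ]
    }


if __name__ == "__main__":
    body = build_request(b"placeholder, not a real image", ["en"])
    print(json.dumps(body, indent=2))
```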

I assume you have already looked at the multiple annotations in the response (each covering a different region of the image) to see whether you could reconstruct the text from the positions of the individual digits?
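Reconstructing the text from positions could look roughly like this. The sketch assumes a `textAnnotations` list from the response, in which the first entry holds the full detected text and the remaining entries are individual fragments, each with a `boundingPoly`. The function names and the sample data are made up for illustration, and the approach only handles a single horizontal line of text.

```python
def left_edge(annotation: dict) -> int:
    """Smallest x coordinate of the fragment's bounding polygon."""
    # A vertex may lack a coordinate (for example when it is 0), hence the default.
    return min(v.get("x", 0) for v in annotation["boundingPoly"]["vertices"])


def join_left_to_right(text_annotations: list) -> str:
    """Concatenate fragments by horizontal position, keeping ASCII digits and hyphens."""
    fragments = sorted(text_annotations[1:], key=left_edge)  # skip the full-text entry
    joined = "".join(a["description"] for a in fragments)
    return "".join(ch for ch in joined if ch in "0123456789-")


if __name__ == "__main__":
    def box(x0: int, x1: int) -> dict:
        # Hypothetical axis-aligned box on a single text line.
        return {"vertices": [{"x": x0, "y": 10}, {"x": x1, "y": 10},
                             {"x": x1, "y": 40}, {"x": x0, "y": 40}]}

    sample = [
        {"description": "456 123 -", "boundingPoly": box(10, 170)},  # full text
        {"description": "456", "boundingPoly": box(120, 170)},
        {"description": "123", "boundingPoly": box(10, 60)},
        {"description": "-", "boundingPoly": box(70, 100)},
    ]
    print(join_left_to_right(sample))
```

Sorting by the left edge is a deliberate simplification; images with several lines of text would also need grouping by the y coordinate.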

Note that the Vision API and text_detection are not optimized for your data specifically. If you have a lot of annotated data, another option is to build your own model using Tensorflow. This blogpost explains a system for detecting number plates (which have a specific number format). All the code is available on Github, and the problem seems closely related to yours.
