Changing image DPI for use with Tesseract


Question

I am working on a project to recognize text on business cards and map it to the appropriate fields. I am using OpenCV for image processing, and I need to feed the preprocessed image to the Tesseract OCR engine for text recognition. This link (https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality#rescaling) states that images should have a DPI of at least 300. My image is 2560x1536 pixels at 72 DPI.

  • How can I increase the DPI to 300?
  • It is also said that resizing the image is beneficial. How should I resize my image to get good OCR results?
  • "Tesseract works best on images which have a DPI of at least 300 dpi, so it may be beneficial to resize images." What does "so" imply here? What is the relation between resizing an image and DPI?

Recommended answer

For OCR, what really matters is the resolution in pixels, because the physical characters can range from tiny to huge, independently of the DPI of the acquisition device.
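To make the relation between pixel size and DPI concrete, here is a minimal arithmetic sketch. The DPI value stored with an image only says how large the pixels would be printed; the density that matters for OCR is how many pixels cover each inch of the real object. The function names are illustrative, and the 3.5 inch card width (a common US business card size) spanning the full image width is an assumption, not something stated in the question.

```python
def physical_size_inches(width_px, height_px, dpi):
    """Size the image would have on paper if printed at the given DPI."""
    return width_px / dpi, height_px / dpi


def effective_dpi(extent_px, extent_inches):
    """Pixels actually covering each inch of the photographed object."""
    return extent_px / extent_inches


def pixels_needed(extent_inches, target_dpi):
    """Pixel extent required to sample an object at target_dpi."""
    return round(extent_inches * target_dpi)


width_px, height_px = 2560, 1536  # pixel size given in the question

# The same pixels, labelled with two different DPI values:
# only the implied print size changes, not the image data.
for dpi in (72, 300):
    w_in, h_in = physical_size_inches(width_px, height_px, dpi)
    print(f"{dpi} DPI tag -> {w_in:.2f} x {h_in:.2f} in")

# Assumption: the card is 3.5 in wide and spans the full image width.
card_width_in = 3.5
print(f"effective density: {effective_dpi(width_px, card_width_in):.0f} DPI")
print(f"pixels for 300 DPI: {pixels_needed(card_width_in, 300)}")
```

Under that assumption the image already samples the card far more densely than 300 pixels per inch, whatever the 72 DPI tag says, so changing the tag alone would not change what the OCR engine sees.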

As a rule of thumb, a stroke width of around 3 pixels is a good start. If it is lower, resizing might not help because the information is missing. If it is much higher, the running time might be excessive (or the OCR function might not be tailored to deal with it).
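The rule of thumb can be turned into a resize factor. The sketch below is one crude way to do it, not the answerer's method: it assumes the image has already been binarized into rows of 0 (background) and 1 (ink), estimates stroke width as the median length of horizontal ink runs, and divides the 3 pixel target by that estimate. All function names are hypothetical.

```python
from statistics import median


def horizontal_run_lengths(binary_rows):
    """Lengths of consecutive ink pixels (truthy values) in every row."""
    runs = []
    for row in binary_rows:
        current = 0
        for pixel in row:
            if pixel:
                current += 1
            elif current:
                runs.append(current)
                current = 0
        if current:  # run that reaches the end of the row
            runs.append(current)
    return runs


def estimate_stroke_width(binary_rows):
    """Crude estimate: median horizontal run length of ink pixels."""
    runs = horizontal_run_lengths(binary_rows)
    if not runs:
        raise ValueError("no ink pixels found")
    return median(runs)


def suggested_scale(stroke_px, target_px=3.0):
    """Resize factor that would bring the stroke width to target_px."""
    return target_px / stroke_px


# Toy input: two vertical strokes, each 6 pixels wide, 10 rows tall.
row = [0, 0] + [1] * 6 + [0] * 4 + [1] * 6 + [0, 0]
image = [row[:] for _ in range(10)]

stroke = estimate_stroke_width(image)
scale = suggested_scale(stroke)
print(f"stroke ~ {stroke} px, suggested scale {scale:.2f}")
```

A factor below 1 means downscaling, which mainly saves running time. A factor above 1 means the strokes are thinner than the target, and as noted above, upscaling cannot restore detail that was never captured. With OpenCV, a factor like this would typically be applied through `cv2.resize`.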

Also check that the package will not attempt to resize the image internally, based on its own assumption about stroke width and the DPI info stored in the header, if there is a mismatch.
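One way to see what DPI info a header actually carries is to read it directly. The sketch below handles PNG only, where the density lives in the optional `pHYs` chunk as pixels per metre; other formats such as JPEG and TIFF store it differently. The function name is hypothetical, and the code does not verify chunk CRCs.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
METRES_PER_INCH = 0.0254


def read_png_dpi(data):
    """Return (x_dpi, y_dpi) from a PNG's pHYs chunk, or None if absent."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"pHYs" and len(body) == 9:
            x_ppu, y_ppu, unit = struct.unpack(">IIB", body)
            if unit != 1:  # 1 = pixels per metre; 0 = aspect ratio only
                return None
            return x_ppu * METRES_PER_INCH, y_ppu * METRES_PER_INCH
        if chunk_type in (b"IDAT", b"IEND"):
            return None  # pHYs, if present, comes before the image data
        pos += 8 + length + 4
    return None
```

For a file on disk, `read_png_dpi(open("card.png", "rb").read())` would return the stored density (`card.png` is a placeholder name). A result of `None` means the file carries no usable DPI value, in which case any DPI-based decision the OCR package makes has to rest on its own defaults or estimates.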
