Extracting lines from an image to feed to OCR - Tesseract

Views: 137 | Published: 2018/7/30 17:18:50 | Tags: opencv, image-processing, tesseract

Problem description

I was watching this talk from PyCon: http://youtu.be/B1d9dpqBDVA?t=15m34s. Around the 15:33 mark, the speaker talks about extracting lines from an image (a receipt) and then feeding them to the OCR engine so that the text can be extracted more reliably.

I have a similar need where I'm passing images to the OCR engine. However, I don't quite understand what he means by extracting lines from an image. What are some open source tools that I can use to extract lines from an image?

Solution

Take a look at the technique used to detect the skew angle of a text.

Groups of lines are used to isolate text on an image (this is the interesting part).

From this result you can easily detect the upper and lower limits of each line of text. The text itself will be located between them. I've faced a similar problem before, and the code from it might be useful to you.
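One way to find those upper and lower limits is a horizontal projection profile: count the ink pixels in every row and treat each run of non-empty rows as one line of text. The sketch below is an illustration only, not the answerer's own code. The function name `find_text_line_bands` and the parameters `min_ink` and `min_height` are hypothetical, and the input is assumed to be an already deskewed, binarized image given as rows of 0/1 pixels (a nested list, or a thresholded NumPy array).

```python
def find_text_line_bands(binary_rows, min_ink=1, min_height=2):
    """Return (top, bottom) row ranges, bottom exclusive, one per text line.

    binary_rows: sequence of pixel rows where a truthy value means "ink".
    min_ink:     minimum ink pixels for a row to count as part of a line.
    min_height:  bands shorter than this are discarded as noise.
    """
    bands = []
    start = None  # row index where the current band began, if any
    for y, row in enumerate(binary_rows):
        has_ink = sum(1 for px in row if px) >= min_ink
        if has_ink and start is None:
            start = y  # entering a band of text
        elif not has_ink and start is not None:
            if y - start >= min_height:
                bands.append((start, y))  # leaving a band of text
            start = None
    # Close a band that runs to the bottom edge of the image.
    if start is not None and len(binary_rows) - start >= min_height:
        bands.append((start, len(binary_rows)))
    return bands


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 1, 0, 1],
        [0, 1, 1, 1],
    ]
    print(find_text_line_bands(page))
```

This only works well when the text is close to horizontal, which is why correcting the skew angle first matters.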

All you need to do from here is crop each pair of lines and feed that as an image to Tesseract.
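The cropping step could look roughly like the following. This is a minimal sketch, assuming the `(top, bottom)` row limits of each text line are already known and that Pillow and `pytesseract` are installed. The helper names `line_crop_boxes` and `ocr_lines` and the `pad` parameter are hypothetical. `--psm 7` asks Tesseract to treat each crop as a single line of text.

```python
def line_crop_boxes(width, height, bands, pad=2):
    """Turn (top, bottom) row limits into Pillow-style crop boxes.

    Each box is (left, upper, right, lower) and spans the full image width.
    A few pixels of padding are added and clamped to the image bounds.
    """
    boxes = []
    for top, bottom in bands:
        upper = max(0, top - pad)
        lower = min(height, bottom + pad)
        boxes.append((0, upper, width, lower))
    return boxes


def ocr_lines(image_path, bands, pad=2):
    """Crop each text line out of the image and run Tesseract on it."""
    # Imported here so the box arithmetic above can be used without them.
    from PIL import Image
    import pytesseract

    texts = []
    with Image.open(image_path) as image:
        width, height = image.size
        for box in line_crop_boxes(width, height, bands, pad):
            line_image = image.crop(box)
            text = pytesseract.image_to_string(line_image, config="--psm 7")
            texts.append(text.strip())
    return texts
```

A call such as `ocr_lines("receipt.png", [(12, 40), (55, 83)])` would return one string per line; the file name and row limits here are placeholders.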
