Image processing / super light OCR

Views: 148 Published: 2018/7/30 16:21:12 image-processing ocr


Problem description

I have 55,000 image files (in both JPG and TIFF format), which are pictures of pages from a book.

Each page has the following structure:

some text

--- (horizontal line) ---

a number

some text

--- (horizontal line) ---

another number

some text

There can be from zero to four horizontal lines on any given page.

I need to find the number that appears just below each horizontal line.

But the numbers strictly follow each other, starting at one on page one, so I don't need to read a number in order to find it: I could just detect the presence of horizontal lines, which should be both easier and safer than trying to OCR the page to detect the numbers.

The algorithm would basically be:

for each image
  count horizontal lines
  print image name, number of horizontal lines
  next image
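
Once the loop above has produced a line count for every image, the numbers themselves follow from a running total. The sketch below is only an illustration of that bookkeeping, assuming that exactly one number follows each horizontal line and that numbering starts at 1 on the first page; the function name `numbers_per_page` and the sample counts are hypothetical.

```python
from itertools import accumulate


def numbers_per_page(line_counts):
    """Return, for each page, the list of numbers printed on it.

    Assumption: one number follows each horizontal line, and the numbers
    run 1, 2, 3, ... across the whole book in page order.
    """
    # Running total of lines seen up to and including each page.
    totals = list(accumulate(line_counts))
    result = []
    for count, total in zip(line_counts, totals):
        first = total - count + 1  # first number on this page
        result.append(list(range(first, total + 1)))
    return result


# Hypothetical counts for four pages with 2, 0, 1 and 3 lines.
print(numbers_per_page([2, 0, 1, 3]))
# [[1, 2], [], [3], [4, 5, 6]]
```

A single miscounted page shifts every later number, so it may be worth checking the final total against the last number known to be in the book.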

The question is: what would be the best image library/language to do the "count horizontal lines" part?
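
Whichever library ends up loading the files, the "count horizontal lines" step can be approached without OCR by looking at how dark each pixel row is. The following is a minimal library-independent sketch, not a recommendation: it assumes the page has already been decoded into rows of grayscale values (0 = black, 255 = white), that a rule spans most of the page width while a row of text covers much less of it, and that the scan is not noticeably skewed. The name `count_horizontal_lines` and both thresholds are hypothetical and would need tuning on real pages.

```python
def count_horizontal_lines(pixels, dark_threshold=128, min_fraction=0.5):
    """Count horizontal rules in a grayscale page.

    pixels: list of rows, each a list of gray values from 0 to 255.
    A row belongs to a rule when at least min_fraction of its pixels are
    darker than dark_threshold; consecutive such rows count as one rule.
    """
    lines = 0
    in_line = False
    for row in pixels:
        if not row:
            in_line = False
            continue
        dark = sum(1 for value in row if value < dark_threshold)
        is_rule_row = dark / len(row) >= min_fraction
        if is_rule_row and not in_line:
            lines += 1  # a new rule starts on this row
        in_line = is_rule_row
    return lines


# Tiny synthetic page, 10 pixels wide (illustrative data only).
WHITE = [255] * 10
TEXT = [0, 255, 0, 255, 255, 255, 0, 255, 255, 255]  # 30% dark
RULE = [255] + [0] * 8 + [255]                       # 80% dark
page = [WHITE, TEXT, WHITE, RULE, RULE, WHITE, TEXT, WHITE, RULE, WHITE]
print(count_horizontal_lines(page))
# 2
```

The two adjacent `RULE` rows are merged into a single line, which is how a rule that is several pixels thick in a scan would be counted once.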
