Extract lined table from scanned document opencv python

Views: 353 Published: 2020/5/20 20:30:09 Tags: python opencv hough-transform opencv-python

Problem description

I want to extract the information from a scanned table and store it as a CSV. Right now, my table extraction algorithm performs the following steps.

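Apply skew correction.
Apply a Gaussian filter for denoising.
Binarize using Otsu thresholding.
Do a morphological opening.
Run Canny edge detection.
Do a Hough transform to obtain the table lines.
Remove duplicate lines (the same line detected within a 10-pixel range).
Filter horizontal and vertical lines using the slope of the line (horizontal and vertical lines should deviate by no more than +/- 5 degrees).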
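
The last two steps (duplicate removal and slope filtering) can be sketched in plain Python. This is only an illustration under stated assumptions, not my actual code: it assumes the Hough step returns segments as `(x1, y1, x2, y2)` tuples, and the names `classify_segments` and `dedupe` are hypothetical. It also treats two segments as duplicates based only on their position across the line direction, ignoring where they sit along it.

```python
import math


def classify_segments(segments, max_dev_deg=5.0):
    """Split (x1, y1, x2, y2) segments into near-horizontal and near-vertical."""
    horizontal, vertical = [], []
    for x1, y1, x2, y2 in segments:
        # Fold the angle into [0, 180) so the segment direction does not matter.
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
        if angle <= max_dev_deg or angle >= 180.0 - max_dev_deg:
            horizontal.append((x1, y1, x2, y2))
        elif abs(angle - 90.0) <= max_dev_deg:
            vertical.append((x1, y1, x2, y2))
        # Anything else is treated as noise and dropped.
    return horizontal, vertical


def dedupe(segments, axis, min_gap=10):
    """Keep one segment per group of segments within min_gap pixels.

    axis=1 compares the mean y (horizontal lines),
    axis=0 compares the mean x (vertical lines).
    """
    def position(seg):
        return (seg[axis] + seg[axis + 2]) / 2.0

    kept = []
    for seg in sorted(segments, key=position):
        # Keep a segment only if it is farther than min_gap from the last kept one.
        if not kept or position(seg) - position(kept[-1]) > min_gap:
            kept.append(seg)
    return kept
```

For example, `horizontal, vertical = classify_segments(segments)` followed by `dedupe(horizontal, axis=1)` and `dedupe(vertical, axis=0)` gives one representative per table line, provided the detected segments are reasonably clean.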

This algorithm works fine for born-digital PDFs and most scanned documents. However, some documents contain a noisy table, so the lines are not identified correctly.

Here is a sample image on which my algorithm fails.

These are the operations I am performing on this table.

1.Gaussian blur

2.Otsu thresholding

3.Morphological opening

4.Canny edge detection

5.Filtered lines. As you can see, the lines are clearly not identified correctly.

Can anyone suggest a better method for extracting horizontal and vertical lines from low-quality scans like this?

Thanks in advance!
