How to find table like structure in image

Problem description

I have different types of invoice files, and I want to find the table in each invoice file. The table position is not constant, so I went for image processing. First I convert the invoice into an image, then I find contours based on the table borders, and finally I can capture the table position. I used the code below for this task.

import cv2
import numpy as np
from wand.image import Image

with Image(page) as page_image:
    page_image.alpha_channel = False  # eliminate transparency
    img_buffer = np.asarray(bytearray(page_image.make_blob()), dtype=np.uint8)
    img = cv2.imdecode(img_buffer, cv2.IMREAD_UNCHANGED)

    # findContours expects an 8-bit single-channel image, so convert to grayscale first
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    ret, thresh = cv2.threshold(gray, 127, 255, 0)
    im2, contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    margin = []
    for contour in contours:
        # get the rectangle bounding the contour
        [x, y, w, h] = cv2.boundingRect(contour)
        # skip small false positives that aren't the table border
        if w > thresh1 and h > thresh2:
            margin.append([x, y, x + w, y + h])
    # data cleanup on margin to extract the required position values

In this code, thresh1 and thresh2 are values I update based on the file.
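As an illustration only (an assumption on my side, not something stated in the question), such per-file thresholds could also be derived from the page dimensions instead of being hard-coded:

# Sketch of an assumed heuristic: a table candidate is expected to span a
# large part of the page width and a smaller fraction of its height.
img_h, img_w = img.shape[:2]
thresh1 = int(img_w * 0.30)   # minimum width of a table candidate
thresh2 = int(img_h * 0.05)   # minimum height of a table candidate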

So, using this code I can successfully read the positions of tables in images, and I then use these positions on my invoice PDF file. For example:

Sample 1:

Sample 2:

Sample 3:

Output:

Sample 1:

Sample 2:

Sample 3:

But now I have a new format which doesn't have any borders, yet it is still a table. How can I solve this? My entire approach depends only on the table borders, but now there are no borders at all. I have no idea how to get around this problem. My question is: is there any way to find the table's position based on the table structure itself?

For example, my problem input looks like below:

I would like to find its position like below:

How can I solve this? Any idea that helps me solve this problem would be really appreciated.

Thanks.

Recommended answer

Vaibhav is right. You can experiment with different morphological transforms to extract or group pixels into different shapes, lines, etc. For example, the approach could be the following:

  1. Start with dilation to convert the text into solid spots.
  2. Then apply the findContours function to find the text bounding boxes.
  3. Once you have the text bounding boxes, apply a heuristic algorithm to cluster the boxes into groups by their coordinates. This way you find groups of text areas aligned into rows and columns.
  4. Then sort by the x and y coordinates and/or run some analysis on the groups to check whether the grouped text boxes can form a table.

I wrote a small sample illustrating the idea. I hope the code is self-explanatory; I've put some comments in there too.

import os
import cv2
import imutils

# This only works if there's only one table on a page
# Important parameters:
#  - morph_size
#  - min_text_height_limit
#  - max_text_height_limit
#  - cell_threshold
#  - min_columns


def pre_process_image(img, save_in_file, morph_size=(8, 8)):

    # get rid of the color
    pre = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Otsu threshold
    pre = cv2.threshold(pre, 250, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    # dilate the text to make it solid spot
    cpy = pre.copy()
    struct = cv2.getStructuringElement(cv2.MORPH_RECT, morph_size)
    cpy = cv2.dilate(~cpy, struct, anchor=(-1, -1), iterations=1)
    pre = ~cpy

    if save_in_file is not None:
        cv2.imwrite(save_in_file, pre)
    return pre


def find_text_boxes(pre, min_text_height_limit=6, max_text_height_limit=40):
    # Looking for the text spots contours
    # OpenCV 3
    # img, contours, hierarchy = cv2.findContours(pre, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    # OpenCV 4
    contours, hierarchy = cv2.findContours(pre, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

    # Getting the texts bounding boxes based on the text size assumptions
    boxes = []
    for contour in contours:
        box = cv2.boundingRect(contour)
        h = box[3]

        if min_text_height_limit < h < max_text_height_limit:
            boxes.append(box)

    return boxes


def find_table_in_boxes(boxes, cell_threshold=10, min_columns=2):
    rows = {}
    cols = {}

    # Clustering the bounding boxes by their positions
    for box in boxes:
        (x, y, w, h) = box
        col_key = x // cell_threshold
        row_key = y // cell_threshold
        cols[col_key] = [box] if col_key not in cols else cols[col_key] + [box]
        rows[row_key] = [box] if row_key not in rows else rows[row_key] + [box]

    # Filtering out the clusters having fewer than min_columns boxes
    table_cells = list(filter(lambda r: len(r) >= min_columns, rows.values()))
    # Sorting the row cells by x coord
    table_cells = [list(sorted(tb)) for tb in table_cells]
    # Sorting rows by the y coord
    table_cells = list(sorted(table_cells, key=lambda r: r[0][1]))

    return table_cells


def build_lines(table_cells):
    if table_cells is None or len(table_cells) <= 0:
        return [], []

    max_last_col_width_row = max(table_cells, key=lambda b: b[-1][2])
    max_x = max_last_col_width_row[-1][0] + max_last_col_width_row[-1][2]

    max_last_row_height_box = max(table_cells[-1], key=lambda b: b[3])
    max_y = max_last_row_height_box[1] + max_last_row_height_box[3]

    hor_lines = []
    ver_lines = []

    for box in table_cells:
        x = box[0][0]
        y = box[0][1]
        hor_lines.append((x, y, max_x, y))

    for box in table_cells[0]:
        x = box[0]
        y = box[1]
        ver_lines.append((x, y, x, max_y))

    (x, y, w, h) = table_cells[0][-1]
    ver_lines.append((max_x, y, max_x, max_y))
    (x, y, w, h) = table_cells[0][0]
    hor_lines.append((x, max_y, max_x, max_y))

    return hor_lines, ver_lines


if __name__ == "__main__":
    in_file = os.path.join("data", "page.jpg")
    pre_file = os.path.join("data", "pre.png")
    out_file = os.path.join("data", "out.png")

    img = cv2.imread(in_file)

    pre_processed = pre_process_image(img, pre_file)
    text_boxes = find_text_boxes(pre_processed)
    cells = find_table_in_boxes(text_boxes)
    hor_lines, ver_lines = build_lines(cells)

    # Visualize the result
    vis = img.copy()

    # for box in text_boxes:
    #     (x, y, w, h) = box
    #     cv2.rectangle(vis, (x, y), (x + w - 2, y + h - 2), (0, 255, 0), 1)

    for line in hor_lines:
        [x1, y1, x2, y2] = line
        cv2.line(vis, (x1, y1), (x2, y2), (0, 0, 255), 1)

    for line in ver_lines:
        [x1, y1, x2, y2] = line
        cv2.line(vis, (x1, y1), (x2, y2), (0, 0, 255), 1)

    cv2.imwrite(out_file, vis)

I get the following output:

Of course, to make the algorithm more robust and applicable to a variety of different input images, it has to be adjusted accordingly.
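One possible direction (a sketch based on my own assumption, not part of the original answer) is to scale the tuning parameters with the page resolution, so that the same thresholds work for scans of different sizes:

# Sketch of an assumed heuristic: scale the key parameters with the image
# height, taking a 300-DPI A4 page (roughly 3300 px tall) as the reference.
def scaled_params(img, ref_height=3300):
    scale = img.shape[0] / float(ref_height)
    return {
        "morph_size": (max(2, int(8 * scale)), max(2, int(8 * scale))),
        "min_text_height_limit": max(2, int(6 * scale)),
        "max_text_height_limit": max(10, int(40 * scale)),
        "cell_threshold": max(4, int(10 * scale)),
    }

# Hypothetical usage:
# params = scaled_params(img)
# pre_processed = pre_process_image(img, pre_file, morph_size=params["morph_size"])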

Update: updated the code with respect to the OpenCV API changes for findContours. If you have an older version of OpenCV installed, use the corresponding call. Related post.
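If the same script has to run under both OpenCV 3 and OpenCV 4, one option (not part of the original answer, though the sample already imports imutils) is imutils.grab_contours, which returns the contour list regardless of which tuple layout cv2.findContours produced:

import cv2
import imutils

def find_contours_any_version(binary_img):
    # grab_contours picks the right element of the findContours return value
    # for both OpenCV 3 (image, contours, hierarchy) and OpenCV 4 (contours, hierarchy)
    cnts = cv2.findContours(binary_img, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    return imutils.grab_contours(cnts)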
