处理表的图像以从中获取数据 [英] Processing an image of a table to get data from it

查看:152
本文介绍了处理表的图像以从中获取数据的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一张桌子的图像(如下所示)。我正在尝试从表中获取数据,类似于此表单(表格图像的第一行):

I have this image of a table (seen below). And I'm trying to get the data from the table, similar to this form (first row of table image):

rows[0] = [x,x, , , , ,x, ,x,x, ,x, ,x, , , , ,x, , , ,x,x,x, ,x, ,x, , , , ]

我需要x的数量以及空格的数量。
还会有其他表格图像与此图像相似(都有x和相同的列数)。

I need the number of x's as well as the number of spaces. There will also be other table images that are similar to this one (all having x's and the same number of columns).

到目前为止,我能够检测到所有的x使用x的图像。而且我可以在一定程度上检测到线条我正在使用open cv2 for python。我也使用houghTransform来检测水平线和垂直线(效果非常好)。

So far, I am able to detect all of the x's using an image of an x. And I can somewhat detect the lines. I'm using open cv2 for python. I'm also using a houghTransform to detect the horizontal and vertical lines (that works really well).

我正在试图弄清楚如何逐行并将信息存储在列表中。

I'm trying to figure out how I can go row by row and store the information in a list.

这些是训练图像:
用于检测x(代码中的train1.png)
< img src =https://i.stack.imgur.com/WwpUb.pngalt =在此处输入图像说明>

These are the training images: used to detect x (train1.png in the code)

用于检测线路(train2代码中的.png)

used to detect lines (train2.png in the code)

用于检测行(代码中的train3.png)

used to detect lines (train3.png in the code)

这是我到目前为止的代码:

This is the code I have so far:

# process images
from pytesser import *
from PIL import Image
from matplotlib import pyplot as plt
import pytesseract
import numpy as np
import cv2
import math
import os

# the table images
images = ['table1.png', 'table2.png', 'table3.png', 'table4.png', 'table5.png']

# the template images used for training
templates = ['train1.png', 'train2.png', 'train3.png']

def hough_transform(im):
    img = cv2.imread('imgs/'+im)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150, apertureSize=3)

    lines = cv2.HoughLines(edges, 1, np.pi/180, 200)

    i = 1
    for rho, theta in lines[0]:
        a = np.cos(theta)
        b = np.sin(theta)
        x0 = a*rho
        y0 = b*rho
        x1 = int(x0 + 1000*(-b))
        y1 = int(y0 + 1000*(a))
        x2 = int(x0 - 1000*(-b))
        y2 = int(y0 - 1000*(a))

        #print '%s - 0:(%s,%s) 1:(%s,%s), 2:(%s,%s)' % (i,x0,y0,x1,y1,x2,y2)

        cv2.line(img, (x1,y1), (x2,y2), (0,0,255), 2)
        i += 1

    fn = os.path.splitext(im)[0]+'-lines'
    cv2.imwrite('imgs/'+fn+'.png', img)


def match_exes(im, te):
    img_rgb = cv2.imread('imgs/'+im)
    img_gry = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
    template = cv2.imread('imgs/'+te, 0)
    w, h = template.shape[::-1]

    res = cv2.matchTemplate(img_gry, template, cv2.TM_CCOEFF_NORMED)
    threshold = 0.71
    loc = np.where(res >= threshold)

    pts = []
    exes = []
    blanks = []
    for pt in zip(*loc[::-1]):
        pts.append(pt)
        cv2.rectangle(img_rgb, pt, (pt[0]+w, pt[1]+h), (0,0,255), 1)


    fn = os.path.splitext(im)[0]+'-exes'
    cv2.imwrite('imgs/'+fn+'.png', img_rgb)

    return pts, exes, blanks


def match_horizontal_lines(im, te, te2):
    img_rgb = cv2.imread('imgs/'+im)
    img_gry = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
    template = cv2.imread('imgs/'+te, 0)
    w1, h1 = template.shape[::-1]
    template2 = cv2.imread('imgs/'+te2, 0)
    w2, h2 = template2.shape[::-1]

    # first line template (the downward facing line)
    res1 = cv2.matchTemplate(img_gry, template, cv2.TM_CCOEFF_NORMED)
    threshold1 = 0.8
    loc1 = np.where(res1 >= threshold1)

    # second line template (the upward facing line)
    res2 = cv2.matchTemplate(img_gry, template2, cv2.TM_CCOEFF_NORMED)
    threshold2 = 0.8
    loc2 = np.where(res2 >= threshold2)

    pts = []
    exes = []
    blanks = []

    # find first line template (the downward facing line)
    for pt in zip(*loc1[::-1]):
        pts.append(pt)
        cv2.rectangle(img_rgb, pt, (pt[0]+w1, pt[1]+h1), (0,0,255), 1)

    # find second line template (the upward facing line)
    for pt in zip(*loc2[::-1]):
        pts.append(pt)
        cv2.rectangle(img_rgb, pt, (pt[0]+w2, pt[0]+h2), (0,0,255), 1)

    fn = os.path.splitext(im)[0]+'-horiz'
    cv2.imwrite('imgs/'+fn+'.png', img_rgb)

    return pts, exes, blanks


# process
text = ''
for img in images:
    print 'processing %s' % img
    hough_transform(img)
    pts, exes, blanks = match_exes(img, templates[0])
    pts1, exes1, blanks1 = match_horizontal_lines(img, templates[1], templates[2])
    text += '%s: %s x\'s & %s horizontal lines\n' % (img, len(pts), len(pts1))

# statistics file
outputFile = open('counts.txt', 'w')
outputFile.write(text)
outputFile.close()

,输出图像看起来像这样(你可以看到,所有x都被检测到但不是所有行)
x的

And, the output images look like this (as you can see, all x's are detected but not all lines) x's

水平线

hough变换

hough transform

正如我所说,我实际上只是想从表中获取数据,类似于这种形式(表格图像的第一行):

As I said, I'm actually just trying to get the data from the table, similar to this form (first row of table image):

row a = [x,x, , , , ,x, ,x,x, ,x, ,x, , , , ,x, , , ,x,x,x, ,x, ,x, , , , ]

我需要x的数量以及空格的数量。
还会有其他表格图像与此图像相似(都具有x和相同的列数和不同的行数)。

I need the number of x's as well as the number of spaces. There will also be other table images that are similar to this one (all having x's and the same number of columns and a different number of rows).

另外,我使用的是python 2.7

Also, I am using python 2.7

推荐答案

好的,我已经弄清楚了。我使用@beaker提供的建议在网格线之间查看。

Ok, I have figured it out. I used the suggestion provided by @beaker of looking between the grid lines.

在此之前,我必须从霍夫转换代码中删除重复的行。然后,我将剩余的行分为2个列表,纵向和横向。从那里,我可以循环水平然后垂直,然后创建一个感兴趣的区域(roi)图像。每个roi图像表示表主图像中的单元。我检查了每个细胞的轮廓,发现当细胞中有一个'x'时, len(轮廓)> = 2 。那么,任何 len(轮廓)< 2 是一个空白区域(我做了几个测试程序来解决这个问题)。以下是我用来使其工作的代码:

Before doing that I had to remove the duplicate lines from the hough transformation code. Then, I sorted those remaining lines into 2 lists, vertical and horizontal. From there, I could loop through the horizontal and then vertical and then create a region of interest (roi) image. Each roi image represents a 'cell' in the table master image. I checked each of those cells for contours and noticed that when there was an 'x' in the cell, len(contours) >= 2. So, any len(contours) < 2 was a blank space (I did several test programs to figure this out). Here is the code I used to get it working:

import cv2
import numpy as np
import os

# the list of images (tables)
images = ['table1.png', 'table2.png', 'table3.png', 'table4.png', 'table5.png']

# the list of templates (used for template matching)
templates = ['train1.png']

def remove_duplicates(lines):
    # remove duplicate lines (lines within 10 pixels of eachother)
    for x1, y1, x2, y2 in lines:
        for index, (x3, y3, x4, y4) in enumerate(lines):
            if y1 == y2 and y3 == y4:
                diff = abs(y1-y3)
            elif x1 == x2 and x3 == x4:
                diff = abs(x1-x3)
            else:
                diff = 0
            if diff < 10 and diff is not 0:
                del lines[index]
    return lines


def sort_line_list(lines):
    # sort lines into horizontal and vertical
    vertical = []
    horizontal = []
    for line in lines:
        if line[0] == line[2]:
            vertical.append(line)
        elif line[1] == line[3]:
            horizontal.append(line)
    vertical.sort()
    horizontal.sort(key=lambda x: x[1])
    return horizontal, vertical


def hough_transform_p(image, template, tableCnt):
    # open and process images
    img = cv2.imread('imgs/'+image)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150, apertureSize=3)

    # probabilistic hough transform
    lines = cv2.HoughLinesP(edges, 1, np.pi/180, 200, minLineLength=20, maxLineGap=999)[0].tolist()

    # remove duplicates
    lines = remove_duplicates(lines)

    # draw image
    for x1, y1, x2, y2 in lines:
        cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 1)

    # sort lines into vertical & horizontal lists
    horizontal, vertical = sort_line_list(lines)

    # go through each horizontal line (aka row)
    rows = []
    for i, h in enumerate(horizontal):
        if i < len(horizontal)-1:
            row = []
            for j, v in enumerate(vertical):
                if i < len(horizontal)-1 and j < len(vertical)-1:
                    # every cell before last cell
                    # get width & height
                    width = horizontal[i+1][1] - h[1]
                    height = vertical[j+1][0] - v[0]

                else:
                    # last cell, width = cell start to end of image
                    # get width & height
                    width = tW
                    height = tH
                tW = width
                tH = height

                # get roi (region of interest) to find an x
                roi = img[h[1]:h[1]+width, v[0]:v[0]+height]

                # save image (for testing)
                dir = 'imgs/table%s' % (tableCnt+1)
                if not os.path.exists(dir):
                     os.makedirs(dir)
                fn = '%s/roi_r%s-c%s.png' % (dir, i, j)
                cv2.imwrite(fn, roi)

                # if roi contains an x, add x to array, else add _
                roi_gry = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
                ret, thresh = cv2.threshold(roi_gry, 127, 255, 0)
                contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

                if len(contours) > 1:
                    # there is an x for 2 or more contours
                    row.append('x')
                else:
                    # there is no x when len(contours) is <= 1
                    row.append('_')
            row.pop()
            rows.append(row)

    # save image (for testing)
    fn = os.path.splitext(image)[0] + '-hough_p.png'
    cv2.imwrite('imgs/'+fn, img)


def process():
    for i, img in enumerate(images):
        # perform probabilistic hough transform on each image
        hough_transform_p(img, templates[0], i)


if __name__ == '__main__':
    process()

所以,示例图片:

So, the sample image:

并且输出(生成文本文件的代码为简洁而删除):

And, the output (code to generate text file was deleted for brevity):

如您所见,文本文件包含与图像位于相同位置的相同数量的x。现在困难的部分结束了,我可以继续我的任务!

As you can see, the text file contains the same number of x's in the same position as the image. Now that the hard part is over, I can continue on with my assignment!

这篇关于处理表的图像以从中获取数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆