How to detect all boxes for inputting letters in forms for a particular field?


Problem description

I need to recognize text from forms that provide a separate box for each input character.

I tried using a bounding box for each input field and cropping that field, i.e. I can get all the boxes for the 'Name' field. But when I try to detect the individual boxes within that group, I cannot: OpenCV returns only one contour for all the boxes. The file referred to in the for loop contains the bounding-box coordinates. cropped_img is the image that belongs to a single field's input (e.g. Name).

Full form image: this is the image of the form.

Cropped image for each field.

The cropped image contains many boxes for inputting characters, yet the number of contours detected is always one. Why am I not able to detect all the individual boxes? In short, I want all the individual boxes in cropped_img.

Any other ideas for approaching form OCR would also be much appreciated!

import cv2
import numpy as np

# `file` is an open handle to the coordinates file shown below;
# `panimg` is the full form image loaded as a BGR array.
for line in file.read().split("\n"):
    if len(line) == 0:
        continue
    region = list(map(int, line.split(' ')[:-1]))
    index = line.split(' ')[-1]
    text = ''
    contentDict = {}
    # uzn in format left, up, width, height
    region[2] = region[0] + region[2]
    region[3] = region[1] + region[3]
    region = tuple(region)
    cropped_img = panimg[region[1]:region[3], region[0]:region[2]]

    index = index.replace('_', ' ')
    if index == 'sign' or index == 'picture' or index == 'Dec sign':
        continue

    kernel = np.ones((50, 50), np.uint8)
    gray = cv2.cvtColor(cropped_img, cv2.COLOR_BGR2GRAY)
    ret, threshold = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    threshold = cv2.bitwise_not(threshold)
    dilate = cv2.dilate(threshold, kernel, iterations=1)
    ret, threshold = cv2.threshold(dilate, 127, 255, cv2.THRESH_BINARY)
    dilate = cv2.dilate(threshold, kernel, iterations=1)
    contours, hierarchy = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # get_contour_precedence is a helper defined elsewhere (not shown here).
    # sorted() is used because recent OpenCV versions return contours as a tuple.
    contours = sorted(contours, key=lambda x: get_contour_precedence(x, panimg.shape[1]))

    print("Length of contours detected: ", len(contours))
    for j, ctr in enumerate(contours):
        # Get bounding box
        x, y, w, h = cv2.boundingRect(ctr)

        # Getting ROI
        roi = cropped_img[y:y+h, x:x+w]
        # show ROI
        cv2.imshow('segment no:' + str(j), roi)
        cv2.waitKey(0)

The content of the file 'file' is as follows:

462 545 468 39 AO_Office
450 785 775 39 Last_Name
452 836 770 37 First_Name
451 885 772 39 Middle_Name
241 963 973 87 Abbreviation_Name
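
Each line above holds left, top, width, height and a field label. As a minimal sketch of how such lines could be turned into crop rectangles, assuming exactly five whitespace-separated tokens per line (the helper name `parse_regions` is hypothetical and not part of the original code):

```python
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def parse_regions(text: str) -> Dict[str, Box]:
    """Parse 'left top width height label' lines into {label: (l, t, r, b)}."""
    regions = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        left, top, width, height = map(int, parts[:4])
        label = parts[4].replace("_", " ")
        # Convert width/height into right/bottom so the box can be used for slicing
        regions[label] = (left, top, left + width, top + height)
    return regions


sample = """462 545 468 39 AO_Office
450 785 775 39 Last_Name"""
regions = parse_regions(sample)
print(regions["Last Name"])  # (450, 785, 1225, 824)
```

With a box `(l, t, r, b)`, the crop would then be `panimg[t:b, l:r]`, matching the slicing used in the loop.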

The expected output is one contour per individual box, i.e. one box for each letter of each field.
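
A likely reason for the single contour: dilating the inverted image with a 50x50 kernel thickens every box border by about 25 pixels per side, which probably fills the box interiors and fuses the whole row into one blob, and cv2.RETR_EXTERNAL then reports only the outermost contour. One alternative that avoids dilation is a column projection profile: columns that are mostly dark are treated as vertical separators, and the gaps between them are the cells. The sketch below uses only NumPy; the function name `split_boxes` and the thresholds `line_frac` and `min_width` are assumptions to be tuned, and it presumes the separators span most of the crop height.

```python
import numpy as np


def split_boxes(gray, dark_thresh=127, line_frac=0.6, min_width=5):
    """Return (x_start, x_end) column ranges of the cells between vertical lines.

    gray: 2-D uint8 array of one cropped field (dark lines on light paper).
    """
    dark = gray < dark_thresh           # True where line/ink pixels are
    col_frac = dark.mean(axis=0)        # fraction of dark pixels in each column
    is_line = col_frac >= line_frac     # mostly-dark columns act as separators
    cells, start, seen_line = [], None, False
    for x, line in enumerate(is_line):
        if line:
            # A separator closes the current cell, if one is open and wide enough
            if start is not None and x - start >= min_width:
                cells.append((start, x))
            start, seen_line = None, True
        elif seen_line and start is None:
            start = x                   # first column after a separator opens a cell
    return cells                        # the margin after the last separator is dropped


# Synthetic field: white canvas with three adjacent boxes sharing borders
img = np.full((30, 70), 255, np.uint8)
img[2, 4:65] = 0                        # top border
img[27, 4:65] = 0                       # bottom border
for x in (4, 24, 44, 64):
    img[2:28, x] = 0                    # vertical separators
print(split_boxes(img))                 # [(5, 24), (25, 44), (45, 64)]
```

Each returned range could then be cropped as `cropped_img[:, x0:x1]` and passed to a character recognizer. Strong vertical handwriting strokes may exceed `line_frac`, so the threshold would need checking against real scans.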

Recommended answer

I know I'm a bit late to the party :) but in case somebody is looking for a solution to this problem: I recently wrote a Python package that deals with exactly this.
I called it BoxDetect, and after installing it through:

pip install boxdetect

you can try the following:

from boxdetect import config

config.min_w, config.max_w = (20,50)
config.min_h, config.max_h = (20,50)
config.scaling_factors = [0.4]
config.dilation_iterations = 0
config.wh_ratio_range = (0.5, 2.0)
config.group_size_range = (1, 100)
config.horizontal_max_distance_multiplier = 2


from boxdetect.pipelines import get_boxes

image_path = "dumpster/m1nda.jpg"
rects, grouped_rects, org_image, output_image = get_boxes(image_path, config, plot=False)


import matplotlib.pyplot as plt

print("======================")
print("Individual boxes (green): ", rects)
print("======================")
print("Grouped boxes (red): ", grouped_rects)
print("======================")
plt.figure(figsize=(25,25))
plt.imshow(output_image)
plt.show()

It returns the bounding-rectangle coordinates of all the individual boxes, the grouped boxes that form long entry fields, and a visualization drawn on the form image:

Processing file:  dumpster/m1nda.jpg
======================
Individual boxes (green):  [[1153 1873   26   26]
 [1125 1873   24   27]
 [1098 1873   24   26]
 ...
 [ 558  551   42   28]
 [ 514  551   42   28]
 [ 468  551   42   28]]
======================
Grouped boxes (red):  [(468, 551, 457, 29), (424, 728, 47, 45), (608, 728, 31, 45), (698, 728, 33, 45), (864, 728, 31, 45), (1059, 728, 47, 45), (456, 792, 763, 29), (456, 842, 763, 28), (456, 891, 763, 29), (249, 969, 961, 28), (249, 1017, 962, 28), (700, 1064, 39, 32), (870, 1064, 41, 32), (376, 1124, 45, 45), (626, 1124, 29, 45), (750, 1124, 27, 45), (875, 1124, 41, 45), (1054, 1124, 28, 45), (507, 1188, 706, 29), (507, 1238, 706, 28), (507, 1287, 706, 29), (718, 1335, 36, 31), (856, 1335, 35, 31), (1008, 1335, 34, 32), (260, 1438, 51, 37), (344, 1438, 56, 37), (505, 1443, 98, 27), (371, 1530, 31, 31), (539, 1530, 31, 31), (486, 1636, 694, 28), (486, 1684, 694, 28), (486, 1731, 694, 29), (486, 1825, 694, 29), (486, 1873, 694, 28)]
======================
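
To read a field character by character, the individual boxes can be assigned to their group and ordered left to right. The following is a rough sketch, assuming both lists hold (x, y, w, h) entries as the printed output suggests; the helper name `boxes_per_group` is hypothetical and not part of BoxDetect.

```python
def boxes_per_group(rects, grouped_rects):
    """Map each grouped rect to its individual boxes, sorted left to right.

    Both inputs are assumed to hold (x, y, w, h) entries.
    """
    result = {}
    for gx, gy, gw, gh in grouped_rects:
        members = []
        for x, y, w, h in rects:
            cx, cy = x + w / 2, y + h / 2   # centre of the individual box
            if gx <= cx <= gx + gw and gy <= cy <= gy + gh:
                members.append((int(x), int(y), int(w), int(h)))
        result[(gx, gy, gw, gh)] = sorted(members)  # tuples sort by x first
    return result


# Three of the individual boxes and their group, taken from the output above
rects = [[558, 551, 42, 28], [514, 551, 42, 28], [468, 551, 42, 28]]
grouped = [(468, 551, 457, 29)]
print(boxes_per_group(rects, grouped))
```

Each sorted box `(x, y, w, h)` could then be cropped from the original image as `org_image[y:y+h, x:x+w]` and fed to an OCR step one character at a time.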
