Python, text detection OCR

Question

I am trying to extract data from a scanned form. The form has a standard format similar to the one shown in the image below:

I have tried using pytesseract (Tesseract OCR) to detect the image's text, and it does a decent job of finding the text and converting the image to text. However, it just returns all the detected text as one stream, without preserving the layout of the data.

I would like to be able to do something like the below:

Find a particular piece of text and then find the associated data below or beside it. This is similar to the question "Detect text region in image using Opencv".

Is there a way that I can essentially do the following:

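  1. Either find all the text boxes on the form, perform OCR on each box, see which one best matches the "witnesess:" text, then find the sections immediately below it and perform OCR on them individually.
  2. Or, if the form is standard and I know the approximate location of the "witness" text section, specify its general location in opencv and then extract only the text below it and perform OCR on that.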
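The first option can be sketched roughly as follows. This is a minimal sketch, not a tested solution for this form: it assumes word boxes come from `pytesseract.image_to_data` with `output_type=Output.DICT`, uses `difflib` for the "best match" step, and the helper names, the `0.6` similarity cutoff, and the `band_height` of 200 pixels are all hypothetical choices for illustration.

```python
from difflib import SequenceMatcher


def best_label_match(data, label):
    """Return (index, score) of the OCR word most similar to `label`.

    `data` is assumed to look like the dict returned by
    pytesseract.image_to_data(..., output_type=Output.DICT):
    parallel lists under 'text', 'left', 'top', 'width', 'height'.
    """
    best_index, best_score = None, 0.0
    for i, word in enumerate(data["text"]):
        word = word.strip().lower()
        if not word:
            continue  # skip entries that carry no recognised text
        score = SequenceMatcher(None, word, label.lower()).ratio()
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


def region_below(data, index, image_width, band_height):
    """Box (x0, y0, x1, y1) spanning the full width just under word `index`."""
    bottom = data["top"][index] + data["height"][index]
    return 0, bottom, image_width, bottom + band_height


def ocr_below_label(image_path, label="witnesess:", band_height=200):
    """End-to-end sketch; needs OpenCV, pytesseract and a Tesseract install."""
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    data = pytesseract.image_to_data(rgb, output_type=pytesseract.Output.DICT)
    index, score = best_label_match(data, label)
    if index is None or score < 0.6:  # hypothetical similarity cutoff
        return None
    x0, y0, x1, y1 = region_below(data, index, image.shape[1], band_height)
    # Second OCR pass restricted to the band under the label
    return pytesseract.image_to_string(rgb[y0:y1, x0:x1])
```

Fuzzy matching is used because OCR output rarely reproduces a label character for character; the band under the label would likely need tuning to the real form layout.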

EDIT: I have tried the code below to detect specific regions of text. However, it does not single out the text; it just marks every region it detects.

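import cv2

img = cv2.imread('t2.jpg')
mser = cv2.MSER_create()

# Upscale 2x before detection
img = cv2.resize(img, (img.shape[1]*2, img.shape[0]*2))
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
vis = img.copy()

# In OpenCV 3 and later, detectRegions returns (point sets, bounding boxes)
regions = mser.detectRegions(gray)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions[0]]
cv2.polylines(vis, hulls, 1, (0,255,0))

cv2.imshow('img', vis)
cv2.waitKey(0)  # without this the window closes before it can be seen
cv2.destroyAllWindows()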

The result is:

Recommended answer

I think you already have the answer in your own post. I recently did something similar, and this is how I did it:

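import pytesseract
from PIL import Image

# id_image was loaded with cv2.imread; start/end x and y bound the known field
temp_image = id_image[start_y:end_y, start_x:end_x]
img = Image.fromarray(temp_image)
# --psm 7 treats the crop as a single text line (Tesseract 3 spelled it -psm)
text = pytesseract.image_to_string(img, config="--psm 7")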

So basically, if your format is predefined, you just need to know the location of the fields whose text you want (which you already know), crop each one, and then run OCR (Tesseract) on the crop.

In this case you need to import pytesseract, PIL, cv2, and numpy.
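For a form with several fixed fields, the crop-then-OCR idea above could be organised around a small table of field positions. The sketch below is only an illustration under assumptions: the `FIELDS` table, its single `"witnesses"` entry, and the fractional coordinates are hypothetical placeholders rather than measurements from the actual form, and `--psm 6` (a single uniform block of text) is one plausible choice for a multi-line field.

```python
# Hypothetical field layout: name -> (left, top, right, bottom) as fractions
# of the page size, so the same table works for scans at any resolution.
FIELDS = {
    "witnesses": (0.10, 0.50, 0.90, 0.75),
}


def to_pixels(box, width, height):
    """Convert a fractional box to integer pixel coordinates."""
    left, top, right, bottom = box
    return (round(left * width), round(top * height),
            round(right * width), round(bottom * height))


def ocr_fields(image_path, fields=FIELDS):
    """Crop each known field and OCR it; needs OpenCV and pytesseract."""
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    height, width = image.shape[:2]
    results = {}
    for name, box in fields.items():
        x0, y0, x1, y1 = to_pixels(box, width, height)
        # OpenCV loads BGR; convert the crop to RGB before handing it to OCR
        crop = cv2.cvtColor(image[y0:y1, x0:x1], cv2.COLOR_BGR2RGB)
        results[name] = pytesseract.image_to_string(crop, config="--psm 6")
    return results
```

Storing positions as fractions rather than pixels means the table does not need to change if the scan is resized, as it is in the question's code.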
