Detecting Interword Space in OCR using Python and OpenCV


Question

I am new to Python and OpenCV. I am currently working on OCR using Python and OpenCV, without Tesseract. So far I have been able to detect the text (characters and digits), but I cannot detect the spaces between words. For example, if the image says "Hello John", it detects "Hello" and "John" but not the space between them, so my output is "HelloJohn". My code for extracting the contours is below (I have imported all the required modules; this is the main part that extracts the contours):

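# Convert to grayscale and blur to suppress noise before thresholding
imgGray = cv2.cvtColor(imgTrainingNumbers, cv2.COLOR_BGR2GRAY)
imgBlurred = cv2.GaussianBlur(imgGray, (5, 5), 0)

# Adaptive threshold, inverted so the text becomes white on black
imgThresh = cv2.adaptiveThreshold(imgBlurred,
                                  255,
                                  cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                  cv2.THRESH_BINARY_INV,
                                  11,   # block size
                                  2)    # constant subtracted from the weighted mean

cv2.imshow("imgThresh", imgThresh)

imgThreshCopy = imgThresh.copy()

# findContours returns 3 values in OpenCV 3.x but 2 values in 2.x and 4.x;
# taking the last two works with every version
npaContours, npaHierarchy = cv2.findContours(imgThreshCopy,
                                             cv2.RETR_EXTERNAL,
                                             cv2.CHAIN_APPROX_SIMPLE)[-2:]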

After this I classify the extracted contours as digits and characters. Please help me detect the spaces between them. Thank you in advance; your reply would be really helpful.

Recommended answer

Since you did not give any example images, I generated a simple image to test with:

import cv2
import numpy as np

# Black canvas with white anti-aliased text
h, w = 100, 600
img = np.zeros((h, w), dtype=np.uint8)
font = cv2.FONT_HERSHEY_SIMPLEX
cv2.putText(img, 'OCR with OpenCV', (30, h-30), font, 2, 255, 2, cv2.LINE_AA)

As I mentioned in the comments, if you dilate the image, the white areas expand. If you do this with a kernel large enough that neighboring letters merge, but small enough that separate words do not, you can extract the contour of each word and use it to mask one word at a time for OCR.

kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
dilated = cv2.dilate(img, kernel)
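
The (15, 15) kernel above was picked for this test image. As a rough way to choose a starting size for other images, one option is to look at the horizontal gaps between the character bounding boxes you already have. The sketch below is only an illustration under assumptions that are not part of the original answer: the helper names `horizontal_gaps` and `suggest_kernel_width` are hypothetical, the boxes are `(x, y, w, h)` tuples as returned by `cv2.boundingRect`, all characters sit on a single line, and the "split at the largest jump" rule is a simple heuristic.

```python
def horizontal_gaps(boxes):
    """Return the pixel gaps between consecutive boxes, left to right.

    Each box is (x, y, w, h), the format returned by cv2.boundingRect.
    """
    ordered = sorted(boxes, key=lambda b: b[0])
    gaps = []
    for left, right in zip(ordered, ordered[1:]):
        gap = right[0] - (left[0] + left[2])
        gaps.append(max(gap, 0))  # overlapping boxes count as a zero gap
    return gaps


def suggest_kernel_width(boxes):
    """Suggest a kernel width between the letter gaps and the word gaps.

    Heuristic (an assumption): sort the gaps, split them at the largest
    jump, and return the rounded-up midpoint of that jump. Returns None
    when the gaps do not fall into two distinguishable groups.
    """
    gaps = sorted(horizontal_gaps(boxes))
    if len(gaps) < 2:
        return None
    jumps = [b - a for a, b in zip(gaps, gaps[1:])]
    biggest = max(jumps)
    if biggest == 0:
        return None  # all gaps are equal, so there is nothing to split
    split = jumps.index(biggest)
    small, large = gaps[split], gaps[split + 1]
    return (small + large + 1) // 2


# Hypothetical boxes for the text "ab cd": 3 px between letters, 17 px between words
boxes = [(0, 0, 10, 20), (13, 0, 10, 20), (40, 0, 10, 20), (53, 0, 10, 20)]
print(horizontal_gaps(boxes))
print(suggest_kernel_width(boxes))
```

Treat the result as a value to tune rather than a guarantee, since an elliptical kernel and uneven letter heights both affect which blobs actually merge.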

To get the mask of each word individually, find the contours of these larger blobs. You can also sort the contours vertically, horizontally, or both, so that you get the words in the proper order. Since I only have a single line here, I sort in the x direction only:

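# [-2] selects the contour list whether findContours returns 2 values (OpenCV 2.x/4.x) or 3 (3.x)
contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]
# sort left to right by the x coordinate of each contour's bounding box
contours = sorted(contours, key=lambda c: cv2.boundingRect(c)[0])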

for i in range(len(contours)):

    mask = np.zeros((h, w), dtype=np.uint8)

    # i is the contour to draw, -1 means fill the contours
    mask = cv2.drawContours(mask, contours, i, 255, -1)
    masked_img = cv2.bitwise_and(img, img, mask=mask)

    cv2.imshow('Masked single word', masked_img)
    cv2.waitKey()

    # do your OCR here on the masked image
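
If you already classify individual characters, as in the question, another way to use the word blobs is to skip the masking and assign each recognized character to the word box that contains its center, then join the words with spaces. The following is a minimal sketch of that idea, not the method from the answer above. The names `reading_order` and `join_words` are hypothetical, the `(box, label)` pairs stand in for whatever your own classifier produces, and every box is an `(x, y, w, h)` tuple as returned by `cv2.boundingRect`.

```python
def reading_order(boxes):
    """Sort boxes top to bottom into lines, then left to right within a line."""
    lines = []  # each entry is a list of boxes that share a text line
    for box in sorted(boxes, key=lambda b: b[1] + b[3] / 2.0):
        center_y = box[1] + box[3] / 2.0
        for line in lines:
            top = min(b[1] for b in line)
            bottom = max(b[1] + b[3] for b in line)
            if top <= center_y < bottom:
                line.append(box)
                break
        else:
            lines.append([box])  # no existing line matched, start a new one
    return [b for line in lines for b in sorted(line, key=lambda b: b[0])]


def contains_center(outer, inner):
    """True if the center of box `inner` lies inside box `outer`."""
    ox, oy, ow, oh = outer
    cx = inner[0] + inner[2] / 2.0
    cy = inner[1] + inner[3] / 2.0
    return ox <= cx < ox + ow and oy <= cy < oy + oh


def join_words(word_boxes, chars):
    """Group classified characters by word box and join the words with spaces.

    word_boxes: bounding boxes of the dilated (word-level) contours.
    chars: list of (box, label) pairs from the character classifier.
    """
    words = []
    for word_box in reading_order(word_boxes):
        members = [(b, lab) for b, lab in chars if contains_center(word_box, b)]
        members.sort(key=lambda m: m[0][0])  # left to right inside the word
        if members:
            words.append("".join(lab for _, lab in members))
    return " ".join(words)


# Hypothetical data for an image that says "Hello John"
word_boxes = [(95, 15, 58, 30), (5, 15, 70, 30)]
chars = [((112, 20, 10, 20), "o"), ((10, 20, 10, 20), "H"), ((136, 20, 10, 20), "n"),
         ((34, 20, 10, 20), "l"), ((100, 20, 10, 20), "J"), ((22, 20, 10, 20), "e"),
         ((58, 20, 10, 20), "o"), ((124, 20, 10, 20), "h"), ((46, 20, 10, 20), "l")]
print(join_words(word_boxes, chars))
```

In a real pipeline the word boxes would come from `cv2.boundingRect` applied to each contour of `dilated`, and the character boxes from the contours of the undilated threshold image.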
