Cleaning image for OCR


Question

I've been trying to clean up images for OCR, specifically by removing the lines:

I sometimes need to remove these lines before processing the image further. I'm getting pretty close, but much of the time the threshold takes away too much of the text:

    copy = img.copy()
    blur = cv2.GaussianBlur(copy, (9,9), 0)
    thresh = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV,11,30)

    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,9))
    dilate = cv2.dilate(thresh, kernel, iterations=2)

    cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]

    for c in cnts:
        area = cv2.contourArea(c)
        if area > 300:
            x,y,w,h = cv2.boundingRect(c)
            cv2.rectangle(copy, (x, y), (x + w, y + h), (36,255,12), 3)

Additionally, hard-coded constants will not work if the font changes. Is there a generic way to do this?

Recommended answer

Here's an idea. We break the problem up into several steps:

  1. Determine the average rectangular contour area. We threshold, find contours, and compute the bounding rectangle area of each contour. The observation behind this is that any typical character will only be so big, whereas large noise will span a larger rectangular area. We then take the average of these areas.

  2. Remove large outlier contours. We iterate through the contours again and remove any whose bounding rectangle area is more than 5x the average by filling in the contour. Instead of using a fixed threshold area, we use this dynamic threshold for more robustness.

  3. Dilate with a vertical kernel to connect characters. The idea is to take advantage of the observation that characters are aligned in columns. By dilating with a vertical kernel we connect the text together, so noise will not be included in this combined contour.

  4. Remove small noise. Now that the text to keep is connected, we find contours and remove any contour whose area is smaller than 4x the average contour area.

  5. Bitwise-and to reconstruct the image. Since the mask now holds only the contours we want to keep, we bitwise-and it with the input to preserve the text and get our result.
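The dynamic threshold from the first two steps can be tried without any image at all. The following is a minimal sketch, assuming the boxes are `(x, y, w, h)` tuples in the form returned by `cv2.boundingRect`; the helper name `split_large_outliers` and the sample boxes are made up for illustration.

```python
def split_large_outliers(boxes, factor=5):
    """Split (x, y, w, h) boxes into kept boxes and large outliers.

    A box counts as an outlier when its area exceeds `factor` times
    the mean bounding-box area of all boxes.
    """
    if not boxes:
        return [], [], 0.0
    areas = [w * h for (_, _, w, h) in boxes]
    average = sum(areas) / len(areas)
    kept = [b for b, a in zip(boxes, areas) if a <= average * factor]
    outliers = [b for b, a in zip(boxes, areas) if a > average * factor]
    return kept, outliers, average


# Hypothetical data: nine character-sized boxes plus one long line.
boxes = [(10 * i, 0, 8, 12) for i in range(9)] + [(0, 50, 400, 30)]
kept, outliers, average = split_large_outliers(boxes)
print(average, len(kept), outliers)
```

One property worth knowing: the largest area can never exceed n times the mean of n areas, so with five or fewer contours nothing is ever flagged at a factor of 5. The threshold only becomes useful once the image contains a reasonable number of characters.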


Here's a visualization of the process:

We apply Otsu's threshold to obtain a binary image, then find contours to determine the average rectangular contour area. From here we remove the large outlier contours, highlighted in green, by filling them in.

Next we construct a vertical kernel and dilate to connect the characters. This step connects all the desired text and isolates the noise into individual blobs.

Now we find contours and filter by contour area to remove the small noise.

Here are all the removed noise particles, highlighted in green.

Result

Code

import cv2

# Load image, grayscale, and Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Determine the average bounding-rectangle area of the contours
areas = []
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    areas.append(w * h)

# Guard against an image with no contours (avoids division by zero)
average = sum(areas) / len(areas) if areas else 0

# Remove large lines if the bounding-rectangle area is more than 5x the average
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    area = w * h
    if area > average * 5:  
        cv2.drawContours(thresh, [c], -1, (0,0,0), -1)

# Dilate with vertical kernel to connect characters
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,5))
dilate = cv2.dilate(thresh, kernel, iterations=3)

# Remove small noise if contour area is smaller than 4x average
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    if area < average * 4:
        cv2.drawContours(dilate, [c], -1, (0,0,0), -1)

# Bitwise mask with input image
result = cv2.bitwise_and(image, image, mask=dilate)
result[dilate==0] = (255,255,255)

cv2.imshow('result', result)
cv2.imshow('dilate', dilate)
cv2.imshow('thresh', thresh)
cv2.waitKey()

Note: Traditional image processing is limited to thresholding, morphological operations, and contour filtering (contour approximation, area, aspect ratio, or blob detection). Since input images can vary with the character size, finding a single solution that works everywhere is quite difficult. You may want to look into training your own classifier with machine/deep learning for a more adaptive solution.
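Short of training a classifier, one possible way to reduce the dependence on font size is to scale the dilation kernel to the measured character height. This is not part of the answer above; it is a rough sketch under the assumption that the median bounding-box height approximates the character height. The helper name `vertical_kernel_size` and the values of `height_ratio` and `min_height` are illustrative, not tuned.

```python
from statistics import median


def vertical_kernel_size(heights, height_ratio=0.4, min_height=3):
    """Return a (width, height) kernel size scaled to the median box height.

    `height_ratio` and `min_height` are illustrative defaults, not tuned values.
    The median is used so that a few very tall noise boxes do not skew the size.
    """
    if not heights:
        return (2, min_height)
    kernel_height = max(min_height, round(median(heights) * height_ratio))
    return (2, kernel_height)


# Hypothetical box heights: a small font with one tall outlier, then a larger font.
small_font = vertical_kernel_size([12, 13, 12, 14, 200])
large_font = vertical_kernel_size([30, 32, 31, 29])
print(small_font, large_font)
```

The returned tuple could be passed as the size argument to `cv2.getStructuringElement(cv2.MORPH_RECT, ...)` in place of the fixed `(2,5)`. Whether the ratio holds up across real documents would need to be checked on actual samples.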
