Remove remains in a letter image with Python


Question

I have a set of images, each representing a letter extracted from an image of a word. Some of them contain remains of the adjacent letters, and I want to remove those remains, but I do not know how.

Some samples: (images not included)

I'm working with OpenCV and have tried two approaches; neither works.

With findContours:

import cv2
import numpy as np

def is_contour_bad(c):
    # a contour counts as "bad" when it is stored with fewer than 50 points
    return len(c) < 50

# image: BGR letter image, loaded beforehand (e.g. with cv2.imread)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
edged = cv2.Canny(gray, 50, 100)

# [-2] selects the contour list whether findContours returns 2 or 3 values
contours = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]

mask = np.ones(image.shape[:2], dtype="uint8") * 255

for c in contours:
    # if the contour is bad, draw it on the mask
    if is_contour_bad(c):
        cv2.drawContours(mask, [c], -1, 0, -1)

# remove the contours from the image and show the resulting image
image = cv2.bitwise_and(image, image, mask=mask)
cv2.imshow("After", image)
cv2.waitKey(0)

I think it does not work because the remains sit on the edge of the image, so the area is not calculated correctly and cv2.drawContours does not eliminate the interior points.

With connectedComponentsWithStats:

import cv2
import numpy as np

# img: single-channel 8-bit image, loaded beforehand
cv2.imshow("Image", img)
cv2.waitKey(0)
nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(img)
# skip label 0 (the background); the last stats column is the component area
sizes = stats[1:, -1]
nb_components = nb_components - 1

min_size = 150

img2 = np.zeros(output.shape, dtype=np.uint8)
for i in range(0, nb_components):
    if sizes[i] >= min_size:
        img2[output == i + 1] = 255

cv2.imshow("After", img2)
cv2.waitKey(0)

In this case, I do not know why the small elements on the sides are not recognized as connected components.

Any help would be appreciated!

Recommended answer

At the very beginning of the question, you mention that the letters were extracted from an image of a word.

So I think the extraction itself could have been done correctly, and then you would not have faced this problem. I can offer a solution that applies either when extracting the letters from the original image or when separating the letters in the images you have given.

Solution:

You can use convex hull coordinates to separate the characters, like this.

Code:

import cv2
import numpy as np

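img = cv2.imread('test.png', 0)   # load as grayscale
cv2.bitwise_not(img, img)         # invert in place (dark letters on a light page become white on black)
img2 = img.copy()

ret, threshed_img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
#--- [-2:] keeps (contours, hierarchy) whether findContours returns 2 or 3 values ---
contours, hier = cv2.findContours(threshed_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2:]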

#--- Black image to be used to draw individual convex hull ---
black = np.zeros_like(img)
contours = sorted(contours, key=lambda ctr: cv2.boundingRect(ctr)[0])

for cnt in contours:
    hull = cv2.convexHull(cnt)

    black2 = black.copy()

    #--- Here is where I am filling the contour after finding the convex hull ---
    cv2.drawContours(black2, [hull], -1, (255, 255, 255), -1)
    r, t2 = cv2.threshold(black2, 127, 255, cv2.THRESH_BINARY)
    masked = cv2.bitwise_and(img2, img2, mask = t2)
    cv2.imshow("masked.jpg", masked)
    cv2.waitKey(0)

cv2.destroyAllWindows()

Output: (images not included)

So, as I suggested, it is better to apply this solution while extracting the characters from the original image than to remove noise after extraction.
